AI Readiness — PredictionTalk

AI Readiness at PredictionTalk

PredictionTalk is the independent knowledge hub for prediction-market traders. We actively optimize our content so AI assistants and large language models can discover, understand, and surface our discussions to users asking about prediction markets.

This page documents how we make our content accessible to AI systems: our bot access policy, structured discovery endpoints, and live monitoring data.

Why this matters: Prediction markets are Google for the future. When someone asks an AI "what are the odds of X happening?", we want PredictionTalk discussions to be part of the answer. AI readiness is how we make that possible.

AI bot requests (last 30 days): first report after weekly cron
Distinct AI crawlers active: first report after weekly cron
Crawl success rate (2xx): first report after weekly cron
AI discovery standard: llms.txt

How AI Systems Discover Our Content

AI crawlers and LLM training pipelines discover PredictionTalk through multiple layers:

1. robots.txt

Our /robots.txt explicitly allows known AI crawlers to index the forum. We distinguish between crawlers that get full access and those limited to public pages; the Bot Access Policy table on this page lists the setting for each one.
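As a rough illustration of how a crawler might check these rules before fetching a page, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `SAMPLE_ROBOTS` and the `/private/` and `/admin/` paths are hypothetical stand-ins, not the contents of our real /robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; the live /robots.txt may differ.
SAMPLE_ROBOTS = """\
User-agent: ClaudeBot
Allow: /

User-agent: Bytespider
Disallow: /private/
Allow: /

User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_text: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ClaudeBot", "https://predictiontalk.org/d/123"),
        ("Bytespider", "https://predictiontalk.org/private/x"),
    ]:
        print(agent, url, is_allowed(SAMPLE_ROBOTS, agent, url))
```

A real crawler would load the file with `RobotFileParser.set_url()` and `read()` instead of parsing an inline string.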

2. llms.txt standard

We implement the emerging llms.txt standard — a plain-text file that tells AI assistants what our site is about, what content is available, and how to navigate it.

/llms.txt — summary for AI assistants
/llms-full.txt — full content index for LLM training

3. .well-known endpoints

Machine-readable metadata in standardized locations:

/.well-known/ai-plugin.json — plugin manifest for AI tool use
/.well-known/llms.txt — alternative discovery path

4. Structured data (Schema.org)

Every page includes JSON-LD structured data so AI systems can understand content type, author, date, and topic without reading the full HTML.
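To show what reading this structured data can look like from the consumer side, the sketch below pulls JSON-LD blocks out of an HTML document with the standard library only. `SAMPLE_HTML` is an invented page, and the class and function names are illustrative rather than part of any PredictionTalk tooling.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list:
    """Return every JSON-LD object embedded in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Invented sample page, used only to exercise the extractor.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "DiscussionForumPosting",
 "headline": "Best strategies for binary options on Polymarket",
 "datePublished": "2026-02-20T14:30:00Z"}
</script>
</head><body><p>Thread body</p></body></html>
"""

if __name__ == "__main__":
    for item in extract_jsonld(SAMPLE_HTML):
        print(item["@type"], item.get("datePublished"))
```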

5. JSON API

We expose JSON endpoints so AI systems can consume forum data directly, without the overhead of parsing HTML.

Bot Access Policy

We welcome AI crawlers that respect our content and contribute to knowledge discovery. Here is our policy for known bots:

Bot / Crawler	Organization	Access	Notes
`ClaudeBot`	Anthropic	Allowed	Claude AI training & answers
`anthropic-ai`	Anthropic	Allowed	Anthropic research crawler
`GPTBot`	OpenAI	Allowed	ChatGPT training
`ChatGPT-User`	OpenAI	Allowed	Real-time ChatGPT browsing
`OAI-SearchBot`	OpenAI	Allowed	OpenAI search indexing
`Google-Extended`	Google	Allowed	Gemini AI training
`PerplexityBot`	Perplexity AI	Allowed	Perplexity search answers
`Applebot-Extended`	Apple	Allowed	Apple Intelligence
`cohere-ai`	Cohere	Allowed	Cohere LLM training
`Bytespider`	ByteDance	Limited	Public pages only
`CCBot`	Common Crawl	Limited	Public pages only

Rate limiting: All bots are subject to standard Nginx rate limiting. If your crawler receives 429 responses, reduce crawl frequency and respect Crawl-delay directives in robots.txt.
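One way a crawler could react to 429 responses is exponential backoff, sketched below as a pure function so it can be tested without network access. The `base` and `maximum` defaults are arbitrary illustrative values, and whether our server sends a `Retry-After` header is an assumption; the function simply uses it when a numeric value is present.

```python
from typing import Optional


def next_delay(
    status: int,
    retry_after: Optional[str],
    current: float,
    base: float = 1.0,
    maximum: float = 300.0,
) -> float:
    """Return how many seconds to wait before the next request.

    status      -- HTTP status code of the last response
    retry_after -- value of a Retry-After header, if any (assumed optional)
    current     -- delay used before the last request, in seconds
    """
    if status == 429:
        # Prefer the server's hint when it is a plain number of seconds.
        if retry_after is not None and retry_after.strip().isdigit():
            return min(float(retry_after.strip()), maximum)
        # Otherwise double the delay, capped at the maximum.
        return min(max(current, base) * 2, maximum)
    # Any other status: ease back toward the base delay.
    return max(base, current / 2)


if __name__ == "__main__":
    delay = 1.0
    for status in (200, 429, 429, 200):
        delay = next_delay(status, None, delay)
        print(status, delay)
```

If robots.txt carries a Crawl-delay for your user agent, `urllib.robotparser.RobotFileParser.crawl_delay()` can supply the `base` value.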

llms.txt

We follow the llms.txt specification to help AI assistants understand our site structure and content.

What's in our llms.txt

Site description — what PredictionTalk is and who it's for
Key URLs — forum sections, API endpoints, important pages
Content policy — licensing, attribution requirements
Topics covered — prediction markets, probability, trading strategies

# Example from /llms.txt
# PredictionTalk — The community for prediction market traders
# https://predictiontalk.org
#
# PredictionTalk is a forum for traders on Polymarket, Kalshi,
# Manifold, and other prediction market platforms.
# ...

View full file: /llms.txt | /llms-full.txt
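For readers who want to consume the file programmatically, here is a small parsing sketch. It assumes the layout described in the public llms.txt proposal (an H1 title, a blockquote summary, and H2 sections holding Markdown link lists); `SAMPLE_LLMS_TXT` is a made-up input, and our actual file may be laid out differently.

```python
import re

# Matches "- [Title](url)" with an optional ": note" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Parse title, summary, and per-section links from llms.txt-style text."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append(
                    {
                        "title": match["title"],
                        "url": match["url"],
                        "note": match["note"] or "",
                    }
                )
    return result


# Hypothetical input written for this example.
SAMPLE_LLMS_TXT = """\
# PredictionTalk

> Forum for prediction market traders.

## Key URLs

- [Site info](https://predictiontalk.org/api/ai/site-info.json): Site overview and stats
- [Discussions](https://predictiontalk.org/api/ai/discussions.json)
"""

if __name__ == "__main__":
    parsed = parse_llms_txt(SAMPLE_LLMS_TXT)
    print(parsed["title"], list(parsed["sections"]))
```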

JSON API for AI Consumption

Clean JSON endpoints that AI systems can query without HTML parsing:

Endpoint	Description	Update Freq
/api/ai/site-info.json	Site overview, stats, recent activity	Hourly
/api/ai/discussions.json	Recent discussions with metadata	Hourly
/api/ai/tags.json	Topic taxonomy and tag structure	Daily

Example response

{
  "site": "PredictionTalk",
  "url": "https://predictiontalk.org",
  "description": "Forum for prediction market traders",
  "topics": ["Polymarket", "Kalshi", "probability", "strategy"],
  "recent_discussions": [
    {
      "id": 123,
      "title": "Best strategies for binary options on Polymarket",
      "url": "https://predictiontalk.org/d/123",
      "tags": ["Strategy", "Polymarket"],
      "created_at": "2026-02-20T14:30:00Z"
    }
  ]
}
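A client could read a response shaped like the example above roughly as follows. This sketch parses an inline copy of that example rather than calling the live API, and `summarize` is an illustrative helper name, not part of any published client.

```python
import json
from datetime import datetime, timezone

# Inline copy of the example response shown above.
SAMPLE_RESPONSE = """
{
  "site": "PredictionTalk",
  "url": "https://predictiontalk.org",
  "description": "Forum for prediction market traders",
  "topics": ["Polymarket", "Kalshi", "probability", "strategy"],
  "recent_discussions": [
    {
      "id": 123,
      "title": "Best strategies for binary options on Polymarket",
      "url": "https://predictiontalk.org/d/123",
      "tags": ["Strategy", "Polymarket"],
      "created_at": "2026-02-20T14:30:00Z"
    }
  ]
}
"""


def summarize(payload: str) -> list:
    """Return (id, title, created_at) tuples for each recent discussion."""
    data = json.loads(payload)
    rows = []
    for item in data.get("recent_discussions", []):
        # The example timestamps end in "Z" (UTC). Swapping it for an explicit
        # offset keeps datetime.fromisoformat working on Python before 3.11.
        created = datetime.fromisoformat(item["created_at"].replace("Z", "+00:00"))
        rows.append((item["id"], item["title"], created))
    return rows


if __name__ == "__main__":
    for discussion_id, title, created in summarize(SAMPLE_RESPONSE):
        print(discussion_id, title, created.isoformat())
```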

Structured Data (Schema.org)

Every page on PredictionTalk includes JSON-LD structured data following Schema.org vocabulary, allowing AI systems and search engines to understand content without parsing HTML.

Implemented schemas

Page type	Schema type	Key properties
Forum thread	`DiscussionForumPosting`	author, datePublished, text, interactionStatistic
User profile	`Person`	name, url, memberOf
Homepage	`WebSite` + `Organization`	name, url, description, sameAs
Tag page	`CollectionPage`	about, breadcrumb, hasPart
Docs page	`Article`	headline, description, publisher, breadcrumb
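To make the forum thread row concrete, the sketch below assembles a `DiscussionForumPosting` object with the key properties listed in the table. The function name, the argument values, and the choice of `CommentAction` as the interaction type are assumptions made for illustration; the markup our pages actually emit may differ.

```python
import json


def forum_thread_jsonld(
    title: str,
    url: str,
    author_name: str,
    published: str,
    text: str,
    reply_count: int,
) -> dict:
    """Build a Schema.org DiscussionForumPosting as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "DiscussionForumPosting",
        "headline": title,
        "url": url,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published,
        "text": text,
        # Reply count expressed as an InteractionCounter (assumed mapping).
        "interactionStatistic": {
            "@type": "InteractionCounter",
            "interactionType": "https://schema.org/CommentAction",
            "userInteractionCount": reply_count,
        },
    }


if __name__ == "__main__":
    # Placeholder values; the author name and text are invented.
    doc = forum_thread_jsonld(
        title="Best strategies for binary options on Polymarket",
        url="https://predictiontalk.org/d/123",
        author_name="example_user",
        published="2026-02-20T14:30:00Z",
        text="Opening post text goes here.",
        reply_count=4,
    )
    # The serialized form is what would sit inside the ld+json script tag.
    print(json.dumps(doc, indent=2))
```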

AI Bot Traffic (Live Monitoring)

We track all known AI crawler activity from Nginx access logs. The stats are refreshed every Monday.

Bot Activity — Last 30 Days

Stats will appear after the first weekly cron run.

Data source: Stats are parsed from Nginx access logs by /var/www/predictiontalk/scripts/parse-ai-bot-logs.sh and updated weekly via cron. Raw data is available at /api/ai/bot-stats.json.

What we monitor

Total hits per bot — how often each AI crawler visits
HTTP status distribution — 2xx (success), 3xx (redirect), 4xx (error)
robots.txt compliance — are bots checking before crawling?
Top crawled pages — which content AI bots find most interesting
Week-over-week trends — growth or decline in AI visibility
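The first two metrics in this list can be approximated with a short log parser. The sketch below is not our production script (that one is a shell script); it assumes Nginx's default "combined" log format, and the lines in `SAMPLE_LOG` are invented.

```python
import re
from collections import Counter, defaultdict

# User-agent tokens taken from the Bot Access Policy table.
AI_BOTS = [
    "ClaudeBot", "anthropic-ai", "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "Google-Extended", "PerplexityBot", "Applebot-Extended", "cohere-ai",
    "Bytespider", "CCBot",
]

# Nginx "combined" format: ip - user [time] "request" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize_log(lines):
    """Return (hits per bot, status-class counts per bot) for known AI bots."""
    hits = Counter()
    statuses = defaultdict(Counter)
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue
        agent = match["agent"].lower()
        bot = next((b for b in AI_BOTS if b.lower() in agent), None)
        if bot is None:
            continue
        hits[bot] += 1
        statuses[bot][match["status"][0] + "xx"] += 1
    return hits, statuses


def success_rate(statuses) -> float:
    """Share of AI bot requests that returned a 2xx status."""
    total = sum(sum(counts.values()) for counts in statuses.values())
    ok = sum(counts["2xx"] for counts in statuses.values())
    return ok / total if total else 0.0


# Invented log lines for demonstration.
SAMPLE_LOG = [
    '203.0.113.5 - - [23/Feb/2026:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [23/Feb/2026:10:00:02 +0000] "GET /d/123 HTTP/1.1" 200 20480 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [23/Feb/2026:10:05:00 +0000] "GET /d/999 HTTP/1.1" 404 153 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.9 - - [23/Feb/2026:10:06:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    hits, statuses = summarize_log(SAMPLE_LOG)
    print(dict(hits), round(success_rate(statuses), 3))
```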

AI Readiness Coverage

Status of all AI readiness features:

Feature	Status	Notes
robots.txt with AI bot rules	Live	Explicit allow/disallow per crawler
llms.txt	Live	Summary at /llms.txt
llms-full.txt	Live	Full index at /llms-full.txt
.well-known/ai-plugin.json	Live	Plugin manifest for ChatGPT/Claude
Schema.org structured data	Live	JSON-LD on all pages
JSON API endpoints	Live	AI-optimized data endpoints
Weekly bot monitoring	Live	Automated cron every Monday
AI traffic alerts	Live	Email alerts on traffic anomalies