AI Readiness at PredictionTalk

PredictionTalk is the independent knowledge hub for prediction-market traders. We actively optimize our content so AI assistants and large language models can discover, understand, and surface our discussions to users asking about prediction markets.

This page documents how we make our content accessible to AI systems: our bot access policy, structured discovery endpoints, and live monitoring data.

Why this matters: Prediction markets are Google for the future. When someone asks an AI "what are the odds of X happening?", we want PredictionTalk discussions to be part of the answer. AI readiness is how we make that possible.

  • AI bot requests (last 30 days): first report after weekly cron
  • Distinct AI crawlers active: first report after weekly cron
  • Crawl success rate (2xx): first report after weekly cron
  • AI discovery standard: llms.txt

How AI Systems Discover Our Content

AI crawlers and LLM training pipelines discover PredictionTalk through multiple layers:

1. robots.txt

Our /robots.txt explicitly allows known AI crawlers to index the forum. We distinguish between crawlers we welcome (content indexers) and those we restrict (training scrapers).
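
As a rough illustration of how a crawler might evaluate per-bot rules like these, the sketch below uses Python's standard `urllib.robotparser`. The robots.txt content and the paths are hypothetical and written only for this example; the real rules at /robots.txt may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# The real file served at /robots.txt may look different.
SAMPLE_ROBOTS_TXT = """\
User-agent: ClaudeBot
Allow: /

User-agent: Bytespider
Allow: /d/
Disallow: /

User-agent: *
Disallow: /admin
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if `user_agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent, path in [("ClaudeBot", "/d/123"), ("Bytespider", "/u/alice")]:
        print(agent, path, is_allowed(SAMPLE_ROBOTS_TXT, agent, path))
```

Note that `urllib.robotparser` applies the first matching rule within a group, which is why the sample lists `Allow: /d/` before `Disallow: /`.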

2. llms.txt standard

We implement the emerging llms.txt standard — a plain-text file that tells AI assistants what our site is about, what content is available, and how to navigate it.

3. .well-known endpoints

Machine-readable metadata in standardized locations, including the plugin manifest at .well-known/ai-plugin.json.

4. Structured data (Schema.org)

Every page includes JSON-LD structured data so AI systems can understand content type, author, date, and topic without reading the full HTML.

5. JSON API

Clean JSON endpoints for AI-friendly data consumption without HTML parsing overhead.

Bot Access Policy

We welcome AI crawlers that respect our content and contribute to knowledge discovery. Here is our policy for known bots:

| Bot / Crawler | Organization | Access | Notes |
| --- | --- | --- | --- |
| ClaudeBot | Anthropic | Allowed | Claude AI training & answers |
| anthropic-ai | Anthropic | Allowed | Anthropic research crawler |
| GPTBot | OpenAI | Allowed | ChatGPT training |
| ChatGPT-User | OpenAI | Allowed | Real-time ChatGPT browsing |
| OAI-SearchBot | OpenAI | Allowed | OpenAI search indexing |
| Google-Extended | Google | Allowed | Gemini AI training |
| PerplexityBot | Perplexity AI | Allowed | Perplexity search answers |
| Applebot-Extended | Apple | Allowed | Apple Intelligence |
| cohere-ai | Cohere | Allowed | Cohere LLM training |
| Bytespider | ByteDance | Limited | Public pages only |
| CCBot | Common Crawl | Limited | Public pages only |

Rate limiting: All bots are subject to standard nginx rate limiting. If your crawler receives 429 responses, reduce crawl frequency and respect Crawl-delay directives in robots.txt.
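
For crawler authors, one way to react to 429 responses is sketched below. It is a minimal example and not a required client: `fetch_with_backoff`, the injected `fetch` callable, and the delay values are all hypothetical, and the sketch only assumes that a server may optionally send a `Retry-After` header in seconds.

```python
import time
from typing import Callable, Mapping, Tuple

# A fetcher returns (status_code, headers). It is injected so the sketch
# runs without network access; a real crawler would wrap its HTTP client.
Fetcher = Callable[[str], Tuple[int, Mapping[str, str]]]


def fetch_with_backoff(
    url: str,
    fetch: Fetcher,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry on HTTP 429, waiting longer after each rate-limited attempt."""
    status = 0
    for attempt in range(max_attempts):
        status, headers = fetch(url)
        if status != 429:
            return status
        # Prefer the server's Retry-After hint (whole seconds) when present;
        # otherwise fall back to exponential backoff: 1s, 2s, 4s, ...
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2**attempt
        if attempt < max_attempts - 1:
            sleep(delay)
    return status
```

Passing `sleep` as a parameter keeps the function easy to test, since a test can record the requested delays instead of actually waiting.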

llms.txt

We follow the llms.txt specification to help AI assistants understand our site structure and content.

What's in our llms.txt

  • Site description — what PredictionTalk is and who it's for
  • Key URLs — forum sections, API endpoints, important pages
  • Content policy — licensing, attribution requirements
  • Topics covered — prediction markets, probability, trading strategies

# Example from /llms.txt
# PredictionTalk — The community for prediction market traders
# https://predictiontalk.org
#
# PredictionTalk is a forum for traders on Polymarket, Kalshi,
# Manifold, and other prediction market platforms.
# ...

View full file: /llms.txt | /llms-full.txt

JSON API for AI Consumption

Clean JSON endpoints that AI systems can query without HTML parsing:

| Endpoint | Description | Update frequency |
| --- | --- | --- |
| /api/ai/site-info.json | Site overview, stats, recent activity | Hourly |
| /api/ai/discussions.json | Recent discussions with metadata | Hourly |
| /api/ai/tags.json | Topic taxonomy and tag structure | Daily |

Example response

{
  "site": "PredictionTalk",
  "url": "https://predictiontalk.org",
  "description": "Forum for prediction market traders",
  "topics": ["Polymarket", "Kalshi", "probability", "strategy"],
  "recent_discussions": [
    {
      "id": 123,
      "title": "Best strategies for binary options on Polymarket",
      "url": "https://predictiontalk.org/d/123",
      "tags": ["Strategy", "Polymarket"],
      "created_at": "2026-02-20T14:30:00Z"
    }
  ]
}
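
A consumer could read a response of this shape with nothing more than the standard library. The sketch below works on a trimmed copy of the example response; the helper name `discussions_with_tag` is hypothetical, and the field names are taken from the example rather than from a published schema.

```python
import json
from datetime import datetime

# Trimmed copy of the example response above.
SAMPLE = """
{
  "site": "PredictionTalk",
  "recent_discussions": [
    {
      "id": 123,
      "title": "Best strategies for binary options on Polymarket",
      "url": "https://predictiontalk.org/d/123",
      "tags": ["Strategy", "Polymarket"],
      "created_at": "2026-02-20T14:30:00Z"
    }
  ]
}
"""


def discussions_with_tag(payload: str, tag: str) -> list[dict]:
    """Return title, URL, and creation time for discussions carrying `tag`."""
    data = json.loads(payload)
    wanted = tag.lower()
    matches = []
    for item in data.get("recent_discussions", []):
        if wanted in (t.lower() for t in item.get("tags", [])):
            # Swap the trailing "Z" for an explicit offset so that
            # fromisoformat also accepts it on Python versions before 3.11.
            created = datetime.fromisoformat(item["created_at"].replace("Z", "+00:00"))
            matches.append({"title": item["title"], "url": item["url"], "created": created})
    return matches


if __name__ == "__main__":
    for match in discussions_with_tag(SAMPLE, "polymarket"):
        print(match["created"].date(), match["title"], match["url"])
```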

Structured Data (Schema.org)

Every page on PredictionTalk includes JSON-LD structured data following Schema.org vocabulary, allowing AI systems and search engines to understand content without parsing HTML.

Implemented schemas

| Page type | Schema type | Key properties |
| --- | --- | --- |
| Forum thread | DiscussionForumPosting | author, datePublished, text, interactionStatistic |
| User profile | Person | name, url, memberOf |
| Homepage | WebSite + Organization | name, url, description, sameAs |
| Tag page | CollectionPage | about, breadcrumb, hasPart |
| Docs page | Article | headline, description, publisher, breadcrumb |
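
To make the forum-thread row concrete, here is a minimal sketch of how such a JSON-LD object could be assembled and embedded. It is not the site's actual markup: the function names are hypothetical, `headline` and `url` are added for illustration beyond the key properties listed in the table, and using `CommentAction` for the reply count is an assumption.

```python
import json


def forum_thread_jsonld(title, url, author, published, text, reply_count):
    """Build a DiscussionForumPosting object for one forum thread."""
    return {
        "@context": "https://schema.org",
        "@type": "DiscussionForumPosting",
        "headline": title,  # illustrative addition, not in the table
        "url": url,         # illustrative addition, not in the table
        "author": {"@type": "Person", "name": author},
        "datePublished": published,
        "text": text,
        "interactionStatistic": {
            "@type": "InteractionCounter",
            # Assumption: replies are reported as comment interactions.
            "interactionType": "https://schema.org/CommentAction",
            "userInteractionCount": reply_count,
        },
    }


def as_script_tag(data: dict) -> str:
    """Serialize JSON-LD into the script element that pages embed."""
    # Escape "</" so user text cannot close the script element early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    doc = forum_thread_jsonld(
        title="Best strategies for binary options on Polymarket",
        url="https://predictiontalk.org/d/123",
        author="alice",  # hypothetical username
        published="2026-02-20T14:30:00Z",
        text="Example post body.",  # hypothetical text
        reply_count=4,  # hypothetical count
    )
    print(as_script_tag(doc))
```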

AI Bot Traffic (Live Monitoring)

We track all known AI crawler activity from nginx access logs. The data is refreshed every Monday.

Bot Activity — Last 30 Days

Stats will appear after the first weekly cron run.

Data source: Stats are parsed from nginx access logs by /var/www/predictiontalk/scripts/parse-ai-bot-logs.sh and updated weekly via cron. Raw data is available at /api/ai/bot-stats.json.

What we monitor

  • Total hits per bot — how often each AI crawler visits
  • HTTP status distribution — 2xx (success), 3xx (redirect), 4xx (error)
  • robots.txt compliance — are bots checking before crawling?
  • Top crawled pages — which content AI bots find most interesting
  • Week-over-week trends — growth or decline in AI visibility
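
The first two metrics in the list above can be derived from access logs with a few lines of code. The sketch below is a Python illustration of the idea, not the contents of parse-ai-bot-logs.sh: it assumes nginx's default "combined" log format, uses a subset of the bot names from the access policy table, and the function name `summarize` is hypothetical.

```python
import re
from collections import Counter, defaultdict

# Subset of the bot names from the Bot Access Policy table.
AI_BOTS = ["ClaudeBot", "GPTBot", "ChatGPT-User", "OAI-SearchBot",
           "PerplexityBot", "Bytespider", "CCBot"]

# Assumes nginx's default "combined" format:
# addr - user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count hits per bot, grouped by status class such as '2xx'."""
    stats = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match["agent"].lower()
        bot = next((b for b in AI_BOTS if b.lower() in agent), None)
        if bot:
            stats[bot][match["status"][0] + "xx"] += 1
    return {bot: dict(counts) for bot, counts in stats.items()}


if __name__ == "__main__":
    # Made-up log line using a documentation-range IP address.
    sample = [
        '203.0.113.7 - - [23/Feb/2026:10:00:00 +0000] "GET /robots.txt HTTP/1.1" '
        '200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    ]
    print(summarize(sample))
```

Matching on a substring of the user agent is simple but trusts whatever the client sends, so a production version would likely also verify crawler IP ranges.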

AI Readiness Coverage

Status of all AI readiness features:

| Feature | Status | Notes |
| --- | --- | --- |
| robots.txt with AI bot rules | Live | Explicit allow/disallow per crawler |
| llms.txt | Live | Summary at /llms.txt |
| llms-full.txt | Live | Full index at /llms-full.txt |
| .well-known/ai-plugin.json | Live | Plugin manifest for ChatGPT/Claude |
| Schema.org structured data | Live | JSON-LD on all pages |
| JSON API endpoints | Live | AI-optimized data endpoints |
| Weekly bot monitoring | Live | Automated cron every Monday |
| AI traffic alerts | Live | Email alerts on traffic anomalies |