AI Readiness at PredictionTalk
PredictionTalk is the independent knowledge hub for prediction-market traders. We actively optimize our content so AI assistants and large language models can discover, understand, and surface our discussions to users asking about prediction markets.
This page documents how we make our content accessible to AI systems: our bot access policy, structured discovery endpoints, and live monitoring data.
How AI Systems Discover Our Content
AI crawlers and LLM training pipelines discover PredictionTalk through multiple layers:
1. robots.txt
Our /robots.txt explicitly allows known AI crawlers to index the forum. We distinguish between crawlers with full access and crawlers limited to public pages; the Bot Access Policy section lists each one.
2. llms.txt standard
We implement the emerging llms.txt standard — a plain-text file that tells AI assistants what our site is about, what content is available, and how to navigate it.
- /llms.txt — summary for AI assistants
- /llms-full.txt — full content index for LLM training
3. .well-known endpoints
Machine-readable metadata in standardized locations:
- /.well-known/ai-plugin.json — plugin manifest for AI tool use
- /.well-known/llms.txt — alternative discovery path
4. Structured data (Schema.org)
Every page includes JSON-LD structured data so AI systems can understand content type, author, date, and topic without reading the full HTML.
5. JSON API
Clean JSON endpoints for AI-friendly data consumption without HTML parsing overhead.
Bot Access Policy
We welcome AI crawlers that respect our content and contribute to knowledge discovery. Here is our policy for known bots:
| Bot / Crawler | Organization | Access | Notes |
|---|---|---|---|
| ClaudeBot | Anthropic | Allowed | Claude AI training & answers |
| anthropic-ai | Anthropic | Allowed | Anthropic research crawler |
| GPTBot | OpenAI | Allowed | ChatGPT training |
| ChatGPT-User | OpenAI | Allowed | Real-time ChatGPT browsing |
| OAI-SearchBot | OpenAI | Allowed | OpenAI search indexing |
| Google-Extended | Google | Allowed | Gemini AI training |
| PerplexityBot | Perplexity AI | Allowed | Perplexity search answers |
| Applebot-Extended | Apple | Allowed | Apple Intelligence |
| cohere-ai | Cohere | Allowed | Cohere LLM training |
| Bytespider | ByteDance | Limited | Public pages only |
| CCBot | Common Crawl | Limited | Public pages only |
Crawl rates are controlled through Crawl-delay directives in robots.txt.
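To see how a crawler would interpret rules like these, the sketch below feeds a sample robots.txt to Python's standard `urllib.robotparser`. The sample rules, the `/d/` and `/u/` path split, and the 10-second delay are assumptions made for illustration; they are not copied from the live /robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt: the real /robots.txt may use different paths and delays.
SAMPLE_ROBOTS_TXT = """\
User-agent: ClaudeBot
Allow: /

User-agent: Bytespider
Crawl-delay: 10
Allow: /d/
Disallow: /

User-agent: *
Disallow: /admin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


robots = build_parser(SAMPLE_ROBOTS_TXT)
base = "https://predictiontalk.org"

# A fully allowed crawler can fetch any path.
claude_thread = robots.can_fetch("ClaudeBot", f"{base}/d/123")
# A limited crawler can fetch only the paths its group allows.
byte_thread = robots.can_fetch("Bytespider", f"{base}/d/123")
byte_profile = robots.can_fetch("Bytespider", f"{base}/u/alice")
# Crawl-delay is read per user agent; None means no delay was set.
byte_delay = robots.crawl_delay("Bytespider")

print(claude_thread, byte_thread, byte_profile, byte_delay)
```

Python's parser applies the first matching rule in a group, so the `Allow: /d/` line has to come before `Disallow: /` in this sample. Other crawlers may resolve conflicts by longest match instead.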
llms.txt
We follow the llms.txt specification to help AI assistants understand our site structure and content.
What's in our llms.txt
- Site description — what PredictionTalk is and who it's for
- Key URLs — forum sections, API endpoints, important pages
- Content policy — licensing, attribution requirements
- Topics covered — prediction markets, probability, trading strategies
# Example from /llms.txt
# PredictionTalk — The community for prediction market traders
# https://predictiontalk.org
#
# PredictionTalk is a forum for traders on Polymarket, Kalshi,
# Manifold, and other prediction market platforms.
# ...
View full file: /llms.txt | /llms-full.txt
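As a rough illustration of how an assistant might read such a file, here is a minimal parser sketch. It assumes the markdown layout proposed by the llms.txt specification (an H1 title, a blockquote summary, and H2 sections containing link lists). The sample text, the `LlmsTxt` class, and `parse_llms_txt` are hypothetical; the live /llms.txt may be organized differently.

```python
import re
from dataclasses import dataclass, field

# Hypothetical content in the llms.txt markdown layout; not the real file.
SAMPLE_LLMS_TXT = """\
# PredictionTalk

> Forum for traders on Polymarket, Kalshi, Manifold, and other prediction market platforms.

## API

- [Discussions](https://predictiontalk.org/api/ai/discussions.json): Recent discussions with metadata
- [Tags](https://predictiontalk.org/api/ai/tags.json): Topic taxonomy and tag structure
"""

# Matches "- [name](url)" with an optional ": notes" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<notes>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # Section heading -> list of (name, url, notes) tuples.
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    doc = LlmsTxt()
    current = None  # heading of the section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("> ") and not doc.summary:
            doc.summary = line[2:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["name"], match["url"], match["notes"] or "")
                )
    return doc


llms = parse_llms_txt(SAMPLE_LLMS_TXT)
print(llms.title, list(llms.sections))
```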
JSON API for AI Consumption
Clean JSON endpoints that AI systems can query without HTML parsing:
| Endpoint | Description | Update Freq |
|---|---|---|
| /api/ai/site-info.json | Site overview, stats, recent activity | Hourly |
| /api/ai/discussions.json | Recent discussions with metadata | Hourly |
| /api/ai/tags.json | Topic taxonomy and tag structure | Daily |
Example response
{
  "site": "PredictionTalk",
  "url": "https://predictiontalk.org",
  "description": "Forum for prediction market traders",
  "topics": ["Polymarket", "Kalshi", "probability", "strategy"],
  "recent_discussions": [
    {
      "id": 123,
      "title": "Best strategies for binary options on Polymarket",
      "url": "https://predictiontalk.org/d/123",
      "tags": ["Strategy", "Polymarket"],
      "created_at": "2026-02-20T14:30:00Z"
    }
  ]
}
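A client could consume a response shaped like the example above roughly as follows. This sketch works on an in-memory copy of that example rather than calling the API; the helper names `parse_timestamp` and `recent_discussions` are illustrative and not part of any published client.

```python
import json
from datetime import datetime

# Copy of the example response above, kept in memory so no request is made.
SAMPLE_RESPONSE = """
{
  "site": "PredictionTalk",
  "url": "https://predictiontalk.org",
  "description": "Forum for prediction market traders",
  "topics": ["Polymarket", "Kalshi", "probability", "strategy"],
  "recent_discussions": [
    {
      "id": 123,
      "title": "Best strategies for binary options on Polymarket",
      "url": "https://predictiontalk.org/d/123",
      "tags": ["Strategy", "Polymarket"],
      "created_at": "2026-02-20T14:30:00Z"
    }
  ]
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-02-20T14:30:00Z."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def recent_discussions(payload: dict) -> list:
    """Return discussions, newest first, with created_at as a datetime."""
    items = [
        {
            "title": item["title"],
            "url": item["url"],
            "tags": item.get("tags", []),
            "created_at": parse_timestamp(item["created_at"]),
        }
        for item in payload.get("recent_discussions", [])
    ]
    return sorted(items, key=lambda d: d["created_at"], reverse=True)


payload = json.loads(SAMPLE_RESPONSE)
discussions = recent_discussions(payload)
for d in discussions:
    print(d["created_at"].date(), d["title"], d["url"])
```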
Structured Data (Schema.org)
Every page on PredictionTalk includes JSON-LD structured data following Schema.org vocabulary, allowing AI systems and search engines to understand content without parsing HTML.
Implemented schemas
| Page type | Schema type | Key properties |
|---|---|---|
| Forum thread | DiscussionForumPosting | author, datePublished, text, interactionStatistic |
| User profile | Person | name, url, memberOf |
| Homepage | WebSite + Organization | name, url, description, sameAs |
| Tag page | CollectionPage | about, breadcrumb, hasPart |
| Docs page | Article | headline, description, publisher, breadcrumb |
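For the forum thread row, a JSON-LD block could be generated along these lines. This is a sketch under stated assumptions: the function names, the author name, the post text, and the reply count are placeholders, and using `CommentAction` for the reply counter is one reasonable choice rather than a description of what the site emits.

```python
import json


def forum_thread_jsonld(*, url, headline, author_name, date_published,
                        text, reply_count):
    """Build a Schema.org DiscussionForumPosting as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "DiscussionForumPosting",
        "url": url,
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
        "text": text,
        "interactionStatistic": {
            "@type": "InteractionCounter",
            "interactionType": "https://schema.org/CommentAction",
            "userInteractionCount": reply_count,
        },
    }


def as_script_tag(data: dict) -> str:
    """Serialize to a JSON-LD script tag, escaping "</" so the tag cannot close early."""
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values for illustration only.
thread = forum_thread_jsonld(
    url="https://predictiontalk.org/d/123",
    headline="Best strategies for binary options on Polymarket",
    author_name="example_user",
    date_published="2026-02-20T14:30:00Z",
    text="Placeholder post body.",
    reply_count=0,
)
tag = as_script_tag(thread)
print(tag)
```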
AI Bot Traffic (Live Monitoring)
We track all known AI crawler activity from nginx access logs. The data is refreshed every Monday.
Bot Activity — Last 30 Days
Data is generated by /var/www/predictiontalk/scripts/parse-ai-bot-logs.sh and updated weekly via cron. Raw data is available at /api/ai/bot-stats.json.
What we monitor
- Total hits per bot — how often each AI crawler visits
- HTTP status distribution — 2xx (success), 3xx (redirect), 4xx (error)
- robots.txt compliance — are bots checking before crawling?
- Top crawled pages — which content AI bots find most interesting
- Week-over-week trends — growth or decline in AI visibility
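The first four metrics above can be derived from access logs with a short script. The Python sketch below is not the production shell script; it assumes nginx's default combined log format, and the sample log lines, IP addresses, and the `summarize` helper are invented for illustration. Google-Extended and Applebot-Extended are left out of the match list because they are robots.txt control tokens and are not expected to appear as request user agents.

```python
import re
from collections import Counter, defaultdict

# Substrings to look for in the User-Agent header (matched case-insensitively).
AI_BOTS = [
    "ClaudeBot", "anthropic-ai", "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "PerplexityBot", "cohere-ai", "Bytespider", "CCBot",
]

# nginx "combined" format: ip - user [time] "request" status bytes "referer" "ua"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)


def identify_bot(user_agent: str):
    """Return the first known bot name found in the user agent, else None."""
    ua = user_agent.lower()
    for bot in AI_BOTS:
        if bot.lower() in ua:
            return bot
    return None


def summarize(lines):
    hits = Counter()                      # total hits per bot
    status_classes = defaultdict(Counter) # bot -> {"2xx": n, "4xx": n, ...}
    pages = Counter()                     # crawled paths, excluding robots.txt
    checked_robots = set()                # bots that requested /robots.txt
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # skip lines that are not in combined format
        bot = identify_bot(match["ua"])
        if bot is None:
            continue  # not a tracked AI crawler
        hits[bot] += 1
        status_classes[bot][match["status"][0] + "xx"] += 1
        if match["path"] == "/robots.txt":
            checked_robots.add(bot)
        else:
            pages[match["path"]] += 1
    return {
        "hits": dict(hits),
        "status": {bot: dict(c) for bot, c in status_classes.items()},
        "checked_robots": sorted(checked_robots),
        "top_pages": pages.most_common(5),
    }


# Invented sample lines using documentation-reserved IP ranges.
SAMPLE_LOG = [
    '203.0.113.5 - - [16/Feb/2026:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '203.0.113.5 - - [16/Feb/2026:10:00:02 +0000] "GET /d/123 HTTP/1.1" 200 20480 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '198.51.100.7 - - [16/Feb/2026:11:30:00 +0000] "GET /d/999 HTTP/1.1" 404 153 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
    '192.0.2.9 - - [16/Feb/2026:12:00:00 +0000] "GET /d/123 HTTP/1.1" 200 20480 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/130.0"',
]

stats = summarize(SAMPLE_LOG)
print(stats)
```

Week-over-week trends would follow from running the same summary over two consecutive weekly log windows and comparing the `hits` counts.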
AI Readiness Coverage
Status of all AI readiness features:
| Feature | Status | Notes |
|---|---|---|
| robots.txt with AI bot rules | Live | Explicit allow/disallow per crawler |
| llms.txt | Live | Summary at /llms.txt |
| llms-full.txt | Live | Full index at /llms-full.txt |
| .well-known/ai-plugin.json | Live | Plugin manifest for ChatGPT/Claude |
| Schema.org structured data | Live | JSON-LD on all pages |
| JSON API endpoints | Live | AI-optimized data endpoints |
| Weekly bot monitoring | Live | Automated cron every Monday |
| AI traffic alerts | Live | Email alerts on traffic anomalies |