AI Crawler Robots.txt Guide: GPTBot, ClaudeBot & More
Why AI Crawlers Matter
AI assistants can only draw on pages their crawlers are allowed to fetch. If you block AI crawlers, your content is far less likely to appear in ChatGPT, Claude, or Perplexity responses, regardless of how good your SEO is.
Major AI Crawlers
There are 14+ AI crawlers you should know about. Seven of the major ones:
| Crawler | Owner | Purpose |
|---------|-------|---------|
| GPTBot | OpenAI | ChatGPT knowledge base |
| OAI-SearchBot | OpenAI | ChatGPT search feature |
| ClaudeBot | Anthropic | Claude knowledge base |
| PerplexityBot | Perplexity | Perplexity search |
| Google-Extended | Google | AI training & Gemini |
| CCBot | Common Crawl | Public web archive |
| Bytespider | ByteDance | TikTok |
Checking Your robots.txt
The ideal configuration allows all AI crawlers and disallows only the paths you need to block:
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
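
One way to check a robots.txt file against the crawlers listed above is Python's standard-library `urllib.robotparser`. The sketch below is a minimal example, not a full audit tool: the function name `check_ai_access` and the sample rules are illustrative assumptions, and the crawler list simply mirrors the table in this guide. Python's parser applies the first matching rule within a group, which may differ from how an individual crawler resolves conflicting rules.

```python
from urllib.robotparser import RobotFileParser

# Crawler names copied from the table in this guide.
AI_CRAWLERS = [
    "GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "CCBot", "Bytespider",
]


def check_ai_access(robots_txt: str, path: str = "/") -> dict[str, bool]:
    """Return {crawler name: True if it may fetch `path`} for robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_CRAWLERS}


if __name__ == "__main__":
    # Hypothetical rules, matching the example configuration above.
    sample = (
        "User-agent: GPTBot\nAllow: /\n\n"
        "User-agent: ClaudeBot\nAllow: /\n\n"
        "User-agent: PerplexityBot\nAllow: /\n"
    )
    for bot, allowed in check_ai_access(sample).items():
        print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

To check a live site, `RobotFileParser` also offers `set_url()` and `read()`, which download the file before you call `can_fetch()`.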
Common Mistakes
1. Blocking all crawlers: `Disallow: /` under `User-agent: *` also blocks every AI crawler that has no group of its own.
2. Not checking: most sites never check which crawlers are blocked.
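
The first mistake is easy to reproduce. The sketch below, again using `urllib.robotparser`, compares a blanket block with the same file plus an explicit group for one crawler. The helper `is_allowed` and both rule sets are hypothetical examples written for this illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, agent: str, path: str = "/") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical rule sets for illustration.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
WITH_EXCEPTION = BLOCK_ALL + "\nUser-agent: GPTBot\nAllow: /\n"

print(is_allowed(BLOCK_ALL, "GPTBot"))          # False: the wildcard group applies
print(is_allowed(WITH_EXCEPTION, "GPTBot"))     # True: GPTBot has its own group
print(is_allowed(WITH_EXCEPTION, "ClaudeBot"))  # False: still falls back to the wildcard
```

A crawler with its own `User-agent` group follows that group instead of the wildcard one, so each AI crawler you want to admit needs an explicit entry when `*` is disallowed.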
Best Practices
- Allow all AI crawlers unless you have specific legal concerns
- Monitor crawler access regularly as new AI crawlers emerge
- Consider adding an llms.txt file to help AI crawlers find your best content
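
Monitoring can start with something as simple as counting AI crawler hits in your server's access log. The following is a rough sketch under stated assumptions: it matches crawler names as case-insensitive substrings of each log line, the function name `count_ai_hits` is made up for this example, and the sample log lines are invented and simplified rather than real user-agent strings. Google-Extended is left out on the assumption that it acts as a robots.txt control token and does not show up as a request user agent.

```python
from collections import Counter

# Names from the table in this guide, minus Google-Extended (see note above).
AI_TOKENS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "CCBot", "Bytespider"]


def count_ai_hits(log_lines):
    """Count log lines that mention each AI crawler name (substring match)."""
    counts = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in AI_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # count each line at most once
    return counts


if __name__ == "__main__":
    # Invented, simplified log lines for illustration only.
    sample_log = [
        '203.0.113.5 - - "GET / HTTP/1.1" 200 "Mozilla/5.0 (compatible; GPTBot/1.0)"',
        '203.0.113.9 - - "GET /blog HTTP/1.1" 200 "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
        '203.0.113.5 - - "GET /about HTTP/1.1" 200 "Mozilla/5.0 (compatible; GPTBot/1.0)"',
        '198.51.100.7 - - "GET / HTTP/1.1" 200 "Mozilla/5.0 (X11; Linux x86_64)"',
    ]
    for bot, hits in count_ai_hits(sample_log).most_common():
        print(f"{bot}: {hits}")
```

User-agent strings can be spoofed, so treat these counts as a rough signal rather than proof of who is crawling.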