What is ClaudeBot?
Direct Answer: ClaudeBot is Anthropic's web crawler that collects web content for training Claude AI models. It respects robots.txt directives and supports Crawl-delay.
ClaudeBot is operated by Anthropic and collects web content for training datasets, with the stated aim of improving the utility and safety of its generative AI models. Blocking ClaudeBot via robots.txt tells Anthropic to exclude your site's future content from AI model training. Anthropic commits to non-intrusive crawling, to respecting robots.txt directives, and to not bypassing access controls such as CAPTCHAs. Anthropic does not publish IP ranges for ClaudeBot, because it crawls from public IP addresses belonging to its service providers.
User-Agent Identification
The following user-agent strings identify ClaudeBot in your live traffic data:
ClaudeBot/1.0
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)
robots.txt Rules for ClaudeBot
Respects robots.txt: Yes
Use the following robots.txt rules to control ClaudeBot access:
# Block ClaudeBot
User-agent: ClaudeBot
Disallow: /
# Allow ClaudeBot
User-agent: ClaudeBot
Allow: /
Robots.txt is a directive, not a barrier
Anthropic states that ClaudeBot respects robots.txt. However, configuration mistakes, caching delays, and edge cases mean your directives may not always be followed as expected. Live traffic verification confirms whether ClaudeBot actually obeys your rules in practice.
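One way to run that check is to replay the URLs ClaudeBot requested against your own robots.txt. The sketch below is a minimal illustration, assuming Python's standard urllib.robotparser reads your rules the way you intend (ClaudeBot's own parser may interpret edge cases differently). The sample rules, the example.com URLs, and the helper name disallowed_hits are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute the file your site actually serves.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /premium/
"""


def disallowed_hits(robots_txt, requested_urls, agent="ClaudeBot"):
    """Return the requested URLs that the given robots.txt disallows for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in requested_urls if not parser.can_fetch(agent, url)]


# Hypothetical URLs taken from ClaudeBot entries in an access log.
logged = [
    "https://example.com/blog/post",
    "https://example.com/premium/report",
]
violations = disallowed_hits(ROBOTS_TXT, logged)
print(violations)
```

Any URL in the result was fetched despite a Disallow rule. Before treating it as a violation, check whether the request predates your latest robots.txt change, since caching delays are one of the cases mentioned above.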
Need continuous verification across 500+ bots? Can AI See It automates this.
Crawl Behavior
Frequency: Not documented
Request pattern: Non-intrusive; supports the Crawl-delay extension to robots.txt
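To see whether a Crawl-delay setting is reflected in real traffic, you can compare the configured value with the shortest gap between consecutive ClaudeBot requests. This is a rough sketch only: the 10-second delay, the timestamps, and the helper names configured_delay and shortest_gap_seconds are assumptions made for illustration, and the source does not document how ClaudeBot applies the delay.

```python
from datetime import datetime
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking ClaudeBot to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Crawl-delay: 10
"""


def configured_delay(robots_txt, agent="ClaudeBot"):
    """Read the Crawl-delay (in seconds) that applies to `agent`, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


def shortest_gap_seconds(timestamps):
    """Smallest interval between consecutive requests, or None with fewer than two."""
    ordered = sorted(timestamps)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
    return min(gaps) if gaps else None


# Hypothetical request times for ClaudeBot, as extracted from an access log.
hits = [
    datetime(2025, 1, 1, 12, 0, 0),
    datetime(2025, 1, 1, 12, 0, 12),
    datetime(2025, 1, 1, 12, 0, 30),
]
delay = configured_delay(ROBOTS_TXT)
gap = shortest_gap_seconds(hits)
print(delay, gap, gap >= delay)
```

A shortest gap below the configured delay is worth a closer look, but it is not proof of non-compliance on its own, because requests carrying the ClaudeBot user-agent may come from several sources.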
Official Documentation Quotes
"Enhances generative AI model utility and safety by collecting web content for training."
"To limit crawling activity, we support the non-standard Crawl-delay extension to robots.txt."
"Alternate methods like blocking IP address(es) from which Anthropic Bots operates may not work correctly or persistently guarantee an opt-out."
Why track ClaudeBot traffic?
Measure what Anthropic gives back. ClaudeBot takes your content for AI training — but does Anthropic send any traffic in return through other products? Track whether the trade-off is worth it before deciding to block.
Understand what content is being collected for AI training. ClaudeBot crawls your site to gather data that may train AI models. Tracking its activity reveals which pages are selected — and which are skipped.
Make an informed block-or-allow decision. Blocking ClaudeBot prevents your content from being used in future model training. But first, measure the volume: how many pages does it fetch, how often, and does Anthropic send any referral traffic through other products?
Detect content harvesting patterns. If ClaudeBot is systematically crawling your highest-value content (product pages, proprietary research, premium articles), you may want to restrict access using robots.txt or server-side rules.
What does ClaudeBot crawling actually cost you?
AI training bots like ClaudeBot collect your content to improve future AI models. Unlike AI search bots, there's no direct referral pipeline — ClaudeBot doesn't cite sources or send traffic back to your site.
What you give
- Server resources for every crawl request
- Your content, expertise, and original research
- Data that improves a competing AI product
What you get back
- No direct referral traffic from ClaudeBot
- No attribution in AI model outputs
- No revenue share from model usage
This doesn't automatically mean you should block ClaudeBot. But you need to measure the real cost before deciding. Anthropic may send traffic through other products (for example, Claude web search), so blocking the training bot may or may not affect your referrals. Only log data tells you.
What Can AI See It measures for AI training bots
- How many pages ClaudeBot fetches from your site
- Which pages and sections ClaudeBot prioritizes
- Do Anthropic's other products send you traffic?
- Does ClaudeBot actually respect your robots.txt?
How is this different from prompt testing tools? Prompt testing checks if AI mentions your brand in simulated queries. Can AI See It measures what actually happens: real crawls, real referrals, real conversions — from your live traffic data.
Log Verification
To verify ClaudeBot traffic in your live traffic data:
- Search access logs for the user-agent strings listed above
- Check source IP addresses against documented ranges where they exist; Anthropic does not publish IP ranges for ClaudeBot, so this check alone cannot confirm it
- Verify the crawl pattern matches documented behavior
- Use reverse DNS lookup for additional verification if available
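The first step in the list above can be sketched as a small log filter. This is a minimal example that assumes your server writes the common Apache/Nginx "combined" log format; the regular expression, the sample lines (which use documentation-reserved IP addresses), and the helper name claudebot_hits are illustrative rather than part of any official tooling.

```python
import re
from collections import Counter

# Assumption: logs use the Apache/Nginx "combined" format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def claudebot_hits(lines):
    """Return parsed log entries whose user-agent mentions ClaudeBot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "claudebot" in match.group("agent").lower():
            hits.append(match.groupdict())
    return hits


# Hypothetical log lines for illustration.
SAMPLE = [
    '203.0.113.7 - - [01/Jan/2025:12:00:00 +0000] "GET /blog/post HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; '
    'compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
    '198.51.100.4 - - [01/Jan/2025:12:00:05 +0000] "GET /pricing HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
hits = claudebot_hits(SAMPLE)
paths = Counter(hit["path"] for hit in hits)
print(paths.most_common())
```

A user-agent string is self-reported and can be spoofed, so a match here identifies traffic that claims to be ClaudeBot rather than traffic proven to come from Anthropic.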
Note: Observed behavior in production environments may differ from official documentation. Live traffic monitoring provides the only reliable verification of actual bot behavior.
Undocumented Information
The following information is not officially documented for ClaudeBot:
- crawl frequency
- published IP ranges
- JavaScript rendering behavior
Measure your Crawl-to-Referral Ratio for ClaudeBot
See how much traffic Anthropic actually sends back to your site relative to how much content ClaudeBot takes.
- Connect ClaudeBot crawls in your logs with referral sessions in analytics
- Calculate your CRR — the metric prompt testing tools can't provide
- Make data-driven block/allow decisions for every AI bot
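As a worked illustration, the ratio can be computed from two counts taken over the same period. The exact formula is not stated here, so this sketch assumes CRR means ClaudeBot crawl requests divided by referral sessions from Anthropic's products; the function name and the figures are hypothetical.

```python
from typing import Optional


def crawl_to_referral_ratio(crawl_requests: int,
                            referral_sessions: int) -> Optional[float]:
    """Crawl requests per referral session (assumed definition of CRR).

    Returns None when there are no referral sessions, since the ratio is
    undefined in that case.
    """
    if crawl_requests < 0 or referral_sessions < 0:
        raise ValueError("counts must be non-negative")
    if referral_sessions == 0:
        return None
    return crawl_requests / referral_sessions


# Hypothetical monthly figures: 1,200 ClaudeBot requests, 8 referral sessions.
crr = crawl_to_referral_ratio(1200, 8)
print(crr)
```

Under this definition a higher value means more crawling per visit received. Returning None for zero referrals keeps "no referrals at all" distinct from a very large ratio.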
Measure business impact from ClaudeBot
The question isn't just whether to block ClaudeBot — it's what you lose or gain from its crawling activity.
- Crawl volume: how many pages ClaudeBot collects from your site
- Content value: which content categories are targeted most
- Cross-platform CRR: does Anthropic send traffic through other products?
- Referral tracking: track actual visits arriving at your site from Anthropic's products to measure what Anthropic gives back
Based on your live traffic data and analytics — not synthetic prompt tests.
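Crawl volume and content value can be approximated by grouping the paths ClaudeBot requested by site section. The sketch below assumes the first path segment is a reasonable stand-in for a content category, which may not hold for every site; the sample paths and the helper names section_of and crawl_volume_by_section are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def section_of(path):
    """Map a request path to its top-level section, e.g. '/blog/a' -> '/blog'."""
    segments = [part for part in urlsplit(path).path.split("/") if part]
    return "/" + segments[0] if segments else "/"


def crawl_volume_by_section(paths):
    """Count requests per top-level section."""
    return Counter(section_of(path) for path in paths)


# Hypothetical paths requested by ClaudeBot, as extracted from access logs.
paths = ["/blog/a", "/blog/b?ref=x", "/products/widget", "/"]
volume = crawl_volume_by_section(paths)
print(volume.most_common())
```

Sections that dominate the count are the ones to weigh first when deciding whether to restrict access with robots.txt or server-side rules.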
Official Documentation
View Official ClaudeBot Documentation →
Information sourced from official documentation. Content generated with AI assistance.