How to use the Crawler Control tool?
Analyzing a website #
- Start by entering your website URL. You can enter just the domain (example.com) or the full URL (https://example.com).
- Next, choose whether you want to analyze robots.txt or llms.txt.
- Click Analyze or press Enter.
The tool will then fetch the file, check it for errors, list blocked or allowed crawlers, and give clear recommendations on how to improve it.
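As a rough sketch of the first step, a tool like this has to normalize whatever you type into the canonical robots.txt location. The helper below (`robots_url` is a hypothetical name, and defaulting to https is an assumption) shows the idea:

```python
from urllib.parse import urlparse

def robots_url(site: str) -> str:
    """Derive the robots.txt URL from user input, defaulting to https
    when no scheme is given (an assumption for this sketch)."""
    if "://" not in site:
        site = "https://" + site
    parsed = urlparse(site)
    # robots.txt always lives at the root of the host
    return f"{parsed.scheme}://{parsed.netloc}/robots.txt"

print(robots_url("example.com"))               # https://example.com/robots.txt
print(robots_url("https://example.com/blog"))  # https://example.com/robots.txt
```

Both forms of input from the first step above resolve to the same file, which is why either works in the URL field.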
Testing URL paths in robots.txt #
Once a robots.txt file is analyzed, you can test individual URLs.
- Scroll to the Test URL Path section.
- Enter a path such as /about or /admin/login.
- Select a crawler, for example Googlebot or GPTBot.
- Click Test Path.
You’ll instantly see whether that crawler is allowed or blocked and which rule caused the result.
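If you want to reproduce this kind of path test outside the tool, Python's standard library can do the same check. A minimal sketch, using a made-up robots.txt for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only
rules = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Googlebot falls under the "*" group: /about is allowed,
# /admin/login matches "Disallow: /admin/" and is blocked.
print(rp.can_fetch("Googlebot", "/about"))
print(rp.can_fetch("Googlebot", "/admin/login"))

# GPTBot has its own group with "Disallow: /", so everything is blocked.
print(rp.can_fetch("GPTBot", "/about"))
```

This mirrors what the Test Path button reports: the result for each crawler, traced back to the rule group that produced it.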
Understanding the results #
Validation results
- Green checkmarks mean everything is fine.
- Yellow warnings point out non-critical issues that could still be improved.
- Red errors highlight problems that should be fixed.
Common issues include missing sitemap entries, syntax errors like missing colons, or duplicate user-agent definitions.
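For reference, a minimal robots.txt that avoids all three of those common issues might look like this (the sitemap URL is a placeholder):

```
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
```

Each directive has its colon, each user agent appears only once, and the sitemap entry is present.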
Blocked crawlers overview
The tool groups crawlers so results are easy to understand.
- Search engines include Google and Bing. These are usually allowed.
- AI training crawlers include GPTBot, ClaudeBot, and CCBot.
- AI assistant crawlers include ChatGPT-User and PerplexityBot.
- Commerce crawlers include Amazonbot.
Green means allowed, orange means partially blocked, and red means fully blocked.
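To illustrate the middle case, a hypothetical file like the one below would show GPTBot as partially blocked (orange): it can crawl the site in general but not one section.

```
User-agent: GPTBot
Disallow: /private/
```

A crawler with no matching Disallow rules at all would show green, and one with `Disallow: /` would show red.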
Recommendations explained #
Recommendations are grouped into three areas.
Missing elements
These are things your file should include, like a sitemap URL.
SEO issues
These are critical. Blocking search engines or CSS and JavaScript files can break how your site appears in search results.
Security and privacy
These suggestions focus on blocking AI training crawlers if you don’t want your content used for model training.
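As a sketch of what such a recommendation produces, blocking the AI training crawlers named above would look like this in robots.txt:

```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /
```

Each crawler gets its own group with `Disallow: /`, which blocks it from the entire site while leaving all other crawlers unaffected.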