Robots.txt Tester: Are ChatGPT & Google Blocked?
Fetch your live robots.txt or paste it. We flag the 9 common mistakes that quietly deindex pages: wildcard disallows, conflicting rules, missing sitemaps, and ignored crawl-delays.
VectraSEO's free Robots.txt Tester fetches your live robots.txt (or any pasted file) and flags nine misconfigurations that quietly deindex pages, including blanket `Disallow: /`, conflicting user-agent rules, missing sitemap directives, ignored crawl-delays, accidental noindex aliases, syntax errors, and case-sensitivity traps. It returns a rule-by-rule diagnosis with the exact fix and an explanation of how each major crawler (Googlebot, Bingbot, ChatGPT-User, PerplexityBot, ClaudeBot) will read it. No signup, and it runs in seconds.
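To make a few of these checks concrete, here is a minimal Python sketch of a robots.txt linter. It is not the tester's actual implementation; the function name `lint_robots`, the `KNOWN_FIELDS` set, and the message wording are all hypothetical, and it covers only four of the checks (blanket disallow, crawl-delay, unknown directives such as `Noindex`, and a missing sitemap).

```python
# Hypothetical sketch: a tiny robots.txt linter covering four common mistakes.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}

def lint_robots(text):
    """Return a list of (line_number, message) findings for a robots.txt string."""
    findings = []
    group_agents = []      # user-agents named by the group currently being read
    in_agent_run = False   # True while reading consecutive User-agent lines
    has_sitemap = False

    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            findings.append((number, "syntax error: missing ':'"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()                 # field names are case-insensitive

        if field == "user-agent":
            if not in_agent_run:
                group_agents = []             # a new group starts here
            group_agents.append(value)
            in_agent_run = True
            continue
        if field == "sitemap":
            has_sitemap = True                # Sitemap lines belong to no group
            continue

        in_agent_run = False                  # any other line ends the agent run
        if field not in KNOWN_FIELDS:
            findings.append((number, f"unknown directive: {field}"))
        elif field == "crawl-delay":
            findings.append((number, "crawl-delay is ignored by Googlebot"))
        elif field == "disallow" and value == "/" and "*" in group_agents:
            findings.append((number, "blanket Disallow: / for all crawlers"))

    if not has_sitemap:
        findings.append((0, "no Sitemap directive found"))  # 0 = whole file
    return findings

SAMPLE = """\
User-agent: *
Disallow: /
Crawl-delay: 10
Noindex: /old/
"""

for number, message in lint_robots(SAMPLE):
    print(number, message)
```

On this sample the sketch reports findings for lines 2, 3, and 4, plus a file-level finding for the missing sitemap. A real checker would also need to handle conflicting rules and path case-sensitivity, which are left out here for brevity.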
Frequently asked questions
- 01. What does robots.txt actually do?
- It tells crawlers (Googlebot, Bingbot, ChatGPT-User, etc.) which paths they may or may not request. It is a polite request, not enforcement: well-behaved bots obey it; abusive scrapers ignore it. Use `noindex` meta tags or HTTP auth for content that must be hidden from search engines.
- 02. Where should robots.txt live?
- Always at the root: `https://example.com/robots.txt`. Google ignores a robots.txt placed in a subdirectory (`/blog/robots.txt`), and the file on the main domain does not apply to subdomains. Each subdomain needs its own robots.txt.
- 03. Will `Disallow: /` deindex my whole site?
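- Eventually, yes. Once Googlebot can no longer crawl the URLs, they typically drop from the index over a period of weeks, although URLs that other sites link to can linger as bare listings with no description. To hide URLs quickly, use the Removals tool in Search Console (formerly "Remove URLs"); note that it hides them only temporarily. Be especially careful with staging or pre-launch sites that get pushed to production with the staging robots.txt still in place.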
- 04. Should I list my sitemap in robots.txt?
- Yes. Add `Sitemap: https://example.com/sitemap.xml` on its own line. It costs nothing, helps Bing and DuckDuckGo discover the sitemap, and doubles as a self-documentation breadcrumb for the next person who edits the file.
- 05. Does ChatGPT obey robots.txt?
- OpenAI's grounding crawler honours the standard. Block it explicitly with a `User-agent: ChatGPT-User` group if you want to opt out of being cited in ChatGPT answers, or with a `User-agent: GPTBot` group to opt out of model training. Most other AI vendors (Perplexity, Anthropic, Google with its Google-Extended token) publish their own user-agent names; block them the same way.
- 06. Why did Google still index a page I disallowed?
- Disallow stops Google from crawling, not from indexing. If other sites link to a blocked URL, Google can list the URL with no description ("blocked by robots.txt"). To remove the page from search entirely, allow crawling and add `<meta name="robots" content="noindex">` on the page itself.
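The per-crawler behaviour described in the answers above can be checked locally. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the sample file, the helper name `check_access`, and the example URLs are assumptions made for illustration. Note that Python's parser applies the first matching rule in a group, whereas Google uses the most specific (longest) matching rule, so results can differ when `Allow` and `Disallow` rules overlap.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opts out of GPTBot entirely, hides /private/
# from ChatGPT-User, and hides /admin/ from every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /private/

User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

def check_access(robots_txt, agents, url):
    """Map each user-agent name to True/False: may it fetch `url`?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())   # parse pasted text, no network call
    return {agent: parser.can_fetch(agent, url) for agent in agents}

result = check_access(
    ROBOTS_TXT,
    ["Googlebot", "GPTBot", "ChatGPT-User"],
    "https://example.com/blog/post",
)
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

In this sample, Googlebot falls through to the `*` group and may fetch the blog post, GPTBot is blocked everywhere, and ChatGPT-User is blocked only under `/private/`. To test a live file instead of pasted text, `RobotFileParser` also offers `set_url()` followed by `read()`.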
- Introduction to robots.txt — Google Search Central
- RFC 9309 — Robots Exclusion Protocol — IETF
- Bing — How to create and submit a robots.txt file — Bing Webmaster
- Google Search Central documentation — Google
- SEO Starter Guide — Google Search Central
- Schema.org vocabulary — schema.org