Question 1

What does robots.txt actually do?

Accepted Answer

It tells crawlers (Googlebot, Bingbot, ChatGPT-User, etc.) which paths they may or may not request. It is a polite request, not enforcement: well-behaved bots obey it; abusive scrapers ignore it. It also governs crawling rather than indexing, so use a noindex meta tag for content that must stay out of search results and HTTP auth for content that must stay private.
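A minimal sketch of how a well-behaved crawler might consult the rules, using Python's standard urllib.robotparser. The sample rules and URLs are made up for illustration, and Python's matching is simpler than Google's (no wildcard support), so treat the results as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not taken from any real site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Googlebot has its own group, so only that group's rules apply to it.
print(parser.can_fetch("Googlebot", "https://example.com/drafts/post"))   # False
print(parser.can_fetch("Googlebot", "https://example.com/private/page"))  # True

# Bingbot has no dedicated group and falls back to the "*" group.
print(parser.can_fetch("Bingbot", "https://example.com/private/page"))    # False
```

Nothing here stops a client from requesting /private/ anyway; the check only works because the crawler chooses to run it.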

Question 2

Where should robots.txt live?

Accepted Answer

Always at the root of the host: https://example.com/robots.txt. A robots.txt in a subdirectory (/blog/robots.txt) is ignored by Google. The file applies only to the protocol and host it is served from, so a subdomain does not inherit the main domain's rules. Each subdomain needs its own robots.txt.
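As a rough illustration of the "one file per host" rule, the sketch below derives the robots.txt location that would govern a given page URL. The helper name robots_txt_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the scheme and host of page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post?id=1"))
# https://example.com/robots.txt
print(robots_txt_url("https://shop.example.com/cart"))
# https://shop.example.com/robots.txt
```

The second call shows why a subdomain needs its own file: it resolves to a different robots.txt than the main domain.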

Question 3

Will Disallow: / deindex my whole site?

Accepted Answer

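Mostly, but not reliably. Once Googlebot can no longer crawl the URLs, most of them lose their snippets and drop from the index over weeks, but URLs that other sites link to can stay listed without a description. To hide URLs quickly, use the Removals tool in Search Console (formerly "Remove URLs"); it is temporary, lasting about six months, so pair it with noindex or deletion for a permanent result. Be especially careful with staging or pre-launch sites that get pushed to production with the staging robots.txt still in place.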
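One way to catch a staging robots.txt before it reaches production is a pre-deploy check along these lines. This is only a sketch: the function name blocks_entire_site is hypothetical, and testing the root path is a heuristic that will not notice narrower mistakes.

```python
from urllib.robotparser import RobotFileParser


def blocks_entire_site(robots_txt: str, user_agent: str = "Googlebot") -> bool:
    """Heuristic: True if user_agent may not fetch the root path "/"."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


# Hypothetical file contents for illustration.
STAGING_ROBOTS_TXT = "User-agent: *\nDisallow: /\n"
PRODUCTION_ROBOTS_TXT = "User-agent: *\nDisallow: /admin/\n"

print(blocks_entire_site(STAGING_ROBOTS_TXT))     # True
print(blocks_entire_site(PRODUCTION_ROBOTS_TXT))  # False
```

A deploy script could refuse to publish when the check returns True for the production build.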

Question 4

Should I list my sitemap in robots.txt?

Accepted Answer

Yes. Add Sitemap: https://example.com/sitemap.xml on its own line, using the full absolute URL. The line is independent of any User-agent group, so every crawler that reads the file can see it. It costs nothing, helps Bing and DuckDuckGo discover the sitemap, and doubles as documentation for the next person who edits the file.
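To confirm that the Sitemap line is being picked up, a quick check with Python's urllib.robotparser might look like the sketch below. It assumes Python 3.9 or later, and the sample file contents are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file contents for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""


def list_sitemaps(robots_txt: str) -> list[str]:
    """Return every Sitemap URL declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(list_sitemaps(SAMPLE_ROBOTS_TXT))
# ['https://example.com/sitemap.xml']
```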

Question 5

Does ChatGPT obey robots.txt?

Accepted Answer

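OpenAI's grounding crawler honours the standard. Block it explicitly with a User-agent: ChatGPT-User block if you want to opt out of being cited in ChatGPT answers, or with GPTBot to opt out of model training. OpenAI also documents a separate OAI-SearchBot agent for its search features, so check its current crawler documentation before relying on a single token. Most other AI providers publish their own user-agent names (for example PerplexityBot, Anthropic's ClaudeBot, and Google's Google-Extended token); block them the same way.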
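The sketch below checks which AI user agents a given robots.txt blocks. GPTBot and ChatGPT-User come from the answer above; the other tokens (PerplexityBot, ClaudeBot, Google-Extended) are assumed to be the names those vendors currently publish and should be verified against their documentation. The sample rules are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed token names; confirm each against the vendor's own documentation.
AI_AGENTS = ["GPTBot", "ChatGPT-User", "PerplexityBot", "ClaudeBot", "Google-Extended"]

# Hypothetical rules: opt out of OpenAI's agents, allow everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: *
Disallow:
"""


def blocked_agents(robots_txt: str, agents: list[str], path: str = "/") -> dict[str, bool]:
    """Map each agent name to True if it may NOT fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, path) for agent in agents}


for agent, blocked in blocked_agents(SAMPLE_ROBOTS_TXT, AI_AGENTS).items():
    print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

With the sample rules, only GPTBot and ChatGPT-User are reported as blocked; the remaining agents fall through to the permissive "*" group.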

Question 6

Why did Google still index a page I disallowed?

Accepted Answer

Disallow stops Google from crawling, not from indexing. If other sites link to a blocked URL, Google can list the URL with no description ("blocked by robots.txt"). To remove the page from search entirely, allow crawling and add <meta name="robots" content="noindex"> on the page itself. Google has to fetch the page to see the noindex, so the tag has no effect while the URL is still disallowed.
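To verify that a page actually carries the noindex directive, a small check with Python's built-in html.parser could look like this. It is a sketch only: the class and function names are hypothetical, and it inspects just the generic robots meta tag, not crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for token in attr_map.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tag includes noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))  # True
```

For the removal to work, the same URL must also be crawlable, so this check is best paired with a robots.txt test for that URL.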

Robots.txt Tester: Is ChatGPT & Google Blocked?

Frequently asked questions