critical · SEO RULE · R15

robots.txt misconfiguration: how to audit and fix it

A broken robots.txt is one of the few SEO problems that can take your site to zero traffic overnight. Audit yours today: it should not Disallow your indexable URLs and must allow CSS, JS, and image folders.

Last updated May 18, 2026 · part of the 53-rule library

What it is

robots.txt is a plain-text file at the root of your domain (yourdomain.com/robots.txt) that tells crawlers which paths they may and may not visit. Well-behaved crawlers fetch it before they request any other URL on your site.
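Because the file always lives at the root, you can work out which robots.txt governs any page from the page URL alone. The sketch below is a minimal illustration using only the Python standard library; the helper name robots_url is hypothetical, and it assumes the usual convention that each scheme, host, and port combination has its own file.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the scheme, host, and port of page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host:port, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://yourdomain.com/blog/post?id=7"))
```

Note that a subdomain such as shop.yourdomain.com resolves to its own robots.txt, so audit each host you serve.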

Why it matters

A single line, Disallow: /, in a User-agent: * group blocks every page on your site from being crawled and, over weeks, from being indexed. Blocking /wp-content/, /assets/, or /static/ paths stops Google from fetching CSS and JavaScript, so it sees a broken layout and may rank the page lower as a result. We see this misconfiguration most often after a staging-to-production deploy that copies the staging robots.txt over the production one.
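To see how much damage that one line does, here is a minimal sketch that compares a broken file with a fixed one using Python's standard urllib.robotparser. The BROKEN and FIXED strings and the /admin/ path are illustrative assumptions, not taken from any real site. The standard-library parser matches rules by simple path prefix and may not mirror Google's wildcard handling, so treat it as a rough check only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Parse robots_txt from a string (no network) and test one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical files: a staging file copied to production, and a fix.
BROKEN = "User-agent: *\nDisallow: /\n"
FIXED = "User-agent: *\nDisallow: /admin/\n"

for path in ("/", "/pricing", "/assets/site.css", "/admin/login"):
    url = "https://yourdomain.com" + path
    print(path, "broken:", is_allowed(BROKEN, url), "fixed:", is_allowed(FIXED, url))
```

With the broken file every path is refused; with the fixed file only the /admin/ prefix is.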

How to fix it

Fetch yourdomain.com/robots.txt directly. Open it in a browser and read every line. If you see Disallow: / on a User-agent: * block, that is the bug.
Allow CSS, JS, and image folders. Google needs to render your page to evaluate it. Blocked CSS/JS folders mean Google sees an unstyled layout. If a broader Disallow is present, add explicit rules such as Allow: /wp-includes/*.js and Allow: /wp-includes/*.css for the asset types you serve.
Reference your sitemap. Add a Sitemap: https://yourdomain.com/sitemap.xml line at the bottom. This is the canonical way to advertise your sitemap to all crawlers, not just Google.
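Test in Search Console. Google has retired the standalone robots.txt Tester; use the robots.txt report to confirm which file Google fetched and whether it parsed cleanly, and the URL Inspection tool to confirm that specific URLs are allowed. Testing real URLs catches subtle wildcard mistakes, such as Disallow: /*?* unintentionally blocking parameterised but valid URLs.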
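The steps above can be roughed out as a small offline audit. This is a sketch, not a replacement for Search Console: the function names (parse_groups, pattern_to_regex, is_allowed, audit) and the SAMPLE file are hypothetical. The matcher assumes the longest-match behaviour described in RFC 9309 and Google's documentation (* matches any run of characters, a trailing $ anchors the end, the longest matching pattern wins, Allow wins a tie), and it only inspects the User-agent: * group.

```python
import re


def parse_groups(robots_txt):
    """Return ({agent: [(directive, pattern), ...]}, [sitemap URLs])."""
    groups, sitemaps = {}, []
    agents, seen_rule = [], False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a rule line closed the previous group
                agents, seen_rule = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            seen_rule = True
            for agent in agents:
                groups[agent].append((field, value))
        elif field == "sitemap":
            sitemaps.append(value)
    return groups, sitemaps


def pattern_to_regex(pattern):
    """Translate a rule pattern: * matches any run, a trailing $ anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern:  # an empty Disallow: blocks nothing
            continue
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


def audit(robots_txt, sample_paths):
    """List findings for the User-agent: * group; paths include any query string."""
    groups, sitemaps = parse_groups(robots_txt)
    rules = groups.get("*", [])
    findings = []
    if ("disallow", "/") in rules:
        findings.append("blanket Disallow: / under User-agent: *")
    findings += [f"blocked: {p}" for p in sample_paths if not is_allowed(rules, p)]
    if not sitemaps:
        findings.append("no Sitemap: line")
    return findings


# Hypothetical file showing the asset and wildcard problems described above.
SAMPLE = """\
User-agent: *
Disallow: /wp-includes/
Allow: /wp-includes/*.js
Disallow: /*?*
"""

paths = ["/wp-includes/js/app.js", "/wp-includes/css/site.css", "/shoes?color=red", "/pricing"]
for finding in audit(SAMPLE, paths):
    print(finding)
```

In this sample the Allow rule rescues the JavaScript file but not the stylesheet, the /*?* rule blocks a valid parameterised URL, and the missing Sitemap: line is flagged. Feed it the paths you actually need indexed and rendered.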

Common false positives

A small site with no sensitive paths legitimately has a near-empty robots.txt. That is fine — empty is not broken.
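A quick way to convince yourself that empty is not broken is to run an empty file, and a near-empty one, through Python's standard urllib.robotparser. This is a minimal sketch; the helper name allowed_by and the NEAR_EMPTY string are hypothetical, and the standard-library parser is only an approximation of how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser


def allowed_by(robots_txt: str, url: str) -> bool:
    """Parse robots_txt from a string and test url for the generic * agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


# An empty Disallow: value blocks nothing.
NEAR_EMPTY = "User-agent: *\nDisallow:\n"

print(allowed_by("", "https://yourdomain.com/pricing"))
print(allowed_by(NEAR_EMPTY, "https://yourdomain.com/pricing"))
```

Both checks report the URL as crawlable, so an audit should not flag either file.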

Authoritative sources

Introduction to robots.txt — Google Search Central
RFC 9309 — Robots Exclusion Protocol — IETF
Google Search Central documentation — Google
Schema.org vocabulary — schema.org
SEO Starter Guide — Google Search Central
MDN — HTML meta and link elements — Mozilla MDN
