404 and 5xx errors: how to find and fix broken pages
A 4xx or 5xx response from a URL you want indexed means it cannot rank, cannot be cited, and cannot pass link equity. Find them, decide what each should become, and fix at the source — server config, redirect, or content restoration.
What it is
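HTTP status codes in the 4xx range (404 Not Found, 410 Gone, 403 Forbidden) signal a client-side error such as a missing or blocked page; codes in the 5xx range (500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable) signal that the server failed to handle the request. Either way, the page is broken or unreachable. Crawlers drop these URLs from the index once the error persists across repeated crawl attempts; users bounce immediately.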
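As a rough illustration of the two ranges, the sketch below labels a status code using Python's standard `http.HTTPStatus` enum. The helper names `describe` and `is_broken` are made up for this example and are not part of any SEO tool.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return a label such as '404 Not Found (client error)'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code is not registered in the standard library's enum.
        phrase = "Unknown"
    if 400 <= code <= 499:
        kind = "client error"
    elif 500 <= code <= 599:
        kind = "server error"
    else:
        kind = "not an error"
    return f"{code} {phrase} ({kind})"


def is_broken(code: int) -> bool:
    """True for any 4xx or 5xx response."""
    return 400 <= code <= 599
```

Calling `describe(503)` gives the full reason phrase, "Service Unavailable", alongside the error class.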
Why it matters
Every 404 on an indexed URL is wasted ranking work — backlinks to that URL no longer count, the page disappears from search results within weeks, and crawl budget gets spent re-checking the dead URL. 5xx errors are worse: persistent server errors cause Google to slow or stop crawling your entire site.
How to fix it
- Identify every broken URL on the site. Crawl with a tool like VectraSEO or Screaming Frog, or start from the Pages report in Search Console. Get the full list of 4xx/5xx responses, including the URLs that link to them.
- Classify each broken page. Three buckets: (a) should still exist → restore content, (b) moved → 301 redirect to the new URL, (c) genuinely gone → return 410 Gone (not 404) and remove internal links.
- Fix at the source, not with a redirect catch-all. Resist the urge to redirect every 404 to the homepage: Google tends to treat a redirect to an irrelevant page as a soft 404 and gives it no credit. Each 301 should go to a topically relevant page.
- Fix 5xx errors immediately. A persistent 5xx will cause Google to crawl your site less. Check server logs, look for memory/CPU spikes, and rule out a misbehaving plugin or runaway worker.
- Set up monitoring. Re-running a crawl once a quarter is too slow. A continuous monitor (daily or weekly) catches new breaks before they cost rankings.
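The find-and-classify steps above can be sketched in a few lines of Python using only the standard library. This is a minimal sketch under stated assumptions, not a replacement for a crawler: `fetch_status` and `plan_fix` are hypothetical helper names, and the `moved_to` mapping and `restore` set stand in for decisions you would make by hand for your own site.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for url (redirects are followed).

    Uses a HEAD request to avoid downloading the body. Some servers answer
    HEAD differently from GET, so treat the result as a first pass.
    Connection failures (DNS, timeout) raise urllib.error.URLError.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is the status.
        return err.code


def plan_fix(url: str, status: int,
             moved_to: dict[str, str], restore: set[str]) -> str:
    """Map one crawl result to the action described in the list above.

    moved_to and restore are assumptions: hand-maintained inputs that
    record which URLs moved and which should come back.
    """
    if status < 400:
        return "ok"
    if status >= 500:
        return "investigate server"
    if url in restore:
        return "restore content"
    if url in moved_to:
        return f"301 redirect to {moved_to[url]}"
    return "return 410 and remove internal links"
```

For monitoring, a scheduled job could call `fetch_status` on every URL in the sitemap daily or weekly and pass each result through `plan_fix`, alerting on anything that is not "ok".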
Authoritative sources
- HTTP status codes and what they mean for Google Search — Google Search Central
- RFC 9110 — HTTP semantics (status codes) — IETF
- Google Search Central documentation — Google
- Schema.org vocabulary — schema.org
- SEO Starter Guide — Google Search Central
- MDN — HTML meta and link elements — Mozilla MDN