
Crawl Errors


By Shahid Maqbool
On Mar 31, 2023


What are Crawl Errors?

Crawl errors happen when a user, Googlebot, or another search engine bot attempts to access a webpage on your site but is unsuccessful. These errors block search engines from properly indexing your content.

Without indexing, your pages won't appear in search results at all, which means zero chance of ranking.

Types of crawl errors

Google Search Console categorizes crawl errors into two major types:

  • Site Errors – Impact the entire website.

  • URL Errors – Affect specific pages.

Site Errors

Site errors prevent search engines from accessing your entire domain. These are critical and should be addressed immediately.

Google identifies three main types:

DNS Errors

The DNS (Domain Name System) translates your website’s domain name into an IP address that servers use to locate it. DNS errors occur when a search engine bot can’t connect to your site due to a DNS timeout or lookup failure.

Although Google may still manage to reach your site intermittently, persistent DNS issues can severely hinder crawling and indexing.

Common DNS errors include:

  • DNS Timeout: The DNS server takes too long to respond.

  • DNS Lookup Failure: The server fails to locate your domain.

Fixing DNS Errors:

  1. Use the URL Inspection tool in Search Console (the successor to the retired “Fetch as Google” feature) to test how Google accesses your site.

    • Inspect the URL to see whether Google could fetch it on its last crawl.

    • Run a live test (“Test Live URL”) to check DNS resolution and see how Google renders the page.

    • If the live test fails, the issue may lie with your DNS provider. Contact your domain host for support.
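
Outside Search Console, a quick lookup from a script can confirm whether your domain resolves at all. Below is a minimal sketch using Python's standard library; the hostname is a placeholder you would swap for your own domain.

import socket

def check_dns(hostname: str) -> None:
    """Try to resolve a hostname and report the outcome."""
    try:
        # getaddrinfo performs the same DNS lookup a crawler's HTTP
        # client must complete before it can connect to your server.
        results = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
        addresses = sorted({r[4][0] for r in results})
        print(f"{hostname} resolves to: {', '.join(addresses)}")
    except socket.gaierror as err:
        # A failure here mirrors the "DNS Lookup Failure" Googlebot reports.
        print(f"DNS lookup failed for {hostname}: {err}")

check_dns("example.com")  # replace with your own domain

If this script fails while your browser still loads the site, your DNS records may be inconsistent across nameservers, which is worth raising with your DNS provider.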

Server Errors

These errors happen when your server fails to respond to Googlebot within its allocated crawl time.

Unlike DNS errors (where the server is unreachable), server errors occur after a connection is made, but the page then fails to load. Common server errors include:

  • Timeout: Server takes too long to respond.

  • Truncated Headers: Server response is incomplete.

  • Connection Reset: Connection is dropped mid-process.

  • Truncated Response: Incomplete data sent to Googlebot.

  • Connection Refused: Server denies the connection request.

  • Connect Failed: The server is down or unreachable.

  • Connect Timeout: Googlebot cannot connect within the given time.

  • No Response: Server does not respond at all.

Fixing Server Errors:

  1. Use "Fetch as Google" to confirm whether the homepage and other pages load correctly.

  2. Investigate server performance and coding issues with your dev team.

  3. If issues persist, upgrade your server or consider switching hosts.
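
Before escalating to your host, you can reproduce what a crawler sees with a simple scripted request. The sketch below uses only Python's standard library; the URL is a placeholder.

import urllib.error
import urllib.request

def check_server(url: str, timeout: float = 10.0) -> None:
    """Request a URL and report server-side problems."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            print(f"{url} -> HTTP {resp.status}")
    except urllib.error.HTTPError as err:
        # 5xx responses here correspond to the server errors Googlebot reports.
        print(f"{url} -> HTTP {err.code} ({err.reason})")
    except urllib.error.URLError as err:
        # Refused, reset, or timed-out connections all surface here.
        print(f"{url} -> connection problem: {err.reason}")

check_server("https://example.com/")  # replace with your homepage

Intermittent failures under load often point to resource limits rather than code bugs, which is when a server upgrade or a new host becomes the pragmatic fix.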

Robots Failure

A misconfigured robots.txt file can block Googlebot from crawling your site.

This file tells search engines which pages they’re allowed to crawl. It's placed in your site’s root directory.

If misused, it can stop all bots from accessing your content.

Example of blocking all bots:

User-agent: *
Disallow: /

To allow full crawl access:

User-agent: *
Disallow:

Fixing Robots.txt Issues:

  1. Review the file for incorrect directives.

  2. Use Search Console to confirm Googlebot access.

  3. Avoid including a robots.txt file if you don't need to restrict bots.

  4. Consult an expert if you're unsure about the configuration.

  5. Contact your hosting provider if the issue persists after correcting the file.
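
One quick way to audit the file is to test it with the robots.txt parser in Python's standard library. This is a minimal sketch; the domain and paths are placeholders, and Google's own parser may differ in edge cases.

from urllib.robotparser import RobotFileParser

# Fetch and parse the live robots.txt file.
robots = RobotFileParser("https://example.com/robots.txt")
robots.read()

# Check a few representative paths against Googlebot's rules.
for path in ("/", "/blog/", "/private/"):
    url = f"https://example.com{path}"
    allowed = robots.can_fetch("Googlebot", url)
    print(f"Googlebot {'may' if allowed else 'may NOT'} crawl {path}")

Search Console's robots.txt testing tools remain the authoritative check, since they use Google's actual parser.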

URL Errors

URL errors affect individual pages rather than the entire website. These issues prevent search engines from crawling specific URLs.

Google Search Console’s Page indexing report (formerly the Coverage report) lists all such pages, grouped by the reason they aren’t indexed.

Soft 404

A Soft 404 occurs when a page returns a 200 (success) status, even though it should show a 404 (not found) error.

Common causes:

  • Pages with no meaningful content

  • Redirects to irrelevant pages

  • Deleted pages not returning proper 404 status

Fixing Soft 404s:

  1. Add valuable content to thin pages.

  2. Use 301 redirects to relevant pages—not the homepage.

  3. Return correct status codes (404 or 410) for removed pages.
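
You can screen suspect URLs for soft 404s with a short script: fetch each page and flag any that return 200 while the body reads like an error page. A rough sketch follows; the "not found" phrases are assumptions you would tune to your own site's templates.

import urllib.error
import urllib.request

# Phrases that suggest an error page; adjust for your site's templates.
SOFT_404_HINTS = ("page not found", "no longer available", "nothing here")

def looks_like_soft_404(url: str) -> bool:
    """Flag pages that return 200 but whose content says 'not found'."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            if resp.status != 200:
                return False  # a real error status is not a *soft* 404
            body = resp.read(65536).decode("utf-8", errors="replace").lower()
    except urllib.error.URLError:
        return False
    return any(hint in body for hint in SOFT_404_HINTS)

print(looks_like_soft_404("https://example.com/deleted-page"))  # placeholder URL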

404 (Not found)

This error occurs when Googlebot can’t find a specific page.

While 404s for low-priority pages are not harmful, frequent 404s for important pages can hurt user experience and SEO.

Fixing 404 Errors:

  1. Use Search Console to locate broken internal/external links.

  2. For internal links, update or remove the link.

  3. For external ones, apply a 301 redirect or contact the referring site.

  4. Republish or restore deleted content if needed.
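
To find broken internal links at scale, a small crawler is enough: collect every anchor on a page and check each internal target's status code. Here is a minimal sketch using Python's standard library; the starting URL is a placeholder, and a real audit would also respect robots.txt and rate limits.

from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import urllib.error
import urllib.request

class LinkCollector(HTMLParser):
    """Gather href values from every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def find_broken_links(page_url: str) -> None:
    html = urllib.request.urlopen(page_url, timeout=10).read()
    collector = LinkCollector()
    collector.feed(html.decode("utf-8", errors="replace"))
    site = urlparse(page_url).netloc
    for href in collector.links:
        url = urljoin(page_url, href)
        if urlparse(url).netloc != site:
            continue  # only audit internal links here
        try:
            urllib.request.urlopen(urllib.request.Request(url, method="HEAD"), timeout=10)
        except urllib.error.HTTPError as err:
            if err.code in (404, 410):
                print(f"Broken link on {page_url}: {url} ({err.code})")
        except urllib.error.URLError:
            print(f"Unreachable: {url}")

find_broken_links("https://example.com/")  # placeholder starting page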

Access Denied

These occur when bots are blocked from accessing a URL. Common reasons include:

  • Pages requiring login

  • Hosting providers blocking Googlebot

  • Misconfigured permissions

Fixing Access Denied Errors:

  1. Remove authentication requirements for public pages.

  2. Check the robots.txt file for accidental blocks.

  3. Ensure your server/firewall doesn’t restrict bots.

  4. Use Search Console to verify how Google views the page.
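
A quick way to spot user-agent-based blocking is to request the same URL with a browser user-agent and with Googlebot's, then compare status codes. A sketch follows; note that some firewalls verify the real Googlebot by IP address, so a clean result here does not rule out blocking of the actual crawler.

import urllib.error
import urllib.request

def status_for(url: str, user_agent: str) -> int:
    """Return the HTTP status the server gives this user-agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

url = "https://example.com/some-page"  # placeholder
browser_ua = "Mozilla/5.0"
googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print("browser UA:  ", status_for(url, browser_ua))
print("googlebot UA:", status_for(url, googlebot_ua))
# 200 for the browser but 403 for the Googlebot string suggests a
# firewall or security plugin is blocking crawlers.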

Not Followed

These errors mean Googlebot could not follow a URL through to its final destination.

Common causes:
  • JavaScript-heavy navigation

  • Flash, cookies, or session-based redirects

  • Redirect loops or chains

  • Improper redirect formats

Fixing Not Followed Errors:

  1. Use the URL Inspection tool’s live test in Google Search Console (formerly “Fetch and Render”) to see the page as Googlebot renders it.

  2. Avoid complex JS or Flash for essential navigation.

  3. Limit use of URL parameters and ensure proper redirects.

  4. Replace redirect-dependent architecture with static HTML links.
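
Redirect chains and loops are easiest to diagnose hop by hop. The sketch below disables urllib's automatic redirect handling so each intermediate status and Location header is visible; the starting URL is a placeholder.

from urllib.parse import urljoin
import urllib.error
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    # Returning None tells urllib not to follow redirects itself,
    # so each 3xx surfaces as an HTTPError we can inspect.
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

opener = urllib.request.build_opener(NoRedirect)

def trace_redirects(url: str, max_hops: int = 10) -> None:
    seen = set()
    for _ in range(max_hops):
        if url in seen:
            print(f"Redirect loop detected at {url}")
            return
        seen.add(url)
        try:
            resp = opener.open(url, timeout=10)
            print(f"{resp.status} {url} (final destination)")
            return
        except urllib.error.HTTPError as err:
            location = err.headers.get("Location")
            if err.code in (301, 302, 303, 307, 308) and location:
                target = urljoin(url, location)
                print(f"{err.code} {url} -> {target}")
                url = target
            else:
                print(f"{err.code} {url}")
                return
    print(f"Gave up after {max_hops} hops; the chain is too long.")

trace_redirects("https://example.com/old-page")  # placeholder

Each extra hop costs crawl budget, so collapsing chains into a single 301 to the final URL is the usual fix.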

Server Errors & DNS Errors

Just like site-level errors, individual URLs can suffer from server or DNS issues.

Fixing These Errors:

  1. Check your server's response times for the affected URL.

  2. Troubleshoot your DNS configuration if timeouts or lookup failures occur.
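
To check step 1 without direct server access, timing a few requests from a script gives a rough baseline. This is a minimal sketch; the URL is a placeholder, and the numbers will vary with your network location.

import time
import urllib.request

def time_url(url: str, attempts: int = 3) -> None:
    """Measure how long the server takes to answer a request."""
    for i in range(attempts):
        start = time.perf_counter()
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                resp.read(1024)  # the first bytes are enough for a timing sample
                elapsed = time.perf_counter() - start
                print(f"attempt {i + 1}: HTTP {resp.status} in {elapsed:.2f}s")
        except Exception as err:
            print(f"attempt {i + 1}: failed ({err})")

time_url("https://example.com/affected-page")  # placeholder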

Some specific URL errors

Some errors only affect certain types of websites:

Mobile Specific Errors

These occur if your site isn't mobile-friendly or contains faulty redirects (common in m.example.com setups).

Fixing Mobile Errors:

  • Implement responsive design.

  • Ensure proper redirection for mobile users.

  • Use Google's Mobile-Friendly Test for insights.
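
For m-dot setups, the classic faulty redirect sends every mobile visitor to the mobile homepage instead of the equivalent deep page. The sketch below requests a deep URL with a mobile user-agent and flags that pattern; the user-agent string and URL are placeholders, and the homepage heuristic is an assumption you would adapt to your redirect rules.

from urllib.parse import urlparse
import urllib.request

MOBILE_UA = ("Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/120.0 Mobile Safari/537.36")

def check_mobile_redirect(desktop_url: str) -> None:
    """Report where a mobile visitor ends up for a given desktop URL."""
    req = urllib.request.Request(desktop_url, headers={"User-Agent": MOBILE_UA})
    with urllib.request.urlopen(req, timeout=10) as resp:
        final = resp.geturl()  # urllib follows redirects automatically
    deep_page = urlparse(desktop_url).path not in ("", "/")
    landed_home = urlparse(final).path in ("", "/")
    if deep_page and landed_home:
        print(f"Suspicious faulty redirect: {desktop_url} -> {final}")
    else:
        print(f"OK: {desktop_url} -> {final}")

check_mobile_redirect("https://example.com/blog/some-post")  # placeholder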

Malware Errors

These happen when Google detects harmful content on your site, such as malicious scripts or phishing attempts.

Fixing Malware Errors:

  • Scan your site for infections using security tools.

  • Remove malware or malicious code.

  • Request a review in Search Console once clean.

Google News Errors

Sites included in Google News may see crawl issues due to formatting or policy violations.

Fixing Google News Errors:

  • Ensure proper headline formatting.

  • Follow Google News content and technical guidelines.

  • Use Search Console to identify specific problems.

Conclusion

Crawl errors can originate from your hosting environment, CMS, server configuration, or even security plugins.

Some issues require urgent fixes to avoid losing search engine visibility. Regularly monitor your site using tools like:

  • Google Search Console

  • Screaming Frog

  • Sitebulb

  • Netpeak Spider

Staying proactive with audits and fixes ensures better indexing, visibility, and overall SEO health.
