Index Coverage Report

By Shahid Maqbool
On Apr 5, 2023

What is an Index Coverage Report?

The Index Coverage Report is a report in Google Search Console that details the indexing status of all the URLs on your website.

It shows which of your URLs are indexed and flags the issues keeping others out, so you can address those problems promptly and make your pages indexable.

The Index Coverage Report explained

To get the report for your website, log in to Google Search Console and select your verified property (website).

You will find the report under Coverage in the Index section of the left-hand menu of GSC.

This report will show four sections or status categories. Each one is explained below.

Valid

The URLs that appear under this section are indexed correctly and need no further action.

Valid with warnings

These are valid pages indexed by Google, but they contain issues that should be reviewed and resolved.

Error 

The URLs that appear under this category contain critical errors that prevent them from getting indexed by Google and appearing in search results.

Google assumes you want these pages indexed, but errors are preventing that.

Excluded 

These URLs are excluded from indexing, and Google assumes this is intentional, for example because you never submitted them for indexing.

Google does not index URLs in the Error and Excluded categories, so review both promptly to make sure nothing important is being left out.

Now we will discuss each category and the possible issues a website may face. We will not cover the Valid URLs, as these are already indexed and have no issues.

Valid with warnings

Issues under this category can appear in two forms:

Indexed, though blocked by robots.txt

This indicates that Google has indexed your URL even though you have blocked it in robots.txt.

Normally, Google avoids indexing such URLs, but in this case it may have discovered links to them from other sources and indexed them anyway.

Action required: Download the list of all URLs that appear under this category. Check which URLs you want to keep as indexed and which you don’t.

Keep in mind that robots.txt alone cannot deindex a page: Google must be able to crawl a page to see a noindex directive. For the URLs you don't want indexed, remove the robots.txt block and add a noindex meta tag (or X-Robots-Tag header) instead.

Then go to the Index Coverage Report in GSC and click Validate fix to ask Google to re-evaluate these URLs.
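For illustration, here is a minimal sketch of the two pieces involved; the /private/ path and the page are hypothetical examples:

    # robots.txt: do NOT keep blocking a page you want deindexed, or
    # Googlebot will never crawl it and see the noindex directive
    User-agent: *
    # Disallow: /private/   <- remove this rule while deindexing

    <!-- meanwhile, in the <head> of the page to be removed from the index -->
    <meta name="robots" content="noindex">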

Indexed without content

You will see this warning when Google has indexed your pages, but they appear to contain no content. This may happen when:

  • The pages have little to no content

  • The content is cloaked (served to users but not to Google)

  • Resources are blocked by robots.txt or meta tags, so Google cannot render the page content properly

  • The content structure and format are poor

Action required: Check the error pages to see how they look. Use the URL Inspection Tool of GSC to see how these URLs look to Google.

If these pages contain little content, fix the issue by adding more valuable content to them.

If there is a problem with page structure, add appropriate links to and from these pages. Also make sure robots.txt allows Google to crawl these URLs.

After doing this, select Request Indexing. Google will re-crawl this URL, and if everything is fine, it will index it properly.
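If you want to spot-check a page outside of GSC, one rough approach is to compare what a normal browser and a Googlebot-like client receive. A minimal Python sketch, where the URL and the 50% threshold are arbitrary placeholders:

    import requests

    # Fetch the same URL with a browser-like and a Googlebot-like
    # User-Agent and compare the response sizes; a large gap can hint
    # at cloaking or content served only to real browsers.
    url = "https://example.com/some-page"
    browser = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10).text
    bot = requests.get(
        url,
        headers={"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
        timeout=10,
    ).text

    print(f"Browser sees {len(browser)} characters, bot sees {len(bot)}")
    if len(bot) < 0.5 * len(browser):
        print("Googlebot receives much less content - possible cloaking")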

Error

The issues that come under the Error category are as follows:

Redirect error

Google can’t crawl the URLs because of redirect errors that can happen due to:

  • Redirect chains

  • Redirect loops

  • An empty URL in the redirect chain

  • A redirect URL that is too long

Action required: Investigate the exact cause behind these errors, remove the broken hops, and point each old URL at its final destination with a single, correct redirect.
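To see exactly where a chain or loop occurs, you can follow redirects one hop at a time. A minimal Python sketch using the requests library; the starting URL is a placeholder:

    import requests

    # Follow redirects manually, one hop at a time, printing each status
    # code and URL so chains and loops become visible.
    url = "https://example.com/old-page"
    seen = set()
    for _ in range(10):  # give up after 10 hops
        if url in seen:
            print("Redirect loop detected at", url)
            break
        seen.add(url)
        resp = requests.get(url, allow_redirects=False, timeout=10)
        print(resp.status_code, url)
        if resp.status_code not in (301, 302, 303, 307, 308):
            break  # final destination (or an error) reached
        url = requests.compat.urljoin(url, resp.headers["Location"])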

Server error (5xx)

When Google requests the page, the server returns a 5xx error, which stops Google from accessing and crawling it. Google may also abandon the request because it timed out or the website was too busy.

Action required: Investigate what causes the 5xx server errors. These errors are sometimes temporary and occur when the website is busy.

Google will have difficulty crawling and indexing the pages with server errors, so you must keep an eye on them.

Contact your server administrator or look for recent changes to your site. See how Google suggests fixing server errors.
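One way to keep an eye on them is to check the status code of every URL in your sitemap in bulk. A minimal Python sketch; the sitemap URL is a placeholder:

    import requests
    import xml.etree.ElementTree as ET

    # Fetch an XML sitemap and report every URL that answers with an
    # error status (4xx/5xx), so intermittent server errors surface.
    sitemap_xml = requests.get("https://example.com/sitemap.xml", timeout=10).text
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    urls = [loc.text for loc in ET.fromstring(sitemap_xml).findall(".//sm:loc", ns)]

    for url in urls:
        status = requests.get(url, timeout=10).status_code
        if status >= 400:
            print(status, url)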

URL blocked by robots.txt

This indicates that the page is blocked by the robots.txt file. If you have blocked specific URLs in robots.txt, Google will usually not index them, but that isn't guaranteed: it may still index them if it finds the URLs through another source.

Action required: You can verify the blocked pages using the robots.txt tester. If you want certain pages removed from the index, unblock them and use a noindex tag instead, since Google must be able to crawl a page to see the tag.
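Besides the tester in GSC, Python's standard library can answer the same question; a minimal sketch with placeholder URLs:

    from urllib.robotparser import RobotFileParser

    # Parse a site's robots.txt and ask whether Googlebot may fetch a URL.
    rp = RobotFileParser("https://example.com/robots.txt")
    rp.read()

    url = "https://example.com/blocked-page"
    print("Googlebot allowed:", rp.can_fetch("Googlebot", url))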

URL blocked due to other 4xx issues

These are client-side (4xx) errors, other than the ones covered separately below, that prevent specific URLs from being crawled and indexed. The problem lies with the request for the page, not with the server itself.

Action required: Investigate and debug the URLs using the URL Inspection Tool of Google Search Console. Fix the issue; otherwise, remove the URLs from the sitemap.

URL has a crawl issue

You have added a URL to an XML sitemap, but Google cannot crawl it. These issues are usually temporary.

Action required: Use the URL Inspection Tool to fetch these URLs. If it reveals no particular issue, just wait, as the problem is usually temporary and will resolve on its own.

URL marked ‘noindex’

This happens when you have assigned a noindex directive to a particular URL. Google tried to index it but could not because of the directive. If you have done this intentionally, it is fine; if you haven't, remove the noindex tag.

Action required: Remove the URLs from the XML sitemap that you do not want to index. If a URL is essential and shows this error, remove the noindex tag.
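For reference, a noindex directive can be delivered either in the page's HTML or as an HTTP response header (useful for non-HTML files such as PDFs):

    <!-- in the page's <head> -->
    <meta name="robots" content="noindex">

    # or, sent as an HTTP response header by the server:
    X-Robots-Tag: noindex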

URL not found (404)

In this case, the page returns a 404 error. You have submitted this URL in your sitemap, but the page does not exist.

Action required: If important pages show the 404 error, 301-redirect them to relevant working resources; otherwise, remove them from the sitemap and fix whatever points to them, such as internal links.
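As an example, on an Apache server a permanent redirect can be set up in .htaccess; both paths here are hypothetical:

    # .htaccess (Apache): permanently redirect a removed page
    # to its closest working replacement
    Redirect 301 /old-page https://example.com/new-page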

URL seems to be a Soft 404

In this case, you have submitted a URL through an XML sitemap, and it returns a 200 (OK) status code while displaying what looks like a 404 not found page.

Action required: If these pages are actually error pages, make sure they return a proper 404 Not Found status instead of a soft 404 (a 200).

If they are not, make sure their content reflects that: provide enough quality content on these pages and then request reindexing.
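A rough way to hunt for soft 404s is to flag URLs that answer 200 but whose body looks like an error page. A minimal Python sketch; the URL and the list of phrases are placeholders:

    import requests

    # A soft 404 returns 200 OK while showing error-page content.
    url = "https://example.com/maybe-missing"
    resp = requests.get(url, timeout=10)
    body = resp.text.lower()

    error_phrases = ("not found", "page does not exist", "no longer available")
    if resp.status_code == 200 and any(p in body for p in error_phrases):
        print("Possible soft 404: 200 status but error-page content")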

URL returned 403

This client-side error means Google does not have permission to access the page. The server understands the request but refuses to fulfil it because the necessary credentials were not provided.

Action required: Remove these URLs from the XML sitemap, or, if they need to be indexed, make them available without access restrictions, since Googlebot never supplies credentials.

URL returns unauthorized request (401)

This happens when you have submitted particular URLs in your sitemap, but they require authentication that Google cannot provide, so the server returns a 401 HTTP response.

Action required: If you have done this intentionally, it is fine; if you haven't, either remove the authentication requirement for these URLs or allow verified Googlebot requests through so it can access them.

Excluded

Pages that come under the Excluded section may have one of these types:

Excluded by “noindex” tag

This is similar to the "URL marked noindex" status discussed in the Error section. The only difference is that these URLs were not submitted in a sitemap.

Because these pages carry a noindex tag and you have not added them to the XML sitemap, Google assumes you do not want them indexed.

Action required: Review these pages to confirm you actually do not want them indexed, e.g. login or user account pages.

Crawled – Currently Not Indexed

If specific URLs fall under this category, that means these pages are not good enough in the eyes of Google. It indicates Google has crawled those pages but has not indexed them.

Sending “Request Indexing” again will not help, as Google has already mentioned in its official documentation:

It may or may not be indexed in the future; no need to resubmit this URL for crawling.

Action required: The only thing to do here is review the URLs in this category for quality. See whether they are valuable enough to meet Google's E-E-A-T standards.

FYI: E-E-A-T stands for experience, expertise, authoritativeness, and trustworthiness. It comes from Google's Search Quality Rater Guidelines, first published in 2013 (as E-A-T; the extra E for "experience" was added in late 2022) to help webmasters learn how to add value to their websites and content.

Discovered – Currently Not Indexed 

This status means Google has discovered the URL but has not crawled or indexed it yet. If you have many URLs under this category, Google has likely postponed crawling them to avoid overloading your server.

Google will usually crawl them soon, but if the number of URLs here keeps growing, that points to a crawl budget issue and may indicate that Google considers your website low quality.

Action required: You can look for many things to resolve this issue. Check your website speed, server health, and the quality of your pages.

If there are many non-canonical versions, assign a noindex directive to them to save crawl budget.

404 Not Found 

The pages return a 404 Not Found status when requested by Google. This status was already discussed under the Error section; the only difference is that these URLs were not submitted to Google via an XML sitemap.

Google discovers such pages from other sources, for example a link on another website that was never removed or updated.

Action required: If important pages show the 404 error, 301-redirect them to working pages.

Soft 404

This is an error page with little to no content, or one whose content has been completely removed. The server sends a 200 OK status code while displaying what is effectively a 404 not found page.

Action required: Add unique content to these pages to make them valuable; on the other hand, if they are actually error pages, make sure they return a valid 404 Not Found status.

Page with Redirect

All pages that redirect elsewhere come under this category, which is why they are not indexed; Google indexes the redirect destination instead.

Action required: Review the redirects and make sure they are done intentionally. Other than that, there is nothing you need to do.

Alternate Page with Proper Canonical Tag

These pages are duplicates, but they point to the correct canonical version of the page via a canonical tag, and Google has indexed that version properly.

Action required: Google has indexed your preferred canonical version, so there is nothing to do here.
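For reference, this is what such a setup looks like: the duplicate page declares its preferred version with a canonical link element in its head (both URLs are placeholders):

    <!-- on the duplicate page, e.g. https://example.com/page?sort=price -->
    <link rel="canonical" href="https://example.com/page">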

Duplicate Without User-Selected Canonical

Google does not index these URLs because it considers them duplicates of other URLs on your website. You have not declared a canonical version either, so Google chooses one based on other signals.

Action required: Review the URLs Google has marked as the canonical versions of these URLs. If you think Google has chosen the wrong ones, assign canonical tags pointing to the correct URL versions.

Moreover, if you think a URL is not a duplicate of the Google-selected canonical, make sure the content on those pages is genuinely different and valuable.

Duplicate, Google Chose a Different Canonical than the User

In this case, you have selected a canonical URL version, but Google has chosen another one and indexed it.

Action required: If you are sure these URLs are not duplicates, prove that by adding different content to all pages.

Moreover, if you think Google has selected the wrong canonical version, focus on providing enough signals (backlinks) to make Google choose your preferred or canonical version.

Duplicate, Submitted URL Not Selected As Canonical

Here, you have submitted URLs in your sitemap without declaring their canonical version, so Google treats them as duplicates of other URLs and chooses a canonical version on its own.

So what is the difference between the last two statuses?

The main difference is that in "Duplicate, Submitted URL Not Selected As Canonical", you submitted the URLs in the sitemap without declaring their canonical version. As a result, Google considers them duplicate URLs and chooses a canonical version on its own.

In "Duplicate, Google Chose a Different Canonical than the User", by contrast, the URLs were not submitted in the sitemap; Google finds them from other sources, considers them duplicates, and chooses its own canonical version.

Action required: The same as for the previous status: if you are sure these URLs are not duplicates, prove it by differentiating their content; if Google has picked the wrong canonical, provide stronger signals (such as backlinks) for your preferred version.

Blocked By Robots.txt

URLs under this section are blocked by the robots.txt file. Keep in mind that even though you have blocked specific URLs in robots.txt, Google can still index them if it discovers them from other sources, i.e. links on other websites. However, because it cannot crawl the content, such pages may appear in results in a degraded form, for example without a description.

Action required: Check whether any important URLs appear under this category; if so, unblock them in robots.txt. For pages you genuinely want out of the index, unblock them and add a noindex directive instead, so Google can see it.

Blocked By Page Removal Tool 

URLs are not appearing in the search results because you have requested Google to remove them using the Page Removal Tool. Remember that this removal is temporary; after the removal period (Google documents it as about six months) expires, these URLs can appear in search results again.

Action required: If you want to remove these URLs permanently from SERPs, make sure to give a clear signal to Google by adding a noindex directive.

Blocked Due To Unauthorized Request (401)

Google cannot access these URLs because it is not authorized to do so, and it receives a 401 HTTP response.

This is usually the case for staging environments that are set up for testing and not meant to be accessible to the public.

Action required: If you have done this intentionally, you can ignore it; Google is simply telling you that it encountered the issue. Otherwise, check whether these pages actually need to require authentication.

Blocked Due To Access Forbidden (403)

Here, the server does not grant access to Google and returns a 403 response. This happens when access requires credentials that were not provided or were incorrect; since Googlebot never supplies credentials, it is denied access to the page.

Action required: Make sure to allow access to Google and other search engines if you want to index these pages. If not, then add a noindex directive.

How often should you check the Index Coverage Report?

According to Google, regularly checking the Index Coverage Report is not mandatory: you will be notified via email whenever your website develops an indexing issue.

However, you may not get an email when an existing error gets worse, so it is recommended to check the report in Google Search Console once in a while.

Moreover, keeping an eye on this report is essential after making major changes to your website, such as a site migration or URL restructuring.
