Check your site for index bloat

Contributors

@inbillsmindgmail-com @paul-boag


Business Benefits

Improve site performance.


Go to Google Search Console, click on Coverage, and check the Valid checkbox.

  • If there has been a rapid increase in the number of valid indexed pages over the last three months, continue.
  • If there hasn’t been a rapid increase, then you don’t need to do anything more.

Search using the site:yourdomain.com operator on Bing and Google, and note the total number of pages that come up in the results.

The number of valid pages in Google Search Console and the number of search engine results don’t have to be the same, but they should be close.

Compare the number of search engine results and Google Search Console valid indexed pages against the number of URLs in your sitemap.xml file.

The results should be the same if you have no errors in Google Search Console, or at least close (within 1-5%).
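The comparison can be sketched in a few lines of Python. This is a minimal illustration, not a definitive tool: the function names are hypothetical, the 5% tolerance is simply the upper end of the range mentioned above, and the Google Search Console and site: operator counts are assumed to be copied in by hand.

```python
import xml.etree.ElementTree as ET


def count_sitemap_urls(sitemap_xml):
    """Count the <url> entries in the text of a sitemap.xml file."""
    root = ET.fromstring(sitemap_xml)
    # Sitemap tags are usually namespaced, so compare the local name only.
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "url")


def relative_gap(count, reference):
    """Return the gap between count and reference as a fraction of reference."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return abs(count - reference) / reference


def compare_counts(gsc_valid, site_operator, sitemap_urls, tolerance=0.05):
    """Report whether each count is within tolerance of the sitemap total.

    tolerance=0.05 is an assumption taken from the 1-5% guideline.
    """
    return {
        "gsc_valid": relative_gap(gsc_valid, sitemap_urls) <= tolerance,
        "site_operator": relative_gap(site_operator, sitemap_urls) <= tolerance,
    }
```

A count that fails the check, for example 1,300 search results against 1,000 sitemap URLs, is a hint that pages outside the sitemap are being indexed.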

Click on the last page in Google Search Results to see the kind of URLs that are ranking lowest.

This will tend to include any dynamic content that isn’t optimized, such as site search results.

Note any patterns in URLs that shouldn’t be indexed, along with any duplicate pages.

For example, you might see a lot of dynamically generated URLs or URLs with parameters for:

  • Comments
  • Product filtering results
  • Site search results

URLs that shouldn’t be indexed include:

  • Test pages
  • Thank-you pages
  • Tag pages
  • Date archive pages
  • Tracking URLs
  • Pages with no original content
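If you export the indexed URLs to a list, a short script can surface these patterns faster than reading through them by hand. The sketch below is only an illustration: every path pattern and parameter name in it is an assumption (for example /tag/ for tag pages, or utm_source for tracking URLs), so replace them with the patterns your own site uses.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical path patterns; adjust them to your own URL structure.
PATH_PATTERNS = [
    ("tag page", re.compile(r"^/tag/")),
    ("date archive", re.compile(r"^/\d{4}/\d{2}(/|$)")),
    ("thank-you page", re.compile(r"thank-?you")),
    ("test page", re.compile(r"(^|/)test(/|-|$)")),
    ("site search", re.compile(r"^/search(/|$)")),
]

# Hypothetical query parameters, grouped by the kind of URL they suggest.
PARAM_GROUPS = [
    ("tracking URL", {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}),
    ("site search", {"s", "q"}),
    ("product filter", {"color", "size", "sort"}),
    ("comment", {"replytocom"}),
]


def classify_url(url):
    """Return the labels of every pattern the URL matches (empty if none)."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    params = set(parse_qs(parsed.query, keep_blank_values=True))
    labels = [label for label, pattern in PATH_PATTERNS if pattern.search(path)]
    for label, names in PARAM_GROUPS:
        if params & names and label not in labels:
            labels.append(label)
    return labels
```

URLs that come back with an empty list match none of the patterns and are the ones most likely to belong in the index.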

For outdated pages and URLs with little to no original content, decide whether to delete them or create content for them.

Use your preferred SEO tool (SEMrush, Ahrefs, and Moz are a few suggestions) to check the organic traffic coming into the page. If the page accounts for a medium to high percentage of your website’s total traffic, and it’s possible to write on the topic of that URL, write content for the page.
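The decision can be written down as a small rule. This is a sketch under stated assumptions: the function names are hypothetical, the visit counts are whatever your SEO tool exports, and the 1% threshold is a placeholder for "medium to high", not a figure from any tool or guideline.

```python
def organic_share(page_visits, site_visits):
    """Return the page's share of the site's total organic visits."""
    if site_visits <= 0:
        raise ValueError("site_visits must be positive")
    return page_visits / site_visits


def recommend(page_visits, site_visits, can_write_on_topic, min_share=0.01):
    """Suggest what to do with a thin or outdated page.

    min_share=0.01 is an assumed threshold; pick one that suits your site.
    """
    if can_write_on_topic and organic_share(page_visits, site_visits) >= min_share:
        return "create content"
    return "delete candidate"
```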

Set up noindex for tag pages, thank-you pages, and any other pages that you don’t need people to find via search engines.

For example, if the website is a one-author site, set up a noindex tag for the author archive.
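A noindex directive is commonly set with a robots meta tag in the page’s HTML. Once it is in place, you can confirm that a page actually carries it. The sketch below uses Python’s standard html.parser module; the class and function names are hypothetical, and it only looks at the meta tag, not at any HTTP headers the server may send.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Example page source with the tag set.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```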

If you have duplicate pages, add a canonical tag to each duplicate, pointing to the master that you want people to find via search engines.

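A canonical tag goes in the `<head>` section of an HTML page, and it looks like this (with your master page’s URL in href): `<link rel="canonical" href="https://yourdomain.com/master-page/" />`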
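To check that each duplicate points at the right master, you can read the canonical link out of the page source. This is a minimal sketch using Python’s standard html.parser module; the names are hypothetical and the yourdomain.com URLs are placeholders.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Capture the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        if "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


# Example duplicate page pointing at its master (placeholder URLs).
duplicate = (
    "<html><head>"
    '<link rel="canonical" href="https://yourdomain.com/master-page/" />'
    "</head></html>"
)
print(find_canonical(duplicate) == "https://yourdomain.com/master-page/")
```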


Set up the right HTTP status code for deleted pages.

  • If there’s a relevant page that replaces the deleted page, do a 301 redirect from the old to the new page.
  • If the page was deleted because it was retired or is no longer needed, return a 410 code: on your web hosting server, open the .htaccess file and add: Redirect 410 [page path].
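With many deleted pages, it can help to generate the .htaccess lines from a list instead of typing them. The sketch below assumes an Apache server, where the Redirect directive takes a status, the old URL path, and (for a 301) the destination URL; the function name and example paths are hypothetical.

```python
def htaccess_rule(old_path, replacement_url=None):
    """Build one Redirect line for a deleted page.

    Returns a 301 rule when a replacement exists, otherwise a 410 rule.
    Paths containing spaces would need quoting, which this sketch omits.
    """
    if not old_path.startswith("/"):
        raise ValueError("old_path must start with '/'")
    if replacement_url:
        return f"Redirect 301 {old_path} {replacement_url}"
    return f"Redirect 410 {old_path}"


# Hypothetical deleted pages: one replaced, one retired.
deleted_pages = {
    "/old-guide/": "https://yourdomain.com/new-guide/",
    "/summer-sale-2019/": None,
}
for path, replacement in deleted_pages.items():
    print(htaccess_rule(path, replacement))
```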

Go to Google Search Console and click on Removals to quickly remove URLs from Google search.

Click on New request and enter the URL you want to remove.