Evaluating Noindex Pages

Updated over a year ago

📘 This article explains how to use Botify reports to locate pages that include a Noindex tag.

Overview

A Noindex tag is one of the most common ways to tell search engines not to index a particular page. Websites should ensure that search engines crawl and index the most valuable content on the site so the most important pages have the best chance to rank and perform well. Applying Noindex tags to irrelevant or unimportant pages keeps them out of the index, so search engines do not hold a bloated index of low-value pages.

The impact of Noindex pages on SEO depends on whether:

  • These pages are the primary or only path to some content you want search engines to index. This typically happens when the pagination pages of a list are Noindex but lead to indexable content, which is not ideal.

  • These are pages you do not want to see in search engine results pages.

This article focuses on pages you do not want in search engine results pages. These pages generally waste crawl budget, since search engines must crawl a page to see its Noindex tag, and they waste some PageRank, which should flow to indexable pages rather than non-indexable ones.
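To make the crawl cost concrete, here is a minimal Python sketch of what a crawler has to do before it can honor a meta Noindex tag: it must already have downloaded the HTML and parsed it. The names `RobotsMetaParser` and `has_meta_noindex` are illustrative and not part of Botify. The sketch only inspects the generic robots meta tag and ignores crawler-specific tags and HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_meta_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

For example, `has_meta_noindex('<meta name="robots" content="noindex, follow">')` evaluates to `True`, while a page with `content="index, follow"` evaluates to `False`.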

Locating Noindex Pages

To identify pages on your website with a meta Noindex tag:

  1. Navigate to the SiteCrawler > Distribution report.

  2. In the Indexable/Non-Indexable URLs Distribution chart, click the Non-Indexable URLs segment.

    usecase_noindex1.png


    A URL Explorer report displays pages containing a meta Noindex tag:

    usecase_noindex2.png

  3. Add the No. of Follow Inlinks and Source - Full URL columns to show where these Noindex URLs are linked from:

    usecases_noindex3.png

  4. Click the Export as CSV link.

    usecase_noindex4.png

  5. Select all rows as the scope, then click Launch Export.

    usecases_noindex5.png

  6. Use the exported file to determine whether a significant number of Noindex pages are likely sources of wasted crawl budget or PageRank. If so, consider making the following changes:

    • Use a rel="nofollow" attribute on links pointing to these pages. This hints to search engines that those pages do not need to be crawled as often.

    • Disallow these pages in the website's robots.txt file if you never intend them to be indexed.
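The review in step 6 can be sketched in Python. The sketch below groups the exported Noindex URLs by their first directory to show where follow inlinks concentrate, then checks which URLs a draft robots.txt rule would block. It rests on several assumptions: the export is comma-delimited, the column headers are `Full URL` and `No. of Follow Inlinks` (adjust them to match your file), inlink counts are plain integers, and the example.com URLs and the draft rule are made up for illustration. The function names are hypothetical helpers, not Botify features.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Assumed column headers; adjust them to match your actual export.
URL_COLUMN = "Full URL"
INLINKS_COLUMN = "No. of Follow Inlinks"


def top_directory(url):
    """Return the first path directory of a URL, e.g. '/search/', else '/'."""
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    if segments and (len(segments) > 1 or path.endswith("/")):
        return f"/{segments[0]}/"
    return "/"


def summarize_noindex_export(csv_file, top_n=5):
    """Return (directory, page count, follow inlinks), most inlinks first."""
    pages = Counter()
    inlinks = Counter()
    for row in csv.DictReader(csv_file):
        directory = top_directory(row[URL_COLUMN])
        pages[directory] += 1
        inlinks[directory] += int(row[INLINKS_COLUMN] or 0)
    ranked = sorted(pages, key=lambda d: inlinks[d], reverse=True)
    return [(d, pages[d], inlinks[d]) for d in ranked[:top_n]]


def blocked_by_robots(urls, robots_txt, user_agent="*"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Made-up sample rows standing in for the exported file.
    export = io.StringIO(
        "Full URL,No. of Follow Inlinks\n"
        "https://www.example.com/search/?q=a,120\n"
        "https://www.example.com/search/?q=b,30\n"
        "https://www.example.com/cart,5\n"
    )
    print(summarize_noindex_export(export))

    draft_rules = "User-agent: *\nDisallow: /search/\n"
    urls = ["https://www.example.com/search/?q=a", "https://www.example.com/cart"]
    print(blocked_by_robots(urls, draft_rules))
```

To run it against a real export, pass a file opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample. Directories that combine many Noindex pages with many follow inlinks are the likeliest candidates for the changes listed in step 6.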

💡 By default, the list of Noindex pages shown in the URL Explorer includes only pages with an HTML content type. To check whether there are Noindex pages with other content types, filter the report by content type "not equal" to text/html:

usecases_noindex6.png
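Non-HTML resources such as PDFs cannot carry a meta tag, so a Noindex directive for them is usually sent in the X-Robots-Tag HTTP response header. The following Python sketch shows one way to check such a header value for a noindex directive. The function name `header_has_noindex` is illustrative, and the sketch looks only for the literal `noindex` directive, with or without a crawler prefix such as `googlebot:`.

```python
def header_has_noindex(x_robots_tag):
    """Return True if an X-Robots-Tag header value contains noindex."""
    for token in x_robots_tag.lower().split(","):
        # A directive may carry a crawler prefix, e.g. "googlebot: noindex",
        # so keep only the part after the last colon.
        directive = token.rsplit(":", 1)[-1].strip()
        if directive == "noindex":
            return True
    return False
```

For example, `header_has_noindex("googlebot: noindex, nofollow")` evaluates to `True`, and `header_has_noindex("noarchive")` evaluates to `False`.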