This article explains how to create a report of the pages located deep in your website's structure, far from the home page.
Overview
Page depth is the number of clicks needed to reach a page via the shortest path from the analysis start page (usually the home page). Your site's page depth is evaluated in SiteCrawler's Distribution report.
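For reference, here is a minimal sketch of how page depth can be derived from a link graph: a breadth-first search from the start page, where the first time a URL is reached gives its shortest path. The `links` dictionary below is hypothetical crawl data for illustration, not Botify's implementation.

```python
from collections import deque

def page_depths(links, start):
    """Compute page depth: the number of clicks on the shortest
    path from the start page, via breadth-first search."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target not in depths:  # first visit = shortest path
                depths[target] = depths[url] + 1
                queue.append(target)
    return depths

# Hypothetical crawl: home -> category -> product is depth 2.
links = {
    "http://www.mywebsite.com/": ["http://www.mywebsite.com/category"],
    "http://www.mywebsite.com/category": ["http://www.mywebsite.com/product-p42.html"],
}
print(page_depths(links, "http://www.mywebsite.com/"))
```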
Creating a Deep Page Report
The "Indexable/Non-indexable URLs by Depth" chart shows the distribution of pages by depth. If there are pages deeper than ten clicks from the home page, they are grouped in the 10+ bar. To find out how deep the deepest pages are, click on the 10+ segment (either the green part for indexable URLs or the yellow for non-indexable URLs.
If the bar is too small to be easily clickable, expand the settings menu and select Display Chart (Percentages).
This example shows a report of indexable pages at depth ten or greater:
To expand this report to include non-indexable pages, remove the "Is Indexable" filter:
💡 Click twice on the Depth column header to sort the report with the deepest pages first.
Determining Your Deepest Indexable Page Types
After seeing the overview of your deepest indexable pages, investigate what these deep pages are. Starting from the report above, apply the URL parameter and segment filters explained below to get a full view of these pages.
URLs with Parameters
Check whether there are URL parameters in many of these URLs. Add the following to the report:
The "URL Query String Keys: Exists". This means the URL must have a query string (i.e., the part of the URL after the "?").
βThe "URL Query String Keys" column. This will show the parameter names in a separate column in the results table. For example, if the URL is http://www.mywebsite.com/page_description? page =3& sort =1, the query string keys will be "page,sort".
In this example, more than 99% of very deep indexable URLs include at least one URL parameter:
The deepest pages in the following example are highly suspect because they have two parameters referring to a specific product and a third parameter for pagination, which normally applies to a list of products:
Add the following filters to take a closer look at pages with this set of parameters:
Over two-thirds of very deep pages include these parameters in their URL. Click on the blue arrow near the URL in the results table to open the page on your website.
To find where a page is linked from, click on the magic wand icon in the results table to open the URL Details panel, which compiles all information available in Botify about that page, and navigate to the Inlinks tab. Add the "Sample of Inlinks" column to the report to get this information for all pages in the list:
If your pagination information is placed within the URL path instead of a URL parameter (e.g., http://www.mywebsite.com/content_description-page2) you can filter the list on pagination using the "URL Path" field, as shown in the section below.
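As a sketch of such a path-based filter, a "URL Path matches regex" condition could look like the following. The `-page[0-9]+$` pattern is an assumption based on the sample URL above; adapt it to your own path format.

```python
import re

# Hypothetical pattern: pagination expressed as a "-pageN" suffix
# in the URL path, as in /content_description-page2.
PAGINATED_PATH = re.compile(r"-page[0-9]+$")

paths = [
    "/content_description",         # no pagination: no match
    "/content_description-page2",   # paginated: matches
    "/content_description-page10",  # paginated: matches
]
for path in paths:
    print(path, bool(PAGINATED_PATH.search(path)))
```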
Content URLs
There are two ways to find which deep URLs have content (e.g., products, articles, posts):
Finding Content by Segment
If you have defined URL segments for your page templates, filter SiteCrawler reports by segment. Navigate to the "URLs Crawled by Botify by Pagetype" chart in the Distribution Segments report and click on the desired segment while holding the CTRL key on a PC or COMMAND on a Mac to filter the report for only that segment:
Filtering on the "articles" segment in this example updates the chart to show the distribution of four subsegments:
Alternatively, select the segment filter directly in the report filter at the top of the page:
Finding Content by URL Pattern
You will not have the predefined filters shown above if you have not defined segments in your Botify project. In this case, you can filter the URL Explorer report based on URL patterns, such as an element of the URL path (http://www.domain.com/path?query-string). For example, suppose you want to find out whether there are deep product pages with the following URL format:
http://www.mywebsite.com/[productDescription]-p[productID].html
where the product ID is a number.
The following filter is sufficient for simpler URL patterns, such as URLs that contain /product/:
URL Path contains "/product/".
However, a regular expression is required when a fixed character sequence in the URL is not enough. That is the case in this example, where the only fixed part of product URLs, "-p", is also found in the product description part of the URL. To get only product pages in this case, use the following filter: URL Path matches regex "-p[0-9]+\.html$".
This report shows a few very deep product pages.
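To see why the anchored regex is needed where a plain "contains" filter would over-match, here is a quick check with Python's re module (the sample paths are made up for illustration):

```python
import re

# Product URLs end in "-p<number>.html"; "-p" alone also appears in
# words like "top-picks", so a "contains -p" filter would over-match.
PRODUCT_URL = re.compile(r"-p[0-9]+\.html$")

paths = [
    "/red-sneakers-p12345.html",   # product page: matches
    "/top-picks-for-summer.html",  # contains "-p" but not a product: no match
]
for path in paths:
    print(path, bool(PRODUCT_URL.search(path)))
```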
Next Steps
After filtering by URL parameters and patterns, you probably have a good understanding of the main causes of depth on your website. You can do the same for non-indexable pages. Ask the following questions about these very deep pages to determine the next steps:
Do they deserve to be on the website, either for users or for search engine robots? For example, duplicates generated by tracking parameters do not: tracking can be implemented in an SEO-friendly manner instead.
Are they justified for users but not for search engine robots (e.g., duplicates generated by sorting parameters)?
Could some of these pages generate organic traffic, but fail to because they are too deep for search engine robots to explore and too unlikely to be found by users navigating the website? If so, adjust their internal linking to reduce their depth.