This article describes the Distribution report in SiteCrawler, part of Botify's Analytics Suite, available with all Botify plans.
Overview
SiteCrawler's Distribution report shows detailed information about how different SEO-related indicators are distributed throughout your site, providing insights into whether your pages are likely to be found and indexed by search engines. Use this report for insights such as indexability distribution, page depth, content type, content language, index vs. noindex status, subdomain, protocol, and compression.
Distribution Report KPIs
The following KPIs for the current crawl are displayed at the top of the Distribution report. Click a KPI to display the corresponding URLs in a URL Explorer report:
Crawled URLs: The total number of URLs crawled by Botify in the current crawl.
Indexable URLs: The total number of indexable URLs crawled by Botify in the current crawl. A URL is indexable, meaning a search engine can index it, when it meets all of the following criteria: it returns a 200 status code, does not include a noindex meta tag, does not include a canonical tag pointing to a different URL, and has a text/HTML content type.
Non-Indexable URLs: The total number of non-indexable URLs crawled by Botify in the current crawl. A non-indexable URL is ineligible to be indexed by search engines because it fails at least one of the criteria identified above.
Average Depth: The average number of clicks separating the pages crawled by Botify from the crawl start page (typically the home page).
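The indexability criteria listed under Indexable URLs can be read as a simple predicate in which every condition must hold. The Python sketch below is illustrative only: the Page record and its field names are hypothetical, not Botify's data model or API, and the canonical check uses plain string equality, whereas a real crawler may normalize URLs first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical record of what a crawler observed for one URL."""
    url: str
    status_code: int
    content_type: str          # e.g. "text/html; charset=utf-8"
    has_noindex: bool          # True if a noindex meta tag was found
    canonical: Optional[str]   # canonical URL, or None if no tag is present


def is_indexable(page: Page) -> bool:
    """Return True only if the page meets all four criteria."""
    # Ignore parameters such as "; charset=utf-8" when comparing the type.
    mime = page.content_type.split(";")[0].strip().lower()
    return (
        page.status_code == 200
        and not page.has_noindex
        and (page.canonical is None or page.canonical == page.url)
        and mime == "text/html"
    )
```

A page that fails any single condition, for example one with a canonical tag pointing elsewhere, would count toward Non-Indexable URLs under this reading.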
Distribution Report Visualizations
The Distribution report's "Top Charts" section includes the following visualizations. Visit the Segments section of the Distribution report to see many of these visualizations focused on the page categories you have defined in your project segmentation:
Indexable/Non-Indexable URLs Distribution
This chart shows the proportion of pages crawled by Botify that met or did not meet the basic indexability requirements defined above. While most sites have some non-indexable pages for good reasons, a high proportion of non-indexable pages can harm your SEO performance.
Insights
This table displays indexability insights based on changes between the current crawl and the compared crawl. The "# URLs" column displays the number of URLs matching the metric in the current Botify crawl, and the "Change" column displays the percentage increase or decrease from the compared crawl.
Click a metric to display the list of corresponding URLs in a URL Explorer report.
Click the alert icon to define an alert for future changes in the corresponding metric.
Click the View More link to drill into the chart in the Distribution Insights section.
Metrics:
Indexable URLs: Pages eligible to be served to search engines. See the full definition in the glossary.
Non-Indexable URLs: Pages ineligible to be served to search engines.
noindex URLs: Pages containing a noindex meta tag.
Not HTML URLs: Pages with a content type other than HTML or text (e.g., image).
GZIP URLs: Compressed pages.
Disordered Query Strings (duplicates): Pages considered duplicates because they have the same URL with query strings in a different order.
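To illustrate the Disordered Query Strings metric, the following Python sketch groups URLs that differ only in the order of their query parameters. It is one possible way to detect such duplicates, not a description of how Botify does it, and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_query_order(url: str) -> str:
    """Return the URL with its query parameters sorted by name, then value."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(params)))


def group_disordered_duplicates(urls):
    """Group distinct URLs that become identical once parameters are sorted."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize_query_order(url)].append(url)
    # Only groups with more than one member contain duplicates.
    return [group for group in groups.values() if len(group) > 1]
```

For instance, https://example.com/p?a=1&b=2 and https://example.com/p?b=2&a=1 would fall into the same group because both normalize to the first form.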
Internal Pagerank by Depth and Dimension
These charts convey the distribution of your pages by their internal link equity based on the number of links to the pages and the weight of those links. For example, links from a site's home page carry more weight than links from pages deep in a site's structure. These charts show how much internal Pagerank equity goes to pages by their depth level and page type.
For example, a site might show pages one click away from the crawl start page receiving more than 50% of the Pagerank equity, while evergreen pages receive less than 10%.
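The idea that link equity depends on both the number of inbound links and the weight of the linking pages can be sketched with the classic PageRank power iteration. This is the textbook algorithm, offered as an assumption about the general approach; the exact formula Botify uses is not described here, and the damping value of 0.85 and the function name are illustrative choices.

```python
def internal_pagerank(links, damping=0.85, iterations=50):
    """Compute a PageRank score per page.

    `links` maps each page to the list of pages it links to.
    Scores sum to 1, so each score reads as a share of total equity.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank
```

Summing the resulting scores by depth level or by page type would give the kind of breakdown these charts present.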
Indexable/Non-Indexable URLs by Depth
This chart shows the distribution of the pages by their indexability status and the number of clicks away from where the crawl started (typically the site's home page) using the shortest available path. The 0 on the chart's X-axis defines the crawl start page. Click on the non-indexable segments at low depths to investigate these pages since they may be hurting your site performance.
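Depth measured along the shortest available path corresponds to a breadth-first search from the crawl start page. The sketch below assumes a hypothetical in-memory link graph (a dictionary mapping each page to the pages it links to) and is not tied to any Botify export format.

```python
from collections import deque


def crawl_depths(links, start):
    """Return the minimum number of clicks from `start` to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # The first visit in a breadth-first search is the shortest path.
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def average_depth(depths):
    """Average depth across all pages that were reached."""
    return sum(depths.values()) / len(depths)
```

In this model the start page has depth 0, matching the 0 on the chart's X-axis.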
Indexable/Non-Indexable URLs Main Reason
This chart shows the breakdown of non-indexable pages found in Botify's crawl by the primary reason each page is ineligible for indexing. Refer to the HTTP Codes report for details on the Bad HTTP codes reason and the Content report for the Canonical Not Equal reason. The "Presence of Noindex Attribute" chart in the Distribution report provides insight into the Meta noindex reason.
Since a page can be non-indexable for multiple reasons, click into a segment to display the list of URLs matching the reason in a URL Explorer report. You can then add columns to the report for non-indexable reasons to determine if other reasons exist.
URLs by Depth and Content Type
This chart shows the distribution of your pages' content types (e.g., HTML, PDF, image) by the number of clicks away from where the crawl started (typically the site's home page) using the shortest available path. The 0 on the chart's X-axis defines the crawl start page. Use this chart to determine if too many of your pages are too deep in your site's structure, which makes them less likely to be explored by search engines.
Presence of Noindex Attribute
This chart shows the proportion of pages crawled by Botify that contained a noindex meta tag. You should aim to eliminate all crawlable noindex URLs on your site since they waste search engine crawl resources.
Protocol Distribution
This chart shows the proportion of your site's pages crawled by Botify that use the HTTP protocol versus the HTTPS protocol, for the protocols enabled in your project settings. This is especially useful after a site migration to HTTPS to determine if any HTTP pages are still linked on your site.
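A protocol breakdown like this one can be approximated from any list of crawled URLs. The following Python sketch is a minimal illustration under that assumption; the function name is hypothetical and the input is simply a list of URL strings.

```python
from collections import Counter
from urllib.parse import urlsplit


def protocol_distribution(urls):
    """Return the percentage of URLs per scheme, e.g. {"https": 75.0, "http": 25.0}."""
    counts = Counter(urlsplit(url).scheme.lower() for url in urls)
    total = sum(counts.values())
    return {scheme: 100.0 * count / total for scheme, count in counts.items()}
```

After an HTTPS migration, any non-zero share for "http" in such a breakdown points to pages worth reviewing.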
Content Type Distribution
This chart shows the distribution of your pages crawled by Botify by content type (e.g., HTML, PDF, image). All pages should ideally be HTML or text.
Language Distribution
This chart shows the pages crawled by Botify by the language defined in the HTML tag's lang attribute. If your site uses hreflang tags to determine which pages are the same in a different country or language, insights are available in the Distribution report's Internationalization section.
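To make the "lang attribute of the HTML tag" concrete, the sketch below reads that attribute from a page's markup using Python's standard html.parser module. It only illustrates where the value comes from; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class LangAttributeParser(HTMLParser):
    """Capture the lang attribute of the first <html> tag encountered."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before this call.
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def declared_language(html_text):
    """Return the declared language, or None if no lang attribute is set."""
    parser = LangAttributeParser()
    parser.feed(html_text)
    return parser.lang
```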
Domain/Subdomain Distribution
This chart shows the percentage of URLs crawled by Botify in each domain or subdomain defined in your project settings.
Rel Prev/Next Distribution
This chart shows the percentage of pages crawled by Botify that have at least one page pointing to them via a rel="prev" link attribute (P1) and at least one page pointing to them via a rel="next" link attribute for pagination (PX).
Gzip/No Gzip Distribution
This chart shows the percentage of pages crawled by Botify that were compressed or not.
Extracted HTML Code
This chart shows the custom field extracted during Botify's crawl. Custom fields are extracted through HTML Extracts defined in crawl settings.
Indexable/Non-Indexable URLs Comparison
This table compares the number of indexable and non-indexable pages crawled by Botify according to the key distribution metrics.
Internationalization
For multilingual websites, Botify detects hreflang tags and checks whether each page is consistent with the language it declares, in order to determine which pages are versions of the same content for a different country or language. This applies when the countries are included in your project settings. If you have implemented <link rel="alternate" hreflang> tags to manage linguistic versions of the same content, the Distribution report includes an Internationalization subsection with hreflang data and language distribution according to the "lang" attribute of each page's HTML tag.
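As an illustration of the markup involved, the following Python sketch collects the <link rel="alternate" hreflang> annotations from a page using the standard html.parser module. It is a simplified reading of the tags, not Botify's detection logic, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HreflangParser(HTMLParser):
    """Collect hreflang alternates declared with <link rel="alternate">."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        hreflang = attributes.get("hreflang")
        href = attributes.get("href")
        if "alternate" in rel_values and hreflang and href:
            # Language codes are case-insensitive, so store them lowercased.
            self.alternates[hreflang.lower()] = href


def extract_hreflang(html_text):
    """Return a mapping of hreflang value to alternate URL."""
    parser = HreflangParser()
    parser.feed(html_text)
    return parser.alternates
```

Comparing each alternate's hreflang value with the language that the target page declares is one way to picture the consistency check described above.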