This article describes the Search Engines report in SiteCrawler, part of Botify's Analytics Suite, available with all Botify plans.
Overview
The Search Engines reports compare your website as Botify crawled it to how search engines saw your website, according to the web server log data provided. These reports show how much of your website was explored by each search engine, how deep search engines go, how often they crawl pages, whether they encounter errors, and which pages are not crawled by search engines. Since server log files also contain information about organic traffic, these reports show which pages generate organic traffic and how often.
The Search Engines report is displayed for Google by default; however, you can click the recommended filters at the top of the report to switch to another search engine that is included in your Botify plan.
For ease of explanation, the metrics and screenshots in this article refer to Google.
Search Engines Report KPIs
The following KPIs for the current crawl are displayed at the top of the Search Engines report. Click a KPI to display the corresponding URLs in a URL Explorer report:
% URLs Crawled: The percentage of pages crawled by Botify and Google.
% Active URLs: The percentage of active pages (those with at least one organic visit during the previous 30 days) crawled by Botify and Google.
Visits Volume on URLs Crawled by Botify: The number of organic visits to pages crawled by Botify during the previous 30 days.
Orphan URLs: The number of URLs crawled by Google during the previous 30 days that are not linked from any other page on your site.
Crawl Volume on Orphan URLs: The number of crawls by Google to all orphan pages during the previous 30 days.
Active Orphan URLs: The number of orphan pages crawled by Google that received at least one organic visit during the previous 30 days.
Visits Volume on Orphan URLs: The number of organic visits to orphan pages crawled by Google during the previous 30 days.
Crawled Orphan URLs with Bad HTTP Codes: The number of orphan pages that returned "bad" HTTP codes (i.e., non-2xx HTTP codes) in Google crawls during the previous 30 days.
Search Engines Report Visualizations
The Search Engines report's "Top Charts" section includes the following visualizations. Visit the Segments section of the Search Engines report to see many of these visualizations focused on the page categories you have defined in your project segmentation:
Crawls Venn Diagram
This chart shows the discrepancies between the pages Botify crawled and those Google crawled during the previous 30 days with the following important insights:
Crawled by Botify only: Botify crawled these pages, but Google did not, which means they will not be indexed. Click this segment to investigate whether these pages should be indexed.
Crawled by Google only: This segment represents orphan pages. Hover over this segment to find the total number of orphan pages, and click the segment to find which URLs are orphans. Note: This number may be larger than shown in the KPIs above since the data is sampled in KPIs for sites with a large number of orphans.
Crawled by Google and Botify: This segment includes pages crawled by Google and Botify, ideally where you want all your indexable pages to be. Get more insights into pages crawled by Botify and Google in the Crawled / Not Crawled Distribution By Indexability chart.
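The three segments of the diagram correspond to plain set operations on two URL lists. The following is a minimal sketch, assuming you have exported the URLs from a Botify crawl and the URLs Google requested in your logs; the variable names and sample URLs are hypothetical and not part of any Botify export format.

```python
# Hypothetical inputs: URLs found by the Botify crawl, and URLs Google
# requested in the last 30 days of server logs.
botify_urls = {"/", "/products", "/products/a", "/about"}
google_urls = {"/", "/products", "/old-promo"}


def venn_segments(botify: set, google: set) -> dict:
    """Split URLs into the three segments of the Crawls Venn Diagram."""
    return {
        "botify_only": botify - google,  # in the site structure, not crawled by Google
        "google_only": google - botify,  # orphan pages
        "both": botify & google,         # crawled by Google and Botify
    }


segments = venn_segments(botify_urls, google_urls)
for name, urls in segments.items():
    print(name, sorted(urls))
```

In this sample, `/old-promo` falls in the "Crawled by Google only" segment because nothing in the crawled site structure links to it.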
Insights
This table displays search engine insights based on the evolution across the compared crawls. The "# URLs" column displays the number of URLs matching the metric in the current Botify crawl, and the Change column displays the percentage of increase or decrease from the compared crawl.
Click a metric to display the list of corresponding URLs in a URL Explorer report.
Click the alert icon to define an alert for future changes in the corresponding metric.
Click the View More link to drill into the chart in the Search Engines Insights section.
% Crawled Pages (Indexable): The percentage of your site pages crawled by Google in the previous 30 days that are indexable.
% Crawled Pages: The percentage of your site pages crawled by Google in the previous 30 days.
Avg Depth for Crawled Pages: The average number of hops away from the crawl start page (typically the home page) for pages crawled by Google during the previous 30 days.
% Active Pages: The percentage of your site pages crawled by Google during the previous 30 days that received at least one organic visit.
Avg Depth for Active Pages: The average number of hops away from the crawl start page (typically the home page) for pages crawled by Google that received at least one organic visit during the previous 30 days.
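To reason about the Change column, it can help to reproduce the calculation by hand. The sketch below assumes the change is a relative difference between the current and compared crawl values; that reading, the function name, and the sample numbers are assumptions made for illustration, not confirmed Botify behavior.

```python
from typing import Optional


def percent_change(current: float, compared: float) -> Optional[float]:
    """Relative change from the compared crawl to the current crawl, in percent."""
    if compared == 0:
        # The change is undefined when the compared crawl had no matching URLs.
        return None
    return (current - compared) * 100 / compared


# Hypothetical values: 1,200 matching URLs now versus 1,000 in the compared crawl.
increase = percent_change(1200, 1000)
decrease = percent_change(750, 1000)
print(increase, decrease)
```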
Crawled / Not Crawled Distribution By Indexability
This chart shows the pages crawled by Google and Botify by their indexable status. You should ensure a high proportion of the pages crawled by Google are indexable, and evaluate whether there are indexable pages currently only explored by Botify that you can get Google to explore.
Cumulative Crawl Over Time By Google
For the pages crawled by both Botify and Google (the overlap in the Crawls Venn Diagram), this graph shows the cumulative number of pages Google explored each day across the 30 days of integrated log data. The closer the green bar at the 30-day mark is to the total number of indexable pages Botify found in the current crawl (dotted line), the larger the proportion of indexable pages Google explored.
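The cumulative curve can be approximated from raw log data by counting, day by day, how many distinct URLs Google has requested so far. This is a minimal sketch under stated assumptions: `google_hits` is a hypothetical list of (day, URL) pairs extracted from your logs, and `botify_indexable_urls` is a hypothetical set of indexable URLs from the crawl.

```python
from datetime import date


def cumulative_crawled_urls(google_hits, botify_urls):
    """Return {day: distinct Botify-known URLs Google crawled on or before that day}."""
    seen = set()
    totals = {}
    for day, url in sorted(google_hits):
        if url in botify_urls:  # ignore orphan URLs that Botify did not find
            seen.add(url)
        totals[day] = len(seen)
    return totals


# Hypothetical sample data.
botify_indexable_urls = {"/", "/products", "/about", "/contact"}
google_hits = [
    (date(2024, 1, 1), "/"),
    (date(2024, 1, 1), "/products"),
    (date(2024, 1, 2), "/"),
    (date(2024, 1, 2), "/old-promo"),
    (date(2024, 1, 3), "/about"),
]

curve = cumulative_crawled_urls(google_hits, botify_indexable_urls)
# Share of indexable pages explored by the end of the period (bar versus dotted line).
coverage = curve[max(curve)] / len(botify_indexable_urls)
print(curve, coverage)
```

Days without any Google hits do not appear in `curve`; a charting step would carry the previous day's total forward.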
URLs Crawled By Google By Depth
This chart shows pages by their number of hops away from the site's home page, segmented by those found by Botify only and those found by both Botify and Google. Since PageRank largely influences crawling, the crawl rate typically decreases with page depth. Also refer to the Crawl Rate By Internal Page Rank chart.
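The crawl rate at each depth is the share of pages at that depth that Google also crawled. A rough sketch, assuming a hypothetical mapping of URL to depth from the Botify crawl and a hypothetical set of URLs seen in Google's log hits:

```python
from collections import defaultdict


def crawl_rate_by_depth(depth_by_url, google_urls):
    """Return {depth: fraction of Botify-crawled pages at that depth crawled by Google}."""
    totals = defaultdict(int)
    crawled = defaultdict(int)
    for url, depth in depth_by_url.items():
        totals[depth] += 1
        if url in google_urls:
            crawled[depth] += 1
    return {depth: crawled[depth] / totals[depth] for depth in sorted(totals)}


# Hypothetical sample data: depth 0 is the home page.
depth_by_url = {
    "/": 0,
    "/a": 1, "/b": 1,
    "/a/x": 2, "/a/y": 2, "/b/z": 2, "/b/w": 2,
}
google_urls = {"/", "/a", "/a/x"}

rates = crawl_rate_by_depth(depth_by_url, google_urls)
print(rates)
```

The sample shows the typical pattern: the rate drops as depth increases.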
Active URLs On Google By Depth
This chart shows the proportion of active pages (i.e., those that received at least one organic visit from Google) found by Botify by their number of hops away from the home page.
HTTP Codes Distribution For URLs Crawled By Google and Botify
This chart shows the HTTP status codes returned to Google for pages crawled by both Botify and Google. Since Google can crawl the same page multiple times during a reporting period, the status code consistency is reported across all crawls.
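One way to think about status code consistency is to group every Google crawl of a URL and check whether the responses all fall in the same status class. The sketch below illustrates that idea only; the labels ("always 2xx", "mixed") and the input shape are hypothetical and do not reflect the exact categories Botify displays.

```python
from collections import defaultdict


def status_consistency(crawls):
    """crawls: iterable of (url, status_code) pairs. Returns {url: consistency label}."""
    classes_by_url = defaultdict(set)
    for url, status in crawls:
        classes_by_url[url].add(status // 100)  # 200 -> 2, 301 -> 3, 404 -> 4
    labels = {}
    for url, classes in classes_by_url.items():
        if len(classes) == 1:
            labels[url] = f"always {next(iter(classes))}xx"
        else:
            labels[url] = "mixed"
    return labels


# Hypothetical log extract: the same URL can appear several times.
crawls = [("/", 200), ("/", 200), ("/old", 301), ("/old", 404), ("/gone", 404)]
labels = status_consistency(crawls)
print(labels)
```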
Organic Visits HTTP Codes Distribution On Google
This chart shows the distribution and consistency of the HTTP status codes returned to users for active pages (i.e., those that generated at least one organic visit from Google in the previous 30 days) crawled by both Botify and Google.
Crawl Rate By Internal Page Rank
This chart shows Google's crawl rate for pages crawled by both Botify and Google, by internal PageRank.
Crawl Distribution By Bot
This graph shows the distribution of the volume of Google crawls by type of Google bot for pages crawled by both Botify and Google. The crawl volume includes multiple crawls to the same page, including those from multiple bot types.
Google Crawls Frequency
This chart shows how often any Google bot crawled the URLs Botify analyzed. Crawl frequency is determined by whether a page was crawled on each day in the period, regardless of the number of crawls each day. The crawl frequency is indicated by the percentage of the total log period. For example, if Google crawled a page on 12 different days in a 30-day log period, the crawl frequency is 40%.
Google Visits Frequency
This chart shows how often pages received organic visits from Google search results pages for the URLs Botify analyzed. Visit frequency is determined by whether a page received an organic visit from Google on each day in the period, regardless of the number of visits each day. The visit frequency is indicated by the percentage of the total log period. For example, if a page received organic visits from Google on three different days in a 30-day log period, the visit frequency is 10%.
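Crawl frequency and visit frequency follow the same calculation: count the distinct days with at least one event and divide by the length of the log period. The following sketch reproduces the two worked examples above; the function name and the sample dates are hypothetical.

```python
from datetime import date


def daily_frequency(event_dates, period_days):
    """Percentage of days in the log period with at least one crawl (or visit)."""
    distinct_days = len(set(event_dates))  # several events on one day count once
    return distinct_days * 100 / period_days


# A page crawled on 12 different days, with three crawls on the first day.
crawl_dates = [date(2024, 1, d) for d in range(1, 13)] + [date(2024, 1, 1)] * 2
crawl_frequency = daily_frequency(crawl_dates, period_days=30)

# A page that received organic visits on three different days.
visit_dates = [date(2024, 1, 5), date(2024, 1, 9), date(2024, 1, 9), date(2024, 1, 20)]
visit_frequency = daily_frequency(visit_dates, period_days=30)

print(crawl_frequency, visit_frequency)
```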
Search Engines Report Conversion Section
The Conversion section of the Search Engines report provides a visual summary of page performance by page type and is a good indicator of crawl waste. The chart is derived from your log data: the bars to the left of the center line show the pages Google crawled, and the bars to the right show the organic visits generated from Google search results. Any bar that shows a large number of crawls on the left but a small number of visits on the right is a strong indicator of crawl waste.
Use the dropdown at the top of the chart to customize the display by any segment you have defined in your project. To dive deeper into a segment, use the Segments filter at the top of the page to get a more granular view.
This chart is similar to LogAnalyzer's Conversion chart. The difference is that the LogAnalyzer chart represents all pages crawled by Google in your selected time range (from your log data), while the chart in the Search Engines report is based on pages Botify crawled in the selected period.
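The crawl waste signal described in this section can also be checked numerically by comparing crawls to visits per segment. This is a sketch under stated assumptions: the per-segment counts are hypothetical, and the `min_ratio` threshold of 10 crawls per visit is an arbitrary illustration, not a Botify recommendation.

```python
def crawl_to_visit_ratio(crawls_by_segment, visits_by_segment):
    """Return {segment: Google crawls per organic visit}."""
    ratios = {}
    for segment, crawls in crawls_by_segment.items():
        visits = visits_by_segment.get(segment, 0)
        if crawls == 0:
            ratios[segment] = 0.0           # nothing crawled, so nothing wasted
        elif visits == 0:
            ratios[segment] = float("inf")  # crawled but never visited
        else:
            ratios[segment] = crawls / visits
    return ratios


def likely_crawl_waste(crawls_by_segment, visits_by_segment, min_ratio=10.0):
    """Segments whose crawls-per-visit ratio meets the (arbitrary) threshold."""
    ratios = crawl_to_visit_ratio(crawls_by_segment, visits_by_segment)
    return sorted(s for s, r in ratios.items() if r >= min_ratio)


# Hypothetical 30-day totals per segment.
crawls = {"products": 5000, "facets": 8000, "blog": 300}
visits = {"products": 4000, "facets": 40, "blog": 900}

wasteful = likely_crawl_waste(crawls, visits)
print(wasteful)
```

Here only the `facets` segment is flagged: it receives 8,000 crawls but generates 40 visits.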