SiteCrawler Content Report

Updated over 10 months ago

📘 This article describes the Content report in SiteCrawler, part of Botify's Analytics Suite, available with all Botify plans.

Overview

SiteCrawler's Content report provides insight into how search engines may perceive your site, based on the content quality and uniqueness evaluated during Botify's crawl. For every page returned successfully in a Botify crawl, page text is evaluated for size and uniqueness, and page code (e.g., HTML tags, canonicals) is evaluated for its usefulness to search engines.

sc_content_overview.jpg

The Content evaluation in Botify crawls is enabled by default but can be disabled. If the Content report is not shown, enable the evaluation in the Report Features section of advanced project settings to access the report after your next crawl.

Content Report Visualizations

The Content report is organized into the following tabs: Quality, Similarities/Duplicates, Canonicals, HTML Tags, and Structured Data.

Quality

The Quality section of the report measures word count and the percentage of unique content on your pages versus the percentage of templated content. Botify evaluates all page content displayed to users, including alt text and page title tags. Page content is the text specific to a page (i.e., what the page is about), while the page template is the material repeated across multiple pages (e.g., navigation elements). Evaluating page content without the template is essential to determine content uniqueness.

Content Vs Template

These visualizations show the proportion of pages crawled by Botify that contain unique content vs. templated content by page type.

Pages By Content Size (Words)

These charts show the number of words on the page by page type. This is an important visualization to identify the types of pages on your site with thin content.

sc_content_size.jpg

To include the page template in this evaluation, select the "Ignoring Nothing" option.

sc_content_sizewithtemplate.jpg

% Content Change Since Previous Crawl

This chart shows how much the content has changed since the previous analysis by page type.

sc_content_%change.jpg

To include the page template in this evaluation, select the "Ignoring Nothing" option.

sc_content_changewithtemp.jpg

Quality Insights

This table displays quality insights based on the evolution across the compared crawls. The "# URLs" column displays the number of URLs matching the metric in the current Botify crawl, and the Change column displays the percentage of increase or decrease from the compared crawl.

  • Click a metric to display the list of corresponding URLs in a URL Explorer report.

  • Click the download icon to export the list of corresponding URLs as a CSV file.

  • Click the alert icon to define an alert for future changes in the corresponding metric.

sc_content_quality.jpg
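The Change column described above is a relative change between two crawls. As a minimal sketch of that arithmetic (the function name `percent_change` is hypothetical, and how Botify handles a metric that had zero URLs in the compared crawl is not stated in this article, so the sketch simply returns `None` in that case):

```python
from typing import Optional


def percent_change(current: int, previous: int) -> Optional[float]:
    """Relative change in a URL count between two crawls, in percent.

    Returns None when the compared crawl had zero matching URLs,
    because the relative change is undefined in that case
    (an assumption of this sketch, not documented Botify behavior).
    """
    if previous == 0:
        return None
    return (current - previous) * 100.0 / previous


# Example: 120 URLs in the current crawl vs. 100 in the compared crawl.
print(percent_change(120, 100))  # positive value means an increase
print(percent_change(75, 100))   # negative value means a decrease
```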

Metrics:

  • Pages with Template > 90% of Total Content: Pages containing more than 90% of templated content when crawled by Botify.

  • Pages with 100-250 Words: Pages containing more than 100 and fewer than 250 words when crawled by Botify (i.e., thin content).

  • Pages with content change >= 50%: Pages with content that changed 50% or more since the compared crawl by Botify.
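The three thresholds above can be sketched as simple checks on per-page numbers. This is only an illustration: the function `quality_flags` and its inputs are hypothetical, the word count is assumed to exclude the template, and the way Botify actually measures template share and content change is not described here.

```python
def quality_flags(content_words: int, template_words: int, change_pct: float) -> dict:
    """Evaluate the three Quality Insights thresholds for a single page.

    content_words  -- words in the page-specific content (assumed to exclude template)
    template_words -- words in the repeated template
    change_pct     -- content change since the compared crawl, in percent
    """
    total = content_words + template_words
    # Share of the page made up of template text; 0 for an empty page.
    template_share = 100.0 * template_words / total if total else 0.0
    return {
        "template_over_90": template_share > 90.0,
        # Strict bounds, following the metric description above.
        "thin_100_250_words": 100 < content_words < 250,
        "content_change_50_plus": change_pct >= 50.0,
    }


# A page with 150 content words wrapped in an 1,850-word template.
print(quality_flags(150, 1850, 10.0))
```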

Similarities/Duplicates

The Similarities/Duplicates section of the report identifies pages on your website with overlapping content and measures overall content uniqueness by segment.

Similar Pages By Similarity Score

These charts show how many pages a segment contains and how much content each page has in common with the most similar page.

sc_quality_similar.jpg
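To make the idea of a similarity score against the most similar page concrete, here is a rough sketch. Botify's actual similarity algorithm is not described in this article; the sketch substitutes Jaccard overlap of three-word shingles, and the names `shingles`, `jaccard_pct`, and `most_similar` are hypothetical.

```python
def shingles(text: str, n: int = 3) -> set:
    """Return the set of n-word sequences found in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_pct(a: set, b: set) -> float:
    """Shared shingles as a percentage of all shingles in either set."""
    if not a and not b:
        return 0.0
    return 100.0 * len(a & b) / len(a | b)


def most_similar(pages: dict, url: str, n: int = 3):
    """Find the other page sharing the most content with `url`.

    pages maps URL -> page text (template assumed already removed).
    Returns (best_url, score); best_url is None if nothing overlaps.
    """
    target = shingles(pages[url], n)
    best_url, best_score = None, 0.0
    for other, text in pages.items():
        if other == url:
            continue
        score = jaccard_pct(target, shingles(text, n))
        if score > best_score:
            best_url, best_score = other, score
    return best_url, best_score


pages = {
    "/a": "red shoes for running on trails",
    "/b": "red shoes for running on roads",
    "/c": "contact our support team today please",
}
print(most_similar(pages, "/a"))
```

Comparing every page with every other page is quadratic in the number of pages, so this only works for small samples; it is meant to show what the score expresses, not how to compute it at crawl scale.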

% Content Uniqueness (N-Grams)

This chart indicates the average percentage of word sequences (i.e., n-grams) found only in individual page content (excluding template).

sc_content_ngram.jpg
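As an illustration of n-gram uniqueness, the sketch below computes, for each page, the share of its word sequences that appear on no other page. The n-gram length of 3, the whitespace tokenization, and the names `ngrams` and `uniqueness_pct` are assumptions made for this example; the parameters Botify uses are not given in this article.

```python
from collections import Counter


def ngrams(text: str, n: int = 3) -> set:
    """Return the set of n-word sequences found in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def uniqueness_pct(pages: dict, n: int = 3) -> dict:
    """Percentage of each page's n-grams that appear on that page only.

    pages maps URL -> page text (template assumed already removed).
    """
    page_grams = {url: ngrams(text, n) for url, text in pages.items()}
    # Number of pages on which each n-gram occurs.
    doc_freq = Counter(g for grams in page_grams.values() for g in grams)
    result = {}
    for url, grams in page_grams.items():
        if not grams:
            result[url] = 0.0  # too little text to form an n-gram
            continue
        unique = sum(1 for g in grams if doc_freq[g] == 1)
        result[url] = 100.0 * unique / len(grams)
    return result


sample = {
    "/a": "red shoes for running on trails",
    "/b": "red shoes for running on roads",
    "/c": "contact our support team today please",
}
print(uniqueness_pct(sample))
```

Averaging these per-page values within a segment would give a figure comparable in spirit to the one charted above.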

Similarities/Duplicates Insights

This table displays similarity/duplicate insights based on the evolution across the compared crawls. The "# URLs" column displays the number of URLs matching the metric in the current Botify crawl, and the Change column displays the percentage of increase or decrease from the compared crawl.

  • Click a metric to display the list of corresponding URLs in a URL Explorer report.

  • Click the download icon to export the list of corresponding URLs as a CSV file.

  • Click the alert icon to define an alert for future changes in the corresponding metric.

sc_content_dupinsight.jpg

Metrics:

  • Number of Pages with Similarity Score >= 90%: The similarity score observed for any pages similar to these pages was greater than or equal to 90%.

  • Number of Pages with Similarity Score >= 75%: The similarity score observed for any pages similar to these pages was greater than or equal to 75%.

  • Number of Pages with Similarity Score < 25%: The similarity score observed for any pages similar to these pages was less than 25%.

  • Pages with Unique Content < 10%: The measure of unique word sequences (i.e., n-grams) found on these pages was less than 10%.

  • Pages with Unique Content >=50%: The measure of unique word sequences (i.e., n-grams) found on these pages was greater than or equal to 50%.

Canonicals

The Canonicals section of the report compares the content on pages related by canonical tags and measures the accuracy of your canonical signals.

Distribution of Canonicals

These charts show the pages by segment that have canonical tags pointing to another page as the page to be indexed.

sc_canon_tags.jpg

  • Canonical Equal: Pages with a canonical tag that points to the page itself.

  • Canonical Not Set: Pages without a canonical tag.

  • Canonical Different: Pages with a canonical tag that points to another page, indicating the current page is a duplicate.
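The three categories above amount to a simple comparison between a page's URL and its canonical target. A minimal sketch follows, assuming both URLs are already absolute and normalized; the function name `canonical_status` is hypothetical, and a real crawler would also need to resolve relative URLs and handle multiple canonical tags.

```python
from typing import Optional


def canonical_status(page_url: str, canonical_url: Optional[str]) -> str:
    """Classify a page by its canonical tag.

    canonical_url is the href of the page's canonical tag,
    or None (or empty) when the page has no canonical tag.
    Uses exact string comparison, which assumes normalized URLs.
    """
    if not canonical_url:
        return "Canonical Not Set"
    if canonical_url == page_url:
        return "Canonical Equal"
    return "Canonical Different"


print(canonical_status("https://example.com/shoes", "https://example.com/shoes"))
print(canonical_status("https://example.com/shoes?color=red", "https://example.com/shoes"))
print(canonical_status("https://example.com/about", None))
```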

Canonical Similarity

These charts show whether your canonical tags point to pages with similar content, which is important since search engines may ignore the canonical signal if those pages are not similar.

sc_content_canonsimilar.jpg

Canonicals Insights

This table displays canonical insights based on the evolution across the compared crawls. The "# URLs" column displays the number of URLs matching the metric in the current Botify crawl, and the Change column displays the percentage of increase or decrease from the compared crawl.

  • Click a metric to display the list of corresponding URLs in a URL Explorer report.

  • Click the download icon to export the list of corresponding URLs as a CSV file.

  • Click the alert icon to define an alert for future changes in the corresponding metric.

sc_content_canoninsight.jpg

Metrics:

  • Pages with a Canonical Not Equal: Pages containing a canonical tag that points to another page.

  • Pages with noindex tag and Canonical Not Equal: Pages containing a noindex tag and a canonical tag that points to another page.

  • Non-canonical Pages with less than 50% in common with their canonical: Pages that point to a canonical page that is less than 50% similar to them.

  • Non-canonical Pages with 50-75% in common with their canonical: Pages that point to a canonical page that is 50-75% similar to them.

  • Non-canonical Pages with more than 75% in common with their canonical: Pages that point to a canonical page that is more than 75% similar to them.
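The three similarity ranges above can be expressed as a small bucketing function. This is a sketch only: the name `canonical_similarity_bucket` is hypothetical, and treating both 50% and 75% as part of the middle range is an assumption read from the metric names, since the exact boundary handling is not stated.

```python
def canonical_similarity_bucket(similarity_pct: float) -> str:
    """Bucket a non-canonical page by its similarity to its canonical target.

    similarity_pct is the percentage of content the page has in common
    with the page its canonical tag points to.
    """
    if similarity_pct < 50:
        return "less than 50%"
    if similarity_pct <= 75:
        return "50-75%"
    return "more than 75%"


for score in (12.0, 50.0, 75.0, 93.5):
    print(score, "->", canonical_similarity_bucket(score))
```

Pages in the lowest bucket are the ones most worth reviewing, since search engines may ignore a canonical signal between dissimilar pages.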

HTML Tags

The HTML Tags section of the report evaluates the presence of H1, Title, and meta description tags and identifies errors for indexable URLs. It also lets you analyze anchor text distribution across links and find heavily repeated anchor text.

To find information on H2 and H3 tags, use the H2 and H3 Content metrics as filters or columns in other reports.

HTML Tags Performance For Indexable URLs

This chart shows the distribution of HTML tag characteristics for all indexable URLs in the same zone (i.e., the combination of domain and language from the page's HTML tag "lang" attribute).

sc_content_tagperform.jpg

  • Unique: Pages with a tag not duplicated on another page in the same zone.

  • Duplicate: Pages with a tag the same as at least one other page in the same zone.

  • Not Set: Pages with a missing tag.
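The Unique / Duplicate / Not Set distinction above can be sketched by counting tag values per zone. The record layout (`url`, `zone`, and one key per tag) and the function `tag_status` are hypothetical, and the input is assumed to contain indexable URLs only.

```python
from collections import Counter


def _value(page: dict, tag: str) -> str:
    """Return the tag's text with surrounding whitespace removed ('' if missing)."""
    return (page.get(tag) or "").strip()


def tag_status(pages: list, tag: str = "title") -> dict:
    """Label each page's tag as Unique, Duplicate, or Not Set within its zone."""
    # How many pages in each zone share the same tag value.
    counts = Counter(
        (p["zone"], _value(p, tag)) for p in pages if _value(p, tag)
    )
    status = {}
    for p in pages:
        value = _value(p, tag)
        if not value:
            status[p["url"]] = "Not Set"
        elif counts[(p["zone"], value)] > 1:
            status[p["url"]] = "Duplicate"
        else:
            status[p["url"]] = "Unique"
    return status


crawl = [
    {"url": "/a", "zone": "example.com/en", "title": "Running Shoes"},
    {"url": "/b", "zone": "example.com/en", "title": "Running Shoes"},
    {"url": "/c", "zone": "example.com/fr", "title": "Running Shoes"},
    {"url": "/d", "zone": "example.com/en", "title": None},
]
print(tag_status(crawl))
```

In this sample, `/c` carries the same title as `/a` and `/b` but sits in a different zone, so it is not counted as a duplicate.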

HTML Tags Performance For Indexable Active / Not Active URLs

This chart shows the distribution of HTML tag characteristics for all indexable URLs in the same zone segmented by whether they received at least one user visit in the last 30 days.

sc_content_tagperformactive.jpg

HTML Tags Performance For Indexable URLs By Segment

This chart shows the distribution of HTML tag characteristics for all indexable URLs in the same zone by segment.

sc_content_tagbysegment.jpg

Link Anchors (Number Of Distinct Anchor Linking To A Page)

While SiteCrawler includes detailed reports on links, the diversity of anchor text used in page links is included in the Content report because anchor text provides signals about page content. These charts show the number of link anchors pointing to pages and the diversity of their link anchor text.

sc_content_linkanchor.jpg

HTML Tags Insights

This table displays HTML tag insights based on the evolution across the compared crawls. The "# URLs" column displays the number of URLs matching the metric in the current Botify crawl, and the Change column displays the percentage of increase or decrease from the compared crawl.

  • Click a metric to display the list of corresponding URLs in a URL Explorer report.

  • Click the download icon to export the list of corresponding URLs as a CSV file.

  • Click the alert icon to define an alert for future changes in the corresponding metric.

sc_content_taginsights.jpg

Metrics:

  • Indexable URLs with Duplicate Title: Indexable pages found in Botify's crawl with a title tag the same as another indexable page in the same zone.

  • Indexable URLs with Duplicate H1: Indexable pages found in Botify's crawl with an H1 tag the same as another indexable page in the same zone.

  • Indexable URLs with Duplicate Description: Indexable pages found in Botify's crawl with a description meta tag the same as another indexable page in the same zone.

  • Indexable URLs with Title Not Set: Indexable pages found in Botify's crawl with a missing title tag.

  • Indexable URLs with H1 Not Set: Indexable pages found in Botify's crawl with a missing H1 tag.

  • Indexable URLs with Description Not Set: Indexable pages found in Botify's crawl with a missing description meta tag.

  • URLs with Title Length > 60: Indexable pages found in Botify's crawl with titles exceeding 60 characters.

Structured Data

If your site pages include structured data, this report section shows the structured data types detected in Botify's crawl. Use structured data metrics as filters and columns in other Botify reports to correlate your structured data metrics with sitewide performance metrics.

Structured Data Distribution

These charts show the number of pages in Botify's crawl in each structured data type.

sc_content_structured.jpg

Errors In Your Structured Data

This chart shows errors encountered during Botify's crawl by structured data type. Click into the error segments to find which URLs produced errors.

👉 This chart does not display if no errors were detected in the selected crawl.

sc_content_structured_errors.jpg
