
About ActionBoard Probability Metrics


📘 This article describes the ActionBoard "Crawl Probability" and "Visit Probability" machine learning metrics and provides insight into how you can leverage them to optimize your site. ActionBoard is part of Botify's Intelligence suite, available with all Botify plans.

Overview

The "Crawl Probability" and "Visit Probability" metrics were introduced to identify the pages on your site that are most likely to get bot and user visits. The probability metrics are measured on a scale between zero and one and are computed using ActionBoard’s machine-learning models. These models are trained using metrics from Botify’s crawl to predict the probability of a page being crawled and refreshed by bots and the probability of a page receiving organic visits.

[Screenshot: ActionBoard probability metrics]

How the Models are Trained

Botify evaluates the characteristics of each URL to determine whether it is likely to be crawled or visited, using crawl and visit data from your analytics integration (or from log files if you do not have analytics integrated with Botify). The models depend on current, accurate visit and crawl data: if visit data is not ingested correctly or is out of date, the models will produce inaccurate results.

Model Accuracy

The accuracy of the models varies from crawl to crawl because they are retrained for each crawl. Accuracy improves as the size of a crawl increases: with more URLs in the crawl, there is a greater opportunity to learn the pattern. Accuracy also improves with balanced data. For example, if 99% of the URLs in a crawl are crawled by bots, the model cannot learn the pattern of non-crawled URLs; similarly, if 99% of the URLs are not visited, the model cannot learn the pattern of the visited URLs.
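To gauge how balanced your own data is before relying on the probabilities, you can check the crawled/non-crawled and visited/non-visited splits in a URL-level export. The following is a minimal sketch, not part of the ActionBoard product itself; the file path and the "has_crawls" and "nb_visits" column names are assumptions to adjust to whatever your export actually contains.

```python
# Minimal sketch: check how balanced your crawl and visit data are.
# Assumes a URL-level CSV export with a boolean (or 0/1) "has_crawls"
# column and an integer "nb_visits" column; adjust names to your export.
import pandas as pd

df = pd.read_csv("botify_url_export.csv")

crawled_share = df["has_crawls"].astype(bool).mean()   # fraction of URLs crawled by bots
visited_share = (df["nb_visits"] > 0).mean()           # fraction of URLs with organic visits

print(f"Crawled URLs: {crawled_share:.1%} | non-crawled: {1 - crawled_share:.1%}")
print(f"Visited URLs: {visited_share:.1%} | non-visited: {1 - visited_share:.1%}")

# If either split is extremely lopsided (e.g., 99% / 1%), the corresponding
# model has little data from which to learn the minority class's pattern.
for name, share in [("crawl", crawled_share), ("visit", visited_share)]:
    if min(share, 1 - share) < 0.01:
        print(f"Warning: {name} data is highly imbalanced; expect lower model accuracy.")
```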

Use Cases

The following examples show how you can use these metrics to inform your site optimizations. Use the "Has Crawls" metric to separate crawled from non-crawled pages; a minimal sketch for building these page lists programmatically follows the table.

[Screenshot: filtering on the "Has Crawls" metric]

Non-crawled pages with a high crawl probability:

  • Investigate whether these pages are discoverable to bots through links from crawled pages.

  • Determine whether there is a crawl budget issue to optimize.

  • Use this information to inform internal linking strategies and to ensure that important pages are easily discoverable by bots.

Non-crawled pages with a low crawl probability:

  • Investigate whether these pages have technical issues to fix, such as broken links or crawl errors.

Crawled pages with a low crawl probability:

  • Many pages similar to this group are not crawled in Botify's crawl, so no action is required.

Crawled and non-crawled pages with a high crawl probability:

  • Use the list of high crawl probability pages to optimize your sitemap and/or robots.txt file so they accurately represent your site's structure.

Non-crawled pages with a high visit probability:

  • Investigate whether these pages are discoverable by bots.

  • Determine whether there is a technical problem that makes these pages unreachable.

Crawled and non-crawled pages with a high visit probability:

  • Use the list of high visit probability pages to inform link-building strategies, and prioritize efforts for high-quality links from/to these pages.

  • Use the list of high visit probability pages to guide the creation of new content that is optimized for users.

Crawled and non-crawled pages with a low visit probability:

  • Check for content problems, such as thin or duplicate content.
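If you export URL-level data that includes the "Has Crawls" flag and the probability metrics, you can reproduce these groupings with a short script and hand each resulting list to the relevant team. The sketch below is a minimal example under stated assumptions, not Botify's own implementation: the column names and the 0.8/0.2 thresholds for "high" and "low" probability are illustrative and should be adapted to your export.

```python
# Minimal sketch: build the page lists described above from a URL-level export.
# Column names ("url", "has_crawls", "crawl_probability", "visit_probability")
# and the HIGH/LOW thresholds are assumptions; adjust them to your export.
import pandas as pd

df = pd.read_csv("botify_url_export.csv")
df["has_crawls"] = df["has_crawls"].astype(bool)
HIGH, LOW = 0.8, 0.2

segments = {
    # Non-crawled pages the model expects bots to crawl: check discoverability and crawl budget.
    "non_crawled_high_crawl_prob": df[~df["has_crawls"] & (df["crawl_probability"] >= HIGH)],
    # Non-crawled pages the model does not expect to be crawled: check for technical issues.
    "non_crawled_low_crawl_prob": df[~df["has_crawls"] & (df["crawl_probability"] <= LOW)],
    # Any page likely to attract organic visits: prioritize for links and new content.
    "high_visit_prob": df[df["visit_probability"] >= HIGH],
    # Pages unlikely to attract visits: review for thin or duplicate content.
    "low_visit_prob": df[df["visit_probability"] <= LOW],
}

for name, pages in segments.items():
    print(f"{name}: {len(pages)} URLs")
    pages[["url"]].to_csv(f"{name}.csv", index=False)
```

Each resulting CSV can then be reviewed against the recommended actions listed above.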

