📘 This article describes the ActionBoard "Crawl Probability" and "Visit Probability" machine learning metrics and provides insight into how you can leverage them to optimize your site. ActionBoard is part of Botify's Intelligence suite, available with all Botify plans.
Overview
The "Crawl Probability" and "Visit Probability" metrics were introduced to identify the pages on your site that are most likely to get bot and user visits. The probability metrics are measured on a scale between zero and one and are computed using ActionBoard’s machine-learning models. These models are trained using metrics from Botify’s crawl to predict the probability of a page being crawled and refreshed by bots and the probability of a page receiving organic visits.
How the Models are Trained
Botify evaluates the characteristics of each URL to determine whether it is likely to be crawled or visited. The models use crawl data together with visit data from analytics, or from logs if you do not have analytics integrated with Botify. The models depend on current and accurate crawl and visit data: if visit data is not ingested correctly or is outdated, the models will produce inaccurate results.
Model Accuracy
The accuracy of the models varies from crawl to crawl because they are trained separately on each crawl. Accuracy improves as the size of a crawl increases: with more URLs in the crawl, the model has more examples from which to learn the pattern. Accuracy also improves with balanced data. For example, if 99% of the URLs in a crawl are crawled, the model cannot learn the pattern of non-crawled URLs. Similarly, if 99% of the URLs are not visited, the model cannot learn the pattern of visited URLs.
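The balance condition described above can be checked before relying on the probabilities. The following is a minimal sketch, not Botify's implementation: the function names, the list of boolean labels, and the 0.99 cutoff (taken from the 99% example) are all assumptions made for illustration.

```python
def positive_share(labels):
    """Return the fraction of True labels (e.g. crawled or visited URLs)."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(1 for label in labels if label) / len(labels)


def is_imbalanced(labels, cutoff=0.99):
    """Flag a data set where one class makes up `cutoff` or more of the URLs.

    `cutoff` is an assumed value mirroring the 99% example in the text;
    it is not a documented Botify threshold.
    """
    share = positive_share(labels)
    return share >= cutoff or share <= 1 - cutoff


# Hypothetical crawl: 999 crawled URLs and 1 non-crawled URL.
mostly_crawled = [True] * 999 + [False]
# Hypothetical crawl with an even split.
balanced = [True] * 500 + [False] * 500

print(is_imbalanced(mostly_crawled))  # one class dominates
print(is_imbalanced(balanced))        # both classes are well represented
```

If a crawl is flagged this way, treat the resulting probabilities with more caution, since the model saw very few examples of the minority class.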
Use Cases
The following are examples of using these metrics to influence your site optimizations. Use the "Has Crawls" metric to determine crawled vs non-crawled pages:
Condition | Probability Rating | Recommended Actions |
Non-crawled pages | High crawl probability | |
Non-crawled pages | Low crawl probability | Investigate whether these pages have technical issues to fix, such as broken links or crawl errors. |
Crawled pages | Low crawl probability | Many similar pages in Botify’s crawl are not crawled by bots, so no action is required. |
Crawled and non-crawled pages | High crawl probability | Use the list of high-crawl probability pages to optimize your sitemap and/or robots.txt file to accurately represent your site's structure. |
Non-crawled pages | High visit probability | |
Crawled and non-crawled pages | High visit probability | |
Crawled and non-crawled pages | High visit probability | Use the list of high-visit probability pages to guide the creation of new content that is optimized for users. |
Crawled and non-crawled pages | Low visit probability | Check for content problems, such as thin or duplicate content. |
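The crawl-probability rows of the table can be applied to an exported list of pages with a small helper. This is a sketch under stated assumptions: the field names `has_crawls` and `crawl_probability`, the example URLs, and the 0.5 cutoff separating "high" from "low" are hypothetical and are not defined by Botify.

```python
HIGH_CUTOFF = 0.5  # assumed boundary between "low" and "high" probability


def crawl_action(has_crawls, crawl_probability, cutoff=HIGH_CUTOFF):
    """Map a page to the recommended action from the crawl-probability rows."""
    if not 0.0 <= crawl_probability <= 1.0:
        raise ValueError("crawl_probability must be between 0 and 1")
    if crawl_probability >= cutoff:
        # Applies to crawled and non-crawled pages alike.
        return "Use in sitemap and/or robots.txt optimization"
    if has_crawls:
        return "No action required"
    return "Investigate technical issues (broken links, crawl errors)"


# Hypothetical export rows; URLs and values are made up for illustration.
pages = [
    {"url": "/a", "has_crawls": False, "crawl_probability": 0.9},
    {"url": "/b", "has_crawls": False, "crawl_probability": 0.1},
    {"url": "/c", "has_crawls": True, "crawl_probability": 0.2},
]

for page in pages:
    action = crawl_action(page["has_crawls"], page["crawl_probability"])
    print(page["url"], "->", action)
```

The cutoff is the main design choice here: choose it based on how the probabilities are distributed for your own site rather than treating 0.5 as a fixed rule.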