
Understanding the Page Life Cycle in SpeedWorkers

Updated over a year ago

📘 This article provides information on the page lifecycle in the SpeedWorkers inventory. SpeedWorkers is part of Botify's Activation Suite, available as an option with a Botify Pro or Enterprise plan.

Overview

Your site's pages must be added to the SpeedWorkers inventory before they can be cached and delivered to bots. Pages remain in the inventory for up to X days after being discovered, depending on your inventory source and cache behavior settings in SpeedWorkers. If multiple input sources are configured, a URL will likely be discovered in more than one list. In that case, the URL remains in the inventory until it is removed from all input sources.
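The multi-source rule can be pictured as a set union: a URL is in the inventory as long as at least one input source still lists it. The following is a minimal illustrative sketch, not SpeedWorkers code; the source names (`bot_requests`, `sitemap`), the example URLs, and the `inventory_urls` helper are hypothetical.

```python
def inventory_urls(sources: dict[str, set[str]]) -> set[str]:
    """Return every URL that is listed by at least one input source."""
    result: set[str] = set()
    for urls in sources.values():
        result |= urls
    return result


# Hypothetical input sources, each with its own list of discovered URLs.
sources = {
    "bot_requests": {"https://example.com/a", "https://example.com/b"},
    "sitemap": {"https://example.com/b"},
}

# /b is listed by two sources, so removing it from one keeps it in the inventory.
sources["bot_requests"].discard("https://example.com/b")
still_present = "https://example.com/b" in inventory_urls(sources)

# Only after the last source drops it does the URL leave the inventory.
sources["sitemap"].discard("https://example.com/b")
finally_removed = "https://example.com/b" not in inventory_urls(sources)
```

In this sketch, both `still_present` and `finally_removed` evaluate to `True`.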

Example

The page https://www.botify.com/platform/botify-activation?seo=great was requested three times by Googlebot between January 1 and January 2. The page had never been seen before and was not in the inventory. The “Bots Request” inventory source was configured to add pages to the inventory once Google had requested them at least two times within five days. Based on these settings, the page was added to the inventory on January 2, as soon as it reached the defined threshold. The page was removed from the inventory on January 6, four days after Googlebot last requested it, since the defined refresh rate of one week had not been met.

sw_botrequest_example.jpg
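The "two requests in five days" threshold from the example can be sketched as a sliding-window count. This is only an illustration of the idea, not how SpeedWorkers implements it. The function names, the year 2024, and the split of the three requests across January 1 and January 2 are assumptions, and the sketch covers only when a page is added, not when it is removed.

```python
from datetime import date, timedelta


def meets_threshold(request_dates, on_day, min_requests=2, window_days=5):
    """True if at least min_requests fall in the window_days ending on on_day."""
    window_start = on_day - timedelta(days=window_days - 1)
    hits = [d for d in request_dates if window_start <= d <= on_day]
    return len(hits) >= min_requests


def first_day_added(request_dates, min_requests=2, window_days=5):
    """Return the first request day on which the threshold is met, or None."""
    for day in sorted(set(request_dates)):
        if meets_threshold(request_dates, day, min_requests, window_days):
            return day
    return None


# Assumed distribution: one Googlebot request on January 1, two on January 2.
requests = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 2)]
added_on = first_day_added(requests)
```

With these assumed dates, `added_on` is January 2, matching the example: the single request on January 1 is below the threshold, and the requests on January 2 bring the count within the window to three.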

SpeedWorkers Page Status

The following are the status types for pages in your inventory:

  • Not Indexed: The page was added to the inventory but has never been cached.

  • Indexed: The page is in the cache and is up-to-date.

  • Outdated: The page is still in the cache but is stale and will be flushed from the cache soon.

  • Expired: The page was once in the SpeedWorkers cache but is not currently in it. A previously indexed page expires when it is not refreshed before the next refresh cycle begins. Reasons for expiration include a full batch job or a server error encountered while refreshing the page from the origin server.

sw_page_states.jpg
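The four statuses above can be read as a small decision tree over three facts about a page. The sketch below is one way to express that reading; the `classify` function and its boolean inputs (`ever_cached`, `in_cache`, `up_to_date`) are hypothetical names chosen for illustration and are not part of any SpeedWorkers API.

```python
from enum import Enum


class PageStatus(Enum):
    NOT_INDEXED = "Not Indexed"
    INDEXED = "Indexed"
    OUTDATED = "Outdated"
    EXPIRED = "Expired"


def classify(ever_cached: bool, in_cache: bool, up_to_date: bool) -> PageStatus:
    """Map the status definitions onto three (assumed) facts about a page."""
    if not ever_cached:
        # Added to the inventory but never cached.
        return PageStatus.NOT_INDEXED
    if not in_cache:
        # Was cached at one time, but is no longer in the cache.
        return PageStatus.EXPIRED
    # Still in the cache: either fresh or stale.
    return PageStatus.INDEXED if up_to_date else PageStatus.OUTDATED


status_new = classify(ever_cached=False, in_cache=False, up_to_date=False)
status_stale = classify(ever_cached=True, in_cache=True, up_to_date=False)
```

Here `status_new` is `PageStatus.NOT_INDEXED` and `status_stale` is `PageStatus.OUTDATED`.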

SpeedWorkers' Inventory Monitoring page shows the daily state of the inventory. Ideally, all pages in the inventory are in Indexed status. Avoid leaving pages in Expired or Not Indexed status for long periods.

sw_inventory_example.jpg