SpeedWorkers Implementation
Updated over 2 weeks ago

🛠 This document describes the customer infrastructure requirements for running SpeedWorkers.

Overview

SpeedWorkers delivers fully rendered versions of your pages to search engine bots in hundreds of milliseconds. By caching a mirror version of your site pages, pre-rendering the JavaScript, and serving your pages to bots, you can ensure search engines see all your page content and index your pages faster. The burden of rendering JavaScript is removed from search engines, leaving bots more time to crawl more pages on your site and improving your site’s indexation. Since SpeedWorkers aims to maximize your site’s crawl budget to improve your organic search traffic, it only responds to bot page requests, not user requests.

Implementation

SpeedWorkers is deployed at the CDN level to deliver fully rendered pages from the SpeedWorkers inventory to search engine bots.

In the recommended configuration, you redirect all eligible search engine bot queries to Botify’s CDN, which manages the calls to SpeedWorkers. If the requested page is not in the SpeedWorkers inventory, SpeedWorkers will retrieve it from your origin server and then serve it to the bot.

Advanced Implementation

An advanced option allows you to configure your CDN to handle the response to bots when pages are not in the SpeedWorkers inventory. This option is available if you are using one of the supported CDN types (please refer to the Configuration section), though this method may require a complex configuration of your CDN.

Comparison of Implementation Methods

The two implementation methods serve pages in the SpeedWorkers inventory to bots in the same way. The methods differ in functionality when bots request pages not in the SpeedWorkers inventory. The following provides a comparison of these methods.

Method: Recommended

  • Benefit: Easy configuration.

  • Benefit: Compatible with more CDNs than the Advanced implementation.

  • Limitation (page delivery time): For pages not in the SpeedWorkers inventory, there may be marginal added latency compared to the Advanced implementation, since the request is routed between SpeedWorkers and your CDN one additional time.

  • Limitation (IPs): When pages not in the inventory are requested from the origin, the requests come from the Botify CDN IPs, which cannot be guaranteed since they belong to a third party.

Method: Advanced

  • Benefit: No additional latency when the page is not in the SpeedWorkers inventory.

  • Limitation: May require a complex configuration of your CDN unless you are using Cloudflare.

Requirements

The following are the minimum client infrastructure requirements for using SpeedWorkers. The client infrastructure is the customer’s technical infrastructure that handles the network traffic for its website.

Routing

  • The client infrastructure must redirect all bot traffic to the SpeedWorkers origin (with a timeout) based on the user-agent header.

  • The client infrastructure must handle all asset traffic (e.g., CSS, JS, images).

Caching

  • The client infrastructure must disable content caching when the SpeedWorkers origin responds. With the “Advanced” implementation, responses from the client infrastructure should also have content caching disabled, if possible.

Security

The client infrastructure must support authentication through one of the following methods:

  • A token sent in an HTTP header of the request to the SpeedWorkers origin.

  • A token added as a query parameter to the URL when redirecting requests to the SpeedWorkers origin.

👉 If subdomains need to be handled by SpeedWorkers, the original URL requested by bots must also be passed in a query parameter or a header.

Configuration

To configure your infrastructure to route bot requests to the Botify CDN, complete the following steps and then follow the instructions specific to your CDN in the guides linked in the Known Supported CDNs table.

Configuring your infrastructure for SpeedWorkers includes the following steps:

Filter the traffic

Your infrastructure must only request content from SpeedWorkers when requests come from search engine bots, and requests must be restricted to HTML pages only; your infrastructure will keep serving all resources (e.g., JS, CSS, images, videos).

  1. Filter the traffic from the search engine bots.

    • During the integration tests, only filter the traffic from the Botify Bot:

      botify-bot-sw-*
    • In production, modify your CDN rules to direct bot traffic based on the user-agent to the SpeedWorkers origin. For example, if your agreement with Botify includes Googlebot and Bingbot, use the following regular expression:

      Googlebot\/|bingbot\/|botify-bot-sw-|Google-InspectionTool\/|GoogleOther
  2. Do not call the ADN origin for the website’s resources. Apply the following filter to URL extensions:

    ["\.(js|map|css|jpg|jpeg|png|ico|gif|tiff|svg|woff|woff2|ttf|eot|mp4|otf|txt|xml)$"]
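The two filters above can be combined into a single routing decision. The following is a minimal Python sketch of that logic, not a CDN configuration: the function name `should_route_to_speedworkers` is hypothetical, the user-agent pattern treats `Google-InspectionTool/` and `GoogleOther` as separate alternatives, and applying the extension filter to the URL path only (ignoring any query string) is an assumption of this sketch.

```python
import re
from urllib.parse import urlsplit

# User-agent alternatives from the production example (Googlebot and Bingbot
# agreement). Adjust the list to the bots covered by your own agreement.
BOT_UA = re.compile(
    r"Googlebot/|bingbot/|botify-bot-sw-|Google-InspectionTool/|GoogleOther"
)

# Resource extensions that must keep being served by your own infrastructure.
RESOURCE_EXT = re.compile(
    r"\.(js|map|css|jpg|jpeg|png|ico|gif|tiff|svg|woff|woff2|ttf|eot|mp4|otf|txt|xml)$"
)


def should_route_to_speedworkers(user_agent: str, url: str) -> bool:
    """Return True when a listed bot requests something that is not a resource."""
    # Assumption: the extension is checked on the path, without the query string.
    path = urlsplit(url).path
    if RESOURCE_EXT.search(path):
        return False
    return BOT_UA.search(user_agent) is not None
```

For example, a Googlebot request for `/products/42` would be routed to SpeedWorkers, while the same bot requesting `/static/app.js?v=3` would keep being served by your infrastructure.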

Redirect traffic to the ADN origin

❗️ Do not return a 3XX response to bots to redirect the traffic to SpeedWorkers: it could compromise your token and would have a strong negative impact on your SEO. Instead, switch the origin as follows.

  1. Redirect the incoming traffic to the ADN origin for search engine bots, the Botify user-agent, and web pages only (no resources):

    • Production:

      {adn-id}.sw.adn.cloud
    • SpeedWorkers Integration Tests:

      {adn-id}.sw.staging.adn.cloud
    • Note: The integration should be done in a tool that respects the TTL of DNS entries since the IPs may change.

  2. The following are the methods for requesting pages from the ADN endpoint. When performing integration tests, use the following user-agent:

    botify-bot-sw-test

a. All in URL

curl --request GET \
  --url 'https://{adn-id}.sw.staging.adn.cloud/?url=https://client.origin/path&x-sw-adn-token=<token>' \
  --header 'user-agent: <bot-user-agent>'

b. Hybrid

curl --request GET \
  --url 'https://{adn-id}.sw.staging.adn.cloud/?url=https://client.origin/path' \
  --header 'user-agent: <bot-user-agent>' \
  --header 'x-sw-adn-token: <token>'

curl --request GET \
  --url 'https://{adn-id}.sw.staging.adn.cloud/?x-sw-adn-token=<token>' \
  --header 'user-agent: <bot-user-agent>' \
  --header 'x-sw-url: https://client.origin/path'

c. All in headers

curl --request GET \
  --url 'https://{adn-id}.sw.staging.adn.cloud/' \
  --header 'user-agent: <bot-user-agent>' \
  --header 'x-sw-url: https://client.origin/path' \
  --header 'x-sw-adn-token: <token>'
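The request variants above differ only in whether the page URL and the token travel in the query string or in headers. The Python sketch below builds the URL and headers for each variant without sending anything. The function name `build_adn_request` and its parameters are hypothetical, and percent-encoding the query values with `urlencode` is an assumption (the examples above show the values unencoded).

```python
from urllib.parse import urlencode

TOKEN_KEY = "x-sw-adn-token"  # token parameter/header name used in the examples
URL_HEADER = "x-sw-url"       # header carrying the originally requested URL


def build_adn_request(adn_host, page_url, token, user_agent,
                      url_in_query=True, token_in_query=True):
    """Return (url, headers) for one of the request variants.

    url_in_query and token_in_query choose whether each value is sent in the
    query string (True) or in a header (False).
    """
    query = {}
    headers = {"user-agent": user_agent}
    if url_in_query:
        query["url"] = page_url
    else:
        headers[URL_HEADER] = page_url
    if token_in_query:
        query[TOKEN_KEY] = token
    else:
        headers[TOKEN_KEY] = token
    # Assumption: values are percent-encoded before being placed in the URL.
    qs = urlencode(query)
    url = f"https://{adn_host}/" + (f"?{qs}" if qs else "")
    return url, headers
```

With both flags left at `True` this corresponds to "All in URL", with both set to `False` to "All in headers", and with one of each to the two "Hybrid" forms.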

Additional CDN Configuration

  1. Preserve headers sent from both sides:

    • The requester: Namely, the search engine bots

    • The ADN origin

  2. Disable caching of pages when the response comes from the ADN origin.

  3. Disable retries if the ADN origin responds with an HTTP error code (e.g., 5xx, 4xx).

  4. Enable a 15-second timeout if the ADN origin does not respond.

  5. Fallback procedure: If the requested page is not in the SpeedWorkers cache, SpeedWorkers retrieves it from your origin. This requires the following:

    • Do not redirect traffic to the ADN endpoint when the requests come from: Mozilla/5.0 (compatible; botify; http://botify.com) fallback.

    • Do not block IPs from the ADN endpoint (this also means you must not block requests from unknown IPs to your staging infrastructure, since it is used for integration tests).
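The timeout and no-retry rules above can be pictured as a single-attempt call with a fallback. The following Python sketch only illustrates that control flow, and it rests on several assumptions: `fetch_from_adn` and `fetch_from_origin` are hypothetical callables standing in for your CDN's origin requests, and serving the page from your own origin after an ADN error response is an assumption of this sketch (the requirements above state only that the request must not be retried).

```python
ADN_TIMEOUT_SECONDS = 15  # timeout from the requirements above


def serve_bot_request(fetch_from_adn, fetch_from_origin):
    """Try the ADN origin once; fall back to your own origin on failure.

    fetch_from_adn(timeout) and fetch_from_origin() are hypothetical callables
    returning (status, headers, body). fetch_from_adn is expected to raise
    TimeoutError when no response arrives within the timeout.
    """
    try:
        status, headers, body = fetch_from_adn(ADN_TIMEOUT_SECONDS)
    except TimeoutError:
        # No response within 15 seconds: serve the page from your origin.
        return fetch_from_origin()
    if status >= 400:
        # Single attempt only: the ADN origin is never called a second time.
        # Falling back to your origin here is an assumption of this sketch.
        return fetch_from_origin()
    # Pass the ADN response through unchanged, headers included.
    return status, headers, body
```

Passing the fetch functions in as arguments keeps the sketch independent of any particular CDN or HTTP client.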

Notes:

  • When a request is routed to SpeedWorkers, the IPs of the requesters are appended to the "X-Forwarded-For" header (e.g., X-Forwarded-For: <originIP>, <firstRedirectIP>, ... , <lastRedirectIP>). Be careful if you apply rules based on the IPs provided here. For example, if you base your Geo-IP rule on this header, our IPs could affect this resolution (they are based in North America).

  • If you measure traffic at the origin, you will only see traffic from the fallback user agent. Please change the source of your traffic logs to the CDN.

💡 If you cannot modify your traffic logs source, please contact Support.
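To illustrate the X-Forwarded-For note above, here is a minimal Python sketch that splits the header and picks the first entry, which in the format shown is the original requester's IP. The function names are hypothetical, and the example IPs in the usage note come from documentation-reserved ranges.

```python
from typing import List, Optional


def split_forwarded_for(header_value: str) -> List[str]:
    """Split an X-Forwarded-For value into its IPs, in the order they appear."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def original_requester_ip(header_value: str) -> Optional[str]:
    """Return the first (leftmost) IP, or None if the header is empty."""
    ips = split_forwarded_for(header_value)
    return ips[0] if ips else None
```

For `203.0.113.7, 198.51.100.1` the second function returns `203.0.113.7`. A Geo-IP rule that reads the last entry instead would resolve to one of the intermediate IPs, which is the situation the note warns about. Keep in mind that this header can be set by the client, so it should not be trusted for security decisions.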

Verify Integration

Conduct the following tests to confirm your settings are correct for the SpeedWorkers integration.

"Always Success" Test

The "always success" test forces SpeedWorkers to return a cache hit even if the page is not in the cache. This test ensures that SpeedWorkers is called and that its reply is returned to the bot:

Always Success (force a cache hit in SW)
--------------
URL: Your homepage (https://www.mywebsite.com)

Headers:
User-Agent: botify-bot-sw-test
X-Sw-Options: passed-through,request-time,always-success,echo-67674
X-Sw-Options-Auth: XXXXXX <= the website ID provided by Botify

Expected response:
Status: 200
Body: "Success"
Headers:
X-Ftlcdn-Status: false
X-Sw-Echo: 67674
X-Sw-Passed-Through: true
X-Sw-Status: success

"Cache Miss" Test

The "cache miss" test forces SpeedWorkers to return a cache miss even if it has the page in the cache. This test ensures that when SpeedWorkers can't deliver the page, the request falls back properly:

URL: Your homepage (https://www.mywebsite.com)

Headers:
User-Agent: botify-bot-sw-test
X-Sw-Options: passed-through,request-time,always-notfound,echo-41521
X-Sw-Options-Auth: XXXXXX <= the website ID provided by Botify

Expected response:
Status: 200
Body: your homepage
Headers:
NO X-Sw-... headers

"Timeout" Test

The "timeout" test forces SpeedWorkers to delay its response enough to trigger the timeout in your environment. This test ensures that when SpeedWorkers doesn't reply, the request falls back properly:

URL: Your homepage (https://www.mywebsite.com)

Headers:
User-Agent: botify-bot-sw-test
X-Sw-Options: passed-through,request-time,always-timeout,echo-42300
X-Sw-Options-Auth: XXXXXX <= the website ID provided by Botify

Expected response after several seconds:
Status: 200
Body: your homepage
Headers:
NO X-Sw-... headers
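The expected responses of the three tests can be checked mechanically. The following Python sketch validates a status code and a dictionary of response headers that you have already collected with the HTTP client of your choice. The function names are hypothetical, header names are compared case-insensitively as an assumption, and the body and the X-Ftlcdn-Status header are deliberately not checked.

```python
def sw_headers(headers):
    """Return the response headers whose name starts with X-Sw-."""
    return {k: v for k, v in headers.items() if k.lower().startswith("x-sw-")}


def check_always_success(status, headers, echo):
    """Check the "always success" test: 200 plus the expected X-Sw-... headers.

    echo is the number sent in the echo-NNNNN option of X-Sw-Options.
    """
    sw = {k.lower(): v for k, v in sw_headers(headers).items()}
    return (
        status == 200
        and sw.get("x-sw-status") == "success"
        and sw.get("x-sw-passed-through") == "true"
        and sw.get("x-sw-echo") == str(echo)
    )


def check_fallback(status, headers):
    """Check the "cache miss" and "timeout" tests: 200 and no X-Sw-... headers."""
    return status == 200 and not sw_headers(headers)
```

For instance, `check_always_success(200, response_headers, 67674)` should return `True` for the first test, and `check_fallback(200, response_headers)` should return `True` for the other two.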

Known Supported CDNs

The following table identifies the CDNs that have been tested with SpeedWorkers. This is not intended to be an exhaustive list of compatible CDNs. If your CDN meets the requirements identified above, it should be compatible with SpeedWorkers; please contact your SEO Success Manager (SSM) to discuss your current infrastructure. If you configure your CDN to route to your origin server when pages are not in the SpeedWorkers inventory, you must have one of the CDNs identified as compatible with the Advanced implementation.

CDN/Load Balancer

Recommended Implementation

Advanced Implementation

Akamai

Azure

Cloudflare

CloudFront

Fasterize

Fastly

Imperva

NGINX

Refer to the implementation guide specific to your CDN type for configuration instructions. If your CDN does not have a guide linked above, please contact your SEO Success Manager.


Contact Support

If you need any assistance, please contact Support.

