Deploying SpeedWorkers with AWS ALB
Updated over 2 weeks ago

🛠 This document explains the configuration requirements for running SpeedWorkers with the Amazon Web Services Application Load Balancer. This documentation does not apply to any other load balancer.

Source code:

The lambda performs a request to SpeedWorkers to fetch the cached version of the page requested by a search engine bot.

  • If SpeedWorkers has the page in its cache, the lambda returns the cached version of the page.

  • If SpeedWorkers cannot deliver the page, the lambda calls the origin server and returns the origin server response instead.
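The flow above can be sketched as follows. This is a minimal illustration, not the actual lambda source: the function name, the response tuple, and the two injected fetch callables are hypothetical, and any SpeedWorkers error is assumed to count as "cannot deliver".

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical response shape for this sketch: (status_code, headers, body)
Response = Tuple[int, Dict[str, str], bytes]


def serve_bot_request(
    path: str,
    fetch_from_speedworkers: Callable[[str], Optional[Response]],
    fetch_from_origin: Callable[[str], Response],
) -> Response:
    """Return the SpeedWorkers cached page if available, else the origin response."""
    try:
        cached = fetch_from_speedworkers(path)
    except Exception:
        # Assumption: any SpeedWorkers failure (timeout, network error) is a miss.
        cached = None
    if cached is not None:
        return cached
    # SpeedWorkers could not deliver the page: fall back to the origin server.
    return fetch_from_origin(path)
```

Passing the two fetchers as parameters keeps the decision logic testable without network access.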

👉 The lambda response size is limited to 1 MB, so the lambda can only be used when the compressed page and its headers are certain to total less than 1 MB.
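A rough way to check whether a page is likely to fit is to compress it and add an estimate of the header size. The sketch below assumes gzip compression and treats 1 MB as 1,048,576 bytes. Both are assumptions, and the function name is hypothetical.

```python
import gzip
from typing import Dict

# Assumption: 1 MB is interpreted as 1024 * 1024 bytes.
RESPONSE_LIMIT_BYTES = 1024 * 1024


def fits_in_lambda_response(body: bytes, headers: Dict[str, str]) -> bool:
    """Estimate whether the gzip-compressed body plus headers stay under the limit."""
    compressed_size = len(gzip.compress(body))
    # Approximate each header as "Name: value\r\n" (4 extra bytes per header).
    header_size = sum(len(name) + len(value) + 4 for name, value in headers.items())
    return compressed_size + header_size < RESPONSE_LIMIT_BYTES
```

Binary bodies are base64-encoded in a lambda response, which adds roughly a third to their size, so it is prudent to leave headroom below the limit.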

The following rules must be added to the AWS ALB listener:

[Image: AWS ALB listener rules]

In the rule forwarding to SpeedWorkers, set the bots to intercept based on your Botify subscription plan:

  • *googlebot/*

  • *bingbot/*

  • *yandexbot/*

  • *yandexmobilebot/*

  • *baiduspider/*

  • *baiduspider+*

  • *applebot/*

  • *botify-bot-sw*

👉 Refer to Supported Bots for the full list of user agents.

*botify-bot-sw* is required to let SpeedWorkers perform automatic tests.
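To preview which user agents the patterns above would catch, the wildcard matching can be approximated locally. This sketch assumes the ALB compares header values case-insensitively with `*` as a wildcard. The function name is hypothetical, and the pattern list should be trimmed to match your plan.

```python
from fnmatch import fnmatchcase

# Patterns copied from the list above.
BOT_PATTERNS = [
    "*googlebot/*",
    "*bingbot/*",
    "*yandexbot/*",
    "*yandexmobilebot/*",
    "*baiduspider/*",
    "*baiduspider+*",
    "*applebot/*",
    "*botify-bot-sw*",
]


def is_intercepted_bot(user_agent: str) -> bool:
    """Return True if the User-Agent matches one of the wildcard patterns."""
    lowered = user_agent.lower()  # emulate case-insensitive comparison
    return any(fnmatchcase(lowered, pattern) for pattern in BOT_PATTERNS)
```

For example, `is_intercepted_bot("botify-bot-sw-test")` matches the `*botify-bot-sw*` pattern, while a regular browser user agent matches none of them.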

If your resources are hosted on the same origin and pass through the same AWS ALB, you may need to add rules that keep requests for these resources from reaching the SpeedWorkers lambda. This avoids unnecessary lambda calls.

❗️Ensure the fallback rules are set first in the list to avoid looping calls that could trigger DDoS protection or similar issues.

When the AWS ALB receives a request from a search engine bot, it will match the rule forwarding to SpeedWorkers and call the lambda.

  • If SpeedWorkers can deliver the page, the lambda returns the SpeedWorkers response to the caller.

  • If SpeedWorkers can’t deliver the page, the lambda adds the X-Sw-Fallback header, set to the secret value that matches the first rule, and calls the origin target group. The lambda then returns the origin target group response to the bot.

If the request doesn’t come from a search engine bot, it matches the last rule and goes to the origin target group.
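The rule order described above can be simulated with a first-match function. This is only an illustration of the evaluation order: the target group names are hypothetical, the bot list is shortened, and the real matching is done by the AWS ALB listener, not by code.

```python
from fnmatch import fnmatchcase
from typing import Dict

# Shortened pattern list for illustration only.
BOT_PATTERNS = ["*googlebot/*", "*bingbot/*", "*botify-bot-sw*"]


def route_request(headers: Dict[str, str], fallback_secret: str) -> str:
    """Mimic first-match rule evaluation and return the chosen target group."""
    normalized = {name.lower(): value for name, value in headers.items()}
    # Rule 1 (fallback): the lambda marks its own origin calls with X-Sw-Fallback.
    if normalized.get("x-sw-fallback") == fallback_secret:
        return "origin"
    # Rule 2: search engine bots are forwarded to the SpeedWorkers lambda.
    user_agent = normalized.get("user-agent", "").lower()
    if any(fnmatchcase(user_agent, pattern) for pattern in BOT_PATTERNS):
        return "speedworkers-lambda"
    # Last rule (default): everything else goes to the origin.
    return "origin"
```

Because the fallback rule is checked first, a bot request re-sent by the lambda with the secret header goes to the origin instead of looping back to the lambda.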

❗️The rule forwarding to SpeedWorkers should be as restrictive as possible. It should be called only for bots AND only for pages, never for resources!

The lambda target group must have the multi-header option enabled:

[Image: multi-value headers option enabled on the lambda target group]
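With multi-value headers enabled, the AWS ALB passes request headers to the lambda under `multiValueHeaders` as lists of values and expects the response headers in the same form. A minimal sketch of reading and writing that shape follows. The helper names are hypothetical and are not taken from the SpeedWorkers lambda.

```python
from http import HTTPStatus
from typing import Dict, List


def get_header(event: dict, name: str, default: str = "") -> str:
    """Return the last value of a request header from a multi-value ALB event."""
    headers = event.get("multiValueHeaders") or {}
    for key, values in headers.items():
        if key.lower() == name.lower() and values:
            return values[-1]
    return default


def build_response(status: int, headers: Dict[str, List[str]], body: str) -> dict:
    """Build a lambda response using the multi-value header format."""
    return {
        "statusCode": status,
        "statusDescription": f"{status} {HTTPStatus(status).phrase}",
        "isBase64Encoded": False,
        "multiValueHeaders": {name: list(values) for name, values in headers.items()},
        "body": body,
    }
```

Keeping every header as a list preserves repeated headers such as `Set-Cookie`.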

The lambda must have the following environment variables set:

[Image: lambda environment variables]

The fallback domain, port, and protocol are the parameters the lambda uses when falling back to the origin if SpeedWorkers can’t deliver the page. They should be the AWS ALB DNS name, port, and protocol.

👉 Ensure the lambda can connect to the AWS ALB (check the AWS ALB security group rules and the VPC). A lambda assigned to a VPC has no Internet access by default, so a NAT Gateway must be added to the VPC.

The Origin domain is the protocol and host the lambda uses to rebuild the requested URL, as the lambda only receives the path.
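As an illustration of rebuilding the URL, the sketch below joins the origin domain with the path and query parameters found in the ALB event. The function name is hypothetical. It assumes a multi-value event and assumes the query values arrive still URL-encoded, so they are joined without re-encoding.

```python
def rebuild_url(origin_domain: str, event: dict) -> str:
    """Rebuild the full requested URL from the origin domain and the ALB event."""
    path = event.get("path", "/")
    params = event.get("multiValueQueryStringParameters") or {}
    # Assumption: values are still URL-encoded, so they are joined as-is.
    pairs = [f"{name}={value}" for name, values in params.items() for value in values]
    query = "&".join(pairs)
    url = origin_domain.rstrip("/") + path
    return f"{url}?{query}" if query else url
```

For example, with an origin domain of `https://www.mywebsite.com` and an event path of `/shoes`, the function returns `https://www.mywebsite.com/shoes`.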

Botify provides the SpeedWorkers domain, token, and website ID. Ensure you set the same secret value for the fallback secret as in the AWS ALB rule. Set the fallback timeout according to your usual origin server timeout.
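The settings described above could be loaded in one place and validated at startup, as sketched below. The environment variable names here are hypothetical placeholders, not the names the SpeedWorkers lambda actually reads. Use the names shown in your lambda configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LambdaConfig:
    sw_domain: str
    sw_token: str
    sw_website_id: str
    origin_domain: str
    fallback_domain: str
    fallback_port: int
    fallback_protocol: str
    fallback_secret: str
    fallback_timeout: float


def load_config(env: Mapping[str, str] = os.environ) -> LambdaConfig:
    """Read settings from the environment; a missing variable raises KeyError."""
    # All variable names below are hypothetical, for illustration only.
    return LambdaConfig(
        sw_domain=env["SW_DOMAIN"],
        sw_token=env["SW_TOKEN"],
        sw_website_id=env["SW_WEBSITE_ID"],
        origin_domain=env["ORIGIN_DOMAIN"],
        fallback_domain=env["FALLBACK_DOMAIN"],
        fallback_port=int(env["FALLBACK_PORT"]),
        fallback_protocol=env["FALLBACK_PROTOCOL"],
        fallback_secret=env["FALLBACK_SECRET"],
        fallback_timeout=float(env["FALLBACK_TIMEOUT"]),
    )
```

Failing fast on a missing variable surfaces configuration mistakes at startup instead of on the first bot request.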

Validating the SpeedWorkers Integration

To validate the integration of SpeedWorkers in your environment, you can send the following requests.

"Always Success" Test

The "always success" test will force SpeedWorkers to return a cache hit even if the page is not in the cache. This test ensures that SpeedWorkers is called and its reply returned to the bot:

Always Success (force a cache hit in SW)
--------------
URL: Your homepage (https://www.mywebsite.com)

Headers:
User-Agent: botify-bot-sw-test
X-Sw-Options: passed-through,request-time,always-success,echo-67674
X-Sw-Options-Auth: XXXXXX <= the website ID provided by Botify

Expected response:
Status: 200
Body: "Success"
Headers:
X-Ftlcdn-Status: false
X-Sw-Echo: 67674
X-Sw-Passed-Through: true
X-Sw-Status: success
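The request above can be prepared and its response checked with a short script. This is a sketch with hypothetical helper names. It only builds the request and evaluates a response you pass in, and it assumes the body contains the word "Success".

```python
import urllib.request
from typing import Dict


def build_validation_request(url: str, website_id: str, mode: str, echo: int) -> urllib.request.Request:
    """Build a test request; mode is e.g. 'always-success' or 'always-notfound'."""
    options = f"passed-through,request-time,{mode},echo-{echo}"
    return urllib.request.Request(url, headers={
        "User-Agent": "botify-bot-sw-test",
        "X-Sw-Options": options,
        "X-Sw-Options-Auth": website_id,
    })


def is_always_success(status: int, headers: Dict[str, str], body: str, echo: int) -> bool:
    """Check a response against the expected 'always success' result."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        status == 200
        and "Success" in body
        and lowered.get("x-sw-status") == "success"
        and lowered.get("x-sw-echo") == str(echo)
        and lowered.get("x-sw-passed-through") == "true"
    )
```

The request can be sent with `urllib.request.urlopen(request)`. The status, headers, and decoded body of the result are then passed to `is_always_success`.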

"Cache Miss" Test

The "cache miss" test forces SpeedWorkers to return a cache miss even if it has the page in the cache. This test ensures that when SpeedWorkers can't deliver the page, the request falls back properly:

URL: Your homepage (https://www.mywebsite.com)

Headers:
User-Agent: botify-bot-sw-test
X-Sw-Options: passed-through,request-time,always-notfound,echo-41521
X-Sw-Options-Auth: XXXXXX <= the website ID provided by Botify

Expected response:
Status: 200
Body: your homepage
Headers:
NO X-Sw-... headers

"Timeout" Test

The "timeout" test forces SpeedWorkers to delay its response enough to trigger the timeout in your environment. This test ensures that when SpeedWorkers doesn't reply, the request falls back properly:

URL: Your homepage (https://www.mywebsite.com)

Headers:
User-Agent: botify-bot-sw-test
X-Sw-Options: passed-through,request-time,always-timeout,echo-42300
X-Sw-Options-Auth: XXXXXX <= the website ID provided by Botify

Expected response after several seconds:
Status: 200
Body: your homepage
Headers:
NO X-Sw-... headers
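For the "cache miss" and "timeout" tests, the expected result is the same: a 200 response from the origin with no `X-Sw-` headers. A small check along those lines, with a hypothetical function name, could look like this:

```python
from typing import Dict


def is_clean_fallback(status: int, headers: Dict[str, str]) -> bool:
    """Return True if the response is a 200 without any X-Sw-... header."""
    has_sw_header = any(name.lower().startswith("x-sw-") for name in headers)
    return status == 200 and not has_sw_header
```

The check only looks at the status and header names. Confirming that the body is your homepage still requires comparing it with a direct origin response.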

Troubleshooting

When testing the integration, if a request to SpeedWorkers doesn’t return the expected response, try replacing the SpeedWorkers host (origin) with a third-party request inspection service such as PutsReq, Request Catcher, or Beeceptor. This helps you verify that the call to SpeedWorkers is built correctly.

❗️Before testing with a third-party service, change the website ID and token in the lambda configuration to avoid leaking them.
