
Crawling a Website with Access Control

📘 This article describes how to crawl a protected website in Botify.

Overview

You can crawl a website that is not publicly accessible, such as a development or pre-production version of your site. How you do this depends on the access control method the site uses: password protection or a user-agent restriction.

Crawling a Password-Protected Site

If your site is password-protected, you can specify a username and password for Botify to use. In your project, navigate to Settings > Advanced Settings and enter the username and password in the Access section.

Authentication Type

SiteCrawler uses the username and password provided in the project settings for basic access authentication (client-side HTTP basic authentication, as described on Wikipedia). Botify adds an Authorization line carrying these credentials to the HTTP header of each page request.
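
To illustrate what such a header line looks like, here is a minimal Python sketch of the standard HTTP basic authentication encoding. The function name `basic_auth_header` and the sample credentials are illustrative assumptions, not part of Botify; the sketch shows the general scheme only, not Botify's internal implementation.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the Authorization header used by HTTP basic authentication.

    Hypothetical helper for illustration: the scheme joins the username and
    password with a colon, Base64-encodes the result, and prefixes "Basic ".
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Sample credentials only; substitute the ones entered in the Access section.
headers = basic_auth_header("Aladdin", "open sesame")
print(headers["Authorization"])
```

Base64 is an encoding, not encryption, so a server receiving this header can recover the credentials directly and compare them with the expected values.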

Crawling by User Agent

You can crawl a website that is accessible only to a specific user agent, provided your website is validated (verified). To do so, instruct SiteCrawler to use your allowlisted user agent in your project settings. To specify your user agent:

  1. Navigate to Settings > Advanced Settings.

  2. In the Desktop User Agent or Mobile User Agent field, select "Custom", then enter the name of your user agent.

  3. Click Save at the bottom of the page.
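
On the server side, restricting access by user agent usually means comparing the User-Agent request header against an allowlist. The following is a minimal Python sketch of that check, assuming a simple case-insensitive substring match. The function name `is_allowlisted` and the token `my-preprod-crawler` are hypothetical, and the rule your own server applies may differ.

```python
def is_allowlisted(user_agent: str, allowed_tokens: list[str]) -> bool:
    """Return True if the User-Agent string contains any allowlisted token.

    Hypothetical check for illustration: matching is case-insensitive and
    empty tokens are ignored so they cannot match every request.
    """
    ua = user_agent.lower()
    return any(token.lower() in ua for token in allowed_tokens if token)


# Hypothetical token; it should match the custom value entered in step 2.
allowed = ["my-preprod-crawler"]

print(is_allowlisted("Mozilla/5.0 (compatible; my-preprod-crawler/1.0)", allowed))
print(is_allowlisted("Mozilla/5.0 (X11; Linux x86_64)", allowed))
```

Whatever rule your server uses, the custom user agent saved in the project settings has to satisfy it, or the crawler's requests will be rejected.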


