📘 This article contains frequently asked questions about working in Botify.

Overview

These are common questions that may help with basic troubleshooting across Botify. Also, refer to the links at the bottom of this page for FAQs within specific capabilities.

Project Setup

I scheduled a recurring crawl, but the analysis hasn't started; why?

Setting up a recurring crawl does not start the first analysis of the series. You must define when to start the first analysis by setting the repeat frequency.

How can I resend an invite to a Botify project?

Please contact Support to resend an invite to a project.

How do I enable the analysis of hreflang tags for my multilingual website?

Hreflang tag analysis is triggered when the tags are discovered on your site during Botify's crawl; no setup is required. Results are shown in the Internationalization section of SiteCrawler's Distribution report.

Crawls

Will Botify crawls cause a lift in traffic in Google Analytics?

No. Google Analytics filters out bot activity and Botify does not load GA tags when crawling, so GA is unaware that Botify has been to the site, even when rendering JavaScript.

My crawl is slowing down. What should I do?

If the crawl is slowing down a little and there is no increase in HTTP error codes, the crawler may have entered a slower area of your website or may be crawling pages that are not in your cache systems (read more about factors that impact crawl speed). Because Botify’s crawler tries to meet the requested maximum pages per second without overwhelming your servers, the crawl may also be slowed intentionally.
If there is a significant increase in HTTP error codes or a high proportion of error codes, then there is likely a problem. Try to slow the crawl down using the settings button on the Live Stats page. If the rate of error codes persists, stop the crawl since the crawler may not be the cause of the problem.

When will my crawl end?

Botify cannot determine an exact end time because the factors that affect crawl speed vary during the crawl, but you can find the estimated remaining crawl time by navigating to Crawl Manager > Watch Live Stats.

What can I do when I get an error message and my crawl stops unexpectedly?

There is nothing you need to do in this situation: the Botify Support team is automatically notified when this occurs, and they will resume your analysis. Unexpected stops can occasionally happen, depending on the specifics of the website.

Why is the crawl speed the same after I validated my website?

If you validate your website after a crawl starts, the maximum speed is three pages per second, even if you entered a higher speed. The higher crawl speed will not be applied until you update and save the speed settings.

Reports

What's the difference between URLs Crawled by Google and Visits from Google?

URLs crawled by Google are crawls by Googlebot, and visits are organic visits from Google search engine result pages.

Why are there more discovered than crawled URLs in my SiteCrawler Overview report?

These numbers would ideally match, indicating that Botify crawled all pages defined in the scope of your project settings. When they do not match, one of the following occurred:

The maximum number of URLs identified in project settings was reached.
A subscription-based limit was reached before all pages were crawled.
The maximum depth in project settings was reached. If no maximum depth is identified, the crawler stops at depth 100. This limit is imposed since there is no benefit to crawling at this great depth, and search engine bots will not crawl this deep. If this applies to your site, we recommend investigating why your site structure is so deep and evaluating optimizations to ensure bots and users can access your important content.

I haven't defined any branded keywords - why do I get different results when filtering RealKeywords reports by branded vs. non-branded keywords?

This is expected since non-branded keywords exclude anonymized queries.

Why is there a one-day discrepancy in RealKeywords in year-over-year reports?

This is to align the days of the week. For example, your report may pair 16 January 2023 with 15 January 2024, which are both Mondays. Many sites receive the bulk of their traffic on either weekdays or weekends, so comparing a Monday-Friday period with a Tuesday-Saturday period would introduce a discrepancy.
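
The weekday alignment can be illustrated with a short sketch. It assumes the offset is 52 weeks (364 days) instead of a full calendar year; this reproduces the example dates above, but it is only an illustration and not necessarily how Botify computes the comparison period. The function name is hypothetical.

```python
from datetime import date, timedelta


def weekday_aligned_previous_year(day: date) -> date:
    """Return the date 52 weeks earlier, which always falls on the same weekday."""
    # Assumption: a fixed 364-day shift; the source only says weekdays are aligned.
    return day - timedelta(weeks=52)


current = date(2024, 1, 15)
previous = weekday_aligned_previous_year(current)

print(current.isoformat(), current.strftime("%A"))    # 2024-01-15 Monday
print(previous.isoformat(), previous.strftime("%A"))  # 2023-01-16 Monday
```

A plain calendar-year shift would land on 15 January 2023, a Sunday, which is why the two ends of the comparison differ by one day.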

Why does my report only include one URL?

There are many reasons why this may occur. The most common cause is a redirect: the start URL is within the allowed domains in the project settings but redirects to a disallowed domain or subdomain. It could also redirect to a disallowed protocol: for example, if the start URL http://www.mywebsite.com redirects to https://www.mywebsite.com and the HTTPS protocol is disallowed.

You can determine if a redirect was the cause by checking if the HTTP status code of the single URL crawled was HTTP 301 (permanent redirect) or HTTP 302 (temporary redirect). Check the redirect target and the allowed domains to confirm.
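
The comparison between a redirect target and the allowed domains and protocols can be sketched as follows. The function name and the way the settings are represented (sets of hostnames and schemes) are assumptions made for illustration; they do not reflect Botify's internal data model.

```python
from urllib.parse import urlsplit


def redirect_leaves_scope(target_url, allowed_hosts, allowed_schemes):
    """Return a reason string if the redirect target is out of scope, else None."""
    parts = urlsplit(target_url)
    if parts.scheme not in allowed_schemes:
        return f"protocol '{parts.scheme}' is not allowed"
    if parts.hostname not in allowed_hosts:
        return f"domain '{parts.hostname}' is not allowed"
    return None


# Hypothetical project settings: only HTTP on www.mywebsite.com is allowed.
reason = redirect_leaves_scope(
    "https://www.mywebsite.com/",
    allowed_hosts={"www.mywebsite.com"},
    allowed_schemes={"http"},
)
print(reason)  # protocol 'https' is not allowed
```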

The redirect could also be a meta-redirect implemented in JavaScript (not executed by the crawler), which means the only URL crawled will return HTTP 200 (OK), but the page will contain no link for the crawler to follow.

Other potential reasons include:

A server error (i.e., HTTP 5XX).
Missing authentication (i.e., HTTP 401).
User-agent disallowed, which may result in HTTP 403 - Forbidden.
Network problems, in which case the HTTP status code field will contain a Botify-specific code (e.g., -101 for a DNS IP/hostname not found, -100 for an unknown network error).
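
As a quick reference, the causes listed in this answer can be summarized as a lookup on the HTTP status code of the single crawled URL. This is a minimal sketch: the function name is hypothetical, and only the two Botify-specific codes mentioned above are handled explicitly.

```python
def probable_cause(status_code: int) -> str:
    """Map the status code of the only crawled URL to the causes listed in this FAQ."""
    if status_code in (301, 302):
        return "redirect: check the target against the allowed domains and protocols"
    if status_code == 200:
        return "page returned OK: look for a JavaScript redirect or a page with no links"
    if status_code == 401:
        return "missing authentication"
    if status_code == 403:
        return "forbidden: the user-agent may be disallowed"
    if 500 <= status_code <= 599:
        return "server error"
    if status_code == -101:
        return "network problem: DNS IP/hostname not found"
    if status_code == -100:
        return "network problem: unknown network error"
    # Other codes are not covered by this FAQ.
    return "not covered here"


for code in (301, 200, 503, -101):
    print(code, "->", probable_cause(code))
```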

My crawl stopped with a lower number of analyzed URLs than expected. Why?

Check your robots.txt file to determine if there are pages you expected to see in the analysis that are disallowed.
There may be fewer pages linked via HTML links than you thought. To investigate, use your browser's developer tools to view your website with JavaScript disabled.
Some pages may only be linked via nofollow links, and by default, Botify will respect nofollow directives. Start a new crawl with the "Respect no-follow rules" option disabled (in advanced settings).
The crawler may have encountered a CAPTCHA or authentication page with few or no links to deeper pages on your site.
If your website was not validated (i.e., you did not prove ownership), the website owner may have stopped the crawl using the emergency stop feature, which stops the analysis as if you had stopped it yourself during the crawl. To check, compare the number of discovered URLs with the number of crawled URLs in the Overview report KPIs: if more URLs were discovered than crawled, there were still URLs in the queue when the crawler stopped, before the maximum number of URLs or maximum depth was reached.
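
For the robots.txt check in the first item above, Python's standard library can test whether a given URL is disallowed. This is a sketch under assumptions: the robots.txt content and URLs are made up, and the user-agent token "botify" is assumed here, so confirm the user-agent configured for your project before relying on the result.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used instead of fetching a live file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "botify" is an assumed user-agent token for illustration.
print(is_allowed(ROBOTS_TXT, "botify", "https://www.mywebsite.com/private/page.html"))   # False
print(is_allowed(ROBOTS_TXT, "botify", "https://www.mywebsite.com/products/page.html"))  # True
```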

Why do URLs with different parameters appear as different in my report?

Search engines evaluate pages using the full URL, including the query string (the portion after the ?). As a result, "http://www.domain.com/page1.html?xxx" and "http://www.domain.com/page1.html?yyy" are two distinct pages, which may or may not have similar content, depending on what the query string does and how it affects the page content. URLs with audience tracking parameters will not change the page content and will be duplicates of the URL without parameters. Parameters that change the page content can either generate partial duplicates (e.g., the number of items displayed in a list) or no duplicates at all (e.g., a pagination parameter). Even if URLs are consolidated appropriately with a canonical tag, search engines and Botify will still consider them separate URLs.
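
To see which URLs in a list differ only by tracking parameters, you can normalize them outside Botify. The sketch below assumes a small set of tracking parameter names (utm_source, utm_medium, utm_campaign); adjust it to the parameters your site actually uses. It does not change how Botify or search engines treat the URLs.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; replace with the ones used on your site.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def strip_tracking(url: str) -> str:
    """Return url with the assumed tracking parameters removed from the query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


urls = [
    "http://www.domain.com/page1.html",
    "http://www.domain.com/page1.html?utm_source=newsletter",
    "http://www.domain.com/page1.html?page=2",
]

# Group the distinct URLs by their normalized form.
groups = defaultdict(list)
for url in urls:
    groups[strip_tracking(url)].append(url)

for normalized, members in groups.items():
    print(normalized, "<-", len(members), "URL(s)")
```

In this sample, the first two URLs share one normalized form, while the pagination URL stays separate because its parameter changes the page content.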

Why does my report show more characters in a page's H1, Title, or meta-description tag than the tag displayed in the report?

Botify counts all characters present in the tag in the code, including spaces and special characters. You can check for extra spaces in a tag by adding one of the {tag} length fields (e.g., Title Length) as a report column in the URL Explorer and viewing the HTML source.
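
The effect of extra spaces on the count can be reproduced with a short sketch. The title text is invented, and the comparison with a whitespace-collapsed version is only an approximation of what a report or browser may display; the source states only that all characters, including spaces, are counted.

```python
def length_report(tag_text: str) -> tuple[int, int]:
    """Return (raw length, length after collapsing runs of whitespace)."""
    collapsed = " ".join(tag_text.split())
    return len(tag_text), len(collapsed)


# Hypothetical Title tag content with leading, trailing, and repeated spaces.
title = "  Winter Boots   |  My Website "

raw_length, displayed_length = length_report(title)
print("characters in the code:", raw_length)
print("characters once extra spaces are collapsed:", displayed_length)
```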

Why is the Sitemaps report missing from my SiteCrawler reports?

You must enable Sitemaps analysis in crawl settings to access the Sitemaps report.

Does Botify report on jump links?

No. SiteCrawler does not crawl links containing # in the URL. You can, however, find jump links that generate traffic by creating a simple RealKeywords report. Use the filter "Full URL contains #" and add columns for impressions, clicks, CTR, and average position.

Does Botify support negative lookahead regex?

No. Botify uses the Golang (RE2) flavor of regex, which does not include negative lookahead. We recommend using a testing tool like regex101 to check your regular expressions with the Golang flavor.
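
A negative lookahead can often be replaced by two simpler patterns: one that must match and one that must not. The sketch below shows the idea in Python, assuming the place where you use the regex lets you combine a match condition with a negated one. Note that Python's re module is not RE2 and does accept lookaheads; the lookahead pattern appears only for comparison, and the URLs are made up.

```python
import re

# Lookahead version: "/blog/" URLs that do not contain "/tag/".
# Accepted by Python's re, but not valid in the RE2 flavor.
LOOKAHEAD = re.compile(r"^(?!.*/tag/).*/blog/")

# Lookahead-free rewrite: two simple patterns using syntax RE2 also supports.
INCLUDE = re.compile(r"/blog/")
EXCLUDE = re.compile(r"/tag/")


def matches_without_lookahead(url: str) -> bool:
    """True when url matches INCLUDE and does not match EXCLUDE."""
    return bool(INCLUDE.search(url)) and not EXCLUDE.search(url)


sample_urls = [
    "https://www.mywebsite.com/blog/post-1",
    "https://www.mywebsite.com/blog/tag/news",
    "https://www.mywebsite.com/products/boots",
]

for url in sample_urls:
    # Both approaches should agree on every URL.
    print(url, matches_without_lookahead(url), bool(LOOKAHEAD.search(url)))
```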

URL Explorer

Why is my report export not working?

Although this rarely happens, timeouts can occur, particularly when there is a lot of data. If this occurs, the failure is shown in Botify DataExports.

While the export size limit is 1,000,000 URLs, the number of lines in the exported file can be far greater if you export a table with a multi-valued column (i.e., a column with a list of data for each URL from the left column). In that case, the export will include several lines per URL. For example, if you are displaying a sample of Inlinks (source pages, where the URLs are linked from), the sample can include up to 300 inlinks per URL, so a single URL can account for up to 300 exported lines. The export will be of the following form:

URL A	Inlink 1 for URL A
URL A	Inlink 2 for URL A
URL B	Inlink 1 for URL B
URL B	Inlink 2 for URL B
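
The way a multi-valued column multiplies the number of lines can be sketched as follows. The data structure (a dictionary of URL to inlinks) and the function name are illustrative assumptions, not the actual export implementation.

```python
def expand_rows(inlinks_by_url: dict[str, list[str]]) -> list[str]:
    """Produce one tab-separated line per (URL, inlink) pair."""
    lines = []
    for url, inlinks in inlinks_by_url.items():
        for inlink in inlinks:
            lines.append(f"{url}\t{inlink}")
    return lines


sample = {
    "URL A": ["Inlink 1 for URL A", "Inlink 2 for URL A"],
    "URL B": ["Inlink 1 for URL B", "Inlink 2 for URL B"],
}

lines = expand_rows(sample)
print("\n".join(lines))
print(f"{len(sample)} URLs -> {len(lines)} exported lines")
```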

If your URL Explorer result is too big for a single export, the best way to solve the problem is to do several smaller exports using additional filters. For instance, if you were exporting all URLs with duplicate H1 tags, here are possible approaches:

Add filters based on the URL to export them one segment at a time.
Add a filter by page depth to export URLs closer to the home page first (these can be seen as priority URLs), then deeper URLs.
Add a filter based on the number of duplicates to export pages with a high number of duplicates first, and then those with a lower number.

Why can't I find the fields I want to display in the URL Explorer?

Start typing what you are looking for in the search field, and the list will be narrowed to the fields that contain what you typed. For instance, type "http" for HTTP code, "indexable" for fields related to indexable or non-indexable URLs, "H1" for fields related to H1 tags, "anchor" for fields related to link anchor text, or "inlinks" or "outlinks" for any field related to incoming/outgoing links.

How do I remove a column in the URL Explorer results table?

There are two ways to remove a report column:

Hover over the column name, then click the X to remove the column.
Click the Manage Columns link, click the X next to the column to delete, then click Build your chart to update the report.

See also:

Frequently Asked Questions