
Understanding AI Bot Data in Botify

Updated over 8 months ago

📘 This article explains how AI bots may interact with your site and where to access bot activity in Botify.

Overview

The search landscape is transforming significantly. While optimizing websites has historically focused on achieving high Google rankings, the emergence of AI assistants, such as ChatGPT, is revolutionizing how people seek information and how websites attract traffic. This new search paradigm presents exciting opportunities and challenges, necessitating a fundamental shift in your SEO strategy.

Impact on Your SEO Strategy

These new AI capabilities present more opportunities for your website to be found. But while they improve search, they don’t replace it. Technical SEO and quality content remain critical in this new landscape to ensure that AI capabilities find, index, and reference your website. In addition to optimizing for traditional search engines, producing content structured around user intent and more meaningful relationships between concepts continues to be important.

Leverage Botify’s capabilities to influence AI-powered search platforms with more of your content:

  • Identify how much of your strategic content is used by AI search and AI assistants.

  • Know how much of your content is being used to train AI models.

  • Analyze the demand for your content from actual user queries.

  • Find how GPT influences search results through Bing search.

  • Understand what AI bots are doing on your website.

Before evaluating this data, it’s essential to understand the basics of how AI search engines work and what AI bots do.

About AI Agents

GenAI search engines use artificial intelligence to understand and generate human language. They rely on Large Language Models (LLMs) that are trained on diverse datasets, through which they learn text patterns, associations, language rules, and contextual relationships. This training and refinement enable the models to generate responses to user queries with human-like qualities. ChatGPT, which is built on OpenAI's GPT models, is the best-known example of an LLM-based assistant.

Real-time Search

Traditional LLMs are trained on static sources that may not include recent content. When user queries require fresh content, some AI search engines retrieve information from external sources in real time to provide more relevant responses. This technique, called Retrieval-Augmented Generation (RAG), is driven by the user's query and intent and provides links to sources, just as traditional search engines do. This means it remains essential to understand user intent, provide quality content that matches user queries, and make websites accessible.
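To make the retrieve-then-generate idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular AI search engine works: the pages, the URLs, the word-overlap scoring, and the function names (tokenize, retrieve, build_prompt) are all assumptions made for this example, and the generation step is reduced to assembling the prompt an LLM would receive.

```python
import re

# Hypothetical pages standing in for content an AI search engine could retrieve.
PAGES = [
    {"url": "https://www.example.com/boots", "text": "Waterproof hiking boots for wet trails"},
    {"url": "https://www.example.com/sandals", "text": "Lightweight sandals for summer hiking"},
    {"url": "https://www.example.com/returns", "text": "Return policy and shipping times"},
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, pages, top_k=2):
    """Rank pages by word overlap with the query (a stand-in for real relevance ranking)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(page["text"])), page) for page in pages]
    matches = [item for item in scored if item[0] > 0]  # drop pages with no overlap
    matches.sort(key=lambda item: item[0], reverse=True)
    return [page for _, page in matches[:top_k]]

def build_prompt(query, sources):
    """Assemble the text an LLM would receive: the question plus numbered, linked sources."""
    lines = [f"Question: {query}", "Answer using only these sources and cite them by number:"]
    for number, page in enumerate(sources, start=1):
        lines.append(f"[{number}] {page['url']}: {page['text']}")
    return "\n".join(lines)

if __name__ == "__main__":
    question = "waterproof hiking boots"
    print(build_prompt(question, retrieve(question, PAGES)))
```

Only pages that match the query reach the prompt, which is why content that clearly answers user intent is more likely to be retrieved and linked as a source.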

What AI Bots Do

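You know that Google and other major search engines employ bots to scan and index your content. Some AI bots also index content, but they can serve other purposes, such as gathering data to train AI models or retrieving content in real time to answer user queries.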

AI bots focus on answering questions, while traditional search engine bots focus on ranking links to answer queries. It is crucial to understand the difference between indexing bots, which drive traffic and SEO visibility, and training bots, which help train AI models but don't immediately impact rankings. You have the flexibility to determine what AI engines can do on your site, and it’s essential to develop a strategy for your business. For example, an e-commerce site may want to allow real-time search engines to crawl its pages so it can rank in AI-driven search tools, while ensuring its content is fresh and accessible for real-time retrieval. Publishers may want to block training bots from crawling their content to protect their intellectual property.

Bots by Purpose

The following AI bots are supported in Botify. Notice some have multiple purposes.

Purpose: Bot Names

  • Training: GPTBot, Anthropic-ai*, Bytespider, CCBot, ClaudeBot, Claude-Web*, FacebookBot, Meta-ExternalAgent, YouBot

  • AI Assistants: ChatGPT-User, PerplexityBot

  • Indexation: AmazonBot, OAI-SearchBot, PerplexityBot, YouBot

* Old bots; used for historical reports only

Training Bots

Training bots gather publicly available information to train LLMs or enhance AI systems. This involves feeding the AI with large datasets to refine its algorithms and improve its performance in understanding and generating human-like responses.

⭐️ Why this is interesting: Tracking where training bots go on your site and how often they scan can provide beneficial insight into server load to help you manage resources. Evaluating training bot activity on your site can help you decide if you want to block them through robots.txt (or other means).
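If you decide to block a training bot through robots.txt, it can help to check the effect of the rules before you deploy them. The sketch below uses Python's standard urllib.robotparser module for that. The robots.txt policy, the example.com URL, and the is_allowed helper are assumptions made for illustration; robots.txt is also advisory, so this shows what a compliant bot should do, not what every bot will do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block GPTBot (OpenAI's training crawler) and allow all other bots.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    page = "https://www.example.com/products/"  # placeholder URL
    for bot in ("GPTBot", "ChatGPT-User", "OAI-SearchBot"):
        print(bot, is_allowed(ROBOTS_TXT, bot, page))
```

With this policy, GPTBot is refused while ChatGPT-User and OAI-SearchBot fall through to the wildcard rule and remain allowed, which matches a strategy of blocking training while staying visible to AI assistants and AI search.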

AI Assistants

AI assistant bots, like ChatGPT-User, fetch content so that the assistant can give direct, concise answers to user queries without requiring users to navigate multiple search results. Using AI and RAG, AI assistants provide real-time answers by focusing on understanding the intent behind a question.

⭐️ Why this is interesting: Allowing these bots is important since they can reference your site when they use RAG.

Indexation Bots

Like search engine bots, these AI bots index your site's content.

⭐️ Why this is interesting: Tracking scans from these bots works the same way as tracking traditional search engine bots. When you allow them, you can potentially rank in their SERPs.

Where Will Your Content Show in AI Results?

In addition to being referenced in AI assistant responses, your content can appear in AI-generated summaries. These summaries compile query results and related queries in a condensed format. A query counts as related when users who searched for it also searched for the main query. Your site can rank for both the main query and related queries.

💡 Your content can appear in AI assistant responses without being referenced. To be referenced by AI assistants, content must be indexed and ranked in one of the following ways:

  • In the scan/RAG process, for example when ChatGPT queries Bing and your content ranks for that query.

  • In the indexation process (e.g., SearchGPT or Perplexity), when the AI bot has previously built an index and your content ranks in it for the specific query.


Where to Find Bot Activity in Botify

Every bot scan is recorded in your web server logs, just like user visits and search engine crawls. When you include bot activity in the logs integrated with your Botify project, you’ll get bot activity reports in our custom report template and LogAnalyzer. With your third-party analytics data integrated with your Botify project, you also get traffic insights from AI search.

In Botify, AI bot activity is grouped into two categories:

OpenAI Bots

  • GPTBot

  • ChatGPT-User

  • OAI-SearchBot

Other AI Bots

  • AmazonBot

  • Anthropic-ai

  • Bytespider

  • CCBot

  • ClaudeBot

  • Claude-Web

  • FacebookBot

  • Meta-ExternalAgent

  • PerplexityBot

  • YouBot
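As a rough illustration of this grouping, the following Python sketch maps a User-Agent string from a log entry to one of the two categories. The bot names come from the lists above; the case-insensitive substring matching, the classify_user_agent function, and the sample User-Agent strings are assumptions for this example and not a description of how Botify identifies bots.

```python
from typing import Optional, Tuple

# Bot names taken from the two groups listed above.
OPENAI_BOTS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot")
OTHER_AI_BOTS = (
    "AmazonBot", "Anthropic-ai", "Bytespider", "CCBot", "ClaudeBot",
    "Claude-Web", "FacebookBot", "Meta-ExternalAgent", "PerplexityBot", "YouBot",
)

def classify_user_agent(user_agent: str) -> Optional[Tuple[str, str]]:
    """Return (group, bot name) for a known AI bot, or None for anything else.

    Assumption: a bot is recognized when its name appears anywhere in the
    User-Agent string, ignoring case.
    """
    agent = user_agent.lower()
    for group, bots in (("OpenAI Bots", OPENAI_BOTS), ("Other AI Bots", OTHER_AI_BOTS)):
        for bot in bots:
            if bot.lower() in agent:
                return group, bot
    return None

if __name__ == "__main__":
    # Sample User-Agent strings, simplified for illustration.
    samples = [
        "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)",
        "Mozilla/5.0 (compatible; PerplexityBot/1.0)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/125.0",
    ]
    for sample in samples:
        print(classify_user_agent(sample))
```

User-Agent strings can be spoofed, so a match on the name alone is a hint about who is crawling rather than proof.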

Custom Report Template

The AI Bots in Search report template explores the critical changes in search behavior driven by AI bots, how these AI assistants interact with your website, and user visits resulting from that activity. It also identifies strategies to capitalize on this shift to drive traffic from diverse sources. Access the template by navigating to CustomReports > Templates > AI Bots in Search.

LogAnalyzer Reports

In LogAnalyzer, use the global filter at the top right of the page to select one of the AI bot groups (OpenAI Bots or Other AI Bots).

For charts that break down AI bot activity by the individual bot, filter charts by bot, as in the following examples in LogAnalyzer's Overview report:

  • URLs Crawled by OpenAI/Other AI Bots by Segment by Day

  • URLs Crawled by OpenAI Bot/Other AI Bot User Agent by Day
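To get a feel for what a "URLs crawled by bot by day" breakdown represents, the sketch below computes a similar figure directly from raw access log lines. It is only an approximation for illustration and does not reflect how LogAnalyzer builds its charts. It assumes logs in the common Combined Log Format with English month abbreviations, counts distinct URL paths per bot per day, and uses made-up sample lines; the urls_crawled_by_day function is hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed Combined Log Format:
# host ident user [time] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

OPENAI_BOTS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot")

def urls_crawled_by_day(lines):
    """Return {(day, bot): number of distinct URL paths that bot requested that day}."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not follow the assumed format
        agent = match.group("agent").lower()
        bot = next((name for name in OPENAI_BOTS if name.lower() in agent), None)
        if bot is None:
            continue  # not one of the bots tracked in this sketch
        timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        seen[(timestamp.date().isoformat(), bot)].add(match.group("path"))
    return {key: len(paths) for key, paths in seen.items()}

# Made-up log lines for illustration.
SAMPLE_LINES = [
    '203.0.113.7 - - [10/Mar/2025:08:15:02 +0000] "GET /products/shoes HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '203.0.113.7 - - [10/Mar/2025:08:15:09 +0000] "GET /products/boots HTTP/1.1" 200 4980 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '203.0.113.7 - - [10/Mar/2025:09:01:44 +0000] "GET /products/shoes HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '198.51.100.4 - - [11/Mar/2025:14:30:00 +0000] "GET /blog/sizing-guide HTTP/1.1" 200 7300 "-" "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"',
    '192.0.2.15 - - [11/Mar/2025:14:31:12 +0000] "GET /products/shoes HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/125.0"',
]

if __name__ == "__main__":
    for (day, bot), count in sorted(urls_crawled_by_day(SAMPLE_LINES).items()):
        print(day, bot, count)
```

Counting distinct paths instead of raw hits keeps a bot that requests the same page repeatedly from inflating the figure; counting every request would be an equally valid choice if you care about server load.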

