This article explains how to use Botify reports to locate pages on your website that return HTTP 5xx errors and the pages that link to them.
Overview
While you may have removed broken links on your website by resolving HTTP 404 (Not Found) pages, broken links may also include server errors (i.e., HTTP 5xx). HTTP 5xx errors indicate that the web server was unable to fulfill the request. The causes fall into two categories:
Permanent errors: Pages that always return a server error because the server does not know how to answer the request.
Temporary errors: Pages that returned a server error to Botify's crawler because of a short-term problem, usually unrelated to the page (URL), such as a temporary server overload or scheduled maintenance in a section of the website. In all likelihood, the page no longer returns HTTP 5xx.
You should focus on permanent server errors for SEO purposes since these are the ones you can correct. To remove these HTTP 5xx errors, you need to identify:
The list of URLs returning permanent errors, which need to be removed (Page A).
The pages that contain links to these URLs, so you can update or remove each link (Page B).
The process for removing 5xx errors includes the following steps:
Creating a report of pages that return HTTP 5xx errors.
Exporting the list of error pages and the pages that link to them.
Create a Report of HTTP 5xx Errors
To identify 5xx HTTP errors (Pages A in the graph above):
Navigate to the SiteCrawler > HTTP Codes report:
Click on the 5xx section of the HTTP Status Codes Distribution chart or the 5xx URLs link in the Insights table.
A URL Explorer report shows all pages that returned HTTP 5xx status codes in the selected crawl.
Tip: If the 5xx section of the HTTP Status Codes Distribution chart is too small, use the dropdown to display only the 5xx family of codes.
Identify Permanent Errors
Before you can evaluate HTTP 5xx errors, you need to find the specific HTTP status code that was returned:
HTTP 503 (Service Unavailable) is typically a temporary issue.
HTTP 501 (Not Implemented) means the server does not support the functionality needed to answer the request, which is likely to be a permanent error.
HTTP 502 (Bad Gateway) means the server the crawler talked to (i.e., a gateway or a proxy) received an error from the server it requested the content from. This is likely to be a permanent error.
HTTP 500 (Internal Server Error) is a generic error with no additional details.
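The rules of thumb above can be expressed as a small lookup. The following is a minimal Python sketch, not a Botify feature: the names `LIKELY_PERSISTENCE` and `triage` are hypothetical, and the labels only restate this article's guidance, so each URL still needs to be checked by hand.

```python
from http import HTTPStatus

# Rules of thumb taken from this article; not an official classification.
LIKELY_PERSISTENCE = {
    500: "unknown",    # generic error, no additional details
    501: "permanent",
    502: "permanent",
    503: "temporary",
}


def triage(code: int) -> str:
    """Return 'CODE Reason Phrase: likely X' for a 5xx status code."""
    if not 500 <= code <= 599:
        raise ValueError(f"{code} is not a 5xx status code")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Codes that are not registered in the standard library's enum.
        phrase = "Unassigned"
    likely = LIKELY_PERSISTENCE.get(code, "unknown")
    return f"{code} {phrase}: likely {likely}"
```

For example, `triage(503)` describes the code as Service Unavailable and likely temporary, while any 5xx code outside the four listed above is reported as unknown.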
To evaluate HTTP 5xx errors:
Click the HTTP Code column heading in your URL Explorer report to sort by code. The following example shows HTTP 502, 503, and 500 errors.
To evaluate each type of error, filter the report to show specific HTTP codes (e.g., HTTP Status Code = 502).
Click on the blue arrow next to a URL in the report to view the page on the website and check whether it still returns a server error.
In this example, all URLs identified as HTTP 502 still return the 502 status code:
Export this list of URLs, then repeat for the remaining server errors (i.e., 503 and 500).
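For a long exported list, the manual re-check can be approximated with a script. This is a minimal sketch using only the Python standard library; `fetch_status` and `still_failing` are hypothetical helper names, the User-Agent string is an arbitrary placeholder, and connection failures (`URLError`) are deliberately left unhandled.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET request to `url`."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "recheck-5xx-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx and 5xx responses;
        # the status code is available on the exception.
        return error.code


def still_failing(
    urls: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> Dict[str, int]:
    """Map each URL that still returns a 5xx code to that code."""
    failing = {}
    for url in urls:
        code = fetch(url)
        if 500 <= code <= 599:
            failing[url] = code
    return failing
```

A single re-check only reflects the moment it was run. A URL that keeps failing across several checks is a stronger candidate for a permanent error than one that fails once.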
Identify Groups of Similar Pages
If there are more than a few pages, identify groups of pages likely to have the same type of problem. Add one or more filters to extract pages with similar URL patterns. For example:
Full URL contains [N].
Host contains [N] (the host corresponds to the domain name in the URL).
Path contains [N] (the path is the URL part that starts with the / after the domain name and ends just before the "?" if it exists).
Query String contains [N] (the query string is the part of the URL after the "?", if any).
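To see how one URL breaks down into the parts these filters target, the standard library's `urlsplit` can be used. This is an illustrative sketch only: `url_parts` is a hypothetical helper, and the dictionary keys are informal labels, not Botify field names.

```python
from typing import Dict
from urllib.parse import urlsplit


def url_parts(url: str) -> Dict[str, str]:
    """Split a URL into the host, path, and query string described above."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname or "",   # domain name, without port
        "path": parts.path,             # from the first "/" up to the "?"
        "query_string": parts.query,    # everything after the "?", if any
    }


print(url_parts("https://www.example.com/shop/item?id=42&ref=home"))
# {'host': 'www.example.com', 'path': '/shop/item', 'query_string': 'id=42&ref=home'}
```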
Click the arrow next to each URL to check whether it still returns a server error.
If the URLs you check still return server errors, you can be confident that all pages in this set of similar pages return permanent server errors. Continue to change filters to investigate the rest of your HTTP 5xx errors, exporting each group of URLs.
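When it is not obvious which URL patterns to filter on, counting the exported error URLs by host and first path segment can suggest candidate groups. This is a rough sketch under that assumption; `group_key` and `count_groups` are hypothetical names, and the first path segment is only one possible grouping rule.

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def group_key(url: str) -> str:
    """Return 'host/first-path-segment' as a coarse grouping key."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    first = segments[0] if segments else ""
    return f"{parts.hostname or ''}/{first}"


def count_groups(urls: Iterable[str]) -> List[Tuple[str, int]]:
    """Count URLs per group, largest group first."""
    return Counter(group_key(url) for url in urls).most_common()
```

The largest groups are reasonable starting points for a Path contains or Host contains filter in the report.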
Refine and Export the Report
Now that you have the list of error pages to correct, the remaining step is to find where they are linked from:
Add a filter for Number of Internal Inlinks > 0 to keep only the error pages that are linked from other pages.
Add the following columns:
Review the report:
Click the Export as CSV link. Use the exported file to update every instance of a link to the URL in Column A.
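Once exported, the file can be reorganized by linking page so that each page is edited once. The sketch below assumes a simplified layout with one link per row and two hypothetical column headers, `error_url` and `linking_page`. The real export's headers depend on the columns you added to the report, so pass the actual names when calling it.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List


def links_to_fix(
    csv_text: str,
    target_col: str = "error_url",      # hypothetical header for Column A
    source_col: str = "linking_page",   # hypothetical header for the linking page
) -> Dict[str, List[str]]:
    """Map each linking page to the error URLs it links to."""
    fixes = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = (row.get(target_col) or "").strip()
        source = (row.get(source_col) or "").strip()
        if target and source:  # skip rows missing either value
            fixes[source].append(target)
    return dict(fixes)
```

Given the contents of the exported file as text, the result lists, for each linking page, every broken link that page contains.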