S04 - Robots.txt Not Found

Alert Properties | Alert Description
Alert Name | Robots.txt not found
Code | S04
Description | The website does not include a robots.txt file or X-Robots-Tag HTTP headers to manage crawler traffic efficiently.
Level | Error

What is robots.txt?

A robots.txt file tells search engine crawlers which pages on your site they may and may not crawl. It is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google. To keep a web page out of Google, use noindex directives or password-protect the page.
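As an illustration of the noindex option mentioned above, here is a minimal sketch of a web application that sends an X-Robots-Tag: noindex header for selected pages. The `/private/` path prefix and the `app` function are hypothetical names chosen for this example; your own server or framework will have its own way of setting response headers.

```python
# Minimal WSGI application (standard-library interface, no framework assumed).
# The "/private/" prefix is a hypothetical example of pages to keep out of
# search results.

NOINDEX_PREFIX = "/private/"


def app(environ, start_response):
    """Serve a placeholder page; add a noindex header under NOINDEX_PREFIX."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if path.startswith(NOINDEX_PREFIX):
        # Ask crawlers that honor this header not to index the response.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"<html><body>placeholder page</body></html>"]
```

To try it locally, the application can be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. A crawler can only see the header if it is allowed to fetch the page, so such a page should not also be blocked in robots.txt.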

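The robots.txt file is used primarily to manage crawler traffic to your site. Depending on the file type, it can also keep a file off Google: this works for media files such as images, but a web page blocked by robots.txt can still be indexed if other pages link to it.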

A robots.txt file lives at the root of your site. So, for the site www.example.com, the robots.txt file is at www.example.com/robots.txt. It is a plain text file that follows the Robots Exclusion Standard and consists of one or more rules, each of which blocks or allows access for a given crawler to a specified file path on that website.
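To make the rule structure concrete, the sketch below parses a small sample robots.txt with Python's standard `urllib.robotparser` module and checks which URLs a crawler may fetch. The rules and paths (`/nogooglebot/`, `OtherBot`) are illustrative assumptions, not rules from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for www.example.com (illustrative only).
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Googlebot is blocked from the disallowed path but may crawl everything else.
blocked = parser.can_fetch("Googlebot", "https://www.example.com/nogooglebot/page.html")
allowed = parser.can_fetch("Googlebot", "https://www.example.com/index.html")

# Crawlers without a specific group fall back to the "*" rules.
other = parser.can_fetch("OtherBot", "https://www.example.com/nogooglebot/page.html")

print(blocked, allowed, other)  # False True True
```

Each `User-agent` group applies to the named crawler, and the `Allow` and `Disallow` lines under it are matched against the URL path.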

For more information on robots.txt, see the article Introduction to robots.txt.

How to create robots.txt?

The article Create a robots.txt file explains how to create a robots.txt file and covers other general guidelines. For other questions, see the FAQ section.

Does my website need a robots.txt file?

No, it is not required. When search engine crawlers such as Googlebot visit a website, they first check for crawl permissions by attempting to retrieve the robots.txt file. A website without a robots.txt file, robots meta tags, or X-Robots-Tag HTTP headers will generally be crawled and indexed normally.

Note: robots.txt is not a place to hide web pages. Its purpose is to tell crawlers how to crawl the site efficiently, with a balanced number of requests.
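To check a site for the condition this alert describes, something like the following sketch may help. It is an assumption about how such a check could work, not the actual implementation behind alert S04: it requests robots.txt at the site root and looks for an X-Robots-Tag header on the page itself. All function names and the `robots-check-sketch` user agent are hypothetical.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen


def robots_txt_url(site_url: str) -> str:
    """Return the robots.txt URL at the root of the site hosting site_url.

    site_url must include a scheme, for example "https://www.example.com/".
    """
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def has_crawler_directives(robots_status: int, page_headers: dict) -> bool:
    """True if robots.txt was served (HTTP 200) or the page sends X-Robots-Tag."""
    header_names = {name.lower() for name in page_headers}
    return robots_status == 200 or "x-robots-tag" in header_names


def fetch_status_and_headers(url: str, timeout: float = 10.0):
    """Fetch url and return (status_code, headers); status 0 means unreachable."""
    request = Request(url, headers={"User-Agent": "robots-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status, dict(response.headers.items())
    except HTTPError as error:
        # The server answered with an error status such as 404.
        headers = dict(error.headers.items()) if error.headers else {}
        return error.code, headers
    except URLError:
        return 0, {}


def check_site(site_url: str) -> bool:
    """Return False when neither robots.txt nor X-Robots-Tag is found."""
    robots_status, _ = fetch_status_and_headers(robots_txt_url(site_url))
    _, page_headers = fetch_status_and_headers(site_url)
    return has_crawler_directives(robots_status, page_headers)
```

For example, `check_site("https://www.example.com/")` performs two HTTP requests and returns `False` when neither signal is present. The sketch does not inspect robots meta tags in the page HTML.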