Inaccessible Robots.txt = Disallow All

A few weeks ago Google announced they are increasing the number of alerts sent to webmasters.

http://googlewebmastercentral.blogspot.co.uk/2012/07/new-crawl-error-alerts-from-webmaster.html

One of the messages they mention is ‘Your site’s robots.txt is inaccessible’.

Google have previously stated that they will not crawl a website if the robots.txt file is inaccessible. This is because if they can’t access the robots.txt file, they don’t know what is safe to crawl. The exception is a 404 error, which Google treats as an Allow All.
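The behaviour described above can be sketched as a small decision function. This is only an illustration of the rules as stated in this post, not Google's actual implementation. The function name `crawl_policy` and the labels it returns are made up for the example, and treating "inaccessible" as "no response or a 5xx server error" is an assumption. Status codes the post does not cover are reported as unspecified rather than guessed.

```python
from typing import Optional


def crawl_policy(status: Optional[int]) -> str:
    """Map the result of a robots.txt fetch to a crawl decision.

    `status` is the HTTP status code, or None if no response was
    received at all (timeout, refused connection, DNS failure).
    """
    if status is None:
        # Nothing came back, so the crawler cannot know what is safe to crawl.
        return "postpone_crawl"
    if status == 200:
        # The file was retrieved: obey whatever rules it contains.
        return "follow_rules"
    if status == 404:
        # No robots.txt exists: treated as Allow All.
        return "allow_all"
    if 500 <= status <= 599:
        # Assumption: server errors count as "inaccessible".
        return "postpone_crawl"
    # The post does not say how other status codes are handled.
    return "unspecified"
```

For example, `crawl_policy(404)` gives `"allow_all"`, while `crawl_policy(503)` gives `"postpone_crawl"`, which is the situation described in the rest of this post.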

One of our clients recently had some site performance issues and received this slightly scary notification.

Over the last 24 hours, Googlebot encountered 2011 errors while attempting to access your robots.txt. To ensure that we didn’t crawl any pages listed in that file, we postponed our crawl.

Google appeared to have stopped crawling the site completely. They sent another identical message 24 hours later.

No further messages were received once the performance issues were resolved, even though site crawling resumed. So don’t expect any notification to say the error has cleared. It’s probably safe to assume the error has cleared if you stop receiving the messages, as Google appears to keep notifying you for as long as there is a problem.
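Since there is no "all clear" message, one option is to check the file yourself. The following is a minimal sketch using only the Python standard library. The helper name `robots_status` and the 10 second default timeout are assumptions for illustration, and a request from your own machine will not necessarily see what Googlebot sees.

```python
import http.client
import urllib.error
import urllib.request
from typing import Optional


def robots_status(site: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for <site>/robots.txt.

    Returns None if no HTTP response was received at all
    (timeout, refused connection, DNS failure and so on).
    """
    url = site.rstrip("/") + "/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 503.
        return err.code
    except (OSError, http.client.HTTPException):
        # URLError and socket timeouts are OSError subclasses: no usable response.
        return None
```

Calling `robots_status("http://www.example.com")` from a scheduled job and logging the result would show when the file stops returning a 5xx status or timing out.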

This blog post explains some of the issues experienced with a persistently inaccessible robots.txt, so it’s worth taking this notification seriously.

Posted on July 31, 2012 at 6:23 pm by chris
In: Robots.txt, SEO Strategy
