Archive for the ‘Robots.txt’ Category

Google displaying message instead of description for robots.txt excluded files

We spotted today that Google has started displaying a message in place of an empty description. This happens when you disallow a page in robots.txt, which prevents Google from crawling the content of the page and therefore from showing a description for it. The message includes a “learn more” link that points to: https://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449 It is [...]

Posted on October 19, 2012 at 1:44 pm by mm
In: Google, Natural search, Robots.txt, Site architecture

Inaccessible Robots.txt = Disallow All

A few weeks ago Google announced that they are increasing the number of alerts sent to webmasters (http://googlewebmastercentral.blogspot.co.uk/2012/07/new-crawl-error-alerts-from-webmaster.html). One of the messages they mention is ‘Your site’s robots.txt is inaccessible’. Google have previously stated that they will not crawl a website if the robots.txt file is inaccessible. This is because if they can’t access the robots.txt [...]

Posted on July 31, 2012 at 6:23 pm by chris
In: Robots.txt, SEO Strategy
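
The rule in the ‘Inaccessible Robots.txt = Disallow All’ post above can be sketched as a small decision function. This is only an illustration, not Google's implementation: the name `crawl_policy` is hypothetical, and the status-code mapping (a 404 read as "no robots.txt, so nothing is blocked", anything else that is not a success read as "inaccessible") is an assumption made for the example.

```python
from typing import Optional


def crawl_policy(robots_status: Optional[int]) -> str:
    """Decide how to treat a site from the HTTP status of its robots.txt fetch.

    robots_status is None when the request failed outright (timeout, DNS error).
    The mapping below is an illustrative assumption, not Google's actual logic.
    """
    if robots_status is None:
        # The file could not be fetched at all: treat the site as off limits.
        return "disallow_all"
    if 200 <= robots_status < 300:
        # The file was fetched: parse it and follow its rules.
        return "use_rules"
    if robots_status == 404:
        # Assumed: a missing file means the site sets no restrictions.
        return "allow_all"
    # Any other response is treated as an inaccessible robots.txt.
    return "disallow_all"


print(crawl_policy(200))   # use_rules
print(crawl_policy(404))   # allow_all
print(crawl_policy(503))   # disallow_all
print(crawl_policy(None))  # disallow_all
```

The practical point is the last branch: a robots.txt that times out or returns a server error can stop crawling of the whole site, even if the pages themselves load fine.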

Google’s Hidden Interpretation of Robots.txt

Update: Google has confirmed the behaviour and provided detailed documentation. The original Robots.txt syntax was pretty straightforward. You could only use the Disallow directive to exclude pages, and each Disallow directive acted like a broad match at the end: it matched any URL path that began with the given value. This seemed pretty intuitive to most people and for a while the world was a [...]

Posted on November 15, 2010 at 8:03 am by chris · 2 Comments
In: Google, Robots.txt, Site architecture, Webmaster Tools
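
The original matching rule described in ‘Google’s Hidden Interpretation of Robots.txt’ above, where each Disallow value matches every path that starts with it, can be sketched in a few lines. This is a minimal illustration of that prefix behaviour only. It does not cover the extended interpretation the post goes on to discuss, and the function name `is_disallowed` and the sample rules are hypothetical.

```python
from typing import Iterable


def is_disallowed(path: str, disallow_rules: Iterable[str]) -> bool:
    """Return True if path starts with any non-empty Disallow value."""
    # An empty Disallow value means "nothing is disallowed", so it is skipped.
    return any(rule and path.startswith(rule) for rule in disallow_rules)


# Hypothetical rules, as they would appear after "Disallow:" in robots.txt.
rules = ["/private", "/tmp/"]

print(is_disallowed("/private-files/report.html", rules))  # True
print(is_disallowed("/tmp", rules))                        # False
print(is_disallowed("/public/index.html", rules))          # False
```

Note that `/private` also blocks `/private-files/report.html`, because the match is on the start of the path rather than on a whole directory name. A trailing slash, as in `/tmp/`, narrows the rule to that directory.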