Google Had Discussed Allowing Noindex In Robots.txt

Google’s John Mueller responded to a question on LinkedIn about the use of an unsupported noindex directive in the robots.txt of his own personal website. He explained the pros and cons of search engine support for the directive and offered insight into Google’s internal discussions about supporting it.

John Mueller’s Robots.txt

Mueller’s robots.txt has been a topic of conversation for the past week because of the odd, non-standard directives he used within it.

It was almost inevitable that Mueller’s robots.txt would be scrutinized and go viral in the search marketing community.

Noindex Directive

Everything that’s in a robots.txt is called a directive. A directive is a request that a web crawler is expected to obey, provided it honors robots.txt at all.
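For context, here’s roughly what a minimal, standards-compliant robots.txt looks like (the domain and paths below are hypothetical):

User-agent: *
Disallow: /private/
Allow: /private/press-release.html
Sitemap: https://example.com/sitemap.xml

User-agent, Disallow, Allow and Sitemap are the directives that crawlers like Googlebot recognize.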

There are standards for how to write a robots.txt directive, and anything that doesn’t conform to those standards is likely to be ignored. A non-standard directive in Mueller’s robots.txt caught the eye of someone who decided to post a question about it to John Mueller on LinkedIn, asking whether Google supported it.

It’s a good question because it’s easy to assume that if a Googler is using it then maybe Google supports it.

The non-standard directive was noindex. Noindex is part of the meta robots standard but not the robots.txt standard. Mueller didn’t have just one instance of the noindex directive; he had 5,506 of them.
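To make the distinction concrete (the path below is hypothetical): the standard way to apply noindex is a meta robots tag in the HTML of the page itself (or an X-Robots-Tag HTTP header), while the non-standard form puts a noindex line directly into robots.txt:

Standard (in the page’s HTML):
<meta name="robots" content="noindex">

Non-standard (in robots.txt, ignored by Google):
noindex: /example-page/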

The SEO specialist who asked the question, Mahek Giri, wrote:

“In John Mueller’s robots.txt file,

there’s an unusual command:

“noindex:”

This command isn’t part of the standard robots.txt format,

So do you think it will have any impact on how search engine indexes his pages?

John Mueller curious to know about noindex: in robots.txt”

Why Noindex Directive In Robots.txt Is Unsupported By Google

Google’s John Mueller answered that it was unsupported.

Mueller answered:

“This is an unsupported directive, it doesn’t do anything.”

Mueller then went on to explain that Google had at one time considered supporting the noindex directive from within the robots.txt because it would provide a way for publishers to block Google from both crawling and indexing content at the same time.

Right now it’s possible to block crawling in robots.txt or to block indexing with the meta robots noindex directive. But you can’t do both for the same page, because blocking the crawl prevents the crawler from ever “seeing” the meta robots directive on the page.
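A hypothetical illustration of that conflict (the path is made up): a Disallow rule in robots.txt stops Googlebot from fetching the page at all, so a noindex meta tag on that same page is never seen:

In robots.txt (blocks crawling of the page):
User-agent: *
Disallow: /private-page/

In the HTML of /private-page/ (never fetched, so the noindex is never read):
<meta name="robots" content="noindex">

The blocked URL can still end up indexed without its content if other pages link to it, which is the gap a noindex directive in robots.txt was meant to close.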

Mueller explained why Google decided to not move ahead with the idea of honoring the noindex directive within the robots.txt.

He wrote:

“There were many discussions about whether it should be supported as part of the robots.txt standard. The thought behind it was that it would be nice to block both crawling and indexing at the same time. With robots.txt, you can block crawling, or you can block indexing (with a robots meta tag, if you allow crawling). The idea was that you could have a “noindex” in robots.txt too, and block both.

Unfortunately, because many people copy & paste robots.txt files without looking at them in detail (few people look as far as you did!), it would be very, very easy for someone to remove critical parts of a website accidentally. And so, it was decided that this should not be a supported directive, or a part of the robots.txt standard… probably over 10 years ago at this point.”
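Mueller’s copy & paste concern is easy to picture. Had the directive been supported, a single stray line like the hypothetical one below would have asked search engines to drop an entire site from their indexes:

noindex: /

That is exactly the kind of accidental removal of “critical parts of a website” he describes.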

Why Was That Noindex In Mueller’s Robots.txt?

Mueller made clear that it’s unlikely Google will ever support that directive and that the decision was made roughly ten years ago. The revelation about those internal discussions is interesting, but it also deepens the sense of weirdness about Mueller’s robots.txt.

See also: 8 Common Robots.txt Issues And How To Fix Them

Featured Image by Shutterstock/Kues
