Tuesday, April 5, 2011

Google's Advice on Robots.txt

Robots.txt was a fixture on websites even before the concept of search engine optimization was introduced. With or without SEO, its purpose is clear: to ask search engines not to crawl the files and folders it lists, as specified in the protocol. We may have placed these pages on our servers so that certain people could reach them conveniently, but on the open Web, whatever we upload can be found through search engines by anyone with Internet access. That's where robots.txt comes into the picture.

With SEO we have become more aware of what search engines can do. The purpose of SEO is to make our pages more visible, so that a more prominent search engine placement attracts more visitors. But prominent placement for the wrong keywords and pages is not only embarrassing but also damaging, especially if it involves moderately sensitive information such as next week's menu for a restaurant website or an upcoming promotion at a shopping mall.

Personally, I wouldn't require a robots.txt file on a website if all the uploaded pages and files are supposed to be crawled, indexed and ranked by search engines. I would introduce it only when I wish to block search engines from accessing certain pages or images. Some SEOs recommend placing a robots.txt file even when their main objective is full indexing by search engines, often with this common code:

User-agent: *
Allow: /

But Google advises against doing so. Instead, the search giant recommends that webmasters remove the file and allow search engines to do their job.

In short, if we want search engines to scan all the pages on our sites, there is no need to put a robots.txt file on our servers; they'll crawl everything by default anyway.
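
For anyone who wants to check this, here is a minimal Python sketch using the standard library's urllib.robotparser. It is only an illustration, not Google's own tooling: the helper name allowed, the example.com URLs, and the use of an empty rule set as a stand-in for a site with no robots.txt are all assumptions of mine, and real crawlers may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules_text, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines; an empty string yields no rules.
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, url)


# The "allow everything" file some SEOs recommend.
ALLOW_ALL = "User-agent: *\nAllow: /"
# Assumption: an empty rule set stands in for a site with no robots.txt.
NO_RULES = ""
# A file that actually does something: it blocks one folder.
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/"

if __name__ == "__main__":
    page = "http://example.com/menu.html"
    private = "http://example.com/private/menu.html"
    print("allow-all file, public page: ", allowed(ALLOW_ALL, page))
    print("no rules, public page:       ", allowed(NO_RULES, page))
    print("block /private/, public page:", allowed(BLOCK_PRIVATE, page))
    print("block /private/, private page:", allowed(BLOCK_PRIVATE, private))
```

Under this parser, the allow-all file and the empty rule set give the same answer for the public page, and only the Disallow rule changes anything. That is the point of the advice: the file earns its place only when there is something to block.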

