Noindex
{{short description|Meta tag used to request that Internet bots avoid indexing a web page}}
{{lowercase}}
{{for|the internal use on Wikipedia|WP:NOINDEX|selfref=y}}
The noindex value of an HTML robots meta tag requests that automated Internet bots avoid indexing a web page.[http://www.w3.org/TR/html401/appendix/notes.html#h-B.4.1.2 Robots and the META element], Official W3 specification[http://www.robotstxt.org/meta.html About the Robots tag] It is also a value of the HTTP response header X-Robots-Tag.{{cite web|url=https://webmasters.stackexchange.com/questions/71351/robots-txt-vs-noindex-tags|title=Robots.txt vs. Noindex Tags}} Reasons for using this meta tag include advising robots not to index a very large database, web pages that are very transitory, web pages that are under development, web pages that one wishes to keep slightly more private, or the printer-friendly and mobile-friendly versions of pages. Because honoring a website's noindex tag is left to the author of the search robot, these tags are sometimes ignored. The interpretation of the noindex tag also differs slightly from one search engine company to the next.
Noindexing entire pages
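As a minimal sketch of the page-level form, the robots meta tag is placed in the head element of the document. The title and body text here are placeholders and are not taken from any real site.

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Placeholder title; the directive does not depend on it. -->
    <title>Example page</title>
    <!-- Asks all compliant robots not to index this page. -->
    <meta name="robots" content="noindex">
  </head>
  <body>
    <p>Robots that honor the directive leave this page out of their index.</p>
  </body>
</html>
```

For resources that are not HTML, the same request can be sent in the HTTP response as the header line X-Robots-Tag: noindex.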
= Bot-specific directives =
The noindex directive can be restricted to certain bots by specifying a different "name" value in the meta tag.
For example, a tag can be addressed specifically to Google's bot,[http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=93710 Using meta tags to block access to your site], Google Webmasters Tools Help to Bing's bot, or to Baidu's bot by using that bot's name in place of the generic "robots" value.
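The bot-specific variants differ from the generic tag only in the name attribute. The sketch below assumes the crawler tokens googlebot, bingbot, and baiduspider, which are the names these search engines use for their main crawlers; other bots use other tokens.

```html
<head>
  <!-- Applies only to Google's crawler (assumed token: googlebot). -->
  <meta name="googlebot" content="noindex">
  <!-- Applies only to Bing's crawler (assumed token: bingbot). -->
  <meta name="bingbot" content="noindex">
  <!-- Applies only to Baidu's crawler (assumed token: baiduspider). -->
  <meta name="baiduspider" content="noindex">
</head>
```

Each tag works on its own; the three are shown together only for comparison.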
= robots.txt file =
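A robots.txt file can be used to block crawling. Blocking crawling is not the same as requesting noindex: a robot that is not allowed to fetch a page cannot read a noindex directive placed on that page.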
Noindexing part of a page
It is also possible to exclude part of a web page, such as navigation text, from being indexed rather than the whole page. There are various techniques for doing this, and several can be used in combination. Google's main indexing spider, Googlebot, is not known to recognize any of these techniques.
= <noindex> tag =
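The Russian search engine Yandex introduced a new <noindex> tag, which prevents indexing of the content between the opening and closing tags:
<p>Do index this text.</p>
<p><noindex>Don't index this text.</noindex></p>
Other indexing spiders also recognize the <noindex> tag.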
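Because noindex is not a standard HTML element, a page that uses it will not pass validation. Yandex also documents a comment-based form of the same marker, which the sketch below shows next to the element form; the paragraph text is illustrative only.

```html
<p>
  Do index this text.
  <!-- Element form: not valid HTML, but recognized by Yandex. -->
  <noindex>Don't index this text.</noindex>
  <!-- Comment form: same request, and the markup stays valid. -->
  <!--noindex-->Don't index this text.<!--/noindex-->
</p>
```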
= microformat =
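There is a 2005 draft microformats specification with the same functionality. The Robot Exclusion Profile looks for the attribute and value class="robots-noindex":
<div>Do index this text.</div>
<div class="robots-noindex">Don't index this text.</div>
<span class="robots-noindex">Don't index this text.</span>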
Several values can also be combined in a single class attribute, separated by spaces.
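A sketch of one such combination, assuming that the draft's robots-nofollow value is used alongside robots-noindex. The link target is a placeholder.

```html
<!-- Assumed combination: keep the text out of the index and do not follow its links. -->
<div class="robots-noindex robots-nofollow">
  Don't index this text or follow <a href="https://example.com/">this link</a>.
</div>
```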
= Yahoo! =
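In 2007, Yahoo! introduced functionality similar to the microformat into its spider. However, Yahoo!'s spider is incompatible in that it looks for the value class="robots-nocontent", and only this value:
<div>Do index this text.</div>
<div class="robots-nocontent">Don't index this text.</div>
<span class="robots-nocontent">Don't index this text.</span>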
= SharePoint =
SharePoint 2010’s iFilter excludes content inside of a <div> tag with the attribute and value class="noindex":
<div>Do index this text.</div>
<div class="noindex">Don't index this text.</div>
= Structured comments =
== Google Search Appliance ==
The Google Search Appliance uses structured comments:{{cite web |url=https://developers.google.com/search-appliance/documentation/68/admin_crawl/Preparing |title=Administering Crawl: Preparing for a Crawl |date=August 23, 2012 |work=Google Search Appliance |publisher=Google Inc. |at=Section: Excluding Unwanted Text from the Index |access-date=March 23, 2013 |archive-url=https://web.archive.org/web/20121123112433/https://developers.google.com/search-appliance/documentation/68/admin_crawl/Preparing |archive-date=November 23, 2012}}
<p>Do index this text.</p>
<!--googleoff: index--><p>Don't index this text.</p><!--googleon: index-->
Other indexing spiders also use their own structured comments.
References