Adversarial information retrieval

{{short description|Information retrieval strategies in datasets}}

Adversarial information retrieval (adversarial IR) is a topic in information retrieval concerned with strategies for working with a data source, some portion of which has been maliciously manipulated. Tasks can include gathering, indexing, filtering, retrieving, and ranking information from such a data source. Adversarial IR includes the study of methods to detect, isolate, and defeat such manipulation.

On the Web, the predominant form of such manipulation is search engine spamming (also known as spamdexing), which involves employing various techniques to disrupt the activity of web search engines, usually for financial gain. Examples of spamdexing include link-bombing, comment or referrer spam, spam blogs (splogs), and malicious tagging. Reverse engineering of ranking algorithms, click fraud,<ref>Jansen, B. J. (2007). [https://faculty.ist.psu.edu/jjansen/academic/jansen_click_fraud.pdf Click fraud]. IEEE Computer. 40(7), 85-86.</ref> and web content filtering may also be considered forms of adversarial data manipulation.<ref>B. Davison, M. Najork, and T. Converse (2006), [https://web.archive.org/web/20090320173324/http://www.acm.org/sigs/sigir/forum/2006D/2006d_sigirforum_davison.pdf SIGIR Workshop Report: Adversarial Information Retrieval on the Web (AIRWeb 2006)]</ref>

Topics

Topics related to Web spam (spamdexing):

Other topics:

History

The term "adversarial information retrieval" was coined in 2000 by Andrei Broder (then Chief Scientist at Alta Vista) during the Web plenary session at the TREC-9 conference.<ref>D. Hawking and N. Craswell (2004), [http://es.csiro.au/pubs/trecbook_for_website.pdf Very Large Scale Retrieval and Web Search (Preprint version)] {{Webarchive|url=https://web.archive.org/web/20070829092407/http://es.csiro.au/pubs/trecbook_for_website.pdf |date=2007-08-29 }}</ref>

See also

References

{{reflist}}