Wikipedia:Wikipedia Signpost/2011-06-06/Recent research

{{Wikipedia:Signpost/Template:Signpost-header|||}}

{{Wikipedia:Signpost/Template:Signpost-article-start|{{{1|Various metrics of quality and trust; leadership; nerd stereotypes}}}|By Tilman Bayer| 6 June 2011}}

=CHI 2011: Shared leadership, nerd stereotypes and the fruit fly of social software=

A session on "Incentives & User Generated Content" at the annual "Conference on Human Factors in Computing Systems" (CHI 2011) last month featured two papers about Wikipedia ([http://chi2011.org/program/program.html#S1278 abstracts]):

  • Evidence of "shared leadership" found in 4 million talk page messages: As described in a paper titled "Identifying Shared Leadership in Wikipedia", four researchers from Carnegie Mellon University used machine learning to train an algorithm that classifies the text of talk page messages, based on [http://www.cs.cmu.edu/~haiyiz/FeaturesVocabulary.htm a set of formal criteria], into four kinds of behavior indicating different kinds of "leadership", using the following descriptions and examples:
  • Providing Positive Feedback (Transactional Leadership: Energize people through acknowledging work and provides rewards): "I’m so impressed. This is a very fine article!"
  • Providing Negative Feedback (Aversive Leadership: Regulate people through reprimands): "If you continue in this manner you will be blocked from editing without further warning. Please stop, and consider improving rather than damaging the work of others."
  • Directing (Directive Leadership: Direct people through issuing instructions, commands, assigning tasks, setting goals): "Here is a new article on a former airport I thought you might want to check out."
  • Social exchange (Transformational Leadership: Promote emotional engagement through for example talking nice, starting off-topic conversation, etc.): "Drop me a line on my talk page sometime, we’ll get a coffee over at Hot Rize or the new King Kocoa…"

:They used this to classify four million user talk page messages from the English Wikipedia, from a January 2008 dump, "sent by 130,000 distinct users (who had edited Wikipedia for an average of 13.6 months) and were received by 1.1 million distinct users (who averaged 10.8 months of editing)". Aiming to find differences between Wikipedians in central and peripheral roles, the researchers compared results for admins and non-admins, and according to membership in a WikiProject (non-members, regular members or core members, the latter defined as a founder or one of the top three contributors of a WikiProject). They found that "the more central editors perform more leadership behaviors per person because they are generally more active. However, these differences are not huge. For example, 2.8% of administrators’ work consists of sending directive messages compared to 2.0% for non-administrators". They interpreted this as "strong evidence of shared leadership in Wikipedia" (defined as "a dynamic, interactive influence process among individuals in groups for which the objective is to lead one another to the achievement of group or organizational goals"), with "a large proportion of leadership behaviors performed by editors in peripheral as well as central roles", in contrast to traditional models of leadership. Although editors in all roles showed leadership behaviors, there were differences: "the role of core members in Wikiprojects may be less task-focused and more person-focused, with social or motivational messages to keep members active".

  • Negative stereotypes about Wikipedians may deter newbies: For a paper titled "My Kind of People? Perceptions About Wikipedia Contributors and Their Motivations" ([http://www.slideshare.net/jantin/antin-my-kind-of-people-clean-7987385 slides]), a Yahoo! researcher conducted 20 in-person interviews with participants who had all edited Wikipedia before, but infrequently. When asked how they imagined Wikipedia contributors, interviewees used three "primary stereotypes": first, that of "regular folks", reflecting a perception of Wikipedia as an egalitarian community; second, that of a "well educated, credentialed group"; and third, the most frequent one: "By far the most common image that participants invoked to describe Wikipedia’s contributors was that of the solitary techno-geek, ... an unflattering picture [where Wikipedians] are 'geeky' or 'nerdy,' technologically adept, unkempt, unhealthily obsessive, and absorbed with online life." The author stressed the potential damage caused by such negative stereotypes (even if their factual accuracy was questionable), as they might prevent new editors from joining the community. He stated that a wiki's "deliberate design decision to hide the identities of individual authors in favor of a kind of collective authorship ... has consequences which are to date insufficiently investigated", possibly allowing readers to fill the void with preconceived stereotypes. "Wikipedia’s ongoing educational efforts could include 'meet the author' informational campaigns which highlight the identities of heavy contributors and emphasize their pro-social motivations. In other words, Wikipedia can combat speculative answers to the question 'Who writes Wikipedia?' by explicitly revealing and promoting that information to its users." The paper has already been noted by the Wikimedia Foundation's "Account Creation Improvement Project".
  • Wikipedia on a DVD player: Also at CHI 2011, a note titled "Utilizing DVD players as low-cost offline Internet browsers", describing a method that "enables communities in the developing world to access Wikipedia and other resources at very low cost", received a [http://www.chi2011.org/program/awards.html#hmnotes honorable mention]. According to the [http://research.microsoft.com/en-us/um/people/thies/chi11-abstract.txt abstract], the researchers put "the entirety of schools-wikipedia.org – encompassing 5,500 articles and 259,000 screens – to a double-layer DVD. We evaluate our system via a study of 20 low-income users in Bangalore, India. Using our DVD as reference, participants are able to answer factual questions with over 90% success."

A [http://twitter.com/#!/juddantin/status/68799710634328067 saying] [http://twitter.com/#!/andresmh/status/69770612033339392 reported] at the conference: "Wikipedia is the Drosophila of social software."

=Dissertation: Wikipedia's "cyborg individual" ideal=

In his dissertation titled "[http://etd.ohiolink.edu/view.cgi?acc_num=bgsu1300717552 Hackers, Cyborgs, and Wikipedians: The Political Economy and Cultural History of Wikipedia]" (submitted at Bowling Green State University last month), Andrew Famiglietti argues that Wikipedia "was shaped by an ideal I call, 'the cyborg individual,' which held that the production of knowledge was best entrusted to a widely distributed network of individual human subjects and individually owned computers. I trace how this ideal emerged from hacker culture in response to anxieties hackers experienced due to their intimate relationships with machines." Yochai Benkler's ideas are referred to, among those of others. One chapter, titled "Wikipedia and Google", rejects a blogger's claim that Wikipedians decide on the notability of article subjects solely based on Google hits. A detailed analysis of the fate of new articles from one entire day (which the author provides online in the form of a [http://blogs.bgsu.edu/wikipediadata/ blog], tagged by their eventual fate – e.g. [http://blogs.bgsu.edu/wikipediadata/tag/given-csd-a7/ those speedily deleted under CSD A7]), finds that while Google searches indeed play an important role in the corresponding deletion processes, Wikipedians "are justifiably confident in their ability to skillfully use" it, avoiding the "if it is not on Google it doesn't exist" trap.

=Briefly=

  • Wikipedia research survey: Wikipedia researcher Finn Årup Nielsen has [https://twitter.com/#!/fnielsen/status/77814786087333888 announced] an updated version of [http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6012 his survey paper on more than 1000 Wikipedia research publications] (see also the last "Recent research").
  • The upcoming Open Knowledge Conference 2011 will see a Workshop on Wikipedia & Research, organized by members of the Wikimedia Foundation's Research Committee.
  • Historical discussions about "No Personal Attacks" policy analyzed: In a paper titled "Self-Governance Through Group Discussion in Wikipedia: Measuring Deliberation in Online Groups" (appearing in the June issue of "Small Group Research", [http://sgr.sagepub.com/content/early/2011/05/08/1046496411406137.abstract abstract]), researchers from Ohio University, Cornell and Southern Illinois University Edwardsville examined "the small group discussions that undergird policy-making processes in a well-established online community, Wikipedia. Content analysis shows that these discussions demonstrated a relatively high level of problem analysis and providing of information, but results were mixed in the group’s demonstration of respect, consideration, and mutual comprehension". They note that Wikipedians "do not simply write and discuss encyclopedia articles: they also propose, collaboratively create, discuss, agree on, and enforce the policies that guide their interactions. This stands in sharp contrast to most online communities, where governance resides in the hands of a relative few community leaders". Concretely, they analyzed the postings on the talk page of Wikipedia:No personal attacks "from April 2002, when the first version of the policy was proposed, through August 2005" (it was first proposed {{diff|Wikipedia:Historical_archive/Rules_to_consider|52589|52582|here}} by Jimmy Wales), comprising 282 posts across 35 discussion threads, and coded them "on eight of the nine dimensions of deliberation: creating an information base, prioritizing values, identifying solutions, weighing solutions, making decisions, comprehension, consideration, and respect". For example, 40 postings or 14.2% were evaluated as showing "Lack of respect". They also created a social network graph of discussants on one archive page (with User:Snowspinner and User "SamS" as the two biggest nodes), and highlight concrete examples (including user names) of patterns formally interpreted as "conflict management" or "good deliberative discussion".
  • Editor retention not necessarily a good thing?: In an article titled "[http://www.samransbotham.com/sites/default/files/RansbothamKane_WikiDemotion_2012_MISQ.pdf Membership Turnover and Collaboration Success in Online Communities: Explaining Rises and Falls from Grace in Wikipedia]", two researchers from Boston College examine "the longitudinal history of 2,065 featured articles on Wikipedia" and find evidence that "contributions from a mixture of new and experienced participants both increases [sic] the likelihood that an article will be promoted to featured article status and decreases the risk it will be demoted after having been promoted. These findings imply that, contrary to many of the assumptions in previous research, participant retention does not have a strictly positive effect on emerging collaborative environments."
  • Web of trust: Three researchers from Paris [http://perso.telecom-paristech.fr/~maniu/pubs/SM_DBSocial11_wikisigned.pdf reported on efforts] to group Wikipedia editors into a "signed network" (also known as web of trust), based on the following kinds of interactions: "edits over commonly-authored articles, activities such as votes for adminship, the restoring of an article to a previous version, or the assignment of barnstars (a prize, acknowledging valuable contributions)."
  • Venetian network: Researcher Paolo Massa (user:Phauly) also examined "[http://www.gnuband.org/papers/social_networks_of_wikipedia/ Social Networks of Wikipedia]" (a paper presented at the ACM Hypertext 2011 conference this week), consisting of users on the Venetian Wikipedia with the edges of the network determined by the number of messages one user has left on the talk page of another.
  • A paper titled "[http://www.cs.dartmouth.edu/~cja/papers/wikiwatchdog.pdf Wiki-watchdog: Anomaly Detection in Wikipedia Through a Distributional Lens]" describes "an efficient distribution-based methodology that monitors distributions of revision activity for changes. We show that using our methods it is possible to detect the activity of bots, flash events, and outages, as they occur. Our methods are proposed to support the monitoring of the [Wikipedia] contributors" and other things.
  • Wikipedia's historiography: Dominant or alternative?: An article titled "The nature of historical representation on Wikipedia: Dominant or alterative historiography?" (appeared in this month's issue of the Journal of the American Society for Information Science and Technology) compared the "Wikipedia accounts of Singaporean and Philippine history", according to the [http://onlinelibrary.wiley.com/doi/10.1002/asi.21531/abstract abstract]. The author argues that "information professionals [should] take a keener interest in Wikipedia, with an eye to helping include accounts of documented historical perspectives that are ignored by mainstream historiographical traditions."
  • Identifying current events: A publication titled "[http://www.cs.jhu.edu/~ccb/publications/wikitopics-what-is-popular-on-wikipedia-and-why.pdf WikiTopics: What is popular on Wikipedia and why]" by three US researchers described an automated method to "identify and describe significant current events as according to Wikipedia content, and metadata", by first selecting articles with significantly increasing page views, followed by clustering, and then "generat[ing] textual descriptions for the clustered articles to explain why they are popular and what current event they are relevant to".
  • 8.5% of Wikipedia articles tagged as flawed: A paper titled "[http://www.uni-weimar.de/medien/webis/publications/downloads/papers/stein_2011d.pdf Towards automatic quality assurance in Wikipedia]" (prepared for the World Wide Web Conference 2011 two months ago) analyzed the frequency of cleanup templates (e.g. for NPOV or notability problems) on the English Wikipedia (as of January 2010) and found that 8.5% of articles were tagged for at least one flaw – most often with Unreferenced, which was encountered in 135,210 articles (4.57%). The researchers from Bauhaus-Universität Weimar then built automatic classifiers that tried to discern featured articles from those carrying one of the most frequent flaws, which worked "with a nearly perfect precision" in the case of the "orphan" and "notability" tags. They announced that "based on the lessons learned, we plan to operationalize our classification approach as a Wikipedia bot that tags articles autonomously".
  • A paper describing a method for the "Quality evaluation of Wikipedia articles through edit history and editor groups" promises that it "has better performance in quality evaluation than several existing metrics", according to the [https://doi.org/10.1007%2F978-3-642-20291-9_20 abstract].
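
As an illustration of the kind of network described in the "Venetian network" item above, where an edge between two users is determined by the number of messages one has left on the other's talk page, the following is a minimal sketch and not the paper's own code. The user names, the sample data and the function name <code>build_talk_network</code> are hypothetical, and skipping messages that users leave on their own talk page is an assumption made here for simplicity.

```python
from collections import Counter


def build_talk_network(messages):
    """Count talk page messages per (sender, recipient) pair.

    `messages` is an iterable of (sender, recipient) tuples, one per
    message left by `sender` on the talk page of `recipient`.
    Returns a Counter whose keys are directed edges and whose values
    are the edge weights (number of messages).
    """
    edges = Counter()
    for sender, recipient in messages:
        # Assumption for this sketch: posts on one's own talk page
        # are not treated as edges.
        if sender != recipient:
            edges[(sender, recipient)] += 1
    return edges


# Hypothetical sample data, not taken from any Wikipedia dump.
sample_messages = [
    ("Alice", "Bob"),
    ("Alice", "Bob"),
    ("Bob", "Alice"),
    ("Carol", "Bob"),
    ("Carol", "Carol"),
]

network = build_talk_network(sample_messages)
for (sender, recipient), weight in sorted(network.items()):
    print(f"{sender} -> {recipient}: {weight}")
```

A directed, weighted edge list like this can then be passed to a graph library of choice for further analysis.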

{{Wikipedia:Signpost/Template:Signpost-article-comments-end||2011-04-11|2011-07-25}}
