
{{short description|Technological phenomenon with social implications}}

{{good article}}

{{Use mdy dates|date=October 2023}}

[[File:02-Sandvig-Seeing-the-Sort-2014-WEB.png|thumb|A flow chart showing the decisions made by a recommendation engine, {{Circa|2001}}{{Cite web|url=https://worldwide.espacenet.com/publicationDetails/biblio?CC=us&NR=7113917&KC=&FT=E&locale=en_EP|title=Patent #US2001021914|last=Jacobi|first=Jennifer|date=13 September 2001|website=Espacenet|access-date=4 July 2018}}]]

{{Artificial intelligence}}

{{Discrimination sidebar}}

Algorithmic bias describes systematic and repeatable harmful tendencies in a computerized sociotechnical system that create "unfair" outcomes, such as "privileging" one category over another in ways different from the intended function of the algorithm.{{Citation |last1=Hardebolle |first1=Cécile |title=Engineering ethics education and artificial intelligence |date=2024-11-25 |work=The Routledge International Handbook of Engineering Ethics Education |pages=125–142 |edition=1 |place=London |publisher=Routledge |language=en |doi=10.4324/9781003464259-9 |isbn=978-1-003-46425-9 |last2=Héder |first2=Mihály |last3=Ramachandran |first3=Vivek|doi-access=free }}

Bias can emerge from many factors, including but not limited to the design of the algorithm, its unintended or unanticipated use, or decisions about how data is coded, collected, selected, or used to train the algorithm.{{cite journal|last=Van Eyghen|first= Hans|title=AI Algorithms as (Un)virtuous Knowers|journal=Discover Artificial Intelligence|volume=5|issue=2|date=2025|doi= 10.1007/s44163-024-00219-z|doi-access=free}} For example, algorithmic bias has been observed in search engine results and social media platforms. This bias can have impacts ranging from inadvertent privacy violations to reinforcing social biases of race, gender, sexuality, and ethnicity. The study of algorithmic bias is most concerned with algorithms that reflect "systematic and unfair" discrimination.{{Cite book |last=Marabelli |first=Marco |url=https://link.springer.com/book/10.1007/978-3-031-53919-0 |title=AI, Ethics, and Discrimination in Business |series=Palgrave Studies in Equity, Diversity, Inclusion, and Indigenization in Business |publisher=Springer |year=2024 |isbn=978-3-031-53918-3 |language=en |doi=10.1007/978-3-031-53919-0}} This bias has only recently been addressed in legal frameworks, such as the European Union's General Data Protection Regulation (effective 2018) and the Artificial Intelligence Act (proposed 2021, approved 2024).

As algorithms expand their ability to organize society, politics, institutions, and behavior, sociologists have become concerned with the ways in which unanticipated output and manipulation of data can impact the physical world. Because algorithms are often considered to be neutral and unbiased, they can inaccurately project greater authority than human expertise (in part due to the psychological phenomenon of automation bias), and in some cases, reliance on algorithms can displace human responsibility for their outcomes. Bias can enter into algorithmic systems as a result of pre-existing cultural, social, or institutional expectations; by how features and labels are chosen; because of technical limitations of their design; or by being used in unanticipated contexts or by audiences who are not considered in the software's initial design.{{Cite book |last1=Suresh |first1=Harini |last2=Guttag |first2=John |title=Equity and Access in Algorithms, Mechanisms, and Optimization |chapter=A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle |date=2021-11-04 |chapter-url=https://dl.acm.org/doi/10.1145/3465416.3483305 |series=EAAMO '21 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=1–9 |doi=10.1145/3465416.3483305 |isbn=978-1-4503-8553-4|s2cid=235436386 }}

Algorithmic bias has been cited in cases ranging from election outcomes to the spread of online hate speech. It has also arisen in criminal justice,{{Cite journal |last=Krištofík |first=Andrej |date=2025-04-28 |title=Bias in AI (Supported) Decision Making: Old Problems, New Technologies |journal=International Journal for Court Administration |language=en |volume=16 |issue=1 |doi=10.36745/ijca.598 |issn=2156-7964|doi-access=free }} healthcare, and hiring, compounding existing racial, socioeconomic, and gender biases. The relative inability of facial recognition technology to accurately identify darker-skinned faces has been linked to multiple wrongful arrests of black men, an issue stemming from imbalanced datasets. Problems in understanding, researching, and discovering algorithmic bias persist due to the proprietary nature of algorithms, which are typically treated as trade secrets. Even when full transparency is provided, the complexity of certain algorithms poses a barrier to understanding their functioning. Furthermore, algorithms may change, or respond to input or output in ways that cannot be anticipated or easily reproduced for analysis. In many cases, even within a single website or application, there is no single "algorithm" to examine, but a network of many interrelated programs and data inputs, even between users of the same service.

A 2021 survey identified multiple forms of algorithmic bias, including historical, representation, and measurement biases, each of which can contribute to unfair outcomes.{{Cite journal |last1=Mehrabi |first1=N. |last2=Morstatter |first2=F. |last3=Saxena |first3=N. |last4=Lerman |first4=K. |last5=Galstyan |first5=A. |title=A survey on bias and fairness in machine learning |journal=ACM Computing Surveys |volume=54 |issue=6 |pages=1–35 |year=2021 |doi=10.1145/3457607 |arxiv=1908.09635 |url=https://dl.acm.org/doi/10.1145/3457607 |access-date=April 30, 2025}}

Definitions

File:A computer program for evaluating forestry opportunities under three investment criteria (1969) (20385500690).jpg

Algorithms are difficult to define,{{cite web|url=http://culturedigitally.org/2012/02/what-is-an-algorithm/|title=What is an Algorithm? – Culture Digitally|last1=Striphas|first1=Ted|website=culturedigitally.org|date=February 2012 |access-date=20 November 2017}} but may be generally understood as lists of instructions that determine how programs read, collect, process, and analyze data to generate output.{{cite book|title=Introduction to Algorithms|url=https://archive.org/details/introductiontoal00corm_805|url-access=limited|last1= Cormen|first1=Thomas H.|last2=Leiserson|first2=Charles E.|last3=Rivest|first3=Ronald L.|last4=Stein|first4=Clifford|date=2009|publisher=MIT Press|isbn=978-0-262-03384-8|edition=3rd|location=Cambridge, Mass.|page=[https://archive.org/details/introductiontoal00corm_805/page/n25 5]}}{{rp|13}} For a rigorous technical introduction, see Algorithms. Advances in computer hardware have led to an increased ability to process, store and transmit data. This has in turn boosted the design and adoption of technologies such as machine learning and artificial intelligence.{{rp|14–15}} By analyzing and processing data, algorithms are the backbone of search engines,{{cite web|title=How Google Search Works|url=https://www.google.com/search/howsearchworks/algorithms/|access-date=19 November 2017}} social media websites,{{cite magazine|last1=Luckerson|first1=Victor|title=Here's How Your Facebook News Feed Actually Works|url=https://time.com/collection-post/3950525/facebook-news-feed-algorithm/|magazine=Time|access-date=19 November 2017}} recommendation engines,{{Cite magazine|last1=Vanderbilt|first1=Tom|title=The Science Behind the Netflix Algorithms That Decide What You'll Watch Next|url=https://www.wired.com/2013/08/qq_netflix-algorithm/|magazine=Wired|access-date=19 November 2017|date=2013-08-07}} online retail,{{cite web|last1=Angwin|first1=Julia|author1-link=Julia Angwin|last2=Mattu|first2=Surya|title=Amazon Says It Puts Customers First. But Its Pricing Algorithm Doesn't — ProPublica|url=https://www.propublica.org/article/amazon-says-it-puts-customers-first-but-its-pricing-algorithm-doesnt|website=ProPublica|access-date=19 November 2017|date=20 September 2016}} online advertising,{{cite web|last1=Livingstone|first1=Rob|title=The future of online advertising is big data and algorithms|url=http://theconversation.com/the-future-of-online-advertising-is-big-data-and-algorithms-69297 |website=The Conversation|date=13 March 2017 |access-date=19 November 2017}} and more.{{Cite news|last1=Hickman|first1=Leo|title=How algorithms rule the world|url=https://www.theguardian.com/science/2013/jul/01/how-algorithms-rule-world-nsa|newspaper=The Guardian|access-date=19 November 2017|date=1 July 2013}}

Contemporary social scientists are concerned with algorithmic processes embedded into hardware and software applications because of their political and social impact, and question the underlying assumptions of an algorithm's neutrality.{{cite web|url=https://static1.squarespace.com/static/55eb004ee4b0518639d59d9b/t/55ece1bfe4b030b2e8302e1e/1441587647177/seaverMiT8.pdf|title=Knowing Algorithms|last1=Seaver|first1=Nick|publisher=Media in Transition 8, Cambridge, MA, April 2013|access-date=18 November 2017|archive-date=1 December 2017|archive-url=https://web.archive.org/web/20171201012555/https://static1.squarespace.com/static/55eb004ee4b0518639d59d9b/t/55ece1bfe4b030b2e8302e1e/1441587647177/seaverMiT8.pdf|url-status=dead}}{{rp|2}}{{cite journal|last1=Graham|first1=Stephen D.N.|title=Software-sorted geographies|journal=Progress in Human Geography|date=July 2016|volume=29|issue=5|pages=562–580|doi=10.1191/0309132505ph568oa|s2cid=19119278|url=http://dro.dur.ac.uk/194/1/194.pdf|type=Submitted manuscript}}{{rp|563}}{{cite journal|last1=Tewell|first1=Eamon|date=4 April 2016|title=Toward the Resistant Reading of Information: Google, Resistant Spectatorship, and Critical Information Literacy|url=http://muse.jhu.edu/article/613843|journal=Portal: Libraries and the Academy|volume=16|issue=2|pages=289–310|issn=1530-7131|access-date=19 November 2017|doi=10.1353/pla.2016.0017|s2cid=55749077}}{{rp|294}}{{Cite journal|url=https://hbr.org/2013/04/the-hidden-biases-in-big-data|title=The Hidden Biases in Big Data|last=Crawford|first=Kate|date=1 April 2013|journal=Harvard Business Review}} The term algorithmic bias describes systematic and repeatable errors that create unfair outcomes, such as privileging one arbitrary group of users over others. For example, a credit score algorithm may deny a loan without being unfair, if it is consistently weighing relevant financial criteria. If the algorithm recommends loans to one group of users, but denies loans to another set of nearly identical users based on unrelated criteria, and if this behavior can be repeated across multiple occurrences, an algorithm can be described as biased.{{rp|332}} This bias may be intentional or unintentional (for example, it can originate in biased data produced by workers who previously performed the task that the algorithm is now automating).
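
The repeatability criterion can be illustrated with a deliberately biased toy scoring function (a hypothetical example, not any real lender's system): two applicants who are identical on every relevant financial criterion receive different decisions on every run because of an unrelated input.

<syntaxhighlight lang="python">
# Hypothetical toy example (not any real lender's system) showing how a repeatable
# disparity between nearly identical applicants can be detected.

def toy_credit_score(applicant):
    """Weighs income and debt, but also an unrelated postcode flag."""
    score = 0.5 * applicant["income"] - 0.8 * applicant["debt"]
    if applicant["postcode_flag"]:   # criterion unrelated to creditworthiness
        score -= 20
    return score

def approve(applicant, threshold=10):
    return toy_credit_score(applicant) >= threshold

# Two applicants identical on every relevant financial criterion.
a = {"income": 60, "debt": 20, "postcode_flag": False}
b = {"income": 60, "debt": 20, "postcode_flag": True}

# The disparity repeats on every run, which is what distinguishes
# systematic algorithmic bias from one-off noise.
print([approve(a) for _ in range(3)])   # [True, True, True]
print([approve(b) for _ in range(3)])   # [False, False, False]
</syntaxhighlight>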

Methods

Bias can be introduced to an algorithm in several ways. During the assemblage of a dataset, data may be collected, digitized, adapted, and entered into a database according to human-designed cataloging criteria.{{cite book|title=Media Technologies|last1=Gillespie|first1=Tarleton|last2=Boczkowski|first2=Pablo|last3=Foot|first3=Kristin|publisher=MIT Press|year=2014|isbn=9780262525374|location=Cambridge|pages=1–30 }}{{rp|3}} Next, programmers assign priorities, or hierarchies, for how a program assesses and sorts that data. This requires human decisions about how data is categorized, and which data is included or discarded.{{rp|4}} Some algorithms collect their own data based on human-selected criteria, which can also reflect the bias of human designers.{{rp|8}} Other algorithms may reinforce stereotypes and preferences as they process and display "relevant" data for human users, for example, by selecting information based on previous choices of a similar user or group of users.{{rp|6}}

Beyond assembling and processing data, bias can emerge as a result of design.{{cite web|last1=Diakopoulos|first1=Nicholas|title=Algorithmic Accountability: On the Investigation of Black Boxes {{!}}|url=https://towcenter.org/research/algorithmic-accountability-on-the-investigation-of-black-boxes-2/|website=towcenter.org|access-date=19 November 2017}} For example, algorithms that determine the allocation of resources or scrutiny (such as determining school placements) may inadvertently discriminate against a category when determining risk based on similar users (as in credit scores).{{cite report|last1=Lipartito|first1=Kenneth|title=The Narrative and the Algorithm: Genres of Credit Reporting from the Nineteenth Century to Today|date=6 January 2011|doi=10.2139/SSRN.1736283 |ssrn=1736283|s2cid=166742927|url=https://mpra.ub.uni-muenchen.de/28142/1/MPRA_paper_28142.pdf|type=Submitted manuscript}}{{rp|36}} Meanwhile, recommendation engines that work by associating users with similar users, or that make use of inferred marketing traits, might rely on inaccurate associations that reflect broad ethnic, gender, socio-economic, or racial stereotypes. Another example comes from determining criteria for what is included and excluded from results. These criteria could present unanticipated outcomes for search results, such as with flight-recommendation software that omits flights that do not follow the sponsoring airline's flight paths. Algorithms may also display an uncertainty bias, offering more confident assessments when larger data sets are available. This can skew algorithmic processes toward results that more closely correspond with larger samples, which may disregard data from underrepresented populations.{{Cite journal|last1=Goodman|first1=Bryce|last2=Flaxman|first2=Seth|title=EU regulations on algorithmic decision-making and a "right to explanation"|journal=AI Magazine |volume=38 |issue=3 |pages=50 |arxiv=1606.08813 |doi=10.1609/aimag.v38i3.2741 |year=2017|s2cid=7373959}}{{rp|4}}
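
A minimal numerical sketch of this uncertainty bias, using a standard confidence-interval calculation on invented sample sizes, shows how identical observed rates can lead to different treatment of a well-represented group and an underrepresented one.

<syntaxhighlight lang="python">
# Illustrative sketch (not from the cited work) of "uncertainty bias": estimates
# for a well-represented group come with tighter confidence intervals, so a
# system that prefers confident predictions will systematically favor it.
import math

def ci_width(successes, n, z=1.96):
    """Half-width of a normal-approximation 95% confidence interval
    for an observed success rate."""
    p = successes / n
    return z * math.sqrt(p * (1 - p) / n)

# Same underlying 70% rate, very different sample sizes.
large_group = ci_width(successes=7000, n=10000)   # ~0.009
small_group = ci_width(successes=70, n=100)       # ~0.090

print(f"large group: 0.70 ± {large_group:.3f}")
print(f"small group: 0.70 ± {small_group:.3f}")
# A rule such as "act only when the lower confidence bound exceeds 0.65"
# accepts the large group but rejects the underrepresented one,
# despite identical observed rates.
</syntaxhighlight>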

History

= Early critiques =

File:Used Punchcard (5151286161).jpg

The earliest computer programs were designed to mimic human reasoning and deductions, and were deemed to be functioning when they successfully and consistently reproduced that human logic. In his 1976 book Computer Power and Human Reason, artificial intelligence pioneer Joseph Weizenbaum suggested that bias could arise not only from the data used in a program but also from the way the program is coded.{{rp|149}}

Weizenbaum wrote that programs are a sequence of rules created by humans for a computer to follow. By following those rules consistently, such programs "embody law",{{cite book|last1=Weizenbaum|first1=Joseph|title=Computer Power and Human Reason: From Judgment to Calculation |url=https://archive.org/details/computerpowerhum0000weiz|url-access=registration|date=1976|publisher=W.H. Freeman|location=San Francisco|isbn=978-0-7167-0464-5}}{{rp|40}} that is, enforce a specific way to solve problems. The rules a computer follows are based on the assumptions of a computer programmer for how these problems might be solved. That means the code could incorporate the programmer's imagination of how the world works, including their biases and expectations.{{rp|109}} While a computer program can incorporate bias in this way, Weizenbaum also noted that any data fed to a machine additionally reflects "human decision making processes" as data is being selected.{{rp|70, 105}}

Finally, he noted that machines might also transfer good information with unintended consequences if users are unclear about how to interpret the results.{{rp|65}} Weizenbaum warned against trusting decisions made by computer programs that a user doesn't understand, comparing such faith to that of a tourist who can find his way to a hotel room exclusively by turning left or right on a coin toss. Crucially, the tourist has no basis for understanding how or why he arrived at his destination, and a successful arrival does not mean the process is accurate or reliable.{{rp|226}}

An early example of algorithmic bias resulted in as many as 60 women and ethnic minorities being denied entry to St. George's Hospital Medical School per year from 1982 to 1986, after the school implemented a new computer-guidance assessment system that, on the basis of historical trends in admissions, denied entry to women and to men with "foreign-sounding names".{{cite journal|last1=Lowry|first1=Stella|last2=Macpherson|first2=Gordon|date=5 March 1988|title=A Blot on the Profession|url=http://europepmc.org/backend/ptpmcrender.fcgi?accid=PMC2545288&blobtype=pdf|journal=British Medical Journal|volume=296|issue=6623|pages=657–8|access-date=17 November 2017|pmid=3128356|pmc=2545288|doi=10.1136/bmj.296.6623.657}} While many schools at the time employed similar biases in their selection process, St. George's was notable for automating that bias through the use of an algorithm, thus drawing attention on a much wider scale.

In recent years, as algorithms increasingly rely on machine learning methods applied to real-world data, algorithmic bias has become more prevalent due to inherent biases within the data itself. For instance, facial recognition systems have been shown to misidentify individuals from marginalized groups at significantly higher rates than white individuals, highlighting how biases in training datasets manifest in deployed systems.{{Cite web |title=Perpetual Lineup |url=https://www.law.georgetown.edu/privacy-technology-center/publications/the-perpetual-line-up/ |access-date=2024-12-12 |website=www.law.georgetown.edu |language=en-US}} A 2018 study by Joy Buolamwini and Timnit Gebru found that commercial facial recognition technologies exhibited error rates of up to 35% when identifying darker-skinned women, compared to less than 1% for lighter-skinned men.{{Cite web |last=Buolamwini |first=Joy |title=Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification |url=https://www.media.mit.edu/publications/gender-shades-intersectional-accuracy-disparities-in-commercial-gender-classification/ |access-date=2024-12-12 |website=MIT Media Lab}}

Algorithmic biases are not only technical failures but often reflect systemic inequities embedded in historical and societal data. Researchers and critics, such as Cathy O'Neil in her book Weapons of Math Destruction (2016), emphasize that these biases can amplify existing social inequalities under the guise of objectivity. O'Neil argues that opaque, automated decision-making processes in areas such as credit scoring, predictive policing, and education can reinforce discriminatory practices while appearing neutral or scientific.{{Cite book |last=Barocas |first=Solon |title=Fairness and machine learning: Limitations and opportunities |date=December 19, 2023 |publisher=The MIT Press |isbn=9780262048613}}

= Contemporary critiques and responses =

Though well-designed algorithms frequently determine outcomes that are as equitable as (or more equitable than) the decisions of human beings, cases of bias still occur, and are difficult to predict and analyze.{{cite journal |last1=Miller |first1=Alex P. |title=Want Less-Biased Decisions? Use Algorithms |url=https://hbr.org/2018/07/want-less-biased-decisions-use-algorithms |journal=Harvard Business Review |access-date=31 July 2018 |date=26 July 2018}} The complexity of analyzing algorithmic bias has grown alongside the complexity of programs and their design. Decisions made by one designer, or team of designers, may be obscured among the many pieces of code created for a single program; over time these decisions and their collective impact on the program's output may be forgotten.{{cite journal|last1=Introna|first1=Lucas D.|s2cid=145190381|date=2 December 2011|title=The Enframing of Code|journal=Theory, Culture & Society|volume=28|issue=6|pages=113–141|doi=10.1177/0263276411418131}}{{rp|115}} In theory, these biases may create new patterns of behavior, or "scripts", in relationship to specific technologies as the code interacts with other elements of society.{{cite web|url=https://www.theatlantic.com/technology/archive/2015/01/the-cathedral-of-computation/384300/|title=The Cathedral of Computation|last1=Bogost|first1=Ian|website=The Atlantic|access-date=19 November 2017|date=2015-01-15}} Biases may also impact how society shapes itself around the data points that algorithms require. For example, if data shows a high number of arrests in a particular area, an algorithm may assign more police patrols to that area, which could lead to more arrests.{{cite journal|last1=Introna|first1=Lucas|last2=Wood|first2=David|date=2004|title=Picturing algorithmic surveillance: the politics of facial recognition systems|url=http://nbn-resolving.de/urn:nbn:de:0168-ssoar-200675|journal=Surveillance & Society|volume=2|pages=177–198|access-date=19 November 2017}}{{rp|180}}

The decisions of algorithmic programs can be seen as more authoritative than the decisions of the human beings they are meant to assist,{{cite journal|last1=Introna|first1=Lucas D.|date=21 December 2006|title=Maintaining the reversibility of foldings: Making the ethics (politics) of information technology visible|journal=Ethics and Information Technology|volume=9|issue=1|pages=11–25|doi=10.1007/s10676-006-9133-z|citeseerx=10.1.1.154.1313|s2cid=17355392}}{{rp|15}} a process described by author Clay Shirky as "algorithmic authority".{{cite web|url=http://www.shirky.com/weblog/2009/11/a-speculative-post-on-the-idea-of-algorithmic-authority/|title=A Speculative Post on the Idea of Algorithmic Authority Clay Shirky|last1=Shirky|first1=Clay|website=www.shirky.com|access-date=20 November 2017|archive-date=March 15, 2012|archive-url=https://web.archive.org/web/20120315160514/http://www.shirky.com/weblog/2009/11/a-speculative-post-on-the-idea-of-algorithmic-authority/|url-status=dead}} Shirky uses the term to describe "the decision to regard as authoritative an unmanaged process of extracting value from diverse, untrustworthy sources", such as search results. This neutrality can also be misrepresented by the language used by experts and the media when results are presented to the public. For example, a list of news items selected and presented as "trending" or "popular" may be created based on significantly wider criteria than just their popularity.{{rp|14}}

Because of their convenience and authority, algorithms are theorized as a means of delegating responsibility away from humans.{{rp|16}}{{cite journal|last1=Ziewitz|first1=Malte|title=Governing Algorithms: Myth, Mess, and Methods|journal=Science, Technology, & Human Values|date=1 January 2016|volume=41|issue=1|pages=3–16|doi=10.1177/0162243915608948|s2cid=148023125|issn=0162-2439|url=http://revistas.ucm.es/index.php/ESMP/article/view/58040|doi-access=free}}{{rp|6}} This can have the effect of reducing alternative options, compromises, or flexibility.{{rp|16}} Sociologist Scott Lash has critiqued algorithms as a new form of "generative power", in that they are a virtual means of generating actual ends. Where previously human behavior generated data to be collected and studied, powerful algorithms increasingly could shape and define human behaviors.{{cite journal|last1=Lash|first1=Scott|date=30 June 2016|title=Power after Hegemony|journal=Theory, Culture & Society|volume=24|issue=3|pages=55–78|doi=10.1177/0263276407075956|s2cid=145639801}}{{rp|71}}

While blind adherence to algorithmic decisions is a concern, an opposite issue arises when human decision-makers exhibit "selective adherence" to algorithmic advice. In such cases, individuals accept recommendations that align with their preexisting beliefs and disregard those that do not, thereby perpetuating existing biases and undermining the fairness objectives of algorithmic interventions. Consequently, incorporating fair algorithmic tools into decision-making processes does not automatically eliminate human biases.{{Citation |last1=Gaudeul |first1=Alexia |title=Understanding the Impact of Human Oversight on Discriminatory Outcomes in AI-Supported Decision-Making |date=2024 |work=ECAI 2024 |pages=1067–1074 |publisher=IOS Press |doi=10.3233/faia240598 |last2=Arrigoni |first2=Ottla |last3=Charisi |first3=Vicky |last4=Escobar-Planas |first4=Marina |last5=Hupont |first5=Isabelle|series=Frontiers in Artificial Intelligence and Applications |doi-access=free |isbn=978-1-64368-548-9 }}

Concerns over the impact of algorithms on society have led to the creation of working groups in organizations such as Google and Microsoft, which have co-created a working group named Fairness, Accountability, and Transparency in Machine Learning.{{cite journal |last1=Garcia |first1=Megan |title=Racist in the Machine |journal=World Policy Journal |date=1 January 2016 |volume=33 |issue=4 |pages=111–117 |doi=10.1215/07402775-3813015 |s2cid=151595343 }}{{rp|115}} Ideas from Google have included community groups that patrol the outcomes of algorithms and vote to control or restrict outputs they deem to have negative consequences.{{rp|117}} In recent years, the study of the Fairness, Accountability, and Transparency (FAT) of algorithms has emerged as its own interdisciplinary research area with an annual conference called FAccT.{{Cite web|url=https://facctconference.org/2021/press-release.html|title=ACM FAccT 2021 Registration|website=fatconference.org|access-date=2021-11-14}} Critics have suggested that FAT initiatives cannot serve effectively as independent watchdogs when many are funded by corporations building the systems being studied.{{cite web |last1=Ochigame |first1=Rodrigo |title=The Invention of "Ethical AI": How Big Tech Manipulates Academia to Avoid Regulation |url=https://theintercept.com/2019/12/20/mit-ethical-ai-artificial-intelligence/ |website=The Intercept |access-date=11 February 2020 |date=20 December 2019}}

Types

= Pre-existing =

Pre-existing bias in an algorithm is a consequence of underlying social and institutional ideologies. Such ideas may influence or create personal biases within individual designers or programmers. Such prejudices can be explicit and conscious, or implicit and unconscious.{{rp|334}}{{rp|294}} Poorly selected input data, or simply data from a biased source, will influence the outcomes created by machines.{{cite book|last1=Goffrey|first1=Andrew|editor1-last=Fuller|editor1-first=Matthew|title=Software Studies: A Lexicon|url=https://archive.org/details/softwarestudiesl00full_007|url-access=limited|date=2008|publisher=MIT Press|location=Cambridge, Mass.|isbn=978-1-4356-4787-9|pages=[https://archive.org/details/softwarestudiesl00full_007/page/n29 15]–20|chapter=Algorithm}}{{rp|17}} Encoding pre-existing bias into software can preserve social and institutional bias, and, without correction, could be replicated in all future uses of that algorithm.{{rp|116}}{{rp|8}}

An example of this form of bias is the British Nationality Act Program, designed to automate the evaluation of new British citizens after the 1981 British Nationality Act.{{rp|341}} The program accurately reflected the tenets of the law, which stated that "a man is the father of only his legitimate children, whereas a woman is the mother of all her children, legitimate or not."{{rp|341}}{{cite journal|last1=Sergot|first1=MJ|last2=Sadri|first2=F|last3=Kowalski|first3=RA|last4=Kriwaczek|first4=F|last5=Hammond|first5=P|last6=Cory|first6=HT|title=The British Nationality Act as a Logic Program|journal=Communications of the ACM|date=May 1986|volume=29|issue=5|pages=370–386|url=https://web.stanford.edu/class/cs227/Readings/BritishNationalityAct.pdf|access-date=18 November 2017|doi=10.1145/5689.5920|s2cid=5665107}}{{rp|375}} In its attempt to transfer a particular logic into an algorithmic process, the BNAP inscribed the logic of the British Nationality Act into its algorithm, which would perpetuate it even if the act was eventually repealed.{{rp|342}}

Another source of bias, which has been called "label choice bias",{{Cite web |title=To stop algorithmic bias, we first have to define it |url=https://www.brookings.edu/articles/to-stop-algorithmic-bias-we-first-have-to-define-it/ |access-date=2023-06-27 |website=Brookings |language=en-US}} arises when proxy measures are used to train algorithms, which builds bias against certain groups into the model. For example, a widely used algorithm predicted health care costs as a proxy for health care needs, and used those predictions to allocate resources to patients with complex health needs. This introduced bias because Black patients incur lower costs, even when they are just as unhealthy as White patients.{{Cite news |last1=Evans |first1=Melanie |last2=Mathews |first2=Anna Wilde |date=2019-10-24 |title=Researchers Find Racial Bias in Hospital Algorithm |language=en-US |work=Wall Street Journal |url=https://www.wsj.com/articles/researchers-find-racial-bias-in-hospital-algorithm-11571941096 |access-date=2023-06-27 |issn=0099-9660}} Solutions to "label choice bias" aim to match the actual target (what the algorithm is predicting) more closely to the ideal target (what researchers want the algorithm to predict); in the prior example, instead of predicting cost, researchers would focus on the more meaningful variable of health care needs. Adjusting the target led to almost double the number of Black patients being selected for the program.
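
The effect of label choice can be illustrated with a small simulation on synthetic data (a hedged sketch, not the cited study's actual model), in which ranking patients by a cost proxy under-selects a group whose costs are systematically lower at equal levels of need.

<syntaxhighlight lang="python">
# Hedged sketch of "label choice bias" on synthetic data: ranking patients by a
# proxy label (past cost) rather than the ideal target (health need) under-selects
# a group whose costs are systematically lower at the same level of need.
import random
random.seed(0)

patients = []
for _ in range(1000):
    group = random.choice(["A", "B"])
    need = random.gauss(50, 15)                    # true health need (the ideal target)
    cost = need * (1.0 if group == "A" else 0.7)   # group B incurs lower cost at equal need
    cost += random.gauss(0, 5)
    patients.append({"group": group, "need": need, "cost": cost})

def share_of_B(selected):
    return sum(p["group"] == "B" for p in selected) / len(selected)

top_by_cost = sorted(patients, key=lambda p: p["cost"], reverse=True)[:100]
top_by_need = sorted(patients, key=lambda p: p["need"], reverse=True)[:100]

print("share of group B selected, cost proxy:", share_of_B(top_by_cost))
print("share of group B selected, need target:", share_of_B(top_by_need))
# Switching the label from cost to need substantially increases group B's
# selection rate, mirroring the adjustment described above.
</syntaxhighlight>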

= Machine learning bias =

Machine learning bias refers to systematic and unfair disparities in the output of machine learning algorithms. These biases can manifest in various ways and are often a reflection of the data used to train these algorithms. Key forms include the following:

== Language bias ==

Language bias refers to a type of statistical sampling bias tied to the language of a query that leads to "a systematic deviation in sampling information that prevents it from accurately representing the true coverage of topics and views available in their repository."{{Citation |last1=Luo |first1=Queenie |title=A Perspectival Mirror of the Elephant: Investigating Language Bias on Google, ChatGPT, Wikipedia, and YouTube |date=2023-05-23 |arxiv=2303.16281 |last2=Puett |first2=Michael J. |last3=Smith |first3=Michael D.}} Luo et al.'s work shows that current large language models, as they are predominantly trained on English-language data, often present Anglo-American views as truth, while systematically downplaying non-English perspectives as irrelevant, wrong, or noise. When queried with political ideologies like "What is liberalism?", ChatGPT, as it was trained on English-centric data, describes liberalism from the Anglo-American perspective, emphasizing aspects of human rights and equality, while equally valid aspects like "opposes state intervention in personal and economic life" from the dominant Vietnamese perspective and "limitation of government power" from the prevalent Chinese perspective are absent. Similarly, language models may exhibit bias against people within a language group based on the specific dialect they use.{{cite journal |last1=Hofmann |first1=Valentin |last2=Kalluri |first2=Pratyusha Ria |last3=Jurafsky |first3=Dan |last4=King |first4=Sharese |title=AI generates covertly racist decisions about people based on their dialect |journal=Nature |date=5 September 2024 |volume=633 |issue=8028 |pages=147–154 |doi=10.1038/s41586-024-07856-5|pmid=39198640 |pmc=11374696 |bibcode=2024Natur.633..147H }}

== Selection bias ==

Selection bias refers to the inherent tendency of large language models to favor certain option identifiers irrespective of the actual content of the options. This bias primarily stems from token bias—that is, the model assigns a higher a priori probability to specific answer tokens (such as “A”) when generating responses. As a result, when the ordering of options is altered (for example, by systematically moving the correct answer to different positions), the model’s performance can fluctuate significantly. This phenomenon undermines the reliability of large language models in multiple-choice settings.{{Citation |last1=Choi |first1=Hyeong Kyu |last2=Xu |first2=Weijie |last3=Xue |first3=Chi |last4=Eckman |first4=Stephanie |last5=Reddy |first5=Chandan K. |title=Mitigating Selection Bias with Node Pruning and Auxiliary Options |date=2024-09-27 |arxiv=2409.18857}}{{Citation |last1=Zheng |first1=Chujie |last2=Zhou |first2=Hao |last3=Meng |first3=Fandong |last4=Zhou |first4=Jie |last5=Huang |first5=Minlie |title=Large Language Models Are Not Robust Multiple Choice Selectors |date=2023-09-07 |arxiv=2309.03882}}
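
One common way to probe for this effect, sketched below under the assumption of a hypothetical query_model function standing in for whatever model is being evaluated, is to rotate the order of the options and check whether the answer follows the content or the option label.

<syntaxhighlight lang="python">
# Sketch of a positional-consistency probe. `query_model` is a hypothetical
# stand-in for whatever large language model API is being evaluated; it is
# assumed to return a single option letter such as "A".

def cyclic_shifts(options):
    """Yield every rotation of the option list."""
    for k in range(len(options)):
        yield options[k:] + options[:k]

def positional_consistency(question, options, correct, query_model):
    """Fraction of orderings in which the model picks the correct content,
    whichever letter it happens to appear under."""
    orderings = list(cyclic_shifts(options))
    hits = 0
    for ordering in orderings:
        labels = ["A", "B", "C", "D"][: len(ordering)]
        prompt = question + "\n" + "\n".join(
            f"{label}. {text}" for label, text in zip(labels, ordering)
        )
        answer_label = query_model(prompt)            # e.g. "A"
        hits += dict(zip(labels, ordering)).get(answer_label) == correct
    return hits / len(orderings)

# A model with a strong prior toward the token "A" scores close to
# 1 / len(options) here, while an unbiased model matches its usual accuracy.
</syntaxhighlight>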

== Gender bias ==

Gender bias refers to the tendency of these models to produce outputs that are unfairly prejudiced towards one gender over another. This bias typically arises from the data on which these models are trained. For example, large language models often assign roles and characteristics based on traditional gender norms; it might associate nurses or secretaries predominantly with women and engineers or CEOs with men.{{Cite book |last1=Busker |first1=Tony |last2=Choenni |first2=Sunil |last3=Shoae Bargh |first3=Mortaza |chapter=Stereotypes in ChatGPT: An empirical study |date=2023-11-20 |title=Proceedings of the 16th International Conference on Theory and Practice of Electronic Governance |chapter-url=https://dl.acm.org/doi/10.1145/3614321.3614325 |series=ICEGOV '23 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=24–32 |doi=10.1145/3614321.3614325 |isbn=979-8-4007-0742-1}}{{Cite book |last1=Kotek |first1=Hadas |last2=Dockum |first2=Rikker |last3=Sun |first3=David |chapter=Gender bias and stereotypes in Large Language Models |date=2023-11-05 |title=Proceedings of the ACM Collective Intelligence Conference |chapter-url=https://dl.acm.org/doi/10.1145/3582269.3615599 |series=CI '23 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=12–24 |doi=10.1145/3582269.3615599 |arxiv=2308.14921 |isbn=979-8-4007-0113-9}}

== Stereotyping ==

Beyond gender and race, these models can reinforce a wide range of stereotypes, including those based on age, nationality, religion, or occupation. This can lead to outputs that homogenize, or unfairly generalize or caricature groups of people, sometimes in harmful or derogatory ways.{{Citation |last1=Cheng |first1=Myra |title=Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models |date=2023-05-29 |arxiv=2305.18189 |last2=Durmus |first2=Esin |last3=Jurafsky |first3=Dan}}{{cite journal |last1=Wang |first1=Angelina |last2=Morgenstern |first2=Jamie |last3=Dickerson |first3=John P. |title=Large language models that replace human participants can harmfully misportray and flatten identity groups |journal=Nature Machine Intelligence |date=17 February 2025 |volume=7 |issue=3 |pages=400–411 |doi=10.1038/s42256-025-00986-z|arxiv=2402.01908 }}

A recent focus in research has been on the complex interplay between the grammatical properties of a language and real-world biases that can become embedded in AI systems, potentially perpetuating harmful stereotypes and assumptions. A study of gender bias in language models trained on Icelandic, a highly grammatically gendered language, revealed that the models exhibited a significant predisposition towards the masculine grammatical gender when referring to occupation terms, even for female-dominated professions.{{Citation |last1=Friðriksdóttir|first1=Steinunn Rut|title=Gendered Grammar or Ingrained Bias? Exploring Gender Bias in Icelandic Language Models|journal=Lrec-Coling 2024|date=2024|pages=7596–7610|url=https://aclanthology.org/2024.lrec-main.671/|last2=Einarsson|first2=Hafsteinn}} This suggests the models amplified societal gender biases present in the training data.

== Political bias ==

Political bias refers to the tendency of algorithms to systematically favor certain political viewpoints, ideologies, or outcomes over others. Language models may also exhibit political biases. Since the training data includes a wide range of political opinions and coverage, the models might generate responses that lean towards particular political ideologies or viewpoints, depending on the prevalence of those views in the data.{{Cite journal |last1=Feng |first1=Shangbin |last2=Park |first2=Chan Young |last3=Liu |first3=Yuhan |last4=Tsvetkov |first4=Yulia |date=July 2023 |editor-last=Rogers |editor-first=Anna |editor2-last=Boyd-Graber |editor2-first=Jordan |editor3-last=Okazaki |editor3-first=Naoaki |title=From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models |url=https://aclanthology.org/2023.acl-long.656 |journal=Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) |location=Toronto, Canada |publisher=Association for Computational Linguistics |pages=11737–11762 |doi=10.18653/v1/2023.acl-long.656|doi-access=free |arxiv=2305.08283 }}{{Cite web |last=Dolan |first=Eric W. |date=2025-02-14 |title=Scientists reveal ChatGPT's left-wing bias — and how to "jailbreak" it |url=https://www.psypost.org/scientists-reveal-chatgpts-left-wing-bias-and-how-to-jailbreak-it/ |access-date=2025-02-14 |website=PsyPost - Psychology News |language=en-US}}

== Racial bias ==

Racial bias refers to the tendency of machine learning models to produce outcomes that unfairly discriminate against or stereotype individuals based on race or ethnicity. This bias often stems from training data that reflects historical and systemic inequalities. For example, AI systems used in hiring, law enforcement, or healthcare may disproportionately disadvantage certain racial groups by reinforcing existing stereotypes or underrepresenting them in key areas. Such biases can manifest in ways like facial recognition systems misidentifying individuals of certain racial backgrounds or healthcare algorithms underestimating the medical needs of minority patients. Addressing racial bias requires careful examination of data, improved transparency in algorithmic processes, and efforts to ensure fairness throughout the AI development lifecycle.{{Cite web |last=Lazaro |first=Gina |date=May 17, 2024 |title=Understanding Gender and Racial Bias in AI |url=https://www.sir.advancedleadership.harvard.edu/articles/understanding-gender-and-racial-bias-in-ai |access-date=December 11, 2024 |website=Harvard Advanced Leadership Initiative Social Impact Review}}{{Cite journal |last=Jindal |first=Atin |date=September 5, 2022 |title=Misguided Artificial Intelligence: How Racial Bias is Built Into Clinical Models |journal=Journal of Brown Hospital Medicine |volume=2 |issue=1 |page=38021 |doi=10.56305/001c.38021 |doi-access=free |pmid=40046549 |pmc=11878858 }}

= Technical =

File:Three Surveillance cameras.jpg

Technical bias emerges through limitations of a program, computational power, its design, or other constraints on the system.{{rp|332}} Such bias can also be a constraint of design; for example, a search engine that shows three results per screen can be understood to privilege the top three results slightly more than the next three, as in an airline price display.{{rp|336}} Another case is software that relies on randomness for fair distributions of results. If the random number generation mechanism is not truly random, it can introduce bias, for example, by skewing selections toward items at the end or beginning of a list.{{rp|332}}
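
The randomness caveat can be illustrated with the classic case of "modulo bias" (an illustrative sketch, not drawn from the cited source): reducing a bounded random number with the modulo operator makes items near the start of a list slightly more likely to be chosen.

<syntaxhighlight lang="python">
# Minimal sketch of how a flawed randomness mechanism skews a "fair" draw:
# reducing a small random byte with the modulo operator over-selects items
# near the start of the list.
import random
from collections import Counter

items = [f"item_{i}" for i in range(6)]

def flawed_pick(items):
    r = random.randrange(256)        # 256 is not a multiple of 6 ...
    return items[r % len(items)]     # ... so items 0-3 are drawn slightly more often

counts = Counter(flawed_pick(items) for _ in range(600000))
print(counts)
# items 0-3 each appear about 43/256 of the time, items 4-5 only about 42/256:
# a small but systematic, repeatable skew rather than a uniform distribution.
</syntaxhighlight>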

A decontextualized algorithm uses unrelated information to sort results, for example, a flight-pricing algorithm that sorts results by alphabetical order would be biased in favor of American Airlines over United Airlines.{{rp|332}} The opposite may also apply, in which results are evaluated in contexts different from those in which they were collected. Data may be collected without crucial external context: for example, when facial recognition software is used by surveillance cameras, but evaluated by remote staff in another country or region, or evaluated by non-human algorithms with no awareness of what takes place beyond the camera's field of vision. This could create an incomplete understanding of a crime scene, for example, potentially mistaking bystanders for those who committed the crime.{{rp|574}}

Lastly, technical bias can be created by attempting to formalize decisions into concrete steps on the assumption that human behavior works in the same way. For example, software weighs data points to determine whether a defendant should accept a plea bargain, while ignoring the impact of emotion on a jury.{{rp|332}} Another unintended result of this form of bias was found in the plagiarism-detection software Turnitin, which compares student-written texts to information found online and returns a probability score that the student's work is copied. Because the software compares long strings of text, it is more likely to identify non-native speakers of English than native speakers, as the latter group might be better able to change individual words, break up strings of plagiarized text, or obscure copied passages through synonyms. Because it is easier for native speakers to evade detection as a result of the technical constraints of the software, this creates a scenario where Turnitin flags non-native speakers of English for plagiarism while allowing more native speakers to evade detection.{{rp|21–22}}

= Emergent =

Emergent bias is the result of the use and reliance on algorithms across new or unanticipated contexts.{{rp|334}} Algorithms may not have been adjusted to consider new forms of knowledge, such as new drugs or medical breakthroughs, new laws, business models, or shifting cultural norms.{{rp|334,336}} This may exclude groups through technology, without providing clear outlines to understand who is responsible for their exclusion.{{rp|179}}{{rp|294}} Similarly, problems may emerge when training data (the samples "fed" to a machine, by which it models certain conclusions) do not align with contexts that an algorithm encounters in the real world.{{cite web|url=http://culturedigitally.org/2014/06/algorithm-draft-digitalkeyword/|title=Algorithm [draft] [#digitalkeywords] – Culture Digitally|last1=Gillespie|first1=Tarleton|website=culturedigitally.org|date=June 25, 2014 |access-date=20 November 2017}}

In 1990, an example of emergent bias was identified in the software used to place US medical students into residencies, the National Residency Match Program (NRMP).{{rp|338}} The algorithm was designed at a time when few married couples would seek residencies together. As more women entered medical schools, more students were likely to request a residency alongside their partners. The process called for each applicant to provide a list of preferences for placement across the US, which was then sorted and assigned when a hospital and an applicant both agreed to a match. In the case of married couples where both sought residencies, the algorithm weighed the location choices of the higher-rated partner first. The result was a frequent assignment of highly preferred schools to the first partner and lower-preferred schools to the second partner, rather than sorting for compromises in placement preference.{{rp|338}}{{cite journal|last1=Roth|first1=A. E.|title=New physicians: A natural experiment in market organization|journal=Science|date=14 December 1990|volume=250|issue=4987|pages=1524–1528|url=https://stanford.edu/~alroth/science.html|access-date=18 November 2017|bibcode=1990Sci...250.1524R|doi=10.1126/science.2274783|pmid=2274783|s2cid=23259274}}

Additional emergent biases include:

== Correlations ==

Unpredictable correlations can emerge when large data sets are compared to each other. For example, data collected about web-browsing patterns may align with signals marking sensitive data (such as race or sexual orientation). By selecting according to certain behavior or browsing patterns, the end effect would be almost identical to discrimination through the use of direct race or sexual orientation data.{{rp|6}} In other cases, the algorithm draws conclusions from correlations, without being able to understand those correlations. For example, one triage program gave lower priority to pneumonia patients who had asthma than to pneumonia patients who did not. The algorithm did this because it simply compared survival rates: asthmatics with pneumonia are in fact at the highest risk, but precisely because hospitals have historically given such patients the best and most immediate care, their recorded survival rates are high, leading the algorithm to treat them as lower-risk.{{cite news|last1=Kuang|first1=Cliff|title=Can A.I. Be Taught to Explain Itself?|url=https://www.nytimes.com/2017/11/21/magazine/can-ai-be-taught-to-explain-itself.html|access-date=26 November 2017|work=The New York Times Magazine|date=21 November 2017}}
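
A small simulation on synthetic data (an illustrative sketch, not from the cited sources) shows how filtering on a seemingly neutral behavioral signal that correlates with a protected attribute can reproduce much of the effect of filtering on that attribute directly.

<syntaxhighlight lang="python">
# Hedged sketch on synthetic data: filtering on a "neutral" behavioral signal
# that is correlated with a protected attribute yields almost the same
# exclusion as filtering on the protected attribute itself.
import random
random.seed(1)

population = []
for _ in range(10000):
    protected = random.random() < 0.5
    # A browsing signal that is not the protected attribute, but correlates with it.
    signal = random.random() < (0.8 if protected else 0.2)
    population.append({"protected": protected, "signal": signal})

selected = [p for p in population if not p["signal"]]
share = sum(p["protected"] for p in selected) / len(selected)
print(f"protected-group share after 'neutral' filter: {share:.2f}")
# The protected group falls from 50% of the population to roughly 20% of those
# selected, without the protected attribute ever being used explicitly.
</syntaxhighlight>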

== Unanticipated uses ==

Emergent bias can occur when an algorithm is used by unanticipated audiences. For example, machines may require that users can read, write, or understand numbers, or relate to an interface using metaphors that they do not understand.{{rp|334}} These exclusions can become compounded, as biased or exclusionary technology is more deeply integrated into society.{{rp|179}}

Apart from exclusion, unanticipated uses may emerge from the end user relying on the software rather than their own knowledge. In one example, an unanticipated user group led to algorithmic bias in the UK, when the British Nationality Act Program was created as a proof-of-concept by computer scientists and immigration lawyers to evaluate suitability for British citizenship. The designers had access to legal expertise beyond that of the end users in immigration offices, whose understanding of both software and immigration law would likely have been unsophisticated. The agents administering the questions relied entirely on the software, which excluded alternative pathways to citizenship, and used the software even after new case law and legal interpretations led the algorithm to become outdated. As a result of designing an algorithm for users assumed to be legally savvy on immigration law, the software's algorithm indirectly led to bias in favor of applicants who fit a very narrow set of legal criteria set by the algorithm, rather than by the broader criteria of British immigration law.{{rp|342}}

== Feedback loops ==

Emergent bias may also create a feedback loop, or recursion, if data collected for an algorithm results in real-world responses which are fed back into the algorithm.{{cite news|last1=Jouvenal|first1=Justin|title=Police are using software to predict crime. Is it a 'holy grail' or biased against minorities?|url=https://www.washingtonpost.com/local/public-safety/police-are-using-software-to-predict-crime-is-it-a-holy-grail-or-biased-against-minorities/2016/11/17/525a6649-0472-440a-aae1-b283aa8e5de8_story.html|newspaper=Washington Post|access-date=25 November 2017|date=17 November 2016}}{{cite web|last1=Chamma|first1=Maurice|title=Policing the Future|url=https://www.themarshallproject.org/2016/02/03/policing-the-future?ref=hp-2-111#.UyhBLnmlj|website=The Marshall Project|access-date=25 November 2017|date=2016-02-03}} For example, simulations of the predictive policing software (PredPol), deployed in Oakland, California, suggested an increased police presence in black neighborhoods based on crime data reported by the public.{{cite journal|last1=Lum|first1=Kristian|last2=Isaac|first2=William|title=To predict and serve?|journal=Significance|date=October 2016|volume=13|issue=5|pages=14–19|doi=10.1111/j.1740-9713.2016.00960.x|doi-access=free}} The simulation showed that the public reported crime based on the sight of police cars, regardless of what police were doing. The simulation interpreted police car sightings in modeling its predictions of crime, and would in turn assign an even larger increase of police presence within those neighborhoods.{{cite web|last1=Smith|first1=Jack|title=Predictive policing only amplifies racial bias, study shows|url=https://mic.com/articles/156286/crime-prediction-tool-pred-pol-only-amplifies-racially-biased-policing-study-shows|website=Mic|date=9 October 2016 |access-date=25 November 2017}}{{cite web|last1=Lum|first1=Kristian|last2=Isaac|first2=William|title=FAQs on Predictive Policing and Bias|url=https://hrdag.org/2016/11/04/faqs-predpol/|website=HRDAG|access-date=25 November 2017|date=1 October 2016}} The Human Rights Data Analysis Group, which conducted the simulation, warned that in places where racial discrimination is a factor in arrests, such feedback loops could reinforce and perpetuate racial discrimination in policing. Another well-known example of such behavior is COMPAS, software that estimates an individual's risk of reoffending. The software has often been criticized for being far more likely to label Black individuals as likely offenders than others; when those individuals later become registered offenders, that data is fed back into the system, further reinforcing the bias created by the dataset the algorithm is acting on.{{Cite journal |last1=Bahl |first1=Utsav |last2=Topaz |first2=Chad |last3=Obermüller |first3=Lea |last4=Goldstein |first4=Sophie |last5=Sneirson |first5=Mira |date=May 21, 2024 |title=Algorithms in Judges' Hands: Incarceration and Inequity in Broward County, Florida |url=https://www.uclalawreview.org/algorithms-in-judges-hands-incarceration-and-inequity-in-broward-county-florida/ |journal=UCLA Law Review |volume=71 |issue=246}}
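
The dynamic can be sketched with a toy simulation (an illustration loosely modeled on the cited PredPol analysis, not a reproduction of it), in which patrols follow recorded incidents and patrol presence in turn inflates what gets recorded.

<syntaxhighlight lang="python">
# Simplified feedback-loop sketch: patrols are sent to the district with the most
# recorded incidents, and patrol presence itself generates additional recorded
# incidents, so a small initial reporting gap hardens into a persistent disparity.
true_rate = {"district_1": 10, "district_2": 10}   # identical underlying crime rates
recorded  = {"district_1": 12, "district_2": 8}    # small initial reporting gap

for year in range(5):
    # "Hotspot" allocation: the extra patrol goes wherever more crime was recorded.
    hotspot = max(recorded, key=recorded.get)
    # Patrol presence increases how much of the underlying crime gets observed.
    for d in recorded:
        recorded[d] = true_rate[d] + (5 if d == hotspot else 0)
    print(year, recorded)

# district_1 ends up recorded at 15 incidents per year and district_2 at 10,
# even though their true rates are identical; the algorithm keeps
# "confirming" its own earlier output.
</syntaxhighlight>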

Recommender systems such as those used to recommend online videos or news articles can create feedback loops.{{Cite book|last1=Sun|first1=Wenlong|last2=Nasraoui|first2=Olfa|last3=Shafto|first3=Patrick|title=Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management |chapter=Iterated Algorithmic Bias in the Interactive Machine Learning Process of Information Filtering |date=2018|location=Seville, Spain|publisher=SCITEPRESS - Science and Technology Publications|pages=110–118|doi=10.5220/0006938301100118|isbn=9789897583308|doi-access=free}} When users click on content that is suggested by algorithms, it influences the next set of suggestions.{{Cite journal|last1=Sinha|first1=Ayan|last2=Gleich|first2=David F.|last3=Ramani|first3=Karthik|date=2018-08-09|title=Gauss's law for networks directly reveals community boundaries|journal=Scientific Reports|volume=8|issue=1|pages=11909|doi=10.1038/s41598-018-30401-0|pmid=30093660|pmc=6085300|issn=2045-2322|bibcode=2018NatSR...811909S}} Over time this may lead to users entering a filter bubble and being unaware of important or useful content.{{Cite web|url=https://qz.com/1194566/google-is-finally-admitting-it-has-a-filter-bubble-problem/|title=Google is finally admitting it has a filter-bubble problem|last1=Hao|first1=Karen|website=Quartz|date=February 2018 |access-date=2019-02-26}}{{Cite web|url=http://fortune.com/2017/04/25/facebook-related-articles-filter-bubbles/|title=Facebook Is Testing This New Feature to Fight 'Filter Bubbles'|website=Fortune|access-date=2019-02-26}}

Impact

= Commercial influences =

Corporate algorithms could be skewed to invisibly favor financial arrangements or agreements between companies, without the knowledge of a user who may mistake the algorithm as being impartial. For example, American Airlines created a flight-finding algorithm in the 1980s. The software presented a range of flights from various airlines to customers, but weighed factors that boosted its own flights, regardless of price or convenience. In testimony to the United States Congress, the president of the airline stated outright that the system was created with the intention of gaining competitive advantage through preferential treatment.{{rp|2}}{{cite journal|last1=Friedman|first1=Batya|last2=Nissenbaum|first2=Helen|title=Bias in Computer Systems|journal=ACM Transactions on Information Systems|date=July 1996|volume=14|issue=3|pages=330–347|url=https://nissenbaum.tech.cornell.edu/papers/biasincomputers.pdf|access-date=10 March 2019|doi=10.1145/230538.230561|s2cid=207195759}}{{rp|331}}

In a 1998 paper describing Google, the company's founders noted that they had adopted a policy of transparency in search results regarding paid placement, arguing that "advertising-funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers."{{cite web|last1=Brin|first1=Sergey|last2=Page|first2=Lawrence|title=The Anatomy of a Search Engine|url=http://www7.scu.edu.au/1921/com1921.htm|website=www7.scu.edu.au|access-date=18 November 2017|archive-url=https://web.archive.org/web/20190702020902/http://www7.scu.edu.au/1921/com1921.htm|archive-date=2 July 2019|url-status=dead}} This bias would be an "invisible" manipulation of the user.{{cite journal|last1=Sandvig|first1=Christian|last2=Hamilton|first2=Kevin|author-link3=Karrie Karahalios|last3=Karahalios|first3=Karrie|last4=Langbort|first4=Cedric|title=Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms|journal=64th Annual Meeting of the International Communication Association|date=22 May 2014|url=http://www-personal.umich.edu/~csandvig/research/Auditing%20Algorithms%20--%20Sandvig%20--%20ICA%202014%20Data%20and%20Discrimination%20Preconference.pdf|access-date=18 November 2017}}{{rp|3}}

= Voting behavior =

A series of studies about undecided voters in the US and in India found that search engine results were able to shift voting outcomes by about 20%. The researchers concluded that candidates have "no means of competing" if an algorithm, with or without intent, boosted page listings for a rival candidate.{{cite journal|last1=Epstein|first1=Robert|last2=Robertson|first2=Ronald E.|title=The search engine manipulation effect (SEME) and its possible impact on the outcomes of elections|journal=Proceedings of the National Academy of Sciences|date=18 August 2015|volume=112|issue=33|pages=E4512–E4521|doi=10.1073/pnas.1419828112|pmid=26243876|bibcode=2015PNAS..112E4512E|pmc=4547273|doi-access=free}} Facebook users who saw messages related to voting were more likely to vote. A 2010 randomized trial of Facebook users showed a 20% increase (340,000 votes) among users who saw messages encouraging voting, as well as images of their friends who had voted.{{cite journal|last1=Bond|first1=Robert M.|last2=Fariss|first2=Christopher J.|last3=Jones|first3=Jason J.|last4=Kramer|first4=Adam D. I.|last5=Marlow|first5=Cameron|last6=Settle|first6=Jaime E.|last7=Fowler|first7=James H.|title=A 61-million-person experiment in social influence and political mobilization|journal=Nature|date=13 September 2012|volume=489|issue=7415|pages=295–8|doi=10.1038/nature11421|pmid=22972300|pmc=3834737|issn=0028-0836|bibcode=2012Natur.489..295B}} Legal scholar Jonathan Zittrain has warned that this could create a "digital gerrymandering" effect in elections, "the selective presentation of information by an intermediary to meet its agenda, rather than to serve its users", if intentionally manipulated.{{cite journal|last1=Zittrain|first1=Jonathan|title=Engineering an Election|journal=Harvard Law Review Forum|date=2014|volume=127|pages=335–341|url=http://cdn.harvardlawreview.org/wp-content/uploads/2014/06/vol127_Symposium_Zittrain.pdf|access-date=19 November 2017|archive-date=4 March 2021|archive-url=https://web.archive.org/web/20210304064053/http://cdn.harvardlawreview.org/wp-content/uploads/2014/06/vol127_Symposium_Zittrain.pdf|url-status=dead}}{{rp|335}}

= Gender discrimination =

In 2016, the professional networking site LinkedIn was discovered to recommend male variations of women's names in response to search queries. The site did not make similar recommendations in searches for male names. For example, "Andrea" would bring up a prompt asking if users meant "Andrew", but queries for "Andrew" did not ask if users meant to find "Andrea". The company said this was the result of an analysis of users' interactions with the site.{{cite web|last1=Day|first1=Matt|title=How LinkedIn's search engine may reflect a gender bias|url=https://www.seattletimes.com/business/microsoft/how-linkedins-search-engine-may-reflect-a-bias/|website=The Seattle Times|access-date=25 November 2017|date=31 August 2016}}

In 2012, the department store franchise Target was cited for gathering data points to infer when women customers were pregnant, even if they had not announced it, and then sharing that information with marketing partners.{{cite journal|last1=Crawford|first1=Kate|last2=Schultz|first2=Jason|title=Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms|journal=Boston College Law Review|date=2014|volume=55|issue=1|pages=93–128|url=http://lawdigitalcommons.bc.edu/bclr/vol55/iss1/4/|access-date=18 November 2017}}{{rp|94}}{{Cite news|last1=Duhigg|first1=Charles|title=How Companies Learn Your Secrets|url=https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html|newspaper=The New York Times Magazine |access-date=18 November 2017|date=16 February 2012}} Because the data had been predicted, rather than directly observed or reported, the company had no legal obligation to protect the privacy of those customers.{{rp|98}}

Web search algorithms have also been accused of bias. Google's results may prioritize pornographic content in search terms related to sexuality, for example, "lesbian". This bias extends to the search engine showing popular but sexualized content in neutral searches. For example, "Top 25 Sexiest Women Athletes" articles displayed as first-page results in searches for "women athletes".{{cite journal|last1=Noble|first1=Safiya|author-link=Safiya Noble|title=Missed Connections: What Search Engines Say about Women|journal=Bitch |date=2012|volume=12|issue=4|pages=37–41|url=https://safiyaunoble.files.wordpress.com/2012/03/54_search_engines.pdf}}{{rp|31}} In 2017, Google adjusted these results along with others that surfaced hate groups, racist views, child abuse and pornography, and other upsetting and offensive content.{{cite news|last1=Guynn|first1=Jessica|title=Google starts flagging offensive content in search results|url=https://www.usatoday.com/story/tech/news/2017/03/16/google-flags-offensive-content-search-results/99235548/|access-date=19 November 2017|work=USA TODAY|agency=USA Today|date=16 March 2017}} Other examples include the display of higher-paying jobs to male applicants on job search websites.{{cite web|url=https://www.technologyreview.com/s/539021/probing-the-dark-side-of-googles-ad-targeting-system/|title=Study Suggests Google's Ad-Targeting System May Discriminate|last1=Simonite|first1=Tom|website=MIT Technology Review|publisher=Massachusetts Institute of Technology|access-date=17 November 2017}} Researchers have also identified that machine translation exhibits a strong tendency towards male defaults.{{Cite arXiv|eprint = 1809.02208|last1 = Prates|first1 = Marcelo O. R.|last2 = Avelar|first2 = Pedro H. C.|last3 = Lamb|first3 = Luis|title = Assessing Gender Bias in Machine Translation -- A Case Study with Google Translate|year = 2018|class = cs.CY}} In particular, this is observed in fields linked to unbalanced gender distribution, including STEM occupations.{{Cite journal |doi = 10.1007/s00521-019-04144-6|title = Assessing gender bias in machine translation: A case study with Google Translate|journal = Neural Computing and Applications|year = 2019|last1 = Prates|first1 = Marcelo O. R.|last2 = Avelar|first2 = Pedro H.|last3 = Lamb|first3 = Luís C.|volume = 32|issue = 10|pages = 6363–6381|arxiv = 1809.02208|s2cid = 52179151}} In fact, current machine translation systems fail to reproduce the real world distribution of female workers.{{cite news |last1=Claburn |first1=Thomas |title=Boffins bash Google Translate for sexism |url=https://www.theregister.com/2018/09/10/boffins_bash_google_translate_for_sexist_language/ |access-date=28 April 2022 |work=The Register |date=10 September 2018 |language=en}}

In 2015, Amazon.com turned off an AI system it developed to screen job applications when the company realized the system was biased against women.{{cite news |last1=Dastin |first1=Jeffrey |title=Amazon scraps secret AI recruiting tool that showed bias against women |url=https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G |work=Reuters |date=October 9, 2018}} The recruitment tool excluded applicants who attended all-women's colleges and resumes that included the word "women's".{{Cite web|last=Vincent|first=James|date=10 October 2018|title=Amazon reportedly scraps internal AI recruiting tool that was biased against women|url=https://www.theverge.com/2018/10/10/17958784/ai-recruiting-tool-bias-amazon-report|website=The Verge}} A similar problem emerged with music streaming services: in 2019, the recommender system algorithm used by Spotify was found to be biased against women artists.{{Cite web|title=Reflecting on Spotify's Recommender System – SongData|date=October 2019 |url=https://songdata.ca/2019/10/01/reflecting-on-spotifys-recommender-system/|access-date=2020-08-07|language=en-US}} Spotify's song recommendations suggested more male artists than women artists.

= Racial and ethnic discrimination =

Algorithms have been criticized as a method for obscuring racial prejudices in decision-making.{{cite journal |last1=Buolamwini |first1=Joy |last2=Gebru |first2=Timnit |title=Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification |journal=Proceedings of Machine Learning Research |date=21 January 2018 |volume=81 |issue=2018 |pages=77–91 |url=http://proceedings.mlr.press/v81/buolamwini18a.html |access-date=27 September 2020}}{{cite book |last1=Noble |first1=Safiya Umoja |title=Algorithms of Oppression: How Search Engines Reinforce Racism |date=20 February 2018 |publisher=NYU Press |location=New York |isbn=978-1479837243}}{{cite book|title=The New Media of Surveillance|last1=Nakamura|first1=Lisa|date=2009|publisher=Routledge|isbn=978-0-415-56812-8|editor1-last=Magnet|editor1-first=Shoshana|location=London|pages=149–162|editor2-last=Gates|editor2-first=Kelly}}{{rp|158}} Because of how certain races and ethnic groups were treated in the past, data can often contain hidden biases.{{Cite journal |author1=Marco Marabelli |author2=Sue Newell |author3=Valerie Handunge |title=The lifecycle of algorithmic decision-making systems: Organizational choices and ethical challenges |url=https://www.sciencedirect.com/science/article/abs/pii/S0963868721000305 |journal=Journal of Strategic Information Systems |year=2021 |volume=30 |issue=3 |pages=1–15|doi=10.1016/j.jsis.2021.101683 }} For example, black people are likely to receive longer sentences than white people who committed the same crime.{{Cite journal|last1=Alexander|first1=Rudolph|last2=Gyamerah|first2=Jacquelyn|date=September 1997|title=Differential Punishing of African Americans and Whites Who Possess Drugs: A Just Policy or a Continuation of the Past?|journal=Journal of Black Studies|volume=28|issue=1|pages=97–111|doi=10.1177/002193479702800106|s2cid=152043501|issn=0021-9347}}{{Cite journal|last=Petersilia|first=Joan|date=January 1985|title=Racial Disparities in the Criminal Justice System: A Summary|journal=Crime & Delinquency|volume=31|issue=1|pages=15–34|doi=10.1177/0011128785031001002|s2cid=146588630|issn=0011-1287}} This could potentially mean that a system amplifies the original biases in the data.

In 2015, Google apologized when a couple of black users complained that an image-identification algorithm in its Photos application identified them as gorillas.{{cite news|last1=Guynn|first1=Jessica|title=Google Photos labeled black people 'gorillas'|url=https://www.usatoday.com/story/tech/2015/07/01/google-apologizes-after-photos-identify-black-people-as-gorillas/29567465/|access-date=18 November 2017|work=USA TODAY|agency=USA Today|date=1 July 2015}} In 2010, Nikon cameras were criticized when image-recognition algorithms consistently asked Asian users if they were blinking.{{Cite magazine|last1=Rose|first1=Adam|title=Are Face-Detection Cameras Racist?|url=http://content.time.com/time/business/article/0,8599,1954643,00.html|magazine=Time|access-date=18 November 2017|date=22 January 2010}} Such examples are the product of bias in biometric data sets. Biometric data is drawn from aspects of the body, including racial features either observed or inferred, which can then be transferred into data points.{{rp|154}} Speech recognition technology can have different accuracies depending on the user's accent. This may be caused by a lack of training data for speakers of that accent.{{Cite news|url=https://www.washingtonpost.com/graphics/2018/business/alexa-does-not-understand-your-accent/|title=Alexa does not understand your accent|newspaper=Washington Post}}

Biometric data about race may also be inferred, rather than observed. For example, a 2012 study showed that names commonly associated with blacks were more likely to yield search results implying arrest records, regardless of whether there is any police record of that individual's name.{{cite arXiv|last1=Sweeney|first1=Latanya|title=Discrimination in Online Ad Delivery|date=28 January 2013|class=cs.IR|eprint=1301.6822}} A 2015 study also found that Black and Asian people are assumed to have lesser functioning lungs due to racial and occupational exposure data not being incorporated into the prediction algorithm's model of lung function.{{Cite journal|last=Braun|first=Lundy|date=2015|title=Race, ethnicity and lung function: A brief history|journal=Canadian Journal of Respiratory Therapy |volume=51|issue=4|pages=99–101|issn=1205-9838|pmc=4631137|pmid=26566381}}{{Cite journal|last1=Robinson|first1=Whitney R|last2=Renson|first2=Audrey|last3=Naimi|first3=Ashley I|date=2020-04-01|title=Teaching yourself about structural racism will improve your machine learning|journal=Biostatistics|volume=21|issue=2|pages=339–344|doi=10.1093/biostatistics/kxz040|pmid=31742353|pmc=7868043|issn=1465-4644|doi-access=free}}

In 2019, a research study revealed that a healthcare algorithm sold by Optum favored white patients over sicker black patients. The algorithm predicts how much patients would cost the health-care system in the future. However, cost is not race-neutral, as black patients incurred about $1,800 less in medical costs per year than white patients with the same number of chronic conditions, which led to the algorithm scoring white patients as equally at risk of future health problems as black patients who suffered from significantly more diseases.{{Cite news|url=https://www.washingtonpost.com/health/2019/10/24/racial-bias-medical-algorithm-favors-white-patients-over-sicker-black-patients/|title=Racial bias in a medical algorithm favors white patients over sicker black patients|first=Carolyn Y. |last=Johnson |date=24 October 2019|newspaper=Washington Post|language=en|access-date=2019-10-28}}

A study conducted by researchers at UC Berkeley in November 2019 found that mortgage algorithms used by FinTech companies discriminated against Latino and African American borrowers. The discrimination operated through measures of "creditworthiness", which U.S. fair-lending law allows lenders to use when determining whether an individual should receive a loan.{{Cite journal|last1=Bartlett|first1=Robert|last2=Morse|first2=Adair|last3=Stanton|first3=Richard|last4=Wallace|first4=Nancy|date=June 2019|title=Consumer-Lending Discrimination in the FinTech Era|url=http://www.nber.org/papers/w25943|journal=NBER Working Paper No. 25943|series=Working Paper Series |doi=10.3386/w25943|s2cid=242410791}}{{Primary source inline|date=December 2019}}

Another study, published in August 2024, investigated how large language models perpetuate covert racism, particularly through dialect prejudice against speakers of African American English (AAE). It found that these models exhibit more negative covert stereotypes about AAE speakers than any recorded human biases, even though their overt stereotypes are more positive. This discrepancy raises concerns about the potential harmful consequences of such biases in decision-making processes.{{cite journal |last1=Hofmann |first1=V. |last2=Kalluri |first2=P. R. |last3=Jurafsky |first3=D. |display-authors=etal |title=AI generates covertly racist decisions about people based on their dialect |journal=Nature |year=2024 |volume=633 |pages=147–154 |doi=10.1038/s41586-024-07856-5}}

A study published by the Anti-Defamation League in 2025 found that several major LLMs, including ChatGPT, Llama, Claude, and Gemini, showed antisemitic bias.{{Cite web |last=Stub |first=Zev |title=Study: ChatGPT, Meta's Llama and all other top AI models show anti-Jewish, anti-Israel bias |url=https://www.timesofisrael.com/study-chatgpt-metas-llama-and-all-other-top-ai-models-show-anti-jewish-anti-israel-bias/ |access-date=2025-03-27 |website=www.timesofisrael.com |language=en-US}}

A 2018 study found that commercial gender classification systems had significantly higher error rates for darker-skinned women, with error rates up to 34.7%, compared to near-perfect accuracy for lighter-skinned men.{{Cite conference |last1=Buolamwini |first1=J. |last2=Gebru |first2=T. |title=Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification |book-title=Proceedings of the 1st Conference on Fairness, Accountability and Transparency |pages=77–91 |year=2018 |url=https://proceedings.mlr.press/v81/buolamwini18a.html |access-date=April 30, 2025}}

== Online hate speech ==

In 2017 a Facebook algorithm designed to remove online hate speech was found to advantage white men over black children when assessing objectionable content, according to internal Facebook documents.{{cite web|url=https://www.propublica.org/article/facebook-hate-speech-censorship-internal-documents-algorithms|title=Facebook's Secret Censorship Rules Protect White Men From Hate Speech But Not Black Children — ProPublica|last1=Angwin|first1=Julia|last2=Grassegger|first2=Hannes|date=28 June 2017|website=ProPublica|access-date=20 November 2017}} The algorithm, which is a combination of computer programs and human content reviewers, was created to protect broad categories rather than specific subsets of categories. For example, posts denouncing "Muslims" would be blocked, while posts denouncing "Radical Muslims" would be allowed. An unanticipated outcome of the algorithm was to allow hate speech against black children, because such posts denounce the "children" subset of blacks rather than "all blacks", whereas posts denouncing "all white men" would trigger a block, because whites and males are considered broad categories rather than subsets. Facebook was also found to allow ad purchasers to target "Jew haters" as a category of users, which the company said was an inadvertent outcome of algorithms used in assessing and categorizing data. The company's design also allowed ad buyers to block African-Americans from seeing housing ads.{{cite news|url=https://www.propublica.org/article/facebook-enabled-advertisers-to-reach-jew-haters|title=Facebook Enabled Advertisers to Reach 'Jew Haters' — ProPublica|last1=Angwin|first1=Julia|date=14 September 2017|work=ProPublica|access-date=20 November 2017|last2=Varner|first2=Madeleine|last3=Tobin|first3=Ariana}}

While algorithms are used to track and block hate speech, some were found to be 1.5 times more likely to flag information posted by Black users and 2.2 times more likely to flag information as hate speech if written in African American English.{{Cite conference|url=https://homes.cs.washington.edu/~msap/pdfs/sap2019risk.pdf|title=The Risk of Racial Bias in Hate Speech Detection|last1=Sap|first1=Maarten|last2=Card|first2=Dallas|last3=Gabriel|first3=Saadia|last4=Choi|first4=Yejin|last5=Smith|first5=Noah A.|book-title=Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics|publisher=Association for Computational Linguistics|location=Florence, Italy|date=28 July – 2 August 2019|pages=1668–1678|url-status=live|archive-url=https://web.archive.org/web/20190814194616/https://homes.cs.washington.edu/~msap/pdfs/sap2019risk.pdf |archive-date=2019-08-14 }} Without context, slurs and epithets were flagged even when used by communities which have re-appropriated them.{{Cite web|url=https://www.vox.com/recode/2019/8/15/20806384/social-media-hate-speech-bias-black-african-american-facebook-twitter|title=The algorithms that detect hate speech online are biased against black people|last=Ghaffary|first=Shirin|website=Vox|date=15 August 2019 |access-date=19 February 2020}}

A 2022 study of Reddit moderation found that 85 of 100 examined subreddits tended to remove various norm violations, including misogynistic slurs and racist hate speech, highlighting the prevalence of such content in online communities.{{cite journal |last1=Nakajima Wickham |first1=E. |last2=Öhman |first2=E. |year=2022 |title=Hate speech, censorship, and freedom of speech: The changing policies of Reddit |journal=Journal of Data Mining & Digital Humanities |issue=NLP4DH |doi=10.46298/jdmdh.9226}} As platforms like Reddit update their hate speech policies, they must balance free expression with the protection of marginalized communities, emphasizing the need for context-sensitive moderation and nuanced algorithms.

== Surveillance ==

Surveillance camera software may be considered inherently political because it requires algorithms to distinguish normal from abnormal behaviors, and to determine who belongs in certain locations at certain times.{{rp|572}} The ability of such algorithms to recognize faces across a racial spectrum has been shown to be limited by the racial diversity of images in its training database; if the majority of photos belong to one race or gender, the software is better at recognizing other members of that race or gender.{{cite journal|last1=Furl|first1=N|date=December 2002|title=Face recognition algorithms and the other-race effect: computational mechanisms for a developmental contact hypothesis|journal=Cognitive Science|volume=26|issue=6|pages=797–815|doi=10.1207/s15516709cog2606_4|doi-access=free}} However, even audits of these image-recognition systems are ethically fraught, and some scholars have suggested the technology's context will always have a disproportionate impact on communities whose actions are over-surveilled.{{cite book |last1=Raji |first1=Inioluwa Deborah |last2=Gebru |first2=Timnit |last3=Mitchell |first3=Margaret |last4=Buolamwini |first4=Joy |last5=Lee |first5=Joonseok |last6=Denton |first6=Emily |title=Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society |chapter=Saving Face |date=7 February 2020 |pages=145–151 |doi=10.1145/3375627.3375820 |chapter-url=https://dl.acm.org/doi/10.1145/3375627.3375820 |publisher=Association for Computing Machinery|arxiv=2001.00964 |isbn=9781450371100 |s2cid=209862419 }} For example, a 2002 analysis of software used to identify individuals in CCTV images found several examples of bias when run against criminal databases. The software was assessed as identifying men more frequently than women, older people more frequently than the young, and identified Asians, African-Americans and other races more often than whites.{{rp|190}} A 2018 study found that facial recognition software was most accurate at identifying light-skinned (typically European) males, with slightly lower accuracy rates for light-skinned females. Dark-skinned males and females were significantly less likely to be accurately identified by facial recognition software. These disparities are attributed to the under-representation of darker-skinned participants in data sets used to develop this software.{{cite web | title=Facial Recognition Is Accurate, if You're a White Guy | website=The New York Times | date=2018-02-09 | url=https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html | access-date=2023-08-24}}{{Cite journal|last1=Buolamwini|first1=Joy|last2=Gebru|first2=Timnit|date=2018|title=Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification|url=http://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf|journal=Proceedings of Machine Learning Research|volume=81|page=1|via=MLR Press}}

= Discrimination against the LGBTQ community =

In 2011, users of the gay hookup application Grindr reported that the Android store's recommendation algorithm was linking Grindr to applications designed to find sex offenders, which critics said inaccurately related homosexuality with pedophilia. Writer Mike Ananny criticized this association in The Atlantic, arguing that such associations further stigmatized gay men.{{cite web|last1=Ananny|first1=Mike|title=The Curious Connection Between Apps for Gay Men and Sex Offenders|url=https://www.theatlantic.com/technology/archive/2011/04/the-curious-connection-between-apps-for-gay-men-and-sex-offenders/237340/|website=The Atlantic|access-date=18 November 2017|date=2011-04-14}} In 2009, online retailer Amazon de-listed 57,000 books after an algorithmic change expanded its "adult content" blacklist to include any book addressing sexuality or gay themes, such as the critically acclaimed novel Brokeback Mountain.{{cite web|last1=Kafka|first1=Peter|title=Did Amazon Really Fail This Weekend? The Twittersphere Says 'Yes,' Online Retailer Says 'Glitch.'|url=http://allthingsd.com/20090412/did-amazon-really-fail-this-weekend-the-twittersphere-says-yes/|website=AllThingsD|access-date=22 November 2017}}{{rp|5}}{{cite web|last1=Kafka|first1=Peter|title=Amazon Apologizes for 'Ham-fisted Cataloging Error'|url=http://allthingsd.com/20090413/amazon-apologizes-for-ham-fisted-cataloging-error/|website=AllThingsD|access-date=22 November 2017}}

In 2019, it was found that on Facebook, searches for "photos of my female friends" yielded suggestions such as "in bikinis" or "at the beach". In contrast, searches for "photos of my male friends" yielded no results.{{Cite news|url=https://www.wired.com/story/facebook-female-friends-photo-search-bug/|title=A 'Sexist' Search Bug Says More About Us Than Facebook|last=Matsakis|first=Louise|date=2019-02-22|magazine=Wired|access-date=2019-02-26|issn=1059-1028}}

Facial recognition technology has been seen to cause problems for transgender individuals. In 2018, there were reports of Uber drivers who were transgender or transitioning experiencing difficulty with the facial recognition software that Uber implements as a built-in security measure. As a result, some trans Uber drivers had their accounts suspended, costing them fares and potentially their jobs, because the facial recognition software had difficulty recognizing the faces of drivers who were transitioning.{{Cite web|url=https://www.vox.com/future-perfect/2019/4/19/18412674/ai-bias-facial-recognition-black-gay-transgender|title=Some AI just shouldn't exist|date=2019-04-19}} Although including trans individuals in training sets for machine learning models might appear to solve this issue, one instance in which YouTube videos of trans people were collected for use as training data proceeded without the consent of the individuals featured in the videos, raising concerns about violation of privacy.{{Cite web|url=https://www.vox.com/future-perfect/2019/4/19/18412674/ai-bias-facial-recognition-black-gay-transgender|title=Some AI just shouldn't exist|last=Samuel|first=Sigal|date=2019-04-19|website=Vox|language=en|access-date=2019-12-12}}

A 2017 study at Stanford University tested algorithms in a machine learning system said to be able to detect an individual's sexual orientation based on their facial images.{{Cite journal|last1=Wang|first1=Yilun|last2=Kosinski|first2=Michal|date=2017-02-15|title=Deep neural networks are more accurate than humans at detecting sexual orientation from facial images.|url=https://osf.io/zn79k/|journal=OSF|doi=10.17605/OSF.IO/ZN79K |language=en}} The model correctly distinguished between gay and straight men 81% of the time, and between gay and straight women 74% of the time. The study provoked a backlash from the LGBTQIA community, which feared the possible negative repercussions of such an AI system, including the risk of individuals being "outed" against their will.{{Cite news|url=https://www.theguardian.com/world/2017/sep/08/ai-gay-gaydar-algorithm-facial-recognition-criticism-stanford|title=LGBT groups denounce 'dangerous' AI that uses your face to guess sexuality|last=Levin|first=Sam|date=2017-09-09|work=The Guardian|access-date=2019-12-12|language=en-GB|issn=0261-3077}}

= Disability discrimination =

While the modalities of algorithmic fairness have been judged on the basis of different aspects of bias – like gender, race and socioeconomic status – disability is often left out of the list.{{Cite journal |last=Pal |first=G.C. |date=September 16, 2011 |title=Disability, Intersectionality and Deprivation: An Excluded Agenda |url=https://journals.sagepub.com/doi/abs/10.1177/097133361102300202?journalCode=pdsa |journal=Psychology and Developing Societies |volume=23 |issue=2 |pages=159–176 |doi=10.1177/097133361102300202 |s2cid=147322669 |via=Sagepub}}{{Cite journal |last1=Brinkman |first1=Aurora H. |last2=Rea-Sandin |first2=Gianna |last3=Lund |first3=Emily M. |last4=Fitzpatrick |first4=Olivia M. |last5=Gusman |first5=Michaela S. |last6=Boness |first6=Cassandra L. |last7=Scholars for Elevating Equity and Diversity (SEED) |date=2022-10-20 |title=Shifting the discourse on disability: Moving to an inclusive, intersectional focus. |journal=American Journal of Orthopsychiatry |volume=93 |issue=1 |pages=50–62 |language=en |doi=10.1037/ort0000653 |pmid=36265035 |pmc=9951269 |issn=1939-0025 }} The marginalization that people with disabilities currently face in society is being translated into AI systems and algorithms, creating even more exclusion.{{Cite web |last=Whittaker |first=Meredith |date=November 2019 |title=Disability, Bias, and AI |url=https://ainowinstitute.org/disabilitybiasai-2019.pdf |access-date=December 2, 2022 |archive-date=March 27, 2023 |archive-url=https://web.archive.org/web/20230327023907/https://ainowinstitute.org/disabilitybiasai-2019.pdf |url-status=dead }}{{Cite web |title=Mission — Disability is Diversity — Dear Entertainment Industry, THERE'S NO DIVERSITY, EQUITY & INCLUSION WITHOUT DISABILITY |url=https://disabilityisdiversity.com/mission |access-date=2022-12-02 |website=Disability is Diversity |language=en-US}}

The shifting nature of disability and its subjective characterization make it more difficult to address computationally. The lack of historical depth in defining disability, collecting its incidence and prevalence in questionnaires, and establishing recognition adds to the controversy and ambiguity in its quantification and calculation. The definition of disability has long been debated, shifting most recently from a medical model to a social model of disability, which holds that disability results from the mismatch between people's interactions and barriers in their environment, rather than from impairments and health conditions. Disabilities can also be situational or temporary,{{Cite web |title=Microsoft Design |url=https://www.microsoft.com/design/inclusive/ |access-date=2022-12-02 |website=www.microsoft.com |language=en-us}} and can be considered in a constant state of flux. Disabilities are incredibly diverse,{{Cite web |last=Pulrang |first=Andrew |title=4 Ways To Understand The Diversity Of The Disability Community |url=https://www.forbes.com/sites/andrewpulrang/2020/01/03/4-ways-to-understand-the-diversity-of-the-disability-community/ |access-date=2022-12-02 |website=Forbes |language=en}} fall within a large spectrum, and can be unique to each individual. People's identities can vary based on the specific types of disability they experience, how they use assistive technologies, and who they support. This high level of variability across people's experiences greatly personalizes how a disability can manifest. Overlapping identities and intersectional experiences{{Cite journal |last1=Watermeyer |first1=Brian |last2=Swartz |first2=Leslie |date=2022-10-12 |title=Disability and the problem of lazy intersectionality |url=https://doi.org/10.1080/09687599.2022.2130177 |journal=Disability & Society |volume=38 |issue=2 |pages=362–366 |doi=10.1080/09687599.2022.2130177 |s2cid=252959399 |issn=0968-7599}} are often excluded from statistics and datasets,{{Cite web |title=Disability Data Report 2021 |url=https://disabilitydata.ace.fordham.edu/reports/disability-data-initiative-2021-report/ |access-date=2022-12-02 |website=Disability Data Initiative |date=May 23, 2021 |language=en}} and are therefore underrepresented or absent in training data.{{Cite journal |last=White |first=Jason J. G. |date=2020-03-02 |title=Fairness of AI for people with disabilities: problem analysis and interdisciplinary collaboration |url=https://doi.org/10.1145/3386296.3386299 |journal=ACM SIGACCESS Accessibility and Computing |issue=125 |pages=3:1 |doi=10.1145/3386296.3386299 |s2cid=211723415 |issn=1558-2337}} As a result, machine learning models are trained inequitably and artificial intelligence systems perpetuate more algorithmic bias.{{Cite web |title=AI language models show bias against people with disabilities, study finds {{!}} Penn State University |url=https://www.psu.edu/news/information-sciences-and-technology/story/ai-language-models-show-bias-against-people-disabilities/ |access-date=2022-12-02 |website=www.psu.edu |language=en}} For example, if people with speech impairments are not included when training voice-control features and smart AI assistants, they are unable to use those features, or the responses they receive from a Google Home or Alexa are extremely poor.

Given the stereotypes and stigmas that still exist surrounding disabilities, the sensitive nature of revealing these identifying characteristics also carries vast privacy challenges. As disclosing disability information can be taboo and drive further discrimination against this population, there is a lack of explicit disability data available for algorithmic systems to interact with. People with disabilities face additional harms and risks with respect to their social support, cost of health insurance, workplace discrimination and other basic necessities upon disclosing their disability status. Algorithms are further exacerbating this gap by recreating the biases that already exist in societal systems and structures.{{Cite web |last=Givens |first=Alexandra Reeve |date=2020-02-06 |title=How Algorithmic Bias Hurts People With Disabilities |url=https://slate.com/technology/2020/02/algorithmic-bias-people-with-disabilities.html |access-date=2022-12-02 |website=Slate Magazine |language=en}}{{Cite journal |last=Morris |first=Meredith Ringel |date=2020-05-22 |title=AI and accessibility |url=https://doi.org/10.1145/3356727 |journal=Communications of the ACM |volume=63 |issue=6 |pages=35–37 |doi=10.1145/3356727 |arxiv=1908.08939 |s2cid=201645229 |issn=0001-0782}}

= Google Search =

While users generate results that are "completed" automatically, Google has failed to remove sexist and racist autocompletion text. In Algorithms of Oppression: How Search Engines Reinforce Racism, Safiya Noble notes an example of a search for "black girls", which was reported to return pornographic images. Google claimed it was unable to erase those pages unless they were considered unlawful.{{Cite book|title=Algorithms of Oppression: How Search Engines Reinforce Racism|last=Noble, Safiya Umoja|isbn=9781479837243|location=New York|oclc=987591529|date = 2018-02-20}}

Obstacles to research

Several problems impede the study of large-scale algorithmic bias, hindering the application of academically rigorous studies and public understanding.{{rp|5}}{{cite arXiv|last1=Castelnovo |first1=Alessandro |last2=Inverardi |first2=Nicole |last3=Nanino |first3=Gabriele |last4=Penco |first4=Ilaria | last5=Regoli |first5=Daniele | title=Fair Enough? A map of the current limitations to the requirements to have "fair" algorithms |date=2023 |class=cs.AI |eprint=2311.12435 }}{{cite journal |last1=Ruggieri |first1=Salvatore |last2=Alvarez |first2=Jose M |last3=Pugnana |first3=Andrea |last4=Turini |first4=Franco |title=Can We Trust Fair-AI? |journal=Proceedings of the AAAI Conference on Artificial Intelligence |date=2023 |volume=37 |issue=13 |pages=5421–15430 |doi=10.1609/aaai.v37i13.26798 |s2cid=259678387 |doi-access=free |hdl=11384/136444 |hdl-access=free }}

= Defining fairness =

{{main|Fairness (machine learning)}}

Literature on algorithmic bias has focused on the remedy of fairness, but definitions of fairness are often incompatible with each other and the realities of machine learning optimization.{{Cite web |last=Samuel |first=Sigal |date=2022-04-19 |title=Why it's so damn hard to make AI fair and unbiased |url=https://www.vox.com/future-perfect/22916602/ai-bias-fairness-tradeoffs-artificial-intelligence |access-date=2024-07-23 |website=Vox |language=en-US}}{{Cite web |last=Fioretto |first=Ferdinando |date=2024-03-19 |title=Building fairness into AI is crucial – and hard to get right |url=http://theconversation.com/building-fairness-into-ai-is-crucial-and-hard-to-get-right-220271 |access-date=2024-07-23 |website=The Conversation |language=en-US}} For example, defining fairness as an "equality of outcomes" may simply refer to a system producing the same result for all people, while fairness defined as "equality of treatment" might explicitly consider differences between individuals.{{cite arXiv |last1=Friedler |first1=Sorelle A. |last2=Scheidegger |first2=Carlos |last3=Venkatasubramanian |first3=Suresh |title=On the (im)possibility of fairness |year=2016 |class=cs.CY |eprint=1609.07236}}{{rp|2}} As a result, fairness is sometimes described as being in conflict with the accuracy of a model, suggesting innate tensions between the priorities of social welfare and the priorities of the vendors designing these systems.{{cite arXiv |last1=Hu |first1=Lily |last2=Chen |first2=Yiling |author2-link=Yiling Chen|title=Welfare and Distributional Impacts of Fair Classification |year=2018 |class=cs.LG |eprint=1807.01134 }}{{rp|2}} In response to this tension, researchers have suggested more care to the design and use of systems that draw on potentially biased algorithms, with "fairness" defined for specific applications and contexts.{{cite arXiv |last1=Dwork |first1=Cynthia |last2=Hardt |first2=Moritz |last3=Pitassi |first3=Toniann |last4=Reingold |first4=Omer |last5=Zemel |first5=Rich |title=Fairness Through Awareness |date=28 November 2011 |class=cs.CC |eprint=1104.3913}}
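The incompatibility between such definitions can be illustrated with a small worked example. The following sketch uses invented numbers (it is not drawn from the cited studies): when two groups have different base rates of the true outcome, a classifier that equalizes selection rates across groups ("equality of outcomes") generally cannot also equalize true positive rates across groups (an "equality of treatment" criterion in the sense of equal opportunity).

<syntaxhighlight lang="python">
# Hypothetical example: two groups with different base rates of the true outcome.
import numpy as np

group  = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])   # group membership
y_true = np.array([1, 1, 1, 1, 0, 1, 1, 0, 0, 0])   # true outcomes (group 0 has a higher base rate)
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0, 0, 0])   # model predictions

for g in (0, 1):
    mask = group == g
    selection_rate = y_pred[mask].mean()              # "equality of outcomes" metric
    tpr = y_pred[mask & (y_true == 1)].mean()         # "equal opportunity" metric
    print(f"group {g}: selection rate = {selection_rate:.2f}, true positive rate = {tpr:.2f}")

# Both groups have a selection rate of 0.40, yet their true positive rates are
# 0.50 and 1.00: satisfying one criterion here violates the other.
</syntaxhighlight>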

= Complexity =

Algorithmic processes are complex, often exceeding the understanding of the people who use them.{{rp|2}}{{cite journal|last1=Sandvig|first1=Christian|last2=Hamilton|first2=Kevin|last3=Karahalios|first3=Karrie|last4=Langbort|first4=Cedric|date=2014|editor1-last=Gangadharan|editor1-first=Seeta Pena|editor2-last=Eubanks|editor2-first=Virginia|editor3-last=Barocas|editor3-first=Solon|title=An Algorithm Audit|url=http://www-personal.umich.edu/~csandvig/research/An%20Algorithm%20Audit.pdf|journal=Data and Discrimination: Collected Essays}}{{rp|7}} Large-scale operations may not be understood even by those involved in creating them.{{cite web|last1=LaFrance|first1=Adrienne|title=The Algorithms That Power the Web Are Only Getting More Mysterious|url=https://www.theatlantic.com/technology/archive/2015/09/not-even-the-people-who-write-algorithms-really-know-how-they-work/406099/|website=The Atlantic|access-date=19 November 2017|date=2015-09-18}} The methods and processes of contemporary programs are often obscured by the inability to know every permutation of a code's input or output.{{rp|183}} Social scientist Bruno Latour has identified this process as blackboxing, a process in which "scientific and technical work is made invisible by its own success. When a machine runs efficiently, when a matter of fact is settled, one need focus only on its inputs and outputs and not on its internal complexity. Thus, paradoxically, the more science and technology succeed, the more opaque and obscure they become."{{Cite book|title=Pandora's Hope: Essays On the Reality of Science Studies|author=Bruno Latour|publisher=Harvard University Press|year=1999|location=Cambridge, Massachusetts}} Others have critiqued the black box metaphor, suggesting that current algorithms are not one black box, but a network of interconnected ones.{{cite book|url=https://books.google.com/books?id=ZdzMDQAAQBAJ|title=Innovative Methods in Media and Communication Research |last1=Kubitschko|first1=Sebastian|last2=Kaun|first2=Anne|date=2016|publisher=Springer|isbn=978-3-319-40700-5|access-date=19 November 2017}}{{rp|92}}

An example of this complexity can be found in the range of inputs into customizing feedback. The social media site Facebook factored in at least 100,000 data points to determine the layout of a user's social media feed in 2013.{{cite web|last1=McGee|first1=Matt|title=EdgeRank Is Dead: Facebook's News Feed Algorithm Now Has Close To 100K Weight Factors|url=https://marketingland.com/edgerank-is-dead-facebooks-news-feed-algorithm-now-has-close-to-100k-weight-factors-55908|website=Marketing Land|access-date=18 November 2017|date=16 August 2013}} Furthermore, large teams of programmers may operate in relative isolation from one another, and be unaware of the cumulative effects of small decisions within connected, elaborate algorithms.{{rp|118}} Not all code is original, and may be borrowed from other libraries, creating a complicated set of relationships between data processing and data input systems.{{cite journal|last1=Kitchin|first1=Rob|date=25 February 2016|title=Thinking critically about and researching algorithms|url=http://mural.maynoothuniversity.ie/11591/1/Kitchin_Thinking_2017.pdf|journal=Information, Communication & Society|volume=20|issue=1|pages=14–29|doi=10.1080/1369118X.2016.1154087|s2cid=13798875}}{{rp|22}}

Additional complexity occurs through machine learning and the personalization of algorithms based on user interactions such as clicks, time spent on site, and other metrics. These personal adjustments can confuse general attempts to understand algorithms.{{cite journal|last1=Granka|first1=Laura A.|date=27 September 2010|title=The Politics of Search: A Decade Retrospective|url=https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36914.pdf|journal=The Information Society|volume=26|issue=5|pages=364–374|doi=10.1080/01972243.2010.511560|s2cid=16306443|access-date=18 November 2017}}{{rp|367}}{{rp|7}} One unidentified streaming radio service reported that it used five unique music-selection algorithms it selected for its users, based on their behavior. This creates different experiences of the same streaming services between different users, making it harder to understand what these algorithms do.{{rp|5}}

Companies also run frequent A/B tests to fine-tune algorithms based on user response. For example, the search engine Bing can run up to ten million subtle variations of its service per day, creating different experiences of the service between each use and/or user.{{rp|5}}

= Lack of transparency =

Commercial algorithms are proprietary, and may be treated as trade secrets.{{rp|2}}{{rp|7}}{{rp|183}} Treating algorithms as trade secrets protects companies, such as search engines, where a transparent algorithm might reveal tactics to manipulate search rankings.{{rp|366}} This makes it difficult for researchers to conduct interviews or analysis to discover how algorithms function.{{rp|20}} Critics suggest that such secrecy can also obscure possible unethical methods used in producing or processing algorithmic output.{{rp|369}} Other critics, such as lawyer and activist Katarzyna Szymielewicz, have suggested that the lack of transparency is often disguised as a result of algorithmic complexity, shielding companies from disclosing or investigating its own algorithmic processes.{{Cite web|url=https://medium.com/@szymielewicz/black-boxed-politics-cebc0d5a54ad|title=Black-Boxed Politics|last=Szymielewicz|first=Katarzyna|date=2020-01-20|website=Medium|language=en|access-date=2020-02-11}}

= Lack of data about sensitive categories =

A significant barrier to understanding the tackling of bias in practice is that categories, such as demographics of individuals protected by anti-discrimination law, are often not explicitly considered when collecting and processing data.{{Cite journal|last1=Veale|first1=Michael|last2=Binns|first2=Reuben|date=2017|title=Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data|journal=Big Data & Society|volume=4|issue=2|pages=205395171774353|doi=10.1177/2053951717743530|ssrn=3060763|doi-access=free}} In some cases, there is little opportunity to collect this data explicitly, such as in device fingerprinting, ubiquitous computing and the Internet of Things. In other cases, the data controller may not wish to collect such data for reputational reasons, or because it represents a heightened liability and security risk. It may also be the case that, at least in relation to the European Union's General Data Protection Regulation, such data falls under the 'special category' provisions (Article 9), and therefore comes with more restrictions on potential collection and processing.

Some practitioners have tried to estimate and impute these missing sensitive categorizations in order to allow bias mitigation, for example building systems to infer ethnicity from names,{{Cite journal|last1=Elliott|first1=Marc N.|last2=Morrison|first2=Peter A.|last3=Fremont|first3=Allen|last4=McCaffrey|first4=Daniel F.|last5=Pantoja|first5=Philip|last6=Lurie|first6=Nicole|date=June 2009|title=Using the Census Bureau's surname list to improve estimates of race/ethnicity and associated disparities|journal=Health Services and Outcomes Research Methodology|volume=9|issue=2|pages=69–83|doi=10.1007/s10742-009-0047-1|s2cid=43293144|issn=1387-3741}} however this can introduce other forms of bias if not undertaken with care.{{Cite book|last1=Chen|first1=Jiahao|last2=Kallus|first2=Nathan|last3=Mao|first3=Xiaojie|last4=Svacha|first4=Geoffry|last5=Udell|first5=Madeleine|title=Proceedings of the Conference on Fairness, Accountability, and Transparency |chapter=Fairness Under Unawareness |date=2019|chapter-url=http://dl.acm.org/citation.cfm?doid=3287560.3287594|location=Atlanta, GA, USA|publisher=ACM Press|pages=339–348|doi=10.1145/3287560.3287594|isbn=9781450361255|arxiv=1811.11154|s2cid=58006233}} Machine learning researchers have drawn upon cryptographic privacy-enhancing technologies such as secure multi-party computation to propose methods whereby algorithmic bias can be assessed or mitigated without these data ever being available to modellers in cleartext.{{Cite journal|last1=Kilbertus|first1=Niki|last2=Gascon|first2=Adria|last3=Kusner|first3=Matt|last4=Veale|first4=Michael|last5=Gummadi|first5=Krishna|last6=Weller|first6=Adrian|date=2018|title=Blind Justice: Fairness with Encrypted Sensitive Attributes|url=http://proceedings.mlr.press/v80/kilbertus18a.html|journal=International Conference on Machine Learning|pages=2630–2639|bibcode=2018arXiv180603281K|arxiv=1806.03281}}
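The following sketch illustrates the general shape of such an imputation with invented probabilities (it does not reproduce the cited surname list or its values); estimates of this kind can make group-level bias measurable, but any error in them propagates into the resulting audit.

<syntaxhighlight lang="python">
# Hypothetical surname-based imputation of a missing sensitive attribute.
# The probabilities below are invented placeholders for illustration only.
surname_priors = {
    "garcia": {"hispanic": 0.90, "white": 0.05, "black": 0.05},
    "smith":  {"hispanic": 0.02, "white": 0.70, "black": 0.28},
}
uniform_prior = {"hispanic": 1 / 3, "white": 1 / 3, "black": 1 / 3}

def impute_group_probabilities(surname: str) -> dict:
    """Estimate group-membership probabilities for a record with no recorded ethnicity."""
    return surname_priors.get(surname.lower(), uniform_prior)

print(impute_group_probabilities("Garcia"))   # surname found in the (toy) reference list
print(impute_group_probabilities("Nguyen"))   # unknown surname falls back to the uniform prior
</syntaxhighlight>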

Algorithmic bias does not only include protected categories, but can also concern characteristics less easily observable or codifiable, such as political viewpoints. In these cases, there is rarely an easily accessible or non-controversial ground truth, and removing the bias from such a system is more difficult.{{Cite book|last1=Binns|first1=Reuben|last2=Veale|first2=Michael|last3=Kleek|first3=Max Van|last4=Shadbolt|first4=Nigel|title=Social Informatics |chapter=Like Trainer, Like Bot? Inheritance of Bias in Algorithmic Content Moderation |date=13 September 2017|series=Lecture Notes in Computer Science|volume=10540|pages=405–415|arxiv=1707.01477|doi=10.1007/978-3-319-67256-4_32|isbn=978-3-319-67255-7|s2cid=2814848}} Furthermore, false and accidental correlations can emerge from a lack of understanding of protected categories, for example, insurance rates based on historical data of car accidents which may overlap, strictly by coincidence, with residential clusters of ethnic minorities.{{cite web|last1=Claburn|first1=Thomas|title=EU Data Protection Law May End The Unknowable Algorithm – InformationWeek|url=https://www.informationweek.com/government/big-data-analytics/eu-data-protection-law-may-end-the-unknowable-algorithm/d/d-id/1326294?|website=InformationWeek|date=18 July 2016 |access-date=25 November 2017}}
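A small simulation with entirely invented numbers illustrates how such an accidental correlation can produce disparate outcomes even though the protected attribute is never used as an input.

<syntaxhighlight lang="python">
# Hypothetical simulation: a pricing rule keyed only on residential cluster still
# produces different average premiums by protected group, because group membership
# and cluster happen to be correlated.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
minority = rng.random(n) < 0.3                                          # protected attribute (never an input)
cluster = np.where(minority, rng.random(n) < 0.8, rng.random(n) < 0.2)  # correlated residential cluster
accidents = rng.random(n) < np.where(cluster, 0.15, 0.10)               # historical accident rates by cluster

# Price each cluster from its own accident history, ignoring the protected attribute entirely.
premium = np.where(cluster,
                   1000.0 * (1 + accidents[cluster].mean()),
                   1000.0 * (1 + accidents[~cluster].mean()))

print("average premium, minority group:", round(premium[minority].mean(), 2))
print("average premium, everyone else: ", round(premium[~minority].mean(), 2))
</syntaxhighlight>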

Solutions

A study of 84 policy guidelines on ethical AI found that fairness and "mitigation of unwanted bias" were common points of concern, addressed through a blend of technical solutions, transparency and monitoring, right to remedy and increased oversight, and diversity and inclusion efforts.{{Cite journal|last1=Jobin|first1=Anna|last2=Ienca|first2=Marcello|last3=Vayena|first3=Effy|author-link3=Effy Vayena|date=2 September 2019|title=The global landscape of AI ethics guidelines|journal=Nature Machine Intelligence|volume=1|issue=9|pages=389–399|doi=10.1038/s42256-019-0088-2|arxiv=1906.11668|s2cid=201827642}}

= Technical =

{{further|Fairness (machine learning)}}

There have been several attempts to create methods and tools that can detect and observe biases within an algorithm. These emergent fields focus on tools which are typically applied to the (training) data used by the program rather than the algorithm's internal processes. These methods may also analyze a program's output and its usefulness and therefore may involve the analysis of its confusion matrix (or table of confusion).{{cite web|url=https://research.google.com/bigpicture/attacking-discrimination-in-ml/|title=Attacking discrimination with smarter machine learning|first1=Martin|last1=Wattenberg|first2=Fernanda|last2=Viégas|first3=Moritz|last3=Hardt|publisher=Google Research}}{{cite arXiv |eprint=1610.02413|last1=Hardt|first1=Moritz|title=Equality of Opportunity in Supervised Learning|last2=Price|first2=Eric|last3=Srebro|first3=Nathan|class=cs.LG|year=2016}}{{cite web|url=https://venturebeat.com/2018/05/25/microsoft-is-developing-a-tool-to-help-engineers-catch-bias-in-algorithms/|title=Microsoft is developing a tool to help engineers catch bias in algorithms|date=2018-05-25|first=Kyle|last=Wiggers|website=VentureBeat.com}}{{cite web |title=Facebook says it has a tool to detect bias in its artificial intelligence |date=2018-05-03 |website=Quartz |archive-url=https://web.archive.org/web/20230305194710/https://qz.com/1268520/facebook-says-it-has-a-tool-to-detect-bias-in-its-artificial-intelligence |archive-date=2023-03-05 |url-status=live |url=https://qz.com/1268520/facebook-says-it-has-a-tool-to-detect-bias-in-its-artificial-intelligence/}}{{cite web|url=https://github.com/pymetrics/audit-ai|title=Pymetrics audit-AI|website=GitHub.com}}{{cite web|url=https://venturebeat-com.cdn.ampproject.org/c/s/venturebeat.com/2018/05/31/pymetrics-open-sources-audit-ai-an-algorithm-bias-detection-tool/amp/|title=Pymetrics open-sources Audit AI, an algorithm bias detection tool|date=2018-05-31|first=Khari|last=Johnson|website=VentureBeat.com}}{{cite web|url=https://github.com/dssg/aequitas|title=Aequitas: Bias and Fairness Audit Toolkit|website=GitHub.com}}{{cite web|url=https://dsapp.uchicago.edu/aequitas/|title=Aequitas at University of Chicago}}{{cite web|url=https://www.ibm.com/blogs/research/2018/02/mitigating-bias-ai-models/|title=Mitigating Bias in AI Models|first=Ruchir|last=Puri|date=2018-02-06|archive-url=https://web.archive.org/web/20180207040739/https://www.ibm.com/blogs/research/2018/02/mitigating-bias-ai-models/|archive-date=2018-02-07|website=IBM.com}} Explainable AI to detect algorithmic bias is a suggested way to detect the existence of bias in an algorithm or learning model.S. Sen, D. Dasgupta and K. D. Gupta, "An Empirical Study on Algorithmic Bias", 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), Madrid, Spain, 2020, pp. 1189-1194, {{doi|10.1109/COMPSAC48688.2020.00-95}}. Using machine learning to detect bias is called "conducting an AI audit", where the "auditor" is an algorithm that goes through the AI model and the training data to identify biases.{{Cite journal|last1=Zou|first1=James|last2=Schiebinger|first2=Londa|date=July 2018|title=AI can be sexist and racist — it's time to make it fair |journal=Nature|language=en|volume=559|issue=7714|pages=324–326|doi=10.1038/d41586-018-05707-8|pmid=30018439|bibcode=2018Natur.559..324Z|doi-access=free}}
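The following sketch shows, with hypothetical data, the kind of confusion-matrix comparison such audits perform; it does not represent the interface of any of the tools cited above.

<syntaxhighlight lang="python">
# Hypothetical group-wise audit: compare false positive rates computed from confusion matrices.
from collections import Counter

def false_positive_rate(y_true, y_pred):
    """False positives divided by all true negatives, from a 2x2 confusion matrix."""
    counts = Counter(zip(y_true, y_pred))          # keys: (true label, predicted label)
    negatives = counts[(0, 0)] + counts[(0, 1)]
    return counts[(0, 1)] / negatives if negatives else float("nan")

# Invented labels and predictions for two demographic groups.
audit_data = {
    "group_a": ([0, 0, 0, 1, 1], [0, 0, 1, 1, 1]),
    "group_b": ([0, 0, 0, 1, 1], [1, 1, 0, 1, 0]),
}
for name, (y_true, y_pred) in audit_data.items():
    print(f"{name}: false positive rate = {false_positive_rate(y_true, y_pred):.2f}")
# A large gap between the two rates flags a potential disparity for human review.
</syntaxhighlight>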

Ensuring that an AI tool such as a classifier is free from bias is more difficult than just removing the sensitive information from its input signals, because this information is typically implicit in other signals. For example, the hobbies, sports and schools attended by a job candidate might reveal their gender to the software, even when this is removed from the analysis. Solutions to this problem involve ensuring that the intelligent agent does not have any information that could be used to reconstruct the protected and sensitive information about the subject, as first demonstrated in a study{{Cite conference| title =Right for the right reason: Training agnostic networks. | publisher =Springer | last1= Jia| first1 = Sen| last2= Welfare| first2 = Thomas | last3=Cristianini |first3 = Nello | date =2018 | conference = International Symposium on Intelligent Data Analysis}} in which a deep learning network was simultaneously trained to learn a task while remaining completely agnostic about the protected feature. A simpler method was proposed in the context of word embeddings, and involves removing information that is correlated with the protected characteristic.{{Cite conference| title =Biased embeddings from wild data: Measuring, understanding and removing. | publisher =Springer | last1= Sutton| first1 = Adam| last2= Welfare| first2 = Thomas | last3=Cristianini |first3 = Nello | date =2018 | conference = International Symposium on Intelligent Data Analysis|doi=10.1007/978-3-030-01768-2_27}}
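A minimal sketch of the projection idea behind this second approach (not the cited authors' exact procedure) removes from each embedding its component along an estimated direction encoding the protected characteristic, so that no linear information about it remains.

<syntaxhighlight lang="python">
# Hypothetical embeddings; in practice the protected direction might be estimated,
# for example, from differences between gendered word pairs.
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 50))       # toy stand-in for learned word vectors
protected = rng.normal(size=50)                # toy stand-in for the protected direction
protected /= np.linalg.norm(protected)

# Subtract each vector's projection onto the protected direction.
debiased = embeddings - np.outer(embeddings @ protected, protected)

# The debiased vectors carry no linear information along that direction.
print(np.abs(debiased @ protected).max())      # ~0, up to floating-point error
</syntaxhighlight>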

An IEEE standard addressing algorithmic bias, aiming to specify methodologies which help creators of algorithms eliminate issues of bias and articulate transparency (i.e. to authorities or end users) about the function and possible effects of their algorithms, was drafted beginning in 2017. The project was approved in February 2017 and was sponsored by the Software & Systems Engineering Standards Committee,{{cite web | url=https://www.computer.org/web/standards/s2esc | title=Software & Systems Engineering Standards Committee | date=April 17, 2018 }} a committee chartered by the IEEE Computer Society. A draft of the standard was expected to be submitted for balloting in June 2019.{{Cite journal|last=Koene|first=Ansgar|date=June 2017|title=Algorithmic Bias: Addressing Growing Concerns [Leading Edge]|journal=IEEE Technology and Society Magazine|volume=36|issue=2|pages=31–32|doi=10.1109/mts.2017.2697080|issn=0278-0097|url=http://eprints.nottingham.ac.uk/44207/8/IEEE_Tech_Sociery_Magazine_AlgoBias_2017_AKoene.pdf|access-date=August 1, 2019|archive-date=July 19, 2018|archive-url=https://web.archive.org/web/20180719081345/http://eprints.nottingham.ac.uk/44207/8/IEEE_Tech_Sociery_Magazine_AlgoBias_2017_AKoene.pdf|url-status=dead}}{{Cite web|url=https://standards.ieee.org/project/7003.html|archive-url=https://web.archive.org/web/20181203152115/https://standards.ieee.org/project/7003.html|url-status=dead|archive-date=December 3, 2018|title=P7003 - Algorithmic Bias Considerations|website=IEEE|access-date=2018-12-03}} The standard, IEEE 7003-2024, was published in January 2025.{{cite web|url=https://standards.ieee.org/ieee/7003/11357/|title= IEEE 7003-2024 IEEE Standard for Algorithmic Bias Considerations|access-date=16 March 2025}}

In 2022, the IEEE released a standard aimed at specifying methodologies to help creators of algorithms address issues of bias and promote transparency regarding the function and potential effects of their algorithms. The project, initially approved in February 2017, was sponsored by the Software & Systems Engineering Standards Committee,{{cite web | url=https://www.computer.org/volunteering/boards-and-committees/standards-activities/committees/s2esc | title=Software & Systems Engineering Standards Committee | date=April 17, 2018 }} a committee under the IEEE Computer Society. The standard provides guidelines for articulating transparency to authorities or end users and mitigating algorithmic biases.{{Cite web |date=2022 |title=IEEE CertifAIEd™ – Ontological Specification for Ethical Algorithmic Bias |url=https://engagestandards.ieee.org/rs/211-FYL-955/images/IEEE%20CertifAIEd%20Ontological%20Spec-Algorithmic%20Bias-2022%20%5BI1.3%5D.pdf |publisher=IEEE}}

= Transparency and monitoring =

{{further|Algorithmic transparency}}

Ethics guidelines on AI point to the need for accountability, recommending that steps be taken to improve the interpretability of results.{{Cite web|url=https://www.internetsociety.org/resources/doc/2017/artificial-intelligence-and-machine-learning-policy-paper/|title=Artificial Intelligence and Machine Learning: Policy Paper|last=The Internet Society|date=18 April 2017|website=Internet Society|access-date=11 February 2020}} Such solutions include the consideration of a "right to understanding" in machine learning algorithms, and resistance to the deployment of machine learning in situations where decisions could not be explained or reviewed.{{Cite web|url=https://www.weforum.org/whitepapers/how-to-prevent-discriminatory-outcomes-in-machine-learning|title=White Paper: How to Prevent Discriminatory Outcomes in Machine Learning|date=12 March 2018|website=World Economic Forum|access-date=11 February 2020}} Toward this end, a movement for "Explainable AI" is already underway within organizations such as DARPA, for reasons that go beyond the remedy of bias.{{Cite web|url=https://www.darpa.mil/program/explainable-artificial-intelligence|title=Explainable Artificial Intelligence|website=www.darpa.mil|access-date=2020-02-11}} PricewaterhouseCoopers, for example, also suggests that monitoring output means designing systems in such a way as to ensure that solitary components of the system can be isolated and shut down if they skew results.{{Cite web|url=https://www.pwc.co.uk/services/risk-assurance/insights/accelerating-innovation-through-responsible-ai/responsible-ai-framework.html|title=The responsible AI framework|last=PricewaterhouseCoopers|website=PwC|language=en-gb|access-date=2020-02-11}}

An initial approach towards transparency included the open-sourcing of algorithms.{{Cite book|last=Heald|first=David|title=Transparency: The Key to Better Governance?|date=2006-09-07|publisher=British Academy|isbn=978-0-19-726383-9|language=en|doi=10.5871/bacad/9780197263839.003.0002}} Software code can be looked into and improvements can be proposed through source-code-hosting facilities. However, this approach does not necessarily produce the intended effects. Companies and organizations can share all possible documentation and code, but this does not establish transparency if the audience does not understand the information given. Therefore, the role of an interested critical audience is worth exploring in relation to transparency. Algorithms cannot be held accountable without a critical audience.{{Cite journal|last1=Kemper|first1=Jakko|last2=Kolkman|first2=Daan|date=2019-12-06|title=Transparent to whom? No algorithmic accountability without a critical audience|journal=Information, Communication & Society|volume=22|issue=14|pages=2081–2096|doi=10.1080/1369118X.2018.1477967|issn=1369-118X|doi-access=free|hdl=11245.1/75cb1256-5fe5-4724-9a63-03ef66032d8e|hdl-access=free}}

= Right to remedy =

From a regulatory perspective, the Toronto Declaration calls for applying a human rights framework to harms caused by algorithmic bias.{{Cite web|url=https://www.hrw.org/news/2018/07/03/toronto-declaration-protecting-rights-equality-and-non-discrimination-machine|title=The Toronto Declaration: Protecting the rights to equality and non-discrimination in machine learning systems|date=2018-07-03|website=Human Rights Watch|language=en|access-date=2020-02-11}} This includes legislating expectations of due diligence on behalf of designers of these algorithms, and creating accountability when private actors fail to protect the public interest, noting that such rights may be obscured by the complexity of determining responsibility within a web of complex, intertwining processes.{{Cite book |title=The Toronto Declaration: Protecting the Right to Equality and Non-Discrimination in Machine Learning Systems|publisher=Human Rights Watch|year=2018|url=https://www.accessnow.org/cms/assets/uploads/2018/08/The-Toronto-Declaration_ENG_08-2018.pdf|pages=15}} Others propose the need for clear liability insurance mechanisms.{{Cite journal|last1=Floridi|first1=Luciano|last2=Cowls|first2=Josh|last3=Beltrametti|first3=Monica|last4=Chatila|first4=Raja|last5=Chazerand|first5=Patrice|last6=Dignum|first6=Virginia|last7=Luetge|first7=Christoph|last8=Madelin|first8=Robert|last9=Pagallo|first9=Ugo|last10=Rossi|first10=Francesca|last11=Schafer|first11=Burkhard|date=2018-12-01|title=AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations|journal=Minds and Machines|language=en|volume=28|issue=4|pages=703|doi=10.1007/s11023-018-9482-5|issn=1572-8641|pmc=6404626|pmid=30930541}}

= Diversity and inclusion =

Amid concerns that the design of AI systems is primarily the domain of white, male engineers,{{Cite news|last=Crawford|first=Kate|url=https://www.nytimes.com/2016/06/26/opinion/sunday/artificial-intelligences-white-guy-problem.html|title=Opinion {{!}} Artificial Intelligence's White Guy Problem|date=2016-06-25|work=The New York Times |access-date=2020-02-11|language=en-US|issn=0362-4331}} a number of scholars have suggested that algorithmic bias may be minimized by expanding inclusion in the ranks of those designing AI systems. For example, just 12% of machine learning engineers are women,{{Cite news|url=https://www.wired.com/story/artificial-intelligence-researchers-gender-imbalance/|title=AI Is the Future—But Where Are the Women?|magazine=Wired|access-date=2020-02-11|language=en|issn=1059-1028}} with black AI leaders pointing to a "diversity crisis" in the field.{{Cite web|url=https://www.technologyreview.com/s/610192/were-in-a-diversity-crisis-black-in-ais-founder-on-whats-poisoning-the-algorithms-in-our/|title="We're in a diversity crisis": cofounder of Black in AI on what's poisoning algorithms in our lives|last=Snow|first=Jackie|website=MIT Technology Review|language=en-US|access-date=2020-02-11}} Groups like Black in AI and Queer in AI are attempting to create more inclusive spaces in the AI community and work against the often harmful desires of corporations that control the trajectory of AI research.{{Cite news|url=https://www.technologyreview.com/2021/06/14/1026148/ai-big-tech-timnit-gebru-paper-ethics/|title=Inside the fight to reclaim AI from Big Tech's control|work=MIT Technology Review|access-date=2021-06-21|language=en-US|last=Hao|first=Karen|date=2021-06-14}} Critiques of simple inclusivity efforts suggest that diversity programs can not address overlapping forms of inequality, and have called for applying a more deliberate lens of intersectionality to the design of algorithms.{{Cite journal|last=Ciston|first=Sarah|date=2019-12-29|title=Intersectional AI Is Essential|journal=Journal of Science and Technology of the Arts|language=en|volume=11|issue=2|pages=3–8|doi=10.7559/citarj.v11i2.665|issn=2183-0088|doi-access=free}}{{cite book |last1=D'Ignazio |first1=Catherine |last2=Klein |first2=Lauren F. |title=Data Feminism |date=2020 |publisher=MIT Press |isbn=978-0262044004}}{{rp|4}} Researchers at the University of Cambridge have argued that addressing racial diversity is hampered by the "whiteness" of the culture of AI.{{Cite journal|last1=Cave|first1=Stephen|last2=Dihal|first2=Kanta|date=2020-08-06|title=The Whiteness of AI|journal=Philosophy & Technology |volume=33 |issue=4 |pages=685–703 |language=en |doi=10.1007/s13347-020-00415-6 |issn=2210-5441|doi-access=free}}

= Interdisciplinarity and collaboration =

Interdisciplinarity and collaboration in the development of AI systems can play a critical role in tackling algorithmic bias. Integrating insights, expertise, and perspectives from disciplines outside of computer science can foster a better understanding of the impact that data-driven solutions have on society. One example from AI research is PACT, or the Participatory Approach to enable Capabilities in communiTies, a proposed framework for facilitating collaboration when developing AI-driven solutions concerned with social impact.{{Cite book |url=https://ezpa.library.ualberta.ca/ezpAuthen.cgi?url=https://dl.acm.org/doi/abs/10.1145/3461702.3462612 |access-date=2023-04-06 |via=ezpa.library.ualberta.ca |doi=10.1145/3461702.3462612| arxiv=2105.01774 | s2cid=233740121 | last1=Bondi | first1=Elizabeth | last2=Xu | first2=Lily | last3=Acosta-Navas | first3=Diana | last4=Killian | first4=Jackson A. | title=Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society | chapter=Envisioning Communities: A Participatory Approach Towards AI for Social Good | year=2021 | pages=425–436 | isbn=9781450384735 }} The framework identifies guiding principles for stakeholder participation when working on AI for Social Good (AI4SG) projects, and emphasizes the importance of decolonizing and power-shifting efforts in the design of human-centered AI solutions. An academic initiative in this regard is Stanford University's Institute for Human-Centered Artificial Intelligence, which aims to foster multidisciplinary collaboration; its stated mission is to advance AI research, education, policy, and practice to improve the human condition.{{Cite web |author=Stanford University |date=2019-03-18 |title=Stanford University launches the Institute for Human-Centered Artificial Intelligence |url=https://news.stanford.edu/2019/03/18/stanford_university_launches_human-centered_ai/ |access-date=2023-04-06 |website=Stanford News |language=en}}

Collaboration with outside experts and various stakeholders facilitates the ethical, inclusive, and accountable development of intelligent systems. Such collaboration incorporates ethical considerations, attends to the social and cultural context in which a system will operate, promotes human-centered design, leverages technical expertise, and addresses policy and legal considerations.{{Cite book |last1=Bondi |first1=Elizabeth |last2=Xu |first2=Lily |last3=Acosta-Navas |first3=Diana |last4=Killian |first4=Jackson A. |title=Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society |chapter=Envisioning Communities: A Participatory Approach Towards AI for Social Good |date=2021-07-21 |pages=425–436 |doi=10.1145/3461702.3462612|arxiv=2105.01774 |isbn=9781450384735 |s2cid=233740121 }} Collaboration across disciplines is therefore essential for effectively mitigating bias in AI systems and ensuring that AI technologies are fair, transparent, and accountable.

Regulation

= Europe =

The General Data Protection Regulation (GDPR), the European Union's revised data protection regime that was implemented in 2018, addresses "Automated individual decision-making, including profiling" in Article 22. These rules prohibit "solely" automated decisions which have a "significant" or "legal" effect on an individual, unless they are explicitly authorised by consent, contract, or member state law. Where they are permitted, there must be safeguards in place, such as a right to a human-in-the-loop, and a non-binding right to an explanation of decisions reached. While these regulations are commonly considered to be new, nearly identical provisions have existed across Europe since 1995, in Article 15 of the Data Protection Directive. The original automated decision rules and safeguards have been present in French law since the late 1970s.{{Cite journal|last=Bygrave|first=Lee A|journal=Computer Law & Security Review|volume=17|issue=1|pages=17–24|doi=10.1016/s0267-3649(01)00104-2|year=2001|title=Automated Profiling}}

The GDPR addresses algorithmic bias in profiling systems, as well as the statistical approaches that could be used to remove it, directly in recital 71,{{Cite journal|last1=Veale|first1=Michael|last2=Edwards|first2=Lilian|date=2018|title=Clarity, Surprises, and Further Questions in the Article 29 Working Party Draft Guidance on Automated Decision-Making and Profiling|journal=Computer Law & Security Review|volume=34 |issue=2 |pages=398–404 |doi=10.1016/j.clsr.2017.12.002|ssrn=3071679|s2cid=4797884 |url=http://discovery.ucl.ac.uk/10046182/1/Veale%201-s2.0-S026736491730376X-main%281%29.pdf}} noting that

the controller should use appropriate mathematical or statistical procedures for the profiling, implement technical and organisational measures appropriate ... that prevents, inter alia, discriminatory effects on natural persons on the basis of racial or ethnic origin, political opinion, religion or beliefs, trade union membership, genetic or health status or sexual orientation, or that result in measures having such an effect.
As with the right to an explanation, the problem lies in the non-binding nature of recitals.{{Cite journal|last1=Wachter|first1=Sandra|last2=Mittelstadt|first2=Brent|last3=Floridi|first3=Luciano|date=1 May 2017|title=Why a Right to Explanation of Automated Decision-Making Does Not Exist in the General Data Protection Regulation|journal=International Data Privacy Law|volume=7|issue=2|pages=76–99|doi=10.1093/idpl/ipx005|issn=2044-3994|doi-access=free}} While the recital has been treated as a requirement by the Article 29 Working Party that advised on the implementation of data protection law, its practical dimensions are unclear. It has been argued that Data Protection Impact Assessments for high-risk data profiling (alongside other pre-emptive measures within data protection) may be a better way to tackle issues of algorithmic discrimination, as they restrict the actions of those deploying algorithms, rather than requiring consumers to file complaints or request changes.{{Cite journal|last1=Edwards|first1=Lilian|last2=Veale|first2=Michael|date=23 May 2017|title=Slave to the Algorithm? Why a Right to an Explanation Is Probably Not the Remedy You Are Looking For|ssrn=2972855|journal=Duke Law & Technology Review|volume=16|pages=18–84}}
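
To illustrate what an "appropriate mathematical or statistical procedure" for detecting discriminatory effects might look like in practice, the following is a minimal, hypothetical sketch, not taken from the regulation or from any cited guidance: it compares the rate of favourable automated decisions across demographic groups and reports the ratio between the lowest and highest rates. The data, group labels, and choice of metric are illustrative assumptions only.

<syntaxhighlight lang="python">
# Hypothetical sketch: comparing favourable-decision rates across groups
# in an automated profiling system. All data and labels are invented.
from collections import defaultdict

def selection_rates(decisions, groups):
    """Fraction of favourable (1) decisions per group."""
    totals, favourable = defaultdict(int), defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        favourable[group] += int(decision)
    return {g: favourable[g] / totals[g] for g in totals}

def parity_ratio(decisions, groups):
    """Ratio of the lowest to the highest group selection rate (1.0 = parity)."""
    rates = selection_rates(decisions, groups)
    return min(rates.values()) / max(rates.values()), rates

# 1 = favourable automated decision, 0 = unfavourable (invented audit sample).
decisions = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
groups    = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

ratio, rates = parity_ratio(decisions, groups)
print(rates)            # {'A': 0.6, 'B': 0.4}
print(round(ratio, 2))  # 0.67
</syntaxhighlight>

A ratio close to 1.0 indicates similar rates of favourable decisions across groups; a markedly lower ratio is one possible signal, among many, of the kind of discriminatory effect the recital describes, though no single statistic can establish or rule out discrimination on its own.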

= United States =

The United States has no general legislation controlling algorithmic bias, approaching the problem through various state and federal laws that might vary by industry, sector, and by how an algorithm is used.{{cite news|last1=Singer|first1=Natasha|title=Consumer Data Protection Laws, an Ocean Apart|url=https://www.nytimes.com/2013/02/03/technology/consumer-data-protection-laws-an-ocean-apart.html|access-date=26 November 2017|work=The New York Times|date=2 February 2013}} Many policies are self-enforced or controlled by the Federal Trade Commission. In 2016, the Obama administration released the National Artificial Intelligence Research and Development Strategic Plan,{{cite web|last1=Obama|first1=Barack|title=The Administration's Report on the Future of Artificial Intelligence|url=https://obamawhitehouse.archives.gov/blog/2016/10/12/administrations-report-future-artificial-intelligence|website=whitehouse.gov|publisher=National Archives|access-date=26 November 2017|date=12 October 2016}} which was intended to guide policymakers toward a critical assessment of algorithms. It recommended that researchers "design these systems so that their actions and decision-making are transparent and easily interpretable by humans, and thus can be examined for any bias they may contain, rather than just learning and repeating these biases". Intended only as guidance, the report did not create any legal precedent.{{cite book|author1=National Science and Technology Council|title=National Artificial Intelligence Research and Development Strategic Plan|date=2016|publisher=US Government|url=https://obamawhitehouse.archives.gov/sites/default/files/whitehouse_files/microsites/ostp/NSTC/national_ai_rd_strategic_plan.pdf|access-date=26 November 2017}}{{rp|26}}

In 2017, New York City passed the first algorithmic accountability bill in the United States.{{cite web |last1=Kirchner |first1=Lauren |title=New York City Moves to Create Accountability for Algorithms — ProPublica |url=https://www.propublica.org/article/new-york-city-moves-to-create-accountability-for-algorithms |website=ProPublica |access-date=28 July 2018 |date=18 December 2017}} The bill, which went into effect on January 1, 2018, required "the creation of a task force that provides recommendations on how information on agency automated decision systems may be shared with the public, and how agencies may address instances where people are harmed by agency automated decision systems."{{cite web |title=The New York City Council - File #: Int 1696-2017 |url=http://legistar.council.nyc.gov/LegislationDetail.aspx?ID=3137815&GUID=437A6A6D-62E1-47E2-9C42-461253F9C6D0 |website=legistar.council.nyc.gov |publisher=New York City Council |access-date=28 July 2018 }} The task force was required to present its findings and recommendations for further regulatory action in 2019.{{cite magazine |last1=Powles |first1=Julia |title=New York City's Bold, Flawed Attempt to Make Algorithms Accountable |url=https://www.newyorker.com/tech/elements/new-york-citys-bold-flawed-attempt-to-make-algorithms-accountable |magazine=The New Yorker |access-date=28 July 2018}} In 2023, New York City implemented a law requiring employers that use automated hiring tools to conduct independent "bias audits" and publish the results, one of the first legally mandated transparency measures for AI systems used in employment decisions in the United States.{{Cite web |last=Wiggers |first=Kyle |date=2023-07-05 |title=NYC's anti-bias law for hiring algorithms goes into effect |url=https://techcrunch.com/2023/07/05/nycs-anti-bias-law-for-hiring-algorithms-goes-into-effect/ |access-date=2025-04-16 |website=TechCrunch |language=en-US}}
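
As an illustration of the kind of metric such a bias audit might report, the hypothetical sketch below computes the rate at which each candidate category receives an above-median score from an automated hiring tool, together with each category's ratio to the highest-rated category. The data, category labels, and the above-median convention are illustrative assumptions; the metrics actually required are defined by the law and its implementing rules.

<syntaxhighlight lang="python">
# Hypothetical sketch of an impact-ratio calculation for a score-based
# automated hiring tool. All scores and categories are invented.
from statistics import median

def scoring_rates(scores, categories):
    """Fraction of candidates in each category scoring above the overall median."""
    cutoff = median(scores)
    rates = {}
    for cat in sorted(set(categories)):
        cat_scores = [s for s, c in zip(scores, categories) if c == cat]
        rates[cat] = sum(s > cutoff for s in cat_scores) / len(cat_scores)
    return rates

def impact_ratios(scores, categories):
    """Each category's scoring rate divided by the highest category's rate."""
    rates = scoring_rates(scores, categories)
    best = max(rates.values())
    return {cat: round(rate / best, 2) for cat, rate in rates.items()}

# Invented audit sample: scores produced by an automated hiring tool.
scores     = [88, 72, 95, 61, 79, 84, 58, 91, 66, 70]
categories = ["X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y"]

print(impact_ratios(scores, categories))  # {'X': 1.0, 'Y': 0.67}
</syntaxhighlight>

A ratio well below 1.0 for a category would flag a disparity for further review; as with any single statistic, such a figure describes an observed difference rather than establishing its cause.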

On February 11, 2019, through Executive Order 13859, the federal government unveiled the "American AI Initiative," a comprehensive strategy to maintain U.S. leadership in artificial intelligence. The initiative highlights the importance of sustained AI research and development, ethical standards, workforce training, and the protection of critical AI technologies.{{cite web | url=https://www.federalregister.gov/documents/2019/02/14/2019-02544/maintaining-american-leadership-in-artificial-intelligence | title=Maintaining American Leadership in Artificial Intelligence | date=February 14, 2019 }} This aligns with broader efforts to ensure transparency, accountability, and innovation in AI systems across the public and private sectors. On October 30, 2023, President Joe Biden signed Executive Order 14110, which emphasizes the safe, secure, and trustworthy development and use of artificial intelligence. The order outlines a coordinated, government-wide approach to harnessing AI's potential while mitigating its risks, including fraud, discrimination, and national security threats, and commits to promoting responsible innovation and collaboration across sectors so that AI benefits society as a whole.{{cite web | url=https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence | title=Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence | date=November 2023 }} With this order, the federal government was directed to create best practices for companies to optimize AI's benefits and minimize its harms.{{cite web | url=https://uk.news.yahoo.com/vp-kamala-harris-unveils-safe-090142553.html?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAAArtakTrGRBYYrFkNxMWolKDlVr-GwuBLb-2Kh3X4WcMDwp2ii5K1s6QoZFqhdxw5uQ4vUyMS-81rVwpyBjqRTFQaSosZ2rEhap4RHN53KjE2ZifB37cnzGxCNUUEiL-yxWW53A_L_Z5GCPTcY05l94f3c6-WKSMpKpqm79Bw0hM | title=VP Kamala Harris Unveils "Safe, Secure & Responsible" AI Guidelines for Federal Agencies | date=March 28, 2024 }}

= India =

On July 31, 2018, a draft of the Personal Data Protection Bill was presented.{{Cite web|url=https://www.insurancejournal.com/news/international/2018/07/31/496489.htm|title=India Weighs Comprehensive Data Privacy Bill, Similar to EU's GDPR|date=2018-07-31|website=Insurance Journal|access-date=2019-02-26}} The draft proposes standards for the storage, processing and transmission of data. While it does not use the term algorithm, it makes provisions for "harm resulting from any processing or any kind of processing undertaken by the fiduciary". It defines "any denial or withdrawal of a service, benefit or good resulting from an evaluative decision about the data principal" or "any discriminatory treatment" as a source of harm that could arise from improper use of data. It also makes special provisions for people of "Intersex status".{{cite web |url=https://meity.gov.in/writereaddata/files/Personal_Data_Protection_Bill,2018.pdf |title=The Personal Data Protection Bill, 2018 |publisher=Ministry of Electronics & Information Technology, Government of India |date=2018 |access-date=29 April 2022}}

References

{{Reflist}}

Further reading

  • {{cite book |last1=Baer |first1=Tobias |date=2019 |title=Understand, Manage, and Prevent Algorithmic Bias: A Guide for Business Users and Data Scientists |isbn=9781484248843 |location=New York |publisher=Apress |url=https://www.springer.com/us/book/9781484248843}}
  • {{cite book |last1=Noble |first1=Safiya Umoja |date=2018 |title=Algorithms of Oppression: How Search Engines Reinforce Racism |isbn=9781479837243 |location=New York |publisher=New York University Press |title-link=Algorithms of Oppression}}

Category:Machine learning

Category:Information ethics

Category:Computing and society

Category:Philosophy of artificial intelligence

Category:Discrimination

Category:Bias