Progress in artificial intelligence
{{short description|How AI-related technologies evolve}}
{{see also|History of artificial intelligence|Timeline of artificial intelligence}}
{{Artificial intelligence}}
[[File:Classification of images progress human.png|thumb|Progress in machine classification of images: the error rate of AI by year. The red line shows the error rate of a trained human on the same task.]]
Progress in artificial intelligence (AI) refers to the advances, milestones, and breakthroughs that have been achieved in the field of artificial intelligence over time. AI is a multidisciplinary branch of computer science that aims to create machines and systems capable of performing tasks that typically require human intelligence. AI applications have been used in a wide range of fields including medical diagnosis, finance, robotics, law, video games, agriculture, and scientific discovery. However, many AI applications are not perceived as AI: "A lot of cutting-edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labeled AI anymore."[http://www.cnn.com/2006/TECH/science/07/24/ai.bostrom/ AI set to exceed human brain power] {{Webarchive|url=https://web.archive.org/web/20080219001624/http://www.cnn.com/2006/TECH/science/07/24/ai.bostrom/ |date=2008-02-19 }} CNN.com (July 26, 2006){{cite journal|doi=10.1016/j.bushor.2018.08.004|title=Siri, Siri, in my hand: Who's the fairest in the land? 
On the interpretations, illustrations, and implications of artificial intelligence|journal=Business Horizons|volume=62|pages=15–25|year=2019|last1=Kaplan|first1=Andreas|last2=Haenlein|first2=Michael|s2cid=158433736}} "Many thousands of AI applications are deeply embedded in the infrastructure of every industry."{{Harvnb|Kurzweil|2005|p=264}} In the late 1990s and early 2000s, AI technology became widely used as an element of larger systems,{{Citation|last=National Research Council|title=Funding a Revolution: Government Support for Computing Research|year=1999|author-link=United States National Research Council|chapter=Developments in Artificial Intelligence|publisher=National Academy Press|isbn=978-0-309-06278-7|oclc=246584055|chapter-url-access=registration|chapter-url=https://archive.org/details/fundingrevolutio00nati}} under "Artificial Intelligence in the 90s", but the field was rarely credited for these successes at the time.
Kaplan and Haenlein structure artificial intelligence along three evolutionary stages:
- Artificial narrow intelligence – AI capable only of specific tasks;
- Artificial general intelligence – AI with ability in several areas, able to autonomously solve problems it was never designed for;
- Artificial superintelligence – AI capable of general tasks, including scientific creativity, social skills, and general wisdom.
To allow comparison with human performance, artificial intelligence can be evaluated on constrained, well-defined problems. Such tests have been termed subject-matter expert Turing tests. Smaller problems also provide more achievable goals, and the number of positive results is ever-increasing.
Humans still substantially outperform both GPT-4 and models trained on the ConceptARC benchmark: the models scored 60% on most categories and 77% on one, while humans scored 91% on all categories and 97% on one.{{cite news |last=Biever |first=Celeste |title= ChatGPT broke the Turing test — the race is on for new ways to assess AI |url= https://www.nature.com/articles/d41586-023-02361-7? |date=25 July 2023 |work=Nature |accessdate=26 July 2023 }}
Current performance in specific areas
Many useful abilities can be described as showing some form of intelligence; examining them separately gives better insight into the comparative success of artificial intelligence in different areas.
AI, like electricity or the steam engine, is a general-purpose technology. There is no consensus on how to characterize which tasks AI tends to excel at.{{cite journal|last1=Brynjolfsson|first1=Erik|last2=Mitchell|first2=Tom|title=What can machine learning do? Workforce implications|url=https://www.science.org/doi/10.1126/science.aap8062|access-date=7 May 2018|journal=Science|date=22 December 2017|volume=358|issue=6370|pages=1530–1534|language=en|doi=10.1126/science.aap8062|pmid=29269459 |bibcode=2017Sci...358.1530B|s2cid=4036151 |archive-date=29 September 2021|archive-url=https://web.archive.org/web/20210929161800/https://www.science.org/doi/10.1126/science.aap8062|url-status=live}} Some versions of Moravec's paradox observe that humans are more likely to outperform machines in areas such as physical dexterity that have been the direct target of natural selection.{{cite news|title=IKEA furniture and the limits of AI|url=https://www.economist.com/news/leaders/21740735-humans-have-had-good-run-most-recent-breakthrough-robotics-it-clear|access-date=24 April 2018|newspaper=The Economist|date=2018|language=en|archive-date=24 April 2018|archive-url=https://web.archive.org/web/20180424014106/https://economist.com/news/leaders/21740735-humans-have-had-good-run-most-recent-breakthrough-robotics-it-clear|url-status=live}} While projects such as AlphaZero have succeeded in generating their own knowledge from scratch, many other machine learning projects require large training datasets.{{cite news|last1=Sample|first1=Ian|title='It's able to create knowledge itself': Google unveils AI that learns on its own|url=https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own|access-date=7 May 2018|work=the Guardian|date=18 October 2017|language=en|archive-date=19 October 
2017|archive-url=https://web.archive.org/web/20171019213849/https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own|url-status=live}}{{cite news|title=The AI revolution in science|url=https://www.science.org/content/article/ai-revolution-science|access-date=7 May 2018|work=Science {{!}} AAAS|date=5 July 2017|language=en|archive-date=14 December 2021|archive-url=https://web.archive.org/web/20211214221104/https://www.science.org/content/article/ai-revolution-science|url-status=live}} Researcher Andrew Ng has suggested, as a "highly imperfect rule of thumb", that "almost anything a typical human can do with less than one second of mental thought, we can probably now or in the near future automate using AI."{{cite news|title=Will your job still exist in 10 years when the robots arrive?|url=http://www.scmp.com/tech/innovation/article/2098164/robots-are-coming-here-are-some-jobs-wont-exist-10-years|access-date=7 May 2018|work=South China Morning Post|date=2017|language=en|archive-date=7 May 2018|archive-url=https://web.archive.org/web/20180507221346/http://www.scmp.com/tech/innovation/article/2098164/robots-are-coming-here-are-some-jobs-wont-exist-10-years|url-status=live}}
Games provide a high-profile benchmark for assessing rates of progress; many games have a large professional player base and a well-established competitive rating system. AlphaGo brought the era of classical board-game benchmarks to a close in 2016, when DeepMind's AlphaGo program defeated Lee Sedol, one of the world's best professional Go players.{{Cite journal|last=Mokyr|first=Joel|date=2019-11-01|title=The Technology Trap: Capital Labor, and Power in the Age of Automation. By Carl Benedikt Frey. Princeton: Princeton University Press, 2019. Pp. 480. $29.95, hardcover.|url=http://dx.doi.org/10.1017/s0022050719000639|journal=The Journal of Economic History|volume=79|issue=4|pages=1183–1189|doi=10.1017/s0022050719000639|s2cid=211324400|issn=0022-0507|access-date=2020-11-25|archive-date=2023-02-02|archive-url=https://web.archive.org/web/20230202181321/https://www.cambridge.org/core/journals/journal-of-economic-history/article/abs/technology-trap-capital-labor-and-power-in-the-age-of-automation-by-carl-benedikt-frey-princeton-princeton-university-press-2019-pp-480-2995-hardcover/F86C8E992A0EE1018ED110607A353A0D|url-status=live}} Games of imperfect knowledge provide new challenges to AI in the area of game theory; the most prominent milestone in this area was Libratus' poker victory in 2017.{{cite news|last1=Borowiec|first1=Tracey Lien, Steven|title=AlphaGo beats human Go champ in milestone for artificial intelligence|url=https://www.latimes.com/world/asia/la-fg-korea-alphago-20160312-story.html|access-date=7 May 2018|work=Los Angeles Times|date=2016|archive-date=13 May 2018|archive-url=https://web.archive.org/web/20180513234132/http://www.latimes.com/world/asia/la-fg-korea-alphago-20160312-story.html|url-status=live}}{{cite journal|last1=Brown|first1=Noam|last2=Sandholm|first2=Tuomas|title=Superhuman AI for heads-up no-limit poker: Libratus beats top
professionals|journal=Science|date=26 January 2018|volume=359 |issue=6374 |pages=418–424|language=en|doi=10.1126/science.aao1733|pmid=29249696 |bibcode=2018Sci...359..418B |s2cid=5003977 |doi-access=free}} E-sports continue to provide additional benchmarks; Facebook AI, Deepmind, and others have engaged with the popular StarCraft franchise of videogames.{{cite journal|last1=Ontanon|first1=Santiago|last2=Synnaeve|first2=Gabriel|last3=Uriarte|first3=Alberto|last4=Richoux|first4=Florian|last5=Churchill|first5=David|last6=Preuss|first6=Mike|title=A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft|journal=IEEE Transactions on Computational Intelligence and AI in Games|date=December 2013|volume=5|issue=4|pages=293–311|doi=10.1109/TCIAIG.2013.2286295|citeseerx=10.1.1.406.2524|s2cid=5014732}}{{cite magazine|title=Facebook Quietly Enters StarCraft War for AI Bots, and Loses|url=https://www.wired.com/story/facebook-quietly-enters-starcraft-war-for-ai-bots-and-loses/|access-date=7 May 2018|magazine=WIRED|date=2017|archive-date=2 February 2023|archive-url=https://web.archive.org/web/20230202181319/https://www.wired.com/story/facebook-quietly-enters-starcraft-war-for-ai-bots-and-loses/|url-status=live}}
Broad classes of outcome for an AI test may be given as:
- optimal: it is not possible to perform better (note: some of these entries were solved by humans)
- super-human: performs better than all humans
- high-human: performs better than most humans
- par-human: performs similarly to most humans
- sub-human: performs worse than most humans
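As a rough illustration, the banding above can be expressed as a comparison between an AI system's benchmark score and the distribution of human scores on the same task. The function name, tolerance, and data below are hypothetical, not drawn from any cited evaluation; "optimal" is omitted because it depends on a task's known maximum score rather than on human performance:

```python
def performance_band(ai_score, human_scores, tol=0.02):
    """Place an AI benchmark score into one of the bands above.

    Scores are accuracies in [0, 1]; `tol` is a hypothetical margin
    within which performance counts as "similar" to the typical human.
    """
    best = max(human_scores)
    median = sorted(human_scores)[len(human_scores) // 2]
    if ai_score > best:
        return "super-human"   # better than all humans
    if ai_score > median + tol:
        return "high-human"    # better than most humans
    if ai_score >= median - tol:
        return "par-human"     # similar to most humans
    return "sub-human"         # worse than most humans
```

For example, against human scores clustered around 0.91, a score of 0.99 would fall in the super-human band, while 0.92 would count as par-human under this tolerance.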
=Optimal=
{{See also|Solved game}}
- Tic-tac-toe
- Connect Four: 1988
- Checkers (aka 8x8 draughts): Weakly solved (2007){{Cite journal| last1 = Schaeffer | first1 = J.| last2 = Burch | first2 = N.| last3 = Bjornsson | first3 = Y.| last4 = Kishimoto | first4 = A.| last5 = Muller | first5 = M.| last6 = Lake | first6 = R.| last7 = Lu | first7 = P.| last8 = Sutphen | first8 = S.| title = Checkers is solved| journal = Science| volume = 317| issue = 5844| pages = 1518–1522| year = 2007| pmid = 17641166| doi = 10.1126/science.1144079| citeseerx = 10.1.1.95.5393| bibcode = 2007Sci...317.1518S| s2cid = 10274228}}
- Rubik's Cube: Mostly solved (2010){{cite web|url=http://www.cube20.org/|title=God's Number is 20|access-date=2011-08-07|archive-date=2013-07-21|archive-url=https://web.archive.org/web/20130721182026/http://www.cube20.org/|url-status=live}}
- Heads-up limit hold'em poker: Statistically optimal in the sense that "a human lifetime of play is not sufficient to establish with statistical significance that the strategy is not an exact solution" (2015){{Cite journal | doi = 10.1126/science.1259433| title = Heads-up limit hold'em poker is solved| journal = Science| volume = 347| issue = 6218| pages = 145–9| year = 2015| last1 = Bowling| first1 = M.| last2 = Burch| first2 = N.| last3 = Johanson| first3 = M.| last4 = Tammelin| first4 = O.| pmid=25574016| bibcode = 2015Sci...347..145B| citeseerx = 10.1.1.697.72| s2cid = 3796371}}
=Super-human=
- Othello (aka reversi): c. 1997
- Scrabble:{{cite magazine|title=In Major AI Breakthrough, Google System Secretly Beats Top Player at the Ancient Game of Go|url=https://www.wired.com/2016/01/in-a-huge-breakthrough-googles-ai-beats-a-top-player-at-the-game-of-go/|access-date=28 December 2017|magazine=WIRED|archive-date=2 February 2017|archive-url=https://web.archive.org/web/20170202211927/https://www.wired.com/2016/01/in-a-huge-breakthrough-googles-ai-beats-a-top-player-at-the-game-of-go/|url-status=live}}{{Cite journal| last1 = Sheppard | first1 = B.| title = World-championship-caliber Scrabble| journal = Artificial Intelligence| volume = 134| issue = 1–2| pages = 241–275| year = 2002| doi = 10.1016/S0004-3702(01)00166-7| doi-access = free}} 2006{{cite magazine|last1=Webley|first1=Kayla|title=Top 10 Man-vs.-Machine Moments|url=https://content.time.com/time/specials/packages/article/0,28804,2049187_2049195_2049083,00.html|access-date=28 December 2017|magazine=Time|date=15 February 2011|archive-date=26 December 2017|archive-url=https://web.archive.org/web/20171226005745/http://content.time.com/time/specials/packages/article/0,28804,2049187_2049195_2049083,00.html|url-status=live}}
- Backgammon: c. 1995–2002{{cite journal |last=Tesauro |first=Gerald |url=http://www.research.ibm.com/massive/tdl.html |title=Temporal difference learning and TD-Gammon |journal=Communications of the ACM |volume=38 |issue=3 |date=March 1995 |pages=58–68 |doi=10.1145/203330.203343 |s2cid=8763243 |access-date=2008-03-26 |archive-date=2013-01-11 |archive-url=https://web.archive.org/web/20130111060444/http://www.research.ibm.com/massive/tdl.html |url-status=live |doi-access=free }}{{cite journal|last1=Tesauro|first1=Gerald|title=Programming backgammon using self-teaching neural nets|journal=Artificial Intelligence|date=January 2002|volume=134|issue=1–2|pages=181–199|doi=10.1016/S0004-3702(01)00110-2|quote=...at least two other neural net programs also appear to be capable of superhuman play}}
- Chess: Supercomputer (c. 1997); Personal computer (c. 2006);{{Cite news|url=https://en.chessbase.com/post/kramnik-vs-deep-fritz-computer-wins-match-by-4-2|title=Kramnik vs Deep Fritz: Computer wins match by 4:2|date=2006-12-05|work=Chess News|access-date=2018-07-15|language=en-US|archive-date=2018-11-25|archive-url=https://web.archive.org/web/20181125095823/https://en.chessbase.com/post/kramnik-vs-deep-fritz-computer-wins-match-by-4-2|url-status=live}} Mobile phone (c. 2009);{{Cite web|url=http://theweekinchess.com/html/twic771.html#13|title=The Week in Chess 771|website=theweekinchess.com|access-date=2018-07-15|archive-date=2018-11-15|archive-url=https://web.archive.org/web/20181115020103/http://theweekinchess.com/html/twic771.html#13|url-status=live}} Computer defeats human + computer (c. 2017){{Cite web|url=http://www.infinitychess.com/Page/Public/Article/DefaultArticle.aspx?id=322|title=Zor Winner in an Exciting Photo Finish|last=Nickel|first=Arno|date=May 2017|website=www.infinitychess.com|publisher=Innovative Solutions|access-date=2018-07-17|quote=... on third place the best centaur ...|archive-date=2018-08-17|archive-url=https://web.archive.org/web/20180817111551/http://infinitychess.com/Page/Public/Article/DefaultArticle.aspx?id=322|url-status=live}}
- Jeopardy!: Question answering, although the machine did not use speech recognition (2011){{Cite news |last=Markoff |first=John |date=2011-02-16 |title=Computer Wins on 'Jeopardy!': Trivial, It's Not |language=en-US |work=The New York Times |url=https://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html |access-date=2023-02-22 |issn=0362-4331}}{{cite news|url=https://www.pcworld.com/article/219893/ibm_watson_vanquishes_human_jeopardy_foes.html|title=IBM Watson Vanquishes Human Jeopardy Foes|last=Jackson|first=Joab|publisher=PC World|agency=IDG News|access-date=2011-02-17|archive-date=2011-02-20|archive-url=https://web.archive.org/web/20110220020908/http://www.pcworld.com/article/219893/ibm_watson_vanquishes_human_jeopardy_foes.html|url-status=live}}
- Arimaa: 2015{{Cite web|url=http://arimaa.com/arimaa/challenge/|title=The Arimaa Challenge|website=arimaa.com|access-date=2018-07-15|archive-date=2010-03-22|archive-url=https://web.archive.org/web/20100322201944/http://arimaa.com/arimaa/challenge/|url-status=live}}{{cite news|last1=Roeder|first1=Oliver|title=The Bots Beat Us. Now What?|url=https://fivethirtyeight.com/features/the-bots-beat-us-now-what/|access-date=28 December 2017|work=FiveThirtyEight|date=10 July 2017|archive-date=28 December 2017|archive-url=https://web.archive.org/web/20171228171720/https://fivethirtyeight.com/features/the-bots-beat-us-now-what/|url-status=live}}
- Shogi: c. 2017
- Go: 2017{{Cite news|url=https://www.theverge.com/2017/5/25/15689462/alphago-ke-jie-game-2-result-google-deepmind-china|title=AlphaGo beats Ke Jie again to wrap up three-part match|work=The Verge|access-date=2018-07-15|archive-date=2018-07-15|archive-url=https://web.archive.org/web/20180715181631/https://www.theverge.com/2017/5/25/15689462/alphago-ke-jie-game-2-result-google-deepmind-china|url-status=live}}
- Heads-up no-limit hold'em poker: 2017{{Cite journal| last1 = Brown | first1 = Noam| last2 = Sandholm| first2 = Tuomas|title = Superhuman AI for heads-up no-limit poker: Libratus beats top professionals| journal = Science| volume = 359| issue = 6374| pages = 418–424| year = 2017| doi = 10.1126/science.aao1733| pmid = 29249696| bibcode = 2018Sci...359..418B| doi-access = free}}
- Six-player no-limit hold'em poker: 2019{{cite journal |last1=Blair |first1=Alan |last2=Saffidine |first2=Abdallah |title=AI surpasses humans at six-player poker |journal=Science |date=30 August 2019 |volume=365 |issue=6456 |pages=864–865 |doi=10.1126/science.aay7774 |pmid=31467208 |bibcode=2019Sci...365..864B |s2cid=201672421 |url=https://www.science.org/doi/10.1126/science.aay7774 |access-date=30 June 2022 |archive-date=18 July 2022 |archive-url=https://web.archive.org/web/20220718200144/https://www.science.org/doi/10.1126/science.aay7774 |url-status=live }}
- Gran Turismo Sport: 2022{{Cite news|url=https://www.theverge.com/2022/2/9/22925420/sony-ai-gran-turismo-driving-gt-sophy-nature-paper|title=Sony's new AI driver achieves 'reliably superhuman' race times in Gran Turismo|work=The Verge|access-date=2022-07-19|archive-date=2022-07-20|archive-url=https://web.archive.org/web/20220720023339/https://www.theverge.com/2022/2/9/22925420/sony-ai-gran-turismo-driving-gt-sophy-nature-paper|url-status=live}}
=High-human=
- Crosswords: c. 2012Proverb: The probabilistic cruciverbalist. By Greg A. Keim, Noam Shazeer, Michael L. Littman, Sushant Agarwal, Catherine M. Cheves, Joseph Fitzgerald, Jason Grosland, Fan Jiang, Shannon Pollard, and Karl Weinmeister. 1999. In Proceedings of the Sixteenth National Conference on Artificial Intelligence, 710-717. Menlo Park, Calif.: AAAI Press.{{cite news|title='Dr. Fill' vies for crossword solving supremacy, but still comes up short|url=http://www.pri.org/stories/2014-09-24/dr-fill-vies-crossword-solving-supremacy-still-comes-short|first=Adam|last=Wernick|date=24 Sep 2014|publisher=Public Radio International|access-date=Dec 27, 2017|quote=The first year, Dr. Fill came in 141st out of about 600 competitors. It did a little better the second-year; last year it was 65th|archive-date=2017-12-28|archive-url=https://web.archive.org/web/20171228175445/https://www.pri.org/stories/2014-09-24/dr-fill-vies-crossword-solving-supremacy-still-comes-short|url-status=live}}
- Freeciv: 2016{{cite news|title=Arago's AI can now beat some human players at complex civ strategy games.|url=https://techcrunch.com/2016/12/06/aragos-ai-can-now-beat-some-human-players-at-complex-civ-strategy-games/|access-date=20 July 2022|work=TechCrunch|date=6 December 2016|archive-date=5 June 2022|archive-url=https://web.archive.org/web/20220605151457/https://techcrunch.com/2016/12/06/aragos-ai-can-now-beat-some-human-players-at-complex-civ-strategy-games/|url-status=live}}
- Dota 2: 2018{{cite news|title=AI bots trained for 180 years a day to beat humans at Dota 2.|url=https://www.theverge.com/2018/6/25/17492918/openai-dota-2-bot-ai-five-5v5-matches|access-date=17 July 2018|work=The Verge|date=25 June 2018|archive-date=25 June 2018|archive-url=https://web.archive.org/web/20180625183203/https://www.theverge.com/2018/6/25/17492918/openai-dota-2-bot-ai-five-5v5-matches|url-status=live}}
- Bridge card-playing: According to a 2009 review, "the best programs are attaining expert status as (bridge) card players", excluding bidding.Bethe, P. M. (2009). The state of automated bridge play.
- StarCraft II: 2019{{cite web|url=https://deepmind.com/blog/alphastar-mastering-the-real-time-strategy-game-starcraft-ii|title=AlphaStar: Mastering the Real-Time Strategy Game StarCraft II|date=24 January 2019 |access-date=2022-07-19|archive-date=2022-07-22|archive-url=https://web.archive.org/web/20220722140448/https://www.deepmind.com/blog/alphastar-mastering-the-real-time-strategy-game-starcraft-ii|url-status=live}}
- Mahjong: 2019{{cite web|url=https://www.microsoft.com/en-us/research/project/suphx-mastering-mahjong-with-deep-reinforcement-learning/|title=Suphx: The World Best Mahjong AI|website=Microsoft|access-date=2022-07-19|archive-date=2022-07-19|archive-url=https://web.archive.org/web/20220719233921/https://www.microsoft.com/en-us/research/project/suphx-mastering-mahjong-with-deep-reinforcement-learning/|url-status=live}}
- Stratego: 2022{{cite news|title=Deepmind AI Researchers Introduce 'DeepNash', An Autonomous Agent Trained With Model-Free Multiagent Reinforcement Learning That Learns To Play The Game Of Stratego At Expert Level.|url=https://www.marktechpost.com/2022/07/09/deepmind-ai-researchers-introduce-deepnash-an-autonomous-agent-trained-with-model-free-multiagent-reinforcement-learning-that-learns-to-play-the-game-of-stratego-at-expert-level/|access-date=19 July 2022|work=MarkTechPost|date=9 July 2022|archive-date=9 July 2022|archive-url=https://web.archive.org/web/20220709215709/https://www.marktechpost.com/2022/07/09/deepmind-ai-researchers-introduce-deepnash-an-autonomous-agent-trained-with-model-free-multiagent-reinforcement-learning-that-learns-to-play-the-game-of-stratego-at-expert-level/|url-status=live}}
- No-Press Diplomacy: 2022{{cite arXiv |eprint=2210.05492 |title=Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning |date= 11 October 2022 |class=cs.GT |last1=Bakhtin |first1=Anton |last2=Wu |first2=David |last3=Lerer |first3=Adam |last4=Gray |first4=Jonathan |last5=Jacob |first5=Athul |last6=Farina |first6=Gabriele |last7=Miller |first7=Alexander |last8=Brown |first8=Noam }}
- Hanabi: 2022{{cite arXiv |eprint=2210.05125 |title=Human-AI Coordination via Human-Regularized Search and Learning |date= 11 October 2022 |class=cs.AI |last1=Hu |first1=Hengyuan |last2=Wu |first2=David |last3=Lerer |first3=Adam |last4=Foerster |first4=Jakob |last5=Brown |first5=Noam }}
- Natural language processing{{citation needed|date=December 2022}}
=Par-human=
- Optical character recognition for ISO 1073-1:1976 and similar special characters.{{citation needed|date=January 2023}}
- Classification of images{{cite web|url=https://venturebeat.com/2015/02/09/microsoft-researchers-say-their-newest-deep-learning-system-beats-humans-and-google/|title=Microsoft researchers say their newest deep learning system beats humans -- and Google - VentureBeat - Big Data - by Jordan Novet|work=VentureBeat|date=2015-02-10|access-date=2017-09-08|archive-date=2017-08-09|archive-url=https://web.archive.org/web/20170809060105/https://venturebeat.com/2015/02/09/microsoft-researchers-say-their-newest-deep-learning-system-beats-humans-and-google/|url-status=live}}
- Handwriting recognition{{cite arXiv |eprint=1605.06065 |title=One-shot Learning with Memory-Augmented Neural Networks |at=p. 5, Table 1 |date= 19 May 2016 |quote=4.2. Omniglot Classification: "The network exhibited high classification accuracy on just the second presentation of a sample from a class within an episode (82.8%), reaching up to 94.9% accuracy by the fifth instance and 98.1% accuracy by the tenth. |last1=Santoro |first1=Adam |last2=Bartunov |first2=Sergey |last3=Botvinick |first3=Matthew |last4= Wierstra |first4=Daan |last5=Lillicrap |first5=Timothy |class=cs.LG }}
- Facial recognition{{cite web|url=https://neurosciencenews.com/man-machine-facial-recognition-120191/|title=Man Versus Machine: Who Wins When It Comes to Facial Recognition?|work=Neuroscience News|date=2018-12-03|access-date=2022-07-20|archive-date=2022-07-20|archive-url=https://web.archive.org/web/20220720033355/https://neurosciencenews.com/man-machine-facial-recognition-120191/|url-status=live}}
- Visual question answering{{cite arXiv |eprint=2111.08896 |title=Achieving Human Parity on Visual Question Answering |date= 17 November 2021 |class=cs.CL |last1=Yan |first1=Ming |last2=Xu |first2=Haiyang |last3=Li |first3=Chenliang |last4=Tian |first4=Junfeng |last5=Bi |first5=Bin |last6=Wang |first6=Wei |last7=Chen |first7=Weihua |last8=Xu |first8=Xianzhe |last9=Wang |first9=Fan |last10=Cao |first10=Zheng |last11=Zhang |first11=Zhicheng |last12=Zhang |first12=Qiyu |last13=Zhang |first13=Ji |last14=Huang |first14=Songfang |last15=Huang |first15=Fei |last16=Si |first16=Luo |last17=Jin |first17=Rong }}
- SQuAD 2.0 English reading-comprehension benchmark (2019)
- SuperGLUE English-language understanding benchmark (2020)Zhang, D., Mishra, S., Brynjolfsson, E., Etchemendy, J., Ganguli, D., Grosz, B., ... & Perrault, R. (2021). The AI index 2021 annual report. AI Index (Stanford University). arXiv preprint arXiv:2103.06312.
- Some school science exams (2019){{cite news |last1=Metz |first1=Cade |title=A Breakthrough for A.I. Technology: Passing an 8th-Grade Science Test |url=https://www.nytimes.com/2019/09/04/technology/artificial-intelligence-aristo-passed-test.html |access-date=5 January 2023 |work=The New York Times |date=4 September 2019 |archive-date=5 January 2023 |archive-url=https://web.archive.org/web/20230105051148/https://www.nytimes.com/2019/09/04/technology/artificial-intelligence-aristo-passed-test.html |url-status=live }}
- Some tasks based on Raven's Progressive Matrices
- Many Atari 2600 games (2015){{cite magazine |last1=McMillan |first1=Robert |title=Google's AI Is Now Smart Enough to Play Atari Like the Pros |url=https://www.wired.com/2015/02/google-ai-plays-atari-like-pros/ |access-date=5 January 2023 |magazine=Wired |date=2015 |archive-date=5 January 2023 |archive-url=https://web.archive.org/web/20230105053006/https://www.wired.com/2015/02/google-ai-plays-atari-like-pros/ |url-status=live }}
=Sub-human=
{{More citations needed section|date=December 2022}}
- Optical character recognition for printed text (nearing par-human for Latin-script typewritten text)
- Object recognition{{clarify|date=December 2017}}
- Various robotics tasks that may require advances in robot hardware as well as AI, including:
- Stable bipedal locomotion: Bipedal robots can walk, but are less stable than human walkers (as of 2017){{cite news|title=Robots with legs are getting ready to walk among us|url=https://www.theverge.com/2017/2/22/14635530/bipedal-legged-robots-mobility-advantages|access-date=28 December 2017|work=The Verge|archive-date=28 December 2017|archive-url=https://web.archive.org/web/20171228171515/https://www.theverge.com/2017/2/22/14635530/bipedal-legged-robots-mobility-advantages|url-status=live}}
- Humanoid soccer{{cite news|last1=Hurst|first1=Nathan|title=Why Funny, Falling, Soccer-Playing Robots Matter|url=https://www.smithsonianmag.com/innovation/why-funny-falling-soccer-playing-robots-matter-180964260/|access-date=28 December 2017|work=Smithsonian|language=en|archive-date=28 December 2017|archive-url=https://web.archive.org/web/20171228171803/https://www.smithsonianmag.com/innovation/why-funny-falling-soccer-playing-robots-matter-180964260/|url-status=live}}
- Speech recognition: "nearly equal to human performance" (2017){{cite news|title=The Business of Artificial Intelligence|url=https://hbr.org/cover-story/2017/07/the-business-of-artificial-intelligence|access-date=28 December 2017|work=Harvard Business Review|date=18 July 2017|language=en|archive-date=29 December 2017|archive-url=https://web.archive.org/web/20171229064652/https://hbr.org/cover-story/2017/07/the-business-of-artificial-intelligence|url-status=live}}
- Explainability. Current medical systems can diagnose certain medical conditions well, but cannot explain to users why they made the diagnosis.Brynjolfsson, E., & Mitchell, T. (2017). What can machine learning do? Workforce implications. Science, 358(6370), 1530-1534.
- Many tests of fluid intelligence (2020)
- Bongard visual cognition problems, such as the Bongard-LOGO benchmark (2020){{cite journal |last1=van der Maas |first1=Han L.J. |last2=Snoek |first2=Lukas |last3=Stevenson |first3=Claire E. |title=How much intelligence is there in artificial intelligence? A 2020 update |journal=Intelligence |date=July 2021 |volume=87 |pages=101548 |doi=10.1016/j.intell.2021.101548|s2cid=236236331 |doi-access=free }}Nie, W., Yu, Z., Mao, L., Patel, A. B., Zhu, Y., & Anandkumar, A. (2020). Bongard-logo: A new benchmark for human-level concept learning and reasoning. Advances in Neural Information Processing Systems, 33, 16468-16480.
- Visual Commonsense Reasoning (VCR) benchmark (as of 2020)
- Stock market prediction: financial data collection and processing using machine-learning algorithms
- Angry Birds video game, as of 2020{{cite journal |last1=Stephenson |first1=Matthew |last2=Renz |first2=Jochen |last3=Ge |first3=Xiaoyu |title=The computational complexity of Angry Birds |journal=Artificial Intelligence |date=March 2020 |volume=280 |pages=103232 |doi=10.1016/j.artint.2019.103232 |arxiv=1812.07793 |s2cid=56475869 |quote=Despite many different attempts over the past five years the problem is still largely unsolved, with AI approaches far from human-level performance.}}
- Various tasks that are difficult to solve without contextual knowledge, including:
- Translation
- Word-sense disambiguation
Proposed tests of artificial intelligence
{{See also|Turing test#Versions}}
For his famous Turing test, Alan Turing chose language, the defining feature of human beings, as the basis.{{Turing 1950}} The Turing test is now considered too exploitable to be a meaningful benchmark.{{cite journal|last1=Schoenick|first1=Carissa|last2=Clark|first2=Peter|last3=Tafjord|first3=Oyvind|last4=Turney|first4=Peter|last5=Etzioni|first5=Oren|date=23 August 2017|title=Moving beyond the Turing Test with the Allen AI Science Challenge|journal=Communications of the ACM|volume=60|issue=9|pages=60–64|arxiv=1604.04315|doi=10.1145/3122814|s2cid=6383047}}
The Feigenbaum test, proposed by the inventor of expert systems, tests a machine's knowledge and expertise about a specific subject.{{cite journal|last=Feigenbaum|first=Edward A.|date=2003|title=Some challenges and grand challenges for computational intelligence|journal=Journal of the ACM|volume=50|issue=1|pages=32–40|doi=10.1145/602382.602400|s2cid=15379263}} A paper by Jim Gray of Microsoft in 2003 suggested extending the Turing test to speech understanding, speaking and recognizing objects and behavior.{{cite journal|last=Gray|first=Jim |year=2003|title=What Next? A Dozen Information-Technology Research Goals|journal=Journal of the ACM|volume=50|issue=1|pages=41–57|bibcode=1999cs.......11005G |arxiv=cs/9911005 |doi=10.1145/602382.602401|s2cid=10336312 }}
Proposed "universal intelligence" tests aim to compare how well machines, humans, and even non-human animals perform on problem sets that are as generic as possible. At an extreme, the test suite can contain every possible problem, weighted by Kolmogorov complexity; however, these problem sets tend to be dominated by impoverished pattern-matching exercises where a tuned AI can easily exceed human performance levels.{{cite journal|last=Hernandez-Orallo|first=Jose|year=2000|title=Beyond the Turing Test|journal=Journal of Logic, Language and Information|volume=9|issue=4|pages=447–466|doi=10.1023/A:1008367325700|s2cid=14481982}}{{cite journal|last1=Dowe|first1=D. L.|last2=Hajek|first2=A. R.|year=1997|title=A computational extension to the Turing Test|url=http://www.csse.monash.edu.au/publications/1997/tr-cs97-322-abs.html|url-status=dead|journal=Proceedings of the 4th Conference of the Australasian Cognitive Science Society|archive-url=https://web.archive.org/web/20110628194905/http://www.csse.monash.edu.au/publications/1997/tr-cs97-322-abs.html|archive-date=28 June 2011|df=dmy-all}}{{cite journal|last1=Hernandez-Orallo|first1=J.|last2=Dowe|first2=D. L.|year=2010|title=Measuring Universal Intelligence: Towards an Anytime Intelligence Test|journal=Artificial Intelligence|volume=174|issue=18|pages=1508–1539|citeseerx=10.1.1.295.9079|doi=10.1016/j.artint.2010.09.006}}{{cite journal|last1=Hernández-Orallo|first1=José|last2=Dowe|first2=David L.|last3=Hernández-Lloreda|first3=M.Victoria|date=March 2014|title=Universal psychometrics: Measuring cognitive abilities in the machine kingdom|journal=Cognitive Systems Research|volume=27|pages=50–74|doi=10.1016/j.cogsys.2013.06.001|hdl-access=free|hdl=10251/50244|s2cid=26440282}}
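The complexity-weighted aggregation described above can be sketched as follows. Compressed length is a standard computable stand-in for the uncomputable Kolmogorov complexity, and the `2**-k` weighting, function names, and data here are illustrative assumptions rather than any published test's definition:

```python
import zlib

def complexity_weight(problem: bytes) -> float:
    # Approximate the Kolmogorov complexity of a problem description by its
    # zlib-compressed length in bytes; simpler (more compressible) problems
    # receive exponentially more weight under the 2**-K scheme.
    k = len(zlib.compress(problem))
    return 2.0 ** -k

def universal_score(results: dict) -> float:
    # `results` maps problem descriptions (bytes) to scores in [0, 1].
    # The aggregate is the complexity-weighted average score.
    total = sum(complexity_weight(p) for p in results)
    return sum(complexity_weight(p) * s for p, s in results.items()) / total
```

Because short, highly regular problem descriptions dominate the total weight, a suite scored this way is skewed toward simple pattern-matching tasks, which is the objection noted above.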
==Exams==
According to OpenAI, in 2023 GPT-4 scored in the 90th percentile on the Uniform Bar Exam. On the SAT, GPT-4 scored in the 89th percentile in math and the 93rd percentile in Reading & Writing. On the GRE, it scored in the 54th percentile on the writing test, the 88th percentile on the quantitative section, and the 99th percentile on the verbal section. It scored in the 99th to 100th percentile on the 2020 USA Biology Olympiad semifinal exam, and it earned a perfect "5" on several AP exams.{{cite news |last1=Varanasi |first1=Lakshmi |title=AI models like ChatGPT and GPT-4 are acing everything from the bar exam to AP Biology. Here's a list of difficult exams both AI versions have passed. |url=https://www.businessinsider.com/list-here-are-the-exams-chatgpt-has-passed-so-far-2023-1 |access-date=22 June 2023 |work=Business Insider |date=March 2023}}
Independent researchers found in 2023 that GPT-3.5 "performed at or near the passing threshold" on all three parts of the United States Medical Licensing Examination. GPT-3.5 also attained a low but passing grade on exams for four law school courses at the University of Minnesota. GPT-4 passed a text-based radiology board–style examination.{{cite news |last1=Rudy |first1=Melissa |title=Latest version of ChatGPT passes radiology board-style exam, highlights AI's 'growing potential,' study finds |url=https://www.foxnews.com/health/latest-version-chatgpt-passes-radiology-board-style-exam-highlights-ai-growing-potential-study-finds |access-date=22 June 2023 |work=Fox News |date=24 May 2023}}{{cite journal |last1=Bhayana |first1=Rajesh |last2=Bleakney |first2=Robert R. |last3=Krishna |first3=Satheesh |title=GPT-4 in Radiology: Improvements in Advanced Reasoning |journal=Radiology |date=1 June 2023 |volume=307 |issue=5 |pages=e230987 |doi=10.1148/radiol.230987 |pmid=37191491 |s2cid=258716171 |url=https://pubs.rsna.org/doi/10.1148/radiol.230987}}
==Competitions==
{{Main|Competitions and prizes in artificial intelligence}}
Many competitions and prizes, such as the ImageNet Challenge, promote research in artificial intelligence. The most common areas of competition include general machine intelligence, conversational behavior, data mining, robotic cars, and robot soccer, as well as conventional games.{{Cite web|title=ILSVRC2017|url=http://image-net.org/challenges/LSVRC/2017/|access-date=2018-11-06|website=image-net.org|language=en|archive-date=2018-11-02|archive-url=https://web.archive.org/web/20181102131747/http://www.image-net.org/challenges/LSVRC/2017/|url-status=live}}
==Past and current predictions==
An expert poll around 2016, conducted by Katja Grace of the Future of Humanity Institute and associates, gave median estimates of 3 years for championship Angry Birds, 4 years for the World Series of Poker, and 6 years for StarCraft. On more subjective tasks, the poll gave 6 years for folding laundry as well as an average human worker does, 7–10 years for expertly answering 'easily Googleable' questions, 8 years for average-quality speech transcription, 9 years for average telephone banking, and 11 years for expert songwriting, but over 30 years for writing a New York Times bestseller or winning the Putnam math competition.
===Chess===
An AI first defeated a grandmaster, Bent Larsen, in a regulation tournament game in 1988; the system, later rebranded as Deep Blue, beat the reigning human world chess champion in 1997 (see Deep Blue versus Garry Kasparov).{{cite news|last1=McClain|first1=Dylan Loeb|title=Bent Larsen, Chess Grandmaster, Dies at 75|url=https://www.nytimes.com/2010/09/11/world/americas/11larsen.html|access-date=31 January 2018|work=The New York Times|date=11 September 2010|archive-date=25 March 2014|archive-url=https://archive.today/20140325101752/http://www.nytimes.com/2010/09/11/world/americas/11larsen.html|url-status=live}}
===Go===
AlphaGo defeated a European Go champion in October 2015, and Lee Sedol, one of the world's top players, in March 2016 (see AlphaGo versus Lee Sedol). According to Scientific American and other sources, most observers had expected superhuman computer Go performance to be at least a decade away.{{cite news|last1=Koch|first1=Christof|title=How the Computer Beat the Go Master|url=https://www.scientificamerican.com/article/how-the-computer-beat-the-go-master/|access-date=31 January 2018|work=Scientific American|date=2016|language=en|archive-date=6 September 2017|archive-url=https://web.archive.org/web/20170906224946/https://www.scientificamerican.com/article/how-the-computer-beat-the-go-master/|url-status=live}}{{cite news|title='I'm in shock!' How an AI beat the world's best human at Go|url=https://www.newscientist.com/article/2079871-im-in-shock-how-an-ai-beat-the-worlds-best-human-at-go/|access-date=31 January 2018|work=New Scientist|date=2016|archive-date=13 May 2016|archive-url=https://web.archive.org/web/20160513193612/https://www.newscientist.com/article/2079871-im-in-shock-how-an-ai-beat-the-worlds-best-human-at-go/|url-status=live}}{{cite news|last1=Moyer|first1=Christopher|title=How Google's AlphaGo Beat a Go World Champion|url=https://www.theatlantic.com/technology/archive/2016/03/the-invisible-opponent/475611/|access-date=31 January 2018|work=The Atlantic|date=2016|archive-date=31 January 2018|archive-url=https://web.archive.org/web/20180131200838/https://www.theatlantic.com/technology/archive/2016/03/the-invisible-opponent/475611/|url-status=live}}
===Human-level artificial general intelligence (AGI)===
AI pioneer and economist Herbert A. Simon inaccurately predicted in 1965: "Machines will be capable, within twenty years, of doing any work a man can do". Similarly, in 1970 Marvin Minsky wrote that "Within a generation... the problem of creating artificial intelligence will substantially be solved."{{cite book|last1=Bostrom|first1=Nick|title=Superintelligence|date=2013|publisher=Oxford University Press|location=Oxford|isbn=978-0199678112|language=en|title-link=Superintelligence (book)}}
Four polls conducted in 2012 and 2013 suggested that the median estimate among experts for when AGI would arrive was 2040 to 2050, depending on the poll.{{cite magazine|last1=Khatchadourian|first1=Raffi|title=The Doomsday Invention|url=https://www.newyorker.com/magazine/2015/11/23/doomsday-invention-artificial-intelligence-nick-bostrom|access-date=31 January 2018|magazine=The New Yorker|date=16 November 2015|archive-date=29 April 2019|archive-url=https://web.archive.org/web/20190429183807/https://www.newyorker.com/magazine/2015/11/23/doomsday-invention-artificial-intelligence-nick-bostrom|url-status=live}}Müller, V. C., & Bostrom, N. (2016). Future progress in artificial intelligence: A survey of expert opinion. In Fundamental issues of artificial intelligence (pp. 555-572). Springer, Cham.
The Grace poll around 2016 found results varied depending on how the question was framed. Respondents asked to estimate "when unaided machines can accomplish every task better and more cheaply than human workers" gave an aggregated median answer of 45 years and a 10% chance of it occurring within 9 years. Other respondents asked to estimate "when all occupations are fully automatable. That is, when for any occupation, machines could be built to carry out the task better and more cheaply than human workers" estimated a median of 122 years and a 10% probability of 20 years. The median response for when "AI researcher" could be fully automated was around 90 years. No link was found between seniority and optimism, but Asian researchers were much more optimistic than North American researchers on average; Asians predicted 30 years on average for "accomplish every task", compared with the 74 years predicted by North Americans.{{cite news|last1=Gray|first1=Richard|title=How long will it take for your job to be automated?|url=http://www.bbc.com/capital/story/20170619-how-long-will-it-take-for-your-job-to-be-automated|access-date=31 January 2018|work=BBC|date=2018|language=en|archive-date=11 January 2018|archive-url=https://web.archive.org/web/20180111134529/http://www.bbc.com/capital/story/20170619-how-long-will-it-take-for-your-job-to-be-automated|url-status=live}}{{cite news|title=AI will be able to beat us at everything by 2060, say experts|url=https://www.newscientist.com/article/2133188-ai-will-be-able-to-beat-us-at-everything-by-2060-say-experts/|access-date=31 January 2018|work=New Scientist|date=2018|archive-date=31 January 2018|archive-url=https://web.archive.org/web/20180131202306/https://www.newscientist.com/article/2133188-ai-will-be-able-to-beat-us-at-everything-by-2060-say-experts/|url-status=live}}Grace, K., Salvatier, J., Dafoe, A., Zhang, B., & Evans, O. (2017). When will AI exceed human performance? Evidence from AI experts. arXiv preprint arXiv:1705.08807.
==See also==
==References==
{{Reflist|30em}}
==Notes==
{{Reflist|group=note}}
==External links==
* [https://aiimpacts.org/miri-ai-predictions-dataset/ MIRI database of predictions about AGI]
{{emerging technologies|topics=yes|infocom=yes}}