EleutherAI
{{Short description|Artificial intelligence research collective}}
{{Use dmy dates|date=March 2023}}
{{Infobox website
| name = EleutherAI
| logo = EleutherAI logo.svg
| logo_size = 200px
| company_type = Research co-operative
| founded = {{start date and age|df=y|2020|7|03}}
| founder =
| industry = Artificial intelligence
| products = GPT-Neo, GPT-NeoX, GPT-J, Pythia, The Pile, VQGAN-CLIP
| website = [https://eleuther.ai eleuther.ai]
}}
{{Artificial intelligence}}
EleutherAI ({{IPAc-en|ə|ˈ|l|uː|θ|ər}}{{cite web |date=2021-04-02 |title=Talk with Stella Biderman on The Pile, GPT-Neo and MTG |url=https://www.youtube.com/watch?v=SpiXLeLkpcw&list=WL&index=4&ab_channel=TheInferencePodcast |accessdate=2023-03-26 |publisher=The Inference Podcast}}) is a grass-roots non-profit artificial intelligence (AI) research group. The group, considered an open-source version of OpenAI,{{cite web |last=Smith |first=Craig |date=2022-03-21 |title=EleutherAI: When OpenAI Isn't Open Enough |url=https://spectrum.ieee.org/eleutherai-openai-not-open-enough |accessdate=2023-08-08 |work=IEEE Spectrum |publisher=IEEE |archive-date=29 August 2023 |archive-url=https://web.archive.org/web/20230829225345/https://spectrum.ieee.org/eleutherai-openai-not-open-enough |url-status=live }} was formed in a Discord server in July 2020 by Connor Leahy, Sid Black, and Leo Gao{{Cite web |title=About |url=https://www.eleuther.ai/about |access-date=2024-05-23 |website=EleutherAI |language=en-GB}} to organize a replication of GPT-3. In early 2023, it formally incorporated as the EleutherAI Institute, a non-profit research institute.{{Cite web |last=Wiggers |first=Kyle |date=2023-03-02 |title=Stability AI, Hugging Face and Canva back new AI research nonprofit |url=https://techcrunch.com/2023/03/02/stability-ai-hugging-face-and-canva-back-new-ai-research-nonprofit/ |access-date=2023-08-08 |website=TechCrunch |language=en-US |archive-date=29 August 2023 |archive-url=https://web.archive.org/web/20230829225347/https://techcrunch.com/2023/03/02/stability-ai-hugging-face-and-canva-back-new-ai-research-nonprofit/ |url-status=live }}
History
EleutherAI began as a Discord server on 7 July 2020, under the tentative name "LibreAI", before rebranding to "EleutherAI" later that month,{{Cite web |last1=Leahy |first1=Connor |last2=Hallahan |first2=Eric |last3=Gao |first3=Leo |last4=Biderman |first4=Stella |date=2021-07-07 |title=What A Long, Strange Trip It's Been: EleutherAI One Year Retrospective |url=https://blog.eleuther.ai/year-one/ |access-date=2023-04-14 |website=EleutherAI Blog |language=en |archive-date=29 August 2023 |archive-url=https://web.archive.org/web/20230829225353/https://blog.eleuther.ai/year-one/ |url-status=live }} in reference to eleutheria, the Greek word for liberty. Its founding members are Connor Leahy, Leo Gao, and Sid Black, who co-wrote the group's initial codebase to organize open-source AI research and to build a machine learning model similar to GPT-3.{{cite web | url=https://techcrunch.com/2023/03/02/stability-ai-hugging-face-and-canva-back-new-ai-research-nonprofit/ | title=Stability AI, Hugging Face and Canva back new AI research nonprofit | date=2 March 2023 }}
On 30 December 2020, EleutherAI released The Pile, a curated dataset of diverse text for training large language models.{{cite conference |title=The Pile: An 800GB Dataset of Diverse Text for Language Modeling |conference=arXiv 2101.00027|date=2020-12-31 |last1=Gao |first1=Leo |last2=Biderman |first2=Stella |last3=Black |first3=Sid |arxiv=2101.00027 |display-authors=etal }} While the paper referenced the existence of the GPT-Neo models, the models themselves were not released until 21 March 2021.{{Cite web |date=2021-05-15 |title=GPT-3's free alternative GPT-Neo is something to be excited about |url=https://venturebeat.com/ai/gpt-3s-free-alternative-gpt-neo-is-something-to-be-excited-about/ |access-date=2023-04-14 |website=VentureBeat |language=en-US |archive-date=9 March 2023 |archive-url=https://web.archive.org/web/20230309012717/https://venturebeat.com/ai/gpt-3s-free-alternative-gpt-neo-is-something-to-be-excited-about/ |url-status=live }} According to a retrospective written several months later, the authors did not anticipate that "people would care so much about our 'small models.{{'"}}{{cite news |title=What A Long, Strange Trip It's Been: EleutherAI One Year Retrospective |last1=Leahy |first1=Connor |last2=Hallahan |first2=Eric |last3=Gao |first3=Leo |last4=Biderman |first4=Stella |date=2021-07-07 |url=https://blog.eleuther.ai/year-one/ |access-date=1 March 2023 |archive-date=29 August 2023 |archive-url=https://web.archive.org/web/20230829225353/https://blog.eleuther.ai/year-one/ |url-status=live }} On 9 June 2021, EleutherAI followed this up with GPT-J-6B, a six-billion-parameter language model that was again the largest open-source GPT-3-like model in the world.{{Cite web |title=GPT-J-6B: An Introduction to the Largest Open Source GPT Model {{!}} Forefront |url=https://www.forefront.ai/blog-posts/gpt-j-6b-an-introduction-to-the-largest-open-sourced-gpt-model |access-date=2023-03-01 |website=www.forefront.ai |archive-date=9 March 2023 |archive-url=https://web.archive.org/web/20230309205439/https://www.forefront.ai/blog-posts/gpt-j-6b-an-introduction-to-the-largest-open-sourced-gpt-model |url-status=dead }} These language models were released under the Apache 2.0 free software license and are considered to have "fueled an entirely new wave of startups".
While EleutherAI initially turned down funding offers, preferring to use Google's TPU Research Cloud Program to source their compute,{{Cite web|title=EleutherAI: When OpenAI Isn't Open Enough|url=https://spectrum.ieee.org/eleutherai-openai-not-open-enough|access-date=2023-03-01|website=IEEE Spectrum|archive-date=21 March 2023|archive-url=https://web.archive.org/web/20230321062859/https://spectrum.ieee.org/eleutherai-openai-not-open-enough|url-status=live}} by early 2021 they had accepted funding from CoreWeave (a small cloud computing company) and SpellML (a cloud infrastructure company) in the form of access to the powerful GPU clusters that are necessary for large-scale machine learning research. On 10 February 2022, they released GPT-NeoX-20B, a model similar to their prior work but scaled up thanks to the resources CoreWeave provided.{{cite arXiv|last1=Black|first1=Sid|last2=Biderman|first2=Stella|last3=Hallahan|first3=Eric|display-authors=etal|date=14 April 2022|title=GPT-NeoX-20B: An Open-Source Autoregressive Language Model|eprint=2204.06745|class=cs.CL}}
In 2022, many EleutherAI members participated in the BigScience Research Workshop, working on projects including multitask finetuning,{{cite arXiv | eprint=2110.08207 | last1=Sanh | first1=Victor | last2=Webson | first2=Albert | last3=Raffel | first3=Colin | last4=Bach | first4=Stephen H. | last5=Sutawika | first5=Lintang | last6=Alyafeai | first6=Zaid | last7=Chaffin | first7=Antoine | last8=Stiegler | first8=Arnaud | author9=Teven Le Scao | last10=Raja | first10=Arun | last11=Dey | first11=Manan | author12=M Saiful Bari | last13=Xu | first13=Canwen | last14=Thakker | first14=Urmish | author15=Shanya Sharma Sharma | last16=Szczechla | first16=Eliza | last17=Kim | first17=Taewoon | last18=Chhablani | first18=Gunjan | last19=Nayak | first19=Nihal | last20=Datta | first20=Debajyoti | last21=Chang | first21=Jonathan | author22=Mike Tian-Jian Jiang | last23=Wang | first23=Han | last24=Manica | first24=Matteo | last25=Shen | first25=Sheng | author26=Zheng Xin Yong | last27=Pandey | first27=Harshit | last28=Bawden | first28=Rachel | last29=Wang | first29=Thomas | last30=Neeraj | first30=Trishala | title=Multitask Prompted Training Enables Zero-Shot Task Generalization | date=2021 | class=cs.LG | display-authors=1 }}{{cite arXiv | eprint=2211.01786 | last1=Muennighoff | first1=Niklas | last2=Wang | first2=Thomas | last3=Sutawika | first3=Lintang | last4=Roberts | first4=Adam | last5=Biderman | first5=Stella | author6=Teven Le Scao | author7=M Saiful Bari | last8=Shen | first8=Sheng | last9=Yong | first9=Zheng-Xin | last10=Schoelkopf | first10=Hailey | last11=Tang | first11=Xiangru | last12=Radev | first12=Dragomir | author13=Alham Fikri Aji | last14=Almubarak | first14=Khalid | last15=Albanie | first15=Samuel | last16=Alyafeai | first16=Zaid | last17=Webson | first17=Albert | last18=Raff | first18=Edward | last19=Raffel | first19=Colin | title=Crosslingual Generalization through Multitask Finetuning | date=2022 | class=cs.CL }} training BLOOM,{{cite arXiv | eprint=2211.05100 | last1=Workshop | first1=BigScience | author2=Teven Le Scao | last3=Fan | first3=Angela | last4=Akiki | first4=Christopher | last5=Pavlick | first5=Ellie | last6=Ilić | first6=Suzana | last7=Hesslow | first7=Daniel | last8=Castagné | first8=Roman | author9=Alexandra Sasha Luccioni | last10=Yvon | first10=François | last11=Gallé | first11=Matthias | last12=Tow | first12=Jonathan | last13=Rush | first13=Alexander M. | last14=Biderman | first14=Stella | last15=Webson | first15=Albert | author16=Pawan Sasanka Ammanamanchi | last17=Wang | first17=Thomas | last18=Sagot | first18=Benoît | last19=Muennighoff | first19=Niklas | author20=Albert Villanova del Moral | last21=Ruwase | first21=Olatunji | last22=Bawden | first22=Rachel | last23=Bekman | first23=Stas | last24=McMillan-Major | first24=Angelina | last25=Beltagy | first25=Iz | last26=Nguyen | first26=Huu | last27=Saulnier | first27=Lucile | last28=Tan | first28=Samson | author29=Pedro Ortiz Suarez | last30=Sanh | first30=Victor | title=BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | date=2022 | class=cs.CL | display-authors=1 }} and designing evaluation libraries. Engineers at EleutherAI, Stability AI, and NVIDIA joined forces with biologists at Columbia University and Harvard University{{cite web | url=https://cbirt.net/meet-openfold-reimplementing-alphafold2-to-illuminate-its-learning-mechanisms-and-generalization/ | title=Meet OpenFold: Reimplementing AlphaFold2 to Illuminate Its Learning Mechanisms and Generalization | date=21 August 2023 }}
to train OpenFold, an open-source replication of DeepMind's AlphaFold2.{{cite web | url=https://wandb.ai/openfold/openfold/reports/Democratizing-AI-for-Biology-with-OpenFold--VmlldzoyODUyNDI4 | title=Democratizing AI for Biology with OpenFold }}
In early 2023, EleutherAI incorporated as a non-profit research institute run by Stella Biderman, Curtis Huebner, and Shivanshu Purohit. The announcement stated that the shift of focus away from training larger language models was part of a deliberate push towards work in interpretability, alignment, and scientific research.{{cite web | url=https://blog.eleuther.ai/year-two-preface/ | title=The View from 30,000 Feet: Preface to the Second EleutherAI Retrospective | date=2 March 2023 }} While EleutherAI remains committed to promoting access to AI technologies, they feel that "there is substantially more interest in training and releasing LLMs than there once was," enabling them to focus on other projects.{{cite web | url=https://thenonprofittimes.com/technology/ai-research-lab-launches-open-source-research-nonprofit/ | title=AI Research Lab Launches Open Source Research Nonprofit }}
In July 2024, an investigation by Proof News found that EleutherAI's The Pile dataset includes subtitles from over 170,000 YouTube videos across more than 48,000 channels. The findings drew criticism and accusations of theft from YouTubers and others whose work had been published on the platform.{{cite magazine | first1=Annie | last1=Gilbertson | first2=Alex | last2=Reisner | title=Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI | magazine=WIRED | date=2024-07-16 | url=https://www.wired.com/story/youtube-training-data-apple-nvidia-anthropic/ | access-date=2024-07-18 }}{{cite web | last=Gilbertson | first=Annie | title=Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI | website=Proof | date=2024-07-16 | url=https://www.proofnews.org/apple-nvidia-anthropic-used-thousands-of-swiped-youtube-videos-to-train-ai/ | access-date=2024-07-18}} As of 2025, Stella Biderman served as executive director, Aviya Skowron as head of policy and ethics, Nora Belrose as head of interpretability, and Quentin Anthony as head of high-performance computing (HPC).{{cite web | url=https://www.eleuther.ai/staff | title=Staff }}
Research
{{Primary sources|date=August 2023}}
According to their website, EleutherAI is a "decentralized grassroots collective of volunteer researchers, engineers, and developers focused on AI alignment, scaling, and open-source AI research".{{cite web |title=EleutherAI Website |url=https://eleuther.ai/ |access-date=1 July 2021 |publisher=EleutherAI |archive-date=2 July 2021 |archive-url=https://web.archive.org/web/20210702103521/https://www.eleuther.ai/ |url-status=live }} While they do not sell any of their technologies as products, they publish the results of their research in academic venues, write blog posts detailing their ideas and methodologies, and provide trained models for anyone to use for free.{{Citation needed|date=March 2023}}
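As a minimal sketch of this free availability (assuming the Hugging Face <code>transformers</code> library and its hosted copy of the 125-million-parameter GPT-Neo model; the prompt and generation settings are arbitrary):
<syntaxhighlight lang="python">
# Sketch: download EleutherAI's freely released GPT-Neo 125M from the
# Hugging Face Hub and generate a short continuation of a prompt.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-125M")
result = generator("EleutherAI is", max_new_tokens=20)
print(result[0]["generated_text"])
</syntaxhighlight>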
= The Pile =
{{Main|The Pile (dataset)}}
The Pile is an 886 GB dataset designed for training large language models. It was originally developed to train EleutherAI's GPT-Neo models but has become widely used to train other models, including Microsoft's Megatron-Turing Natural Language Generation,{{Cite web|url=https://venturebeat.com/ai/microsoft-and-nvidia-team-up-to-train-one-of-the-worlds-largest-language-models/|title=Microsoft and Nvidia team up to train one of the world's largest language models|date=11 October 2021|access-date=8 March 2023|archive-date=27 March 2023|archive-url=https://web.archive.org/web/20230327211519/https://venturebeat.com/ai/microsoft-and-nvidia-team-up-to-train-one-of-the-worlds-largest-language-models/|url-status=live}}{{Cite web|url=https://lifearchitect.ai/megatron/|title=AI: Megatron the Transformer, and its related language models|date=24 September 2021|accessdate=8 March 2023|archive-date=4 March 2023|archive-url=https://web.archive.org/web/20230304115241/https://lifearchitect.ai/megatron/|url-status=live}} Meta AI's Open
Pre-trained Transformers,{{cite arXiv|first1=Susan|last1=Zhang|first2=Stephen|last2=Roller|first3=Naman|last3=Goyal|first4=Mikel|last4=Artetxe|first5=Moya|last5=Chen|first6=Shuohui|last6=Chen|first7=Christopher|last7=Dewan|first8=Mona|last8=Diab|first9=Xian|last9=Li|first10=Xi Victoria|last10=Lin|first11=Todor|last11=Mihaylov|first12=Myle|last12=Ott|first13=Sam|last13=Shleifer|first14=Kurt|last14=Shuster|first15=Daniel|last15=Simig|first16=Punit Singh|last16=Koura|first17=Anjali|last17=Sridhar|first18=Tianlu|last18=Wang|first19=Luke|last19=Zettlemoyer|eprint=2205.01068|title=OPT: Open Pre-trained Transformer Language Models|class=cs.CL|date=21 June 2022}} LLaMA,{{cite arXiv|last1=Touvron|first1=Hugo|last2=Lavril|first2=Thibaut|last3=Izacard|first3=Gautier|last4=Grave|first4=Edouard|last5=Lample|first5=Guillaume|display-authors=etal|eprint=2302.13971|title=LLaMA: Open and Efficient Foundation Language Models|class=cs.CL|date=27 February 2023}} and Galactica,{{cite arXiv|last1=Taylor|first1=Ross|last2=Kardas|first2=Marcin|last3=Cucurull|first3=Guillem|last4=Scialom|first4=Thomas|last5=Hartshorn|first5=Anthony|last6=Saravia|first6=Elvis|last7=Poulton|first7=Andrew|last8=Kerkez|first8=Viktor|last9=Stojnic|first9=Robert|eprint=2211.09085|title=Galactica: A Large Language Model for Science|class=cs.CL|date=16 November 2022}} Stanford University's BioMedLM 2.7B,{{Cite web|url=https://huggingface.co/stanford-crfm/BioMedLM|title=Model Card for BioMedLM 2.7B|website=huggingface.co|accessdate=5 June 2023|archive-date=5 June 2023|archive-url=https://web.archive.org/web/20230605175035/https://huggingface.co/stanford-crfm/BioMedLM|url-status=live}} the Beijing Academy of Artificial Intelligence's
Chinese-Transformer-XL,{{cite journal |last1=Yuan |first1=Sha |last2=Zhao |first2=Hanyu |last3=Du |first3=Zhengxiao |last4=Ding |first4=Ming |last5=Liu |first5=Xiao |last6=Cen |first6=Yukuo |last7=Zou |first7=Xu |last8=Yang |first8=Zhilin |last9=Tang |first9=Jie |title=WuDaoCorpora: A super large-scale Chinese corpora for pre-training language models |journal=AI Open |date=2021 |volume=2 |pages=65–68 |doi=10.1016/j.aiopen.2021.06.001 |doi-access=free }} and Yandex's YaLM 100B.{{cite press release |last=Grabovskiy|first=Ilya|date=2022|title=Yandex publishes YaLM 100B, the largest GPT-like neural network in open source|url=https://yandex.com/company/press_center/press_releases/2022/2022-23-06|location= |publisher=Yandex|access-date=5 June 2023}} Compared to other datasets, the Pile's main distinguishing features are that it is a curated selection of data chosen by researchers at EleutherAI to contain information they thought language models should learn and that it is the only such dataset that is thoroughly documented by the researchers who developed it.{{cite journal |last1=Khan |first1=Mehtab |last2=Hanna |first2=Alex |title=The Subjects and Stages of AI Dataset Development: A Framework for Dataset Accountability |journal=Ohio State Technology Law Journal |date=2023 |volume=19 |issue=2 |pages=171–256 |hdl=1811/103549 |hdl-access=free |ssrn=4217148 }}
= GPT models =
EleutherAI's most prominent research relates to its work to train open-source large language models inspired by OpenAI's GPT-3.{{Cite web|url=https://venturebeat.com/ai/gpt-3s-free-alternative-gpt-neo-is-something-to-be-excited-about/|title=GPT-3's free alternative GPT-Neo is something to be excited about|date=15 May 2021|access-date=10 March 2023|archive-date=9 March 2023|archive-url=https://web.archive.org/web/20230309012717/https://venturebeat.com/ai/gpt-3s-free-alternative-gpt-neo-is-something-to-be-excited-about/|url-status=live}} Through its "GPT-Neo" model series and its successors, EleutherAI has released models with 125 million, 1.3 billion, 2.7 billion, 6 billion, and 20 billion parameters:
- GPT-Neo (125M, 1.3B, 2.7B):{{cite report |type=Preprint |last1=Andonian |first1=Alex |last2=Biderman |first2=Stella |last3=Black |first3=Sid |last4=Gali |first4=Preetham |last5=Gao |first5=Leo |last6=Hallahan |first6=Eric |last7=Levy-Kramer |first7=Josh |last8=Leahy |first8=Connor |last9=Nestler |first9=Lucas |last10=Parker |first10=Kip |last11=Pieler |first11=Michael |last12=Purohit |first12=Shivanshu |last13=Songz |first13=Tri |last14=Phil |first14=Wang |last15=Weinbach |first15=Samuel |title=GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch |date=10 March 2023 |doi=10.5281/zenodo.5879544 }} released in March 2021, it was the largest open-source GPT-3-style language model in the world at the time of release.
- GPT-J (6B):{{Cite web|url=https://huggingface.co/EleutherAI/gpt-j-6B|title=EleutherAI/gpt-j-6B · Hugging Face|website=huggingface.co|access-date=10 March 2023|archive-date=12 March 2023|archive-url=https://web.archive.org/web/20230312092356/https://huggingface.co/EleutherAI/gpt-j-6B|url-status=live}} released in June 2021, it was the largest open-source GPT-3-style language model in the world at the time of release.{{Cite web|url=https://www.forefront.ai/blog-posts/gpt-j-6b-an-introduction-to-the-largest-open-sourced-gpt-model|title=GPT-J-6B: An Introduction to the Largest Open Source GPT Model {{!}} Forefront|website=www.forefront.ai|access-date=1 March 2023|archive-date=9 March 2023|archive-url=https://web.archive.org/web/20230309205439/https://www.forefront.ai/blog-posts/gpt-j-6b-an-introduction-to-the-largest-open-sourced-gpt-model|url-status=dead}}
- GPT-NeoX (20B):{{cite conference |title=GPT-NeoX-20B: An Open-Source Autoregressive Language Model |conference=Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models |date=2022-05-01 |last1=Black |first1=Sidney |last2=Biderman |first2=Stella |last3=Hallahan |first3=Eric |display-authors=etal |pages=95–136 |doi=10.18653/v1/2022.bigscience-1.9 |via=Association for Computational Linguistics - Anthology |url=https://aclanthology.org/2022.bigscience-1.9/ |access-date=2022-12-19 |arxiv=2204.06745 }} released in February 2022, it was the largest open-source language model in the world at the time of release.
- Pythia (12B):{{cite arXiv | eprint=2304.01373 | last1=Biderman | first1=Stella | last2=Schoelkopf | first2=Hailey | last3=Anthony | first3=Quentin | last4=Bradley | first4=Herbie | last5=O'Brien | first5=Kyle | last6=Hallahan | first6=Eric | author7=Mohammad Aflah Khan | last8=Purohit | first8=Shivanshu | author9=USVSN Sai Prashanth | last10=Raff | first10=Edward | last11=Skowron | first11=Aviya | last12=Sutawika | first12=Lintang | author13=Oskar van der Wal | title=Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | date=2023 | class=cs.CL }} While prior models focused on scaling up to close the gap with closed-source models like GPT-3, the Pythia model suite goes in another direction. The Pythia suite was designed to facilitate scientific research on the capabilities of and learning processes in large language models. Featuring 154 partially trained model checkpoints, fully public training data, and the ability to reproduce the exact training order, Pythia enables research on verifiable training,{{cite arXiv | eprint=2307.00682 | last1=Choi | first1=Dami | last2=Shavit | first2=Yonadav | last3=Duvenaud | first3=David | title=Tools for Verifying Neural Models' Training Data | date=2023 | class=cs.LG }} social biases, memorization,{{cite arXiv | eprint=2304.11158 | last1=Biderman | first1=Stella | author2=USVSN Sai Prashanth | last3=Sutawika | first3=Lintang | last4=Schoelkopf | first4=Hailey | last5=Anthony | first5=Quentin | last6=Purohit | first6=Shivanshu | last7=Raff | first7=Edward | title=Emergent and Predictable Memorization in Large Language Models | date=2023 | class=cs.CL }} and more.{{cite arXiv | eprint=2308.04014 | last1=Gupta | first1=Kshitij | last2=Thérien | first2=Benjamin | last3=Ibrahim | first3=Adam | last4=Richter | first4=Mats L. | last5=Anthony | first5=Quentin | last6=Belilovsky | first6=Eugene | last7=Rish | first7=Irina | last8=Lesort | first8=Timothée | title=Continual Pre-Training of Large Language Models: How to (Re)warm your model? | date=2023 | class=cs.CL }} A minimal example of loading one of these checkpoints appears after this list.
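Pythia's intermediate checkpoints are published as Git revisions alongside the final weights on the Hugging Face Hub, so any saved stage of training can be loaded by name. A minimal sketch, assuming the <code>transformers</code> library and the revision naming used on the Pythia model cards (the "step3000" checkpoint and the prompt are arbitrary choices):
<syntaxhighlight lang="python">
# Sketch: load an intermediate Pythia checkpoint (training step 3000)
# rather than the final weights, as done in training-dynamics research.
from transformers import GPTNeoXForCausalLM, AutoTokenizer

model = GPTNeoXForCausalLM.from_pretrained(
    "EleutherAI/pythia-70m", revision="step3000"
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")

inputs = tokenizer("The Pile is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output[0]))
</syntaxhighlight>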
= VQGAN-CLIP =
[[File:Scenic_Valley_in_the_Afternoon_Artistic_(VQGAN+CLIP).jpg|thumb|Artwork created with VQGAN-CLIP, a text-to-image model created by EleutherAI]]
[[File:Gothic Amethyst Restaurant Interior at Night (CLIP-Guided Diffusion).jpg|thumb|Artwork created with CLIP-Guided Diffusion, another text-to-image technique developed at EleutherAI]]
Following the release of DALL-E by OpenAI in January 2021, EleutherAI started working on text-to-image synthesis models. When OpenAI did not release DALL-E publicly, EleutherAI's Katherine Crowson and digital artist Ryan Murdock developed a technique for using CLIP (another model developed by OpenAI) to convert regular image generation models into text-to-image synthesis ones.{{Cite web|url=https://ljvmiranda921.github.io/notebook/2021/08/08/clip-vqgan/|title=The Illustrated VQGAN|first=LJ|last=MIRANDA|website=ljvmiranda921.github.io|date=8 August 2021 |accessdate=8 March 2023|archive-date=20 March 2023|archive-url=https://web.archive.org/web/20230320163030/https://ljvmiranda921.github.io/notebook/2021/08/08/clip-vqgan/|url-status=live}}{{Cite web|url=https://www.nylon.com/life/images-ai-art-twitter-machine-learning|title=Inside The World of Uncanny AI Twitter Art|website=Nylon|date=24 March 2022 |accessdate=8 March 2023|archive-date=29 August 2023|archive-url=https://web.archive.org/web/20230829225921/https://www.nylon.com/life/images-ai-art-twitter-machine-learning|url-status=live}}{{Cite web|url=https://www.yahoo.com/lifestyle/ai-turns-movie-text-descriptions-152546287.html|title=This AI Turns Movie Text Descriptions Into Abstract Posters|website=Yahoo Life|date=20 September 2021 |accessdate=8 March 2023|archive-date=27 December 2022|archive-url=https://web.archive.org/web/20221227171626/https://www.yahoo.com/lifestyle/ai-turns-movie-text-descriptions-152546287.html|url-status=live}}{{Cite web|url=https://www.theregister.com/2021/08/22/in_brief_ai/|title=A man spent a year in jail on a murder charge involving disputed AI evidence. Now the case has been dropped|first=Katyanna|last=Quach|website=www.theregister.com|accessdate=8 March 2023|archive-date=8 March 2023|archive-url=https://web.archive.org/web/20230308173730/https://www.theregister.com/2021/08/22/in_brief_ai/|url-status=live}} Building on ideas dating back to Google's DeepDream,{{Cite web|url=https://ml.berkeley.edu/blog/posts/clip-art/|title=Alien Dreams: An Emerging Art Scene - ML@B Blog|website=Alien Dreams: An Emerging Art Scene - ML@B Blog|accessdate=8 March 2023|archive-date=10 March 2023|archive-url=https://web.archive.org/web/20230310061548/https://ml.berkeley.edu/blog/posts/clip-art/|url-status=live}} they found their first major success by combining CLIP with another publicly available model called VQGAN; they named the resulting model VQGAN-CLIP.{{Cite web |title=VQGAN-CLIP |url=https://www.eleuther.ai/artifacts/vqgan-clip |access-date=2023-08-20 |website=EleutherAI |language=en-GB |archive-date=20 August 2023 |archive-url=https://web.archive.org/web/20230820072140/https://www.eleuther.ai/artifacts/vqgan-clip |url-status=live }} Crowson released the technology by tweeting notebooks demonstrating the technique, which anyone could run for free without any special equipment.{{Cite news|url=https://www.abc.net.au/news/science/2021-07-15/ai-art-tool-makes-paintings-of-australia/100288386|title=We asked an AI tool to 'paint' images of Australia. Critics say they're good enough to sell|newspaper=ABC News |date=14 July 2021|accessdate=8 March 2023|via=www.abc.net.au|archive-date=7 March 2023|archive-url=https://web.archive.org/web/20230307043918/https://www.abc.net.au/news/science/2021-07-15/ai-art-tool-makes-paintings-of-australia/100288386|url-status=live}}{{Cite web|url=https://analyticsindiamag.com/online-tools-to-create-mind-blowing-ai-art/|title=Online tools to create mind-blowing AI art|first=Poornima|last=Nataraj|date=28 February 2022|website=Analytics India Magazine|accessdate=8 March 2023|archive-date=8 February 2023|archive-url=https://web.archive.org/web/20230208012723/https://analyticsindiamag.com/online-tools-to-create-mind-blowing-ai-art/|url-status=live}}{{Cite web|url=https://www.vice.com/en/article/woman-making-viral-portraits-of-mental-health-on-tiktok/|title=Meet the Woman Making Viral Portraits of Mental Health on TikTok|website=Vice.com|date=30 November 2021 |access-date=8 March 2023|archive-date=11 May 2023|archive-url=https://web.archive.org/web/20230511151259/https://www.vice.com/en/article/qjbb3w/woman-making-viral-portraits-of-mental-health-on-tiktok|url-status=live}} This work was credited by Stability AI CEO Emad Mostaque as motivating the founding of Stability AI.{{cite tweet|user=EMostaque|number=1631359828541972483|title=Stability AI came out of @AiEleuther and we have been delighted to incubate it as the foundation was set up}}
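At its core, the technique treats generation as an optimization problem: an image is repeatedly adjusted so that CLIP rates it as more similar to the text prompt. The sketch below illustrates the idea in its simplest form, optimizing raw pixels directly with OpenAI's released CLIP model; VQGAN-CLIP instead optimizes the latent codes of a VQGAN generator, which yields far more coherent images. The prompt, resolution, and step count here are arbitrary.
<syntaxhighlight lang="python">
# Illustrative sketch of CLIP-guided image synthesis: gradient-descend
# on pixels so that CLIP scores the image as matching the text prompt.
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

# Encode the target prompt once; it stays fixed during optimization.
tokens = clip.tokenize(["a scenic valley in the afternoon"]).to(device)
target = model.encode_text(tokens).detach()

# Start from random noise and optimize the pixels themselves.
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(300):
    features = model.encode_image(image.clamp(0, 1))
    # Maximize cosine similarity between image and text embeddings.
    loss = -torch.nn.functional.cosine_similarity(features, target).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
</syntaxhighlight>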
Public reception
= Praise =
EleutherAI's work to democratize GPT-3 won the UNESCO Netexplo Global Innovation Award in 2021,{{Cite web|url=https://www.unesco.org/en/articles/unesco-netexplo-forum-2021|title=UNESCO Netexplo Forum 2021 {{!}} UNESCO|accessdate=8 March 2023|archive-date=16 October 2022|archive-url=https://web.archive.org/web/20221016051954/https://www.unesco.org/en/articles/unesco-netexplo-forum-2021|url-status=live}} InfoWorld's Best of Open Source Software Award in 2021{{Cite web|url=https://www.infoworld.com/article/3637038/the-best-open-source-software-of-2021.html|title=The best open source software of 2021|first=James R. Borck, Martin Heller, Andrew C. Oliver, Ian Pointer, Matthew Tyson and Serdar|last=Yegulalp|date=18 October 2021|website=InfoWorld|accessdate=8 March 2023|archive-date=8 March 2023|archive-url=https://web.archive.org/web/20230308023356/https://www.infoworld.com/article/3637038/the-best-open-source-software-of-2021.html|url-status=live}} and 2022,{{Cite web|url=https://www.infoworld.com/article/3676829/the-best-open-source-software-of-2022.html|title=The best open source software of 2022|first=James R. Borck, Martin Heller, Andrew C. Oliver, Ian Pointer, Isaac Sacolick, Matthew Tyson and Serdar|last=Yegulalp|date=17 October 2022|website=InfoWorld|accessdate=8 March 2023|archive-date=8 March 2023|archive-url=https://web.archive.org/web/20230308023356/https://www.infoworld.com/article/3676829/the-best-open-source-software-of-2022.html|url-status=live}} and was nominated for VentureBeat's AI Innovation Award in 2021.{{Cite web|url=https://venturebeat.com/business/venturebeat-presents-ai-innovation-awards-nominees-at-transform-2021/|title=VentureBeat presents AI Innovation Awards nominees at Transform 2021|date=16 July 2021|accessdate=8 March 2023|archive-date=8 March 2023|archive-url=https://web.archive.org/web/20230308023356/https://venturebeat.com/business/venturebeat-presents-ai-innovation-awards-nominees-at-transform-2021/|url-status=live}}
Gary Marcus, a cognitive scientist and noted critic of deep learning companies such as OpenAI and DeepMind,{{Cite web|url=https://www.zdnet.com/article/the-next-decade-in-ai-gary-marcus-four-steps-towards-robust-artificial-intelligence/|title=What's next for AI: Gary Marcus talks about the journey toward robust artificial intelligence|website=ZDNET|accessdate=8 March 2023|archive-date=1 March 2023|archive-url=https://web.archive.org/web/20230301171618/https://www.zdnet.com/article/the-next-decade-in-ai-gary-marcus-four-steps-towards-robust-artificial-intelligence/|url-status=live}} has repeatedly{{cite tweet|user=GaryMarcus|number=1491631086819889155|title=GPT-NeoX-20B, 20 billion parameter large language model made freely available to public, with candid report on strengths, limits, ecological costs, etc.}}{{cite tweet|user=GaryMarcus|number=1494995567637762048|title=incredibly important result: "our results raise the question of how much [large language] models actually generalize beyond pretraining data"}} praised EleutherAI's dedication to open-source and transparent research.
Maximilian Gahntz, a senior policy researcher at the Mozilla Foundation, applauded EleutherAI's efforts to give more researchers the ability to audit and assess AI technology. "If models are open and if data sets are open, that'll enable much more of the critical research that's pointed out many of the flaws and harms associated with generative AI and that's often far too difficult to conduct."{{Cite web |last=Chowdhury |first=Meghmala |date=2022-12-29 |title=Will Powerful AI Disrupt Industries Once Thought to be Safe in 2023? |url=https://www.analyticsinsight.net/will-powerful-ai-disrupt-industries-once-thought-to-be-safe-in-2023/ |access-date=2023-04-06 |website=Analytics Insight |language=en-US |archive-date=1 January 2023 |archive-url=https://web.archive.org/web/20230101102714/https://www.analyticsinsight.net/will-powerful-ai-disrupt-industries-once-thought-to-be-safe-in-2023/ |url-status=live }}
= Criticism =
Technology journalist Kyle Wiggers has raised concerns about whether EleutherAI is as independent as it claims, or "whether the involvement of commercially motivated ventures like Stability AI and Hugging Face—both of which are backed by substantial venture capital—might influence EleutherAI's research."{{Cite web|url=https://techcrunch.com/2023/03/02/stability-ai-hugging-face-and-canva-back-new-ai-research-nonprofit/|title=Stability AI, Hugging Face and Canva back new AI research nonprofit|first=Kyle|last=Wiggers|date=2 March 2023|accessdate=8 March 2023|archive-date=7 March 2023|archive-url=https://web.archive.org/web/20230307110347/https://techcrunch.com/2023/03/02/stability-ai-hugging-face-and-canva-back-new-ai-research-nonprofit/|url-status=live}}
References
{{reflist|25em}}
{{Existential risk from artificial intelligence}}
Category:Artificial intelligence laboratories