{{Short description|Large multimodal model from OpenAI}}
{{Use mdy dates|date=February 2025}}
{{Infobox software
| title = Generative Pre-trained Transformer 4 Omni (GPT-4o)
| logo =
| developer = OpenAI
| released = {{Start date and age|2024|5|13}}
| latest preview version = ChatGPT-4o-latest (2025-03-26)
| latest preview date = {{Start date and age|2025|03|26}}
| replaces = GPT-4 Turbo
| license = Proprietary
| website = {{URL|https://openai.com/index/hello-gpt-4o}}
}}
GPT-4o ("o" for "omni") is a multilingual, multimodal generative pre-trained transformer developed by OpenAI and released in May 2024.{{Cite web |last=Wiggers |first=Kyle |date=2024-05-13 |title=OpenAI debuts GPT-4o 'omni' model now powering ChatGPT |url=https://techcrunch.com/2024/05/13/openais-newest-model-is-gpt-4o/ |access-date=2024-05-13 |website=TechCrunch |language=en-US}} GPT-4o is free, but ChatGPT Plus subscribers have higher usage limits.{{Cite web |last=Field |first=Hayden |date=2024-05-13 |title=OpenAI launches new AI model GPT-4o and desktop version of ChatGPT |url=https://www.cnbc.com/2024/05/13/openai-launches-new-ai-model-and-desktop-version-of-chatgpt.html |access-date=2024-05-14 |website=CNBC |language=en}} It can process and generate text, images and audio.{{Cite web |last=Colburn |first=Thomas |title=OpenAI unveils GPT-4o, a fresh multimodal AI flagship model |url=https://www.theregister.com/2024/05/13/openai_gpt4o/ |access-date=2024-05-18 |website=The Register |language=en}} Its application programming interface (API) is faster and cheaper than its predecessor, GPT-4 Turbo.
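In the API, GPT-4o is addressed through the same Chat Completions interface as its predecessors, under the documented model identifier gpt-4o. As an illustrative sketch, the snippet below only assembles the JSON request body (the prompt text is invented); actually sending it additionally requires the openai client library and an API key.

```python
# Sketch of a GPT-4o text request body, in the shape accepted by
# OpenAI's Chat Completions endpoint (prompt text is illustrative).
import json

def build_chat_request(prompt: str, model: str = "gpt-4o") -> dict:
    """Assemble the JSON body for a chat completion call."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }

body = build_chat_request("Summarize the plot of Her (2013).")
print(json.dumps(body, indent=2))
```

With the openai package installed, the same body could be passed to `client.chat.completions.create(**body)`.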
== Background ==
Before its public launch, versions of GPT-4o were quietly tested on the Large Model Systems Organization's (LMSYS) Chatbot Arena under three different names: gpt2-chatbot, im-a-good-gpt2-chatbot, and im-also-a-good-gpt2-chatbot.{{Cite web |last=Edwards |first=Benj |date=2024-05-13 |title=Before launching, GPT-4o broke records on chatbot leaderboard under a secret name |url=https://arstechnica.com/information-technology/2024/05/before-launching-gpt-4o-broke-records-on-chatbot-leaderboard-under-a-secret-name/ |access-date=2024-05-17 |website=Ars Technica |language=en-us}} On May 7, 2024, OpenAI CEO Sam Altman tweeted "im-a-good-gpt2-chatbot", which was widely interpreted as confirmation that these were new OpenAI models being A/B tested.{{Cite web |last=Zeff |first=Maxwell |date=2024-05-07 |title=Powerful New Chatbot Mysteriously Returns in the Middle of the Night |url=https://gizmodo.com/powerful-new-gpt2-chatbot-mysteriously-returns-1851460717 |access-date=2024-05-17 |website=Gizmodo |language=en}}{{Cite news |title=Sam Altman (@sama) on X |url=https://twitter.com/sama/status/1787222050589028528 |archive-url=http://web.archive.org/web/20241217062611/https://twitter.com/sama/status/1787222050589028528 |archive-date=2024-12-17 |access-date=2025-04-06 |work=X (formerly Twitter) |language=en}}
== Capabilities ==
When released in May 2024, GPT-4o achieved state-of-the-art results on voice, multilingual, and vision benchmarks, setting new records in audio speech recognition and translation.{{cite web |last=van Rijmenam |first=Mark |date=13 May 2024 |title=OpenAI Launched GPT-4o: The Future of AI Interactions Is Here |url=https://www.thedigitalspeaker.com/openai-gpt4o-future-ai-interactions/ |access-date=17 May 2024 |website=The Digital Speaker}}{{Cite web |last=Daws |first=Ryan |date=2024-05-14 |title=GPT-4o delivers human-like AI interaction with text, audio, and vision integration |url=https://www.artificialintelligence-news.com/2024/05/14/gpt-4o-human-like-ai-interaction-text-audio-vision-integration/ |access-date=2024-05-18 |website=AI News |language=en-GB}}{{Cite journal |last1=Shahriar |first1=Sakib |last2=Lund |first2=Brady D. |last3=Mannuru |first3=Nishith Reddy |last4=Arshad |first4=Muhammad Arbab |last5=Hayawi |first5=Kadhim |last6=Bevara |first6=Ravi Varma Kumar |last7=Mannuru |first7=Aashrith |last8=Batool |first8=Laiba |date=2024-09-03 |title=Putting GPT-4o to the Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency |journal=Applied Sciences |language=en |volume=14 |issue=17 |pages=7782 |doi=10.3390/app14177782 |doi-access=free |issn=2076-3417}} GPT-4o scored 88.7 on the Massive Multitask Language Understanding (MMLU) benchmark, compared to 86.5 for GPT-4.{{Cite web |title=Hello GPT-4o |url=https://openai.com/index/hello-gpt-4o/ |website=OpenAI}} Unlike GPT-3.5 and GPT-4, which rely on other models to process sound, GPT-4o natively supports voice-to-voice interaction. Advanced Voice Mode was delayed and ultimately released to ChatGPT Plus and Team subscribers in September 2024.{{Cite web |last=David |first=Emilia |date=2024-09-24 |title=OpenAI finally brings humanlike ChatGPT Advanced Voice Mode to U.S. Plus, Team users |url=https://venturebeat.com/ai/openai-finally-brings-humanlike-chatgpt-advanced-voice-mode-to-u-s-plus-team-users/ |access-date=2025-02-15 |website=VentureBeat |language=en-US}} On October 1, 2024, the Realtime API was introduced.{{Cite web |title=Introducing the Realtime API |url=https://openai.com/index/introducing-the-realtime-api/ |access-date=2024-11-29 |website=openai.com |language=en-US}}
When released, the model supported over 50 languages, which OpenAI claims cover over 97% of speakers.{{Cite web |last=Edwards |first=Benj |date=2024-05-13 |title=Major ChatGPT-4o update allows audio-video talks with an "emotional" AI chatbot |url=https://arstechnica.com/information-technology/2024/05/chatgpt-4o-lets-you-have-real-time-audio-video-conversations-with-emotional-chatbot/ |access-date=2024-05-17 |website=Ars Technica |language=en-us}} Mira Murati demonstrated the model's multilingual capability by speaking Italian to the model and having it translate between English and Italian during the live-streamed OpenAI demonstration event on 13 May 2024. In addition, the new tokenizer{{Cite web |title=OpenAI Platform |url=https://platform.openai.com/tokenizer |access-date=2024-11-29 |website=platform.openai.com |language=en}} uses fewer tokens for certain languages, especially languages that are not based on the Latin alphabet, making it cheaper for those languages.
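The cost effect of a more compact tokenizer can be illustrated with simple arithmetic. In the sketch below, the token counts are invented for illustration (not measured with the actual tokenizers); the $2.50-per-million-input-token figure is the GPT-4o API rate quoted elsewhere in this article.

```python
# Illustrative arithmetic: the same sentence hypothetically encoding
# to fewer tokens under GPT-4o's tokenizer, which is more compact for
# many non-Latin scripts. Token counts below are made up.
PRICE_PER_INPUT_TOKEN = 2.50 / 1_000_000  # USD, GPT-4o input rate

old_tokens = 48  # hypothetical count under the previous tokenizer
new_tokens = 30  # hypothetical count under the new tokenizer

old_cost = old_tokens * PRICE_PER_INPUT_TOKEN
new_cost = new_tokens * PRICE_PER_INPUT_TOKEN
saving = 1 - new_cost / old_cost  # cost falls in proportion to tokens
print(f"input cost falls by {saving:.0%}")
```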
GPT-4o has knowledge up to October 2023,{{Cite web |title=Models - OpenAI API |url=https://platform.openai.com/docs/models/gpt-4o |access-date=17 May 2024 |website=OpenAI}}{{Cite web |last=Conway |first=Adam |date=2024-05-13 |title=What is GPT-4o? Everything you need to know about the new OpenAI model that everyone can use for free |url=https://www.xda-developers.com/gpt-4o/ |access-date=2024-05-17 |website=XDA Developers |language=en}} but can access the Internet if up-to-date information is needed. It has a context length of 128k tokens.
=== Corporate customization ===
In August 2024, OpenAI introduced a new feature allowing corporate customers to customize GPT-4o using proprietary company data. This customization, known as fine-tuning, enables businesses to adapt GPT-4o to specific tasks or industries, enhancing its utility in areas like customer service and specialized knowledge domains. Previously, fine-tuning was available only on the less powerful model GPT-4o mini.{{Cite web |date=2024-08-21 |title=OpenAI lets companies customise its most powerful AI model |url=https://www.scmp.com/tech/tech-trends/article/3275262/openai-launches-fine-tuning-gpt-4o-its-most-powerful-ai-model#:~:text=To%20fine-tune%20a%20model,not%20images%20or%20other%20content. |access-date=2024-08-22 |website=South China Morning Post |language=en}}{{Cite news |date=2024-08-20 |title=OpenAI to Let Companies Customize Its Most Powerful AI Model |url=https://www.bloomberg.com/news/articles/2024-08-20/openai-to-let-companies-customize-its-most-powerful-ai-model |access-date=2024-08-22 |work=Bloomberg |language=en}}
The fine-tuning process requires customers to upload their data to OpenAI's servers, with the training typically taking one to two hours. OpenAI's focus with this rollout is to reduce the complexity and effort required for businesses to tailor AI solutions to their needs, potentially increasing the adoption and effectiveness of AI in corporate environments.{{Cite news |author=The Hindu Bureau |date=2024-08-21 |title=OpenAI will let businesses customise GPT-4o for specific use cases |url=https://www.thehindu.com/sci-tech/technology/openai-will-let-businesses-customise-gpt-4o-for-specific-use-cases/article68549452.ece |access-date=2024-08-22 |work=The Hindu |language=en-IN |issn=0971-751X}}
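Fine-tuning jobs of this kind take their training data as JSON Lines, one chat-formatted example per line. The sketch below assembles a single such record; the system prompt, question, and answer (including the "AcmeCo" name) are invented for illustration.

```python
# Sketch: build one JSONL training record in the chat format used for
# fine-tuning. All example content (AcmeCo, Q&A text) is invented.
import json

def to_jsonl_record(question: str, answer: str) -> str:
    """Serialize one chat example as a single JSONL line."""
    record = {
        "messages": [
            {"role": "system", "content": "You are a support agent for AcmeCo."},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }
    return json.dumps(record)

line = to_jsonl_record("How do I reset my password?",
                       "Open Settings > Account and choose 'Reset password'.")
print(line)
```

A training file is then just many such lines, one per example, uploaded before the fine-tuning job is started.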
== GPT-4o mini ==
On July 18, 2024, OpenAI released a smaller and cheaper version, GPT-4o mini.{{Cite web |last=Franzen |first=Carl |date=2024-07-18 |title=OpenAI unveils GPT-4o mini — a smaller, much cheaper multimodal AI model |url=https://venturebeat.com/ai/openai-unveils-gpt-4o-mini-a-smaller-much-cheaper-multimodal-ai-model/ |access-date=2024-07-18 |website=VentureBeat |language=en-US}}
According to OpenAI, its low cost is expected to be particularly useful for companies, startups, and developers that seek to integrate it into their services, which often make a high number of API calls. Its API costs $0.15 per million input tokens and $0.60 per million output tokens, compared to $2.50 and $10,{{Cite web |title=OpenAI Pricing |url=https://openai.com/api/pricing/ }} respectively, for GPT-4o. It is also significantly more capable than, and 60% cheaper than, GPT-3.5 Turbo, which it replaced on the ChatGPT interface. Fine-tuned versions cost twice as much: $0.30 per million input tokens and $1.20 per million output tokens. Its parameter count has been estimated at about 8 billion.{{cite arXiv |last=Ben Abacha |first=Asma |date=2025 |title=MEDEC: A Benchmark for Medical Error Detection and Correction in Clinical Notes |eprint=2412.19260 |class= cs.CL}}
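The per-million-token rates quoted above translate directly into per-request costs. The sketch below compares the two models on a hypothetical request of 2,000 input and 500 output tokens; the request size is illustrative.

```python
# Per-request cost comparison (USD) using the per-million-token
# rates quoted above for GPT-4o and GPT-4o mini.
RATES = {
    "gpt-4o":      {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: tokens times the per-million-token rate."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# A hypothetical request: 2,000 input tokens, 500 output tokens.
for model in RATES:
    print(f"{model}: ${request_cost(model, 2_000, 500):.6f}")
```

At these rates the hypothetical request costs $0.01 on GPT-4o versus $0.0006 on GPT-4o mini, a roughly 17-fold difference.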
== GPT Image 1 ==
{{Infobox software
| name = GPT Image 1
| screenshot = KarlMarx4o.webp
| caption = An image of Karl Marx in a modern-day context generated by GPT Image 1
| author =
| developer = OpenAI
| released = {{start date and age|df=y|2025|3|25}}
| replaces = DALL-E 3
| genre = Text-to-image model
| license =
| website = https://platform.openai.com/docs/models/gpt-image-1
}}
On March 25, 2025, OpenAI released an image-generation model native to GPT-4o as the successor to DALL-E 3. The model was later named GPT Image 1 (gpt-image-1) and introduced to the API on April 23. It was made available to paid users, with the rollout to free users delayed.{{cite news |last=Roth |first=Emma |title=ChatGPT's new image generator is delayed for free users |url=https://www.theverge.com/news/636948/openai-chatgpt-image-generation-gpt-4o |access-date=March 26, 2025 |work=The Verge |date=March 26, 2025}} Use of the feature was subsequently limited, with Sam Altman noting in a tweet that "[their] GPUs were melting" from its unprecedented popularity.{{cite news |last1=Welch |first1=Chris |title=OpenAI says "our GPUs are melting" as it limits ChatGPT image generation requests |url=https://www.theverge.com/news/637542/chatgpt-says-our-gpus-are-melting-as-it-puts-limit-on-image-generation-requests |access-date=March 28, 2025 |work=The Verge |date=March 27, 2025}} OpenAI later revealed that more than 130 million users worldwide created over 700 million images with GPT Image 1 in its first week.{{cite web |title=Introducing our latest image generation model in the API |url=https://openai.com/index/image-generation-api/ |publisher=OpenAI |access-date=30 April 2025 |date=23 April 2025}}
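In the API, the model is addressed by the identifier gpt-image-1 through the images endpoint. As an illustrative sketch, the snippet below only builds a plausible request body (the prompt, size, and image count are invented values); an actual call requires the client library and an API key.

```python
# Sketch of an image-generation request body for the gpt-image-1
# model (prompt, size, and count are illustrative values).
import json

def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Assemble the JSON body for an image-generation call."""
    return {
        "model": "gpt-image-1",
        "prompt": prompt,
        "size": size,
        "n": 1,  # number of images to generate
    }

body = build_image_request("A watercolor city skyline at dusk")
print(json.dumps(body))
```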
== Controversies ==
=== Scarlett Johansson controversy ===
{{anchor|Sky voice}}
As released, GPT-4o offered five voices: Breeze, Cove, Ember, Juniper, and Sky. Users quickly noticed the similarity between Sky's voice and that of American actress Scarlett Johansson. On May 14, Entertainment Weekly asked whether the likeness was intentional.{{Cite magazine |last=Stenzel |first=Wesley |date=May 14, 2024 |title=ChatGPT launching talking AI that sounds exactly like Scarlett Johansson in 'Her' — on purpose? |url=https://ew.com/chatgpt-talking-ai-sounds-just-like-scarlett-johansson-in-her-8648678 |access-date=2024-05-21 |magazine=Entertainment Weekly |language=en}} On May 18, Johansson's husband, Colin Jost, joked about the similarity in a segment on Saturday Night Live.{{Cite web |last=Caruso |first=Nick |date=2024-05-20 |title=Scarlett Johansson Says She Was 'Shocked, Angered and in Disbelief' After Hearing ChatGPT Voice That Sounds Like Her — Read Statement |url=https://tvline.com/news/scarlett-johansson-chatgpt-voice-openai-snl-joke-1235243988/ |access-date=2024-05-21 |website=TVLine |language=en-US}} On May 20, 2024, OpenAI disabled the Sky voice, issuing a statement saying "We've heard questions about how we chose the voices in ChatGPT, especially Sky. We are working to pause the use of Sky while we address them."{{Cite web |date=May 19, 2024 |title=How the voices for ChatGPT were chosen |url=https://openai.com/index/how-the-voices-for-chatgpt-were-chosen/ |website=OpenAI}}
Scarlett Johansson starred in the 2013 sci-fi movie Her, playing Samantha, an artificially intelligent virtual assistant personified by a female voice.
As part of the promotion leading up to the release of GPT-4o, Sam Altman on May 13 tweeted a single word: "her".{{Cite web |date=May 13, 2024 |title=her |url=https://x.com/sama/status/1790075827666796666?lang=en |access-date=2024-05-21 |website=X (formerly Twitter)}}{{Cite news |last=Allyn |first=Bobby |date=May 20, 2024 |title=Scarlett Johansson says she is 'shocked, angered' over new ChatGPT voice |url=https://www.npr.org/2024/05/20/1252495087/openai-pulls-ai-voice-that-was-compared-to-scarlett-johansson-in-the-movie-her |work=NPR}}
OpenAI stated that each voice was based on the voice work of a hired actor. According to OpenAI, "Sky's voice is not an imitation of Scarlett Johansson but belongs to a different professional actress using her own natural speaking voice." CTO Mira Murati stated "I don't know about the voice. I actually had to go and listen to Scarlett Johansson's voice." OpenAI further stated the voice talent was recruited before reaching out to Johansson.{{cite news |last=Tiku |first=Nitasha |title=OpenAI didn't copy Scarlett Johansson's voice for ChatGPT, records show |url=https://www.washingtonpost.com/technology/2024/05/22/openai-scarlett-johansson-chatgpt-ai-voice/ |access-date=November 29, 2024 |newspaper=The Washington Post |date=May 23, 2024}}
On May 21, Johansson issued a statement explaining that OpenAI had repeatedly sought her permission to use her voice, beginning nine months before release, and that she had declined. She said she was "shocked, angered, and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference." In the statement, Johansson also used the incident to draw attention to the lack of legal safeguards around the use of creative work to power leading AI tools, as her legal counsel demanded that OpenAI detail the specifics of how the Sky voice was created.{{Cite news |last=Mickle |first=Tripp |date=2024-05-20 |title=Scarlett Johansson Said No, but OpenAI's Virtual Assistant Sounds Just Like Her |url=https://www.nytimes.com/2024/05/20/technology/scarlett-johannson-openai-voice.html |access-date=2024-05-21 |work=The New York Times |language=en-US |issn=0362-4331}}
Observers noted similarities to how Johansson had previously sued and settled with The Walt Disney Company for breach of contract over the direct-to-streaming rollout of her Marvel film Black Widow,{{Cite web |date=2024-05-21 |title=Scarlett Johansson took on Disney. Now she's battling OpenAI over a ChatGPT voice that sounds like hers |url=https://ca.finance.yahoo.com/news/scarlett-johansson-took-disney-now-130511407.html |access-date=2024-05-21 |website=Yahoo Finance |language=en-CA}} a settlement widely speculated to have netted her around $40M.{{Cite news |last=Pulver |first=Andrew |date=2021-10-01 |title=Scarlett Johansson settles Black Widow lawsuit with Disney |url=https://www.theguardian.com/film/2021/oct/01/scarlett-johansson-settles-black-widow-lawsuit-disney |access-date=2024-05-21 |work=The Guardian |language=en-GB |issn=0261-3077}}
Also on May 21, Shira Ovide at The Washington Post shared her list of "most bone-headed self-owns" by technology companies; OpenAI's decision to proceed with a Johansson sound-alike voice despite her opposition, and then to deny the similarity, ranked sixth.{{cite news |last=Ovide |first=Shira |title=Exactly how stupid was what OpenAI did to Scarlett Johansson? |url=https://www.washingtonpost.com/technology/2024/05/21/chatgpt-voice-scarlett-johansson/ |newspaper=The Washington Post |date=30 May 2024}} On May 22, Derek Robertson at Politico wrote about the "massive backlash", concluding that "appropriating the voice of one of the world's most famous movie stars — in reference [...] to a film that serves as a cautionary tale about over-reliance on AI — is unlikely to help shift the public back into [Sam Altman's] corner anytime soon."{{Cite web |last=Robertson |first=Derek |date=May 22, 2024 |title=Sam Altman's Scarlett Johansson Blunder Just Made AI a Harder Sell in DC |url=https://www.politico.com/news/magazine/2024/05/22/scarlett-johansson-sam-altmans-washington-00159507 |website=Politico}}
=== Studio Ghibli filter ===
[[File:WhiteHouseGhibli.jpg|thumb|An image posted by the White House's official Twitter account. The depiction in the style of Studio Ghibli has been criticized.]]
Upon the launch of GPT-4o's image generation (later named GPT Image 1) in March 2025, photographs recreated in the style of Studio Ghibli films went viral.{{cite news |last=Spangler |first=Todd |title=OpenAI CEO Responds to ChatGPT Users Creating Studio Ghibli-Style AI Images |url=https://variety.com/2025/digital/news/openai-ceo-chatgpt-studio-ghibli-ai-images-1236349141/ |access-date=March 27, 2025 |work=Variety |date=March 26, 2025}} Sam Altman acknowledged the trend by changing his profile picture to a Studio Ghibli-style one.{{cite news |last=Choudhary |first=Govind |title=OpenAI CEO Sam Altman reacts as AI turns him into a Studio Ghibli Character |url=https://www.livemint.com/ai/artificial-intelligence/openai-ceo-sam-altman-reacts-as-ai-turns-him-into-a-studio-ghibli-character-11743083600993.html |access-date=March 28, 2025 |work=Mint |date=March 27, 2025 |language=en}}{{cite news |last=Notopoulos |first=Katie |title=Sam Altman did a good tweet |url=https://www.businessinsider.com/sam-altman-good-tweet-twink-image-generator-2025-3 |access-date=March 28, 2025 |work=Business Insider |date=March 27, 2025}} The use of the Ghibli style was challenged, with the Associated Press and The New York Times noting that Hayao Miyazaki had been critical of AI art in the 2016 documentary Never-Ending Man: Hayao Miyazaki.{{cite news |last1=O'Brien |first1=Matt |last2=Parvini |first2=Sarah |title=ChatGPT's viral Studio Ghibli-style images highlight AI copyright concerns |url=https://apnews.com/article/studio-ghibli-chatgpt-images-hayao-miyazaki-openai-0f4cb487ec3042dd5b43ad47879b91f4 |access-date=March 28, 2025 |work=AP News |date=March 27, 2025 |language=en}}{{cite news |last=Kircher |first=Madison Malone |date=March 27, 2025 |title=ChatGPT's Studio Ghibli Style Animations Are Almost Too Good |url=https://www.nytimes.com/2025/03/27/style/ai-chatgpt-studio-ghibli.html |url-status=live |archive-url=https://archive.today/20250327191540/https://www.nytimes.com/2025/03/27/style/ai-chatgpt-studio-ghibli.html |archive-date=March 27, 2025 |access-date=March 27, 2025 |work=The New York Times |language=en}} Use of the Ghibli-style images drew further controversy when the White House's official Twitter account posted a Ghibli-style image mocking the arrest of migrant woman Virginia Basora-Gonzalez by immigration authorities, which showed her crying as an immigration officer placed her in handcuffs.{{cite news |last=Bio |first=Demian |title=White House Mocks Migrant With Criminal Record Who Cried After Being Arrested |url=https://www.latintimes.com/white-house-mocks-migrant-criminal-record-who-cried-after-being-arrested-579414 |access-date=March 28, 2025 |work=Latin Times |date=March 27, 2025 |language=en}}{{cite news |last=Vera |first=Kelby |title=White House Posts Ghoulish AI Cartoon Showing Woman's Deportation |url=https://www.huffpost.com/entry/white-house-deportation-ai-ghibli-animation-cartoon_n_67e5b4ace4b0455df70b41c2 |access-date=March 28, 2025 |work=HuffPost |date=March 27, 2025 |language=en}} North American distributor GKids responded to the trend in a press release, contrasting it with its coinciding IMAX re-release of the 1997 Studio Ghibli film, Princess Mononoke.{{cite news |last=Tangcay |first=Jazz |title=Studio Ghibli Distributor Champions 'Princess Mononoke' Box Office at 'A Time When Technology Tries to Replicate Humanity' |url=https://variety.com/2025/film/news/studio-ghibli-princess-mononoke-open-ai-1236351261/ |access-date=March 29, 2025 |work=Variety |date=March 28, 2025}}
=== Sycophancy ===
In April 2025, OpenAI rolled back an update of GPT-4o due to excessive sycophancy, after widespread reports that it had become flattering and agreeable to the point of supporting clearly delusional or dangerous ideas.{{Cite web |last=Franzen |first=Carl |date=2025-04-30 |title=OpenAI rolls back ChatGPT's sycophancy and explains what went wrong |url=https://venturebeat.com/ai/openai-rolls-back-chatgpts-sycophancy-and-explains-what-went-wrong/ |access-date=2025-05-01 |website=VentureBeat |language=en-US}}
== References ==
{{Reflist|2}}
{{OpenAI}}
{{Artificial intelligence navbox}}
{{Generative AI}}
Category:2024 in artificial intelligence
Category:Artificial intelligence art
Category:Generative pre-trained transformers