OpenAI o3

{{Short description|Large language model}}

{{Use mdy dates|date=January 2025}}

{{Infobox software

| name = o3

| developer = OpenAI

| released = {{Indented plainlist|

  • o3-mini: {{Start date|2025|01|31}}
  • o3: {{Start date| 2025|04|16}}}}

o3-pro: {{Start date|2025|06|10}}

| latest release version =

| latest release date =

| genre = {{indented plainlist|

}}

| replaces = OpenAI o1

| replaced_by = o3-mini: OpenAI o4-mini

| license =

}}

OpenAI o3 is a reflective generative pre-trained transformer (GPT) model developed by OpenAI as a successor to OpenAI o1 for ChatGPT. It is designed to devote additional deliberation time when addressing questions that require step-by-step logical reasoning.{{Cite magazine |last=Knight |first=Will |date=December 20, 2024 |title=OpenAI Upgrades Its Smartest AI Model With Improved Reasoning Skills |url=https://www.wired.com/story/openai-o3-reasoning-model-google-gemini/ |magazine=Wired |via=}}{{Cite web |last=Metz |first=Cade |date=2024-12-20 |title=OpenAI Unveils New A.l. That Can 'Reason' Through Math and Science Problems |url=https://www.nytimes.com/2024/12/20/technology/openai-new-ai-math-science.html |website=The New York Times}} On January 31, 2025, OpenAI released a smaller model, o3-mini, followed on April 16 by o3 and o4-mini.{{Cite web |title=Introducing OpenAI o3 and o4-mini |url=https://openai.com/index/introducing-o3-and-o4-mini/ |access-date=2025-04-16 |website=openai.com |language=en-US}}

History

The OpenAI o3 model was announced on December 20, 2024. It was called "o3" rather than "o2" to avoid trademark conflict with the mobile carrier brand named O2. OpenAI invited safety and security researchers to apply for early access of these models until January 10, 2025.{{Cite web |date=December 20, 2024 |title=Early access for safety testing |url=https://openai.com/index/early-access-for-safety-testing/ |website=OpenAI}} Similarly to o1, there are two different models: o3 and o3-mini.{{Cite web |last=Franzen |first=Carl |date=2025-01-31 |title=It's here: OpenAI's o3-mini advanced reasoning model arrives to counter DeepSeek's rise |url=https://venturebeat.com/ai/its-here-openais-o3-mini-advanced-reasoning-model-arrives-to-counter-deepseeks-rise/ |access-date=2025-02-01 |website=VentureBeat |language=en-US}}

On January 31, 2025, OpenAI released o3-mini to all ChatGPT users (including free-tier) and some API users. OpenAI describes o3-mini as a "specialized alternative" to o1 for "technical domains requiring precision and speed".{{Cite web |date=February 13, 2025 |title=Introducing OpenAI O3 Mini |url=https://openai.com/index/openai-o3-mini/ |access-date=February 13, 2025 |website=OpenAI}} o3-mini features three reasoning effort levels: low, medium and high. The free version uses medium. The variant using more compute is called o3-mini-high, and is available to paid subscribers.{{Cite web |date=January 31, 2025 |title=OpenAI o3-mini |url=https://openai.com/index/openai-o3-mini/ |access-date=2025-02-03 |website=OpenAI |language=en-US}} Subscribers to ChatGPT's Pro tier have unlimited access to both o3-mini and o3-mini-high.

On February 2, OpenAI launched OpenAI Deep Research, a ChatGPT service using a version of o3 that makes comprehensive reports within 5 to 30 minutes, based on web searches.{{Cite web |last=Ha |first=Anthony |date=2025-02-03 |title=OpenAI unveils a new ChatGPT agent for 'deep research' |url=https://techcrunch.com/2025/02/02/openai-unveils-a-new-chatgpt-agent-for-deep-research/ |access-date=2025-02-04 |website=TechCrunch |language=en-US}}

On February 6, in response to pressure from rivals like DeepSeek, OpenAI announced an update aimed at enhancing the transparency of the thought process in its o3-mini model.{{Cite web |last=Wiggers |first=Kyle |date=2025-02-06 |title=OpenAI now reveals more of its o3-mini model's thought process |url=https://techcrunch.com/2025/02/06/openai-now-reveals-more-of-its-o3-mini-models-thought-process/ |access-date=2025-02-07 |website=TechCrunch |language=en-US}}

On February 12, OpenAI further increased rate limits for o3-mini-high to 50 requests per day (from 50 requests per week) for ChatGPT Plus subscribers, and implemented file/image upload support.{{Cite tweet |number=1889822643676913977 |user=OpenAI |title=Two updates you'll like— OpenAI o1 and o3-mini now support both file & image uploads in ChatGPT. We raised o3-mini-high limits by 7x for Plus users to up to 50 per day. |date=February 13, 2025 |access-date=February 13, 2025 |url=https://x.com/OpenAI/status/1889822643676913977}}

On April 16, 2025, OpenAI released o3 and o4-mini, a successor of o3-mini.

On June 10, OpenAI released o3-pro, which the company claims is its most capable model yet.{{Cite web |last=Wiggers |first=Kyle |date=2025-06-10 |title=OpenAI releases o3-pro, a souped-up version of its o3 AI reasoning model |url=https://techcrunch.com/2025/06/10/openai-releases-o3-pro-a-souped-up-version-of-its-o3-ai-reasoning-model/ |access-date=2025-06-11 |website=TechCrunch |language=en-US}} OpenAI stated: "We recommend using it for challenging questions where reliability matters more than speed, and waiting a few minutes is worth the tradeoff".{{Cite web |last=Washenko |first=Anna |date=2025-06-06 |title=OpenAI adds the o3-pro model to ChatGPT today |url=https://www.engadget.com/ai/openai-adds-the-o3-pro-model-to-chatgpt-today-212126136.html |access-date=2025-06-11 |website=Engadget |language=en-US}}

Capabilities

Reinforcement learning was used to teach o3 to "think" before generating answers, using what OpenAI refers to as a "private chain of thought".{{Cite report |url=https://cdn.openai.com/o3-mini-system-card-feb10.pdf |title=OpenAI O3 Mini System Card |date=January 13, 2025 |publisher=OpenAI |access-date=February 13, 2025 }} This approach enables the model to plan ahead and reason through tasks, performing a series of intermediate reasoning steps to assist in solving the problem, at the cost of additional computing power and increased latency of responses.{{Cite web |last1=Zeff |first1=Maxwell |last2=Wiggers |first2=Kyle |date=2024-12-20 |title=OpenAI announces new o3 models |url=https://techcrunch.com/2024/12/20/openai-announces-new-o3-model/ |access-date=2024-12-22 |website=TechCrunch |language=en-US}}

o3 demonstrates significantly better performance than o1 on complex tasks, including coding, mathematics, and science. OpenAI reported that o3 achieved a score of 87.7% on the GPQA Diamond benchmark, which contains expert-level science questions not publicly available online.{{Cite web |last1=Franzen |first1=Carl |last2=David |first2=Emilia |date=2024-12-20 |title=OpenAI confirms new frontier models o3 and o3-mini |url=https://venturebeat.com/ai/openai-confirms-new-frontier-models-o3-and-o3-mini/ |access-date=2024-12-26 |website=VentureBeat |language=en-US}}

On SWE-bench Verified, a software engineering benchmark assessing the ability to solve real GitHub issues, o3 scored 71.7%, compared to 48.9% for o1. On Codeforces, o3 reached an Elo score of 2727, whereas o1 scored 1891.

On the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI) benchmark, which evaluates an AI's ability to handle new logical and skill acquisition problems, o3 attained three times the accuracy of o1.{{Cite web |last=Hsu |first=Jeremy |date=20 December 2024 |title=OpenAI's o3 model aced a test of AI reasoning – but it's still not AGI |url=https://www.newscientist.com/article/2462000-openais-o3-model-aced-a-test-of-ai-reasoning-but-its-still-not-agi/ |access-date=2024-12-22 |website=New Scientist |language=en-US}}

References

{{Reflist}}