OpenAI introduces latest AI reasoning models: o3, o4-mini

Heaptalk, Jakarta — OpenAI officially introduced its latest AI reasoning models, o3 and o4-mini. These models can use and combine tools within ChatGPT, including searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and generating images.

“Today, we’re releasing OpenAI o3 and o4-mini, the latest in our o-series of models trained to think for longer before responding. These are the smartest models we’ve released to date, representing a step change in ChatGPT’s capabilities for everyone from curious users to advanced researchers,” OpenAI stated on its official page.

o3 and o4-mini are trained to reason about when and how to use tools, so they can solve more complex problems with detailed, thoughtful answers in the right output formats, typically in under a minute. This allows them to tackle multi-faceted questions more effectively, which OpenAI describes as a step toward a more agentic ChatGPT that can independently execute tasks on users’ behalf.

Mastering coding, math, and science

The company added, “The combined power of state-of-the-art reasoning with full tool access translates into significantly stronger performance across academic benchmarks and real-world tasks, setting a new standard in both intelligence and usefulness.”

Described by OpenAI as its most powerful reasoning model, o3 is strong across coding, math, science, and visual perception. It is designed for complex queries that require multi-faceted analysis and whose answers may not be immediately apparent. It performs especially strongly at visual tasks like analyzing images, charts, and graphics. In evaluations by external experts, o3 makes 20% fewer significant errors than OpenAI o1 on complex, real-world tasks, excelling especially in areas like programming, business/consulting, and creative ideation.

o4-mini is a smaller model optimized for fast, cost-efficient reasoning. It performs well for its size and cost, particularly in math, coding, and visual tasks. This model also outperforms its predecessor, o3-mini, on non-STEM tasks and in domains like data science. Thanks to its efficiency, o4-mini supports significantly higher usage limits than o3, making it a high-volume, high-throughput option for questions that benefit from reasoning.