Openais O3-Mini argumentation model comes to counteract deepseek

Openais O3-Mini argumentation model comes to counteract deepseek

Take part in our daily and weekly newsletters to get the latest updates and exclusive content for reporting on industry -leading AI. Learn more


Openai published a new proprietary AI model in good time to counteract the quick Rising the open source rival Deepseek-R1 – But will it be enough to stish the success of the latter?

Today, after several days full of rumors and increasing anticipation among AI users on social media, Openal debut O3-MiniThe second model in his new family of “Dennichen” – AL models that take a little more time to think, analyze their own processes and think about their own “chains” before they are on user inquiries and inputs with new ones Response.

The result is a model that can carry out on the level of a doctoral student or even a final tree when answering hard questions in mathematics, natural sciences, engineering and many other areas.

The O3-Mini model is now available on Chatgpt, including the free level, and on Openais application programming interface (API). And it is actually cheaper, faster and more powerful than the previous high-end model, Openai’s O1, and its faster siblings with a lower parameter, O1-Mini.

While it is inevitably compared to Deepseek-R1-and the publication date of some is considered in response-it is important to remember that O3 and O3mini were announced well before Deepseek R1 in January. in December 2024and this Openai CEO Sam Altman previously stated on x Due to the feedback from developers and researchers, there would also be chatted and the Openai -API.

In contrast to Deepseek-R1, O3-Mini is not made available as an open source model.

Openai provided no further details about the (suspected) larger O3 model, which was announced in December next to O3-Mini. At that time, Openas Opt-in-Dropdown form for testing O3 said that it would experience a “delay of several weeks” before third parties could test it.

Performance and functions

Similar to O1, Openai O3-Mini is optimized for the argumentation in mathematics, coding and natural sciences.

Its performance is comparable to Openai O1, if it uses medium -sized arguments, but offers the following advantages:

  • 24% faster response times compared to O1-Mini (Openai did not deliver a certain number here Tests of artificial analysisThe response time of O1 minis is 12.8 seconds to receive and output 100 tokens. For O3-Mini, a speed penalty of 24% would drop the response time to 10.32 seconds.)
  • Improved accuracy, with external testers 56% of the time prefer the answers from O3-Mini.
  • 39% less larger errors in complex real questions.
  • Better performance in coding and Stem tasks, especially with high argument.
  • Three argumentation effort (low, medium, high) and it enable users and developers to compensate for accuracy and speed.

The model also has impressive benchmarks that have even exceeded O1 in some cases, according to the O3 mini system card Openai, which was published online (which was previously published as the official announcement of the model availability).

The context window of O3-Mini-The number of combined tokens that can enter/output it in a single interaction-is 200,000, whereby there is a maximum of 100,000 in each issue. This is the same as the complete O1 model and surpasses Deepseek– –R1S context window of around 128,000/130,000 tokens. But it’s far below Google Gemini 2.0 Flash Thinking’s New context window of up to 1 million tokens.

While O3-Mini focuses on argumentation functions, it is not yet about visual functions. Developers and users who want to upload images and files should continue to use O1 in the meantime.

The competition heats up

The arrival of O3-Mini markings The first time is an argumentation model for free chatt users. The former O1 model family was only available for the payment of Subscribers from Chatgpt Plus, Pro and other plans as well as the paid API from Openaai.

As with the LLM-von Large Sprachmodell (LLM) by starting Chatgpt in November 2022, Openai essentially the entire category of argumentation models in September 2024 When O1 revealed for the first timeA class of models with a new training regime and architecture.

But Openai did not make O1 Open Source in line with his recent history, contrary to his name and the original founding mission. Instead, it kept the code of the model proprietary.

And in the past two weeks, O1 was overshadowed by overshadow Chinese Ki -Startup DeepseekThe R1 launched a rival, highly efficient, largely open source argumentation model, which free from everyone around the world to take, recover and adapt as well as free of charge on Deepseek’s website and mobile app-a model that reports According to a fraction of the costs of O1 and other LLMS made of top laboratories.

Deepseek-R1’s Permissible with license conditionsFree app/website for consumers and decision to free the code base of R1 freely available, it has led to a real explosion of use in both consumer and company markets – even a real usage explosion Openai investor Microsoft and anthropical backer Amazon Rushing to give variants from your cloud marketplaces. Confusion, the AI search company, also quickly Add a variant of it for users.

Deepseek also dethroned the Chatgpt iOS app as number 1 in the US Apple App StoreAnd is remarkable to exceed Openai by connecting his R1 model to the web search in his app and web. This is something that Openai has not yet done for O1, which leads to further technoangest for technology workers and others online that China has obtained or surpassed in the USA in the USA – or even in the technology in general.

However, many AI researchers, scientists and top VCs like Marc Andreessen welcomed the rise of Deepseek and in particular the open procurement as a flood that all boats in the AI.

Availability in Chatgpt

O3 is now setting for Chatgpt Free, Team and Pro user worldwide, whereby access to companies and educational institutions is available next week.

  • Free users can try O3-Mini for the first time by selecting the “Reason” button in the chat bar or regenerating an answer.
Screenshot of the Chatgpt request with “Reason” button. Note that the command prompt in Openas screenshot refers to Deepseek is accused of having done – Take the outputs of Openai models and train them to train your own R1.
  • The news limits have increased 3 times for plus and team users, from 50 to 150 messages per day.
  • Pro users receive unlimited access to O3-Mini and a new, even higher variant, O3-Mini high.

In addition, O3-Mini now supports search integration in chatt and offers answers with relevant web links. This function is still in the early stages, since Openaai refines the search functions in their argumentation models.

API integration and prices

For developers, O3-Mini is available via the API of the chat ends, assistants-API and Batch-API. The model supports function calls, structured outputs and developer news, which makes it easy to integrate into real applications.

One of the most remarkable advantages of O3-mini is cost efficiency: it is 63% cheaper than Openai O1-Mini and 93% cheaper than the complete O1 model with a price of USD 1.10/4.40 USD per million tokens in/ Out (with a cache discount of 50%).

However, it fades compared to the affordability of the civil servant Deepseek APIThe offer of R1 at $ 0.14/0.55 per million tokens in/out. In view of Deepseek in China, however, the geopolitical consciousness and security concerns with regard to the data of the user/enterprise are equipped in the model and from the model. Openai remains the preferred API for some security -oriented customers and companies in the USA. And Europe.

Developers can also adapt the argumentation effort (low, medium, high) based on their application needs and enable more control over latency and accuracy compromises.

In security, Openaai used something like this that is known as the “considered orientation” with O3-Mini. This means that the model has been asked to argue about the security guidelines written by humans that were given to understand more of their intention and damage, which should prevent them and find their own possibilities to ensure that these damage prevents become. Openaai says that the model is less zenzig when discussing sensitive topics and at the same time preserves security.

According to Openaai, the GPT-4O model exceeds in the treatment of security and jailbreak challenges and carried out extensive external security tests before the publication.

A The latest report deals in Wired (Where my wife works) showed that Deepseek succumbed to every jailbreak prompt and was tested by 50 by security researchers, which can give Openaai O3-Mini the advantage over Deepseek R1 in cases where security and security are of the utmost importance.

what is next?

The start of O3-Mini represents the broader efforts of Openas to make the advanced argument and cheaper and cost-effective than more intensive than ever before from deepseeks R1 and others. This includes Google, which recently published a free version of its own competition model Gemini 2 flash thinking With an extended input context of up to 1 million tokens.

With the focus on StEM argumentation and affordability, Openai should expand the range of the AI-controlled problem solving both in both consumer and developer applications.

But when the company becomes more ambitious than ever -for example, a 500 billion dollar -Infrastructure project of 500 billion US dollars will be announced Stargate With the support of Softbank, the question remains whether his strategy pays off well or not to justify the multi-billion Deeper investors Like Microsoft and other VCS.

If open source models are increasingly closing the gap with Openai in performance and exceed them in the costs, its superior security measures, powerful-who can prioritize costs and efficiency compared to these attributes? As always, we report on the developments while they develop.



Source link
Spread the love
Leave a Comment

Comments

No comments yet. Why don’t you start the discussion?

Leave a Reply

Your email address will not be published. Required fields are marked *