OpenAI responds to DeepSeek competition with detailed reasoning traces for o3-mini



OpenAI now shows more details about the reasoning process of o3-mini, its latest reasoning model. The change was announced on OpenAI's X account and comes as the AI lab faces increased pressure from DeepSeek-R1, a rival open model that displays its reasoning tokens in full.

Models such as o3 and R1 run through a lengthy "chain of thought" (CoT) process, in which they generate additional tokens to break down the problem, reason about different answers, and arrive at a final solution. Previously, OpenAI's reasoning models hid their chain of thought and generated only a high-level overview of the reasoning steps. This made it difficult for users and developers to understand the model's reasoning logic and to adjust their instructions and prompts to steer it in the right direction.

OpenAI viewed the chain of thought as a competitive advantage and hid it to prevent rivals from copying it to train their own models. But with R1 and other open models showing their full reasoning trace, the lack of transparency became a disadvantage for OpenAI.

The new version of o3-mini shows a more detailed version of the CoT. Although we still don't see the raw tokens, it offers much more clarity into the reasoning process.

Why it matters for applications

In our earlier experiments on o1 and R1, we found that o1 was slightly better at solving data analysis and reasoning problems. One of the key limitations, however, was that there was no way to figure out why the model made mistakes, and it often made mistakes when confronted with messy real-world data from the web. In contrast, R1's chain of thought enabled us to troubleshoot problems and adjust our prompts to improve its reasoning.

In one of our experiments, for example, both models failed to give the correct answer. Thanks to R1's detailed chain of thought, we were able to determine that the problem was not with the model itself but with the retrieval stage, which gathered information from the web. In other experiments, R1's chain of thought told us when it failed to parse the information we had provided, while o1 gave us only a very rough overview of how it formulated its response.

We tested the new o3-mini model on a variant of an earlier experiment we ran with o1. We provided the model with a text file containing the prices of various stocks from January 2024 through January 2025. The file was noisy and unformatted, a mixture of plain text and HTML elements. We then asked the model to calculate the value of a portfolio that invested evenly across all Magnificent 7 stocks on the first day of every month from January 2024 to January 2025 (we used the term "Mag 7" in the prompt to make it a bit more challenging).

o3-mini's CoT was very helpful this time. First, the model reasoned about what the Mag 7 is, then filtered the data to keep only the relevant stocks (to make the problem challenging, we had added a few non-Mag 7 stocks to the data), calculated the monthly amount invested in each stock, and performed the final calculations to arrive at the correct answer (the portfolio would have been worth around $2,200 based on the data we provided to the model).
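For readers who want to sanity-check a model's answer on this kind of task, the underlying arithmetic can be sketched in a few lines. This is a minimal illustration with made-up prices for two hypothetical tickers ("AAA", "BBB") and an assumed $100 monthly budget; the article does not disclose the actual price data or the invested amount.

```python
# Hypothetical monthly prices; the real experiment used Mag 7 stock prices
# scraped from the web. Tickers and values here are invented for illustration.
monthly_prices = {
    "2024-01": {"AAA": 100.0, "BBB": 50.0},
    "2024-02": {"AAA": 110.0, "BBB": 55.0},
    "2024-03": {"AAA": 120.0, "BBB": 60.0},
}

def portfolio_value(prices: dict, monthly_budget: float) -> float:
    """Invest `monthly_budget` split evenly across all tickers on the first
    day of each month, then value the accumulated shares at the final
    month's prices."""
    months = sorted(prices)
    shares = {ticker: 0.0 for ticker in prices[months[0]]}
    for month in months:
        per_stock = monthly_budget / len(shares)  # even split across tickers
        for ticker in shares:
            shares[ticker] += per_stock / prices[month][ticker]
    last = prices[months[-1]]
    return sum(shares[t] * last[t] for t in shares)

print(round(portfolio_value(monthly_prices, 100.0), 2))  # → 329.09
```

The hard part of the original task was not this arithmetic but cleaning the noisy input and picking the right tickers, which is exactly the stage the more detailed CoT made visible.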

It will take many more tests to probe the limits of the new chain of thought, since OpenAI still hides many details. But based on our vibe checks, the new format seems much more useful.

What it means for OpenAI

When DeepSeek-R1 was released, it had three clear advantages over OpenAI's reasoning models: it was open, cheap, and transparent.

Since then, OpenAI has managed to narrow the gap. While o1 costs $60 per million output tokens, o3-mini costs only $4.40 while outperforming o1 on many reasoning benchmarks. R1 costs around $7 to $8 per million tokens from US providers. (DeepSeek offers R1 at $2.19 per million tokens on its own servers, but many organizations will not be able to use it because it is hosted in China.)

With the new change to the CoT output, OpenAI has managed to address the transparency problem to some extent.

It remains to be seen what OpenAI will do about open-sourcing its models. Since its release, R1 has been adapted, forked, and hosted by many different labs and companies, which may make it the preferred reasoning model for enterprises. Sam Altman, CEO of OpenAI, recently admitted that he was "on the wrong side of history" in the open source debate. We will have to see how this realization manifests in OpenAI's future releases.
