Dario Amodei challenges DeepSeek's $6 million AI narrative: What Anthropic thinks about China's latest AI move




The AI world was shaken last week when DeepSeek, a Chinese AI startup, announced its latest language model, DeepSeek-R1, which reportedly matches the capabilities of leading American AI systems at a fraction of the cost. The announcement triggered a widespread market selloff that wiped nearly $200 billion from Nvidia's market value and sparked fierce debates about the future of AI development.

The narrative that quickly emerged suggested that DeepSeek had fundamentally disrupted the economics of building advanced AI systems, allegedly achieving for just $6 million what American companies spend billions on. This interpretation sent shockwaves through Silicon Valley, where companies like OpenAI, Anthropic, and Google have justified massive investments in computing infrastructure to maintain their technological lead.

But amid the market turbulence and breathless headlines, Dario Amodei, co-founder of Anthropic and one of the pioneering researchers behind today's large language models (LLMs), published a detailed analysis that offers a more nuanced perspective on DeepSeek. His blog post cuts through the hysteria to deliver several crucial insights into what DeepSeek actually achieved and what it means for the future of AI development.

Here are the four most important takeaways from Amodei's analysis, which reshape our understanding of DeepSeek's announcement.

1. The '$6 million model' narrative misses crucial context

DeepSeek's reported development costs, Amodei argues, must be viewed through a wider lens. He directly challenges the popular interpretation:

"DeepSeek does not 'do for $6M what cost US AI companies billions,'" he wrote. "I can only speak for Anthropic, but Claude 3.5 Sonnet is a mid-sized model that cost a few $10M's to train (I won't give an exact number). Also, 3.5 Sonnet was not trained in any way that involved a larger or more expensive model (contrary to some rumors)."

This revelation fundamentally shifts the narrative around DeepSeek's cost efficiency. Considering that Sonnet was trained 9-12 months ago and still outperforms DeepSeek's model on many tasks, the achievement looks more like the natural progression of declining AI development costs than a revolutionary breakthrough.

Timing and context also matter. Following historical trends of cost reduction in AI development, which Amodei estimates at roughly 4x per year, DeepSeek's cost structure appears largely on trend rather than dramatically ahead of the curve.
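To see what a ~4x annual cost decline implies, here is a small illustrative calculation. The starting cost and time gap below are hypothetical numbers chosen for illustration; only the ~4x/year rate comes from Amodei's estimate.

```python
# Illustrative sketch of Amodei's ~4x/year cost-decline estimate.
# The $40M starting cost and the 9-12 month gap are hypothetical,
# chosen only to show how quickly expected training costs fall.

def projected_cost(initial_cost_usd: float, years_elapsed: float,
                   decline_factor: float = 4.0) -> float:
    """Project training cost after `years_elapsed`, assuming costs
    fall by `decline_factor` each year."""
    return initial_cost_usd / (decline_factor ** years_elapsed)

# Hypothetical: a model that cost ~$40M to train 9-12 months ago.
for months in (9, 12):
    cost = projected_cost(40e6, months / 12)
    print(f"After {months} months: ~${cost / 1e6:.1f}M")
```

Under these made-up numbers, a cost several times lower than a year-old frontier model is roughly what the trend alone would predict.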

2. DeepSeek-V3, not R1, was the real technical achievement

While markets and media have focused intensely on DeepSeek's R1 model, Amodei points out that the company's more important innovation came earlier.

"DeepSeek-V3 was actually the real innovation and what should have made people take notice a month ago (we certainly did)," he wrote. "As a pretrained model, it appears to come close to the performance of state-of-the-art US models on some important tasks, while costing substantially less to train."

The distinction between V3 and R1 is crucial for understanding DeepSeek's true technological advance. V3 represented genuine engineering innovations, particularly in managing the model's "Key-Value cache" and pushing the limits of the mixture-of-experts (MoE) method.
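For readers unfamiliar with the MoE technique mentioned above, here is a minimal sketch of the general idea, not DeepSeek's actual implementation: a router activates only the top-k "expert" sub-networks per input, so compute per forward pass stays low even as total parameters grow. All names and numbers here are illustrative.

```python
# Minimal, hypothetical sketch of mixture-of-experts (MoE) routing.
# Only the TOP_K highest-scoring experts run per input, which is how
# MoE models cut per-token compute relative to dense models.
import math

NUM_EXPERTS, TOP_K = 8, 2

def moe_forward(x: float, router_logits: list[float], experts) -> float:
    """Route input x to its top-k experts; mix outputs by a softmax
    gate computed over only the selected experts' router scores."""
    # Indices of the TOP_K highest router scores.
    top = sorted(range(NUM_EXPERTS), key=lambda i: router_logits[i])[-TOP_K:]
    z = [math.exp(router_logits[i]) for i in top]
    gates = [v / sum(z) for v in z]  # softmax over the selected experts
    return sum(g * experts[i](x) for g, i in zip(gates, top))

# Toy experts: expert k just multiplies its input by (k + 1).
experts = [lambda x, k=k: (k + 1) * x for k in range(NUM_EXPERTS)]
logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]  # made-up router scores

print(moe_forward(1.0, logits, experts))
```

In a real transformer the experts are feed-forward blocks and the router is learned, but the routing-and-gating structure is the same.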

This insight explains why the market's dramatic reaction to R1 may have been misplaced. R1 essentially added reinforcement learning capabilities to V3's foundation, a step that several companies are currently taking with their own models.

3. Total corporate investment tells a different story

Perhaps the most revealing aspect of Amodei's analysis concerns DeepSeek's overall investment in AI development.

"It has been reported — we can't be certain it is true — that DeepSeek actually had 50,000 Hopper-generation chips, which I'd guess is within a factor of ~2-3x of what the major US AI companies have," he wrote. "Those 50,000 Hopper chips cost on the order of ~$1B. Thus, DeepSeek's total spend as a company (as distinct from spend to train an individual model) is not vastly different from US AI labs."

This revelation dramatically reframes the story of DeepSeek's resource efficiency. While the company may have achieved impressive efficiency in training individual models, its overall investment in AI development appears roughly comparable to that of its American counterparts.
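The ~$1 billion hardware figure can be sanity-checked with back-of-the-envelope arithmetic. The per-chip price below is a hypothetical estimate, not a number from Amodei's post.

```python
# Back-of-the-envelope check of the ~$1B figure for 50,000
# Hopper-generation chips. The per-chip price is a hypothetical
# estimate for an H100-class GPU, used only for illustration.

NUM_CHIPS = 50_000
EST_PRICE_PER_CHIP_USD = 20_000  # hypothetical unit price

total = NUM_CHIPS * EST_PRICE_PER_CHIP_USD
print(f"~${total / 1e9:.1f}B")  # ~$1.0B
```

At any plausible unit price in the tens of thousands of dollars, 50,000 chips lands in the low billions, which is consistent with the order of magnitude Amodei cites.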

The distinction between per-model training costs and total corporate investment underscores the continued importance of substantial resources in AI development. It suggests that while engineering efficiency can improve, staying competitive in AI still demands significant capital.

4. The current “crossover point” is temporary

Amodei describes the current moment in AI development as unique but fleeting.

"We're therefore at an interesting 'crossover point', where it is temporarily the case that several companies can produce good reasoning models," he wrote. "This will rapidly cease to be true as everyone moves further up the scaling curve on these models."

This observation provides crucial context for understanding the current state of AI competition. The ability of multiple companies to achieve similar results in reasoning capabilities is a temporary phenomenon rather than a new status quo.

The implications matter for the future of AI development. As companies continue to scale up their models, particularly in the resource-intensive domain of reinforcement learning, the field will likely differentiate again based on who can invest the most in training and infrastructure. This suggests that DeepSeek has reached an impressive milestone but has not fundamentally changed the long-term economics of advanced AI development.

The real cost of building AI: What Amodei's analysis reveals

Amodei's detailed analysis of DeepSeek's achievements cuts through weeks of market speculation to expose the actual economics of building advanced AI systems. His blog post systematically dismantles both the panic and the enthusiasm that followed DeepSeek's announcement, showing how the company's $6 million model training cost fits within the steady march of AI development.

Markets and media gravitated toward a simple story: a Chinese company dramatically undercutting US AI development costs. Amodei's breakdown, however, reveals a more complex reality: DeepSeek's total investment, particularly its reported ~$1 billion in computing hardware, mirrors the expenditures of its American counterparts.

This moment of cost parity between US and Chinese AI development marks what Amodei calls a "crossover point": a temporary window in which several companies can achieve similar results. His analysis suggests this window will close as AI capabilities advance and training requirements intensify, with the field likely returning to favoring the organizations with the deepest resources.

Building advanced AI remains an expensive undertaking, and Amodei's careful examination shows why measuring its true cost requires looking at the full scope of investment. His methodical deconstruction of DeepSeek's achievements may ultimately prove more significant than the initial announcement that triggered such market turbulence.


