DeepSeek: Everything you need to know about the AI chatbot app

DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into mainstream awareness this week after its chatbot app rose to the top of the Apple App Store charts. DeepSeek's AI models, which were trained with compute-efficient techniques, have led Wall Street analysts and technologists to ask whether the United States can maintain its lead in the AI race and whether demand for AI chips will hold up.

But where did DeepSeek come from, and how did it rise to international fame so quickly?

DeepSeek's trader origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Liang, who reportedly began dabbling in trading while a student at Zhejiang University, launched a hedge fund in 2019 that focused on developing and deploying AI algorithms.

In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools separate from its financial business. With High-Flyer as one of its investors, the lab was spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by US export bans on hardware. To train one of its more recent models, the company had to use Nvidia H800 chips, a less powerful version of the H100 chip that is available to US companies.

DeepSeek's technical team is said to skew young. The company reportedly recruits doctoral-level AI researchers aggressively from top Chinese universities. DeepSeek also hires people without a computer science background to help its technology better understand a wide range of subjects, according to The New York Times.

DeepSeek's strong models

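In November 2023, DeepSeek unveiled its first set of models: DeepSeek Coder, DeepSeek LLM and DeepSeek Chat. But it was only with the release of its next-generation DeepSeek-V2 family that the AI industry began to take notice.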

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well on various AI benchmarks and was far cheaper to run than comparable models at the time. It forced DeepSeek's domestic competitors, including ByteDance and Alibaba, to cut the usage prices for some of their models and to make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek's fame.

According to DeepSeek's internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models such as Meta's Llama and "closed" models that can only be accessed through an API, such as OpenAI's GPT-4o.

DeepSeek's R1 reasoning model is equally impressive. R1 was released in January, and DeepSeek claims it performs as well as OpenAI's o1 model on key benchmarks.

As a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer, usually seconds to minutes longer, to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science and mathematics.

There is a downside to R1, DeepSeek-V3 and DeepSeek's other models, however. As Chinese-developed AI, they are subject to benchmarking by China's internet regulator to ensure that their responses embody "core socialist values". In DeepSeek's chatbot app, for example, R1 will not answer questions about Tiananmen Square or Taiwan's autonomy.

A disruptive approach

If DeepSeek has a business model, it is not clear what that model is. The company prices its products and services far below market value, and gives others away for free.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts, however, dispute the figures the company has supplied.

Whatever the case may be, developers have taken to DeepSeek's models, which are not open source as the phrase is commonly understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek's models, developers on Hugging Face have created over 500 "derivative" models of R1 that have racked up 2.5 million downloads combined.

DeepSeek's success against larger and more established competitors has been described as "upending AI" and ushering in "a new era of AI brinkmanship". The company's success was at least partially responsible for Nvidia's share price dropping 18% on Monday and for eliciting a public response from OpenAI CEO Sam Altman.

As for what DeepSeek's future holds, that is not clear. Improved models are a given. But the US government appears to be growing wary of what it perceives as harmful foreign influence.

TechCrunch has an AI-focused newsletter! Sign up here to get it in your inbox every Wednesday.
