Cerebras just announced 6 new AI data centers that process 40M tokens per second – and it could be bad news for Nvidia

Cerebras Systems, an AI hardware startup that has been steadily challenging Nvidia’s dominance in the artificial intelligence market, announced on Tuesday a significant expansion of its data center footprint and two major enterprise partnerships that position the company as a leading provider of high-speed AI inference services.

The company will add six new AI data centers across North America and Europe, increasing its inference capacity twentyfold to over 40 million tokens per second. The expansion includes facilities in Dallas, Minneapolis, Oklahoma City, Montreal, New York, and France, with 85% of the total capacity located in the United States.

“This year, our goal is to satisfy all the demand and all the new demand we expect from new models like Llama 4 and new DeepSeek models,” said James Wang, director of product marketing at Cerebras, in an interview with VentureBeat. “This is our huge growth initiative this year to satisfy the almost unlimited demand we’re seeing for inference tokens across the board.”

The data center expansion represents the company’s ambitious bet that the market for high-speed AI inference (the process in which trained AI models generate outputs for real-world applications) will grow dramatically as companies seek faster alternatives to Nvidia’s GPU-based solutions.

Cerebras plans to expand from 2 million to over 40 million tokens per second by the fourth quarter of 2025, across eight data centers in North America and Europe. (Credit: Cerebras)

Strategic partnerships that bring high-speed AI to developers and financial analysts

Alongside the infrastructure expansion, Cerebras announced partnerships with Hugging Face, the popular AI developer platform, and AlphaSense, a market intelligence platform widely used in the financial services industry.

The Hugging Face integration will let its five million developers access Cerebras Inference with a single click, without registering for Cerebras separately. This becomes a major distribution channel for Cerebras, particularly for developers working with open-source models such as Llama 3.3 70B.

“Hugging Face is kind of the GitHub of AI and the center of all open-source AI development,” said Wang. “The integration is super nice and native. You just appear in their list of inference providers. You just check the box and then you can use Cerebras right away.”
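For developers, that one-click flow maps onto Hugging Face’s standard Python inference client. The sketch below is illustrative rather than taken from either company’s documentation: the provider identifier cerebras and the exact model repository ID are assumptions that should be checked against current Hugging Face docs.

```python
# Minimal sketch: calling Llama 3.3 70B through Hugging Face's InferenceClient
# with Cerebras selected as the inference provider.
# Assumption: the provider identifier "cerebras" and the model repo ID below
# should be verified against current Hugging Face documentation.
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="cerebras",   # route requests to Cerebras hardware (assumed identifier)
    api_key="hf_xxx",      # a Hugging Face access token
)

completion = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Explain wafer-scale inference in one sentence."}],
    max_tokens=120,
)
print(completion.choices[0].message.content)
```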

The AlphaSense partnership represents a significant enterprise win, with the financial intelligence platform switching to Cerebras from what Wang described as a “global, top-three closed-source AI model vendor.” The company, which serves approximately 85% of Fortune 100 companies, is using Cerebras to accelerate its AI-powered search capabilities for market intelligence.

“This is a huge customer win and a very large contract for us,” said Wang. “We speed them up by 10x, so what used to take five seconds or more is now basically instant on Cerebras.”

Mistral’s Le Chat, powered by Cerebras, processes 1,100 tokens per second, outpacing competitors such as Google’s Gemini, ChatGPT, and Claude. (Credit: Cerebras)

How Cerebras is winning the race for AI inference speed as reasoning models slow down

Cerebras has positioned itself as a high-speed inference specialist, claiming that its Wafer-Scale Engine (WSE-3) processor runs AI models 10 to 70 times faster than GPU-based solutions. That speed advantage has become increasingly valuable as AI models evolve toward more complex reasoning capabilities.

“If you listen to Jensen’s talks, reasoning is the next big thing, even according to Nvidia,” Wang said, referring to Nvidia CEO Jensen Huang. “But what he’s not telling you is that reasoning makes the whole thing run 10 times slower, because the model has to think and generate a bunch of internal monologue before it gives you the final answer.”

That slowdown creates an opportunity for Cerebras, whose specialized hardware is designed to accelerate these more complex AI workloads. The company has already secured high-profile customers, including Perplexity AI and Mistral AI, which use Cerebras to power their AI search and assistant products, respectively.

“We help Perplexity become the world’s fastest AI search engine. This just isn’t possible otherwise,” said Wang. “We help Mistral achieve the same feat. Now they have a reason for people to subscribe to Le Chat Pro, whereas before, your model is probably not at the same cutting-edge level as GPT-4.”

Cerebras’ hardware delivers inference up to 13x faster than GPU solutions for popular AI models such as Llama 3.3 70B and DeepSeek R1 70B. (Credit: Cerebras)

The compelling economics behind Cerebras’ challenge to OpenAI and Nvidia

Cerebras is betting that the combination of speed and cost will make its inference services attractive even to companies that already use leading models such as GPT-4.

Wang pointed to Llama 3.3 70B, an open-source model that Cerebras has optimized for its hardware, which now matches OpenAI’s GPT-4 on intelligence benchmarks while costing far less to run.

“Anyone who is using GPT-4 today can just move to Llama 3.3 70B as a drop-in replacement,” he said. “The price of GPT-4 is (about) $4.40 in blended terms. And Llama 3.3 is like 60 cents. We’re about 60 cents, right? So you reduce cost by almost an order of magnitude. And if you use Cerebras, you increase speed by another order of magnitude.”
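Taking Wang’s figures at face value, and assuming they are blended prices per million tokens (a unit the article does not state explicitly), the claimed cost reduction works out to roughly 7x, consistent with his “almost an order of magnitude.” A quick back-of-the-envelope check for a hypothetical workload:

```python
# Back-of-the-envelope check of the pricing quoted above.
# Assumptions: prices are in dollars per 1M blended tokens, and the
# 500M-token monthly workload is hypothetical, chosen for illustration.
GPT4_PRICE = 4.40     # $ per 1M tokens (quoted by Wang)
LLAMA_PRICE = 0.60    # $ per 1M tokens on Cerebras (quoted by Wang)

monthly_tokens_millions = 500  # hypothetical: 500M tokens per month

gpt4_cost = GPT4_PRICE * monthly_tokens_millions
llama_cost = LLAMA_PRICE * monthly_tokens_millions

print(f"GPT-4 blended:  ${gpt4_cost:,.2f}/month")          # $2,200.00/month
print(f"Llama 3.3 70B:  ${llama_cost:,.2f}/month")         # $300.00/month
print(f"Cost reduction: {GPT4_PRICE / LLAMA_PRICE:.1f}x")  # ~7.3x
```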

Inside Cerebras’ tornado-proof data centers built for AI resilience

As part of the expansion, the company is making substantial investments in resilient infrastructure. The Oklahoma City facility, scheduled to come online in June 2025, is designed to withstand extreme weather events.

“Oklahoma, as you know, is kind of a tornado zone. So this data center is actually rated and designed to be fully resistant to tornadoes and seismic activity,” said Wang. “It will withstand the strongest tornado ever recorded. If one just blows through, this thing will keep sending Llama tokens to developers.”

The Oklahoma City facility, operated in partnership with Scale Datacenter, will house more than 300 Cerebras CS-3 systems and features triple-redundant power stations and custom water-cooling solutions designed specifically for Cerebras’ systems.

Built to withstand extreme weather, this facility will house over 300 Cerebras CS-3 systems when it opens in June 2025, with redundant power and specialized cooling systems. (Credit: Cerebras)

From skepticism to market leadership: How Cerebras is proving its value

The expansion and partnerships announced today mark an important milestone for Cerebras, which has had to prove itself in an AI hardware market dominated by Nvidia.

“I think whatever skepticism there may have been about customer adoption when we first launched has now been fully put to bed, just given the diversity of logos we have,” said Wang.

The company is targeting three specific areas where fast inference delivers the most value: real-time voice and video processing, reasoning models, and coding applications.

“Coding sits somewhere between reasoning and regular Q&A; it might take 30 seconds to a minute to generate all the code,” said Wang. “Speed there is directly proportional to developer productivity. So speed matters.”

By focusing on high-speed inference rather than competing across all AI workloads, Cerebras has found a niche where it can claim leadership over even the largest cloud providers.

“Nobody generally competes against AWS and Azure at their scale. We obviously don’t reach their full scale, but in our ability to replicate a key segment, we have more capacity than them on the high-speed inference front,” said Wang.

Why Cerebras’ US-centric expansion matters for AI sovereignty and future workloads

The expansion comes as the AI industry increasingly focuses on inference capabilities, with companies moving from experimenting with generative AI to deploying it in production applications where speed and cost efficiency are critical.

With 85% of its inference capacity located in the United States, Cerebras is also positioning itself as a key player in the push for domestic AI infrastructure at a time when technological sovereignty has become a national priority.

“Cerebras is turbocharging the future of US AI leadership with unmatched performance, scale, and efficiency. These new global data centers will serve as the backbone for the next wave of AI innovation,” said Dhiraj Mallick, COO of Cerebras Systems, in the company’s announcement.

As reasoning models such as DeepSeek-R1 and OpenAI’s o3 become more common, demand for faster inference solutions is likely to grow. These models, which can take minutes to generate answers on conventional hardware, run near-instantaneously on Cerebras systems, according to the company.

For technical decision-makers evaluating AI infrastructure options, Cerebras’ expansion represents a significant new alternative to GPU-based solutions, particularly for applications where response time is critical to the user experience.

Whether the company can truly challenge Nvidia’s dominance in the broader AI hardware market remains to be seen, but its focus on high-speed inference and its substantial infrastructure investments demonstrate a clear strategy for carving out a valuable segment of the rapidly evolving AI landscape.


