Champions don’t reign forever. Last week, DeepSeek AI sent chills down the spines of investors and tech companies with its high-flying, low-cost performance. Now, two chip startups are making waves of their own.
Cerebras Systems makes huge computer chips, the size of dinner plates, with a radical design. Groq, meanwhile, makes custom chips for large language models. In a head-to-head test, these alternative chips blew the competition out of the water running a version of DeepSeek’s viral AI.
While responses can take minutes to complete on other hardware, Cerebras said its version of DeepSeek knocked out some coding tasks in as little as 1.5 seconds. According to Artificial Analysis, the company’s wafer-scale chips were 57 times faster than competitors running the AI on GPUs, and the fastest overall. That was last week. Yesterday, Groq overtook Cerebras at the top with a new offering.
By the numbers, DeepSeek’s advance is more nuanced than it appears, but the trend is real. Even as labs plan to scale up their AI models considerably, the algorithms themselves are getting significantly more efficient. On the hardware side, those gains are being matched by Nvidia, but also by chip startups, like Cerebras and Groq, that can outpace it on inference.
Big Tech is committed to buying more hardware, and Nvidia won’t be sidelined anytime soon, but alternatives may begin nibbling at the edges, especially if they can serve AI models faster or more cheaply than the more traditional options.
Be Reasonable
DeepSeek’s new AI, R1, is a “reasoning” model, like OpenAI’s o1. That means instead of spitting out the first response generated, it chews on the problem, reasoning its way to an answer step by step.
For casual chat, this doesn’t make much difference, but for complex, and valuable, problems, like coding or mathematics, it’s a leap forward.
DeepSeek’s R1 is already extremely efficient. That was last week’s news.
Not only was R1 cheaper to train, reportedly just $6 million (though what this number means is disputed), it’s cheap to run, and its weights and engineering details are open. This stands in contrast to headlines about imminent investments in proprietary AI efforts bigger than the Apollo program.
The news gave investors pause; maybe AI won’t need as much cash, or as many chips, as tech leaders think. Nvidia, the likely beneficiary of those investments, took a big hit in the stock market.
Smaller, Faster, Still Smart
All that is on the software side, where algorithms are becoming cheaper and more efficient. But the chips that train and run AI are improving too.
Last year, Groq, a startup founded by Jonathan Ross, the engineer who previously developed Google’s in-house AI chips, made headlines with custom chips for large language models. Where popular chatbots’ responses unspooled line by line on GPUs, conversations on Groq’s chips approached real time.
That was then. The new generation of reasoning AI models takes much longer to deliver answers, by design.
Using what’s called “test-time compute,” these models produce several responses in the background, select the best one, and offer a rationale for their answer. Companies say the answers get better the longer they’re allowed to “think.” These models don’t beat older models across the board, but they’ve made strides in areas where older algorithms struggle, like math and coding.
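To make the idea concrete, here’s a minimal best-of-N sketch in Python. It illustrates one simple form of test-time compute, not DeepSeek’s or OpenAI’s actual method; `generate_answer` and `score_answer` are hypothetical stand-ins for a model’s sampling call and a verifier, and real systems also rely on long chains of thought.

```python
import random

def generate_answer(prompt: str) -> str:
    # Placeholder: a real system would sample an answer from the model here.
    return f"candidate answer {random.random():.2f}"

def score_answer(prompt: str, answer: str) -> float:
    # Placeholder: a real system might use a reward model, a verifier,
    # or unit tests (for code) to judge each candidate.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # Spend extra compute at inference time: sample n candidates and
    # keep the highest-scoring one. Larger n means more "thinking,"
    # better answers, and longer waits.
    candidates = [generate_answer(prompt) for _ in range(n)]
    return max(candidates, key=lambda a: score_answer(prompt, a))

print(best_of_n("Write a function that reverses a linked list."))
```

The tradeoff is visible in the loop: every extra candidate buys quality at the price of latency, which is exactly why inference speed suddenly matters so much.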
As reasoning models shift attention to inference, the process by which a finished AI model handles a user’s request, speed and cost matter more. People want answers fast, and they don’t want to pay more for them. Here, especially, Nvidia faces growing competition.
Case in point: Cerebras, Groq, and several other inference providers decided to host a slimmed-down version of R1.
Instead of the original 671-billion-parameter model (parameters are a measure of an algorithm’s size and complexity), they’re running DeepSeek R1 Llama-70B. As the name suggests, the model is smaller, at only 70 billion parameters. Even so, according to Cerebras, it can still outperform OpenAI’s o1-mini on some benchmarks.
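Parameter count translates directly into serving cost. As a rough, assumption-laden illustration: at 16-bit precision (2 bytes per parameter, before any quantization), the weights alone occupy roughly the following:

```python
# Back-of-the-envelope weight memory at 16-bit precision (2 bytes/parameter).
# Real deployments often quantize to 8 or 4 bits, shrinking these figures.
for name, params in [("DeepSeek R1", 671e9), ("R1 Llama-70B", 70e9)]:
    gigabytes = params * 2 / 1e9
    print(f"{name}: ~{gigabytes:,.0f} GB of weights")
```

That order-of-magnitude gap, roughly 1,342 GB versus 140 GB, is a big part of what makes the smaller model practical to serve at speed.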
Artificial Analysis, an AI benchmarking platform, ran head-to-head performance comparisons of several inference providers last week, and Cerebras came out on top. For a similar cost, the company’s wafer-scale chips spat out around 1,500 tokens per second, compared to 536 and 235 for SambaNova and Groq, respectively. In a demonstration of the efficiency gains, Cerebras said its version of DeepSeek completed in 1.5 seconds a coding task that took OpenAI’s o1-mini 22 seconds.
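To put those rates in perspective, here’s a quick back-of-the-envelope calculation of how long a 1,000-token answer would take to stream at each reported speed. The answer length is illustrative, and real latency also includes time to first token:

```python
# Rough streaming time for a 1,000-token answer at reported throughputs
# (tokens/second figures from Artificial Analysis's comparison).
rates = {"Cerebras": 1500, "SambaNova": 536, "Groq": 235}
answer_tokens = 1000
for provider, tps in rates.items():
    print(f"{provider}: ~{answer_tokens / tps:.1f} seconds")
```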
Yesterday, Artificial Analysis released an update that includes a new Groq offering surpassing Cerebras.
The smaller R1 model can’t match bigger models pound for pound, but Artificial Analysis noted the results are the first time reasoning models have hit speeds comparable to non-reasoning models.
Beyond speed and cost, inference companies also host models wherever they’re based. DeepSeek shot to the top of the charts in popularity last week, but its models are hosted on servers in China, and experts have since raised concerns about security and privacy. In its press release, Cerebras made a point of noting that it hosts DeepSeek in the US.
Less Is More
Whatever its longer-term impact, the news exemplifies a strong (and, it’s worth noting, preexisting) trend toward greater efficiency in AI.
Since OpenAI previewed o1 last year, the company has moved on to its next model, o3. It gave users access to a smaller version of the newest model, o3-mini, last week. Yesterday, Google released versions of its own reasoning models whose efficiency approaches R1’s. And because DeepSeek’s models are open and include a detailed paper on their development, incumbents and newcomers alike will adopt the advances.
Meanwhile, frontier labs remain committed to going big. Google, Microsoft, Amazon, and Meta will spend $300 billion, largely on AI, this year. And OpenAI and SoftBank have agreed to a four-year, $500-billion data-center project called Stargate.
Anthropic CEO Dario Amodei describes this as a three-part flywheel. Bigger models yield leaps in capability. Companies then refine those models, which, among other improvements, now includes developing reasoning models. And woven throughout, advances in hardware and software make the algorithms cheaper and more efficient.
That last trend means companies can scale further for less at the frontier, while smaller, nimbler algorithms with advanced capabilities open up new applications and demand. Until that process runs out of steam, which is itself a topic of debate, there will be demand for AI chips of all kinds.