In a paper published on its DeepSeek-V3 large language model (LLM), which was launched in December, the Chinese start-up said that training took only 2.8 million GPU hours at a cost of US$5.6 million, a fraction of the time and money that US companies have devoted to their own models.
DeepSeek-R1, the company’s open-source reasoning model released on January 20, has demonstrated capabilities comparable to those of the more advanced models from OpenAI, Anthropic and Google, also at significantly lower training cost. The paper on R1 did not mention the cost of development.
DeepSeek’s own filings, and those of High-Flyer, its affiliated hedge fund, show that the company has been one of the keenest players in AI training. As early as 2019, Liang Wenfeng, the founder of High-Flyer and DeepSeek, spent 200 million yuan (US$27.8 million) to buy 1,100 graphics processing units (GPUs) to train market-trading algorithms. High-Flyer said its computing centre at the time covered an area equivalent to a basketball court, according to company documents, which would have put it at around 436.6 square metres (4,700 square feet).
In 2021, the fund spent 1 billion yuan developing its Fire-Flyer 2 supercomputing cluster, which was designed to reach 1,550 petaflops, a measure of computing power, according to High-Flyer’s website. That would put it on a par with some of the most powerful supercomputers in the world.