This week’s revelation that DeepSeek’s R1 model can offer performance in line with ChatGPT at a fraction of the development cost has had a seismic impact on the AI space, particularly in the US.
Already, western AI firms are claiming that DeepSeek is underreporting the true cost of developing its model, echoing longstanding accusations from western vendors that their Chinese rivals undercut them unfairly. While it is unlikely that DeepSeek has reported every associated cost, its achievement is still a significant development that will reverberate across the AI space.
We asked Reece Hayden, AI & Machine Learning Principal Analyst at global technology intelligence firm ABI Research, about whether DeepSeek’s numbers add up, and what the ramifications will be for AI if so.
DeepSeek reportedly offers a cost-to-performance ratio several times lower than other models such as ChatGPT. DeepSeek claims to have produced the R1 model in two months for around $5.6 million, using Nvidia’s H800 chips, compared to the H100 chips typically used by US AI firms. How feasible is this?
This is feasible, as training AI models is mostly driven by the capacity of the GPU system. Some reports say DeepSeek has around 50,000 GPUs, which would effectively support training a model like R1. For generative AI training, what is crucial is creating enough processing capacity to handle massive data sets. The solution lies in building systems that can effectively coordinate a massive number of processors, regardless of their individual performance. I believe this is exactly what DeepSeek did to create the required training capacity.
In addition, their ability to train this model effectively was also supported by the quality of their code. DeepSeek has focused on optimizing its training processes to deal with capacity constraints, using new training methodologies and supporting large-scale reinforcement learning.
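DeepSeek has not published its training stack, so the following is purely illustrative: the coordination Hayden describes is commonly achieved with data-parallel training, in which every GPU holds a replica of the model and gradients are synchronized after each step. A minimal sketch using PyTorch’s DistributedDataParallel (the framework choice and every detail here are our assumption, not DeepSeek’s code):

```python
# Illustrative sketch only, NOT DeepSeek's code: data-parallel training,
# the standard way to pool many modest GPUs into one large training system.
# Launch with torchrun, e.g.: torchrun --nproc_per_node=8 ddp_sketch.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # One process per GPU; torchrun sets LOCAL_RANK for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Stand-in model; in practice this would be a large transformer.
    model = torch.nn.Linear(4096, 4096).cuda(local_rank)

    # DDP all-reduces gradients across processes on every backward pass,
    # so N slower GPUs can jointly act like one much larger system.
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(100):
        x = torch.randn(32, 4096, device=local_rank)
        loss = model(x).pow(2).mean()  # dummy objective on random data
        opt.zero_grad()
        loss.backward()  # gradient synchronization happens here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

At frontier scale this is typically combined with tensor, pipeline, and expert parallelism, but the principle is the same: software that coordinates many processors matters more than any single chip’s speed.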
Although I am skeptical about the reported costs (I expect they left out some of the major costs and included only the compute cost of the training run), if true, it highlights that leading-edge GPUs are no longer the main value driver; AI R&D expertise that can effectively build GPU systems and apply leading-edge techniques will be more important. I am also skeptical about the reported cost of inference for the model, which is likely part of a strategy to undercut rivals and gain market share. That cost does not seem reasonable even if the model was developed cheaply.
What are the factors that have likely driven the company’s strategy of undercutting its US rivals on cost?
- Showing capabilities: there is speculation that this announcement is being used to showcase China’s AI capabilities within the scope of the US-China AI war. By demonstrating that it can build leading-edge models at ultra-low cost even without access to leading-edge chipsets, DeepSeek is signaling that the US chip export ban does not hold it back.
- Capturing data: the value of AI model adoption is not just about paying for access; the data captured from end users is hugely valuable, as it can be used to train models or support other products and goals.
- Building market share: the cost of inference has fallen dramatically over the last 18 months, much as the cost of search once did. Undercutting on price will increase adoption globally and help build DeepSeek’s brand awareness and market share.
If DeepSeek has indeed created an AI model with comparable performance for a fraction of the cost, what are the implications for the AI investment landscape, particularly given that US firms invested hugely in AI last year?
SLMs: One of the key trends I have been watching for a while is small language models (SLMs). SLMs are highly targeted, sub-10B-parameter GenAI models capable of performing narrow tasks for specific verticals or industries. They offer much better price/performance than LLMs (like ChatGPT). I think this announcement will trigger much larger investment in SLMs and support for companies like Arcee.ai that are actively investing in this market.
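For readers unfamiliar with the category, here is a minimal sketch of what running an SLM looks like in practice, using the Hugging Face transformers library (the model named below is one illustrative sub-10B open-weights example we chose for the sketch, not anything referenced in this article):

```python
# Illustrative sketch: running a small language model (SLM) locally for a
# narrow task. Requires the transformers and accelerate packages; the model
# name is an example choice, not an endorsement.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-1.5B-Instruct",  # ~1.5B parameters
    device_map="auto",  # place weights on GPU if available, else CPU
)

# A narrow, vertical-specific task of the kind SLMs are suited to.
prompt = "Summarize this support ticket in one sentence: customer reports..."
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```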
Investment in AI development & R&D, not just chips: this announcement firmly highlights that investment should go not just into infrastructure but into AI development itself. Expect increased investment in R&D targeting new training methodologies, data collection and optimization, and new fine-tuning techniques. The real breakthrough is DeepSeek’s use of new training methodologies to achieve comparable performance at far lower cost.
Inference will become the focus, with many players moving up the stack to software & services: I do not think this will reduce demand for AI chips, infrastructure, or data centre capacity – but it will shift the focus to inference rather than training. This is just the latest of many announcements driving the commoditization of models. Model capabilities are plateauing, and the focus is moving to cost reduction. More investment will therefore flow to the application and service layer, as this is where money can still be made.
What does DeepSeek’s development mean for the global AI landscape? Will this lead to competitive innovation or price wars?
Both. Price wars are already happening in the US, with the cost of inference declining sharply. US firms will likely respond with competitive innovation, mostly targeting training processes, applications, and services. One thing I am sure of is that inference (rather than training) will now be the focus, as US firms will see this as the way to make money. Why spend heavily competing on marginally improved models when you can use open-source models and build over the top?
Could DeepSeek’s cost-efficient model accelerate AI adoption across industries? What does this mean for Nvidia and other western powerhouses?
No, I do not think it will have a major impact on enterprises. The cost of inference is already very low, and it is not the main issue holding enterprises back.
The bigger challenges hindering deployment of GenAI are:
- Data privacy, sovereignty, and governance
- Operational implementation and return on investment
- Use case identification
- Upskilling workers & value creation with AI tools
- Trust
- Customization and verticalization
When I look at the enterprise market, I think about technology, people, and process. For GenAI, the technology is not the issue - models are good enough and can support a lot of back-office use cases. The biggest challenges and barriers to enterprise deployment are people and process.
On top of this, trust will be a major issue given that this model comes out of China. Before it comes close to enterprise deployment, expect a huge amount of testing. I am surprised that companies like Perplexity have embedded this model within their chatbots - although there are misconceptions around ownership, and because the model is open source, Perplexity retains control over where its data goes. A huge amount of scrutiny will come from customers and enterprises, especially around the censorship embedded within the open-source model. At the end of the day, even an open-source model is influenced by its developers, as a model is only as good as its training data.
How might this development influence the balance of AI leadership between the US and China?
I do not think it will have a very large impact on AI leadership in the near future. This is the first time the US’s AI leadership has faced a serious challenge from abroad, which will create a stir and build hype around the AI race. What it will do, though, is accelerate the geopolitical war currently being fought around AI, making it likely that US-based AI companies will push for additional controls to help them retain their leadership.