DeepSeek marks a potential shift in the AI competitive landscape
Portfolio Manager Richard Clode discusses the market’s current concerns around DeepSeek's most recent LLM developments and what impact this may have on investors.

Key takeaways:
- DeepSeek’s innovative V3 LLM and its reinforcement learning-based reasoning model R1 suggest that the company has made advances in offering more efficient and cost-effective AI solutions.
- This is driving a reassessment of AI investment strategies, focusing attention on the sustainability of AI capital expenditures, the AI competitive landscape, and the monetisation of AI.
- A more selective approach to identifying AI capex beneficiaries, as well as looking ahead to the next phases of AI investment opportunity, is crucial as this new tech wave develops.
What has DeepSeek achieved in terms of LLM innovation?
DeepSeek, the Chinese AI startup and developer of open-source large language models (LLMs), launched its third-generation V3 LLM in December 2024. DeepSeek-V3 is a mixture of experts (MoE) model that benchmarks well against the best LLMs developed in the West. It was followed this month by DeepSeek-R1, a reinforcement learning reasoning model that benchmarks well against OpenAI’s o1 generative pre-trained transformer (GPT). V3’s MoE design has several smaller models working together, with a total of 671 billion parameters but only 37 billion active parameters for each token during inferencing. V3 has further innovations, such as multi-head latent attention (MHLA), which reduces cache and memory usage, mixed precision computation on FP8, and a re-architected post-training phase.
MoE always looks more efficient, as only a portion of the total parameters is active at any given point during token inferencing, so that is not overly surprising. Even so, V3 looks even more efficient: about 10x vs peers, and 3-7x given other innovations. The DeepSeek-R1 model is claimed, uniquely, to have done away with supervised fine-tuning. So there seems to be some innovation there, even if a lot of the headline improvements come from more standard techniques. There is also a wider debate on how much of the work DeepSeek has done itself and how much comes from leveraging open-source third-party LLMs.
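To make the MoE efficiency point concrete, the short Python sketch below works out the share of parameters active per token from the two figures quoted above, and then shows a toy version of top-k expert routing. The routing function, the expert count of 8 and the top-k of 2 are hypothetical values chosen for illustration only; this is a generic sketch of the idea, not DeepSeek's actual router.

```python
import math
import random

# Figures quoted in the article for DeepSeek-V3.
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters active per token, in billions

# Share of the model that does work for any single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B


def softmax(scores):
    """Turn raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k):
    """Pick the top_k experts for one token and renormalise their weights.

    Hypothetical helper for illustration; returns {expert_index: weight}.
    """
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


# Toy sizes (assumptions, not DeepSeek's configuration).
NUM_EXPERTS = 8
TOP_K = 2

rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = route_token(scores, TOP_K)

print(f"Active share of parameters per token: {active_fraction:.1%}")
print(f"Experts used for this token: {sorted(routing)} of {NUM_EXPERTS}")
```

On the quoted figures, roughly one parameter in eighteen is active for a given token, which is why an MoE model can look cheap to run relative to its total size.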
3 key reasons why the markets are concerned with DeepSeek
1. DeepSeek appears to have significantly lower training costs
DeepSeek claims to have trained V3 on only 2,048 NVIDIA H800 GPUs for two months, which at US$2 per GPU hour explains the announced headline total cost of about US$5 million. That is a fraction of what Western hyperscalers are throwing at their LLM training (e.g. it is 9% of the compute used for Meta’s LLaMA 3.1 405B model).
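The arithmetic behind that headline number can be sketched in a few lines of Python. The GPU count and hourly rate are the figures quoted above; treating "two months" as 60 days of round-the-clock use is an assumption made here for illustration, so the result is a ballpark rather than DeepSeek's own accounting.

```python
def training_cost_usd(num_gpus, days, usd_per_gpu_hour):
    """Rental-style cost of one training run: GPUs x hours x hourly rate."""
    gpu_hours = num_gpus * days * 24
    return gpu_hours * usd_per_gpu_hour


NUM_GPUS = 2048            # quoted in the article
USD_PER_GPU_HOUR = 2.0     # quoted in the article
TRAINING_DAYS = 60         # assumption: "two months" taken as 60 days

gpu_hours = NUM_GPUS * TRAINING_DAYS * 24
cost_usd = training_cost_usd(NUM_GPUS, TRAINING_DAYS, USD_PER_GPU_HOUR)

print(f"{gpu_hours:,} GPU hours, about US${cost_usd / 1e6:.1f} million")
# prints: 2,949,120 GPU hours, about US$5.9 million
```

Under the 60-day assumption the estimate comes out slightly above the quoted headline, and a somewhat shorter run would bring it closer. Either way, the figure covers a single run priced at a rental rate, not the cost of owning the hardware or of any earlier experiments.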
2. China can still compete despite US restrictions
DeepSeek shows that a Chinese company can compete with the best-of-breed US AI companies, despite the current restrictions on Chinese access to advanced US semiconductor technology. This evokes memories of a generation of Russian coders who, given restrictions on PC time in post-Soviet Russia, invented ingenious ways to code. Has the same thing happened in China, where semiconductor restrictions have forced greater LLM architecture innovation, versus the US, which has relied on throwing the compute kitchen sink at the problem?
3. AI monetisation
DeepSeek is charging significantly less than OpenAI to use its models (about 20-40x lower), which plays into the AI monetisation concern given the extraordinary amounts of capex deployed in the West.
A notable AI force
The global AI ecosystem is taking note of DeepSeek’s developments. Despite only being launched two years ago (2023), DeepSeek benefits from the pedigree and backing of the team at quantitative fund High-Flyer Capital Management, as well as the success and innovation of its prior generation models. Although V3 was launched in December and R1 earlier this month, the market is only reacting now, as R1’s reasoning capabilities have come to be viewed as cutting edge. In addition, over the last weekend DeepSeek became the top free app on Apple’s App Store, overtaking ChatGPT. Silicon Valley investor Marc Andreessen posted that DeepSeek is “one of the most amazing and impressive breakthroughs I’ve ever seen,” which is high praise from a credible industry veteran. Comments like that have heightened the market’s concerns about the sustainability of AI capex and associated companies like NVIDIA.
What do we make of all this?
- New technology waves require innovation
Any new technology wave requires innovation to drive down the cost curve over time to enable mass adoption. We are witnessing multiple avenues of AI innovation to address scaling issues with training LLMs as well as more efficient inferencing. DeepSeek appears to bring some genuine innovation to the architecture of general purpose and reasoning models. Innovation and the driving down of costs are key to unlocking AI and enabling mass adoption longer term.
- Distillation
DeepSeek’s model leverages a technique called distillation, which is being pursued more broadly in the AI industry. Distillation refers to equipping smaller models with the abilities of larger ones, by transferring the learnings of the larger, teacher model into the smaller, student one. However, it is important to note DeepSeek’s distillation techniques are reliant on the work of others. Exactly how reliant is a key question the market is grappling with currently.
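As a rough illustration of the teacher and student idea described above, the Python sketch below computes a classic soft-label distillation loss: the student is penalised by how far its output distribution sits from the teacher's, measured with KL divergence on temperature-softened probabilities. This is one textbook formulation, assumed here for illustration; the article does not say which variant DeepSeek uses, and the logits are made-up numbers.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher's soft labels
    q = softmax(student_logits, temperature)  # student's predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over three candidate next tokens.
teacher = [4.0, 1.0, 0.5]
far_student = [0.5, 1.0, 4.0]    # disagrees with the teacher
near_student = [3.8, 1.1, 0.4]   # has nearly matched the teacher

far_loss = distillation_loss(teacher, far_student)
near_loss = distillation_loss(teacher, near_student)
print(f"far student loss:  {far_loss:.4f}")
print(f"near student loss: {near_loss:.4f}")
```

Training the student to shrink this loss pulls its behaviour toward the teacher's, which is why the quality of the result depends so heavily on the teacher model that was used.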
- Take the capex number with a pinch of salt
Related to the above, the capex numbers referred to are just comparing apples to oranges. The US$5 million cited relates to just one training run, ignoring any prior training runs and the training of the larger teacher models, whether at DeepSeek or the third-party open source LLMs they were built on.
- Open source innovation
As AI luminary Yann LeCun has noted, this is a victory for the open source model of driving community innovation, with DeepSeek leveraging Meta’s Llama and Alibaba’s Qwen open source models. Again, this is positive for the longer-term development of AI, driving and proliferating innovation. However, given the current state of geopolitics, one would probably expect greater US government scrutiny of other countries accessing state-of-the-art LLMs from the US.
- LLMs commoditising?
It has long been our belief that monetising LLMs in the longer term will be challenging given the volume of competition, including from open source developers and competitors looking to monetise in alternative ways. The DeepSeek announcement only brings greater scrutiny to the return on investment (ROI) of the huge capex that developers of general-purpose foundational models are spending.
Investment implications
The concerns around DeepSeek play into the growing debate on AI scaling challenges as well as the ROI of AI capex spend and, ultimately, concerns around the sustainability of AI capex beneficiary earnings and the prices the market is willing to pay. We continue to expect ongoing strong spending on AI capex, as seen recently in announcements by Meta and the Stargate AI project. But we also think we need to be more selective among those AI capex beneficiaries, and to think about the next phases of AI investment opportunity as this new tech wave develops.
We characterise infrastructure as the first phase of a new wave followed by platforms and then the software, applications and services. We are approaching that pivot to the platform phase led by the cloud but still see longer-term investment opportunities in AI infrastructure as well. The market has rapidly shifted from concerns on AI capex being too high, to now worrying that AI capex is going to collapse. Both cannot happen simultaneously, and the truth likely lies in between. Ultimately, we think these developments are positive for the long-term health and development of AI. We continue to identify selective AI infrastructure beneficiaries and build our exposure to platforms that will benefit from more efficient AI compute, training models and inferencing.
Source for DeepSeek information: https://api-docs.deepseek.com/news/news250120
AI token: the smallest unit of data used by a language model to process and generate text.
Capex/capital expenditure: company spending to acquire or upgrade physical assets such as buildings, machinery, equipment, technology etc. to maintain or improve operations and foster future growth.
GPT or Generative Pre-trained Transformers: a family of neural network models that use the transformer architecture, which power generative AI applications such as ChatGPT.
GPU: a graphics processing unit performs the complex mathematical and geometric calculations that are necessary for graphics rendering, and is also used in gaming, content creation and machine learning.
Inference or inferencing: refers to artificial intelligence processing. Whereas machine learning and deep learning refer to training neural networks, AI inference applies knowledge from a trained neural network model and uses it to infer a result.
Hyperscalers: companies that provide infrastructure for cloud, networking, and internet services at scale. Examples include Google Cloud, Microsoft Azure, Facebook Infrastructure, Alibaba Cloud, and Amazon Web Services.
LLM (large language model): a specialised type of artificial intelligence that has been trained on vast amounts of text to understand existing content and generate original content.
MoE (Mixture of Experts Model): a machine learning approach that divides an AI model into separate sub-networks/experts to jointly perform a task. This enables significant cost reduction and faster performance for inferencing because specific experts are used for a task, instead of activating the entire neural network for every task.
Open source software: code that is designed to be publicly accessible, in terms of viewing, modifying and distributing.
Reinforcement Learning (RL): a technique where the AI learns by interacting with its environment and receiving feedback in the form of rewards or penalties. This allows the AI to adapt and evolve, as well as improve its logical and problem-solving skills.
ROI (return on investment): a financial ratio used to measure the performance of an investment, calculated by dividing net profit/loss by the initial cost of the investment.
These are the views of the author at the time of publication and may differ from the views of other individuals/teams at Janus Henderson Investors. References made to individual securities do not constitute a recommendation to buy, sell or hold any security, investment strategy or market sector, and should not be assumed to be profitable. Janus Henderson Investors, its affiliated advisor, or its employees, may have a position in the securities mentioned.
Past performance does not predict future returns. The value of an investment and the income from it can fall as well as rise and you may not get back the amount originally invested.
The information in this article does not qualify as an investment recommendation.
There is no guarantee that past trends will continue, or forecasts will be realised.
Marketing Communication.
Important information
Please read the following important information regarding funds related to this article.
- Shares/Units can lose value rapidly, and typically involve higher risks than bonds or money market instruments. The value of your investment may fall as a result.
- Shares of small and mid-size companies can be more volatile than shares of larger companies, and at times it may be difficult to value or to sell shares at desired times and prices, increasing the risk of losses.
- If a Fund has a high exposure to a particular country or geographical region it carries a higher level of risk than a Fund which is more broadly diversified.
- The Fund is focused towards particular industries or investment themes and may be heavily impacted by factors such as changes in government regulation, increased price competition, technological advancements and other adverse events.
- The Fund follows a sustainable investment approach, which may cause it to be overweight and/or underweight in certain sectors and thus perform differently than funds that have a similar objective but which do not integrate sustainable investment criteria when selecting securities.
- The Fund may use derivatives with the aim of reducing risk or managing the portfolio more efficiently. However this introduces other risks, in particular, that a derivative counterparty may not meet its contractual obligations.
- If the Fund holds assets in currencies other than the base currency of the Fund, or you invest in a share/unit class of a different currency to the Fund (unless hedged, i.e. mitigated by taking an offsetting position in a related security), the value of your investment may be impacted by changes in exchange rates.
- When the Fund, or a share/unit class, seeks to mitigate exchange rate movements of a currency relative to the base currency (hedge), the hedging strategy itself may positively or negatively impact the value of the Fund due to differences in short-term interest rates between the currencies.
- Securities within the Fund could become hard to value or to sell at a desired time and price, especially in extreme market conditions when asset prices may be falling, increasing the risk of investment losses.
- The Fund could lose money if a counterparty with which the Fund trades becomes unwilling or unable to meet its obligations, or as a result of failure or delay in operational processes or the failure of a third party provider.
Specific risks
- This Fund may have a particularly concentrated portfolio relative to its investment universe or other funds in its sector. An adverse event impacting even a small number of holdings could create significant volatility or losses for the Fund.
