Why DeepSeek Tanked U.S. Tech Stocks



Introduction: The Impact of DeepSeek's R1 Release

On Monday, January 20, 2025, the AI industry witnessed a seismic shift with the release of DeepSeek's new open-source model, R1. It took a week, but US tech stocks are now repricing in light of what is being reported as a major breakthrough in performance and cost-efficiency. This Chinese company's latest large language model (LLM) has reportedly matched or outperformed OpenAI's leading model across multiple benchmarks, achieving this at a fraction of the cost. This revelation has sent shockwaves through Silicon Valley, causing a significant drop in US tech stocks as investors reassess the massive investments in data centers, particularly those utilizing Nvidia GPUs.


DeepSeek's Cost-Efficiency Claims: A Game-Changing Approach

All of this information is currently being vetted by industry experts for accuracy, but DeepSeek's public disclosure highlights that they employed advanced software optimization techniques to enhance hardware efficiency, slashing costs by 93%. If these claims hold true, it suggests that many tech companies may be overspending on hardware, and that innovative engineering approaches can yield competitive results without the need for extensive resources. As a result, the market is currently adjusting its expectations regarding future energy demand, compute costs, and inference costs.


Benchmarking Success: DeepSeek's R1 Training Process

The most striking claim from DeepSeek is that their entire training process was completed for just $5.576 million, utilizing roughly 2.8 million GPU hours on NVIDIA H800 GPUs, an export-compliant, bandwidth-limited variant of the H100. Remarkably, they did not use the more advanced H100 or Blackwell GPUs, which US companies have been investing heavily in to enhance training speed and model performance. This achievement sets a new benchmark for scaling LLMs efficiently, potentially narrowing the performance gap between open-source and proprietary models. If DeepSeek's claims hold, this could disrupt the strategies of tech giants that have relied on significant capital investments to maintain their edge.
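The reported figures are internally consistent: a quick sanity check shows that the headline $5.576 million corresponds to roughly 2.788 million GPU hours at about $2 per GPU-hour, a rate back-solved here from the reported totals rather than independently verified.

```python
# Back-of-the-envelope check of DeepSeek's reported training cost.
# The $2/GPU-hour rental rate is an assumption implied by the totals.
gpu_hours = 2.788e6      # reported H800 GPU hours (~2.8 million)
rate_per_hour = 2.00     # assumed $/GPU-hour

total_cost = gpu_hours * rate_per_hour
print(f"Implied training cost: ${total_cost:,.0f}")  # ≈ $5,576,000
```

For comparison, frontier training runs at US labs have been widely estimated in the hundreds of millions of dollars, which is why a figure of this magnitude, if accurate, is so disruptive.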


Potential Long-Term Implications for Cloud Service Providers

A valid consideration for investors is the potential long-term impact of more efficient compute and lower training costs on cloud service providers. With advancements like DeepSeek's, there is the possibility that AI workloads could increasingly be deployed locally at the edge, which might shift some demand away from cloud-based inference. While this represents a long-term risk, companies like Google, Amazon, and Microsoft are well-positioned to adapt to these trends through their hybrid cloud and edge solutions. Any perceived "cracks" in their dominance today are far from material, and it is premature to suggest they will lead to significant loss of market share. To stay competitive, leading hyperscalers and semiconductor companies will likely experiment with similar architectures, training frameworks, and hardware optimizations as DeepSeek, which could bolster their capabilities.


Accusations and Skepticism: The Accuracy of DeepSeek's Claims

Amidst the public discourse surrounding this latest development, accusations have surfaced regarding the accuracy of DeepSeek's reported costs. Some notable technology figures, including Elon Musk, speculate that DeepSeek may have underreported the number and type of GPUs used to avoid revealing an ability to circumvent US sanctions and export restrictions on high-end Nvidia GPUs. While this remains unverified, in the near term it is likely that most executives and decision-makers at tech firms will pause, test, and validate DeepSeek's claims before making significant capital allocations or strategic pivots. If DeepSeek's achievements with mid-tier GPUs are verified, it could shift market demand towards these more affordable options, putting pressure on high-end GPU pricing.


The Role of Software Innovation in DeepSeek's Success

A critical yet often overlooked aspect of DeepSeek's success is the role of software innovation. Techniques such as multi-head latent attention (MLA), load balancing, the DualPipe pipeline-parallelism algorithm, FP8 mixed-precision training, and multi-token prediction have significantly enhanced resource efficiency by better utilizing hardware and addressing traditional bottlenecks in distributed AI training. This opens up opportunities for competitors who face barriers to entry due to high capital expenditures.
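To give a flavor of why reduced precision saves so much, the toy sketch below quantizes 32-bit weights to a signed 8-bit representation with a per-tensor scale. This is a simplified stand-in, not DeepSeek's method: true FP8 training uses an 8-bit floating-point format, whereas this sketch uses int8, but the storage arithmetic is the same.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 using a per-tensor scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

weights = np.random.randn(1024).astype(np.float32)
q, scale = quantize_int8(weights)

# 8-bit storage is 4x smaller than FP32 (1,024 vs 4,096 bytes here)...
print(q.nbytes, weights.nbytes)
# ...at the cost of a small, bounded reconstruction error.
err = np.abs(dequantize(q, scale) - weights).max()
print(f"max abs error: {err:.4f}")
```

Halving or quartering the bytes per parameter reduces memory traffic and lets the same GPUs move more useful work per second, which is one mechanism by which software choices translate directly into lower hardware bills.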


The Future of AI: Decentralization and Democratization

On a positive note, over the longer term, if DeepSeek's findings are confirmed, we could see the Jevons paradox come into play, where greater efficiencies eventually lead to even higher demand for AI services as the cost of these services comes down. This could lead to a more decentralized computing landscape, putting pressure on hyperscalers and advancing the democratization of AI. Ultimately, this race to the bottom in terms of cost could benefit consumers, driving down the price of AI technologies.
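The Jevons dynamic is easy to illustrate with invented numbers: if efficiency gains cut the unit cost of inference and demand is sufficiently price-elastic, total spending on compute can rise even as each query gets cheaper. Every figure below is hypothetical, chosen only to make the mechanism concrete.

```python
# Hypothetical illustration of the Jevons paradox for AI compute.
# All numbers are invented for illustration, not market estimates.
old_cost_per_query = 0.010    # $ per query before efficiency gains
new_cost_per_query = 0.001    # 90% cheaper after efficiency gains
old_queries = 1_000_000       # daily demand at the old price
new_queries = 15_000_000      # assumed 15x demand growth at the lower price

old_spend = old_cost_per_query * old_queries   # $10,000/day
new_spend = new_cost_per_query * new_queries   # $15,000/day
print(old_spend, new_spend)  # total spend rises despite cheaper queries
```

Whether real-world demand is elastic enough for this to play out is exactly the question the market is now trying to price.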


Financial Sense's Outlook: Cautious Optimism

At Financial Sense, we remain cautious in the near term but optimistic about the long-term trajectory of the AI landscape. We do not believe the "AI bubble" has burst, but rather that its growth has potentially slowed. As we navigate these developments, we will diligently monitor the industry, ensuring our clients remain well-informed and ready for the dynamic shifts in the AI landscape.


Advisory services offered through Financial Sense® Advisors, Inc., a registered investment adviser. Securities offered through Financial Sense® Securities, Inc., Member FINRA/SIPC. DBA ...
