Why DeepSeek Is Bullish For The World


“There are decades when nothing happens, and weeks when decades happen,” says a quote from Vladimir Lenin, who may have copied it from someone else. Regardless of origin, it is true and will be even more true as technology and even government change faster. We may have just lived through such a week for the financial markets, and maybe for the whole global economy.

I’m talking, of course, about the Chinese-made DeepSeek artificial intelligence model, which emerged last week and seems to deliver results comparable to the far more expensive systems US companies are working on. Everyone is still debating the implications. While it may be bad news for US tech companies that have been throwing hundreds of billions of dollars at ever more expensive AI systems, it is unambiguously very good for the world.

Historical analogies nearly always fail, but here’s one that may at least rhyme: On March 14, 2000, then-President Bill Clinton and then-PM Tony Blair of the UK said research into the human genome’s sequence should be made freely available to all. Biotechnology stocks immediately plunged. Other segments held up for a while, but the great Nasdaq bull market arguably ended that day.

What was the trigger? The biotech industry of that time had planned to profit by using proprietary genetic data to make expensive drugs. The data would be the “wide moat” protecting them from competition. When Clinton and Blair signaled this might not be so easy, those stocks lost a lot of their appeal, as did stocks in general a few months later. Valuation matters.

Similarly, today’s US tech giants are all-in on artificial intelligence. Their wide moat is (or was) that these systems require highly advanced and very expensive microchips, which the US government had helpfully decreed ineligible for export to China.

This produced a rosy outlook: US companies like Nvidia (NVDA) would profit from making those chips, which other US companies would buy and use to develop AI applications in vast new US data centers. The main worry was getting enough electricity to power it all. China, it was thought, would be hobbled for lack of the necessary chips, and thus present little competition.

DeepSeek just poked a massive hole in that happy narrative. More broadly, it tells us something important about China and where the world economy is going.

My best guess is all this will be bad news for the US stock market but good news—and possibly great news—for humanity. I’ll explain why in a minute. But first, we need to understand what just happened.


“An Unambiguous Innovation”

When the topic involves China and technology, one of my go-to sources is Gavekal Research. They know how the Chinese economy works and are really good at explaining it in ways Westerners can understand.

Louis Gave knew DeepSeek was a game-changer as soon as he saw it. Last Sunday night, as US investors were just starting to see the news, he posted a report calling this “Another Sputnik Moment.”

(Explanation for the youngsters: The USSR’s 1957 launch of Sputnik, the first Earth satellite, wasn’t just a scientific achievement. Terrified Americans suddenly knew the Soviet Union was flying right over their heads. This marked the moment when the Cold War became real. Louis Gave isn’t prone to overstatement, so his use of that term made me sit up in my chair.)

Gavekal’s Beijing-based tech analyst Tilly Zhang followed up with a deeper explanation. This is a good, short summary of the hoopla, so I’ll just quote Tilly verbatim.

“DeepSeek stunned the AI industry with the release of two new models: V3 in December 2024 and R1 in January 2025. These perform roughly as well as OpenAI’s leading models on tasks such as math and coding. But the most striking aspect is the cost: DeepSeek said training the V3 model required 2,048 of Nvidia’s H800 chips—a downgraded last-generation chip designed to comply with US export controls on China—at a cost of US$5.6mn. By contrast, OpenAI has said it spent more than US$100mn to train ChatGPT-4.

“DeepSeek’s achievements may not be quite as impressive as headlines imply. For starters, the new models have shortcomings. One of the key reasons they perform well despite limited access to advanced chips is due to their ‘Mixture of Experts’ approach. This means available computing power is concentrated on a few ‘expert’ tasks, while less-critical tasks may be undertrained. The models thus excel in certain areas, but their overall performance is less consistent than some rivals. For instance, one Chinese AI expert has noted that one of the models performs well on math and coding tests, but correctly answered only about half of some other classic AI test questions. In short, the models are specialists adapted to become generalists.

“The US$5.6mn price tag should also not be taken too literally. One reason DeepSeek’s costs are low is that the company’s offerings have so far focused only on text-based large-language models, while some US rivals offer multimodal models that can handle images and videos, making direct cost comparisons misleading. Another is that the company can piggyback on the costly earlier advances made and lessons learned by US AI firms. Finally, the much-cited cost figure doesn’t account for prior research and development spending. DeepSeek’s parent company reportedly had an initial research and development budget of around RMB3bn, as well as a stockpile of about 10,000 of Nvidia’s advanced A100 chips, meaning the actual cost of development was almost certainly much higher.

“Nonetheless, the company achieved an unambiguous innovation in software architecture that allowed it to deliver strong performance on many tasks at a low cost. That reflects a broader strategy among Chinese technology firms in response to US export controls: using software to get more out of less-advanced hardware. A 2023 review of Tencent’s Hunyuan AI model by the Berkeley AI Research Lab, for instance, concluded that ‘[s]oftware advancements are making old hardware increasingly useful.’”

How did they do this with a lesser version of Nvidia’s GPU? Limited access to the hardware, along with limited capital, forced the developers to be extraordinarily creative. It is highly likely that the results cost a great deal more than $5.6 million, but even 10 times that figure would still be extraordinarily cheap.
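To put the "even 10 times more" point in numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the two training-cost figures quoted above. The multipliers are hypothetical, chosen purely for illustration, and the OpenAI figure is a lower bound ("more than US$100mn"), so the real gap could be wider.

```python
# Training-cost figures as quoted above, in US dollars.
DEEPSEEK_V3_REPORTED = 5.6e6  # DeepSeek's stated cost for training V3
OPENAI_GPT4_FLOOR = 100e6     # OpenAI said "more than" this, so treat it as a floor


def cost_advantage(multiplier: float = 1.0) -> float:
    """Return how many times cheaper DeepSeek would be than the OpenAI floor
    if its true cost were `multiplier` times the reported figure."""
    return OPENAI_GPT4_FLOOR / (DEEPSEEK_V3_REPORTED * multiplier)


if __name__ == "__main__":
    # Hypothetical multipliers: 1 = take the reported number at face value.
    for multiplier in (1, 5, 10):
        assumed_cost = DEEPSEEK_V3_REPORTED * multiplier
        print(
            f"assumed DeepSeek cost ${assumed_cost / 1e6:.1f}mn -> "
            f"{cost_advantage(multiplier):.1f}x cheaper than the OpenAI floor"
        )
```

Even at the 10x multiplier, the assumed cost stays below the OpenAI floor, which is the point of the paragraph above.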

The twist you need to understand is that DeepSeek isn’t exactly better than OpenAI and other US-developed systems. But it seems to come close, and at a fraction of the cost, thanks to clever software design. Let’s look at that cost. Newsletter writer Ed Zitron had an insightful take called Deep Impact.

“DeepSeek's models—V3 and R1—are more efficient (and as a result cheaper to run), and can be accessed via its API at prices that are astronomically cheaper than OpenAI's. DeepSeek-Chat—running DeepSeek's GPT-4o competitive V3 model—costs $0.07 per 1 million input tokens (as in commands given to the model) and $1.10 per 1 million output tokens (as in the resulting output from the model), a dramatic price drop from the $2.50 per 1 million input tokens and $10 per 1 million output tokens that OpenAI charges for GPT-4o. DeepSeek-Reasoner—its ‘reasoning’ model—costs $0.55 per 1 million input tokens, and $2.19 per 1 million output tokens compared to OpenAI's o1 model, which costs $15 per 1 million input tokens and $60 per 1 million output tokens.

“Now, there's a very obvious ‘but’ here. We do not know where DeepSeek is hosting its models, who has access to that data, or where that data is coming from or going. We don't even know who funds DeepSeek, other than that it’s connected to High-Flyer, the hedge fund that it split from in 2023.”
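For readers who want to see what those per-token prices mean for an actual bill, here is a minimal Python sketch. The prices are the ones Zitron quotes above. The model labels and the sample workload (10 million input tokens, 2 million output tokens) are my own assumptions for illustration, not anything from DeepSeek or OpenAI.

```python
# Prices per 1 million tokens (input, output) in US dollars, as quoted above.
# The dictionary keys are illustrative labels, not official API identifiers.
PRICES_PER_MILLION = {
    "deepseek-chat": (0.07, 1.10),
    "gpt-4o": (2.50, 10.00),
    "deepseek-reasoner": (0.55, 2.19),
    "o1": (15.00, 60.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one workload at the quoted per-million-token prices."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload, chosen only to make the comparison concrete.
    input_tokens, output_tokens = 10_000_000, 2_000_000
    for cheap, pricey in (("deepseek-chat", "gpt-4o"), ("deepseek-reasoner", "o1")):
        low = job_cost(cheap, input_tokens, output_tokens)
        high = job_cost(pricey, input_tokens, output_tokens)
        print(f"{cheap}: ${low:.2f}  vs  {pricey}: ${high:.2f}  ({high / low:.1f}x)")
```

On this made-up workload the gap works out to roughly 15 times for the chat models and roughly 27 times for the reasoning models. A different mix of input and output tokens would shift those ratios, since the discount is steeper on input than on output.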

Conventional wisdom had been that Chinese companies excel in hardware but don’t have Silicon Valley’s programming talent. That’s why people assumed the export ban on Nvidia’s most advanced chips would prevent China from catching up. In reality, China has been excelling in software for a long time, and it is training more than 10 times as many software engineers as the US. If DeepSeek (and presumably others) are finding ways around this barrier, it’s a true game-changer.


11-Foot Ladders

Suspicions abound that DeepSeek is really a Chinese government operation. Louis Gave thinks DeepSeek is what it appears to be: a quant hedge fund using its own resources to build a technology platform it thought would be useful. In China, government-funded projects go through universities or the large companies over which the government has deeper control.