DeepSeek AI Is Free, The Comparable Version Of ChatGPT Costs $200 A Month

Wall Street is stunned, and rightfully so.

DeepSeek’s AI Model Is the Top-Rated App in the U.S.

Scientific American explains Why DeepSeek’s AI Model Just Became the Top-Rated App in the U.S.

DeepSeek’s artificial intelligence assistant made big waves Monday, becoming the top-rated app in the Apple Store and sending tech stocks into a downward tumble. What’s all the fuss about?

The Chinese start-up, DeepSeek, surprised the tech industry with a new model that rivals the abilities of OpenAI’s most recent model—with far less investment and using reduced-capacity chips. The U.S. bans exports of state-of-the-art computer chips to China and limits sales of chipmaking equipment. DeepSeek, based in the eastern Chinese city of Hangzhou, reportedly had a stockpile of high-performance Nvidia A100 chips from times prior to the ban—so its engineers could have used those to develop the model. But in a key breakthrough, the start-up says it instead used much lower-powered Nvidia H800 chips to train the new model, dubbed DeepSeek-R1.

On common AI tests in mathematics and coding, DeepSeek-R1 matched the scores of OpenAI’s o1 model, according to VentureBeat.

DeepSeek-R1 is free for users to download, while the comparable version of ChatGPT costs $200 a month.

Because it requires less computational power, the cost of running DeepSeek-R1 is a tenth of the cost of similar competitors, says Hancheng Cao, an incoming assistant professor in Information Systems and Operations Management at Emory University. “For academic researchers or start-ups, this difference in the cost really means a lot,” Cao says.

DeepSeek achieved its efficiency in several ways, says Anil Ananthaswamy, author of Why Machines Learn: The Elegant Math Behind Modern AI. The model has 670 billion parameters, or variables it learns from during training, making it the largest open-source large language model yet, Ananthaswamy explains. But the model uses an architecture called “mixture of experts” so that only a relevant fraction of these parameters—tens of billions instead of hundreds of billions—are activated for any given query. This cuts down on computing costs. The DeepSeek LLM also uses a method called multi-head latent attention to boost the efficiency of its inferences; and instead of predicting an answer word-by-word, it generates multiple words at once.
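To make the “mixture of experts” idea concrete, here is a minimal sketch of top-k routing. It is an illustration only, not DeepSeek’s actual code: the function names, the eight experts, the gate scores, and the choice of k=2 are all assumptions made for the example.

```python
import math

def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    peak = max(gate_scores[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(gate_scores[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: exps[i] / total for i in top}

def active_fraction(num_experts, k):
    """Share of expert parameters used per token, assuming equal-size experts."""
    return k / num_experts

# Hypothetical router output for one token across 8 experts (made-up numbers).
scores = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.9]
weights = route_top_k(scores, k=2)

print("Experts used:", sorted(weights))
print("Fraction of expert parameters active:", active_fraction(len(scores), 2))
```

Only the selected experts do any work for that token, which is why a model with hundreds of billions of parameters can run a query while activating only tens of billions of them.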

Another important aspect of DeepSeek-R1 is that the company has made the code behind the product open-source, Ananthaswamy says. (The training data remains proprietary.) This means that the company’s claims can be checked. If the model is as computationally efficient as DeepSeek claims, he says, it will probably open up new avenues for researchers who use AI in their work to do so more quickly and cheaply. It will also enable more research into the inner workings of LLMs themselves.

“One of the big things has been this divide that has opened up between academia and industry because academia has been unable to work with these really large models or do research in any meaningful way,” Ananthaswamy says. “But something like this, it’s within the reach of academia now, because you have the code.”

DeepSeek Stuns Wall Street

The Wall Street Journal reports DeepSeek Stuns Wall Street With Capability and Cost

Who saw that coming? Not Wall Street, which sold off tech stocks on Monday after the weekend news that a highly sophisticated Chinese AI model, DeepSeek, rivals Big Tech-built systems but cost a fraction to develop. The implications are likely to be far-reaching, and not merely in equities.

Enter DeepSeek, which last week released a new R1 model that claims to be as advanced as OpenAI’s on math, code and reasoning tasks. Tech gurus who inspected the model agreed. One economist asked R1 how much Donald Trump’s proposed 25% tariffs will affect Canada’s GDP, and it spit back an answer close to that of a major bank’s estimate in 12 seconds. Along with the detailed steps R1 used to get to the answer.

More startling, DeepSeek required far fewer chips to train than other advanced AI models and thus cost only an estimated $5.6 million to develop. Other advanced models cost in the neighborhood of $1 billion. Venture capitalist Marc Andreessen called it “AI’s Sputnik moment,” and he may be right.

DeepSeek is challenging assumptions about the computing power and spending needed for AI advances. OpenAI, Oracle and SoftBank last week made headlines when they announced a joint venture, Stargate, to invest up to $500 billion in building out AI infrastructure. Microsoft plans to spend $80 billion on AI data centers this year.

CEO Mark Zuckerberg on Friday said Meta would spend about $65 billion on AI projects this year and build a data center “so large that it would cover a significant part of Manhattan.” Meta expects to have 1.3 million advanced chips by the end of this year. DeepSeek’s model reportedly required as few as 10,000 to develop.

DeepSeek’s breakthrough means these tech giants may not have to spend as much to train their AI models. But it also means these firms, notably Google’s DeepMind, might lose their first-mover, technological edge. 

DeepSeek is vindicating President Trump’s decision to rescind a Biden executive order that gave government far too much control over AI. Companies developing AI models that pose a “serious risk” to national security, economic security, or public health and safety would have had to notify regulators when training their models and share the results of “red-team safety tests.”

DeepSeek should also cause Republicans in Washington to rethink their antitrust obsessions with big tech. Bureaucrats aren’t capable of overseeing thousands of AI models, and more regulation would slow innovation and make it harder for U.S. companies to compete with China. As DeepSeek shows, it’s possible for a David to compete with the Goliaths. Let a thousand American AI flowers bloom.

Ignoring AI’s Potential Is Ignorant

Nate Silver says It’s Time to Come to Grips with AI

Ignoring AI’s potential is, well, ignorant

For the real leaders of the left, the issue simply isn’t on the radar. Bernie Sanders has only tweeted about “AI” once in passing, and AOC’s concerns have been limited to one tweet about “deepfakes.”

Meanwhile, the vibe from lefty public intellectuals has been smug dismissiveness. Take this seven-word tweet from Ken Klippenstein, a left-leaning journalist formerly of The Intercept who now writes a popular Substack.

I’m sorry, but this is ignorant. Large language models like ChatGPT are, by some measures, the most rapidly adopted technology in human history. Klippenstein’s tweet is equivalent to, in the 1990s, dismissing the Internet as a “pornography and hacking machine.” Yes, these are common use cases, but they’re the tip of a massive iceberg.

It’s not just that AIs can now solve Math Olympiad problems. LLMs also provide a lot of “mundane utility,” from serving as computer programmers to research assistants to all-around problem-solving tools. I’d estimate that using LLMs and other AI tools improves my productivity by perhaps 5 percent on a day-to-day basis. It’s not yet a true “game changer,” but more and more, they provide reliable marginal value, from debugging Stata code to vetting technical concepts to serving as a copy editor or a creative muse.

Impressive Math

The New York Times says Move Over, Mathematicians, Here Comes AlphaProof

In January [2024], a Google DeepMind system named AlphaGeometry solved a sampling of Olympiad geometry problems at nearly the level of a human gold medalist. “AlphaGeometry 2 has now surpassed the gold medalists in solving I.M.O. problems,” Thang Luong, the principal investigator, said in an email.

The lab’s strike at this year’s Olympiad deployed the improved version of AlphaGeometry. Not surprisingly, the model fared rather well on the geometry problem, polishing it off in 19 seconds.

I cannot fathom solving that problem ever, let alone in 19 seconds. And I had three semesters of calculus, plus differential equations, and advanced statistics (all of which I admit I have long forgotten).

Development Costs

Development of DeepSeek reportedly cost $5.6 million vs US costs estimated in the neighborhood of $1 billion.

DeepSeek’s model reportedly required as few as 2,000 Nvidia chips to develop (some estimates put it at 10,000 chips) vs Meta’s expectation to need 1.3 million advanced chips by the end of this year.
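For a sense of scale, here is a back-of-the-envelope sketch of the ratios implied by the reported figures above. The inputs are the article’s reported numbers, not verified ones, and the comparison is rough: Meta’s 1.3 million chips is a company-wide total, not the count used to train a single model. The helper name is made up for the example.

```python
def ratio(larger, smaller):
    """How many times bigger the first figure is than the second."""
    return larger / smaller

# Reported figures from the text above (unverified estimates).
us_model_cost = 1_000_000_000   # "in the neighborhood of $1 billion"
deepseek_cost = 5_600_000       # reported $5.6 million
meta_chips = 1_300_000          # Meta's expected chip count by year end
deepseek_chips_low = 2_000      # low estimate
deepseek_chips_high = 10_000    # high estimate

cost_ratio = ratio(us_model_cost, deepseek_cost)
chips_low = ratio(meta_chips, deepseek_chips_low)
chips_high = ratio(meta_chips, deepseek_chips_high)

print(f"Cost: about {cost_ratio:.0f}x")
print(f"Chips: {chips_high:.0f}x to {chips_low:.0f}x")
```

On those numbers, the reported cost gap is roughly 179 to 1, and the chip gap is somewhere between 130 to 1 and 650 to 1.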

I don’t doubt the DeepSeek numbers, because its access to Nvidia’s advanced chips came through rented data centers with restricted access, chips that China was not supposed to have at all.

We do not know how much China really spent, but we sure do know Biden’s export sanctions on technology failed in spectacular fashion.

How Much Spending Is Really Needed?

Trump secured pledges to spend $500 billion on data centers. Meta alone is planning to spend more than $60 billion.

Technology investor Marc Andreessen called DeepSeek’s AI model “one of the most amazing and impressive breakthroughs I’ve ever seen” and “a profound gift to the world” in a post on X.

Export Restrictions

I discussed export restrictions yesterday in China’s DeepSeek AI Raises Doubts Over U.S. Tech Dominance and Export Curbs

Another Sanction Failure

Biden placed numerous export bans on chip technology to prevent this from happening.

Instead, DeepSeek said in a late-December report that it used a cluster of more than 2,000 Nvidia chips to train its AI.

No one should be surprised by this.

To Those Hard of Learning, Here’s a Repeat Lesson on Why Sanctions Fail

On September 26, 2024, I commented To Those Hard of Learning, Here’s a Repeat Lesson on Why Sanctions Fail

Let’s discuss a claim that sanction failures are due to a lack of political will.

Robin Brooks on X: “When someone tells you that sanctions can’t and won’t work, that’s basically pro-Russian propaganda. Are we seriously to believe that nothing can be done to stop the shameful flood of transshipments to Russia via Central Asia? Come on. This is just about a lack of political will.”

I am pretty sure that “someone” is me because we have gone round and round on this.

When someone tells you that sanctions do work, ask them for evidence.

The above post was on oil-related sanctions. The next article is on how and why chip sanctions failed.

On August 26, 2024, I commented China Gains Secret Access to Nvidia Microchips by Renting Computers

The US has blocked export of Nvidia chips to China. But where there’s profit, there’s a way.

Know Your Customer’s Customer’s Customer

China sets up an AI company in Singapore. AI developers buy cloud time through a subsidiary that further masks the operation by paying in Bitcoin.

In turn, the subsidiary buys time from a company in Dubai or Singapore that hosts the servers.

US politicians are outraged. But some of us are amused knowing full well that sanctions don’t work. So instead of cloud profits going to US corporations, the profits go to Saudi Arabia, Singapore, Dubai, and South Korea.

Only Amazon is forced to “know your customer”.

Musk Trashes Trump’s Pet AI Project

Regarding capital expenditures, note that Musk Trashes Trump’s Pet AI Project, causing a feud with Trump’s staff.

Musk, who owns his own AI startup, was not at Trump’s unveiling of “Stargate,” an effort to supercharge the country’s AI infrastructure featuring the tech giants OpenAI, Softbank and Oracle. Musk, who co-founded OpenAI, has long been critical of its CEO Sam Altman and spent much of Wednesday trolling him online. “They don’t actually have the money,” he said. Softbank, meanwhile, “has well under $10B secured. I have that on good authority.”

Musk even reposted a joke that suggested Altman and his team smoked crack “to come up with their $500 billion number for Stargate.”

How Good is DeepSeek?

The answer is we don’t really know, but we do know that the free downloadable version is very good, perhaps as good as or better than the version of ChatGPT that costs $200 a month (well, not for long)!

After years of assuming the US was far ahead of the rest of the world on AI, along comes a Sputnik realization that perhaps the US is really behind, despite China having to use rented computers and/or lower-grade chips to develop its AI.

By the way, what does this say about US military intelligence (other than that it’s likely in a huge tizzy right now)?

