By Emily Bary
The Chinese AI service has Wall Street worried that developing AI models will be cheaper than expected. But as chip stocks sink, some analysts see a silver lining.
What if companies don’t need to spend nearly as much as expected to develop artificial-intelligence models?
That’s the big question on the minds of investors Monday, given the newfound attention on DeepSeek, a Chinese AI app that has climbed to the top of the download charts on Apple’s U.S. App Store. The service has become so popular that it’s restricting registration due to what it called “large-scale malicious attacks.”
Investors are concerned, though, because the company reportedly was able to build a model that functions like OpenAI’s ChatGPT without spending to the same degree. Wall Street is nervous about what DeepSeek’s success means for companies like Nvidia Corp. (NVDA), Broadcom Inc. (AVGO), Marvell Technology Inc. (MRVL) and others that have seen their stocks run up on expectations their businesses would benefit from lofty AI-fueled capital-expenditure budgets in the years to come.
Those stocks are each down more than 17% in afternoon trading Monday. The Nasdaq Composite Index (COMP) is off 3.4%.
“If DeepSeek’s innovations are adopted broadly, an argument can be made that model training costs could come down significantly even at U.S. hyperscalers, potentially raising questions about the need for 1-million XPU/GPU clusters as projected by some,” Raymond James analyst Srini Pajjuri wrote in a note to clients over the weekend.
In a post titled “The Short Case for Nvidia Stock,” former quant investor and current Web3 entrepreneur Jeffrey Emanuel said DeepSeek’s success “suggests the entire industry has been massively over-provisioning compute resources.”
He added that “markets eventually find a way around artificial bottlenecks that generate super-normal profits,” meaning that Nvidia may face “a much rockier path to maintaining its current growth trajectory and margins than its valuation implies.”
Nvidia, for its part, called DeepSeek “an excellent AI advancement,” while saying it represented “a perfect example of test-time scaling,” which means more computation is done during the inferencing phase.
“DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely available models and compute that is fully export-control compliant,” an Nvidia spokesperson said. “Inference requires significant numbers of Nvidia GPUs and high-performance networking.”
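For readers unfamiliar with the term, test-time scaling means spending extra compute when the model answers, not when it’s trained. One common form is sampling several candidate answers and keeping the best one. The toy sketch below illustrates that idea only; the functions are hypothetical stand-ins and nothing here is Nvidia’s or DeepSeek’s actual code.

```python
# Illustrative sketch of test-time scaling via best-of-n sampling:
# spend more inference compute by generating several candidates and
# keeping the highest-scoring one. All functions are toy stand-ins.
import random

random.seed(42)

def generate_candidate() -> float:
    # Stand-in for one expensive model inference pass.
    return random.gauss(0.0, 1.0)

def score(answer: float) -> float:
    # Stand-in for a verifier judging a candidate (higher is better).
    return -abs(answer - 0.7)

def best_of_n(n: int) -> float:
    # More samples = more inference compute = usually a better answer.
    return max((generate_candidate() for _ in range(n)), key=score)

print(best_of_n(1))   # one pass: cheap but noisy
print(best_of_n(32))  # 32 passes: 32x the inference compute, typically nearer 0.7
```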
It’s also worth digging into the numbers that have Wall Street so worried. Specifically, there’s consternation about a paper that suggested DeepSeek’s creator spent $5.6 million to build the model. By contrast, large technology companies in the U.S. are shelling out tens of billions of dollars a year on capital expenditures and earmarking much of that for AI infrastructure.
Read more: Zuckerberg wants to spend billions more on AI. Meta investors like that idea.
The $5 million number, though, is highly misleading, according to Bernstein analyst Stacy Rasgon. “Did DeepSeek really ‘build OpenAI for $5M?’ Of course not,” he wrote in a note to clients over the weekend.
That number corresponds to DeepSeek-V3, a “mixture-of-experts” model that “through a number of optimizations and clever techniques can provide similar or better performance vs other large foundational models but requires a small fraction of the compute resources to train,” according to Rasgon.
See also: DeepSeek story is bad news for Nvidia and the microchip makers. It may be worse for these stocks.
But the $5 million figure “does not include all the other costs associated with prior research and experiments on architectures, algorithms, or data,” he continued, adding that this type of model is designed “to significantly reduce cost to train and run, given that only a portion of the parameter set is active at any one time.”
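To make that “only a portion of the parameter set is active” point concrete, here is a minimal, hypothetical sketch of mixture-of-experts routing. It is not DeepSeek’s code, and the sizes and names are invented for illustration: a router scores all experts, but only the top two are ever evaluated for a given input, so most of the layer’s weights sit idle on any single pass.

```python
# Toy mixture-of-experts layer: the router picks the top-k experts per
# input, so only a fraction of the total parameters do work on each pass.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8  # total expert networks in the layer
TOP_K = 2        # experts actually run per input
DIM = 16         # toy feature dimension

# Each "expert" is just a weight matrix in this sketch.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ router                     # router scores every expert...
    top = np.argsort(scores)[-TOP_K:]       # ...but only the top-k run
    w = np.exp(scores[top])
    w /= w.sum()                            # softmax over the chosen experts
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

y = moe_forward(rng.standard_normal(DIM))   # 2 of 8 experts touched
print(y.shape)  # (16,)
```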
Meanwhile, DeepSeek also has an R1 model that “seems to be causing most of the angst” given its comparisons to OpenAI’s o1 model, according to Rasgon. “DeepSeek’s R1 paper did not quantify the additional resources that were required to develop the R1 model (presumably they were substantial as well),” he wrote.
That said, he thinks it’s “absolutely true that DeepSeek’s pricing blows away anything from the competition, with the company pricing their models anywhere from 20-40x cheaper than equivalent models from OpenAI.”
But he doesn’t buy that this is a “doomsday” situation for semiconductor companies: “We are still going to need, and get, a lot of chips.”
Cantor Fitzgerald’s C.J. Muse also saw a silver lining. “Innovation is driving down cost of adoption and making AI ubiquitous,” he wrote. “We see this progress as positive in the need for more and more compute over time (not less).”
A few analysts made reference to the Jevons paradox, which says that efficiency gains can boost the consumption of a given resource. “Rather than lead to less consumption of accelerated hardware, we believe this Jevons Paradox dynamic should in fact lead to more consumption and proliferation of compute resources as more impactful use cases continue to be unlocked,” TD Cowen’s Joshua Buchalter wrote.
Microsoft Corp. (MSFT) Chief Executive Satya Nadella also referenced the term on X, the social-media platform formerly known as Twitter.
Raymond James’s Pajjuri also wasn’t panicking, writing that DeepSeek could “drive even more urgency among U.S. hyperscalers to leverage their key advantage (access to GPUs) to distance themselves from cheaper alternatives.”
Additionally, while the DeepSeek fears are centered on training costs, Pajjuri thinks investors should also think about inferencing. Training is the process of showing a model data that will teach it to draw conclusions, and inferencing is the process of putting that model to work based on new data.
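As a purely illustrative sketch of that distinction — a toy linear model, nothing specific to DeepSeek or OpenAI — training fits the weights to data once, while inference reuses the finished weights cheaply on new inputs:

```python
# Training vs. inference on a toy model (all values invented for illustration).
import numpy as np

rng = np.random.default_rng(1)

# Training: show the model data and adjust its weights to fit.
X = rng.standard_normal((100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.standard_normal(100)

w = np.zeros(3)
for _ in range(500):                        # gradient descent on squared error
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.1 * grad

# Inference: put the trained model to work on data it has never seen.
x_new = rng.standard_normal(3)
prediction = x_new @ w                      # one cheap pass, no weight updates
print(np.round(w, 2))                       # recovers roughly [ 2. -1.  0.5]
```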
Pajjuri argued that “as training costs decline, more AI use cases could emerge, driving significant growth in inferencing,” including for models like DeepSeek’s R1 and OpenAI’s o1.
Emanuel, though, wrote that DeepSeek is said to be “nearly 50x more compute efficient” than popular U.S. models on the training side, and perhaps even more so when it comes to inference.
-Emily Bary
This content was created by MarketWatch, which is operated by Dow Jones & Co. MarketWatch is published independently from Dow Jones Newswires and The Wall Street Journal.
(END) Dow Jones Newswires
01-27-25 1432ET
Copyright (c) 2025 Dow Jones & Company, Inc.