Correction: An earlier version of this post stated my TIME article was published last month. It was actually published earlier this month.
A new AI model, R1, from Hangzhou-based startup DeepSeek has catapulted China’s AI prowess into public consciousness after rivaling one of OpenAI’s top models in performance. It has stunned some, spooked investors, and shaved $1 trillion off U.S. tech stocks. But for those watching the space closely, R1’s performance was not totally surprising.
Before R1’s release, I reported for TIME that China has shrunk the gap in AI development despite U.S. restrictions on chips. Indeed, in that article, I mentioned DeepSeek’s R1-preview, a pre-release version of the model, which rivaled OpenAI’s own o1-preview. Here are some points from that story that I think are particularly relevant now.
DeepSeek R1 did not come out of nowhere
It is among a string of recent releases from Chinese AI developers, which suggest the U.S.’s lead has indeed shrunk.
In November, Alibaba and Chinese AI developer DeepSeek released reasoning models that, by some measures, rival OpenAI’s o1-preview. The same month, Chinese videogame juggernaut Tencent unveiled Hunyuan-Large, an open-source model that the company’s testing found outperformed top open-source models developed in the U.S. across several benchmarks. Then, in the final days of 2024, DeepSeek released DeepSeek-v3, which now ranks highest among open-source AI on a popular online leaderboard and holds its own against top performing closed systems from OpenAI and Anthropic.
It’s not as simple as “U.S. export controls aren’t working”
Some have taken DeepSeek R1 (and the other recent Chinese AI releases) as a sign that the U.S.’s restrictions on chips are ineffective at constraining China’s AI development. However, the picture is more complex.
U.S. export controls have not been 100% effective in stopping China from acquiring chips:
DeepSeek, the Chinese developer behind an AI reasoning model called R1, which rivals OpenAI’s o1-preview, assembled a cluster of 10,000 soon-to-be-banned Nvidia A100 GPUs a year before export controls were introduced.
Smuggling might also have undermined the export controls’ effectiveness. In October, Reuters reported that restricted TSMC chips were found on a product made by Chinese company Huawei. Chinese companies have also reportedly acquired restricted chips using shell companies outside China. Others have skirted export controls by renting GPU access from offshore cloud providers. In December, The Wall Street Journal reported that the U.S. is preparing new measures that would limit China’s ability to access chips through other countries.
And Chinese AI developers (as well as their American counterparts) have been getting more compute-bang out of the same hardware. R1 is just the most recent example.
In November, Tencent released a language model called Hunyuan-Large that outperforms Meta’s most powerful variant of Llama 3.1 in several benchmarks. While benchmarks are an imperfect measure for comparing AI models’ overall intelligence, Hunyuan-Large’s performance is impressive because it was trained using the less powerful, unrestricted Nvidia H20 GPUs, according to research by the Berkeley Risk and Security Lab. “They're clearly getting much better use out of the hardware because of better software,” says Ritwik Gupta, the author of the research, who also advises the Department of Defense’s Defense Innovation Unit. DeepSeek-v3, the rival Chinese lab’s model believed to be the strongest open model available, was also trained using surprisingly little compute.
But so long as AI development continues to require increasing compute, export controls will likely bite harder over time. It may be too early for Chinese AI developers to feel their full effects.
“Export controls mostly hit you on quantity,” says Lennart Heim, one of the experts I spoke to for the story, adding that even if some restricted chips find their way into the hands of Chinese developers, export controls, by reducing their number, make it harder to train and deploy models at scale. “I do expect export controls to generally hit harder over time, as long as compute stays as important,” he says.
DeepSeek R1 is a sign that AI technology is diffusing
Whether it is neck-and-neck with the U.S. or several years behind, China will likely acquire advanced AI capabilities eventually. This is because researchers will continue to find ways to eke more performance out of the same computing power (currently, the physical compute required to achieve a given performance in language models is falling by a factor of three per year, according to Epoch AI). Export controls just buy you time. This is something Heim has written about on his own blog.
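To get a feel for how quickly that Epoch AI figure compounds, here is a back-of-the-envelope sketch (my own illustration, not from the article): if the compute needed for a given level of performance falls threefold each year, the fraction of today’s compute budget required shrinks geometrically.

```python
def compute_fraction(years: float, decline_per_year: float = 3.0) -> float:
    """Fraction of today's compute needed for the same model performance,
    assuming efficiency improves by `decline_per_year` x annually."""
    return 1.0 / decline_per_year ** years

# After three years, the same performance needs under 4% of today's compute.
for years in (1, 2, 3):
    print(f"After {years} year(s): {compute_fraction(years):.1%} of today's compute")
```

On this trend, a lab with a tenth of the hardware catches up to today’s frontier in a little over two years, which is why export controls are best thought of as buying time rather than imposing a ceiling.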
Technological diffusion means China, too, may soon develop systems with superhuman capabilities in areas like cyber-warfare. As Scott Singer, another expert I spoke to for the story, told me:
Within Washington, “right now, there is a hesitation to bring China to the [negotiating] table,” says Scott Singer, a visiting scholar in the Technology and International Affairs Program at the Carnegie Endowment for International Peace. The implicit reasoning: ‘[If the U.S. is ahead], why would we share anything?’
But he notes there are compelling reasons to negotiate with China on AI. “China does not have to be leading to be a source of catastrophic risk,” he says, adding that its continued progress in spite of compute restrictions means it could one day produce AI with dangerous capabilities. “If China is much closer, consider what types of conversations you want to have with them around ensuring both sides’ systems remain secure,” Singer says.
What surprised me about DeepSeek R1
While the writing was on the wall for DeepSeek R1’s release, I did not expect it to generate as many headlines as it has, or to send U.S. tech markets into a frenzy. My colleagues Billy Perrigo and Tharin Pillay have written a smart explainer about why, and what it means for the future of America’s AI industry, which you can read now on TIME.