Cloud providers are experiencing a sharp rise in demand for Nvidia’s H200 chips as Chinese AI firm DeepSeek intensifies the race for leading AI models.
Though the stock market reacted negatively, with Nvidia's shares falling 16% on Monday, DeepSeek has steadily gained traction among AI researchers, drawing increasing attention since the release of its V2 model in May 2024.
The breakthrough came with V3’s release in December, which sparked excitement among AI developers. The launch of DeepSeek R1 in early January further fueled demand for Nvidia’s H200 GPUs, according to Lambda VP Robert Brooks. Enterprises are now pre-purchasing large blocks of H200 capacity ahead of public availability.
DeepSeek’s Open-Source Model Reshapes AI Computing
DeepSeek’s open-source approach allows users to access models with minimal cost. However, running these models at scale requires powerful hardware or cloud computing services.
On Friday, SemiAnalysis analysts reported that DeepSeek’s AI models were having a tangible impact on H100 and H200 pricing. Meanwhile, Nvidia CFO Colette Kress confirmed that total H200 GPU sales had already reached double-digit billions of dollars by November 2024.
Unlike OpenAI, Microsoft, and Meta, DeepSeek trained its models on less powerful hardware, making them cheaper to build and run. Investors are now questioning whether major AI firms need their massive infrastructure investments. Even so, AI inference remains compute-heavy and still requires high-performance chips.
Nvidia H200 Chips Become Essential for DeepSeek V3
DeepSeek’s models vary in size, with the most powerful version reaching 671 billion parameters. That figure sits between OpenAI’s GPT-4 (reportedly 1.76 trillion) and Meta’s Llama 3.1 (405 billion).
H200 chips are the only widely available Nvidia hardware capable of running DeepSeek V3 efficiently on a single node. Running it on lower-power GPUs is possible, but doing so requires expertise, slows performance, and leaves room for error, said Baseten CEO Tuhin Srivastava.
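A rough back-of-the-envelope calculation illustrates why the model fits on a single H200 node but not a comparable H100 node. The figures below (671B parameters stored at one byte each in FP8, 80 GB of memory per H100, 141 GB per H200, eight GPUs per node) are common published specs used here as assumptions, and the sketch ignores KV-cache and activation memory, which add further overhead in practice:

```python
# Back-of-envelope sketch (assumed figures, not from the article):
# can a single 8-GPU node hold DeepSeek V3's weights?

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory needed just to store model weights, ignoring KV cache and activations."""
    return params_billion * bytes_per_param  # billions of params x bytes each = GB

V3_PARAMS_B = 671  # DeepSeek V3 total parameters, in billions

fp8_weights = weight_memory_gb(V3_PARAMS_B, 1.0)  # FP8: 1 byte/param -> 671 GB

h100_node = 8 * 80    # 8x H100, 80 GB HBM each  -> 640 GB total
h200_node = 8 * 141   # 8x H200, 141 GB HBM each -> 1128 GB total

print(f"FP8 weights: {fp8_weights:.0f} GB")
print(f"Fits on 8x H100 ({h100_node} GB)? {fp8_weights < h100_node}")
print(f"Fits on 8x H200 ({h200_node} GB)? {fp8_weights < h200_node}")
```

Under these assumptions the weights alone overflow an eight-H100 node by about 31 GB, while an eight-H200 node has hundreds of gigabytes to spare for the KV cache and activations, which is the headroom real-time inference needs.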
Nvidia’s upcoming Blackwell chips will also support DeepSeek V3, but they have only just started shipping. With soaring demand, securing enough H200 GPUs for optimal performance is proving challenging.
Baseten, which optimizes AI model performance, relies on data centers for GPU access and works to speed up inference. Its customers value real-time AI interactions, and DeepSeek’s open-source models offer comparable capability at a fraction of the cost.