The state of GPU capabilities in the U.S. and China by the end of 2024
Current advances in AI depend heavily on greater compute density, which in turn raises power density, and power density is what limits data center capacity. While nuclear power may be essential in the long term, U.S. hyperscalers cannot afford to delay their data center expansions. They must act decisively in this competitive landscape: if you don’t acquire land, power, and labor now, someone else will [1]. In addition to this “game theory” of AI CapEx, fear of missing out (FOMO) is another core driver, as highlighted by Alphabet CEO Sundar Pichai [2]:
We are at an early stage of what I view as a very transformative era. When we go through a curve like this, the risk of under-investing is dramatically greater than the risk of over-investing. Even in scenarios where it turns out that we are over-investing the infrastructure is widely useful for us. I think not investing to be at the frontier definitely has much more significant downside. Having said that, we obsess around every dollar we put in. Our teams work super hard and I'm proud of the efficiency work, be it optimization of hardware, software, model deployment across our fleet.
Microsoft’s leadership has made similar arguments to investors to justify its substantial AI spending. In response to Wall Street's pressing concerns [4], it has assured them that practical measures are in place to monetize these investments, such as the development of AI agent products for its business-facing (to-B) segments [3]. In short, major U.S. tech companies are committed to securing the compute power they need for current and future operations [5].
In contrast, GPU power in China has become increasingly decentralized. Because of U.S. sanctions, the number of NVIDIA GPUs that China can legally acquire is limited to roughly 1 million per year (H20/A20/H800/A800). Notably, the major purchasers of these NVIDIA GPUs from Q2 2023 to Q2 2024 include the state-owned telecom companies China Mobile and China Telecom, which surpassed traditional cloud providers such as Huawei, Tencent, Alibaba, ByteDance, and Baidu [6]. Additionally, it is an open secret that many second-tier players, research institutes, and state-owned entities still manage to obtain the latest NVIDIA GPUs in substantial quantities through irregular channels, often with transit through Singapore. Some local government-backed AI centers further exploit this situation to stimulate infrastructure investment and capital raising, perpetuating the link between real estate and GDP growth [7].
By a skeptical estimate, China may hold less than 30% of the global NVIDIA GPU market share (including irregular acquisitions), compared with over 40% for the U.S. Moreover, no individual entity in China is publicly known to operate a data center approaching the scale of 100,000 GPUs, which is the capacity OpenAI and xAI currently use to train their frontier models.
Overall, AI capabilities in China are decentralized because of U.S. sanctions and the country's unique national conditions. In the long run, the lack of a very large, unified AI cluster is likely to limit the benefits China can draw from AI scaling laws.