
GPUs are in short supply, but are shipment delays really necessary?

Over the past year, GPUs for AI servers have been in high demand and are expected to remain scarce. Judging by the trend for the coming year, supply of these products is likely to tighten further.

According to TrendForce forecasts, global server shipments are expected to reach approximately 13.654 million units in 2024, a year-on-year increase of about 2.05%. At the same time, the market continues to focus on deploying AI servers, which are expected to account for about 12.1% of shipments.
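To make the forecast concrete, the short sketch below derives two figures that the forecast implies but does not state: the approximate number of AI servers and the prior-year shipment base. The function names are illustrative, and the results are simple arithmetic on the rounded figures quoted above, not numbers published by TrendForce.

```python
# Back-of-the-envelope arithmetic on the forecast figures quoted above.
# Inputs are the rounded values from the article; outputs are estimates only.

TOTAL_2024_M = 13.654   # forecast 2024 server shipments, in millions of units
YOY_GROWTH = 0.0205     # forecast year-on-year growth (2.05%)
AI_SHARE = 0.121        # forecast AI server share of shipments (12.1%)


def implied_ai_servers(total_m: float, ai_share: float) -> float:
    """AI server shipments implied by the total and the AI share (millions)."""
    return total_m * ai_share


def implied_prior_year(total_m: float, yoy_growth: float) -> float:
    """Prior-year shipments implied by this year's total and growth rate (millions)."""
    return total_m / (1.0 + yoy_growth)


if __name__ == "__main__":
    ai_m = implied_ai_servers(TOTAL_2024_M, AI_SHARE)
    prior_m = implied_prior_year(TOTAL_2024_M, YOY_GROWTH)
    print(f"Implied AI servers in 2024: {ai_m:.2f} million units")
    print(f"Implied 2023 shipments:     {prior_m:.2f} million units")
```

On these inputs, the AI server share works out to roughly 1.65 million units, on a 2023 base of roughly 13.38 million servers.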

Among major ODMs, Foxconn is expected to see the highest growth this year, with annual shipments estimated to rise by about 5% to 7%, including orders for the Dell 16G platform, AWS Graviton 3 and 4, Google Genoa, and Microsoft Gen9. In terms of AI server orders, Foxconn has secured orders from Oracle this year and has also taken on some AWS orders.

The second-highest growth rate is predicted for Inventec, with an estimated annual shipment increase of about 0% to 3%. In the AI server sector, aside from North American CSPs, Chinese customers such as ByteDance have the strongest demand. Inventec's AI server shipments are estimated to grow at a double-digit annual rate this year, with a share of about 10% to 15%.

Following Inventec are Quanta and Supermicro, whose server shipments are expected to be flat year on year.


Overall, AI servers remain the strongest shipment category for all ODMs in 2024, driven mainly by orders from North American cloud data center vendors. Both the growth rate and the share of AI server shipments are expected to reach double-digit percentages this year. By type, shipments of models equipped with high-end AI training chips (such as NVIDIA's H-series and AMD's MI series) are expected to double. This means greater business opportunities for NVIDIA and AMD.

01

Does size lead to bullying customers?

Currently, NVIDIA holds 80% of the AI server GPU market, and AI system builders and major internet companies all depend on its GPUs. NVIDIA therefore has significant influence over this market.

Recently, foreign media have reported that Nvidia may be deliberately delaying shipments to keep customers from placing orders with its competitors. According to the report, if Nvidia discovers that customers are looking at other suppliers, it might delay the delivery of their data center GPUs. Groq, an AI chip startup that competes with Nvidia, has stated that customers fear retaliatory shipment delays from Nvidia and therefore keep their purchase or design of alternative AI technology confidential.

Groq's CEO, Jonathan Ross, has indicated that potential customers, fearing that Nvidia might discover their discussions with other manufacturers, would deny having met with these rival companies. In fact, this kind of situation is not uncommon in the industry. Ross said that many of the people Groq meets with say they would deny the meeting if Nvidia heard about it. The problem, he added, is that customers have to pay Nvidia a year in advance and might receive the hardware within a year, or it might take longer.

Foreign media have even suggested that tech giants like Microsoft, Google, and Amazon are building their own AI accelerators but insist they have no intention of becoming Nvidia's competitors, given Nvidia's dominant position in the AI market.

In response to Ross's statement, Nvidia's CEO, Jensen Huang, told industry analysts that the company tries to allocate GPUs fairly among customers and avoids selling to companies that would not put the accelerators to use immediately.

Following Groq's accusation that Nvidia uses delayed shipment tactics, AMD's former Vice President, Scott Herkelman, stated that Nvidia does adopt such strategies, even referring to Nvidia as the "GPU cartel." He wrote on social media platform X, "This happens more often than expected."

Herkelman's stance is noteworthy because he ran AMD's graphics business division, which competes with Nvidia in both consumer and data center markets, from 2016 until he left AMD in 2023. More importantly, he served as the General Manager of Nvidia's GeForce business from September 2012 to May 2015.

However, it is currently unclear whether there is any evidence to prove that Nvidia has indeed engaged in the aforementioned practices.

02

How to Maintain Industry Leadership?

Given its dominant position in the AI server GPU market, NVIDIA will certainly try every means to maximize profits. Beyond high pricing, maintaining customer stickiness is essential. The measures are both technical, such as locking in a large base of engineers through the CUDA software and hardware ecosystem, and commercial, aimed at minimizing competitors' opportunities to win customers.

Under current conditions in the AI server and GPU markets, GPU providers, system integrators, and internet giants are all becoming more sensitive. The internet giants in particular are hedging their bets: while purchasing more GPUs from NVIDIA, they are also accelerating the development of their own chips. For NVIDIA's customers of all kinds, more contact with AMD and Intel is inevitable.

To reduce chip costs, diversify their supply chains, and lessen their dependence on NVIDIA, cloud service providers such as Microsoft, Meta, and Amazon are increasing their procurement of AMD's MI300 series products and requiring ODMs to design AI servers specifically around the MI300 series. They are also stepping up development of in-house HPC chips and striving to use more of them in their own internet and cloud computing systems.

AMD is NVIDIA's biggest competitor. As more and more NVIDIA customers turn their attention to AMD, procurement of its products is increasing, and the standing of this number two GPU maker continues to rise, a trend that is increasingly visible in the capital market. AMD's stock recently rose by more than 9% to $192.53 per share, a record high. It has gained 14.8% in February, and the company's market value has exceeded $300 billion for the first time.

According to Dow Jones Market Data, AMD's market value has reached $311 billion. After the recent continuous rise, AMD's stock price is getting more and more expensive, and its price-to-earnings ratio is close to 50 times, far higher than NVIDIA's 32 times.

As a result, although NVIDIA's products remain in great demand, the company is constrained by advanced process and packaging capacity and faces competition from chipmakers at every level, so its sense of crisis is growing.

NVIDIA therefore needs to speed up the launch of new products and the optimization of existing ones.

NVIDIA's annual AI event, "GTC 2024," will be held in the United States on March 18. There, Jensen Huang will announce the latest AI chip and the B100 GPU, based on the new-generation Blackwell architecture. It is reported that the new product will be manufactured on TSMC's 3nm process and will ship as early as the fourth quarter of this year.

The GTC conference will not only bring together professional engineers and researchers from all over the world, but also invite several big names in the field to attend in person. The event is expected to attract more than 300,000 participants (offline and online combined), and this year's GTC is also seen as an important bellwether for the development of key AI technologies in 2024 and 2025.

NVIDIA's upcoming B100 offers a significant increase in overall performance over the current H-series GPUs. Its HBM memory capacity alone is about 40% higher than that of the H200, the strongest chip in the H series, which enables the B100 to meet the needs of high-performance HPC and accelerated LLM training. It is understood that the AI performance of the B100 is at least twice that of the Hopper-architecture H200 and more than four times that of the H100.

Several major players in the AI server supply chain have begun to compete for B100 business, including Wistron, which has become a supplier of Nvidia's B100 module; TSMC, which provides 3nm or 4nm process technology; and Ingrasys, a major contract manufacturer that has started to receive orders for Nvidia's B100 AI server liquid cooling project. Ingrasys stated that this year's AI server market is still dominated by Nvidia products, with high-end AI training servers being the main force in the market.

With the upcoming B100, Nvidia has also upgraded its cooling technology, switching from air cooling to liquid cooling. Jensen Huang has said he firmly believes that immersion liquid cooling is the future direction and will drive comprehensive innovation across the cooling market. It is reported that, starting with the B100, all of Nvidia's future products will use liquid cooling instead of air cooling.

03

The Chinese market adds to Nvidia's sense of crisis

Nvidia's attention to the threats posed by competitors can be seen from its attitude towards the Chinese mainland market and local chip companies.

The Chinese mainland market accounts for about 20% of Nvidia's sales. In the past two years, the company has had to change the performance specifications of GPUs several times to meet the export requirements of the U.S. government.

In August 2022, the U.S. government banned the export of Nvidia's A100 and H100 chips to the Chinese mainland because the communication bandwidth of these chips reached 600GB/s or higher. For the Chinese mainland market, Nvidia subsequently launched the A800 and H800 processors, both with communication bandwidth significantly below 600GB/s.

In October 2023, the U.S. Department of Commerce's Bureau of Industry and Security (BIS) stated that it would use "performance density" as a new parameter to classify restricted chips. According to the new regulations, Nvidia's A800, H800, L40, L40S, and RTX 4090-related products were prohibited from being sold to the Chinese mainland. In response to this regulation, Nvidia launched three AI chips last November—H20, L20, and L2—but they will not be mass-produced and delivered until the second quarter of 2024.

In response to the restricted sales of the RTX 4090 in the Chinese mainland, Nvidia developed the RTX 4090 D graphics card, which lowers some specifications to meet U.S. export control requirements. It is reported that the RTX 4090 D stays within the total processing performance (TPP) limit of 4800, whereas the RTX 4090 has a TPP of 5286.
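The size of the gap Nvidia had to close can be estimated from the two TPP figures above. The sketch below is illustrative only: the function names are hypothetical, and it assumes a card is compliant when its TPP is strictly below 4800, which simplifies export rules that also consider other parameters such as performance density.

```python
# Illustrative arithmetic on the TPP figures quoted above.
# Assumption: a card is treated as compliant when its TPP is strictly below
# the limit; the real rules also consider other parameters.

TPP_LIMIT = 4800        # TPP limit cited in the article
RTX_4090_TPP = 5286     # TPP reported for the RTX 4090


def is_below_limit(tpp: float, limit: float = TPP_LIMIT) -> bool:
    """Return True if the TPP is strictly below the limit."""
    return tpp < limit


def required_cut_pct(tpp: float, limit: float = TPP_LIMIT) -> float:
    """Percentage by which TPP must fall to reach the limit (0 if already at or below it)."""
    if tpp <= limit:
        return 0.0
    return (tpp - limit) / tpp * 100.0


if __name__ == "__main__":
    print(f"RTX 4090 below limit: {is_below_limit(RTX_4090_TPP)}")
    print(f"Reduction needed:     {required_cut_pct(RTX_4090_TPP):.1f}%")
```

On these figures, the RTX 4090 exceeds the limit by 486 TPP points, so a derivative card needs a reduction of just over 9%, which is consistent with a modest specification trim rather than a redesign.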

Recently, Nvidia launched the RTX 5880 Ada, its latest graphics card specific to the Chinese market, which complies with the 4800 TPP limit. Nvidia uses the AD102 chip for the RTX 6000 Ada and RTX 5000 Ada, and the RTX 5880 Ada is likely to use a variant of the same chip. The AD102 has 18,432 CUDA cores.

In recent years, with the introduction of restrictive policies in the United States and the growing competitiveness of domestic Chinese companies and products, the technological and product advantages of manufacturers like NVIDIA have been diminishing. For instance, the H20 still holds an edge over domestic Chinese AI chips in performance and efficiency, but this advantage is narrowing as several domestic Chinese chip manufacturers advance rapidly.

Due to the shrinking gap between domestic Chinese AI chips and NVIDIA's specialized products, since the beginning of 2024, several major Chinese internet companies and cloud service providers have indicated that their orders for NVIDIA H20 and similar products this year will be significantly fewer than initially intended, as the usage of domestically produced chips has increased.

Test results show that the H20 can efficiently transfer data between multiple processors, making it more suitable for AI computing applications than domestic Chinese chips. However, more H20 units are required to match the computing power of NVIDIA's regular GPUs, which significantly raises costs. By contrast, the most advanced domestic Chinese AI chips can also handle AI-related applications, though only for less complex tasks. Insiders have revealed that several major Chinese internet companies and cloud service providers have already shifted some AI chip orders to domestic manufacturers. Taking Huawei as an example, it is reported that the company received orders for at least 5,000 Ascend 910B chips from domestic internet giants last year, with deliveries scheduled for this year.

In late February, during an interview with foreign media, Jensen Huang stated that the entire tech industry is currently racing to develop and optimize its own chip technology. Whether it is Google's TPU team, the AWS Trainium and Inferentia teams, Microsoft's Maia project, or major Chinese cloud service providers and startups, a significant amount of effort has been invested in this field, and the competition is very intense.

Regarding competitors in mainland China, Huang said that Huawei is an excellent company. Despite being limited by existing semiconductor process technologies, it can still construct very powerful systems by aggregating many chips. To compete with Huawei, NVIDIA is offering its customers samples of two new AI chips tailored specifically for the Chinese market.

This is the first time NVIDIA has publicly listed Huawei as a competitor. Previously, the only time NVIDIA mentioned Huawei publicly was in a financial report from 2017, where the company stated that Huawei would use NVIDIA's Volta HGX architecture to build AI systems for data centers. In that report, NVIDIA also listed Huawei as a partner for its AI smart city platform, and now, Huawei has become a competitor that NVIDIA must take seriously.

Conclusion

After the boom in 2023, the AI server market in 2024 is likely to be even more robust, providing more business opportunities for related high-performance chips, especially GPU manufacturers.

For industry-leading manufacturers, numerous competitors, including customers, are eyeing their positions. The higher one stands, the harder one may fall if not handled well. It is necessary to fully leverage existing advantages in technology, products, and commercial promotion to suppress competitors and maintain industry status.

For NVIDIA, the AI server GPU market in 2024 will still be its domain. However, in the ever-changing and evolving high-performance computing market, who can predict how much the technology and product market will change in two or three years? Just as two years ago, who could have predicted that Huawei would achieve a breakthrough in mobile processor manufacturing by 2023?