NVIDIA’s Ultimate Chip, Huang’s Dream Realized

News | Rai Technology

“Do Not Miss the Decisive Moment of AI!”

When Jensen Huang made this point at NVIDIA GTC 2023, many ordinary people like me probably paid little attention. At the time, AI applications were scarce: only a handful of enthusiasts knew enough to deploy local AI image generation at home, and the large language models then available offered the general public little more than a fleeting sense of novelty.

However, looking back now, that time might indeed have been a decisive moment for AI.

In the early hours of today, the much-anticipated NVIDIA GTC 2024, widely regarded as a barometer for AI, officially opened at the SAP Center in San Jose, California, where NVIDIA founder and CEO Jensen Huang took the stage to deliver the keynote “Witnessing the Moment of AI Transformation.” Huang did not disappoint: we truly did witness a moment of AI transformation.

(Source: techovedas)

During the two-hour keynote, Jensen Huang unveiled the company’s most powerful AI accelerator to date, the Blackwell GB200 superchip, along with fully pre-configured GB200 server systems and NVIDIA’s latest advances in AI software (NIM microservices), Omniverse Cloud (Earth simulation), embodied intelligence (robotics), and more.

Just how powerful is the AI chip infrastructure Huang unveiled, and what will it mean for the large-model industry? Let’s find out.

Blackwell GB200: The Ultimate AI Accelerator Card

In artificial intelligence, computing speed is crucial. Training a complex neural network means feeding the GPU vast amounts of data and completing massive parallel processing of homogeneous data in as little time as possible. The GPU is thus the cornerstone, arguably the decisive computational foundation, of AI large-model training platforms.

Therefore, the star of this keynote was naturally NVIDIA’s new flagship GPU, the Blackwell B200.

(Source: NVIDIA, on-site comparison of Blackwell and Hopper architecture GPUs)

As the first product based on NVIDIA’s Blackwell architecture, the Blackwell B200 is fabricated on a custom TSMC 4nm-class process and uses a dual-die design that links two dies into a single GPU, for a total of 208 billion transistors per chip.

Compared to the 80 billion transistors of the previous-generation GH100 GPU, the Blackwell B200 marks a significant leap, roughly in line with Moore’s Law, the observation that the number of transistors on an integrated circuit doubles approximately every 18 months.
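Taking the article’s figures at face value, the generational jump works out to about a 2.6x increase, a little more than one doubling. This is a rough check only: the B200 reaches its count partly by joining two dies, while Moore’s Law is usually stated per chip.

```python
import math

# Transistor counts quoted in the surrounding text.
gh100_transistors = 80e9   # previous-generation GH100 (Hopper)
b200_transistors = 208e9   # Blackwell B200 (dual-die)

ratio = b200_transistors / gh100_transistors
doublings = math.log2(ratio)
print(f"{ratio:.2f}x increase, {doublings:.2f} doublings")
```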

(Source: NVIDIA)

Jensen Huang stated that with this architectural upgrade, the Blackwell B200 delivers up to 20 PFLOPS of AI compute (at FP4 precision), versus roughly 4 PFLOPS (at FP8) for the H100, and NVIDIA claims this can theoretically boost inference efficiency in Large Language Models (LLMs) by 30 times. The added processing power will enable AI companies to train larger and more complex models.
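A quick bit of arithmetic on the quoted figures shows that the 30x inference claim cannot come from raw throughput alone; the per-chip numbers differ by only 5x, and are quoted at different precisions (FP4 versus FP8), so even that ratio is not apples to apples:

```python
# Per-chip AI throughput figures as quoted in the keynote coverage above.
b200_pflops = 20.0  # Blackwell B200, at FP4 precision
h100_pflops = 4.0   # H100, at FP8 precision

# Raw ratio is 5x; the quoted 30x inference gain presumably also reflects
# lower-precision formats and system-level improvements (an assumption here,
# not a figure from the keynote).
raw_ratio = b200_pflops / h100_pflops
print(raw_ratio)  # 5.0
```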

Even more impressively, building on the B200, Huang introduced a complete AI superchip, the Blackwell GB200, which pairs two Blackwell B200 GPUs with one Arm-based Grace CPU.

In a benchmark on the 175-billion-parameter GPT-3 LLM, NVIDIA claims the GB200 delivers seven times the performance of the H100 and trains four times faster.

(Source: NVIDIA)

This puts AI performance on an entirely different level.

Of course, if the GB200 still doesn’t meet your needs, NVIDIA has prepared a line of server arrays built from GB200 superchips, topping out with the GB200 NVL72 system: 72 B200 GPUs (plus 36 Grace CPUs) delivering up to 720 PFLOPS of training compute at FP8 precision, on par with the previous-generation DGX SuperPOD supercomputing cluster.
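As a consistency check, dividing the system-level figure by the GPU count recovers a sensible per-chip number: 10 PFLOPS at FP8, exactly half the 20 PFLOPS FP4 figure quoted earlier, as expected when halving precision roughly doubles throughput (all figures from the text).

```python
# GB200 NVL72 figures as quoted above.
total_fp8_pflops = 720.0  # system-level training throughput at FP8
num_gpus = 72             # B200 GPUs in the NVL72 rack

per_gpu_fp8 = total_fp8_pflops / num_gpus
print(per_gpu_fp8)  # 10.0 PFLOPS of FP8 per B200
```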

(Source: NVIDIA)

More importantly, NVIDIA says it cuts cost and energy consumption by up to a factor of 25 compared with the H100.

At the beginning of this year, the American magazine The New Yorker reported that ChatGPT consumes over 500,000 kilowatt-hours of electricity daily, equivalent to 17,000 times the daily consumption of an average American household. As Elon Musk has put it, electricity shortages will become a major constraint on AI development in the foreseeable future.
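The report’s two numbers are easy to cross-check: dividing them yields roughly 29 kWh per household per day, which is in line with the commonly cited average daily electricity use of a U.S. household.

```python
# Figures from The New Yorker report quoted above.
chatgpt_daily_kwh = 500_000      # ChatGPT's reported daily consumption
household_equivalents = 17_000   # household-equivalents in the report

kwh_per_household_per_day = chatgpt_daily_kwh / household_equivalents
print(f"{kwh_per_household_per_day:.1f} kWh/day")  # ~29.4
```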

(Image Source: businessinsider.com)

Huang was explicit on this point: previously, training a 1.8-trillion-parameter model required 8,000 H100 GPUs and about 15 megawatts of power, whereas the same job can now be done with 2,000 B200 GPUs consuming only 4 megawatts.
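Taken at face value, the quoted keynote numbers imply a 4x reduction in GPU count and a 3.75x reduction in power draw for the same training job; a minimal sketch of that arithmetic:

```python
# Training a 1.8-trillion-parameter model, per the keynote figures above.
h100_gpus, h100_megawatts = 8000, 15.0
b200_gpus, b200_megawatts = 2000, 4.0

gpu_reduction = h100_gpus / b200_gpus                # 4.0x fewer GPUs
power_reduction = h100_megawatts / b200_megawatts    # 3.75x less power
print(gpu_reduction, power_reduction)
```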

With numbers like these, it is little wonder that overseas commentators exclaimed, “Moore’s Law has been rewritten!”

It is widely expected that, in order to keep serving customers in the Chinese market, Jensen Huang will introduce a China-specific AI accelerator based on the new-generation architecture, reportedly the Blackwell B20 GPU.

However, given the U.S. Commerce Department’s explicit export restrictions on computing power, it remains to be seen how much of a capability improvement this China-specific GPU can deliver, and whether it can compete healthily with domestic alternative AI accelerators.

From Simulating Earth to Humanoid Robots

Judging by the worldwide fervor, generative AI has clearly won broad consensus. So what, exactly, can we do with AIGC? Huang offered some model answers today.

Have you ever played a game called SimEarth? Its developer, Maxis, built a miniature Earth on the comparatively underpowered computers of the day, letting players play the role of a deity: managing the planet’s geography, atmosphere, biology, and civilizations to build a thriving world.

(Image Source: MAXIS Studio)

Now, NVIDIA is harnessing the capabilities of large models to create a digital twin of Earth: Earth-2.

Earth-2 is an AI-driven physical simulation environment built on NVIDIA Omniverse with the Modulus framework, which NVIDIA says runs millions of times faster than traditional simulation. It aims to deliver a globally simulated environment at data-center scale, ultimately using cloud computing and AI to simulate and visualize weather conditions.

(Image Source: Nvidia)

By combining traditional weather models with NVIDIA’s AI weather model, Earth-2 can produce forecasts covering hundreds or even thousands of square kilometers, providing information such as a typhoon’s projected area of impact and thereby minimizing property damage. NVIDIA expects to open this technology to more countries and regions in the future.

Indeed, the years-old meme about simulating the Earth on a PS3 seems to be coming true.

(Image Source: PS3)

Next, let’s talk about humanoid robots.

In recent years, humanoid robots have become a hot research direction. Apart from Musk’s well-known Tesla Optimus, companies at home and abroad, including Boston Dynamics, Agility Robotics, UBTECH, Xiaomi, iFlytek, and DeepRobotics, are exploring this path.

As large models continue to iterate and improve, their rapidly advancing generalization capabilities have convinced many in the industry that humanoid robots are finally within reach. Rather than driving robots with task-specific data through endless rounds of tuning, the more promising path is to use a large model as the brain and the robot as the body, letting the model gather information, make judgments, and take action through perception, motion, and environmental interaction.

And this is one of the ultimate forms of artificial intelligence — embodied intelligence.

(Image Source: Nvidia)

To this end, NVIDIA today launched Project GR00T, billed as the world’s first general-purpose foundation model for humanoid robots. Robots powered by this model will be able to understand natural language and mimic actions by observing human behavior, and users can quickly teach them new skills so they can adapt to and interact with the real world.

Huang firmly believes that embodied intelligence will lead the next wave of artificial intelligence.

At this point, all I want to say is: UBTECH and company, hurry up and partner with NVIDIA. Pair your robots’ “bodies” with the Project GR00T “brain” and you may finally have truly intelligent robots. With the arrival of Project GR00T, the era of real robots may be approaching, and that is also the ultimate application of AI: making artificial intelligence take “human” form.

Fulfilling a Ten-Year Dream: NVIDIA’s CUDA Shines Bright at GTC 2024

At the opening keynote of GTC 2024, Jensen Huang looked back on NVIDIA’s history.

In 2014, Jensen Huang began emphasizing the importance of machine learning, built on CUDA (Compute Unified Device Architecture), the parallel computing platform NVIDIA had introduced years earlier. While many still viewed NVIDIA solely as a maker of “gaming graphics cards,” the company was already at the forefront of the AI revolution.

Back then, CUDA’s main applications were in scientific computing, such as climate modeling, physics simulations, bioinformatics, and other specialized research fields. Although these applications were valuable, they were limited in scope. Consequently, NVIDIA’s CUDA struggled to penetrate the market, and the returns did not match the significant research and development investments. Every year, Huang had to explain to the board why NVIDIA should persist with CUDA at a time when its potential was unclear. Perhaps even Huang himself did not anticipate that in the following years, CUDA would find success in computing scenarios such as blockchain mining and AI model computations, leading to immense prosperity.

In just two years, NVIDIA built a trillion-dollar empire on the H100 and H200 chips, surpassing traditional powerhouses like Amazon in market value. At this rate, overtaking Apple and Microsoft to become the world’s most valuable company in the foreseeable future is not impossible.

Currently, demand for NVIDIA’s “cards” far exceeds supply: Chinese tech giants such as ByteDance and Baidu are stockpiling them as a hedge against worst-case scenarios, while Silicon Valley giants like Microsoft and Meta are lining up at Huang’s door to buy.

Although the growing number of players in AI and AI chips, along with certain trade-policy frictions, has left Huang somewhat constrained, his confidence in the newly launched B200 and GB200 was evident throughout the keynote. He remains steadfast in his vision of empowering the world with AI.

In 2024, hailed as the inaugural year of AI applications, NVIDIA’s CUDA has, as its name suggests, become truly unified and universal. From foundational technologies such as large language models, conversational AI, and edge computing, to applications like intelligent cockpits, autonomous driving, humanoid robots, AI phones, AI PCs, AI home appliances, AI search, and AI image generation, and on to future scenarios like climate forecasting, computational lithography, and 6G networks: AI is omnipresent, NVIDIA’s computing power is everywhere, and CUDA has become synonymous with “universal computing.”

NVIDIA’s CUDA is truly amazing and cool!

(Image Source: NVIDIA)
