Nvidia CEO Jensen Huang delivers a keynote address during the Nvidia GTC Artificial Intelligence Conference at SAP Center on March 18, 2024 in San Jose, California.
Justin Sullivan | Getty Images
Nvidia on Monday announced a new generation of artificial intelligence chips and software for running AI models. The announcement, made during Nvidia’s developer conference in San Jose, comes as the chipmaker seeks to solidify its position as the go-to supplier for AI companies.
Nvidia’s share price is up fivefold and total sales have more than tripled since OpenAI’s ChatGPT kicked off the AI boom in late 2022. Nvidia’s high-end server GPUs are essential for training and deploying large AI models, and companies like Microsoft and Meta have spent billions of dollars buying the chips.
The new generation of AI graphics processors is named Blackwell. The first Blackwell chip is called the GB200 and will ship later this year. Nvidia is enticing its customers with more powerful chips to spur new orders: companies and software makers, for example, are still scrambling to get their hands on the current generation of “Hopper” H100s and similar chips.
“Hopper is fantastic, but we need bigger GPUs,” Nvidia CEO Jensen Huang said on Monday at the company’s developer conference in California.
Nvidia shares fell more than 1% in extended trading on Monday.
The company also introduced revenue-generating software called NIM that will make it easier to deploy AI, giving customers another reason to stick with Nvidia chips over a growing field of competitors.
Nvidia executives say the company is becoming less of a mercenary chip provider and more of a platform provider, like Microsoft or Apple, on which other companies can build software.
“Blackwell’s not a chip, it’s the name of a platform,” Huang said.
“The sellable commercial product was the GPU and the software was all to help people use the GPU in different ways,” Nvidia enterprise VP Manuvir Das said in an interview. “Of course, we still do that. But what’s really changed is, we really have a commercial software business now.”
Das said Nvidia’s new software will make it easier to run programs on any of Nvidia’s GPUs, even older ones that might be better suited for deploying AI but not building it.
“If you’re a developer, you’ve got an interesting model you want people to adopt, if you put it in a NIM, we’ll make sure that it’s runnable on all our GPUs, so you reach a lot of people,” Das said.
Meet Blackwell, the successor to Hopper
Nvidia’s GB200 Grace Blackwell Superchip, with two B200 graphics processors and one Arm-based central processor.
Every two years, Nvidia updates its GPU architecture, unlocking a big jump in performance. Many of the AI models released over the past year were trained on the company’s Hopper architecture, which was announced in 2022 and is used by chips such as the H100.
Nvidia says Blackwell-based processors, like the GB200, offer a huge performance upgrade for AI companies: 20 petaflops of AI performance versus 4 petaflops for the H100. The additional processing power will let AI companies train bigger and more intricate models, Nvidia said.
The chip includes what Nvidia calls a “transformer engine” specifically built to run transformer-based AI, one of the core technologies underpinning ChatGPT.
The Blackwell GPU is large, combining two separately manufactured dies into one chip fabricated by TSMC. It will also be available as part of an entire server called the GB200 NVL72, which combines 72 Blackwell GPUs with other Nvidia parts designed to train AI models.
Nvidia CEO Jensen Huang compares the size of the new “Blackwell” chip versus the current “Hopper” H100 chip at the company’s developer conference, in San Jose, California.
Nvidia
Amazon, Google, Microsoft, and Oracle will sell access to the GB200 through cloud services. The GB200 pairs two B200 Blackwell GPUs with one Arm-based Grace CPU. Nvidia said Amazon Web Services would build a server cluster with 20,000 GB200 chips.
Nvidia said that the system can deploy a 27-trillion-parameter model. That’s much larger than even the biggest models, such as GPT-4, which reportedly has 1.7 trillion parameters. Many artificial intelligence researchers believe bigger models with more parameters and data could unlock new capabilities.
Nvidia didn’t provide a cost for the new GB200 or the systems it’s used in. Nvidia’s Hopper-based H100 costs between $25,000 and $40,000 per chip, with whole systems that cost as much as $200,000, according to analyst estimates.
Nvidia will also sell B200 graphics processors as part of a complete system that takes up an entire server rack.
Nvidia inference microservice
Nvidia also announced it’s adding a new product named NIM, which stands for Nvidia Inference Microservice, to its Nvidia AI Enterprise software subscription.
NIM makes it easier to use older Nvidia GPUs for inference, or the process of running AI software, and will allow companies to continue to use the hundreds of millions of Nvidia GPUs they already own. Inference requires less computational power than the initial training of a new AI model. NIM lets companies run their own AI models, instead of buying access to AI results as a service from companies like OpenAI.
The strategy is to get customers who buy Nvidia-based servers to sign up for Nvidia AI Enterprise, which costs $4,500 per GPU per year for a license.
Nvidia will work with AI companies like Microsoft or Hugging Face to ensure their AI models are tuned to run on all compatible Nvidia chips. Then, using a NIM, developers can efficiently run the model on their own servers or cloud-based Nvidia servers without a lengthy configuration process.
“In my code, where I was calling into OpenAI, I’ll replace one line of code to point it to this NIM that I got from Nvidia instead,” Das said.
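The swap Das describes can be sketched roughly as follows. This is a hypothetical illustration, assuming the NIM exposes an OpenAI-compatible HTTP endpoint; the URLs and model name below are placeholders, not real endpoints.

```python
import json
import urllib.request

def build_chat_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build (but don't send) an OpenAI-style chat-completions request."""
    payload = {
        "model": "example-model",  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

# Before: the app points at a hosted API.
hosted = build_chat_request("https://api.openai.com/v1", "Hello")
# After: the only change is the base URL, now pointing at a
# hypothetical locally deployed NIM.
local = build_chat_request("http://localhost:8000/v1", "Hello")

print(hosted.full_url)  # https://api.openai.com/v1/chat/completions
print(local.full_url)   # http://localhost:8000/v1/chat/completions
```

Everything else in the application, including the request and response format, stays the same; only the endpoint line changes.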
Nvidia says the software will also help AI run on GPU-equipped laptops, instead of on servers in the cloud.