Powerful and scalable ND- and NC-series virtual machines optimised for distributed AI training and inference are part of Azure's cloud-based AI supercomputer. Azure is the first public cloud to incorporate NVIDIA's cutting-edge AI stack, adding hundreds of NVIDIA A100 and H100 GPUs, NVIDIA Quantum-2 400Gb/s InfiniBand networking, and the NVIDIA AI Enterprise software suite to its platform.
NVIDIA will work with Azure to research and accelerate advances in generative AI, a rapidly developing field of artificial intelligence in which foundation models such as Megatron-Turing NLG 530B serve as the basis for unsupervised, self-learning algorithms that generate new text, code, digital images, video, or audio.
Additionally, the companies will collaborate to improve Microsoft's DeepSpeed deep learning optimization library. Azure enterprise customers will gain access to NVIDIA's full stack of AI workflows and software development kits, optimized for Azure.
Quantum
Built with the most advanced data centre GPUs, Microsoft Azure's AI-optimized virtual machine instances are the first public cloud instances to use NVIDIA Quantum-2 400Gb/s InfiniBand networking. Customers can deploy hundreds of GPUs in a single cluster to train even the largest language models, build the most complex recommender systems at scale, and enable generative AI at scale.
The current Azure instances pair NVIDIA A100 GPUs with NVIDIA Quantum 200Gb/s InfiniBand networking; future versions will pair NVIDIA H100 GPUs with NVIDIA Quantum-2 400Gb/s InfiniBand networking. Combined with Azure's advanced compute, networking, and storage infrastructure, these AI-optimized offerings will deliver scalable peak performance for AI training and deep learning inference workloads of any size.
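As a rough sketch of what running such a multi-node job looks like in practice, the fragment below launches a distributed PyTorch training run across two 8-GPU nodes with NCCL collectives over InfiniBand. The node counts, script name, and rendezvous address are illustrative assumptions, not Azure-specific values:

```shell
# Illustrative multi-node launch (hypothetical values throughout).
# NCCL_IB_DISABLE=0 keeps NCCL's InfiniBand transport enabled so
# inter-node GPU communication uses the IB fabric rather than TCP.
export NCCL_IB_DISABLE=0

# torchrun spawns one worker per GPU on each node; MASTER_ADDR is
# assumed to hold the first node's address.
torchrun --nnodes=2 --nproc_per_node=8 \
         --rdzv_backend=c10d --rdzv_endpoint=${MASTER_ADDR}:29500 \
         train.py
```

The same pattern scales to hundreds of GPUs by raising `--nnodes`; the training script itself is unchanged.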
The platform will support a wide variety of AI services and applications, including Microsoft DeepSpeed and the NVIDIA AI Enterprise software suite.
Microsoft DeepSpeed will use the NVIDIA H100 Transformer Engine to accelerate transformer-based models used for large language models, generative AI, and code generation, among other applications. The Transformer Engine's 8-bit floating point (FP8) precision delivers twice the throughput of 16-bit operations, significantly speeding up AI calculations for transformers.
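To illustrate what 8-bit floating point means in practice, the sketch below rounds a value to the nearest number representable in FP8 E4M3 (1 sign bit, 4 exponent bits, 3 mantissa bits). This is an educational toy, not NVIDIA's or DeepSpeed's implementation; subnormals and NaN handling are omitted:

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3-representable value.

    Illustrative sketch only: 1 sign bit, 4 exponent bits,
    3 mantissa bits; subnormals and NaN handling are omitted.
    """
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    exp = math.floor(math.log2(mag))
    exp = max(-6, min(8, exp))     # clamp to E4M3's normal exponent range
    step = 2.0 ** (exp - 3)        # 3 mantissa bits -> 8 steps per binade
    # 448 is E4M3's largest finite value; larger inputs saturate.
    return sign * min(448.0, round(mag / step) * step)

# Only 8 mantissa steps exist between successive powers of two, so
# nearby values collapse onto the same representable number.
print(quantize_e4m3(1.1))     # 1.125
print(quantize_e4m3(1000.0))  # saturates at 448.0
```

The coarser grid and narrow range are the trade-off that buys the doubled throughput: each FP8 operand is half the size of a 16-bit one, so the hardware can move and multiply twice as many per cycle.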