At the SIGGRAPH conference this morning in Los Angeles, NVIDIA made several generative AI-related announcements, including a partnership with Hugging Face intended to broaden access to generative AI supercomputing (NVIDIA’s DGX Cloud hardware) for developers building large language models (LLMs) and other AI applications on the Hugging Face platform.
The companies said the combination will help “supercharge” adoption of generative AI using LLMs that are tailored with business data for industry-specific applications, including intelligent chatbots, search and summarization.
“Researchers and developers are at the heart of generative AI that is transforming every industry,” said Jensen Huang, founder/CEO of NVIDIA. “Hugging Face and NVIDIA are connecting the world’s largest AI community with NVIDIA’s AI computing platform in the world’s leading clouds.”
The collaboration will include a new Hugging Face service, available in the coming months, called Training Cluster as a Service, which is designed to simplify the creation of custom generative AI models for the enterprise.
The Hugging Face platform lets developers build and deploy AI models using open-source resources. The company said more than 15,000 organizations use the platform and that its community has shared more than 250,000 models and 50,000 datasets.
Each instance of DGX Cloud features eight NVIDIA H100 or A100 80GB Tensor Core GPUs for a total of 640GB of GPU memory per node.
NVIDIA also announced its AI Workbench, a “unified workspace” the company said is designed to simplify tuning and training models on a PC or workstation and then scaling them to a data center, public cloud or NVIDIA DGX Cloud.
NVIDIA said the AI Workbench is intended to reduce the complexity of getting started with an enterprise AI project. Accessed through an interface running on a local system, it allows developers to customize models from repositories like Hugging Face, GitHub and NVIDIA NGC using custom data, which can then be shared across multiple platforms.
“Enterprises around the world are racing to find the right infrastructure and build generative AI models and applications,” said Manuvir Das, NVIDIA’s VP of enterprise computing. “NVIDIA AI Workbench provides a simplified path for cross-organizational teams to create the AI-based applications that are increasingly becoming essential in modern business.”
The company said customizing pre-trained models with the many open-source tools can require hunting through multiple online repositories for the right framework, tools and containers, and employing the right skills to customize a model for a specific use case. But with the AI Workbench, NVIDIA said developers can customize and run generative AI in a few clicks. “It allows them to pull together all necessary enterprise-grade models, frameworks, SDKs and libraries from open-source repositories and the NVIDIA AI platform into a unified developer workspace.”
The company said AI infrastructure providers, including Dell Technologies, Hewlett Packard Enterprise, HP Inc., Lambda, Lenovo and Supermicro, “are embracing AI Workbench for its ability to augment their latest generation of multi-GPU capable desktop workstations, high-end mobile workstations and virtual workstations.”
Developers with a Windows- or Linux-based RTX PC or workstation will also be able to initiate, test and fine-tune enterprise-grade generative AI projects on their local RTX systems, and access data center and cloud computing resources when the need to scale arises.
NVIDIA also announced the latest version of its enterprise software platform, NVIDIA AI Enterprise 4.0, intended to give businesses access to the tools for adopting generative AI.
It includes:
- NVIDIA NeMo, a cloud-native framework with “end-to-end support” to build and deploy LLMs.
- NVIDIA Triton Management Service, which helps automate production deployments. It allows enterprises to deploy multiple NVIDIA Triton Inference Server instances in Kubernetes with model orchestration designed for efficient operation of scalable AI.
- NVIDIA Base Command Manager Essentials cluster management software, designed to help enterprises maximize performance and utilization of AI servers across data center, multi-cloud and hybrid-cloud environments.
Software companies ServiceNow and Snowflake, as well as infrastructure provider Dell Technologies, which offers Dell Generative AI Solutions, recently announced collaborations with NVIDIA for generative AI solutions and services on their platforms. NVIDIA said AI Enterprise 4.0 will be integrated into partner marketplaces, including Google Cloud and Microsoft Azure, and will also be available through NVIDIA cloud partner Oracle Cloud Infrastructure.
Additionally, MLOps providers, including Azure Machine Learning, ClearML, Domino Data Lab, Run:AI, and Weights & Biases, are adding integrations with the NVIDIA AI platform designed to simplify production-grade generative AI model development.