On Windows PCs, Nvidia is launching Chat with RTX, which lets users create customized local AI chatbots.
It is the company's latest attempt to make AI on its graphics processing units (GPUs) widely accessible.
With the launch of Chat with RTX, a tool that showcases the capabilities of TensorRT-LLM software and retrieval-augmented generation (RAG), users can now run personalized generative AI directly on their local devices. Because conversations never leave the machine, the tool also protects user privacy and reduces the load on data center processing power.
Chatbots, which usually rely on cloud servers running Nvidia GPUs, have become an essential part of millions of people's daily interactions worldwide. The Chat with RTX tech demo changes this paradigm, letting users run generative AI locally on a GeForce RTX 30 Series GPU or higher with a minimum of 8GB of video random access memory (VRAM).
A personalized AI experience
According to Nvidia, Chat with RTX is more than just a chatbot: it is a personalized AI companion that users can tailor with their own content. By harnessing the capabilities of local GeForce-powered Windows PCs, users can enjoy the benefits of generative AI with unprecedented speed and privacy, the company says.
The solution uses TensorRT-LLM software, RAG, and Nvidia RTX acceleration to deliver prompt, contextually relevant responses based on local datasets. Users can connect local files on their PCs, which then serve as a dataset for open-source large language models such as Llama 2 or Mistral.
Users can type natural-language queries, such as asking for a restaurant recommendation or other tailored information, and Chat with RTX will quickly scan the dataset and return an answer with context, saving them the trouble of sorting through multiple files. The application is flexible and easy to use, supporting a wide range of file types, including .txt, .pdf, .doc/.docx, and .xml.
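The retrieve-then-answer flow described above can be sketched in a few lines of Python. This is a minimal sketch, not Nvidia's implementation: it reads only plain .txt files and scores chunks by simple word overlap in place of the embedding-based retrieval in a real TensorRT-LLM RAG pipeline, and all function names are illustrative.

```python
from pathlib import Path

def load_chunks(folder, chunk_words=120):
    """Split every .txt file in a folder into fixed-size word chunks."""
    chunks = []
    for path in Path(folder).glob("*.txt"):
        words = path.read_text(encoding="utf-8").split()
        for i in range(0, len(words), chunk_words):
            chunks.append(" ".join(words[i:i + chunk_words]))
    return chunks

def retrieve(question, chunks, top_k=2):
    """Rank chunks by word overlap with the question (a stand-in for embeddings)."""
    q = set(question.lower().split())
    scored = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return scored[:top_k]

def build_prompt(question, chunks):
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n---\n".join(retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The resulting prompt would then be passed to a locally hosted model such as Llama 2 or Mistral; the grounding in retrieved local text is what lets the model answer from the user's own files.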
Combining various media sources
According to Nvidia, what sets Chat with RTX apart is its ability to incorporate data from multimedia sources, including YouTube videos and playlists.
Users can fold knowledge from video content into their chatbot, enabling contextual queries. For example, they can pull quick lessons and how-tos from educational resources, or search for travel recommendations based on videos from their favorite influencers.
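Video content fits the same retrieval pattern once transcripts carry timestamps. The sketch below is illustrative, assuming transcripts have already been fetched as (start_seconds, text) segments (for instance via a transcript API) and again using word overlap as a stand-in for real embedding retrieval, so an answer can point back to a moment in the video:

```python
def chunk_transcript(segments, window=60.0):
    """Group (start_seconds, text) transcript segments into fixed time
    windows, keeping each window's start so answers can cite a timestamp."""
    chunks, current, current_start = [], [], 0.0
    for start, text in segments:
        if current and start - current_start >= window:
            chunks.append((current_start, " ".join(current)))
            current = []
        if not current:
            current_start = start
        current.append(text)
    if current:
        chunks.append((current_start, " ".join(current)))
    return chunks

def find_moment(question, chunks):
    """Return the (timestamp, text) chunk that best matches the question."""
    q = set(question.lower().split())
    return max(chunks, key=lambda c: len(q & set(c[1].lower().split())))
```

Keeping the window's start time is the key design choice: the chatbot can then answer "where do they recommend ramen?" with both the text and the second of the video it came from.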
The application's local processing delivers fast results and, more significantly, keeps user data on the device. By doing away with cloud-based services, Chat with RTX lets users handle sensitive data without an internet connection and without sharing it with third parties.
Future prospects and system requirements
To use Chat with RTX, users need Windows 10 or 11, the latest Nvidia GPU drivers, a GeForce RTX 30 Series GPU or later, and at least 8GB of VRAM.
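Those minimums can be expressed as a small pre-flight check. This is a minimal sketch, assuming the CSV output of `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`; the parsing and threshold logic are illustrative, not Nvidia's actual installer check:

```python
import re

def check_gpu(smi_line):
    """Parse one CSV line of nvidia-smi output, e.g.
    'NVIDIA GeForce RTX 4090, 24564 MiB', and apply Chat with RTX's
    published minimums: RTX 30 Series or newer with at least 8GB of VRAM."""
    name, mem = (part.strip() for part in smi_line.split(","))
    series = re.search(r"RTX (\d)0\d0", name)          # leading digit = series generation
    vram_gb = int(re.search(r"(\d+)\s*MiB", mem).group(1)) / 1024
    return bool(series) and int(series.group(1)) >= 3 and vram_gb >= 8
```

An RTX 4090 with 24GB passes, while an RTX 2060 with 6GB or a non-RTX card fails the check.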
Developers can explore accelerating large language models (LLMs) with RTX GPUs through the TensorRT-LLM RAG developer reference project hosted on GitHub. Nvidia is also inviting developers to enter the Generative AI on Nvidia RTX developer contest, running until February 23, with prizes including a GeForce RTX 4090 GPU and a full conference pass to Nvidia GTC.