Nvidia unveiled a prototype AI avatar at CES 2025 that lives on your PC’s desktop. The AI assistant, R2X, looks like a video game character, and it can help you navigate apps on your computer.
The R2X avatar is rendered and animated using Nvidia’s AI models, and users can run the avatar on popular LLMs of their choice, such as OpenAI’s GPT-4o or xAI’s Grok. Users can talk with R2X through text and voice, upload files to it for processing, or even enable the AI assistant to view what’s happening live on their screen or camera.
Tech companies have been creating a lot of AI avatars lately, not just in video games but also for enterprise and consumer customers. The early demos are strange, but some think these avatars are a promising user interface for AI assistants. With R2X, Nvidia is trying to combine generative video game capabilities with cutting-edge LLMs to create an AI assistant that looks and feels like a human.
The company plans to open-source these avatars in the first half of 2025. Nvidia sees this as a new user interface for developers to build with, allowing users to plug in their favorite AI software products or even run these avatars locally.
Much like Microsoft’s Recall feature (which has been delayed due to privacy concerns), R2X can take constant screenshots of your screen and run them through an AI model for processing, though this feature is turned off by default. When on, it can offer feedback on applications running on your computer and, for example, help you work through a complex coding task.
R2X is still a prototype, and even Nvidia admits there are still some bugs to work out. In demos with TechCrunch, Nvidia’s avatar had an uncanny-valley feel to it: its face sometimes got stuck in odd positions, and its tone felt a little aggressive at times. And broadly, I find it a little odd to have a humanoid avatar stare at me while I work.
R2X usually offered helpful instructions and accurately saw what was on the screen. But at one point, the avatar gave us incorrect instructions, and later on, it stopped being able to view the screen at all. This may be an issue with the underlying AI model (in this case, GPT-4o), but the example shows the limitations of this early technology.
In one demo, an Nvidia product lead showed how R2X can view, and assist users with, the apps on their screen. Specifically, R2X helped us use Adobe Photoshop’s generative fill feature. The photo we selected was of Nvidia CEO Jensen Huang standing in an Asian restaurant with two restaurant workers. Nvidia’s avatar hallucinated and gave the wrong instructions for where to find the generative fill feature in Photoshop. It later lost the ability to view the screen, but after we switched the AI model to xAI’s Grok, the avatar regained its screen-viewing abilities.
In another demo, R2X was able to ingest a PDF from the desktop and then answer questions about it. This process is powered by a local retrieval-augmented generation (RAG) feature, which gives these AI avatars the ability to pull information from a document and process it using the underlying LLM.
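Nvidia did not detail how its local RAG feature works, so the following is only a toy illustration of the general retrieve-then-prompt idea, not Nvidia’s implementation. It assumes the PDF text has already been extracted, scores chunks with simple word-count cosine similarity (real systems typically use learned embeddings), and stops at building the prompt that would be handed to the underlying LLM. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, max_words=40):
    """Split extracted document text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, top_k=2):
    """Return up to top_k chunks that share vocabulary with the question."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine(q_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_prompt(question, passages):
    """Assemble the prompt that would be sent to the underlying LLM."""
    context = "\n---\n".join(passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    # Stand-in for text extracted from a PDF (assumed, not real data).
    chunks = [
        "The warranty lasts two years from purchase.",
        "Shipping takes five business days.",
        "Returns are accepted within thirty days.",
    ]
    question = "How long does the warranty last?"
    print(build_prompt(question, retrieve(question, chunks, top_k=1)))
```

The point of the pattern is that only the passages relevant to the question reach the model, which is what lets an assistant answer questions about a document too long to include in full.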
Nvidia is using some AI models from its video game division to power the way these avatars look. To generate avatars, Nvidia uses its RTX Neural Faces algorithm. To automate the face, lip, and tongue movement, Nvidia is using a new model called Audio2Face™-3D. That model seemed to stall at some points, holding the avatar’s face in awkward positions.
The company also says these R2X avatars will be able to join Microsoft Teams meetings, acting as a personal assistant.
An Nvidia product lead says the company is working to give these AI avatars agentic abilities as well, so that R2X could one day take actions on your desktop. These abilities seem to be a long way off, and they would likely require partnerships with software makers like Microsoft and Adobe, which are trying to develop similar agentic systems themselves.
It’s not immediately clear how Nvidia is generating the voices in these products. R2X’s voice when using GPT-4o sounds distinct from any of ChatGPT’s preset voices, while xAI’s Grok chatbot doesn’t have a voice mode at all yet.