New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI



Launched today, NVIDIA Nemotron 3 Super is a 120‑billion‑parameter open model with 12 billion active parameters designed to run complex agentic AI systems at scale. 

Available now, the model applies advanced reasoning capabilities to help autonomous agents complete tasks efficiently and with high accuracy.

AI-Native Companies: Perplexity offers its users access to Nemotron 3 Super for search and as one of 20 orchestrated models in Computer. Companies offering software development agents like CodeRabbit, Factory and Greptile are integrating the model into their AI agents along with proprietary models to achieve higher accuracy at lower cost. And life sciences and frontier AI organizations like Edison Scientific and Lila Sciences will use the model to power their agents for deep literature search, data science and molecular understanding.

Enterprise Software Platforms: Industry leaders such as Amdocs, Palantir, Cadence, Dassault Systèmes and Siemens are deploying and customizing the model to automate workflows in telecom, cybersecurity, semiconductor design and manufacturing. 

As companies move beyond chatbots and into multi‑agent applications, they encounter two constraints.

The first is context explosion. Multi‑agent workflows generate up to 15x more tokens than standard chat because each interaction requires resending full histories, including tool outputs and intermediate reasoning. 

Over long tasks, this volume of context increases costs and can lead to goal drift, where agents lose alignment with the original objective.
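
To make the effect of resending full histories concrete, here is a minimal sketch. It assumes a hypothetical agent loop in which every step adds a fixed number of new tokens and every call resends everything produced so far; the function names and step sizes are illustrative, not measurements of any real system.

```python
def tokens_processed(steps: int, new_tokens_per_step: int) -> int:
    """Total input tokens sent when every call resends the whole history."""
    total = 0
    history = 0
    for _ in range(steps):
        history += new_tokens_per_step  # tool output or reasoning added this step
        total += history                # the full history is sent again
    return total


def blowup_factor(steps: int, new_tokens_per_step: int) -> float:
    """Ratio of tokens sent (with full resends) to tokens actually produced."""
    produced = steps * new_tokens_per_step
    return tokens_processed(steps, new_tokens_per_step) / produced


if __name__ == "__main__":
    for steps in (1, 10, 29):
        print(steps, blowup_factor(steps, 500))
```

Under this toy model the ratio grows linearly with the number of steps, as (steps + 1) / 2, which is why long workflows become disproportionately expensive.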

The second is the thinking tax. Complex agents must reason at every step, but using large models for every subtask makes multi-agent applications too expensive and sluggish for practical use.

Nemotron 3 Super has a 1‑million‑token context window, allowing agents to retain full workflow state in memory and preventing goal drift.

Nemotron 3 Super has set new standards, claiming the top spot on Artificial Analysis for efficiency and openness with leading accuracy among models of the same size. 

The model also powers the NVIDIA AI-Q research agent to the No. 1 position on DeepResearch Bench and DeepResearch Bench II leaderboards, benchmarks that measure an AI system’s ability to conduct thorough, multistep research across large document sets while maintaining reasoning coherence. 

Hybrid Architecture

Nemotron 3 Super uses a hybrid mixture‑of‑experts (MoE) architecture that combines three major innovations to deliver up to 5x higher throughput and up to 2x higher accuracy than the previous Nemotron Super model. 

  • Hybrid Architecture: Mamba layers deliver 4x higher memory and compute efficiency, while transformer layers drive advanced reasoning.
  • MoE: Only 12 billion of its 120 billion parameters are active at inference. 
  • Latent MoE: A new technique that improves accuracy by activating four expert specialists for the cost of one to generate the next token at inference.
  • Multi-Token Prediction: Predicts multiple future tokens simultaneously, resulting in 3x faster inference.
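
The idea that only a fraction of a mixture-of-experts model's parameters run per token can be sketched with generic top-k routing. This is a textbook-style illustration in plain Python, not NVIDIA's latent MoE implementation; the helper names and the toy experts are assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_scores)),
                   key=lambda i: router_scores[i], reverse=True)
    top = order[:k]
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_scores, k):
    """Weighted sum of the outputs of only the selected experts."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


def active_fraction(total_params, active_params):
    """Share of parameters used per token."""
    return active_params / total_params


if __name__ == "__main__":
    # Toy experts: each one just scales its input by a different factor.
    experts = [lambda x, s=s: s * x for s in range(1, 9)]
    scores = [0.2, 1.7, -0.3, 0.9, 0.0, 2.1, -1.2, 0.4]
    print(route_top_k(scores, 2))
    print(moe_forward(1.0, experts, scores, 2))
    print(active_fraction(120e9, 12e9))  # figures stated in the article
```

Experts that are not selected are never evaluated, so compute per token scales with the active parameters rather than the total.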

On the NVIDIA Blackwell platform, the model runs in NVFP4 precision. That cuts memory requirements and pushes inference up to 4x faster than FP8 on NVIDIA Hopper, with no loss in accuracy. 
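
As a rough back-of-the-envelope check on why lower precision cuts memory, the sketch below estimates weight storage at different bit widths. It is a simplification: it assumes every parameter is stored at the stated width and ignores scaling-factor metadata, the KV cache and activations, so real footprints will differ.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 120e9  # total parameter count stated in the article

if __name__ == "__main__":
    for name, bits in (("16-bit", 16), ("8-bit", 8), ("4-bit", 4)):
        print(f"{name}: {weight_memory_gb(PARAMS, bits):.0f} GB")
```

Halving the bits per parameter halves the weight footprint, which is the basic reason a 4-bit format needs less memory than an 8-bit one.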

Open Weights, Data and Recipes

NVIDIA is releasing Nemotron 3 Super with open weights under a permissive license. Developers can deploy and customize it on workstations, in data centers or in the cloud.

The model was trained on synthetic data generated using frontier reasoning models. NVIDIA is publishing the complete methodology, including over 10 trillion tokens of pre- and post-training datasets, 15 training environments for reinforcement learning and evaluation recipes. Researchers can further use the NVIDIA NeMo platform to fine-tune the model or build their own. 

Use in Agentic Systems

Nemotron 3 Super is designed to handle complex subtasks inside a multi-agent system. 

A software development agent can load an entire codebase into context at once, enabling end-to-end code generation and debugging without document segmentation. 
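
Whether a given codebase would actually fit in a 1-million-token window can be estimated before sending anything. The sketch below uses a rough rule of thumb of about four characters per token, which is an assumption and not the model's real tokenizer; the function names, the file extensions and the reserve size are likewise hypothetical.

```python
import os


def estimate_tokens(root: str, extensions: tuple,
                    chars_per_token: float = 4.0) -> int:
    """Rough token estimate for source files under root (heuristic only)."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    total_chars += len(fh.read())
    return int(total_chars / chars_per_token)


def fits_in_context(token_estimate: int, context_window: int = 1_000_000,
                    reserve: int = 50_000) -> bool:
    """True if the estimate leaves `reserve` tokens for instructions and output."""
    return token_estimate + reserve <= context_window


if __name__ == "__main__":
    estimate = estimate_tokens(".", (".py", ".md"))
    print(estimate, fits_in_context(estimate))
```

For an accurate count, the estimate should be replaced with the tokenizer that ships with the model being used.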

In financial analysis, it can load thousands of pages of reports into memory, eliminating the need to re-reason across long conversations, which improves efficiency.

Nemotron 3 Super has high-accuracy tool calling that ensures autonomous agents reliably navigate massive function libraries to prevent execution errors in high-stakes environments, like autonomous security orchestration in cybersecurity.

Availability

NVIDIA Nemotron 3 Super, part of the Nemotron 3 family, can be accessed at build.nvidia.com, Perplexity, OpenRouter and Hugging Face. Dell Technologies is bringing the model to the Dell Enterprise Hub on Hugging Face, optimized for on-premise deployment on the Dell AI Factory, advancing multi-agent AI workflows. HPE is also bringing NVIDIA Nemotron to its agents hub to help ensure scalable enterprise adoption of agentic AI. 

Enterprises and developers can deploy the model through several partners.

The model is packaged as an NVIDIA NIM microservice, allowing deployment from on-premises systems to the cloud.




How to watch Jensen Huang’s Nvidia GTC 2026 keynote


Nvidia kicks off its annual GTC developer conference in San Jose, California, next week with CEO Jensen Huang’s keynote scheduled for Monday at 11am PT / 2pm ET.

GTC — which stands for GPU Technology Conference — is Nvidia’s flagship annual event, where the chipmaker typically uses the spotlight to announce new products, champion partnerships, and lay out its vision for the future of computing. Huang’s keynote will focus on Nvidia’s role in the future of computing and AI. You can watch the two-hour address in person at the SAP Center or livestream the talk on the event’s website.

The broader three-day event is focused on what’s coming next for AI across industries including healthcare, robotics, and autonomous vehicles, among others.

On the software side, it’s rumored that Nvidia will release an open source platform for enterprise AI agents, dubbed NemoClaw, as originally reported by Wired. The platform would give businesses a structured way to build and deploy AI agents (software that can carry out multi-step tasks autonomously) and would position Nvidia to mirror similar offerings from companies like OpenAI.

On the hardware side, the company is also rumored to be releasing a new chip designed to accelerate the AI inference process — the process by which an AI model applies what it has learned to generate responses or make decisions, as distinct from the initial training process, which requires far more computing power. Faster, cheaper inference is widely seen as one of the last bottlenecks to scaling AI applications broadly. The chip, if confirmed, would represent Nvidia’s latest bid to dominate not just the training market, where it already commands an estimated 80% share, but the inference market as well, where competition from custom chips built by Google, Amazon and others is fast intensifying.

Kevin Cook, a senior equity strategist at Zacks Investment Research, told TechCrunch that attendees should also expect to learn what the company plans to do with its relationship with Groq, the inference company to which Nvidia reportedly paid $20 billion late last year to license its technology. There’s a lot of curiosity around this tie-up, given that Jonathan Ross, Groq’s founder; Sunny Madra, Groq’s president; and other members of the Groq team agreed to join Nvidia to help advance and scale that licensed tech.

There will, of course, also be a range of partnership announcements and demonstrations showcasing Nvidia’s AI capabilities across industries.


Survey Reveals AI Is Delivering Clear Return on Investment in Healthcare


AI is accelerating every aspect of healthcare — from radiology and drug discovery to medical device manufacturing and new treatment methods enabled by digital twins of the human body.

NVIDIA’s second annual “State of AI in Healthcare and Life Sciences” survey report reveals how the industry is moving from AI experimentation to execution, reaping return on investment (ROI) on core applications like medical imaging and drug discovery.

The industry is also embracing open source software and AI models to tackle specific use cases, as well as exploring using agentic AI to speed knowledge retrieval and research paper analysis.

Highlights from this year’s report include:

  • 70% of respondents said their organizations are actively using AI, up from 63% in 2024.
  • 69% said they’re using generative AI and large language models, up from 54%.
  • 82% said open source software and models are moderately to extremely important to their organizations’ AI strategy.
  • 47% said they’re using or assessing agentic AI.
  • 85% of executives said AI is helping increase revenue, and 80% said it’s helping reduce costs.

“Over the next 12-18 months, the most visible and scalable impact of AI will come from logistics and administrative streamlining,” said John Nosta, president of NostaLab, a healthcare think tank. “That’s where adoption curves are already steep — scheduling, documentation, coding, utilization management and care coordination.”

Read more below on some of the report’s key findings.

AI Adoption Ramps Up Across Healthcare and Life Sciences

AI adoption is up across every industry segment in this year’s survey — spanning digital healthcare, pharmaceutical and biotechnology, payers and providers, and medical technology and tools — with digital healthcare leading at 78%, followed by medical technology at 74%.

The top industry workload was generative AI and large language models, according to 69% of respondents. AI for data analytics and data science was the second most-used workload, followed by predictive analytics. New to the survey, agentic AI ranked fourth, with 47% of respondents saying they’re using or assessing AI agents.

“Scaling generative AI in healthcare starts with focusing on real clinical and operational problems, rather than the technology itself,” said Dr. Annabelle Painter, clinical AI strategy lead at Visiba U.K. “The organizations seeing impact are those that embed AI into existing workflows instead of layering AI on top as a separate tool.”

Healthcare and life sciences organizations are deploying these AI workloads across a variety of use cases, each specific to their primary functions. For example, 61% of respondents from medical technology said they’re using AI for medical imaging, such as radiologists using it to work more quickly and efficiently, while 57% from pharmaceutical and biotechnology said drug discovery is being driven by AI.

For the entire industry, the top AI use cases were clinical decision support (such as radiologists highlighting areas of concern on a scan), medical imaging and workflow optimization.

AI Budgets to Increase With Strong ROI

AI is helping healthcare and life sciences organizations become even better at their core competencies — underscoring strong ROI.

In addition to increasing annual revenue and reducing annual costs, AI is boosting back-office productivity through workflow optimization and is scaling across other key business operations such as patient interaction and administrative tasks.

For example, 57% of respondents from the medical technology segment reported seeing ROI from deploying AI for medical imaging. Nearly half (46%) of pharmaceutical and biotechnology respondents said AI for drug discovery and development was among their top ROI use cases.

The top ROI use case for digital healthcare providers was virtual health assistants and chatbots, according to 37%, while 39% of respondents from payers and providers (which include hospitals, primary care providers and insurance companies) cited administrative tasks and workflow optimization as their top area of ROI.

As a result of AI’s positive impact, 85% of respondents said their AI budgets would increase this year, with another 12% saying budgets would stay the same. For almost half of respondents (46%), AI spending will increase significantly, by more than 10%.

“Healthcare organizations that successfully integrate AI are those that explicitly fund and prioritize evaluation as a core operational function, ensuring AI delivers measurable improvements in safety, quality and patient care over time,” said Painter.

Using Open Source for Domain-Specific AI Deployment

Leaning into open source models and software allows enterprises to build domain-specific applications, lending them greater flexibility and efficiency while boosting business returns.

The healthcare industry has embraced open source, with 82% of survey respondents stating it’s moderately to extremely important to their AI strategy.

“Open models will shape the intellectual field,” said Nosta. “They are essential for exploration and for keeping the field honest. But in clinical environments where safety, liability and accountability are nonnegotiable, proprietary systems will remain necessary for validation, integration and trust. The key insight here is that discovery will be open, and deployment will demand stewardship.”

Download the “State of AI in Healthcare and Life Sciences: 2026 Trends” report for in-depth results and insights.


Know What Else Used a Lot of Energy? Human Civilization



At last week’s India AI Impact Summit in New Delhi, industry leaders convened to discuss the future of artificial intelligence and how best to squeeze it into parts of your life you haven’t even considered. Notably absent was Bill Gates, who dropped out hours before his scheduled keynote over the ongoing scrutiny about his presence in the Epstein Files (though he continues to deny any wrongdoing). While the convention was reportedly a bit chaotic, what with the protests and all, the luminaries from around the tech world present nonetheless kept things upbeat and optimistic, declaring “full steam ahead” on the technological hype train carrying our species and planet off a cliff.

Also in attendance was OpenAI’s Sam Altman, who earned numerous headlines over the course of the event for his words and antics. His buzz blitzkrieg started on Thursday at a seemingly easy photo-op layup with Indian Prime Minister Narendra Modi and other AI executives all raising their joined hands in a celebratory display of industry-wide solidarity. Altman and the former colleague and present CEO of Anthropic to his left, Dario Amodei, notably refused to complete the chain and hold each other’s hands, making for an all-too-poignant moment. Altman would continue to make news throughout the summit for his comments on the industry’s “urgent” need for global regulation and his sneaking suspicion that companies might actually be using AI as a scapegoat to whitewash their layoffs.

Ever the yapper, Altman has bagged yet another round of earned media for an interview with The Indian Express’ Anant Goenka, during which he posited some controversial rebuttals to concerns about AI’s environmental impact.

Altman started off by saying the claims about ChatGPT consuming “‘17 gallons of water for each query’ or whatever,” are “completely untrue, totally insane, no connection to reality,” before qualifying that, OK, maybe it was a valid concern when his company “used to do evaporative cooling in data centers.”

He went on to say that there is “fair” concern about the amount of energy data centers eat to crank out the most soulless slop you’ve ever seen, but suggested the onus of responsibility for dealing with AI’s ravenous appetite falls to the energy sector itself, which Altman feels needs to “move towards nuclear or wind and solar very quickly.”

Altman then stunned the crowd and firmly re-entered the discourse with a mind-blowing truth bomb for those who still felt AI was consuming too much energy.

“It also takes a lot of energy to train a human,” Altman rejoined euphorically. “It takes like 20 years of life, and all the food you eat before that time, before you get smart. And not only that, it took like the very widespread evolution of the hundred billion people that have ever lived and learned not to get eaten by predators and learned how to figure out science and whatever to produce you, and then you took whatever you took.”

It is true that every person and the sum total of human civilization have consumed a sizable amount of energy (and water) to get to where we are today. While the value comparison of a nascent tech industry and its models to the entirety of civilization and human beings may have elicited adulation at the summit, Altman got an icier reception from the internet. Social media quickly took to roasting the remarks as “dystopian” and “deeply antisocial and antihuman.”

Perhaps further illuminating the backlash, Altman’s energy comments butt up against the frustrating lack of transparency within the industry our collective futures now hinge upon. There are currently no regulations in place requiring data centers to disclose their water and energy consumption. Furthermore, center employees and business partners are typically muzzled by nondisclosure agreements. This has made the true expenditure levels tricky for reporters and researchers to pin down.

At least we’ve got Sam to keep us informed while waiting for some clarity about what’s actually going on and being used in those centers.

Survey Reveals AI Advances in Telecom: Networks and Automation in Driver’s Seat as Return on Investment Climbs


AI is accelerating the telecommunications industry’s transformation, becoming the backbone of autonomous networks and AI-native wireless infrastructure. At the same time, the technology is unlocking new business and revenue opportunities, as telecom operators accelerate AI adoption across consumers, enterprises and nations.

NVIDIA’s fourth annual “State of AI in Telecommunications” survey report unpacks these trends, underscoring strong AI adoption, impact and investment in the industry.

Highlights from the report include:

  • 90% said AI is helping increase annual revenue and drive down costs.
  • 77% said they expect to see AI-native networks launch before the deployment of 6G.
  • 65% of telecom operators said network automation is being driven by AI.
  • 60% said their organization is using or assessing generative AI, up from 49% in 2024.
  • 89% said open source models and software are important to their AI strategy.
  • 89% of telcos plan to boost AI spending in 2026, up from 65% a year ago.

“There is a seismic shift underway in the telecom industry driven by AI,” said Sebastian Barros, managing director of Circles, a Singapore-based telecommunications provider. “Communication service providers are converging on a new realization. Their role in society extends beyond moving bits across networks toward moving intelligence across local and regulated infrastructure. That transition defines the move from telco to ‘AICO’ — AI infrastructure companies operating at network proximity, not application vendors riding on top.”

Here are some more key findings from the report.

Tangible Revenue Impact and Return on Investment

The telecommunications industry is seeing a definitive revenue impact from the use of AI. Overall, about nine out of 10 respondents said AI is helping to increase revenue and reduce costs. Telecommunications operators, which represent about a quarter of the 1,000 responses in the survey, are also seeing the benefit, with 90% saying AI has had a positive impact on revenue and costs.

The top AI use cases cited for return on investment (ROI) were AI for autonomous networks (50%), followed by improved customer service (41%) and internal process optimization (33%).

“Autonomous networks deliver immediate ROI by eliminating human effort from repetitive, reactive workflows,” said Barros. “The fastest impact areas are energy management, fault prediction, configuration drift correction and capacity planning.”

This strong impact on revenue and ROI is leading telecommunications companies to increase their AI budgets in 2026. Overall, 89% of respondents said their AI budget will increase in the next 12 months, up from 65% in last year’s survey, with 35% saying their budgets would increase more than 10% from this year.

Focus on AI-Native Networks and Autonomous Operations

Network automation has overtaken customer experience as the leading use case for investment, deployment and ROI impact. This signals a bold step toward autonomous networks — AI-driven, self-managing systems that can self-configure, self-heal and self-optimize with minimal human intervention. Eighty-eight percent of organizations report being between levels 1-3 of autonomy, as defined by the TM Forum, and the use of generative AI and agentic AI is expected to accelerate the shift to level 5 autonomous networks.

“Autonomous networks are delivering return on investment faster than any other AI use case because they directly reduce outages, energy consumption and manual intervention,” said Chetan Sharma, CEO of Chetan Sharma Consulting. “Agentic AI accelerates this by coordinating decisions across domains in real time.”

A surge in edge computing investment is reshaping telecom network architectures, bringing AI inferencing closer to users through a distributed computing infrastructure. Telcos are stepping up investments in AI-native RAN and 6G — signaling a major industry intercept ahead of the traditional 6G deployment cycle, with 77% of respondents anticipating a much faster time to deployment of this new AI-native wireless network architecture.

The top drivers of investment are using AI to enhance spectral efficiency, improving the performance of the radio access network supporting edge AI applications and accelerating the research and development of 6G.

A Universal Boost in Productivity 

AI in telecommunications is advancing autonomous networks and business opportunities as well as improving internal operations. Nearly every respondent in the survey said AI is boosting employee productivity, with 26% citing major to significant improvements to their ability to complete more tasks with higher quality in less time.

The productivity gains are coming from generative and agentic AI solutions deployed across operations, from the back office to networks.

“Generative AI delivered fast productivity gains, but agentic AI is where telecoms begin to see structural ROI,” Sharma said. “Autonomous agents can act across networks, IT and customer journeys, turning insights into decisions without human delay.”

Download the “State of AI in Telecommunications 2026 Trends” report for in-depth results and insights.


India Fuels Its AI Mission With NVIDIA


India is the nexus of AI innovation this week as the host of the AI Impact Summit, which brings together global heads of state and industry to chart the future of AI.

At the summit, taking place in New Delhi, industry leaders, government agencies, educational institutions and startups are sharing how they’re working with NVIDIA to drive the AI industrial revolution in the world’s most populous country.

These initiatives support the IndiaAI Mission, a government effort that’s infusing India’s AI ecosystem with over $1 billion to bolster the nation’s compute capacity and foster the development of sovereign AI datasets, frontier models and applications. The mission also supports AI education, startup innovation and frameworks for trustworthy AI.

Read on for how NVIDIA is supporting IndiaAI Mission priorities.

NVIDIA Cloud Partners Boost India AI Infrastructure

To achieve its AI ambitions, India is investing heavily in its computing infrastructure. Under the IndiaAI Compute Pillar, the nation is building out its AI cloud offerings with systems including tens of thousands of NVIDIA GPUs.

NVIDIA is collaborating with next‑generation cloud providers Yotta, L&T and E2E Networks to deliver advanced AI factories to meet India’s growing need for AI compute and enable it to develop AI models and services that drive innovation.

  • Yotta is a hyperscale data center and cloud provider building large‑scale sovereign AI infrastructure for India, branded as Shakti Cloud, powered by over 20,000 NVIDIA Blackwell Ultra GPUs. Its campuses in Navi Mumbai and Greater Noida deliver GPU‑dense, high‑bandwidth AI cloud services on a pay‑per‑use model, designed to make advanced AI training and inference affordable and compliant for Indian enterprises and public sector customers.
  • Larsen & Toubro (L&T) is building sovereign, gigawatt-scale NVIDIA AI factory infrastructure in India to reinforce the country’s position as a global AI powerhouse in alignment with the IndiaAI Mission. The roadmap includes initial expansions in Chennai to 30 megawatts as well as a new 40-megawatt facility in Mumbai. These facilities will power sovereign cloud workloads and hyperscale deployments, delivering secure, energy‑efficient infrastructure for advanced AI applications.
  • E2E Networks is building an NVIDIA Blackwell GPU cluster on its TIR platform, hosted at the L&T Vyoma Data Center in Chennai. The TIR cloud compute platform will feature NVIDIA HGX B200 systems and NVIDIA Enterprise software as well as NVIDIA Nemotron open models to supercharge sovereign development across agentic AI, healthcare, finance, manufacturing and agriculture.

India’s AI cloud infrastructure will host workloads as well as manufacture intelligence for model training, fine-tuning and high‑scale inference. Capacity within these data centers will be reserved for model builders, startups, researchers and enterprises to build, fine-tune and deploy AI in India.

Further expanding access to NVIDIA AI infrastructure in India, Netweb Technologies is launching its Tyrone Camarero AI Supercomputing systems built on the NVIDIA Grace Blackwell architecture. The NVIDIA GB200 NVL4 platforms — manufactured in India by Netweb under the government’s “Make in India” mission — feature four NVIDIA Blackwell GPUs and two NVIDIA Grace CPUs to power scientific computing, model training and inference.

NVIDIA and India AI-Native Companies Build the Nation’s Frontier AI Models

Another key goal of the IndiaAI Mission — led by its Innovation Center Pillar — is to develop and deploy foundation models trained on India-specific data and domestic AI infrastructure.

For a nation as multilingual as India — with 22 constitutionally recognized languages and over 1,500 more recorded by the country’s census — frontier AI models are a powerful tool to help its more than 1.4 billion residents interact with technology in their primary language.

Organizations across the country are building AI applications with NVIDIA Nemotron to support public-sector services, financial systems and enterprise operations in multiple languages.

NVIDIA Nemotron open models, datasets, tools and libraries enable organizations to build frontier speech, language and multimodal models at scale and across languages for government, consumer and enterprise applications. It includes India-specific datasets like Nemotron-Personas-India, an open dataset built from publicly available census data using NeMo Data Designer that includes 21 million fully synthetic Indic personas to enable population-scale sovereign AI development.

Adopters in India of Nemotron — and NeMo Curator, an open library for multilingual and multimodal data curation — include:

  • BharatGen, a sovereign AI initiative supported by the Government of India aimed at strengthening the country’s multilingual and multimodal AI ecosystem. As part of this effort, BharatGen has developed a 17-billion-parameter mixture-of-experts (MoE) model from the ground up, using the NVIDIA NeMo framework for pretraining and the NeMo RL library for post-training. The open source models are designed to power applications across public services, agriculture, security and cultural preservation.
  • Chariot, a company building AI systems for speech and multimodal communication. Using the NeMo framework, Chariot is developing an 8-billion-parameter model for real-time text to speech, supporting applications that improve accessibility and digital interaction across consumer and enterprise use cases.
  • Commotion, backed by Tata Communications, which has developed an AI operating system to automate complex enterprise workflows. By integrating NVIDIA Nemotron models and speech capabilities, the platform enables governed, production-grade AI deployments, helping enterprises scale AI across critical business operations.
  • CoRover.ai, which has deployed NVIDIA Nemotron Speech open models and NVIDIA Riva libraries for end-to-end, ultralow-latency speech AI — including the NVIDIA Riva Whisper v3 model for multilingual automatic speech recognition in English, Hindi and Gujarati. Powering customer service applications for the Indian Railway Catering and Tourism Corporation, CoRover’s platform supports around 10,000 concurrent users and more than 5,000 daily ticket bookings.
  • Gnani.ai, which offers enterprises a multilingual agentic AI platform that can interact with customers through voice and text. Gnani is building a 14-billion-parameter speech-to-speech model on NVIDIA Nemotron Speech models, datasets and NeMo libraries through NVIDIA Cloud Partner E2E Networks, with plans to expand to a 32-billion-parameter model. By fine-tuning the NVIDIA Nemotron Speech model for Indic languages, Gnani has achieved a 15x reduction in inference costs, enabling the company to scale to support more than 10 million calls per day for customers in telecom, banking and hospitality.
  • National Payments Corporation of India (NPCI), which operates India’s retail payment and settlement systems and is deploying AI models to support digital financial services. Building on its production deployment of the AI-powered UPI Help Assistant — a pilot initiative for India’s Unified Payments Interface (UPI) — NPCI is exploring training FiMi, a financial model for India, using the NVIDIA Nemotron 3 Nano model and its own datasets. The model, fine-tuned with the NeMo framework, will support multilingual customer service across India’s banking ecosystem.
  • Sarvam.ai, a leader in full-stack sovereign generative AI that provides enterprise-grade multimodal, speech-to-text, text-to-speech, translation and reasoning models. The company is open sourcing its Sarvam-3 series of text and multimodal large language model variants, trained for 22 Indic languages, English, math and code. Sarvam is using NeMo Curator to construct high-quality multilingual training data while adopting a subset of NVIDIA Nemotron datasets. The foundation models were pre-trained from scratch across 3B, 30B and 100B parameter sizes using the NVIDIA NeMo framework and Megatron-LM, and post-trained with NeMo RL. Training was conducted on NVIDIA H100 GPUs through NVIDIA Cloud Partners, including Yotta. With these sovereign models, Sarvam.ai’s new Pravah platform enables production-grade inference for Indian government and enterprise applications.
  • Soket.ai, which is using a modern large-model training stack on open NVIDIA Nemotron technologies, including NVIDIA Megatron and NVIDIA NeMo. These open source components enable scalable experimentation, training stability and efficient GPU usage, while preserving full control over the model’s data, design and life cycle.
  • Tech Mahindra, which has developed an 8-billion-parameter foundation model tailored for Indian languages and dialects. The model, built with Nemotron, is being designed for use in classrooms, where it can help make educational materials available in a wider range of Indian languages including Hindi, Maithili and Dogri. The team generated synthetic data with Nemotron libraries and tools such as NeMo Data Designer and conducted supervised fine-tuning with NeMo AutoModel.
  • Zoho, which is advancing its Zia LLM platform with proprietary models built using NVIDIA NeMo on the NVIDIA Blackwell and Hopper platforms, integrated across its software-as-a-service applications. This privacy-first architecture delivers contextual, production-grade AI for critical business workflows like customer relationship management and finance, ensuring technology sovereignty and enterprise security at a global scale.

Developers building sovereign AI systems can access NVIDIA Nemotron and NeMo today. Nemotron models can be deployed anywhere on NVIDIA-accelerated infrastructure, including on NVIDIA DGX Spark, which is now available in India through qualified partners including PNY, RP tech India and Tech Data, a TD SYNNEX Company, as well as on NVIDIA Marketplace. A version manufactured in India as part of the “Make in India” initiative is available through Netweb.

DGX Spark also runs sovereign AI models by Indian model builders including Sarvam.ai.

Government and Academic Partnerships to Support Research in AI for Science and Engineering

Under its Application Development Initiative Pillar, the IndiaAI Mission is supporting high-impact AI applications — and its Startup Financing Pillar aims to democratize funding availability for AI entrepreneurs across the country.

NVIDIA is collaborating with government agencies, research institutions, venture capital firms and startups to advance projects aligned with these goals.

NVIDIA is collaborating with the Anusandhan National Research Foundation (ANRF), a statutory body under the Indian government, to spur even more cutting-edge AI research across the nation’s leading academic institutions. The initiative will support ANRF’s AI for Science & Engineering program and future AI programs.

NVIDIA will offer ANRF grantee institutions complimentary access to NVIDIA AI Enterprise software and specialized technical mentorship through the NVIDIA AI Technology Center. The collaboration will also include AI bootcamps, workshops and hackathons to strengthen India’s AI research ecosystem.

NVIDIA is also partnering with prominent venture capital firms including Peak XV, Z47, Elevation Capital, Nexus Venture Partners and Accel India to identify and fund promising startups at all stages that are building AI solutions for India and international use. More than 4,000 of India’s AI startups are already part of the NVIDIA Inception program.

For more from the India AI Summit, learn how NVIDIA and global industrial software leaders are partnering with India’s largest manufacturers — and how India’s global systems integrators are building enterprise AI agents with NVIDIA.

Bitcoin biopic starring Casey Affleck to use AI to generate locations and tweak performances


Killing Satoshi, an upcoming biopic about the elusive creator of Bitcoin, will reportedly rely heavily on artificial intelligence to generate locations and adjust actors’ performances, Variety reports. The film was announced in 2025 as being directed by Doug Liman (The Bourne Identity, Edge of Tomorrow) and starring Casey Affleck and Pete Davidson in undisclosed roles, but its connection to overhyped technology was previously understood to begin and end with cryptocurrency.

According to a UK casting notice viewed by Variety, the producers of Killing Satoshi reserve the right to “change, add to, take from, translate, reformat or reprocess” actors’ performances, using “generative artificial intelligence (GAI) and/or machine learning technologies.” No digital replicas will be created of performers, but it sounds like plenty of other AI-driven tweaks are on the table. The production’s use of AI will also extend to the setting of its shoots, per Variety’s source. Killing Satoshi will be shot on a “markerless performative capture stage” and things like backgrounds and locations will be entirely generated by AI.

Your guess is as good as mine as to why a film about blockchain technology needs to be filmed this way, but Doug Liman has been connected with plenty of unusual projects in the past, including a rumored Tom Cruise film that was supposed to film on the International Space Station. Killing Satoshi will be far less practical in comparison, and it will walk a much finer line of what’s acceptable in the entertainment industry.

A major sticking point in SAG-AFTRA’s 2023 contract negotiations was guaranteeing protections for actors who could be replaced by AI. Equity, the union representing actors in the UK, is currently negotiating protections for members who are concerned that studios could use AI to reproduce their likenesses and voices without their consent.

Long Delayed Siri Functions Are Reportedly Being Delayed Once Again Because They’re Slow and Inaccurate



Mark Gurman, Bloomberg’s Apple scoops guy, says the development of the latest version of Siri is not looking good in tests. It’s apparently going badly enough that Apple will release only a partial version when the updated voice assistant debuts in the next version of iOS. To be clear, the iOS 26.4 update is still expected to arrive next month, and it’s still expected to have a new version of Siri, but it may be a bit of a letdown.

That’s not good for Apple. Perhaps you’ll recall that Apple has been advertising a version of Siri that works as a smart, seamless, automated personal assistant in your pocket for a long time. Apple even released a commercial about this, starring Bella Ramsey, in fall of 2024.

But that ad had to be pulled because Apple couldn’t ship a real-life version of what it depicted. Asking Siri questions as if it’s a chatbot and then getting good answers drawn from your information across multiple apps is a function that certainly feels possible based on existing technology. But it’s now 2026 and Apple still hasn’t released that version of Siri.

And as I wrote late last month, Apple is perceived as needing to notch a win in the AI area after falling way behind Google in AI authority. The AI model driving the new, still unreleased, Siri is essentially rented from Google for $1 billion per year. And who knows, perhaps Google’s model is the culprit behind the latest problems with Siri, but it’s hard to picture consumers blaming Google if Apple can’t execute a solid new Siri product.

Gurman’s sources tell him tests of the new Siri found that it processes queries incorrectly, and that it sometimes takes “too long”—too long for what? We don’t get to know, but it’s clearly slow. Gurman points to the feature from the Bella Ramsey ad in which the AI mines answers from your personal data, and answers questions like “What was that Greek restaurant Larry told me to try?” as one likely to be delayed past iOS 26.4.

If it’s iOS 26.5 that eventually gets the Bella Ramsey version of Siri, and the user interface ends up being designed like the working version of that operating system that Apple employees are using to perform tests, Gurman says there may be an optional toggle allowing the user to “preview” that new Siri version, meaning it’ll be framed as something that the user can try at their own peril.

So ostensibly, these Siri features aren’t being cancelled or eliminated, but delayed. Apple will, Gurman says, release some sort of partial Siri update in March with iOS 26.4, and then the rest of the new Siri features will be sprinkled into the 26.5 update in May, and the larger update to iOS 27 in September, when the iPhone 18 line is scheduled to roll out. Though this “remains a fluid situation, and Apple’s plans may change further,” Gurman writes.

Apparently, according to Gurman, another delayed feature will be Siri-based voice controls for “App Intents,” a new framework for controlling apps that Apple says will perform an “increasingly critical role within Apple’s developer platforms.” This delay may not be grieved by developers, who, judging from X posts, don’t seem super eager to figure out how to use it.

Nemotron Labs: How AI Agents Are Turning Documents Into Real-Time Business Intelligence


Editor’s note: This post is part of the Nemotron Labs blog series, which explores how the latest open models, datasets and training techniques help businesses build specialized AI systems and applications on NVIDIA platforms. Each post highlights practical ways to use an open stack to deliver value in production — from transparent research copilots to scalable AI agents.

Businesses today face the challenge of uncovering valuable insights buried within a wide variety of documents — including reports, presentations, PDFs, web pages and spreadsheets.

Often, teams piece together insights by manually reviewing files, copying data into spreadsheets, building dashboards and using basic search or template-based optical character recognition (OCR) tools that frequently miss important details in complex media.

Intelligent document processing is an AI-powered workflow that automatically reads, understands and extracts insights from documents. It interprets rich formats inside those documents — including tables, charts, images and text — using AI agents and techniques like retrieval-augmented generation (RAG) to turn the multimodal content into insights that other multi-agent systems and people can easily use.

With NVIDIA Nemotron open models and GPU-accelerated libraries, organizations can build AI-powered document intelligence systems for research, financial services, legal workflows and more.

These open models, datasets and training recipes have powered strong results on leaderboards such as MTEB, MMTEB and ViDoRe V3, benchmarks for evaluating multilingual and multimodal retrieval models. Teams can choose from among the best models for tasks like search and question answering.

How Document Processing Streamlines Business Intelligence

Document intelligence systems that can pull meaning from complex layouts, scale to huge file libraries and show exactly where an answer came from are incredibly useful in high-stakes environments. These systems:

  • Understand rich document content, moving beyond simple text scraping to capture information from charts, tables, figures and mixed-language pages and treating documents as a human would by recognizing structure, relationships and context.
  • Handle large quantities of shifting data, ingesting and processing massive collections of documents in parallel, and keeping knowledge bases continuously up to date.
  • Find exactly what users need, helping AI agents pinpoint the most relevant passages, tables or paragraphs to a query so they can respond with precision and accuracy.
  • Show the evidence behind answers by providing citations to specific pages or charts so teams can gain transparency and auditability, which is critical in regulated industries.

The result is a shift from static document archives to living knowledge systems that directly power business intelligence, customer experiences and operational workflows.

Document Intelligence at Work

Intelligent document processing systems built on NVIDIA Nemotron RAG models, Nemotron Parse and accelerated computing are already reshaping how organizations across industries gain insights from their documents.

Justt: AI-Native Chargeback Management and Dispute Optimization

In financial services, payment disputes create significant revenue loss and operational complexity for merchants, largely because the evidence needed to handle them lives in unstructured formats. Transaction logs, customer communications and policy documents are often fragmented across systems and difficult to process at scale, making dispute handling slow, manual and costly.

Justt.ai provides an AI-driven platform that automates the full chargeback lifecycle at scale. The platform connects directly to payment service providers and merchant data sources to ingest transaction data, customer interactions and policies, then automatically assembles dispute-specific evidence that aligns with card network and issuer requirements.

The platform’s AI-powered dispute optimization, powered by Nemotron Parse, applies predictive analytics to determine which chargebacks to fight or accept, and how to optimize each response for maximum net recovery. Leading hospitality operators like HEI Hotels & Resorts use the platform to automate dispute handling across their properties, recapturing revenue while maintaining guest relationships.

By pairing document-centric intelligence with decision automation, merchants can recapture a significant portion of revenue lost to illegitimate chargebacks while reducing manual review effort.

Read about how Justt’s chargeback management tool autonomously processes financial data to handle disputes for merchants.

Docusign: Scaling Agreement Intelligence

Docusign is the global leader in Intelligent Agreement Management, handling millions of transactions every day for more than 1.8 million customers and over 1 billion users.

Agreements are the foundation of every business, but the critical information they contain is often buried inside pages of documents. To surface the information, Docusign needed high-fidelity extraction of tables, text and metadata from complex documents like PDFs so organizations could understand and act on obligations, risks and opportunities faster.

Docusign is evaluating Nemotron Parse for deeper contract understanding at scale. Running on NVIDIA GPUs, the model combines advanced AI with layout detection and OCR. The system can reliably interpret complex tables and reconstruct them with the required information. This reduces the need for manual corrections and helps ensure that even the most complex contracts are processed with the speed and accuracy Docusign’s customers expect.

With this foundation, Docusign will transform agreement repositories into structured data that powers contract search, analysis and AI-driven workflows — turning agreements into business assets that help organizations and their teams improve visibility, reduce risk and make faster decisions.

Edison Scientific: Research Across Massive Literature Scale

Edison Scientific’s Kosmos AI Scientist helps researchers navigate complex scientific landscapes to synthesize literature, identify connections and surface evidence.

Edison needed a way to rapidly and accurately extract structured information from large volumes of PDFs, including equations, tables and figures that traditional information parsing methods often mishandle.

By integrating the NVIDIA Nemotron Parse model into its PaperQA pipeline, Edison can decompose research papers, index key concepts and ground responses in specific passages, improving both throughput and answer quality for scientists. This approach turns a sprawling research corpus into an interactive, queryable knowledge engine that accelerates hypothesis generation and literature review.

The high efficiency of Nemotron Parse enables cost-efficient serving at scale, allowing Edison’s team to unlock the whole multimodal pipeline.

Designing an Intelligent Document Processing Application With NVIDIA Technologies

A robust, domain-specific document intelligence pipeline requires technologies that can handle data extraction, embedding and reranking, while keeping the data secure and compliant with regulations.

  • Extraction: Nemotron extraction and OCR models rapidly ingest multimodal PDFs, text, tables, graphs and images to convert them into structured, machine-readable content while preserving layout and semantics.
  • Embedding: Nemotron embedding models convert passages, entities and visual elements into vector representations tuned for document retrieval, enabling semantically accurate search.
  • Reranking: Nemotron reranking models evaluate candidate passages to ensure the most relevant content is surfaced as context for large language models (LLMs), improving answer fidelity and reducing hallucinations.
  • Parsing: Nemotron Parse models decipher document semantics to extract text and tables with precise spatial grounding and correct reading flow. Overcoming layout variability, they turn unstructured documents into actionable data that enhances the accuracy of LLMs and agentic workflows.

These capabilities are packaged as NVIDIA NIM microservices and foundation models that run efficiently on NVIDIA GPUs, allowing teams to scale from proof of concept to production while keeping sensitive data within their chosen cloud or data center environment.
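To make the flow between these stages concrete, here is a minimal sketch in plain Python. It does not call any Nemotron model or NIM microservice: the extraction step is a simple paragraph splitter, the embedding is a bag-of-words count vector and the reranker scores query-term coverage. All function names are hypothetical stand-ins chosen for illustration, and a real pipeline would swap each one for the corresponding model.

```python
import math
import re
from collections import Counter


def extract(raw_docs):
    """Stand-in for extraction/parsing: split each document into passages."""
    passages = []
    for doc_id, text in raw_docs.items():
        for index, chunk in enumerate(text.split("\n\n")):
            chunk = chunk.strip()
            if chunk:
                passages.append({"doc": doc_id, "passage": index, "text": chunk})
    return passages


def embed(text):
    """Stand-in for an embedding model: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=3):
    """Return the k passages whose vectors are closest to the query vector."""
    query_vec = embed(query)
    ranked = sorted(
        passages, key=lambda p: cosine(query_vec, embed(p["text"])), reverse=True
    )
    return ranked[:k]


def rerank(query, candidates):
    """Stand-in for a reranking model: fraction of query terms a passage covers."""
    query_terms = set(embed(query))

    def coverage(passage):
        if not query_terms:
            return 0.0
        return len(query_terms & set(embed(passage["text"]))) / len(query_terms)

    return sorted(candidates, key=coverage, reverse=True)


def answer_context(query, raw_docs, k=3):
    """Run extract -> embed/retrieve -> rerank and keep source citations."""
    passages = extract(raw_docs)
    top = rerank(query, retrieve(query, passages, k))
    return [(p["doc"], p["passage"], p["text"]) for p in top]
```

Each returned tuple carries the document name and passage index alongside the text, which is the hook a downstream LLM prompt could use to cite where an answer came from.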

The most effective AI systems use a mix of frontier models and open source models like NVIDIA Nemotron, with an LLM router analyzing each task and automatically selecting the model best suited for it. This approach keeps performance strong while managing computing costs and improving efficiency.
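As a rough illustration of the routing idea, the sketch below picks the cheapest model that is trusted with a task's estimated difficulty. Everything in it is an assumption made for the example: the model labels, the cost figures, the difficulty tiers and the keyword heuristic are hypothetical, and a production router would more likely rely on a trained classifier than on keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical label, not a real model identifier
    cost_per_1k_tokens: float  # illustrative number only
    max_difficulty: int        # highest difficulty tier this model is trusted with


def estimate_difficulty(task: str) -> int:
    """Toy keyword heuristic that maps a task description to a tier from 1 to 3."""
    text = task.lower()
    hard_markers = ("prove", "multi-step", "plan", "legal analysis")
    medium_markers = ("summarize", "compare", "explain")
    if any(marker in text for marker in hard_markers):
        return 3
    if any(marker in text for marker in medium_markers) or len(text.split()) > 40:
        return 2
    return 1


def route(task: str, options: list[ModelOption]) -> ModelOption:
    """Choose the cheapest model whose tier covers the task's difficulty."""
    difficulty = estimate_difficulty(task)
    capable = [m for m in options if m.max_difficulty >= difficulty]
    if not capable:
        # Nothing is rated for this tier, so fall back to the strongest model.
        return max(options, key=lambda m: m.max_difficulty)
    return min(capable, key=lambda m: m.cost_per_1k_tokens)
```

With three options rated for tiers 1, 2 and 3, a simple field-extraction request would go to the cheapest model, while a request that asks for a multi-step plan would go to the most capable one.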

Get Started With NVIDIA Nemotron

Access a step-by-step tutorial on how to build a document processing pipeline with RAG capabilities. Explore how Nemotron RAG can power specialized agents tailored for different industries.

Plus, experiment with Nemotron RAG models and the NVIDIA NeMo Retriever open library, available on GitHub and Hugging Face, as well as Nemotron Parse on Hugging Face.

Join the community of developers building with the NVIDIA Blueprint for Enterprise RAG — trusted by a dozen industry-leading AI Data Platform providers and available now on build.nvidia.com, GitHub and the NGC catalog.

Stay up to date on agentic AI, NVIDIA Nemotron and more by subscribing to NVIDIA AI news, joining the community and following NVIDIA AI on LinkedIn, Instagram, X and Facebook.  

Explore self-paced video tutorials and livestreams.



Everything Will Be Represented in a Virtual Twin, Jensen Huang Says at 3DEXPERIENCE World



At 3DEXPERIENCE World in Houston, NVIDIA founder and CEO Jensen Huang and Dassault Systèmes CEO Pascal Daloz laid out a blueprint for industrial AI rooted in physics-based “world models” — systems designed to simulate products, factories and even biological systems before they’re built.

“Artificial intelligence will be infrastructure,” like water, electricity and the internet, Huang told the crowd, playfully referring to the engineering-heavy audience as “Solid Workers,” a nod to Dassault Systèmes’ SolidWorks platform.

The announcement continues a collaboration spanning more than a quarter century between NVIDIA and Dassault Systèmes.

“This is the largest collaboration our two companies have ever had in over a quarter century,” Huang said. “We’re going to fuse these technologies so engineers can work at a scale that’s 100 times, 1,000 times — and eventually a million times greater than before.”

The new partnership brings NVIDIA accelerated computing and AI libraries together with Dassault Systèmes’ Virtual Twin platforms to move more engineering work into real-time digital workflows, powered by AI companions that help teams explore, validate, prototype and iterate faster.

Huang framed the shift as a reinvention of the computing stack: moving from hand-specified, structured digital designs to systems that can generate, simulate and optimize in software — at industrial scale.

From Digital Models to Industry World Models

Virtual twins are not applications, “they are knowledge factories,” Daloz said.

The partnership aims to establish industry world models — science-validated AI systems grounded in physics that can serve as mission-critical platforms across biology, materials science, engineering and manufacturing.

In Daloz’s framing, the value moves upstream: virtual twins become the place where knowledge is created, tested, and trusted — before anything is built in the physical world.

Dassault Systèmes, whose 3DEXPERIENCE platform serves more than 45 million users and 400,000 customers globally, has long been a leader in virtual twin technology — digital replicas that let engineers simulate products and processes before building them physically.

The collaboration brings together accelerated computing, AI and digital twin technologies so engineers can design not only geometry, but behavior — and explore radically larger design spaces earlier in development.

Together, the companies outlined how this shared architecture will show up across science, engineering and manufacturing workflows:

  • Advancing Biology and Materials Research: The NVIDIA BioNeMo platform and BIOVIA science-validated world models accelerate the discovery of new molecules and next-generation materials.
  • AI-Driven Design and Engineering: SIMULIA AI-based Virtual Twin Physics Behavior leveraging NVIDIA CUDA-X libraries and AI physics libraries empowers designers and engineers to accurately and instantly predict outcomes.
  • Virtual Twins for Every Factory: NVIDIA Omniverse physical AI libraries integrated into the DELMIA Virtual Twin enable autonomous, software-defined production systems.
  • Virtual Companions Supercharge Dassault Systèmes’ Users: The 3DEXPERIENCE agentic platform, combining NVIDIA AI technologies and NVIDIA Nemotron open models with Dassault Systèmes’ Industry World Models, powers Virtual Companions to tap into deep industrial context, delivering trusted, actionable intelligence.

Huang said that in domains like biology and materials, the frontier is learning the underlying “language” of complex systems and then generating new options that can be evaluated and validated in simulation.

Designing and Operating the Factory in Software

A central theme of the discussion was how factories themselves are changing — from static physical assets to living systems that are designed, simulated and operated as virtual twins.

As part of the partnership, Dassault Systèmes is deploying NVIDIA-powered AI factories on three continents through its OUTSCALE sovereign cloud, enabling customers to run AI workloads while maintaining data residency and security requirements.

Both executives emphasized that the goal isn’t to replace engineers — it’s to amplify them. As AI agent companions take on more exploratory and repetitive tasks, designers and engineers gain leverage and creativity, not redundancy.

AI Companions That Expand Human Creativity

Every designer will have a “team of companions,” Huang said — a shift he described as fundamentally positive for engineers, software platforms and the broader ecosystem built on them.

For the tens of millions of engineers who use Dassault Systèmes tools to design everything from aircraft to consumer packaged goods, the shift isn’t about replacing human creativity — it’s about expanding it.

“Success is not about automation,” Daloz said. “[Engineers] don’t want to automate the past — they want to invent the future.”

Looking ahead, Daloz framed the partnership as about more than performance gains – it’s an effort to open new possibilities, help companies eliminate bad choices before they become expensive mistakes, and create entirely new categories of products.

“Virtual twins and the 3D Universes are not applications,” Daloz said. “They are knowledge factories.”

The fireside conversation between Huang and Daloz was broadcast live from 3DEXPERIENCE World.