New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI



Launched today, NVIDIA Nemotron 3 Super is a 120‑billion‑parameter open model with 12 billion active parameters designed to run complex agentic AI systems at scale. 

Available now, the model pairs advanced reasoning with efficient execution, completing tasks with high accuracy for autonomous agents.

AI-Native Companies: Perplexity offers its users access to Nemotron 3 Super for search and as one of 20 orchestrated models in Computer. Companies offering software development agents like CodeRabbit, Factory and Greptile are integrating the model into their AI agents along with proprietary models to achieve higher accuracy at lower cost. And life sciences and frontier AI organizations like Edison Scientific and Lila Sciences will use the model to power their agents for deep literature search, data science and molecular understanding.

Enterprise Software Platforms: Industry leaders such as Amdocs, Palantir, Cadence, Dassault Systèmes and Siemens are deploying and customizing the model to automate workflows in telecom, cybersecurity, semiconductor design and manufacturing. 

As companies move beyond chatbots and into multi‑agent applications, they encounter two constraints.

The first is context explosion. Multi‑agent workflows generate up to 15x more tokens than standard chat because each interaction requires resending full histories, including tool outputs and intermediate reasoning. 

Over long tasks, this volume of context increases costs and can lead to goal drift, where agents lose alignment with the original objective.
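The compounding effect described above can be made concrete with a small sketch (illustrative only, not NVIDIA code, with an assumed per-turn token count): because each call resends the full history, the cumulative tokens transmitted grow quadratically with the number of agent turns.

```python
def tokens_sent(turns, tokens_per_turn=500):
    """Total tokens transmitted when every turn resends the full history."""
    history = 0
    total = 0
    for _ in range(turns):
        history += tokens_per_turn   # new tool output / reasoning appended
        total += history             # the whole history is resent each call
    return total

# 10 turns of 500 tokens each: 27,500 tokens sent, 5.5x more than
# the 5,000 tokens of unique content actually produced.
print(tokens_sent(10))
print(tokens_sent(10) / (10 * 500))
```

This is why long-context models that keep workflow state resident, rather than re-ingesting it, matter for multi-agent economics.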

The second is the thinking tax. Complex agents must reason at every step, but using large models for every subtask makes multi-agent applications too expensive and sluggish for practical use.

Nemotron 3 Super has a 1‑million‑token context window, allowing agents to retain full workflow state in memory and preventing goal drift.

Nemotron 3 Super has set new standards, claiming the top spot on Artificial Analysis for efficiency and openness, with leading accuracy among models of its size. 

The model also powers the NVIDIA AI-Q research agent to the No. 1 position on DeepResearch Bench and DeepResearch Bench II leaderboards, benchmarks that measure an AI system’s ability to conduct thorough, multistep research across large document sets while maintaining reasoning coherence. 

Hybrid Architecture

Nemotron 3 Super uses a hybrid mixture‑of‑experts (MoE) architecture that combines four major innovations to deliver up to 5x higher throughput and up to 2x higher accuracy than the previous Nemotron Super model. 

  • Hybrid Architecture: Mamba layers deliver 4x higher memory and compute efficiency, while transformer layers drive advanced reasoning.
  • MoE: Only 12 billion of its 120 billion parameters are active at inference. 
  • Latent MoE: A new technique that improves accuracy by activating four expert specialists for the cost of one to generate the next token at inference.
  • Multi-Token Prediction: Predicts multiple future words simultaneously, resulting in 3x faster inference.
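The sparse-activation idea behind the MoE bullets above can be sketched in a few lines (a toy illustration, not the actual Nemotron implementation): a router scores all experts for each token, but only the top-k highest-scoring experts are actually run, so most parameters stay idle on any given token.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def moe_forward(token, experts, router_scores, k=2):
    """Run only the k highest-scoring experts and mix their outputs."""
    ranked = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    active = ranked[:k]
    weights = softmax([router_scores[i] for i in active])
    output = sum(w * experts[i](token) for w, i in zip(weights, active))
    return output, active

# Eight toy "experts" (here just scalar functions); only two run per token,
# analogous to 12B of 120B parameters being active at inference.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
out, active = moe_forward(2.0, experts,
                          router_scores=[0.1, 0.9, 0.2, 0.8, 0, 0, 0, 0], k=2)
print(f"active experts: {active}, output: {out:.2f}")
```

Compute cost scales with the k active experts, not the total expert count, which is how a 120B-parameter model can run with the footprint of a much smaller one.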

On the NVIDIA Blackwell platform, the model runs in NVFP4 precision. That cuts memory requirements and pushes inference up to 4x faster than FP8 on NVIDIA Hopper, with no loss in accuracy. 

Open Weights, Data and Recipes

NVIDIA is releasing Nemotron 3 Super with open weights under a permissive license. Developers can deploy and customize it on workstations, in data centers or in the cloud.

The model was trained on synthetic data generated using frontier reasoning models. NVIDIA is publishing the complete methodology, including over 10 trillion tokens of pre- and post-training datasets, 15 training environments for reinforcement learning and evaluation recipes. Researchers can further use the NVIDIA NeMo platform to fine-tune the model or build their own. 

Use in Agentic Systems

Nemotron 3 Super is designed to handle complex subtasks inside a multi-agent system. 

A software development agent can load an entire codebase into context at once, enabling end-to-end code generation and debugging without document segmentation. 

In financial analysis, it can load thousands of pages of reports into memory, eliminating the need to re-reason across long conversations and improving efficiency. 

Nemotron 3 Super has high-accuracy tool calling that ensures autonomous agents reliably navigate massive function libraries to prevent execution errors in high-stakes environments, like autonomous security orchestration in cybersecurity.

Availability

NVIDIA Nemotron 3 Super, part of the Nemotron 3 family, can be accessed at build.nvidia.com, Perplexity, OpenRouter and Hugging Face. Dell Technologies is bringing the model to the Dell Enterprise Hub on Hugging Face, optimized for on-premise deployment on the Dell AI Factory, advancing multi-agent AI workflows. HPE is also bringing NVIDIA Nemotron to its agents hub to help ensure scalable enterprise adoption of agentic AI. 

Enterprises and developers can deploy the model through several partners.

The model is packaged as an NVIDIA NIM microservice, allowing deployment from on-premises systems to the cloud.
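NIM microservices expose an OpenAI-compatible chat-completions endpoint, so calling the model looks like any standard chat API request. The sketch below builds such a request against the NVIDIA API catalog endpoint; the model identifier is a placeholder, not a confirmed name, and a self-hosted NIM would use its own base URL.

```python
import json
import os
import urllib.request

API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "nvidia/nemotron-3-super"  # placeholder model id, check the catalog

def build_request(prompt):
    """Assemble an OpenAI-style chat-completions request (not yet sent)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('NVIDIA_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Summarize the open issues in this repository.")
print(req.full_url)
# Sending it: urllib.request.urlopen(req) with a valid NVIDIA_API_KEY set.
```

Because the schema is OpenAI-compatible, existing agent frameworks can point at a NIM endpoint by swapping the base URL and key.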

Stay up to date on agentic AI, NVIDIA Nemotron and more by subscribing to NVIDIA AI news, joining the community, and following NVIDIA AI on LinkedIn, Instagram, X and Facebook.

Explore self-paced video tutorials and livestreams.



NVIDIA Advances Autonomous Networks With Agentic AI Blueprints and Telco Reasoning Models



Autonomous networks — intelligent, self-managing telecommunications operations — are moving from a future vision to a current priority for telecom operators. In the latest NVIDIA State of AI in Telecommunications report, network automation emerged as the top AI use case for investment and return on investment.

Automation is different from autonomy. Beyond executing predefined workflows, autonomous networks must understand operator intent, reason over tradeoffs and decide what actions to take. Reasoning models and AI agents fine-tuned on telecom data are key to enabling this shift.

For networks to become autonomous, operators need an end-to-end agentic system with key components such as telco network models and AI agents that talk to one another and use network simulation tools to validate actions.

Ahead of Mobile World Congress Barcelona, NVIDIA unveiled an open NVIDIA Nemotron-based large telco model (LTM), a comprehensive guide for building reasoning agents for network operations, and new NVIDIA Blueprints for energy saving and network configuration with multi-agent orchestration to help operators advance toward autonomy.

And as part of GSMA’s new Open Telco AI initiative — launching tomorrow — NVIDIA is releasing the new open source LTM, implementation guide and agentic AI blueprints as open resources through GSMA, an organization for the mobile communications industry.

Open Nemotron 3 Large Telco Model Brings Reasoning to Telecom 

For telcos to successfully operationalize generative and agentic AI across their operations, AI models must have the ability to understand the language of telecom and reason through complex workflows. NVIDIA has collaborated with AdaptKey AI to release a new open source, 30-billion-parameter NVIDIA Nemotron LTM that operators around the world can use to build autonomous networks.

Built on the NVIDIA Nemotron 3 family of foundation models and fine-tuned by AdaptKey AI using open telecom datasets including industry standards and synthetic logs, the LTM is optimized to understand telecom industry terminology and reason through workflows such as fault isolation, remediation planning and change validation.

As an open model, the Nemotron LTM gives telcos full transparency into how it was trained and what data was used, enabling secure and fast on‑premises deployment within their networks, where they can build and run agents directly. It also lets telcos safely adapt and extend telecom‑tuned reasoning with their own network and operational data, so they can move toward autonomous operations without sacrificing control over data or security.

Teaching AI Agents to Reason Like Network Engineers

NVIDIA and Tech Mahindra have published an open source guide that shows telecom operators how to fine-tune domain-specific reasoning models and build agents that can safely execute network operations center (NOC) workflows.

The guide outlines a framework for teaching models to reason like NOC engineers: focus on high‑impact, high‑frequency incident categories, translate expert resolutions into step‑by‑step procedures and turn those into structured reasoning traces that capture each action, tool call, outcome and decision. These traces become the “thinking examples” the model learns from, so it understands not just what to do, but why a particular sequence of checks and fixes is safe and effective.

Using the NVIDIA NeMo-Skills pipeline, operators can fine-tune a reasoning model on these traces, laying the foundation for telco-specialized AI agents that can reason and solve problems like a network engineer.
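A structured reasoning trace of the kind described above might look like the following sketch. The field names and incident details are assumptions for illustration, not the guide's actual schema; the point is that each step records the thought, action, tool call, outcome and decision that the model learns from.

```python
import json

# Hypothetical NOC incident trace; field names are illustrative assumptions.
trace = {
    "incident": "High packet loss on cell site NR-1042",
    "category": "transport-degradation",  # high-frequency incident class
    "steps": [
        {
            "thought": "Loss on one site suggests backhaul, not RAN config.",
            "action": "check_backhaul_link",
            "tool_call": "snmp_get(link_errors)",
            "outcome": "CRC errors rising on port 3",
            "decision": "isolate to transport",
        },
        {
            "thought": "CRC errors on one port point to a faulty optic.",
            "action": "swap_traffic_to_backup",
            "tool_call": "reroute(port=3, backup=4)",
            "outcome": "packet loss cleared",
            "decision": "schedule optic replacement",
        },
    ],
    "resolution": "Traffic rerouted; optic replacement ticketed.",
}

# Serialized traces like this become the fine-tuning examples.
print(json.dumps(trace, indent=2)[:120])
```

Collecting many such traces across the high-impact incident categories gives the model worked examples of not just what to do, but why each check is safe.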

Maximizing Energy Efficiency With New Intent-Driven Energy Saving Blueprint

Autonomous networks rely on closed‑loop operation: models that understand the network, agents that act on intent and simulation that feeds results back into the system to validate and refine decisions. The new NVIDIA Blueprint for intent-driven RAN energy efficiency brings these pieces together, helping operators systematically reduce power consumption in 5G radio access networks (RAN) while maintaining quality of service.

The blueprint integrates network test and measurement leader VIAVI’s TeraVM AI RAN Scenario Generator (AI RSG) platform to generate synthetic network data — including cell utilization, user throughput and other traffic patterns — and convert it into a simple, queryable format.

An energy planning agent then reasons over the synthetic data to generate energy-saving policies that can be simulated in AI RSG, allowing operators to safely validate energy-saving policies in a closed loop to meet their intent without changing live configurations or impacting subscribers.

Telcos Put the NVIDIA Blueprint for Network Configuration to Work

The NVIDIA Blueprint for telco network configuration is being adopted by operators around the world.

Cassava Technologies is using the blueprint to build Cassava Autonomous Network, an agentic platform designed to optimize Africa’s diverse, multi-vendor mobile network environment. The platform implements three agents: one to monitor the network and recommend configuration changes, one to apply changes with documentation and governance, and one to assess the impact of changes made and safely roll them back if they have unintended effects.

NTT DATA is implementing the blueprint to bring intelligence to traffic regulation, helping the network manage surges when users reconnect after an outage, and is deploying it with a tier 1 operator in Japan.

An AI agent looks at real-time demand across the network and then decides when and how to admit new users on specific cells. As conditions stabilize, the agent adapts its decisions, turning what used to be manual configurations into a data-driven optimization cycle for more resilient mobile networks.

Evolving Network Configuration With Multi-Agent Orchestration

To help telcos design, observe and optimize complex agentic workflows across the RAN, NVIDIA and BubbleRAN are enhancing the NVIDIA Blueprint for telco network configuration with NVIDIA NeMo Agent Toolkit (NAT) and BubbleRAN Agentic Toolkit (BAT), complementary frameworks for multi-agent orchestration.

BubbleRAN is integrating NAT and BAT into its Opti-Sphere platform to manage network monitoring, configuration and validation agents more flexibly across containers and workloads, and connect them to tools that report network metrics and traffic status so they can continuously propose and validate configuration changes.

Telenor Group will be the first telco to adopt the blueprint with BubbleRAN to enhance its 5G network for Telenor Maritime, the group’s global connectivity provider at sea.

Learn more about the latest advancements in agentic AI for telecommunications at Mobile World Congress, taking place in Barcelona from March 2-5. 

See notice regarding software product information.

India Fuels Its AI Mission With NVIDIA


India is the nexus of AI innovation this week as the host of the AI Impact Summit, which brings together global heads of state and industry to chart the future of AI.

At the summit, taking place in New Delhi, industry leaders, government agencies, educational institutions and startups are sharing how they’re working with NVIDIA to drive the AI industrial revolution in the world’s most populous country.

These initiatives support the IndiaAI Mission, a government effort that’s infusing India’s AI ecosystem with over $1 billion to bolster the nation’s compute capacity and foster the development of sovereign AI datasets, frontier models and applications. The mission also supports AI education, startup innovation and frameworks for trustworthy AI.

Read how NVIDIA is supporting IndiaAI Mission priorities including:

NVIDIA Cloud Partners Boost India AI Infrastructure

To achieve its AI ambitions, India is investing heavily in its computing infrastructure. Under the IndiaAI Compute Pillar, the nation is building out its AI cloud offerings with systems including tens of thousands of NVIDIA GPUs.

NVIDIA is collaborating with next‑generation cloud providers Yotta, L&T and E2E Networks to deliver advanced AI factories to meet India’s growing need for AI compute and enable it to develop AI models and services that drive innovation.

  • Yotta is a hyperscale data center and cloud provider building large‑scale sovereign AI infrastructure for India, branded as Shakti Cloud, powered by over 20,000 NVIDIA Blackwell Ultra GPUs. Its campuses in Navi Mumbai and Greater Noida deliver GPU‑dense, high‑bandwidth AI cloud services on a pay‑per‑use model, designed to make advanced AI training and inference affordable and compliant for Indian enterprises and public sector customers.
  • Larsen & Toubro (L&T) is building sovereign, gigawatt-scale NVIDIA AI factory infrastructure in India to reinforce the country’s position as a global AI powerhouse in alignment with the IndiaAI Mission. The roadmap includes initial expansions in Chennai to 30 megawatts as well as a new 40-megawatt facility in Mumbai. These facilities will power sovereign cloud workloads and hyperscale deployments, delivering secure, energy‑efficient infrastructure for advanced AI applications.
  • E2E Networks is building an NVIDIA Blackwell GPU cluster on its TIR platform, hosted at the L&T Vyoma Data Center in Chennai. The TIR cloud compute platform will feature NVIDIA HGX B200 systems and NVIDIA Enterprise software as well as NVIDIA Nemotron open models to supercharge sovereign development across agentic AI, healthcare, finance, manufacturing and agriculture.

India’s AI cloud infrastructure will host workloads as well as manufacture intelligence for model training, fine-tuning and high‑scale inference. Capacity within these data centers will be reserved for model builders, startups, researchers and enterprises to build, fine-tune and deploy AI in India.

Further expanding access to NVIDIA AI infrastructure in India, Netweb Technologies is launching its Tyrone Camarero AI Supercomputing systems built on the NVIDIA Grace Blackwell architecture. The NVIDIA GB200 NVL4 platforms — manufactured in India by Netweb under the government’s “Make in India” mission — feature four NVIDIA Blackwell GPUs and two NVIDIA Grace CPUs to power scientific computing, model training and inference.

NVIDIA and India AI-Native Companies Build the Nation’s Frontier AI Models

Another key goal of the IndiaAI Mission — led by its Innovation Center Pillar — is to develop and deploy foundation models trained on India-specific data and domestic AI infrastructure.

For a nation as multilingual as India — with 22 constitutionally recognized languages and over 1,500 more recorded by the country’s census — frontier AI models are a powerful tool to help its more than 1.4 billion residents interact with technology in their primary language.

Organizations across the country are building AI applications with NVIDIA Nemotron to support public-sector services, financial systems and enterprise operations in multiple languages.

NVIDIA Nemotron open models, datasets, tools and libraries enable organizations to build frontier speech, language and multimodal models at scale and across languages for government, consumer and enterprise applications. The collection includes India-specific datasets like Nemotron-Personas-India, an open dataset built from publicly available census data using NeMo Data Designer that includes 21 million fully synthetic Indic personas to enable population-scale sovereign AI development.

Adopters in India of Nemotron — and NeMo Curator, an open library for multilingual and multimodal data curation — include:

  • BharatGen, a sovereign AI initiative supported by the Government of India aimed at strengthening the country’s multilingual and multimodal AI ecosystem. As part of this effort, BharatGen has developed a 17-billion-parameter mixture-of-experts (MoE) model from the ground up, using the NVIDIA NeMo framework for pretraining and the NeMo RL library for post-training. The open source models are designed to power applications across public services, agriculture, security and cultural preservation.
  • Chariot, a company building AI systems for speech and multimodal communication. Using the NeMo framework, Chariot is developing an 8-billion-parameter model for real-time text to speech, supporting applications that improve accessibility and digital interaction across consumer and enterprise use cases.
  • Commotion, backed by Tata Communications, which has developed an AI operating system to automate complex enterprise workflows. By integrating NVIDIA Nemotron models and speech capabilities, the platform enables governed, production-grade AI deployments, helping enterprises scale AI across critical business operations.
  • CoRover.ai, which has deployed NVIDIA Nemotron Speech open models and NVIDIA Riva libraries for end-to-end, ultralow-latency speech AI — including the NVIDIA Riva Whisper v3 model for multilingual automatic speech recognition in English, Hindi and Gujarati. Powering customer service applications for the Indian Railway Catering and Tourism Corporation, CoRover’s platform supports around 10,000 concurrent users and more than 5,000 daily ticket bookings.
  • Gnani.ai, which offers enterprises a multilingual agentic AI platform that can interact with customers through voice and text. Gnani is building a 14-billion-parameter speech-to-speech model based on NVIDIA Nemotron Speech models, datasets and NeMo libraries through NVIDIA Cloud Partner E2E Networks, with plans to expand to a 32-billion-parameter model. By fine-tuning the NVIDIA Nemotron Speech model for Indic languages, Gnani has achieved a 15x reduction in inference costs, enabling the company to scale to support more than 10 million calls per day for customers in telecom, banking and hospitality.
  • National Payments Corporation of India (NPCI), which operates India’s retail payment and settlement systems and is deploying AI models to support digital financial services. Building on its production deployment of the AI-powered UPI Help Assistant — a pilot initiative for India’s Unified Payments Interface (UPI) — NPCI is exploring training FiMi, a financial model for India, using the NVIDIA Nemotron 3 Nano model and its own datasets. The model, fine-tuned with the NeMo framework, will support multilingual customer service across India’s banking ecosystem.
  • Sarvam.ai, a leader in full-stack sovereign generative AI that provides enterprise-grade multimodal, speech-to-text, text-to-speech, translation and reasoning models. The company is open sourcing its Sarvam-3 series of text and multimodal large language model variants, trained for 22 Indic languages, English, math and code. Sarvam is using NeMo Curator to construct high-quality multilingual training data while adopting a subset of NVIDIA Nemotron datasets. The foundation models were pre-trained from scratch across 3B, 30B and 100B parameter sizes using the NVIDIA NeMo framework and Megatron-LM, and post-trained with NeMo RL. Training was conducted on NVIDIA H100 GPUs through NVIDIA Cloud Partners, including Yotta. With these sovereign models, Sarvam.ai’s new Pravah platform enables production-grade inference for Indian government and enterprise applications.
  • Soket.ai, which is using a modern large-model training stack on open NVIDIA Nemotron technologies, including NVIDIA Megatron and NVIDIA NeMo. These open source components enable scalable experimentation, training stability and efficient GPU usage, while preserving full control over the model’s data, design and life cycle.
  • Tech Mahindra, which has developed an 8-billion-parameter foundation model tailored for Indian languages and dialects. The model, built with Nemotron, is being designed for use in classrooms, where it can help make educational materials available in a wider range of Indian languages including Hindi, Maithili and Dogri. The team generated synthetic data with Nemotron libraries and tools such as NeMo Data Designer and conducted supervised fine-tuning with NeMo AutoModel.
  • Zoho, which is advancing its Zia LLM platform with proprietary models built using NVIDIA NeMo on the NVIDIA Blackwell and Hopper platforms, integrated across its software-as-a-service applications. This privacy-first architecture delivers contextual, production-grade AI for critical business workflows like customer relation management and finance, ensuring technology sovereignty and enterprise security at a global scale.

Developers building sovereign AI systems can access NVIDIA Nemotron and NeMo today. Nemotron models can be deployed anywhere on NVIDIA-accelerated infrastructure — including on NVIDIA DGX Spark, which is now available in India through qualified partners including PNY, RP tech India, Tech Data, a TD SYNNEX Company, as well as on NVIDIA Marketplace. A version manufactured in India as part of the “Make in India” initiative is available through Netweb.

DGX Spark also runs sovereign AI models by Indian model builders including Sarvam.ai.

Government and Academic Partnerships to Support Research in AI for Science and Engineering

Under its Application Development Initiative Pillar, the IndiaAI Mission is supporting high-impact AI applications — and its Startup Financing Pillar aims to democratize funding availability for AI entrepreneurs across the country.

NVIDIA is collaborating with government agencies, research institutions, venture capital firms and startups to advance projects aligned with these goals.

NVIDIA is collaborating with the Anusandhan National Research Foundation (ANRF), a statutory body under the Indian government, to spur even more cutting-edge AI research across the nation’s leading academic institutions. The initiative will support ANRF’s AI for Science & Engineering program and future AI programs.

NVIDIA will offer ANRF grantee institutions complimentary access to NVIDIA AI Enterprise software and specialized technical mentorship through the NVIDIA AI Technology Center. The collaboration will also include AI bootcamps, workshops and hackathons to strengthen India’s AI research ecosystem.

NVIDIA is also partnering with prominent venture capital firms including Peak XV, Z47, Elevation Capital, Nexus Venture Partners and Accel India to identify and fund promising startups of all stages that are building AI solutions for India and international use. More than 4,000 of India’s AI startups are already part of the NVIDIA Inception program.

For more from the India AI Summit, learn how NVIDIA and global industrial software leaders are partnering with India’s largest manufacturers — and how India’s global systems integrators are building enterprise AI agents with NVIDIA.

Nemotron Labs: How AI Agents Are Turning Documents Into Real-Time Business Intelligence


Editor’s note: This post is part of the Nemotron Labs blog series, which explores how the latest open models, datasets and training techniques help businesses build specialized AI systems and applications on NVIDIA platforms. Each post highlights practical ways to use an open stack to deliver value in production — from transparent research copilots to scalable AI agents.

Businesses today face the challenge of uncovering valuable insights buried within a wide variety of documents — including reports, presentations, PDFs, web pages and spreadsheets.

Often, teams piece together insights by manually reviewing files, copying data into spreadsheets, building dashboards and using basic search or template-based optical character recognition (OCR) tools that often miss important details in complex media.

Intelligent document processing is an AI-powered workflow that automatically reads, understands and extracts insights from documents. It interprets rich formats inside those documents — including tables, charts, images and text — using AI agents and techniques like retrieval-augmented generation (RAG) to turn the multimodal content into insights that other multi-agent systems and people can easily use.

With NVIDIA Nemotron open models and GPU-accelerated libraries, organizations can build AI-powered document intelligence systems for research, financial services, legal workflows and more.

These open models, datasets and training recipes have powered strong results on leaderboards such as MTEB, MMTEB and ViDoRe V3, benchmarks for evaluating multilingual and multimodal retrieval models. Teams can choose from among the best models for tasks like search and question answering.

How Document Processing Streamlines Business Intelligence

Document intelligence systems that can pull meaning from complex layouts, scale to huge file libraries and show exactly where an answer came from are incredibly useful in high-stakes environments. These systems:

  • Understand rich document content, moving beyond simple text scraping to capture information from charts, tables, figures and mixed-language pages and treating documents as a human would by recognizing structure, relationships and context​​.
  • Handle large quantities of shifting data, ingesting and processing massive collections of documents in parallel, and keeping knowledge bases continuously up to date.​​
  • Find exactly what users need, helping AI agents pinpoint the most relevant passages, tables or paragraphs to a query so they can respond with precision and accuracy.​​
  • Show the evidence behind answers by providing citations to specific pages or charts so teams can gain transparency and auditability, which is critical in regulated industries.​​

The result is a shift from static document archives to living knowledge systems that directly power business intelligence, customer experiences and operational workflows.

Document Intelligence at Work

Intelligent document processing systems built on NVIDIA Nemotron RAG models, Nemotron Parse and accelerated computing are already reshaping how organizations across industries gain insights from their documents.​​

Justt: AI-Native Chargeback Management and Dispute Optimization

In financial services, payment disputes create significant revenue loss and operational complexity for merchants, largely because the evidence needed to handle them lives in unstructured formats. Transaction logs, customer communications and policy documents are often fragmented across systems and difficult to process at scale, making dispute handling slow, manual and costly.

Justt.ai provides an AI-driven platform that automates the full chargeback lifecycle at scale. The platform connects directly to payment service providers and merchant data sources to ingest transaction data, customer interactions and policies, then automatically assembles dispute-specific evidence that aligns with card network and issuer requirements.

The platform’s AI-powered dispute optimization, powered by Nemotron Parse, applies predictive analytics to determine which chargebacks to fight or accept, and how to optimize each response for maximum net recovery. Leading hospitality operators like HEI Hotels & Resorts use the platform to automate dispute handling across their properties, recapturing revenue while maintaining guest relationships.

By pairing document-centric intelligence with decision automation, merchants can recapture a significant portion of revenue lost to illegitimate chargebacks while reducing manual review effort.​

Read about how Justt’s chargeback management tool autonomously processes financial data to handle disputes for merchants.

Docusign: Scaling Agreement Intelligence

Docusign is the global leader in Intelligent Agreement Management, handling millions of transactions every day for more than 1.8 million customers and over 1 billion users.

Agreements are the foundation of every business, but the critical information they contain is often buried inside pages of documents. To surface that information, Docusign needed high-fidelity extraction of tables, text and metadata from complex documents like PDFs so organizations could understand and act on obligations, risks and opportunities faster.

Docusign is evaluating Nemotron Parse for deeper contract understanding at scale. Running on NVIDIA GPUs, the model combines advanced AI with layout detection and OCR, reliably interpreting complex tables and reconstructing them with the required information. This reduces the need for manual corrections and helps ensure that even the most complex contracts are processed with the speed and accuracy its customers expect.

With this foundation, Docusign will transform agreement repositories into structured data that powers contract search, analysis and AI-driven workflows — turning agreements into business assets that help organizations and their teams improve visibility, reduce risk and make faster decisions.

Edison Scientific: Research Across Massive Literature Scale

Edison Scientific’s Kosmos AI Scientist helps researchers navigate complex scientific landscapes to synthesize literature, identify connections and surface evidence.​

Edison needed a way to rapidly and accurately extract structured information from large volumes of PDFs, including equations, tables and figures that traditional information parsing methods often mishandle.​

By integrating the NVIDIA Nemotron Parse model into its PaperQA pipeline, Edison can decompose research papers, index key concepts and ground responses in specific passages, improving both throughput and answer quality for scientists.​​ This approach turns a sprawling research corpus into an interactive, queryable knowledge engine that accelerates hypothesis generation and literature review.​

The high efficiency of Nemotron Parse enables cost-efficient serving at scale, allowing Edison’s team to unlock the whole multimodal pipeline.

Designing an Intelligent Document Processing Application With NVIDIA Technologies

A robust, domain-specific document intelligence pipeline requires technologies that can handle data extraction, embedding and reranking, while keeping the data secure and compliant with regulations.

  • Extraction: Nemotron extraction and OCR models rapidly ingest multimodal content, including PDFs, text, tables, graphs and images, converting it into structured, machine-readable form while preserving layout and semantics.
  • Embedding: Nemotron embedding models convert passages, entities and visual elements into vector representations tuned for document retrieval, enabling semantically accurate search.
  • Reranking: Nemotron reranking models evaluate candidate passages to ensure the most relevant content is surfaced as context for large language models (LLMs), improving answer fidelity and reducing hallucinations.
  • Parsing: Nemotron Parse models decipher document semantics to extract text and tables with precise spatial grounding and correct reading order. By handling layout variability, they turn unstructured documents into actionable data that improves the accuracy of LLMs and agentic workflows.

These capabilities are packaged as NVIDIA NIM microservices and foundation models that run efficiently on NVIDIA GPUs, allowing teams to scale from proof of concept to production while keeping sensitive data within their chosen cloud or data center environment.
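As a concrete, greatly simplified illustration, the extract, embed and rerank stages described above can be sketched in a few lines of Python. Everything here is a toy stand-in: a real deployment would call the corresponding NIM microservice endpoints, and none of the function names below are NVIDIA APIs.

```python
# Toy sketch of an extract -> embed -> rerank retrieval flow.
# All components are illustrative placeholders, not NVIDIA APIs.

import math
from collections import Counter

def extract(document: str) -> list[str]:
    """Extraction stage: split a raw document into candidate passages."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]

def embed(text: str) -> Counter:
    """Embedding stage: here just a bag-of-words vector for demonstration."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rerank(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Reranking stage: order candidate passages by similarity to the query."""
    q = embed(query)
    scored = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return scored[:top_k]

doc = (
    "Payment terms are net 30 days.\n\n"
    "The contract renews annually.\n\n"
    "Governing law is Delaware."
)
top = rerank("when does the contract renew", extract(doc))
```

In a production pipeline, the bag-of-words vectors and cosine scoring would be replaced by learned embedding and reranking models; the control flow, however, stays the same.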

The most effective AI systems use a mix of frontier models and open source models like NVIDIA Nemotron, with an LLM router analyzing each task and automatically selecting the model best suited for it. This approach keeps performance strong while managing computing costs and improving efficiency.
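A minimal router along these lines can be sketched as follows. The heuristics and model names are purely illustrative placeholders, not real model identifiers or a production routing policy.

```python
# Minimal sketch of an LLM router: inspect each task and pick the
# cheapest model tier expected to handle it well.

from dataclasses import dataclass

@dataclass
class Task:
    prompt: str
    needs_tools: bool = False   # does the task call external tools?
    context_tokens: int = 0     # size of the accumulated context

def route(task: Task) -> str:
    """Return the model tier best suited to the task."""
    # Tool use or a long accumulated context suggests a multi-step
    # agentic task: send it to the large frontier model.
    if task.needs_tools or task.context_tokens > 32_000:
        return "frontier-reasoning-model"
    # Mid-sized analytical prompts go to an efficient open model.
    if len(task.prompt.split()) > 50:
        return "open-reasoning-model"
    # Short, simple prompts can use a small, fast model.
    return "small-chat-model"
```

In practice the routing signal would come from a classifier or a router model rather than token counts alone, but the pattern of matching task complexity to model cost is the same.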

Get Started With NVIDIA Nemotron

Access a step-by-step tutorial on how to build a document processing pipeline with RAG capabilities. Explore how Nemotron RAG can power specialized agents tailored for different industries.

Plus, experiment with Nemotron RAG models and the NVIDIA NeMo Retriever open library, available on GitHub and Hugging Face, as well as Nemotron Parse on Hugging Face.

Join the community of developers building with the NVIDIA Blueprint for Enterprise RAG — trusted by a dozen industry-leading AI Data Platform providers and available now on build.nvidia.com, GitHub and the NGC catalog.

Stay up to date on agentic AI, NVIDIA Nemotron and more by subscribing to NVIDIA AI news, joining the community and following NVIDIA AI on LinkedIn, Instagram, X and Facebook.  

Explore self-paced video tutorials and livestreams.



Everything Will Be Represented in a Virtual Twin, Jensen Huang Says at 3DEXPERIENCE World



At 3DEXPERIENCE World in Houston, NVIDIA founder and CEO Jensen Huang and Dassault Systèmes CEO Pascal Daloz laid out a blueprint for industrial AI rooted in physics-based “world models” — systems designed to simulate products, factories and even biological systems before they’re built.

“Artificial intelligence will be infrastructure,” Huang told the crowd, comparing it to water, electricity and the internet, and playfully referring to the engineering-heavy audience as “Solid Workers,” a nod to Dassault Systèmes’ SolidWorks platform.

The announcement continues a collaboration spanning more than a quarter century between NVIDIA and Dassault Systèmes.

“This is the largest collaboration our two companies have ever had in over a quarter century,” Huang said. “We’re going to fuse these technologies so engineers can work at a scale that’s 100 times, 1,000 times — and eventually a million times greater than before.”

The new partnership brings NVIDIA accelerated computing and AI libraries together with Dassault Systèmes’ Virtual Twin platforms to move more engineering work into real-time digital workflows, powered by AI companions that help teams explore, validate, prototype and iterate faster.

Huang framed the shift as a reinvention of the computing stack: moving from hand-specified, structured digital designs to systems that can generate, simulate and optimize in software — at industrial scale.

From Digital Models to Industry World Models

Virtual twins are not applications but “knowledge factories,” Daloz said.

The partnership aims to establish industry world models — science-validated AI systems grounded in physics that can serve as mission-critical platforms across biology, materials science, engineering and manufacturing.

In Daloz’s framing, the value moves upstream: virtual twins become the place where knowledge is created, tested, and trusted — before anything is built in the physical world.

Dassault Systèmes, whose 3DEXPERIENCE platform serves more than 45 million users and 400,000 customers globally, has long been a leader in virtual twin technology — digital replicas that let engineers simulate products and processes before building them physically.

The collaboration brings together accelerated computing, AI and digital twin technologies so engineers can design not only geometry, but behavior — and explore radically larger design spaces earlier in development.

Together, the companies outlined how this shared architecture will show up across science, engineering and manufacturing workflows:

  • Advancing Biology and Materials Research: The NVIDIA BioNeMo platform and BIOVIA science-validated world models accelerate the discovery of new molecules and next-generation materials.
  • AI-Driven Design and Engineering: SIMULIA AI-based Virtual Twin Physics Behavior leveraging NVIDIA CUDA-X libraries and AI physics libraries empowers designers and engineers to accurately and instantly predict outcomes.
  • Virtual Twins for Every Factory: NVIDIA Omniverse physical AI libraries integrated into the DELMIA Virtual Twin enable autonomous, software-defined production systems.
  • Virtual Companions Supercharge Dassault Systèmes’ Users: The 3DEXPERIENCE agentic platform, combining NVIDIA AI technologies and NVIDIA Nemotron open models with Dassault Systèmes’ Industry World Models, powers Virtual Companions to tap into deep industrial context, delivering trusted, actionable intelligence.

Huang said that in domains like biology and materials, the frontier is learning the underlying “language” of complex systems and then generating new options that can be evaluated and validated in simulation.

Designing and Operating the Factory in Software

A central theme of the discussion was how factories themselves are changing — from static physical assets to living systems that are designed, simulated and operated as virtual twins.

As part of the partnership, Dassault Systèmes is deploying NVIDIA-powered AI factories on three continents through its OUTSCALE sovereign cloud, enabling customers to run AI workloads while maintaining data residency and security requirements.

Both executives emphasized that the goal isn’t to replace engineers — it’s to amplify them. As AI agent companions take on more exploratory and repetitive tasks, designers and engineers gain leverage and creativity, not redundancy.

AI Companions That Expand Human Creativity

Every designer will have a “team of companions,” Huang said — a shift he described as fundamentally positive for engineers, software platforms and the broader ecosystem built on them.

For the tens of millions of engineers who use Dassault Systèmes tools to design everything from aircraft to consumer packaged goods, the shift isn’t about replacing human creativity — it’s about expanding it.

“Success is not about automation,” Daloz said. “[Engineers] don’t want to automate the past — they want to invent the future.”

Looking ahead, Daloz framed the partnership as more than a source of performance gains: an effort to open new possibilities, help companies eliminate bad choices before they become expensive mistakes, and create entirely new categories of products.

“Virtual twins and the 3D Universes are not applications,” Daloz said. “They are knowledge factories.”

The fireside conversation between Huang and Daloz was broadcast live from 3DEXPERIENCE World.