Agentic AI - Faz Business | فاز الأعمال

NVIDIA BioNeMo Agent Toolkit Brings Accelerated AI to Life Sciences Researchers in Claude Science

Posted on July 1, 2026 by faz_business

Life sciences has entered an era of computational scale, and for more than a decade, NVIDIA has built the full GPU-accelerated computing stack — spanning hardware, frameworks, libraries, models, microservices and domain-specific tools — to help researchers run more sophisticated workflows and iterate faster.

This week, Anthropic announced Claude Science, an AI workbench for science research that lets scientists converse with agents in natural language to run their work end to end.

Claude Science integrates with NVIDIA BioNeMo Agent Toolkit as a resource that scientists can access within their workflow. The toolkit packages NVIDIA-accelerated capabilities as callable skills, enabling Claude Science to select the appropriate tool, prepare valid inputs and execute the workflow — all while connecting to NVIDIA compute resources deployed anywhere. This brings NVIDIA’s accelerated models, libraries and NVIDIA NIM microservices directly into the same environment where the rest of the research happens.

The world’s largest pharmaceutical companies use NVIDIA technologies to advance AI-enabled research across drug discovery, genomics, medical imaging, molecular design and protein engineering. Today, 18 of the top 20 pharmaceutical companies use NVIDIA BioNeMo, underscoring the breadth of its role across the ecosystem.

Advancing the Agentic Era of Scientific Discovery

Claude Science lets scientists use natural language to move their research from intent into action, without manually configuring models, endpoints, or software environments. NVIDIA BioNeMo Agent Toolkit extends that with access to accelerated workflows and models like Evo 2, Boltz-2 and OpenFold3, so the analyses that benefit from acceleration run faster.

A scientist begins by describing a research task, such as analyzing a genomic sequence, predicting a protein structure or designing a potential binder, in natural language. Claude Science interprets the request and orchestrates the work through preconfigured domain-specialized agents that know established workflows across genomics, proteomics, single-cell analysis, cheminformatics and clinical research.

BioNeMo Agent Toolkit gives these agents the context needed to connect each step with an appropriate NVIDIA scientific capability. Each skill includes information about its purpose and required inputs, helping agents prepare and execute the workflow and return outputs for review.
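
As a rough illustration of how a skill description can drive input preparation, the sketch below models a skill as a small record holding its purpose and required inputs. The `Skill` class, the `prepare_inputs` helper and the `predict_structure` skill name are hypothetical and are not taken from BioNeMo Agent Toolkit; the real skill format may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill descriptor; not the toolkit's actual schema."""
    name: str
    purpose: str
    required_inputs: tuple[str, ...]


def prepare_inputs(skill: Skill, candidate: dict) -> dict:
    """Return only the inputs the skill requires, or raise if any are missing."""
    missing = [key for key in skill.required_inputs if key not in candidate]
    if missing:
        raise ValueError(f"{skill.name} is missing inputs: {', '.join(missing)}")
    return {key: candidate[key] for key in skill.required_inputs}


# Hypothetical example skill, for illustration only.
structure_skill = Skill(
    name="predict_structure",
    purpose="Predict a 3D structure from a protein sequence",
    required_inputs=("sequence",),
)

# Extra keys the skill does not need are dropped before execution.
inputs = prepare_inputs(structure_skill, {"sequence": "MKTAYIAK", "note": "draft"})
print(inputs)
```

Keeping the required inputs next to the purpose lets an agent check a request before any compute is spent, which is the behavior the paragraph above describes.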

The result is an iterative loop between scientific reasoning and accelerated computational work. Scientists can inspect outputs, refine their questions and determine the next step while staying focused on the science.

One powerful example is generating better inhibitors of common cancer targets. In this workflow, a scientist starts with a known cancer-causing antigen mutation and asks Claude to design numerous potential inhibitors. Claude Science, integrated with BioNeMo Agent Toolkit and NVIDIA NIM microservices, accelerates high-throughput inhibitor prediction, optimization and validation.

A Scientific Foundation Built for Agents

AI agents reason, plan and use tools to complete tasks. In life sciences, those tools are often specialized computational workflows.

An autonomous AI scientist agent doesn’t reason in isolation. It may need to fingerprint a library of compounds, cluster promising hits, generate conformers for top candidates, analyze genomic context and compare perturbation responses before recommending the next experiment.

Each step relies on a scientific tool, and the agent can only work as fast as those tools run.

NVIDIA BioNeMo Agent Toolkit gives scientific agents the accelerated tools they need to operate at the speed of science. It includes:

NVIDIA Parabricks accelerates genomic analysis from hours to minutes, so an agent can integrate genomic context into a decision in near real time.
RAPIDS-singlecell, developed by scverse, compresses a 1.3-million-cell preprocessing and clustering workflow from 52 minutes to 25 seconds, so single-cell analysis becomes part of the reasoning loop rather than an offline batch job.
nvMolKit accelerates cheminformatics operations like similarity search and conformer generation by up to 3,000x, so an agent iterating across a massive chemical space gets results at the speed of thought.
NVIDIA BioNeMo open models deliver core biomolecular capabilities accelerated by NVIDIA libraries, so an agent has a purpose-built scientific model for each step of a workflow.
BioNeMo NIM microservices package those models as enterprise-ready inference endpoints — containerized microservices with the full accelerated software stack pre-integrated and tuned for high-performance inference — so an agent can call a single stable application programming interface for production deployment.
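
To make the similarity-search step in the list above concrete, here is a minimal CPU-only sketch that uses Tanimoto similarity over sets of fingerprint bits, a common measure in cheminformatics. It does not use nvMolKit or its API; the function names, compound names and bit positions are made up for illustration.

```python
def tanimoto(a: frozenset[int], b: frozenset[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(query, library, threshold):
    """Return (name, score) pairs at or above threshold, best match first."""
    hits = [(name, tanimoto(query, bits)) for name, bits in library.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints (made-up bit positions, not real molecules).
library = {
    "cmpd_a": frozenset({1, 2, 3, 4}),
    "cmpd_b": frozenset({1, 2, 7, 9}),
    "cmpd_c": frozenset({10, 11}),
}
query = frozenset({1, 2, 3, 5})

results = similarity_search(query, library, threshold=0.3)
print(results)
```

The search compares the query against every library entry, so cost grows with library size. That per-pair work is the kind of operation a GPU library can run in parallel across a large chemical space.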

NVIDIA BioNeMo Agent Toolkit is open and harness-agnostic, allowing the same scientific skills to work across agent frameworks and research platforms. The toolkit and its skills are available now through NVIDIA developer resources and GitHub.

Scientists can access BioNeMo-powered workflows through Anthropic’s Claude Science, which is entering public beta today. As part of the public beta, Anthropic is inviting researchers to provide feedback on additional domain specialists and integrations they need.

How Businesses Are Building Specialized AI They Can Trust

Posted on June 23, 2026 by faz_business

Companies are asking how to build specialized AI that fits with the way their workflows actually run.

The first wave of enterprise AI was about access. Companies experimented with new frontier and open models, ran pilots and explored how AI can help.

Now, specialized agents — systems of models that can reason, use tools and take action even for the most complex workflows — put more useful AI within reach of the people who already know the work best.

Agents are already helping life sciences researchers accelerate medicine discovery, security teams investigate vulnerabilities with more context and operations teams seamlessly coordinate supply chains.

To tap into these specialized agents, businesses are using a foundation they can adapt and own: one built on models they can customize, tools that connect to systems they already use and infrastructure that lets agents operate safely at scale.

NVIDIA Agent Toolkit — comprising models, tools, skills and a secure runtime — provides an open, modular foundation for building safer, faster, lower-cost digital AI coworkers that enterprises and developers can customize, specialize, control and trust.

The Building Blocks for Specialized AI Coworkers

Enterprises and developers building secure, specialized AI agents require:

Models, which provide the reasoning foundation.
Tools and skills, which connect agents to the actions and domain expertise needed to get work done.
Runtime support, which helps agents execute workflows.

NVIDIA Agent Toolkit includes all three:

NVIDIA Nemotron open models give teams flexibility to customize, evaluate and deploy agents for their own needs.
NVIDIA NemoClaw blueprints provide patterns for safer agent behavior, delivering accurate results at lower costs, with tools and skills connecting agents to concrete actions.
The NVIDIA OpenShell runtime helps agents operate safely inside the systems where work gets done.

NVIDIA technologies accelerate all the pieces needed to turn a powerful frontier model into a fully functional digital coworker. The toolkit’s users can work with third-party agent harnesses — or agent orchestration frameworks — of their choice, including Hermes Agents and OpenClaw.

This unlocks enterprise AI momentum with control. And that matters because the most valuable agents across industries will be specialized.

Agents Take Shape Across Industries

The specialized AI foundation is already at work.

In life sciences, agents can help researchers call domain models for protein design, virtual screening, genomics analysis and biomarker discovery. The new NVIDIA BioNeMo Agent Toolkit enables work that previously took months to be completed in days.

In healthcare, agents support clinical documentation, clinical decision support and care coordination. Plus, physical agents in robotics systems trained in digital twins of hospitals can scale surgical assistance and hospital automation to meet care demands.

In software, cybersecurity, industrial operations and customer workflows, agents can connect to the tools and data teams already use, helping people move faster through complex workflows.

For example, Cadence and Synopsys are building autonomous agents for chip design and engineering workflows. CrowdStrike is running specialized security agents that triage alerts with 98.5% accuracy. Palantir, SAP, ServiceNow, Siemens and Dassault Systèmes are embedding agent capabilities into the enterprise platforms where critical decisions get made.

It all points to the same larger shift: Agents become more useful when they can combine models, tools, skills, runtime and infrastructure in ways companies can adapt to their own workflows. NVIDIA Agent Toolkit provides an open, modular foundation that enables this combination.

Learn more about NVIDIA Agent Toolkit and NVIDIA BioNeMo Agent Toolkit.

HPE AI Factory With NVIDIA Expands for the Era of Agents

Posted on June 16, 2026 by faz_business

Enterprises are moving agentic AI from proof of concept to production, and the next generation of AI factories is built for the era of agents.

At HPE Discover Las Vegas, running through Thursday, June 18, NVIDIA and HPE are expanding the HPE AI Factory with NVIDIA, including NVIDIA Vera CPU and NVIDIA Agent Toolkit for HPE Private Cloud AI.

Plus, NVIDIA Confidential Computing extends across HPE AI Factory, and enhanced full-stack NVIDIA integration (with NVIDIA accelerated computing, NVIDIA AI software and NVIDIA networking) is available throughout the entire portfolio.

NVIDIA Vera CPU Available With HPE Private Cloud AI

The HPE ProLiant Compute DL394 Gen12 with the NVIDIA Vera CPU will be available in 2027 with HPE Private Cloud AI, a turnkey AI factory co-engineered with NVIDIA. Vera is the first CPU built for agents — designed for the tool calls, orchestration and real-time data processing required across the agent loop — bringing deterministic, low-latency performance into HPE Private Cloud AI.

The New York Stock Exchange, in collaboration with Redpanda and HPE, is an early enterprise customer exploring Vera CPU with the HPE ProLiant Compute DL394 Gen12 server.

The Vera CPU is part of the NVIDIA Vera Rubin platform, which is ramping into full production with the NVIDIA Vera Rubin NVL72 rack-scale system available from HPE. Vera Rubin was built for frontier-scale models larger than 1 trillion parameters and will ship with full-stack NVIDIA Confidential Computing across every chip.

HPE is also bringing the HPE Compute XD700 — built on NVIDIA HGX Rubin NVL8 — to the HPE AI Factory, supporting up to 128 Rubin GPUs per rack.

NVIDIA Agent Toolkit Now Available With HPE Private Cloud AI

NVIDIA Agent Toolkit — including NVIDIA Nemotron open models, the NVIDIA OpenShell secure runtime and NVIDIA NemoClaw blueprints — will be available with HPE Private Cloud AI. Together, they give enterprises an agentic AI operating system for monitoring agent behavior, enforcing governance policies, and safely building and running autonomous, long-running multi-agent systems.

HPE Private Cloud AI adds secure local agent registration, letting customers approve AI models, skills and tools against centralized governance and security policies before they run. New HPE Zerto Software capabilities detect rogue agent actions and use continuous data protection to rewind to a clean state.
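
The approve-before-run idea can be sketched as a simple allowlist check. This is an assumption-laden illustration, not HPE's registration mechanism: the `Policy` class, the `review_agent` function and every model, skill and tool name below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical central policy: sets of approved component names."""
    models: set[str] = field(default_factory=set)
    skills: set[str] = field(default_factory=set)
    tools: set[str] = field(default_factory=set)


def review_agent(policy: Policy, model: str, skills: list[str], tools: list[str]) -> list[str]:
    """Return a list of policy violations; an empty list means approved."""
    violations = []
    if model not in policy.models:
        violations.append(f"model not approved: {model}")
    violations += [f"skill not approved: {s}" for s in skills if s not in policy.skills]
    violations += [f"tool not approved: {t}" for t in tools if t not in policy.tools]
    return violations


# Made-up names, for illustration only.
policy = Policy(
    models={"approved-model"},
    skills={"summarize"},
    tools={"ticket-search"},
)

ok = review_agent(policy, "approved-model", ["summarize"], ["ticket-search"])
blocked = review_agent(policy, "approved-model", ["summarize"], ["shell-exec"])
print(ok, blocked)
```

Returning every violation at once, rather than failing on the first, gives a reviewer the full picture of what an agent would need approved before it can run.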

On the data side, HPE Alletra Storage MP X10000 — which achieved the foundation level of NVIDIA-Certified Storage — automatically applies metadata and governance policies to prepare unstructured data for AI pipelines, improving token throughput.

NVIDIA Confidential Computing Across All HPE AI Factory Solutions

NVIDIA Confidential Computing is now available across the HPE AI Factory through HPE Services — including HPE AI Factory at Scale, HPE Sovereign AI Factory and HPE Private Cloud AI.

AI applications access and use private and sensitive data that needs to be protected and secured. In addition, models trained with proprietary data or techniques need to be safeguarded from exfiltration. Confidential computing is essential for these modern AI workloads, as it protects models and private data during execution for on-premises and sovereign deployments, establishing a chain of trust through cryptographic attestation and encryption at every stage.

In addition, HPE ProLiant Compute DL380a achieved certification as part of the NVIDIA-Certified Systems for NVIDIA Confidential Computing program, which validates robust application performance with confidential computing. These systems provide hardware-based protection for AI workloads and sensitive data assets while maintaining optimal NVIDIA acceleration.

Across the HPE AI Factory solutions, NVIDIA BlueField DPUs and NVIDIA DOCA provide in-silicon zero-trust policy enforcement, runtime threat detection and network encryption — protecting AI workloads, agents and data without performance tradeoffs.

Enhanced Full-Stack NVIDIA Integration Across the Portfolio

HPE AI Factory at Scale, HPE Sovereign AI Factory and HPE Private Cloud AI are now available with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, NVIDIA Spectrum-X Ethernet networking, NVIDIA BlueField-3 DPUs and NVIDIA ConnectX-8 SuperNICs.

For next-generation AI factories, every Vera Rubin NVL72 system will ship with NVIDIA networking built in — NVIDIA Vera BlueField-4 DPUs, NVIDIA ConnectX-9 SuperNICs and NVIDIA Spectrum-X Ethernet — with NVIDIA Spectrum-6 switching delivering 1.6x higher networking performance for AI communication versus off-the-shelf Ethernet.

Spectrum-X Ethernet networking is the standard for HPE AI Factory with NVIDIA — including at-scale, sovereign and turnkey AI factory solutions available now. For large-scale and sovereign workloads, HPE announced at NVIDIA GTC in March that it’s also adding NVIDIA InfiniBand networking options — including NVIDIA Quantum-X800 InfiniBand with the HPE Cray Supercomputing GX5000.

These configurations are based on NVIDIA reference architectures and support use cases from AI development through production-scale deployment, with NVIDIA AI Enterprise software and the HPE Unleash AI ecosystem.

At HPE Discover this week, the Unleash AI partner program is expanding with nearly a dozen new AI software partners — including Aizen, BridgeTEK, deepset, Deliverance, Faclon Labs, Gallop, Rocket, Supervity, Thales, Trustwise and Vortiqx.

Attendees can explore these solutions all week at the show and learn more about the HPE AI Factory with NVIDIA, part of the NVIDIA AI Computing by HPE portfolio.

NVIDIA Blackwell Leads on First Agentic AI Infrastructure Benchmark

Posted on June 12, 2026 by faz_business

AgentPerf from Artificial Analysis, the industry’s first agentic AI benchmark, gives developers, enterprises and infrastructure providers a clear way to compare systems for agentic AI. In the first round of published results, the NVIDIA Blackwell Ultra NVL72 platform delivers leading performance across the agentic AI workloads tested, running 20x more agents per megawatt than NVIDIA Hopper.

Agentic AI is a fundamentally different workload than conversational AI. A single chat completion is a sprint: one large language model (LLM) call, one response. An agent functions more like a relay: It breaks a goal into many steps and keeps going until the task is done.

Agents chain together multiple LLM calls and tool calls to gather context, observe, reason and act.

That results in dozens to hundreds of LLM calls chained together, each passing growing context to the next, with tool calls like code compilation and execution, database search and web browsing at every handoff. The complexity isn’t additive; it’s multiplicative.
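
The chained pattern described above can be sketched with stand-in functions. Nothing here calls a real model or tool: `fake_llm`, `fake_tool` and `run_agent` are hypothetical stubs that only show how context grows from one LLM call to the next.

```python
def fake_llm(context: list[str]) -> str:
    """Stand-in for an LLM call: requests a tool until three results exist."""
    tool_results = [item for item in context if item.startswith("tool:")]
    if len(tool_results) < 3:
        return f"call_tool step {len(tool_results) + 1}"
    return "final answer"


def fake_tool(request: str) -> str:
    """Stand-in for a tool call such as a search or a code run."""
    return f"tool: result for '{request}'"


def run_agent(goal: str, max_steps: int = 10):
    """Alternate LLM calls and tool calls until the stub LLM finishes."""
    context = [f"goal: {goal}"]
    context_sizes = []  # number of context items seen by each LLM call
    for _ in range(max_steps):
        context_sizes.append(len(context))
        reply = fake_llm(context)
        context.append(f"llm: {reply}")
        if reply == "final answer":
            break
        context.append(fake_tool(reply))
    return context, context_sizes


context, sizes = run_agent("fix the failing test")
print(sizes)  # each LLM call sees more context than the one before
```

Even in this toy loop, every LLM call must process everything produced so far, which is why long agent runs stress a system differently from a single chat completion.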

The distinction matters enormously for performance measurement. Existing AI inference benchmarks measure one LLM call: how fast an LLM responds to a single request and how many simultaneous requests a system can handle. They weren’t designed for agentic workloads, where chained LLM calls, tool call delays and growing context stress accelerated computing systems in fundamentally different ways than a single LLM call ever could.

For companies building and deploying agents at scale, it’s important to understand how responsive agents are, how many can be deployed simultaneously and how much useful work AI infrastructure can deliver for every dollar and watt invested.

NVIDIA GB300 NVL72 Runs 20x More Agents per Megawatt

In this first round, AgentPerf measures agentic performance with DeepSeek V4 Pro, a large mixture-of-experts (MoE) model that represents the class of frontier models powering today’s most capable agents. On this workload, NVIDIA GB300 NVL72 delivers the highest performance in the benchmark, running up to 20x more agents per megawatt than the NVIDIA HGX H200 system.

NVIDIA GB300 NVL72 supports far more concurrent agents per megawatt than NVIDIA H200 at both service-level objectives of 20 and 60 tokens per second per agent.

The performance advantage comes from extreme codesign across the full stack. GB300 NVL72 connects 72 GPUs into a single rack-scale system, enabling large MoE models like DeepSeek V4 Pro to distribute model execution efficiently at scale.

CUDA kernels accelerate this further by overlapping communication and compute, so the cost of coordinating across experts is absorbed rather than added to latency.

NVIDIA TensorRT LLM sustains efficiency as concurrent agent sessions scale. For example, it separates the processing of inputs from the generation of outputs so each can be optimized independently.

These results are grounded in a benchmark methodology built from the ground up to reflect how agentic AI actually works in production.

Artificial Analysis AgentPerf: Built on Real-World Agentic Workloads

AgentPerf is built on real coding-agent trajectories: An agent receives a task, reads files, writes and edits code, executes commands and iterates based on the results. All trajectories are drawn from real public code repositories across 12+ programming languages. The long sequence lengths, tool call patterns and delays are all representative of real-world coding workflows.

AgentPerf then measures how many of these agentic tasks a platform can support simultaneously while meeting defined performance thresholds for responsiveness and output token rate. Tool calls are not executed but simulated using representative CPU processing time, so differences in results reflect accelerated computing performance only.

The results translate directly into infrastructure decisions: how many concurrent agentic tasks can be run per accelerator and per megawatt of power. For enterprises deploying AI agents at scale, those numbers determine how much productive work a given infrastructure investment can actually deliver.
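
The arithmetic behind a per-megawatt figure can be sketched as follows. This is not AgentPerf's methodology or code, and the numbers are made up rather than benchmark results; the sketch only assumes that a platform is tested at several concurrency levels and that the highest level still meeting the output-rate threshold is then normalized by power.

```python
def max_agents_meeting_slo(measurements, min_tokens_per_sec):
    """Largest tested concurrency whose per-agent output rate meets the threshold."""
    passing = [agents for agents, rate in measurements if rate >= min_tokens_per_sec]
    return max(passing, default=0)


def agents_per_megawatt(agents: int, system_power_kw: float) -> float:
    """Normalize a concurrent-agent count by system power (1 MW = 1,000 kW)."""
    return agents * 1000.0 / system_power_kw


# Made-up (concurrent agents, tokens/sec/agent) pairs for one hypothetical system.
measurements = [(8, 95.0), (16, 70.0), (32, 42.0), (64, 18.0)]
power_kw = 100.0  # assumed power draw, for illustration only

summary = {}
for slo in (20, 60):
    agents = max_agents_meeting_slo(measurements, slo)
    summary[slo] = (agents, agents_per_megawatt(agents, power_kw))
    print(slo, summary[slo])
```

A stricter threshold (60 tokens per second per agent instead of 20) lowers the supported concurrency in this toy data, which is why results are reported at more than one service-level objective.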

NVIDIA Ecosystem Partners Harness Blackwell’s Leading Performance

Leading inference providers including Baseten, DeepInfra and Together AI are already serving agentic workloads on frontier models such as DeepSeek V4 Pro on NVIDIA Blackwell and powering production agentic applications today.

Together AI powers real-time inference for Cursor, an AI-powered agentic coding platform, on NVIDIA Blackwell. Cursor’s agents debug issues, generate features and execute refactors while developers continue working.

DeepInfra powers Pam.ai, an AI workforce platform for car dealerships, which deploys agents to book service appointments, handle calls and run outbound sales campaigns, entirely on NVIDIA Blackwell.

As NVIDIA and the open source ecosystem continue to optimize inference software, performance and efficiency on agentic workloads will only improve. The NVIDIA Vera Rubin architecture is now in full production, bringing the next generation of infrastructure capacity to meet the growing demands of agentic AI at scale.

Dive deeper into AgentPerf’s methodology and NVIDIA’s full-stack optimizations for agentic AI in this technical blog.

How the UK Is Turning Sovereign AI Ambition Into Action With NVIDIA Technologies

Posted on June 9, 2026 by faz_business

A year ago at London Tech Week, NVIDIA founder and CEO Jensen Huang and U.K. Prime Minister Keir Starmer made a declaration: the U.K. would be an AI maker, not an AI taker.

At this year’s event, NVIDIA and its partners are showcasing how that commitment is producing real momentum across the nation’s infrastructure, startups and enterprises.

U.K. technology leaders are innovating across healthcare and life sciences, coding, agentic AI, inference and more — all running on sovereign AI deployments.

AI Minister Kanishka Narayan said: “A year ago, we said the UK would be an AI maker, not an AI taker. Today we’re delivering on that — with sovereign compute powering British startups to push the boundaries of what AI can do, from drug discovery to healthcare to robotics. This is what it looks like when a country backs its own talent with the infrastructure to match.

“NVIDIA’s decision to invest billions here is a reflection of the strength of what’s being built in Britain. We are determined to make sure the next generation of AI breakthroughs happens in this country, and we have everything we need to make it happen.”

Commitment to Compute

Over the past year, the number of AI cloud providers planning to deploy AI infrastructure on U.K. soil has doubled.

Nebius has announced plans to expand customers and cloud capabilities with three new deployments of advanced NVIDIA AI infrastructure, as the NVIDIA AI Cloud ecosystem partner continues to build out its commercial and AI R&D hub in London. Combined, the deployments are expected to reach 65 megawatts when fully ramped up in 2027.

CoreWeave is building in the U.K. Government’s AI Growth Zones, and seven more NVIDIA AI Cloud ecosystem partners have plans in the pipeline. BT and Nscale announced plans to build sovereign AI data centers across three existing BT sites in the U.K., combining NVIDIA AI infrastructure, Nscale’s full stack and BT’s trusted nationwide connectivity backbone.

From Fund to Frontier

Central to that sovereign compute story is Isambard-AI — the U.K.’s most powerful computer. Built on 5,400 NVIDIA GH200 Grace Hopper Superchips and running entirely on zero-carbon electricity, it’s the engine behind some of the U.K.’s most ambitious AI research.

The U.K. government’s Sovereign AI Fund is putting that capability to work by backing homegrown companies and providing the domestic infrastructure needed to scale their ambitions.

Among its first recipients is Ineffable Intelligence, which recently announced a collaboration with NVIDIA to build the future of reinforcement learning infrastructure.

Other recipients include four U.K.-based NVIDIA Inception startups, each pushing the AI frontier using Isambard-AI. These startups are:

Cosine Builds Sovereign Coding Platform

Cosine is building an end-to-end sovereign AI coding platform for highly regulated industries such as financial services, critical infrastructure and national security. Using Isambard, Cosine is training a new, large-parameter, mixture-of-experts, multimodal agentic LLM for natively handling data types beyond text and image.

“Access to Isambard enables the project, full stop,” said Alistair Pullen, cofounder and CEO of Cosine. “We already have the people who know how to do this. We have the data. We have the infrastructure and the training. The thing we’ve never had is this level of compute.”

Cursive Trains Self-Improving AI Systems

Cursive is building self-improving AI systems that learn continuously from real-world data, enabling them to operate autonomously over long periods of time. This is unlocked through new memory-augmented architectures with dramatically larger context windows, currently in development using the Sovereign AI Fund resources. In addition, the team recently adopted the NVIDIA Megatron-LM framework for distributed training at scale.

“The Sovereign AI Fund is more than just processing power — it’s a statement about investing in AI in the U.K.,” said Talfan Evans, cofounder and CEO of Cursive. “Sovereignty is actually now a buying criterion — and it’s a challenge to tap into the resources we uniquely have as U.K. and European companies.”

Doubleword Optimizes Inference to Deliver Abundant Intelligence Tokens

Doubleword, the U.K.’s first dedicated inference lab, optimizes every layer of the AI stack to maximize what it calls “IQ per dollar.” The company deploys open models including NVIDIA Nemotron 3 Super 120B and builds on the NVIDIA Dynamo inference framework.

On Isambard, Doubleword’s early results achieved 70x faster model cold starts — aka model loading times — and 4x lossless KV cache compression, critical advancements for long-running agentic workloads. The result: inference at 90-95% lower costs than other leading inference providers.

“Sovereign AI is most impactful at the inference layer,” said Meryem Arik, cofounder and CEO of Doubleword. “Inference is when you’re actually getting the value from the model — we want that value created in the U.K., with U.K. compute and U.K. data centers.”

Prima Mente Uses Foundation Models to Study Alzheimer’s and More

Prima Mente builds biological foundation models to identify new biomarkers, subtypes and drug targets of Alzheimer’s, Parkinson’s and ALS. With its Isambard allocation, the company is developing Pleiades 2, a foundation model combining five biological data modalities.

Achieving nearly 3x speedups in model training with NVIDIA Blackwell GPUs, Prima Mente also uses NVIDIA Parabricks for genomic data processing and NVIDIA Transformer Engine for model optimization.

“Research shows Alzheimer’s might be 25 different subgroups of disease, and we want to help by using AI to identify these subtypes and the biology within the cells as they change,” said Hannah Madan, cofounder of Prima Mente.

Video courtesy of Nebius and Prima Mente.

AI Talent, Policy and Production

NVIDIA’s £2 billion investment in the U.K. startup ecosystem — in collaboration with leading venture capital firms — is bringing new capital and advanced AI infrastructure to major U.K. hubs including London, Oxford, Cambridge and Manchester.

U.K. membership in the NVIDIA Inception program has increased by 50% over the past year. AI-native companies like Doubleword, Synthesia and PolyAI are scaling globally from U.K. roots.

At last year’s London Tech Week, NVIDIA announced a collaboration with the U.K. Department for Science, Innovation and Technology on 6G and AI skills. The 6G collaboration has seeded testbeds at four U.K. universities. In May, the NVIDIA Deep Learning Institute (DLI) delivered two new courses, added to support the nation’s wireless research community, to participants from over 30 U.K. universities.

Plus, as part of this AI skills collaboration, NVIDIA DLI courses are offered as part of QA’s AI Apprenticeships in England.

And the NVIDIA Developer Program now includes more than 200,000 U.K. developers.

The Sovereign AI Forum, which launched last year with seven charter members, convenes the country’s AI leadership to turn policy into deployment roadmaps. Over the past year, the Forum has welcomed dozens of participants across government, industry and the startup community.

And enterprise AI is moving from pilot to production:

Apian is building digital twins of two National Health Service hospitals, combining autonomous devices, ground robots, computer vision and robotic simulation.
Deliverance AI is helping regulated enterprises to run, govern and scale AI agents inside their own environment — through a single control plane. The Agentic Operating System is built for organizations where data sovereignty is non-negotiable.
Glass Futures has installed an AI-driven digital twin of its glass furnace capable of testing and predicting new, optimal ways to make glass. The digital twin taps into NVIDIA accelerated computing and the NVIDIA PhysicsNeMo framework.
Orbital Industries has announced codesigned, NVIDIA Vera Rubin DSX AI Factory-compliant AI infrastructure that accelerates time to first token.
Reading Football Club is partnering with Stelia to establish an AI Centre of Excellence, combining Stelia’s full-stack AI platform with accelerated compute infrastructure from NVIDIA and Lenovo.

It all reflects momentous progress in U.K. AI leadership — and offers a glimpse of where it’s heading.

Join NVIDIA at London Tech Week.

NVIDIA and Google Cloud Empower the Next Wave of AI Builders

Posted on May 21, 2026 by faz_business

At this year’s Google I/O conference, NVIDIA and Google Cloud are accelerating the work of more than 100,000 developers in the companies’ joint developer community, which provides curated learning paths, hands-on labs and events that help them build using the full-stack NVIDIA AI platform on Google Cloud.

Launched at Google I/O last year, the community brings together developers, data scientists and machine learning engineers who want to sharpen their AI skills on the latest NVIDIA and Google Cloud technologies.

New additions for the community are rolling out this year, including a learning path for using the JAX library on NVIDIA GPUs, a new NVIDIA Dynamo codelab focused on inference optimizations, as well as monthly developer livestreams.

Over the last year, the community has become a go-to hub for AI builders using NVIDIA-accelerated tools for data science and machine learning. Members have built production-ready retrieval-augmented generation applications on Google Kubernetes Engine (GKE) and instrumented observability for agent workloads.

These AI builders are also experimenting with new large language model research and prototyping hybrid on‑premises and cloud inference for real‑world use cases like sports analytics and enterprise data pipelines.

Building With Google DeepMind’s Gemma, NVIDIA Nemotron and Open Frameworks

NVIDIA and Google Cloud are equipping developers with learning resources and hands-on labs that combine NVIDIA libraries, open models and tools with Google Cloud’s AI platform — so they can build optimized, production‑ready AI applications faster.

For example, developers can accelerate data science and analytics with the NVIDIA cuDF library in Google Colab Enterprise or Dataproc, or deploy multi-agent applications by combining Google DeepMind’s Gemma 4 models, NVIDIA Nemotron open models and Google Agent Development Kit with Google Cloud G4 VMs powered by NVIDIA RTX PRO 6000 Blackwell GPUs in Google Cloud Run or with spot instances.

NVIDIA and Google Cloud work closely across open frameworks like JAX so developers can build, scale and productize JAX workloads on NVIDIA AI infrastructure on Google Cloud — from single‑GPU experiments to multi‑rack deployments — while getting strong performance and a consistent experience.

This work extends to Google Cloud AI Hypercomputer, where the MaxText framework uses these JAX optimizations to train large models efficiently on NVIDIA GPUs.

Building on the same foundation, NVIDIA Dynamo on GKE helps developers optimize large-scale inference — including mixture-of-experts models — so they can serve AI applications more efficiently with NVIDIA accelerated infrastructure on Google Cloud.

To help developers get hands-on with these capabilities, a new learning path on running and scaling JAX on NVIDIA GPUs and a new NVIDIA Dynamo on GKE inference codelab will become available next month for members in the Google Cloud and NVIDIA developer community.

Advancing Responsible AI With Google DeepMind’s SynthID and NVIDIA Cosmos

AI agents are increasingly built from a system of AI models — combining proprietary and open source models that reason, plan and act on users’ behalf.

Amid this shift, trust and transparency are foundational, so developers and organizations can understand how these systems work and what they generate.

NVIDIA was the first industry partner to collaborate with Google DeepMind on SynthID, an AI watermarking technology that embeds robust digital watermarks directly into AI‑generated content, which helps preserve the integrity of outputs from NVIDIA Cosmos world foundation models available on build.nvidia.com.

Cosmos models provide rich 3D perception and simulation capabilities for robots, autonomous machines and other physical AI systems, while SynthID brings content transparency to the imagery and video they rely on.

Together, they help preserve the integrity of AI‑generated content so developers can build and deploy agentic applications more responsibly across cloud, edge and real‑world environments.

Building on a Full-Stack NVIDIA and Google Cloud Platform

This year, Google I/O is putting the spotlight on new agentic experiences and tools for developers — and NVIDIA and Google Cloud are focused on ensuring builders have the infrastructure, software and learning resources they need to make the most of them.

For developers in the community building on NVIDIA and Google Cloud, the skills and tools they learn can scale, effortlessly taking projects from prototype to enterprise‑grade workloads.

At Google Cloud Next, Google Cloud and NVIDIA expanded their full-stack platform to help developers train, deploy and operationalize agents on Google Cloud. This collaboration includes work on NVIDIA Vera Rubin-powered A5X instances, Google DeepMind Gemini models and more, and is being harnessed by leading AI labs and enterprises including OpenAI, Thinking Machines Lab, Schrödinger, Salesforce, Snap and CrowdStrike. Learn more in this blog.

Join the NVIDIA and Google Cloud developer community to connect with other builders and stay up to date on new tools, developer events and programs.

NVIDIA and ServiceNow Partner on New Autonomous AI Agents for Enterprises

Posted on May 6, 2026 by faz_business

Enterprise AI has learned to generate. It has learned to reason. Now companies are asking the next question: How should AI act?

Early agent systems have shown what’s possible, moving beyond simple prompts to take on more complex tasks. The next step is bringing those capabilities into enterprise environments — where agents must operate with context, control and consistency across real workflows.

At ServiceNow Knowledge 2026, NVIDIA founder and CEO Jensen Huang joined ServiceNow chairman and CEO Bill McDermott during the opening keynote to discuss the next phase of enterprise AI.

The companies are expanding their collaboration across the full stack, delivering specialized autonomous AI agents that are safe and easy to adopt — powered by NVIDIA accelerated computing, open models, domain-specific skills and secure agent execution software, and bringing together enterprise workflow context from ServiceNow Action Fabric and governance from ServiceNow AI Control Tower.

ServiceNow is introducing Project Arc, a long-running, self-evolving autonomous desktop agent designed for knowledge workers, including developers, IT teams and administrators.

Unlike standalone AI agents, Project Arc connects natively to the ServiceNow AI Platform through ServiceNow Action Fabric to bring governance, auditability and workflow intelligence to every action the autonomous desktop agent takes. It can access the local file systems, terminals and applications installed on a machine to complete complex, multistep tasks that traditional automation can’t handle, but with the controls enterprises actually need to deploy AI at scale.

The work is designed around three requirements every company will have for long-running autonomous agents: open models and domain-specific skills that can be customized, security that helps agents act without exposing sensitive data or systems, and AI factories that deliver efficient tokenomics.

Bringing this level of autonomy to enterprises requires control from the start.

Project Arc uses NVIDIA OpenShell, an open source secure runtime for developing and deploying autonomous agents in sandboxed, policy-governed environments. ServiceNow is building on and contributing to OpenShell to advance a common foundation for secure, enterprise-grade agent execution. With OpenShell, enterprises can define what an agent can see, which tools it can use and how each action is contained.
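The idea of defining what an agent can see and which tools it can use can be illustrated with a small deny-by-default allowlist check. The sketch below is not OpenShell's API: the AgentPolicy class, its fields and the authorize helper are hypothetical names chosen for illustration, and a real sandbox would presumably enforce such rules below the application layer rather than in Python.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy object; not part of any NVIDIA or ServiceNow API."""
    allowed_tools: FrozenSet[str]      # tools the agent may invoke
    readable_roots: Tuple[str, ...]    # directory trees the agent may see

    def can_use(self, tool: str) -> bool:
        return tool in self.allowed_tools

    def can_read(self, path: str) -> bool:
        parts = PurePosixPath(path).parts
        if ".." in parts:
            return False  # refuse traversal instead of trying to resolve it
        for root in self.readable_roots:
            root_parts = PurePosixPath(root).parts
            if parts[:len(root_parts)] == root_parts:
                return True
        return False


def authorize(policy: AgentPolicy, tool: str, path: Optional[str] = None) -> str:
    """Return 'allow' or a denial reason; anything not allowlisted is denied."""
    if not policy.can_use(tool):
        return "deny: tool not allowed"
    if path is not None and not policy.can_read(path):
        return "deny: path outside allowed roots"
    return "allow"


policy = AgentPolicy(
    allowed_tools=frozenset({"read_file", "run_tests"}),
    readable_roots=("/workspace/project",),
)
print(authorize(policy, "read_file", "/workspace/project/src/app.py"))
print(authorize(policy, "read_file", "/workspace/project/../secrets.txt"))
print(authorize(policy, "send_email"))
```

Comparing path components rather than string prefixes keeps a sibling directory such as /workspace/project-secrets from matching the /workspace/project root.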

“Project Arc represents the next step in our ongoing collaboration with NVIDIA, bringing autonomous execution to the desktop,” said Jon Sigler, executive vice president and general manager of AI Platform at ServiceNow. “By combining OpenShell’s runtime layer with ServiceNow AI Control Tower, and powered by ServiceNow Action Fabric, we’re delivering the governance and security that enterprise AI requires.”

Open Models and Agent Skills Scale Enterprise AI

To be effective, enterprise AI systems must be adaptable. NVIDIA and ServiceNow are building on an open ecosystem that allows organizations to tailor models and applications to their specific domains and data.

NVIDIA agent skills enable specialized agents, such as ServiceNow AI Specialists, to deliver targeted capabilities across enterprise workflows. For example, the NVIDIA AI-Q Blueprint for building specialized deep research agents empowers ServiceNow AI Specialists to gather context, synthesize information and support more complex decision-making across business functions.

In addition, the NVIDIA Agent Toolkit, including NVIDIA Nemotron open models, provides flexible building blocks and specialized skills for developing customized AI applications. To help ensure these systems perform reliably in real-world use, the companies are also advancing NOWAI-Bench, an open benchmarking suite for enterprise AI agents, integrated with the NVIDIA NeMo Gym library. NOWAI-Bench includes EnterpriseOps-Gym, one of the industry’s most challenging enterprise agent benchmarks, where Nemotron 3 Super currently ranks No. 1 among open source models.

Unlike general benchmarks, these evaluations focus on multistep workflows — where enterprise AI systems often encounter real challenges — helping teams build agents that perform reliably in production environments.

Efficient AI Factories

As AI agents become long running and always on, scaling them across millions of workflows requires not just capability but efficiency — making token economics central to enterprise AI.

NVIDIA AI factories are built to deliver the lowest-cost, most-efficient tokenomics for production AI. The NVIDIA Blackwell platform delivers more than 50x greater token output per watt than NVIDIA Hopper, resulting in nearly 35x lower cost per million tokens. For enterprises running agents across millions of workflows, that efficiency can determine how quickly AI moves from pilots to broad production use.
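To make the cost claim concrete, the arithmetic can be sketched as follows. Only the 35x factor comes from the text, and because the text says "nearly 35x" it is best read as an upper bound. The baseline price, tokens per workflow and workflow count are hypothetical inputs, not measured figures.

```python
from fractions import Fraction

COST_REDUCTION = 35  # "nearly 35x" in the text, so treat this as an upper bound

# Hypothetical inputs, chosen only to make the arithmetic concrete.
baseline_usd_per_million_tokens = Fraction(2)   # assumed prior-generation cost
tokens_per_workflow = 40_000                    # assumed tokens per agent workflow
workflows = 1_000_000                           # assumed number of workflows

total_millions = Fraction(tokens_per_workflow * workflows, 1_000_000)
baseline_cost = total_millions * baseline_usd_per_million_tokens
reduced_cost = baseline_cost / COST_REDUCTION

print(f"tokens: {float(total_millions):,.0f} million")
print(f"baseline: ${float(baseline_cost):,.2f}")
print(f"at 35x lower cost: ${float(reduced_cost):,.2f}")
```

Under these assumed inputs the same workload drops from $80,000.00 to about $2,285.71, which shows why the per-token cost dominates once agents run across millions of workflows.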

ServiceNow AI Control Tower integrates with the NVIDIA Enterprise AI Factory validated design, extending governance and observability to large-scale AI workloads. With added agent observability capabilities, organizations can monitor behavior in real time and manage AI systems across their full lifecycle — from deployment to optimization.

AI is becoming a new way that work gets done. What’s changing now is that the core pieces required to deploy it at scale — capable agents, built-in guardrails and proven performance — are all coming together.

The companies that move fastest will be the ones that give agents the infrastructure to act, the context to make decisions and the governance to keep every action accountable — and NVIDIA and ServiceNow are making this a reality for the world’s enterprises.

Learn more about NVIDIA OpenShell and the NVIDIA AI-Q Blueprint.

Nemotron Labs: What OpenClaw Agents Mean for Every Organization

Posted on May 2, 2026 by faz_business

Editor’s note: This post is part of the Nemotron Labs blog series, which explores how the latest open models, datasets and training techniques help businesses build specialized AI systems and applications on NVIDIA platforms. Each post highlights practical ways to use an open stack to deliver real value in production — from transparent research copilots to scalable AI agents.

By early 2026, the open source project OpenClaw had become a phenomenon. In January, its GitHub star count crossed 100,000 as developer interest surged. Community dashboards and traffic analytics showed more than 2 million visitors in a single week. By March, OpenClaw topped 250,000 stars — overtaking React to become the most-starred software project on GitHub in just 60 days.

Created by Peter Steinberger, OpenClaw is a self-hosted, persistent AI assistant designed to run locally or on private servers. The project drew attention for its accessibility and unbounded autonomy: Users could deploy an AI model locally without depending on cloud infrastructure or external application programming interfaces (APIs).

Most AI agents today are triggered by a prompt, complete a defined task and then stop running. A long-running autonomous agent, or “claw,” works differently. These agents run persistently in the background, completing tasks on their own and surfacing only what requires a human decision. They operate on a heartbeat: At regular intervals, they check their task list, evaluate what needs action, and either act or wait for the next cycle.
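The heartbeat pattern described above can be sketched in a few lines. This is a minimal illustration rather than OpenClaw's implementation: the Task and HeartbeatAgent names, the needs_human flag and the bounded loop are all assumptions made so the example is self-contained and terminates.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Task:
    """A unit of work the agent may act on during a heartbeat (hypothetical type)."""
    name: str
    is_due: Callable[[int], bool]   # decides, per cycle, whether action is needed
    needs_human: bool = False       # True if the item must be surfaced for a decision


@dataclass
class HeartbeatAgent:
    """Toy long-running agent: on every tick it checks its task list, then acts or waits."""
    tasks: List[Task]
    acted: List[str] = field(default_factory=list)      # actions taken autonomously
    escalated: List[str] = field(default_factory=list)  # items surfaced to a human

    def tick(self, cycle: int) -> None:
        for task in self.tasks:
            if not task.is_due(cycle):
                continue  # nothing to do for this task; wait for the next cycle
            if task.needs_human:
                self.escalated.append(f"cycle {cycle}: {task.name}")
            else:
                self.acted.append(f"cycle {cycle}: {task.name}")

    def run(self, cycles: int) -> None:
        # A real agent would sleep between ticks and run indefinitely;
        # the loop is bounded here so the example terminates.
        for cycle in range(cycles):
            self.tick(cycle)


agent = HeartbeatAgent(tasks=[
    Task("check inbox", is_due=lambda c: True),
    Task("rotate logs", is_due=lambda c: c % 3 == 0),
    Task("approve refund", is_due=lambda c: c == 2, needs_human=True),
])
agent.run(cycles=4)
print(agent.acted)
print(agent.escalated)
```

Over four cycles the agent acts on six items by itself and surfaces only the single refund approval, which mirrors the "surface only what requires a human decision" behavior.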

OpenClaw’s rapid adoption also sparked debate. Security researchers raised concerns about how self-hosted AI tools manage sensitive data, authentication and model updates. Others questioned whether local deployments could expose users to new risks — from unpatched server instances to malicious contributions in community forks. As contributors and maintainers worked to address these issues, OpenClaw’s rise prompted a broader conversation across the AI ecosystem about the trade-offs between openness, privacy and safety.

To help enhance the security and robustness of the OpenClaw project, NVIDIA is collaborating with Steinberger and the OpenClaw developer community to address potential vulnerabilities, as detailed in a recent blog post by OpenClaw.

NVIDIA contributes code and guidance focused on improving model isolation, better managing local data access and strengthening the processes for verifying community code contributions. NVIDIA’s goal is to support the project’s momentum by contributing security and systems expertise in an open, transparent way that strengthens the community’s work while preserving OpenClaw’s independent governance.

To help make long-running agents safer for enterprises, NVIDIA also introduced NVIDIA NemoClaw, a reference implementation that uses a single command to install OpenClaw, the NVIDIA OpenShell secure runtime and NVIDIA Nemotron open models with hardened defaults for networking, data access and security. NemoClaw serves as a blueprint for organizations to deploy claws more securely.

Inference Demand Multiplies With Each AI Wave

AI has moved through four phases, and the time between each is shortening. Predictive AI took years to become mainstream. Generative AI moved faster. Reasoning AI arrived faster still. Autonomous AI — the wave OpenClaw represents — is setting an even faster pace.

What compounds with each wave is inference demand. Generative AI increased token usage over predictive AI. Reasoning AI increased it another 100x. Autonomous agents, which run continuously and act across long time horizons, drive inference demand up by another 1,000x over reasoning AI. Each wave multiplies the compute required.
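Because each multiplier applies on top of the previous one, the factors compound rather than add. A short calculation using only the two figures quoted above makes this explicit. The step from predictive to generative AI is omitted because the text gives no number for it, and the dictionary name is illustrative.

```python
REASONING_OVER_GENERATIVE = 100      # "another 100x" from the text
AUTONOMOUS_OVER_REASONING = 1_000    # "another 1,000x" from the text

# Express each wave relative to generative AI (= 1).
relative_demand = {"generative": 1}
relative_demand["reasoning"] = relative_demand["generative"] * REASONING_OVER_GENERATIVE
relative_demand["autonomous"] = relative_demand["reasoning"] * AUTONOMOUS_OVER_REASONING

for wave, factor in relative_demand.items():
    print(f"{wave:>10}: {factor:,}x")
```

Taken at face value, the quoted figures imply that autonomous agents drive roughly 100,000x the inference demand of generative AI.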

This increase in token usage is enabling organizations to speed their productivity by orders of magnitude. For example, long-running agents can help researchers work through a problem overnight, iterate on a design across thousands of configurations, or monitor systems and surface only the anomalies that require human judgment — freeing up researchers’ work days for higher-value tasks.

Choosing the Tool: When to Deploy a ‘Claw’

While generative AI has become a staple for on-demand tasks, there are specific scenarios where the persistent “heartbeat” of a claw offers distinct advantages. Determining when to move from a standard prompt-based AI to a long-running agent often comes down to the nature of the workflow:

From “On-Demand” to “Always-On”: While standard models are excellent for immediate, human-triggered queries, claws are often better suited for tasks that require continuous background monitoring or periodic system checks without a manual start.
Managing High-Iteration Loops: For complex problems, like testing thousands of chemical combinations or simulating infrastructure stress tests, a claw can manage the sheer volume of iterations that might otherwise be bottlenecked by human intervention.
Shifting from Suggestions to Actions: In many workflows, standard AI is used to provide information or drafts. A claw is often considered when the goal is for the AI to move into the execution phase — interacting with APIs, updating databases or managing files across a long time horizon.
Resource Optimization: For massive, token-heavy reasoning tasks, deploying a local claw on dedicated hardware like an NVIDIA DGX Spark personal AI supercomputer allows for more predictable costs and data privacy compared with high-frequency cloud API calls.

How Are Organizations Using Long-Running Autonomous Agents?

The practical applications of long-running autonomous agents span every function and sector.

In financial services, agents continuously monitor trading systems and regulatory feeds, flagging material events before the morning review. In drug discovery, agents sweep new scientific literature, extracting relevant findings and updating internal databases in real time without researcher intervention — a process that previously took weeks.

In engineering and manufacturing, agents speed problem analysis by testing thousands of parameter combinations, ranking results and flagging the configurations worth examining — and all this can happen overnight.
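The sweep-rank-flag loop an agent might run overnight can be sketched with the standard library. The parameter names, ranges and scoring function below are hypothetical stand-ins. In practice the score would come from a simulation or a test rig.

```python
import heapq
from itertools import product


def score(temp_c: int, pressure_kpa: int, speed_rpm: int) -> float:
    # Toy objective (hypothetical): penalize distance from an arbitrary sweet spot.
    return -((temp_c - 70) ** 2
             + (pressure_kpa - 110) ** 2 / 4
             + (speed_rpm - 1500) ** 2 / 10_000)


grid = {
    "temp_c": range(40, 101, 5),           # 13 values
    "pressure_kpa": range(90, 131, 5),     # 9 values
    "speed_rpm": range(1000, 2001, 100),   # 11 values
}

# Enumerate every combination, then keep only the few worth a human's attention.
configs = list(product(*grid.values()))
top = heapq.nlargest(3, configs, key=lambda c: score(*c))

print(f"evaluated {len(configs)} configurations")
for cfg in top:
    print(dict(zip(grid, cfg)), round(score(*cfg), 2))
```

heapq.nlargest avoids sorting the full result set, which matters more as the grid grows from thousands to millions of combinations.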

In IT operations, agents diagnose infrastructure incidents, apply known remediations and escalate only the novel problems, compressing average time to resolution from hours to minutes. At ServiceNow, AI Specialists leveraging Apriel and NVIDIA Nemotron models can resolve 90% of tickets autonomously.

How Can Companies Deploy Autonomous Agents Responsibly?

Autonomous agents are hands-on. They can send communications, write files, call APIs and update live systems. When an agent takes a wrong action, there are real consequences. Getting the accountability framework right from the start is essential, and organizations deploying autonomous agents in production must treat governance as a first-order requirement.

Organizations need to see what their agents are doing, inspect their reasoning at each step, audit their actions and intervene when needed.
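One way to picture these four needs (visibility, inspectable reasoning, auditing and intervention) is a wrapper that records every action alongside the agent's stated rationale and routes risky actions through an approval hook. The sketch below is illustrative only. AuditedAgent, AuditRecord and the approve callback are hypothetical names, and the lambda stands in for a human reviewer.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass
class AuditRecord:
    step: int
    action: str
    reasoning: str   # the agent's stated rationale, kept for later inspection
    status: str      # "executed" or "blocked"


class AuditedAgent:
    """Toy wrapper (hypothetical): logs every action and lets a reviewer intervene."""

    def __init__(self, approve: Callable[[str], bool]) -> None:
        self.approve = approve            # human-in-the-loop hook for risky actions
        self.log: List[AuditRecord] = []

    def act(self, action: str, reasoning: str, risky: bool = False) -> bool:
        allowed = self.approve(action) if risky else True
        status = "executed" if allowed else "blocked"
        self.log.append(AuditRecord(len(self.log) + 1, action, reasoning, status))
        return allowed

    def export(self) -> str:
        # One JSON object per line, a common shape for audit trails.
        return "\n".join(json.dumps(asdict(r)) for r in self.log)


# Stand-in for a human reviewer: refuse anything that deletes data.
agent = AuditedAgent(approve=lambda action: not action.startswith("delete"))
agent.act("read incident report", "need context before proposing a fix")
agent.act("delete stale database rows", "rows look unused", risky=True)
print(agent.export())
```

Blocked actions are logged as well as executed ones, so the trail shows what the agent attempted and not only what it was allowed to do.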

Organizations deploying autonomous agents responsibly are focused on three priorities:

An open, auditable framework: NemoClaw is built on OpenClaw’s MIT licensed codebase, which means organizations own the full agent harness. They can read, fork and modify every layer of how their agents are built and deployed. That transparency enables teams to understand and control the system at the code level. Running open source models like NVIDIA Nemotron locally keeps sensitive workloads, including patient records, legal documents, financial transactions and proprietary research, within the organization’s own environment, ensuring that trace data stays under organizational control.
Securing the runtime environment: NemoClaw runs agents inside OpenShell, a sandboxed environment that defines precisely what the agent can and cannot do, enforcing clear permission boundaries from the start.
Local compute: NVIDIA DGX Spark supercomputers deliver data-center-class GPU performance in a deskside form factor built for continuous local inference that’s always on, with local model hosting and data that stays within the organization’s environment. NVIDIA DGX Station systems scale that capability for teams running multiple agents simultaneously across complex, sustained workloads.

The organizations defining what autonomous agents do in practice are accumulating something valuable: months of live operational learning, governance frameworks developed through real workloads and agents that have absorbed the institutional context that makes them genuinely useful. This foundation will only deepen over time.

Get Started With NVIDIA NemoClaw

Access a step-by-step tutorial on how to build a more secure AI agent with NemoClaw on NVIDIA DGX Spark. Explore how NemoClaw can deploy more secure, always-on AI assistants with a single command.

Experiment with NemoClaw, available on GitHub, and join the community of developers on Discord building with NemoClaw using NVIDIA Nemotron 3 Super and Telegram on DGX Spark.

Stay up to date on agentic AI, NVIDIA Nemotron and more by subscribing to NVIDIA AI news, joining the community and following NVIDIA AI on LinkedIn, Instagram, X and Facebook.

Explore self-paced video tutorials and livestreams.

NVIDIA Launches Nemotron 3 Nano Omni Model, Unifying Vision, Audio and Language for up to 9x More Efficient AI Agents

Posted on April 28, 2026 by faz_business

AI agent systems today juggle separate models for vision, speech and language — losing time and context as they pass data from one model to the other.

Unveiled today, NVIDIA Nemotron 3 Nano Omni is an open multimodal model that brings these capabilities together into one system, enabling agents to deliver faster, smarter responses with advanced reasoning across video, audio, image and text. This best-in-class model gives enterprises and developers a production path for more efficient and accurate multimodal AI agents with full deployment flexibility and control.

Nemotron 3 Nano Omni sets a new efficiency frontier for open multimodal models with leading accuracy and low cost, topping six leaderboards for complex document intelligence, and video and audio understanding.

At a Glance

What it is: An open, omni-modal reasoning model, the highest-efficiency open multimodal model of its kind with leading accuracy
What it handles: Text, images, audio, video, documents, charts and graphical interfaces (input); text (output)
Who it’s for: Enterprises and developers building fast, reliable agentic systems that need a multimodal perception sub-agent
How it works: Functions as the “eyes and ears” in a system of agents, working alongside models like Nemotron 3 Super and Ultra or other proprietary models
Why it matters: Leading multimodal accuracy and 9x higher throughput than other open omni models with the same interactivity, resulting in lower cost and better scalability without sacrificing responsiveness
Architecture: 30B-A3B hybrid MoE with Conv3D, EVS, 256K context
Availability: April 28, 2026 via Hugging Face, OpenRouter, build.nvidia.com and 25+ partner platforms

AI and software companies already adopting Nemotron 3 Nano Omni include Aible, Applied Scientific Intelligence (ASI), Eka Care, Foxconn, H Company, Palantir and Pyler, with Dell Technologies, Docusign, Infosys, K-Dense, Lila, Oracle and Zefr evaluating the model.

“To build useful agents, you can’t wait seconds for a model to interpret a screen,” said Gautier Cloix, CEO of H Company. “By building on Nemotron 3 Nano Omni, our agents can rapidly interpret full HD screen recordings — something that wasn’t practical before. This isn’t just a speed boost: It’s a fundamental shift in how our agents perceive and interact with digital environments in real time.”

Nemotron 3 Nano Omni Enables Faster, Leaner Multimodal Agents

Consider an AI agent for customer support processing a screen recording while analyzing uploaded call audio and checking data logs — or an agent for finance tasked with parsing PDFs, spreadsheets, charts and voice notes. Today, most agentic systems accomplish these tasks with separate models for vision, speech and language.

This approach increases latency through repeated inference passes, fragments context across modalities, and adds cost and inaccuracies over time.

By combining vision and audio encoders within its 30B-A3B, hybrid mixture-of-experts architecture, Nemotron 3 Nano Omni eliminates the need for separate perception models, driving inference efficiency at scale. It pairs this efficiency with strong multimodal perception accuracy, enabling AI systems to achieve 9x higher throughput than other open omni models with the same interactivity. The result is lower costs and better scalability without sacrificing responsiveness or quality.

In agentic systems, Nemotron 3 Nano Omni can work alongside other NVIDIA Nemotron open models, such as Nemotron 3 Super for high-frequency execution or Nemotron 3 Ultra for complex planning, as well as proprietary cloud models from other providers, to power sub-agents for agentic workflows such as computer use, document intelligence and audio-video reasoning.

Computer use agents: Nemotron 3 Nano Omni powers the perception loop for agents navigating graphical user interfaces, reasoning over onscreen content and understanding user interface state over time. H Company’s latest computer use agent, powered by Nemotron 3 Nano Omni, uses a native input resolution of 1920×1080 pixels to achieve high-fidelity visual reasoning. In preliminary evaluations on the OSWorld benchmark, this integration showed a significant leap in navigating complex graphical interfaces, drawing on Nemotron 3 Nano Omni’s ability to process very high-resolution images.
Document intelligence: Nemotron 3 Nano Omni interprets documents, charts, tables, screenshots and mixed-media inputs, enabling agents to reason coherently across visual structure and text content. This is critical for enterprise analysis and compliance workflows.
Audio and video understanding: For customer service, research and monitoring workflows, Nemotron 3 Nano Omni maintains audio-video context, tying what was said, shown and documented into a single reasoning stream instead of disconnected summaries.

Open and Customizable, Deployable Anywhere

Nemotron 3 Nano Omni is released with open weights, datasets and training techniques — giving organizations full transparency and control over how the model is customized and deployed.

Developers can use tools like NVIDIA NeMo for customization, evaluation and optimization for domain-specific use cases. Because the Nemotron family of models is open, organizations can deploy them in environments that meet regulatory, sovereignty or data localization requirements.

The Nemotron 3 family — including Nano, Super and Ultra models — has seen over 50 million downloads in the past year. Omni extends the family’s capabilities into multimodal and agentic domains.

The model is available on Hugging Face, OpenRouter and build.nvidia.com as an NVIDIA NIM microservice and through a broad ecosystem of NVIDIA Cloud Partners, inference platforms and cloud service providers.

Its open, lightweight architecture supports consistent deployment from local systems like NVIDIA Jetson modules, NVIDIA DGX Spark and DGX Station to data center and cloud environments.

Visit the NVIDIA technical blog for tutorials, cookbooks and deployment guides for Nemotron 3 Nano Omni use cases. Stay up to date on agentic AI, NVIDIA Nemotron and more by subscribing to NVIDIA news, joining the community and following NVIDIA AI on LinkedIn, Instagram, X and Facebook.

Explore self-paced video tutorials and livestreams.


OpenAI’s New GPT-5.5 Powers Codex on NVIDIA Infrastructure

Posted on April 25, 2026 by faz_business

AI agents have revolutionized developer workflows, and their next frontier is knowledge work: processing information, solving complex problems, coming up with new ideas and driving innovation.

Codex, OpenAI’s agentic coding application, is enabling this new frontier. It’s now powered by GPT-5.5, OpenAI’s latest frontier model, which runs on NVIDIA GB200 NVL72 rack-scale systems.

Over 10,000 NVIDIANs — across engineering, product, legal, marketing, finance, sales, HR, operations and developer programs — are already using GPT-5.5-powered Codex to achieve, in their words, “mind-blowing” and “life-changing” results.

NVIDIA engineers have had access to GPT-5.5 through the Codex app for a few weeks, and the gains are measurable. GPT-5.5 is served on GB200 NVL72, which is capable of delivering 35x lower cost per million tokens and 50x higher token output per second per megawatt compared with prior-generation systems. Those economics make frontier-model inference viable at enterprise scale.

Debugging cycles that once stretched across days are closing in hours. Experimentation that previously required weeks is turning into overnight progress in complex, multi-file codebases. Teams are shipping end-to-end features from natural-language prompts, with stronger reliability and fewer wasted cycles than earlier models.

OpenAI’s stunning progress is just the latest example of NVIDIA’s work with every frontier model company, not just to accelerate the use of AI agents inside NVIDIA, but to help the company’s partners build the world’s best, lowest-cost and most power-efficient models for everyone.

As NVIDIA founder and CEO Jensen Huang told employees in a company-wide email urging everyone to use Codex: “Let’s jump to lightspeed. Welcome to the age of AI.”

A Deployment Built for Enterprise Security

Just like humans, every agent needs its own dedicated computer.

To ensure seamless operation within secure enterprise environments, the Codex app supports remote Secure Shell (SSH) connections to approved cloud virtual machines, allowing agents to work with real company data without exposing it externally.

To ensure maximum security and auditability, NVIDIA IT rolled out cloud virtual machines (VMs) for every employee to run their agent safely. Each VM provides a dedicated sandbox where the agent can operate at its full capabilities while every action remains auditable. Users can control the Codex agent running in the cloud VM from a user interface that every employee is familiar with.

A zero-data retention policy governs NVIDIA’s deployment, and agents access production systems with read-only permissions through command-line interfaces and Skills — the same agentic toolkit NVIDIA uses to run automation workflows across the company.

A Decade of Full-Stack Collaboration

The GPT-5.5 launch and the Codex rollout reflect more than 10 years of collaboration between NVIDIA and OpenAI. The partnership began in 2016, when Huang hand-delivered the first NVIDIA DGX-1 AI supercomputer to OpenAI’s San Francisco headquarters.

Since then, the two companies have worked closely across the full AI stack.

NVIDIA was a day-zero partner for OpenAI’s gpt-oss open-weight model launch, optimizing model weights for NVIDIA TensorRT-LLM and ecosystem frameworks including vLLM and Ollama.

OpenAI has committed to deploying more than 10 gigawatts of NVIDIA systems for its next-generation AI infrastructure — a buildout that will put millions of NVIDIA GPUs at the foundation of OpenAI’s model training and inference for years ahead.

And OpenAI and NVIDIA are early silicon and codesign partners: OpenAI provides feedback that informs NVIDIA’s hardware roadmap, and in turn gains early access to new architectures. That relationship produced a concrete milestone — the joint bring-up of the first GB200 NVL72 100,000-GPU cluster. The cluster completed multiple large-scale training runs and set a new benchmark for system-level reliability at frontier scale.

GPT-5.5 is the product of that infrastructure running at full strength.

Learn more in OpenAI’s announcement.
