πŸ—οΈ AI Infrastructure

AWS Commits to Deploy 1 Million+ NVIDIA GPUs Including Blackwell and Vera Rubin Across Global Cloud Regions in 2026


Amazon Web Services announced a massive expansion of its NVIDIA GPU infrastructure at NVIDIA GTC 2026 on March 16, 2026, committing to deploy more than 1 million NVIDIA GPUs across its global cloud regions starting in 2026.

KEY ANNOUNCEMENTS:

  1. SCALE: 1 million+ NVIDIA GPUs including both Blackwell and Vera Rubin GPU architectures across AWS global regions. AWS already offers the broadest collection of NVIDIA GPU-based instances of any cloud provider.

  2. RTX PRO 4500 FIRST: AWS is the first major cloud provider to announce Amazon EC2 instances accelerated by NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs. These instances are optimized for data analytics, conversational AI, content generation, recommender systems, and rendering workloads, and are built on the AWS Nitro System for enhanced security.

  3. NIXL DISAGGREGATED INFERENCE: AWS announced support for the NVIDIA Inference Xfer Library (NIXL) with AWS Elastic Fabric Adapter (EFA) to accelerate disaggregated LLM inference. This enables high-throughput, low-latency KV-cache data movement between GPU compute nodes and distributed memory, which is critical for scaling modern agentic AI workloads. It works across both NVIDIA GPUs and AWS Trainium chips.

  4. NEMOTRON ON BEDROCK: Expanded NVIDIA Nemotron model support on Amazon Bedrock, giving developers access to NVIDIA open models through AWS managed infrastructure.

  5. SPARK ACCELERATION: 3x faster Apache Spark on Amazon EMR with NVIDIA RTX PRO 6000 GPUs, a significant gain for the data processing pipelines that feed AI agent systems.
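The disaggregated inference pattern in item 3 can be sketched in miniature: a prefill worker does the compute-heavy prompt pass and produces a KV cache, the cache is handed to a separate decode worker, and generation continues from that state. All class and function names below are illustrative stand-ins, not the actual NIXL API; in a real deployment the `transfer` step would be RDMA-style KV-cache movement over EFA rather than an in-process handoff.

```python
from dataclasses import dataclass, field

@dataclass
class KVCache:
    """Per-request key/value state produced during prefill (toy stand-in for GPU buffers)."""
    tokens: list
    layers: dict = field(default_factory=dict)

class PrefillWorker:
    """Runs the prompt pass and emits a KV cache."""
    def prefill(self, prompt_tokens):
        cache = KVCache(tokens=list(prompt_tokens))
        # Toy stand-in for per-layer attention K/V tensors.
        cache.layers = {layer: [(t, t) for t in prompt_tokens] for layer in range(2)}
        return cache

class DecodeWorker:
    """Generates tokens one at a time, reusing a transferred KV cache."""
    def decode(self, cache, steps):
        out = []
        for _ in range(steps):
            # Toy next-token rule: emit the current context length.
            nxt = len(cache.tokens)
            cache.tokens.append(nxt)
            out.append(nxt)
        return out

def transfer(cache):
    """Stand-in for the NIXL/EFA hop; the real path moves GPU memory between nodes."""
    return cache

prompt = [101, 7592, 102]
cache = PrefillWorker().prefill(prompt)
generated = DecodeWorker().decode(transfer(cache), steps=3)
print(generated)  # [3, 4, 5]
```

The point of the split is that prefill and decode have very different compute profiles, so separating them lets each worker pool scale independently, provided the cache handoff is fast enough; that handoff is exactly what NIXL over EFA targets.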
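GPU acceleration of Spark SQL typically runs through the NVIDIA RAPIDS Accelerator plugin, which Amazon EMR supports on GPU instance types. A hedged sketch of the relevant spark-submit configuration follows; the job name is a placeholder and exact jar setup, defaults, and tuning values vary by EMR release.

```shell
# Illustrative only: enabling the RAPIDS Accelerator for Apache Spark
# on a GPU-backed cluster. Values are examples, not AWS-documented defaults.
spark-submit \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=0.25 \
  --conf spark.rapids.memory.pinnedPool.size=2g \
  your_job.py
```

With the plugin enabled, supported SQL and DataFrame operations are transparently rewritten to run on the GPU, which is where the headline speedup for ETL-style pipelines comes from.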

BROADER GTC CONTEXT:

The AWS announcement is part of a massive wave of infrastructure commitments at GTC 2026:

  • NVIDIA CEO Jensen Huang projected $1 trillion in orders for Blackwell and Vera Rubin through 2027 (CNBC)
  • Google Cloud also announced Vera Rubin NVL72 support and fractional G4 VMs
  • Cisco expanded Secure AI Factory to edge deployments
  • Cognizant launched multi-tenant AI Factory powered by Dell/NVIDIA
  • EXL advanced EXLerate.ai agentic platform with NVIDIA tech stack

AGENTIC AI INFRASTRUCTURE FOCUS:

AWS explicitly framed the announcement around agentic AI: its infrastructure is designed to power AI systems capable of reasoning, planning, and acting autonomously across complex workflows. The NIXL integration is specifically optimized for the long-running, multi-step inference patterns characteristic of agentic workloads.

The Vera Rubin architecture (NVIDIA's latest) delivers 10x performance per watt over Grace Blackwell and supports 700 million tokens per second vs. 2 million for Hopper-era systems, a 350x improvement that fundamentally changes the economics of running autonomous AI agents at scale.
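The 350x figure follows directly from the quoted throughput numbers, which come from the article and are not independently verified here:

```python
# Sanity check on the throughput ratio quoted above (figures as reported).
hopper_tps = 2_000_000        # tokens/sec, Hopper-era systems
vera_rubin_tps = 700_000_000  # tokens/sec, Vera Rubin
speedup = vera_rubin_tps // hopper_tps
print(speedup)  # 350
```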
