Nvidia Earnings: Is Nvidia Tech Years Ahead of AMD and TPUs?

November 17, 2025 | BullxBear


As Nvidia’s earnings approach, all eyes are on whether the company can maintain the technological
momentum that helped it reach an unprecedented five-trillion-dollar valuation. Demand for AI silicon has
surged worldwide, turning datacenter GPUs into one of the fastest-growing markets in tech.

With this growth comes serious competition—led by AMD, Google TPUs, and startups like Cerebras, Groq,
Graphcore, Habana, Triton, and Tenstorrent.
This analysis focuses mainly on Nvidia versus AMD due to stronger public data, and incorporates Google and
startup ecosystems wherever credible information exists.

For anyone tracking Nvidia’s stock outlook, earnings setup, or long-term valuation potential, understanding
how Nvidia stacks up against the rest of the industry is essential.

Structure of This Article

AI hardware cloud-provider sales model
Nvidia’s ecosystem explained
Competition analysis
Benchmarks and conclusions

1. AI Hardware Cloud-Provider Sales Model

Cloud providers remain the world’s largest buyers of AI chips. Key players include CoreWeave, Oracle Cloud, AWS, Google Cloud, and Microsoft Azure. Most AI workloads run inside hyperscale data centers, so cloud pricing is a useful window into how AI compute is priced, deployed, and scaled.

Among these buyers, CoreWeave’s public pricing offers unusually transparent insight into how modern AI compute is commercialized.

Decoding CoreWeave’s Sales Model

Figure: CoreWeave GPU cloud pricing

Source: GPU Cloud Pricing

A single NVIDIA B200 GPU costs $68.80 per hour.
A GB200 NVL72 costs $42 per hour per GPU, but only when renting the full 72-GPU NVLink + NVSwitch cluster.
This reflects Jensen Huang’s well-known philosophy: “The more you buy, the more you save.”
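
To make the pricing gap concrete, the short sketch below works through the arithmetic using only the two hourly prices quoted above. The constant and function names are illustrative, and the comparison is rough because the B200 and GB200 are different parts sold in different configurations.

```python
# Hourly prices as quoted in this article (USD per GPU per hour).
B200_PER_GPU_HOUR = 68.80
GB200_NVL72_PER_GPU_HOUR = 42.00
GPUS_PER_NVL72_RACK = 72  # the GB200 price applies only to the full rack


def rack_cost_per_hour(price_per_gpu_hour, gpu_count):
    """Total hourly cost of renting gpu_count GPUs at a per-GPU price."""
    return price_per_gpu_hour * gpu_count


def discount_vs(baseline, discounted):
    """Fractional saving of the discounted price relative to the baseline."""
    return 1.0 - discounted / baseline


full_rack = rack_cost_per_hour(GB200_NVL72_PER_GPU_HOUR, GPUS_PER_NVL72_RACK)
discount = discount_vs(B200_PER_GPU_HOUR, GB200_NVL72_PER_GPU_HOUR)

print(f"Full NVL72 rack: ${full_rack:,.2f} per hour")
print(f"Per-GPU price is {discount:.1%} lower than the single-B200 price")
```

On these figures, the lower per-GPU rate comes with a minimum commitment of roughly $3,000 per hour, which is the "buy more, save more" trade in practice.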

To maximize performance and efficiency, CoreWeave prioritizes systems with:

Scalable configurations optimized for large customers.
Low power consumption to control datacenter energy budgets.
Robust software stacks that distribute work efficiently.
Ultra-fast GPU-to-GPU communication for maximum throughput.

Slow interconnects can waste compute cycles and reduce effective output. This is where Nvidia’s advantage becomes clear.

The ultimate goal: dozens—or even thousands—of GPUs must operate as one massive, unified computer with zero friction for the user.

2. Understanding the Nvidia Ecosystem

Nvidia has spent years building and acquiring the technologies that now form its deeply integrated, full-stack AI computing platform. This ecosystem scales smoothly from a single GPU to clusters with tens of thousands of GPUs. Much of today’s GPU competition centers on whether rivals can build ecosystems that match Nvidia’s breadth, integration, and maturity.

2.1 Nvidia GPUs

GPUs excel at highly parallel workloads containing thousands of threads.
AI training and inference require massive parallel computing, making GPUs the default architecture across the industry.
Researchers constantly push for higher parallelization to maximize GPU efficiency.

Competition

AMD Instinct GPUs provide competitive architectures and continue improving their software stack.
Google TPUs target large-scale AI workloads with matrix-centric compute.
Startups like Cerebras, Groq, Graphcore, Habana, Triton, and Tenstorrent take niche architectural bets to differentiate themselves.

Nvidia’s Position on Startup Architectures

Nvidia believes most startup accelerators serve narrow workloads.
GPUs are general-purpose, letting customers repurpose hardware as models evolve.
Startups such as d-Matrix and SambaNova have shifted toward inference-only strategies.
Example: d-Matrix raised $275 million for a product line focused entirely on inference, which suggests a retreat from the training market.

Source:
d-Matrix Raises $275 Million to Power the Age of AI Inference

2.2 NVLink and NVSwitch

Large AI models rely on many GPUs working together with minimal communication latency. Since the AlexNet breakthrough in 2012, Nvidia has invested aggressively in interconnects, one of the most overlooked technologies in AI computing.

Source:
What AlexNet Brought To The World Of Deep Learning

Figure: From a scientist’s perspective, the entire GB200 NVL72 system operates as a single computer.

How NVLink and NVSwitch Work

NVLink connects GPUs using high-bandwidth SerDes links.
NVSwitch enables full connectivity across these NVLink ports.
The GB200 NVL72 system provides independent GPU-to-GPU paths across all 72 GPUs.
This design produces extremely low latency and high throughput.
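
As a rough illustration of why a switched fabric matters at this scale, the sketch below compares hop counts for 72 devices under two toy topologies: a bidirectional ring and a single-stage switch where every pair is one switch traversal apart. Both topologies are simplifying assumptions for illustration only; the ring does not model any vendor's actual interconnect, and the single-stage figure is an idealization of the full connectivity described above.

```python
def ring_hops(a, b, n):
    """Shortest hop count between nodes a and b on a bidirectional ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)


def ring_stats(n):
    """Worst-case and average hop count from node 0 to every other node.

    A ring is symmetric, so node 0 is representative of all nodes.
    """
    hops = [ring_hops(0, j, n) for j in range(1, n)]
    return max(hops), sum(hops) / len(hops)


SWITCHED_HOPS = 1  # assumption: any pair is one switch traversal apart

n = 72
worst, avg = ring_stats(n)
pairs = n * (n - 1) // 2  # distinct GPU pairs the fabric has to serve

print(f"{n} GPUs form {pairs} distinct pairs")
print(f"Toy ring: worst case {worst} hops, average {avg:.2f} hops")
print(f"Single-stage switch: {SWITCHED_HOPS} hop for every pair")
```

The point of the comparison is only that hop count, and therefore latency, grows with cluster size on point-to-point topologies but stays flat behind a switch.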

More info: NVIDIA NVLink and NVLink Switch

Competition

No competitor currently matches Nvidia’s interconnect fabric.
Google TPUs use their own interconnect (Houdini/Ironwood).
AMD’s xGMI suffers from multi-hop latency at scale.
Upscale AI and partners are working on Ultra Accelerator Link (UALink), an open scale-up interconnect standard, but performance data is limited.

Source:
Upscale AI Launches with Over $100 Million Seed Round to Democratize AI Network Infrastructure and Advance Open Standards

2.3 BlueField DPU (Mellanox Acquisition)

Clusters larger than the 72 GPUs of a single GB200 NVL72 must connect nodes through Ethernet or InfiniBand switches. Here, BlueField DPUs become essential.

PCIe is inefficient for massive distributed workloads.
BlueField DPUs convert PCIe into high-performance Ethernet or InfiniBand.
DPUs aggregate small memory requests into efficient packets.
The DPU integrates ARM cores that offload virtualization duties, easing the workload on the system CPU.

Competition

No DPU competitor matches BlueField’s features or tight ecosystem integration.

2.4 InfiniBand and Spectrum-X Networking

Nvidia gained its InfiniBand switch business through the Mellanox acquisition.
InfiniBand provides sub-microsecond latency—down to ~130 nanoseconds port-to-port.
It uses credit-based flow control for lossless delivery and virtual lanes for traffic separation, which suits GPU clusters.
Source: QM8790 datasheet (https://network.nvidia.com/files/doc-2020/pb-qm8790.pdf)
Nvidia Spectrum-X provides an Ethernet option for non-Nvidia environments.
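
Credit-based flow control is easier to see in a toy model. In the sketch below, a sender may transmit only while it holds credits, and each credit corresponds to one free buffer slot at the receiver, so the receiver can never be overrun and nothing is dropped. This is a minimal illustration of the general technique, not InfiniBand's actual protocol; the class, the buffer size, and the drain rate are all hypothetical.

```python
from collections import deque


class CreditLink:
    """Toy lossless link: one credit equals one free receiver buffer slot."""

    def __init__(self, receiver_buffer_slots):
        self.credits = receiver_buffer_slots  # credits currently held by the sender
        self.rx_buffer = deque()

    def try_send(self, packet):
        """Send only if a credit is available; otherwise the sender must wait."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.rx_buffer.append(packet)
        return True

    def receiver_drain(self, n=1):
        """Receiver consumes up to n packets and returns one credit per packet."""
        drained = []
        for _ in range(min(n, len(self.rx_buffer))):
            drained.append(self.rx_buffer.popleft())
            self.credits += 1
        return drained


def run(num_packets=10, slots=4, drain_per_step=2):
    link = CreditLink(slots)
    pending = deque(range(num_packets))
    delivered, stalls = [], 0
    while pending or link.rx_buffer:
        # The sender pushes packets until it runs out of credits.
        while pending and link.try_send(pending[0]):
            pending.popleft()
        if pending:
            stalls += 1  # sender waited instead of overflowing the receiver
        delivered.extend(link.receiver_drain(drain_per_step))
    return delivered, stalls


delivered, stalls = run()
print(f"Delivered {len(delivered)} packets in order, sender stalled {stalls} times")
```

The sender slows down when the receiver is busy, but every packet arrives in order. Avoiding drops and retransmissions in this way is the property that makes a lossless fabric attractive for tightly synchronized GPU traffic.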

More info: NVIDIA Quantum InfiniBand Switches

Competition

Broadcom’s Tomahawk Ultra offers 250 nanoseconds latency best case.
Source: Broadcom Ships Tomahawk Ultra Ethernet Switch with 250ns Latency for AI and HPC

InfiniBand remains superior thanks to its 130-nanosecond latency, consistent lossless behavior, and 64 virtual lanes that allow non-blocking, efficient broadcasting of data.
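
Per-switch latency compounds when traffic crosses several switches. The sketch below multiplies the two port-to-port figures quoted above by a few hypothetical hop counts. The hop counts are assumptions, and the model ignores cable, NIC, serialization, and congestion delays, so it shows only the switch contribution.

```python
INFINIBAND_SWITCH_NS = 130  # port-to-port figure quoted above
TOMAHAWK_ULTRA_NS = 250     # best-case figure quoted above


def path_switch_latency_ns(per_hop_ns, hops):
    """Cumulative switch latency for a path that crosses `hops` switches."""
    return per_hop_ns * hops


for hops in (1, 3, 5):  # hypothetical path lengths through a multi-tier network
    ib = path_switch_latency_ns(INFINIBAND_SWITCH_NS, hops)
    eth = path_switch_latency_ns(TOMAHAWK_ULTRA_NS, hops)
    print(f"{hops} hop(s): InfiniBand {ib} ns, Tomahawk Ultra {eth} ns, gap {eth - ib} ns")
```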

2.5 ARM CPU Integration (Grace Hopper)

Accelerators require a CPU to coordinate compute workloads.
ARM CPUs show strong power efficiency, proven by Apple’s M-series systems.
Apple’s M-series chips, beginning with the M1, brought long battery life and strong performance; Apple reported Mac revenue up 13% YoY in Q4 FY2025.
Source: Apple Inc. (AAPL) Q4 FY2025 earnings call transcript

Nvidia adopted ARM for its Grace CPU because of its better performance per watt.
Grace Hopper connects directly to NVLink and NVSwitch for unified compute and memory.
Nvidia also supports Intel and AMD x86 CPUs where they are beneficial, so customers can choose the host CPU that fits their workload.

More info:

NVIDIA GH200 Grace Hopper Superchip

Competition

ARM is efficient, but x86 remains dominant.
AMD continues reporting strong HPC CPU revenue growth.

Nvidia’s CPU strategy strengthens its platform but remains flexible for customers.

3. Competition Table

Category	Nvidia	AMD	Google	Startups	BullxBear View
Processor	Blackwell GPU	MI Instinct GPU	TPU	Various	Competitive at the device level
Scale-Up	NVLink + NVSwitch	xGMI	TPU Interconnect	Early UAL	Nvidia leads
Scale-Out	InfiniBand / Spectrum-X	Ethernet only	Proprietary TPU network	Ethernet	Nvidia leads
Cross-Datacenter	InfiniBand / Spectrum-X	Ethernet	TPU Interconnect	Ethernet	Nvidia leads
CPU Strategy	ARM + x86	x86	Axion + x86	x86/ARM	Competitive
Ecosystem	Full-stack integration	Partial stack	Internal stack	Fragmented	Nvidia strongest

Google TPU public data is limited, so we use MLPerf benchmarks for comparison.
Source: Benchmark MLPerf Training | MLCommons Version 2.0 Results

4. Benchmarks

4.1 Single-GPU Inference

Single-GPU-Inference benchmarking results

Nvidia B200 (TensorRT) leads current inference rankings.
AMD MI355X shows notable improvement and narrows the gap.

Source: https://inferencemax.semianalysis.com/

4.2 Multi-GPU System Performance

MW Compute cluster benchmarking results

Nvidia B200 systems deliver significantly higher performance across multi-GPU workloads.
When benchmarked at 1 MW of power, GB200 (TensorRT-optimized) generates 8 million tokens/second, compared to ~6 million tokens/second for AMD’s MI355X.
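
The 1 MW comparison can be restated as energy efficiency. The sketch below converts the two throughput figures above into tokens per joule and kWh per million tokens. It assumes the full 1 MW is drawn continuously and uses the approximate 6 million figure as if it were exact, so the results are indicative only.

```python
POWER_WATTS = 1_000_000          # 1 MW benchmark envelope
GB200_TOKENS_PER_S = 8_000_000   # figure quoted above
MI355X_TOKENS_PER_S = 6_000_000  # approximate figure quoted above

JOULES_PER_KWH = 3_600_000


def tokens_per_joule(tokens_per_s, watts):
    """Throughput divided by power: tokens produced per joule of energy."""
    return tokens_per_s / watts


def kwh_per_million_tokens(tokens_per_s, watts):
    """Energy needed to generate one million tokens, in kWh."""
    joules_per_token = watts / tokens_per_s
    return joules_per_token * 1_000_000 / JOULES_PER_KWH


for name, tps in (("GB200", GB200_TOKENS_PER_S), ("MI355X", MI355X_TOKENS_PER_S)):
    print(
        f"{name}: {tokens_per_joule(tps, POWER_WATTS):.1f} tokens/J, "
        f"{kwh_per_million_tokens(tps, POWER_WATTS):.4f} kWh per million tokens"
    )

advantage = GB200_TOKENS_PER_S / MI355X_TOKENS_PER_S
print(f"GB200 throughput advantage at equal power: {advantage:.2f}x")
```

For a datacenter limited by power rather than chip supply, this ratio translates directly into revenue per megawatt.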

Source: https://inferencemax.semianalysis.com/

4.3 Cost-to-Performance

Figure: AMD uses larger HBM capacity than Nvidia

Source: MI300X vs H100 vs H200 Benchmark Part 1: Training – CUDA Moat Still Alive

AMD’s MI300X ships with larger HBM configurations than Nvidia’s parts, but Nvidia’s H100/H200 deliver superior throughput and latency due to their integrated ecosystem.

GPU die area is always a tradeoff between memory and compute. Nvidia’s fast interconnects—NVLink, NVSwitch, and InfiniBand—deliver data so quickly that less on-die memory is needed, freeing more silicon for SMs and Tensor Cores.

5. Conclusion

The cloud-provider sales model shows customers value scalability, power efficiency,
and seamless software orchestration across large GPU fleets.
Nvidia’s architecture—GPUs, NVLink, NVSwitch, BlueField DPUs, InfiniBand, Spectrum-X,
and Grace Hopper—forms a unified platform for low-latency, high-throughput AI workloads.
Competitors such as AMD, Google TPUs, and startups are improving, but none offer Nvidia’s
end-to-end integration across compute, interconnects, networking, and CPU coordination.
The benchmarks cited above show Nvidia leading in single-GPU inference, multi-GPU scaling, throughput,
latency, and cost-to-performance, which this analysis attributes to its deeply integrated ecosystem.
Nvidia further widens its moat through software platforms like Nvidia Isaac, CUDA, and
TensorRT, enabling advanced robotics, simulation, and physical AI workflows.
Overall, Nvidia maintains a durable and expanding advantage, with Google TPUs representing
the closest vertically integrated alternative.