TrustMeBro
news that hits different 💅


โœ๏ธ
no cap correspondent ๐Ÿงข
Tuesday, February 17, 2026 📖 2 min read
New SemiAnalysis InferenceX Data Shows NVIDIA Blackwell Ultra Delivers up to 50x Better Performance and 35x Lower Costs for Agentic AI
Image: NVIDIA Blog

What's Happening

Breaking it down: The NVIDIA Blackwell platform has been widely adopted by leading inference providers such as Baseten, DeepInfra, Fireworks AI and Together AI to reduce cost per token by up to 10x.

Now, the NVIDIA Blackwell Ultra platform is taking this momentum further for agentic AI. AI agents and coding assistants are driving explosive growth in software-programming-related workloads. Cloud providers including Microsoft, CoreWeave and Oracle Cloud Infrastructure are deploying NVIDIA GB300 NVL72 systems at scale for low-latency and long-context use cases such as agentic coding and coding assistants. (it feels like chaos)

By Ashraf Eassa

The Details

These applications require low latency to maintain real-time responsiveness across multistep workflows and long context when reasoning across entire codebases. New SemiAnalysis InferenceX performance data shows that the combination of NVIDIA's software optimizations and the next-generation NVIDIA Blackwell Ultra platform has delivered breakthrough advances on both fronts.

NVIDIA GB300 NVL72 systems now deliver up to 50x higher throughput per megawatt, resulting in 35x lower cost per token compared with the NVIDIA Hopper platform. Across chips, system architecture and software, NVIDIA's extreme codesign accelerates performance across AI workloads, from agentic coding to interactive coding assistants, while driving down costs at scale.
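For the math-curious: here's a quick back-of-envelope sketch of how a throughput-per-megawatt gain turns into a cost-per-token drop, assuming power is the dominant cost. Every specific number below (tokens per second, power price) is a hypothetical placeholder, not a figure from the article or from SemiAnalysis.

```python
# Hypothetical sketch: throughput per megawatt -> cost per token.
# All numbers are illustrative placeholders, not measured figures.

def cost_per_million_tokens(tokens_per_sec_per_mw: float,
                            dollars_per_mw_hour: float) -> float:
    """Power cost to generate 1M tokens, if power dominates total cost."""
    tokens_per_hour = tokens_per_sec_per_mw * 3600
    return dollars_per_mw_hour / tokens_per_hour * 1_000_000

# Suppose (hypothetically) a Hopper rack sustains 1,000 tokens/s per MW
# and a Blackwell Ultra rack sustains 50x that, at the same power price.
hopper = cost_per_million_tokens(1_000, dollars_per_mw_hour=100.0)
blackwell_ultra = cost_per_million_tokens(50_000, dollars_per_mw_hour=100.0)

# A 50x throughput-per-megawatt gain alone means ~50x lower power
# cost per token.
print(round(hopper / blackwell_ultra))  # 50
```

The article's 35x cost figure is smaller than the raw 50x throughput gain, which is consistent with total cost per token folding in more than just power.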

Why This Matters

GB300 NVL72 Delivers up to 50x Better Performance for Low-Latency Workloads

Recent analysis from Signal65 shows that NVIDIA GB200 NVL72 with extreme hardware and software codesign delivers more than 10x more tokens per watt, resulting in one-tenth the cost per token compared with the NVIDIA Hopper platform. These massive performance gains continue to expand as the underlying stack improves.

This adds to the ongoing AI race that's captivating the tech world.

The Bottom Line

This story is still developing, and we'll keep you updated as more info drops.

Sound off in the comments.


Originally reported by NVIDIA Blog

