#inference

8 posts

9 Jun

Avinash Ahuja 9 Jun 2026 1 min read

NVIDIA Confidential Computing to Help Expand Apple’s Private Cloud Compute

NVIDIA GPUs with Confidential Computing are now used for confidential inference in Apple’s Private Cloud Compute (PCC), as it expands beyond Apple’s data centers to Google Cloud. Unveiled during Apple’s annual WWDC gathering for developers from around the globe, NVIDIA GPUs will support server-side inference for Apple Foundation Models, custom-built by Apple and Google, leveraging […]

ai infrastructure artificial intelligence cybersecurity hardware inference

15 Apr

Shruti Koparkar 15 Apr 2026 5 min read

Rethinking AI TCO: Why Cost per Token Is the Only Metric That Matters

Nvidia

Traditional data centers only stored, retrieved and processed data. In the generative and agentic AI era, these facilities have evolved into AI token factories. With AI inference becoming their primary workload, their primary output is intelligence manufactured in the form of tokens. This transformation demands a corresponding shift in how the economics of AI infrastructure, […]

ai infrastructure inference nvidia blackwell think smart

17 Mar

Kanika Atri 17 Mar 2026 5 min read

NVIDIA, Telecom Leaders Build AI Grids to Optimize Inference on Distributed Networks

Nvidia

As AI‑native applications scale to more users, agents and devices, the telecommunications network is becoming the next frontier for distributing AI. At NVIDIA GTC 2026, leading operators in the U.S. and Asia showed that this shift is underway, announcing AI grids — geographically distributed and interconnected AI infrastructure — using their network footprint to power […]

ai infrastructure artificial intelligence gtc 2026 inference nvidia rtx

12 Feb

Shruti Koparkar 12 Feb 2026 6 min read

Leading Inference Providers Achieve Lowest Token Cost With Open Source Models on NVIDIA Blackwell

Nvidia

A diagnostic insight in healthcare. A character’s dialogue in an interactive game. An autonomous resolution from a customer service agent. Each of these AI-powered interactions is built on the same unit of intelligence: a token. Scaling these AI interactions requires businesses to consider whether they can afford more tokens. The answer lies in better tokenomics […]

ai ai infrastructure agentic ai dynamo inference

3 Dec 2025

Shruti Koparkar 3 Dec 2025 8 min read

Mixture of Experts Powers the Most Intelligent Frontier AI Models, Runs 10x Faster to Deliver 1/10 the Token Cost on NVIDIA Blackwell NVL72

Nvidia

The top 10 most intelligent open-source models all use a mixture-of-experts architecture. Kimi K2 Thinking, DeepSeek-R1, Mistral Large 3 and others run 10x faster to enable one-tenth the cost per token on NVIDIA GB200 NVL72. A look under the hood of virtually any frontier model today will reveal a mixture-of-experts (MoE) model architecture that mimics […]

ai infrastructure artificial intelligence dynamo inference nvidia blackwell

2 Dec 2025

Kari Briski 2 Dec 2025 2 min read

NVIDIA Partners With Mistral AI to Accelerate New Family of Open Models

Nvidia

Today, Mistral AI announced the Mistral 3 family of open-source multilingual, multimodal models, optimized across NVIDIA supercomputing and edge platforms. Mistral Large 3 is a mixture-of-experts (MoE) model — instead of firing up every neuron for every token, it only activates the parts of the model with the most impact. The result is efficiency […]

ai infrastructure inference nvidia blackwell nvlink tensorrt

13 Nov 2025

Shruti Koparkar 13 Nov 2025 4 min read

AWS, Google, Microsoft and OCI Boost AI Inference Performance for Cloud Customers With NVIDIA Dynamo

Nvidia

Editor’s note: This post is part of Think SMART, a series focused on how leading AI service providers, developers and enterprises can boost their inference performance and return on investment with the latest advancements from NVIDIA’s full-stack inference platform. NVIDIA Blackwell delivers the highest performance and efficiency, and lowest total cost of ownership across every […]

ai infrastructure dynamo inference think smart

31 May 2024

Tarek Ziadé 31 May 2024 9 min read

Experimenting with local alt text generation in Firefox Nightly

Mozilla Hacks

Firefox 130 will introduce an experimental new capability to automatically generate alt-text for images using a fully private on-device AI model. The feature will be available as part of Firefox’s built-in PDF editor, and our end goal is to make it available in general browsing for users with screen readers.

artificial intelligence feature featured article firefox inference