News | Joshua Berkowitz

1 Article

TensorRT-LLM

NVIDIA Blackwell and Llama 4 Maverick: Ushering in a New Era of AI Inference Speed

An NVIDIA AI system achieved a record-breaking 1,000+ tokens per second per user from a 400-billion-parameter language model, all on a single machine. NVIDIA’s Blackwell architecture, paired with...

AI inference, Blackwell, GPU acceleration, Llama 4, NVIDIA, speculative decoding, TensorRT-LLM

May 23, 2025

