Unsloth Dynamic GGUFs: How Extreme Model Compression Outperforms AI Giants
Compressing a large language model by 75% and still outperforming the latest releases from OpenAI and Anthropic is the promise of Unsloth Dynamic GGUFs. Their integration with the Aider Polyglot bench...
Tags: Aider Polyglot, benchmarking, DeepSeek, LLMs, model compression, open-source AI, quantization, Unsloth
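Dynamic GGUFs are standard GGUF files at heart, so they load with any llama.cpp-compatible runtime. Here is a minimal sketch using llama-cpp-python; the repo id and quant filename are illustrative placeholders, not confirmed artifact names:

```python
# Sketch: running an Unsloth Dynamic GGUF locally via llama-cpp-python.
# The repo id and filename below are placeholders; check Unsloth's
# Hugging Face pages for the actual dynamically quantized artifacts.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="unsloth/DeepSeek-R1-GGUF",        # illustrative repo id
    filename="DeepSeek-R1-UD-IQ1_S.gguf",      # illustrative dynamic-quant file
)

# n_gpu_layers=-1 offloads every layer to the GPU when one is available.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)

out = llm("Write a Python function that reverses a string.", max_tokens=128)
print(out["choices"][0]["text"])
```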
TorchAO: A PyTorch-Native Shortcut To Smaller, Faster Models
TorchAO is PyTorch's native toolkit for model efficiency: it unifies post-training quantization (PTQ), quantization-aware training (QAT), float8 (FP8) training, and structured sparsity in one coherent...
Tags: deep learning, FP8, model efficiency, open source, PyTorch, QAT, quantization, sparsity, TorchAO
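For a sense of the API surface, here is a minimal PTQ sketch built around TorchAO's quantize_ entry point with int8 weight-only quantization; exact helper names vary across torchao releases, so treat this as indicative rather than definitive:

```python
# Sketch: post-training int8 weight-only quantization with TorchAO.
import torch
import torch.nn as nn
from torchao.quantization import quantize_, int8_weight_only

# Toy network standing in for a real model.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))

# Rewrites the Linear layers in place with int8 weight-only variants.
quantize_(model, int8_weight_only())

x = torch.randn(1, 1024)
print(model(x).shape)  # the quantized model runs as a drop-in replacement
```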
BitNet: 1-bit LLMs Land With Practical Inference on CPUs and GPUs
BitNet from Microsoft Research is the official C++ inference stack for native 1-bit large language models, centered on BitNet b1.58. The repo ships fast, lossless ternary kernels for CPUs, a CUDA W2A8...
Tags: 1-bit LLM, BitNet, CPU, GGUF, GPU, inference, llama.cpp, quantization, T-MAC
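The "b1.58" in the name refers to ternary weights: each weight is one of {-1, 0, +1}, carrying about 1.58 bits of information. Below is a toy sketch of the absmean quantizer described in the b1.58 paper, offered as an illustration of the idea rather than the repo's optimized kernels:

```python
# Sketch: absmean ternary quantization behind BitNet b1.58.
# Toy illustration of the scheme from the paper, not bitnet.cpp's kernels.
import torch

def absmean_ternary(w: torch.Tensor, eps: float = 1e-5):
    """Map weights to {-1, 0, +1} with a single per-tensor absmean scale."""
    scale = w.abs().mean().clamp(min=eps)
    q = (w / scale).round().clamp_(-1, 1)
    return q, scale

w = torch.randn(4, 8)
q, scale = absmean_ternary(w)
print(sorted(q.unique().tolist()))     # typically [-1.0, 0.0, 1.0]
print((q * scale - w).abs().mean())    # mean quantization error
```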
AMD Ryzen AI Max+ Upgrade: Powering 128B-Parameter LLMs Locally on Windows PCs
With AMD's latest update, deploying massive language models, up to 128 billion parameters, directly on your Windows laptop is now possible. AMD's Ryzen AI Max+ is a breakthrough that brings state-of-...
Tags: AMD, context window, large language models, LLM deployment, local AI, quantization, Ryzen AI, Windows AI
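Why quantization shows up in the tags is clear from back-of-the-envelope arithmetic; the numbers below are our own illustration of why a 128B model only fits in laptop-class memory once it is compressed:

```python
# Back-of-the-envelope memory check for a 128B-parameter model
# (our own arithmetic, ignoring KV cache and activations).
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"128B @ {bits:>2}-bit: ~{weight_footprint_gb(128, bits):.0f} GB")
# 128B @ 16-bit: ~256 GB -> no laptop holds this
# 128B @  8-bit: ~128 GB -> borderline even with 128 GB of unified memory
# 128B @  4-bit:  ~64 GB -> fits, which is why quantization is central here
```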