Abstract Algorithms: An AI Powered Learning Platform

Topic

model optimization

2 articles

Types of LLM Quantization: By Timing, Scope, and Mapping

TLDR: There is no single "best" LLM quantization. You classify and choose quantization along three axes: when you quantize (timing), what you quantize (scope), and how values are encoded (mapping). In practice, most teams start with weight quantizati...

Mar 14, 2026 • 16 min read

LLM Model Quantization: Why, When, and How to Deploy Smaller, Faster Models

TLDR: Quantization converts high-precision model weights and activations (FP16/FP32) into lower-precision formats (INT8 or INT4) so LLMs run with less memory, lower latency, and lower cost. The key is choosing the right quantization method for your a...

Mar 8, 2026 • 13 min read