Optimization

Learn Optimization as a connected topic across chapters, concepts, simulations, and interview reasoning.

10 Concepts | 7 Articles | 2h 20m

How this topic helps

AI
Deep Learning
LLM
Inference

Articles

7 matched articles

Article 1: LoRA Explained: How to Fine-Tune LLMs on a Budget (13 min)
TLDR: Fine-tuning a 7B-parameter LLM updates billions of weights and requires expensive GPUs. LoRA (Low-Rank Adaptation) freezes the original weights and trains only tiny adapter matrices that are add...

Article 2: Spark Adaptive Query Execution: Dynamic Coalescing, Pruning, and Skew Handling (39 min)
TLDR: Before AQE, Spark compiled your entire query into a static physical plan using size estimates that were frequently wrong, and a wrong estimate at planning time meant a skewed join, 800 small ta...

Article 3: LLM Model Selection Guide: GPT-4o vs Claude vs Llama vs Mistral — When to Use Which (23 min)
TLDR: 🧠 Choosing the right LLM can save you 80% on costs while maintaining quality. This guide provides a decision framework, cost comparison, and practical examples to help engineering teams select ...

Article 4: Context Window Management: Strategies for Long Documents and Extended Conversations (20 min)
TLDR: 🧠 Context windows are LLM memory limits. When conversations grow past 4K-128K tokens, you need strategies: sliding windows (cheap, lossy), summarization (balanced), RAG (selective), map-reduce ...

Article 5: Types of LLM Quantization: By Timing, Scope, and Mapping (17 min)
TLDR: There is no single "best" LLM quantization. You classify and choose quantization along three axes: when you quantize (timing), what you quantize (scope), and how values are encoded (mapping). In...

Article 6: LLM Model Quantization: Why, When, and How to Deploy Smaller, Faster Models (13 min)
TLDR: Quantization converts high-precision model weights and activations (FP16/FP32) into lower-precision formats (INT8 or INT4) so LLMs run with less memory, lower latency, and lower cost. The key is...

Page 1 of 2