Abstract Algorithms
LLM

Learn LLMs as a connected topic across articles, concepts, simulations, and interview reasoning.

Begin with

Fine gives you the cleanest entry point before branching into constraints, failures, and related systems.

12 articles · 10 concepts

Grounding: Build the mental model.

Shape: See how the pieces depend on each other.

Consequence: Compare what improves and what breaks.

Stress: Change constraints and watch behavior.

Next: Move to the next useful edge.

Related systems

Follow the nearby ideas

Use the map as a quiet orientation layer, then move back into the articles for depth.

System behavior

LLM Inference Pipeline

A request passes through four stages: prompt, retrieval, generation, and guardrails.
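
The four stages named above can be sketched as a chain of small functions. This is a minimal illustration under stated assumptions, not the simulation's actual implementation: the `Request` fields, the `DOCS` corpus, the `BLOCKED_TERMS` list, and every function name are hypothetical, and the "generation" step is a stub that echoes retrieved passages instead of calling a model.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """State carried through the pipeline (hypothetical shape)."""
    user_input: str
    prompt: str = ""
    passages: list = field(default_factory=list)
    output: str = ""


# Hypothetical in-memory corpus standing in for a real retriever.
DOCS = {
    "lora": "LoRA freezes the base model and trains small low-rank matrices.",
    "moe": "MoE routes each token to a few expert feed-forward networks.",
    "creds": "The admin password is stored in the vault.",
}

# Hypothetical guardrail policy: outputs containing these terms are blocked.
BLOCKED_TERMS = ("password",)


def build_prompt(req: Request) -> Request:
    # Prompt stage: normalize whitespace in the raw user input.
    req.prompt = " ".join(req.user_input.split())
    return req


def retrieve(req: Request) -> Request:
    # Retrieval stage: naive keyword match against the corpus keys.
    query = req.prompt.lower()
    req.passages = [text for key, text in DOCS.items() if key in query]
    return req


def generate(req: Request) -> Request:
    # Generation stage (stub): echo the retrieved passages, no model call.
    if req.passages:
        req.output = " ".join(req.passages)
    else:
        req.output = "No supporting passage found."
    return req


def apply_guardrails(req: Request) -> Request:
    # Guardrail stage: replace any output that contains a blocked term.
    if any(term in req.output.lower() for term in BLOCKED_TERMS):
        req.output = "[blocked by guardrail]"
    return req


PIPELINE = [build_prompt, retrieve, generate, apply_guardrails]


def run_pipeline(user_input: str) -> Request:
    req = Request(user_input=user_input)
    for stage in PIPELINE:
        req = stage(req)
    return req
```

Keeping each stage as a function that takes and returns the same `Request` object makes it easy to reorder, remove, or trace individual stages, which is one plausible reason to model the flow as a pipeline at all.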

Step 1 of 2: Normal flow

[Diagram: Client (actor), API Gateway (boundary), Core Service (coordinator), State Store (durability), Event Stream (stream), and Consumer (worker), connected by request, command, persist, publish, and consume edges.]

Read in sequence

1. Fine-Tuning LLMs with LoRA and QLoRA: A Practical Deep-Dive (31 min)
   TLDR: LoRA freezes the base model and trains two tiny matrices per layer: 0.1% of parameters, 70% less GPU memory, near-identical quality. QLoRA adds 4-bit NF4 quantization of the frozen base, enab…

2. Build vs Buy: Deploying Your Own LLM vs Using ChatGPT, Gemini, and Claude APIs (31 min)
   TLDR: Use the API until you hit $10K/month or a hard data privacy requirement. Then add a semantic cache. Then evaluate hybrid routing. Self-hosting full model serving is only cost-effective at > 50M …

3. Fine-Tuning LLMs: The Complete Engineer's Guide to SFT, LoRA, and RLHF (30 min)
   TLDR: A pretrained LLM is a generalist. Fine-tuning makes it a specialist. Supervised Fine-Tuning (SFT) teaches it your domain's language through labeled examples. LoRA does the same with 99% fewer tr…

4. Chain of Thought Prompting: Teaching LLMs to Think Step by Step (27 min)
   TLDR: Chain of Thought (CoT) prompting tells a language model to reason out loud before answering. By generating intermediate steps, the model steers itself toward correct conclusions, turning guessw…

5. LLM Hallucinations: Causes, Detection, and Mitigation Strategies (30 min)
   TLDR: LLMs hallucinate because they are trained to predict the next plausible token, not the next true token. Understanding the three hallucination types (factual, faithfulness, open-domain) plus the…

6. Sparse Mixture of Experts: How MoE LLMs Do More With Less Compute (27 min)
   TLDR: Mixture of Experts (MoE) replaces the single dense Feed-Forward Network (FFN) layer in each Transformer block with N independent expert FFNs plus a learned router. Only the top-K experts activat…

7. Dense LLM Architecture: How Every Parameter Works on Every Token (24 min)
   TLDR: In a dense LLM every single parameter is active for every token in every forward pass: no routing, no selection. A transformer block runs multi-head self-attention (Q, K, V) followed by a feed-…

8. LLM Software Development Pitfalls: What to Avoid and When to Simplify (20 min)
   TLDR: Most bad LLM products do not fail because the model is weak. They fail because teams wrap a maybe-useful model in too much architecture: prompt spaghetti, no eval harness, weak tool schemas, hug…

9. LLM Observability: Tracing, Logging, and Debugging Production AI Systems (19 min)
   TLDR: 🔍 LLM observability is radically different from traditional APM: non-deterministic outputs, variable token costs, and multi-step reasoning chains require specialized tracing. LangSmith provides …

10. LLM Evaluation Frameworks: How to Measure Model Quality (RAGAS, DeepEval, TruLens) (16 min)
    TLDR: 📏 Traditional ML metrics (accuracy, F1) fail for LLMs because there's no single "correct" answer. RAGAS measures RAG pipeline quality with faithfulness, answer relevance, and context precision.

11. LangChain 101: Chains, Prompts, and LLM Integration (19 min)
    TLDR: LangChain's LCEL pipe operator (|) wires prompts, models, and output parsers into composable chains; swap OpenAI for Anthropic or Ollama by changing one line without touching the rest of your c…

12. Types of LLM Quantization: By Timing, Scope, and Mapping (17 min)
    TLDR: There is no single "best" LLM quantization. You classify and choose quantization along three axes: when you quantize (timing), what you quantize (scope), and how values are encoded (mapping). In…
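
The LoRA summaries in the list above describe training two small matrices in place of a full weight update. As a rough worked example of why that shrinks the trainable parameter count, the sketch below compares a full `d_in × d_out` weight matrix with a rank-`r` adapter made of a `d_in × r` matrix and an `r × d_out` matrix. The function name and the example dimensions are assumptions chosen for illustration; the fraction for a whole model depends on the rank and on which layers are adapted.

```python
def lora_trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Fraction of one weight matrix's parameters that a rank-`rank`
    adapter trains, assuming two factors of shape (d_in, rank) and
    (rank, d_out) while the original matrix stays frozen."""
    full_params = d_in * d_out
    adapter_params = rank * (d_in + d_out)
    return adapter_params / full_params


if __name__ == "__main__":
    # Hypothetical layer size and rank, picked only for illustration.
    fraction = lora_trainable_fraction(d_in=4096, d_out=4096, rank=8)
    print(f"Trainable share of this matrix: {fraction:.2%}")
```

For a 4096 × 4096 matrix at rank 8, the adapter holds 65,536 parameters against 16,777,216 in the frozen matrix, or roughly 0.39% of that one matrix.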

Abstract Algorithms · © 2026 · Engineering learning lab