Abstract Algorithms

Fine-Tuning

7 articles across 4 sub-topics

AI (3)

SFT for LLMs: A Practical Guide to Supervised Fine-Tuning

TLDR: Supervised fine-tuning (SFT) is the stage where a pretrained model learns task-specific response behavior from curated input-output examples. It is usually the first alignment step after pretraining and often the foundation for later RLHF. Good...

Mar 9, 2026 • 12 min read

PEFT, LoRA, and QLoRA: A Practical Guide to Efficient LLM Fine-Tuning

TLDR: Full fine-tuning updates every model weight, which is expensive in memory, compute, and storage. PEFT methods update only a small trainable slice. LoRA learns low-rank adapters on top of frozen base weights. QLoRA pushes efficiency further by q...

Mar 9, 2026 • 13 min read

LoRA Explained: How to Fine-Tune LLMs on a Budget

TLDR: Full fine-tuning of a 7B-parameter LLM updates billions of weights and requires expensive GPUs. LoRA (Low-Rank Adaptation) freezes the original weights and trains only tiny adapter matrices that are added on top. 90%+ memory reduction; zero inference l...

Mar 9, 2026 • 13 min read

Deep Learning (2)

Fine-Tuning LLMs with LoRA and QLoRA: A Practical Deep-Dive

TLDR: LoRA freezes the base model and trains two tiny matrices per layer: 0.1% of parameters, 70% less GPU memory, near-identical quality. QLoRA adds 4-bit NF4 quantization of the frozen base, enabling 70B fine-tuning on 2× A100 80 GB instead of 8...

Apr 19, 2026 • 31 min read

Transfer Learning Explained: Standing on the Shoulders of Pretrained Models

TLDR: You don't need millions of labeled images or months of GPU time to build a great model. Transfer learning lets you borrow a pretrained network's hard-won feature detectors, plug in a new output head, and fine-tune on your small dataset — often ...

Apr 18, 2026 • 27 min read

AI Agents (1)

RAG vs Fine-Tuning: When to Use Each (and When to Combine Them)

TLDR: RAG gives LLMs access to current knowledge at inference time; fine-tuning changes how they reason and write. Use RAG when your data changes. Use fine-tuning when you need consistent style, tone, or domain reasoning. Use both for production assi...

Apr 19, 2026 • 30 min read

Hugging Face (1)

Fine-Tuning LLMs: The Complete Engineer's Guide to SFT, LoRA, and RLHF

TLDR: A pretrained LLM is a generalist. Fine-tuning makes it a specialist. Supervised Fine-Tuning (SFT) teaches it your domain's language through labeled examples. LoRA does the same with 99% fewer trainable parameters. RLHF shapes its behavior using...

Apr 18, 2026 • 30 min read