Abstract Algorithms

RLHF

3 articles across 2 sub-topics

AI (2)

RLHF in Practice: From Human Preferences to Better LLM Policies

TLDR: Reinforcement Learning from Human Feedback (RLHF) helps align language models with human preferences after pretraining and SFT. The typical pipeline is: collect preference comparisons, train a reward model, then optimize a policy (often with KL...

Mar 9, 2026 • 11 min read

RLHF Explained: How We Teach AI to Be Nice

TLDR: A raw LLM is a super-smart parrot that read the entire internet — including its worst parts. RLHF (Reinforcement Learning from Human Feedback) is the training pipeline that transforms it from a pattern-matching engine into an assistant that is ...

Mar 9, 2026 • 13 min read

Fine-Tuning (1)

Fine-Tuning LLMs: The Complete Engineer's Guide to SFT, LoRA, and RLHF

TLDR: A pretrained LLM is a generalist. Fine-tuning makes it a specialist. Supervised Fine-Tuning (SFT) teaches it your domain's language through labeled examples. LoRA does the same with 99% fewer trainable parameters. RLHF shapes its behavior using...

Apr 18, 2026 • 30 min read