TLDR: The Transformer is the architecture behind every major LLM (GPT, BERT, Claude, Gemini). Its core innovation is self-attention: a mechanism that lets the model weigh relationships between all tokens in a sequence simultaneously, regardless of their distance from one another.
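The idea can be sketched in a few lines of NumPy: project each token into queries, keys, and values, score every token against every other token, and take a weighted average of the values. This is a minimal single-head sketch of scaled dot-product self-attention; the dimensions and random weights are illustrative, not taken from any of the models named above.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Project token embeddings into queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # Every token scores against every other token in one matrix
    # product, so distance in the sequence plays no special role.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V

# Toy example: 4 tokens, embedding dim 8, head dim 4 (hypothetical sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # one context-mixed vector per token
```

Production models stack many such heads in parallel and interleave them with feed-forward layers, but the all-pairs weighting above is the core of the mechanism.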