Topic: transformers (6 articles across 4 sub-topics)

Sub-topic (2 articles)
Dot Product in Machine Learning: The Engine Behind Similarity, Attention, and Neural Networks
TLDR: The dot product multiplies corresponding elements of two vectors and sums the results. In machine learning it does three critical jobs: it scores semantic similarity between embeddings, computes every activation in a fully connected layer, and produces the query-key relevance scores at the heart of attention.
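The operation is small enough to check by hand. A minimal NumPy sketch of the first two jobs (the values and shapes here are illustrative):

```python
import numpy as np

# Dot product: multiply corresponding elements, sum the results.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
assert a @ b == 1*4 + 2*5 + 3*6  # 32.0

# Job 1: similarity between embeddings (cosine = length-normalized dot product).
def cosine_similarity(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Job 2: each activation in a fully connected layer is one weight row
# dotted with the input, plus a bias.
W = np.random.randn(4, 3)        # 4 output units, 3 input features
x = np.random.randn(3)
b_layer = np.zeros(4)
activations = W @ x + b_layer    # entry i equals W[i] @ x + b_layer[i]
```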

Attention Mechanism Explained: How Transformers Learn to Focus
TLDR: Attention lets every token in a sequence ask "what else is relevant to me?" — dynamically weighting relationships across all positions simultaneously. It replaced the fixed-size hidden-state bottleneck of RNNs and is the engine behind every GPT-style large language model.
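As a rough illustration of the mechanism (a single head, no masking, not any specific model's code), scaled dot-product attention fits in a few lines of NumPy:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Every token scores every position, softmaxes, and mixes the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # all-pairs relevance scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # softmax over positions
    return w @ V                                     # relevance-weighted sum

rng = np.random.default_rng(0)
seq_len, d_k = 5, 8
Q, K, V = (rng.normal(size=(seq_len, d_k)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)          # (5, 8): one vector per token
```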
Sub-topic (2 articles)

Sparse Mixture of Experts: How MoE LLMs Do More With Less Compute
TLDR: Mixture of Experts (MoE) replaces the single dense Feed-Forward Network (FFN) layer in each Transformer block with N independent expert FFNs plus a learned router. Only the top-K experts activate per token — so total parameters far exceed the parameters active per token.
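A toy sketch of the routing step for one token (the expert count, shapes, and names are illustrative, not taken from any particular MoE model):

```python
import numpy as np

def moe_layer(x, experts, router_W, top_k=2):
    """Route token x to the top_k experts; blend their outputs by gate weight."""
    logits = x @ router_W                      # one router score per expert
    top = np.argsort(logits)[-top_k:]          # indices of the top-K experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                       # softmax over selected experts only
    return sum(g * experts[i](x) for g, i in zip(gates, top))

d, n_experts = 16, 8
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.normal(size=(d, d)) * 0.1: np.maximum(x @ W, 0.0)
           for _ in range(n_experts)]          # N independent expert FFNs (toy)
router_W = rng.normal(size=(d, n_experts))     # the learned router
y = moe_layer(rng.normal(size=d), experts, router_W)  # only 2 of 8 experts ran
```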

Dense LLM Architecture: How Every Parameter Works on Every Token
TLDR: In a dense LLM every single parameter is active for every token in every forward pass — no routing, no selection. A transformer block runs multi-head self-attention (Q, K, V) followed by a feed-forward network (FFN) with roughly 4× the hidden dimension.
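A minimal sketch of the FFN half of that block, showing the roughly 4× expansion and that every weight participates for every token (sizes are illustrative):

```python
import numpy as np

def dense_ffn(x, W1, b1, W2, b2):
    """Expand to 4x the hidden size, apply ReLU, project back down."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

d_model = 512
d_ff = 4 * d_model                         # the roughly 4x expansion
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(d_model, d_ff)) * 0.02, np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)) * 0.02, np.zeros(d_model)

tokens = rng.normal(size=(10, d_model))    # a 10-token sequence
out = dense_ffn(tokens, W1, b1, W2, b2)    # (10, 512); all weights touch all tokens
n_params = W1.size + b1.size + W2.size + b2.size  # ~2.1M parameters in this toy FFN
```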
Sub-topic (1 article)
Softmax Function Explained: From Raw Scores to Probabilities
TLDR: Softmax converts a vector of raw scores (logits) into a valid probability distribution by exponentiating each value and dividing by the total. Subtracting the max before exponentiating prevents floating-point overflow. Temperature scaling controls how sharp or flat the resulting distribution is.
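The whole recipe, including the max-subtraction trick and temperature scaling, in a few lines of NumPy:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Exponentiate and normalize; subtracting the max prevents overflow."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                 # shifts every logit; the output is unchanged
    e = np.exp(z)
    return e / e.sum()

logits = [2.0, 1.0, 0.1]
print(softmax(logits))                    # ~[0.659, 0.242, 0.099], sums to 1.0
print(softmax(logits, temperature=0.5))   # sharper: mass piles onto the max
print(softmax(logits, temperature=5.0))   # flatter: closer to uniform
```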
Sub-topic (1 article)
Practical LLM Quantization in Colab: A Hugging Face Walkthrough
TLDR: This is a practical, notebook-style quantization guide for Google Colab and Hugging Face. You will quantize real models, run inference, compare memory/latency, and learn when to use 4-bit NF4 vs safer INT8 paths.
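A minimal Colab-style cell in the spirit of that walkthrough (the model ID below is a placeholder; any causal LM on the Hub works):

```python
# Colab GPU runtime; first: pip install -q transformers accelerate bitsandbytes
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "facebook/opt-350m"              # placeholder checkpoint

# 4-bit NF4 path: smallest memory footprint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
print(f"{model.get_memory_footprint() / 1e9:.2f} GB")  # compare against an INT8 run

inputs = tokenizer("Quantization lets large models", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))

# Safer INT8 path: replace the config with BitsAndBytesConfig(load_in_8bit=True).
```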
