Abstract Algorithms

Articles

Engineering deep dives for understanding systems.

Distributed systems, AI infrastructure, data structures, and system design explained with calm, production-minded depth.

Recent deep dives

282 articles

Data Lineage Explained: Tracking Data Flow Across Your Organization
TLDR: 📊 Data lineage is the complete genealogy of your data: where it comes from, how it's transformed, and where it ends up. It's critical for debugging pipelines, proving compliance, and understan…
12 min read

Data Governance Essentials: Framework and Best Practices
TLDR: 📋 Data governance is the framework that answers "who owns this data, who can access it, and what quality standards must it meet?" Without governance, data pipelines become chaotic. Implement it…
9 min read

OWASP Credential Stuffing Key Terms Explained with Practical Examples
TLDR: Credential-stuffing defense works only when you treat login as a layered, risk-adaptive system: detect attack shape, add step-up authentication, combine bot and fingerprint signals, prevent user…
15 min read

Softmax Function Explained: From Raw Scores to Probabilities
TLDR: Softmax converts a vector of raw scores (logits) into a valid probability distribution by exponentiating each value and dividing by the total. Subtracting the max before exponentiating prevents…
23 min read

NoSQL Partitioning: How Cassandra, DynamoDB, and MongoDB Split Data
TLDR: Every NoSQL database hides a partitioning engine behind a deceptively simple API. Cassandra uses a consistent hashing ring where a Murmur3 hash of your partition key selects a node; virtual nod…
24 min read

Java 21 to 25: Virtual Threads, Pattern Matching, and Structured Concurrency
TLDR: Java 21 LTS makes virtual threads a production-ready replacement for bounded thread pools: your newFixedThreadPool(200) can become newVirtualThreadPerTaskExecutor() and handle 10× the concurren…
22 min read

Java 14 to 17: Records, Sealed Classes, Text Blocks, and Pattern Matching
TLDR: Java 14–17 ran a deliberate four-release preview-to-stable conveyor belt. Records replaced 50-line POJOs with one line. Text blocks ended escape-sequence chaos in multi-line strings. Sealed clas…
25 min read

HyperLogLog Explained: Counting Billions of Unique Items with 12 KB
TLDR: HyperLogLog estimates the number of distinct elements in a dataset using ~12 KB of memory regardless of cardinality, with ±0.81% error. The insight: if you hash every element to a random bit st…
18 min read

Dot Product in Machine Learning: The Engine Behind Similarity, Attention, and Neural Networks
TLDR: The dot product multiplies corresponding elements of two vectors and sums the results. In machine learning it does three critical jobs: it scores semantic similarity between embeddings, computes…
22 min read

Count-Min Sketch Explained: Frequency Estimation at Streaming Scale
TLDR: Count-Min Sketch (CMS) is a fixed-size d × w counter matrix that estimates how often any element has appeared in a stream. Insert: hash the element with each of the d hash functions to get one c…
22 min read

Clock Skew and Causality Violations: Why Distributed Clocks Lie
TLDR: Physical clocks on distributed machines cannot be perfectly synchronized. NTP keeps them within tens to hundreds of milliseconds in normal conditions, but under load, across datacenters, or aft…
19 min read

Bloom Filters Explained: Membership Testing with Zero False Negatives
TLDR: A Bloom filter is a bit array of m bits + k independent hash functions that sets k bits on insert and checks those same k bits on lookup. If any checked bit is 0, the element is definitely not i…
19 min read
…

Abstract Algorithms · © 2026 · Engineering learning lab