Topic
Delta Lake
3 articles across 2 sub-topics
Sub-topic
Big Data
2 articles

Modern Table Formats: Delta Lake vs Apache Iceberg vs Apache Hudi
TLDR: Delta Lake, Apache Iceberg, and Apache Hudi are open table formats that wrap Parquet files with a transaction log (or snapshot tree) to deliver ACID guarantees, time travel, schema evolution, an
•24 min read

Medallion Architecture: Bronze, Silver, and Gold Layers in Practice
TLDR: Medallion Architecture solves the "data swamp" problem by organizing a data lake into three progressively refined zones — Bronze (raw, immutable), Silver (cleaned, conformed), Gold (aggregated,
•23 min read
Sub-topic
Apache Spark
1 article
Reading and Writing Data in Spark: Parquet, Delta, JSON, and JDBC
TLDR: Parquet's columnar layout with row-group statistics enables predicate pushdown that can reduce a 500 GB scan to 8 GB. Delta Lake wraps Parquet with a JSON transaction log to add ACID semantics a
•35 min read
