How Fluentd Works: The Unified Logging Layer
Logs are messy. Fluentd cleans them up. Learn how this open-source data collector unifies logging from multiple sources.
Abstract Algorithms
TLDR: Fluentd is an open-source data collector that decouples log sources from destinations. It ingests logs from 100+ sources (Nginx, Docker, syslog), normalizes them to JSON, applies filters and transformations, and routes them to 100+ outputs (Elasticsearch, S3, Kafka). Tag-based routing is the core concept.
A Thousand Services, One Logging Chaos
Before unified logging, a typical microservices stack looks like this:
- Nginx writes to /var/log/nginx/access.log
- Java app writes to Log4j rotation files
- Kubernetes pods write to stdout
- Database writes to /var/lib/postgresql/log/
Each destination (Splunk, Elasticsearch, S3) requires custom scripts per source. Ten services × three destinations = 30 custom scripts, each with its own error handling and retry logic.
Fluentd solves this with a unified layer: any input goes through Fluentd, gets normalized to JSON, and routes to any output using a single config.
Tags and Routing: Fluentd's Core Concept
Every event in Fluentd has a tag: a dot-separated string that determines where the event is routed.
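# Tail the Nginx access log; <parse> replaces the deprecated `format` parameter
<source>
  @type tail
  path /var/log/nginx/access.log
  tag web.nginx
  <parse>
    @type nginx
  </parse>
</source>
# Tail the application log, which is already JSON
<source>
  @type tail
  path /var/log/app/app.log
  tag app.backend
  <parse>
    @type json
  </parse>
</source>
# Everything tagged web.* or deeper goes to Elasticsearch
<match web.**>
  @type elasticsearch
  host elastic.local
  port 9200
  index_name nginx-logs
</match>
# Everything tagged app.* or deeper is archived to S3
<match app.**>
  @type s3
  s3_bucket my-log-archive
  path logs/%Y/%m/%d/
</match>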
- web.nginx events match <match web.**> and go to Elasticsearch
- app.backend events match <match app.**> and go to S3
flowchart TD
Nginx[Nginx access.log\ntag: web.nginx] --> Fluentd
App[App log\ntag: app.backend] --> Fluentd
Sys[Syslog\ntag: system.kernel] --> Fluentd
Fluentd -->|match web.**| ES[Elasticsearch]
Fluentd -->|match app.**| S3[Amazon S3]
Fluentd -->|match system.**| Kafka[Kafka topic]
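The routing rules above can be sketched in a few lines of Python. This is a simplified model, not Fluentd's actual matcher: it assumes only the `*` wildcard (exactly one tag part) and the `**` wildcard (zero or more tag parts), and that the first matching rule wins from top to bottom. The `ROUTES` table and the function names are hypothetical, chosen to mirror the diagram.

```python
from typing import List, Optional, Tuple


def _match_parts(pattern: List[str], tag: List[str]) -> bool:
    """Match a dot-split pattern against a dot-split tag."""
    if not pattern:
        return not tag  # both exhausted means a full match
    head, rest = pattern[0], pattern[1:]
    if head == "**":
        # "**" absorbs zero or more tag parts
        return any(_match_parts(rest, tag[i:]) for i in range(len(tag) + 1))
    if not tag:
        return False
    if head == "*" or head == tag[0]:
        # "*" absorbs exactly one tag part; a literal part must be equal
        return _match_parts(rest, tag[1:])
    return False


def tag_matches(pattern: str, tag: str) -> bool:
    return _match_parts(pattern.split("."), tag.split("."))


# Hypothetical routing table mirroring the <match> rules in the diagram.
ROUTES: List[Tuple[str, str]] = [
    ("web.**", "elasticsearch"),
    ("app.**", "s3"),
    ("system.**", "kafka"),
]


def route(tag: str) -> Optional[str]:
    """Return the output of the first matching pattern, top to bottom."""
    for pattern, output in ROUTES:
        if tag_matches(pattern, tag):
            return output
    return None  # no rule matched


if __name__ == "__main__":
    for t in ("web.nginx", "app.backend", "system.kernel", "db.postgres"):
        print(t, "->", route(t))
```

Because rules are checked in order, a catch-all pattern such as `**` placed first would swallow every event, so broad patterns belong at the bottom of the list.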
The Plugin Architecture: Input → Filter → Buffer → Output
Fluentd's power comes from its plugin model:
| Plugin type | Role | Examples |
| --- | --- | --- |
| Input | Collect events from sources | tail, http, forward, syslog, docker |
| Parser | Parse raw text into structured JSON | nginx, apache2, json, regexp, csv |
| Filter | Transform, enrich, or drop events | record_transformer, grep, geoip |
| Buffer | Batch and retry on output failures | file, memory |
| Output | Send events to destinations | elasticsearch, s3, kafka, stdout |
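To make the parser stage concrete, here is a minimal Python sketch of turning one raw access-log line into a structured record. The regular expression is a simplified assumption for illustration, not the expression Fluentd's `nginx` parser ships with, and `SAMPLE_LINE` and `parse_access_line` are made-up names. The field names follow the ones the built-in parser emits, such as `code` for the HTTP status.

```python
import json
import re
from typing import Any, Dict, Optional

# Simplified pattern for a combined-format access log line (an assumption).
ACCESS_RE = re.compile(
    r'(?P<remote>\S+) (?P<host>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<code>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)

# Made-up sample line used only for demonstration.
SAMPLE_LINE = (
    '192.0.2.10 - alice [10/Oct/2024:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0"'
)


def parse_access_line(line: str) -> Optional[Dict[str, Any]]:
    """Return a structured record, or None when the line does not match."""
    m = ACCESS_RE.match(line.strip())
    if m is None:
        return None
    record: Dict[str, Any] = m.groupdict()
    record["code"] = int(record["code"])
    # A "-" size means no body was sent; store it as 0
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


if __name__ == "__main__":
    print(json.dumps(parse_access_line(SAMPLE_LINE), indent=2))
```

Once a line is a record like this, filters and outputs can work on named fields instead of re-parsing text.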
Buffer plugins are critical for reliability. Without buffering, a downstream outage (e.g., Elasticsearch restart) causes log loss. With a file buffer:
- Events are written to disk first.
- Flushed to the output on schedule (or when the buffer fills).
- Retried automatically on failure with exponential backoff.
<match **>
@type elasticsearch
host elastic.local
<buffer>
@type file
path /var/log/fluentd-buffer
flush_interval 5s
retry_max_times 10
</buffer>
</match>
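The retry behavior described above can be sketched as follows. This is a rough model under stated assumptions, not Fluentd's implementation: the wait doubles after each failure starting from one second (mirroring the `retry_wait` and `retry_exponential_backoff_base` buffer parameters), there is no random jitter, and `flush_with_retry` and its `send` callback are hypothetical names.

```python
from typing import Callable, List, Tuple


def flush_with_retry(
    chunk: List[str],
    send: Callable[[List[str]], None],
    retry_wait: float = 1.0,
    base: float = 2.0,
    max_times: int = 10,
    sleep: Callable[[float], None] = lambda seconds: None,
) -> Tuple[bool, List[float]]:
    """Try to send a chunk, backing off exponentially between retries.

    Returns (delivered, waits), where waits lists the delay chosen
    before each retry. The default sleep does nothing so the sketch
    runs instantly; pass time.sleep to wait for real.
    """
    waits: List[float] = []
    for attempt in range(max_times + 1):  # first try plus max_times retries
        try:
            send(chunk)
            return True, waits
        except ConnectionError:
            if attempt == max_times:
                break  # retries exhausted, give up on this chunk
            wait = retry_wait * base ** attempt  # 1s, 2s, 4s, ...
            waits.append(wait)
            sleep(wait)
    return False, waits


if __name__ == "__main__":
    failures_left = {"n": 3}

    def flaky_send(chunk: List[str]) -> None:
        # Simulates a destination that is down for the first three attempts
        if failures_left["n"] > 0:
            failures_left["n"] -= 1
            raise ConnectionError("destination unavailable")

    print(flush_with_retry(["event-1", "event-2"], flaky_send))
```

With a file buffer the chunk stays on disk while these retries run, which is why a short outage delays delivery instead of dropping events.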
Filter: Enriching and Transforming Events
Filters run in the pipeline between input and output:
# The built-in nginx parser stores the HTTP status in the "code" field
<filter web.nginx>
  @type record_transformer
  enable_ruby true
  <record>
    environment "production"
    hostname "#{Socket.gethostname}"
    log_level ${record["code"].to_i >= 500 ? "ERROR" : "INFO"}
  </record>
</filter>
This adds environment, hostname, and a derived log_level to every Nginx event before sending to Elasticsearch.
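For readers less familiar with the embedded Ruby, the same enrichment can be sketched in Python. This is an illustrative equivalent, not how Fluentd executes the filter; `enrich` and `to_int` are hypothetical helpers, and the name of the status field is a parameter because it depends on which parser produced the record.

```python
import socket
from typing import Any, Dict


def to_int(value: Any) -> int:
    """Rough stand-in for Ruby's to_i: missing or unparseable values become 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def enrich(record: Dict[str, Any], status_field: str = "code") -> Dict[str, Any]:
    """Return a copy of the record with the three extra fields added."""
    enriched = dict(record)  # leave the caller's record untouched
    enriched["environment"] = "production"
    enriched["hostname"] = socket.gethostname()
    status = to_int(record.get(status_field))
    enriched["log_level"] = "ERROR" if status >= 500 else "INFO"
    return enriched


if __name__ == "__main__":
    print(enrich({"path": "/checkout", "code": "502"}))
    print(enrich({"path": "/health", "code": "200"}))
```

Note that a record without a status field falls through to "INFO" here, which is the same outcome the Ruby expression gives when the field is missing.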
Fluentd vs Logstash vs Fluent Bit
| | Fluentd | Logstash | Fluent Bit |
| --- | --- | --- | --- |
| Language | Ruby + C | Java | C |
| Memory footprint | ~60 MB | ~500 MB+ | ~1 MB |
| Plugin ecosystem | 700+ plugins | 200+ plugins | 70+ plugins |
| Best for | Central aggregation server | Elasticsearch pipelines | Edge / container collection |
| Kubernetes pattern | Deploy as DaemonSet or aggregator | Sidecar or aggregator | DaemonSet (forward to Fluentd) |
The common production pattern: Fluent Bit as a lightweight DaemonSet on every node, forwarding to a central Fluentd aggregation layer, then to Elasticsearch/S3.
Key Takeaways
- Fluentd collects logs from any source, normalizes to JSON, and routes to any destination via tag-based matching.
- Plugin types: Input → Parser → Filter → Buffer → Output.
- Buffer plugins provide durability: events survive output outages.
- Compared to Logstash (Java, heavy) and Fluent Bit (C, ultra-light), Fluentd sits in the middle as a reliable aggregation layer.
- Common pattern: Fluent Bit (edge) → Fluentd (aggregation) → Elasticsearch or Kafka.
Test Your Understanding
- An Nginx log event is tagged web.nginx. Which <match> rule catches it: web.** or app.**?
- Elasticsearch goes offline for 10 minutes. Without a buffer plugin, what happens to logs?
- What is the difference between a parser plugin and a filter plugin in Fluentd?
- Why would you run Fluent Bit on each node instead of Fluentd directly?
Written by
Abstract Algorithms
@abstractalgorithms