DFS — Depth-First Search: Go Deep Before Going Wide

DFS explores graphs by going as deep as possible before backtracking — the key to cycle detection and path search.

Data Structures and Algorithms

Abstract Algorithms

·Mar 29, 2026·15 min read

📚

Intermediate

For developers with some experience. Builds on fundamentals.

Estimated read time: 15 min

AI-assisted content. This post may have been written or enhanced with AI tools. Please verify critical information independently.

TLDR: DFS explores a graph by diving as deep as possible along each path before backtracking, using a call stack (recursion) or an explicit stack. It is the go-to algorithm for cycle detection, path finding, topological sort, and connected components. Time: O(V + E). Space: O(V). Watch for StackOverflow on deep graphs — use iterative DFS instead.

A compiler needs to detect circular imports: module A imports B, B imports C, C imports A — an illegal cycle. A game engine needs to find all rooms reachable from the starting room. A social graph algorithm needs to identify all connected components — groups of users with no path between them. All three are DFS.

What makes DFS the right tool here? BFS explores breadth first — it finds things close to the source before things far away. That is ideal for shortest paths. But for exploring an entire connected component, checking whether a cycle exists, or finding any path (not the shortest one), you want to go as deep as possible before backing up. DFS commits to one direction until it either finds a goal or hits a dead end, then backtracks to the last unexplored branch.

The intuition is navigating a maze with a rule: always take the rightmost corridor first. Keep going until you either exit or hit a wall, then backtrack to the last junction you had not fully explored. DFS exhausts one branch completely before trying the next.

Problem Category	Why DFS Works	Key DFS Mechanic
Cycle detection in directed graph	Follow edges; revisit detection = cycle	3-color white/gray/black marking
Connected components	Flood-fill each component from any node	Mark visited across restarts
Topological sort (DAG)	Post-order DFS gives reverse topo order	Record on finish, not on visit
Path existence (source → target)	Follow one path to completion	Return true on goal match

DFS handles all four of these naturally. BFS can do them too, but with more bookkeeping and less intuitive code.

🔍 The Stack Contract: How Recursion and Explicit Stacks Both Power DFS

DFS is fundamentally a stack-based algorithm. The "go deep, then backtrack" behaviour is exactly what a stack delivers: last-in-first-out.

When you write DFS recursively, the call stack is your DFS stack. Each recursive call pushes a frame onto the call stack representing the current node, and returning from the function pops that frame — which is backtracking.

When you write DFS iteratively, you manage an explicit stack (Deque<Integer> in Java). You push the starting node, pop a node to process it, and push its unvisited neighbors. The result is the same traversal order.

The key difference between recursive and iterative DFS is when you mark a node visited:

Recursive DFS: mark visited when you enter the recursive call (before exploring neighbors).
Iterative DFS: mark visited either when you push (to avoid re-pushing duplicates) or when you pop (which can allow duplicates in the stack — safer to mark on push).

Pre-order, in-order, and post-order tree traversal are all special cases of DFS applied to binary trees: the difference is only in when you "process" (visit) the current node relative to recursing left and right.

⚙️ Recursive and Iterative DFS Templates in Java

Recursive DFS template — the default; clean and concise:

import java.util.*;

public class DFSTemplate {

    // Recursive DFS on an adjacency-list graph
    public void dfsRecursive(Map<Integer, List<Integer>> graph,
                             int node,
                             Set<Integer> visited) {
        visited.add(node);          // Mark visited on entry
        System.out.println("Visiting: " + node);

        for (int neighbor : graph.getOrDefault(node, Collections.emptyList())) {
            if (!visited.contains(neighbor)) {
                dfsRecursive(graph, neighbor, visited);
            }
        }
        // Post-order processing goes here (e.g., topological sort stack push)
    }

    public void dfsFromStart(Map<Integer, List<Integer>> graph, int start) {
        dfsRecursive(graph, start, new HashSet<>());
    }
}

Iterative DFS template — required when the graph is too deep for the JVM call stack:

public void dfsIterative(Map<Integer, List<Integer>> graph, int start) {
    Deque<Integer> stack = new ArrayDeque<>();
    Set<Integer> visited = new HashSet<>();

    stack.push(start);

    while (!stack.isEmpty()) {
        int node = stack.pop();
        if (visited.contains(node)) continue; // Guard for duplicates in stack
        visited.add(node);
        System.out.println("Visiting: " + node);

        // Push neighbors in reverse order to maintain left-to-right traversal
        List<Integer> neighbors = graph.getOrDefault(node, Collections.emptyList());
        for (int i = neighbors.size() - 1; i >= 0; i--) {
            if (!visited.contains(neighbors.get(i))) {
                stack.push(neighbors.get(i));
            }
        }
    }
}

The iterative version skips post-order processing naturally — if you need post-order (e.g., for topological sort), the iterative approach requires a second stack or a visited-twice scheme. For most interview problems, recursive DFS is cleaner. Use iterative only when stack depth is a real concern (trees with hundreds of thousands of nodes).

🧠 Deep Dive: DFS Internals and Call Stack Analysis

DFS Internals: Recursive Call Stack vs Explicit Stack

Here is the call stack state during a recursive DFS from node A on the path A→B→D and side branch B→E, then C→F:

Call Depth	Stack Frames (top = most recent)	Action
0	[dfs(A)]	Visit A, recurse to B
1	[dfs(B), dfs(A)]	Visit B, recurse to D
2	[dfs(D), dfs(B), dfs(A)]	Visit D, no unvisited neighbors, return
1	[dfs(B), dfs(A)]	Back in B, recurse to E
2	[dfs(E), dfs(B), dfs(A)]	Visit E, no unvisited neighbors, return
1	[dfs(B), dfs(A)]	B fully explored, return
0	[dfs(A)]	Back in A, recurse to C
1	[dfs(C), dfs(A)]	Visit C, recurse to F
2	[dfs(F), dfs(C), dfs(A)]	Visit F, return

The maximum stack depth equals the longest path in the graph (the diameter in the worst case). For a linked-list-shaped graph of 100,000 nodes, the recursive DFS would push 100,000 stack frames — far exceeding the JVM default stack size (~500–1000 frames before StackOverflowError).

Performance Analysis: O(V + E) Time and O(V) Space

Time Complexity: O(V + E)

Each vertex is entered and exited exactly once via the recursive call — O(V) function calls. Each edge is examined exactly once, when its source vertex processes its neighbor list — O(E) edge checks. Total: O(V + E).

Space Complexity: O(V)

The visited set holds at most V nodes — O(V). The recursion stack holds at most the depth of the deepest path — in the worst case (a linear chain), O(V). In practice on balanced trees, it is O(log V). Total: O(V).

When does DFS run slow? DFS on a dense graph (E ≈ V²) has O(V²) time. DFS on a graph with very deep paths causes StackOverflow on recursive implementations — switch to iterative DFS when depth can exceed ~1,000 nodes.

📊 DFS Traversal Order vs BFS on the Same Graph

The same graph yields very different traversal orders depending on the algorithm:

graph TD
    A[A  Start]
    B[B]
    C[C]
    D[D]
    E[E]

    A --> B
    A --> C
    B --> D
    B --> E
    C --> E

    style A fill:#4CAF50,color:#fff,stroke:#388E3C
    style B fill:#2196F3,color:#fff,stroke:#1565C0
    style C fill:#9C27B0,color:#fff,stroke:#6A1B9A
    style D fill:#FF9800,color:#fff,stroke:#E65100
    style E fill:#FF9800,color:#fff,stroke:#E65100

Traversal	Visit Order	How It Works
BFS from A	A → B → C → D → E	Level 0: A. Level 1: B, C. Level 2: D, E
DFS from A (recursive, left-first)	A → B → D → E → C	Dive into B immediately, exhaust B's branch before visiting C
Pre-order DFS	Node before children	Used in tree serialization
Post-order DFS	Node after children	Used in topological sort, dependency builds

Notice that BFS visits C (level 1) before D (level 2), while DFS visits D before C because it fully commits to B's subtree first. This difference is why DFS finds deeply nested solutions faster but cannot guarantee shortest paths.

🌍 DFS in the Wild: Compilers, Game Engines, and Build Tools

Cycle detection in Java imports: The Java compiler detects circular class dependencies using a 3-color DFS on the class dependency graph. A class starts as white (unvisited), turns gray (in the current DFS path), and turns black (fully explored). If a gray node is encountered again, a cycle exists and compilation fails. The JVM's classloader uses the same principle.

Reachability analysis in game engines: Unreal Engine and Unity use DFS to precompute navigation meshes — marking all rooms or tiles reachable from the player's spawn point. Any room not reachable by DFS is excluded from the loaded game world, reducing memory usage. This is equivalent to the connected-components DFS algorithm.

Topological sort in build systems: Make, Gradle, and Bazel use DFS post-order traversal on the build dependency DAG to determine compilation order. Post-order DFS produces a reverse topological sort: a node is recorded only after all its dependencies are fully built. This guarantees a module is only compiled after all its imports are ready.

⚖️ When DFS Causes StackOverflow and What to Do About It

The StackOverflow problem: Java's default thread stack size is 512 KB–1 MB, accommodating roughly 5,000–10,000 recursive calls depending on frame size. A graph shaped like a linked list with 100,000 nodes will crash a recursive DFS. In interviews this is rarely an issue, but in production it is a real concern.

Symptoms: java.lang.StackOverflowError with DFS recursion at the top of the stack trace.

Mitigation 1 — Iterative DFS with explicit stack: Converts recursion into a loop. The explicit Deque is heap-allocated, so it can hold millions of entries without hitting stack limits.

Mitigation 2 — Increase stack size (for threads): You can launch DFS in a dedicated thread with a custom stack size:

Thread dfsThread = new Thread(null, () -> dfsRecursive(graph, start, visited),
                              "dfs-thread", 64 * 1024 * 1024); // 64 MB stack
dfsThread.start();
dfsThread.join();

Mitigation 3 — Tail recursion elimination: Java does not perform tail-call optimization. If the recursive call is not the last statement (which is true for DFS with post-order work), you cannot rely on any compiler to eliminate the stack frame. Use iterative DFS.

Best practice: In production code that runs DFS on user-supplied graphs of unknown depth, always use the iterative template. Use recursive DFS only for in-memory trees of bounded and known depth (e.g., a parsed JSON document).

🧭 DFS vs BFS: When to Go Deep vs Go Wide

Situation	Recommendation
Detect cycle in directed graph	DFS with 3-color marking — most natural fit
Find all connected components	DFS — restart DFS for each unvisited node
Find shortest path (unweighted)	BFS — DFS does not guarantee it
Topological sort of a DAG	DFS post-order — collect nodes on finish
Any path from source to target exists?	DFS — explores one complete path quickly
Minimum steps in a grid or puzzle	BFS — level = step count

The one-line decision rule: Use DFS when the problem involves exploring or labeling structure (cycles, components, orderings, paths). Use BFS when the problem needs minimum distance or level-order processing.

🧪 Four Classic DFS Problems with Full Java Solutions

Example 1: Path Sum — Root-to-Leaf Path Equals Target (LeetCode 112)

DFS checks whether any root-to-leaf path sums to the target. Subtract the current node's value at each step; at a leaf, check if the remaining sum is zero.

public class PathSum {
    public boolean hasPathSum(TreeNode root, int targetSum) {
        if (root == null) return false;
        // At a leaf: check if remaining sum is exactly this leaf's value
        if (root.left == null && root.right == null) return root.val == targetSum;
        int remaining = targetSum - root.val;
        return hasPathSum(root.left, remaining) || hasPathSum(root.right, remaining);
    }
}

Example 2: Number of Islands — DFS Version (LeetCode 200)

DFS floods each island, sinking visited cells in-place. Compare with the BFS version in the companion post — same result, different traversal order.

public int numIslands(char[][] grid) {
    int islands = 0;
    for (int r = 0; r < grid.length; r++) {
        for (int c = 0; c < grid[0].length; c++) {
            if (grid[r][c] == '1') {
                islands++;
                dfs(grid, r, c);
            }
        }
    }
    return islands;
}

private void dfs(char[][] grid, int r, int c) {
    if (r < 0 || r >= grid.length || c < 0 || c >= grid[0].length
            || grid[r][c] != '1') return;
    grid[r][c] = '0';      // Sink cell = mark visited
    dfs(grid, r + 1, c);
    dfs(grid, r - 1, c);
    dfs(grid, r, c + 1);
    dfs(grid, r, c - 1);
}

Example 3: Course Schedule — Cycle Detection in a Directed Graph (LeetCode 207)

3-color DFS: white (0) = unvisited, gray (1) = in current DFS path, black (2) = fully explored. A gray → gray edge is a back edge, meaning a cycle exists.

public class CourseSchedule {

    public boolean canFinish(int numCourses, int[][] prerequisites) {
        List<List<Integer>> graph = new ArrayList<>();
        for (int i = 0; i < numCourses; i++) graph.add(new ArrayList<>());
        for (int[] pre : prerequisites) graph.get(pre[1]).add(pre[0]);

        int[] color = new int[numCourses]; // 0=white, 1=gray, 2=black

        for (int i = 0; i < numCourses; i++) {
            if (color[i] == 0 && hasCycle(graph, i, color)) return false;
        }
        return true;
    }

    private boolean hasCycle(List<List<Integer>> graph, int node, int[] color) {
        color[node] = 1; // Mark gray — in current DFS path
        for (int neighbor : graph.get(node)) {
            if (color[neighbor] == 1) return true;  // Back edge = cycle
            if (color[neighbor] == 0 && hasCycle(graph, neighbor, color)) return true;
        }
        color[node] = 2; // Mark black — fully explored
        return false;
    }
}

Example 4: Clone Graph — Deep Copy with DFS and a Visited Map (LeetCode 133)

DFS clones each node exactly once. The visited map doubles as both the cycle-prevention mechanism and the mapping from original to clone.

import java.util.*;

public class CloneGraph {

    // Node definition: int val; List<Node> neighbors;
    public Node cloneGraph(Node node) {
        if (node == null) return null;
        return dfs(node, new HashMap<>());
    }

    private Node dfs(Node node, Map<Node, Node> visited) {
        if (visited.containsKey(node)) return visited.get(node);

        Node clone = new Node(node.val);
        visited.put(node, clone);           // Map original → clone before recursing

        for (Node neighbor : node.neighbors) {
            clone.neighbors.add(dfs(neighbor, visited));
        }
        return clone;
    }
}

The map must be populated before recursing into neighbors, or cycles in the graph will cause infinite recursion.

🛠️ JGraphT: Open-Source Graph Algorithms Powered by DFS

JGraphT (github.com/jgrapht/jgrapht) is the most widely used open-source Java graph library. Its DepthFirstIterator implements iterative DFS for all Graph<V, E> implementations:

import org.jgrapht.*;
import org.jgrapht.graph.*;
import org.jgrapht.traverse.*;

Graph<String, DefaultEdge> graph = new DefaultDirectedGraph<>(DefaultEdge.class);
graph.addVertex("A"); graph.addVertex("B"); graph.addVertex("C");
graph.addEdge("A", "B");
graph.addEdge("B", "C");
graph.addEdge("A", "C");

// DFS from vertex "A"
DepthFirstIterator<String, DefaultEdge> dfsIter =
        new DepthFirstIterator<>(graph, "A");

while (dfsIter.hasNext()) {
    System.out.println(dfsIter.next());
}

JGraphT's DepthFirstIterator uses an internal explicit stack (not recursion) to avoid StackOverflow on large graphs. It also supports listener hooks for pre- and post-visit events — the equivalent of pre-order and post-order processing in recursive DFS — without requiring users to rewrite the traversal loop.

JGraphT also ships CycleDetector<V, E> (built on 3-color DFS), TopologicalOrderIterator<V, E> (DFS post-order), and ConnectivityInspector<V, E> (DFS restart per component) — all the standard DFS-based algorithms out of the box.

📚 Lessons Learned: DFS Pitfalls in Production Code

Recursive DFS will StackOverflow on deep user-supplied graphs. The JVM default stack handles roughly 5,000–10,000 frames. Any production system accepting arbitrary graph input must use iterative DFS. Catching StackOverflowError is not a valid mitigation — it is a system error, not an application exception.
The 3-color scheme is essential for directed cycle detection. Using a simple visited boolean set only detects revisited nodes — it flags false cycles in undirected graphs and misses some cycles in directed graphs. Gray (in-path) marking is the only correct way to detect back edges in a directed graph.
Clone Graph: populate the visited map before recursing. If you create the clone but recurse before adding it to the map, circular graphs will cause infinite recursion. The correct order is always: create → insert → recurse.
Post-order DFS for topological sort: record on finish, not on visit. A node should be added to the topological order only after all its dependents have been fully explored. Recording on first visit gives the wrong order.
Iterative DFS does not naturally produce post-order results. If you need post-order (topological sort, expression tree evaluation), either use recursive DFS or use a two-stack iterative approach. This is a common interview gotcha: "write an iterative post-order traversal."

📌 Summary: DFS in Five Key Points

DFS goes as deep as possible before backtracking, using the call stack (recursive) or an explicit Deque (iterative) to track the path.
Cycle detection, connected components, topological sort, and path existence are DFS's natural territory. BFS handles these but with more bookkeeping.
3-color marking (white/gray/black) is the correct algorithm for directed cycle detection — a simple visited set gives wrong results on directed graphs.
Time O(V + E), Space O(V) — space dominated by the recursion depth or explicit stack size.
Use iterative DFS in production when graph depth is unbounded — recursive DFS StackOverflows on deep inputs are a real-world failure mode, not just a theoretical concern.

The DFS decision rule: use DFS when you need to explore, label, or order the structure of a graph. Use BFS when you need minimum distance or level-order processing.

Test Your Knowledge

🧠

Ready to test what you just learned?

AI will generate 4 questions based on this article's content.

HyperLogLog Explained: Counting Billions of Unique Items with 12 KB

TLDR: HyperLogLog estimates the number of distinct elements in a dataset using ~12 KB of memory regardless of cardinality — with ±0.81% error. The insight: if you hash every element to a random bit string, the maximum length of leading zeros you obse...

May 3, 2026•17 min read

Count-Min Sketch Explained: Frequency Estimation at Streaming Scale

TLDR: Count-Min Sketch (CMS) is a fixed-size d × w counter matrix that estimates how often any element has appeared in a stream. Insert: hash the element with each of the d hash functions to get one column per row, increment those d counters. Query: ...

May 3, 2026•21 min read

Bloom Filters Explained: Membership Testing with Zero False Negatives

TLDR: A Bloom filter is a bit array of m bits + k independent hash functions that sets k bits on insert and checks those same k bits on lookup. If any checked bit is 0, the element is definitely not in the set — false negatives are mathematically imp...

May 3, 2026•18 min read

Java 21 to 25: Virtual Threads, Pattern Matching, and Structured Concurrency

TLDR: Java 21 LTS makes virtual threads a production-ready replacement for bounded thread pools — your newFixedThreadPool(200) can become newVirtualThreadPerTaskExecutor() and handle 10× the concurrency with no architectural changes. Pattern switch w...