All Posts

Merge Intervals Pattern: Solve Scheduling Problems with Sort and Sweep

Sort once, sweep once: three full Java solutions for the merge intervals pattern used in every scheduling system.

Abstract AlgorithmsAbstract Algorithms
Β·Β·16 min read
Cover Image for Merge Intervals Pattern: Solve Scheduling Problems with Sort and Sweep
Share
Share on X / Twitter
Share on LinkedIn
Copy link

TLDR: Sort intervals by start time, then sweep left-to-right and merge any interval whose start ≀ the current running end. O(n log n) time, O(n) space. One pattern β€” three interview problems solved.


πŸ“– When Two Meetings Overlap: The Scheduling Problem That Launched This Pattern

Your calendar app needs to render free time slots. You have 500 meetings stored as [start, end] pairs. Some overlap, some are back-to-back, some are completely separate. Before you can show anything useful, you need to collapse all overlapping and touching meetings into their merged spans.

Brute force β€” comparing every pair β€” costs O(nΒ²). But there is a smarter observation: if you sort intervals by start time, you never need to look backward. Every overlap can only happen with the interval you just processed. That single insight reduces the problem to a linear sweep.

This is the merge intervals pattern: sort once, then greedily extend or commit intervals in a single left-to-right pass. It surfaces across interview problems, scheduling systems, and anywhere ranges of values need to be consolidated.


πŸ” The Overlap Condition: When a.end >= b.start

Two intervals a = [a.start, a.end] and b = [b.start, b.end] (where a.start <= b.start after sorting) overlap when:

a.end >= b.start

If this holds, the merged interval spans [a.start, max(a.end, b.end)]. Note the max β€” b might be entirely inside a, in which case a.end stays unchanged.

Three states exist after sorting:

StateConditionAction
Non-overlappinga.end < b.startCommit a to result, start fresh with b
Touchinga.end == b.startMerge: new end = b.end (they share a boundary point)
Overlappinga.end > b.startMerge: new end = max(a.end, b.end)

The touching case matters: [1, 3] and [3, 5] share the point 3 and should merge to [1, 5]. The condition a.end >= b.start handles all three cases at once.


βš™οΈ Sort-Then-Sweep: The Two-Step Algorithm for Merging Intervals

The algorithm fits in two clean phases:

Phase 1 β€” Sort all intervals by their start time. After sorting, any potential overlap between interval i and interval j (where i < j) only ever involves j possibly overlapping with the running merged interval β€” never with an earlier one you already committed.

Phase 2 β€” Sweep left-to-right maintaining a current interval:

  • If the next interval starts before or at current.end, extend current.end to max(current.end, next.end).
  • Otherwise commit current to the result and make next the new current.

The invariant is powerful: once you sort, you never need to revisit a committed interval. Greedy always works here because a later interval's start is β‰₯ every previously committed interval's start.

Here is the complete Java solution for LeetCode 56 β€” Merge Intervals:

import java.util.*;

public class MergeIntervals {

    public int[][] merge(int[][] intervals) {
        if (intervals.length <= 1) return intervals;

        // Phase 1: sort by start time
        Arrays.sort(intervals, (a, b) -> a[0] - b[0]);

        List<int[]> result = new ArrayList<>();
        int[] current = intervals[0];

        // Phase 2: sweep and merge
        for (int i = 1; i < intervals.length; i++) {
            if (intervals[i][0] <= current[1]) {
                // Overlapping or touching β€” extend the current interval
                current[1] = Math.max(current[1], intervals[i][1]);
            } else {
                // Gap found β€” commit current and start fresh
                result.add(current);
                current = intervals[i];
            }
        }
        result.add(current); // don't forget the last interval

        return result.toArray(new int[result.size()][]);
    }
}

Trace on [[1,3],[2,6],[8,10],[15,18]]:

StepCurrentNextActionResult list
1[1,3][2,6]2 ≀ 3 β†’ mergecurrent=[1,6]
2[1,6][8,10]8 > 6 β†’ commitresult=[[1,6]], current=[8,10]
3[8,10][15,18]15 > 10 β†’ commitresult=[[1,6],[8,10]], current=[15,18]
Endβ€”β€”commit lastresult=[[1,6],[8,10],[15,18]]

🧠 Deep Dive: Why Sorting Makes Greedy Correct, and How Complexity Scales

Under the Hood: Internals of the Sorted Sweep

The reason sorting enables a greedy single pass is a monotonicity proof: after sorting by start time, interval i+1 has a start β‰₯ interval i's start. This means that if interval i+1 does not overlap the current merged interval, no subsequent interval can overlap it either (their starts are even larger). You can safely commit the current interval and move on β€” no backtracking required.

Without sorting, a new interval could overlap any previous one, forcing O(n) backward scanning per interval. Sorting eliminates that possibility entirely.

The current variable plays the role of a running window that keeps growing as long as overlaps continue. It is the classic greedy extension strategy β€” always keep the widest span achievable with the current run of overlapping intervals.

Performance Analysis: Time and Space Trade-offs

DimensionComplexityJustification
TimeO(n log n)Dominated by the sort; sweep is O(n)
SpaceO(n)Result list holds at most n merged intervals
In-place sortO(log n)Java's Arrays.sort uses a dual-pivot quicksort internally

The sweep itself is O(n) β€” each interval is visited exactly once, and each interval is committed to the result at most once. The bottleneck is always the sort.

Bottleneck scenario: if all intervals overlap (e.g., [[1,100],[2,99],[3,98],...]), the result is a single interval but you still pay O(n log n) for the sort. You cannot do better than O(n log n) in the comparison-based model for this problem.


πŸ“Š Three Interval States: Non-Overlapping, Touching, and Overlapping

flowchart TD
    A[Sort intervals by start time] --> B[Pick first as current]
    B --> C{Next interval start <= current end?}
    C -->|"Yes β€” overlapping or touching"| D["Extend: current.end = max(current.end, next.end)"]
    D --> E[Advance to following interval]
    E --> C
    C -->|"No β€” gap found"| F[Commit current to result list]
    F --> G[Set next as new current]
    G --> C
    C -->|"No more intervals"| H[Commit final current]
    H --> I[Return result list]

The loop never backtracks. After sorting, each interval is visited exactly once β€” either merged into the running window or committed to the output.

The three concrete cases from the condition table above map directly to this flowchart's left branch (merge) and right branch (commit). The max on the end time handles the case where one interval is completely inside another.


🌍 Real-World Uses: Google Calendar, CPU Scheduling, and Genome Assembly

Calendar free-slot computation is the most direct use. Given a list of busy intervals, merge them and then take the complement of the merged spans against the total working day to find free slots. Google Calendar's free/busy API does exactly this at scale.

CPU burst scheduling in operating systems uses interval merging to consolidate contiguous memory allocations or time-slice segments. When two adjacent allocated blocks are freed, they merge into a single free block β€” the same overlap condition applies.

Genome assembly in bioinformatics produces overlapping DNA sequence reads that must be merged into contigs. The reads are sorted by position and then merged using the same a.end >= b.start condition β€” at a scale of millions of intervals.

Input / Process / Output for a calendar free-slot system:

StageData
Input[[9,10],[10,12],[11,14],[15,17]] (meeting times)
After sort[[9,10],[10,12],[11,14],[15,17]] (already sorted)
After merge[[9,14],[15,17]] (busy spans)
Free slots[[14,15]] in a 9–17 workday

πŸ§ͺ Three Problems, One Pattern

Problem 1: Merge Intervals (LeetCode 56)

Full solution shown in the βš™οΈ section above.

Problem 2: Insert Interval

Given a sorted, non-overlapping list, insert a new interval and return the merged result. There are exactly three phases: collect intervals that end before the new one starts, merge all overlapping intervals into the new one, then collect the rest.

public class InsertInterval {

    public int[][] insert(int[][] intervals, int[] newInterval) {
        List<int[]> result = new ArrayList<>();
        int i = 0, n = intervals.length;

        // Phase 1: intervals that end strictly before newInterval starts
        while (i < n && intervals[i][1] < newInterval[0]) {
            result.add(intervals[i++]);
        }

        // Phase 2: merge all overlapping intervals into newInterval
        while (i < n && intervals[i][0] <= newInterval[1]) {
            newInterval[0] = Math.min(newInterval[0], intervals[i][0]);
            newInterval[1] = Math.max(newInterval[1], intervals[i][1]);
            i++;
        }
        result.add(newInterval);

        // Phase 3: intervals that start strictly after newInterval ends
        while (i < n) {
            result.add(intervals[i++]);
        }

        return result.toArray(new int[result.size()][]);
    }
}

The three-phase structure is the key insight: pre, overlap, post. You never need to sort because the input is already sorted.

Problem 3: Meeting Rooms II β€” Minimum Rooms Needed

How many conference rooms do you need to hold all meetings without conflicts? A min-heap stores the end time of each active meeting. For each new meeting (sorted by start), if the earliest-ending meeting finishes before this one starts, you can reuse that room. Otherwise, you need a new room.

import java.util.*;

public class MeetingRoomsII {

    public int minMeetingRooms(int[][] intervals) {
        if (intervals.length == 0) return 0;

        // Sort by start time
        Arrays.sort(intervals, (a, b) -> a[0] - b[0]);

        // Min-heap of end times for rooms currently in use
        PriorityQueue<Integer> heap = new PriorityQueue<>();

        for (int[] interval : intervals) {
            // If the earliest-ending room frees up before this meeting starts, reuse it
            if (!heap.isEmpty() && heap.peek() <= interval[0]) {
                heap.poll();
            }
            // Allocate a room (or new room) for this meeting
            heap.offer(interval[1]);
        }

        // Rooms still in the heap are concurrently occupied at peak
        return heap.size();
    }
}

Trace on [[0,30],[5,10],[15,20]]:

MeetingHeap beforeDecisionHeap after
[0,30][]New room[30]
[5,10][30]30 > 5 β†’ new room[10, 30]
[15,20][10, 30]10 ≀ 15 β†’ reuse[20, 30]

Result: heap.size() = 2 β€” two rooms needed at peak.


βš–οΈ When Merge Intervals Breaks Down (and What to Use Instead)

The sort-then-sweep approach has specific failure modes to watch for:

Edge case β€” single-point intervals: [3, 3] is a zero-length interval that can still merge with [2, 3] or [3, 5]. The condition a.end >= b.start handles this correctly β€” confirm your implementation does not use strict greater-than.

Edge case β€” containment: [1, 10] completely contains [3, 6]. The max(current.end, next.end) step handles this, but if you naively take next.end instead of max, you silently shrink the merged interval.

When to avoid merge intervals:

SituationRecommendation
Intervals arrive in a stream (online)Use an interval tree (O(log n) insert/query) instead
Need to query which interval a point falls inSorted list + binary search or segment tree
Intervals have associated values to aggregateSweep line with event-based processing
Input is already sortedSkip the sort; sweep directly for O(n)

🧭 Choosing Between Interval Merge, Sweep Line, and Interval Trees

ScenarioBest Approach
Offline batch β€” merge all overlapsSort-then-sweep O(n log n)
Online β€” insert and query intervals dynamicallyInterval tree or TreeMap-based approach
Count overlaps at every pointSweep line with +1 / -1 events
Find all intervals containing a query pointSegment tree or augmented BST
Already sorted, one-pass merge neededLinear sweep O(n)
Embedded ranges (gene annotation, painting)Coordinate compression + BIT

The sort-then-sweep pattern is the right choice whenever you process a static batch of intervals and need the merged result. Dynamic scenarios call for interval trees.


πŸ› οΈ Guava RangeSet: Production-Grade Interval Merging in Java

Google Guava's RangeSet<C> is a set of disjoint ranges that auto-merges overlapping additions. It is the production answer to merge intervals in Java applications.

import com.google.common.collect.ImmutableRangeSet;
import com.google.common.collect.Range;
import com.google.common.collect.TreeRangeSet;

public class GuavaIntervalMerge {

    public static void main(String[] args) {
        TreeRangeSet<Integer> rangeSet = TreeRangeSet.create();

        // Add intervals β€” Guava merges overlaps automatically
        rangeSet.add(Range.closed(1, 3));
        rangeSet.add(Range.closed(2, 6));  // merges with [1,3]
        rangeSet.add(Range.closed(8, 10));
        rangeSet.add(Range.closed(15, 18));

        // Prints [[1..6], [8..10], [15..18]]
        System.out.println(rangeSet.asRanges());

        // Query: does any merged interval contain point 4?
        System.out.println(rangeSet.contains(4)); // true

        // RangeMap for values per interval (e.g., user bookings)
        // com.google.common.collect.TreeRangeMap handles per-interval payloads
    }
}

TreeRangeSet internally uses a TreeMap<Cut<C>, Range<C>> where Cut represents a boundary endpoint. Overlap merging happens at insertion time in O(log n). This is the same algorithmic idea β€” maintain a sorted structure, detect overlaps on insert β€” but optimized for dynamic workloads.

For a static batch, the sort-then-sweep you wrote above is faster (O(n log n) vs. O(n log n) with a higher constant for the TreeMap). For a system where intervals arrive continuously, TreeRangeSet is the right choice.


πŸ“š Lessons from Getting Interval Problems Wrong

Lesson 1 β€” Forget the max and create shrinking intervals. The most common bug is current[1] = intervals[i][1] instead of current[1] = Math.max(current[1], intervals[i][1]). This silently shrinks the merged interval when the next interval is fully contained within the current one.

Lesson 2 β€” Mutate the input array. Using current = intervals[0] and then current[1] = ... modifies the original array. In Java, int[] current = intervals[0] is a reference, not a copy. Either copy the first interval (new int[]{intervals[0][0], intervals[0][1]}) or be aware you are mutating input β€” which can fail tests that check the input is unchanged.

Lesson 3 β€” Forget to add the last interval. The sweep loop commits current only when a gap is found. The last interval (or last merged group) is never followed by a gap, so it never gets committed inside the loop. Always add result.add(current) after the loop.

Lesson 4 β€” Use strict greater-than for the overlap check. intervals[i][0] < current[1] misses touching intervals (== case). Use <=.

Lesson 5 β€” Sort by start but break ties by end descending. For Meeting Rooms variants where fully contained intervals matter, tie-breaking in the comparator can change results. For the basic merge, start-only sorting is sufficient.


πŸ“Œ Summary: The Merge Intervals Playbook

  • Sort by start time β€” this converts a 2D overlap check into a 1D greedy sweep.
  • Overlap condition: next.start <= current.end β€” use <= to handle touching intervals.
  • Merge rule: current.end = max(current.end, next.end) β€” the max handles containment.
  • Always add the last current after the loop exits.
  • Meeting Rooms II flips the question: instead of merging intervals into fewer, count concurrent ones with a min-heap.
  • Insert Interval uses three phases (before / overlap / after) and skips the sort since input is pre-sorted.
  • Time: O(n log n) dominated by sort. Space: O(n) for result list.

One-liner to remember: Sort by start, extend by max-end, commit on gap β€” that is the entire merge intervals pattern.


πŸ“ Practice Quiz

  1. Given [[1,4],[4,5]], what does the merge algorithm produce?

    • A) [[1,4],[4,5]] β€” they do not overlap
    • B) [[1,5]] β€” touching intervals merge
    • C) [[1,4],[5,5]] β€” the second interval is trimmed Correct Answer: B
  2. You have 1000 meetings sorted by start time. What is the minimum time complexity of merging overlapping meetings?

    • A) O(n) β€” a single sweep is sufficient since data is already sorted
    • B) O(n log n) β€” you must re-sort before sweeping
    • C) O(nΒ²) β€” every pair must be compared Correct Answer: A
  3. In the Meeting Rooms II solution, why is a min-heap of end times used instead of a max-heap?

    • A) A min-heap is sorted in ascending order, which mirrors the sort order of start times
    • B) We want to check whether the room that ends earliest can be reused β€” a min-heap gives O(1) access to that room
    • C) A max-heap would overflow for large inputs Correct Answer: B
  4. Open-ended challenge: The merge intervals pattern runs in O(n log n) due to sorting. Design a data structure that accepts intervals one at a time (streaming input, not batch) and answers "how many distinct merged intervals exist right now?" in O(log n) per insert. What data structure would you use, and what invariants does it need to maintain?



Abstract Algorithms

Written by

Abstract Algorithms

@abstractalgorithms