Every developer who has ever implemented undo and redo has, at some point, made the same beautiful mistake. They save pointers to "history states" thinking they've saved the history. They haven't. They've saved five views of the same data.

One sloppy mutation later, the entire history is corrupted. Future undos restore the future. The past, it turns out, was sharing memory with the present all along.

Part One: The mistake

Imagine a simple document editor. You want undo and redo, so you keep a vector of past document states. Each state is a heavy object (think layers, pixels, or a big text buffer), so to keep things efficient, you store shared pointers instead of full copies.

naive_history.cpp
class Document {
    std::vector<int> data;  // imagine this is huge
public:
    explicit Document(size_t n) : data(n) {}
    void set(size_t i, int v) { data[i] = v; }
    int  get(size_t i) const { return data[i]; }
};

std::vector<std::shared_ptr<Document>> history;
auto current = std::make_shared<Document>(1'000'000);  // pretend it's huge

// Save current state into history
history.push_back(current);

// Later, user edits the document
current->set(3, 42);

// What just happened to history[history.size()-1]?
// It changed too. They point to the same object.

This looks fine until you actually use it. The moment you mutate current, every prior history entry that shared the pointer mutates with it. The undo stack now contains "the current state" five times in a row.

[Diagram: history entries state_0, state_1, and state_2 all point at one Document object. Mutate the document and every history entry mutates with it.]

The fix that every junior engineer reaches for next is to deep copy the entire document before each edit. That works. It's also catastrophically wasteful. For a 50MB document with 100 edits, you've just allocated 5GB. Most of which is identical to the previous version.
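For concreteness, the deep-copy version of the snapshot is a one-line change against the naive Document above (a sketch, not a recommendation):

// Deep-copy snapshot: correct, but copies the entire buffer on
// every snapshot, whether or not anything changed since last time.
history.push_back(std::make_shared<Document>(*current));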

Copying is correct. Copying everything is the wrong amount of correct.

Part Two: The trick

Copy-on-Write is the obvious idea once you see it: share data freely until someone tries to change it. At the moment of mutation, and only then, make a private copy for the mutator. Everyone else keeps the original. The history stays clean. Most reads pay nothing.

→ The rule

Reads are free and shared. Writes pay for themselves.

If a shared object has only one owner, mutation happens in place. If it has multiple owners, the mutator clones the data first, then mutates its private copy. The other owners never notice. "Do I have co-owners right now?" is exactly the question std::shared_ptr::use_count() lets you ask.

In C++, the pattern is small and self-contained. The handle class owns a shared pointer to the data. Read operations forward through directly. Write operations call a helper that detaches if needed, then mutate.

cow_document.cpp
class Document {
    std::shared_ptr<std::vector<int>> data;

    void detach() {
        if (data.use_count() > 1) {
            // We're sharing. Make our own copy.
            data = std::make_shared<std::vector<int>>(*data);
        }
    }

public:
    explicit Document(size_t n = 0)
        : data(std::make_shared<std::vector<int>>(n)) {}

    // Read: cheap, shared, no copy
    int get(size_t i) const { return (*data)[i]; }

    // Write: detach first, then mutate
    void set(size_t i, int v) {
        detach();
        (*data)[i] = v;
    }
};

That's the whole pattern. Pushing a copy of the document onto the history stack now costs a pointer copy and a refcount increment. Editing the live document triggers exactly one allocation, and only at the moment of the first edit after a snapshot.
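Here's a minimal usage sketch, assuming the Document class above (with its sized constructor) is in scope:

cow_usage.cpp
#include <cassert>
#include <vector>

int main() {
    Document doc(1'000'000);         // one big shared buffer
    doc.set(3, 1);                   // sole owner: mutates in place

    std::vector<Document> history;
    history.push_back(doc);          // snapshot: refcount bump, no copy
    history.push_back(doc);          // another snapshot, still no copy

    doc.set(3, 42);                  // first edit after sharing:
                                     // one clone, then the write

    assert(history[0].get(3) == 1);  // the snapshots kept the old value
    assert(doc.get(3) == 42);        // current forked cleanly
}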

[Diagram: history entries state_0 through state_2 all share the v1 snapshot, while current points at its own v2. History shares the snapshot. The current state forks only when it changes.]

Part Three: Why this is fast

The performance story is straightforward once the mechanics click. Three operations that used to be expensive become cheap:

Snapshots are O(1). Pushing a document into history is just incrementing a refcount. No allocation, no memcpy, no scaling with document size. You can snapshot every keystroke if you want.

Reads are free. Reading from any history state, or the current state, dereferences the same underlying buffer. There's no indirection cost beyond what a shared pointer already imposes.

Writes are deferred. If you snapshot and then never edit again, you never paid for the copy. If you snapshot and edit a thousand times, you paid for one copy at the first edit and zero copies after that, until the next snapshot.
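One way to sanity-check that last claim is to count clones directly. A minimal instrumented sketch follows; the CountingDocument name and the global counter are illustrative only, not part of the pattern:

clone_counter.cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

int clone_count = 0;  // illustration only: tallies real buffer copies

class CountingDocument {
    std::shared_ptr<std::vector<int>> data;

    void detach() {
        if (data.use_count() > 1) {
            ++clone_count;
            data = std::make_shared<std::vector<int>>(*data);
        }
    }

public:
    explicit CountingDocument(std::size_t n)
        : data(std::make_shared<std::vector<int>>(n)) {}

    int  get(std::size_t i) const { return (*data)[i]; }
    void set(std::size_t i, int v) { detach(); (*data)[i] = v; }
};

int main() {
    CountingDocument doc(1024);
    CountingDocument snapshot = doc;  // share the buffer

    for (int i = 0; i < 1000; ++i)
        doc.set(0, i);                // a thousand edits...

    assert(clone_count == 1);         // ...one clone, at the first edit
    assert(snapshot.get(0) == 0);     // the snapshot never noticed
}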

This is the same idea behind persistent data structures in functional languages, behind fork() in Unix, and behind how filesystem snapshots work in ZFS and Btrfs. Different domains, same insight: sharing is the default, copying is the exception, and the exception only triggers when it has to.

Part Four: The tradeoffs

No pattern is free. COW has real costs worth being honest about before you reach for it:

↑ What you gain
  • O(1) snapshots regardless of data size
  • Memory-efficient history with no duplication of unchanged data
  • Safer mutation semantics across owners
  • Natural fit for undo/redo, transactional state, and immutable APIs
↓ What you pay
  • Every write checks use_count()
  • The first write after sharing triggers a full clone, sometimes at an awkward moment
  • Refcount updates are atomic, which can hurt in highly concurrent code
  • Reasoning about ownership gets subtler than with plain values

The last point is the one that bites people in practice. Bugs in COW code tend to be subtle: a missing detach, a const method that mutates through a non-const member, a reference handed out that outlives its sharing context. Static analyzers catch some of this. Discipline catches the rest. There's a reason Qt builds explicit detach calls into many of its containers, and a reason languages like Swift bake COW into the language for arrays and dictionaries: doing it by hand is doable, but not effortless.
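That third bug deserves a concrete shape. Here's a hedged sketch: the Document from Part Two plus a hypothetical at() accessor that hands out a mutable reference:

cow_pitfall.cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class Document {
    std::shared_ptr<std::vector<int>> data;

    void detach() {
        if (data.use_count() > 1)
            data = std::make_shared<std::vector<int>>(*data);
    }

public:
    explicit Document(std::size_t n = 0)
        : data(std::make_shared<std::vector<int>>(n)) {}

    int get(std::size_t i) const { return (*data)[i]; }

    // Hypothetical addition: detach, then return a mutable reference.
    // The detach is correct *at this moment* -- the bug comes later.
    int& at(std::size_t i) { detach(); return (*data)[i]; }
};

int main() {
    Document doc(8);
    int& slot = doc.at(3);    // doc is the sole owner; no clone needed
    Document snapshot = doc;  // the buffer is shared again...
    slot = 99;                // ...and this write lands in the shared buffer

    assert(snapshot.get(3) == 99);  // the "immutable" snapshot sees it too
}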

Part Five: When to reach for it

COW shines exactly when your access pattern is "many readers, occasional writers, frequent snapshots." That description fits more code than you'd think:

Document editors with undo and redo. Image manipulation pipelines where filters chain together. Configuration trees that get snapshotted before risky operations. Versioned caches. Anywhere you'd otherwise be tempted to deep-copy a large object "just to be safe."

Skip it when writes dominate reads, when sharing is rare in practice, or when the cost of an unexpected clone at write time would be worse than the cost of upfront copying. Real-time systems with tight latency budgets often fall in this category. The "unexpected pause to allocate" cost of COW can blow your frame budget at exactly the wrong moment.

Copy-on-Write isn't clever. It's just the recognition that most copies don't need to happen. Make the lazy choice the default, and let mutation pay the bill it was always going to pay anyway. Thanks for reading ✦