Smart Pointers: Box, Rc, and RefCell

Twelve lessons ago, Box<dyn Error> was introduced as "reads, for now, as some kind of error, built from an idea waiting for you further down this course." The traits lesson explained the dyn half: any type at all, decided at runtime. It never explained the Box. This lesson finally does: dyn Error has no size the compiler can know in advance, and Box<T> is always the same fixed size no matter what is inside, which is the entire reason Result<(), Box<dyn Error>> compiles. From there: Rc<T> for sharing read-only access to the same heap data from multiple owners, where Rc::clone is a counter bump rather than the ownership lesson's deep-copying clone, RefCell<T> for mutating through a shared reference by checking the borrowing rules at runtime instead of compile time (and panicking on purpose when they are broken), and Rc<RefCell<T>>, the shared-mutable-state shape the next lesson reaches for again across threads.

17 min read · 4 learning objectives

What You'll Learn

  • Explain why Box<dyn Error> has compiled since the propagating-errors lesson: dyn Error has no compile-time-known size, but Box<T> is always the same fixed size regardless of T
  • Use Box::new to move a value to the heap, and recognize that Box<T> frees that allocation automatically when it goes out of scope, the same Drop timing as String and Vec<T>
  • Share read-only access to one heap allocation from multiple owners with Rc::clone and Rc::strong_count, and explain how this differs from the ownership lesson's deep-copying clone
  • Mutate data through a shared reference with RefCell::borrow_mut, recognize the runtime-panic version of the borrowing rules, and combine Rc and RefCell into the Rc<RefCell<T>> pattern

Heads up: this lesson starts an ADVANCED section. The Building a CLI lesson closed by drawing a line: everything through lesson 32 was about what to write (the right type, the right trait, the right method). Starting here, the question becomes who owns a piece of data and, for the first time, more than one owner is sometimes exactly what's wanted.

But first, a twelve-lesson-old promise. The propagating-errors lesson introduced Box<dyn Error> and asked for patience — it "reads, for now, as... built from an idea (trait objects) waiting for you a little further down this course." The traits lesson kept half of that promise: dyn Error means "any type at all, decided at runtime, so long as it implements Error." It never explained the Box. That's where this lesson starts.

What Box is for: a type with no size

Delete the Box and ask the compiler what it thinks of Result<(), dyn Error>:
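A minimal sketch of that experiment, assuming a hypothetical check_number function (the name and body are illustrative, not taken from the course). The rejected signature is kept in a comment so the block still compiles, and the two quoted sentences are the ones this lesson discusses:

```rust
use std::error::Error;

// Hypothetical helper: succeeds if `text` parses as an i32.
// With the Box in place, this compiles.
fn check_number(text: &str) -> Result<(), Box<dyn Error>> {
    let _n: i32 = text.trim().parse()?; // ParseIntError becomes Box<dyn Error>
    Ok(())
}

// Delete the Box and the signature is rejected:
//
//     fn check_number(text: &str) -> Result<(), dyn Error> { ... }
//
// The diagnostic includes these two sentences:
//   "the size for values of type `(dyn std::error::Error + 'static)`
//    cannot be known at compilation time"
//   "the trait `Sized` is not implemented for ..."

fn main() {
    println!("\"42\" ok: {}", check_number("42").is_ok());
    println!("\"forty-two\" ok: {}", check_number("forty-two").is_ok());
}
```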

Two sentences, and both matter. "the size for values of type (dyn std::error::Error + 'static) cannot be known at compilation time" — dyn Error could be io::Error, ParseIntError, a hand-written CountError, serde_json::Error — the traits lesson's "any type at all, decided at runtime" — and every one of those types is a different size. Before the compiler can generate code for anything involving Result<T, E>, it needs to know how many bytes E takes up. dyn Error doesn't have an answer. (That + 'static is the lifetimes lesson's prediction landing right on schedule — "occasionally inside a trait bound," it said, and here it is. It just means "this can stick around for as long as the program needs it" — not the interesting part of this message.)

Second sentence: "the trait Sized is not implemented for..." Sized is Rust's name for "a type whose size is known at compile time." Every type this course has used — i32, String, Vec<T>, every struct and enum written by hand — is Sized. dyn Error is the first thing in this course that isn't: it isn't really one type, it's a stand-in for whichever type implements Error, decided only once the program is running.

Box<T>: a fixed-size handle to the heap

Box<T> exists to fix exactly this. Whichever value ends up inside, a Box has a fixed size the compiler knows in advance, because the value itself doesn't live inside the Box. It lives on the heap, and the Box is a small, fixed-size handle pointing at it. The simplest possible example:
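A minimal sketch of that example, assuming only what the surrounding text states (a Box<i32> named b that holds 5):

```rust
fn main() {
    // 5 is moved to the heap; `b` is the fixed-size handle on the stack.
    let b = Box::new(5);

    // Display is forwarded to the i32 inside, so this prints the value.
    println!("b = {b}");
} // `b` goes out of scope here and the heap allocation is freed
```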

Box::new(5) puts 5 on the heap and hands back b — a Box<i32>, sitting on the stack, pointing at it. println!("b = {b}"); prints b = 5, not an address: Box<T> implements Display whenever T does — the same one-impl-covers-every-qualifying-T shape as .to_string() arriving for free in the writing-tests lesson, or Vec<T> getting Serialize for free in the serde-and-json lesson. And when b goes out of scope at the end of main, the heap allocation is freed automatically — Box<T>'s Drop, the same automatic cleanup the ownership lesson's very first String had, just for a plain i32 this time.

And that's the whole trick behind Box<dyn Error>. dyn Error — some error type, decided at runtime, of a size the compiler can't pin down — goes on the heap, where its size doesn't matter to anything else. Box<dyn Error> — always the same fixed size, whichever error ended up inside — is what actually sits in Result's E slot. Every ? that has converted an error into Box<dyn Error> since the propagating-errors lesson has been doing exactly what Box::new(5) just did, with an error value standing in for 5.
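One way to see this for yourself is to box two different error types and measure the handles. This is a sketch rather than part of the course's own code: parse_error and io_error are hypothetical helpers, and no particular byte counts are assumed, only that the two handles match.

```rust
use std::error::Error;
use std::mem::size_of_val;

// Hypothetical helper: a ParseIntError hidden behind Box<dyn Error>.
fn parse_error() -> Box<dyn Error> {
    let err = "not a number".parse::<i32>().unwrap_err();
    Box::new(err)
}

// Hypothetical helper: an io::Error hidden behind the same type.
fn io_error() -> Box<dyn Error> {
    let err = std::io::Error::from(std::io::ErrorKind::NotFound);
    Box::new(err)
}

fn main() {
    let a = parse_error();
    let b = io_error();

    // size_of_val(&a) measures the Box handle itself, not the heap data.
    println!("handles: {} and {} bytes", size_of_val(&a), size_of_val(&b));

    // size_of_val(&*a) measures the error value the handle points at.
    println!("values: {} and {} bytes", size_of_val(&*a), size_of_val(&*b));
}
```

The first line should report the same number twice, whatever the second line says about the values on the heap.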

More than one owner: Rc<T>

Every value in this course has had exactly one owner — the ownership lesson's first and biggest rule. Rc<T> ("reference counted") is the first type that bends it: multiple owners, sharing the same heap allocation, none of them solely responsible for cleaning it up.
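A minimal sketch of three owners sharing one vector. The variable names and the task string follow the walkthrough in this section; the exact println! wording is an assumption.

```rust
use std::rc::Rc;

fn main() {
    // One Vec<String> on the heap, one owner so far.
    let tasks = Rc::new(vec![String::from("Write lesson 33")]);
    println!("owners: {}", Rc::strong_count(&tasks)); // 1

    // Each Rc::clone adds an owner; the vector itself is not copied.
    let tasks2 = Rc::clone(&tasks);
    let tasks3 = Rc::clone(&tasks);
    println!("owners: {}", Rc::strong_count(&tasks)); // 3

    // Both handles read the same first element.
    println!("{}", tasks2[0]);
    println!("{}", tasks3[0]);
}
```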

Rc::new wraps the Vec<String> exactly the way Box::new wrapped 5, except that Rc::clone(&tasks) doesn't deep-copy the vector the way the ownership lesson's .clone() did. It bumps a counter and hands back another pointer to the same heap allocation. Rc::strong_count reads that counter directly: one owner right after Rc::new, three after two clones. tasks, tasks2, and tasks3 are three separate variables, all pointing at one Vec<String>: tasks2[0] and tasks3[0] both print "Write lesson 33" because there is only one of it.

Same method name, very different job. The ownership lesson's .clone() made an independent, separately-owned copy — deliberately expensive, so the cost would be visible. Rc::clone is closer to +1. When each Rc goes out of scope, the count drops by one; only when it reaches zero does the Vec<String> actually get dropped — Box<T>'s cleanup from a moment ago, just deferred until the last owner leaves.
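The contrast can be sketched side by side. Rc::ptr_eq is a standard-library function this lesson doesn't otherwise use; it reports whether two Rc handles point at the same allocation. The variable names shared and copy are illustrative.

```rust
use std::rc::Rc;

fn main() {
    let tasks = Rc::new(vec![String::from("Write lesson 33")]);

    // Rc::clone: a second handle to the same allocation.
    let shared = Rc::clone(&tasks);
    println!("same allocation: {}", Rc::ptr_eq(&tasks, &shared)); // true

    // Cloning the Vec itself: an independent deep copy.
    let mut copy: Vec<String> = (*tasks).clone();
    copy.push(String::from("Only in the copy"));
    println!("shared len: {}, copy len: {}", tasks.len(), copy.len()); // 1, 2

    // Dropping one handle lowers the count; the Vec lives on.
    drop(shared);
    println!("owners left: {}", Rc::strong_count(&tasks)); // 1
}
```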

Mutating through a shared reference: RefCell<T>

Every read in the last block went through &T: an Rc only ever hands out shared references to the value it owns, and the references-and-borrowing lesson's rule (any number of readers, or exactly one writer, never both, checked at compile time) still applies. RefCell<T> is for when that's too strict. It enforces the same rule, but at runtime instead of compile time.
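A minimal sketch of a counter behind a RefCell. The starting value of 0 and the two increments are assumptions, chosen so the final line prints the 2 that the walkthrough mentions.

```rust
use std::cell::RefCell;

fn main() {
    // Note: `count` is not declared `mut`.
    let count = RefCell::new(0);

    // Each borrow_mut() hands back a RefMut<i32> that is released
    // at the end of its statement, so the two writes never overlap.
    *count.borrow_mut() += 1;
    *count.borrow_mut() += 1;

    // borrow() hands back a Ref<i32>, which prints as the value inside.
    println!("count: {}", count.borrow());
}
```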

.borrow_mut() returns a RefMut<i32> — a smart pointer in its own right — and * reaches through it to the i32 underneath, the mirror image of &. .borrow() returns a Ref<i32>, and println!("count: {}", count.borrow()); prints 2 directly: Ref<T>, like Box<T>, implements Display whenever T does — no * needed just to print it.

RefCell's rule, broken on purpose

Borrow the same RefCell mutably twice at once, and watch what the compiler does — and doesn't — catch:
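A sketch of the broken version, with first and second named as in the discussion. One addition is not part of the lesson: the call is wrapped in std::panic::catch_unwind so that this sketch exits normally. Call double_borrow_mut() directly from main to see the program end with the panic instead.

```rust
use std::cell::RefCell;
use std::panic;

fn double_borrow_mut() {
    let count = RefCell::new(0);

    let mut first = count.borrow_mut();
    let mut second = count.borrow_mut(); // panics here, at runtime

    // Never reached: two live writers is exactly what RefCell forbids.
    *first += 1;
    *second += 1;
}

fn main() {
    // The panic message is still printed to stderr before this returns.
    let outcome = panic::catch_unwind(double_borrow_mut);
    println!("panicked: {}", outcome.is_err());
}
```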

This compiles without one complaint — first and second are each just a let binding, and the borrow checker has nothing to say about either in isolation. Run it, and RefCell says what the compiler couldn't: the references-and-borrowing lesson's "one writer at a time" rule, broken, the moment second's .borrow_mut() runs. "already borrowed: BorrowMutError" is RefCell's own error type — and panic!, not Result, is the response, exactly the understanding-errors lesson's rule for a problem that's the program's fault, not the input's. Holding two mutable borrows of one RefCell at the same time is always a bug; RefCell panics rather than let it slide.
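The same conflict can be observed without a panic. This sketch uses try_borrow_mut and try_borrow, standard-library methods this lesson doesn't otherwise cover, which report a conflict as a Result instead of panicking:

```rust
use std::cell::RefCell;

fn main() {
    let count = RefCell::new(0);

    // One writer is alive from here on.
    let first = count.borrow_mut();

    // While it lives, neither a second writer nor a reader is allowed.
    println!("second writer ok: {}", count.try_borrow_mut().is_ok()); // false
    println!("reader ok: {}", count.try_borrow().is_ok()); // false

    // Once the writer is dropped, the cell is free again.
    drop(first);
    println!("writer ok now: {}", count.try_borrow_mut().is_ok()); // true
}
```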

Putting them together: Rc<RefCell<T>>

One more layer, and the two ideas combine exactly the way their names suggest:
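A minimal sketch of the combined shape. The names tasks and tasks2 follow the walkthrough; the pushed task string is an assumption.

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Rc for multiple owners, RefCell for mutation through a shared handle.
    let tasks = Rc::new(RefCell::new(vec![String::from("Write lesson 33")]));
    let tasks2 = Rc::clone(&tasks);

    // Mutate through one handle...
    tasks2.borrow_mut().push(String::from("Review lesson 33"));

    // ...and read the result through the other.
    println!("via tasks: {:?}", tasks.borrow());
    println!("owners: {}", Rc::strong_count(&tasks)); // 2
}
```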

Rc::new(RefCell::new(...)): multiple owners (Rc) of one mutable cell (RefCell). tasks and tasks2 are two separate variables pointing at the same RefCell<Vec<String>>, created with Rc::clone, the same +1 as before. tasks2.borrow_mut().push(...) mutates through tasks2; tasks.borrow(), a different variable entirely, sees the result. Two names, one piece of data, genuinely shared and genuinely mutable.

Hold onto this exact shape. Rc<RefCell<T>> solves "multiple owners, one of them needs to mutate" for a single thread. The moment there's more than one thread, both halves get replaced with thread-safe versions — Arc instead of Rc, Mutex instead of RefCell — but the shape, Arc<Mutex<T>>, and even .lock() standing in for .borrow_mut(), will look immediately familiar.

Quick exercise

  1. Build the double-.borrow_mut() example and run it yourself, and confirm it compiles cleanly and panics only when run. Then change just the second line to count.borrow() (read-only) instead of count.borrow_mut(), and run it again. It still panics, with "already mutably borrowed: BorrowError" this time, because the rule isn't "two writers"; it's "a writer and anything else, at the same time." Two .borrow()s together, with no .borrow_mut() in the mix, are always fine: any number of readers at once is the other half of the same rule.
  2. In the Rc<T> example, wrap one of the clones in its own block, { let _temp = Rc::clone(&tasks); println!("inside: {}", Rc::strong_count(&tasks)); }, then print Rc::strong_count(&tasks) again afterward. Watch the count rise inside the block and fall back the moment _temp is dropped at the closing }: that is Rc's cleanup, tied to the same scope-based Drop timing as everything dropped since the ownership lesson.

Three new types, but really one idea, told three ways. The ownership lesson said: one owner, and Rust tracks it at compile time. Box<T> doesn't change that — it just moves the value to the heap, so a fixed-size handle can stand in for something whose size isn't known until runtime (dyn Error, among others). Rc<T> relaxes "one owner" to "one or more owners, counted." RefCell<T> relaxes "the references-and-borrowing lesson's rule, checked at compile time" to "the same rule, checked at runtime — and panicked on, if it's broken." Rc<RefCell<T>> is both relaxations, stacked.

The next lesson picks up Rc<RefCell<T>> and asks one new question: what if "multiple owners" means "multiple threads," running at the same time, for real? Threads, message passing, and the thread-safe cousins of everything just covered — Arc and Mutex — are next.