Strings

Why a type that looks this simple turns out to be the most genuinely subtle one in the language: building and combining strings, and the real reason Rust refuses to let you write s[0] — a refusal that is protecting you from a bug you have not met yet.

12 min read · 4 learning objectives

What You'll Learn

  • Build, grow, and combine Strings with push_str, push, the + operator, and format!
  • Explain — in your own words — why Rust will not let you index a String with an integer
  • Tell bytes, Unicode scalar values (chars), and grapheme clusters apart, and know which one everyday code actually wants
  • Slice strings safely along character boundaries, and recognize why slicing into the middle of one panics on purpose

Quick honesty check: has it struck you, this whole course, how often String and &str show up side by side, and how the difference between them keeps quietly mattering? That's not an accident, and it isn't Rust being needlessly fussy. Strings are where a genuinely interesting design decision lives — one most languages never ask you to think about, mostly by getting it subtly wrong. This lesson is about slowing down long enough to actually understand it, instead of pattern-matching your way past it forever.

A 30-second recap, since we're about to lean on it

String is a growable, heap-allocated, owned string: your data, on your terms, free to change. &str, a "string slice", is a borrowed view into string data that lives somewhere else. It is the same relationship a slice has to the vector underneath it, which is how you met this idea back in the slice type lesson. String literals like "hello" are &str; reach for String the moment you need an owned, growable copy of your own.
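
As a quick refresher in code, here is a minimal sketch of the two types side by side. The function name shout is made up for illustration; the point is only that a parameter typed &str accepts a literal and a borrowed String alike.

```rust
// Hypothetical helper: takes a borrowed view, hands back a new owned String.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let literal: &str = "hello";               // borrowed string slice
    let mut owned: String = String::from(literal); // owned, growable copy
    owned.push('!');

    // &owned is a &String, which coerces to &str at the call site.
    println!("{}", shout(literal)); // HELLO
    println!("{}", shout(&owned));  // HELLO!
}
```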

Building and growing strings

Several roads, same destination: String::new() for empty, .to_string() on anything that knows how to display itself (string literals included), or String::from(...) — which you've been typing since lesson one. All three hand you back an owned, growable String; which one you reach for is purely a matter of whatever reads most naturally at the call site.
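
A small sketch of those three roads, assuming nothing beyond the standard library. The function name three_ways is hypothetical and exists only so the results can be compared.

```rust
// Hypothetical helper: builds a String three different ways.
fn three_ways() -> (String, String, String) {
    let empty = String::new();            // empty, ready to grow later
    let shown = "hello".to_string();      // works on anything implementing Display
    let from = String::from("hello");     // explicit conversion from &str
    (empty, shown, from)
}

fn main() {
    let (empty, shown, from) = three_ways();
    println!("{:?} {:?} {:?}", empty, shown, from); // "" "hello" "hello"

    // to_string is not limited to text: numbers display themselves too.
    println!("{}", 42.to_string()); // 42
}
```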

.push_str() appends a string slice — and notice it takes &str, not String; there's no reason to demand ownership of text you're merely copying in. .push() appends a single character. And then there's +, which looks like plain concatenation and is secretly a method call wearing a trench coat: s1 + &s2 desugars to s1.add(&s2), and add's signature takes self by value. Which means — if "by value" is ringing every bell it should be ringing by now — s1 gets moved into that call, and is gone the moment it returns. The + operator was never free. Rust just refuses to hide the cost from you.
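
Here is one way to watch all three in action. The helper names build and concat are made up for this sketch; the commented-out line marks the spot where the compiler would object.

```rust
// Hypothetical helper: grows one String in place.
fn build() -> String {
    let mut s = String::from("foo");
    let extra = "bar";
    s.push_str(extra); // borrows `extra`; it stays usable afterwards
    s.push('!');       // one char, written with single quotes
    s
}

// Hypothetical helper: shows what `+` moves and what it only borrows.
fn concat() -> (String, String) {
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");
    let s3 = s1 + &s2; // s1 is moved into add(); s2 is merely borrowed
    // println!("{s1}"); // would not compile: s1 was moved above
    (s3, s2) // s2 is still ours to hand back
}

fn main() {
    println!("{}", build()); // foobar!
    let (joined, s2) = concat();
    println!("{joined} / {s2}"); // Hello, world! / world!
}
```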

format!: combining strings without losing any of them

Chain enough +s together and the result turns unreadable fast, and quietly consumes the owned String on its left side at every step. format! sidesteps both problems in one move: it reads exactly like println!, and (this is the important part) it only borrows its arguments, the very same captured-identifier borrowing you've been using in every println! since lesson one. Every String you pass in is exactly as usable after format! as it was before. Reach for it the moment you're combining more than two pieces of text, which, in honest practice, is most of the time.
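
A minimal sketch of that borrowing, using three illustrative variables named tic, tac, and toe. The wrapper function tic_tac_toe is hypothetical; it returns tic alongside the result to show that nothing was moved.

```rust
// Hypothetical helper: joins three Strings and proves the first survives.
fn tic_tac_toe() -> (String, String) {
    let tic = String::from("tic");
    let tac = String::from("tac");
    let toe = String::from("toe");

    let joined = format!("{tic}-{tac}-{toe}"); // borrows all three

    (joined, tic) // tic was never moved, so returning it is fine
}

fn main() {
    let (joined, tic) = tic_tac_toe();
    println!("{joined}"); // tic-tac-toe
    println!("{tic}");    // tic
}
```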

The big one: why can't you just write s[0]?

Here's where strings stop looking like "a list of characters" and the real design decision finally comes into view. In plenty of languages, s[0] hands you the first character without a second thought. Try the same thing in Rust, and the compiler stops you cold. That is not a bug and not an oversight; it is a deliberate refusal to answer a question that doesn't have one single honest answer.

"Hola" is four characters and four bytes; so far, perfectly intuitive. But "Здравствуйте" ("hello," in Russian) is twelve characters... and twenty-four bytes. Every one of those Cyrillic letters takes two bytes to encode in UTF-8, the encoding every String in Rust guarantees, always, with no exceptions. So "what's at index 0?" doesn't have a single honest answer. The first byte? Useless on its own: it's only half a character, and printing it alone would produce garbage. The first character? A perfectly reasonable thing to want, but indexing promises constant time for every position, not just the first. Because characters vary in width, finding the character at position n means walking the string from the start, decoding as it goes, just to discover where that character begins.

So Rust simply declines to guess on your behalf:
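
What follows is a sketch of that refusal, and of why an honest answer costs a walk through the string. The indexing line is commented out so the sketch still compiles, and the error wording is paraphrased from a recent compiler, so yours may phrase it differently. byte_offset_of_nth_char is a hypothetical helper written for this example, not a standard library function.

```rust
/// Hypothetical helper: byte offset where the n-th char starts.
/// It has to walk from the front, so it costs O(n), not O(1).
fn byte_offset_of_nth_char(s: &str, n: usize) -> Option<usize> {
    s.char_indices().nth(n).map(|(offset, _)| offset)
}

fn main() {
    let hola = String::from("Hola");
    let hello = String::from("Здравствуйте");

    println!("{} {}", hola.len(), hello.len()); // 4 24

    // Same position, different byte offsets, because widths differ.
    println!("{:?}", byte_offset_of_nth_char(&hola, 2));  // Some(2)
    println!("{:?}", byte_offset_of_nth_char(&hello, 2)); // Some(4)

    // let first = hello[0];
    // ^ rejected at compile time (error E0277), worded roughly as:
    //   the type `str` cannot be indexed by `{integer}`
}
```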

That error is the compiler being precise about exactly what it's refusing to invent on your behalf — a single answer to a question that is secretly three different questions, wearing the same trench coat.

Three honest ways to look at the same string

Instead of guessing, Rust simply asks you to say which of those three things you actually mean: raw bytes (exactly how the data sits in memory — what .bytes() walks), Unicode scalar values (the closest real equivalent to "characters" — what .chars() walks, and what Rust calls a char), or grapheme clusters (what a human eye would point at on screen and call "one letter" — which, for some scripts, doesn't even line up with .chars() either, and is genuinely its own rabbit hole that the standard library deliberately leaves to outside crates). For the overwhelming majority of everyday code, .chars() is the one you want:
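
As a small illustration, take the first two letters of the Russian greeting, "Зд", and look at them both ways. The helper name views is hypothetical; .bytes() and .chars() are the standard iterators described above. Grapheme clusters are left out because they need an outside crate.

```rust
// Hypothetical helper: collects both views of the same string.
fn views(s: &str) -> (Vec<u8>, Vec<char>) {
    (s.bytes().collect(), s.chars().collect())
}

fn main() {
    let (bytes, chars) = views("Зд");
    println!("{chars:?}"); // ['З', 'д']
    println!("{bytes:?}"); // [208, 151, 208, 180]
}
```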

Two characters. Four bytes. Both views are completely honest about what they are — which is exactly why Rust makes you pick one instead of quietly picking for you.

Slicing strings: the same idea, with sharper edges

You can absolutely still slice a String — using the exact range syntax from the slice-type lesson — but the range is measured in bytes, and both ends have to land squarely on character boundaries. Slice cleanly between characters and you get back precisely the substring you'd expect. Slice into the middle of one, and the program panics rather than quietly handing you something corrupted. It's the same philosophy you've now watched show up again and again in this language: a sharp edge that other languages let you walk straight into, turned into something Rust forces you to face — on your terms, at a moment of your own choosing, instead of three time zones and one very confusing bug report later.
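
Here is a sketch of both outcomes, with the panicking line commented out so the program runs to the end. It also shows two standard library tools for checking first: is_char_boundary, and get, which returns None where range indexing would panic. The wrapper safe_slice is a hypothetical name used only for this example.

```rust
// Hypothetical helper: a non-panicking slice built on str::get.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end) // None on a bad boundary or an out-of-range index
}

fn main() {
    let hello = "Здравствуйте";

    let s = &hello[0..4]; // bytes 0..4 cover two whole chars
    println!("{s}"); // Зд

    // let bad = &hello[0..1]; // compiles, but panics at runtime:
    //                         // byte 1 falls inside 'З'

    println!("{}", hello.is_char_boundary(1)); // false
    println!("{}", hello.is_char_boundary(2)); // true
    println!("{:?}", safe_slice(hello, 0, 1)); // None
    println!("{:?}", safe_slice(hello, 0, 4)); // Some("Зд")
}
```

Reaching for get makes sense when the byte offsets come from somewhere you don't control, such as user input; plain range slicing is fine when you already know the offsets sit on boundaries.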

Quick exercise

Write a function fn count_chars(input: &str) -> usize that returns the number of Unicode scalar values in a string slice — then call it on both "Hello" and "Здравствуйте" (and on a string with an emoji or two, if you're feeling adventurous), and compare what you get against .len() on the same inputs. Watching those two numbers diverge with your own eyes will do more for this lesson sinking in than any amount of explanation ever could.

A type that looks simple turning out to be subtle is, genuinely, one of the more reliable signs that you're starting to think the way Rust wants you to — not "what's the shortcut," but "what's actually true here, and what does the type system need to guarantee so I can never quietly forget it." You've now met the two collections that show up absolutely everywhere in real Rust code. Time for the third — the one built specifically for the question "do I already have one of these, and if so, what was it paired with?": hash maps.