One of the more hotly debated topics among software developers these days is whether to adopt functional programming. People who ponder the question have presented good arguments for adopting (or avoiding) it. But the term “functional programming” is a bit slippery, and often when developers say functional programming they’re actually referring to a few different ideas. One of the more important of these is immutability of data by default in the language. But as sometimes happens, people can load a term with meanings unrelated to the original. This makes it harder to discuss technical merits and drawbacks because people don’t agree on exactly what they mean. Sometimes it’s helpful in debate and discussion to spend some time clarifying exactly what we mean by certain terms.
Mutability, and its converse immutability, are two terms that tend to be debated in a less-than-rigorous fashion. In this post, I will endeavor to explain what these terms mean in practical terms. I also hope to give those making decisions about technology a bit of insight into why immutability is an important concept to understand.
It’s difficult to define the absence of something, so I will first define mutability; it will then be easier to discuss what we mean by its opposite.
When all the other abstractions are stripped away, a computer doesn’t know anything except 1 and 0. Either a particular bit is on or it’s off. Creating software is an extremely taxing mental activity and anything that can be done to lighten the load for the software developer is a good thing. One way to lighten the load is to allow the developer to think of things in abstract terms. If a developer can think of a value stored in memory somewhere as opposed to thinking of a series of 1s and 0s at some particular circuit, it’s far easier for the developer to focus on more important concerns. Hence very early in the history of software development the abstraction of a variable was created. A variable was simply a way of naming a memory location to make it easier for a developer to reason about it. It’s the difference between saying “memory location 3000 has the value 25 stored in it” and saying “age equals 25” where age is defined as memory location 3000. To the computer, they’re effectively the same thing but to the software developer the latter is far easier to understand.
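The article doesn’t tie itself to any particular language, but the “memory location 3000” example above can be sketched as a toy model in Python (my choice here purely for illustration; the location numbers and the `AGE` name are hypothetical):

```python
# Toy model (not real hardware): memory as a table of numbered locations.
memory = {}
memory[3000] = 25          # "memory location 3000 has the value 25 stored in it"

AGE = 3000                 # the variable abstraction: a name for location 3000
memory[AGE] = 25           # "age equals 25" - the same operation, easier to reason about
print(memory[AGE])
```

To the machine both statements do the same thing; the name exists only to lighten the developer’s mental load.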
Now, at the hardware level, any memory location can be altered at any time unless something in the hardware specifically prevents it. That is, absent such a mechanism, my software can alter “age” to be 26 pretty much anywhere it cares to.
To give a more specific definition, mutability is simply the ability of software to overwrite any memory location at any time with a new value.
Now, a developer has to be able to store a new value over an existing value at least once. That is, if he or she could never write a value to a memory location, then everything in memory would remain 0 and we couldn’t get anything accomplished. So, to dig a bit deeper: mutability is the ability to overwrite a memory location that has already been given a value.
That is to say if I store 25 at memory location 3000, with mutability I can later overwrite memory location 3000 with 26 (or anything else I care to).
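Continuing the toy memory model from earlier (again just a sketch in Python, with an illustrative location number), mutability is the second write:

```python
# Location 3000 has already been set to 25...
memory = {3000: 25}

# ...and mutability means we may overwrite it later with 26, or anything else.
memory[3000] = 26
print(memory[3000])
```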
So then if mutability is the ability to store a new value in a memory location that has already been set to a non-default value then what is immutability? Immutability is simply the property that a memory location can be set one time and then never altered again.
That is to say: if mutability is being able to set age to 26 after it’s initially been set to 25, then immutability prevents age from ever being changed after that initial assignment. The memory location can be given an initial value, but once it has one, the software is not permitted to alter it.
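Python doesn’t expose raw immutable memory, but a read-only mapping view can stand in for it in our toy model (a sketch, not a claim about how any real language enforces immutability; note that `MappingProxyType` is only a read-only *view* of the underlying dict):

```python
from types import MappingProxyType

# The location is given its initial value once; the proxy forbids further writes.
memory = MappingProxyType({3000: 25})

try:
    memory[3000] = 26      # attempt to mutate the already-set location
except TypeError:
    print("write rejected: this memory is immutable")
```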
So Why Should I Care?
So maybe now you’re saying to yourself: OK, I get it. With mutability I can keep changing a value anytime I want; with immutability, I can only set a value one time. Why is this important?
There are a few reasons this is worth discussing. I’ll cover the ones I’m aware of, though this is hardly an exhaustive list.
It’s easier to reason about immutable code
Every separate detail that a developer has to keep track of in his or her head is a form of mental tax. Psychologists have determined that, on average, the typical human being can keep 5 to 7 distinct items in short-term memory at any given time. The more details a developer has to remember, the more likely it is that he or she will forget one. Not having to worry about a value being changed after it’s been set is just one less bit of mental taxation.
It’s easier to prevent accidental changes if the data is immutable
Many of the new exploits you hear about involve software writing to memory it was never intended to write to. Buffer overflow bugs, for example, are errors in which software writes to memory that was not meant to be written. While no system devised by human developers will ever be perfect, a system that cannot write to a memory location once it has been set is less likely to suffer from these sorts of security exploits.
Beyond even these security exploits, there’s the simple matter of the number of bugs arising from inadvertent modification of memory. Anyone who’s ever coded in C has stories of a bad pointer overwriting memory, precisely because these kinds of bugs are hard to reproduce and hard to isolate. Imagine people entering their names, and a name being stored in the wrong location in memory: each different name entered has the potential to produce a different behavior, and therefore a different bug. Default immutability lessens this possibility.
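The wrong-location bug above can be sketched in Python (the names and the off-by-one index are invented for illustration). With a mutable list, the bad write silently corrupts data; with an immutable tuple, the same mistake fails loudly:

```python
def store_name(buffer, index, name):
    buffer[index] = name          # a typo'd index clobbers someone else's data

names = ["Ada", "Grace"]
store_name(names, 0, "Edsger")    # oops: meant index 1 - "Ada" is silently lost
print(names)

frozen = ("Ada", "Grace")
try:
    frozen[0] = "Edsger"          # the same bug against immutable data...
except TypeError:
    print("...fails loudly instead of silently corrupting data")
```

A bug that announces itself at the point of the bad write is far easier to isolate than one that surfaces later as mysterious behavior.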
It’s easier to execute code in parallel when the data is immutable
This is one of the major reasons for the current debate and discussion around functional programming. Executing code in parallel is likely the way forward for running our software faster. However, when code executes in parallel, the number of possible writes to a given piece of data is multiplied by the number of processes accessing it. Developers therefore have to modify their code to prevent multiple processes from accidentally writing the same data. Making data immutable neatly side-steps this issue: if you can’t write to existing memory, you can’t accidentally modify existing memory.
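As a small sketch of that side-step, here several Python threads read a shared immutable tuple with no locking at all; because nothing can write `CONFIG`, concurrent reads are safe by construction, and each thread writes only its own result slot (all names here are hypothetical, and CPython threads are used purely for illustration):

```python
import threading

CONFIG = ("alpha", 2)              # immutable data shared by every thread

results = [None] * 4               # each thread owns exactly one slot

def worker(i):
    label, factor = CONFIG         # concurrent reads need no locks:
    results[i] = i * factor        # nobody is able to write CONFIG

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

The coordination problem doesn’t vanish entirely (the threads still need distinct places to put their results), but no code is needed to guard the shared input.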
By the way, many thanks to Ms. Sarah Trenz for her invaluable feedback on this article.