'lvalue' considered harmful
Posted on 06 June 2010
If there is one term I would like to erase from computing terminology forever, it is ‘lvalue’. It is a confusing and distracting term, and far from making a discussion more precise, it makes it less so.
The problem is that there are multiple competing definitions for lvalue.
1. A storage location: The C Standard definition
A C lvalue is an expression which refers to a location which can store a value.
This is the definition I am most familiar with. Given the C code:
int a[30];
int i = 3, j = 5, k = 10;
a[i] = k;
the following are all lvalues:
a i j k a[i] a[j] a[k]
because every one of these expressions refers to a memory location which holds a value. The standard suggests in a footnote that it is helpful to think of “lvalue” as short for locator value.
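One way to see this definition at work is that each of these expressions can be an operand of the address-of operator `&`, which only accepts expressions that designate a location. The following is a minimal sketch reusing the variables from the example; the function name `count_locations` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts how many of the example expressions
   yield a non-null address, i.e. designate a storage location. */
int count_locations(void)
{
    int a[30] = {0};
    int i = 3, j = 5, k = 10;

    /* Every expression from the list is an operand of & here. */
    const void *locations[] = {
        (const void *)&a, &i, &j, &k, &a[i], &a[j], &a[k]
    };
    size_t n = sizeof locations / sizeof locations[0];

    int count = 0;
    for (size_t idx = 0; idx < n; idx++) {
        if (locations[idx] != NULL)
            count++;
    }

    /* a[i] designates the same location as *(a + i). */
    assert(&a[i] == a + i);

    /* By contrast, &(i + j) or &42 would not compile:
       those expressions are values, not locations. */
    return count;
}
```

All seven expressions pass through `&`, while an arithmetic result such as `i + j` has no location to take the address of.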
2. Something which can be assigned to: The C Standard “modifiable lvalue”
In C, a modifiable lvalue is an expression which may appear on the left-hand side of an assignment.
Note that this is not contingent on there being such an assignment; only that assignment is possible. As a result, given the same C example code, the following are modifiable lvalues:
i j k a[i] a[j] a[k]
Notably missing is a: it is an lvalue but not a modifiable lvalue, because it refers to a location in memory but cannot appear as the left side of an assignment. In C, you cannot assign to a whole array, only to one element at a time.
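A short sketch of the difference, under the assumption of two arrays of the same size; the names `copy_array_demo`, `src` and `dst` are invented for this example. The commented-out line is the assignment a C compiler rejects.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies src into dst and returns dst[3]. */
int copy_array_demo(void)
{
    int src[30] = {0};
    int dst[30] = {0};

    src[3] = 10;          /* src[3] is a modifiable lvalue */

    /* dst = src; */      /* does not compile: dst is an lvalue,
                             but not a modifiable one */

    /* Whole-array copies have to go through the elements instead. */
    assert(sizeof dst == sizeof src);
    memcpy(dst, src, sizeof dst);

    return dst[3];
}
```

The array as a whole still has a location (`sizeof dst` and `&dst` are both fine); it simply cannot be the target of `=`.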
In contexts outside of C, this is often simply referred to as an lvalue, but when talking about C you should stick to Standard terminology to be as precise as possible. Especially if you’re asking questions on comp.lang.c.
3. Something on the left of an assignment: the “left-hand side” definition
This definition is of a totally different character to the C Standard definitions. In the Perl code:
my $array = [];
my $i = 3;
my $j = 5;
$array->[$i] = $j++;
this definition states that the only lvalue is $array->[$i] because it’s on the left-hand side. $array->[$i] is an lvalue only in the context of this assignment; in another assignment (say, $foo = $bar), $foo is the lvalue and $array->[$i] no longer is.
An obvious problem in this example is that $array->[$i] is not the only expression which gets written to: $j is also written to by the autoincrement operator.
I believe when people use this definition, they really mean definitions 4 or 2 but are using sloppy language. Nevertheless it is a definition which people use.
4. Something which is written to
This is similar to the previous definition, but states more precisely that an lvalue is something which is written to in an expression. In the above example, there are two lvalues: $array->[$i] and $j, both of which are written to as a result of the code. I like it more than definition 3 for this reason, but you can start to see how thinking of ‘lvalue’ and ‘rvalue’ as analogous to ‘left’ and ‘right’ breaks down.
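The same situation is easy to reproduce in C. The sketch below is a rough C analogue of the Perl snippet, with a made-up function name `two_writes`; it reports the state of both locations after the single assignment statement has run.

```c
#include <assert.h>

/* Hypothetical helper: performs a[i] = j++ and reports, through the
   out-parameters, the contents of both locations afterwards. */
void two_writes(int *stored, int *j_after)
{
    int a[30] = {0};
    int i = 3;
    int j = 5;

    assert(stored != 0 && j_after != 0);

    a[i] = j++;   /* writes a[i] (left of =) and j (right of =) */

    *stored = a[i];
    *j_after = j;
}
```

After the call, the value stored in `a[i]` is the old value of `j`, and `j` itself has changed too, even though it only ever appeared on the right of the `=`.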
Mutual incompatibility
These four definitions are mutually incompatible. The biggest problem is that definitions #1 and #2 are absolute, while #3 and #4 are contingent on a particular assignment. Under #3 and #4, a variable can be an lvalue in one line of code, but on the next line it is not an lvalue because it is not written to. Under #1, a variable is always an lvalue because it always refers to a location in memory. Under #2, a variable is always an lvalue unless it is of some type unsuitable for assignment — a constant, for example.
In other words, definition #2 says an lvalue can appear on the left-hand side of an assignment, while definition #3 says an lvalue does appear on the left-hand side of an assignment.
Definitions #1 and #2 are incompatible in programming languages which have storage locations which cannot be assigned to. Such a location is a #1-lvalue but not a #2-lvalue. The chief problem requiring separate definitions in C is the existence of array types, which cannot be directly assigned to (structures, by contrast, have been assignable as a whole since C89). In modern programming languages, arrays can typically be assigned, so this problem is moot.
The other main type of location which cannot be assigned to is a variable with constant type. There are many languages in which you can declare variables constant. A reasonable modification can be made to definition 2 to say “an lvalue is an expression which could be on the left-hand side of an assignment, ignoring constant qualifiers”; but then there exist lvalues which cannot be assigned to. Nevertheless outside of the context of C there are situations in which it is reasonable to conflate #1 and #2.
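In C terms, a `const`-qualified variable is exactly such a case: it has a location, so it satisfies definition 1, but it is not a modifiable lvalue. A minimal sketch, with the illustrative names `read_constant` and `limit`:

```c
#include <assert.h>

/* Hypothetical helper: reads a const-qualified object via its address. */
int read_constant(void)
{
    const int limit = 30;

    /* limit is an lvalue in the sense of definition 1:
       it names a location, so & applies to it. */
    const int *where = &limit;

    /* limit = 31; */   /* does not compile: limit is not a
                           modifiable lvalue (definition 2) */

    assert(where == &limit);
    return *where;
}
```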
A survey of definitions
I did a search of various programming resources I know for definitions of lvalue. Here is what I came up with:
- Strachey’s seminal paper, in which he coins the term L-value: “An L-value represents an area of the store of the computer.” Pretty explicitly #1.
- Wikipedia: “A value (computer science) that has an address.” #1 as of time of writing.
- Wiktionary: “A value that can be treated as an address or storage location.” #1 as of time of writing.
- Computer Programming and Precise Terminology, a Dr Dobbs article: #1.
- C Standard (a free draft is available online): #1 (and #2 for modifiable lvalue).
- FOLDOC: “A reference to a location, an expression which can appear as the destination of an assignment operator indicating where a value should be stored.” #1 and #2 in the same sentence.
- about.com: #3 (top Google hit for “lvalue definition”).
- c2.com: direct quotation of Strachey; #1.
- MSDN: “An lvalue refers to an object that persists beyond a single expression. You can think of an lvalue as an object that has a name.” #1.
- perldoc perlglossary: #2.
- The Ruby Programming Language, page 92: #2.
- The Camel Book, 3rd edition, page 52: #2.
I must be honest here and say I fully expected more instances of definitions #3 and #4 than I found. I have come across people using #3 and #4 on mailing lists and in fora often enough for the term to be a source of confusion; in fact, just today I saw somebody using definition 4. The resemblance of ‘lvalue’ and ‘rvalue’ to ‘left’ and ‘right’ encourages people to think of lvalues as something to do with assignment, rather than as something to do with persistent storage locations.
In any case, in my experience there are enough people confusing the term ‘lvalue’ that I feel that it should not be used at all. I would instead recommend the term ‘location’. If you must use the term lvalue, please use definition #1.
On rvalues
I haven’t mentioned the term rvalue at all here. It doesn’t get used nearly as much, for which I am thankful; I believe the reason is that the term value does the same job much better, or, where more precision is required, value of an expression.