Demystifying Lvalue references
Hello everyone, here’s an interesting piece of code. The following program prints 10 when it is run.
#include <iostream>
using std::cout;

int globalvar = 20;

int& foo()
{
    return globalvar;
}

int main()
{
    foo() = 10;
    cout << globalvar;
    return 0;
}
The interesting line in the above program is line 13, where a function call appears on the left-hand side of an assignment:
foo() = 10;
If you look closely, this assignment is possible because the function foo returns a reference to an integer. If, on line 6, the function were instead declared to return a plain int by value:
int foo()
then gcc would report an error at line 13:
error: lvalue required as left operand of assignment
So without further ado, let's dig in to see what's happening under the hood. We'll use gdb to debug and disassemble the program.
(gdb) break main
Breakpoint 1, main () at ref.cpp:13
13          foo() = 10;
(gdb) disassemble
Dump of assembler code for function main():
   0x080485ae <+0>:     push   %ebp
   0x080485af <+1>:     mov    %esp,%ebp
   0x080485b1 <+3>:     and    $0xfffffff0,%esp
   0x080485b4 <+6>:     sub    $0x10,%esp
=> 0x080485b7 <+9>:     call   0x80485a4 <foo()>
   0x080485bc <+14>:    movl   $0xa,(%eax)
   0x080485c2 <+20>:    mov    0x8049988,%eax
   0x080485c7 <+25>:    mov    %eax,0x4(%esp)
   0x080485cb <+29>:    movl   $0x80499a0,(%esp)
   0x080485d2 <+36>:    call   0x8048474 <_ZNSolsEi@plt>
   0x080485d7 <+41>:    mov    $0x0,%eax
   0x080485dc <+46>:    leave
   0x080485dd <+47>:    ret
End of assembler dump.
In the above disassembly, execution is currently at instruction 0x080485b7 (marked with =>), where main calls the function foo.
0x080485b7 <+9>: call 0x80485a4 <foo()>
We step into foo, which is located at address 0x80485a4, and disassemble again.
(gdb) step
foo () at ref.cpp:8
8           return globalvar;
(gdb) disassemble
Dump of assembler code for function foo():
   0x080485a4 <+0>:     push   %ebp
   0x080485a5 <+1>:     mov    %esp,%ebp
=> 0x080485a7 <+3>:     mov    $0x8049988,%eax
   0x080485ac <+8>:     pop    %ebp
   0x080485ad <+9>:     ret
End of assembler dump.
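We see that inside the function, the address 0x8049988 itself is moved into the eax register. The $ prefix in AT&T syntax marks an immediate operand, so nothing is read from memory here; eax simply ends up holding the address.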
0x080485a7 <+3>: mov $0x8049988,%eax
The address 0x8049988 is where the global variable globalvar lives. To verify, we inspect the value stored at that address, and then check what eax holds once the function body has executed.
(gdb) x/d 0x8049988
0x8049988 <globalvar>:  20
(gdb) next
9       }
(gdb) info registers eax
eax            0x8049988        134519176
Once the function returns, the next instruction is where the magic of references unfolds:
0x080485bc <+14>: movl $0xa,(%eax)
This is what makes the assignment possible. The instruction stores the immediate value 10 (0xa) at the memory location pointed to by eax, which is nothing but the address of globalvar. In C terms, this statement is similar to
*eax = 10; //eax = &globalvar
We can verify that the value of globalvar has now been changed to 10.
(gdb) x/d 0x8049988
0x8049988 <globalvar>:  10
The instructions from 0x080485c2 onward load this value from globalvar and pass it to cout's operator<< to print it on the console.
Now that we have demystified lvalue references, can you think of a place where such a scenario is used in C++? We come across this subtle piece of code almost daily but fail to notice it. The answer is simple: we see it whenever the [] operator is overloaded, as it is for the STL map and vector. For instance, for a vector of doubles, the operator[] method would look roughly like this:
double& operator[](int i) { return elem[i]; }
Returning by reference is what makes assignments through the subscript, of the form v[i] = x, possible.
References and Further Reading
http://eli.thegreenplace.net/2011/12/15/understanding-lvalues-and-rvalues-in-c-and-c
https://isocpp.org/wiki/faq/references#assigning-refs
http://accu.org/index.php/journals/227