The Stack
=========

In computer architecture, the stack is a hardware manifestation of the stack
data structure (a Last In, First Out queue).

In x86, the stack is simply an area in RAM that was chosen to be the stack -
there is no special hardware to store stack contents. The ``esp`` / ``rsp``
register holds the address of the most recently pushed value. This is
conventionally called the top of the stack, even though it is the lowest
address in use. When a value is pushed with ``push``, ``esp`` is decremented by
4 (or 8 on 64-bit x86), and the value is stored at that location in memory.
Likewise, when a ``pop`` instruction is executed, the value at ``esp`` is
retrieved (i.e. ``esp`` is dereferenced), and ``esp`` is then incremented by 4
(or 8).

**The stack "grows" down to lower memory addresses!**

Conventionally, ``ebp`` / ``rbp`` contains the address of the base of the
current **stack frame** (its highest address), and so local variables are often
referenced at an offset from ``ebp`` rather than from ``esp``. A stack frame is
essentially just the space used on the stack by a given function.

Uses
----

The stack is primarily used for a few things:

- Storing function arguments
- Storing local variables
- Saving processor state (such as the return address and saved registers)
  across function calls

Example
-------

Let's see what the stack looks like right after ``say_hi`` has been called in
this 32-bit x86 C program:

.. code-block:: c

    #include <stdio.h>

    void say_hi(const char * name) {
        printf("Hello %s!\n", name);
    }

    int main(int argc, char ** argv) {
        char * name;
        if (argc != 2) {
            return 1;
        }
        name = argv[1];
        say_hi(name);
        return 0;
    }

And the relevant assembly:

::

    0804840b <say_hi>:
     804840b:   55                      push   ebp
     804840c:   89 e5                   mov    ebp,esp
     804840e:   83 ec 08                sub    esp,0x8
     8048411:   83 ec 08                sub    esp,0x8
     8048414:   ff 75 08                push   DWORD PTR [ebp+0x8]
     8048417:   68 f0 84 04 08          push   0x80484f0
     804841c:   e8 bf fe ff ff          call   80482e0 <printf@plt>
     8048421:   83 c4 10                add    esp,0x10
     8048424:   90                      nop
     8048425:   c9                      leave
     8048426:   c3                      ret

    08048427 <main>:
     8048427:   8d 4c 24 04             lea    ecx,[esp+0x4]
     804842b:   83 e4 f0                and    esp,0xfffffff0
     804842e:   ff 71 fc                push   DWORD PTR [ecx-0x4]
     8048431:   55                      push   ebp
     8048432:   89 e5                   mov    ebp,esp
     8048434:   51                      push   ecx
     8048435:   83 ec 14                sub    esp,0x14
     8048438:   89 c8                   mov    eax,ecx
     804843a:   83 38 02                cmp    DWORD PTR [eax],0x2
     804843d:   74 07                   je     8048446 <main+0x1f>
     804843f:   b8 01 00 00 00          mov    eax,0x1
     8048444:   eb 1c                   jmp    8048462 <main+0x3b>
     8048446:   8b 40 04                mov    eax,DWORD PTR [eax+0x4]
     8048449:   8b 40 04                mov    eax,DWORD PTR [eax+0x4]
     804844c:   89 45 f4                mov    DWORD PTR [ebp-0xc],eax
     804844f:   83 ec 0c                sub    esp,0xc
     8048452:   ff 75 f4                push   DWORD PTR [ebp-0xc]
     8048455:   e8 b1 ff ff ff          call   804840b <say_hi>
     804845a:   83 c4 10                add    esp,0x10
     804845d:   b8 00 00 00 00          mov    eax,0x0
     8048462:   8b 4d fc                mov    ecx,DWORD PTR [ebp-0x4]
     8048465:   c9                      leave
     8048466:   8d 61 fc                lea    esp,[ecx-0x4]
     8048469:   c3                      ret

Skipping over the bulk of ``main``, you'll see that at ``0x8048452`` ``main``'s
``name`` local is pushed to the stack because it's the first argument to
``say_hi``. Then, a ``call`` instruction is executed. A ``call`` instruction
first pushes the address of the instruction that follows it (the return
address) to the stack, then jumps to its destination. So when the processor
begins executing ``say_hi`` at ``0x0804840b``, the stack looks like this:

::

    EIP = 0x0804840b (push ebp)
    ESP = 0xffff0000
    EBP = 0xffff002c

                0xffff0004: 0xffffa0a0   // say_hi argument 1
    ESP ->      0xffff0000: 0x0804845a   // Return address for say_hi

The first thing ``say_hi`` does is save the current ``ebp`` so that when it
returns, ``ebp`` is back where ``main`` expects it to be. The stack now looks
like this:

::

    EIP = 0x0804840c (mov ebp, esp)
    ESP = 0xfffefffc
    EBP = 0xffff002c

                0xffff0004: 0xffffa0a0   // say_hi argument 1
                0xffff0000: 0x0804845a   // Return address for say_hi
    ESP ->      0xfffefffc: 0xffff002c   // Saved EBP

Again, note how ``esp`` gets smaller when values are pushed to the stack.

Next, the current ``esp`` is copied into ``ebp``, marking the base of the new
stack frame.

::

    EIP = 0x0804840e (sub esp, 0x8)
    ESP = 0xfffefffc
    EBP = 0xfffefffc

                0xffff0004: 0xffffa0a0   // say_hi argument 1
                0xffff0000: 0x0804845a   // Return address for say_hi
    ESP, EBP -> 0xfffefffc: 0xffff002c   // Saved EBP

Then, the stack is "grown" by the two ``sub esp, 0x8`` instructions to reserve
space inside ``say_hi``'s frame. This is where local variables would live;
``say_hi`` declares none, so the reserved space here is most likely alignment
padding added by the compiler.

::

    EIP = 0x08048414 (push [ebp + 0x8])
    ESP = 0xfffeffec
    EBP = 0xfffefffc

                0xffff0004: 0xffffa0a0   // say_hi argument 1
                0xffff0000: 0x0804845a   // Return address for say_hi
    EBP ->      0xfffefffc: 0xffff002c   // Saved EBP
                0xfffefff8: UNDEFINED
                0xfffefff4: UNDEFINED
                0xfffefff0: UNDEFINED
    ESP ->      0xfffeffec: UNDEFINED

.. note::

    Stack space is not implicitly cleared!

Now, the 2 arguments to ``printf`` are pushed in reverse order.

::

    EIP = 0x0804841c (call printf@plt)
    ESP = 0xfffeffe4
    EBP = 0xfffefffc

                0xffff0004: 0xffffa0a0   // say_hi argument 1
                0xffff0000: 0x0804845a   // Return address for say_hi
    EBP ->      0xfffefffc: 0xffff002c   // Saved EBP
                0xfffefff8: UNDEFINED
                0xfffefff4: UNDEFINED
                0xfffefff0: UNDEFINED
                0xfffeffec: UNDEFINED
                0xfffeffe8: 0xffffa0a0   // printf argument 2
    ESP ->      0xfffeffe4: 0x080484f0   // printf argument 1

Then ``printf`` is called, which pushes the address of the next instruction to
execute in ``say_hi``.

::

    EIP = 0x080482e0
    ESP = 0xfffeffe0
    EBP = 0xfffefffc

                0xffff0004: 0xffffa0a0   // say_hi argument 1
                0xffff0000: 0x0804845a   // Return address for say_hi
    EBP ->      0xfffefffc: 0xffff002c   // Saved EBP
                0xfffefff8: UNDEFINED
                0xfffefff4: UNDEFINED
                0xfffefff0: UNDEFINED
                0xfffeffec: UNDEFINED
                0xfffeffe8: 0xffffa0a0   // printf argument 2
                0xfffeffe4: 0x080484f0   // printf argument 1
    ESP ->      0xfffeffe0: 0x08048421   // Return address for printf

Once ``printf`` has returned, ``add esp, 0x10`` releases the 16 bytes used by
the second ``sub`` and the two pushes. The ``leave`` instruction then copies
``ebp`` into ``esp`` and pops the saved ``ebp``.

::

    EIP = 0x08048426 (ret)
    ESP = 0xffff0000
    EBP = 0xffff002c

                0xffff0004: 0xffffa0a0   // say_hi argument 1
    ESP ->      0xffff0000: 0x0804845a   // Return address for say_hi

And finally, ``ret`` pops the saved instruction pointer into ``eip``, which
causes the program to return to ``main`` with the same ``esp``, ``ebp``, and
stack contents as just before the ``call`` to ``say_hi`` was executed.

::

    EIP = 0x0804845a (add esp, 0x10)
    ESP = 0xffff0004
    EBP = 0xffff002c

    ESP ->      0xffff0004: 0xffffa0a0   // say_hi argument 1
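
The walkthrough above can be replayed with a small simulation. The following is
a minimal sketch, not a real emulator: the ``struct machine`` type, the ``cell``
helper, ``trace_say_hi``, and the ``STACK_HIGH`` / ``STACK_WORDS`` constants are
hypothetical names invented for illustration. ``printf`` itself is not
simulated, on the assumption that its ``call`` / ``ret`` pair leaves ``esp``
unchanged.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical toy machine: three 32-bit registers plus a small window
       of "RAM" that ends just below STACK_HIGH. */
    #define STACK_HIGH  0xffff0040u  /* one past the highest simulated address */
    #define STACK_WORDS 64u

    struct machine {
        uint32_t esp, ebp, eip;
        uint32_t ram[STACK_WORDS];   /* ram[i] backs address STACK_HIGH - 4*(i+1) */
    };

    /* Map a simulated 4-byte-aligned address to its backing array slot. */
    static uint32_t *cell(struct machine *m, uint32_t addr)
    {
        assert(addr % 4 == 0 && addr < STACK_HIGH);
        uint32_t index = (STACK_HIGH - addr) / 4 - 1;
        assert(index < STACK_WORDS);
        return &m->ram[index];
    }

    /* push: decrement esp first, then store at the new esp. */
    static void push(struct machine *m, uint32_t value)
    {
        m->esp -= 4;
        *cell(m, m->esp) = value;
    }

    /* pop: load from esp first, then increment esp. */
    static uint32_t pop(struct machine *m)
    {
        uint32_t value = *cell(m, m->esp);
        m->esp += 4;
        return value;
    }

    /* call: push the return address, then jump to the target. */
    static void call(struct machine *m, uint32_t target, uint32_t return_addr)
    {
        push(m, return_addr);
        m->eip = target;
    }

    /* ret: pop the return address into eip. */
    static void ret(struct machine *m) { m->eip = pop(m); }

    /* leave: copy ebp into esp, then pop the saved ebp. */
    static void leave(struct machine *m) { m->esp = m->ebp; m->ebp = pop(m); }

    /* Replay the example, starting just before main pushes `name`. */
    struct machine trace_say_hi(void)
    {
        struct machine m = { .esp = 0xffff0008u, .ebp = 0xffff002cu,
                             .eip = 0x08048452u };

        push(&m, 0xffffa0a0u);               /* main:   push name           */
        call(&m, 0x0804840bu, 0x0804845au);  /* main:   call say_hi         */
        push(&m, m.ebp);                     /* say_hi: push ebp            */
        m.ebp = m.esp;                       /* say_hi: mov ebp, esp        */
        m.esp -= 0x10;                       /* say_hi: sub esp, 0x8 (x2)   */
        push(&m, *cell(&m, m.ebp + 8));      /* say_hi: push [ebp + 0x8]    */
        push(&m, 0x080484f0u);               /* say_hi: push format string  */
        /* printf is skipped: assumed to return with esp unchanged. */
        m.esp += 0x10;                       /* say_hi: add esp, 0x10       */
        leave(&m);                           /* say_hi: leave               */
        ret(&m);                             /* say_hi: ret                 */
        return m;
    }

Calling ``trace_say_hi()`` and inspecting the returned registers gives
``eip = 0x0804845a``, ``esp = 0xffff0004``, and ``ebp = 0xffff002c``, leaving
``esp`` pointing at the ``say_hi`` argument that ``main`` pushed.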