* x86 and linux stack layout
@ 2004-11-21 13:33 Daniel Souza
2004-11-21 15:13 ` Justinas
2004-11-21 19:08 ` Glynn Clements
0 siblings, 2 replies; 7+ messages in thread
From: Daniel Souza @ 2004-11-21 13:33 UTC (permalink / raw)
To: linux-c-programming
Hi everybody
can anyone explain to me how the x86 stack works ? like...
the stack starts at 0xbfffe000, growing forward, at the start of
the main() call (or another elf session that starts after main()
and initializes the argc, argv and envp args), and after
every CALL it modifies the EBP and ESP doing :
and after a RET call, it does:
and differences between JMP, LONGJMP and CALL,
what registers they change, etc.
And so, what do function arguments look like on the stack, for
example, when a function like
int foo (u_long boo, char *moo, char loo) {}
is called, what do its arguments look like on the stack ?
i know that there will be a 4-byte long integer, another 4-byte
pointer (32b) and a 1-byte char, in reverse order. Will the
stack pointer be increased (or decreased) by 9 bytes, that
is, the sum of all the argument type lengths ?
When a function returns, where is its result stored ?
If I make a lot of function calls, the stack position
of each call needs to be stored somewhere (like a backtrace)... where
is it stored ?
what are stack frames ? what's the relation between ESP and EBP ?
What do those ELF sessions that are run before main() do ? what
happens internally
when main() returns ? like, execute another elf session like .dtors
and try to return the return code to the OS, as the return of an execve() for
example. Is that right ?
Thanks a lot =)
Daniel
--
# (perl -e 'while (1) { print "\x90"; }') | dd of=/dev/war
* Re: x86 and linux stack layout
2004-11-21 13:33 x86 and linux stack layout Daniel Souza
@ 2004-11-21 15:13 ` Justinas
2004-11-21 19:08 ` Glynn Clements
1 sibling, 0 replies; 7+ messages in thread
From: Justinas @ 2004-11-21 15:13 UTC (permalink / raw)
To: Daniel Souza, linux-c-programming

On Sun, 21 Nov 2004 10:33:53 -0300
Daniel Souza <thehazard@gmail.com> wrote:

> Hi everybody

Hello

I'll try to explain some details, as far as I remember.

> can anyone explain to me how the x86 stack works ? like...

I think you know the principle of how a stack works ;] First In, Last Out.
There are two main instructions for working with it (in assembly language):
push and pop.

Let's say we have esp=0x00000100. When we execute

    push eax

the processor actually performs two steps:

    sub esp,4        ;decreases the stack pointer by 4
    mov [esp], eax   ;stores the value of eax at the memory
                     ;location that esp points to
                     ;now esp=0x000000fc

Popping a value from the stack (pop eax) does the reverse:

    mov eax,[esp]    ;loads the value from the top of the stack
    add esp,4        ;increases esp by 4 (the size of the eax register)
                     ;now esp=0x00000100

> the stack starts at 0xbfffe000, growing forward, at the start of
> the main() call (or another elf session that starts after main()
> and initializes the argc, argv and envp args), and after
> every CALL it modifies the EBP and ESP doing :

Let's say we have a C function (system word = 4 bytes):

    void do_smth(int a, char b);

When calling this function, the caller does two pushes:

    push b;
    push a;

Consider the stack before the pushes, esp=0x00000100 (0100 for simplicity):

      0000|           |
      0001|           |
      0002|           |
      ....|           |
      ....|           |
      00FD|           |
      00FE|           |
      00FF|___________|
esp-> 0100

After the pushes, esp=0x00F8:

      0000|           |
      0001|           |
      0002|           |
      ....|           |
      ....|           |
esp-> 00F8|     a     |
      00FC|_____b_____|
      0100

After that the caller calls the function, which knows that the parameters
it needs are at the top of the stack (just above the return address that
the call itself pushes).
> and after a RET call, it does:
>
> and differences between JMP, LONGJMP and CALL,
> what registers they change, etc.
>
> And so, what do function arguments look like on the stack, for
> example, when a function like
> int foo (u_long boo, char *moo, char loo) {}
> is called, what do its arguments look like on the stack ?
>
> i know that there will be a 4-byte long integer, another 4-byte
> pointer (32b) and a 1-byte char, in reverse order. Will the
> stack pointer be increased (or decreased) by 9 bytes, that
> is, the sum of all the argument type lengths ?
>
> When a function returns, where is its result stored ?

Usually in eax, or in edx:eax. But it depends:
http://weblogs.asp.net/oldnewthing/archive/2004/01/02/47184.aspx
http://weblogs.asp.net/oldnewthing/archive/2004/01/07/48303.aspx
and good old Google, keyword: calling convention :)

> If I make a lot of function calls, the stack position
> of each call needs to be stored somewhere (like a backtrace)... where
> is it stored ?
>
> what are stack frames ? what's the relation between ESP and EBP ?

With ebp you get a base pointer into the stack segment, from which
addresses are computed as fixed offsets (ss:ebp plus or minus an offset),
while ss:esp is always the address of the value on top of the stack.
Could anybody else explain in detail what a stack frame is? I can't
remember :/ But it is related to this. As I remember (maybe wrong!),
local function variables are created in stack frames, addressed relative
to ebp.

> What do those ELF sessions that are run before main() do ? what
> happens internally
> when main() returns ? like, execute another elf session like .dtors
> and try to return the return code to the OS, as the return of an execve() for
> example. Is that right ?

Sort of.
* Re: x86 and linux stack layout
2004-11-21 13:33 x86 and linux stack layout Daniel Souza
2004-11-21 15:13 ` Justinas
@ 2004-11-21 19:08 ` Glynn Clements
2004-11-21 20:07 ` Daniel Souza
1 sibling, 1 reply; 7+ messages in thread
From: Glynn Clements @ 2004-11-21 19:08 UTC (permalink / raw)
To: Daniel Souza; +Cc: linux-c-programming

Daniel Souza wrote:

> can anyone explain to me how the x86 stack works ? like...
> the stack starts at 0xbfffe000, growing forward, at the start of
> the main() call (or another elf session that starts after main()
> and initializes the argc, argv and envp args), and after
> every CALL it modifies the EBP and ESP doing :
>
> and after a RET call, it does:
>
> and differences between JMP, LONGJMP and CALL,
> what registers they change, etc.

A near JMP instruction sets EIP to the specified address (either an
offset relative to the current EIP, the contents of a register, or the
contents of a memory location). A far JMP instruction sets both CS and
EIP.

A CALL instruction is similar to a near JMP, but it pushes EIP onto
the stack first, so that RET works.

> And so, what do function arguments look like on the stack, for
> example, when a function like
> int foo (u_long boo, char *moo, char loo) {}
> is called, what do its arguments look like on the stack ?
>
> i know that there will be a 4-byte long integer, another 4-byte
> pointer (32b) and a 1-byte char, in reverse order. Will the
> stack pointer be increased (or decreased) by 9 bytes, that
> is, the sum of all the argument type lengths ?

First, the compiler will typically pad individual arguments to the
machine's word size (e.g. 32 bits on x86). Also, it may pad the stack
frame further.

The layout of the arguments is as if the compiler pushed each argument
(padded to a multiple of the word size) onto the stack (with a PUSH
instruction) in right-to-left order.

> When a function returns, where is its result stored ?

For integer/pointer values, the result is in EAX.
> If I make a lot of function calls, the stack position
> of each call needs to be stored somewhere (like a backtrace)... where
> is it stored ?

CALL saves the current EIP on the stack.

> what are stack frames ? what's the relation between ESP and EBP ?

The first few instructions of each function typically look like:

	pushl	%ebp
	movl	%esp, %ebp
	subl	$<offset>, %esp

where <offset> is the number of bytes which the function uses for
local variables. The same effect can be achieved by the ENTER
instruction, but on recent x86 chips the ENTER instruction is slower,
so it isn't used.

In calling the function, the caller pushed the arguments onto the
stack, then the CALL instruction pushed EIP onto the stack. Coupled
with the above code, the stack will look like:

		char loo
		char* moo
		u_long boo
		old EIP
	EBP ->	old EBP
		<local var>
		<local var>
		...
	ESP ->	<local var>

The function's arguments can be referenced as positive offsets from
EBP, while its local variables can be referenced as negative offsets
from EBP.

The collection of arguments, local variables and saved EIP/EBP is
referred to as a stack frame, and EBP is referred to as a frame
pointer.

Note that EBP itself points to the previous EBP, which will point to
the EBP before that, and so on. So EBP effectively points to a linked
list of stack frames; gdb's where/bt commands simply display this
list.

Before returning from the function, the EBP and ESP must be restored
with:

	movl	%ebp, %esp
	popl	%ebp

(or the LEAVE instruction, which is equivalent). That leaves the saved
EIP on top of the stack for the RET instruction.

The -fomit-frame-pointer switch disables the use of EBP. Arguments and
local variables are referenced as positive offsets from ESP. The
compiler has to track changes to ESP so that offsets are computed
correctly. This leaves EBP available for other purposes, but inhibits
debugging (without EBP, a debugger can't simply follow the frame chain
to figure out what is stored where).
--
Glynn Clements <glynn@gclements.plus.com>
* Re: x86 and linux stack layout
2004-11-21 19:08 ` Glynn Clements
@ 2004-11-21 20:07 ` Daniel Souza
2004-11-21 21:00 ` Glynn Clements
0 siblings, 1 reply; 7+ messages in thread
From: Daniel Souza @ 2004-11-21 20:07 UTC (permalink / raw)
To: Glynn Clements; +Cc: linux-c-programming

Thanks Glynn and Justinas, helped a lot.

Do you know what other ELF sessions are for ?
What is the work of the ld.linux ? what does it do ?

Thanks again =)

[]'sss
Daniel
* Re: x86 and linux stack layout
2004-11-21 20:07 ` Daniel Souza
@ 2004-11-21 21:00 ` Glynn Clements
2004-11-21 23:07 ` Daniel Souza
0 siblings, 1 reply; 7+ messages in thread
From: Glynn Clements @ 2004-11-21 21:00 UTC (permalink / raw)
To: Daniel Souza; +Cc: linux-c-programming

Daniel Souza wrote:

> Do you know what other ELF sessions are for ?

I'm not familiar with that term.

> What is the work of the ld.linux ? what does it do ?

ld-linux is the loader. It is responsible for loading
dynamically-linked executables and the shared libraries on which they
depend.

When you execve() a dynamically-linked executable, the kernel actually
runs "/lib/ld-linux.so.2 <program> <arguments>", and ld-linux does the
work of setting up the shared libraries.

--
Glynn Clements <glynn@gclements.plus.com>
* Re: x86 and linux stack layout
2004-11-21 21:00 ` Glynn Clements
@ 2004-11-21 23:07 ` Daniel Souza
2004-11-22 3:50 ` Glynn Clements
0 siblings, 1 reply; 7+ messages in thread
From: Daniel Souza @ 2004-11-21 23:07 UTC (permalink / raw)
To: linux-c-programming

Tks Glynn... Good... well, is there a way to 'recover' stripped
binaries ? any fingerprint that identifies where a function starts in
an executable (like you mentioned, a sequence of stack pushing and esp
decreasing) ? is there a safer way to 'detect' functions ?

How are runtime loadable libraries linked to the executable ? do the
functions used from those libraries need to be relocated ? like...
suppose that:

/home/daniel/example1.c

void libfoo_init(void);   /* from libfoo */
int main(void) {
   libfoo_init();
   return 0;
}

is compiled and linked to use a runtime library like /lib/libfoo.so.
The executable code of example1 will look like

.....
0x80de4fe8 CALL 0xd3ff483e <libfoo_init>
......

as the executable code will need to be built with a fixed function
address in the CALL opcode... and let's suppose that the address was
0xd3ff483e. Running the same binary on another system, with the same
version of libfoo, but another compilation (i.e., the size of the
library and the addresses are different), the program will run
successfully. The question is: who relocates the addresses of the
CALLs ? (if things really work the way I'm supposing) Do the
addresses throughout the executable code need to be rewritten by
something, or is there a "table" where the program finds the desired
function address, so that only that table needs to be "relocated" or
re-addressed ? like... instead of

.....
0x80de4fe8 CALL 0xd3ff483e <libfoo_init>
......

we have

.....
0x80de4fe8 CALL libfoo_init <libfoo+0x3ef>
......

and a table that tells that libfoo_init is at address 0xYYYYYYYY in
libfoo, or something else ?

when the program starts, something before main() (ld.linux) loads
all external libraries into the process address space by mmap'ing the
libraries.
So, let's suppose that libfoo got mapped at some address AFTER
0xd3ff483e, which is the address in the CALLs. Who will tell the CALLs
that libfoo_init() is at another address ? where are these 'tables'
stored ?

PS: elf sessions are sessions within an elf binary, like, .ctors,
.dtors, etc, ( like, run "objdump -d /bin/cat", or with the argument
to display all info, that i can't remember right now)

Tks a lot =)
Daniel
* Re: x86 and linux stack layout
2004-11-21 23:07 ` Daniel Souza
@ 2004-11-22 3:50 ` Glynn Clements
0 siblings, 0 replies; 7+ messages in thread
From: Glynn Clements @ 2004-11-22 3:50 UTC (permalink / raw)
To: Daniel Souza; +Cc: linux-c-programming

Daniel Souza wrote:

> Tks Glynn... Good... well, is there a way to 'recover' stripped
> binaries ? any fingerprint that identifies where a function starts in
> an executable (like you mentioned, a sequence of stack pushing and esp
> decreasing) ? is there a safer way to 'detect' functions ?

Exported functions exist in the dynamic symbol table regardless of
whether the executable has been stripped. Stripping removes the debug
information and the regular symbol table, but exported functions must
still be identifiable in case they are required for linking.

Note that "nm" normally displays the regular symbol table (which
stripping removes), while "nm -D" displays the dynamic symbol table
used for linking.

As for detecting function boundaries, any address which is the target
of a CALL instruction is likely to be the start of a function. I don't
think that there's a more reliable way to detect the end of a
function.

> How are runtime loadable libraries linked to the executable ? do the
> functions used from those libraries need to be relocated ?

Calls to functions in shared libraries are implemented using indirect
jumps, so only the table of addresses needs to be relocated.
Performing relocations directly on the text sections would prevent the
memory from being shared between multiple processes.

The situation is complicated by lazy binding, where the addresses
initially point into the loader; the first time that a function is
called, the loader finds the actual address then replaces the indirect
address.

If you want the exact details, use "objdump -d ..." to disassemble the
binary, or use gdb's "disassemble" command (and the "stepi" command
to step by machine code instructions rather than by C statements).
> PS: elf sessions are sessions within an elf binary, like, .ctors,
> .dtors, etc, ( like, run "objdump -d /bin/cat", or with the argument
> to display all info, that i can't remember right now)

Those are called "sections" (or sometimes "segments").

--
Glynn Clements <glynn@gclements.plus.com>