x86 and linux stack layout

linux-c-programming.vger.kernel.org archive mirror

* x86 and linux stack layout
@ 2004-11-21 13:33 Daniel Souza
  2004-11-21 15:13 ` Justinas
  2004-11-21 19:08 ` Glynn Clements
  0 siblings, 2 replies; 7+ messages in thread
From: Daniel Souza @ 2004-11-21 13:33 UTC (permalink / raw)
  To: linux-c-programming

Hi everybody

Can anyone explain to me how the x86 stack works? For example,
the stack starts at 0xbfffe000, growing forward, at the start of
the main() call (or of another ELF session that starts after main()
and initializes the argc, argv and envp args), and after
every CALL it modifies EBP and ESP, doing:

and after a RET call, it does:

and differences between JMP, LONGJMP and CALL,  
what registers they change, etc.

And how do function arguments look on the stack? For
example, when a function like
int foo (u_long boo, char *moo, char loo) {}
is called, how do its arguments look on the stack?

I know that there will be a 4-byte long integer, another 4-byte
pointer (32 bits) and a 1-byte char, in reverse order. Will the
stack pointer be increased (or decreased) by 9 bytes, that
is, the sum of all the argument type lengths?

When a function returns, where is its result stored?

If I make a lot of function calls, the stack position of
each call needs to be stored somewhere (like a backtrace)... where
is it stored?

What are stack frames? What is the relation between ESP and EBP?

What do those ELF sessions that run before main() do? What
happens internally
when main() returns? Does it execute another ELF session like .dtors
and try to return the return code to the OS, as the return of an execve() for
example? Is that right?

Thanks a lot =)
Daniel

-- 
# (perl -e 'while (1) { print "\x90"; }') | dd of=/dev/war


* Re: x86 and linux stack layout
  2004-11-21 13:33 x86 and linux stack layout Daniel Souza
@ 2004-11-21 15:13 ` Justinas
  2004-11-21 19:08 ` Glynn Clements
  1 sibling, 0 replies; 7+ messages in thread
From: Justinas @ 2004-11-21 15:13 UTC (permalink / raw)
  To: Daniel Souza, linux-c-programming

On Sun, 21 Nov 2004 10:33:53 -0300
Daniel Souza <thehazard@gmail.com> wrote:

> Hi everybody
Hello,
I'll try to explain some details, as far as I remember.
> 
> can anyone explain me how the x86 stack works ? like...
I think you know the principle of how the stack works ;] First In, Last Out. There are two main instructions for working with it (in assembly language): push and pop.
Let's say we have esp=0x00000100. When we execute the instruction
	push eax
the processor actually performs two steps:
	sub esp,4	;decreases the stack pointer by 4 (the size of eax)
	mov [esp], eax	;stores the value of eax at the memory location
			;esp points to; now esp=0x000000fc

Popping a value from the stack (pop eax) does the reverse:
	mov eax,[esp]	;loads the value from the top of the stack
	add esp,4	;increases esp by 4 (the size of the eax register)
			;now esp=0x00000100
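
The two steps of push and pop can be sketched in C. This is only a simulation under stated assumptions: `sim_mem`, `sim_esp`, `sim_push` and `sim_pop` are hypothetical names for a 256-byte stack region and a stack pointer that starts at 0x100, not anything the processor or a library provides.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical simulated machine: a 256-byte stack region (addresses
   0x00..0xFF) and an ESP that starts just past its end, at 0x100. */
uint8_t  sim_mem[0x100];
uint32_t sim_esp = 0x100;

/* push: decrease ESP by 4, then store the 32-bit value at [ESP]. */
void sim_push(uint32_t value)
{
    sim_esp -= 4;
    memcpy(&sim_mem[sim_esp], &value, sizeof value);
}

/* pop: load the 32-bit value at [ESP], then increase ESP by 4. */
uint32_t sim_pop(void)
{
    uint32_t value;
    memcpy(&value, &sim_mem[sim_esp], sizeof value);
    sim_esp += 4;
    return value;
}
```

Pushing two values moves `sim_esp` from 0x100 to 0xF8, and popping returns them in the opposite order while moving `sim_esp` back to 0x100.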

> the stack starts at 0xbfffe000, growing forward, at the start of
> the main() call (or another elf session that starts after main()
> and initializes the argc, argv and envp args), and after
> every CALL if modifies the EBP and ESP doing :
Let's say we have a C function (the system's word size is 4 bytes):
void do_smth(int a, char b);
When calling this function, the caller does two pushes:
	push b
	push a
Consider the stack before the pushes, esp=0x00000100 (0100 for simplicity):
	0000|		|
	0001|		|
	0002|		|
	....|		|
	....|		|
	00FD|		|
	00FE|		|
	00FF|___________|
esp->	0100

After the pushes, esp=0x00F8:
	0000|		|
	0001|		|
	0002|		|
	....|		|
	....|		|
	00F7|		|
esp->	00F8|     a	|
	00FC|_____b_____|
	0100

The caller then executes the call. The called function knows that its parameters sit on the stack just above the return address pushed by the call.

> 
> and after a RET call, it does:
> 
> and differences between JMP, LONGJMP and CALL,  
> what registers they change, etc.
> 
> And so, how function arguments looks like in the stack, for 
> example, when a function like
> int foo (u_long boo, char *moo, char loo) {}
> is caught, how they arguments looks like in the stack ?
> 
> i know that will be a 4 bytes long integer, another 4bytes
> pointer (32b) and a 1byte char, in a reverse order. Will the
> stack pointer be added (or subtracted) by 9 bytes, that
> mean, the sum of all argument type lengths ? 
> 
> When a function returns, where its result is stored on ? 
Usually in eax, or in edx:eax for 64-bit values. But it depends on the calling convention:
http://weblogs.asp.net/oldnewthing/archive/2004/01/02/47184.aspx
http://weblogs.asp.net/oldnewthing/archive/2004/01/07/48303.aspx
and good old Google, keyword: calling convention :)
> 
> If I make a lot of function calls, in anywhere the position of stack
> of each call needs to be stored (like a backtrace)... where
> is it stored on ? 

> 
> what are stack frames ? whats the relation between ESP and EBP ?
With ebp you can set up a base pointer within the stack segment, from which addresses are computed. The top of the stack is addressed by ss:esp, while ss:ebp marks a fixed point in the stack that stays put as esp moves. Could anybody else explain in detail what a stack frame is? I can't remember :/ But it is related to this. As far as I remember (maybe wrong!), a function's local variables are created in its stack frame and addressed relative to ebp.
> 
> What those ELF sessions that are caught before main() do ? what
> happens internally
> when main() returns ? like, execute another elf session like .dtors
> and try to return the return code to OS, as return of a execve() for
> example. Is it right ?
Something like that.
> 
> 
> Thanks a lot =)
> Daniel
> 
> 


* Re: x86 and linux stack layout
  2004-11-21 13:33 x86 and linux stack layout Daniel Souza
  2004-11-21 15:13 ` Justinas
@ 2004-11-21 19:08 ` Glynn Clements
  2004-11-21 20:07   ` Daniel Souza
  1 sibling, 1 reply; 7+ messages in thread
From: Glynn Clements @ 2004-11-21 19:08 UTC (permalink / raw)
  To: Daniel Souza; +Cc: linux-c-programming

Daniel Souza wrote:

> can anyone explain me how the x86 stack works ? like...
> the stack starts at 0xbfffe000, growing forward, at the start of
> the main() call (or another elf session that starts after main()
> and initializes the argc, argv and envp args), and after
> every CALL if modifies the EBP and ESP doing :
> 
> and after a RET call, it does:
> 
> and differences between JMP, LONGJMP and CALL,  
> what registers they change, etc.

A near JMP instruction sets EIP to the specified address (either an
offset relative to the current EIP, the contents of a register, or the
contents of a memory location). A far JMP instruction sets both CS and
EIP.

A CALL instruction is similar to a near JMP, but it first pushes the
return address (the EIP of the instruction following the CALL) onto
the stack, so that RET can pop it back into EIP.
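
The difference between JMP, CALL and RET can be sketched as a small C simulation. The `struct cpu` type and the `cpu_*` functions are hypothetical names for this sketch, and the stack is assumed to occupy addresses 0x00..0xFF with ESP starting at 0x100. The `insn_len` parameter stands for the encoded length of the CALL instruction (5 bytes for a near relative call).

```c
#include <stdint.h>

/* Hypothetical simulated CPU state: EIP, ESP and a small stack. */
struct cpu {
    uint32_t eip;
    uint32_t esp;               /* byte address into the stack below */
    uint32_t stack[0x100 / 4];  /* covers addresses 0x00..0xFF */
};

void cpu_init(struct cpu *c, uint32_t entry)
{
    c->eip = entry;
    c->esp = 0x100;
}

/* Near JMP: only EIP changes. */
void cpu_jmp(struct cpu *c, uint32_t target)
{
    c->eip = target;
}

/* CALL: push the address of the next instruction, then jump. */
void cpu_call(struct cpu *c, uint32_t target, uint32_t insn_len)
{
    c->esp -= 4;
    c->stack[c->esp / 4] = c->eip + insn_len;
    c->eip = target;
}

/* RET: pop the saved return address back into EIP. */
void cpu_ret(struct cpu *c)
{
    c->eip = c->stack[c->esp / 4];
    c->esp += 4;
}
```

In this model a JMP leaves ESP untouched, while a CALL followed by a RET leaves ESP where it started and EIP just past the CALL.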

> And so, how function arguments looks like in the stack, for 
> example, when a function like
> int foo (u_long boo, char *moo, char loo) {}
> is caught, how they arguments looks like in the stack ?
> 
> i know that will be a 4 bytes long integer, another 4bytes
> pointer (32b) and a 1byte char, in a reverse order. Will the
> stack pointer be added (or subtracted) by 9 bytes, that
> mean, the sum of all argument type lengths ? 

First, the compiler will typically pad individual arguments to the
machine's word size (e.g. 32 bits on x86). Also, it may pad the stack
frame further.

The layout of the arguments is as if the compiler pushed each argument
(padded to a multiple of the word size) onto the stack (with a PUSH
instruction) in right-to-left order.
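
The effect of the padding on the question's 9-byte estimate can be worked out with a short C sketch. `WORD_SIZE`, `padded_size` and `args_stack_bytes` are hypothetical names, and the 4-byte word is an assumption that matches 32-bit x86 only. Argument sizes are passed in explicitly rather than taken from sizeof, since sizeof would reflect the machine the sketch is compiled on.

```c
#include <stddef.h>

#define WORD_SIZE 4u   /* assumption: 32-bit x86, 4-byte stack slots */

/* Round one argument's size up to a multiple of the word size. */
size_t padded_size(size_t size)
{
    return (size + WORD_SIZE - 1) / WORD_SIZE * WORD_SIZE;
}

/* Total stack bytes occupied by a list of arguments. */
size_t args_stack_bytes(const size_t *sizes, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += padded_size(sizes[i]);
    return total;
}
```

For foo(u_long boo, char *moo, char loo) with raw sizes 4, 4 and 1, this gives 12 bytes rather than 9, because the char occupies a full 4-byte slot.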

> When a function returns, where its result is stored on ? 

For integer/pointer values, the result is in EAX.

> If I make a lot of function calls, in anywhere the position of stack
> of each call needs to be stored (like a backtrace)... where
> is it stored on ? 

CALL saves the current EIP on the stack.

> what are stack frames ? whats the relation between ESP and EBP ?

The first few instructions of each function typically look like:

	pushl	%ebp
	movl	%esp, %ebp
	subl	$<offset>, %esp

where <offset> is the number of bytes which the function uses for
local variables. The same effect can be achieved by the ENTER
instruction, but on recent x86 chips the ENTER instruction is slower,
so it isn't used.

In calling the function, the caller pushed the arguments onto the
stack, then the CALL instruction pushed EIP onto the stack. Coupled
with the above code, the stack will look like:

	char loo
	char* moo
	u_long boo
	old EIP
EBP ->	old EBP
	<local var>
	<local var>
	...
ESP ->	<local var>

The function's arguments can be referenced as positive offsets from
EBP, while its local variables can be referenced as negative offsets
from EBP. The collection of arguments, local variables and saved
EIP/EBP is referred to as a stack frame, and EBP is referred to as a
frame pointer.
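
The layout can be checked with a small C simulation of the caller's pushes, the CALL and the function prologue. Everything here is a sketch under assumptions: `sim_mem`, `sim_esp`, `sim_ebp` and the functions are hypothetical names, the stack covers addresses 0x00..0xFF, and every argument takes one 4-byte slot.

```c
#include <stdint.h>

/* Hypothetical simulated 32-bit stack, ESP and EBP. */
uint32_t sim_mem[0x100 / 4];
uint32_t sim_esp = 0x100;
uint32_t sim_ebp = 0;

void sim_push(uint32_t v)
{
    sim_esp -= 4;
    sim_mem[sim_esp / 4] = v;
}

uint32_t sim_load(uint32_t addr)
{
    return sim_mem[addr / 4];
}

/* Build the frame for foo(boo, moo, loo) in the order described above. */
void enter_foo(uint32_t boo, uint32_t moo, uint8_t loo,
               uint32_t ret_addr, uint32_t locals_bytes)
{
    sim_push(loo);            /* caller: rightmost argument first */
    sim_push(moo);
    sim_push(boo);
    sim_push(ret_addr);       /* CALL: saved EIP */
    sim_push(sim_ebp);        /* pushl %ebp */
    sim_ebp = sim_esp;        /* movl %esp, %ebp */
    sim_esp -= locals_bytes;  /* subl $<offset>, %esp */
}

/* Arguments sit at positive offsets from EBP: 8, 12, 16, ... */
uint32_t frame_arg(unsigned index)
{
    return sim_load(sim_ebp + 8 + 4 * index);
}

uint32_t frame_saved_eip(void) { return sim_load(sim_ebp + 4); }
uint32_t frame_saved_ebp(void) { return sim_load(sim_ebp); }
```

In this model the first argument is at EBP+8, the saved EIP at EBP+4 and the old EBP at EBP itself, while locals occupy the space between ESP and EBP.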

Note that EBP itself points to the previous EBP, which will point to
the EBP before that, and so on. So EBP effectively points to a linked
list of stack frames; gdb's where/bt commands simply display this
list.
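
Walking that linked list can be sketched in C. The `struct frame` type and `backtrace_frames` are hypothetical names, and the sketch assumes that the outermost frame stores a null saved EBP to end the chain. It models the two words found at EBP in each frame; it does not read the real registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of what EBP points at in each frame:
   the saved (previous) EBP, followed by the saved EIP. */
struct frame {
    const struct frame *prev_ebp;  /* NULL ends the chain in this sketch */
    uint32_t            saved_eip;
};

/* Follow the chain from the current EBP, collecting return addresses.
   Returns the number of frames visited (at most max). */
size_t backtrace_frames(const struct frame *ebp, uint32_t *out, size_t max)
{
    size_t n = 0;
    while (ebp != NULL && n < max) {
        out[n++] = ebp->saved_eip;
        ebp = ebp->prev_ebp;
    }
    return n;
}
```

The return addresses come out innermost call first, which is the order in which a debugger prints a backtrace.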

Before returning from the function, the EBP and ESP must be restored
with:

	movl	%ebp, %esp
	popl	%ebp

(or the LEAVE instruction, which is equivalent). That leaves the saved
EIP on top of the stack for the RET instruction.

The -fomit-frame-pointer switch disables the use of EBP. Arguments and
local variables are referenced as positive offsets from ESP. The
compiler has to track changes to ESP so that offsets are computed
correctly. This leaves EBP available for other purposes, but inhibits
debugging (without EBP, a debugger can't figure out what is stored
where).

-- 
Glynn Clements <glynn@gclements.plus.com>


* Re: x86 and linux stack layout
  2004-11-21 19:08 ` Glynn Clements
@ 2004-11-21 20:07   ` Daniel Souza
  2004-11-21 21:00     ` Glynn Clements
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Souza @ 2004-11-21 20:07 UTC (permalink / raw)
  To: Glynn Clements; +Cc: linux-c-programming

Thanks Glynn and Justinas, helped a lot. 
Do you know what other ELF sessions are for ? What is the work of the
ld.linux ? what does it do ?

Thanks again =)

[]'sss
Daniel


* Re: x86 and linux stack layout
  2004-11-21 20:07   ` Daniel Souza
@ 2004-11-21 21:00     ` Glynn Clements
  2004-11-21 23:07       ` Daniel Souza
  0 siblings, 1 reply; 7+ messages in thread
From: Glynn Clements @ 2004-11-21 21:00 UTC (permalink / raw)
  To: Daniel Souza; +Cc: linux-c-programming

Daniel Souza wrote:

> Do you know what other ELF sessions are for ? 

I'm not familiar with that term.

> What is the work of the ld.linux ? what does it do ?

ld-linux is the loader. It is responsible for loading
dynamically-linked executables and the shared libraries on which they
depend.

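When you execve() a dynamically-linked executable, the kernel loads
the interpreter named in the executable (typically /lib/ld-linux.so.2
on 32-bit x86) and starts it, much as if it had run
"/lib/ld-linux.so.2 <program> <arguments>"; ld-linux then does the
work of setting up the shared libraries.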

-- 
Glynn Clements <glynn@gclements.plus.com>


* Re: x86 and linux stack layout
  2004-11-21 21:00     ` Glynn Clements
@ 2004-11-21 23:07       ` Daniel Souza
  2004-11-22  3:50         ` Glynn Clements
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Souza @ 2004-11-21 23:07 UTC (permalink / raw)
  To: linux-c-programming

Thanks Glynn... Good... well, is there a way to 'recover' stripped
binaries? Is there any fingerprint that identifies where a function starts in
an executable (like you mentioned, a sequence of stack pushing and esp
decreasing)? Is there a safer way to 'detect' functions?

How are runtime loadable libraries linked to the executable? Do the
functions used from those libraries need to be relocated? Like...
suppose that:

/home/daniel/example1.c
void libfoo_init(void);	/* provided by libfoo */

int main(void)
{
   libfoo_init();
   return 0;
}
is compiled and linked to use a runtime library like /lib/libfoo.so.

The executable code of example1 will look like

.....
0x80de4fe8 CALL 0xd3ff483e <libfoo_init>
......

as the executable code will need to be built with a fixed function
address in the CALL opcode... and let's suppose that the address was
0xd3ff483e. Running the same binary on another system, with the same
version of libfoo but another compilation (i.e., the size of the library
and the addresses are different), the program will run successfully. The
question is: who relocates the addresses of the CALLs? (if things really
work the way I'm supposing)

Do the addresses throughout the executable code need to be rewritten by
something, or is there a "table" where the program finds the desired
function address, so that only that table needs to be "relocated" or
readdressed? Like... instead of

.....
0x80de4fe8 CALL 0xd3ff483e <libfoo_init>
......

we have

.....
0x80de4fe8 CALL libfoo_init <libfoo+0x3ef>
......

and a table that says libfoo_init is at address 0xYYYYYYYY in
libfoo, or something else? When the program starts, something before
main() (ld.linux) loads all external libraries into the process
address space by mmap'ing them. So, let's suppose that libfoo
got mapped at some address AFTER 0xd3ff483e, which is the address in
the CALLs. Who will tell the CALLs that libfoo_init() is at another
address? Where are these 'tables' stored?

PS: ELF sessions are sessions within an ELF binary, like .ctors,
.dtors, etc. (e.g., run "objdump -d /bin/cat", or with the argument
to display all info, which I can't remember right now)

Tks a lot =)

Daniel


* Re: x86 and linux stack layout
  2004-11-21 23:07       ` Daniel Souza
@ 2004-11-22  3:50         ` Glynn Clements
  0 siblings, 0 replies; 7+ messages in thread
From: Glynn Clements @ 2004-11-22  3:50 UTC (permalink / raw)
  To: Daniel Souza; +Cc: linux-c-programming

Daniel Souza wrote:

> Tks Gkynn... Good... well, there's a way to 'recover' stripped
> binaries ? any fingerprint that identifies where a function starts in
> an executable (like you mentioned, a sequence of stack pushing and esp
> decreasing) ? there's a safer way to 'detect' functions ?

Exported functions remain in the dynamic symbol table (.dynsym)
regardless of whether the executable has been stripped. Stripping
removes the debug information and the regular symbol table (.symtab),
but exported functions must still be identifiable in case they are
required for dynamic linking.

Note that "nm" normally displays the regular symbol table (so it
reports no symbols for a stripped binary), while "nm -D" displays the
dynamic symbol table used for linking.

As for detecting function boundaries, any address which is the target
of a CALL instruction is likely to be the start of a function. I don't
think that there's a more reliable way to detect the end of a
function.
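
The CALL-target heuristic can be sketched in C for the most common encoding, opcode 0xE8 followed by a 32-bit little-endian displacement relative to the next instruction. `find_call_targets` is a hypothetical name, and the scan is deliberately naive: it assumes every 0xE8 byte it meets starts a CALL, whereas a real tool has to decode instructions one by one, since 0xE8 can also appear as data or inside another instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Naive scan of a code buffer loaded at address `base` for
   CALL rel32 (opcode 0xE8). Stores up to `max` target addresses
   and returns how many were found. */
size_t find_call_targets(const uint8_t *code, size_t len, uint32_t base,
                         uint32_t *targets, size_t max)
{
    size_t n = 0;
    size_t i = 0;
    while (i + 5 <= len && n < max) {
        if (code[i] == 0xE8) {
            /* Assemble the little-endian 32-bit displacement. */
            uint32_t rel = (uint32_t)code[i + 1]
                         | (uint32_t)code[i + 2] << 8
                         | (uint32_t)code[i + 3] << 16
                         | (uint32_t)code[i + 4] << 24;
            /* Target = address of next instruction + displacement;
               unsigned arithmetic wraps modulo 2^32 like EIP does. */
            targets[n++] = base + (uint32_t)i + 5 + rel;
            i += 5;
        } else {
            i++;
        }
    }
    return n;
}
```

Each address collected this way is a candidate function start, nothing more; indirect calls and functions reached only through pointers are not found by this scan.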

> How runtime loadable libraries are linked to the executable ? the
> functions used from that libraries needs to be realocatted ?

Calls to functions in shared libraries are implemented using indirect
jumps, so only the table of addresses needs to be relocated. 
Performing relocations directly on the text sections would prevent the
memory from being shared between multiple processes.

The situation is complicated by lazy binding, where the addresses
initially point into the loader; the first time that a function is
called, the loader finds the actual address then replaces the indirect
address.
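
The idea of lazy binding through a table of addresses can be sketched in plain C with function pointers. This illustrates the principle only, not the actual stub code the linker and loader generate; `libfoo_double`, `lazy_stub`, `address_table`, `resolve_count` and `call_libfoo_double` are all hypothetical names.

```c
typedef int (*func_ptr)(int);

/* Stand-in for a function that lives in a shared library. */
int libfoo_double(int x)
{
    return 2 * x;
}

int resolve_count = 0;   /* how often the "loader" had to resolve */

int lazy_stub(int x);    /* forward declaration for the table below */

/* One-entry table of addresses; it initially points at the stub,
   playing the role of an entry that points into the loader. */
func_ptr address_table[1] = { lazy_stub };

/* First call lands here: find the real address, patch the table,
   then forward the call. */
int lazy_stub(int x)
{
    resolve_count++;
    address_table[0] = libfoo_double;
    return address_table[0](x);
}

/* The program's call site: always an indirect call via the table,
   so the code itself never needs to be modified. */
int call_libfoo_double(int x)
{
    return address_table[0](x);
}
```

Only the table entry changes at run time. The first call goes through the stub and later calls reach the library function directly, which is why the text section can stay read-only and shared between processes.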

If you want the exact details, use "objdump -d ..." to disassemble the
binary, or use gdb's "disassemble" command (and the "stepi"
command to step by machine code instructions rather than by C
statements).

> PS: elf sessions are sessions within a elf binary, like, .ctors,
> .dtors, etc, ( like, rum "objdump -d /bin/cat", or with the argument
> to display all info, that i cant remember right now)

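Those are called "sections". (They are sometimes loosely called
"segments" too, although strictly an ELF segment is a program-header
entry that groups sections for loading.)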

-- 
Glynn Clements <glynn@gclements.plus.com>

