* A serious stack issue caused by gcc optimization
@ 2009-03-03 14:22 Bean
2009-03-03 14:51 ` phcoder
0 siblings, 1 reply; 3+ messages in thread
From: Bean @ 2009-03-03 14:22 UTC (permalink / raw)
To: The development of GRUB 2
Hi,
I've found a serious stack issue when debugging EFI amd64 port. The
problem is in grub_parser_cmdline_state, the c code is all right,
problem is in the assembly code generated by gcc:
grub_parser_cmdline_state:
mov %edi,-0x14(%rsp)
movl $0x1,-0xc(%rsp)
leaq EXT_C(my_state_transitions)(%rip),%rcx
xor %r8d,%r8d
jmp loc_31
loc_1b:
cmp %edi,%eax
jne loc_2d
mov 0x8(%rcx),%al
cmp %sil,%al
je loc_3e
test %al,%al
cmove %rcx,%r8
loc_2d:
add $0x10,%rcx
loc_31:
mov (%rcx),%eax
test %eax,%eax
loc_35:
jne loc_1b
loc_37:
jmp loc_4d
loc_39:
lea -0x18(%rsp),%rcx
loc_3e:
xor %eax,%eax
cmpl $0x0,0xc(%rcx)
cmovne %esi,%eax
mov %al,(%rdx)
mov 0x4(%rcx),%eax
addq $0x18, %rsp
retq
loc_4d:
test %r8,%r8
mov %r8,%rcx
jne loc_3e
jmp loc_39
You can see that it access local variable default_transition using
negative offset from %esp. In EFI, hardware interrupt is enabled. When
an interrupt fires (such as timer), it uses the same stack, which in
effect overwritten default_transition ! The result is that the parser
is halted by wrong grub_parser_cmdline_state result, and the menu is
truncated. If I subtract %rsp and uses positive offset, this problem
goes away.
For grub_parser_cmdline_state itself, there is a simple solution, just
change default_transition to global variable:
diff --git a/kern/parser.c b/kern/parser.c
index e931853..a0ab0e7 100644
--- a/kern/parser.c
+++ b/kern/parser.c
@@ -54,6 +54,7 @@ static struct grub_parser_state_transition
state_transitions[] =
{ 0, 0, 0, 0}
};
+static struct grub_parser_state_transition default_transition;
/* Determines the state following STATE, determined by C. */
grub_parser_state_t
@@ -61,7 +62,6 @@ grub_parser_cmdline_state (grub_parser_state_t
state, char c, char *result)
{
struct grub_parser_state_transition *transition;
struct grub_parser_state_transition *next_match = 0;
- struct grub_parser_state_transition default_transition;
int found = 0;
default_transition.to_state = state;
But there could be other hidden bombs lurking around. Especially that
random reboot issue, I suspect it's caused by the same bug.
I've tried a few gcc options, but doesn't seems to work. Anyone knows
how to instruct gcc to subtract %rsp pointer for local variable ?
--
Bean
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: A serious stack issue caused by gcc optimization
2009-03-03 14:22 A serious stack issue caused by gcc optimization Bean
@ 2009-03-03 14:51 ` phcoder
2009-03-03 16:06 ` Bean
0 siblings, 1 reply; 3+ messages in thread
From: phcoder @ 2009-03-03 14:51 UTC (permalink / raw)
To: The development of GRUB 2
Hello
Have you tried -mno-red-zone? Here is an extract from man gcc:
-mno-red-zone
Do not use a so called red zone for x86-64 code. The red
zone is
mandated by the x86-64 ABI, it is a 128-byte area beyond the
location of the stack pointer that will not be modified by
signal
or interrupt handlers and therefore can be used for
temporary data
without adjusting the stack pointer. The flag -mno-red-zone
disables this red zone.
Bean wrote:
> Hi,
>
> I've found a serious stack issue when debugging EFI amd64 port. The
> problem is in grub_parser_cmdline_state, the c code is all right,
> problem is in the assembly code generated by gcc:
>
> grub_parser_cmdline_state:
> mov %edi,-0x14(%rsp)
> movl $0x1,-0xc(%rsp)
> leaq EXT_C(my_state_transitions)(%rip),%rcx
>
> xor %r8d,%r8d
> jmp loc_31
> loc_1b:
> cmp %edi,%eax
> jne loc_2d
> mov 0x8(%rcx),%al
> cmp %sil,%al
> je loc_3e
> test %al,%al
> cmove %rcx,%r8
> loc_2d:
> add $0x10,%rcx
> loc_31:
> mov (%rcx),%eax
> test %eax,%eax
> loc_35:
> jne loc_1b
> loc_37:
> jmp loc_4d
> loc_39:
> lea -0x18(%rsp),%rcx
> loc_3e:
> xor %eax,%eax
> cmpl $0x0,0xc(%rcx)
> cmovne %esi,%eax
> mov %al,(%rdx)
> mov 0x4(%rcx),%eax
> addq $0x18, %rsp
> retq
> loc_4d:
> test %r8,%r8
> mov %r8,%rcx
> jne loc_3e
> jmp loc_39
>
> You can see that it access local variable default_transition using
> negative offset from %esp. In EFI, hardware interrupt is enabled. When
> an interrupt fires (such as timer), it uses the same stack, which in
> effect overwritten default_transition ! The result is that the parser
> is halted by wrong grub_parser_cmdline_state result, and the menu is
> truncated. If I subtract %rsp and uses positive offset, this problem
> goes away.
>
> For grub_parser_cmdline_state itself, there is a simple solution, just
> change default_transition to global variable:
>
> diff --git a/kern/parser.c b/kern/parser.c
> index e931853..a0ab0e7 100644
> --- a/kern/parser.c
> +++ b/kern/parser.c
> @@ -54,6 +54,7 @@ static struct grub_parser_state_transition
> state_transitions[] =
> { 0, 0, 0, 0}
> };
>
> +static struct grub_parser_state_transition default_transition;
>
> /* Determines the state following STATE, determined by C. */
> grub_parser_state_t
> @@ -61,7 +62,6 @@ grub_parser_cmdline_state (grub_parser_state_t
> state, char c, char *result)
> {
> struct grub_parser_state_transition *transition;
> struct grub_parser_state_transition *next_match = 0;
> - struct grub_parser_state_transition default_transition;
> int found = 0;
>
> default_transition.to_state = state;
>
> But there could be other hidden bombs lurking around. Especially that
> random reboot issue, I suspect it's caused by the same bug.
>
> I've tried a few gcc options, but doesn't seems to work. Anyone knows
> how to instruct gcc to subtract %rsp pointer for local variable ?
>
--
Regards
Vladimir 'phcoder' Serbinenko
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: A serious stack issue caused by gcc optimization
2009-03-03 14:51 ` phcoder
@ 2009-03-03 16:06 ` Bean
0 siblings, 0 replies; 3+ messages in thread
From: Bean @ 2009-03-03 16:06 UTC (permalink / raw)
To: The development of GRUB 2
[-- Attachment #1: Type: text/plain, Size: 683 bytes --]
On Tue, Mar 3, 2009 at 10:51 PM, phcoder <phcoder@gmail.com> wrote:
> Hello
> Have you tried -mno-red-zone? Here is an extract from man gcc:
> -mno-red-zone
> Do not use a so called red zone for x86-64 code. The red zone is
> mandated by the x86-64 ABI, it is a 128-byte area beyond the
> location of the stack pointer that will not be modified by signal
> or interrupt handlers and therefore can be used for temporary data
> without adjusting the stack pointer. The flag -mno-red-zone
> disables this red zone.
Hi,
Oh, thanks a lot, this option does the trick. Here is the patch file.
--
Bean
[-- Attachment #2: amd64_efi.diff --]
[-- Type: application/octet-stream, Size: 452 bytes --]
diff --git a/configure.ac b/configure.ac
index fd89d30..f0166e8 100644
--- a/configure.ac
+++ b/configure.ac
@@ -308,7 +308,7 @@ if test "x$target_m64" = x1; then
if test "x$grub_cv_cc_mcmodel" = xno; then
AC_MSG_ERROR([-mcmodel=large not supported, upgrade your gcc])
fi
- TARGET_CFLAGS="$TARGET_CFLAGS -m64 -mcmodel=large"
+ TARGET_CFLAGS="$TARGET_CFLAGS -m64 -mcmodel=large -mno-red-zone"
TARGET_LDFLAGS="$TARGET_LDFLAGS -m64"
fi
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2009-03-03 16:06 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-03 14:22 A serious stack issue caused by gcc optimization Bean
2009-03-03 14:51 ` phcoder
2009-03-03 16:06 ` Bean
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.