* unwind_stack() and an exception at the last instruction (after the epilogue)
       [not found] <b647ffbd0612121342y5b188be0o5ccce1b2c57a9725@mail.gmail.com>
@ 2006-12-13 11:07 ` Dmitry Adamushko
  2006-12-13 11:54   ` Thiemo Seufer
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Adamushko @ 2006-12-13 11:07 UTC
  To: linux-mips; +Cc: Ralf Baechle, Dmitry Adamushko

[ resend: my previous message was probably rejected because it was not
in plain text :]


 Hello,

 unwind_stack() explicitly handles the case where an exception takes
place at the first instruction, i.e. before the prologue.

 But what about the other corner case, where an exception is caused by
an instruction placed after the epilogue?

 example:

 00400e8c <cause_oops>:
   400e8c:       3c1c0fc0        lui     gp,0xfc0
   400e90:       279c71c4        addiu   gp,gp,29124
   400e94:       0399e021        addu    gp,gp,t9
   400e98:       27bdffe0        addiu   sp,sp,-32
   400e9c:       afbf0018        sw      ra,24(sp)
   400ea0:       afbc0010        sw      gp,16(sp)
   400ea4:       8f84801c        lw      a0,-32740(gp)
   400ea8:       8f9980ac        lw      t9,-32596(gp)
   400eac:       00000000        nop
   400eb0:       0320f809        jalr    t9
   400eb4:       24841984        addiu   a0,a0,6532
   400eb8:       8fbc0010        lw      gp,16(sp)
   400ebc:       8fbf0018        lw      ra,24(sp)
   400ec0:       27bd0020        addiu   sp,sp,32
   400ec4:       03e00008        jr      ra
   400ec8:       ac000000        sw      zero,0(zero)
<----------- <epc> will be here when an exception happens


 In this case, <sp> already points to the caller's stack frame, so
unwind_stack() will make a wrong assumption (as it is looking at the
epilogue of the callee).
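
To make the <sp> point concrete, the two instruction words just before the faulting store can be decoded by hand. The following is a minimal sketch in C using the standard MIPS32 field layout; the helper names (`is_sp_adjust`, `is_jr_ra`) are invented for illustration and are not kernel functions.

```c
#include <stdint.h>
#include <stdbool.h>

/* MIPS32 field extraction (standard I-type/R-type encoding). */
static inline unsigned op(uint32_t insn)    { return insn >> 26; }
static inline unsigned rs(uint32_t insn)    { return (insn >> 21) & 0x1f; }
static inline unsigned rt(uint32_t insn)    { return (insn >> 16) & 0x1f; }
static inline int16_t  imm16(uint32_t insn) { return (int16_t)(insn & 0xffff); }

enum { REG_SP = 29, REG_RA = 31, OP_SPECIAL = 0, OP_ADDIU = 9, FUNC_JR = 8 };

/* Hypothetical helper: if insn is "addiu sp,sp,imm", store imm and return true. */
static bool is_sp_adjust(uint32_t insn, int *delta)
{
	if (op(insn) != OP_ADDIU || rs(insn) != REG_SP || rt(insn) != REG_SP)
		return false;
	*delta = imm16(insn);
	return true;
}

/* Hypothetical helper: true for "jr ra". */
static bool is_jr_ra(uint32_t insn)
{
	return op(insn) == OP_SPECIAL && rs(insn) == REG_RA &&
	       (insn & 0x3f) == FUNC_JR;
}
```

Fed the words from the listing, 0x27bdffe0 at 400e98 decodes as an sp adjustment of -32 and 0x27bd0020 at 400ec0 as +32, so by the time the store in the delay slot of `jr ra` runs, the frame has already been released.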

 btw, the first and last instructions are just corner cases of an
instruction being placed before the prologue and after the epilogue,
right?

 so something like

 - if (unlikely(ofs == 0)) {
 + if (unlikely(ofs == 0 || ofs == size - sizeof_mips_instruction)) {
         pc = *ra;
         *ra = 0;
         return pc;
 }

 won't be a generic solution.
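
Pulled out as a standalone predicate, the proposed check looks roughly like the sketch below. The names `epc_at_function_edge` and `MIPS_INSN_SIZE` are made up for this illustration; `ofs` and `size` stand for the offset of <epc> within the function and the function's size in bytes, as in the snippet above.

```c
#include <stdbool.h>

#define MIPS_INSN_SIZE 4UL	/* every MIPS32 instruction is 4 bytes */

/*
 * Hypothetical predicate mirroring the proposed condition: true only
 * when <epc> is at the very first or the very last instruction.
 */
static bool epc_at_function_edge(unsigned long ofs, unsigned long size)
{
	return ofs == 0 || ofs == size - MIPS_INSN_SIZE;
}
```

For cause_oops (0x400e8c through 0x400ec8, so 0x40 bytes) the store at offset 0x3c is caught. It is easy to see why this is not generic, though: in the enqueue_task() listing later in the thread, assuming the symbol covers 0x98 through 0x114 (0x80 bytes), the store at 0x10c (offset 0x74) already runs after `addiu sp,sp,8` but is not at either edge.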

 Did I miss something? Hm... is <epc> always guaranteed to be right
when the instruction is in the branch delay slot?
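
On the delay-slot question: on MIPS, when the faulting instruction sits in a branch delay slot, EPC holds the address of the branch itself and the BD bit (bit 31) of the CP0 Cause register is set. In the example that would make the reported <epc> 0x400ec4 (the `jr ra`) rather than 0x400ec8. A minimal sketch of recovering the faulting address follows; `CAUSE_BD` and `faulting_addr()` are names invented here, not the kernel's own macros or helpers.

```c
#include <stdint.h>

/* Cause.BD: the exception was taken in a branch delay slot. */
#define CAUSE_BD (UINT32_C(1) << 31)

/* Hypothetical helper: address of the instruction that actually faulted. */
static uint32_t faulting_addr(uint32_t epc, uint32_t cause)
{
	return (cause & CAUSE_BD) ? epc + 4 : epc;
}
```

If an unwinder is handed the raw EPC, the offset it computes is that of the branch, i.e. `size - 8` rather than `size - 4` for a store in the delay slot of the final `jr ra`.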

 p.s. yep, the example is a part of user-space code (optimization:
-Os) or is there anything (compiler options etc.) preventing similar
code from being generated for kernel-space code?


Thanks in advance for any comments.


-- 
Best regards,
Dmitry Adamushko


* Re: unwind_stack() and an exception at the last instruction (after the epilogue)
  2006-12-13 11:07 ` unwind_stack() and an exception at the last instruction (after the epilogue) Dmitry Adamushko
@ 2006-12-13 11:54   ` Thiemo Seufer
  2006-12-13 12:45     ` Dmitry Adamushko
  0 siblings, 1 reply; 7+ messages in thread
From: Thiemo Seufer @ 2006-12-13 11:54 UTC
  To: Dmitry Adamushko; +Cc: linux-mips, Ralf Baechle

Dmitry Adamushko wrote:
> [ resend: my previous message was probably rejected because it was not
> in plain text :]
> 
> 
> Hello,
> 
> unwind_stack() explicitly handles the case where an exception takes
> place at the first instruction, i.e. before the prologue.
> 
> But what about the other corner case, where an exception is caused by
> an instruction placed after the epilogue?
> 
> example:
> 
> 00400e8c <cause_oops>:
>   400e8c:       3c1c0fc0        lui     gp,0xfc0
>   400e90:       279c71c4        addiu   gp,gp,29124
>   400e94:       0399e021        addu    gp,gp,t9
>   400e98:       27bdffe0        addiu   sp,sp,-32
>   400e9c:       afbf0018        sw      ra,24(sp)
>   400ea0:       afbc0010        sw      gp,16(sp)
>   400ea4:       8f84801c        lw      a0,-32740(gp)
>   400ea8:       8f9980ac        lw      t9,-32596(gp)
>   400eac:       00000000        nop
>   400eb0:       0320f809        jalr    t9
>   400eb4:       24841984        addiu   a0,a0,6532
>   400eb8:       8fbc0010        lw      gp,16(sp)
>   400ebc:       8fbf0018        lw      ra,24(sp)
>   400ec0:       27bd0020        addiu   sp,sp,32
>   400ec4:       03e00008        jr      ra
>   400ec8:       ac000000        sw      zero,0(zero)
> <----------- <epc> will be here when an exception happens

Was this example generated by a real world compiler? (Which one?)

> In this case, <sp> already points to the caller's stack frame, so
> unwind_stack() will make a wrong assumption (as it is looking at the
> epilogue of the callee).
> 
> btw, the first and last instructions are just corner cases of an
> instruction being placed before the prologue and after the epilogue,
> right?
> 
> so something like
> 
> - if (unlikely(ofs == 0)) {
> + if (unlikely(ofs == 0 || ofs == size - sizeof_mips_instruction)) {
>         pc = *ra;
>         *ra = 0;
>         return pc;
> }
> 
> won't be a generic solution.
> 
> Did I miss something? Hm... is <epc> always guaranteed to be right
> when the instruction is in the branch delay slot?
> 
> p.s. yep, the example is a part of user-space code (optimization:
> -Os) or is there anything (compiler options etc.) preventing similar
> code from being generated for kernel-space code?

I'm inclined to claim the example is broken WRT ABI rules since it
doesn't enclose the whole user code in the prologue/epilogue bracket.


Thiemo


* Re: unwind_stack() and an exception at the last instruction (after the epilogue)
  2006-12-13 11:54   ` Thiemo Seufer
@ 2006-12-13 12:45     ` Dmitry Adamushko
  2006-12-13 13:52       ` Thiemo Seufer
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Adamushko @ 2006-12-13 12:45 UTC
  To: Thiemo Seufer; +Cc: Ralf Baechle, linux-mips

> Was this example generated by a real world compiler? (Which one?)

[adamushkad@cplx219]/>mips-linux-uclibc-gcc -v
Reading specs from
/vobs/linux/tools/mips/gcc-3.4.2/bin/../lib/gcc/mips-linux-uclibc/3.4.2/specs
Configured with:
/vobs/linux/tools/buildroot/toolchain_build_mips/gcc-3.4.2/configure
--prefix=/vobs/linux/tools/buildroot/build_mips/staging_dir
--build=i386-pc-linux-gnu
--host=i386-pc-linux-gnu
--target=mips-linux-uclibc
--enable-languages=c,c++
--enable-shared
--disable-__cxa_atexit
--enable-target-optspace
-with-gnu-ld
--disable-nls
--enable-multilib
Thread model: posix

gcc version 3.4.2


> I'm inclined to claim the example is broken WRT ABI rules since it
> doesn't enclose the whole user code in the prologue/epilogue bracket.
>

It's o32. So it explicitly requires that when the prologue and
epilogue have been used in the function, all the user code must be
placed in between, right?

In this light, the unlikely(ofs == 0) check in unwind_stack() aims at
catching cases where <sp> is wrong (if addiu sp,sp,OFFSET is normally
the very first instruction).

ok, here is an example from kernel/sched.o (the same compiler).

00000098 <enqueue_task>:
      98:       8c820018        lw      v0,24(a0)   <----- before the prologue
      9c:       27bdfff8        addiu   sp,sp,-8
      a0:       afbe0000        sw      s8,0(sp)
      a4:       000210c0        sll     v0,v0,0x3
      a8:       00a21021        addu    v0,a1,v0
      ac:       24420018        addiu   v0,v0,24
      b0:       8c460004        lw      a2,4(v0)
      b4:       24830020        addiu   v1,a0,32
      b8:       ac430004        sw      v1,4(v0)
      bc:       ac820020        sw      v0,32(a0)
      c0:       ac660004        sw      a2,4(v1)
      c4:       acc30000        sw      v1,0(a2)
      c8:       8c860018        lw      a2,24(a0)
      cc:       24a70004        addiu   a3,a1,4
      d0:       03a0f021        move    s8,sp
      d4:       00061142        srl     v0,a2,0x5
      d8:       00021080        sll     v0,v0,0x2
      dc:       00e23821        addu    a3,a3,v0
      e0:       8ce30000        lw      v1,0(a3)
      e4:       30c6001f        andi    a2,a2,0x1f
      e8:       24020001        li      v0,1
      ec:       00c21004        sllv    v0,v0,a2
      f0:       00621825        or      v1,v1,v0
      f4:       ace30000        sw      v1,0(a3)
      f8:       8ca20000        lw      v0,0(a1)
      fc:       03c0e821        move    sp,s8
     100:       8fbe0000        lw      s8,0(sp)
     104:       24420001        addiu   v0,v0,1
     108:       27bd0008        addiu   sp,sp,8
     10c:       aca20000        sw      v0,0(a1)
     110:       03e00008        jr      ra
     114:       ac850028        sw      a1,40(a0)   <----- after the epilogue


As far as I can see, this compiler normally places "addiu   sp,sp,FRAME_SIZE"
in the branch delay slot of "jr ra", but e.g. enqueue_task() (example
above) and request_task() are exceptions. btw, the very first
instruction is also placed before the prologue.

Are there any configure options that might have caused such
behaviour [hmmm... e.g. gcc was configured with --ignore-abi-rules :]?
Although, I don't think this would be an option-dependent case.


> Thiemo
>


-- 
Best regards,
Dmitry Adamushko


* Re: unwind_stack() and an exception at the last instruction (after the epilogue)
  2006-12-13 12:45     ` Dmitry Adamushko
@ 2006-12-13 13:52       ` Thiemo Seufer
  2006-12-13 14:40         ` Dmitry Adamushko
  0 siblings, 1 reply; 7+ messages in thread
From: Thiemo Seufer @ 2006-12-13 13:52 UTC
  To: Dmitry Adamushko; +Cc: Ralf Baechle, linux-mips

Dmitry Adamushko wrote:
> >Was this example generated by a real world compiler? (Which one?)
> 
> [adamushkad@cplx219]/>mips-linux-uclibc-gcc -v
> Reading specs from
> /vobs/linux/tools/mips/gcc-3.4.2/bin/../lib/gcc/mips-linux-uclibc/3.4.2/specs
> Configured with:
> /vobs/linux/tools/buildroot/toolchain_build_mips/gcc-3.4.2/configure
> --prefix=/vobs/linux/tools/buildroot/build_mips/staging_dir
> --build=i386-pc-linux-gnu
> --host=i386-pc-linux-gnu
> --target=mips-linux-uclibc
> --enable-languages=c,c++
> --enable-shared
> --disable-__cxa_atexit
> --enable-target-optspace
> -with-gnu-ld
> --disable-nls
> --enable-multilib
> Thread model: posix
> 
> gcc version 3.4.2

I figure it doesn't create such a zero access as shown in the example.

> >I'm inclined to claim the example is broken WRT ABI rules since it
> >doesn't enclose the whole user code in the prologue/epilogue bracket.
> >
> 
> It's o32. So it explicitly requires that when the prologue and
> epilogue have been used in the function, all the user code must be
> placed in between, right?

That's basically the definition of "prologue" and "epilogue".

> In this light, the unlikely(ofs == 0) check in unwind_stack() aims at
> catching cases where <sp> is wrong (if addiu sp,sp,OFFSET is normally
> the very first instruction).

Technically it is probably ok, since the o32 ABI covers only PIC code,
while the kernel is non-PIC.

> ok, here is an example from kernel/sched.o (the same compiler).
> 
> 00000098 <enqueue_task>:
>      98:       8c820018        lw      v0,24(a0)   <----- before the prologue
>      9c:       27bdfff8        addiu   sp,sp,-8
>      a0:       afbe0000        sw      s8,0(sp)
>      a4:       000210c0        sll     v0,v0,0x3
>      a8:       00a21021        addu    v0,a1,v0
>      ac:       24420018        addiu   v0,v0,24
>      b0:       8c460004        lw      a2,4(v0)
>      b4:       24830020        addiu   v1,a0,32
>      b8:       ac430004        sw      v1,4(v0)
>      bc:       ac820020        sw      v0,32(a0)
>      c0:       ac660004        sw      a2,4(v1)
>      c4:       acc30000        sw      v1,0(a2)
>      c8:       8c860018        lw      a2,24(a0)
>      cc:       24a70004        addiu   a3,a1,4
>      d0:       03a0f021        move    s8,sp
>      d4:       00061142        srl     v0,a2,0x5
>      d8:       00021080        sll     v0,v0,0x2
>      dc:       00e23821        addu    a3,a3,v0
>      e0:       8ce30000        lw      v1,0(a3)
>      e4:       30c6001f        andi    a2,a2,0x1f
>      e8:       24020001        li      v0,1
>      ec:       00c21004        sllv    v0,v0,a2
>      f0:       00621825        or      v1,v1,v0
>      f4:       ace30000        sw      v1,0(a3)
>      f8:       8ca20000        lw      v0,0(a1)
>      fc:       03c0e821        move    sp,s8
>     100:       8fbe0000        lw      s8,0(sp)
>     104:       24420001        addiu   v0,v0,1
>     108:       27bd0008        addiu   sp,sp,8
>     10c:       aca20000        sw      v0,0(a1)
>     110:       03e00008        jr      ra
>     114:       ac850028        sw      a1,40(a0)   <----- after the epilogue

It looks rather broken, given that the stack frame is only used to
pointlessly push s8 around. The compiler should have optimized it away.

> As far as I can see, this compiler normally places "addiu   sp,sp,FRAME_SIZE"
> in the branch delay slot of "jr ra", but e.g. enqueue_task() (example
> above) and request_task() are exceptions. btw, the very first
> instruction is also placed before the prologue.
> 
> Are there any configure options that might have caused such
> behaviour [hmmm... e.g. gcc was configured with --ignore-abi-rules :]?
> Although, I don't think this would be an option-dependent case.

Well, breakage happens from time to time in gcc. To cover such cases
it would be nice to have a more robust stack unwinder, but that's easier
said than done.


Thiemo


* Re: unwind_stack() and an exception at the last instruction (after the epilogue)
  2006-12-13 13:52       ` Thiemo Seufer
@ 2006-12-13 14:40         ` Dmitry Adamushko
  2006-12-13 16:16           ` Atsushi Nemoto
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Adamushko @ 2006-12-13 14:40 UTC
  To: Thiemo Seufer; +Cc: Ralf Baechle, linux-mips

> > gcc version 3.4.2
>
> I figure it doesn't create such a zero access as shown in the example.

the code in question intentionally dereferenced a NULL pointer.

the funny thing is that when it's like this :

#include <stdio.h>

void cause_oops(void)
{
        unsigned long *addr = NULL;

        printf("Let's crash...");     // (1)
        *addr = 0;                    // (2) intentional NULL store
}

the compiler (-g -Os) generates the code I sent before, i.e.
with "sw zero, 0(zero)" in the delay slot [see, the compiler is kind of
smart, as it eliminates the need to store "addr" on the stack :]
But if I swap (1) and (2), the generated code is different:

00401364 <cause_oops>:
  401364:       3c1c0fc0        lui     gp,0xfc0
  401368:       279c6cec        addiu   gp,gp,27884
  40136c:       0399e021        addu    gp,gp,t9
  401370:       8f84801c        lw      a0,-32740(gp)
  401374:       8f9980b0        lw      t9,-32592(gp)
  401378:       ac000000        sw      zero,0(zero)
  40137c:       03200008        jr      t9
  401380:       24842010        addiu   a0,a0,8208

So the "prologue" and "epilogue" are omitted, that's good.

>
> It looks rather broken, given that the stack frame is only used to
> pointlessly push s8 around. The compiler should have optimized it away.

Yes, all the "broken" functions (there are a few in sched.o) have at
least one thing in common: they don't use the stack at all, aside from
storing the frame pointer (s8).


> > Are there any configure options that might have caused such
> > behaviour [hmmm... e.g. gcc was configured with --ignore-abi-rules :]?
> > Although, I don't think this would be an option-dependent case.
>
> Well, breakage happens from time to time in gcc. To cover such cases
> it would be nice to have a more robust stack unwinder, but that's easier
> said than done.

Yep, but this would add additional complexity which is not that
necessary for the common path.

e.g. as we know the start and end address of the function
(ksyms_lookup_size_off()), it's possible to find out a position of the
"prologue" and "epilogue" (addiu sp,sp,SIZE - the same way it's done
in get_frame_info()) so we would know:

function_start (1), prologue_addr (2), epilogue_addr (3), function_end (4)

and this would cover the (broken) cases when <epc> is in [1, 2] or [3, 4]
as well as the cases when e.g. <sp> is broken in the prologue ?
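
The idea above can be sketched as follows. This is only an illustration under strong assumptions: one prologue, one epilogue, both expressed as "addiu sp,sp,imm", and the function body available as an array of instruction words. `sp_adjust()`, `classify_epc()` and the `epc_region` enum are hypothetical names, not existing kernel code.

```c
#include <stdint.h>
#include <stddef.h>

enum epc_region { BEFORE_PROLOGUE, IN_BODY, AFTER_EPILOGUE, NO_FRAME };

/* Matches "addiu sp,sp,imm" (opcode 9, rs = rt = 29); returns imm, else 0. */
static int sp_adjust(uint32_t insn)
{
	if ((insn & 0xffff0000u) != 0x27bd0000u)
		return 0;
	return (int16_t)(insn & 0xffff);
}

/*
 * Hypothetical classifier: code[] holds the n instruction words of one
 * function, idx is the index of the instruction at <epc>.
 */
static enum epc_region classify_epc(const uint32_t *code, size_t n, size_t idx)
{
	size_t prologue = n, epilogue = n;

	for (size_t i = 0; i < n; i++) {
		int d = sp_adjust(code[i]);

		if (d < 0 && prologue == n)
			prologue = i;		/* first frame allocation */
		else if (d > 0)
			epilogue = i;		/* last frame release */
	}
	if (prologue == n)
		return NO_FRAME;		/* no stack adjustment found */
	if (idx <= prologue)
		return BEFORE_PROLOGUE;		/* [1, 2]: sp not yet adjusted */
	if (epilogue != n && idx > epilogue)
		return AFTER_EPILOGUE;		/* [3, 4]: sp already restored */
	return IN_BODY;
}
```

In the BEFORE_PROLOGUE and AFTER_EPILOGUE cases <sp> is the caller's and the return address is still in ra, so an unwinder could treat them like the existing first-instruction case. The single-epilogue assumption is what makes this a sketch rather than a robust solution.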

Anyway, thanks for the conversation.


>
> Thiemo
>


-- 
Best regards,
Dmitry Adamushko


* Re: unwind_stack() and an exception at the last instruction (after the epilogue)
  2006-12-13 14:40         ` Dmitry Adamushko
@ 2006-12-13 16:16           ` Atsushi Nemoto
  2006-12-14  1:47             ` Ralf Baechle
  0 siblings, 1 reply; 7+ messages in thread
From: Atsushi Nemoto @ 2006-12-13 16:16 UTC
  To: dmitry.adamushko; +Cc: ths, ralf, linux-mips

On Wed, 13 Dec 2006 15:40:21 +0100, "Dmitry Adamushko" <dmitry.adamushko@gmail.com> wrote:
> e.g. as we know the start and end address of the function
> (ksyms_lookup_size_off()), it's possible to find out a position of the
> "prologue" and "epilogue" (addiu sp,sp,SIZE - the same way it's done
> in get_frame_info()) so we would know:
> 
> function_start (1), prologue_addr (2), epilogue_addr (3), function_end (4)
> 
> and this would cover the (broken) cases when <epc> is in [1, 2] or [3, 4]
> as well as the cases when e.g. <sp> is broken in the prologue ?

It would be hard because:

* A function can have multiple epilogues.
* gcc often moves the code of "if" blocks to the end of the function.

While the current unwind_stack() is not perfect, any attempt to make it
more robust is welcome.  But you might have to analyze _all_ the code if
you wanted to handle _all_ cases.  I think UNIX's "90% principle" is good
enough here.

BTW, enqueue_task() will not use the stack anymore since
SCHED_NO_NO_OMIT_FRAME_POINTER is defined.

---
Atsushi Nemoto


* Re: unwind_stack() and an exception at the last instruction (after the epilogue)
  2006-12-13 16:16           ` Atsushi Nemoto
@ 2006-12-14  1:47             ` Ralf Baechle
  0 siblings, 0 replies; 7+ messages in thread
From: Ralf Baechle @ 2006-12-14  1:47 UTC
  To: Atsushi Nemoto; +Cc: dmitry.adamushko, ths, linux-mips

On Thu, Dec 14, 2006 at 01:16:51AM +0900, Atsushi Nemoto wrote:

> While the current unwind_stack() is not perfect, any attempt to make it
> more robust is welcome.  But you might have to analyze _all_ the code if
> you wanted to handle _all_ cases.  I think UNIX's "90% principle" is good
> enough here.

If the current unwinder should ever become a problem we have the option
of the DWARF2-based unwinder as backup.

  Ralf
