* Error in save_stack_trace() on x86_64?
@ 2008-05-11 13:09 Vegard Nossum
2008-05-11 19:44 ` Arjan van de Ven
0 siblings, 1 reply; 9+ messages in thread
From: Vegard Nossum @ 2008-05-11 13:09 UTC (permalink / raw)
To: Ingo Molnar, Arjan van de Ven; +Cc: Linux Kernel Mailing List
Hi,
I am having a problem with v2.6.26-rc1 on x86_64. It seems that
save_stack_trace() is not able to follow page fault boundaries, since
all my saved traces look like this:
RIP: 0010:[<ffffffff8039b004>] [<ffffffff8039b004>] add_uevent_var+0xb4/0x160
...
[<ffffffff80221f97>] kmemcheck_read+0x127/0x1e0
[<ffffffff80222269>] kmemcheck_access+0x179/0x1d0
[<ffffffff8022231f>] kmemcheck_fault+0x5f/0x80
[<ffffffff8061cd1e>] do_page_fault+0x4de/0x8d0
[<ffffffff8061a7d9>] error_exit+0x0/0x51
[<ffffffffffffffff>] 0xffffffffffffffff
I have this in my .config:
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_STACKTRACE=y
...
CONFIG_FRAME_POINTER=y
...
CONFIG_DEBUG_INFO=y
On 32-bit, I am able to see the calls leading up to the page fault as
well. Did I miss something here?
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
* Re: Error in save_stack_trace() on x86_64?
2008-05-11 13:09 Error in save_stack_trace() on x86_64? Vegard Nossum
@ 2008-05-11 19:44 ` Arjan van de Ven
2008-05-11 19:56 ` Vegard Nossum
2008-05-18 17:13 ` Vegard Nossum
0 siblings, 2 replies; 9+ messages in thread
From: Arjan van de Ven @ 2008-05-11 19:44 UTC (permalink / raw)
To: Vegard Nossum; +Cc: Ingo Molnar, Linux Kernel Mailing List
Vegard Nossum wrote:
> Hi,
>
> I am having a problem with v2.6.26-rc1 on x86_64. It seems that
> save_stack_trace() is not able to follow page fault boundaries, since
> all my saved traces look like this:
>
> RIP: 0010:[<ffffffff8039b004>] [<ffffffff8039b004>] add_uevent_var+0xb4/0x160
> ...
> [<ffffffff80221f97>] kmemcheck_read+0x127/0x1e0
> [<ffffffff80222269>] kmemcheck_access+0x179/0x1d0
> [<ffffffff8022231f>] kmemcheck_fault+0x5f/0x80
> [<ffffffff8061cd1e>] do_page_fault+0x4de/0x8d0
> [<ffffffff8061a7d9>] error_exit+0x0/0x51
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> I have this in my .config:
>
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_STACKTRACE=y
> ...
> CONFIG_FRAME_POINTER=y
> ...
> CONFIG_DEBUG_INFO=y
>
>
> On 32-bit, I am able to see the calls leading up to the page fault as
> well. Did I miss something here?
can you give an example?
if a pagefault happens in userspace this trace looks correct.
if it happens in kernel space... I wonder if the separate exception stack thing
is hurting us with the stacks not being properly connected...
(but oopses and the like seem to come out just fine so I kinda doubt you're hitting that)
* Re: Error in save_stack_trace() on x86_64?
2008-05-11 19:44 ` Arjan van de Ven
@ 2008-05-11 19:56 ` Vegard Nossum
2008-05-11 19:58 ` Arjan van de Ven
2008-05-18 17:13 ` Vegard Nossum
1 sibling, 1 reply; 9+ messages in thread
From: Vegard Nossum @ 2008-05-11 19:56 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Ingo Molnar, Linux Kernel Mailing List
On Sun, May 11, 2008 at 9:44 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> Vegard Nossum wrote:
>
> > Hi,
> >
> > I am having a problem with v2.6.26-rc1 on x86_64. It seems that
> > save_stack_trace() is not able to follow page fault boundaries, since
> > all my saved traces look like this:
> >
> > RIP: 0010:[<ffffffff8039b004>] [<ffffffff8039b004>] add_uevent_var+0xb4/0x160
> > ...
> > [<ffffffff80221f97>] kmemcheck_read+0x127/0x1e0
> > [<ffffffff80222269>] kmemcheck_access+0x179/0x1d0
> > [<ffffffff8022231f>] kmemcheck_fault+0x5f/0x80
> > [<ffffffff8061cd1e>] do_page_fault+0x4de/0x8d0
> > [<ffffffff8061a7d9>] error_exit+0x0/0x51
> > [<ffffffffffffffff>] 0xffffffffffffffff
> >
> > I have this in my .config:
> >
> > CONFIG_STACKTRACE_SUPPORT=y
> > CONFIG_STACKTRACE=y
> > ...
> > CONFIG_FRAME_POINTER=y
> > ...
> > CONFIG_DEBUG_INFO=y
> >
> >
> > On 32-bit, I am able to see the calls leading up to the page fault as
> > well. Did I miss something here?
> >
>
> can you give an example?
This is a similarly saved 32-bit backtrace:
[<c0119101>] kmemcheck_read+0xd1/0x160
[<c01192c6>] kmemcheck_access+0x136/0x1a0
[<c04bb206>] do_page_fault+0x5e6/0x690
[<c04b925a>] error_code+0x72/0x78
[<c012d751>] sysctl_set_parent+0x21/0x40
[<c012d751>] sysctl_set_parent+0x21/0x40
[<c012d751>] sysctl_set_parent+0x21/0x40
[<c012d751>] sysctl_set_parent+0x21/0x40
[<c012e9c8>] __register_sysctl_paths+0xb8/0x120
[<c0497cdf>] register_net_sysctl_table+0x4f/0x60
[<c040ba36>] neigh_sysctl_register+0x1a6/0x290
[<c0695734>] arp_init+0x54/0x60
[<c0695ba7>] inet_init+0x107/0x340
[<c066f5c7>] kernel_init+0x127/0x290
[<c0104cc7>] kernel_thread_helper+0x7/0x10
[<ffffffff>] 0xffffffff
>
> if a pagefault happens in userspace this trace looks correct.
No, it is happening from kernel code. As you can see from the original
backtrace, the regs->ip (RIP) (regs taken from the very same
do_page_fault()) points at add_uevent_var, which is a kernel function.
>
> if it happens in kernel space... I wonder if the separate exception stack thing
> is hurting us with the stacks not being properly connected...
> (but oopses and the like seem to come out just fine so I kinda doubt you're hitting that)
>
Thanks for looking into this.
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
* Re: Error in save_stack_trace() on x86_64?
2008-05-11 19:56 ` Vegard Nossum
@ 2008-05-11 19:58 ` Arjan van de Ven
0 siblings, 0 replies; 9+ messages in thread
From: Arjan van de Ven @ 2008-05-11 19:58 UTC (permalink / raw)
To: Vegard Nossum; +Cc: Ingo Molnar, Linux Kernel Mailing List
Vegard Nossum wrote:
>> if a pagefault happens in userspace this trace looks correct.
>
> No, it is happening from kernel code. As you can see from the original
> backtrace, the regs->ip (RIP) (regs taken from the very same
> do_page_fault()) points at add_uevent_var, which is a kernel function.
>
>> if it happens in kernel space... I wonder if the separate exception stack thing
>> is hurting us with the stacks not being properly connected...
>> (but oopses and the like seem to come out just fine so I kinda doubt you're hitting that)
>>
>
> Thanks for looking into this.
>
do you happen to have something that I maybe can reproduce?
(if you have that it would save me a ton of time in reproducing)
>
> Vegard
>
* Re: Error in save_stack_trace() on x86_64?
@ 2008-05-11 20:46 Vegard Nossum
0 siblings, 0 replies; 9+ messages in thread
From: Vegard Nossum @ 2008-05-11 20:46 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Ingo Molnar, Linux Kernel Mailing List
On Sun, May 11, 2008 at 9:58 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> Vegard Nossum wrote:
> > > if a pagefault happens in userspace this trace looks correct.
>
> > No, it is happening from kernel code. As you can see from the original
> > backtrace, the regs->ip (RIP) (regs taken from the very same
> > do_page_fault()) points at add_uevent_var, which is a kernel function.
> >
> >
> > > if it happens in kernel space... I wonder if the separate exception stack thing
> > > is hurting us with the stacks not being properly connected...
> > > (but oopses and the like seem to come out just fine so I kinda doubt you're hitting that)
> > >
> > >
> >
> > Thanks for looking into this.
>
> do you happen to have something that I maybe can reproduce?
> (if you have that it would save me a ton of time in reproducing)
>
Here's a very dirty hack that reproduces it for me. I'm sorry I don't have
the logs to show it; it works correctly on 32-bit but not on 64-bit.
I guess you should also make sure that:
CONFIG_DEBUG_INFO=y
CONFIG_FRAME_POINTER=y
CONFIG_STACKTRACE=y
Thanks.
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
arch/x86/mm/fault.c | 17 +++++++++++++++++
init/main.c | 12 ++++++++++++
lib/Kconfig.debug | 1 +
3 files changed, 30 insertions(+), 0 deletions(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fd7e179..559a5f3 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -25,6 +25,7 @@
#include <linux/kprobes.h>
#include <linux/uaccess.h>
#include <linux/kdebug.h>
+#include <linux/stacktrace.h>
#include <asm/system.h>
#include <asm/desc.h>
@@ -602,6 +603,22 @@ void __kprobes do_page_fault(struct pt_regs *regs, unsigned long error_code)
if (notify_page_fault(regs))
return;
+ dump_stack();
+
+ struct stack_trace trace;
+ unsigned long entries[16];
+ trace.nr_entries = 0;
+ trace.entries = entries;
+ trace.max_entries = 16;
+ trace.skip = 0;
+ printk(KERN_EMERG "saved stack trace:\n");
+ save_stack_trace(&trace);
+ print_stack_trace(&trace, 0);
+ printk(KERN_EMERG "end saved stack trace\n");
+
+ while(1)
+ halt();
+
/*
* We fault-in kernel-space virtual memory on-demand. The
* 'reference' page table is init_mm.pgd.
diff --git a/init/main.c b/init/main.c
index ddada7a..d9fb240 100644
--- a/init/main.c
+++ b/init/main.c
@@ -531,6 +531,16 @@ void __init __weak thread_info_cache_init(void)
{
}
+void noinline trigger_page_fault(void) {
+ struct page *p = alloc_pages(GFP_KERNEL, 0);
+ unsigned long addr = (unsigned long) page_address(p);
+ set_memory_4k(addr, 0);
+ int level;
+ pte_t *pte = lookup_address(addr, &level);
+ set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
+ *(char*) addr = 0;
+}
+
asmlinkage void __init start_kernel(void)
{
char * command_line;
@@ -680,6 +690,8 @@ asmlinkage void __init start_kernel(void)
acpi_early_init(); /* before LAPIC and SMP init */
+ trigger_page_fault();
+
/* Do the rest non-__init'ed, we're now alive */
rest_init();
}
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index d2099f4..3fc1247 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -419,6 +419,7 @@ config DEBUG_LOCKING_API_SELFTESTS
config STACKTRACE
bool
+ default y
depends on DEBUG_KERNEL
depends on STACKTRACE_SUPPORT
--
1.5.4.1
* Re: Error in save_stack_trace() on x86_64?
2008-05-11 19:44 ` Arjan van de Ven
2008-05-11 19:56 ` Vegard Nossum
@ 2008-05-18 17:13 ` Vegard Nossum
2008-05-18 18:31 ` Vegard Nossum
1 sibling, 1 reply; 9+ messages in thread
From: Vegard Nossum @ 2008-05-18 17:13 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Ingo Molnar, Linux Kernel Mailing List, Pekka Enberg
Hi,
On Sun, May 11, 2008 at 9:44 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> Vegard Nossum wrote:
>>
>> I am having a problem with v2.6.26-rc1 on x86_64. It seems that
>> save_stack_trace() is not able to follow page fault boundaries, since
>> all my saved traces look like this:
>>
>> RIP: 0010:[<ffffffff8039b004>] [<ffffffff8039b004>] add_uevent_var+0xb4/0x160
>> ...
>> [<ffffffff80221f97>] kmemcheck_read+0x127/0x1e0
>> [<ffffffff80222269>] kmemcheck_access+0x179/0x1d0
>> [<ffffffff8022231f>] kmemcheck_fault+0x5f/0x80
>> [<ffffffff8061cd1e>] do_page_fault+0x4de/0x8d0
>> [<ffffffff8061a7d9>] error_exit+0x0/0x51
>> [<ffffffffffffffff>] 0xffffffffffffffff
...
>>
>> On 32-bit, I am able to see the calls leading up to the page fault as
>> well. Did I miss something here?
>
> can you give an example?
>
> if a pagefault happens in userspace this trace looks correct.
>
> if it happens in kernel space... I wonder if the separate exception stack thing
> is hurting us with the stacks not being properly connected...
> (but oopses and the like seem to come out just fine so I kinda doubt you're hitting that)
Okay, this is slightly embarrassing. I made a new test; here's the output:
dump_stack():
[<ffffffff8062b021>] do_page_fault+0x31/0x70
[<ffffffff80224195>] ? cpa_fill_pool+0x135/0x140
[<ffffffff80224c40>] ? change_page_attr_set_clr+0x1c0/0x220
[<ffffffff80220a21>] ? address_get_pte+0x11/0x30
[<ffffffff80628fb9>] error_exit+0x0/0x51
[<ffffffff8028655a>] ? __slab_alloc+0x35a/0x560
[<ffffffff80286556>] ? __slab_alloc+0x356/0x560
[<ffffffff80386535>] ? kvasprintf+0x55/0x90
[<ffffffff80287809>] ? __kmalloc+0xf9/0x110
[<ffffffff80386535>] ? kvasprintf+0x55/0x90
[<ffffffff8038660b>] ? kasprintf+0x9b/0xa0
[<ffffffff802898ba>] ? create_kmalloc_cache+0xaa/0xe0
[<ffffffff80898193>] ? kmem_cache_init+0xf3/0x170
[<ffffffff80882b35>] ? start_kernel+0x245/0x340
[<ffffffff80882457>] ? x86_64_start_kernel+0x257/0x290
save_stack_trace()/print_stack_trace():
[<ffffffff80213eca>] save_stack_trace+0x2a/0x50
[<ffffffff8062b049>] do_page_fault+0x59/0x70
[<ffffffff80628fb9>] error_exit+0x0/0x51
[<ffffffffffffffff>] 0xffffffffffffffff
What now seems immediately clear is that the difference is that
the latter omits the unreliable stack frames. Which reminds me
that *I* was the person who submitted the patch to do that:
commit 1650743cdc0db73478f72c57544ce79ea8f3dda6
Author: Vegard Nossum <vegard.nossum@gmail.com>
Date: Fri Feb 22 19:23:58 2008 +0100
x86: don't save unreliable stack trace entries
Currently, there is no way for print_stack_trace() to determine whether
a given stack trace entry was deemed reliable or not, simply because
save_stack_trace() does not record this information. (Perhaps needless
to say, this makes the saved stack traces A LOT harder to read, and
probably with no other benefits, since debugging features that use
save_stack_trace() most likely also require frame pointers, etc.)
This patch reverts to the old behaviour of only recording the reliable trace
entries for saved stack traces.
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Still, this seems to be the better behaviour (that patch should not be
reverted), and I think it's the tracer itself that should be fixed so
that it does not mark these entries as unreliable; the 32-bit version
apparently gets this right.
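The effect of that commit can be illustrated with a small userspace sketch. It is not the kernel implementation: `struct candidate`, `struct saved_trace`, `save_reliable()` and `TRACE_END` are names made up for this example, and the end-of-trace marker is assumed to be an all-ones word, matching the 0xffffffffffffffff entry that closes the saved traces above. Only entries flagged as reliable are copied, which would explain why the saved trace stops at error_exit while dump_stack() keeps printing the entries marked with '?'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one return address plus the unwinder's verdict. */
struct candidate {
	uint64_t addr;
	int reliable;	/* nonzero if reached through the frame-pointer chain */
};

/* Userspace stand-in for a saved trace buffer (illustrative layout). */
struct saved_trace {
	unsigned int nr_entries;
	unsigned int max_entries;
	uint64_t *entries;
};

/* Assumed end-of-trace marker: an all-ones word. */
#define TRACE_END UINT64_MAX

/* Copy only the reliable candidates, then append the end marker if it fits. */
void save_reliable(struct saved_trace *t, const struct candidate *c, size_t n)
{
	assert(t != NULL && t->entries != NULL);

	for (size_t i = 0; i < n && t->nr_entries < t->max_entries; i++) {
		if (c[i].reliable)
			t->entries[t->nr_entries++] = c[i].addr;
	}
	if (t->nr_entries < t->max_entries)
		t->entries[t->nr_entries++] = TRACE_END;
}
```

Feeding it the first six dump_stack() entries above, with only do_page_fault and error_exit flagged as reliable, leaves three saved entries: the two reliable addresses followed by the end marker.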
I did send a patch in February that would allow the reliability of
frames to be saved along with the frames themselves, though it got no
replies:
http://lkml.org/lkml/2008/2/23/173
Would you reconsider this patch, or provide some feedback if it needs
to be improved? In the meantime, I will make some attempts at making
the pre-pagefault frames be seen as reliable :-)
Thanks.
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
* Re: Error in save_stack_trace() on x86_64?
2008-05-18 17:13 ` Vegard Nossum
@ 2008-05-18 18:31 ` Vegard Nossum
2008-05-18 18:52 ` Arjan van de Ven
0 siblings, 1 reply; 9+ messages in thread
From: Vegard Nossum @ 2008-05-18 18:31 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Ingo Molnar, Linux Kernel Mailing List, Pekka Enberg
Hi,
On Sun, May 18, 2008 at 7:13 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> to be improved? In the meantime, I will make some attempts at making
> the pre-pagefault frames be seen as reliable :-)
FYI, the stack looks like this:
Call Trace:
ffffffff80877ba8: 0000000000000000
ffffffff80877bb0: ffffffff80877c68
ffffffff80877bb8: ffff810007801000
ffffffff80877bc0: ffff810007801000
ffffffff80877bc8: ffffffff80877c98
ffffffff80877bd0: ffffffff8062b061
[<ffffffff8062b061>] do_page_fault+0x31/0x70
ffffffff80877bd8: 0000000000000000
ffffffff80877be0: 0000000000000000
ffffffff80877be8: 0000000000000000
ffffffff80877bf0: 00000000000000d0
ffffffff80877bf8: 0000000000000000
ffffffff80877c00: 0000000000000000
ffffffff80877c08: ffffffff80877c18
ffffffff80877c10: 0000000000000246
ffffffff80877c18: ffffffff80877c48
ffffffff80877c20: ffffffff802241d5
[<ffffffff802241d5>] ? cpa_fill_pool+0x135/0x140
ffffffff80877c28: ffff810000000000
ffffffff80877c30: 0000000000000000
ffffffff80877c38: ffff810000000000
ffffffff80877c40: ffff810007801000
ffffffff80877c48: ffffffff80877cc8
ffffffff80877c50: ffffffff80224c80
[<ffffffff80224c80>] ? change_page_attr_set_clr+0x1c0/0x220
ffffffff80877c58: 00000001807dc380
ffffffff80877c60: ffffffff00000001
ffffffff80877c68: ffff810007802000
ffffffff80877c70: 0000000000000000
ffffffff80877c78: ffffffff80877c98
ffffffff80877c80: ffffffff80220a61
[<ffffffff80220a61>] ? address_get_pte+0x11/0x30
ffffffff80877c88: 0000000000007801
ffffffff80877c90: 0000000000000001
ffffffff80877c98: 000000008020bb59
ffffffff80877ca0: ffffffff80628ff9
[<ffffffff80628ff9>] error_exit+0x0/0x51
ffffffff80877ca8: ffffe200001e0040
ffffffff80877cb0: ffffffff80867c20
ffffffff80877cb8: ffff810007801000
ffffffff80877cc0: ffff810007801000
ffffffff80877cc8: ffffffff80877d98
ffffffff80877cd0: ffff810007801000
ffffffff80877cd8: 0000000000001000
ffffffff80877ce0: ffff810007802000
ffffffff80877ce8: 0000000000000000
ffffffff80877cf0: ffffffff807dc42a
ffffffff80877cf8: 0000000000000000
ffffffff80877d00: 0000000000000000
ffffffff80877d08: ffff810007801000
ffffffff80877d10: ffffe200001e0040
ffffffff80877d18: ffffffff80867c20
ffffffff80877d20: ffffffffffffffff
ffffffff80877d28: ffffffff8028659a
[<ffffffff8028659a>] ? __slab_alloc+0x35a/0x560
ffffffff80877d30: 0000000000000010
ffffffff80877d38: 0000000000000246
ffffffff80877d40: ffffffff80877d58
ffffffff80877d48: 0000000000000018
ffffffff80877d50: ffffffff80286596
[<ffffffff80286596>] ? __slab_alloc+0x356/0x560
ffffffff80877d58: 0000000000000002
ffffffff80877d60: ffffffff80386575
[<ffffffff80386575>] ? kvasprintf+0x55/0x90
ffffffff80877d68: 000000d0ffffffff
ffffffff80877d70: 00000000000000d0
ffffffff80877d78: 0000000000000282
ffffffff80877d80: ffff810001008820
ffffffff80877d88: ffffffff80867c20
ffffffff80877d90: 00000000000000d0
ffffffff80877d98: ffffffff80877dd8
ffffffff80877da0: ffffffff80287849
[<ffffffff80287849>] ? __kmalloc+0xf9/0x110
ffffffff80877da8: ffffffff8074d893
ffffffff80877db0: 00000000000000d0
ffffffff80877db8: ffffffff80877e38
ffffffff80877dc0: 000000000000000a
ffffffff80877dc8: ffffffff8074d889
ffffffff80877dd0: 0000000000092e80
ffffffff80877dd8: ffffffff80877e28
ffffffff80877de0: ffffffff80386575
[<ffffffff80386575>] ? kvasprintf+0x55/0x90
ffffffff80877de8: 0000003000000018
ffffffff80877df0: ffffffff80877f18
ffffffff80877df8: ffffffff80877e58
ffffffff80877e00: 0000000000000001
ffffffff80877e08: 0000000000000004
ffffffff80877e10: ffffffff80867ac0
ffffffff80877e18: 0000000000000001
ffffffff80877e20: ffffffff80877fa8
ffffffff80877e28: ffffffff80877f08
ffffffff80877e30: ffffffff8038664b
[<ffffffff8038664b>] ? kasprintf+0x9b/0xa0
ffffffff80877e38: 0000003000000010
ffffffff80877e40: ffffffff80877f18
ffffffff80877e48: ffffffff80877e58
ffffffff80877e50: 0000000000000000
ffffffff80877e58: ffffffff8074d881
ffffffff80877e60: 0000000000000008
ffffffff80877e68: 0000000000000008
ffffffff80877e70: 0000000000000003
ffffffff80877e78: 0000000000000001
ffffffff80877e80: ffffffff808696c0
ffffffff80877e88: 0000000000000286
ffffffff80877e90: 0000000000000000
ffffffff80877e98: 00000000000000d0
ffffffff80877ea0: 0000000000000000
ffffffff80877ea8: 00000000000000d0
ffffffff80877eb0: ffffffff80868b60
ffffffff80877eb8: 0000000000001000
ffffffff80877ec0: ffffffff8074d881
ffffffff80877ec8: ffffffff80877f08
ffffffff80877ed0: ffffffff802898fa
[<ffffffff802898fa>] ? create_kmalloc_cache+0xaa/0xe0
ffffffff80877ed8: 0000000000000000
ffffffff80877ee0: 000000000000000d
ffffffff80877ee8: ffffffff80868d48
ffffffff80877ef0: 0000000000000001
ffffffff80877ef8: ffffffff80877fa8
ffffffff80877f00: 0000000000092e80
ffffffff80877f08: ffffffff80877f48
ffffffff80877f10: ffffffff80898193
[<ffffffff80898193>] ? kmem_cache_init+0xf3/0x170
ffffffff80877f18: 000000000000012c
ffffffff80877f20: 0000000000000000
ffffffff80877f28: 0000000000000000
ffffffff80877f30: 0000000000000000
ffffffff80877f38: ffffffff808b14c0
ffffffff80877f40: ffffffff808d8200
ffffffff80877f48: ffffffff80877f78
ffffffff80877f50: ffffffff80882b35
[<ffffffff80882b35>] ? start_kernel+0x245/0x340
ffffffff80877f58: ffffffff80877f78
ffffffff80877f60: ffffffff808b14c0
ffffffff80877f68: 0000000000b46a30
ffffffff80877f70: 0000000000000000
ffffffff80877f78: ffffffff80877fe8
ffffffff80877f80: ffffffff80882457
[<ffffffff80882457>] ? x86_64_start_kernel+0x257/0x290
ffffffff80877f88: 0000000000000000
ffffffff80877f90: 0000000000000000
ffffffff80877f98: 0000000000000000
ffffffff80877fa0: 0000000000000000
ffffffff80877fa8: 80888e0000102136
ffffffff80877fb0: 00000000ffffffff
ffffffff80877fb8: 0000000000000000
ffffffff80877fc0: 0000000000000000
ffffffff80877fc8: 0000000000000000
ffffffff80877fd0: 0000000000000000
ffffffff80877fd8: 0000000000000000
ffffffff80877fe0: 0000000000000000
ffffffff80877fe8: 0000000000000000
ffffffff80877ff0: 0000000000000000
Using a simple debug patch:
diff --git a/arch/x86/kernel/traps_64.c b/arch/x86/kernel/traps_64.c
index 72923ba..e33fd8f 100644
--- a/arch/x86/kernel/traps_64.c
+++ b/arch/x86/kernel/traps_64.c
@@ -244,6 +244,8 @@ static inline unsigned long print_context_stack(struct thread_info *tinfo,
while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) {
unsigned long addr;
+printk(KERN_EMERG "%p: %p\n", stack, (void *) *stack);
+
addr = *stack;
if (__kernel_text_address(addr)) {
if ((unsigned long) stack == bp + 8) {
Is the error obvious from the stack-trace I posted above? This is not
really my field, so I might easily miss it :-)
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
* Re: Error in save_stack_trace() on x86_64?
2008-05-18 18:31 ` Vegard Nossum
@ 2008-05-18 18:52 ` Arjan van de Ven
2008-05-18 20:23 ` Vegard Nossum
0 siblings, 1 reply; 9+ messages in thread
From: Arjan van de Ven @ 2008-05-18 18:52 UTC (permalink / raw)
To: Vegard Nossum; +Cc: Ingo Molnar, Linux Kernel Mailing List, Pekka Enberg
On Sun, 18 May 2008 20:31:18 +0200, Vegard Nossum wrote:
>
> Is the error obvious from the stack-trace I posted above? This is not
> really my field, so I might easily miss it :-)
unfortunately I don't really have time today to take a detailed look
(social obligations), but the trick is to follow where EBP (rBP) is
going...
* Re: Error in save_stack_trace() on x86_64?
2008-05-18 18:52 ` Arjan van de Ven
@ 2008-05-18 20:23 ` Vegard Nossum
0 siblings, 0 replies; 9+ messages in thread
From: Vegard Nossum @ 2008-05-18 20:23 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Ingo Molnar, H. Peter Anvin, Linux Kernel Mailing List,
Pekka Enberg
On Sun, May 18, 2008 at 8:52 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> On Sun, 18 May 2008 20:31:18 +0200
>>
>> Is the error obvious from the stack-trace I posted above? This is not
>> really my field, so I might easily miss it :-)
>
> unfortunately I don't really have time today to take a detailed look
> (social obligations), but the trick is to follow where EBP (rBP) is
> going...
That's perfectly okay, I regard all help as a bonus :-D
There seems to be something odd going on here:
ffffffff80877bc8: ffffffff80877c98 <--- this points to the frame below
ffffffff80877bd0: ffffffff8062b061
[<ffffffff8062b061>] do_page_fault+0x31/0x70
<...>
ffffffff80877c88: 0000000000007801
ffffffff80877c90: 0000000000000001
ffffffff80877c98: 000000008020bb59 <--- but this pointer is invalid!
ffffffff80877ca0: ffffffff80628ff9
[<ffffffff80628ff9>] error_exit+0x0/0x51
And the invalid pointer should have been ffffffff80877f48:
ffffffff80877f48: ffffffff80877f78 <---
ffffffff80877f50: ffffffff80882b35
[<ffffffff80882b35>] ? start_kernel+0x245/0x340
which is where the page fault came from.
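The walk described above can be reproduced outside the kernel. The following is a minimal sketch, not the kernel's unwinder: `struct slot`, `read_slot()`, `walk_frames()` and `thread_dump[]` are hypothetical names, the slot values are copied from the dump in this message, and the stack bounds passed by the caller are an assumption (for instance an 8 KiB stack ending at 0xffffffff80878000). It follows each saved %rbp and collects the return address stored 8 bytes above it, stopping once the next frame pointer leaves the stack or fails to move upward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 8-byte stack slot from the dump: its address and the value stored there. */
struct slot {
	uint64_t addr;
	uint64_t value;
};

/* The three frames quoted in this message (values copied from the dump). */
const struct slot thread_dump[] = {
	{ 0xffffffff80877bc8ULL, 0xffffffff80877c98ULL }, /* saved %rbp */
	{ 0xffffffff80877bd0ULL, 0xffffffff8062b061ULL }, /* do_page_fault+0x31 */
	{ 0xffffffff80877c98ULL, 0x000000008020bb59ULL }, /* not a stack address */
	{ 0xffffffff80877ca0ULL, 0xffffffff80628ff9ULL }, /* error_exit+0x0 */
	{ 0xffffffff80877f48ULL, 0xffffffff80877f78ULL }, /* saved %rbp */
	{ 0xffffffff80877f50ULL, 0xffffffff80882b35ULL }, /* start_kernel+0x245 */
};
#define THREAD_DUMP_LEN (sizeof(thread_dump) / sizeof(thread_dump[0]))

/* Look up one slot; returns 1 and stores the value if addr is in the dump. */
int read_slot(const struct slot *dump, size_t n, uint64_t addr, uint64_t *out)
{
	for (size_t i = 0; i < n; i++) {
		if (dump[i].addr == addr) {
			*out = dump[i].value;
			return 1;
		}
	}
	return 0;
}

/*
 * Follow the frame-pointer chain starting at bp. The saved caller %rbp is
 * read from bp and the return address from bp + 8. Returns the number of
 * return addresses written to ret[].
 */
size_t walk_frames(const struct slot *dump, size_t n, uint64_t bp,
		   uint64_t lo, uint64_t hi, uint64_t *ret, size_t max)
{
	size_t count = 0;

	assert(dump != NULL && ret != NULL);
	while (count < max && bp >= lo && bp < hi) {
		uint64_t next_bp, ret_addr;

		if (!read_slot(dump, n, bp, &next_bp) ||
		    !read_slot(dump, n, bp + 8, &ret_addr))
			break;
		ret[count++] = ret_addr;
		if (next_bp <= bp)	/* chain must move to higher addresses */
			break;
		bp = next_bp;
	}
	return count;
}
```

Starting at 0xffffffff80877bc8, the walk collects the return addresses into do_page_fault and error_exit and then stops, because 000000008020bb59 is not a stack address. The frame at ffffffff80877f48 is never reached.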
It seems to me that do_page_fault() was called with error_exit as its
return address, but that do_page_fault() either did not push %rbp or
the saved value was overwritten later.
(But how can it then be restored correctly when the function returns?)
I think this is the relevant code (from arch/x86/kernel/entry_64.S):
movq ORIG_RAX(%rsp),%rsi /* get error code */
movq $-1,ORIG_RAX(%rsp)
call *%rax
/* ebx: no swapgs flag (1: don't need swapgs, 0: need it) */
error_exit:
movl %ebx,%eax
RESTORE_REST
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
where that "call *%rax" would push the (return) address of error_exit
on the stack and go into do_page_fault().
It seems that do_page_fault() is doing the right thing, however:
ffffffff8062b030 <do_page_fault>:
ffffffff8062b030: 55 push %rbp
ffffffff8062b031: 48 89 e5 mov %rsp,%rbp
ffffffff8062b034: 53 push %rbx
ffffffff8062b035: 48 81 ec b8 00 00 00 sub $0xb8,%rsp
so my current theory is that the entry is overwritten later.
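What the call and the prologue above do to the stack can be modelled with a toy machine. This is a sketch under stated assumptions, not kernel code: `struct toy_cpu`, `slot_at()`, `toy_push()`, `toy_call()` and `toy_prologue()` are invented for illustration, and the starting %rsp is inferred from the dump (the return address into error_exit sits at ffffffff80877ca0, so %rsp would have been 8 bytes higher before the call). It shows that after `push %rbp; mov %rsp,%rbp` the caller's %rbp is stored at the new %rbp and the return address at %rbp + 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy register file plus a small window of stack memory. */
struct toy_cpu {
	uint64_t rsp;
	uint64_t rbp;
	uint64_t base;			/* lowest address backed by mem[] */
	uint64_t mem[STACK_WORDS];
};

/* Map a stack address to its backing word; the address must be in range. */
uint64_t *slot_at(struct toy_cpu *cpu, uint64_t addr)
{
	assert(addr >= cpu->base && addr < cpu->base + 8 * STACK_WORDS);
	assert((addr - cpu->base) % 8 == 0);
	return &cpu->mem[(addr - cpu->base) / 8];
}

/* push: the stack grows down by one 8-byte word, then the value is stored. */
void toy_push(struct toy_cpu *cpu, uint64_t value)
{
	cpu->rsp -= 8;
	*slot_at(cpu, cpu->rsp) = value;
}

/* call: pushes the return address (the jump itself is not modelled). */
void toy_call(struct toy_cpu *cpu, uint64_t return_addr)
{
	toy_push(cpu, return_addr);
}

/* Prologue: push %rbp; mov %rsp,%rbp */
void toy_prologue(struct toy_cpu *cpu)
{
	toy_push(cpu, cpu->rbp);
	cpu->rbp = cpu->rsp;
}
```

If %rbp had held ffffffff80877f48 on entry, the slot at ffffffff80877c98 would contain that value. The dump shows 000000008020bb59 there instead, so either the slot was overwritten afterwards or %rbp already held that value when do_page_fault() was entered; the model cannot distinguish the two.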
So what is the value 000000008020bb59 (from the erroneous stack entry)?
It certainly looks like the lower half of an address to me.
And indeed, looking this up gives me:
ffffffff8020bb59 <irq_return>:
ffffffff8020bb59: 48 cf iretq
Strange!
In any case, I think we can safely assume that the stack tracer itself
is perfectly okay, and that the error is actually in how the stack is
handled just before/after the actual call to do_page_fault(). Does
anybody actually know how this code all works? It is admittedly
probably not the most critical error in the kernel, but it would be
nice to have this sorted out. Ingo, hpa...?
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036