From: Ingo Molnar <mingo@kernel.org>
To: Richard Yao <ryao@gentoo.org>
Cc: "Thomas Gleixner" <tglx@linutronix.de>,
"Ingo Molnar" <mingo@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>,
x86@kernel.org, "Andrew Morton" <akpm@linux-foundation.org>,
"Tejun Heo" <tj@kernel.org>, "Vineet Gupta" <vgupta@synopsys.com>,
"Jesper Nilsson" <jesper.nilsson@axis.com>,
"Jiri Slaby" <jslaby@suse.cz>,
linux-kernel@vger.kernel.org, kernel@gentoo.org,
"Brian Behlendorf" <behlendorf1@llnl.gov>,
"Linus Torvalds" <torvalds@linux-foundation.org>,
"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Arnaldo Carvalho de Melo" <acme@infradead.org>,
"Jiri Olsa" <jolsa@redhat.com>
Subject: Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
Date: Sun, 27 Apr 2014 14:08:20 +0200 [thread overview]
Message-ID: <20140427120820.GC22116@gmail.com> (raw)
In-Reply-To: <1398535818-14217-1-git-send-email-ryao@gentoo.org>
* Richard Yao <ryao@gentoo.org> wrote:
> Stack traces are generated by scanning the stack and interpeting
> anything that looks like it could be a pointer to something. We do
> not need to do this when we have frame pointers, but we do it
> anyway, with the distinction that we use the return pointers to mark
> actual frames by the absence of a question mark.
>
> The additional verbosity of stack scanning tends to bombard us with
> walls of text for no gain in practice, so lets switch to printing
> only stack frames when frame pointers are available. That we can
> spend less time reading stack traces and more time looking at code.
>
> Signed-off-by: Richard Yao <ryao@gentoo.org>
> ---
> arch/x86/kernel/dumpstack.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> index d9c12d3..94ffe06 100644
> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -162,7 +162,11 @@ static void print_trace_address(void *data, unsigned long addr, int reliable)
> static const struct stacktrace_ops print_trace_ops = {
> .stack = print_trace_stack,
> .address = print_trace_address,
> +#ifdef CONFIG_FRAME_POINTER
> + .walk_stack = print_context_stack_bp,
> +#else
> .walk_stack = print_context_stack,
> +#endif
> };
I don't really like this patch, and this question comes up often, so
let me try to explain it more verbosely.
Lets take a real-life example of a BP-only trace about a lockup:
Call Trace:
[<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
[<7916a34c>] do_raw_spin_lock+0xb7/0xe8
[<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
[<790391e8>] prepare_to_wait+0x18/0x57
[<792b7158>] __wait_on_bit+0x20/0x5f
[<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
[<790e4eff>] __wait_on_buffer+0x1b/0x1e
[<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
[<7911015f>] ext3_iget+0x45/0x346
[<79114e03>] ext3_lookup+0x97/0xa6
[<790c9c0e>] lookup_real+0x20/0x35
[<790ca1d2>] __lookup_hash+0x2a/0x31
[<790caa0e>] lookup_slow+0x36/0x8c
[<790cbdda>] path_lookupat+0xf9/0x5c5
[<790cc2c5>] filename_lookup+0x1f/0x84
[<790cde98>] user_path_at_empty+0x3f/0x65
[<790cdecb>] user_path_at+0xd/0xf
[<790c6585>] vfs_fstatat+0x40/0x88
[<790c65f8>] vfs_stat+0x13/0x15
[<790c67e2>] sys_stat64+0x11/0x22
[<792b9d04>] syscall_call+0x7/0xb
What does this call trace tell us? That we locked up somewhere in the
ext3 code. That's all that we know.
Now lets take a look at a 'verbose' entry of the same lockup, that
also outputs other entries that 'look like' kernel text addresses:
Call Trace:
[<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
[<7916a34c>] do_raw_spin_lock+0xb7/0xe8
[<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
[<790391e8>] ? prepare_to_wait+0x18/0x57
[<790391e8>] prepare_to_wait+0x18/0x57
[<7914a320>] ? generic_make_request+0x80/0xb5
[<790e4f30>] ? unmap_underlying_metadata+0x2e/0x2e
[<792b7158>] __wait_on_bit+0x20/0x5f
[<7914a427>] ? submit_bio+0xd2/0xfb
[<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
[<790e4f30>] ? unmap_underlying_metadata+0x2e/0x2e
[<79039086>] ? autoremove_wake_function+0x31/0x31
[<790e4eff>] __wait_on_buffer+0x1b/0x1e
[<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
[<7911015f>] ext3_iget+0x45/0x346
[<79114e03>] ext3_lookup+0x97/0xa6
[<790c9c0e>] lookup_real+0x20/0x35
[<790ca1d2>] __lookup_hash+0x2a/0x31
[<790caa0e>] lookup_slow+0x36/0x8c
[<790cbdda>] path_lookupat+0xf9/0x5c5
[<790cc2c5>] filename_lookup+0x1f/0x84
[<790cde98>] user_path_at_empty+0x3f/0x65
[<790cdecb>] user_path_at+0xd/0xf
[<790c6585>] vfs_fstatat+0x40/0x88
[<7903f844>] ? lg_local_unlock+0x31/0x47
[<790c65f8>] vfs_stat+0x13/0x15
[<790c67e2>] sys_stat64+0x11/0x22
[<790c3c47>] ? __fput+0x187/0x1a0
[<792b9d37>] ? restore_all+0xf/0xf
[<79165bb4>] ? trace_hardirqs_on_thunk+0xc/0x10
[<792b9d04>] syscall_call+0x7/0xb
Firstly, it's still easy to tell the true backtrace: ignore all the
'?' entries.
But the '?' entries also tell us something more: that recently this
CPU submitted IO. That might or might not be suspected from the
original trace, but from the 'extended' trace it's really obvious that
IO activity happened before the lockup as well.
There were many other examples in the past where '?' entries gave us
clues about the nature of a bug: they represent a poor man's trace of
what happened recently on that CPU.
So it's useful information for hairy bugs and it would be sad to
remove them.
Having said that, your complaint that '?' entries can make reading of
back traces more difficult is valid as well - so maybe we can do
something about that.
For example if we used more horizontal separation in the output
format:
Call Trace:
[<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
[<7916a34c>] do_raw_spin_lock+0xb7/0xe8
[<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
[<790391e8>] # ? prepare_to_wait+0x18/0x57
[<790391e8>] prepare_to_wait+0x18/0x57
[<7914a320>] # ? generic_make_request+0x80/0xb5
[<790e4f30>] # ? unmap_underlying_metadata+0x2e/0x2e
[<792b7158>] __wait_on_bit+0x20/0x5f
[<7914a427>] # ? submit_bio+0xd2/0xfb
[<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
[<790e4f30>] # ? unmap_underlying_metadata+0x2e/0x2e
[<79039086>] # ? autoremove_wake_function+0x31/0x31
[<790e4eff>] __wait_on_buffer+0x1b/0x1e
[<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
[<7911015f>] ext3_iget+0x45/0x346
[<79114e03>] ext3_lookup+0x97/0xa6
[<790c9c0e>] lookup_real+0x20/0x35
[<790ca1d2>] __lookup_hash+0x2a/0x31
[<790caa0e>] lookup_slow+0x36/0x8c
[<790cbdda>] path_lookupat+0xf9/0x5c5
[<790cc2c5>] filename_lookup+0x1f/0x84
[<790cde98>] user_path_at_empty+0x3f/0x65
[<790cdecb>] user_path_at+0xd/0xf
[<790c6585>] vfs_fstatat+0x40/0x88
[<7903f844>] # ? lg_local_unlock+0x31/0x47
[<790c65f8>] vfs_stat+0x13/0x15
[<790c67e2>] sys_stat64+0x11/0x22
[<790c3c47>] # ? __fput+0x187/0x1a0
[<792b9d37>] # ? restore_all+0xf/0xf
[<79165bb4>] # ? trace_hardirqs_on_thunk+0xc/0x10
[<792b9d04>] syscall_call+0x7/0xb
then the information would still be there, it would still be in the
right place, but it would also be much easier to ignore the right side
column it visually. (The '# ?' prefix would also make it clear that
this information is not so important as the left hand side column.)
Or something like that. Assuming this doesn't confuse tools and such.
Thanks,
Ingo
next prev parent reply other threads:[~2014-04-27 12:08 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-26 18:10 [PATCH] x86/dumpstack: Walk frames when built with frame pointers Richard Yao
2014-04-27 12:08 ` Ingo Molnar [this message]
2014-04-27 19:42 ` Peter Zijlstra
2014-04-27 19:51 ` Richard Yao
2014-04-27 20:08 ` Linus Torvalds
2014-04-27 20:36 ` Richard Yao
2014-05-07 17:18 ` Ingo Molnar
2014-05-07 17:37 ` Peter Zijlstra
2014-04-30 21:56 ` Frederic Weisbecker
2014-05-07 16:40 ` Ingo Molnar
2014-06-06 8:17 ` Peter Zijlstra
2014-06-06 8:24 ` Borislav Petkov
2014-06-06 9:35 ` Peter Zijlstra
2014-06-07 3:08 ` H. Peter Anvin
2014-06-07 3:09 ` H. Peter Anvin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140427120820.GC22116@gmail.com \
--to=mingo@kernel.org \
--cc=a.p.zijlstra@chello.nl \
--cc=acme@infradead.org \
--cc=akpm@linux-foundation.org \
--cc=behlendorf1@llnl.gov \
--cc=fweisbec@gmail.com \
--cc=hpa@zytor.com \
--cc=jesper.nilsson@axis.com \
--cc=jolsa@redhat.com \
--cc=jslaby@suse.cz \
--cc=kernel@gentoo.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=ryao@gentoo.org \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vgupta@synopsys.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox