public inbox for linux-kernel@vger.kernel.org
* [RFC] improve_stack: make stack dump output useful again
@ 2014-02-23  0:19 Sasha Levin
  2014-02-23 20:27 ` Linus Torvalds
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2014-02-23  0:19 UTC (permalink / raw)
  To: torvalds; +Cc: linux-kernel, Sasha Levin

Right now when people try to report issues in the kernel they send stack
dumps to each other, which look something like this:

[    6.906437]  [<ffffffff811f0e90>] ? backtrace_test_irq_callback+0x20/0x20
[    6.907121]  [<ffffffff84388ce8>] dump_stack+0x52/0x7f
[    6.907640]  [<ffffffff811f0ec8>] backtrace_regression_test+0x38/0x110
[    6.908281]  [<ffffffff813596a0>] ? proc_create_data+0xa0/0xd0
[    6.908870]  [<ffffffff870a8040>] ? proc_modules_init+0x22/0x22
[    6.909480]  [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
[...]

However, most of the text you get is pure garbage.

The only useful thing above is the function name. Due to the number of
different kernel code versions and various configurations being used, the
kernel address and the offset into the function are not really helpful in
determining where the problem actually occurred.

Too often the result of someone looking at a stack dump is asking the person
who sent it for one or more 'addr2line' translations, which slows down the
entire process of debugging the issue (and is really annoying).

The "improve_stack" script (wanted: better name) is an attempt to make the
output more useful and easy to work with by translating all kernel addresses
in the stack dump into line numbers, which means that the stack dump we saw
before would look like this:

[    6.906437]  [<kernel/backtracetest.c:73>] ? backtrace_test_irq_callback+0x20/0x20
[    6.907121]  [<lib/dump_stack.c:52>] dump_stack+0x52/0x7f
[    6.907640]  [<kernel/backtracetest.c:40 kernel/backtracetest.c:77>] backtrace_regression_test+0x38/0x110
[    6.908281]  [<fs/proc/generic.c:445>] ? proc_create_data+0xa0/0xd0
[    6.908870]  [<kernel/kallsyms.c:611>] ? proc_modules_init+0x22/0x22
[    6.909480]  [<init/main.c:696>] do_one_initcall+0xc2/0x1e0

It's pretty obvious why this is better than the previous stack dump.

Usage is pretty simple:

	./improve_stack.sh [vmlinux] [base path]

Where vmlinux is the vmlinux to extract line numbers from and base path is
the path that points to the root of the build tree, for example:

	./improve_stack.sh vmlinux /home/sasha/linux/

And the stack trace should be piped through it (I, for example, just pipe
the output of the serial console of my KVM test box through it).

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
---
 scripts/improve_stack.sh |   32 ++++++++++++++++++++++++++++++++
 1 files changed, 32 insertions(+), 0 deletions(-)
 create mode 100755 scripts/improve_stack.sh

diff --git a/scripts/improve_stack.sh b/scripts/improve_stack.sh
new file mode 100755
index 0000000..03a4a90
--- /dev/null
+++ b/scripts/improve_stack.sh
@@ -0,0 +1,32 @@
+#!/bin/bash
+
+if [ $# != "2" ]; then
+	echo "Usage:"
+	echo "	$0 [vmlinux] [base path]"
+	exit 1
+fi
+
+vmlinux=$1
+basepath=$2
+
+while read line; do
+	# Let's see if we have an address in the line
+	if [[ $line =~ \[\<([^]]+)\>\]  ]]; then
+		# Translate address to line numbers
+		code=`addr2line -i -e $vmlinux ${BASH_REMATCH[1]}`
+
+		# Strip useless base path
+		code=${code//$basepath/""}
+
+		# In the case of inlines, move everything to same line
+		code=${code//$'\n'/' '}
+
+		# Replace old address with pretty line numbers
+		newline=${line//${BASH_REMATCH[1]}/$code}
+
+		echo "$newline"
+	else
+		# Nothing special in this line, show it as is
+		echo "$line"
+	fi
+done
-- 
1.7.2.5



* Re: [RFC] improve_stack: make stack dump output useful again
  2014-02-23  0:19 [RFC] improve_stack: make stack dump output useful again Sasha Levin
@ 2014-02-23 20:27 ` Linus Torvalds
  2014-02-23 20:44   ` Joe Perches
  2014-03-13 15:16   ` Sasha Levin
  0 siblings, 2 replies; 19+ messages in thread
From: Linus Torvalds @ 2014-02-23 20:27 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Linux Kernel Mailing List

On Sat, Feb 22, 2014 at 4:19 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
> Right now when people try to report issues in the kernel they send stack
> dumps to each other, which look something like this:
>
> [    6.906437]  [<ffffffff811f0e90>] ? backtrace_test_irq_callback+0x20/0x20
> [    6.907121]  [<ffffffff84388ce8>] dump_stack+0x52/0x7f
> [    6.907640]  [<ffffffff811f0ec8>] backtrace_regression_test+0x38/0x110
> [    6.908281]  [<ffffffff813596a0>] ? proc_create_data+0xa0/0xd0
> [    6.908870]  [<ffffffff870a8040>] ? proc_modules_init+0x22/0x22
> [    6.909480]  [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
> [...]
>
> However, most of the text you get is pure garbage.

I'd like to fix that, but I'd like to fix it in the kernel, and just
stop printing the hex addresses entirely.

However, your kind of script actually makes that worse, in that it
uses the redundant hex addresses for 'addr2line', and that tool is
known to not work with symbolic addresses, only with actual numerical
ones.

So I would *really* want to do this kernel change (possibly
conditional on RANDOMIZE_BASE_ADDRESS or whatever the config variable
is called):

    diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
    index d9c12d3022a7..58039e728f00 100644
    --- a/arch/x86/kernel/dumpstack.c
    +++ b/arch/x86/kernel/dumpstack.c
    @@ -27,13 +27,12 @@ static int die_counter;

     static void printk_stack_address(unsigned long address, int reliable)
     {
    -       pr_cont(" [<%p>] %s%pB\n",
    -               (void *)address, reliable ? "" : "? ", (void *)address);
    +       pr_cont(" %s[<%pB>]\n", reliable ? "" : "? ", (void *)address);
     }

     void printk_address(unsigned long address)
     {
    -       pr_cont(" [<%p>] %pS\n", (void *)address, (void *)address);
    +       pr_cont(" [<%pS>]\n", (void *)address);
     }

     #ifdef CONFIG_FUNCTION_GRAPH_TRACER

which would make the kernel stack traces much prettier.

But that would require that there be a "resolve symbolic address" (if
CONFIG_KALLSYMS isn't enabled, it would still be hexadecimal) for the
address inside the [<>] thing..

I don't know of any sane tool that does that directly, but it
shouldn't be *that* hard. You can *almost* do it with

  echo "p backtrace_regression_test+0x38" | gdb vmlinux

but you see the problem if you try that ;)

               Linus


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-02-23 20:27 ` Linus Torvalds
@ 2014-02-23 20:44   ` Joe Perches
  2014-02-23 20:55     ` Linus Torvalds
  2014-03-13 15:16   ` Sasha Levin
  1 sibling, 1 reply; 19+ messages in thread
From: Joe Perches @ 2014-02-23 20:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sasha Levin, Linux Kernel Mailing List

On Sun, 2014-02-23 at 12:27 -0800, Linus Torvalds wrote:
> So I would *really* want to do this kernel change (possibly
> conditional on RANDOMIZE_BASE_ADDRESS or whatever the config variable
> is called):
> 
>     diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
>     index d9c12d3022a7..58039e728f00 100644
>     --- a/arch/x86/kernel/dumpstack.c
>     +++ b/arch/x86/kernel/dumpstack.c
>     @@ -27,13 +27,12 @@ static int die_counter;
> 
>      static void printk_stack_address(unsigned long address, int reliable)
>      {
>     -       pr_cont(" [<%p>] %s%pB\n",
>     -               (void *)address, reliable ? "" : "? ", (void *)address);
>     +       pr_cont(" %s[<%pB>]\n", reliable ? "" : "? ", (void *)address);
>      }
> 
>      void printk_address(unsigned long address)
>      {
>     -       pr_cont(" [<%p>] %pS\n", (void *)address, (void *)address);
>     +       pr_cont(" [<%pS>]\n", (void *)address);
>      }
> 
>      #ifdef CONFIG_FUNCTION_GRAPH_TRACER

I'd rather see this as a vsprintf extension so
that it is more difficult to interleave and
doesn't need to be replicated for each arch.




* Re: [RFC] improve_stack: make stack dump output useful again
  2014-02-23 20:44   ` Joe Perches
@ 2014-02-23 20:55     ` Linus Torvalds
  0 siblings, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2014-02-23 20:55 UTC (permalink / raw)
  To: Joe Perches; +Cc: Sasha Levin, Linux Kernel Mailing List

On Sun, Feb 23, 2014 at 12:44 PM, Joe Perches <joe@perches.com> wrote:
>
> I'd rather see this as a vsprintf extension so
> that it is more difficult to interleave and
> doesn't need to be replicated for each arch.

We could easily get rid of printk_[stack_]address() by just inlining
it into the callers, and make them use the [<%p[SB]>] thing directly.

That's a separate issue from trying to get rid of the hex parts of the
address, though.

Also, note that even if you merged printk_stack_address into its
callers, you'd still have the actual stack walking ->address() and
->stack() calls as separate calls, so it's not like you'd really avoid
any interleaving if there are concurrent walkers.

                Linus


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-02-23 20:27 ` Linus Torvalds
  2014-02-23 20:44   ` Joe Perches
@ 2014-03-13 15:16   ` Sasha Levin
  2014-03-13 22:03     ` Linus Torvalds
  1 sibling, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2014-03-13 15:16 UTC (permalink / raw)
  To: Linus Torvalds, Sasha Levin; +Cc: Linux Kernel Mailing List

On 02/23/2014 03:27 PM, Linus Torvalds wrote:
> On Sat, Feb 22, 2014 at 4:19 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>> Right now when people try to report issues in the kernel they send stack
>> dumps to each other, which look something like this:
>>
>> [    6.906437]  [<ffffffff811f0e90>] ? backtrace_test_irq_callback+0x20/0x20
>> [    6.907121]  [<ffffffff84388ce8>] dump_stack+0x52/0x7f
>> [    6.907640]  [<ffffffff811f0ec8>] backtrace_regression_test+0x38/0x110
>> [    6.908281]  [<ffffffff813596a0>] ? proc_create_data+0xa0/0xd0
>> [    6.908870]  [<ffffffff870a8040>] ? proc_modules_init+0x22/0x22
>> [    6.909480]  [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
>> [...]
>>
>> However, most of the text you get is pure garbage.
>
> I'd like to fix that, but I'd like to fix it in the kernel, and just
> stop printing the hex addresses entirely.
>
> However, your kind of script actually makes that worse, in that it
> uses the redundant hex addresses for 'addr2line', and that tool is
> known to not work with symbolic addresses, only with actual numerical
> ones.
>
> So I would *really* want to do this kernel change (possibly
> conditional on RANDOMIZE_BASE_ADDRESS or whatever the config variable
> is called):
>
>      diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
>      index d9c12d3022a7..58039e728f00 100644
>      --- a/arch/x86/kernel/dumpstack.c
>      +++ b/arch/x86/kernel/dumpstack.c
>      @@ -27,13 +27,12 @@ static int die_counter;
>
>       static void printk_stack_address(unsigned long address, int reliable)
>       {
>      -       pr_cont(" [<%p>] %s%pB\n",
>      -               (void *)address, reliable ? "" : "? ", (void *)address);
>      +       pr_cont(" %s[<%pB>]\n", reliable ? "" : "? ", (void *)address);
>       }
>
>       void printk_address(unsigned long address)
>       {
>      -       pr_cont(" [<%p>] %pS\n", (void *)address, (void *)address);
>      +       pr_cont(" [<%pS>]\n", (void *)address);
>       }
>
>       #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>
> which would make the kernel stack traces much prettier.
>
> But that would require that there be a "resolve symbolic address" (if
> CONFIG_KALLSYMS isn't enabled, it would still be hexadecimal) for the
> address inside the [<>] thing..
>
> I don't know of any sane tool that does that directly, but it
> shouldn't be *that* hard. You can *almost* do it with
>
>    echo "p backtrace_regression_test+0x38" | gdb vmlinux
>
> but you see the problem if you try that ;)

I've looked into doing it in the kernel, but it seems that it would require a rather
large code addition just to deal with getting pretty line numbers.

Unless I'm missing something big, is it really worth it?


Thanks,
Sasha



* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-13 15:16   ` Sasha Levin
@ 2014-03-13 22:03     ` Linus Torvalds
  2014-03-13 22:20       ` Sasha Levin
  2014-03-13 23:12       ` Dave Jones
  0 siblings, 2 replies; 19+ messages in thread
From: Linus Torvalds @ 2014-03-13 22:03 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Linux Kernel Mailing List

On Thu, Mar 13, 2014 at 8:16 AM, Sasha Levin <sasha.levin@oracle.com> wrote:
>
> I've looked into doing it in the kernel, but it seems that it would require
> a rather
> large code addition just to deal with getting pretty line numbers.

No no no. The *kernel* will never do line numbers, especially since
only people who don't care about build performance compile with debug
info, and even if you do do that, the kernel won't load it anyway.

You missed the point.

The kernel is going to *remove* all the hex numbers that your script
relies on, because those hex numbers are completely worthless. They
are worthless and annoying now, but they are *doubly* worthless if the
kernel is compiled with base address randomization, since nobody will
know what the hex numbers mean.

> Unless I'm missing something big, is it really worth it?

You're missing something big. The patch I sent earlier *is* going to
happen one of these days, possibly for 3.15. So your script that looks
at hex numbers is broken.

You need to look at the *symbol* number. In this output:

     [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0

that "ffffffff810020c2" is crap, and is going away. The address that
is meaningful and valid is the "do_one_initcall+0xc2" part.

*That* is the part you'd use to parse in user space.

Try it today with the CONFIG_RANDOMIZE_BASE option to see. Using the
hex number doesn't *work*.
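
As a rough illustration of that parsing step (a sketch, not part of the
posted patch): the helper below, under the hypothetical name
extract_symbol, pulls the symbol+offset token out of a stack trace line and
never looks at the hex value inside the [<>] brackets. The regular
expression is an assumption about the line format, based only on the
samples quoted in this thread.

```shell
#!/bin/bash

# extract_symbol LINE
# Print the "symbol+0xoffset" token found in LINE, dropping the "/0xsize"
# suffix. The hex value inside [<...>] is deliberately not consulted.
# (extract_symbol is a hypothetical helper name, not from the patch.)
extract_symbol() {
	local line=$1
	# symbol name, '+', hex offset, '/', hex function size
	local re='([A-Za-z_.][A-Za-z0-9_.]*\+0x[0-9a-fA-F]+)/0x[0-9a-fA-F]+'

	if [[ $line =~ $re ]]; then
		echo "${BASH_REMATCH[1]}"
	else
		# No symbol+offset token on this line
		return 1
	fi
}
```

Lines without such a token make the function return non-zero, so a caller
can pass them through unchanged, as the posted script does for lines
without an address.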

          Linus


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-13 22:03     ` Linus Torvalds
@ 2014-03-13 22:20       ` Sasha Levin
  2014-03-13 22:59         ` Linus Torvalds
  2014-03-13 23:12       ` Dave Jones
  1 sibling, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2014-03-13 22:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

On 03/13/2014 06:03 PM, Linus Torvalds wrote:
> On Thu, Mar 13, 2014 at 8:16 AM, Sasha Levin <sasha.levin@oracle.com> wrote:
>>
>> I've looked into doing it in the kernel, but it seems that it would require
>> a rather
>> large code addition just to deal with getting pretty line numbers.
>
> No no no. The *kernel* will never do line numbers, especially since
> only people who don't care about build performance compile with debug
> info, and even if you do do that, the kernel won't load it anyway.
>
> You missed the point.
>
> The kernel is going to *remove* all the hex numbers that your script
> relies on, because those hex numbers are completely worthless. They
> are worthless and annoying now, but they are *doubly* worthless if the
> kernel is compiled with base address randomization, since nobody will
> know what the hex numbers mean.
>
>> Unless I'm missing something big, is it really worth it?
>
> You're missing something big. The patch I sent earlier *is* going to
> happen one of these days, possibly for 3.15. So your script that looks
> at hex numbers is broken.
>
> You need to look at the *symbol* number. In this output:
>
>       [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
>
> that "ffffffff810020c2" is crap, and is going away. The address that
> is meaningful and valid is the "do_one_initcall+0xc2" part.
>
> *That* is the part you'd use to parse in user space.
>
> Try it today with the CONFIG_RANDOMIZE_BASE option to see. Using the
> hex number doesn't *work*.

Oh. doh. that was stupid of me.

I'll fix it up and re-send this patch.


Thanks,
Sasha



* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-13 22:20       ` Sasha Levin
@ 2014-03-13 22:59         ` Linus Torvalds
  2014-03-13 23:07           ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2014-03-13 22:59 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Linux Kernel Mailing List

On Thu, Mar 13, 2014 at 3:20 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>
> I'll fix it up and re-send this patch.

The problem (as you will find out) is that "addr2line" doesn't take
symbolic names.

So either addr2line needs to be improved (which really would be a good
idea regardless), or your script needs to use "nm" or "gdb" or
something to first translate the symbol+off into a hex number (which
will not match the kernel-provided hex number when base randomization
is in effect, but it will match the pre-randomized data in the vmlinux
file, so then addr2line would work).
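
A minimal sketch of that translation, assuming the symbol table has already
been captured as text from "nm vmlinux". The function name resolve_symbol
and its argument layout are made up for illustration and are not taken from
any posted patch. It looks up the symbol's base address, adds the offset,
and prints a hex address that matches the vmlinux file rather than the
running kernel.

```shell
#!/bin/bash

# resolve_symbol NM_OUTPUT SYMBOL+OFFSET[/SIZE]
# NM_OUTPUT is the text printed by `nm vmlinux` ("address type name" lines).
# Prints the link-time address of SYMBOL+OFFSET in hex, without a 0x prefix.
# (Hypothetical helper; if several static symbols share a name, the first
# one listed wins, which may be the wrong one.)
resolve_symbol() {
	local nm_output=$1 token=$2
	local name offset base="" addr type sym

	token=${token%%/*}		# drop "/0xsize" if present
	name=${token%%+*}		# text before the '+'
	if [[ $token == *+* ]]; then
		offset=${token#*+}	# text after the '+', e.g. 0xc2
	else
		offset=0
	fi

	# Find the base address of the symbol in the nm listing
	while read -r addr type sym; do
		if [[ $sym == "$name" ]]; then
			base=$addr
			break
		fi
	done <<< "$nm_output"

	[[ -n $base ]] || return 1

	# bash arithmetic wraps at 64 bits, so kernel addresses come out right
	printf '%x\n' $(( 0x$base + offset ))
}
```

With the listing saved in a variable (symbols=$(nm vmlinux)), the result
can then be handed to addr2line, for example:
addr2line -i -e vmlinux "$(resolve_symbol "$symbols" do_one_initcall+0xc2)"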

              Linus


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-13 22:59         ` Linus Torvalds
@ 2014-03-13 23:07           ` Sasha Levin
  2014-03-14  0:50             ` Linus Torvalds
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2014-03-13 23:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

On 03/13/2014 06:59 PM, Linus Torvalds wrote:
> On Thu, Mar 13, 2014 at 3:20 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>>
>> I'll fix it up and re-send this patch.
>
> The problem (as you will find out) is that "addr2line" doesn't take
> symbolic names.
>
> So either addr2line needs to be improved (which really would be a good
> idea regardless), or your script needs to use "nm" or "gdb" or
> something to first translate the symbol+off into a hex number (which
> will not match the kernel-provided hex number when base randomization
> is in effect, but it will match the pre-randomized data in the vmlinux
> file, so then addr2line would work).

I figured that I'll just read it from System.map (and do the math when
adding the offset). That should work, right?


Thanks,
Sasha



* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-13 22:03     ` Linus Torvalds
  2014-03-13 22:20       ` Sasha Levin
@ 2014-03-13 23:12       ` Dave Jones
  2014-03-14 18:31         ` Kees Cook
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Jones @ 2014-03-13 23:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sasha Levin, Linux Kernel Mailing List, keescook

On Thu, Mar 13, 2014 at 03:03:41PM -0700, Linus Torvalds wrote:

 > You need to look at the *symbol* number. In this output:
 > 
 >      [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
 > 
 > that "ffffffff810020c2" is crap, and is going away. The address that
 > is meaningful and valid is the "do_one_initcall+0xc2" part.
 > 
 > *That* is the part you'd use to parse in user space.
 > 
 > Try it today with the CONFIG_RANDOMIZE_BASE option to see. Using the
 > hex number doesn't *work*.

That reminds me, perf top is still busted when this option is enabled.

	Dave


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-13 23:07           ` Sasha Levin
@ 2014-03-14  0:50             ` Linus Torvalds
  0 siblings, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2014-03-14  0:50 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Linux Kernel Mailing List

On Thu, Mar 13, 2014 at 4:07 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>
> I figured that I'll just read it from System.map (and do the math when
> adding the offset). That should work, right?

Yes, although just reading the symbols from the vmlinux file would be
*much* more convenient, since I know that not everybody saves the
System.map file (cough cough me). But the vmlinux file you need
anyway.

So it would actually be much better to use "nm vmlinux" as the source
of the base information instead, and have just one file you depend on.
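
One way this could look, sketched under the assumption of bash 4 or newer
(associative arrays): read the "nm vmlinux" listing once into a table keyed
by symbol name, so that each stack trace line needs only a lookup. The
names symtab and load_symbols are hypothetical.

```shell
#!/bin/bash

# Symbol name -> link-time address (hex, no 0x prefix).
# Requires bash 4+ for associative arrays.
declare -A symtab

# load_symbols: read `nm`-style "address type name" lines from stdin
# into symtab. (Hypothetical helper, not from the posted patch.)
load_symbols() {
	local addr type name

	while read -r addr type name; do
		# Undefined symbols have no address column, which leaves
		# the third field empty; skip those lines.
		[[ -n $name ]] || continue

		# Keep the first address seen for a duplicated name
		if [[ -z ${symtab[$name]+set} ]]; then
			symtab[$name]=$addr
		fi
	done
}
```

It would be fed with something like "load_symbols < <(nm vmlinux)", using
process substitution rather than a pipe so that the table is filled in the
current shell and not in a subshell.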

              Linus


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-13 23:12       ` Dave Jones
@ 2014-03-14 18:31         ` Kees Cook
  2014-03-14 18:33           ` Dave Jones
  2014-03-14 19:08           ` Dave Jones
  0 siblings, 2 replies; 19+ messages in thread
From: Kees Cook @ 2014-03-14 18:31 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Sasha Levin,
	Linux Kernel Mailing List, Kees Cook

On Thu, Mar 13, 2014 at 4:12 PM, Dave Jones <davej@redhat.com> wrote:
> On Thu, Mar 13, 2014 at 03:03:41PM -0700, Linus Torvalds wrote:
>
>  > You need to look at the *symbol* number. In this output:
>  >
>  >      [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
>  >
>  > that "ffffffff810020c2" is crap, and is going away. The address that
>  > is meaningful and valid is the "do_one_initcall+0xc2" part.
>  >
>  > *That* is the part you'd use to parse in user space.
>  >
>  > Try it today with the CONFIG_RANDOMIZE_BASE option to see. Using the
>  > hex number doesn't *work*.
>
> That reminds me, perf top is still busted when this option is enabled.

Hrm, works for me. I'm not very familiar with what to expect, but
comparing output between kaslr boot and nokaslr boot, it looks the
same to me.

-Kees

-- 
Kees Cook
Chrome OS Security


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-14 18:31         ` Kees Cook
@ 2014-03-14 18:33           ` Dave Jones
  2014-03-14 19:08           ` Dave Jones
  1 sibling, 0 replies; 19+ messages in thread
From: Dave Jones @ 2014-03-14 18:33 UTC (permalink / raw)
  To: Kees Cook; +Cc: Linus Torvalds, Sasha Levin, Linux Kernel Mailing List

On Fri, Mar 14, 2014 at 11:31:11AM -0700, Kees Cook wrote:
 > On Thu, Mar 13, 2014 at 4:12 PM, Dave Jones <davej@redhat.com> wrote:
 > > On Thu, Mar 13, 2014 at 03:03:41PM -0700, Linus Torvalds wrote:
 > >
 > >  > You need to look at the *symbol* number. In this output:
 > >  >
 > >  >      [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
 > >  >
 > >  > that "ffffffff810020c2" is crap, and is going away. The address that
 > >  > is meaningful and valid is the "do_one_initcall+0xc2" part.
 > >  >
 > >  > *That* is the part you'd use to parse in user space.
 > >  >
 > >  > Try it today with the CONFIG_RANDOMIZE_BASE option to see. Using the
 > >  > hex number doesn't *work*.
 > >
 > > That reminds me, perf top is still busted when this option is enabled.
 > 
 > Hrm, works for me. I'm not very familiar with what to expect, but
 > comparing output between kaslr boot and nokaslr boot, it looks the
 > same to me.

I don't get kernel symbols resolved at all when it's enabled.
Disabling the config option makes them come back again.
I didn't try nokaslr.

	Dave



* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-14 18:31         ` Kees Cook
  2014-03-14 18:33           ` Dave Jones
@ 2014-03-14 19:08           ` Dave Jones
  2014-03-14 19:31             ` Kees Cook
  2014-03-14 19:32             ` Linus Torvalds
  1 sibling, 2 replies; 19+ messages in thread
From: Dave Jones @ 2014-03-14 19:08 UTC (permalink / raw)
  To: Kees Cook; +Cc: Linus Torvalds, Sasha Levin, Linux Kernel Mailing List

On Fri, Mar 14, 2014 at 11:31:11AM -0700, Kees Cook wrote:
 > On Thu, Mar 13, 2014 at 4:12 PM, Dave Jones <davej@redhat.com> wrote:
 > > On Thu, Mar 13, 2014 at 03:03:41PM -0700, Linus Torvalds wrote:
 > >
 > >  > You need to look at the *symbol* number. In this output:
 > >  >
 > >  >      [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
 > >  >
 > >  > that "ffffffff810020c2" is crap, and is going away. The address that
 > >  > is meaningful and valid is the "do_one_initcall+0xc2" part.
 > >  >
 > >  > *That* is the part you'd use to parse in user space.
 > >  >
 > >  > Try it today with the CONFIG_RANDOMIZE_BASE option to see. Using the
 > >  > hex number doesn't *work*.
 > >
 > > That reminds me, perf top is still busted when this option is enabled.
 > 
 > Hrm, works for me. I'm not very familiar with what to expect, but
 > comparing output between kaslr boot and nokaslr boot, it looks the
 > same to me.

ok, nokaslr makes it work too.
Booting with that and using the perf binary from 3.14rc6 , I just see..

  9.30%  [kernel]                      [k] 0xffffffffaf18e887
  7.98%  [kernel]                      [k] 0xffffffffaf3276c7
  6.10%  [kernel]                      [k] 0xffffffffaf18dd3a
  4.39%  [kernel]                      [k] 0xffffffffaf327717
  1.71%  [kernel]                      [k] 0xffffffffaf18e89c
  1.52%  [kernel]                      [k] 0xffffffffaf3276cc

Curiously, if I use the perf binary from 3.13, I see everything lumped together as..

 95.89%  [kernel].exit.text            [k] 0x000000002e586c26

(When kaslr is disabled both binaries work fine)

Also maybe related: The rc6 binary claims it can't read symbols from vmlinux
when kaslr is enabled.

	Dave



* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-14 19:08           ` Dave Jones
@ 2014-03-14 19:31             ` Kees Cook
  2014-03-14 19:32             ` Linus Torvalds
  1 sibling, 0 replies; 19+ messages in thread
From: Kees Cook @ 2014-03-14 19:31 UTC (permalink / raw)
  To: Dave Jones, Kees Cook, Linus Torvalds, Sasha Levin,
	Linux Kernel Mailing List

On Fri, Mar 14, 2014 at 12:08 PM, Dave Jones <davej@redhat.com> wrote:
> On Fri, Mar 14, 2014 at 11:31:11AM -0700, Kees Cook wrote:
>  > On Thu, Mar 13, 2014 at 4:12 PM, Dave Jones <davej@redhat.com> wrote:
>  > > On Thu, Mar 13, 2014 at 03:03:41PM -0700, Linus Torvalds wrote:
>  > >
>  > >  > You need to look at the *symbol* number. In this output:
>  > >  >
>  > >  >      [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
>  > >  >
>  > >  > that "ffffffff810020c2" is crap, and is going away. The address that
>  > >  > is meaningful and valid is the "do_one_initcall+0xc2" part.
>  > >  >
>  > >  > *That* is the part you'd use to parse in user space.
>  > >  >
>  > >  > Try it today with the CONFIG_RANDOMIZE_BASE option to see. Using the
>  > >  > hex number doesn't *work*.
>  > >
>  > > That reminds me, perf top is still busted when this option is enabled.
>  >
>  > Hrm, works for me. I'm not very familiar with what to expect, but
>  > comparing output between kaslr boot and nokaslr boot, it looks the
>  > same to me.
>
> ok, nokaslr makes it work too.
> Booting with that and using the perf binary from 3.14rc6 , I just see..
>
>   9.30%  [kernel]                      [k] 0xffffffffaf18e887
>   7.98%  [kernel]                      [k] 0xffffffffaf3276c7
>   6.10%  [kernel]                      [k] 0xffffffffaf18dd3a
>   4.39%  [kernel]                      [k] 0xffffffffaf327717
>   1.71%  [kernel]                      [k] 0xffffffffaf18e89c
>   1.52%  [kernel]                      [k] 0xffffffffaf3276cc
>
> Curiously, if I use the perf binary from 3.13, I see everything lumped together as..
>
>  95.89%  [kernel].exit.text            [k] 0x000000002e586c26
>
> (When kaslr is disabled both binaries work fine)
>
> Also maybe related: The rc6 binary claims it can't read symbols from vmlinux
> when kaslr is enabled.

Very odd. The perf I built from Linus's tree seems to resolve
everything fine for me. I wonder what we're doing differently. I
literally just did "cd tools/perf; make; scp perf
root@test-machine:/root; ssh root@test-machine 'perf top'" and
everything looks fine.

-Kees

-- 
Kees Cook
Chrome OS Security


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-14 19:08           ` Dave Jones
  2014-03-14 19:31             ` Kees Cook
@ 2014-03-14 19:32             ` Linus Torvalds
  2014-03-14 19:41               ` Linus Torvalds
  2014-03-14 20:08               ` Dave Jones
  1 sibling, 2 replies; 19+ messages in thread
From: Linus Torvalds @ 2014-03-14 19:32 UTC (permalink / raw)
  To: Dave Jones, Kees Cook, Linus Torvalds, Sasha Levin,
	Linux Kernel Mailing List

On Fri, Mar 14, 2014 at 12:08 PM, Dave Jones <davej@redhat.com> wrote:
>
> ok, nokaslr makes it work too.
> Booting with that and using the perf binary from 3.14rc6 , I just see..
>
>   9.30%  [kernel]                      [k] 0xffffffffaf18e887

Hmm. Do you have CONFIG_KALLSYMS enabled?

I have CONFIG_RANDOMIZE_BASE=y and perf (both report and top) seems to
work fine for me (current git, not v3.14-rc6, but still)

           Linus


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-14 19:32             ` Linus Torvalds
@ 2014-03-14 19:41               ` Linus Torvalds
  2014-03-14 20:15                 ` Kees Cook
  2014-03-14 20:08               ` Dave Jones
  1 sibling, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2014-03-14 19:41 UTC (permalink / raw)
  To: Dave Jones, Kees Cook, Linus Torvalds, Sasha Levin,
	Linux Kernel Mailing List

On Fri, Mar 14, 2014 at 12:32 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Hmm. Do you have CONFIG_KALLSYMS enabled?

Just to clarify. KALLSYMS is required to figure out the symbol
addresses on a kaslr system. There's no way to look them up in a
System.map file or the object file, since the addresses aren't going
to match.
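
To make the mismatch concrete, here is a small sketch of the arithmetic
involved. It assumes the whole kernel image is shifted by one constant
(which is my understanding of how the randomization behaves on x86), so the
shift can be measured by comparing one reference symbol's runtime address
with its address in the vmlinux file. The function names are hypothetical.

```shell
#!/bin/bash

# kaslr_slide RUNTIME_TEXT LINK_TEXT
# Both arguments are hex addresses without a 0x prefix: the address of a
# reference symbol on the running kernel and the same symbol's address in
# the vmlinux file. Prints the difference in hex.
# (Hypothetical helper; assumes a single constant shift.)
kaslr_slide() {
	printf '%x\n' $(( 0x$1 - 0x$2 ))
}

# link_time_addr RUNTIME_ADDR RUNTIME_TEXT LINK_TEXT
# Map a runtime address back to the pre-randomized address space of the
# vmlinux file by subtracting the slide.
link_time_addr() {
	local slide
	slide=$(kaslr_slide "$2" "$3")
	printf '%x\n' $(( 0x$1 - 0x$slide ))
}
```

The reference addresses could come from the _text entry in /proc/kallsyms
and in "nm vmlinux", assuming that symbol is listed in both and that the
reader is allowed to see real addresses in /proc/kallsyms.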

Your symptoms really sound like you might not have KALLSYMS enabled.

I wonder if we have a "select KALLSYMS" as part of RANDOMIZE_BASE,
since you need it for debug messages too (random hex numbers aren't
too useful ;)

            Linus


* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-14 19:32             ` Linus Torvalds
  2014-03-14 19:41               ` Linus Torvalds
@ 2014-03-14 20:08               ` Dave Jones
  1 sibling, 0 replies; 19+ messages in thread
From: Dave Jones @ 2014-03-14 20:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kees Cook, Sasha Levin, Linux Kernel Mailing List

On Fri, Mar 14, 2014 at 12:32:23PM -0700, Linus Torvalds wrote:
 > On Fri, Mar 14, 2014 at 12:08 PM, Dave Jones <davej@redhat.com> wrote:
 > >
 > > ok, nokaslr makes it work too.
 > > Booting with that and using the perf binary from 3.14rc6 , I just see..
 > >
 > >   9.30%  [kernel]                      [k] 0xffffffffaf18e887
 > 
 > Hmm. Do you have CONFIG_KALLSYMS enabled?

$ grep KALLSYMS .config
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y

mine looks like this.. http://codemonkey.org.uk/junk/kallsyms

 > I have CONFIG_RANDOMIZE_BASE=y and perf (both report and top) seems to
 > work fine for me (current git, not v3.14-rc6, but still)

yeah, on current now too.

CONFIG_RANDOMIZE_BASE=y
CONFIG_RANDOMIZE_BASE_MAX_OFFSET=0x40000000


	Dave



* Re: [RFC] improve_stack: make stack dump output useful again
  2014-03-14 19:41               ` Linus Torvalds
@ 2014-03-14 20:15                 ` Kees Cook
  0 siblings, 0 replies; 19+ messages in thread
From: Kees Cook @ 2014-03-14 20:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dave Jones, Sasha Levin, Linux Kernel Mailing List

On Fri, Mar 14, 2014 at 12:41 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Mar 14, 2014 at 12:32 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Hmm. Do you have CONFIG_KALLSYMS enabled?
>
> Just to clarify. KALLSYMS is required to figure out the symbol
> addresses on a kaslr system. There's no way to look them up in a
> System.map file or the object file, since the addresses aren't going
> to match.
>
> Your symptoms really sound like you might not have KALLSYMS enabled.
>
> I wonder if we have a "select KALLSYMS" as part of RANDOMIZE_BASE,
> since you need it for debug messages too (random hex numbers aren't
> too useful ;)

My weak preference would be to allow to build without that hard
requirement (citing imagined space savings, blah blah). But since
everything I normally use has KALLSYMS, I wouldn't object to doing it
either. :)

-Kees

-- 
Kees Cook
Chrome OS Security
