* buggy perf callgraph output
@ 2010-12-08 16:40 Nick Piggin
2010-12-08 21:48 ` Frederic Weisbecker
0 siblings, 1 reply; 8+ messages in thread
From: Nick Piggin @ 2010-12-08 16:40 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: linux-kernel
Hi,
I get strange looking perf output with the callchain hits not sorted
properly for graph and fractal:
7.34% open [kernel.kallsyms] [k] kmem_cache_alloc
|
--- kmem_cache_alloc
|
|---0.05%-- get_empty_filp
| do_filp_open
| do_sys_open
| sys_open
| system_call
| __open_nocancel
|
---0.16%-- getname
do_sys_open
sys_open
system_call
__open_nocancel
6.17% open [kernel.kallsyms] [k] _raw_spin_lock
|
--- _raw_spin_lock
|
|---0.01%-- add_partial
| __slab_free
| kmem_cache_free
| file_free_rcu
| __rcu_process_callbacks
| rcu_process_callbacks
| __do_softirq
| call_softirq
| do_softirq
| irq_exit
| smp_apic_timer_interrupt
| apic_timer_interrupt
|
|---0.16%-- dput
| |
| --0.57%-- fput
| filp_close
| sys_close
| system_call
| __libc_close
|
|--0.60%-- link_path_walk
| do_path_lookup
| do_filp_open
| do_sys_open
| sys_open
| system_call
| __open_nocancel
|
|--0.57%-- __slab_free
| kmem_cache_free
| file_free_rcu
| __rcu_process_callbacks
| rcu_process_callbacks
| __do_softirq
With flat it looks a bit better:
7.34% open [kernel.kallsyms] [k] kmem_cache_alloc
6.09%
kmem_cache_alloc
get_empty_filp
do_filp_open
do_sys_open
sys_open
system_call
__open_nocancel
1.07%
kmem_cache_alloc
getname
do_sys_open
sys_open
system_call
__open_nocancel
6.17% open [kernel.kallsyms] [k] _raw_spin_lock
0.60%
_raw_spin_lock
link_path_walk
do_path_lookup
do_filp_open
do_sys_open
sys_open
system_call
__open_nocancel
0.57%
_raw_spin_lock
fd_install
do_sys_open
sys_open
system_call
__open_nocancel
0.57%
_raw_spin_lock
dput
fput
filp_close
sys_close
system_call
__libc_close
^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: buggy perf callgraph output
  2010-12-08 16:40 buggy perf callgraph output Nick Piggin
@ 2010-12-08 21:48 ` Frederic Weisbecker
  2010-12-15 13:02   ` Nick Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2010-12-08 21:48 UTC (permalink / raw)
To: Nick Piggin; +Cc: Arnaldo Carvalho de Melo, linux-kernel

On Thu, Dec 09, 2010 at 03:40:15AM +1100, Nick Piggin wrote:
> Hi,
>
> I get strange looking perf output with the callchain hits not sorted
> properly for graph and fractal:
>
> 7.34% open [kernel.kallsyms] [k] kmem_cache_alloc
> |
> --- kmem_cache_alloc
>     |
>     |---0.05%-- get_empty_filp
>     |           do_filp_open
>     |           do_sys_open
>     |           sys_open
>     |           system_call
>     |           __open_nocancel
>     |
>     ---0.16%-- getname
>                do_sys_open
>                sys_open
>                system_call
>                __open_nocancel
>
> 6.17% open [kernel.kallsyms] [k] _raw_spin_lock
> |
> --- _raw_spin_lock
>     |
>     |---0.01%-- add_partial
>     |           __slab_free
>     |           kmem_cache_free
>     |           file_free_rcu
>     |           __rcu_process_callbacks
>     |           rcu_process_callbacks
>     |           __do_softirq
>     |           call_softirq
>     |           do_softirq
>     |           irq_exit
>     |           smp_apic_timer_interrupt
>     |           apic_timer_interrupt
>     |
>     |---0.16%-- dput
>     |           |
>     |           --0.57%-- fput
>     |                     filp_close
>     |                     sys_close
>     |                     system_call
>     |                     __libc_close
>     |
>     |--0.60%-- link_path_walk
>     |          do_path_lookup
>     |          do_filp_open
>     |          do_sys_open
>     |          sys_open
>     |          system_call
>     |          __open_nocancel
>     |
>     |--0.57%-- __slab_free
>     |          kmem_cache_free
>     |          file_free_rcu
>     |          __rcu_process_callbacks
>     |          rcu_process_callbacks
>     |          __do_softirq
>
>
> With flat it looks a bit better:
> 7.34% open [kernel.kallsyms] [k] kmem_cache_alloc
> 6.09%
>     kmem_cache_alloc
>     get_empty_filp
>     do_filp_open
>     do_sys_open
>     sys_open
>     system_call
>     __open_nocancel
>
> 1.07%
>     kmem_cache_alloc
>     getname
>     do_sys_open
>     sys_open
>     system_call
>     __open_nocancel
>
> 6.17% open [kernel.kallsyms] [k] _raw_spin_lock
> 0.60%
>     _raw_spin_lock
>     link_path_walk
>     do_path_lookup
>     do_filp_open
>     do_sys_open
>     sys_open
>     system_call
>     __open_nocancel
>
> 0.57%
>     _raw_spin_lock
>     fd_install
>     do_sys_open
>     sys_open
>     system_call
>     __open_nocancel
>
> 0.57%
>     _raw_spin_lock
>     dput
>     fput
>     filp_close
>     sys_close
>     system_call
>     __libc_close

I can not reproduce it. Could you please try to reproduce,
run perf archive and send me your perf.data.tar.bz2 ?

Thanks.
* Re: buggy perf callgraph output
  2010-12-08 21:48 ` Frederic Weisbecker
@ 2010-12-15 13:02 ` Nick Piggin
  2010-12-16 11:47   ` Frederic Weisbecker
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Piggin @ 2010-12-15 13:02 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Nick Piggin, Arnaldo Carvalho de Melo, linux-kernel

On Wed, Dec 08, 2010 at 10:48:13PM +0100, Frederic Weisbecker wrote:
> I can not reproduce it. Could you please try to reproduce,
> run perf archive and send me your perf.data.tar.bz2 ?

It seems to be happening all the time, just look further in
callgraphs.

This attached perf.data.bz2 looks like this, when using -g graph

15.05% dbench [kernel.kallsyms] [k]
copy_user_generic_string
|
--- copy_user_generic_string
    |
    |---0.16%-- generic_file_aio_read
    |           do_sync_read
    |           vfs_read
    |           |
    |           --0.05%-- sys_pread64
    |                     system_call
    |                     0x7f64a60bb193
    |
    |--0.10%-- generic_file_buffered_write
    |          __generic_file_aio_write
    |          generic_file_aio_write
    |          do_sync_write
    |          vfs_write
    |          sys_pwrite64
    |          system_call
    |          0x7f64a60bb203
    |          0xe01170
    |
    ---0.11%-- dcache_readdir
               vfs_readdir
               sys_getdents
               system_call
               0x7f64a60ade65

See, the last element is greater than the second last.

-g fractal looks like this:

15.05% dbench [kernel.kallsyms] [k]
copy_user_generic_string
|
--- copy_user_generic_string
    |
    |---1.09%-- generic_file_aio_read
    |           do_sync_read
    |           vfs_read
    |           |
    |           |--0.55%-- sys_pread64
    |           |          system_call
    |           |          0x7f64a60bb193
    |           |
    |           --2.19%-- sys_read
    |                     system_call
    |                     0x7f64a60d3ea0
    |
    |--0.69%-- generic_file_buffered_write
    |          __generic_file_aio_write
    |          generic_file_aio_write
    |          do_sync_write
    |          vfs_write
    |          sys_pwrite64
    |          system_call
    |          0x7f64a60bb203
    |          0xe01170
    |
    |---0.72%-- dcache_readdir
    |           vfs_readdir
    |           sys_getdents
    |           system_call
    |           0x7f64a60ade65

So it's totally screwy.
* Re: buggy perf callgraph output
  2010-12-15 13:02 ` Nick Piggin
@ 2010-12-16 11:47 ` Frederic Weisbecker
  2010-12-17  5:32   ` Nick Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2010-12-16 11:47 UTC (permalink / raw)
To: Nick Piggin; +Cc: Arnaldo Carvalho de Melo, linux-kernel

On Thu, Dec 16, 2010 at 12:02:45AM +1100, Nick Piggin wrote:
> On Wed, Dec 08, 2010 at 10:48:13PM +0100, Frederic Weisbecker wrote:
> > I can not reproduce it. Could you please try to reproduce,
> > run perf archive and send me your perf.data.tar.bz2 ?
>
> It seems to be happening all the time, just look further in
> callgraphs.
>
> This attached perf.data.bz2 looks like this, when using -g graph
>
> 15.05% dbench [kernel.kallsyms] [k]
> copy_user_generic_string
> |
> --- copy_user_generic_string
>     |
>     |---0.16%-- generic_file_aio_read
>     |           do_sync_read
>     |           vfs_read
>     |           |
>     |           --0.05%-- sys_pread64
>     |                     system_call
>     |                     0x7f64a60bb193
>     |
>     |--0.10%-- generic_file_buffered_write
>     |          __generic_file_aio_write
>     |          generic_file_aio_write
>     |          do_sync_write
>     |          vfs_write
>     |          sys_pwrite64
>     |          system_call
>     |          0x7f64a60bb203
>     |          0xe01170
>     |
>     ---0.11%-- dcache_readdir
>                vfs_readdir
>                sys_getdents
>                system_call
>                0x7f64a60ade65
>
> See, the last element is greater than the second last.
>
> -g fractal looks like this:
>
> 15.05% dbench [kernel.kallsyms] [k]
> copy_user_generic_string
> |
> --- copy_user_generic_string
>     |
>     |---1.09%-- generic_file_aio_read
>     |           do_sync_read
>     |           vfs_read
>     |           |
>     |           |--0.55%-- sys_pread64
>     |           |          system_call
>     |           |          0x7f64a60bb193
>     |           |
>     |           --2.19%-- sys_read
>     |                     system_call
>     |                     0x7f64a60d3ea0
>     |
>     |--0.69%-- generic_file_buffered_write
>     |          __generic_file_aio_write
>     |          generic_file_aio_write
>     |          do_sync_write
>     |          vfs_write
>     |          sys_pwrite64
>     |          system_call
>     |          0x7f64a60bb203
>     |          0xe01170
>     |
>     |---0.72%-- dcache_readdir
>     |           vfs_readdir
>     |           sys_getdents
>     |           system_call
>     |           0x7f64a60ade65
>
>
> So it's totally screwy.

First time I see this.

I can reproduce but differently because I miss your perf.data.tar.bz2
that results after the "perf archive" command, then the chains are based
on addresses and not on symbols.

But I found one of these problems on your file even without the symbols:

--- 0x8152b50e
    |
    |---6.54%-- 0x810e83a7
    |           0x810d8590
    |           0x810d8710
    |           0x81002cbb
    |           0xa60ade65
    |
    |--13.76%-- 0x810dbc9f
    |           |
    |           |--44.73%-- 0x810d1ff5
    |           |           |
    |           |           |--38.87%-- 0x810d44c4
    |           |           |           0x810d5242
    |           |           |           |
    |           |           |           |--82.03%-- 0x810d5f12
    |           |           |           |           0x810cbfd7
    |           |           |           |           0x810cc046
    |           |           |           |           0x810cc1ff
    |           |           |           |           0x81002cbb
    |           |           |           |           0xa60d37f5
    |           |           |           |           |
    |           |           |           |           |--86.66%-- 0xe01170
    |           |           |           |           |
    |           |           |           |           --13.34%-- 0x6c632f73
    |           |           |           |
    |           |           |           --17.97%-- 0x810d629f
    |           |           |                      0x810c63a3
    |           |           |                      0x810c648b
    |           |           |                      0x81002cbb
    |           |           |                      0xa60d3cb0

These are not the two last on the chain so it seems to happen even more
randomly.

I'll fix that, thanks!
* Re: buggy perf callgraph output
  2010-12-16 11:47 ` Frederic Weisbecker
@ 2010-12-17  5:32 ` Nick Piggin
  2010-12-23 12:55   ` Frederic Weisbecker
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Piggin @ 2010-12-17 5:32 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Nick Piggin, Arnaldo Carvalho de Melo, linux-kernel

On Thu, Dec 16, 2010 at 12:47:58PM +0100, Frederic Weisbecker wrote:
> On Thu, Dec 16, 2010 at 12:02:45AM +1100, Nick Piggin wrote:
> > On Wed, Dec 08, 2010 at 10:48:13PM +0100, Frederic Weisbecker wrote:
> > > I can not reproduce it. Could you please try to reproduce,
> > > run perf archive and send me your perf.data.tar.bz2 ?
> >
> > It seems to be happening all the time, just look further in
> > callgraphs.
> >
> > This attached perf.data.bz2 looks like this, when using -g graph
> >
> > 15.05% dbench [kernel.kallsyms] [k]
> > copy_user_generic_string
> > |
> > --- copy_user_generic_string
> >     |
> >     |---0.16%-- generic_file_aio_read
> >     |           do_sync_read
> >     |           vfs_read
> >     |           |
> >     |           --0.05%-- sys_pread64
> >     |                     system_call
> >     |                     0x7f64a60bb193
> >     |
> >     |--0.10%-- generic_file_buffered_write
> >     |          __generic_file_aio_write
> >     |          generic_file_aio_write
> >     |          do_sync_write
> >     |          vfs_write
> >     |          sys_pwrite64
> >     |          system_call
> >     |          0x7f64a60bb203
> >     |          0xe01170
> >     |
> >     ---0.11%-- dcache_readdir
> >                vfs_readdir
> >                sys_getdents
> >                system_call
> >                0x7f64a60ade65
> >
> > See, the last element is greater than the second last.
> >
> > -g fractal looks like this:
> >
> > 15.05% dbench [kernel.kallsyms] [k]
> > copy_user_generic_string
> > |
> > --- copy_user_generic_string
> >     |
> >     |---1.09%-- generic_file_aio_read
> >     |           do_sync_read
> >     |           vfs_read
> >     |           |
> >     |           |--0.55%-- sys_pread64
> >     |           |          system_call
> >     |           |          0x7f64a60bb193
> >     |           |
> >     |           --2.19%-- sys_read
> >     |                     system_call
> >     |                     0x7f64a60d3ea0
> >     |
> >     |--0.69%-- generic_file_buffered_write
> >     |          __generic_file_aio_write
> >     |          generic_file_aio_write
> >     |          do_sync_write
> >     |          vfs_write
> >     |          sys_pwrite64
> >     |          system_call
> >     |          0x7f64a60bb203
> >     |          0xe01170
> >     |
> >     |---0.72%-- dcache_readdir
> >     |           vfs_readdir
> >     |           sys_getdents
> >     |           system_call
> >     |           0x7f64a60ade65
> >
> >
> > So it's totally screwy.
>
> First time I see this.
>
> I can reproduce but differently because I miss your perf.data.tar.bz2
> that results after the "perf archive" command, then the chains are based
> on addresses and not on symbols.

Oh I didn't realize perf archive, noted for next time, thanks.

> But I found one of these problems on your file even without the symbols:
>
> --- 0x8152b50e
>     |
>     |---6.54%-- 0x810e83a7
>     |           0x810d8590
>     |           0x810d8710
>     |           0x81002cbb
>     |           0xa60ade65
>     |
>     |--13.76%-- 0x810dbc9f
>     |           |
>     |           |--44.73%-- 0x810d1ff5
>     |           |           |
>     |           |           |--38.87%-- 0x810d44c4
>     |           |           |           0x810d5242
>     |           |           |           |
>     |           |           |           |--82.03%-- 0x810d5f12
>     |           |           |           |           0x810cbfd7
>     |           |           |           |           0x810cc046
>     |           |           |           |           0x810cc1ff
>     |           |           |           |           0x81002cbb
>     |           |           |           |           0xa60d37f5
>     |           |           |           |           |
>     |           |           |           |           |--86.66%-- 0xe01170
>     |           |           |           |           |
>     |           |           |           |           --13.34%-- 0x6c632f73
>     |           |           |           |
>     |           |           |           --17.97%-- 0x810d629f
>     |           |           |                      0x810c63a3
>     |           |           |                      0x810c648b
>     |           |           |                      0x81002cbb
>     |           |           |                      0xa60d3cb0
>
> These are not the two last on the chain so it seems to happen even more
> randomly.
>
> I'll fix that, thanks!

That would be great, thanks! It's a really awesome tool, I appreciate
your effort on it.

Thanks,
Nick
* Re: buggy perf callgraph output
  2010-12-17  5:32 ` Nick Piggin
@ 2010-12-23 12:55 ` Frederic Weisbecker
  2010-12-23 13:08   ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2010-12-23 12:55 UTC (permalink / raw)
To: Nick Piggin; +Cc: Arnaldo Carvalho de Melo, linux-kernel

On Fri, Dec 17, 2010 at 04:32:58PM +1100, Nick Piggin wrote:
> That would be great, thanks! It's a really awesome tool, I appreciate
> your effort on it.

Thanks :)

So the below patch should fix the issue. Tell me if you encounter it
again, or anything else.

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index d503670..c749ba6 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -356,7 +356,7 @@ static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
 
 static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
 				     int depth, int depth_mask, int period,
-				     u64 total_samples, int hits,
+				     u64 total_samples, u64 hits,
 				     int left_margin)
 {
 	int i;
* Re: buggy perf callgraph output
  2010-12-23 12:55 ` Frederic Weisbecker
@ 2010-12-23 13:08 ` Ingo Molnar
  2010-12-23 13:35   ` Frederic Weisbecker
  0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2010-12-23 13:08 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Nick Piggin, Arnaldo Carvalho de Melo, linux-kernel

* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Fri, Dec 17, 2010 at 04:32:58PM +1100, Nick Piggin wrote:
> > That would be great, thanks! It's a really awesome tool, I appreciate
> > your effort on it.
>
> Thanks :)
>
> So the below patch should fix the issue. Tell me if you encounter it
> again, or anything else.
>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index d503670..c749ba6 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -356,7 +356,7 @@ static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
>
>  static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
>  				     int depth, int depth_mask, int period,
> -				     u64 total_samples, int hits,
> +				     u64 total_samples, u64 hits,
>  				     int left_margin)

Hmm ... nice fix but why didn't GCC warn about the loss of half of the
'cumul' bits in this call:

	ret += ipchain__fprintf_graph(fp, chain, depth,
				      new_depth_mask, i++,
				      new_total,
				      cumul,
				      left_margin);

We really, really want GCC to warn about this - it's such a common
mistake. What disabled that warning? Very weird.

Thanks,

	Ingo
* Re: buggy perf callgraph output
  2010-12-23 13:08 ` Ingo Molnar
@ 2010-12-23 13:35 ` Frederic Weisbecker
  0 siblings, 0 replies; 8+ messages in thread
From: Frederic Weisbecker @ 2010-12-23 13:35 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Nick Piggin, Arnaldo Carvalho de Melo, linux-kernel

On Thu, Dec 23, 2010 at 02:08:20PM +0100, Ingo Molnar wrote:
>
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> > On Fri, Dec 17, 2010 at 04:32:58PM +1100, Nick Piggin wrote:
> > > That would be great, thanks! It's a really awesome tool, I appreciate
> > > your effort on it.
> >
> > Thanks :)
> >
> > So the below patch should fix the issue. Tell me if you encounter it
> > again, or anything else.
> >
> > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > index d503670..c749ba6 100644
> > --- a/tools/perf/util/hist.c
> > +++ b/tools/perf/util/hist.c
> > @@ -356,7 +356,7 @@ static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
> >
> >  static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
> >  				     int depth, int depth_mask, int period,
> > -				     u64 total_samples, int hits,
> > +				     u64 total_samples, u64 hits,
> >  				     int left_margin)
>
> Hmm ... nice fix but why didn't GCC warn about the loss of half of the
> 'cumul' bits in this call:
>
> 	ret += ipchain__fprintf_graph(fp, chain, depth,
> 				      new_depth_mask, i++,
> 				      new_total,
> 				      cumul,
> 				      left_margin);
>
> We really, really want GCC to warn about this - it's such a common
> mistake. What disabled that warning? Very weird.

No idea. I thought perf had that in its warning arsenal :)

> Thanks,
>
> 	Ingo
end of thread, other threads: [~2010-12-23 13:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2010-12-08 16:40 buggy perf callgraph output Nick Piggin
2010-12-08 21:48 ` Frederic Weisbecker
2010-12-15 13:02   ` Nick Piggin
2010-12-16 11:47     ` Frederic Weisbecker
2010-12-17  5:32       ` Nick Piggin
2010-12-23 12:55         ` Frederic Weisbecker
2010-12-23 13:08           ` Ingo Molnar
2010-12-23 13:35             ` Frederic Weisbecker