* Perf report --percentage
@ 2016-06-07 19:21 Sebastian, Shiny
2016-06-08 7:34 ` Milian Wolff
0 siblings, 1 reply; 9+ messages in thread
From: Sebastian, Shiny @ 2016-06-07 19:21 UTC (permalink / raw)
To: linux-perf-users@vger.kernel.org
Hi,
I work with Intel and am running perf tool within Linux VMs on Windows Hyper-V. With kernel 4.6 and corresponding perf version, percentages of call stacks at deeper levels do not seem to add up to 100% or parent value (26.97%): example below. I have tried the -percentage flag with both 'absolute' and 'relative' values and they don't seem to change the output at all. Is this a bug or something wrong with my perf flags ?
Commands used:
#perf record -a -g -c 10000 sleep 10
#perf report -i cpu3.data.old -C 1 --no-children
- 27.48% fio [kernel.kallsyms] [k] __softirqentry_text_start
__softirqentry_text_start
- irq_exit
- 26.97% smp_call_function_single_interrupt
- call_function_single_interrupt
+ 4.21% __blk_run_queue
+ 4.09% hv_ringbuffer_write
+ 1.83% 0x99c
+ 1.78% __blockdev_direct_IO
+ 0.96% do_blockdev_direct_IO
+ 0.95% generic_make_request
+ 0.86% blk_queue_bio
+ 0.70% iov_iter_get_pages
+ 0.64% md_make_request
+ 0.55% do_io_submit
0.51% entry_SYSCALL_64_fastpath
+ 0.51% hyperv_vector_handler
Older version: Perf version 4.2 on kernel 4.2 - seems to normalize them to 100%.
27.69% swapper [kernel.kallsyms] [k] intel_idle
|
---intel_idle
|
|--99.84%-- cpuidle_enter_state
| cpuidle_enter
| call_cpuidle
| cpu_startup_entry
| |
| |--97.27%-- start_secondary
| |
| --2.73%-- rest_init
| start_kernel
| x86_64_start_reservations
| x86_64_start_kernel
--0.16%-- [...]
I am running SLES 12.
Thanks,
Shiny
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Perf report --percentage
2016-06-07 19:21 Perf report --percentage Sebastian, Shiny
@ 2016-06-08 7:34 ` Milian Wolff
2016-06-08 16:21 ` Sebastian, Shiny
2016-06-09 15:13 ` Arnaldo Carvalho de Melo
0 siblings, 2 replies; 9+ messages in thread
From: Milian Wolff @ 2016-06-08 7:34 UTC (permalink / raw)
To: shiny.sebastian; +Cc: linux-perf-users
[-- Attachment #1: Type: text/plain, Size: 4075 bytes --]
On Dienstag, 7. Juni 2016 19:21:15 CEST Sebastian, Shiny wrote:
> Hi,
>
> I work with Intel and am running perf tool within Linux VMs on Windows
> Hyper-V. With kernel 4.6 and corresponding perf version, percentages of
> call stacks at deeper levels do not seem to add up to 100% or parent value
> (26.97%): example below. I have tried the -percentage flag with both
> 'absolute' and 'relative' values and they don't seem to change the output
> at all. Is this a bug or something wrong with my perf flags ?
>
> Commands used:
> #perf record -a -g -c 10000 sleep 10
> #perf report -i cpu3.data.old -C 1 --no-children
>
> - 27.48% fio [kernel.kallsyms] [k] __softirqentry_text_start
> __softirqentry_text_start
> - irq_exit
> - 26.97% smp_call_function_single_interrupt
> - call_function_single_interrupt
> + 4.21% __blk_run_queue
> + 4.09% hv_ringbuffer_write
> + 1.83% 0x99c
> + 1.78% __blockdev_direct_IO
> + 0.96% do_blockdev_direct_IO
> + 0.95% generic_make_request
> + 0.86% blk_queue_bio
> + 0.70% iov_iter_get_pages
> + 0.64% md_make_request
> + 0.55% do_io_submit
> 0.51% entry_SYSCALL_64_fastpath
> + 0.51% hyperv_vector_handler
Can you try to look at the output of
perf report -i cpu3.data.old -C 1 --no-children -g graph,0
i.e. potentially there are more entries with a cost of 0.5% or less, which by
default falls below the threshold.
That said, I just tested it and it doesn't work for me either:
tmp$ perf record --call-graph dwarf kwrite
[ perf record: Woken up 65 times to write data ]
[ perf record: Captured and wrote 16.308 MB perf.data (2032 samples) ]
tmp$ perf report -g graph,0 --no-children
- 9.47% kwrite ld-2.23.so [.] do_lookup_x
- 2.21% do_lookup_x
- _dl_lookup_symbol_x
+ 1.42% _dl_relocate_object
+ 0.74% _dl_fixup
+ 7.06% kwrite ld-2.23.so [.] strcmp
It's also bogus when I look at it with --children:
- 10.76% 9.47% kwrite ld-2.23.so [.] do_lookup_x
+ 1.54% _start
+ 1.34% do_lookup_x
+ 0.61% _dl_runtime_resolve_avx
- 7.06% 7.06% kwrite ld-2.23.so [.] strcmp
+ 0.42% _start
+ 0.22% _dl_runtime_resolve_avx
So someone else with more insight should answer whether this output can be
explained somehow.
> Older version: Perf version 4.2 on kernel 4.2 - seems to normalize them to
> 100%. 27.69% swapper [kernel.kallsyms] [k] intel_idle
>
> ---intel_idle
>
> |--99.84%-- cpuidle_enter_state
> |
> | cpuidle_enter
> | call_cpuidle
> | cpu_startup_entry
> |
> | |--97.27%-- start_secondary
> | |
> | --2.73%-- rest_init
> |
> | start_kernel
> | x86_64_start_reservations
> | x86_64_start_kernel
>
> --0.16%-- [...]
This output can be reproduced on newer perf by passing `-g fractal` to `perf
report`. I still get the issues as shown above though with a newer perf:
- 9.47% kwrite ld-2.23.so [.] do_lookup_x
+ 23.31% do_lookup_x
- 7.06% kwrite ld-2.23.so [.] strcmp
+ 9.06% strcmp
+ 5.71% kwrite ld-2.23.so [.] check_match
I.e. ~74% and 91% of the samples don't seen to have a proper backtrace
attribution?
Cheers
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5903 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Perf report --percentage
2016-06-08 7:34 ` Milian Wolff
@ 2016-06-08 16:21 ` Sebastian, Shiny
2016-06-11 8:31 ` Namhyung Kim
2016-06-09 15:13 ` Arnaldo Carvalho de Melo
1 sibling, 1 reply; 9+ messages in thread
From: Sebastian, Shiny @ 2016-06-08 16:21 UTC (permalink / raw)
To: Milian Wolff; +Cc: linux-perf-users@vger.kernel.org
Thanks Milian. I added -g graph,0 and it did add lot more percentages - it's still not adding up though.
#perf report -C 1 --no-children -g graph,0
output:
- 19.84% fio [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
- _raw_spin_unlock_irqrestore
+ 7.88% hv_ringbuffer_write
+ 4.88% scsi_end_request
+ 3.43% mix_pool_bytes
+ 0.92% aio_complete
+ 0.85% blk_run_queue
+ 0.80% scsi_put_command
+ 0.37% dio_bio_end_aio
+ 0.35% do_blockdev_direct_IO
+ 0.07% dio_complete
+ 0.07% scsi_run_queue
+ 0.06% __blockdev_direct_IO
+ 0.05% scsi_io_completion
+ 0.04% bio_endio
+ 0.02% add_timer_randomness
+ 0.02% vmbus_sendpacket_mpb_desc
+ 0.01% hrtimer_start_range_ns
Thanks,
Shiny
-----Original Message-----
From: milian [mailto:milian.wolff@kdab.com]
Sent: Wednesday, June 8, 2016 12:35 AM
To: Sebastian, Shiny <shiny.sebastian@intel.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: Perf report --percentage
On Dienstag, 7. Juni 2016 19:21:15 CEST Sebastian, Shiny wrote:
> Hi,
>
> I work with Intel and am running perf tool within Linux VMs on Windows
> Hyper-V. With kernel 4.6 and corresponding perf version, percentages of
> call stacks at deeper levels do not seem to add up to 100% or parent value
> (26.97%): example below. I have tried the -percentage flag with both
> 'absolute' and 'relative' values and they don't seem to change the output
> at all. Is this a bug or something wrong with my perf flags ?
>
> Commands used:
> #perf record -a -g -c 10000 sleep 10
> #perf report -i cpu3.data.old -C 1 --no-children
>
> - 27.48% fio [kernel.kallsyms] [k] __softirqentry_text_start
> __softirqentry_text_start
> - irq_exit
> - 26.97% smp_call_function_single_interrupt
> - call_function_single_interrupt
> + 4.21% __blk_run_queue
> + 4.09% hv_ringbuffer_write
> + 1.83% 0x99c
> + 1.78% __blockdev_direct_IO
> + 0.96% do_blockdev_direct_IO
> + 0.95% generic_make_request
> + 0.86% blk_queue_bio
> + 0.70% iov_iter_get_pages
> + 0.64% md_make_request
> + 0.55% do_io_submit
> 0.51% entry_SYSCALL_64_fastpath
> + 0.51% hyperv_vector_handler
Can you try to look at the output of
perf report -i cpu3.data.old -C 1 --no-children -g graph,0
i.e. potentially there are more entries with a cost of 0.5% or less, which by
default falls below the threshold.
That said, I just tested it and it doesn't work for me either:
tmp$ perf record --call-graph dwarf kwrite
[ perf record: Woken up 65 times to write data ]
[ perf record: Captured and wrote 16.308 MB perf.data (2032 samples) ]
tmp$ perf report -g graph,0 --no-children
- 9.47% kwrite ld-2.23.so [.] do_lookup_x
- 2.21% do_lookup_x
- _dl_lookup_symbol_x
+ 1.42% _dl_relocate_object
+ 0.74% _dl_fixup
+ 7.06% kwrite ld-2.23.so [.] strcmp
It's also bogus when I look at it with --children:
- 10.76% 9.47% kwrite ld-2.23.so [.] do_lookup_x
+ 1.54% _start
+ 1.34% do_lookup_x
+ 0.61% _dl_runtime_resolve_avx
- 7.06% 7.06% kwrite ld-2.23.so [.] strcmp
+ 0.42% _start
+ 0.22% _dl_runtime_resolve_avx
So someone else with more insight should answer whether this output can be
explained somehow.
> Older version: Perf version 4.2 on kernel 4.2 - seems to normalize them to
> 100%. 27.69% swapper [kernel.kallsyms] [k] intel_idle
>
> ---intel_idle
>
> |--99.84%-- cpuidle_enter_state
> |
> | cpuidle_enter
> | call_cpuidle
> | cpu_startup_entry
> |
> | |--97.27%-- start_secondary
> | |
> | --2.73%-- rest_init
> |
> | start_kernel
> | x86_64_start_reservations
> | x86_64_start_kernel
>
> --0.16%-- [...]
This output can be reproduced on newer perf by passing `-g fractal` to `perf
report`. I still get the issues as shown above though with a newer perf:
- 9.47% kwrite ld-2.23.so [.] do_lookup_x
+ 23.31% do_lookup_x
- 7.06% kwrite ld-2.23.so [.] strcmp
+ 9.06% strcmp
+ 5.71% kwrite ld-2.23.so [.] check_match
I.e. ~74% and 91% of the samples don't seen to have a proper backtrace
attribution?
Cheers
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Perf report --percentage
2016-06-08 7:34 ` Milian Wolff
2016-06-08 16:21 ` Sebastian, Shiny
@ 2016-06-09 15:13 ` Arnaldo Carvalho de Melo
2016-06-10 3:14 ` Namhyung Kim
1 sibling, 1 reply; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-06-09 15:13 UTC (permalink / raw)
To: Namhyung Kim
Cc: Milian Wolff, shiny.sebastian, Frédéric Weisbecker,
David Ahern, Jiri Olsa, linux-perf-users
Em Wed, Jun 08, 2016 at 09:34:58AM +0200, Milian Wolff escreveu:
> On Dienstag, 7. Juni 2016 19:21:15 CEST Sebastian, Shiny wrote:
> > I work with Intel and am running perf tool within Linux VMs on Windows
> > Hyper-V. With kernel 4.6 and corresponding perf version, percentages of
> > call stacks at deeper levels do not seem to add up to 100% or parent value
> > (26.97%): example below. I have tried the -percentage flag with both
> > 'absolute' and 'relative' values and they don't seem to change the output
> > at all. Is this a bug or something wrong with my perf flags ?
> > Commands used:
> > #perf record -a -g -c 10000 sleep 10
> > #perf report -i cpu3.data.old -C 1 --no-children
> >
> > - 27.48% fio [kernel.kallsyms] [k] __softirqentry_text_start
> > __softirqentry_text_start
> > - irq_exit
> > - 26.97% smp_call_function_single_interrupt
> > - call_function_single_interrupt
> > + 4.21% __blk_run_queue
> > + 4.09% hv_ringbuffer_write
> > + 1.83% 0x99c
> > + 1.78% __blockdev_direct_IO
> > + 0.96% do_blockdev_direct_IO
> > + 0.95% generic_make_request
> > + 0.86% blk_queue_bio
> > + 0.70% iov_iter_get_pages
> > + 0.64% md_make_request
> > + 0.55% do_io_submit
> > 0.51% entry_SYSCALL_64_fastpath
> > + 0.51% hyperv_vector_handler
>
> Can you try to look at the output of
>
> perf report -i cpu3.data.old -C 1 --no-children -g graph,0
>
> i.e. potentially there are more entries with a cost of 0.5% or less, which by
> default falls below the threshold.
>
> That said, I just tested it and it doesn't work for me either:
Namyung, can you please take a look at this?
- Arnaldo
> tmp$ perf record --call-graph dwarf kwrite
> [ perf record: Woken up 65 times to write data ]
> [ perf record: Captured and wrote 16.308 MB perf.data (2032 samples) ]
> tmp$ perf report -g graph,0 --no-children
>
> - 9.47% kwrite ld-2.23.so [.] do_lookup_x
> - 2.21% do_lookup_x
> - _dl_lookup_symbol_x
> + 1.42% _dl_relocate_object
> + 0.74% _dl_fixup
> + 7.06% kwrite ld-2.23.so [.] strcmp
>
> It's also bogus when I look at it with --children:
>
> - 10.76% 9.47% kwrite ld-2.23.so [.] do_lookup_x
> + 1.54% _start
> + 1.34% do_lookup_x
> + 0.61% _dl_runtime_resolve_avx
> - 7.06% 7.06% kwrite ld-2.23.so [.] strcmp
> + 0.42% _start
> + 0.22% _dl_runtime_resolve_avx
>
> So someone else with more insight should answer whether this output can be
> explained somehow.
>
> > Older version: Perf version 4.2 on kernel 4.2 - seems to normalize them to
> > 100%. 27.69% swapper [kernel.kallsyms] [k] intel_idle
> >
> > ---intel_idle
> >
> > |--99.84%-- cpuidle_enter_state
> > |
> > | cpuidle_enter
> > | call_cpuidle
> > | cpu_startup_entry
> > |
> > | |--97.27%-- start_secondary
> > | |
> > | --2.73%-- rest_init
> > |
> > | start_kernel
> > | x86_64_start_reservations
> > | x86_64_start_kernel
> >
> > --0.16%-- [...]
>
> This output can be reproduced on newer perf by passing `-g fractal` to `perf
> report`. I still get the issues as shown above though with a newer perf:
>
> - 9.47% kwrite ld-2.23.so [.] do_lookup_x
> + 23.31% do_lookup_x
> - 7.06% kwrite ld-2.23.so [.] strcmp
> + 9.06% strcmp
> + 5.71% kwrite ld-2.23.so [.] check_match
>
> I.e. ~74% and 91% of the samples don't seen to have a proper backtrace
> attribution?
>
> Cheers
>
> --
> Milian Wolff | milian.wolff@kdab.com | Software Engineer
> KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
> Tel: +49-30-521325470
> KDAB - The Qt Experts
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Perf report --percentage
2016-06-09 15:13 ` Arnaldo Carvalho de Melo
@ 2016-06-10 3:14 ` Namhyung Kim
0 siblings, 0 replies; 9+ messages in thread
From: Namhyung Kim @ 2016-06-10 3:14 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Milian Wolff, shiny.sebastian, Frédéric Weisbecker,
David Ahern, Jiri Olsa, linux-perf-users
Hi Arnaldo,
On Fri, Jun 10, 2016 at 12:13 AM, Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
> Em Wed, Jun 08, 2016 at 09:34:58AM +0200, Milian Wolff escreveu:
>> On Dienstag, 7. Juni 2016 19:21:15 CEST Sebastian, Shiny wrote:
>> > I work with Intel and am running perf tool within Linux VMs on Windows
>> > Hyper-V. With kernel 4.6 and corresponding perf version, percentages of
>> > call stacks at deeper levels do not seem to add up to 100% or parent value
>> > (26.97%): example below. I have tried the -percentage flag with both
>> > 'absolute' and 'relative' values and they don't seem to change the output
>> > at all. Is this a bug or something wrong with my perf flags ?
>
>> > Commands used:
>> > #perf record -a -g -c 10000 sleep 10
>> > #perf report -i cpu3.data.old -C 1 --no-children
>> >
>> > - 27.48% fio [kernel.kallsyms] [k] __softirqentry_text_start
>> > __softirqentry_text_start
>> > - irq_exit
>> > - 26.97% smp_call_function_single_interrupt
>> > - call_function_single_interrupt
>> > + 4.21% __blk_run_queue
>> > + 4.09% hv_ringbuffer_write
>> > + 1.83% 0x99c
>> > + 1.78% __blockdev_direct_IO
>> > + 0.96% do_blockdev_direct_IO
>> > + 0.95% generic_make_request
>> > + 0.86% blk_queue_bio
>> > + 0.70% iov_iter_get_pages
>> > + 0.64% md_make_request
>> > + 0.55% do_io_submit
>> > 0.51% entry_SYSCALL_64_fastpath
>> > + 0.51% hyperv_vector_handler
>>
>> Can you try to look at the output of
>>
>> perf report -i cpu3.data.old -C 1 --no-children -g graph,0
>>
>> i.e. potentially there are more entries with a cost of 0.5% or less, which by
>> default falls below the threshold.
>>
>> That said, I just tested it and it doesn't work for me either:
>
> Namyung, can you please take a look at this?
Will do.
Thanks,
Namhyung
>
>> tmp$ perf record --call-graph dwarf kwrite
>> [ perf record: Woken up 65 times to write data ]
>> [ perf record: Captured and wrote 16.308 MB perf.data (2032 samples) ]
>> tmp$ perf report -g graph,0 --no-children
>>
>> - 9.47% kwrite ld-2.23.so [.] do_lookup_x
>> - 2.21% do_lookup_x
>> - _dl_lookup_symbol_x
>> + 1.42% _dl_relocate_object
>> + 0.74% _dl_fixup
>> + 7.06% kwrite ld-2.23.so [.] strcmp
>>
>> It's also bogus when I look at it with --children:
>>
>> - 10.76% 9.47% kwrite ld-2.23.so [.] do_lookup_x
>> + 1.54% _start
>> + 1.34% do_lookup_x
>> + 0.61% _dl_runtime_resolve_avx
>> - 7.06% 7.06% kwrite ld-2.23.so [.] strcmp
>> + 0.42% _start
>> + 0.22% _dl_runtime_resolve_avx
>>
>> So someone else with more insight should answer whether this output can be
>> explained somehow.
>>
>> > Older version: Perf version 4.2 on kernel 4.2 - seems to normalize them to
>> > 100%. 27.69% swapper [kernel.kallsyms] [k] intel_idle
>> >
>> > ---intel_idle
>> >
>> > |--99.84%-- cpuidle_enter_state
>> > |
>> > | cpuidle_enter
>> > | call_cpuidle
>> > | cpu_startup_entry
>> > |
>> > | |--97.27%-- start_secondary
>> > | |
>> > | --2.73%-- rest_init
>> > |
>> > | start_kernel
>> > | x86_64_start_reservations
>> > | x86_64_start_kernel
>> >
>> > --0.16%-- [...]
>>
>> This output can be reproduced on newer perf by passing `-g fractal` to `perf
>> report`. I still get the issues as shown above though with a newer perf:
>>
>> - 9.47% kwrite ld-2.23.so [.] do_lookup_x
>> + 23.31% do_lookup_x
>> - 7.06% kwrite ld-2.23.so [.] strcmp
>> + 9.06% strcmp
>> + 5.71% kwrite ld-2.23.so [.] check_match
>>
>> I.e. ~74% and 91% of the samples don't seen to have a proper backtrace
>> attribution?
>>
>> Cheers
>>
>> --
>> Milian Wolff | milian.wolff@kdab.com | Software Engineer
>> KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
>> Tel: +49-30-521325470
>> KDAB - The Qt Experts
>
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Perf report --percentage
2016-06-08 16:21 ` Sebastian, Shiny
@ 2016-06-11 8:31 ` Namhyung Kim
2016-06-11 11:53 ` Milian Wolff
0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2016-06-11 8:31 UTC (permalink / raw)
To: Sebastian, Shiny; +Cc: Milian Wolff, linux-perf-users@vger.kernel.org
Hello,
On Wed, Jun 08, 2016 at 04:21:43PM +0000, Sebastian, Shiny wrote:
> Thanks Milian. I added -g graph,0 and it did add lot more percentages - it's still not adding up though.
>
> #perf report -C 1 --no-children -g graph,0
I think it's because you used -C option to restrict samples only from
cpu 1. One way to check this is adding 'cpu' to sort keys and compare
sample count and callchain count.
What is the result of this command?
# perf report -C 1 --no-children -g graph,0,count -n -s +cpu
Thanks,
Namhyung
> output:
> - 19.84% fio [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
> - _raw_spin_unlock_irqrestore
> + 7.88% hv_ringbuffer_write
> + 4.88% scsi_end_request
> + 3.43% mix_pool_bytes
> + 0.92% aio_complete
> + 0.85% blk_run_queue
> + 0.80% scsi_put_command
> + 0.37% dio_bio_end_aio
> + 0.35% do_blockdev_direct_IO
> + 0.07% dio_complete
> + 0.07% scsi_run_queue
> + 0.06% __blockdev_direct_IO
> + 0.05% scsi_io_completion
> + 0.04% bio_endio
> + 0.02% add_timer_randomness
> + 0.02% vmbus_sendpacket_mpb_desc
> + 0.01% hrtimer_start_range_ns
>
>
> Thanks,
> Shiny
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Perf report --percentage
2016-06-11 8:31 ` Namhyung Kim
@ 2016-06-11 11:53 ` Milian Wolff
2016-06-12 13:55 ` Namhyung Kim
0 siblings, 1 reply; 9+ messages in thread
From: Milian Wolff @ 2016-06-11 11:53 UTC (permalink / raw)
To: namhyung; +Cc: shiny.sebastian, linux-perf-users
[-- Attachment #1: Type: text/plain, Size: 1195 bytes --]
On Samstag, 11. Juni 2016 17:31:48 CEST Namhyung Kim wrote:
> Hello,
>
> On Wed, Jun 08, 2016 at 04:21:43PM +0000, Sebastian, Shiny wrote:
> > Thanks Milian. I added -g graph,0 and it did add lot more percentages -
> > it's still not adding up though.
> >
> > #perf report -C 1 --no-children -g graph,0
>
> I think it's because you used -C option to restrict samples only from
> cpu 1. One way to check this is adding 'cpu' to sort keys and compare
> sample count and callchain count.
>
> What is the result of this command?
>
> # perf report -C 1 --no-children -g graph,0,count -n -s +cpu
If that is the case, how would you explain my findings? Note how I did not
pass `-C 1`, and still missed up to 90% of backtraces:
$ perf report --no-children -g fractal,0
- 9.47% kwrite ld-2.23.so [.] do_lookup_x
+ 23.31% do_lookup_x
- 7.06% kwrite ld-2.23.so [.] strcmp
+ 9.06% strcmp
+ 5.71% kwrite ld-2.23.so [.] check_match
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5903 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Perf report --percentage
2016-06-11 11:53 ` Milian Wolff
@ 2016-06-12 13:55 ` Namhyung Kim
2016-06-13 11:01 ` Milian Wolff
0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2016-06-12 13:55 UTC (permalink / raw)
To: Milian Wolff; +Cc: shiny.sebastian, linux-perf-users
Hi Milian,
On Sat, Jun 11, 2016 at 01:53:34PM +0200, Milian Wolff wrote:
> On Samstag, 11. Juni 2016 17:31:48 CEST Namhyung Kim wrote:
> > Hello,
> >
> > On Wed, Jun 08, 2016 at 04:21:43PM +0000, Sebastian, Shiny wrote:
> > > Thanks Milian. I added -g graph,0 and it did add lot more percentages -
> > > it's still not adding up though.
> > >
> > > #perf report -C 1 --no-children -g graph,0
> >
> > I think it's because you used -C option to restrict samples only from
> > cpu 1. One way to check this is adding 'cpu' to sort keys and compare
> > sample count and callchain count.
> >
> > What is the result of this command?
> >
> > # perf report -C 1 --no-children -g graph,0,count -n -s +cpu
>
> If that is the case, how would you explain my findings? Note how I did not
> pass `-C 1`, and still missed up to 90% of backtraces:
>
> $ perf report --no-children -g fractal,0
>
> - 9.47% kwrite ld-2.23.so [.] do_lookup_x
> + 23.31% do_lookup_x
> - 7.06% kwrite ld-2.23.so [.] strcmp
> + 9.06% strcmp
> + 5.71% kwrite ld-2.23.so [.] check_match
Hmm.. I have no idea. Could you please show me the count of
callchains with following command?
$ perf report --no-children -g graph,0,count -n
Also it'd be great if you double check the output of 'perf script' to
ensure all the samples have callchains..
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Perf report --percentage
2016-06-12 13:55 ` Namhyung Kim
@ 2016-06-13 11:01 ` Milian Wolff
0 siblings, 0 replies; 9+ messages in thread
From: Milian Wolff @ 2016-06-13 11:01 UTC (permalink / raw)
To: namhyung; +Cc: shiny.sebastian, linux-perf-users
[-- Attachment #1: Type: text/plain, Size: 3397 bytes --]
On Sonntag, 12. Juni 2016 22:55:26 CEST Namhyung Kim wrote:
> Hi Milian,
>
> On Sat, Jun 11, 2016 at 01:53:34PM +0200, Milian Wolff wrote:
> > On Samstag, 11. Juni 2016 17:31:48 CEST Namhyung Kim wrote:
> > > Hello,
> > >
> > > On Wed, Jun 08, 2016 at 04:21:43PM +0000, Sebastian, Shiny wrote:
> > > > Thanks Milian. I added -g graph,0 and it did add lot more percentages
> > > > -
> > > > it's still not adding up though.
> > > >
> > > > #perf report -C 1 --no-children -g graph,0
> > >
> > > I think it's because you used -C option to restrict samples only from
> > > cpu 1. One way to check this is adding 'cpu' to sort keys and compare
> > > sample count and callchain count.
> > >
> > > What is the result of this command?
> > >
> > > # perf report -C 1 --no-children -g graph,0,count -n -s +cpu
> >
> > If that is the case, how would you explain my findings? Note how I did not
> > pass `-C 1`, and still missed up to 90% of backtraces:
> >
> > $ perf report --no-children -g fractal,0
> >
> > - 9.47% kwrite ld-2.23.so [.]
> > do_lookup_x
> >
> > + 23.31% do_lookup_x
> >
> > - 7.06% kwrite ld-2.23.so [.] strcmp
> >
> > + 9.06% strcmp
> >
> > + 5.71% kwrite ld-2.23.so [.]
> > check_match
>
> Hmm.. I have no idea. Could you please show me the count of
> callchains with following command?
>
> $ perf report --no-children -g graph,0,count -n
- 12.43% 252 kwrite ld-2.23.so [.]
do_lookup_x
51 do_lookup_x
+ _dl_lookup_symbol_x
- 10.46% 223 kwrite ld-2.23.so [.]
strcmp
+ 37 strcmp
- 8.66% 175 kwrite ld-2.23.so [.]
check_match
44 check_match
do_lookup_x
+ _dl_lookup_symbol_x
Note that I could not reproduce that on a different machine of mine.
> Also it'd be great if you double check the output of 'perf script' to
> ensure all the samples have callchains..
Quite a few samples don't have callchains:
kwrite 14838 31462.912726: 222759 cycles:u:
kwrite 14838 31462.913158: 207623 cycles:u:
kwrite 14838 31462.913428: 196656 cycles:u:
kwrite 14838 31462.913532: 194832 cycles:u:
kwrite 14838 31462.913714: 229283 cycles:u:
kwrite 14838 31462.913895: 239871 cycles:u:
kwrite 14838 31462.914303: 234688 cycles:u:
kwrite 14838 31462.914384: 223273 cycles:u:
kwrite 14838 31462.914533: 281279 cycles:u:
So this quite possibly is the issue in my case.
Could we improve `perf report` to show a "<no stack>" entry or similar? I.e.
something like this:
- 12.43% 252 kwrite ld-2.23.so [.]
do_lookup_x
51 do_lookup_x
+ _dl_lookup_symbol_x
201 <no stack>
- 10.46% 223 kwrite ld-2.23.so [.]
strcmp
+ 37 strcmp
186 <no stack>
- 8.66% 175 kwrite ld-2.23.so [.]
check_match
44 check_match
do_lookup_x
+ _dl_lookup_symbol_x
131 <no stack>
I'll have a look at implementing that. Should be simple enough, I hope.
Bye
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5903 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-06-13 11:01 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-07 19:21 Perf report --percentage Sebastian, Shiny
2016-06-08 7:34 ` Milian Wolff
2016-06-08 16:21 ` Sebastian, Shiny
2016-06-11 8:31 ` Namhyung Kim
2016-06-11 11:53 ` Milian Wolff
2016-06-12 13:55 ` Namhyung Kim
2016-06-13 11:01 ` Milian Wolff
2016-06-09 15:13 ` Arnaldo Carvalho de Melo
2016-06-10 3:14 ` Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).