All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <fweisbec@GMAIL.COM>
To: Ingo Molnar <mingo@elte.hu>
Cc: mingo@redhat.com, hpa@zytor.com, acme@redhat.com,
	linux-kernel@vger.kernel.org, peterz@infradead.org,
	efault@gmx.de, tglx@linutronix.de,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:perfcounters/core] perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode
Date: Sun, 9 Aug 2009 04:42:37 +0200	[thread overview]
Message-ID: <20090809024235.GA10146@nowhere> (raw)
In-Reply-To: <20090808122923.GA20064@elte.hu>

On Sat, Aug 08, 2009 at 02:29:23PM +0200, Ingo Molnar wrote:
> 
> * tip-bot for Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > Commit-ID:  84fce09c6b21ef675de079a724c13a205c89520b
> > Gitweb:     http://git.kernel.org/tip/84fce09c6b21ef675de079a724c13a205c89520b
> > Author:     Frederic Weisbecker <fweisbec@gmail.com>
> > AuthorDate: Sat, 8 Aug 2009 02:16:25 +0200
> > Committer:  Ingo Molnar <mingo@elte.hu>
> > CommitDate: Sat, 8 Aug 2009 13:49:21 +0200
> > 
> > perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode
> > 
> > When we filter the callchains below a given percentage, we
> > ignore them and the end result only shows entries that have an
> > upper percentage than the filter threshold.
> > 
> > It seems to users then that we have an imbalance in the
> > percentage, as if the sum inside a profiled branch doesn't
> > reach 100%.
> > 
> > Since in the past there have been real perf report bugs that
> > showed the same sypmtom, it would be nice to assure the user
> > that the data is perfect and trustable and it all sums up to
> > 100.00%.
> > 
> > So fix this by displaying the remaining hits that have been
> > filtered but without more detail than their amount in each
> > branches. Example while filtering below 50%:
> > 
> > 7.73%  [k] delay_tsc
> >                 |
> >                 |--98.22%-- __const_udelay
> >                 |          |
> >                 |          |--86.37%-- ath5k_hw_register_timeout
> >                 |          |          ath5k_hw_noise_floor_calibration
> >                 |          |          ath5k_hw_reset
> >                 |          |          ath5k_reset
> >                 |          |          ath5k_config
> >                 |          |          ieee80211_hw_config
> >                 |          |          |
> >                 |          |          |--88.53%-- ieee80211_scan_work
> >                 |          |          |          worker_thread
> >                 |          |          |          kthread
> >                 |          |          |          child_rip
> >                 |          |           --11.47%-- [...]
> >                 |           --13.63%-- [...]
> >                  --1.78%-- [...]
> 
> i dont seem to be able to get the bracketed numbers. I only get:
> 
>   titan:~> perf record -f -g -a ./hackbench 10
>   titan:~> perf report -g fractal,10.0
> 
> # Samples: 26832
> #
> # Overhead         Command             Shared Object  Symbol
> # ........  ..............  ........................  ......
> #
>      8.69%       hackbench  [kernel]                  [k] copy_user_generic_string
> 
>      6.84%       hackbench  [kernel]                  [k] unix_stream_recvmsg
> 
>      6.07%       hackbench  [kernel]                  [k] sock_alloc_send_pskb
> 
>      4.33%       hackbench  [kernel]                  [k] _spin_lock
> 
>      3.77%       hackbench  [kernel]                  [k] __kmalloc_node_track_caller
> 
>      3.74%       hackbench  [kernel]                  [k] __alloc_skb
>                 |          
>                 |--80.00%-- sock_alloc_send_pskb
>                 |          sock_alloc_send_skb
>                 |          unix_stream_sendmsg
>                 |          __sock_sendmsg
>                 |          sock_aio_write
>                 |          do_sync_write
>                 |          vfs_write
>                 |          sys_write
>                 |          sysenter_dispatch
>                 |          0xf7ffa430
>                 |          0xffadeb0000000014
>                 |          
>                  --20.00%-- sock_alloc_send_skb
>                            unix_stream_sendmsg
>                            __sock_sendmsg
>                            sock_aio_write
>                            do_sync_write
>                            vfs_write
>                            sys_write
>                            sysenter_dispatch
>                            0xf7ffa430
>                            0xffadeb0000000014
> 
>      3.53%       hackbench  [kernel]                  [k] ia32_sysenter_target
> 
>      2.92%       hackbench  [vdso]                    [.] 0x000000f7ffa430
>                 |          
>                 |--66.67%-- 0xf7ffa430
>                 |          0xffadeb0000000014
>                 |          
>                  --33.33%-- 0xf7ffa431
>                            0xffadeb0000000014
> 
>      2.56%       hackbench  [kernel]                  [k] virt_to_head_page
> 
>      2.37%       hackbench  [kernel]                  [k] kmem_cache_alloc_node
> 
>      2.33%       hackbench  [kernel]                  [k] _spin_lock_irqsave
>                 |          
>                 |--78.57%-- remove_wait_queue
>                 |          poll_freewait
>                 |          do_sys_poll
>                 |          sys_poll
>                 |          sysenter_dispatch
>                 |          0xf7ffa430
>                 |          0x1ffadea3c
>                 |          
>                 |--7.14%-- __up_read
>                 |          up_read
>                 |          do_page_fault
>                 |          page_fault
>                 |          0xf7ffa430
>                 |          0xa0df710000000a
>                 |          
>                 |--7.14%-- unix_stream_sendmsg
>                 |          __sock_sendmsg
>                 |          sock_aio_write
>                 |          do_sync_write
>                 |          vfs_write
>                 |          sys_write
>                 |          sysenter_dispatch
>                 |          0xf7ffa430
>                 |          0xffadeb0000000014
>                 |          
>                  --7.14%-- skb_queue_tail
>                            unix_stream_sendmsg
>                            __sock_sendmsg
>                            sock_aio_write
>                            do_sync_write
>                            vfs_write
>                            sys_write
>                            sysenter_dispatch
>                            0xf7ffa430
>                            0xffadeb0000000014
> 
> 
> I dont see the brackets anymore, nor does the 10% limit seem to have 
> been listened to for the 7.14% result above, right?
> 
> 	Ingo


Oh damn. I've been investigating for several hours in the wrong direction
just because this bug triggered only when I wasn't using -s s

Actually, it's just that the other kind of sorting make this bug occur
much more quickly: in the first results. It was then more obvious in these
cases, but the issue was present in every kind of sorting.

Please find the fix below, thanks!

Frederic.

---
>From 52c89279744db58a68fa2339b3c8038e67113fca Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sun, 9 Aug 2009 04:19:15 +0200
Subject: [PATCH] perf tools: callchain: Fix bad rounding of minimum rate

Sometimes we get callchain branches that have a rate under the limit
given by the user.

Say you launched:

perf record -f -g -a ./hackbench 10
perf report -g fractal,10.0

And you get:

2.33%       hackbench  [kernel]                  [k] _spin_lock_irqsave
                |
                |--78.57%-- remove_wait_queue
                |          poll_freewait
                |          do_sys_poll
                |          sys_poll
                |          sysenter_dispatch
                |          0xf7ffa430
                |          0x1ffadea3c
                |
                |--7.14%-- __up_read
                |          up_read
                |          do_page_fault
                |          page_fault
                |          0xf7ffa430
                |          0xa0df710000000a
                ...

It is abnormal to get a 7.14% branch whereas we passed a 10% filter.

The problem is that we round down the minimum threshold.
This happens mostly when we have very low number of events.
If the total amount of your branch is 4 and you have a subranch of 3
events, filtering to 90% will be computed like follows:

limit = 4 * 0.9;

The result is about 3.6, but the cast to integer will round down to 3.
It means that our filter is actually of 75%

We must then explicitly round up the minimum threshold.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 tools/perf/util/callchain.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index a8e67aa..0114734 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -13,6 +13,7 @@
 #include <stdio.h>
 #include <stdbool.h>
 #include <errno.h>
+#include <math.h>
 
 #include "callchain.h"
 
@@ -112,7 +113,7 @@ static void __sort_chain_graph_rel(struct callchain_node *node,
 	u64 min_hit;
 
 	node->rb_root = RB_ROOT;
-	min_hit = node->children_hit * min_percent / 100.0;
+	min_hit = ceil(node->children_hit * min_percent);
 
 	chain_for_each_child(child, node) {
 		__sort_chain_graph_rel(child, min_percent);
@@ -126,7 +127,7 @@ static void
 sort_chain_graph_rel(struct rb_root *rb_root, struct callchain_node *chain_root,
 		     u64 min_hit __used, struct callchain_param *param)
 {
-	__sort_chain_graph_rel(chain_root, param->min_percent);
+	__sort_chain_graph_rel(chain_root, param->min_percent / 100.0);
 	rb_root->rb_node = chain_root->rb_root.rb_node;
 }
 
-- 
1.6.2.3




  reply	other threads:[~2009-08-09  2:42 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-08  0:16 [PATCH 1/4] perf tools: callchain: Warn only once in empty node detection Frederic Weisbecker
2009-08-08  0:16 ` [PATCH 2/4] perf tools: callchain: Ignore empty callchains Frederic Weisbecker
2009-08-08 11:51   ` [tip:perfcounters/core] perf tools: callchain: Fix spurious 'perf report' warnings: ignore " tip-bot for Frederic Weisbecker
2009-08-09 11:09   ` tip-bot for Frederic Weisbecker
2009-08-08  0:16 ` [PATCH 3/4] perf tools: callchain: Default display callchain from report if recorded with -g Frederic Weisbecker
2009-08-08 11:51   ` [tip:perfcounters/core] perf tools: callchain: Fix 'perf report' display to be callchain by default tip-bot for Frederic Weisbecker
2009-08-09 11:09   ` tip-bot for Frederic Weisbecker
2009-08-08  0:16 ` [PATCH 4/4] perf tools: callchain: Display amount of ignored chains in fractal mode Frederic Weisbecker
2009-08-08 11:52   ` [tip:perfcounters/core] perf tools: callchain: Fix sum of percentages to be 100% by displaying " tip-bot for Frederic Weisbecker
2009-08-08 12:29     ` Ingo Molnar
2009-08-09  2:42       ` Frederic Weisbecker [this message]
2009-08-09 11:12         ` [tip:perfcounters/urgent] perf tools: callchain: Fix bad rounding of minimum rate tip-bot for Frederic Weisbecker
2009-08-09 11:10   ` [tip:perfcounters/core] perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode tip-bot for Frederic Weisbecker
2009-08-08  1:06 ` [PATCH 1/4] perf tools: callchain: Warn only once in empty node detection Arnaldo Carvalho de Melo
2009-08-08  1:10   ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090809024235.GA10146@nowhere \
    --to=fweisbec@gmail.com \
    --cc=acme@redhat.com \
    --cc=efault@gmx.de \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.