Re: [PATCH 2/3] perf tools: Spare double comparison of callchain first entry

From: Namhyung Kim <namhyung@kernel.org>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	David Ahern <dsahern@gmail.com>, Ingo Molnar <mingo@kernel.org>,
	Jiri Olsa <jolsa@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH 2/3] perf tools: Spare double comparison of callchain first entry
Date: Thu, 16 Jan 2014 10:17:53 +0900
Message-ID: <87d2js9132.fsf@sejong.aot.lge.com>
In-Reply-To: <20140115165927.GA21574@localhost.localdomain> (Frederic Weisbecker's message of "Wed, 15 Jan 2014 17:59:30 +0100")

Hi Frederic,

On Wed, 15 Jan 2014 17:59:30 +0100, Frederic Weisbecker wrote:
> On Wed, Jan 15, 2014 at 03:23:46PM +0900, Namhyung Kim wrote:
>> On Tue, 14 Jan 2014 16:37:15 +0100, Frederic Weisbecker wrote:
>> > When a new callchain child branch matches an existing one in the rbtree,
>> > the comparison of its first entry is performed twice:
>> >
>> > 1) From append_chain_children() on branch lookup
>> >
>> > 2) If 1) reports a match, append_chain() then compares all entries of
>> > the new branch against the matching node in the rbtree, and this
>> > comparison includes the first entry of the new branch again.
>> 
>> Right.
>> 
>> >
>> > Let's shortcut this by performing the whole comparison only from
>> > append_chain() which then returns the result of the comparison between
>> > the first entry of the new branch and the iterating node in the rbtree.
>> > If the first entry matches, the lookup on the current level of siblings
>> > stops and propagates to the children of the matching nodes.
>> 
>> Hmm..  it looks like I thought that directly calling append_chain() had
>> some overhead - but it doesn't.
>
> No, that's a valid concern. I worried as well because I wasn't sure whether
> matches outnumber mismatches on the first entry. I'd tend to think that the
> first entry fails to match most often, in which case calling match_chain()
> first may be more efficient as a fast path (ie: calling append_chain()
> involves one more function call and a few other details).
>
> But in the end, measurements haven't shown a significant difference before
> and after the patch.

I think if the sort key doesn't contain "symbol", the unmatch case would
become more frequent, as a wider variety of callchains would go into the
same entry.
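
To make the double comparison discussed above concrete, here is a minimal
sketch in C. It is not the perf code: the toy_* names, the flat sibling
array (perf keeps siblings in an rbtree) and the nr_cmp counter are all
assumptions made for illustration, and every node and chain is assumed to
be non-empty.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a callchain node: a run of instruction pointers. */
struct toy_node {
	const uint64_t *ips;
	size_t nr;
};

/* Number of entry comparisons performed so far (illustration only). */
static unsigned long nr_cmp;

/* Compare two entries: <0, 0 or >0, like a sort comparator. */
static int toy_match(uint64_t a, uint64_t b)
{
	nr_cmp++;
	return (a > b) - (a < b);
}

/*
 * Compare the new chain against a node entry by entry, stopping at the
 * first mismatch. Returns the result of the first-entry comparison and
 * stores the number of matching leading entries in *matched.
 */
static int toy_append_chain(const struct toy_node *node,
			    const uint64_t *chain, size_t nr, size_t *matched)
{
	int first = 0;
	size_t i;

	*matched = 0;
	for (i = 0; i < node->nr && i < nr; i++) {
		int ret = toy_match(node->ips[i], chain[i]);

		if (i == 0)
			first = ret;
		if (ret)
			break;
		(*matched)++;
	}
	return first;
}

/* "Before": check the first entry, then compare the whole chain again. */
static long toy_lookup_before(const struct toy_node *sib, size_t nr_sib,
			      const uint64_t *chain, size_t nr)
{
	size_t i, matched;

	for (i = 0; i < nr_sib; i++) {
		if (toy_match(sib[i].ips[0], chain[0]) == 0) {
			toy_append_chain(&sib[i], chain, nr, &matched);
			return (long)i;
		}
	}
	return -1;
}

/* "After": a single pass that also reports the first-entry result. */
static long toy_lookup_after(const struct toy_node *sib, size_t nr_sib,
			     const uint64_t *chain, size_t nr)
{
	size_t i, matched;

	for (i = 0; i < nr_sib; i++) {
		if (toy_append_chain(&sib[i], chain, nr, &matched) == 0)
			return (long)i;
	}
	return -1;
}
```

In this toy model the "after" variant saves exactly one comparison per
lookup that finds a matching sibling, and costs the same number of
comparisons on a sibling whose first entry does not match. What it cannot
show is the extra function call overhead on the unmatch path, which is the
part that needs measuring.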

>
>> 
>> >
>> > This results in fewer comparisons performed by the CPU.
>> 
>> Do you have any numbers?  I suspect it'd not be a big change, but just
>> curious.
>
> So I compared before/after the patchset (which includes the cursor restore
> removal) with:
>
> 	1) Some big hackbench-like load that generates > 200 MB perf.data
>
> 	perf record -g -- perf bench sched messaging -l $SOME_BIG_NUMBER
>
> 	2) Compare before/after with the following reports:
>
> 	perf stat perf report --stdio > /dev/null
> 	perf stat perf report --stdio -s sym > /dev/null
> 	perf stat perf report --stdio -G > /dev/null
> 	perf stat perf report --stdio -g fractal,0.5,caller,address > /dev/null 
>
> And most of the time I had < 0.01% difference in completion time in favour
> of the patchset (which may in the end be due to the cursor restore removal
> patch).
>
> So, all in all, there was no really interesting difference. If you want the
> actual results I can definitely relaunch the tests.

So as an extreme case, could you please also test the "-s cpu" case and
share the numbers?

Thanks,
Namhyung

Thread overview: 16+ messages
2014-01-14 15:37 perf tools: Random cleanups Frederic Weisbecker
2014-01-14 15:37 ` [PATCH 1/3] perf tools: Do proper comm override error handling Frederic Weisbecker
2014-01-15  5:54   ` Namhyung Kim
2014-01-19 12:25   ` [tip:perf/core] " tip-bot for Frederic Weisbecker
2014-01-14 15:37 ` [PATCH 2/3] perf tools: Spare double comparison of callchain first entry Frederic Weisbecker
2014-01-15  6:23   ` Namhyung Kim
2014-01-15 16:59     ` Frederic Weisbecker
2014-01-16  1:17       ` Namhyung Kim [this message]
2014-01-16 17:34         ` Frederic Weisbecker
2014-01-16 19:47           ` Arnaldo Carvalho de Melo
2014-01-17  7:56             ` Namhyung Kim
2014-01-17 16:07               ` Frederic Weisbecker
2014-01-19 12:25   ` [tip:perf/core] perf callchain: " tip-bot for Frederic Weisbecker
2014-01-14 15:37 ` [PATCH 3/3] perf tools: Remove unnecessary callchain cursor state restore on unmatch Frederic Weisbecker
2014-01-15  6:24   ` Namhyung Kim
2014-01-19 12:25   ` [tip:perf/core] " tip-bot for Frederic Weisbecker
