From: Frederic Weisbecker <fweisbec@gmail.com>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Paul Mackerras <paulus@samba.org>, Ingo Molnar <mingo@kernel.org>,
Namhyung Kim <namhyung.kim@lge.com>,
LKML <linux-kernel@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jiri Olsa <jolsa@redhat.com>
Subject: Re: [PATCH 1/8] perf callchain: Convert children list to rbtree
Date: Wed, 2 Oct 2013 12:18:28 +0200
Message-ID: <20131002101826.GC7941@localhost.localdomain>
In-Reply-To: <1380185890-25758-2-git-send-email-namhyung@kernel.org>
On Thu, Sep 26, 2013 at 05:58:03PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@lge.com>
>
> The current collapse stage has a scalability problem which can be
> reproduced easily with a parallel kernel build. This is because it
> needs to traverse all children of a callchain linearly during the
> collapse/merge stage. Converting it to an rbtree reduced the overhead
> significantly.
>
> On my 400MB perf.data file, which was recorded with a make -j32 kernel build:
>
> $ time perf --no-pager report --stdio > /dev/null
>
> before:
> real 6m22.073s
> user 6m18.683s
> sys 0m0.706s
>
> after:
> real 0m20.780s
> user 0m19.962s
> sys 0m0.689s
>
> During the perf report the overhead on append_chain_children went down
> from 96.69% to 18.16%:
>
> -  18.16%  perf  perf          [.] append_chain_children
>    - append_chain_children
>       - 77.48% append_chain_children
>          + 69.79% merge_chain_branch
>          - 22.96% append_chain_children
>             + 67.44% merge_chain_branch
>             + 30.15% append_chain_children
>             + 2.41% callchain_append
>          + 7.25% callchain_append
>       + 12.26% callchain_append
>       + 10.22% merge_chain_branch
> +  11.58%  perf  perf          [.] dso__find_symbol
> +   8.02%  perf  perf          [.] sort__comm_cmp
> +   5.48%  perf  libc-2.17.so  [.] malloc_consolidate
>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Link: http://lkml.kernel.org/n/tip-d9tcfow6stbrp4btvgs51y67@git.kernel.org
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Have you tested this patchset when collapsing is not used?

There is a fair chance that this patchset improves not only collapsing
but also callchain insertion in general, so it's probably a win in any case.
Still, it would be nice to confirm that, because we are getting rid of
collapsing anyway.

A test that would tell us is to run "perf report -s sym" and compare the
time it takes to complete before and after this patch, because "-s sym"
shouldn't involve collapses. In fact, sorting by anything that is not comm
should do the trick.

Thanks.
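
For illustration only, the cost difference described in the quoted changelog
(a linear walk over every child versus an ordered-tree lookup) can be sketched
in userspace C. This is not the perf code: struct child, cmp_ip(), list_find(),
tree_find() and count_comparisons() are hypothetical names, the keys are made-up
addresses, and POSIX tsearch()/tfind() stand in for the kernel rbtree (POSIX does
not mandate balancing, although common libc implementations use a balanced tree).
The sketch only counts key comparisons; it does not measure time.

```c
#include <assert.h>
#include <search.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a callchain child, keyed by an address. */
struct child {
	uint64_t ip;
	struct child *next;	/* used only by the list variant */
};

static unsigned long nr_cmp;	/* key comparisons performed so far */

static int cmp_ip(const void *a, const void *b)
{
	const struct child *ca = a, *cb = b;

	nr_cmp++;
	return (ca->ip > cb->ip) - (ca->ip < cb->ip);
}

/* Linear scan: compare against each sibling until the key matches. */
static struct child *list_find(struct child *head, uint64_t ip)
{
	struct child key = { .ip = ip };

	for (struct child *c = head; c; c = c->next)
		if (cmp_ip(&key, c) == 0)
			return c;
	return NULL;
}

/* Ordered lookup: tfind() descends the tree instead of scanning. */
static struct child *tree_find(void *root, uint64_t ip)
{
	struct child key = { .ip = ip };
	void *res = tfind(&key, &root, cmp_ip);

	return res ? *(struct child **)res : NULL;
}

/*
 * Look up each of nr children once in a list and once in a tree, and
 * report how many comparisons each variant needed.
 * Returns 0 on success, -1 on allocation failure.
 */
int count_comparisons(size_t nr, unsigned long *list_cmp,
		      unsigned long *tree_cmp)
{
	struct child *nodes = calloc(nr, sizeof(*nodes));
	struct child *head = NULL;
	void *root = NULL;
	int ret = -1;
	size_t i;

	if (!nodes)
		return -1;

	for (i = 0; i < nr; i++) {
		nodes[i].ip = 0x400000 + 16 * (uint64_t)i;	/* made-up key */
		nodes[i].next = head;
		head = &nodes[i];
		if (!tsearch(&nodes[i], &root, cmp_ip))
			goto out;
	}

	nr_cmp = 0;
	for (i = 0; i < nr; i++)
		assert(list_find(head, nodes[i].ip) == &nodes[i]);
	*list_cmp = nr_cmp;

	nr_cmp = 0;
	for (i = 0; i < nr; i++)
		assert(tree_find(root, nodes[i].ip) == &nodes[i]);
	*tree_cmp = nr_cmp;

	ret = 0;
out:
	/* Release the tree's internal nodes before freeing the keys. */
	for (i = 0; i < nr; i++)
		tdelete(&nodes[i], &root, cmp_ip);
	free(nodes);
	return ret;
}
```

Calling count_comparisons() with a few different values of nr shows the list
total growing quadratically with the number of children, while the tree total
grows far more slowly when the libc tree is balanced.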
Thread overview: 21+ messages
2013-09-26 8:58 [PATCHSET 0/8] perf tools: Fix scalability problem on callchain merging (v4) Namhyung Kim
2013-09-26 8:58 ` [PATCH 1/8] perf callchain: Convert children list to rbtree Namhyung Kim
2013-10-02 10:18 ` Frederic Weisbecker [this message]
2013-10-08 2:03 ` Namhyung Kim
2013-10-08 19:22 ` Frederic Weisbecker
2013-10-10 1:06 ` Namhyung Kim
2013-09-26 8:58 ` [PATCH 2/8] perf ui/progress: Add new helper functions for progress bar Namhyung Kim
2013-09-26 8:58 ` [PATCH 3/8] perf tools: Show progress on histogram collapsing Namhyung Kim
2013-09-26 8:58 ` [PATCH 4/8] perf tools: Use an accessor to read thread comm Namhyung Kim
2013-09-26 8:58 ` [PATCH 5/8] perf tools: Add time argument on comm setting Namhyung Kim
2013-09-26 8:58 ` [PATCH 6/8] perf tools: Add new comm infrastructure Namhyung Kim
2013-09-26 8:58 ` [PATCH 7/8] perf tools: Compare hists comm by addresses Namhyung Kim
2013-09-26 8:58 ` [PATCH 8/8] perf tools: Get current comm instead of last one Namhyung Kim
2013-10-02 10:01 ` Frederic Weisbecker
2013-10-08 1:56 ` Namhyung Kim
2013-09-26 9:34 ` [PATCHSET 0/8] perf tools: Fix scalability problem on callchain merging (v4) Ingo Molnar
2013-09-27 2:08 ` Namhyung Kim
2013-09-26 13:46 ` David Ahern
2013-09-26 14:07 ` Arnaldo Carvalho de Melo
2013-09-27 2:10 ` Namhyung Kim
-- strict thread matches above, loose matches on Subject: below --
2013-10-11 5:15 [PATCHSET 0/8] perf tools: Fix scalability problem on callchain merging (v5) Namhyung Kim
2013-10-11 5:15 ` [PATCH 1/8] perf callchain: Convert children list to rbtree Namhyung Kim