From: Waiman Long <waiman.long@hp.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
Jiri Olsa <jolsa@redhat.com>,
Stephane Eranian <eranian@google.com>,
Namhyung Kim <namhyung@kernel.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Paul Mackerras <paulus@samba.org>, Ingo Molnar <mingo@redhat.com>,
linux-kernel@vger.kernel.org,
"Chandramouleeswaran, Aswin" <aswin@hp.com>, "Norton, Scott J" <scott.norton@hp.com>
Subject: Re: [PATCH v2] perf record: fix symbol processing bug and greatly improve performance
Date: Fri, 05 Jul 2013 13:09:17 -0400
Message-ID: <51D6FDBD.6010606@hp.com>
In-Reply-To: <20130510081214.GA6848@gmail.com>
On 05/10/2013 04:12 AM, Ingo Molnar wrote:
> * Waiman Long<Waiman.Long@hp.com> wrote:
>
>> When "perf record" was used on a large machine with a lot of CPUs,
>> the perf post-processing time (the time after the workload was done
>> until the perf command itself exited) could take many minutes
>> or even hours, depending on how large the resulting perf.data file was.
>>
>> While running AIM7 1500-user high_systime workload on an 80-core
>> x86-64 system with a 3.9 kernel (with only the -s -a options used),
>> the workload itself took about 2 minutes to run and the perf.data
>> file had a size of 1108.746 MB. However, the post-processing step
>> took more than 10 minutes.
>>
>> With a gprof-profiled perf binary, the time spent by perf was as
>> follows:
>>
>>   %   cumulative   self               self     total
>>  time   seconds   seconds     calls  s/call   s/call  name
>>  96.90    822.10   822.10    192156    0.00     0.00  dsos__find
>>   0.81    828.96     6.86 172089958    0.00     0.00  rb_next
>>   0.41    832.44     3.48  48539289    0.00     0.00  rb_erase
>>
>> So 97% (822 seconds) of the time was spent in a single dsos__find()
>> function. After analyzing the call-graph data below:
>>
>> -----------------------------------------------
>>                 0.00  822.12  192156/192156        map__new [6]
>> [7]     96.9    0.00  822.12  192156           vdso__dso_findnew [7]
>>               822.10    0.00  192156/192156        dsos__find [8]
>>                 0.01    0.00  192156/192156        dsos__add [62]
>>                 0.01    0.00  192156/192366        dso__new [61]
>>                 0.00    0.00       1/45282525      memdup [31]
>>                 0.00    0.00  192156/192230        dso__set_long_name [91]
>> -----------------------------------------------
>>               822.10    0.00  192156/192156        vdso__dso_findnew [7]
>> [8]     96.9  822.10    0.00  192156           dsos__find [8]
>> -----------------------------------------------
>>
>> It was found that the vdso__dso_findnew() function failed to locate
>> VDSO__MAP_NAME ("[vdso]") in the dso list and had to insert a new
>> entry at the end 192156 times. This problem is due to the fact that
>> there are 2 types of names in the dso entry: short name and long name.
>> The initial dso__new() adds "[vdso]" to both the short and long names.
>> After that, vdso__dso_findnew() modifies the long name to something
>> like /tmp/perf-vdso.so-NoXkDj. The dsos__find() function only compares
>> the long name. As a result, the same vdso entry is duplicated many
>> times in the dso list. This bug increases memory consumption and
>> slows symbol processing to a crawl.
>>
>> To resolve this problem, the dsos__find() function interface was
>> modified to enable searching either the long name or the short
>> name. The vdso__dso_findnew() will now search only the short name
>> while the other call sites search for the long name as before.
>>
>> With this change, the cpu time of perf was reduced from 848.38s to
>> 15.77s and dsos__find() only accounted for 0.06% of the total time.
>>
>> 0.06 15.73 0.01 192151 0.00 0.00 dsos__find
>>
>> Signed-off-by: Waiman Long<Waiman.Long@hp.com>
>> ---
>>  tools/perf/util/dso.c  | 10 ++++++++--
>>  tools/perf/util/dso.h  |  3 ++-
>>  tools/perf/util/vdso.c |  2 +-
>>  3 files changed, 11 insertions(+), 4 deletions(-)
> Acked-by: Ingo Molnar<mingo@kernel.org>
>
> Thanks,
>
> Ingo
Thanks for the Ack. Will that patch go into v3.11?
Regards,
Longman
Thread overview: 6+ messages
2013-05-09 14:42 [PATCH v2] perf record: fix symbol processing bug and greatly improve performance Waiman Long
2013-05-10 8:12 ` Ingo Molnar
2013-07-05 17:09 ` Waiman Long [this message]
2013-07-05 17:26 ` Arnaldo Carvalho de Melo
2013-07-05 17:44 ` Waiman Long
2013-07-12 8:52 ` [tip:perf/urgent] perf symbols: Fix vdso list searching tip-bot for Waiman Long