From: Andi Kleen <andi@firstfloor.org>
To: acme@kernel.org
Cc: linux-kernel@vger.kernel.org, jolsa@kernel.org,
eranian@google.com, kan.liang@linux.intel.com,
peterz@infradead.org
Subject: Optimize perf stat for large number of events/cpus v2
Date: Sun, 20 Oct 2019 10:51:53 -0700
Message-ID: <20191020175202.32456-1-andi@firstfloor.org>
[The earlier v1 version had a lot of conflicts against some
recent libperf changes in tip/perf/core. Resolve that and
also fix some minor issues.]
This patch kit optimizes perf stat for a large number of events
on systems with many CPUs and PMUs.
Profiling shows that most of the overhead comes from IPIs to
all the target CPUs. We can optimize this by using sched_setaffinity
to set the affinity to a target CPU once and then doing
the perf operation for all events on that CPU. This requires
some restructuring, but cuts the setup time quite a bit.
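To illustrate the batching idea, here is a minimal sketch (not the
code from the patches). It saves the current affinity mask, pins the
calling thread to each target CPU once, runs a per-event operation for
all events on that CPU, and restores the mask at the end. The names
for_each_cpu_batched, event_op_fn and count_op are made up for this
example; in perf the operation would be perf_event_open(), an ioctl,
a read or a close. It also assumes a fixed-size cpu_set_t, which only
covers CPU numbers below CPU_SETSIZE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Hypothetical per-event operation; returns 0 on success. */
typedef int (*event_op_fn)(int cpu, int event_idx, void *ctx);

/*
 * Run op for every (cpu, event) pair, migrating to each CPU only once.
 * Returns the number of successful operations, or -1 if the original
 * affinity mask could not be saved.
 */
int for_each_cpu_batched(const int *cpus, int nr_cpus, int nr_events,
                         event_op_fn op, void *ctx)
{
    cpu_set_t orig, target;
    int done = 0;

    assert(op != NULL);

    /* May fail on machines with more CPUs than a fixed cpu_set_t holds. */
    if (sched_getaffinity(0, sizeof(orig), &orig) < 0)
        return -1;

    for (int i = 0; i < nr_cpus; i++) {
        if (cpus[i] < 0 || cpus[i] >= CPU_SETSIZE)
            continue;

        CPU_ZERO(&target);
        CPU_SET(cpus[i], &target);
        /*
         * If the move fails (CPU offline, cpuset restrictions) we still
         * run the operations; they are just not local to the target CPU.
         */
        if (sched_setaffinity(0, sizeof(target), &target) < 0)
            perror("sched_setaffinity");

        /* All events for this CPU are handled while we run on it. */
        for (int e = 0; e < nr_events; e++) {
            if (op(cpus[i], e, ctx) == 0)
                done++;
        }
    }

    /* Restore the affinity the thread had on entry. */
    sched_setaffinity(0, sizeof(orig), &orig);
    return done;
}

/* Example callback: only counts how often it was invoked. */
int count_op(int cpu, int event_idx, void *ctx)
{
    (void)cpu;
    (void)event_idx;
    (*(int *)ctx)++;
    return 0;
}
```

With this loop order the thread migrates once per CPU rather than
triggering a cross-CPU call for every (event, CPU) pair.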
In theory we could go further by parallelizing these setups
too, but that would be much more complicated, and for now just batching it
per CPU seems to be sufficient. At some point, with many more cores,
parallelization or a better bulk perf setup API might be needed though.
In addition, perf does a lot of redundant /sys accesses with
many PMUs, which can also be expensive. This is also optimized.
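As a rough illustration of avoiding repeated /sys lookups (this is
not the implementation in the patch, just one way to memoize them),
the sketch below remembers the result of an existence check per path,
so a second query for the same path does not hit the file system
again. The names cached_path_exists, path_cache and path_cache_misses,
and the fixed table sizes, are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

#define CACHE_SLOTS    64   /* arbitrary size for the sketch */
#define CACHE_PATH_MAX 256

struct path_cache_entry {
    char path[CACHE_PATH_MAX];
    bool exists;
    bool used;
};

static struct path_cache_entry path_cache[CACHE_SLOTS];
int path_cache_misses;      /* number of real access() calls made */

/* Return whether path exists, asking the file system at most once per path. */
bool cached_path_exists(const char *path)
{
    int free_slot = -1;
    size_t len;
    bool exists;

    assert(path != NULL);
    len = strlen(path);

    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (!path_cache[i].used) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (strcmp(path_cache[i].path, path) == 0)
            return path_cache[i].exists;    /* hit: no syscall */
    }

    /* Miss: do the real lookup once. */
    exists = access(path, F_OK) == 0;
    path_cache_misses++;

    /* Remember the answer if there is room and the path fits. */
    if (free_slot >= 0 && len < CACHE_PATH_MAX) {
        memcpy(path_cache[free_slot].path, path, len + 1);
        path_cache[free_slot].exists = exists;
        path_cache[free_slot].used = true;
    }
    return exists;
}
```

A linear table keeps the sketch short; a real tool with hundreds of
paths would more likely use a hash table.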
On a large test case (>700 events with many weak groups) on a 94 CPU
system I go from
real 0m8.607s
user 0m0.550s
sys 0m8.041s
to
real 0m3.269s
user 0m0.760s
sys 0m1.694s
so shaving ~6 seconds of system time, at slightly more cost
in perf stat itself. On a 4 socket system the savings
are more dramatic:
real 0m15.641s
user 0m0.873s
sys 0m14.729s
to
real 0m4.493s
user 0m1.578s
sys 0m2.444s
so an 11s difference in the user-visible setup time.
Also available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/stat-scale-4
v1: Initial post.
v2: Rebase. Fix some minor issues.
-Andi
Thread overview: 28+ messages
2019-10-20 17:51 Andi Kleen [this message]
2019-10-20 17:51 ` [PATCH v2 1/9] perf evsel: Always preserve errno while cleaning up perf_event_open failures Andi Kleen
2019-10-22 8:01 ` Jiri Olsa
2019-11-12 11:18 ` [tip: perf/core] " tip-bot2 for Andi Kleen
2019-10-20 17:51 ` [PATCH v2 2/9] perf evsel: Avoid close(-1) Andi Kleen
2019-10-22 8:01 ` Jiri Olsa
2019-11-12 11:18 ` [tip: perf/core] " tip-bot2 for Andi Kleen
2019-10-20 17:51 ` [PATCH v2 3/9] perf pmu: Use file system cache to optimize sysfs access Andi Kleen
2019-10-23 9:47 ` Jiri Olsa
2019-10-20 17:51 ` [PATCH v2 4/9] perf affinity: Add infrastructure to save/restore affinity Andi Kleen
2019-10-23 9:59 ` Jiri Olsa
2019-10-23 13:02 ` Andi Kleen
2019-10-23 14:30 ` Jiri Olsa
2019-10-23 14:52 ` Andi Kleen
2019-10-23 16:16 ` Alexey Budankov
2019-10-23 17:19 ` Andi Kleen
2019-10-23 18:08 ` Alexey Budankov
2019-10-23 22:37 ` Andi Kleen
2019-10-24 8:46 ` Alexey Budankov
2019-10-20 17:51 ` [PATCH v2 5/9] perf evsel: Add iterator to iterate over events ordered by CPU Andi Kleen
2019-10-20 17:51 ` [PATCH v2 6/9] perf stat: Use affinity for closing file descriptors Andi Kleen
2019-10-20 17:52 ` [PATCH v2 7/9] perf stat: Use affinity for opening events Andi Kleen
2019-10-20 17:52 ` [PATCH v2 8/9] perf stat: Use affinity for reading Andi Kleen
2019-10-20 17:52 ` [PATCH v2 9/9] perf stat: Use affinity for enabling/disabling events Andi Kleen
2019-10-23 10:30 ` Jiri Olsa
2019-10-23 13:07 ` Andi Kleen
2019-10-22 8:02 ` Optimize perf stat for large number of events/cpus v2 Jiri Olsa
2019-10-22 14:11 ` Arnaldo Carvalho de Melo