From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-perf-users@vger.kernel.org, Song Liu <song@kernel.org>,
Hao Luo <haoluo@google.com>,
bpf@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>
Subject: Re: [PATCHSET 0/7] perf lock contention: Improve performance if map is full (v1)
Date: Thu, 6 Apr 2023 21:41:28 -0300
Message-ID: <ZC9muCsvRwV3ImhI@kernel.org>
In-Reply-To: <20230406210611.1622492-1-namhyung@kernel.org>
On Thu, Apr 06, 2023 at 02:06:04PM -0700, Namhyung Kim wrote:
> Hello,
>
> I got a report that the overhead of perf lock contention is too big
> in some cases. It was running in the task aggregation mode (-t) at
> the time, and there were lots of tasks contending with each other.
>
> It turned out that the hash map update is the problem. The results
> are saved in the lock_stat hash map, which is pre-allocated. The BPF
> program never deletes data from the map; it only adds entries. But
> once the map is full, trying to update it becomes a very heavy
> operation, since it has to check every CPU's freelist to find a free
> node to save the result. Yet we already know the update will fail
> when the map is full, so there is no need to even try.
Thanks, applied.
- Arnaldo
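
The idea in the last patch, as a minimal C sketch for the BPF side
(the flag and helper names below are illustrative, not the exact code
from the series): once an insert fails because the map is full,
remember that fact and skip all later updates.

	/* Sketch only: contention_key/contention_data come from
	 * tools/perf/util/bpf_skel/lock_data.h, and lock_stat is the
	 * pre-allocated hash map in lock_contention.bpf.c. */
	#define E2BIG 7			/* not defined in vmlinux.h */

	int data_fail;			/* reported as 'data' failures below */
	static int lock_stat_full;	/* hypothetical "map is full" flag */

	static inline void save_lock_stat(struct contention_key *key,
					  struct contention_data *data)
	{
		if (lock_stat_full) {
			/* An update would scan every CPU's freelist only
			 * to fail again, so don't even try. */
			__sync_fetch_and_add(&data_fail, 1);
			return;
		}

		/* -E2BIG means the pre-allocated map has no free node left. */
		if (bpf_map_update_elem(&lock_stat, key, data,
					BPF_NOEXIST) == -E2BIG) {
			lock_stat_full = 1;
			__sync_fetch_and_add(&data_fail, 1);
		}
	}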
> I've checked it on my 64-CPU machine with this:
>
> $ perf bench sched messaging -g 1000
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1000 groups == 40000 processes run
>
> Total time: 2.825 [sec]
>
> And I used the task mode so that the map is guaranteed to be full:
> the default map size is 16K entries and this workload has 40K tasks.
>
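
With the -M option from this series one could instead size the map to
fit every task; the value below is illustrative:

  $ sudo ./perf lock con -abt -M 65536 -- perf bench sched messaging -g 1000

But the point of the test is to exercise the full-map path, so the
default 16K entries is kept.
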
> Before:
> $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1000 groups == 40000 processes run
>
> Total time: 11.299 [sec]
>  contended   total wait     max wait     avg wait          pid  comm
>
>      19284       3.51 s      3.70 ms    181.91 us      1305863  sched-messaging
>        243     84.09 ms    466.67 us    346.04 us      1336608  sched-messaging
>        177     66.35 ms     12.08 ms    374.88 us      1220416  node
>
> After:
> $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1000 groups == 40000 processes run
>
> Total time: 3.044 [sec]
>  contended   total wait     max wait     avg wait          pid  comm
>
>      18743    591.92 ms    442.96 us     31.58 us      1431454  sched-messaging
>         51    210.64 ms    207.45 ms      4.13 ms      1468724  sched-messaging
>         81     68.61 ms     65.79 ms    847.07 us      1463183  sched-messaging
>
> === output for debug ===
>
> bad: 1164137, total: 2253341
> bad rate: 51.66 %
> histogram of failure reasons
> task: 0
> stack: 0
> time: 0
> data: 1164137
>
> The first few patches are small cleanups and fixes. You can get the code
> from 'perf/lock-map-v1' branch in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
> Namhyung Kim (7):
> perf lock contention: Simplify parse_lock_type()
> perf lock contention: Use -M for --map-nr-entries
> perf lock contention: Update default map size to 16384
> perf lock contention: Add data failure stat
> perf lock contention: Update total/bad stats for hidden entries
> perf lock contention: Revise needs_callstack() condition
> perf lock contention: Do not try to update if hash map is full
>
> tools/perf/Documentation/perf-lock.txt | 4 +-
> tools/perf/builtin-lock.c | 64 ++++++++-----------
> tools/perf/util/bpf_lock_contention.c | 7 +-
> .../perf/util/bpf_skel/lock_contention.bpf.c | 29 +++++++--
> tools/perf/util/bpf_skel/lock_data.h | 3 +
> tools/perf/util/lock-contention.h | 2 +
> 6 files changed, 60 insertions(+), 49 deletions(-)
>
>
> base-commit: e5116f46d44b72ede59a6923829f68a8b8f84e76
> --
> 2.40.0.577.gac1e443424-goog
>
--
- Arnaldo