From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org, Song Liu <song@kernel.org>,
	Hao Luo <haoluo@google.com>,
	bpf@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>
Subject: Re: [PATCHSET 0/7] perf lock contention: Improve performance if map is full (v1)
Date: Thu, 6 Apr 2023 21:41:28 -0300	[thread overview]
Message-ID: <ZC9muCsvRwV3ImhI@kernel.org> (raw)
In-Reply-To: <20230406210611.1622492-1-namhyung@kernel.org>

On Thu, Apr 06, 2023 at 02:06:04PM -0700, Namhyung Kim wrote:
> Hello,
> 
> I got a report that the overhead of perf lock contention is too big in
> some cases.  It was running in the task aggregation mode (-t) at the time
> and there were lots of tasks contending with each other.
> 
> It turned out that the hash map update is the problem.  The results are
> saved in the lock_stat hash map, which is pre-allocated.  The BPF program
> never deletes data from the map, it only adds entries.  But once the map
> is full, trying to update it becomes a very heavy operation, since it has
> to check every CPU's freelist to get a new node to store the result.  We
> already know the update will fail when the map is full, so there is no
> need to attempt it in that case.
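
A minimal BPF-side sketch of the idea described above, assuming a
pre-allocated hash map keyed by task id; the map and variable names
(task_data, data_map_full, data_fail) are illustrative, not the actual
lock_contention.bpf.c change:

    /*
     * Hedged sketch only -- once an insertion fails because the
     * pre-allocated map is full, remember that and just count later
     * samples instead of retrying the expensive update path.
     */
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    #define MAX_ENTRIES 16384

    struct {
            __uint(type, BPF_MAP_TYPE_HASH);
            __uint(max_entries, MAX_ENTRIES);
            __type(key, __u32);     /* task id */
            __type(value, __u64);   /* aggregated wait time */
    } task_data SEC(".maps");

    int data_map_full;      /* set once an insertion fails */
    __u64 data_fail;        /* samples dropped because the map was full */

    static int update_task_data(__u32 tid, __u64 delta)
    {
            __u64 *val;

            if (data_map_full) {
                    /* map already known to be full: skip the costly update */
                    __sync_fetch_and_add(&data_fail, 1);
                    return -1;
            }

            val = bpf_map_lookup_elem(&task_data, &tid);
            if (val) {
                    /* existing entry: cheap in-place update */
                    __sync_fetch_and_add(val, delta);
                    return 0;
            }

            /* new entry: this is the path that gets expensive when full */
            if (bpf_map_update_elem(&task_data, &tid, &delta, BPF_NOEXIST) < 0) {
                    data_map_full = 1;
                    __sync_fetch_and_add(&data_fail, 1);
                    return -1;
            }
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";

The point is that bpf_map_update_elem() is only attempted until the first
failed insertion; after that, every sample takes the cheap counter path
instead of walking the per-CPU freelists.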

Thanks, applied.

- Arnaldo

 
> I've checked it on my 64 CPU machine with this.
> 
>     $ perf bench sched messaging -g 1000
>     # Running 'sched/messaging' benchmark:
>     # 20 sender and receiver processes per group
>     # 1000 groups == 40000 processes run
> 
>          Total time: 2.825 [sec]
> 
> And I used the task mode, so that the map is guaranteed to fill up.
> The default map size is 16K entries and this workload has 40K tasks.
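
As an aside, patch 2 in this series adds -M as a short form of
--map-nr-entries, so the map could also simply be sized to hold all 40K
tasks; the value below is only an illustration, not one of the measured
runs:

    $ sudo ./perf lock con -abt -M 65536 -E3 -- perf bench sched messaging -g 1000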
> 
> Before:
>     $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
>     # Running 'sched/messaging' benchmark:
>     # 20 sender and receiver processes per group
>     # 1000 groups == 40000 processes run
> 
>          Total time: 11.299 [sec]
>      contended   total wait     max wait     avg wait          pid   comm
> 
>          19284      3.51 s       3.70 ms    181.91 us      1305863   sched-messaging
>            243     84.09 ms    466.67 us    346.04 us      1336608   sched-messaging
>            177     66.35 ms     12.08 ms    374.88 us      1220416   node
> 
> After:
>     $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
>     # Running 'sched/messaging' benchmark:
>     # 20 sender and receiver processes per group
>     # 1000 groups == 40000 processes run
> 
>          Total time: 3.044 [sec]
>      contended   total wait     max wait     avg wait          pid   comm
> 
>          18743    591.92 ms    442.96 us     31.58 us      1431454   sched-messaging
>             51    210.64 ms    207.45 ms      4.13 ms      1468724   sched-messaging
>             81     68.61 ms     65.79 ms    847.07 us      1463183   sched-messaging
> 
>     === output for debug ===
> 
>     bad: 1164137, total: 2253341
>     bad rate: 51.66 %
>     histogram of failure reasons
>            task: 0
>           stack: 0
>            time: 0
>            data: 1164137
> 
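The four reasons above come from per-reason failure counters kept in the
BPF program; "data" counts samples dropped because the data map was
already full.  A rough userspace-side sketch of how such counters could be
read through a libbpf skeleton (the skeleton header and field names here
are assumptions, not the actual code in bpf_lock_contention.c):

    #include <stdio.h>
    #include "lock_contention.skel.h"   /* assumed libbpf skeleton header */

    static void print_failure_histogram(struct lock_contention_bpf *skel)
    {
            /* global counters from the BPF object's .bss; names are illustrative */
            printf("histogram of failure reasons\n");
            printf("       task: %llu\n", (unsigned long long)skel->bss->task_fail);
            printf("      stack: %llu\n", (unsigned long long)skel->bss->stack_fail);
            printf("       time: %llu\n", (unsigned long long)skel->bss->time_fail);
            printf("       data: %llu\n", (unsigned long long)skel->bss->data_fail);
    }
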
> The first few patches are small cleanups and fixes.  You can get the code
> from 'perf/lock-map-v1' branch in
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> 
> Thanks,
> Namhyung
> 
> Namhyung Kim (7):
>   perf lock contention: Simplify parse_lock_type()
>   perf lock contention: Use -M for --map-nr-entries
>   perf lock contention: Update default map size to 16384
>   perf lock contention: Add data failure stat
>   perf lock contention: Update total/bad stats for hidden entries
>   perf lock contention: Revise needs_callstack() condition
>   perf lock contention: Do not try to update if hash map is full
> 
>  tools/perf/Documentation/perf-lock.txt        |  4 +-
>  tools/perf/builtin-lock.c                     | 64 ++++++++-----------
>  tools/perf/util/bpf_lock_contention.c         |  7 +-
>  .../perf/util/bpf_skel/lock_contention.bpf.c  | 29 +++++++--
>  tools/perf/util/bpf_skel/lock_data.h          |  3 +
>  tools/perf/util/lock-contention.h             |  2 +
>  6 files changed, 60 insertions(+), 49 deletions(-)
> 
> 
> base-commit: e5116f46d44b72ede59a6923829f68a8b8f84e76
> -- 
> 2.40.0.577.gac1e443424-goog
> 

-- 

- Arnaldo

Thread overview: 10+ messages
2023-04-06 21:06 [PATCHSET 0/7] perf lock contention: Improve performance if map is full (v1) Namhyung Kim
2023-04-06 21:06 ` [PATCH 1/7] perf lock contention: Simplify parse_lock_type() Namhyung Kim
2023-04-06 21:06 ` [PATCH 2/7] perf lock contention: Use -M for --map-nr-entries Namhyung Kim
2023-04-06 21:06 ` [PATCH 3/7] perf lock contention: Update default map size to 16384 Namhyung Kim
2023-04-06 21:06 ` [PATCH 4/7] perf lock contention: Add data failure stat Namhyung Kim
2023-04-06 21:06 ` [PATCH 5/7] perf lock contention: Update total/bad stats for hidden entries Namhyung Kim
2023-04-06 21:06 ` [PATCH 6/7] perf lock contention: Revise needs_callstack() condition Namhyung Kim
2023-04-06 21:06 ` [PATCH 7/7] perf lock contention: Do not try to update if hash map is full Namhyung Kim
2023-04-07  0:35 ` [PATCHSET 0/7] perf lock contention: Improve performance if map is full (v1) Ian Rogers
2023-04-07  0:41 ` Arnaldo Carvalho de Melo [this message]
