* [PATCH v2 RESEND] perf/lock: enable end-timestamp accounting for cgroup aggregation
@ 2026-05-30 19:59 Suchit Karunakaran
2026-05-30 20:41 ` sashiko-bot
2026-06-04 13:37 ` Arnaldo Carvalho de Melo
0 siblings, 2 replies; 5+ messages in thread
From: Suchit Karunakaran @ 2026-05-30 19:59 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung
Cc: mark.rutland, alexander.shishkin, jolsa, irogers, adrian.hunter,
james.clark, tycho, suchitkarunakaran, linux-perf-users,
linux-kernel, bpf
update_lock_stat() handles lock contentions that start but never reach a
contention_end event (e.g., locks still held when profiling stops), but
previously treated LOCK_AGGR_CGROUP as a no-op due to missing cgroup
context in userspace.
Fix this by adding a cgroup_id field to struct tstamp_data, recording it
at contention_begin using get_current_cgroup_id() when aggr_mode is
LOCK_AGGR_CGROUP. Capturing it at contention_begin is semantically
correct, the contention cost is incurred by the task that had to wait,
not by whatever task happens to be running at contention_end. It is also
preferable from a performance standpoint, as contention_end runs just
before the task enters the critical section.
Update contention_end to use pelem->cgroup_id instead of calling
get_current_cgroup_id() dynamically, ensuring both complete and
incomplete contention events attribute the wait time to the cgroup at
wait-start time consistently.
Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com>
---
tools/perf/util/bpf_lock_contention.c | 4 ++--
tools/perf/util/bpf_skel/lock_contention.bpf.c | 4 +++-
tools/perf/util/bpf_skel/lock_data.h | 1 +
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/bpf_lock_contention.c b/tools/perf/util/bpf_lock_contention.c
index cbd7435579fe..1a5bd2ff8ee4 100644
--- a/tools/perf/util/bpf_lock_contention.c
+++ b/tools/perf/util/bpf_lock_contention.c
@@ -463,8 +463,8 @@ static void update_lock_stat(int map_fd, int pid, u64 end_ts,
stat_key.lock_addr_or_cgroup = ts_data->lock;
break;
case LOCK_AGGR_CGROUP:
- /* TODO */
- return;
+ stat_key.lock_addr_or_cgroup = ts_data->cgroup_id;
+ break;
default:
return;
}
diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
index 96e7d853b9ed..139199811020 100644
--- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
+++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
@@ -536,6 +536,8 @@ int contention_begin(u64 *ctx)
pelem->timestamp = bpf_ktime_get_ns();
pelem->lock = (__u64)ctx[0];
pelem->flags = (__u32)ctx[1];
+ if (aggr_mode == LOCK_AGGR_CGROUP)
+ pelem->cgroup_id = get_current_cgroup_id();
if (needs_callstack) {
u32 i = 0;
@@ -771,7 +773,7 @@ int contention_end(u64 *ctx)
key.stack_id = pelem->stack_id;
break;
case LOCK_AGGR_CGROUP:
- key.lock_addr_or_cgroup = get_current_cgroup_id();
+ key.lock_addr_or_cgroup = pelem->cgroup_id;
break;
default:
/* should not happen */
diff --git a/tools/perf/util/bpf_skel/lock_data.h b/tools/perf/util/bpf_skel/lock_data.h
index 28c5e5aced7f..652e114e6b87 100644
--- a/tools/perf/util/bpf_skel/lock_data.h
+++ b/tools/perf/util/bpf_skel/lock_data.h
@@ -13,6 +13,7 @@ struct owner_tracing_data {
struct tstamp_data {
u64 timestamp;
u64 lock;
+ u64 cgroup_id;
u32 flags;
s32 stack_id;
};
--
2.54.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH v2 RESEND] perf/lock: enable end-timestamp accounting for cgroup aggregation
2026-05-30 19:59 [PATCH v2 RESEND] perf/lock: enable end-timestamp accounting for cgroup aggregation Suchit Karunakaran
@ 2026-05-30 20:41 ` sashiko-bot
2026-06-04 13:37 ` Arnaldo Carvalho de Melo
1 sibling, 0 replies; 5+ messages in thread
From: sashiko-bot @ 2026-05-30 20:41 UTC (permalink / raw)
To: Suchit Karunakaran; +Cc: linux-perf-users, bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Incomplete lock contentions for per-CPU locks (spinlocks/rwlocks) are silently dropped when using cgroup aggregation due to an unmodified early return in `account_end_timestamp()`.
--
commit 9be2409410efd2d4d44f23abbecf590bbbc25d34
Author: Suchit Karunakaran <suchitkarunakaran@gmail.com>
perf/lock: enable end-timestamp accounting for cgroup aggregation
This commit enables end-timestamp accounting for cgroup aggregation by
tracking the cgroup_id in the BPF program during contention_begin. It
updates update_lock_stat() in userspace and contention_end() in BPF to
use this recorded cgroup context instead of fetching it dynamically.
> diff --git a/tools/perf/util/bpf_lock_contention.c b/tools/perf/util/bpf_lock_contention.c
> index cbd7435579fea..1a5bd2ff8ee4c 100644
> --- a/tools/perf/util/bpf_lock_contention.c
> +++ b/tools/perf/util/bpf_lock_contention.c
> @@ -463,8 +463,8 @@ static void update_lock_stat(int map_fd, int pid, u64 end_ts,
> stat_key.lock_addr_or_cgroup = ts_data->lock;
> break;
> case LOCK_AGGR_CGROUP:
> - /* TODO */
> - return;
> + stat_key.lock_addr_or_cgroup = ts_data->cgroup_id;
> + break;
> default:
> return;
> }
[Severity: Medium]
Since this patch enables end-timestamp accounting for cgroup aggregation,
should the early return for LOCK_AGGR_CGROUP in account_end_timestamp()
also be removed?
tools/perf/util/bpf_lock_contention.c:account_end_timestamp() {
...
/* Now it'll check per-cpu tstamp map which doesn't have TID. */
if (aggr_mode == LOCK_AGGR_TASK || aggr_mode == LOCK_AGGR_CGROUP)
return;
...
}
Without removing this check, incomplete lock contentions for per-CPU
locks (like spinlocks and rwlocks) appear to be silently skipped when
using cgroup aggregation.
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260530195940.78700-1-suchitkarunakaran@gmail.com?part=1
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH v2 RESEND] perf/lock: enable end-timestamp accounting for cgroup aggregation
2026-05-30 19:59 [PATCH v2 RESEND] perf/lock: enable end-timestamp accounting for cgroup aggregation Suchit Karunakaran
2026-05-30 20:41 ` sashiko-bot
@ 2026-06-04 13:37 ` Arnaldo Carvalho de Melo
2026-06-04 16:31 ` Namhyung Kim
1 sibling, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-04 13:37 UTC (permalink / raw)
To: Namhyung Kim, Suchit Karunakaran
Cc: Peter Zijlstra, Ingo Molnar, mark.rutland, alexander.shishkin,
Jiri Olsa, Ian Rogers, Adrian Hunter, james.clark, tycho,
linux-perf-users, linux-kernel, bpf
On Sun, May 31, 2026 at 01:29:40AM +0530, Suchit Karunakaran wrote:
> update_lock_stat() handles lock contentions that start but never reach a
> contention_end event (e.g., locks still held when profiling stops), but
> previously treated LOCK_AGGR_CGROUP as a no-op due to missing cgroup
> context in userspace.
>
> Fix this by adding a cgroup_id field to struct tstamp_data, recording it
> at contention_begin using get_current_cgroup_id() when aggr_mode is
> LOCK_AGGR_CGROUP. Capturing it at contention_begin is semantically
> correct, the contention cost is incurred by the task that had to wait,
> not by whatever task happens to be running at contention_end. It is also
> preferable from a performance standpoint, as contention_end runs just
> before the task enters the critical section.
>
> Update contention_end to use pelem->cgroup_id instead of calling
> get_current_cgroup_id() dynamically, ensuring both complete and
> incomplete contention events attribute the wait time to the cgroup at
> wait-start time consistently.
Namhyung, can you provide an Acked-by or Reviewed-by?
Thanks,
- Arnaldo
> Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com>
> ---
> tools/perf/util/bpf_lock_contention.c | 4 ++--
> tools/perf/util/bpf_skel/lock_contention.bpf.c | 4 +++-
> tools/perf/util/bpf_skel/lock_data.h | 1 +
> 3 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/bpf_lock_contention.c b/tools/perf/util/bpf_lock_contention.c
> index cbd7435579fe..1a5bd2ff8ee4 100644
> --- a/tools/perf/util/bpf_lock_contention.c
> +++ b/tools/perf/util/bpf_lock_contention.c
> @@ -463,8 +463,8 @@ static void update_lock_stat(int map_fd, int pid, u64 end_ts,
> stat_key.lock_addr_or_cgroup = ts_data->lock;
> break;
> case LOCK_AGGR_CGROUP:
> - /* TODO */
> - return;
> + stat_key.lock_addr_or_cgroup = ts_data->cgroup_id;
> + break;
> default:
> return;
> }
> diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> index 96e7d853b9ed..139199811020 100644
> --- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
> +++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> @@ -536,6 +536,8 @@ int contention_begin(u64 *ctx)
> pelem->timestamp = bpf_ktime_get_ns();
> pelem->lock = (__u64)ctx[0];
> pelem->flags = (__u32)ctx[1];
> + if (aggr_mode == LOCK_AGGR_CGROUP)
> + pelem->cgroup_id = get_current_cgroup_id();
>
> if (needs_callstack) {
> u32 i = 0;
> @@ -771,7 +773,7 @@ int contention_end(u64 *ctx)
> key.stack_id = pelem->stack_id;
> break;
> case LOCK_AGGR_CGROUP:
> - key.lock_addr_or_cgroup = get_current_cgroup_id();
> + key.lock_addr_or_cgroup = pelem->cgroup_id;
> break;
> default:
> /* should not happen */
> diff --git a/tools/perf/util/bpf_skel/lock_data.h b/tools/perf/util/bpf_skel/lock_data.h
> index 28c5e5aced7f..652e114e6b87 100644
> --- a/tools/perf/util/bpf_skel/lock_data.h
> +++ b/tools/perf/util/bpf_skel/lock_data.h
> @@ -13,6 +13,7 @@ struct owner_tracing_data {
> struct tstamp_data {
> u64 timestamp;
> u64 lock;
> + u64 cgroup_id;
> u32 flags;
> s32 stack_id;
> };
> --
> 2.54.0
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH v2 RESEND] perf/lock: enable end-timestamp accounting for cgroup aggregation
2026-06-04 13:37 ` Arnaldo Carvalho de Melo
@ 2026-06-04 16:31 ` Namhyung Kim
2026-06-04 20:29 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 5+ messages in thread
From: Namhyung Kim @ 2026-06-04 16:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Suchit Karunakaran, Peter Zijlstra, Ingo Molnar, mark.rutland,
alexander.shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
james.clark, tycho, linux-perf-users, linux-kernel, bpf
On Thu, Jun 04, 2026 at 10:37:05AM -0300, Arnaldo Carvalho de Melo wrote:
> On Sun, May 31, 2026 at 01:29:40AM +0530, Suchit Karunakaran wrote:
> > update_lock_stat() handles lock contentions that start but never reach a
> > contention_end event (e.g., locks still held when profiling stops), but
> > previously treated LOCK_AGGR_CGROUP as a no-op due to missing cgroup
> > context in userspace.
> >
> > Fix this by adding a cgroup_id field to struct tstamp_data, recording it
> > at contention_begin using get_current_cgroup_id() when aggr_mode is
> > LOCK_AGGR_CGROUP. Capturing it at contention_begin is semantically
> > correct, the contention cost is incurred by the task that had to wait,
> > not by whatever task happens to be running at contention_end. It is also
> > preferable from a performance standpoint, as contention_end runs just
> > before the task enters the critical section.
> >
> > Update contention_end to use pelem->cgroup_id instead of calling
> > get_current_cgroup_id() dynamically, ensuring both complete and
> > incomplete contention events attribute the wait time to the cgroup at
> > wait-start time consistently.
>
> Namhyung, can you provide an Acked-by or Reviewed-by?
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2 RESEND] perf/lock: enable end-timestamp accounting for cgroup aggregation
2026-06-04 16:31 ` Namhyung Kim
@ 2026-06-04 20:29 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-04 20:29 UTC (permalink / raw)
To: Namhyung Kim
Cc: Suchit Karunakaran, Peter Zijlstra, Ingo Molnar, mark.rutland,
alexander.shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
james.clark, tycho, linux-perf-users, linux-kernel, bpf
On Thu, Jun 04, 2026 at 09:31:15AM -0700, Namhyung Kim wrote:
> On Thu, Jun 04, 2026 at 10:37:05AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Sun, May 31, 2026 at 01:29:40AM +0530, Suchit Karunakaran wrote:
> > > update_lock_stat() handles lock contentions that start but never reach a
> > > contention_end event (e.g., locks still held when profiling stops), but
> > > previously treated LOCK_AGGR_CGROUP as a no-op due to missing cgroup
> > > context in userspace.
> > >
> > > Fix this by adding a cgroup_id field to struct tstamp_data, recording it
> > > at contention_begin using get_current_cgroup_id() when aggr_mode is
> > > LOCK_AGGR_CGROUP. Capturing it at contention_begin is semantically
> > > correct, the contention cost is incurred by the task that had to wait,
> > > not by whatever task happens to be running at contention_end. It is also
> > > preferable from a performance standpoint, as contention_end runs just
> > > before the task enters the critical section.
> > >
> > > Update contention_end to use pelem->cgroup_id instead of calling
> > > get_current_cgroup_id() dynamically, ensuring both complete and
> > > incomplete contention events attribute the wait time to the cgroup at
> > > wait-start time consistently.
> >
> > Namhyung, can you provide an Acked-by or Reviewed-by?
>
> Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Thanks, applied to perf-tools-next, for v7.2.
- Arnaldo
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-06-04 20:29 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-30 19:59 [PATCH v2 RESEND] perf/lock: enable end-timestamp accounting for cgroup aggregation Suchit Karunakaran
2026-05-30 20:41 ` sashiko-bot
2026-06-04 13:37 ` Arnaldo Carvalho de Melo
2026-06-04 16:31 ` Namhyung Kim
2026-06-04 20:29 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox