public inbox for linux-kernel@vger.kernel.org
* [PATCH] cgroup: add cpu.stat.percpu for per-CPU cgroup stats
From: Willy Barro Raffel @ 2026-04-07  1:06 UTC
  To: Tejun Heo, Johannes Weiner, Michal Koutný, cgroups,
	linux-kernel, Willy Barro Raffel
  Cc: Justinien Bouron, Gunnar Kudrjavets

Expose per-CPU subtree_bstat via a new cgroupfs file, cpu.stat.percpu.
Each line shows one CPU's cumulative stats in io.stat-style key=value
format:

  cpu0 usage_usec=123 user_usec=45 system_usec=78 nice_usec=0
  cpu1 usage_usec=456 user_usec=123 system_usec=333 nice_usec=0

This completes the interface left as a TODO in commit 7716f383a583
("Merge tag 'cgroup-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup")
which added per-CPU subtree_bstat but only exposed it via BPF/drgn.

Signed-off-by: Willy Barro Raffel <willybar@amazon.com>
Reviewed-by: Justinien Bouron <jbouron@amazon.com>
Reviewed-by: Gunnar Kudrjavets <gunnarku@amazon.com>
---
 kernel/cgroup/cgroup-internal.h |  1 +
 kernel/cgroup/cgroup.c          | 10 +++++++++
 kernel/cgroup/rstat.c           | 36 +++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index 3bfe37693d68..28aff03975f2 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -277,6 +277,7 @@ int css_rstat_init(struct cgroup_subsys_state *css);
 void css_rstat_exit(struct cgroup_subsys_state *css);
 int ss_rstat_init(struct cgroup_subsys *ss);
 void cgroup_base_stat_cputime_show(struct seq_file *seq);
+void cgroup_base_stat_cputime_show_percpu(struct seq_file *seq);
 
 /*
  * namespace.c
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index be1d71dda317..652fae15d7c5 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3968,6 +3968,12 @@ static int cpu_local_stat_show(struct seq_file *seq, void *v)
 	return ret;
 }
 
+static int cpu_percpu_stat_show(struct seq_file *seq, void *v)
+{
+	cgroup_base_stat_cputime_show_percpu(seq);
+	return 0;
+}
+
 #ifdef CONFIG_PSI
 static int cgroup_io_pressure_show(struct seq_file *seq, void *v)
 {
@@ -5499,6 +5505,10 @@ static struct cftype cgroup_base_files[] = {
 		.name = "cpu.stat.local",
 		.seq_show = cpu_local_stat_show,
 	},
+	{
+		.name = "cpu.stat.percpu",
+		.seq_show = cpu_percpu_stat_show,
+	},
 	{ }	/* terminate */
 };
 
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index 150e5871e66f..f1aaed87180c 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -743,6 +743,42 @@ void cgroup_base_stat_cputime_show(struct seq_file *seq)
 	cgroup_force_idle_show(seq, &bstat);
 }
 
+
+void cgroup_base_stat_cputime_show_percpu(struct seq_file *seq)
+{
+	struct cgroup *cgrp = seq_css(seq)->cgroup;
+	int cpu;
+
+	css_rstat_flush(&cgrp->self);
+
+	for_each_possible_cpu(cpu) {
+		struct cgroup_rstat_base_cpu *rstatbc;
+		struct cgroup_base_stat bstat;
+		unsigned int seq_cnt;
+
+		/* Reacquire for each CPU to avoid disabling IRQs too long */
+		__css_rstat_lock(&cgrp->self, cpu);
+		rstatbc = cgroup_rstat_base_cpu(cgrp, cpu);
+		do {
+			seq_cnt = __u64_stats_fetch_begin(&rstatbc->bsync);
+			bstat = rstatbc->subtree_bstat;
+		} while (__u64_stats_fetch_retry(&rstatbc->bsync, seq_cnt));
+		__css_rstat_unlock(&cgrp->self, cpu);
+
+		do_div(bstat.cputime.sum_exec_runtime, NSEC_PER_USEC);
+		do_div(bstat.cputime.utime, NSEC_PER_USEC);
+		do_div(bstat.cputime.stime, NSEC_PER_USEC);
+		do_div(bstat.ntime, NSEC_PER_USEC);
+
+		seq_printf(seq, "cpu%d usage_usec=%llu user_usec=%llu system_usec=%llu nice_usec=%llu\n",
+			   cpu,
+			   bstat.cputime.sum_exec_runtime,
+			   bstat.cputime.utime,
+			   bstat.cputime.stime,
+			   bstat.ntime);
+	}
+}
+
 /* Add bpf kfuncs for css_rstat_updated() and css_rstat_flush() */
 BTF_KFUNCS_START(bpf_rstat_kfunc_ids)
 BTF_ID_FLAGS(func, css_rstat_updated)
-- 
2.50.1 (Apple Git-155)
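
For readers who want to consume the proposed format from userspace, here is
a minimal sketch of a parser for one output line. It is not part of the
patch. The struct percpu_cputime type and the parse_percpu_line() helper are
hypothetical names introduced for illustration, and the sketch assumes the
four keys appear in exactly the order shown in the commit message.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace record. The field names mirror the keys in the
 * proposed cpu.stat.percpu output; this is not an existing kernel or libc
 * structure.
 */
struct percpu_cputime {
	int cpu;
	uint64_t usage_usec;
	uint64_t user_usec;
	uint64_t system_usec;
	uint64_t nice_usec;
};

/*
 * Parse one line such as
 *   cpu1 usage_usec=456 user_usec=123 system_usec=333 nice_usec=0
 * Assumes the fixed key order from the commit message.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_percpu_line(const char *line, struct percpu_cputime *out)
{
	int n = sscanf(line,
		       "cpu%d usage_usec=%" SCNu64 " user_usec=%" SCNu64
		       " system_usec=%" SCNu64 " nice_usec=%" SCNu64,
		       &out->cpu, &out->usage_usec, &out->user_usec,
		       &out->system_usec, &out->nice_usec);

	return n == 5 ? 0 : -1;
}
```

A parser meant to survive new keys being appended would match keys by name
instead of relying on their position; the fixed-order sscanf() above is only
the shortest way to show the line layout.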



* Re: [PATCH] cgroup: add cpu.stat.percpu for per-CPU cgroup stats
From: Tejun Heo @ 2026-04-07 18:27 UTC
  To: Willy Barro Raffel
  Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel,
	Justinien Bouron, Gunnar Kudrjavets

On Mon, Apr 06, 2026 at 06:06:43PM -0700, Willy Barro Raffel wrote:
> Expose per-CPU subtree_bstat via a new cgroupfs file cpu.stat.percpu.
> Each line shows one CPU cumulative stats in io.stat-style key=value
> format:
> 
>   cpu0 usage_usec=123 user_usec=45 system_usec=78 nice_usec=0
>   cpu1 usage_usec=456 user_usec=123 system_usec=333 nice_usec=0
> 
> This completes the interface left as a TODO in commit 7716f383a583
> ("Merge tag 'cgroup-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup")
> which added per-CPU subtree_bstat but only exposed it via BPF/drgn.

Given how quickly cpu count is increasing with 1k CPUs on common prod
machines not too far off, I'm not sure naively formatting output for every
possible CPU is desirable.

Thanks.

-- 
tejun
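
To put a rough number on the size concern, the sketch below computes how
many bytes the proposed format would need for a given CPU count. It is a
userspace illustration only: percpu_line_len() and percpu_output_len() are
hypothetical helpers, and setting every counter to the same value is an
assumption chosen to probe the best and worst cases, not a realistic
workload.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bytes needed for one output line when all four counters hold 'v'. */
static size_t percpu_line_len(int cpu, uint64_t v)
{
	/* snprintf() with a NULL buffer and size 0 only reports the length. */
	int n = snprintf(NULL, 0,
			 "cpu%d usage_usec=%" PRIu64 " user_usec=%" PRIu64
			 " system_usec=%" PRIu64 " nice_usec=%" PRIu64 "\n",
			 cpu, v, v, v, v);

	return n < 0 ? 0 : (size_t)n;
}

/* Total bytes for nr_cpus lines, one per CPU, numbered from 0. */
static size_t percpu_output_len(int nr_cpus, uint64_t v)
{
	size_t total = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++)
		total += percpu_line_len(cpu, v);

	return total;
}
```

Calling percpu_output_len(1024, UINT64_MAX) gives an upper bound for a
1024-CPU machine, and percpu_output_len(1024, 0) gives the lower bound when
every CPU is printed with all-zero counters.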


* Re: [PATCH] cgroup: add cpu.stat.percpu for per-CPU cgroup stats
From: Barro Raffel, Willy @ 2026-04-07 20:24 UTC
  To: Tejun Heo
  Cc: Johannes Weiner, Michal Koutný, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Bouron, Justinien,
	Kudrjavets, Gunnar

On Tue, Apr 07, 2026 at 08:27:41AM -1000, Tejun Heo wrote:
>On Mon, Apr 06, 2026 at 06:06:43PM -0700, Willy Barro Raffel wrote:
>> Expose per-CPU subtree_bstat via a new cgroupfs file cpu.stat.percpu.
>> Each line shows one CPU cumulative stats in io.stat-style key=value
>> format:
>>
>>   cpu0 usage_usec=123 user_usec=45 system_usec=78 nice_usec=0
>>   cpu1 usage_usec=456 user_usec=123 system_usec=333 nice_usec=0
>>
>> This completes the interface left as a TODO in commit 7716f383a583
>> ("Merge tag 'cgroup-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup")
>> which added per-CPU subtree_bstat but only exposed it via BPF/drgn.
>
>Given how quickly cpu count is increasing with 1k CPUs on common prod
>machines not too far off, I'm not sure naively formatting output for every
>possible CPU is desirable.
>
>Thanks.
>
>--
>tejun

Good point. I can skip CPUs whose stats are all zero, e.g. a cgroup that has
run on only 4 of 1024 CPUs would produce just 4 lines. Would that address
your concern?
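
A minimal userspace sketch of the skip-zero idea follows, to make the
proposal concrete. It is not the kernel change itself: struct percpu_sample,
sample_is_zero() and format_nonzero_cpus() are hypothetical names, and the
sketch writes into a caller-supplied buffer where the kernel code would use
seq_printf().

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one CPU's counters, already in microseconds. */
struct percpu_sample {
	uint64_t usage_usec;
	uint64_t user_usec;
	uint64_t system_usec;
	uint64_t nice_usec;
};

/* True when the cgroup has accumulated no time at all on this CPU. */
static bool sample_is_zero(const struct percpu_sample *s)
{
	return !s->usage_usec && !s->user_usec &&
	       !s->system_usec && !s->nice_usec;
}

/*
 * Format one line per CPU with non-zero counters into buf.
 * Returns the number of lines written; stops early if buf fills up.
 */
static int format_nonzero_cpus(const struct percpu_sample *samples,
			       int nr_cpus, char *buf, size_t len)
{
	size_t off = 0;
	int lines = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		const struct percpu_sample *s = &samples[cpu];
		int n;

		if (sample_is_zero(s))
			continue;	/* the proposed optimisation */

		n = snprintf(buf + off, len - off,
			     "cpu%d usage_usec=%" PRIu64 " user_usec=%" PRIu64
			     " system_usec=%" PRIu64 " nice_usec=%" PRIu64 "\n",
			     cpu, s->usage_usec, s->user_usec,
			     s->system_usec, s->nice_usec);
		if (n < 0 || (size_t)n >= len - off) {
			buf[off] = '\0';	/* drop the partial line */
			break;
		}
		off += (size_t)n;
		lines++;
	}

	return lines;
}
```

With this layout a reader has to treat a missing cpuN line as all-zero
counters, since absent and idle CPUs are no longer distinguishable in the
output.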
