From: Tiffany Yang <ynaffit@google.com>
To: linux-kernel@vger.kernel.org
Cc: "John Stultz" <jstultz@google.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Stephen Boyd" <sboyd@kernel.org>,
"Anna-Maria Behnsen" <anna-maria@linutronix.de>,
"Frederic Weisbecker" <frederic@kernel.org>,
"Tejun Heo" <tj@kernel.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
"Pavel Machek" <pavel@kernel.org>,
"Roman Gushchin" <roman.gushchin@linux.dev>,
"Chen Ridong" <chenridong@huawei.com>,
kernel-team@android.com, "Jonathan Corbet" <corbet@lwn.net>,
cgroups@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [RFC PATCH v2] cgroup: Track time in cgroup v2 freezer
Date: Sun, 13 Jul 2025 22:00:09 -0700
Message-ID: <20250714050008.2167786-2-ynaffit@google.com>

The cgroup v2 freezer controller lets userspace dynamically move
processes into and out of an interruptible frozen state. This feature
is helpful for application management, as a background app can be
frozen to prevent its threads from being scheduled or otherwise
contending with foreground tasks for resources. However, the
application is usually unaware that it was frozen, which can disrupt
any internal monitoring that it performs.

As an example, an application may implement a watchdog thread for one of
its high priority maintenance tasks that operates by checking some state
of that task at a set interval to ensure it has made progress. The key
challenge here is that the task is only expected to make progress when
the application it belongs to has the opportunity to run, but there's no
application-relative time to set the watchdog timer against. Instead,
the next timeout is set relative to system time, using an approximation
that assumes the application will continue to be scheduled as
normal. If the task misses that approximate deadline because the
application was frozen, the watchdog has no way to know that and may
kill a healthy process.

Other sources of delay can cause similar issues, but this change focuses
on allowing frozen time to be accounted for in particular because of how
large it can grow and how unevenly it can affect applications running on
the system. To allow an application to better account for the time it
spends running, I propose tracking the time each cgroup spends freezing
and exposing it to userland via a new core interface file in
cgroupfs (cgroup.freeze.stat). I used this naming because utility
controllers like "kill" and "freeze" are exposed as cgroup v2 core
interface files, but I'm happy to change it if there's a convention
others would prefer!
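
For illustration, a watchdog might consume the proposed file roughly as
follows. This is only a sketch under stated assumptions:
parse_freeze_time_total_ns() and runnable_elapsed_ns() are hypothetical
helper names, the caller is assumed to have already read
cgroup.freeze.stat into a NUL-terminated buffer, and the wall-clock
samples are assumed to come from CLOCK_MONOTONIC, since the patch
accumulates time with ktime_get_ns().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse the contents of a flat-keyed file such as the proposed
 * cgroup.freeze.stat and extract freeze_time_total_ns.
 * Returns 0 on success, -1 if the key is missing or malformed.
 */
static int parse_freeze_time_total_ns(const char *buf, uint64_t *out)
{
	static const char key[] = "freeze_time_total_ns ";
	const char *line = buf;

	while (line && *line) {
		if (strncmp(line, key, sizeof(key) - 1) == 0)
			return sscanf(line + sizeof(key) - 1,
				      "%" SCNu64, out) == 1 ? 0 : -1;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Hypothetical watchdog helper: the time the application had a chance
 * to run between two samples, i.e. the wall-clock delta minus the
 * delta in freeze_time_total_ns over the same window.
 */
static uint64_t runnable_elapsed_ns(uint64_t wall_start, uint64_t wall_now,
				    uint64_t frozen_start, uint64_t frozen_now)
{
	uint64_t wall = wall_now - wall_start;
	uint64_t frozen = frozen_now - frozen_start;

	/* The two clocks are sampled separately, so clamp at zero. */
	return frozen >= wall ? 0 : wall - frozen;
}
```

A watchdog could sample both values when it arms its timer and again at
expiry, and only treat the task as stuck if runnable_elapsed_ns()
exceeds the timeout.
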
Currently, the cgroup css_set_lock is used to serialize accesses to the
CGRP_FREEZE bit of cgrp->flags and the new cgroup_freezer_state counters
(freeze_time_start_ns and freeze_time_total_ns). If we start to see
higher contention on this lock, we may want to introduce a v2 freezer
state-specific lock to avoid having to take the global lock every time
a cgroup.freeze.stat file is read.
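
The accounting itself can be modeled in userspace as below. This is a
simplified sketch, not the kernel code: struct freeze_acct and the
acct_*() helpers are hypothetical names, now_ns stands in for
ktime_get_ns(), and the css_set_lock serialization is left out because
the model is single-threaded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the state this patch tracks per cgroup. */
struct freeze_acct {
	bool freezing;		/* models the CGRP_FREEZE bit */
	uint64_t start_ns;	/* models freeze_time_start_ns */
	uint64_t total_ns;	/* models freeze_time_total_ns */
};

/* Models the accounting added to cgroup_do_freeze(). */
static void acct_set_freeze(struct freeze_acct *a, bool freeze,
			    uint64_t now_ns)
{
	if (freeze) {
		a->freezing = true;
		a->start_ns = now_ns;
	} else {
		a->freezing = false;
		/* Fold the interval that just ended into the total. */
		a->total_ns += now_ns - a->start_ns;
	}
}

/* Models cgroup_freeze_stat_show(): include any in-progress interval. */
static uint64_t acct_read(const struct freeze_acct *a, uint64_t now_ns)
{
	uint64_t t = a->total_ns;

	if (a->freezing)
		t += now_ns - a->start_ns;
	return t;
}
```

A read while the cgroup is still freezing reports the completed
intervals plus the time elapsed since the current freeze request, so
the value keeps growing until the cgroup is unfrozen.
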
Any feedback would be much appreciated!
Thank you,
Tiffany
Signed-off-by: Tiffany Yang <ynaffit@google.com>
---
v2:
* Track per-cgroup freezing time instead of per-task frozen time as
suggested by Tejun Heo
Cc: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Machek <pavel@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Chen Ridong <chenridong@huawei.com>
---
Documentation/admin-guide/cgroup-v2.rst | 8 ++++++++
include/linux/cgroup-defs.h | 6 ++++++
kernel/cgroup/cgroup.c | 24 ++++++++++++++++++++++++
kernel/cgroup/freezer.c | 8 ++++++--
4 files changed, 44 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index bd98ea3175ec..9fbf3a959bdf 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1018,6 +1018,14 @@ All cgroup core files are prefixed with "cgroup."
it's possible to delete a frozen (and empty) cgroup, as well as
create new sub-cgroups.
+ cgroup.freeze.stat
+ A read-only flat-keyed file which exists in non-root cgroups.
+ The following entry is defined:
+
+ freeze_time_total_ns
+ Cumulative time that this cgroup has spent in the freezing
+ state, regardless of whether or not it reaches "frozen".
+
cgroup.kill
A write-only single value file which exists in non-root cgroups.
The only allowed value is "1".
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index e61687d5e496..86332d83fa22 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -436,6 +436,12 @@ struct cgroup_freezer_state {
* frozen, SIGSTOPped, and PTRACEd.
*/
int nr_frozen_tasks;
+
+ /* Time when the cgroup was requested to freeze */
+ u64 freeze_time_start_ns;
+
+ /* Total duration the cgroup has spent freezing */
+ u64 freeze_time_total_ns;
};
struct cgroup {
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index a723b7dc6e4e..1f54d16a8713 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4050,6 +4050,23 @@ static ssize_t cgroup_freeze_write(struct kernfs_open_file *of,
return nbytes;
}
+static int cgroup_freeze_stat_show(struct seq_file *seq, void *v)
+{
+ struct cgroup *cgrp = seq_css(seq)->cgroup;
+ u64 freeze_time = 0;
+
+ spin_lock_irq(&css_set_lock);
+ if (test_bit(CGRP_FREEZE, &cgrp->flags))
+ freeze_time = ktime_get_ns() - cgrp->freezer.freeze_time_start_ns;
+
+ freeze_time += cgrp->freezer.freeze_time_total_ns;
+ spin_unlock_irq(&css_set_lock);
+
+ seq_printf(seq, "freeze_time_total_ns %llu\n", freeze_time);
+
+ return 0;
+}
+
static void __cgroup_kill(struct cgroup *cgrp)
{
struct css_task_iter it;
@@ -5355,6 +5372,11 @@ static struct cftype cgroup_base_files[] = {
.seq_show = cgroup_freeze_show,
.write = cgroup_freeze_write,
},
+ {
+ .name = "cgroup.freeze.stat",
+ .flags = CFTYPE_NOT_ON_ROOT,
+ .seq_show = cgroup_freeze_stat_show,
+ },
{
.name = "cgroup.kill",
.flags = CFTYPE_NOT_ON_ROOT,
@@ -5758,6 +5780,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
* if the parent has to be frozen, the child has too.
*/
cgrp->freezer.e_freeze = parent->freezer.e_freeze;
+ cgrp->freezer.freeze_time_total_ns = 0;
if (cgrp->freezer.e_freeze) {
/*
* Set the CGRP_FREEZE flag, so when a process will be
@@ -5766,6 +5789,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
* consider it frozen immediately.
*/
set_bit(CGRP_FREEZE, &cgrp->flags);
+ cgrp->freezer.freeze_time_start_ns = ktime_get_ns();
set_bit(CGRP_FROZEN, &cgrp->flags);
}
diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
index bf1690a167dd..6f3fab252140 100644
--- a/kernel/cgroup/freezer.c
+++ b/kernel/cgroup/freezer.c
@@ -179,10 +179,14 @@ static void cgroup_do_freeze(struct cgroup *cgrp, bool freeze)
lockdep_assert_held(&cgroup_mutex);
spin_lock_irq(&css_set_lock);
- if (freeze)
+ if (freeze) {
set_bit(CGRP_FREEZE, &cgrp->flags);
- else
+ cgrp->freezer.freeze_time_start_ns = ktime_get_ns();
+ } else {
clear_bit(CGRP_FREEZE, &cgrp->flags);
+ cgrp->freezer.freeze_time_total_ns += (ktime_get_ns() -
+ cgrp->freezer.freeze_time_start_ns);
+ }
spin_unlock_irq(&css_set_lock);
if (freeze)
--
2.50.0.727.gbf7dc18ff4-goog