From: Shakeel Butt <shakeel.butt@linux.dev>
To: Tejun Heo <tj@kernel.org>
Cc: "Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Roman Gushchin" <roman.gushchin@linux.dev>,
"Kuniyuki Iwashima" <kuniyu@google.com>,
"Daniel Sedlak" <daniel.sedlak@cdn77.com>,
"Meta kernel team" <kernel-team@meta.com>,
linux-mm@kvack.org, netdev@vger.kernel.org,
cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
"Jakub Kicinski" <kuba@kernel.org>
Subject: [PATCH 2/3] cgroup: add lockless fast-path checks to cgroup_file_notify()
Date: Sat, 28 Feb 2026 06:20:17 -0800
Message-ID: <20260228142018.3178529-3-shakeel.butt@linux.dev>
In-Reply-To: <20260228142018.3178529-1-shakeel.butt@linux.dev>

Add two lockless checks before acquiring the lock:

1. READ_ONCE(cfile->kn) NULL check to skip torn-down files.
2. READ_ONCE(cfile->notified_at) check to skip when within the
   rate-limit window (~10ms).

Both checks have safe error directions -- a stale read can only cause
unnecessary lock acquisition, never a missed notification. Annotate
all write sites with WRITE_ONCE() to pair with the lockless readers.

The trade-off is that trailing timer_reduce() calls during bursts are
skipped, so the deferred notification that delivers the final state
may be lost. This is acceptable for the primary callers like
__memcg_memory_event() where events keep arriving.

Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Reported-by: Jakub Kicinski <kuba@kernel.org>
---
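Note (illustration only, not part of the patch): the check-then-lock-then-recheck
pattern described in the changelog can be sketched in userspace C11 roughly as
follows. Everything here is an assumption made for the sketch: struct file_state,
file_notify(), before_eq(), MIN_INTV and the atomic_flag spinlock are hypothetical
stand-ins for struct cgroup_file, cgroup_file_notify(), time_before_eq(),
CGROUP_FILE_NOTIFY_MIN_INTV and cgroup_file_kn_lock. Relaxed atomic loads and
stores play the role of READ_ONCE()/WRITE_ONCE(), and the timer_reduce() branch
is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MIN_INTV 10UL	/* hypothetical rate-limit window, in "ticks" */

struct file_state {	/* hypothetical stand-in for struct cgroup_file */
	_Atomic(void *) kn;		/* NULL once the file is torn down */
	atomic_ulong notified_at;	/* tick of the last delivered notification */
	unsigned long notify_count;	/* only touched under kn_lock */
};

static atomic_flag kn_lock = ATOMIC_FLAG_INIT;	/* toy spinlock */

static void lock(void)
{
	while (atomic_flag_test_and_set_explicit(&kn_lock, memory_order_acquire))
		;	/* spin */
}

static void unlock(void)
{
	atomic_flag_clear_explicit(&kn_lock, memory_order_release);
}

/* Wrap-safe "a <= b" on tick counters (relies on two's-complement conversion). */
static bool before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/* Returns true if a notification was delivered at tick 'now'. */
static bool file_notify(struct file_state *f, unsigned long now)
{
	unsigned long last;
	bool delivered = false;

	assert(f != NULL);

	/* Lockless check 1: no node, nothing to notify. */
	if (!atomic_load_explicit(&f->kn, memory_order_relaxed))
		return false;

	/* Lockless check 2: still inside the rate-limit window. */
	last = atomic_load_explicit(&f->notified_at, memory_order_relaxed);
	if (before_eq(now, last + MIN_INTV))
		return false;

	/* Slow path: both conditions are re-evaluated under the lock. */
	lock();
	if (atomic_load_explicit(&f->kn, memory_order_relaxed)) {
		last = atomic_load_explicit(&f->notified_at, memory_order_relaxed);
		if (!before_eq(now, last + MIN_INTV)) {
			atomic_store_explicit(&f->notified_at, now,
					      memory_order_relaxed);
			f->notify_count++;
			delivered = true;
		}
	}
	unlock();
	return delivered;
}
```

In this sketch a caller that arrives inside the window returns before touching
the lock, which is the scalability gain the changelog is after; it is also why
no deferred delivery is armed for such callers here.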
kernel/cgroup/cgroup.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 33282c7d71e4..5473ebd0f6c1 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1749,7 +1749,7 @@ static void cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
struct cgroup_file *cfile = (void *)css + cft->file_offset;
spin_lock_irq(&cgroup_file_kn_lock);
- cfile->kn = NULL;
+ WRITE_ONCE(cfile->kn, NULL);
spin_unlock_irq(&cgroup_file_kn_lock);
timer_delete_sync(&cfile->notify_timer);
@@ -4430,7 +4430,7 @@ static int cgroup_add_file(struct cgroup_subsys_state *css, struct cgroup *cgrp,
timer_setup(&cfile->notify_timer, cgroup_file_notify_timer, 0);
spin_lock_irq(&cgroup_file_kn_lock);
- cfile->kn = kn;
+ WRITE_ONCE(cfile->kn, kn);
spin_unlock_irq(&cgroup_file_kn_lock);
}
@@ -4686,20 +4686,27 @@ int cgroup_add_legacy_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
*/
void cgroup_file_notify(struct cgroup_file *cfile)
{
- unsigned long flags;
+ unsigned long flags, last, next;
struct kernfs_node *kn = NULL;
+ if (!READ_ONCE(cfile->kn))
+ return;
+
+ last = READ_ONCE(cfile->notified_at);
+ if (time_before_eq(jiffies, last + CGROUP_FILE_NOTIFY_MIN_INTV))
+ return;
+
spin_lock_irqsave(&cgroup_file_kn_lock, flags);
if (cfile->kn) {
- unsigned long last = cfile->notified_at;
- unsigned long next = last + CGROUP_FILE_NOTIFY_MIN_INTV;
+ last = cfile->notified_at;
+ next = last + CGROUP_FILE_NOTIFY_MIN_INTV;
- if (time_in_range(jiffies, last, next)) {
+ if (time_before_eq(jiffies, next)) {
timer_reduce(&cfile->notify_timer, next);
} else {
kn = cfile->kn;
kernfs_get(kn);
- cfile->notified_at = jiffies;
+ WRITE_ONCE(cfile->notified_at, jiffies);
}
}
spin_unlock_irqrestore(&cgroup_file_kn_lock, flags);
--
2.47.3
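
Note (illustration only, not part of the patch): the last hunk replaces
time_in_range(jiffies, last, next) with time_before_eq(jiffies, next), which
drops the lower bound of the comparison. The userspace sketch below copies only
the arithmetic of the kernel's jiffies helpers (the kernel macros additionally
type-check their arguments) to show when the two forms agree and when they
differ. The *_ul names, INTV, old_check(), new_check() and
jiffies_check_demo() are hypothetical names for this sketch. Whether the
differing case can occur in practice depends on how notified_at is initialized
and how stale it can become, which the patch text does not state.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define INTV 10UL	/* hypothetical stand-in for CGROUP_FILE_NOTIFY_MIN_INTV */

/* Wrap-safe "a >= b": the signed difference decides the order. */
static bool time_after_eq_ul(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Wrap-safe "a <= b". */
static bool time_before_eq_ul(unsigned long a, unsigned long b)
{
	return time_after_eq_ul(b, a);
}

/* Wrap-safe "b <= a <= c". */
static bool time_in_range_ul(unsigned long a, unsigned long b, unsigned long c)
{
	return time_after_eq_ul(a, b) && time_before_eq_ul(a, c);
}

/* Pre-patch form: rate-limited only if last <= now <= last + INTV. */
static bool old_check(unsigned long now, unsigned long last)
{
	return time_in_range_ul(now, last, last + INTV);
}

/* Post-patch form: rate-limited if now <= last + INTV. */
static bool new_check(unsigned long now, unsigned long last)
{
	return time_before_eq_ul(now, last + INTV);
}

static void jiffies_check_demo(void)
{
	/* Inside the window: both forms rate-limit. */
	assert(old_check(105, 100) && new_check(105, 100));

	/* Just past the window: both forms let the notification through. */
	assert(!old_check(111, 100) && !new_check(111, 100));

	/* Window straddling the counter wrap: both forms still agree. */
	assert(old_check(ULONG_MAX - 1, ULONG_MAX - 3));
	assert(new_check(ULONG_MAX - 1, ULONG_MAX - 3));

	/*
	 * 'now' compares as earlier than 'last' in signed arithmetic:
	 * the ranged form does not rate-limit, the one-sided form does.
	 */
	assert(!old_check(ULONG_MAX - 100, 0));
	assert(new_check(ULONG_MAX - 100, 0));
}
```

The two forms are interchangeable as long as the current tick never compares as
earlier than the stored timestamp; the last pair of assertions is the only case
in this sketch where they diverge.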
Thread overview: 11+ messages
2026-02-28 14:20 [PATCH 0/3] cgroup: improve cgroup_file_notify() scalability Shakeel Butt
2026-02-28 14:20 ` [PATCH 1/3] cgroup: reduce cgroup_file_kn_lock hold time in cgroup_file_notify() Shakeel Butt
2026-02-28 14:20 ` Shakeel Butt [this message]
2026-03-02 1:50 ` [PATCH 2/3] cgroup: add lockless fast-path checks to cgroup_file_notify() Chen Ridong
2026-03-02 16:14 ` Shakeel Butt
2026-03-02 17:00 ` Shakeel Butt
2026-03-03 3:18 ` Chen Ridong
2026-03-03 4:01 ` Shakeel Butt
2026-03-05 7:01 ` Chen Ridong
2026-03-03 3:08 ` Chen Ridong
2026-02-28 14:20 ` [PATCH 3/3] cgroup: replace global cgroup_file_kn_lock with per-cgroup_file lock Shakeel Butt