All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shakeel Butt <shakeel.butt@linux.dev>
To: Tejun Heo <tj@kernel.org>
Cc: "Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Chen Ridong" <chenridong@huaweicloud.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Meta kernel team" <kernel-team@meta.com>,
	linux-mm@kvack.org, netdev@vger.kernel.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/3] cgroup: add lockless fast-path checks to cgroup_file_notify()
Date: Tue, 10 Mar 2026 18:01:00 -0700	[thread overview]
Message-ID: <20260311010101.3306366-3-shakeel.butt@linux.dev> (raw)
In-Reply-To: <20260311010101.3306366-1-shakeel.butt@linux.dev>

Add lockless checks before acquiring cgroup_file_kn_lock:

1. READ_ONCE(cfile->kn) NULL check to skip torn-down files.
2. READ_ONCE(cfile->notified_at) rate-limit check to skip when
   within the notification interval.  If within the interval, arm
   the deferred timer via timer_reduce() and confirm it is pending
   before returning -- if the timer fired in between, fall through
   to the lock path so the notification is not lost.

Both checks have safe error directions -- a stale read can only
cause unnecessary lock acquisition, never a missed notification.

The critical section is simplified to just taking a kernfs_get()
reference and updating notified_at.

Annotate cfile->kn and cfile->notified_at write sites with
WRITE_ONCE() to pair with the lockless readers.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
Changes since v1:
- Moves the timer arming and rate limiting out of lock.

 kernel/cgroup/cgroup.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index b3fbeadb2b5a..b00f4c3242e0 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1749,7 +1749,7 @@ static void cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
 		struct cgroup_file *cfile = (void *)css + cft->file_offset;
 
 		spin_lock_irq(&cgroup_file_kn_lock);
-		cfile->kn = NULL;
+		WRITE_ONCE(cfile->kn, NULL);
 		spin_unlock_irq(&cgroup_file_kn_lock);
 
 		timer_delete_sync(&cfile->notify_timer);
@@ -4429,7 +4429,7 @@ static int cgroup_add_file(struct cgroup_subsys_state *css, struct cgroup *cgrp,
 		timer_setup(&cfile->notify_timer, cgroup_file_notify_timer, 0);
 
 		spin_lock_irq(&cgroup_file_kn_lock);
-		cfile->kn = kn;
+		WRITE_ONCE(cfile->kn, kn);
 		spin_unlock_irq(&cgroup_file_kn_lock);
 	}
 
@@ -4685,21 +4685,25 @@ int cgroup_add_legacy_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
  */
 void cgroup_file_notify(struct cgroup_file *cfile)
 {
-	unsigned long flags;
+	unsigned long flags, last, next;
 	struct kernfs_node *kn = NULL;
 
+	if (!READ_ONCE(cfile->kn))
+		return;
+
+	last = READ_ONCE(cfile->notified_at);
+	next = last + CGROUP_FILE_NOTIFY_MIN_INTV;
+	if (time_in_range(jiffies, last, next)) {
+		timer_reduce(&cfile->notify_timer, next);
+		if (timer_pending(&cfile->notify_timer))
+			return;
+	}
+
 	spin_lock_irqsave(&cgroup_file_kn_lock, flags);
 	if (cfile->kn) {
-		unsigned long last = cfile->notified_at;
-		unsigned long next = last + CGROUP_FILE_NOTIFY_MIN_INTV;
-
-		if (time_in_range(jiffies, last, next)) {
-			timer_reduce(&cfile->notify_timer, next);
-		} else {
-			kn = cfile->kn;
-			kernfs_get(kn);
-			cfile->notified_at = jiffies;
-		}
+		kn = cfile->kn;
+		kernfs_get(kn);
+		WRITE_ONCE(cfile->notified_at, jiffies);
 	}
 	spin_unlock_irqrestore(&cgroup_file_kn_lock, flags);
 
-- 
2.52.0


  parent reply	other threads:[~2026-03-11  1:01 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-11  1:00 [PATCH v2 0/3] cgroup: improve cgroup_file_notify() scalability Shakeel Butt
2026-03-11  1:00 ` [PATCH v2 1/3] cgroup: reduce cgroup_file_kn_lock hold time in cgroup_file_notify() Shakeel Butt
2026-03-11  1:01 ` Shakeel Butt [this message]
2026-03-11  1:01 ` [PATCH v2 3/3] cgroup: replace global cgroup_file_kn_lock with per-cgroup_file lock Shakeel Butt
2026-03-11 22:40 ` [PATCH v2 0/3] cgroup: improve cgroup_file_notify() scalability Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260311010101.3306366-3-shakeel.butt@linux.dev \
    --to=shakeel.butt@linux.dev \
    --cc=cgroups@vger.kernel.org \
    --cc=chenridong@huaweicloud.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mkoutny@suse.com \
    --cc=netdev@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.