From: Chuck Lever <cel@kernel.org>
To: NeilBrown <neilb@ownmail.net>, Jeff Layton <jlayton@kernel.org>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: <linux-nfs@vger.kernel.org>, Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH v1 3/5] fs: add pin_insert_group() for superblock-only pins
Date: Tue, 30 Dec 2025 09:18:36 -0500
Message-ID: <20251230141838.2547848-4-cel@kernel.org>
In-Reply-To: <20251230141838.2547848-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Filesystems using fs_pin currently receive callbacks from both
group_pin_kill() (during remount read-only) and mnt_pin_kill()
(during mount teardown). Some filesystems require callbacks only
from the former.

NFSD maintains NFSv4 client state associated with the superblocks
of exported filesystems. Revoking this state during unmount requires
lock ordering that conflicts with mnt_pin_kill() context:
mnt_pin_kill() runs during cleanup_mnt() with namespace locks held,
but NFSD's state revocation path acquires these same locks for mount
table lookups, creating AB-BA deadlock potential.

Add pin_insert_group() to register pins on the superblock's s_pins
list only. The function name derives from group_pin_kill(), which
iterates s_pins during remount read-only. Pins registered this way
do not receive mnt_pin_kill() callbacks during mount teardown.

Callers must check SB_ACTIVE after inserting the pin to detect a
racing unmount. If the flag is clear, the caller calls pin_remove()
and aborts. If the superblock is still active, unmount cleanup
happens later through the subsystem's own shutdown path, outside the
problematic locking context and without pin callbacks.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
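Illustrative note, not part of the commit: the sketch below is a
userspace model of the behavior described above, not kernel code. Every
model_* name is hypothetical, and the two booleans only stand in for
membership of the mount's pin list and the superblock's s_pins list. It
shows that a pin registered superblock-only is skipped by the
mount-teardown walk, still visited by the remount walk, and backed out
when the SB_ACTIVE check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; these are NOT the kernel's types or functions. */
struct model_pin {
	bool on_mnt_list;	/* stands in for membership of the mount's pin list */
	bool on_sb_list;	/* stands in for membership of the superblock's s_pins */
	int kill_calls;		/* number of times the kill callback has run */
};

struct model_sb {
	bool active;		/* stands in for SB_ACTIVE */
};

/* Like pin_insert(): the pin is reachable from both teardown walks. */
static void model_pin_insert(struct model_pin *p)
{
	p->on_mnt_list = true;
	p->on_sb_list = true;
}

/* Like pin_insert_group(): the pin is reachable from s_pins only. */
static void model_pin_insert_group(struct model_pin *p)
{
	p->on_sb_list = true;
}

static void model_pin_remove(struct model_pin *p)
{
	p->on_mnt_list = false;
	p->on_sb_list = false;
}

/* The kill callback removes the pin before returning. */
static void model_kill(struct model_pin *p)
{
	p->kill_calls++;
	model_pin_remove(p);
}

/* Models mnt_pin_kill(): only pins on the mount list are visited. */
static void model_mnt_pin_kill(struct model_pin **pins, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (pins[i]->on_mnt_list)
			model_kill(pins[i]);
}

/* Models group_pin_kill(): only pins on the superblock list are visited. */
static void model_group_pin_kill(struct model_pin **pins, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (pins[i]->on_sb_list)
			model_kill(pins[i]);
}

/* Register superblock-only, backing out if an unmount raced with us. */
static bool model_register(struct model_pin *p, const struct model_sb *sb)
{
	model_pin_insert_group(p);
	if (!sb->active) {
		model_pin_remove(p);
		return false;
	}
	return true;
}

/* Returns 0 if the model behaves as described; nonzero otherwise. */
static int model_demo(void)
{
	struct model_sb live = { .active = true }, dying = { .active = false };
	struct model_pin both = { 0 }, sb_only = { 0 }, late = { 0 };
	struct model_pin *pins[] = { &both, &sb_only, &late };

	model_pin_insert(&both);
	if (!model_register(&sb_only, &live))
		return 1;
	if (model_register(&late, &dying))
		return 2;	/* racing unmount must be backed out */

	model_mnt_pin_kill(pins, 3);	/* mount teardown */
	if (both.kill_calls != 1 || sb_only.kill_calls != 0)
		return 3;

	model_group_pin_kill(pins, 3);	/* remount read-only */
	if (both.kill_calls != 1 || sb_only.kill_calls != 1)
		return 4;

	assert(late.kill_calls == 0);	/* never registered, never killed */
	return 0;
}
```

Calling model_demo() from a test program's main() exercises all three
cases. The model ignores locking, RCU, and the wait logic in pin_kill().
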
 fs/fs_pin.c            | 29 +++++++++++++++++++++++++++++
 include/linux/fs_pin.h |  1 +
 2 files changed, 30 insertions(+)

diff --git a/fs/fs_pin.c b/fs/fs_pin.c
index 972f34558b97..93da2e710abc 100644
--- a/fs/fs_pin.c
+++ b/fs/fs_pin.c
@@ -48,6 +48,35 @@ void pin_insert(struct fs_pin *pin, struct vfsmount *m)
 }
 EXPORT_SYMBOL_GPL(pin_insert);
 
+/**
+ * pin_insert_group - register an fs_pin for superblock-only notification
+ * @pin: the pin to register (must be initialized with init_fs_pin())
+ * @m: the vfsmount whose superblock to monitor
+ *
+ * Registers @pin on the superblock's s_pins list only. Callbacks arrive
+ * only from group_pin_kill() (invoked during remount read-only), not
+ * from mnt_pin_kill() (invoked during mount namespace teardown).
+ *
+ * Use this instead of pin_insert() when mnt_pin_kill() callbacks would
+ * execute in problematic locking contexts. Because mnt_pin_kill() runs
+ * during cleanup_mnt(), callbacks cannot acquire locks also taken during
+ * mount table operations without risking AB-BA deadlock.
+ *
+ * After insertion, check SB_ACTIVE to detect racing unmounts. If clear,
+ * call pin_remove() and abort. Normal unmount cleanup then occurs through
+ * subsystem-specific shutdown paths without pin callback involvement.
+ *
+ * The callback must call pin_remove() before returning. Callbacks execute
+ * with the RCU read lock held.
+ */
+void pin_insert_group(struct fs_pin *pin, struct vfsmount *m)
+{
+	spin_lock(&pin_lock);
+	hlist_add_head(&pin->s_list, &m->mnt_sb->s_pins);
+	spin_unlock(&pin_lock);
+}
+EXPORT_SYMBOL_GPL(pin_insert_group);
+
 void pin_kill(struct fs_pin *p)
 {
 	wait_queue_entry_t wait;
diff --git a/include/linux/fs_pin.h b/include/linux/fs_pin.h
index bdd09fd2520c..379e13bc72db 100644
--- a/include/linux/fs_pin.h
+++ b/include/linux/fs_pin.h
@@ -21,4 +21,5 @@ static inline void init_fs_pin(struct fs_pin *p, void (*kill)(struct fs_pin *))
 
 void pin_remove(struct fs_pin *);
 void pin_insert(struct fs_pin *, struct vfsmount *);
+void pin_insert_group(struct fs_pin *, struct vfsmount *);
 void pin_kill(struct fs_pin *);
-- 
2.52.0

