public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever <cel@kernel.org>
To: NeilBrown <neilb@ownmail.net>, Jeff Layton <jlayton@kernel.org>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: <linux-nfs@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH v2 3/6] fs: add pin_insert_sb() for superblock-only pins
Date: Wed,  7 Jan 2026 19:40:13 -0500	[thread overview]
Message-ID: <20260108004016.3907158-4-cel@kernel.org> (raw)
In-Reply-To: <20260108004016.3907158-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

The fs_pin mechanism notifies interested subsystems when a filesystem
is remounted read-only or unmounted. Currently, BSD process accounting
uses this to halt accounting when the target filesystem goes away.
Registered pins receive callbacks from both group_pin_kill() (during
remount read-only) and mnt_pin_kill() (during mount teardown).

NFSD maintains NFSv4 client state associated with the superblocks
of exported filesystems. Revoking this state during unmount requires
lock ordering that conflicts with mnt_pin_kill() context:
mnt_pin_kill() runs during cleanup_mnt() with namespace locks held,
but NFSD's state revocation path acquires these same locks for mount
table lookups, creating AB-BA deadlock potential.

Add pin_insert_sb() to register pins on the superblock's s_pins
list only. Pins registered this way do not receive mnt_pin_kill()
callbacks during mount teardown.

After pin insertion, checking SB_ACTIVE detects racing unmounts.
When the superblock remains active, normal unmount cleanup occurs
through the subsystem's own shutdown path (outside the problematic
locking context) without pin callbacks.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/fs_pin.c            | 29 +++++++++++++++++++++++++++++
 include/linux/fs_pin.h |  1 +
 2 files changed, 30 insertions(+)

diff --git a/fs/fs_pin.c b/fs/fs_pin.c
index 972f34558b97..7204b4a5891f 100644
--- a/fs/fs_pin.c
+++ b/fs/fs_pin.c
@@ -48,6 +48,35 @@ void pin_insert(struct fs_pin *pin, struct vfsmount *m)
 }
 EXPORT_SYMBOL_GPL(pin_insert);
 
+/**
+ * pin_insert_sb - register an fs_pin on the superblock only
+ * @pin: the pin to register (must be initialized with init_fs_pin())
+ * @m: the vfsmount whose superblock to monitor
+ *
+ * Registers @pin on the superblock's s_pins list only. Callbacks arrive
+ * only from group_pin_kill() (invoked during remount read-only), not
+ * from mnt_pin_kill() (invoked during mount namespace teardown).
+ *
+ * Use this instead of pin_insert() when mnt_pin_kill() callbacks would
+ * execute in problematic locking contexts. Because mnt_pin_kill() runs
+ * during cleanup_mnt(), callbacks cannot acquire locks also taken during
+ * mount table operations without risking AB-BA deadlock.
+ *
+ * After insertion, check SB_ACTIVE to detect racing unmounts. If clear,
+ * call pin_remove() and abort. Normal unmount cleanup then occurs through
+ * subsystem-specific shutdown paths without pin callback involvement.
+ *
+ * The callback must call pin_remove() before returning. Callbacks execute
+ * with the RCU read lock held.
+ */
+void pin_insert_sb(struct fs_pin *pin, struct vfsmount *m)
+{
+	spin_lock(&pin_lock);
+	hlist_add_head(&pin->s_list, &m->mnt_sb->s_pins);
+	spin_unlock(&pin_lock);
+}
+EXPORT_SYMBOL_GPL(pin_insert_sb);
+
 void pin_kill(struct fs_pin *p)
 {
 	wait_queue_entry_t wait;
diff --git a/include/linux/fs_pin.h b/include/linux/fs_pin.h
index bdd09fd2520c..24c55329b15f 100644
--- a/include/linux/fs_pin.h
+++ b/include/linux/fs_pin.h
@@ -21,4 +21,5 @@ static inline void init_fs_pin(struct fs_pin *p, void (*kill)(struct fs_pin *))
 
 void pin_remove(struct fs_pin *);
 void pin_insert(struct fs_pin *, struct vfsmount *);
+void pin_insert_sb(struct fs_pin *, struct vfsmount *);
 void pin_kill(struct fs_pin *);
-- 
2.52.0


  parent reply	other threads:[~2026-01-08  0:40 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-08  0:40 [PATCH v2 0/6] Automatic NFSv4 state revocation on filesystem unmount Chuck Lever
2026-01-08  0:40 ` [PATCH v2 1/6] nfsd: cancel async COPY operations when admin revokes filesystem state Chuck Lever
2026-01-08  0:40 ` [PATCH v2 2/6] fs: export pin_insert and pin_remove for modular filesystems Chuck Lever
2026-01-08  0:40 ` Chuck Lever [this message]
2026-01-08  0:40 ` [PATCH v2 4/6] fs: invoke group_pin_kill() during mount teardown Chuck Lever
2026-01-09  8:38   ` NeilBrown
2026-01-09 16:04     ` Chuck Lever
2026-01-10 16:49       ` Al Viro
2026-01-10 20:07         ` Chuck Lever
2026-01-10 21:52           ` NeilBrown
2026-01-10 22:08           ` Al Viro
2026-01-10 22:31             ` Chuck Lever
2026-01-08  0:40 ` [PATCH v2 5/6] nfsd: revoke NFSv4 state when filesystem is unmounted Chuck Lever
2026-01-09  9:06   ` NeilBrown
2026-01-08  0:40 ` [PATCH v2 6/6] nfsd: close cached files on filesystem unmount Chuck Lever
2026-01-09 16:25 ` [PATCH v2 0/6] Automatic NFSv4 state revocation " Jeff Layton
2026-01-12  9:16   ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260108004016.3907158-4-cel@kernel.org \
    --to=cel@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=dai.ngo@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@ownmail.net \
    --cc=okorniev@redhat.com \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox