From: Chuck Lever <cel@kernel.org>
To: NeilBrown <neilb@ownmail.net>, Jeff Layton <jlayton@kernel.org>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: <linux-nfs@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH v2 3/6] fs: add pin_insert_sb() for superblock-only pins
Date: Wed,  7 Jan 2026 19:40:13 -0500
Message-ID: <20260108004016.3907158-4-cel@kernel.org>
In-Reply-To: <20260108004016.3907158-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

The fs_pin mechanism notifies interested subsystems when a filesystem
is remounted read-only or unmounted. Currently, BSD process accounting
uses this to halt accounting when the target filesystem goes away.
Registered pins receive callbacks from both group_pin_kill() (during
remount read-only) and mnt_pin_kill() (during mount teardown).

NFSD maintains NFSv4 client state associated with the superblocks
of exported filesystems. Revoking this state during unmount requires
lock ordering that conflicts with mnt_pin_kill() context:
mnt_pin_kill() runs during cleanup_mnt() with namespace locks held,
but NFSD's state revocation path acquires these same locks for mount
table lookups, creating AB-BA deadlock potential.

Add pin_insert_sb() to register pins on the superblock's s_pins
list only. Pins registered this way do not receive mnt_pin_kill()
callbacks during mount teardown.

Callers should check SB_ACTIVE after inserting the pin to detect a
racing unmount. If the superblock is still active, a later unmount is
cleaned up through the subsystem's own shutdown path (outside the
problematic locking context), without any pin callback.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
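Illustration (not part of the commit message): a minimal userspace
sketch of the difference between the two registration calls. It only
models the behaviour described in this patch. The model_* helpers,
struct pin_list, demo_kill() and callbacks_seen() are hypothetical
stand-ins, and fixed-size arrays replace the kernel's hlists and
pin_lock.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PINS 8

struct fs_pin;

/* Stand-in for an hlist of pins (no locking in this model). */
struct pin_list {
	struct fs_pin *pins[MAX_PINS];
	size_t count;
};

/* Simplified stand-ins, not the kernel definitions. */
struct fs_pin {
	void (*kill)(struct fs_pin *);
	struct pin_list *s_owner;	/* models membership via s_list */
	struct pin_list *m_owner;	/* models membership via m_list */
	int killed;			/* number of callbacks received */
};

struct super_block { struct pin_list s_pins; };
struct vfsmount {
	struct super_block *mnt_sb;
	struct pin_list mnt_pins;
};

static void list_add(struct pin_list *l, struct fs_pin *p)
{
	assert(l->count < MAX_PINS);
	l->pins[l->count++] = p;
}

static void list_del(struct pin_list *l, struct fs_pin *p)
{
	if (!l)
		return;
	for (size_t i = 0; i < l->count; i++) {
		if (l->pins[i] == p) {
			l->pins[i] = l->pins[--l->count];
			return;
		}
	}
}

/* Models pin_insert(): the pin joins both lists. */
static void model_pin_insert(struct fs_pin *pin, struct vfsmount *m)
{
	pin->s_owner = &m->mnt_sb->s_pins;
	pin->m_owner = &m->mnt_pins;
	list_add(pin->s_owner, pin);
	list_add(pin->m_owner, pin);
}

/* Models pin_insert_sb(): the pin joins the superblock list only. */
static void model_pin_insert_sb(struct fs_pin *pin, struct vfsmount *m)
{
	pin->s_owner = &m->mnt_sb->s_pins;
	pin->m_owner = NULL;
	list_add(pin->s_owner, pin);
}

/* Models pin_remove(): drop the pin from every list it is on. */
static void model_pin_remove(struct fs_pin *pin)
{
	list_del(pin->s_owner, pin);
	list_del(pin->m_owner, pin);
	pin->s_owner = pin->m_owner = NULL;
}

/* Invoke the callback of every pin on a list. Each callback must
 * remove its pin, otherwise this loop would never terminate. */
static void model_kill_list(struct pin_list *l)
{
	while (l->count > 0)
		l->pins[0]->kill(l->pins[0]);
}

static void demo_kill(struct fs_pin *pin)
{
	pin->killed++;
	model_pin_remove(pin);
}

/* Returns how many callbacks one pin receives for a given
 * registration style (sb_only) and event (teardown or remount). */
int callbacks_seen(int sb_only, int teardown)
{
	struct super_block sb = { 0 };
	struct vfsmount m = { .mnt_sb = &sb };
	struct fs_pin pin = { .kill = demo_kill };

	if (sb_only)
		model_pin_insert_sb(&pin, &m);
	else
		model_pin_insert(&pin, &m);

	if (teardown)
		model_kill_list(&m.mnt_pins);	/* stands in for mnt_pin_kill() */
	else
		model_kill_list(&sb.s_pins);	/* stands in for group_pin_kill() */
	return pin.killed;
}
```

In this model, a pin registered on the superblock list only receives
its callback from the remount read-only path and none from mount
teardown, while a pin registered on both lists is called from either
path.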
 fs/fs_pin.c            | 29 +++++++++++++++++++++++++++++
 include/linux/fs_pin.h |  1 +
 2 files changed, 30 insertions(+)

diff --git a/fs/fs_pin.c b/fs/fs_pin.c
index 972f34558b97..7204b4a5891f 100644
--- a/fs/fs_pin.c
+++ b/fs/fs_pin.c
@@ -48,6 +48,35 @@ void pin_insert(struct fs_pin *pin, struct vfsmount *m)
 }
 EXPORT_SYMBOL_GPL(pin_insert);
 
+/**
+ * pin_insert_sb - register an fs_pin on the superblock only
+ * @pin: the pin to register (must be initialized with init_fs_pin())
+ * @m: the vfsmount whose superblock to monitor
+ *
+ * Registers @pin on the superblock's s_pins list only. Callbacks arrive
+ * only from group_pin_kill() (invoked during remount read-only), not
+ * from mnt_pin_kill() (invoked during mount namespace teardown).
+ *
+ * Use this instead of pin_insert() when mnt_pin_kill() callbacks would
+ * execute in problematic locking contexts. Because mnt_pin_kill() runs
+ * during cleanup_mnt(), callbacks cannot acquire locks also taken during
+ * mount table operations without risking AB-BA deadlock.
+ *
+ * After insertion, check SB_ACTIVE to detect racing unmounts. If clear,
+ * call pin_remove() and abort. Normal unmount cleanup then occurs through
+ * subsystem-specific shutdown paths without pin callback involvement.
+ *
+ * The callback must call pin_remove() before returning. Callbacks execute
+ * with the RCU read lock held.
+ */
+void pin_insert_sb(struct fs_pin *pin, struct vfsmount *m)
+{
+	spin_lock(&pin_lock);
+	hlist_add_head(&pin->s_list, &m->mnt_sb->s_pins);
+	spin_unlock(&pin_lock);
+}
+EXPORT_SYMBOL_GPL(pin_insert_sb);
+
 void pin_kill(struct fs_pin *p)
 {
 	wait_queue_entry_t wait;
diff --git a/include/linux/fs_pin.h b/include/linux/fs_pin.h
index bdd09fd2520c..24c55329b15f 100644
--- a/include/linux/fs_pin.h
+++ b/include/linux/fs_pin.h
@@ -21,4 +21,5 @@ static inline void init_fs_pin(struct fs_pin *p, void (*kill)(struct fs_pin *))
 
 void pin_remove(struct fs_pin *);
 void pin_insert(struct fs_pin *, struct vfsmount *);
+void pin_insert_sb(struct fs_pin *, struct vfsmount *);
 void pin_kill(struct fs_pin *);
-- 
2.52.0

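The caller-side pattern described in the kernel-doc comment (insert,
then check SB_ACTIVE, then remove and abort if it is clear) could look
roughly like the userspace sketch below. This is an assumption-laden
illustration, not code from the series: subsys_track_sb(), demo_track(),
the model_* helpers, the s_pin_count field, and the -ENODEV return
value are all hypothetical, and the structures are reduced stand-ins
for the kernel types.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in flag value; only its use as a bit test matters here. */
#define SB_ACTIVE (1UL << 30)

/* Simplified stand-ins, not the kernel definitions. */
struct super_block {
	unsigned long s_flags;
	int s_pin_count;	/* models the length of s_pins */
};

struct vfsmount {
	struct super_block *mnt_sb;
};

struct fs_pin {
	struct super_block *sb;	/* NULL while unregistered */
};

/* Models pin_insert_sb(): register on the superblock only. */
static void model_pin_insert_sb(struct fs_pin *pin, struct vfsmount *m)
{
	pin->sb = m->mnt_sb;
	pin->sb->s_pin_count++;
}

/* Models pin_remove(): safe to call on an unregistered pin. */
static void model_pin_remove(struct fs_pin *pin)
{
	if (pin->sb) {
		pin->sb->s_pin_count--;
		pin->sb = NULL;
	}
}

/*
 * Hypothetical subsystem helper. The pin is inserted first and the
 * flag is checked afterwards, so an unmount that began before the
 * insertion is noticed here and the registration is undone.
 */
int subsys_track_sb(struct fs_pin *pin, struct vfsmount *m)
{
	model_pin_insert_sb(pin, m);
	if (!(m->mnt_sb->s_flags & SB_ACTIVE)) {
		model_pin_remove(pin);
		return -ENODEV;	/* arbitrary error code for this sketch */
	}
	return 0;
}

/* Runs the helper against an active or inactive superblock and
 * reports how many pins remain registered afterwards. */
int demo_track(int sb_active, int *pins_left)
{
	struct super_block sb = { .s_flags = sb_active ? SB_ACTIVE : 0 };
	struct vfsmount m = { .mnt_sb = &sb };
	struct fs_pin pin = { .sb = NULL };
	int rc = subsys_track_sb(&pin, &m);

	*pins_left = sb.s_pin_count;
	return rc;
}
```

With an active superblock the helper returns 0 and the pin stays
registered. With SB_ACTIVE clear it removes the pin again and returns
the error, leaving cleanup to the subsystem's own shutdown path.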

Thread overview: 17+ messages
2026-01-08  0:40 [PATCH v2 0/6] Automatic NFSv4 state revocation on filesystem unmount Chuck Lever
2026-01-08  0:40 ` [PATCH v2 1/6] nfsd: cancel async COPY operations when admin revokes filesystem state Chuck Lever
2026-01-08  0:40 ` [PATCH v2 2/6] fs: export pin_insert and pin_remove for modular filesystems Chuck Lever
2026-01-08  0:40 ` [PATCH v2 3/6] fs: add pin_insert_sb() for superblock-only pins Chuck Lever
2026-01-08  0:40 ` [PATCH v2 4/6] fs: invoke group_pin_kill() during mount teardown Chuck Lever
2026-01-09  8:38   ` NeilBrown
2026-01-09 16:04     ` Chuck Lever
2026-01-10 16:49       ` Al Viro
2026-01-10 20:07         ` Chuck Lever
2026-01-10 21:52           ` NeilBrown
2026-01-10 22:08           ` Al Viro
2026-01-10 22:31             ` Chuck Lever
2026-01-08  0:40 ` [PATCH v2 5/6] nfsd: revoke NFSv4 state when filesystem is unmounted Chuck Lever
2026-01-09  9:06   ` NeilBrown
2026-01-08  0:40 ` [PATCH v2 6/6] nfsd: close cached files on filesystem unmount Chuck Lever
2026-01-09 16:25 ` [PATCH v2 0/6] Automatic NFSv4 state revocation " Jeff Layton
2026-01-12  9:16   ` Christian Brauner
