* [PATCH 1/8] ext4: convert extents KUnit test to sget_fc()
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
@ 2026-05-26 15:09 ` Christian Brauner
2026-05-26 15:09 ` [PATCH 2/8] ext4: convert mballoc " Christian Brauner
` (8 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2026-05-26 15:09 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro,
Christian Brauner (Amutable)
The extents KUnit test uses sget() to get an initialized superblock for
its fake file_system_type. sget() predates fs_context and we want to
retire it. Switch this caller over to sget_fc().
Add a no-op ext_init_fs_context() so fs_context_for_mount() has
something to call on the fake fs_type. ext_set() now takes a struct
fs_context * (still a no-op). extents_kunit_init() allocates the fc,
hands it to sget_fc() and drops the fc reference once the sb is
published. sget_fc() does not retain a pointer to it.
No functional change for the test.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/ext4/extents-test.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/extents-test.c b/fs/ext4/extents-test.c
index 6b53a3f39fcd..bd7795a82607 100644
--- a/fs/ext4/extents-test.c
+++ b/fs/ext4/extents-test.c
@@ -37,6 +37,7 @@
#include <kunit/test.h>
#include <kunit/static_stub.h>
+#include <linux/fs_context.h>
#include <linux/gfp_types.h>
#include <linux/stddef.h>
@@ -130,14 +131,20 @@ static void ext_kill_sb(struct super_block *sb)
generic_shutdown_super(sb);
}
-static int ext_set(struct super_block *sb, void *data)
+static int ext_init_fs_context(struct fs_context *fc)
+{
+ return 0;
+}
+
+static int ext_set(struct super_block *sb, struct fs_context *fc)
{
return 0;
}
static struct file_system_type ext_fs_type = {
- .name = "extents test",
- .kill_sb = ext_kill_sb,
+ .name = "extents test",
+ .init_fs_context = ext_init_fs_context,
+ .kill_sb = ext_kill_sb,
};
static void extents_kunit_exit(struct kunit *test)
@@ -223,6 +230,7 @@ static int extents_kunit_init(struct kunit *test)
struct ext4_inode_info *ei;
struct inode *inode;
struct super_block *sb;
+ struct fs_context *fc;
struct ext4_sb_info *sbi = NULL;
struct kunit_ext_test_param *param =
(struct kunit_ext_test_param *)(test->param_value);
@@ -232,7 +240,13 @@ static int extents_kunit_init(struct kunit *test)
if (sbi == NULL)
return -ENOMEM;
- sb = sget(&ext_fs_type, NULL, ext_set, 0, NULL);
+ fc = fs_context_for_mount(&ext_fs_type, 0);
+ if (IS_ERR(fc)) {
+ kfree(sbi);
+ return PTR_ERR(fc);
+ }
+ sb = sget_fc(fc, NULL, ext_set);
+ put_fs_context(fc);
if (IS_ERR(sb)) {
kfree(sbi);
return PTR_ERR(sb);
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/8] ext4: convert mballoc KUnit test to sget_fc()
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
2026-05-26 15:09 ` [PATCH 1/8] ext4: convert extents KUnit test to sget_fc() Christian Brauner
@ 2026-05-26 15:09 ` Christian Brauner
2026-05-27 0:47 ` Theodore Tso
2026-05-26 15:09 ` [PATCH 3/8] smb: client: convert cifs_smb3_do_mount() " Christian Brauner
` (7 subsequent siblings)
9 siblings, 1 reply; 15+ messages in thread
From: Christian Brauner @ 2026-05-26 15:09 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro,
Christian Brauner (Amutable)
Same treatment as the extents KUnit test. The mballoc test uses sget()
as a thin "give me an initialized superblock" wrapper for a fake
file_system_type. Move it onto sget_fc() so sget() can go away.
Add a no-op mbt_init_fs_context() so fs_context_for_mount() has
something to call on the fake fs_type. mbt_set() now takes a struct
fs_context * (still a no-op). mbt_ext4_alloc_super_block() allocates
the fc, hands it to sget_fc() and drops the fc reference once the sb
is published.
No functional change.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/ext4/mballoc-test.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/mballoc-test.c b/fs/ext4/mballoc-test.c
index 90ed505fa4b1..d90da44aadbd 100644
--- a/fs/ext4/mballoc-test.c
+++ b/fs/ext4/mballoc-test.c
@@ -5,6 +5,7 @@
#include <kunit/test.h>
#include <kunit/static_stub.h>
+#include <linux/fs_context.h>
#include <linux/random.h>
#include "ext4.h"
@@ -63,8 +64,14 @@ static void mbt_kill_sb(struct super_block *sb)
generic_shutdown_super(sb);
}
+static int mbt_init_fs_context(struct fs_context *fc)
+{
+ return 0;
+}
+
static struct file_system_type mbt_fs_type = {
.name = "mballoc test",
+ .init_fs_context = mbt_init_fs_context,
.kill_sb = mbt_kill_sb,
};
@@ -127,7 +134,7 @@ static void mbt_mb_release(struct super_block *sb)
kfree(sb->s_bdev);
}
-static int mbt_set(struct super_block *sb, void *data)
+static int mbt_set(struct super_block *sb, struct fs_context *fc)
{
return 0;
}
@@ -136,13 +143,19 @@ static struct super_block *mbt_ext4_alloc_super_block(void)
{
struct mbt_ext4_super_block *fsb;
struct super_block *sb;
+ struct fs_context *fc;
struct ext4_sb_info *sbi;
fsb = kzalloc_obj(*fsb);
if (fsb == NULL)
return NULL;
- sb = sget(&mbt_fs_type, NULL, mbt_set, 0, NULL);
+ fc = fs_context_for_mount(&mbt_fs_type, 0);
+ if (IS_ERR(fc))
+ goto out;
+
+ sb = sget_fc(fc, NULL, mbt_set);
+ put_fs_context(fc);
if (IS_ERR(sb))
goto out;
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/8] ext4: convert mballoc KUnit test to sget_fc()
2026-05-26 15:09 ` [PATCH 2/8] ext4: convert mballoc " Christian Brauner
@ 2026-05-27 0:47 ` Theodore Tso
2026-05-28 12:02 ` Christian Brauner
0 siblings, 1 reply; 15+ messages in thread
From: Theodore Tso @ 2026-05-27 0:47 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro
On Tue, May 26, 2026 at 05:09:04PM +0200, Christian Brauner wrote:
> Add a no-op mbt_init_fs_context() so fs_context_for_mount() has
> something to call on the fake fs_type....
I was trying to figure out what needed to be in an init_fs_context()
function, and I came across this in
Documentation/filesystems/mount_api.rst:
const struct fs_context_operations *ops
These are operations that can be done on a filesystem context (see
below). This must be set by the ->init_fs_context() file_system_type
operation. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
So is it safe to just have an init_fs_context() function which doesn't
do this?
> +static int mbt_init_fs_context(struct fs_context *fc)
> +{
> + return 0;
> +}
> +
I see in fs/fs_context.c that in some places the code protects against
a NULL ops pointer:
if (fc->need_free && fc->ops && fc->ops->free)
fc->ops->free(fc);
But in other places, it doesn't and we'll end up dereferencing a null
pointer:
if (fc->ops->parse_param) {
ret = fc->ops->parse_param(fc, param);
....
So it's unclear to me --- when is it safe (and not safe) to not bother
to fill in the ops pointer?
- Ted
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/8] ext4: convert mballoc KUnit test to sget_fc()
2026-05-27 0:47 ` Theodore Tso
@ 2026-05-28 12:02 ` Christian Brauner
2026-06-03 13:52 ` Theodore Tso
0 siblings, 1 reply; 15+ messages in thread
From: Christian Brauner @ 2026-05-28 12:02 UTC (permalink / raw)
To: Theodore Tso
Cc: linux-fsdevel, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro
On Tue, May 26, 2026 at 07:47:27PM -0500, Theodore Ts'o wrote:
> On Tue, May 26, 2026 at 05:09:04PM +0200, Christian Brauner wrote:
> > Add a no-op mbt_init_fs_context() so fs_context_for_mount() has
> > something to call on the fake fs_type....
>
> I was trying to figure out what needed to be in an init_fs_context()
> function, and I came across this in
> Documentation/filesystems/mount_api.rst:
>
> const struct fs_context_operations *ops
>
> These are operations that can be done on a filesystem context (see
> below). This must be set by the ->init_fs_context() file_system_type
> operation. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> So is it safe to just have an init_fs_context() function which doesn't
> do this?
>
> > +static int mbt_init_fs_context(struct fs_context *fc)
> > +{
> > + return 0;
> > +}
> > +
>
> I see in fs/fs_context.c that in some places the code protects against
> a NULL ops pointer:
>
> if (fc->need_free && fc->ops && fc->ops->free)
> fc->ops->free(fc);
>
> But in other places, it doesn't and we'll end up dereferencing a null
> pointer:
>
> if (fc->ops->parse_param) {
> ret = fc->ops->parse_param(fc, param);
>
> ....
>
> So it's unclear to me --- when is it safe (and not safe) to not bother
> to fill in the ops pointer?
Hey Ted!
In these two cases it's fine, because you're just using the allocation
and deallocation functions to get a fs_context that's basically just an
empty vessel to get at a superblock via sget_fc(); you're not really
doing anything with it.
IOW, you can never end up in callchains that cause issues.
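The distinction discussed above (a teardown path that tolerates a NULL ops pointer versus a parse path that does not) can be sketched in userspace C. This is only a model under stated assumptions: struct model_fc, struct model_fc_ops and both helpers are hypothetical stand-ins, not the kernel's definitions, and the explicit NULL check in model_parse_param() exists only so the model reports the hazard instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; these are NOT the kernel definitions. */
struct model_fc;

struct model_fc_ops {
	void (*free)(struct model_fc *fc);
	int (*parse_param)(struct model_fc *fc, const char *param);
};

struct model_fc {
	const struct model_fc_ops *ops;	/* left NULL by a no-op init */
	bool need_free;
};

/* Mirrors the guarded teardown check quoted above: safe with ops == NULL. */
static int model_put_fc(struct model_fc *fc)
{
	if (fc->need_free && fc->ops && fc->ops->free) {
		fc->ops->free(fc);
		return 1;	/* ->free() was invoked */
	}
	return 0;		/* nothing to call */
}

/*
 * Mirrors the unguarded parse path. The NULL check is an addition made
 * for this model only; it marks the spot where an unguarded
 * fc->ops->parse_param access would dereference NULL.
 */
static int model_parse_param(struct model_fc *fc, const char *param)
{
	if (!fc->ops)
		return -1;	/* hazard: unguarded code would crash here */
	if (fc->ops->parse_param)
		return fc->ops->parse_param(fc, param);
	return 0;
}
```

In this model, a context whose ops pointer was never filled in can be allocated and released safely, which is the only thing the two KUnit tests do with it. Any caller that reached the parameter-parsing path would hit the hazard branch.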
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/8] ext4: convert mballoc KUnit test to sget_fc()
2026-05-28 12:02 ` Christian Brauner
@ 2026-06-03 13:52 ` Theodore Tso
0 siblings, 0 replies; 15+ messages in thread
From: Theodore Tso @ 2026-06-03 13:52 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro
On Thu, May 28, 2026 at 02:02:50PM +0200, Christian Brauner wrote:
>
> In these two cases it's fine, because you're just using the allocation
> and deallocation functions to get a fs_context that's basically just an
> empty vessel to get at a superblock via sget_fc(); you're not really
> doing anything with it.
If you're OK with it, I have no objections, but...
I'm sure it's fine today. But is this something which is documented
to be fine in the future? It just seems a little fragile and is
contrary to the documentation.
Thanks,
- Ted
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/8] smb: client: convert cifs_smb3_do_mount() to sget_fc()
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
2026-05-26 15:09 ` [PATCH 1/8] ext4: convert extents KUnit test to sget_fc() Christian Brauner
2026-05-26 15:09 ` [PATCH 2/8] ext4: convert mballoc " Christian Brauner
@ 2026-05-26 15:09 ` Christian Brauner
2026-05-26 15:09 ` [PATCH 4/8] fs: retire sget() Christian Brauner
` (6 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2026-05-26 15:09 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro,
Christian Brauner (Amutable)
The CIFS mount path already runs through fs_context: smb3_get_tree()
calls smb3_get_tree_common() with a struct fs_context * in hand. But
the fc is dropped on the way to sget(). Plumb it through to sget_fc()
so the legacy sget() interface can go.
cifs_smb3_do_mount() now takes (struct fs_context *, struct
smb3_fs_context *). The old (fs_type, flags) pair is reconstructed
from fc->fs_type and fc->sb_flags. The flags argument was always
passed as 0 by the sole caller anyway. The cifs_dbg diagnostic now
prints fc->sb_flags directly.
cifs_match_super() and cifs_set_super() were the two void-data
callbacks for sget(). The match callback now takes
(struct super_block *, struct fs_context *) and reads struct
cifs_mnt_data out of fc->sget_key. The set callback is gone entirely:
sget_fc() pre-populates sb->s_fs_info from fc->s_fs_info before
invoking set() so set_anon_super_fc() (which just allocates an anon
bdev) is sufficient.
Before sget_fc() we stash cifs_sb in fc->s_fs_info, the per-mount data
in fc->sget_key and force fc->sb_flags to SB_NODIRATIME | SB_NOATIME
to reproduce the previous hard-coded behaviour (alloc_super() reads
fc->sb_flags). The original sb_flags is saved and restored around the
call so the rest of the mount path sees the same fc semantics as
before.
mnt_data.flags keeps its historical value of 0 so the CIFS_MS_MASK
comparison in compare_mount_options() returns the same (always-equal)
result.
No functional change. With this in place sget() has no remaining CIFS
caller.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/smb/client/cifsfs.c | 43 ++++++++++++++++++++++++++-----------------
fs/smb/client/cifsfs.h | 3 ++-
fs/smb/client/cifsproto.h | 3 ++-
fs/smb/client/connect.c | 5 +++--
fs/smb/client/fs_context.c | 2 +-
5 files changed, 34 insertions(+), 22 deletions(-)
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index 9f76b0347fa9..d5074e3fbb85 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -12,6 +12,7 @@
#include <linux/module.h>
#include <linux/fs.h>
+#include <linux/fs_context.h>
#include <linux/filelock.h>
#include <linux/mount.h>
#include <linux/slab.h>
@@ -966,26 +967,19 @@ cifs_get_root(struct smb3_fs_context *ctx, struct super_block *sb)
return dentry;
}
-static int cifs_set_super(struct super_block *sb, void *data)
-{
- struct cifs_mnt_data *mnt_data = data;
- sb->s_fs_info = mnt_data->cifs_sb;
- return set_anon_super(sb, NULL);
-}
-
struct dentry *
-cifs_smb3_do_mount(struct file_system_type *fs_type,
- int flags, struct smb3_fs_context *old_ctx)
+cifs_smb3_do_mount(struct fs_context *fc, struct smb3_fs_context *old_ctx)
{
struct cifs_mnt_data mnt_data;
struct cifs_sb_info *cifs_sb;
struct super_block *sb;
struct dentry *root;
+ unsigned int saved_sb_flags;
int rc;
if (cifsFYI) {
- cifs_dbg(FYI, "%s: devname=%s flags=0x%x\n", __func__,
- old_ctx->source, flags);
+ cifs_dbg(FYI, "%s: devname=%s sb_flags=0x%x\n", __func__,
+ old_ctx->source, fc->sb_flags);
} else {
cifs_info("Attempting to mount %s\n", old_ctx->source);
}
@@ -1012,7 +1006,7 @@ cifs_smb3_do_mount(struct file_system_type *fs_type,
rc = cifs_mount(cifs_sb, cifs_sb->ctx);
if (rc) {
- if (!(flags & SB_SILENT))
+ if (!(fc->sb_flags & SB_SILENT))
cifs_dbg(VFS, "cifs_mount failed w/return code = %d\n",
rc);
root = ERR_PTR(rc);
@@ -1021,12 +1015,27 @@ cifs_smb3_do_mount(struct file_system_type *fs_type,
mnt_data.ctx = cifs_sb->ctx;
mnt_data.cifs_sb = cifs_sb;
- mnt_data.flags = flags;
+ mnt_data.flags = 0;
- /* BB should we make this contingent on mount parm? */
- flags |= SB_NODIRATIME | SB_NOATIME;
-
- sb = sget(fs_type, cifs_match_super, cifs_set_super, flags, &mnt_data);
+ /*
+ * sb->s_flags is set from fc->sb_flags by alloc_super(). CIFS has
+ * historically forced SB_NODIRATIME | SB_NOATIME on every mount and
+ * ignored the caller-supplied SB_* flags. Preserve that behaviour by
+ * overriding fc->sb_flags around the sget_fc() call.
+ *
+ * Hand cifs_sb to sget_fc() via fc->s_fs_info; sget_fc() copies it
+ * onto sb->s_fs_info before running set() and clears fc->s_fs_info
+ * on successful publish. Pass the rest of the per-mount context to
+ * cifs_match_super() through fc->sget_key.
+ */
+ saved_sb_flags = fc->sb_flags;
+ fc->sb_flags = SB_NODIRATIME | SB_NOATIME;
+ fc->s_fs_info = cifs_sb;
+ fc->sget_key = &mnt_data;
+ sb = sget_fc(fc, cifs_match_super, set_anon_super_fc);
+ fc->sget_key = NULL;
+ fc->s_fs_info = NULL;
+ fc->sb_flags = saved_sb_flags;
if (IS_ERR(sb)) {
cifs_umount(cifs_sb);
return ERR_CAST(sb);
diff --git a/fs/smb/client/cifsfs.h b/fs/smb/client/cifsfs.h
index c455b15f2778..0a93f48924a5 100644
--- a/fs/smb/client/cifsfs.h
+++ b/fs/smb/client/cifsfs.h
@@ -144,8 +144,9 @@ ssize_t cifs_file_copychunk_range(unsigned int xid, struct file *src_file,
long cifs_ioctl(struct file *filep, unsigned int command, unsigned long arg);
void cifs_setsize(struct inode *inode, loff_t offset);
+struct fs_context;
struct smb3_fs_context;
-struct dentry *cifs_smb3_do_mount(struct file_system_type *fs_type, int flags,
+struct dentry *cifs_smb3_do_mount(struct fs_context *fc,
struct smb3_fs_context *old_ctx);
char *cifs_silly_fullpath(struct dentry *dentry);
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
index 4a25afda9448..a39572cbaadb 100644
--- a/fs/smb/client/cifsproto.h
+++ b/fs/smb/client/cifsproto.h
@@ -19,6 +19,7 @@
struct statfs;
struct smb_rqst;
struct smb3_fs_context;
+struct fs_context;
/*
*****************************************************************
@@ -236,7 +237,7 @@ void cifs_mount_put_conns(struct cifs_mount_ctx *mnt_ctx);
int cifs_mount_get_session(struct cifs_mount_ctx *mnt_ctx);
int cifs_is_path_remote(struct cifs_mount_ctx *mnt_ctx);
int cifs_mount_get_tcon(struct cifs_mount_ctx *mnt_ctx);
-int cifs_match_super(struct super_block *sb, void *data);
+int cifs_match_super(struct super_block *sb, struct fs_context *fc);
int cifs_mount(struct cifs_sb_info *cifs_sb, struct smb3_fs_context *ctx);
void cifs_umount(struct cifs_sb_info *cifs_sb);
void cifs_mark_open_files_invalid(struct cifs_tcon *tcon);
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index dcde25da468d..79762e6bbe50 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -6,6 +6,7 @@
*
*/
#include <linux/fs.h>
+#include <linux/fs_context.h>
#include <linux/net.h>
#include <linux/string.h>
#include <linux/sched/mm.h>
@@ -2991,9 +2992,9 @@ static int match_prepath(struct super_block *sb,
}
int
-cifs_match_super(struct super_block *sb, void *data)
+cifs_match_super(struct super_block *sb, struct fs_context *fc)
{
- struct cifs_mnt_data *mnt_data = data;
+ struct cifs_mnt_data *mnt_data = fc->sget_key;
struct smb3_fs_context *ctx;
struct cifs_sb_info *cifs_sb;
struct TCP_Server_Info *tcp_srv;
diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
index b9544eb0381b..6aba4e1c9c27 100644
--- a/fs/smb/client/fs_context.c
+++ b/fs/smb/client/fs_context.c
@@ -920,7 +920,7 @@ static int smb3_get_tree_common(struct fs_context *fc)
struct dentry *root;
int rc = 0;
- root = cifs_smb3_do_mount(fc->fs_type, 0, ctx);
+ root = cifs_smb3_do_mount(fc, ctx);
if (IS_ERR(root))
return PTR_ERR(root);
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/8] fs: retire sget()
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
` (2 preceding siblings ...)
2026-05-26 15:09 ` [PATCH 3/8] smb: client: convert cifs_smb3_do_mount() " Christian Brauner
@ 2026-05-26 15:09 ` Christian Brauner
2026-05-26 15:09 ` [PATCH 5/8] super: drop sb_lock from setup_bdev_super() tuple publication Christian Brauner
` (5 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2026-05-26 15:09 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro,
Christian Brauner (Amutable)
sget() and sget_fc() have lived side by side as near-duplicate
find-or-create-and-publish helpers for the legacy and fs_context mount
APIs. The three remaining in-tree callers (CIFS plus the ext4 extents
and mballoc KUnit tests) have all been moved to sget_fc(). Nothing
calls sget() anymore.
Delete sget() from fs/super.c and the prototype in <linux/fs.h>.
Update the two comments that referred to "sget()" or "sget{_fc}()" to
just say "sget_fc()".
This removes ~60 lines of code that only existed to be kept in
lockstep with sget_fc() on every superblock publish-path change.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/btrfs/super.c | 2 +-
fs/super.c | 71 ++++--------------------------------------------------
include/linux/fs.h | 4 ---
3 files changed, 6 insertions(+), 71 deletions(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b26aa9169e83..636154861d7c 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2052,7 +2052,7 @@ static int btrfs_get_tree_subvol(struct fs_context *fc)
* then open_ctree will properly initialize the file system specific
* settings later. btrfs_init_fs_info initializes the static elements
* of the fs_info (locks and such) to make cleanup easier if we find a
- * superblock with our given fs_devices later on at sget() time.
+ * superblock with our given fs_devices later on at sget_fc() time.
*/
fs_info = kvzalloc_obj(struct btrfs_fs_info);
if (!fs_info)
diff --git a/fs/super.c b/fs/super.c
index 378e81efe643..5fe8cea9f8fe 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -328,7 +328,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
init_rwsem(&s->s_umount);
lockdep_set_class(&s->s_umount, &type->s_umount_key);
/*
- * sget() can have s_umount recursion.
+ * sget_fc() can have s_umount recursion.
*
* When it cannot find a suitable sb, it allocates a new
* one (this one), and tries again to find a suitable old
@@ -439,7 +439,7 @@ static void kill_super_notify(struct super_block *sb)
/*
* Remove it from @fs_supers so it isn't found by new
- * sget{_fc}() walkers anymore. Any concurrent mounter still
+ * sget_fc() walkers anymore. Any concurrent mounter still
* managing to grab a temporary reference is guaranteed to
* already see SB_DYING and will wait until we notify them about
* SB_DEAD.
@@ -517,7 +517,7 @@ EXPORT_SYMBOL(deactivate_super);
* @sb: superblock to acquire
*
* Acquire a temporary reference on a superblock and try to trade it for
- * an active reference. This is used in sget{_fc}() to wait for a
+ * an active reference. This is used in sget_fc() to wait for a
* superblock to either become SB_BORN or for it to pass through
* sb->kill() and be marked as SB_DEAD.
*
@@ -673,11 +673,11 @@ void generic_shutdown_super(struct super_block *sb)
/*
* Broadcast to everyone that grabbed a temporary reference to this
* superblock before we removed it from @fs_supers that the superblock
- * is dying. Every walker of @fs_supers outside of sget{_fc}() will now
+ * is dying. Every walker of @fs_supers outside of sget_fc() will now
* discard this superblock and treat it as dead.
*
* We leave the superblock on @fs_supers so it can be found by
- * sget{_fc}() until we passed sb->kill_sb().
+ * sget_fc() until we passed sb->kill_sb().
*/
super_wake(sb, SB_DYING);
super_unlock_excl(sb);
@@ -808,67 +808,6 @@ struct super_block *sget_fc(struct fs_context *fc,
}
EXPORT_SYMBOL(sget_fc);
-/**
- * sget - find or create a superblock
- * @type: filesystem type superblock should belong to
- * @test: comparison callback
- * @set: setup callback
- * @flags: mount flags
- * @data: argument to each of them
- */
-struct super_block *sget(struct file_system_type *type,
- int (*test)(struct super_block *,void *),
- int (*set)(struct super_block *,void *),
- int flags,
- void *data)
-{
- struct user_namespace *user_ns = current_user_ns();
- struct super_block *s = NULL;
- struct super_block *old;
- int err;
-
-retry:
- spin_lock(&sb_lock);
- if (test) {
- hlist_for_each_entry(old, &type->fs_supers, s_instances) {
- if (!test(old, data))
- continue;
- if (user_ns != old->s_user_ns) {
- spin_unlock(&sb_lock);
- destroy_unused_super(s);
- return ERR_PTR(-EBUSY);
- }
- if (!grab_super(old))
- goto retry;
- destroy_unused_super(s);
- return old;
- }
- }
- if (!s) {
- spin_unlock(&sb_lock);
- s = alloc_super(type, flags, user_ns);
- if (!s)
- return ERR_PTR(-ENOMEM);
- goto retry;
- }
-
- err = set(s, data);
- if (err) {
- spin_unlock(&sb_lock);
- destroy_unused_super(s);
- return ERR_PTR(err);
- }
- s->s_type = type;
- strscpy(s->s_id, type->name, sizeof(s->s_id));
- list_add_tail(&s->s_list, &super_blocks);
- hlist_add_head(&s->s_instances, &type->fs_supers);
- spin_unlock(&sb_lock);
- get_filesystem(type);
- shrinker_register(s->s_shrink);
- return s;
-}
-EXPORT_SYMBOL(sget);
-
void drop_super(struct super_block *sb)
{
super_unlock_shared(sb);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 11559c513dfb..6dbe3218dc1e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2327,10 +2327,6 @@ void free_anon_bdev(dev_t);
struct super_block *sget_fc(struct fs_context *fc,
int (*test)(struct super_block *, struct fs_context *),
int (*set)(struct super_block *, struct fs_context *));
-struct super_block *sget(struct file_system_type *type,
- int (*test)(struct super_block *,void *),
- int (*set)(struct super_block *,void *),
- int flags, void *data);
struct super_block *sget_dev(struct fs_context *fc, dev_t dev);
/* Alas, no aliases. Too much hassle with bringing module.h everywhere */
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 5/8] super: drop sb_lock from setup_bdev_super() tuple publication
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
` (3 preceding siblings ...)
2026-05-26 15:09 ` [PATCH 4/8] fs: retire sget() Christian Brauner
@ 2026-05-26 15:09 ` Christian Brauner
2026-05-27 11:53 ` Christian Brauner
2026-05-26 15:09 ` [PATCH 6/8] super: convert sb->s_count to refcount_t Christian Brauner
` (4 subsequent siblings)
9 siblings, 1 reply; 15+ messages in thread
From: Christian Brauner @ 2026-05-26 15:09 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro,
Christian Brauner (Amutable)
The tuple {s_bdev_file, s_bdev, s_bdi, SB_I_STABLE_WRITES} written by
setup_bdev_super() is publication of immutable state, not list
integrity. The sb is already on @super_blocks and @fs_supers at this
point (sget_dev() -> sget_fc() put it there) but SB_BORN is unset, so
any iterator that calls super_lock() blocks on
wait_var_event(SB_BORN | SB_DYING).
The SUPER_ITER_UNLOCKED iterators (filesystems_freeze,
filesystems_thaw, do_emergency_remount) do not look at s_bdev, s_bdi
or s_iflags so they cannot observe a partial fill either.
When vfs_get_tree() later calls super_wake(sb, SB_BORN) it does
smp_store_release(&sb->s_flags, sb->s_flags | SB_BORN)
and any reader gating on SB_BORN via super_flags() loads sb->s_flags
with smp_load_acquire(). The release/acquire pair orders the four
prior writes against the load of SB_BORN.
s_iflags is a shared field so use WRITE_ONCE() on the
read-modify-write to keep the compiler from tearing the store.
retire_super() is the only other writer of s_iflags and only runs
against an already-born sb under s_umount.
This drops one of the five sb_lock acquisitions in the mount path
with no behavioural change for any reader.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/super.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index 5fe8cea9f8fe..c451f689c7b3 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1576,13 +1576,16 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
bdev_fput(bdev_file);
return -EBUSY;
}
- spin_lock(&sb_lock);
+ /*
+ * Publish before SB_BORN is set. super_wake(sb, SB_BORN) below uses
+ * smp_store_release(); any iterator that observes SB_BORN via
+ * super_flags()'s smp_load_acquire() sees these writes.
+ */
sb->s_bdev_file = bdev_file;
sb->s_bdev = bdev;
sb->s_bdi = bdi_get(bdev->bd_disk->bdi);
if (bdev_stable_writes(bdev))
- sb->s_iflags |= SB_I_STABLE_WRITES;
- spin_unlock(&sb_lock);
+ WRITE_ONCE(sb->s_iflags, sb->s_iflags | SB_I_STABLE_WRITES);
snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
shrinker_debugfs_rename(sb->s_shrink, "sb-%s:%s", sb->s_type->name,
--
2.47.3
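The release/acquire argument in the commit message above can be modelled in userspace with C11 atomics. This is a sketch under assumptions, not the kernel code: struct model_sb, MODEL_SB_BORN and both helpers are hypothetical names, and C11 memory_order_release / memory_order_acquire stand in for smp_store_release() / smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_SB_BORN 0x1u	/* hypothetical flag value */

struct model_sb {
	int bdev;		/* stands in for s_bdev and friends */
	int bdi;		/* stands in for s_bdi */
	atomic_uint flags;	/* stands in for s_flags */
};

/* Writer: fill the payload with plain stores, then publish with release. */
static void model_publish(struct model_sb *sb, int bdev, int bdi)
{
	unsigned int old;

	sb->bdev = bdev;
	sb->bdi = bdi;
	old = atomic_load_explicit(&sb->flags, memory_order_relaxed);
	atomic_store_explicit(&sb->flags, old | MODEL_SB_BORN,
			      memory_order_release);
}

/* Reader: only touch the payload after an acquire load sees the flag. */
static int model_read_bdev(struct model_sb *sb, int *bdev)
{
	unsigned int flags;

	flags = atomic_load_explicit(&sb->flags, memory_order_acquire);
	if (!(flags & MODEL_SB_BORN))
		return 0;	/* not born yet; payload must not be read */
	*bdev = sb->bdev;
	return 1;
}
```

The point of the model is that the payload fields need no lock of their own: a reader that refuses to look at them until it has observed the flag with acquire semantics is ordered after every store the writer made before its release store.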
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 6/8] super: convert sb->s_count to refcount_t
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
` (4 preceding siblings ...)
2026-05-26 15:09 ` [PATCH 5/8] super: drop sb_lock from setup_bdev_super() tuple publication Christian Brauner
@ 2026-05-26 15:09 ` Christian Brauner
2026-05-26 15:09 ` [PATCH 7/8] super: switch list manipulation to _rcu primitives Christian Brauner
` (3 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2026-05-26 15:09 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro,
Christian Brauner (Amutable)
s_count is the temporary-reference count used to pin a superblock
across the spinlock-to-rwsem hop in every iterator and in
grab_super(). It's a plain int incremented and decremented only under
sb_lock.
Convert it to refcount_t. No semantic change yet: every increment
still happens with sb_lock held, so observation of a live ref is
still serialised by the lock. The increments use refcount_inc()
rather than refcount_inc_not_zero() because every callsite is still
looking at an sb known to be live under sb_lock.
This prepares the ground for switching iterators to RCU readers in a
later patch, at which point refcount_inc_not_zero() becomes the right
primitive at the lockless pin sites.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/super.c | 14 +++++++-------
include/linux/fs/super_types.h | 3 ++-
2 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index c451f689c7b3..2fa7023010ec 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -366,7 +366,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
spin_lock_init(&s->s_inode_wblist_lock);
fserror_mount(s);
- s->s_count = 1;
+ refcount_set(&s->s_count, 1);
atomic_set(&s->s_active, 1);
mutex_init(&s->s_vfs_rename_mutex);
lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
@@ -406,7 +406,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
*/
static void __put_super(struct super_block *s)
{
- if (!--s->s_count) {
+ if (refcount_dec_and_test(&s->s_count)) {
list_del_init(&s->s_list);
WARN_ON(s->s_dentry_lru.node);
WARN_ON(s->s_inode_lru.node);
@@ -528,7 +528,7 @@ static bool grab_super(struct super_block *sb)
{
bool locked;
- sb->s_count++;
+ refcount_inc(&sb->s_count);
spin_unlock(&sb_lock);
locked = super_lock_excl(sb);
if (locked) {
@@ -857,7 +857,7 @@ static void __iterate_supers(void (*f)(struct super_block *, void *), void *arg,
sb = next_super(sb, flags)) {
if (super_flags(sb, SB_DYING))
continue;
- sb->s_count++;
+ refcount_inc(&sb->s_count);
spin_unlock(&sb_lock);
if (flags & SUPER_ITER_UNLOCKED) {
@@ -902,7 +902,7 @@ void iterate_supers_type(struct file_system_type *type,
if (super_flags(sb, SB_DYING))
continue;
- sb->s_count++;
+ refcount_inc(&sb->s_count);
spin_unlock(&sb_lock);
locked = super_lock_shared(sb);
@@ -934,7 +934,7 @@ struct super_block *user_get_super(dev_t dev, bool excl)
if (sb->s_dev != dev)
continue;
- sb->s_count++;
+ refcount_inc(&sb->s_count);
spin_unlock(&sb_lock);
locked = super_lock(sb, excl);
@@ -1368,7 +1368,7 @@ static struct super_block *bdev_super_lock(struct block_device *bdev, bool excl)
/* Make sure sb doesn't go away from under us */
spin_lock(&sb_lock);
- sb->s_count++;
+ refcount_inc(&sb->s_count);
spin_unlock(&sb_lock);
mutex_unlock(&bdev->bd_holder_lock);
diff --git a/include/linux/fs/super_types.h b/include/linux/fs/super_types.h
index 383050e7fdf5..3a8cc0c723a8 100644
--- a/include/linux/fs/super_types.h
+++ b/include/linux/fs/super_types.h
@@ -11,6 +11,7 @@
#include <linux/uidgid.h>
#include <linux/uuid.h>
#include <linux/percpu-rwsem.h>
+#include <linux/refcount.h>
#include <linux/workqueue_types.h>
#include <linux/quota.h>
@@ -145,7 +146,7 @@ struct super_block {
unsigned long s_magic;
struct dentry *s_root;
struct rw_semaphore s_umount;
- int s_count;
+ refcount_t s_count;
atomic_t s_active;
#ifdef CONFIG_SECURITY
void *s_security;
--
2.47.3
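The difference between the two increment primitives named in the commit message above can be sketched with C11 atomics. This is a simplified userspace model, assuming hypothetical names (struct model_ref and the model_ref_*() helpers). It deliberately omits the saturation and warning behaviour of the kernel's refcount_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_ref {
	atomic_uint count;
};

/* Unconditional increment: only valid if the caller knows count > 0,
 * for example because a lock keeps the object alive. */
static void model_ref_inc(struct model_ref *r)
{
	atomic_fetch_add_explicit(&r->count, 1, memory_order_relaxed);
}

/* Conditional increment for lockless lookups: refuses to resurrect an
 * object whose count has already dropped to zero. */
static bool model_ref_inc_not_zero(struct model_ref *r)
{
	unsigned int old = atomic_load_explicit(&r->count,
						memory_order_relaxed);

	do {
		if (old == 0)
			return false;
	} while (!atomic_compare_exchange_weak_explicit(&r->count, &old,
							old + 1,
							memory_order_acquire,
							memory_order_relaxed));
	return true;
}

/* Drop a reference; returns true when the last one went away. */
static bool model_ref_dec_and_test(struct model_ref *r)
{
	return atomic_fetch_sub_explicit(&r->count, 1,
					 memory_order_release) == 1;
}
```

While every pin happens under a lock that also excludes the final put, the unconditional form is sufficient. Once a reader can reach the object without that lock, only the conditional form can tell a live object from one that is already on its way to being freed.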
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 7/8] super: switch list manipulation to _rcu primitives
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
` (5 preceding siblings ...)
2026-05-26 15:09 ` [PATCH 6/8] super: convert sb->s_count to refcount_t Christian Brauner
@ 2026-05-26 15:09 ` Christian Brauner
2026-05-26 15:09 ` [PATCH 8/8] super: convert iterators to RCU readers + refcount_inc_not_zero Christian Brauner
` (2 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2026-05-26 15:09 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro,
Christian Brauner (Amutable)
Swap the list/hlist write-side operations on @super_blocks and
@fs_type->fs_supers over to their _rcu variants. All three call sites
still hold sb_lock; this is a purely mechanical change that
establishes the writer-side memory ordering lockless RCU readers can
rely on in the next patch.
The affected sites are sget_fc() (list_add_tail() and
hlist_add_head() at the publish step), __put_super()
(list_del_init() -> list_bidir_del_rcu() of s_list when the last
temporary reference is dropped) and kill_super_notify()
(hlist_del_init() -> hlist_del_rcu() of s_instances).
@super_blocks gets list_bidir_del_rcu() rather than list_del_rcu()
because the next patch walks the list backward for
filesystems_freeze() and do_emergency_remount(). list_del_rcu()
preserves the unlinked entry's ->next pointer but poisons ->prev with
LIST_POISON2, which would crash any concurrent reverse traversal that
landed on the just-unlinked entry between the SB_DYING check and the
cursor advance. list_bidir_del_rcu() preserves both ->next and
->prev so reverse traversal stays safe. See kernel/nstree.c for the
canonical bidirectional-RCU list pattern.
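The difference between the two deletion flavours can be illustrated with a
small userspace model. This is only a sketch: the model_* helpers and
MODEL_POISON2 are hypothetical stand-ins, not the kernel's definitions, and
the model ignores memory ordering and grace periods entirely. It only shows
which pointers of the unlinked entry stay usable.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; these are NOT the kernel's definitions. */
struct list_head {
	struct list_head *next, *prev;
};

/* Stand-in for LIST_POISON2: a pointer value that must never be followed. */
#define MODEL_POISON2 ((struct list_head *)0x122)

static void model_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void model_list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Neighbours skip the entry; the entry's own pointers are left untouched. */
static void model_unlink(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Models list_del_rcu(): ->next survives, ->prev is poisoned. */
static void model_list_del_rcu(struct list_head *entry)
{
	model_unlink(entry);
	entry->prev = MODEL_POISON2;
}

/* Models list_bidir_del_rcu(): both ->next and ->prev survive. */
static void model_list_bidir_del_rcu(struct list_head *entry)
{
	model_unlink(entry);
}
```

In this model, a reverse walker parked on an entry removed with
model_list_bidir_del_rcu() can still step to a live neighbour through ->prev,
whereas after model_list_del_rcu() the same step would follow the poison
value.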
Dropping the "_init" half of the deletions is fine because nothing
reuses these list nodes after removal. The entry is either about to be
freed via call_rcu(destroy_super_rcu) (for s_list) or disappears with
the superblock (for s_instances, once the list has done its job
notifying SB_DEAD waiters).
Iterators keep using plain list_for_each_entry() and
hlist_for_each_entry() under sb_lock. Their conversion to lockless
RCU traversal with refcount_inc_not_zero() is the next patch.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/super.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index 2fa7023010ec..8c01b95be717 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -407,7 +407,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
static void __put_super(struct super_block *s)
{
if (refcount_dec_and_test(&s->s_count)) {
- list_del_init(&s->s_list);
+ list_bidir_del_rcu(&s->s_list);
WARN_ON(s->s_dentry_lru.node);
WARN_ON(s->s_inode_lru.node);
WARN_ON(s->s_mounts);
@@ -445,7 +445,7 @@ static void kill_super_notify(struct super_block *sb)
* SB_DEAD.
*/
spin_lock(&sb_lock);
- hlist_del_init(&sb->s_instances);
+ hlist_del_rcu(&sb->s_instances);
spin_unlock(&sb_lock);
/*
@@ -784,8 +784,8 @@ struct super_block *sget_fc(struct fs_context *fc,
* It's in a nascent state and users should wait on SB_BORN or
* SB_DYING to be set.
*/
- list_add_tail(&s->s_list, &super_blocks);
- hlist_add_head(&s->s_instances, &s->s_type->fs_supers);
+ list_add_tail_rcu(&s->s_list, &super_blocks);
+ hlist_add_head_rcu(&s->s_instances, &s->s_type->fs_supers);
spin_unlock(&sb_lock);
get_filesystem(s->s_type);
shrinker_register(s->s_shrink);
--
2.47.3
* [PATCH 8/8] super: convert iterators to RCU readers + refcount_inc_not_zero
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
` (6 preceding siblings ...)
2026-05-26 15:09 ` [PATCH 7/8] super: switch list manipulation to _rcu primitives Christian Brauner
@ 2026-05-26 15:09 ` Christian Brauner
2026-05-27 11:54 ` [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
2026-05-28 11:18 ` Jan Kara
9 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2026-05-26 15:09 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro,
Christian Brauner (Amutable)
Walk @super_blocks and @fs_supers under rcu_read_lock() and pin the
current entry with refcount_inc_not_zero() instead of holding sb_lock
across the cursor advance. sb_lock was only there to keep the
cursor's ->next / ->prev pointer from being mutated by concurrent
list_del / list_add. RCU semantics give us that guarantee directly:
list_bidir_del_rcu() preserves both ->next and ->prev on the
unlinked entry and list_add_tail_rcu() publishes new entries with
the release barrier set up by the previous patch.
The pattern at each iterator is:
    rcu_read_lock();
    list_for_each_entry_rcu(sb, ...) {
            if (SB_DYING) continue;
            if (!refcount_inc_not_zero(&sb->s_count)) continue;
            rcu_read_unlock();
            ... /* may sleep on s_umount */
            if (prev)
                    put_super(prev);
            prev = sb;
            rcu_read_lock(); /* prev pinned: prev->{next,prev} valid */
    }
    rcu_read_unlock();
    if (prev)
            put_super(prev);
While we hold a pin on @prev, __put_super() cannot reach the
refcount_dec_and_test() transition that drives list_bidir_del_rcu().
So @prev stays on the list and concurrent list_bidir_del_rcu() of
other entries keeps @prev->s_list.{next,prev} pointing at the still-
live neighbour (or the head sentinel). The cursor advance after
re-acquiring rcu_read_lock() is therefore always against a live
chain in whichever direction we're walking.
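The pin semantics this argument relies on can be sketched in userspace with
C11 atomics. This is a hedged model, not the kernel's refcount_t: struct
model_refcount and the model_* functions are hypothetical names, and the
saturation and warning behaviour of the real API is left out. The point it
shows is that an increment only succeeds while the count is still non-zero,
so a walker can never resurrect an entry whose last reference is already
gone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for refcount_t. */
struct model_refcount {
	atomic_uint refs;
};

/* Take a reference only if at least one reference is still held. */
static bool model_inc_not_zero(struct model_refcount *r)
{
	unsigned int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

	while (old != 0) {
		/* On failure 'old' is reloaded, so the zero check is repeated. */
		if (atomic_compare_exchange_weak_explicit(&r->refs, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Drop a reference; returns true only for the final drop. */
static bool model_dec_and_test(struct model_refcount *r)
{
	return atomic_fetch_sub_explicit(&r->refs, 1,
					 memory_order_acq_rel) == 1;
}
```

Mapped back onto the iterators, a successful model_inc_not_zero() corresponds
to the pin on @prev, and the unlink from the list would only happen in the
branch where model_dec_and_test() returns true.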
put_super() now appears in the middle of the loop where __put_super()
used to be called with sb_lock held. It briefly takes sb_lock for
the trailing-ref drop; in the common case refcount_dec_and_test()
returns false and the lock is held for only a handful of cycles.
first_super() and next_super() switch the forward arm to READ_ONCE()
on the head and cursor ->next pointers and the reverse arm to
rcu_dereference(list_bidir_prev_rcu(...)). The forward arm matches
the semantics of list_entry_rcu() used internally by
list_for_each_entry_rcu(); the reverse arm is the canonical
bidirectional-RCU traversal pattern (see kernel/nstree.c) and is
needed because filesystems_freeze() and do_emergency_remount() pass
SUPER_ITER_REVERSE.
iterate_supers_type() and user_get_super() get the same treatment.
user_get_super() simplifies further: if super_lock() succeeds on the
matching entry we return it with the pin held; if it fails because the
superblock turned out to be SB_DYING we put_super() and return NULL.
sget_fc() and grab_super() are not touched here.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/super.c | 71 +++++++++++++++++++++++++++++++++-----------------------------
1 file changed, 38 insertions(+), 33 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index 8c01b95be717..d9b1148f7030 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -831,17 +831,25 @@ enum super_iter_flags_t {
static inline struct super_block *first_super(enum super_iter_flags_t flags)
{
+ struct list_head *next;
+
if (flags & SUPER_ITER_REVERSE)
- return list_last_entry(&super_blocks, struct super_block, s_list);
- return list_first_entry(&super_blocks, struct super_block, s_list);
+ next = rcu_dereference(list_bidir_prev_rcu(&super_blocks));
+ else
+ next = READ_ONCE(super_blocks.next);
+ return list_entry(next, struct super_block, s_list);
}
static inline struct super_block *next_super(struct super_block *sb,
enum super_iter_flags_t flags)
{
+ struct list_head *next;
+
if (flags & SUPER_ITER_REVERSE)
- return list_prev_entry(sb, s_list);
- return list_next_entry(sb, s_list);
+ next = rcu_dereference(list_bidir_prev_rcu(&sb->s_list));
+ else
+ next = READ_ONCE(sb->s_list.next);
+ return list_entry(next, struct super_block, s_list);
}
static void __iterate_supers(void (*f)(struct super_block *, void *), void *arg,
@@ -850,15 +858,15 @@ static void __iterate_supers(void (*f)(struct super_block *, void *), void *arg,
struct super_block *sb, *p = NULL;
bool excl = flags & SUPER_ITER_EXCL;
- guard(spinlock)(&sb_lock);
-
+ rcu_read_lock();
for (sb = first_super(flags);
!list_entry_is_head(sb, &super_blocks, s_list);
sb = next_super(sb, flags)) {
if (super_flags(sb, SB_DYING))
continue;
- refcount_inc(&sb->s_count);
- spin_unlock(&sb_lock);
+ if (!refcount_inc_not_zero(&sb->s_count))
+ continue;
+ rcu_read_unlock();
if (flags & SUPER_ITER_UNLOCKED) {
f(sb, arg);
@@ -867,13 +875,14 @@ static void __iterate_supers(void (*f)(struct super_block *, void *), void *arg,
super_unlock(sb, excl);
}
- spin_lock(&sb_lock);
if (p)
- __put_super(p);
+ put_super(p);
p = sb;
+ rcu_read_lock();
}
+ rcu_read_unlock();
if (p)
- __put_super(p);
+ put_super(p);
}
void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
@@ -895,15 +904,15 @@ void iterate_supers_type(struct file_system_type *type,
{
struct super_block *sb, *p = NULL;
- spin_lock(&sb_lock);
- hlist_for_each_entry(sb, &type->fs_supers, s_instances) {
+ rcu_read_lock();
+ hlist_for_each_entry_rcu(sb, &type->fs_supers, s_instances) {
bool locked;
if (super_flags(sb, SB_DYING))
continue;
-
- refcount_inc(&sb->s_count);
- spin_unlock(&sb_lock);
+ if (!refcount_inc_not_zero(&sb->s_count))
+ continue;
+ rcu_read_unlock();
locked = super_lock_shared(sb);
if (locked) {
@@ -911,14 +920,14 @@ void iterate_supers_type(struct file_system_type *type,
super_unlock_shared(sb);
}
- spin_lock(&sb_lock);
if (p)
- __put_super(p);
+ put_super(p);
p = sb;
+ rcu_read_lock();
}
+ rcu_read_unlock();
if (p)
- __put_super(p);
- spin_unlock(&sb_lock);
+ put_super(p);
}
EXPORT_SYMBOL(iterate_supers_type);
@@ -927,25 +936,21 @@ struct super_block *user_get_super(dev_t dev, bool excl)
{
struct super_block *sb;
- spin_lock(&sb_lock);
- list_for_each_entry(sb, &super_blocks, s_list) {
- bool locked;
-
+ rcu_read_lock();
+ list_for_each_entry_rcu(sb, &super_blocks, s_list) {
if (sb->s_dev != dev)
continue;
+ if (!refcount_inc_not_zero(&sb->s_count))
+ continue;
+ rcu_read_unlock();
- refcount_inc(&sb->s_count);
- spin_unlock(&sb_lock);
-
- locked = super_lock(sb, excl);
- if (locked)
+ if (super_lock(sb, excl))
return sb;
- spin_lock(&sb_lock);
- __put_super(sb);
- break;
+ put_super(sb);
+ return NULL;
}
- spin_unlock(&sb_lock);
+ rcu_read_unlock();
return NULL;
}
--
2.47.3
* Re: [PATCH 0/8] super: retire sget(), convert iterators to RCU
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
` (7 preceding siblings ...)
2026-05-26 15:09 ` [PATCH 8/8] super: convert iterators to RCU readers + refcount_inc_not_zero Christian Brauner
@ 2026-05-27 11:54 ` Christian Brauner
2026-05-28 11:18 ` Jan Kara
9 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2026-05-27 11:54 UTC (permalink / raw)
To: linux-fsdevel
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ritesh Harjani (IBM),
linux-ext4, linux-cifs, Alexander Viro
On Tue, May 26, 2026 at 05:09:02PM +0200, Christian Brauner wrote:
> * retire sget(): CIFS plus the two ext4 KUnit tests (extents-test,
>
> * Walk @super_blocks and @type->fs_supers under RCU, pinned by
Can't work as I originally envisioned.
* Re: [PATCH 0/8] super: retire sget(), convert iterators to RCU
2026-05-26 15:09 [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
` (8 preceding siblings ...)
2026-05-27 11:54 ` [PATCH 0/8] super: retire sget(), convert iterators to RCU Christian Brauner
@ 2026-05-28 11:18 ` Jan Kara
9 siblings, 0 replies; 15+ messages in thread
From: Jan Kara @ 2026-05-28 11:18 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Theodore Ts'o, Andreas Dilger, Jan Kara,
Ritesh Harjani (IBM), linux-ext4, linux-cifs, Alexander Viro
On Tue 26-05-26 17:09:02, Christian Brauner wrote:
> * retire sget(): CIFS plus the two ext4 KUnit tests (extents-test,
> mballoc-test) were the last in-tree callers, and all three convert
> cleanly to sget_fc(). That lets sget() and its prototype come out,
> taking ~60 lines that only existed to be kept in lockstep with
> sget_fc() on every publish-path change.
This is definitely a good cleanup!
> * Walk @super_blocks and @type->fs_supers under RCU, pinned by
> refcount_inc_not_zero(&sb->s_count). iterate_supers(),
> iterate_supers_type(), user_get_super(), do_emergency_remount(),
> filesystems_freeze() and filesystems_thaw() no longer hold sb_lock
> across the cursor advance.
>
> The conversion goes in four small steps. Drop sb_lock from
> setup_bdev_super(): the {s_bdev_file, s_bdev, s_bdi,
> SB_I_STABLE_WRITES} tuple is publication of immutable state, and
> SB_BORN already gates every reader via super_wake()'s
> smp_store_release paired with super_flags()'s smp_load_acquire. Then
> convert sb->s_count to refcount_t -- mechanical, every increment is
> still under sb_lock. Then switch the write-side list/hlist ops to
> their _rcu variants; @super_blocks gets list_bidir_del_rcu() so the
> reverse-walking iterators (filesystems_freeze, do_emergency_remount)
> keep a valid ->prev on the unlinked entry, matching the canonical
> pattern in kernel/nstree.c. Finally, convert the iterators themselves:
> cursor advance via READ_ONCE / rcu_dereference, with the previous
> entry kept pinned via its s_count across the rcu_read_unlock ->
> callback -> rcu_read_lock cycle.
So I guess the motivation for getting rid of sb_lock is some contention on
it you can observe? When exactly? It would be nice to mention the
motivation as a justification for the additional complexity...
Honza
>
> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
> ---
> Christian Brauner (8):
> ext4: convert extents KUnit test to sget_fc()
> ext4: convert mballoc KUnit test to sget_fc()
> smb: client: convert cifs_smb3_do_mount() to sget_fc()
> fs: retire sget()
> super: drop sb_lock from setup_bdev_super() tuple publication
> super: convert sb->s_count to refcount_t
> super: switch list manipulation to _rcu primitives
> super: convert iterators to RCU readers + refcount_inc_not_zero
>
> fs/btrfs/super.c | 2 +-
> fs/ext4/extents-test.c | 22 +++++-
> fs/ext4/mballoc-test.c | 17 ++++-
> fs/smb/client/cifsfs.c | 43 ++++++-----
> fs/smb/client/cifsfs.h | 3 +-
> fs/smb/client/cifsproto.h | 3 +-
> fs/smb/client/connect.c | 5 +-
> fs/smb/client/fs_context.c | 2 +-
> fs/super.c | 167 ++++++++++++++---------------------------
> include/linux/fs.h | 4 -
> include/linux/fs/super_types.h | 3 +-
> 11 files changed, 127 insertions(+), 144 deletions(-)
> ---
> base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
> change-id: 20260526-work-sget-6bc80b96cba5
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR