[PATCH v6 0/8] btrfs: use fs_holder

public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs
@ 2025-06-30  5:29 Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 1/8] btrfs: always open the device read-only in btrfs_scan_one_device Qu Wenruo
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs

[CHANGELOG]
v6:
- Fix an error handling bug that can lead to use-after-free
  Reported by syzbot, that inside btrfs_get_tree_super() that if we
  didn't open the devices, there are corner cases that
  fs_info->fs_devices can be freed twice, causing use-after-free bug.

  This one fixed two error paths:
  * sget_fc() failure
    Which is not the one reported by syzbot, but still possible to hit.

  * btrfs_open_devices() failure
    Which I believe is the one reported by syzbot.

  There is a dedicated fix pushed into linux-next.

  This refreshed series is for the proper merge into our for-next
  branch.

v5:
- Fix a tailing whitespace
  This introduced by patch "btrfs: add comments to make super block
  creation more clear", and that patch is created during a small
  window where my commit checkpatch hook is broken.

  And unfortunately that comment is also later updated by several
  patches, causing several conflicts with that whitespace error fixed.

v4:
- Fix a lockdep error
  In the patch "btrfs: delay btrfs_open_devices() until super block is
  created", we call sget_fc() with uuid_mutex locked.
  But during fs closing, we also try to lock uuid_mutex with s_umount
  locked.

  This leads to a reserved lock sequence and resuled a lockdep warning.

  Fix it by introducing btrfs_fs_devices::holding (aka, the old solution
  introduced by Christoph), but this time with no extra bugs during
  fstests.

- Add the patch to use fs_holder_ops
  This patch is small and properly tested, it's more situable to include
  this one here, other than delaying it to the next devloss feature.

- Add the missing patch to always open device-readonly when scanning
  My bad, there are a little too many patches pending, and I forgot to
  include the first patch.

v3:
- Drop the btrfs_fs_devices::opened split
  It turns out to cause problems during tests.

- Extra cleanup related to the btrfs_get_tree_*()
  Now the re-entry through vfs_get_tree() is completely dropped.

- Extra comments explaining the sget_fc() behavior

- Call bdev_fput() instead of fput()
  This alignes us to all the other fses.

- Updated patch to delay btrfs_open_devices() until sget_fc()
  Instead of relying on the previous solution (split
  btrfs_open_devices::opened), just expand the uuid_mutex critical
  section.


Christoph Hellwig (3):
  btrfs: always open the device read-only in btrfs_scan_one_device
  btrfs: call btrfs_close_devices from ->kill_sb
  btrfs: use the super_block as holder when mounting file systems

Qu Wenruo (5):
  btrfs: get rid of the re-entry of btrfs_get_tree()
  btrfs: add comments to make super block creation more clear
  btrfs: call bdev_fput() to reclaim the blk_holder immediately
  btrfs: delay btrfs_open_devices() until super block is created
  btrfs: use fs_holder_ops for all opened devices

 fs/btrfs/dev-replace.c |   4 +-
 fs/btrfs/disk-io.c     |   4 +-
 fs/btrfs/fs.h          |   2 -
 fs/btrfs/ioctl.c       |   4 +-
 fs/btrfs/super.c       | 129 ++++++++++++++++++++++-------------------
 fs/btrfs/volumes.c     |  33 ++++++-----
 fs/btrfs/volumes.h     |  27 ++++++++-
 7 files changed, 119 insertions(+), 84 deletions(-)

-- 
2.50.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v6 1/8] btrfs: always open the device read-only in btrfs_scan_one_device
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
@ 2025-06-30  5:29 ` Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 2/8] btrfs: get rid of the re-entry of btrfs_get_tree() Qu Wenruo
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig, Johannes Thumshirn

From: Christoph Hellwig <hch@lst.de>

btrfs_scan_one_device opens the block device only to read the super
block.  Instead of passing a blk_mode_t argument to sometimes open
it for writing, just hard code BLK_OPEN_READ as it will never write
to the device or hand the block_device out to someone else.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/super.c   | 9 ++++-----
 fs/btrfs/volumes.c | 4 ++--
 fs/btrfs/volumes.h | 2 +-
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 2d0d8c6e77b4..b9e08a59da4e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -364,10 +364,9 @@ static int btrfs_parse_param(struct fs_context *fc, struct fs_parameter *param)
 		break;
 	case Opt_device: {
 		struct btrfs_device *device;
-		blk_mode_t mode = btrfs_open_mode(fc);
 
 		mutex_lock(&uuid_mutex);
-		device = btrfs_scan_one_device(param->string, mode, false);
+		device = btrfs_scan_one_device(param->string, false);
 		mutex_unlock(&uuid_mutex);
 		if (IS_ERR(device))
 			return PTR_ERR(device);
@@ -1855,7 +1854,7 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 	 * With 'true' passed to btrfs_scan_one_device() (mount time) we expect
 	 * either a valid device or an error.
 	 */
-	device = btrfs_scan_one_device(fc->source, mode, true);
+	device = btrfs_scan_one_device(fc->source, true);
 	ASSERT(device != NULL);
 	if (IS_ERR(device)) {
 		mutex_unlock(&uuid_mutex);
@@ -2233,7 +2232,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 		 * Scanning outside of mount can return NULL which would turn
 		 * into 0 error code.
 		 */
-		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
+		device = btrfs_scan_one_device(vol->name, false);
 		ret = PTR_ERR_OR_ZERO(device);
 		mutex_unlock(&uuid_mutex);
 		break;
@@ -2251,7 +2250,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 		 * Scanning outside of mount can return NULL which would turn
 		 * into 0 error code.
 		 */
-		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
+		device = btrfs_scan_one_device(vol->name, false);
 		if (IS_ERR_OR_NULL(device)) {
 			mutex_unlock(&uuid_mutex);
 			if (IS_ERR(device))
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c99aec904e16..bb63729770d9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1441,7 +1441,7 @@ static bool btrfs_skip_registration(struct btrfs_super_block *disk_super,
  * the device or return an error. Multi-device and seeding devices are registered
  * in both cases.
  */
-struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
+struct btrfs_device *btrfs_scan_one_device(const char *path,
 					   bool mount_arg_dev)
 {
 	struct btrfs_super_block *disk_super;
@@ -1462,7 +1462,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 	 * values temporarily, as the device paths of the fsid are the only
 	 * required information for assembling the volume.
 	 */
-	bdev_file = bdev_file_open_by_path(path, flags, NULL, NULL);
+	bdev_file = bdev_file_open_by_path(path, BLK_OPEN_READ, NULL, NULL);
 	if (IS_ERR(bdev_file))
 		return ERR_CAST(bdev_file);
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 6d8b1f38e3ee..afa71d315c46 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -719,7 +719,7 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
 void btrfs_mapping_tree_free(struct btrfs_fs_info *fs_info);
 int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 		       blk_mode_t flags, void *holder);
-struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
+struct btrfs_device *btrfs_scan_one_device(const char *path,
 					   bool mount_arg_dev);
 int btrfs_forget_devices(dev_t devt);
 void btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 2/8] btrfs: get rid of the re-entry of btrfs_get_tree()
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 1/8] btrfs: always open the device read-only in btrfs_scan_one_device Qu Wenruo
@ 2025-06-30  5:29 ` Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 3/8] btrfs: add comments to make super block creation more clear Qu Wenruo
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs

[EXISTING PROBLEM]
Currently btrfs mount is split into two parts:

- btrfs_get_tree_subvol()
  Which setups the very basic fs_info, and eventually call
  mount_subvol() to mount the target subvolume.

- btrfs_get_tree_super()
  This is the part doing super block allocation and if there is no
  existing super block, do the real open_ctree() to open the fs.

However currently we're doing this in a complex re-entry fashion:

vfs_get_tree()
|- btrfs_get_tree()
   |- btrfs_get_tree_subvol()
      |- vfs_get_tree()
      |  |- btrfs_get_tree()
      |     |- btrfs_get_tree_super()
      |- mount_subvol()

This is definitely not that easy to grasp.

[ENHANCEMENT]
The function vfs_get_tree() is only doing the following works:

- Call get_tree() call back
- Call super_wake()
- Call security_sb_set_mnt_opts()

In our case, super_wake() can be skipped, as after
btrfs_get_tree_subvol() finished, vfs_get_tree() will call super_wake()
on the super block we got anyway.

The same applies to security_sb_set_mnt_opts(), as long as we do not
free the security from our original fc in btrfs_get_tree_subvol(), the
first vfs_get_tree() call will handle the security correctly.

So here we only need to:

- Replace vfs_get_tree() call with btrfs_get_tree_super()

- Keep the existing fc->security for vfs_get_tree() to handle the
  security

This will remove the re-entry behavior and make thing much easier to
follow.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/super.c | 27 +++------------------------
 1 file changed, 3 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b9e08a59da4e..d977d2da985e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2046,15 +2046,7 @@ static int btrfs_get_tree_subvol(struct fs_context *fc)
 	 */
 	dup_fc->s_fs_info = fs_info;
 
-	/*
-	 * We'll do the security settings in our btrfs_get_tree_super() mount
-	 * loop, they were duplicated into dup_fc, we can drop the originals
-	 * here.
-	 */
-	security_free_mnt_opts(&fc->security);
-	fc->security = NULL;
-
-	ret = vfs_get_tree(dup_fc);
+	ret = btrfs_get_tree_super(dup_fc);
 	if (ret)
 		goto error;
 
@@ -2086,21 +2078,8 @@ static int btrfs_get_tree_subvol(struct fs_context *fc)
 
 static int btrfs_get_tree(struct fs_context *fc)
 {
-	/*
-	 * Since we use mount_subtree to mount the default/specified subvol, we
-	 * have to do mounts in two steps.
-	 *
-	 * First pass through we call btrfs_get_tree_subvol(), this is just a
-	 * wrapper around fc_mount() to call back into here again, and this time
-	 * we'll call btrfs_get_tree_super().  This will do the open_ctree() and
-	 * everything to open the devices and file system.  Then we return back
-	 * with a fully constructed vfsmount in btrfs_get_tree_subvol(), and
-	 * from there we can do our mount_subvol() call, which will lookup
-	 * whichever subvol we're mounting and setup this fc with the
-	 * appropriate dentry for the subvol.
-	 */
-	if (fc->s_fs_info)
-		return btrfs_get_tree_super(fc);
+	ASSERT(fc->s_fs_info == NULL);
+
 	return btrfs_get_tree_subvol(fc);
 }
 
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 3/8] btrfs: add comments to make super block creation more clear
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 1/8] btrfs: always open the device read-only in btrfs_scan_one_device Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 2/8] btrfs: get rid of the re-entry of btrfs_get_tree() Qu Wenruo
@ 2025-06-30  5:29 ` Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 4/8] btrfs: call btrfs_close_devices from ->kill_sb Qu Wenruo
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs

When calling sget_fc(), there are 3 different situations:

a) Critical error
   No super block created.

b) A new super block is created
   The fc->s_fs_info is transferred to the super block, and fc->s_fs_info
   is reset to NULL.

   In this case sb->s_root should still be NULL, and needs to be properly
   initialized later by btrfs_fill_super().

c) An existing super block is returned
   The fc->s_fs_info is untouched, and anything related to that fs_info
   should be properly cleaned up.

This is not obvious even with the extra comments at sget_fc().

Enhance the situation by:

- Add comments for case b) and c)
  Especially for case c), the fs_info and fs_devices cleanup happens at
  different timing, thus needs extra explanation.

- Move the comments closer to case b) and case c)

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/super.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d977d2da985e..b6bc262acf71 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1876,15 +1876,6 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 
 	bdev = fs_devices->latest_dev->bdev;
 
-	/*
-	 * From now on the error handling is not straightforward.
-	 *
-	 * If successful, this will transfer the fs_info into the super block,
-	 * and fc->s_fs_info will be NULL.  However if there's an existing
-	 * super, we'll still have fc->s_fs_info populated.  If we error
-	 * completely out it'll be cleaned up when we drop the fs_context,
-	 * otherwise it's tied to the lifetime of the super_block.
-	 */
 	sb = sget_fc(fc, btrfs_fc_test_super, set_anon_super_fc);
 	if (IS_ERR(sb)) {
 		ret = PTR_ERR(sb);
@@ -1894,6 +1885,20 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 	set_device_specific_options(fs_info);
 
 	if (sb->s_root) {
+		/*
+		 * Not the first mount of the fs thus got an existing super block.
+		 *
+		 * Will reuse the returned super block, fs_info and fs_devices.
+		 */
+		ASSERT(fc->s_fs_info == fs_info);
+
+		/*
+		 * fc->s_fs_info is not touched and will be later freed by
+		 * put_fs_context() through btrfs_free_fs_context().
+		 *
+		 * But we have opened fs_devices at the beginning of the
+		 * function, thus still need to close them manually.
+		 */
 		btrfs_close_devices(fs_devices);
 		/*
 		 * At this stage we may have RO flag mismatch between
@@ -1902,6 +1907,13 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 		 * needed.
 		 */
 	} else {
+		/*
+		 * The first mount of the fs thus a new superblock, fc->s_fs_info
+		 * should be NULL, and the owner ship of our fs_info and fs_devices is
+		 * transferred to the super block.
+		 */
+		ASSERT(fc->s_fs_info == NULL);
+
 		snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
 		shrinker_debugfs_rename(sb->s_shrink, "sb-btrfs:%s", sb->s_id);
 		btrfs_sb(sb)->bdev_holder = &btrfs_fs_type;
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 4/8] btrfs: call btrfs_close_devices from ->kill_sb
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
                   ` (2 preceding siblings ...)
  2025-06-30  5:29 ` [PATCH v6 3/8] btrfs: add comments to make super block creation more clear Qu Wenruo
@ 2025-06-30  5:29 ` Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 5/8] btrfs: call bdev_fput() to reclaim the blk_holder immediately Qu Wenruo
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig, Johannes Thumshirn

From: Christoph Hellwig <hch@lst.de>

Although btrfs is not yet implementing blk_holder_ops yet, there is a
requirement for a proper blk_holder_ops:

- blkdev_put() must not be called under sb->s_umount
  The blkdev_put()/bdev_fput() must not be called under sb->s_umount to
  avoid lock order reversal with disk->open_mutex.
  This is for the proper blk_holder_ops callbacks.

  Currently we're fine because we call regular fput() which defers the
  blk holder reclaiming.

To prepare for the future of blk_holder_ops, move the
btrfs_close_devices() calls into btrfs_free_fs_info().

That will be called from kill_sb() callbacks, which is also called for
error handing during mount failures, or there is already an existing
super block.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/disk-io.c |  4 ++--
 fs/btrfs/super.c   | 28 +++++++++-------------------
 2 files changed, 11 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6ac5be02dce7..daf52179a85f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1246,6 +1246,8 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 {
 	struct percpu_counter *em_counter = &fs_info->evictable_extent_maps;
 
+	if (fs_info->fs_devices)
+		btrfs_close_devices(fs_info->fs_devices);
 	percpu_counter_destroy(&fs_info->stats_read_blocks);
 	percpu_counter_destroy(&fs_info->dirty_metadata_bytes);
 	percpu_counter_destroy(&fs_info->delalloc_bytes);
@@ -3682,7 +3684,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 
 	iput(fs_info->btree_inode);
 fail:
-	btrfs_close_devices(fs_info->fs_devices);
 	ASSERT(ret < 0);
 	return ret;
 }
@@ -4429,7 +4430,6 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
 	iput(fs_info->btree_inode);
 
 	btrfs_mapping_tree_free(fs_info);
-	btrfs_close_devices(fs_info->fs_devices);
 }
 
 void btrfs_mark_buffer_dirty(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b6bc262acf71..cc15939080e6 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1869,18 +1869,14 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 	if (ret)
 		return ret;
 
-	if (!(fc->sb_flags & SB_RDONLY) && fs_devices->rw_devices == 0) {
-		ret = -EACCES;
-		goto error;
-	}
+	if (!(fc->sb_flags & SB_RDONLY) && fs_devices->rw_devices == 0)
+		return -EACCES;
 
 	bdev = fs_devices->latest_dev->bdev;
 
 	sb = sget_fc(fc, btrfs_fc_test_super, set_anon_super_fc);
-	if (IS_ERR(sb)) {
-		ret = PTR_ERR(sb);
-		goto error;
-	}
+	if (IS_ERR(sb))
+		return PTR_ERR(sb);
 
 	set_device_specific_options(fs_info);
 
@@ -1889,17 +1885,15 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 		 * Not the first mount of the fs thus got an existing super block.
 		 *
 		 * Will reuse the returned super block, fs_info and fs_devices.
-		 */
-		ASSERT(fc->s_fs_info == fs_info);
-
-		/*
+		 *
 		 * fc->s_fs_info is not touched and will be later freed by
 		 * put_fs_context() through btrfs_free_fs_context().
 		 *
-		 * But we have opened fs_devices at the beginning of the
-		 * function, thus still need to close them manually.
+		 * And the fs_info->fs_devices will also be closed by
+		 * btrfs_free_fs_context().
 		 */
-		btrfs_close_devices(fs_devices);
+		ASSERT(fc->s_fs_info == fs_info);
+
 		/*
 		 * At this stage we may have RO flag mismatch between
 		 * fc->sb_flags and sb->s_flags.  Caller should detect such
@@ -1928,10 +1922,6 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 
 	fc->root = dget(sb->s_root);
 	return 0;
-
-error:
-	btrfs_close_devices(fs_devices);
-	return ret;
 }
 
 /*
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 5/8] btrfs: call bdev_fput() to reclaim the blk_holder immediately
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
                   ` (3 preceding siblings ...)
  2025-06-30  5:29 ` [PATCH v6 4/8] btrfs: call btrfs_close_devices from ->kill_sb Qu Wenruo
@ 2025-06-30  5:29 ` Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 6/8] btrfs: delay btrfs_open_devices() until super block is created Qu Wenruo
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs

As part of the preparation for btrfs blk_holder_ops, we want to ensure
the holder of a block device has a proper lifespan.

However btrfs is always using fput() to close a block device, which has
one problem:

- fput() is deferred
  Meaning we can have a block device with invalid (aka, freed) holder.

To avoid the problem, and align the behavior to all the existing codes,
just call bdev_fput() instead.

There is some extra requirement on the locking, but that's all resolved
by previous patches and we should be safe to call bdev_fput().

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/dev-replace.c |  2 +-
 fs/btrfs/ioctl.c       |  4 ++--
 fs/btrfs/volumes.c     | 18 +++++++++---------
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index bd2761799181..473450ee0408 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -327,7 +327,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	return 0;
 
 error:
-	fput(bdev_file);
+	bdev_fput(bdev_file);
 	return ret;
 }
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 5068f1fa86f6..65763bc6a0f6 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2699,7 +2699,7 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 err_drop:
 	mnt_drop_write_file(file);
 	if (bdev_file)
-		fput(bdev_file);
+		bdev_fput(bdev_file);
 out:
 	btrfs_put_dev_args_from_path(&args);
 	kfree(vol_args);
@@ -2750,7 +2750,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 
 	mnt_drop_write_file(file);
 	if (bdev_file)
-		fput(bdev_file);
+		bdev_fput(bdev_file);
 out:
 	btrfs_put_dev_args_from_path(&args);
 out_free:
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bb63729770d9..f51c81c1682c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -489,7 +489,7 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
 	if (holder) {
 		ret = set_blocksize(*bdev_file, BTRFS_BDEV_BLOCKSIZE);
 		if (ret) {
-			fput(*bdev_file);
+			bdev_fput(*bdev_file);
 			goto error;
 		}
 	}
@@ -497,7 +497,7 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
 	*disk_super = btrfs_read_disk_super(bdev, 0, false);
 	if (IS_ERR(*disk_super)) {
 		ret = PTR_ERR(*disk_super);
-		fput(*bdev_file);
+		bdev_fput(*bdev_file);
 		goto error;
 	}
 
@@ -721,7 +721,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 
 error_free_page:
 	btrfs_release_disk_super(disk_super);
-	fput(bdev_file);
+	bdev_fput(bdev_file);
 
 	return -EINVAL;
 }
@@ -1071,7 +1071,7 @@ static void __btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices,
 			continue;
 
 		if (device->bdev_file) {
-			fput(device->bdev_file);
+			bdev_fput(device->bdev_file);
 			device->bdev = NULL;
 			device->bdev_file = NULL;
 			fs_devices->open_devices--;
@@ -1118,7 +1118,7 @@ static void btrfs_close_bdev(struct btrfs_device *device)
 		invalidate_bdev(device->bdev);
 	}
 
-	fput(device->bdev_file);
+	bdev_fput(device->bdev_file);
 }
 
 static void btrfs_close_one_device(struct btrfs_device *device)
@@ -1491,7 +1491,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path,
 	btrfs_release_disk_super(disk_super);
 
 error_bdev_put:
-	fput(bdev_file);
+	bdev_fput(bdev_file);
 
 	return device;
 }
@@ -2295,7 +2295,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 	 * free the device.
 	 *
 	 * We cannot call btrfs_close_bdev() here because we're holding the sb
-	 * write lock, and fput() on the block device will pull in the
+	 * write lock, and bdev_fput() on the block device will pull in the
 	 * ->open_mutex on the block device and it's dependencies.  Instead
 	 *  just flush the device and let the caller do the final bdev_release.
 	 */
@@ -2474,7 +2474,7 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
 	else
 		memcpy(args->fsid, disk_super->fsid, BTRFS_FSID_SIZE);
 	btrfs_release_disk_super(disk_super);
-	fput(bdev_file);
+	bdev_fput(bdev_file);
 	return 0;
 }
 
@@ -2922,7 +2922,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 error_free_device:
 	btrfs_free_device(device);
 error:
-	fput(bdev_file);
+	bdev_fput(bdev_file);
 	if (locked) {
 		mutex_unlock(&uuid_mutex);
 		up_write(&sb->s_umount);
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 6/8] btrfs: delay btrfs_open_devices() until super block is created
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
                   ` (4 preceding siblings ...)
  2025-06-30  5:29 ` [PATCH v6 5/8] btrfs: call bdev_fput() to reclaim the blk_holder immediately Qu Wenruo
@ 2025-06-30  5:29 ` Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 7/8] btrfs: use the super_block as holder when mounting file systems Qu Wenruo
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs

Currently btrfs always calls btrfs_open_devices() before creating the
super block.

It's fine for now because:

- No blk_holder_ops is provided
- btrfs_fs_type is used as a holder

This means no matter who wins the device opening race, the holder will be
the same thus not affecting the later sget_fc() race.

And since no blk_holder_ops is provided, no bdev operation is depending on
the holder.

But this will no longer be true if we want to implement a proper
blk_holder_ops using fs_holder_ops.
This means we will need a proper super block as the bdev holder.

To prepare for such change:

- Add btrfs_fs_devices::holding member
  This will prevent btrfs_free_stale_devices() and btrfs_close_device()
  from deleting the fs_devices when there is another process trying to
  mount the fs.

  Along with the new member, here comes two helpers,
  btrfs_fs_devices_inc_holding() and btrfs_fs_devices_dec_holding().

  This will allow us to hold an fs_devices without opening it.

  This is needed because we can not hold uuid_mutex while calling
  sget_fc(), this will reverse the lock sequence with s_umount, causing
  a lockdep warning.

- Delay btrfs_open_devices() until a super block is returned
  This means we have to hold the initial fs_devices first, then unlock
  uuid_mutex, call sget_fc(), then re-lock uuid_mutex, and decrease the
  holding number.

  For new super block case, we continue to btrfs_open_devices() with
  uuid_mutex hold.
  For existing super block case, we can unlock uuid_mutex and continue.

  Although this means a more complex error handling path, as if we
  didn't call btrfs_open_devices() (either got an existing sb, or
  sget_fc() failed), we can not let btrfs_put_fs_info() to cleanup the
  fs_devices, as it can be freed at any time after we decrease the hold
  on fs_devices and unlock uuid_mutex.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/super.c   | 60 +++++++++++++++++++++++++++++++++++-----------
 fs/btrfs/volumes.c |  5 ++--
 fs/btrfs/volumes.h | 25 +++++++++++++++++++
 3 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index cc15939080e6..23105f91f6d2 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1841,7 +1841,6 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 	struct btrfs_fs_info *fs_info = fc->s_fs_info;
 	struct btrfs_fs_context *ctx = fc->fs_private;
 	struct btrfs_fs_devices *fs_devices = NULL;
-	struct block_device *bdev;
 	struct btrfs_device *device;
 	struct super_block *sb;
 	blk_mode_t mode = btrfs_open_mode(fc);
@@ -1860,23 +1859,38 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 		mutex_unlock(&uuid_mutex);
 		return PTR_ERR(device);
 	}
-
 	fs_devices = device->fs_devices;
+	/*
+	 * We can not hold uuid_mutex calling sget_fc(), it will lead to a
+	 * locking order reversal with s_umount.
+	 *
+	 * So here we increase the holding number of fs_devices, this will ensure
+	 * the fs_devices itself won't be freed.
+	 */
+	btrfs_fs_devices_inc_holding(fs_devices);
 	fs_info->fs_devices = fs_devices;
-
-	ret = btrfs_open_devices(fs_devices, mode, &btrfs_fs_type);
 	mutex_unlock(&uuid_mutex);
-	if (ret)
-		return ret;
 
-	if (!(fc->sb_flags & SB_RDONLY) && fs_devices->rw_devices == 0)
-		return -EACCES;
-
-	bdev = fs_devices->latest_dev->bdev;
 
 	sb = sget_fc(fc, btrfs_fc_test_super, set_anon_super_fc);
-	if (IS_ERR(sb))
+	if (IS_ERR(sb)) {
+		mutex_lock(&uuid_mutex);
+		btrfs_fs_devices_dec_holding(fs_devices);
+		/*
+		 * Since the fs_devices is not opened, it can be freed at any
+		 * time after unlocking uuid_mutex.
+		 * We do not want to double free such dangling fs_devices through
+		 * put_fs_context()->btrfs_free_fs_info().
+		 * So here we reset fs_info->fs_devices to NULL, and let the
+		 * regular fs_devices reclaim path to handle it.
+		 *
+		 * This applies to all later branches where no fs_devices is
+		 * opened.
+		 */
+		fs_info->fs_devices = NULL;
+		mutex_unlock(&uuid_mutex);
 		return PTR_ERR(sb);
+	}
 
 	set_device_specific_options(fs_info);
 
@@ -1888,12 +1902,13 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 		 *
 		 * fc->s_fs_info is not touched and will be later freed by
 		 * put_fs_context() through btrfs_free_fs_context().
-		 *
-		 * And the fs_info->fs_devices will also be closed by
-		 * btrfs_free_fs_context().
 		 */
 		ASSERT(fc->s_fs_info == fs_info);
 
+		mutex_lock(&uuid_mutex);
+		btrfs_fs_devices_dec_holding(fs_devices);
+		fs_info->fs_devices = NULL;
+		mutex_unlock(&uuid_mutex);
 		/*
 		 * At this stage we may have RO flag mismatch between
 		 * fc->sb_flags and sb->s_flags.  Caller should detect such
@@ -1901,6 +1916,8 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 		 * needed.
 		 */
 	} else {
+		struct block_device *bdev;
+
 		/*
 		 * The first mount of the fs thus a new superblock, fc->s_fs_info
 		 * should be NULL, and the owner ship of our fs_info and fs_devices is
@@ -1908,6 +1925,21 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 		 */
 		ASSERT(fc->s_fs_info == NULL);
 
+		mutex_lock(&uuid_mutex);
+		btrfs_fs_devices_dec_holding(fs_devices);
+		ret = btrfs_open_devices(fs_devices, mode, &btrfs_fs_type);
+		if (ret < 0)
+			fs_info->fs_devices = NULL;
+		mutex_unlock(&uuid_mutex);
+		if (ret < 0) {
+			deactivate_locked_super(sb);
+			return ret;
+		}
+		if (!(fc->sb_flags & SB_RDONLY) && fs_devices->rw_devices == 0) {
+			deactivate_locked_super(sb);
+			return -EACCES;
+		}
+		bdev = fs_devices->latest_dev->bdev;
 		snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
 		shrinker_debugfs_rename(sb->s_shrink, "sb-btrfs:%s", sb->s_id);
 		btrfs_sb(sb)->bdev_holder = &btrfs_fs_type;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f51c81c1682c..309f9c182e6f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -415,6 +415,7 @@ static void free_fs_devices(struct btrfs_fs_devices *fs_devices)
 	struct btrfs_device *device;
 
 	WARN_ON(fs_devices->opened);
+	WARN_ON(fs_devices->holding);
 	while (!list_empty(&fs_devices->devices)) {
 		device = list_first_entry(&fs_devices->devices,
 					  struct btrfs_device, dev_list);
@@ -542,7 +543,7 @@ static int btrfs_free_stale_devices(dev_t devt, struct btrfs_device *skip_device
 				continue;
 			if (devt && devt != device->devt)
 				continue;
-			if (fs_devices->opened) {
+			if (fs_devices->opened || fs_devices->holding) {
 				if (devt)
 					ret = -EBUSY;
 				break;
@@ -1198,7 +1199,7 @@ void btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 
 	mutex_lock(&uuid_mutex);
 	close_fs_devices(fs_devices);
-	if (!fs_devices->opened) {
+	if (!fs_devices->opened && !fs_devices->holding) {
 		list_splice_init(&fs_devices->seed_list, &list);
 
 		/*
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index afa71d315c46..6acb154ccf87 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -422,6 +422,17 @@ struct btrfs_fs_devices {
 	/* Count fs-devices opened. */
 	int opened;
 
+	/*
+	 * Counter of the processes that are holding this fs_devices but not
+	 * yet opened.
+	 * This is for mounting handling, as we can only open the fs_devices
+	 * after a super block is created.
+	 * But we can not hold uuid_mutex during sget_fc(), thus we have to
+	 * hold the fs_devices (meaning it can not be released) until a super
+	 * block is returned.
+	 */
+	int holding;
+
 	/* Set when we find or add a device that doesn't have the nonrot flag set. */
 	bool rotating;
 	/* Devices support TRIM/discard commands. */
@@ -854,6 +865,20 @@ static inline void btrfs_warn_unknown_chunk_allocation(enum btrfs_chunk_allocati
 	WARN_ONCE(1, "unknown allocation policy %d, fallback to regular", pol);
 }
 
+static inline void btrfs_fs_devices_inc_holding(struct btrfs_fs_devices *fs_devices)
+{
+	lockdep_assert_held(&uuid_mutex);
+	ASSERT(fs_devices->holding >= 0);
+	fs_devices->holding++;
+}
+
+static inline void btrfs_fs_devices_dec_holding(struct btrfs_fs_devices *fs_devices)
+{
+	lockdep_assert_held(&uuid_mutex);
+	ASSERT(fs_devices->holding > 0);
+	fs_devices->holding--;
+}
+
 void btrfs_commit_device_sizes(struct btrfs_transaction *trans);
 
 struct list_head * __attribute_const__ btrfs_get_fs_uuids(void);
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 7/8] btrfs: use the super_block as holder when mounting file systems
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
                   ` (5 preceding siblings ...)
  2025-06-30  5:29 ` [PATCH v6 6/8] btrfs: delay btrfs_open_devices() until super block is created Qu Wenruo
@ 2025-06-30  5:29 ` Qu Wenruo
  2025-06-30  5:29 ` [PATCH v6 8/8] btrfs: use fs_holder_ops for all opened devices Qu Wenruo
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig, Johannes Thumshirn

From: Christoph Hellwig <hch@lst.de>

The file system type is not a very useful holder as it doesn't allow us
to go back to the actual file system instance.  Pass the super_block
instead which is useful when passed back to the file system driver.

This matches what is done for all other block device based file systems,
and allows us to remove btrfs_fs_info::bdev_holder completely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/dev-replace.c | 2 +-
 fs/btrfs/fs.h          | 2 --
 fs/btrfs/super.c       | 3 +--
 fs/btrfs/volumes.c     | 4 ++--
 4 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 473450ee0408..b828e4003552 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -250,7 +250,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	}
 
 	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
-					fs_info->bdev_holder, NULL);
+					   fs_info->sb, NULL);
 	if (IS_ERR(bdev_file)) {
 		btrfs_err(fs_info, "target device %s is invalid!", device_path);
 		return PTR_ERR(bdev_file);
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index b239e4b8421c..d90304d4e32c 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -715,8 +715,6 @@ struct btrfs_fs_info {
 	u32 data_chunk_allocations;
 	u32 metadata_ratio;
 
-	void *bdev_holder;
-
 	/* Private scrub information */
 	struct mutex scrub_lock;
 	atomic_t scrubs_running;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 23105f91f6d2..727d5c1d1bf1 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1927,7 +1927,7 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 
 		mutex_lock(&uuid_mutex);
 		btrfs_fs_devices_dec_holding(fs_devices);
-		ret = btrfs_open_devices(fs_devices, mode, &btrfs_fs_type);
+		ret = btrfs_open_devices(fs_devices, mode, sb);
 		if (ret < 0)
 			fs_info->fs_devices = NULL;
 		mutex_unlock(&uuid_mutex);
@@ -1942,7 +1942,6 @@ static int btrfs_get_tree_super(struct fs_context *fc)
 		bdev = fs_devices->latest_dev->bdev;
 		snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
 		shrinker_debugfs_rename(sb->s_shrink, "sb-btrfs:%s", sb->s_id);
-		btrfs_sb(sb)->bdev_holder = &btrfs_fs_type;
 		ret = btrfs_fill_super(sb, fs_devices);
 		if (ret) {
 			deactivate_locked_super(sb);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 309f9c182e6f..869402eaa0bb 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2707,7 +2707,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 		return -EROFS;
 
 	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
-					fs_info->bdev_holder, NULL);
+					   fs_info->sb, NULL);
 	if (IS_ERR(bdev_file))
 		return PTR_ERR(bdev_file);
 
@@ -7176,7 +7176,7 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info,
 	if (IS_ERR(fs_devices))
 		return fs_devices;
 
-	ret = open_fs_devices(fs_devices, BLK_OPEN_READ, fs_info->bdev_holder);
+	ret = open_fs_devices(fs_devices, BLK_OPEN_READ, fs_info->sb);
 	if (ret) {
 		free_fs_devices(fs_devices);
 		return ERR_PTR(ret);
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 8/8] btrfs: use fs_holder_ops for all opened devices
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
                   ` (6 preceding siblings ...)
  2025-06-30  5:29 ` [PATCH v6 7/8] btrfs: use the super_block as holder when mounting file systems Qu Wenruo
@ 2025-06-30  5:29 ` Qu Wenruo
  2025-06-30  5:40 ` [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Christoph Hellwig
  2025-07-01 14:38 ` David Sterba
  9 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:29 UTC (permalink / raw)
  To: linux-btrfs

Since we have btrfs_fs_info::sb (struct super_block) as our block device
holder, we can safely use fs_holder_ops for all of our block devices.

This enables freezing/thawing the btrfs from the underlying block
devices.

This may enhance hibernation/suspension support since previously
freezing/thawing a block device managed by btrfs won't do anything btrfs
specific, but only syncing the block device.

Thus before this change, freezing the underlying block devices won't
prevent future writes into the btrfs, thus may cause problems for
hibernation/suspension when btrfs is involved.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/dev-replace.c | 2 +-
 fs/btrfs/volumes.c     | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index b828e4003552..4675bcd5f92e 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -250,7 +250,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	}
 
 	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
-					   fs_info->sb, NULL);
+					   fs_info->sb, &fs_holder_ops);
 	if (IS_ERR(bdev_file)) {
 		btrfs_err(fs_info, "target device %s is invalid!", device_path);
 		return PTR_ERR(bdev_file);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 869402eaa0bb..8ea1a69808a3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -475,7 +475,7 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
 	struct block_device *bdev;
 	int ret;
 
-	*bdev_file = bdev_file_open_by_path(device_path, flags, holder, NULL);
+	*bdev_file = bdev_file_open_by_path(device_path, flags, holder, &fs_holder_ops);
 
 	if (IS_ERR(*bdev_file)) {
 		ret = PTR_ERR(*bdev_file);
@@ -2707,7 +2707,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 		return -EROFS;
 
 	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
-					   fs_info->sb, NULL);
+					   fs_info->sb, &fs_holder_ops);
 	if (IS_ERR(bdev_file))
 		return PTR_ERR(bdev_file);
 
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
                   ` (7 preceding siblings ...)
  2025-06-30  5:29 ` [PATCH v6 8/8] btrfs: use fs_holder_ops for all opened devices Qu Wenruo
@ 2025-06-30  5:40 ` Christoph Hellwig
  2025-06-30  5:43   ` Qu Wenruo
  2025-07-01 14:38 ` David Sterba
  9 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2025-06-30  5:40 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Looks like this doesn't add the remove_dev method and thus does the
wrong thing when a device underlying a raid configuration gets removed?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs
  2025-06-30  5:40 ` [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Christoph Hellwig
@ 2025-06-30  5:43   ` Qu Wenruo
  2025-06-30  5:49     ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2025-06-30  5:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-btrfs



在 2025/6/30 15:10, Christoph Hellwig 写道:
> Looks like this doesn't add the remove_dev method and thus does the
> wrong thing when a device underlying a raid configuration gets removed?
> 

The remove_bdev() callback is a completely different patchset, which 
depends on the btrfs shutdown support.

This series is only for the blk_holder_ops, and will be the basis for 
the remove_bdev() series.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs
  2025-06-30  5:43   ` Qu Wenruo
@ 2025-06-30  5:49     ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2025-06-30  5:49 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Christoph Hellwig, linux-btrfs

On Mon, Jun 30, 2025 at 03:13:11PM +0930, Qu Wenruo wrote:
> 
> 
> 在 2025/6/30 15:10, Christoph Hellwig 写道:
> > Looks like this doesn't add the remove_dev method and thus does the
> > wrong thing when a device underlying a raid configuration gets removed?
> > 
> 
> The remove_bdev() callback is a completely different patchset, which depends
> on the btrfs shutdown support.
> 
> This series is only for the blk_holder_ops, and will be the basis for the
> remove_bdev() series.

Oh, right.  Without ->shutdown we're not getting any shutdown at all
so things should be fine.  Sorry for the noise.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs
  2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
                   ` (8 preceding siblings ...)
  2025-06-30  5:40 ` [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Christoph Hellwig
@ 2025-07-01 14:38 ` David Sterba
  9 siblings, 0 replies; 13+ messages in thread
From: David Sterba @ 2025-07-01 14:38 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Mon, Jun 30, 2025 at 02:59:04PM +0930, Qu Wenruo wrote:
> [CHANGELOG]
> v6:
> - Fix an error handling bug that can lead to use-after-free
>   Reported by syzbot, that inside btrfs_get_tree_super() that if we
>   didn't open the devices, there are corner cases that
>   fs_info->fs_devices can be freed twice, causing use-after-free bug.
> 
>   This one fixed two error paths:
>   * sget_fc() failure
>     Which is not the one reported by syzbot, but still possible to hit.
> 
>   * btrfs_open_devices() failure
>     Which I believe is the one reported by syzbot.
> 
>   There is a dedicated fix pushed into linux-next.
> 
>   This refreshed series is for the proper merge into our for-next
>   branch.

Updated in linux-next. There were some conflicts between branches so
this got a bit delayed. If there are no significant problems found,
please add the branch to for-next by Friday or on Monday at the latest.
Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-07-01 14:38 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-30  5:29 [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Qu Wenruo
2025-06-30  5:29 ` [PATCH v6 1/8] btrfs: always open the device read-only in btrfs_scan_one_device Qu Wenruo
2025-06-30  5:29 ` [PATCH v6 2/8] btrfs: get rid of the re-entry of btrfs_get_tree() Qu Wenruo
2025-06-30  5:29 ` [PATCH v6 3/8] btrfs: add comments to make super block creation more clear Qu Wenruo
2025-06-30  5:29 ` [PATCH v6 4/8] btrfs: call btrfs_close_devices from ->kill_sb Qu Wenruo
2025-06-30  5:29 ` [PATCH v6 5/8] btrfs: call bdev_fput() to reclaim the blk_holder immediately Qu Wenruo
2025-06-30  5:29 ` [PATCH v6 6/8] btrfs: delay btrfs_open_devices() until super block is created Qu Wenruo
2025-06-30  5:29 ` [PATCH v6 7/8] btrfs: use the super_block as holder when mounting file systems Qu Wenruo
2025-06-30  5:29 ` [PATCH v6 8/8] btrfs: use fs_holder_ops for all opened devices Qu Wenruo
2025-06-30  5:40 ` [PATCH v6 0/8] btrfs: use fs_holder_ops for btrfs Christoph Hellwig
2025-06-30  5:43   ` Qu Wenruo
2025-06-30  5:49     ` Christoph Hellwig
2025-07-01 14:38 ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox