* [PATCH 0/7] Implement freeze and thaw as holder operations
@ 2023-09-27 13:21 Christian Brauner
2023-09-27 13:21 ` [PATCH 1/7] bdev: rename freeze and thaw helpers Christian Brauner
` (6 more replies)
0 siblings, 7 replies; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 13:21 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel, Darrick J. Wong, Christian Brauner
Hey Christoph,
Hey Jan,
This implements block device freezing and thawing as holder operations.
Not just does this allow to simplify the current implementation by
removing a few locks and members in struct block_device and getting rid
of a few helpers in the superblock code, it also allows us to implement
block device freezing for multiple devices and not just the main block
device.
This will also allow us to fix block device freezing and thawing for
btrfs which is broken right now but that's just a nice side-effect.
Thanks!
Christian
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
base-commit: 0e945134b680040b8613e962f586d91b6d40292d
change-id: 20230927-vfs-super-freeze-eff650f66b06
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/7] bdev: rename freeze and thaw helpers
2023-09-27 13:21 [PATCH 0/7] Implement freeze and thaw as holder operations Christian Brauner
@ 2023-09-27 13:21 ` Christian Brauner
2023-09-27 14:35 ` Darrick J. Wong
` (2 more replies)
2023-09-27 13:21 ` [PATCH 2/7] bdev: add freeze and thaw holder operations Christian Brauner
` (5 subsequent siblings)
6 siblings, 3 replies; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 13:21 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel, Darrick J. Wong, Christian Brauner
We have bdev_mark_dead() etc and we're going to move block device
freezing to holder ops in the next patch. Make the naming consistent:
* freeze_bdev() -> bdev_freeze()
* thaw_bdev() -> bdev_thaw()
Also document the return code.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
block/bdev.c | 22 +++++++++++++---------
drivers/md/dm.c | 4 ++--
fs/ext4/ioctl.c | 4 ++--
fs/f2fs/file.c | 4 ++--
fs/super.c | 4 ++--
fs/xfs/xfs_fsops.c | 4 ++--
include/linux/blkdev.h | 4 ++--
7 files changed, 25 insertions(+), 21 deletions(-)
diff --git a/block/bdev.c b/block/bdev.c
index f3b13aa1b7d4..0d27db3e69e7 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -207,18 +207,20 @@ int sync_blockdev_range(struct block_device *bdev, loff_t lstart, loff_t lend)
EXPORT_SYMBOL(sync_blockdev_range);
/**
- * freeze_bdev - lock a filesystem and force it into a consistent state
+ * bdev_freeze - lock a filesystem and force it into a consistent state
* @bdev: blockdevice to lock
*
* If a superblock is found on this device, we take the s_umount semaphore
* on it to make sure nobody unmounts until the snapshot creation is done.
* The reference counter (bd_fsfreeze_count) guarantees that only the last
* unfreeze process can unfreeze the frozen filesystem actually when multiple
- * freeze requests arrive simultaneously. It counts up in freeze_bdev() and
- * count down in thaw_bdev(). When it becomes 0, thaw_bdev() will unfreeze
+ * freeze requests arrive simultaneously. It counts up in bdev_freeze() and
+ * count down in bdev_thaw(). When it becomes 0, thaw_bdev() will unfreeze
* actually.
+ *
+ * Return: On success zero is returned, negative error code on failure.
*/
-int freeze_bdev(struct block_device *bdev)
+int bdev_freeze(struct block_device *bdev)
{
struct super_block *sb;
int error = 0;
@@ -248,15 +250,17 @@ int freeze_bdev(struct block_device *bdev)
mutex_unlock(&bdev->bd_fsfreeze_mutex);
return error;
}
-EXPORT_SYMBOL(freeze_bdev);
+EXPORT_SYMBOL(bdev_freeze);
/**
- * thaw_bdev - unlock filesystem
+ * bdev_thaw - unlock filesystem
* @bdev: blockdevice to unlock
*
- * Unlocks the filesystem and marks it writeable again after freeze_bdev().
+ * Unlocks the filesystem and marks it writeable again after bdev_freeze().
+ *
+ * Return: On success zero is returned, negative error code on failure.
*/
-int thaw_bdev(struct block_device *bdev)
+int bdev_thaw(struct block_device *bdev)
{
struct super_block *sb;
int error = -EINVAL;
@@ -285,7 +289,7 @@ int thaw_bdev(struct block_device *bdev)
mutex_unlock(&bdev->bd_fsfreeze_mutex);
return error;
}
-EXPORT_SYMBOL(thaw_bdev);
+EXPORT_SYMBOL(bdev_thaw);
/*
* pseudo-fs
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 64a1f306c96c..6fa309e8efb0 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2648,7 +2648,7 @@ static int lock_fs(struct mapped_device *md)
WARN_ON(test_bit(DMF_FROZEN, &md->flags));
- r = freeze_bdev(md->disk->part0);
+ r = bdev_freeze(md->disk->part0);
if (!r)
set_bit(DMF_FROZEN, &md->flags);
return r;
@@ -2658,7 +2658,7 @@ static void unlock_fs(struct mapped_device *md)
{
if (!test_bit(DMF_FROZEN, &md->flags))
return;
- thaw_bdev(md->disk->part0);
+ bdev_thaw(md->disk->part0);
clear_bit(DMF_FROZEN, &md->flags);
}
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 0bfe2ce589e2..c1390219c945 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -810,11 +810,11 @@ int ext4_force_shutdown(struct super_block *sb, u32 flags)
switch (flags) {
case EXT4_GOING_FLAGS_DEFAULT:
- ret = freeze_bdev(sb->s_bdev);
+ ret = bdev_freeze(sb->s_bdev);
if (ret)
return ret;
set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
- thaw_bdev(sb->s_bdev);
+ bdev_thaw(sb->s_bdev);
break;
case EXT4_GOING_FLAGS_LOGFLUSH:
set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ca5904129b16..c22aeb9ffb61 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2239,11 +2239,11 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
switch (in) {
case F2FS_GOING_DOWN_FULLSYNC:
- ret = freeze_bdev(sb->s_bdev);
+ ret = bdev_freeze(sb->s_bdev);
if (ret)
goto out;
f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_SHUTDOWN);
- thaw_bdev(sb->s_bdev);
+ bdev_thaw(sb->s_bdev);
break;
case F2FS_GOING_DOWN_METASYNC:
/* do checkpoint only */
diff --git a/fs/super.c b/fs/super.c
index 2d762ce67f6e..e54866345dc7 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1210,7 +1210,7 @@ static void do_thaw_all_callback(struct super_block *sb)
if (born && sb->s_root) {
if (IS_ENABLED(CONFIG_BLOCK))
- while (sb->s_bdev && !thaw_bdev(sb->s_bdev))
+ while (sb->s_bdev && !bdev_thaw(sb->s_bdev))
pr_warn("Emergency Thaw on %pg\n", sb->s_bdev);
thaw_super_locked(sb, FREEZE_HOLDER_USERSPACE);
} else {
@@ -1501,7 +1501,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
/*
* Until SB_BORN flag is set, there can be no active superblock
* references and thus no filesystem freezing. get_active_super() will
- * just loop waiting for SB_BORN so even freeze_bdev() cannot proceed.
+ * just loop waiting for SB_BORN so even bdev_freeze() cannot proceed.
*
* It is enough to check bdev was not frozen before we set s_bdev.
*/
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 7cb75cb6b8e9..57076a25f17d 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -482,9 +482,9 @@ xfs_fs_goingdown(
{
switch (inflags) {
case XFS_FSOP_GOING_FLAGS_DEFAULT: {
- if (!freeze_bdev(mp->m_super->s_bdev)) {
+ if (!bdev_freeze(mp->m_super->s_bdev)) {
xfs_force_shutdown(mp, SHUTDOWN_FORCE_UMOUNT);
- thaw_bdev(mp->m_super->s_bdev);
+ bdev_thaw(mp->m_super->s_bdev);
}
break;
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index eef450f25982..bf25b63e13d5 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1530,8 +1530,8 @@ static inline int early_lookup_bdev(const char *pathname, dev_t *dev)
}
#endif /* CONFIG_BLOCK */
-int freeze_bdev(struct block_device *bdev);
-int thaw_bdev(struct block_device *bdev);
+int bdev_freeze(struct block_device *bdev);
+int bdev_thaw(struct block_device *bdev);
struct io_comp_batch {
struct request *req_list;
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/7] bdev: add freeze and thaw holder operations
2023-09-27 13:21 [PATCH 0/7] Implement freeze and thaw as holder operations Christian Brauner
2023-09-27 13:21 ` [PATCH 1/7] bdev: rename freeze and thaw helpers Christian Brauner
@ 2023-09-27 13:21 ` Christian Brauner
2023-09-27 14:38 ` Darrick J. Wong
` (2 more replies)
2023-09-27 13:21 ` [PATCH 3/7] bdev: implement " Christian Brauner
` (4 subsequent siblings)
6 siblings, 3 replies; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 13:21 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel, Darrick J. Wong, Christian Brauner
Add block device freeze and thaw holder operations. Follow-up patches
will implement block device freeze and thaw based on stuct
blk_holder_ops.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
include/linux/blkdev.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bf25b63e13d5..f2ddccaaef4d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1468,6 +1468,16 @@ struct blk_holder_ops {
* Sync the file system mounted on the block device.
*/
void (*sync)(struct block_device *bdev);
+
+ /*
+ * Freeze the file system mounted on the block device.
+ */
+ int (*freeze)(struct block_device *bdev);
+
+ /*
+ * Thaw the file system mounted on the block device.
+ */
+ int (*thaw)(struct block_device *bdev);
};
extern const struct blk_holder_ops fs_holder_ops;
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 3/7] bdev: implement freeze and thaw holder operations
2023-09-27 13:21 [PATCH 0/7] Implement freeze and thaw as holder operations Christian Brauner
2023-09-27 13:21 ` [PATCH 1/7] bdev: rename freeze and thaw helpers Christian Brauner
2023-09-27 13:21 ` [PATCH 2/7] bdev: add freeze and thaw holder operations Christian Brauner
@ 2023-09-27 13:21 ` Christian Brauner
2023-09-27 14:53 ` Darrick J. Wong
2023-10-02 7:10 ` Christoph Hellwig
2023-09-27 13:21 ` [PATCH 4/7] fs: remove get_active_super() Christian Brauner
` (3 subsequent siblings)
6 siblings, 2 replies; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 13:21 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel, Darrick J. Wong, Christian Brauner
The old method of implementing block device freeze and thaw operations
required us to rely on get_active_super() to walk the list of all
superblocks on the system to find any superblock that might use the
block device. This is wasteful and not very pleasant overall.
Now that we can finally go straight from block device to owning
superblock things become way simpler. In fact, we can even get rid of
bd_fsfreeze_mutex and rely on the holder lock for freezing.
We let fs_bdev_thaw() grab its own active reference so we can hold
bd_holder_lock across freeze and thaw without risk of deadlocks. That in
turn allows us to get rid of bd_fsfreeze_mutex. This means
fs_bdev_thaw() might put the last reference to the superblock instead of
thaw_super(). This shouldn't be an issue. If it turns out to be one we
can reshuffle the code to simply hold s_umount when thaw_super() returns.
Thanks to Jan and Christoph who pointed out that bd_fsfreeze_mutex is
unnecessary now.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
block/bdev.c | 74 +++++++++++++++++--------------------
fs/super.c | 94 ++++++++++++++++++++++++++++++++++++++++++-----
include/linux/blk_types.h | 2 +-
3 files changed, 119 insertions(+), 51 deletions(-)
diff --git a/block/bdev.c b/block/bdev.c
index 0d27db3e69e7..3deccd0ffcf2 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -222,32 +222,23 @@ EXPORT_SYMBOL(sync_blockdev_range);
*/
int bdev_freeze(struct block_device *bdev)
{
- struct super_block *sb;
int error = 0;
- mutex_lock(&bdev->bd_fsfreeze_mutex);
- if (++bdev->bd_fsfreeze_count > 1)
- goto done;
+ mutex_lock(&bdev->bd_holder_lock);
- sb = get_active_super(bdev);
- if (!sb)
- goto sync;
- if (sb->s_op->freeze_super)
- error = sb->s_op->freeze_super(sb, FREEZE_HOLDER_USERSPACE);
- else
- error = freeze_super(sb, FREEZE_HOLDER_USERSPACE);
- deactivate_super(sb);
+ if (atomic_inc_return(&bdev->bd_fsfreeze_count) > 1) {
+ mutex_unlock(&bdev->bd_holder_lock);
+ return 0;
+ }
- if (error) {
- bdev->bd_fsfreeze_count--;
- goto done;
+ if (bdev->bd_holder_ops && bdev->bd_holder_ops->freeze) {
+ error = bdev->bd_holder_ops->freeze(bdev);
+ lockdep_assert_not_held(&bdev->bd_holder_lock);
+ } else {
+ sync_blockdev(bdev);
+ mutex_unlock(&bdev->bd_holder_lock);
}
- bdev->bd_fsfreeze_sb = sb;
-sync:
- sync_blockdev(bdev);
-done:
- mutex_unlock(&bdev->bd_fsfreeze_mutex);
return error;
}
EXPORT_SYMBOL(bdev_freeze);
@@ -262,31 +253,32 @@ EXPORT_SYMBOL(bdev_freeze);
*/
int bdev_thaw(struct block_device *bdev)
{
- struct super_block *sb;
- int error = -EINVAL;
+ int error = 0, nr_freeze;
- mutex_lock(&bdev->bd_fsfreeze_mutex);
- if (!bdev->bd_fsfreeze_count)
- goto out;
+ mutex_lock(&bdev->bd_holder_lock);
- error = 0;
- if (--bdev->bd_fsfreeze_count > 0)
- goto out;
+ /*
+ * If this returns < 0 it means that @bd_fsfreeze_count was
+ * already 0 and no decrement was performed.
+ */
+ nr_freeze = atomic_dec_if_positive(&bdev->bd_fsfreeze_count);
+ if (nr_freeze < 0) {
+ mutex_unlock(&bdev->bd_holder_lock);
+ return -EINVAL;
+ }
- sb = bdev->bd_fsfreeze_sb;
- if (!sb)
- goto out;
+ if (nr_freeze > 0) {
+ mutex_unlock(&bdev->bd_holder_lock);
+ return 0;
+ }
+
+ if (bdev->bd_holder_ops && bdev->bd_holder_ops->thaw) {
+ error = bdev->bd_holder_ops->thaw(bdev);
+ lockdep_assert_not_held(&bdev->bd_holder_lock);
+ } else {
+ mutex_unlock(&bdev->bd_holder_lock);
+ }
- if (sb->s_op->thaw_super)
- error = sb->s_op->thaw_super(sb, FREEZE_HOLDER_USERSPACE);
- else
- error = thaw_super(sb, FREEZE_HOLDER_USERSPACE);
- if (error)
- bdev->bd_fsfreeze_count++;
- else
- bdev->bd_fsfreeze_sb = NULL;
-out:
- mutex_unlock(&bdev->bd_fsfreeze_mutex);
return error;
}
EXPORT_SYMBOL(bdev_thaw);
diff --git a/fs/super.c b/fs/super.c
index e54866345dc7..672f1837fbef 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1469,9 +1469,91 @@ static void fs_bdev_sync(struct block_device *bdev)
super_unlock_shared(sb);
}
+static struct super_block *get_bdev_super(const struct block_device *bdev)
+{
+ struct super_block *sb_bdev = bdev->bd_holder, *sb = NULL;
+
+ if (!sb_bdev)
+ return NULL;
+ if (super_lock_excl(sb_bdev) && atomic_inc_not_zero(&sb_bdev->s_active))
+ sb = sb_bdev;
+ super_unlock_excl(sb_bdev);
+ return sb;
+}
+
+static int fs_bdev_freeze(struct block_device *bdev)
+ __releases(&bdev->bd_holder_lock)
+{
+ struct super_block *sb;
+ int error = 0;
+
+ lockdep_assert_held(&bdev->bd_holder_lock);
+
+ sb = get_bdev_super(bdev);
+ if (sb) {
+ if (sb->s_op->freeze_super)
+ error = sb->s_op->freeze_super(sb, FREEZE_HOLDER_USERSPACE);
+ else
+ error = freeze_super(sb, FREEZE_HOLDER_USERSPACE);
+ if (error)
+ atomic_dec(&bdev->bd_fsfreeze_count);
+ }
+
+ /*
+ * We have grabbed an active reference which means that the
+ * superblock and the block device cannot go away. But we might
+ * end up holding the last reference and so end up shutting the
+ * superblock down. So drop @bdev->bd_holder_lock to avoid
+ * deadlocks with blkdev_put().
+ */
+ mutex_unlock(&bdev->bd_holder_lock);
+
+ if (sb)
+ deactivate_super(sb);
+
+ if (!error)
+ sync_blockdev(bdev);
+
+ return error;
+}
+
+static int fs_bdev_thaw(struct block_device *bdev)
+ __releases(&bdev->bd_holder_lock)
+{
+ struct super_block *sb;
+ int error;
+
+ lockdep_assert_held(&bdev->bd_holder_lock);
+
+ sb = get_bdev_super(bdev);
+ if (WARN_ON_ONCE(!sb))
+ return -EINVAL;
+
+ if (sb->s_op->thaw_super)
+ error = sb->s_op->thaw_super(sb, FREEZE_HOLDER_USERSPACE);
+ else
+ error = thaw_super(sb, FREEZE_HOLDER_USERSPACE);
+ if (error)
+ atomic_inc(&bdev->bd_fsfreeze_count);
+
+ /*
+ * We have grabbed an active reference which means that the
+ * superblock and the block device cannot go away. But we might
+ * end up holding the last reference and so end up shutting the
+ * superblock down. So drop @bdev->bd_holder_lock to avoid
+ * deadlocks with blkdev_put().
+ */
+ mutex_unlock(&bdev->bd_holder_lock);
+ deactivate_super(sb);
+
+ return error;
+}
+
const struct blk_holder_ops fs_holder_ops = {
.mark_dead = fs_bdev_mark_dead,
.sync = fs_bdev_sync,
+ .freeze = fs_bdev_freeze,
+ .thaw = fs_bdev_thaw,
};
EXPORT_SYMBOL_GPL(fs_holder_ops);
@@ -1499,15 +1581,10 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
}
/*
- * Until SB_BORN flag is set, there can be no active superblock
- * references and thus no filesystem freezing. get_active_super() will
- * just loop waiting for SB_BORN so even bdev_freeze() cannot proceed.
- *
- * It is enough to check bdev was not frozen before we set s_bdev.
+ * It is enough to check bdev was not frozen before we set
+ * s_bdev as freezing will wait until SB_BORN is set.
*/
- mutex_lock(&bdev->bd_fsfreeze_mutex);
- if (bdev->bd_fsfreeze_count > 0) {
- mutex_unlock(&bdev->bd_fsfreeze_mutex);
+ if (atomic_read(&bdev->bd_fsfreeze_count) > 0) {
if (fc)
warnf(fc, "%pg: Can't mount, blockdev is frozen", bdev);
blkdev_put(bdev, sb);
@@ -1519,7 +1596,6 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
if (bdev_stable_writes(bdev))
sb->s_iflags |= SB_I_STABLE_WRITES;
spin_unlock(&sb_lock);
- mutex_unlock(&bdev->bd_fsfreeze_mutex);
snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
shrinker_debugfs_rename(&sb->s_shrink, "sb-%s:%s", sb->s_type->name,
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d5c5e59ddbd2..88e1848b0869 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -57,7 +57,7 @@ struct block_device {
const struct blk_holder_ops *bd_holder_ops;
struct mutex bd_holder_lock;
/* The counter of freeze processes */
- int bd_fsfreeze_count;
+ atomic_t bd_fsfreeze_count;
int bd_holders;
struct kobject *bd_holder_dir;
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 4/7] fs: remove get_active_super()
2023-09-27 13:21 [PATCH 0/7] Implement freeze and thaw as holder operations Christian Brauner
` (2 preceding siblings ...)
2023-09-27 13:21 ` [PATCH 3/7] bdev: implement " Christian Brauner
@ 2023-09-27 13:21 ` Christian Brauner
2023-09-27 14:54 ` Darrick J. Wong
` (2 more replies)
2023-09-27 13:21 ` [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb} Christian Brauner
` (2 subsequent siblings)
6 siblings, 3 replies; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 13:21 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel, Darrick J. Wong, Christian Brauner
This function is now unused so remove it. One less function that uses
the global superblock list.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/super.c | 28 ----------------------------
include/linux/fs.h | 1 -
2 files changed, 29 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index 672f1837fbef..181ac8501301 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1016,34 +1016,6 @@ void iterate_supers_type(struct file_system_type *type,
EXPORT_SYMBOL(iterate_supers_type);
-/**
- * get_active_super - get an active reference to the superblock of a device
- * @bdev: device to get the superblock for
- *
- * Scans the superblock list and finds the superblock of the file system
- * mounted on the device given. Returns the superblock with an active
- * reference or %NULL if none was found.
- */
-struct super_block *get_active_super(struct block_device *bdev)
-{
- struct super_block *sb;
-
- if (!bdev)
- return NULL;
-
- spin_lock(&sb_lock);
- list_for_each_entry(sb, &super_blocks, s_list) {
- if (sb->s_bdev == bdev) {
- if (!grab_super(sb))
- return NULL;
- super_unlock_excl(sb);
- return sb;
- }
- }
- spin_unlock(&sb_lock);
- return NULL;
-}
-
struct super_block *user_get_super(dev_t dev, bool excl)
{
struct super_block *sb;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b528f063e8ff..ad0ddc10d560 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3052,7 +3052,6 @@ extern int vfs_readlink(struct dentry *, char __user *, int);
extern struct file_system_type *get_filesystem(struct file_system_type *fs);
extern void put_filesystem(struct file_system_type *fs);
extern struct file_system_type *get_fs_type(const char *name);
-extern struct super_block *get_active_super(struct block_device *bdev);
extern void drop_super(struct super_block *sb);
extern void drop_super_exclusive(struct super_block *sb);
extern void iterate_supers(void (*)(struct super_block *, void *), void *);
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb}
2023-09-27 13:21 [PATCH 0/7] Implement freeze and thaw as holder operations Christian Brauner
` (3 preceding siblings ...)
2023-09-27 13:21 ` [PATCH 4/7] fs: remove get_active_super() Christian Brauner
@ 2023-09-27 13:21 ` Christian Brauner
2023-09-27 15:11 ` Darrick J. Wong
2023-10-02 16:24 ` Jan Kara
2023-09-27 13:21 ` [PATCH 6/7] fs: remove unused helper Christian Brauner
2023-09-27 13:21 ` [PATCH 7/7] porting: document block device freeze and thaw changes Christian Brauner
6 siblings, 2 replies; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 13:21 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel, Darrick J. Wong, Christian Brauner
Both bd_fsfreeze_mutex and bd_fsfreeze_sb are now unused and can be
removed. Also move bd_fsfreeze_count down to not have it weirdly placed
in the middle of the holder fields.
Suggested-by: Jan Kara <jack@suse.cz>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
block/bdev.c | 1 -
include/linux/blk_types.h | 7 ++-----
2 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/block/bdev.c b/block/bdev.c
index 3deccd0ffcf2..084855b669f7 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -392,7 +392,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
mapping_set_gfp_mask(&inode->i_data, GFP_USER);
bdev = I_BDEV(inode);
- mutex_init(&bdev->bd_fsfreeze_mutex);
spin_lock_init(&bdev->bd_size_lock);
mutex_init(&bdev->bd_holder_lock);
bdev->bd_partno = partno;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 88e1848b0869..0238236852b7 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -56,14 +56,11 @@ struct block_device {
void * bd_holder;
const struct blk_holder_ops *bd_holder_ops;
struct mutex bd_holder_lock;
- /* The counter of freeze processes */
- atomic_t bd_fsfreeze_count;
int bd_holders;
struct kobject *bd_holder_dir;
- /* Mutex for freeze */
- struct mutex bd_fsfreeze_mutex;
- struct super_block *bd_fsfreeze_sb;
+ /* The counter of freeze processes */
+ atomic_t bd_fsfreeze_count;
struct partition_meta_info *bd_meta_info;
#ifdef CONFIG_FAIL_MAKE_REQUEST
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 6/7] fs: remove unused helper
2023-09-27 13:21 [PATCH 0/7] Implement freeze and thaw as holder operations Christian Brauner
` (4 preceding siblings ...)
2023-09-27 13:21 ` [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb} Christian Brauner
@ 2023-09-27 13:21 ` Christian Brauner
2023-09-27 15:12 ` Darrick J. Wong
` (2 more replies)
2023-09-27 13:21 ` [PATCH 7/7] porting: document block device freeze and thaw changes Christian Brauner
6 siblings, 3 replies; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 13:21 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel, Darrick J. Wong, Christian Brauner
The grab_super() helper is now only used by grab_super_dead(). Merge the
two helpers into one.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/super.c | 44 +++++++-------------------------------------
1 file changed, 7 insertions(+), 37 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index 181ac8501301..6cdce2b31622 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -517,35 +517,6 @@ void deactivate_super(struct super_block *s)
EXPORT_SYMBOL(deactivate_super);
-/**
- * grab_super - acquire an active reference
- * @s: reference we are trying to make active
- *
- * Tries to acquire an active reference. grab_super() is used when we
- * had just found a superblock in super_blocks or fs_type->fs_supers
- * and want to turn it into a full-blown active reference. grab_super()
- * is called with sb_lock held and drops it. Returns 1 in case of
- * success, 0 if we had failed (superblock contents was already dead or
- * dying when grab_super() had been called). Note that this is only
- * called for superblocks not in rundown mode (== ones still on ->fs_supers
- * of their type), so increment of ->s_count is OK here.
- */
-static int grab_super(struct super_block *s) __releases(sb_lock)
-{
- bool born;
-
- s->s_count++;
- spin_unlock(&sb_lock);
- born = super_lock_excl(s);
- if (born && atomic_inc_not_zero(&s->s_active)) {
- put_super(s);
- return 1;
- }
- super_unlock_excl(s);
- put_super(s);
- return 0;
-}
-
static inline bool wait_dead(struct super_block *sb)
{
unsigned int flags;
@@ -559,7 +530,7 @@ static inline bool wait_dead(struct super_block *sb)
}
/**
- * grab_super_dead - acquire an active reference to a superblock
+ * grab_super - acquire an active reference to a superblock
* @sb: superblock to acquire
*
* Acquire a temporary reference on a superblock and try to trade it for
@@ -570,17 +541,16 @@ static inline bool wait_dead(struct super_block *sb)
* Return: This returns true if an active reference could be acquired,
* false if not.
*/
-static bool grab_super_dead(struct super_block *sb)
+static bool grab_super(struct super_block *sb)
{
-
sb->s_count++;
- if (grab_super(sb)) {
+ spin_unlock(&sb_lock);
+ if (super_lock_excl(sb) && atomic_inc_not_zero(&sb->s_active)) {
put_super(sb);
- lockdep_assert_held(&sb->s_umount);
return true;
}
+ super_unlock_excl(sb);
wait_var_event(&sb->s_flags, wait_dead(sb));
- lockdep_assert_not_held(&sb->s_umount);
put_super(sb);
return false;
}
@@ -831,7 +801,7 @@ struct super_block *sget_fc(struct fs_context *fc,
warnfc(fc, "reusing existing filesystem in another namespace not allowed");
return ERR_PTR(-EBUSY);
}
- if (!grab_super_dead(old))
+ if (!grab_super(old))
goto retry;
destroy_unused_super(s);
return old;
@@ -875,7 +845,7 @@ struct super_block *sget(struct file_system_type *type,
destroy_unused_super(s);
return ERR_PTR(-EBUSY);
}
- if (!grab_super_dead(old))
+ if (!grab_super(old))
goto retry;
destroy_unused_super(s);
return old;
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 7/7] porting: document block device freeze and thaw changes
2023-09-27 13:21 [PATCH 0/7] Implement freeze and thaw as holder operations Christian Brauner
` (5 preceding siblings ...)
2023-09-27 13:21 ` [PATCH 6/7] fs: remove unused helper Christian Brauner
@ 2023-09-27 13:21 ` Christian Brauner
2023-09-27 15:19 ` Darrick J. Wong
6 siblings, 1 reply; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 13:21 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel, Darrick J. Wong, Christian Brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Documentation/filesystems/porting.rst | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index 4d05b9862451..fef97a2e6729 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1045,3 +1045,28 @@ filesystem type is now moved to a later point when the devices are closed:
As this is a VFS level change it has no practical consequences for filesystems
other than that all of them must use one of the provided kill_litter_super(),
kill_anon_super(), or kill_block_super() helpers.
+
+---
+
+**mandatory**
+
+Block device freezing and thawing have been moved to holder operations. As we
+can now go straight from block devcie to superblock the get_active_super()
+and bd_fsfreeze_sb members in struct block_device are gone.
+
+The bd_fsfreeze_mutex is gone as well since we can rely on the bd_holder_lock
+to protect against concurrent freeze and thaw.
+
+Before this change, get_active_super() would only be able to find the
+superblock of the main block device, i.e., the one stored in sb->s_bdev. Block
+device freezing now works for any block device owned by a given superblock, not
+just the main block device.
+
+When thawing we now grab an active reference so we can hold bd_holder_lock
+across thaw without the risk of deadlocks (because the superblock goes away
+which would require us to take bd_holder_lock). That allows us to get rid of
+bd_fsfreeze_mutex. Currently we just reacquire s_umount after thaw_super() and
+drop the active reference we took before. This someone could grab an active
+reference before we dropped the last one. This shouldn't be an issue. If it
+turns out to be one we can reshuffle the code to simply hold s_umount when
+thaw_super() returns and drop the reference.
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 1/7] bdev: rename freeze and thaw helpers
2023-09-27 13:21 ` [PATCH 1/7] bdev: rename freeze and thaw helpers Christian Brauner
@ 2023-09-27 14:35 ` Darrick J. Wong
2023-10-02 6:51 ` Christoph Hellwig
2023-10-02 11:28 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Darrick J. Wong @ 2023-09-27 14:35 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 03:21:14PM +0200, Christian Brauner wrote:
> We have bdev_mark_dead() etc and we're going to move block device
> freezing to holder ops in the next patch. Make the naming consistent:
>
> * freeze_bdev() -> bdev_freeze()
> * thaw_bdev() -> bdev_thaw()
>
> Also document the return code.
Yesssssssss
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> block/bdev.c | 22 +++++++++++++---------
> drivers/md/dm.c | 4 ++--
> fs/ext4/ioctl.c | 4 ++--
> fs/f2fs/file.c | 4 ++--
> fs/super.c | 4 ++--
> fs/xfs/xfs_fsops.c | 4 ++--
> include/linux/blkdev.h | 4 ++--
> 7 files changed, 25 insertions(+), 21 deletions(-)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index f3b13aa1b7d4..0d27db3e69e7 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -207,18 +207,20 @@ int sync_blockdev_range(struct block_device *bdev, loff_t lstart, loff_t lend)
> EXPORT_SYMBOL(sync_blockdev_range);
>
> /**
> - * freeze_bdev - lock a filesystem and force it into a consistent state
> + * bdev_freeze - lock a filesystem and force it into a consistent state
> * @bdev: blockdevice to lock
> *
> * If a superblock is found on this device, we take the s_umount semaphore
> * on it to make sure nobody unmounts until the snapshot creation is done.
> * The reference counter (bd_fsfreeze_count) guarantees that only the last
> * unfreeze process can unfreeze the frozen filesystem actually when multiple
> - * freeze requests arrive simultaneously. It counts up in freeze_bdev() and
> - * count down in thaw_bdev(). When it becomes 0, thaw_bdev() will unfreeze
> + * freeze requests arrive simultaneously. It counts up in bdev_freeze() and
> + * count down in bdev_thaw(). When it becomes 0, thaw_bdev() will unfreeze
> * actually.
> + *
> + * Return: On success zero is returned, negative error code on failure.
> */
> -int freeze_bdev(struct block_device *bdev)
> +int bdev_freeze(struct block_device *bdev)
> {
> struct super_block *sb;
> int error = 0;
> @@ -248,15 +250,17 @@ int freeze_bdev(struct block_device *bdev)
> mutex_unlock(&bdev->bd_fsfreeze_mutex);
> return error;
> }
> -EXPORT_SYMBOL(freeze_bdev);
> +EXPORT_SYMBOL(bdev_freeze);
>
> /**
> - * thaw_bdev - unlock filesystem
> + * bdev_thaw - unlock filesystem
> * @bdev: blockdevice to unlock
> *
> - * Unlocks the filesystem and marks it writeable again after freeze_bdev().
> + * Unlocks the filesystem and marks it writeable again after bdev_freeze().
> + *
> + * Return: On success zero is returned, negative error code on failure.
> */
> -int thaw_bdev(struct block_device *bdev)
> +int bdev_thaw(struct block_device *bdev)
> {
> struct super_block *sb;
> int error = -EINVAL;
> @@ -285,7 +289,7 @@ int thaw_bdev(struct block_device *bdev)
> mutex_unlock(&bdev->bd_fsfreeze_mutex);
> return error;
> }
> -EXPORT_SYMBOL(thaw_bdev);
> +EXPORT_SYMBOL(bdev_thaw);
>
> /*
> * pseudo-fs
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 64a1f306c96c..6fa309e8efb0 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -2648,7 +2648,7 @@ static int lock_fs(struct mapped_device *md)
>
> WARN_ON(test_bit(DMF_FROZEN, &md->flags));
>
> - r = freeze_bdev(md->disk->part0);
> + r = bdev_freeze(md->disk->part0);
> if (!r)
> set_bit(DMF_FROZEN, &md->flags);
> return r;
> @@ -2658,7 +2658,7 @@ static void unlock_fs(struct mapped_device *md)
> {
> if (!test_bit(DMF_FROZEN, &md->flags))
> return;
> - thaw_bdev(md->disk->part0);
> + bdev_thaw(md->disk->part0);
> clear_bit(DMF_FROZEN, &md->flags);
> }
>
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index 0bfe2ce589e2..c1390219c945 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -810,11 +810,11 @@ int ext4_force_shutdown(struct super_block *sb, u32 flags)
>
> switch (flags) {
> case EXT4_GOING_FLAGS_DEFAULT:
> - ret = freeze_bdev(sb->s_bdev);
> + ret = bdev_freeze(sb->s_bdev);
> if (ret)
> return ret;
> set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
> - thaw_bdev(sb->s_bdev);
> + bdev_thaw(sb->s_bdev);
> break;
> case EXT4_GOING_FLAGS_LOGFLUSH:
> set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index ca5904129b16..c22aeb9ffb61 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -2239,11 +2239,11 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
>
> switch (in) {
> case F2FS_GOING_DOWN_FULLSYNC:
> - ret = freeze_bdev(sb->s_bdev);
> + ret = bdev_freeze(sb->s_bdev);
> if (ret)
> goto out;
> f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_SHUTDOWN);
> - thaw_bdev(sb->s_bdev);
> + bdev_thaw(sb->s_bdev);
> break;
> case F2FS_GOING_DOWN_METASYNC:
> /* do checkpoint only */
> diff --git a/fs/super.c b/fs/super.c
> index 2d762ce67f6e..e54866345dc7 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1210,7 +1210,7 @@ static void do_thaw_all_callback(struct super_block *sb)
>
> if (born && sb->s_root) {
> if (IS_ENABLED(CONFIG_BLOCK))
> - while (sb->s_bdev && !thaw_bdev(sb->s_bdev))
> + while (sb->s_bdev && !bdev_thaw(sb->s_bdev))
> pr_warn("Emergency Thaw on %pg\n", sb->s_bdev);
> thaw_super_locked(sb, FREEZE_HOLDER_USERSPACE);
> } else {
> @@ -1501,7 +1501,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
> /*
> * Until SB_BORN flag is set, there can be no active superblock
> * references and thus no filesystem freezing. get_active_super() will
> - * just loop waiting for SB_BORN so even freeze_bdev() cannot proceed.
> + * just loop waiting for SB_BORN so even bdev_freeze() cannot proceed.
> *
> * It is enough to check bdev was not frozen before we set s_bdev.
> */
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 7cb75cb6b8e9..57076a25f17d 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -482,9 +482,9 @@ xfs_fs_goingdown(
> {
> switch (inflags) {
> case XFS_FSOP_GOING_FLAGS_DEFAULT: {
> - if (!freeze_bdev(mp->m_super->s_bdev)) {
> + if (!bdev_freeze(mp->m_super->s_bdev)) {
> xfs_force_shutdown(mp, SHUTDOWN_FORCE_UMOUNT);
> - thaw_bdev(mp->m_super->s_bdev);
> + bdev_thaw(mp->m_super->s_bdev);
> }
> break;
> }
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index eef450f25982..bf25b63e13d5 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1530,8 +1530,8 @@ static inline int early_lookup_bdev(const char *pathname, dev_t *dev)
> }
> #endif /* CONFIG_BLOCK */
>
> -int freeze_bdev(struct block_device *bdev);
> -int thaw_bdev(struct block_device *bdev);
> +int bdev_freeze(struct block_device *bdev);
> +int bdev_thaw(struct block_device *bdev);
>
> struct io_comp_batch {
> struct request *req_list;
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/7] bdev: add freeze and thaw holder operations
2023-09-27 13:21 ` [PATCH 2/7] bdev: add freeze and thaw holder operations Christian Brauner
@ 2023-09-27 14:38 ` Darrick J. Wong
2023-10-02 6:52 ` Christoph Hellwig
2023-10-02 16:32 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Darrick J. Wong @ 2023-09-27 14:38 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 03:21:15PM +0200, Christian Brauner wrote:
> Add block device freeze and thaw holder operations. Follow-up patches
> will implement block device freeze and thaw based on stuct
s/stuct/struct/
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> blk_holder_ops.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> include/linux/blkdev.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index bf25b63e13d5..f2ddccaaef4d 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1468,6 +1468,16 @@ struct blk_holder_ops {
> * Sync the file system mounted on the block device.
> */
> void (*sync)(struct block_device *bdev);
> +
> + /*
> + * Freeze the file system mounted on the block device.
> + */
> + int (*freeze)(struct block_device *bdev);
> +
> + /*
> + * Thaw the file system mounted on the block device.
> + */
> + int (*thaw)(struct block_device *bdev);
> };
>
> extern const struct blk_holder_ops fs_holder_ops;
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/7] bdev: implement freeze and thaw holder operations
2023-09-27 13:21 ` [PATCH 3/7] bdev: implement " Christian Brauner
@ 2023-09-27 14:53 ` Darrick J. Wong
2023-09-27 15:15 ` Christian Brauner
2023-10-02 7:10 ` Christoph Hellwig
1 sibling, 1 reply; 32+ messages in thread
From: Darrick J. Wong @ 2023-09-27 14:53 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 03:21:16PM +0200, Christian Brauner wrote:
> The old method of implementing block device freeze and thaw operations
> required us to rely on get_active_super() to walk the list of all
> superblocks on the system to find any superblock that might use the
> block device. This is wasteful and not very pleasant overall.
>
> Now that we can finally go straight from block device to owning
> superblock things become way simpler. In fact, we can even get rid of
> bd_fsfreeze_mutex and rely on the holder lock for freezing.
>
> We let fs_bdev_thaw() grab its own active reference so we can hold
> bd_holder_lock across freeze and thaw without risk of deadlocks. That in
> turn allows us to get rid of bd_fsfreeze_mutex. This means
> fs_bdev_thaw() might put the last reference to the superblock instead of
> thaw_super(). This shouldn't be an issue. If it turns out to be one we
> can reshuffle the code to simply hold s_umount when thaw_super() returns.
>
> Thanks to Jan and Christoph who pointed out that bd_fsfreeze_mutex is
> unnecessary now.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> block/bdev.c | 74 +++++++++++++++++--------------------
> fs/super.c | 94 ++++++++++++++++++++++++++++++++++++++++++-----
> include/linux/blk_types.h | 2 +-
> 3 files changed, 119 insertions(+), 51 deletions(-)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index 0d27db3e69e7..3deccd0ffcf2 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -222,32 +222,23 @@ EXPORT_SYMBOL(sync_blockdev_range);
> */
> int bdev_freeze(struct block_device *bdev)
> {
> - struct super_block *sb;
> int error = 0;
>
> - mutex_lock(&bdev->bd_fsfreeze_mutex);
> - if (++bdev->bd_fsfreeze_count > 1)
> - goto done;
> + mutex_lock(&bdev->bd_holder_lock);
>
> - sb = get_active_super(bdev);
> - if (!sb)
> - goto sync;
> - if (sb->s_op->freeze_super)
> - error = sb->s_op->freeze_super(sb, FREEZE_HOLDER_USERSPACE);
> - else
> - error = freeze_super(sb, FREEZE_HOLDER_USERSPACE);
> - deactivate_super(sb);
> + if (atomic_inc_return(&bdev->bd_fsfreeze_count) > 1) {
> + mutex_unlock(&bdev->bd_holder_lock);
> + return 0;
> + }
>
> - if (error) {
> - bdev->bd_fsfreeze_count--;
> - goto done;
> + if (bdev->bd_holder_ops && bdev->bd_holder_ops->freeze) {
> + error = bdev->bd_holder_ops->freeze(bdev);
> + lockdep_assert_not_held(&bdev->bd_holder_lock);
> + } else {
> + sync_blockdev(bdev);
Why ignore the return value from sync_blockdev? It calls
filemap_write_and_wait, which clears AS_EIO/AS_ENOSPC from the bdev
mapping, which means that we'll silently drop accumulated IO errors.
> + mutex_unlock(&bdev->bd_holder_lock);
Also not sure why this fallback case holds bd_holder_lock across the
sync_blockdev but fs_bdev_freeze doesn't?
> }
> - bdev->bd_fsfreeze_sb = sb;
>
> -sync:
> - sync_blockdev(bdev);
> -done:
> - mutex_unlock(&bdev->bd_fsfreeze_mutex);
> return error;
> }
> EXPORT_SYMBOL(bdev_freeze);
> @@ -262,31 +253,32 @@ EXPORT_SYMBOL(bdev_freeze);
> */
> int bdev_thaw(struct block_device *bdev)
> {
> - struct super_block *sb;
> - int error = -EINVAL;
> + int error = 0, nr_freeze;
>
> - mutex_lock(&bdev->bd_fsfreeze_mutex);
> - if (!bdev->bd_fsfreeze_count)
> - goto out;
> + mutex_lock(&bdev->bd_holder_lock);
>
> - error = 0;
> - if (--bdev->bd_fsfreeze_count > 0)
> - goto out;
> + /*
> + * If this returns < 0 it means that @bd_fsfreeze_count was
> + * already 0 and no decrement was performed.
> + */
> + nr_freeze = atomic_dec_if_positive(&bdev->bd_fsfreeze_count);
> + if (nr_freeze < 0) {
> + mutex_unlock(&bdev->bd_holder_lock);
> + return -EINVAL;
> + }
>
> - sb = bdev->bd_fsfreeze_sb;
> - if (!sb)
> - goto out;
> + if (nr_freeze > 0) {
> + mutex_unlock(&bdev->bd_holder_lock);
> + return 0;
> + }
> +
> + if (bdev->bd_holder_ops && bdev->bd_holder_ops->thaw) {
> + error = bdev->bd_holder_ops->thaw(bdev);
> + lockdep_assert_not_held(&bdev->bd_holder_lock);
> + } else {
> + mutex_unlock(&bdev->bd_holder_lock);
> + }
>
> - if (sb->s_op->thaw_super)
> - error = sb->s_op->thaw_super(sb, FREEZE_HOLDER_USERSPACE);
> - else
> - error = thaw_super(sb, FREEZE_HOLDER_USERSPACE);
> - if (error)
> - bdev->bd_fsfreeze_count++;
> - else
> - bdev->bd_fsfreeze_sb = NULL;
> -out:
> - mutex_unlock(&bdev->bd_fsfreeze_mutex);
> return error;
> }
> EXPORT_SYMBOL(bdev_thaw);
> diff --git a/fs/super.c b/fs/super.c
> index e54866345dc7..672f1837fbef 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1469,9 +1469,91 @@ static void fs_bdev_sync(struct block_device *bdev)
> super_unlock_shared(sb);
> }
>
> +static struct super_block *get_bdev_super(const struct block_device *bdev)
> +{
> + struct super_block *sb_bdev = bdev->bd_holder, *sb = NULL;
> +
> + if (!sb_bdev)
> + return NULL;
> + if (super_lock_excl(sb_bdev) && atomic_inc_not_zero(&sb_bdev->s_active))
> + sb = sb_bdev;
> + super_unlock_excl(sb_bdev);
> + return sb;
> +}
> +
> +static int fs_bdev_freeze(struct block_device *bdev)
> + __releases(&bdev->bd_holder_lock)
> +{
> + struct super_block *sb;
> + int error = 0;
> +
> + lockdep_assert_held(&bdev->bd_holder_lock);
> +
> + sb = get_bdev_super(bdev);
> + if (sb) {
> + if (sb->s_op->freeze_super)
> + error = sb->s_op->freeze_super(sb, FREEZE_HOLDER_USERSPACE);
> + else
> + error = freeze_super(sb, FREEZE_HOLDER_USERSPACE);
> + if (error)
> + atomic_dec(&bdev->bd_fsfreeze_count);
> + }
> +
> + /*
> + * We have grabbed an active reference which means that the
> + * superblock and the block device cannot go away. But we might
> + * end up holding the last reference and so end up shutting the
> + * superblock down. So drop @bdev->bd_holder_lock to avoid
> + * deadlocks with blkdev_put().
> + */
> + mutex_unlock(&bdev->bd_holder_lock);
> +
> + if (sb)
> + deactivate_super(sb);
> +
> + if (!error)
> + sync_blockdev(bdev);
Same question here about ignoring the return value.
(Everything after this point looks ok to me.)
--D
> +
> + return error;
> +}
> +
> +static int fs_bdev_thaw(struct block_device *bdev)
> + __releases(&bdev->bd_holder_lock)
> +{
> + struct super_block *sb;
> + int error;
> +
> + lockdep_assert_held(&bdev->bd_holder_lock);
> +
> + sb = get_bdev_super(bdev);
> + if (WARN_ON_ONCE(!sb))
> + return -EINVAL;
> +
> + if (sb->s_op->thaw_super)
> + error = sb->s_op->thaw_super(sb, FREEZE_HOLDER_USERSPACE);
> + else
> + error = thaw_super(sb, FREEZE_HOLDER_USERSPACE);
> + if (error)
> + atomic_inc(&bdev->bd_fsfreeze_count);
> +
> + /*
> + * We have grabbed an active reference which means that the
> + * superblock and the block device cannot go away. But we might
> + * end up holding the last reference and so end up shutting the
> + * superblock down. So drop @bdev->bd_holder_lock to avoid
> + * deadlocks with blkdev_put().
> + */
> + mutex_unlock(&bdev->bd_holder_lock);
> + deactivate_super(sb);
> +
> + return error;
> +}
> +
> const struct blk_holder_ops fs_holder_ops = {
> .mark_dead = fs_bdev_mark_dead,
> .sync = fs_bdev_sync,
> + .freeze = fs_bdev_freeze,
> + .thaw = fs_bdev_thaw,
> };
> EXPORT_SYMBOL_GPL(fs_holder_ops);
>
> @@ -1499,15 +1581,10 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
> }
>
> /*
> - * Until SB_BORN flag is set, there can be no active superblock
> - * references and thus no filesystem freezing. get_active_super() will
> - * just loop waiting for SB_BORN so even bdev_freeze() cannot proceed.
> - *
> - * It is enough to check bdev was not frozen before we set s_bdev.
> + * It is enough to check bdev was not frozen before we set
> + * s_bdev as freezing will wait until SB_BORN is set.
> */
> - mutex_lock(&bdev->bd_fsfreeze_mutex);
> - if (bdev->bd_fsfreeze_count > 0) {
> - mutex_unlock(&bdev->bd_fsfreeze_mutex);
> + if (atomic_read(&bdev->bd_fsfreeze_count) > 0) {
> if (fc)
> warnf(fc, "%pg: Can't mount, blockdev is frozen", bdev);
> blkdev_put(bdev, sb);
> @@ -1519,7 +1596,6 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
> if (bdev_stable_writes(bdev))
> sb->s_iflags |= SB_I_STABLE_WRITES;
> spin_unlock(&sb_lock);
> - mutex_unlock(&bdev->bd_fsfreeze_mutex);
>
> snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
> shrinker_debugfs_rename(&sb->s_shrink, "sb-%s:%s", sb->s_type->name,
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index d5c5e59ddbd2..88e1848b0869 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -57,7 +57,7 @@ struct block_device {
> const struct blk_holder_ops *bd_holder_ops;
> struct mutex bd_holder_lock;
> /* The counter of freeze processes */
> - int bd_fsfreeze_count;
> + atomic_t bd_fsfreeze_count;
> int bd_holders;
> struct kobject *bd_holder_dir;
>
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/7] fs: remove get_active_super()
2023-09-27 13:21 ` [PATCH 4/7] fs: remove get_active_super() Christian Brauner
@ 2023-09-27 14:54 ` Darrick J. Wong
2023-10-02 7:10 ` Christoph Hellwig
2023-10-02 16:22 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Darrick J. Wong @ 2023-09-27 14:54 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 03:21:17PM +0200, Christian Brauner wrote:
> This function is now unused so remove it. One less function that uses
> the global superblock list.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> ---
> fs/super.c | 28 ----------------------------
> include/linux/fs.h | 1 -
> 2 files changed, 29 deletions(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 672f1837fbef..181ac8501301 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1016,34 +1016,6 @@ void iterate_supers_type(struct file_system_type *type,
>
> EXPORT_SYMBOL(iterate_supers_type);
>
> -/**
> - * get_active_super - get an active reference to the superblock of a device
> - * @bdev: device to get the superblock for
> - *
> - * Scans the superblock list and finds the superblock of the file system
> - * mounted on the device given. Returns the superblock with an active
> - * reference or %NULL if none was found.
> - */
> -struct super_block *get_active_super(struct block_device *bdev)
> -{
> - struct super_block *sb;
> -
> - if (!bdev)
> - return NULL;
> -
> - spin_lock(&sb_lock);
> - list_for_each_entry(sb, &super_blocks, s_list) {
> - if (sb->s_bdev == bdev) {
> - if (!grab_super(sb))
> - return NULL;
> - super_unlock_excl(sb);
> - return sb;
> - }
> - }
> - spin_unlock(&sb_lock);
> - return NULL;
> -}
> -
> struct super_block *user_get_super(dev_t dev, bool excl)
> {
> struct super_block *sb;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b528f063e8ff..ad0ddc10d560 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3052,7 +3052,6 @@ extern int vfs_readlink(struct dentry *, char __user *, int);
> extern struct file_system_type *get_filesystem(struct file_system_type *fs);
> extern void put_filesystem(struct file_system_type *fs);
> extern struct file_system_type *get_fs_type(const char *name);
> -extern struct super_block *get_active_super(struct block_device *bdev);
> extern void drop_super(struct super_block *sb);
> extern void drop_super_exclusive(struct super_block *sb);
> extern void iterate_supers(void (*)(struct super_block *, void *), void *);
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb}
2023-09-27 13:21 ` [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb} Christian Brauner
@ 2023-09-27 15:11 ` Darrick J. Wong
2023-09-27 15:18 ` Christian Brauner
2023-10-02 7:12 ` Christoph Hellwig
2023-10-02 16:24 ` Jan Kara
1 sibling, 2 replies; 32+ messages in thread
From: Darrick J. Wong @ 2023-09-27 15:11 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 03:21:18PM +0200, Christian Brauner wrote:
> Both bd_fsfreeze_mutex and bd_fsfreeze_sb are now unused and can be
> removed. Also move bd_fsfreeze_count down to not have it weirdly placed
> in the middle of the holder fields.
>
> Suggested-by: Jan Kara <jack@suse.cz>
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> block/bdev.c | 1 -
> include/linux/blk_types.h | 7 ++-----
> 2 files changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index 3deccd0ffcf2..084855b669f7 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -392,7 +392,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
> mapping_set_gfp_mask(&inode->i_data, GFP_USER);
>
> bdev = I_BDEV(inode);
> - mutex_init(&bdev->bd_fsfreeze_mutex);
> spin_lock_init(&bdev->bd_size_lock);
> mutex_init(&bdev->bd_holder_lock);
> bdev->bd_partno = partno;
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index 88e1848b0869..0238236852b7 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -56,14 +56,11 @@ struct block_device {
> void * bd_holder;
Hmmm. get_bdev_super from patch 3 now requires that bd_holder is a
pointer to a struct super_block. AFAICT it's only called in conjunction
with fs_holder_ops, so I suggest that the declaration for that should
grow a comment to that effect:
/*
* For filesystems, the @holder argument passed to blkdev_get_* and
* bd_prepare_to_claim must point to a super_block object.
*/
extern const struct blk_holder_ops fs_holder_ops;
This patch itself looks ok so:
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> const struct blk_holder_ops *bd_holder_ops;
> struct mutex bd_holder_lock;
> - /* The counter of freeze processes */
> - atomic_t bd_fsfreeze_count;
> int bd_holders;
> struct kobject *bd_holder_dir;
>
> - /* Mutex for freeze */
> - struct mutex bd_fsfreeze_mutex;
> - struct super_block *bd_fsfreeze_sb;
> + /* The counter of freeze processes */
> + atomic_t bd_fsfreeze_count;
>
> struct partition_meta_info *bd_meta_info;
> #ifdef CONFIG_FAIL_MAKE_REQUEST
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 6/7] fs: remove unused helper
2023-09-27 13:21 ` [PATCH 6/7] fs: remove unused helper Christian Brauner
@ 2023-09-27 15:12 ` Darrick J. Wong
2023-10-02 7:12 ` Christoph Hellwig
2023-10-02 16:26 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Darrick J. Wong @ 2023-09-27 15:12 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 03:21:19PM +0200, Christian Brauner wrote:
> The grab_super() helper is now only used by grab_super_dead(). Merge the
> two helpers into one.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Yup, nice cleanup...
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> ---
> fs/super.c | 44 +++++++-------------------------------------
> 1 file changed, 7 insertions(+), 37 deletions(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 181ac8501301..6cdce2b31622 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -517,35 +517,6 @@ void deactivate_super(struct super_block *s)
>
> EXPORT_SYMBOL(deactivate_super);
>
> -/**
> - * grab_super - acquire an active reference
> - * @s: reference we are trying to make active
> - *
> - * Tries to acquire an active reference. grab_super() is used when we
> - * had just found a superblock in super_blocks or fs_type->fs_supers
> - * and want to turn it into a full-blown active reference. grab_super()
> - * is called with sb_lock held and drops it. Returns 1 in case of
> - * success, 0 if we had failed (superblock contents was already dead or
> - * dying when grab_super() had been called). Note that this is only
> - * called for superblocks not in rundown mode (== ones still on ->fs_supers
> - * of their type), so increment of ->s_count is OK here.
> - */
> -static int grab_super(struct super_block *s) __releases(sb_lock)
> -{
> - bool born;
> -
> - s->s_count++;
> - spin_unlock(&sb_lock);
> - born = super_lock_excl(s);
> - if (born && atomic_inc_not_zero(&s->s_active)) {
> - put_super(s);
> - return 1;
> - }
> - super_unlock_excl(s);
> - put_super(s);
> - return 0;
> -}
> -
> static inline bool wait_dead(struct super_block *sb)
> {
> unsigned int flags;
> @@ -559,7 +530,7 @@ static inline bool wait_dead(struct super_block *sb)
> }
>
> /**
> - * grab_super_dead - acquire an active reference to a superblock
> + * grab_super - acquire an active reference to a superblock
> * @sb: superblock to acquire
> *
> * Acquire a temporary reference on a superblock and try to trade it for
> @@ -570,17 +541,16 @@ static inline bool wait_dead(struct super_block *sb)
> * Return: This returns true if an active reference could be acquired,
> * false if not.
> */
> -static bool grab_super_dead(struct super_block *sb)
> +static bool grab_super(struct super_block *sb)
> {
> -
> sb->s_count++;
> - if (grab_super(sb)) {
> + spin_unlock(&sb_lock);
> + if (super_lock_excl(sb) && atomic_inc_not_zero(&sb->s_active)) {
> put_super(sb);
> - lockdep_assert_held(&sb->s_umount);
> return true;
> }
> + super_unlock_excl(sb);
> wait_var_event(&sb->s_flags, wait_dead(sb));
> - lockdep_assert_not_held(&sb->s_umount);
> put_super(sb);
> return false;
> }
> @@ -831,7 +801,7 @@ struct super_block *sget_fc(struct fs_context *fc,
> warnfc(fc, "reusing existing filesystem in another namespace not allowed");
> return ERR_PTR(-EBUSY);
> }
> - if (!grab_super_dead(old))
> + if (!grab_super(old))
> goto retry;
> destroy_unused_super(s);
> return old;
> @@ -875,7 +845,7 @@ struct super_block *sget(struct file_system_type *type,
> destroy_unused_super(s);
> return ERR_PTR(-EBUSY);
> }
> - if (!grab_super_dead(old))
> + if (!grab_super(old))
> goto retry;
> destroy_unused_super(s);
> return old;
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/7] bdev: implement freeze and thaw holder operations
2023-09-27 14:53 ` Darrick J. Wong
@ 2023-09-27 15:15 ` Christian Brauner
2023-09-27 16:01 ` Darrick J. Wong
0 siblings, 1 reply; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 15:15 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
> > + sync_blockdev(bdev);
>
> Why ignore the return value from sync_blockdev? It calls
> filemap_write_and_wait, which clears AS_EIO/AS_ENOSPC from the bdev
> mapping, which means that we'll silently drop accumulated IO errors.
Because freeze_bdev() has always ignored sync_blockdev() errors so far
and I'm not sure what we'd do with that error. We can report it but we
might confuse callers that think that the freeze failed when it hasn't.
>
> > + mutex_unlock(&bdev->bd_holder_lock);
>
> Also not sure why this fallback case holds bd_holder_lock across the
> sync_blockdev but fs_bdev_freeze doesn't?
I'll massage that to be consistent. Thanks!
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb}
2023-09-27 15:11 ` Darrick J. Wong
@ 2023-09-27 15:18 ` Christian Brauner
2023-10-02 7:12 ` Christoph Hellwig
1 sibling, 0 replies; 32+ messages in thread
From: Christian Brauner @ 2023-09-27 15:18 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 08:11:11AM -0700, Darrick J. Wong wrote:
> On Wed, Sep 27, 2023 at 03:21:18PM +0200, Christian Brauner wrote:
> > Both bd_fsfreeze_mutex and bd_fsfreeze_sb are now unused and can be
> > removed. Also move bd_fsfreeze_count down to not have it weirdly placed
> > in the middle of the holder fields.
> >
> > Suggested-by: Jan Kara <jack@suse.cz>
> > Suggested-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> > block/bdev.c | 1 -
> > include/linux/blk_types.h | 7 ++-----
> > 2 files changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/block/bdev.c b/block/bdev.c
> > index 3deccd0ffcf2..084855b669f7 100644
> > --- a/block/bdev.c
> > +++ b/block/bdev.c
> > @@ -392,7 +392,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
> > mapping_set_gfp_mask(&inode->i_data, GFP_USER);
> >
> > bdev = I_BDEV(inode);
> > - mutex_init(&bdev->bd_fsfreeze_mutex);
> > spin_lock_init(&bdev->bd_size_lock);
> > mutex_init(&bdev->bd_holder_lock);
> > bdev->bd_partno = partno;
> > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> > index 88e1848b0869..0238236852b7 100644
> > --- a/include/linux/blk_types.h
> > +++ b/include/linux/blk_types.h
> > @@ -56,14 +56,11 @@ struct block_device {
> > void * bd_holder;
>
> Hmmm. get_bdev_super from patch 3 now requires that bd_holder is a
> pointer to a struct super_block. AFAICT it's only called in conjunction
Yeah, it's documented in Documentations/filesystems/porting.rst as of
060e6c7d179e ("porting: document superblock as block device holder")
which tries to explain differences between the old and new world in
detail.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 7/7] porting: document block device freeze and thaw changes
2023-09-27 13:21 ` [PATCH 7/7] porting: document block device freeze and thaw changes Christian Brauner
@ 2023-09-27 15:19 ` Darrick J. Wong
2023-10-02 16:45 ` Jan Kara
0 siblings, 1 reply; 32+ messages in thread
From: Darrick J. Wong @ 2023-09-27 15:19 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 03:21:20PM +0200, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> Documentation/filesystems/porting.rst | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
> index 4d05b9862451..fef97a2e6729 100644
> --- a/Documentation/filesystems/porting.rst
> +++ b/Documentation/filesystems/porting.rst
> @@ -1045,3 +1045,28 @@ filesystem type is now moved to a later point when the devices are closed:
> As this is a VFS level change it has no practical consequences for filesystems
> other than that all of them must use one of the provided kill_litter_super(),
> kill_anon_super(), or kill_block_super() helpers.
> +
> +---
> +
> +**mandatory**
> +
> +Block device freezing and thawing have been moved to holder operations. As we
> +can now go straight from block devcie to superblock the get_active_super()
s/devcie/device/
> +and bd_fsfreeze_sb members in struct block_device are gone.
> +
> +The bd_fsfreeze_mutex is gone as well since we can rely on the bd_holder_lock
> +to protect against concurrent freeze and thaw.
> +
> +Before this change, get_active_super() would only be able to find the
> +superblock of the main block device, i.e., the one stored in sb->s_bdev. Block
> +device freezing now works for any block device owned by a given superblock, not
> +just the main block device.
You might want to document this new fs_holder_ops scheme:
"Filesystems opening a block device must pass the super_block object
and fs_holder_ops as the @holder and @hops parameters."
Though TBH I see a surprising amount of fs code that doesn't do this, so
perhaps it's not so mandatory?
--D
> +
> +When thawing we now grab an active reference so we can hold bd_holder_lock
> +across thaw without the risk of deadlocks (because the superblock goes away
> +which would require us to take bd_holder_lock). That allows us to get rid of
> +bd_fsfreeze_mutex. Currently we just reacquire s_umount after thaw_super() and
> +drop the active reference we took before. This someone could grab an active
> +reference before we dropped the last one. This shouldn't be an issue. If it
> +turns out to be one we can reshuffle the code to simply hold s_umount when
> +thaw_super() returns and drop the reference.
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/7] bdev: implement freeze and thaw holder operations
2023-09-27 15:15 ` Christian Brauner
@ 2023-09-27 16:01 ` Darrick J. Wong
2023-10-02 6:54 ` Christoph Hellwig
0 siblings, 1 reply; 32+ messages in thread
From: Darrick J. Wong @ 2023-09-27 16:01 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 05:15:28PM +0200, Christian Brauner wrote:
> > > + sync_blockdev(bdev);
> >
> > Why ignore the return value from sync_blockdev? It calls
> > filemap_write_and_wait, which clears AS_EIO/AS_ENOSPC from the bdev
> > mapping, which means that we'll silently drop accumulated IO errors.
>
> Because freeze_bdev() has always ignored sync_blockdev() errors so far
> and I'm not sure what we'd do with that error. We can report it but we
> might confuse callers that think that the freeze failed when it hasn't.
Thinking about this some more...
I got confused that fs_bdev_freeze drops the bd_fsfreeze_count if
freeze_super fails. But I guess that has to get done before letting go
of bd_holder_lock.
For the bdev->bd_holder_ops == fs_holder_ops case, the freeze_super call
will call sync_filesystem, which calls sync_blockdev. If that fails,
the fsfreeze aborts, and the bdev freeze (at least with the old code)
would also abort.
For the !bdev->bd_holder_ops case, why not capture the sync_blockdev
error code and decrement bd_fsfreeze_count if the sync failed? Then
this function either returns 0 with the fs and bdev frozen; or an error
code and nothing frozen.
(Also, does this mean that the new sync_blockdev call at the bottom of
fs_bdev_freeze isn't necessary? Filesystems that do IO in ->freeze_fs
should be flushing the block device.)
--D
> >
> > > + mutex_unlock(&bdev->bd_holder_lock);
> >
> > Also not sure why this fallback case holds bd_holder_lock across the
> > sync_blockdev but fs_bdev_freeze doesn't?
>
> I'll massage that to be consistent. Thanks!
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/7] bdev: rename freeze and thaw helpers
2023-09-27 13:21 ` [PATCH 1/7] bdev: rename freeze and thaw helpers Christian Brauner
2023-09-27 14:35 ` Darrick J. Wong
@ 2023-10-02 6:51 ` Christoph Hellwig
2023-10-02 11:28 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2023-10-02 6:51 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/7] bdev: add freeze and thaw holder operations
2023-09-27 13:21 ` [PATCH 2/7] bdev: add freeze and thaw holder operations Christian Brauner
2023-09-27 14:38 ` Darrick J. Wong
@ 2023-10-02 6:52 ` Christoph Hellwig
2023-10-02 16:32 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2023-10-02 6:52 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
On Wed, Sep 27, 2023 at 03:21:15PM +0200, Christian Brauner wrote:
> Add block device freeze and thaw holder operations. Follow-up patches
> will implement block device freeze and thaw based on stuct
> blk_holder_ops.
The changes looks good, but having them as a standalone patch is
a bit silly. I'd merge it into the first patch actually doing
something with the methods.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/7] bdev: implement freeze and thaw holder operations
2023-09-27 16:01 ` Darrick J. Wong
@ 2023-10-02 6:54 ` Christoph Hellwig
0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2023-10-02 6:54 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Christian Brauner, Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 09:01:42AM -0700, Darrick J. Wong wrote:
>
> For the bdev->bd_holder_ops == fs_holder_ops case, the freeze_super call
> will call sync_filesystem, which calls sync_blockdev. If that fails,
> the fsfreeze aborts, and the bdev freeze (at least with the old code)
> would also abort.
>
> For the !bdev->bd_holder_ops case, why not capture the sync_blockdev
> error code and decrement bd_fsfreeze_count if the sync failed? Then
> this function either returns 0 with the fs and bdev frozen; or an error
> code and nothing frozen.
Yes, even if that is a behavior change it would be a lot more consistent.
Maybe do the capturing of the error code as a prep patch so that it
is clearly bisectable.
> (Also, does this mean that the new sync_blockdev call at the bottom of
> fs_bdev_freeze isn't necessary? Filesystems that do IO in ->freeze_fs
> should be flushing the block device.)
Various methods including freeze do the sync_blockdev unconditionally.
I think this is a bad idea and should be moved into the file systems,
but I don't think this is in scope for this series.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/7] bdev: implement freeze and thaw holder operations
2023-09-27 13:21 ` [PATCH 3/7] bdev: implement " Christian Brauner
2023-09-27 14:53 ` Darrick J. Wong
@ 2023-10-02 7:10 ` Christoph Hellwig
1 sibling, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2023-10-02 7:10 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
>
> +static struct super_block *get_bdev_super(const struct block_device *bdev)
> +{
> + struct super_block *sb_bdev = bdev->bd_holder, *sb = NULL;
> +
> + if (!sb_bdev)
> + return NULL;
> + if (super_lock_excl(sb_bdev) && atomic_inc_not_zero(&sb_bdev->s_active))
> + sb = sb_bdev;
> + super_unlock_excl(sb_bdev);
> + return sb;
I find the flow here a bit confusing, because to me sb_bdev implies
the super_block of the bdev fs, and because the super_lock_excl calling
convention that always locks no matter of the return value is very
confusing. Maybe at least rename sb_bdev to holder_bdev?
> +static int fs_bdev_freeze(struct block_device *bdev)
> + __releases(&bdev->bd_holder_lock)
> +{
> + struct super_block *sb;
> + int error = 0;
> +
> + lockdep_assert_held(&bdev->bd_holder_lock);
> +
> + sb = get_bdev_super(bdev);
> + if (sb) {
We always have a sb in bdev->bd_holder. So the only way get_bdev_super
could return NULL is if we can't get an active reference or the fs is
marked as SB_DYING. I think the best is to just give up and not even
bother with the sync, which is going to cause more problems than it
could help. I think we're better off just straight returning an
error here, and don't bother with the sync_blockdev.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/7] fs: remove get_active_super()
2023-09-27 13:21 ` [PATCH 4/7] fs: remove get_active_super() Christian Brauner
2023-09-27 14:54 ` Darrick J. Wong
@ 2023-10-02 7:10 ` Christoph Hellwig
2023-10-02 16:22 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2023-10-02 7:10 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb}
2023-09-27 15:11 ` Darrick J. Wong
2023-09-27 15:18 ` Christian Brauner
@ 2023-10-02 7:12 ` Christoph Hellwig
1 sibling, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2023-10-02 7:12 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Christian Brauner, Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed, Sep 27, 2023 at 08:11:11AM -0700, Darrick J. Wong wrote:
> > @@ -56,14 +56,11 @@ struct block_device {
> > void * bd_holder;
>
> Hmmm. get_bdev_super from patch 3 now requires that bd_holder is a
> pointer to a struct super_block. AFAICT it's only called in conjunction
> with fs_holder_ops, so I suggest that the declaration for that should
> grow a comment to that effect:
>
> /*
> * For filesystems, the @holder argument passed to blkdev_get_* and
> * bd_prepare_to_claim must point to a super_block object.
> */
> extern const struct blk_holder_ops fs_holder_ops;
Note that this is not strictly speaking true, it is only true for
file systems actually using fs_holder_ops, not file systems in general.
That being said a comment to that extent is indeed useful.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 6/7] fs: remove unused helper
2023-09-27 13:21 ` [PATCH 6/7] fs: remove unused helper Christian Brauner
2023-09-27 15:12 ` Darrick J. Wong
@ 2023-10-02 7:12 ` Christoph Hellwig
2023-10-02 16:26 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2023-10-02 7:12 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/7] bdev: rename freeze and thaw helpers
2023-09-27 13:21 ` [PATCH 1/7] bdev: rename freeze and thaw helpers Christian Brauner
2023-09-27 14:35 ` Darrick J. Wong
2023-10-02 6:51 ` Christoph Hellwig
@ 2023-10-02 11:28 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Jan Kara @ 2023-10-02 11:28 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
On Wed 27-09-23 15:21:14, Christian Brauner wrote:
> We have bdev_mark_dead() etc and we're going to move block device
> freezing to holder ops in the next patch. Make the naming consistent:
>
> * freeze_bdev() -> bdev_freeze()
> * thaw_bdev() -> bdev_thaw()
>
> Also document the return code.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> block/bdev.c | 22 +++++++++++++---------
> drivers/md/dm.c | 4 ++--
> fs/ext4/ioctl.c | 4 ++--
> fs/f2fs/file.c | 4 ++--
> fs/super.c | 4 ++--
> fs/xfs/xfs_fsops.c | 4 ++--
> include/linux/blkdev.h | 4 ++--
> 7 files changed, 25 insertions(+), 21 deletions(-)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index f3b13aa1b7d4..0d27db3e69e7 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -207,18 +207,20 @@ int sync_blockdev_range(struct block_device *bdev, loff_t lstart, loff_t lend)
> EXPORT_SYMBOL(sync_blockdev_range);
>
> /**
> - * freeze_bdev - lock a filesystem and force it into a consistent state
> + * bdev_freeze - lock a filesystem and force it into a consistent state
> * @bdev: blockdevice to lock
> *
> * If a superblock is found on this device, we take the s_umount semaphore
> * on it to make sure nobody unmounts until the snapshot creation is done.
> * The reference counter (bd_fsfreeze_count) guarantees that only the last
> * unfreeze process can unfreeze the frozen filesystem actually when multiple
> - * freeze requests arrive simultaneously. It counts up in freeze_bdev() and
> - * count down in thaw_bdev(). When it becomes 0, thaw_bdev() will unfreeze
> + * freeze requests arrive simultaneously. It counts up in bdev_freeze() and
> + * count down in bdev_thaw(). When it becomes 0, thaw_bdev() will unfreeze
> * actually.
> + *
> + * Return: On success zero is returned, negative error code on failure.
> */
> -int freeze_bdev(struct block_device *bdev)
> +int bdev_freeze(struct block_device *bdev)
> {
> struct super_block *sb;
> int error = 0;
> @@ -248,15 +250,17 @@ int freeze_bdev(struct block_device *bdev)
> mutex_unlock(&bdev->bd_fsfreeze_mutex);
> return error;
> }
> -EXPORT_SYMBOL(freeze_bdev);
> +EXPORT_SYMBOL(bdev_freeze);
>
> /**
> - * thaw_bdev - unlock filesystem
> + * bdev_thaw - unlock filesystem
> * @bdev: blockdevice to unlock
> *
> - * Unlocks the filesystem and marks it writeable again after freeze_bdev().
> + * Unlocks the filesystem and marks it writeable again after bdev_freeze().
> + *
> + * Return: On success zero is returned, negative error code on failure.
> */
> -int thaw_bdev(struct block_device *bdev)
> +int bdev_thaw(struct block_device *bdev)
> {
> struct super_block *sb;
> int error = -EINVAL;
> @@ -285,7 +289,7 @@ int thaw_bdev(struct block_device *bdev)
> mutex_unlock(&bdev->bd_fsfreeze_mutex);
> return error;
> }
> -EXPORT_SYMBOL(thaw_bdev);
> +EXPORT_SYMBOL(bdev_thaw);
>
> /*
> * pseudo-fs
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 64a1f306c96c..6fa309e8efb0 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -2648,7 +2648,7 @@ static int lock_fs(struct mapped_device *md)
>
> WARN_ON(test_bit(DMF_FROZEN, &md->flags));
>
> - r = freeze_bdev(md->disk->part0);
> + r = bdev_freeze(md->disk->part0);
> if (!r)
> set_bit(DMF_FROZEN, &md->flags);
> return r;
> @@ -2658,7 +2658,7 @@ static void unlock_fs(struct mapped_device *md)
> {
> if (!test_bit(DMF_FROZEN, &md->flags))
> return;
> - thaw_bdev(md->disk->part0);
> + bdev_thaw(md->disk->part0);
> clear_bit(DMF_FROZEN, &md->flags);
> }
>
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index 0bfe2ce589e2..c1390219c945 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -810,11 +810,11 @@ int ext4_force_shutdown(struct super_block *sb, u32 flags)
>
> switch (flags) {
> case EXT4_GOING_FLAGS_DEFAULT:
> - ret = freeze_bdev(sb->s_bdev);
> + ret = bdev_freeze(sb->s_bdev);
> if (ret)
> return ret;
> set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
> - thaw_bdev(sb->s_bdev);
> + bdev_thaw(sb->s_bdev);
> break;
> case EXT4_GOING_FLAGS_LOGFLUSH:
> set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index ca5904129b16..c22aeb9ffb61 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -2239,11 +2239,11 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
>
> switch (in) {
> case F2FS_GOING_DOWN_FULLSYNC:
> - ret = freeze_bdev(sb->s_bdev);
> + ret = bdev_freeze(sb->s_bdev);
> if (ret)
> goto out;
> f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_SHUTDOWN);
> - thaw_bdev(sb->s_bdev);
> + bdev_thaw(sb->s_bdev);
> break;
> case F2FS_GOING_DOWN_METASYNC:
> /* do checkpoint only */
> diff --git a/fs/super.c b/fs/super.c
> index 2d762ce67f6e..e54866345dc7 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1210,7 +1210,7 @@ static void do_thaw_all_callback(struct super_block *sb)
>
> if (born && sb->s_root) {
> if (IS_ENABLED(CONFIG_BLOCK))
> - while (sb->s_bdev && !thaw_bdev(sb->s_bdev))
> + while (sb->s_bdev && !bdev_thaw(sb->s_bdev))
> pr_warn("Emergency Thaw on %pg\n", sb->s_bdev);
> thaw_super_locked(sb, FREEZE_HOLDER_USERSPACE);
> } else {
> @@ -1501,7 +1501,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
> /*
> * Until SB_BORN flag is set, there can be no active superblock
> * references and thus no filesystem freezing. get_active_super() will
> - * just loop waiting for SB_BORN so even freeze_bdev() cannot proceed.
> + * just loop waiting for SB_BORN so even bdev_freeze() cannot proceed.
> *
> * It is enough to check bdev was not frozen before we set s_bdev.
> */
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 7cb75cb6b8e9..57076a25f17d 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -482,9 +482,9 @@ xfs_fs_goingdown(
> {
> switch (inflags) {
> case XFS_FSOP_GOING_FLAGS_DEFAULT: {
> - if (!freeze_bdev(mp->m_super->s_bdev)) {
> + if (!bdev_freeze(mp->m_super->s_bdev)) {
> xfs_force_shutdown(mp, SHUTDOWN_FORCE_UMOUNT);
> - thaw_bdev(mp->m_super->s_bdev);
> + bdev_thaw(mp->m_super->s_bdev);
> }
> break;
> }
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index eef450f25982..bf25b63e13d5 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1530,8 +1530,8 @@ static inline int early_lookup_bdev(const char *pathname, dev_t *dev)
> }
> #endif /* CONFIG_BLOCK */
>
> -int freeze_bdev(struct block_device *bdev);
> -int thaw_bdev(struct block_device *bdev);
> +int bdev_freeze(struct block_device *bdev);
> +int bdev_thaw(struct block_device *bdev);
>
> struct io_comp_batch {
> struct request *req_list;
>
> --
> 2.34.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/7] fs: remove get_active_super()
2023-09-27 13:21 ` [PATCH 4/7] fs: remove get_active_super() Christian Brauner
2023-09-27 14:54 ` Darrick J. Wong
2023-10-02 7:10 ` Christoph Hellwig
@ 2023-10-02 16:22 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Jan Kara @ 2023-10-02 16:22 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
On Wed 27-09-23 15:21:17, Christian Brauner wrote:
> This function is now unused so remove it. One less function that uses
> the global superblock list.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Nice. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/super.c | 28 ----------------------------
> include/linux/fs.h | 1 -
> 2 files changed, 29 deletions(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 672f1837fbef..181ac8501301 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1016,34 +1016,6 @@ void iterate_supers_type(struct file_system_type *type,
>
> EXPORT_SYMBOL(iterate_supers_type);
>
> -/**
> - * get_active_super - get an active reference to the superblock of a device
> - * @bdev: device to get the superblock for
> - *
> - * Scans the superblock list and finds the superblock of the file system
> - * mounted on the device given. Returns the superblock with an active
> - * reference or %NULL if none was found.
> - */
> -struct super_block *get_active_super(struct block_device *bdev)
> -{
> - struct super_block *sb;
> -
> - if (!bdev)
> - return NULL;
> -
> - spin_lock(&sb_lock);
> - list_for_each_entry(sb, &super_blocks, s_list) {
> - if (sb->s_bdev == bdev) {
> - if (!grab_super(sb))
> - return NULL;
> - super_unlock_excl(sb);
> - return sb;
> - }
> - }
> - spin_unlock(&sb_lock);
> - return NULL;
> -}
> -
> struct super_block *user_get_super(dev_t dev, bool excl)
> {
> struct super_block *sb;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b528f063e8ff..ad0ddc10d560 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3052,7 +3052,6 @@ extern int vfs_readlink(struct dentry *, char __user *, int);
> extern struct file_system_type *get_filesystem(struct file_system_type *fs);
> extern void put_filesystem(struct file_system_type *fs);
> extern struct file_system_type *get_fs_type(const char *name);
> -extern struct super_block *get_active_super(struct block_device *bdev);
> extern void drop_super(struct super_block *sb);
> extern void drop_super_exclusive(struct super_block *sb);
> extern void iterate_supers(void (*)(struct super_block *, void *), void *);
>
> --
> 2.34.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb}
2023-09-27 13:21 ` [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb} Christian Brauner
2023-09-27 15:11 ` Darrick J. Wong
@ 2023-10-02 16:24 ` Jan Kara
1 sibling, 0 replies; 32+ messages in thread
From: Jan Kara @ 2023-10-02 16:24 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
On Wed 27-09-23 15:21:18, Christian Brauner wrote:
> Both bd_fsfreeze_mutex and bd_fsfreeze_sb are now unused and can be
> removed. Also move bd_fsfreeze_count down to not have it weirdly placed
> in the middle of the holder fields.
>
> Suggested-by: Jan Kara <jack@suse.cz>
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
The patch looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> block/bdev.c | 1 -
> include/linux/blk_types.h | 7 ++-----
> 2 files changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index 3deccd0ffcf2..084855b669f7 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -392,7 +392,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
> mapping_set_gfp_mask(&inode->i_data, GFP_USER);
>
> bdev = I_BDEV(inode);
> - mutex_init(&bdev->bd_fsfreeze_mutex);
> spin_lock_init(&bdev->bd_size_lock);
> mutex_init(&bdev->bd_holder_lock);
> bdev->bd_partno = partno;
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index 88e1848b0869..0238236852b7 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -56,14 +56,11 @@ struct block_device {
> void * bd_holder;
> const struct blk_holder_ops *bd_holder_ops;
> struct mutex bd_holder_lock;
> - /* The counter of freeze processes */
> - atomic_t bd_fsfreeze_count;
> int bd_holders;
> struct kobject *bd_holder_dir;
>
> - /* Mutex for freeze */
> - struct mutex bd_fsfreeze_mutex;
> - struct super_block *bd_fsfreeze_sb;
> + /* The counter of freeze processes */
> + atomic_t bd_fsfreeze_count;
>
> struct partition_meta_info *bd_meta_info;
> #ifdef CONFIG_FAIL_MAKE_REQUEST
>
> --
> 2.34.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 6/7] fs: remove unused helper
2023-09-27 13:21 ` [PATCH 6/7] fs: remove unused helper Christian Brauner
2023-09-27 15:12 ` Darrick J. Wong
2023-10-02 7:12 ` Christoph Hellwig
@ 2023-10-02 16:26 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Jan Kara @ 2023-10-02 16:26 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
On Wed 27-09-23 15:21:19, Christian Brauner wrote:
> The grab_super() helper is now only used by grab_super_dead(). Merge the
> two helpers into one.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Yeah, nice! Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/super.c | 44 +++++++-------------------------------------
> 1 file changed, 7 insertions(+), 37 deletions(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 181ac8501301..6cdce2b31622 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -517,35 +517,6 @@ void deactivate_super(struct super_block *s)
>
> EXPORT_SYMBOL(deactivate_super);
>
> -/**
> - * grab_super - acquire an active reference
> - * @s: reference we are trying to make active
> - *
> - * Tries to acquire an active reference. grab_super() is used when we
> - * had just found a superblock in super_blocks or fs_type->fs_supers
> - * and want to turn it into a full-blown active reference. grab_super()
> - * is called with sb_lock held and drops it. Returns 1 in case of
> - * success, 0 if we had failed (superblock contents was already dead or
> - * dying when grab_super() had been called). Note that this is only
> - * called for superblocks not in rundown mode (== ones still on ->fs_supers
> - * of their type), so increment of ->s_count is OK here.
> - */
> -static int grab_super(struct super_block *s) __releases(sb_lock)
> -{
> - bool born;
> -
> - s->s_count++;
> - spin_unlock(&sb_lock);
> - born = super_lock_excl(s);
> - if (born && atomic_inc_not_zero(&s->s_active)) {
> - put_super(s);
> - return 1;
> - }
> - super_unlock_excl(s);
> - put_super(s);
> - return 0;
> -}
> -
> static inline bool wait_dead(struct super_block *sb)
> {
> unsigned int flags;
> @@ -559,7 +530,7 @@ static inline bool wait_dead(struct super_block *sb)
> }
>
> /**
> - * grab_super_dead - acquire an active reference to a superblock
> + * grab_super - acquire an active reference to a superblock
> * @sb: superblock to acquire
> *
> * Acquire a temporary reference on a superblock and try to trade it for
> @@ -570,17 +541,16 @@ static inline bool wait_dead(struct super_block *sb)
> * Return: This returns true if an active reference could be acquired,
> * false if not.
> */
> -static bool grab_super_dead(struct super_block *sb)
> +static bool grab_super(struct super_block *sb)
> {
> -
> sb->s_count++;
> - if (grab_super(sb)) {
> + spin_unlock(&sb_lock);
> + if (super_lock_excl(sb) && atomic_inc_not_zero(&sb->s_active)) {
> put_super(sb);
> - lockdep_assert_held(&sb->s_umount);
> return true;
> }
> + super_unlock_excl(sb);
> wait_var_event(&sb->s_flags, wait_dead(sb));
> - lockdep_assert_not_held(&sb->s_umount);
> put_super(sb);
> return false;
> }
> @@ -831,7 +801,7 @@ struct super_block *sget_fc(struct fs_context *fc,
> warnfc(fc, "reusing existing filesystem in another namespace not allowed");
> return ERR_PTR(-EBUSY);
> }
> - if (!grab_super_dead(old))
> + if (!grab_super(old))
> goto retry;
> destroy_unused_super(s);
> return old;
> @@ -875,7 +845,7 @@ struct super_block *sget(struct file_system_type *type,
> destroy_unused_super(s);
> return ERR_PTR(-EBUSY);
> }
> - if (!grab_super_dead(old))
> + if (!grab_super(old))
> goto retry;
> destroy_unused_super(s);
> return old;
>
> --
> 2.34.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/7] bdev: add freeze and thaw holder operations
2023-09-27 13:21 ` [PATCH 2/7] bdev: add freeze and thaw holder operations Christian Brauner
2023-09-27 14:38 ` Darrick J. Wong
2023-10-02 6:52 ` Christoph Hellwig
@ 2023-10-02 16:32 ` Jan Kara
2 siblings, 0 replies; 32+ messages in thread
From: Jan Kara @ 2023-10-02 16:32 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, Darrick J. Wong
On Wed 27-09-23 15:21:15, Christian Brauner wrote:
> Add block device freeze and thaw holder operations. Follow-up patches
> will implement block device freeze and thaw based on stuct
> blk_holder_ops.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good (I don't really care whether you fold this into patch 3 or not).
Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> include/linux/blkdev.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index bf25b63e13d5..f2ddccaaef4d 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1468,6 +1468,16 @@ struct blk_holder_ops {
> * Sync the file system mounted on the block device.
> */
> void (*sync)(struct block_device *bdev);
> +
> + /*
> + * Freeze the file system mounted on the block device.
> + */
> + int (*freeze)(struct block_device *bdev);
> +
> + /*
> + * Thaw the file system mounted on the block device.
> + */
> + int (*thaw)(struct block_device *bdev);
> };
>
> extern const struct blk_holder_ops fs_holder_ops;
>
> --
> 2.34.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 7/7] porting: document block device freeze and thaw changes
2023-09-27 15:19 ` Darrick J. Wong
@ 2023-10-02 16:45 ` Jan Kara
2023-10-05 6:48 ` Christoph Hellwig
0 siblings, 1 reply; 32+ messages in thread
From: Jan Kara @ 2023-10-02 16:45 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Christian Brauner, Jan Kara, Christoph Hellwig, linux-fsdevel
On Wed 27-09-23 08:19:11, Darrick J. Wong wrote:
> On Wed, Sep 27, 2023 at 03:21:20PM +0200, Christian Brauner wrote:
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> > Documentation/filesystems/porting.rst | 25 +++++++++++++++++++++++++
> > 1 file changed, 25 insertions(+)
> >
> > diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
> > index 4d05b9862451..fef97a2e6729 100644
> > --- a/Documentation/filesystems/porting.rst
> > +++ b/Documentation/filesystems/porting.rst
> > @@ -1045,3 +1045,28 @@ filesystem type is now moved to a later point when the devices are closed:
> > As this is a VFS level change it has no practical consequences for filesystems
> > other than that all of them must use one of the provided kill_litter_super(),
> > kill_anon_super(), or kill_block_super() helpers.
> > +
> > +---
> > +
> > +**mandatory**
> > +
> > +Block device freezing and thawing have been moved to holder operations. As we
> > +can now go straight from block devcie to superblock the get_active_super()
>
> s/devcie/device/
>
> > +and bd_fsfreeze_sb members in struct block_device are gone.
> > +
> > +The bd_fsfreeze_mutex is gone as well since we can rely on the bd_holder_lock
> > +to protect against concurrent freeze and thaw.
> > +
> > +Before this change, get_active_super() would only be able to find the
> > +superblock of the main block device, i.e., the one stored in sb->s_bdev. Block
> > +device freezing now works for any block device owned by a given superblock, not
> > +just the main block device.
>
> You might want to document this new fs_holder_ops scheme:
>
> "Filesystems opening a block device must pass the super_block object
> and fs_holder_ops as the @holder and @hops parameters."
>
> Though TBH I see a surprising amount of fs code that doesn't do this, so
> perhaps it's not so mandatory?
This is actually a good point. For the main device, fs/super.c takes care
of this (perhaps except for btrfs). So this patch set should not regress
anything. But for other devices such as the journal device or similar,
passing proper holder and holder_ops from the filesystem is necessary.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 7/7] porting: document block device freeze and thaw changes
2023-10-02 16:45 ` Jan Kara
@ 2023-10-05 6:48 ` Christoph Hellwig
0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2023-10-05 6:48 UTC (permalink / raw)
To: Jan Kara
Cc: Darrick J. Wong, Christian Brauner, Christoph Hellwig,
linux-fsdevel
On Mon, Oct 02, 2023 at 06:45:24PM +0200, Jan Kara wrote:
> > "Filesystems opening a block device must pass the super_block object
> > and fs_holder_ops as the @holder and @hops parameters."
> >
> > Though TBH I see a surprising amount of fs code that doesn't do this, so
> > perhaps it's not so mandatory?
>
> This is actually a good point. For the main device, fs/super.c takes care
> of this (perhaps except for btrfs). So this patch set should not regress
> anything. But for other devices such as the journal device or similar,
> passing proper holder and holder_ops from the filesystem is necessary.
It is is necessary to gain functionality where we call into the
fs based on the block device. In the old get_super based world these
never worked either as get_super was based on sb->s_dev only.
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2023-10-05 15:42 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-27 13:21 [PATCH 0/7] Implement freeze and thaw as holder operations Christian Brauner
2023-09-27 13:21 ` [PATCH 1/7] bdev: rename freeze and thaw helpers Christian Brauner
2023-09-27 14:35 ` Darrick J. Wong
2023-10-02 6:51 ` Christoph Hellwig
2023-10-02 11:28 ` Jan Kara
2023-09-27 13:21 ` [PATCH 2/7] bdev: add freeze and thaw holder operations Christian Brauner
2023-09-27 14:38 ` Darrick J. Wong
2023-10-02 6:52 ` Christoph Hellwig
2023-10-02 16:32 ` Jan Kara
2023-09-27 13:21 ` [PATCH 3/7] bdev: implement " Christian Brauner
2023-09-27 14:53 ` Darrick J. Wong
2023-09-27 15:15 ` Christian Brauner
2023-09-27 16:01 ` Darrick J. Wong
2023-10-02 6:54 ` Christoph Hellwig
2023-10-02 7:10 ` Christoph Hellwig
2023-09-27 13:21 ` [PATCH 4/7] fs: remove get_active_super() Christian Brauner
2023-09-27 14:54 ` Darrick J. Wong
2023-10-02 7:10 ` Christoph Hellwig
2023-10-02 16:22 ` Jan Kara
2023-09-27 13:21 ` [PATCH 5/7] super: remove bd_fsfreeze_{mutex,sb} Christian Brauner
2023-09-27 15:11 ` Darrick J. Wong
2023-09-27 15:18 ` Christian Brauner
2023-10-02 7:12 ` Christoph Hellwig
2023-10-02 16:24 ` Jan Kara
2023-09-27 13:21 ` [PATCH 6/7] fs: remove unused helper Christian Brauner
2023-09-27 15:12 ` Darrick J. Wong
2023-10-02 7:12 ` Christoph Hellwig
2023-10-02 16:26 ` Jan Kara
2023-09-27 13:21 ` [PATCH 7/7] porting: document block device freeze and thaw changes Christian Brauner
2023-09-27 15:19 ` Darrick J. Wong
2023-10-02 16:45 ` Jan Kara
2023-10-05 6:48 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).