[PATCHSET RFC 0/3] fs: kernel and userspace filesystem freeze

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCHSET RFC 0/3] fs: kernel and userspace filesystem freeze
@ 2023-06-12  3:15 Darrick J. Wong
  2023-06-12  3:15 ` [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze Darrick J. Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12  3:15 UTC (permalink / raw)
  To: djwong
  Cc: ruansy.fnst, mcgrof, hch, Jan Kara, jack, linux-xfs,
	linux-fsdevel, mcgrof, jack, hch, ruansy.fnst

Hi all,

Sometimes, kernel filesystem drivers need the ability to quiesce writes
to the filesystem so that the driver can perform some kind of
maintenance activity.  This capability mostly already exists in the form
of filesystem freezing but with the huge caveat that userspace can thaw
any frozen fs at any time.  If the correctness of the fs maintenance
program requires stillness of the filesystem, then this caveat is BAD.

Provide a means for the kernel to initiate its own filesystem freezes.
A freeze of one type can be shared with a different type of freeze, but
nested freezes of the same type are not allowed.  A shared freeze
remains in effect until both holders thaw the filesystem.

This capability will be used (sparingly) by the upcoming xfs online fsck
feature; the fsdax pre-removal code; and hopefully one day by suspend.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=kernel-fsfreeze
---
 Documentation/filesystems/vfs.rst |    4 +
 block/bdev.c                      |    8 +--
 fs/f2fs/gc.c                      |    4 +
 fs/gfs2/glops.c                   |    2 -
 fs/gfs2/super.c                   |    6 +-
 fs/gfs2/sys.c                     |    4 +
 fs/gfs2/util.c                    |    2 -
 fs/ioctl.c                        |    8 +--
 fs/quota/quota.c                  |    5 +-
 fs/super.c                        |  110 ++++++++++++++++++++++++++++++++-----
 include/linux/fs.h                |   16 +++--
 11 files changed, 128 insertions(+), 41 deletions(-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-06-12  3:15 [PATCHSET RFC 0/3] fs: kernel and userspace filesystem freeze Darrick J. Wong
@ 2023-06-12  3:15 ` Darrick J. Wong
  2023-06-12  3:58   ` Christoph Hellwig
  2023-06-12 11:08   ` Jan Kara
  2023-06-12  3:15 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
  2023-06-12  3:15 ` [PATCH 3/3] fs: Drop wait_unfrozen wait queue Darrick J. Wong
  2 siblings, 2 replies; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12  3:15 UTC (permalink / raw)
  To: djwong
  Cc: mcgrof, jack, hch, ruansy.fnst, linux-xfs, linux-fsdevel, mcgrof,
	jack, hch, ruansy.fnst

From: Darrick J. Wong <djwong@kernel.org>

Userspace can freeze a filesystem using the FIFREEZE ioctl or by
suspending the block device; this state persists until userspace thaws
the filesystem with the FITHAW ioctl or resuming the block device.
Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for
the fsfreeze ioctl") we only allow the first freeze command to succeed.

The kernel may decide that it is necessary to freeze a filesystem for
its own internal purposes, such as suspends in progress, filesystem fsck
activities, or quiescing a device prior to removal.  Userspace thaw
commands must never break a kernel freeze, and kernel thaw commands
shouldn't undo userspace's freeze command.

Introduce a couple of freeze holder flags and wire it into the
sb_writers state.  One kernel and one userspace freeze are allowed to
coexist at the same time; the filesystem will not thaw until both are
lifted.

I wonder if the f2fs/gfs2 code should be using a kernel freeze here, but
for now we'll use FREEZE_HOLDER_USERSPACE to preserve existing
behaviors.

Cc: mcgrof@kernel.org
Cc: jack@suse.cz
Cc: hch@infradead.org
Cc: ruansy.fnst@fujitsu.com
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 Documentation/filesystems/vfs.rst |    4 +-
 block/bdev.c                      |    8 ++--
 fs/f2fs/gc.c                      |    4 +-
 fs/gfs2/glops.c                   |    2 -
 fs/gfs2/super.c                   |    6 +--
 fs/gfs2/sys.c                     |    4 +-
 fs/gfs2/util.c                    |    2 -
 fs/ioctl.c                        |    8 ++--
 fs/super.c                        |   79 +++++++++++++++++++++++++++++++++----
 include/linux/fs.h                |   15 +++++--
 10 files changed, 100 insertions(+), 32 deletions(-)


diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 769be5230210..41cf2a56cbca 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -260,9 +260,9 @@ filesystem.  The following members are defined:
 		void (*evict_inode) (struct inode *);
 		void (*put_super) (struct super_block *);
 		int (*sync_fs)(struct super_block *sb, int wait);
-		int (*freeze_super) (struct super_block *);
+		int (*freeze_super) (struct super_block *, enum freeze_holder who);
 		int (*freeze_fs) (struct super_block *);
-		int (*thaw_super) (struct super_block *);
+		int (*thaw_super) (struct super_block *, enum freeze_wholder who);
 		int (*unfreeze_fs) (struct super_block *);
 		int (*statfs) (struct dentry *, struct kstatfs *);
 		int (*remount_fs) (struct super_block *, int *, char *);
diff --git a/block/bdev.c b/block/bdev.c
index 21c63bfef323..e8032c5beae0 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -248,9 +248,9 @@ int freeze_bdev(struct block_device *bdev)
 	if (!sb)
 		goto sync;
 	if (sb->s_op->freeze_super)
-		error = sb->s_op->freeze_super(sb);
+		error = sb->s_op->freeze_super(sb, FREEZE_HOLDER_USERSPACE);
 	else
-		error = freeze_super(sb);
+		error = freeze_super(sb, FREEZE_HOLDER_USERSPACE);
 	deactivate_super(sb);
 
 	if (error) {
@@ -291,9 +291,9 @@ int thaw_bdev(struct block_device *bdev)
 		goto out;
 
 	if (sb->s_op->thaw_super)
-		error = sb->s_op->thaw_super(sb);
+		error = sb->s_op->thaw_super(sb, FREEZE_HOLDER_USERSPACE);
 	else
-		error = thaw_super(sb);
+		error = thaw_super(sb, FREEZE_HOLDER_USERSPACE);
 	if (error)
 		bdev->bd_fsfreeze_count++;
 	else
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 61c5f9d26018..bca4e75c14e0 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -2166,7 +2166,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
 	if (err)
 		return err;
 
-	freeze_super(sbi->sb);
+	freeze_super(sbi->sb, FREEZE_HOLDER_USERSPACE);
 	f2fs_down_write(&sbi->gc_lock);
 	f2fs_down_write(&sbi->cp_global_sem);
 
@@ -2217,6 +2217,6 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
 out_err:
 	f2fs_up_write(&sbi->cp_global_sem);
 	f2fs_up_write(&sbi->gc_lock);
-	thaw_super(sbi->sb);
+	thaw_super(sbi->sb, FREEZE_HOLDER_USERSPACE);
 	return err;
 }
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 01d433ed6ce7..6bffb7609d01 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -584,7 +584,7 @@ static int freeze_go_sync(struct gfs2_glock *gl)
 	if (gl->gl_state == LM_ST_SHARED && !gfs2_withdrawn(sdp) &&
 	    !test_bit(SDF_NORECOVERY, &sdp->sd_flags)) {
 		atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
-		error = freeze_super(sdp->sd_vfs);
+		error = freeze_super(sdp->sd_vfs, FREEZE_HOLDER_USERSPACE);
 		if (error) {
 			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
 				error);
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index a84bf6444bba..3965b00a7503 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -682,7 +682,7 @@ void gfs2_freeze_func(struct work_struct *work)
 		gfs2_assert_withdraw(sdp, 0);
 	} else {
 		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
-		error = thaw_super(sb);
+		error = thaw_super(sb, FREEZE_HOLDER_USERSPACE);
 		if (error) {
 			fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n",
 				error);
@@ -702,7 +702,7 @@ void gfs2_freeze_func(struct work_struct *work)
  *
  */
 
-static int gfs2_freeze(struct super_block *sb)
+static int gfs2_freeze(struct super_block *sb, enum freeze_holder who)
 {
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	int error;
@@ -747,7 +747,7 @@ static int gfs2_freeze(struct super_block *sb)
  *
  */
 
-static int gfs2_unfreeze(struct super_block *sb)
+static int gfs2_unfreeze(struct super_block *sb, enum freeze_holder who)
 {
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 454dc2ff8b5e..9d04a2907869 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -166,10 +166,10 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
 
 	switch (n) {
 	case 0:
-		error = thaw_super(sdp->sd_vfs);
+		error = thaw_super(sdp->sd_vfs, FREEZE_HOLDER_USERSPACE);
 		break;
 	case 1:
-		error = freeze_super(sdp->sd_vfs);
+		error = freeze_super(sdp->sd_vfs, FREEZE_HOLDER_USERSPACE);
 		break;
 	default:
 		return -EINVAL;
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 7a6aeffcdf5c..357457b7c5b3 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -191,7 +191,7 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 		/* Make sure gfs2_unfreeze works if partially-frozen */
 		flush_work(&sdp->sd_freeze_work);
 		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
-		thaw_super(sdp->sd_vfs);
+		thaw_super(sdp->sd_vfs, FREEZE_HOLDER_USERSPACE);
 	} else {
 		wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE,
 			    TASK_UNINTERRUPTIBLE);
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 5b2481cd4750..a56cbceedcd1 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -396,8 +396,8 @@ static int ioctl_fsfreeze(struct file *filp)
 
 	/* Freeze */
 	if (sb->s_op->freeze_super)
-		return sb->s_op->freeze_super(sb);
-	return freeze_super(sb);
+		return sb->s_op->freeze_super(sb, FREEZE_HOLDER_USERSPACE);
+	return freeze_super(sb, FREEZE_HOLDER_USERSPACE);
 }
 
 static int ioctl_fsthaw(struct file *filp)
@@ -409,8 +409,8 @@ static int ioctl_fsthaw(struct file *filp)
 
 	/* Thaw */
 	if (sb->s_op->thaw_super)
-		return sb->s_op->thaw_super(sb);
-	return thaw_super(sb);
+		return sb->s_op->thaw_super(sb, FREEZE_HOLDER_USERSPACE);
+	return thaw_super(sb, FREEZE_HOLDER_USERSPACE);
 }
 
 static int ioctl_file_dedupe_range(struct file *file,
diff --git a/fs/super.c b/fs/super.c
index 34afe411cf2b..36adccecc828 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -39,7 +39,7 @@
 #include <uapi/linux/mount.h>
 #include "internal.h"
 
-static int thaw_super_locked(struct super_block *sb);
+static int thaw_super_locked(struct super_block *sb, enum freeze_holder who);
 
 static LIST_HEAD(super_blocks);
 static DEFINE_SPINLOCK(sb_lock);
@@ -1027,7 +1027,7 @@ static void do_thaw_all_callback(struct super_block *sb)
 	down_write(&sb->s_umount);
 	if (sb->s_root && sb->s_flags & SB_BORN) {
 		emergency_thaw_bdev(sb);
-		thaw_super_locked(sb);
+		thaw_super_locked(sb, FREEZE_HOLDER_USERSPACE);
 	} else {
 		up_write(&sb->s_umount);
 	}
@@ -1635,14 +1635,34 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
 		percpu_up_write(sb->s_writers.rw_sem + level);
 }
 
+static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
+{
+	/* Someone else already holds this type of freeze */
+	if (sb->s_writers.freeze_holders & who)
+		return -EBUSY;
+
+	WARN_ON(sb->s_writers.freeze_holders == 0);
+
+	sb->s_writers.freeze_holders |= who;
+	return 0;
+}
+
 /**
  * freeze_super - lock the filesystem and force it into a consistent state
  * @sb: the super to lock
+ * @who: FREEZE_HOLDER_USERSPACE if userspace wants to freeze the fs;
+ * FREEZE_HOLDER_KERNEL if the kernel wants to freeze it
  *
  * Syncs the super to make sure the filesystem is consistent and calls the fs's
- * freeze_fs.  Subsequent calls to this without first thawing the fs will return
+ * freeze_fs.  Subsequent calls to this without first thawing the fs may return
  * -EBUSY.
  *
+ * The @who argument distinguishes between the kernel and userspace trying to
+ * freeze the filesystem.  Although there cannot be multiple kernel freezes or
+ * multiple userspace freezes in effect at any given time, the kernel and
+ * userspace can both hold a filesystem frozen.  The filesystem remains frozen
+ * until there are no kernel or userspace freezes in effect.
+ *
  * During this function, sb->s_writers.frozen goes through these values:
  *
  * SB_UNFROZEN: File system is normal, all writes progress as usual.
@@ -1668,12 +1688,19 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
  *
  * sb->s_writers.frozen is protected by sb->s_umount.
  */
-int freeze_super(struct super_block *sb)
+int freeze_super(struct super_block *sb, enum freeze_holder who)
 {
 	int ret;
 
 	atomic_inc(&sb->s_active);
 	down_write(&sb->s_umount);
+
+	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
+		ret = freeze_frozen_super(sb, who);
+		deactivate_locked_super(sb);
+		return ret;
+	}
+
 	if (sb->s_writers.frozen != SB_UNFROZEN) {
 		deactivate_locked_super(sb);
 		return -EBUSY;
@@ -1684,8 +1711,10 @@ int freeze_super(struct super_block *sb)
 		return 0;	/* sic - it's "nothing to do" */
 	}
 
+
 	if (sb_rdonly(sb)) {
 		/* Nothing to do really... */
+		sb->s_writers.freeze_holders |= who;
 		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
 		up_write(&sb->s_umount);
 		return 0;
@@ -1731,6 +1760,7 @@ int freeze_super(struct super_block *sb)
 	 * For debugging purposes so that fs can warn if it sees write activity
 	 * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super().
 	 */
+	sb->s_writers.freeze_holders |= who;
 	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
 	lockdep_sb_freeze_release(sb);
 	up_write(&sb->s_umount);
@@ -1738,16 +1768,47 @@ int freeze_super(struct super_block *sb)
 }
 EXPORT_SYMBOL(freeze_super);
 
-static int thaw_super_locked(struct super_block *sb)
+static int try_thaw_shared_super(struct super_block *sb, enum freeze_holder who)
+{
+	/* Freeze is not held by this type? */
+	if (!(sb->s_writers.freeze_holders & who))
+		return -EINVAL;
+
+	/* Also frozen for someone else? */
+	if (sb->s_writers.freeze_holders & ~who) {
+		sb->s_writers.freeze_holders &= ~who;
+		return 0;
+	}
+
+	/* Magic value to proceed with thaw */
+	return 1;
+}
+
+/*
+ * Undoes the effect of a freeze_super_locked call.  If the filesystem is
+ * frozen both by userspace and the kernel, a thaw call from either source
+ * removes that state without releasing the other state or unlocking the
+ * filesystem.
+ */
+static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
 {
 	int error;
 
+	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
+		error = try_thaw_shared_super(sb, who);
+		if (error != 1) {
+			up_write(&sb->s_umount);
+			return error;
+		}
+	}
+
 	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
 		up_write(&sb->s_umount);
 		return -EINVAL;
 	}
 
 	if (sb_rdonly(sb)) {
+		sb->s_writers.freeze_holders &= ~who;
 		sb->s_writers.frozen = SB_UNFROZEN;
 		goto out;
 	}
@@ -1765,6 +1826,7 @@ static int thaw_super_locked(struct super_block *sb)
 		}
 	}
 
+	sb->s_writers.freeze_holders &= ~who;
 	sb->s_writers.frozen = SB_UNFROZEN;
 	sb_freeze_unlock(sb, SB_FREEZE_FS);
 out:
@@ -1777,12 +1839,13 @@ static int thaw_super_locked(struct super_block *sb)
  * thaw_super -- unlock filesystem
  * @sb: the super to thaw
  *
- * Unlocks the filesystem and marks it writeable again after freeze_super().
+ * Unlocks the filesystem and marks it writeable again after freeze_super()
+ * if there are no remaining freezes on the filesystem.
  */
-int thaw_super(struct super_block *sb)
+int thaw_super(struct super_block *sb, enum freeze_holder who)
 {
 	down_write(&sb->s_umount);
-	return thaw_super_locked(sb);
+	return thaw_super_locked(sb, who);
 }
 EXPORT_SYMBOL(thaw_super);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 133f0640fb24..c58a560569b3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1145,7 +1145,8 @@ enum {
 #define SB_FREEZE_LEVELS (SB_FREEZE_COMPLETE - 1)
 
 struct sb_writers {
-	int				frozen;		/* Is sb frozen? */
+	unsigned short			frozen;		/* Is sb frozen? */
+	unsigned short			freeze_holders;	/* Who froze fs? */
 	wait_queue_head_t		wait_unfrozen;	/* wait for thaw */
 	struct percpu_rw_semaphore	rw_sem[SB_FREEZE_LEVELS];
 };
@@ -1899,6 +1900,10 @@ extern loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
 					struct file *dst_file, loff_t dst_pos,
 					loff_t len, unsigned int remap_flags);
 
+enum freeze_holder {
+	FREEZE_HOLDER_KERNEL	= (1U << 0),
+	FREEZE_HOLDER_USERSPACE	= (1U << 1),
+};
 
 struct super_operations {
    	struct inode *(*alloc_inode)(struct super_block *sb);
@@ -1911,9 +1916,9 @@ struct super_operations {
 	void (*evict_inode) (struct inode *);
 	void (*put_super) (struct super_block *);
 	int (*sync_fs)(struct super_block *sb, int wait);
-	int (*freeze_super) (struct super_block *);
+	int (*freeze_super) (struct super_block *, enum freeze_holder who);
 	int (*freeze_fs) (struct super_block *);
-	int (*thaw_super) (struct super_block *);
+	int (*thaw_super) (struct super_block *, enum freeze_holder who);
 	int (*unfreeze_fs) (struct super_block *);
 	int (*statfs) (struct dentry *, struct kstatfs *);
 	int (*remount_fs) (struct super_block *, int *, char *);
@@ -2286,8 +2291,8 @@ extern int unregister_filesystem(struct file_system_type *);
 extern int vfs_statfs(const struct path *, struct kstatfs *);
 extern int user_statfs(const char __user *, struct kstatfs *);
 extern int fd_statfs(int, struct kstatfs *);
-extern int freeze_super(struct super_block *super);
-extern int thaw_super(struct super_block *super);
+extern int freeze_super(struct super_block *super, enum freeze_holder who);
+extern int thaw_super(struct super_block *super, enum freeze_holder who);
 extern __printf(2, 3)
 int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
 extern int super_setup_bdi(struct super_block *sb);


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-12  3:15 [PATCHSET RFC 0/3] fs: kernel and userspace filesystem freeze Darrick J. Wong
  2023-06-12  3:15 ` [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze Darrick J. Wong
@ 2023-06-12  3:15 ` Darrick J. Wong
  2023-06-12  4:01   ` Christoph Hellwig
  2023-06-12 11:35   ` Jan Kara
  2023-06-12  3:15 ` [PATCH 3/3] fs: Drop wait_unfrozen wait queue Darrick J. Wong
  2 siblings, 2 replies; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12  3:15 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, hch, ruansy.fnst

From: Darrick J. Wong <djwong@kernel.org>

Jan Kara suggested that when one thread is in the middle of freezing a
filesystem, another thread trying to freeze the same fs but with a
different freeze_holder should wait until the freezer reaches either end
state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.

Plumb in the extra coded needed to wait for the fs freezer to reach an
end state and try the freeze again.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/super.c |   27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)


diff --git a/fs/super.c b/fs/super.c
index 36adccecc828..151e0eeff2c2 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1647,6 +1647,15 @@ static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
 	return 0;
 }
 
+static void wait_for_partially_frozen(struct super_block *sb)
+{
+	up_write(&sb->s_umount);
+	wait_var_event(&sb->s_writers.frozen,
+			sb->s_writers.frozen == SB_UNFROZEN ||
+			sb->s_writers.frozen == SB_FREEZE_COMPLETE);
+	down_write(&sb->s_umount);
+}
+
 /**
  * freeze_super - lock the filesystem and force it into a consistent state
  * @sb: the super to lock
@@ -1690,11 +1699,13 @@ static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
  */
 int freeze_super(struct super_block *sb, enum freeze_holder who)
 {
+	bool try_again = true;
 	int ret;
 
 	atomic_inc(&sb->s_active);
 	down_write(&sb->s_umount);
 
+retry:
 	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
 		ret = freeze_frozen_super(sb, who);
 		deactivate_locked_super(sb);
@@ -1702,8 +1713,14 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 	}
 
 	if (sb->s_writers.frozen != SB_UNFROZEN) {
-		deactivate_locked_super(sb);
-		return -EBUSY;
+		if (!try_again) {
+			deactivate_locked_super(sb);
+			return -EBUSY;
+		}
+
+		wait_for_partially_frozen(sb);
+		try_again = false;
+		goto retry;
 	}
 
 	if (!(sb->s_flags & SB_BORN)) {
@@ -1716,6 +1733,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 		/* Nothing to do really... */
 		sb->s_writers.freeze_holders |= who;
 		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
+		wake_up_var(&sb->s_writers.frozen);
 		up_write(&sb->s_umount);
 		return 0;
 	}
@@ -1736,6 +1754,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 		sb->s_writers.frozen = SB_UNFROZEN;
 		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
 		wake_up(&sb->s_writers.wait_unfrozen);
+		wake_up_var(&sb->s_writers.frozen);
 		deactivate_locked_super(sb);
 		return ret;
 	}
@@ -1752,6 +1771,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 			sb->s_writers.frozen = SB_UNFROZEN;
 			sb_freeze_unlock(sb, SB_FREEZE_FS);
 			wake_up(&sb->s_writers.wait_unfrozen);
+			wake_up_var(&sb->s_writers.frozen);
 			deactivate_locked_super(sb);
 			return ret;
 		}
@@ -1762,6 +1782,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 	 */
 	sb->s_writers.freeze_holders |= who;
 	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
+	wake_up_var(&sb->s_writers.frozen);
 	lockdep_sb_freeze_release(sb);
 	up_write(&sb->s_umount);
 	return 0;
@@ -1810,6 +1831,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
 	if (sb_rdonly(sb)) {
 		sb->s_writers.freeze_holders &= ~who;
 		sb->s_writers.frozen = SB_UNFROZEN;
+		wake_up_var(&sb->s_writers.frozen);
 		goto out;
 	}
 
@@ -1828,6 +1850,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
 
 	sb->s_writers.freeze_holders &= ~who;
 	sb->s_writers.frozen = SB_UNFROZEN;
+	wake_up_var(&sb->s_writers.frozen);
 	sb_freeze_unlock(sb, SB_FREEZE_FS);
 out:
 	wake_up(&sb->s_writers.wait_unfrozen);


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] fs: Drop wait_unfrozen wait queue
  2023-06-12  3:15 [PATCHSET RFC 0/3] fs: kernel and userspace filesystem freeze Darrick J. Wong
  2023-06-12  3:15 ` [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze Darrick J. Wong
  2023-06-12  3:15 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
@ 2023-06-12  3:15 ` Darrick J. Wong
  2023-06-12 11:12   ` Jan Kara
  2 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12  3:15 UTC (permalink / raw)
  To: djwong; +Cc: Jan Kara, linux-xfs, linux-fsdevel, mcgrof, jack, hch,
	ruansy.fnst

From: Jan Kara <jack@suse.cz>

wait_unfrozen waitqueue is used only in quota code to wait for
filesystem to become unfrozen. In that place we can just use
sb_start_write() - sb_end_write() pair to achieve the same. So just
remove the waitqueue.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/quota/quota.c   |    5 +++--
 fs/super.c         |    4 ----
 include/linux/fs.h |    1 -
 3 files changed, 3 insertions(+), 7 deletions(-)


diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 052f143e2e0e..0e41fb84060f 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -895,8 +895,9 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
 			up_write(&sb->s_umount);
 		else
 			up_read(&sb->s_umount);
-		wait_event(sb->s_writers.wait_unfrozen,
-			   sb->s_writers.frozen == SB_UNFROZEN);
+		/* Wait for sb to unfreeze */
+		sb_start_write(sb);
+		sb_end_write(sb);
 		put_super(sb);
 		goto retry;
 	}
diff --git a/fs/super.c b/fs/super.c
index 151e0eeff2c2..fd04dda6c5c0 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -236,7 +236,6 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 					&type->s_writers_key[i]))
 			goto fail;
 	}
-	init_waitqueue_head(&s->s_writers.wait_unfrozen);
 	s->s_bdi = &noop_backing_dev_info;
 	s->s_flags = flags;
 	if (s->s_user_ns != &init_user_ns)
@@ -1753,7 +1752,6 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 	if (ret) {
 		sb->s_writers.frozen = SB_UNFROZEN;
 		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
-		wake_up(&sb->s_writers.wait_unfrozen);
 		wake_up_var(&sb->s_writers.frozen);
 		deactivate_locked_super(sb);
 		return ret;
@@ -1770,7 +1768,6 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 				"VFS:Filesystem freeze failed\n");
 			sb->s_writers.frozen = SB_UNFROZEN;
 			sb_freeze_unlock(sb, SB_FREEZE_FS);
-			wake_up(&sb->s_writers.wait_unfrozen);
 			wake_up_var(&sb->s_writers.frozen);
 			deactivate_locked_super(sb);
 			return ret;
@@ -1853,7 +1850,6 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
 	wake_up_var(&sb->s_writers.frozen);
 	sb_freeze_unlock(sb, SB_FREEZE_FS);
 out:
-	wake_up(&sb->s_writers.wait_unfrozen);
 	deactivate_locked_super(sb);
 	return 0;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c58a560569b3..5870fbbecb81 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1147,7 +1147,6 @@ enum {
 struct sb_writers {
 	unsigned short			frozen;		/* Is sb frozen? */
 	unsigned short			freeze_holders;	/* Who froze fs? */
-	wait_queue_head_t		wait_unfrozen;	/* wait for thaw */
 	struct percpu_rw_semaphore	rw_sem[SB_FREEZE_LEVELS];
 };
 


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-06-12  3:15 ` [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze Darrick J. Wong
@ 2023-06-12  3:58   ` Christoph Hellwig
  2023-06-12 18:09     ` Darrick J. Wong
  2023-06-12 11:08   ` Jan Kara
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-06-12  3:58 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: mcgrof, jack, hch, ruansy.fnst, linux-xfs, linux-fsdevel

On Sun, Jun 11, 2023 at 08:15:22PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Userspace can freeze a filesystem using the FIFREEZE ioctl or by
> suspending the block device; this state persists until userspace thaws
> the filesystem with the FITHAW ioctl or resuming the block device.
> Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for
> the fsfreeze ioctl") we only allow the first freeze command to succeed.
> 
> The kernel may decide that it is necessary to freeze a filesystem for
> its own internal purposes, such as suspends in progress, filesystem fsck
> activities, or quiescing a device prior to removal.  Userspace thaw
> commands must never break a kernel freeze, and kernel thaw commands
> shouldn't undo userspace's freeze command.
> 
> Introduce a couple of freeze holder flags and wire it into the
> sb_writers state.  One kernel and one userspace freeze are allowed to
> coexist at the same time; the filesystem will not thaw until both are
> lifted.
> 
> I wonder if the f2fs/gfs2 code should be using a kernel freeze here, but
> for now we'll use FREEZE_HOLDER_USERSPACE to preserve existing
> behaviors.
> 
> Cc: mcgrof@kernel.org
> Cc: jack@suse.cz
> Cc: hch@infradead.org
> Cc: ruansy.fnst@fujitsu.com
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  Documentation/filesystems/vfs.rst |    4 +-
>  block/bdev.c                      |    8 ++--
>  fs/f2fs/gc.c                      |    4 +-
>  fs/gfs2/glops.c                   |    2 -
>  fs/gfs2/super.c                   |    6 +--
>  fs/gfs2/sys.c                     |    4 +-
>  fs/gfs2/util.c                    |    2 -
>  fs/ioctl.c                        |    8 ++--
>  fs/super.c                        |   79 +++++++++++++++++++++++++++++++++----
>  include/linux/fs.h                |   15 +++++--
>  10 files changed, 100 insertions(+), 32 deletions(-)
> 
> 
> diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
> index 769be5230210..41cf2a56cbca 100644
> --- a/Documentation/filesystems/vfs.rst
> +++ b/Documentation/filesystems/vfs.rst
> @@ -260,9 +260,9 @@ filesystem.  The following members are defined:
>  		void (*evict_inode) (struct inode *);
>  		void (*put_super) (struct super_block *);
>  		int (*sync_fs)(struct super_block *sb, int wait);
> -		int (*freeze_super) (struct super_block *);
> +		int (*freeze_super) (struct super_block *, enum freeze_holder who);
>  		int (*freeze_fs) (struct super_block *);
> -		int (*thaw_super) (struct super_block *);
> +		int (*thaw_super) (struct super_block *, enum freeze_wholder who);

Nit: Can you spell out the sb paramter as well and avoid the overly long
lines here?

> +static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
> +{
> +	/* Someone else already holds this type of freeze */
> +	if (sb->s_writers.freeze_holders & who)
> +		return -EBUSY;
> +
> +	WARN_ON(sb->s_writers.freeze_holders == 0);
> +
> +	sb->s_writers.freeze_holders |= who;
> +	return 0;
> +}

So with the simplification I'm not even sure we need this helper
anymore.  But I could live with it either way.

>  /**
>   * freeze_super - lock the filesystem and force it into a consistent state
>   * @sb: the super to lock
> + * @who: FREEZE_HOLDER_USERSPACE if userspace wants to freeze the fs;
> + * FREEZE_HOLDER_KERNEL if the kernel wants to freeze it

I think the cnonstants should use a % prefix for kerneldoc to notice
them.  Also I suspect something like:

 * @who: context that wants to free

 and then in the body:

 * @who should be:
 *  * %FREEZE_HOLDER_USERSPACE if userspace wants to freeze the fs
 *  * %FREEZE_HOLDER_KERNEL if the kernel wants to freeze it

for better rendering of the comments.  Same applies for the thaw side.

> +static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
>  {
>  	int error;
>  
> +	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
> +		error = try_thaw_shared_super(sb, who);
> +		if (error != 1) {
> +			up_write(&sb->s_umount);
> +			return error;
> +		}
> +	}
> +
>  	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {

Make this and

	} else {

instead of checking the same condition twice?

> +extern int freeze_super(struct super_block *super, enum freeze_holder who);
> +extern int thaw_super(struct super_block *super, enum freeze_holder who);

.. and drop the pointless externs here.

Except for these various nitpicks this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-12  3:15 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
@ 2023-06-12  4:01   ` Christoph Hellwig
  2023-06-12 18:33     ` Darrick J. Wong
  2023-06-12 11:35   ` Jan Kara
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-06-12  4:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, hch, ruansy.fnst

On Sun, Jun 11, 2023 at 08:15:28PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Jan Kara suggested that when one thread is in the middle of freezing a
> filesystem, another thread trying to freeze the same fs but with a
> different freeze_holder should wait until the freezer reaches either end
> state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.
> 
> Plumb in the extra coded needed to wait for the fs freezer to reach an
> end state and try the freeze again.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/super.c |   27 +++++++++++++++++++++++++--
>  1 file changed, 25 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/super.c b/fs/super.c
> index 36adccecc828..151e0eeff2c2 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1647,6 +1647,15 @@ static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
>  	return 0;
>  }
>  
> +static void wait_for_partially_frozen(struct super_block *sb)
> +{
> +	up_write(&sb->s_umount);
> +	wait_var_event(&sb->s_writers.frozen,
> +			sb->s_writers.frozen == SB_UNFROZEN ||
> +			sb->s_writers.frozen == SB_FREEZE_COMPLETE);
> +	down_write(&sb->s_umount);

Does sb->s_writers.frozen need WRITE_ONCE/READ_ONCE treatment if we want
to check it outside of s_umount?  Or should we maybe just open code
wait_var_event and only drop the lock after checking the variable?

>  	if (sb->s_writers.frozen != SB_UNFROZEN) {
> -		deactivate_locked_super(sb);
> -		return -EBUSY;
> +		if (!try_again) {
> +			deactivate_locked_super(sb);
> +			return -EBUSY;
> +		}
> +
> +		wait_for_partially_frozen(sb);
> +		try_again = false;
> +		goto retry;

Can you throw in a comment on wait we're only waiting for a partial
freeze one here?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-06-12  3:15 ` [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze Darrick J. Wong
  2023-06-12  3:58   ` Christoph Hellwig
@ 2023-06-12 11:08   ` Jan Kara
  2023-06-12 11:14     ` Jan Kara
  2023-06-12 18:16     ` Darrick J. Wong
  1 sibling, 2 replies; 20+ messages in thread
From: Jan Kara @ 2023-06-12 11:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: mcgrof, jack, hch, ruansy.fnst, linux-xfs, linux-fsdevel

On Sun 11-06-23 20:15:22, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Userspace can freeze a filesystem using the FIFREEZE ioctl or by
> suspending the block device; this state persists until userspace thaws
> the filesystem with the FITHAW ioctl or resuming the block device.
> Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for
> the fsfreeze ioctl") we only allow the first freeze command to succeed.
> 
> The kernel may decide that it is necessary to freeze a filesystem for
> its own internal purposes, such as suspends in progress, filesystem fsck
> activities, or quiescing a device prior to removal.  Userspace thaw
> commands must never break a kernel freeze, and kernel thaw commands
> shouldn't undo userspace's freeze command.
> 
> Introduce a couple of freeze holder flags and wire it into the
> sb_writers state.  One kernel and one userspace freeze are allowed to
> coexist at the same time; the filesystem will not thaw until both are
> lifted.
> 
> I wonder if the f2fs/gfs2 code should be using a kernel freeze here, but
> for now we'll use FREEZE_HOLDER_USERSPACE to preserve existing
> behaviors.
> 
> Cc: mcgrof@kernel.org
> Cc: jack@suse.cz
> Cc: hch@infradead.org
> Cc: ruansy.fnst@fujitsu.com
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Thanks Darrick. Some comments below.

> +static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
> +{
> +	/* Someone else already holds this type of freeze */
> +	if (sb->s_writers.freeze_holders & who)
> +		return -EBUSY;
> +
> +	WARN_ON(sb->s_writers.freeze_holders == 0);
> +
> +	sb->s_writers.freeze_holders |= who;
> +	return 0;
> +}
> +
>  /**
>   * freeze_super - lock the filesystem and force it into a consistent state
>   * @sb: the super to lock
> + * @who: FREEZE_HOLDER_USERSPACE if userspace wants to freeze the fs;
> + * FREEZE_HOLDER_KERNEL if the kernel wants to freeze it
>   *
>   * Syncs the super to make sure the filesystem is consistent and calls the fs's
> - * freeze_fs.  Subsequent calls to this without first thawing the fs will return
> + * freeze_fs.  Subsequent calls to this without first thawing the fs may return
>   * -EBUSY.
>   *
> + * The @who argument distinguishes between the kernel and userspace trying to
> + * freeze the filesystem.  Although there cannot be multiple kernel freezes or
> + * multiple userspace freezes in effect at any given time, the kernel and
> + * userspace can both hold a filesystem frozen.  The filesystem remains frozen
> + * until there are no kernel or userspace freezes in effect.
> + *
>   * During this function, sb->s_writers.frozen goes through these values:
>   *
>   * SB_UNFROZEN: File system is normal, all writes progress as usual.
> @@ -1668,12 +1688,19 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
>   *
>   * sb->s_writers.frozen is protected by sb->s_umount.
>   */
> -int freeze_super(struct super_block *sb)
> +int freeze_super(struct super_block *sb, enum freeze_holder who)
>  {
>  	int ret;
>  
>  	atomic_inc(&sb->s_active);
>  	down_write(&sb->s_umount);
> +
> +	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
> +		ret = freeze_frozen_super(sb, who);
> +		deactivate_locked_super(sb);
> +		return ret;
> +	}

I find it a little bit odd that the second freeze holder does not get the
active superblock reference. It all looks correct but I'd find it easier to
reason about (and also eventually lift the reference counting out of
freeze_super()) if the rule was: Successful freeze_super() <=> you have
s_active reference.

> +
>  	if (sb->s_writers.frozen != SB_UNFROZEN) {

I still find it strange that:

Task1					Task2

while (1) {				while (1) {
  ioctl(f, FIFREEZE);			  freeze_super(sb, FREEZE_HOLDER_KERNEL);
  ioctl(f, FITHAW);			  thaw_super(sb, FREEZE_HOLDER_KERNEL);
}					}

will randomly end up returning EBUSY to Task1 or Task2 although there is no
real conflict. I think it will be much more useful behavior if in case of
this conflict the second holder just waited for freezing procedure to finish
and then report success. Because I don't think either caller can do
anything sensible with this race other than retry but it cannot really
distinguish EBUSY as in "someone other holder of the same type has the sb
already frozen" from "freezing raced with holder of a different type".


>  		deactivate_locked_super(sb);
>  		return -EBUSY;
> @@ -1684,8 +1711,10 @@ int freeze_super(struct super_block *sb)
>  		return 0;	/* sic - it's "nothing to do" */
>  	}
>  
> +

Why the extra empty line?

>  	if (sb_rdonly(sb)) {
>  		/* Nothing to do really... */
> +		sb->s_writers.freeze_holders |= who;
>  		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
>  		up_write(&sb->s_umount);
>  		return 0;
> @@ -1731,6 +1760,7 @@ int freeze_super(struct super_block *sb)
>  	 * For debugging purposes so that fs can warn if it sees write activity
>  	 * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super().
>  	 */
> +	sb->s_writers.freeze_holders |= who;
>  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
>  	lockdep_sb_freeze_release(sb);
>  	up_write(&sb->s_umount);
> @@ -1738,16 +1768,47 @@ int freeze_super(struct super_block *sb)
>  }
>  EXPORT_SYMBOL(freeze_super);
>  
> -static int thaw_super_locked(struct super_block *sb)
> +static int try_thaw_shared_super(struct super_block *sb, enum freeze_holder who)
> +{
> +	/* Freeze is not held by this type? */
> +	if (!(sb->s_writers.freeze_holders & who))
> +		return -EINVAL;
> +
> +	/* Also frozen for someone else? */
> +	if (sb->s_writers.freeze_holders & ~who) {
> +		sb->s_writers.freeze_holders &= ~who;
> +		return 0;
> +	}
> +
> +	/* Magic value to proceed with thaw */
> +	return 1;
> +}
> +
> +/*
> + * Undoes the effect of a freeze_super_locked call.  If the filesystem is
> + * frozen both by userspace and the kernel, a thaw call from either source
> + * removes that state without releasing the other state or unlocking the
> + * filesystem.
> + */
> +static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
>  {
>  	int error;
>  
> +	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
> +		error = try_thaw_shared_super(sb, who);
> +		if (error != 1) {
> +			up_write(&sb->s_umount);
> +			return error;
> +		}
> +	}
> +
>  	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
>  		up_write(&sb->s_umount);
>  		return -EINVAL;
>  	}

I'd first check for the above condition and then just fold
try_thaw_shared_super() into here. That way you can avoid the odd special
return and the code will be actually more readable. Probably we should grow
out_err label for:

	up_write(&sb->s_umount);
	return error;

and use it for the error returns as well...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] fs: Drop wait_unfrozen wait queue
  2023-06-12  3:15 ` [PATCH 3/3] fs: Drop wait_unfrozen wait queue Darrick J. Wong
@ 2023-06-12 11:12   ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2023-06-12 11:12 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jan Kara, linux-xfs, linux-fsdevel, mcgrof, hch, ruansy.fnst

On Sun 11-06-23 20:15:34, Darrick J. Wong wrote:
> From: Jan Kara <jack@suse.cz>
> 
> wait_unfrozen waitqueue is used only in quota code to wait for
> filesystem to become unfrozen. In that place we can just use
> sb_start_write() - sb_end_write() pair to achieve the same. So just
> remove the waitqueue.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

I have this already queued in my tree for the merge window... :)

								Honza

> ---
>  fs/quota/quota.c   |    5 +++--
>  fs/super.c         |    4 ----
>  include/linux/fs.h |    1 -
>  3 files changed, 3 insertions(+), 7 deletions(-)
> 
> 
> diff --git a/fs/quota/quota.c b/fs/quota/quota.c
> index 052f143e2e0e..0e41fb84060f 100644
> --- a/fs/quota/quota.c
> +++ b/fs/quota/quota.c
> @@ -895,8 +895,9 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
>  			up_write(&sb->s_umount);
>  		else
>  			up_read(&sb->s_umount);
> -		wait_event(sb->s_writers.wait_unfrozen,
> -			   sb->s_writers.frozen == SB_UNFROZEN);
> +		/* Wait for sb to unfreeze */
> +		sb_start_write(sb);
> +		sb_end_write(sb);
>  		put_super(sb);
>  		goto retry;
>  	}
> diff --git a/fs/super.c b/fs/super.c
> index 151e0eeff2c2..fd04dda6c5c0 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -236,7 +236,6 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>  					&type->s_writers_key[i]))
>  			goto fail;
>  	}
> -	init_waitqueue_head(&s->s_writers.wait_unfrozen);
>  	s->s_bdi = &noop_backing_dev_info;
>  	s->s_flags = flags;
>  	if (s->s_user_ns != &init_user_ns)
> @@ -1753,7 +1752,6 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  	if (ret) {
>  		sb->s_writers.frozen = SB_UNFROZEN;
>  		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
> -		wake_up(&sb->s_writers.wait_unfrozen);
>  		wake_up_var(&sb->s_writers.frozen);
>  		deactivate_locked_super(sb);
>  		return ret;
> @@ -1770,7 +1768,6 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  				"VFS:Filesystem freeze failed\n");
>  			sb->s_writers.frozen = SB_UNFROZEN;
>  			sb_freeze_unlock(sb, SB_FREEZE_FS);
> -			wake_up(&sb->s_writers.wait_unfrozen);
>  			wake_up_var(&sb->s_writers.frozen);
>  			deactivate_locked_super(sb);
>  			return ret;
> @@ -1853,7 +1850,6 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
>  	wake_up_var(&sb->s_writers.frozen);
>  	sb_freeze_unlock(sb, SB_FREEZE_FS);
>  out:
> -	wake_up(&sb->s_writers.wait_unfrozen);
>  	deactivate_locked_super(sb);
>  	return 0;
>  }
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c58a560569b3..5870fbbecb81 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1147,7 +1147,6 @@ enum {
>  struct sb_writers {
>  	unsigned short			frozen;		/* Is sb frozen? */
>  	unsigned short			freeze_holders;	/* Who froze fs? */
> -	wait_queue_head_t		wait_unfrozen;	/* wait for thaw */
>  	struct percpu_rw_semaphore	rw_sem[SB_FREEZE_LEVELS];
>  };
>  
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-06-12 11:08   ` Jan Kara
@ 2023-06-12 11:14     ` Jan Kara
  2023-06-12 18:16     ` Darrick J. Wong
  1 sibling, 0 replies; 20+ messages in thread
From: Jan Kara @ 2023-06-12 11:14 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: mcgrof, jack, hch, ruansy.fnst, linux-xfs, linux-fsdevel

On Mon 12-06-23 13:08:11, Jan Kara wrote:
> On Sun 11-06-23 20:15:22, Darrick J. Wong wrote:
> > +
> >  	if (sb->s_writers.frozen != SB_UNFROZEN) {
> 
> I still find it strange that:
> 
> Task1					Task2
> 
> while (1) {				while (1) {
>   ioctl(f, FIFREEZE);			  freeze_super(sb, FREEZE_HOLDER_KERNEL);
>   ioctl(f, FITHAW);			  thaw_super(sb, FREEZE_HOLDER_KERNEL);
> }					}
> 
> will randomly end up returning EBUSY to Task1 or Task2 although there is no
> real conflict. I think it will be much more useful behavior if in case of
> this conflict the second holder just waited for freezing procedure to finish
> and then report success. Because I don't think either caller can do
> anything sensible with this race other than retry but it cannot really
> distinguish EBUSY as in "someone other holder of the same type has the sb
> already frozen" from "freezing raced with holder of a different type".

I take this back, you solve the problem in patch 2 :). I'm sorry for the
noise.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-12  3:15 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
  2023-06-12  4:01   ` Christoph Hellwig
@ 2023-06-12 11:35   ` Jan Kara
  2023-06-12 18:36     ` Darrick J. Wong
  1 sibling, 1 reply; 20+ messages in thread
From: Jan Kara @ 2023-06-12 11:35 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, hch, ruansy.fnst

On Sun 11-06-23 20:15:28, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Jan Kara suggested that when one thread is in the middle of freezing a
> filesystem, another thread trying to freeze the same fs but with a
> different freeze_holder should wait until the freezer reaches either end
> state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.
> 
> Plumb in the extra coded needed to wait for the fs freezer to reach an
> end state and try the freeze again.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

You can maybe copy my reasoning from my reply to your patch 1/3 to the
changelog to explain why waiting is more desirable.

> @@ -1690,11 +1699,13 @@ static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
>   */
>  int freeze_super(struct super_block *sb, enum freeze_holder who)
>  {
> +	bool try_again = true;
>  	int ret;
>  
>  	atomic_inc(&sb->s_active);
>  	down_write(&sb->s_umount);
>  
> +retry:
>  	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
>  		ret = freeze_frozen_super(sb, who);
>  		deactivate_locked_super(sb);
> @@ -1702,8 +1713,14 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  	}
>  
>  	if (sb->s_writers.frozen != SB_UNFROZEN) {
> -		deactivate_locked_super(sb);
> -		return -EBUSY;
> +		if (!try_again) {
> +			deactivate_locked_super(sb);
> +			return -EBUSY;
> +		}

Why only one retry? Do you envision someone will be freezing & thawing
filesystem so intensively that livelocks are possible? In that case I'd
think the system has other problems than freeze taking long... Honestly,
I'd be looping as long as we either succeed or return error for other
reasons.

What we could be doing to limit unnecessary waiting is that we'd update
freeze_holders already when we enter freeze_super() and lock s_umount
(bailing if our holder type is already set). That way we'd have at most one
process for each holder type freezing the fs / waiting for freezing to
complete.

> +
> +		wait_for_partially_frozen(sb);
> +		try_again = false;
> +		goto retry;
>  	}
>  
...
> @@ -1810,6 +1831,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
>  	if (sb_rdonly(sb)) {
>  		sb->s_writers.freeze_holders &= ~who;
>  		sb->s_writers.frozen = SB_UNFROZEN;
> +		wake_up_var(&sb->s_writers.frozen);
>  		goto out;
>  	}
>  
> @@ -1828,6 +1850,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
>  
>  	sb->s_writers.freeze_holders &= ~who;
>  	sb->s_writers.frozen = SB_UNFROZEN;
> +	wake_up_var(&sb->s_writers.frozen);
>  	sb_freeze_unlock(sb, SB_FREEZE_FS);
>  out:
>  	wake_up(&sb->s_writers.wait_unfrozen);

I don't think wakeups on thaw are really needed because the waiter should
be woken already once the freezing task exits from freeze_super(). I guess
you've sprinkled them here just for good measure?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-06-12  3:58   ` Christoph Hellwig
@ 2023-06-12 18:09     ` Darrick J. Wong
  0 siblings, 0 replies; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12 18:09 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: mcgrof, jack, ruansy.fnst, linux-xfs, linux-fsdevel

On Sun, Jun 11, 2023 at 08:58:40PM -0700, Christoph Hellwig wrote:
> On Sun, Jun 11, 2023 at 08:15:22PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Userspace can freeze a filesystem using the FIFREEZE ioctl or by
> > suspending the block device; this state persists until userspace thaws
> > the filesystem with the FITHAW ioctl or resuming the block device.
> > Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for
> > the fsfreeze ioctl") we only allow the first freeze command to succeed.
> > 
> > The kernel may decide that it is necessary to freeze a filesystem for
> > its own internal purposes, such as suspends in progress, filesystem fsck
> > activities, or quiescing a device prior to removal.  Userspace thaw
> > commands must never break a kernel freeze, and kernel thaw commands
> > shouldn't undo userspace's freeze command.
> > 
> > Introduce a couple of freeze holder flags and wire it into the
> > sb_writers state.  One kernel and one userspace freeze are allowed to
> > coexist at the same time; the filesystem will not thaw until both are
> > lifted.
> > 
> > I wonder if the f2fs/gfs2 code should be using a kernel freeze here, but
> > for now we'll use FREEZE_HOLDER_USERSPACE to preserve existing
> > behaviors.
> > 
> > Cc: mcgrof@kernel.org
> > Cc: jack@suse.cz
> > Cc: hch@infradead.org
> > Cc: ruansy.fnst@fujitsu.com
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  Documentation/filesystems/vfs.rst |    4 +-
> >  block/bdev.c                      |    8 ++--
> >  fs/f2fs/gc.c                      |    4 +-
> >  fs/gfs2/glops.c                   |    2 -
> >  fs/gfs2/super.c                   |    6 +--
> >  fs/gfs2/sys.c                     |    4 +-
> >  fs/gfs2/util.c                    |    2 -
> >  fs/ioctl.c                        |    8 ++--
> >  fs/super.c                        |   79 +++++++++++++++++++++++++++++++++----
> >  include/linux/fs.h                |   15 +++++--
> >  10 files changed, 100 insertions(+), 32 deletions(-)
> > 
> > 
> > diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
> > index 769be5230210..41cf2a56cbca 100644
> > --- a/Documentation/filesystems/vfs.rst
> > +++ b/Documentation/filesystems/vfs.rst
> > @@ -260,9 +260,9 @@ filesystem.  The following members are defined:
> >  		void (*evict_inode) (struct inode *);
> >  		void (*put_super) (struct super_block *);
> >  		int (*sync_fs)(struct super_block *sb, int wait);
> > -		int (*freeze_super) (struct super_block *);
> > +		int (*freeze_super) (struct super_block *, enum freeze_holder who);
> >  		int (*freeze_fs) (struct super_block *);
> > -		int (*thaw_super) (struct super_block *);
> > +		int (*thaw_super) (struct super_block *, enum freeze_wholder who);
> 
> Nit: Can you spell out the sb paramter as well and avoid the overly long
> lines here?

Done.

> > +static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
> > +{
> > +	/* Someone else already holds this type of freeze */
> > +	if (sb->s_writers.freeze_holders & who)
> > +		return -EBUSY;
> > +
> > +	WARN_ON(sb->s_writers.freeze_holders == 0);
> > +
> > +	sb->s_writers.freeze_holders |= who;
> > +	return 0;
> > +}
> 
> So with the simplification I'm not even sure we need this helper
> anymore.  But I could live with it either way.

Ok, gone.  It makes the code flow rather easier to understand,
especially given Jan's reply asking for a shared freeze to leave
s_active elevated by 2.

> >  /**
> >   * freeze_super - lock the filesystem and force it into a consistent state
> >   * @sb: the super to lock
> > + * @who: FREEZE_HOLDER_USERSPACE if userspace wants to freeze the fs;
> > + * FREEZE_HOLDER_KERNEL if the kernel wants to freeze it
> 
> I think the cnonstants should use a % prefix for kerneldoc to notice
> them.  Also I suspect something like:
> 
>  * @who: context that wants to free
> 
>  and then in the body:
> 
>  * @who should be:
>  *  * %FREEZE_HOLDER_USERSPACE if userspace wants to freeze the fs
>  *  * %FREEZE_HOLDER_KERNEL if the kernel wants to freeze it
> 
> for better rendering of the comments.  Same applies for the thaw side.

Done.  Thanks for the kerneldoc, I can never keep rst and kerneldoc
straight anymore.

> > +static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
> >  {
> >  	int error;
> >  
> > +	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
> > +		error = try_thaw_shared_super(sb, who);
> > +		if (error != 1) {
> > +			up_write(&sb->s_umount);
> > +			return error;
> > +		}
> > +	}
> > +
> >  	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
> 
> Make this and
> 
> 	} else {
> 
> instead of checking the same condition twice?

Ok.

> > +extern int freeze_super(struct super_block *super, enum freeze_holder who);
> > +extern int thaw_super(struct super_block *super, enum freeze_holder who);
> 
> .. and drop the pointless externs here.

Ok done.

> Except for these various nitpicks this looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks!

--D


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-06-12 11:08   ` Jan Kara
  2023-06-12 11:14     ` Jan Kara
@ 2023-06-12 18:16     ` Darrick J. Wong
  1 sibling, 0 replies; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12 18:16 UTC (permalink / raw)
  To: Jan Kara; +Cc: mcgrof, hch, ruansy.fnst, linux-xfs, linux-fsdevel

On Mon, Jun 12, 2023 at 01:08:11PM +0200, Jan Kara wrote:
> On Sun 11-06-23 20:15:22, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Userspace can freeze a filesystem using the FIFREEZE ioctl or by
> > suspending the block device; this state persists until userspace thaws
> > the filesystem with the FITHAW ioctl or resuming the block device.
> > Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for
> > the fsfreeze ioctl") we only allow the first freeze command to succeed.
> > 
> > The kernel may decide that it is necessary to freeze a filesystem for
> > its own internal purposes, such as suspends in progress, filesystem fsck
> > activities, or quiescing a device prior to removal.  Userspace thaw
> > commands must never break a kernel freeze, and kernel thaw commands
> > shouldn't undo userspace's freeze command.
> > 
> > Introduce a couple of freeze holder flags and wire it into the
> > sb_writers state.  One kernel and one userspace freeze are allowed to
> > coexist at the same time; the filesystem will not thaw until both are
> > lifted.
> > 
> > I wonder if the f2fs/gfs2 code should be using a kernel freeze here, but
> > for now we'll use FREEZE_HOLDER_USERSPACE to preserve existing
> > behaviors.
> > 
> > Cc: mcgrof@kernel.org
> > Cc: jack@suse.cz
> > Cc: hch@infradead.org
> > Cc: ruansy.fnst@fujitsu.com
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> 
> Thanks Darrick. Some comments below.
> 
> > +static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
> > +{
> > +	/* Someone else already holds this type of freeze */
> > +	if (sb->s_writers.freeze_holders & who)
> > +		return -EBUSY;
> > +
> > +	WARN_ON(sb->s_writers.freeze_holders == 0);
> > +
> > +	sb->s_writers.freeze_holders |= who;
> > +	return 0;
> > +}
> > +
> >  /**
> >   * freeze_super - lock the filesystem and force it into a consistent state
> >   * @sb: the super to lock
> > + * @who: FREEZE_HOLDER_USERSPACE if userspace wants to freeze the fs;
> > + * FREEZE_HOLDER_KERNEL if the kernel wants to freeze it
> >   *
> >   * Syncs the super to make sure the filesystem is consistent and calls the fs's
> > - * freeze_fs.  Subsequent calls to this without first thawing the fs will return
> > + * freeze_fs.  Subsequent calls to this without first thawing the fs may return
> >   * -EBUSY.
> >   *
> > + * The @who argument distinguishes between the kernel and userspace trying to
> > + * freeze the filesystem.  Although there cannot be multiple kernel freezes or
> > + * multiple userspace freezes in effect at any given time, the kernel and
> > + * userspace can both hold a filesystem frozen.  The filesystem remains frozen
> > + * until there are no kernel or userspace freezes in effect.
> > + *
> >   * During this function, sb->s_writers.frozen goes through these values:
> >   *
> >   * SB_UNFROZEN: File system is normal, all writes progress as usual.
> > @@ -1668,12 +1688,19 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
> >   *
> >   * sb->s_writers.frozen is protected by sb->s_umount.
> >   */
> > -int freeze_super(struct super_block *sb)
> > +int freeze_super(struct super_block *sb, enum freeze_holder who)
> >  {
> >  	int ret;
> >  
> >  	atomic_inc(&sb->s_active);
> >  	down_write(&sb->s_umount);
> > +
> > +	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
> > +		ret = freeze_frozen_super(sb, who);
> > +		deactivate_locked_super(sb);
> > +		return ret;
> > +	}
> 
> I find it a little bit odd that the second freeze holder does not get the
> active superblock reference. It all looks correct but I'd find it easier to
> reason about (and also eventually lift the reference counting out of
> freeze_super()) if the rule was: Successful freeze_super() <=> you have
> s_active reference.

Ok, I'll keep the active ref when a freezer starts sharing a freeze.

> > +
> >  	if (sb->s_writers.frozen != SB_UNFROZEN) {
> 
> I still find it strange that:
> 
> Task1					Task2
> 
> while (1) {				while (1) {
>   ioctl(f, FIFREEZE);			  freeze_super(sb, FREEZE_HOLDER_KERNEL);
>   ioctl(f, FITHAW);			  thaw_super(sb, FREEZE_HOLDER_KERNEL);
> }					}
> 
> will randomly end up returning EBUSY to Task1 or Task2 although there is no
> real conflict. I think it will be much more useful behavior if in case of
> this conflict the second holder just waited for freezing procedure to finish
> and then report success. Because I don't think either caller can do
> anything sensible with this race other than retry but it cannot really
> distinguish EBUSY as in "someone other holder of the same type has the sb
> already frozen" from "freezing raced with holder of a different type".

<nod> I'll copy this justification into the commit message for the
second patch.

> 
> >  		deactivate_locked_super(sb);
> >  		return -EBUSY;
> > @@ -1684,8 +1711,10 @@ int freeze_super(struct super_block *sb)
> >  		return 0;	/* sic - it's "nothing to do" */
> >  	}
> >  
> > +
> 
> Why the extra empty line?

Whitespace damage.

> >  	if (sb_rdonly(sb)) {
> >  		/* Nothing to do really... */
> > +		sb->s_writers.freeze_holders |= who;
> >  		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> >  		up_write(&sb->s_umount);
> >  		return 0;
> > @@ -1731,6 +1760,7 @@ int freeze_super(struct super_block *sb)
> >  	 * For debugging purposes so that fs can warn if it sees write activity
> >  	 * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super().
> >  	 */
> > +	sb->s_writers.freeze_holders |= who;
> >  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> >  	lockdep_sb_freeze_release(sb);
> >  	up_write(&sb->s_umount);
> > @@ -1738,16 +1768,47 @@ int freeze_super(struct super_block *sb)
> >  }
> >  EXPORT_SYMBOL(freeze_super);
> >  
> > -static int thaw_super_locked(struct super_block *sb)
> > +static int try_thaw_shared_super(struct super_block *sb, enum freeze_holder who)
> > +{
> > +	/* Freeze is not held by this type? */
> > +	if (!(sb->s_writers.freeze_holders & who))
> > +		return -EINVAL;
> > +
> > +	/* Also frozen for someone else? */
> > +	if (sb->s_writers.freeze_holders & ~who) {
> > +		sb->s_writers.freeze_holders &= ~who;
> > +		return 0;
> > +	}
> > +
> > +	/* Magic value to proceed with thaw */
> > +	return 1;
> > +}
> > +
> > +/*
> > + * Undoes the effect of a freeze_super_locked call.  If the filesystem is
> > + * frozen both by userspace and the kernel, a thaw call from either source
> > + * removes that state without releasing the other state or unlocking the
> > + * filesystem.
> > + */
> > +static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
> >  {
> >  	int error;
> >  
> > +	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
> > +		error = try_thaw_shared_super(sb, who);
> > +		if (error != 1) {
> > +			up_write(&sb->s_umount);
> > +			return error;
> > +		}
> > +	}
> > +
> >  	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
> >  		up_write(&sb->s_umount);
> >  		return -EINVAL;
> >  	}
> 
> I'd first check for the above condition and then just fold
> try_thaw_shared_super() into here. That way you can avoid the odd special
> return and the code will be actually more readable. Probably we should grow
> out_err label for:
> 
> 	up_write(&sb->s_umount);
> 	return error;
> 
> and use it for the error returns as well...

<shrug> we're only adding one of these, but I'll tack on a fourth patch
to clean these up.

--D

> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-12  4:01   ` Christoph Hellwig
@ 2023-06-12 18:33     ` Darrick J. Wong
  2023-06-12 18:47       ` Darrick J. Wong
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12 18:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, ruansy.fnst

On Sun, Jun 11, 2023 at 09:01:48PM -0700, Christoph Hellwig wrote:
> On Sun, Jun 11, 2023 at 08:15:28PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Jan Kara suggested that when one thread is in the middle of freezing a
> > filesystem, another thread trying to freeze the same fs but with a
> > different freeze_holder should wait until the freezer reaches either end
> > state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.
> > 
> > Plumb in the extra coded needed to wait for the fs freezer to reach an
> > end state and try the freeze again.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/super.c |   27 +++++++++++++++++++++++++--
> >  1 file changed, 25 insertions(+), 2 deletions(-)
> > 
> > 
> > diff --git a/fs/super.c b/fs/super.c
> > index 36adccecc828..151e0eeff2c2 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -1647,6 +1647,15 @@ static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
> >  	return 0;
> >  }
> >  
> > +static void wait_for_partially_frozen(struct super_block *sb)
> > +{
> > +	up_write(&sb->s_umount);
> > +	wait_var_event(&sb->s_writers.frozen,
> > +			sb->s_writers.frozen == SB_UNFROZEN ||
> > +			sb->s_writers.frozen == SB_FREEZE_COMPLETE);
> > +	down_write(&sb->s_umount);
> 
> Does sb->s_writers.frozen need WRITE_ONCE/READ_ONCE treatment if we want
> to check it outside of s_umount?  Or should we maybe just open code
> wait_var_event and only drop the lock after checking the variable?

How about something like:

	do {
		up_write(&sb->s_umount);
		down_write(&sb->s_umount);
	} while (sb->s_writers.frozen != SB_UNFROZEN &&
		 sb->s_writers.frozen != SB_FREEZE_COMPLETE);

so that we always return in either end state of a freezer transition?

> >  	if (sb->s_writers.frozen != SB_UNFROZEN) {
> > -		deactivate_locked_super(sb);
> > -		return -EBUSY;
> > +		if (!try_again) {
> > +			deactivate_locked_super(sb);
> > +			return -EBUSY;
> > +		}
> > +
> > +		wait_for_partially_frozen(sb);
> > +		try_again = false;
> > +		goto retry;
> 
> Can you throw in a comment on wait we're only waiting for a partial
> freeze one here?

I didn't want a thread to get stuck in the retry forever if it always
loses the race.  However, I think any other threads running freeze_super
will always end at UNFROZEN or COMPLETE; and thaw_super always goes
straight froM COMPLETE to UNFROZEN, so I think I'll get rid of the retry
flag logic entirely.

--D

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-12 11:35   ` Jan Kara
@ 2023-06-12 18:36     ` Darrick J. Wong
  2023-06-13  7:52       ` Jan Kara
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12 18:36 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-xfs, linux-fsdevel, mcgrof, hch, ruansy.fnst

On Mon, Jun 12, 2023 at 01:35:26PM +0200, Jan Kara wrote:
> On Sun 11-06-23 20:15:28, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Jan Kara suggested that when one thread is in the middle of freezing a
> > filesystem, another thread trying to freeze the same fs but with a
> > different freeze_holder should wait until the freezer reaches either end
> > state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.
> > 
> > Plumb in the extra coded needed to wait for the fs freezer to reach an
> > end state and try the freeze again.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> 
> You can maybe copy my reasoning from my reply to your patch 1/3 to the
> changelog to explain why waiting is more desirable.

Done.

"Neither caller can do anything sensible with this race other than retry
but they cannot really distinguish EBUSY as in "someone other holder of
the same type has the sb already frozen" from "freezing raced with
holder of a different type"

is now added to the commit message.

> > @@ -1690,11 +1699,13 @@ static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
> >   */
> >  int freeze_super(struct super_block *sb, enum freeze_holder who)
> >  {
> > +	bool try_again = true;
> >  	int ret;
> >  
> >  	atomic_inc(&sb->s_active);
> >  	down_write(&sb->s_umount);
> >  
> > +retry:
> >  	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
> >  		ret = freeze_frozen_super(sb, who);
> >  		deactivate_locked_super(sb);
> > @@ -1702,8 +1713,14 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
> >  	}
> >  
> >  	if (sb->s_writers.frozen != SB_UNFROZEN) {
> > -		deactivate_locked_super(sb);
> > -		return -EBUSY;
> > +		if (!try_again) {
> > +			deactivate_locked_super(sb);
> > +			return -EBUSY;
> > +		}
> 
> Why only one retry? Do you envision someone will be freezing & thawing
> filesystem so intensively that livelocks are possible?

I was worried about that, but the more I think about it, the more I can
convince myself that we can take the simpler approach of cycling
s_umount until we get the state we want.

> In that case I'd
> think the system has other problems than freeze taking long... Honestly,
> I'd be looping as long as we either succeed or return error for other
> reasons.

Ok.

> What we could be doing to limit unnecessary waiting is that we'd update
> freeze_holders already when we enter freeze_super() and lock s_umount
> (bailing if our holder type is already set). That way we'd have at most one
> process for each holder type freezing the fs / waiting for freezing to
> complete.

<shrug> I don't know how often we even really have threads contending
for s_umount and elevated freeze state.  How about we go with the
simpler wait_for_partially_frozen and see if complaints crop up?

> > +
> > +		wait_for_partially_frozen(sb);
> > +		try_again = false;
> > +		goto retry;
> >  	}
> >  
> ...
> > @@ -1810,6 +1831,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
> >  	if (sb_rdonly(sb)) {
> >  		sb->s_writers.freeze_holders &= ~who;
> >  		sb->s_writers.frozen = SB_UNFROZEN;
> > +		wake_up_var(&sb->s_writers.frozen);
> >  		goto out;
> >  	}
> >  
> > @@ -1828,6 +1850,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
> >  
> >  	sb->s_writers.freeze_holders &= ~who;
> >  	sb->s_writers.frozen = SB_UNFROZEN;
> > +	wake_up_var(&sb->s_writers.frozen);
> >  	sb_freeze_unlock(sb, SB_FREEZE_FS);
> >  out:
> >  	wake_up(&sb->s_writers.wait_unfrozen);
> 
> I don't think wakeups on thaw are really needed because the waiter should
> be woken already once the freezing task exits from freeze_super(). I guess
> you've sprinkled them here just for good measure?

I've changed wait_for_partially_frozen to cycle s_umount until we find
the locked state we want, so the wake_up_var calls aren't needed
anymore.

--D

> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-12 18:33     ` Darrick J. Wong
@ 2023-06-12 18:47       ` Darrick J. Wong
  0 siblings, 0 replies; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-12 18:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, ruansy.fnst

On Mon, Jun 12, 2023 at 11:33:02AM -0700, Darrick J. Wong wrote:
> On Sun, Jun 11, 2023 at 09:01:48PM -0700, Christoph Hellwig wrote:
> > On Sun, Jun 11, 2023 at 08:15:28PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > Jan Kara suggested that when one thread is in the middle of freezing a
> > > filesystem, another thread trying to freeze the same fs but with a
> > > different freeze_holder should wait until the freezer reaches either end
> > > state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.
> > > 
> > > Plumb in the extra coded needed to wait for the fs freezer to reach an
> > > end state and try the freeze again.
> > > 
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > ---
> > >  fs/super.c |   27 +++++++++++++++++++++++++--
> > >  1 file changed, 25 insertions(+), 2 deletions(-)
> > > 
> > > 
> > > diff --git a/fs/super.c b/fs/super.c
> > > index 36adccecc828..151e0eeff2c2 100644
> > > --- a/fs/super.c
> > > +++ b/fs/super.c
> > > @@ -1647,6 +1647,15 @@ static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who)
> > >  	return 0;
> > >  }
> > >  
> > > +static void wait_for_partially_frozen(struct super_block *sb)
> > > +{
> > > +	up_write(&sb->s_umount);
> > > +	wait_var_event(&sb->s_writers.frozen,
> > > +			sb->s_writers.frozen == SB_UNFROZEN ||
> > > +			sb->s_writers.frozen == SB_FREEZE_COMPLETE);
> > > +	down_write(&sb->s_umount);
> > 
> > Does sb->s_writers.frozen need WRITE_ONCE/READ_ONCE treatment if we want
> > to check it outside of s_umount?  Or should we maybe just open code
> > wait_var_event and only drop the lock after checking the variable?
> 
> How about something like:
> 
> 	do {
> 		up_write(&sb->s_umount);
> 		down_write(&sb->s_umount);
> 	} while (sb->s_writers.frozen != SB_UNFROZEN &&
> 		 sb->s_writers.frozen != SB_FREEZE_COMPLETE);
> 
> so that we always return in either end state of a freezer transition?

Of course as soon as I hit send I realize that no, we don't want to be
cycling s_umount repeatedly even sb->s_writers.frozen hasn't changed.
And maybe we want the wait to be killable too?

static int wait_for_partially_frozen(struct super_block *sb)
{
	int ret = 0;

	do {
		unsigned short old = sb->s_writers.frozen;

		up_write(&sb->s_umount);
		ret = wait_var_event_killable(&sb->s_writers.frozen,
					       sb->s_writers.frozen != old);
		down_write(&sb->s_umount);
	} while (ret == 0 &&
		 sb->s_writers.frozen != SB_UNFROZEN &&
		 sb->s_writers.frozen != SB_FREEZE_COMPLETE);

	return ret;
}

I'll try this out and report back.

--D

> > >  	if (sb->s_writers.frozen != SB_UNFROZEN) {
> > > -		deactivate_locked_super(sb);
> > > -		return -EBUSY;
> > > +		if (!try_again) {
> > > +			deactivate_locked_super(sb);
> > > +			return -EBUSY;
> > > +		}
> > > +
> > > +		wait_for_partially_frozen(sb);
> > > +		try_again = false;
> > > +		goto retry;
> > 
> > Can you throw in a comment on wait we're only waiting for a partial
> > freeze one here?
> 
> I didn't want a thread to get stuck in the retry forever if it always
> loses the race.  However, I think any other threads running freeze_super
> will always end at UNFROZEN or COMPLETE; and thaw_super always goes
> straight froM COMPLETE to UNFROZEN, so I think I'll get rid of the retry
> flag logic entirely.
> 
> --D

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-12 18:36     ` Darrick J. Wong
@ 2023-06-13  7:52       ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2023-06-13  7:52 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jan Kara, linux-xfs, linux-fsdevel, mcgrof, hch, ruansy.fnst

On Mon 12-06-23 11:36:57, Darrick J. Wong wrote:
> On Mon, Jun 12, 2023 at 01:35:26PM +0200, Jan Kara wrote:
> > What we could be doing to limit unnecessary waiting is that we'd update
> > freeze_holders already when we enter freeze_super() and lock s_umount
> > (bailing if our holder type is already set). That way we'd have at most one
> > process for each holder type freezing the fs / waiting for freezing to
> > complete.
> 
> <shrug> I don't know how often we even really have threads contending
> for s_umount and elevated freeze state.  How about we go with the
> simpler wait_for_partially_frozen and see if complaints crop up?

Yeah, I'm for the simpler approach as well. This was more a suggestion if
you think that is not viable.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-16  1:48 [PATCHSET v2 0/3] fs: kernel and userspace filesystem freeze Darrick J. Wong
@ 2023-06-16  1:48 ` Darrick J. Wong
  2023-06-16  2:19   ` Dave Chinner
                     ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Darrick J. Wong @ 2023-06-16  1:48 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, hch, ruansy.fnst

From: Darrick J. Wong <djwong@kernel.org>

Jan Kara suggested that when one thread is in the middle of freezing a
filesystem, another thread trying to freeze the same fs but with a
different freeze_holder should wait until the freezer reaches either end
state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.

Neither caller can do anything sensible with this race other than retry
but they cannot really distinguish EBUSY as in "someone other holder of
the same type has the sb already frozen" from "freezing raced with
holder of a different type".

Plumb in the extra coded needed to wait for the fs freezer to reach an
end state and try the freeze again.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/super.c |   34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)


diff --git a/fs/super.c b/fs/super.c
index 81fb67157cba..1b8ea81788d4 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1635,6 +1635,24 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
 		percpu_up_write(sb->s_writers.rw_sem + level);
 }
 
+static int wait_for_partially_frozen(struct super_block *sb)
+{
+	int ret = 0;
+
+	do {
+		unsigned short old = sb->s_writers.frozen;
+
+		up_write(&sb->s_umount);
+		ret = wait_var_event_killable(&sb->s_writers.frozen,
+					       sb->s_writers.frozen != old);
+		down_write(&sb->s_umount);
+	} while (ret == 0 &&
+		 sb->s_writers.frozen != SB_UNFROZEN &&
+		 sb->s_writers.frozen != SB_FREEZE_COMPLETE);
+
+	return ret;
+}
+
 /**
  * freeze_super - lock the filesystem and force it into a consistent state
  * @sb: the super to lock
@@ -1686,6 +1704,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 	atomic_inc(&sb->s_active);
 	down_write(&sb->s_umount);
 
+retry:
 	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
 		if (sb->s_writers.freeze_holders & who) {
 			deactivate_locked_super(sb);
@@ -1704,8 +1723,13 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 	}
 
 	if (sb->s_writers.frozen != SB_UNFROZEN) {
-		deactivate_locked_super(sb);
-		return -EBUSY;
+		ret = wait_for_partially_frozen(sb);
+		if (ret) {
+			deactivate_locked_super(sb);
+			return ret;
+		}
+
+		goto retry;
 	}
 
 	if (!(sb->s_flags & SB_BORN)) {
@@ -1717,6 +1741,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 		/* Nothing to do really... */
 		sb->s_writers.freeze_holders |= who;
 		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
+		wake_up_var(&sb->s_writers.frozen);
 		up_write(&sb->s_umount);
 		return 0;
 	}
@@ -1737,6 +1762,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 		sb->s_writers.frozen = SB_UNFROZEN;
 		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
 		wake_up(&sb->s_writers.wait_unfrozen);
+		wake_up_var(&sb->s_writers.frozen);
 		deactivate_locked_super(sb);
 		return ret;
 	}
@@ -1753,6 +1779,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 			sb->s_writers.frozen = SB_UNFROZEN;
 			sb_freeze_unlock(sb, SB_FREEZE_FS);
 			wake_up(&sb->s_writers.wait_unfrozen);
+			wake_up_var(&sb->s_writers.frozen);
 			deactivate_locked_super(sb);
 			return ret;
 		}
@@ -1763,6 +1790,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
 	 */
 	sb->s_writers.freeze_holders |= who;
 	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
+	wake_up_var(&sb->s_writers.frozen);
 	lockdep_sb_freeze_release(sb);
 	up_write(&sb->s_umount);
 	return 0;
@@ -1803,6 +1831,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
 	if (sb_rdonly(sb)) {
 		sb->s_writers.freeze_holders &= ~who;
 		sb->s_writers.frozen = SB_UNFROZEN;
+		wake_up_var(&sb->s_writers.frozen);
 		goto out;
 	}
 
@@ -1821,6 +1850,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
 
 	sb->s_writers.freeze_holders &= ~who;
 	sb->s_writers.frozen = SB_UNFROZEN;
+	wake_up_var(&sb->s_writers.frozen);
 	sb_freeze_unlock(sb, SB_FREEZE_FS);
 out:
 	wake_up(&sb->s_writers.wait_unfrozen);


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-16  1:48 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
@ 2023-06-16  2:19   ` Dave Chinner
  2023-06-16  5:52   ` Christoph Hellwig
  2023-06-16 13:24   ` Jan Kara
  2 siblings, 0 replies; 20+ messages in thread
From: Dave Chinner @ 2023-06-16  2:19 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, hch, ruansy.fnst

On Thu, Jun 15, 2023 at 06:48:38PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Jan Kara suggested that when one thread is in the middle of freezing a
> filesystem, another thread trying to freeze the same fs but with a
> different freeze_holder should wait until the freezer reaches either end
> state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.
> 
> Neither caller can do anything sensible with this race other than retry
> but they cannot really distinguish EBUSY as in "someone other holder of
> the same type has the sb already frozen" from "freezing raced with
> holder of a different type".
> 
> Plumb in the extra coded needed to wait for the fs freezer to reach an
> end state and try the freeze again.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/super.c |   34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)

Simple enough. I was going to comment about replacing wait_unfrozen
with a variant on wait_for_partially_frozen(), but then I looked at
the next patch....

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-16  1:48 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
  2023-06-16  2:19   ` Dave Chinner
@ 2023-06-16  5:52   ` Christoph Hellwig
  2023-06-16 13:24   ` Jan Kara
  2 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-06-16  5:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, hch, ruansy.fnst

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] fs: wait for partially frozen filesystems
  2023-06-16  1:48 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
  2023-06-16  2:19   ` Dave Chinner
  2023-06-16  5:52   ` Christoph Hellwig
@ 2023-06-16 13:24   ` Jan Kara
  2 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2023-06-16 13:24 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, linux-fsdevel, mcgrof, jack, hch, ruansy.fnst

On Thu 15-06-23 18:48:38, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Jan Kara suggested that when one thread is in the middle of freezing a
> filesystem, another thread trying to freeze the same fs but with a
> different freeze_holder should wait until the freezer reaches either end
> state (UNFROZEN or COMPLETE) instead of returning EBUSY immediately.
> 
> Neither caller can do anything sensible with this race other than retry
> but they cannot really distinguish EBUSY as in "someone other holder of
						  ^^^ some

> the same type has the sb already frozen" from "freezing raced with
> holder of a different type".
> 
> Plumb in the extra coded needed to wait for the fs freezer to reach an
		     ^^^ code

> end state and try the freeze again.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Besides these typo fixes the patch looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/super.c |   34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/super.c b/fs/super.c
> index 81fb67157cba..1b8ea81788d4 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1635,6 +1635,24 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
>  		percpu_up_write(sb->s_writers.rw_sem + level);
>  }
>  
> +static int wait_for_partially_frozen(struct super_block *sb)
> +{
> +	int ret = 0;
> +
> +	do {
> +		unsigned short old = sb->s_writers.frozen;
> +
> +		up_write(&sb->s_umount);
> +		ret = wait_var_event_killable(&sb->s_writers.frozen,
> +					       sb->s_writers.frozen != old);
> +		down_write(&sb->s_umount);
> +	} while (ret == 0 &&
> +		 sb->s_writers.frozen != SB_UNFROZEN &&
> +		 sb->s_writers.frozen != SB_FREEZE_COMPLETE);
> +
> +	return ret;
> +}
> +
>  /**
>   * freeze_super - lock the filesystem and force it into a consistent state
>   * @sb: the super to lock
> @@ -1686,6 +1704,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  	atomic_inc(&sb->s_active);
>  	down_write(&sb->s_umount);
>  
> +retry:
>  	if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) {
>  		if (sb->s_writers.freeze_holders & who) {
>  			deactivate_locked_super(sb);
> @@ -1704,8 +1723,13 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  	}
>  
>  	if (sb->s_writers.frozen != SB_UNFROZEN) {
> -		deactivate_locked_super(sb);
> -		return -EBUSY;
> +		ret = wait_for_partially_frozen(sb);
> +		if (ret) {
> +			deactivate_locked_super(sb);
> +			return ret;
> +		}
> +
> +		goto retry;
>  	}
>  
>  	if (!(sb->s_flags & SB_BORN)) {
> @@ -1717,6 +1741,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  		/* Nothing to do really... */
>  		sb->s_writers.freeze_holders |= who;
>  		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> +		wake_up_var(&sb->s_writers.frozen);
>  		up_write(&sb->s_umount);
>  		return 0;
>  	}
> @@ -1737,6 +1762,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  		sb->s_writers.frozen = SB_UNFROZEN;
>  		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
>  		wake_up(&sb->s_writers.wait_unfrozen);
> +		wake_up_var(&sb->s_writers.frozen);
>  		deactivate_locked_super(sb);
>  		return ret;
>  	}
> @@ -1753,6 +1779,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  			sb->s_writers.frozen = SB_UNFROZEN;
>  			sb_freeze_unlock(sb, SB_FREEZE_FS);
>  			wake_up(&sb->s_writers.wait_unfrozen);
> +			wake_up_var(&sb->s_writers.frozen);
>  			deactivate_locked_super(sb);
>  			return ret;
>  		}
> @@ -1763,6 +1790,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who)
>  	 */
>  	sb->s_writers.freeze_holders |= who;
>  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> +	wake_up_var(&sb->s_writers.frozen);
>  	lockdep_sb_freeze_release(sb);
>  	up_write(&sb->s_umount);
>  	return 0;
> @@ -1803,6 +1831,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
>  	if (sb_rdonly(sb)) {
>  		sb->s_writers.freeze_holders &= ~who;
>  		sb->s_writers.frozen = SB_UNFROZEN;
> +		wake_up_var(&sb->s_writers.frozen);
>  		goto out;
>  	}
>  
> @@ -1821,6 +1850,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
>  
>  	sb->s_writers.freeze_holders &= ~who;
>  	sb->s_writers.frozen = SB_UNFROZEN;
> +	wake_up_var(&sb->s_writers.frozen);
>  	sb_freeze_unlock(sb, SB_FREEZE_FS);
>  out:
>  	wake_up(&sb->s_writers.wait_unfrozen);
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-06-16 13:24 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-12  3:15 [PATCHSET RFC 0/3] fs: kernel and userspace filesystem freeze Darrick J. Wong
2023-06-12  3:15 ` [PATCH 1/3] fs: distinguish between user initiated freeze and kernel initiated freeze Darrick J. Wong
2023-06-12  3:58   ` Christoph Hellwig
2023-06-12 18:09     ` Darrick J. Wong
2023-06-12 11:08   ` Jan Kara
2023-06-12 11:14     ` Jan Kara
2023-06-12 18:16     ` Darrick J. Wong
2023-06-12  3:15 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
2023-06-12  4:01   ` Christoph Hellwig
2023-06-12 18:33     ` Darrick J. Wong
2023-06-12 18:47       ` Darrick J. Wong
2023-06-12 11:35   ` Jan Kara
2023-06-12 18:36     ` Darrick J. Wong
2023-06-13  7:52       ` Jan Kara
2023-06-12  3:15 ` [PATCH 3/3] fs: Drop wait_unfrozen wait queue Darrick J. Wong
2023-06-12 11:12   ` Jan Kara
  -- strict thread matches above, loose matches on Subject: below --
2023-06-16  1:48 [PATCHSET v2 0/3] fs: kernel and userspace filesystem freeze Darrick J. Wong
2023-06-16  1:48 ` [PATCH 2/3] fs: wait for partially frozen filesystems Darrick J. Wong
2023-06-16  2:19   ` Dave Chinner
2023-06-16  5:52   ` Christoph Hellwig
2023-06-16 13:24   ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).