[Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic

cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed

* [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic
@ 2023-06-12 16:33 Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 1/8] gfs2: Rename remaining "transaction" glock references Andreas Gruenbacher
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Currently, filesystem freeze is implemented inside freeze_go_sync(), a
glock state transition callback.  It turns out that in some scenarios,
freezing the filesystem in such a callback can deadlock.  To prevent
that from happening, rework the freeze / thaw logic and move it out of
the glock state engine.

Andreas Gruenbacher (8):
  gfs2: Rename remaining "transaction" glock references
  gfs2: Rename the {freeze,thaw}_super callbacks
  gfs2: Rename gfs2_freeze_lock{ => _shared }
  gfs2: Reconfiguring frozen filesystem already rejected
  gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR}
  gfs2: Rework freeze / thaw logic
  gfs2: Replace sd_freeze_state with SDF_FROZEN flag
  gfs2: gfs2_freeze_lock_shared cleanup

 fs/gfs2/glock.c      |   4 +-
 fs/gfs2/glops.c      |  52 +++++-------
 fs/gfs2/incore.h     |  10 +--
 fs/gfs2/log.c        |  11 +--
 fs/gfs2/ops_fstype.c |  15 +---
 fs/gfs2/recovery.c   |  28 +++----
 fs/gfs2/super.c      | 186 +++++++++++++++++++++++++++++++++----------
 fs/gfs2/super.h      |   1 +
 fs/gfs2/sys.c        |   4 +-
 fs/gfs2/trans.c      |   3 +-
 fs/gfs2/util.c       |  45 ++++-------
 fs/gfs2/util.h       |   3 +-
 12 files changed, 206 insertions(+), 156 deletions(-)

-- 
2.40.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 1/8] gfs2: Rename remaining "transaction" glock references
  2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
@ 2023-06-12 16:33 ` Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 2/8] gfs2: Rename the {freeze, thaw}_super callbacks Andreas Gruenbacher
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The transaction glock was repurposed to serve as the new freeze glock
years ago.  Don't refer to it as the transaction glock anymore.

Also, to be more precise, call it the "freeze glock" instead of the
"freeze lock".  Ditto for the journal glock.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/glock.c      | 4 ++--
 fs/gfs2/ops_fstype.c | 2 +-
 fs/gfs2/recovery.c   | 8 ++++----
 fs/gfs2/super.c      | 2 +-
 fs/gfs2/util.c       | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 5adc7d85dbf3..1438e7465e30 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -145,8 +145,8 @@ static void gfs2_glock_dealloc(struct rcu_head *rcu)
  *
  * We need to allow some glocks to be enqueued, dequeued, promoted, and demoted
  * when we're withdrawn. For example, to maintain metadata integrity, we should
- * disallow the use of inode and rgrp glocks when withdrawn. Other glocks, like
- * iopen or the transaction glocks may be safely used because none of their
+ * disallow the use of inode and rgrp glocks when withdrawn. Other glocks like
+ * the iopen or freeze glock may be safely used because none of their
  * metadata goes through the journal. So in general, we should disallow all
  * glocks that are journaled, and allow all the others. One exception is:
  * we need to allow our active journal to be promoted and demoted so others
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 9af9ddb61ca0..3b93e4a22dfd 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -434,7 +434,7 @@ static int init_locking(struct gfs2_sbd *sdp, struct gfs2_holder *mount_gh,
 	error = gfs2_glock_get(sdp, GFS2_FREEZE_LOCK, &gfs2_freeze_glops,
 			       CREATE, &sdp->sd_freeze_gl);
 	if (error) {
-		fs_err(sdp, "can't create transaction glock: %d\n", error);
+		fs_err(sdp, "can't create freeze glock: %d\n", error);
 		goto fail_rename;
 	}
 
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 2bb085a72e8e..d8e522f389aa 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -420,10 +420,10 @@ void gfs2_recover_func(struct work_struct *work)
 	if (sdp->sd_args.ar_spectator)
 		goto fail;
 	if (jd->jd_jid != sdp->sd_lockstruct.ls_jid) {
-		fs_info(sdp, "jid=%u: Trying to acquire journal lock...\n",
+		fs_info(sdp, "jid=%u: Trying to acquire journal glock...\n",
 			jd->jd_jid);
 		jlocked = 1;
-		/* Acquire the journal lock so we can do recovery */
+		/* Acquire the journal glock so we can do recovery */
 
 		error = gfs2_glock_nq_num(sdp, jd->jd_jid, &gfs2_journal_glops,
 					  LM_ST_EXCLUSIVE,
@@ -465,10 +465,10 @@ void gfs2_recover_func(struct work_struct *work)
 		ktime_ms_delta(t_jhd, t_jlck));
 
 	if (!(head.lh_flags & GFS2_LOG_HEAD_UNMOUNT)) {
-		fs_info(sdp, "jid=%u: Acquiring the transaction lock...\n",
+		fs_info(sdp, "jid=%u: Acquiring the freeze glock...\n",
 			jd->jd_jid);
 
-		/* Acquire a shared hold on the freeze lock */
+		/* Acquire a shared hold on the freeze glock */
 
 		error = gfs2_freeze_lock(sdp, &thaw_gh, LM_FLAG_PRIORITY);
 		if (error)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 3a7e7c31eb9c..03f45711100d 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -463,7 +463,7 @@ static int gfs2_write_inode(struct inode *inode, struct writeback_control *wbc)
  * @flags: The type of dirty
  *
  * Unfortunately it can be called under any combination of inode
- * glock and transaction lock, so we have to check carefully.
+ * glock and freeze glock, so we have to check carefully.
  *
  * At the moment this deals only with atime - it should be possible
  * to expand that role in future, once a review of the locking has
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 7a6aeffcdf5c..c84242ef4903 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -107,7 +107,7 @@ int gfs2_freeze_lock(struct gfs2_sbd *sdp, struct gfs2_holder *freeze_gh,
 	error = gfs2_glock_nq_init(sdp->sd_freeze_gl, LM_ST_SHARED, flags,
 				   freeze_gh);
 	if (error && error != GLR_TRYFAILED)
-		fs_err(sdp, "can't lock the freeze lock: %d\n", error);
+		fs_err(sdp, "can't lock the freeze glock: %d\n", error);
 	return error;
 }
 
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 2/8] gfs2: Rename the {freeze, thaw}_super callbacks
  2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 1/8] gfs2: Rename remaining "transaction" glock references Andreas Gruenbacher
@ 2023-06-12 16:33 ` Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 3/8] gfs2: Rename gfs2_freeze_lock{ => _shared } Andreas Gruenbacher
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Rename gfs2_freeze to gfs2_freeze_super and gfs2_unfreeze to
gfs2_thaw_super to match the names of the corresponding super
operations.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/super.c | 12 ++++++------
 fs/gfs2/util.c  |  2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 03f45711100d..dbda7e253a9c 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -697,12 +697,12 @@ void gfs2_freeze_func(struct work_struct *work)
 }
 
 /**
- * gfs2_freeze - prevent further writes to the filesystem
+ * gfs2_freeze_super - prevent further writes to the filesystem
  * @sb: the VFS structure for the filesystem
  *
  */
 
-static int gfs2_freeze(struct super_block *sb)
+static int gfs2_freeze_super(struct super_block *sb)
 {
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	int error;
@@ -742,12 +742,12 @@ static int gfs2_freeze(struct super_block *sb)
 }
 
 /**
- * gfs2_unfreeze - reallow writes to the filesystem
+ * gfs2_thaw_super - reallow writes to the filesystem
  * @sb: the VFS structure for the filesystem
  *
  */
 
-static int gfs2_unfreeze(struct super_block *sb)
+static int gfs2_thaw_super(struct super_block *sb)
 {
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 
@@ -1526,8 +1526,8 @@ const struct super_operations gfs2_super_ops = {
 	.evict_inode		= gfs2_evict_inode,
 	.put_super		= gfs2_put_super,
 	.sync_fs		= gfs2_sync_fs,
-	.freeze_super		= gfs2_freeze,
-	.thaw_super		= gfs2_unfreeze,
+	.freeze_super		= gfs2_freeze_super,
+	.thaw_super		= gfs2_thaw_super,
 	.statfs			= gfs2_statfs,
 	.drop_inode		= gfs2_drop_inode,
 	.show_options		= gfs2_show_options,
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index c84242ef4903..16ac3b3e7b88 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -188,7 +188,7 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 	sdp->sd_jinode_gh.gh_flags |= GL_NOCACHE;
 	gfs2_glock_dq(&sdp->sd_jinode_gh);
 	if (test_bit(SDF_FS_FROZEN, &sdp->sd_flags)) {
-		/* Make sure gfs2_unfreeze works if partially-frozen */
+		/* Make sure gfs2_thaw_super works if partially-frozen */
 		flush_work(&sdp->sd_freeze_work);
 		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
 		thaw_super(sdp->sd_vfs);
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 3/8] gfs2: Rename gfs2_freeze_lock{ => _shared }
  2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 1/8] gfs2: Rename remaining "transaction" glock references Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 2/8] gfs2: Rename the {freeze, thaw}_super callbacks Andreas Gruenbacher
@ 2023-06-12 16:33 ` Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 4/8] gfs2: Reconfiguring frozen filesystem already rejected Andreas Gruenbacher
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Rename gfs2_freeze_lock to gfs2_freeze_lock_shared to make it a bit more
obvious that this function establishes the "thawed" state of the freeze
glock.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/ops_fstype.c |  4 ++--
 fs/gfs2/recovery.c   |  2 +-
 fs/gfs2/super.c      |  2 +-
 fs/gfs2/util.c       | 10 +++++-----
 fs/gfs2/util.h       |  5 +++--
 5 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 3b93e4a22dfd..4fb25496e92f 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1269,7 +1269,7 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 		}
 	}
 
-	error = gfs2_freeze_lock(sdp, &freeze_gh, 0);
+	error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
 	if (error)
 		goto fail_per_node;
 
@@ -1592,7 +1592,7 @@ static int gfs2_reconfigure(struct fs_context *fc)
 	if ((sb->s_flags ^ fc->sb_flags) & SB_RDONLY) {
 		struct gfs2_holder freeze_gh;
 
-		error = gfs2_freeze_lock(sdp, &freeze_gh, 0);
+		error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
 		if (error)
 			return -EINVAL;
 
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index d8e522f389aa..61ef07da40b2 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -470,7 +470,7 @@ void gfs2_recover_func(struct work_struct *work)
 
 		/* Acquire a shared hold on the freeze glock */
 
-		error = gfs2_freeze_lock(sdp, &thaw_gh, LM_FLAG_PRIORITY);
+		error = gfs2_freeze_lock_shared(sdp, &thaw_gh, LM_FLAG_PRIORITY);
 		if (error)
 			goto fail_gunlock_ji;
 
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index dbda7e253a9c..0fecd1887d0c 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -677,7 +677,7 @@ void gfs2_freeze_func(struct work_struct *work)
 	struct super_block *sb = sdp->sd_vfs;
 
 	atomic_inc(&sb->s_active);
-	error = gfs2_freeze_lock(sdp, &freeze_gh, 0);
+	error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
 	if (error) {
 		gfs2_assert_withdraw(sdp, 0);
 	} else {
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 16ac3b3e7b88..817c4f42ae7b 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -93,13 +93,13 @@ int check_journal_clean(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 }
 
 /**
- * gfs2_freeze_lock - hold the freeze glock
+ * gfs2_freeze_lock_shared - hold the freeze glock
  * @sdp: the superblock
  * @freeze_gh: pointer to the requested holder
  * @caller_flags: any additional flags needed by the caller
  */
-int gfs2_freeze_lock(struct gfs2_sbd *sdp, struct gfs2_holder *freeze_gh,
-		     int caller_flags)
+int gfs2_freeze_lock_shared(struct gfs2_sbd *sdp, struct gfs2_holder *freeze_gh,
+			    int caller_flags)
 {
 	int flags = LM_FLAG_NOEXP | GL_EXACT | caller_flags;
 	int error;
@@ -157,8 +157,8 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 		gfs2_holder_mark_uninitialized(&freeze_gh);
 		if (sdp->sd_freeze_gl &&
 		    !gfs2_glock_is_locked_by_me(sdp->sd_freeze_gl)) {
-			ret = gfs2_freeze_lock(sdp, &freeze_gh,
-				       log_write_allowed ? 0 : LM_FLAG_TRY);
+			ret = gfs2_freeze_lock_shared(sdp, &freeze_gh,
+					log_write_allowed ? 0 : LM_FLAG_TRY);
 			if (ret == GLR_TRYFAILED)
 				ret = 0;
 		}
diff --git a/fs/gfs2/util.h b/fs/gfs2/util.h
index 78ec190f4155..3291e33e81e9 100644
--- a/fs/gfs2/util.h
+++ b/fs/gfs2/util.h
@@ -149,8 +149,9 @@ int gfs2_io_error_i(struct gfs2_sbd *sdp, const char *function,
 
 extern int check_journal_clean(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 			       bool verbose);
-extern int gfs2_freeze_lock(struct gfs2_sbd *sdp,
-			    struct gfs2_holder *freeze_gh, int caller_flags);
+extern int gfs2_freeze_lock_shared(struct gfs2_sbd *sdp,
+				   struct gfs2_holder *freeze_gh,
+				   int caller_flags);
 extern void gfs2_freeze_unlock(struct gfs2_holder *freeze_gh);
 
 #define gfs2_io_error(sdp) \
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 4/8] gfs2: Reconfiguring frozen filesystem already rejected
  2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
                   ` (2 preceding siblings ...)
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 3/8] gfs2: Rename gfs2_freeze_lock{ => _shared } Andreas Gruenbacher
@ 2023-06-12 16:33 ` Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 5/8] gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR} Andreas Gruenbacher
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Reconfiguring a frozen filesystem is already rejected in
reconfigure_super(), so there is no need to check for that condition
again at the filesystem level.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/ops_fstype.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 4fb25496e92f..12eedba45dbb 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1590,12 +1590,6 @@ static int gfs2_reconfigure(struct fs_context *fc)
 		fc->sb_flags |= SB_RDONLY;
 
 	if ((sb->s_flags ^ fc->sb_flags) & SB_RDONLY) {
-		struct gfs2_holder freeze_gh;
-
-		error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
-		if (error)
-			return -EINVAL;
-
 		if (fc->sb_flags & SB_RDONLY) {
 			gfs2_make_fs_ro(sdp);
 		} else {
@@ -1603,7 +1597,6 @@ static int gfs2_reconfigure(struct fs_context *fc)
 			if (error)
 				errorfc(fc, "unable to remount read-write");
 		}
-		gfs2_freeze_unlock(&freeze_gh);
 	}
 	sdp->sd_args = *newargs;
 
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 5/8] gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR}
  2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
                   ` (3 preceding siblings ...)
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 4/8] gfs2: Reconfiguring frozen filesystem already rejected Andreas Gruenbacher
@ 2023-06-12 16:33 ` Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Rename the SDF_FS_FROZEN flag to SDF_FREEZE_INITIATOR to indicate more
clearly that the node that has this flag set is the initiator of the
freeze.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com
---
 fs/gfs2/incore.h | 2 +-
 fs/gfs2/super.c  | 8 ++++----
 fs/gfs2/sys.c    | 2 +-
 fs/gfs2/util.c   | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 79485329118b..1f10529ebcf5 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -600,7 +600,7 @@ enum {
 	SDF_RORECOVERY		= 7, /* read only recovery */
 	SDF_SKIP_DLM_UNLOCK	= 8,
 	SDF_FORCE_AIL_FLUSH     = 9,
-	SDF_FS_FROZEN           = 10,
+	SDF_FREEZE_INITIATOR	= 10,
 	SDF_WITHDRAWING		= 11, /* Will withdraw eventually */
 	SDF_WITHDRAW_IN_PROG	= 12, /* Withdraw is in progress */
 	SDF_REMOTE_WITHDRAW	= 13, /* Performing remote recovery */
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 0fecd1887d0c..5f3e699ae922 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -691,8 +691,8 @@ void gfs2_freeze_func(struct work_struct *work)
 		gfs2_freeze_unlock(&freeze_gh);
 	}
 	deactivate_super(sb);
-	clear_bit_unlock(SDF_FS_FROZEN, &sdp->sd_flags);
-	wake_up_bit(&sdp->sd_flags, SDF_FS_FROZEN);
+	clear_bit_unlock(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
+	wake_up_bit(&sdp->sd_flags, SDF_FREEZE_INITIATOR);
 	return;
 }
 
@@ -735,7 +735,7 @@ static int gfs2_freeze_super(struct super_block *sb)
 		fs_err(sdp, "retrying...\n");
 		msleep(1000);
 	}
-	set_bit(SDF_FS_FROZEN, &sdp->sd_flags);
+	set_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
 out:
 	mutex_unlock(&sdp->sd_freeze_mutex);
 	return error;
@@ -760,7 +760,7 @@ static int gfs2_thaw_super(struct super_block *sb)
 
 	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
 	mutex_unlock(&sdp->sd_freeze_mutex);
-	return wait_on_bit(&sdp->sd_flags, SDF_FS_FROZEN, TASK_INTERRUPTIBLE);
+	return wait_on_bit(&sdp->sd_flags, SDF_FREEZE_INITIATOR, TASK_INTERRUPTIBLE);
 }
 
 /**
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 454dc2ff8b5e..bc752de01573 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -111,7 +111,7 @@ static ssize_t status_show(struct gfs2_sbd *sdp, char *buf)
 		     test_bit(SDF_RORECOVERY, &f),
 		     test_bit(SDF_SKIP_DLM_UNLOCK, &f),
 		     test_bit(SDF_FORCE_AIL_FLUSH, &f),
-		     test_bit(SDF_FS_FROZEN, &f),
+		     test_bit(SDF_FREEZE_INITIATOR, &f),
 		     test_bit(SDF_WITHDRAWING, &f),
 		     test_bit(SDF_WITHDRAW_IN_PROG, &f),
 		     test_bit(SDF_REMOTE_WITHDRAW, &f),
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 817c4f42ae7b..1743caee5505 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -187,7 +187,7 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 	}
 	sdp->sd_jinode_gh.gh_flags |= GL_NOCACHE;
 	gfs2_glock_dq(&sdp->sd_jinode_gh);
-	if (test_bit(SDF_FS_FROZEN, &sdp->sd_flags)) {
+	if (test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags)) {
 		/* Make sure gfs2_thaw_super works if partially-frozen */
 		flush_work(&sdp->sd_freeze_work);
 		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic
  2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
                   ` (4 preceding siblings ...)
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 5/8] gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR} Andreas Gruenbacher
@ 2023-06-12 16:33 ` Andreas Gruenbacher
  2023-06-13 10:28   ` Andrew Price
  2023-06-13 13:05   ` Alexander Aring
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 7/8] gfs2: Replace sd_freeze_state with SDF_FROZEN flag Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 8/8] gfs2: gfs2_freeze_lock_shared cleanup Andreas Gruenbacher
  7 siblings, 2 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

So far, at mount time, gfs2 would take the freeze glock in shared mode
and then immediately drop it again, turning it into a cached glock that
can be reclaimed at any time.  To freeze the filesystem cluster-wide,
the node initiating the freeze would take the freeze glock in exclusive
mode, which would cause the freeze glock's freeze_go_sync() callback to
run on each node.  There, gfs2 would freeze the filesystem and schedule
gfs2_freeze_func() to run.  gfs2_freeze_func() would re-acquire the
freeze glock in shared mode, thaw the filesystem, and drop the freeze
glock again.  The initiating node would keep the freeze glock held in
exclusive mode.  To thaw the filesystem, the initiating node would drop
the freeze glock again, which would allow gfs2_freeze_func() to resume
on all nodes, leaving the filesystem in the thawed state.

It turns out that in freeze_go_sync(), we cannot reliably and safely
freeze the filesystem.  This is primarily because the final unmount of a
filesystem takes a write lock on the s_umount rw semaphore before
calling into gfs2_put_super(), and freeze_go_sync() needs to call
freeze_super() which also takes a write lock on the same semaphore,
causing a deadlock.  We could work around this by trying to take an
active reference on the super block first, which would prevent unmount
from running at the same time.  But that can fail, and freeze_go_sync()
isn't actually allowed to fail.

To get around this, this patch changes the freeze glock locking scheme
as follows:

At mount time, each node takes the freeze glock in shared mode.  To
freeze a filesystem, the initiating node first freezes the filesystem
locally and then drops and re-acquires the freeze glock in exclusive
mode.  All other nodes notice that there is contention on the freeze
glock in their go_callback callbacks, and they schedule
gfs2_freeze_func() to run.  There, they freeze the filesystem locally
and drop and re-acquire the freeze glock before re-thawing the
filesystem.  This is happening outside of the glock state engine, so
there, we are allowed to fail.

From a cluster point of view, taking and immediately dropping a glock is
indistinguishable from taking the glock and only dropping it upon
contention, so this new scheme is compatible with the old one.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/glops.c      |  52 +++++--------
 fs/gfs2/log.c        |   2 -
 fs/gfs2/ops_fstype.c |   5 +-
 fs/gfs2/recovery.c   |  24 +++---
 fs/gfs2/super.c      | 179 ++++++++++++++++++++++++++++++++++---------
 fs/gfs2/super.h      |   1 +
 fs/gfs2/util.c       |  32 +++-----
 7 files changed, 185 insertions(+), 110 deletions(-)

diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 01d433ed6ce7..7c48c7afd6a4 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -561,47 +561,33 @@ static void inode_go_dump(struct seq_file *seq, struct gfs2_glock *gl,
 }
 
 /**
- * freeze_go_sync - promote/demote the freeze glock
+ * freeze_go_callback - A cluster node is requesting a freeze
  * @gl: the glock
+ * @remote: true if this came from a different cluster node
  */
 
-static int freeze_go_sync(struct gfs2_glock *gl)
+static void freeze_go_callback(struct gfs2_glock *gl, bool remote)
 {
-	int error = 0;
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
+	struct super_block *sb = sdp->sd_vfs;
+
+	if (!remote ||
+	    gl->gl_state != LM_ST_SHARED ||
+	    gl->gl_demote_state != LM_ST_UNLOCKED)
+		return;
 
 	/*
-	 * We need to check gl_state == LM_ST_SHARED here and not gl_req ==
-	 * LM_ST_EXCLUSIVE. That's because when any node does a freeze,
-	 * all the nodes should have the freeze glock in SH mode and they all
-	 * call do_xmote: One for EX and the others for UN. They ALL must
-	 * freeze locally, and they ALL must queue freeze work. The freeze_work
-	 * calls freeze_func, which tries to reacquire the freeze glock in SH,
-	 * effectively waiting for the thaw on the node who holds it in EX.
-	 * Once thawed, the work func acquires the freeze glock in
-	 * SH and everybody goes back to thawed.
+	 * Try to get an active super block reference to prevent racing with
+	 * unmount (see trylock_super()).  But note that unmount isn't the only
+	 * place where a write lock on s_umount is taken, and we can fail here
+	 * because of things like remount as well.
 	 */
-	if (gl->gl_state == LM_ST_SHARED && !gfs2_withdrawn(sdp) &&
-	    !test_bit(SDF_NORECOVERY, &sdp->sd_flags)) {
-		atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
-		error = freeze_super(sdp->sd_vfs);
-		if (error) {
-			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
-				error);
-			if (gfs2_withdrawn(sdp)) {
-				atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
-				return 0;
-			}
-			gfs2_assert_withdraw(sdp, 0);
-		}
-		queue_work(gfs2_freeze_wq, &sdp->sd_freeze_work);
-		if (test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))
-			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_FREEZE |
-				       GFS2_LFC_FREEZE_GO_SYNC);
-		else /* read-only mounts */
-			atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
+	if (down_read_trylock(&sb->s_umount)) {
+		atomic_inc(&sb->s_active);
+		up_read(&sb->s_umount);
+		if (!queue_work(gfs2_freeze_wq, &sdp->sd_freeze_work))
+			deactivate_super(sb);
 	}
-	return 0;
 }
 
 /**
@@ -761,9 +747,9 @@ const struct gfs2_glock_operations gfs2_rgrp_glops = {
 };
 
 const struct gfs2_glock_operations gfs2_freeze_glops = {
-	.go_sync = freeze_go_sync,
 	.go_xmote_bh = freeze_go_xmote_bh,
 	.go_demote_ok = freeze_go_demote_ok,
+	.go_callback = freeze_go_callback,
 	.go_type = LM_TYPE_NONDISK,
 	.go_flags = GLOF_NONDISK,
 };
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index d750d1128bed..dca535311dee 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1136,8 +1136,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 		if (flags & (GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
 			     GFS2_LOG_HEAD_FLUSH_FREEZE))
 			gfs2_log_shutdown(sdp);
-		if (flags & GFS2_LOG_HEAD_FLUSH_FREEZE)
-			atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
 	}
 
 out_end:
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 12eedba45dbb..4ce5718719ac 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1140,7 +1140,6 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 	int silent = fc->sb_flags & SB_SILENT;
 	struct gfs2_sbd *sdp;
 	struct gfs2_holder mount_gh;
-	struct gfs2_holder freeze_gh;
 	int error;
 
 	sdp = init_sbd(sb);
@@ -1269,15 +1268,15 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 		}
 	}
 
-	error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
+	error = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
 	if (error)
 		goto fail_per_node;
 
 	if (!sb_rdonly(sb))
 		error = gfs2_make_fs_rw(sdp);
 
-	gfs2_freeze_unlock(&freeze_gh);
 	if (error) {
+		gfs2_freeze_unlock(&sdp->sd_freeze_gh);
 		if (sdp->sd_quotad_process)
 			kthread_stop(sdp->sd_quotad_process);
 		sdp->sd_quotad_process = NULL;
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 61ef07da40b2..afeda936e2be 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -404,7 +404,7 @@ void gfs2_recover_func(struct work_struct *work)
 	struct gfs2_inode *ip = GFS2_I(jd->jd_inode);
 	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
 	struct gfs2_log_header_host head;
-	struct gfs2_holder j_gh, ji_gh, thaw_gh;
+	struct gfs2_holder j_gh, ji_gh;
 	ktime_t t_start, t_jlck, t_jhd, t_tlck, t_rep;
 	int ro = 0;
 	unsigned int pass;
@@ -465,14 +465,14 @@ void gfs2_recover_func(struct work_struct *work)
 		ktime_ms_delta(t_jhd, t_jlck));
 
 	if (!(head.lh_flags & GFS2_LOG_HEAD_UNMOUNT)) {
-		fs_info(sdp, "jid=%u: Acquiring the freeze glock...\n",
-			jd->jd_jid);
-
-		/* Acquire a shared hold on the freeze glock */
+		mutex_lock(&sdp->sd_freeze_mutex);
 
-		error = gfs2_freeze_lock_shared(sdp, &thaw_gh, LM_FLAG_PRIORITY);
-		if (error)
+		if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) {
+			mutex_unlock(&sdp->sd_freeze_mutex);
+			fs_warn(sdp, "jid=%u: Can't replay: filesystem "
+				"is frozen\n", jd->jd_jid);
 			goto fail_gunlock_ji;
+		}
 
 		if (test_bit(SDF_RORECOVERY, &sdp->sd_flags)) {
 			ro = 1;
@@ -496,7 +496,7 @@ void gfs2_recover_func(struct work_struct *work)
 			fs_warn(sdp, "jid=%u: Can't replay: read-only block "
 				"device\n", jd->jd_jid);
 			error = -EROFS;
-			goto fail_gunlock_thaw;
+			goto fail_gunlock_nofreeze;
 		}
 
 		t_tlck = ktime_get();
@@ -514,7 +514,7 @@ void gfs2_recover_func(struct work_struct *work)
 			lops_after_scan(jd, error, pass);
 			if (error) {
 				up_read(&sdp->sd_log_flush_lock);
-				goto fail_gunlock_thaw;
+				goto fail_gunlock_nofreeze;
 			}
 		}
 
@@ -522,7 +522,7 @@ void gfs2_recover_func(struct work_struct *work)
 		clean_journal(jd, &head);
 		up_read(&sdp->sd_log_flush_lock);
 
-		gfs2_freeze_unlock(&thaw_gh);
+		mutex_unlock(&sdp->sd_freeze_mutex);
 		t_rep = ktime_get();
 		fs_info(sdp, "jid=%u: Journal replayed in %lldms [jlck:%lldms, "
 			"jhead:%lldms, tlck:%lldms, replay:%lldms]\n",
@@ -543,8 +543,8 @@ void gfs2_recover_func(struct work_struct *work)
 	fs_info(sdp, "jid=%u: Done\n", jd->jd_jid);
 	goto done;
 
-fail_gunlock_thaw:
-	gfs2_freeze_unlock(&thaw_gh);
+fail_gunlock_nofreeze:
+	mutex_unlock(&sdp->sd_freeze_mutex);
 fail_gunlock_ji:
 	if (jlocked) {
 		gfs2_glock_dq_uninit(&ji_gh);
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 5f3e699ae922..eb41ab32695a 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -332,7 +332,12 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
 	struct lfcc *lfcc;
 	LIST_HEAD(list);
 	struct gfs2_log_header_host lh;
-	int error;
+	int error, error2;
+
+	/*
+	 * Grab all the journal glocks in SH mode.  We are *probably* doing
+	 * that to prevent recovery.
+	 */
 
 	list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
 		lfcc = kmalloc(sizeof(struct lfcc), GFP_KERNEL);
@@ -349,11 +354,13 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
 		list_add(&lfcc->list, &list);
 	}
 
+	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
+
 	error = gfs2_glock_nq_init(sdp->sd_freeze_gl, LM_ST_EXCLUSIVE,
 				   LM_FLAG_NOEXP | GL_NOPID,
 				   &sdp->sd_freeze_gh);
 	if (error)
-		goto out;
+		goto relock_shared;
 
 	list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
 		error = gfs2_jdesc_check(jd);
@@ -368,8 +375,14 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
 		}
 	}
 
-	if (error)
-		gfs2_freeze_unlock(&sdp->sd_freeze_gh);
+	if (!error)
+		goto out;  /* success */
+
+	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
+
+relock_shared:
+	error2 = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
+	gfs2_assert_withdraw(sdp, !error2);
 
 out:
 	while (!list_empty(&list)) {
@@ -615,6 +628,8 @@ static void gfs2_put_super(struct super_block *sb)
 
 	/*  Release stuff  */
 
+	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
+
 	iput(sdp->sd_jindex);
 	iput(sdp->sd_statfs_inode);
 	iput(sdp->sd_rindex);
@@ -669,31 +684,81 @@ static int gfs2_sync_fs(struct super_block *sb, int wait)
 	return sdp->sd_log_error;
 }
 
-void gfs2_freeze_func(struct work_struct *work)
+static int gfs2_freeze_locally(struct gfs2_sbd *sdp)
 {
-	int error;
-	struct gfs2_holder freeze_gh;
-	struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
 	struct super_block *sb = sdp->sd_vfs;
+	int error;
 
-	atomic_inc(&sb->s_active);
-	error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
-	if (error) {
-		gfs2_assert_withdraw(sdp, 0);
-	} else {
-		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
-		error = thaw_super(sb);
-		if (error) {
-			fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n",
-				error);
-			gfs2_assert_withdraw(sdp, 0);
+	atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
+
+	error = freeze_super(sb);
+	if (error)
+		goto fail;
+
+	if (test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags)) {
+		gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_FREEZE |
+			       GFS2_LFC_FREEZE_GO_SYNC);
+		if (gfs2_withdrawn(sdp)) {
+			thaw_super(sb);
+			error = -EIO;
+			goto fail;
 		}
-		gfs2_freeze_unlock(&freeze_gh);
 	}
+	return 0;
+
+fail:
+	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
+	return error;
+}
+
+static int gfs2_do_thaw(struct gfs2_sbd *sdp)
+{
+	struct super_block *sb = sdp->sd_vfs;
+	int error;
+
+	error = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
+	if (error)
+		goto fail;
+	error = thaw_super(sb);
+	if (!error)
+		return 0;
+
+fail:
+	fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n", error);
+	gfs2_assert_withdraw(sdp, 0);
+	return error;
+}
+
+void gfs2_freeze_func(struct work_struct *work)
+{
+	struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
+	struct super_block *sb = sdp->sd_vfs;
+	int error;
+
+	mutex_lock(&sdp->sd_freeze_mutex);
+	error = -EBUSY;
+	if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN)
+		goto out_unlock;
+
+	error = gfs2_freeze_locally(sdp);
+	if (error)
+		goto out_unlock;
+
+	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
+	atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
+
+	error = gfs2_do_thaw(sdp);
+	if (error)
+		goto out;
+
+	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
+
+out_unlock:
+	mutex_unlock(&sdp->sd_freeze_mutex);
 	deactivate_super(sb);
-	clear_bit_unlock(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
-	wake_up_bit(&sdp->sd_flags, SDF_FREEZE_INITIATOR);
-	return;
+out:
+	if (error)
+		fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n", error);
 }
 
 /**
@@ -707,21 +772,27 @@ static int gfs2_freeze_super(struct super_block *sb)
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	int error;
 
-	mutex_lock(&sdp->sd_freeze_mutex);
-	if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) {
-		error = -EBUSY;
+	if (!mutex_trylock(&sdp->sd_freeze_mutex))
+		return -EBUSY;
+	error = -EBUSY;
+	if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN)
 		goto out;
-	}
 
 	for (;;) {
-		if (gfs2_withdrawn(sdp)) {
-			error = -EINVAL;
+		error = gfs2_freeze_locally(sdp);
+		if (error) {
+			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
+				error);
 			goto out;
 		}
 
 		error = gfs2_lock_fs_check_clean(sdp);
 		if (!error)
-			break;
+			break;  /* success */
+
+		error = gfs2_do_thaw(sdp);
+		if (error)
+			goto out;
 
 		if (error == -EBUSY)
 			fs_err(sdp, "waiting for recovery before freeze\n");
@@ -735,8 +806,12 @@ static int gfs2_freeze_super(struct super_block *sb)
 		fs_err(sdp, "retrying...\n");
 		msleep(1000);
 	}
-	set_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
+
 out:
+	if (!error) {
+		set_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
+		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
+	}
 	mutex_unlock(&sdp->sd_freeze_mutex);
 	return error;
 }
@@ -750,17 +825,47 @@ static int gfs2_freeze_super(struct super_block *sb)
 static int gfs2_thaw_super(struct super_block *sb)
 {
 	struct gfs2_sbd *sdp = sb->s_fs_info;
+	int error;
 
-	mutex_lock(&sdp->sd_freeze_mutex);
-	if (atomic_read(&sdp->sd_freeze_state) != SFS_FROZEN ||
-	    !gfs2_holder_initialized(&sdp->sd_freeze_gh)) {
-		mutex_unlock(&sdp->sd_freeze_mutex);
-		return -EINVAL;
+	if (!mutex_trylock(&sdp->sd_freeze_mutex))
+		return -EBUSY;
+	error = -EINVAL;
+	if (!test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags))
+		goto out;
+
+	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
+
+	error = gfs2_do_thaw(sdp);
+
+	if (!error) {
+		clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
+		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
 	}
+out:
+	mutex_unlock(&sdp->sd_freeze_mutex);
+	return error;
+}
+
+void gfs2_thaw_freeze_initiator(struct super_block *sb)
+{
+	struct gfs2_sbd *sdp = sb->s_fs_info;
+
+	mutex_lock(&sdp->sd_freeze_mutex);
+	if (!test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags))
+		goto out;
 
 	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
+#if 0
+	int error;
+	error = thaw_super(sb);
+	if (error)
+		fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n", error);
+	clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
+	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
+#endif
+
+out:
 	mutex_unlock(&sdp->sd_freeze_mutex);
-	return wait_on_bit(&sdp->sd_flags, SDF_FREEZE_INITIATOR, TASK_INTERRUPTIBLE);
 }
 
 /**
diff --git a/fs/gfs2/super.h b/fs/gfs2/super.h
index 58d13fd77aed..bba58629bc45 100644
--- a/fs/gfs2/super.h
+++ b/fs/gfs2/super.h
@@ -46,6 +46,7 @@ extern void gfs2_statfs_change_out(const struct gfs2_statfs_change_host *sc,
 extern void update_statfs(struct gfs2_sbd *sdp, struct buffer_head *m_bh);
 extern int gfs2_statfs_sync(struct super_block *sb, int type);
 extern void gfs2_freeze_func(struct work_struct *work);
+extern void gfs2_thaw_freeze_initiator(struct super_block *sb);
 
 extern void free_local_statfs_inodes(struct gfs2_sbd *sdp);
 extern struct inode *find_local_statfs_inode(struct gfs2_sbd *sdp,
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 1743caee5505..00494dffb47c 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -124,7 +124,6 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 	struct gfs2_inode *ip;
 	struct gfs2_glock *i_gl;
 	u64 no_formal_ino;
-	int log_write_allowed = test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
 	int ret = 0;
 	int tries;
 
@@ -152,24 +151,18 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 	 */
 	clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
 	if (!sb_rdonly(sdp->sd_vfs)) {
-		struct gfs2_holder freeze_gh;
-
-		gfs2_holder_mark_uninitialized(&freeze_gh);
-		if (sdp->sd_freeze_gl &&
-		    !gfs2_glock_is_locked_by_me(sdp->sd_freeze_gl)) {
-			ret = gfs2_freeze_lock_shared(sdp, &freeze_gh,
-					log_write_allowed ? 0 : LM_FLAG_TRY);
-			if (ret == GLR_TRYFAILED)
-				ret = 0;
-		}
-		if (!ret)
-			gfs2_make_fs_ro(sdp);
+		bool locked = mutex_trylock(&sdp->sd_freeze_mutex);
+
+		gfs2_make_fs_ro(sdp);
+
+		if (locked)
+			mutex_unlock(&sdp->sd_freeze_mutex);
+
 		/*
 		 * Dequeue any pending non-system glock holders that can no
 		 * longer be granted because the file system is withdrawn.
 		 */
 		gfs2_gl_dq_holders(sdp);
-		gfs2_freeze_unlock(&freeze_gh);
 	}
 
 	if (sdp->sd_lockstruct.ls_ops->lm_lock == NULL) { /* lock_nolock */
@@ -187,15 +180,8 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 	}
 	sdp->sd_jinode_gh.gh_flags |= GL_NOCACHE;
 	gfs2_glock_dq(&sdp->sd_jinode_gh);
-	if (test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags)) {
-		/* Make sure gfs2_thaw_super works if partially-frozen */
-		flush_work(&sdp->sd_freeze_work);
-		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
-		thaw_super(sdp->sd_vfs);
-	} else {
-		wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE,
-			    TASK_UNINTERRUPTIBLE);
-	}
+	gfs2_thaw_freeze_initiator(sdp->sd_vfs);
+	wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE, TASK_UNINTERRUPTIBLE);
 
 	/*
 	 * holder_uninit to force glock_put, to force dlm to let go
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 7/8] gfs2: Replace sd_freeze_state with SDF_FROZEN flag
  2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
                   ` (5 preceding siblings ...)
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
@ 2023-06-12 16:33 ` Andreas Gruenbacher
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 8/8] gfs2: gfs2_freeze_lock_shared cleanup Andreas Gruenbacher
  7 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Replace sd_freeze_state with a new SDF_FROZEN flag.

There no longer is a need for indicating that a freeze is in progress
(SDF_STARTING_FREEZE); we are now protecting the critical sections with
the sd_freeze_mutex.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/incore.h     |  8 +-------
 fs/gfs2/log.c        |  9 ++++-----
 fs/gfs2/ops_fstype.c |  1 -
 fs/gfs2/recovery.c   |  2 +-
 fs/gfs2/super.c      | 25 +++++++++----------------
 fs/gfs2/sys.c        |  2 ++
 fs/gfs2/trans.c      |  3 +--
 7 files changed, 18 insertions(+), 32 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 1f10529ebcf5..8b1b50d19de1 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -608,12 +608,7 @@ enum {
 					 withdrawing */
 	SDF_DEACTIVATING	= 15,
 	SDF_EVICTING		= 16,
-};
-
-enum gfs2_freeze_state {
-	SFS_UNFROZEN		= 0,
-	SFS_STARTING_FREEZE	= 1,
-	SFS_FROZEN		= 2,
+	SDF_FROZEN		= 17,
 };
 
 #define GFS2_FSNAME_LEN		256
@@ -841,7 +836,6 @@ struct gfs2_sbd {
 
 	/* For quiescing the filesystem */
 	struct gfs2_holder sd_freeze_gh;
-	atomic_t sd_freeze_state;
 	struct mutex sd_freeze_mutex;
 
 	char sd_fsname[GFS2_FSNAME_LEN + 3 * sizeof(int) + 2];
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index dca535311dee..aa568796207c 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -914,9 +914,8 @@ void gfs2_write_log_header(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 static void log_write_header(struct gfs2_sbd *sdp, u32 flags)
 {
 	blk_opf_t op_flags = REQ_PREFLUSH | REQ_FUA | REQ_META | REQ_SYNC;
-	enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state);
 
-	gfs2_assert_withdraw(sdp, (state != SFS_FROZEN));
+	gfs2_assert_withdraw(sdp, !test_bit(SDF_FROZEN, &sdp->sd_flags));
 
 	if (test_bit(SDF_NOBARRIERS, &sdp->sd_flags)) {
 		gfs2_ordered_wait(sdp);
@@ -1036,7 +1035,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 {
 	struct gfs2_trans *tr = NULL;
 	unsigned int reserved_blocks = 0, used_blocks = 0;
-	enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state);
+	bool frozen = test_bit(SDF_FROZEN, &sdp->sd_flags);
 	unsigned int first_log_head;
 	unsigned int reserved_revokes = 0;
 
@@ -1067,7 +1066,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 		if (tr) {
 			sdp->sd_log_tr = NULL;
 			tr->tr_first = first_log_head;
-			if (unlikely (state == SFS_FROZEN)) {
+			if (unlikely(frozen)) {
 				if (gfs2_assert_withdraw_delayed(sdp,
 				       !tr->tr_num_buf_new && !tr->tr_num_databuf_new))
 					goto out_withdraw;
@@ -1092,7 +1091,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	if (flags & GFS2_LOG_HEAD_FLUSH_SHUTDOWN)
 		clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
 
-	if (unlikely(state == SFS_FROZEN))
+	if (unlikely(frozen))
 		if (gfs2_assert_withdraw_delayed(sdp, !reserved_revokes))
 			goto out_withdraw;
 
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 4ce5718719ac..24acd17e530c 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -135,7 +135,6 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
 	init_rwsem(&sdp->sd_log_flush_lock);
 	atomic_set(&sdp->sd_log_in_flight, 0);
 	init_waitqueue_head(&sdp->sd_log_flush_wait);
-	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
 	mutex_init(&sdp->sd_freeze_mutex);
 
 	return sdp;
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index afeda936e2be..9c7a9f640bad 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -467,7 +467,7 @@ void gfs2_recover_func(struct work_struct *work)
 	if (!(head.lh_flags & GFS2_LOG_HEAD_UNMOUNT)) {
 		mutex_lock(&sdp->sd_freeze_mutex);
 
-		if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) {
+		if (test_bit(SDF_FROZEN, &sdp->sd_flags)) {
 			mutex_unlock(&sdp->sd_freeze_mutex);
 			fs_warn(sdp, "jid=%u: Can't replay: filesystem "
 				"is frozen\n", jd->jd_jid);
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index eb41ab32695a..828297e9cbca 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -689,26 +689,19 @@ static int gfs2_freeze_locally(struct gfs2_sbd *sdp)
 	struct super_block *sb = sdp->sd_vfs;
 	int error;
 
-	atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
-
 	error = freeze_super(sb);
 	if (error)
-		goto fail;
+		return error;
 
 	if (test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags)) {
 		gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_FREEZE |
 			       GFS2_LFC_FREEZE_GO_SYNC);
 		if (gfs2_withdrawn(sdp)) {
 			thaw_super(sb);
-			error = -EIO;
-			goto fail;
+			return -EIO;
 		}
 	}
 	return 0;
-
-fail:
-	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
-	return error;
 }
 
 static int gfs2_do_thaw(struct gfs2_sbd *sdp)
@@ -737,7 +730,7 @@ void gfs2_freeze_func(struct work_struct *work)
 
 	mutex_lock(&sdp->sd_freeze_mutex);
 	error = -EBUSY;
-	if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN)
+	if (test_bit(SDF_FROZEN, &sdp->sd_flags))
 		goto out_unlock;
 
 	error = gfs2_freeze_locally(sdp);
@@ -745,13 +738,13 @@ void gfs2_freeze_func(struct work_struct *work)
 		goto out_unlock;
 
 	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
-	atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
+	set_bit(SDF_FROZEN, &sdp->sd_flags);
 
 	error = gfs2_do_thaw(sdp);
 	if (error)
 		goto out;
 
-	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
+	clear_bit(SDF_FROZEN, &sdp->sd_flags);
 
 out_unlock:
 	mutex_unlock(&sdp->sd_freeze_mutex);
@@ -775,7 +768,7 @@ static int gfs2_freeze_super(struct super_block *sb)
 	if (!mutex_trylock(&sdp->sd_freeze_mutex))
 		return -EBUSY;
 	error = -EBUSY;
-	if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN)
+	if (test_bit(SDF_FROZEN, &sdp->sd_flags))
 		goto out;
 
 	for (;;) {
@@ -810,7 +803,7 @@ static int gfs2_freeze_super(struct super_block *sb)
 out:
 	if (!error) {
 		set_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
-		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
+		set_bit(SDF_FROZEN, &sdp->sd_flags);
 	}
 	mutex_unlock(&sdp->sd_freeze_mutex);
 	return error;
@@ -839,7 +832,7 @@ static int gfs2_thaw_super(struct super_block *sb)
 
 	if (!error) {
 		clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
-		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
+		clear_bit(SDF_FROZEN, &sdp->sd_flags);
 	}
 out:
 	mutex_unlock(&sdp->sd_freeze_mutex);
@@ -861,7 +854,7 @@ void gfs2_thaw_freeze_initiator(struct super_block *sb)
 	if (error)
 		fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n", error);
 	clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
-	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
+	clear_bit(SDF_FROZEN, &sdp->sd_flags);
 #endif
 
 out:
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index bc752de01573..2dfbe2f188dd 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -82,6 +82,7 @@ static ssize_t status_show(struct gfs2_sbd *sdp, char *buf)
 		     "RO Recovery:              %d\n"
 		     "Skip DLM Unlock:          %d\n"
 		     "Force AIL Flush:          %d\n"
+		     "FS Freeze Initiator:      %d\n"
 		     "FS Frozen:                %d\n"
 		     "Withdrawing:              %d\n"
 		     "Withdraw In Prog:         %d\n"
@@ -112,6 +113,7 @@ static ssize_t status_show(struct gfs2_sbd *sdp, char *buf)
 		     test_bit(SDF_SKIP_DLM_UNLOCK, &f),
 		     test_bit(SDF_FORCE_AIL_FLUSH, &f),
 		     test_bit(SDF_FREEZE_INITIATOR, &f),
+		     test_bit(SDF_FROZEN, &f),
 		     test_bit(SDF_WITHDRAWING, &f),
 		     test_bit(SDF_WITHDRAW_IN_PROG, &f),
 		     test_bit(SDF_REMOTE_WITHDRAW, &f),
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 63fec11ef2ce..ec1631257978 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -233,7 +233,6 @@ void gfs2_trans_add_meta(struct gfs2_glock *gl, struct buffer_head *bh)
 	struct gfs2_bufdata *bd;
 	struct gfs2_meta_header *mh;
 	struct gfs2_trans *tr = current->journal_info;
-	enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state);
 
 	lock_buffer(bh);
 	if (buffer_pinned(bh)) {
@@ -267,7 +266,7 @@ void gfs2_trans_add_meta(struct gfs2_glock *gl, struct buffer_head *bh)
 		       (unsigned long long)bd->bd_bh->b_blocknr);
 		BUG();
 	}
-	if (unlikely(state == SFS_FROZEN)) {
+	if (unlikely(test_bit(SDF_FROZEN, &sdp->sd_flags))) {
 		fs_info(sdp, "GFS2:adding buf while frozen\n");
 		gfs2_assert_withdraw(sdp, 0);
 	}
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 8/8] gfs2: gfs2_freeze_lock_shared cleanup
  2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
                   ` (6 preceding siblings ...)
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 7/8] gfs2: Replace sd_freeze_state with SDF_FROZEN flag Andreas Gruenbacher
@ 2023-06-12 16:33 ` Andreas Gruenbacher
  7 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-12 16:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

All the remaining users of gfs2_freeze_lock_shared() set freeze_gh to
&sdp->sd_freeze_gh and flags to 0, so remove those two parameters.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/ops_fstype.c | 2 +-
 fs/gfs2/super.c      | 4 ++--
 fs/gfs2/util.c       | 9 +++------
 fs/gfs2/util.h       | 4 +---
 4 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 24acd17e530c..9375409fd0c5 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1267,7 +1267,7 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 		}
 	}
 
-	error = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
+	error = gfs2_freeze_lock_shared(sdp);
 	if (error)
 		goto fail_per_node;
 
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 828297e9cbca..f2fa0fb80ee1 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -381,7 +381,7 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
 	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
 
 relock_shared:
-	error2 = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
+	error2 = gfs2_freeze_lock_shared(sdp);
 	gfs2_assert_withdraw(sdp, !error2);
 
 out:
@@ -709,7 +709,7 @@ static int gfs2_do_thaw(struct gfs2_sbd *sdp)
 	struct super_block *sb = sdp->sd_vfs;
 	int error;
 
-	error = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
+	error = gfs2_freeze_lock_shared(sdp);
 	if (error)
 		goto fail;
 	error = thaw_super(sb);
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 00494dffb47c..b9db034c7f58 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -95,17 +95,14 @@ int check_journal_clean(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 /**
  * gfs2_freeze_lock_shared - hold the freeze glock
  * @sdp: the superblock
- * @freeze_gh: pointer to the requested holder
- * @caller_flags: any additional flags needed by the caller
  */
-int gfs2_freeze_lock_shared(struct gfs2_sbd *sdp, struct gfs2_holder *freeze_gh,
-			    int caller_flags)
+int gfs2_freeze_lock_shared(struct gfs2_sbd *sdp)
 {
-	int flags = LM_FLAG_NOEXP | GL_EXACT | caller_flags;
+	int flags = LM_FLAG_NOEXP | GL_EXACT;
 	int error;
 
 	error = gfs2_glock_nq_init(sdp->sd_freeze_gl, LM_ST_SHARED, flags,
-				   freeze_gh);
+				   &sdp->sd_freeze_gh);
 	if (error && error != GLR_TRYFAILED)
 		fs_err(sdp, "can't lock the freeze glock: %d\n", error);
 	return error;
diff --git a/fs/gfs2/util.h b/fs/gfs2/util.h
index 3291e33e81e9..cdb839529175 100644
--- a/fs/gfs2/util.h
+++ b/fs/gfs2/util.h
@@ -149,9 +149,7 @@ int gfs2_io_error_i(struct gfs2_sbd *sdp, const char *function,
 
 extern int check_journal_clean(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd,
 			       bool verbose);
-extern int gfs2_freeze_lock_shared(struct gfs2_sbd *sdp,
-				   struct gfs2_holder *freeze_gh,
-				   int caller_flags);
+extern int gfs2_freeze_lock_shared(struct gfs2_sbd *sdp);
 extern void gfs2_freeze_unlock(struct gfs2_holder *freeze_gh);
 
 #define gfs2_io_error(sdp) \
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
@ 2023-06-13 10:28   ` Andrew Price
  2023-06-13 12:49     ` Andreas Gruenbacher
  2023-06-13 13:05   ` Alexander Aring
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Price @ 2023-06-13 10:28 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On 12/06/2023 17:33, Andreas Gruenbacher wrote:
> So far, at mount time, gfs2 would take the freeze glock in shared mode
> and then immediately drop it again, turning it into a cached glock that
> can be reclaimed at any time.  To freeze the filesystem cluster-wide,
> the node initiating the freeze would take the freeze glock in exclusive
> mode, which would cause the freeze glock's freeze_go_sync() callback to
> run on each node.  There, gfs2 would freeze the filesystem and schedule
> gfs2_freeze_func() to run.  gfs2_freeze_func() would re-acquire the
> freeze glock in shared mode, thaw the filesystem, and drop the freeze
> glock again.  The initiating node would keep the freeze glock held in
> exclusive mode.  To thaw the filesystem, the initiating node would drop
> the freeze glock again, which would allow gfs2_freeze_func() to resume
> on all nodes, leaving the filesystem in the thawed state.
> 
> It turns out that in freeze_go_sync(), we cannot reliably and safely
> freeze the filesystem.  This is primarily because the final unmount of a
> filesystem takes a write lock on the s_umount rw semaphore before
> calling into gfs2_put_super(), and freeze_go_sync() needs to call
> freeze_super() which also takes a write lock on the same semaphore,
> causing a deadlock.  We could work around this by trying to take an
> active reference on the super block first, which would prevent unmount
> from running at the same time.  But that can fail, and freeze_go_sync()
> isn't actually allowed to fail.
> 
> To get around this, this patch changes the freeze glock locking scheme
> as follows:
> 
> At mount time, each node takes the freeze glock in shared mode.  To
> freeze a filesystem, the initiating node first freezes the filesystem
> locally and then drops and re-acquires the freeze glock in exclusive
> mode.  All other nodes notice that there is contention on the freeze
> glock in their go_callback callbacks, and they schedule
> gfs2_freeze_func() to run.  There, they freeze the filesystem locally
> and drop and re-acquire the freeze glock before re-thawing the
> filesystem.  This is happening outside of the glock state engine, so
> there, we are allowed to fail.
> 
>  From a cluster point of view, taking and immediately dropping a glock is
> indistinguishable from taking the glock and only dropping it upon
> contention, so this new scheme is compatible with the old one.

Nice!

> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> ---
>   fs/gfs2/glops.c      |  52 +++++--------
>   fs/gfs2/log.c        |   2 -
>   fs/gfs2/ops_fstype.c |   5 +-
>   fs/gfs2/recovery.c   |  24 +++---
>   fs/gfs2/super.c      | 179 ++++++++++++++++++++++++++++++++++---------
>   fs/gfs2/super.h      |   1 +
>   fs/gfs2/util.c       |  32 +++-----
>   7 files changed, 185 insertions(+), 110 deletions(-)
> 
> diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
> index 01d433ed6ce7..7c48c7afd6a4 100644
> --- a/fs/gfs2/glops.c
> +++ b/fs/gfs2/glops.c
> @@ -561,47 +561,33 @@ static void inode_go_dump(struct seq_file *seq, struct gfs2_glock *gl,
>   }
>   
>   /**
> - * freeze_go_sync - promote/demote the freeze glock
> + * freeze_go_callback - A cluster node is requesting a freeze
>    * @gl: the glock
> + * @remote: true if this came from a different cluster node
>    */
>   
> -static int freeze_go_sync(struct gfs2_glock *gl)
> +static void freeze_go_callback(struct gfs2_glock *gl, bool remote)
>   {
> -	int error = 0;
>   	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
> +	struct super_block *sb = sdp->sd_vfs;
> +
> +	if (!remote ||
> +	    gl->gl_state != LM_ST_SHARED ||
> +	    gl->gl_demote_state != LM_ST_UNLOCKED)
> +		return;
>   
>   	/*
> -	 * We need to check gl_state == LM_ST_SHARED here and not gl_req ==
> -	 * LM_ST_EXCLUSIVE. That's because when any node does a freeze,
> -	 * all the nodes should have the freeze glock in SH mode and they all
> -	 * call do_xmote: One for EX and the others for UN. They ALL must
> -	 * freeze locally, and they ALL must queue freeze work. The freeze_work
> -	 * calls freeze_func, which tries to reacquire the freeze glock in SH,
> -	 * effectively waiting for the thaw on the node who holds it in EX.
> -	 * Once thawed, the work func acquires the freeze glock in
> -	 * SH and everybody goes back to thawed.
> +	 * Try to get an active super block reference to prevent racing with
> +	 * unmount (see trylock_super()).  But note that unmount isn't the only
> +	 * place where a write lock on s_umount is taken, and we can fail here
> +	 * because of things like remount as well.
>   	 */
> -	if (gl->gl_state == LM_ST_SHARED && !gfs2_withdrawn(sdp) &&
> -	    !test_bit(SDF_NORECOVERY, &sdp->sd_flags)) {
> -		atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
> -		error = freeze_super(sdp->sd_vfs);
> -		if (error) {
> -			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
> -				error);
> -			if (gfs2_withdrawn(sdp)) {
> -				atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> -				return 0;
> -			}
> -			gfs2_assert_withdraw(sdp, 0);
> -		}
> -		queue_work(gfs2_freeze_wq, &sdp->sd_freeze_work);
> -		if (test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))
> -			gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_FREEZE |
> -				       GFS2_LFC_FREEZE_GO_SYNC);
> -		else /* read-only mounts */
> -			atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> +	if (down_read_trylock(&sb->s_umount)) {
> +		atomic_inc(&sb->s_active);
> +		up_read(&sb->s_umount);
> +		if (!queue_work(gfs2_freeze_wq, &sdp->sd_freeze_work))
> +			deactivate_super(sb);
>   	}
> -	return 0;
>   }
>   
>   /**
> @@ -761,9 +747,9 @@ const struct gfs2_glock_operations gfs2_rgrp_glops = {
>   };
>   
>   const struct gfs2_glock_operations gfs2_freeze_glops = {
> -	.go_sync = freeze_go_sync,
>   	.go_xmote_bh = freeze_go_xmote_bh,
>   	.go_demote_ok = freeze_go_demote_ok,
> +	.go_callback = freeze_go_callback,
>   	.go_type = LM_TYPE_NONDISK,
>   	.go_flags = GLOF_NONDISK,
>   };
> diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
> index d750d1128bed..dca535311dee 100644
> --- a/fs/gfs2/log.c
> +++ b/fs/gfs2/log.c
> @@ -1136,8 +1136,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
>   		if (flags & (GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
>   			     GFS2_LOG_HEAD_FLUSH_FREEZE))
>   			gfs2_log_shutdown(sdp);
> -		if (flags & GFS2_LOG_HEAD_FLUSH_FREEZE)
> -			atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
>   	}
>   
>   out_end:
> diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
> index 12eedba45dbb..4ce5718719ac 100644
> --- a/fs/gfs2/ops_fstype.c
> +++ b/fs/gfs2/ops_fstype.c
> @@ -1140,7 +1140,6 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
>   	int silent = fc->sb_flags & SB_SILENT;
>   	struct gfs2_sbd *sdp;
>   	struct gfs2_holder mount_gh;
> -	struct gfs2_holder freeze_gh;
>   	int error;
>   
>   	sdp = init_sbd(sb);
> @@ -1269,15 +1268,15 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
>   		}
>   	}
>   
> -	error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
> +	error = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
>   	if (error)
>   		goto fail_per_node;
>   
>   	if (!sb_rdonly(sb))
>   		error = gfs2_make_fs_rw(sdp);
>   
> -	gfs2_freeze_unlock(&freeze_gh);
>   	if (error) {
> +		gfs2_freeze_unlock(&sdp->sd_freeze_gh);
>   		if (sdp->sd_quotad_process)
>   			kthread_stop(sdp->sd_quotad_process);
>   		sdp->sd_quotad_process = NULL;
> diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
> index 61ef07da40b2..afeda936e2be 100644
> --- a/fs/gfs2/recovery.c
> +++ b/fs/gfs2/recovery.c
> @@ -404,7 +404,7 @@ void gfs2_recover_func(struct work_struct *work)
>   	struct gfs2_inode *ip = GFS2_I(jd->jd_inode);
>   	struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
>   	struct gfs2_log_header_host head;
> -	struct gfs2_holder j_gh, ji_gh, thaw_gh;
> +	struct gfs2_holder j_gh, ji_gh;
>   	ktime_t t_start, t_jlck, t_jhd, t_tlck, t_rep;
>   	int ro = 0;
>   	unsigned int pass;
> @@ -465,14 +465,14 @@ void gfs2_recover_func(struct work_struct *work)
>   		ktime_ms_delta(t_jhd, t_jlck));
>   
>   	if (!(head.lh_flags & GFS2_LOG_HEAD_UNMOUNT)) {
> -		fs_info(sdp, "jid=%u: Acquiring the freeze glock...\n",
> -			jd->jd_jid);
> -
> -		/* Acquire a shared hold on the freeze glock */
> +		mutex_lock(&sdp->sd_freeze_mutex);
>   
> -		error = gfs2_freeze_lock_shared(sdp, &thaw_gh, LM_FLAG_PRIORITY);
> -		if (error)
> +		if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) {
> +			mutex_unlock(&sdp->sd_freeze_mutex);
> +			fs_warn(sdp, "jid=%u: Can't replay: filesystem "
> +				"is frozen\n", jd->jd_jid);
>   			goto fail_gunlock_ji;
> +		}
>   
>   		if (test_bit(SDF_RORECOVERY, &sdp->sd_flags)) {
>   			ro = 1;
> @@ -496,7 +496,7 @@ void gfs2_recover_func(struct work_struct *work)
>   			fs_warn(sdp, "jid=%u: Can't replay: read-only block "
>   				"device\n", jd->jd_jid);
>   			error = -EROFS;
> -			goto fail_gunlock_thaw;
> +			goto fail_gunlock_nofreeze;
>   		}
>   
>   		t_tlck = ktime_get();
> @@ -514,7 +514,7 @@ void gfs2_recover_func(struct work_struct *work)
>   			lops_after_scan(jd, error, pass);
>   			if (error) {
>   				up_read(&sdp->sd_log_flush_lock);
> -				goto fail_gunlock_thaw;
> +				goto fail_gunlock_nofreeze;
>   			}
>   		}
>   
> @@ -522,7 +522,7 @@ void gfs2_recover_func(struct work_struct *work)
>   		clean_journal(jd, &head);
>   		up_read(&sdp->sd_log_flush_lock);
>   
> -		gfs2_freeze_unlock(&thaw_gh);
> +		mutex_unlock(&sdp->sd_freeze_mutex);
>   		t_rep = ktime_get();
>   		fs_info(sdp, "jid=%u: Journal replayed in %lldms [jlck:%lldms, "
>   			"jhead:%lldms, tlck:%lldms, replay:%lldms]\n",
> @@ -543,8 +543,8 @@ void gfs2_recover_func(struct work_struct *work)
>   	fs_info(sdp, "jid=%u: Done\n", jd->jd_jid);
>   	goto done;
>   
> -fail_gunlock_thaw:
> -	gfs2_freeze_unlock(&thaw_gh);
> +fail_gunlock_nofreeze:
> +	mutex_unlock(&sdp->sd_freeze_mutex);
>   fail_gunlock_ji:
>   	if (jlocked) {
>   		gfs2_glock_dq_uninit(&ji_gh);
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 5f3e699ae922..eb41ab32695a 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -332,7 +332,12 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
>   	struct lfcc *lfcc;
>   	LIST_HEAD(list);
>   	struct gfs2_log_header_host lh;
> -	int error;
> +	int error, error2;
> +
> +	/*
> +	 * Grab all the journal glocks in SH mode.  We are *probably* doing
> +	 * that to prevent recovery.
> +	 */
>   
>   	list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
>   		lfcc = kmalloc(sizeof(struct lfcc), GFP_KERNEL);
> @@ -349,11 +354,13 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
>   		list_add(&lfcc->list, &list);
>   	}
>   
> +	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> +
>   	error = gfs2_glock_nq_init(sdp->sd_freeze_gl, LM_ST_EXCLUSIVE,
>   				   LM_FLAG_NOEXP | GL_NOPID,
>   				   &sdp->sd_freeze_gh);
>   	if (error)
> -		goto out;
> +		goto relock_shared;
>   
>   	list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
>   		error = gfs2_jdesc_check(jd);
> @@ -368,8 +375,14 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
>   		}
>   	}
>   
> -	if (error)
> -		gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> +	if (!error)
> +		goto out;  /* success */
> +
> +	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> +
> +relock_shared:
> +	error2 = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
> +	gfs2_assert_withdraw(sdp, !error2);
>   
>   out:
>   	while (!list_empty(&list)) {
> @@ -615,6 +628,8 @@ static void gfs2_put_super(struct super_block *sb)
>   
>   	/*  Release stuff  */
>   
> +	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> +
>   	iput(sdp->sd_jindex);
>   	iput(sdp->sd_statfs_inode);
>   	iput(sdp->sd_rindex);
> @@ -669,31 +684,81 @@ static int gfs2_sync_fs(struct super_block *sb, int wait)
>   	return sdp->sd_log_error;
>   }
>   
> -void gfs2_freeze_func(struct work_struct *work)
> +static int gfs2_freeze_locally(struct gfs2_sbd *sdp)
>   {
> -	int error;
> -	struct gfs2_holder freeze_gh;
> -	struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
>   	struct super_block *sb = sdp->sd_vfs;
> +	int error;
>   
> -	atomic_inc(&sb->s_active);
> -	error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
> -	if (error) {
> -		gfs2_assert_withdraw(sdp, 0);
> -	} else {
> -		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> -		error = thaw_super(sb);
> -		if (error) {
> -			fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n",
> -				error);
> -			gfs2_assert_withdraw(sdp, 0);
> +	atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
> +
> +	error = freeze_super(sb);
> +	if (error)
> +		goto fail;
> +
> +	if (test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags)) {
> +		gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_FREEZE |
> +			       GFS2_LFC_FREEZE_GO_SYNC);
> +		if (gfs2_withdrawn(sdp)) {
> +			thaw_super(sb);
> +			error = -EIO;
> +			goto fail;
>   		}
> -		gfs2_freeze_unlock(&freeze_gh);
>   	}
> +	return 0;
> +
> +fail:
> +	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> +	return error;
> +}
> +
> +static int gfs2_do_thaw(struct gfs2_sbd *sdp)
> +{
> +	struct super_block *sb = sdp->sd_vfs;
> +	int error;
> +
> +	error = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
> +	if (error)
> +		goto fail;
> +	error = thaw_super(sb);
> +	if (!error)
> +		return 0;
> +
> +fail:
> +	fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n", error);
> +	gfs2_assert_withdraw(sdp, 0);
> +	return error;
> +}
> +
> +void gfs2_freeze_func(struct work_struct *work)
> +{
> +	struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
> +	struct super_block *sb = sdp->sd_vfs;
> +	int error;
> +
> +	mutex_lock(&sdp->sd_freeze_mutex);
> +	error = -EBUSY;
> +	if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN)
> +		goto out_unlock;
> +
> +	error = gfs2_freeze_locally(sdp);
> +	if (error)
> +		goto out_unlock;
> +
> +	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> +	atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> +
> +	error = gfs2_do_thaw(sdp);
> +	if (error)
> +		goto out;
> +
> +	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> +
> +out_unlock:
> +	mutex_unlock(&sdp->sd_freeze_mutex);
>   	deactivate_super(sb);
> -	clear_bit_unlock(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> -	wake_up_bit(&sdp->sd_flags, SDF_FREEZE_INITIATOR);
> -	return;
> +out:
> +	if (error)
> +		fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n", error);
>   }
>   
>   /**
> @@ -707,21 +772,27 @@ static int gfs2_freeze_super(struct super_block *sb)
>   	struct gfs2_sbd *sdp = sb->s_fs_info;
>   	int error;
>   
> -	mutex_lock(&sdp->sd_freeze_mutex);
> -	if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) {
> -		error = -EBUSY;
> +	if (!mutex_trylock(&sdp->sd_freeze_mutex))
> +		return -EBUSY;
> +	error = -EBUSY;
> +	if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN)
>   		goto out;
> -	}
>   
>   	for (;;) {
> -		if (gfs2_withdrawn(sdp)) {
> -			error = -EINVAL;
> +		error = gfs2_freeze_locally(sdp);
> +		if (error) {
> +			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
> +				error);
>   			goto out;
>   		}
>   
>   		error = gfs2_lock_fs_check_clean(sdp);
>   		if (!error)
> -			break;
> +			break;  /* success */
> +
> +		error = gfs2_do_thaw(sdp);
> +		if (error)
> +			goto out;
>   
>   		if (error == -EBUSY)
>   			fs_err(sdp, "waiting for recovery before freeze\n");
> @@ -735,8 +806,12 @@ static int gfs2_freeze_super(struct super_block *sb)
>   		fs_err(sdp, "retrying...\n");
>   		msleep(1000);
>   	}
> -	set_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> +
>   out:
> +	if (!error) {
> +		set_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> +		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> +	}
>   	mutex_unlock(&sdp->sd_freeze_mutex);
>   	return error;
>   }
> @@ -750,17 +825,47 @@ static int gfs2_freeze_super(struct super_block *sb)
>   static int gfs2_thaw_super(struct super_block *sb)
>   {
>   	struct gfs2_sbd *sdp = sb->s_fs_info;
> +	int error;
>   
> -	mutex_lock(&sdp->sd_freeze_mutex);
> -	if (atomic_read(&sdp->sd_freeze_state) != SFS_FROZEN ||
> -	    !gfs2_holder_initialized(&sdp->sd_freeze_gh)) {
> -		mutex_unlock(&sdp->sd_freeze_mutex);
> -		return -EINVAL;
> +	if (!mutex_trylock(&sdp->sd_freeze_mutex))
> +		return -EBUSY;
> +	error = -EINVAL;
> +	if (!test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags))
> +		goto out;
> +
> +	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> +
> +	error = gfs2_do_thaw(sdp);
> +
> +	if (!error) {
> +		clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> +		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
>   	}
> +out:
> +	mutex_unlock(&sdp->sd_freeze_mutex);
> +	return error;
> +}
> +
> +void gfs2_thaw_freeze_initiator(struct super_block *sb)
> +{
> +	struct gfs2_sbd *sdp = sb->s_fs_info;
> +
> +	mutex_lock(&sdp->sd_freeze_mutex);
> +	if (!test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags))
> +		goto out;
>   
>   	gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> +#if 0
> +	int error;
> +	error = thaw_super(sb);
> +	if (error)
> +		fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n", error);
> +	clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> +	atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> +#endif

Can be removed.

Andy

> +
> +out:
>   	mutex_unlock(&sdp->sd_freeze_mutex);
> -	return wait_on_bit(&sdp->sd_flags, SDF_FREEZE_INITIATOR, TASK_INTERRUPTIBLE);
>   }
>   
>   /**
> diff --git a/fs/gfs2/super.h b/fs/gfs2/super.h
> index 58d13fd77aed..bba58629bc45 100644
> --- a/fs/gfs2/super.h
> +++ b/fs/gfs2/super.h
> @@ -46,6 +46,7 @@ extern void gfs2_statfs_change_out(const struct gfs2_statfs_change_host *sc,
>   extern void update_statfs(struct gfs2_sbd *sdp, struct buffer_head *m_bh);
>   extern int gfs2_statfs_sync(struct super_block *sb, int type);
>   extern void gfs2_freeze_func(struct work_struct *work);
> +extern void gfs2_thaw_freeze_initiator(struct super_block *sb);
>   
>   extern void free_local_statfs_inodes(struct gfs2_sbd *sdp);
>   extern struct inode *find_local_statfs_inode(struct gfs2_sbd *sdp,
> diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
> index 1743caee5505..00494dffb47c 100644
> --- a/fs/gfs2/util.c
> +++ b/fs/gfs2/util.c
> @@ -124,7 +124,6 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
>   	struct gfs2_inode *ip;
>   	struct gfs2_glock *i_gl;
>   	u64 no_formal_ino;
> -	int log_write_allowed = test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
>   	int ret = 0;
>   	int tries;
>   
> @@ -152,24 +151,18 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
>   	 */
>   	clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
>   	if (!sb_rdonly(sdp->sd_vfs)) {
> -		struct gfs2_holder freeze_gh;
> -
> -		gfs2_holder_mark_uninitialized(&freeze_gh);
> -		if (sdp->sd_freeze_gl &&
> -		    !gfs2_glock_is_locked_by_me(sdp->sd_freeze_gl)) {
> -			ret = gfs2_freeze_lock_shared(sdp, &freeze_gh,
> -					log_write_allowed ? 0 : LM_FLAG_TRY);
> -			if (ret == GLR_TRYFAILED)
> -				ret = 0;
> -		}
> -		if (!ret)
> -			gfs2_make_fs_ro(sdp);
> +		bool locked = mutex_trylock(&sdp->sd_freeze_mutex);
> +
> +		gfs2_make_fs_ro(sdp);
> +
> +		if (locked)
> +			mutex_unlock(&sdp->sd_freeze_mutex);
> +
>   		/*
>   		 * Dequeue any pending non-system glock holders that can no
>   		 * longer be granted because the file system is withdrawn.
>   		 */
>   		gfs2_gl_dq_holders(sdp);
> -		gfs2_freeze_unlock(&freeze_gh);
>   	}
>   
>   	if (sdp->sd_lockstruct.ls_ops->lm_lock == NULL) { /* lock_nolock */
> @@ -187,15 +180,8 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
>   	}
>   	sdp->sd_jinode_gh.gh_flags |= GL_NOCACHE;
>   	gfs2_glock_dq(&sdp->sd_jinode_gh);
> -	if (test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags)) {
> -		/* Make sure gfs2_thaw_super works if partially-frozen */
> -		flush_work(&sdp->sd_freeze_work);
> -		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> -		thaw_super(sdp->sd_vfs);
> -	} else {
> -		wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE,
> -			    TASK_UNINTERRUPTIBLE);
> -	}
> +	gfs2_thaw_freeze_initiator(sdp->sd_vfs);
> +	wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE, TASK_UNINTERRUPTIBLE);
>   
>   	/*
>   	 * holder_uninit to force glock_put, to force dlm to let go


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic
  2023-06-13 10:28   ` Andrew Price
@ 2023-06-13 12:49     ` Andreas Gruenbacher
  0 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-13 12:49 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Jun 13, 2023 at 12:29?PM Andrew Price <anprice@redhat.com> wrote:
> On 12/06/2023 17:33, Andreas Gruenbacher wrote:
> > So far, at mount time, gfs2 would take the freeze glock in shared mode
> > and then immediately drop it again, turning it into a cached glock that
> > can be reclaimed at any time.  To freeze the filesystem cluster-wide,
> > the node initiating the freeze would take the freeze glock in exclusive
> > mode, which would cause the freeze glock's freeze_go_sync() callback to
> > run on each node.  There, gfs2 would freeze the filesystem and schedule
> > gfs2_freeze_func() to run.  gfs2_freeze_func() would re-acquire the
> > freeze glock in shared mode, thaw the filesystem, and drop the freeze
> > glock again.  The initiating node would keep the freeze glock held in
> > exclusive mode.  To thaw the filesystem, the initiating node would drop
> > the freeze glock again, which would allow gfs2_freeze_func() to resume
> > on all nodes, leaving the filesystem in the thawed state.
> >
> > It turns out that in freeze_go_sync(), we cannot reliably and safely
> > freeze the filesystem.  This is primarily because the final unmount of a
> > filesystem takes a write lock on the s_umount rw semaphore before
> > calling into gfs2_put_super(), and freeze_go_sync() needs to call
> > freeze_super() which also takes a write lock on the same semaphore,
> > causing a deadlock.  We could work around this by trying to take an
> > active reference on the super block first, which would prevent unmount
> > from running at the same time.  But that can fail, and freeze_go_sync()
> > isn't actually allowed to fail.
> >
> > To get around this, this patch changes the freeze glock locking scheme
> > as follows:
> >
> > At mount time, each node takes the freeze glock in shared mode.  To
> > freeze a filesystem, the initiating node first freezes the filesystem
> > locally and then drops and re-acquires the freeze glock in exclusive
> > mode.  All other nodes notice that there is contention on the freeze
> > glock in their go_callback callbacks, and they schedule
> > gfs2_freeze_func() to run.  There, they freeze the filesystem locally
> > and drop and re-acquire the freeze glock before re-thawing the
> > filesystem.  This is happening outside of the glock state engine, so
> > there, we are allowed to fail.
> >
> >  From a cluster point of view, taking and immediately dropping a glock is
> > indistinguishable from taking the glock and only dropping it upon
> > contention, so this new scheme is compatible with the old one.
>
> Nice!
>
> > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > ---
> >   fs/gfs2/glops.c      |  52 +++++--------
> >   fs/gfs2/log.c        |   2 -
> >   fs/gfs2/ops_fstype.c |   5 +-
> >   fs/gfs2/recovery.c   |  24 +++---
> >   fs/gfs2/super.c      | 179 ++++++++++++++++++++++++++++++++++---------
> >   fs/gfs2/super.h      |   1 +
> >   fs/gfs2/util.c       |  32 +++-----
> >   7 files changed, 185 insertions(+), 110 deletions(-)
> >
> > diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
> > index 01d433ed6ce7..7c48c7afd6a4 100644
> > --- a/fs/gfs2/glops.c
> > +++ b/fs/gfs2/glops.c
> > @@ -561,47 +561,33 @@ static void inode_go_dump(struct seq_file *seq, struct gfs2_glock *gl,
> >   }
> >
> >   /**
> > - * freeze_go_sync - promote/demote the freeze glock
> > + * freeze_go_callback - A cluster node is requesting a freeze
> >    * @gl: the glock
> > + * @remote: true if this came from a different cluster node
> >    */
> >
> > -static int freeze_go_sync(struct gfs2_glock *gl)
> > +static void freeze_go_callback(struct gfs2_glock *gl, bool remote)
> >   {
> > -     int error = 0;
> >       struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
> > +     struct super_block *sb = sdp->sd_vfs;
> > +
> > +     if (!remote ||
> > +         gl->gl_state != LM_ST_SHARED ||
> > +         gl->gl_demote_state != LM_ST_UNLOCKED)
> > +             return;
> >
> >       /*
> > -      * We need to check gl_state == LM_ST_SHARED here and not gl_req ==
> > -      * LM_ST_EXCLUSIVE. That's because when any node does a freeze,
> > -      * all the nodes should have the freeze glock in SH mode and they all
> > -      * call do_xmote: One for EX and the others for UN. They ALL must
> > -      * freeze locally, and they ALL must queue freeze work. The freeze_work
> > -      * calls freeze_func, which tries to reacquire the freeze glock in SH,
> > -      * effectively waiting for the thaw on the node who holds it in EX.
> > -      * Once thawed, the work func acquires the freeze glock in
> > -      * SH and everybody goes back to thawed.
> > +      * Try to get an active super block reference to prevent racing with
> > +      * unmount (see trylock_super()).  But note that unmount isn't the only
> > +      * place where a write lock on s_umount is taken, and we can fail here
> > +      * because of things like remount as well.
> >        */
> > -     if (gl->gl_state == LM_ST_SHARED && !gfs2_withdrawn(sdp) &&
> > -         !test_bit(SDF_NORECOVERY, &sdp->sd_flags)) {
> > -             atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
> > -             error = freeze_super(sdp->sd_vfs);
> > -             if (error) {
> > -                     fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
> > -                             error);
> > -                     if (gfs2_withdrawn(sdp)) {
> > -                             atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> > -                             return 0;
> > -                     }
> > -                     gfs2_assert_withdraw(sdp, 0);
> > -             }
> > -             queue_work(gfs2_freeze_wq, &sdp->sd_freeze_work);
> > -             if (test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))
> > -                     gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_FREEZE |
> > -                                    GFS2_LFC_FREEZE_GO_SYNC);
> > -             else /* read-only mounts */
> > -                     atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> > +     if (down_read_trylock(&sb->s_umount)) {
> > +             atomic_inc(&sb->s_active);
> > +             up_read(&sb->s_umount);
> > +             if (!queue_work(gfs2_freeze_wq, &sdp->sd_freeze_work))
> > +                     deactivate_super(sb);
> >       }
> > -     return 0;
> >   }
> >
> >   /**
> > @@ -761,9 +747,9 @@ const struct gfs2_glock_operations gfs2_rgrp_glops = {
> >   };
> >
> >   const struct gfs2_glock_operations gfs2_freeze_glops = {
> > -     .go_sync = freeze_go_sync,
> >       .go_xmote_bh = freeze_go_xmote_bh,
> >       .go_demote_ok = freeze_go_demote_ok,
> > +     .go_callback = freeze_go_callback,
> >       .go_type = LM_TYPE_NONDISK,
> >       .go_flags = GLOF_NONDISK,
> >   };
> > diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
> > index d750d1128bed..dca535311dee 100644
> > --- a/fs/gfs2/log.c
> > +++ b/fs/gfs2/log.c
> > @@ -1136,8 +1136,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
> >               if (flags & (GFS2_LOG_HEAD_FLUSH_SHUTDOWN |
> >                            GFS2_LOG_HEAD_FLUSH_FREEZE))
> >                       gfs2_log_shutdown(sdp);
> > -             if (flags & GFS2_LOG_HEAD_FLUSH_FREEZE)
> > -                     atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> >       }
> >
> >   out_end:
> > diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
> > index 12eedba45dbb..4ce5718719ac 100644
> > --- a/fs/gfs2/ops_fstype.c
> > +++ b/fs/gfs2/ops_fstype.c
> > @@ -1140,7 +1140,6 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
> >       int silent = fc->sb_flags & SB_SILENT;
> >       struct gfs2_sbd *sdp;
> >       struct gfs2_holder mount_gh;
> > -     struct gfs2_holder freeze_gh;
> >       int error;
> >
> >       sdp = init_sbd(sb);
> > @@ -1269,15 +1268,15 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
> >               }
> >       }
> >
> > -     error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
> > +     error = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
> >       if (error)
> >               goto fail_per_node;
> >
> >       if (!sb_rdonly(sb))
> >               error = gfs2_make_fs_rw(sdp);
> >
> > -     gfs2_freeze_unlock(&freeze_gh);
> >       if (error) {
> > +             gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> >               if (sdp->sd_quotad_process)
> >                       kthread_stop(sdp->sd_quotad_process);
> >               sdp->sd_quotad_process = NULL;
> > diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
> > index 61ef07da40b2..afeda936e2be 100644
> > --- a/fs/gfs2/recovery.c
> > +++ b/fs/gfs2/recovery.c
> > @@ -404,7 +404,7 @@ void gfs2_recover_func(struct work_struct *work)
> >       struct gfs2_inode *ip = GFS2_I(jd->jd_inode);
> >       struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
> >       struct gfs2_log_header_host head;
> > -     struct gfs2_holder j_gh, ji_gh, thaw_gh;
> > +     struct gfs2_holder j_gh, ji_gh;
> >       ktime_t t_start, t_jlck, t_jhd, t_tlck, t_rep;
> >       int ro = 0;
> >       unsigned int pass;
> > @@ -465,14 +465,14 @@ void gfs2_recover_func(struct work_struct *work)
> >               ktime_ms_delta(t_jhd, t_jlck));
> >
> >       if (!(head.lh_flags & GFS2_LOG_HEAD_UNMOUNT)) {
> > -             fs_info(sdp, "jid=%u: Acquiring the freeze glock...\n",
> > -                     jd->jd_jid);
> > -
> > -             /* Acquire a shared hold on the freeze glock */
> > +             mutex_lock(&sdp->sd_freeze_mutex);
> >
> > -             error = gfs2_freeze_lock_shared(sdp, &thaw_gh, LM_FLAG_PRIORITY);
> > -             if (error)
> > +             if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) {
> > +                     mutex_unlock(&sdp->sd_freeze_mutex);
> > +                     fs_warn(sdp, "jid=%u: Can't replay: filesystem "
> > +                             "is frozen\n", jd->jd_jid);
> >                       goto fail_gunlock_ji;
> > +             }
> >
> >               if (test_bit(SDF_RORECOVERY, &sdp->sd_flags)) {
> >                       ro = 1;
> > @@ -496,7 +496,7 @@ void gfs2_recover_func(struct work_struct *work)
> >                       fs_warn(sdp, "jid=%u: Can't replay: read-only block "
> >                               "device\n", jd->jd_jid);
> >                       error = -EROFS;
> > -                     goto fail_gunlock_thaw;
> > +                     goto fail_gunlock_nofreeze;
> >               }
> >
> >               t_tlck = ktime_get();
> > @@ -514,7 +514,7 @@ void gfs2_recover_func(struct work_struct *work)
> >                       lops_after_scan(jd, error, pass);
> >                       if (error) {
> >                               up_read(&sdp->sd_log_flush_lock);
> > -                             goto fail_gunlock_thaw;
> > +                             goto fail_gunlock_nofreeze;
> >                       }
> >               }
> >
> > @@ -522,7 +522,7 @@ void gfs2_recover_func(struct work_struct *work)
> >               clean_journal(jd, &head);
> >               up_read(&sdp->sd_log_flush_lock);
> >
> > -             gfs2_freeze_unlock(&thaw_gh);
> > +             mutex_unlock(&sdp->sd_freeze_mutex);
> >               t_rep = ktime_get();
> >               fs_info(sdp, "jid=%u: Journal replayed in %lldms [jlck:%lldms, "
> >                       "jhead:%lldms, tlck:%lldms, replay:%lldms]\n",
> > @@ -543,8 +543,8 @@ void gfs2_recover_func(struct work_struct *work)
> >       fs_info(sdp, "jid=%u: Done\n", jd->jd_jid);
> >       goto done;
> >
> > -fail_gunlock_thaw:
> > -     gfs2_freeze_unlock(&thaw_gh);
> > +fail_gunlock_nofreeze:
> > +     mutex_unlock(&sdp->sd_freeze_mutex);
> >   fail_gunlock_ji:
> >       if (jlocked) {
> >               gfs2_glock_dq_uninit(&ji_gh);
> > diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> > index 5f3e699ae922..eb41ab32695a 100644
> > --- a/fs/gfs2/super.c
> > +++ b/fs/gfs2/super.c
> > @@ -332,7 +332,12 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
> >       struct lfcc *lfcc;
> >       LIST_HEAD(list);
> >       struct gfs2_log_header_host lh;
> > -     int error;
> > +     int error, error2;
> > +
> > +     /*
> > +      * Grab all the journal glocks in SH mode.  We are *probably* doing
> > +      * that to prevent recovery.
> > +      */
> >
> >       list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
> >               lfcc = kmalloc(sizeof(struct lfcc), GFP_KERNEL);
> > @@ -349,11 +354,13 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
> >               list_add(&lfcc->list, &list);
> >       }
> >
> > +     gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> > +
> >       error = gfs2_glock_nq_init(sdp->sd_freeze_gl, LM_ST_EXCLUSIVE,
> >                                  LM_FLAG_NOEXP | GL_NOPID,
> >                                  &sdp->sd_freeze_gh);
> >       if (error)
> > -             goto out;
> > +             goto relock_shared;
> >
> >       list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
> >               error = gfs2_jdesc_check(jd);
> > @@ -368,8 +375,14 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
> >               }
> >       }
> >
> > -     if (error)
> > -             gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> > +     if (!error)
> > +             goto out;  /* success */
> > +
> > +     gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> > +
> > +relock_shared:
> > +     error2 = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
> > +     gfs2_assert_withdraw(sdp, !error2);
> >
> >   out:
> >       while (!list_empty(&list)) {
> > @@ -615,6 +628,8 @@ static void gfs2_put_super(struct super_block *sb)
> >
> >       /*  Release stuff  */
> >
> > +     gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> > +
> >       iput(sdp->sd_jindex);
> >       iput(sdp->sd_statfs_inode);
> >       iput(sdp->sd_rindex);
> > @@ -669,31 +684,81 @@ static int gfs2_sync_fs(struct super_block *sb, int wait)
> >       return sdp->sd_log_error;
> >   }
> >
> > -void gfs2_freeze_func(struct work_struct *work)
> > +static int gfs2_freeze_locally(struct gfs2_sbd *sdp)
> >   {
> > -     int error;
> > -     struct gfs2_holder freeze_gh;
> > -     struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
> >       struct super_block *sb = sdp->sd_vfs;
> > +     int error;
> >
> > -     atomic_inc(&sb->s_active);
> > -     error = gfs2_freeze_lock_shared(sdp, &freeze_gh, 0);
> > -     if (error) {
> > -             gfs2_assert_withdraw(sdp, 0);
> > -     } else {
> > -             atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> > -             error = thaw_super(sb);
> > -             if (error) {
> > -                     fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n",
> > -                             error);
> > -                     gfs2_assert_withdraw(sdp, 0);
> > +     atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
> > +
> > +     error = freeze_super(sb);
> > +     if (error)
> > +             goto fail;
> > +
> > +     if (test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags)) {
> > +             gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_FREEZE |
> > +                            GFS2_LFC_FREEZE_GO_SYNC);
> > +             if (gfs2_withdrawn(sdp)) {
> > +                     thaw_super(sb);
> > +                     error = -EIO;
> > +                     goto fail;
> >               }
> > -             gfs2_freeze_unlock(&freeze_gh);
> >       }
> > +     return 0;
> > +
> > +fail:
> > +     atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> > +     return error;
> > +}
> > +
> > +static int gfs2_do_thaw(struct gfs2_sbd *sdp)
> > +{
> > +     struct super_block *sb = sdp->sd_vfs;
> > +     int error;
> > +
> > +     error = gfs2_freeze_lock_shared(sdp, &sdp->sd_freeze_gh, 0);
> > +     if (error)
> > +             goto fail;
> > +     error = thaw_super(sb);
> > +     if (!error)
> > +             return 0;
> > +
> > +fail:
> > +     fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n", error);
> > +     gfs2_assert_withdraw(sdp, 0);
> > +     return error;
> > +}
> > +
> > +void gfs2_freeze_func(struct work_struct *work)
> > +{
> > +     struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
> > +     struct super_block *sb = sdp->sd_vfs;
> > +     int error;
> > +
> > +     mutex_lock(&sdp->sd_freeze_mutex);
> > +     error = -EBUSY;
> > +     if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN)
> > +             goto out_unlock;
> > +
> > +     error = gfs2_freeze_locally(sdp);
> > +     if (error)
> > +             goto out_unlock;
> > +
> > +     gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> > +     atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> > +
> > +     error = gfs2_do_thaw(sdp);
> > +     if (error)
> > +             goto out;
> > +
> > +     atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> > +
> > +out_unlock:
> > +     mutex_unlock(&sdp->sd_freeze_mutex);
> >       deactivate_super(sb);
> > -     clear_bit_unlock(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> > -     wake_up_bit(&sdp->sd_flags, SDF_FREEZE_INITIATOR);
> > -     return;
> > +out:
> > +     if (error)
> > +             fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n", error);
> >   }
> >
> >   /**
> > @@ -707,21 +772,27 @@ static int gfs2_freeze_super(struct super_block *sb)
> >       struct gfs2_sbd *sdp = sb->s_fs_info;
> >       int error;
> >
> > -     mutex_lock(&sdp->sd_freeze_mutex);
> > -     if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) {
> > -             error = -EBUSY;
> > +     if (!mutex_trylock(&sdp->sd_freeze_mutex))
> > +             return -EBUSY;
> > +     error = -EBUSY;
> > +     if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN)
> >               goto out;
> > -     }
> >
> >       for (;;) {
> > -             if (gfs2_withdrawn(sdp)) {
> > -                     error = -EINVAL;
> > +             error = gfs2_freeze_locally(sdp);
> > +             if (error) {
> > +                     fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
> > +                             error);
> >                       goto out;
> >               }
> >
> >               error = gfs2_lock_fs_check_clean(sdp);
> >               if (!error)
> > -                     break;
> > +                     break;  /* success */
> > +
> > +             error = gfs2_do_thaw(sdp);
> > +             if (error)
> > +                     goto out;
> >
> >               if (error == -EBUSY)
> >                       fs_err(sdp, "waiting for recovery before freeze\n");
> > @@ -735,8 +806,12 @@ static int gfs2_freeze_super(struct super_block *sb)
> >               fs_err(sdp, "retrying...\n");
> >               msleep(1000);
> >       }
> > -     set_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> > +
> >   out:
> > +     if (!error) {
> > +             set_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> > +             atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> > +     }
> >       mutex_unlock(&sdp->sd_freeze_mutex);
> >       return error;
> >   }
> > @@ -750,17 +825,47 @@ static int gfs2_freeze_super(struct super_block *sb)
> >   static int gfs2_thaw_super(struct super_block *sb)
> >   {
> >       struct gfs2_sbd *sdp = sb->s_fs_info;
> > +     int error;
> >
> > -     mutex_lock(&sdp->sd_freeze_mutex);
> > -     if (atomic_read(&sdp->sd_freeze_state) != SFS_FROZEN ||
> > -         !gfs2_holder_initialized(&sdp->sd_freeze_gh)) {
> > -             mutex_unlock(&sdp->sd_freeze_mutex);
> > -             return -EINVAL;
> > +     if (!mutex_trylock(&sdp->sd_freeze_mutex))
> > +             return -EBUSY;
> > +     error = -EINVAL;
> > +     if (!test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags))
> > +             goto out;
> > +
> > +     gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> > +
> > +     error = gfs2_do_thaw(sdp);
> > +
> > +     if (!error) {
> > +             clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> > +             atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> >       }
> > +out:
> > +     mutex_unlock(&sdp->sd_freeze_mutex);
> > +     return error;
> > +}
> > +
> > +void gfs2_thaw_freeze_initiator(struct super_block *sb)
> > +{
> > +     struct gfs2_sbd *sdp = sb->s_fs_info;
> > +
> > +     mutex_lock(&sdp->sd_freeze_mutex);
> > +     if (!test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags))
> > +             goto out;
> >
> >       gfs2_freeze_unlock(&sdp->sd_freeze_gh);
> > +#if 0
> > +     int error;
> > +     error = thaw_super(sb);
> > +     if (error)
> > +             fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n", error);
> > +     clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags);
> > +     atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> > +#endif
>
> Can be removed.

Ah, of course, thanks.

Andreas

> Andy
>
> > +
> > +out:
> >       mutex_unlock(&sdp->sd_freeze_mutex);
> > -     return wait_on_bit(&sdp->sd_flags, SDF_FREEZE_INITIATOR, TASK_INTERRUPTIBLE);
> >   }
> >
> >   /**
> > diff --git a/fs/gfs2/super.h b/fs/gfs2/super.h
> > index 58d13fd77aed..bba58629bc45 100644
> > --- a/fs/gfs2/super.h
> > +++ b/fs/gfs2/super.h
> > @@ -46,6 +46,7 @@ extern void gfs2_statfs_change_out(const struct gfs2_statfs_change_host *sc,
> >   extern void update_statfs(struct gfs2_sbd *sdp, struct buffer_head *m_bh);
> >   extern int gfs2_statfs_sync(struct super_block *sb, int type);
> >   extern void gfs2_freeze_func(struct work_struct *work);
> > +extern void gfs2_thaw_freeze_initiator(struct super_block *sb);
> >
> >   extern void free_local_statfs_inodes(struct gfs2_sbd *sdp);
> >   extern struct inode *find_local_statfs_inode(struct gfs2_sbd *sdp,
> > diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
> > index 1743caee5505..00494dffb47c 100644
> > --- a/fs/gfs2/util.c
> > +++ b/fs/gfs2/util.c
> > @@ -124,7 +124,6 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
> >       struct gfs2_inode *ip;
> >       struct gfs2_glock *i_gl;
> >       u64 no_formal_ino;
> > -     int log_write_allowed = test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
> >       int ret = 0;
> >       int tries;
> >
> > @@ -152,24 +151,18 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
> >        */
> >       clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
> >       if (!sb_rdonly(sdp->sd_vfs)) {
> > -             struct gfs2_holder freeze_gh;
> > -
> > -             gfs2_holder_mark_uninitialized(&freeze_gh);
> > -             if (sdp->sd_freeze_gl &&
> > -                 !gfs2_glock_is_locked_by_me(sdp->sd_freeze_gl)) {
> > -                     ret = gfs2_freeze_lock_shared(sdp, &freeze_gh,
> > -                                     log_write_allowed ? 0 : LM_FLAG_TRY);
> > -                     if (ret == GLR_TRYFAILED)
> > -                             ret = 0;
> > -             }
> > -             if (!ret)
> > -                     gfs2_make_fs_ro(sdp);
> > +             bool locked = mutex_trylock(&sdp->sd_freeze_mutex);
> > +
> > +             gfs2_make_fs_ro(sdp);
> > +
> > +             if (locked)
> > +                     mutex_unlock(&sdp->sd_freeze_mutex);
> > +
> >               /*
> >                * Dequeue any pending non-system glock holders that can no
> >                * longer be granted because the file system is withdrawn.
> >                */
> >               gfs2_gl_dq_holders(sdp);
> > -             gfs2_freeze_unlock(&freeze_gh);
> >       }
> >
> >       if (sdp->sd_lockstruct.ls_ops->lm_lock == NULL) { /* lock_nolock */
> > @@ -187,15 +180,8 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
> >       }
> >       sdp->sd_jinode_gh.gh_flags |= GL_NOCACHE;
> >       gfs2_glock_dq(&sdp->sd_jinode_gh);
> > -     if (test_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags)) {
> > -             /* Make sure gfs2_thaw_super works if partially-frozen */
> > -             flush_work(&sdp->sd_freeze_work);
> > -             atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> > -             thaw_super(sdp->sd_vfs);
> > -     } else {
> > -             wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE,
> > -                         TASK_UNINTERRUPTIBLE);
> > -     }
> > +     gfs2_thaw_freeze_initiator(sdp->sd_vfs);
> > +     wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE, TASK_UNINTERRUPTIBLE);
> >
> >       /*
> >        * holder_uninit to force glock_put, to force dlm to let go
>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic
  2023-06-12 16:33 ` [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
  2023-06-13 10:28   ` Andrew Price
@ 2023-06-13 13:05   ` Alexander Aring
  2023-06-13 13:33     ` Andreas Gruenbacher
  1 sibling, 1 reply; 13+ messages in thread
From: Alexander Aring @ 2023-06-13 13:05 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi Andreas,

On Mon, Jun 12, 2023 at 12:33?PM Andreas Gruenbacher
<agruenba@redhat.com> wrote:
...
>
> @@ -152,24 +151,18 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
>          */
>         clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
>         if (!sb_rdonly(sdp->sd_vfs)) {
> -               struct gfs2_holder freeze_gh;
> -
> -               gfs2_holder_mark_uninitialized(&freeze_gh);
> -               if (sdp->sd_freeze_gl &&
> -                   !gfs2_glock_is_locked_by_me(sdp->sd_freeze_gl)) {
> -                       ret = gfs2_freeze_lock_shared(sdp, &freeze_gh,
> -                                       log_write_allowed ? 0 : LM_FLAG_TRY);
> -                       if (ret == GLR_TRYFAILED)
> -                               ret = 0;
> -               }
> -               if (!ret)
> -                       gfs2_make_fs_ro(sdp);
> +               bool locked = mutex_trylock(&sdp->sd_freeze_mutex);
> +
> +               gfs2_make_fs_ro(sdp);
> +
> +               if (locked)
> +                       mutex_unlock(&sdp->sd_freeze_mutex);

I am not sure if I overlooked something here, for me it looks like the
application does not care about if sd_freeze_mutex is locked or not
because the introduced locked boolean will never be evaluated?

What am I missing here?

- Alex


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic
  2023-06-13 13:05   ` Alexander Aring
@ 2023-06-13 13:33     ` Andreas Gruenbacher
  0 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2023-06-13 13:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Jun 13, 2023 at 3:05?PM Alexander Aring <aahringo@redhat.com> wrote:
> Hi Andreas,
>
> On Mon, Jun 12, 2023 at 12:33?PM Andreas Gruenbacher
> <agruenba@redhat.com> wrote:
> ...
> >
> > @@ -152,24 +151,18 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
> >          */
> >         clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
> >         if (!sb_rdonly(sdp->sd_vfs)) {
> > -               struct gfs2_holder freeze_gh;
> > -
> > -               gfs2_holder_mark_uninitialized(&freeze_gh);
> > -               if (sdp->sd_freeze_gl &&
> > -                   !gfs2_glock_is_locked_by_me(sdp->sd_freeze_gl)) {
> > -                       ret = gfs2_freeze_lock_shared(sdp, &freeze_gh,
> > -                                       log_write_allowed ? 0 : LM_FLAG_TRY);
> > -                       if (ret == GLR_TRYFAILED)
> > -                               ret = 0;
> > -               }
> > -               if (!ret)
> > -                       gfs2_make_fs_ro(sdp);
> > +               bool locked = mutex_trylock(&sdp->sd_freeze_mutex);
> > +
> > +               gfs2_make_fs_ro(sdp);
> > +
> > +               if (locked)
> > +                       mutex_unlock(&sdp->sd_freeze_mutex);
>
> I am not sure if I overlooked something here, for me it looks like the
> application does not care about if sd_freeze_mutex is locked or not
> because the introduced locked boolean will never be evaluated?
>
> What am I missing here?

This is to withdraw the filesystem. We're trying to acquire
sd_freeze_mutex to prevent local races, but if we can't get it, we
still go ahead and mark the filesystem read-only. Then we unlock
sd_freeze_mutex, but only if we've locked it before. This is a bit
ugly, but I don't have any better ideas right now.

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-06-13 13:33 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-12 16:33 [Cluster-devel] [PATCH 0/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
2023-06-12 16:33 ` [Cluster-devel] [PATCH 1/8] gfs2: Rename remaining "transaction" glock references Andreas Gruenbacher
2023-06-12 16:33 ` [Cluster-devel] [PATCH 2/8] gfs2: Rename the {freeze, thaw}_super callbacks Andreas Gruenbacher
2023-06-12 16:33 ` [Cluster-devel] [PATCH 3/8] gfs2: Rename gfs2_freeze_lock{ => _shared } Andreas Gruenbacher
2023-06-12 16:33 ` [Cluster-devel] [PATCH 4/8] gfs2: Reconfiguring frozen filesystem already rejected Andreas Gruenbacher
2023-06-12 16:33 ` [Cluster-devel] [PATCH 5/8] gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR} Andreas Gruenbacher
2023-06-12 16:33 ` [Cluster-devel] [PATCH 6/8] gfs2: Rework freeze / thaw logic Andreas Gruenbacher
2023-06-13 10:28   ` Andrew Price
2023-06-13 12:49     ` Andreas Gruenbacher
2023-06-13 13:05   ` Alexander Aring
2023-06-13 13:33     ` Andreas Gruenbacher
2023-06-12 16:33 ` [Cluster-devel] [PATCH 7/8] gfs2: Replace sd_freeze_state with SDF_FROZEN flag Andreas Gruenbacher
2023-06-12 16:33 ` [Cluster-devel] [PATCH 8/8] gfs2: gfs2_freeze_lock_shared cleanup Andreas Gruenbacher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).