* [Cluster-devel] [gfs2 PATCH] gfs2: fix a deadlock on withdraw-during mount
[not found] <1048075113.28357420.1621343657187.JavaMail.zimbra@redhat.com>
@ 2021-05-18 13:14 ` Bob Peterson
0 siblings, 0 replies; only message in thread
From: Bob Peterson @ 2021-05-18 13:14 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, gfs2 would deadlock because of the following
sequence during mount:
mount
gfs2_fill_super
gfs2_make_fs_rw <--- Detects IO error with glock
kthread_stop(sdp->sd_quotad_process);
<--- Blocked waiting for quotad to finish
logd
Detects IO error and the need to withdraw
calls gfs2_withdraw
gfs2_make_fs_ro
kthread_stop(sdp->sd_quotad_process);
<--- Blocked waiting for quotad to finish
gfs2_quotad
gfs2_statfs_sync
gfs2_glock_wait <---- Blocked waiting for statfs glock to be granted
glock_work_func
do_xmote <---Detects IO error, can't release glock: blocked on withdraw
glops->go_inval
glock_blocked_by_withdraw
requeue glock work & exit <--- work requeued, blocked by withdraw
This patch makes a special exception for the statfs system inode glock,
which allows the statfs glock UNLOCK to proceed normally. That allows the
quotad daemon to exit during the withdraw, which allows the logd daemon
to exit during the withdraw, which allows the mount to exit.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/glock.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index d7bee2ab5d2b..5af436c94d2a 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -582,6 +582,16 @@ static void finish_xmote(struct gfs2_glock *gl, unsigned int ret)
spin_unlock(&gl->gl_lockref.lock);
}
+static bool is_system_glock(struct gfs2_glock *gl)
+{
+ struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
+ struct gfs2_inode *m_ip = GFS2_I(sdp->sd_statfs_inode);
+
+ if (gl == m_ip->i_gl)
+ return true;
+ return false;
+}
+
/**
* do_xmote - Calls the DLM to change the state of a lock
* @gl: The lock state
@@ -671,17 +681,25 @@ __acquires(&gl->gl_lockref.lock)
* to see sd_log_error and withdraw, and in the meantime, requeue the
* work for later.
*
+ * We make a special exception for some system glocks, such as the
+ * system statfs inode glock, which needs to be granted before the
+ * gfs2_quotad daemon can exit, and that exit needs to finish before
+ * we can unmount the withdrawn file system.
+ *
* However, if we're just unlocking the lock (say, for unmount, when
* gfs2_gl_hash_clear calls clear_glock) and recovery is complete
* then it's okay to tell dlm to unlock it.
*/
if (unlikely(sdp->sd_log_error && !gfs2_withdrawn(sdp)))
gfs2_withdraw_delayed(sdp);
- if (glock_blocked_by_withdraw(gl)) {
- if (target != LM_ST_UNLOCKED ||
- test_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags)) {
+ if (glock_blocked_by_withdraw(gl) &&
+ (target != LM_ST_UNLOCKED ||
+ test_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags))) {
+ if (!is_system_glock(gl)) {
gfs2_glock_queue_work(gl, GL_GLOCK_DFT_HOLD);
goto out;
+ } else {
+ clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
}
}
^ permalink raw reply related [flat|nested] only message in thread
only message in thread, other threads:[~2021-05-18 13:14 UTC | newest]
Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1048075113.28357420.1621343657187.JavaMail.zimbra@redhat.com>
2021-05-18 13:14 ` [Cluster-devel] [gfs2 PATCH] gfs2: fix a deadlock on withdraw-during mount Bob Peterson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).