From mboxrd@z Thu Jan  1 00:00:00 1970
From: Bob Peterson <rpeterso@redhat.com>
Date: Tue, 18 May 2021 09:12:10 -0400 (EDT)
Subject: [Cluster-devel] [gfs2 patch] gfs2: fix scheduling while atomic bug
	in glocks
In-Reply-To: <892198597.28357144.1621343494850.JavaMail.zimbra@redhat.com>
Message-ID: <16881786.28357200.1621343530958.JavaMail.zimbra@redhat.com>
List-Id: <cluster-devel.redhat.com>
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Before this patch, in the unlikely event that gfs2_glock_dq encountered a
withdraw, it called wait_on_bit to wait for its journal to be recovered
without first releasing the glock's lockref spin_lock. Sleeping with the
spin_lock held caused a scheduling-while-atomic error.

This patch unlocks the lockref spin_lock before waiting for recovery and
re-takes it once the wait completes.

Fixes: 601ef0d52e961 ("gfs2: Force withdraw to replay journals and wait for it to finish")
Reported-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/glock.c | 2 ++
 1 file changed, 2 insertions(+)
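
For reviewers, a minimal userspace sketch of the locking shape this patch
changes. It is only a toy model, not the gfs2 code: toy_spin_lock,
toy_spin_unlock, toy_sleeping_wait, atomic_depth and violations are
hypothetical names invented for illustration, and the "sleep" merely
records whether it was entered with the lock held.

```c
#include <assert.h>

/* Toy model; every name here is hypothetical, none is a kernel API. */
static int atomic_depth;	/* > 0 means a (toy) spinlock is held */
static int violations;		/* times we "slept" while atomic */

static void toy_spin_lock(void)
{
	atomic_depth++;
}

static void toy_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* Stands in for a sleeping wait such as wait_on_bit(). */
static void toy_sleeping_wait(void)
{
	if (atomic_depth > 0)
		violations++;	/* models "scheduling while atomic" */
}

/* Shape before the patch: wait with the lock still held. */
static int dq_before(void)
{
	violations = 0;
	toy_spin_lock();
	toy_sleeping_wait();
	toy_spin_unlock();
	return violations;
}

/* Shape after the patch: drop the lock, wait, then re-take it. */
static int dq_after(void)
{
	violations = 0;
	toy_spin_lock();
	toy_spin_unlock();
	toy_sleeping_wait();
	toy_spin_lock();
	toy_spin_unlock();
	return violations;
}
```

In this model dq_before() counts one violation and dq_after() counts none.
One general consequence of the drop-and-retake shape is that anything
protected by the lock may change during the wait, so values read before
the unlock should not be assumed to still hold afterward.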

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 79f47f227e81..d7bee2ab5d2b 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1471,9 +1471,11 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
 	    glock_blocked_by_withdraw(gl) &&
 	    gh->gh_gl != sdp->sd_jinode_gl) {
 		sdp->sd_glock_dqs_held++;
+		spin_unlock(&gl->gl_lockref.lock);
 		might_sleep();
 		wait_on_bit(&sdp->sd_flags, SDF_WITHDRAW_RECOVERY,
 			    TASK_UNINTERRUPTIBLE);
+		spin_lock(&gl->gl_lockref.lock);
 	}
 	if (gh->gh_flags & GL_NOCACHE)
 		handle_callback(gl, LM_ST_UNLOCKED, 0, false);