From mboxrd@z Thu Jan  1 00:00:00 1970
From: Bob Peterson
Date: Tue, 18 May 2021 09:12:10 -0400 (EDT)
Subject: [Cluster-devel] [gfs2 patch] gfs2: fix scheduling while atomic bug in glocks
In-Reply-To: <892198597.28357144.1621343494850.JavaMail.zimbra@redhat.com>
Message-ID: <16881786.28357200.1621343530958.JavaMail.zimbra@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Before this patch, in the unlikely event that gfs2_glock_dq encountered
a withdraw, it would do a wait_on_bit to wait for its journal to be
recovered, but it never released the glock's spin_lock, which caused a
scheduling-while-atomic error.

This patch unlocks the lockref spin_lock before waiting for recovery.

Fixes: 601ef0d52e961 ("gfs2: Force withdraw to replay journals and wait for it to finish")
Reported-by: Alexander Aring
Signed-off-by: Bob Peterson
---
 fs/gfs2/glock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 79f47f227e81..d7bee2ab5d2b 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1471,9 +1471,11 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
 	    glock_blocked_by_withdraw(gl) &&
 	    gh->gh_gl != sdp->sd_jinode_gl) {
 		sdp->sd_glock_dqs_held++;
+		spin_unlock(&gl->gl_lockref.lock);
 		might_sleep();
 		wait_on_bit(&sdp->sd_flags, SDF_WITHDRAW_RECOVERY,
 			    TASK_UNINTERRUPTIBLE);
+		spin_lock(&gl->gl_lockref.lock);
 	}
 	if (gh->gh_flags & GL_NOCACHE)
 		handle_callback(gl, LM_ST_UNLOCKED, 0, false);