From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH 15/28] gfs2: fix infinite loop when checking ail item count before go_inval
Date: Thu, 20 Feb 2020 13:53:16 -0600
Message-ID: <20200220195329.952027-16-rpeterso@redhat.com>
In-Reply-To: <20200220195329.952027-1-rpeterso@redhat.com>
Before this patch, the rgrp_go_inval and inode_go_inval functions each
checked whether any items were left on the glock's ail list (by way of
gl_ail_count) and, if so, did a withdraw. But the withdraw code now
uses glocks when changing the file system to read-only status, so we
cannot have glock functions withdrawing or a hang will likely result:
the glocks can't be serviced by the work_func if the work_func is
busy doing its own withdraw.

This patch removes the checks from the go_inval functions and adds
a centralized check in do_xmote that warns about the problem without
withdrawing, but flags the error so it is caught when the logd
daemon next runs.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
---
fs/gfs2/glock.c | 17 +++++++++++++++--
fs/gfs2/glops.c | 3 ---
2 files changed, 15 insertions(+), 5 deletions(-)
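
The check added to do_xmote in the glock.c hunk below depends on
cmpxchg() returning the previous value of sd_log_error, so only the
first caller to find it zero stores -EIO and issues the warning. The
following is a minimal userspace sketch of that idiom, assuming C11
atomics as a stand-in for the kernel's cmpxchg(). The names log_error,
flag_first_error, check_ail_before_inval and logd_should_withdraw are
hypothetical and exist only for illustration; they are not GFS2
functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the sd_log_error field of struct gfs2_sbd. */
static atomic_int log_error;

/*
 * Record err only if no error has been recorded yet. Returns true for
 * the single caller that stored it, which models the kernel test
 * !cmpxchg(&sdp->sd_log_error, 0, -EIO).
 */
static bool flag_first_error(atomic_int *slot, int err)
{
	int expected = 0;

	assert(err != 0);	/* zero means "no error recorded" */
	return atomic_compare_exchange_strong(slot, &expected, err);
}

/*
 * Hypothetical model of the check made before go_inval: warn once if
 * ail items remain, but do not withdraw here. Returns true if it warned.
 */
static bool check_ail_before_inval(int ail_count)
{
	if (ail_count != 0 && flag_first_error(&log_error, -EIO)) {
		fprintf(stderr, "warning: %d ail item(s) left after sync\n",
			ail_count);
		return true;
	}
	return false;
}

/* Hypothetical model of the logd side: act on the flagged error later. */
static bool logd_should_withdraw(void)
{
	return atomic_load(&log_error) != 0;
}
```

In this sketch a second call to check_ail_before_inval() with a nonzero
count stays silent because the error slot is already set, while
logd_should_withdraw() keeps reporting true until something clears it.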
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 7602d0e2492c..5afaf92057c0 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -604,9 +604,22 @@ __acquires(&gl->gl_lockref.lock)
 	spin_unlock(&gl->gl_lockref.lock);
 	if (glops->go_sync)
 		glops->go_sync(gl);
-	if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags))
+	if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags)) {
+		/*
+		 * The call to go_sync should have cleared out the ail list.
+		 * If there are still items, we have a problem. We ought to
+		 * withdraw, but we can't because the withdraw code also uses
+		 * glocks. Warn about the error, dump the glock, then fall
+		 * through and wait for logd to do the withdraw for us.
+		 */
+		if ((atomic_read(&gl->gl_ail_count) != 0) &&
+		    (!cmpxchg(&sdp->sd_log_error, 0, -EIO))) {
+			gfs2_assert_warn(sdp, !atomic_read(&gl->gl_ail_count));
+			gfs2_dump_glock(NULL, gl, true);
+		}
 		glops->go_inval(gl, target == LM_ST_DEFERRED ? 0 : DIO_METADATA);
-	clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
+		clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
+	}
 	gfs2_glock_hold(gl);
 	if (sdp->sd_lockstruct.ls_ops->lm_lock) {
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 7cfacbe35e59..b58924482d9a 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -188,7 +188,6 @@ static void rgrp_go_inval(struct gfs2_glock *gl, int flags)
 		gfs2_rgrp_brelse(rgd);
 	WARN_ON_ONCE(!(flags & DIO_METADATA));
-	gfs2_assert_withdraw(sdp, !atomic_read(&gl->gl_ail_count));
 	truncate_inode_pages_range(mapping, gl->gl_vm.start, gl->gl_vm.end);
 	if (rgd)
@@ -288,8 +287,6 @@ static void inode_go_inval(struct gfs2_glock *gl, int flags)
 {
 	struct gfs2_inode *ip = gfs2_glock2inode(gl);
-	gfs2_assert_withdraw(gl->gl_name.ln_sbd, !atomic_read(&gl->gl_ail_count));
-
 	if (flags & DIO_METADATA) {
 		struct address_space *mapping = gfs2_glock2aspace(gl);
 		truncate_inode_pages(mapping, 0);
--
2.24.1