From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH v6 21/26] gfs2: fix infinite loop when checking ail item count before go_inval
Date: Thu, 23 May 2019 08:04:16 -0500
Message-ID: <20190523130421.21003-22-rpeterso@redhat.com>
In-Reply-To: <20190523130421.21003-1-rpeterso@redhat.com>
Before this patch, the rgrp_go_inval and inode_go_inval functions each
checked whether any items were left on the glock's ail list (by way of
gl_ail_count) and, if so, did a withdraw. But the withdraw code now
uses glocks when changing the file system to read-only status, so glock
functions must not withdraw or a hang will likely result: the glocks
can't be serviced by the work_func if the work_func is busy doing its
own withdraw.

This patch removes the checks from the go_inval functions and adds a
centralized check in do_xmote that warns about the problem without
withdrawing, and flags the error in sd_log_error so that it is caught
the next time the logd daemon runs.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
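For readers unfamiliar with the pattern, here is a minimal userspace
sketch of the idea described above: record the error exactly once with
a compare-and-swap, warn, and leave the withdraw to a separate daemon
pass. It is not the kernel code. The names fake_sbd, fake_glock,
flag_leftover_ail and logd_pass are hypothetical, and C11 atomics stand
in for the kernel's atomic_read() and cmpxchg().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct gfs2_sbd. */
struct fake_sbd {
	atomic_int log_error;	/* 0 = no error recorded yet */
	bool withdrawn;		/* set only by the daemon pass */
};

/* Hypothetical stand-in for struct gfs2_glock. */
struct fake_glock {
	struct fake_sbd *sbd;
	atomic_int ail_count;	/* items still on the ail list */
};

/*
 * Simulated state-change path. If items are left on the ail list,
 * record -EIO, but only if no error has been recorded before.
 * Returns true only for the caller that newly recorded the error.
 * It never withdraws by itself.
 */
static bool flag_leftover_ail(struct fake_glock *gl)
{
	int expected = 0;

	assert(gl != NULL && gl->sbd != NULL);
	if (atomic_load(&gl->ail_count) == 0)
		return false;
	/* Succeeds only while log_error is still 0: first writer wins. */
	if (!atomic_compare_exchange_strong(&gl->sbd->log_error,
					    &expected, -EIO))
		return false;
	fprintf(stderr, "warning: ail_count=%d before invalidate\n",
		atomic_load(&gl->ail_count));
	return true;
}

/*
 * Simulated daemon pass. This is the only place that withdraws.
 * Returns true if this pass performed the withdraw.
 */
static bool logd_pass(struct fake_sbd *sbd)
{
	assert(sbd != NULL);
	if (atomic_load(&sbd->log_error) != 0 && !sbd->withdrawn) {
		sbd->withdrawn = true;
		return true;
	}
	return false;
}
```

Only the first caller that finds leftover items records -EIO and warns;
later callers see the error already set and stay quiet. The withdraw
then happens on the next logd_pass() rather than in the glock path.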
fs/gfs2/glock.c | 17 +++++++++++++++--
fs/gfs2/glops.c | 3 ---
2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 7202793056a8..cb378df8217b 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -570,9 +570,22 @@ __acquires(&gl->gl_lockref.lock)
 	spin_unlock(&gl->gl_lockref.lock);
 	if (glops->go_sync)
 		glops->go_sync(gl);
-	if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags))
+	if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags)) {
+		/**
+		 * The call to go_sync should have cleared out the ail list.
+		 * If there are still items, we have a problem. We ought to
+		 * withdraw, but we can't because the withdraw code also uses
+		 * glocks. Warn about the error, dump the glock, then fall
+		 * through and wait for logd to do the withdraw for us.
+		 */
+		if ((atomic_read(&gl->gl_ail_count) != 0) &&
+		    (!cmpxchg(&sdp->sd_log_error, 0, -EIO))) {
+			gfs2_assert_warn(sdp, !atomic_read(&gl->gl_ail_count));
+			gfs2_dump_glock(NULL, gl, true);
+		}
 		glops->go_inval(gl, target == LM_ST_DEFERRED ? 0 : DIO_METADATA);
-	clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
+		clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
+	}
 
 	gfs2_glock_hold(gl);
 	if (sdp->sd_lockstruct.ls_ops->lm_lock)	{
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index db4d499ecf4f..5d4d2b6873e5 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -196,7 +196,6 @@ static void rgrp_go_inval(struct gfs2_glock *gl, int flags)
 		gfs2_rgrp_brelse(rgd);
 
 	WARN_ON_ONCE(!(flags & DIO_METADATA));
-	gfs2_assert_withdraw(sdp, !atomic_read(&gl->gl_ail_count));
 	truncate_inode_pages_range(mapping, gl->gl_vm.start, gl->gl_vm.end);
 
 	if (rgd)
@@ -296,8 +295,6 @@ static void inode_go_inval(struct gfs2_glock *gl, int flags)
 {
 	struct gfs2_inode *ip = gfs2_glock2inode(gl);
 
-	gfs2_assert_withdraw(gl->gl_name.ln_sbd, !atomic_read(&gl->gl_ail_count));
-
 	if (flags & DIO_METADATA) {
 		struct address_space *mapping = gfs2_glock2aspace(gl);
 		truncate_inode_pages(mapping, 0);
--
2.21.0