From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 16/32] gfs2: Abort gfs2_freeze if io error is seen
Date: Wed, 13 Nov 2019 15:30:14 -0600
Message-ID: <20191113213030.237431-17-rpeterso@redhat.com>
In-Reply-To: <20191113213030.237431-1-rpeterso@redhat.com>
Before this patch, an I/O error, such as -EIO while writing to the
journal, would send function gfs2_freeze into an infinite loop,
continuously retrying the freeze operation. But nothing ever clears
the -EIO except unmount after withdraw, and unmount is impossible
while the freeze operation neither completes nor fails. Instead you get:
[ 6499.767994] gfs2: fsid=dm-32.0: error freezing FS: -5
[ 6499.773058] gfs2: fsid=dm-32.0: retrying...
[ 6500.791957] gfs2: fsid=dm-32.0: error freezing FS: -5
[ 6500.797015] gfs2: fsid=dm-32.0: retrying...
This patch adds a check for -EIO in gfs2_freeze. When -EIO is seen,
gfs2_freeze dequeues the freeze glock, exits the retry loop and
returns the error.

Also, there's no need to pass the freeze holder to function
gfs2_lock_fs_check_clean: it has a single caller, and the holder is
always sd_freeze_gh, reachable through the superblock pointer it
already receives. So this patch drops that parameter.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
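Note (not part of the commit message): the control flow this patch is
aiming for can be tried out in user space with a minimal sketch like
the one below. It is only an illustration under stated assumptions.
fake_lock_fs, freeze_loop and max_tries are hypothetical names that do
not exist in fs/gfs2; the kernel loop has no retry cap and sleeps one
second between attempts.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for gfs2_lock_fs_check_clean(): returns the next
 * scripted result on each call, repeating the last one when the script
 * runs out.
 */
static int fake_lock_fs(const int *results, int n, int *calls)
{
	int r;

	assert(n > 0);
	r = results[*calls < n ? *calls : n - 1];
	(*calls)++;
	return r;
}

/*
 * Retry loop shaped like the one in gfs2_freeze after this patch:
 * 0 ends the loop with success, -EIO ends it with a fatal error, and
 * anything else (-EBUSY included) is retried. max_tries exists only so
 * that this user-space sketch always terminates.
 */
static int freeze_loop(const int *results, int n, int max_tries, int *calls)
{
	int error = 0;

	*calls = 0;
	while (*calls < max_tries) {
		error = fake_lock_fs(results, n, calls);
		if (!error)
			break;		/* frozen successfully */
		if (error == -EIO)
			break;		/* fatal: retrying cannot clear it */
		/* the kernel code logs "retrying..." and sleeps here */
	}
	return error;
}
```

Called with a script of { -EBUSY, -EBUSY, 0 }, freeze_loop returns 0
after three attempts; with { -EIO } it returns -EIO after one attempt
instead of retrying.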
 fs/gfs2/super.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index a696663bf5e5..c7183d550442 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -403,8 +403,7 @@ struct lfcc {
  * Returns: errno
  */
 
-static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp,
-				    struct gfs2_holder *freeze_gh)
+static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp)
 {
 	struct gfs2_inode *ip;
 	struct gfs2_jdesc *jd;
@@ -429,7 +428,7 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp,
 	}
 
 	error = gfs2_glock_nq_init(sdp->sd_freeze_gl, LM_ST_EXCLUSIVE,
-				   GL_NOCACHE, freeze_gh);
+				   GL_NOCACHE, &sdp->sd_freeze_gh);
 
 	list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
 		error = gfs2_jdesc_check(jd);
@@ -445,7 +444,7 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp,
 	}
 
 	if (error)
-		gfs2_glock_dq_uninit(freeze_gh);
+		gfs2_glock_dq_uninit(&sdp->sd_freeze_gh);
 
 out:
 	while (!list_empty(&list)) {
@@ -798,15 +797,20 @@ static int gfs2_freeze(struct super_block *sb)
 			goto out;
 		}
 
-		error = gfs2_lock_fs_check_clean(sdp, &sdp->sd_freeze_gh);
+		error = gfs2_lock_fs_check_clean(sdp);
 		if (!error)
 			break;
 
 		if (error == -EBUSY)
 			fs_err(sdp, "waiting for recovery before freeze\n");
-		else
+		else if (error == -EIO) {
+			fs_err(sdp, "Fatal IO error: cannot freeze gfs2 due "
+			       "to recovery error.\n");
+			gfs2_glock_dq_uninit(&sdp->sd_freeze_gh);
+			goto out;
+		} else {
 			fs_err(sdp, "error freezing FS: %d\n", error);
-
+		}
 		fs_err(sdp, "retrying...\n");
 		msleep(1000);
 	}
--
2.23.0