cluster-devel.redhat.com archive mirror
From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH v3 06/19] gfs2: log error reform
Date: Tue, 30 Apr 2019 17:03:06 -0600
Message-ID: <20190430230319.10375-7-rpeterso@redhat.com>
In-Reply-To: <20190430230319.10375-1-rpeterso@redhat.com>

Before this patch, gfs2 kept track of journal I/O errors in two
places: sd_log_error and the SDF_AIL1_IO_ERROR flag in sd_flags.
This patch consolidates the two into sd_log_error so that it
reflects the first error encountered while writing to the journal.
Future patches will take advantage of this by checking this one
value, rather than both, when reacting to I/O errors.
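
As an illustration of the "first error wins" behaviour that the
cmpxchg() calls in this patch rely on, here is a minimal user-space
sketch. It is not kernel code: struct log_state, record_first_error()
and first_error_demo() are hypothetical names, and C11
atomic_compare_exchange_strong() stands in for the kernel's cmpxchg().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the sd_log_error field. */
struct log_state {
	atomic_int log_error;	/* 0 until the first error is recorded */
};

/*
 * Store err only if no error has been recorded yet.
 * Returns 1 if this call recorded the first error, 0 otherwise.
 */
static int record_first_error(struct log_state *st, int err)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&st->log_error, &expected, err);
}

/* Record two errors in a row and return the value that was kept. */
static int first_error_demo(void)
{
	struct log_state st;

	atomic_init(&st.log_error, 0);
	assert(record_first_error(&st, -EIO) == 1);	/* first error is stored */
	assert(record_first_error(&st, -ENOSPC) == 0);	/* later error is ignored */
	return atomic_load(&st.log_error);
}
```

Calling first_error_demo() returns -EIO: the second error does not
overwrite the first, which is why a single field can serve as both
the "an error happened" flag and the record of which error came first.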

In addition, this fixes a tight loop in unmount: if buffers were
on the ail1 list when an I/O error occurred elsewhere, the ail1
list would never be cleared because those buffers stayed busy.
Unmount would then hang, waiting for the ail1 list to empty.
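
The effect of checking the recorded log error before the busy test
can be sketched with a toy user-space model. Everything here is
hypothetical (struct toy_buf, toy_empty_pass(), toy_remaining());
it mirrors only the order of the checks in gfs2_ail1_empty_one()
and assumes the busy buffers never complete their I/O.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer sitting on the ail1 list. */
struct toy_buf {
	bool busy;	/* in this model, busy buffers never finish */
	bool on_list;	/* still on the (toy) ail1 list */
	bool errored;	/* marked as failed */
};

/*
 * One pass over the list. If a log error is already recorded, mark the
 * buffer as errored and drop it even when busy; otherwise skip busy
 * buffers and drop idle ones. Returns how many buffers remain listed.
 */
static size_t toy_empty_pass(struct toy_buf *bufs, size_t n, int log_error)
{
	size_t remaining = 0;

	for (size_t i = 0; i < n; i++) {
		if (!bufs[i].on_list)
			continue;
		if (log_error) {
			bufs[i].errored = true;
		} else if (bufs[i].busy) {
			remaining++;
			continue;
		}
		bufs[i].on_list = false;
	}
	return remaining;
}

/* Run one pass over two busy buffers and one idle buffer. */
static size_t toy_remaining(int log_error)
{
	struct toy_buf bufs[3] = {
		{ .busy = true,  .on_list = true },
		{ .busy = false, .on_list = true },
		{ .busy = true,  .on_list = true },
	};

	return toy_empty_pass(bufs, 3, log_error);
}
```

With no recorded error, toy_remaining(0) leaves the two busy buffers
on the list, so a caller that loops until the list is empty never
finishes. With toy_remaining(-EIO), the same pass empties the list.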

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/incore.h |  5 ++---
 fs/gfs2/log.c    | 20 +++++++++++++++-----
 fs/gfs2/quota.c  |  2 +-
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 003d9da937b4..e16ab4c98072 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -619,8 +619,7 @@ enum {
 	SDF_RORECOVERY		= 7, /* read only recovery */
 	SDF_SKIP_DLM_UNLOCK	= 8,
 	SDF_FORCE_AIL_FLUSH     = 9,
-	SDF_AIL1_IO_ERROR	= 10,
-	SDF_WITHDRAW		= 11, /* Will withdraw eventually */
+	SDF_WITHDRAW		= 10, /* Will withdraw eventually */
 };
 
 enum gfs2_freeze_state {
@@ -829,7 +828,7 @@ struct gfs2_sbd {
 	atomic_t sd_log_in_flight;
 	struct bio *sd_log_bio;
 	wait_queue_head_t sd_log_flush_wait;
-	int sd_log_error;
+	int sd_log_error; /* First log error */
 
 	atomic_t sd_reserving_log;
 	wait_queue_head_t sd_reserving_log_wait;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 944aaf3d1816..33ef2cb570e2 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -108,8 +108,7 @@ __acquires(&sdp->sd_ail_lock)
 
 		if (!buffer_busy(bh)) {
 			if (!buffer_uptodate(bh) &&
-			    !test_and_set_bit(SDF_AIL1_IO_ERROR,
-					      &sdp->sd_flags)) {
+			    !cmpxchg(&sdp->sd_log_error, 0, -EIO)) {
 				gfs2_io_error_bh(sdp, bh);
 				set_bit(SDF_WITHDRAW, &sdp->sd_flags);
 			}
@@ -203,10 +202,21 @@ static void gfs2_ail1_empty_one(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 					 bd_ail_st_list) {
 		bh = bd->bd_bh;
 		gfs2_assert(sdp, bd->bd_tr == tr);
-		if (buffer_busy(bh))
+		/**
+		 * If another process flagged an io error, e.g. writing to the
+		 * journal, error all other bhs and move them off the ail1 to
+		 * prevent a tight loop when unmount tries to flush ail1,
+		 * regardless of whether they're still busy. If no outside
+		 * errors were found and the buffer is busy, move to the next.
+		 * If the ail buffer is not busy and caught an error, flag it
+		 * for others.
+		 */
+		if (sdp->sd_log_error) {
+			gfs2_io_error_bh(sdp, bh);
+		} else if (buffer_busy(bh)) {
 			continue;
-		if (!buffer_uptodate(bh) &&
-		    !test_and_set_bit(SDF_AIL1_IO_ERROR, &sdp->sd_flags)) {
+		} else if (!buffer_uptodate(bh) &&
+			   !cmpxchg(&sdp->sd_log_error, 0, -EIO)) {
 			gfs2_io_error_bh(sdp, bh);
 			set_bit(SDF_WITHDRAW, &sdp->sd_flags);
 		}
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index a8dfc86fd682..8871fca9102f 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1480,7 +1480,7 @@ static void quotad_error(struct gfs2_sbd *sdp, const char *msg, int error)
 		return;
 	if (!gfs2_withdrawn(sdp)) {
 		fs_err(sdp, "gfs2_quotad: %s error %d\n", msg, error);
-		sdp->sd_log_error = error;
+		cmpxchg(&sdp->sd_log_error, 0, error);
 		wake_up(&sdp->sd_logd_waitq);
 	}
 }
-- 
2.20.1



