From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH v6 11/26] gfs2: log error reform
Date: Wed, 4 Sep 2019 12:59:14 -0400 (EDT)
Message-ID: <1215047977.12665709.1567616354680.JavaMail.zimbra@redhat.com>
In-Reply-To: <CAHc6FU49bXX7GoNeO3eNex+wVjEecYfQjySCy1EbTL2ek+tUzg@mail.gmail.com>

Hi Andreas,

----- Original Message -----
> > +                */
> > +               if (sdp->sd_log_error) {
> > +                       gfs2_io_error_bh(sdp, bh);
> 
> some of the error handling here is still sketchy: the only place where
> sd_log_error is set without withdrawing the filesystem is
> quotad_error. If the filesystem has already been marked
> SDF_WITHDRAWING or SDF_WITHDRAWN, gfs2_io_error_bh will be a no-op. It
> seems that we want to set SDF_WITHDRAWING here for the quotad_error
> case instead of calling gfs2_io_error_bh?
> 
> > +               } else if (buffer_busy(bh)) {
> >                         continue;
> > -               if (!buffer_uptodate(bh) &&
> > -                   !test_and_set_bit(SDF_AIL1_IO_ERROR, &sdp->sd_flags)) {
> > +               } else if (!buffer_uptodate(bh) &&
> > +                          !cmpxchg(&sdp->sd_log_error, 0, -EIO)) {
> >                         gfs2_io_error_bh(sdp, bh);
> >                         set_bit(SDF_WITHDRAWING, &sdp->sd_flags);
> >                 }

The main idea was to move busy buffers to tr_ail2_list after
an error has been flagged (before the test for buffer_busy()).
Would something like this be more acceptable?

@@ -200,10 +199,19 @@ static void gfs2_ail1_empty_one(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 					 bd_ail_st_list) {
 		bh = bd->bd_bh;
 		gfs2_assert(sdp, bd->bd_tr == tr);
-		if (buffer_busy(bh))
+		/*
+		 * If another process flagged an io error, e.g. writing to the
+		 * journal, error all other bhs and move them off the ail1 to
+		 * prevent a tight loop when unmount tries to flush ail1,
+		 * regardless of whether they're still busy. If no outside
+		 * errors were found and the buffer is busy, move to the next.
+		 * If the ail buffer is not busy and caught an error, flag it
+		 * for others.
+		 */
+		if (!sdp->sd_log_error && buffer_busy(bh))
 			continue;
 		if (!buffer_uptodate(bh) &&
-		    !test_and_set_bit(SDF_AIL1_IO_ERROR, &sdp->sd_flags)) {
+		    !cmpxchg(&sdp->sd_log_error, 0, -EIO)) {
 			gfs2_io_error_bh(sdp, bh);
 			set_bit(SDF_WITHDRAWING, &sdp->sd_flags);
 		}
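
To make the intended semantics easier to review, here is a minimal
userspace sketch of the two checks in the hunk above. It is only a
model, not the kernel code: log_error, latch_first_error() and
skip_buffer() are hypothetical names, and C11
atomic_compare_exchange_strong() stands in for the kernel's cmpxchg().
The assumption is that sd_log_error starts at 0 and is only ever
changed by a 0 -> error transition.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for sdp->sd_log_error; 0 means "no error yet". */
static _Atomic int log_error;

/*
 * Record err only if no error has been recorded so far.
 * Returns true for the one caller that performed the 0 -> err
 * transition, which models !cmpxchg(&sdp->sd_log_error, 0, -EIO).
 */
static bool latch_first_error(int err)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&log_error, &expected, err);
}

/*
 * Models the new skip test: a busy buffer is skipped only while no
 * error has been flagged. Once an error is latched, busy buffers are
 * no longer skipped and fall through to the rest of the loop body.
 */
static bool skip_buffer(bool busy)
{
	return !atomic_load(&log_error) && busy;
}
```

With log_error still 0, skip_buffer(true) is true, so the loop would
"continue" as before. After the first latch_first_error(-EIO) succeeds,
a later latch_first_error(-EROFS) returns false and leaves -EIO in
place, and skip_buffer(true) becomes false.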



Regards,

Bob Peterson



