From mboxrd@z Thu Jan 1 00:00:00 1970 From: Bob Peterson Date: Wed, 4 Sep 2019 12:59:14 -0400 (EDT) Subject: [Cluster-devel] [GFS2 PATCH v6 11/26] gfs2: log error reform In-Reply-To: References: <20190523130421.21003-1-rpeterso@redhat.com> <20190523130421.21003-12-rpeterso@redhat.com> Message-ID: <1215047977.12665709.1567616354680.JavaMail.zimbra@redhat.com> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Hi Andreas, ----- Original Message ----- > > + */ > > + if (sdp->sd_log_error) { > > + gfs2_io_error_bh(sdp, bh); > > some of the error handling here is still sketchy: the only place where > sd_log_error is set without withdrawing the filesystem is > quotad_error. If the filesystem has already been marked > SDF_WITHDRAWING or SDF_WITHDRAWN, gfs2_io_error_bh will be a no-op. It > seems that we want to set SDF_WITHDRAWING here for the quotad_error > case instead of calling gfs2_io_error_bh? > > > + } else if (buffer_busy(bh)) { > > continue; > > - if (!buffer_uptodate(bh) && > > - !test_and_set_bit(SDF_AIL1_IO_ERROR, &sdp->sd_flags)) { > > + } else if (!buffer_uptodate(bh) && > > + !cmpxchg(&sdp->sd_log_error, 0, -EIO)) { > > gfs2_io_error_bh(sdp, bh); > > set_bit(SDF_WITHDRAWING, &sdp->sd_flags); > > } The main idea was to move busy buffers to tr_ail2_list after an errors have been flagged (before the test for buffer_busy()). Would something like this be more acceptable? @@ -200,10 +199,19 @@ static void gfs2_ail1_empty_one(struct gfs2_sbd *sdp, struct gfs2_trans *tr) bd_ail_st_list) { bh = bd->bd_bh; gfs2_assert(sdp, bd->bd_tr == tr); - if (buffer_busy(bh)) + /* + * If another process flagged an io error, e.g. writing to the + * journal, error all other bhs and move them off the ail1 to + * prevent a tight loop when unmount tries to flush ail1, + * regardless of whether they're still busy. If no outside + * errors were found and the buffer is busy, move to the next. + * If the ail buffer is not busy and caught an error, flag it + * for others. + */ + if (!sdp->sd_log_error && buffer_busy(bh)) continue; if (!buffer_uptodate(bh) && - !test_and_set_bit(SDF_AIL1_IO_ERROR, &sdp->sd_flags)) { + !cmpxchg(&sdp->sd_log_error, 0, -EIO)) { gfs2_io_error_bh(sdp, bh); set_bit(SDF_WITHDRAWING, &sdp->sd_flags); } Regards, Bob Peterson