From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 05/12] xfs: ratelimit unmount time per-buffer I/O error warning
Date: Tue, 21 Apr 2020 08:13:37 -0400
Message-ID: <20200421121337.GA31715@bfoster>
In-Reply-To: <20200420222332.GP9800@dread.disaster.area>

On Tue, Apr 21, 2020 at 08:23:32AM +1000, Dave Chinner wrote:
> On Mon, Apr 20, 2020 at 10:02:05AM -0400, Brian Foster wrote:
> > On Mon, Apr 20, 2020 at 01:19:59PM +1000, Dave Chinner wrote:
> > > On Fri, Apr 17, 2020 at 11:08:52AM -0400, Brian Foster wrote:
> > > > At unmount time, XFS emits a warning for every in-core buffer that
> > > > might have undergone a write error. In practice this behavior is
> > > > probably reasonable given that the filesystem is likely short lived
> > > > once I/O errors begin to occur consistently. Under certain test or
> > > > otherwise expected error conditions, this can spam the logs and slow
> > > > down the unmount. Ratelimit the warning to prevent this problem
> > > > while still informing the user that errors have occurred.
> > > > 
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > ---
> > > >  fs/xfs/xfs_buf.c | 7 +++----
> > > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > > > index 93942d8e35dd..5120fed06075 100644
> > > > --- a/fs/xfs/xfs_buf.c
> > > > +++ b/fs/xfs/xfs_buf.c
> > > > @@ -1685,11 +1685,10 @@ xfs_wait_buftarg(
> > > >  			bp = list_first_entry(&dispose, struct xfs_buf, b_lru);
> > > >  			list_del_init(&bp->b_lru);
> > > >  			if (bp->b_flags & XBF_WRITE_FAIL) {
> > > > -				xfs_alert(btp->bt_mount,
> > > > -"Corruption Alert: Buffer at daddr 0x%llx had permanent write failures!",
> > > > +				xfs_alert_ratelimited(btp->bt_mount,
> > > > +"Corruption Alert: Buffer at daddr 0x%llx had permanent write failures!\n"
> > > > +"Please run xfs_repair to determine the extent of the problem.",
> > > >  					(long long)bp->b_bn);
> > > 
> > > Hmmmm. I was under the impression that multiple line log messages
> > > were frowned upon because they prevent every output line in the log
> > > being tagged correctly. That's where KERN_CONT came from (i.e. it's
> > > a continuation of a previous log message), but we don't use that
> > > with the XFS logging and hence multi-line log messages are split
> > > into multiple logging calls.
> > > 
> > 
> > I debated combining these into a single line for that exact reason for
> > about a second and then just went with this because I didn't think it
> > mattered that much.
> 
> It doesn't matter to us, but it does matter to those people who want
> their log entries correctly tagged for their classification
> engines...
> 

Makes sense, though I am a bit curious whether it would be categorized
correctly even when fixed up, or whether something like a single long
line would be preferred over two. *shrug*
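
For illustration only, here is a minimal user-space sketch of the tagging
concern above: a message with an embedded newline is split so that every
output line carries its own prefix, much as separate logging calls would.
The helper name emit_tagged() and the "XFS (sda1): " prefix are hypothetical
and are not part of the XFS logging code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): copy msg into out, one line at a
 * time, prepending prefix to each line so that no line is left untagged.
 * Returns the number of complete lines written.
 */
static int emit_tagged(char *out, size_t len, const char *prefix,
		       const char *msg)
{
	size_t used = 0;
	int lines = 0;

	assert(out != NULL && len > 0);
	out[0] = '\0';

	while (*msg) {
		size_t n = strcspn(msg, "\n");	/* length of this line */
		int w = snprintf(out + used, len - used, "%s%.*s\n",
				 prefix, (int)n, msg);

		if (w < 0 || (size_t)w >= len - used) {
			out[used] = '\0';	/* drop the partial line */
			break;
		}
		used += (size_t)w;
		lines++;

		msg += n;
		if (*msg == '\n')
			msg++;			/* skip the separator */
	}
	return lines;
}
```

Calling emit_tagged() with a two-line message yields two prefixed lines,
which is the property a log classifier would presumably rely on.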

> > > IOWs, this might be better handled just using a static ratelimit
> > > variable here....
> > > 
> > > Actually, we already have one for xfs_buf_item_push() to limit
> > > warnings about retrying XBF_WRITE_FAIL buffers:
> > > 
> > > static DEFINE_RATELIMIT_STATE(xfs_buf_write_fail_rl_state, 30 * HZ, 10);
> > > 
> > > Perhaps we should be using the same ratelimit variable here....
> > > 
> > 
> > IIRC that was static in another file, but we can centralize (and perhaps
> > generalize..) it somewhere if that is preferred..
> 
> I think it makes sense to have all the buffer write fail
> messages ratelimited under the same variable - once it starts
> spewing messages, we should limit them all the same way...
> 

Yeah. I actually ended up sticking the ratelimit in the buftarg as it
comes off a bit cleaner than a global and I don't think there's much of
a practical difference in having a per-target limit.
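
To make the per-target idea concrete, here is a rough user-space sketch of
an interval/burst limiter embedded in a per-target structure. It is only
loosely modeled on the semantics of DEFINE_RATELIMIT_STATE(name, interval,
burst) and is not the kernel implementation or the actual patch; struct
rl_state, rl_init(), rl_allow() and struct buftarg_sketch are hypothetical
names, and time is passed in explicitly instead of being read from jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a ratelimit state; not the kernel structure. */
struct rl_state {
	long interval;	/* window length, in caller-defined ticks */
	int  burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int  printed;	/* messages emitted in the current window */
	int  missed;	/* messages suppressed so far */
	bool started;	/* false until the first call opens a window */
};

static void rl_init(struct rl_state *rs, long interval, int burst)
{
	assert(interval > 0 && burst > 0);
	rs->interval = interval;
	rs->burst = burst;
	rs->begin = 0;
	rs->printed = 0;
	rs->missed = 0;
	rs->started = false;
}

/* Returns true if a message may be emitted at time 'now'. */
static bool rl_allow(struct rl_state *rs, long now)
{
	/* Open a fresh window on first use or once the old one expires. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

/* Each target carries its own limiter, so targets do not share a budget. */
struct buftarg_sketch {
	const char *name;
	struct rl_state write_fail_rl;
};
```

With one limiter per target, a flood of failures on one device cannot use
up the message budget of another, whereas a single global state would share
one budget across all of them.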

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 

