From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 05/12] xfs: ratelimit unmount time per-buffer I/O error warning
Date: Tue, 21 Apr 2020 08:13:37 -0400
Message-ID: <20200421121337.GA31715@bfoster>
In-Reply-To: <20200420222332.GP9800@dread.disaster.area>

On Tue, Apr 21, 2020 at 08:23:32AM +1000, Dave Chinner wrote:
> On Mon, Apr 20, 2020 at 10:02:05AM -0400, Brian Foster wrote:
> > On Mon, Apr 20, 2020 at 01:19:59PM +1000, Dave Chinner wrote:
> > > On Fri, Apr 17, 2020 at 11:08:52AM -0400, Brian Foster wrote:
> > > > At unmount time, XFS emits a warning for every in-core buffer that
> > > > might have undergone a write error. In practice this behavior is
> > > > probably reasonable given that the filesystem is likely short lived
> > > > once I/O errors begin to occur consistently. Under certain test or
> > > > otherwise expected error conditions, this can spam the logs and slow
> > > > down the unmount. Ratelimit the warning to prevent this problem
> > > > while still informing the user that errors have occurred.
> > > >
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > ---
> > > > fs/xfs/xfs_buf.c | 7 +++----
> > > > 1 file changed, 3 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > > > index 93942d8e35dd..5120fed06075 100644
> > > > --- a/fs/xfs/xfs_buf.c
> > > > +++ b/fs/xfs/xfs_buf.c
> > > > @@ -1685,11 +1685,10 @@ xfs_wait_buftarg(
> > > > bp = list_first_entry(&dispose, struct xfs_buf, b_lru);
> > > > list_del_init(&bp->b_lru);
> > > > if (bp->b_flags & XBF_WRITE_FAIL) {
> > > > - xfs_alert(btp->bt_mount,
> > > > -"Corruption Alert: Buffer at daddr 0x%llx had permanent write failures!",
> > > > + xfs_alert_ratelimited(btp->bt_mount,
> > > > +"Corruption Alert: Buffer at daddr 0x%llx had permanent write failures!\n"
> > > > +"Please run xfs_repair to determine the extent of the problem.",
> > > > (long long)bp->b_bn);
> > >
> > > Hmmmm. I was under the impression that multiple line log messages
> > > were frowned upon because they prevent every output line in the log
> > > being tagged correctly. That's where KERN_CONT came from (i.e. it's
> > > a continuation of a previous log message), but we don't use that
> > > with the XFS logging and hence multi-line log messages are split
> > > into multiple logging calls.
> > >
> >
> > I debated combining these into a single line for that exact reason for
> > about a second and then just went with this because I didn't think it
> > mattered that much.
>
> It doesn't matter to us, but it does matter to those people who want
> their log entries correctly tagged for their classification
> engines...
>

Makes sense, though I am a bit curious whether it would be categorized
correctly even when fixed up, or whether something like a single long
line would be preferred over two. *shrug*

> > > IOWs, this might be better handled just using a static ratelimit
> > > variable here....
> > >
> > > Actually, we already have one for xfs_buf_item_push() to limit
> > > warnings about retrying XBF_WRITE_FAIL buffers:
> > >
> > > static DEFINE_RATELIMIT_STATE(xfs_buf_write_fail_rl_state, 30 * HZ, 10);
> > >
> > > Perhaps we should be using the same ratelimit variable here....
> > >
> >
> > IIRC that was static in another file, but we can centralize (and perhaps
> > generalize..) it somewhere if that is preferred..
>
> I think it makes sense to have all the buffer write fail
> messages ratelimited under the same variable - once it starts
> spewing messages, we should limit them all the same way...
>

Yeah. I actually ended up sticking the ratelimit in the buftarg as it
comes off a bit cleaner than a global and I don't think there's much of
a practical difference in having a per-target limit.
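
For illustration only, here is a minimal userspace sketch of the idea discussed above: a ratelimit state embedded in each buffer target rather than a file-scope global, with one ratelimit decision gating two separate single-line messages. This is not the code from the series. The names `rl_state`, `buftarg`, `ioerror_rl`, `rl_init`, `rl_allow` and `buftarg_warn_write_fail` are hypothetical, time is passed in as an abstract tick count so the behavior is deterministic, and the window logic only approximates what the kernel's ratelimit helpers do.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a ratelimit state. */
struct rl_state {
	uint64_t interval;	/* window length, in ticks */
	unsigned int burst;	/* messages allowed per window */
	uint64_t begin;		/* tick at which the current window opened */
	unsigned int printed;	/* messages emitted in the current window */
	unsigned int missed;	/* messages suppressed in the current window */
};

/* Hypothetical buffer target that owns its own limiter. */
struct buftarg {
	const char *name;
	struct rl_state ioerror_rl;
};

static void rl_init(struct rl_state *rl, uint64_t interval, unsigned int burst)
{
	rl->interval = interval;
	rl->burst = burst;
	rl->begin = 0;
	rl->printed = 0;
	rl->missed = 0;
}

/* Return true if a message may be emitted at tick 'now'. */
static bool rl_allow(struct rl_state *rl, uint64_t now)
{
	/* Open a fresh window once the current one has expired. */
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;
	return false;
}

/*
 * Make one ratelimit decision, then emit the warning as two separate
 * single-line messages so that each output line carries its own prefix.
 * Returns true if the warning was printed.
 */
static bool buftarg_warn_write_fail(struct buftarg *btp, uint64_t now,
				    unsigned long long daddr)
{
	if (!rl_allow(&btp->ioerror_rl, now))
		return false;
	fprintf(stderr,
		"%s: Corruption Alert: Buffer at daddr 0x%llx had permanent write failures!\n",
		btp->name, daddr);
	fprintf(stderr,
		"%s: Please run xfs_repair to determine the extent of the problem.\n",
		btp->name);
	return true;
}
```

With a burst of 2 and an interval of 30 ticks, a third warning on the same target inside the window is suppressed, a second target is unaffected because it carries its own state, and the first target prints again once its window expires. The only behavioral difference from a shared global in this sketch is that one noisy target cannot consume another target's budget.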
Brian

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>