From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH v2 05/13] xfs: ratelimit unmount time per-buffer I/O error message
Date: Thu, 23 Apr 2020 10:29:58 -0400 [thread overview]
Message-ID: <20200423142958.GB43557@bfoster> (raw)
In-Reply-To: <20200423044604.GI27860@dread.disaster.area>
On Thu, Apr 23, 2020 at 02:46:04PM +1000, Dave Chinner wrote:
> On Wed, Apr 22, 2020 at 01:54:21PM -0400, Brian Foster wrote:
> > At unmount time, XFS emits a warning for every in-core buffer that
> > might have undergone a write error. In practice this behavior is
> > probably reasonable given that the filesystem is likely short lived
> > once I/O errors begin to occur consistently. Under certain test or
> > otherwise expected error conditions, this can spam the logs and slow
> > down the unmount.
> >
> > We already have a ratelimit state defined for buffers failing
> > writeback. Fold this state into the buftarg and reuse it for the
> > unmount time errors.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
>
> Looks fine, but I suspect we both missed something here:
> xfs_buf_ioerror_alert() was made a ratelimited printk in the last
> cycle:
>
> void
> xfs_buf_ioerror_alert(
> struct xfs_buf *bp,
> xfs_failaddr_t func)
> {
> xfs_alert_ratelimited(bp->b_mount,
> "metadata I/O error in \"%pS\" at daddr 0x%llx len %d error %d",
> func, (uint64_t)XFS_BUF_ADDR(bp), bp->b_length,
> -bp->b_error);
> }
>
Yeah, I hadn't noticed that.
> Hence I think all these buffer error alerts can be brought under the
> same rate limiting variable. Something like this in xfs_message.c:
>
One thing to note is that xfs_alert_ratelimited() ultimately uses
the DEFAULT_RATELIMIT_INTERVAL of 5s. The ratelimit we're generalizing
here uses 30s (both use a burst of 10). That seems reasonable enough to
me for I/O errors so I'm good with the changes below.
FWIW, that also means we could just call xfs_buf_alert_ratelimited()
from xfs_buf_item_push() if we're also Ok with using an "alert" instead
of a "warn." I'm not immediately aware of a reason to use one over the
other (xfs_wait_buftarg() already uses alert) so I'll try that unless I
hear an objection. The xfs_wait_buftarg() ratelimit presumably remains
open coded because it's two separate calls and we probably don't want
them to individually count against the limit.
Brian
> void
> xfs_buf_alert_ratelimited(
> struct xfs_buf *bp,
> const char *rlmsg,
> const char *fmt,
> ...)
> {
> struct va_format vaf;
> va_list args;
>
> if (!___ratelimit(&bp->b_target->bt_ioerror_rl, rlmsg)
> return;
>
> va_start(args, fmt);
> vaf.fmt = fmt;
> vaf.args = &args;
> __xfs_printk(KERN_ALERT, bp->b_mount, &vaf);
> va_end(args);
> }
>
> and:
>
> void
> xfs_buf_ioerror_alert(
> struct xfs_buf *bp,
> xfs_failaddr_t func)
> {
> xfs_buf_alert_ratelimited(bp, "XFS: metadata IO error",
> "metadata I/O error in \"%pS\" at daddr 0x%llx len %d error %d",
> func, (uint64_t)XFS_BUF_ADDR(bp), bp->b_length, -bp->b_error);
> }
>
>
> > ---
> > fs/xfs/xfs_buf.c | 13 +++++++++++--
> > fs/xfs/xfs_buf.h | 1 +
> > fs/xfs/xfs_buf_item.c | 10 +---------
> > 3 files changed, 13 insertions(+), 11 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index 7a6bc617f0a9..c28a93d2fd8c 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -1684,10 +1684,12 @@ xfs_wait_buftarg(
> > struct xfs_buf *bp;
> > bp = list_first_entry(&dispose, struct xfs_buf, b_lru);
> > list_del_init(&bp->b_lru);
> > - if (bp->b_flags & XBF_WRITE_FAIL) {
> > + if (bp->b_flags & XBF_WRITE_FAIL &&
> > + ___ratelimit(&bp->b_target->bt_ioerror_rl,
> > + "XFS: Corruption Alert")) {
> > xfs_alert(btp->bt_mount,
> > "Corruption Alert: Buffer at daddr 0x%llx had permanent write failures!",
> > - (long long)bp->b_bn);
> > + (long long)bp->b_bn);
> > xfs_alert(btp->bt_mount,
> > "Please run xfs_repair to determine the extent of the problem.");
> > }
>
> I think if we are tossing away metadata here, we should probably
> shut down the filesystem once the loop has completed. That way we
> get all the normal warnings about running xfs_repair and don't have
> to open code it here...
>
> > -
> > STATIC uint
> > xfs_buf_item_push(
> > struct xfs_log_item *lip,
> > @@ -518,7 +510,7 @@ xfs_buf_item_push(
> >
> > /* has a previous flush failed due to IO errors? */
> > if ((bp->b_flags & XBF_WRITE_FAIL) &&
> > - ___ratelimit(&xfs_buf_write_fail_rl_state, "XFS: Failing async write")) {
> > + ___ratelimit(&bp->b_target->bt_ioerror_rl, "XFS: Failing async write")) {
> > xfs_warn(bp->b_mount,
> > "Failing async write on buffer block 0x%llx. Retrying async write.",
> > (long long)bp->b_bn);
>
> This gets simplified to:
>
> if (bp->b_flags & XBF_WRITE_FAIL) {
> xfs_buf_alert_ratelimited(bp, "XFS: Failing async write",
> "Failing async write on buffer block 0x%llx. Retrying async write.",
> (long long)bp->b_bn);
> }
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
next prev parent reply other threads:[~2020-04-23 14:30 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-22 17:54 [PATCH v2 00/13] xfs: flush related error handling cleanups Brian Foster
2020-04-22 17:54 ` [PATCH v2 01/13] xfs: refactor failed buffer resubmission into xfsaild Brian Foster
2020-04-23 4:09 ` Dave Chinner
2020-04-25 17:21 ` Christoph Hellwig
2020-04-22 17:54 ` [PATCH v2 02/13] xfs: factor out buffer I/O failure simulation code Brian Foster
2020-04-23 4:10 ` Dave Chinner
2020-04-25 17:23 ` Christoph Hellwig
2020-04-27 11:11 ` Brian Foster
2020-04-22 17:54 ` [PATCH v2 03/13] xfs: fallthru to buffer attach on error and simplify error handling Brian Foster
2020-04-23 4:18 ` Dave Chinner
2020-04-23 14:28 ` Brian Foster
2020-04-25 17:26 ` Christoph Hellwig
2020-04-22 17:54 ` [PATCH v2 04/13] xfs: remove unnecessary shutdown check from xfs_iflush() Brian Foster
2020-04-25 17:27 ` Christoph Hellwig
2020-04-22 17:54 ` [PATCH v2 05/13] xfs: ratelimit unmount time per-buffer I/O error message Brian Foster
2020-04-23 4:46 ` Dave Chinner
2020-04-23 14:29 ` Brian Foster [this message]
2020-04-23 21:14 ` Dave Chinner
2020-04-24 11:12 ` Brian Foster
2020-04-24 22:08 ` Dave Chinner
2020-04-27 11:11 ` Brian Foster
2020-04-22 17:54 ` [PATCH v2 06/13] xfs: fix duplicate verification from xfs_qm_dqflush() Brian Foster
2020-04-23 4:47 ` Dave Chinner
2020-04-25 17:28 ` Christoph Hellwig
2020-04-22 17:54 ` [PATCH v2 07/13] xfs: abort consistently on dquot flush failure Brian Foster
2020-04-25 17:30 ` Christoph Hellwig
2020-04-22 17:54 ` [PATCH v2 08/13] xfs: elide the AIL lock on log item failure tracking Brian Foster
2020-04-23 5:59 ` Dave Chinner
2020-04-23 14:36 ` Brian Foster
2020-04-23 21:38 ` Dave Chinner
2020-04-24 11:14 ` Brian Foster
2020-04-22 17:54 ` [PATCH v2 09/13] xfs: clean up AIL log item removal functions Brian Foster
2020-04-23 4:54 ` Dave Chinner
2020-04-25 17:37 ` Christoph Hellwig
2020-04-27 11:12 ` Brian Foster
2020-04-22 17:54 ` [PATCH v2 10/13] xfs: combine xfs_trans_ail_[remove|delete]() Brian Foster
2020-04-23 4:55 ` Dave Chinner
2020-04-22 17:54 ` [PATCH v2 11/13] xfs: remove unused iflush stale parameter Brian Foster
2020-04-25 17:37 ` Christoph Hellwig
2020-04-22 17:54 ` [PATCH v2 12/13] xfs: random buffer write failure errortag Brian Foster
2020-04-23 5:11 ` Dave Chinner
2020-04-25 17:38 ` Christoph Hellwig
2020-04-27 11:12 ` Brian Foster
2020-04-22 17:54 ` [PATCH v2 13/13] xfs: remove unused shutdown types Brian Foster
2020-04-23 5:13 ` Dave Chinner
2020-04-25 17:39 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200423142958.GB43557@bfoster \
--to=bfoster@redhat.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.