From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: shutdown on failure to add page to log bio
Date: Tue, 24 Mar 2020 13:34:59 -0700
Message-ID: <20200324203459.GG29339@magnolia>
In-Reply-To: <20200324172949.GB3148@bfoster>
On Tue, Mar 24, 2020 at 01:29:49PM -0400, Brian Foster wrote:
> On Tue, Mar 24, 2020 at 10:18:59AM -0700, Darrick J. Wong wrote:
> > On Tue, Mar 24, 2020 at 12:57:00PM -0400, Brian Foster wrote:
> > > If the bio_add_page() call fails, we proceed to write out a
> > > partially constructed log buffer. This corrupts the physical log
> > > such that log recovery is not possible. Worse, persistent
> > > occurrences of this error eventually lead to a BUG_ON() failure in
> > > bio_split() as iclogs wrap the end of the physical log, which
> > > triggers log recovery on subsequent mount.
> > >
> > > Rather than warn about writing out a corrupted log buffer, shutdown
> > > the fs as is done for any log I/O related error. This preserves the
> > > consistency of the physical log such that log recovery succeeds on a
> > > subsequent mount. Note that this was observed on a 64k page debug
> > > kernel without upstream commit 59bb47985c1d ("mm, sl[aou]b:
> > > guarantee natural alignment for kmalloc(power-of-two)"), which
> > > demonstrated frequent iclog bio overflows due to unaligned (slab
> > > allocated) iclog data buffers.
> >
> > Fixes: tag?
> >
>
> I suppose you could argue it fixes commit 79b54d9bfcdcd ("xfs: use bios
> directly to write log buffers"), but I didn't include a tag because this
> is not really fixing a reproducible bug. It's fixing up the error
> handling based on a bad combination of patches in a distro kernel.
> Perhaps I'm just not clear on when we do or don't want a fixes tag..?

[Summarizing what I rambled about on IRC:]

From my perspective, this looks like you concluded that the WARN_ON_ONCE
wasn't sufficient to deal with the error (because the physical log got
corrupted), so you're adding branch code to shut down the log.

Granted, it should only happen if bio_add_page fails, but as that's not
part of xfs, we have to code defensively enough to avoid breaking the
filesystem.

Looks ok, will add fixes tag and send it to the testcloud...

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> Brian
>
> > Otherwise, looks ok to me.
> >
> > --D
> >
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > > fs/xfs/xfs_log.c | 14 ++++++++++----
> > > 1 file changed, 10 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index 2a90a483c2d6..ebb6a5c95332 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -1705,16 +1705,22 @@ xlog_bio_end_io(
> > >
> > >  static void
> > >  xlog_map_iclog_data(
> > > -	struct bio		*bio,
> > > -	void			*data,
> > > +	struct xlog_in_core	*iclog,
> > >  	size_t			count)
> > >  {
> > > +	struct xfs_mount	*mp = iclog->ic_log->l_mp;
> > > +	struct bio		*bio = &iclog->ic_bio;
> > > +	void			*data = iclog->ic_data;
> > > +
> > >  	do {
> > >  		struct page	*page = kmem_to_page(data);
> > >  		unsigned int	off = offset_in_page(data);
> > >  		size_t		len = min_t(size_t, count, PAGE_SIZE - off);
> > >
> > > -		WARN_ON_ONCE(bio_add_page(bio, page, len, off) != len);
> > > +		if (bio_add_page(bio, page, len, off) != len) {
> > > +			xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR);
> > > +			break;
> > > +		}
> > >
> > >  		data += len;
> > >  		count -= len;
> > > @@ -1762,7 +1768,7 @@ xlog_write_iclog(
> > >  	if (need_flush)
> > >  		iclog->ic_bio.bi_opf |= REQ_PREFLUSH;
> > >
> > > -	xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count);
> > > +	xlog_map_iclog_data(iclog, count);
> > >  	if (is_vmalloc_addr(iclog->ic_data))
> > >  		flush_kernel_vmap_range(iclog->ic_data, count);
> > >
> > > --
> > > 2.21.1
> > >
> >
>
Thread overview: 9+ messages
2020-03-24 16:57 [PATCH] xfs: shutdown on failure to add page to log bio Brian Foster
2020-03-24 17:18 ` Darrick J. Wong
2020-03-24 17:29 ` Brian Foster
2020-03-24 20:34 ` Darrick J. Wong [this message]
2020-03-24 23:24 ` Dave Chinner
2020-03-25 11:24 ` Brian Foster
2020-03-25 7:12 ` Christoph Hellwig
2020-03-25 11:25 ` Brian Foster
2020-03-25 11:41 ` Christoph Hellwig