Re: [PATCH 3/3] xfs: alignment check bio buffers

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Dave Chinner <david@fromorbit.com>
To: Ming Lei <tom.leiming@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	"open list:XFS FILESYSTEM" <linux-xfs@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>,
	linux-block <linux-block@vger.kernel.org>
Subject: Re: [PATCH 3/3] xfs: alignment check bio buffers
Date: Thu, 22 Aug 2019 14:49:05 +1000	[thread overview]
Message-ID: <20190822044905.GU1119@dread.disaster.area> (raw)
In-Reply-To: <CACVXFVN93h7QrFvZNVQQwYZg_n0wGXwn=XZztMJrNbdjzzSpKQ@mail.gmail.com>

On Thu, Aug 22, 2019 at 10:50:02AM +0800, Ming Lei wrote:
> On Thu, Aug 22, 2019 at 8:06 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Wed, Aug 21, 2019 at 06:38:20PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > >
> > > Add memory buffer alignment validation checks to bios built in XFS
> > > to catch bugs that will result in silent data corruption in block
> > > drivers that cannot handle unaligned memory buffers but don't
> > > validate the incoming buffer alignment is correct.
> > >
> > > Known drivers with these issues are xenblk, brd and pmem.
> > >
> > > Despite there being nothing XFS specific to xfs_bio_add_page(), this
> > > function was created to do the required validation because the block
> > > layer developers that keep telling us that is not possible to
> > > validate buffer alignment in bio_add_page(), and even if it was
> > > possible it would be too much overhead to do at runtime.
> >
> > I really don't think we should life this to XFS, but instead fix it
> > in the block layer.  And that is not only because I have a pending
> > series lifting bits you are touching to the block layer..
> >
> > > +int
> > > +xfs_bio_add_page(
> > > +     struct bio      *bio,
> > > +     struct page     *page,
> > > +     unsigned int    len,
> > > +     unsigned int    offset)
> > > +{
> > > +     struct request_queue    *q = bio->bi_disk->queue;
> > > +     bool            same_page = false;
> > > +
> > > +     if (WARN_ON_ONCE(!blk_rq_aligned(q, len, offset)))
> > > +             return -EIO;
> > > +
> > > +     if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
> > > +             if (bio_full(bio, len))
> > > +                     return 0;
> > > +             __bio_add_page(bio, page, len, offset);
> > > +     }
> > > +     return len;
> >
> > I know Jens disagree, but with the amount of bugs we've been hitting
> > thangs to slub (and I'm pretty sure we have a more hiding outside of
> > XFS) I think we need to add the blk_rq_aligned check to bio_add_page.
> 
> It isn't correct to blk_rq_aligned() here because 'len' has to be logical block
> size aligned, instead of DMA aligned only.

News to me.

AFAIA, the overall _IO_ that is being built needs to be a multiple
of the logical block size in total size (i.e. bio->bi_iter.size)
because sub sector IO is not allowed. But queue DMA limits are not
defined in sectors - they define the scatter/gather DMA capability
of the hardware, and that's what individual segments (bvecs) need to
align to.  That's what blk_rq_aligned() checks here - that the bvec
segment aligns to what the underlying driver(s) requires, not that
the entire IO is sector sized and aligned.

Also, think about multipage bvecs - the pages we are spanning here
are contiguous pages, so this should end up merging them and turning
it into a single multipage bvec whose length is sector size
aligned...

> Also not sure all users may setup bio->bi_disk well before adding page to bio,
> since it is allowed to do that now.

XFS does, so I just don't care about random users of bio_add_page()
in this patch. Somebody else can run the block layer gauntlet to get
these checks moved into generic code and they've already been
rejected twice as unnecessary.

> If slub buffer crosses two pages, block layer may not handle it at all
> even though
> un-aligned 'offset' issue is solved.

A slub buffer crossing two _contiguous_ pages should end up merged
as a multipage bvec. But I'm curious, what does adding multiple
contiguous pages to a bio actually break?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

next prev parent reply	other threads:[~2019-08-22  4:50 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-21  8:38 [PATCH 0/3] xfs: avoid IO issues unaligned memory allocation Dave Chinner
2019-08-21  8:38 ` [PATCH 1/3] xfs: add kmem allocation trace points Dave Chinner
2019-08-21 13:34   ` Brian Foster
2019-08-21 23:20   ` Christoph Hellwig
2019-08-21  8:38 ` [PATCH 2/3] xfs: add kmem_alloc_io() Dave Chinner
2019-08-21 13:35   ` Brian Foster
2019-08-21 15:08     ` Darrick J. Wong
2019-08-21 21:24       ` Dave Chinner
2019-08-21 15:23     ` Eric Sandeen
2019-08-21 21:14     ` Dave Chinner
2019-08-22 13:40       ` Brian Foster
2019-08-22 22:39         ` Dave Chinner
2019-08-23 12:10           ` Brian Foster
2019-08-21 23:24   ` Christoph Hellwig
2019-08-22  0:31     ` Dave Chinner
2019-08-22  7:59       ` Christoph Hellwig
2019-08-22  8:51         ` Peter Zijlstra
2019-08-22  9:10           ` Peter Zijlstra
2019-08-22 10:14             ` Dave Chinner
2019-08-22 11:14               ` Vlastimil Babka
2019-08-22 12:07                 ` Dave Chinner
2019-08-22 12:19                   ` Vlastimil Babka
2019-08-22 13:17                     ` Dave Chinner
2019-08-22 14:26                       ` Vlastimil Babka
2019-08-26 12:21                         ` Michal Hocko
2019-08-21  8:38 ` [PATCH 3/3] xfs: alignment check bio buffers Dave Chinner
2019-08-21 13:39   ` Brian Foster
2019-08-21 21:39     ` Dave Chinner
2019-08-22 13:47       ` Brian Foster
2019-08-22 23:03         ` Dave Chinner
2019-08-23 12:33           ` Brian Foster
2019-08-21 23:30     ` Christoph Hellwig
2019-08-22  0:44       ` Dave Chinner
2019-08-21 23:29   ` Christoph Hellwig
2019-08-22  0:37     ` Dave Chinner
2019-08-22  8:03       ` Christoph Hellwig
2019-08-22 10:17         ` Dave Chinner
2019-08-22  2:50     ` Ming Lei
2019-08-22  4:49       ` Dave Chinner [this message]
2019-08-22  7:23         ` Ming Lei
2019-08-22  8:08         ` Christoph Hellwig
2019-08-22 10:20           ` Ming Lei
2019-08-23  0:14             ` Christoph Hellwig
2019-08-23  1:19               ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190822044905.GU1119@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=axboe@kernel.dk \
    --cc=hch@infradead.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=tom.leiming@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.