From: Eric Farman <farman@linux.ibm.com>
To: Halil Pasic <pasic@linux.ibm.com>, Keith Busch <kbusch@kernel.org>
Cc: Keith Busch <kbusch@fb.com>,
linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
linux-nvme@lists.infradead.org,
Christian Borntraeger <borntraeger@linux.ibm.com>,
axboe@kernel.dk, Kernel Team <Kernel-team@fb.com>,
hch@lst.de, bvanassche@acm.org, damien.lemoal@opensource.wdc.com,
ebiggers@kernel.org, pankydev8@gmail.com
Subject: Re: [PATCHv6 11/11] iomap: add support for dma aligned direct-io
Date: Tue, 28 Jun 2022 11:20:44 -0400
Message-ID: <83e65083890a7ac9c581c5aee0361d1b49e6abd9.camel@linux.ibm.com>
In-Reply-To: <20220628110024.01fcf84f.pasic@linux.ibm.com>
On Tue, 2022-06-28 at 11:00 +0200, Halil Pasic wrote:
> On Mon, 27 Jun 2022 09:36:56 -0600
> Keith Busch <kbusch@kernel.org> wrote:
>
> > On Mon, Jun 27, 2022 at 11:21:20AM -0400, Eric Farman wrote:
> > > Apologies, it took me an extra day to get back to this, but it is
> > > indeed this pass through that's causing our boot failures. I note
> > > that
> > > the old code (in iomap_dio_bio_iter), did:
> > >
> > >     if ((pos | length | align) & ((1 << blkbits) - 1))
> > >         return -EINVAL;
> > >
> > > With blkbits equal to 12, the resulting mask was 0x0fff against
> > > an
> > > align value (from iov_iter_alignment) of x200 kicks us out.
> > >
> > > The new code (in iov_iter_aligned_iovec), meanwhile, compares
> > > this:
> > >
> > >     if ((unsigned long)(i->iov[k].iov_base + skip) & addr_mask)
> > >         return false;
> > >
> > > iov_base (and the output of the old iov_iter_aligned_iovec()
> > > routine)
> > > is x200, but since addr_mask is x1ff this check provides a
> > > different
> > > response than it used to.
> > >
> > > To check this, I changed the comparator to len_mask (almost
> > > certainly
> > > not the right answer since addr_mask is then unused, but it was
> > > good
> > > for a quick test), and our PV guests are able to boot again with
> > > -next
> > > running in the host.
> >
> > This raises more questions for me. It sounds like your process used
> > to get an
> > EINVAL error, and it wants to continue getting an EINVAL error
> > instead of
> > letting the direct-io request proceed. Is that correct?
>
> Is my understanding as well. But I'm not familiar enough with the
> code to
> tell where and how that -EINVAL gets handled.
>
> BTW let me just point out that the bounce buffering via swiotlb
> needed
> for PV is not unlikely to mess up the alignment of things. But I'm
> not
> sure if that is relevant here.
>
> Regards,
> Halil
>
> > If so, could you
> > provide more details on what issue occurs with dispatching this
> > request?

This error occurs when reading the initial boot record for a guest,
stating that QEMU was unable to read block zero from the device. The
code that complains doesn't appear to have anything that says "oh, got
EINVAL, try it this other way", but I haven't chased down if/where
something in between is expecting that and handling it in some unique
way. I -think- I have an easier reproducer now, so maybe I'll be able
to get a better answer to this question.

> >
> > If you really need to restrict address' alignment to the storage's
> > logical
> > block size, I think your storage driver needs to set the
> > dma_alignment queue
> > limit to that value.

It's possible that there's a problem in the virtio stack here, but the
failing configuration is a qcow image on the host rootfs, so it's not
using any distinct driver. The bdev request queue that ends up being
used is the one allocated by blk_alloc_queue, so changing dma_alignment
there wouldn't work.
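
For anyone following along, the difference between the two checks quoted
above can be reproduced in user space. This is only a sketch of the
arithmetic, not the kernel code: the function names old_dio_check() and
new_addr_check() are made up for illustration, pos = 0 and length = 4096
are assumed values for a read of block zero, and the 0xfff mask in the
last case is my assumption for what len_mask would be with blkbits = 12.
The per-segment length test in the new code is not modeled.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Modeled on the old test quoted from iomap_dio_bio_iter(): position,
 * length and buffer alignment must all be multiples of the block size.
 * (Hypothetical helper, not a kernel function.)
 */
int old_dio_check(unsigned long long pos, unsigned long long length,
                  unsigned long long align, unsigned int blkbits)
{
    unsigned long long mask = (1ULL << blkbits) - 1;

    if ((pos | length | align) & mask)
        return -EINVAL;
    return 0;
}

/*
 * Modeled on the address test quoted from iov_iter_aligned_iovec():
 * the buffer address only has to satisfy the mask that is passed in.
 * (Hypothetical helper, not a kernel function.)
 */
bool new_addr_check(unsigned long iov_base, unsigned long mask)
{
    return (iov_base & mask) == 0;
}
```

With the values from this thread, old_dio_check(0, 4096, 0x200, 12)
returns -EINVAL because 0x200 & 0xfff is nonzero, while
new_addr_check(0x200, 0x1ff) returns true because 0x200 & 0x1ff is zero.
Passing 0xfff as the mask instead, which is roughly what my quick test
with len_mask did, makes the check fail again.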