linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: Keith Busch <kbusch@kernel.org>
Cc: Keith Busch <kbusch@fb.com>, Jaegeuk Kim <jaegeuk@kernel.org>,
	Chao Yu <chao@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, axboe@kernel.dk,
	Kernel Team <Kernel-team@fb.com>,
	hch@lst.de, bvanassche@acm.org, damien.lemoal@opensource.wdc.com,
	pankydev8@gmail.com, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [PATCHv6 11/11] iomap: add support for dma aligned direct-io
Date: Fri, 22 Jul 2022 18:01:35 +0000	[thread overview]
Message-ID: <Ytrl/1YEg9M0fb+i@gmail.com> (raw)
In-Reply-To: <Ytq3qwTBTRRxBfXv@kbusch-mbp.dhcp.thefacebook.com>

On Fri, Jul 22, 2022 at 08:43:55AM -0600, Keith Busch wrote:
> On Fri, Jul 22, 2022 at 12:36:01AM -0700, Eric Biggers wrote:
> > [+f2fs list and maintainers]
> > 
> > On Fri, Jun 10, 2022 at 12:58:30PM -0700, Keith Busch wrote:
> > > From: Keith Busch <kbusch@kernel.org>
> > > 
> > > Use the address alignment requirements from the block_device for direct
> > > io instead of requiring addresses be aligned to the block size.
> > > 
> > > Signed-off-by: Keith Busch <kbusch@kernel.org>
> > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > ---
> > >  fs/iomap/direct-io.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> > > index 370c3241618a..5d098adba443 100644
> > > --- a/fs/iomap/direct-io.c
> > > +++ b/fs/iomap/direct-io.c
> > > @@ -242,7 +242,6 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> > >  	struct inode *inode = iter->inode;
> > >  	unsigned int blkbits = blksize_bits(bdev_logical_block_size(iomap->bdev));
> > >  	unsigned int fs_block_size = i_blocksize(inode), pad;
> > > -	unsigned int align = iov_iter_alignment(dio->submit.iter);
> > >  	loff_t length = iomap_length(iter);
> > >  	loff_t pos = iter->pos;
> > >  	unsigned int bio_opf;
> > > @@ -253,7 +252,8 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> > >  	size_t copied = 0;
> > >  	size_t orig_count;
> > >  
> > > -	if ((pos | length | align) & ((1 << blkbits) - 1))
> > > +	if ((pos | length) & ((1 << blkbits) - 1) ||
> > > +	    !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
> > >  		return -EINVAL;
> > >  
> > >  	if (iomap->type == IOMAP_UNWRITTEN) {
> > 
> > I noticed that this patch is going to break the following logic in
> > f2fs_should_use_dio() in fs/f2fs/file.c:
> > 
> > 	/*
> > 	 * Direct I/O not aligned to the disk's logical_block_size will be
> > 	 * attempted, but will fail with -EINVAL.
> > 	 *
> > 	 * f2fs additionally requires that direct I/O be aligned to the
> > 	 * filesystem block size, which is often a stricter requirement.
> > 	 * However, f2fs traditionally falls back to buffered I/O on requests
> > 	 * that are logical_block_size-aligned but not fs-block aligned.
> > 	 *
> > 	 * The below logic implements this behavior.
> > 	 */
> > 	align = iocb->ki_pos | iov_iter_alignment(iter);
> > 	if (!IS_ALIGNED(align, i_blocksize(inode)) &&
> > 	    IS_ALIGNED(align, bdev_logical_block_size(inode->i_sb->s_bdev)))
> > 		return false;
> > 
> > 	return true;
> > 
> > So, f2fs assumes that __iomap_dio_rw() returns an error if the I/O isn't logical
> > block aligned.  This patch changes that.  The result is that DIO will sometimes
> > proceed in cases where the I/O doesn't have the fs block alignment required by
> > f2fs for all DIO.
> > 
> > Does anyone have any thoughts about what f2fs should be doing here?  I think
> > it's weird that f2fs has different behaviors for different degrees of
> > misalignment: fail with EINVAL if not logical block aligned, else fallback to
> > buffered I/O if not fs block aligned.  I think it should be one convention or
> > the other.  Any opinions about which one it should be?
> 
> It looks like f2fs just falls back to buffered IO for this condition without
> reaching the new code in iomap_dio_bio_iter().

No.  It's a bit subtle, so read the code and what I'm saying carefully.  f2fs
only supports 4K aligned DIO and normally falls back to buffered I/O; however,
for DIO that is *very* misaligned (not even LBS aligned) it returns EINVAL
instead.  And it relies on __iomap_dio_rw() returning that EINVAL.

Relying on __iomap_dio_rw() in that way is definitely a bad design on f2fs's
part (and I messed that up when switching f2fs from fs/direct-io.c to iomap).
The obvious fix is to just have f2fs do the LBS alignment check itself.

But I think that f2fs shouldn't have different behavior for different levels of
misalignment in the first place, so I was wondering if anyone had any thoughts
on which behavior (EINVAL or fallback to buffered I/O) should be standardized on
in all cases, at least for f2fs.  There was some discussion about this sort of
thing for ext4 several years ago on the thread
https://lore.kernel.org/linux-ext4/1461472078-20104-1-git-send-email-tytso@mit.edu/T/#u,
but it didn't really reach a conclusion.  I'm wondering if the f2fs maintainers
have any thoughts about why the f2fs behavior is as it is.  I.e. is it just
accidental, or are there specific reasons...

- Eric

  reply	other threads:[~2022-07-22 18:01 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20220610195830.3574005-1-kbusch@fb.com>
     [not found] ` <20220610195830.3574005-8-kbusch@fb.com>
2022-06-13 14:22   ` [PATCHv6 07/11] block/bounce: count bytes instead of sectors Christoph Hellwig
2022-07-22 22:01   ` Bart Van Assche
2022-07-25 14:46     ` Keith Busch
2022-06-13 21:22 ` [PATCHv6 00/11] direct-io dma alignment Jens Axboe
     [not found] ` <20220610195830.3574005-12-kbusch@fb.com>
2022-06-23 18:29   ` [PATCHv6 11/11] iomap: add support for dma aligned direct-io Eric Farman
2022-06-23 18:51     ` Keith Busch
2022-06-23 19:11       ` Keith Busch
2022-06-23 20:32         ` Eric Farman
2022-06-23 21:34           ` Eric Farman
2022-06-27 15:21             ` Eric Farman
2022-06-27 15:36               ` Keith Busch
2022-06-28  9:00                 ` Halil Pasic
2022-06-28 15:20                   ` Eric Farman
2022-06-29  3:18                     ` Eric Farman
2022-06-29  3:52                       ` Keith Busch
2022-06-29 18:04                         ` Eric Farman
2022-06-29 19:07                           ` Keith Busch
2022-06-29 19:28                             ` Eric Farman
2022-06-30  5:45                             ` Christian Borntraeger
2022-07-22  7:36   ` Eric Biggers
2022-07-22 14:43     ` Keith Busch
2022-07-22 18:01       ` Eric Biggers [this message]
2022-07-22 20:26         ` Keith Busch
2022-07-25 18:19           ` Eric Biggers
2022-07-24  2:13         ` Jaegeuk Kim
2022-07-22 17:53     ` Darrick J. Wong
2022-07-22 18:12       ` Eric Biggers
2022-07-23  5:03         ` Darrick J. Wong
     [not found] ` <20220610195830.3574005-6-kbusch@fb.com>
2022-07-22 21:53   ` [PATCHv6 05/11] block: add a helper function for dio alignment Bart Van Assche
     [not found] ` <20220610195830.3574005-7-kbusch@fb.com>
2022-07-22 21:57   ` [PATCHv6 06/11] block/merge: count bytes instead of sectors Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ytrl/1YEg9M0fb+i@gmail.com \
    --to=ebiggers@kernel.org \
    --cc=Kernel-team@fb.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=chao@kernel.org \
    --cc=damien.lemoal@opensource.wdc.com \
    --cc=hch@lst.de \
    --cc=jaegeuk@kernel.org \
    --cc=kbusch@fb.com \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=pankydev8@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).