From: Matthew Wilcox <willy@infradead.org>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Hongbo Li <lihongbo22@huawei.com>,
	linux-bcachefs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, axboe@kernel.dk, hch@lst.de
Subject: Re: bvec_iter.bi_sector -> loff_t? (was: Re: [PATCH] bcachefs: allow direct io fallback to buffer io for) unaligned length or offset
Date: Thu, 20 Jun 2024 15:49:54 +0100
Message-ID: <ZnRBkr_7Ah8Hj-i-@casper.infradead.org>
In-Reply-To: <pfxno4kzdgk6imw7vt2wvpluybohbf6brka6tlx34lu2zbbuaz@khifgy2v2z5n>

On Thu, Jun 20, 2024 at 10:16:02AM -0400, Kent Overstreet wrote:
> That's really just descriptive, not prescriptive.
> 
> The intent of O_DIRECT is "bypass the page cache"; the alignment
> restrictions are just a side effect of that. Applications just care
> about having predictable performance characteristics.

But any application that has been written to use O_DIRECT already has the
alignment & size guarantees in place.  What this patch is attempting to do
is make it "more friendly" to use, and I'm not sure that's a great idea.
Not without buy-in from a large cross-section of filesystem people.

I'm more sympathetic to "let's relax the alignment requirements", since
most IO devices actually can do IO to arbitrary boundaries (or at least
reasonable boundaries, e.g. cacheline alignment or 4-byte alignment).
The 512 byte alignment doesn't seem particularly rooted in any hardware
restrictions.
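
To make the distinction concrete, here is a minimal userspace sketch that
treats the memory alignment of the buffer as a separate limit from the
alignment of the file offset and length. The names dio_limits and
dio_request_ok() are hypothetical and exist only for this illustration;
they are not kernel interfaces. Recent kernels report comparable limits
per file through statx() with STATX_DIOALIGN (stx_dio_mem_align and
stx_dio_offset_align), which this sketch does not call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if value is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t value, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (value & (align - 1)) == 0;
}

/* Hypothetical limits: one for the user buffer, one for offset/length. */
struct dio_limits {
	uint64_t mem_align;	/* required alignment of the buffer address */
	uint64_t offset_align;	/* required alignment of offset and length */
};

/* Check a direct IO request against both limits independently. */
static bool dio_request_ok(const void *buf, uint64_t offset, size_t len,
			   const struct dio_limits *lim)
{
	return is_aligned((uint64_t)(uintptr_t)buf, lim->mem_align) &&
	       is_aligned(offset, lim->offset_align) &&
	       is_aligned(len, lim->offset_align);
}
```

With mem_align relaxed from 512 to 4, a buffer at an address such as
0x1004 becomes acceptable, while a 100-byte length is still rejected
because offset_align stays at the sector size.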

But size?  Fundamentally, we're asking the device to do IO directly to
this userspace address.  That means you get to do the entire IO, not
just the part of it that you want.  I know some devices have bitbucket
descriptors, but many don't.
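
The size problem can be sketched with a little arithmetic: given a byte
range and a block size, compute the block-aligned span the device would
actually have to transfer, and how many of those bytes the caller never
asked for. The names dio_span and dio_covering_span() are hypothetical,
and the 512-byte block size in the usage note is only an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Block-aligned span covering a byte range, plus the unwanted remainder. */
struct dio_span {
	uint64_t start;	/* first byte the device transfers */
	uint64_t len;	/* total bytes the device transfers */
	uint64_t extra;	/* transferred bytes outside the requested range */
};

/*
 * blk must be a power of two. Overflow of offset + len is not handled;
 * this is an illustration, not production code.
 */
static struct dio_span dio_covering_span(uint64_t offset, uint64_t len,
					 uint64_t blk)
{
	assert(blk != 0 && (blk & (blk - 1)) == 0);

	uint64_t start = offset & ~(blk - 1);
	uint64_t end = (offset + len + blk - 1) & ~(blk - 1);
	struct dio_span span = { start, end - start, (end - start) - len };

	return span;
}
```

For a 100-byte read at offset 100 with 512-byte blocks, the span is 512
bytes starting at 0, so 412 bytes have to land somewhere other than the
100-byte user buffer: a bitbucket descriptor where the device has one,
or a bounce buffer otherwise.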

> > I'm against it.  Block devices only do sector-aligned IO and we should
> > not pretend otherwise.
> 
> Eh?
> 
> bio isn't really specific to the block layer anyways, given that an
> iov_iter can be a bio underneath. We _really_ should be trying for
> better commonality of data structures.

bio is absolutely specific to the block layer.  Look at it:

/*
 * main unit of I/O for the block layer and lower layers (ie drivers and
 * stacking drivers)
 */

        struct block_device     *bi_bdev;
        unsigned short          bi_flags;       /* BIO_* below */
        unsigned short          bi_ioprio;
        blk_status_t            bi_status;

Filesystems get to use it to interact with the block layer.  The iov_iter
isn't an abstraction over the bio, it's an abstraction over the bio_vec.
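
A toy userspace model of that layering, as a sketch only: the iterator
below walks an array of memory segments and needs nothing else. The
types toy_bvec and toy_iter and the function toy_copy_from_iter() are
hypothetical stand-ins, not the kernel's struct bio_vec, struct iov_iter
or copy_from_iter(); a real bio_vec describes a page, an offset and a
length rather than a plain pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a bio_vec: one contiguous segment of memory. */
struct toy_bvec {
	const char *base;
	size_t len;
};

/* Stand-in for an iterator over segments: no device, flags or status. */
struct toy_iter {
	const struct toy_bvec *bvec;	/* current segment */
	size_t nr_segs;			/* segments left, including current */
	size_t seg_off;			/* bytes consumed in current segment */
};

/* Copy up to bytes from the iterator into dst; returns bytes copied. */
static size_t toy_copy_from_iter(void *dst, size_t bytes, struct toy_iter *it)
{
	size_t done = 0;

	while (done < bytes && it->nr_segs > 0) {
		assert(it->seg_off <= it->bvec->len);

		size_t avail = it->bvec->len - it->seg_off;
		size_t n = bytes - done < avail ? bytes - done : avail;

		memcpy((char *)dst + done, it->bvec->base + it->seg_off, n);
		done += n;
		it->seg_off += n;

		/* Segment exhausted: advance to the next one. */
		if (it->seg_off == it->bvec->len) {
			it->bvec++;
			it->nr_segs--;
			it->seg_off = 0;
		}
	}
	return done;
}
```

Nothing in the iterator refers to a block device, an IO priority or a
completion status; those belong to the bio, which is what the block
layer consumes.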
