linux-fsdevel.vger.kernel.org archive mirror
From: Eric Biggers <ebiggers@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Carlos Llamas <cmllamas@google.com>,
	Keith Busch <kbusch@kernel.org>, Keith Busch <kbusch@meta.com>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	axboe@kernel.dk, Hannes Reinecke <hare@suse.de>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Subject: Re: [PATCHv4 5/8] iomap: simplify direct io validity check
Date: Thu, 30 Oct 2025 10:40:15 -0700
Message-ID: <20251030174015.GC1624@sol>
In-Reply-To: <20251029070618.GA29697@lst.de>

On Wed, Oct 29, 2025 at 08:06:18AM +0100, Christoph Hellwig wrote:
> I think we need to take a step back and talk about what alignment
> we're talking about here, as there are two dimensions to it.
> 
> The first dimension is: disk alignment vs memory alignment.
> 
> Disk alignment:
>   Direct I/O obviously needs to be aligned to on-disk sectors to have
>   a chance to work, as that is the lowest possible granularity of access.
> 
>   For file systems that write out of place we also need to align writes
>   to the logical block size of the file system.
> 
>   With blk-crypto we need to align to the DUN if it is larger than the
>   disk-sector size.
> 
> Memory alignment:
> 
>   This is the alignment of the buffer in-memory.  Hardware only really
>   cares about this when DMA engines discard the lowest bits, so a typical
>   hardware alignment requirement is to only require a dword (4 byte)
>   alignment.  For drivers that process the payload in software, such
>   low alignment tends to cause bugs, as they're not written with it
>   in mind.  Similarly for any additional processing like
>   encryption, parity or checksums.
> 
> The second dimension is for the entire operation vs individual vectors,
> this has implications both for the disk and memory alignment.  Keith
> has done work there recently to relax the alignment of the vectors to
> only require the memory alignment, so that preadv/pwritev-like calls
> can have lots of unaligned segments.
> 
> I think it's the latter that's tripping up here now.  Hard coding these
> checks in the file systems seems like a bad idea; we really need to
> advertise them in the queue limits, which is complicated by the fact that
> we only want to do that for bios using block layer encryption.  I.e., we
> probably need a separate queue limit that mirrors dma_alignment, but only
> for encrypted bios, and which is taken into account in the block layer
> splitting and communicated up by file systems only for encrypted bios.
> For blk-crypto-fallback we'd need DUN alignment so that the algorithms
> just work (assuming the crypto API can't scatter over misaligned
> segments), but for hardware blk-crypto I suspect that the normal DMA
> engine rules apply, and we don't need to restrict alignment.

Allowing DIO segments to be aligned (in memory address and/or length) to
less than crypto_data_unit_size on encrypted files has been attempted
and discussed before.  Read the cover letter of
https://lore.kernel.org/linux-fscrypt/20220128233940.79464-1-ebiggers@kernel.org/
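
To make the existing rule concrete, here is a minimal userspace sketch
of the per-segment check, assuming a power-of-two data unit size.  The
helper name dio_segments_aligned() is hypothetical; it is not a kernel
or libc function, and it only illustrates what "aligned in memory
address and length" means for each iovec segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not a real kernel or libc API): returns true if
 * every segment's base address and length are multiples of
 * data_unit_size, which is assumed to be a nonzero power of two.
 */
static bool dio_segments_aligned(const struct iovec *iov, int iovcnt,
				 size_t data_unit_size)
{
	size_t mask = data_unit_size - 1;

	/* The mask trick below only works for powers of two. */
	assert(data_unit_size != 0 && (data_unit_size & mask) == 0);

	for (int i = 0; i < iovcnt; i++) {
		if (((uintptr_t)iov[i].iov_base & mask) != 0 ||
		    (iov[i].iov_len & mask) != 0)
			return false;
	}
	return true;
}
```

A caller could run this over the iovec array it intends to pass to
preadv()/pwritev() on an encrypted file and fall back to buffered I/O
when it returns false.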

We eventually decided to proceed with DIO support without it, since it
would have added a lot of complexity.  It would have made the bio
splitting code in the block layer split bios at boundaries where the
length isn't aligned to crypto_data_unit_size, it would have caused a
lot of trouble for blk-crypto-fallback, and it even would have been
incompatible with some of the hardware drivers (e.g. ufs-exynos.c).
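
As a rough illustration of the splitting problem, the toy model below
is a sketch only.  The function names and the idea of a fixed segment
limit are assumptions for the example; this is not the block layer's
actual splitting code.  It shows that once segments may be shorter than
the data unit size, a split taken at a segment boundary can land in the
middle of a data unit.

```c
#include <stddef.h>

/*
 * Toy model, hypothetical names: number of bytes covered when a request
 * is cut at a segment boundary after at most max_segs segments.
 */
static size_t split_at_segment_limit(const size_t *seg_len, size_t nsegs,
				     size_t max_segs)
{
	size_t bytes = 0;

	for (size_t i = 0; i < nsegs && i < max_segs; i++)
		bytes += seg_len[i];
	return bytes;
}

/*
 * Round a byte count down to a multiple of data_unit_size, which is
 * assumed to be a nonzero power of two.
 */
static size_t round_down_to_data_unit(size_t bytes, size_t data_unit_size)
{
	return bytes & ~(data_unit_size - 1);
}
```

For example, with eight 512-byte segments, a limit of five segments,
and a 4096-byte data unit, the boundary split covers 2560 bytes, which
is not a multiple of 4096; rounding down to a whole data unit leaves
nothing to submit in the first part.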

It also didn't seem to be all that useful, and it would have introduced
edge cases that don't get tested much.  All of them would be reachable
by unprivileged userspace code too, of course.

I can't say that the idea seems all that great to me.

We can always reconsider and still add support for this.  But it's not
clear to me what's changed.

- Eric

Thread overview: 27+ messages
2025-08-27 14:12 [PATCHv4 0/8] Keith Busch
2025-08-27 14:12 ` [PATCHv4 1/8] block: check for valid bio while splitting Keith Busch
2025-08-31  0:40   ` Martin K. Petersen
2025-08-27 14:12 ` [PATCHv4 2/8] block: add size alignment to bio_iov_iter_get_pages Keith Busch
2025-08-31  0:40   ` Martin K. Petersen
2025-08-27 14:12 ` [PATCHv4 3/8] block: align the bio after building it Keith Busch
2025-08-31  0:41   ` Martin K. Petersen
2025-09-02  5:23   ` Christoph Hellwig
2025-08-27 14:12 ` [PATCHv4 4/8] block: simplify direct io validity check Keith Busch
2025-08-27 14:12 ` [PATCHv4 5/8] iomap: " Keith Busch
2025-10-27 16:25   ` Carlos Llamas
2025-10-27 16:42     ` Keith Busch
2025-10-27 17:12       ` Carlos Llamas
2025-10-28 22:47       ` Carlos Llamas
2025-10-28 22:56         ` Eric Biggers
2025-10-28 23:03           ` Eric Biggers
2025-10-29  7:06             ` Christoph Hellwig
2025-10-30 17:40               ` Eric Biggers [this message]
2025-10-31  9:18                 ` Christoph Hellwig
2025-11-03 18:10                   ` Eric Biggers
2025-11-03 18:26                     ` Keith Busch
2025-11-04 11:35                       ` Christoph Hellwig
2025-10-30  4:54             ` Carlos Llamas
2025-08-27 14:12 ` [PATCHv4 6/8] block: remove bdev_iter_is_aligned Keith Busch
2025-08-27 14:12 ` [PATCHv4 7/8] blk-integrity: use simpler alignment check Keith Busch
2025-08-27 14:12 ` [PATCHv4 8/8] iov_iter: remove iov_iter_is_aligned Keith Busch
2025-09-09 16:27 ` [PATCHv4 0/8] Jens Axboe