From: Mike Snitzer <snitzer@kernel.org>
To: Keith Busch <kbusch@meta.com>
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
mpatocka@redhat.com, ebiggers@google.com,
Keith Busch <kbusch@kernel.org>
Subject: Re: [RFC PATCH] dm-crypt: allow unaligned bio_vecs for direct io
Date: Thu, 18 Sep 2025 16:27:14 -0400
Message-ID: <aMxrIjcFqaT2WztN@kernel.org>
In-Reply-To: <20250918161642.2867886-1-kbusch@meta.com>
On Thu, Sep 18, 2025 at 09:16:42AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Most storage devices can handle DMA for data that is not aligned to the
> sector block size. The block and filesystem layers have introduced
> updates to allow that kind of memory alignment flexibility when
> possible.
I'd love to understand which filesystem changes you're referring to,
because I know for certain that DIO with memory that isn't
'dma_alignment' aligned fails on top of XFS.
I'm pretty certain it also balks at DIO that isn't logical_block_size
aligned on disk.
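
To make the two constraints above concrete, here is a minimal userspace
sketch of the checks as I understand them. It is a model, not kernel code:
struct model_limits and both helper names are hypothetical, and the real
checks live in the block and filesystem DIO paths, which may differ in
detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two queue limits discussed above. */
struct model_limits {
	unsigned int logical_block_size; /* bytes, assumed power of two */
	unsigned int dma_alignment;      /* mask, e.g. 511 or 3 */
};

/* Memory side: buffer address and length must clear the dma_alignment mask. */
static bool model_mem_aligned(const struct model_limits *lim,
			      uintptr_t addr, size_t len)
{
	return ((addr | len) & lim->dma_alignment) == 0;
}

/* Disk side: offset and length must be multiples of logical_block_size. */
static bool model_disk_aligned(const struct model_limits *lim,
			       uint64_t pos, size_t len)
{
	unsigned int lbs = lim->logical_block_size;

	/* The mask trick below only works for a power-of-two block size. */
	assert(lbs != 0 && (lbs & (lbs - 1)) == 0);
	return ((pos | len) & (lbs - 1)) == 0;
}
```

With dma_alignment set to logical_block_size - 1 (what dm-crypt does
today), a buffer at a 4-byte offset into a page is rejected; with a mask
of 3 the same buffer passes the memory check, while the on-disk check is
unaffected.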
> dm-crypt, however, currently constrains itself to aligned memory because
> it sends a single scatterlist element for the input to the encrypt and
> decrypt algorithms. This forces applications that have unaligned data to
> copy through a bounce buffer, increasing CPU and memory utilization.
Even this notion that an application is somehow able to (unwittingly)
lean on "unaligned data to copy through a bounce buffer" -- has me
asking: where does Keith get these wonderful toys?
Anyway, just asking these things because if they were true I wouldn't
be needing to add specialized code to NFSD and NFS to handle
misaligned DIO.
> It appears to be a pretty straightforward thing to modify for skcipher
> since there are 3 unused scatterlist elements immediately available. In
> practice, that should be enough as the sector granularity of data
> generally doesn't straddle more than one page, if at all.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/md/dm-crypt.c | 29 +++++++++++++++++++----------
> 1 file changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> index 5ef43231fe77f..f860716b7a5c1 100644
> --- a/drivers/md/dm-crypt.c
> +++ b/drivers/md/dm-crypt.c
> @@ -1429,18 +1429,14 @@ static int crypt_convert_block_skcipher(struct crypt_config *cc,
> struct skcipher_request *req,
> unsigned int tag_offset)
> {
> - struct bio_vec bv_in = bio_iter_iovec(ctx->bio_in, ctx->iter_in);
> struct bio_vec bv_out = bio_iter_iovec(ctx->bio_out, ctx->iter_out);
> + unsigned int bytes = cc->sector_size;
> struct scatterlist *sg_in, *sg_out;
> struct dm_crypt_request *dmreq;
> u8 *iv, *org_iv, *tag_iv;
> __le64 *sector;
> int r = 0;
>
> - /* Reject unexpected unaligned bio. */
> - if (unlikely(bv_in.bv_len & (cc->sector_size - 1)))
> - return -EIO;
> -
> dmreq = dmreq_of_req(cc, req);
> dmreq->iv_sector = ctx->cc_sector;
> if (test_bit(CRYPT_IV_LARGE_SECTORS, &cc->cipher_flags))
> @@ -1457,11 +1453,24 @@ static int crypt_convert_block_skcipher(struct crypt_config *cc,
> *sector = cpu_to_le64(ctx->cc_sector - cc->iv_offset);
>
> /* For skcipher we use only the first sg item */
> - sg_in = &dmreq->sg_in[0];
> sg_out = &dmreq->sg_out[0];
>
> - sg_init_table(sg_in, 1);
> - sg_set_page(sg_in, bv_in.bv_page, cc->sector_size, bv_in.bv_offset);
> + do {
> + struct bio_vec bv_in = bio_iter_iovec(ctx->bio_in, ctx->iter_in);
> + int len = min(bytes, bv_in.bv_len);
> +
> + if (r >= ARRAY_SIZE(dmreq->sg_in))
> + return -EINVAL;
> +
> + sg_in = &dmreq->sg_in[r++];
> + memset(sg_in, 0, sizeof(*sg_in));
> + sg_set_page(sg_in, bv_in.bv_page, len, bv_in.bv_offset);
> + bio_advance_iter_single(ctx->bio_in, &ctx->iter_in, len);
> + bytes -= len;
> + } while (bytes);
> +
> + sg_mark_end(sg_in);
> + sg_in = &dmreq->sg_in[0];
>
> sg_init_table(sg_out, 1);
> sg_set_page(sg_out, bv_out.bv_page, cc->sector_size, bv_out.bv_offset);
> @@ -1495,7 +1504,6 @@ static int crypt_convert_block_skcipher(struct crypt_config *cc,
> if (!r && cc->iv_gen_ops && cc->iv_gen_ops->post)
> r = cc->iv_gen_ops->post(cc, org_iv, dmreq);
>
> - bio_advance_iter(ctx->bio_in, &ctx->iter_in, cc->sector_size);
> bio_advance_iter(ctx->bio_out, &ctx->iter_out, cc->sector_size);
>
> return r;
> @@ -3750,7 +3758,8 @@ static void crypt_io_hints(struct dm_target *ti, struct queue_limits *limits)
> limits->physical_block_size =
> max_t(unsigned int, limits->physical_block_size, cc->sector_size);
> limits->io_min = max_t(unsigned int, limits->io_min, cc->sector_size);
> - limits->dma_alignment = limits->logical_block_size - 1;
> + if (crypt_integrity_aead(cc))
> + limits->dma_alignment = limits->logical_block_size - 1;
>
> /*
> * For zoned dm-crypt targets, there will be no internal splitting of
> --
> 2.47.3
>
>
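
For readers following the do/while loop in the hunk above, the splitting
logic can be sketched in userspace as below. This is only a model under
stated assumptions: struct model_seg stands in for a bio_vec, struct
model_sg for a scatterlist entry, MODEL_SG_MAX assumes the four sg_in
slots implied by "3 unused scatterlist elements", and all names are
hypothetical rather than anything in dm-crypt.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_SG_MAX 4 /* assumed: one used slot plus three unused ones */

struct model_seg { size_t offset; size_t len; }; /* stands in for a bio_vec */
struct model_sg  { size_t offset; size_t len; }; /* stands in for an sg entry */

/*
 * Cover one sector of sector_size bytes with sg entries, starting at
 * segment *idx with *done bytes of that segment already consumed.
 * Returns the number of entries written, or -EINVAL if the sector needs
 * more than MODEL_SG_MAX entries or the segments run out.
 */
static int model_split_sector(const struct model_seg *segs, size_t nsegs,
			      size_t *idx, size_t *done,
			      size_t sector_size, struct model_sg *sg)
{
	size_t bytes = sector_size;
	int n = 0;

	assert(sector_size > 0);
	while (bytes) {
		size_t avail, len;

		if (*idx >= nsegs || n >= MODEL_SG_MAX)
			return -EINVAL;

		/* Take as much of the current segment as the sector needs. */
		avail = segs[*idx].len - *done;
		len = bytes < avail ? bytes : avail;
		sg[n].offset = segs[*idx].offset + *done;
		sg[n].len = len;
		n++;

		/* Advance the iterator, moving on once a segment is used up. */
		*done += len;
		if (*done == segs[*idx].len) {
			(*idx)++;
			*done = 0;
		}
		bytes -= len;
	}
	return n;
}
```

A 512-byte sector laid out as a 100-byte segment followed by a 412-byte
segment yields two entries; one spread over more than four segments is
rejected with -EINVAL, mirroring the ARRAY_SIZE() check in the patch.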
Thread overview: 6+ messages
2025-09-18 16:16 [RFC PATCH] dm-crypt: allow unaligned bio_vecs for direct io Keith Busch
2025-09-18 20:13 ` Keith Busch
2025-09-26 14:19 ` Mikulas Patocka
2025-09-26 16:17 ` Keith Busch
2025-09-18 20:27 ` Mike Snitzer [this message]
2025-09-18 20:52 ` Keith Busch