From: Leonid Ravich <lravich@amazon.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>,
Mike Snitzer <snitzer@kernel.org>,
Mikulas Patocka <mpatocka@redhat.com>,
Alasdair Kergon <agk@redhat.com>,
Ard Biesheuvel <ardb@kernel.org>,
Eric Biggers <ebiggers@kernel.org>, Jens Axboe <axboe@kernel.dk>,
Horia Geanta <horia.geanta@nxp.com>,
Gilad Ben-Yossef <gilad@benyossef.com>,
<linux-crypto@vger.kernel.org>, <dm-devel@lists.linux.dev>,
<linux-block@vger.kernel.org>
Subject: Re: [RFC] crypto: skcipher multi-data-unit requests for dm-crypt
Date: Tue, 28 Apr 2026 10:12:25 +0000
Message-ID: <20260428101225.24316-1-lravich@amazon.com>
In-Reply-To: <ae9IUN0lOMkijDyw@gondor.apana.org.au>
On Mon, Apr 27, 2026, Herbert Xu wrote:
> Yes I'm happy with this since it could also work for IPsec.
>
> But before you invest too much energy in it it would be helpful
> if you can get some proof-of-concept performance numbers so that
> your effort is not wasted down the track.
I ran a proof-of-concept benchmark on an XTS-AES-256 dm-crypt
volume backed by a hardware crypto accelerator, comparing
per-sector submission against multi-data-unit submission.
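To make the difference between the two modes concrete, here is a minimal sketch of how many skcipher requests one bio turns into. It assumes dm-crypt's default 512-byte sector size and a multi-data-unit request that can cover a whole 4K bio; neither value is stated in this message, and the helper name requests_per_bio is hypothetical.

```python
def requests_per_bio(bio_bytes, data_unit_bytes=512, units_per_request=1):
    """Count crypto requests needed to cover one bio.

    data_unit_bytes=512 is an assumption (the dm-crypt default sector size);
    units_per_request=1 models per-sector submission.
    """
    units = -(-bio_bytes // data_unit_bytes)   # ceiling division
    return -(-units // units_per_request)      # ceiling division


per_sector_reqs = requests_per_bio(4096)                       # one request per sector
multi_unit_reqs = requests_per_bio(4096, units_per_request=8)  # one request per bio

print(f"per-sector:      {per_sector_reqs} requests per 4K bio")
print(f"multi-data-unit: {multi_unit_reqs} request per 4K bio")
```

Every per-request cost (allocation, dispatch, completion callback) is paid once per request, so under these assumptions the per-sector path pays it eight times for each 4K bio.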
Setup: single-core ARM64, fio 4K sequential writes, buffered IO
with end_fsync (representative of filesystem-over-dm-crypt
workloads). I ran two rounds per configuration; results were
consistent (< 2% variance between rounds).
Throughput (averaged):
    per-sector:       286 MB/s, 73K IOPS
    multi-data-unit:  340 MB/s, 87K IOPS (+19%)
CPU cycles (perf, 30s sample):
    per-sector:       59.8 billion cycles
    multi-data-unit:  36.0 billion cycles (-40%)
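As a rough cross-check, the throughput and cycle figures can be combined into cycles per 4K write. This is a sketch only: it assumes the averaged IOPS held throughout the 30s perf sample, which the measurements do not confirm, and the function name cycles_per_io is made up for illustration.

```python
SAMPLE_SECONDS = 30  # length of the perf sample, from the report


def cycles_per_io(total_cycles, iops, seconds=SAMPLE_SECONDS):
    """Average CPU cycles per completed IO over the sample window.

    Assumes the averaged IOPS figure applies to the whole window.
    """
    return total_cycles / (iops * seconds)


per_sector = cycles_per_io(59.8e9, 73_000)
multi_unit = cycles_per_io(36.0e9, 87_000)
reduction = 1 - multi_unit / per_sector

print(f"per-sector:      {per_sector:,.0f} cycles/IO")
print(f"multi-data-unit: {multi_unit:,.0f} cycles/IO")
print(f"reduction:       {reduction:.1%}")
```

Under that assumption the cost per IO falls by about half, which is more than the 40% drop in total cycles because the multi-data-unit run also completed more IOs in the same window.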
The baseline is partially CPU-bound. The perf profile shows
dm-crypt and crypto API per-request overhead consuming roughly
25% of CPU cycles in the per-sector case:
    4.3%  crypto dispatch
    4.1%  async completion callback
    3.5%  completion collection
    3.3%  kfree
    2.9%  per-bio context lookup
    2.8%  crypt_convert loop
    1.6%  slab allocation
    1.3%  mempool free
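The listed entries can be tallied and converted to absolute cycles with a few lines. This is only arithmetic on the numbers above; applying the percentages to the 59.8 billion cycle total assumes the profile and the cycle count come from the same per-sector perf sample.

```python
# Per-request overhead entries from the per-sector profile (% of CPU cycles).
overhead_pct = {
    "crypto dispatch": 4.3,
    "async completion callback": 4.1,
    "completion collection": 3.5,
    "kfree": 3.3,
    "per-bio context lookup": 2.9,
    "crypt_convert loop": 2.8,
    "slab allocation": 1.6,
    "mempool free": 1.3,
}

BASELINE_CYCLES = 59.8e9  # per-sector total; assumed to be the same perf sample

listed_pct = sum(overhead_pct.values())
listed_cycles = BASELINE_CYCLES * listed_pct / 100

print(f"listed overhead: {listed_pct:.1f}% of cycles")
print(f"approx. {listed_cycles / 1e9:.1f} billion cycles in the 30s sample")
```

The eight entries add up to 23.8%, so the "roughly 25%" figure presumably also counts smaller entries that are not listed here.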
With multi-data-unit submission, these functions drop out of the
top of the profile. The bottleneck shifts to DMA mapping and page
cache operations. CPU0 kernel time drops from 78% to 40%, with the
freed cycles appearing as iowait.
The 19% throughput gain (vs 40% CPU reduction) reflects that
the system was partially IO-bound even in the baseline. The
optimization removes the CPU bottleneck, allowing the system
to fully saturate the IO path.
I will prepare the patch series against mainline.
Thanks,
Leonid Ravich