From: Harald Freudenberger <freude@linux.ibm.com>
To: Eric Biggers <ebiggers@kernel.org>
Cc: linux-crypto@vger.kernel.org, David Howells <dhowells@redhat.com>,
Ard Biesheuvel <ardb@kernel.org>,
"Jason A . Donenfeld" <Jason@zx2c4.com>,
Holger Dengler <dengler@linux.ibm.com>,
Herbert Xu <herbert@gondor.apana.org.au>,
linux-arm-kernel@lists.infradead.org, linux-s390@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 00/15] SHA-3 library
Date: Wed, 05 Nov 2025 09:16:56 +0100
Message-ID: <70461134f12796b1166978c8628b5cf3@linux.ibm.com>
In-Reply-To: <20251104182738.GA2419@sol>
On 2025-11-04 19:27, Eric Biggers wrote:
> On Tue, Nov 04, 2025 at 12:07:40PM +0100, Harald Freudenberger wrote:
>> > Thanks! Is this with the whole series applied? Those numbers are
>> > pretty fast, so probably at least the Keccak acceleration part is
>> > worthwhile. But just to reiterate what I asked for:
>> >
>> > Also, it would be helpful to provide the benchmark output from just
>> > before "lib/crypto: s390/sha3: Add optimized Keccak function", just
>> > after it, and after "lib/crypto: s390/sha3: Add optimized one-shot
>> > SHA-3 digest functions".
>> >
>> > So I'd like to see how much each change helped, which isn't clear if you
>> > show only the result at the end.
>> >
>> > If there's still no evidence that "lib/crypto: s390/sha3: Add optimized
>> > one-shot SHA-3 digest functions" actually helps significantly vs. simply
>> > doing the Keccak acceleration, then we should drop it for simplicity.
> [...]
>> commit b2e169dd8ca5 lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions:
>>
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # module: sha3_kunit
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: 1..21
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 1 test_hash_test_vectors
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 2 test_hash_all_lens_up_to_4096
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 3 test_hash_incremental_updates
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 4 test_hash_buffer_overruns
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 5 test_hash_overlaps
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 6 test_hash_alignment_consistency
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 7 test_hash_ctx_zeroization
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 8 test_hash_interrupt_context_1
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 9 test_hash_interrupt_context_2
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 10 test_sha3_224_basic
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 11 test_sha3_256_basic
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 12 test_sha3_384_basic
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 13 test_sha3_512_basic
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 14 test_shake128_basic
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 15 test_shake256_basic
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 16 test_shake128_nist
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 17 test_shake256_nist
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 18 test_shake_all_lens_up_to_4096
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 19 test_shake_multiple_squeezes
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 20 test_shake_with_guarded_bufs
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=1: 12 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=16: 80 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=64: 785 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=127: 812 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=128: 1619 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=200: 2319 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=256: 2176 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=511: 4881 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=512: 4968 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=1024: 7565 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=3173: 11909 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=4096: 10378 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # benchmark_hash: len=16384: 12273 MB/s
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: ok 21 benchmark_hash
>> Nov 04 10:50:50 b3545008.lnxne.boe kernel: # sha3: pass:21 fail:0 skip:0 total:21
>>
>> commit 02266b8a383e lib/crypto: s390/sha3: Add optimized Keccak functions:
>>
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: # module: sha3_kunit
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: 1..21
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: ok 1 test_hash_test_vectors
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: ok 2 test_hash_all_lens_up_to_4096
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: ok 3 test_hash_incremental_updates
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: ok 4 test_hash_buffer_overruns
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: ok 5 test_hash_overlaps
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: ok 6 test_hash_alignment_consistency
>> Nov 04 10:55:37 b3545008.lnxne.boe kernel: ok 7 test_hash_ctx_zeroization
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 8 test_hash_interrupt_context_1
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 9 test_hash_interrupt_context_2
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 10 test_sha3_224_basic
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 11 test_sha3_256_basic
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 12 test_sha3_384_basic
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 13 test_sha3_512_basic
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 14 test_shake128_basic
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 15 test_shake256_basic
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 16 test_shake128_nist
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 17 test_shake256_nist
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 18 test_shake_all_lens_up_to_4096
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 19 test_shake_multiple_squeezes
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 20 test_shake_with_guarded_bufs
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=1: 12 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=16: 211 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=64: 835 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=127: 1557 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=128: 1617 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=200: 1457 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=256: 1830 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=511: 3035 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=512: 3245 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=1024: 5319 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=3173: 9969 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=4096: 11123 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # benchmark_hash: len=16384: 12767 MB/s
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: ok 21 benchmark_hash
>> Nov 04 10:55:38 b3545008.lnxne.boe kernel: # sha3: pass:21 fail:0 skip:0 total:21
>
> Thanks. So the results before and after "lib/crypto: s390/sha3: Add
> optimized one-shot SHA-3 digest functions" are:
>
> Length (bytes)    Before        After
> ==============    ==========    ==========
>              1       12 MB/s       12 MB/s
>             16      211 MB/s       80 MB/s
>             64      835 MB/s      785 MB/s
>            127     1557 MB/s      812 MB/s
>            128     1617 MB/s     1619 MB/s
>            200     1457 MB/s     2319 MB/s
>            256     1830 MB/s     2176 MB/s
>            511     3035 MB/s     4881 MB/s
>            512     3245 MB/s     4968 MB/s
>           1024     5319 MB/s     7565 MB/s
>           3173     9969 MB/s    11909 MB/s
>           4096    11123 MB/s    10378 MB/s
>          16384    12767 MB/s    12273 MB/s
>
> Unfortunately that seems inconclusive. len=200, 256, 511, 512, 1024,
> 3173 improved. But len=16, 64, 127, 4096, 16384 regressed.
>
> I expected the most improvement on short lengths. The fact that some of
> the short lengths actually regressed is concerning.
>
> It's also clear that the Keccak acceleration itself matters far more than
> this additional one-shot optimization, as expected. The generic code
> maxed out at only 259 MB/s for you.
>
> I suggest we hold off on "lib/crypto: s390/sha3: Add optimized one-shot
> SHA-3 digest functions" for now, to avoid the extra maintenance cost
> and opportunity for bugs.
>
> If you can provide more accurate numbers that show it's worthwhile, we
> can reconsider. Maybe set the CPU to a fixed frequency, and run
> sha3_kunit multiple times (triggered via KUnit's debugfs interface)?
>
> - Eric
The focus should be on the small data. Let me see what I can do ...
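To make the small-data picture easier to read, here is a minimal sketch that turns the before/after table quoted above into relative changes. The throughput values are copied from that table; the names RESULTS, relative_change and changes are made up for this illustration and are not part of the kernel tree or the KUnit output.

```python
# Throughput in MB/s copied from the table quoted above:
# (length in bytes, before one-shot patch, after one-shot patch).
RESULTS = [
    (1, 12, 12), (16, 211, 80), (64, 835, 785), (127, 1557, 812),
    (128, 1617, 1619), (200, 1457, 2319), (256, 1830, 2176),
    (511, 3035, 4881), (512, 3245, 4968), (1024, 5319, 7565),
    (3173, 9969, 11909), (4096, 11123, 10378), (16384, 12767, 12273),
]


def relative_change(before, after):
    """Return the change from 'before' to 'after' in percent."""
    return (after - before) / before * 100.0


def changes(results):
    """Map each message length to its relative throughput change."""
    return {length: relative_change(b, a) for length, b, a in results}


if __name__ == "__main__":
    for length, pct in changes(RESULTS).items():
        verdict = "improved" if pct > 0 else "regressed" if pct < 0 else "same"
        print(f"len={length:>5}: {pct:+7.1f} %  ({verdict})")
```

A single run per commit gives only one sample per length, so these percentages say nothing about run-to-run noise.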
I used a z/VM guest for this. Using an LPAR instead, together with some
CPU pinning, may be an option. I would also run more tests so that I can
estimate a Gaussian distribution. However, not within the next few days.
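A rough sketch of how repeated runs could be summarised, assuming the kernel log of several sha3_kunit runs has been saved as text with one unwrapped "# benchmark_hash: len=N: X MB/s" line per measurement. The function names collect and mean_and_stdev are hypothetical, and a mean with sample standard deviation is only one possible way to describe the spread.

```python
import re
import statistics
from collections import defaultdict

# Matches lines such as "... kernel: # benchmark_hash: len=16: 211 MB/s".
BENCH_RE = re.compile(r"benchmark_hash:\s*len=(\d+):\s*(\d+)\s*MB/s")


def collect(log_text):
    """Group all throughput samples (MB/s) in the log by message length."""
    samples = defaultdict(list)
    for match in BENCH_RE.finditer(log_text):
        samples[int(match.group(1))].append(int(match.group(2)))
    return dict(samples)


def mean_and_stdev(samples):
    """Return {length: (mean, sample standard deviation)} per length.

    With a single sample the standard deviation is reported as 0.0,
    because it cannot be estimated from one value.
    """
    summary = {}
    for length, values in sorted(samples.items()):
        mean = statistics.mean(values)
        stdev = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[length] = (mean, stdev)
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python3 this_script.py < saved_kernel_log.txt
    for length, (mean, stdev) in mean_and_stdev(collect(sys.stdin.read())).items():
        print(f"len={length:>5}: mean {mean:8.1f} MB/s, stdev {stdev:7.1f} MB/s")
```

If the spread for a given length is larger than the difference between the two commits, the comparison for that length is not meaningful yet.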
So I agree with you: let's hold back the one-shot optimization.
Harald Freudenberger