Re: [PATCH v1 1/1] lib/crypto: tests: Add KUnit tests for AES

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Holger Dengler <dengler@linux.ibm.com>
To: David Laight <david.laight.linux@gmail.com>,
	Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>,
	"Jason A . Donenfeld" <Jason@zx2c4.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Harald Freudenberger <freude@linux.ibm.com>,
	linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: Re: [PATCH v1 1/1] lib/crypto: tests: Add KUnit tests for AES
Date: Fri, 16 Jan 2026 18:31:58 +0100	[thread overview]
Message-ID: <389595e9-e13a-42e3-b0ff-9ca0dd3effe3@linux.ibm.com> (raw)
In-Reply-To: <20260115220558.25390c0e@pumpkin>

Hi David,

On 15/01/2026 23:05, David Laight wrote:
> On Thu, 15 Jan 2026 12:43:32 -0800
> Eric Biggers <ebiggers@kernel.org> wrote:
>>> +static void benchmark_aes(struct kunit *test, const struct aes_testvector *tv)
>>> +{
>>> +	const size_t num_iters = 10000000;  
>>
>> 10000000 iterations is too many.  That's 160 MB of data in each
>> direction per AES key length.  Some CPUs without AES instructions can do
>> only ~20 MB AES per second.  In that case, this benchmark would take 16
>> seconds to run per AES key length, for 48 seconds total.
> 
> Probably best to first do a test that would take a 'reasonable' time
> on a cpu without AES. If that is 'very fast' then do a longer test
> to get more accuracy on a faster implementation.
> 
>>
>> hash-test-template.h and crc_kunit.c use 10000000 / (len + 128)
>> iterations.  That would be 69444 in this case (considering len=16),
>> which is less than 1% of the iterations you've used.  Choosing a number
>> similar to that would seem more appropriate.
>>
>> Ultimately these are just made-up numbers.  But I think we should aim
>> for the benchmark test in each KUnit test suite to take less than a
>> second or so.  The existing tests roughly achieve that, whereas it seems
>> this one can go over it by quite a bit due to the 10000000 iterations.
> 
> Even 1 second is a long time, you end up getting multiple interrupts included.
> I think a lot of these benchmarks are far too long.
> Timing differences less that 1% can be created by scheduling noise.
> Running a test that takes 200 'quanta' of the timer used has an
> error margin of under 1% (100 quanta might be enough).
> While the kernel timestamps have a resolution of 1ns the accuracy is worse.
> If you run a test for even just 10us you ought to get reasonable accuracy
> with a reasonable hope of not getting an interrupt.
> Run the test 10 times and report the fastest value.
> 
> You'll then find the results are entirely unstable because the cpu clock
> frequency keeps changing.
> And long enough buffers can get limited by the d-cache loads.
> 
> For something as slow as AES you can count the number of cpu cycles for
> a single call and get a reasonably consistent figure.
> That will tell you whether the loop is running at the speed you might
> expect it to run at.
> (You need to use data dependencies between the start/end 'times' and
> start/end of the code being timed, x86 lfence/mfence are too slow and
> can hide the 'setup' cost of some instructions.)

Thanks a lot for your feedback. I tried a few of your ideas and it turns out,
that they work quite well. First of all, with a single-block aes
encrypt/decrypt in our hardware (CPACF), we're very close to the resolution of
our CPU clock.

Disclaimer: The encryption/decryption of one block takes ~32ns (~500MB/s).
These numbers should be taken with some care, as on s390 the operating system
always runs virtualized. In my test environment, I also only have access to a
machine with shared CPUs, so there might be some negative impact from other
workload.

The benchmark loops for 100 iterations now without any warm-up. In each
iteration, I measure a single aes_encrypt()/aes_decrypt() call. The lowest
value of these measurements is takes as the value for the bandwidth
calculations. Although it is not necessary in my environment, I'm doing all
iterations with preemption disabled. I think, that this might help on other
platforms to reduce the jitter of the measurement values.

The removal of the warm-up does not have any impact on the numbers.

Just for information: I also tried to measure the cycles with the same
results. The minimal measurement value of a few iterations is much more stable
that the average over a larger number of iterations.

I also did some tests with IRQs disabled (instead of only preemption), but the
numbers stay the same. So I think, it is save enough to stay with disables
preemption.

I also tried you idea, first to do a few measurements and if they are fast
enough, increase the number of iterations. But it turns out, that this it not
really necessary (at least in my env). But I can add this, it it makes sense
on other platforms.

-- 
Mit freundlichen Grüßen / Kind regards
Holger Dengler
--
IBM Systems, Linux on IBM Z Development
dengler@linux.ibm.com

next prev parent reply	other threads:[~2026-01-16 17:32 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-15 18:38 [PATCH v1 0/1] lib/crypto: tests: KUnit test-suite for AES Holger Dengler
2026-01-15 18:38 ` [PATCH v1 1/1] lib/crypto: tests: Add KUnit tests " Holger Dengler
2026-01-15 20:43   ` Eric Biggers
2026-01-15 21:51     ` Holger Dengler
2026-01-15 21:58       ` Eric Biggers
2026-01-15 22:05     ` David Laight
2026-01-16 17:31       ` Holger Dengler [this message]
2026-01-16 18:37         ` David Laight
2026-01-16 19:20           ` Holger Dengler
2026-01-16 19:44             ` Eric Biggers
2026-01-16 20:55               ` Holger Dengler
2026-01-16 22:30                 ` David Laight
2026-01-17 23:59                   ` Eric Biggers
2026-01-16  0:25   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=389595e9-e13a-42e3-b0ff-9ca0dd3effe3@linux.ibm.com \
    --to=dengler@linux.ibm.com \
    --cc=Jason@zx2c4.com \
    --cc=ardb@kernel.org \
    --cc=david.laight.linux@gmail.com \
    --cc=ebiggers@kernel.org \
    --cc=freude@linux.ibm.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox