* Kernel crypto API: cryptoperf performance measurement
From: Stephan Mueller @ 2014-08-17 15:55 UTC
  To: linux-crypto; +Cc: linux-kernel, Herbert Xu
Hi,
while playing around with the kernel crypto API, I implemented a performance
measurement tool kit for the various kernel crypto API cipher types. The
cryptoperf tool kit is provided in [1].
Comments are welcome.
In general, the results are as expected, i.e. the assembler implementations 
are faster than the pure C implementations. However, there are curious results 
which probably should be checked by the maintainers of the respective ciphers 
(hoping that my tool works correctly ;-) ):
ablkcipher
----------
- cryptd is slower by a factor of 10 across the board
blkcipher
---------
- Blowfish x86_64 assembler together with the generic C block chaining modes 
is significantly slower than Blowfish implemented in generic C
- Blowfish x86_64 assembler in ECB is significantly slower than generic C 
Blowfish ECB
- Serpent assembler implementations are not significantly faster than generic 
C implementations
- AES-NI ECB, LRW, CTR is significantly slower than AES i586 assembler.
- AES-NI ECB, LRW, CTR is not significantly faster than AES generic C
rng
---
- The ANSI X9.31 RNG seems to work massively faster than the underlying AES 
cipher (by about a factor of 5). I am unsure about the cause of this.
Caveat
------
Please note that there is one small error, documented in the TODO file, that
I am unsure how to fix.
[1] http://www.chronox.de/cryptoperf.html
-- 
Ciao
Stephan
* Re: Kernel crypto API: cryptoperf performance measurement
From: Jussi Kivilinna @ 2014-08-19  7:17 UTC
  To: Stephan Mueller, linux-crypto; +Cc: linux-kernel, Herbert Xu
Hello,
On 2014-08-17 18:55, Stephan Mueller wrote:
> Hi,
> 
> during playing around with the kernel crypto API, I implemented a performance 
> measurement tool kit for the various kernel crypto API cipher types. The 
> cryptoperf tool kit is provided in [1].
> 
> Comments are welcome.
Your results are quite slow compared to, for example "cryptsetup
benchmark", which uses kernel crypto from userspace.
With Intel i5-2450M (turbo enabled), I get:
#  Algorithm | Key |  Encryption |  Decryption
     aes-cbc   128b   524,0 MiB/s  11909,1 MiB/s
 serpent-cbc   128b    60,9 MiB/s   219,4 MiB/s
 twofish-cbc   128b   143,4 MiB/s   240,3 MiB/s
     aes-cbc   256b   330,4 MiB/s  1242,8 MiB/s
 serpent-cbc   256b    66,1 MiB/s   220,3 MiB/s
 twofish-cbc   256b   143,5 MiB/s   221,8 MiB/s
     aes-xts   256b  1268,7 MiB/s  4193,0 MiB/s
 serpent-xts   256b   234,8 MiB/s   224,6 MiB/s
 twofish-xts   256b   253,5 MiB/s   254,7 MiB/s
     aes-xts   512b  2535,0 MiB/s  2945,0 MiB/s
 serpent-xts   512b   274,2 MiB/s   242,3 MiB/s
 twofish-xts   512b   250,0 MiB/s   245,8 MiB/s
> 
> In general, the results are as expected, i.e. the assembler implementations 
> are faster than the pure C implementations. However, there are curious results 
> which probably should be checked by the maintainers of the respective ciphers 
> (hoping that my tool works correctly ;-) ):
> 
> ablkcipher
> ----------
> 
> - cryptd is slower by factor 10 across the board
> 
> blkcipher
> ---------
> 
> - Blowfish x86_64 assembler together with the generic C block chaining modes 
> is significantly slower than Blowfish implemented in generic C
> 
> - Blowfish x86_64 assembler in ECB is significantly slower than generic C 
> Blowfish ECB
> 
> - Serpent assembler implementations are not significantly faster than generic 
> C implementations
> 
> - AES-NI ECB, LRW, CTR is significantly slower than AES i586 assembler.
> 
> - AES-NI ECB, LRW, CTR is not significantly faster than AES generic C
> 
Quite a few assembly implementations get their speed-up from processing
multiple cipher blocks in parallel, which only the modes of operation that
allow it can exploit (CTR, XTS, LRW, CBC(dec)). For small buffer sizes, these
implementations fall back to the non-parallel implementation of the cipher.
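
A minimal sketch of why CBC decryption is on that list while CBC encryption
is not: each decrypted block depends only on two ciphertext blocks, so blocks
can be handled in any order (and therefore in parallel), whereas encryption
has to chain block by block. Note that toy_encrypt_block/toy_decrypt_block
are made-up stand-ins for illustration, not AES and not secure, and this is
not how the kernel implementations are written.

```python
# Toy illustration only: the "cipher" below is NOT AES and is not secure.
BLOCK = 16  # block size in bytes, as for AES, Serpent and Twofish


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in cipher: XOR with the key, then rotate left by one byte."""
    x = xor(block, key)
    return x[1:] + x[:1]


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right by one byte, then XOR with the key."""
    x = block[-1:] + block[:-1]
    return xor(x, key)


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Sequential by construction: block i needs the ciphertext of block i-1."""
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt_any_order(key: bytes, iv: bytes, ciphertext: bytes, order) -> bytes:
    """Each block needs only ciphertext blocks i and i-1, so any order works."""
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    plain = [b""] * len(blocks)
    for i in order:
        prev = iv if i == 0 else blocks[i - 1]
        plain[i] = xor(toy_decrypt_block(key, blocks[i]), prev)
    return b"".join(plain)
```

Decrypting four blocks in the order 3, 1, 0, 2 gives the same plaintext as
decrypting them in order, which is the property a multi-block assembler
routine relies on. With only one or two blocks per request there is nothing
to process side by side.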
-Jussi
> rng
> ---
> 
> - The ANSI X9.31 RNG seems to work massively faster than the underlying AES 
> cipher (by about a factor of 5). I am unsure about the cause of this.
> 
> 
> Caveat
> ------
> 
> Please note that there is one small error which I am unsure how to fix it as 
> documented in the TODO file.
> 
> [1] http://www.chronox.de/cryptoperf.html
> 
* Re: Kernel crypto API: cryptoperf performance measurement
From: Stephan Mueller @ 2014-08-19 18:23 UTC
  To: Jussi Kivilinna; +Cc: linux-crypto, linux-kernel, Herbert Xu
On Tuesday, 19 August 2014, 10:17:36, Jussi Kivilinna wrote:
Hi Jussi,
> Hello,
> 
> On 2014-08-17 18:55, Stephan Mueller wrote:
> > Hi,
> > 
> > during playing around with the kernel crypto API, I implemented a
> > performance measurement tool kit for the various kernel crypto API cipher
> > types. The cryptoperf tool kit is provided in [1].
> > 
> > Comments are welcome.
> 
> Your results are quite slow compared to, for example "cryptsetup
> benchmark", which uses kernel crypto from userspace.
> 
> With Intel i5-2450M (turbo enabled), I get:
> 
> #  Algorithm | Key |  Encryption |  Decryption
>      aes-cbc   128b   524,0 MiB/s  11909,1 MiB/s
>  serpent-cbc   128b    60,9 MiB/s   219,4 MiB/s
>  twofish-cbc   128b   143,4 MiB/s   240,3 MiB/s
>      aes-cbc   256b   330,4 MiB/s  1242,8 MiB/s
>  serpent-cbc   256b    66,1 MiB/s   220,3 MiB/s
>  twofish-cbc   256b   143,5 MiB/s   221,8 MiB/s
>      aes-xts   256b  1268,7 MiB/s  4193,0 MiB/s
>  serpent-xts   256b   234,8 MiB/s   224,6 MiB/s
>  twofish-xts   256b   253,5 MiB/s   254,7 MiB/s
>      aes-xts   512b  2535,0 MiB/s  2945,0 MiB/s
>  serpent-xts   512b   274,2 MiB/s   242,3 MiB/s
>  twofish-xts   512b   250,0 MiB/s   245,8 MiB/s
One to four GB per second for XTS? 12 GB per second for AES CBC? Somehow that 
does not sound right.
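
As a rough check of those magnitudes, the sketch below parses one line of the
listing above (decimal comma included) and converts MiB/s to GB/s. The
function names are hypothetical, and the line format is assumed from the
listing as quoted here, not taken from the cryptsetup sources.

```python
import re

# Assumed line layout: "<cipher>-<mode>  <bits>b  <enc> MiB/s  <dec> MiB/s"
LINE_RE = re.compile(
    r"^\s*(\S+)\s+(\d+)b\s+([\d,]+)\s+MiB/s\s+([\d,]+)\s+MiB/s\s*$"
)


def parse_benchmark_line(line: str):
    """Return (algorithm, key_bits, enc_mib_s, dec_mib_s), or None if no match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    algo, bits, enc, dec = m.groups()
    # The listing uses a decimal comma, so swap it for a dot before converting.
    return algo, int(bits), float(enc.replace(",", ".")), float(dec.replace(",", "."))


def mib_s_to_gb_s(mib_per_s: float) -> float:
    """Convert MiB/s (2**20 bytes) to GB/s (10**9 bytes)."""
    return mib_per_s * 2**20 / 1e9


if __name__ == "__main__":
    row = parse_benchmark_line("     aes-cbc   128b   524,0 MiB/s  11909,1 MiB/s")
    print(row, round(mib_s_to_gb_s(row[3]), 2), "GB/s")
```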
> 
> > In general, the results are as expected, i.e. the assembler
> > implementations
> > are faster than the pure C implementations. However, there are curious
> > results which probably should be checked by the maintainers of the
> > respective ciphers (hoping that my tool works correctly ;-) ):
> > 
> > ablkcipher
> > ----------
> > 
> > - cryptd is slower by factor 10 across the board
> > 
> > blkcipher
> > ---------
> > 
> > - Blowfish x86_64 assembler together with the generic C block chaining
> > modes is significantly slower than Blowfish implemented in generic C
> > 
> > - Blowfish x86_64 assembler in ECB is significantly slower than generic C
> > Blowfish ECB
> > 
> > - Serpent assembler implementations are not significantly faster than
> > generic C implementations
> > 
> > - AES-NI ECB, LRW, CTR is significantly slower than AES i586 assembler.
> > 
> > - AES-NI ECB, LRW, CTR is not significantly faster than AES generic C
> 
> Quite a few assembly implementations get their speed-up from processing
> multiple cipher blocks in parallel, which only the modes of operation that
> allow it can exploit (CTR, XTS, LRW, CBC(dec)). For small buffer sizes, these
> implementations fall back to the non-parallel implementation of the cipher.
Thanks for the pointer. I will rerun my tests with buffers that are a multiple
of the block size (e.g. 1024 blocks).
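
A minimal sketch of how such buffer lengths could be derived. The table and
function names are hypothetical and not part of cryptoperf; the block sizes
are the standard ones for these ciphers (16 bytes for AES, Serpent and
Twofish, 8 bytes for Blowfish).

```python
# Cipher block sizes in bytes (standard values for these algorithms).
BLOCK_SIZES = {"aes": 16, "serpent": 16, "twofish": 16, "blowfish": 8}


def buffer_len(cipher: str, nblocks: int = 1024) -> int:
    """Length in bytes of a test buffer holding exactly nblocks cipher blocks."""
    if nblocks < 1:
        raise ValueError("nblocks must be at least 1")
    return BLOCK_SIZES[cipher] * nblocks


if __name__ == "__main__":
    for name in BLOCK_SIZES:
        print(name, buffer_len(name))
```

With 1024 blocks per request, a 16-byte block cipher gets 16 KiB per call,
which should be enough for the multi-block code paths to be used.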
-- 
Ciao
Stephan
* Re: Kernel crypto API: cryptoperf performance measurement
From: Jussi Kivilinna @ 2014-08-20 13:25 UTC
  To: Stephan Mueller; +Cc: linux-crypto, linux-kernel, Herbert Xu
Hello,
On 2014-08-19 21:23, Stephan Mueller wrote:
> On Tuesday, 19 August 2014, 10:17:36, Jussi Kivilinna wrote:
> 
> Hi Jussi,
> 
>> Hello,
>>
>> On 2014-08-17 18:55, Stephan Mueller wrote:
>>> Hi,
>>>
>>> during playing around with the kernel crypto API, I implemented a
>>> performance measurement tool kit for the various kernel crypto API cipher
>>> types. The cryptoperf tool kit is provided in [1].
>>>
>>> Comments are welcome.
>>
>> Your results are quite slow compared to, for example "cryptsetup
>> benchmark", which uses kernel crypto from userspace.
>>
>> With Intel i5-2450M (turbo enabled), I get:
>>
>> #  Algorithm | Key |  Encryption |  Decryption
>>      aes-cbc   128b   524,0 MiB/s  11909,1 MiB/s
>>  serpent-cbc   128b    60,9 MiB/s   219,4 MiB/s
>>  twofish-cbc   128b   143,4 MiB/s   240,3 MiB/s
>>      aes-cbc   256b   330,4 MiB/s  1242,8 MiB/s
>>  serpent-cbc   256b    66,1 MiB/s   220,3 MiB/s
>>  twofish-cbc   256b   143,5 MiB/s   221,8 MiB/s
>>      aes-xts   256b  1268,7 MiB/s  4193,0 MiB/s
>>  serpent-xts   256b   234,8 MiB/s   224,6 MiB/s
>>  twofish-xts   256b   253,5 MiB/s   254,7 MiB/s
>>      aes-xts   512b  2535,0 MiB/s  2945,0 MiB/s
>>  serpent-xts   512b   274,2 MiB/s   242,3 MiB/s
>>  twofish-xts   512b   250,0 MiB/s   245,8 MiB/s
> 
> One to four GB per second for XTS? 12 GB per second for AES CBC? Somehow that 
> does not sound right.
Agreed, those do not look correct... I wonder what happened there. On a
new run, I got saner results:
#  Algorithm | Key |  Encryption |  Decryption
     aes-cbc   128b   139,1 MiB/s  1713,6 MiB/s
 serpent-cbc   128b    62,2 MiB/s   232,9 MiB/s
 twofish-cbc   128b   116,3 MiB/s   243,7 MiB/s
     aes-cbc   256b   375,1 MiB/s  1159,4 MiB/s
 serpent-cbc   256b    62,1 MiB/s   214,9 MiB/s
 twofish-cbc   256b   139,3 MiB/s   217,5 MiB/s
     aes-xts   256b  1296,4 MiB/s  1272,5 MiB/s
 serpent-xts   256b   283,3 MiB/s   275,6 MiB/s
 twofish-xts   256b   294,8 MiB/s   299,3 MiB/s
     aes-xts   512b   984,3 MiB/s   991,1 MiB/s
 serpent-xts   512b   227,7 MiB/s   220,6 MiB/s
 twofish-xts   512b   220,6 MiB/s   220,2 MiB/s
-Jussi
* Re: Kernel crypto API: cryptoperf performance measurement
From: Milan Broz @ 2014-08-20 18:14 UTC
  To: Jussi Kivilinna; +Cc: Stephan Mueller, linux-crypto, linux-kernel, Herbert Xu
On 08/20/2014 03:25 PM, Jussi Kivilinna wrote:
>> One to four GB per second for XTS? 12 GB per second for AES CBC? Somehow that 
>> does not sound right.
> 
> Agreed, those do not look correct... I wonder what happened there. On
> new run, I got more sane results:
Which cryptsetup version are you using?
There was a bug in that test on fast machines (fixed in 1.6.3, I hope :)
But anyway, it is not intended as a rigorous speed test;
it was intended for comparing cipher speeds on a particular machine.
The test basically tries to encrypt a 1 MB block (or a multiple of this
if the machine is too fast). It all runs through the kernel's userspace
crypto API interface.
(Real FDE is always slower because it operates on 512-byte blocks.)
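
Roughly, the idea can be sketched as below. This is only an illustration of
the "grow the buffer if the machine is too fast" loop, not the cryptsetup
code: the function name, the thresholds and the doubling strategy are
assumptions, and SHA-256 from hashlib stands in for the cipher so the sketch
does not need the kernel interface.

```python
import hashlib
import time

MIB = 1024 * 1024


def measure_mib_per_s(work, min_seconds=0.05, start_len=MIB, max_len=64 * MIB):
    """Time work(buf) on a growing buffer; return (MiB/s, buffer length used).

    The buffer starts at 1 MiB and doubles until one call takes at least
    min_seconds, so that fast machines are not measured over a too-short run.
    """
    length = start_len
    while True:
        buf = bytes(length)  # zero-filled test data
        t0 = time.perf_counter()
        work(buf)
        # Guard against a zero reading from a very coarse timer.
        elapsed = max(time.perf_counter() - t0, 1e-9)
        if elapsed >= min_seconds or length >= max_len:
            return (length / MIB) / elapsed, length
        length *= 2


if __name__ == "__main__":
    # Stand-in workload (an assumption): hashing instead of encryption.
    rate, used = measure_mib_per_s(lambda b: hashlib.sha256(b).digest())
    print(f"{rate:.1f} MiB/s measured over {used // MIB} MiB")
```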
Milan
> 
> #  Algorithm | Key |  Encryption |  Decryption
>      aes-cbc   128b   139,1 MiB/s  1713,6 MiB/s
>  serpent-cbc   128b    62,2 MiB/s   232,9 MiB/s
>  twofish-cbc   128b   116,3 MiB/s   243,7 MiB/s
>      aes-cbc   256b   375,1 MiB/s  1159,4 MiB/s
>  serpent-cbc   256b    62,1 MiB/s   214,9 MiB/s
>  twofish-cbc   256b   139,3 MiB/s   217,5 MiB/s
>      aes-xts   256b  1296,4 MiB/s  1272,5 MiB/s
>  serpent-xts   256b   283,3 MiB/s   275,6 MiB/s
>  twofish-xts   256b   294,8 MiB/s   299,3 MiB/s
>      aes-xts   512b   984,3 MiB/s   991,1 MiB/s
>  serpent-xts   512b   227,7 MiB/s   220,6 MiB/s
>  twofish-xts   512b   220,6 MiB/s   220,2 MiB/s
> 
> -Jussi
> 
* Re: Kernel crypto API: cryptoperf performance measurement
From: Jussi Kivilinna @ 2014-08-21  7:38 UTC
  To: Milan Broz; +Cc: Stephan Mueller, linux-crypto, linux-kernel, Herbert Xu
On 2014-08-20 21:14, Milan Broz wrote:
> On 08/20/2014 03:25 PM, Jussi Kivilinna wrote:
>>> One to four GB per second for XTS? 12 GB per second for AES CBC? Somehow that 
>>> does not sound right.
>>
>> Agreed, those do not look correct... I wonder what happened there. On
>> new run, I got more sane results:
> 
> Which cryptsetup version are you using?
> 
> There was a bug in that test on fast machines (fixed in 1.6.3, I hope :)
I had version 1.6.1 at hand.
> 
> But anyway, it is not intended as rigorous speed test,
> it was intended for comparison of ciphers speed on particular machine.
>
True, but it's a nice, easy test compared to parsing results from
tcrypt speed tests.
-Jussi
> Test basically tries to encrypt 1MB block (or multiple of this
> if machine is too fast). All it runs through kernel userspace crypto API
> interface.
> (Real FDE is always slower because it runs over 512bytes blocks.)
> 
> Milan
> 