qemu-devel.nongnu.org archive mirror
From: Richard Henderson <richard.henderson@linaro.org>
To: Ard Biesheuvel <ardb@kernel.org>, qemu-arm@nongnu.org
Cc: qemu-devel@nongnu.org, "Peter Maydell" <peter.maydell@linaro.org>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>
Subject: Re: [PATCH v2 0/2] Implement AES on ARM using x86 instructions and vv
Date: Wed, 31 May 2023 09:33:03 -0700
Message-ID: <722d7683-80b4-bb23-3ca2-77f8de23b801@linaro.org>
In-Reply-To: <20230531112239.3164777-1-ardb@kernel.org>

On 5/31/23 04:22, Ard Biesheuvel wrote:
> Use the host native instructions to implement the AES instructions
> exposed by the emulated target. The mapping is not 1:1, so it requires a
> bit of fiddling to get the right result.
> 
> This is still RFC material - the current approach feels too ad-hoc, but
> given the non-1:1 correspondence, doing a proper abstraction is rather
> difficult.
> 
> Changes since v1/RFC:
> - add second patch to implement x86 AES instructions on ARM hosts - this
>    helps illustrate what an abstraction should cover.
> - use cpuinfo framework to detect host support for AES instructions.
> - implement ARM aesimc using x86 aesimc directly
> 
> Patch #1 produces a 1.5-2x speedup in tests using the Linux kernel's
> tcrypt benchmark (mode=500)
> 
> Patch #2 produces a 2-3x speedup. The discrepancy is most likely due to
> the fact that ARM uses two instructions to implement a single AES round,
> whereas x86 only uses one.

Thanks.  I spent some time yesterday looking at this with an encrypted disk test case, and
could only measure 0.6% and 0.5% total overhead for decrypt and encrypt respectively.

> As for the design of an abstraction: I imagine we could introduce a
> host/aes.h API that implements some building blocks that the TCG helper
> implementation could use.

Indeed.  I was considering interfaces like

/* Perform SubBytes + ShiftRows on state. */
Int128 aesenc_SB_SR(Int128 state);

/* Perform MixColumns on state. */
Int128 aesenc_MC(Int128 state);

/* Perform SubBytes + ShiftRows + MixColumns on state. */
Int128 aesenc_SB_SR_MC(Int128 state);

/* Perform SubBytes + ShiftRows + MixColumns + AddRoundKey. */
Int128 aesenc_SB_SR_MC_AK(Int128 state, Int128 roundkey);

and so forth for aesdec as well.  All but aesenc_MC should be implementable on x86 and 
Power7, and all of them on aarch64.
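
To make the intended semantics of those building blocks concrete, here is a rough portable
sketch.  It is only an illustration, not a proposed implementation: AESStateRef is a
hypothetical stand-in for Int128 (plain bytes in FIPS-197 order, so no host endianness is
involved), the ref_ prefix and the helper names are made up, and the S-box is computed from
its definition rather than taken from a table to keep the example short.

```c
#include <stdint.h>

/* Hypothetical stand-in for Int128: 16 state bytes in FIPS-197 order,
 * i.e. byte 4*c + r holds row r of column c. */
typedef struct { uint8_t b[16]; } AESStateRef;

/* Multiply by x in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_xtime(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
}

/* General GF(2^8) multiply, shift-and-add. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b) {
        if (b & 1) {
            r ^= a;
        }
        a = gf_xtime(a);
        b >>= 1;
    }
    return r;
}

/* S-box from its definition: field inverse (a^254), then the affine map. */
static uint8_t ref_sbox(uint8_t a)
{
    uint8_t inv = 1, s;
    int i;
    for (i = 0; i < 254; i++) {
        inv = gf_mul(inv, a);          /* a^254 == a^-1; 0 stays 0 */
    }
    s = inv;
    for (i = 1; i <= 4; i++) {
        s ^= (uint8_t)((inv << i) | (inv >> (8 - i)));
    }
    return s ^ 0x63;
}

/* SubBytes + ShiftRows: row r is rotated left by r columns. */
AESStateRef ref_aesenc_SB_SR(AESStateRef st)
{
    AESStateRef out;
    int c, r;
    for (c = 0; c < 4; c++) {
        for (r = 0; r < 4; r++) {
            out.b[4 * c + r] = ref_sbox(st.b[4 * ((c + r) & 3) + r]);
        }
    }
    return out;
}

/* MixColumns: each column is multiplied by the fixed {2,3,1,1} matrix. */
AESStateRef ref_aesenc_MC(AESStateRef st)
{
    AESStateRef out;
    int c, r;
    for (c = 0; c < 4; c++) {
        const uint8_t *a = &st.b[4 * c];
        uint8_t t = a[0] ^ a[1] ^ a[2] ^ a[3];
        for (r = 0; r < 4; r++) {
            out.b[4 * c + r] = a[r] ^ t ^ gf_xtime(a[r] ^ a[(r + 1) & 3]);
        }
    }
    return out;
}

/* SubBytes + ShiftRows + MixColumns, composed from the two above. */
AESStateRef ref_aesenc_SB_SR_MC(AESStateRef st)
{
    return ref_aesenc_MC(ref_aesenc_SB_SR(st));
}

/* Full round: SubBytes + ShiftRows + MixColumns + AddRoundKey. */
AESStateRef ref_aesenc_SB_SR_MC_AK(AESStateRef st, AESStateRef rk)
{
    AESStateRef out = ref_aesenc_SB_SR_MC(st);
    int i;
    for (i = 0; i < 16; i++) {
        out.b[i] ^= rk.b[i];
    }
    return out;
}
```

A host-accelerated version would presumably keep these semantics and replace the bodies
with whichever native instruction covers the same combination of steps.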

> I suppose it really depends on whether there is a third host
> architecture that could make use of this, and how its AES instructions
> map onto the primitive AES ops above.

There is Power6 (v{,n}cipher{,last}) and RISC-V Zkn (aes64{es,esm,ds,dsm,im})

Where I got hung up yesterday was understanding the different endian requirements of x86 vs Power.

ppc64:

     /* VSX registers 32..34 alias the Altivec registers v0..v2. */
     asm("lxvd2x 32,0,%1;"
         "lxvd2x 33,0,%2;"
         "vcipher 0,0,1;"
         "stxvd2x 32,0,%0"
         : : "r"(o), "r"(i), "r"(k) : "memory", "v0", "v1", "v2");

ppc64le:

     unsigned char le[16] = {8,9,10,11,12,13,14,15,0,1,2,3,4,5,6,7};
     asm("lxvd2x 32,0,%1;"
         "lxvd2x 33,0,%2;"
         "lxvd2x 34,0,%3;"
         "vperm 0,0,0,2;"
         "vperm 1,1,1,2;"
         "vcipher 0,0,1;"
         "vperm 0,0,0,2;"
         "stxvd2x 32,0,%0"
         : : "r"(o), "r"(i), "r"(k), "r"(le) : "memory", "v0", "v1", "v2");

There are also differences in their AES_Te*-based C routines, which made me wonder
if we are handling host endianness differences correctly in emulation right now.  I think
I should most definitely add some generic-ish tests for this...
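
As a small illustration of the kind of host-independent building block such tests could
use: the sketch below reads the same bytes as a little-endian and as a big-endian 64-bit
value using only shifts, so the result does not depend on the host.  The ref_ helper names
are hypothetical and are not existing QEMU functions.

```c
#include <stdint.h>

/* Read 8 bytes as a little-endian value: p[0] is the least significant. */
static uint64_t ref_ld_le64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;
    for (i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Read 8 bytes as a big-endian value: p[0] is the most significant. */
static uint64_t ref_ld_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Reverse the byte order of a 64-bit value. */
static uint64_t ref_bswap64(uint64_t v)
{
    uint64_t r = 0;
    int i;
    for (i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}
```

For the bytes 00 01 ... 07, the two readings differ by exactly a byte swap, which is the
per-doubleword part of the difference a test vector would have to account for.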


r~

