qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Ard Biesheuvel <ardb@kernel.org>
To: qemu-arm@nongnu.org
Cc: qemu-devel@nongnu.org, "Ard Biesheuvel" <ardb@kernel.org>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>
Subject: [PATCH v2 0/2] Implement AES on ARM using x86 instructions and vv
Date: Wed, 31 May 2023 13:22:37 +0200	[thread overview]
Message-ID: <20230531112239.3164777-1-ardb@kernel.org> (raw)

Use the host native instructions to implement the AES instructions
exposed by the emulated target. The mapping is not 1:1, so it requires a
bit of fiddling to get the right result.

This is still RFC material - the current approach feels too ad-hoc, but
given the non-1:1 correspondence, doing a proper abstraction is rather
difficult.

Changes since v1/RFC:
- add second patch to implement x86 AES instructions on ARM hosts - this
  helps illustrate what an abstraction should cover.
- use cpuinfo framework to detect host support for AES instructions.
- implement ARM aesimc using x86 aesimc directly

Patch #1 produces a 1.5-2x speedup in tests using the Linux kernel's
tcrypt benchmark (mode=500)

Patch #2 produces a 2-3x speedup. The discrepancy is most likely due to
the fact that ARM uses two instructions to implement a single AES round,
whereas x86 only uses one.

Note that using the ARM intrinsics is fiddly with Clang, as it does not
declare the prototypes unless some builtin CPP macro (__ARM_FEATURE_AES)
is defined, which will be set by the compiler based on the command line
arch/cpu options. However, setting this globally for a compilation unit
is dubious, given that we test cpuinfo for AES support, and only emit
the instructions conditionally. So I used inline asm() instead.

As for the design of an abstraction: I imagine we could introduce a
host/aes.h API that implements some building blocks that the TCG helper
implementation could use.

Quoting from my reply to Richard:

Using the primitive operations defined in the AES paper, we basically
perform the following transformation for n rounds of AES (for n in {10,
12, 14})

for (n-1 rounds) {
  AddRoundKey
  ShiftRows
  SubBytes
  MixColumns
}
AddRoundKey
ShiftRows
SubBytes
AddRoundKey

AddRoundKey is just XOR, but it is incorporated into the instructions
that combine a couple of these steps.

So on x86, we have

aesenc:
  ShiftRows
  SubBytes
  MixColumns
  AddRoundKey

aesenclast:
  ShiftRows
  SubBytes
  AddRoundKey

and on ARM we have

aese:
  AddRoundKey
  ShiftRows
  SubBytes

aesmc:
  MixColumns

So a generic routine that does only ShiftRows+SubBytes could be backed by
x86's aesenclast and ARM's aese, using a NULL round key argument in each
case. Then, it would be up to the TCG helper code for either ARM or x86
to incorporate those routines in the right way.

I suppose it really depends on whether there is a third host
architecture that could make use of this, and how its AES instructions
map onto the primitive AES ops above.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>

Ard Biesheuvel (2):
  target/arm: use x86 intrinsics to implement AES instructions
  target/i386: Implement AES instructions using AArch64 counterparts

 host/include/aarch64/host/cpuinfo.h |  1 +
 host/include/i386/host/cpuinfo.h    |  1 +
 target/arm/tcg/crypto_helper.c      | 37 ++++++++++-
 target/i386/ops_sse.h               | 69 ++++++++++++++++++++
 util/cpuinfo-aarch64.c              |  1 +
 util/cpuinfo-i386.c                 |  1 +
 6 files changed, 107 insertions(+), 3 deletions(-)

-- 
2.39.2



             reply	other threads:[~2023-05-31 11:23 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-31 11:22 Ard Biesheuvel [this message]
2023-05-31 11:22 ` [PATCH v2 1/2] target/arm: use x86 intrinsics to implement AES instructions Ard Biesheuvel
2023-05-31 11:22 ` [PATCH v2 2/2] target/i386: Implement AES instructions using AArch64 counterparts Ard Biesheuvel
2023-05-31 17:13   ` Richard Henderson
2023-05-31 16:33 ` [PATCH v2 0/2] Implement AES on ARM using x86 instructions and vv Richard Henderson
2023-05-31 16:47   ` Ard Biesheuvel
2023-05-31 17:08     ` Richard Henderson
2023-06-01  4:08       ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230531112239.3164777-1-ardb@kernel.org \
    --to=ardb@kernel.org \
    --cc=alex.bennee@linaro.org \
    --cc=f4bug@amsat.org \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).