* [PATCH 00/36] AES library improvements
@ 2026-01-05  5:12 Eric Biggers
  2026-01-05  5:12 ` [PATCH 01/36] crypto: powerpc/aes - Rename struct aes_key Eric Biggers
                   ` (35 more replies)
  0 siblings, 36 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

This series applies to libcrypto-next.  It can also be retrieved from:

    git fetch https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git aes-lib-v1

This series makes three main improvements to the kernel's AES library:

  1. Make it use the kernel's existing architecture-optimized AES code,
     including AES instructions, when available.  Previously, only the
     traditional crypto API gave access to the optimized AES code.
     (As a reminder, AES instructions typically make AES over 10 times
     as fast as the generic code.  They also make it constant-time.)

  2. Support preparing an AES key for only the forward direction of the
     block cipher, using about half as much memory.  This is a helpful
     optimization for many common AES modes of operation.  It also helps
     keep structs small enough to be allocated on the stack, especially
     considering potential future library APIs for AES modes.

  3. Replace the library's generic AES implementation with a much faster
     one that is almost as fast as "aes-generic", while still keeping
     the table size reasonably small and maintaining some constant-time
     hardening.  This allows removing "aes-generic", unifying the two
     generic AES implementations currently in the kernel tree.

(1) and (2) end up being interrelated: the existing
'struct crypto_aes_ctx' does not work for either one (in general).
Thus, this series reworks the AES library to be based around new data
types 'struct aes_key' and 'struct aes_enckey'.
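
To make the intended usage concrete, a caller that needs only the
forward direction would do roughly the following (a sketch, written
with the final function names from the end of the series; raw_key, in,
and out are made-up buffers):

	struct aes_enckey key;

	/* Fails only on an invalid key length. */
	if (aes_prepareenckey(&key, raw_key, AES_KEYSIZE_256))
		return -EINVAL;

	aes_encrypt(&key, out, in);	/* encrypt one 16-byte block */

	/* The caller zeroizes the prepared key when no longer needed. */
	memzero_explicit(&key, sizeof(key));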

As has been the case for other algorithms, to achieve (1) without
duplicating the architecture-optimized code, it had to be moved into
lib/crypto/ rather than copied.  To allow actually removing the
arch-specific crypto_cipher "aes" algorithms, a consolidated "aes-lib"
crypto_cipher algorithm which simply wraps the library is also added.
That is most easily done by having it replace "aes-generic" as well, so
this series does that too.  (That's another reason for doing (3) at the
same time.)

As usual, care is taken to support all the existing arch-optimized code.
This makes it possible for users of the traditional crypto API to switch
to the library API, which is generally much easier to use, without being
concerned about performance regressions.

That being said, this series only deals with the bare (single-block) AES
library.  Future patchsets are expected to build on this work to provide
architecture-optimized library APIs for specific AES modes of operation.

Eric Biggers (36):
  crypto: powerpc/aes - Rename struct aes_key
  lib/crypto: aes: Introduce improved AES library
  crypto: arm/aes-neonbs - Use AES library for single blocks
  crypto: arm/aes - Switch to aes_enc_tab[] and aes_dec_tab[]
  crypto: arm64/aes - Switch to aes_enc_tab[] and aes_dec_tab[]
  crypto: arm64/aes - Select CRYPTO_LIB_SHA256 from correct places
  crypto: aegis - Switch from crypto_ft_tab[] to aes_enc_tab[]
  crypto: aes - Remove aes-fixed-time / CONFIG_CRYPTO_AES_TI
  crypto: aes - Replace aes-generic with wrapper around lib
  lib/crypto: arm/aes: Migrate optimized code into library
  lib/crypto: arm64/aes: Migrate optimized code into library
  lib/crypto: powerpc/aes: Migrate SPE optimized code into library
  lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library
  lib/crypto: riscv/aes: Migrate optimized code into library
  lib/crypto: s390/aes: Migrate optimized code into library
  lib/crypto: sparc/aes: Migrate optimized code into library
  lib/crypto: x86/aes: Add AES-NI optimization
  crypto: x86/aes - Remove the superseded AES-NI crypto_cipher
  Bluetooth: SMP: Use new AES library API
  chelsio: Use new AES library API
  net: phy: mscc: macsec: Use new AES library API
  staging: rtl8723bs: core: Use new AES library API
  crypto: arm/ghash - Use new AES library API
  crypto: arm64/ghash - Use new AES library API
  crypto: x86/aes-gcm - Use new AES library API
  crypto: ccp - Use new AES library API
  crypto: chelsio - Use new AES library API
  crypto: crypto4xx - Use new AES library API
  crypto: drbg - Use new AES library API
  crypto: inside-secure - Use new AES library API
  crypto: omap - Use new AES library API
  lib/crypto: aescfb: Use new AES library API
  lib/crypto: aesgcm: Use new AES library API
  lib/crypto: aes: Remove old AES en/decryption functions
  lib/crypto: aes: Drop "_new" suffix from en/decryption functions
  lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox

 arch/arm/configs/milbeaut_m10v_defconfig      |    1 -
 arch/arm/configs/multi_v7_defconfig           |    2 +-
 arch/arm/configs/omap2plus_defconfig          |    2 +-
 arch/arm/configs/pxa_defconfig                |    2 +-
 arch/arm/crypto/Kconfig                       |   19 -
 arch/arm/crypto/Makefile                      |    2 -
 arch/arm/crypto/aes-cipher-glue.c             |   69 -
 arch/arm/crypto/aes-cipher.h                  |   13 -
 arch/arm/crypto/aes-neonbs-glue.c             |   29 +-
 arch/arm/crypto/ghash-ce-glue.c               |   14 +-
 arch/arm64/crypto/Kconfig                     |   29 +-
 arch/arm64/crypto/Makefile                    |    6 -
 arch/arm64/crypto/aes-ce-ccm-glue.c           |    2 -
 arch/arm64/crypto/aes-ce-glue.c               |  178 ---
 arch/arm64/crypto/aes-ce-setkey.h             |    6 -
 arch/arm64/crypto/aes-cipher-glue.c           |   63 -
 arch/arm64/crypto/aes-glue.c                  |    2 -
 arch/arm64/crypto/ghash-ce-glue.c             |   27 +-
 arch/m68k/configs/amiga_defconfig             |    1 -
 arch/m68k/configs/apollo_defconfig            |    1 -
 arch/m68k/configs/atari_defconfig             |    1 -
 arch/m68k/configs/bvme6000_defconfig          |    1 -
 arch/m68k/configs/hp300_defconfig             |    1 -
 arch/m68k/configs/mac_defconfig               |    1 -
 arch/m68k/configs/multi_defconfig             |    1 -
 arch/m68k/configs/mvme147_defconfig           |    1 -
 arch/m68k/configs/mvme16x_defconfig           |    1 -
 arch/m68k/configs/q40_defconfig               |    1 -
 arch/m68k/configs/sun3_defconfig              |    1 -
 arch/m68k/configs/sun3x_defconfig             |    1 -
 arch/powerpc/crypto/Kconfig                   |    2 +-
 arch/powerpc/crypto/Makefile                  |    9 +-
 arch/powerpc/crypto/aes-gcm-p10-glue.c        |    4 +-
 arch/powerpc/crypto/aes-spe-glue.c            |   88 +-
 arch/powerpc/crypto/aes.c                     |  134 --
 arch/powerpc/crypto/aes_cbc.c                 |    4 +-
 arch/powerpc/crypto/aes_ctr.c                 |    2 +-
 arch/powerpc/crypto/aes_xts.c                 |    6 +-
 arch/powerpc/crypto/aesp8-ppc.h               |   22 -
 arch/powerpc/crypto/vmx.c                     |   10 +-
 arch/riscv/crypto/Kconfig                     |    2 -
 arch/riscv/crypto/aes-macros.S                |   12 +-
 arch/riscv/crypto/aes-riscv64-glue.c          |   78 +-
 arch/riscv/crypto/aes-riscv64-zvkned.S        |   27 -
 arch/s390/configs/debug_defconfig             |    2 +-
 arch/s390/configs/defconfig                   |    2 +-
 arch/s390/crypto/Kconfig                      |    2 -
 arch/s390/crypto/aes_s390.c                   |  113 --
 arch/sparc/crypto/Kconfig                     |    2 +-
 arch/sparc/crypto/Makefile                    |    2 +-
 arch/sparc/crypto/aes_glue.c                  |  140 +-
 arch/x86/crypto/Kconfig                       |    2 -
 arch/x86/crypto/aes-gcm-aesni-x86_64.S        |   33 +-
 arch/x86/crypto/aes-gcm-vaes-avx2.S           |   21 +-
 arch/x86/crypto/aes-gcm-vaes-avx512.S         |   25 +-
 arch/x86/crypto/aesni-intel_asm.S             |   25 -
 arch/x86/crypto/aesni-intel_glue.c            |  119 +-
 crypto/Kconfig                                |   23 +-
 crypto/Makefile                               |    4 +-
 crypto/aegis.h                                |    2 +-
 crypto/aes.c                                  |   66 +
 crypto/aes_generic.c                          | 1320 -----------------
 crypto/aes_ti.c                               |   83 --
 crypto/crypto_user.c                          |    2 +-
 crypto/df_sp80090a.c                          |   30 +-
 crypto/drbg.c                                 |   12 +-
 crypto/testmgr.c                              |   43 +-
 drivers/char/tpm/tpm2-sessions.c              |   10 +-
 drivers/crypto/amcc/crypto4xx_alg.c           |   10 +-
 drivers/crypto/ccp/ccp-crypto-aes-cmac.c      |    4 +-
 drivers/crypto/chelsio/chcr_algo.c            |   10 +-
 .../crypto/inside-secure/safexcel_cipher.c    |   12 +-
 drivers/crypto/inside-secure/safexcel_hash.c  |   14 +-
 drivers/crypto/omap-aes-gcm.c                 |    6 +-
 drivers/crypto/omap-aes.h                     |    2 +-
 drivers/crypto/starfive/jh7110-aes.c          |   10 +-
 drivers/crypto/xilinx/xilinx-trng.c           |    8 +-
 .../inline_crypto/ch_ipsec/chcr_ipsec.c       |    4 +-
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c |    8 +-
 .../chelsio/inline_crypto/chtls/chtls_hw.c    |    4 +-
 drivers/net/phy/mscc/mscc_macsec.c            |    8 +-
 drivers/staging/rtl8723bs/core/rtw_security.c |   20 +-
 include/crypto/aes.h                          |  279 +++-
 include/crypto/df_sp80090a.h                  |    2 +-
 include/crypto/gcm.h                          |    2 +-
 lib/crypto/Kconfig                            |   12 +
 lib/crypto/Makefile                           |   43 +-
 lib/crypto/aes.c                              |  473 ++++--
 lib/crypto/aescfb.c                           |   30 +-
 lib/crypto/aesgcm.c                           |   12 +-
 .../crypto/arm}/aes-cipher-core.S             |    4 +-
 lib/crypto/arm/aes.h                          |   56 +
 .../crypto => lib/crypto/arm64}/aes-ce-core.S |    0
 .../crypto/arm64}/aes-cipher-core.S           |    4 +-
 lib/crypto/arm64/aes.h                        |  164 ++
 lib/crypto/powerpc/.gitignore                 |    2 +
 .../crypto/powerpc}/aes-spe-core.S            |    0
 .../crypto/powerpc}/aes-spe-keys.S            |    0
 .../crypto/powerpc}/aes-spe-modes.S           |    0
 .../crypto/powerpc}/aes-spe-regs.h            |    0
 .../crypto/powerpc}/aes-tab-4k.S              |    0
 lib/crypto/powerpc/aes.h                      |  238 +++
 .../crypto/powerpc}/aesp8-ppc.pl              |    1 +
 lib/crypto/riscv/aes-riscv64-zvkned.S         |   84 ++
 lib/crypto/riscv/aes.h                        |   63 +
 lib/crypto/s390/aes.h                         |  106 ++
 lib/crypto/sparc/aes.h                        |  149 ++
 .../crypto => lib/crypto/sparc}/aes_asm.S     |    0
 lib/crypto/x86/aes-aesni.S                    |  261 ++++
 lib/crypto/x86/aes.h                          |   85 ++
 net/bluetooth/smp.c                           |    8 +-
 111 files changed, 2202 insertions(+), 2957 deletions(-)
 delete mode 100644 arch/arm/crypto/aes-cipher-glue.c
 delete mode 100644 arch/arm/crypto/aes-cipher.h
 delete mode 100644 arch/arm64/crypto/aes-ce-glue.c
 delete mode 100644 arch/arm64/crypto/aes-ce-setkey.h
 delete mode 100644 arch/arm64/crypto/aes-cipher-glue.c
 delete mode 100644 arch/powerpc/crypto/aes.c
 create mode 100644 crypto/aes.c
 delete mode 100644 crypto/aes_generic.c
 delete mode 100644 crypto/aes_ti.c
 rename {arch/arm/crypto => lib/crypto/arm}/aes-cipher-core.S (97%)
 create mode 100644 lib/crypto/arm/aes.h
 rename {arch/arm64/crypto => lib/crypto/arm64}/aes-ce-core.S (100%)
 rename {arch/arm64/crypto => lib/crypto/arm64}/aes-cipher-core.S (96%)
 create mode 100644 lib/crypto/arm64/aes.h
 create mode 100644 lib/crypto/powerpc/.gitignore
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-spe-core.S (100%)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-spe-keys.S (100%)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-spe-modes.S (100%)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-spe-regs.h (100%)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-tab-4k.S (100%)
 create mode 100644 lib/crypto/powerpc/aes.h
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aesp8-ppc.pl (99%)
 create mode 100644 lib/crypto/riscv/aes-riscv64-zvkned.S
 create mode 100644 lib/crypto/riscv/aes.h
 create mode 100644 lib/crypto/s390/aes.h
 create mode 100644 lib/crypto/sparc/aes.h
 rename {arch/sparc/crypto => lib/crypto/sparc}/aes_asm.S (100%)
 create mode 100644 lib/crypto/x86/aes-aesni.S
 create mode 100644 lib/crypto/x86/aes.h


base-commit: e78a3142fa5875126e477fdfe329b0aeb1b0693f
-- 
2.52.0




* [PATCH 01/36] crypto: powerpc/aes - Rename struct aes_key
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 02/36] lib/crypto: aes: Introduce improved AES library Eric Biggers
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Rename struct aes_key in aesp8-ppc.h and aes-gcm-p10-glue.c to
p8_aes_key and p10_aes_key, respectively.  This frees up the name
aes_key for use in the library API in <crypto/aes.h>.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/powerpc/crypto/aes-gcm-p10-glue.c |  4 ++--
 arch/powerpc/crypto/aes.c              |  4 ++--
 arch/powerpc/crypto/aes_cbc.c          |  4 ++--
 arch/powerpc/crypto/aes_ctr.c          |  2 +-
 arch/powerpc/crypto/aes_xts.c          |  6 +++---
 arch/powerpc/crypto/aesp8-ppc.h        | 23 ++++++++++++-----------
 6 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/crypto/aes-gcm-p10-glue.c b/arch/powerpc/crypto/aes-gcm-p10-glue.c
index 85f4fd4b1bdc..f3417436d3f7 100644
--- a/arch/powerpc/crypto/aes-gcm-p10-glue.c
+++ b/arch/powerpc/crypto/aes-gcm-p10-glue.c
@@ -42,11 +42,11 @@ asmlinkage void aes_p10_gcm_decrypt(const u8 *in, u8 *out, size_t len,
 asmlinkage void gcm_init_htable(unsigned char htable[], unsigned char Xi[]);
 asmlinkage void gcm_ghash_p10(unsigned char *Xi, unsigned char *Htable,
 			      unsigned char *aad, unsigned int alen);
 asmlinkage void gcm_update(u8 *iv, void *Xi);
 
-struct aes_key {
+struct p10_aes_key {
 	u8 key[AES_MAX_KEYLENGTH];
 	u64 rounds;
 };
 
 struct gcm_ctx {
@@ -61,11 +61,11 @@ struct Hash_ctx {
 	u8 H[16];	/* subkey */
 	u8 Htable[256];	/* Xi, Hash table(offset 32) */
 };
 
 struct p10_aes_gcm_ctx {
-	struct aes_key enc_key;
+	struct p10_aes_key enc_key;
 	u8 nonce[RFC4106_NONCE_SIZE];
 };
 
 static void vsx_begin(void)
 {
diff --git a/arch/powerpc/crypto/aes.c b/arch/powerpc/crypto/aes.c
index 3f1e5e894902..b7192ee719fc 100644
--- a/arch/powerpc/crypto/aes.c
+++ b/arch/powerpc/crypto/aes.c
@@ -19,12 +19,12 @@
 
 #include "aesp8-ppc.h"
 
 struct p8_aes_ctx {
 	struct crypto_cipher *fallback;
-	struct aes_key enc_key;
-	struct aes_key dec_key;
+	struct p8_aes_key enc_key;
+	struct p8_aes_key dec_key;
 };
 
 static int p8_aes_init(struct crypto_tfm *tfm)
 {
 	const char *alg = crypto_tfm_alg_name(tfm);
diff --git a/arch/powerpc/crypto/aes_cbc.c b/arch/powerpc/crypto/aes_cbc.c
index 5f2a4f375eef..4a9f285f0970 100644
--- a/arch/powerpc/crypto/aes_cbc.c
+++ b/arch/powerpc/crypto/aes_cbc.c
@@ -19,12 +19,12 @@
 
 #include "aesp8-ppc.h"
 
 struct p8_aes_cbc_ctx {
 	struct crypto_skcipher *fallback;
-	struct aes_key enc_key;
-	struct aes_key dec_key;
+	struct p8_aes_key enc_key;
+	struct p8_aes_key dec_key;
 };
 
 static int p8_aes_cbc_init(struct crypto_skcipher *tfm)
 {
 	struct p8_aes_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
diff --git a/arch/powerpc/crypto/aes_ctr.c b/arch/powerpc/crypto/aes_ctr.c
index e27c4036e711..7dbd06f442db 100644
--- a/arch/powerpc/crypto/aes_ctr.c
+++ b/arch/powerpc/crypto/aes_ctr.c
@@ -19,11 +19,11 @@
 
 #include "aesp8-ppc.h"
 
 struct p8_aes_ctr_ctx {
 	struct crypto_skcipher *fallback;
-	struct aes_key enc_key;
+	struct p8_aes_key enc_key;
 };
 
 static int p8_aes_ctr_init(struct crypto_skcipher *tfm)
 {
 	struct p8_aes_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
diff --git a/arch/powerpc/crypto/aes_xts.c b/arch/powerpc/crypto/aes_xts.c
index 9440e771cede..b4c760e465ea 100644
--- a/arch/powerpc/crypto/aes_xts.c
+++ b/arch/powerpc/crypto/aes_xts.c
@@ -20,13 +20,13 @@
 
 #include "aesp8-ppc.h"
 
 struct p8_aes_xts_ctx {
 	struct crypto_skcipher *fallback;
-	struct aes_key enc_key;
-	struct aes_key dec_key;
-	struct aes_key tweak_key;
+	struct p8_aes_key enc_key;
+	struct p8_aes_key dec_key;
+	struct p8_aes_key tweak_key;
 };
 
 static int p8_aes_xts_init(struct crypto_skcipher *tfm)
 {
 	struct p8_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
diff --git a/arch/powerpc/crypto/aesp8-ppc.h b/arch/powerpc/crypto/aesp8-ppc.h
index 5764d4438388..0bea010128cb 100644
--- a/arch/powerpc/crypto/aesp8-ppc.h
+++ b/arch/powerpc/crypto/aesp8-ppc.h
@@ -1,10 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/types.h>
 #include <crypto/aes.h>
 
-struct aes_key {
+struct p8_aes_key {
 	u8 key[AES_MAX_KEYLENGTH];
 	int rounds;
 };
 
 extern struct shash_alg p8_ghash_alg;
@@ -12,19 +12,20 @@ extern struct crypto_alg p8_aes_alg;
 extern struct skcipher_alg p8_aes_cbc_alg;
 extern struct skcipher_alg p8_aes_ctr_alg;
 extern struct skcipher_alg p8_aes_xts_alg;
 
 int aes_p8_set_encrypt_key(const u8 *userKey, const int bits,
-			   struct aes_key *key);
+			   struct p8_aes_key *key);
 int aes_p8_set_decrypt_key(const u8 *userKey, const int bits,
-			   struct aes_key *key);
-void aes_p8_encrypt(const u8 *in, u8 *out, const struct aes_key *key);
-void aes_p8_decrypt(const u8 *in, u8 *out, const struct aes_key *key);
+			   struct p8_aes_key *key);
+void aes_p8_encrypt(const u8 *in, u8 *out, const struct p8_aes_key *key);
+void aes_p8_decrypt(const u8 *in, u8 *out, const struct p8_aes_key *key);
 void aes_p8_cbc_encrypt(const u8 *in, u8 *out, size_t len,
-			const struct aes_key *key, u8 *iv, const int enc);
-void aes_p8_ctr32_encrypt_blocks(const u8 *in, u8 *out,
-				 size_t len, const struct aes_key *key,
-				 const u8 *iv);
+			const struct p8_aes_key *key, u8 *iv, const int enc);
+void aes_p8_ctr32_encrypt_blocks(const u8 *in, u8 *out, size_t len,
+				 const struct p8_aes_key *key, const u8 *iv);
 void aes_p8_xts_encrypt(const u8 *in, u8 *out, size_t len,
-			const struct aes_key *key1, const struct aes_key *key2, u8 *iv);
+			const struct p8_aes_key *key1,
+			const struct p8_aes_key *key2, u8 *iv);
 void aes_p8_xts_decrypt(const u8 *in, u8 *out, size_t len,
-			const struct aes_key *key1, const struct aes_key *key2, u8 *iv);
+			const struct p8_aes_key *key1,
+			const struct p8_aes_key *key2, u8 *iv);

base-commit: e78a3142fa5875126e477fdfe329b0aeb1b0693f
-- 
2.52.0




* [PATCH 02/36] lib/crypto: aes: Introduce improved AES library
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
  2026-01-05  5:12 ` [PATCH 01/36] crypto: powerpc/aes - Rename struct aes_key Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  7:47   ` Qingfang Deng
  2026-01-05  5:12 ` [PATCH 03/36] crypto: arm/aes-neonbs - Use AES library for single blocks Eric Biggers
                   ` (33 subsequent siblings)
  35 siblings, 1 reply; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

The kernel's AES library currently has the following issues:

- It doesn't take advantage of the architecture-optimized AES code,
  including the implementations using AES instructions.

- It's much slower than even the other software AES implementations: 2-4
  times slower than "aes-generic", "aes-arm", and "aes-arm64".

- It requires that both the encryption and decryption round keys be
  computed and cached.  This is wasteful for users that need only the
  forward (encryption) direction of the cipher: the key struct is 484
  bytes (two 240-byte sets of round keys plus a 4-byte length field)
  when only 244 of those bytes are actually needed.  This missed
  optimization is very common, as many AES modes (e.g. GCM, CFB, CTR,
  CMAC, and even the tweak key in XTS) use the cipher only in the
  forward (encryption) direction even when doing decryption.

- It doesn't provide the flexibility to customize the prepared key
  format.  The API is defined to do key expansion, and several callers
  in drivers/crypto/ use it specifically to expand the key.  This is an
  issue when integrating the existing powerpc, s390, and sparc code,
  which is necessary to provide full parity with the traditional API.

To resolve these issues, I'm proposing the following changes:

1. New structs 'aes_key' and 'aes_enckey' are introduced, with
   corresponding functions aes_preparekey() and aes_prepareenckey().

   Generally these structs will include the encryption+decryption round
   keys and the encryption round keys, respectively.  However, the exact
   format will be under control of the architecture-specific AES code.

   (The verb "prepare" is chosen over "expand" since key expansion isn't
   necessarily done.  It's also consistent with hmac*_preparekey().)

2. aes_encrypt() and aes_decrypt() will be changed to operate on the new
   structs instead of struct crypto_aes_ctx.

3. aes_encrypt() and aes_decrypt() will use architecture-optimized code
   when available, or else fall back to a new generic AES implementation
   that unifies the existing two fragmented generic AES implementations.

   The new generic AES implementation uses tables for both SubBytes and
   MixColumns, making it almost as fast as "aes-generic".  However,
   instead of aes-generic's huge 8192-byte tables per direction, it uses
   only 1024 bytes for encryption and 1280 bytes for decryption (similar
   to "aes-arm").  The cost is just some extra rotations.

   The new generic AES implementation also includes table prefetching,
   making it have some "constant-time hardening".  That's an improvement
   from aes-generic which has no constant-time hardening.

   It does slightly regress in constant-time hardening vs. the old
   lib/crypto/aes.c, which had smaller tables, and vs. aes-fixed-time,
   which additionally disabled IRQs.  But I think this is tolerable.
   The real solutions for constant-time AES are AES instructions or
   bit-slicing.  The table-based code remains a best-effort fallback for
   the increasingly-rare case where a real solution is unavailable.

4. crypto_aes_ctx and aes_expandkey() will remain for now, but only for
   callers that are using them specifically for the AES key expansion
   (as opposed to en/decrypting data with the AES library).
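
To illustrate the "extra rotations" mentioned in (3): with a 256-entry
table T[] where T[i] = MixColumn([SubByte(i), 0, 0, 0]), one output
column of a middle round is computed roughly as follows.  This is only
a sketch; 'T' stands for the new aes_enc_tab[], w0..w3 are the four
state columns, and rk is the round key word (compare
enc_quarterround() in the diff below):

	u32 t = rk ^ T[(u8)w0] ^
		rol32(T[(u8)(w1 >>  8)],  8) ^
		rol32(T[(u8)(w2 >> 16)], 16) ^
		rol32(T[(u8)(w3 >> 24)], 24);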

This commit begins the migration process by introducing the new structs
and functions, backed by the new generic AES implementation.

To allow callers to be incrementally converted, the new en/decryption
functions are introduced alongside the existing ones and under new
names: aes_encrypt_new() and aes_decrypt_new().  Of course, after all
callers have been updated, the original aes_encrypt() and aes_decrypt()
will be removed and the new functions will take their names.
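
As a side note on the new signatures: aes_encrypt_new() takes its key
through a transparent union, so a pointer to either key struct can be
passed directly.  A rough sketch (variable names made up):

	struct aes_enckey enc_only;	/* encryption-only key */
	struct aes_key full;		/* full bidirectional key */

	aes_encrypt_new(&enc_only, out, in);
	aes_encrypt_new(&full, out, in);	/* also accepted */
	aes_decrypt_new(&full, out, in);	/* needs the full key */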

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 include/crypto/aes.h | 138 +++++++++++++++-
 lib/crypto/Kconfig   |   4 +
 lib/crypto/Makefile  |  11 +-
 lib/crypto/aes.c     | 379 ++++++++++++++++++++++++++++++++++++++-----
 4 files changed, 488 insertions(+), 44 deletions(-)

diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 9339da7c20a8..4da2f125bb15 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -16,10 +16,64 @@
 #define AES_KEYSIZE_256		32
 #define AES_BLOCK_SIZE		16
 #define AES_MAX_KEYLENGTH	(15 * 16)
 #define AES_MAX_KEYLENGTH_U32	(AES_MAX_KEYLENGTH / sizeof(u32))
 
+union aes_enckey_arch {
+	u32 rndkeys[AES_MAX_KEYLENGTH_U32];
+};
+
+union aes_invkey_arch {
+	u32 inv_rndkeys[AES_MAX_KEYLENGTH_U32];
+};
+
+/**
+ * struct aes_enckey - An AES key prepared for encryption
+ * @len: Key length in bytes: 16 for AES-128, 24 for AES-192, 32 for AES-256.
+ * @nrounds: Number of rounds: 10 for AES-128, 12 for AES-192, 14 for AES-256.
+ *	     This is '6 + @len / 4' and is cached so that AES implementations
+ *	     that need it don't have to recompute it for each en/decryption.
+ * @padding: Padding to make offsetof(@k) be a multiple of 16, so that aligning
+ *	     this struct to a 16-byte boundary results in @k also being 16-byte
+ *	     aligned.  Users aren't required to align this struct to 16 bytes,
+ *	     but it may slightly improve performance.
+ * @k: This typically contains the AES round keys as an array of '@nrounds + 1'
+ *     groups of four u32 words.  However, architecture-specific implementations
+ *     of AES may store something else here, e.g. just the raw key if it's all
+ *     they need.
+ *
+ * Note that this struct is about half the size of struct aes_key.  This is
+ * separate from struct aes_key so that modes that need only AES encryption
+ * (e.g. AES-GCM, AES-CTR, AES-CMAC, tweak key in AES-XTS) don't incur the time
+ * and space overhead of computing and caching the decryption round keys.
+ *
+ * Note that there's no decryption-only equivalent (i.e. "struct aes_deckey"),
+ * since (a) it's rare that modes need decryption-only, and (b) some AES
+ * implementations use the same @k for both encryption and decryption, either
+ * always or conditionally; in the latter case both @k and @inv_k are needed.
+ */
+struct aes_enckey {
+	u32 len;
+	u32 nrounds;
+	u32 padding[2];
+	union aes_enckey_arch k;
+};
+
+/**
+ * struct aes_key - An AES key prepared for encryption and decryption
+ * @aes_enckey: Common fields and the key prepared for encryption
+ * @inv_k: This generally contains the round keys for the AES Equivalent
+ *	   Inverse Cipher, as an array of '@nrounds + 1' groups of four u32
+ *	   words.  However, architecture-specific implementations of AES may
+ *	   store something else here.  For example, they may leave this field
+ *	   uninitialized if they use @k for both encryption and decryption.
+ */
+struct aes_key {
+	struct aes_enckey; /* Include all fields of aes_enckey. */
+	union aes_invkey_arch inv_k;
+};
+
 /*
  * Please ensure that the first two fields are 16-byte aligned
  * relative to the start of the structure, i.e., don't move them!
  */
 struct crypto_aes_ctx {
@@ -32,11 +86,11 @@ extern const u32 crypto_ft_tab[4][256] ____cacheline_aligned;
 extern const u32 crypto_it_tab[4][256] ____cacheline_aligned;
 
 /*
  * validate key length for AES algorithms
  */
-static inline int aes_check_keylen(unsigned int keylen)
+static inline int aes_check_keylen(size_t keylen)
 {
 	switch (keylen) {
 	case AES_KEYSIZE_128:
 	case AES_KEYSIZE_192:
 	case AES_KEYSIZE_256:
@@ -66,10 +120,62 @@ int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
  * for the initial combination, the second slot for the first round and so on.
  */
 int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 		  unsigned int key_len);
 
+/**
+ * aes_preparekey() - Prepare an AES key for encryption and decryption
+ * @key: (output) The key structure to initialize
+ * @in_key: The raw AES key
+ * @key_len: Length of the raw key in bytes.  Should be either AES_KEYSIZE_128,
+ *	     AES_KEYSIZE_192, or AES_KEYSIZE_256.
+ *
+ * This prepares an AES key for both the encryption and decryption directions of
+ * the block cipher.  Typically this involves expanding the raw key into both
+ * the standard round keys and the Equivalent Inverse Cipher round keys, but
+ * some architecture-specific implementations don't do the full expansion here.
+ *
+ * The caller is responsible for zeroizing both the struct aes_key and the raw
+ * key once they are no longer needed.
+ *
+ * If you don't need decryption support, use aes_prepareenckey() instead.
+ *
+ * Return: 0 on success or -EINVAL if the given key length is invalid.  No other
+ *	   errors are possible, so callers that always pass a valid key length
+ *	   don't need to check for errors.
+ *
+ * Context: Any context.
+ */
+int aes_preparekey(struct aes_key *key, const u8 *in_key, size_t key_len);
+
+/**
+ * aes_prepareenckey() - Prepare an AES key for encryption-only
+ * @enc_key: (output) The key structure to initialize
+ * @in_key: The raw AES key
+ * @key_len: Length of the raw key in bytes.  Should be either AES_KEYSIZE_128,
+ *	     AES_KEYSIZE_192, or AES_KEYSIZE_256.
+ *
+ * This prepares an AES key for only the encryption direction of the block
+ * cipher.  Typically this involves expanding the raw key into only the standard
+ * round keys, resulting in a struct about half the size of struct aes_key.
+ *
+ * The caller is responsible for zeroizing both the struct aes_enckey and the
+ * raw key once they are no longer needed.
+ *
+ * Note that while the resulting prepared key supports only AES encryption, it
+ * can still be used for decrypting in a mode of operation that uses AES in only
+ * the encryption (forward) direction, for example counter mode.
+ *
+ * Return: 0 on success or -EINVAL if the given key length is invalid.  No other
+ *	   errors are possible, so callers that always pass a valid key length
+ *	   don't need to check for errors.
+ *
+ * Context: Any context.
+ */
+int aes_prepareenckey(struct aes_enckey *enc_key,
+		      const u8 *in_key, size_t key_len);
+
 /**
  * aes_encrypt - Encrypt a single AES block
  * @ctx:	Context struct containing the key schedule
  * @out:	Buffer to store the ciphertext
  * @in:		Buffer containing the plaintext
@@ -82,12 +188,42 @@ void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
  * @out:	Buffer to store the plaintext
  * @in:		Buffer containing the ciphertext
  */
 void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
 
+typedef union {
+	const struct aes_enckey *enc_key;
+	const struct aes_key *full_key;
+} aes_encrypt_arg __attribute__ ((__transparent_union__));
+
+/**
+ * aes_encrypt_new() - Encrypt a single AES block
+ * @key: The AES key, as a pointer to either an encryption-only key
+ *	 (struct aes_enckey) or a full, bidirectional key (struct aes_key).
+ * @out: Buffer to store the ciphertext block
+ * @in: Buffer containing the plaintext block
+ *
+ * Context: Any context.
+ */
+void aes_encrypt_new(aes_encrypt_arg key, u8 out[at_least AES_BLOCK_SIZE],
+		     const u8 in[at_least AES_BLOCK_SIZE]);
+
+/**
+ * aes_decrypt_new() - Decrypt a single AES block
+ * @key: The AES key, previously initialized by aes_preparekey()
+ * @out: Buffer to store the plaintext block
+ * @in: Buffer containing the ciphertext block
+ *
+ * Context: Any context.
+ */
+void aes_decrypt_new(const struct aes_key *key, u8 out[at_least AES_BLOCK_SIZE],
+		     const u8 in[at_least AES_BLOCK_SIZE]);
+
 extern const u8 crypto_aes_sbox[];
 extern const u8 crypto_aes_inv_sbox[];
+extern const u32 __cacheline_aligned aes_enc_tab[256];
+extern const u32 __cacheline_aligned aes_dec_tab[256];
 
 void aescfb_encrypt(const struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 		    int len, const u8 iv[AES_BLOCK_SIZE]);
 void aescfb_decrypt(const struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 		    int len, const u8 iv[AES_BLOCK_SIZE]);
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 33cf46bbadc8..21fee7c2dfce 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -9,10 +9,14 @@ config CRYPTO_LIB_UTILS
 	tristate
 
 config CRYPTO_LIB_AES
 	tristate
 
+config CRYPTO_LIB_AES_ARCH
+	bool
+	depends on CRYPTO_LIB_AES && !UML && !KMSAN
+
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
 
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 45128eccedef..01193b3f47ba 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -13,12 +13,19 @@ obj-$(CONFIG_KUNIT)				+= tests/
 obj-$(CONFIG_CRYPTO_HASH_INFO)			+= hash_info.o
 
 obj-$(CONFIG_CRYPTO_LIB_UTILS)			+= libcryptoutils.o
 libcryptoutils-y				:= memneq.o utils.o
 
-obj-$(CONFIG_CRYPTO_LIB_AES)			+= libaes.o
-libaes-y					:= aes.o
+################################################################################
+
+obj-$(CONFIG_CRYPTO_LIB_AES) += libaes.o
+libaes-y := aes.o
+ifeq ($(CONFIG_CRYPTO_LIB_AES_ARCH),y)
+CFLAGS_aes.o += -I$(src)/$(SRCARCH)
+endif # CONFIG_CRYPTO_LIB_AES_ARCH
+
+################################################################################
 
 obj-$(CONFIG_CRYPTO_LIB_AESCFB)			+= libaescfb.o
 libaescfb-y					:= aescfb.o
 
 obj-$(CONFIG_CRYPTO_LIB_AESGCM)			+= libaesgcm.o
diff --git a/lib/crypto/aes.c b/lib/crypto/aes.c
index b57fda3460f1..57b6d68fd378 100644
--- a/lib/crypto/aes.c
+++ b/lib/crypto/aes.c
@@ -1,11 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
  * Copyright (C) 2017-2019 Linaro Ltd <ard.biesheuvel@linaro.org>
+ * Copyright 2026 Google LLC
  */
 
 #include <crypto/aes.h>
+#include <linux/cache.h>
 #include <linux/crypto.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/unaligned.h>
 
@@ -87,10 +89,114 @@ extern const u8 crypto_aes_sbox[256] __alias(aes_sbox);
 extern const u8 crypto_aes_inv_sbox[256] __alias(aes_inv_sbox);
 
 EXPORT_SYMBOL(crypto_aes_sbox);
 EXPORT_SYMBOL(crypto_aes_inv_sbox);
 
+/* aes_enc_tab[i] contains MixColumn([SubByte(i), 0, 0, 0]). */
+const u32 __cacheline_aligned aes_enc_tab[256] = {
+	0xa56363c6, 0x847c7cf8, 0x997777ee, 0x8d7b7bf6, 0x0df2f2ff, 0xbd6b6bd6,
+	0xb16f6fde, 0x54c5c591, 0x50303060, 0x03010102, 0xa96767ce, 0x7d2b2b56,
+	0x19fefee7, 0x62d7d7b5, 0xe6abab4d, 0x9a7676ec, 0x45caca8f, 0x9d82821f,
+	0x40c9c989, 0x877d7dfa, 0x15fafaef, 0xeb5959b2, 0xc947478e, 0x0bf0f0fb,
+	0xecadad41, 0x67d4d4b3, 0xfda2a25f, 0xeaafaf45, 0xbf9c9c23, 0xf7a4a453,
+	0x967272e4, 0x5bc0c09b, 0xc2b7b775, 0x1cfdfde1, 0xae93933d, 0x6a26264c,
+	0x5a36366c, 0x413f3f7e, 0x02f7f7f5, 0x4fcccc83, 0x5c343468, 0xf4a5a551,
+	0x34e5e5d1, 0x08f1f1f9, 0x937171e2, 0x73d8d8ab, 0x53313162, 0x3f15152a,
+	0x0c040408, 0x52c7c795, 0x65232346, 0x5ec3c39d, 0x28181830, 0xa1969637,
+	0x0f05050a, 0xb59a9a2f, 0x0907070e, 0x36121224, 0x9b80801b, 0x3de2e2df,
+	0x26ebebcd, 0x6927274e, 0xcdb2b27f, 0x9f7575ea, 0x1b090912, 0x9e83831d,
+	0x742c2c58, 0x2e1a1a34, 0x2d1b1b36, 0xb26e6edc, 0xee5a5ab4, 0xfba0a05b,
+	0xf65252a4, 0x4d3b3b76, 0x61d6d6b7, 0xceb3b37d, 0x7b292952, 0x3ee3e3dd,
+	0x712f2f5e, 0x97848413, 0xf55353a6, 0x68d1d1b9, 0x00000000, 0x2cededc1,
+	0x60202040, 0x1ffcfce3, 0xc8b1b179, 0xed5b5bb6, 0xbe6a6ad4, 0x46cbcb8d,
+	0xd9bebe67, 0x4b393972, 0xde4a4a94, 0xd44c4c98, 0xe85858b0, 0x4acfcf85,
+	0x6bd0d0bb, 0x2aefefc5, 0xe5aaaa4f, 0x16fbfbed, 0xc5434386, 0xd74d4d9a,
+	0x55333366, 0x94858511, 0xcf45458a, 0x10f9f9e9, 0x06020204, 0x817f7ffe,
+	0xf05050a0, 0x443c3c78, 0xba9f9f25, 0xe3a8a84b, 0xf35151a2, 0xfea3a35d,
+	0xc0404080, 0x8a8f8f05, 0xad92923f, 0xbc9d9d21, 0x48383870, 0x04f5f5f1,
+	0xdfbcbc63, 0xc1b6b677, 0x75dadaaf, 0x63212142, 0x30101020, 0x1affffe5,
+	0x0ef3f3fd, 0x6dd2d2bf, 0x4ccdcd81, 0x140c0c18, 0x35131326, 0x2fececc3,
+	0xe15f5fbe, 0xa2979735, 0xcc444488, 0x3917172e, 0x57c4c493, 0xf2a7a755,
+	0x827e7efc, 0x473d3d7a, 0xac6464c8, 0xe75d5dba, 0x2b191932, 0x957373e6,
+	0xa06060c0, 0x98818119, 0xd14f4f9e, 0x7fdcdca3, 0x66222244, 0x7e2a2a54,
+	0xab90903b, 0x8388880b, 0xca46468c, 0x29eeeec7, 0xd3b8b86b, 0x3c141428,
+	0x79dedea7, 0xe25e5ebc, 0x1d0b0b16, 0x76dbdbad, 0x3be0e0db, 0x56323264,
+	0x4e3a3a74, 0x1e0a0a14, 0xdb494992, 0x0a06060c, 0x6c242448, 0xe45c5cb8,
+	0x5dc2c29f, 0x6ed3d3bd, 0xefacac43, 0xa66262c4, 0xa8919139, 0xa4959531,
+	0x37e4e4d3, 0x8b7979f2, 0x32e7e7d5, 0x43c8c88b, 0x5937376e, 0xb76d6dda,
+	0x8c8d8d01, 0x64d5d5b1, 0xd24e4e9c, 0xe0a9a949, 0xb46c6cd8, 0xfa5656ac,
+	0x07f4f4f3, 0x25eaeacf, 0xaf6565ca, 0x8e7a7af4, 0xe9aeae47, 0x18080810,
+	0xd5baba6f, 0x887878f0, 0x6f25254a, 0x722e2e5c, 0x241c1c38, 0xf1a6a657,
+	0xc7b4b473, 0x51c6c697, 0x23e8e8cb, 0x7cdddda1, 0x9c7474e8, 0x211f1f3e,
+	0xdd4b4b96, 0xdcbdbd61, 0x868b8b0d, 0x858a8a0f, 0x907070e0, 0x423e3e7c,
+	0xc4b5b571, 0xaa6666cc, 0xd8484890, 0x05030306, 0x01f6f6f7, 0x120e0e1c,
+	0xa36161c2, 0x5f35356a, 0xf95757ae, 0xd0b9b969, 0x91868617, 0x58c1c199,
+	0x271d1d3a, 0xb99e9e27, 0x38e1e1d9, 0x13f8f8eb, 0xb398982b, 0x33111122,
+	0xbb6969d2, 0x70d9d9a9, 0x898e8e07, 0xa7949433, 0xb69b9b2d, 0x221e1e3c,
+	0x92878715, 0x20e9e9c9, 0x49cece87, 0xff5555aa, 0x78282850, 0x7adfdfa5,
+	0x8f8c8c03, 0xf8a1a159, 0x80898909, 0x170d0d1a, 0xdabfbf65, 0x31e6e6d7,
+	0xc6424284, 0xb86868d0, 0xc3414182, 0xb0999929, 0x772d2d5a, 0x110f0f1e,
+	0xcbb0b07b, 0xfc5454a8, 0xd6bbbb6d, 0x3a16162c,
+};
+EXPORT_SYMBOL(aes_enc_tab);
+
+/* aes_dec_tab[i] contains InvMixColumn([InvSubByte(i), 0, 0, 0]). */
+const u32 __cacheline_aligned aes_dec_tab[256] = {
+	0x50a7f451, 0x5365417e, 0xc3a4171a, 0x965e273a, 0xcb6bab3b, 0xf1459d1f,
+	0xab58faac, 0x9303e34b, 0x55fa3020, 0xf66d76ad, 0x9176cc88, 0x254c02f5,
+	0xfcd7e54f, 0xd7cb2ac5, 0x80443526, 0x8fa362b5, 0x495ab1de, 0x671bba25,
+	0x980eea45, 0xe1c0fe5d, 0x02752fc3, 0x12f04c81, 0xa397468d, 0xc6f9d36b,
+	0xe75f8f03, 0x959c9215, 0xeb7a6dbf, 0xda595295, 0x2d83bed4, 0xd3217458,
+	0x2969e049, 0x44c8c98e, 0x6a89c275, 0x78798ef4, 0x6b3e5899, 0xdd71b927,
+	0xb64fe1be, 0x17ad88f0, 0x66ac20c9, 0xb43ace7d, 0x184adf63, 0x82311ae5,
+	0x60335197, 0x457f5362, 0xe07764b1, 0x84ae6bbb, 0x1ca081fe, 0x942b08f9,
+	0x58684870, 0x19fd458f, 0x876cde94, 0xb7f87b52, 0x23d373ab, 0xe2024b72,
+	0x578f1fe3, 0x2aab5566, 0x0728ebb2, 0x03c2b52f, 0x9a7bc586, 0xa50837d3,
+	0xf2872830, 0xb2a5bf23, 0xba6a0302, 0x5c8216ed, 0x2b1ccf8a, 0x92b479a7,
+	0xf0f207f3, 0xa1e2694e, 0xcdf4da65, 0xd5be0506, 0x1f6234d1, 0x8afea6c4,
+	0x9d532e34, 0xa055f3a2, 0x32e18a05, 0x75ebf6a4, 0x39ec830b, 0xaaef6040,
+	0x069f715e, 0x51106ebd, 0xf98a213e, 0x3d06dd96, 0xae053edd, 0x46bde64d,
+	0xb58d5491, 0x055dc471, 0x6fd40604, 0xff155060, 0x24fb9819, 0x97e9bdd6,
+	0xcc434089, 0x779ed967, 0xbd42e8b0, 0x888b8907, 0x385b19e7, 0xdbeec879,
+	0x470a7ca1, 0xe90f427c, 0xc91e84f8, 0x00000000, 0x83868009, 0x48ed2b32,
+	0xac70111e, 0x4e725a6c, 0xfbff0efd, 0x5638850f, 0x1ed5ae3d, 0x27392d36,
+	0x64d90f0a, 0x21a65c68, 0xd1545b9b, 0x3a2e3624, 0xb1670a0c, 0x0fe75793,
+	0xd296eeb4, 0x9e919b1b, 0x4fc5c080, 0xa220dc61, 0x694b775a, 0x161a121c,
+	0x0aba93e2, 0xe52aa0c0, 0x43e0223c, 0x1d171b12, 0x0b0d090e, 0xadc78bf2,
+	0xb9a8b62d, 0xc8a91e14, 0x8519f157, 0x4c0775af, 0xbbdd99ee, 0xfd607fa3,
+	0x9f2601f7, 0xbcf5725c, 0xc53b6644, 0x347efb5b, 0x7629438b, 0xdcc623cb,
+	0x68fcedb6, 0x63f1e4b8, 0xcadc31d7, 0x10856342, 0x40229713, 0x2011c684,
+	0x7d244a85, 0xf83dbbd2, 0x1132f9ae, 0x6da129c7, 0x4b2f9e1d, 0xf330b2dc,
+	0xec52860d, 0xd0e3c177, 0x6c16b32b, 0x99b970a9, 0xfa489411, 0x2264e947,
+	0xc48cfca8, 0x1a3ff0a0, 0xd82c7d56, 0xef903322, 0xc74e4987, 0xc1d138d9,
+	0xfea2ca8c, 0x360bd498, 0xcf81f5a6, 0x28de7aa5, 0x268eb7da, 0xa4bfad3f,
+	0xe49d3a2c, 0x0d927850, 0x9bcc5f6a, 0x62467e54, 0xc2138df6, 0xe8b8d890,
+	0x5ef7392e, 0xf5afc382, 0xbe805d9f, 0x7c93d069, 0xa92dd56f, 0xb31225cf,
+	0x3b99acc8, 0xa77d1810, 0x6e639ce8, 0x7bbb3bdb, 0x097826cd, 0xf418596e,
+	0x01b79aec, 0xa89a4f83, 0x656e95e6, 0x7ee6ffaa, 0x08cfbc21, 0xe6e815ef,
+	0xd99be7ba, 0xce366f4a, 0xd4099fea, 0xd67cb029, 0xafb2a431, 0x31233f2a,
+	0x3094a5c6, 0xc066a235, 0x37bc4e74, 0xa6ca82fc, 0xb0d090e0, 0x15d8a733,
+	0x4a9804f1, 0xf7daec41, 0x0e50cd7f, 0x2ff69117, 0x8dd64d76, 0x4db0ef43,
+	0x544daacc, 0xdf0496e4, 0xe3b5d19e, 0x1b886a4c, 0xb81f2cc1, 0x7f516546,
+	0x04ea5e9d, 0x5d358c01, 0x737487fa, 0x2e410bfb, 0x5a1d67b3, 0x52d2db92,
+	0x335610e9, 0x1347d66d, 0x8c61d79a, 0x7a0ca137, 0x8e14f859, 0x893c13eb,
+	0xee27a9ce, 0x35c961b7, 0xede51ce1, 0x3cb1477a, 0x59dfd29c, 0x3f73f255,
+	0x79ce1418, 0xbf37c773, 0xeacdf753, 0x5baafd5f, 0x146f3ddf, 0x86db4478,
+	0x81f3afca, 0x3ec468b9, 0x2c342438, 0x5f40a3c2, 0x72c31d16, 0x0c25e2bc,
+	0x8b493c28, 0x41950dff, 0x7101a839, 0xdeb30c08, 0x9ce4b4d8, 0x90c15664,
+	0x6184cb7b, 0x70b632d5, 0x745c6c48, 0x4257b8d0,
+};
+EXPORT_SYMBOL(aes_dec_tab);
+
+/* Prefetch data into L1 cache.  @mem should be __cacheline_aligned. */
+static __always_inline void aes_prefetch(const void *mem, size_t len)
+{
+	for (size_t i = 0; i < len; i += L1_CACHE_BYTES)
+		*(volatile const u8 *)(mem + i);
+	barrier();
+}
+
 static u32 mul_by_x(u32 w)
 {
 	u32 x = w & 0x7f7f7f7f;
 	u32 y = w & 0x80808080;
 
@@ -167,42 +273,21 @@ static u32 subw(u32 in)
 	       (aes_sbox[(in >>  8) & 0xff] <<  8) ^
 	       (aes_sbox[(in >> 16) & 0xff] << 16) ^
 	       (aes_sbox[(in >> 24) & 0xff] << 24);
 }
 
-/**
- * aes_expandkey - Expands the AES key as described in FIPS-197
- * @ctx:	The location where the computed key will be stored.
- * @in_key:	The supplied key.
- * @key_len:	The length of the supplied key.
- *
- * Returns 0 on success. The function fails only if an invalid key size (or
- * pointer) is supplied.
- * The expanded key size is 240 bytes (max of 14 rounds with a unique 16 bytes
- * key schedule plus a 16 bytes key which is used before the first round).
- * The decryption key is prepared for the "Equivalent Inverse Cipher" as
- * described in FIPS-197. The first slot (16 bytes) of each key (enc or dec) is
- * for the initial combination, the second slot for the first round and so on.
- */
-int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
-		  unsigned int key_len)
+static void aes_expandkey_generic(u32 rndkeys[], u32 *inv_rndkeys,
+				  const u8 *in_key, int key_len)
 {
 	u32 kwords = key_len / sizeof(u32);
 	u32 rc, i, j;
-	int err;
-
-	err = aes_check_keylen(key_len);
-	if (err)
-		return err;
-
-	ctx->key_length = key_len;
 
 	for (i = 0; i < kwords; i++)
-		ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
+		rndkeys[i] = get_unaligned_le32(&in_key[i * sizeof(u32)]);
 
 	for (i = 0, rc = 1; i < 10; i++, rc = mul_by_x(rc)) {
-		u32 *rki = ctx->key_enc + (i * kwords);
+		u32 *rki = &rndkeys[i * kwords];
 		u32 *rko = rki + kwords;
 
 		rko[0] = ror32(subw(rki[kwords - 1]), 8) ^ rc ^ rki[0];
 		rko[1] = rko[0] ^ rki[1];
 		rko[2] = rko[1] ^ rki[2];
@@ -227,27 +312,37 @@ int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 	 * Generate the decryption keys for the Equivalent Inverse Cipher.
 	 * This involves reversing the order of the round keys, and applying
 	 * the Inverse Mix Columns transformation to all but the first and
 	 * the last one.
 	 */
-	ctx->key_dec[0] = ctx->key_enc[key_len + 24];
-	ctx->key_dec[1] = ctx->key_enc[key_len + 25];
-	ctx->key_dec[2] = ctx->key_enc[key_len + 26];
-	ctx->key_dec[3] = ctx->key_enc[key_len + 27];
-
-	for (i = 4, j = key_len + 20; j > 0; i += 4, j -= 4) {
-		ctx->key_dec[i]     = inv_mix_columns(ctx->key_enc[j]);
-		ctx->key_dec[i + 1] = inv_mix_columns(ctx->key_enc[j + 1]);
-		ctx->key_dec[i + 2] = inv_mix_columns(ctx->key_enc[j + 2]);
-		ctx->key_dec[i + 3] = inv_mix_columns(ctx->key_enc[j + 3]);
-	}
+	if (inv_rndkeys) {
+		inv_rndkeys[0] = rndkeys[key_len + 24];
+		inv_rndkeys[1] = rndkeys[key_len + 25];
+		inv_rndkeys[2] = rndkeys[key_len + 26];
+		inv_rndkeys[3] = rndkeys[key_len + 27];
+
+		for (i = 4, j = key_len + 20; j > 0; i += 4, j -= 4) {
+			inv_rndkeys[i]     = inv_mix_columns(rndkeys[j]);
+			inv_rndkeys[i + 1] = inv_mix_columns(rndkeys[j + 1]);
+			inv_rndkeys[i + 2] = inv_mix_columns(rndkeys[j + 2]);
+			inv_rndkeys[i + 3] = inv_mix_columns(rndkeys[j + 3]);
+		}
 
-	ctx->key_dec[i]     = ctx->key_enc[0];
-	ctx->key_dec[i + 1] = ctx->key_enc[1];
-	ctx->key_dec[i + 2] = ctx->key_enc[2];
-	ctx->key_dec[i + 3] = ctx->key_enc[3];
+		inv_rndkeys[i]     = rndkeys[0];
+		inv_rndkeys[i + 1] = rndkeys[1];
+		inv_rndkeys[i + 2] = rndkeys[2];
+		inv_rndkeys[i + 3] = rndkeys[3];
+	}
+}
 
+int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
+		  unsigned int key_len)
+{
+	if (aes_check_keylen(key_len) != 0)
+		return -EINVAL;
+	ctx->key_length = key_len;
+	aes_expandkey_generic(ctx->key_enc, ctx->key_dec, in_key, key_len);
 	return 0;
 }
 EXPORT_SYMBOL(aes_expandkey);
 
 /**
@@ -299,10 +394,118 @@ void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in)
 	put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
 	put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
 }
 EXPORT_SYMBOL(aes_encrypt);
 
+static __always_inline u32 enc_quarterround(const u32 w[4], int i, u32 rk)
+{
+	return rk ^ aes_enc_tab[(u8)w[i]] ^
+	       rol32(aes_enc_tab[(u8)(w[(i + 1) % 4] >> 8)], 8) ^
+	       rol32(aes_enc_tab[(u8)(w[(i + 2) % 4] >> 16)], 16) ^
+	       rol32(aes_enc_tab[(u8)(w[(i + 3) % 4] >> 24)], 24);
+}
+
+static __always_inline u32 enclast_quarterround(const u32 w[4], int i, u32 rk)
+{
+	return rk ^ ((aes_enc_tab[(u8)w[i]] & 0x0000ff00) >> 8) ^
+	       (aes_enc_tab[(u8)(w[(i + 1) % 4] >> 8)] & 0x0000ff00) ^
+	       ((aes_enc_tab[(u8)(w[(i + 2) % 4] >> 16)] & 0x0000ff00) << 8) ^
+	       ((aes_enc_tab[(u8)(w[(i + 3) % 4] >> 24)] & 0x0000ff00) << 16);
+}
+
+static void __maybe_unused aes_encrypt_generic(const u32 rndkeys[], int nrounds,
+					       u8 out[AES_BLOCK_SIZE],
+					       const u8 in[AES_BLOCK_SIZE])
+{
+	const u32 *rkp = rndkeys;
+	int n = nrounds - 1;
+	u32 w[4];
+
+	w[0] = get_unaligned_le32(&in[0]) ^ *rkp++;
+	w[1] = get_unaligned_le32(&in[4]) ^ *rkp++;
+	w[2] = get_unaligned_le32(&in[8]) ^ *rkp++;
+	w[3] = get_unaligned_le32(&in[12]) ^ *rkp++;
+
+	/*
+	 * Prefetch the table before doing data and key-dependent loads from it.
+	 *
+	 * This is intended only as a basic constant-time hardening measure that
+	 * avoids interfering with performance too much.  Its effectiveness is
+	 * not guaranteed.  For proper constant-time AES, a CPU that supports
+	 * AES instructions should be used instead.
+	 */
+	aes_prefetch(aes_enc_tab, sizeof(aes_enc_tab));
+
+	do {
+		u32 w0 = enc_quarterround(w, 0, *rkp++);
+		u32 w1 = enc_quarterround(w, 1, *rkp++);
+		u32 w2 = enc_quarterround(w, 2, *rkp++);
+		u32 w3 = enc_quarterround(w, 3, *rkp++);
+
+		w[0] = w0;
+		w[1] = w1;
+		w[2] = w2;
+		w[3] = w3;
+	} while (--n);
+
+	put_unaligned_le32(enclast_quarterround(w, 0, *rkp++), &out[0]);
+	put_unaligned_le32(enclast_quarterround(w, 1, *rkp++), &out[4]);
+	put_unaligned_le32(enclast_quarterround(w, 2, *rkp++), &out[8]);
+	put_unaligned_le32(enclast_quarterround(w, 3, *rkp++), &out[12]);
+}
+
+static __always_inline u32 dec_quarterround(const u32 w[4], int i, u32 rk)
+{
+	return rk ^ aes_dec_tab[(u8)w[i]] ^
+	       rol32(aes_dec_tab[(u8)(w[(i + 3) % 4] >> 8)], 8) ^
+	       rol32(aes_dec_tab[(u8)(w[(i + 2) % 4] >> 16)], 16) ^
+	       rol32(aes_dec_tab[(u8)(w[(i + 1) % 4] >> 24)], 24);
+}
+
+static __always_inline u32 declast_quarterround(const u32 w[4], int i, u32 rk)
+{
+	return rk ^ aes_inv_sbox[(u8)w[i]] ^
+	       ((u32)aes_inv_sbox[(u8)(w[(i + 3) % 4] >> 8)] << 8) ^
+	       ((u32)aes_inv_sbox[(u8)(w[(i + 2) % 4] >> 16)] << 16) ^
+	       ((u32)aes_inv_sbox[(u8)(w[(i + 1) % 4] >> 24)] << 24);
+}
+
+static void __maybe_unused aes_decrypt_generic(const u32 inv_rndkeys[],
+					       int nrounds,
+					       u8 out[AES_BLOCK_SIZE],
+					       const u8 in[AES_BLOCK_SIZE])
+{
+	const u32 *rkp = inv_rndkeys;
+	int n = nrounds - 1;
+	u32 w[4];
+
+	w[0] = get_unaligned_le32(&in[0]) ^ *rkp++;
+	w[1] = get_unaligned_le32(&in[4]) ^ *rkp++;
+	w[2] = get_unaligned_le32(&in[8]) ^ *rkp++;
+	w[3] = get_unaligned_le32(&in[12]) ^ *rkp++;
+
+	aes_prefetch(aes_dec_tab, sizeof(aes_dec_tab));
+
+	do {
+		u32 w0 = dec_quarterround(w, 0, *rkp++);
+		u32 w1 = dec_quarterround(w, 1, *rkp++);
+		u32 w2 = dec_quarterround(w, 2, *rkp++);
+		u32 w3 = dec_quarterround(w, 3, *rkp++);
+
+		w[0] = w0;
+		w[1] = w1;
+		w[2] = w2;
+		w[3] = w3;
+	} while (--n);
+
+	aes_prefetch((const void *)aes_inv_sbox, sizeof(aes_inv_sbox));
+	put_unaligned_le32(declast_quarterround(w, 0, *rkp++), &out[0]);
+	put_unaligned_le32(declast_quarterround(w, 1, *rkp++), &out[4]);
+	put_unaligned_le32(declast_quarterround(w, 2, *rkp++), &out[8]);
+	put_unaligned_le32(declast_quarterround(w, 3, *rkp++), &out[12]);
+}
+
 /**
  * aes_decrypt - Decrypt a single AES block
  * @ctx:	Context struct containing the key schedule
  * @out:	Buffer to store the plaintext
  * @in:		Buffer containing the ciphertext
@@ -350,8 +553,102 @@ void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in)
 	put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
 	put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
 }
 EXPORT_SYMBOL(aes_decrypt);
 
-MODULE_DESCRIPTION("Generic AES library");
+/*
+ * Note: the aes_prepare*key_* names reflect the fact that the implementation
+ * might not actually expand the key.  (The s390 code for example doesn't.)
+ * Where the key is expanded we use the more specific names aes_expandkey_*.
+ *
+ * aes_preparekey_arch() is passed an optional pointer 'inv_k' which points to
+ * the area to store the prepared decryption key.  It will be NULL if the user
+ * is requesting encryption-only.  aes_preparekey_arch() is also passed a valid
+ * 'key_len' and 'nrounds', corresponding to AES-128, AES-192, or AES-256.
+ */
+#ifdef CONFIG_CRYPTO_LIB_AES_ARCH
+/* An arch-specific implementation of AES is available.  Include it. */
+#include "aes.h" /* $(SRCARCH)/aes.h */
+#else
+/* No arch-specific implementation of AES is available.  Use generic code. */
+
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	aes_expandkey_generic(k->rndkeys, inv_k ? inv_k->inv_rndkeys : NULL,
+			      in_key, key_len);
+}
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	aes_encrypt_generic(key->k.rndkeys, key->nrounds, out, in);
+}
+
+static void aes_decrypt_arch(const struct aes_key *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	aes_decrypt_generic(key->inv_k.inv_rndkeys, key->nrounds, out, in);
+}
+#endif
+
+static int __aes_preparekey(struct aes_enckey *enc_key,
+			    union aes_invkey_arch *inv_k,
+			    const u8 *in_key, size_t key_len)
+{
+	if (aes_check_keylen(key_len) != 0)
+		return -EINVAL;
+	enc_key->len = key_len;
+	enc_key->nrounds = 6 + key_len / 4;
+	aes_preparekey_arch(&enc_key->k, inv_k, in_key, key_len,
+			    enc_key->nrounds);
+	return 0;
+}
+
+int aes_preparekey(struct aes_key *key, const u8 *in_key, size_t key_len)
+{
+	return __aes_preparekey((struct aes_enckey *)key, &key->inv_k,
+				in_key, key_len);
+}
+EXPORT_SYMBOL(aes_preparekey);
+
+int aes_prepareenckey(struct aes_enckey *key, const u8 *in_key, size_t key_len)
+{
+	return __aes_preparekey(key, NULL, in_key, key_len);
+}
+EXPORT_SYMBOL(aes_prepareenckey);
+
+void aes_encrypt_new(aes_encrypt_arg key, u8 out[AES_BLOCK_SIZE],
+		     const u8 in[AES_BLOCK_SIZE])
+{
+	aes_encrypt_arch(key.enc_key, out, in);
+}
+EXPORT_SYMBOL(aes_encrypt_new);
+
+void aes_decrypt_new(const struct aes_key *key, u8 out[AES_BLOCK_SIZE],
+		     const u8 in[AES_BLOCK_SIZE])
+{
+	aes_decrypt_arch(key, out, in);
+}
+EXPORT_SYMBOL(aes_decrypt_new);
+
+#ifdef aes_mod_init_arch
+static int __init aes_mod_init(void)
+{
+	aes_mod_init_arch();
+	return 0;
+}
+subsys_initcall(aes_mod_init);
+
+static void __exit aes_mod_exit(void)
+{
+}
+module_exit(aes_mod_exit);
+#endif
+
+MODULE_DESCRIPTION("AES block cipher");
 MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_AUTHOR("Eric Biggers <ebiggers@kernel.org>");
 MODULE_LICENSE("GPL v2");
-- 
2.52.0




* [PATCH 03/36] crypto: arm/aes-neonbs - Use AES library for single blocks
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
  2026-01-05  5:12 ` [PATCH 01/36] crypto: powerpc/aes - Rename struct aes_key Eric Biggers
  2026-01-05  5:12 ` [PATCH 02/36] lib/crypto: aes: Introduce improved AES library Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 04/36] crypto: arm/aes - Switch to aes_enc_tab[] and aes_dec_tab[] Eric Biggers
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

aes-neonbs-glue.c calls __aes_arm_encrypt() and __aes_arm_decrypt() to
en/decrypt single blocks for CBC encryption, XTS tweak encryption, and
XTS ciphertext stealing.  The AES library is about to take over this
same ARM-optimized single-block AES code and make it an internal
implementation detail.  In preparation for that, replace the calls to
these functions with calls to the AES library.

Note that this reduces the size of the aesbs_cbc_ctx and aesbs_xts_ctx
structs, since unnecessary decryption round keys are no longer included.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm/crypto/Kconfig           |  1 -
 arch/arm/crypto/aes-neonbs-glue.c | 29 ++++++++++++++++-------------
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 3eb5071bea14..167a648a9def 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -42,11 +42,10 @@ config CRYPTO_AES_ARM
 	  such attacks very difficult.
 
 config CRYPTO_AES_ARM_BS
 	tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS (bit-sliced NEON)"
 	depends on KERNEL_MODE_NEON
-	select CRYPTO_AES_ARM
 	select CRYPTO_SKCIPHER
 	select CRYPTO_LIB_AES
 	help
 	  Length-preserving ciphers: AES cipher algorithms (FIPS-197)
 	  with block cipher modes:
diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c
index df5afe601e4a..f892f281b441 100644
--- a/arch/arm/crypto/aes-neonbs-glue.c
+++ b/arch/arm/crypto/aes-neonbs-glue.c
@@ -10,11 +10,10 @@
 #include <crypto/aes.h>
 #include <crypto/internal/skcipher.h>
 #include <crypto/scatterwalk.h>
 #include <crypto/xts.h>
 #include <linux/module.h>
-#include "aes-cipher.h"
 
 MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
 MODULE_DESCRIPTION("Bit sliced AES using NEON instructions");
 MODULE_LICENSE("GPL v2");
 
@@ -46,17 +45,17 @@ struct aesbs_ctx {
 	u8	rk[13 * (8 * AES_BLOCK_SIZE) + 32] __aligned(AES_BLOCK_SIZE);
 };
 
 struct aesbs_cbc_ctx {
 	struct aesbs_ctx	key;
-	struct crypto_aes_ctx	fallback;
+	struct aes_enckey	fallback;
 };
 
 struct aesbs_xts_ctx {
 	struct aesbs_ctx	key;
-	struct crypto_aes_ctx	fallback;
-	struct crypto_aes_ctx	tweak_key;
+	struct aes_key		fallback;
+	struct aes_enckey	tweak_key;
 };
 
 static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
 			unsigned int key_len)
 {
@@ -120,18 +119,23 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
 			    unsigned int key_len)
 {
 	struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
 	int err;
 
-	err = aes_expandkey(&ctx->fallback, in_key, key_len);
+	err = aes_prepareenckey(&ctx->fallback, in_key, key_len);
 	if (err)
 		return err;
 
 	ctx->key.rounds = 6 + key_len / 4;
 
+	/*
+	 * Note: this assumes that the arm implementation of the AES library
+	 * stores the standard round keys in k.rndkeys.
+	 */
 	kernel_neon_begin();
-	aesbs_convert_key(ctx->key.rk, ctx->fallback.key_enc, ctx->key.rounds);
+	aesbs_convert_key(ctx->key.rk, ctx->fallback.k.rndkeys,
+			  ctx->key.rounds);
 	kernel_neon_end();
 
 	return 0;
 }
 
@@ -150,12 +154,11 @@ static int cbc_encrypt(struct skcipher_request *req)
 		u8 *dst = walk.dst.virt.addr;
 		u8 *prev = walk.iv;
 
 		do {
 			crypto_xor_cpy(dst, src, prev, AES_BLOCK_SIZE);
-			__aes_arm_encrypt(ctx->fallback.key_enc,
-					  ctx->key.rounds, dst, dst);
+			aes_encrypt_new(&ctx->fallback, dst, dst);
 			prev = dst;
 			src += AES_BLOCK_SIZE;
 			dst += AES_BLOCK_SIZE;
 			nbytes -= AES_BLOCK_SIZE;
 		} while (nbytes >= AES_BLOCK_SIZE);
@@ -237,14 +240,14 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
 	err = xts_verify_key(tfm, in_key, key_len);
 	if (err)
 		return err;
 
 	key_len /= 2;
-	err = aes_expandkey(&ctx->fallback, in_key, key_len);
+	err = aes_preparekey(&ctx->fallback, in_key, key_len);
 	if (err)
 		return err;
-	err = aes_expandkey(&ctx->tweak_key, in_key + key_len, key_len);
+	err = aes_prepareenckey(&ctx->tweak_key, in_key + key_len, key_len);
 	if (err)
 		return err;
 
 	return aesbs_setkey(tfm, in_key, key_len);
 }
@@ -277,11 +280,11 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
 
 	err = skcipher_walk_virt(&walk, req, true);
 	if (err)
 		return err;
 
-	__aes_arm_encrypt(ctx->tweak_key.key_enc, rounds, walk.iv, walk.iv);
+	aes_encrypt_new(&ctx->tweak_key, walk.iv, walk.iv);
 
 	while (walk.nbytes >= AES_BLOCK_SIZE) {
 		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
 		int reorder_last_tweak = !encrypt && tail > 0;
 
@@ -309,13 +312,13 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
 	scatterwalk_map_and_copy(buf, req->src, req->cryptlen, tail, 0);
 
 	crypto_xor(buf, req->iv, AES_BLOCK_SIZE);
 
 	if (encrypt)
-		__aes_arm_encrypt(ctx->fallback.key_enc, rounds, buf, buf);
+		aes_encrypt_new(&ctx->fallback, buf, buf);
 	else
-		__aes_arm_decrypt(ctx->fallback.key_dec, rounds, buf, buf);
+		aes_decrypt_new(&ctx->fallback, buf, buf);
 
 	crypto_xor(buf, req->iv, AES_BLOCK_SIZE);
 
 	scatterwalk_map_and_copy(buf, req->dst, req->cryptlen - AES_BLOCK_SIZE,
 				 AES_BLOCK_SIZE + tail, 1);
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 04/36] crypto: arm/aes - Switch to aes_enc_tab[] and aes_dec_tab[]
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (2 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 03/36] crypto: arm/aes-neonbs - Use AES library for single blocks Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 05/36] crypto: arm64/aes " Eric Biggers
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Instead of crypto_ft_tab and crypto_it_tab from aes_generic.c, use
aes_enc_tab and aes_dec_tab from lib/crypto/aes.c.  These contain the
same data in the first 1024 bytes (which is the part that this code
uses), so the result is the same.  This will allow aes_generic.c to
eventually be removed.
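
(Illustration, not part of the patch: as noted above, this code only
loads from the first 1024 bytes, i.e. the first 256 u32 entries, and
derives the other byte positions by rotation, so only that prefix needs
to match.  A minimal sanity check, assuming aes_enc_tab is the flat u32
array used elsewhere in this series, could be:)

	int i;

	for (i = 0; i < 256; i++)
		WARN_ON(aes_enc_tab[i] != crypto_ft_tab[0][i]);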

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm/crypto/aes-cipher-core.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/crypto/aes-cipher-core.S b/arch/arm/crypto/aes-cipher-core.S
index 1da3f41359aa..87567d6822ba 100644
--- a/arch/arm/crypto/aes-cipher-core.S
+++ b/arch/arm/crypto/aes-cipher-core.S
@@ -190,12 +190,12 @@
 	.align		3
 	.ltorg
 	.endm
 
 ENTRY(__aes_arm_encrypt)
-	do_crypt	fround, crypto_ft_tab,, 2
+	do_crypt	fround, aes_enc_tab,, 2
 ENDPROC(__aes_arm_encrypt)
 
 	.align		5
 ENTRY(__aes_arm_decrypt)
-	do_crypt	iround, crypto_it_tab, crypto_aes_inv_sbox, 0
+	do_crypt	iround, aes_dec_tab, crypto_aes_inv_sbox, 0
 ENDPROC(__aes_arm_decrypt)
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 05/36] crypto: arm64/aes - Switch to aes_enc_tab[] and aes_dec_tab[]
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (3 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 04/36] crypto: arm/aes - Switch to aes_enc_tab[] and aes_dec_tab[] Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 06/36] crypto: arm64/aes - Select CRYPTO_LIB_SHA256 from correct places Eric Biggers
                   ` (30 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Instead of crypto_ft_tab and crypto_it_tab from aes_generic.c, use
aes_enc_tab and aes_dec_tab from lib/crypto/aes.c.  These contain the
same data in the first 1024 bytes (which is the part that this code
uses), so the result is the same.  This will allow aes_generic.c to
eventually be removed.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm64/crypto/aes-cipher-core.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S
index c9d6955f8404..651f701c56a8 100644
--- a/arch/arm64/crypto/aes-cipher-core.S
+++ b/arch/arm64/crypto/aes-cipher-core.S
@@ -121,12 +121,12 @@ CPU_BE(	rev		w7, w7		)
 	stp		w6, w7, [out, #8]
 	ret
 	.endm
 
 SYM_FUNC_START(__aes_arm64_encrypt)
-	do_crypt	fround, crypto_ft_tab, crypto_ft_tab + 1, 2
+	do_crypt	fround, aes_enc_tab, aes_enc_tab + 1, 2
 SYM_FUNC_END(__aes_arm64_encrypt)
 
 	.align		5
 SYM_FUNC_START(__aes_arm64_decrypt)
-	do_crypt	iround, crypto_it_tab, crypto_aes_inv_sbox, 0
+	do_crypt	iround, aes_dec_tab, crypto_aes_inv_sbox, 0
 SYM_FUNC_END(__aes_arm64_decrypt)
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 06/36] crypto: arm64/aes - Select CRYPTO_LIB_SHA256 from correct places
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (4 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 05/36] crypto: arm64/aes " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 07/36] crypto: aegis - Switch from crypto_ft_tab[] to aes_enc_tab[] Eric Biggers
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

The call to sha256() occurs in code that is built when either
CRYPTO_AES_ARM64_CE_BLK or CRYPTO_AES_ARM64_NEON_BLK is enabled.  The
option CRYPTO_AES_ARM64 is unrelated, notwithstanding its
documentation.  I'll be removing CRYPTO_AES_ARM64 soon anyway, but
before doing that, fix where CRYPTO_LIB_SHA256 is selected from.

Fixes: 01834444d972 ("crypto: arm64/aes - use SHA-256 library instead of crypto_shash")
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm64/crypto/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index da1c9ea8ea83..4453dff8f0c1 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -38,11 +38,10 @@ config CRYPTO_SM3_ARM64_CE
 	  - ARMv8.2 Crypto Extensions
 
 config CRYPTO_AES_ARM64
 	tristate "Ciphers: AES, modes: ECB, CBC, CTR, CTS, XCTR, XTS"
 	select CRYPTO_AES
-	select CRYPTO_LIB_SHA256
 	help
 	  Block ciphers: AES cipher algorithms (FIPS-197)
 	  Length-preserving ciphers: AES with ECB, CBC, CTR, CTS,
 	    XCTR, and XTS modes
 	  AEAD cipher: AES with CBC, ESSIV, and SHA-256
@@ -64,10 +63,11 @@ config CRYPTO_AES_ARM64_CE
 config CRYPTO_AES_ARM64_CE_BLK
 	tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS (ARMv8 Crypto Extensions)"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_SKCIPHER
 	select CRYPTO_AES_ARM64_CE
+	select CRYPTO_LIB_SHA256
 	help
 	  Length-preserving ciphers: AES cipher algorithms (FIPS-197)
 	  with block cipher modes:
 	  - ECB (Electronic Codebook) mode (NIST SP800-38A)
 	  - CBC (Cipher Block Chaining) mode (NIST SP800-38A)
@@ -81,10 +81,11 @@ config CRYPTO_AES_ARM64_CE_BLK
 config CRYPTO_AES_ARM64_NEON_BLK
 	tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS (NEON)"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_SKCIPHER
 	select CRYPTO_LIB_AES
+	select CRYPTO_LIB_SHA256
 	help
 	  Length-preserving ciphers: AES cipher algorithms (FIPS-197)
 	  with block cipher modes:
 	  - ECB (Electronic Codebook) mode (NIST SP800-38A)
 	  - CBC (Cipher Block Chaining) mode (NIST SP800-38A)
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 07/36] crypto: aegis - Switch from crypto_ft_tab[] to aes_enc_tab[]
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (5 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 06/36] crypto: arm64/aes - Select CRYPTO_LIB_SHA256 from correct places Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 08/36] crypto: aes - Remove aes-fixed-time / CONFIG_CRYPTO_AES_TI Eric Biggers
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Instead of crypto_ft_tab[0] from aes_generic.c, use aes_enc_tab from
lib/crypto/aes.c.  These contain the same data, so the result is the
same.  This will allow aes_generic.c to eventually be removed.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 crypto/Kconfig | 2 +-
 crypto/aegis.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 12a87f7cf150..443fe8e016fd 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -764,11 +764,11 @@ endmenu
 menu "AEAD (authenticated encryption with associated data) ciphers"
 
 config CRYPTO_AEGIS128
 	tristate "AEGIS-128"
 	select CRYPTO_AEAD
-	select CRYPTO_AES  # for AES S-box tables
+	select CRYPTO_LIB_AES  # for AES S-box tables
 	help
 	  AEGIS-128 AEAD algorithm
 
 config CRYPTO_AEGIS128_SIMD
 	bool "AEGIS-128 (arm NEON, arm64 NEON)"
diff --git a/crypto/aegis.h b/crypto/aegis.h
index 6ef9c174c973..ffcf8e85ea69 100644
--- a/crypto/aegis.h
+++ b/crypto/aegis.h
@@ -60,11 +60,11 @@ static __always_inline void crypto_aegis_block_and(union aegis_block *dst,
 static __always_inline void crypto_aegis_aesenc(union aegis_block *dst,
 						const union aegis_block *src,
 						const union aegis_block *key)
 {
 	const u8  *s  = src->bytes;
-	const u32 *t = crypto_ft_tab[0];
+	const u32 *t = aes_enc_tab;
 	u32 d0, d1, d2, d3;
 
 	d0 = t[s[ 0]] ^ rol32(t[s[ 5]], 8) ^ rol32(t[s[10]], 16) ^ rol32(t[s[15]], 24);
 	d1 = t[s[ 4]] ^ rol32(t[s[ 9]], 8) ^ rol32(t[s[14]], 16) ^ rol32(t[s[ 3]], 24);
 	d2 = t[s[ 8]] ^ rol32(t[s[13]], 8) ^ rol32(t[s[ 2]], 16) ^ rol32(t[s[ 7]], 24);
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 08/36] crypto: aes - Remove aes-fixed-time / CONFIG_CRYPTO_AES_TI
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (6 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 07/36] crypto: aegis - Switch from crypto_ft_tab[] to aes_enc_tab[] Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 09/36] crypto: aes - Replace aes-generic with wrapper around lib Eric Biggers
                   ` (27 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Remove aes-fixed-time, i.e. CONFIG_CRYPTO_AES_TI.  This was a wrapper
around the 256-byte-table-based AES implementation in lib/crypto/aes.c,
with extra code to disable and re-enable IRQs around each block for
constant-time hardening.

While nice in theory, in practice this had the following issues:

- For bulk en/decryption it was 2-4 times slower than aes-generic.  This
  resulted in aes-generic still being needed, creating fragmentation.

- Having both aes-generic and aes-fixed-time punted an AES
  implementation decision to distros and users who are generally
  unprepared to handle it.  In practice, whether aes-fixed-time gets
  used tends to be incidental rather than reflecting an explicit distro
  or user intent.  (While aes-fixed-time has a higher priority than
  aes-generic, whether it actually gets enabled, loaded, and used
  depends on the kconfig and on whether a modprobe of "aes" happens to
  be done.  It also has a lower priority than aes-arm and aes-arm64.)

- My changes to the generic AES code (in other commits) significantly
  close the gap with aes-fixed-time anyway.  The table size is reduced
  from 8192 bytes to 1024 bytes, and prefetching is added.

- While AES code *should* be constant-time, the real solutions for that
  are AES instructions (which most CPUs have now) or bit-slicing.  arm
  and arm64 already have bit-sliced AES code for many modes; generic
  bit-sliced code could be written but would be very slow for single
  blocks.  Overall, I suggest that trying to write constant-time
  table-based AES code is a bit futile anyway, and in the rare cases
  where a proper AES implementation is still unavailable, it's reasonable
  to compromise with an implementation that simply prefetches the table.

Thus, this commit removes aes-fixed-time and CONFIG_CRYPTO_AES_TI.  The
replacement is just the existing CONFIG_CRYPTO_AES, which for now maps
to the aes-generic code, but I'll soon be changing it to use the
improved AES library code instead.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/m68k/configs/amiga_defconfig    |  1 -
 arch/m68k/configs/apollo_defconfig   |  1 -
 arch/m68k/configs/atari_defconfig    |  1 -
 arch/m68k/configs/bvme6000_defconfig |  1 -
 arch/m68k/configs/hp300_defconfig    |  1 -
 arch/m68k/configs/mac_defconfig      |  1 -
 arch/m68k/configs/multi_defconfig    |  1 -
 arch/m68k/configs/mvme147_defconfig  |  1 -
 arch/m68k/configs/mvme16x_defconfig  |  1 -
 arch/m68k/configs/q40_defconfig      |  1 -
 arch/m68k/configs/sun3_defconfig     |  1 -
 arch/m68k/configs/sun3x_defconfig    |  1 -
 arch/s390/configs/debug_defconfig    |  2 +-
 arch/s390/configs/defconfig          |  2 +-
 crypto/Kconfig                       | 21 -------
 crypto/Makefile                      |  1 -
 crypto/aes_ti.c                      | 83 ----------------------------
 17 files changed, 2 insertions(+), 119 deletions(-)
 delete mode 100644 crypto/aes_ti.c

diff --git a/arch/m68k/configs/amiga_defconfig b/arch/m68k/configs/amiga_defconfig
index bfc1ee7c8158..bffcc417f44c 100644
--- a/arch/m68k/configs/amiga_defconfig
+++ b/arch/m68k/configs/amiga_defconfig
@@ -553,11 +553,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/apollo_defconfig b/arch/m68k/configs/apollo_defconfig
index d9d1f3c4c70d..3f894c20b132 100644
--- a/arch/m68k/configs/apollo_defconfig
+++ b/arch/m68k/configs/apollo_defconfig
@@ -510,11 +510,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/atari_defconfig b/arch/m68k/configs/atari_defconfig
index 523205adccc8..5c5603ca16aa 100644
--- a/arch/m68k/configs/atari_defconfig
+++ b/arch/m68k/configs/atari_defconfig
@@ -530,11 +530,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/bvme6000_defconfig b/arch/m68k/configs/bvme6000_defconfig
index 7b0a4ef0b010..37c747ee395e 100644
--- a/arch/m68k/configs/bvme6000_defconfig
+++ b/arch/m68k/configs/bvme6000_defconfig
@@ -502,11 +502,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/hp300_defconfig b/arch/m68k/configs/hp300_defconfig
index 089c5c394c62..1a376c2b8c45 100644
--- a/arch/m68k/configs/hp300_defconfig
+++ b/arch/m68k/configs/hp300_defconfig
@@ -512,11 +512,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/mac_defconfig b/arch/m68k/configs/mac_defconfig
index 5f2484c36733..2b26450692a5 100644
--- a/arch/m68k/configs/mac_defconfig
+++ b/arch/m68k/configs/mac_defconfig
@@ -529,11 +529,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/multi_defconfig b/arch/m68k/configs/multi_defconfig
index 74f0a1f6d871..012e0e1f506f 100644
--- a/arch/m68k/configs/multi_defconfig
+++ b/arch/m68k/configs/multi_defconfig
@@ -616,11 +616,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/mvme147_defconfig b/arch/m68k/configs/mvme147_defconfig
index 4bee18c820e4..37634b35bfbd 100644
--- a/arch/m68k/configs/mvme147_defconfig
+++ b/arch/m68k/configs/mvme147_defconfig
@@ -502,11 +502,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/mvme16x_defconfig b/arch/m68k/configs/mvme16x_defconfig
index 322c17e55c9a..a0d2e0070afa 100644
--- a/arch/m68k/configs/mvme16x_defconfig
+++ b/arch/m68k/configs/mvme16x_defconfig
@@ -503,11 +503,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/q40_defconfig b/arch/m68k/configs/q40_defconfig
index 82f9baab8fea..62cc3964fc34 100644
--- a/arch/m68k/configs/q40_defconfig
+++ b/arch/m68k/configs/q40_defconfig
@@ -519,11 +519,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/sun3_defconfig b/arch/m68k/configs/sun3_defconfig
index f94ad226cb5b..13107aa4a1b4 100644
--- a/arch/m68k/configs/sun3_defconfig
+++ b/arch/m68k/configs/sun3_defconfig
@@ -500,11 +500,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/m68k/configs/sun3x_defconfig b/arch/m68k/configs/sun3x_defconfig
index a5ecfc505ab2..eaab0ba08989 100644
--- a/arch/m68k/configs/sun3x_defconfig
+++ b/arch/m68k/configs/sun3x_defconfig
@@ -500,11 +500,10 @@ CONFIG_CRYPTO_RSA=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
 CONFIG_CRYPTO_AES=y
-CONFIG_CRYPTO_AES_TI=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAMELLIA=m
 CONFIG_CRYPTO_CAST5=m
diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 0713914b25b4..09f4bdb9e64f 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -768,11 +768,11 @@ CONFIG_CRYPTO_CRYPTD=m
 CONFIG_CRYPTO_BENCHMARK=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
-CONFIG_CRYPTO_AES_TI=m
+CONFIG_CRYPTO_AES=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAST5=m
 CONFIG_CRYPTO_CAST6=m
diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig
index c064e0cacc98..823193b9f4c6 100644
--- a/arch/s390/configs/defconfig
+++ b/arch/s390/configs/defconfig
@@ -752,11 +752,11 @@ CONFIG_CRYPTO_CRYPTD=m
 CONFIG_CRYPTO_BENCHMARK=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECDSA=m
 CONFIG_CRYPTO_ECRDSA=m
-CONFIG_CRYPTO_AES_TI=m
+CONFIG_CRYPTO_AES=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_ARIA=m
 CONFIG_CRYPTO_BLOWFISH=m
 CONFIG_CRYPTO_CAST5=m
 CONFIG_CRYPTO_CAST6=m
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 443fe8e016fd..db6b0c2fb50e 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -364,31 +364,10 @@ config CRYPTO_AES
 	  demonstrates excellent performance. Rijndael's operations are
 	  among the easiest to defend against power and timing attacks.
 
 	  The AES specifies three key sizes: 128, 192 and 256 bits
 
-config CRYPTO_AES_TI
-	tristate "AES (Advanced Encryption Standard) (fixed time)"
-	select CRYPTO_ALGAPI
-	select CRYPTO_LIB_AES
-	help
-	  AES cipher algorithms (Rijndael)(FIPS-197, ISO/IEC 18033-3)
-
-	  This is a generic implementation of AES that attempts to eliminate
-	  data dependent latencies as much as possible without affecting
-	  performance too much. It is intended for use by the generic CCM
-	  and GCM drivers, and other CTR or CMAC/XCBC based modes that rely
-	  solely on encryption (although decryption is supported as well, but
-	  with a more dramatic performance hit)
-
-	  Instead of using 16 lookup tables of 1 KB each, (8 for encryption and
-	  8 for decryption), this implementation only uses just two S-boxes of
-	  256 bytes each, and attempts to eliminate data dependent latencies by
-	  prefetching the entire table into the cache at the start of each
-	  block. Interrupts are also disabled to avoid races where cachelines
-	  are evicted when the CPU is interrupted to do something else.
-
 config CRYPTO_ANUBIS
 	tristate "Anubis"
 	depends on CRYPTO_USER_API_ENABLE_OBSOLETE
 	select CRYPTO_ALGAPI
 	help
diff --git a/crypto/Makefile b/crypto/Makefile
index 23d3db7be425..be403dc20645 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -132,11 +132,10 @@ obj-$(CONFIG_CRYPTO_SERPENT) += serpent_generic.o
 CFLAGS_serpent_generic.o := $(call cc-option,-fsched-pressure)  # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
 obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
 CFLAGS_aes_generic.o := $(call cc-option,-fno-code-hoisting) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
 obj-$(CONFIG_CRYPTO_SM4) += sm4.o
 obj-$(CONFIG_CRYPTO_SM4_GENERIC) += sm4_generic.o
-obj-$(CONFIG_CRYPTO_AES_TI) += aes_ti.o
 obj-$(CONFIG_CRYPTO_CAMELLIA) += camellia_generic.o
 obj-$(CONFIG_CRYPTO_CAST_COMMON) += cast_common.o
 obj-$(CONFIG_CRYPTO_CAST5) += cast5_generic.o
 obj-$(CONFIG_CRYPTO_CAST6) += cast6_generic.o
 obj-$(CONFIG_CRYPTO_ARC4) += arc4.o
diff --git a/crypto/aes_ti.c b/crypto/aes_ti.c
deleted file mode 100644
index a3b342f92fab..000000000000
--- a/crypto/aes_ti.c
+++ /dev/null
@@ -1,83 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Scalar fixed time AES core transform
- *
- * Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
- */
-
-#include <crypto/aes.h>
-#include <crypto/algapi.h>
-#include <linux/module.h>
-
-static int aesti_set_key(struct crypto_tfm *tfm, const u8 *in_key,
-			 unsigned int key_len)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	return aes_expandkey(ctx, in_key, key_len);
-}
-
-static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-	unsigned long flags;
-
-	/*
-	 * Temporarily disable interrupts to avoid races where cachelines are
-	 * evicted when the CPU is interrupted to do something else.
-	 */
-	local_irq_save(flags);
-
-	aes_encrypt(ctx, out, in);
-
-	local_irq_restore(flags);
-}
-
-static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-	unsigned long flags;
-
-	/*
-	 * Temporarily disable interrupts to avoid races where cachelines are
-	 * evicted when the CPU is interrupted to do something else.
-	 */
-	local_irq_save(flags);
-
-	aes_decrypt(ctx, out, in);
-
-	local_irq_restore(flags);
-}
-
-static struct crypto_alg aes_alg = {
-	.cra_name			= "aes",
-	.cra_driver_name		= "aes-fixed-time",
-	.cra_priority			= 100 + 1,
-	.cra_flags			= CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize			= AES_BLOCK_SIZE,
-	.cra_ctxsize			= sizeof(struct crypto_aes_ctx),
-	.cra_module			= THIS_MODULE,
-
-	.cra_cipher.cia_min_keysize	= AES_MIN_KEY_SIZE,
-	.cra_cipher.cia_max_keysize	= AES_MAX_KEY_SIZE,
-	.cra_cipher.cia_setkey		= aesti_set_key,
-	.cra_cipher.cia_encrypt		= aesti_encrypt,
-	.cra_cipher.cia_decrypt		= aesti_decrypt
-};
-
-static int __init aes_init(void)
-{
-	return crypto_register_alg(&aes_alg);
-}
-
-static void __exit aes_fini(void)
-{
-	crypto_unregister_alg(&aes_alg);
-}
-
-module_init(aes_init);
-module_exit(aes_fini);
-
-MODULE_DESCRIPTION("Generic fixed time AES");
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 09/36] crypto: aes - Replace aes-generic with wrapper around lib
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (7 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 08/36] crypto: aes - Remove aes-fixed-time / CONFIG_CRYPTO_AES_TI Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 10/36] lib/crypto: arm/aes: Migrate optimized code into library Eric Biggers
                   ` (26 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Now that the AES library's performance has been improved, replace
aes_generic.c with a new file aes.c which wraps the AES library.

In preparation for making the AES library actually utilize the kernel's
existing architecture-optimized AES code, including AES instructions,
set the driver name to "aes-lib" instead of "aes-generic".  This mirrors
what's been done for the hash algorithms.  Update testmgr.c accordingly.
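
(Illustration, not part of the patch: a crypto API user who allocates
"aes" by name is unaffected except for the reported driver name.)

	struct crypto_cipher *tfm = crypto_alloc_cipher("aes", 0, 0);

	if (!IS_ERR(tfm)) {
		/* Now reports "aes-lib" rather than "aes-generic". */
		pr_info("aes driver: %s\n",
			crypto_tfm_alg_driver_name(crypto_cipher_tfm(tfm)));
		crypto_free_cipher(tfm);
	}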

Since this removes the crypto_aes_set_key() helper function, add local
replacements for it to arch/arm/crypto/aes-cipher-glue.c and
arch/arm64/crypto/aes-cipher-glue.c.  This is only temporary, as that
code will be migrated into lib/crypto/ in later commits.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm/crypto/aes-cipher-glue.c    |   10 +-
 arch/arm64/crypto/aes-cipher-glue.c  |   10 +-
 crypto/Makefile                      |    3 +-
 crypto/aes.c                         |   66 ++
 crypto/aes_generic.c                 | 1320 --------------------------
 crypto/crypto_user.c                 |    2 +-
 crypto/testmgr.c                     |   43 +-
 drivers/crypto/starfive/jh7110-aes.c |   10 +-
 include/crypto/aes.h                 |    6 -
 9 files changed, 117 insertions(+), 1353 deletions(-)
 create mode 100644 crypto/aes.c
 delete mode 100644 crypto/aes_generic.c

diff --git a/arch/arm/crypto/aes-cipher-glue.c b/arch/arm/crypto/aes-cipher-glue.c
index 29efb7833960..f302db808cd3 100644
--- a/arch/arm/crypto/aes-cipher-glue.c
+++ b/arch/arm/crypto/aes-cipher-glue.c
@@ -12,10 +12,18 @@
 #include "aes-cipher.h"
 
 EXPORT_SYMBOL_GPL(__aes_arm_encrypt);
 EXPORT_SYMBOL_GPL(__aes_arm_decrypt);
 
+static int aes_arm_setkey(struct crypto_tfm *tfm, const u8 *in_key,
+			  unsigned int key_len)
+{
+	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+
+	return aes_expandkey(ctx, in_key, key_len);
+}
+
 static void aes_arm_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 {
 	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
 	int rounds = 6 + ctx->key_length / 4;
 
@@ -39,11 +47,11 @@ static struct crypto_alg aes_alg = {
 	.cra_ctxsize			= sizeof(struct crypto_aes_ctx),
 	.cra_module			= THIS_MODULE,
 
 	.cra_cipher.cia_min_keysize	= AES_MIN_KEY_SIZE,
 	.cra_cipher.cia_max_keysize	= AES_MAX_KEY_SIZE,
-	.cra_cipher.cia_setkey		= crypto_aes_set_key,
+	.cra_cipher.cia_setkey		= aes_arm_setkey,
 	.cra_cipher.cia_encrypt		= aes_arm_encrypt,
 	.cra_cipher.cia_decrypt		= aes_arm_decrypt,
 
 #ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 	.cra_alignmask			= 3,
diff --git a/arch/arm64/crypto/aes-cipher-glue.c b/arch/arm64/crypto/aes-cipher-glue.c
index 4ec55e568941..9b27cbac278b 100644
--- a/arch/arm64/crypto/aes-cipher-glue.c
+++ b/arch/arm64/crypto/aes-cipher-glue.c
@@ -10,10 +10,18 @@
 #include <linux/module.h>
 
 asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
 asmlinkage void __aes_arm64_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
 
+static int aes_arm64_setkey(struct crypto_tfm *tfm, const u8 *in_key,
+			    unsigned int key_len)
+{
+	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+
+	return aes_expandkey(ctx, in_key, key_len);
+}
+
 static void aes_arm64_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 {
 	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
 	int rounds = 6 + ctx->key_length / 4;
 
@@ -37,11 +45,11 @@ static struct crypto_alg aes_alg = {
 	.cra_ctxsize			= sizeof(struct crypto_aes_ctx),
 	.cra_module			= THIS_MODULE,
 
 	.cra_cipher.cia_min_keysize	= AES_MIN_KEY_SIZE,
 	.cra_cipher.cia_max_keysize	= AES_MAX_KEY_SIZE,
-	.cra_cipher.cia_setkey		= crypto_aes_set_key,
+	.cra_cipher.cia_setkey		= aes_arm64_setkey,
 	.cra_cipher.cia_encrypt		= aes_arm64_encrypt,
 	.cra_cipher.cia_decrypt		= aes_arm64_decrypt
 };
 
 static int __init aes_init(void)
diff --git a/crypto/Makefile b/crypto/Makefile
index be403dc20645..65a2c3478814 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -128,12 +128,11 @@ obj-$(CONFIG_CRYPTO_BLOWFISH) += blowfish_generic.o
 obj-$(CONFIG_CRYPTO_BLOWFISH_COMMON) += blowfish_common.o
 obj-$(CONFIG_CRYPTO_TWOFISH) += twofish_generic.o
 obj-$(CONFIG_CRYPTO_TWOFISH_COMMON) += twofish_common.o
 obj-$(CONFIG_CRYPTO_SERPENT) += serpent_generic.o
 CFLAGS_serpent_generic.o := $(call cc-option,-fsched-pressure)  # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
-obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
-CFLAGS_aes_generic.o := $(call cc-option,-fno-code-hoisting) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
+obj-$(CONFIG_CRYPTO_AES) += aes.o
 obj-$(CONFIG_CRYPTO_SM4) += sm4.o
 obj-$(CONFIG_CRYPTO_SM4_GENERIC) += sm4_generic.o
 obj-$(CONFIG_CRYPTO_CAMELLIA) += camellia_generic.o
 obj-$(CONFIG_CRYPTO_CAST_COMMON) += cast_common.o
 obj-$(CONFIG_CRYPTO_CAST5) += cast5_generic.o
diff --git a/crypto/aes.c b/crypto/aes.c
new file mode 100644
index 000000000000..5c3a0b24dbc0
--- /dev/null
+++ b/crypto/aes.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Crypto API support for AES block cipher
+ *
+ * Copyright 2026 Google LLC
+ */
+
+#include <crypto/aes.h>
+#include <crypto/algapi.h>
+#include <linux/module.h>
+
+static_assert(__alignof__(struct aes_key) <= CRYPTO_MINALIGN);
+
+static int crypto_aes_setkey(struct crypto_tfm *tfm, const u8 *in_key,
+			     unsigned int key_len)
+{
+	struct aes_key *key = crypto_tfm_ctx(tfm);
+
+	return aes_preparekey(key, in_key, key_len);
+}
+
+static void crypto_aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
+{
+	const struct aes_key *key = crypto_tfm_ctx(tfm);
+
+	aes_encrypt_new(key, out, in);
+}
+
+static void crypto_aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
+{
+	const struct aes_key *key = crypto_tfm_ctx(tfm);
+
+	aes_decrypt_new(key, out, in);
+}
+
+static struct crypto_alg alg = {
+	.cra_name = "aes",
+	.cra_driver_name = "aes-lib",
+	.cra_priority = 100,
+	.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
+	.cra_blocksize = AES_BLOCK_SIZE,
+	.cra_ctxsize = sizeof(struct aes_key),
+	.cra_module = THIS_MODULE,
+	.cra_u = { .cipher = { .cia_min_keysize = AES_MIN_KEY_SIZE,
+			       .cia_max_keysize = AES_MAX_KEY_SIZE,
+			       .cia_setkey = crypto_aes_setkey,
+			       .cia_encrypt = crypto_aes_encrypt,
+			       .cia_decrypt = crypto_aes_decrypt } }
+};
+
+static int __init crypto_aes_mod_init(void)
+{
+	return crypto_register_alg(&alg);
+}
+module_init(crypto_aes_mod_init);
+
+static void __exit crypto_aes_mod_exit(void)
+{
+	crypto_unregister_alg(&alg);
+}
+module_exit(crypto_aes_mod_exit);
+
+MODULE_DESCRIPTION("Crypto API support for AES block cipher");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_CRYPTO("aes");
+MODULE_ALIAS_CRYPTO("aes-lib");
diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
deleted file mode 100644
index 85d2e78c8ef2..000000000000
--- a/crypto/aes_generic.c
+++ /dev/null
@@ -1,1320 +0,0 @@
-/*
- * Cryptographic API.
- *
- * AES Cipher Algorithm.
- *
- * Based on Brian Gladman's code.
- *
- * Linux developers:
- *  Alexander Kjeldaas <astor@fast.no>
- *  Herbert Valerio Riedel <hvr@hvrlab.org>
- *  Kyle McMartin <kyle@debian.org>
- *  Adam J. Richter <adam@yggdrasil.com> (conversion to 2.5 API).
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * ---------------------------------------------------------------------------
- * Copyright (c) 2002, Dr Brian Gladman <brg@gladman.me.uk>, Worcester, UK.
- * All rights reserved.
- *
- * LICENSE TERMS
- *
- * The free distribution and use of this software in both source and binary
- * form is allowed (with or without changes) provided that:
- *
- *   1. distributions of this source code include the above copyright
- *      notice, this list of conditions and the following disclaimer;
- *
- *   2. distributions in binary form include the above copyright
- *      notice, this list of conditions and the following disclaimer
- *      in the documentation and/or other associated materials;
- *
- *   3. the copyright holder's name is not used to endorse products
- *      built using this software without specific written permission.
- *
- * ALTERNATIVELY, provided that this notice is retained in full, this product
- * may be distributed under the terms of the GNU General Public License (GPL),
- * in which case the provisions of the GPL apply INSTEAD OF those given above.
- *
- * DISCLAIMER
- *
- * This software is provided 'as is' with no explicit or implied warranties
- * in respect of its properties, including, but not limited to, correctness
- * and/or fitness for purpose.
- * ---------------------------------------------------------------------------
- */
-
-#include <crypto/aes.h>
-#include <crypto/algapi.h>
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/types.h>
-#include <linux/errno.h>
-#include <asm/byteorder.h>
-#include <linux/unaligned.h>
-
-static inline u8 byte(const u32 x, const unsigned n)
-{
-	return x >> (n << 3);
-}
-
-/* cacheline-aligned to facilitate prefetching into cache */
-__visible const u32 crypto_ft_tab[4][256] ____cacheline_aligned = {
-	{
-		0xa56363c6, 0x847c7cf8, 0x997777ee, 0x8d7b7bf6,
-		0x0df2f2ff, 0xbd6b6bd6, 0xb16f6fde, 0x54c5c591,
-		0x50303060, 0x03010102, 0xa96767ce, 0x7d2b2b56,
-		0x19fefee7, 0x62d7d7b5, 0xe6abab4d, 0x9a7676ec,
-		0x45caca8f, 0x9d82821f, 0x40c9c989, 0x877d7dfa,
-		0x15fafaef, 0xeb5959b2, 0xc947478e, 0x0bf0f0fb,
-		0xecadad41, 0x67d4d4b3, 0xfda2a25f, 0xeaafaf45,
-		0xbf9c9c23, 0xf7a4a453, 0x967272e4, 0x5bc0c09b,
-		0xc2b7b775, 0x1cfdfde1, 0xae93933d, 0x6a26264c,
-		0x5a36366c, 0x413f3f7e, 0x02f7f7f5, 0x4fcccc83,
-		0x5c343468, 0xf4a5a551, 0x34e5e5d1, 0x08f1f1f9,
-		0x937171e2, 0x73d8d8ab, 0x53313162, 0x3f15152a,
-		0x0c040408, 0x52c7c795, 0x65232346, 0x5ec3c39d,
-		0x28181830, 0xa1969637, 0x0f05050a, 0xb59a9a2f,
-		0x0907070e, 0x36121224, 0x9b80801b, 0x3de2e2df,
-		0x26ebebcd, 0x6927274e, 0xcdb2b27f, 0x9f7575ea,
-		0x1b090912, 0x9e83831d, 0x742c2c58, 0x2e1a1a34,
-		0x2d1b1b36, 0xb26e6edc, 0xee5a5ab4, 0xfba0a05b,
-		0xf65252a4, 0x4d3b3b76, 0x61d6d6b7, 0xceb3b37d,
-		0x7b292952, 0x3ee3e3dd, 0x712f2f5e, 0x97848413,
-		0xf55353a6, 0x68d1d1b9, 0x00000000, 0x2cededc1,
-		0x60202040, 0x1ffcfce3, 0xc8b1b179, 0xed5b5bb6,
-		0xbe6a6ad4, 0x46cbcb8d, 0xd9bebe67, 0x4b393972,
-		0xde4a4a94, 0xd44c4c98, 0xe85858b0, 0x4acfcf85,
-		0x6bd0d0bb, 0x2aefefc5, 0xe5aaaa4f, 0x16fbfbed,
-		0xc5434386, 0xd74d4d9a, 0x55333366, 0x94858511,
-		0xcf45458a, 0x10f9f9e9, 0x06020204, 0x817f7ffe,
-		0xf05050a0, 0x443c3c78, 0xba9f9f25, 0xe3a8a84b,
-		0xf35151a2, 0xfea3a35d, 0xc0404080, 0x8a8f8f05,
-		0xad92923f, 0xbc9d9d21, 0x48383870, 0x04f5f5f1,
-		0xdfbcbc63, 0xc1b6b677, 0x75dadaaf, 0x63212142,
-		0x30101020, 0x1affffe5, 0x0ef3f3fd, 0x6dd2d2bf,
-		0x4ccdcd81, 0x140c0c18, 0x35131326, 0x2fececc3,
-		0xe15f5fbe, 0xa2979735, 0xcc444488, 0x3917172e,
-		0x57c4c493, 0xf2a7a755, 0x827e7efc, 0x473d3d7a,
-		0xac6464c8, 0xe75d5dba, 0x2b191932, 0x957373e6,
-		0xa06060c0, 0x98818119, 0xd14f4f9e, 0x7fdcdca3,
-		0x66222244, 0x7e2a2a54, 0xab90903b, 0x8388880b,
-		0xca46468c, 0x29eeeec7, 0xd3b8b86b, 0x3c141428,
-		0x79dedea7, 0xe25e5ebc, 0x1d0b0b16, 0x76dbdbad,
-		0x3be0e0db, 0x56323264, 0x4e3a3a74, 0x1e0a0a14,
-		0xdb494992, 0x0a06060c, 0x6c242448, 0xe45c5cb8,
-		0x5dc2c29f, 0x6ed3d3bd, 0xefacac43, 0xa66262c4,
-		0xa8919139, 0xa4959531, 0x37e4e4d3, 0x8b7979f2,
-		0x32e7e7d5, 0x43c8c88b, 0x5937376e, 0xb76d6dda,
-		0x8c8d8d01, 0x64d5d5b1, 0xd24e4e9c, 0xe0a9a949,
-		0xb46c6cd8, 0xfa5656ac, 0x07f4f4f3, 0x25eaeacf,
-		0xaf6565ca, 0x8e7a7af4, 0xe9aeae47, 0x18080810,
-		0xd5baba6f, 0x887878f0, 0x6f25254a, 0x722e2e5c,
-		0x241c1c38, 0xf1a6a657, 0xc7b4b473, 0x51c6c697,
-		0x23e8e8cb, 0x7cdddda1, 0x9c7474e8, 0x211f1f3e,
-		0xdd4b4b96, 0xdcbdbd61, 0x868b8b0d, 0x858a8a0f,
-		0x907070e0, 0x423e3e7c, 0xc4b5b571, 0xaa6666cc,
-		0xd8484890, 0x05030306, 0x01f6f6f7, 0x120e0e1c,
-		0xa36161c2, 0x5f35356a, 0xf95757ae, 0xd0b9b969,
-		0x91868617, 0x58c1c199, 0x271d1d3a, 0xb99e9e27,
-		0x38e1e1d9, 0x13f8f8eb, 0xb398982b, 0x33111122,
-		0xbb6969d2, 0x70d9d9a9, 0x898e8e07, 0xa7949433,
-		0xb69b9b2d, 0x221e1e3c, 0x92878715, 0x20e9e9c9,
-		0x49cece87, 0xff5555aa, 0x78282850, 0x7adfdfa5,
-		0x8f8c8c03, 0xf8a1a159, 0x80898909, 0x170d0d1a,
-		0xdabfbf65, 0x31e6e6d7, 0xc6424284, 0xb86868d0,
-		0xc3414182, 0xb0999929, 0x772d2d5a, 0x110f0f1e,
-		0xcbb0b07b, 0xfc5454a8, 0xd6bbbb6d, 0x3a16162c,
-	}, {
-		0x6363c6a5, 0x7c7cf884, 0x7777ee99, 0x7b7bf68d,
-		0xf2f2ff0d, 0x6b6bd6bd, 0x6f6fdeb1, 0xc5c59154,
-		0x30306050, 0x01010203, 0x6767cea9, 0x2b2b567d,
-		0xfefee719, 0xd7d7b562, 0xabab4de6, 0x7676ec9a,
-		0xcaca8f45, 0x82821f9d, 0xc9c98940, 0x7d7dfa87,
-		0xfafaef15, 0x5959b2eb, 0x47478ec9, 0xf0f0fb0b,
-		0xadad41ec, 0xd4d4b367, 0xa2a25ffd, 0xafaf45ea,
-		0x9c9c23bf, 0xa4a453f7, 0x7272e496, 0xc0c09b5b,
-		0xb7b775c2, 0xfdfde11c, 0x93933dae, 0x26264c6a,
-		0x36366c5a, 0x3f3f7e41, 0xf7f7f502, 0xcccc834f,
-		0x3434685c, 0xa5a551f4, 0xe5e5d134, 0xf1f1f908,
-		0x7171e293, 0xd8d8ab73, 0x31316253, 0x15152a3f,
-		0x0404080c, 0xc7c79552, 0x23234665, 0xc3c39d5e,
-		0x18183028, 0x969637a1, 0x05050a0f, 0x9a9a2fb5,
-		0x07070e09, 0x12122436, 0x80801b9b, 0xe2e2df3d,
-		0xebebcd26, 0x27274e69, 0xb2b27fcd, 0x7575ea9f,
-		0x0909121b, 0x83831d9e, 0x2c2c5874, 0x1a1a342e,
-		0x1b1b362d, 0x6e6edcb2, 0x5a5ab4ee, 0xa0a05bfb,
-		0x5252a4f6, 0x3b3b764d, 0xd6d6b761, 0xb3b37dce,
-		0x2929527b, 0xe3e3dd3e, 0x2f2f5e71, 0x84841397,
-		0x5353a6f5, 0xd1d1b968, 0x00000000, 0xededc12c,
-		0x20204060, 0xfcfce31f, 0xb1b179c8, 0x5b5bb6ed,
-		0x6a6ad4be, 0xcbcb8d46, 0xbebe67d9, 0x3939724b,
-		0x4a4a94de, 0x4c4c98d4, 0x5858b0e8, 0xcfcf854a,
-		0xd0d0bb6b, 0xefefc52a, 0xaaaa4fe5, 0xfbfbed16,
-		0x434386c5, 0x4d4d9ad7, 0x33336655, 0x85851194,
-		0x45458acf, 0xf9f9e910, 0x02020406, 0x7f7ffe81,
-		0x5050a0f0, 0x3c3c7844, 0x9f9f25ba, 0xa8a84be3,
-		0x5151a2f3, 0xa3a35dfe, 0x404080c0, 0x8f8f058a,
-		0x92923fad, 0x9d9d21bc, 0x38387048, 0xf5f5f104,
-		0xbcbc63df, 0xb6b677c1, 0xdadaaf75, 0x21214263,
-		0x10102030, 0xffffe51a, 0xf3f3fd0e, 0xd2d2bf6d,
-		0xcdcd814c, 0x0c0c1814, 0x13132635, 0xececc32f,
-		0x5f5fbee1, 0x979735a2, 0x444488cc, 0x17172e39,
-		0xc4c49357, 0xa7a755f2, 0x7e7efc82, 0x3d3d7a47,
-		0x6464c8ac, 0x5d5dbae7, 0x1919322b, 0x7373e695,
-		0x6060c0a0, 0x81811998, 0x4f4f9ed1, 0xdcdca37f,
-		0x22224466, 0x2a2a547e, 0x90903bab, 0x88880b83,
-		0x46468cca, 0xeeeec729, 0xb8b86bd3, 0x1414283c,
-		0xdedea779, 0x5e5ebce2, 0x0b0b161d, 0xdbdbad76,
-		0xe0e0db3b, 0x32326456, 0x3a3a744e, 0x0a0a141e,
-		0x494992db, 0x06060c0a, 0x2424486c, 0x5c5cb8e4,
-		0xc2c29f5d, 0xd3d3bd6e, 0xacac43ef, 0x6262c4a6,
-		0x919139a8, 0x959531a4, 0xe4e4d337, 0x7979f28b,
-		0xe7e7d532, 0xc8c88b43, 0x37376e59, 0x6d6ddab7,
-		0x8d8d018c, 0xd5d5b164, 0x4e4e9cd2, 0xa9a949e0,
-		0x6c6cd8b4, 0x5656acfa, 0xf4f4f307, 0xeaeacf25,
-		0x6565caaf, 0x7a7af48e, 0xaeae47e9, 0x08081018,
-		0xbaba6fd5, 0x7878f088, 0x25254a6f, 0x2e2e5c72,
-		0x1c1c3824, 0xa6a657f1, 0xb4b473c7, 0xc6c69751,
-		0xe8e8cb23, 0xdddda17c, 0x7474e89c, 0x1f1f3e21,
-		0x4b4b96dd, 0xbdbd61dc, 0x8b8b0d86, 0x8a8a0f85,
-		0x7070e090, 0x3e3e7c42, 0xb5b571c4, 0x6666ccaa,
-		0x484890d8, 0x03030605, 0xf6f6f701, 0x0e0e1c12,
-		0x6161c2a3, 0x35356a5f, 0x5757aef9, 0xb9b969d0,
-		0x86861791, 0xc1c19958, 0x1d1d3a27, 0x9e9e27b9,
-		0xe1e1d938, 0xf8f8eb13, 0x98982bb3, 0x11112233,
-		0x6969d2bb, 0xd9d9a970, 0x8e8e0789, 0x949433a7,
-		0x9b9b2db6, 0x1e1e3c22, 0x87871592, 0xe9e9c920,
-		0xcece8749, 0x5555aaff, 0x28285078, 0xdfdfa57a,
-		0x8c8c038f, 0xa1a159f8, 0x89890980, 0x0d0d1a17,
-		0xbfbf65da, 0xe6e6d731, 0x424284c6, 0x6868d0b8,
-		0x414182c3, 0x999929b0, 0x2d2d5a77, 0x0f0f1e11,
-		0xb0b07bcb, 0x5454a8fc, 0xbbbb6dd6, 0x16162c3a,
-	}, {
-		0x63c6a563, 0x7cf8847c, 0x77ee9977, 0x7bf68d7b,
-		0xf2ff0df2, 0x6bd6bd6b, 0x6fdeb16f, 0xc59154c5,
-		0x30605030, 0x01020301, 0x67cea967, 0x2b567d2b,
-		0xfee719fe, 0xd7b562d7, 0xab4de6ab, 0x76ec9a76,
-		0xca8f45ca, 0x821f9d82, 0xc98940c9, 0x7dfa877d,
-		0xfaef15fa, 0x59b2eb59, 0x478ec947, 0xf0fb0bf0,
-		0xad41ecad, 0xd4b367d4, 0xa25ffda2, 0xaf45eaaf,
-		0x9c23bf9c, 0xa453f7a4, 0x72e49672, 0xc09b5bc0,
-		0xb775c2b7, 0xfde11cfd, 0x933dae93, 0x264c6a26,
-		0x366c5a36, 0x3f7e413f, 0xf7f502f7, 0xcc834fcc,
-		0x34685c34, 0xa551f4a5, 0xe5d134e5, 0xf1f908f1,
-		0x71e29371, 0xd8ab73d8, 0x31625331, 0x152a3f15,
-		0x04080c04, 0xc79552c7, 0x23466523, 0xc39d5ec3,
-		0x18302818, 0x9637a196, 0x050a0f05, 0x9a2fb59a,
-		0x070e0907, 0x12243612, 0x801b9b80, 0xe2df3de2,
-		0xebcd26eb, 0x274e6927, 0xb27fcdb2, 0x75ea9f75,
-		0x09121b09, 0x831d9e83, 0x2c58742c, 0x1a342e1a,
-		0x1b362d1b, 0x6edcb26e, 0x5ab4ee5a, 0xa05bfba0,
-		0x52a4f652, 0x3b764d3b, 0xd6b761d6, 0xb37dceb3,
-		0x29527b29, 0xe3dd3ee3, 0x2f5e712f, 0x84139784,
-		0x53a6f553, 0xd1b968d1, 0x00000000, 0xedc12ced,
-		0x20406020, 0xfce31ffc, 0xb179c8b1, 0x5bb6ed5b,
-		0x6ad4be6a, 0xcb8d46cb, 0xbe67d9be, 0x39724b39,
-		0x4a94de4a, 0x4c98d44c, 0x58b0e858, 0xcf854acf,
-		0xd0bb6bd0, 0xefc52aef, 0xaa4fe5aa, 0xfbed16fb,
-		0x4386c543, 0x4d9ad74d, 0x33665533, 0x85119485,
-		0x458acf45, 0xf9e910f9, 0x02040602, 0x7ffe817f,
-		0x50a0f050, 0x3c78443c, 0x9f25ba9f, 0xa84be3a8,
-		0x51a2f351, 0xa35dfea3, 0x4080c040, 0x8f058a8f,
-		0x923fad92, 0x9d21bc9d, 0x38704838, 0xf5f104f5,
-		0xbc63dfbc, 0xb677c1b6, 0xdaaf75da, 0x21426321,
-		0x10203010, 0xffe51aff, 0xf3fd0ef3, 0xd2bf6dd2,
-		0xcd814ccd, 0x0c18140c, 0x13263513, 0xecc32fec,
-		0x5fbee15f, 0x9735a297, 0x4488cc44, 0x172e3917,
-		0xc49357c4, 0xa755f2a7, 0x7efc827e, 0x3d7a473d,
-		0x64c8ac64, 0x5dbae75d, 0x19322b19, 0x73e69573,
-		0x60c0a060, 0x81199881, 0x4f9ed14f, 0xdca37fdc,
-		0x22446622, 0x2a547e2a, 0x903bab90, 0x880b8388,
-		0x468cca46, 0xeec729ee, 0xb86bd3b8, 0x14283c14,
-		0xdea779de, 0x5ebce25e, 0x0b161d0b, 0xdbad76db,
-		0xe0db3be0, 0x32645632, 0x3a744e3a, 0x0a141e0a,
-		0x4992db49, 0x060c0a06, 0x24486c24, 0x5cb8e45c,
-		0xc29f5dc2, 0xd3bd6ed3, 0xac43efac, 0x62c4a662,
-		0x9139a891, 0x9531a495, 0xe4d337e4, 0x79f28b79,
-		0xe7d532e7, 0xc88b43c8, 0x376e5937, 0x6ddab76d,
-		0x8d018c8d, 0xd5b164d5, 0x4e9cd24e, 0xa949e0a9,
-		0x6cd8b46c, 0x56acfa56, 0xf4f307f4, 0xeacf25ea,
-		0x65caaf65, 0x7af48e7a, 0xae47e9ae, 0x08101808,
-		0xba6fd5ba, 0x78f08878, 0x254a6f25, 0x2e5c722e,
-		0x1c38241c, 0xa657f1a6, 0xb473c7b4, 0xc69751c6,
-		0xe8cb23e8, 0xdda17cdd, 0x74e89c74, 0x1f3e211f,
-		0x4b96dd4b, 0xbd61dcbd, 0x8b0d868b, 0x8a0f858a,
-		0x70e09070, 0x3e7c423e, 0xb571c4b5, 0x66ccaa66,
-		0x4890d848, 0x03060503, 0xf6f701f6, 0x0e1c120e,
-		0x61c2a361, 0x356a5f35, 0x57aef957, 0xb969d0b9,
-		0x86179186, 0xc19958c1, 0x1d3a271d, 0x9e27b99e,
-		0xe1d938e1, 0xf8eb13f8, 0x982bb398, 0x11223311,
-		0x69d2bb69, 0xd9a970d9, 0x8e07898e, 0x9433a794,
-		0x9b2db69b, 0x1e3c221e, 0x87159287, 0xe9c920e9,
-		0xce8749ce, 0x55aaff55, 0x28507828, 0xdfa57adf,
-		0x8c038f8c, 0xa159f8a1, 0x89098089, 0x0d1a170d,
-		0xbf65dabf, 0xe6d731e6, 0x4284c642, 0x68d0b868,
-		0x4182c341, 0x9929b099, 0x2d5a772d, 0x0f1e110f,
-		0xb07bcbb0, 0x54a8fc54, 0xbb6dd6bb, 0x162c3a16,
-	}, {
-		0xc6a56363, 0xf8847c7c, 0xee997777, 0xf68d7b7b,
-		0xff0df2f2, 0xd6bd6b6b, 0xdeb16f6f, 0x9154c5c5,
-		0x60503030, 0x02030101, 0xcea96767, 0x567d2b2b,
-		0xe719fefe, 0xb562d7d7, 0x4de6abab, 0xec9a7676,
-		0x8f45caca, 0x1f9d8282, 0x8940c9c9, 0xfa877d7d,
-		0xef15fafa, 0xb2eb5959, 0x8ec94747, 0xfb0bf0f0,
-		0x41ecadad, 0xb367d4d4, 0x5ffda2a2, 0x45eaafaf,
-		0x23bf9c9c, 0x53f7a4a4, 0xe4967272, 0x9b5bc0c0,
-		0x75c2b7b7, 0xe11cfdfd, 0x3dae9393, 0x4c6a2626,
-		0x6c5a3636, 0x7e413f3f, 0xf502f7f7, 0x834fcccc,
-		0x685c3434, 0x51f4a5a5, 0xd134e5e5, 0xf908f1f1,
-		0xe2937171, 0xab73d8d8, 0x62533131, 0x2a3f1515,
-		0x080c0404, 0x9552c7c7, 0x46652323, 0x9d5ec3c3,
-		0x30281818, 0x37a19696, 0x0a0f0505, 0x2fb59a9a,
-		0x0e090707, 0x24361212, 0x1b9b8080, 0xdf3de2e2,
-		0xcd26ebeb, 0x4e692727, 0x7fcdb2b2, 0xea9f7575,
-		0x121b0909, 0x1d9e8383, 0x58742c2c, 0x342e1a1a,
-		0x362d1b1b, 0xdcb26e6e, 0xb4ee5a5a, 0x5bfba0a0,
-		0xa4f65252, 0x764d3b3b, 0xb761d6d6, 0x7dceb3b3,
-		0x527b2929, 0xdd3ee3e3, 0x5e712f2f, 0x13978484,
-		0xa6f55353, 0xb968d1d1, 0x00000000, 0xc12ceded,
-		0x40602020, 0xe31ffcfc, 0x79c8b1b1, 0xb6ed5b5b,
-		0xd4be6a6a, 0x8d46cbcb, 0x67d9bebe, 0x724b3939,
-		0x94de4a4a, 0x98d44c4c, 0xb0e85858, 0x854acfcf,
-		0xbb6bd0d0, 0xc52aefef, 0x4fe5aaaa, 0xed16fbfb,
-		0x86c54343, 0x9ad74d4d, 0x66553333, 0x11948585,
-		0x8acf4545, 0xe910f9f9, 0x04060202, 0xfe817f7f,
-		0xa0f05050, 0x78443c3c, 0x25ba9f9f, 0x4be3a8a8,
-		0xa2f35151, 0x5dfea3a3, 0x80c04040, 0x058a8f8f,
-		0x3fad9292, 0x21bc9d9d, 0x70483838, 0xf104f5f5,
-		0x63dfbcbc, 0x77c1b6b6, 0xaf75dada, 0x42632121,
-		0x20301010, 0xe51affff, 0xfd0ef3f3, 0xbf6dd2d2,
-		0x814ccdcd, 0x18140c0c, 0x26351313, 0xc32fecec,
-		0xbee15f5f, 0x35a29797, 0x88cc4444, 0x2e391717,
-		0x9357c4c4, 0x55f2a7a7, 0xfc827e7e, 0x7a473d3d,
-		0xc8ac6464, 0xbae75d5d, 0x322b1919, 0xe6957373,
-		0xc0a06060, 0x19988181, 0x9ed14f4f, 0xa37fdcdc,
-		0x44662222, 0x547e2a2a, 0x3bab9090, 0x0b838888,
-		0x8cca4646, 0xc729eeee, 0x6bd3b8b8, 0x283c1414,
-		0xa779dede, 0xbce25e5e, 0x161d0b0b, 0xad76dbdb,
-		0xdb3be0e0, 0x64563232, 0x744e3a3a, 0x141e0a0a,
-		0x92db4949, 0x0c0a0606, 0x486c2424, 0xb8e45c5c,
-		0x9f5dc2c2, 0xbd6ed3d3, 0x43efacac, 0xc4a66262,
-		0x39a89191, 0x31a49595, 0xd337e4e4, 0xf28b7979,
-		0xd532e7e7, 0x8b43c8c8, 0x6e593737, 0xdab76d6d,
-		0x018c8d8d, 0xb164d5d5, 0x9cd24e4e, 0x49e0a9a9,
-		0xd8b46c6c, 0xacfa5656, 0xf307f4f4, 0xcf25eaea,
-		0xcaaf6565, 0xf48e7a7a, 0x47e9aeae, 0x10180808,
-		0x6fd5baba, 0xf0887878, 0x4a6f2525, 0x5c722e2e,
-		0x38241c1c, 0x57f1a6a6, 0x73c7b4b4, 0x9751c6c6,
-		0xcb23e8e8, 0xa17cdddd, 0xe89c7474, 0x3e211f1f,
-		0x96dd4b4b, 0x61dcbdbd, 0x0d868b8b, 0x0f858a8a,
-		0xe0907070, 0x7c423e3e, 0x71c4b5b5, 0xccaa6666,
-		0x90d84848, 0x06050303, 0xf701f6f6, 0x1c120e0e,
-		0xc2a36161, 0x6a5f3535, 0xaef95757, 0x69d0b9b9,
-		0x17918686, 0x9958c1c1, 0x3a271d1d, 0x27b99e9e,
-		0xd938e1e1, 0xeb13f8f8, 0x2bb39898, 0x22331111,
-		0xd2bb6969, 0xa970d9d9, 0x07898e8e, 0x33a79494,
-		0x2db69b9b, 0x3c221e1e, 0x15928787, 0xc920e9e9,
-		0x8749cece, 0xaaff5555, 0x50782828, 0xa57adfdf,
-		0x038f8c8c, 0x59f8a1a1, 0x09808989, 0x1a170d0d,
-		0x65dabfbf, 0xd731e6e6, 0x84c64242, 0xd0b86868,
-		0x82c34141, 0x29b09999, 0x5a772d2d, 0x1e110f0f,
-		0x7bcbb0b0, 0xa8fc5454, 0x6dd6bbbb, 0x2c3a1616,
-	}
-};
-
-static const u32 crypto_fl_tab[4][256] ____cacheline_aligned = {
-	{
-		0x00000063, 0x0000007c, 0x00000077, 0x0000007b,
-		0x000000f2, 0x0000006b, 0x0000006f, 0x000000c5,
-		0x00000030, 0x00000001, 0x00000067, 0x0000002b,
-		0x000000fe, 0x000000d7, 0x000000ab, 0x00000076,
-		0x000000ca, 0x00000082, 0x000000c9, 0x0000007d,
-		0x000000fa, 0x00000059, 0x00000047, 0x000000f0,
-		0x000000ad, 0x000000d4, 0x000000a2, 0x000000af,
-		0x0000009c, 0x000000a4, 0x00000072, 0x000000c0,
-		0x000000b7, 0x000000fd, 0x00000093, 0x00000026,
-		0x00000036, 0x0000003f, 0x000000f7, 0x000000cc,
-		0x00000034, 0x000000a5, 0x000000e5, 0x000000f1,
-		0x00000071, 0x000000d8, 0x00000031, 0x00000015,
-		0x00000004, 0x000000c7, 0x00000023, 0x000000c3,
-		0x00000018, 0x00000096, 0x00000005, 0x0000009a,
-		0x00000007, 0x00000012, 0x00000080, 0x000000e2,
-		0x000000eb, 0x00000027, 0x000000b2, 0x00000075,
-		0x00000009, 0x00000083, 0x0000002c, 0x0000001a,
-		0x0000001b, 0x0000006e, 0x0000005a, 0x000000a0,
-		0x00000052, 0x0000003b, 0x000000d6, 0x000000b3,
-		0x00000029, 0x000000e3, 0x0000002f, 0x00000084,
-		0x00000053, 0x000000d1, 0x00000000, 0x000000ed,
-		0x00000020, 0x000000fc, 0x000000b1, 0x0000005b,
-		0x0000006a, 0x000000cb, 0x000000be, 0x00000039,
-		0x0000004a, 0x0000004c, 0x00000058, 0x000000cf,
-		0x000000d0, 0x000000ef, 0x000000aa, 0x000000fb,
-		0x00000043, 0x0000004d, 0x00000033, 0x00000085,
-		0x00000045, 0x000000f9, 0x00000002, 0x0000007f,
-		0x00000050, 0x0000003c, 0x0000009f, 0x000000a8,
-		0x00000051, 0x000000a3, 0x00000040, 0x0000008f,
-		0x00000092, 0x0000009d, 0x00000038, 0x000000f5,
-		0x000000bc, 0x000000b6, 0x000000da, 0x00000021,
-		0x00000010, 0x000000ff, 0x000000f3, 0x000000d2,
-		0x000000cd, 0x0000000c, 0x00000013, 0x000000ec,
-		0x0000005f, 0x00000097, 0x00000044, 0x00000017,
-		0x000000c4, 0x000000a7, 0x0000007e, 0x0000003d,
-		0x00000064, 0x0000005d, 0x00000019, 0x00000073,
-		0x00000060, 0x00000081, 0x0000004f, 0x000000dc,
-		0x00000022, 0x0000002a, 0x00000090, 0x00000088,
-		0x00000046, 0x000000ee, 0x000000b8, 0x00000014,
-		0x000000de, 0x0000005e, 0x0000000b, 0x000000db,
-		0x000000e0, 0x00000032, 0x0000003a, 0x0000000a,
-		0x00000049, 0x00000006, 0x00000024, 0x0000005c,
-		0x000000c2, 0x000000d3, 0x000000ac, 0x00000062,
-		0x00000091, 0x00000095, 0x000000e4, 0x00000079,
-		0x000000e7, 0x000000c8, 0x00000037, 0x0000006d,
-		0x0000008d, 0x000000d5, 0x0000004e, 0x000000a9,
-		0x0000006c, 0x00000056, 0x000000f4, 0x000000ea,
-		0x00000065, 0x0000007a, 0x000000ae, 0x00000008,
-		0x000000ba, 0x00000078, 0x00000025, 0x0000002e,
-		0x0000001c, 0x000000a6, 0x000000b4, 0x000000c6,
-		0x000000e8, 0x000000dd, 0x00000074, 0x0000001f,
-		0x0000004b, 0x000000bd, 0x0000008b, 0x0000008a,
-		0x00000070, 0x0000003e, 0x000000b5, 0x00000066,
-		0x00000048, 0x00000003, 0x000000f6, 0x0000000e,
-		0x00000061, 0x00000035, 0x00000057, 0x000000b9,
-		0x00000086, 0x000000c1, 0x0000001d, 0x0000009e,
-		0x000000e1, 0x000000f8, 0x00000098, 0x00000011,
-		0x00000069, 0x000000d9, 0x0000008e, 0x00000094,
-		0x0000009b, 0x0000001e, 0x00000087, 0x000000e9,
-		0x000000ce, 0x00000055, 0x00000028, 0x000000df,
-		0x0000008c, 0x000000a1, 0x00000089, 0x0000000d,
-		0x000000bf, 0x000000e6, 0x00000042, 0x00000068,
-		0x00000041, 0x00000099, 0x0000002d, 0x0000000f,
-		0x000000b0, 0x00000054, 0x000000bb, 0x00000016,
-	}, {
-		0x00006300, 0x00007c00, 0x00007700, 0x00007b00,
-		0x0000f200, 0x00006b00, 0x00006f00, 0x0000c500,
-		0x00003000, 0x00000100, 0x00006700, 0x00002b00,
-		0x0000fe00, 0x0000d700, 0x0000ab00, 0x00007600,
-		0x0000ca00, 0x00008200, 0x0000c900, 0x00007d00,
-		0x0000fa00, 0x00005900, 0x00004700, 0x0000f000,
-		0x0000ad00, 0x0000d400, 0x0000a200, 0x0000af00,
-		0x00009c00, 0x0000a400, 0x00007200, 0x0000c000,
-		0x0000b700, 0x0000fd00, 0x00009300, 0x00002600,
-		0x00003600, 0x00003f00, 0x0000f700, 0x0000cc00,
-		0x00003400, 0x0000a500, 0x0000e500, 0x0000f100,
-		0x00007100, 0x0000d800, 0x00003100, 0x00001500,
-		0x00000400, 0x0000c700, 0x00002300, 0x0000c300,
-		0x00001800, 0x00009600, 0x00000500, 0x00009a00,
-		0x00000700, 0x00001200, 0x00008000, 0x0000e200,
-		0x0000eb00, 0x00002700, 0x0000b200, 0x00007500,
-		0x00000900, 0x00008300, 0x00002c00, 0x00001a00,
-		0x00001b00, 0x00006e00, 0x00005a00, 0x0000a000,
-		0x00005200, 0x00003b00, 0x0000d600, 0x0000b300,
-		0x00002900, 0x0000e300, 0x00002f00, 0x00008400,
-		0x00005300, 0x0000d100, 0x00000000, 0x0000ed00,
-		0x00002000, 0x0000fc00, 0x0000b100, 0x00005b00,
-		0x00006a00, 0x0000cb00, 0x0000be00, 0x00003900,
-		0x00004a00, 0x00004c00, 0x00005800, 0x0000cf00,
-		0x0000d000, 0x0000ef00, 0x0000aa00, 0x0000fb00,
-		0x00004300, 0x00004d00, 0x00003300, 0x00008500,
-		0x00004500, 0x0000f900, 0x00000200, 0x00007f00,
-		0x00005000, 0x00003c00, 0x00009f00, 0x0000a800,
-		0x00005100, 0x0000a300, 0x00004000, 0x00008f00,
-		0x00009200, 0x00009d00, 0x00003800, 0x0000f500,
-		0x0000bc00, 0x0000b600, 0x0000da00, 0x00002100,
-		0x00001000, 0x0000ff00, 0x0000f300, 0x0000d200,
-		0x0000cd00, 0x00000c00, 0x00001300, 0x0000ec00,
-		0x00005f00, 0x00009700, 0x00004400, 0x00001700,
-		0x0000c400, 0x0000a700, 0x00007e00, 0x00003d00,
-		0x00006400, 0x00005d00, 0x00001900, 0x00007300,
-		0x00006000, 0x00008100, 0x00004f00, 0x0000dc00,
-		0x00002200, 0x00002a00, 0x00009000, 0x00008800,
-		0x00004600, 0x0000ee00, 0x0000b800, 0x00001400,
-		0x0000de00, 0x00005e00, 0x00000b00, 0x0000db00,
-		0x0000e000, 0x00003200, 0x00003a00, 0x00000a00,
-		0x00004900, 0x00000600, 0x00002400, 0x00005c00,
-		0x0000c200, 0x0000d300, 0x0000ac00, 0x00006200,
-		0x00009100, 0x00009500, 0x0000e400, 0x00007900,
-		0x0000e700, 0x0000c800, 0x00003700, 0x00006d00,
-		0x00008d00, 0x0000d500, 0x00004e00, 0x0000a900,
-		0x00006c00, 0x00005600, 0x0000f400, 0x0000ea00,
-		0x00006500, 0x00007a00, 0x0000ae00, 0x00000800,
-		0x0000ba00, 0x00007800, 0x00002500, 0x00002e00,
-		0x00001c00, 0x0000a600, 0x0000b400, 0x0000c600,
-		0x0000e800, 0x0000dd00, 0x00007400, 0x00001f00,
-		0x00004b00, 0x0000bd00, 0x00008b00, 0x00008a00,
-		0x00007000, 0x00003e00, 0x0000b500, 0x00006600,
-		0x00004800, 0x00000300, 0x0000f600, 0x00000e00,
-		0x00006100, 0x00003500, 0x00005700, 0x0000b900,
-		0x00008600, 0x0000c100, 0x00001d00, 0x00009e00,
-		0x0000e100, 0x0000f800, 0x00009800, 0x00001100,
-		0x00006900, 0x0000d900, 0x00008e00, 0x00009400,
-		0x00009b00, 0x00001e00, 0x00008700, 0x0000e900,
-		0x0000ce00, 0x00005500, 0x00002800, 0x0000df00,
-		0x00008c00, 0x0000a100, 0x00008900, 0x00000d00,
-		0x0000bf00, 0x0000e600, 0x00004200, 0x00006800,
-		0x00004100, 0x00009900, 0x00002d00, 0x00000f00,
-		0x0000b000, 0x00005400, 0x0000bb00, 0x00001600,
-	}, {
-		0x00630000, 0x007c0000, 0x00770000, 0x007b0000,
-		0x00f20000, 0x006b0000, 0x006f0000, 0x00c50000,
-		0x00300000, 0x00010000, 0x00670000, 0x002b0000,
-		0x00fe0000, 0x00d70000, 0x00ab0000, 0x00760000,
-		0x00ca0000, 0x00820000, 0x00c90000, 0x007d0000,
-		0x00fa0000, 0x00590000, 0x00470000, 0x00f00000,
-		0x00ad0000, 0x00d40000, 0x00a20000, 0x00af0000,
-		0x009c0000, 0x00a40000, 0x00720000, 0x00c00000,
-		0x00b70000, 0x00fd0000, 0x00930000, 0x00260000,
-		0x00360000, 0x003f0000, 0x00f70000, 0x00cc0000,
-		0x00340000, 0x00a50000, 0x00e50000, 0x00f10000,
-		0x00710000, 0x00d80000, 0x00310000, 0x00150000,
-		0x00040000, 0x00c70000, 0x00230000, 0x00c30000,
-		0x00180000, 0x00960000, 0x00050000, 0x009a0000,
-		0x00070000, 0x00120000, 0x00800000, 0x00e20000,
-		0x00eb0000, 0x00270000, 0x00b20000, 0x00750000,
-		0x00090000, 0x00830000, 0x002c0000, 0x001a0000,
-		0x001b0000, 0x006e0000, 0x005a0000, 0x00a00000,
-		0x00520000, 0x003b0000, 0x00d60000, 0x00b30000,
-		0x00290000, 0x00e30000, 0x002f0000, 0x00840000,
-		0x00530000, 0x00d10000, 0x00000000, 0x00ed0000,
-		0x00200000, 0x00fc0000, 0x00b10000, 0x005b0000,
-		0x006a0000, 0x00cb0000, 0x00be0000, 0x00390000,
-		0x004a0000, 0x004c0000, 0x00580000, 0x00cf0000,
-		0x00d00000, 0x00ef0000, 0x00aa0000, 0x00fb0000,
-		0x00430000, 0x004d0000, 0x00330000, 0x00850000,
-		0x00450000, 0x00f90000, 0x00020000, 0x007f0000,
-		0x00500000, 0x003c0000, 0x009f0000, 0x00a80000,
-		0x00510000, 0x00a30000, 0x00400000, 0x008f0000,
-		0x00920000, 0x009d0000, 0x00380000, 0x00f50000,
-		0x00bc0000, 0x00b60000, 0x00da0000, 0x00210000,
-		0x00100000, 0x00ff0000, 0x00f30000, 0x00d20000,
-		0x00cd0000, 0x000c0000, 0x00130000, 0x00ec0000,
-		0x005f0000, 0x00970000, 0x00440000, 0x00170000,
-		0x00c40000, 0x00a70000, 0x007e0000, 0x003d0000,
-		0x00640000, 0x005d0000, 0x00190000, 0x00730000,
-		0x00600000, 0x00810000, 0x004f0000, 0x00dc0000,
-		0x00220000, 0x002a0000, 0x00900000, 0x00880000,
-		0x00460000, 0x00ee0000, 0x00b80000, 0x00140000,
-		0x00de0000, 0x005e0000, 0x000b0000, 0x00db0000,
-		0x00e00000, 0x00320000, 0x003a0000, 0x000a0000,
-		0x00490000, 0x00060000, 0x00240000, 0x005c0000,
-		0x00c20000, 0x00d30000, 0x00ac0000, 0x00620000,
-		0x00910000, 0x00950000, 0x00e40000, 0x00790000,
-		0x00e70000, 0x00c80000, 0x00370000, 0x006d0000,
-		0x008d0000, 0x00d50000, 0x004e0000, 0x00a90000,
-		0x006c0000, 0x00560000, 0x00f40000, 0x00ea0000,
-		0x00650000, 0x007a0000, 0x00ae0000, 0x00080000,
-		0x00ba0000, 0x00780000, 0x00250000, 0x002e0000,
-		0x001c0000, 0x00a60000, 0x00b40000, 0x00c60000,
-		0x00e80000, 0x00dd0000, 0x00740000, 0x001f0000,
-		0x004b0000, 0x00bd0000, 0x008b0000, 0x008a0000,
-		0x00700000, 0x003e0000, 0x00b50000, 0x00660000,
-		0x00480000, 0x00030000, 0x00f60000, 0x000e0000,
-		0x00610000, 0x00350000, 0x00570000, 0x00b90000,
-		0x00860000, 0x00c10000, 0x001d0000, 0x009e0000,
-		0x00e10000, 0x00f80000, 0x00980000, 0x00110000,
-		0x00690000, 0x00d90000, 0x008e0000, 0x00940000,
-		0x009b0000, 0x001e0000, 0x00870000, 0x00e90000,
-		0x00ce0000, 0x00550000, 0x00280000, 0x00df0000,
-		0x008c0000, 0x00a10000, 0x00890000, 0x000d0000,
-		0x00bf0000, 0x00e60000, 0x00420000, 0x00680000,
-		0x00410000, 0x00990000, 0x002d0000, 0x000f0000,
-		0x00b00000, 0x00540000, 0x00bb0000, 0x00160000,
-	}, {
-		0x63000000, 0x7c000000, 0x77000000, 0x7b000000,
-		0xf2000000, 0x6b000000, 0x6f000000, 0xc5000000,
-		0x30000000, 0x01000000, 0x67000000, 0x2b000000,
-		0xfe000000, 0xd7000000, 0xab000000, 0x76000000,
-		0xca000000, 0x82000000, 0xc9000000, 0x7d000000,
-		0xfa000000, 0x59000000, 0x47000000, 0xf0000000,
-		0xad000000, 0xd4000000, 0xa2000000, 0xaf000000,
-		0x9c000000, 0xa4000000, 0x72000000, 0xc0000000,
-		0xb7000000, 0xfd000000, 0x93000000, 0x26000000,
-		0x36000000, 0x3f000000, 0xf7000000, 0xcc000000,
-		0x34000000, 0xa5000000, 0xe5000000, 0xf1000000,
-		0x71000000, 0xd8000000, 0x31000000, 0x15000000,
-		0x04000000, 0xc7000000, 0x23000000, 0xc3000000,
-		0x18000000, 0x96000000, 0x05000000, 0x9a000000,
-		0x07000000, 0x12000000, 0x80000000, 0xe2000000,
-		0xeb000000, 0x27000000, 0xb2000000, 0x75000000,
-		0x09000000, 0x83000000, 0x2c000000, 0x1a000000,
-		0x1b000000, 0x6e000000, 0x5a000000, 0xa0000000,
-		0x52000000, 0x3b000000, 0xd6000000, 0xb3000000,
-		0x29000000, 0xe3000000, 0x2f000000, 0x84000000,
-		0x53000000, 0xd1000000, 0x00000000, 0xed000000,
-		0x20000000, 0xfc000000, 0xb1000000, 0x5b000000,
-		0x6a000000, 0xcb000000, 0xbe000000, 0x39000000,
-		0x4a000000, 0x4c000000, 0x58000000, 0xcf000000,
-		0xd0000000, 0xef000000, 0xaa000000, 0xfb000000,
-		0x43000000, 0x4d000000, 0x33000000, 0x85000000,
-		0x45000000, 0xf9000000, 0x02000000, 0x7f000000,
-		0x50000000, 0x3c000000, 0x9f000000, 0xa8000000,
-		0x51000000, 0xa3000000, 0x40000000, 0x8f000000,
-		0x92000000, 0x9d000000, 0x38000000, 0xf5000000,
-		0xbc000000, 0xb6000000, 0xda000000, 0x21000000,
-		0x10000000, 0xff000000, 0xf3000000, 0xd2000000,
-		0xcd000000, 0x0c000000, 0x13000000, 0xec000000,
-		0x5f000000, 0x97000000, 0x44000000, 0x17000000,
-		0xc4000000, 0xa7000000, 0x7e000000, 0x3d000000,
-		0x64000000, 0x5d000000, 0x19000000, 0x73000000,
-		0x60000000, 0x81000000, 0x4f000000, 0xdc000000,
-		0x22000000, 0x2a000000, 0x90000000, 0x88000000,
-		0x46000000, 0xee000000, 0xb8000000, 0x14000000,
-		0xde000000, 0x5e000000, 0x0b000000, 0xdb000000,
-		0xe0000000, 0x32000000, 0x3a000000, 0x0a000000,
-		0x49000000, 0x06000000, 0x24000000, 0x5c000000,
-		0xc2000000, 0xd3000000, 0xac000000, 0x62000000,
-		0x91000000, 0x95000000, 0xe4000000, 0x79000000,
-		0xe7000000, 0xc8000000, 0x37000000, 0x6d000000,
-		0x8d000000, 0xd5000000, 0x4e000000, 0xa9000000,
-		0x6c000000, 0x56000000, 0xf4000000, 0xea000000,
-		0x65000000, 0x7a000000, 0xae000000, 0x08000000,
-		0xba000000, 0x78000000, 0x25000000, 0x2e000000,
-		0x1c000000, 0xa6000000, 0xb4000000, 0xc6000000,
-		0xe8000000, 0xdd000000, 0x74000000, 0x1f000000,
-		0x4b000000, 0xbd000000, 0x8b000000, 0x8a000000,
-		0x70000000, 0x3e000000, 0xb5000000, 0x66000000,
-		0x48000000, 0x03000000, 0xf6000000, 0x0e000000,
-		0x61000000, 0x35000000, 0x57000000, 0xb9000000,
-		0x86000000, 0xc1000000, 0x1d000000, 0x9e000000,
-		0xe1000000, 0xf8000000, 0x98000000, 0x11000000,
-		0x69000000, 0xd9000000, 0x8e000000, 0x94000000,
-		0x9b000000, 0x1e000000, 0x87000000, 0xe9000000,
-		0xce000000, 0x55000000, 0x28000000, 0xdf000000,
-		0x8c000000, 0xa1000000, 0x89000000, 0x0d000000,
-		0xbf000000, 0xe6000000, 0x42000000, 0x68000000,
-		0x41000000, 0x99000000, 0x2d000000, 0x0f000000,
-		0xb0000000, 0x54000000, 0xbb000000, 0x16000000,
-	}
-};
-
-__visible const u32 crypto_it_tab[4][256] ____cacheline_aligned = {
-	{
-		0x50a7f451, 0x5365417e, 0xc3a4171a, 0x965e273a,
-		0xcb6bab3b, 0xf1459d1f, 0xab58faac, 0x9303e34b,
-		0x55fa3020, 0xf66d76ad, 0x9176cc88, 0x254c02f5,
-		0xfcd7e54f, 0xd7cb2ac5, 0x80443526, 0x8fa362b5,
-		0x495ab1de, 0x671bba25, 0x980eea45, 0xe1c0fe5d,
-		0x02752fc3, 0x12f04c81, 0xa397468d, 0xc6f9d36b,
-		0xe75f8f03, 0x959c9215, 0xeb7a6dbf, 0xda595295,
-		0x2d83bed4, 0xd3217458, 0x2969e049, 0x44c8c98e,
-		0x6a89c275, 0x78798ef4, 0x6b3e5899, 0xdd71b927,
-		0xb64fe1be, 0x17ad88f0, 0x66ac20c9, 0xb43ace7d,
-		0x184adf63, 0x82311ae5, 0x60335197, 0x457f5362,
-		0xe07764b1, 0x84ae6bbb, 0x1ca081fe, 0x942b08f9,
-		0x58684870, 0x19fd458f, 0x876cde94, 0xb7f87b52,
-		0x23d373ab, 0xe2024b72, 0x578f1fe3, 0x2aab5566,
-		0x0728ebb2, 0x03c2b52f, 0x9a7bc586, 0xa50837d3,
-		0xf2872830, 0xb2a5bf23, 0xba6a0302, 0x5c8216ed,
-		0x2b1ccf8a, 0x92b479a7, 0xf0f207f3, 0xa1e2694e,
-		0xcdf4da65, 0xd5be0506, 0x1f6234d1, 0x8afea6c4,
-		0x9d532e34, 0xa055f3a2, 0x32e18a05, 0x75ebf6a4,
-		0x39ec830b, 0xaaef6040, 0x069f715e, 0x51106ebd,
-		0xf98a213e, 0x3d06dd96, 0xae053edd, 0x46bde64d,
-		0xb58d5491, 0x055dc471, 0x6fd40604, 0xff155060,
-		0x24fb9819, 0x97e9bdd6, 0xcc434089, 0x779ed967,
-		0xbd42e8b0, 0x888b8907, 0x385b19e7, 0xdbeec879,
-		0x470a7ca1, 0xe90f427c, 0xc91e84f8, 0x00000000,
-		0x83868009, 0x48ed2b32, 0xac70111e, 0x4e725a6c,
-		0xfbff0efd, 0x5638850f, 0x1ed5ae3d, 0x27392d36,
-		0x64d90f0a, 0x21a65c68, 0xd1545b9b, 0x3a2e3624,
-		0xb1670a0c, 0x0fe75793, 0xd296eeb4, 0x9e919b1b,
-		0x4fc5c080, 0xa220dc61, 0x694b775a, 0x161a121c,
-		0x0aba93e2, 0xe52aa0c0, 0x43e0223c, 0x1d171b12,
-		0x0b0d090e, 0xadc78bf2, 0xb9a8b62d, 0xc8a91e14,
-		0x8519f157, 0x4c0775af, 0xbbdd99ee, 0xfd607fa3,
-		0x9f2601f7, 0xbcf5725c, 0xc53b6644, 0x347efb5b,
-		0x7629438b, 0xdcc623cb, 0x68fcedb6, 0x63f1e4b8,
-		0xcadc31d7, 0x10856342, 0x40229713, 0x2011c684,
-		0x7d244a85, 0xf83dbbd2, 0x1132f9ae, 0x6da129c7,
-		0x4b2f9e1d, 0xf330b2dc, 0xec52860d, 0xd0e3c177,
-		0x6c16b32b, 0x99b970a9, 0xfa489411, 0x2264e947,
-		0xc48cfca8, 0x1a3ff0a0, 0xd82c7d56, 0xef903322,
-		0xc74e4987, 0xc1d138d9, 0xfea2ca8c, 0x360bd498,
-		0xcf81f5a6, 0x28de7aa5, 0x268eb7da, 0xa4bfad3f,
-		0xe49d3a2c, 0x0d927850, 0x9bcc5f6a, 0x62467e54,
-		0xc2138df6, 0xe8b8d890, 0x5ef7392e, 0xf5afc382,
-		0xbe805d9f, 0x7c93d069, 0xa92dd56f, 0xb31225cf,
-		0x3b99acc8, 0xa77d1810, 0x6e639ce8, 0x7bbb3bdb,
-		0x097826cd, 0xf418596e, 0x01b79aec, 0xa89a4f83,
-		0x656e95e6, 0x7ee6ffaa, 0x08cfbc21, 0xe6e815ef,
-		0xd99be7ba, 0xce366f4a, 0xd4099fea, 0xd67cb029,
-		0xafb2a431, 0x31233f2a, 0x3094a5c6, 0xc066a235,
-		0x37bc4e74, 0xa6ca82fc, 0xb0d090e0, 0x15d8a733,
-		0x4a9804f1, 0xf7daec41, 0x0e50cd7f, 0x2ff69117,
-		0x8dd64d76, 0x4db0ef43, 0x544daacc, 0xdf0496e4,
-		0xe3b5d19e, 0x1b886a4c, 0xb81f2cc1, 0x7f516546,
-		0x04ea5e9d, 0x5d358c01, 0x737487fa, 0x2e410bfb,
-		0x5a1d67b3, 0x52d2db92, 0x335610e9, 0x1347d66d,
-		0x8c61d79a, 0x7a0ca137, 0x8e14f859, 0x893c13eb,
-		0xee27a9ce, 0x35c961b7, 0xede51ce1, 0x3cb1477a,
-		0x59dfd29c, 0x3f73f255, 0x79ce1418, 0xbf37c773,
-		0xeacdf753, 0x5baafd5f, 0x146f3ddf, 0x86db4478,
-		0x81f3afca, 0x3ec468b9, 0x2c342438, 0x5f40a3c2,
-		0x72c31d16, 0x0c25e2bc, 0x8b493c28, 0x41950dff,
-		0x7101a839, 0xdeb30c08, 0x9ce4b4d8, 0x90c15664,
-		0x6184cb7b, 0x70b632d5, 0x745c6c48, 0x4257b8d0,
-	}, {
-		0xa7f45150, 0x65417e53, 0xa4171ac3, 0x5e273a96,
-		0x6bab3bcb, 0x459d1ff1, 0x58faacab, 0x03e34b93,
-		0xfa302055, 0x6d76adf6, 0x76cc8891, 0x4c02f525,
-		0xd7e54ffc, 0xcb2ac5d7, 0x44352680, 0xa362b58f,
-		0x5ab1de49, 0x1bba2567, 0x0eea4598, 0xc0fe5de1,
-		0x752fc302, 0xf04c8112, 0x97468da3, 0xf9d36bc6,
-		0x5f8f03e7, 0x9c921595, 0x7a6dbfeb, 0x595295da,
-		0x83bed42d, 0x217458d3, 0x69e04929, 0xc8c98e44,
-		0x89c2756a, 0x798ef478, 0x3e58996b, 0x71b927dd,
-		0x4fe1beb6, 0xad88f017, 0xac20c966, 0x3ace7db4,
-		0x4adf6318, 0x311ae582, 0x33519760, 0x7f536245,
-		0x7764b1e0, 0xae6bbb84, 0xa081fe1c, 0x2b08f994,
-		0x68487058, 0xfd458f19, 0x6cde9487, 0xf87b52b7,
-		0xd373ab23, 0x024b72e2, 0x8f1fe357, 0xab55662a,
-		0x28ebb207, 0xc2b52f03, 0x7bc5869a, 0x0837d3a5,
-		0x872830f2, 0xa5bf23b2, 0x6a0302ba, 0x8216ed5c,
-		0x1ccf8a2b, 0xb479a792, 0xf207f3f0, 0xe2694ea1,
-		0xf4da65cd, 0xbe0506d5, 0x6234d11f, 0xfea6c48a,
-		0x532e349d, 0x55f3a2a0, 0xe18a0532, 0xebf6a475,
-		0xec830b39, 0xef6040aa, 0x9f715e06, 0x106ebd51,
-		0x8a213ef9, 0x06dd963d, 0x053eddae, 0xbde64d46,
-		0x8d5491b5, 0x5dc47105, 0xd406046f, 0x155060ff,
-		0xfb981924, 0xe9bdd697, 0x434089cc, 0x9ed96777,
-		0x42e8b0bd, 0x8b890788, 0x5b19e738, 0xeec879db,
-		0x0a7ca147, 0x0f427ce9, 0x1e84f8c9, 0x00000000,
-		0x86800983, 0xed2b3248, 0x70111eac, 0x725a6c4e,
-		0xff0efdfb, 0x38850f56, 0xd5ae3d1e, 0x392d3627,
-		0xd90f0a64, 0xa65c6821, 0x545b9bd1, 0x2e36243a,
-		0x670a0cb1, 0xe757930f, 0x96eeb4d2, 0x919b1b9e,
-		0xc5c0804f, 0x20dc61a2, 0x4b775a69, 0x1a121c16,
-		0xba93e20a, 0x2aa0c0e5, 0xe0223c43, 0x171b121d,
-		0x0d090e0b, 0xc78bf2ad, 0xa8b62db9, 0xa91e14c8,
-		0x19f15785, 0x0775af4c, 0xdd99eebb, 0x607fa3fd,
-		0x2601f79f, 0xf5725cbc, 0x3b6644c5, 0x7efb5b34,
-		0x29438b76, 0xc623cbdc, 0xfcedb668, 0xf1e4b863,
-		0xdc31d7ca, 0x85634210, 0x22971340, 0x11c68420,
-		0x244a857d, 0x3dbbd2f8, 0x32f9ae11, 0xa129c76d,
-		0x2f9e1d4b, 0x30b2dcf3, 0x52860dec, 0xe3c177d0,
-		0x16b32b6c, 0xb970a999, 0x489411fa, 0x64e94722,
-		0x8cfca8c4, 0x3ff0a01a, 0x2c7d56d8, 0x903322ef,
-		0x4e4987c7, 0xd138d9c1, 0xa2ca8cfe, 0x0bd49836,
-		0x81f5a6cf, 0xde7aa528, 0x8eb7da26, 0xbfad3fa4,
-		0x9d3a2ce4, 0x9278500d, 0xcc5f6a9b, 0x467e5462,
-		0x138df6c2, 0xb8d890e8, 0xf7392e5e, 0xafc382f5,
-		0x805d9fbe, 0x93d0697c, 0x2dd56fa9, 0x1225cfb3,
-		0x99acc83b, 0x7d1810a7, 0x639ce86e, 0xbb3bdb7b,
-		0x7826cd09, 0x18596ef4, 0xb79aec01, 0x9a4f83a8,
-		0x6e95e665, 0xe6ffaa7e, 0xcfbc2108, 0xe815efe6,
-		0x9be7bad9, 0x366f4ace, 0x099fead4, 0x7cb029d6,
-		0xb2a431af, 0x233f2a31, 0x94a5c630, 0x66a235c0,
-		0xbc4e7437, 0xca82fca6, 0xd090e0b0, 0xd8a73315,
-		0x9804f14a, 0xdaec41f7, 0x50cd7f0e, 0xf691172f,
-		0xd64d768d, 0xb0ef434d, 0x4daacc54, 0x0496e4df,
-		0xb5d19ee3, 0x886a4c1b, 0x1f2cc1b8, 0x5165467f,
-		0xea5e9d04, 0x358c015d, 0x7487fa73, 0x410bfb2e,
-		0x1d67b35a, 0xd2db9252, 0x5610e933, 0x47d66d13,
-		0x61d79a8c, 0x0ca1377a, 0x14f8598e, 0x3c13eb89,
-		0x27a9ceee, 0xc961b735, 0xe51ce1ed, 0xb1477a3c,
-		0xdfd29c59, 0x73f2553f, 0xce141879, 0x37c773bf,
-		0xcdf753ea, 0xaafd5f5b, 0x6f3ddf14, 0xdb447886,
-		0xf3afca81, 0xc468b93e, 0x3424382c, 0x40a3c25f,
-		0xc31d1672, 0x25e2bc0c, 0x493c288b, 0x950dff41,
-		0x01a83971, 0xb30c08de, 0xe4b4d89c, 0xc1566490,
-		0x84cb7b61, 0xb632d570, 0x5c6c4874, 0x57b8d042,
-	}, {
-		0xf45150a7, 0x417e5365, 0x171ac3a4, 0x273a965e,
-		0xab3bcb6b, 0x9d1ff145, 0xfaacab58, 0xe34b9303,
-		0x302055fa, 0x76adf66d, 0xcc889176, 0x02f5254c,
-		0xe54ffcd7, 0x2ac5d7cb, 0x35268044, 0x62b58fa3,
-		0xb1de495a, 0xba25671b, 0xea45980e, 0xfe5de1c0,
-		0x2fc30275, 0x4c8112f0, 0x468da397, 0xd36bc6f9,
-		0x8f03e75f, 0x9215959c, 0x6dbfeb7a, 0x5295da59,
-		0xbed42d83, 0x7458d321, 0xe0492969, 0xc98e44c8,
-		0xc2756a89, 0x8ef47879, 0x58996b3e, 0xb927dd71,
-		0xe1beb64f, 0x88f017ad, 0x20c966ac, 0xce7db43a,
-		0xdf63184a, 0x1ae58231, 0x51976033, 0x5362457f,
-		0x64b1e077, 0x6bbb84ae, 0x81fe1ca0, 0x08f9942b,
-		0x48705868, 0x458f19fd, 0xde94876c, 0x7b52b7f8,
-		0x73ab23d3, 0x4b72e202, 0x1fe3578f, 0x55662aab,
-		0xebb20728, 0xb52f03c2, 0xc5869a7b, 0x37d3a508,
-		0x2830f287, 0xbf23b2a5, 0x0302ba6a, 0x16ed5c82,
-		0xcf8a2b1c, 0x79a792b4, 0x07f3f0f2, 0x694ea1e2,
-		0xda65cdf4, 0x0506d5be, 0x34d11f62, 0xa6c48afe,
-		0x2e349d53, 0xf3a2a055, 0x8a0532e1, 0xf6a475eb,
-		0x830b39ec, 0x6040aaef, 0x715e069f, 0x6ebd5110,
-		0x213ef98a, 0xdd963d06, 0x3eddae05, 0xe64d46bd,
-		0x5491b58d, 0xc471055d, 0x06046fd4, 0x5060ff15,
-		0x981924fb, 0xbdd697e9, 0x4089cc43, 0xd967779e,
-		0xe8b0bd42, 0x8907888b, 0x19e7385b, 0xc879dbee,
-		0x7ca1470a, 0x427ce90f, 0x84f8c91e, 0x00000000,
-		0x80098386, 0x2b3248ed, 0x111eac70, 0x5a6c4e72,
-		0x0efdfbff, 0x850f5638, 0xae3d1ed5, 0x2d362739,
-		0x0f0a64d9, 0x5c6821a6, 0x5b9bd154, 0x36243a2e,
-		0x0a0cb167, 0x57930fe7, 0xeeb4d296, 0x9b1b9e91,
-		0xc0804fc5, 0xdc61a220, 0x775a694b, 0x121c161a,
-		0x93e20aba, 0xa0c0e52a, 0x223c43e0, 0x1b121d17,
-		0x090e0b0d, 0x8bf2adc7, 0xb62db9a8, 0x1e14c8a9,
-		0xf1578519, 0x75af4c07, 0x99eebbdd, 0x7fa3fd60,
-		0x01f79f26, 0x725cbcf5, 0x6644c53b, 0xfb5b347e,
-		0x438b7629, 0x23cbdcc6, 0xedb668fc, 0xe4b863f1,
-		0x31d7cadc, 0x63421085, 0x97134022, 0xc6842011,
-		0x4a857d24, 0xbbd2f83d, 0xf9ae1132, 0x29c76da1,
-		0x9e1d4b2f, 0xb2dcf330, 0x860dec52, 0xc177d0e3,
-		0xb32b6c16, 0x70a999b9, 0x9411fa48, 0xe9472264,
-		0xfca8c48c, 0xf0a01a3f, 0x7d56d82c, 0x3322ef90,
-		0x4987c74e, 0x38d9c1d1, 0xca8cfea2, 0xd498360b,
-		0xf5a6cf81, 0x7aa528de, 0xb7da268e, 0xad3fa4bf,
-		0x3a2ce49d, 0x78500d92, 0x5f6a9bcc, 0x7e546246,
-		0x8df6c213, 0xd890e8b8, 0x392e5ef7, 0xc382f5af,
-		0x5d9fbe80, 0xd0697c93, 0xd56fa92d, 0x25cfb312,
-		0xacc83b99, 0x1810a77d, 0x9ce86e63, 0x3bdb7bbb,
-		0x26cd0978, 0x596ef418, 0x9aec01b7, 0x4f83a89a,
-		0x95e6656e, 0xffaa7ee6, 0xbc2108cf, 0x15efe6e8,
-		0xe7bad99b, 0x6f4ace36, 0x9fead409, 0xb029d67c,
-		0xa431afb2, 0x3f2a3123, 0xa5c63094, 0xa235c066,
-		0x4e7437bc, 0x82fca6ca, 0x90e0b0d0, 0xa73315d8,
-		0x04f14a98, 0xec41f7da, 0xcd7f0e50, 0x91172ff6,
-		0x4d768dd6, 0xef434db0, 0xaacc544d, 0x96e4df04,
-		0xd19ee3b5, 0x6a4c1b88, 0x2cc1b81f, 0x65467f51,
-		0x5e9d04ea, 0x8c015d35, 0x87fa7374, 0x0bfb2e41,
-		0x67b35a1d, 0xdb9252d2, 0x10e93356, 0xd66d1347,
-		0xd79a8c61, 0xa1377a0c, 0xf8598e14, 0x13eb893c,
-		0xa9ceee27, 0x61b735c9, 0x1ce1ede5, 0x477a3cb1,
-		0xd29c59df, 0xf2553f73, 0x141879ce, 0xc773bf37,
-		0xf753eacd, 0xfd5f5baa, 0x3ddf146f, 0x447886db,
-		0xafca81f3, 0x68b93ec4, 0x24382c34, 0xa3c25f40,
-		0x1d1672c3, 0xe2bc0c25, 0x3c288b49, 0x0dff4195,
-		0xa8397101, 0x0c08deb3, 0xb4d89ce4, 0x566490c1,
-		0xcb7b6184, 0x32d570b6, 0x6c48745c, 0xb8d04257,
-	}, {
-		0x5150a7f4, 0x7e536541, 0x1ac3a417, 0x3a965e27,
-		0x3bcb6bab, 0x1ff1459d, 0xacab58fa, 0x4b9303e3,
-		0x2055fa30, 0xadf66d76, 0x889176cc, 0xf5254c02,
-		0x4ffcd7e5, 0xc5d7cb2a, 0x26804435, 0xb58fa362,
-		0xde495ab1, 0x25671bba, 0x45980eea, 0x5de1c0fe,
-		0xc302752f, 0x8112f04c, 0x8da39746, 0x6bc6f9d3,
-		0x03e75f8f, 0x15959c92, 0xbfeb7a6d, 0x95da5952,
-		0xd42d83be, 0x58d32174, 0x492969e0, 0x8e44c8c9,
-		0x756a89c2, 0xf478798e, 0x996b3e58, 0x27dd71b9,
-		0xbeb64fe1, 0xf017ad88, 0xc966ac20, 0x7db43ace,
-		0x63184adf, 0xe582311a, 0x97603351, 0x62457f53,
-		0xb1e07764, 0xbb84ae6b, 0xfe1ca081, 0xf9942b08,
-		0x70586848, 0x8f19fd45, 0x94876cde, 0x52b7f87b,
-		0xab23d373, 0x72e2024b, 0xe3578f1f, 0x662aab55,
-		0xb20728eb, 0x2f03c2b5, 0x869a7bc5, 0xd3a50837,
-		0x30f28728, 0x23b2a5bf, 0x02ba6a03, 0xed5c8216,
-		0x8a2b1ccf, 0xa792b479, 0xf3f0f207, 0x4ea1e269,
-		0x65cdf4da, 0x06d5be05, 0xd11f6234, 0xc48afea6,
-		0x349d532e, 0xa2a055f3, 0x0532e18a, 0xa475ebf6,
-		0x0b39ec83, 0x40aaef60, 0x5e069f71, 0xbd51106e,
-		0x3ef98a21, 0x963d06dd, 0xddae053e, 0x4d46bde6,
-		0x91b58d54, 0x71055dc4, 0x046fd406, 0x60ff1550,
-		0x1924fb98, 0xd697e9bd, 0x89cc4340, 0x67779ed9,
-		0xb0bd42e8, 0x07888b89, 0xe7385b19, 0x79dbeec8,
-		0xa1470a7c, 0x7ce90f42, 0xf8c91e84, 0x00000000,
-		0x09838680, 0x3248ed2b, 0x1eac7011, 0x6c4e725a,
-		0xfdfbff0e, 0x0f563885, 0x3d1ed5ae, 0x3627392d,
-		0x0a64d90f, 0x6821a65c, 0x9bd1545b, 0x243a2e36,
-		0x0cb1670a, 0x930fe757, 0xb4d296ee, 0x1b9e919b,
-		0x804fc5c0, 0x61a220dc, 0x5a694b77, 0x1c161a12,
-		0xe20aba93, 0xc0e52aa0, 0x3c43e022, 0x121d171b,
-		0x0e0b0d09, 0xf2adc78b, 0x2db9a8b6, 0x14c8a91e,
-		0x578519f1, 0xaf4c0775, 0xeebbdd99, 0xa3fd607f,
-		0xf79f2601, 0x5cbcf572, 0x44c53b66, 0x5b347efb,
-		0x8b762943, 0xcbdcc623, 0xb668fced, 0xb863f1e4,
-		0xd7cadc31, 0x42108563, 0x13402297, 0x842011c6,
-		0x857d244a, 0xd2f83dbb, 0xae1132f9, 0xc76da129,
-		0x1d4b2f9e, 0xdcf330b2, 0x0dec5286, 0x77d0e3c1,
-		0x2b6c16b3, 0xa999b970, 0x11fa4894, 0x472264e9,
-		0xa8c48cfc, 0xa01a3ff0, 0x56d82c7d, 0x22ef9033,
-		0x87c74e49, 0xd9c1d138, 0x8cfea2ca, 0x98360bd4,
-		0xa6cf81f5, 0xa528de7a, 0xda268eb7, 0x3fa4bfad,
-		0x2ce49d3a, 0x500d9278, 0x6a9bcc5f, 0x5462467e,
-		0xf6c2138d, 0x90e8b8d8, 0x2e5ef739, 0x82f5afc3,
-		0x9fbe805d, 0x697c93d0, 0x6fa92dd5, 0xcfb31225,
-		0xc83b99ac, 0x10a77d18, 0xe86e639c, 0xdb7bbb3b,
-		0xcd097826, 0x6ef41859, 0xec01b79a, 0x83a89a4f,
-		0xe6656e95, 0xaa7ee6ff, 0x2108cfbc, 0xefe6e815,
-		0xbad99be7, 0x4ace366f, 0xead4099f, 0x29d67cb0,
-		0x31afb2a4, 0x2a31233f, 0xc63094a5, 0x35c066a2,
-		0x7437bc4e, 0xfca6ca82, 0xe0b0d090, 0x3315d8a7,
-		0xf14a9804, 0x41f7daec, 0x7f0e50cd, 0x172ff691,
-		0x768dd64d, 0x434db0ef, 0xcc544daa, 0xe4df0496,
-		0x9ee3b5d1, 0x4c1b886a, 0xc1b81f2c, 0x467f5165,
-		0x9d04ea5e, 0x015d358c, 0xfa737487, 0xfb2e410b,
-		0xb35a1d67, 0x9252d2db, 0xe9335610, 0x6d1347d6,
-		0x9a8c61d7, 0x377a0ca1, 0x598e14f8, 0xeb893c13,
-		0xceee27a9, 0xb735c961, 0xe1ede51c, 0x7a3cb147,
-		0x9c59dfd2, 0x553f73f2, 0x1879ce14, 0x73bf37c7,
-		0x53eacdf7, 0x5f5baafd, 0xdf146f3d, 0x7886db44,
-		0xca81f3af, 0xb93ec468, 0x382c3424, 0xc25f40a3,
-		0x1672c31d, 0xbc0c25e2, 0x288b493c, 0xff41950d,
-		0x397101a8, 0x08deb30c, 0xd89ce4b4, 0x6490c156,
-		0x7b6184cb, 0xd570b632, 0x48745c6c, 0xd04257b8,
-	}
-};
-
-static const u32 crypto_il_tab[4][256] ____cacheline_aligned = {
-	{
-		0x00000052, 0x00000009, 0x0000006a, 0x000000d5,
-		0x00000030, 0x00000036, 0x000000a5, 0x00000038,
-		0x000000bf, 0x00000040, 0x000000a3, 0x0000009e,
-		0x00000081, 0x000000f3, 0x000000d7, 0x000000fb,
-		0x0000007c, 0x000000e3, 0x00000039, 0x00000082,
-		0x0000009b, 0x0000002f, 0x000000ff, 0x00000087,
-		0x00000034, 0x0000008e, 0x00000043, 0x00000044,
-		0x000000c4, 0x000000de, 0x000000e9, 0x000000cb,
-		0x00000054, 0x0000007b, 0x00000094, 0x00000032,
-		0x000000a6, 0x000000c2, 0x00000023, 0x0000003d,
-		0x000000ee, 0x0000004c, 0x00000095, 0x0000000b,
-		0x00000042, 0x000000fa, 0x000000c3, 0x0000004e,
-		0x00000008, 0x0000002e, 0x000000a1, 0x00000066,
-		0x00000028, 0x000000d9, 0x00000024, 0x000000b2,
-		0x00000076, 0x0000005b, 0x000000a2, 0x00000049,
-		0x0000006d, 0x0000008b, 0x000000d1, 0x00000025,
-		0x00000072, 0x000000f8, 0x000000f6, 0x00000064,
-		0x00000086, 0x00000068, 0x00000098, 0x00000016,
-		0x000000d4, 0x000000a4, 0x0000005c, 0x000000cc,
-		0x0000005d, 0x00000065, 0x000000b6, 0x00000092,
-		0x0000006c, 0x00000070, 0x00000048, 0x00000050,
-		0x000000fd, 0x000000ed, 0x000000b9, 0x000000da,
-		0x0000005e, 0x00000015, 0x00000046, 0x00000057,
-		0x000000a7, 0x0000008d, 0x0000009d, 0x00000084,
-		0x00000090, 0x000000d8, 0x000000ab, 0x00000000,
-		0x0000008c, 0x000000bc, 0x000000d3, 0x0000000a,
-		0x000000f7, 0x000000e4, 0x00000058, 0x00000005,
-		0x000000b8, 0x000000b3, 0x00000045, 0x00000006,
-		0x000000d0, 0x0000002c, 0x0000001e, 0x0000008f,
-		0x000000ca, 0x0000003f, 0x0000000f, 0x00000002,
-		0x000000c1, 0x000000af, 0x000000bd, 0x00000003,
-		0x00000001, 0x00000013, 0x0000008a, 0x0000006b,
-		0x0000003a, 0x00000091, 0x00000011, 0x00000041,
-		0x0000004f, 0x00000067, 0x000000dc, 0x000000ea,
-		0x00000097, 0x000000f2, 0x000000cf, 0x000000ce,
-		0x000000f0, 0x000000b4, 0x000000e6, 0x00000073,
-		0x00000096, 0x000000ac, 0x00000074, 0x00000022,
-		0x000000e7, 0x000000ad, 0x00000035, 0x00000085,
-		0x000000e2, 0x000000f9, 0x00000037, 0x000000e8,
-		0x0000001c, 0x00000075, 0x000000df, 0x0000006e,
-		0x00000047, 0x000000f1, 0x0000001a, 0x00000071,
-		0x0000001d, 0x00000029, 0x000000c5, 0x00000089,
-		0x0000006f, 0x000000b7, 0x00000062, 0x0000000e,
-		0x000000aa, 0x00000018, 0x000000be, 0x0000001b,
-		0x000000fc, 0x00000056, 0x0000003e, 0x0000004b,
-		0x000000c6, 0x000000d2, 0x00000079, 0x00000020,
-		0x0000009a, 0x000000db, 0x000000c0, 0x000000fe,
-		0x00000078, 0x000000cd, 0x0000005a, 0x000000f4,
-		0x0000001f, 0x000000dd, 0x000000a8, 0x00000033,
-		0x00000088, 0x00000007, 0x000000c7, 0x00000031,
-		0x000000b1, 0x00000012, 0x00000010, 0x00000059,
-		0x00000027, 0x00000080, 0x000000ec, 0x0000005f,
-		0x00000060, 0x00000051, 0x0000007f, 0x000000a9,
-		0x00000019, 0x000000b5, 0x0000004a, 0x0000000d,
-		0x0000002d, 0x000000e5, 0x0000007a, 0x0000009f,
-		0x00000093, 0x000000c9, 0x0000009c, 0x000000ef,
-		0x000000a0, 0x000000e0, 0x0000003b, 0x0000004d,
-		0x000000ae, 0x0000002a, 0x000000f5, 0x000000b0,
-		0x000000c8, 0x000000eb, 0x000000bb, 0x0000003c,
-		0x00000083, 0x00000053, 0x00000099, 0x00000061,
-		0x00000017, 0x0000002b, 0x00000004, 0x0000007e,
-		0x000000ba, 0x00000077, 0x000000d6, 0x00000026,
-		0x000000e1, 0x00000069, 0x00000014, 0x00000063,
-		0x00000055, 0x00000021, 0x0000000c, 0x0000007d,
-	}, {
-		0x00005200, 0x00000900, 0x00006a00, 0x0000d500,
-		0x00003000, 0x00003600, 0x0000a500, 0x00003800,
-		0x0000bf00, 0x00004000, 0x0000a300, 0x00009e00,
-		0x00008100, 0x0000f300, 0x0000d700, 0x0000fb00,
-		0x00007c00, 0x0000e300, 0x00003900, 0x00008200,
-		0x00009b00, 0x00002f00, 0x0000ff00, 0x00008700,
-		0x00003400, 0x00008e00, 0x00004300, 0x00004400,
-		0x0000c400, 0x0000de00, 0x0000e900, 0x0000cb00,
-		0x00005400, 0x00007b00, 0x00009400, 0x00003200,
-		0x0000a600, 0x0000c200, 0x00002300, 0x00003d00,
-		0x0000ee00, 0x00004c00, 0x00009500, 0x00000b00,
-		0x00004200, 0x0000fa00, 0x0000c300, 0x00004e00,
-		0x00000800, 0x00002e00, 0x0000a100, 0x00006600,
-		0x00002800, 0x0000d900, 0x00002400, 0x0000b200,
-		0x00007600, 0x00005b00, 0x0000a200, 0x00004900,
-		0x00006d00, 0x00008b00, 0x0000d100, 0x00002500,
-		0x00007200, 0x0000f800, 0x0000f600, 0x00006400,
-		0x00008600, 0x00006800, 0x00009800, 0x00001600,
-		0x0000d400, 0x0000a400, 0x00005c00, 0x0000cc00,
-		0x00005d00, 0x00006500, 0x0000b600, 0x00009200,
-		0x00006c00, 0x00007000, 0x00004800, 0x00005000,
-		0x0000fd00, 0x0000ed00, 0x0000b900, 0x0000da00,
-		0x00005e00, 0x00001500, 0x00004600, 0x00005700,
-		0x0000a700, 0x00008d00, 0x00009d00, 0x00008400,
-		0x00009000, 0x0000d800, 0x0000ab00, 0x00000000,
-		0x00008c00, 0x0000bc00, 0x0000d300, 0x00000a00,
-		0x0000f700, 0x0000e400, 0x00005800, 0x00000500,
-		0x0000b800, 0x0000b300, 0x00004500, 0x00000600,
-		0x0000d000, 0x00002c00, 0x00001e00, 0x00008f00,
-		0x0000ca00, 0x00003f00, 0x00000f00, 0x00000200,
-		0x0000c100, 0x0000af00, 0x0000bd00, 0x00000300,
-		0x00000100, 0x00001300, 0x00008a00, 0x00006b00,
-		0x00003a00, 0x00009100, 0x00001100, 0x00004100,
-		0x00004f00, 0x00006700, 0x0000dc00, 0x0000ea00,
-		0x00009700, 0x0000f200, 0x0000cf00, 0x0000ce00,
-		0x0000f000, 0x0000b400, 0x0000e600, 0x00007300,
-		0x00009600, 0x0000ac00, 0x00007400, 0x00002200,
-		0x0000e700, 0x0000ad00, 0x00003500, 0x00008500,
-		0x0000e200, 0x0000f900, 0x00003700, 0x0000e800,
-		0x00001c00, 0x00007500, 0x0000df00, 0x00006e00,
-		0x00004700, 0x0000f100, 0x00001a00, 0x00007100,
-		0x00001d00, 0x00002900, 0x0000c500, 0x00008900,
-		0x00006f00, 0x0000b700, 0x00006200, 0x00000e00,
-		0x0000aa00, 0x00001800, 0x0000be00, 0x00001b00,
-		0x0000fc00, 0x00005600, 0x00003e00, 0x00004b00,
-		0x0000c600, 0x0000d200, 0x00007900, 0x00002000,
-		0x00009a00, 0x0000db00, 0x0000c000, 0x0000fe00,
-		0x00007800, 0x0000cd00, 0x00005a00, 0x0000f400,
-		0x00001f00, 0x0000dd00, 0x0000a800, 0x00003300,
-		0x00008800, 0x00000700, 0x0000c700, 0x00003100,
-		0x0000b100, 0x00001200, 0x00001000, 0x00005900,
-		0x00002700, 0x00008000, 0x0000ec00, 0x00005f00,
-		0x00006000, 0x00005100, 0x00007f00, 0x0000a900,
-		0x00001900, 0x0000b500, 0x00004a00, 0x00000d00,
-		0x00002d00, 0x0000e500, 0x00007a00, 0x00009f00,
-		0x00009300, 0x0000c900, 0x00009c00, 0x0000ef00,
-		0x0000a000, 0x0000e000, 0x00003b00, 0x00004d00,
-		0x0000ae00, 0x00002a00, 0x0000f500, 0x0000b000,
-		0x0000c800, 0x0000eb00, 0x0000bb00, 0x00003c00,
-		0x00008300, 0x00005300, 0x00009900, 0x00006100,
-		0x00001700, 0x00002b00, 0x00000400, 0x00007e00,
-		0x0000ba00, 0x00007700, 0x0000d600, 0x00002600,
-		0x0000e100, 0x00006900, 0x00001400, 0x00006300,
-		0x00005500, 0x00002100, 0x00000c00, 0x00007d00,
-	}, {
-		0x00520000, 0x00090000, 0x006a0000, 0x00d50000,
-		0x00300000, 0x00360000, 0x00a50000, 0x00380000,
-		0x00bf0000, 0x00400000, 0x00a30000, 0x009e0000,
-		0x00810000, 0x00f30000, 0x00d70000, 0x00fb0000,
-		0x007c0000, 0x00e30000, 0x00390000, 0x00820000,
-		0x009b0000, 0x002f0000, 0x00ff0000, 0x00870000,
-		0x00340000, 0x008e0000, 0x00430000, 0x00440000,
-		0x00c40000, 0x00de0000, 0x00e90000, 0x00cb0000,
-		0x00540000, 0x007b0000, 0x00940000, 0x00320000,
-		0x00a60000, 0x00c20000, 0x00230000, 0x003d0000,
-		0x00ee0000, 0x004c0000, 0x00950000, 0x000b0000,
-		0x00420000, 0x00fa0000, 0x00c30000, 0x004e0000,
-		0x00080000, 0x002e0000, 0x00a10000, 0x00660000,
-		0x00280000, 0x00d90000, 0x00240000, 0x00b20000,
-		0x00760000, 0x005b0000, 0x00a20000, 0x00490000,
-		0x006d0000, 0x008b0000, 0x00d10000, 0x00250000,
-		0x00720000, 0x00f80000, 0x00f60000, 0x00640000,
-		0x00860000, 0x00680000, 0x00980000, 0x00160000,
-		0x00d40000, 0x00a40000, 0x005c0000, 0x00cc0000,
-		0x005d0000, 0x00650000, 0x00b60000, 0x00920000,
-		0x006c0000, 0x00700000, 0x00480000, 0x00500000,
-		0x00fd0000, 0x00ed0000, 0x00b90000, 0x00da0000,
-		0x005e0000, 0x00150000, 0x00460000, 0x00570000,
-		0x00a70000, 0x008d0000, 0x009d0000, 0x00840000,
-		0x00900000, 0x00d80000, 0x00ab0000, 0x00000000,
-		0x008c0000, 0x00bc0000, 0x00d30000, 0x000a0000,
-		0x00f70000, 0x00e40000, 0x00580000, 0x00050000,
-		0x00b80000, 0x00b30000, 0x00450000, 0x00060000,
-		0x00d00000, 0x002c0000, 0x001e0000, 0x008f0000,
-		0x00ca0000, 0x003f0000, 0x000f0000, 0x00020000,
-		0x00c10000, 0x00af0000, 0x00bd0000, 0x00030000,
-		0x00010000, 0x00130000, 0x008a0000, 0x006b0000,
-		0x003a0000, 0x00910000, 0x00110000, 0x00410000,
-		0x004f0000, 0x00670000, 0x00dc0000, 0x00ea0000,
-		0x00970000, 0x00f20000, 0x00cf0000, 0x00ce0000,
-		0x00f00000, 0x00b40000, 0x00e60000, 0x00730000,
-		0x00960000, 0x00ac0000, 0x00740000, 0x00220000,
-		0x00e70000, 0x00ad0000, 0x00350000, 0x00850000,
-		0x00e20000, 0x00f90000, 0x00370000, 0x00e80000,
-		0x001c0000, 0x00750000, 0x00df0000, 0x006e0000,
-		0x00470000, 0x00f10000, 0x001a0000, 0x00710000,
-		0x001d0000, 0x00290000, 0x00c50000, 0x00890000,
-		0x006f0000, 0x00b70000, 0x00620000, 0x000e0000,
-		0x00aa0000, 0x00180000, 0x00be0000, 0x001b0000,
-		0x00fc0000, 0x00560000, 0x003e0000, 0x004b0000,
-		0x00c60000, 0x00d20000, 0x00790000, 0x00200000,
-		0x009a0000, 0x00db0000, 0x00c00000, 0x00fe0000,
-		0x00780000, 0x00cd0000, 0x005a0000, 0x00f40000,
-		0x001f0000, 0x00dd0000, 0x00a80000, 0x00330000,
-		0x00880000, 0x00070000, 0x00c70000, 0x00310000,
-		0x00b10000, 0x00120000, 0x00100000, 0x00590000,
-		0x00270000, 0x00800000, 0x00ec0000, 0x005f0000,
-		0x00600000, 0x00510000, 0x007f0000, 0x00a90000,
-		0x00190000, 0x00b50000, 0x004a0000, 0x000d0000,
-		0x002d0000, 0x00e50000, 0x007a0000, 0x009f0000,
-		0x00930000, 0x00c90000, 0x009c0000, 0x00ef0000,
-		0x00a00000, 0x00e00000, 0x003b0000, 0x004d0000,
-		0x00ae0000, 0x002a0000, 0x00f50000, 0x00b00000,
-		0x00c80000, 0x00eb0000, 0x00bb0000, 0x003c0000,
-		0x00830000, 0x00530000, 0x00990000, 0x00610000,
-		0x00170000, 0x002b0000, 0x00040000, 0x007e0000,
-		0x00ba0000, 0x00770000, 0x00d60000, 0x00260000,
-		0x00e10000, 0x00690000, 0x00140000, 0x00630000,
-		0x00550000, 0x00210000, 0x000c0000, 0x007d0000,
-	}, {
-		0x52000000, 0x09000000, 0x6a000000, 0xd5000000,
-		0x30000000, 0x36000000, 0xa5000000, 0x38000000,
-		0xbf000000, 0x40000000, 0xa3000000, 0x9e000000,
-		0x81000000, 0xf3000000, 0xd7000000, 0xfb000000,
-		0x7c000000, 0xe3000000, 0x39000000, 0x82000000,
-		0x9b000000, 0x2f000000, 0xff000000, 0x87000000,
-		0x34000000, 0x8e000000, 0x43000000, 0x44000000,
-		0xc4000000, 0xde000000, 0xe9000000, 0xcb000000,
-		0x54000000, 0x7b000000, 0x94000000, 0x32000000,
-		0xa6000000, 0xc2000000, 0x23000000, 0x3d000000,
-		0xee000000, 0x4c000000, 0x95000000, 0x0b000000,
-		0x42000000, 0xfa000000, 0xc3000000, 0x4e000000,
-		0x08000000, 0x2e000000, 0xa1000000, 0x66000000,
-		0x28000000, 0xd9000000, 0x24000000, 0xb2000000,
-		0x76000000, 0x5b000000, 0xa2000000, 0x49000000,
-		0x6d000000, 0x8b000000, 0xd1000000, 0x25000000,
-		0x72000000, 0xf8000000, 0xf6000000, 0x64000000,
-		0x86000000, 0x68000000, 0x98000000, 0x16000000,
-		0xd4000000, 0xa4000000, 0x5c000000, 0xcc000000,
-		0x5d000000, 0x65000000, 0xb6000000, 0x92000000,
-		0x6c000000, 0x70000000, 0x48000000, 0x50000000,
-		0xfd000000, 0xed000000, 0xb9000000, 0xda000000,
-		0x5e000000, 0x15000000, 0x46000000, 0x57000000,
-		0xa7000000, 0x8d000000, 0x9d000000, 0x84000000,
-		0x90000000, 0xd8000000, 0xab000000, 0x00000000,
-		0x8c000000, 0xbc000000, 0xd3000000, 0x0a000000,
-		0xf7000000, 0xe4000000, 0x58000000, 0x05000000,
-		0xb8000000, 0xb3000000, 0x45000000, 0x06000000,
-		0xd0000000, 0x2c000000, 0x1e000000, 0x8f000000,
-		0xca000000, 0x3f000000, 0x0f000000, 0x02000000,
-		0xc1000000, 0xaf000000, 0xbd000000, 0x03000000,
-		0x01000000, 0x13000000, 0x8a000000, 0x6b000000,
-		0x3a000000, 0x91000000, 0x11000000, 0x41000000,
-		0x4f000000, 0x67000000, 0xdc000000, 0xea000000,
-		0x97000000, 0xf2000000, 0xcf000000, 0xce000000,
-		0xf0000000, 0xb4000000, 0xe6000000, 0x73000000,
-		0x96000000, 0xac000000, 0x74000000, 0x22000000,
-		0xe7000000, 0xad000000, 0x35000000, 0x85000000,
-		0xe2000000, 0xf9000000, 0x37000000, 0xe8000000,
-		0x1c000000, 0x75000000, 0xdf000000, 0x6e000000,
-		0x47000000, 0xf1000000, 0x1a000000, 0x71000000,
-		0x1d000000, 0x29000000, 0xc5000000, 0x89000000,
-		0x6f000000, 0xb7000000, 0x62000000, 0x0e000000,
-		0xaa000000, 0x18000000, 0xbe000000, 0x1b000000,
-		0xfc000000, 0x56000000, 0x3e000000, 0x4b000000,
-		0xc6000000, 0xd2000000, 0x79000000, 0x20000000,
-		0x9a000000, 0xdb000000, 0xc0000000, 0xfe000000,
-		0x78000000, 0xcd000000, 0x5a000000, 0xf4000000,
-		0x1f000000, 0xdd000000, 0xa8000000, 0x33000000,
-		0x88000000, 0x07000000, 0xc7000000, 0x31000000,
-		0xb1000000, 0x12000000, 0x10000000, 0x59000000,
-		0x27000000, 0x80000000, 0xec000000, 0x5f000000,
-		0x60000000, 0x51000000, 0x7f000000, 0xa9000000,
-		0x19000000, 0xb5000000, 0x4a000000, 0x0d000000,
-		0x2d000000, 0xe5000000, 0x7a000000, 0x9f000000,
-		0x93000000, 0xc9000000, 0x9c000000, 0xef000000,
-		0xa0000000, 0xe0000000, 0x3b000000, 0x4d000000,
-		0xae000000, 0x2a000000, 0xf5000000, 0xb0000000,
-		0xc8000000, 0xeb000000, 0xbb000000, 0x3c000000,
-		0x83000000, 0x53000000, 0x99000000, 0x61000000,
-		0x17000000, 0x2b000000, 0x04000000, 0x7e000000,
-		0xba000000, 0x77000000, 0xd6000000, 0x26000000,
-		0xe1000000, 0x69000000, 0x14000000, 0x63000000,
-		0x55000000, 0x21000000, 0x0c000000, 0x7d000000,
-	}
-};
-
-EXPORT_SYMBOL_GPL(crypto_ft_tab);
-EXPORT_SYMBOL_GPL(crypto_it_tab);
-
-/**
- * crypto_aes_set_key - Set the AES key.
- * @tfm:	The %crypto_tfm that is used in the context.
- * @in_key:	The input key.
- * @key_len:	The size of the key.
- *
- * This function uses aes_expand_key() to expand the key.  &crypto_aes_ctx
- * _must_ be the private data embedded in @tfm which is retrieved with
- * crypto_tfm_ctx().
- *
- * Return: 0 on success; -EINVAL on failure (only happens for bad key lengths)
- */
-int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
-		unsigned int key_len)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	return aes_expandkey(ctx, in_key, key_len);
-}
-EXPORT_SYMBOL_GPL(crypto_aes_set_key);
-
-/* encrypt a block of text */
-
-#define f_rn(bo, bi, n, k)	do {				\
-	bo[n] = crypto_ft_tab[0][byte(bi[n], 0)] ^			\
-		crypto_ft_tab[1][byte(bi[(n + 1) & 3], 1)] ^		\
-		crypto_ft_tab[2][byte(bi[(n + 2) & 3], 2)] ^		\
-		crypto_ft_tab[3][byte(bi[(n + 3) & 3], 3)] ^ *(k + n);	\
-} while (0)
-
-#define f_nround(bo, bi, k)	do {\
-	f_rn(bo, bi, 0, k);	\
-	f_rn(bo, bi, 1, k);	\
-	f_rn(bo, bi, 2, k);	\
-	f_rn(bo, bi, 3, k);	\
-	k += 4;			\
-} while (0)
-
-#define f_rl(bo, bi, n, k)	do {				\
-	bo[n] = crypto_fl_tab[0][byte(bi[n], 0)] ^			\
-		crypto_fl_tab[1][byte(bi[(n + 1) & 3], 1)] ^		\
-		crypto_fl_tab[2][byte(bi[(n + 2) & 3], 2)] ^		\
-		crypto_fl_tab[3][byte(bi[(n + 3) & 3], 3)] ^ *(k + n);	\
-} while (0)
-
-#define f_lround(bo, bi, k)	do {\
-	f_rl(bo, bi, 0, k);	\
-	f_rl(bo, bi, 1, k);	\
-	f_rl(bo, bi, 2, k);	\
-	f_rl(bo, bi, 3, k);	\
-} while (0)
-
-static void crypto_aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-	u32 b0[4], b1[4];
-	const u32 *kp = ctx->key_enc + 4;
-	const int key_len = ctx->key_length;
-
-	b0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
-	b0[1] = ctx->key_enc[1] ^ get_unaligned_le32(in + 4);
-	b0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
-	b0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);
-
-	if (key_len > 24) {
-		f_nround(b1, b0, kp);
-		f_nround(b0, b1, kp);
-	}
-
-	if (key_len > 16) {
-		f_nround(b1, b0, kp);
-		f_nround(b0, b1, kp);
-	}
-
-	f_nround(b1, b0, kp);
-	f_nround(b0, b1, kp);
-	f_nround(b1, b0, kp);
-	f_nround(b0, b1, kp);
-	f_nround(b1, b0, kp);
-	f_nround(b0, b1, kp);
-	f_nround(b1, b0, kp);
-	f_nround(b0, b1, kp);
-	f_nround(b1, b0, kp);
-	f_lround(b0, b1, kp);
-
-	put_unaligned_le32(b0[0], out);
-	put_unaligned_le32(b0[1], out + 4);
-	put_unaligned_le32(b0[2], out + 8);
-	put_unaligned_le32(b0[3], out + 12);
-}
-
-/* decrypt a block of text */
-
-#define i_rn(bo, bi, n, k)	do {				\
-	bo[n] = crypto_it_tab[0][byte(bi[n], 0)] ^			\
-		crypto_it_tab[1][byte(bi[(n + 3) & 3], 1)] ^		\
-		crypto_it_tab[2][byte(bi[(n + 2) & 3], 2)] ^		\
-		crypto_it_tab[3][byte(bi[(n + 1) & 3], 3)] ^ *(k + n);	\
-} while (0)
-
-#define i_nround(bo, bi, k)	do {\
-	i_rn(bo, bi, 0, k);	\
-	i_rn(bo, bi, 1, k);	\
-	i_rn(bo, bi, 2, k);	\
-	i_rn(bo, bi, 3, k);	\
-	k += 4;			\
-} while (0)
-
-#define i_rl(bo, bi, n, k)	do {			\
-	bo[n] = crypto_il_tab[0][byte(bi[n], 0)] ^		\
-	crypto_il_tab[1][byte(bi[(n + 3) & 3], 1)] ^		\
-	crypto_il_tab[2][byte(bi[(n + 2) & 3], 2)] ^		\
-	crypto_il_tab[3][byte(bi[(n + 1) & 3], 3)] ^ *(k + n);	\
-} while (0)
-
-#define i_lround(bo, bi, k)	do {\
-	i_rl(bo, bi, 0, k);	\
-	i_rl(bo, bi, 1, k);	\
-	i_rl(bo, bi, 2, k);	\
-	i_rl(bo, bi, 3, k);	\
-} while (0)
-
-static void crypto_aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-	u32 b0[4], b1[4];
-	const int key_len = ctx->key_length;
-	const u32 *kp = ctx->key_dec + 4;
-
-	b0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
-	b0[1] = ctx->key_dec[1] ^ get_unaligned_le32(in + 4);
-	b0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
-	b0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);
-
-	if (key_len > 24) {
-		i_nround(b1, b0, kp);
-		i_nround(b0, b1, kp);
-	}
-
-	if (key_len > 16) {
-		i_nround(b1, b0, kp);
-		i_nround(b0, b1, kp);
-	}
-
-	i_nround(b1, b0, kp);
-	i_nround(b0, b1, kp);
-	i_nround(b1, b0, kp);
-	i_nround(b0, b1, kp);
-	i_nround(b1, b0, kp);
-	i_nround(b0, b1, kp);
-	i_nround(b1, b0, kp);
-	i_nround(b0, b1, kp);
-	i_nround(b1, b0, kp);
-	i_lround(b0, b1, kp);
-
-	put_unaligned_le32(b0[0], out);
-	put_unaligned_le32(b0[1], out + 4);
-	put_unaligned_le32(b0[2], out + 8);
-	put_unaligned_le32(b0[3], out + 12);
-}
-
-static struct crypto_alg aes_alg = {
-	.cra_name		=	"aes",
-	.cra_driver_name	=	"aes-generic",
-	.cra_priority		=	100,
-	.cra_flags		=	CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize		=	AES_BLOCK_SIZE,
-	.cra_ctxsize		=	sizeof(struct crypto_aes_ctx),
-	.cra_module		=	THIS_MODULE,
-	.cra_u			=	{
-		.cipher = {
-			.cia_min_keysize	=	AES_MIN_KEY_SIZE,
-			.cia_max_keysize	=	AES_MAX_KEY_SIZE,
-			.cia_setkey		=	crypto_aes_set_key,
-			.cia_encrypt		=	crypto_aes_encrypt,
-			.cia_decrypt		=	crypto_aes_decrypt
-		}
-	}
-};
-
-static int __init aes_init(void)
-{
-	return crypto_register_alg(&aes_alg);
-}
-
-static void __exit aes_fini(void)
-{
-	crypto_unregister_alg(&aes_alg);
-}
-
-module_init(aes_init);
-module_exit(aes_fini);
-
-MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm");
-MODULE_LICENSE("Dual BSD/GPL");
-MODULE_ALIAS_CRYPTO("aes");
-MODULE_ALIAS_CRYPTO("aes-generic");
diff --git a/crypto/crypto_user.c b/crypto/crypto_user.c
index aad429bef03e..3187e0d276f9 100644
--- a/crypto/crypto_user.c
+++ b/crypto/crypto_user.c
@@ -291,11 +291,11 @@ static int crypto_del_alg(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	alg = crypto_alg_match(p, 1);
 	if (!alg)
 		return -ENOENT;
 
-	/* We can not unregister core algorithms such as aes-generic.
+	/* We can not unregister core algorithms such as aes.
 	 * We would lose the reference in the crypto_alg_list to this algorithm
 	 * if we try to unregister. Unregistering such an algorithm without
 	 * removing the module is not possible, so we restrict to crypto
 	 * instances that are built from templates. */
 	err = -EINVAL;
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 5df204d9c9dd..cbc049d697a1 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -4059,18 +4059,18 @@ static int alg_test_null(const struct alg_test_desc *desc,
 
 /* Please keep this list sorted by algorithm name. */
 static const struct alg_test_desc alg_test_descs[] = {
 	{
 		.alg = "adiantum(xchacha12,aes)",
-		.generic_driver = "adiantum(xchacha12-lib,aes-generic)",
+		.generic_driver = "adiantum(xchacha12-lib,aes-lib)",
 		.test = alg_test_skcipher,
 		.suite = {
 			.cipher = __VECS(adiantum_xchacha12_aes_tv_template)
 		},
 	}, {
 		.alg = "adiantum(xchacha20,aes)",
-		.generic_driver = "adiantum(xchacha20-lib,aes-generic)",
+		.generic_driver = "adiantum(xchacha20-lib,aes-lib)",
 		.test = alg_test_skcipher,
 		.suite = {
 			.cipher = __VECS(adiantum_xchacha20_aes_tv_template)
 		},
 	}, {
@@ -4086,11 +4086,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.aead = __VECS(hmac_md5_ecb_cipher_null_tv_template)
 		}
 	}, {
 		.alg = "authenc(hmac(sha1),cbc(aes))",
-		.generic_driver = "authenc(hmac-sha1-lib,cbc(aes-generic))",
+		.generic_driver = "authenc(hmac-sha1-lib,cbc(aes-lib))",
 		.test = alg_test_aead,
 		.fips_allowed = 1,
 		.suite = {
 			.aead = __VECS(hmac_sha1_aes_cbc_tv_temp)
 		}
@@ -4137,11 +4137,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.aead = __VECS(hmac_sha224_des3_ede_cbc_tv_temp)
 		}
 	}, {
 		.alg = "authenc(hmac(sha256),cbc(aes))",
-		.generic_driver = "authenc(hmac-sha256-lib,cbc(aes-generic))",
+		.generic_driver = "authenc(hmac-sha256-lib,cbc(aes-lib))",
 		.test = alg_test_aead,
 		.fips_allowed = 1,
 		.suite = {
 			.aead = __VECS(hmac_sha256_aes_cbc_tv_temp)
 		}
@@ -4163,11 +4163,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.alg = "authenc(hmac(sha256),ctr(aes))",
 		.test = alg_test_null,
 		.fips_allowed = 1,
 	}, {
 		.alg = "authenc(hmac(sha256),cts(cbc(aes)))",
-		.generic_driver = "authenc(hmac-sha256-lib,cts(cbc(aes-generic)))",
+		.generic_driver = "authenc(hmac-sha256-lib,cts(cbc(aes-lib)))",
 		.test = alg_test_aead,
 		.suite = {
 			.aead = __VECS(krb5_test_aes128_cts_hmac_sha256_128)
 		}
 	}, {
@@ -4192,22 +4192,22 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.alg = "authenc(hmac(sha384),ctr(aes))",
 		.test = alg_test_null,
 		.fips_allowed = 1,
 	}, {
 		.alg = "authenc(hmac(sha384),cts(cbc(aes)))",
-		.generic_driver = "authenc(hmac-sha384-lib,cts(cbc(aes-generic)))",
+		.generic_driver = "authenc(hmac-sha384-lib,cts(cbc(aes-lib)))",
 		.test = alg_test_aead,
 		.suite = {
 			.aead = __VECS(krb5_test_aes256_cts_hmac_sha384_192)
 		}
 	}, {
 		.alg = "authenc(hmac(sha384),rfc3686(ctr(aes)))",
 		.test = alg_test_null,
 		.fips_allowed = 1,
 	}, {
 		.alg = "authenc(hmac(sha512),cbc(aes))",
-		.generic_driver = "authenc(hmac-sha512-lib,cbc(aes-generic))",
+		.generic_driver = "authenc(hmac-sha512-lib,cbc(aes-lib))",
 		.fips_allowed = 1,
 		.test = alg_test_aead,
 		.suite = {
 			.aead = __VECS(hmac_sha512_aes_cbc_tv_temp)
 		}
@@ -4265,10 +4265,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(blake2b_512_tv_template)
 		}
 	}, {
 		.alg = "cbc(aes)",
+		.generic_driver = "cbc(aes-lib)",
 		.test = alg_test_skcipher,
 		.fips_allowed = 1,
 		.suite = {
 			.cipher = __VECS(aes_cbc_tv_template)
 		},
@@ -4360,10 +4361,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 			.cipher = __VECS(aes_cbc_tv_template)
 		}
 	}, {
 #endif
 		.alg = "cbcmac(aes)",
+		.generic_driver = "cbcmac(aes-lib)",
 		.test = alg_test_hash,
 		.suite = {
 			.hash = __VECS(aes_cbcmac_tv_template)
 		}
 	}, {
@@ -4372,11 +4374,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(sm4_cbcmac_tv_template)
 		}
 	}, {
 		.alg = "ccm(aes)",
-		.generic_driver = "ccm_base(ctr(aes-generic),cbcmac(aes-generic))",
+		.generic_driver = "ccm_base(ctr(aes-lib),cbcmac(aes-lib))",
 		.test = alg_test_aead,
 		.fips_allowed = 1,
 		.suite = {
 			.aead = {
 				____VECS(aes_ccm_tv_template),
@@ -4400,10 +4402,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.cipher = __VECS(chacha20_tv_template)
 		},
 	}, {
 		.alg = "cmac(aes)",
+		.generic_driver = "cmac(aes-lib)",
 		.fips_allowed = 1,
 		.test = alg_test_hash,
 		.suite = {
 			.hash = __VECS(aes_cmac128_tv_template)
 		}
@@ -4441,10 +4444,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(crc32c_tv_template)
 		}
 	}, {
 		.alg = "ctr(aes)",
+		.generic_driver = "ctr(aes-lib)",
 		.test = alg_test_skcipher,
 		.fips_allowed = 1,
 		.suite = {
 			.cipher = __VECS(aes_ctr_tv_template)
 		}
@@ -4531,10 +4535,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 			.cipher = __VECS(aes_ctr_tv_template)
 		}
 	}, {
 #endif
 		.alg = "cts(cbc(aes))",
+		.generic_driver = "cts(cbc(aes-lib))",
 		.test = alg_test_skcipher,
 		.fips_allowed = 1,
 		.suite = {
 			.cipher = __VECS(cts_mode_tv_template)
 		}
@@ -4687,10 +4692,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.alg = "drbg_pr_sha512",
 		.fips_allowed = 1,
 		.test = alg_test_null,
 	}, {
 		.alg = "ecb(aes)",
+		.generic_driver = "ecb(aes-lib)",
 		.test = alg_test_skcipher,
 		.fips_allowed = 1,
 		.suite = {
 			.cipher = __VECS(aes_tv_template)
 		}
@@ -4879,19 +4885,19 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.sig = __VECS(ecrdsa_tv_template)
 		}
 	}, {
 		.alg = "essiv(authenc(hmac(sha256),cbc(aes)),sha256)",
-		.generic_driver = "essiv(authenc(hmac-sha256-lib,cbc(aes-generic)),sha256-lib)",
+		.generic_driver = "essiv(authenc(hmac-sha256-lib,cbc(aes-lib)),sha256-lib)",
 		.test = alg_test_aead,
 		.fips_allowed = 1,
 		.suite = {
 			.aead = __VECS(essiv_hmac_sha256_aes_cbc_tv_temp)
 		}
 	}, {
 		.alg = "essiv(cbc(aes),sha256)",
-		.generic_driver = "essiv(cbc(aes-generic),sha256-lib)",
+		.generic_driver = "essiv(cbc(aes-lib),sha256-lib)",
 		.test = alg_test_skcipher,
 		.fips_allowed = 1,
 		.suite = {
 			.cipher = __VECS(essiv_aes_cbc_tv_template)
 		}
@@ -4932,11 +4938,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 			.kpp = __VECS(ffdhe8192_dh_tv_template)
 		}
 	}, {
 #endif /* CONFIG_CRYPTO_DH_RFC7919_GROUPS */
 		.alg = "gcm(aes)",
-		.generic_driver = "gcm_base(ctr(aes-generic),ghash-generic)",
+		.generic_driver = "gcm_base(ctr(aes-lib),ghash-generic)",
 		.test = alg_test_aead,
 		.fips_allowed = 1,
 		.suite = {
 			.aead = __VECS(aes_gcm_tv_template)
 		}
@@ -4960,11 +4966,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(ghash_tv_template)
 		}
 	}, {
 		.alg = "hctr2(aes)",
-		.generic_driver = "hctr2_base(xctr(aes-generic),polyval-lib)",
+		.generic_driver = "hctr2_base(xctr(aes-lib),polyval-lib)",
 		.test = alg_test_skcipher,
 		.suite = {
 			.cipher = __VECS(aes_hctr2_tv_template)
 		}
 	}, {
@@ -5078,11 +5084,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.alg = "krb5enc(cmac(camellia),cts(cbc(camellia)))",
 		.test = alg_test_aead,
 		.suite.aead = __VECS(krb5_test_camellia_cts_cmac)
 	}, {
 		.alg = "lrw(aes)",
-		.generic_driver = "lrw(ecb(aes-generic))",
+		.generic_driver = "lrw(ecb(aes-lib))",
 		.test = alg_test_skcipher,
 		.suite = {
 			.cipher = __VECS(aes_lrw_tv_template)
 		}
 	}, {
@@ -5267,10 +5273,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.alg = "pkcs1pad(rsa)",
 		.test = alg_test_null,
 		.fips_allowed = 1,
 	}, {
 		.alg = "rfc3686(ctr(aes))",
+		.generic_driver = "rfc3686(ctr(aes-lib))",
 		.test = alg_test_skcipher,
 		.fips_allowed = 1,
 		.suite = {
 			.cipher = __VECS(aes_ctr_rfc3686_tv_template)
 		}
@@ -5280,11 +5287,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.cipher = __VECS(sm4_ctr_rfc3686_tv_template)
 		}
 	}, {
 		.alg = "rfc4106(gcm(aes))",
-		.generic_driver = "rfc4106(gcm_base(ctr(aes-generic),ghash-generic))",
+		.generic_driver = "rfc4106(gcm_base(ctr(aes-lib),ghash-generic))",
 		.test = alg_test_aead,
 		.fips_allowed = 1,
 		.suite = {
 			.aead = {
 				____VECS(aes_gcm_rfc4106_tv_template),
@@ -5292,11 +5299,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 				.aad_iv = 1,
 			}
 		}
 	}, {
 		.alg = "rfc4309(ccm(aes))",
-		.generic_driver = "rfc4309(ccm_base(ctr(aes-generic),cbcmac(aes-generic)))",
+		.generic_driver = "rfc4309(ccm_base(ctr(aes-lib),cbcmac(aes-lib)))",
 		.test = alg_test_aead,
 		.fips_allowed = 1,
 		.suite = {
 			.aead = {
 				____VECS(aes_ccm_rfc4309_tv_template),
@@ -5304,11 +5311,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 				.aad_iv = 1,
 			}
 		}
 	}, {
 		.alg = "rfc4543(gcm(aes))",
-		.generic_driver = "rfc4543(gcm_base(ctr(aes-generic),ghash-generic))",
+		.generic_driver = "rfc4543(gcm_base(ctr(aes-lib),ghash-generic))",
 		.test = alg_test_aead,
 		.suite = {
 			.aead = {
 				____VECS(aes_gcm_rfc4543_tv_template),
 				.einval_allowed = 1,
@@ -5481,10 +5488,11 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.sig = __VECS(x962_ecdsa_nist_p521_tv_template)
 		}
 	}, {
 		.alg = "xcbc(aes)",
+		.generic_driver = "xcbc(aes-lib)",
 		.test = alg_test_hash,
 		.suite = {
 			.hash = __VECS(aes_xcbc128_tv_template)
 		}
 	}, {
@@ -5507,17 +5515,18 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.cipher = __VECS(xchacha20_tv_template)
 		},
 	}, {
 		.alg = "xctr(aes)",
+		.generic_driver = "xctr(aes-lib)",
 		.test = alg_test_skcipher,
 		.suite = {
 			.cipher = __VECS(aes_xctr_tv_template)
 		}
 	}, {
 		.alg = "xts(aes)",
-		.generic_driver = "xts(ecb(aes-generic))",
+		.generic_driver = "xts(ecb(aes-lib))",
 		.test = alg_test_skcipher,
 		.fips_allowed = 1,
 		.suite = {
 			.cipher = __VECS(aes_xts_tv_template)
 		}
diff --git a/drivers/crypto/starfive/jh7110-aes.c b/drivers/crypto/starfive/jh7110-aes.c
index 426b24889af8..f1edb4fbf364 100644
--- a/drivers/crypto/starfive/jh7110-aes.c
+++ b/drivers/crypto/starfive/jh7110-aes.c
@@ -981,31 +981,31 @@ static int starfive_aes_ccm_decrypt(struct aead_request *req)
 	return starfive_aes_aead_crypt(req, STARFIVE_AES_MODE_CCM);
 }
 
 static int starfive_aes_ecb_init_tfm(struct crypto_skcipher *tfm)
 {
-	return starfive_aes_init_tfm(tfm, "ecb(aes-generic)");
+	return starfive_aes_init_tfm(tfm, "ecb(aes-lib)");
 }
 
 static int starfive_aes_cbc_init_tfm(struct crypto_skcipher *tfm)
 {
-	return starfive_aes_init_tfm(tfm, "cbc(aes-generic)");
+	return starfive_aes_init_tfm(tfm, "cbc(aes-lib)");
 }
 
 static int starfive_aes_ctr_init_tfm(struct crypto_skcipher *tfm)
 {
-	return starfive_aes_init_tfm(tfm, "ctr(aes-generic)");
+	return starfive_aes_init_tfm(tfm, "ctr(aes-lib)");
 }
 
 static int starfive_aes_ccm_init_tfm(struct crypto_aead *tfm)
 {
-	return starfive_aes_aead_init_tfm(tfm, "ccm_base(ctr(aes-generic),cbcmac(aes-generic))");
+	return starfive_aes_aead_init_tfm(tfm, "ccm_base(ctr(aes-lib),cbcmac(aes-lib))");
 }
 
 static int starfive_aes_gcm_init_tfm(struct crypto_aead *tfm)
 {
-	return starfive_aes_aead_init_tfm(tfm, "gcm_base(ctr(aes-generic),ghash-generic)");
+	return starfive_aes_aead_init_tfm(tfm, "gcm_base(ctr(aes-lib),ghash-generic)");
 }
 
 static struct skcipher_engine_alg skcipher_algs[] = {
 {
 	.base.init			= starfive_aes_ecb_init_tfm,
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 4da2f125bb15..be3c134de7b6 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -80,13 +80,10 @@ struct crypto_aes_ctx {
 	u32 key_enc[AES_MAX_KEYLENGTH_U32];
 	u32 key_dec[AES_MAX_KEYLENGTH_U32];
 	u32 key_length;
 };
 
-extern const u32 crypto_ft_tab[4][256] ____cacheline_aligned;
-extern const u32 crypto_it_tab[4][256] ____cacheline_aligned;
-
 /*
  * validate key length for AES algorithms
  */
 static inline int aes_check_keylen(size_t keylen)
 {
@@ -100,13 +97,10 @@ static inline int aes_check_keylen(size_t keylen)
 	}
 
 	return 0;
 }
 
-int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
-		unsigned int key_len);
-
 /**
  * aes_expandkey - Expands the AES key as described in FIPS-197
  * @ctx:	The location where the computed key will be stored.
  * @in_key:	The supplied key.
  * @key_len:	The length of the supplied key.
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 10/36] lib/crypto: arm/aes: Migrate optimized code into library
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (8 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 09/36] crypto: aes - Replace aes-generic with wrapper around lib Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 11/36] lib/crypto: arm64/aes: " Eric Biggers
                   ` (25 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Move the ARM-optimized single-block AES en/decryption code into
lib/crypto/, wire it up to the AES library API, and remove the
superseded "aes-arm" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized for ARM, whereas previously only crypto_cipher was (and the
optimizations weren't enabled by default, which this fixes as well).

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
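A note for reviewers: the new lib/crypto/arm/aes.h glue is essentially
the deleted aes-cipher-glue.c with the crypto_cipher boilerplate
stripped away.  Below is a minimal sketch of the wiring, based on the
__aes_arm_{en,de}crypt() prototypes from the deleted
arch/arm/crypto/aes-cipher.h; the aes_encrypt_arch()/aes_decrypt_arch()
hook names and the struct crypto_aes_ctx parameter are illustrative
assumptions, not necessarily the exact interface the library defines.

#include <crypto/aes.h>
#include <linux/linkage.h>
#include <linux/types.h>

/* Implemented in aes-cipher-core.S, which moves verbatim (100% rename). */
asmlinkage void __aes_arm_encrypt(const u32 rk[], int rounds,
				  const u8 *in, u8 *out);
asmlinkage void __aes_arm_decrypt(const u32 rk[], int rounds,
				  const u8 *in, u8 *out);

static void aes_encrypt_arch(const struct crypto_aes_ctx *ctx,
			     u8 out[AES_BLOCK_SIZE],
			     const u8 in[AES_BLOCK_SIZE])
{
	/* 6 + key_length / 4 gives 10/12/14 rounds for 16/24/32-byte keys */
	__aes_arm_encrypt(ctx->key_enc, 6 + ctx->key_length / 4, in, out);
}

static void aes_decrypt_arch(const struct crypto_aes_ctx *ctx,
			     u8 out[AES_BLOCK_SIZE],
			     const u8 in[AES_BLOCK_SIZE])
{
	/* Decryption uses the inverse round keys, as the old glue did. */
	__aes_arm_decrypt(ctx->key_dec, 6 + ctx->key_length / 4, in, out);
}
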
 arch/arm/configs/milbeaut_m10v_defconfig      |  1 -
 arch/arm/configs/multi_v7_defconfig           |  2 +-
 arch/arm/configs/omap2plus_defconfig          |  2 +-
 arch/arm/configs/pxa_defconfig                |  2 +-
 arch/arm/crypto/Kconfig                       | 18 -----
 arch/arm/crypto/Makefile                      |  2 -
 arch/arm/crypto/aes-cipher-glue.c             | 77 -------------------
 arch/arm/crypto/aes-cipher.h                  | 13 ----
 lib/crypto/Kconfig                            |  1 +
 lib/crypto/Makefile                           |  3 +
 .../crypto/arm}/aes-cipher-core.S             |  0
 lib/crypto/arm/aes.h                          | 56 ++++++++++++++
 12 files changed, 63 insertions(+), 114 deletions(-)
 delete mode 100644 arch/arm/crypto/aes-cipher-glue.c
 delete mode 100644 arch/arm/crypto/aes-cipher.h
 rename {arch/arm/crypto => lib/crypto/arm}/aes-cipher-core.S (100%)
 create mode 100644 lib/crypto/arm/aes.h

diff --git a/arch/arm/configs/milbeaut_m10v_defconfig b/arch/arm/configs/milbeaut_m10v_defconfig
index a2995eb390c6..77b69d672d40 100644
--- a/arch/arm/configs/milbeaut_m10v_defconfig
+++ b/arch/arm/configs/milbeaut_m10v_defconfig
@@ -96,11 +96,10 @@ CONFIG_KEYS=y
 CONFIG_CRYPTO_SELFTESTS=y
 # CONFIG_CRYPTO_ECHAINIV is not set
 CONFIG_CRYPTO_AES=y
 CONFIG_CRYPTO_SEQIV=m
 CONFIG_CRYPTO_GHASH_ARM_CE=m
-CONFIG_CRYPTO_AES_ARM=m
 CONFIG_CRYPTO_AES_ARM_BS=m
 CONFIG_CRYPTO_AES_ARM_CE=m
 # CONFIG_CRYPTO_HW is not set
 CONFIG_DMA_CMA=y
 CONFIG_CMA_SIZE_MBYTES=64
diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig
index 7f1fa9dd88c9..b6d3e20926bb 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -1284,11 +1284,11 @@ CONFIG_CRYPTO_USER=m
 CONFIG_CRYPTO_USER_API_HASH=m
 CONFIG_CRYPTO_USER_API_SKCIPHER=m
 CONFIG_CRYPTO_USER_API_RNG=m
 CONFIG_CRYPTO_USER_API_AEAD=m
 CONFIG_CRYPTO_GHASH_ARM_CE=m
-CONFIG_CRYPTO_AES_ARM=m
+CONFIG_CRYPTO_AES=m
 CONFIG_CRYPTO_AES_ARM_BS=m
 CONFIG_CRYPTO_AES_ARM_CE=m
 CONFIG_CRYPTO_DEV_SUN4I_SS=m
 CONFIG_CRYPTO_DEV_FSL_CAAM=m
 CONFIG_CRYPTO_DEV_EXYNOS_RNG=m
diff --git a/arch/arm/configs/omap2plus_defconfig b/arch/arm/configs/omap2plus_defconfig
index 4e53c331cd84..0464f6552169 100644
--- a/arch/arm/configs/omap2plus_defconfig
+++ b/arch/arm/configs/omap2plus_defconfig
@@ -704,11 +704,11 @@ CONFIG_ROOT_NFS=y
 CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ISO8859_1=y
 CONFIG_SECURITY=y
 CONFIG_CRYPTO_MICHAEL_MIC=y
 CONFIG_CRYPTO_GHASH_ARM_CE=m
-CONFIG_CRYPTO_AES_ARM=m
+CONFIG_CRYPTO_AES=m
 CONFIG_CRYPTO_AES_ARM_BS=m
 CONFIG_CRYPTO_DEV_OMAP=m
 CONFIG_CRYPTO_DEV_OMAP_SHAM=m
 CONFIG_CRYPTO_DEV_OMAP_AES=m
 CONFIG_CRYPTO_DEV_OMAP_DES=m
diff --git a/arch/arm/configs/pxa_defconfig b/arch/arm/configs/pxa_defconfig
index 3ea189f1f42f..eacd08fd87ad 100644
--- a/arch/arm/configs/pxa_defconfig
+++ b/arch/arm/configs/pxa_defconfig
@@ -655,11 +655,11 @@ CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_ANUBIS=m
 CONFIG_CRYPTO_XCBC=m
 CONFIG_CRYPTO_DEFLATE=y
 CONFIG_CRYPTO_LZO=y
-CONFIG_CRYPTO_AES_ARM=m
+CONFIG_CRYPTO_AES=m
 CONFIG_FONTS=y
 CONFIG_FONT_8x8=y
 CONFIG_FONT_8x16=y
 CONFIG_FONT_6x11=y
 CONFIG_FONT_MINI_4x6=y
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 167a648a9def..b9c28c818b7c 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -21,28 +21,10 @@ config CRYPTO_GHASH_ARM_CE
 	  Use an implementation of GHASH (used by the GCM AEAD chaining mode)
 	  that uses the 64x64 to 128 bit polynomial multiplication (vmull.p64)
 	  that is part of the ARMv8 Crypto Extensions, or a slower variant that
 	  uses the vmull.p8 instruction that is part of the basic NEON ISA.
 
-config CRYPTO_AES_ARM
-	tristate "Ciphers: AES"
-	select CRYPTO_ALGAPI
-	select CRYPTO_AES
-	help
-	  Block ciphers: AES cipher algorithms (FIPS-197)
-
-	  Architecture: arm
-
-	  On ARM processors without the Crypto Extensions, this is the
-	  fastest AES implementation for single blocks.  For multiple
-	  blocks, the NEON bit-sliced implementation is usually faster.
-
-	  This implementation may be vulnerable to cache timing attacks,
-	  since it uses lookup tables.  However, as countermeasures it
-	  disables IRQs and preloads the tables; it is hoped this makes
-	  such attacks very difficult.
-
 config CRYPTO_AES_ARM_BS
 	tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS (bit-sliced NEON)"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_SKCIPHER
 	select CRYPTO_LIB_AES
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index d6683e9d4992..e73099e120b3 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -1,15 +1,13 @@
 # SPDX-License-Identifier: GPL-2.0
 #
 # Arch-specific CryptoAPI modules.
 #
 
-obj-$(CONFIG_CRYPTO_AES_ARM) += aes-arm.o
 obj-$(CONFIG_CRYPTO_AES_ARM_BS) += aes-arm-bs.o
 
 obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
 obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) += ghash-arm-ce.o
 
-aes-arm-y	:= aes-cipher-core.o aes-cipher-glue.o
 aes-arm-bs-y	:= aes-neonbs-core.o aes-neonbs-glue.o
 aes-arm-ce-y	:= aes-ce-core.o aes-ce-glue.o
 ghash-arm-ce-y	:= ghash-ce-core.o ghash-ce-glue.o
diff --git a/arch/arm/crypto/aes-cipher-glue.c b/arch/arm/crypto/aes-cipher-glue.c
deleted file mode 100644
index f302db808cd3..000000000000
--- a/arch/arm/crypto/aes-cipher-glue.c
+++ /dev/null
@@ -1,77 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Scalar AES core transform
- *
- * Copyright (C) 2017 Linaro Ltd.
- * Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
- */
-
-#include <crypto/aes.h>
-#include <crypto/algapi.h>
-#include <linux/module.h>
-#include "aes-cipher.h"
-
-EXPORT_SYMBOL_GPL(__aes_arm_encrypt);
-EXPORT_SYMBOL_GPL(__aes_arm_decrypt);
-
-static int aes_arm_setkey(struct crypto_tfm *tfm, const u8 *in_key,
-			  unsigned int key_len)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	return aes_expandkey(ctx, in_key, key_len);
-}
-
-static void aes_arm_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-	int rounds = 6 + ctx->key_length / 4;
-
-	__aes_arm_encrypt(ctx->key_enc, rounds, in, out);
-}
-
-static void aes_arm_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-	int rounds = 6 + ctx->key_length / 4;
-
-	__aes_arm_decrypt(ctx->key_dec, rounds, in, out);
-}
-
-static struct crypto_alg aes_alg = {
-	.cra_name			= "aes",
-	.cra_driver_name		= "aes-arm",
-	.cra_priority			= 200,
-	.cra_flags			= CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize			= AES_BLOCK_SIZE,
-	.cra_ctxsize			= sizeof(struct crypto_aes_ctx),
-	.cra_module			= THIS_MODULE,
-
-	.cra_cipher.cia_min_keysize	= AES_MIN_KEY_SIZE,
-	.cra_cipher.cia_max_keysize	= AES_MAX_KEY_SIZE,
-	.cra_cipher.cia_setkey		= aes_arm_setkey,
-	.cra_cipher.cia_encrypt		= aes_arm_encrypt,
-	.cra_cipher.cia_decrypt		= aes_arm_decrypt,
-
-#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-	.cra_alignmask			= 3,
-#endif
-};
-
-static int __init aes_init(void)
-{
-	return crypto_register_alg(&aes_alg);
-}
-
-static void __exit aes_fini(void)
-{
-	crypto_unregister_alg(&aes_alg);
-}
-
-module_init(aes_init);
-module_exit(aes_fini);
-
-MODULE_DESCRIPTION("Scalar AES cipher for ARM");
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS_CRYPTO("aes");
diff --git a/arch/arm/crypto/aes-cipher.h b/arch/arm/crypto/aes-cipher.h
deleted file mode 100644
index d5db2b87eb69..000000000000
--- a/arch/arm/crypto/aes-cipher.h
+++ /dev/null
@@ -1,13 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-#ifndef ARM_CRYPTO_AES_CIPHER_H
-#define ARM_CRYPTO_AES_CIPHER_H
-
-#include <linux/linkage.h>
-#include <linux/types.h>
-
-asmlinkage void __aes_arm_encrypt(const u32 rk[], int rounds,
-				  const u8 *in, u8 *out);
-asmlinkage void __aes_arm_decrypt(const u32 rk[], int rounds,
-				  const u8 *in, u8 *out);
-
-#endif /* ARM_CRYPTO_AES_CIPHER_H */
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 21fee7c2dfce..67dbf3c0562b 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -12,10 +12,11 @@ config CRYPTO_LIB_AES
 	tristate
 
 config CRYPTO_LIB_AES_ARCH
 	bool
 	depends on CRYPTO_LIB_AES && !UML && !KMSAN
+	default y if ARM
 
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 01193b3f47ba..2f6b0f59eb1b 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -19,10 +19,13 @@ libcryptoutils-y				:= memneq.o utils.o
 
 obj-$(CONFIG_CRYPTO_LIB_AES) += libaes.o
 libaes-y := aes.o
 ifeq ($(CONFIG_CRYPTO_LIB_AES_ARCH),y)
 CFLAGS_aes.o += -I$(src)/$(SRCARCH)
+
+libaes-$(CONFIG_ARM) += arm/aes-cipher-core.o
+
 endif # CONFIG_CRYPTO_LIB_AES_ARCH
 
 ################################################################################
 
 obj-$(CONFIG_CRYPTO_LIB_AESCFB)			+= libaescfb.o
diff --git a/arch/arm/crypto/aes-cipher-core.S b/lib/crypto/arm/aes-cipher-core.S
similarity index 100%
rename from arch/arm/crypto/aes-cipher-core.S
rename to lib/crypto/arm/aes-cipher-core.S
diff --git a/lib/crypto/arm/aes.h b/lib/crypto/arm/aes.h
new file mode 100644
index 000000000000..1dd7dfa657bb
--- /dev/null
+++ b/lib/crypto/arm/aes.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * AES block cipher, optimized for ARM
+ *
+ * Copyright (C) 2017 Linaro Ltd.
+ * Copyright 2026 Google LLC
+ */
+
+asmlinkage void __aes_arm_encrypt(const u32 rk[], int rounds,
+				  const u8 in[AES_BLOCK_SIZE],
+				  u8 out[AES_BLOCK_SIZE]);
+asmlinkage void __aes_arm_decrypt(const u32 inv_rk[], int rounds,
+				  const u8 in[AES_BLOCK_SIZE],
+				  u8 out[AES_BLOCK_SIZE]);
+
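+/*
+ * The scalar ARM core uses the standard AES key schedule, so key
+ * preparation simply reuses the generic expansion.
+ */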
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	aes_expandkey_generic(k->rndkeys, inv_k ? inv_k->inv_rndkeys : NULL,
+			      in_key, key_len);
+}
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
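+	/*
+	 * The assembly code assumes 32-bit aligned buffers on CPUs without
+	 * efficient unaligned access (the old crypto_cipher set
+	 * cra_alignmask = 3 for this case), so bounce misaligned data
+	 * through an aligned buffer.
+	 */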
+	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
+	    !IS_ALIGNED((uintptr_t)out | (uintptr_t)in, 4)) {
+		u8 bounce_buf[AES_BLOCK_SIZE] __aligned(4);
+
+		memcpy(bounce_buf, in, AES_BLOCK_SIZE);
+		__aes_arm_encrypt(key->k.rndkeys, key->nrounds, bounce_buf,
+				  bounce_buf);
+		memcpy(out, bounce_buf, AES_BLOCK_SIZE);
+		return;
+	}
+	__aes_arm_encrypt(key->k.rndkeys, key->nrounds, in, out);
+}
+
+static void aes_decrypt_arch(const struct aes_key *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
+	    !IS_ALIGNED((uintptr_t)out | (uintptr_t)in, 4)) {
+		u8 bounce_buf[AES_BLOCK_SIZE] __aligned(4);
+
+		memcpy(bounce_buf, in, AES_BLOCK_SIZE);
+		__aes_arm_decrypt(key->inv_k.inv_rndkeys, key->nrounds,
+				  bounce_buf, bounce_buf);
+		memcpy(out, bounce_buf, AES_BLOCK_SIZE);
+		return;
+	}
+	__aes_arm_decrypt(key->inv_k.inv_rndkeys, key->nrounds, in, out);
+}
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 11/36] lib/crypto: arm64/aes: Migrate optimized code into library
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (9 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 10/36] lib/crypto: arm/aes: Migrate optimized code into library Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 12/36] lib/crypto: powerpc/aes: Migrate SPE " Eric Biggers
                   ` (24 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Move the ARM64-optimized AES key expansion and single-block
en/decryption code into lib/crypto/, wire it up to the AES library API,
and remove the superseded crypto_cipher algorithms.

The result is that both the AES library and crypto_cipher APIs are now
optimized for ARM64, whereas previously only crypto_cipher was (and the
optimizations weren't enabled by default, which this fixes as well).

Note: to see the diff from arch/arm64/crypto/aes-ce-glue.c to
lib/crypto/arm64/aes.h, view this commit with 'git show -M10'.
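
For orientation, after this patch users reach the optimized code purely
through the library interface.  A rough sketch follows; the
aes_preparekey() and aes_encrypt() shapes here are assumed from this
series' kernel-doc and arch hooks, shown schematically rather than
verbatim:

	struct aes_key key;
	u8 ct[AES_BLOCK_SIZE];
	int err;

	err = aes_preparekey(&key, raw_key, AES_KEYSIZE_256);
	if (err)
		return err;
	/*
	 * Dispatches to __aes_ce_encrypt() when the CPU has the AES
	 * instructions and SIMD is usable; scalar code otherwise.
	 */
	aes_encrypt(&key, ct, pt);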

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm64/crypto/Kconfig                     |  26 +--
 arch/arm64/crypto/Makefile                    |   6 -
 arch/arm64/crypto/aes-ce-ccm-glue.c           |   2 -
 arch/arm64/crypto/aes-ce-glue.c               | 178 ------------------
 arch/arm64/crypto/aes-ce-setkey.h             |   6 -
 arch/arm64/crypto/aes-cipher-glue.c           |  71 -------
 arch/arm64/crypto/aes-glue.c                  |   2 -
 include/crypto/aes.h                          |  10 +
 lib/crypto/Kconfig                            |   1 +
 lib/crypto/Makefile                           |   5 +
 .../crypto => lib/crypto/arm64}/aes-ce-core.S |   0
 .../crypto/arm64}/aes-cipher-core.S           |   0
 lib/crypto/arm64/aes.h                        | 164 ++++++++++++++++
 13 files changed, 181 insertions(+), 290 deletions(-)
 delete mode 100644 arch/arm64/crypto/aes-ce-glue.c
 delete mode 100644 arch/arm64/crypto/aes-ce-setkey.h
 delete mode 100644 arch/arm64/crypto/aes-cipher-glue.c
 rename {arch/arm64/crypto => lib/crypto/arm64}/aes-ce-core.S (100%)
 rename {arch/arm64/crypto => lib/crypto/arm64}/aes-cipher-core.S (100%)
 create mode 100644 lib/crypto/arm64/aes.h

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 4453dff8f0c1..81ed892b3b72 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -35,38 +35,15 @@ config CRYPTO_SM3_ARM64_CE
 	  SM3 (ShangMi 3) secure hash function (OSCCA GM/T 0004-2012)
 
 	  Architecture: arm64 using:
 	  - ARMv8.2 Crypto Extensions
 
-config CRYPTO_AES_ARM64
-	tristate "Ciphers: AES, modes: ECB, CBC, CTR, CTS, XCTR, XTS"
-	select CRYPTO_AES
-	help
-	  Block ciphers: AES cipher algorithms (FIPS-197)
-	  Length-preserving ciphers: AES with ECB, CBC, CTR, CTS,
-	    XCTR, and XTS modes
-	  AEAD cipher: AES with CBC, ESSIV, and SHA-256
-	    for fscrypt and dm-crypt
-
-	  Architecture: arm64
-
-config CRYPTO_AES_ARM64_CE
-	tristate "Ciphers: AES (ARMv8 Crypto Extensions)"
-	depends on KERNEL_MODE_NEON
-	select CRYPTO_ALGAPI
-	select CRYPTO_LIB_AES
-	help
-	  Block ciphers: AES cipher algorithms (FIPS-197)
-
-	  Architecture: arm64 using:
-	  - ARMv8 Crypto Extensions
-
 config CRYPTO_AES_ARM64_CE_BLK
 	tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS (ARMv8 Crypto Extensions)"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_SKCIPHER
-	select CRYPTO_AES_ARM64_CE
+	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_SHA256
 	help
 	  Length-preserving ciphers: AES cipher algorithms (FIPS-197)
 	  with block cipher modes:
 	  - ECB (Electronic Codebook) mode (NIST SP800-38A)
@@ -163,11 +140,10 @@ config CRYPTO_SM4_ARM64_NEON_BLK
 
 config CRYPTO_AES_ARM64_CE_CCM
 	tristate "AEAD cipher: AES in CCM mode (ARMv8 Crypto Extensions)"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_ALGAPI
-	select CRYPTO_AES_ARM64_CE
 	select CRYPTO_AES_ARM64_CE_BLK
 	select CRYPTO_AEAD
 	select CRYPTO_LIB_AES
 	help
 	  AEAD cipher: AES cipher algorithms (FIPS-197) with
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index 3ab4b58e5c4c..3574e917bc37 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -27,22 +27,16 @@ obj-$(CONFIG_CRYPTO_SM4_ARM64_NEON_BLK) += sm4-neon.o
 sm4-neon-y := sm4-neon-glue.o sm4-neon-core.o
 
 obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
 ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
 
-obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
-aes-ce-cipher-y := aes-ce-core.o aes-ce-glue.o
-
 obj-$(CONFIG_CRYPTO_AES_ARM64_CE_CCM) += aes-ce-ccm.o
 aes-ce-ccm-y := aes-ce-ccm-glue.o aes-ce-ccm-core.o
 
 obj-$(CONFIG_CRYPTO_AES_ARM64_CE_BLK) += aes-ce-blk.o
 aes-ce-blk-y := aes-glue-ce.o aes-ce.o
 
 obj-$(CONFIG_CRYPTO_AES_ARM64_NEON_BLK) += aes-neon-blk.o
 aes-neon-blk-y := aes-glue-neon.o aes-neon.o
 
-obj-$(CONFIG_CRYPTO_AES_ARM64) += aes-arm64.o
-aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o
-
 obj-$(CONFIG_CRYPTO_AES_ARM64_BS) += aes-neon-bs.o
 aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index c4fd648471f1..db371ac051fc 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -15,12 +15,10 @@
 #include <crypto/internal/skcipher.h>
 #include <linux/module.h>
 
 #include <asm/simd.h>
 
-#include "aes-ce-setkey.h"
-
 MODULE_IMPORT_NS("CRYPTO_INTERNAL");
 
 static int num_rounds(struct crypto_aes_ctx *ctx)
 {
 	/*
diff --git a/arch/arm64/crypto/aes-ce-glue.c b/arch/arm64/crypto/aes-ce-glue.c
deleted file mode 100644
index a4dad370991d..000000000000
--- a/arch/arm64/crypto/aes-ce-glue.c
+++ /dev/null
@@ -1,178 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * aes-ce-cipher.c - core AES cipher using ARMv8 Crypto Extensions
- *
- * Copyright (C) 2013 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
- */
-
-#include <asm/neon.h>
-#include <asm/simd.h>
-#include <linux/unaligned.h>
-#include <crypto/aes.h>
-#include <crypto/algapi.h>
-#include <crypto/internal/simd.h>
-#include <linux/cpufeature.h>
-#include <linux/module.h>
-
-#include "aes-ce-setkey.h"
-
-MODULE_DESCRIPTION("Synchronous AES cipher using ARMv8 Crypto Extensions");
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-
-struct aes_block {
-	u8 b[AES_BLOCK_SIZE];
-};
-
-asmlinkage void __aes_ce_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-asmlinkage void __aes_ce_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-
-asmlinkage u32 __aes_ce_sub(u32 l);
-asmlinkage void __aes_ce_invert(struct aes_block *out,
-				const struct aes_block *in);
-
-static int num_rounds(struct crypto_aes_ctx *ctx)
-{
-	/*
-	 * # of rounds specified by AES:
-	 * 128 bit key		10 rounds
-	 * 192 bit key		12 rounds
-	 * 256 bit key		14 rounds
-	 * => n byte key	=> 6 + (n/4) rounds
-	 */
-	return 6 + ctx->key_length / 4;
-}
-
-static void aes_cipher_encrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (!crypto_simd_usable()) {
-		aes_encrypt(ctx, dst, src);
-		return;
-	}
-
-	scoped_ksimd()
-		__aes_ce_encrypt(ctx->key_enc, dst, src, num_rounds(ctx));
-}
-
-static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (!crypto_simd_usable()) {
-		aes_decrypt(ctx, dst, src);
-		return;
-	}
-
-	scoped_ksimd()
-		__aes_ce_decrypt(ctx->key_dec, dst, src, num_rounds(ctx));
-}
-
-int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
-		     unsigned int key_len)
-{
-	/*
-	 * The AES key schedule round constants
-	 */
-	static u8 const rcon[] = {
-		0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36,
-	};
-
-	u32 kwords = key_len / sizeof(u32);
-	struct aes_block *key_enc, *key_dec;
-	int i, j;
-
-	if (key_len != AES_KEYSIZE_128 &&
-	    key_len != AES_KEYSIZE_192 &&
-	    key_len != AES_KEYSIZE_256)
-		return -EINVAL;
-
-	ctx->key_length = key_len;
-	for (i = 0; i < kwords; i++)
-		ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
-
-	scoped_ksimd() {
-		for (i = 0; i < sizeof(rcon); i++) {
-			u32 *rki = ctx->key_enc + (i * kwords);
-			u32 *rko = rki + kwords;
-
-			rko[0] = ror32(__aes_ce_sub(rki[kwords - 1]), 8) ^
-				 rcon[i] ^ rki[0];
-			rko[1] = rko[0] ^ rki[1];
-			rko[2] = rko[1] ^ rki[2];
-			rko[3] = rko[2] ^ rki[3];
-
-			if (key_len == AES_KEYSIZE_192) {
-				if (i >= 7)
-					break;
-				rko[4] = rko[3] ^ rki[4];
-				rko[5] = rko[4] ^ rki[5];
-			} else if (key_len == AES_KEYSIZE_256) {
-				if (i >= 6)
-					break;
-				rko[4] = __aes_ce_sub(rko[3]) ^ rki[4];
-				rko[5] = rko[4] ^ rki[5];
-				rko[6] = rko[5] ^ rki[6];
-				rko[7] = rko[6] ^ rki[7];
-			}
-		}
-
-		/*
-		 * Generate the decryption keys for the Equivalent Inverse
-		 * Cipher.  This involves reversing the order of the round
-		 * keys, and applying the Inverse Mix Columns transformation on
-		 * all but the first and the last one.
-		 */
-		key_enc = (struct aes_block *)ctx->key_enc;
-		key_dec = (struct aes_block *)ctx->key_dec;
-		j = num_rounds(ctx);
-
-		key_dec[0] = key_enc[j];
-		for (i = 1, j--; j > 0; i++, j--)
-			__aes_ce_invert(key_dec + i, key_enc + j);
-		key_dec[i] = key_enc[0];
-	}
-
-	return 0;
-}
-EXPORT_SYMBOL(ce_aes_expandkey);
-
-int ce_aes_setkey(struct crypto_tfm *tfm, const u8 *in_key,
-		  unsigned int key_len)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	return ce_aes_expandkey(ctx, in_key, key_len);
-}
-EXPORT_SYMBOL(ce_aes_setkey);
-
-static struct crypto_alg aes_alg = {
-	.cra_name		= "aes",
-	.cra_driver_name	= "aes-ce",
-	.cra_priority		= 250,
-	.cra_flags		= CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize		= AES_BLOCK_SIZE,
-	.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-	.cra_module		= THIS_MODULE,
-	.cra_cipher = {
-		.cia_min_keysize	= AES_MIN_KEY_SIZE,
-		.cia_max_keysize	= AES_MAX_KEY_SIZE,
-		.cia_setkey		= ce_aes_setkey,
-		.cia_encrypt		= aes_cipher_encrypt,
-		.cia_decrypt		= aes_cipher_decrypt
-	}
-};
-
-static int __init aes_mod_init(void)
-{
-	return crypto_register_alg(&aes_alg);
-}
-
-static void __exit aes_mod_exit(void)
-{
-	crypto_unregister_alg(&aes_alg);
-}
-
-module_cpu_feature_match(AES, aes_mod_init);
-module_exit(aes_mod_exit);
diff --git a/arch/arm64/crypto/aes-ce-setkey.h b/arch/arm64/crypto/aes-ce-setkey.h
deleted file mode 100644
index fd9ecf07d88c..000000000000
--- a/arch/arm64/crypto/aes-ce-setkey.h
+++ /dev/null
@@ -1,6 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-
-int ce_aes_setkey(struct crypto_tfm *tfm, const u8 *in_key,
-		  unsigned int key_len);
-int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
-		     unsigned int key_len);
diff --git a/arch/arm64/crypto/aes-cipher-glue.c b/arch/arm64/crypto/aes-cipher-glue.c
deleted file mode 100644
index 9b27cbac278b..000000000000
--- a/arch/arm64/crypto/aes-cipher-glue.c
+++ /dev/null
@@ -1,71 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Scalar AES core transform
- *
- * Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
- */
-
-#include <crypto/aes.h>
-#include <crypto/algapi.h>
-#include <linux/module.h>
-
-asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-asmlinkage void __aes_arm64_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-
-static int aes_arm64_setkey(struct crypto_tfm *tfm, const u8 *in_key,
-			    unsigned int key_len)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	return aes_expandkey(ctx, in_key, key_len);
-}
-
-static void aes_arm64_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-	int rounds = 6 + ctx->key_length / 4;
-
-	__aes_arm64_encrypt(ctx->key_enc, out, in, rounds);
-}
-
-static void aes_arm64_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-	int rounds = 6 + ctx->key_length / 4;
-
-	__aes_arm64_decrypt(ctx->key_dec, out, in, rounds);
-}
-
-static struct crypto_alg aes_alg = {
-	.cra_name			= "aes",
-	.cra_driver_name		= "aes-arm64",
-	.cra_priority			= 200,
-	.cra_flags			= CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize			= AES_BLOCK_SIZE,
-	.cra_ctxsize			= sizeof(struct crypto_aes_ctx),
-	.cra_module			= THIS_MODULE,
-
-	.cra_cipher.cia_min_keysize	= AES_MIN_KEY_SIZE,
-	.cra_cipher.cia_max_keysize	= AES_MAX_KEY_SIZE,
-	.cra_cipher.cia_setkey		= aes_arm64_setkey,
-	.cra_cipher.cia_encrypt		= aes_arm64_encrypt,
-	.cra_cipher.cia_decrypt		= aes_arm64_decrypt
-};
-
-static int __init aes_init(void)
-{
-	return crypto_register_alg(&aes_alg);
-}
-
-static void __exit aes_fini(void)
-{
-	crypto_unregister_alg(&aes_alg);
-}
-
-module_init(aes_init);
-module_exit(aes_fini);
-
-MODULE_DESCRIPTION("Scalar AES cipher for arm64");
-MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS_CRYPTO("aes");
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index c51d4487e9e9..92f43e1cd097 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -19,12 +19,10 @@
 #include <linux/string.h>
 
 #include <asm/hwcap.h>
 #include <asm/simd.h>
 
-#include "aes-ce-setkey.h"
-
 #ifdef USE_V8_CRYPTO_EXTENSIONS
 #define MODE			"ce"
 #define PRIO			300
 #define aes_expandkey		ce_aes_expandkey
 #define aes_ecb_encrypt		ce_aes_ecb_encrypt
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index be3c134de7b6..8a8dd100d8c6 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -114,10 +114,20 @@ static inline int aes_check_keylen(size_t keylen)
  * for the initial combination, the second slot for the first round and so on.
  */
 int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 		  unsigned int key_len);
 
+/*
+ * The following functions are temporarily exported for use by the AES mode
+ * implementations in arch/$(SRCARCH)/crypto/.  These exports will go away when
+ * that code is migrated into lib/crypto/.
+ */
+#ifdef CONFIG_ARM64
+int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
+		     unsigned int key_len);
+#endif
+
 /**
  * aes_preparekey() - Prepare an AES key for encryption and decryption
  * @key: (output) The key structure to initialize
  * @in_key: The raw AES key
  * @key_len: Length of the raw key in bytes.  Should be either AES_KEYSIZE_128,
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 67dbf3c0562b..2c620c004153 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -13,10 +13,11 @@ config CRYPTO_LIB_AES
 
 config CRYPTO_LIB_AES_ARCH
 	bool
 	depends on CRYPTO_LIB_AES && !UML && !KMSAN
 	default y if ARM
+	default y if ARM64
 
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 2f6b0f59eb1b..1b690c63fafb 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -22,10 +22,15 @@ libaes-y := aes.o
 ifeq ($(CONFIG_CRYPTO_LIB_AES_ARCH),y)
 CFLAGS_aes.o += -I$(src)/$(SRCARCH)
 
 libaes-$(CONFIG_ARM) += arm/aes-cipher-core.o
 
+ifeq ($(CONFIG_ARM64),y)
+libaes-y += arm64/aes-cipher-core.o
+libaes-$(CONFIG_KERNEL_MODE_NEON) += arm64/aes-ce-core.o
+endif
+
 endif # CONFIG_CRYPTO_LIB_AES_ARCH
 
 ################################################################################
 
 obj-$(CONFIG_CRYPTO_LIB_AESCFB)			+= libaescfb.o
diff --git a/arch/arm64/crypto/aes-ce-core.S b/lib/crypto/arm64/aes-ce-core.S
similarity index 100%
rename from arch/arm64/crypto/aes-ce-core.S
rename to lib/crypto/arm64/aes-ce-core.S
diff --git a/arch/arm64/crypto/aes-cipher-core.S b/lib/crypto/arm64/aes-cipher-core.S
similarity index 100%
rename from arch/arm64/crypto/aes-cipher-core.S
rename to lib/crypto/arm64/aes-cipher-core.S
diff --git a/lib/crypto/arm64/aes.h b/lib/crypto/arm64/aes.h
new file mode 100644
index 000000000000..576bfaa493f7
--- /dev/null
+++ b/lib/crypto/arm64/aes.h
@@ -0,0 +1,164 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * AES block cipher, optimized for ARM64
+ *
+ * Copyright (C) 2013 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
+ * Copyright 2026 Google LLC
+ */
+
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <linux/unaligned.h>
+#include <linux/cpufeature.h>
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_aes);
+
+struct aes_block {
+	u8 b[AES_BLOCK_SIZE];
+};
+
+asmlinkage void __aes_arm64_encrypt(const u32 rk[], u8 out[AES_BLOCK_SIZE],
+				    const u8 in[AES_BLOCK_SIZE], int rounds);
+asmlinkage void __aes_arm64_decrypt(const u32 inv_rk[], u8 out[AES_BLOCK_SIZE],
+				    const u8 in[AES_BLOCK_SIZE], int rounds);
+asmlinkage void __aes_ce_encrypt(const u32 rk[], u8 out[AES_BLOCK_SIZE],
+				 const u8 in[AES_BLOCK_SIZE], int rounds);
+asmlinkage void __aes_ce_decrypt(const u32 inv_rk[], u8 out[AES_BLOCK_SIZE],
+				 const u8 in[AES_BLOCK_SIZE], int rounds);
+asmlinkage u32 __aes_ce_sub(u32 l);
+asmlinkage void __aes_ce_invert(struct aes_block *out,
+				const struct aes_block *in);
+
+/*
+ * Expand an AES key using the crypto extensions if supported and usable or
+ * generic code otherwise.  The expanded key format is compatible between the
+ * two cases.  The outputs are @rndkeys (required) and @inv_rndkeys (optional).
+ */
+static void aes_expandkey_arm64(u32 rndkeys[], u32 *inv_rndkeys,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	/*
+	 * The AES key schedule round constants
+	 */
+	static u8 const rcon[] = {
+		0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36,
+	};
+
+	u32 kwords = key_len / sizeof(u32);
+	struct aes_block *key_enc, *key_dec;
+	int i, j;
+
+	if (!IS_ENABLED(CONFIG_KERNEL_MODE_NEON) ||
+	    !static_branch_likely(&have_aes) || unlikely(!may_use_simd())) {
+		aes_expandkey_generic(rndkeys, inv_rndkeys, in_key, key_len);
+		return;
+	}
+
+	for (i = 0; i < kwords; i++)
+		rndkeys[i] = get_unaligned_le32(in_key + i * sizeof(u32));
+
+	scoped_ksimd() {
+		for (i = 0; i < sizeof(rcon); i++) {
+			u32 *rki = &rndkeys[i * kwords];
+			u32 *rko = rki + kwords;
+
+			rko[0] = ror32(__aes_ce_sub(rki[kwords - 1]), 8) ^
+				 rcon[i] ^ rki[0];
+			rko[1] = rko[0] ^ rki[1];
+			rko[2] = rko[1] ^ rki[2];
+			rko[3] = rko[2] ^ rki[3];
+
+			if (key_len == AES_KEYSIZE_192) {
+				if (i >= 7)
+					break;
+				rko[4] = rko[3] ^ rki[4];
+				rko[5] = rko[4] ^ rki[5];
+			} else if (key_len == AES_KEYSIZE_256) {
+				if (i >= 6)
+					break;
+				rko[4] = __aes_ce_sub(rko[3]) ^ rki[4];
+				rko[5] = rko[4] ^ rki[5];
+				rko[6] = rko[5] ^ rki[6];
+				rko[7] = rko[6] ^ rki[7];
+			}
+		}
+
+		/*
+		 * Generate the decryption keys for the Equivalent Inverse
+		 * Cipher.  This involves reversing the order of the round
+		 * keys, and applying the Inverse Mix Columns transformation on
+		 * all but the first and the last one.
+		 */
+		if (inv_rndkeys) {
+			key_enc = (struct aes_block *)rndkeys;
+			key_dec = (struct aes_block *)inv_rndkeys;
+			j = nrounds;
+
+			key_dec[0] = key_enc[j];
+			for (i = 1, j--; j > 0; i++, j--)
+				__aes_ce_invert(key_dec + i, key_enc + j);
+			key_dec[i] = key_enc[0];
+		}
+	}
+}
+
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	aes_expandkey_arm64(k->rndkeys, inv_k ? inv_k->inv_rndkeys : NULL,
+			    in_key, key_len, nrounds);
+}
+
+/*
+ * This is here temporarily until the remaining AES mode implementations are
+ * migrated from arch/arm64/crypto/ to lib/crypto/arm64/.
+ */
+int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
+		     unsigned int key_len)
+{
+	if (aes_check_keylen(key_len) != 0)
+		return -EINVAL;
+	ctx->key_length = key_len;
+	aes_expandkey_arm64(ctx->key_enc, ctx->key_dec, in_key, key_len,
+			    6 + key_len / 4);
+	return 0;
+}
+EXPORT_SYMBOL(ce_aes_expandkey);
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
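+	/*
+	 * Use the ARMv8 Crypto Extensions when available and SIMD is
+	 * usable; otherwise fall back to the scalar arm64 code.
+	 */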
+	if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
+	    static_branch_likely(&have_aes) && likely(may_use_simd())) {
+		scoped_ksimd()
+			__aes_ce_encrypt(key->k.rndkeys, out, in, key->nrounds);
+	} else {
+		__aes_arm64_encrypt(key->k.rndkeys, out, in, key->nrounds);
+	}
+}
+
+static void aes_decrypt_arch(const struct aes_key *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
+	    static_branch_likely(&have_aes) && likely(may_use_simd())) {
+		scoped_ksimd()
+			__aes_ce_decrypt(key->inv_k.inv_rndkeys, out, in,
+					 key->nrounds);
+	} else {
+		__aes_arm64_decrypt(key->inv_k.inv_rndkeys, out, in,
+				    key->nrounds);
+	}
+}
+
+#ifdef CONFIG_KERNEL_MODE_NEON
+#define aes_mod_init_arch aes_mod_init_arch
+static void aes_mod_init_arch(void)
+{
+	if (cpu_have_named_feature(AES))
+		static_branch_enable(&have_aes);
+}
+#endif /* CONFIG_KERNEL_MODE_NEON */
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 12/36] lib/crypto: powerpc/aes: Migrate SPE optimized code into library
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (10 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 11/36] lib/crypto: arm64/aes: " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 13/36] lib/crypto: powerpc/aes: Migrate POWER8 " Eric Biggers
                   ` (23 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Move the PowerPC SPE AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the superseded "aes-ppc-spe" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized with SPE, whereas previously only crypto_cipher was (and
optimizations weren't enabled by default, which this commit fixes too).

Note that many of the functions in the PowerPC SPE assembly code are
still used by the AES mode implementations in arch/powerpc/crypto/.  For
now, just export these functions.  These exports will go away once the
AES modes are migrated to the library as well.  (Trying to split up the
assembly files seemed like much more trouble than it would be worth.)
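
One detail worth noting: the SPE core takes its round count in its own
units rather than the standard AES round count.  The new glue passes
nrounds / 2 - 1, which reproduces the values the old ppc_aes_setkey()
stored in ctx->rounds:

	AES-128: nrounds = 10  =>  10/2 - 1 = 4
	AES-192: nrounds = 12  =>  12/2 - 1 = 5
	AES-256: nrounds = 14  =>  14/2 - 1 = 6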

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/powerpc/crypto/Kconfig                   |  2 +-
 arch/powerpc/crypto/Makefile                  |  2 +-
 arch/powerpc/crypto/aes-spe-glue.c            | 88 ++-----------------
 include/crypto/aes.h                          | 31 +++++++
 lib/crypto/Kconfig                            |  1 +
 lib/crypto/Makefile                           |  9 ++
 .../crypto/powerpc}/aes-spe-core.S            |  0
 .../crypto/powerpc}/aes-spe-keys.S            |  0
 .../crypto/powerpc}/aes-spe-modes.S           |  0
 .../crypto/powerpc}/aes-spe-regs.h            |  0
 .../crypto/powerpc}/aes-tab-4k.S              |  0
 lib/crypto/powerpc/aes.h                      | 74 ++++++++++++++++
 12 files changed, 122 insertions(+), 85 deletions(-)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-spe-core.S (100%)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-spe-keys.S (100%)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-spe-modes.S (100%)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-spe-regs.h (100%)
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aes-tab-4k.S (100%)
 create mode 100644 lib/crypto/powerpc/aes.h

diff --git a/arch/powerpc/crypto/Kconfig b/arch/powerpc/crypto/Kconfig
index 662aed46f9c7..2d056f1fc90f 100644
--- a/arch/powerpc/crypto/Kconfig
+++ b/arch/powerpc/crypto/Kconfig
@@ -3,13 +3,13 @@
 menu "Accelerated Cryptographic Algorithms for CPU (powerpc)"
 
 config CRYPTO_AES_PPC_SPE
 	tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS (SPE)"
 	depends on SPE
+	select CRYPTO_LIB_AES
 	select CRYPTO_SKCIPHER
 	help
-	  Block ciphers: AES cipher algorithms (FIPS-197)
 	  Length-preserving ciphers: AES with ECB, CBC, CTR, and XTS modes
 
 	  Architecture: powerpc using:
 	  - SPE (Signal Processing Engine) extensions
 
diff --git a/arch/powerpc/crypto/Makefile b/arch/powerpc/crypto/Makefile
index 5960e5300db7..e22310da86b5 100644
--- a/arch/powerpc/crypto/Makefile
+++ b/arch/powerpc/crypto/Makefile
@@ -7,11 +7,11 @@
 
 obj-$(CONFIG_CRYPTO_AES_PPC_SPE) += aes-ppc-spe.o
 obj-$(CONFIG_CRYPTO_AES_GCM_P10) += aes-gcm-p10-crypto.o
 obj-$(CONFIG_CRYPTO_DEV_VMX_ENCRYPT) += vmx-crypto.o
 
-aes-ppc-spe-y := aes-spe-core.o aes-spe-keys.o aes-tab-4k.o aes-spe-modes.o aes-spe-glue.o
+aes-ppc-spe-y := aes-spe-glue.o
 aes-gcm-p10-crypto-y := aes-gcm-p10-glue.o aes-gcm-p10.o ghashp10-ppc.o aesp10-ppc.o
 vmx-crypto-objs := vmx.o aesp8-ppc.o ghashp8-ppc.o aes.o aes_cbc.o aes_ctr.o aes_xts.o ghash.o
 
 ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y)
 override flavour := linux-ppc64le
diff --git a/arch/powerpc/crypto/aes-spe-glue.c b/arch/powerpc/crypto/aes-spe-glue.c
index efab78a3a8f6..7d2827e65240 100644
--- a/arch/powerpc/crypto/aes-spe-glue.c
+++ b/arch/powerpc/crypto/aes-spe-glue.c
@@ -49,34 +49,10 @@ struct ppc_xts_ctx {
 	u32 key_dec[AES_MAX_KEYLENGTH_U32];
 	u32 key_twk[AES_MAX_KEYLENGTH_U32];
 	u32 rounds;
 };
 
-extern void ppc_encrypt_aes(u8 *out, const u8 *in, u32 *key_enc, u32 rounds);
-extern void ppc_decrypt_aes(u8 *out, const u8 *in, u32 *key_dec, u32 rounds);
-extern void ppc_encrypt_ecb(u8 *out, const u8 *in, u32 *key_enc, u32 rounds,
-			    u32 bytes);
-extern void ppc_decrypt_ecb(u8 *out, const u8 *in, u32 *key_dec, u32 rounds,
-			    u32 bytes);
-extern void ppc_encrypt_cbc(u8 *out, const u8 *in, u32 *key_enc, u32 rounds,
-			    u32 bytes, u8 *iv);
-extern void ppc_decrypt_cbc(u8 *out, const u8 *in, u32 *key_dec, u32 rounds,
-			    u32 bytes, u8 *iv);
-extern void ppc_crypt_ctr  (u8 *out, const u8 *in, u32 *key_enc, u32 rounds,
-			    u32 bytes, u8 *iv);
-extern void ppc_encrypt_xts(u8 *out, const u8 *in, u32 *key_enc, u32 rounds,
-			    u32 bytes, u8 *iv, u32 *key_twk);
-extern void ppc_decrypt_xts(u8 *out, const u8 *in, u32 *key_dec, u32 rounds,
-			    u32 bytes, u8 *iv, u32 *key_twk);
-
-extern void ppc_expand_key_128(u32 *key_enc, const u8 *key);
-extern void ppc_expand_key_192(u32 *key_enc, const u8 *key);
-extern void ppc_expand_key_256(u32 *key_enc, const u8 *key);
-
-extern void ppc_generate_decrypt_key(u32 *key_dec,u32 *key_enc,
-				     unsigned int key_len);
-
 static void spe_begin(void)
 {
 	/* disable preemption and save users SPE registers if required */
 	preempt_disable();
 	enable_kernel_spe();
@@ -87,14 +63,14 @@ static void spe_end(void)
 	disable_kernel_spe();
 	/* reenable preemption */
 	preempt_enable();
 }
 
-static int ppc_aes_setkey(struct crypto_tfm *tfm, const u8 *in_key,
-		unsigned int key_len)
+static int ppc_aes_setkey_skcipher(struct crypto_skcipher *tfm,
+				   const u8 *in_key, unsigned int key_len)
 {
-	struct ppc_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+	struct ppc_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
 
 	switch (key_len) {
 	case AES_KEYSIZE_128:
 		ctx->rounds = 4;
 		ppc_expand_key_128(ctx->key_enc, in_key);
@@ -114,16 +90,10 @@ static int ppc_aes_setkey(struct crypto_tfm *tfm, const u8 *in_key,
 	ppc_generate_decrypt_key(ctx->key_dec, ctx->key_enc, key_len);
 
 	return 0;
 }
 
-static int ppc_aes_setkey_skcipher(struct crypto_skcipher *tfm,
-				   const u8 *in_key, unsigned int key_len)
-{
-	return ppc_aes_setkey(crypto_skcipher_tfm(tfm), in_key, key_len);
-}
-
 static int ppc_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
 		   unsigned int key_len)
 {
 	struct ppc_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
 	int err;
@@ -157,28 +127,10 @@ static int ppc_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
 	ppc_generate_decrypt_key(ctx->key_dec, ctx->key_enc, key_len);
 
 	return 0;
 }
 
-static void ppc_aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	struct ppc_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	spe_begin();
-	ppc_encrypt_aes(out, in, ctx->key_enc, ctx->rounds);
-	spe_end();
-}
-
-static void ppc_aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	struct ppc_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	spe_begin();
-	ppc_decrypt_aes(out, in, ctx->key_dec, ctx->rounds);
-	spe_end();
-}
-
 static int ppc_ecb_crypt(struct skcipher_request *req, bool enc)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct ppc_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
 	struct skcipher_walk walk;
@@ -408,30 +360,10 @@ static int ppc_xts_decrypt(struct skcipher_request *req)
 * This improves IPsec throughput by another few percent. Additionally we assume
  * that AES context is always aligned to at least 8 bytes because it is created
  * with kmalloc() in the crypto infrastructure
  */
 
-static struct crypto_alg aes_cipher_alg = {
-	.cra_name		=	"aes",
-	.cra_driver_name	=	"aes-ppc-spe",
-	.cra_priority		=	300,
-	.cra_flags		=	CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize		=	AES_BLOCK_SIZE,
-	.cra_ctxsize		=	sizeof(struct ppc_aes_ctx),
-	.cra_alignmask		=	0,
-	.cra_module		=	THIS_MODULE,
-	.cra_u			=	{
-		.cipher = {
-			.cia_min_keysize	=	AES_MIN_KEY_SIZE,
-			.cia_max_keysize	=	AES_MAX_KEY_SIZE,
-			.cia_setkey		=	ppc_aes_setkey,
-			.cia_encrypt		=	ppc_aes_encrypt,
-			.cia_decrypt		=	ppc_aes_decrypt
-		}
-	}
-};
-
 static struct skcipher_alg aes_skcipher_algs[] = {
 	{
 		.base.cra_name		=	"ecb(aes)",
 		.base.cra_driver_name	=	"ecb-ppc-spe",
 		.base.cra_priority	=	300,
@@ -486,26 +418,16 @@ static struct skcipher_alg aes_skcipher_algs[] = {
 	}
 };
 
 static int __init ppc_aes_mod_init(void)
 {
-	int err;
-
-	err = crypto_register_alg(&aes_cipher_alg);
-	if (err)
-		return err;
-
-	err = crypto_register_skciphers(aes_skcipher_algs,
-					ARRAY_SIZE(aes_skcipher_algs));
-	if (err)
-		crypto_unregister_alg(&aes_cipher_alg);
-	return err;
+	return crypto_register_skciphers(aes_skcipher_algs,
+					 ARRAY_SIZE(aes_skcipher_algs));
 }
 
 static void __exit ppc_aes_mod_fini(void)
 {
-	crypto_unregister_alg(&aes_cipher_alg);
 	crypto_unregister_skciphers(aes_skcipher_algs,
 				    ARRAY_SIZE(aes_skcipher_algs));
 }
 
 module_init(ppc_aes_mod_init);
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 8a8dd100d8c6..49ce2d1e086e 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -18,14 +18,26 @@
 #define AES_MAX_KEYLENGTH	(15 * 16)
 #define AES_MAX_KEYLENGTH_U32	(AES_MAX_KEYLENGTH / sizeof(u32))
 
 union aes_enckey_arch {
 	u32 rndkeys[AES_MAX_KEYLENGTH_U32];
+#ifdef CONFIG_CRYPTO_LIB_AES_ARCH
+#if defined(CONFIG_PPC) && defined(CONFIG_SPE)
+	/* Used unconditionally (when SPE AES code is enabled in kconfig) */
+	u32 spe_enc_key[AES_MAX_KEYLENGTH_U32] __aligned(8);
+#endif
+#endif /* CONFIG_CRYPTO_LIB_AES_ARCH */
 };
 
 union aes_invkey_arch {
 	u32 inv_rndkeys[AES_MAX_KEYLENGTH_U32];
+#ifdef CONFIG_CRYPTO_LIB_AES_ARCH
+#if defined(CONFIG_PPC) && defined(CONFIG_SPE)
+	/* Used unconditionally (when SPE AES code is enabled in kconfig) */
+	u32 spe_dec_key[AES_MAX_KEYLENGTH_U32] __aligned(8);
+#endif
+#endif /* CONFIG_CRYPTO_LIB_AES_ARCH */
 };
 
 /**
  * struct aes_enckey - An AES key prepared for encryption
  * @len: Key length in bytes: 16 for AES-128, 24 for AES-192, 32 for AES-256.
@@ -122,10 +134,29 @@ int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
  * that code is migrated into lib/crypto/.
  */
 #ifdef CONFIG_ARM64
 int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 		     unsigned int key_len);
+#elif defined(CONFIG_PPC)
+void ppc_expand_key_128(u32 *key_enc, const u8 *key);
+void ppc_expand_key_192(u32 *key_enc, const u8 *key);
+void ppc_expand_key_256(u32 *key_enc, const u8 *key);
+void ppc_generate_decrypt_key(u32 *key_dec, u32 *key_enc, unsigned int key_len);
+void ppc_encrypt_ecb(u8 *out, const u8 *in, u32 *key_enc, u32 rounds,
+		     u32 bytes);
+void ppc_decrypt_ecb(u8 *out, const u8 *in, u32 *key_dec, u32 rounds,
+		     u32 bytes);
+void ppc_encrypt_cbc(u8 *out, const u8 *in, u32 *key_enc, u32 rounds, u32 bytes,
+		     u8 *iv);
+void ppc_decrypt_cbc(u8 *out, const u8 *in, u32 *key_dec, u32 rounds, u32 bytes,
+		     u8 *iv);
+void ppc_crypt_ctr(u8 *out, const u8 *in, u32 *key_enc, u32 rounds, u32 bytes,
+		   u8 *iv);
+void ppc_encrypt_xts(u8 *out, const u8 *in, u32 *key_enc, u32 rounds, u32 bytes,
+		     u8 *iv, u32 *key_twk);
+void ppc_decrypt_xts(u8 *out, const u8 *in, u32 *key_dec, u32 rounds, u32 bytes,
+		     u8 *iv, u32 *key_twk);
 #endif
 
 /**
  * aes_preparekey() - Prepare an AES key for encryption and decryption
  * @key: (output) The key structure to initialize
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 2c620c004153..50057f534aec 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -14,10 +14,11 @@ config CRYPTO_LIB_AES
 config CRYPTO_LIB_AES_ARCH
 	bool
 	depends on CRYPTO_LIB_AES && !UML && !KMSAN
 	default y if ARM
 	default y if ARM64
+	default y if PPC && SPE
 
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 1b690c63fafb..d68fde004104 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -27,10 +27,19 @@ libaes-$(CONFIG_ARM) += arm/aes-cipher-core.o
 ifeq ($(CONFIG_ARM64),y)
 libaes-y += arm64/aes-cipher-core.o
 libaes-$(CONFIG_KERNEL_MODE_NEON) += arm64/aes-ce-core.o
 endif
 
+ifeq ($(CONFIG_PPC),y)
+ifeq ($(CONFIG_SPE),y)
+libaes-y += powerpc/aes-spe-core.o \
+	    powerpc/aes-spe-keys.o \
+	    powerpc/aes-spe-modes.o \
+	    powerpc/aes-tab-4k.o
+endif
+endif # CONFIG_PPC
+
 endif # CONFIG_CRYPTO_LIB_AES_ARCH
 
 ################################################################################
 
 obj-$(CONFIG_CRYPTO_LIB_AESCFB)			+= libaescfb.o
diff --git a/arch/powerpc/crypto/aes-spe-core.S b/lib/crypto/powerpc/aes-spe-core.S
similarity index 100%
rename from arch/powerpc/crypto/aes-spe-core.S
rename to lib/crypto/powerpc/aes-spe-core.S
diff --git a/arch/powerpc/crypto/aes-spe-keys.S b/lib/crypto/powerpc/aes-spe-keys.S
similarity index 100%
rename from arch/powerpc/crypto/aes-spe-keys.S
rename to lib/crypto/powerpc/aes-spe-keys.S
diff --git a/arch/powerpc/crypto/aes-spe-modes.S b/lib/crypto/powerpc/aes-spe-modes.S
similarity index 100%
rename from arch/powerpc/crypto/aes-spe-modes.S
rename to lib/crypto/powerpc/aes-spe-modes.S
diff --git a/arch/powerpc/crypto/aes-spe-regs.h b/lib/crypto/powerpc/aes-spe-regs.h
similarity index 100%
rename from arch/powerpc/crypto/aes-spe-regs.h
rename to lib/crypto/powerpc/aes-spe-regs.h
diff --git a/arch/powerpc/crypto/aes-tab-4k.S b/lib/crypto/powerpc/aes-tab-4k.S
similarity index 100%
rename from arch/powerpc/crypto/aes-tab-4k.S
rename to lib/crypto/powerpc/aes-tab-4k.S
diff --git a/lib/crypto/powerpc/aes.h b/lib/crypto/powerpc/aes.h
new file mode 100644
index 000000000000..cf22020f9050
--- /dev/null
+++ b/lib/crypto/powerpc/aes.h
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2015 Markus Stockhausen <stockhausen@collogia.de>
+ * Copyright 2026 Google LLC
+ */
+#include <asm/simd.h>
+#include <asm/switch_to.h>
+#include <linux/cpufeature.h>
+#include <linux/jump_label.h>
+#include <linux/preempt.h>
+#include <linux/uaccess.h>
+
+EXPORT_SYMBOL_GPL(ppc_expand_key_128);
+EXPORT_SYMBOL_GPL(ppc_expand_key_192);
+EXPORT_SYMBOL_GPL(ppc_expand_key_256);
+EXPORT_SYMBOL_GPL(ppc_generate_decrypt_key);
+EXPORT_SYMBOL_GPL(ppc_encrypt_ecb);
+EXPORT_SYMBOL_GPL(ppc_decrypt_ecb);
+EXPORT_SYMBOL_GPL(ppc_encrypt_cbc);
+EXPORT_SYMBOL_GPL(ppc_decrypt_cbc);
+EXPORT_SYMBOL_GPL(ppc_crypt_ctr);
+EXPORT_SYMBOL_GPL(ppc_encrypt_xts);
+EXPORT_SYMBOL_GPL(ppc_decrypt_xts);
+
+void ppc_encrypt_aes(u8 *out, const u8 *in, const u32 *key_enc, u32 rounds);
+void ppc_decrypt_aes(u8 *out, const u8 *in, const u32 *key_dec, u32 rounds);
+
+static void spe_begin(void)
+{
+	/* disable preemption and save users SPE registers if required */
+	preempt_disable();
+	enable_kernel_spe();
+}
+
+static void spe_end(void)
+{
+	disable_kernel_spe();
+	/* reenable preemption */
+	preempt_enable();
+}
+
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	if (key_len == AES_KEYSIZE_128)
+		ppc_expand_key_128(k->spe_enc_key, in_key);
+	else if (key_len == AES_KEYSIZE_192)
+		ppc_expand_key_192(k->spe_enc_key, in_key);
+	else
+		ppc_expand_key_256(k->spe_enc_key, in_key);
+
+	if (inv_k)
+		ppc_generate_decrypt_key(inv_k->spe_dec_key, k->spe_enc_key,
+					 key_len);
+}
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
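+	/* Round count in the SPE core's units: 10/12/14 -> 4/5/6. */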
+	spe_begin();
+	ppc_encrypt_aes(out, in, key->k.spe_enc_key, key->nrounds / 2 - 1);
+	spe_end();
+}
+
+static void aes_decrypt_arch(const struct aes_key *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	spe_begin();
+	ppc_decrypt_aes(out, in, key->inv_k.spe_dec_key, key->nrounds / 2 - 1);
+	spe_end();
+}
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 13/36] lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (11 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 12/36] lib/crypto: powerpc/aes: Migrate SPE " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 14/36] lib/crypto: riscv/aes: Migrate " Eric Biggers
                   ` (22 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Move the POWER8 AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the superseded "p8_aes" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized for POWER8, whereas previously only crypto_cipher was (and
optimizations weren't enabled by default, which this commit fixes too).

Note that many of the functions in the POWER8 assembly code are still
used by the AES mode implementations in arch/powerpc/crypto/.  For now,
just export these functions.  These exports will go away once the AES
modes are migrated to the library as well.  (Trying to split up the
assembly file seemed like much more trouble than it would be worth.)

Another challenge is that the POWER8 assembly code uses a custom format
for the expanded AES key.  Since that code is imported from OpenSSL and
targets POWER8 specifically (rather than POWER9, which has better data
movement and byteswap instructions), the format is not easily changed.
For now I've just kept the custom format.  To maintain full correctness,
this requires executing some slow fallback code in the case where the
usability of VSX changes between key expansion and use.  That should be
tolerable, as this case shouldn't happen in practice.
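
Concretely, the two key formats are disambiguated by a sentinel: as the
struct comments added to <crypto/aes.h> below describe, p8.nrounds does
not overlap rndkeys and is left zero when the generic-format fallback
expansion was used.  A rough sketch of the resulting encrypt-side
dispatch (illustrative only; the fallback helper name is hypothetical,
not the literal code in lib/crypto/powerpc/aes.h):

	if (key->k.p8.nrounds != 0) {
		/* Expanded by aes_p8_set_encrypt_key(): P8 format */
		aes_p8_encrypt(in, out, &key->k.p8);
	} else {
		/* VSX wasn't usable at key preparation time */
		aes_encrypt_fallback(key, out, in);	/* hypothetical */
	}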

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/powerpc/crypto/Makefile                  |   7 +-
 arch/powerpc/crypto/aes.c                     | 134 --------------
 arch/powerpc/crypto/aesp8-ppc.h               |  23 ---
 arch/powerpc/crypto/vmx.c                     |  10 +-
 include/crypto/aes.h                          |  41 +++++
 lib/crypto/Kconfig                            |   2 +-
 lib/crypto/Makefile                           |  14 +-
 lib/crypto/powerpc/.gitignore                 |   2 +
 lib/crypto/powerpc/aes.h                      | 164 ++++++++++++++++++
 .../crypto/powerpc}/aesp8-ppc.pl              |   1 +
 10 files changed, 226 insertions(+), 172 deletions(-)
 delete mode 100644 arch/powerpc/crypto/aes.c
 create mode 100644 lib/crypto/powerpc/.gitignore
 rename {arch/powerpc/crypto => lib/crypto/powerpc}/aesp8-ppc.pl (99%)

diff --git a/arch/powerpc/crypto/Makefile b/arch/powerpc/crypto/Makefile
index e22310da86b5..3ac0886282a2 100644
--- a/arch/powerpc/crypto/Makefile
+++ b/arch/powerpc/crypto/Makefile
@@ -9,11 +9,11 @@ obj-$(CONFIG_CRYPTO_AES_PPC_SPE) += aes-ppc-spe.o
 obj-$(CONFIG_CRYPTO_AES_GCM_P10) += aes-gcm-p10-crypto.o
 obj-$(CONFIG_CRYPTO_DEV_VMX_ENCRYPT) += vmx-crypto.o
 
 aes-ppc-spe-y := aes-spe-glue.o
 aes-gcm-p10-crypto-y := aes-gcm-p10-glue.o aes-gcm-p10.o ghashp10-ppc.o aesp10-ppc.o
-vmx-crypto-objs := vmx.o aesp8-ppc.o ghashp8-ppc.o aes.o aes_cbc.o aes_ctr.o aes_xts.o ghash.o
+vmx-crypto-objs := vmx.o ghashp8-ppc.o aes_cbc.o aes_ctr.o aes_xts.o ghash.o
 
 ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y)
 override flavour := linux-ppc64le
 else
 ifdef CONFIG_PPC64_ELF_ABI_V2
@@ -24,17 +24,16 @@ endif
 endif
 
 quiet_cmd_perl = PERL    $@
       cmd_perl = $(PERL) $< $(flavour) > $@
 
-targets += aesp10-ppc.S ghashp10-ppc.S aesp8-ppc.S ghashp8-ppc.S
+targets += aesp10-ppc.S ghashp10-ppc.S ghashp8-ppc.S
 
 $(obj)/aesp10-ppc.S $(obj)/ghashp10-ppc.S: $(obj)/%.S: $(src)/%.pl FORCE
 	$(call if_changed,perl)
 
-$(obj)/aesp8-ppc.S $(obj)/ghashp8-ppc.S: $(obj)/%.S: $(src)/%.pl FORCE
+$(obj)/ghashp8-ppc.S: $(obj)/%.S: $(src)/%.pl FORCE
 	$(call if_changed,perl)
 
 OBJECT_FILES_NON_STANDARD_aesp10-ppc.o := y
 OBJECT_FILES_NON_STANDARD_ghashp10-ppc.o := y
-OBJECT_FILES_NON_STANDARD_aesp8-ppc.o := y
 OBJECT_FILES_NON_STANDARD_ghashp8-ppc.o := y
diff --git a/arch/powerpc/crypto/aes.c b/arch/powerpc/crypto/aes.c
deleted file mode 100644
index b7192ee719fc..000000000000
--- a/arch/powerpc/crypto/aes.c
+++ /dev/null
@@ -1,134 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * AES routines supporting VMX instructions on the Power 8
- *
- * Copyright (C) 2015 International Business Machines Inc.
- *
- * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
- */
-
-#include <asm/simd.h>
-#include <asm/switch_to.h>
-#include <crypto/aes.h>
-#include <crypto/internal/cipher.h>
-#include <crypto/internal/simd.h>
-#include <linux/err.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/uaccess.h>
-
-#include "aesp8-ppc.h"
-
-struct p8_aes_ctx {
-	struct crypto_cipher *fallback;
-	struct p8_aes_key enc_key;
-	struct p8_aes_key dec_key;
-};
-
-static int p8_aes_init(struct crypto_tfm *tfm)
-{
-	const char *alg = crypto_tfm_alg_name(tfm);
-	struct crypto_cipher *fallback;
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	fallback = crypto_alloc_cipher(alg, 0, CRYPTO_ALG_NEED_FALLBACK);
-	if (IS_ERR(fallback)) {
-		printk(KERN_ERR
-		       "Failed to allocate transformation for '%s': %ld\n",
-		       alg, PTR_ERR(fallback));
-		return PTR_ERR(fallback);
-	}
-
-	crypto_cipher_set_flags(fallback,
-				crypto_cipher_get_flags((struct
-							 crypto_cipher *)
-							tfm));
-	ctx->fallback = fallback;
-
-	return 0;
-}
-
-static void p8_aes_exit(struct crypto_tfm *tfm)
-{
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (ctx->fallback) {
-		crypto_free_cipher(ctx->fallback);
-		ctx->fallback = NULL;
-	}
-}
-
-static int p8_aes_setkey(struct crypto_tfm *tfm, const u8 *key,
-			 unsigned int keylen)
-{
-	int ret;
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	preempt_disable();
-	pagefault_disable();
-	enable_kernel_vsx();
-	ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
-	ret |= aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
-	disable_kernel_vsx();
-	pagefault_enable();
-	preempt_enable();
-
-	ret |= crypto_cipher_setkey(ctx->fallback, key, keylen);
-
-	return ret ? -EINVAL : 0;
-}
-
-static void p8_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (!crypto_simd_usable()) {
-		crypto_cipher_encrypt_one(ctx->fallback, dst, src);
-	} else {
-		preempt_disable();
-		pagefault_disable();
-		enable_kernel_vsx();
-		aes_p8_encrypt(src, dst, &ctx->enc_key);
-		disable_kernel_vsx();
-		pagefault_enable();
-		preempt_enable();
-	}
-}
-
-static void p8_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (!crypto_simd_usable()) {
-		crypto_cipher_decrypt_one(ctx->fallback, dst, src);
-	} else {
-		preempt_disable();
-		pagefault_disable();
-		enable_kernel_vsx();
-		aes_p8_decrypt(src, dst, &ctx->dec_key);
-		disable_kernel_vsx();
-		pagefault_enable();
-		preempt_enable();
-	}
-}
-
-struct crypto_alg p8_aes_alg = {
-	.cra_name = "aes",
-	.cra_driver_name = "p8_aes",
-	.cra_module = THIS_MODULE,
-	.cra_priority = 1000,
-	.cra_type = NULL,
-	.cra_flags = CRYPTO_ALG_TYPE_CIPHER | CRYPTO_ALG_NEED_FALLBACK,
-	.cra_alignmask = 0,
-	.cra_blocksize = AES_BLOCK_SIZE,
-	.cra_ctxsize = sizeof(struct p8_aes_ctx),
-	.cra_init = p8_aes_init,
-	.cra_exit = p8_aes_exit,
-	.cra_cipher = {
-		       .cia_min_keysize = AES_MIN_KEY_SIZE,
-		       .cia_max_keysize = AES_MAX_KEY_SIZE,
-		       .cia_setkey = p8_aes_setkey,
-		       .cia_encrypt = p8_aes_encrypt,
-		       .cia_decrypt = p8_aes_decrypt,
-	},
-};
diff --git a/arch/powerpc/crypto/aesp8-ppc.h b/arch/powerpc/crypto/aesp8-ppc.h
index 0bea010128cb..6862c605cc33 100644
--- a/arch/powerpc/crypto/aesp8-ppc.h
+++ b/arch/powerpc/crypto/aesp8-ppc.h
@@ -1,31 +1,8 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/types.h>
 #include <crypto/aes.h>
 
-struct p8_aes_key {
-	u8 key[AES_MAX_KEYLENGTH];
-	int rounds;
-};
-
 extern struct shash_alg p8_ghash_alg;
-extern struct crypto_alg p8_aes_alg;
 extern struct skcipher_alg p8_aes_cbc_alg;
 extern struct skcipher_alg p8_aes_ctr_alg;
 extern struct skcipher_alg p8_aes_xts_alg;
-
-int aes_p8_set_encrypt_key(const u8 *userKey, const int bits,
-			   struct p8_aes_key *key);
-int aes_p8_set_decrypt_key(const u8 *userKey, const int bits,
-			   struct p8_aes_key *key);
-void aes_p8_encrypt(const u8 *in, u8 *out, const struct p8_aes_key *key);
-void aes_p8_decrypt(const u8 *in, u8 *out, const struct p8_aes_key *key);
-void aes_p8_cbc_encrypt(const u8 *in, u8 *out, size_t len,
-			const struct p8_aes_key *key, u8 *iv, const int enc);
-void aes_p8_ctr32_encrypt_blocks(const u8 *in, u8 *out, size_t len,
-				 const struct p8_aes_key *key, const u8 *iv);
-void aes_p8_xts_encrypt(const u8 *in, u8 *out, size_t len,
-			const struct p8_aes_key *key1,
-			const struct p8_aes_key *key2, u8 *iv);
-void aes_p8_xts_decrypt(const u8 *in, u8 *out, size_t len,
-			const struct p8_aes_key *key1,
-			const struct p8_aes_key *key2, u8 *iv);
diff --git a/arch/powerpc/crypto/vmx.c b/arch/powerpc/crypto/vmx.c
index 0b725e826388..7d2beb774f99 100644
--- a/arch/powerpc/crypto/vmx.c
+++ b/arch/powerpc/crypto/vmx.c
@@ -25,17 +25,13 @@ static int __init p8_init(void)
 
 	ret = crypto_register_shash(&p8_ghash_alg);
 	if (ret)
 		goto err;
 
-	ret = crypto_register_alg(&p8_aes_alg);
-	if (ret)
-		goto err_unregister_ghash;
-
 	ret = crypto_register_skcipher(&p8_aes_cbc_alg);
 	if (ret)
-		goto err_unregister_aes;
+		goto err_unregister_ghash;
 
 	ret = crypto_register_skcipher(&p8_aes_ctr_alg);
 	if (ret)
 		goto err_unregister_aes_cbc;
 
@@ -47,12 +43,10 @@ static int __init p8_init(void)
 
 err_unregister_aes_ctr:
 	crypto_unregister_skcipher(&p8_aes_ctr_alg);
 err_unregister_aes_cbc:
 	crypto_unregister_skcipher(&p8_aes_cbc_alg);
-err_unregister_aes:
-	crypto_unregister_alg(&p8_aes_alg);
 err_unregister_ghash:
 	crypto_unregister_shash(&p8_ghash_alg);
 err:
 	return ret;
 }
@@ -60,11 +54,10 @@ static int __init p8_init(void)
 static void __exit p8_exit(void)
 {
 	crypto_unregister_skcipher(&p8_aes_xts_alg);
 	crypto_unregister_skcipher(&p8_aes_ctr_alg);
 	crypto_unregister_skcipher(&p8_aes_cbc_alg);
-	crypto_unregister_alg(&p8_aes_alg);
 	crypto_unregister_shash(&p8_ghash_alg);
 }
 
 module_cpu_feature_match(PPC_MODULE_FEATURE_VEC_CRYPTO, p8_init);
 module_exit(p8_exit);
@@ -72,6 +65,5 @@ module_exit(p8_exit);
 MODULE_AUTHOR("Marcelo Cerri<mhcerri@br.ibm.com>");
 MODULE_DESCRIPTION("IBM VMX cryptographic acceleration instructions "
 		   "support on Power 8");
 MODULE_LICENSE("GPL");
 MODULE_VERSION("1.0.0");
-MODULE_IMPORT_NS("CRYPTO_INTERNAL");
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 49ce2d1e086e..e6082b7c6443 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -16,26 +16,51 @@
 #define AES_KEYSIZE_256		32
 #define AES_BLOCK_SIZE		16
 #define AES_MAX_KEYLENGTH	(15 * 16)
 #define AES_MAX_KEYLENGTH_U32	(AES_MAX_KEYLENGTH / sizeof(u32))
 
+/*
+ * The POWER8 VSX optimized AES assembly code is borrowed from OpenSSL and
+ * inherits OpenSSL's AES_KEY format, which stores the number of rounds after
+ * the round keys.  That assembly code is difficult to change.  So for
+ * compatibility purposes we reserve space for the extra nrounds field on PPC64.
+ *
+ * Note: when prepared for decryption, the round keys are just the reversed
+ * standard round keys, not the round keys for the Equivalent Inverse Cipher.
+ */
+struct p8_aes_key {
+	u32 rndkeys[AES_MAX_KEYLENGTH_U32];
+	int nrounds;
+};
+
 union aes_enckey_arch {
 	u32 rndkeys[AES_MAX_KEYLENGTH_U32];
 #ifdef CONFIG_CRYPTO_LIB_AES_ARCH
 #if defined(CONFIG_PPC) && defined(CONFIG_SPE)
 	/* Used unconditionally (when SPE AES code is enabled in kconfig) */
 	u32 spe_enc_key[AES_MAX_KEYLENGTH_U32] __aligned(8);
+#elif defined(CONFIG_PPC)
+	/*
+	 * Kernels that include the POWER8 VSX optimized AES code use this field
+	 * when that code is usable at key preparation time.  Otherwise they
+	 * fall back to rndkeys.  In the latter case, p8.nrounds (which doesn't
+	 * overlap rndkeys) is set to 0 to differentiate the two formats.
+	 */
+	struct p8_aes_key p8;
 #endif
 #endif /* CONFIG_CRYPTO_LIB_AES_ARCH */
 };
 
 union aes_invkey_arch {
 	u32 inv_rndkeys[AES_MAX_KEYLENGTH_U32];
 #ifdef CONFIG_CRYPTO_LIB_AES_ARCH
 #if defined(CONFIG_PPC) && defined(CONFIG_SPE)
 	/* Used unconditionally (when SPE AES code is enabled in kconfig) */
 	u32 spe_dec_key[AES_MAX_KEYLENGTH_U32] __aligned(8);
+#elif defined(CONFIG_PPC)
+	/* Used conditionally, analogous to aes_enckey_arch::p8 */
+	struct p8_aes_key p8;
 #endif
 #endif /* CONFIG_CRYPTO_LIB_AES_ARCH */
 };
 
 /**
@@ -153,10 +178,26 @@ void ppc_crypt_ctr(u8 *out, const u8 *in, u32 *key_enc, u32 rounds, u32 bytes,
 		   u8 *iv);
 void ppc_encrypt_xts(u8 *out, const u8 *in, u32 *key_enc, u32 rounds, u32 bytes,
 		     u8 *iv, u32 *key_twk);
 void ppc_decrypt_xts(u8 *out, const u8 *in, u32 *key_dec, u32 rounds, u32 bytes,
 		     u8 *iv, u32 *key_twk);
+int aes_p8_set_encrypt_key(const u8 *userKey, const int bits,
+			   struct p8_aes_key *key);
+int aes_p8_set_decrypt_key(const u8 *userKey, const int bits,
+			   struct p8_aes_key *key);
+void aes_p8_encrypt(const u8 *in, u8 *out, const struct p8_aes_key *key);
+void aes_p8_decrypt(const u8 *in, u8 *out, const struct p8_aes_key *key);
+void aes_p8_cbc_encrypt(const u8 *in, u8 *out, size_t len,
+			const struct p8_aes_key *key, u8 *iv, const int enc);
+void aes_p8_ctr32_encrypt_blocks(const u8 *in, u8 *out, size_t len,
+				 const struct p8_aes_key *key, const u8 *iv);
+void aes_p8_xts_encrypt(const u8 *in, u8 *out, size_t len,
+			const struct p8_aes_key *key1,
+			const struct p8_aes_key *key2, u8 *iv);
+void aes_p8_xts_decrypt(const u8 *in, u8 *out, size_t len,
+			const struct p8_aes_key *key1,
+			const struct p8_aes_key *key2, u8 *iv);
 #endif
 
 /**
  * aes_preparekey() - Prepare an AES key for encryption and decryption
  * @key: (output) The key structure to initialize
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 50057f534aec..ddd3fe826b81 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -14,11 +14,11 @@ config CRYPTO_LIB_AES
 config CRYPTO_LIB_AES_ARCH
 	bool
 	depends on CRYPTO_LIB_AES && !UML && !KMSAN
 	default y if ARM
 	default y if ARM64
-	default y if PPC && SPE
+	default y if PPC && (SPE || (PPC64 && VSX))
 
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index d68fde004104..16140616ace8 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -33,11 +33,23 @@ ifeq ($(CONFIG_PPC),y)
 ifeq ($(CONFIG_SPE),y)
 libaes-y += powerpc/aes-spe-core.o \
 	    powerpc/aes-spe-keys.o \
 	    powerpc/aes-spe-modes.o \
 	    powerpc/aes-tab-4k.o
-endif
+else
+libaes-y += powerpc/aesp8-ppc.o
+aes-perlasm-flavour-y := linux-ppc64
+aes-perlasm-flavour-$(CONFIG_PPC64_ELF_ABI_V2) := linux-ppc64-elfv2
+aes-perlasm-flavour-$(CONFIG_CPU_LITTLE_ENDIAN) := linux-ppc64le
+quiet_cmd_perlasm_aes = PERLASM $@
+      cmd_perlasm_aes = $(PERL) $< $(aes-perlasm-flavour-y) $@
+# Use if_changed instead of cmd, in case the flavour changed.
+$(obj)/powerpc/aesp8-ppc.S: $(src)/powerpc/aesp8-ppc.pl FORCE
+	$(call if_changed,perlasm_aes)
+targets += powerpc/aesp8-ppc.S
+OBJECT_FILES_NON_STANDARD_powerpc/aesp8-ppc.o := y
+endif # !CONFIG_SPE
 endif # CONFIG_PPC
 
 endif # CONFIG_CRYPTO_LIB_AES_ARCH
 
 ################################################################################
diff --git a/lib/crypto/powerpc/.gitignore b/lib/crypto/powerpc/.gitignore
new file mode 100644
index 000000000000..598ca7aff6b1
--- /dev/null
+++ b/lib/crypto/powerpc/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+aesp8-ppc.S
diff --git a/lib/crypto/powerpc/aes.h b/lib/crypto/powerpc/aes.h
index cf22020f9050..42e0a993c619 100644
--- a/lib/crypto/powerpc/aes.h
+++ b/lib/crypto/powerpc/aes.h
@@ -1,17 +1,20 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
  * Copyright (c) 2015 Markus Stockhausen <stockhausen@collogia.de>
+ * Copyright (C) 2015 International Business Machines Inc.
  * Copyright 2026 Google LLC
  */
 #include <asm/simd.h>
 #include <asm/switch_to.h>
 #include <linux/cpufeature.h>
 #include <linux/jump_label.h>
 #include <linux/preempt.h>
 #include <linux/uaccess.h>
 
+#ifdef CONFIG_SPE
+
 EXPORT_SYMBOL_GPL(ppc_expand_key_128);
 EXPORT_SYMBOL_GPL(ppc_expand_key_192);
 EXPORT_SYMBOL_GPL(ppc_expand_key_256);
 EXPORT_SYMBOL_GPL(ppc_generate_decrypt_key);
 EXPORT_SYMBOL_GPL(ppc_encrypt_ecb);
@@ -70,5 +73,166 @@ static void aes_decrypt_arch(const struct aes_key *key,
 {
 	spe_begin();
 	ppc_decrypt_aes(out, in, key->inv_k.spe_dec_key, key->nrounds / 2 - 1);
 	spe_end();
 }
+
+#else /* CONFIG_SPE */
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_vec_crypto);
+
+EXPORT_SYMBOL_GPL(aes_p8_set_encrypt_key);
+EXPORT_SYMBOL_GPL(aes_p8_set_decrypt_key);
+EXPORT_SYMBOL_GPL(aes_p8_encrypt);
+EXPORT_SYMBOL_GPL(aes_p8_decrypt);
+EXPORT_SYMBOL_GPL(aes_p8_cbc_encrypt);
+EXPORT_SYMBOL_GPL(aes_p8_ctr32_encrypt_blocks);
+EXPORT_SYMBOL_GPL(aes_p8_xts_encrypt);
+EXPORT_SYMBOL_GPL(aes_p8_xts_decrypt);
+
+static inline bool is_vsx_format(const struct p8_aes_key *key)
+{
+	return key->nrounds != 0;
+}
+
+/*
+ * Convert a round key from VSX to generic format by reflecting the 16 bytes,
+ * and (if apply_inv_mix=true) applying InvMixColumn to each column.
+ *
+ * It would be nice if the VSX and generic key formats were compatible.  But
+ * that's very difficult to do, since the assembly code was borrowed from
+ * OpenSSL and targets POWER8 rather than POWER9.
+ *
+ * Fortunately, this conversion should only be needed in extremely rare cases,
+ * possibly not at all in practice.  It's just included for full correctness.
+ */
+static void rndkey_from_vsx(u32 out[4], const u32 in[4], bool apply_inv_mix)
+{
+	u32 k0 = swab32(in[0]);
+	u32 k1 = swab32(in[1]);
+	u32 k2 = swab32(in[2]);
+	u32 k3 = swab32(in[3]);
+
+	if (apply_inv_mix) {
+		k0 = inv_mix_columns(k0);
+		k1 = inv_mix_columns(k1);
+		k2 = inv_mix_columns(k2);
+		k3 = inv_mix_columns(k3);
+	}
+	out[0] = k3;
+	out[1] = k2;
+	out[2] = k1;
+	out[3] = k0;
+}
+
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	const int keybits = 8 * key_len;
+	int ret;
+
+	if (static_branch_likely(&have_vec_crypto) && likely(may_use_simd())) {
+		preempt_disable();
+		pagefault_disable();
+		enable_kernel_vsx();
+		ret = aes_p8_set_encrypt_key(in_key, keybits, &k->p8);
+		/*
+		 * aes_p8_set_encrypt_key() should never fail here, since the
+		 * key length was already validated.
+		 */
+		WARN_ON_ONCE(ret);
+		if (inv_k) {
+			ret = aes_p8_set_decrypt_key(in_key, keybits,
+						     &inv_k->p8);
+			/* ... and likewise for aes_p8_set_decrypt_key(). */
+			WARN_ON_ONCE(ret);
+		}
+		disable_kernel_vsx();
+		pagefault_enable();
+		preempt_enable();
+	} else {
+		aes_expandkey_generic(k->rndkeys,
+				      inv_k ? inv_k->inv_rndkeys : NULL,
+				      in_key, key_len);
+		/* Mark the key as using the generic format. */
+		k->p8.nrounds = 0;
+		if (inv_k)
+			inv_k->p8.nrounds = 0;
+	}
+}
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (static_branch_likely(&have_vec_crypto) &&
+	    likely(is_vsx_format(&key->k.p8) && may_use_simd())) {
+		preempt_disable();
+		pagefault_disable();
+		enable_kernel_vsx();
+		aes_p8_encrypt(in, out, &key->k.p8);
+		disable_kernel_vsx();
+		pagefault_enable();
+		preempt_enable();
+	} else if (unlikely(is_vsx_format(&key->k.p8))) {
+		/*
+		 * This handles the (hopefully extremely rare) case where a key
+		 * was prepared using the VSX optimized format, then encryption
+		 * is done in a context that cannot use VSX instructions.
+		 */
+		u32 rndkeys[AES_MAX_KEYLENGTH_U32];
+
+		for (int i = 0; i < 4 * (key->nrounds + 1); i += 4)
+			rndkey_from_vsx(&rndkeys[i],
+					&key->k.p8.rndkeys[i], false);
+		aes_encrypt_generic(rndkeys, key->nrounds, out, in);
+	} else {
+		aes_encrypt_generic(key->k.rndkeys, key->nrounds, out, in);
+	}
+}
+
+static void aes_decrypt_arch(const struct aes_key *key, u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (static_branch_likely(&have_vec_crypto) &&
+	    likely(is_vsx_format(&key->inv_k.p8) && may_use_simd())) {
+		preempt_disable();
+		pagefault_disable();
+		enable_kernel_vsx();
+		aes_p8_decrypt(in, out, &key->inv_k.p8);
+		disable_kernel_vsx();
+		pagefault_enable();
+		preempt_enable();
+	} else if (unlikely(is_vsx_format(&key->inv_k.p8))) {
+		/*
+		 * This handles the (hopefully extremely rare) case where a key
+		 * was prepared using the VSX optimized format, then decryption
+		 * is done in a context that cannot use VSX instructions.
+		 */
+		u32 inv_rndkeys[AES_MAX_KEYLENGTH_U32];
+		int i;
+
+		rndkey_from_vsx(&inv_rndkeys[0],
+				&key->inv_k.p8.rndkeys[0], false);
+		for (i = 4; i < 4 * key->nrounds; i += 4) {
+			rndkey_from_vsx(&inv_rndkeys[i],
+					&key->inv_k.p8.rndkeys[i], true);
+		}
+		rndkey_from_vsx(&inv_rndkeys[i],
+				&key->inv_k.p8.rndkeys[i], false);
+		aes_decrypt_generic(inv_rndkeys, key->nrounds, out, in);
+	} else {
+		aes_decrypt_generic(key->inv_k.inv_rndkeys, key->nrounds,
+				    out, in);
+	}
+}
+
+#define aes_mod_init_arch aes_mod_init_arch
+static void aes_mod_init_arch(void)
+{
+	if (cpu_has_feature(CPU_FTR_ARCH_207S) &&
+	    (cur_cpu_spec->cpu_user_features2 & PPC_FEATURE2_VEC_CRYPTO))
+		static_branch_enable(&have_vec_crypto);
+}
+
+#endif /* !CONFIG_SPE */
diff --git a/arch/powerpc/crypto/aesp8-ppc.pl b/lib/crypto/powerpc/aesp8-ppc.pl
similarity index 99%
rename from arch/powerpc/crypto/aesp8-ppc.pl
rename to lib/crypto/powerpc/aesp8-ppc.pl
index f729589d792e..253a06758057 100644
--- a/arch/powerpc/crypto/aesp8-ppc.pl
+++ b/lib/crypto/powerpc/aesp8-ppc.pl
@@ -103,10 +103,11 @@ if ($flavour =~ /64/) {
 $LITTLE_ENDIAN = ($flavour=~/le$/) ? $SIZE_T : 0;
 
 $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
 ( $xlate="${dir}ppc-xlate.pl" and -f $xlate ) or
 ( $xlate="${dir}../../perlasm/ppc-xlate.pl" and -f $xlate) or
+( $xlate="${dir}../../../arch/powerpc/crypto/ppc-xlate.pl" and -f $xlate) or
 die "can't locate ppc-xlate.pl";
 
 open STDOUT,"| $^X $xlate $flavour ".shift || die "can't call $xlate: $!";
 
 $FRAME=8*$SIZE_T;
-- 
2.52.0




* [PATCH 14/36] lib/crypto: riscv/aes: Migrate optimized code into library
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (12 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 13/36] lib/crypto: powerpc/aes: Migrate POWER8 " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 15/36] lib/crypto: s390/aes: " Eric Biggers
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Move the aes_encrypt_zvkned() and aes_decrypt_zvkned() assembly
functions into lib/crypto/, wire them up to the AES library API, and
remove the "aes-riscv64-zvkned" crypto_cipher algorithm.

To make this possible, change the prototypes of these functions to
take (rndkeys, key_len) instead of a pointer to crypto_aes_ctx, and
change the RISC-V AES-XTS code to implement tweak encryption using the
AES library instead of directly calling aes_encrypt_zvkned().
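
In sketch form, the prototype change is (the hunks below are
authoritative; note that the out/in argument order changes as well):

    /* Before: tied to the crypto API's context layout */
    asmlinkage void aes_encrypt_zvkned(const struct crypto_aes_ctx *key,
                                       const u8 in[AES_BLOCK_SIZE],
                                       u8 out[AES_BLOCK_SIZE]);

    /* After: only the round keys and the key length in bytes */
    void aes_encrypt_zvkned(const u32 rndkeys[], int key_len,
                            u8 out[AES_BLOCK_SIZE],
                            const u8 in[AES_BLOCK_SIZE]);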

The result is that both the AES library and crypto_cipher APIs use
RISC-V's AES instructions, whereas previously only crypto_cipher did
(and it wasn't enabled by default, which this commit fixes as well).
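
The tweak-encryption change mentioned above, in sketch form (taken
from the riscv64_aes_xts_crypt() hunk below):

    /* Before: direct call into the Zvkned assembly */
    kernel_vector_begin();
    aes_encrypt_zvkned(&ctx->ctx2, req->iv, req->iv);
    kernel_vector_end();

    /*
     * After: a single library call, which dispatches to Zvkned when
     * the vector unit is usable and to the generic code otherwise.
     */
    aes_encrypt_new(&ctx->tweak_key, req->iv, req->iv);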

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/riscv/crypto/Kconfig              |  2 -
 arch/riscv/crypto/aes-macros.S         | 12 +++-
 arch/riscv/crypto/aes-riscv64-glue.c   | 78 ++----------------------
 arch/riscv/crypto/aes-riscv64-zvkned.S | 27 ---------
 lib/crypto/Kconfig                     |  2 +
 lib/crypto/Makefile                    |  1 +
 lib/crypto/riscv/aes-riscv64-zvkned.S  | 84 ++++++++++++++++++++++++++
 lib/crypto/riscv/aes.h                 | 63 +++++++++++++++++++
 8 files changed, 165 insertions(+), 104 deletions(-)
 create mode 100644 lib/crypto/riscv/aes-riscv64-zvkned.S
 create mode 100644 lib/crypto/riscv/aes.h

diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig
index 14c5acb935e9..22d4eaab15f3 100644
--- a/arch/riscv/crypto/Kconfig
+++ b/arch/riscv/crypto/Kconfig
@@ -4,15 +4,13 @@ menu "Accelerated Cryptographic Algorithms for CPU (riscv)"
 
 config CRYPTO_AES_RISCV64
 	tristate "Ciphers: AES, modes: ECB, CBC, CTS, CTR, XTS"
 	depends on 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \
 		   RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
-	select CRYPTO_ALGAPI
 	select CRYPTO_LIB_AES
 	select CRYPTO_SKCIPHER
 	help
-	  Block cipher: AES cipher algorithms
 	  Length-preserving ciphers: AES with ECB, CBC, CTS, CTR, XTS
 
 	  Architecture: riscv64 using:
 	  - Zvkned vector crypto extension
 	  - Zvbb vector extension (XTS)
diff --git a/arch/riscv/crypto/aes-macros.S b/arch/riscv/crypto/aes-macros.S
index d1a258d04bc7..1384164621a5 100644
--- a/arch/riscv/crypto/aes-macros.S
+++ b/arch/riscv/crypto/aes-macros.S
@@ -49,12 +49,14 @@
 //   - If AES-128, loads round keys into v1-v11 and jumps to \label128.
 //   - If AES-192, loads round keys into v1-v13 and jumps to \label192.
 //   - If AES-256, loads round keys into v1-v15 and continues onwards.
 //
 // Also sets vl=4 and vtype=e32,m1,ta,ma.  Clobbers t0 and t1.
-.macro	aes_begin	keyp, label128, label192
+.macro	aes_begin	keyp, label128, label192, key_len
+.ifb \key_len
 	lwu		t0, 480(\keyp)	// t0 = key length in bytes
+.endif
 	li		t1, 24		// t1 = key length for AES-192
 	vsetivli	zero, 4, e32, m1, ta, ma
 	vle32.v		v1, (\keyp)
 	addi		\keyp, \keyp, 16
 	vle32.v		v2, (\keyp)
@@ -74,16 +76,24 @@
 	vle32.v		v9, (\keyp)
 	addi		\keyp, \keyp, 16
 	vle32.v		v10, (\keyp)
 	addi		\keyp, \keyp, 16
 	vle32.v		v11, (\keyp)
+.ifb \key_len
 	blt		t0, t1, \label128	// If AES-128, goto label128.
+.else
+	blt		\key_len, t1, \label128	// If AES-128, goto label128.
+.endif
 	addi		\keyp, \keyp, 16
 	vle32.v		v12, (\keyp)
 	addi		\keyp, \keyp, 16
 	vle32.v		v13, (\keyp)
+.ifb \key_len
 	beq		t0, t1, \label192	// If AES-192, goto label192.
+.else
+	beq		\key_len, t1, \label192	// If AES-192, goto label192.
+.endif
 	// Else, it's AES-256.
 	addi		\keyp, \keyp, 16
 	vle32.v		v14, (\keyp)
 	addi		\keyp, \keyp, 16
 	vle32.v		v15, (\keyp)
diff --git a/arch/riscv/crypto/aes-riscv64-glue.c b/arch/riscv/crypto/aes-riscv64-glue.c
index f814ee048555..e1b8b0d70666 100644
--- a/arch/riscv/crypto/aes-riscv64-glue.c
+++ b/arch/riscv/crypto/aes-riscv64-glue.c
@@ -13,25 +13,17 @@
  */
 
 #include <asm/simd.h>
 #include <asm/vector.h>
 #include <crypto/aes.h>
-#include <crypto/internal/cipher.h>
 #include <crypto/internal/simd.h>
 #include <crypto/internal/skcipher.h>
 #include <crypto/scatterwalk.h>
 #include <crypto/xts.h>
 #include <linux/linkage.h>
 #include <linux/module.h>
 
-asmlinkage void aes_encrypt_zvkned(const struct crypto_aes_ctx *key,
-				   const u8 in[AES_BLOCK_SIZE],
-				   u8 out[AES_BLOCK_SIZE]);
-asmlinkage void aes_decrypt_zvkned(const struct crypto_aes_ctx *key,
-				   const u8 in[AES_BLOCK_SIZE],
-				   u8 out[AES_BLOCK_SIZE]);
-
 asmlinkage void aes_ecb_encrypt_zvkned(const struct crypto_aes_ctx *key,
 				       const u8 *in, u8 *out, size_t len);
 asmlinkage void aes_ecb_decrypt_zvkned(const struct crypto_aes_ctx *key,
 				       const u8 *in, u8 *out, size_t len);
 
@@ -84,54 +76,18 @@ static int riscv64_aes_setkey(struct crypto_aes_ctx *ctx,
 	 *   struct crypto_aes_ctx and aes_expandkey() everywhere.
 	 */
 	return aes_expandkey(ctx, key, keylen);
 }
 
-static int riscv64_aes_setkey_cipher(struct crypto_tfm *tfm,
-				     const u8 *key, unsigned int keylen)
-{
-	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	return riscv64_aes_setkey(ctx, key, keylen);
-}
-
 static int riscv64_aes_setkey_skcipher(struct crypto_skcipher *tfm,
 				       const u8 *key, unsigned int keylen)
 {
 	struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
 
 	return riscv64_aes_setkey(ctx, key, keylen);
 }
 
-/* Bare AES, without a mode of operation */
-
-static void riscv64_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (crypto_simd_usable()) {
-		kernel_vector_begin();
-		aes_encrypt_zvkned(ctx, src, dst);
-		kernel_vector_end();
-	} else {
-		aes_encrypt(ctx, dst, src);
-	}
-}
-
-static void riscv64_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (crypto_simd_usable()) {
-		kernel_vector_begin();
-		aes_decrypt_zvkned(ctx, src, dst);
-		kernel_vector_end();
-	} else {
-		aes_decrypt(ctx, dst, src);
-	}
-}
-
 /* AES-ECB */
 
 static inline int riscv64_aes_ecb_crypt(struct skcipher_request *req, bool enc)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -336,21 +292,21 @@ static int riscv64_aes_ctr_crypt(struct skcipher_request *req)
 
 /* AES-XTS */
 
 struct riscv64_aes_xts_ctx {
 	struct crypto_aes_ctx ctx1;
-	struct crypto_aes_ctx ctx2;
+	struct aes_enckey tweak_key;
 };
 
 static int riscv64_aes_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
 				  unsigned int keylen)
 {
 	struct riscv64_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
 
 	return xts_verify_key(tfm, key, keylen) ?:
 	       riscv64_aes_setkey(&ctx->ctx1, key, keylen / 2) ?:
-	       riscv64_aes_setkey(&ctx->ctx2, key + keylen / 2, keylen / 2);
+	       aes_prepareenckey(&ctx->tweak_key, key + keylen / 2, keylen / 2);
 }
 
 static int riscv64_aes_xts_crypt(struct skcipher_request *req, bool enc)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -364,13 +320,11 @@ static int riscv64_aes_xts_crypt(struct skcipher_request *req, bool enc)
 
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;
 
 	/* Encrypt the IV with the tweak key to get the first tweak. */
-	kernel_vector_begin();
-	aes_encrypt_zvkned(&ctx->ctx2, req->iv, req->iv);
-	kernel_vector_end();
+	aes_encrypt_new(&ctx->tweak_key, req->iv, req->iv);
 
 	err = skcipher_walk_virt(&walk, req, false);
 
 	/*
 	 * If the message length isn't divisible by the AES block size and the
@@ -454,27 +408,10 @@ static int riscv64_aes_xts_decrypt(struct skcipher_request *req)
 	return riscv64_aes_xts_crypt(req, false);
 }
 
 /* Algorithm definitions */
 
-static struct crypto_alg riscv64_zvkned_aes_cipher_alg = {
-	.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize = AES_BLOCK_SIZE,
-	.cra_ctxsize = sizeof(struct crypto_aes_ctx),
-	.cra_priority = 300,
-	.cra_name = "aes",
-	.cra_driver_name = "aes-riscv64-zvkned",
-	.cra_cipher = {
-		.cia_min_keysize = AES_MIN_KEY_SIZE,
-		.cia_max_keysize = AES_MAX_KEY_SIZE,
-		.cia_setkey = riscv64_aes_setkey_cipher,
-		.cia_encrypt = riscv64_aes_encrypt,
-		.cia_decrypt = riscv64_aes_decrypt,
-	},
-	.cra_module = THIS_MODULE,
-};
-
 static struct skcipher_alg riscv64_zvkned_aes_skcipher_algs[] = {
 	{
 		.setkey = riscv64_aes_setkey_skcipher,
 		.encrypt = riscv64_aes_ecb_encrypt,
 		.decrypt = riscv64_aes_ecb_decrypt,
@@ -572,19 +509,15 @@ static int __init riscv64_aes_mod_init(void)
 {
 	int err = -ENODEV;
 
 	if (riscv_isa_extension_available(NULL, ZVKNED) &&
 	    riscv_vector_vlen() >= 128) {
-		err = crypto_register_alg(&riscv64_zvkned_aes_cipher_alg);
-		if (err)
-			return err;
-
 		err = crypto_register_skciphers(
 			riscv64_zvkned_aes_skcipher_algs,
 			ARRAY_SIZE(riscv64_zvkned_aes_skcipher_algs));
 		if (err)
-			goto unregister_zvkned_cipher_alg;
+			return err;
 
 		if (riscv_isa_extension_available(NULL, ZVKB)) {
 			err = crypto_register_skcipher(
 				&riscv64_zvkned_zvkb_aes_skcipher_alg);
 			if (err)
@@ -605,12 +538,10 @@ static int __init riscv64_aes_mod_init(void)
 	if (riscv_isa_extension_available(NULL, ZVKB))
 		crypto_unregister_skcipher(&riscv64_zvkned_zvkb_aes_skcipher_alg);
 unregister_zvkned_skcipher_algs:
 	crypto_unregister_skciphers(riscv64_zvkned_aes_skcipher_algs,
 				    ARRAY_SIZE(riscv64_zvkned_aes_skcipher_algs));
-unregister_zvkned_cipher_alg:
-	crypto_unregister_alg(&riscv64_zvkned_aes_cipher_alg);
 	return err;
 }
 
 static void __exit riscv64_aes_mod_exit(void)
 {
@@ -618,11 +549,10 @@ static void __exit riscv64_aes_mod_exit(void)
 		crypto_unregister_skcipher(&riscv64_zvkned_zvbb_zvkg_aes_skcipher_alg);
 	if (riscv_isa_extension_available(NULL, ZVKB))
 		crypto_unregister_skcipher(&riscv64_zvkned_zvkb_aes_skcipher_alg);
 	crypto_unregister_skciphers(riscv64_zvkned_aes_skcipher_algs,
 				    ARRAY_SIZE(riscv64_zvkned_aes_skcipher_algs));
-	crypto_unregister_alg(&riscv64_zvkned_aes_cipher_alg);
 }
 
 module_init(riscv64_aes_mod_init);
 module_exit(riscv64_aes_mod_exit);
 
diff --git a/arch/riscv/crypto/aes-riscv64-zvkned.S b/arch/riscv/crypto/aes-riscv64-zvkned.S
index 23d063f94ce6..d0fc4581a380 100644
--- a/arch/riscv/crypto/aes-riscv64-zvkned.S
+++ b/arch/riscv/crypto/aes-riscv64-zvkned.S
@@ -54,37 +54,10 @@
 #define INP		a1
 #define OUTP		a2
 #define LEN		a3
 #define IVP		a4
 
-.macro	__aes_crypt_zvkned	enc, keylen
-	vle32.v		v16, (INP)
-	aes_crypt	v16, \enc, \keylen
-	vse32.v		v16, (OUTP)
-	ret
-.endm
-
-.macro	aes_crypt_zvkned	enc
-	aes_begin	KEYP, 128f, 192f
-	__aes_crypt_zvkned	\enc, 256
-128:
-	__aes_crypt_zvkned	\enc, 128
-192:
-	__aes_crypt_zvkned	\enc, 192
-.endm
-
-// void aes_encrypt_zvkned(const struct crypto_aes_ctx *key,
-//			   const u8 in[16], u8 out[16]);
-SYM_FUNC_START(aes_encrypt_zvkned)
-	aes_crypt_zvkned	1
-SYM_FUNC_END(aes_encrypt_zvkned)
-
-// Same prototype and calling convention as the encryption function
-SYM_FUNC_START(aes_decrypt_zvkned)
-	aes_crypt_zvkned	0
-SYM_FUNC_END(aes_decrypt_zvkned)
-
 .macro	__aes_ecb_crypt	enc, keylen
 	srli		t0, LEN, 2
 	// t0 is the remaining length in 32-bit words.  It's a multiple of 4.
 1:
 	vsetvli		t1, t0, e32, m8, ta, ma
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index ddd3fe826b81..a8c0b02a4fb0 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -15,10 +15,12 @@ config CRYPTO_LIB_AES_ARCH
 	bool
 	depends on CRYPTO_LIB_AES && !UML && !KMSAN
 	default y if ARM
 	default y if ARM64
 	default y if PPC && (SPE || (PPC64 && VSX))
+	default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \
+		     RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
 
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 16140616ace8..811b60787dd5 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -48,10 +48,11 @@ $(obj)/powerpc/aesp8-ppc.S: $(src)/powerpc/aesp8-ppc.pl FORCE
 targets += powerpc/aesp8-ppc.S
 OBJECT_FILES_NON_STANDARD_powerpc/aesp8-ppc.o := y
 endif # !CONFIG_SPE
 endif # CONFIG_PPC
 
+libaes-$(CONFIG_RISCV) += riscv/aes-riscv64-zvkned.o
 endif # CONFIG_CRYPTO_LIB_AES_ARCH
 
 ################################################################################
 
 obj-$(CONFIG_CRYPTO_LIB_AESCFB)			+= libaescfb.o
diff --git a/lib/crypto/riscv/aes-riscv64-zvkned.S b/lib/crypto/riscv/aes-riscv64-zvkned.S
new file mode 100644
index 000000000000..0d988bc3d37b
--- /dev/null
+++ b/lib/crypto/riscv/aes-riscv64-zvkned.S
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause */
+//
+// This file is dual-licensed, meaning that you can use it under your
+// choice of either of the following two licenses:
+//
+// Copyright 2023 The OpenSSL Project Authors. All Rights Reserved.
+//
+// Licensed under the Apache License 2.0 (the "License"). You can obtain
+// a copy in the file LICENSE in the source distribution or at
+// https://www.openssl.org/source/license.html
+//
+// or
+//
+// Copyright (c) 2023, Christoph Müllner <christoph.muellner@vrull.eu>
+// Copyright (c) 2023, Phoebe Chen <phoebe.chen@sifive.com>
+// Copyright (c) 2023, Jerry Shih <jerry.shih@sifive.com>
+// Copyright 2024 Google LLC
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions
+// are met:
+// 1. Redistributions of source code must retain the above copyright
+//    notice, this list of conditions and the following disclaimer.
+// 2. Redistributions in binary form must reproduce the above copyright
+//    notice, this list of conditions and the following disclaimer in the
+//    documentation and/or other materials provided with the distribution.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+// The generated code of this file depends on the following RISC-V extensions:
+// - RV64I
+// - RISC-V Vector ('V') with VLEN >= 128
+// - RISC-V Vector AES block cipher extension ('Zvkned')
+
+#include <linux/linkage.h>
+
+.text
+.option arch, +zvkned
+
+#include "../../arch/riscv/crypto/aes-macros.S"
+
+#define RNDKEYS		a0
+#define KEY_LEN		a1
+#define OUTP		a2
+#define INP		a3
+
+.macro	__aes_crypt_zvkned	enc, keybits
+	vle32.v		v16, (INP)
+	aes_crypt	v16, \enc, \keybits
+	vse32.v		v16, (OUTP)
+	ret
+.endm
+
+.macro	aes_crypt_zvkned	enc
+	aes_begin	RNDKEYS, 128f, 192f, KEY_LEN
+	__aes_crypt_zvkned	\enc, 256
+128:
+	__aes_crypt_zvkned	\enc, 128
+192:
+	__aes_crypt_zvkned	\enc, 192
+.endm
+
+// void aes_encrypt_zvkned(const u32 rndkeys[], int key_len,
+//			   u8 out[AES_BLOCK_SIZE], const u8 in[AES_BLOCK_SIZE]);
+SYM_FUNC_START(aes_encrypt_zvkned)
+	aes_crypt_zvkned	1
+SYM_FUNC_END(aes_encrypt_zvkned)
+
+// void aes_decrypt_zvkned(const u32 rndkeys[], int key_len,
+//			   u8 out[AES_BLOCK_SIZE], const u8 in[AES_BLOCK_SIZE]);
+SYM_FUNC_START(aes_decrypt_zvkned)
+	aes_crypt_zvkned	0
+SYM_FUNC_END(aes_decrypt_zvkned)
diff --git a/lib/crypto/riscv/aes.h b/lib/crypto/riscv/aes.h
new file mode 100644
index 000000000000..0b26f58faf2b
--- /dev/null
+++ b/lib/crypto/riscv/aes.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 VRULL GmbH
+ * Copyright (C) 2023 SiFive, Inc.
+ * Copyright 2024 Google LLC
+ */
+
+#include <asm/simd.h>
+#include <asm/vector.h>
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_zvkned);
+
+void aes_encrypt_zvkned(const u32 rndkeys[], int key_len,
+			u8 out[AES_BLOCK_SIZE], const u8 in[AES_BLOCK_SIZE]);
+void aes_decrypt_zvkned(const u32 rndkeys[], int key_len,
+			u8 out[AES_BLOCK_SIZE], const u8 in[AES_BLOCK_SIZE]);
+
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	aes_expandkey_generic(k->rndkeys, inv_k ? inv_k->inv_rndkeys : NULL,
+			      in_key, key_len);
+}
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (static_branch_likely(&have_zvkned) && likely(may_use_simd())) {
+		kernel_vector_begin();
+		aes_encrypt_zvkned(key->k.rndkeys, key->len, out, in);
+		kernel_vector_end();
+	} else {
+		aes_encrypt_generic(key->k.rndkeys, key->nrounds, out, in);
+	}
+}
+
+static void aes_decrypt_arch(const struct aes_key *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	/*
+	 * Note that the Zvkned code uses the standard round keys, while the
+	 * fallback uses the inverse round keys.  Thus both must be present.
+	 */
+	if (static_branch_likely(&have_zvkned) && likely(may_use_simd())) {
+		kernel_vector_begin();
+		aes_decrypt_zvkned(key->k.rndkeys, key->len, out, in);
+		kernel_vector_end();
+	} else {
+		aes_decrypt_generic(key->inv_k.inv_rndkeys, key->nrounds,
+				    out, in);
+	}
+}
+
+#define aes_mod_init_arch aes_mod_init_arch
+static void aes_mod_init_arch(void)
+{
+	if (riscv_isa_extension_available(NULL, ZVKNED) &&
+	    riscv_vector_vlen() >= 128)
+		static_branch_enable(&have_zvkned);
+}
-- 
2.52.0




* [PATCH 15/36] lib/crypto: s390/aes: Migrate optimized code into library
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (13 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 14/36] lib/crypto: riscv/aes: Migrate " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-07  7:41   ` Holger Dengler
  2026-01-05  5:12 ` [PATCH 16/36] lib/crypto: sparc/aes: " Eric Biggers
                   ` (20 subsequent siblings)
  35 siblings, 1 reply; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Implement aes_preparekey_arch(), aes_encrypt_arch(), and
aes_decrypt_arch() using the CPACF AES instructions.

Then, remove the superseded "aes-s390" crypto_cipher.

The result is that both the AES library and crypto_cipher APIs use the
CPACF AES instructions, whereas previously only crypto_cipher did (and
it wasn't enabled by default, which this commit fixes as well).

Note that this preserves the optimization where the AES key is stored in
raw form rather than expanded form.  CPACF just takes the raw key.
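
In sketch form, each block operation reduces to one CPACF call on the
raw key (see aes_crypt_s390() in the new lib/crypto/s390/aes.h below;
'decrypt' is either 0 or CPACF_DECRYPT):

    cpacf_km(CPACF_KM_AES_128 | decrypt,
             (void *)key->k.raw_key, out, in, AES_BLOCK_SIZE);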

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/s390/crypto/Kconfig    |   2 -
 arch/s390/crypto/aes_s390.c | 113 ------------------------------------
 include/crypto/aes.h        |   3 +
 lib/crypto/Kconfig          |   1 +
 lib/crypto/s390/aes.h       | 106 +++++++++++++++++++++++++++++++++
 5 files changed, 110 insertions(+), 115 deletions(-)
 create mode 100644 lib/crypto/s390/aes.h

diff --git a/arch/s390/crypto/Kconfig b/arch/s390/crypto/Kconfig
index f838ca055f6d..79a2d0034258 100644
--- a/arch/s390/crypto/Kconfig
+++ b/arch/s390/crypto/Kconfig
@@ -12,14 +12,12 @@ config CRYPTO_GHASH_S390
 
 	  It is available as of z196.
 
 config CRYPTO_AES_S390
 	tristate "Ciphers: AES, modes: ECB, CBC, CTR, XTS, GCM"
-	select CRYPTO_ALGAPI
 	select CRYPTO_SKCIPHER
 	help
-	  Block cipher: AES cipher algorithms (FIPS 197)
 	  AEAD cipher: AES with GCM
 	  Length-preserving ciphers: AES with ECB, CBC, XTS, and CTR modes
 
 	  Architecture: s390
 
diff --git a/arch/s390/crypto/aes_s390.c b/arch/s390/crypto/aes_s390.c
index d0a295435680..62edc66d5478 100644
--- a/arch/s390/crypto/aes_s390.c
+++ b/arch/s390/crypto/aes_s390.c
@@ -18,11 +18,10 @@
 
 #include <crypto/aes.h>
 #include <crypto/algapi.h>
 #include <crypto/ghash.h>
 #include <crypto/internal/aead.h>
-#include <crypto/internal/cipher.h>
 #include <crypto/internal/skcipher.h>
 #include <crypto/scatterwalk.h>
 #include <linux/err.h>
 #include <linux/module.h>
 #include <linux/cpufeature.h>
@@ -43,11 +42,10 @@ struct s390_aes_ctx {
 	u8 key[AES_MAX_KEY_SIZE];
 	int key_len;
 	unsigned long fc;
 	union {
 		struct crypto_skcipher *skcipher;
-		struct crypto_cipher *cip;
 	} fallback;
 };
 
 struct s390_xts_ctx {
 	union {
@@ -70,113 +68,10 @@ struct gcm_sg_walk {
 	unsigned int buf_bytes;
 	u8 *ptr;
 	unsigned int nbytes;
 };
 
-static int setkey_fallback_cip(struct crypto_tfm *tfm, const u8 *in_key,
-		unsigned int key_len)
-{
-	struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
-
-	sctx->fallback.cip->base.crt_flags &= ~CRYPTO_TFM_REQ_MASK;
-	sctx->fallback.cip->base.crt_flags |= (tfm->crt_flags &
-			CRYPTO_TFM_REQ_MASK);
-
-	return crypto_cipher_setkey(sctx->fallback.cip, in_key, key_len);
-}
-
-static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
-		       unsigned int key_len)
-{
-	struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
-	unsigned long fc;
-
-	/* Pick the correct function code based on the key length */
-	fc = (key_len == 16) ? CPACF_KM_AES_128 :
-	     (key_len == 24) ? CPACF_KM_AES_192 :
-	     (key_len == 32) ? CPACF_KM_AES_256 : 0;
-
-	/* Check if the function code is available */
-	sctx->fc = (fc && cpacf_test_func(&km_functions, fc)) ? fc : 0;
-	if (!sctx->fc)
-		return setkey_fallback_cip(tfm, in_key, key_len);
-
-	sctx->key_len = key_len;
-	memcpy(sctx->key, in_key, key_len);
-	return 0;
-}
-
-static void crypto_aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
-
-	if (unlikely(!sctx->fc)) {
-		crypto_cipher_encrypt_one(sctx->fallback.cip, out, in);
-		return;
-	}
-	cpacf_km(sctx->fc, &sctx->key, out, in, AES_BLOCK_SIZE);
-}
-
-static void crypto_aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
-{
-	struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
-
-	if (unlikely(!sctx->fc)) {
-		crypto_cipher_decrypt_one(sctx->fallback.cip, out, in);
-		return;
-	}
-	cpacf_km(sctx->fc | CPACF_DECRYPT,
-		 &sctx->key, out, in, AES_BLOCK_SIZE);
-}
-
-static int fallback_init_cip(struct crypto_tfm *tfm)
-{
-	const char *name = tfm->__crt_alg->cra_name;
-	struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
-
-	sctx->fallback.cip = crypto_alloc_cipher(name, 0,
-						 CRYPTO_ALG_NEED_FALLBACK);
-
-	if (IS_ERR(sctx->fallback.cip)) {
-		pr_err("Allocating AES fallback algorithm %s failed\n",
-		       name);
-		return PTR_ERR(sctx->fallback.cip);
-	}
-
-	return 0;
-}
-
-static void fallback_exit_cip(struct crypto_tfm *tfm)
-{
-	struct s390_aes_ctx *sctx = crypto_tfm_ctx(tfm);
-
-	crypto_free_cipher(sctx->fallback.cip);
-	sctx->fallback.cip = NULL;
-}
-
-static struct crypto_alg aes_alg = {
-	.cra_name		=	"aes",
-	.cra_driver_name	=	"aes-s390",
-	.cra_priority		=	300,
-	.cra_flags		=	CRYPTO_ALG_TYPE_CIPHER |
-					CRYPTO_ALG_NEED_FALLBACK,
-	.cra_blocksize		=	AES_BLOCK_SIZE,
-	.cra_ctxsize		=	sizeof(struct s390_aes_ctx),
-	.cra_module		=	THIS_MODULE,
-	.cra_init               =       fallback_init_cip,
-	.cra_exit               =       fallback_exit_cip,
-	.cra_u			=	{
-		.cipher = {
-			.cia_min_keysize	=	AES_MIN_KEY_SIZE,
-			.cia_max_keysize	=	AES_MAX_KEY_SIZE,
-			.cia_setkey		=	aes_set_key,
-			.cia_encrypt		=	crypto_aes_encrypt,
-			.cia_decrypt		=	crypto_aes_decrypt,
-		}
-	}
-};
-
 static int setkey_fallback_skcipher(struct crypto_skcipher *tfm, const u8 *key,
 				    unsigned int len)
 {
 	struct s390_aes_ctx *sctx = crypto_skcipher_ctx(tfm);
 
@@ -1047,11 +942,10 @@ static struct aead_alg gcm_aes_aead = {
 		.cra_driver_name	= "gcm-aes-s390",
 		.cra_module		= THIS_MODULE,
 	},
 };
 
-static struct crypto_alg *aes_s390_alg;
 static struct skcipher_alg *aes_s390_skcipher_algs[5];
 static int aes_s390_skciphers_num;
 static struct aead_alg *aes_s390_aead_alg;
 
 static int aes_s390_register_skcipher(struct skcipher_alg *alg)
@@ -1064,12 +958,10 @@ static int aes_s390_register_skcipher(struct skcipher_alg *alg)
 	return ret;
 }
 
 static void aes_s390_fini(void)
 {
-	if (aes_s390_alg)
-		crypto_unregister_alg(aes_s390_alg);
 	while (aes_s390_skciphers_num--)
 		crypto_unregister_skcipher(aes_s390_skcipher_algs[aes_s390_skciphers_num]);
 	if (ctrblk)
 		free_page((unsigned long) ctrblk);
 
@@ -1088,14 +980,10 @@ static int __init aes_s390_init(void)
 	cpacf_query(CPACF_KMA, &kma_functions);
 
 	if (cpacf_test_func(&km_functions, CPACF_KM_AES_128) ||
 	    cpacf_test_func(&km_functions, CPACF_KM_AES_192) ||
 	    cpacf_test_func(&km_functions, CPACF_KM_AES_256)) {
-		ret = crypto_register_alg(&aes_alg);
-		if (ret)
-			goto out_err;
-		aes_s390_alg = &aes_alg;
 		ret = aes_s390_register_skcipher(&ecb_aes_alg);
 		if (ret)
 			goto out_err;
 	}
 
@@ -1154,6 +1042,5 @@ module_exit(aes_s390_fini);
 
 MODULE_ALIAS_CRYPTO("aes-all");
 
 MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm");
 MODULE_LICENSE("GPL");
-MODULE_IMPORT_NS("CRYPTO_INTERNAL");
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index e6082b7c6443..b91eb49cbffc 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -44,10 +44,13 @@ union aes_enckey_arch {
 	 * when that code is usable at key preparation time.  Otherwise they
 	 * fall back to rndkeys.  In the latter case, p8.nrounds (which doesn't
 	 * overlap rndkeys) is set to 0 to differentiate the two formats.
 	 */
 	struct p8_aes_key p8;
+#elif defined(CONFIG_S390)
+	/* Used when the CPU supports CPACF AES for this key's length */
+	u8 raw_key[AES_MAX_KEY_SIZE];
 #endif
 #endif /* CONFIG_CRYPTO_LIB_AES_ARCH */
 };
 
 union aes_invkey_arch {
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index a8c0b02a4fb0..b672f0145793 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -17,10 +17,11 @@ config CRYPTO_LIB_AES_ARCH
 	default y if ARM
 	default y if ARM64
 	default y if PPC && (SPE || (PPC64 && VSX))
 	default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \
 		     RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
+	default y if S390
 
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/s390/aes.h b/lib/crypto/s390/aes.h
new file mode 100644
index 000000000000..5466f6ecbce7
--- /dev/null
+++ b/lib/crypto/s390/aes.h
@@ -0,0 +1,106 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * AES optimized using the CP Assist for Cryptographic Functions (CPACF)
+ *
+ * Copyright 2026 Google LLC
+ */
+#include <asm/cpacf.h>
+#include <linux/cpufeature.h>
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_cpacf_aes128);
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_cpacf_aes192);
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_cpacf_aes256);
+
+/*
+ * When the CPU supports CPACF AES for the requested key length, we need only
+ * save a copy of the raw AES key, as that's what the CPACF instructions need.
+ *
+ * When unsupported, fall back to the generic key expansion and en/decryption.
+ */
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	if (key_len == AES_KEYSIZE_128) {
+		if (static_branch_likely(&have_cpacf_aes128)) {
+			memcpy(k->raw_key, in_key, AES_KEYSIZE_128);
+			return;
+		}
+	} else if (key_len == AES_KEYSIZE_192) {
+		if (static_branch_likely(&have_cpacf_aes192)) {
+			memcpy(k->raw_key, in_key, AES_KEYSIZE_192);
+			return;
+		}
+	} else {
+		if (static_branch_likely(&have_cpacf_aes256)) {
+			memcpy(k->raw_key, in_key, AES_KEYSIZE_256);
+			return;
+		}
+	}
+	aes_expandkey_generic(k->rndkeys, inv_k ? inv_k->inv_rndkeys : NULL,
+			      in_key, key_len);
+}
+
+static inline bool aes_crypt_s390(const struct aes_enckey *key,
+				  u8 out[AES_BLOCK_SIZE],
+				  const u8 in[AES_BLOCK_SIZE], int decrypt)
+{
+	if (key->len == AES_KEYSIZE_128) {
+		if (static_branch_likely(&have_cpacf_aes128)) {
+			cpacf_km(CPACF_KM_AES_128 | decrypt,
+				 (void *)key->k.raw_key, out, in,
+				 AES_BLOCK_SIZE);
+			return true;
+		}
+	} else if (key->len == AES_KEYSIZE_192) {
+		if (static_branch_likely(&have_cpacf_aes192)) {
+			cpacf_km(CPACF_KM_AES_192 | decrypt,
+				 (void *)key->k.raw_key, out, in,
+				 AES_BLOCK_SIZE);
+			return true;
+		}
+	} else {
+		if (static_branch_likely(&have_cpacf_aes256)) {
+			cpacf_km(CPACF_KM_AES_256 | decrypt,
+				 (void *)key->k.raw_key, out, in,
+				 AES_BLOCK_SIZE);
+			return true;
+		}
+	}
+	return false;
+}
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (likely(aes_crypt_s390(key, out, in, 0)))
+		return;
+	aes_encrypt_generic(key->k.rndkeys, key->nrounds, out, in);
+}
+
+static void aes_decrypt_arch(const struct aes_key *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (likely(aes_crypt_s390((const struct aes_enckey *)key, out, in,
+				  CPACF_DECRYPT)))
+		return;
+	aes_decrypt_generic(key->inv_k.inv_rndkeys, key->nrounds, out, in);
+}
+
+#define aes_mod_init_arch aes_mod_init_arch
+static void aes_mod_init_arch(void)
+{
+	if (cpu_have_feature(S390_CPU_FEATURE_MSA)) {
+		cpacf_mask_t km_functions;
+
+		cpacf_query(CPACF_KM, &km_functions);
+		if (cpacf_test_func(&km_functions, CPACF_KM_AES_128))
+			static_branch_enable(&have_cpacf_aes128);
+		if (cpacf_test_func(&km_functions, CPACF_KM_AES_192))
+			static_branch_enable(&have_cpacf_aes192);
+		if (cpacf_test_func(&km_functions, CPACF_KM_AES_256))
+			static_branch_enable(&have_cpacf_aes256);
+	}
+}
-- 
2.52.0




* [PATCH 16/36] lib/crypto: sparc/aes: Migrate optimized code into library
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (14 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 15/36] lib/crypto: s390/aes: " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 17/36] lib/crypto: x86/aes: Add AES-NI optimization Eric Biggers
                   ` (19 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Move the SPARC64 AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the "aes-sparc64" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs use the
SPARC64 AES opcodes, whereas previously only crypto_cipher did (and it
wasn't enabled by default, which this commit fixes as well).
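
For example, a library user now reaches the sparc64 opcodes
transparently.  A minimal usage sketch, using the library helpers
introduced earlier in this series ('raw_key' and 'in' are placeholder
buffers; error handling omitted):

    struct aes_enckey key;
    u8 out[AES_BLOCK_SIZE];
    int err;

    err = aes_prepareenckey(&key, raw_key, AES_KEYSIZE_256);
    /* Dispatches to aes_sparc64_encrypt_256() when the opcodes exist */
    aes_encrypt_new(&key, out, in);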

Note that some of the functions in the SPARC64 AES assembly code are
still used by the AES mode implementations in
arch/sparc/crypto/aes_glue.c.  For now, just export these functions.
These exports will go away once the AES mode implementations are
migrated to the library as well.  (Trying to split up the assembly file
seemed like much more trouble than it would be worth.)

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/sparc/crypto/Kconfig                     |   2 +-
 arch/sparc/crypto/Makefile                    |   2 +-
 arch/sparc/crypto/aes_glue.c                  | 140 +---------------
 include/crypto/aes.h                          |  42 +++++
 lib/crypto/Kconfig                            |   1 +
 lib/crypto/Makefile                           |   1 +
 lib/crypto/sparc/aes.h                        | 149 ++++++++++++++++++
 .../crypto => lib/crypto/sparc}/aes_asm.S     |   0
 8 files changed, 200 insertions(+), 137 deletions(-)
 create mode 100644 lib/crypto/sparc/aes.h
 rename {arch/sparc/crypto => lib/crypto/sparc}/aes_asm.S (100%)

diff --git a/arch/sparc/crypto/Kconfig b/arch/sparc/crypto/Kconfig
index f755da979534..c1932ce46c7f 100644
--- a/arch/sparc/crypto/Kconfig
+++ b/arch/sparc/crypto/Kconfig
@@ -17,13 +17,13 @@ config CRYPTO_DES_SPARC64
 	  Architecture: sparc64
 
 config CRYPTO_AES_SPARC64
 	tristate "Ciphers: AES, modes: ECB, CBC, CTR"
 	depends on SPARC64
+	select CRYPTO_LIB_AES
 	select CRYPTO_SKCIPHER
 	help
-	  Block ciphers: AES cipher algorithms (FIPS-197)
	  Length-preserving ciphers: AES with ECB, CBC, and CTR modes
 
 	  Architecture: sparc64 using crypto instructions
 
 config CRYPTO_CAMELLIA_SPARC64
diff --git a/arch/sparc/crypto/Makefile b/arch/sparc/crypto/Makefile
index 7b4796842ddd..cdf9f4b3efbb 100644
--- a/arch/sparc/crypto/Makefile
+++ b/arch/sparc/crypto/Makefile
@@ -5,8 +5,8 @@
 
 obj-$(CONFIG_CRYPTO_AES_SPARC64) += aes-sparc64.o
 obj-$(CONFIG_CRYPTO_DES_SPARC64) += des-sparc64.o
 obj-$(CONFIG_CRYPTO_CAMELLIA_SPARC64) += camellia-sparc64.o
 
-aes-sparc64-y := aes_asm.o aes_glue.o
+aes-sparc64-y := aes_glue.o
 des-sparc64-y := des_asm.o des_glue.o
 camellia-sparc64-y := camellia_asm.o camellia_glue.o
diff --git a/arch/sparc/crypto/aes_glue.c b/arch/sparc/crypto/aes_glue.c
index 359f22643b05..661561837415 100644
--- a/arch/sparc/crypto/aes_glue.c
+++ b/arch/sparc/crypto/aes_glue.c
@@ -30,12 +30,10 @@
 #include <asm/opcodes.h>
 #include <asm/pstate.h>
 #include <asm/elf.h>
 
 struct aes_ops {
-	void (*encrypt)(const u64 *key, const u32 *input, u32 *output);
-	void (*decrypt)(const u64 *key, const u32 *input, u32 *output);
 	void (*load_encrypt_keys)(const u64 *key);
 	void (*load_decrypt_keys)(const u64 *key);
 	void (*ecb_encrypt)(const u64 *key, const u64 *input, u64 *output,
 			    unsigned int len);
 	void (*ecb_decrypt)(const u64 *key, const u64 *input, u64 *output,
@@ -53,123 +51,44 @@ struct crypto_sparc64_aes_ctx {
 	u64 key[AES_MAX_KEYLENGTH / sizeof(u64)];
 	u32 key_length;
 	u32 expanded_key_length;
 };
 
-extern void aes_sparc64_encrypt_128(const u64 *key, const u32 *input,
-				    u32 *output);
-extern void aes_sparc64_encrypt_192(const u64 *key, const u32 *input,
-				    u32 *output);
-extern void aes_sparc64_encrypt_256(const u64 *key, const u32 *input,
-				    u32 *output);
-
-extern void aes_sparc64_decrypt_128(const u64 *key, const u32 *input,
-				    u32 *output);
-extern void aes_sparc64_decrypt_192(const u64 *key, const u32 *input,
-				    u32 *output);
-extern void aes_sparc64_decrypt_256(const u64 *key, const u32 *input,
-				    u32 *output);
-
-extern void aes_sparc64_load_encrypt_keys_128(const u64 *key);
-extern void aes_sparc64_load_encrypt_keys_192(const u64 *key);
-extern void aes_sparc64_load_encrypt_keys_256(const u64 *key);
-
-extern void aes_sparc64_load_decrypt_keys_128(const u64 *key);
-extern void aes_sparc64_load_decrypt_keys_192(const u64 *key);
-extern void aes_sparc64_load_decrypt_keys_256(const u64 *key);
-
-extern void aes_sparc64_ecb_encrypt_128(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len);
-extern void aes_sparc64_ecb_encrypt_192(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len);
-extern void aes_sparc64_ecb_encrypt_256(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len);
-
-extern void aes_sparc64_ecb_decrypt_128(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len);
-extern void aes_sparc64_ecb_decrypt_192(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len);
-extern void aes_sparc64_ecb_decrypt_256(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len);
-
-extern void aes_sparc64_cbc_encrypt_128(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len,
-					u64 *iv);
-
-extern void aes_sparc64_cbc_encrypt_192(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len,
-					u64 *iv);
-
-extern void aes_sparc64_cbc_encrypt_256(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len,
-					u64 *iv);
-
-extern void aes_sparc64_cbc_decrypt_128(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len,
-					u64 *iv);
-
-extern void aes_sparc64_cbc_decrypt_192(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len,
-					u64 *iv);
-
-extern void aes_sparc64_cbc_decrypt_256(const u64 *key, const u64 *input,
-					u64 *output, unsigned int len,
-					u64 *iv);
-
-extern void aes_sparc64_ctr_crypt_128(const u64 *key, const u64 *input,
-				      u64 *output, unsigned int len,
-				      u64 *iv);
-extern void aes_sparc64_ctr_crypt_192(const u64 *key, const u64 *input,
-				      u64 *output, unsigned int len,
-				      u64 *iv);
-extern void aes_sparc64_ctr_crypt_256(const u64 *key, const u64 *input,
-				      u64 *output, unsigned int len,
-				      u64 *iv);
-
 static struct aes_ops aes128_ops = {
-	.encrypt		= aes_sparc64_encrypt_128,
-	.decrypt		= aes_sparc64_decrypt_128,
 	.load_encrypt_keys	= aes_sparc64_load_encrypt_keys_128,
 	.load_decrypt_keys	= aes_sparc64_load_decrypt_keys_128,
 	.ecb_encrypt		= aes_sparc64_ecb_encrypt_128,
 	.ecb_decrypt		= aes_sparc64_ecb_decrypt_128,
 	.cbc_encrypt		= aes_sparc64_cbc_encrypt_128,
 	.cbc_decrypt		= aes_sparc64_cbc_decrypt_128,
 	.ctr_crypt		= aes_sparc64_ctr_crypt_128,
 };
 
 static struct aes_ops aes192_ops = {
-	.encrypt		= aes_sparc64_encrypt_192,
-	.decrypt		= aes_sparc64_decrypt_192,
 	.load_encrypt_keys	= aes_sparc64_load_encrypt_keys_192,
 	.load_decrypt_keys	= aes_sparc64_load_decrypt_keys_192,
 	.ecb_encrypt		= aes_sparc64_ecb_encrypt_192,
 	.ecb_decrypt		= aes_sparc64_ecb_decrypt_192,
 	.cbc_encrypt		= aes_sparc64_cbc_encrypt_192,
 	.cbc_decrypt		= aes_sparc64_cbc_decrypt_192,
 	.ctr_crypt		= aes_sparc64_ctr_crypt_192,
 };
 
 static struct aes_ops aes256_ops = {
-	.encrypt		= aes_sparc64_encrypt_256,
-	.decrypt		= aes_sparc64_decrypt_256,
 	.load_encrypt_keys	= aes_sparc64_load_encrypt_keys_256,
 	.load_decrypt_keys	= aes_sparc64_load_decrypt_keys_256,
 	.ecb_encrypt		= aes_sparc64_ecb_encrypt_256,
 	.ecb_decrypt		= aes_sparc64_ecb_decrypt_256,
 	.cbc_encrypt		= aes_sparc64_cbc_encrypt_256,
 	.cbc_decrypt		= aes_sparc64_cbc_decrypt_256,
 	.ctr_crypt		= aes_sparc64_ctr_crypt_256,
 };
 
-extern void aes_sparc64_key_expand(const u32 *in_key, u64 *output_key,
-				   unsigned int key_len);
-
-static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
-		       unsigned int key_len)
+static int aes_set_key_skcipher(struct crypto_skcipher *tfm, const u8 *in_key,
+				unsigned int key_len)
 {
-	struct crypto_sparc64_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+	struct crypto_sparc64_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
 
 	switch (key_len) {
 	case AES_KEYSIZE_128:
 		ctx->expanded_key_length = 0xb0;
 		ctx->ops = &aes128_ops;
@@ -193,30 +112,10 @@ static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
 	ctx->key_length = key_len;
 
 	return 0;
 }
 
-static int aes_set_key_skcipher(struct crypto_skcipher *tfm, const u8 *in_key,
-				unsigned int key_len)
-{
-	return aes_set_key(crypto_skcipher_tfm(tfm), in_key, key_len);
-}
-
-static void crypto_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct crypto_sparc64_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	ctx->ops->encrypt(&ctx->key[0], (const u32 *) src, (u32 *) dst);
-}
-
-static void crypto_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct crypto_sparc64_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	ctx->ops->decrypt(&ctx->key[0], (const u32 *) src, (u32 *) dst);
-}
-
 static int ecb_encrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	const struct crypto_sparc64_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
 	struct skcipher_walk walk;
@@ -356,30 +255,10 @@ static int ctr_crypt(struct skcipher_request *req)
 	}
 	fprs_write(0);
 	return err;
 }
 
-static struct crypto_alg cipher_alg = {
-	.cra_name		= "aes",
-	.cra_driver_name	= "aes-sparc64",
-	.cra_priority		= SPARC_CR_OPCODE_PRIORITY,
-	.cra_flags		= CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize		= AES_BLOCK_SIZE,
-	.cra_ctxsize		= sizeof(struct crypto_sparc64_aes_ctx),
-	.cra_alignmask		= 3,
-	.cra_module		= THIS_MODULE,
-	.cra_u	= {
-		.cipher	= {
-			.cia_min_keysize	= AES_MIN_KEY_SIZE,
-			.cia_max_keysize	= AES_MAX_KEY_SIZE,
-			.cia_setkey		= aes_set_key,
-			.cia_encrypt		= crypto_aes_encrypt,
-			.cia_decrypt		= crypto_aes_decrypt
-		}
-	}
-};
-
 static struct skcipher_alg skcipher_algs[] = {
 	{
 		.base.cra_name		= "ecb(aes)",
 		.base.cra_driver_name	= "ecb-aes-sparc64",
 		.base.cra_priority	= SPARC_CR_OPCODE_PRIORITY,
@@ -438,30 +317,21 @@ static bool __init sparc64_has_aes_opcode(void)
 	return true;
 }
 
 static int __init aes_sparc64_mod_init(void)
 {
-	int err;
-
 	if (!sparc64_has_aes_opcode()) {
 		pr_info("sparc64 aes opcodes not available.\n");
 		return -ENODEV;
 	}
 	pr_info("Using sparc64 aes opcodes optimized AES implementation\n");
-	err = crypto_register_alg(&cipher_alg);
-	if (err)
-		return err;
-	err = crypto_register_skciphers(skcipher_algs,
-					ARRAY_SIZE(skcipher_algs));
-	if (err)
-		crypto_unregister_alg(&cipher_alg);
-	return err;
+	return crypto_register_skciphers(skcipher_algs,
+					 ARRAY_SIZE(skcipher_algs));
 }
 
 static void __exit aes_sparc64_mod_fini(void)
 {
-	crypto_unregister_alg(&cipher_alg);
 	crypto_unregister_skciphers(skcipher_algs, ARRAY_SIZE(skcipher_algs));
 }
 
 module_init(aes_sparc64_mod_init);
 module_exit(aes_sparc64_mod_fini);
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index b91eb49cbffc..e4b5f60e7a0b 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -47,10 +47,13 @@ union aes_enckey_arch {
 	 */
 	struct p8_aes_key p8;
 #elif defined(CONFIG_S390)
 	/* Used when the CPU supports CPACF AES for this key's length */
 	u8 raw_key[AES_MAX_KEY_SIZE];
+#elif defined(CONFIG_SPARC64)
+	/* Used when the CPU supports the SPARC64 AES opcodes */
+	u64 sparc_rndkeys[AES_MAX_KEYLENGTH / sizeof(u64)];
 #endif
 #endif /* CONFIG_CRYPTO_LIB_AES_ARCH */
 };
 
 union aes_invkey_arch {
@@ -197,10 +200,49 @@ void aes_p8_xts_encrypt(const u8 *in, u8 *out, size_t len,
 			const struct p8_aes_key *key1,
 			const struct p8_aes_key *key2, u8 *iv);
 void aes_p8_xts_decrypt(const u8 *in, u8 *out, size_t len,
 			const struct p8_aes_key *key1,
 			const struct p8_aes_key *key2, u8 *iv);
+#elif defined(CONFIG_SPARC64)
+void aes_sparc64_key_expand(const u32 *in_key, u64 *output_key,
+			    unsigned int key_len);
+void aes_sparc64_load_encrypt_keys_128(const u64 *key);
+void aes_sparc64_load_encrypt_keys_192(const u64 *key);
+void aes_sparc64_load_encrypt_keys_256(const u64 *key);
+void aes_sparc64_load_decrypt_keys_128(const u64 *key);
+void aes_sparc64_load_decrypt_keys_192(const u64 *key);
+void aes_sparc64_load_decrypt_keys_256(const u64 *key);
+void aes_sparc64_ecb_encrypt_128(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len);
+void aes_sparc64_ecb_encrypt_192(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len);
+void aes_sparc64_ecb_encrypt_256(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len);
+void aes_sparc64_ecb_decrypt_128(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len);
+void aes_sparc64_ecb_decrypt_192(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len);
+void aes_sparc64_ecb_decrypt_256(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len);
+void aes_sparc64_cbc_encrypt_128(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len, u64 *iv);
+void aes_sparc64_cbc_encrypt_192(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len, u64 *iv);
+void aes_sparc64_cbc_encrypt_256(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len, u64 *iv);
+void aes_sparc64_cbc_decrypt_128(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len, u64 *iv);
+void aes_sparc64_cbc_decrypt_192(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len, u64 *iv);
+void aes_sparc64_cbc_decrypt_256(const u64 *key, const u64 *input, u64 *output,
+				 unsigned int len, u64 *iv);
+void aes_sparc64_ctr_crypt_128(const u64 *key, const u64 *input, u64 *output,
+			       unsigned int len, u64 *iv);
+void aes_sparc64_ctr_crypt_192(const u64 *key, const u64 *input, u64 *output,
+			       unsigned int len, u64 *iv);
+void aes_sparc64_ctr_crypt_256(const u64 *key, const u64 *input, u64 *output,
+			       unsigned int len, u64 *iv);
 #endif
 
 /**
  * aes_preparekey() - Prepare an AES key for encryption and decryption
  * @key: (output) The key structure to initialize
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index b672f0145793..222887c04240 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -18,10 +18,11 @@ config CRYPTO_LIB_AES_ARCH
 	default y if ARM64
 	default y if PPC && (SPE || (PPC64 && VSX))
 	default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \
 		     RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
 	default y if S390
+	default y if SPARC64
 
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 811b60787dd5..761d52d91f92 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -49,10 +49,11 @@ targets += powerpc/aesp8-ppc.S
 OBJECT_FILES_NON_STANDARD_powerpc/aesp8-ppc.o := y
 endif # !CONFIG_SPE
 endif # CONFIG_PPC
 
 libaes-$(CONFIG_RISCV) += riscv/aes-riscv64-zvkned.o
+libaes-$(CONFIG_SPARC) += sparc/aes_asm.o
 endif # CONFIG_CRYPTO_LIB_AES_ARCH
 
 ################################################################################
 
 obj-$(CONFIG_CRYPTO_LIB_AESCFB)			+= libaescfb.o
diff --git a/lib/crypto/sparc/aes.h b/lib/crypto/sparc/aes.h
new file mode 100644
index 000000000000..e354aa507ee0
--- /dev/null
+++ b/lib/crypto/sparc/aes.h
@@ -0,0 +1,149 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * AES accelerated using the sparc64 aes opcodes
+ *
+ * Copyright (C) 2008, Intel Corp.
+ * Copyright (c) 2010, Intel Corporation.
+ * Copyright 2026 Google LLC
+ */
+
+#include <asm/fpumacro.h>
+#include <asm/opcodes.h>
+#include <asm/pstate.h>
+#include <asm/elf.h>
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_aes_opcodes);
+
+EXPORT_SYMBOL_GPL(aes_sparc64_key_expand);
+EXPORT_SYMBOL_GPL(aes_sparc64_load_encrypt_keys_128);
+EXPORT_SYMBOL_GPL(aes_sparc64_load_encrypt_keys_192);
+EXPORT_SYMBOL_GPL(aes_sparc64_load_encrypt_keys_256);
+EXPORT_SYMBOL_GPL(aes_sparc64_load_decrypt_keys_128);
+EXPORT_SYMBOL_GPL(aes_sparc64_load_decrypt_keys_192);
+EXPORT_SYMBOL_GPL(aes_sparc64_load_decrypt_keys_256);
+EXPORT_SYMBOL_GPL(aes_sparc64_ecb_encrypt_128);
+EXPORT_SYMBOL_GPL(aes_sparc64_ecb_encrypt_192);
+EXPORT_SYMBOL_GPL(aes_sparc64_ecb_encrypt_256);
+EXPORT_SYMBOL_GPL(aes_sparc64_ecb_decrypt_128);
+EXPORT_SYMBOL_GPL(aes_sparc64_ecb_decrypt_192);
+EXPORT_SYMBOL_GPL(aes_sparc64_ecb_decrypt_256);
+EXPORT_SYMBOL_GPL(aes_sparc64_cbc_encrypt_128);
+EXPORT_SYMBOL_GPL(aes_sparc64_cbc_encrypt_192);
+EXPORT_SYMBOL_GPL(aes_sparc64_cbc_encrypt_256);
+EXPORT_SYMBOL_GPL(aes_sparc64_cbc_decrypt_128);
+EXPORT_SYMBOL_GPL(aes_sparc64_cbc_decrypt_192);
+EXPORT_SYMBOL_GPL(aes_sparc64_cbc_decrypt_256);
+EXPORT_SYMBOL_GPL(aes_sparc64_ctr_crypt_128);
+EXPORT_SYMBOL_GPL(aes_sparc64_ctr_crypt_192);
+EXPORT_SYMBOL_GPL(aes_sparc64_ctr_crypt_256);
+
+void aes_sparc64_encrypt_128(const u64 *key, const u32 *input, u32 *output);
+void aes_sparc64_encrypt_192(const u64 *key, const u32 *input, u32 *output);
+void aes_sparc64_encrypt_256(const u64 *key, const u32 *input, u32 *output);
+void aes_sparc64_decrypt_128(const u64 *key, const u32 *input, u32 *output);
+void aes_sparc64_decrypt_192(const u64 *key, const u32 *input, u32 *output);
+void aes_sparc64_decrypt_256(const u64 *key, const u32 *input, u32 *output);
+
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	if (static_branch_likely(&have_aes_opcodes)) {
+		u32 aligned_key[AES_MAX_KEY_SIZE / 4];
+
+		if (IS_ALIGNED((uintptr_t)in_key, 4)) {
+			aes_sparc64_key_expand((const u32 *)in_key,
+					       k->sparc_rndkeys, key_len);
+		} else {
+			memcpy(aligned_key, in_key, key_len);
+			aes_sparc64_key_expand(aligned_key,
+					       k->sparc_rndkeys, key_len);
+			memzero_explicit(aligned_key, key_len);
+		}
+		/*
+		 * Note that nothing needs to be written to inv_k (if it's
+		 * non-NULL) here, since the SPARC64 assembly code uses
+		 * k->sparc_rndkeys for both encryption and decryption.
+		 */
+	} else {
+		aes_expandkey_generic(k->rndkeys,
+				      inv_k ? inv_k->inv_rndkeys : NULL,
+				      in_key, key_len);
+	}
+}
+
+static void aes_sparc64_encrypt(const struct aes_enckey *key,
+				const u32 *input, u32 *output)
+{
+	if (key->len == AES_KEYSIZE_128)
+		aes_sparc64_encrypt_128(key->k.sparc_rndkeys, input, output);
+	else if (key->len == AES_KEYSIZE_192)
+		aes_sparc64_encrypt_192(key->k.sparc_rndkeys, input, output);
+	else
+		aes_sparc64_encrypt_256(key->k.sparc_rndkeys, input, output);
+}
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	u32 bounce_buf[AES_BLOCK_SIZE / 4];
+
+	if (static_branch_likely(&have_aes_opcodes)) {
+		if (IS_ALIGNED((uintptr_t)in | (uintptr_t)out, 4)) {
+			aes_sparc64_encrypt(key, (const u32 *)in, (u32 *)out);
+		} else {
+			memcpy(bounce_buf, in, AES_BLOCK_SIZE);
+			aes_sparc64_encrypt(key, bounce_buf, bounce_buf);
+			memcpy(out, bounce_buf, AES_BLOCK_SIZE);
+		}
+	} else {
+		aes_encrypt_generic(key->k.rndkeys, key->nrounds, out, in);
+	}
+}
+
+static void aes_sparc64_decrypt(const struct aes_key *key,
+				const u32 *input, u32 *output)
+{
+	if (key->len == AES_KEYSIZE_128)
+		aes_sparc64_decrypt_128(key->k.sparc_rndkeys, input, output);
+	else if (key->len == AES_KEYSIZE_192)
+		aes_sparc64_decrypt_192(key->k.sparc_rndkeys, input, output);
+	else
+		aes_sparc64_decrypt_256(key->k.sparc_rndkeys, input, output);
+}
+
+static void aes_decrypt_arch(const struct aes_key *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	u32 bounce_buf[AES_BLOCK_SIZE / 4];
+
+	if (static_branch_likely(&have_aes_opcodes)) {
+		if (IS_ALIGNED((uintptr_t)in | (uintptr_t)out, 4)) {
+			aes_sparc64_decrypt(key, (const u32 *)in, (u32 *)out);
+		} else {
+			memcpy(bounce_buf, in, AES_BLOCK_SIZE);
+			aes_sparc64_decrypt(key, bounce_buf, bounce_buf);
+			memcpy(out, bounce_buf, AES_BLOCK_SIZE);
+		}
+	} else {
+		aes_decrypt_generic(key->inv_k.inv_rndkeys, key->nrounds,
+				    out, in);
+	}
+}
+
+#define aes_mod_init_arch aes_mod_init_arch
+static void aes_mod_init_arch(void)
+{
+	unsigned long cfr;
+
+	if (!(sparc64_elf_hwcap & HWCAP_SPARC_CRYPTO))
+		return;
+
+	__asm__ __volatile__("rd %%asr26, %0" : "=r" (cfr));
+	if (!(cfr & CFR_AES))
+		return;
+
+	static_branch_enable(&have_aes_opcodes);
+}
diff --git a/arch/sparc/crypto/aes_asm.S b/lib/crypto/sparc/aes_asm.S
similarity index 100%
rename from arch/sparc/crypto/aes_asm.S
rename to lib/crypto/sparc/aes_asm.S
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 17/36] lib/crypto: x86/aes: Add AES-NI optimization
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (15 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 16/36] lib/crypto: sparc/aes: " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 18/36] crypto: x86/aes - Remove the superseded AES-NI crypto_cipher Eric Biggers
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Optimize the AES library with x86 AES-NI instructions.

The relevant existing assembly functions, aesni_set_key(), aesni_enc(),
and aesni_dec(), are a bit difficult to extract into the library:

- They're coupled to the code for the AES modes.
- They operate on struct crypto_aes_ctx.  The AES library now uses
  different structs.
- They assume the key is 16-byte aligned.  The AES library only
  *prefers* 16-byte alignment; it doesn't require it.

Moreover, they're not all that great in the first place:

- They use unrolled loops, which isn't a great choice on x86.
- They use the 'aeskeygenassist' instruction, which is unnecessary, is
  slow on Intel CPUs, and forces the loop to be unrolled.
- They have special code for AES-192 key expansion, despite that being
  kind of useless.  AES-128 and AES-256 are the ones used in practice.

These are small functions anyway.

Therefore, I opted to just write replacements for these functions in the
library.  They address all the above issues.
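
For illustration, one step of AES-128 key expansion as the new assembly
computes it (a plain C sketch, not code from this patch; sub_word() is an
assumed helper applying the AES S-box to each byte, and ror32() is the
usual kernel rotation helper):

    /* t = SubBytes(ror32(w, 8)) ^ rcon, where w is the last dword of the
     * previous round key; computed below with pshufb + aesenclast. */
    u32 t = sub_word(ror32(rk[3], 8)) ^ rcon;

    /* XOR t into the prefix sums of the previous round key's dwords;
     * the _prefix_sum macro computes these with pslldq + pxor. */
    rk[4] = rk[0] ^ t;
    rk[5] = rk[0] ^ rk[1] ^ t;              /* == rk[4] ^ rk[1] */
    rk[6] = rk[0] ^ rk[1] ^ rk[2] ^ t;
    rk[7] = rk[0] ^ rk[1] ^ rk[2] ^ rk[3] ^ t;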

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 lib/crypto/Kconfig         |   1 +
 lib/crypto/Makefile        |   1 +
 lib/crypto/x86/aes-aesni.S | 261 +++++++++++++++++++++++++++++++++++++
 lib/crypto/x86/aes.h       |  85 ++++++++++++
 4 files changed, 348 insertions(+)
 create mode 100644 lib/crypto/x86/aes-aesni.S
 create mode 100644 lib/crypto/x86/aes.h

diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 222887c04240..e3ee31217988 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -19,10 +19,11 @@ config CRYPTO_LIB_AES_ARCH
 	default y if PPC && (SPE || (PPC64 && VSX))
 	default y if RISCV && 64BIT && TOOLCHAIN_HAS_VECTOR_CRYPTO && \
 		     RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
 	default y if S390
 	default y if SPARC64
+	default y if X86
 
 config CRYPTO_LIB_AESCFB
 	tristate
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_UTILS
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 761d52d91f92..725eef05b758 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -50,10 +50,11 @@ OBJECT_FILES_NON_STANDARD_powerpc/aesp8-ppc.o := y
 endif # !CONFIG_SPE
 endif # CONFIG_PPC
 
 libaes-$(CONFIG_RISCV) += riscv/aes-riscv64-zvkned.o
 libaes-$(CONFIG_SPARC) += sparc/aes_asm.o
+libaes-$(CONFIG_X86) += x86/aes-aesni.o
 endif # CONFIG_CRYPTO_LIB_AES_ARCH
 
 ################################################################################
 
 obj-$(CONFIG_CRYPTO_LIB_AESCFB)			+= libaescfb.o
diff --git a/lib/crypto/x86/aes-aesni.S b/lib/crypto/x86/aes-aesni.S
new file mode 100644
index 000000000000..b8c3e104a3be
--- /dev/null
+++ b/lib/crypto/x86/aes-aesni.S
@@ -0,0 +1,261 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+//
+// AES block cipher using AES-NI instructions
+//
+// Copyright 2026 Google LLC
+//
+// The code in this file supports 32-bit and 64-bit CPUs, and it doesn't require
+// AVX.  It does use SSSE3 (for pshufb), which all CPUs with AES-NI have.
+#include <linux/linkage.h>
+
+.section .rodata
+#ifdef __x86_64__
+#define RODATA(label)	label(%rip)
+#else
+#define RODATA(label)	label
+#endif
+
+	// A mask for pshufb that extracts the last dword, rotates it right by 8
+	// bits, and copies the result to all four dwords.
+.p2align 4
+.Lmask:
+	.byte	13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 12
+
+	// The AES round constants, used during key expansion
+.Lrcon:
+	.long	0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36
+
+.text
+
+// Transform four dwords [a0, a1, a2, a3] in \a into
+// [a0, a0^a1, a0^a1^a2, a0^a1^a2^a3].  \tmp is a temporary xmm register.
+//
+// Note: this could be done in four instructions, shufps + pxor + shufps + pxor,
+// if the temporary register were zero-initialized ahead of time.  We instead do
+// it in an easier-to-understand way that doesn't require zero-initialization
+// and avoids the unusual shufps instruction.  movdqa is usually "free" anyway.
+.macro	_prefix_sum	a, tmp
+	movdqa		\a, \tmp	// [a0, a1, a2, a3]
+	pslldq		$4, \a		// [0, a0, a1, a2]
+	pxor		\tmp, \a	// [a0, a0^a1, a1^a2, a2^a3]
+	movdqa		\a, \tmp
+	pslldq		$8, \a		// [0, 0, a0, a0^a1]
+	pxor		\tmp, \a	// [a0, a0^a1, a0^a1^a2, a0^a1^a2^a3]
+.endm
+
+.macro	_gen_round_key	a, b
+	// Compute four copies of rcon[i] ^ SubBytes(ror32(w, 8)), where w is
+	// the last dword of the previous round key (given in \b).
+	//
+	// 'aesenclast src, dst' does dst = src XOR SubBytes(ShiftRows(dst)).
+	// It is used here solely for the SubBytes and the XOR.  The ShiftRows
+	// is a no-op because all four columns are the same here.
+	//
+	// Don't use the 'aeskeygenassist' instruction, since:
+	//  - On most Intel CPUs it is microcoded, making it have a much higher
+	//    latency and use more execution ports than 'aesenclast'.
+	//  - It cannot be used in a loop, since it requires an immediate.
+	//  - It doesn't do much more than 'aesenclast' in the first place.
+	movdqa		\b, %xmm2
+	pshufb		MASK, %xmm2
+	aesenclast	RCON, %xmm2
+
+	// XOR in the prefix sum of the four dwords of \a, which is the
+	// previous round key (AES-128) or the first round key in the previous
+	// pair of round keys (AES-256).  The result is the next round key.
+	_prefix_sum	\a, tmp=%xmm3
+	pxor		%xmm2, \a
+
+	// Store the next round key to memory.  Also leave it in \a.
+	movdqu		\a, (RNDKEYS)
+.endm
+
+.macro	_aes_expandkey_aesni	is_aes128
+#ifdef __x86_64__
+	// Arguments
+	.set	RNDKEYS,	%rdi
+	.set	INV_RNDKEYS,	%rsi
+	.set	IN_KEY,		%rdx
+
+	// Other local variables
+	.set	RCON_PTR,	%rcx
+	.set	COUNTER,	%eax
+#else
+	// Arguments, assuming -mregparm=3
+	.set	RNDKEYS,	%eax
+	.set	INV_RNDKEYS,	%edx
+	.set	IN_KEY,		%ecx
+
+	// Other local variables
+	.set	RCON_PTR,	%ebx
+	.set	COUNTER,	%esi
+#endif
+	.set	RCON,		%xmm6
+	.set	MASK,		%xmm7
+
+#ifdef __i386__
+	push		%ebx
+	push		%esi
+#endif
+
+.if \is_aes128
+	// AES-128: the first round key is simply a copy of the raw key.
+	movdqu		(IN_KEY), %xmm0
+	movdqu		%xmm0, (RNDKEYS)
+.else
+	// AES-256: the first two round keys are simply a copy of the raw key.
+	movdqu		(IN_KEY), %xmm0
+	movdqu		%xmm0, (RNDKEYS)
+	movdqu		16(IN_KEY), %xmm1
+	movdqu		%xmm1, 16(RNDKEYS)
+	add		$32, RNDKEYS
+.endif
+
+	// Generate the remaining round keys.
+	movdqa		RODATA(.Lmask), MASK
+.if \is_aes128
+	lea		RODATA(.Lrcon), RCON_PTR
+	mov		$10, COUNTER
+.Lgen_next_aes128_round_key:
+	add		$16, RNDKEYS
+	movd		(RCON_PTR), RCON
+	pshufd		$0x00, RCON, RCON
+	add		$4, RCON_PTR
+	_gen_round_key	%xmm0, %xmm0
+	dec		COUNTER
+	jnz		.Lgen_next_aes128_round_key
+.else
+	// AES-256: only the first 7 round constants are needed, so instead of
+	// loading each one from memory, just start by loading [1, 1, 1, 1] and
+	// then generate the rest by doubling.
+	pshufd		$0x00, RODATA(.Lrcon), RCON
+	pxor		%xmm5, %xmm5	// All-zeroes
+	mov		$7, COUNTER
+.Lgen_next_aes256_round_key_pair:
+	// Generate the next AES-256 round key: either the first of a pair of
+	// two, or the last one.
+	_gen_round_key	%xmm0, %xmm1
+
+	dec		COUNTER
+	jz		.Lgen_aes256_round_keys_done
+
+	// Generate the second AES-256 round key of the pair.  Compared to the
+	// first, there's no rotation and no XOR of a round constant.
+	pshufd		$0xff, %xmm0, %xmm2	// Get four copies of last dword
+	aesenclast	%xmm5, %xmm2		// Just does SubBytes
+	_prefix_sum	%xmm1, tmp=%xmm3
+	pxor		%xmm2, %xmm1
+	movdqu		%xmm1, 16(RNDKEYS)
+	add		$32, RNDKEYS
+	paddd		RCON, RCON		// RCON <<= 1
+	jmp		.Lgen_next_aes256_round_key_pair
+.Lgen_aes256_round_keys_done:
+.endif
+
+	// If INV_RNDKEYS is non-NULL, write the round keys for the Equivalent
+	// Inverse Cipher to it.  To do that, reverse the standard round keys,
+	// and apply aesimc (InvMixColumn) to each except the first and last.
+	test		INV_RNDKEYS, INV_RNDKEYS
+	jz		.Ldone\@
+	movdqu		(RNDKEYS), %xmm0	// Last standard round key
+	movdqu		%xmm0, (INV_RNDKEYS)	// => First inverse round key
+.if \is_aes128
+	mov		$9, COUNTER
+.else
+	mov		$13, COUNTER
+.endif
+.Lgen_next_inv_round_key\@:
+	sub		$16, RNDKEYS
+	add		$16, INV_RNDKEYS
+	movdqu		(RNDKEYS), %xmm0
+	aesimc		%xmm0, %xmm0
+	movdqu		%xmm0, (INV_RNDKEYS)
+	dec		COUNTER
+	jnz		.Lgen_next_inv_round_key\@
+	movdqu		-16(RNDKEYS), %xmm0	// First standard round key
+	movdqu		%xmm0, 16(INV_RNDKEYS)	// => Last inverse round key
+
+.Ldone\@:
+#ifdef __i386__
+	pop		%esi
+	pop		%ebx
+#endif
+	RET
+.endm
+
+// void aes128_expandkey_aesni(u32 rndkeys[], u32 *inv_rndkeys,
+//			       const u8 in_key[AES_KEYSIZE_128]);
+SYM_FUNC_START(aes128_expandkey_aesni)
+	_aes_expandkey_aesni	1
+SYM_FUNC_END(aes128_expandkey_aesni)
+
+// void aes256_expandkey_aesni(u32 rndkeys[], u32 *inv_rndkeys,
+//			       const u8 in_key[AES_KEYSIZE_256]);
+SYM_FUNC_START(aes256_expandkey_aesni)
+	_aes_expandkey_aesni	0
+SYM_FUNC_END(aes256_expandkey_aesni)
+
+.macro	_aes_crypt_aesni	enc
+#ifdef __x86_64__
+	.set	RNDKEYS,	%rdi
+	.set	NROUNDS,	%esi
+	.set	OUT,		%rdx
+	.set	IN,		%rcx
+#else
+	// Assuming -mregparm=3
+	.set	RNDKEYS,	%eax
+	.set	NROUNDS,	%edx
+	.set	OUT,		%ecx
+	.set	IN,		%ebx	// Passed on stack
+#endif
+
+#ifdef __i386__
+	push		%ebx
+	mov		8(%esp), %ebx
+#endif
+
+	// Zero-th round
+	movdqu		(IN), %xmm0
+	movdqu		(RNDKEYS), %xmm1
+	pxor		%xmm1, %xmm0
+
+	// Normal rounds
+	add		$16, RNDKEYS
+	dec		NROUNDS
+.Lnext_round\@:
+	movdqu		(RNDKEYS), %xmm1
+.if \enc
+	aesenc		%xmm1, %xmm0
+.else
+	aesdec		%xmm1, %xmm0
+.endif
+	add		$16, RNDKEYS
+	dec		NROUNDS
+	jne		.Lnext_round\@
+
+	// Last round
+	movdqu		(RNDKEYS), %xmm1
+.if \enc
+	aesenclast	%xmm1, %xmm0
+.else
+	aesdeclast	%xmm1, %xmm0
+.endif
+	movdqu		%xmm0, (OUT)
+
+#ifdef __i386__
+	pop		%ebx
+#endif
+	RET
+.endm
+
+// void aes_encrypt_aesni(const u32 rndkeys[], int nrounds,
+//			  u8 out[AES_BLOCK_SIZE], const u8 in[AES_BLOCK_SIZE]);
+SYM_FUNC_START(aes_encrypt_aesni)
+	_aes_crypt_aesni	1
+SYM_FUNC_END(aes_encrypt_aesni)
+
+// void aes_decrypt_aesni(const u32 inv_rndkeys[], int nrounds,
+//			  u8 out[AES_BLOCK_SIZE], const u8 in[AES_BLOCK_SIZE]);
+SYM_FUNC_START(aes_decrypt_aesni)
+	_aes_crypt_aesni	0
+SYM_FUNC_END(aes_decrypt_aesni)
diff --git a/lib/crypto/x86/aes.h b/lib/crypto/x86/aes.h
new file mode 100644
index 000000000000..b047dee94f57
--- /dev/null
+++ b/lib/crypto/x86/aes.h
@@ -0,0 +1,85 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * AES block cipher using AES-NI instructions
+ *
+ * Copyright 2026 Google LLC
+ */
+
+#include <asm/fpu/api.h>
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_aes);
+
+void aes128_expandkey_aesni(u32 rndkeys[], u32 *inv_rndkeys,
+			    const u8 in_key[AES_KEYSIZE_128]);
+void aes256_expandkey_aesni(u32 rndkeys[], u32 *inv_rndkeys,
+			    const u8 in_key[AES_KEYSIZE_256]);
+void aes_encrypt_aesni(const u32 rndkeys[], int nrounds,
+		       u8 out[AES_BLOCK_SIZE], const u8 in[AES_BLOCK_SIZE]);
+void aes_decrypt_aesni(const u32 inv_rndkeys[], int nrounds,
+		       u8 out[AES_BLOCK_SIZE], const u8 in[AES_BLOCK_SIZE]);
+
+/*
+ * Expand an AES key using AES-NI if supported and usable or generic code
+ * otherwise.  The expanded key format is compatible between the two cases.  The
+ * outputs are @k->rndkeys (required) and @inv_k->inv_rndkeys (optional).
+ *
+ * We could just always use the generic key expansion code.  AES key expansion
+ * is usually less performance-critical than AES en/decryption.  However,
+ * there's still *some* value in speed here, as well as in non-key-dependent
+ * execution time which AES-NI provides.  So, do use AES-NI to expand AES-128
+ * and AES-256 keys.  (Don't bother with AES-192, as it's almost never used.)
+ */
+static void aes_preparekey_arch(union aes_enckey_arch *k,
+				union aes_invkey_arch *inv_k,
+				const u8 *in_key, int key_len, int nrounds)
+{
+	u32 *rndkeys = k->rndkeys;
+	u32 *inv_rndkeys = inv_k ? inv_k->inv_rndkeys : NULL;
+
+	if (static_branch_likely(&have_aes) && key_len != AES_KEYSIZE_192 &&
+	    irq_fpu_usable()) {
+		kernel_fpu_begin();
+		if (key_len == AES_KEYSIZE_128)
+			aes128_expandkey_aesni(rndkeys, inv_rndkeys, in_key);
+		else
+			aes256_expandkey_aesni(rndkeys, inv_rndkeys, in_key);
+		kernel_fpu_end();
+	} else {
+		aes_expandkey_generic(rndkeys, inv_rndkeys, in_key, key_len);
+	}
+}
+
+static void aes_encrypt_arch(const struct aes_enckey *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (static_branch_likely(&have_aes) && irq_fpu_usable()) {
+		kernel_fpu_begin();
+		aes_encrypt_aesni(key->k.rndkeys, key->nrounds, out, in);
+		kernel_fpu_end();
+	} else {
+		aes_encrypt_generic(key->k.rndkeys, key->nrounds, out, in);
+	}
+}
+
+static void aes_decrypt_arch(const struct aes_key *key,
+			     u8 out[AES_BLOCK_SIZE],
+			     const u8 in[AES_BLOCK_SIZE])
+{
+	if (static_branch_likely(&have_aes) && irq_fpu_usable()) {
+		kernel_fpu_begin();
+		aes_decrypt_aesni(key->inv_k.inv_rndkeys, key->nrounds,
+				  out, in);
+		kernel_fpu_end();
+	} else {
+		aes_decrypt_generic(key->inv_k.inv_rndkeys, key->nrounds,
+				    out, in);
+	}
+}
+
+#define aes_mod_init_arch aes_mod_init_arch
+static void aes_mod_init_arch(void)
+{
+	if (boot_cpu_has(X86_FEATURE_AES))
+		static_branch_enable(&have_aes);
+}
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 18/36] crypto: x86/aes - Remove the superseded AES-NI crypto_cipher
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (16 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 17/36] lib/crypto: x86/aes: Add AES-NI optimization Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 19/36] Bluetooth: SMP: Use new AES library API Eric Biggers
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Remove the "aes-aesni" crypto_cipher algorithm and the code specific to
its implementation.  It is no longer necessary because the AES library
is now optimized with x86 AES-NI, and crypto/aes.c exposes the AES
library via the crypto_cipher API.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/x86/crypto/Kconfig            |  2 -
 arch/x86/crypto/aesni-intel_asm.S  | 25 ------------
 arch/x86/crypto/aesni-intel_glue.c | 62 +-----------------------------
 3 files changed, 1 insertion(+), 88 deletions(-)

diff --git a/arch/x86/crypto/Kconfig b/arch/x86/crypto/Kconfig
index ebb0838eaf30..7fb2319a0916 100644
--- a/arch/x86/crypto/Kconfig
+++ b/arch/x86/crypto/Kconfig
@@ -5,14 +5,12 @@ menu "Accelerated Cryptographic Algorithms for CPU (x86)"
 config CRYPTO_AES_NI_INTEL
 	tristate "Ciphers: AES, modes: ECB, CBC, CTS, CTR, XCTR, XTS, GCM (AES-NI/VAES)"
 	select CRYPTO_AEAD
 	select CRYPTO_LIB_AES
 	select CRYPTO_LIB_GF128MUL
-	select CRYPTO_ALGAPI
 	select CRYPTO_SKCIPHER
 	help
-	  Block cipher: AES cipher algorithms
 	  AEAD cipher: AES with GCM
 	  Length-preserving ciphers: AES with ECB, CBC, CTS, CTR, XCTR, XTS
 
 	  Architecture: x86 (32-bit and 64-bit) using:
 	  - AES-NI (AES new instructions)
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index b37881bb9f15..6abe5e38a6d7 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -434,35 +434,10 @@ SYM_FUNC_START_LOCAL(_aesni_enc4)
 	aesenclast KEY, STATE3
 	aesenclast KEY, STATE4
 	RET
 SYM_FUNC_END(_aesni_enc4)
 
-/*
- * void aesni_dec (const void *ctx, u8 *dst, const u8 *src)
- */
-SYM_FUNC_START(aesni_dec)
-	FRAME_BEGIN
-#ifndef __x86_64__
-	pushl KEYP
-	pushl KLEN
-	movl (FRAME_OFFSET+12)(%esp), KEYP	# ctx
-	movl (FRAME_OFFSET+16)(%esp), OUTP	# dst
-	movl (FRAME_OFFSET+20)(%esp), INP	# src
-#endif
-	mov 480(KEYP), KLEN		# key length
-	add $240, KEYP
-	movups (INP), STATE		# input
-	call _aesni_dec1
-	movups STATE, (OUTP)		#output
-#ifndef __x86_64__
-	popl KLEN
-	popl KEYP
-#endif
-	FRAME_END
-	RET
-SYM_FUNC_END(aesni_dec)
-
 /*
  * _aesni_dec1:		internal ABI
  * input:
  *	KEYP:		key struct pointer
  *	KLEN:		key length
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 48405e02d6e4..453e0e890041 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -58,11 +58,10 @@ static inline void *aes_align_addr(void *addr)
 }
 
 asmlinkage void aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
 			      unsigned int key_len);
 asmlinkage void aesni_enc(const void *ctx, u8 *out, const u8 *in);
-asmlinkage void aesni_dec(const void *ctx, u8 *out, const u8 *in);
 asmlinkage void aesni_ecb_enc(struct crypto_aes_ctx *ctx, u8 *out,
 			      const u8 *in, unsigned int len);
 asmlinkage void aesni_ecb_dec(struct crypto_aes_ctx *ctx, u8 *out,
 			      const u8 *in, unsigned int len);
 asmlinkage void aesni_cbc_enc(struct crypto_aes_ctx *ctx, u8 *out,
@@ -111,43 +110,10 @@ static int aes_set_key_common(struct crypto_aes_ctx *ctx,
 	aesni_set_key(ctx, in_key, key_len);
 	kernel_fpu_end();
 	return 0;
 }
 
-static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
-		       unsigned int key_len)
-{
-	return aes_set_key_common(aes_ctx(crypto_tfm_ctx(tfm)), in_key,
-				  key_len);
-}
-
-static void aesni_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
-
-	if (!crypto_simd_usable()) {
-		aes_encrypt(ctx, dst, src);
-	} else {
-		kernel_fpu_begin();
-		aesni_enc(ctx, dst, src);
-		kernel_fpu_end();
-	}
-}
-
-static void aesni_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
-
-	if (!crypto_simd_usable()) {
-		aes_decrypt(ctx, dst, src);
-	} else {
-		kernel_fpu_begin();
-		aesni_dec(ctx, dst, src);
-		kernel_fpu_end();
-	}
-}
-
 static int aesni_skcipher_setkey(struct crypto_skcipher *tfm, const u8 *key,
 			         unsigned int len)
 {
 	return aes_set_key_common(aes_ctx(crypto_skcipher_ctx(tfm)), key, len);
 }
@@ -542,29 +508,10 @@ static int xts_encrypt_aesni(struct skcipher_request *req)
 static int xts_decrypt_aesni(struct skcipher_request *req)
 {
 	return xts_crypt(req, aesni_xts_encrypt_iv, aesni_xts_decrypt);
 }
 
-static struct crypto_alg aesni_cipher_alg = {
-	.cra_name		= "aes",
-	.cra_driver_name	= "aes-aesni",
-	.cra_priority		= 300,
-	.cra_flags		= CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize		= AES_BLOCK_SIZE,
-	.cra_ctxsize		= CRYPTO_AES_CTX_SIZE,
-	.cra_module		= THIS_MODULE,
-	.cra_u	= {
-		.cipher	= {
-			.cia_min_keysize	= AES_MIN_KEY_SIZE,
-			.cia_max_keysize	= AES_MAX_KEY_SIZE,
-			.cia_setkey		= aes_set_key,
-			.cia_encrypt		= aesni_encrypt,
-			.cia_decrypt		= aesni_decrypt
-		}
-	}
-};
-
 static struct skcipher_alg aesni_skciphers[] = {
 	{
 		.base = {
 			.cra_name		= "ecb(aes)",
 			.cra_driver_name	= "ecb-aes-aesni",
@@ -1687,18 +1634,14 @@ static int __init aesni_init(void)
 	int err;
 
 	if (!x86_match_cpu(aesni_cpu_id))
 		return -ENODEV;
 
-	err = crypto_register_alg(&aesni_cipher_alg);
-	if (err)
-		return err;
-
 	err = crypto_register_skciphers(aesni_skciphers,
 					ARRAY_SIZE(aesni_skciphers));
 	if (err)
-		goto unregister_cipher;
+		return err;
 
 	err = crypto_register_aeads(aes_gcm_algs_aesni,
 				    ARRAY_SIZE(aes_gcm_algs_aesni));
 	if (err)
 		goto unregister_skciphers;
@@ -1714,22 +1657,19 @@ static int __init aesni_init(void)
 	crypto_unregister_aeads(aes_gcm_algs_aesni,
 				ARRAY_SIZE(aes_gcm_algs_aesni));
 unregister_skciphers:
 	crypto_unregister_skciphers(aesni_skciphers,
 				    ARRAY_SIZE(aesni_skciphers));
-unregister_cipher:
-	crypto_unregister_alg(&aesni_cipher_alg);
 	return err;
 }
 
 static void __exit aesni_exit(void)
 {
 	crypto_unregister_aeads(aes_gcm_algs_aesni,
 				ARRAY_SIZE(aes_gcm_algs_aesni));
 	crypto_unregister_skciphers(aesni_skciphers,
 				    ARRAY_SIZE(aesni_skciphers));
-	crypto_unregister_alg(&aesni_cipher_alg);
 	unregister_avx_algs();
 }
 
 module_init(aesni_init);
 module_exit(aesni_exit);
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 19/36] Bluetooth: SMP: Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (17 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 18/36] crypto: x86/aes - Remove the superseded AES-NI crypto_cipher Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05 15:40   ` Andrew Cooper
  2026-01-05  5:12 ` [PATCH 20/36] chelsio: " Eric Biggers
                   ` (16 subsequent siblings)
  35 siblings, 1 reply; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.
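
For reference, the shape of the conversion (a minimal sketch distilled
from the hunk below; variable names are illustrative):

    struct aes_enckey aes;
    int err;

    err = aes_prepareenckey(&aes, key, 16);   /* was aes_expandkey() */
    if (err)
        return err;
    aes_encrypt_new(&aes, out, in);           /* was aes_encrypt() */
    memzero_explicit(&aes, sizeof(aes));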

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 net/bluetooth/smp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index 3a1ce04a7a53..69007e510177 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -372,36 +372,36 @@ static int smp_h7(struct crypto_shash *tfm_cmac, const u8 w[16],
  * s1 and ah.
  */
 
 static int smp_e(const u8 *k, u8 *r)
 {
-	struct crypto_aes_ctx ctx;
+	struct aes_enckey aes;
 	uint8_t tmp[16], data[16];
 	int err;
 
 	SMP_DBG("k %16phN r %16phN", k, r);
 
 	/* The most significant octet of key corresponds to k[0] */
 	swap_buf(k, tmp, 16);
 
-	err = aes_expandkey(&ctx, tmp, 16);
+	err = aes_prepareenckey(&aes, tmp, 16);
 	if (err) {
 		BT_ERR("cipher setkey failed: %d", err);
 		return err;
 	}
 
 	/* Most significant octet of plaintextData corresponds to data[0] */
 	swap_buf(r, data, 16);
 
-	aes_encrypt(&ctx, data, data);
+	aes_encrypt_new(&aes, data, data);
 
 	/* Most significant octet of encryptedData corresponds to data[0] */
 	swap_buf(data, r, 16);
 
 	SMP_DBG("r %16phN", r);
 
-	memzero_explicit(&ctx, sizeof(ctx));
+	memzero_explicit(&aes, sizeof(aes));
 	return err;
 }
 
 static int smp_c1(const u8 k[16],
 		  const u8 r[16], const u8 preq[7], const u8 pres[7], u8 _iat,
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 20/36] chelsio: Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (18 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 19/36] Bluetooth: SMP: Use new AES library API Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 21/36] net: phy: mscc: macsec: " Eric Biggers
                   ` (15 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 .../ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c  | 6 +++---
 .../ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c    | 8 ++++----
 .../net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c   | 6 +++---
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
index 49b57bb5fac1..882d09b2b1a8 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
@@ -168,11 +168,11 @@ static int ch_ipsec_setkey(struct xfrm_state *x,
 {
 	int keylen = (x->aead->alg_key_len + 7) / 8;
 	unsigned char *key = x->aead->alg_key;
 	int ck_size, key_ctx_size = 0;
 	unsigned char ghash_h[AEAD_H_SIZE];
-	struct crypto_aes_ctx aes;
+	struct aes_enckey aes;
 	int ret = 0;
 
 	if (keylen > 3) {
 		keylen -= 4;  /* nonce/salt is present in the last 4 bytes */
 		memcpy(sa_entry->salt, key + keylen, 4);
@@ -202,17 +202,17 @@ static int ch_ipsec_setkey(struct xfrm_state *x,
 						 key_ctx_size >> 4);
 
 	/* Calculate the H = CIPH(K, 0 repeated 16 times).
 	 * It will go in key context
 	 */
-	ret = aes_expandkey(&aes, key, keylen);
+	ret = aes_prepareenckey(&aes, key, keylen);
 	if (ret) {
 		sa_entry->enckey_len = 0;
 		goto out;
 	}
 	memset(ghash_h, 0, AEAD_H_SIZE);
-	aes_encrypt(&aes, ghash_h, ghash_h);
+	aes_encrypt_new(&aes, ghash_h, ghash_h);
 	memzero_explicit(&aes, sizeof(aes));
 
 	memcpy(sa_entry->key + (DIV_ROUND_UP(sa_entry->enckey_len, 16) *
 	       16), ghash_h, AEAD_H_SIZE);
 	sa_entry->kctx_len = ((DIV_ROUND_UP(sa_entry->enckey_len, 16)) << 4) +
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index 4e2096e49684..09c0687f911f 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -74,11 +74,11 @@ static int chcr_ktls_save_keys(struct chcr_ktls_info *tx_info,
 {
 	int ck_size, key_ctx_size, mac_key_size, keylen, ghash_size, ret;
 	unsigned char ghash_h[TLS_CIPHER_AES_GCM_256_TAG_SIZE];
 	struct tls12_crypto_info_aes_gcm_128 *info_128_gcm;
 	struct ktls_key_ctx *kctx = &tx_info->key_ctx;
-	struct crypto_aes_ctx aes_ctx;
+	struct aes_enckey aes;
 	unsigned char *key, *salt;
 
 	switch (crypto_info->cipher_type) {
 	case TLS_CIPHER_AES_GCM_128:
 		info_128_gcm =
@@ -136,17 +136,17 @@ static int chcr_ktls_save_keys(struct chcr_ktls_info *tx_info,
 		       roundup(keylen, 16) + ghash_size;
 	/* Calculate the H = CIPH(K, 0 repeated 16 times).
 	 * It will go in key context
 	 */
 
-	ret = aes_expandkey(&aes_ctx, key, keylen);
+	ret = aes_prepareenckey(&aes, key, keylen);
 	if (ret)
 		goto out;
 
 	memset(ghash_h, 0, ghash_size);
-	aes_encrypt(&aes_ctx, ghash_h, ghash_h);
-	memzero_explicit(&aes_ctx, sizeof(aes_ctx));
+	aes_encrypt_new(&aes, ghash_h, ghash_h);
+	memzero_explicit(&aes, sizeof(aes));
 
 	/* fill the Key context */
 	if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
 		kctx->ctx_hdr = FILL_KEY_CTX_HDR(ck_size,
 						 mac_key_size,
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c
index fab6df21f01c..be2b623957c0 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c
@@ -245,11 +245,11 @@ static int chtls_key_info(struct chtls_sock *csk,
 {
 	unsigned char key[AES_MAX_KEY_SIZE];
 	unsigned char *key_p, *salt;
 	unsigned char ghash_h[AEAD_H_SIZE];
 	int ck_size, key_ctx_size, kctx_mackey_size, salt_size;
-	struct crypto_aes_ctx aes;
+	struct aes_enckey aes;
 	int ret;
 
 	key_ctx_size = sizeof(struct _key_ctx) +
 		       roundup(keylen, 16) + AEAD_H_SIZE;
 
@@ -289,16 +289,16 @@ static int chtls_key_info(struct chtls_sock *csk,
 	}
 
 	/* Calculate the H = CIPH(K, 0 repeated 16 times).
 	 * It will go in key context
 	 */
-	ret = aes_expandkey(&aes, key, keylen);
+	ret = aes_prepareenckey(&aes, key, keylen);
 	if (ret)
 		return ret;
 
 	memset(ghash_h, 0, AEAD_H_SIZE);
-	aes_encrypt(&aes, ghash_h, ghash_h);
+	aes_encrypt_new(&aes, ghash_h, ghash_h);
 	memzero_explicit(&aes, sizeof(aes));
 	csk->tlshws.keylen = key_ctx_size;
 
 	/* Copy the Key context */
 	if (optname == TLS_RX) {
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 21/36] net: phy: mscc: macsec: Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (19 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 20/36] chelsio: " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 22/36] staging: rtl8723bs: core: " Eric Biggers
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 drivers/net/phy/mscc/mscc_macsec.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/mscc/mscc_macsec.c b/drivers/net/phy/mscc/mscc_macsec.c
index 4f39ba63a9a9..bcb7f5a4a8fd 100644
--- a/drivers/net/phy/mscc/mscc_macsec.c
+++ b/drivers/net/phy/mscc/mscc_macsec.c
@@ -502,19 +502,19 @@ static u32 vsc8584_macsec_flow_context_id(struct macsec_flow *flow)
 
 /* Derive the AES key to get a key for the hash authentication */
 static int vsc8584_macsec_derive_key(const u8 *key, u16 key_len, u8 hkey[16])
 {
 	const u8 input[AES_BLOCK_SIZE] = {0};
-	struct crypto_aes_ctx ctx;
+	struct aes_enckey aes;
 	int ret;
 
-	ret = aes_expandkey(&ctx, key, key_len);
+	ret = aes_prepareenckey(&aes, key, key_len);
 	if (ret)
 		return ret;
 
-	aes_encrypt(&ctx, hkey, input);
-	memzero_explicit(&ctx, sizeof(ctx));
+	aes_encrypt_new(&aes, hkey, input);
+	memzero_explicit(&aes, sizeof(aes));
 	return 0;
 }
 
 static int vsc8584_macsec_transformation(struct phy_device *phydev,
 					 struct macsec_flow *flow,
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 22/36] staging: rtl8723bs: core: Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (20 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 21/36] net: phy: mscc: macsec: " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 23/36] crypto: arm/ghash - " Eric Biggers
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 drivers/staging/rtl8723bs/core/rtw_security.c | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/rtl8723bs/core/rtw_security.c b/drivers/staging/rtl8723bs/core/rtw_security.c
index 2f941ffbd465..79825324e70f 100644
--- a/drivers/staging/rtl8723bs/core/rtw_security.c
+++ b/drivers/staging/rtl8723bs/core/rtw_security.c
@@ -635,15 +635,15 @@ u32 rtw_tkip_decrypt(struct adapter *padapter, u8 *precvframe)
 /* Performs a 128 bit AES encrypt with  */
 /* 128 bit data.                        */
 /****************************************/
 static void aes128k128d(u8 *key, u8 *data, u8 *ciphertext)
 {
-	struct crypto_aes_ctx ctx;
+	struct aes_enckey aes;
 
-	aes_expandkey(&ctx, key, 16);
-	aes_encrypt(&ctx, ciphertext, data);
-	memzero_explicit(&ctx, sizeof(ctx));
+	aes_prepareenckey(&aes, key, 16);
+	aes_encrypt_new(&aes, ciphertext, data);
+	memzero_explicit(&aes, sizeof(aes));
 }
 
 /************************************************/
 /* construct_mic_iv()                           */
 /* Builds the MIC IV from header fields and PN  */
@@ -1404,17 +1404,17 @@ static void gf_mulx(u8 *pad)
  * (SP) 800-38B.
  */
 static int omac1_aes_128_vector(u8 *key, size_t num_elem,
 				u8 *addr[], size_t *len, u8 *mac)
 {
-	struct crypto_aes_ctx ctx;
+	struct aes_enckey aes;
 	u8 cbc[AES_BLOCK_SIZE], pad[AES_BLOCK_SIZE];
 	u8 *pos, *end;
 	size_t i, e, left, total_len;
 	int ret;
 
-	ret = aes_expandkey(&ctx, key, 16);
+	ret = aes_prepareenckey(&aes, key, 16);
 	if (ret)
 		return -1;
 	memset(cbc, 0, AES_BLOCK_SIZE);
 
 	total_len = 0;
@@ -1434,16 +1434,16 @@ static int omac1_aes_128_vector(u8 *key, size_t num_elem,
 				pos = addr[e];
 				end = pos + len[e];
 			}
 		}
 		if (left > AES_BLOCK_SIZE)
-			aes_encrypt(&ctx, cbc, cbc);
+			aes_encrypt_new(&aes, cbc, cbc);
 		left -= AES_BLOCK_SIZE;
 	}
 
 	memset(pad, 0, AES_BLOCK_SIZE);
-	aes_encrypt(&ctx, pad, pad);
+	aes_encrypt_new(&aes, pad, pad);
 	gf_mulx(pad);
 
 	if (left || total_len == 0) {
 		for (i = 0; i < left; i++) {
 			cbc[i] ^= *pos++;
@@ -1457,12 +1457,12 @@ static int omac1_aes_128_vector(u8 *key, size_t num_elem,
 		gf_mulx(pad);
 	}
 
 	for (i = 0; i < AES_BLOCK_SIZE; i++)
 		pad[i] ^= cbc[i];
-	aes_encrypt(&ctx, pad, mac);
-	memzero_explicit(&ctx, sizeof(ctx));
+	aes_encrypt_new(&aes, pad, mac);
+	memzero_explicit(&aes, sizeof(aes));
 	return 0;
 }
 
 /**
  * omac1_aes_128 - One-Key CBC MAC (OMAC1) hash with AES-128 (aka AES-CMAC)
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 23/36] crypto: arm/ghash - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (21 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 22/36] staging: rtl8723bs: core: " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 24/36] crypto: arm64/ghash " Eric Biggers
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm/crypto/ghash-ce-glue.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/arm/crypto/ghash-ce-glue.c b/arch/arm/crypto/ghash-ce-glue.c
index a52dcc8c1e33..9ab03bce352d 100644
--- a/arch/arm/crypto/ghash-ce-glue.c
+++ b/arch/arm/crypto/ghash-ce-glue.c
@@ -202,24 +202,28 @@ int pmull_gcm_dec_final(int bytes, u64 dg[], char *tag,
 
 static int gcm_aes_setkey(struct crypto_aead *tfm, const u8 *inkey,
 			  unsigned int keylen)
 {
 	struct gcm_key *ctx = crypto_aead_ctx(tfm);
-	struct crypto_aes_ctx aes_ctx;
+	struct aes_enckey aes_key;
 	be128 h, k;
 	int ret;
 
-	ret = aes_expandkey(&aes_ctx, inkey, keylen);
+	ret = aes_prepareenckey(&aes_key, inkey, keylen);
 	if (ret)
 		return -EINVAL;
 
-	aes_encrypt(&aes_ctx, (u8 *)&k, (u8[AES_BLOCK_SIZE]){});
+	aes_encrypt_new(&aes_key, (u8 *)&k, (u8[AES_BLOCK_SIZE]){});
 
-	memcpy(ctx->rk, aes_ctx.key_enc, sizeof(ctx->rk));
+	/*
+	 * Note: this assumes that the arm implementation of the AES library
+	 * stores the standard round keys in k.rndkeys.
+	 */
+	memcpy(ctx->rk, aes_key.k.rndkeys, sizeof(ctx->rk));
 	ctx->rounds = 6 + keylen / 4;
 
-	memzero_explicit(&aes_ctx, sizeof(aes_ctx));
+	memzero_explicit(&aes_key, sizeof(aes_key));
 
 	ghash_reflect(ctx->h[0], &k);
 
 	h = k;
 	gf128mul_lle(&h, &k);
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 24/36] crypto: arm64/ghash - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (22 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 23/36] crypto: arm/ghash - " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 25/36] crypto: x86/aes-gcm " Eric Biggers
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm64/crypto/ghash-ce-glue.c | 29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index ef249d06c92c..bfd38e485e77 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -38,11 +38,11 @@ struct ghash_key {
 struct arm_ghash_desc_ctx {
 	u64 digest[GHASH_DIGEST_SIZE/sizeof(u64)];
 };
 
 struct gcm_aes_ctx {
-	struct crypto_aes_ctx	aes_key;
+	struct aes_enckey	aes_key;
 	u8			nonce[RFC4106_NONCE_SIZE];
 	struct ghash_key	ghash_key;
 };
 
 asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src,
@@ -184,35 +184,23 @@ static struct shash_alg ghash_alg = {
 	.import			= ghash_import,
 	.descsize		= sizeof(struct arm_ghash_desc_ctx),
 	.statesize		= sizeof(struct ghash_desc_ctx),
 };
 
-static int num_rounds(struct crypto_aes_ctx *ctx)
-{
-	/*
-	 * # of rounds specified by AES:
-	 * 128 bit key		10 rounds
-	 * 192 bit key		12 rounds
-	 * 256 bit key		14 rounds
-	 * => n byte key	=> 6 + (n/4) rounds
-	 */
-	return 6 + ctx->key_length / 4;
-}
-
 static int gcm_aes_setkey(struct crypto_aead *tfm, const u8 *inkey,
 			  unsigned int keylen)
 {
 	struct gcm_aes_ctx *ctx = crypto_aead_ctx(tfm);
 	u8 key[GHASH_BLOCK_SIZE];
 	be128 h;
 	int ret;
 
-	ret = aes_expandkey(&ctx->aes_key, inkey, keylen);
+	ret = aes_prepareenckey(&ctx->aes_key, inkey, keylen);
 	if (ret)
 		return -EINVAL;
 
-	aes_encrypt(&ctx->aes_key, key, (u8[AES_BLOCK_SIZE]){});
+	aes_encrypt_new(&ctx->aes_key, key, (u8[AES_BLOCK_SIZE]){});
 
 	/* needed for the fallback */
 	memcpy(&ctx->ghash_key.k, key, GHASH_BLOCK_SIZE);
 
 	ghash_reflect(ctx->ghash_key.h[0], &ctx->ghash_key.k);
@@ -294,11 +282,10 @@ static void gcm_calculate_auth_mac(struct aead_request *req, u64 dg[], u32 len)
 
 static int gcm_encrypt(struct aead_request *req, char *iv, int assoclen)
 {
 	struct crypto_aead *aead = crypto_aead_reqtfm(req);
 	struct gcm_aes_ctx *ctx = crypto_aead_ctx(aead);
-	int nrounds = num_rounds(&ctx->aes_key);
 	struct skcipher_walk walk;
 	u8 buf[AES_BLOCK_SIZE];
 	u64 dg[2] = {};
 	be128 lengths;
 	u8 *tag;
@@ -329,12 +316,12 @@ static int gcm_encrypt(struct aead_request *req, char *iv, int assoclen)
 			tag = NULL;
 		}
 
 		scoped_ksimd()
 			pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
-					  dg, iv, ctx->aes_key.key_enc, nrounds,
-					  tag);
+					  dg, iv, ctx->aes_key.k.rndkeys,
+					  ctx->aes_key.nrounds, tag);
 
 		if (unlikely(!nbytes))
 			break;
 
 		if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
@@ -357,11 +344,10 @@ static int gcm_encrypt(struct aead_request *req, char *iv, int assoclen)
 static int gcm_decrypt(struct aead_request *req, char *iv, int assoclen)
 {
 	struct crypto_aead *aead = crypto_aead_reqtfm(req);
 	struct gcm_aes_ctx *ctx = crypto_aead_ctx(aead);
 	unsigned int authsize = crypto_aead_authsize(aead);
-	int nrounds = num_rounds(&ctx->aes_key);
 	struct skcipher_walk walk;
 	u8 otag[AES_BLOCK_SIZE];
 	u8 buf[AES_BLOCK_SIZE];
 	u64 dg[2] = {};
 	be128 lengths;
@@ -399,12 +385,13 @@ static int gcm_decrypt(struct aead_request *req, char *iv, int assoclen)
 		}
 
 		scoped_ksimd()
 			ret = pmull_gcm_decrypt(nbytes, dst, src,
 						ctx->ghash_key.h,
-						dg, iv, ctx->aes_key.key_enc,
-						nrounds, tag, otag, authsize);
+						dg, iv, ctx->aes_key.k.rndkeys,
+						ctx->aes_key.nrounds, tag, otag,
+						authsize);
 
 		if (unlikely(!nbytes))
 			break;
 
 		if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 25/36] crypto: x86/aes-gcm - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (23 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 24/36] crypto: arm64/ghash " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:12 ` [PATCH 26/36] crypto: ccp " Eric Biggers
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Since this changes the format of the AES-GCM key structures that are
used by the AES-GCM assembly code, the offsets in the assembly code had
to be updated to match.  Note that the new key structures are smaller,
since the decryption round keys are no longer unnecessarily included.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.
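
For orientation, one possible layout consistent with the new offsets in
the AES-NI variant (field names are illustrative guesses, not taken from
this patch):

    struct aes_gcm_key_aesni {          /* hypothetical sketch */
        u32 aes_key_len;                /* OFFSETOF_AESKEYLEN == 0 */
        u32 aes_nrounds;
        u32 pad[2];                     /* keep round keys 16-byte aligned */
        u32 aes_round_keys[64];         /* OFFSETOF_AESROUNDKEYS == 16 */
        u8 h_powers[8][16];             /* OFFSETOF_H_POWERS == 272 */
        u8 h_powers_xored[64];          /* OFFSETOF_H_POWERS_XORED == 400 */
        u8 h_times_x64[16];             /* OFFSETOF_H_TIMES_X64 == 464 */
    };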

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/x86/crypto/aes-gcm-aesni-x86_64.S | 33 +++++++-------
 arch/x86/crypto/aes-gcm-vaes-avx2.S    | 21 ++++-----
 arch/x86/crypto/aes-gcm-vaes-avx512.S  | 25 ++++++-----
 arch/x86/crypto/aesni-intel_glue.c     | 59 ++++++++++++--------------
 4 files changed, 68 insertions(+), 70 deletions(-)

diff --git a/arch/x86/crypto/aes-gcm-aesni-x86_64.S b/arch/x86/crypto/aes-gcm-aesni-x86_64.S
index 7c8a8a32bd3c..6b2abb76827e 100644
--- a/arch/x86/crypto/aes-gcm-aesni-x86_64.S
+++ b/arch/x86/crypto/aes-gcm-aesni-x86_64.S
@@ -141,14 +141,15 @@
 .Lzeropad_mask:
 	.octa	0xffffffffffffffffffffffffffffffff
 	.octa	0
 
 // Offsets in struct aes_gcm_key_aesni
-#define OFFSETOF_AESKEYLEN	480
-#define OFFSETOF_H_POWERS	496
-#define OFFSETOF_H_POWERS_XORED	624
-#define OFFSETOF_H_TIMES_X64	688
+#define OFFSETOF_AESKEYLEN	0
+#define OFFSETOF_AESROUNDKEYS	16
+#define OFFSETOF_H_POWERS	272
+#define OFFSETOF_H_POWERS_XORED	400
+#define OFFSETOF_H_TIMES_X64	464
 
 .text
 
 // Do a vpclmulqdq, or fall back to a movdqa and a pclmulqdq.  The fallback
 // assumes that all operands are distinct and that any mem operand is aligned.
@@ -503,13 +504,13 @@
 	.set	H_POW1_X64,	%xmm4	// H^1 * x^64
 	.set	GFPOLY,		%xmm5
 
 	// Encrypt an all-zeroes block to get the raw hash subkey.
 	movl		OFFSETOF_AESKEYLEN(KEY), %eax
-	lea		6*16(KEY,%rax,4), RNDKEYLAST_PTR
-	movdqa		(KEY), H_POW1  // Zero-th round key XOR all-zeroes block
-	lea		16(KEY), %rax
+	lea		OFFSETOF_AESROUNDKEYS+6*16(KEY,%rax,4), RNDKEYLAST_PTR
+	movdqa		OFFSETOF_AESROUNDKEYS(KEY), H_POW1
+	lea		OFFSETOF_AESROUNDKEYS+16(KEY), %rax
 1:
 	aesenc		(%rax), H_POW1
 	add		$16, %rax
 	cmp		%rax, RNDKEYLAST_PTR
 	jne		1b
@@ -622,11 +623,11 @@
 // Increment LE_CTR eight times to generate eight little-endian counter blocks,
 // swap each to big-endian, and store them in AESDATA[0-7].  Also XOR them with
 // the zero-th AES round key.  Clobbers TMP0 and TMP1.
 .macro	_ctr_begin_8x
 	movq		.Lone(%rip), TMP0
-	movdqa		(KEY), TMP1		// zero-th round key
+	movdqa		OFFSETOF_AESROUNDKEYS(KEY), TMP1 // zero-th round key
 .irp i, 0,1,2,3,4,5,6,7
 	_vpshufb	BSWAP_MASK, LE_CTR, AESDATA\i
 	pxor		TMP1, AESDATA\i
 	paddd		TMP0, LE_CTR
 .endr
@@ -724,11 +725,11 @@
 	movdqa		.Lbswap_mask(%rip), BSWAP_MASK
 	movdqu		(GHASH_ACC_PTR), GHASH_ACC
 	movdqu		(LE_CTR_PTR), LE_CTR
 
 	movl		OFFSETOF_AESKEYLEN(KEY), AESKEYLEN
-	lea		6*16(KEY,AESKEYLEN64,4), RNDKEYLAST_PTR
+	lea		OFFSETOF_AESROUNDKEYS+6*16(KEY,AESKEYLEN64,4), RNDKEYLAST_PTR
 
 	// If there are at least 8*16 bytes of data, then continue into the main
 	// loop, which processes 8*16 bytes of data per iteration.
 	//
 	// The main loop interleaves AES and GHASH to improve performance on
@@ -743,11 +744,11 @@
 	add		$-8*16, DATALEN
 	jl		.Lcrypt_loop_8x_done\@
 .if \enc
 	// Encrypt the first 8 plaintext blocks.
 	_ctr_begin_8x
-	lea		16(KEY), %rsi
+	lea		OFFSETOF_AESROUNDKEYS+16(KEY), %rsi
 	.p2align 4
 1:
 	movdqa		(%rsi), TMP0
 	_aesenc_8x	TMP0
 	add		$16, %rsi
@@ -765,11 +766,11 @@
 	.p2align 4
 .Lcrypt_loop_8x\@:
 
 	// Generate the next set of 8 counter blocks and start encrypting them.
 	_ctr_begin_8x
-	lea		16(KEY), %rsi
+	lea		OFFSETOF_AESROUNDKEYS+16(KEY), %rsi
 
 	// Do a round of AES, and start the GHASH update of 8 ciphertext blocks
 	// by doing the unreduced multiplication for the first ciphertext block.
 	movdqa		(%rsi), TMP0
 	add		$16, %rsi
@@ -867,11 +868,11 @@
 .Lcrypt_loop_1x\@:
 
 	// Encrypt the next counter block.
 	_vpshufb	BSWAP_MASK, LE_CTR, TMP0
 	paddd		ONE, LE_CTR
-	pxor		(KEY), TMP0
+	pxor		OFFSETOF_AESROUNDKEYS(KEY), TMP0
 	lea		-6*16(RNDKEYLAST_PTR), %rsi	// Reduce code size
 	cmp		$24, AESKEYLEN
 	jl		128f	// AES-128?
 	je		192f	// AES-192?
 	// AES-256
@@ -924,12 +925,12 @@
 
 	// Process a partial block of length 1 <= DATALEN <= 15.
 
 	// Encrypt a counter block for the last time.
 	pshufb		BSWAP_MASK, LE_CTR
-	pxor		(KEY), LE_CTR
-	lea		16(KEY), %rsi
+	pxor		OFFSETOF_AESROUNDKEYS(KEY), LE_CTR
+	lea		OFFSETOF_AESROUNDKEYS+16(KEY), %rsi
 1:
 	aesenc		(%rsi), LE_CTR
 	add		$16, %rsi
 	cmp		%rsi, RNDKEYLAST_PTR
 	jne		1b
@@ -1036,16 +1037,16 @@
 	movdqa		OFFSETOF_H_TIMES_X64(KEY), H_POW1_X64
 	movq		.Lgfpoly(%rip), GFPOLY
 
 	// Make %rax point to the 6th from last AES round key.  (Using signed
 	// byte offsets -7*16 through 6*16 decreases code size.)
-	lea		(KEY,AESKEYLEN64,4), %rax
+	lea		OFFSETOF_AESROUNDKEYS(KEY,AESKEYLEN64,4), %rax
 
 	// AES-encrypt the counter block and also multiply GHASH_ACC by H^1.
 	// Interleave the AES and GHASH instructions to improve performance.
 	pshufb		BSWAP_MASK, %xmm0
-	pxor		(KEY), %xmm0
+	pxor		OFFSETOF_AESROUNDKEYS(KEY), %xmm0
 	cmp		$24, AESKEYLEN
 	jl		128f	// AES-128?
 	je		192f	// AES-192?
 	// AES-256
 	aesenc		-7*16(%rax), %xmm0
diff --git a/arch/x86/crypto/aes-gcm-vaes-avx2.S b/arch/x86/crypto/aes-gcm-vaes-avx2.S
index 93c9504a488f..9cc387957fa9 100644
--- a/arch/x86/crypto/aes-gcm-vaes-avx2.S
+++ b/arch/x86/crypto/aes-gcm-vaes-avx2.S
@@ -120,12 +120,13 @@
 	// The number of AES blocks per vector, as a 128-bit value.
 .Linc_2blocks:
 	.octa	2
 
 // Offsets in struct aes_gcm_key_vaes_avx2
-#define OFFSETOF_AESKEYLEN	480
-#define OFFSETOF_H_POWERS	512
+#define OFFSETOF_AESKEYLEN	0
+#define OFFSETOF_AESROUNDKEYS	16
+#define OFFSETOF_H_POWERS	288
 #define NUM_H_POWERS		8
 #define OFFSETOFEND_H_POWERS    (OFFSETOF_H_POWERS + (NUM_H_POWERS * 16))
 #define OFFSETOF_H_POWERS_XORED	OFFSETOFEND_H_POWERS
 
 .text
@@ -238,13 +239,13 @@ SYM_FUNC_START(aes_gcm_precompute_vaes_avx2)
 	.set	GFPOLY,		%ymm6
 	.set	GFPOLY_XMM,	%xmm6
 
 	// Encrypt an all-zeroes block to get the raw hash subkey.
 	movl		OFFSETOF_AESKEYLEN(KEY), %eax
-	lea		6*16(KEY,%rax,4), RNDKEYLAST_PTR
-	vmovdqu		(KEY), H_CUR_XMM  // Zero-th round key XOR all-zeroes block
-	lea		16(KEY), %rax
+	lea		OFFSETOF_AESROUNDKEYS+6*16(KEY,%rax,4), RNDKEYLAST_PTR
+	vmovdqu		OFFSETOF_AESROUNDKEYS(KEY), H_CUR_XMM
+	lea		OFFSETOF_AESROUNDKEYS+16(KEY), %rax
 1:
 	vaesenc		(%rax), H_CUR_XMM, H_CUR_XMM
 	add		$16, %rax
 	cmp		%rax, RNDKEYLAST_PTR
 	jne		1b
@@ -633,11 +634,11 @@ SYM_FUNC_END(aes_gcm_aad_update_vaes_avx2)
 
 // Generate and encrypt counter blocks in the given AESDATA vectors, excluding
 // the last AES round.  Clobbers %rax and TMP0.
 .macro	_aesenc_loop	vecs:vararg
 	_ctr_begin	\vecs
-	lea		16(KEY), %rax
+	lea		OFFSETOF_AESROUNDKEYS+16(KEY), %rax
 .Laesenc_loop\@:
 	vbroadcasti128	(%rax), TMP0
 	_vaesenc	TMP0, \vecs
 	add		$16, %rax
 	cmp		%rax, RNDKEYLAST_PTR
@@ -766,12 +767,12 @@ SYM_FUNC_END(aes_gcm_aad_update_vaes_avx2)
 	movl		OFFSETOF_AESKEYLEN(KEY), AESKEYLEN
 
 	// Make RNDKEYLAST_PTR point to the last AES round key.  This is the
 	// round key with index 10, 12, or 14 for AES-128, AES-192, or AES-256
 	// respectively.  Then load the zero-th and last round keys.
-	lea		6*16(KEY,AESKEYLEN64,4), RNDKEYLAST_PTR
-	vbroadcasti128	(KEY), RNDKEY0
+	lea		OFFSETOF_AESROUNDKEYS+6*16(KEY,AESKEYLEN64,4), RNDKEYLAST_PTR
+	vbroadcasti128	OFFSETOF_AESROUNDKEYS(KEY), RNDKEY0
 	vbroadcasti128	(RNDKEYLAST_PTR), RNDKEYLAST
 
 	// Finish initializing LE_CTR by adding 1 to the second block.
 	vpaddd		.Lctr_pattern(%rip), LE_CTR, LE_CTR
 
@@ -1067,16 +1068,16 @@ SYM_FUNC_END(aes_gcm_aad_update_vaes_avx2)
 .if !\enc
 	movl		8(%rsp), TAGLEN
 .endif
 
 	// Make %rax point to the last AES round key for the chosen AES variant.
-	lea		6*16(KEY,AESKEYLEN64,4), %rax
+	lea		OFFSETOF_AESROUNDKEYS+6*16(KEY,AESKEYLEN64,4), %rax
 
 	// Start the AES encryption of the counter block by swapping the counter
 	// block to big-endian and XOR-ing it with the zero-th AES round key.
 	vpshufb		BSWAP_MASK, LE_CTR, %xmm0
-	vpxor		(KEY), %xmm0, %xmm0
+	vpxor		OFFSETOF_AESROUNDKEYS(KEY), %xmm0, %xmm0
 
 	// Complete the AES encryption and multiply GHASH_ACC by H^1.
 	// Interleave the AES and GHASH instructions to improve performance.
 	cmp		$24, AESKEYLEN
 	jl		128f	// AES-128?
diff --git a/arch/x86/crypto/aes-gcm-vaes-avx512.S b/arch/x86/crypto/aes-gcm-vaes-avx512.S
index 06b71314d65c..516747db4659 100644
--- a/arch/x86/crypto/aes-gcm-vaes-avx512.S
+++ b/arch/x86/crypto/aes-gcm-vaes-avx512.S
@@ -84,14 +84,17 @@
 // Number of powers of the hash key stored in the key struct.  The powers are
 // stored from highest (H^NUM_H_POWERS) to lowest (H^1).
 #define NUM_H_POWERS		16
 
 // Offset to AES key length (in bytes) in the key struct
-#define OFFSETOF_AESKEYLEN	480
+#define OFFSETOF_AESKEYLEN	0
+
+// Offset to AES round keys in the key struct
+#define OFFSETOF_AESROUNDKEYS	16
 
 // Offset to start of hash key powers array in the key struct
-#define OFFSETOF_H_POWERS	512
+#define OFFSETOF_H_POWERS	320
 
 // Offset to end of hash key powers array in the key struct.
 //
 // This is immediately followed by three zeroized padding blocks, which are
 // included so that partial vectors can be handled more easily.  E.g. if two
@@ -299,13 +302,13 @@ SYM_FUNC_START(aes_gcm_precompute_vaes_avx512)
 	// Get pointer to lowest set of key powers (located at end of array).
 	lea		OFFSETOFEND_H_POWERS-64(KEY), POWERS_PTR
 
 	// Encrypt an all-zeroes block to get the raw hash subkey.
 	movl		OFFSETOF_AESKEYLEN(KEY), %eax
-	lea		6*16(KEY,%rax,4), RNDKEYLAST_PTR
-	vmovdqu		(KEY), %xmm0  // Zero-th round key XOR all-zeroes block
-	add		$16, KEY
+	lea		OFFSETOF_AESROUNDKEYS+6*16(KEY,%rax,4), RNDKEYLAST_PTR
+	vmovdqu		OFFSETOF_AESROUNDKEYS(KEY), %xmm0
+	add		$OFFSETOF_AESROUNDKEYS+16, KEY
 1:
 	vaesenc		(KEY), %xmm0, %xmm0
 	add		$16, KEY
 	cmp		KEY, RNDKEYLAST_PTR
 	jne		1b
@@ -788,12 +791,12 @@ SYM_FUNC_END(aes_gcm_aad_update_vaes_avx512)
 	movl		OFFSETOF_AESKEYLEN(KEY), AESKEYLEN
 
 	// Make RNDKEYLAST_PTR point to the last AES round key.  This is the
 	// round key with index 10, 12, or 14 for AES-128, AES-192, or AES-256
 	// respectively.  Then load the zero-th and last round keys.
-	lea		6*16(KEY,AESKEYLEN64,4), RNDKEYLAST_PTR
-	vbroadcasti32x4	(KEY), RNDKEY0
+	lea		OFFSETOF_AESROUNDKEYS+6*16(KEY,AESKEYLEN64,4), RNDKEYLAST_PTR
+	vbroadcasti32x4	OFFSETOF_AESROUNDKEYS(KEY), RNDKEY0
 	vbroadcasti32x4	(RNDKEYLAST_PTR), RNDKEYLAST
 
 	// Finish initializing LE_CTR by adding [0, 1, ...] to its low words.
 	vpaddd		.Lctr_pattern(%rip), LE_CTR, LE_CTR
 
@@ -832,11 +835,11 @@ SYM_FUNC_END(aes_gcm_aad_update_vaes_avx512)
 
 .if \enc
 	// Encrypt the first 4 vectors of plaintext blocks.  Leave the resulting
 	// ciphertext in GHASHDATA[0-3] for GHASH.
 	_ctr_begin_4x
-	lea		16(KEY), %rax
+	lea		OFFSETOF_AESROUNDKEYS+16(KEY), %rax
 1:
 	vbroadcasti32x4	(%rax), RNDKEY
 	_vaesenc_4x	RNDKEY
 	add		$16, %rax
 	cmp		%rax, RNDKEYLAST_PTR
@@ -955,11 +958,11 @@ SYM_FUNC_END(aes_gcm_aad_update_vaes_avx512)
 
 	// Encrypt a vector of counter blocks.  This does not need to be masked.
 	vpshufb		BSWAP_MASK, LE_CTR, %zmm0
 	vpaddd		LE_CTR_INC, LE_CTR, LE_CTR
 	vpxord		RNDKEY0, %zmm0, %zmm0
-	lea		16(KEY), %rax
+	lea		OFFSETOF_AESROUNDKEYS+16(KEY), %rax
 1:
 	vbroadcasti32x4	(%rax), RNDKEY
 	vaesenc		RNDKEY, %zmm0, %zmm0
 	add		$16, %rax
 	cmp		%rax, RNDKEYLAST_PTR
@@ -1085,16 +1088,16 @@ SYM_FUNC_END(aes_gcm_aad_update_vaes_avx512)
 	bzhi		TAGLEN, %eax, %eax
 	kmovd		%eax, %k1
 .endif
 
 	// Make %rax point to the last AES round key for the chosen AES variant.
-	lea		6*16(KEY,AESKEYLEN64,4), %rax
+	lea		OFFSETOF_AESROUNDKEYS+6*16(KEY,AESKEYLEN64,4), %rax
 
 	// Start the AES encryption of the counter block by swapping the counter
 	// block to big-endian and XOR-ing it with the zero-th AES round key.
 	vpshufb		BSWAP_MASK, LE_CTR, %xmm0
-	vpxor		(KEY), %xmm0, %xmm0
+	vpxor		OFFSETOF_AESROUNDKEYS(KEY), %xmm0, %xmm0
 
 	// Complete the AES encryption and multiply GHASH_ACC by H^1.
 	// Interleave the AES and GHASH instructions to improve performance.
 	cmp		$24, AESKEYLEN
 	jl		128f	// AES-128?
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 453e0e890041..5633e50e46a0 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -778,24 +778,23 @@ DEFINE_AVX_SKCIPHER_ALGS(vaes_avx2, "vaes-avx2", 600);
 DEFINE_AVX_SKCIPHER_ALGS(vaes_avx512, "vaes-avx512", 800);
 
 /* The common part of the x86_64 AES-GCM key struct */
 struct aes_gcm_key {
 	/* Expanded AES key and the AES key length in bytes */
-	struct crypto_aes_ctx aes_key;
+	struct aes_enckey aes_key;
 
 	/* RFC4106 nonce (used only by the rfc4106 algorithms) */
 	u32 rfc4106_nonce;
 };
 
 /* Key struct used by the AES-NI implementations of AES-GCM */
 struct aes_gcm_key_aesni {
 	/*
-	 * Common part of the key.  The assembly code requires 16-byte alignment
-	 * for the round keys; we get this by them being located at the start of
-	 * the struct and the whole struct being 16-byte aligned.
+	 * Common part of the key.  16-byte alignment is required by the
+	 * assembly code.
 	 */
-	struct aes_gcm_key base;
+	struct aes_gcm_key base __aligned(16);
 
 	/*
 	 * Powers of the hash key H^8 through H^1.  These are 128-bit values.
 	 * They all have an extra factor of x^-1 and are byte-reversed.  16-byte
 	 * alignment is required by the assembly code.
@@ -822,14 +821,13 @@ struct aes_gcm_key_aesni {
 
 /* Key struct used by the VAES + AVX2 implementation of AES-GCM */
 struct aes_gcm_key_vaes_avx2 {
 	/*
 	 * Common part of the key.  The assembly code prefers 16-byte alignment
-	 * for the round keys; we get this by them being located at the start of
-	 * the struct and the whole struct being 32-byte aligned.
+	 * for this.
 	 */
-	struct aes_gcm_key base;
+	struct aes_gcm_key base __aligned(16);
 
 	/*
 	 * Powers of the hash key H^8 through H^1.  These are 128-bit values.
 	 * They all have an extra factor of x^-1 and are byte-reversed.
 	 * The assembly code prefers 32-byte alignment for this.
@@ -852,14 +850,13 @@ struct aes_gcm_key_vaes_avx2 {
 
 /* Key struct used by the VAES + AVX512 implementation of AES-GCM */
 struct aes_gcm_key_vaes_avx512 {
 	/*
 	 * Common part of the key.  The assembly code prefers 16-byte alignment
-	 * for the round keys; we get this by them being located at the start of
-	 * the struct and the whole struct being 64-byte aligned.
+	 * for this.
 	 */
-	struct aes_gcm_key base;
+	struct aes_gcm_key base __aligned(16);
 
 	/*
 	 * Powers of the hash key H^16 through H^1.  These are 128-bit values.
 	 * They all have an extra factor of x^-1 and are byte-reversed.  This
 	 * array is aligned to a 64-byte boundary to make it naturally aligned
@@ -1180,30 +1177,30 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *raw_key,
 		keylen -= 4;
 		key->rfc4106_nonce = get_unaligned_be32(raw_key + keylen);
 	}
 
 	/* The assembly code assumes the following offsets. */
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, base.aes_key.key_enc) != 0);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, base.aes_key.key_length) != 480);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_powers) != 496);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_powers_xored) != 624);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_times_x64) != 688);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx2, base.aes_key.key_enc) != 0);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx2, base.aes_key.key_length) != 480);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx2, h_powers) != 512);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx2, h_powers_xored) != 640);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx512, base.aes_key.key_enc) != 0);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx512, base.aes_key.key_length) != 480);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx512, h_powers) != 512);
-	BUILD_BUG_ON(offsetof(struct aes_gcm_key_vaes_avx512, padding) != 768);
+	static_assert(offsetof(struct aes_gcm_key_aesni, base.aes_key.len) == 0);
+	static_assert(offsetof(struct aes_gcm_key_aesni, base.aes_key.k.rndkeys) == 16);
+	static_assert(offsetof(struct aes_gcm_key_aesni, h_powers) == 272);
+	static_assert(offsetof(struct aes_gcm_key_aesni, h_powers_xored) == 400);
+	static_assert(offsetof(struct aes_gcm_key_aesni, h_times_x64) == 464);
+	static_assert(offsetof(struct aes_gcm_key_vaes_avx2, base.aes_key.len) == 0);
+	static_assert(offsetof(struct aes_gcm_key_vaes_avx2, base.aes_key.k.rndkeys) == 16);
+	static_assert(offsetof(struct aes_gcm_key_vaes_avx2, h_powers) == 288);
+	static_assert(offsetof(struct aes_gcm_key_vaes_avx2, h_powers_xored) == 416);
+	static_assert(offsetof(struct aes_gcm_key_vaes_avx512, base.aes_key.len) == 0);
+	static_assert(offsetof(struct aes_gcm_key_vaes_avx512, base.aes_key.k.rndkeys) == 16);
+	static_assert(offsetof(struct aes_gcm_key_vaes_avx512, h_powers) == 320);
+	static_assert(offsetof(struct aes_gcm_key_vaes_avx512, padding) == 576);
+
+	err = aes_prepareenckey(&key->aes_key, raw_key, keylen);
+	if (err)
+		return err;
 
 	if (likely(crypto_simd_usable())) {
-		err = aes_check_keylen(keylen);
-		if (err)
-			return err;
 		kernel_fpu_begin();
-		aesni_set_key(&key->aes_key, raw_key, keylen);
 		aes_gcm_precompute(key, flags);
 		kernel_fpu_end();
 	} else {
 		static const u8 x_to_the_minus1[16] __aligned(__alignof__(be128)) = {
 			[0] = 0xc2, [15] = 1
@@ -1213,16 +1210,12 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *raw_key,
 		};
 		be128 h1 = {};
 		be128 h;
 		int i;
 
-		err = aes_expandkey(&key->aes_key, raw_key, keylen);
-		if (err)
-			return err;
-
 		/* Encrypt the all-zeroes block to get the hash key H^1 */
-		aes_encrypt(&key->aes_key, (u8 *)&h1, (u8 *)&h1);
+		aes_encrypt_new(&key->aes_key, (u8 *)&h1, (u8 *)&h1);
 
 		/* Compute H^1 * x^-1 */
 		h = h1;
 		gf128mul_lle(&h, (const be128 *)x_to_the_minus1);
 
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 26/36] crypto: ccp - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (24 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 25/36] crypto: x86/aes-gcm " Eric Biggers
@ 2026-01-05  5:12 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 27/36] crypto: chelsio " Eric Biggers
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:12 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.
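
For reference, a minimal sketch of the encryption-only pattern this
driver now follows (error handling abbreviated; the variable names are
placeholders, not the driver's own):

	struct aes_enckey aes;
	u8 block[AES_BLOCK_SIZE] = {};
	int err;

	err = aes_prepareenckey(&aes, raw_key, key_len);
	if (err)
		return err;
	aes_encrypt_new(&aes, block, block);	/* encrypt in place */
	memzero_explicit(&aes, sizeof(aes));	/* wipe key material */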

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 drivers/crypto/ccp/ccp-crypto-aes-cmac.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-aes-cmac.c b/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
index d8426bdf3190..ed5b0f8609f1 100644
--- a/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
+++ b/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
@@ -259,11 +259,11 @@ static int ccp_aes_cmac_setkey(struct crypto_ahash *tfm, const u8 *key,
 	struct ccp_ctx *ctx = crypto_ahash_ctx_dma(tfm);
 	struct ccp_crypto_ahash_alg *alg =
 		ccp_crypto_ahash_alg(crypto_ahash_tfm(tfm));
 	u64 k0_hi, k0_lo, k1_hi, k1_lo, k2_hi, k2_lo;
 	u64 rb_hi = 0x00, rb_lo = 0x87;
-	struct crypto_aes_ctx aes;
+	struct aes_enckey aes;
 	__be64 *gk;
 	int ret;
 
 	switch (key_len) {
 	case AES_KEYSIZE_128:
@@ -282,17 +282,17 @@ static int ccp_aes_cmac_setkey(struct crypto_ahash *tfm, const u8 *key,
 
 	/* Set to zero until complete */
 	ctx->u.aes.key_len = 0;
 
 	/* Set the key for the AES cipher used to generate the keys */
-	ret = aes_expandkey(&aes, key, key_len);
+	ret = aes_prepareenckey(&aes, key, key_len);
 	if (ret)
 		return ret;
 
 	/* Encrypt a block of zeroes - use key area in context */
 	memset(ctx->u.aes.key, 0, sizeof(ctx->u.aes.key));
-	aes_encrypt(&aes, ctx->u.aes.key, ctx->u.aes.key);
+	aes_encrypt_new(&aes, ctx->u.aes.key, ctx->u.aes.key);
 	memzero_explicit(&aes, sizeof(aes));
 
 	/* Generate K1 and K2 */
 	k0_hi = be64_to_cpu(*((__be64 *)ctx->u.aes.key));
 	k0_lo = be64_to_cpu(*((__be64 *)ctx->u.aes.key + 1));
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 27/36] crypto: chelsio - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (25 preceding siblings ...)
  2026-01-05  5:12 ` [PATCH 26/36] crypto: ccp " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 28/36] crypto: crypto4xx " Eric Biggers
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_key and struct
aes_enckey).  In encryption-only use cases, this eliminates the
unnecessary computation and caching of the decryption round keys.  The
new AES en/decryption functions are also much faster and use AES
instructions when supported by the CPU.

Note: aes_encrypt_new() and aes_decrypt_new() will be renamed to
aes_encrypt() and aes_decrypt(), respectively, once all callers of the
old aes_encrypt() and aes_decrypt() have been updated.
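
For reference, a sketch of the both-directions pattern used in the XTS
tweak path below (placeholder names, error handling trimmed): when the
inverse cipher is needed too, the full key type is prepared instead:

	struct aes_key aes;
	int err;

	err = aes_preparekey(&aes, raw_key, key_len);
	if (err)
		return err;
	aes_encrypt_new(&aes, iv, iv);		/* forward direction */
	/* ... */
	aes_decrypt_new(&aes, iv, iv);		/* inverse direction */
	memzero_explicit(&aes, sizeof(aes));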

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 drivers/crypto/chelsio/chcr_algo.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
index 22cbc343198a..b6b97088dfc5 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -1026,11 +1026,11 @@ static int chcr_update_tweak(struct skcipher_request *req, u8 *iv,
 			     u32 isfinal)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct ablk_ctx *ablkctx = ABLK_CTX(c_ctx(tfm));
 	struct chcr_skcipher_req_ctx *reqctx = skcipher_request_ctx(req);
-	struct crypto_aes_ctx aes;
+	struct aes_key aes;
 	int ret, i;
 	u8 *key;
 	unsigned int keylen;
 	int round = reqctx->last_req_len / AES_BLOCK_SIZE;
 	int round8 = round / 8;
@@ -1042,24 +1042,24 @@ static int chcr_update_tweak(struct skcipher_request *req, u8 *iv,
 	/* For a 192 bit key remove the padded zeroes which was
 	 * added in chcr_xts_setkey
 	 */
 	if (KEY_CONTEXT_CK_SIZE_G(ntohl(ablkctx->key_ctx_hdr))
 			== CHCR_KEYCTX_CIPHER_KEY_SIZE_192)
-		ret = aes_expandkey(&aes, key, keylen - 8);
+		ret = aes_preparekey(&aes, key, keylen - 8);
 	else
-		ret = aes_expandkey(&aes, key, keylen);
+		ret = aes_preparekey(&aes, key, keylen);
 	if (ret)
 		return ret;
-	aes_encrypt(&aes, iv, iv);
+	aes_encrypt_new(&aes, iv, iv);
 	for (i = 0; i < round8; i++)
 		gf128mul_x8_ble((le128 *)iv, (le128 *)iv);
 
 	for (i = 0; i < (round % 8); i++)
 		gf128mul_x_ble((le128 *)iv, (le128 *)iv);
 
 	if (!isfinal)
-		aes_decrypt(&aes, iv, iv);
+		aes_decrypt_new(&aes, iv, iv);
 
 	memzero_explicit(&aes, sizeof(aes));
 	return 0;
 }
 
@@ -3404,11 +3404,11 @@ static int chcr_gcm_setkey(struct crypto_aead *aead, const u8 *key,
 {
 	struct chcr_aead_ctx *aeadctx = AEAD_CTX(a_ctx(aead));
 	struct chcr_gcm_ctx *gctx = GCM_CTX(aeadctx);
 	unsigned int ck_size;
 	int ret = 0, key_ctx_size = 0;
-	struct crypto_aes_ctx aes;
+	struct aes_enckey aes;
 
 	aeadctx->enckey_len = 0;
 	crypto_aead_clear_flags(aeadctx->sw_cipher, CRYPTO_TFM_REQ_MASK);
 	crypto_aead_set_flags(aeadctx->sw_cipher, crypto_aead_get_flags(aead)
 			      & CRYPTO_TFM_REQ_MASK);
@@ -3442,17 +3442,17 @@ static int chcr_gcm_setkey(struct crypto_aead *aead, const u8 *key,
 						0, 0,
 						key_ctx_size >> 4);
 	/* Calculate the H = CIPH(K, 0 repeated 16 times).
 	 * It will go in key context
 	 */
-	ret = aes_expandkey(&aes, key, keylen);
+	ret = aes_prepareenckey(&aes, key, keylen);
 	if (ret) {
 		aeadctx->enckey_len = 0;
 		goto out;
 	}
 	memset(gctx->ghash_h, 0, AEAD_H_SIZE);
-	aes_encrypt(&aes, gctx->ghash_h, gctx->ghash_h);
+	aes_encrypt_new(&aes, gctx->ghash_h, gctx->ghash_h);
 	memzero_explicit(&aes, sizeof(aes));
 
 out:
 	return ret;
 }
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 28/36] crypto: crypto4xx - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (26 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 27/36] crypto: chelsio " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 29/36] crypto: drbg " Eric Biggers
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 drivers/crypto/amcc/crypto4xx_alg.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/amcc/crypto4xx_alg.c b/drivers/crypto/amcc/crypto4xx_alg.c
index 38e8a61e9166..1947708334ef 100644
--- a/drivers/crypto/amcc/crypto4xx_alg.c
+++ b/drivers/crypto/amcc/crypto4xx_alg.c
@@ -489,23 +489,23 @@ static int crypto4xx_aes_gcm_validate_keylen(unsigned int keylen)
 }
 
 static int crypto4xx_compute_gcm_hash_key_sw(__le32 *hash_start, const u8 *key,
 					     unsigned int keylen)
 {
-	struct crypto_aes_ctx ctx;
+	struct aes_enckey aes;
 	uint8_t src[16] = { 0 };
 	int rc;
 
-	rc = aes_expandkey(&ctx, key, keylen);
+	rc = aes_prepareenckey(&aes, key, keylen);
 	if (rc) {
-		pr_err("aes_expandkey() failed: %d\n", rc);
+		pr_err("aes_prepareenckey() failed: %d\n", rc);
 		return rc;
 	}
 
-	aes_encrypt(&ctx, src, src);
+	aes_encrypt_new(&aes, src, src);
 	crypto4xx_memcpy_to_le32(hash_start, src, 16);
-	memzero_explicit(&ctx, sizeof(ctx));
+	memzero_explicit(&aes, sizeof(aes));
 	return 0;
 }
 
 int crypto4xx_setkey_aes_gcm(struct crypto_aead *cipher,
 			     const u8 *key, unsigned int keylen)
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 29/36] crypto: drbg - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (27 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 28/36] crypto: crypto4xx " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 30/36] crypto: inside-secure " Eric Biggers
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 crypto/df_sp80090a.c                | 30 ++++++++++-------------------
 crypto/drbg.c                       | 12 ++++++------
 drivers/crypto/xilinx/xilinx-trng.c |  8 ++++----
 include/crypto/df_sp80090a.h        |  2 +-
 4 files changed, 21 insertions(+), 31 deletions(-)

diff --git a/crypto/df_sp80090a.c b/crypto/df_sp80090a.c
index dc63b31a93fc..5686d37ebba2 100644
--- a/crypto/df_sp80090a.c
+++ b/crypto/df_sp80090a.c
@@ -12,31 +12,21 @@
 #include <linux/string.h>
 #include <crypto/aes.h>
 #include <crypto/df_sp80090a.h>
 #include <crypto/internal/drbg.h>
 
-static void drbg_kcapi_symsetkey(struct crypto_aes_ctx *aesctx,
-				 const unsigned char *key,
-				 u8 keylen);
-static void drbg_kcapi_symsetkey(struct crypto_aes_ctx *aesctx,
-				 const unsigned char *key, u8 keylen)
-{
-	aes_expandkey(aesctx, key, keylen);
-}
-
-static void drbg_kcapi_sym(struct crypto_aes_ctx *aesctx,
-			   unsigned char *outval,
+static void drbg_kcapi_sym(struct aes_enckey *aeskey, unsigned char *outval,
 			   const struct drbg_string *in, u8 blocklen_bytes)
 {
 	/* there is only component in *in */
 	BUG_ON(in->len < blocklen_bytes);
-	aes_encrypt(aesctx, outval, in->buf);
+	aes_encrypt_new(aeskey, outval, in->buf);
 }
 
 /* BCC function for CTR DRBG as defined in 10.4.3 */
 
-static void drbg_ctr_bcc(struct crypto_aes_ctx *aesctx,
+static void drbg_ctr_bcc(struct aes_enckey *aeskey,
 			 unsigned char *out, const unsigned char *key,
 			 struct list_head *in,
 			 u8 blocklen_bytes,
 			 u8 keylen)
 {
@@ -45,30 +35,30 @@ static void drbg_ctr_bcc(struct crypto_aes_ctx *aesctx,
 	short cnt = 0;
 
 	drbg_string_fill(&data, out, blocklen_bytes);
 
 	/* 10.4.3 step 2 / 4 */
-	drbg_kcapi_symsetkey(aesctx, key, keylen);
+	aes_prepareenckey(aeskey, key, keylen);
 	list_for_each_entry(curr, in, list) {
 		const unsigned char *pos = curr->buf;
 		size_t len = curr->len;
 		/* 10.4.3 step 4.1 */
 		while (len) {
 			/* 10.4.3 step 4.2 */
 			if (blocklen_bytes == cnt) {
 				cnt = 0;
-				drbg_kcapi_sym(aesctx, out, &data, blocklen_bytes);
+				drbg_kcapi_sym(aeskey, out, &data, blocklen_bytes);
 			}
 			out[cnt] ^= *pos;
 			pos++;
 			cnt++;
 			len--;
 		}
 	}
 	/* 10.4.3 step 4.2 for last block */
 	if (cnt)
-		drbg_kcapi_sym(aesctx, out, &data, blocklen_bytes);
+		drbg_kcapi_sym(aeskey, out, &data, blocklen_bytes);
 }
 
 /*
  * scratchpad usage: drbg_ctr_update is interlinked with crypto_drbg_ctr_df
  * (and drbg_ctr_bcc, but this function does not need any temporary buffers),
@@ -108,11 +98,11 @@ static void drbg_ctr_bcc(struct crypto_aes_ctx *aesctx,
  *			possibilities.
  * refer to crypto_drbg_ctr_df_datalen() to get required length
  */
 
 /* Derivation Function for CTR DRBG as defined in 10.4.2 */
-int crypto_drbg_ctr_df(struct crypto_aes_ctx *aesctx,
+int crypto_drbg_ctr_df(struct aes_enckey *aeskey,
 		       unsigned char *df_data, size_t bytes_to_return,
 		       struct list_head *seedlist,
 		       u8 blocklen_bytes,
 		       u8 statelen)
 {
@@ -185,11 +175,11 @@ int crypto_drbg_ctr_df(struct crypto_aes_ctx *aesctx,
 		 * holds zeros after allocation -- even the increment of i
 		 * is irrelevant as the increment remains within length of i
 		 */
 		drbg_cpu_to_be32(i, iv);
 		/* 10.4.2 step 9.2 -- BCC and concatenation with temp */
-		drbg_ctr_bcc(aesctx, temp + templen, K, &bcc_list,
+		drbg_ctr_bcc(aeskey, temp + templen, K, &bcc_list,
 			     blocklen_bytes, keylen);
 		/* 10.4.2 step 9.3 */
 		i++;
 		templen += blocklen_bytes;
 	}
@@ -199,19 +189,19 @@ int crypto_drbg_ctr_df(struct crypto_aes_ctx *aesctx,
 	drbg_string_fill(&cipherin, X, blocklen_bytes);
 
 	/* 10.4.2 step 12: overwriting of outval is implemented in next step */
 
 	/* 10.4.2 step 13 */
-	drbg_kcapi_symsetkey(aesctx, temp, keylen);
+	aes_prepareenckey(aeskey, temp, keylen);
 	while (generated_len < bytes_to_return) {
 		short blocklen = 0;
 		/*
 		 * 10.4.2 step 13.1: the truncation of the key length is
 		 * implicit as the key is only drbg_blocklen in size based on
 		 * the implementation of the cipher function callback
 		 */
-		drbg_kcapi_sym(aesctx, X, &cipherin, blocklen_bytes);
+		drbg_kcapi_sym(aeskey, X, &cipherin, blocklen_bytes);
 		blocklen = (blocklen_bytes <
 				(bytes_to_return - generated_len)) ?
 			    blocklen_bytes :
 				(bytes_to_return - generated_len);
 		/* 10.4.2 step 13.2 and 14 */
diff --git a/crypto/drbg.c b/crypto/drbg.c
index 1d433dae9955..85cc4549bd58 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -1503,13 +1503,13 @@ static int drbg_kcapi_hash(struct drbg_state *drbg, unsigned char *outval,
 #endif /* (CONFIG_CRYPTO_DRBG_HASH || CONFIG_CRYPTO_DRBG_HMAC) */
 
 #ifdef CONFIG_CRYPTO_DRBG_CTR
 static int drbg_fini_sym_kernel(struct drbg_state *drbg)
 {
-	struct crypto_aes_ctx *aesctx =	(struct crypto_aes_ctx *)drbg->priv_data;
+	struct aes_enckey *aeskey = drbg->priv_data;
 
-	kfree(aesctx);
+	kfree(aeskey);
 	drbg->priv_data = NULL;
 
 	if (drbg->ctr_handle)
 		crypto_free_skcipher(drbg->ctr_handle);
 	drbg->ctr_handle = NULL;
@@ -1524,20 +1524,20 @@ static int drbg_fini_sym_kernel(struct drbg_state *drbg)
 	return 0;
 }
 
 static int drbg_init_sym_kernel(struct drbg_state *drbg)
 {
-	struct crypto_aes_ctx *aesctx;
+	struct aes_enckey *aeskey;
 	struct crypto_skcipher *sk_tfm;
 	struct skcipher_request *req;
 	unsigned int alignmask;
 	char ctr_name[CRYPTO_MAX_ALG_NAME];
 
-	aesctx = kzalloc(sizeof(*aesctx), GFP_KERNEL);
-	if (!aesctx)
+	aeskey = kzalloc(sizeof(*aeskey), GFP_KERNEL);
+	if (!aeskey)
 		return -ENOMEM;
-	drbg->priv_data = aesctx;
+	drbg->priv_data = aeskey;
 
 	if (snprintf(ctr_name, CRYPTO_MAX_ALG_NAME, "ctr(%s)",
 	    drbg->core->backend_cra_name) >= CRYPTO_MAX_ALG_NAME) {
 		drbg_fini_sym_kernel(drbg);
 		return -EINVAL;
diff --git a/drivers/crypto/xilinx/xilinx-trng.c b/drivers/crypto/xilinx/xilinx-trng.c
index db0fbb28ff32..5276ac2d82bb 100644
--- a/drivers/crypto/xilinx/xilinx-trng.c
+++ b/drivers/crypto/xilinx/xilinx-trng.c
@@ -58,11 +58,11 @@
 
 struct xilinx_rng {
 	void __iomem *rng_base;
 	struct device *dev;
 	unsigned char *scratchpadbuf;
-	struct crypto_aes_ctx *aesctx;
+	struct aes_enckey *aeskey;
 	struct mutex lock;	/* Protect access to TRNG device */
 	struct hwrng trng;
 };
 
 struct xilinx_rng_ctx {
@@ -196,11 +196,11 @@ static int xtrng_reseed_internal(struct xilinx_rng *rng)
 
 	/* collect random data to use it as entropy (input for DF) */
 	ret = xtrng_collect_random_data(rng, entropy, TRNG_SEED_LEN_BYTES, true);
 	if (ret != TRNG_SEED_LEN_BYTES)
 		return -EINVAL;
-	ret = crypto_drbg_ctr_df(rng->aesctx, rng->scratchpadbuf,
+	ret = crypto_drbg_ctr_df(rng->aeskey, rng->scratchpadbuf,
 				 TRNG_SEED_LEN_BYTES, &seedlist, AES_BLOCK_SIZE,
 				 TRNG_SEED_LEN_BYTES);
 	if (ret)
 		return ret;
 
@@ -347,12 +347,12 @@ static int xtrng_probe(struct platform_device *pdev)
 	if (IS_ERR(rng->rng_base)) {
 		dev_err(&pdev->dev, "Failed to map resource %pe\n", rng->rng_base);
 		return PTR_ERR(rng->rng_base);
 	}
 
-	rng->aesctx = devm_kzalloc(&pdev->dev, sizeof(*rng->aesctx), GFP_KERNEL);
-	if (!rng->aesctx)
+	rng->aeskey = devm_kzalloc(&pdev->dev, sizeof(*rng->aeskey), GFP_KERNEL);
+	if (!rng->aeskey)
 		return -ENOMEM;
 
 	sb_size = crypto_drbg_ctr_df_datalen(TRNG_SEED_LEN_BYTES, AES_BLOCK_SIZE);
 	rng->scratchpadbuf = devm_kzalloc(&pdev->dev, sb_size, GFP_KERNEL);
 	if (!rng->scratchpadbuf) {
diff --git a/include/crypto/df_sp80090a.h b/include/crypto/df_sp80090a.h
index 6b25305fe611..cb5d6fe15d40 100644
--- a/include/crypto/df_sp80090a.h
+++ b/include/crypto/df_sp80090a.h
@@ -16,11 +16,11 @@ static inline int crypto_drbg_ctr_df_datalen(u8 statelen, u8 blocklen)
 		blocklen +      /* pad */
 		blocklen +      /* iv */
 		statelen + blocklen;  /* temp */
 }
 
-int crypto_drbg_ctr_df(struct crypto_aes_ctx *aes,
+int crypto_drbg_ctr_df(struct aes_enckey *aes,
 		       unsigned char *df_data,
 		       size_t bytes_to_return,
 		       struct list_head *seedlist,
 		       u8 blocklen_bytes,
 		       u8 statelen);
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 30/36] crypto: inside-secure - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (28 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 29/36] crypto: drbg " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-07  3:48   ` Qingfang Deng
  2026-01-05  5:13 ` [PATCH 31/36] crypto: omap " Eric Biggers
                   ` (5 subsequent siblings)
  35 siblings, 1 reply; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note that this driver used crypto_aes_ctx::key_enc, but only to access
the copy of the raw key that is stored at the beginning of the expanded
key.  To eliminate the dependency on this field, access the raw key
directly instead; it is already available in the relevant functions.
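
Concretely, where the driver previously read aes.key_enc[i], it now
reads the caller-provided raw key directly, e.g. (from the hunks below):

	ctx->key[i] = get_unaligned((__le32 *)key + i);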

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 .../crypto/inside-secure/safexcel_cipher.c    | 14 ++++-----
 drivers/crypto/inside-secure/safexcel_hash.c  | 30 +++++++++----------
 2 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index 919e5a2cab95..eb4e0dc38b7f 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -2505,37 +2505,35 @@ static int safexcel_aead_gcm_setkey(struct crypto_aead *ctfm, const u8 *key,
 				    unsigned int len)
 {
 	struct crypto_tfm *tfm = crypto_aead_tfm(ctfm);
 	struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct safexcel_crypto_priv *priv = ctx->base.priv;
-	struct crypto_aes_ctx aes;
+	struct aes_enckey aes;
 	u32 hashkey[AES_BLOCK_SIZE >> 2];
 	int ret, i;
 
-	ret = aes_expandkey(&aes, key, len);
-	if (ret) {
-		memzero_explicit(&aes, sizeof(aes));
+	ret = aes_prepareenckey(&aes, key, len);
+	if (ret)
 		return ret;
-	}
 
 	if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma) {
 		for (i = 0; i < len / sizeof(u32); i++) {
-			if (le32_to_cpu(ctx->key[i]) != aes.key_enc[i]) {
+			if (ctx->key[i] != get_unaligned((__le32 *)key + i)) {
 				ctx->base.needs_inv = true;
 				break;
 			}
 		}
 	}
 
 	for (i = 0; i < len / sizeof(u32); i++)
-		ctx->key[i] = cpu_to_le32(aes.key_enc[i]);
+		ctx->key[i] = get_unaligned((__le32 *)key + i);
 
 	ctx->key_len = len;
 
 	/* Compute hash key by encrypting zeroes with cipher key */
 	memset(hashkey, 0, AES_BLOCK_SIZE);
-	aes_encrypt(&aes, (u8 *)hashkey, (u8 *)hashkey);
+	aes_encrypt_new(&aes, (u8 *)hashkey, (u8 *)hashkey);
 
 	if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma) {
 		for (i = 0; i < AES_BLOCK_SIZE / sizeof(u32); i++) {
 			if (be32_to_cpu(ctx->base.ipad.be[i]) != hashkey[i]) {
 				ctx->base.needs_inv = true;
diff --git a/drivers/crypto/inside-secure/safexcel_hash.c b/drivers/crypto/inside-secure/safexcel_hash.c
index ef0ba4832928..dae10d0066d7 100644
--- a/drivers/crypto/inside-secure/safexcel_hash.c
+++ b/drivers/crypto/inside-secure/safexcel_hash.c
@@ -28,11 +28,11 @@ struct safexcel_ahash_ctx {
 	bool cbcmac;
 	bool do_fallback;
 	bool fb_init_done;
 	bool fb_do_setkey;
 
-	struct crypto_aes_ctx *aes;
+	struct aes_enckey *aes;
 	struct crypto_ahash *fback;
 	struct crypto_shash *shpre;
 	struct shash_desc *shdesc;
 };
 
@@ -820,11 +820,11 @@ static int safexcel_ahash_final(struct ahash_request *areq)
 
 			/* K3 */
 			result[i] = swab32(ctx->base.ipad.word[i + 4]);
 		}
 		areq->result[0] ^= 0x80;			// 10- padding
-		aes_encrypt(ctx->aes, areq->result, areq->result);
+		aes_encrypt_new(ctx->aes, areq->result, areq->result);
 		return 0;
 	} else if (unlikely(req->hmac &&
 			    (req->len == req->block_sz) &&
 			    !areq->nbytes)) {
 		/*
@@ -1974,27 +1974,27 @@ static int safexcel_xcbcmac_setkey(struct crypto_ahash *tfm, const u8 *key,
 {
 	struct safexcel_ahash_ctx *ctx = crypto_tfm_ctx(crypto_ahash_tfm(tfm));
 	u32 key_tmp[3 * AES_BLOCK_SIZE / sizeof(u32)];
 	int ret, i;
 
-	ret = aes_expandkey(ctx->aes, key, len);
+	ret = aes_prepareenckey(ctx->aes, key, len);
 	if (ret)
 		return ret;
 
 	/* precompute the XCBC key material */
-	aes_encrypt(ctx->aes, (u8 *)key_tmp + 2 * AES_BLOCK_SIZE,
-		    "\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1");
-	aes_encrypt(ctx->aes, (u8 *)key_tmp,
-		    "\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2");
-	aes_encrypt(ctx->aes, (u8 *)key_tmp + AES_BLOCK_SIZE,
-		    "\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3");
+	aes_encrypt_new(ctx->aes, (u8 *)key_tmp + 2 * AES_BLOCK_SIZE,
+			"\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1");
+	aes_encrypt_new(ctx->aes, (u8 *)key_tmp,
+			"\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2");
+	aes_encrypt_new(ctx->aes, (u8 *)key_tmp + AES_BLOCK_SIZE,
+			"\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3");
 	for (i = 0; i < 3 * AES_BLOCK_SIZE / sizeof(u32); i++)
 		ctx->base.ipad.word[i] = swab32(key_tmp[i]);
 
-	ret = aes_expandkey(ctx->aes,
-			    (u8 *)key_tmp + 2 * AES_BLOCK_SIZE,
-			    AES_MIN_KEY_SIZE);
+	ret = aes_prepareenckey(ctx->aes,
+				(u8 *)key_tmp + 2 * AES_BLOCK_SIZE,
+				AES_MIN_KEY_SIZE);
 	if (ret)
 		return ret;
 
 	ctx->alg    = CONTEXT_CONTROL_CRYPTO_ALG_XCBC128;
 	ctx->key_sz = AES_MIN_KEY_SIZE + 2 * AES_BLOCK_SIZE;
@@ -2060,21 +2060,21 @@ static int safexcel_cmac_setkey(struct crypto_ahash *tfm, const u8 *key,
 	u64 _const[2];
 	u8 msb_mask, gfmask;
 	int ret, i;
 
 	/* precompute the CMAC key material */
-	ret = aes_expandkey(ctx->aes, key, len);
+	ret = aes_prepareenckey(ctx->aes, key, len);
 	if (ret)
 		return ret;
 
 	for (i = 0; i < len / sizeof(u32); i++)
-		ctx->base.ipad.word[i + 8] = swab32(ctx->aes->key_enc[i]);
+		ctx->base.ipad.word[i + 8] = get_unaligned_be32(&key[4 * i]);
 
 	/* code below borrowed from crypto/cmac.c */
 	/* encrypt the zero block */
 	memset(consts, 0, AES_BLOCK_SIZE);
-	aes_encrypt(ctx->aes, (u8 *)consts, (u8 *)consts);
+	aes_encrypt_new(ctx->aes, (u8 *)consts, (u8 *)consts);
 
 	gfmask = 0x87;
 	_const[0] = be64_to_cpu(consts[1]);
 	_const[1] = be64_to_cpu(consts[0]);
 
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 31/36] crypto: omap - Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (29 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 30/36] crypto: inside-secure " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 32/36] lib/crypto: aescfb: " Eric Biggers
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 drivers/crypto/omap-aes-gcm.c | 6 +++---
 drivers/crypto/omap-aes.h     | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/omap-aes-gcm.c b/drivers/crypto/omap-aes-gcm.c
index 1f4586509ca4..efe94a983589 100644
--- a/drivers/crypto/omap-aes-gcm.c
+++ b/drivers/crypto/omap-aes-gcm.c
@@ -175,11 +175,11 @@ static int omap_aes_gcm_copy_buffers(struct omap_aes_dev *dd,
 
 static int do_encrypt_iv(struct aead_request *req, u32 *tag, u32 *iv)
 {
 	struct omap_aes_gcm_ctx *ctx = crypto_aead_ctx(crypto_aead_reqtfm(req));
 
-	aes_encrypt(&ctx->actx, (u8 *)tag, (u8 *)iv);
+	aes_encrypt_new(&ctx->akey, (u8 *)tag, (const u8 *)iv);
 	return 0;
 }
 
 void omap_aes_gcm_dma_out_callback(void *data)
 {
@@ -312,11 +312,11 @@ int omap_aes_gcm_setkey(struct crypto_aead *tfm, const u8 *key,
 			unsigned int keylen)
 {
 	struct omap_aes_gcm_ctx *ctx = crypto_aead_ctx(tfm);
 	int ret;
 
-	ret = aes_expandkey(&ctx->actx, key, keylen);
+	ret = aes_prepareenckey(&ctx->akey, key, keylen);
 	if (ret)
 		return ret;
 
 	memcpy(ctx->octx.key, key, keylen);
 	ctx->octx.keylen = keylen;
@@ -332,11 +332,11 @@ int omap_aes_4106gcm_setkey(struct crypto_aead *tfm, const u8 *key,
 
 	if (keylen < 4)
 		return -EINVAL;
 	keylen -= 4;
 
-	ret = aes_expandkey(&ctx->actx, key, keylen);
+	ret = aes_prepareenckey(&ctx->akey, key, keylen);
 	if (ret)
 		return ret;
 
 	memcpy(ctx->octx.key, key, keylen);
 	memcpy(ctx->octx.nonce, key + keylen, 4);
diff --git a/drivers/crypto/omap-aes.h b/drivers/crypto/omap-aes.h
index 99c36a777e97..6aa70bde387a 100644
--- a/drivers/crypto/omap-aes.h
+++ b/drivers/crypto/omap-aes.h
@@ -96,11 +96,11 @@ struct omap_aes_ctx {
 	struct crypto_skcipher	*fallback;
 };
 
 struct omap_aes_gcm_ctx {
 	struct omap_aes_ctx	octx;
-	struct crypto_aes_ctx	actx;
+	struct aes_enckey	akey;
 };
 
 struct omap_aes_reqctx {
 	struct omap_aes_dev *dd;
 	unsigned long mode;
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 32/36] lib/crypto: aescfb: Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (30 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 31/36] crypto: omap " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 33/36] lib/crypto: aesgcm: " Eric Biggers
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.
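
As an illustration (a sketch with placeholder names, mirroring the
self-test below), the CFB helpers now take the encryption-only key,
since AES-CFB never runs the block cipher in the decrypt direction:

	struct aes_enckey key;

	aes_prepareenckey(&key, raw_key, AES_KEYSIZE_128);
	aescfb_encrypt(&key, ct, pt, len, iv);
	aescfb_decrypt(&key, pt, ct, len, iv);	/* also encrypt-only */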

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 drivers/char/tpm/tpm2-sessions.c | 10 +++++-----
 include/crypto/aes.h             |  4 ++--
 lib/crypto/aescfb.c              | 30 +++++++++++++++---------------
 3 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/char/tpm/tpm2-sessions.c b/drivers/char/tpm/tpm2-sessions.c
index 4149379665c4..09df6353ef04 100644
--- a/drivers/char/tpm/tpm2-sessions.c
+++ b/drivers/char/tpm/tpm2-sessions.c
@@ -124,11 +124,11 @@ struct tpm2_auth {
 	 * session_key and passphrase.
 	 */
 	u8 session_key[SHA256_DIGEST_SIZE];
 	u8 passphrase[SHA256_DIGEST_SIZE];
 	int passphrase_len;
-	struct crypto_aes_ctx aes_ctx;
+	struct aes_enckey aes_key;
 	/* saved session attributes: */
 	u8 attrs;
 	__be32 ordinal;
 
 	/*
@@ -675,12 +675,12 @@ int tpm_buf_fill_hmac_session(struct tpm_chip *chip, struct tpm_buf *buf)
 			  + auth->passphrase_len, "CFB", auth->our_nonce,
 			  auth->tpm_nonce, AES_KEY_BYTES + AES_BLOCK_SIZE,
 			  auth->scratch);
 
 		len = tpm_buf_read_u16(buf, &offset_p);
-		aes_expandkey(&auth->aes_ctx, auth->scratch, AES_KEY_BYTES);
-		aescfb_encrypt(&auth->aes_ctx, &buf->data[offset_p],
+		aes_prepareenckey(&auth->aes_key, auth->scratch, AES_KEY_BYTES);
+		aescfb_encrypt(&auth->aes_key, &buf->data[offset_p],
 			       &buf->data[offset_p], len,
 			       auth->scratch + AES_KEY_BYTES);
 		/* reset p to beginning of parameters for HMAC */
 		offset_p -= 2;
 	}
@@ -856,12 +856,12 @@ int tpm_buf_check_hmac_response(struct tpm_chip *chip, struct tpm_buf *buf,
 			  + auth->passphrase_len, "CFB", auth->tpm_nonce,
 			  auth->our_nonce, AES_KEY_BYTES + AES_BLOCK_SIZE,
 			  auth->scratch);
 
 		len = tpm_buf_read_u16(buf, &offset_p);
-		aes_expandkey(&auth->aes_ctx, auth->scratch, AES_KEY_BYTES);
-		aescfb_decrypt(&auth->aes_ctx, &buf->data[offset_p],
+		aes_prepareenckey(&auth->aes_key, auth->scratch, AES_KEY_BYTES);
+		aescfb_decrypt(&auth->aes_key, &buf->data[offset_p],
 			       &buf->data[offset_p], len,
 			       auth->scratch + AES_KEY_BYTES);
 	}
 
  out:
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index e4b5f60e7a0b..18a5f518e914 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -342,11 +342,11 @@ void aes_decrypt_new(const struct aes_key *key, u8 out[at_least AES_BLOCK_SIZE],
 extern const u8 crypto_aes_sbox[];
 extern const u8 crypto_aes_inv_sbox[];
 extern const u32 __cacheline_aligned aes_enc_tab[256];
 extern const u32 __cacheline_aligned aes_dec_tab[256];
 
-void aescfb_encrypt(const struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
+void aescfb_encrypt(const struct aes_enckey *key, u8 *dst, const u8 *src,
 		    int len, const u8 iv[AES_BLOCK_SIZE]);
-void aescfb_decrypt(const struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
+void aescfb_decrypt(const struct aes_enckey *key, u8 *dst, const u8 *src,
 		    int len, const u8 iv[AES_BLOCK_SIZE]);
 
 #endif
diff --git a/lib/crypto/aescfb.c b/lib/crypto/aescfb.c
index 0f294c8cbf3c..3149d688c4e0 100644
--- a/lib/crypto/aescfb.c
+++ b/lib/crypto/aescfb.c
@@ -9,11 +9,11 @@
 #include <crypto/algapi.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <asm/irqflags.h>
 
-static void aescfb_encrypt_block(const struct crypto_aes_ctx *ctx, void *dst,
+static void aescfb_encrypt_block(const struct aes_enckey *key, void *dst,
 				 const void *src)
 {
 	unsigned long flags;
 
 	/*
@@ -23,31 +23,31 @@ static void aescfb_encrypt_block(const struct crypto_aes_ctx *ctx, void *dst,
 	 * extent by pulling the entire S-box into the caches before doing any
 	 * substitutions, but this strategy is more effective when running with
 	 * interrupts disabled.
 	 */
 	local_irq_save(flags);
-	aes_encrypt(ctx, dst, src);
+	aes_encrypt_new(key, dst, src);
 	local_irq_restore(flags);
 }
 
 /**
  * aescfb_encrypt - Perform AES-CFB encryption on a block of data
  *
- * @ctx:	The AES-CFB key schedule
+ * @key:	The AES-CFB key schedule
  * @dst:	Pointer to the ciphertext output buffer
  * @src:	Pointer the plaintext (may equal @dst for encryption in place)
  * @len:	The size in bytes of the plaintext and ciphertext.
  * @iv:		The initialization vector (IV) to use for this block of data
  */
-void aescfb_encrypt(const struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
+void aescfb_encrypt(const struct aes_enckey *key, u8 *dst, const u8 *src,
 		    int len, const u8 iv[AES_BLOCK_SIZE])
 {
 	u8 ks[AES_BLOCK_SIZE];
 	const u8 *v = iv;
 
 	while (len > 0) {
-		aescfb_encrypt_block(ctx, ks, v);
+		aescfb_encrypt_block(key, ks, v);
 		crypto_xor_cpy(dst, src, ks, min(len, AES_BLOCK_SIZE));
 		v = dst;
 
 		dst += AES_BLOCK_SIZE;
 		src += AES_BLOCK_SIZE;
@@ -59,31 +59,31 @@ void aescfb_encrypt(const struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 EXPORT_SYMBOL(aescfb_encrypt);
 
 /**
  * aescfb_decrypt - Perform AES-CFB decryption on a block of data
  *
- * @ctx:	The AES-CFB key schedule
+ * @key:	The AES-CFB key schedule
  * @dst:	Pointer to the plaintext output buffer
  * @src:	Pointer the ciphertext (may equal @dst for decryption in place)
  * @len:	The size in bytes of the plaintext and ciphertext.
  * @iv:		The initialization vector (IV) to use for this block of data
  */
-void aescfb_decrypt(const struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
+void aescfb_decrypt(const struct aes_enckey *key, u8 *dst, const u8 *src,
 		    int len, const u8 iv[AES_BLOCK_SIZE])
 {
 	u8 ks[2][AES_BLOCK_SIZE];
 
-	aescfb_encrypt_block(ctx, ks[0], iv);
+	aescfb_encrypt_block(key, ks[0], iv);
 
 	for (int i = 0; len > 0; i ^= 1) {
 		if (len > AES_BLOCK_SIZE)
 			/*
 			 * Generate the keystream for the next block before
 			 * performing the XOR, as that may update in place and
 			 * overwrite the ciphertext.
 			 */
-			aescfb_encrypt_block(ctx, ks[!i], src);
+			aescfb_encrypt_block(key, ks[!i], src);
 
 		crypto_xor_cpy(dst, src, ks[i], min(len, AES_BLOCK_SIZE));
 
 		dst += AES_BLOCK_SIZE;
 		src += AES_BLOCK_SIZE;
@@ -212,34 +212,34 @@ static struct {
 };
 
 static int __init libaescfb_init(void)
 {
 	for (int i = 0; i < ARRAY_SIZE(aescfb_tv); i++) {
-		struct crypto_aes_ctx ctx;
+		struct aes_enckey key;
 		u8 buf[64];
 
-		if (aes_expandkey(&ctx, aescfb_tv[i].key, aescfb_tv[i].klen)) {
-			pr_err("aes_expandkey() failed on vector %d\n", i);
+		if (aes_prepareenckey(&key, aescfb_tv[i].key, aescfb_tv[i].klen)) {
+			pr_err("aes_prepareenckey() failed on vector %d\n", i);
 			return -ENODEV;
 		}
 
-		aescfb_encrypt(&ctx, buf, aescfb_tv[i].ptext, aescfb_tv[i].len,
+		aescfb_encrypt(&key, buf, aescfb_tv[i].ptext, aescfb_tv[i].len,
 			       aescfb_tv[i].iv);
 		if (memcmp(buf, aescfb_tv[i].ctext, aescfb_tv[i].len)) {
 			pr_err("aescfb_encrypt() #1 failed on vector %d\n", i);
 			return -ENODEV;
 		}
 
 		/* decrypt in place */
-		aescfb_decrypt(&ctx, buf, buf, aescfb_tv[i].len, aescfb_tv[i].iv);
+		aescfb_decrypt(&key, buf, buf, aescfb_tv[i].len, aescfb_tv[i].iv);
 		if (memcmp(buf, aescfb_tv[i].ptext, aescfb_tv[i].len)) {
 			pr_err("aescfb_decrypt() failed on vector %d\n", i);
 			return -ENODEV;
 		}
 
 		/* encrypt in place */
-		aescfb_encrypt(&ctx, buf, buf, aescfb_tv[i].len, aescfb_tv[i].iv);
+		aescfb_encrypt(&key, buf, buf, aescfb_tv[i].len, aescfb_tv[i].iv);
 		if (memcmp(buf, aescfb_tv[i].ctext, aescfb_tv[i].len)) {
 			pr_err("aescfb_encrypt() #2 failed on vector %d\n", i);
 
 			return -ENODEV;
 		}
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 33/36] lib/crypto: aesgcm: Use new AES library API
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (31 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 32/36] lib/crypto: aescfb: " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 34/36] lib/crypto: aes: Remove old AES en/decryption functions Eric Biggers
                   ` (2 subsequent siblings)
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note: aes_encrypt_new() will be renamed to aes_encrypt() once all
callers of the old aes_encrypt() have been updated.
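
Callers of the aesgcm library are unaffected, since the key lives
inside struct aesgcm_ctx behind aesgcm_expandkey().  For reference,
typical (unchanged) usage looks like this sketch, with placeholder
buffer names:

	struct aesgcm_ctx ctx;
	int err;

	err = aesgcm_expandkey(&ctx, key, AES_KEYSIZE_128,
			       GHASH_DIGEST_SIZE);
	if (err)
		return err;
	aesgcm_encrypt(&ctx, ct, pt, pt_len, assoc, assoc_len, iv, tag);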

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 include/crypto/gcm.h |  2 +-
 lib/crypto/aesgcm.c  | 12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/crypto/gcm.h b/include/crypto/gcm.h
index fd9df607a836..b524e47bd4d0 100644
--- a/include/crypto/gcm.h
+++ b/include/crypto/gcm.h
@@ -64,11 +64,11 @@ static inline int crypto_ipsec_check_assoclen(unsigned int assoclen)
 	return 0;
 }
 
 struct aesgcm_ctx {
 	be128			ghash_key;
-	struct crypto_aes_ctx	aes_ctx;
+	struct aes_enckey	aes_key;
 	unsigned int		authsize;
 };
 
 int aesgcm_expandkey(struct aesgcm_ctx *ctx, const u8 *key,
 		     unsigned int keysize, unsigned int authsize);
diff --git a/lib/crypto/aesgcm.c b/lib/crypto/aesgcm.c
index ac0b2fcfd606..19106fe008fd 100644
--- a/lib/crypto/aesgcm.c
+++ b/lib/crypto/aesgcm.c
@@ -10,11 +10,11 @@
 #include <crypto/ghash.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <asm/irqflags.h>
 
-static void aesgcm_encrypt_block(const struct crypto_aes_ctx *ctx, void *dst,
+static void aesgcm_encrypt_block(const struct aes_enckey *key, void *dst,
 				 const void *src)
 {
 	unsigned long flags;
 
 	/*
@@ -24,11 +24,11 @@ static void aesgcm_encrypt_block(const struct crypto_aes_ctx *ctx, void *dst,
 	 * mitigates this risk to some extent by pulling the entire S-box into
 	 * the caches before doing any substitutions, but this strategy is more
 	 * effective when running with interrupts disabled.
 	 */
 	local_irq_save(flags);
-	aes_encrypt(ctx, dst, src);
+	aes_encrypt_new(key, dst, src);
 	local_irq_restore(flags);
 }
 
 /**
  * aesgcm_expandkey - Expands the AES and GHASH keys for the AES-GCM key
@@ -47,16 +47,16 @@ int aesgcm_expandkey(struct aesgcm_ctx *ctx, const u8 *key,
 {
 	u8 kin[AES_BLOCK_SIZE] = {};
 	int ret;
 
 	ret = crypto_gcm_check_authsize(authsize) ?:
-	      aes_expandkey(&ctx->aes_ctx, key, keysize);
+	      aes_prepareenckey(&ctx->aes_key, key, keysize);
 	if (ret)
 		return ret;
 
 	ctx->authsize = authsize;
-	aesgcm_encrypt_block(&ctx->aes_ctx, &ctx->ghash_key, kin);
+	aesgcm_encrypt_block(&ctx->aes_key, &ctx->ghash_key, kin);
 
 	return 0;
 }
 EXPORT_SYMBOL(aesgcm_expandkey);
 
@@ -95,11 +95,11 @@ static void aesgcm_mac(const struct aesgcm_ctx *ctx, const u8 *src, int src_len,
 	aesgcm_ghash(&ghash, &ctx->ghash_key, assoc, assoc_len);
 	aesgcm_ghash(&ghash, &ctx->ghash_key, src, src_len);
 	aesgcm_ghash(&ghash, &ctx->ghash_key, &tail, sizeof(tail));
 
 	ctr[3] = cpu_to_be32(1);
-	aesgcm_encrypt_block(&ctx->aes_ctx, buf, ctr);
+	aesgcm_encrypt_block(&ctx->aes_key, buf, ctr);
 	crypto_xor_cpy(authtag, buf, (u8 *)&ghash, ctx->authsize);
 
 	memzero_explicit(&ghash, sizeof(ghash));
 	memzero_explicit(buf, sizeof(buf));
 }
@@ -117,11 +117,11 @@ static void aesgcm_crypt(const struct aesgcm_ctx *ctx, u8 *dst, const u8 *src,
 		 * inadvertent IV reuse, which must be avoided at all cost for
 		 * stream ciphers such as AES-CTR. Given the range of 'int
 		 * len', this cannot happen, so no explicit test is necessary.
 		 */
 		ctr[3] = cpu_to_be32(n++);
-		aesgcm_encrypt_block(&ctx->aes_ctx, buf, ctr);
+		aesgcm_encrypt_block(&ctx->aes_key, buf, ctr);
 		crypto_xor_cpy(dst, src, buf, min(len, AES_BLOCK_SIZE));
 
 		dst += AES_BLOCK_SIZE;
 		src += AES_BLOCK_SIZE;
 		len -= AES_BLOCK_SIZE;
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 34/36] lib/crypto: aes: Remove old AES en/decryption functions
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (32 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 33/36] lib/crypto: aesgcm: " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 35/36] lib/crypto: aes: Drop "_new" suffix from " Eric Biggers
  2026-01-05  5:13 ` [PATCH 36/36] lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox Eric Biggers
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Now that no callers of the original aes_encrypt() and aes_decrypt()
remain, remove them.  This frees up their names so that
aes_encrypt_new() and aes_decrypt_new() can be renamed to them.
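
Note that aes_encrypt_new() takes its key through the transparent union
shown in the hunk below (aes_encrypt_arg), so it accepts either key
type.  A sketch (not code from this patch):

	struct aes_enckey enc;	/* forward-only key schedule */
	struct aes_key full;	/* forward + inverse key schedules */

	aes_encrypt_new(&enc, out, in);		/* both calls compile */
	aes_encrypt_new(&full, out, in);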

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 include/crypto/aes.h |  16 ------
 lib/crypto/aes.c     | 118 -------------------------------------------
 2 files changed, 134 deletions(-)

diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 18a5f518e914..4ce710209da8 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -293,26 +293,10 @@ int aes_preparekey(struct aes_key *key, const u8 *in_key, size_t key_len);
  * Context: Any context.
  */
 int aes_prepareenckey(struct aes_enckey *enc_key,
 		      const u8 *in_key, size_t key_len);
 
-/**
- * aes_encrypt - Encrypt a single AES block
- * @ctx:	Context struct containing the key schedule
- * @out:	Buffer to store the ciphertext
- * @in:		Buffer containing the plaintext
- */
-void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
-
-/**
- * aes_decrypt - Decrypt a single AES block
- * @ctx:	Context struct containing the key schedule
- * @out:	Buffer to store the plaintext
- * @in:		Buffer containing the ciphertext
- */
-void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
-
 typedef union {
 	const struct aes_enckey *enc_key;
 	const struct aes_key *full_key;
 } aes_encrypt_arg __attribute__ ((__transparent_union__));
 
diff --git a/lib/crypto/aes.c b/lib/crypto/aes.c
index 57b6d68fd378..f8c67206b850 100644
--- a/lib/crypto/aes.c
+++ b/lib/crypto/aes.c
@@ -249,26 +249,10 @@ static u32 inv_mix_columns(u32 x)
 	u32 y = mul_by_x2(x);
 
 	return mix_columns(x ^ y ^ ror32(y, 16));
 }
 
-static __always_inline u32 subshift(u32 in[], int pos)
-{
-	return (aes_sbox[in[pos] & 0xff]) ^
-	       (aes_sbox[(in[(pos + 1) % 4] >>  8) & 0xff] <<  8) ^
-	       (aes_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
-	       (aes_sbox[(in[(pos + 3) % 4] >> 24) & 0xff] << 24);
-}
-
-static __always_inline u32 inv_subshift(u32 in[], int pos)
-{
-	return (aes_inv_sbox[in[pos] & 0xff]) ^
-	       (aes_inv_sbox[(in[(pos + 3) % 4] >>  8) & 0xff] <<  8) ^
-	       (aes_inv_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
-	       (aes_inv_sbox[(in[(pos + 1) % 4] >> 24) & 0xff] << 24);
-}
-
 static u32 subw(u32 in)
 {
 	return (aes_sbox[in & 0xff]) ^
 	       (aes_sbox[(in >>  8) & 0xff] <<  8) ^
 	       (aes_sbox[(in >> 16) & 0xff] << 16) ^
@@ -343,61 +327,10 @@ int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 	aes_expandkey_generic(ctx->key_enc, ctx->key_dec, in_key, key_len);
 	return 0;
 }
 EXPORT_SYMBOL(aes_expandkey);
 
-/**
- * aes_encrypt - Encrypt a single AES block
- * @ctx:	Context struct containing the key schedule
- * @out:	Buffer to store the ciphertext
- * @in:		Buffer containing the plaintext
- */
-void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in)
-{
-	const u32 *rkp = ctx->key_enc + 4;
-	int rounds = 6 + ctx->key_length / 4;
-	u32 st0[4], st1[4];
-	int round;
-
-	st0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
-	st0[1] = ctx->key_enc[1] ^ get_unaligned_le32(in + 4);
-	st0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
-	st0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);
-
-	/*
-	 * Force the compiler to emit data independent Sbox references,
-	 * by xoring the input with Sbox values that are known to add up
-	 * to zero. This pulls the entire Sbox into the D-cache before any
-	 * data dependent lookups are done.
-	 */
-	st0[0] ^= aes_sbox[ 0] ^ aes_sbox[ 64] ^ aes_sbox[134] ^ aes_sbox[195];
-	st0[1] ^= aes_sbox[16] ^ aes_sbox[ 82] ^ aes_sbox[158] ^ aes_sbox[221];
-	st0[2] ^= aes_sbox[32] ^ aes_sbox[ 96] ^ aes_sbox[160] ^ aes_sbox[234];
-	st0[3] ^= aes_sbox[48] ^ aes_sbox[112] ^ aes_sbox[186] ^ aes_sbox[241];
-
-	for (round = 0;; round += 2, rkp += 8) {
-		st1[0] = mix_columns(subshift(st0, 0)) ^ rkp[0];
-		st1[1] = mix_columns(subshift(st0, 1)) ^ rkp[1];
-		st1[2] = mix_columns(subshift(st0, 2)) ^ rkp[2];
-		st1[3] = mix_columns(subshift(st0, 3)) ^ rkp[3];
-
-		if (round == rounds - 2)
-			break;
-
-		st0[0] = mix_columns(subshift(st1, 0)) ^ rkp[4];
-		st0[1] = mix_columns(subshift(st1, 1)) ^ rkp[5];
-		st0[2] = mix_columns(subshift(st1, 2)) ^ rkp[6];
-		st0[3] = mix_columns(subshift(st1, 3)) ^ rkp[7];
-	}
-
-	put_unaligned_le32(subshift(st1, 0) ^ rkp[4], out);
-	put_unaligned_le32(subshift(st1, 1) ^ rkp[5], out + 4);
-	put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
-	put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
-}
-EXPORT_SYMBOL(aes_encrypt);
-
 static __always_inline u32 enc_quarterround(const u32 w[4], int i, u32 rk)
 {
 	return rk ^ aes_enc_tab[(u8)w[i]] ^
 	       rol32(aes_enc_tab[(u8)(w[(i + 1) % 4] >> 8)], 8) ^
 	       rol32(aes_enc_tab[(u8)(w[(i + 2) % 4] >> 16)], 16) ^
@@ -502,61 +435,10 @@ static void __maybe_unused aes_decrypt_generic(const u32 inv_rndkeys[],
 	put_unaligned_le32(declast_quarterround(w, 1, *rkp++), &out[4]);
 	put_unaligned_le32(declast_quarterround(w, 2, *rkp++), &out[8]);
 	put_unaligned_le32(declast_quarterround(w, 3, *rkp++), &out[12]);
 }
 
-/**
- * aes_decrypt - Decrypt a single AES block
- * @ctx:	Context struct containing the key schedule
- * @out:	Buffer to store the plaintext
- * @in:		Buffer containing the ciphertext
- */
-void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in)
-{
-	const u32 *rkp = ctx->key_dec + 4;
-	int rounds = 6 + ctx->key_length / 4;
-	u32 st0[4], st1[4];
-	int round;
-
-	st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
-	st0[1] = ctx->key_dec[1] ^ get_unaligned_le32(in + 4);
-	st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
-	st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);
-
-	/*
-	 * Force the compiler to emit data independent Sbox references,
-	 * by xoring the input with Sbox values that are known to add up
-	 * to zero. This pulls the entire Sbox into the D-cache before any
-	 * data dependent lookups are done.
-	 */
-	st0[0] ^= aes_inv_sbox[ 0] ^ aes_inv_sbox[ 64] ^ aes_inv_sbox[129] ^ aes_inv_sbox[200];
-	st0[1] ^= aes_inv_sbox[16] ^ aes_inv_sbox[ 83] ^ aes_inv_sbox[150] ^ aes_inv_sbox[212];
-	st0[2] ^= aes_inv_sbox[32] ^ aes_inv_sbox[ 96] ^ aes_inv_sbox[160] ^ aes_inv_sbox[236];
-	st0[3] ^= aes_inv_sbox[48] ^ aes_inv_sbox[112] ^ aes_inv_sbox[187] ^ aes_inv_sbox[247];
-
-	for (round = 0;; round += 2, rkp += 8) {
-		st1[0] = inv_mix_columns(inv_subshift(st0, 0)) ^ rkp[0];
-		st1[1] = inv_mix_columns(inv_subshift(st0, 1)) ^ rkp[1];
-		st1[2] = inv_mix_columns(inv_subshift(st0, 2)) ^ rkp[2];
-		st1[3] = inv_mix_columns(inv_subshift(st0, 3)) ^ rkp[3];
-
-		if (round == rounds - 2)
-			break;
-
-		st0[0] = inv_mix_columns(inv_subshift(st1, 0)) ^ rkp[4];
-		st0[1] = inv_mix_columns(inv_subshift(st1, 1)) ^ rkp[5];
-		st0[2] = inv_mix_columns(inv_subshift(st1, 2)) ^ rkp[6];
-		st0[3] = inv_mix_columns(inv_subshift(st1, 3)) ^ rkp[7];
-	}
-
-	put_unaligned_le32(inv_subshift(st1, 0) ^ rkp[4], out);
-	put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[5], out + 4);
-	put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
-	put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
-}
-EXPORT_SYMBOL(aes_decrypt);
-
 /*
  * Note: the aes_prepare*key_* names reflect the fact that the implementation
  * might not actually expand the key.  (The s390 code for example doesn't.)
  * Where the key is expanded we use the more specific names aes_expandkey_*.
  *
-- 
2.52.0




* [PATCH 35/36] lib/crypto: aes: Drop "_new" suffix from en/decryption functions
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (33 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 34/36] lib/crypto: aes: Remove old AES en/decryption functions Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  2026-01-05  5:13 ` [PATCH 36/36] lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox Eric Biggers
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

Now that all callers of aes_encrypt() and aes_decrypt() have been
updated to use aes_encrypt_new() and aes_decrypt_new() instead, and the
original aes_encrypt() and aes_decrypt() have been removed, drop the
"_new" suffix.  This completes the migration to the revised AES API,
which uses a different type for the key argument and is more efficient
when the user only requires the encryption direction of the cipher.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/arm/crypto/aes-neonbs-glue.c                |  8 ++++----
 arch/arm/crypto/ghash-ce-glue.c                  |  2 +-
 arch/arm64/crypto/ghash-ce-glue.c                |  2 +-
 arch/riscv/crypto/aes-riscv64-glue.c             |  2 +-
 arch/x86/crypto/aesni-intel_glue.c               |  2 +-
 crypto/aes.c                                     |  4 ++--
 crypto/df_sp80090a.c                             |  2 +-
 drivers/crypto/amcc/crypto4xx_alg.c              |  2 +-
 drivers/crypto/ccp/ccp-crypto-aes-cmac.c         |  2 +-
 drivers/crypto/chelsio/chcr_algo.c               |  6 +++---
 drivers/crypto/inside-secure/safexcel_cipher.c   |  2 +-
 drivers/crypto/inside-secure/safexcel_hash.c     | 16 ++++++++--------
 drivers/crypto/omap-aes-gcm.c                    |  2 +-
 .../chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c  |  2 +-
 .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c    |  2 +-
 .../chelsio/inline_crypto/chtls/chtls_hw.c       |  2 +-
 drivers/net/phy/mscc/mscc_macsec.c               |  2 +-
 drivers/staging/rtl8723bs/core/rtw_security.c    |  8 ++++----
 include/crypto/aes.h                             | 12 ++++++------
 lib/crypto/aes.c                                 | 12 ++++++------
 lib/crypto/aescfb.c                              |  2 +-
 lib/crypto/aesgcm.c                              |  2 +-
 net/bluetooth/smp.c                              |  2 +-
 23 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c
index f892f281b441..c49ddafc54f3 100644
--- a/arch/arm/crypto/aes-neonbs-glue.c
+++ b/arch/arm/crypto/aes-neonbs-glue.c
@@ -154,11 +154,11 @@ static int cbc_encrypt(struct skcipher_request *req)
 		u8 *dst = walk.dst.virt.addr;
 		u8 *prev = walk.iv;
 
 		do {
 			crypto_xor_cpy(dst, src, prev, AES_BLOCK_SIZE);
-			aes_encrypt_new(&ctx->fallback, dst, dst);
+			aes_encrypt(&ctx->fallback, dst, dst);
 			prev = dst;
 			src += AES_BLOCK_SIZE;
 			dst += AES_BLOCK_SIZE;
 			nbytes -= AES_BLOCK_SIZE;
 		} while (nbytes >= AES_BLOCK_SIZE);
@@ -280,11 +280,11 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
 
 	err = skcipher_walk_virt(&walk, req, true);
 	if (err)
 		return err;
 
-	aes_encrypt_new(&ctx->tweak_key, walk.iv, walk.iv);
+	aes_encrypt(&ctx->tweak_key, walk.iv, walk.iv);
 
 	while (walk.nbytes >= AES_BLOCK_SIZE) {
 		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
 		int reorder_last_tweak = !encrypt && tail > 0;
 
@@ -312,13 +312,13 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
 	scatterwalk_map_and_copy(buf, req->src, req->cryptlen, tail, 0);
 
 	crypto_xor(buf, req->iv, AES_BLOCK_SIZE);
 
 	if (encrypt)
-		aes_encrypt_new(&ctx->fallback, buf, buf);
+		aes_encrypt(&ctx->fallback, buf, buf);
 	else
-		aes_decrypt_new(&ctx->fallback, buf, buf);
+		aes_decrypt(&ctx->fallback, buf, buf);
 
 	crypto_xor(buf, req->iv, AES_BLOCK_SIZE);
 
 	scatterwalk_map_and_copy(buf, req->dst, req->cryptlen - AES_BLOCK_SIZE,
 				 AES_BLOCK_SIZE + tail, 1);
diff --git a/arch/arm/crypto/ghash-ce-glue.c b/arch/arm/crypto/ghash-ce-glue.c
index 9ab03bce352d..454adcc62cc6 100644
--- a/arch/arm/crypto/ghash-ce-glue.c
+++ b/arch/arm/crypto/ghash-ce-glue.c
@@ -210,11 +210,11 @@ static int gcm_aes_setkey(struct crypto_aead *tfm, const u8 *inkey,
 
 	ret = aes_prepareenckey(&aes_key, inkey, keylen);
 	if (ret)
 		return -EINVAL;
 
-	aes_encrypt_new(&aes_key, (u8 *)&k, (u8[AES_BLOCK_SIZE]){});
+	aes_encrypt(&aes_key, (u8 *)&k, (u8[AES_BLOCK_SIZE]){});
 
 	/*
 	 * Note: this assumes that the arm implementation of the AES library
 	 * stores the standard round keys in k.rndkeys.
 	 */
diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index bfd38e485e77..63bb9e062251 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -196,11 +196,11 @@ static int gcm_aes_setkey(struct crypto_aead *tfm, const u8 *inkey,
 
 	ret = aes_prepareenckey(&ctx->aes_key, inkey, keylen);
 	if (ret)
 		return -EINVAL;
 
-	aes_encrypt_new(&ctx->aes_key, key, (u8[AES_BLOCK_SIZE]){});
+	aes_encrypt(&ctx->aes_key, key, (u8[AES_BLOCK_SIZE]){});
 
 	/* needed for the fallback */
 	memcpy(&ctx->ghash_key.k, key, GHASH_BLOCK_SIZE);
 
 	ghash_reflect(ctx->ghash_key.h[0], &ctx->ghash_key.k);
diff --git a/arch/riscv/crypto/aes-riscv64-glue.c b/arch/riscv/crypto/aes-riscv64-glue.c
index e1b8b0d70666..8d6d4338b90b 100644
--- a/arch/riscv/crypto/aes-riscv64-glue.c
+++ b/arch/riscv/crypto/aes-riscv64-glue.c
@@ -320,11 +320,11 @@ static int riscv64_aes_xts_crypt(struct skcipher_request *req, bool enc)
 
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;
 
 	/* Encrypt the IV with the tweak key to get the first tweak. */
-	aes_encrypt_new(&ctx->tweak_key, req->iv, req->iv);
+	aes_encrypt(&ctx->tweak_key, req->iv, req->iv);
 
 	err = skcipher_walk_virt(&walk, req, false);
 
 	/*
 	 * If the message length isn't divisible by the AES block size and the
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 5633e50e46a0..e6c38d1d8a92 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -1211,11 +1211,11 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *raw_key,
 		be128 h1 = {};
 		be128 h;
 		int i;
 
 		/* Encrypt the all-zeroes block to get the hash key H^1 */
-		aes_encrypt_new(&key->aes_key, (u8 *)&h1, (u8 *)&h1);
+		aes_encrypt(&key->aes_key, (u8 *)&h1, (u8 *)&h1);
 
 		/* Compute H^1 * x^-1 */
 		h = h1;
 		gf128mul_lle(&h, (const be128 *)x_to_the_minus1);
 
diff --git a/crypto/aes.c b/crypto/aes.c
index 5c3a0b24dbc0..ae8385df0ce5 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -21,18 +21,18 @@ static int crypto_aes_setkey(struct crypto_tfm *tfm, const u8 *in_key,
 
 static void crypto_aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 {
 	const struct aes_key *key = crypto_tfm_ctx(tfm);
 
-	aes_encrypt_new(key, out, in);
+	aes_encrypt(key, out, in);
 }
 
 static void crypto_aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 {
 	const struct aes_key *key = crypto_tfm_ctx(tfm);
 
-	aes_decrypt_new(key, out, in);
+	aes_decrypt(key, out, in);
 }
 
 static struct crypto_alg alg = {
 	.cra_name = "aes",
 	.cra_driver_name = "aes-lib",
diff --git a/crypto/df_sp80090a.c b/crypto/df_sp80090a.c
index 5686d37ebba2..b8134be6f7ad 100644
--- a/crypto/df_sp80090a.c
+++ b/crypto/df_sp80090a.c
@@ -17,11 +17,11 @@
 static void drbg_kcapi_sym(struct aes_enckey *aeskey, unsigned char *outval,
 			   const struct drbg_string *in, u8 blocklen_bytes)
 {
 	/* there is only one component in *in */
 	BUG_ON(in->len < blocklen_bytes);
-	aes_encrypt_new(aeskey, outval, in->buf);
+	aes_encrypt(aeskey, outval, in->buf);
 }
 
 /* BCC function for CTR DRBG as defined in 10.4.3 */
 
 static void drbg_ctr_bcc(struct aes_enckey *aeskey,
diff --git a/drivers/crypto/amcc/crypto4xx_alg.c b/drivers/crypto/amcc/crypto4xx_alg.c
index 1947708334ef..3177dc4f5f7b 100644
--- a/drivers/crypto/amcc/crypto4xx_alg.c
+++ b/drivers/crypto/amcc/crypto4xx_alg.c
@@ -499,11 +499,11 @@ static int crypto4xx_compute_gcm_hash_key_sw(__le32 *hash_start, const u8 *key,
 	if (rc) {
 		pr_err("aes_prepareenckey() failed: %d\n", rc);
 		return rc;
 	}
 
-	aes_encrypt_new(&aes, src, src);
+	aes_encrypt(&aes, src, src);
 	crypto4xx_memcpy_to_le32(hash_start, src, 16);
 	memzero_explicit(&aes, sizeof(aes));
 	return 0;
 }
 
diff --git a/drivers/crypto/ccp/ccp-crypto-aes-cmac.c b/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
index ed5b0f8609f1..71480f7e6f6b 100644
--- a/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
+++ b/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
@@ -288,11 +288,11 @@ static int ccp_aes_cmac_setkey(struct crypto_ahash *tfm, const u8 *key,
 	if (ret)
 		return ret;
 
 	/* Encrypt a block of zeroes - use key area in context */
 	memset(ctx->u.aes.key, 0, sizeof(ctx->u.aes.key));
-	aes_encrypt_new(&aes, ctx->u.aes.key, ctx->u.aes.key);
+	aes_encrypt(&aes, ctx->u.aes.key, ctx->u.aes.key);
 	memzero_explicit(&aes, sizeof(aes));
 
 	/* Generate K1 and K2 */
 	k0_hi = be64_to_cpu(*((__be64 *)ctx->u.aes.key));
 	k0_lo = be64_to_cpu(*((__be64 *)ctx->u.aes.key + 1));
diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
index b6b97088dfc5..6dec42282768 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -1047,19 +1047,19 @@ static int chcr_update_tweak(struct skcipher_request *req, u8 *iv,
 		ret = aes_preparekey(&aes, key, keylen - 8);
 	else
 		ret = aes_preparekey(&aes, key, keylen);
 	if (ret)
 		return ret;
-	aes_encrypt_new(&aes, iv, iv);
+	aes_encrypt(&aes, iv, iv);
 	for (i = 0; i < round8; i++)
 		gf128mul_x8_ble((le128 *)iv, (le128 *)iv);
 
 	for (i = 0; i < (round % 8); i++)
 		gf128mul_x_ble((le128 *)iv, (le128 *)iv);
 
 	if (!isfinal)
-		aes_decrypt_new(&aes, iv, iv);
+		aes_decrypt(&aes, iv, iv);
 
 	memzero_explicit(&aes, sizeof(aes));
 	return 0;
 }
 
@@ -3448,11 +3448,11 @@ static int chcr_gcm_setkey(struct crypto_aead *aead, const u8 *key,
 	if (ret) {
 		aeadctx->enckey_len = 0;
 		goto out;
 	}
 	memset(gctx->ghash_h, 0, AEAD_H_SIZE);
-	aes_encrypt_new(&aes, gctx->ghash_h, gctx->ghash_h);
+	aes_encrypt(&aes, gctx->ghash_h, gctx->ghash_h);
 	memzero_explicit(&aes, sizeof(aes));
 
 out:
 	return ret;
 }
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index eb4e0dc38b7f..27b180057417 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -2529,11 +2529,11 @@ static int safexcel_aead_gcm_setkey(struct crypto_aead *ctfm, const u8 *key,
 
 	ctx->key_len = len;
 
 	/* Compute hash key by encrypting zeroes with cipher key */
 	memset(hashkey, 0, AES_BLOCK_SIZE);
-	aes_encrypt_new(&aes, (u8 *)hashkey, (u8 *)hashkey);
+	aes_encrypt(&aes, (u8 *)hashkey, (u8 *)hashkey);
 
 	if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma) {
 		for (i = 0; i < AES_BLOCK_SIZE / sizeof(u32); i++) {
 			if (be32_to_cpu(ctx->base.ipad.be[i]) != hashkey[i]) {
 				ctx->base.needs_inv = true;
diff --git a/drivers/crypto/inside-secure/safexcel_hash.c b/drivers/crypto/inside-secure/safexcel_hash.c
index dae10d0066d7..e534b7a200cf 100644
--- a/drivers/crypto/inside-secure/safexcel_hash.c
+++ b/drivers/crypto/inside-secure/safexcel_hash.c
@@ -820,11 +820,11 @@ static int safexcel_ahash_final(struct ahash_request *areq)
 
 			/* K3 */
 			result[i] = swab32(ctx->base.ipad.word[i + 4]);
 		}
 		areq->result[0] ^= 0x80;			// 10- padding
-		aes_encrypt_new(ctx->aes, areq->result, areq->result);
+		aes_encrypt(ctx->aes, areq->result, areq->result);
 		return 0;
 	} else if (unlikely(req->hmac &&
 			    (req->len == req->block_sz) &&
 			    !areq->nbytes)) {
 		/*
@@ -1979,16 +1979,16 @@ static int safexcel_xcbcmac_setkey(struct crypto_ahash *tfm, const u8 *key,
 	ret = aes_prepareenckey(ctx->aes, key, len);
 	if (ret)
 		return ret;
 
 	/* precompute the XCBC key material */
-	aes_encrypt_new(ctx->aes, (u8 *)key_tmp + 2 * AES_BLOCK_SIZE,
-			"\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1");
-	aes_encrypt_new(ctx->aes, (u8 *)key_tmp,
-			"\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2");
-	aes_encrypt_new(ctx->aes, (u8 *)key_tmp + AES_BLOCK_SIZE,
-			"\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3");
+	aes_encrypt(ctx->aes, (u8 *)key_tmp + 2 * AES_BLOCK_SIZE,
+		    "\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1");
+	aes_encrypt(ctx->aes, (u8 *)key_tmp,
+		    "\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2\x2");
+	aes_encrypt(ctx->aes, (u8 *)key_tmp + AES_BLOCK_SIZE,
+		    "\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3\x3");
 	for (i = 0; i < 3 * AES_BLOCK_SIZE / sizeof(u32); i++)
 		ctx->base.ipad.word[i] = swab32(key_tmp[i]);
 
 	ret = aes_prepareenckey(ctx->aes,
 				(u8 *)key_tmp + 2 * AES_BLOCK_SIZE,
@@ -2070,11 +2070,11 @@ static int safexcel_cmac_setkey(struct crypto_ahash *tfm, const u8 *key,
 		ctx->base.ipad.word[i + 8] = get_unaligned_be32(&key[4 * i]);
 
 	/* code below borrowed from crypto/cmac.c */
 	/* encrypt the zero block */
 	memset(consts, 0, AES_BLOCK_SIZE);
-	aes_encrypt_new(ctx->aes, (u8 *)consts, (u8 *)consts);
+	aes_encrypt(ctx->aes, (u8 *)consts, (u8 *)consts);
 
 	gfmask = 0x87;
 	_const[0] = be64_to_cpu(consts[1]);
 	_const[1] = be64_to_cpu(consts[0]);
 
diff --git a/drivers/crypto/omap-aes-gcm.c b/drivers/crypto/omap-aes-gcm.c
index efe94a983589..c652f9d0062f 100644
--- a/drivers/crypto/omap-aes-gcm.c
+++ b/drivers/crypto/omap-aes-gcm.c
@@ -175,11 +175,11 @@ static int omap_aes_gcm_copy_buffers(struct omap_aes_dev *dd,
 
 static int do_encrypt_iv(struct aead_request *req, u32 *tag, u32 *iv)
 {
 	struct omap_aes_gcm_ctx *ctx = crypto_aead_ctx(crypto_aead_reqtfm(req));
 
-	aes_encrypt_new(&ctx->akey, (u8 *)tag, (const u8 *)iv);
+	aes_encrypt(&ctx->akey, (u8 *)tag, (const u8 *)iv);
 	return 0;
 }
 
 void omap_aes_gcm_dma_out_callback(void *data)
 {
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
index 882d09b2b1a8..074717d4bb16 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
@@ -208,11 +208,11 @@ static int ch_ipsec_setkey(struct xfrm_state *x,
 	if (ret) {
 		sa_entry->enckey_len = 0;
 		goto out;
 	}
 	memset(ghash_h, 0, AEAD_H_SIZE);
-	aes_encrypt_new(&aes, ghash_h, ghash_h);
+	aes_encrypt(&aes, ghash_h, ghash_h);
 	memzero_explicit(&aes, sizeof(aes));
 
 	memcpy(sa_entry->key + (DIV_ROUND_UP(sa_entry->enckey_len, 16) *
 	       16), ghash_h, AEAD_H_SIZE);
 	sa_entry->kctx_len = ((DIV_ROUND_UP(sa_entry->enckey_len, 16)) << 4) +
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index 09c0687f911f..b8ebb56de65e 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -141,11 +141,11 @@ static int chcr_ktls_save_keys(struct chcr_ktls_info *tx_info,
 	ret = aes_prepareenckey(&aes, key, keylen);
 	if (ret)
 		goto out;
 
 	memset(ghash_h, 0, ghash_size);
-	aes_encrypt_new(&aes, ghash_h, ghash_h);
+	aes_encrypt(&aes, ghash_h, ghash_h);
 	memzero_explicit(&aes, sizeof(aes));
 
 	/* fill the Key context */
 	if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
 		kctx->ctx_hdr = FILL_KEY_CTX_HDR(ck_size,
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c
index be2b623957c0..d84473ca844d 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c
@@ -294,11 +294,11 @@ static int chtls_key_info(struct chtls_sock *csk,
 	ret = aes_prepareenckey(&aes, key, keylen);
 	if (ret)
 		return ret;
 
 	memset(ghash_h, 0, AEAD_H_SIZE);
-	aes_encrypt_new(&aes, ghash_h, ghash_h);
+	aes_encrypt(&aes, ghash_h, ghash_h);
 	memzero_explicit(&aes, sizeof(aes));
 	csk->tlshws.keylen = key_ctx_size;
 
 	/* Copy the Key context */
 	if (optname == TLS_RX) {
diff --git a/drivers/net/phy/mscc/mscc_macsec.c b/drivers/net/phy/mscc/mscc_macsec.c
index bcb7f5a4a8fd..9a38a29cf397 100644
--- a/drivers/net/phy/mscc/mscc_macsec.c
+++ b/drivers/net/phy/mscc/mscc_macsec.c
@@ -509,11 +509,11 @@ static int vsc8584_macsec_derive_key(const u8 *key, u16 key_len, u8 hkey[16])
 
 	ret = aes_prepareenckey(&aes, key, key_len);
 	if (ret)
 		return ret;
 
-	aes_encrypt_new(&aes, hkey, input);
+	aes_encrypt(&aes, hkey, input);
 	memzero_explicit(&aes, sizeof(aes));
 	return 0;
 }
 
 static int vsc8584_macsec_transformation(struct phy_device *phydev,
diff --git a/drivers/staging/rtl8723bs/core/rtw_security.c b/drivers/staging/rtl8723bs/core/rtw_security.c
index 79825324e70f..8ee5bed252bf 100644
--- a/drivers/staging/rtl8723bs/core/rtw_security.c
+++ b/drivers/staging/rtl8723bs/core/rtw_security.c
@@ -638,11 +638,11 @@ u32 rtw_tkip_decrypt(struct adapter *padapter, u8 *precvframe)
 static void aes128k128d(u8 *key, u8 *data, u8 *ciphertext)
 {
 	struct aes_enckey aes;
 
 	aes_prepareenckey(&aes, key, 16);
-	aes_encrypt_new(&aes, ciphertext, data);
+	aes_encrypt(&aes, ciphertext, data);
 	memzero_explicit(&aes, sizeof(aes));
 }
 
 /************************************************/
 /* construct_mic_iv()                           */
@@ -1434,16 +1434,16 @@ static int omac1_aes_128_vector(u8 *key, size_t num_elem,
 				pos = addr[e];
 				end = pos + len[e];
 			}
 		}
 		if (left > AES_BLOCK_SIZE)
-			aes_encrypt_new(&aes, cbc, cbc);
+			aes_encrypt(&aes, cbc, cbc);
 		left -= AES_BLOCK_SIZE;
 	}
 
 	memset(pad, 0, AES_BLOCK_SIZE);
-	aes_encrypt_new(&aes, pad, pad);
+	aes_encrypt(&aes, pad, pad);
 	gf_mulx(pad);
 
 	if (left || total_len == 0) {
 		for (i = 0; i < left; i++) {
 			cbc[i] ^= *pos++;
@@ -1457,11 +1457,11 @@ static int omac1_aes_128_vector(u8 *key, size_t num_elem,
 		gf_mulx(pad);
 	}
 
 	for (i = 0; i < AES_BLOCK_SIZE; i++)
 		pad[i] ^= cbc[i];
-	aes_encrypt_new(&aes, pad, mac);
+	aes_encrypt(&aes, pad, mac);
 	memzero_explicit(&aes, sizeof(aes));
 	return 0;
 }
 
 /**
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 4ce710209da8..30522cc0604c 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -299,31 +299,31 @@ typedef union {
 	const struct aes_enckey *enc_key;
 	const struct aes_key *full_key;
 } aes_encrypt_arg __attribute__ ((__transparent_union__));
 
 /**
- * aes_encrypt_new() - Encrypt a single AES block
+ * aes_encrypt() - Encrypt a single AES block
  * @key: The AES key, as a pointer to either an encryption-only key
  *	 (struct aes_enckey) or a full, bidirectional key (struct aes_key).
  * @out: Buffer to store the ciphertext block
  * @in: Buffer containing the plaintext block
  *
  * Context: Any context.
  */
-void aes_encrypt_new(aes_encrypt_arg key, u8 out[at_least AES_BLOCK_SIZE],
-		     const u8 in[at_least AES_BLOCK_SIZE]);
+void aes_encrypt(aes_encrypt_arg key, u8 out[at_least AES_BLOCK_SIZE],
+		 const u8 in[at_least AES_BLOCK_SIZE]);
 
 /**
- * aes_decrypt_new() - Decrypt a single AES block
+ * aes_decrypt() - Decrypt a single AES block
  * @key: The AES key, previously initialized by aes_preparekey()
  * @out: Buffer to store the plaintext block
  * @in: Buffer containing the ciphertext block
  *
  * Context: Any context.
  */
-void aes_decrypt_new(const struct aes_key *key, u8 out[at_least AES_BLOCK_SIZE],
-		     const u8 in[at_least AES_BLOCK_SIZE]);
+void aes_decrypt(const struct aes_key *key, u8 out[at_least AES_BLOCK_SIZE],
+		 const u8 in[at_least AES_BLOCK_SIZE]);
 
 extern const u8 crypto_aes_sbox[];
 extern const u8 crypto_aes_inv_sbox[];
 extern const u32 __cacheline_aligned aes_enc_tab[256];
 extern const u32 __cacheline_aligned aes_dec_tab[256];
diff --git a/lib/crypto/aes.c b/lib/crypto/aes.c
index f8c67206b850..98ade1758735 100644
--- a/lib/crypto/aes.c
+++ b/lib/crypto/aes.c
@@ -500,23 +500,23 @@ int aes_prepareenckey(struct aes_enckey *key, const u8 *in_key, size_t key_len)
 {
 	return __aes_preparekey(key, NULL, in_key, key_len);
 }
 EXPORT_SYMBOL(aes_prepareenckey);
 
-void aes_encrypt_new(aes_encrypt_arg key, u8 out[AES_BLOCK_SIZE],
-		     const u8 in[AES_BLOCK_SIZE])
+void aes_encrypt(aes_encrypt_arg key, u8 out[AES_BLOCK_SIZE],
+		 const u8 in[AES_BLOCK_SIZE])
 {
 	aes_encrypt_arch(key.enc_key, out, in);
 }
-EXPORT_SYMBOL(aes_encrypt_new);
+EXPORT_SYMBOL(aes_encrypt);
 
-void aes_decrypt_new(const struct aes_key *key, u8 out[AES_BLOCK_SIZE],
-		     const u8 in[AES_BLOCK_SIZE])
+void aes_decrypt(const struct aes_key *key, u8 out[AES_BLOCK_SIZE],
+		 const u8 in[AES_BLOCK_SIZE])
 {
 	aes_decrypt_arch(key, out, in);
 }
-EXPORT_SYMBOL(aes_decrypt_new);
+EXPORT_SYMBOL(aes_decrypt);
 
 #ifdef aes_mod_init_arch
 static int __init aes_mod_init(void)
 {
 	aes_mod_init_arch();
diff --git a/lib/crypto/aescfb.c b/lib/crypto/aescfb.c
index 3149d688c4e0..147e5211728f 100644
--- a/lib/crypto/aescfb.c
+++ b/lib/crypto/aescfb.c
@@ -23,11 +23,11 @@ static void aescfb_encrypt_block(const struct aes_enckey *key, void *dst,
 	 * extent by pulling the entire S-box into the caches before doing any
 	 * substitutions, but this strategy is more effective when running with
 	 * interrupts disabled.
 	 */
 	local_irq_save(flags);
-	aes_encrypt_new(key, dst, src);
+	aes_encrypt(key, dst, src);
 	local_irq_restore(flags);
 }
 
 /**
  * aescfb_encrypt - Perform AES-CFB encryption on a block of data
diff --git a/lib/crypto/aesgcm.c b/lib/crypto/aesgcm.c
index 19106fe008fd..02f5b5f32c76 100644
--- a/lib/crypto/aesgcm.c
+++ b/lib/crypto/aesgcm.c
@@ -24,11 +24,11 @@ static void aesgcm_encrypt_block(const struct aes_enckey *key, void *dst,
 	 * mitigates this risk to some extent by pulling the entire S-box into
 	 * the caches before doing any substitutions, but this strategy is more
 	 * effective when running with interrupts disabled.
 	 */
 	local_irq_save(flags);
-	aes_encrypt_new(key, dst, src);
+	aes_encrypt(key, dst, src);
 	local_irq_restore(flags);
 }
 
 /**
  * aesgcm_expandkey - Expands the AES and GHASH keys for the AES-GCM key
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index 69007e510177..bf61e8841535 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -390,11 +390,11 @@ static int smp_e(const u8 *k, u8 *r)
 	}
 
 	/* Most significant octet of plaintextData corresponds to data[0] */
 	swap_buf(r, data, 16);
 
-	aes_encrypt_new(&aes, data, data);
+	aes_encrypt(&aes, data, data);
 
 	/* Most significant octet of encryptedData corresponds to data[0] */
 	swap_buf(data, r, 16);
 
 	SMP_DBG("r %16phN", r);
-- 
2.52.0




* [PATCH 36/36] lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox
  2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
                   ` (34 preceding siblings ...)
  2026-01-05  5:13 ` [PATCH 35/36] lib/crypto: aes: Drop "_new" suffix from " Eric Biggers
@ 2026-01-05  5:13 ` Eric Biggers
  35 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-05  5:13 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Holger Dengler, Harald Freudenberger,
	Eric Biggers

The volatile keyword is no longer necessary or useful on aes_sbox and
aes_inv_sbox, since the table prefetching is now done using a helper
function that casts to volatile itself and also includes an optimization
barrier.  As the volatile qualifier prevents some compiler
optimizations, remove it.
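
For reference, that helper is of roughly this shape (a sketch only; the
in-tree aes_prefetch() added earlier in the series may differ in
detail):

	static inline void aes_prefetch(const void *tab, size_t size)
	{
		const volatile u8 *p = tab;
		size_t i;

		/* touch one byte per cache line through a volatile pointer */
		for (i = 0; i < size; i += L1_CACHE_BYTES)
			(void)p[i];
		barrier(); /* keep the loads from being elided or reordered */
	}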

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 lib/crypto/aes.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/lib/crypto/aes.c b/lib/crypto/aes.c
index 98ade1758735..e85c905296f1 100644
--- a/lib/crypto/aes.c
+++ b/lib/crypto/aes.c
@@ -9,15 +9,11 @@
 #include <linux/crypto.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/unaligned.h>
 
-/*
- * Emit the sbox as volatile const to prevent the compiler from doing
- * constant folding on sbox references involving fixed indexes.
- */
-static volatile const u8 __cacheline_aligned aes_sbox[] = {
+static const u8 __cacheline_aligned aes_sbox[] = {
 	0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
 	0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
 	0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0,
 	0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
 	0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc,
@@ -48,11 +44,11 @@ static volatile const u8 __cacheline_aligned aes_sbox[] = {
 	0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
 	0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68,
 	0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16,
 };
 
-static volatile const u8 __cacheline_aligned aes_inv_sbox[] = {
+static const u8 __cacheline_aligned aes_inv_sbox[] = {
 	0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38,
 	0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
 	0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87,
 	0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
 	0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d,
@@ -428,11 +424,11 @@ static void __maybe_unused aes_decrypt_generic(const u32 inv_rndkeys[],
 		w[1] = w1;
 		w[2] = w2;
 		w[3] = w3;
 	} while (--n);
 
-	aes_prefetch((const void *)aes_inv_sbox, sizeof(aes_inv_sbox));
+	aes_prefetch(aes_inv_sbox, sizeof(aes_inv_sbox));
 	put_unaligned_le32(declast_quarterround(w, 0, *rkp++), &out[0]);
 	put_unaligned_le32(declast_quarterround(w, 1, *rkp++), &out[4]);
 	put_unaligned_le32(declast_quarterround(w, 2, *rkp++), &out[8]);
 	put_unaligned_le32(declast_quarterround(w, 3, *rkp++), &out[12]);
 }
-- 
2.52.0




* Re: [PATCH 02/36] lib/crypto: aes: Introduce improved AES library
  2026-01-05  5:12 ` [PATCH 02/36] lib/crypto: aes: Introduce improved AES library Eric Biggers
@ 2026-01-05  7:47   ` Qingfang Deng
  2026-01-06  6:36     ` Eric Biggers
  0 siblings, 1 reply; 45+ messages in thread
From: Qingfang Deng @ 2026-01-05  7:47 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, linux-kernel, Ard Biesheuvel, Jason A. Donenfeld,
	Herbert Xu, linux-arm-kernel, linuxppc-dev, linux-riscv,
	linux-s390, sparclinux, x86, Holger Dengler, Harald Freudenberger

On 4 Jan 2026 21:12:35 -0800, Eric Biggers wrote:
>  extern const u8 crypto_aes_sbox[];
>  extern const u8 crypto_aes_inv_sbox[];
> +extern const u32 __cacheline_aligned aes_enc_tab[256];
> +extern const u32 __cacheline_aligned aes_dec_tab[256];
 
__cacheline_aligned puts the array in the ".data..cacheline_aligned"
section.  As a const array, it should be in the ".rodata" section, so
____cacheline_aligned (note the extra underscores) should be used
instead.
You can also apply the same to crypto_aes_sbox and crypto_aes_inv_sbox
while you're at it.
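
To illustrate (a sketch; it assumes the usual definitions in
include/linux/cache.h, where __cacheline_aligned carries a __section__
attribute in addition to the alignment, while ____cacheline_aligned is
the alignment alone):

	static const u32 tab_a[256] __cacheline_aligned;   /* forced into .data..cacheline_aligned */
	static const u32 tab_b[256] ____cacheline_aligned; /* stays in .rodata, still aligned */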

Regards,
Qingfang



* Re: [PATCH 19/36] Bluetooth: SMP: Use new AES library API
  2026-01-05  5:12 ` [PATCH 19/36] Bluetooth: SMP: Use new AES library API Eric Biggers
@ 2026-01-05 15:40   ` Andrew Cooper
  2026-01-05 19:05     ` David Laight
  0 siblings, 1 reply; 45+ messages in thread
From: Andrew Cooper @ 2026-01-05 15:40 UTC (permalink / raw)
  To: ebiggers
  Cc: Andrew Cooper, Jason, ardb, dengler, freude, herbert,
	linux-arm-kernel, linux-crypto, linux-kernel, linux-riscv,
	linux-s390, linuxppc-dev, sparclinux, x86

>  	/* Most significant octet of plaintextData corresponds to data[0] */
>  	swap_buf(r, data, 16);
>  
> -	aes_encrypt(&ctx, data, data);
> +	aes_encrypt_new(&aes, data, data);

One thing you might want to consider, which reduces the churn in the series.

You can use _Generic() to do type-based dispatch on the first pointer. 
Something like this:

void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
void aes_encrypt_new(aes_encrypt_arg key, u8 out[at_least AES_BLOCK_SIZE],
             const u8 in[at_least AES_BLOCK_SIZE]);

#define aes_encrypt(ctx, out, in)                                       \
    _Generic(ctx,                                                       \
             const struct crypto_aes_ctx *: aes_encrypt(ctx, out, in),  \
             aes_encrypt_arg: aes_encrypt_new(ctx, out, in))


i.e. it keeps the _new()-ism in a single header, without needing to
change the drivers a second time.

~Andrew



* Re: [PATCH 19/36] Bluetooth: SMP: Use new AES library API
  2026-01-05 15:40   ` Andrew Cooper
@ 2026-01-05 19:05     ` David Laight
  2026-01-06  6:58       ` Eric Biggers
  0 siblings, 1 reply; 45+ messages in thread
From: David Laight @ 2026-01-05 19:05 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: ebiggers, Jason, ardb, dengler, freude, herbert, linux-arm-kernel,
	linux-crypto, linux-kernel, linux-riscv, linux-s390, linuxppc-dev,
	sparclinux, x86

On Mon, 5 Jan 2026 15:40:22 +0000
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> >  	/* Most significant octet of plaintextData corresponds to data[0] */
> >  	swap_buf(r, data, 16);
> >  
> > -	aes_encrypt(&ctx, data, data);
> > +	aes_encrypt_new(&aes, data, data);
> 
> One thing you might want to consider, which reduces the churn in the series.
> 
> You can use _Generic() to do type-based dispatch on the first pointer. 
> Something like this:
> 
> void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
> void aes_encrypt_new(aes_encrypt_arg key, u8 out[at_least AES_BLOCK_SIZE],
>              const u8 in[at_least AES_BLOCK_SIZE]);
> 
> #define aes_encrypt(ctx, out, in)                                       \
>     _Generic(ctx,                                                       \
>              const struct crypto_aes_ctx *: aes_encrypt(ctx, out, in),  \
>              aes_encrypt_arg: aes_encrypt_new(ctx, out, in))
> 
> 
> i.e. it keeps the _new()-ism in a single header, without needing to
> change the drivers a second time.

You'll need to cast the 'ctx' argument in both calls.
All the code in a _Generic() must compile cleanly in all cases.
(Totally annoying....)

	David

> 
> ~Andrew
> 




* Re: [PATCH 02/36] lib/crypto: aes: Introduce improved AES library
  2026-01-05  7:47   ` Qingfang Deng
@ 2026-01-06  6:36     ` Eric Biggers
  0 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-06  6:36 UTC (permalink / raw)
  To: Qingfang Deng
  Cc: linux-crypto, linux-kernel, Ard Biesheuvel, Jason A. Donenfeld,
	Herbert Xu, linux-arm-kernel, linuxppc-dev, linux-riscv,
	linux-s390, sparclinux, x86, Holger Dengler, Harald Freudenberger

On Mon, Jan 05, 2026 at 03:47:12PM +0800, Qingfang Deng wrote:
> On 4 Jan 2026 21:12:35 -0800, Eric Biggers wrote:
> >  extern const u8 crypto_aes_sbox[];
> >  extern const u8 crypto_aes_inv_sbox[];
> > +extern const u32 __cacheline_aligned aes_enc_tab[256];
> > +extern const u32 __cacheline_aligned aes_dec_tab[256];
>  
> __cacheline_aligned puts the array in ".data..cacheline_aligned"
> section. As a const array, it should be in ".rodata" section, so
> ____cacheline_aligned (note the extra underscores) should be used
> instead.
> You can also apply the same to crypto_aes_sbox and crypto_aes_inv_sbox
> while at it.
> 
> Regards,
> Qingfang

Good catch!  So the result is that MMU protection isn't applied to the
const data as intended.

I guess I'll change these to the four-underscore ____cacheline_aligned.

Though, I'm tempted to instead just do __aligned(SMP_CACHE_BYTES), to
stay well away from this footgun.
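
That is, declaring the tables as:

	extern const u32 aes_enc_tab[256] __aligned(SMP_CACHE_BYTES);
	extern const u32 aes_dec_tab[256] __aligned(SMP_CACHE_BYTES);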

- Eric



* Re: [PATCH 19/36] Bluetooth: SMP: Use new AES library API
  2026-01-05 19:05     ` David Laight
@ 2026-01-06  6:58       ` Eric Biggers
  0 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-06  6:58 UTC (permalink / raw)
  To: David Laight
  Cc: Andrew Cooper, Jason, ardb, dengler, freude, herbert,
	linux-arm-kernel, linux-crypto, linux-kernel, linux-riscv,
	linux-s390, linuxppc-dev, sparclinux, x86

On Mon, Jan 05, 2026 at 07:05:03PM +0000, David Laight wrote:
> On Mon, 5 Jan 2026 15:40:22 +0000
> Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> > >  	/* Most significant octet of plaintextData corresponds to data[0] */
> > >  	swap_buf(r, data, 16);
> > >  
> > > -	aes_encrypt(&ctx, data, data);
> > > +	aes_encrypt_new(&aes, data, data);
> > 
> > One thing you might want to consider, which reduces the churn in the series.
> > 
> > You can use _Generic() to do type-based dispatch on the first pointer. 
> > Something like this:
> > 
> > void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
> > void aes_encrypt_new(aes_encrypt_arg key, u8 out[at_least AES_BLOCK_SIZE],
> >              const u8 in[at_least AES_BLOCK_SIZE]);
> > 
> > #define aes_encrypt(ctx, out, in)                                       \
> >     _Generic(ctx,                                                       \
> >              const struct crypto_aes_ctx *: aes_encrypt(ctx, out, in),  \
> >              aes_encrypt_arg: aes_encrypt_new(ctx, out, in))
> > 
> > 
> > i.e. it keeps the _new()-ism in a single header, without needing to
> > change the drivers a second time.
> 
> You'll need to cast the 'ctx' argument in both calls.
> All the code in a _Generic() must compile cleanly in all cases.
> (Totally annoying....)
> 
> 	David

It seems it would actually have to be:

#define aes_encrypt(key, out, in) \
_Generic(key, \
	 struct crypto_aes_ctx *: aes_encrypt_old((const struct crypto_aes_ctx *)key, out, in), \
	 const struct crypto_aes_ctx *: aes_encrypt_old((const struct crypto_aes_ctx *)key, out, in), \
	 struct aes_enckey *: aes_encrypt_new((const struct aes_enckey *)key, out, in), \
	 const struct aes_enckey *: aes_encrypt_new((const struct aes_enckey *)key, out, in), \
	 struct aes_key *: aes_encrypt_new((const struct aes_key *)key, out, in), \
	 const struct aes_key *: aes_encrypt_new((const struct aes_key *)key, out, in))

#define aes_decrypt(key, out, in) \
_Generic(key, \
	 struct crypto_aes_ctx *: aes_decrypt_old((const struct crypto_aes_ctx *)key, out, in), \
	 const struct crypto_aes_ctx *: aes_decrypt_old((const struct crypto_aes_ctx *)key, out, in), \
	 struct aes_key *: aes_decrypt_new((const struct aes_key *)key, out, in), \
	 const struct aes_key *: aes_decrypt_new((const struct aes_key *)key, out, in))

Note that both const and non-const args need to be handled.

It also doesn't work for any callers passing a 'void *' or
'const void *' and relying on an implicit cast.  I didn't notice any,
but that needs to be considered too.
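
For example (hypothetical caller; get_aes_ctx() is made up):

	void *ctx = get_aes_ctx();

	aes_encrypt(ctx, out, in);	/* no 'void *' association in the
					 * _Generic, so this fails to compile,
					 * where the old implicit conversion
					 * was fine */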

I guess maybe it would still be worth it to avoid the "*_new" name
temporarily leaking into too many files.  (It goes away by the end of
the series anyway.)  It's just not quite as simple as you're suggesting,
and all the callers have to be checked for compatibility with it.

- Eric



* Re: [PATCH 30/36] crypto: inside-secure - Use new AES library API
  2026-01-05  5:13 ` [PATCH 30/36] crypto: inside-secure " Eric Biggers
@ 2026-01-07  3:48   ` Qingfang Deng
  2026-01-07  4:01     ` Eric Biggers
  0 siblings, 1 reply; 45+ messages in thread
From: Qingfang Deng @ 2026-01-07  3:48 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, linux-kernel, Ard Biesheuvel, Jason A. Donenfeld,
	Herbert Xu, linux-arm-kernel, linuxppc-dev, linux-riscv,
	linux-s390, sparclinux, x86, Holger Dengler, Harald Freudenberger

On Sun,  4 Jan 2026 21:13:03 -0800, Eric Biggers wrote:
> --- a/drivers/crypto/inside-secure/safexcel_cipher.c
> +++ b/drivers/crypto/inside-secure/safexcel_cipher.c
> @@ -2505,37 +2505,35 @@ static int safexcel_aead_gcm_setkey(struct crypto_aead *ctfm, const u8 *key,
>  				    unsigned int len)
>  {
>  	struct crypto_tfm *tfm = crypto_aead_tfm(ctfm);
>  	struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
>  	struct safexcel_crypto_priv *priv = ctx->base.priv;
> -	struct crypto_aes_ctx aes;
> +	struct aes_enckey aes;
>  	u32 hashkey[AES_BLOCK_SIZE >> 2];
>  	int ret, i;
>  
> -	ret = aes_expandkey(&aes, key, len);
> -	if (ret) {
> -		memzero_explicit(&aes, sizeof(aes));
> +	ret = aes_prepareenckey(&aes, key, len);
> +	if (ret)
>  		return ret;
> -	}
>  
>  	if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma) {
>  		for (i = 0; i < len / sizeof(u32); i++) {
> -			if (le32_to_cpu(ctx->key[i]) != aes.key_enc[i]) {
> +			if (ctx->key[i] != get_unaligned((__le32 *)key + i)) {

"key" is big-endian. Casting it to __le32 does not seem correct.
Did you mean "get_unaligned_le32", which also converts the endianness?

>  				ctx->base.needs_inv = true;
>  				break;
>  			}
>  		}
>  	}
>  
>  	for (i = 0; i < len / sizeof(u32); i++)
> -		ctx->key[i] = cpu_to_le32(aes.key_enc[i]);
> +		ctx->key[i] = get_unaligned((__le32 *)key + i);

Same here.

>  
>  	ctx->key_len = len;
>  
>  	/* Compute hash key by encrypting zeroes with cipher key */
>  	memset(hashkey, 0, AES_BLOCK_SIZE);
> -	aes_encrypt(&aes, (u8 *)hashkey, (u8 *)hashkey);
> +	aes_encrypt_new(&aes, (u8 *)hashkey, (u8 *)hashkey);
>  
>  	if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma) {
>  		for (i = 0; i < AES_BLOCK_SIZE / sizeof(u32); i++) {
>  			if (be32_to_cpu(ctx->base.ipad.be[i]) != hashkey[i]) {
>  				ctx->base.needs_inv = true;



* Re: [PATCH 30/36] crypto: inside-secure - Use new AES library API
  2026-01-07  3:48   ` Qingfang Deng
@ 2026-01-07  4:01     ` Eric Biggers
  0 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2026-01-07  4:01 UTC (permalink / raw)
  To: Qingfang Deng
  Cc: linux-crypto, linux-kernel, Ard Biesheuvel, Jason A. Donenfeld,
	Herbert Xu, linux-arm-kernel, linuxppc-dev, linux-riscv,
	linux-s390, sparclinux, x86, Holger Dengler, Harald Freudenberger

On Wed, Jan 07, 2026 at 11:48:33AM +0800, Qingfang Deng wrote:
> On Sun,  4 Jan 2026 21:13:03 -0800, Eric Biggers wrote:
> > --- a/drivers/crypto/inside-secure/safexcel_cipher.c
> > +++ b/drivers/crypto/inside-secure/safexcel_cipher.c
> > @@ -2505,37 +2505,35 @@ static int safexcel_aead_gcm_setkey(struct crypto_aead *ctfm, const u8 *key,
> >  				    unsigned int len)
> >  {
> >  	struct crypto_tfm *tfm = crypto_aead_tfm(ctfm);
> >  	struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
> >  	struct safexcel_crypto_priv *priv = ctx->base.priv;
> > -	struct crypto_aes_ctx aes;
> > +	struct aes_enckey aes;
> >  	u32 hashkey[AES_BLOCK_SIZE >> 2];
> >  	int ret, i;
> >  
> > -	ret = aes_expandkey(&aes, key, len);
> > -	if (ret) {
> > -		memzero_explicit(&aes, sizeof(aes));
> > +	ret = aes_prepareenckey(&aes, key, len);
> > +	if (ret)
> >  		return ret;
> > -	}
> >  
> >  	if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma) {
> >  		for (i = 0; i < len / sizeof(u32); i++) {
> > -			if (le32_to_cpu(ctx->key[i]) != aes.key_enc[i]) {
> > +			if (ctx->key[i] != get_unaligned((__le32 *)key + i)) {
> 
> "key" is big-endian. Casting it to __le32 does not seem correct.
> Did you mean "get_unaligned_le32", which also convert the endianness?

No.  Previously, in this driver the 32-bit words of the AES key went
from little endian in their original byte array in 'key', to CPU endian
in 'aes.key_enc' (via the get_unaligned_le32() in aes_expandkey()), to
little endian in 'ctx->key' (via the cpu_to_le32() in this function).
Note that ctx->key is an array of __le32.

Those two conversions canceled out.  So I've simplified it to just grab
the little endian words of the key directly.
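
Concretely, for one 32-bit word (byte values hypothetical):

	/* key[0..3] = aa bb cc dd */
	u32 host = get_unaligned_le32(key);	/* 0xddccbbaa on any CPU */
	__le32 word = cpu_to_le32(host);	/* bytes aa bb cc dd again */
	/* i.e. the same bytes that get_unaligned((__le32 *)key) yields */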

- Eric



* Re: [PATCH 15/36] lib/crypto: s390/aes: Migrate optimized code into library
  2026-01-05  5:12 ` [PATCH 15/36] lib/crypto: s390/aes: " Eric Biggers
@ 2026-01-07  7:41   ` Holger Dengler
  0 siblings, 0 replies; 45+ messages in thread
From: Holger Dengler @ 2026-01-07  7:41 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-kernel, Ard Biesheuvel, Jason A . Donenfeld, Herbert Xu,
	linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
	sparclinux, x86, Harald Freudenberger, linux-crypto

Hi Eric,

First of all: happy New Year! And thanks for the series.

On 05/01/2026 06:12, Eric Biggers wrote:
> Implement aes_preparekey_arch(), aes_encrypt_arch(), and 
> aes_decrypt_arch() using the CPACF AES instructions.

I'm not sure it makes sense to implement this on s390 at all. The CPACF
instructions cover full modes of operation and are optimized to process more
than one cipher block at a time (*). For single-block operations, the
performance might be worse than using the generic functions. I will look into
it and do some performance tests. If there were a way to plug in at the level
where the modes of operation are implemented, it would make much more sense
for s390.

(*) Yes, it's a bit uncommon, but the CPACF instructions on s390 can process
multiple blocks with a single instruction call! They are so-called
long-running instructions.
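
For example, a single call along these lines (a sketch using the
cpacf_km() wrapper from arch/s390/include/asm/cpacf.h; the param block
setup is elided):

	/* one KM invocation can encrypt an arbitrary number of blocks */
	cpacf_km(CPACF_KM_AES_128, &param, dst, src,
		 nblocks * AES_BLOCK_SIZE);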

> Then, remove the superseded "aes-s390" crypto_cipher.
> 
> The result is that both the AES library and crypto_cipher APIs use the 
> CPACF AES instructions, whereas previously only crypto_cipher did (and it 
> wasn't enabled by default, which this commit fixes as well).
> 
> Note that this preserves the optimization where the AES key is stored in 
> raw form rather than expanded form.  CPACF just takes the raw key.
> 
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>  arch/s390/crypto/

--
Mit freundlichen Grüßen / Kind regards
Holger Dengler
--
IBM Systems, Linux on IBM Z Development
dengler@linux.ibm.com




end of thread [~2026-01-07  7:42 UTC]

Thread overview: 45+ messages
2026-01-05  5:12 [PATCH 00/36] AES library improvements Eric Biggers
2026-01-05  5:12 ` [PATCH 01/36] crypto: powerpc/aes - Rename struct aes_key Eric Biggers
2026-01-05  5:12 ` [PATCH 02/36] lib/crypto: aes: Introduce improved AES library Eric Biggers
2026-01-05  7:47   ` Qingfang Deng
2026-01-06  6:36     ` Eric Biggers
2026-01-05  5:12 ` [PATCH 03/36] crypto: arm/aes-neonbs - Use AES library for single blocks Eric Biggers
2026-01-05  5:12 ` [PATCH 04/36] crypto: arm/aes - Switch to aes_enc_tab[] and aes_dec_tab[] Eric Biggers
2026-01-05  5:12 ` [PATCH 05/36] crypto: arm64/aes " Eric Biggers
2026-01-05  5:12 ` [PATCH 06/36] crypto: arm64/aes - Select CRYPTO_LIB_SHA256 from correct places Eric Biggers
2026-01-05  5:12 ` [PATCH 07/36] crypto: aegis - Switch from crypto_ft_tab[] to aes_enc_tab[] Eric Biggers
2026-01-05  5:12 ` [PATCH 08/36] crypto: aes - Remove aes-fixed-time / CONFIG_CRYPTO_AES_TI Eric Biggers
2026-01-05  5:12 ` [PATCH 09/36] crypto: aes - Replace aes-generic with wrapper around lib Eric Biggers
2026-01-05  5:12 ` [PATCH 10/36] lib/crypto: arm/aes: Migrate optimized code into library Eric Biggers
2026-01-05  5:12 ` [PATCH 11/36] lib/crypto: arm64/aes: " Eric Biggers
2026-01-05  5:12 ` [PATCH 12/36] lib/crypto: powerpc/aes: Migrate SPE " Eric Biggers
2026-01-05  5:12 ` [PATCH 13/36] lib/crypto: powerpc/aes: Migrate POWER8 " Eric Biggers
2026-01-05  5:12 ` [PATCH 14/36] lib/crypto: riscv/aes: Migrate " Eric Biggers
2026-01-05  5:12 ` [PATCH 15/36] lib/crypto: s390/aes: " Eric Biggers
2026-01-07  7:41   ` Holger Dengler
2026-01-05  5:12 ` [PATCH 16/36] lib/crypto: sparc/aes: " Eric Biggers
2026-01-05  5:12 ` [PATCH 17/36] lib/crypto: x86/aes: Add AES-NI optimization Eric Biggers
2026-01-05  5:12 ` [PATCH 18/36] crypto: x86/aes - Remove the superseded AES-NI crypto_cipher Eric Biggers
2026-01-05  5:12 ` [PATCH 19/36] Bluetooth: SMP: Use new AES library API Eric Biggers
2026-01-05 15:40   ` Andrew Cooper
2026-01-05 19:05     ` David Laight
2026-01-06  6:58       ` Eric Biggers
2026-01-05  5:12 ` [PATCH 20/36] chelsio: " Eric Biggers
2026-01-05  5:12 ` [PATCH 21/36] net: phy: mscc: macsec: " Eric Biggers
2026-01-05  5:12 ` [PATCH 22/36] staging: rtl8723bs: core: " Eric Biggers
2026-01-05  5:12 ` [PATCH 23/36] crypto: arm/ghash - " Eric Biggers
2026-01-05  5:12 ` [PATCH 24/36] crypto: arm64/ghash " Eric Biggers
2026-01-05  5:12 ` [PATCH 25/36] crypto: x86/aes-gcm " Eric Biggers
2026-01-05  5:12 ` [PATCH 26/36] crypto: ccp " Eric Biggers
2026-01-05  5:13 ` [PATCH 27/36] crypto: chelsio " Eric Biggers
2026-01-05  5:13 ` [PATCH 28/36] crypto: crypto4xx " Eric Biggers
2026-01-05  5:13 ` [PATCH 29/36] crypto: drbg " Eric Biggers
2026-01-05  5:13 ` [PATCH 30/36] crypto: inside-secure " Eric Biggers
2026-01-07  3:48   ` Qingfang Deng
2026-01-07  4:01     ` Eric Biggers
2026-01-05  5:13 ` [PATCH 31/36] crypto: omap " Eric Biggers
2026-01-05  5:13 ` [PATCH 32/36] lib/crypto: aescfb: " Eric Biggers
2026-01-05  5:13 ` [PATCH 33/36] lib/crypto: aesgcm: " Eric Biggers
2026-01-05  5:13 ` [PATCH 34/36] lib/crypto: aes: Remove old AES en/decryption functions Eric Biggers
2026-01-05  5:13 ` [PATCH 35/36] lib/crypto: aes: Drop "_new" suffix from " Eric Biggers
2026-01-05  5:13 ` [PATCH 36/36] lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox Eric Biggers
