* [PATCH v2 0/6] crypto: aes - allow generic AES to be omitted
@ 2017-06-16 11:17 Ard Biesheuvel
2017-06-16 11:17 ` [PATCH v2 1/6] drivers/crypto/Kconfig: drop bogus CRYPTO_AES dependencies Ard Biesheuvel
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Ard Biesheuvel @ 2017-06-16 11:17 UTC
To: linux-crypto; +Cc: herbert, nico, ebiggers3, Ard Biesheuvel
The generic AES driver uses 16 lookup tables of 1 KB each, and has
encryption and decryption routines that are fully unrolled. Given how
the dependencies between this code and other drivers are declared in
Kconfig files, this code is always pulled into the core kernel, even
if it is usually superseded at runtime by accelerated drivers that
exist for many architectures.
This leaves us with 25 KB of dead code in the kernel, which is negligible
in typical environments, but which is actually a big deal for the IoT
domain, where every kilobyte counts.
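As a rough cross-check of the table figure quoted above: crypto/aes_generic.c declares crypto_ft_tab as u32 [4][256] and exports three sibling arrays (crypto_fl_tab, crypto_it_tab, crypto_il_tab) next to it. The user-space sketch below is illustrative only. The typedef and function names are made up for this note, and it assumes the three sibling arrays have the same shape as crypto_ft_tab.

```c
#include <stddef.h>
#include <stdint.h>

/* Shape of crypto_ft_tab[4][256]; assumed to match the other three arrays. */
typedef uint32_t aes_tab_t[4][256];

/* Forward, forward-last, inverse and inverse-last round tables. */
enum { AES_GENERIC_NUM_TABS = 4 };

/* Bytes of lookup-table data carried by the generic driver. */
static size_t aes_generic_table_bytes(void)
{
	return AES_GENERIC_NUM_TABS * sizeof(aes_tab_t);
}

/* Bytes of table data needed when only the two 256-byte Sboxes are used. */
static size_t aes_sbox_table_bytes(void)
{
	return 2 * 256 * sizeof(uint8_t);
}
```

Under these assumptions the generic tables come to 16 KB (16 sub-tables of 1 KB each), against 512 bytes for the two Sboxes. Adding the roughly 9 KB that this letter attributes to the unrolled instructions gives the 25 KB total.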
Also, the scalar, table-based AES routines that exist for ARM, arm64, i586
and x86_64 share the lookup tables with AES generic, and may be invoked
occasionally when the time-invariant AES-NI or other special instruction
drivers are called in interrupt context, at which time the SIMD register
file cannot be used. Pulling 16 KB of lookup tables and 9 KB of instructions
into the L1s (and evicting what was already there) when a softirq happens to
be handled in the context of an interrupt taken from kernel mode (which
means no SIMD on x86) is also something that we may like to avoid, by
falling back to a much smaller and moderately less performant driver.
(Note that arm64 will be updated shortly to supply fallbacks for all
SIMD based AES implementations, which will be based on the core routines
[if they are accepted].)
For the reasons above, this series refactors the way the various AES
implementations are wired up, to allow the generic version in
crypto/aes_generic.c to be omitted from the build entirely.
Patch #1 removes some bogus 'select CRYPTO_AES' statements.
Patch #2 introduces CRYPTO_AES_CORE and its implementation crypto/aes_core.c,
which contains the existing key expansion routines, and default encrypt and
decrypt routines that are not exposed as a crypto_cipher themselves, but
can be pulled in by other AES drivers. These routines only depend on the two
256-byte Sboxes.
Patch #3 switches the fallback in the AES-NI code to the new, generic encrypt
and decrypt routines so it no longer depends on the x86 scalar code or
[transitively] on AES-generic.
Patch #4 repurposes the CRYPTO_AES Kconfig symbol as an abstract symbol that
indicates whether some implementation of AES needs to be available. The
existing generic code is now controlled by CRYPTO_AES_GENERIC.
Patch #5 updates the Kconfig help texts to be more descriptive of what the
options actually control, rather than duplicating AES's Wikipedia entry a
number of times.
Patch #6 updates the Kconfig logic so CRYPTO_AES_GENERIC can be disabled if
any CRYPTO_AES dependencies are satisfied by the fixed time driver.
v2: - repurpose CRYPTO_AES and avoid HAVE_AES/NEED_AES Kconfig symbols
- don't factor out tables from AES generic to be reused by per arch drivers,
since the space saving is moderate (the generic code only), and the
drivers weren't made to be small anyway
Ard Biesheuvel (6):
drivers/crypto/Kconfig: drop bogus CRYPTO_AES dependencies
crypto: aes - refactor shared routines into separate core module
crypto: x86/aes-ni - switch to generic fallback
crypto: aes - repurpose CRYPTO_AES and introduce CRYPTO_AES_GENERIC
crypto: aes - add meaningful help text to the various AES drivers
crypto: aes - allow generic AES to be replaced by fixed time AES
arch/arm/crypto/Kconfig | 8 +-
arch/arm64/crypto/Kconfig | 11 +-
arch/x86/crypto/aesni-intel_glue.c | 4 +-
crypto/Kconfig | 85 ++---
crypto/Makefile | 3 +-
crypto/aes_core.c | 333 ++++++++++++++++++++
crypto/aes_generic.c | 178 -----------
crypto/aes_ti.c | 305 ++----------------
drivers/crypto/Kconfig | 13 +-
include/crypto/aes.h | 6 +
net/sunrpc/Kconfig | 3 +-
11 files changed, 407 insertions(+), 542 deletions(-)
create mode 100644 crypto/aes_core.c
--
2.7.4
* [PATCH v2 1/6] drivers/crypto/Kconfig: drop bogus CRYPTO_AES dependencies
From: Ard Biesheuvel @ 2017-06-16 11:17 UTC
To: linux-crypto; +Cc: herbert, nico, ebiggers3, Ard Biesheuvel
In preparation for fine-tuning the dependency relations between the
accelerated AES drivers and the core support code, let's remove the
dependency declarations that are false. None of these modules have
link time dependencies on the generic AES code, nor do they declare
any AES algos with CRYPTO_ALG_NEED_FALLBACK, so they can function
perfectly fine without crypto/aes_generic.o loaded.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
drivers/crypto/Kconfig | 5 -----
1 file changed, 5 deletions(-)
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 0528a62a39a6..7a737c1c669e 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -419,7 +419,6 @@ config CRYPTO_DEV_S5P
tristate "Support for Samsung S5PV210/Exynos crypto accelerator"
depends on ARCH_S5PV210 || ARCH_EXYNOS || COMPILE_TEST
depends on HAS_IOMEM && HAS_DMA
- select CRYPTO_AES
select CRYPTO_BLKCIPHER
help
This option allows you to have support for S5P crypto acceleration.
@@ -473,7 +472,6 @@ config CRYPTO_DEV_ATMEL_AES
tristate "Support for Atmel AES hw accelerator"
depends on HAS_DMA
depends on ARCH_AT91 || COMPILE_TEST
- select CRYPTO_AES
select CRYPTO_AEAD
select CRYPTO_BLKCIPHER
help
@@ -591,7 +589,6 @@ config CRYPTO_DEV_SUN4I_SS
depends on ARCH_SUNXI && !64BIT
select CRYPTO_MD5
select CRYPTO_SHA1
- select CRYPTO_AES
select CRYPTO_DES
select CRYPTO_BLKCIPHER
help
@@ -606,7 +603,6 @@ config CRYPTO_DEV_SUN4I_SS
config CRYPTO_DEV_ROCKCHIP
tristate "Rockchip's Cryptographic Engine driver"
depends on OF && ARCH_ROCKCHIP
- select CRYPTO_AES
select CRYPTO_DES
select CRYPTO_MD5
select CRYPTO_SHA1
@@ -622,7 +618,6 @@ config CRYPTO_DEV_MEDIATEK
tristate "MediaTek's EIP97 Cryptographic Engine driver"
depends on HAS_DMA
depends on (ARM && ARCH_MEDIATEK) || COMPILE_TEST
- select CRYPTO_AES
select CRYPTO_AEAD
select CRYPTO_BLKCIPHER
select CRYPTO_CTR
--
2.7.4
* [PATCH v2 2/6] crypto: aes - refactor shared routines into separate core module
From: Ard Biesheuvel @ 2017-06-16 11:17 UTC
To: linux-crypto; +Cc: herbert, nico, ebiggers3, Ard Biesheuvel
In preparation for further refactoring and cleanup of the AES code, move
the implementations of crypto_aes_expand_key() and crypto_aes_set_key()
into a separate module called aes_core, along with the Sboxes and the
GF(2^8) routines that they rely on.
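One visible effect of relying on the GF(2^8) routines is that the key schedule no longer needs the rco_tab[] lookup that this patch removes from aes_generic.c: the round constant is obtained by repeatedly applying mul_by_x(), starting from 1. The user-space sketch below illustrates that equivalence. mul_by_x() mirrors the routine in this patch using stdint types, while aes_round_constant() is a hypothetical helper written only for this note.

```c
#include <stdint.h>

/* Multiply each of the four packed bytes by 'x' (0x02) in GF(2^8). */
static uint32_t mul_by_x(uint32_t w)
{
	uint32_t x = w & 0x7f7f7f7f;	/* bits that shift without overflow */
	uint32_t y = w & 0x80808080;	/* top bit of each byte */

	/* bytes that overflowed are reduced by the AES polynomial (0x1b) */
	return (x << 1) ^ (y >> 7) * 0x1b;
}

/* Hypothetical helper: round constant i, derived by i doublings of 1. */
static uint32_t aes_round_constant(unsigned int i)
{
	uint32_t rc = 1;

	while (i--)
		rc = mul_by_x(rc);
	return rc;
}
```

Calling aes_round_constant() for i = 0..9 yields 1, 2, 4, 8, 16, 32, 64, 128, 27, 54, which are the values of the removed rco_tab[].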
Also, introduce crypto_aes_[en|de]crypt() based on the fixed time code,
which will be used in future patches by time-invariant SIMD drivers that
may need to fall back to scalar code in exceptional circumstances. These
fallbacks offer a different tradeoff between time invariance and speed,
but are generally more appropriate due to the smaller size and cache
footprint.
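Part of the reason the footprint is small is that each round computes MixColumns arithmetically on a packed 32-bit column instead of looking it up in a 1 KB table. The user-space sketch below reproduces mix_columns() and inv_mix_columns() from this patch with stdint types so they can be tried outside the kernel. ror32() is reimplemented locally as an assumption about the kernel helper's behaviour (rotate right), and byte 0 of a column is taken to live in the least significant bits.

```c
#include <stdint.h>

/* Local stand-in for the kernel's ror32(); valid for shifts of 1..31. */
static uint32_t ror32(uint32_t w, unsigned int shift)
{
	return (w >> shift) | (w << (32 - shift));
}

/* Multiply each packed byte by 'x' (0x02) in GF(2^8). */
static uint32_t mul_by_x(uint32_t w)
{
	uint32_t x = w & 0x7f7f7f7f;
	uint32_t y = w & 0x80808080;

	return (x << 1) ^ (y >> 7) * 0x1b;
}

/* Multiply each packed byte by 'x^2' (0x04) in GF(2^8). */
static uint32_t mul_by_x2(uint32_t w)
{
	uint32_t x = w & 0x3f3f3f3f;
	uint32_t y = w & 0x80808080;
	uint32_t z = w & 0x40404040;

	return (x << 2) ^ (y >> 7) * 0x36 ^ (z >> 6) * 0x1b;
}

/* out[i] = 2*b[i] ^ 3*b[i+1] ^ b[i+2] ^ b[i+3], indices taken mod 4 */
static uint32_t mix_columns(uint32_t x)
{
	uint32_t y = mul_by_x(x) ^ ror32(x, 16);

	return y ^ ror32(x ^ y, 8);
}

/* Pre-multiply by the (5, 0, 4, 0) circulant, then reuse mix_columns(). */
static uint32_t inv_mix_columns(uint32_t x)
{
	uint32_t y = mul_by_x2(x);

	return mix_columns(x ^ y ^ ror32(y, 16));
}
```

With that byte order, the commonly quoted MixColumns example column db 13 53 45 is passed as 0x455313db, and inv_mix_columns() applied to the result of mix_columns() returns the original word.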
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm/crypto/Kconfig | 2 +-
arch/arm64/crypto/Kconfig | 2 +-
crypto/Kconfig | 5 +
crypto/Makefile | 1 +
crypto/aes_core.c | 333 ++++++++++++++++++++
crypto/aes_generic.c | 178 -----------
crypto/aes_ti.c | 305 ++----------------
drivers/crypto/Kconfig | 8 +-
include/crypto/aes.h | 6 +
9 files changed, 374 insertions(+), 466 deletions(-)
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index b9adedcc5b2e..fd77aebcb7a9 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -73,7 +73,7 @@ config CRYPTO_AES_ARM_BS
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
select CRYPTO_SIMD
- select CRYPTO_AES
+ select CRYPTO_AES_CORE
help
Use a faster and more secure NEON based implementation of AES in CBC,
CTR and XTS modes
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index d92293747d63..db55e069c17b 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -68,7 +68,7 @@ config CRYPTO_AES_ARM64_NEON_BLK
tristate "AES in ECB/CBC/CTR/XTS modes using NEON instructions"
depends on ARM64 && KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
- select CRYPTO_AES
+ select CRYPTO_AES_CORE
select CRYPTO_SIMD
config CRYPTO_CHACHA20_NEON
diff --git a/crypto/Kconfig b/crypto/Kconfig
index caa770e535a2..b4edea2aed22 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -894,9 +894,13 @@ config CRYPTO_GHASH_CLMUL_NI_INTEL
comment "Ciphers"
+config CRYPTO_AES_CORE
+ tristate
+
config CRYPTO_AES
tristate "AES cipher algorithms"
select CRYPTO_ALGAPI
+ select CRYPTO_AES_CORE
help
AES cipher algorithms (FIPS-197). AES uses the Rijndael
algorithm.
@@ -917,6 +921,7 @@ config CRYPTO_AES
config CRYPTO_AES_TI
tristate "Fixed time AES cipher"
select CRYPTO_ALGAPI
+ select CRYPTO_AES_CORE
help
This is a generic implementation of AES that attempts to eliminate
data dependent latencies as much as possible without affecting
diff --git a/crypto/Makefile b/crypto/Makefile
index d41f0331b085..0979ca461ddb 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_CRYPTO_TWOFISH) += twofish_generic.o
obj-$(CONFIG_CRYPTO_TWOFISH_COMMON) += twofish_common.o
obj-$(CONFIG_CRYPTO_SERPENT) += serpent_generic.o
CFLAGS_serpent_generic.o := $(call cc-option,-fsched-pressure) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
+obj-$(CONFIG_CRYPTO_AES_CORE) += aes_core.o
obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
obj-$(CONFIG_CRYPTO_AES_TI) += aes_ti.o
obj-$(CONFIG_CRYPTO_CAMELLIA) += camellia_generic.o
diff --git a/crypto/aes_core.c b/crypto/aes_core.c
new file mode 100644
index 000000000000..3f3d1f2c813e
--- /dev/null
+++ b/crypto/aes_core.c
@@ -0,0 +1,333 @@
+/*
+ * Shared AES primitives for accelerated and generic implementations
+ *
+ * Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <crypto/aes.h>
+#include <linux/crypto.h>
+#include <linux/module.h>
+#include <asm/unaligned.h>
+
+static const u8 __cacheline_aligned aes_sbox[] = {
+ 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
+ 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
+ 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0,
+ 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
+ 0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc,
+ 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
+ 0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a,
+ 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,
+ 0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0,
+ 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,
+ 0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b,
+ 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,
+ 0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85,
+ 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,
+ 0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5,
+ 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,
+ 0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17,
+ 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,
+ 0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88,
+ 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,
+ 0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c,
+ 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,
+ 0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9,
+ 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,
+ 0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6,
+ 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,
+ 0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e,
+ 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,
+ 0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94,
+ 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
+ 0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68,
+ 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16,
+};
+
+static const u8 __cacheline_aligned aes_inv_sbox[] = {
+ 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38,
+ 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
+ 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87,
+ 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
+ 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d,
+ 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e,
+ 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2,
+ 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25,
+ 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16,
+ 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92,
+ 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda,
+ 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84,
+ 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a,
+ 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06,
+ 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02,
+ 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b,
+ 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea,
+ 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73,
+ 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85,
+ 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e,
+ 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89,
+ 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b,
+ 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20,
+ 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4,
+ 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31,
+ 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f,
+ 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d,
+ 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
+ 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0,
+ 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
+ 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26,
+ 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d,
+};
+
+static u32 mul_by_x(u32 w)
+{
+ u32 x = w & 0x7f7f7f7f;
+ u32 y = w & 0x80808080;
+
+ /* multiply by polynomial 'x' (0b10) in GF(2^8) */
+ return (x << 1) ^ (y >> 7) * 0x1b;
+}
+
+static u32 mul_by_x2(u32 w)
+{
+ u32 x = w & 0x3f3f3f3f;
+ u32 y = w & 0x80808080;
+ u32 z = w & 0x40404040;
+
+ /* multiply by polynomial 'x^2' (0b100) in GF(2^8) */
+ return (x << 2) ^ (y >> 7) * 0x36 ^ (z >> 6) * 0x1b;
+}
+
+static u32 mix_columns(u32 x)
+{
+ /*
+ * Perform the following matrix multiplication in GF(2^8)
+ *
+ * | 0x2 0x3 0x1 0x1 | | x[0] |
+ * | 0x1 0x2 0x3 0x1 | | x[1] |
+ * | 0x1 0x1 0x2 0x3 | x | x[2] |
+ * | 0x3 0x1 0x1 0x2 | | x[3] |
+ */
+ u32 y = mul_by_x(x) ^ ror32(x, 16);
+
+ return y ^ ror32(x ^ y, 8);
+}
+
+static u32 inv_mix_columns(u32 x)
+{
+ /*
+ * Perform the following matrix multiplication in GF(2^8)
+ *
+ * | 0xe 0xb 0xd 0x9 | | x[0] |
+ * | 0x9 0xe 0xb 0xd | | x[1] |
+ * | 0xd 0x9 0xe 0xb | x | x[2] |
+ * | 0xb 0xd 0x9 0xe | | x[3] |
+ *
+ * which can conveniently be reduced to
+ *
+ * | 0x2 0x3 0x1 0x1 | | 0x5 0x0 0x4 0x0 | | x[0] |
+ * | 0x1 0x2 0x3 0x1 | | 0x0 0x5 0x0 0x4 | | x[1] |
+ * | 0x1 0x1 0x2 0x3 | x | 0x4 0x0 0x5 0x0 | x | x[2] |
+ * | 0x3 0x1 0x1 0x2 | | 0x0 0x4 0x0 0x5 | | x[3] |
+ */
+ u32 y = mul_by_x2(x);
+
+ return mix_columns(x ^ y ^ ror32(y, 16));
+}
+
+static __always_inline u32 subshift(u32 in[], int pos)
+{
+ return (aes_sbox[in[pos] & 0xff]) ^
+ (aes_sbox[(in[(pos + 1) % 4] >> 8) & 0xff] << 8) ^
+ (aes_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
+ (aes_sbox[(in[(pos + 3) % 4] >> 24) & 0xff] << 24);
+}
+
+static __always_inline u32 inv_subshift(u32 in[], int pos)
+{
+ return (aes_inv_sbox[in[pos] & 0xff]) ^
+ (aes_inv_sbox[(in[(pos + 3) % 4] >> 8) & 0xff] << 8) ^
+ (aes_inv_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
+ (aes_inv_sbox[(in[(pos + 1) % 4] >> 24) & 0xff] << 24);
+}
+
+static u32 subw(u32 in)
+{
+ return (aes_sbox[in & 0xff]) ^
+ (aes_sbox[(in >> 8) & 0xff] << 8) ^
+ (aes_sbox[(in >> 16) & 0xff] << 16) ^
+ (aes_sbox[(in >> 24) & 0xff] << 24);
+}
+
+int crypto_aes_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
+ unsigned int key_len)
+{
+ u32 kwords = key_len / sizeof(u32);
+ u32 rc, i, j;
+
+ if (key_len != AES_KEYSIZE_128 &&
+ key_len != AES_KEYSIZE_192 &&
+ key_len != AES_KEYSIZE_256)
+ return -EINVAL;
+
+ ctx->key_length = key_len;
+
+ for (i = 0; i < kwords; i++)
+ ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
+
+ for (i = 0, rc = 1; i < 10; i++, rc = mul_by_x(rc)) {
+ u32 *rki = ctx->key_enc + (i * kwords);
+ u32 *rko = rki + kwords;
+
+ rko[0] = ror32(subw(rki[kwords - 1]), 8) ^ rc ^ rki[0];
+ rko[1] = rko[0] ^ rki[1];
+ rko[2] = rko[1] ^ rki[2];
+ rko[3] = rko[2] ^ rki[3];
+
+ if (key_len == AES_KEYSIZE_192) {
+ if (i >= 7)
+ break;
+ rko[4] = rko[3] ^ rki[4];
+ rko[5] = rko[4] ^ rki[5];
+ } else if (key_len == AES_KEYSIZE_256) {
+ if (i >= 6)
+ break;
+ rko[4] = subw(rko[3]) ^ rki[4];
+ rko[5] = rko[4] ^ rki[5];
+ rko[6] = rko[5] ^ rki[6];
+ rko[7] = rko[6] ^ rki[7];
+ }
+ }
+
+ /*
+ * Generate the decryption keys for the Equivalent Inverse Cipher.
+ * This involves reversing the order of the round keys, and applying
+ * the Inverse Mix Columns transformation to all but the first and
+ * the last one.
+ */
+ ctx->key_dec[0] = ctx->key_enc[key_len + 24];
+ ctx->key_dec[1] = ctx->key_enc[key_len + 25];
+ ctx->key_dec[2] = ctx->key_enc[key_len + 26];
+ ctx->key_dec[3] = ctx->key_enc[key_len + 27];
+
+ for (i = 4, j = key_len + 20; j > 0; i += 4, j -= 4) {
+ ctx->key_dec[i] = inv_mix_columns(ctx->key_enc[j]);
+ ctx->key_dec[i + 1] = inv_mix_columns(ctx->key_enc[j + 1]);
+ ctx->key_dec[i + 2] = inv_mix_columns(ctx->key_enc[j + 2]);
+ ctx->key_dec[i + 3] = inv_mix_columns(ctx->key_enc[j + 3]);
+ }
+
+ ctx->key_dec[i] = ctx->key_enc[0];
+ ctx->key_dec[i + 1] = ctx->key_enc[1];
+ ctx->key_dec[i + 2] = ctx->key_enc[2];
+ ctx->key_dec[i + 3] = ctx->key_enc[3];
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(crypto_aes_expand_key);
+
+/**
+ * crypto_aes_set_key - Set the AES key.
+ * @tfm: The %crypto_tfm that is used in the context.
+ * @in_key: The input key.
+ * @key_len: The size of the key.
+ *
+ * Returns 0 on success, on failure the %CRYPTO_TFM_RES_BAD_KEY_LEN flag in tfm
+ * is set. The function uses crypto_aes_expand_key() to expand the key.
+ * &crypto_aes_ctx _must_ be the private data embedded in @tfm which is
+ * retrieved with crypto_tfm_ctx().
+ */
+int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
+ unsigned int key_len)
+{
+ struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+
+ if (crypto_aes_expand_key(ctx, in_key, key_len)) {
+ tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
+ return -EINVAL;
+ }
+ return 0;
+}
+EXPORT_SYMBOL_GPL(crypto_aes_set_key);
+
+void crypto_aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in)
+{
+ const u32 *rkp = ctx->key_enc + 4;
+ int rounds = 6 + ctx->key_length / 4;
+ u32 st0[4], st1[4];
+ int round;
+
+ st0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
+ st0[1] = ctx->key_enc[1] ^ get_unaligned_le32(in + 4);
+ st0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
+ st0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);
+
+ for (round = 0;; round += 2, rkp += 8) {
+ st1[0] = mix_columns(subshift(st0, 0)) ^ rkp[0];
+ st1[1] = mix_columns(subshift(st0, 1)) ^ rkp[1];
+ st1[2] = mix_columns(subshift(st0, 2)) ^ rkp[2];
+ st1[3] = mix_columns(subshift(st0, 3)) ^ rkp[3];
+
+ if (round == rounds - 2)
+ break;
+
+ st0[0] = mix_columns(subshift(st1, 0)) ^ rkp[4];
+ st0[1] = mix_columns(subshift(st1, 1)) ^ rkp[5];
+ st0[2] = mix_columns(subshift(st1, 2)) ^ rkp[6];
+ st0[3] = mix_columns(subshift(st1, 3)) ^ rkp[7];
+ }
+
+ put_unaligned_le32(subshift(st1, 0) ^ rkp[4], out);
+ put_unaligned_le32(subshift(st1, 1) ^ rkp[5], out + 4);
+ put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
+ put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
+}
+EXPORT_SYMBOL_GPL(crypto_aes_encrypt);
+
+void crypto_aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in)
+{
+ const u32 *rkp = ctx->key_dec + 4;
+ int rounds = 6 + ctx->key_length / 4;
+ u32 st0[4], st1[4];
+ int round;
+
+ st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
+ st0[1] = ctx->key_dec[1] ^ get_unaligned_le32(in + 4);
+ st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
+ st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);
+
+ for (round = 0;; round += 2, rkp += 8) {
+ st1[0] = inv_mix_columns(inv_subshift(st0, 0)) ^ rkp[0];
+ st1[1] = inv_mix_columns(inv_subshift(st0, 1)) ^ rkp[1];
+ st1[2] = inv_mix_columns(inv_subshift(st0, 2)) ^ rkp[2];
+ st1[3] = inv_mix_columns(inv_subshift(st0, 3)) ^ rkp[3];
+
+ if (round == rounds - 2)
+ break;
+
+ st0[0] = inv_mix_columns(inv_subshift(st1, 0)) ^ rkp[4];
+ st0[1] = inv_mix_columns(inv_subshift(st1, 1)) ^ rkp[5];
+ st0[2] = inv_mix_columns(inv_subshift(st1, 2)) ^ rkp[6];
+ st0[3] = inv_mix_columns(inv_subshift(st1, 3)) ^ rkp[7];
+ }
+
+ put_unaligned_le32(inv_subshift(st1, 0) ^ rkp[4], out);
+ put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[5], out + 4);
+ put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
+ put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
+}
+EXPORT_SYMBOL_GPL(crypto_aes_decrypt);
+
+extern volatile const u8 __aesti_sbox[256] __alias(aes_sbox);
+EXPORT_SYMBOL_GPL(__aesti_sbox);
+
+extern volatile const u8 __aesti_inv_sbox[256] __alias(aes_inv_sbox);
+EXPORT_SYMBOL_GPL(__aesti_inv_sbox);
+
+MODULE_DESCRIPTION("Shared AES core routines");
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
index ca554d57d01e..c0a7cf9ab574 100644
--- a/crypto/aes_generic.c
+++ b/crypto/aes_generic.c
@@ -61,8 +61,6 @@ static inline u8 byte(const u32 x, const unsigned n)
return x >> (n << 3);
}
-static const u32 rco_tab[10] = { 1, 2, 4, 8, 16, 32, 64, 128, 27, 54 };
-
__visible const u32 crypto_ft_tab[4][256] = {
{
0xa56363c6, 0x847c7cf8, 0x997777ee, 0x8d7b7bf6,
@@ -1124,182 +1122,6 @@ EXPORT_SYMBOL_GPL(crypto_fl_tab);
EXPORT_SYMBOL_GPL(crypto_it_tab);
EXPORT_SYMBOL_GPL(crypto_il_tab);
-/* initialise the key schedule from the user supplied key */
-
-#define star_x(x) (((x) & 0x7f7f7f7f) << 1) ^ ((((x) & 0x80808080) >> 7) * 0x1b)
-
-#define imix_col(y, x) do { \
- u = star_x(x); \
- v = star_x(u); \
- w = star_x(v); \
- t = w ^ (x); \
- (y) = u ^ v ^ w; \
- (y) ^= ror32(u ^ t, 8) ^ \
- ror32(v ^ t, 16) ^ \
- ror32(t, 24); \
-} while (0)
-
-#define ls_box(x) \
- crypto_fl_tab[0][byte(x, 0)] ^ \
- crypto_fl_tab[1][byte(x, 1)] ^ \
- crypto_fl_tab[2][byte(x, 2)] ^ \
- crypto_fl_tab[3][byte(x, 3)]
-
-#define loop4(i) do { \
- t = ror32(t, 8); \
- t = ls_box(t) ^ rco_tab[i]; \
- t ^= ctx->key_enc[4 * i]; \
- ctx->key_enc[4 * i + 4] = t; \
- t ^= ctx->key_enc[4 * i + 1]; \
- ctx->key_enc[4 * i + 5] = t; \
- t ^= ctx->key_enc[4 * i + 2]; \
- ctx->key_enc[4 * i + 6] = t; \
- t ^= ctx->key_enc[4 * i + 3]; \
- ctx->key_enc[4 * i + 7] = t; \
-} while (0)
-
-#define loop6(i) do { \
- t = ror32(t, 8); \
- t = ls_box(t) ^ rco_tab[i]; \
- t ^= ctx->key_enc[6 * i]; \
- ctx->key_enc[6 * i + 6] = t; \
- t ^= ctx->key_enc[6 * i + 1]; \
- ctx->key_enc[6 * i + 7] = t; \
- t ^= ctx->key_enc[6 * i + 2]; \
- ctx->key_enc[6 * i + 8] = t; \
- t ^= ctx->key_enc[6 * i + 3]; \
- ctx->key_enc[6 * i + 9] = t; \
- t ^= ctx->key_enc[6 * i + 4]; \
- ctx->key_enc[6 * i + 10] = t; \
- t ^= ctx->key_enc[6 * i + 5]; \
- ctx->key_enc[6 * i + 11] = t; \
-} while (0)
-
-#define loop8tophalf(i) do { \
- t = ror32(t, 8); \
- t = ls_box(t) ^ rco_tab[i]; \
- t ^= ctx->key_enc[8 * i]; \
- ctx->key_enc[8 * i + 8] = t; \
- t ^= ctx->key_enc[8 * i + 1]; \
- ctx->key_enc[8 * i + 9] = t; \
- t ^= ctx->key_enc[8 * i + 2]; \
- ctx->key_enc[8 * i + 10] = t; \
- t ^= ctx->key_enc[8 * i + 3]; \
- ctx->key_enc[8 * i + 11] = t; \
-} while (0)
-
-#define loop8(i) do { \
- loop8tophalf(i); \
- t = ctx->key_enc[8 * i + 4] ^ ls_box(t); \
- ctx->key_enc[8 * i + 12] = t; \
- t ^= ctx->key_enc[8 * i + 5]; \
- ctx->key_enc[8 * i + 13] = t; \
- t ^= ctx->key_enc[8 * i + 6]; \
- ctx->key_enc[8 * i + 14] = t; \
- t ^= ctx->key_enc[8 * i + 7]; \
- ctx->key_enc[8 * i + 15] = t; \
-} while (0)
-
-/**
- * crypto_aes_expand_key - Expands the AES key as described in FIPS-197
- * @ctx: The location where the computed key will be stored.
- * @in_key: The supplied key.
- * @key_len: The length of the supplied key.
- *
- * Returns 0 on success. The function fails only if an invalid key size (or
- * pointer) is supplied.
- * The expanded key size is 240 bytes (max of 14 rounds with a unique 16 bytes
- * key schedule plus a 16 bytes key which is used before the first round).
- * The decryption key is prepared for the "Equivalent Inverse Cipher" as
- * described in FIPS-197. The first slot (16 bytes) of each key (enc or dec) is
- * for the initial combination, the second slot for the first round and so on.
- */
-int crypto_aes_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
- unsigned int key_len)
-{
- u32 i, t, u, v, w, j;
-
- if (key_len != AES_KEYSIZE_128 && key_len != AES_KEYSIZE_192 &&
- key_len != AES_KEYSIZE_256)
- return -EINVAL;
-
- ctx->key_length = key_len;
-
- ctx->key_enc[0] = get_unaligned_le32(in_key);
- ctx->key_enc[1] = get_unaligned_le32(in_key + 4);
- ctx->key_enc[2] = get_unaligned_le32(in_key + 8);
- ctx->key_enc[3] = get_unaligned_le32(in_key + 12);
-
- ctx->key_dec[key_len + 24] = ctx->key_enc[0];
- ctx->key_dec[key_len + 25] = ctx->key_enc[1];
- ctx->key_dec[key_len + 26] = ctx->key_enc[2];
- ctx->key_dec[key_len + 27] = ctx->key_enc[3];
-
- switch (key_len) {
- case AES_KEYSIZE_128:
- t = ctx->key_enc[3];
- for (i = 0; i < 10; ++i)
- loop4(i);
- break;
-
- case AES_KEYSIZE_192:
- ctx->key_enc[4] = get_unaligned_le32(in_key + 16);
- t = ctx->key_enc[5] = get_unaligned_le32(in_key + 20);
- for (i = 0; i < 8; ++i)
- loop6(i);
- break;
-
- case AES_KEYSIZE_256:
- ctx->key_enc[4] = get_unaligned_le32(in_key + 16);
- ctx->key_enc[5] = get_unaligned_le32(in_key + 20);
- ctx->key_enc[6] = get_unaligned_le32(in_key + 24);
- t = ctx->key_enc[7] = get_unaligned_le32(in_key + 28);
- for (i = 0; i < 6; ++i)
- loop8(i);
- loop8tophalf(i);
- break;
- }
-
- ctx->key_dec[0] = ctx->key_enc[key_len + 24];
- ctx->key_dec[1] = ctx->key_enc[key_len + 25];
- ctx->key_dec[2] = ctx->key_enc[key_len + 26];
- ctx->key_dec[3] = ctx->key_enc[key_len + 27];
-
- for (i = 4; i < key_len + 24; ++i) {
- j = key_len + 24 - (i & ~3) + (i & 3);
- imix_col(ctx->key_dec[j], ctx->key_enc[i]);
- }
- return 0;
-}
-EXPORT_SYMBOL_GPL(crypto_aes_expand_key);
-
-/**
- * crypto_aes_set_key - Set the AES key.
- * @tfm: The %crypto_tfm that is used in the context.
- * @in_key: The input key.
- * @key_len: The size of the key.
- *
- * Returns 0 on success, on failure the %CRYPTO_TFM_RES_BAD_KEY_LEN flag in tfm
- * is set. The function uses crypto_aes_expand_key() to expand the key.
- * &crypto_aes_ctx _must_ be the private data embedded in @tfm which is
- * retrieved with crypto_tfm_ctx().
- */
-int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
- unsigned int key_len)
-{
- struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
- u32 *flags = &tfm->crt_flags;
- int ret;
-
- ret = crypto_aes_expand_key(ctx, in_key, key_len);
- if (!ret)
- return 0;
-
- *flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
- return -EINVAL;
-}
-EXPORT_SYMBOL_GPL(crypto_aes_set_key);
-
/* encrypt a block of text */
#define f_rn(bo, bi, n, k) do { \
diff --git a/crypto/aes_ti.c b/crypto/aes_ti.c
index 92644fd1ac19..57cfdeace1f2 100644
--- a/crypto/aes_ti.c
+++ b/crypto/aes_ti.c
@@ -13,225 +13,8 @@
#include <linux/module.h>
#include <asm/unaligned.h>
-/*
- * Emit the sbox as volatile const to prevent the compiler from doing
- * constant folding on sbox references involving fixed indexes.
- */
-static volatile const u8 __cacheline_aligned __aesti_sbox[] = {
- 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
- 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
- 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0,
- 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
- 0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc,
- 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
- 0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a,
- 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,
- 0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0,
- 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,
- 0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b,
- 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,
- 0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85,
- 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,
- 0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5,
- 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,
- 0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17,
- 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,
- 0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88,
- 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,
- 0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c,
- 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,
- 0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9,
- 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,
- 0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6,
- 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,
- 0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e,
- 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,
- 0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94,
- 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
- 0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68,
- 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16,
-};
-
-static volatile const u8 __cacheline_aligned __aesti_inv_sbox[] = {
- 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38,
- 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
- 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87,
- 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
- 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d,
- 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e,
- 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2,
- 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25,
- 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16,
- 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92,
- 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda,
- 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84,
- 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a,
- 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06,
- 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02,
- 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b,
- 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea,
- 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73,
- 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85,
- 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e,
- 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89,
- 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b,
- 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20,
- 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4,
- 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31,
- 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f,
- 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d,
- 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
- 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0,
- 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
- 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26,
- 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d,
-};
-
-static u32 mul_by_x(u32 w)
-{
- u32 x = w & 0x7f7f7f7f;
- u32 y = w & 0x80808080;
-
- /* multiply by polynomial 'x' (0b10) in GF(2^8) */
- return (x << 1) ^ (y >> 7) * 0x1b;
-}
-
-static u32 mul_by_x2(u32 w)
-{
- u32 x = w & 0x3f3f3f3f;
- u32 y = w & 0x80808080;
- u32 z = w & 0x40404040;
-
- /* multiply by polynomial 'x^2' (0b100) in GF(2^8) */
- return (x << 2) ^ (y >> 7) * 0x36 ^ (z >> 6) * 0x1b;
-}
-
-static u32 mix_columns(u32 x)
-{
- /*
- * Perform the following matrix multiplication in GF(2^8)
- *
- * | 0x2 0x3 0x1 0x1 | | x[0] |
- * | 0x1 0x2 0x3 0x1 | | x[1] |
- * | 0x1 0x1 0x2 0x3 | x | x[2] |
- * | 0x3 0x1 0x1 0x2 | | x[3] |
- */
- u32 y = mul_by_x(x) ^ ror32(x, 16);
-
- return y ^ ror32(x ^ y, 8);
-}
-
-static u32 inv_mix_columns(u32 x)
-{
- /*
- * Perform the following matrix multiplication in GF(2^8)
- *
- * | 0xe 0xb 0xd 0x9 | | x[0] |
- * | 0x9 0xe 0xb 0xd | | x[1] |
- * | 0xd 0x9 0xe 0xb | x | x[2] |
- * | 0xb 0xd 0x9 0xe | | x[3] |
- *
- * which can conveniently be reduced to
- *
- * | 0x2 0x3 0x1 0x1 | | 0x5 0x0 0x4 0x0 | | x[0] |
- * | 0x1 0x2 0x3 0x1 | | 0x0 0x5 0x0 0x4 | | x[1] |
- * | 0x1 0x1 0x2 0x3 | x | 0x4 0x0 0x5 0x0 | x | x[2] |
- * | 0x3 0x1 0x1 0x2 | | 0x0 0x4 0x0 0x5 | | x[3] |
- */
- u32 y = mul_by_x2(x);
-
- return mix_columns(x ^ y ^ ror32(y, 16));
-}
-
-static __always_inline u32 subshift(u32 in[], int pos)
-{
- return (__aesti_sbox[in[pos] & 0xff]) ^
- (__aesti_sbox[(in[(pos + 1) % 4] >> 8) & 0xff] << 8) ^
- (__aesti_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
- (__aesti_sbox[(in[(pos + 3) % 4] >> 24) & 0xff] << 24);
-}
-
-static __always_inline u32 inv_subshift(u32 in[], int pos)
-{
- return (__aesti_inv_sbox[in[pos] & 0xff]) ^
- (__aesti_inv_sbox[(in[(pos + 3) % 4] >> 8) & 0xff] << 8) ^
- (__aesti_inv_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
- (__aesti_inv_sbox[(in[(pos + 1) % 4] >> 24) & 0xff] << 24);
-}
-
-static u32 subw(u32 in)
-{
- return (__aesti_sbox[in & 0xff]) ^
- (__aesti_sbox[(in >> 8) & 0xff] << 8) ^
- (__aesti_sbox[(in >> 16) & 0xff] << 16) ^
- (__aesti_sbox[(in >> 24) & 0xff] << 24);
-}
-
-static int aesti_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
- unsigned int key_len)
-{
- u32 kwords = key_len / sizeof(u32);
- u32 rc, i, j;
-
- if (key_len != AES_KEYSIZE_128 &&
- key_len != AES_KEYSIZE_192 &&
- key_len != AES_KEYSIZE_256)
- return -EINVAL;
-
- ctx->key_length = key_len;
-
- for (i = 0; i < kwords; i++)
- ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
-
- for (i = 0, rc = 1; i < 10; i++, rc = mul_by_x(rc)) {
- u32 *rki = ctx->key_enc + (i * kwords);
- u32 *rko = rki + kwords;
-
- rko[0] = ror32(subw(rki[kwords - 1]), 8) ^ rc ^ rki[0];
- rko[1] = rko[0] ^ rki[1];
- rko[2] = rko[1] ^ rki[2];
- rko[3] = rko[2] ^ rki[3];
-
- if (key_len == 24) {
- if (i >= 7)
- break;
- rko[4] = rko[3] ^ rki[4];
- rko[5] = rko[4] ^ rki[5];
- } else if (key_len == 32) {
- if (i >= 6)
- break;
- rko[4] = subw(rko[3]) ^ rki[4];
- rko[5] = rko[4] ^ rki[5];
- rko[6] = rko[5] ^ rki[6];
- rko[7] = rko[6] ^ rki[7];
- }
- }
-
- /*
- * Generate the decryption keys for the Equivalent Inverse Cipher.
- * This involves reversing the order of the round keys, and applying
- * the Inverse Mix Columns transformation to all but the first and
- * the last one.
- */
- ctx->key_dec[0] = ctx->key_enc[key_len + 24];
- ctx->key_dec[1] = ctx->key_enc[key_len + 25];
- ctx->key_dec[2] = ctx->key_enc[key_len + 26];
- ctx->key_dec[3] = ctx->key_enc[key_len + 27];
-
- for (i = 4, j = key_len + 20; j > 0; i += 4, j -= 4) {
- ctx->key_dec[i] = inv_mix_columns(ctx->key_enc[j]);
- ctx->key_dec[i + 1] = inv_mix_columns(ctx->key_enc[j + 1]);
- ctx->key_dec[i + 2] = inv_mix_columns(ctx->key_enc[j + 2]);
- ctx->key_dec[i + 3] = inv_mix_columns(ctx->key_enc[j + 3]);
- }
-
- ctx->key_dec[i] = ctx->key_enc[0];
- ctx->key_dec[i + 1] = ctx->key_enc[1];
- ctx->key_dec[i + 2] = ctx->key_enc[2];
- ctx->key_dec[i + 3] = ctx->key_enc[3];
-
- return 0;
-}
+extern volatile const u8 __aesti_sbox[];
+extern volatile const u8 __aesti_inv_sbox[];
static int aesti_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len)
@@ -239,7 +22,7 @@ static int aesti_set_key(struct crypto_tfm *tfm, const u8 *in_key,
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
int err;
- err = aesti_expand_key(ctx, in_key, key_len);
+ err = crypto_aes_expand_key(ctx, in_key, key_len);
if (err)
return err;
@@ -266,79 +49,37 @@ static int aesti_set_key(struct crypto_tfm *tfm, const u8 *in_key,
static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
- const u32 *rkp = ctx->key_enc + 4;
- int rounds = 6 + ctx->key_length / 4;
- u32 st0[4], st1[4];
- int round;
-
- st0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
- st0[1] = ctx->key_enc[1] ^ get_unaligned_le32(in + 4);
- st0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
- st0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);
-
- st0[0] ^= __aesti_sbox[ 0] ^ __aesti_sbox[128];
- st0[1] ^= __aesti_sbox[32] ^ __aesti_sbox[160];
- st0[2] ^= __aesti_sbox[64] ^ __aesti_sbox[192];
- st0[3] ^= __aesti_sbox[96] ^ __aesti_sbox[224];
-
- for (round = 0;; round += 2, rkp += 8) {
- st1[0] = mix_columns(subshift(st0, 0)) ^ rkp[0];
- st1[1] = mix_columns(subshift(st0, 1)) ^ rkp[1];
- st1[2] = mix_columns(subshift(st0, 2)) ^ rkp[2];
- st1[3] = mix_columns(subshift(st0, 3)) ^ rkp[3];
+ u32 st[4];
- if (round == rounds - 2)
- break;
+ st[0] = get_unaligned_le32(in);
+ st[1] = get_unaligned_le32(in + 4);
+ st[2] = get_unaligned_le32(in + 8);
+ st[3] = get_unaligned_le32(in + 12);
- st0[0] = mix_columns(subshift(st1, 0)) ^ rkp[4];
- st0[1] = mix_columns(subshift(st1, 1)) ^ rkp[5];
- st0[2] = mix_columns(subshift(st1, 2)) ^ rkp[6];
- st0[3] = mix_columns(subshift(st1, 3)) ^ rkp[7];
- }
+ st[0] ^= __aesti_sbox[ 0] ^ __aesti_sbox[128];
+ st[1] ^= __aesti_sbox[32] ^ __aesti_sbox[160];
+ st[2] ^= __aesti_sbox[64] ^ __aesti_sbox[192];
+ st[3] ^= __aesti_sbox[96] ^ __aesti_sbox[224];
- put_unaligned_le32(subshift(st1, 0) ^ rkp[4], out);
- put_unaligned_le32(subshift(st1, 1) ^ rkp[5], out + 4);
- put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
- put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
+ crypto_aes_encrypt(ctx, out, (u8 *)st);
}
static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
- const u32 *rkp = ctx->key_dec + 4;
- int rounds = 6 + ctx->key_length / 4;
- u32 st0[4], st1[4];
- int round;
-
- st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
- st0[1] = ctx->key_dec[1] ^ get_unaligned_le32(in + 4);
- st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
- st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);
-
- st0[0] ^= __aesti_inv_sbox[ 0] ^ __aesti_inv_sbox[128];
- st0[1] ^= __aesti_inv_sbox[32] ^ __aesti_inv_sbox[160];
- st0[2] ^= __aesti_inv_sbox[64] ^ __aesti_inv_sbox[192];
- st0[3] ^= __aesti_inv_sbox[96] ^ __aesti_inv_sbox[224];
-
- for (round = 0;; round += 2, rkp += 8) {
- st1[0] = inv_mix_columns(inv_subshift(st0, 0)) ^ rkp[0];
- st1[1] = inv_mix_columns(inv_subshift(st0, 1)) ^ rkp[1];
- st1[2] = inv_mix_columns(inv_subshift(st0, 2)) ^ rkp[2];
- st1[3] = inv_mix_columns(inv_subshift(st0, 3)) ^ rkp[3];
+ u32 st[4];
- if (round == rounds - 2)
- break;
+ st[0] = get_unaligned_le32(in);
+ st[1] = get_unaligned_le32(in + 4);
+ st[2] = get_unaligned_le32(in + 8);
+ st[3] = get_unaligned_le32(in + 12);
- st0[0] = inv_mix_columns(inv_subshift(st1, 0)) ^ rkp[4];
- st0[1] = inv_mix_columns(inv_subshift(st1, 1)) ^ rkp[5];
- st0[2] = inv_mix_columns(inv_subshift(st1, 2)) ^ rkp[6];
- st0[3] = inv_mix_columns(inv_subshift(st1, 3)) ^ rkp[7];
- }
+ st[0] ^= __aesti_inv_sbox[ 0] ^ __aesti_inv_sbox[128];
+ st[1] ^= __aesti_inv_sbox[32] ^ __aesti_inv_sbox[160];
+ st[2] ^= __aesti_inv_sbox[64] ^ __aesti_inv_sbox[192];
+ st[3] ^= __aesti_inv_sbox[96] ^ __aesti_inv_sbox[224];
- put_unaligned_le32(inv_subshift(st1, 0) ^ rkp[4], out);
- put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[5], out + 4);
- put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
- put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
+ crypto_aes_decrypt(ctx, out, (u8 *)st);
}
static struct crypto_alg aes_alg = {
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 7a737c1c669e..704712d226e4 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -26,7 +26,7 @@ config CRYPTO_DEV_PADLOCK_AES
tristate "PadLock driver for AES algorithm"
depends on CRYPTO_DEV_PADLOCK
select CRYPTO_BLKCIPHER
- select CRYPTO_AES
+ select CRYPTO_AES_CORE
help
Use VIA PadLock for AES algorithm.
@@ -189,7 +189,7 @@ config CRYPTO_CRC32_S390
config CRYPTO_DEV_MV_CESA
tristate "Marvell's Cryptographic Engine"
depends on PLAT_ORION
- select CRYPTO_AES
+ select CRYPTO_AES_CORE
select CRYPTO_BLKCIPHER
select CRYPTO_HASH
select SRAM
@@ -203,7 +203,7 @@ config CRYPTO_DEV_MV_CESA
config CRYPTO_DEV_MARVELL_CESA
tristate "New Marvell's Cryptographic Engine driver"
depends on PLAT_ORION || ARCH_MVEBU
- select CRYPTO_AES
+ select CRYPTO_AES_CORE
select CRYPTO_DES
select CRYPTO_BLKCIPHER
select CRYPTO_HASH
@@ -655,7 +655,7 @@ config CRYPTO_DEV_SAFEXCEL
tristate "Inside Secure's SafeXcel cryptographic engine driver"
depends on HAS_DMA && OF
depends on (ARM64 && ARCH_MVEBU) || (COMPILE_TEST && 64BIT)
- select CRYPTO_AES
+ select CRYPTO_AES_CORE
select CRYPTO_BLKCIPHER
select CRYPTO_HASH
select CRYPTO_HMAC
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 7524ba3b6f3c..6374f91f5a0a 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -36,4 +36,10 @@ int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len);
int crypto_aes_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
unsigned int key_len);
+
+void crypto_aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out,
+ const u8 *in);
+void crypto_aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out,
+ const u8 *in);
+
#endif
--
2.7.4
* [PATCH v2 3/6] crypto: x86/aes-ni - switch to generic fallback
From: Ard Biesheuvel @ 2017-06-16 11:17 UTC (permalink / raw)
To: linux-crypto; +Cc: herbert, nico, ebiggers3, Ard Biesheuvel, Johannes Berg
The time invariant AES-NI implementation is SIMD based, and so it needs
a fallback in case the code is called from a context where SIMD is not
allowed. On x86, this is really only when executing in the context of an
interrupt taken while in kernel mode, since SIMD is allowed in all other
cases.
There is very little code in the kernel that actually performs AES in
interrupt context, and the code that does (mac80211) only does so when
using 802.11 devices that have no support for AES in hardware, and those
are rare these days.
So switch to the new AES core code as a fallback. It is much smaller, as
well as more resistant to cache timing attacks, and removing the
dependency allows us to disable the time variant drivers altogether if
desired.
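The dispatch described above can be sketched in plain C. This is only an illustration under stated assumptions: `simd_allowed`, `demo_simd_usable()`, `demo_simd_encrypt()` and `demo_scalar_encrypt()` are hypothetical stand-ins for irq_fpu_usable(), the AES-NI routine and the AES core fallback, and the placeholder "ciphers" merely copy the block and record which path ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the kernel's real functions. */
enum aes_path { PATH_NONE, PATH_SIMD, PATH_SCALAR };

static bool simd_allowed = true;		/* models irq_fpu_usable() */
static enum aes_path last_path = PATH_NONE;	/* which path ran last */

static bool demo_simd_usable(void)
{
	return simd_allowed;
}

/* Placeholder for the SIMD cipher: copies one 16-byte block. */
static void demo_simd_encrypt(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
	last_path = PATH_SIMD;
}

/* Placeholder for the small scalar fallback: copies one 16-byte block. */
static void demo_scalar_encrypt(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
	last_path = PATH_SCALAR;
}

/* Use the SIMD path when permitted, otherwise the scalar fallback. */
static void demo_encrypt(uint8_t *dst, const uint8_t *src)
{
	assert(dst != NULL && src != NULL);

	if (!demo_simd_usable())
		demo_scalar_encrypt(dst, src);
	else
		demo_simd_encrypt(dst, src);
}
```

Clearing `simd_allowed` before calling `demo_encrypt()` takes the scalar branch, which is the case this patch redirects to the AES core code.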
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/x86/crypto/aesni-intel_glue.c | 4 ++--
crypto/Kconfig | 3 +--
2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 4a55cdcdc008..1734e6185800 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -334,7 +334,7 @@ static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
if (!irq_fpu_usable())
- crypto_aes_encrypt_x86(ctx, dst, src);
+ crypto_aes_encrypt(ctx, dst, src);
else {
kernel_fpu_begin();
aesni_enc(ctx, dst, src);
@@ -347,7 +347,7 @@ static void aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
if (!irq_fpu_usable())
- crypto_aes_decrypt_x86(ctx, dst, src);
+ crypto_aes_decrypt(ctx, dst, src);
else {
kernel_fpu_begin();
aesni_dec(ctx, dst, src);
diff --git a/crypto/Kconfig b/crypto/Kconfig
index b4edea2aed22..1e6e021fda10 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -984,8 +984,7 @@ config CRYPTO_AES_NI_INTEL
tristate "AES cipher algorithms (AES-NI)"
depends on X86
select CRYPTO_AEAD
- select CRYPTO_AES_X86_64 if 64BIT
- select CRYPTO_AES_586 if !64BIT
+ select CRYPTO_AES_CORE
select CRYPTO_ALGAPI
select CRYPTO_BLKCIPHER
select CRYPTO_GLUE_HELPER_X86 if 64BIT
--
2.7.4
* [PATCH v2 4/6] crypto: aes - repurpose CRYPTO_AES and introduce CRYPTO_AES_GENERIC
From: Ard Biesheuvel @ 2017-06-16 11:17 UTC (permalink / raw)
To: linux-crypto; +Cc: herbert, nico, ebiggers3, Ard Biesheuvel
Repurpose the Kconfig symbol CRYPTO_AES to signify that a 'select' or
'depends on' relationship on it can be satisfied by any driver that
exposes an "aes" cipher.
The existing generic AES code is now controlled by a new Kconfig symbol
CRYPTO_AES_GENERIC, and only dependencies on CRYPTO_AES that truly depend
on its exported lookup tables are updated accordingly.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm/crypto/Kconfig | 2 +-
arch/arm64/crypto/Kconfig | 2 +-
crypto/Kconfig | 8 ++++++--
crypto/Makefile | 2 +-
net/sunrpc/Kconfig | 3 ++-
5 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index fd77aebcb7a9..3a6994ada2d1 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -64,7 +64,7 @@ config CRYPTO_SHA512_ARM
config CRYPTO_AES_ARM
tristate "Scalar AES cipher for ARM"
select CRYPTO_ALGAPI
- select CRYPTO_AES
+ select CRYPTO_AES_GENERIC
help
Use optimized AES assembler routines for ARM platforms.
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index db55e069c17b..7ffe88267943 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -43,7 +43,7 @@ config CRYPTO_CRC32_ARM64_CE
config CRYPTO_AES_ARM64
tristate "AES core cipher using scalar instructions"
- select CRYPTO_AES
+ select CRYPTO_AES_GENERIC
config CRYPTO_AES_ARM64_CE
tristate "AES core cipher using ARMv8 Crypto Extensions"
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 1e6e021fda10..9ae3dade4b2b 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -898,6 +898,10 @@ config CRYPTO_AES_CORE
tristate
config CRYPTO_AES
+ tristate
+ select CRYPTO_AES_GENERIC
+
+config CRYPTO_AES_GENERIC
tristate "AES cipher algorithms"
select CRYPTO_ALGAPI
select CRYPTO_AES_CORE
@@ -940,7 +944,7 @@ config CRYPTO_AES_586
tristate "AES cipher algorithms (i586)"
depends on (X86 || UML_X86) && !64BIT
select CRYPTO_ALGAPI
- select CRYPTO_AES
+ select CRYPTO_AES_GENERIC
help
AES cipher algorithms (FIPS-197). AES uses the Rijndael
algorithm.
@@ -962,7 +966,7 @@ config CRYPTO_AES_X86_64
tristate "AES cipher algorithms (x86_64)"
depends on (X86 || UML_X86) && 64BIT
select CRYPTO_ALGAPI
- select CRYPTO_AES
+ select CRYPTO_AES_GENERIC
help
AES cipher algorithms (FIPS-197). AES uses the Rijndael
algorithm.
diff --git a/crypto/Makefile b/crypto/Makefile
index 0979ca461ddb..73395307bcea 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -97,7 +97,7 @@ obj-$(CONFIG_CRYPTO_TWOFISH_COMMON) += twofish_common.o
obj-$(CONFIG_CRYPTO_SERPENT) += serpent_generic.o
CFLAGS_serpent_generic.o := $(call cc-option,-fsched-pressure) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
obj-$(CONFIG_CRYPTO_AES_CORE) += aes_core.o
-obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
+obj-$(CONFIG_CRYPTO_AES_GENERIC) += aes_generic.o
obj-$(CONFIG_CRYPTO_AES_TI) += aes_ti.o
obj-$(CONFIG_CRYPTO_CAMELLIA) += camellia_generic.o
obj-$(CONFIG_CRYPTO_CAST_COMMON) += cast_common.o
diff --git a/net/sunrpc/Kconfig b/net/sunrpc/Kconfig
index ac09ca803296..58aa2ada40b3 100644
--- a/net/sunrpc/Kconfig
+++ b/net/sunrpc/Kconfig
@@ -19,7 +19,8 @@ config RPCSEC_GSS_KRB5
tristate "Secure RPC: Kerberos V mechanism"
depends on SUNRPC && CRYPTO
depends on CRYPTO_MD5 && CRYPTO_DES && CRYPTO_CBC && CRYPTO_CTS
- depends on CRYPTO_ECB && CRYPTO_HMAC && CRYPTO_SHA1 && CRYPTO_AES
+ depends on CRYPTO_ECB && CRYPTO_HMAC && CRYPTO_SHA1
+ select CRYPTO_AES
depends on CRYPTO_ARC4
default y
select SUNRPC_GSS
--
2.7.4
* [PATCH v2 5/6] crypto: aes - add meaningful help text to the various AES drivers
From: Ard Biesheuvel @ 2017-06-16 11:17 UTC (permalink / raw)
To: linux-crypto; +Cc: herbert, nico, ebiggers3, Ard Biesheuvel
Remove the duplicated boilerplate help text and add a bit of explanation
about the nature of the various AES implementations that exist for ARM
and x86.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm/crypto/Kconfig | 4 +-
arch/arm64/crypto/Kconfig | 7 ++
crypto/Kconfig | 68 +++-----------------
3 files changed, 18 insertions(+), 61 deletions(-)
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 3a6994ada2d1..24d70d74ae51 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -66,7 +66,9 @@ config CRYPTO_AES_ARM
select CRYPTO_ALGAPI
select CRYPTO_AES_GENERIC
help
- Use optimized AES assembler routines for ARM platforms.
+ Use optimized AES assembler routines for ARM platforms. This
+ implementation is table based, and thus not time invariant.
+ It reuses the tables exposed by the generic AES driver.
config CRYPTO_AES_ARM_BS
tristate "Bit sliced AES using NEON instructions"
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 7ffe88267943..48404ae2a11a 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -44,11 +44,18 @@ config CRYPTO_CRC32_ARM64_CE
config CRYPTO_AES_ARM64
tristate "AES core cipher using scalar instructions"
select CRYPTO_AES_GENERIC
+ help
+ Use optimized AES assembler routines for ARM platforms. This
+ implementation is table based, and thus not time invariant.
+ It reuses the tables exposed by the generic AES driver.
config CRYPTO_AES_ARM64_CE
tristate "AES core cipher using ARMv8 Crypto Extensions"
depends on ARM64 && KERNEL_MODE_NEON
select CRYPTO_ALGAPI
+ help
+ Assembler implementation for arm64 of AES using special dedicated
+ instructions. This implementation is time invariant.
config CRYPTO_AES_ARM64_CE_CCM
tristate "AES in CCM mode using ARMv8 Crypto Extensions"
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 9ae3dade4b2b..f33c0d9136cf 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -906,21 +906,10 @@ config CRYPTO_AES_GENERIC
select CRYPTO_ALGAPI
select CRYPTO_AES_CORE
help
- AES cipher algorithms (FIPS-197). AES uses the Rijndael
- algorithm.
-
- Rijndael appears to be consistently a very good performer in
- both hardware and software across a wide range of computing
- environments regardless of its use in feedback or non-feedback
- modes. Its key setup time is excellent, and its key agility is
- good. Rijndael's very low memory requirements make it very well
- suited for restricted-space environments, in which it also
- demonstrates excellent performance. Rijndael's operations are
- among the easiest to defend against power and timing attacks.
-
- The AES specifies three key sizes: 128, 192 and 256 bits
-
- See <http://csrc.nist.gov/CryptoToolkit/aes/> for more information.
+ Generic table based implementation of AES. This is the fastest
+ implementation in C, but may be susceptible to known plaintext
+ attacks on the key due to the correlation between the processing
+ time and the input of the first round.
config CRYPTO_AES_TI
tristate "Fixed time AES cipher"
@@ -946,44 +935,18 @@ config CRYPTO_AES_586
select CRYPTO_ALGAPI
select CRYPTO_AES_GENERIC
help
- AES cipher algorithms (FIPS-197). AES uses the Rijndael
+ Assembler implementation for 32-bit x86 of the table based AES
algorithm.
- Rijndael appears to be consistently a very good performer in
- both hardware and software across a wide range of computing
- environments regardless of its use in feedback or non-feedback
- modes. Its key setup time is excellent, and its key agility is
- good. Rijndael's very low memory requirements make it very well
- suited for restricted-space environments, in which it also
- demonstrates excellent performance. Rijndael's operations are
- among the easiest to defend against power and timing attacks.
-
- The AES specifies three key sizes: 128, 192 and 256 bits
-
- See <http://csrc.nist.gov/encryption/aes/> for more information.
-
config CRYPTO_AES_X86_64
tristate "AES cipher algorithms (x86_64)"
depends on (X86 || UML_X86) && 64BIT
select CRYPTO_ALGAPI
select CRYPTO_AES_GENERIC
help
- AES cipher algorithms (FIPS-197). AES uses the Rijndael
+ Assembler implementation for 64-bit x86 of the table based AES
algorithm.
- Rijndael appears to be consistently a very good performer in
- both hardware and software across a wide range of computing
- environments regardless of its use in feedback or non-feedback
- modes. Its key setup time is excellent, and its key agility is
- good. Rijndael's very low memory requirements make it very well
- suited for restricted-space environments, in which it also
- demonstrates excellent performance. Rijndael's operations are
- among the easiest to defend against power and timing attacks.
-
- The AES specifies three key sizes: 128, 192 and 256 bits
-
- See <http://csrc.nist.gov/encryption/aes/> for more information.
-
config CRYPTO_AES_NI_INTEL
tristate "AES cipher algorithms (AES-NI)"
depends on X86
@@ -994,23 +957,8 @@ config CRYPTO_AES_NI_INTEL
select CRYPTO_GLUE_HELPER_X86 if 64BIT
select CRYPTO_SIMD
help
- Use Intel AES-NI instructions for AES algorithm.
-
- AES cipher algorithms (FIPS-197). AES uses the Rijndael
- algorithm.
-
- Rijndael appears to be consistently a very good performer in
- both hardware and software across a wide range of computing
- environments regardless of its use in feedback or non-feedback
- modes. Its key setup time is excellent, and its key agility is
- good. Rijndael's very low memory requirements make it very well
- suited for restricted-space environments, in which it also
- demonstrates excellent performance. Rijndael's operations are
- among the easiest to defend against power and timing attacks.
-
- The AES specifies three key sizes: 128, 192 and 256 bits
-
- See <http://csrc.nist.gov/encryption/aes/> for more information.
+ Assembler implementation for x86 of AES using special dedicated
+ instructions. This implementation is time invariant.
In addition to AES cipher algorithm support, the acceleration
for some popular block cipher mode is supported too, including
--
2.7.4
* [PATCH v2 6/6] crypto: aes - allow generic AES to be replaced by fixed time AES
From: Ard Biesheuvel @ 2017-06-16 11:17 UTC (permalink / raw)
To: linux-crypto; +Cc: herbert, nico, ebiggers3, Ard Biesheuvel
On systems where a small memory footprint is important, the generic
AES code with its 16 KB of lookup tables and fully unrolled encrypt
and decrypt routines may be an unnecessary burden, especially given
that modern SoCs often have dedicated instructions for AES. And even
if they don't, a time invariant implementation may be preferred over
a fast one that may be susceptible to cache timing attacks.
So allow the declared dependency of other subsystems on AES to be
fulfilled by either the generic AES or the much smaller time invariant
implementation.
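The conditional select used for this can be modelled in C as a rough aid to reading it. The sketch assumes the tristate rules from the Kconfig language documentation (n < m < y, '&&' is the minimum, '||' the maximum, '!' is 2 minus the value, and '=' / '!=' yield y or n); the `tri_*` helpers and `generic_select_cond()` are hypothetical names, not kernel code.

```c
#include <assert.h>

/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;		/* '&&' is the minimum */
}

static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;		/* '||' is the maximum */
}

static enum tristate tri_not(enum tristate a)
{
	assert(a <= TRI_Y);
	return (enum tristate)(TRI_Y - a);	/* '!' is 2 - value */
}

static enum tristate tri_eq(enum tristate a, enum tristate b)
{
	return a == b ? TRI_Y : TRI_N;	/* '=' yields y or n */
}

static enum tristate tri_ne(enum tristate a, enum tristate b)
{
	return a != b ? TRI_Y : TRI_N;	/* '!=' yields y or n */
}

/*
 * Value of the 'if' expression on the select line:
 *   (CRYPTO_AES=y && CRYPTO_AES_TI != y) || (CRYPTO_AES=m && !CRYPTO_AES_TI)
 */
static enum tristate generic_select_cond(enum tristate aes,
					 enum tristate aes_ti)
{
	enum tristate builtin = tri_and(tri_eq(aes, TRI_Y),
					tri_ne(aes_ti, TRI_Y));
	enum tristate modular = tri_and(tri_eq(aes, TRI_M),
					tri_not(aes_ti));

	return tri_or(builtin, modular);
}
```

For instance, `generic_select_cond(TRI_Y, TRI_Y)` gives `TRI_N` (a built-in fixed time driver satisfies a built-in CRYPTO_AES), while `generic_select_cond(TRI_Y, TRI_M)` gives `TRI_Y` (a modular one does not).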
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
crypto/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/crypto/Kconfig b/crypto/Kconfig
index f33c0d9136cf..2958120cdef3 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -899,12 +899,14 @@ config CRYPTO_AES_CORE
config CRYPTO_AES
tristate
- select CRYPTO_AES_GENERIC
+ select CRYPTO_AES_GENERIC if (CRYPTO_AES=y && CRYPTO_AES_TI != y) || \
+ (CRYPTO_AES=m && !CRYPTO_AES_TI)
config CRYPTO_AES_GENERIC
tristate "AES cipher algorithms"
--
2.7.4
* Re: [PATCH v2 0/6] crypto: aes - allow generic AES to be omitted
From: Eric Biggers @ 2017-06-19 3:15 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: linux-crypto, herbert, nico
Hi Ard,
On Fri, Jun 16, 2017 at 01:17:43PM +0200, Ard Biesheuvel wrote:
> The generic AES driver uses 16 lookup tables of 1 KB each, and has
> encryption and decryption routines that are fully unrolled. Given how
> the dependencies between this code and other drivers are declared in
> Kconfig files, this code is always pulled into the core kernel, even
> if it is usually superseded at runtime by accelerated drivers that
> exist for many architectures.
>
> This leaves us with 25 KB of dead code in the kernel, which is negligible
> in typical environments, but which is actually a big deal for the IoT
> domain, where every kilobyte counts.
>
> Also, the scalar, table based AES routines that exist for ARM, arm64, i586
> and x86_64 share the lookup tables with AES generic, and may be invoked
> occasionally when the time-invariant AES-NI or other special instruction
> drivers are called in interrupt context, at which time the SIMD register
> file cannot be used. Pulling 16 KB of data and 9 KB of instructions into
> the L1s (and evicting what was already there) when a softirq happens to
> be handled in the context of an interrupt taken from kernel mode (which
> means no SIMD on x86) is also something that we may like to avoid, by
> falling back to a much smaller and moderately less performant driver.
> (Note that arm64 will be updated shortly to supply fallbacks for all
> SIMD based AES implementations, which will be based on the core routines
> [if they are accepted].)
>
> For the reasons above, this series refactors the way the various AES
> implementations are wired up, to allow the generic version in
> crypto/aes_generic.c to be omitted from the build entirely.
>
This looks better now. I think the help text and prompts could still use some
improvement. For the prompts, on x86_64 now I see:
-*- AES cipher algorithms
[*] Fixed time AES cipher
[*] AES cipher algorithms (x86_64)
[*] AES cipher algorithms (AES-NI)
The first is actually the generic table-based implementation now, and it can be
deselected if the generic fixed-time implementation is selected and the x86_64
table-based implementation is deselected. How about making the prompts be:
AES cipher algorithm (generic, table-based)
AES cipher algorithm (generic, time-invariant)
AES cipher algorithm (x86_64, table-based)
AES cipher algorithm (AES-NI)
For the help text, removing the Wikipedia-style boilerplate is good, but IMO the
help text should at least spell out "AES (Advanced Encryption Standard)". It's
"obvious" to people familiar with crypto algorithms, but I always find it
annoying when Kconfig options elsewhere in the kernel use unfamiliar acronyms
which the developers didn't bother to spell out because it was "obvious" to
them.
The help text could also give a bit more information to help people decide which
options to enable. For example, the help for CRYPTO_AES_X86_64 could say that
it's only useful on older processors that do not have AES-NI instructions, and
that the AES-NI implementation, if enabled, will take priority on newer
processors. Similarly for the generic implementations, though note that the
user may still be required to enable at least one of them as a fallback. Also,
the AES-NI and ARMv8-CE implementations are not only time-invariant but also the
fastest --- and therefore strongly recommended to enable.
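To illustrate what "take priority" means here, a minimal sketch, assuming implementations of the same algorithm name are ranked by a numeric priority (as the crypto API does with cra_priority) and that a driver only counts if it registered on the running CPU. `struct demo_alg`, `demo_pick()`, the driver names and the priority values are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a registered cipher implementation. */
struct demo_alg {
	const char *name;		/* algorithm name, e.g. "aes" */
	const char *driver_name;	/* implementation name (made up) */
	int priority;			/* higher value wins */
	int available;			/* nonzero if registered on this CPU */
};

/* Return the available implementation of 'name' with the highest priority. */
static const struct demo_alg *demo_pick(const struct demo_alg *algs,
					size_t n, const char *name)
{
	const struct demo_alg *best = NULL;
	size_t i;

	assert(name != NULL);

	for (i = 0; i < n; i++) {
		if (!algs[i].available || strcmp(algs[i].name, name) != 0)
			continue;
		if (best == NULL || algs[i].priority > best->priority)
			best = &algs[i];
	}
	return best;
}
```

With three "aes" entries at priorities 100, 200 and 300, `demo_pick()` returns the 300 entry while it is available and falls back to the 200 entry once it is not, which is why a table-based driver is only reached on processors without the dedicated instructions.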
Eric
* Re: [PATCH v2 0/6] crypto: aes - allow generic AES to be omitted
From: Ard Biesheuvel @ 2017-06-19 14:04 UTC (permalink / raw)
To: Eric Biggers; +Cc: linux-crypto@vger.kernel.org, Herbert Xu, nico@linaro.org
On 19 June 2017 at 05:15, Eric Biggers <ebiggers3@gmail.com> wrote:
> Hi Ard,
>
> On Fri, Jun 16, 2017 at 01:17:43PM +0200, Ard Biesheuvel wrote:
>> The generic AES driver uses 16 lookup tables of 1 KB each, and has
>> encryption and decryption routines that are fully unrolled. Given how
>> the dependencies between this code and other drivers are declared in
>> Kconfig files, this code is always pulled into the core kernel, even
>> if it is usually superseded at runtime by accelerated drivers that
>> exist for many architectures.
>>
>> This leaves us with 25 KB of dead code in the kernel, which is negligible
>> in typical environments, but which is actually a big deal for the IoT
>> domain, where every kilobyte counts.
>>
>> Also, the scalar, table based AES routines that exist for ARM, arm64, i586
>> and x86_64 share the lookup tables with AES generic, and may be invoked
>> occasionally when the time-invariant AES-NI or other special instruction
>> drivers are called in interrupt context, at which time the SIMD register
>> file cannot be used. Pulling 16 KB of data and 9 KB of instructions into
>> the L1s (and evicting what was already there) when a softirq happens to
>> be handled in the context of an interrupt taken from kernel mode (which
>> means no SIMD on x86) is also something that we may like to avoid, by
>> falling back to a much smaller and moderately less performant driver.
>> (Note that arm64 will be updated shortly to supply fallbacks for all
>> SIMD based AES implementations, which will be based on the core routines
>> [if they are accepted].)
>>
>> For the reasons above, this series refactors the way the various AES
>> implementations are wired up, to allow the generic version in
>> crypto/aes_generic.c to be omitted from the build entirely.
>>
>
> This looks better now. I think the help text and prompts could still use some
> improvement. For the prompts, on x86_64 now I see:
>
> -*- AES cipher algorithms
> [*] Fixed time AES cipher
> [*] AES cipher algorithms (x86_64)
> [*] AES cipher algorithms (AES-NI)
>
> The first is actually the generic table-based implementation now, and it can be
> deselected if the generic fixed-time implementation is selected and the x86_64
> table-based implementation is deselected. How about making the prompts be:
>
> AES cipher algorithm (generic, table-based)
> AES cipher algorithm (generic, time-invariant)
> AES cipher algorithm (x86_64, table-based)
> AES cipher algorithm (AES-NI)
>
> For the help text, removing the Wikipedia-style boilerplate is good, but IMO the
> help text should at least spell out "AES (Advanced Encryption Standard)". It's
> "obvious" to people familiar with crypto algorithms, but I always find it
> annoying when Kconfig options elsewhere in the kernel use unfamiliar acronyms
> which the developers didn't bother to spell out because it was "obvious" to
> them.
>
> The help text could also give a bit more information to help people decide which
> options to enable. For example, the help for CRYPTO_AES_X86_64 could say that
> it's only useful on older processors that do not have AES-NI instructions, and
> that the AES-NI implementation, if enabled, will take priority on newer
> processors. Similarly for the generic implementations, though note that the
> user may still be required to enable at least one of them as a fallback. Also,
> the AES-NI and ARMv8-CE implementations are not only time-invariant but also the
> fastest --- and therefore strongly recommended to enable.
>
Thanks Eric, all good feedback. I will incorporate it into the next respin.
--
Ard.
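
For illustration only, and not taken from either message or from the series itself: a minimal sketch of how the CRYPTO_AES_X86_64 entry might read if the prompt and help text suggestions above were applied. The help wording is an assumption, and the existing "depends on" and "select" lines are deliberately left out rather than guessed at.

```kconfig
# Hypothetical sketch, not the actual patch. Dependency and select
# lines are omitted on purpose; only the prompt and help text are shown.
config CRYPTO_AES_X86_64
	tristate "AES cipher algorithm (x86_64, table-based)"
	help
	  AES (Advanced Encryption Standard) cipher, implemented in scalar
	  x86_64 assembly using lookup tables.

	  This is only useful on older processors that do not have AES-NI
	  instructions. If the AES-NI implementation is also enabled, it
	  takes priority on processors that support it.
```

The prompt string follows the "(architecture, implementation style)" pattern proposed in the review, so that the four AES entries can be told apart in menuconfig.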
Thread overview: 9+ messages
2017-06-16 11:17 [PATCH v2 0/6] crypto: aes - allow generic AES to be omitted Ard Biesheuvel
2017-06-16 11:17 ` [PATCH v2 1/6] drivers/crypto/Kconfig: drop bogus CRYPTO_AES dependencies Ard Biesheuvel
2017-06-16 11:17 ` [PATCH v2 2/6] crypto: aes - refactor shared routines into separate core module Ard Biesheuvel
2017-06-16 11:17 ` [PATCH v2 3/6] crypto: x86/aes-ni - switch to generic fallback Ard Biesheuvel
2017-06-16 11:17 ` [PATCH v2 4/6] crypto: aes - repurpose CRYPTO_AES and introduce CRYPTO_AES_GENERIC Ard Biesheuvel
2017-06-16 11:17 ` [PATCH v2 5/6] crypto: aes - add meaningful help text to the various AES drivers Ard Biesheuvel
2017-06-16 11:17 ` [PATCH v2 6/6] crypto: aes - allow generic AES to be replaced by fixed time AES Ard Biesheuvel
2017-06-19 3:15 ` [PATCH v2 0/6] crypto: aes - allow generic AES to be omitted Eric Biggers
2017-06-19 14:04 ` Ard Biesheuvel