* [PATCH v2 01/20] arm64: Revert support for generic kernel mode FPU
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-02 16:23 ` Mark Brown
2025-10-01 21:02 ` [PATCH v2 02/20] arm64/simd: Add scoped guard API for kernel mode SIMD Ard Biesheuvel
` (19 subsequent siblings)
20 siblings, 1 reply; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers, stable
From: Ard Biesheuvel <ardb@kernel.org>
On arm64, generic kernel mode FPU support, as used by the AMD GPU
driver, involves dropping the -mgeneral-regs-only compiler flag, as that
flag makes the use of double and float C types impossible.
However, dropping that flag allows the compiler to use FPU and SIMD
registers in other ways too, and for this reason, arm64 only permits
doing so in strictly controlled contexts, i.e., isolated compilation
units that get called from inside a kernel_neon_begin() and
kernel_neon_end() pair.
The users of the generic kernel mode FPU API lack such strict checks,
and this may result in userland FP/SIMD state getting corrupted, given
that touching FP/SIMD registers outside of a kernel_neon_begin/end pair
does not fault, but silently operates on the userland state without
preserving it.
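For reference, the pattern that arm64 expects looks roughly like the
sketch below, where do_thing_neon() and do_thing_generic() are
hypothetical stand-ins for a routine living in an isolated NEON-enabled
compilation unit and its scalar fallback:

	static void do_thing(u8 *dst, const u8 *src, int len)
	{
		if (may_use_simd()) {
			kernel_neon_begin();
			do_thing_neon(dst, src, len);	/* NEON-enabled unit */
			kernel_neon_end();
		} else {
			do_thing_generic(dst, src, len); /* scalar fallback */
		}
	}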
So disable this feature for the time being. This reverts commits:
71883ae35278 ("arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT")
7177089525d9 ("arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS")
4be073931cd8 ("lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS")
Cc: <stable@vger.kernel.org> # v6.12+
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/Kconfig | 1 -
arch/arm64/Makefile | 9 +-----
arch/arm64/include/asm/fpu.h | 15 ---------
arch/arm64/lib/Makefile | 6 ++--
lib/raid6/Makefile | 33 ++++++++++++++------
5 files changed, 28 insertions(+), 36 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b81ab5fbde57..abf70929f675 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -32,7 +32,6 @@ config ARM64
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
- select ARCH_HAS_KERNEL_FPU_SUPPORT if KERNEL_MODE_NEON
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MEM_ENCRYPT
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 73a10f65ce8b..82209cc52a5a 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -33,14 +33,7 @@ ifeq ($(CONFIG_BROKEN_GAS_INST),y)
$(warning Detected assembler with broken .inst; disassembly will be unreliable)
endif
-# The GCC option -ffreestanding is required in order to compile code containing
-# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
-CC_FLAGS_FPU := -ffreestanding
-# Enable <arm_neon.h>
-CC_FLAGS_FPU += -isystem $(shell $(CC) -print-file-name=include)
-CC_FLAGS_NO_FPU := -mgeneral-regs-only
-
-KBUILD_CFLAGS += $(CC_FLAGS_NO_FPU) \
+KBUILD_CFLAGS += -mgeneral-regs-only \
$(compat_vdso) $(cc_has_k_constraint)
KBUILD_CFLAGS += $(call cc-disable-warning, psabi)
KBUILD_AFLAGS += $(compat_vdso)
diff --git a/arch/arm64/include/asm/fpu.h b/arch/arm64/include/asm/fpu.h
deleted file mode 100644
index 2ae50bdce59b..000000000000
--- a/arch/arm64/include/asm/fpu.h
+++ /dev/null
@@ -1,15 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2023 SiFive
- */
-
-#ifndef __ASM_FPU_H
-#define __ASM_FPU_H
-
-#include <asm/neon.h>
-
-#define kernel_fpu_available() cpu_has_neon()
-#define kernel_fpu_begin() kernel_neon_begin()
-#define kernel_fpu_end() kernel_neon_end()
-
-#endif /* ! __ASM_FPU_H */
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 633e5223d944..291b616ab511 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -7,8 +7,10 @@ lib-y := clear_user.o delay.o copy_from_user.o \
ifeq ($(CONFIG_KERNEL_MODE_NEON), y)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
-CFLAGS_xor-neon.o += $(CC_FLAGS_FPU)
-CFLAGS_REMOVE_xor-neon.o += $(CC_FLAGS_NO_FPU)
+CFLAGS_REMOVE_xor-neon.o += -mgeneral-regs-only
+CFLAGS_xor-neon.o += -ffreestanding
+# Enable <arm_neon.h>
+CFLAGS_xor-neon.o += -isystem $(shell $(CC) -print-file-name=include)
endif
lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
index 5be0a4e60ab1..903e287c50c8 100644
--- a/lib/raid6/Makefile
+++ b/lib/raid6/Makefile
@@ -34,6 +34,25 @@ CFLAGS_REMOVE_vpermxor8.o += -msoft-float
endif
endif
+# The GCC option -ffreestanding is required in order to compile code containing
+# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
+ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
+NEON_FLAGS := -ffreestanding
+# Enable <arm_neon.h>
+NEON_FLAGS += -isystem $(shell $(CC) -print-file-name=include)
+ifeq ($(ARCH),arm)
+NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
+endif
+CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
+ifeq ($(ARCH),arm64)
+CFLAGS_REMOVE_recov_neon_inner.o += -mgeneral-regs-only
+CFLAGS_REMOVE_neon1.o += -mgeneral-regs-only
+CFLAGS_REMOVE_neon2.o += -mgeneral-regs-only
+CFLAGS_REMOVE_neon4.o += -mgeneral-regs-only
+CFLAGS_REMOVE_neon8.o += -mgeneral-regs-only
+endif
+endif
+
quiet_cmd_unroll = UNROLL $@
cmd_unroll = $(AWK) -v N=$* -f $(src)/unroll.awk < $< > $@
@@ -57,16 +76,10 @@ targets += vpermxor1.c vpermxor2.c vpermxor4.c vpermxor8.c
$(obj)/vpermxor%.c: $(src)/vpermxor.uc $(src)/unroll.awk FORCE
$(call if_changed,unroll)
-CFLAGS_neon1.o += $(CC_FLAGS_FPU)
-CFLAGS_neon2.o += $(CC_FLAGS_FPU)
-CFLAGS_neon4.o += $(CC_FLAGS_FPU)
-CFLAGS_neon8.o += $(CC_FLAGS_FPU)
-CFLAGS_recov_neon_inner.o += $(CC_FLAGS_FPU)
-CFLAGS_REMOVE_neon1.o += $(CC_FLAGS_NO_FPU)
-CFLAGS_REMOVE_neon2.o += $(CC_FLAGS_NO_FPU)
-CFLAGS_REMOVE_neon4.o += $(CC_FLAGS_NO_FPU)
-CFLAGS_REMOVE_neon8.o += $(CC_FLAGS_NO_FPU)
-CFLAGS_REMOVE_recov_neon_inner.o += $(CC_FLAGS_NO_FPU)
+CFLAGS_neon1.o += $(NEON_FLAGS)
+CFLAGS_neon2.o += $(NEON_FLAGS)
+CFLAGS_neon4.o += $(NEON_FLAGS)
+CFLAGS_neon8.o += $(NEON_FLAGS)
targets += neon1.c neon2.c neon4.c neon8.c
$(obj)/neon%.c: $(src)/neon.uc $(src)/unroll.awk FORCE
$(call if_changed,unroll)
--
2.51.0.618.g983fd99d29-goog
* Re: [PATCH v2 01/20] arm64: Revert support for generic kernel mode FPU
2025-10-01 21:02 ` [PATCH v2 01/20] arm64: Revert support for generic kernel mode FPU Ard Biesheuvel
@ 2025-10-02 16:23 ` Mark Brown
2025-10-08 12:44 ` Ard Biesheuvel
0 siblings, 1 reply; 33+ messages in thread
From: Mark Brown @ 2025-10-02 16:23 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, linux-kernel, herbert, linux,
Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland,
Kees Cook, Catalin Marinas, Eric Biggers, stable
On Wed, Oct 01, 2025 at 11:02:03PM +0200, Ard Biesheuvel wrote:
> However, dropping that flag allows the compiler to use FPU and SIMD
> registers in other ways too, and for this reason, arm64 only permits
> doing so in strictly controlled contexts, i.e., isolated compilation
> units that get called from inside a kernel_neon_begin() and
> kernel_neon_end() pair.
> The users of the generic kernel mode FPU API lack such strict checks,
> and this may result in userland FP/SIMD state getting corrupted, given
> that touching FP/SIMD registers outside of a kernel_neon_begin/end pair
> does not fault, but silently operates on the userland state without
> preserving it.
Oh dear, that's nasty - I didn't see the patch when it was going in:
Reviewed-by: Mark Brown <broonie@kernel.org>
* Re: [PATCH v2 01/20] arm64: Revert support for generic kernel mode FPU
2025-10-02 16:23 ` Mark Brown
@ 2025-10-08 12:44 ` Ard Biesheuvel
0 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-08 12:44 UTC (permalink / raw)
To: Mark Brown
Cc: Ard Biesheuvel, linux-arm-kernel, linux-crypto, linux-kernel,
herbert, linux, Marc Zyngier, Will Deacon, Mark Rutland,
Kees Cook, Catalin Marinas, Eric Biggers, stable
On Thu, 2 Oct 2025 at 09:23, Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, Oct 01, 2025 at 11:02:03PM +0200, Ard Biesheuvel wrote:
>
> > However, dropping that flag allows the compiler to use FPU and SIMD
> > registers in other ways too, and for this reason, arm64 only permits
> > doing so in strictly controlled contexts, i.e., isolated compilation
> > units that get called from inside a kernel_neon_begin() and
> > kernel_neon_end() pair.
>
> > The users of the generic kernel mode FPU API lack such strict checks,
> and this may result in userland FP/SIMD state getting corrupted, given
> > that touching FP/SIMD registers outside of a kernel_neon_begin/end pair
> > does not fault, but silently operates on the userland state without
> > preserving it.
>
> Oh dear, that's nasty - I didn't see the patch when it was going in:
>
Actually, there is a check; it just wasn't wired up correctly by the
amdgpu driver, because the driver wraps the kernel_fpu_begin()/end()
calls in its own API, so those calls are always made from a compilation
unit where they are supported.
The trick is to #include <linux/fpu.h> into the definition of that
wrapper API, so that using /that/ from FP/SIMD code also triggers
a build error.
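I.e., something along these lines (a sketch only; my_fpu_begin() and
my_fpu_end() stand in for the driver's own wrappers):

	/* header declaring the driver's wrapper API */
	#include <linux/fpu.h>	/* carries the build-time check mentioned above */

	void my_fpu_begin(void);	/* wraps kernel_fpu_begin() */
	void my_fpu_end(void);		/* wraps kernel_fpu_end() */

so that any FP/SIMD compilation unit pulling in this header to call the
wrappers hits the same build error as calling kernel_fpu_begin()
directly would.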
So I'll drop this patch.
* [PATCH v2 02/20] arm64/simd: Add scoped guard API for kernel mode SIMD
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 01/20] arm64: Revert support for generic kernel mode FPU Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-02 16:17 ` Kees Cook
2025-10-14 14:34 ` Mark Brown
2025-10-01 21:02 ` [PATCH v2 03/20] ARM/simd: " Ard Biesheuvel
` (18 subsequent siblings)
20 siblings, 2 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Encapsulate kernel_neon_begin() and kernel_neon_end() using a 'ksimd'
cleanup guard. This hides the prototype of those functions, allowing
them to be changed for arm64 but not ARM, without breaking code that is
shared between those architectures (RAID6, AEGIS-128).
It probably makes sense to expose this API more widely across
architectures, as it affords more flexibility to the arch code to
plumb it in, while imposing more rigid rules regarding the start/end
bookends appearing in matched pairs.
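For illustration, a minimal caller would then look like this, where
do_neon_thing() is a placeholder for code built with FP/SIMD enabled:

	scoped_ksimd() {
		/* FP/SIMD registers may be used in this scope */
		do_neon_thing();
	}	/* kernel_neon_end() runs automatically on scope exit */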
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/simd.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
index 8e86c9e70e48..d9f83c478736 100644
--- a/arch/arm64/include/asm/simd.h
+++ b/arch/arm64/include/asm/simd.h
@@ -6,12 +6,15 @@
#ifndef __ASM_SIMD_H
#define __ASM_SIMD_H
+#include <linux/cleanup.h>
#include <linux/compiler.h>
#include <linux/irqflags.h>
#include <linux/percpu.h>
#include <linux/preempt.h>
#include <linux/types.h>
+#include <asm/neon.h>
+
#ifdef CONFIG_KERNEL_MODE_NEON
/*
@@ -40,4 +43,8 @@ static __must_check inline bool may_use_simd(void) {
#endif /* ! CONFIG_KERNEL_MODE_NEON */
+DEFINE_LOCK_GUARD_0(ksimd, kernel_neon_begin(), kernel_neon_end())
+
+#define scoped_ksimd() scoped_guard(ksimd)
+
#endif
--
2.51.0.618.g983fd99d29-goog
* Re: [PATCH v2 02/20] arm64/simd: Add scoped guard API for kernel mode SIMD
2025-10-01 21:02 ` [PATCH v2 02/20] arm64/simd: Add scoped guard API for kernel mode SIMD Ard Biesheuvel
@ 2025-10-02 16:17 ` Kees Cook
2025-10-14 14:34 ` Mark Brown
1 sibling, 0 replies; 33+ messages in thread
From: Kees Cook @ 2025-10-02 16:17 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, linux-kernel, herbert, linux,
Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland,
Catalin Marinas, Mark Brown, Eric Biggers
On Wed, Oct 01, 2025 at 11:02:04PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Encapsulate kernel_neon_begin() and kernel_neon_end() using a 'ksimd'
> cleanup guard. This hides the prototype of those functions, allowing
> them to be changed for arm64 but not ARM, without breaking code that is
> shared between those architectures (RAID6, AEGIS-128).
>
> It probably makes sense to expose this API more widely across
> architectures, as it affords more flexibility to the arch code to
> plumb it in, while imposing more rigid rules regarding the start/end
> bookends appearing in matched pairs.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
--
Kees Cook
* Re: [PATCH v2 02/20] arm64/simd: Add scoped guard API for kernel mode SIMD
2025-10-01 21:02 ` [PATCH v2 02/20] arm64/simd: Add scoped guard API for kernel mode SIMD Ard Biesheuvel
2025-10-02 16:17 ` Kees Cook
@ 2025-10-14 14:34 ` Mark Brown
1 sibling, 0 replies; 33+ messages in thread
From: Mark Brown @ 2025-10-14 14:34 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, linux-kernel, herbert, linux,
Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland,
Kees Cook, Catalin Marinas, Eric Biggers
On Wed, Oct 01, 2025 at 11:02:04PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Encapsulate kernel_neon_begin() and kernel_neon_end() using a 'ksimd'
> cleanup guard. This hides the prototype of those functions, allowing
> them to be changed for arm64 but not ARM, without breaking code that is
> shared between those architectures (RAID6, AEGIS-128).
Reviewed-by: Mark Brown <broonie@kernel.org>
* [PATCH v2 03/20] ARM/simd: Add scoped guard API for kernel mode SIMD
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 01/20] arm64: Revert support for generic kernel mode FPU Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 02/20] arm64/simd: Add scoped guard API for kernel mode SIMD Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-02 16:18 ` Kees Cook
2025-10-01 21:02 ` [PATCH v2 04/20] crypto: aegis128-neon - Move to more abstract 'ksimd' guard API Ard Biesheuvel
` (17 subsequent siblings)
20 siblings, 1 reply; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Implement the ksimd scoped guard API so that it can be used by code that
supports both ARM and arm64.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm/include/asm/simd.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm/include/asm/simd.h b/arch/arm/include/asm/simd.h
index be08a8da046f..8549fa8b7253 100644
--- a/arch/arm/include/asm/simd.h
+++ b/arch/arm/include/asm/simd.h
@@ -2,14 +2,21 @@
#ifndef _ASM_SIMD_H
#define _ASM_SIMD_H
+#include <linux/cleanup.h>
#include <linux/compiler_attributes.h>
#include <linux/preempt.h>
#include <linux/types.h>
+#include <asm/neon.h>
+
static __must_check inline bool may_use_simd(void)
{
return IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && !in_hardirq()
&& !irqs_disabled();
}
+DEFINE_LOCK_GUARD_0(ksimd, kernel_neon_begin(), kernel_neon_end())
+
+#define scoped_ksimd() scoped_guard(ksimd)
+
#endif /* _ASM_SIMD_H */
--
2.51.0.618.g983fd99d29-goog
* Re: [PATCH v2 03/20] ARM/simd: Add scoped guard API for kernel mode SIMD
2025-10-01 21:02 ` [PATCH v2 03/20] ARM/simd: " Ard Biesheuvel
@ 2025-10-02 16:18 ` Kees Cook
0 siblings, 0 replies; 33+ messages in thread
From: Kees Cook @ 2025-10-02 16:18 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, linux-kernel, herbert, linux,
Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland,
Catalin Marinas, Mark Brown, Eric Biggers
On Wed, Oct 01, 2025 at 11:02:05PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Implement the ksimd scoped guard API so that it can be used by code that
> supports both ARM and arm64.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
--
Kees Cook
* [PATCH v2 04/20] crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (2 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 03/20] ARM/simd: " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-02 16:20 ` Kees Cook
2025-10-01 21:02 ` [PATCH v2 05/20] raid6: " Ard Biesheuvel
` (16 subsequent siblings)
20 siblings, 1 reply; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Move away from calling kernel_neon_begin() and kernel_neon_end()
directly, and instead, use the newly introduced scoped_ksimd() API. This
permits arm64 to modify the kernel mode NEON API without affecting code
that is shared between ARM and arm64.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
crypto/aegis128-neon.c | 33 +++++++-------------
1 file changed, 12 insertions(+), 21 deletions(-)
diff --git a/crypto/aegis128-neon.c b/crypto/aegis128-neon.c
index 9ee50549e823..b41807e63bd3 100644
--- a/crypto/aegis128-neon.c
+++ b/crypto/aegis128-neon.c
@@ -4,7 +4,7 @@
*/
#include <asm/cpufeature.h>
-#include <asm/neon.h>
+#include <asm/simd.h>
#include "aegis.h"
#include "aegis-neon.h"
@@ -24,32 +24,28 @@ void crypto_aegis128_init_simd(struct aegis_state *state,
const union aegis_block *key,
const u8 *iv)
{
- kernel_neon_begin();
- crypto_aegis128_init_neon(state, key, iv);
- kernel_neon_end();
+ scoped_ksimd()
+ crypto_aegis128_init_neon(state, key, iv);
}
void crypto_aegis128_update_simd(struct aegis_state *state, const void *msg)
{
- kernel_neon_begin();
- crypto_aegis128_update_neon(state, msg);
- kernel_neon_end();
+ scoped_ksimd()
+ crypto_aegis128_update_neon(state, msg);
}
void crypto_aegis128_encrypt_chunk_simd(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
- kernel_neon_begin();
- crypto_aegis128_encrypt_chunk_neon(state, dst, src, size);
- kernel_neon_end();
+ scoped_ksimd()
+ crypto_aegis128_encrypt_chunk_neon(state, dst, src, size);
}
void crypto_aegis128_decrypt_chunk_simd(struct aegis_state *state, u8 *dst,
const u8 *src, unsigned int size)
{
- kernel_neon_begin();
- crypto_aegis128_decrypt_chunk_neon(state, dst, src, size);
- kernel_neon_end();
+ scoped_ksimd()
+ crypto_aegis128_decrypt_chunk_neon(state, dst, src, size);
}
int crypto_aegis128_final_simd(struct aegis_state *state,
@@ -58,12 +54,7 @@ int crypto_aegis128_final_simd(struct aegis_state *state,
unsigned int cryptlen,
unsigned int authsize)
{
- int ret;
-
- kernel_neon_begin();
- ret = crypto_aegis128_final_neon(state, tag_xor, assoclen, cryptlen,
- authsize);
- kernel_neon_end();
-
- return ret;
+ scoped_ksimd()
+ return crypto_aegis128_final_neon(state, tag_xor, assoclen,
+ cryptlen, authsize);
}
--
2.51.0.618.g983fd99d29-goog
* Re: [PATCH v2 04/20] crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
2025-10-01 21:02 ` [PATCH v2 04/20] crypto: aegis128-neon - Move to more abstract 'ksimd' guard API Ard Biesheuvel
@ 2025-10-02 16:20 ` Kees Cook
2025-10-02 16:48 ` Ard Biesheuvel
0 siblings, 1 reply; 33+ messages in thread
From: Kees Cook @ 2025-10-02 16:20 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, linux-kernel, herbert, linux,
Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland,
Catalin Marinas, Mark Brown, Eric Biggers
On Wed, Oct 01, 2025 at 11:02:06PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Move away from calling kernel_neon_begin() and kernel_neon_end()
> directly, and instead, use the newly introduced scoped_ksimd() API. This
> permits arm64 to modify the kernel mode NEON API without affecting code
> that is shared between ARM and arm64.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> crypto/aegis128-neon.c | 33 +++++++-------------
> 1 file changed, 12 insertions(+), 21 deletions(-)
>
> diff --git a/crypto/aegis128-neon.c b/crypto/aegis128-neon.c
> index 9ee50549e823..b41807e63bd3 100644
> --- a/crypto/aegis128-neon.c
> +++ b/crypto/aegis128-neon.c
> @@ -4,7 +4,7 @@
> */
>
> #include <asm/cpufeature.h>
> -#include <asm/neon.h>
> +#include <asm/simd.h>
>
> #include "aegis.h"
> #include "aegis-neon.h"
> @@ -24,32 +24,28 @@ void crypto_aegis128_init_simd(struct aegis_state *state,
> const union aegis_block *key,
> const u8 *iv)
> {
> - kernel_neon_begin();
> - crypto_aegis128_init_neon(state, key, iv);
> - kernel_neon_end();
> + scoped_ksimd()
> + crypto_aegis128_init_neon(state, key, iv);
> }
For these cases (to avoid the indentation change), do you want to use
just "guard" instead of "scoped_guard", or do you want to require an
explicit scope even when the scope ends at the function return?
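I.e., a sketch of the two options:

	/* function-scope guard, released at return */
	guard(ksimd)();
	crypto_aegis128_init_neon(state, key, iv);

versus keeping the explicit scope as in the patch:

	scoped_ksimd()
		crypto_aegis128_init_neon(state, key, iv);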
--
Kees Cook
* Re: [PATCH v2 04/20] crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
2025-10-02 16:20 ` Kees Cook
@ 2025-10-02 16:48 ` Ard Biesheuvel
0 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-02 16:48 UTC (permalink / raw)
To: Kees Cook
Cc: Ard Biesheuvel, linux-arm-kernel, linux-crypto, linux-kernel,
herbert, linux, Marc Zyngier, Will Deacon, Mark Rutland,
Catalin Marinas, Mark Brown, Eric Biggers
On Thu, 2 Oct 2025 at 18:20, Kees Cook <kees@kernel.org> wrote:
>
> On Wed, Oct 01, 2025 at 11:02:06PM +0200, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Move away from calling kernel_neon_begin() and kernel_neon_end()
> > directly, and instead, use the newly introduced scoped_ksimd() API. This
> > permits arm64 to modify the kernel mode NEON API without affecting code
> > that is shared between ARM and arm64.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> > crypto/aegis128-neon.c | 33 +++++++-------------
> > 1 file changed, 12 insertions(+), 21 deletions(-)
> >
> > diff --git a/crypto/aegis128-neon.c b/crypto/aegis128-neon.c
> > index 9ee50549e823..b41807e63bd3 100644
> > --- a/crypto/aegis128-neon.c
> > +++ b/crypto/aegis128-neon.c
> > @@ -4,7 +4,7 @@
> > */
> >
> > #include <asm/cpufeature.h>
> > -#include <asm/neon.h>
> > +#include <asm/simd.h>
> >
> > #include "aegis.h"
> > #include "aegis-neon.h"
> > @@ -24,32 +24,28 @@ void crypto_aegis128_init_simd(struct aegis_state *state,
> > const union aegis_block *key,
> > const u8 *iv)
> > {
> > - kernel_neon_begin();
> > - crypto_aegis128_init_neon(state, key, iv);
> > - kernel_neon_end();
> > + scoped_ksimd()
> > + crypto_aegis128_init_neon(state, key, iv);
> > }
>
> For these cases (to avoid the indentation change), do you want to use
> just "guard" instead of "scoped_guard", or do you want to require an
> explicit scope even when the scope ends at the function return?
>
I'm on the fence, tbh. I think that for future maintainability, being
forced to define the scope is perhaps better, but in this case the whole
function contains only a single line, so there is little room for
confusion.
* [PATCH v2 05/20] raid6: Move to more abstract 'ksimd' guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (3 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 04/20] crypto: aegis128-neon - Move to more abstract 'ksimd' guard API Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 06/20] crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit Ard Biesheuvel
` (15 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Move away from calling kernel_neon_begin() and kernel_neon_end()
directly, and instead, use the newly introduced scoped_ksimd() API. This
permits arm64 to modify the kernel mode NEON API without affecting code
that is shared between ARM and arm64.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
lib/raid6/neon.c | 17 +++++++----------
lib/raid6/recov_neon.c | 15 ++++++---------
2 files changed, 13 insertions(+), 19 deletions(-)
diff --git a/lib/raid6/neon.c b/lib/raid6/neon.c
index 0a2e76035ea9..6d9474ce6da9 100644
--- a/lib/raid6/neon.c
+++ b/lib/raid6/neon.c
@@ -8,10 +8,9 @@
#include <linux/raid/pq.h>
#ifdef __KERNEL__
-#include <asm/neon.h>
+#include <asm/simd.h>
#else
-#define kernel_neon_begin()
-#define kernel_neon_end()
+#define scoped_ksimd()
#define cpu_has_neon() (1)
#endif
@@ -32,10 +31,9 @@
{ \
void raid6_neon ## _n ## _gen_syndrome_real(int, \
unsigned long, void**); \
- kernel_neon_begin(); \
- raid6_neon ## _n ## _gen_syndrome_real(disks, \
+ scoped_ksimd() \
+ raid6_neon ## _n ## _gen_syndrome_real(disks, \
(unsigned long)bytes, ptrs); \
- kernel_neon_end(); \
} \
static void raid6_neon ## _n ## _xor_syndrome(int disks, \
int start, int stop, \
@@ -43,10 +41,9 @@
{ \
void raid6_neon ## _n ## _xor_syndrome_real(int, \
int, int, unsigned long, void**); \
- kernel_neon_begin(); \
- raid6_neon ## _n ## _xor_syndrome_real(disks, \
- start, stop, (unsigned long)bytes, ptrs); \
- kernel_neon_end(); \
+ scoped_ksimd() \
+ raid6_neon ## _n ## _xor_syndrome_real(disks, \
+ start, stop, (unsigned long)bytes, ptrs);\
} \
struct raid6_calls const raid6_neonx ## _n = { \
raid6_neon ## _n ## _gen_syndrome, \
diff --git a/lib/raid6/recov_neon.c b/lib/raid6/recov_neon.c
index 70e1404c1512..9d99aeabd31a 100644
--- a/lib/raid6/recov_neon.c
+++ b/lib/raid6/recov_neon.c
@@ -7,11 +7,10 @@
#include <linux/raid/pq.h>
#ifdef __KERNEL__
-#include <asm/neon.h>
+#include <asm/simd.h>
#include "neon.h"
#else
-#define kernel_neon_begin()
-#define kernel_neon_end()
+#define scoped_ksimd()
#define cpu_has_neon() (1)
#endif
@@ -55,9 +54,8 @@ static void raid6_2data_recov_neon(int disks, size_t bytes, int faila,
qmul = raid6_vgfmul[raid6_gfinv[raid6_gfexp[faila] ^
raid6_gfexp[failb]]];
- kernel_neon_begin();
- __raid6_2data_recov_neon(bytes, p, q, dp, dq, pbmul, qmul);
- kernel_neon_end();
+ scoped_ksimd()
+ __raid6_2data_recov_neon(bytes, p, q, dp, dq, pbmul, qmul);
}
static void raid6_datap_recov_neon(int disks, size_t bytes, int faila,
@@ -86,9 +84,8 @@ static void raid6_datap_recov_neon(int disks, size_t bytes, int faila,
/* Now, pick the proper data tables */
qmul = raid6_vgfmul[raid6_gfinv[raid6_gfexp[faila]]];
- kernel_neon_begin();
- __raid6_datap_recov_neon(bytes, p, q, dq, qmul);
- kernel_neon_end();
+ scoped_ksimd()
+ __raid6_datap_recov_neon(bytes, p, q, dq, qmul);
}
const struct raid6_recov_calls raid6_recov_neon = {
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 06/20] crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (4 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 05/20] raid6: " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 07/20] crypto/arm64: sm4-ce-ccm " Ard Biesheuvel
` (14 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Kernel mode NEON sections are now preemptible on arm64, and so there is
no need to yield the NEON unit explicitly in order to prevent scheduling
latency spikes.
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/aes-ce-ccm-glue.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index 2d791d51891b..2eb4e76cabc3 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -114,11 +114,8 @@ static u32 ce_aes_ccm_auth_data(u8 mac[], u8 const in[], u32 abytes,
in += adv;
abytes -= adv;
- if (unlikely(rem)) {
- kernel_neon_end();
- kernel_neon_begin();
+ if (unlikely(rem))
macp = 0;
- }
} else {
u32 l = min(AES_BLOCK_SIZE - macp, abytes);
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 07/20] crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (5 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 06/20] crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 08/20] crypto/arm64: sm4-ce-gcm " Ard Biesheuvel
` (13 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Kernel mode NEON sections are now preemptible on arm64, and so there is
no need to yield the NEON unit when calling APIs that may sleep.
Also, move the calls to kernel_neon_end() to the same scope as
kernel_neon_begin(). This is needed for a subsequent change where a
stack buffer is allocated transparently and passed to
kernel_neon_begin().
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/sm4-ce-ccm-glue.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/crypto/sm4-ce-ccm-glue.c b/arch/arm64/crypto/sm4-ce-ccm-glue.c
index e9cc1c1364ec..f9771ab2a05f 100644
--- a/arch/arm64/crypto/sm4-ce-ccm-glue.c
+++ b/arch/arm64/crypto/sm4-ce-ccm-glue.c
@@ -179,11 +179,7 @@ static int ccm_crypt(struct aead_request *req, struct skcipher_walk *walk,
walk->src.virt.addr, walk->iv,
walk->nbytes - tail, mac);
- kernel_neon_end();
-
err = skcipher_walk_done(walk, tail);
-
- kernel_neon_begin();
}
if (walk->nbytes) {
@@ -193,15 +189,13 @@ static int ccm_crypt(struct aead_request *req, struct skcipher_walk *walk,
sm4_ce_ccm_final(rkey_enc, ctr0, mac);
- kernel_neon_end();
-
err = skcipher_walk_done(walk, 0);
} else {
sm4_ce_ccm_final(rkey_enc, ctr0, mac);
-
- kernel_neon_end();
}
+ kernel_neon_end();
+
return err;
}
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 08/20] crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (6 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 07/20] crypto/arm64: sm4-ce-ccm " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 09/20] lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API Ard Biesheuvel
` (12 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Kernel mode NEON sections are now preemptible on arm64, and so there is
no need to yield the NEON unit when calling APIs that may sleep.
Also, move the calls to kernel_neon_end() to the same scope as
kernel_neon_begin(). This is needed for a subsequent change where a
stack buffer is allocated transparently and passed to
kernel_neon_begin().
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/sm4-ce-gcm-glue.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/crypto/sm4-ce-gcm-glue.c b/arch/arm64/crypto/sm4-ce-gcm-glue.c
index c2ea3d5f690b..170cd0151385 100644
--- a/arch/arm64/crypto/sm4-ce-gcm-glue.c
+++ b/arch/arm64/crypto/sm4-ce-gcm-glue.c
@@ -165,26 +165,22 @@ static int gcm_crypt(struct aead_request *req, struct skcipher_walk *walk,
ctx->ghash_table,
(const u8 *)&lengths);
- kernel_neon_end();
-
- return skcipher_walk_done(walk, 0);
+ err = skcipher_walk_done(walk, 0);
+ goto out;
}
sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, dst, src, iv,
walk->nbytes - tail, ghash,
ctx->ghash_table, NULL);
- kernel_neon_end();
-
err = skcipher_walk_done(walk, tail);
-
- kernel_neon_begin();
}
sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, NULL, NULL, iv,
walk->nbytes, ghash, ctx->ghash_table,
(const u8 *)&lengths);
+out:
kernel_neon_end();
return err;
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 09/20] lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (7 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 08/20] crypto/arm64: sm4-ce-gcm " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 10/20] lib/crypto: " Ard Biesheuvel
` (11 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.
For symmetry, do the same for 32-bit ARM too.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
lib/crc/arm/crc-t10dif.h | 16 +++++-----------
lib/crc/arm/crc32.h | 11 ++++-------
lib/crc/arm64/crc-t10dif.h | 16 +++++-----------
lib/crc/arm64/crc32.h | 16 ++++++----------
4 files changed, 20 insertions(+), 39 deletions(-)
diff --git a/lib/crc/arm/crc-t10dif.h b/lib/crc/arm/crc-t10dif.h
index 2edf7e9681d0..133a773b8248 100644
--- a/lib/crc/arm/crc-t10dif.h
+++ b/lib/crc/arm/crc-t10dif.h
@@ -7,7 +7,6 @@
#include <crypto/internal/simd.h>
-#include <asm/neon.h>
#include <asm/simd.h>
static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon);
@@ -22,21 +21,16 @@ asmlinkage void crc_t10dif_pmull8(u16 init_crc, const u8 *buf, size_t len,
static inline u16 crc_t10dif_arch(u16 crc, const u8 *data, size_t length)
{
if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE) {
- if (static_branch_likely(&have_pmull)) {
- if (crypto_simd_usable()) {
- kernel_neon_begin();
- crc = crc_t10dif_pmull64(crc, data, length);
- kernel_neon_end();
- return crc;
- }
+ if (static_branch_likely(&have_pmull) && crypto_simd_usable()) {
+ scoped_ksimd()
+ return crc_t10dif_pmull64(crc, data, length);
} else if (length > CRC_T10DIF_PMULL_CHUNK_SIZE &&
static_branch_likely(&have_neon) &&
crypto_simd_usable()) {
u8 buf[16] __aligned(16);
- kernel_neon_begin();
- crc_t10dif_pmull8(crc, data, length, buf);
- kernel_neon_end();
+ scoped_ksimd()
+ crc_t10dif_pmull8(crc, data, length, buf);
return crc_t10dif_generic(0, buf, sizeof(buf));
}
diff --git a/lib/crc/arm/crc32.h b/lib/crc/arm/crc32.h
index 018007e162a2..32ad299319cd 100644
--- a/lib/crc/arm/crc32.h
+++ b/lib/crc/arm/crc32.h
@@ -10,7 +10,6 @@
#include <crypto/internal/simd.h>
#include <asm/hwcap.h>
-#include <asm/neon.h>
#include <asm/simd.h>
static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_crc32);
@@ -44,9 +43,8 @@ static inline u32 crc32_le_arch(u32 crc, const u8 *p, size_t len)
len -= n;
}
n = round_down(len, 16);
- kernel_neon_begin();
- crc = crc32_pmull_le(p, n, crc);
- kernel_neon_end();
+ scoped_ksimd()
+ crc = crc32_pmull_le(p, n, crc);
p += n;
len -= n;
}
@@ -73,9 +71,8 @@ static inline u32 crc32c_arch(u32 crc, const u8 *p, size_t len)
len -= n;
}
n = round_down(len, 16);
- kernel_neon_begin();
- crc = crc32c_pmull_le(p, n, crc);
- kernel_neon_end();
+ scoped_ksimd()
+ crc = crc32c_pmull_le(p, n, crc);
p += n;
len -= n;
}
diff --git a/lib/crc/arm64/crc-t10dif.h b/lib/crc/arm64/crc-t10dif.h
index c4521a7f1ee9..dcbee08801d6 100644
--- a/lib/crc/arm64/crc-t10dif.h
+++ b/lib/crc/arm64/crc-t10dif.h
@@ -9,7 +9,6 @@
#include <crypto/internal/simd.h>
-#include <asm/neon.h>
#include <asm/simd.h>
static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_asimd);
@@ -24,21 +23,16 @@ asmlinkage u16 crc_t10dif_pmull_p64(u16 init_crc, const u8 *buf, size_t len);
static inline u16 crc_t10dif_arch(u16 crc, const u8 *data, size_t length)
{
if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE) {
- if (static_branch_likely(&have_pmull)) {
- if (crypto_simd_usable()) {
- kernel_neon_begin();
- crc = crc_t10dif_pmull_p64(crc, data, length);
- kernel_neon_end();
- return crc;
- }
+ if (static_branch_likely(&have_pmull) && crypto_simd_usable()) {
+ scoped_ksimd()
+ return crc_t10dif_pmull_p64(crc, data, length);
} else if (length > CRC_T10DIF_PMULL_CHUNK_SIZE &&
static_branch_likely(&have_asimd) &&
crypto_simd_usable()) {
u8 buf[16];
- kernel_neon_begin();
- crc_t10dif_pmull_p8(crc, data, length, buf);
- kernel_neon_end();
+ scoped_ksimd()
+ crc_t10dif_pmull_p8(crc, data, length, buf);
return crc_t10dif_generic(0, buf, sizeof(buf));
}
diff --git a/lib/crc/arm64/crc32.h b/lib/crc/arm64/crc32.h
index 6e5dec45f05d..2b5cbb686a13 100644
--- a/lib/crc/arm64/crc32.h
+++ b/lib/crc/arm64/crc32.h
@@ -2,7 +2,6 @@
#include <asm/alternative.h>
#include <asm/cpufeature.h>
-#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/internal/simd.h>
@@ -24,9 +23,8 @@ static inline u32 crc32_le_arch(u32 crc, const u8 *p, size_t len)
return crc32_le_base(crc, p, len);
if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) {
- kernel_neon_begin();
- crc = crc32_le_arm64_4way(crc, p, len);
- kernel_neon_end();
+ scoped_ksimd()
+ crc = crc32_le_arm64_4way(crc, p, len);
p += round_down(len, 64);
len %= 64;
@@ -44,9 +42,8 @@ static inline u32 crc32c_arch(u32 crc, const u8 *p, size_t len)
return crc32c_base(crc, p, len);
if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) {
- kernel_neon_begin();
- crc = crc32c_le_arm64_4way(crc, p, len);
- kernel_neon_end();
+ scoped_ksimd()
+ crc = crc32c_le_arm64_4way(crc, p, len);
p += round_down(len, 64);
len %= 64;
@@ -64,9 +61,8 @@ static inline u32 crc32_be_arch(u32 crc, const u8 *p, size_t len)
return crc32_be_base(crc, p, len);
if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) {
- kernel_neon_begin();
- crc = crc32_be_arm64_4way(crc, p, len);
- kernel_neon_end();
+ scoped_ksimd()
+ crc = crc32_be_arm64_4way(crc, p, len);
p += round_down(len, 64);
len %= 64;
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 10/20] lib/crypto: Switch ARM and arm64 to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (8 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 09/20] lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 11/20] crypto/arm64: aes-ccm - Switch " Ard Biesheuvel
` (10 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.
For symmetry, do the same for 32-bit ARM too.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
lib/crypto/arm/chacha-glue.c | 6 ++----
lib/crypto/arm/poly1305-glue.c | 6 ++----
lib/crypto/arm/sha1.h | 13 ++++++-------
lib/crypto/arm/sha256.h | 14 +++++++-------
lib/crypto/arm/sha512.h | 6 +++---
lib/crypto/arm64/chacha-neon-glue.c | 11 ++++-------
lib/crypto/arm64/poly1305-glue.c | 6 ++----
lib/crypto/arm64/sha1.h | 7 +++----
lib/crypto/arm64/sha256.h | 15 +++++++--------
lib/crypto/arm64/sha512.h | 8 ++++----
10 files changed, 40 insertions(+), 52 deletions(-)
diff --git a/lib/crypto/arm/chacha-glue.c b/lib/crypto/arm/chacha-glue.c
index 88ec96415283..9c2e8d5edf20 100644
--- a/lib/crypto/arm/chacha-glue.c
+++ b/lib/crypto/arm/chacha-glue.c
@@ -14,7 +14,6 @@
#include <asm/cputype.h>
#include <asm/hwcap.h>
-#include <asm/neon.h>
#include <asm/simd.h>
asmlinkage void chacha_block_xor_neon(const struct chacha_state *state,
@@ -90,9 +89,8 @@ void chacha_crypt_arch(struct chacha_state *state, u8 *dst, const u8 *src,
do {
unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
- kernel_neon_begin();
- chacha_doneon(state, dst, src, todo, nrounds);
- kernel_neon_end();
+ scoped_ksimd()
+ chacha_doneon(state, dst, src, todo, nrounds);
bytes -= todo;
src += todo;
diff --git a/lib/crypto/arm/poly1305-glue.c b/lib/crypto/arm/poly1305-glue.c
index 2d86c78af883..3e4624477e9f 100644
--- a/lib/crypto/arm/poly1305-glue.c
+++ b/lib/crypto/arm/poly1305-glue.c
@@ -6,7 +6,6 @@
*/
#include <asm/hwcap.h>
-#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
@@ -39,9 +38,8 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
- kernel_neon_begin();
- poly1305_blocks_neon(state, src, todo, padbit);
- kernel_neon_end();
+ scoped_ksimd()
+ poly1305_blocks_neon(state, src, todo, padbit);
len -= todo;
src += todo;
diff --git a/lib/crypto/arm/sha1.h b/lib/crypto/arm/sha1.h
index fa1e92419000..a4296ffefd05 100644
--- a/lib/crypto/arm/sha1.h
+++ b/lib/crypto/arm/sha1.h
@@ -4,7 +4,6 @@
*
* Copyright 2025 Google LLC
*/
-#include <asm/neon.h>
#include <asm/simd.h>
static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon);
@@ -22,12 +21,12 @@ static void sha1_blocks(struct sha1_block_state *state,
{
if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
static_branch_likely(&have_neon) && likely(may_use_simd())) {
- kernel_neon_begin();
- if (static_branch_likely(&have_ce))
- sha1_ce_transform(state, data, nblocks);
- else
- sha1_transform_neon(state, data, nblocks);
- kernel_neon_end();
+ scoped_ksimd() {
+ if (static_branch_likely(&have_ce))
+ sha1_ce_transform(state, data, nblocks);
+ else
+ sha1_transform_neon(state, data, nblocks);
+ }
} else {
sha1_block_data_order(state, data, nblocks);
}
diff --git a/lib/crypto/arm/sha256.h b/lib/crypto/arm/sha256.h
index da75cbdc51d4..df861cc5b9ff 100644
--- a/lib/crypto/arm/sha256.h
+++ b/lib/crypto/arm/sha256.h
@@ -4,7 +4,7 @@
*
* Copyright 2025 Google LLC
*/
-#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/simd.h>
asmlinkage void sha256_block_data_order(struct sha256_block_state *state,
@@ -22,12 +22,12 @@ static void sha256_blocks(struct sha256_block_state *state,
{
if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
static_branch_likely(&have_neon) && crypto_simd_usable()) {
- kernel_neon_begin();
- if (static_branch_likely(&have_ce))
- sha256_ce_transform(state, data, nblocks);
- else
- sha256_block_data_order_neon(state, data, nblocks);
- kernel_neon_end();
+ scoped_ksimd() {
+ if (static_branch_likely(&have_ce))
+ sha256_ce_transform(state, data, nblocks);
+ else
+ sha256_block_data_order_neon(state, data, nblocks);
+ }
} else {
sha256_block_data_order(state, data, nblocks);
}
diff --git a/lib/crypto/arm/sha512.h b/lib/crypto/arm/sha512.h
index f147b6490d6c..35b80e7e7db7 100644
--- a/lib/crypto/arm/sha512.h
+++ b/lib/crypto/arm/sha512.h
@@ -6,6 +6,7 @@
*/
#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/simd.h>
static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon);
@@ -20,9 +21,8 @@ static void sha512_blocks(struct sha512_block_state *state,
{
if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
static_branch_likely(&have_neon) && likely(crypto_simd_usable())) {
- kernel_neon_begin();
- sha512_block_data_order_neon(state, data, nblocks);
- kernel_neon_end();
+ scoped_ksimd()
+ sha512_block_data_order_neon(state, data, nblocks);
} else {
sha512_block_data_order(state, data, nblocks);
}
diff --git a/lib/crypto/arm64/chacha-neon-glue.c b/lib/crypto/arm64/chacha-neon-glue.c
index d0188f974ca5..a3d109f0ce1e 100644
--- a/lib/crypto/arm64/chacha-neon-glue.c
+++ b/lib/crypto/arm64/chacha-neon-glue.c
@@ -25,7 +25,6 @@
#include <linux/module.h>
#include <asm/hwcap.h>
-#include <asm/neon.h>
#include <asm/simd.h>
asmlinkage void chacha_block_xor_neon(const struct chacha_state *state,
@@ -67,9 +66,8 @@ void hchacha_block_arch(const struct chacha_state *state,
if (!static_branch_likely(&have_neon) || !crypto_simd_usable()) {
hchacha_block_generic(state, out, nrounds);
} else {
- kernel_neon_begin();
- hchacha_block_neon(state, out, nrounds);
- kernel_neon_end();
+ scoped_ksimd()
+ hchacha_block_neon(state, out, nrounds);
}
}
EXPORT_SYMBOL(hchacha_block_arch);
@@ -84,9 +82,8 @@ void chacha_crypt_arch(struct chacha_state *state, u8 *dst, const u8 *src,
do {
unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
- kernel_neon_begin();
- chacha_doneon(state, dst, src, todo, nrounds);
- kernel_neon_end();
+ scoped_ksimd()
+ chacha_doneon(state, dst, src, todo, nrounds);
bytes -= todo;
src += todo;
diff --git a/lib/crypto/arm64/poly1305-glue.c b/lib/crypto/arm64/poly1305-glue.c
index 31aea21ce42f..c83ce7d835d9 100644
--- a/lib/crypto/arm64/poly1305-glue.c
+++ b/lib/crypto/arm64/poly1305-glue.c
@@ -6,7 +6,6 @@
*/
#include <asm/hwcap.h>
-#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
@@ -38,9 +37,8 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
- kernel_neon_begin();
- poly1305_blocks_neon(state, src, todo, padbit);
- kernel_neon_end();
+ scoped_ksimd()
+ poly1305_blocks_neon(state, src, todo, padbit);
len -= todo;
src += todo;
diff --git a/lib/crypto/arm64/sha1.h b/lib/crypto/arm64/sha1.h
index f822563538cc..3d0da0045fed 100644
--- a/lib/crypto/arm64/sha1.h
+++ b/lib/crypto/arm64/sha1.h
@@ -4,7 +4,6 @@
*
* Copyright 2025 Google LLC
*/
-#include <asm/neon.h>
#include <asm/simd.h>
#include <linux/cpufeature.h>
@@ -20,9 +19,9 @@ static void sha1_blocks(struct sha1_block_state *state,
do {
size_t rem;
- kernel_neon_begin();
- rem = __sha1_ce_transform(state, data, nblocks);
- kernel_neon_end();
+ scoped_ksimd()
+ rem = __sha1_ce_transform(state, data, nblocks);
+
data += (nblocks - rem) * SHA1_BLOCK_SIZE;
nblocks = rem;
} while (nblocks);
diff --git a/lib/crypto/arm64/sha256.h b/lib/crypto/arm64/sha256.h
index a211966c124a..0a9f9d70bb43 100644
--- a/lib/crypto/arm64/sha256.h
+++ b/lib/crypto/arm64/sha256.h
@@ -4,7 +4,7 @@
*
* Copyright 2025 Google LLC
*/
-#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/simd.h>
#include <linux/cpufeature.h>
@@ -27,17 +27,16 @@ static void sha256_blocks(struct sha256_block_state *state,
do {
size_t rem;
- kernel_neon_begin();
- rem = __sha256_ce_transform(state,
- data, nblocks);
- kernel_neon_end();
+ scoped_ksimd()
+ rem = __sha256_ce_transform(state, data,
+ nblocks);
+
data += (nblocks - rem) * SHA256_BLOCK_SIZE;
nblocks = rem;
} while (nblocks);
} else {
- kernel_neon_begin();
- sha256_block_neon(state, data, nblocks);
- kernel_neon_end();
+ scoped_ksimd()
+ sha256_block_neon(state, data, nblocks);
}
} else {
sha256_block_data_order(state, data, nblocks);
diff --git a/lib/crypto/arm64/sha512.h b/lib/crypto/arm64/sha512.h
index 6abb40b467f2..1b6c3974d553 100644
--- a/lib/crypto/arm64/sha512.h
+++ b/lib/crypto/arm64/sha512.h
@@ -5,7 +5,7 @@
* Copyright 2025 Google LLC
*/
-#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/simd.h>
#include <linux/cpufeature.h>
@@ -25,9 +25,9 @@ static void sha512_blocks(struct sha512_block_state *state,
do {
size_t rem;
- kernel_neon_begin();
- rem = __sha512_ce_transform(state, data, nblocks);
- kernel_neon_end();
+ scoped_ksimd()
+ rem = __sha512_ce_transform(state, data, nblocks);
+
data += (nblocks - rem) * SHA512_BLOCK_SIZE;
nblocks = rem;
} while (nblocks);
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 11/20] crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (9 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 10/20] lib/crypto: " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 12/20] crypto/arm64: aes-blk " Ard Biesheuvel
` (9 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/aes-ce-ccm-glue.c | 135 ++++++++++----------
1 file changed, 66 insertions(+), 69 deletions(-)
diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index 2eb4e76cabc3..c4fd648471f1 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -8,7 +8,6 @@
* Author: Ard Biesheuvel <ardb@kernel.org>
*/
-#include <asm/neon.h>
#include <linux/unaligned.h>
#include <crypto/aes.h>
#include <crypto/scatterwalk.h>
@@ -16,6 +15,8 @@
#include <crypto/internal/skcipher.h>
#include <linux/module.h>
+#include <asm/simd.h>
+
#include "aes-ce-setkey.h"
MODULE_IMPORT_NS("CRYPTO_INTERNAL");
@@ -184,40 +185,38 @@ static int ccm_encrypt(struct aead_request *req)
if (unlikely(err))
return err;
- kernel_neon_begin();
-
- if (req->assoclen)
- ccm_calculate_auth_mac(req, mac);
-
- do {
- u32 tail = walk.nbytes % AES_BLOCK_SIZE;
- const u8 *src = walk.src.virt.addr;
- u8 *dst = walk.dst.virt.addr;
- u8 buf[AES_BLOCK_SIZE];
- u8 *final_iv = NULL;
-
- if (walk.nbytes == walk.total) {
- tail = 0;
- final_iv = orig_iv;
- }
-
- if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
- src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
- src, walk.nbytes);
-
- ce_aes_ccm_encrypt(dst, src, walk.nbytes - tail,
- ctx->key_enc, num_rounds(ctx),
- mac, walk.iv, final_iv);
-
- if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
- memcpy(walk.dst.virt.addr, dst, walk.nbytes);
-
- if (walk.nbytes) {
- err = skcipher_walk_done(&walk, tail);
- }
- } while (walk.nbytes);
-
- kernel_neon_end();
+ scoped_ksimd() {
+ if (req->assoclen)
+ ccm_calculate_auth_mac(req, mac);
+
+ do {
+ u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+ u8 buf[AES_BLOCK_SIZE];
+ u8 *final_iv = NULL;
+
+ if (walk.nbytes == walk.total) {
+ tail = 0;
+ final_iv = orig_iv;
+ }
+
+ if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
+ src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
+ src, walk.nbytes);
+
+ ce_aes_ccm_encrypt(dst, src, walk.nbytes - tail,
+ ctx->key_enc, num_rounds(ctx),
+ mac, walk.iv, final_iv);
+
+ if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
+ memcpy(walk.dst.virt.addr, dst, walk.nbytes);
+
+ if (walk.nbytes) {
+ err = skcipher_walk_done(&walk, tail);
+ }
+ } while (walk.nbytes);
+ }
if (unlikely(err))
return err;
@@ -251,40 +250,38 @@ static int ccm_decrypt(struct aead_request *req)
if (unlikely(err))
return err;
- kernel_neon_begin();
-
- if (req->assoclen)
- ccm_calculate_auth_mac(req, mac);
-
- do {
- u32 tail = walk.nbytes % AES_BLOCK_SIZE;
- const u8 *src = walk.src.virt.addr;
- u8 *dst = walk.dst.virt.addr;
- u8 buf[AES_BLOCK_SIZE];
- u8 *final_iv = NULL;
-
- if (walk.nbytes == walk.total) {
- tail = 0;
- final_iv = orig_iv;
- }
-
- if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
- src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
- src, walk.nbytes);
-
- ce_aes_ccm_decrypt(dst, src, walk.nbytes - tail,
- ctx->key_enc, num_rounds(ctx),
- mac, walk.iv, final_iv);
-
- if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
- memcpy(walk.dst.virt.addr, dst, walk.nbytes);
-
- if (walk.nbytes) {
- err = skcipher_walk_done(&walk, tail);
- }
- } while (walk.nbytes);
-
- kernel_neon_end();
+ scoped_ksimd() {
+ if (req->assoclen)
+ ccm_calculate_auth_mac(req, mac);
+
+ do {
+ u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+ u8 buf[AES_BLOCK_SIZE];
+ u8 *final_iv = NULL;
+
+ if (walk.nbytes == walk.total) {
+ tail = 0;
+ final_iv = orig_iv;
+ }
+
+ if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
+ src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
+ src, walk.nbytes);
+
+ ce_aes_ccm_decrypt(dst, src, walk.nbytes - tail,
+ ctx->key_enc, num_rounds(ctx),
+ mac, walk.iv, final_iv);
+
+ if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
+ memcpy(walk.dst.virt.addr, dst, walk.nbytes);
+
+ if (walk.nbytes) {
+ err = skcipher_walk_done(&walk, tail);
+ }
+ } while (walk.nbytes);
+ }
if (unlikely(err))
return err;
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 12/20] crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (10 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 11/20] crypto/arm64: aes-ccm - Switch " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 13/20] crypto/arm64: aes-gcm " Ard Biesheuvel
` (8 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/aes-ce-glue.c | 87 ++++++------
arch/arm64/crypto/aes-glue.c | 139 ++++++++----------
arch/arm64/crypto/aes-neonbs-glue.c | 150 ++++++++++----------
3 files changed, 181 insertions(+), 195 deletions(-)
diff --git a/arch/arm64/crypto/aes-ce-glue.c b/arch/arm64/crypto/aes-ce-glue.c
index 00b8749013c5..a4dad370991d 100644
--- a/arch/arm64/crypto/aes-ce-glue.c
+++ b/arch/arm64/crypto/aes-ce-glue.c
@@ -52,9 +52,8 @@ static void aes_cipher_encrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
return;
}
- kernel_neon_begin();
- __aes_ce_encrypt(ctx->key_enc, dst, src, num_rounds(ctx));
- kernel_neon_end();
+ scoped_ksimd()
+ __aes_ce_encrypt(ctx->key_enc, dst, src, num_rounds(ctx));
}
static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
@@ -66,9 +65,8 @@ static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
return;
}
- kernel_neon_begin();
- __aes_ce_decrypt(ctx->key_dec, dst, src, num_rounds(ctx));
- kernel_neon_end();
+ scoped_ksimd()
+ __aes_ce_decrypt(ctx->key_dec, dst, src, num_rounds(ctx));
}
int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
@@ -94,47 +92,48 @@ int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
for (i = 0; i < kwords; i++)
ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
- kernel_neon_begin();
- for (i = 0; i < sizeof(rcon); i++) {
- u32 *rki = ctx->key_enc + (i * kwords);
- u32 *rko = rki + kwords;
-
- rko[0] = ror32(__aes_ce_sub(rki[kwords - 1]), 8) ^ rcon[i] ^ rki[0];
- rko[1] = rko[0] ^ rki[1];
- rko[2] = rko[1] ^ rki[2];
- rko[3] = rko[2] ^ rki[3];
-
- if (key_len == AES_KEYSIZE_192) {
- if (i >= 7)
- break;
- rko[4] = rko[3] ^ rki[4];
- rko[5] = rko[4] ^ rki[5];
- } else if (key_len == AES_KEYSIZE_256) {
- if (i >= 6)
- break;
- rko[4] = __aes_ce_sub(rko[3]) ^ rki[4];
- rko[5] = rko[4] ^ rki[5];
- rko[6] = rko[5] ^ rki[6];
- rko[7] = rko[6] ^ rki[7];
+ scoped_ksimd() {
+ for (i = 0; i < sizeof(rcon); i++) {
+ u32 *rki = ctx->key_enc + (i * kwords);
+ u32 *rko = rki + kwords;
+
+ rko[0] = ror32(__aes_ce_sub(rki[kwords - 1]), 8) ^
+ rcon[i] ^ rki[0];
+ rko[1] = rko[0] ^ rki[1];
+ rko[2] = rko[1] ^ rki[2];
+ rko[3] = rko[2] ^ rki[3];
+
+ if (key_len == AES_KEYSIZE_192) {
+ if (i >= 7)
+ break;
+ rko[4] = rko[3] ^ rki[4];
+ rko[5] = rko[4] ^ rki[5];
+ } else if (key_len == AES_KEYSIZE_256) {
+ if (i >= 6)
+ break;
+ rko[4] = __aes_ce_sub(rko[3]) ^ rki[4];
+ rko[5] = rko[4] ^ rki[5];
+ rko[6] = rko[5] ^ rki[6];
+ rko[7] = rko[6] ^ rki[7];
+ }
}
- }
- /*
- * Generate the decryption keys for the Equivalent Inverse Cipher.
- * This involves reversing the order of the round keys, and applying
- * the Inverse Mix Columns transformation on all but the first and
- * the last one.
- */
- key_enc = (struct aes_block *)ctx->key_enc;
- key_dec = (struct aes_block *)ctx->key_dec;
- j = num_rounds(ctx);
-
- key_dec[0] = key_enc[j];
- for (i = 1, j--; j > 0; i++, j--)
- __aes_ce_invert(key_dec + i, key_enc + j);
- key_dec[i] = key_enc[0];
+ /*
+ * Generate the decryption keys for the Equivalent Inverse
+ * Cipher. This involves reversing the order of the round
+ * keys, and applying the Inverse Mix Columns transformation on
+ * all but the first and the last one.
+ */
+ key_enc = (struct aes_block *)ctx->key_enc;
+ key_dec = (struct aes_block *)ctx->key_dec;
+ j = num_rounds(ctx);
+
+ key_dec[0] = key_enc[j];
+ for (i = 1, j--; j > 0; i++, j--)
+ __aes_ce_invert(key_dec + i, key_enc + j);
+ key_dec[i] = key_enc[0];
+ }
- kernel_neon_end();
return 0;
}
EXPORT_SYMBOL(ce_aes_expandkey);
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 81560f722b9d..4b76d8dfa2c5 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -5,8 +5,6 @@
* Copyright (C) 2013 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
*/
-#include <asm/hwcap.h>
-#include <asm/neon.h>
#include <crypto/aes.h>
#include <crypto/ctr.h>
#include <crypto/internal/hash.h>
@@ -20,6 +18,9 @@
#include <linux/module.h>
#include <linux/string.h>
+#include <asm/hwcap.h>
+#include <asm/simd.h>
+
#include "aes-ce-setkey.h"
#ifdef USE_V8_CRYPTO_EXTENSIONS
@@ -187,10 +188,9 @@ static int __maybe_unused ecb_encrypt(struct skcipher_request *req)
err = skcipher_walk_virt(&walk, req, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
- kernel_neon_begin();
- aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key_enc, rounds, blocks);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key_enc, rounds, blocks);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -207,10 +207,9 @@ static int __maybe_unused ecb_decrypt(struct skcipher_request *req)
err = skcipher_walk_virt(&walk, req, false);
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
- kernel_neon_begin();
- aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key_dec, rounds, blocks);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key_dec, rounds, blocks);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -225,10 +224,9 @@ static int cbc_encrypt_walk(struct skcipher_request *req,
unsigned int blocks;
while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) {
- kernel_neon_begin();
- aes_cbc_encrypt(walk->dst.virt.addr, walk->src.virt.addr,
- ctx->key_enc, rounds, blocks, walk->iv);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_cbc_encrypt(walk->dst.virt.addr, walk->src.virt.addr,
+ ctx->key_enc, rounds, blocks, walk->iv);
err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -254,10 +252,9 @@ static int cbc_decrypt_walk(struct skcipher_request *req,
unsigned int blocks;
while ((blocks = (walk->nbytes / AES_BLOCK_SIZE))) {
- kernel_neon_begin();
- aes_cbc_decrypt(walk->dst.virt.addr, walk->src.virt.addr,
- ctx->key_dec, rounds, blocks, walk->iv);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_cbc_decrypt(walk->dst.virt.addr, walk->src.virt.addr,
+ ctx->key_dec, rounds, blocks, walk->iv);
err = skcipher_walk_done(walk, walk->nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -323,10 +320,9 @@ static int cts_cbc_encrypt(struct skcipher_request *req)
if (err)
return err;
- kernel_neon_begin();
- aes_cbc_cts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key_enc, rounds, walk.nbytes, walk.iv);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_cbc_cts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key_enc, rounds, walk.nbytes, walk.iv);
return skcipher_walk_done(&walk, 0);
}
@@ -380,10 +376,9 @@ static int cts_cbc_decrypt(struct skcipher_request *req)
if (err)
return err;
- kernel_neon_begin();
- aes_cbc_cts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key_dec, rounds, walk.nbytes, walk.iv);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_cbc_cts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key_dec, rounds, walk.nbytes, walk.iv);
return skcipher_walk_done(&walk, 0);
}
@@ -416,11 +411,11 @@ static int __maybe_unused essiv_cbc_encrypt(struct skcipher_request *req)
blocks = walk.nbytes / AES_BLOCK_SIZE;
if (blocks) {
- kernel_neon_begin();
- aes_essiv_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key1.key_enc, rounds, blocks,
- req->iv, ctx->key2.key_enc);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_essiv_cbc_encrypt(walk.dst.virt.addr,
+ walk.src.virt.addr,
+ ctx->key1.key_enc, rounds, blocks,
+ req->iv, ctx->key2.key_enc);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err ?: cbc_encrypt_walk(req, &walk);
@@ -438,11 +433,11 @@ static int __maybe_unused essiv_cbc_decrypt(struct skcipher_request *req)
blocks = walk.nbytes / AES_BLOCK_SIZE;
if (blocks) {
- kernel_neon_begin();
- aes_essiv_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key1.key_dec, rounds, blocks,
- req->iv, ctx->key2.key_enc);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_essiv_cbc_decrypt(walk.dst.virt.addr,
+ walk.src.virt.addr,
+ ctx->key1.key_dec, rounds, blocks,
+ req->iv, ctx->key2.key_enc);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err ?: cbc_decrypt_walk(req, &walk);
@@ -478,10 +473,9 @@ static int __maybe_unused xctr_encrypt(struct skcipher_request *req)
else if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
- kernel_neon_begin();
- aes_xctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
- walk.iv, byte_ctr);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_xctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
+ walk.iv, byte_ctr);
if (unlikely(nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr,
@@ -523,10 +517,9 @@ static int __maybe_unused ctr_encrypt(struct skcipher_request *req)
else if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
- kernel_neon_begin();
- aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
- walk.iv);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
+ walk.iv);
if (unlikely(nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr,
@@ -579,11 +572,10 @@ static int __maybe_unused xts_encrypt(struct skcipher_request *req)
if (walk.nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
- kernel_neon_begin();
- aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key1.key_enc, rounds, nbytes,
- ctx->key2.key_enc, walk.iv, first);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key1.key_enc, rounds, nbytes,
+ ctx->key2.key_enc, walk.iv, first);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
@@ -601,11 +593,10 @@ static int __maybe_unused xts_encrypt(struct skcipher_request *req)
if (err)
return err;
- kernel_neon_begin();
- aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key1.key_enc, rounds, walk.nbytes,
- ctx->key2.key_enc, walk.iv, first);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key1.key_enc, rounds, walk.nbytes,
+ ctx->key2.key_enc, walk.iv, first);
return skcipher_walk_done(&walk, 0);
}
@@ -651,11 +642,10 @@ static int __maybe_unused xts_decrypt(struct skcipher_request *req)
if (walk.nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
- kernel_neon_begin();
- aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key1.key_dec, rounds, nbytes,
- ctx->key2.key_enc, walk.iv, first);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key1.key_dec, rounds, nbytes,
+ ctx->key2.key_enc, walk.iv, first);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
@@ -674,11 +664,10 @@ static int __maybe_unused xts_decrypt(struct skcipher_request *req)
return err;
- kernel_neon_begin();
- aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key1.key_dec, rounds, walk.nbytes,
- ctx->key2.key_enc, walk.iv, first);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key1.key_dec, rounds, walk.nbytes,
+ ctx->key2.key_enc, walk.iv, first);
return skcipher_walk_done(&walk, 0);
}
@@ -827,10 +816,9 @@ static int cmac_setkey(struct crypto_shash *tfm, const u8 *in_key,
return err;
/* encrypt the zero vector */
- kernel_neon_begin();
- aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){}, ctx->key.key_enc,
- rounds, 1);
- kernel_neon_end();
+ scoped_ksimd()
+ aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){},
+ ctx->key.key_enc, rounds, 1);
cmac_gf128_mul_by_x(consts, consts);
cmac_gf128_mul_by_x(consts + 1, consts);
@@ -856,10 +844,10 @@ static int xcbc_setkey(struct crypto_shash *tfm, const u8 *in_key,
if (err)
return err;
- kernel_neon_begin();
- aes_ecb_encrypt(key, ks[0], ctx->key.key_enc, rounds, 1);
- aes_ecb_encrypt(ctx->consts, ks[1], ctx->key.key_enc, rounds, 2);
- kernel_neon_end();
+ scoped_ksimd() {
+ aes_ecb_encrypt(key, ks[0], ctx->key.key_enc, rounds, 1);
+ aes_ecb_encrypt(ctx->consts, ks[1], ctx->key.key_enc, rounds, 2);
+ }
return cbcmac_setkey(tfm, key, sizeof(key));
}
@@ -879,10 +867,9 @@ static void mac_do_update(struct crypto_aes_ctx *ctx, u8 const in[], int blocks,
int rem;
do {
- kernel_neon_begin();
- rem = aes_mac_update(in, ctx->key_enc, rounds, blocks,
- dg, enc_before, !enc_before);
- kernel_neon_end();
+ scoped_ksimd()
+ rem = aes_mac_update(in, ctx->key_enc, rounds, blocks,
+ dg, enc_before, !enc_before);
in += (blocks - rem) * AES_BLOCK_SIZE;
blocks = rem;
} while (blocks);
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index c4a623e86593..d496effb0a5b 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -85,9 +85,8 @@ static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
ctx->rounds = 6 + key_len / 4;
- kernel_neon_begin();
- aesbs_convert_key(ctx->rk, rk.key_enc, ctx->rounds);
- kernel_neon_end();
+ scoped_ksimd()
+ aesbs_convert_key(ctx->rk, rk.key_enc, ctx->rounds);
return 0;
}
@@ -110,10 +109,9 @@ static int __ecb_crypt(struct skcipher_request *req,
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
- kernel_neon_begin();
- fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk,
- ctx->rounds, blocks);
- kernel_neon_end();
+ scoped_ksimd()
+ fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk,
+ ctx->rounds, blocks);
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
@@ -146,9 +144,8 @@ static int aesbs_cbc_ctr_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
memcpy(ctx->enc, rk.key_enc, sizeof(ctx->enc));
- kernel_neon_begin();
- aesbs_convert_key(ctx->key.rk, rk.key_enc, ctx->key.rounds);
- kernel_neon_end();
+ scoped_ksimd()
+ aesbs_convert_key(ctx->key.rk, rk.key_enc, ctx->key.rounds);
memzero_explicit(&rk, sizeof(rk));
return 0;
@@ -167,11 +164,11 @@ static int cbc_encrypt(struct skcipher_request *req)
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
/* fall back to the non-bitsliced NEON implementation */
- kernel_neon_begin();
- neon_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->enc, ctx->key.rounds, blocks,
- walk.iv);
- kernel_neon_end();
+ scoped_ksimd()
+ neon_aes_cbc_encrypt(walk.dst.virt.addr,
+ walk.src.virt.addr,
+ ctx->enc, ctx->key.rounds, blocks,
+ walk.iv);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
@@ -193,11 +190,10 @@ static int cbc_decrypt(struct skcipher_request *req)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
- kernel_neon_begin();
- aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
- ctx->key.rk, ctx->key.rounds, blocks,
- walk.iv);
- kernel_neon_end();
+ scoped_ksimd()
+ aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ ctx->key.rk, ctx->key.rounds, blocks,
+ walk.iv);
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
}
@@ -220,30 +216,32 @@ static int ctr_encrypt(struct skcipher_request *req)
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
- kernel_neon_begin();
- if (blocks >= 8) {
- aesbs_ctr_encrypt(dst, src, ctx->key.rk, ctx->key.rounds,
- blocks, walk.iv);
- dst += blocks * AES_BLOCK_SIZE;
- src += blocks * AES_BLOCK_SIZE;
- }
- if (nbytes && walk.nbytes == walk.total) {
- u8 buf[AES_BLOCK_SIZE];
- u8 *d = dst;
-
- if (unlikely(nbytes < AES_BLOCK_SIZE))
- src = dst = memcpy(buf + sizeof(buf) - nbytes,
- src, nbytes);
-
- neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
- nbytes, walk.iv);
+ scoped_ksimd() {
+ if (blocks >= 8) {
+ aesbs_ctr_encrypt(dst, src, ctx->key.rk,
+ ctx->key.rounds, blocks,
+ walk.iv);
+ dst += blocks * AES_BLOCK_SIZE;
+ src += blocks * AES_BLOCK_SIZE;
+ }
+ if (nbytes && walk.nbytes == walk.total) {
+ u8 buf[AES_BLOCK_SIZE];
+ u8 *d = dst;
+
+ if (unlikely(nbytes < AES_BLOCK_SIZE))
+ src = dst = memcpy(buf + sizeof(buf) -
+ nbytes, src, nbytes);
+
+ neon_aes_ctr_encrypt(dst, src, ctx->enc,
+ ctx->key.rounds, nbytes,
+ walk.iv);
- if (unlikely(nbytes < AES_BLOCK_SIZE))
- memcpy(d, dst, nbytes);
+ if (unlikely(nbytes < AES_BLOCK_SIZE))
+ memcpy(d, dst, nbytes);
- nbytes = 0;
+ nbytes = 0;
+ }
}
- kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
return err;
@@ -320,33 +318,33 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
in = walk.src.virt.addr;
nbytes = walk.nbytes;
- kernel_neon_begin();
- if (blocks >= 8) {
- if (first == 1)
- neon_aes_ecb_encrypt(walk.iv, walk.iv,
- ctx->twkey,
- ctx->key.rounds, 1);
- first = 2;
-
- fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
- walk.iv);
-
- out += blocks * AES_BLOCK_SIZE;
- in += blocks * AES_BLOCK_SIZE;
- nbytes -= blocks * AES_BLOCK_SIZE;
+ scoped_ksimd() {
+ if (blocks >= 8) {
+ if (first == 1)
+ neon_aes_ecb_encrypt(walk.iv, walk.iv,
+ ctx->twkey,
+ ctx->key.rounds, 1);
+ first = 2;
+
+ fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
+ walk.iv);
+
+ out += blocks * AES_BLOCK_SIZE;
+ in += blocks * AES_BLOCK_SIZE;
+ nbytes -= blocks * AES_BLOCK_SIZE;
+ }
+ if (walk.nbytes == walk.total && nbytes > 0) {
+ if (encrypt)
+ neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
+ ctx->key.rounds, nbytes,
+ ctx->twkey, walk.iv, first);
+ else
+ neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
+ ctx->key.rounds, nbytes,
+ ctx->twkey, walk.iv, first);
+ nbytes = first = 0;
+ }
}
- if (walk.nbytes == walk.total && nbytes > 0) {
- if (encrypt)
- neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
- ctx->key.rounds, nbytes,
- ctx->twkey, walk.iv, first);
- else
- neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
- ctx->key.rounds, nbytes,
- ctx->twkey, walk.iv, first);
- nbytes = first = 0;
- }
- kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
@@ -369,14 +367,16 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
in = walk.src.virt.addr;
nbytes = walk.nbytes;
- kernel_neon_begin();
- if (encrypt)
- neon_aes_xts_encrypt(out, in, ctx->cts.key_enc, ctx->key.rounds,
- nbytes, ctx->twkey, walk.iv, first);
- else
- neon_aes_xts_decrypt(out, in, ctx->cts.key_dec, ctx->key.rounds,
- nbytes, ctx->twkey, walk.iv, first);
- kernel_neon_end();
+ scoped_ksimd() {
+ if (encrypt)
+ neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
+ ctx->key.rounds, nbytes, ctx->twkey,
+ walk.iv, first);
+ else
+ neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
+ ctx->key.rounds, nbytes, ctx->twkey,
+ walk.iv, first);
+ }
return skcipher_walk_done(&walk, 0);
}
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 13/20] crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (11 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 12/20] crypto/arm64: aes-blk " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 14/20] crypto/arm64: nhpoly1305 " Ard Biesheuvel
` (7 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/ghash-ce-glue.c | 27 ++++++++++----------
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index 4995b6e22335..7951557a285a 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -5,7 +5,6 @@
* Copyright (C) 2014 - 2018 Linaro Ltd. <ard.biesheuvel@linaro.org>
*/
-#include <asm/neon.h>
#include <crypto/aes.h>
#include <crypto/b128ops.h>
#include <crypto/gcm.h>
@@ -22,6 +21,8 @@
#include <linux/string.h>
#include <linux/unaligned.h>
+#include <asm/simd.h>
+
MODULE_DESCRIPTION("GHASH and AES-GCM using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
@@ -74,9 +75,8 @@ void ghash_do_simd_update(int blocks, u64 dg[], const char *src,
u64 const h[][2],
const char *head))
{
- kernel_neon_begin();
- simd_update(blocks, dg, src, key->h, head);
- kernel_neon_end();
+ scoped_ksimd()
+ simd_update(blocks, dg, src, key->h, head);
}
/* avoid hogging the CPU for too long */
@@ -329,11 +329,10 @@ static int gcm_encrypt(struct aead_request *req, char *iv, int assoclen)
tag = NULL;
}
- kernel_neon_begin();
- pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
- dg, iv, ctx->aes_key.key_enc, nrounds,
- tag);
- kernel_neon_end();
+ scoped_ksimd()
+ pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
+ dg, iv, ctx->aes_key.key_enc, nrounds,
+ tag);
if (unlikely(!nbytes))
break;
@@ -399,11 +398,11 @@ static int gcm_decrypt(struct aead_request *req, char *iv, int assoclen)
tag = NULL;
}
- kernel_neon_begin();
- ret = pmull_gcm_decrypt(nbytes, dst, src, ctx->ghash_key.h,
- dg, iv, ctx->aes_key.key_enc,
- nrounds, tag, otag, authsize);
- kernel_neon_end();
+ scoped_ksimd()
+ ret = pmull_gcm_decrypt(nbytes, dst, src,
+ ctx->ghash_key.h,
+ dg, iv, ctx->aes_key.key_enc,
+ nrounds, tag, otag, authsize);
if (unlikely(!nbytes))
break;
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 14/20] crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (12 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 13/20] crypto/arm64: aes-gcm " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 15/20] crypto/arm64: polyval " Ard Biesheuvel
` (6 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/nhpoly1305-neon-glue.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/crypto/nhpoly1305-neon-glue.c b/arch/arm64/crypto/nhpoly1305-neon-glue.c
index e4a0b463f080..013de6ac569a 100644
--- a/arch/arm64/crypto/nhpoly1305-neon-glue.c
+++ b/arch/arm64/crypto/nhpoly1305-neon-glue.c
@@ -25,9 +25,8 @@ static int nhpoly1305_neon_update(struct shash_desc *desc,
do {
unsigned int n = min_t(unsigned int, srclen, SZ_4K);
- kernel_neon_begin();
- crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
- kernel_neon_end();
+ scoped_ksimd()
+ crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
src += n;
srclen -= n;
} while (srclen);
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 15/20] crypto/arm64: polyval - Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (13 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 14/20] crypto/arm64: nhpoly1305 " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 16/20] crypto/arm64: sha3 " Ard Biesheuvel
` (5 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/polyval-ce-glue.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/crypto/polyval-ce-glue.c b/arch/arm64/crypto/polyval-ce-glue.c
index c4e653688ea0..51eefbe97885 100644
--- a/arch/arm64/crypto/polyval-ce-glue.c
+++ b/arch/arm64/crypto/polyval-ce-glue.c
@@ -15,7 +15,7 @@
* ARMv8 Crypto Extensions instructions to implement the finite field operations.
*/
-#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/hash.h>
#include <crypto/polyval.h>
#include <crypto/utils.h>
@@ -45,16 +45,14 @@ asmlinkage void pmull_polyval_mul(u8 *op1, const u8 *op2);
static void internal_polyval_update(const struct polyval_tfm_ctx *keys,
const u8 *in, size_t nblocks, u8 *accumulator)
{
- kernel_neon_begin();
- pmull_polyval_update(keys, in, nblocks, accumulator);
- kernel_neon_end();
+ scoped_ksimd()
+ pmull_polyval_update(keys, in, nblocks, accumulator);
}
static void internal_polyval_mul(u8 *op1, const u8 *op2)
{
- kernel_neon_begin();
- pmull_polyval_mul(op1, op2);
- kernel_neon_end();
+ scoped_ksimd()
+ pmull_polyval_mul(op1, op2);
}
static int polyval_arm64_setkey(struct crypto_shash *tfm,
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 16/20] crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (14 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 15/20] crypto/arm64: polyval " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 17/20] crypto/arm64: sm3 " Ard Biesheuvel
` (4 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/sha3-ce-glue.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/crypto/sha3-ce-glue.c b/arch/arm64/crypto/sha3-ce-glue.c
index b4f1001046c9..22732760edd3 100644
--- a/arch/arm64/crypto/sha3-ce-glue.c
+++ b/arch/arm64/crypto/sha3-ce-glue.c
@@ -46,9 +46,8 @@ static int sha3_update(struct shash_desc *desc, const u8 *data,
do {
int rem;
- kernel_neon_begin();
- rem = sha3_ce_transform(sctx->st, data, blocks, ds);
- kernel_neon_end();
+ scoped_ksimd()
+ rem = sha3_ce_transform(sctx->st, data, blocks, ds);
data += (blocks - rem) * bs;
blocks = rem;
} while (blocks);
@@ -73,9 +72,8 @@ static int sha3_finup(struct shash_desc *desc, const u8 *src, unsigned int len,
memset(block + len, 0, bs - len);
block[bs - 1] |= 0x80;
- kernel_neon_begin();
- sha3_ce_transform(sctx->st, block, 1, ds);
- kernel_neon_end();
+ scoped_ksimd()
+ sha3_ce_transform(sctx->st, block, 1, ds);
memzero_explicit(block , sizeof(block));
for (i = 0; i < ds / 8; i++)
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 17/20] crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (15 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 16/20] crypto/arm64: sha3 " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 18/20] crypto/arm64: sm4 " Ard Biesheuvel
` (3 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/sm3-ce-glue.c | 15 ++++++++-------
arch/arm64/crypto/sm3-neon-glue.c | 16 ++++++----------
2 files changed, 14 insertions(+), 17 deletions(-)
diff --git a/arch/arm64/crypto/sm3-ce-glue.c b/arch/arm64/crypto/sm3-ce-glue.c
index eac6f5fa0abe..24c1fcfae072 100644
--- a/arch/arm64/crypto/sm3-ce-glue.c
+++ b/arch/arm64/crypto/sm3-ce-glue.c
@@ -5,7 +5,6 @@
* Copyright (C) 2018 Linaro Ltd <ard.biesheuvel@linaro.org>
*/
-#include <asm/neon.h>
#include <crypto/internal/hash.h>
#include <crypto/sm3.h>
#include <crypto/sm3_base.h>
@@ -13,6 +12,8 @@
#include <linux/kernel.h>
#include <linux/module.h>
+#include <asm/simd.h>
+
MODULE_DESCRIPTION("SM3 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
@@ -25,18 +26,18 @@ static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
{
int remain;
- kernel_neon_begin();
- remain = sm3_base_do_update_blocks(desc, data, len, sm3_ce_transform);
- kernel_neon_end();
+ scoped_ksimd() {
+ remain = sm3_base_do_update_blocks(desc, data, len, sm3_ce_transform);
+ }
return remain;
}
static int sm3_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
- kernel_neon_begin();
- sm3_base_do_finup(desc, data, len, sm3_ce_transform);
- kernel_neon_end();
+ scoped_ksimd() {
+ sm3_base_do_finup(desc, data, len, sm3_ce_transform);
+ }
return sm3_base_finish(desc, out);
}
diff --git a/arch/arm64/crypto/sm3-neon-glue.c b/arch/arm64/crypto/sm3-neon-glue.c
index 6c4611a503a3..15f30cc24f32 100644
--- a/arch/arm64/crypto/sm3-neon-glue.c
+++ b/arch/arm64/crypto/sm3-neon-glue.c
@@ -5,7 +5,7 @@
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
-#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/hash.h>
#include <crypto/sm3.h>
#include <crypto/sm3_base.h>
@@ -20,20 +20,16 @@ asmlinkage void sm3_neon_transform(struct sm3_state *sst, u8 const *src,
static int sm3_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
- int remain;
-
- kernel_neon_begin();
- remain = sm3_base_do_update_blocks(desc, data, len, sm3_neon_transform);
- kernel_neon_end();
- return remain;
+ scoped_ksimd()
+ return sm3_base_do_update_blocks(desc, data, len,
+ sm3_neon_transform);
}
static int sm3_neon_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
- kernel_neon_begin();
- sm3_base_do_finup(desc, data, len, sm3_neon_transform);
- kernel_neon_end();
+ scoped_ksimd()
+ sm3_base_do_finup(desc, data, len, sm3_neon_transform);
return sm3_base_finish(desc, out);
}
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 18/20] crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (16 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 17/20] crypto/arm64: sm3 " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 19/20] arm64/xorblocks: " Ard Biesheuvel
` (2 subsequent siblings)
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/crypto/sm4-ce-ccm-glue.c | 49 +++--
arch/arm64/crypto/sm4-ce-cipher-glue.c | 10 +-
arch/arm64/crypto/sm4-ce-gcm-glue.c | 61 +++---
arch/arm64/crypto/sm4-ce-glue.c | 214 +++++++++-----------
arch/arm64/crypto/sm4-neon-glue.c | 25 +--
5 files changed, 158 insertions(+), 201 deletions(-)
diff --git a/arch/arm64/crypto/sm4-ce-ccm-glue.c b/arch/arm64/crypto/sm4-ce-ccm-glue.c
index f9771ab2a05f..390facf909a0 100644
--- a/arch/arm64/crypto/sm4-ce-ccm-glue.c
+++ b/arch/arm64/crypto/sm4-ce-ccm-glue.c
@@ -11,7 +11,7 @@
#include <linux/crypto.h>
#include <linux/kernel.h>
#include <linux/cpufeature.h>
-#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
@@ -35,10 +35,9 @@ static int ccm_setkey(struct crypto_aead *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
- kernel_neon_begin();
- sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
- crypto_sm4_fk, crypto_sm4_ck);
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
+ crypto_sm4_fk, crypto_sm4_ck);
return 0;
}
@@ -167,35 +166,33 @@ static int ccm_crypt(struct aead_request *req, struct skcipher_walk *walk,
memcpy(ctr0, walk->iv, SM4_BLOCK_SIZE);
crypto_inc(walk->iv, SM4_BLOCK_SIZE);
- kernel_neon_begin();
+ scoped_ksimd() {
+ if (req->assoclen)
+ ccm_calculate_auth_mac(req, mac);
- if (req->assoclen)
- ccm_calculate_auth_mac(req, mac);
-
- while (walk->nbytes && walk->nbytes != walk->total) {
- unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
+ while (walk->nbytes && walk->nbytes != walk->total) {
+ unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
- sm4_ce_ccm_crypt(rkey_enc, walk->dst.virt.addr,
- walk->src.virt.addr, walk->iv,
- walk->nbytes - tail, mac);
+ sm4_ce_ccm_crypt(rkey_enc, walk->dst.virt.addr,
+ walk->src.virt.addr, walk->iv,
+ walk->nbytes - tail, mac);
- err = skcipher_walk_done(walk, tail);
- }
+ err = skcipher_walk_done(walk, tail);
+ }
- if (walk->nbytes) {
- sm4_ce_ccm_crypt(rkey_enc, walk->dst.virt.addr,
- walk->src.virt.addr, walk->iv,
- walk->nbytes, mac);
+ if (walk->nbytes) {
+ sm4_ce_ccm_crypt(rkey_enc, walk->dst.virt.addr,
+ walk->src.virt.addr, walk->iv,
+ walk->nbytes, mac);
- sm4_ce_ccm_final(rkey_enc, ctr0, mac);
+ sm4_ce_ccm_final(rkey_enc, ctr0, mac);
- err = skcipher_walk_done(walk, 0);
- } else {
- sm4_ce_ccm_final(rkey_enc, ctr0, mac);
+ err = skcipher_walk_done(walk, 0);
+ } else {
+ sm4_ce_ccm_final(rkey_enc, ctr0, mac);
+ }
}
- kernel_neon_end();
-
return err;
}
diff --git a/arch/arm64/crypto/sm4-ce-cipher-glue.c b/arch/arm64/crypto/sm4-ce-cipher-glue.c
index c31d76fb5a17..bceec833ef4e 100644
--- a/arch/arm64/crypto/sm4-ce-cipher-glue.c
+++ b/arch/arm64/crypto/sm4-ce-cipher-glue.c
@@ -32,9 +32,8 @@ static void sm4_ce_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
if (!crypto_simd_usable()) {
sm4_crypt_block(ctx->rkey_enc, out, in);
} else {
- kernel_neon_begin();
- sm4_ce_do_crypt(ctx->rkey_enc, out, in);
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_ce_do_crypt(ctx->rkey_enc, out, in);
}
}
@@ -45,9 +44,8 @@ static void sm4_ce_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
if (!crypto_simd_usable()) {
sm4_crypt_block(ctx->rkey_dec, out, in);
} else {
- kernel_neon_begin();
- sm4_ce_do_crypt(ctx->rkey_dec, out, in);
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_ce_do_crypt(ctx->rkey_dec, out, in);
}
}
diff --git a/arch/arm64/crypto/sm4-ce-gcm-glue.c b/arch/arm64/crypto/sm4-ce-gcm-glue.c
index 170cd0151385..32a6ab669281 100644
--- a/arch/arm64/crypto/sm4-ce-gcm-glue.c
+++ b/arch/arm64/crypto/sm4-ce-gcm-glue.c
@@ -11,7 +11,7 @@
#include <linux/crypto.h>
#include <linux/kernel.h>
#include <linux/cpufeature.h>
-#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/b128ops.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/aead.h>
@@ -48,13 +48,11 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
- kernel_neon_begin();
-
- sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
- crypto_sm4_fk, crypto_sm4_ck);
- sm4_ce_pmull_ghash_setup(ctx->key.rkey_enc, ctx->ghash_table);
-
- kernel_neon_end();
+ scoped_ksimd() {
+ sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
+ crypto_sm4_fk, crypto_sm4_ck);
+ sm4_ce_pmull_ghash_setup(ctx->key.rkey_enc, ctx->ghash_table);
+ }
return 0;
}
@@ -149,40 +147,35 @@ static int gcm_crypt(struct aead_request *req, struct skcipher_walk *walk,
memcpy(iv, req->iv, GCM_IV_SIZE);
put_unaligned_be32(2, iv + GCM_IV_SIZE);
- kernel_neon_begin();
+ scoped_ksimd() {
+ if (req->assoclen)
+ gcm_calculate_auth_mac(req, ghash);
- if (req->assoclen)
- gcm_calculate_auth_mac(req, ghash);
+ while (walk->nbytes) {
+ unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
+ const u8 *src = walk->src.virt.addr;
+ u8 *dst = walk->dst.virt.addr;
- while (walk->nbytes) {
- unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
- const u8 *src = walk->src.virt.addr;
- u8 *dst = walk->dst.virt.addr;
+ if (walk->nbytes == walk->total) {
+ sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, dst, src, iv,
+ walk->nbytes, ghash,
+ ctx->ghash_table,
+ (const u8 *)&lengths);
+
+ return skcipher_walk_done(walk, 0);
+ }
- if (walk->nbytes == walk->total) {
sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, dst, src, iv,
- walk->nbytes, ghash,
- ctx->ghash_table,
- (const u8 *)&lengths);
+ walk->nbytes - tail, ghash,
+ ctx->ghash_table, NULL);
- err = skcipher_walk_done(walk, 0);
- goto out;
+ err = skcipher_walk_done(walk, tail);
}
- sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, dst, src, iv,
- walk->nbytes - tail, ghash,
- ctx->ghash_table, NULL);
-
- err = skcipher_walk_done(walk, tail);
+ sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, NULL, NULL, iv,
+ walk->nbytes, ghash, ctx->ghash_table,
+ (const u8 *)&lengths);
}
-
- sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, NULL, NULL, iv,
- walk->nbytes, ghash, ctx->ghash_table,
- (const u8 *)&lengths);
-
-out:
- kernel_neon_end();
-
return err;
}
diff --git a/arch/arm64/crypto/sm4-ce-glue.c b/arch/arm64/crypto/sm4-ce-glue.c
index 7a60e7b559dc..57ae3406257c 100644
--- a/arch/arm64/crypto/sm4-ce-glue.c
+++ b/arch/arm64/crypto/sm4-ce-glue.c
@@ -8,7 +8,7 @@
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
-#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/b128ops.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/skcipher.h>
@@ -74,10 +74,9 @@ static int sm4_setkey(struct crypto_skcipher *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
- kernel_neon_begin();
- sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
- crypto_sm4_fk, crypto_sm4_ck);
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
+ crypto_sm4_fk, crypto_sm4_ck);
return 0;
}
@@ -94,12 +93,12 @@ static int sm4_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
if (ret)
return ret;
- kernel_neon_begin();
- sm4_ce_expand_key(key, ctx->key1.rkey_enc,
- ctx->key1.rkey_dec, crypto_sm4_fk, crypto_sm4_ck);
- sm4_ce_expand_key(&key[SM4_KEY_SIZE], ctx->key2.rkey_enc,
- ctx->key2.rkey_dec, crypto_sm4_fk, crypto_sm4_ck);
- kernel_neon_end();
+ scoped_ksimd() {
+ sm4_ce_expand_key(key, ctx->key1.rkey_enc,
+ ctx->key1.rkey_dec, crypto_sm4_fk, crypto_sm4_ck);
+ sm4_ce_expand_key(&key[SM4_KEY_SIZE], ctx->key2.rkey_enc,
+ ctx->key2.rkey_dec, crypto_sm4_fk, crypto_sm4_ck);
+ }
return 0;
}
@@ -117,16 +116,14 @@ static int sm4_ecb_do_crypt(struct skcipher_request *req, const u32 *rkey)
u8 *dst = walk.dst.virt.addr;
unsigned int nblks;
- kernel_neon_begin();
-
- nblks = BYTES2BLKS(nbytes);
- if (nblks) {
- sm4_ce_crypt(rkey, dst, src, nblks);
- nbytes -= nblks * SM4_BLOCK_SIZE;
+ scoped_ksimd() {
+ nblks = BYTES2BLKS(nbytes);
+ if (nblks) {
+ sm4_ce_crypt(rkey, dst, src, nblks);
+ nbytes -= nblks * SM4_BLOCK_SIZE;
+ }
}
- kernel_neon_end();
-
err = skcipher_walk_done(&walk, nbytes);
}
@@ -167,16 +164,14 @@ static int sm4_cbc_crypt(struct skcipher_request *req,
nblocks = nbytes / SM4_BLOCK_SIZE;
if (nblocks) {
- kernel_neon_begin();
-
- if (encrypt)
- sm4_ce_cbc_enc(ctx->rkey_enc, dst, src,
- walk.iv, nblocks);
- else
- sm4_ce_cbc_dec(ctx->rkey_dec, dst, src,
- walk.iv, nblocks);
-
- kernel_neon_end();
+ scoped_ksimd() {
+ if (encrypt)
+ sm4_ce_cbc_enc(ctx->rkey_enc, dst, src,
+ walk.iv, nblocks);
+ else
+ sm4_ce_cbc_dec(ctx->rkey_dec, dst, src,
+ walk.iv, nblocks);
+ }
}
err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE);
@@ -249,16 +244,14 @@ static int sm4_cbc_cts_crypt(struct skcipher_request *req, bool encrypt)
if (err)
return err;
- kernel_neon_begin();
-
- if (encrypt)
- sm4_ce_cbc_cts_enc(ctx->rkey_enc, walk.dst.virt.addr,
- walk.src.virt.addr, walk.iv, walk.nbytes);
- else
- sm4_ce_cbc_cts_dec(ctx->rkey_dec, walk.dst.virt.addr,
- walk.src.virt.addr, walk.iv, walk.nbytes);
-
- kernel_neon_end();
+ scoped_ksimd() {
+ if (encrypt)
+ sm4_ce_cbc_cts_enc(ctx->rkey_enc, walk.dst.virt.addr,
+ walk.src.virt.addr, walk.iv, walk.nbytes);
+ else
+ sm4_ce_cbc_cts_dec(ctx->rkey_dec, walk.dst.virt.addr,
+ walk.src.virt.addr, walk.iv, walk.nbytes);
+ }
return skcipher_walk_done(&walk, 0);
}
@@ -288,28 +281,26 @@ static int sm4_ctr_crypt(struct skcipher_request *req)
u8 *dst = walk.dst.virt.addr;
unsigned int nblks;
- kernel_neon_begin();
-
- nblks = BYTES2BLKS(nbytes);
- if (nblks) {
- sm4_ce_ctr_enc(ctx->rkey_enc, dst, src, walk.iv, nblks);
- dst += nblks * SM4_BLOCK_SIZE;
- src += nblks * SM4_BLOCK_SIZE;
- nbytes -= nblks * SM4_BLOCK_SIZE;
- }
-
- /* tail */
- if (walk.nbytes == walk.total && nbytes > 0) {
- u8 keystream[SM4_BLOCK_SIZE];
-
- sm4_ce_crypt_block(ctx->rkey_enc, keystream, walk.iv);
- crypto_inc(walk.iv, SM4_BLOCK_SIZE);
- crypto_xor_cpy(dst, src, keystream, nbytes);
- nbytes = 0;
+ scoped_ksimd() {
+ nblks = BYTES2BLKS(nbytes);
+ if (nblks) {
+ sm4_ce_ctr_enc(ctx->rkey_enc, dst, src, walk.iv, nblks);
+ dst += nblks * SM4_BLOCK_SIZE;
+ src += nblks * SM4_BLOCK_SIZE;
+ nbytes -= nblks * SM4_BLOCK_SIZE;
+ }
+
+ /* tail */
+ if (walk.nbytes == walk.total && nbytes > 0) {
+ u8 keystream[SM4_BLOCK_SIZE];
+
+ sm4_ce_crypt_block(ctx->rkey_enc, keystream, walk.iv);
+ crypto_inc(walk.iv, SM4_BLOCK_SIZE);
+ crypto_xor_cpy(dst, src, keystream, nbytes);
+ nbytes = 0;
+ }
}
- kernel_neon_end();
-
err = skcipher_walk_done(&walk, nbytes);
}
@@ -359,18 +350,16 @@ static int sm4_xts_crypt(struct skcipher_request *req, bool encrypt)
if (nbytes < walk.total)
nbytes &= ~(SM4_BLOCK_SIZE - 1);
- kernel_neon_begin();
-
- if (encrypt)
- sm4_ce_xts_enc(ctx->key1.rkey_enc, walk.dst.virt.addr,
- walk.src.virt.addr, walk.iv, nbytes,
- rkey2_enc);
- else
- sm4_ce_xts_dec(ctx->key1.rkey_dec, walk.dst.virt.addr,
- walk.src.virt.addr, walk.iv, nbytes,
- rkey2_enc);
-
- kernel_neon_end();
+ scoped_ksimd() {
+ if (encrypt)
+ sm4_ce_xts_enc(ctx->key1.rkey_enc, walk.dst.virt.addr,
+ walk.src.virt.addr, walk.iv, nbytes,
+ rkey2_enc);
+ else
+ sm4_ce_xts_dec(ctx->key1.rkey_dec, walk.dst.virt.addr,
+ walk.src.virt.addr, walk.iv, nbytes,
+ rkey2_enc);
+ }
rkey2_enc = NULL;
@@ -395,18 +384,16 @@ static int sm4_xts_crypt(struct skcipher_request *req, bool encrypt)
if (err)
return err;
- kernel_neon_begin();
-
- if (encrypt)
- sm4_ce_xts_enc(ctx->key1.rkey_enc, walk.dst.virt.addr,
- walk.src.virt.addr, walk.iv, walk.nbytes,
- rkey2_enc);
- else
- sm4_ce_xts_dec(ctx->key1.rkey_dec, walk.dst.virt.addr,
- walk.src.virt.addr, walk.iv, walk.nbytes,
- rkey2_enc);
-
- kernel_neon_end();
+ scoped_ksimd() {
+ if (encrypt)
+ sm4_ce_xts_enc(ctx->key1.rkey_enc, walk.dst.virt.addr,
+ walk.src.virt.addr, walk.iv, walk.nbytes,
+ rkey2_enc);
+ else
+ sm4_ce_xts_dec(ctx->key1.rkey_dec, walk.dst.virt.addr,
+ walk.src.virt.addr, walk.iv, walk.nbytes,
+ rkey2_enc);
+ }
return skcipher_walk_done(&walk, 0);
}
@@ -510,11 +497,9 @@ static int sm4_cbcmac_setkey(struct crypto_shash *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
- kernel_neon_begin();
- sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
- crypto_sm4_fk, crypto_sm4_ck);
- kernel_neon_end();
-
+ scoped_ksimd()
+ sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
+ crypto_sm4_fk, crypto_sm4_ck);
return 0;
}
@@ -530,15 +515,13 @@ static int sm4_cmac_setkey(struct crypto_shash *tfm, const u8 *key,
memset(consts, 0, SM4_BLOCK_SIZE);
- kernel_neon_begin();
-
- sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
- crypto_sm4_fk, crypto_sm4_ck);
+ scoped_ksimd() {
+ sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
+ crypto_sm4_fk, crypto_sm4_ck);
- /* encrypt the zero block */
- sm4_ce_crypt_block(ctx->key.rkey_enc, (u8 *)consts, (const u8 *)consts);
-
- kernel_neon_end();
+ /* encrypt the zero block */
+ sm4_ce_crypt_block(ctx->key.rkey_enc, (u8 *)consts, (const u8 *)consts);
+ }
/* gf(2^128) multiply zero-ciphertext with u and u^2 */
a = be64_to_cpu(consts[0].a);
@@ -568,18 +551,16 @@ static int sm4_xcbc_setkey(struct crypto_shash *tfm, const u8 *key,
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
- kernel_neon_begin();
-
- sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
- crypto_sm4_fk, crypto_sm4_ck);
+ scoped_ksimd() {
+ sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
+ crypto_sm4_fk, crypto_sm4_ck);
- sm4_ce_crypt_block(ctx->key.rkey_enc, key2, ks[0]);
- sm4_ce_crypt(ctx->key.rkey_enc, ctx->consts, ks[1], 2);
+ sm4_ce_crypt_block(ctx->key.rkey_enc, key2, ks[0]);
+ sm4_ce_crypt(ctx->key.rkey_enc, ctx->consts, ks[1], 2);
- sm4_ce_expand_key(key2, ctx->key.rkey_enc, ctx->key.rkey_dec,
- crypto_sm4_fk, crypto_sm4_ck);
-
- kernel_neon_end();
+ sm4_ce_expand_key(key2, ctx->key.rkey_enc, ctx->key.rkey_dec,
+ crypto_sm4_fk, crypto_sm4_ck);
+ }
return 0;
}
@@ -600,10 +581,9 @@ static int sm4_mac_update(struct shash_desc *desc, const u8 *p,
unsigned int nblocks = len / SM4_BLOCK_SIZE;
len %= SM4_BLOCK_SIZE;
- kernel_neon_begin();
- sm4_ce_mac_update(tctx->key.rkey_enc, ctx->digest, p,
- nblocks, false, true);
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_ce_mac_update(tctx->key.rkey_enc, ctx->digest, p,
+ nblocks, false, true);
return len;
}
@@ -619,10 +599,9 @@ static int sm4_cmac_finup(struct shash_desc *desc, const u8 *src,
ctx->digest[len] ^= 0x80;
consts += SM4_BLOCK_SIZE;
}
- kernel_neon_begin();
- sm4_ce_mac_update(tctx->key.rkey_enc, ctx->digest, consts, 1,
- false, true);
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_ce_mac_update(tctx->key.rkey_enc, ctx->digest, consts, 1,
+ false, true);
memcpy(out, ctx->digest, SM4_BLOCK_SIZE);
return 0;
}
@@ -635,10 +614,9 @@ static int sm4_cbcmac_finup(struct shash_desc *desc, const u8 *src,
if (len) {
crypto_xor(ctx->digest, src, len);
- kernel_neon_begin();
- sm4_ce_crypt_block(tctx->key.rkey_enc, ctx->digest,
- ctx->digest);
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_ce_crypt_block(tctx->key.rkey_enc, ctx->digest,
+ ctx->digest);
}
memcpy(out, ctx->digest, SM4_BLOCK_SIZE);
return 0;
diff --git a/arch/arm64/crypto/sm4-neon-glue.c b/arch/arm64/crypto/sm4-neon-glue.c
index e3500aca2d18..e944c2a2efb0 100644
--- a/arch/arm64/crypto/sm4-neon-glue.c
+++ b/arch/arm64/crypto/sm4-neon-glue.c
@@ -48,11 +48,8 @@ static int sm4_ecb_do_crypt(struct skcipher_request *req, const u32 *rkey)
nblocks = nbytes / SM4_BLOCK_SIZE;
if (nblocks) {
- kernel_neon_begin();
-
- sm4_neon_crypt(rkey, dst, src, nblocks);
-
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_neon_crypt(rkey, dst, src, nblocks);
}
err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE);
@@ -126,12 +123,9 @@ static int sm4_cbc_decrypt(struct skcipher_request *req)
nblocks = nbytes / SM4_BLOCK_SIZE;
if (nblocks) {
- kernel_neon_begin();
-
- sm4_neon_cbc_dec(ctx->rkey_dec, dst, src,
- walk.iv, nblocks);
-
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_neon_cbc_dec(ctx->rkey_dec, dst, src,
+ walk.iv, nblocks);
}
err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE);
@@ -157,12 +151,9 @@ static int sm4_ctr_crypt(struct skcipher_request *req)
nblocks = nbytes / SM4_BLOCK_SIZE;
if (nblocks) {
- kernel_neon_begin();
-
- sm4_neon_ctr_crypt(ctx->rkey_enc, dst, src,
- walk.iv, nblocks);
-
- kernel_neon_end();
+ scoped_ksimd()
+ sm4_neon_ctr_crypt(ctx->rkey_enc, dst, src,
+ walk.iv, nblocks);
dst += nblocks * SM4_BLOCK_SIZE;
src += nblocks * SM4_BLOCK_SIZE;
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 19/20] arm64/xorblocks: Switch to 'ksimd' scoped guard API
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (17 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 18/20] crypto/arm64: sm4 " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-01 21:02 ` [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack Ard Biesheuvel
2025-10-03 20:28 ` [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to " Eric Biggers
20 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/xor.h | 22 ++++++++------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/include/asm/xor.h b/arch/arm64/include/asm/xor.h
index befcd8a7abc9..c38e3d017a79 100644
--- a/arch/arm64/include/asm/xor.h
+++ b/arch/arm64/include/asm/xor.h
@@ -9,7 +9,7 @@
#include <linux/hardirq.h>
#include <asm-generic/xor.h>
#include <asm/hwcap.h>
-#include <asm/neon.h>
+#include <asm/simd.h>
#ifdef CONFIG_KERNEL_MODE_NEON
@@ -19,9 +19,8 @@ static void
xor_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
- kernel_neon_begin();
- xor_block_inner_neon.do_2(bytes, p1, p2);
- kernel_neon_end();
+ scoped_ksimd()
+ xor_block_inner_neon.do_2(bytes, p1, p2);
}
static void
@@ -29,9 +28,8 @@ xor_neon_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
- kernel_neon_begin();
- xor_block_inner_neon.do_3(bytes, p1, p2, p3);
- kernel_neon_end();
+ scoped_ksimd()
+ xor_block_inner_neon.do_3(bytes, p1, p2, p3);
}
static void
@@ -40,9 +38,8 @@ xor_neon_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
- kernel_neon_begin();
- xor_block_inner_neon.do_4(bytes, p1, p2, p3, p4);
- kernel_neon_end();
+ scoped_ksimd()
+ xor_block_inner_neon.do_4(bytes, p1, p2, p3, p4);
}
static void
@@ -52,9 +49,8 @@ xor_neon_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
- kernel_neon_begin();
- xor_block_inner_neon.do_5(bytes, p1, p2, p3, p4, p5);
- kernel_neon_end();
+ scoped_ksimd()
+ xor_block_inner_neon.do_5(bytes, p1, p2, p3, p4, p5);
}
static struct xor_block_template xor_block_arm64 = {
--
2.51.0.618.g983fd99d29-goog
* [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (18 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 19/20] arm64/xorblocks: " Ard Biesheuvel
@ 2025-10-01 21:02 ` Ard Biesheuvel
2025-10-02 16:22 ` Kees Cook
2025-10-03 20:18 ` Eric Biggers
2025-10-03 20:28 ` [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to " Eric Biggers
20 siblings, 2 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-01 21:02 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-crypto, linux-kernel, herbert, linux, Ard Biesheuvel,
Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook,
Catalin Marinas, Mark Brown, Eric Biggers
From: Ard Biesheuvel <ardb@kernel.org>
Commit aefbab8e77eb16b5
("arm64: fpsimd: Preserve/restore kernel mode NEON at context switch")
added a 'kernel_fpsimd_state' field to struct thread_struct, which is
the arch-specific portion of struct task_struct, and is allocated for
each task in the system. This field is 528 bytes in size, which adds
non-trivial bloat to task_struct, and the resulting memory overhead may
impact performance on systems with many processes.
This allocation is only used if the task is scheduled out or interrupted
by a softirq while using the FP/SIMD unit in kernel mode, and so it is
possible to transparently allocate this buffer on the caller's stack
instead.
So tweak the 'ksimd' scoped guard implementation so that a stack buffer
is allocated and passed to both kernel_neon_begin() and
kernel_neon_end(), and its address is recorded in the task struct.
Passing the address to both functions, and checking the two for
consistency, ensures that callers of the updated bare begin/end API use
it in a manner that is consistent with the new context switch
semantics.
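For illustration (a sketch only: the function name and local variable
below are made up, and the real expansion goes through
DEFINE_LOCK_GUARD_1() and scoped_guard() as shown in the hunks further
down), a scoped_ksimd() block under this scheme behaves roughly like
the following open-coded sequence:

  #include <asm/neon.h>   /* kernel_neon_begin()/kernel_neon_end() */
  #include <asm/simd.h>   /* scoped_ksimd() */

  static void example_simd_user(void)   /* hypothetical caller */
  {
          struct user_fpsimd_state buf = {};  /* lives in this stack frame */

          kernel_neon_begin(&buf);  /* &buf recorded in current->thread */
          /* ... use the FP/SIMD registers here ... */
          kernel_neon_end(&buf);    /* same address passed back and checked */
  }

With the scoped guard, the compound literal in scoped_ksimd() plays the
role of 'buf', so existing users of the guard never see the buffer at
all.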
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/neon.h | 4 +--
arch/arm64/include/asm/processor.h | 2 +-
arch/arm64/include/asm/simd.h | 7 ++--
arch/arm64/kernel/fpsimd.c | 34 +++++++++++++-------
4 files changed, 31 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/include/asm/neon.h b/arch/arm64/include/asm/neon.h
index d4b1d172a79b..acebee4605b5 100644
--- a/arch/arm64/include/asm/neon.h
+++ b/arch/arm64/include/asm/neon.h
@@ -13,7 +13,7 @@
#define cpu_has_neon() system_supports_fpsimd()
-void kernel_neon_begin(void);
-void kernel_neon_end(void);
+void kernel_neon_begin(struct user_fpsimd_state *);
+void kernel_neon_end(struct user_fpsimd_state *);
#endif /* ! __ASM_NEON_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 4f8d677b73ee..93bca4d454d7 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -172,7 +172,7 @@ struct thread_struct {
unsigned long fault_code; /* ESR_EL1 value */
struct debug_info debug; /* debugging */
- struct user_fpsimd_state kernel_fpsimd_state;
+ struct user_fpsimd_state *kernel_fpsimd_state;
unsigned int kernel_fpsimd_cpu;
#ifdef CONFIG_ARM64_PTR_AUTH
struct ptrauth_keys_user keys_user;
diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
index d9f83c478736..7ddb25df5c98 100644
--- a/arch/arm64/include/asm/simd.h
+++ b/arch/arm64/include/asm/simd.h
@@ -43,8 +43,11 @@ static __must_check inline bool may_use_simd(void) {
#endif /* ! CONFIG_KERNEL_MODE_NEON */
-DEFINE_LOCK_GUARD_0(ksimd, kernel_neon_begin(), kernel_neon_end())
+DEFINE_LOCK_GUARD_1(ksimd,
+ struct user_fpsimd_state,
+ kernel_neon_begin(_T->lock),
+ kernel_neon_end(_T->lock))
-#define scoped_ksimd() scoped_guard(ksimd)
+#define scoped_ksimd() scoped_guard(ksimd, &(struct user_fpsimd_state){})
#endif
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index c37f02d7194e..ea9192a180aa 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1488,21 +1488,23 @@ static void fpsimd_load_kernel_state(struct task_struct *task)
* Elide the load if this CPU holds the most recent kernel mode
* FPSIMD context of the current task.
*/
- if (last->st == &task->thread.kernel_fpsimd_state &&
+ if (last->st == task->thread.kernel_fpsimd_state &&
task->thread.kernel_fpsimd_cpu == smp_processor_id())
return;
- fpsimd_load_state(&task->thread.kernel_fpsimd_state);
+ fpsimd_load_state(task->thread.kernel_fpsimd_state);
}
static void fpsimd_save_kernel_state(struct task_struct *task)
{
struct cpu_fp_state cpu_fp_state = {
- .st = &task->thread.kernel_fpsimd_state,
+ .st = task->thread.kernel_fpsimd_state,
.to_save = FP_STATE_FPSIMD,
};
- fpsimd_save_state(&task->thread.kernel_fpsimd_state);
+ BUG_ON(!cpu_fp_state.st);
+
+ fpsimd_save_state(task->thread.kernel_fpsimd_state);
fpsimd_bind_state_to_cpu(&cpu_fp_state);
task->thread.kernel_fpsimd_cpu = smp_processor_id();
@@ -1773,6 +1775,7 @@ void fpsimd_update_current_state(struct user_fpsimd_state const *state)
void fpsimd_flush_task_state(struct task_struct *t)
{
t->thread.fpsimd_cpu = NR_CPUS;
+ t->thread.kernel_fpsimd_state = NULL;
/*
* If we don't support fpsimd, bail out after we have
* reset the fpsimd_cpu for this task and clear the
@@ -1833,7 +1836,7 @@ void fpsimd_save_and_flush_cpu_state(void)
* The caller may freely use the FPSIMD registers until kernel_neon_end() is
* called.
*/
-void kernel_neon_begin(void)
+void kernel_neon_begin(struct user_fpsimd_state *s)
{
if (WARN_ON(!system_supports_fpsimd()))
return;
@@ -1866,8 +1869,16 @@ void kernel_neon_begin(void)
* mode in task context. So in this case, setting the flag here
* is always appropriate.
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq())
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq()) {
+ /*
+ * Record the caller provided buffer as the kernel mode
+ * FP/SIMD buffer for this task, so that the state can
+ * be preserved and restored on a context switch.
+ */
+ if (cmpxchg(&current->thread.kernel_fpsimd_state, NULL, s))
+ BUG();
set_thread_flag(TIF_KERNEL_FPSTATE);
+ }
}
/* Invalidate any task state remaining in the fpsimd regs: */
@@ -1886,7 +1897,7 @@ EXPORT_SYMBOL_GPL(kernel_neon_begin);
* The caller must not use the FPSIMD registers after this function is called,
* unless kernel_neon_begin() is called again in the meantime.
*/
-void kernel_neon_end(void)
+void kernel_neon_end(struct user_fpsimd_state *s)
{
if (!system_supports_fpsimd())
return;
@@ -1899,8 +1910,9 @@ void kernel_neon_end(void)
if (!IS_ENABLED(CONFIG_PREEMPT_RT) && in_serving_softirq() &&
test_thread_flag(TIF_KERNEL_FPSTATE))
fpsimd_load_kernel_state(current);
- else
- clear_thread_flag(TIF_KERNEL_FPSTATE);
+ else if (test_and_clear_thread_flag(TIF_KERNEL_FPSTATE))
+ if (cmpxchg(&current->thread.kernel_fpsimd_state, s, NULL) != s)
+ BUG();
}
EXPORT_SYMBOL_GPL(kernel_neon_end);
@@ -1936,7 +1948,7 @@ void __efi_fpsimd_begin(void)
WARN_ON(preemptible());
if (may_use_simd()) {
- kernel_neon_begin();
+ kernel_neon_begin(&efi_fpsimd_state);
} else {
/*
* If !efi_sve_state, SVE can't be in use yet and doesn't need
@@ -1985,7 +1997,7 @@ void __efi_fpsimd_end(void)
return;
if (!efi_fpsimd_state_used) {
- kernel_neon_end();
+ kernel_neon_end(&efi_fpsimd_state);
} else {
if (system_supports_sve() && efi_sve_state_used) {
bool ffr = true;
--
2.51.0.618.g983fd99d29-goog
* Re: [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
2025-10-01 21:02 ` [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack Ard Biesheuvel
@ 2025-10-02 16:22 ` Kees Cook
2025-10-02 16:51 ` Ard Biesheuvel
2025-10-03 20:18 ` Eric Biggers
1 sibling, 1 reply; 33+ messages in thread
From: Kees Cook @ 2025-10-02 16:22 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, linux-kernel, herbert, linux,
Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland,
Catalin Marinas, Mark Brown, Eric Biggers
On Wed, Oct 01, 2025 at 11:02:22PM +0200, Ard Biesheuvel wrote:
> [...]
> diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
> index d9f83c478736..7ddb25df5c98 100644
> --- a/arch/arm64/include/asm/simd.h
> +++ b/arch/arm64/include/asm/simd.h
> @@ -43,8 +43,11 @@ static __must_check inline bool may_use_simd(void) {
>
> #endif /* ! CONFIG_KERNEL_MODE_NEON */
>
> -DEFINE_LOCK_GUARD_0(ksimd, kernel_neon_begin(), kernel_neon_end())
> +DEFINE_LOCK_GUARD_1(ksimd,
> + struct user_fpsimd_state,
> + kernel_neon_begin(_T->lock),
> + kernel_neon_end(_T->lock))
>
> -#define scoped_ksimd() scoped_guard(ksimd)
> +#define scoped_ksimd() scoped_guard(ksimd, &(struct user_fpsimd_state){})
I love it!
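(For anyone following along: a converted call site ends up looking
roughly like the sketch below; the helper names are hypothetical, the
real conversions are in the other patches of the series.)

static void xor_blocks_neon(unsigned long *dst, const unsigned long *src,
                            size_t count)
{
        /*
         * The guard hands an anonymous on-stack user_fpsimd_state to
         * kernel_neon_begin() on entry and passes the same pointer to
         * kernel_neon_end() when the scope is left.
         */
        scoped_ksimd() {
                /* FP/SIMD registers may be used freely inside this scope */
                xor_blocks_neon_inner(dst, src, count); /* hypothetical */
        }
}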
> [...]
> -void kernel_neon_end(void)
> +void kernel_neon_end(struct user_fpsimd_state *s)
> {
> if (!system_supports_fpsimd())
> return;
> @@ -1899,8 +1910,9 @@ void kernel_neon_end(void)
> if (!IS_ENABLED(CONFIG_PREEMPT_RT) && in_serving_softirq() &&
> test_thread_flag(TIF_KERNEL_FPSTATE))
> fpsimd_load_kernel_state(current);
> - else
> - clear_thread_flag(TIF_KERNEL_FPSTATE);
> + else if (test_and_clear_thread_flag(TIF_KERNEL_FPSTATE))
> + if (cmpxchg(&current->thread.kernel_fpsimd_state, s, NULL) != s)
> + BUG();
I always question BUG() uses -- is there a recoverable way to deal with
a mismatch here? I assume not and that this is the best we can do, but I
thought I'd just explicitly ask. :)
-Kees
--
Kees Cook
^ permalink raw reply [flat|nested] 33+ messages in thread

* Re: [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
2025-10-02 16:22 ` Kees Cook
@ 2025-10-02 16:51 ` Ard Biesheuvel
0 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-02 16:51 UTC (permalink / raw)
To: Kees Cook
Cc: Ard Biesheuvel, linux-arm-kernel, linux-crypto, linux-kernel,
herbert, linux, Marc Zyngier, Will Deacon, Mark Rutland,
Catalin Marinas, Mark Brown, Eric Biggers
On Thu, 2 Oct 2025 at 18:22, Kees Cook <kees@kernel.org> wrote:
>
> On Wed, Oct 01, 2025 at 11:02:22PM +0200, Ard Biesheuvel wrote:
> > [...]
> > diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
> > index d9f83c478736..7ddb25df5c98 100644
> > --- a/arch/arm64/include/asm/simd.h
> > +++ b/arch/arm64/include/asm/simd.h
> > @@ -43,8 +43,11 @@ static __must_check inline bool may_use_simd(void) {
> >
> > #endif /* ! CONFIG_KERNEL_MODE_NEON */
> >
> > -DEFINE_LOCK_GUARD_0(ksimd, kernel_neon_begin(), kernel_neon_end())
> > +DEFINE_LOCK_GUARD_1(ksimd,
> > + struct user_fpsimd_state,
> > + kernel_neon_begin(_T->lock),
> > + kernel_neon_end(_T->lock))
> >
> > -#define scoped_ksimd() scoped_guard(ksimd)
> > +#define scoped_ksimd() scoped_guard(ksimd, &(struct user_fpsimd_state){})
>
> I love it!
>
> > [...]
> > -void kernel_neon_end(void)
> > +void kernel_neon_end(struct user_fpsimd_state *s)
> > {
> > if (!system_supports_fpsimd())
> > return;
> > @@ -1899,8 +1910,9 @@ void kernel_neon_end(void)
> > if (!IS_ENABLED(CONFIG_PREEMPT_RT) && in_serving_softirq() &&
> > test_thread_flag(TIF_KERNEL_FPSTATE))
> > fpsimd_load_kernel_state(current);
> > - else
> > - clear_thread_flag(TIF_KERNEL_FPSTATE);
> > + else if (test_and_clear_thread_flag(TIF_KERNEL_FPSTATE))
> > + if (cmpxchg(&current->thread.kernel_fpsimd_state, s, NULL) != s)
> > + BUG();
>
> I always question BUG() uses -- is there a recoverable way to deal with
> a mismatch here? I assume not and that this is the best we can do, but I
> thought I'd just explicitly ask. :)
>
So this fires when kernel_neon_end() is passed a different buffer than
the preceding kernel_neon_begin(), and the only reason for passing the
buffer twice is this particular check.

So perhaps this one can be demoted to a WARN() instead. The preceding
one fires if kernel_neon_begin() is called while the recorded buffer
pointer is still set from a previous NEON block. Perhaps the same
applies there too: the check reports something that has already
happened, and we can actually proceed as usual.

TL;DR: yes, let's WARN() instead in both cases.
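Something like the below, i.e., only the severity of the checks
changes (untested sketch, using WARN_ON_ONCE() for illustration):

        /* in kernel_neon_begin() */
        if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq()) {
                WARN_ON_ONCE(cmpxchg(&current->thread.kernel_fpsimd_state,
                                     NULL, s));
                set_thread_flag(TIF_KERNEL_FPSTATE);
        }

        /* in kernel_neon_end() */
        if (!IS_ENABLED(CONFIG_PREEMPT_RT) && in_serving_softirq() &&
            test_thread_flag(TIF_KERNEL_FPSTATE))
                fpsimd_load_kernel_state(current);
        else if (test_and_clear_thread_flag(TIF_KERNEL_FPSTATE))
                WARN_ON_ONCE(cmpxchg(&current->thread.kernel_fpsimd_state,
                                     s, NULL) != s);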
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
2025-10-01 21:02 ` [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack Ard Biesheuvel
2025-10-02 16:22 ` Kees Cook
@ 2025-10-03 20:18 ` Eric Biggers
2025-10-05 14:54 ` Ard Biesheuvel
1 sibling, 1 reply; 33+ messages in thread
From: Eric Biggers @ 2025-10-03 20:18 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, linux-kernel, herbert, linux,
Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland,
Kees Cook, Catalin Marinas, Mark Brown
On Wed, Oct 01, 2025 at 11:02:22PM +0200, Ard Biesheuvel wrote:
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 4f8d677b73ee..93bca4d454d7 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -172,7 +172,7 @@ struct thread_struct {
> unsigned long fault_code; /* ESR_EL1 value */
> struct debug_info debug; /* debugging */
>
> - struct user_fpsimd_state kernel_fpsimd_state;
> + struct user_fpsimd_state *kernel_fpsimd_state;
Is TIF_KERNEL_FPSTATE redundant with kernel_fpsimd_state != NULL? If
so, do we need both?
> +void kernel_neon_begin(struct user_fpsimd_state *s)
> {
> if (WARN_ON(!system_supports_fpsimd()))
> return;
> @@ -1866,8 +1869,16 @@ void kernel_neon_begin(void)
> * mode in task context. So in this case, setting the flag here
> * is always appropriate.
> */
> - if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq())
> + if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq()) {
> + /*
> + * Record the caller provided buffer as the kernel mode
> + * FP/SIMD buffer for this task, so that the state can
> + * be preserved and restored on a context switch.
> + */
> + if (cmpxchg(&current->thread.kernel_fpsimd_state, NULL, s))
> + BUG();
Does this really need to be a cmpxchg, vs. a regular load and store?
It's just operating on current.
> + else if (test_and_clear_thread_flag(TIF_KERNEL_FPSTATE))
> + if (cmpxchg(&current->thread.kernel_fpsimd_state, s, NULL) != s)
> + BUG();
Likewise.
- Eric
^ permalink raw reply [flat|nested] 33+ messages in thread

* Re: [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
2025-10-03 20:18 ` Eric Biggers
@ 2025-10-05 14:54 ` Ard Biesheuvel
0 siblings, 0 replies; 33+ messages in thread
From: Ard Biesheuvel @ 2025-10-05 14:54 UTC (permalink / raw)
To: Eric Biggers
Cc: Ard Biesheuvel, linux-arm-kernel, linux-crypto, linux-kernel,
herbert, linux, Marc Zyngier, Will Deacon, Mark Rutland,
Kees Cook, Catalin Marinas, Mark Brown
On Fri, 3 Oct 2025 at 22:18, Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Wed, Oct 01, 2025 at 11:02:22PM +0200, Ard Biesheuvel wrote:
> > diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> > index 4f8d677b73ee..93bca4d454d7 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -172,7 +172,7 @@ struct thread_struct {
> > unsigned long fault_code; /* ESR_EL1 value */
> > struct debug_info debug; /* debugging */
> >
> > - struct user_fpsimd_state kernel_fpsimd_state;
> > + struct user_fpsimd_state *kernel_fpsimd_state;
>
> Is TIF_KERNEL_FPSTATE redundant with kernel_fpsimd_state != NULL?
Not entirely.

The generic kernel_fpu_begin/end API that the amdgpu driver uses
cannot straightforwardly use a stack buffer, given how the API is
implemented. Since that code already disables preemption when using
the FPU, and should not assume it can use kernel mode FP in softirq
context, I think we'll end up doing something like the following for
arm64:
void kernel_fpu_begin(void)
{
        preempt_disable();
        local_bh_disable();
        kernel_neon_begin(NULL);
}

etc., as it never actually needs this buffer to begin with.
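The matching kernel_fpu_end() would then presumably just undo that in
reverse order (sketch):

void kernel_fpu_end(void)
{
        kernel_neon_end(NULL);
        local_bh_enable();
        preempt_enable();
}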
Technically, setting TIF_KERNEL_FPSTATE is not necessary when no
scheduling or softirq interruptions may be expected, but I still think
it is better to keep these separate.
> > +void kernel_neon_begin(struct user_fpsimd_state *s)
> > {
> > if (WARN_ON(!system_supports_fpsimd()))
> > return;
> > @@ -1866,8 +1869,16 @@ void kernel_neon_begin(void)
> > * mode in task context. So in this case, setting the flag here
> > * is always appropriate.
> > */
> > - if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq())
> > + if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq()) {
> > + /*
> > + * Record the caller provided buffer as the kernel mode
> > + * FP/SIMD buffer for this task, so that the state can
> > + * be preserved and restored on a context switch.
> > + */
> > + if (cmpxchg(&current->thread.kernel_fpsimd_state, NULL, s))
> > + BUG();
>
> Does this really need to be a cmpxchg, vs. a regular load and store?
> It's just operating on current.
>
It does not need to be atomic, no. I moved this around a bit while I
was working on it, but in its current form, we can just BUG/WARN on
the old value being wrong, and set the new one using an ordinary store
(in both cases).
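I.e., roughly the following (untested sketch, with the WARN()
demotion discussed in Kees' subthread folded in):

        /* in kernel_neon_begin() */
        if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq()) {
                WARN_ON_ONCE(current->thread.kernel_fpsimd_state);
                current->thread.kernel_fpsimd_state = s;
                set_thread_flag(TIF_KERNEL_FPSTATE);
        }

        /* in kernel_neon_end() */
        if (!IS_ENABLED(CONFIG_PREEMPT_RT) && in_serving_softirq() &&
            test_thread_flag(TIF_KERNEL_FPSTATE))
                fpsimd_load_kernel_state(current);
        else if (test_and_clear_thread_flag(TIF_KERNEL_FPSTATE)) {
                WARN_ON_ONCE(current->thread.kernel_fpsimd_state != s);
                current->thread.kernel_fpsimd_state = NULL;
        }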
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack
2025-10-01 21:02 [PATCH v2 00/20] arm64: Move kernel mode FPSIMD buffer to the stack Ard Biesheuvel
` (19 preceding siblings ...)
2025-10-01 21:02 ` [PATCH v2 20/20] arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack Ard Biesheuvel
@ 2025-10-03 20:28 ` Eric Biggers
20 siblings, 0 replies; 33+ messages in thread
From: Eric Biggers @ 2025-10-03 20:28 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-crypto, linux-kernel, herbert, linux,
Ard Biesheuvel, Marc Zyngier, Will Deacon, Mark Rutland,
Kees Cook, Catalin Marinas, Mark Brown
On Wed, Oct 01, 2025 at 11:02:02PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Move the buffer for preserving/restoring the kernel mode FPSIMD state on a
> context switch out of struct thread_struct, and onto the stack, so that
> the memory cost is not imposed needlessly on all tasks in the system.
>
> Changes since v1:
> - Add a patch reverting the arm64 support for the generic
> kernel_fpu_begin()/end() API, which is problematic on arm64.
>
> - Introduce a new 'ksimd' scoped guard that encapsulates the calls to
> kernel_neon_begin() and kernel_neon_end() at a higher level of
> abstraction. This makes it straightforward to plumb in the stack
> buffer without complicating the callers.
>
> - Move all kernel mode NEON users on arm64 (and some on ARM) over to the
> new API.
>
> - Add Mark's ack to patches #6 - #8
>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Eric Biggers <ebiggers@kernel.org>
>
> Ard Biesheuvel (20):
> arm64: Revert support for generic kernel mode FPU
> arm64/simd: Add scoped guard API for kernel mode SIMD
> ARM/simd: Add scoped guard API for kernel mode SIMD
> crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
> raid6: Move to more abstract 'ksimd' guard API
> crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
> crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
> crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
> lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
> lib/crypto: Switch ARM and arm64 to 'ksimd' scoped guard API
> crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API
> crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API
> crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API
> crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API
> crypto/arm64: polyval - Switch to 'ksimd' scoped guard API
> crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API
> crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API
> crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API
> arm64/xorblocks: Switch to 'ksimd' scoped guard API
> arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
- Eric
^ permalink raw reply [flat|nested] 33+ messages in thread