linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
@ 2025-05-18 19:32 Eric Biggers
  2025-05-19  4:18 ` Jain, Ayush
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Eric Biggers @ 2025-05-18 19:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, linux-crypto, linux-pm, Borislav Petkov,
	Thomas Gleixner, Ayush Jain, Herbert Xu, Ard Biesheuvel

From: Eric Biggers <ebiggers@google.com>

irq_fpu_usable() incorrectly returned true before the FPU is
initialized.  The x86 CPU onlining code can call sha256() to checksum
AMD microcode images, before the FPU is initialized.  Since sha256()
recently gained a kernel-mode FPU optimized code path, a crash occurred
in kernel_fpu_begin_mask() during hotplug CPU onlining.

(The crash did not occur during boot-time CPU onlining, since the
optimized sha256() code is not enabled until subsys_initcalls run.)

Fix this by making irq_fpu_usable() return false before fpu__init_cpu()
has run.  To do this without adding any additional overhead to
irq_fpu_usable(), replace the existing per-CPU bool in_kernel_fpu with
kernel_fpu_allowed which tracks both initialization and usage rather
than just usage.  The initial state is false; FPU initialization sets it
to true; kernel-mode FPU sections toggle it to false and then back to
true; and CPU offlining restores it to the initial state of false.

Fixes: 11d7956d526f ("crypto: x86/sha256 - implement library instead of shash")
Reported-by: Ayush Jain <Ayush.Jain3@amd.com>
Closes: https://lore.kernel.org/r/20250516112217.GBaCcf6Yoc6LkIIryP@fat_crate.local
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 arch/x86/include/asm/fpu/api.h |  1 +
 arch/x86/kernel/fpu/core.c     | 34 +++++++++++++++++++++-------------
 arch/x86/kernel/fpu/init.c     |  3 +++
 arch/x86/kernel/smpboot.c      |  6 ++++++
 4 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index 8e6848f55dcdb..cd6f194a912bf 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -124,10 +124,11 @@ extern void fpstate_init_soft(struct swregs_state *soft);
 #else
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
 /* State tracking */
+DECLARE_PER_CPU(bool, kernel_fpu_allowed);
 DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
 
 /* Process cleanup */
 #ifdef CONFIG_X86_64
 extern void fpstate_free(struct fpu *fpu);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 948b4f5fad99c..ea138583dd92a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -42,12 +42,15 @@ struct fpu_state_config fpu_user_cfg __ro_after_init;
  * Represents the initial FPU state. It's mostly (but not completely) zeroes,
  * depending on the FPU hardware format:
  */
 struct fpstate init_fpstate __ro_after_init;
 
-/* Track in-kernel FPU usage */
-static DEFINE_PER_CPU(bool, in_kernel_fpu);
+/*
+ * Track FPU initialization and kernel-mode usage. 'true' means the FPU is
+ * initialized and is not currently being used by the kernel:
+ */
+DEFINE_PER_CPU(bool, kernel_fpu_allowed);
 
 /*
  * Track which context is using the FPU on the CPU:
  */
 DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
@@ -70,19 +73,22 @@ bool irq_fpu_usable(void)
 {
 	if (WARN_ON_ONCE(in_nmi()))
 		return false;
 
 	/*
-	 * In kernel FPU usage already active?  This detects any explicitly
-	 * nested usage in task or softirq context, which is unsupported.  It
-	 * also detects attempted usage in a hardirq that has interrupted a
-	 * kernel-mode FPU section.
+	 * Return false in the following cases:
+	 *
+	 * - FPU is not yet initialized. This can happen only when the call is
+	 *   coming from CPU onlining, for example for microcode checksumming.
+	 * - The kernel is already using the FPU, either because of explicit
+	 *   nesting (which should never be done), or because of implicit
+	 *   nesting when a hardirq interrupted a kernel-mode FPU section.
+	 *
+	 * The single boolean check below handles both cases:
 	 */
-	if (this_cpu_read(in_kernel_fpu)) {
-		WARN_ON_FPU(!in_hardirq());
+	if (!this_cpu_read(kernel_fpu_allowed))
 		return false;
-	}
 
 	/*
 	 * When not in NMI or hard interrupt context, FPU can be used in:
 	 *
 	 * - Task context except from within fpregs_lock()'ed critical
@@ -437,13 +443,14 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
 	if (!irqs_disabled())
 		fpregs_lock();
 
 	WARN_ON_FPU(!irq_fpu_usable());
-	WARN_ON_FPU(this_cpu_read(in_kernel_fpu));
 
-	this_cpu_write(in_kernel_fpu, true);
+	/* Toggle kernel_fpu_allowed to false: */
+	WARN_ON_FPU(!this_cpu_read(kernel_fpu_allowed));
+	this_cpu_write(kernel_fpu_allowed, false);
 
 	if (!(current->flags & (PF_KTHREAD | PF_USER_WORKER)) &&
 	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 		save_fpregs_to_fpstate(x86_task_fpu(current));
@@ -459,13 +466,14 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);
 
 void kernel_fpu_end(void)
 {
-	WARN_ON_FPU(!this_cpu_read(in_kernel_fpu));
+	/* Toggle kernel_fpu_allowed back to true: */
+	WARN_ON_FPU(this_cpu_read(kernel_fpu_allowed));
+	this_cpu_write(kernel_fpu_allowed, true);
 
-	this_cpu_write(in_kernel_fpu, false);
 	if (!irqs_disabled())
 		fpregs_unlock();
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_end);
 
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 6bb3e35c40e24..99db41bf9fa6b 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -49,10 +49,13 @@ static void fpu__init_cpu_generic(void)
  */
 void fpu__init_cpu(void)
 {
 	fpu__init_cpu_generic();
 	fpu__init_cpu_xstate();
+
+	/* Start allowing kernel-mode FPU: */
+	this_cpu_write(kernel_fpu_allowed, true);
 }
 
 static bool __init fpu__probe_without_cpuid(void)
 {
 	unsigned long cr0;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e266c4edea17e..58ede3fa6a75b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1186,10 +1186,16 @@ void cpu_disable_common(void)
 {
 	int cpu = smp_processor_id();
 
 	remove_siblinginfo(cpu);
 
+	/*
+	 * Stop allowing kernel-mode FPU. This is needed so that if the CPU is
+	 * brought online again, the initial state is not allowed:
+	 */
+	this_cpu_write(kernel_fpu_allowed, false);
+
 	/* It's now safe to remove this processor from the online map */
 	lock_vector_lock();
 	remove_cpu_from_maps(cpu);
 	unlock_vector_lock();
 	fixup_irqs();

base-commit: 8566fc3b96539e3235909d6bdda198e1282beaed
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-18 19:32 [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining Eric Biggers
@ 2025-05-19  4:18 ` Jain, Ayush
  2025-05-19  8:26 ` Ingo Molnar
  2025-05-21 15:39 ` Thomas Gleixner
  2 siblings, 0 replies; 10+ messages in thread
From: Jain, Ayush @ 2025-05-19  4:18 UTC (permalink / raw)
  To: Eric Biggers, x86@kernel.org
  Cc: linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org,
	linux-pm@vger.kernel.org, Borislav Petkov, Thomas Gleixner,
	Jain, Ayush, Herbert Xu, Ard Biesheuvel

Hello Eric,

I have tested this on AMD EPYC 8534P 64-Core Processor
and this patch fixes reported issue.

Tested-by: Ayush Jain <Ayush.Jain3@amd.com>

Thanks,
Ayush

On 5/19/2025 1:02 AM, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> irq_fpu_usable() incorrectly returned true before the FPU is
> initialized.  The x86 CPU onlining code can call sha256() to checksum
> AMD microcode images, before the FPU is initialized.  Since sha256()
> recently gained a kernel-mode FPU optimized code path, a crash occurred
> in kernel_fpu_begin_mask() during hotplug CPU onlining.
> 
> (The crash did not occur during boot-time CPU onlining, since the
> optimized sha256() code is not enabled until subsys_initcalls run.)
> 
> Fix this by making irq_fpu_usable() return false before fpu__init_cpu()
> has run.  To do this without adding any additional overhead to
> irq_fpu_usable(), replace the existing per-CPU bool in_kernel_fpu with
> kernel_fpu_allowed which tracks both initialization and usage rather
> than just usage.  The initial state is false; FPU initialization sets it
> to true; kernel-mode FPU sections toggle it to false and then back to
> true; and CPU offlining restores it to the initial state of false.
> 
> Fixes: 11d7956d526f ("crypto: x86/sha256 - implement library instead of shash")
> Reported-by: Ayush Jain <Ayush.Jain3@amd.com>
> Closes: https://lore.kernel.org/r/20250516112217.GBaCcf6Yoc6LkIIryP@fat_crate.local
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  arch/x86/include/asm/fpu/api.h |  1 +
>  arch/x86/kernel/fpu/core.c     | 34 +++++++++++++++++++++-------------
>  arch/x86/kernel/fpu/init.c     |  3 +++
>  arch/x86/kernel/smpboot.c      |  6 ++++++
>  4 files changed, 31 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
> index 8e6848f55dcdb..cd6f194a912bf 100644
> --- a/arch/x86/include/asm/fpu/api.h
> +++ b/arch/x86/include/asm/fpu/api.h
> @@ -124,10 +124,11 @@ extern void fpstate_init_soft(struct swregs_state *soft);
>  #else
>  static inline void fpstate_init_soft(struct swregs_state *soft) {}
>  #endif
>  
>  /* State tracking */
> +DECLARE_PER_CPU(bool, kernel_fpu_allowed);
>  DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
>  
>  /* Process cleanup */
>  #ifdef CONFIG_X86_64
>  extern void fpstate_free(struct fpu *fpu);
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 948b4f5fad99c..ea138583dd92a 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -42,12 +42,15 @@ struct fpu_state_config fpu_user_cfg __ro_after_init;
>   * Represents the initial FPU state. It's mostly (but not completely) zeroes,
>   * depending on the FPU hardware format:
>   */
>  struct fpstate init_fpstate __ro_after_init;
>  
> -/* Track in-kernel FPU usage */
> -static DEFINE_PER_CPU(bool, in_kernel_fpu);
> +/*
> + * Track FPU initialization and kernel-mode usage. 'true' means the FPU is
> + * initialized and is not currently being used by the kernel:
> + */
> +DEFINE_PER_CPU(bool, kernel_fpu_allowed);
>  
>  /*
>   * Track which context is using the FPU on the CPU:
>   */
>  DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
> @@ -70,19 +73,22 @@ bool irq_fpu_usable(void)
>  {
>  	if (WARN_ON_ONCE(in_nmi()))
>  		return false;
>  
>  	/*
> -	 * In kernel FPU usage already active?  This detects any explicitly
> -	 * nested usage in task or softirq context, which is unsupported.  It
> -	 * also detects attempted usage in a hardirq that has interrupted a
> -	 * kernel-mode FPU section.
> +	 * Return false in the following cases:
> +	 *
> +	 * - FPU is not yet initialized. This can happen only when the call is
> +	 *   coming from CPU onlining, for example for microcode checksumming.
> +	 * - The kernel is already using the FPU, either because of explicit
> +	 *   nesting (which should never be done), or because of implicit
> +	 *   nesting when a hardirq interrupted a kernel-mode FPU section.
> +	 *
> +	 * The single boolean check below handles both cases:
>  	 */
> -	if (this_cpu_read(in_kernel_fpu)) {
> -		WARN_ON_FPU(!in_hardirq());
> +	if (!this_cpu_read(kernel_fpu_allowed))
>  		return false;
> -	}
>  
>  	/*
>  	 * When not in NMI or hard interrupt context, FPU can be used in:
>  	 *
>  	 * - Task context except from within fpregs_lock()'ed critical
> @@ -437,13 +443,14 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
>  {
>  	if (!irqs_disabled())
>  		fpregs_lock();
>  
>  	WARN_ON_FPU(!irq_fpu_usable());
> -	WARN_ON_FPU(this_cpu_read(in_kernel_fpu));
>  
> -	this_cpu_write(in_kernel_fpu, true);
> +	/* Toggle kernel_fpu_allowed to false: */
> +	WARN_ON_FPU(!this_cpu_read(kernel_fpu_allowed));
> +	this_cpu_write(kernel_fpu_allowed, false);
>  
>  	if (!(current->flags & (PF_KTHREAD | PF_USER_WORKER)) &&
>  	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
>  		set_thread_flag(TIF_NEED_FPU_LOAD);
>  		save_fpregs_to_fpstate(x86_task_fpu(current));
> @@ -459,13 +466,14 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
>  }
>  EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);
>  
>  void kernel_fpu_end(void)
>  {
> -	WARN_ON_FPU(!this_cpu_read(in_kernel_fpu));
> +	/* Toggle kernel_fpu_allowed back to true: */
> +	WARN_ON_FPU(this_cpu_read(kernel_fpu_allowed));
> +	this_cpu_write(kernel_fpu_allowed, true);
>  
> -	this_cpu_write(in_kernel_fpu, false);
>  	if (!irqs_disabled())
>  		fpregs_unlock();
>  }
>  EXPORT_SYMBOL_GPL(kernel_fpu_end);
>  
> diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
> index 6bb3e35c40e24..99db41bf9fa6b 100644
> --- a/arch/x86/kernel/fpu/init.c
> +++ b/arch/x86/kernel/fpu/init.c
> @@ -49,10 +49,13 @@ static void fpu__init_cpu_generic(void)
>   */
>  void fpu__init_cpu(void)
>  {
>  	fpu__init_cpu_generic();
>  	fpu__init_cpu_xstate();
> +
> +	/* Start allowing kernel-mode FPU: */
> +	this_cpu_write(kernel_fpu_allowed, true);
>  }
>  
>  static bool __init fpu__probe_without_cpuid(void)
>  {
>  	unsigned long cr0;
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index e266c4edea17e..58ede3fa6a75b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1186,10 +1186,16 @@ void cpu_disable_common(void)
>  {
>  	int cpu = smp_processor_id();
>  
>  	remove_siblinginfo(cpu);
>  
> +	/*
> +	 * Stop allowing kernel-mode FPU. This is needed so that if the CPU is
> +	 * brought online again, the initial state is not allowed:
> +	 */
> +	this_cpu_write(kernel_fpu_allowed, false);
> +
>  	/* It's now safe to remove this processor from the online map */
>  	lock_vector_lock();
>  	remove_cpu_from_maps(cpu);
>  	unlock_vector_lock();
>  	fixup_irqs();
> 
> base-commit: 8566fc3b96539e3235909d6bdda198e1282beaed


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-18 19:32 [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining Eric Biggers
  2025-05-19  4:18 ` Jain, Ayush
@ 2025-05-19  8:26 ` Ingo Molnar
  2025-05-19  8:32   ` Ingo Molnar
  2025-05-21 15:39 ` Thomas Gleixner
  2 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2025-05-19  8:26 UTC (permalink / raw)
  To: Eric Biggers
  Cc: x86, linux-kernel, linux-crypto, linux-pm, Borislav Petkov,
	Thomas Gleixner, Ayush Jain, Herbert Xu, Ard Biesheuvel


* Eric Biggers <ebiggers@kernel.org> wrote:

> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -42,12 +42,15 @@ struct fpu_state_config fpu_user_cfg __ro_after_init;
>   * Represents the initial FPU state. It's mostly (but not completely) zeroes,
>   * depending on the FPU hardware format:
>   */
>  struct fpstate init_fpstate __ro_after_init;
>  
> -/* Track in-kernel FPU usage */
> -static DEFINE_PER_CPU(bool, in_kernel_fpu);
> +/*
> + * Track FPU initialization and kernel-mode usage. 'true' means the FPU is
> + * initialized and is not currently being used by the kernel:
> + */
> +DEFINE_PER_CPU(bool, kernel_fpu_allowed);

So this is a nice independent cleanup, regardless of the CPU 
bootstrapping bug it fixes. The fuzzy/negated meaning of in_kernel_fpu 
always bothered me a bit, and your patch makes this condition a bit 
cleaner, plus it defaults to 'disabled' on zero-initialization, which 
is a bonus.

>  void kernel_fpu_end(void)
>  {
> -	WARN_ON_FPU(!this_cpu_read(in_kernel_fpu));
> +	/* Toggle kernel_fpu_allowed back to true: */
> +	WARN_ON_FPU(this_cpu_read(kernel_fpu_allowed));
> +	this_cpu_write(kernel_fpu_allowed, true);
>  
> -	this_cpu_write(in_kernel_fpu, false);
>  	if (!irqs_disabled())
>  		fpregs_unlock();

In addition to this fix, feel free to also send your x86 irqs-enabled 
FPU model optimization series on top, Ard says it shouldn't cause 
fundamental problems on EFI.

>  }
>  EXPORT_SYMBOL_GPL(kernel_fpu_end);
>  
> diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
> index 6bb3e35c40e24..99db41bf9fa6b 100644
> --- a/arch/x86/kernel/fpu/init.c
> +++ b/arch/x86/kernel/fpu/init.c
> @@ -49,10 +49,13 @@ static void fpu__init_cpu_generic(void)
>   */
>  void fpu__init_cpu(void)
>  {
>  	fpu__init_cpu_generic();
>  	fpu__init_cpu_xstate();
> +
> +	/* Start allowing kernel-mode FPU: */
> +	this_cpu_write(kernel_fpu_allowed, true);

Since this goes outside the regular kernel_fpu_begin()/end() methods, 
could you please also add an WARN_ON_FPU() check to make sure it was 
false before? x86 CPU init code is still a bit of spaghetti at times.

> @@ -1186,10 +1186,16 @@ void cpu_disable_common(void)
>  {
>  	int cpu = smp_processor_id();
>  
>  	remove_siblinginfo(cpu);
>  
> +	/*
> +	 * Stop allowing kernel-mode FPU. This is needed so that if the CPU is
> +	 * brought online again, the initial state is not allowed:
> +	 */
> +	this_cpu_write(kernel_fpu_allowed, false);

Ditto, an WARN_ON_FPU() would be nice: if kernel FPU is disabled at 
this point then something's fishy.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-19  8:26 ` Ingo Molnar
@ 2025-05-19  8:32   ` Ingo Molnar
  2025-05-19 17:04     ` Eric Biggers
  0 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2025-05-19  8:32 UTC (permalink / raw)
  To: Eric Biggers
  Cc: x86, linux-kernel, linux-crypto, linux-pm, Borislav Petkov,
	Thomas Gleixner, Ayush Jain, Herbert Xu, Ard Biesheuvel


> void fpu__init_cpu(void)
> {
>        fpu__init_cpu_generic();
>        fpu__init_cpu_xstate();
> +
> +       /* Start allowing kernel-mode FPU: */
> +       this_cpu_write(kernel_fpu_allowed, true);
> }

BTW., this is the chunk that fixes the crypto crash, right? If yes, 
then could you please split this from the main patch, with the main 
patch setting kernel_fpu_allowed very early, which should make the main 
patch an identity transformation with no (expected) change in behavior.

Likewise, the cpu_disable_common change should similarly replicate the 
current code, and should only be changed in the second patch.

Phasing it in like that should improve bisectability, for the off 
chance of some regression.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-19  8:32   ` Ingo Molnar
@ 2025-05-19 17:04     ` Eric Biggers
  2025-05-20  9:33       ` Ingo Molnar
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Biggers @ 2025-05-19 17:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: x86, linux-kernel, linux-crypto, linux-pm, Borislav Petkov,
	Thomas Gleixner, Ayush Jain, Herbert Xu, Ard Biesheuvel

On Mon, May 19, 2025 at 10:32:08AM +0200, Ingo Molnar wrote:
> 
> > void fpu__init_cpu(void)
> > {
> >        fpu__init_cpu_generic();
> >        fpu__init_cpu_xstate();
> > +
> > +       /* Start allowing kernel-mode FPU: */
> > +       this_cpu_write(kernel_fpu_allowed, true);
> > }
> 
> BTW., this is the chunk that fixes the crypto crash, right? If yes, 
> then could you please split this from the main patch, with the main 
> patch setting kernel_fpu_allowed very early, which should make the main 
> patch an identity transformation with no (expected) change in behavior.
> 
> Likewise, the cpu_disable_common change should similarly replicate the 
> current code, and should only be changed in the second patch.
> 
> Phasing it in like that should improve bisectability, for the off 
> chance of some regression.

The line in fpu__init_cpu() is needed at the same time that the boolean is
inverted (when in_kernel_fpu is replaced with kernel_fpu_allowed), since
otherwise it never gets set to true and kernel-mode FPU is never allowed.

We could include the fpu__init_cpu() change in patch 1 and leave CPU hotplug
broken, and fix it in patch 2 by updating cpu_disable_common().  I think it
makes a lot more sense to keep them together though.

Or we could use DEFINE_PER_CPU() = true in patch 1, then revert that in patch 2
and replace it with the line in fpu__init_cpu().  But again I think the split
would be more likely to create problems than solve them.

- Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-19 17:04     ` Eric Biggers
@ 2025-05-20  9:33       ` Ingo Molnar
  2025-05-21 15:39         ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2025-05-20  9:33 UTC (permalink / raw)
  To: Eric Biggers
  Cc: x86, linux-kernel, linux-crypto, linux-pm, Borislav Petkov,
	Thomas Gleixner, Ayush Jain, Herbert Xu, Ard Biesheuvel


* Eric Biggers <ebiggers@kernel.org> wrote:

> Or we could use DEFINE_PER_CPU() = true in patch 1, then revert that 
> in patch 2 and replace it with the line in fpu__init_cpu().  But 
> again I think the split would be more likely to create problems than 
> solve them.

Well, my request would be for the first patch to simply mimic current 
(and buggy) behavior as much as reasonably possible (obviously the 
effects of BSS zeroing shouldn't be mimiced 100%) - and the second 
patch to fix the initialization-ordering bug.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-20  9:33       ` Ingo Molnar
@ 2025-05-21 15:39         ` Thomas Gleixner
  2025-05-24  2:55           ` Eric Biggers
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2025-05-21 15:39 UTC (permalink / raw)
  To: Ingo Molnar, Eric Biggers
  Cc: x86, linux-kernel, linux-crypto, linux-pm, Borislav Petkov,
	Ayush Jain, Herbert Xu, Ard Biesheuvel

On Tue, May 20 2025 at 11:33, Ingo Molnar wrote:
> * Eric Biggers <ebiggers@kernel.org> wrote:
>
>> Or we could use DEFINE_PER_CPU() = true in patch 1, then revert that 
>> in patch 2 and replace it with the line in fpu__init_cpu().  But 
>> again I think the split would be more likely to create problems than 
>> solve them.
>
> Well, my request would be for the first patch to simply mimic current 
> (and buggy) behavior as much as reasonably possible (obviously the 
> effects of BSS zeroing shouldn't be mimiced 100%) - and the second 
> patch to fix the initialization-ordering bug.

So the first patch is then incomprehensible buggy and needs a trivial
one-liner to fix up, right?

TBH, that's just bonkers. Eric's patch is trivial enough as is and easy
to review. Artifical patch splitting with buggy intermediate state makes
only sense, when the overall changes are massive and hard to
review. That's absolutely not the case here.

Thanks,

        tglx





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-18 19:32 [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining Eric Biggers
  2025-05-19  4:18 ` Jain, Ayush
  2025-05-19  8:26 ` Ingo Molnar
@ 2025-05-21 15:39 ` Thomas Gleixner
  2025-05-26  2:56   ` Herbert Xu
  2 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2025-05-21 15:39 UTC (permalink / raw)
  To: Eric Biggers, x86
  Cc: linux-kernel, linux-crypto, linux-pm, Borislav Petkov, Ayush Jain,
	Herbert Xu, Ard Biesheuvel

On Sun, May 18 2025 at 12:32, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> irq_fpu_usable() incorrectly returned true before the FPU is
> initialized.  The x86 CPU onlining code can call sha256() to checksum
> AMD microcode images, before the FPU is initialized.  Since sha256()
> recently gained a kernel-mode FPU optimized code path, a crash occurred
> in kernel_fpu_begin_mask() during hotplug CPU onlining.
>
> (The crash did not occur during boot-time CPU onlining, since the
> optimized sha256() code is not enabled until subsys_initcalls run.)
>
> Fix this by making irq_fpu_usable() return false before fpu__init_cpu()
> has run.  To do this without adding any additional overhead to
> irq_fpu_usable(), replace the existing per-CPU bool in_kernel_fpu with
> kernel_fpu_allowed which tracks both initialization and usage rather
> than just usage.  The initial state is false; FPU initialization sets it
> to true; kernel-mode FPU sections toggle it to false and then back to
> true; and CPU offlining restores it to the initial state of false.
>
> Fixes: 11d7956d526f ("crypto: x86/sha256 - implement library instead of shash")
> Reported-by: Ayush Jain <Ayush.Jain3@amd.com>
> Closes: https://lore.kernel.org/r/20250516112217.GBaCcf6Yoc6LkIIryP@fat_crate.local
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-21 15:39         ` Thomas Gleixner
@ 2025-05-24  2:55           ` Eric Biggers
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Biggers @ 2025-05-24  2:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, x86, linux-kernel, linux-crypto, linux-pm,
	Borislav Petkov, Ayush Jain, Herbert Xu, Ard Biesheuvel

On Wed, May 21, 2025 at 05:39:27PM +0200, Thomas Gleixner wrote:
> On Tue, May 20 2025 at 11:33, Ingo Molnar wrote:
> > * Eric Biggers <ebiggers@kernel.org> wrote:
> >
> >> Or we could use DEFINE_PER_CPU() = true in patch 1, then revert that 
> >> in patch 2 and replace it with the line in fpu__init_cpu().  But 
> >> again I think the split would be more likely to create problems than 
> >> solve them.
> >
> > Well, my request would be for the first patch to simply mimic current 
> > (and buggy) behavior as much as reasonably possible (obviously the 
> > effects of BSS zeroing shouldn't be mimiced 100%) - and the second 
> > patch to fix the initialization-ordering bug.
> 
> So the first patch is then incomprehensible buggy and needs a trivial
> one-liner to fix up, right?
> 
> TBH, that's just bonkers. Eric's patch is trivial enough as is and easy
> to review. Artifical patch splitting with buggy intermediate state makes
> only sense, when the overall changes are massive and hard to
> review. That's absolutely not the case here.
> 
> Thanks,

That sounds reasonable to me.  Anyway, any interest in applying one of the
versions to the x86 tree?  Maybe either this original one, or v2 which has the
extra WARN_ON_FPU() checks that Ingo requested.

- Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  2025-05-21 15:39 ` Thomas Gleixner
@ 2025-05-26  2:56   ` Herbert Xu
  0 siblings, 0 replies; 10+ messages in thread
From: Herbert Xu @ 2025-05-26  2:56 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Eric Biggers, x86, linux-kernel, linux-crypto, linux-pm,
	Borislav Petkov, Ayush Jain, Ard Biesheuvel, Ingo Molnar

On Wed, May 21, 2025 at 05:39:41PM +0200, Thomas Gleixner wrote:
> On Sun, May 18 2025 at 12:32, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > irq_fpu_usable() incorrectly returned true before the FPU is
> > initialized.  The x86 CPU onlining code can call sha256() to checksum
> > AMD microcode images, before the FPU is initialized.  Since sha256()
> > recently gained a kernel-mode FPU optimized code path, a crash occurred
> > in kernel_fpu_begin_mask() during hotplug CPU onlining.
> >
> > (The crash did not occur during boot-time CPU onlining, since the
> > optimized sha256() code is not enabled until subsys_initcalls run.)
> >
> > Fix this by making irq_fpu_usable() return false before fpu__init_cpu()
> > has run.  To do this without adding any additional overhead to
> > irq_fpu_usable(), replace the existing per-CPU bool in_kernel_fpu with
> > kernel_fpu_allowed which tracks both initialization and usage rather
> > than just usage.  The initial state is false; FPU initialization sets it
> > to true; kernel-mode FPU sections toggle it to false and then back to
> > true; and CPU offlining restores it to the initial state of false.
> >
> > Fixes: 11d7956d526f ("crypto: x86/sha256 - implement library instead of shash")
> > Reported-by: Ayush Jain <Ayush.Jain3@amd.com>
> > Closes: https://lore.kernel.org/r/20250516112217.GBaCcf6Yoc6LkIIryP@fat_crate.local
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> 
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

Since this patch is holding up my pull request, I will be applying
it to cryptodev.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-05-26  2:56 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-18 19:32 [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining Eric Biggers
2025-05-19  4:18 ` Jain, Ayush
2025-05-19  8:26 ` Ingo Molnar
2025-05-19  8:32   ` Ingo Molnar
2025-05-19 17:04     ` Eric Biggers
2025-05-20  9:33       ` Ingo Molnar
2025-05-21 15:39         ` Thomas Gleixner
2025-05-24  2:55           ` Eric Biggers
2025-05-21 15:39 ` Thomas Gleixner
2025-05-26  2:56   ` Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).