public inbox for linux-crypto@vger.kernel.org
From: "Jain, Ayush" <ayushjai@amd.com>
To: Eric Biggers <ebiggers@kernel.org>, "x86@kernel.org" <x86@kernel.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Jain, Ayush" <Ayush.Jain3@amd.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Ard Biesheuvel <ardb@kernel.org>
Subject: Re: [PATCH] x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
Date: Mon, 19 May 2025 09:48:35 +0530	[thread overview]
Message-ID: <3c612c2f-026e-4bc2-bbb4-3e9097acc6f5@amd.com> (raw)
In-Reply-To: <20250518193212.1822-1-ebiggers@kernel.org>

Hello Eric,

I have tested this on an AMD EPYC 8534P 64-Core Processor,
and this patch fixes the reported issue.

Tested-by: Ayush Jain <Ayush.Jain3@amd.com>

Thanks,
Ayush

On 5/19/2025 1:02 AM, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> irq_fpu_usable() incorrectly returned true before the FPU is
> initialized.  The x86 CPU onlining code can call sha256() to checksum
> AMD microcode images, before the FPU is initialized.  Since sha256()
> recently gained a kernel-mode FPU optimized code path, a crash occurred
> in kernel_fpu_begin_mask() during hotplug CPU onlining.
> 
> (The crash did not occur during boot-time CPU onlining, since the
> optimized sha256() code is not enabled until subsys_initcalls run.)
> 
> Fix this by making irq_fpu_usable() return false before fpu__init_cpu()
> has run.  To do this without adding any additional overhead to
> irq_fpu_usable(), replace the existing per-CPU bool in_kernel_fpu with
> kernel_fpu_allowed which tracks both initialization and usage rather
> than just usage.  The initial state is false; FPU initialization sets it
> to true; kernel-mode FPU sections toggle it to false and then back to
> true; and CPU offlining restores it to the initial state of false.
> 
> Fixes: 11d7956d526f ("crypto: x86/sha256 - implement library instead of shash")
> Reported-by: Ayush Jain <Ayush.Jain3@amd.com>
> Closes: https://lore.kernel.org/r/20250516112217.GBaCcf6Yoc6LkIIryP@fat_crate.local
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  arch/x86/include/asm/fpu/api.h |  1 +
>  arch/x86/kernel/fpu/core.c     | 34 +++++++++++++++++++++-------------
>  arch/x86/kernel/fpu/init.c     |  3 +++
>  arch/x86/kernel/smpboot.c      |  6 ++++++
>  4 files changed, 31 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
> index 8e6848f55dcdb..cd6f194a912bf 100644
> --- a/arch/x86/include/asm/fpu/api.h
> +++ b/arch/x86/include/asm/fpu/api.h
> @@ -124,10 +124,11 @@ extern void fpstate_init_soft(struct swregs_state *soft);
>  #else
>  static inline void fpstate_init_soft(struct swregs_state *soft) {}
>  #endif
>  
>  /* State tracking */
> +DECLARE_PER_CPU(bool, kernel_fpu_allowed);
>  DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
>  
>  /* Process cleanup */
>  #ifdef CONFIG_X86_64
>  extern void fpstate_free(struct fpu *fpu);
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 948b4f5fad99c..ea138583dd92a 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -42,12 +42,15 @@ struct fpu_state_config fpu_user_cfg __ro_after_init;
>   * Represents the initial FPU state. It's mostly (but not completely) zeroes,
>   * depending on the FPU hardware format:
>   */
>  struct fpstate init_fpstate __ro_after_init;
>  
> -/* Track in-kernel FPU usage */
> -static DEFINE_PER_CPU(bool, in_kernel_fpu);
> +/*
> + * Track FPU initialization and kernel-mode usage. 'true' means the FPU is
> + * initialized and is not currently being used by the kernel:
> + */
> +DEFINE_PER_CPU(bool, kernel_fpu_allowed);
>  
>  /*
>   * Track which context is using the FPU on the CPU:
>   */
>  DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
> @@ -70,19 +73,22 @@ bool irq_fpu_usable(void)
>  {
>  	if (WARN_ON_ONCE(in_nmi()))
>  		return false;
>  
>  	/*
> -	 * In kernel FPU usage already active?  This detects any explicitly
> -	 * nested usage in task or softirq context, which is unsupported.  It
> -	 * also detects attempted usage in a hardirq that has interrupted a
> -	 * kernel-mode FPU section.
> +	 * Return false in the following cases:
> +	 *
> +	 * - FPU is not yet initialized. This can happen only when the call is
> +	 *   coming from CPU onlining, for example for microcode checksumming.
> +	 * - The kernel is already using the FPU, either because of explicit
> +	 *   nesting (which should never be done), or because of implicit
> +	 *   nesting when a hardirq interrupted a kernel-mode FPU section.
> +	 *
> +	 * The single boolean check below handles both cases:
>  	 */
> -	if (this_cpu_read(in_kernel_fpu)) {
> -		WARN_ON_FPU(!in_hardirq());
> +	if (!this_cpu_read(kernel_fpu_allowed))
>  		return false;
> -	}
>  
>  	/*
>  	 * When not in NMI or hard interrupt context, FPU can be used in:
>  	 *
>  	 * - Task context except from within fpregs_lock()'ed critical
> @@ -437,13 +443,14 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
>  {
>  	if (!irqs_disabled())
>  		fpregs_lock();
>  
>  	WARN_ON_FPU(!irq_fpu_usable());
> -	WARN_ON_FPU(this_cpu_read(in_kernel_fpu));
>  
> -	this_cpu_write(in_kernel_fpu, true);
> +	/* Toggle kernel_fpu_allowed to false: */
> +	WARN_ON_FPU(!this_cpu_read(kernel_fpu_allowed));
> +	this_cpu_write(kernel_fpu_allowed, false);
>  
>  	if (!(current->flags & (PF_KTHREAD | PF_USER_WORKER)) &&
>  	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
>  		set_thread_flag(TIF_NEED_FPU_LOAD);
>  		save_fpregs_to_fpstate(x86_task_fpu(current));
> @@ -459,13 +466,14 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
>  }
>  EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);
>  
>  void kernel_fpu_end(void)
>  {
> -	WARN_ON_FPU(!this_cpu_read(in_kernel_fpu));
> +	/* Toggle kernel_fpu_allowed back to true: */
> +	WARN_ON_FPU(this_cpu_read(kernel_fpu_allowed));
> +	this_cpu_write(kernel_fpu_allowed, true);
>  
> -	this_cpu_write(in_kernel_fpu, false);
>  	if (!irqs_disabled())
>  		fpregs_unlock();
>  }
>  EXPORT_SYMBOL_GPL(kernel_fpu_end);
>  
> diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
> index 6bb3e35c40e24..99db41bf9fa6b 100644
> --- a/arch/x86/kernel/fpu/init.c
> +++ b/arch/x86/kernel/fpu/init.c
> @@ -49,10 +49,13 @@ static void fpu__init_cpu_generic(void)
>   */
>  void fpu__init_cpu(void)
>  {
>  	fpu__init_cpu_generic();
>  	fpu__init_cpu_xstate();
> +
> +	/* Start allowing kernel-mode FPU: */
> +	this_cpu_write(kernel_fpu_allowed, true);
>  }
>  
>  static bool __init fpu__probe_without_cpuid(void)
>  {
>  	unsigned long cr0;
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index e266c4edea17e..58ede3fa6a75b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1186,10 +1186,16 @@ void cpu_disable_common(void)
>  {
>  	int cpu = smp_processor_id();
>  
>  	remove_siblinginfo(cpu);
>  
> +	/*
> +	 * Stop allowing kernel-mode FPU. This is needed so that if the CPU is
> +	 * brought online again, the initial state is not allowed:
> +	 */
> +	this_cpu_write(kernel_fpu_allowed, false);
> +
>  	/* It's now safe to remove this processor from the online map */
>  	lock_vector_lock();
>  	remove_cpu_from_maps(cpu);
>  	unlock_vector_lock();
>  	fixup_irqs();
> 
> base-commit: 8566fc3b96539e3235909d6bdda198e1282beaed
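
The state transitions described in the commit message above (initially
false; true after FPU initialization; false and back to true around a
kernel-mode FPU section; false again on CPU offlining) can be modeled in
plain userspace C. This is only an illustrative sketch, not kernel code:
every model_* name is hypothetical, a single global flag stands in for the
per-CPU kernel_fpu_allowed variable, and the NMI/hardirq/softirq context
checks that the real irq_fpu_usable() also performs are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU kernel_fpu_allowed flag.
 * Only one CPU is modeled; the initial state is false. */
static bool model_fpu_allowed;

/* Models the end of fpu__init_cpu(): start allowing kernel-mode FPU. */
void model_fpu_init_cpu(void)
{
	model_fpu_allowed = true;
}

/* Models only the flag check in irq_fpu_usable(); the real function
 * also looks at the interrupt context, which is omitted here. */
bool model_irq_fpu_usable(void)
{
	return model_fpu_allowed;
}

/* Models the flag handling in kernel_fpu_begin_mask(): toggle to false. */
void model_kernel_fpu_begin(void)
{
	assert(model_fpu_allowed);
	model_fpu_allowed = false;
}

/* Models the flag handling in kernel_fpu_end(): toggle back to true. */
void model_kernel_fpu_end(void)
{
	assert(!model_fpu_allowed);
	model_fpu_allowed = true;
}

/* Models cpu_disable_common(): restore the initial state of false. */
void model_cpu_offline(void)
{
	model_fpu_allowed = false;
}

/* Walks one online/offline cycle and returns 0 if the flag had the
 * expected value at every step, or the number of the failing step. */
int model_run_lifecycle(void)
{
	if (model_irq_fpu_usable())
		return 1;	/* before init: must not be usable */
	model_fpu_init_cpu();
	if (!model_irq_fpu_usable())
		return 2;	/* after init: usable */
	model_kernel_fpu_begin();
	if (model_irq_fpu_usable())
		return 3;	/* inside a section: nesting refused */
	model_kernel_fpu_end();
	if (!model_irq_fpu_usable())
		return 4;	/* section finished: usable again */
	model_cpu_offline();
	if (model_irq_fpu_usable())
		return 5;	/* offlined: back to the initial state */
	return 0;
}
```

The point of the model is that one boolean read answers both "has the FPU
been initialized?" and "is a kernel-mode FPU section already active?",
which is why the patch adds no extra work to the irq_fpu_usable() fast
path.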

