public inbox for linux-kernel@vger.kernel.org
From: srinivas pandruvada <srinivas.pandruvada@linux.intel.com>
To: Dave Hansen <dave.hansen@linux.intel.com>, linux-kernel@vger.kernel.org
Cc: x86@kernel.org, andrew.cooper3@citrix.com,
	Len Brown <len.brown@intel.com>,
	 Peter Zijlstra <peterz@infradead.org>,
	"Rafael J.Wysocki" <rafael.j.wysocki@intel.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] Handle Ice Lake MONITOR erratum
Date: Mon, 21 Apr 2025 16:02:10 -0700
Message-ID: <595bce6203d6d8951e312064bd5a8ba14c1fe141.camel@linux.intel.com>
In-Reply-To: <20250421192205.7CC1A7D9@davehans-spike.ostc.intel.com>

On Mon, 2025-04-21 at 12:22 -0700, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> Andrew Cooper reported some boot issues on Ice Lake servers when
> running Xen that he tracked down to MWAIT not waking up. Do the safe
> thing and consider them buggy since there's a published erratum.
> Note: I've seen no reports of this occurring on Linux.
> 
> Add Ice Lake servers to the list of shaky MONITOR implementations
> with no workaround available. Also, before the if() gets too
> unwieldy, move it over to an x86_cpu_id array. Additionally, add a
> comment to the X86_BUG_MONITOR consumption site to make it clear how
> and why affected CPUs get IPIs to wake them up.
> 
> There is no equivalent erratum for the "Xeon D" Ice Lakes so
> INTEL_ICELAKE_D is not affected.
> 
> The erratum is called ICX143 in the "3rd Gen Intel Xeon Scalable
> Processors, Codename Ice Lake Specification Update". It is Intel
> document 637780, currently available here:
> 
> 	https://cdrdv2.intel.com/v1/dl/getContent/637780
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Len Brown <len.brown@intel.com>
> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: stable@vger.kernel.org
> 
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

Thanks,
Srinivas

> ---
> 
>  b/arch/x86/include/asm/mwait.h |    3 +++
>  b/arch/x86/kernel/cpu/intel.c  |   17 ++++++++++++++---
>  2 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff -puN arch/x86/kernel/cpu/intel.c~ICX-MONITOR-bug arch/x86/kernel/cpu/intel.c
> --- a/arch/x86/kernel/cpu/intel.c~ICX-MONITOR-bug	2025-04-18 13:54:46.022590596 -0700
> +++ b/arch/x86/kernel/cpu/intel.c	2025-04-18 15:15:19.374365069 -0700
> @@ -513,6 +513,19 @@ static void init_intel_misc_features(str
>  }
>  
>  /*
> + * These CPUs have buggy MWAIT/MONITOR implementations that
> + * usually manifest as hangs or stalls at boot.
> + */
> +#define MWAIT_VFM(_vfm)	\
> +	X86_MATCH_VFM_FEATURE(_vfm, X86_FEATURE_MWAIT, 0)
> +static const struct x86_cpu_id monitor_bug_list[] = {
> +	MWAIT_VFM(INTEL_ATOM_GOLDMONT),
> +	MWAIT_VFM(INTEL_LUNARLAKE_M),
> +	MWAIT_VFM(INTEL_ICELAKE_X),	/* Erratum ICX143 */
> +	{},
> +};
> +
> +/*
>   * This is a list of Intel CPUs that are known to suffer from downclocking when
>   * ZMM registers (512-bit vectors) are used.  On these CPUs, when the kernel
>   * executes SIMD-optimized code such as cryptography functions or CRCs, it
> @@ -565,9 +578,7 @@ static void init_intel(struct cpuinfo_x8
>  	     c->x86_vfm == INTEL_WESTMERE_EX))
>  		set_cpu_bug(c, X86_BUG_CLFLUSH_MONITOR);
>  
> -	if (boot_cpu_has(X86_FEATURE_MWAIT) &&
> -	    (c->x86_vfm == INTEL_ATOM_GOLDMONT ||
> -	     c->x86_vfm == INTEL_LUNARLAKE_M))
> +	if (x86_match_cpu(monitor_bug_list))
>  		set_cpu_bug(c, X86_BUG_MONITOR);
>  
>  #ifdef CONFIG_X86_64
> diff -puN arch/x86/include/asm/mwait.h~ICX-MONITOR-bug arch/x86/include/asm/mwait.h
> --- a/arch/x86/include/asm/mwait.h~ICX-MONITOR-bug	2025-04-18 15:17:18.353749634 -0700
> +++ b/arch/x86/include/asm/mwait.h	2025-04-18 15:20:06.037927656 -0700
> @@ -110,6 +110,9 @@ static __always_inline void __sti_mwait(
>   * through MWAIT. Whenever someone changes need_resched, we would be woken
>   * up from MWAIT (without an IPI).
>   *
> + * Buggy (X86_BUG_MONITOR) CPUs will never set the polling bit and will
> + * always be sent IPIs.
> + *
>   * New with Core Duo processors, MWAIT can take some hints based on CPU
>   * capability.
>   */
> _
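
For readers following along, here is a rough userspace sketch of the two
ideas in the patch above: a sentinel-terminated match table replacing a
growing if() chain, and buggy CPUs skipping the polling bit so that
wakers fall back to IPIs. It is not kernel code. The model IDs,
FEAT_MWAIT, struct cpu, match_cpu(), enter_idle() and wake_needs_ipi()
are all made up for illustration and only loosely mirror x86_cpu_id,
x86_match_cpu() and the X86_BUG_MONITOR handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model IDs for this sketch; not the kernel's VFM values. */
enum {
	MODEL_GOLDMONT    = 1,
	MODEL_LUNARLAKE_M = 2,
	MODEL_ICELAKE_X   = 3,
	MODEL_ICELAKE_D   = 4,
};

#define FEAT_MWAIT (1u << 0)	/* hypothetical feature bit */

struct cpu_id {
	uint32_t model;		/* 0 terminates the table */
	uint32_t feature;	/* feature bits required for a match */
};

struct cpu {
	uint32_t model;
	uint32_t features;
	bool bug_monitor;	/* stands in for X86_BUG_MONITOR */
	bool polling;		/* stands in for the idle polling bit */
};

/* One table entry per affected model, ended by an all-zero sentinel. */
static const struct cpu_id monitor_bug_list[] = {
	{ MODEL_GOLDMONT,    FEAT_MWAIT },
	{ MODEL_LUNARLAKE_M, FEAT_MWAIT },
	{ MODEL_ICELAKE_X,   FEAT_MWAIT },
	{ 0, 0 },
};

/* Walk the table until the sentinel; match on model and required features. */
static bool match_cpu(const struct cpu *c, const struct cpu_id *table)
{
	assert(c != NULL && table != NULL);
	for (const struct cpu_id *id = table; id->model != 0; id++) {
		if (id->model == c->model &&
		    (c->features & id->feature) == id->feature)
			return true;
	}
	return false;
}

/* Mark the CPU as buggy if it appears in the table. */
static void init_cpu(struct cpu *c)
{
	c->bug_monitor = match_cpu(c, monitor_bug_list);
}

/* A buggy CPU never advertises that it is polling on the monitored flag. */
static void enter_idle(struct cpu *c)
{
	c->polling = !c->bug_monitor;
}

/* A waker can skip the IPI only when the target advertised polling. */
static bool wake_needs_ipi(const struct cpu *c)
{
	return !c->polling;
}
```

In this model, adding another affected CPU is a one-line table change,
and a CPU without the MWAIT feature bit never matches, which mirrors the
boot_cpu_has(X86_FEATURE_MWAIT) check that the removed if() carried.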



Thread overview: 6+ messages
2025-04-21 19:22 [PATCH] Handle Ice Lake MONITOR erratum Dave Hansen
2025-04-21 19:32 ` Andrew Cooper
2025-04-21 23:02 ` srinivas pandruvada [this message]
2025-04-22  6:46 ` Ingo Molnar
2025-04-22 14:18   ` Dave Hansen
2025-04-22 19:35     ` Ingo Molnar
