Linux Power Management development
 help / color / mirror / Atom feed
From: Jim Mattson <jmattson@google.com>
To: Borislav Petkov <bp@alien8.de>, Thomas Gleixner <tglx@kernel.org>,
	x86@kernel.org,  linux-kernel@vger.kernel.org,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	 Viresh Kumar <viresh.kumar@linaro.org>,
	linux-pm@vger.kernel.org, yosry@kernel.org
Cc: Jim Mattson <jmattson@google.com>
Subject: [PATCH v2 1/3] x86/CPU/AMD: Avoid racy updates to MSR_K7_HWCR in set_cpuid_faulting()
Date: Fri, 12 Jun 2026 14:53:17 -0700	[thread overview]
Message-ID: <20260612215729.1532175-2-jmattson@google.com> (raw)
In-Reply-To: <20260612215729.1532175-1-jmattson@google.com>

Since msr_set_bit() and msr_clear_bit() perform a non-atomic update to an
MSR, they can race with a write to the same MSR from interrupt context.

On AMD CPUs, set_cpuid_faulting() uses these functions to modify
MSR_K7_HWCR from process context, with preemption disabled but interrupts
enabled. The acpi-cpufreq driver's boost_set_msr() modifies HWCR from
interrupt context. If a crosscall IPI arrives between
set_cpuid_faulting()'s read and write of MSR_K7_HWCR and toggles the Core
Performance Boost disable bit (CPB_DIS), the IPI's update is lost.

This race has been observed empirically on a Turin system running the
acpi-cpufreq driver with a synthetic test. One thread repeatedly toggles
/sys/devices/system/cpu/cpufreq/boost and verifies CPB_DIS on CPU0 after
each write. A second thread pinned to CPU0 calls arch_prctl(ARCH_SET_CPUID,
<val>), with alternating <val>s of 0 and 1. CPB_DIS bit changes are
sometimes lost.

Introduce amd_update_hwcr() to perform an interrupt-safe
read-modify-write of MSR_K7_HWCR, and use it in set_cpuid_faulting() to
prevent races with HWCR updates in interrupt context. Note that when
set_cpuid_faulting() is called from __switch_to_xtra(), interrupts are
already disabled, so the race is only possible on the arch_prctl()
paths.

Reported-by: Sashiko (gemini/gemini-3.1-pro-preview)
Closes: https://lore.kernel.org/all/20260609211611.466231-1-jmattson@google.com/
Suggested-by: Borislav Petkov <bp@alien8.de>
Fixes: 65f55a301766 ("x86/CPU/AMD: Add CPUID faulting support")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/include/asm/processor.h |  2 ++
 arch/x86/kernel/cpu/amd.c        | 42 ++++++++++++++++++++++++++++++++
 arch/x86/kernel/process.c        |  4 +--
 4 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 86554de9a3f5..18c4be75e927 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -899,6 +899,7 @@
 #define MSR_K7_HWCR_IRPERF_EN_BIT	30
 #define MSR_K7_HWCR_IRPERF_EN		BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT)
 #define MSR_K7_HWCR_CPUID_USER_DIS_BIT	35
+#define MSR_K7_HWCR_CPUID_USER_DIS	BIT_ULL(MSR_K7_HWCR_CPUID_USER_DIS_BIT)
 #define MSR_K7_FID_VID_CTL		0xc0010041
 #define MSR_K7_FID_VID_STATUS		0xc0010042
 #define MSR_K7_HWCR_CPB_DIS_BIT		25
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 67dd932305db..d2ad7ac24e02 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -716,9 +716,11 @@ static __always_inline void amd_clear_divider(void)
 }
 
 extern void amd_check_microcode(void);
+extern int amd_update_hwcr(u64 clear, u64 set);
 #else
 static inline void amd_clear_divider(void)		{ }
 static inline void amd_check_microcode(void)		{ }
+static inline int amd_update_hwcr(u64 clear, u64 set) { return -ENODEV; }
 #endif
 
 extern unsigned long arch_align_stack(unsigned long sp);
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 31f01e9c7114..abb6f755be98 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -1319,6 +1319,48 @@ void amd_check_microcode(void)
 		on_each_cpu(zenbleed_check_cpu, NULL, 1);
 }
 
+/**
+ * amd_update_hwcr - Update MSR_K7_HWCR on the executing CPU
+ * @clear: Bits to clear
+ * @set: Bits to set
+ *
+ * MSR_K7_HWCR is written from both process context (e.g. CPUID faulting
+ * updates via arch_prctl(ARCH_SET_CPUID)) and interrupt context (e.g.
+ * Core Performance Boost updates IPI'd by the acpi-cpufreq driver), so
+ * a read-modify-write of the MSR must be performed with interrupts
+ * disabled to avoid losing an update made by an intervening interrupt.
+ * All runtime (non-initialization) updates of MSR_K7_HWCR should go
+ * through this helper.
+ *
+ * Bits in @set take precedence over bits in @clear.
+ *
+ * Context: Any context except NMI. Disabling interrupts does not
+ *          serialize against an NMI, so NMI handlers must not write
+ *          MSR_K7_HWCR.
+ *
+ * Return: 0 on success, negative error code if an MSR access faults.
+ */
+int amd_update_hwcr(u64 clear, u64 set)
+{
+	unsigned long flags;
+	u64 oldval, newval;
+	int ret;
+
+	local_irq_save(flags);
+	ret = rdmsrq_safe(MSR_K7_HWCR, &oldval);
+	if (ret)
+		goto out;
+
+	newval = (oldval & ~clear) | set;
+
+	if (newval != oldval)
+		ret = wrmsrq_safe(MSR_K7_HWCR, newval);
+out:
+	local_irq_restore(flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(amd_update_hwcr);
+
 static const char * const s5_reset_reason_txt[] = {
 	[0]  = "thermal pin BP_THERMTRIP_L was tripped",
 	[1]  = "power button was pressed for 4 seconds",
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 4c718f8adc59..08ef205f6b7f 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -355,9 +355,9 @@ static void set_cpuid_faulting(bool on)
 		wrmsrq(MSR_MISC_FEATURES_ENABLES, msrval);
 	} else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
 		if (on)
-			msr_set_bit(MSR_K7_HWCR, MSR_K7_HWCR_CPUID_USER_DIS_BIT);
+			amd_update_hwcr(0, MSR_K7_HWCR_CPUID_USER_DIS);
 		else
-			msr_clear_bit(MSR_K7_HWCR, MSR_K7_HWCR_CPUID_USER_DIS_BIT);
+			amd_update_hwcr(MSR_K7_HWCR_CPUID_USER_DIS, 0);
 	}
 }
 
-- 
2.54.0.1136.gdb2ca164c4-goog


  reply	other threads:[~2026-06-12 21:57 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12 21:53 [PATCH v2 0/3] Fix three racy updates to MSR_K7_HWCR Jim Mattson
2026-06-12 21:53 ` Jim Mattson [this message]
2026-06-12 22:30   ` [PATCH v2 1/3] x86/CPU/AMD: Avoid racy updates to MSR_K7_HWCR in set_cpuid_faulting() Yosry Ahmed
2026-06-16 21:54     ` Jim Mattson
2026-06-12 21:53 ` [PATCH v2 2/3] x86/mce/inject: Avoid racy updates to MSR_K7_HWCR in toggle_hw_mce_inject() Jim Mattson
2026-06-12 21:53 ` [PATCH v2 3/3] cpufreq: ACPI: Avoid racy updates to MSR_K7_HWCR in boost_set_msr() Jim Mattson
2026-06-12 23:01 ` [PATCH v2 0/3] Fix three racy updates to MSR_K7_HWCR Andrew Cooper
2026-06-12 23:21   ` Jim Mattson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260612215729.1532175-2-jmattson@google.com \
    --to=jmattson@google.com \
    --cc=bp@alien8.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=tglx@kernel.org \
    --cc=viresh.kumar@linaro.org \
    --cc=x86@kernel.org \
    --cc=yosry@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox