* Re: [PATCH] mm/memory-failure: trace: change memory_failure_event to ras subsystem
From: Lance Yang @ 2026-06-08 12:14 UTC (permalink / raw)
To: xieyuanbin1
Cc: david, qiuxu.zhuo, bp, akpm, rostedt, linmiaohe, nao.horiguchi,
mhiramat, mchehab+huawei, tony.luck, yi1.lai, linux-edac,
linux-kernel, linux-mm, linux-trace-kernel, torvalds, lilinjie8,
liaohua4, Lance Yang
In-Reply-To: <20260605081213.154660-1-xieyuanbin1@huawei.com>
On Fri, Jun 05, 2026 at 04:12:13PM +0800, Xie Yuanbin wrote:
>For historical version, commit 97f0b1345219 ("tracing: add trace event
>for memory-failure") introduced memory_failure_event in ras subsystem.
>commit 31807483d395 ("mm/memory-failure: remove the selection of RAS")
>changed memory_failure_event to memory_failure subsystem. This breaks
>the backward compatibility, some user programs rely on it.
>
>Change memory_failure_event to ras subsystem to keep backward
>compatibility.
>
>Fixes: 31807483d395 ("mm/memory-failure: remove the selection of RAS")
>
>Reported-by: Yi Lai <yi1.lai@intel.com>
>Reported-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>Closes: https://lore.kernel.org/linux-mm/CY8PR11MB7134346A3E4BB28ECA28D6E989132@CY8PR11MB7134.namprd11.prod.outlook.com
>Cc: David Hildenbrand <david@kernel.org>
>Cc: Steven Rostedt <rostedt@goodmis.org>
>Cc: Borislav Petkov <bp@alien8.de>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Miaohe Lin <linmiaohe@huawei.com>
>Signed-off-by: Xie Yuanbin <xieyuanbin1@huawei.com>
>---
> include/trace/events/memory-failure.h | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
>diff --git a/include/trace/events/memory-failure.h b/include/trace/events/memory-failure.h
>index aa57cc8f896b..7a8ee5d1a44e 100644
>--- a/include/trace/events/memory-failure.h
>+++ b/include/trace/events/memory-failure.h
>@@ -1,6 +1,10 @@
> /* SPDX-License-Identifier: GPL-2.0 */
> #undef TRACE_SYSTEM
>-#define TRACE_SYSTEM memory_failure
>+/*
>+ * For historical versions, memory_failure_event is in ras subsystem,
>+ * some user programs depend on it.
>+ */
>+#define TRACE_SYSTEM ras
> #define TRACE_INCLUDE_FILE memory-failure
>
> #if !defined(_TRACE_MEMORY_FAILURE_H) || defined(TRACE_HEADER_MULTI_READ)
>--
Thanks. Feel free to add:
Reviewed-by: Lance Yang <lance.yang@linux.dev>
^ permalink raw reply
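User-space consumers that hard-code the tracefs path of memory_failure_event are what break when the event moves between subsystems. As a minimal sketch of how such a tool could tolerate either location, the helper below probes both directories. The function name find_event_dir() and the idea of passing the tracefs mount point as a parameter are illustrative assumptions, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Candidate subsystems, in order of preference: "ras" is the historical
 * location, "memory_failure" the one used after commit 31807483d395.
 */
static const char *const subsystems[] = { "ras", "memory_failure" };

/*
 * Hypothetical helper: write the first existing event directory below
 * tracefs_root into buf. Returns 0 on success, -1 if the event is not
 * found under any candidate subsystem.
 */
int find_event_dir(const char *tracefs_root, char *buf, size_t len)
{
	assert(tracefs_root && buf && len > 0);

	for (size_t i = 0; i < sizeof(subsystems) / sizeof(subsystems[0]); i++) {
		int n = snprintf(buf, len, "%s/events/%s/memory_failure_event",
				 tracefs_root, subsystems[i]);

		if (n < 0 || (size_t)n >= len)
			continue;	/* path did not fit, try the next one */
		if (access(buf, F_OK) == 0)
			return 0;	/* directory exists */
	}
	return -1;
}
```

A caller would typically pass the tracefs mount point (commonly /sys/kernel/tracing, which is an assumption about the target system) and fall back to an error message when -1 is returned.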
* Re: [PATCH] mm/memory-failure: trace: change memory_failure_event to ras subsystem
From: Miaohe Lin @ 2026-06-08 11:27 UTC (permalink / raw)
To: Xie Yuanbin
Cc: linux-edac, linux-kernel, linux-mm, linux-trace-kernel, torvalds,
lilinjie8, liaohua4, david, qiuxu.zhuo, bp, akpm, rostedt,
nao.horiguchi, mhiramat, mchehab+huawei, tony.luck, yi1.lai
In-Reply-To: <20260605081213.154660-1-xieyuanbin1@huawei.com>
On 2026/6/5 16:12, Xie Yuanbin wrote:
> For historical version, commit 97f0b1345219 ("tracing: add trace event
> for memory-failure") introduced memory_failure_event in ras subsystem.
> commit 31807483d395 ("mm/memory-failure: remove the selection of RAS")
> changed memory_failure_event to memory_failure subsystem. This breaks
> the backward compatibility, some user programs rely on it.
>
> Change memory_failure_event to ras subsystem to keep backward
> compatibility.
>
> Fixes: 31807483d395 ("mm/memory-failure: remove the selection of RAS")
With David's comment addressed:
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Thanks.
.
^ permalink raw reply
* Re: [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove()
From: Shubhrajyoti Datta @ 2026-06-08 11:11 UTC (permalink / raw)
To: Prasanna Kumar T S M
Cc: ssengar, shubhrajyoti.datta, bp, tony.luck, linux-edac,
linux-kernel
In-Reply-To: <20260401111836.2342918-1-ptsm@linux.microsoft.com>
On Wed, Apr 1, 2026 at 4:56 PM Prasanna Kumar T S M
<ptsm@linux.microsoft.com> wrote:
>
> The teardown sequence in mc_remove() does not mirror the reverse of the
> initialization order in mc_probe(). In particular,
> unregister_rpmsg_driver() is called before remove_versalnet(), and
> cdx_mcdi_finish() is called after rproc_shutdown().
>
> Reorder mc_remove() to reverse the probe initialization sequence,
> consistent with the probe error-unwind paths.
I think the remote processor should be quiesced first, so that no
more messages are queued, and only then should the EDAC side be
removed. See below.
>
> The rproc reference acquired via rproc_get_by_phandle() during probe
> is not released in mc_remove(), causing a reference count leak. Add
> the missing rproc_put() call.
>
> Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
> Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
> ---
> drivers/edac/versalnet_edac.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c
> index b87fe57aa842..acd51b492772 100644
> --- a/drivers/edac/versalnet_edac.c
> +++ b/drivers/edac/versalnet_edac.c
> @@ -955,10 +955,11 @@ static void mc_remove(struct platform_device *pdev)
> {
> struct mc_priv *priv = platform_get_drvdata(pdev);
>
> - unregister_rpmsg_driver(&amd_rpmsg_driver);
> remove_versalnet(priv);
Here we are removing the EDAC side while the remoteproc can still be
triggered and invoke the remote callback.
> - rproc_shutdown(priv->mcdi->r5_rproc);
> cdx_mcdi_finish(priv->mcdi);
> + unregister_rpmsg_driver(&amd_rpmsg_driver);
> + rproc_shutdown(priv->mcdi->r5_rproc);
> + rproc_put(priv->mcdi->r5_rproc);
The put is a valid fix.
> kfree(priv->mcdi);
> }
>
> --
> 2.49.0
>
>
^ permalink raw reply
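The principle the patch description appeals to, that remove should undo the probe steps in reverse and match the probe error-unwind path, can be sketched in isolation. The step names below are placeholders rather than the driver's real calls, and the sketch does not settle the question raised in the review of whether the remoteproc side should be quiesced before the EDAC side is removed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for resources a probe routine sets up, in order. */
const char *const steps[] = {
	"get_rproc", "boot_rproc", "register_msg_driver", "register_edac",
};
#define NSTEPS (sizeof(steps) / sizeof(steps[0]))

/* Log of teardown calls, so the resulting order can be inspected. */
const char *undo_log[NSTEPS];
size_t undo_count;

static void undo(size_t i)
{
	assert(undo_count < NSTEPS);
	undo_log[undo_count++] = steps[i];
}

/*
 * Set up the steps in order. fail_at == NSTEPS means no failure.
 * If step fail_at fails, unwind the steps already done in reverse order.
 * Returns the number of steps left set up.
 */
size_t probe(size_t fail_at)
{
	size_t done;

	undo_count = 0;
	for (done = 0; done < NSTEPS; done++) {
		if (done == fail_at) {
			while (done > 0)
				undo(--done);
			return 0;
		}
	}
	return done;
}

/* Tear down 'done' steps, last one first, mirroring the unwind path. */
void remove_all(size_t done)
{
	undo_count = 0;
	while (done > 0)
		undo(--done);
}
```

Because probe() and remove_all() share the same reverse loop, a resource acquired during setup cannot be forgotten in only one of the two paths, which is the class of bug the missing rproc_put() falls into.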
* Re: [PATCH 0/8] ras: aest: extend AEST support to Device Tree frontend
From: Umang Chheda @ 2026-06-08 9:45 UTC (permalink / raw)
To: Ruidong Tian, Ruidong Tian, Tony Luck, Borislav Petkov,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Bjorn Andersson,
Konrad Dybcio, catalin.marinas, will, lpieralisi, rafael,
mark.rutland, Sudeep Holla
Cc: linux-arm-msm, linux-acpi, linux-arm-kernel, linux-edac,
linux-kernel, devicetree, Faruque Ansari
In-Reply-To: <4eeeef74-8a27-470c-b516-095f029b9e9e@linux.alibaba.com>
Hi Ruidong,
On 6/2/2026 12:59 PM, Ruidong Tian wrote:
> Hi Umang,
>
> I have sent out v7, and I wanted to highlight a few changes to make it
> easier for you to adapt the devicetree support:
>
> 1. I stopped passing device information to the driver through the
> acpi_aest_node structure. Instead, I switched to using the device
> property infrastructure and removed the aest_device abstraction layer
> (which was originally introduced to support CMN). This should provide
> good compatibility between ACPI and devicetree and avoid the need to
> write extra adaptation code for devicetree (such as aest_of). In the
> ideal case, adding just one of_match_id should be enough to make it
> work, although this will require you to update the DTB file accordingly.
>
> 2. I removed the use of genpool. The current AEST driver only needs
> memory in interrupt context, so genpool is not needed.
>
> 3. The driver has been renamed to arm64_ras.
>
> I have already applied some of your previous fix patches and added your
> Signed-off-by.
>
>
> Best regards,
> Ruidong
Thanks for this summary and for re-factoring the code to make it easy to
adapt for DT as well.
>
>
> On 2026/5/5 20:23, Umang Chheda wrote:
>> This series extends Tian Ruidong’s [1] ACPI-based AEST support series
>> to also cover Device Tree based platforms.
>>
>> While the existing AEST driver relies on the AEST ACPI table [3], many
>> embedded Arm platforms use Device Tree exclusively and cannot use the
>> driver today. This series adds a DT frontend that mirrors the ACPI
>> implementation and feeds the same core driver, keeping ACPI and DT
>> paths functionally equivalent.
>>
>> Along the way, several correctness issues were identified in the core
>> driver and are fixed in the first part of this series.
>>
>> The DT frontend is mutually exclusive with ACPI and does not introduce
>> any DT-specific logic into the core.
>>
>> How to test with QEMU
>> --------------------------
>> Tian Ruidong's QEMU fork [2] emulates AEST MMIO error records on the
>> virt machine. To test the DT frontend:
>>
>> 1. Build QEMU:
>>
>> git clone https://github.com/winterddd/qemu.git
>> cd qemu
>> git checkout c5e2d5dec9fd62ba622314c40bff0fbecb4dfb34
>> ./configure --target-list=aarch64-softmmu
>> make -j$(nproc)
>>
>> 2. Build the kernel with:
>>
>> CONFIG_OF_AEST=y
>> CONFIG_AEST=y
>> CONFIG_ARM64_RAS_EXTN=y
>> CONFIG_RAS=y
>>
>> 3. Add the following DT node to your virt machine DTB. The QEMU
>> fork maps DRAM error records at 0x090d0000 (SPI 44) and CMN
>> vendor records at 0x090e0000 (SPI 45):
>>
>> aest {
>> compatible = "arm,aest";
>> #address-cells = <2>;
>> #size-cells = <2>;
>> ranges;
>> interrupt-parent = <&gic>;
>>
>> /* DRAM memory node — MMIO at 0x090d0000, SPI 44 */
>> aest-dram0@90d0000 {
>> compatible = "arm,aest-memory";
>> arm,interface-type = <1>;
>> arm,group-format = <0>;
>> arm,interface-flags = <0x22>;
>> arm,num-records = <4>;
>> arm,record-impl = /bits/ 64 <0x0>;
>> arm,status-report = /bits/ 64 <0x0>;
>> arm,addr-mode = /bits/ 64 <0x0>;
>> arm,proximity-domain = <0>;
>> reg = <0x0 0x090d0000 0x0 0x1000>,
>> <0x0 0x090d0800 0x0 0x200>,
>> <0x0 0x090d0e00 0x0 0x100>;
>> reg-names = "errblock", "fault-inject",
>> "err-group";
>> interrupts = <GIC_SPI 44
>> IRQ_TYPE_LEVEL_HIGH>;
>> interrupt-names = "fhi";
>> };
>> };
>>
>> 4. Boot QEMU with acpi=off:
>>
>> ./qemu-system-aarch64 \
>> -machine virt,accel=tcg,gic-version=3 \
>> -cpu cortex-a57 -m 2G -smp 4 \
>> -kernel Image -dtb virt-aest.dtb \
>> -append "console=ttyAMA0 acpi=off earlycon" \
>> -nographic
>>
>> 5. Verify probe:
>>
>> dmesg | grep "DT AEST"
>> # Expected: DT AEST: registered 1 AEST error source(s) from DT
>> ls /sys/kernel/debug/aest/
>>
>> 6. Inject a CE error via the QEMU MMIO fault injection registers.
>> The QEMU device accepts 64-bit accesses only (use devmem with
>> the 64-bit width flag):
>>
>> devmem 0x090d0808 64 0x80000040 # CDOFF | CE inject
>>
>> This triggers QEMU's error_record_inj_write() which sets
>> ERR<n>STATUS.V=1 and asserts the IRQ. The kernel driver's
>> aest_irq_func() fires, reads the status, and logs:
>>
>> AEST: {1}[Hardware Error]: Hardware error from AEST memory.90d0000
>> AEST: {1}[Hardware Error]: Error from memory at SRAT proximity
>> domain 0x0
>>
>> Testing
>> -------
>> - Validated on Qualcomm's lemans-evk and monaco-evk board with DT boot.
>> - Validated CE and UE injection via debugfs soft_inject.
>> - Tested ACPI path is unaffected: ACPI boot continues to use
>> drivers/acpi/arm64/aest.c unchanged.
>>
>> [1] https://lore.kernel.org/lkml/20260122094656.73399-1-tianruidong@linux.alibaba.com/
>> [2] https://github.com/winterddd/qemu/tree/error_record
>> [3] https://developer.arm.com/documentation/den0085/0200/
>>
>> Signed-off-by: Umang Chheda <umang.chheda@oss.qualcomm.com>
>> ---
>> Umang Chheda (8):
>> ras: aest: Fix shared processor node handling and error log
>> messages
>> ras: aest: Fix CE/UE error counts not incrementing in debugfs
>> ras: aest: Skip unimplemented records in debugfs
>> ras: aest: Add panic_on_ue module parameter
>> dt-bindings: arm: ras: Introduce bindings for ARM AEST
>> ras: aest: Add DT frontend for ARM AEST RAS error sources
>> arm64: dts: qcom: lemans: add AEST error nodes
>> arm64: dts: qcom: monaco: add AEST error nodes
>>
>> .../devicetree/bindings/arm/arm,aest.yaml | 406 +++++++++++++
>> arch/arm64/boot/dts/qcom/lemans.dtsi | 41 ++
>> arch/arm64/boot/dts/qcom/monaco.dtsi | 41 ++
>> drivers/ras/aest/Kconfig | 15 +-
>> drivers/ras/aest/Makefile | 2 +
>> drivers/ras/aest/aest-core.c | 63 +-
>> drivers/ras/aest/aest-of.c | 673 +++++++++++++++++++++
>> drivers/ras/aest/aest-sysfs.c | 27 +-
>> drivers/ras/aest/aest.h | 15 +-
>> include/dt-bindings/arm/aest.h | 43 ++
>> 10 files changed, 1310 insertions(+), 16 deletions(-)
>> ---
>> base-commit: a67b7fd0dd1f6ccf3d128dc2099cdb07af1f6a09
>> change-id: 20260505-aest-devicetree-support-a3722d90e1f5
>> prerequisite-message-id: <20260122094656.73399-1-tianruidong@linux.alibaba.com>
>> prerequisite-patch-id: c5a7c6431c6c1e6351241e694ee053800039d41d
>> prerequisite-patch-id: 1f6e2c20829eee41a210dd8a538f1e8efcc65872
>> prerequisite-patch-id: 5556287e3f46c2ed2c0431c53c7782e87bcbd866
>> prerequisite-patch-id: 2edae0a136d7779b8f686181720e71d044a73311
>> prerequisite-patch-id: b5190b2844dcb01e72f87a59f3a29548795fdb82
>> prerequisite-patch-id: 7ba848583708b2ae776a7ce847bb056e3de7f77b
>> prerequisite-patch-id: 397e5b22802b67942435f4f2968f0b1e210ba0e8
>> prerequisite-patch-id: 2169f4b65537eecbd0ccbd2ad6b28c64ec44655d
>> prerequisite-patch-id: b626f85d98747595b3240bc49e6ad9c9dd5c0fa9
>> prerequisite-patch-id: 1323dfd2eebad2ef6514dbbce58ba08e8859f894
>> prerequisite-patch-id: 95b826e5e329408437a3ef336c4f45d4d74f82bb
>> prerequisite-patch-id: b60ff489a5a33c5d5220fa8144af7b7511769cba
>> prerequisite-patch-id: 43f35a52b8a3d13c938ff08083403c1d3bd0df8b
>> prerequisite-patch-id: c55d4e9117ca36d3c2cba82d550a618cb82bb745
>> prerequisite-patch-id: 3885e10f318ae8101d6909b35d92a976cc359e3c
>> prerequisite-patch-id: 92958cde05577f069c5659018a274bb39cfb6b24
>>
>> Best regards,
>> --
>> Umang Chheda <umang.chheda@oss.qualcomm.com>
>>
>
^ permalink raw reply
* [PATCH] EDAC/amd64: Fix incorrect Node ID mapping on CPU-only Zen4+ systems
From: Phineas Su @ 2026-06-08 7:26 UTC (permalink / raw)
To: yazen.ghannam, bp, tony.luck
Cc: muralidhara.mk, linux-edac, linux-kernel, Phineas Su
On CPU-only systems using AMD Zen4 (Genoa) and newer processors, memory
ECC errors can be reported with an incorrect Node ID (incremented by 1).
This happens because these CPUs use SMCA_UMC_V2 banks, which triggers
the GPU node ID fixup logic in fixup_node_id().
Since it is a CPU-only system, gpu_node_map is not initialized and
gpu_node_map.base_node_id is 0. The check (nid < gpu_node_map.base_node_id)
evaluates to (nid < 0), which is always false for the unsigned u8 nid.
The function then incorrectly applies the fixup (nid - 0 + 1), resulting
in the Node ID being incremented.
Fix this by ensuring that gpu_node_map has been initialized (i.e.,
node_count is non-zero) before applying any GPU-specific fixups.
Fixes: 4251566ebc1c ("EDAC/amd64: Cache and use GPU node map")
Signed-off-by: Phineas Su <pohaosu@google.com>
---
drivers/edac/amd64_edac.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index c6aa69dbd9fb..1e688123a50c 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1047,6 +1047,10 @@ static int fixup_node_id(int node_id, struct mce *m)
if (smca_get_bank_type(m->extcpu, m->bank) != SMCA_UMC_V2)
return node_id;
+ /* If no GPU nodes are present, no fixup is needed. */
+ if (!gpu_node_map.node_count)
+ return node_id;
+
/* Nodes below the GPU base node are CPU nodes and don't need a fixup. */
if (nid < gpu_node_map.base_node_id)
return node_id;
--
2.54.0.1032.g2f8565e1d1-goog
^ permalink raw reply related
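The arithmetic described in the EDAC/amd64 commit message above can be reproduced in user space. This is a simplified sketch: the struct and function names are hypothetical, the bank-type check is omitted, and the hardware node ID is passed in directly instead of being extracted from the MCE record.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the cached GPU node map; all zero on CPU-only systems. */
struct gpu_node_map_sketch {
	uint8_t node_count;
	uint8_t base_node_id;
};

/* Behaviour described as buggy: nothing checks that the map was initialised. */
int fixup_without_guard(const struct gpu_node_map_sketch *map, int node_id, uint8_t nid)
{
	assert(map);

	/* With base_node_id == 0 this unsigned comparison is never true. */
	if (nid < map->base_node_id)
		return node_id;

	/* Convert the hardware node ID to a logical one. */
	return nid - map->base_node_id + 1;
}

/* Behaviour after the fix: an empty map means there is nothing to fix up. */
int fixup_with_guard(const struct gpu_node_map_sketch *map, int node_id, uint8_t nid)
{
	assert(map);

	if (!map->node_count)
		return node_id;

	return fixup_without_guard(map, node_id, nid);
}
```

With a zeroed map and a node ID of 2, the unguarded version returns 3 while the guarded version returns 2. When the map is populated, the two versions agree.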
* Re: [PATCH v3 00/11] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: K Prateek Nayak @ 2026-06-08 5:59 UTC (permalink / raw)
To: Juergen Gross, linux-kernel, linux-pm, x86, linux-edac,
linux-hwmon, linux-perf-users
Cc: Huang Rui, Mario Limonciello, Perry Yuan, Rafael J. Wysocki,
Viresh Kumar, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Tony Luck, Guenter Roeck,
Daniel Lezcano, Zhang Rui, Lukasz Luba, Peter Zijlstra,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
James Clark
In-Reply-To: <20260608051741.3207435-1-jgross@suse.com>
Hello Juergen,
On 6/8/2026 10:47 AM, Juergen Gross wrote:
> Drop the variants using 2 32-bit values instead of a single 64-bit one
> of the *_on_cpu() MSR access functions.
>
> Changes in V2:
> - patches 1+2 split out from other patch
> - keep the *q() variants instead of those without suffix
>
> Changes in V3:
> - V3 patch 7 split out from V2 patch 7
I've taken this series for a spin on top of tip/master and haven't
noticed anything unusual on either bare metal or an i386 QEMU
guest. Feel free to include:
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
--
Thanks and Regards,
Prateek
^ permalink raw reply
* [PATCH v3 05/11] x86/msr: Switch wrmsr_on_cpu() users to wrmsrq_on_cpu()
From: Juergen Gross @ 2026-06-08 5:17 UTC (permalink / raw)
To: linux-kernel, x86, linux-perf-users, linux-edac, linux-pm
Cc: Juergen Gross, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
James Clark, Thomas Gleixner, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Tony Luck, Rafael J. Wysocki, Viresh Kumar,
Daniel Lezcano, Zhang Rui, Lukasz Luba
In-Reply-To: <20260608051741.3207435-1-jgross@suse.com>
In order to prepare retiring wrmsr_on_cpu() switch wrmsr_on_cpu() users
to wrmsrq_on_cpu().
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
---
V2:
- instead of changing wrmsr_on_cpu(), use wrmsrq_on_cpu() (Ingo Molnar)
---
arch/x86/events/intel/ds.c | 11 ++++-------
arch/x86/include/asm/msr.h | 2 +-
arch/x86/kernel/cpu/mce/inject.c | 2 +-
drivers/cpufreq/p4-clockmod.c | 4 ++--
drivers/cpufreq/speedstep-centrino.c | 4 ++--
drivers/thermal/intel/x86_pkg_temp_thermal.c | 2 +-
6 files changed, 11 insertions(+), 14 deletions(-)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7f0d515c07c5..5b9c01383f49 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -780,9 +780,7 @@ void init_debug_store_on_cpu(int cpu)
if (!ds)
return;
- wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA,
- (u32)((u64)(unsigned long)ds),
- (u32)((u64)(unsigned long)ds >> 32));
+ wrmsrq_on_cpu(cpu, MSR_IA32_DS_AREA, (u64)(unsigned long)ds);
}
void fini_debug_store_on_cpu(int cpu)
@@ -790,7 +788,7 @@ void fini_debug_store_on_cpu(int cpu)
if (!per_cpu(cpu_hw_events, cpu).ds)
return;
- wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA, 0, 0);
+ wrmsrq_on_cpu(cpu, MSR_IA32_DS_AREA, 0);
}
static DEFINE_PER_CPU(void *, insn_buffer);
@@ -1095,8 +1093,7 @@ void init_arch_pebs_on_cpu(int cpu)
* contiguous physical buffer (__alloc_pages_node() with order)
*/
arch_pebs_base = virt_to_phys(cpuc->pebs_vaddr) | PEBS_BUFFER_SHIFT;
- wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, (u32)arch_pebs_base,
- (u32)(arch_pebs_base >> 32));
+ wrmsrq_on_cpu(cpu, MSR_IA32_PEBS_BASE, arch_pebs_base);
x86_pmu.pebs_active = 1;
}
@@ -1105,7 +1102,7 @@ inline void fini_arch_pebs_on_cpu(int cpu)
if (!x86_pmu.arch_pebs)
return;
- wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, 0, 0);
+ wrmsrq_on_cpu(cpu, MSR_IA32_PEBS_BASE, 0);
}
/*
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 22f914f7affe..6e0d7a6335ff 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -291,7 +291,7 @@ static inline void rdmsr_on_cpus(const struct cpumask *m, u32 msr_no,
static inline void wrmsr_on_cpus(const struct cpumask *m, u32 msr_no,
struct msr __percpu *msrs)
{
- wrmsr_on_cpu(0, msr_no, raw_cpu_read(msrs->l), raw_cpu_read(msrs->h));
+ wrmsrq_on_cpu(0, msr_no, raw_cpu_read(msrs->q));
}
static inline int rdmsr_safe_on_cpu(unsigned int cpu, u32 msr_no,
u32 *l, u32 *h)
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index bee9c35762b8..6d30e7720f31 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -327,7 +327,7 @@ static int toggle_hw_mce_inject(unsigned int cpu, bool enable)
enable ? (val.l |= BIT(18)) : (val.l &= ~BIT(18));
- err = wrmsr_on_cpu(cpu, MSR_K7_HWCR, val.l, val.h);
+ err = wrmsrq_on_cpu(cpu, MSR_K7_HWCR, val.q);
if (err)
pr_err("%s: error writing HWCR\n", __func__);
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index d96e8b665f39..c1690aa48193 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -68,7 +68,7 @@ static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate)
rdmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &val.q);
if (newstate == DC_DISABLE) {
pr_debug("CPU#%d disabling modulation\n", cpu);
- wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.l & ~(1<<4), val.h);
+ wrmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.q & ~(1ULL << 4));
} else {
pr_debug("CPU#%d setting duty cycle to %d%%\n",
cpu, ((125 * newstate) / 10));
@@ -79,7 +79,7 @@ static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate)
*/
val.l = (val.l & ~14);
val.l = val.l | (1<<4) | ((newstate & 0x7)<<1);
- wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.l, val.h);
+ wrmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.q);
}
return 0;
diff --git a/drivers/cpufreq/speedstep-centrino.c b/drivers/cpufreq/speedstep-centrino.c
index cefee19d1100..9237ed8f2b1f 100644
--- a/drivers/cpufreq/speedstep-centrino.c
+++ b/drivers/cpufreq/speedstep-centrino.c
@@ -475,7 +475,7 @@ static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
oldmsr.l |= msr;
}
- wrmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, oldmsr.l, oldmsr.h);
+ wrmsrq_on_cpu(good_cpu, MSR_IA32_PERF_CTL, oldmsr.q);
if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY)
break;
@@ -491,7 +491,7 @@ static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
*/
for_each_cpu(j, covered_cpus)
- wrmsr_on_cpu(j, MSR_IA32_PERF_CTL, oldmsr.l, oldmsr.h);
+ wrmsrq_on_cpu(j, MSR_IA32_PERF_CTL, oldmsr.q);
}
retval = 0;
diff --git a/drivers/thermal/intel/x86_pkg_temp_thermal.c b/drivers/thermal/intel/x86_pkg_temp_thermal.c
index 2e7de8cf756d..144603c356a0 100644
--- a/drivers/thermal/intel/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/intel/x86_pkg_temp_thermal.c
@@ -167,7 +167,7 @@ sys_set_trip_temp(struct thermal_zone_device *tzd,
v.l |= intr;
}
- return wrmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, v.l, v.h);
+ return wrmsrq_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, v.q);
}
/* Thermal zone callback registry */
--
2.54.0
^ permalink raw reply related
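The conversions in the wrmsrq_on_cpu() patch above rely on the low/high halves and the 64-bit value being two views of the same storage. The following user-space sketch mocks that with a union modelled on the .l, .h and .q members used in the diff. The type and function names are hypothetical, and the layout assumes a little-endian machine such as x86.

```c
#include <assert.h>
#include <stdint.h>

/* User-space mock: two 32-bit halves overlaying one 64-bit value. */
struct msr_sketch {
	union {
		struct {
			uint32_t l;	/* low 32 bits (little-endian layout) */
			uint32_t h;	/* high 32 bits */
		};
		uint64_t q;
	};
};

static_assert(sizeof(struct msr_sketch) == sizeof(uint64_t),
	      "halves must overlay the 64-bit value exactly");

/* Old calling style: the 64-bit value is assembled from two halves. */
uint64_t combine_halves(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Clear bit 4 by masking the low half, as the two-argument form did. */
uint64_t clear_bit4_halves(struct msr_sketch v)
{
	return combine_halves(v.l & ~(1u << 4), v.h);
}

/* Clear bit 4 on the 64-bit value; the mask must itself be 64 bits wide. */
uint64_t clear_bit4_q(struct msr_sketch v)
{
	return v.q & ~(1ULL << 4);
}
```

Both helpers give the same result. The 64-bit form needs a 64-bit mask: an unsigned 32-bit mask such as ~(1u << 4) would be zero-extended and would also clear the upper half, which is why the 1ULL spelling matters in the p4-clockmod hunk.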
* [PATCH v3 03/11] x86/msr: Switch rdmsr_on_cpu() users to rdmsrq_on_cpu()
From: Juergen Gross @ 2026-06-08 5:17 UTC (permalink / raw)
To: linux-kernel, x86, linux-edac, linux-pm, linux-hwmon
Cc: Juergen Gross, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Tony Luck, Rafael J. Wysocki,
Viresh Kumar, Guenter Roeck, Daniel Lezcano, Zhang Rui,
Lukasz Luba
In-Reply-To: <20260608051741.3207435-1-jgross@suse.com>
In order to prepare retiring rdmsr_on_cpu() switch rdmsr_on_cpu() users
to rdmsrq_on_cpu().
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
---
V2:
- instead of changing rdmsr_on_cpu(), use rdmsrq_on_cpu() (Ingo Molnar)
---
arch/x86/include/asm/msr.h | 2 +-
arch/x86/kernel/cpu/mce/amd.c | 6 ++--
arch/x86/kernel/cpu/mce/inject.c | 8 ++---
drivers/cpufreq/amd_freq_sensitivity.c | 6 ++--
drivers/cpufreq/p4-clockmod.c | 32 ++++++++++----------
drivers/cpufreq/speedstep-centrino.c | 27 +++++++++--------
drivers/hwmon/coretemp.c | 12 ++++----
drivers/thermal/intel/x86_pkg_temp_thermal.c | 25 ++++++++-------
8 files changed, 58 insertions(+), 60 deletions(-)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index fddadbc625be..d5985d6fdaf9 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -292,7 +292,7 @@ static inline int wrmsrq_on_cpu(unsigned int cpu, u32 msr_no, u64 q)
static inline void rdmsr_on_cpus(const struct cpumask *m, u32 msr_no,
struct msr __percpu *msrs)
{
- rdmsr_on_cpu(0, msr_no, raw_cpu_ptr(&msrs->l), raw_cpu_ptr(&msrs->h));
+ rdmsrq_on_cpu(0, msr_no, raw_cpu_ptr(&msrs->q));
}
static inline void wrmsr_on_cpus(const struct cpumask *m, u32 msr_no,
struct msr __percpu *msrs)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 6605a0224659..1305d9a2ee32 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -969,13 +969,13 @@ store_threshold_limit(struct threshold_block *b, const char *buf, size_t size)
static ssize_t show_error_count(struct threshold_block *b, char *buf)
{
- u32 lo, hi;
+ struct msr val;
/* CPU might be offline by now */
- if (rdmsr_on_cpu(b->cpu, b->address, &lo, &hi))
+ if (rdmsrq_on_cpu(b->cpu, b->address, &val.q))
return -ENODEV;
- return sprintf(buf, "%u\n", ((hi & THRESHOLD_MAX) -
+ return sprintf(buf, "%u\n", ((val.h & THRESHOLD_MAX) -
(THRESHOLD_MAX - b->threshold_limit)));
}
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index d02c4f556cd0..bee9c35762b8 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -316,18 +316,18 @@ static struct notifier_block inject_nb = {
*/
static int toggle_hw_mce_inject(unsigned int cpu, bool enable)
{
- u32 l, h;
+ struct msr val;
int err;
- err = rdmsr_on_cpu(cpu, MSR_K7_HWCR, &l, &h);
+ err = rdmsrq_on_cpu(cpu, MSR_K7_HWCR, &val.q);
if (err) {
pr_err("%s: error reading HWCR\n", __func__);
return err;
}
- enable ? (l |= BIT(18)) : (l &= ~BIT(18));
+ enable ? (val.l |= BIT(18)) : (val.l &= ~BIT(18));
- err = wrmsr_on_cpu(cpu, MSR_K7_HWCR, l, h);
+ err = wrmsr_on_cpu(cpu, MSR_K7_HWCR, val.l, val.h);
if (err)
pr_err("%s: error writing HWCR\n", __func__);
diff --git a/drivers/cpufreq/amd_freq_sensitivity.c b/drivers/cpufreq/amd_freq_sensitivity.c
index 13fed4b9e02b..739d54dc9f2b 100644
--- a/drivers/cpufreq/amd_freq_sensitivity.c
+++ b/drivers/cpufreq/amd_freq_sensitivity.c
@@ -51,10 +51,8 @@ static unsigned int amd_powersave_bias_target(struct cpufreq_policy *policy,
if (!policy->freq_table)
return freq_next;
- rdmsr_on_cpu(policy->cpu, MSR_AMD64_FREQ_SENSITIVITY_ACTUAL,
- &actual.l, &actual.h);
- rdmsr_on_cpu(policy->cpu, MSR_AMD64_FREQ_SENSITIVITY_REFERENCE,
- &reference.l, &reference.h);
+ rdmsrq_on_cpu(policy->cpu, MSR_AMD64_FREQ_SENSITIVITY_ACTUAL, &actual.q);
+ rdmsrq_on_cpu(policy->cpu, MSR_AMD64_FREQ_SENSITIVITY_REFERENCE, &reference.q);
actual.h &= 0x00ffffff;
reference.h &= 0x00ffffff;
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index 69c19233fcd4..d96e8b665f39 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -51,24 +51,24 @@ static unsigned int cpufreq_p4_get(unsigned int cpu);
static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate)
{
- u32 l, h;
+ struct msr val;
if ((newstate > DC_DISABLE) || (newstate == DC_RESV))
return -EINVAL;
- rdmsr_on_cpu(cpu, MSR_IA32_THERM_STATUS, &l, &h);
+ rdmsrq_on_cpu(cpu, MSR_IA32_THERM_STATUS, &val.q);
- if (l & 0x01)
+ if (val.l & 0x01)
pr_debug("CPU#%d currently thermal throttled\n", cpu);
if (has_N44_O17_errata[cpu] &&
(newstate == DC_25PT || newstate == DC_DFLT))
newstate = DC_38PT;
- rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h);
+ rdmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &val.q);
if (newstate == DC_DISABLE) {
pr_debug("CPU#%d disabling modulation\n", cpu);
- wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l & ~(1<<4), h);
+ wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.l & ~(1<<4), val.h);
} else {
pr_debug("CPU#%d setting duty cycle to %d%%\n",
cpu, ((125 * newstate) / 10));
@@ -77,9 +77,9 @@ static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate)
* bits 3-1 : duty cycle
* bit 0 : reserved
*/
- l = (l & ~14);
- l = l | (1<<4) | ((newstate & 0x7)<<1);
- wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l, h);
+ val.l = (val.l & ~14);
+ val.l = val.l | (1<<4) | ((newstate & 0x7)<<1);
+ wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.l, val.h);
}
return 0;
@@ -205,18 +205,18 @@ static int cpufreq_p4_cpu_init(struct cpufreq_policy *policy)
static unsigned int cpufreq_p4_get(unsigned int cpu)
{
- u32 l, h;
+ struct msr val;
- rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h);
+ rdmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &val.q);
- if (l & 0x10) {
- l = l >> 1;
- l &= 0x7;
+ if (val.l & 0x10) {
+ val.l = val.l >> 1;
+ val.l &= 0x7;
} else
- l = DC_DISABLE;
+ val.l = DC_DISABLE;
- if (l != DC_DISABLE)
- return stock_freq * l / 8;
+ if (val.l != DC_DISABLE)
+ return stock_freq * val.l / 8;
return stock_freq;
}
diff --git a/drivers/cpufreq/speedstep-centrino.c b/drivers/cpufreq/speedstep-centrino.c
index 3e6e85a92212..cefee19d1100 100644
--- a/drivers/cpufreq/speedstep-centrino.c
+++ b/drivers/cpufreq/speedstep-centrino.c
@@ -322,11 +322,11 @@ static unsigned extract_clock(unsigned msr, unsigned int cpu, int failsafe)
/* Return the current CPU frequency in kHz */
static unsigned int get_cur_freq(unsigned int cpu)
{
- unsigned l, h;
+ struct msr val;
unsigned clock_freq;
- rdmsr_on_cpu(cpu, MSR_IA32_PERF_STATUS, &l, &h);
- clock_freq = extract_clock(l, cpu, 0);
+ rdmsrq_on_cpu(cpu, MSR_IA32_PERF_STATUS, &val.q);
+ clock_freq = extract_clock(val.l, cpu, 0);
if (unlikely(clock_freq == 0)) {
/*
@@ -335,8 +335,8 @@ static unsigned int get_cur_freq(unsigned int cpu)
* P-state transition (like TM2). Get the last freq set
* in PERF_CTL.
*/
- rdmsr_on_cpu(cpu, MSR_IA32_PERF_CTL, &l, &h);
- clock_freq = extract_clock(l, cpu, 1);
+ rdmsrq_on_cpu(cpu, MSR_IA32_PERF_CTL, &val.q);
+ clock_freq = extract_clock(val.l, cpu, 1);
}
return clock_freq;
}
@@ -417,7 +417,8 @@ static void centrino_cpu_exit(struct cpufreq_policy *policy)
*/
static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
{
- unsigned int msr, oldmsr = 0, h = 0, cpu = policy->cpu;
+ unsigned int msr, cpu = policy->cpu;
+ struct msr oldmsr = { .q = 0 };
int retval = 0;
unsigned int j, first_cpu;
struct cpufreq_frequency_table *op_points;
@@ -459,22 +460,22 @@ static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
msr = op_points->driver_data;
if (first_cpu) {
- rdmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, &oldmsr, &h);
- if (msr == (oldmsr & 0xffff)) {
+ rdmsrq_on_cpu(good_cpu, MSR_IA32_PERF_CTL, &oldmsr.q);
+ if (msr == (oldmsr.l & 0xffff)) {
pr_debug("no change needed - msr was and needs "
- "to be %x\n", oldmsr);
+ "to be %x\n", oldmsr.l);
retval = 0;
goto out;
}
first_cpu = 0;
/* all but 16 LSB are reserved, treat them with care */
- oldmsr &= ~0xffff;
+ oldmsr.l &= ~0xffff;
msr &= 0xffff;
- oldmsr |= msr;
+ oldmsr.l |= msr;
}
- wrmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, oldmsr, h);
+ wrmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, oldmsr.l, oldmsr.h);
if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY)
break;
@@ -490,7 +491,7 @@ static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
*/
for_each_cpu(j, covered_cpus)
- wrmsr_on_cpu(j, MSR_IA32_PERF_CTL, oldmsr, h);
+ wrmsr_on_cpu(j, MSR_IA32_PERF_CTL, oldmsr.l, oldmsr.h);
}
retval = 0;
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 6a0d94711ead..1259c78c95c6 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -356,15 +356,15 @@ static ssize_t show_label(struct device *dev,
static ssize_t show_crit_alarm(struct device *dev,
struct device_attribute *devattr, char *buf)
{
- u32 eax, edx;
+ struct msr val;
struct temp_data *tdata = container_of(devattr, struct temp_data,
sd_attrs[ATTR_CRIT_ALARM]);
mutex_lock(&tdata->update_lock);
- rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
+ rdmsrq_on_cpu(tdata->cpu, tdata->status_reg, &val.q);
mutex_unlock(&tdata->update_lock);
- return sprintf(buf, "%d\n", (eax >> 5) & 1);
+ return sprintf(buf, "%d\n", (val.l >> 5) & 1);
}
static ssize_t show_tjmax(struct device *dev,
@@ -398,7 +398,7 @@ static ssize_t show_ttarget(struct device *dev,
static ssize_t show_temp(struct device *dev,
struct device_attribute *devattr, char *buf)
{
- u32 eax, edx;
+ struct msr val;
struct temp_data *tdata = container_of(devattr, struct temp_data, sd_attrs[ATTR_TEMP]);
int tjmax;
@@ -407,14 +407,14 @@ static ssize_t show_temp(struct device *dev,
tjmax = get_tjmax(tdata, dev);
/* Check whether the time interval has elapsed */
if (time_after(jiffies, tdata->last_updated + HZ)) {
- rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
+ rdmsrq_on_cpu(tdata->cpu, tdata->status_reg, &val.q);
/*
* Ignore the valid bit. In all observed cases the register
* value is either low or zero if the valid bit is 0.
* Return it instead of reporting an error which doesn't
* really help at all.
*/
- tdata->temp = tjmax - ((eax >> 16) & 0xff) * 1000;
+ tdata->temp = tjmax - ((val.l >> 16) & 0xff) * 1000;
tdata->last_updated = jiffies;
}
diff --git a/drivers/thermal/intel/x86_pkg_temp_thermal.c b/drivers/thermal/intel/x86_pkg_temp_thermal.c
index 540109761f0a..2e7de8cf756d 100644
--- a/drivers/thermal/intel/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/intel/x86_pkg_temp_thermal.c
@@ -125,8 +125,9 @@ sys_set_trip_temp(struct thermal_zone_device *tzd,
{
struct zone_device *zonedev = thermal_zone_device_priv(tzd);
unsigned int trip_index = THERMAL_TRIP_PRIV_TO_INT(trip->priv);
- u32 l, h, mask, shift, intr;
+ u32 mask, shift, intr;
int tj_max, val, ret;
+ struct msr v;
if (temp == THERMAL_TEMP_INVALID)
temp = 0;
@@ -141,8 +142,7 @@ sys_set_trip_temp(struct thermal_zone_device *tzd,
if (trip_index >= MAX_NUMBER_OF_TRIPS || val < 0 || val > 0x7f)
return -EINVAL;
- ret = rdmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
- &l, &h);
+ ret = rdmsrq_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, &v.q);
if (ret < 0)
return ret;
@@ -155,20 +155,19 @@ sys_set_trip_temp(struct thermal_zone_device *tzd,
shift = THERM_SHIFT_THRESHOLD0;
intr = THERM_INT_THRESHOLD0_ENABLE;
}
- l &= ~mask;
+ v.l &= ~mask;
/*
* When users space sets a trip temperature == 0, which is indication
* that, it is no longer interested in receiving notifications.
*/
if (!temp) {
- l &= ~intr;
+ v.l &= ~intr;
} else {
- l |= val << shift;
- l |= intr;
+ v.l |= val << shift;
+ v.l |= intr;
}
- return wrmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
- l, h);
+ return wrmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, v.l, v.h);
}
/* Thermal zone callback registry */
@@ -277,7 +276,8 @@ static int pkg_temp_thermal_trips_init(int cpu, int tj_max,
struct thermal_trip *trips, int num_trips)
{
unsigned long thres_reg_value;
- u32 mask, shift, eax, edx;
+ u32 mask, shift;
+ struct msr val;
int ret, i;
for (i = 0; i < num_trips; i++) {
@@ -290,12 +290,11 @@ static int pkg_temp_thermal_trips_init(int cpu, int tj_max,
shift = THERM_SHIFT_THRESHOLD0;
}
- ret = rdmsr_on_cpu(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
- &eax, &edx);
+ ret = rdmsrq_on_cpu(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, &val.q);
if (ret < 0)
return ret;
- thres_reg_value = (eax & mask) >> shift;
+ thres_reg_value = (val.l & mask) >> shift;
trips[i].temperature = thres_reg_value ?
tj_max - thres_reg_value * 1000 : THERMAL_TEMP_INVALID;
--
2.54.0
* [PATCH v3 00/11] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Juergen Gross @ 2026-06-08 5:17 UTC (permalink / raw)
To: linux-kernel, linux-pm, x86, linux-edac, linux-hwmon,
linux-perf-users
Cc: Juergen Gross, Huang Rui, Mario Limonciello, Perry Yuan,
K Prateek Nayak, Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
Tony Luck, Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark
Drop the variants of the *_on_cpu() MSR access functions that take two
32-bit values instead of a single 64-bit one.
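As a rough illustration of why the two calling styles are interchangeable, the following user-space sketch mirrors the layout of the kernel's struct msr, where the l/h halves overlay the 64-bit q member. The helper msr_join() is hypothetical and exists only for this example; the field checks assume a little-endian machine such as x86.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's struct msr: l and h overlay q. */
struct msr {
	union {
		struct {
			uint32_t l;	/* low 32 bits (EAX half) */
			uint32_t h;	/* high 32 bits (EDX half) */
		};
		uint64_t q;		/* the full 64-bit value */
	};
};

static_assert(sizeof(struct msr) == sizeof(uint64_t),
	      "l/h must overlay q exactly");

/*
 * Hypothetical helper (not a kernel function): the 64-bit value that a
 * caller of a two-argument variant effectively passes.
 */
static uint64_t msr_join(uint32_t l, uint32_t h)
{
	return ((uint64_t)h << 32) | l;
}
```

A caller that already holds a struct msr can therefore pass val.q (or &val.q) directly instead of splitting the value into val.l and val.h.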
Changes in V2:
- patches 1+2 split out from other patch
- keep the *q() variants instead of those without suffix
Changes in V3:
- V3 patch 7 split out from V2 patch 7
Juergen Gross (11):
x86/msr: Switch rdmsrl_on_cpu() users to rdmsrq_on_cpu()
x86/msr: Remove rdmsrl_on_cpu()
x86/msr: Switch rdmsr_on_cpu() users to rdmsrq_on_cpu()
x86/msr: Remove rdmsr_on_cpu()
x86/msr: Switch wrmsr_on_cpu() users to wrmsrq_on_cpu()
x86/msr: Remove wrmsr_on_cpu()
x86/msr: Don't use rdmsr_safe_on_cpu() in rdmsrq_safe_on_cpu()
x86/msr: Switch rdmsr_safe_on_cpu() users to rdmsrq_safe_on_cpu()
x86/msr: Remove rdmsr_safe_on_cpu()
x86/msr: Switch wrmsr_safe_on_cpu() users to wrmsrq_safe_on_cpu()
x86/msr: Remove wrmsr_safe_on_cpu()
arch/x86/events/intel/ds.c | 11 +--
arch/x86/include/asm/msr.h | 28 +-----
arch/x86/kernel/cpu/mce/amd.c | 6 +-
arch/x86/kernel/cpu/mce/inject.c | 8 +-
arch/x86/kernel/msr.c | 8 +-
arch/x86/lib/msr-smp.c | 89 +++-----------------
drivers/cpufreq/amd-pstate.c | 2 +-
drivers/cpufreq/amd_freq_sensitivity.c | 6 +-
drivers/cpufreq/p4-clockmod.c | 32 +++----
drivers/cpufreq/speedstep-centrino.c | 27 +++---
drivers/hwmon/coretemp.c | 44 +++++-----
drivers/hwmon/via-cputemp.c | 16 ++--
drivers/thermal/intel/intel_tcc.c | 43 +++++-----
drivers/thermal/intel/x86_pkg_temp_thermal.c | 25 +++---
14 files changed, 128 insertions(+), 217 deletions(-)
--
2.54.0
* Re: [PATCH] EDAC/versalnet: release remoteproc reference in remove
From: Borislav Petkov @ 2026-06-07 22:20 UTC (permalink / raw)
To: Guangshuo Li; +Cc: Shubhrajyoti Datta, Tony Luck, linux-edac, linux-kernel
In-Reply-To: <20260603133506.1396535-1-lgs201920130244@gmail.com>
On Wed, Jun 03, 2026 at 09:35:06PM +0800, Guangshuo Li wrote:
> mc_probe() gets a remoteproc reference with rproc_get_by_phandle() and
> stores it in priv->mcdi->r5_rproc after the remote processor has been
> booted.
>
> The probe error paths release that reference with rproc_put(), but the
> remove path only shuts the remote processor down. This leaks the
> remoteproc device and module references on every successful probe/remove
> cycle.
>
> Call rproc_put() in mc_remove() after rproc_shutdown().
>
> Fixes: d5fe2fec6c40d ("EDAC: Add a driver for the AMD Versal NET DDR controller")
> Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
> ---
> drivers/edac/versalnet_edac.c | 1 +
> 1 file changed, 1 insertion(+)
https://lore.kernel.org/r/20260401111836.2342918-1-ptsm@linux.microsoft.com
We're sorting it out currently.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 00/10] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Jürgen Groß @ 2026-06-05 15:09 UTC (permalink / raw)
To: Dave Hansen, linux-kernel, linux-pm, x86, linux-edac, linux-hwmon,
linux-perf-users
Cc: Huang Rui, Mario Limonciello, Perry Yuan, K Prateek Nayak,
Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Tony Luck,
Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark
In-Reply-To: <633a8b8a-9dd8-410d-a9a5-a9cd2e87e63b@intel.com>
On 05.06.26 17:08, Dave Hansen wrote:
> On 6/5/26 07:43, Juergen Gross wrote:
>> arch/x86/events/intel/ds.c | 11 +--
>> arch/x86/include/asm/msr.h | 28 +-----
>> arch/x86/kernel/cpu/mce/amd.c | 6 +-
>> arch/x86/kernel/cpu/mce/inject.c | 8 +-
>> arch/x86/kernel/msr.c | 8 +-
>> arch/x86/lib/msr-smp.c | 89 +++-----------------
>> drivers/cpufreq/amd-pstate.c | 2 +-
>> drivers/cpufreq/amd_freq_sensitivity.c | 6 +-
>> drivers/cpufreq/p4-clockmod.c | 32 +++----
>> drivers/cpufreq/speedstep-centrino.c | 27 +++---
>> drivers/hwmon/coretemp.c | 44 +++++-----
>> drivers/hwmon/via-cputemp.c | 16 ++--
>> drivers/thermal/intel/intel_tcc.c | 43 +++++-----
>> drivers/thermal/intel/x86_pkg_temp_thermal.c | 25 +++---
>> 14 files changed, 128 insertions(+), 217 deletions(-)
>
> This is wonderful. Thank you for doing this!
>
> My only real complaint is the lack of changelog for 07/10. Otherwise, it
> looks great to me. Ideally, you'd collect a few more reviews and post a
> v3 rebased right after the next -rc1.
>
> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Thanks, will do as you suggest.
Juergen
* Re: [PATCH v2 00/10] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Dave Hansen @ 2026-06-05 15:08 UTC (permalink / raw)
To: Juergen Gross, linux-kernel, linux-pm, x86, linux-edac,
linux-hwmon, linux-perf-users
Cc: Huang Rui, Mario Limonciello, Perry Yuan, K Prateek Nayak,
Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Tony Luck,
Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark
In-Reply-To: <20260605144314.3031049-1-jgross@suse.com>
On 6/5/26 07:43, Juergen Gross wrote:
> arch/x86/events/intel/ds.c | 11 +--
> arch/x86/include/asm/msr.h | 28 +-----
> arch/x86/kernel/cpu/mce/amd.c | 6 +-
> arch/x86/kernel/cpu/mce/inject.c | 8 +-
> arch/x86/kernel/msr.c | 8 +-
> arch/x86/lib/msr-smp.c | 89 +++-----------------
> drivers/cpufreq/amd-pstate.c | 2 +-
> drivers/cpufreq/amd_freq_sensitivity.c | 6 +-
> drivers/cpufreq/p4-clockmod.c | 32 +++----
> drivers/cpufreq/speedstep-centrino.c | 27 +++---
> drivers/hwmon/coretemp.c | 44 +++++-----
> drivers/hwmon/via-cputemp.c | 16 ++--
> drivers/thermal/intel/intel_tcc.c | 43 +++++-----
> drivers/thermal/intel/x86_pkg_temp_thermal.c | 25 +++---
> 14 files changed, 128 insertions(+), 217 deletions(-)
This is wonderful. Thank you for doing this!
My only real complaint is the lack of changelog for 07/10. Otherwise, it
looks great to me. Ideally, you'd collect a few more reviews and post a
v3 rebased right after the next -rc1.
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
* [PATCH v2 05/10] x86/msr: Switch wrmsr_on_cpu() users to wrmsrq_on_cpu()
From: Juergen Gross @ 2026-06-05 14:43 UTC (permalink / raw)
To: linux-kernel, x86, linux-perf-users, linux-edac, linux-pm
Cc: Juergen Gross, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
James Clark, Thomas Gleixner, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Tony Luck, Rafael J. Wysocki, Viresh Kumar,
Daniel Lezcano, Zhang Rui, Lukasz Luba
In-Reply-To: <20260605144314.3031049-1-jgross@suse.com>
To prepare for retiring wrmsr_on_cpu(), switch its users to
wrmsrq_on_cpu().
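The conversion can be pictured with the small user-space sketch below, which compares clearing bit 4 through split halves against doing it on the 64-bit value, as in the p4-clockmod hunk of this patch. The fake_* functions and the fake_msr variable are hypothetical stand-ins for the real accessors and the hardware register; they are assumptions made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's struct msr. */
struct msr {
	union {
		struct {
			uint32_t l;
			uint32_t h;
		};
		uint64_t q;
	};
};

static_assert(sizeof(struct msr) == sizeof(uint64_t),
	      "l/h must overlay q exactly");

/* Hypothetical register backing store; real code writes the MSR itself. */
static uint64_t fake_msr;

/* Hypothetical stand-in for the old two-argument write. */
static int fake_wrmsr_on_cpu(uint32_t l, uint32_t h)
{
	fake_msr = ((uint64_t)h << 32) | l;
	return 0;
}

/* Hypothetical stand-in for the 64-bit write. */
static int fake_wrmsrq_on_cpu(uint64_t q)
{
	fake_msr = q;
	return 0;
}

/* Old style: mask the low half, pass the high half through untouched. */
static uint64_t clear_bit4_split(struct msr val)
{
	fake_wrmsr_on_cpu(val.l & ~(1u << 4), val.h);
	return fake_msr;
}

/* New style: mask the 64-bit value; 1ULL keeps the mask 64 bits wide. */
static uint64_t clear_bit4_quad(struct msr val)
{
	fake_wrmsrq_on_cpu(val.q & ~(1ULL << 4));
	return fake_msr;
}
```

Both helpers leave the same value in the fake register, which is the property the conversion relies on.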
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- instead of changing wrmsr_on_cpu(), use wrmsrq_on_cpu() (Ingo Molnar)
---
arch/x86/events/intel/ds.c | 11 ++++-------
arch/x86/include/asm/msr.h | 2 +-
arch/x86/kernel/cpu/mce/inject.c | 2 +-
drivers/cpufreq/p4-clockmod.c | 4 ++--
drivers/cpufreq/speedstep-centrino.c | 4 ++--
drivers/thermal/intel/x86_pkg_temp_thermal.c | 2 +-
6 files changed, 11 insertions(+), 14 deletions(-)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7f0d515c07c5..5b9c01383f49 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -780,9 +780,7 @@ void init_debug_store_on_cpu(int cpu)
if (!ds)
return;
- wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA,
- (u32)((u64)(unsigned long)ds),
- (u32)((u64)(unsigned long)ds >> 32));
+ wrmsrq_on_cpu(cpu, MSR_IA32_DS_AREA, (u64)(unsigned long)ds);
}
void fini_debug_store_on_cpu(int cpu)
@@ -790,7 +788,7 @@ void fini_debug_store_on_cpu(int cpu)
if (!per_cpu(cpu_hw_events, cpu).ds)
return;
- wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA, 0, 0);
+ wrmsrq_on_cpu(cpu, MSR_IA32_DS_AREA, 0);
}
static DEFINE_PER_CPU(void *, insn_buffer);
@@ -1095,8 +1093,7 @@ void init_arch_pebs_on_cpu(int cpu)
* contiguous physical buffer (__alloc_pages_node() with order)
*/
arch_pebs_base = virt_to_phys(cpuc->pebs_vaddr) | PEBS_BUFFER_SHIFT;
- wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, (u32)arch_pebs_base,
- (u32)(arch_pebs_base >> 32));
+ wrmsrq_on_cpu(cpu, MSR_IA32_PEBS_BASE, arch_pebs_base);
x86_pmu.pebs_active = 1;
}
@@ -1105,7 +1102,7 @@ inline void fini_arch_pebs_on_cpu(int cpu)
if (!x86_pmu.arch_pebs)
return;
- wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, 0, 0);
+ wrmsrq_on_cpu(cpu, MSR_IA32_PEBS_BASE, 0);
}
/*
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 22f914f7affe..6e0d7a6335ff 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -291,7 +291,7 @@ static inline void rdmsr_on_cpus(const struct cpumask *m, u32 msr_no,
static inline void wrmsr_on_cpus(const struct cpumask *m, u32 msr_no,
struct msr __percpu *msrs)
{
- wrmsr_on_cpu(0, msr_no, raw_cpu_read(msrs->l), raw_cpu_read(msrs->h));
+ wrmsrq_on_cpu(0, msr_no, raw_cpu_read(msrs->q));
}
static inline int rdmsr_safe_on_cpu(unsigned int cpu, u32 msr_no,
u32 *l, u32 *h)
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index bee9c35762b8..6d30e7720f31 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -327,7 +327,7 @@ static int toggle_hw_mce_inject(unsigned int cpu, bool enable)
enable ? (val.l |= BIT(18)) : (val.l &= ~BIT(18));
- err = wrmsr_on_cpu(cpu, MSR_K7_HWCR, val.l, val.h);
+ err = wrmsrq_on_cpu(cpu, MSR_K7_HWCR, val.q);
if (err)
pr_err("%s: error writing HWCR\n", __func__);
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index d96e8b665f39..c1690aa48193 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -68,7 +68,7 @@ static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate)
rdmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &val.q);
if (newstate == DC_DISABLE) {
pr_debug("CPU#%d disabling modulation\n", cpu);
- wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.l & ~(1<<4), val.h);
+ wrmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.q & ~(1ULL << 4));
} else {
pr_debug("CPU#%d setting duty cycle to %d%%\n",
cpu, ((125 * newstate) / 10));
@@ -79,7 +79,7 @@ static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate)
*/
val.l = (val.l & ~14);
val.l = val.l | (1<<4) | ((newstate & 0x7)<<1);
- wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.l, val.h);
+ wrmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.q);
}
return 0;
diff --git a/drivers/cpufreq/speedstep-centrino.c b/drivers/cpufreq/speedstep-centrino.c
index cefee19d1100..9237ed8f2b1f 100644
--- a/drivers/cpufreq/speedstep-centrino.c
+++ b/drivers/cpufreq/speedstep-centrino.c
@@ -475,7 +475,7 @@ static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
oldmsr.l |= msr;
}
- wrmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, oldmsr.l, oldmsr.h);
+ wrmsrq_on_cpu(good_cpu, MSR_IA32_PERF_CTL, oldmsr.q);
if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY)
break;
@@ -491,7 +491,7 @@ static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
*/
for_each_cpu(j, covered_cpus)
- wrmsr_on_cpu(j, MSR_IA32_PERF_CTL, oldmsr.l, oldmsr.h);
+ wrmsrq_on_cpu(j, MSR_IA32_PERF_CTL, oldmsr.q);
}
retval = 0;
diff --git a/drivers/thermal/intel/x86_pkg_temp_thermal.c b/drivers/thermal/intel/x86_pkg_temp_thermal.c
index 2e7de8cf756d..144603c356a0 100644
--- a/drivers/thermal/intel/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/intel/x86_pkg_temp_thermal.c
@@ -167,7 +167,7 @@ sys_set_trip_temp(struct thermal_zone_device *tzd,
v.l |= intr;
}
- return wrmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, v.l, v.h);
+ return wrmsrq_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, v.q);
}
/* Thermal zone callback registry */
--
2.54.0
* [PATCH v2 03/10] x86/msr: Switch rdmsr_on_cpu() users to rdmsrq_on_cpu()
From: Juergen Gross @ 2026-06-05 14:43 UTC (permalink / raw)
To: linux-kernel, x86, linux-edac, linux-pm, linux-hwmon
Cc: Juergen Gross, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Tony Luck, Rafael J. Wysocki,
Viresh Kumar, Guenter Roeck, Daniel Lezcano, Zhang Rui,
Lukasz Luba
In-Reply-To: <20260605144314.3031049-1-jgross@suse.com>
To prepare for retiring rdmsr_on_cpu(), switch its users to
rdmsrq_on_cpu().
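The read-side pattern used throughout this patch (read into &val.q, then pick fields out of val.l or val.h) can be sketched in user space as follows. fake_rdmsrq_on_cpu() and its canned register value are hypothetical, and the MSR number is passed only for shape; the arithmetic follows the show_temp() hunk in coretemp.c below.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's struct msr. */
struct msr {
	union {
		struct {
			uint32_t l;
			uint32_t h;
		};
		uint64_t q;
	};
};

static_assert(sizeof(struct msr) == sizeof(uint64_t),
	      "l/h must overlay q exactly");

/*
 * Hypothetical stand-in for rdmsrq_on_cpu(): ignores its inputs and
 * stores a canned value whose bits 23:16 hold 0x28 (40 decimal).
 */
static int fake_rdmsrq_on_cpu(unsigned int cpu, uint32_t msr_no, uint64_t *q)
{
	(void)cpu;
	(void)msr_no;
	*q = 0x0000000088280000ULL;
	return 0;
}

/* Same arithmetic as show_temp(): tjmax minus the offset in millidegrees. */
static int read_temp_mdegc(unsigned int cpu, int tjmax)
{
	struct msr val;

	if (fake_rdmsrq_on_cpu(cpu, 0x19c /* assumed status MSR */, &val.q))
		return -1;

	return tjmax - (int)((val.l >> 16) & 0xff) * 1000;
}
```

Because only the low half is inspected, the high half that the old interface returned in a separate variable simply goes unused in val.h.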
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- instead of changing rdmsr_on_cpu(), use rdmsrq_on_cpu() (Ingo Molnar)
---
arch/x86/include/asm/msr.h | 2 +-
arch/x86/kernel/cpu/mce/amd.c | 6 ++--
arch/x86/kernel/cpu/mce/inject.c | 8 ++---
drivers/cpufreq/amd_freq_sensitivity.c | 6 ++--
drivers/cpufreq/p4-clockmod.c | 32 ++++++++++----------
drivers/cpufreq/speedstep-centrino.c | 27 +++++++++--------
drivers/hwmon/coretemp.c | 12 ++++----
drivers/thermal/intel/x86_pkg_temp_thermal.c | 25 ++++++++-------
8 files changed, 58 insertions(+), 60 deletions(-)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index fddadbc625be..d5985d6fdaf9 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -292,7 +292,7 @@ static inline int wrmsrq_on_cpu(unsigned int cpu, u32 msr_no, u64 q)
static inline void rdmsr_on_cpus(const struct cpumask *m, u32 msr_no,
struct msr __percpu *msrs)
{
- rdmsr_on_cpu(0, msr_no, raw_cpu_ptr(&msrs->l), raw_cpu_ptr(&msrs->h));
+ rdmsrq_on_cpu(0, msr_no, raw_cpu_ptr(&msrs->q));
}
static inline void wrmsr_on_cpus(const struct cpumask *m, u32 msr_no,
struct msr __percpu *msrs)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 6605a0224659..1305d9a2ee32 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -969,13 +969,13 @@ store_threshold_limit(struct threshold_block *b, const char *buf, size_t size)
static ssize_t show_error_count(struct threshold_block *b, char *buf)
{
- u32 lo, hi;
+ struct msr val;
/* CPU might be offline by now */
- if (rdmsr_on_cpu(b->cpu, b->address, &lo, &hi))
+ if (rdmsrq_on_cpu(b->cpu, b->address, &val.q))
return -ENODEV;
- return sprintf(buf, "%u\n", ((hi & THRESHOLD_MAX) -
+ return sprintf(buf, "%u\n", ((val.h & THRESHOLD_MAX) -
(THRESHOLD_MAX - b->threshold_limit)));
}
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index d02c4f556cd0..bee9c35762b8 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -316,18 +316,18 @@ static struct notifier_block inject_nb = {
*/
static int toggle_hw_mce_inject(unsigned int cpu, bool enable)
{
- u32 l, h;
+ struct msr val;
int err;
- err = rdmsr_on_cpu(cpu, MSR_K7_HWCR, &l, &h);
+ err = rdmsrq_on_cpu(cpu, MSR_K7_HWCR, &val.q);
if (err) {
pr_err("%s: error reading HWCR\n", __func__);
return err;
}
- enable ? (l |= BIT(18)) : (l &= ~BIT(18));
+ enable ? (val.l |= BIT(18)) : (val.l &= ~BIT(18));
- err = wrmsr_on_cpu(cpu, MSR_K7_HWCR, l, h);
+ err = wrmsr_on_cpu(cpu, MSR_K7_HWCR, val.l, val.h);
if (err)
pr_err("%s: error writing HWCR\n", __func__);
diff --git a/drivers/cpufreq/amd_freq_sensitivity.c b/drivers/cpufreq/amd_freq_sensitivity.c
index 13fed4b9e02b..739d54dc9f2b 100644
--- a/drivers/cpufreq/amd_freq_sensitivity.c
+++ b/drivers/cpufreq/amd_freq_sensitivity.c
@@ -51,10 +51,8 @@ static unsigned int amd_powersave_bias_target(struct cpufreq_policy *policy,
if (!policy->freq_table)
return freq_next;
- rdmsr_on_cpu(policy->cpu, MSR_AMD64_FREQ_SENSITIVITY_ACTUAL,
- &actual.l, &actual.h);
- rdmsr_on_cpu(policy->cpu, MSR_AMD64_FREQ_SENSITIVITY_REFERENCE,
- &reference.l, &reference.h);
+ rdmsrq_on_cpu(policy->cpu, MSR_AMD64_FREQ_SENSITIVITY_ACTUAL, &actual.q);
+ rdmsrq_on_cpu(policy->cpu, MSR_AMD64_FREQ_SENSITIVITY_REFERENCE, &reference.q);
actual.h &= 0x00ffffff;
reference.h &= 0x00ffffff;
diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c
index 69c19233fcd4..d96e8b665f39 100644
--- a/drivers/cpufreq/p4-clockmod.c
+++ b/drivers/cpufreq/p4-clockmod.c
@@ -51,24 +51,24 @@ static unsigned int cpufreq_p4_get(unsigned int cpu);
static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate)
{
- u32 l, h;
+ struct msr val;
if ((newstate > DC_DISABLE) || (newstate == DC_RESV))
return -EINVAL;
- rdmsr_on_cpu(cpu, MSR_IA32_THERM_STATUS, &l, &h);
+ rdmsrq_on_cpu(cpu, MSR_IA32_THERM_STATUS, &val.q);
- if (l & 0x01)
+ if (val.l & 0x01)
pr_debug("CPU#%d currently thermal throttled\n", cpu);
if (has_N44_O17_errata[cpu] &&
(newstate == DC_25PT || newstate == DC_DFLT))
newstate = DC_38PT;
- rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h);
+ rdmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &val.q);
if (newstate == DC_DISABLE) {
pr_debug("CPU#%d disabling modulation\n", cpu);
- wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l & ~(1<<4), h);
+ wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.l & ~(1<<4), val.h);
} else {
pr_debug("CPU#%d setting duty cycle to %d%%\n",
cpu, ((125 * newstate) / 10));
@@ -77,9 +77,9 @@ static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate)
* bits 3-1 : duty cycle
* bit 0 : reserved
*/
- l = (l & ~14);
- l = l | (1<<4) | ((newstate & 0x7)<<1);
- wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l, h);
+ val.l = (val.l & ~14);
+ val.l = val.l | (1<<4) | ((newstate & 0x7)<<1);
+ wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, val.l, val.h);
}
return 0;
@@ -205,18 +205,18 @@ static int cpufreq_p4_cpu_init(struct cpufreq_policy *policy)
static unsigned int cpufreq_p4_get(unsigned int cpu)
{
- u32 l, h;
+ struct msr val;
- rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h);
+ rdmsrq_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &val.q);
- if (l & 0x10) {
- l = l >> 1;
- l &= 0x7;
+ if (val.l & 0x10) {
+ val.l = val.l >> 1;
+ val.l &= 0x7;
} else
- l = DC_DISABLE;
+ val.l = DC_DISABLE;
- if (l != DC_DISABLE)
- return stock_freq * l / 8;
+ if (val.l != DC_DISABLE)
+ return stock_freq * val.l / 8;
return stock_freq;
}
diff --git a/drivers/cpufreq/speedstep-centrino.c b/drivers/cpufreq/speedstep-centrino.c
index 3e6e85a92212..cefee19d1100 100644
--- a/drivers/cpufreq/speedstep-centrino.c
+++ b/drivers/cpufreq/speedstep-centrino.c
@@ -322,11 +322,11 @@ static unsigned extract_clock(unsigned msr, unsigned int cpu, int failsafe)
/* Return the current CPU frequency in kHz */
static unsigned int get_cur_freq(unsigned int cpu)
{
- unsigned l, h;
+ struct msr val;
unsigned clock_freq;
- rdmsr_on_cpu(cpu, MSR_IA32_PERF_STATUS, &l, &h);
- clock_freq = extract_clock(l, cpu, 0);
+ rdmsrq_on_cpu(cpu, MSR_IA32_PERF_STATUS, &val.q);
+ clock_freq = extract_clock(val.l, cpu, 0);
if (unlikely(clock_freq == 0)) {
/*
@@ -335,8 +335,8 @@ static unsigned int get_cur_freq(unsigned int cpu)
* P-state transition (like TM2). Get the last freq set
* in PERF_CTL.
*/
- rdmsr_on_cpu(cpu, MSR_IA32_PERF_CTL, &l, &h);
- clock_freq = extract_clock(l, cpu, 1);
+ rdmsrq_on_cpu(cpu, MSR_IA32_PERF_CTL, &val.q);
+ clock_freq = extract_clock(val.l, cpu, 1);
}
return clock_freq;
}
@@ -417,7 +417,8 @@ static void centrino_cpu_exit(struct cpufreq_policy *policy)
*/
static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
{
- unsigned int msr, oldmsr = 0, h = 0, cpu = policy->cpu;
+ unsigned int msr, cpu = policy->cpu;
+ struct msr oldmsr = { .q = 0 };
int retval = 0;
unsigned int j, first_cpu;
struct cpufreq_frequency_table *op_points;
@@ -459,22 +460,22 @@ static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
msr = op_points->driver_data;
if (first_cpu) {
- rdmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, &oldmsr, &h);
- if (msr == (oldmsr & 0xffff)) {
+ rdmsrq_on_cpu(good_cpu, MSR_IA32_PERF_CTL, &oldmsr.q);
+ if (msr == (oldmsr.l & 0xffff)) {
pr_debug("no change needed - msr was and needs "
- "to be %x\n", oldmsr);
+ "to be %x\n", oldmsr.l);
retval = 0;
goto out;
}
first_cpu = 0;
/* all but 16 LSB are reserved, treat them with care */
- oldmsr &= ~0xffff;
+ oldmsr.l &= ~0xffff;
msr &= 0xffff;
- oldmsr |= msr;
+ oldmsr.l |= msr;
}
- wrmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, oldmsr, h);
+ wrmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, oldmsr.l, oldmsr.h);
if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY)
break;
@@ -490,7 +491,7 @@ static int centrino_target(struct cpufreq_policy *policy, unsigned int index)
*/
for_each_cpu(j, covered_cpus)
- wrmsr_on_cpu(j, MSR_IA32_PERF_CTL, oldmsr, h);
+ wrmsr_on_cpu(j, MSR_IA32_PERF_CTL, oldmsr.l, oldmsr.h);
}
retval = 0;
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 6a0d94711ead..1259c78c95c6 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -356,15 +356,15 @@ static ssize_t show_label(struct device *dev,
static ssize_t show_crit_alarm(struct device *dev,
struct device_attribute *devattr, char *buf)
{
- u32 eax, edx;
+ struct msr val;
struct temp_data *tdata = container_of(devattr, struct temp_data,
sd_attrs[ATTR_CRIT_ALARM]);
mutex_lock(&tdata->update_lock);
- rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
+ rdmsrq_on_cpu(tdata->cpu, tdata->status_reg, &val.q);
mutex_unlock(&tdata->update_lock);
- return sprintf(buf, "%d\n", (eax >> 5) & 1);
+ return sprintf(buf, "%d\n", (val.l >> 5) & 1);
}
static ssize_t show_tjmax(struct device *dev,
@@ -398,7 +398,7 @@ static ssize_t show_ttarget(struct device *dev,
static ssize_t show_temp(struct device *dev,
struct device_attribute *devattr, char *buf)
{
- u32 eax, edx;
+ struct msr val;
struct temp_data *tdata = container_of(devattr, struct temp_data, sd_attrs[ATTR_TEMP]);
int tjmax;
@@ -407,14 +407,14 @@ static ssize_t show_temp(struct device *dev,
tjmax = get_tjmax(tdata, dev);
/* Check whether the time interval has elapsed */
if (time_after(jiffies, tdata->last_updated + HZ)) {
- rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
+ rdmsrq_on_cpu(tdata->cpu, tdata->status_reg, &val.q);
/*
* Ignore the valid bit. In all observed cases the register
* value is either low or zero if the valid bit is 0.
* Return it instead of reporting an error which doesn't
* really help at all.
*/
- tdata->temp = tjmax - ((eax >> 16) & 0xff) * 1000;
+ tdata->temp = tjmax - ((val.l >> 16) & 0xff) * 1000;
tdata->last_updated = jiffies;
}
diff --git a/drivers/thermal/intel/x86_pkg_temp_thermal.c b/drivers/thermal/intel/x86_pkg_temp_thermal.c
index 540109761f0a..2e7de8cf756d 100644
--- a/drivers/thermal/intel/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/intel/x86_pkg_temp_thermal.c
@@ -125,8 +125,9 @@ sys_set_trip_temp(struct thermal_zone_device *tzd,
{
struct zone_device *zonedev = thermal_zone_device_priv(tzd);
unsigned int trip_index = THERMAL_TRIP_PRIV_TO_INT(trip->priv);
- u32 l, h, mask, shift, intr;
+ u32 mask, shift, intr;
int tj_max, val, ret;
+ struct msr v;
if (temp == THERMAL_TEMP_INVALID)
temp = 0;
@@ -141,8 +142,7 @@ sys_set_trip_temp(struct thermal_zone_device *tzd,
if (trip_index >= MAX_NUMBER_OF_TRIPS || val < 0 || val > 0x7f)
return -EINVAL;
- ret = rdmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
- &l, &h);
+ ret = rdmsrq_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, &v.q);
if (ret < 0)
return ret;
@@ -155,20 +155,19 @@ sys_set_trip_temp(struct thermal_zone_device *tzd,
shift = THERM_SHIFT_THRESHOLD0;
intr = THERM_INT_THRESHOLD0_ENABLE;
}
- l &= ~mask;
+ v.l &= ~mask;
/*
* When users space sets a trip temperature == 0, which is indication
* that, it is no longer interested in receiving notifications.
*/
if (!temp) {
- l &= ~intr;
+ v.l &= ~intr;
} else {
- l |= val << shift;
- l |= intr;
+ v.l |= val << shift;
+ v.l |= intr;
}
- return wrmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
- l, h);
+ return wrmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, v.l, v.h);
}
/* Thermal zone callback registry */
@@ -277,7 +276,8 @@ static int pkg_temp_thermal_trips_init(int cpu, int tj_max,
struct thermal_trip *trips, int num_trips)
{
unsigned long thres_reg_value;
- u32 mask, shift, eax, edx;
+ u32 mask, shift;
+ struct msr val;
int ret, i;
for (i = 0; i < num_trips; i++) {
@@ -290,12 +290,11 @@ static int pkg_temp_thermal_trips_init(int cpu, int tj_max,
shift = THERM_SHIFT_THRESHOLD0;
}
- ret = rdmsr_on_cpu(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
- &eax, &edx);
+ ret = rdmsrq_on_cpu(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, &val.q);
if (ret < 0)
return ret;
- thres_reg_value = (eax & mask) >> shift;
+ thres_reg_value = (val.l & mask) >> shift;
trips[i].temperature = thres_reg_value ?
tj_max - thres_reg_value * 1000 : THERMAL_TEMP_INVALID;
--
2.54.0
* [PATCH v2 00/10] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Juergen Gross @ 2026-06-05 14:43 UTC (permalink / raw)
To: linux-kernel, linux-pm, x86, linux-edac, linux-hwmon,
linux-perf-users
Cc: Juergen Gross, Huang Rui, Mario Limonciello, Perry Yuan,
K Prateek Nayak, Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
Tony Luck, Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark
Drop the variants of the *_on_cpu() MSR access functions that take two
32-bit values instead of a single 64-bit one.
Changes in V2:
- patches 1+2 split out from other patch
- keep the *q() variants instead of those without suffix
Juergen Gross (10):
x86/msr: Switch rdmsrl_on_cpu() users to rdmsrq_on_cpu()
x86/msr: Remove rdmsrl_on_cpu()
x86/msr: Switch rdmsr_on_cpu() users to rdmsrq_on_cpu()
x86/msr: Remove rdmsr_on_cpu()
x86/msr: Switch wrmsr_on_cpu() users to wrmsrq_on_cpu()
x86/msr: Remove wrmsr_on_cpu()
x86/msr: Switch rdmsr_safe_on_cpu() users to rdmsrq_safe_on_cpu()
x86/msr: Remove rdmsr_safe_on_cpu()
x86/msr: Switch wrmsr_safe_on_cpu() users to wrmsrq_safe_on_cpu()
x86/msr: Remove wrmsr_safe_on_cpu()
arch/x86/events/intel/ds.c | 11 +--
arch/x86/include/asm/msr.h | 28 +-----
arch/x86/kernel/cpu/mce/amd.c | 6 +-
arch/x86/kernel/cpu/mce/inject.c | 8 +-
arch/x86/kernel/msr.c | 8 +-
arch/x86/lib/msr-smp.c | 89 +++-----------------
drivers/cpufreq/amd-pstate.c | 2 +-
drivers/cpufreq/amd_freq_sensitivity.c | 6 +-
drivers/cpufreq/p4-clockmod.c | 32 +++----
drivers/cpufreq/speedstep-centrino.c | 27 +++---
drivers/hwmon/coretemp.c | 44 +++++-----
drivers/hwmon/via-cputemp.c | 16 ++--
drivers/thermal/intel/intel_tcc.c | 43 +++++-----
drivers/thermal/intel/x86_pkg_temp_thermal.c | 25 +++---
14 files changed, 128 insertions(+), 217 deletions(-)
--
2.54.0
* Re: mm/memory-failure tracepoint change breaks userspace rasdaemon
From: Steven Rostedt @ 2026-06-05 14:13 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Borislav Petkov, Zhuo, Qiuxu, mchehab+huawei@kernel.org,
Luck, Tony, akpm@linux-foundation.org, linmiaohe@huawei.com,
xieyuanbin1@huawei.com, Lai, Yi1, linux-kernel@vger.kernel.org,
linux-edac@vger.kernel.org, linux-mm@kvack.org,
linux-trace-kernel@vger.kernel.org, Linus Torvalds
In-Reply-To: <39cbb38f-ed3b-4f17-b9b7-ed466957ee99@kernel.org>
On June 5, 2026 4:52:28 AM EDT, "David Hildenbrand (Arm)" <david@kernel.org> wrote:
>On 6/3/26 21:31, Steven Rostedt wrote:
>> On Wed, 3 Jun 2026 21:13:30 +0200
>> "David Hildenbrand (Arm)" <david@kernel.org> wrote:
>>
>>> Thanks, that makes sense!
>>>
>>> So, would it be fair to say that, in general, what's exposed through
>>>
>>> /sys/kernel/tracing/events/
>>>
>>> is stable ABI?
>>
>> It's only stable if something depends on it. It changes all the time.
>> It's only when someone complains about it that it becomes "stable"!
>
>Heh, so we only know that it's stable when we break it ...
>
>Let me figure out how to document that.
>
Yep. That's basically Linus's rule. He even said we break user space API all the time. What we don't allow is to break actual user space. The problem is that you can break user space by fixing an API without knowing something depended on the bug.
* Re: [PATCH v2] EDAC/synopsys: Fix cleanup on injection sysfs failure
From: Michal Simek @ 2026-06-05 13:23 UTC (permalink / raw)
To: Yuho Choi, Borislav Petkov, Tony Luck
Cc: linux-edac, linux-arm-kernel, linux-kernel
In-Reply-To: <20260605125417.2348115-1-dbgh9129@gmail.com>
On 6/5/26 14:54, Yuho Choi wrote:
> edac_create_sysfs_attributes() creates inject_data_error before
> inject_data_poison. If the second file creation fails, the first file is
> left behind.
>
> The same failure path runs after edac_mc_add_mc() has registered the
> memory controller with the EDAC core. Jumping directly to edac_mc_free()
> skips edac_mc_del_mc() and leaves the registered controller state
> unwound incorrectly.
>
> Remove inject_data_error when inject_data_poison creation fails, and
> route the probe failure through edac_mc_del_mc() before freeing mci.
>
> Fixes: 1a81361f75d8 ("EDAC, synopsys: Add Error Injection support for ZynqMP DDR controller")
> Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
> ---
> Changes in v2:
> - Remove the CONFIG_EDAC_DEBUG-guarded del_mc label.
> - Call edac_mc_del_mc() inline before jumping to free_edac_mc when
> edac_create_sysfs_attributes() fails.
>
> drivers/edac/synopsys_edac.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
> index 51143b3257de..9ca2a842612e 100644
> --- a/drivers/edac/synopsys_edac.c
> +++ b/drivers/edac/synopsys_edac.c
> @@ -1120,8 +1120,10 @@ static int edac_create_sysfs_attributes(struct mem_ctl_info *mci)
> if (rc < 0)
> return rc;
> rc = device_create_file(&mci->dev, &dev_attr_inject_data_poison);
> - if (rc < 0)
> + if (rc < 0) {
> + device_remove_file(&mci->dev, &dev_attr_inject_data_error);
> return rc;
> + }
> return 0;
> }
>
> @@ -1431,6 +1433,7 @@ static int mc_probe(struct platform_device *pdev)
> if (rc) {
> edac_printk(KERN_ERR, EDAC_MC,
> "Failed to create sysfs entries\n");
> + edac_mc_del_mc(&pdev->dev);
> goto free_edac_mc;
> }
> }
Cc: stable@vger.kernel.org
Acked-by: Michal Simek <michal.simek@amd.com>
Thanks,
Michal
^ permalink raw reply
* RE: [PATCH] mm/memory-failure: trace: change memory_failure_event to ras subsystem
From: Zhuo, Qiuxu @ 2026-06-05 13:09 UTC (permalink / raw)
To: Xie Yuanbin, david@kernel.org, bp@alien8.de,
akpm@linux-foundation.org, rostedt@goodmis.org,
linmiaohe@huawei.com, nao.horiguchi@gmail.com,
mhiramat@kernel.org, mchehab+huawei@kernel.org, Luck, Tony,
Lai, Yi1
Cc: linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
torvalds@linux-foundation.org, lilinjie8@huawei.com,
liaohua4@huawei.com
In-Reply-To: <20260605081213.154660-1-xieyuanbin1@huawei.com>
> From: Xie Yuanbin <xieyuanbin1@huawei.com>
> [...]
> Subject: [PATCH] mm/memory-failure: trace: change memory_failure_event to
> ras subsystem
>
> For historical version, commit 97f0b1345219 ("tracing: add trace event for
> memory-failure") introduced memory_failure_event in ras subsystem.
> commit 31807483d395 ("mm/memory-failure: remove the selection of RAS")
> changed memory_failure_event to memory_failure subsystem. This breaks
> the backward compatibility, some user programs rely on it.
>
> Change memory_failure_event to ras subsystem to keep backward
> compatibility.
>
> Fixes: 31807483d395 ("mm/memory-failure: remove the selection of RAS")
>
> Reported-by: Yi Lai <yi1.lai@intel.com>
> Reported-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Closes: https://lore.kernel.org/linux-mm/CY8PR11MB7134346A3E4BB28ECA28D6E989132@CY8PR11MB7134.namprd11.prod.outlook.com
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Signed-off-by: Xie Yuanbin <xieyuanbin1@huawei.com>
LGTM.
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Verified that rasdaemon can enable and receive memory_failure_event on
v7.1-rc3.
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Thanks
-Qiuxu
^ permalink raw reply
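For readers following the compatibility argument: the subsystem name set by TRACE_SYSTEM becomes a directory component under /sys/kernel/tracing/events/, so a tool that hard-codes the "ras" directory stops finding the event once it moves to "memory_failure". The following is a minimal sketch of how a user program might build that path. The helper name event_enable_path() is hypothetical and is not taken from rasdaemon or any other tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TRACEFS_EVENTS "/sys/kernel/tracing/events"

/*
 * Hypothetical helper: build the path of an event's "enable" file.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails.
 */
static int event_enable_path(char *buf, size_t len,
                             const char *system, const char *event)
{
    assert(buf != NULL && system != NULL && event != NULL);

    int n = snprintf(buf, len, "%s/%s/%s/enable",
                     TRACEFS_EVENTS, system, event);

    /* snprintf() reports the length it wanted; >= len means truncation. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With system set to "ras" this yields the path that existed before commit 31807483d395 and that the patch restores; with "memory_failure" it yields the path the event had in between.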
* [PATCH v2] EDAC/synopsys: Fix cleanup on injection sysfs failure
From: Yuho Choi @ 2026-06-05 12:54 UTC (permalink / raw)
To: Borislav Petkov, Tony Luck, Michal Simek
Cc: linux-edac, linux-arm-kernel, linux-kernel, Yuho Choi
edac_create_sysfs_attributes() creates inject_data_error before
inject_data_poison. If the second file creation fails, the first file is
left behind.
The same failure path runs after edac_mc_add_mc() has registered the
memory controller with the EDAC core. Jumping directly to edac_mc_free()
skips edac_mc_del_mc() and leaves the registered controller state
unwound incorrectly.
Remove inject_data_error when inject_data_poison creation fails, and
route the probe failure through edac_mc_del_mc() before freeing mci.
Fixes: 1a81361f75d8 ("EDAC, synopsys: Add Error Injection support for ZynqMP DDR controller")
Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
---
Changes in v2:
- Remove the CONFIG_EDAC_DEBUG-guarded del_mc label.
- Call edac_mc_del_mc() inline before jumping to free_edac_mc when
edac_create_sysfs_attributes() fails.
drivers/edac/synopsys_edac.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 51143b3257de..9ca2a842612e 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -1120,8 +1120,10 @@ static int edac_create_sysfs_attributes(struct mem_ctl_info *mci)
if (rc < 0)
return rc;
rc = device_create_file(&mci->dev, &dev_attr_inject_data_poison);
- if (rc < 0)
+ if (rc < 0) {
+ device_remove_file(&mci->dev, &dev_attr_inject_data_error);
return rc;
+ }
return 0;
}
@@ -1431,6 +1433,7 @@ static int mc_probe(struct platform_device *pdev)
if (rc) {
edac_printk(KERN_ERR, EDAC_MC,
"Failed to create sysfs entries\n");
+ edac_mc_del_mc(&pdev->dev);
goto free_edac_mc;
}
}
--
2.43.0
^ permalink raw reply related
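The unwind order the v2 patch establishes can be modelled outside the kernel. The sketch below is a user-space illustration only: struct fake_dev, fake_create_file(), fake_remove_file() and fake_probe() are hypothetical stand-ins, not EDAC or driver-core APIs, and the "registered" flag merely plays the role of edac_mc_add_mc()/edac_mc_del_mc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two injection attributes. */
enum { ATTR_ERROR, ATTR_POISON, ATTR_COUNT };

struct fake_dev {
    bool present[ATTR_COUNT]; /* which attribute files currently exist */
    bool registered;          /* models the controller being registered */
    int  fail_attr;           /* attribute whose creation fails, -1 for none */
};

static int fake_create_file(struct fake_dev *d, int attr)
{
    assert(d != NULL && attr >= 0 && attr < ATTR_COUNT);
    if (attr == d->fail_attr)
        return -1;
    d->present[attr] = true;
    return 0;
}

static void fake_remove_file(struct fake_dev *d, int attr)
{
    assert(d != NULL && attr >= 0 && attr < ATTR_COUNT);
    d->present[attr] = false;
}

/* Models the patched helper: undo the first file if the second fails. */
static int create_attrs(struct fake_dev *d)
{
    int rc = fake_create_file(d, ATTR_ERROR);
    if (rc < 0)
        return rc;

    rc = fake_create_file(d, ATTR_POISON);
    if (rc < 0) {
        fake_remove_file(d, ATTR_ERROR);
        return rc;
    }
    return 0;
}

/* Models the patched probe path: unregister before bailing out. */
static int fake_probe(struct fake_dev *d)
{
    d->registered = true;       /* stands in for edac_mc_add_mc() */

    int rc = create_attrs(d);
    if (rc) {
        d->registered = false;  /* stands in for edac_mc_del_mc() */
        return rc;              /* the caller would then free the controller */
    }
    return 0;
}
```

After a failed fake_probe() neither attribute is present and the device is no longer registered, which is the state the commit message describes as the goal.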
* Re: [PATCH v1] EDAC/synopsys: Fix cleanup on injection sysfs failure
From: 최유호 @ 2026-06-05 12:29 UTC (permalink / raw)
To: Michal Simek
Cc: Borislav Petkov, Tony Luck, linux-edac, linux-arm-kernel,
linux-kernel
In-Reply-To: <78e66f8f-3998-4051-99a5-7b5fccb40943@amd.com>
Dear Michal
I appreciate your time reviewing this.
On Fri, 5 Jun 2026 at 05:03, Michal Simek <michal.simek@amd.com> wrote:
> I don't think this is a nice way to do it. I would do it above to avoid using
> ifdefs here.
>
> like this
> if (rc) {
> edac_printk(KERN_ERR, EDAC_MC,
> "Failed to create sysfs entries\n");
> edac_mc_del_mc(&pdev->dev);
> goto free_edac_mc;
>
> }
> }
>
Yes, that is cleaner.
I'll update it as suggested and send v2.
Thanks,
Yuho
^ permalink raw reply
* Re: [PATCH 0/8] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Jürgen Groß @ 2026-06-05 10:06 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, x86, linux-edac, linux-pm, linux-hwmon,
linux-perf-users, platform-driver-x86, linux-acpi,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Tony Luck, Rafael J. Wysocki, Viresh Kumar,
Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark, Huang Rui, Mario Limonciello,
Perry Yuan, K Prateek Nayak, Srinivas Pandruvada, Len Brown,
Hans de Goede, Ilpo Järvinen
In-Reply-To: <aiKdSPS-YT6KZV81@gmail.com>
[-- Attachment #1.1.1: Type: text/plain, Size: 2308 bytes --]
On 05.06.26 11:56, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@kernel.org> wrote:
>
>>
>> * Jürgen Groß <jgross@suse.com> wrote:
>>
>>>> Well, we had a similar discussion back when we standardized on
>>>> rdmsrq() and wrmsrq(), and we use them as our primary 64-bit
>>>> MSR handling APIs. Why have a different pattern in any of the
>>>> derived APIs? It should really use the same conceptual namespace,
>>>> not some confusing mixture of two naming schemes.
>>>
>>> In the long run I'd like to do the same conversion for the rdmsr*() and
>>> wrmsr*() interfaces, too (so only offering and using the 64-bit variants).
>>
>> Why? We had this discussion for the original MSR API namespace
>> cleanup a year ago, and decided to standardize on the rdmsrq()/wrmsrq()
>> namespace:
>>
>> c435e608cf59 x86/msr: Rename 'rdmsrl()' to 'rdmsrq()'
>> 78255eb23973 x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'
>> 6fe22abacd40 x86/msr: Rename 'rdmsrl_safe()' to 'rdmsrq_safe()'
>> 6fa17efe4544 x86/msr: Rename 'wrmsrl_safe()' to 'wrmsrq_safe()'
>> 5e404cb7ac4c x86/msr: Rename 'rdmsrl_safe_on_cpu()' to 'rdmsrq_safe_on_cpu()'
>> 27a23a544a55 x86/msr: Rename 'wrmsrl_safe_on_cpu()' to 'wrmsrq_safe_on_cpu()'
>> d7484babd2c4 x86/msr: Rename 'rdmsrl_on_cpu()' to 'rdmsrq_on_cpu()'
>> c895ecdab2e4 x86/msr: Rename 'wrmsrl_on_cpu()' to 'wrmsrq_on_cpu()'
>> ebe29309c4d2 x86/msr: Rename 'mce_rdmsrl()' to 'mce_rdmsrq()'
>> 8e44e83f57c3 x86/msr: Rename 'mce_wrmsrl()' to 'mce_wrmsrq()'
>> e2b8af0c6939 x86/msr: Rename 'rdmsrl_amd_safe()' to 'rdmsrq_amd_safe()'
>> 604d15d15ebd x86/msr: Rename 'wrmsrl_amd_safe()' to 'wrmsrq_amd_safe()'
>> 7cbc2ba7c107 x86/msr: Rename 'native_wrmsrl()' to 'native_wrmsrq()'
>> eef476f15c83 x86/msr: Rename 'wrmsrl_cstar()' to 'wrmsrq_cstar()'
>>
>> There's several good reasons to use the 'q' suffix in the API names,
>> why relitigate this? :-)
>
> And just to be clear: I have no objections whatsoever to
> phasing out all the old 32-bit APIs, like your series does.
Thanks for the confirmation. :-)
And regarding the "q" suffix: I'm not insisting on dropping it, I just felt it
would no longer be needed once the variants without a suffix no longer exist.
If you'd like to keep it, then so be it.
Juergen
[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3743 bytes --]
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]
^ permalink raw reply
* Re: [PATCH 0/8] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Jürgen Groß @ 2026-06-05 10:05 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, x86, linux-edac, linux-pm, linux-hwmon,
linux-perf-users, platform-driver-x86, linux-acpi,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Tony Luck, Rafael J. Wysocki, Viresh Kumar,
Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark, Huang Rui, Mario Limonciello,
Perry Yuan, K Prateek Nayak, Srinivas Pandruvada, Len Brown,
Hans de Goede, Ilpo Järvinen
In-Reply-To: <aiKdSPS-YT6KZV81@gmail.com>
[-- Attachment #1.1.1: Type: text/plain, Size: 2308 bytes --]
On 05.06.26 11:56, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@kernel.org> wrote:
>
>>
>> * Jürgen Groß <jgross@suse.com> wrote:
>>
>>>> Well, we had a similar discussion back when we standardized on
>>>> rdmsrq() and wrmsrq(), and we use them as our primary 64-bit
>>>> MSR handling APIs. Why have a different pattern in any of the
>>>> derived APIs? It should really use the same conceptual namespace,
>>>> not some confusing mixture of two naming schemes.
>>>
>>> In the long run I'd like to do the same conversion for the rdmsr*() and
>>> wrmsr*() interfaces, too (so only offering and using the 64-bit variants).
>>
>> Why? We had this discussion for the original MSR API namespace
>> cleanup a year ago, and decided to standardize on the rdmsrq()/wrmsrq()
>> namespace:
>>
>> c435e608cf59 x86/msr: Rename 'rdmsrl()' to 'rdmsrq()'
>> 78255eb23973 x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'
>> 6fe22abacd40 x86/msr: Rename 'rdmsrl_safe()' to 'rdmsrq_safe()'
>> 6fa17efe4544 x86/msr: Rename 'wrmsrl_safe()' to 'wrmsrq_safe()'
>> 5e404cb7ac4c x86/msr: Rename 'rdmsrl_safe_on_cpu()' to 'rdmsrq_safe_on_cpu()'
>> 27a23a544a55 x86/msr: Rename 'wrmsrl_safe_on_cpu()' to 'wrmsrq_safe_on_cpu()'
>> d7484babd2c4 x86/msr: Rename 'rdmsrl_on_cpu()' to 'rdmsrq_on_cpu()'
>> c895ecdab2e4 x86/msr: Rename 'wrmsrl_on_cpu()' to 'wrmsrq_on_cpu()'
>> ebe29309c4d2 x86/msr: Rename 'mce_rdmsrl()' to 'mce_rdmsrq()'
>> 8e44e83f57c3 x86/msr: Rename 'mce_wrmsrl()' to 'mce_wrmsrq()'
>> e2b8af0c6939 x86/msr: Rename 'rdmsrl_amd_safe()' to 'rdmsrq_amd_safe()'
>> 604d15d15ebd x86/msr: Rename 'wrmsrl_amd_safe()' to 'wrmsrq_amd_safe()'
>> 7cbc2ba7c107 x86/msr: Rename 'native_wrmsrl()' to 'native_wrmsrq()'
>> eef476f15c83 x86/msr: Rename 'wrmsrl_cstar()' to 'wrmsrq_cstar()'
>>
>> There's several good reasons to use the 'q' suffix in the API names,
>> why relitigate this? :-)
>
> And just to be clear: I have no objections whatsoever to
> phasing out all the old 32-bit APIs, like your series does.
Thanks for the confirmation. :-)
And regarding the "q" suffix: I'm not insisting on dropping it, I just felt it
would no longer be needed once the variants without a suffix no longer exist.
If you'd like to keep it, then so be it.
Juergen
[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3743 bytes --]
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]
^ permalink raw reply
* Re: [PATCH 0/8] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Ingo Molnar @ 2026-06-05 9:56 UTC (permalink / raw)
To: Jürgen Groß
Cc: linux-kernel, x86, linux-edac, linux-pm, linux-hwmon,
linux-perf-users, platform-driver-x86, linux-acpi,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Tony Luck, Rafael J. Wysocki, Viresh Kumar,
Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark, Huang Rui, Mario Limonciello,
Perry Yuan, K Prateek Nayak, Srinivas Pandruvada, Len Brown,
Hans de Goede, Ilpo Järvinen
In-Reply-To: <aiKcz6I8GO-TG8uq@gmail.com>
* Ingo Molnar <mingo@kernel.org> wrote:
>
> * Jürgen Groß <jgross@suse.com> wrote:
>
> > > Well, we had a similar discussion back when we standardized on
> > > rdmsrq() and wrmsrq(), and we use them as our primary 64-bit
> > > MSR handling APIs. Why have a different pattern in any of the
> > > derived APIs? It should really use the same conceptual namespace,
> > > not some confusing mixture of two naming schemes.
> >
> > In the long run I'd like to do the same conversion for the rdmsr*() and
> > wrmsr*() interfaces, too (so only offering and using the 64-bit variants).
>
> Why? We had this discussion for the original MSR API namespace
> cleanup a year ago, and decided to standardize on the rdmsrq()/wrmsrq()
> namespace:
>
> c435e608cf59 x86/msr: Rename 'rdmsrl()' to 'rdmsrq()'
> 78255eb23973 x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'
> 6fe22abacd40 x86/msr: Rename 'rdmsrl_safe()' to 'rdmsrq_safe()'
> 6fa17efe4544 x86/msr: Rename 'wrmsrl_safe()' to 'wrmsrq_safe()'
> 5e404cb7ac4c x86/msr: Rename 'rdmsrl_safe_on_cpu()' to 'rdmsrq_safe_on_cpu()'
> 27a23a544a55 x86/msr: Rename 'wrmsrl_safe_on_cpu()' to 'wrmsrq_safe_on_cpu()'
> d7484babd2c4 x86/msr: Rename 'rdmsrl_on_cpu()' to 'rdmsrq_on_cpu()'
> c895ecdab2e4 x86/msr: Rename 'wrmsrl_on_cpu()' to 'wrmsrq_on_cpu()'
> ebe29309c4d2 x86/msr: Rename 'mce_rdmsrl()' to 'mce_rdmsrq()'
> 8e44e83f57c3 x86/msr: Rename 'mce_wrmsrl()' to 'mce_wrmsrq()'
> e2b8af0c6939 x86/msr: Rename 'rdmsrl_amd_safe()' to 'rdmsrq_amd_safe()'
> 604d15d15ebd x86/msr: Rename 'wrmsrl_amd_safe()' to 'wrmsrq_amd_safe()'
> 7cbc2ba7c107 x86/msr: Rename 'native_wrmsrl()' to 'native_wrmsrq()'
> eef476f15c83 x86/msr: Rename 'wrmsrl_cstar()' to 'wrmsrq_cstar()'
>
> There's several good reasons to use the 'q' suffix in the API names,
> why relitigate this? :-)
And just to be clear: I have no objections whatsoever to
phasing out all the old 32-bit APIs, like your series does.
Thanks,
Ingo
^ permalink raw reply
* Re: [PATCH 0/8] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Ingo Molnar @ 2026-06-05 9:54 UTC (permalink / raw)
To: Jürgen Groß
Cc: linux-kernel, x86, linux-edac, linux-pm, linux-hwmon,
linux-perf-users, platform-driver-x86, linux-acpi,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Tony Luck, Rafael J. Wysocki, Viresh Kumar,
Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark, Huang Rui, Mario Limonciello,
Perry Yuan, K Prateek Nayak, Srinivas Pandruvada, Len Brown,
Hans de Goede, Ilpo Järvinen
In-Reply-To: <b7e799a6-1f1a-49ef-8aac-0d5fd4a06dc7@suse.com>
* Jürgen Groß <jgross@suse.com> wrote:
> > Well, we had a similar discussion back when we standardized on
> > rdmsrq() and wrmsrq(), and we use them as our primary 64-bit
> > MSR handling APIs. Why have a different pattern in any of the
> > derived APIs? It should really use the same conceptual namespace,
> > not some confusing mixture of two naming schemes.
>
> In the long run I'd like to do the same conversion for the rdmsr*() and
> wrmsr*() interfaces, too (so only offering and using the 64-bit variants).
Why? We had this discussion for the original MSR API namespace
cleanup a year ago, and decided to standardize on the rdmsrq()/wrmsrq()
namespace:
c435e608cf59 x86/msr: Rename 'rdmsrl()' to 'rdmsrq()'
78255eb23973 x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'
6fe22abacd40 x86/msr: Rename 'rdmsrl_safe()' to 'rdmsrq_safe()'
6fa17efe4544 x86/msr: Rename 'wrmsrl_safe()' to 'wrmsrq_safe()'
5e404cb7ac4c x86/msr: Rename 'rdmsrl_safe_on_cpu()' to 'rdmsrq_safe_on_cpu()'
27a23a544a55 x86/msr: Rename 'wrmsrl_safe_on_cpu()' to 'wrmsrq_safe_on_cpu()'
d7484babd2c4 x86/msr: Rename 'rdmsrl_on_cpu()' to 'rdmsrq_on_cpu()'
c895ecdab2e4 x86/msr: Rename 'wrmsrl_on_cpu()' to 'wrmsrq_on_cpu()'
ebe29309c4d2 x86/msr: Rename 'mce_rdmsrl()' to 'mce_rdmsrq()'
8e44e83f57c3 x86/msr: Rename 'mce_wrmsrl()' to 'mce_wrmsrq()'
e2b8af0c6939 x86/msr: Rename 'rdmsrl_amd_safe()' to 'rdmsrq_amd_safe()'
604d15d15ebd x86/msr: Rename 'wrmsrl_amd_safe()' to 'wrmsrq_amd_safe()'
7cbc2ba7c107 x86/msr: Rename 'native_wrmsrl()' to 'native_wrmsrq()'
eef476f15c83 x86/msr: Rename 'wrmsrl_cstar()' to 'wrmsrq_cstar()'
There's several good reasons to use the 'q' suffix in the API names,
why relitigate this? :-)
Thanks,
Ingo
^ permalink raw reply
* Re: [PATCH 0/8] x86/msr: Drop 32-bit variants of *_on_cpu() MSR functions
From: Jürgen Groß @ 2026-06-05 9:40 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, x86, linux-edac, linux-pm, linux-hwmon,
linux-perf-users, platform-driver-x86, linux-acpi,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Tony Luck, Rafael J. Wysocki, Viresh Kumar,
Guenter Roeck, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, James Clark, Huang Rui, Mario Limonciello,
Perry Yuan, K Prateek Nayak, Srinivas Pandruvada, Len Brown,
Hans de Goede, Ilpo Järvinen
In-Reply-To: <aiKUenaT9VD0DrpW@gmail.com>
[-- Attachment #1.1.1: Type: text/plain, Size: 2883 bytes --]
On 05.06.26 11:18, Ingo Molnar wrote:
>
> * Jürgen Groß <jgross@suse.com> wrote:
>
>> On 05.06.26 11:05, Ingo Molnar wrote:
>>>
>>> * Juergen Gross <jgross@suse.com> wrote:
>>>
>>>> Drop the variants using 2 32-bit values instead of a single 64-bit one
>>>> of the *_on_cpu() MSR access functions.
>>>>
>>>> Juergen Gross (8):
>>>> x86/msr: Switch rdmsr_on_cpu() to return a 64-bit quantity
>>>> x86/msr: Switch all callers of rdmsrq_on_cpu() to use rdmsr_on_cpu()
>>>> x86/msr: Switch wrmsr_on_cpu() to use a 64-bit quantity
>>>> x86/msr: Switch all callers of wrmsrq_on_cpu() to use wrmsr_on_cpu()
>>>> x86/msr: Switch rdmsr_safe_on_cpu() to return a 64-bit quantity
>>>> x86/msr: Switch all callers of rdmsrq_safe_on_cpu() to use rdmsr_safe_on_cpu()
>>>> x86/msr: Switch wrmsr_safe_on_cpu() to use a 64-bit quantity
>>>> x86/msr: Switch all callers of wrmsrq_safe_on_cpu() to use wrmsr_safe_on_cpu()
>>>
>>> To sum up my review feedback for the individual patches, we want
>>> to do this instead:
>>>
>>> x86/msr: Convert rdmsrl_on_cpu() users to rdmsrq_on_cpu()
>>> x86/msr: Drop the rdmsrl_on_cpu() alias to rdmsrq_on_cpu()
>>>
>>> x86/msr: Switch all callers of rdmsr_on_cpu() to use rdmsrq_on_cpu()
>>> x86/msr: Remove the unused rdmsr_on_cpu() API
>>>
>>> x86/msr: Switch all callers of wrmsr_on_cpu() to use wrmsrq_on_cpu()
>>> x86/msr: Remove unused wrmsr_on_cpu() API
>>>
>>> x86/msr: Switch all callers of rdmsr_safe_on_cpu() to use rdmsrq_safe_on_cpu()
>>> x86/msr: Remove unused rdmsr_safe_on_cpu() API
>>>
>>> x86/msr: Switch all callers of wrmsr_safe_on_cpu() to use wrmsrq_safe_on_cpu()
>>> x86/msr: Remove unused wrmsr_safe_on_cpu() API
>>>
>>> Note how there's no "conversion" of the 32-bit API itself in this
>>> approach, we just do a straightforward migration of the users to
>>> the already existing 64-bit APIs, then remove any unused APIs.
>>
>> Fine with me, but I just wanted to get rid of the "q" and "l" suffixes
>> completely, as they serve no special purpose after dropping all other
>> variants.
>>
>> OTOH, if wanted, such a switch could easily be done later.
>
> Well, we had a similar discussion back when we standardized on
> rdmsrq() and wrmsrq(), and we use them as our primary 64-bit
> MSR handling APIs. Why have a different pattern in any of the
> derived APIs? It should really use the same conceptual namespace,
> not some confusing mixture of two naming schemes.
In the long run I'd like to do the same conversion for the rdmsr*() and
wrmsr*() interfaces, too (so only offering and using the 64-bit variants).
I understand that this is not guaranteed to be accepted immediately after
this series, so I agree that it is better to keep the "q" suffix for now
in order to avoid confusion.
Juergen
[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3743 bytes --]
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]
^ permalink raw reply