* intel_pstate_get_hwp_cap on wrong CPU
From: Sebastian Andrzej Siewior @ 2024-05-29 15:57 UTC
To: linux-pm
Cc: Rafael J. Wysocki, Len Brown, Srinivas Pandruvada,
Thomas Gleixner, Viresh Kumar
Hi,
this just popped up:
| [ 6538.614568] unchecked MSR access error: RDMSR from 0x771 at rIP: 0xffffffff817d2250 (__rdmsr_on_cpu+0x20/0x50)
| [ 6538.625714] Call Trace:
| [ 6538.629156] <TASK>
| [ 6538.675067] generic_exec_single+0x58/0x120
| [ 6538.680236] smp_call_function_single+0xbf/0x110
| [ 6538.696333] rdmsrl_on_cpu+0x46/0x60
| [ 6538.700894] intel_pstate_get_hwp_cap+0x1b/0x70
| [ 6538.706420] intel_pstate_update_limits+0x2a/0x60
| [ 6538.712110] acpi_processor_notify+0xb7/0x140
| [ 6538.717479] acpi_ev_notify_dispatch+0x3b/0x60
| [ 6538.722910] acpi_os_execute_deferred+0xf/0x20
| [ 6538.728346] process_one_work+0x13d/0x350
| [ 6538.733342] worker_thread+0x2c2/0x3d0
| [ 6538.743066] kthread+0xca/0x100
| [ 6538.751844] ret_from_fork+0x2c/0x40
| [ 6538.761073] ret_from_fork_asm+0x11/0x20
| [ 6538.765993] </TASK>
| root@h:~# grep . /sys/devices/system/cpu/intel_pstate/*
| /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
| /sys/devices/system/cpu/intel_pstate/min_perf_pct:36
| /sys/devices/system/cpu/intel_pstate/no_turbo:0
| /sys/devices/system/cpu/intel_pstate/num_pstates:4294967285
^ => -EAGAIN
| /sys/devices/system/cpu/intel_pstate/status:passive
| /sys/devices/system/cpu/intel_pstate/turbo_pct:0
|# cpupower frequency-info
|analyzing CPU 0:
| driver: intel_cpufreq
…
from turbostat:
| CPUID(0): GenuineIntel 0xf CPUID levels
| CPUID(1): family:model:stepping 0x6:3f:4 (6:63:4) microcode 0x1a
| CPUID(0x80000000): max_extended_levels: 0x80000008
| CPUID(1): SSE3 MONITOR SMX EIST TM2 TSC MSR ACPI-TM HT TM
| CPUID(6): APERF, TURBO, DTS, PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, EPB
| cpu4: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT PREFETCH TURBO)
| CPUID(7): No-SGX No-Hybrid
This is v6.10.0-rc1.
Sebastian
* Re: intel_pstate_get_hwp_cap on wrong CPU
From: srinivas pandruvada @ 2024-05-29 23:08 UTC
To: Sebastian Andrzej Siewior, linux-pm
Cc: Rafael J. Wysocki, Len Brown, Thomas Gleixner, Viresh Kumar
Hi Sebastian,
On Wed, 2024-05-29 at 17:57 +0200, Sebastian Andrzej Siewior wrote:
> Hi,
>
> this just popped up:
> > [ 6538.614568] unchecked MSR access error: RDMSR from 0x771 at rIP:
Please check whether the attached change fixes this.
Thanks,
Srinivas
> > 0xffffffff817d2250 (__rdmsr_on_cpu+0x20/0x50)
> > [ 6538.625714] Call Trace:
> > [ 6538.629156] <TASK>
> > [ 6538.675067] generic_exec_single+0x58/0x120
> > [ 6538.680236] smp_call_function_single+0xbf/0x110
> > [ 6538.696333] rdmsrl_on_cpu+0x46/0x60
> > [ 6538.700894] intel_pstate_get_hwp_cap+0x1b/0x70
> > [ 6538.706420] intel_pstate_update_limits+0x2a/0x60
> > [ 6538.712110] acpi_processor_notify+0xb7/0x140
> > [ 6538.717479] acpi_ev_notify_dispatch+0x3b/0x60
> > [ 6538.722910] acpi_os_execute_deferred+0xf/0x20
> > [ 6538.728346] process_one_work+0x13d/0x350
> > [ 6538.733342] worker_thread+0x2c2/0x3d0
> > [ 6538.743066] kthread+0xca/0x100
> > [ 6538.751844] ret_from_fork+0x2c/0x40
> > [ 6538.761073] ret_from_fork_asm+0x11/0x20
> > [ 6538.765993] </TASK>
>
> > root@h:~# grep . /sys/devices/system/cpu/intel_pstate/*
> > /sys/devices/system/cpu/intel_pstate/max_perf_pct:100
> > /sys/devices/system/cpu/intel_pstate/min_perf_pct:36
> > /sys/devices/system/cpu/intel_pstate/no_turbo:0
> > /sys/devices/system/cpu/intel_pstate/num_pstates:4294967285
> ^ => -EAGAIN
>
> > /sys/devices/system/cpu/intel_pstate/status:passive
> > /sys/devices/system/cpu/intel_pstate/turbo_pct:0
>
> > # cpupower frequency-info
> > analyzing CPU 0:
> > driver: intel_cpufreq
> …
>
> from turbostat:
> > CPUID(0): GenuineIntel 0xf CPUID levels
> > CPUID(1): family:model:stepping 0x6:3f:4 (6:63:4) microcode 0x1a
> > CPUID(0x80000000): max_extended_levels: 0x80000008
> > CPUID(1): SSE3 MONITOR SMX EIST TM2 TSC MSR ACPI-TM HT TM
> > CPUID(6): APERF, TURBO, DTS, PTM, No-HWP, No-HWPnotify, No-
> > HWPwindow, No-HWPepp, No-HWPpkg, EPB
> > cpu4: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT PREFETCH
> > TURBO)
> > CPUID(7): No-SGX No-Hybrid
>
> This is v6.10.0-rc1.
>
> Sebastian
>
[-- Attachment #2: 0001-cpufreq-intel_pstate-Fix-unchecked-HWP-MSR-access.patch --]
[-- Type: text/x-patch, Size: 1393 bytes --]
From f85a83508ef029bceaf9192cb648d66f32b61d02 Mon Sep 17 00:00:00 2001
From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date: Wed, 29 May 2024 15:30:49 -0700
Subject: [PATCH] cpufreq: intel_pstate: Fix unchecked HWP MSR access
HWP MSR 0x771 can only be read on a CPU that supports HWP and has it enabled.
Hence intel_pstate_get_hwp_cap() can only be called when hwp_active is
true.
Reported-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Closes: https://lore.kernel.org/linux-pm/20240529155740.Hq2Hw7be@linutronix.de/
Fixes: e8217b4bece3 ("cpufreq: intel_pstate: Update the maximum CPU frequency consistently")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
drivers/cpufreq/intel_pstate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 4b986c044741..65d3f79104bd 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1153,7 +1153,8 @@ static void intel_pstate_update_policies(void)
 static void __intel_pstate_update_max_freq(struct cpudata *cpudata,
 					   struct cpufreq_policy *policy)
 {
-	intel_pstate_get_hwp_cap(cpudata);
+	if (hwp_active)
+		intel_pstate_get_hwp_cap(cpudata);

 	policy->cpuinfo.max_freq = READ_ONCE(global.no_turbo) ?
 			cpudata->pstate.max_freq : cpudata->pstate.turbo_freq;
--
2.25.1
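The turbostat output in the original report shows CPUID(6) decoding to No-HWP, which is consistent with the read of MSR 0x771 faulting on this machine. Below is a minimal userspace sketch of the guard idea, assuming the Intel SDM layout in which CPUID leaf 6, EAX bit 7 advertises HWP. The names hwp_supported(), hwp_active_sim and read_hwp_cap_guarded() are hypothetical and only for illustration; the driver's real hwp_active flag also depends on how the driver was set up, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 6, EAX bit 7: HWP base registers supported (Intel SDM). */
#define CPUID6_EAX_HWP (1u << 7)

/* Hypothetical helper: decode the HWP feature bit from a CPUID(6).EAX value. */
static bool hwp_supported(uint32_t cpuid6_eax)
{
	return (cpuid6_eax & CPUID6_EAX_HWP) != 0;
}

/*
 * Hypothetical stand-in for the guarded call in the patch: only use the
 * capability MSR value when HWP is active, otherwise keep the cached value.
 * fake_msr_value simulates what a real RDMSR of 0x771 would return; no MSR
 * is actually accessed here.
 */
static uint64_t read_hwp_cap_guarded(bool hwp_active_sim, uint64_t cached,
				     uint64_t fake_msr_value)
{
	if (!hwp_active_sim)
		return cached;	/* skip the access, as the patch does */
	return fake_msr_value;
}
```

On a CPU without HWP the guarded helper returns the cached value unchanged, which mirrors the patch skipping intel_pstate_get_hwp_cap() when hwp_active is false.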
* Re: intel_pstate_get_hwp_cap on wrong CPU
From: Sebastian Andrzej Siewior @ 2024-05-31 11:02 UTC
To: srinivas pandruvada
Cc: linux-pm, Rafael J. Wysocki, Len Brown, Thomas Gleixner,
Viresh Kumar
On 2024-05-29 16:08:19 [-0700], srinivas pandruvada wrote:
> Hi Sebastian,
Hi,
> On Wed, 2024-05-29 at 17:57 +0200, Sebastian Andrzej Siewior wrote:
> > Hi,
> >
> > this just popped up:
> > > [ 6538.614568] unchecked MSR access error: RDMSR from 0x771 at rIP:
>
> Please check whether the attached change fixes this.
It should, based on the call chain. Let me test it in a few…
Would you mind letting
/sys/devices/system/cpu/intel_pstate/num_pstates
report something sane? Not 4294967285 but 0, for instance? Would that
make sense?
But
…
> From f85a83508ef029bceaf9192cb648d66f32b61d02 Mon Sep 17 00:00:00 2001
> From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Date: Wed, 29 May 2024 15:30:49 -0700
> Subject: [PATCH] cpufreq: intel_pstate: Fix unchecked HWP MSR access
>
> HWP MSR 0x771 can only be read on a CPU that supports HWP and has it enabled.
> Hence intel_pstate_get_hwp_cap() can only be called when hwp_active is
> true.
>
> Reported-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Did Sebastian _Reichel_ report it, too?
> Closes: https://lore.kernel.org/linux-pm/20240529155740.Hq2Hw7be@linutronix.de/
Because this is my report ;)
> Fixes: e8217b4bece3 ("cpufreq: intel_pstate: Update the maximum CPU frequency consistently")
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Sebastian
* Re: intel_pstate_get_hwp_cap on wrong CPU
From: srinivas pandruvada @ 2024-05-31 11:56 UTC
To: Sebastian Andrzej Siewior
Cc: linux-pm, Rafael J. Wysocki, Len Brown, Thomas Gleixner,
Viresh Kumar
Hi,
On Fri, 2024-05-31 at 13:02 +0200, Sebastian Andrzej Siewior wrote:
> On 2024-05-29 16:08:19 [-0700], srinivas pandruvada wrote:
> > Hi Sebastian,
> Hi,
>
> > On Wed, 2024-05-29 at 17:57 +0200, Sebastian Andrzej Siewior wrote:
> > > Hi,
> > >
> > > this just popped up:
> > > > [ 6538.614568] unchecked MSR access error: RDMSR from 0x771 at
> > > > rIP:
> >
> > Please check whether the attached change fixes this.
>
> It should, based on the call chain. Let me test it in a few…
>
> Would you mind letting
> /sys/devices/system/cpu/intel_pstate/num_pstates
>
> report something sane? Not 4294967285 but 0, for instance? Would that
> make sense?
>
It should be a sane value, usually less than 50. Do you see this high
number even without triggering the condition that caused the warning?
On your system, the firmware changed performance and signalled that via
an ACPI notification. That method has been deprecated for a while. You
are using Haswell, which supports it, but it is deprecated from Skylake
onwards.
> …
> > From f85a83508ef029bceaf9192cb648d66f32b61d02 Mon Sep 17 00:00:00
> > 2001
> > From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> > Date: Wed, 29 May 2024 15:30:49 -0700
> > Subject: [PATCH] cpufreq: intel_pstate: Fix unchecked HWP MSR
> > access
> >
> > HWP MSR 0x771 can only be read on a CPU that supports HWP and has it
> > enabled.
> > Hence intel_pstate_get_hwp_cap() can only be called when hwp_active
> > is true.
> >
> > Reported-by: Sebastian Reichel <sebastian.reichel@collabora.com>
>
> Did Sebastian _Reichel_ report it, too?
My mistake. I picked the wrong Sebastian. Sorry.
Thanks,
Srinivas
>
> > Closes:
> > https://lore.kernel.org/linux-pm/20240529155740.Hq2Hw7be@linutronix.de/
>
> Because this is my report ;)
>
> > Fixes: e8217b4bece3 ("cpufreq: intel_pstate: Update the maximum CPU
> > frequency consistently")
> > Signed-off-by: Srinivas Pandruvada
> > <srinivas.pandruvada@linux.intel.com>
>
> Sebastian
* Re: intel_pstate_get_hwp_cap on wrong CPU
From: Sebastian Andrzej Siewior @ 2024-05-31 14:01 UTC
To: srinivas pandruvada
Cc: linux-pm, Rafael J. Wysocki, Len Brown, Thomas Gleixner,
Viresh Kumar
On 2024-05-31 04:56:04 [-0700], srinivas pandruvada wrote:
> > Would you mind letting
> > /sys/devices/system/cpu/intel_pstate/num_pstates
> >
> > report something sane? Not 4294967285 but 0, for instance? Would
> > that make sense?
> >
> It should be a sane value, usually less than 50. Do you see this high
> number even without triggering the condition that caused the warning?
Nope, without the error I see 22. I think this went "-EIO + -EIO -1"
which ended up as what I reported by chance. Never mind then.
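The guess above can be checked with a little arithmetic. This is a minimal sketch, assuming the usual Linux x86 errno values (EIO = 5, EAGAIN = 11) from <errno.h>; guessed_sum() and as_u32() are hypothetical helpers, and how the driver actually computes num_pstates is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a negative errno-style value the way an unsigned print would. */
static uint32_t as_u32(int value)
{
	return (uint32_t)value;
}

/* Hypothetical helper: the sum guessed in the thread, "-EIO + -EIO - 1". */
static int guessed_sum(void)
{
	return -EIO + -EIO - 1;
}

/* Print the signed value, its unsigned reading, and -EAGAIN side by side. */
static void show_num_pstates_guess(void)
{
	int sum = guessed_sum();

	printf("signed: %d, unsigned: %u, -EAGAIN: %d\n",
	       sum, as_u32(sum), -EAGAIN);
}
```

With those errno values the sum is -11, which reads as 4294967285 when treated as an unsigned 32-bit number and happens to equal -EAGAIN, matching both the sysfs output and the annotation in the original report.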
> On your system, the firmware changed performance and signalled that via
> an ACPI notification. That method has been deprecated for a while. You
> are using Haswell, which supports it, but it is deprecated from Skylake
> onwards.
Okay. Anything I should change?
> > Did Sebastian _Reichel_ report it, too?
> My mistake. I picked the wrong Sebastian. Sorry.
Ah okay. Then
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
because it no longer reads the MSR in the code path I reported.
> Thanks,
> Srinivas
Sebastian
* Re: intel_pstate_get_hwp_cap on wrong CPU
From: srinivas pandruvada @ 2024-05-31 22:27 UTC
To: Sebastian Andrzej Siewior
Cc: linux-pm, Rafael J. Wysocki, Len Brown, Thomas Gleixner,
Viresh Kumar
Hi,
On Fri, 2024-05-31 at 16:01 +0200, Sebastian Andrzej Siewior wrote:
> On 2024-05-31 04:56:04 [-0700], srinivas pandruvada wrote:
> > > Would you mind letting
> > > /sys/devices/system/cpu/intel_pstate/num_pstates
> > >
> > > report something sane? Not 4294967285 but 0, for instance?
> > > Would that make sense?
> > >
> > It should be a sane value, usually less than 50. Do you see this
> > high number even without triggering the condition that caused the
> > warning?
>
> Nope, without the error I see 22. I think this went "-EIO + -EIO -1"
> which ended up as what I reported by chance. Never mind then.
>
> > On your system, the firmware changed performance and signalled that
> > via an ACPI notification. That method has been deprecated for a
> > while. You are using Haswell, which supports it, but it is
> > deprecated from Skylake onwards.
>
> Okay. Anything I should change?
You can't on these models.
>
> > > Did Sebastian _Reichel_ report it, too?
> > My mistake. I picked the wrong Sebastian. Sorry.
>
> Ah okay. Then
> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>
> because it no longer reads the MSR in the code path I reported.
>
Thanks. I will post the patch to the mailing list with the correction.
-Srinivas
> > Thanks,
> > Srinivas
> Sebastian
>