* [PATCH] thermal: intel: Don't set HFI status bit to 1
@ 2022-12-14 2:06 Srinivas Pandruvada
2022-12-14 3:16 ` Linus Torvalds
0 siblings, 1 reply; 3+ messages in thread
From: Srinivas Pandruvada @ 2022-12-14 2:06 UTC
To: rafael, daniel.lezcano, rui.zhang, amitk, torvalds
Cc: linux-pm, linux-kernel, Srinivas Pandruvada
When the CPU doesn't support HFI (Hardware Feedback Interface), don't include
BIT(26) in the mask that is written to prevent clearing. Otherwise, this results in:
unchecked MSR access error: WRMSR to 0x1b1
(tried to write 0x0000000004000aa8)
at rIP: 0xffffffff8b8559fe (throttle_active_work+0xbe/0x1b0)
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 6fe1e64b6026 ("thermal: intel: Prevent accidental clearing of HFI status")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
drivers/thermal/intel/therm_throt.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
index 4bb7fddaa143..2e22bb82b738 100644
--- a/drivers/thermal/intel/therm_throt.c
+++ b/drivers/thermal/intel/therm_throt.c
@@ -194,7 +194,7 @@ static const struct attribute_group thermal_attr_group = {
#define THERM_STATUS_PROCHOT_LOG BIT(1)
#define THERM_STATUS_CLEAR_CORE_MASK (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11) | BIT(13) | BIT(15))
-#define THERM_STATUS_CLEAR_PKG_MASK (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11) | BIT(26))
+#define THERM_STATUS_CLEAR_PKG_MASK (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11))
/*
* Clear the bits in package thermal status register for bit = 1
@@ -211,6 +211,9 @@ void thermal_clear_package_intr_status(int level, u64 bit_mask)
} else {
msr = MSR_IA32_PACKAGE_THERM_STATUS;
msr_val = THERM_STATUS_CLEAR_PKG_MASK;
+ if (boot_cpu_has(X86_FEATURE_HFI))
+ msr_val |= BIT(26);
+
}
msr_val &= ~bit_mask;
--
2.31.1
* Re: [PATCH] thermal: intel: Don't set HFI status bit to 1
From: Linus Torvalds @ 2022-12-14 3:16 UTC
To: Srinivas Pandruvada
Cc: rafael, daniel.lezcano, rui.zhang, amitk, linux-pm, linux-kernel
On Tue, Dec 13, 2022 at 6:07 PM Srinivas Pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
>
> When CPU doesn't support HFI (Hardware Feedback Interface), don't include
> BIT 26 in the mask to prevent clearing. otherwise this results in:
> unchecked MSR access error: WRMSR to 0x1b1
> (tried to write 0x0000000004000aa8)
> at rIP: 0xffffffff8b8559fe (throttle_active_work+0xbe/0x1b0)
>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Fixes: 6fe1e64b6026 ("thermal: intel: Prevent accidental clearing of HFI status")
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
You can add my tested-by as well, it seems to fix the issue.
Of course, it could be that I just didn't happen to trigger the
throttling in my test just now, so that testing is pretty limited, but
at least from a very quick check it seems good.
Linus
* Re: [PATCH] thermal: intel: Don't set HFI status bit to 1
From: Rafael J. Wysocki @ 2022-12-14 13:53 UTC
To: Linus Torvalds, Srinivas Pandruvada
Cc: rafael, daniel.lezcano, rui.zhang, amitk, linux-pm, linux-kernel
On Wed, Dec 14, 2022 at 4:16 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Tue, Dec 13, 2022 at 6:07 PM Srinivas Pandruvada
> <srinivas.pandruvada@linux.intel.com> wrote:
> >
> > When CPU doesn't support HFI (Hardware Feedback Interface), don't include
> > BIT 26 in the mask to prevent clearing. otherwise this results in:
> > unchecked MSR access error: WRMSR to 0x1b1
> > (tried to write 0x0000000004000aa8)
> > at rIP: 0xffffffff8b8559fe (throttle_active_work+0xbe/0x1b0)
> >
> > Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> > Fixes: 6fe1e64b6026 ("thermal: intel: Prevent accidental clearing of HFI status")
> > Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
>
> You can add my tested-by as well, it seems to fix the issue.
>
> Of course, it could be that I just didn't happen to trigger the
> throttling in my test just now, so that testing is pretty limited, but
> at least from a very quick check it seems good.
I've applied the patch for 6.2-rc1, thanks!