From: srinivas pandruvada <srinivas.pandruvada@linux.intel.com>
To: Arnd Bergmann <arnd@kernel.org>
Cc: Len Brown <len.brown@intel.com>,
Ricardo Neri <ricardo.neri-calderon@linux.intel.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Amit Kucheria <amitk@kernel.org>, Zhang Rui <rui.zhang@intel.com>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: x86/mce/therm_throt incorrect THERM_STATUS_CLEAR_CORE_MASK?
Date: Thu, 02 Jun 2022 13:10:27 -0700
Message-ID: <21b7d5a3de39e9eee4ccda48ad0c66d31b1fe7d1.camel@linux.intel.com>
In-Reply-To: <CAK8P3a3aUtQ6C6kVmEZKzHv2tGL3=3WXK=_agc-Mg5Pq47vbdA@mail.gmail.com>

On Thu, 2022-06-02 at 20:53 +0200, Arnd Bergmann wrote:
> On Thu, Jun 2, 2022 at 6:25 PM srinivas pandruvada
> <srinivas.pandruvada@linux.intel.com> wrote:
> > On Thu, 2022-06-02 at 18:18 +0200, Arnd Bergmann wrote:
> > > On Thu, Jun 2, 2022 at 5:52 PM srinivas pandruvada
> > > <srinivas.pandruvada@linux.intel.com> wrote:
> > > >
> > > > On Thu, 2022-06-02 at 11:19 +0200, Arnd Bergmann wrote:
> > > > > I have a Xeon W-2265 (family 6, model 85, stepping 7) that started
> > > > > constantly spewing messages from the therm_throt driver after one
> > > > > core overheated:
> > > > >
> > > > I think this is a Cascade Lake system. Have you tried the latest
> > > > microcode?
> > >
> > > Thanks for your quick reply. I have installed the latest microcode
> > > 0x5003302 now (manually, because the version provided by the distro
> > > was still using version 0x5003102).
> > >
> > > After that, I tried writing the value 0x2a80 from userspace, and
> > > that did not cause a trap, so I assume that fixed it.
> > >
> > Thanks for reporting.
> > I am aware of this issue and should be fixed by microcode update.
>
> I wonder how common this problem is. Would it help to add a driver
> workaround like this?
This issue affects only certain SKUs. The others are already working as
expected. Bits 13 and 15 are important log bits for debugging, so we
don't want to clear them in this path. Printing a warning for the
affected CLX stepping is fine, but without clearing the unrelated
bits 13 and 15.
A read-modify-write that updates only the bits of interest should
always work. Writing 1s to this register should be a NOP.
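To illustrate the read-modify-write point, here is a minimal userspace sketch, not driver code. It models a register whose log bits are cleared by writing 0 and left alone by writing 1, which is the behavior described above. The helper names (model_write, rmw_clear_value) are hypothetical, and the bit positions in PROCHOT_LOG and LOG_BITS_MASK are assumptions for illustration; check them against the SDM and the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

#define BIT64(n) (UINT64_C(1) << (n))

/* Assumed bit layout, for illustration only. */
#define PROCHOT_LOG   BIT64(1)
#define LOG_BITS_MASK (BIT64(1) | BIT64(3) | BIT64(5) | BIT64(7) | \
                       BIT64(9) | BIT64(11) | BIT64(13) | BIT64(15))

/*
 * Hypothetical model of a write: a log bit stays set only if it was
 * already set and the written value has a 1 there. Writing 1 never
 * sets a log bit, and non-log bits are treated as read-only.
 */
static inline uint64_t model_write(uint64_t cur, uint64_t val)
{
	uint64_t logs = cur & val & LOG_BITS_MASK;

	return (cur & ~LOG_BITS_MASK) | logs;
}

/*
 * Value to write in order to clear only `bit`: keep the current log
 * bits as read (1s are a no-op) and put a 0 in the bit of interest.
 */
static inline uint64_t rmw_clear_value(uint64_t cur, uint64_t bit)
{
	return (cur & LOG_BITS_MASK) & ~bit;
}

/* Small self-check of the model; not called automatically. */
static inline void self_check(void)
{
	uint64_t cur = BIT64(1) | BIT64(13) | BIT64(15);
	uint64_t next = model_write(cur, rmw_clear_value(cur, PROCHOT_LOG));

	/* Only the PROCHOT log is cleared; bits 13 and 15 survive. */
	assert(next == (BIT64(13) | BIT64(15)));
}
```

In this model, clearing the PROCHOT log leaves bits 13 and 15 untouched, so there is no need to write 0 to them in this path.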
Thanks,
Srinivas
>
> diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
> index 8352083b87c7..acb402e56796 100644
> --- a/drivers/thermal/intel/therm_throt.c
> +++ b/drivers/thermal/intel/therm_throt.c
> @@ -214,7 +214,13 @@ static void clear_therm_status_log(int level)
>
>  	rdmsrl(msr, msr_val);
>  	msr_val &= mask;
> -	wrmsrl(msr, msr_val & ~THERM_STATUS_PROCHOT_LOG);
> +	if (wrmsrl_safe(msr, msr_val & ~THERM_STATUS_PROCHOT_LOG)) {
> +		/* work around Cascade Lake SKZ57 erratum */
> +		printk_once(KERN_WARNING "Failed to update IA32_THERM_STATUS, "
> +			    "please upgrade microcode\n");
> +		wrmsrl(msr, msr_val & ~THERM_STATUS_PROCHOT_LOG &
> +		       ~BIT(13) & ~BIT(15));
> +	}
>  }
>
>  static void get_therm_status(int level, bool *proc_hot, u8 *temp)
>
> Arnd