From: Ashok Raj <ashok.raj@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
"Tony Luck" <tony.luck@intel.com>,
Dave Hansen <dave.hansen@intel.com>,
"LKML Mailing List" <linux-kernel@vger.kernel.org>,
X86-kernel <x86@kernel.org>,
"Andy Lutomirski" <luto@amacapital.net>,
Tom Lendacky <thomas.lendacky@amd.com>,
Jacob Jun Pan <jacob.jun.pan@intel.com>,
Ashok Raj <ashok.raj@intel.com>
Subject: Re: [PATCH v3 3/5] x86/microcode: Avoid any chance of MCE's during microcode update
Date: Wed, 17 Aug 2022 11:57:37 +0000
Message-ID: <YvzXsf0mGEcOlZC5@araj-dh-work>
In-Reply-To: <YvyiHGMbp2MtV0Vr@zn.tnic>
On Wed, Aug 17, 2022 at 10:09:00AM +0200, Borislav Petkov wrote:
> On Wed, Aug 17, 2022 at 09:58:03AM +0200, Ingo Molnar wrote:
> > Also, Boris tells me that writing 0x0 to MSR_IA32_MCG_STATUS
> > apparently shuts the platform down - which is not ideal...
>
> Right, if you get an MCE raised while MCIP=0, the machine shuts down.
>
> And frankly, I can't think of a good solution to this whole issue:
>
> - with current hw, if you get an MCE and MCIP=0 -> shutdown
You have this reversed: if you get an MCE and MCIP=1 -> shutdown.

I'm still very reluctant; this is actually overkill. I added what is
possible based on Boris's recommendation.

When MCEs happen during the update they are always fatal errors. But
at least you can log them, even if some other weird error were to be
observed because they stomped over the patch area that the primary is
currently working on.

What we do here, by setting MCIP=1, is promote the error to a more
severe shutdown. Ideally I would rather let the fallout happen, since
it's observable, whereas a blind shutdown is what we are promoting to.
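
To make the MCIP=0 vs. MCIP=1 distinction concrete, here is a small
user-space model. It is only a sketch of the behaviour described above,
not kernel code and not the patch itself: the model_*() helpers and the
mce_outcome enum are names made up for illustration, and the only
architectural fact assumed is that MCIP is bit 2 of IA32_MCG_STATUS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MCIP (machine check in progress) is bit 2 of IA32_MCG_STATUS. */
#define MCG_STATUS_MCIP (1ULL << 2)

/* Hypothetical outcome type, used only by this model. */
enum mce_outcome {
	MCE_HANDLER_RUNS,	/* #MC handler is entered and can log the error */
	MCE_SHUTDOWN		/* CPU shuts down, software logs nothing */
};

/*
 * Model of an MCE being signalled, given the current (modelled) value
 * of IA32_MCG_STATUS. With MCIP already set, the CPU shuts down instead
 * of entering the handler; otherwise it sets MCIP and runs the handler.
 */
static enum mce_outcome model_raise_mce(uint64_t *mcg_status)
{
	assert(mcg_status != NULL);

	if (*mcg_status & MCG_STATUS_MCIP)
		return MCE_SHUTDOWN;

	*mcg_status |= MCG_STATUS_MCIP;
	return MCE_HANDLER_RUNS;
}

/* Assumed shape of what the patch does around the update: force MCIP=1. */
static void model_set_mcip_for_update(uint64_t *mcg_status)
{
	*mcg_status |= MCG_STATUS_MCIP;
}

/* ... and clear it again once the update has finished. */
static void model_clear_mcip_after_update(uint64_t *mcg_status)
{
	*mcg_status &= ~MCG_STATUS_MCIP;
}
```

In this model, an MCE raised between model_set_mcip_for_update() and
model_clear_mcip_after_update() always yields MCE_SHUTDOWN, while the
same MCE with MCIP=0 yields MCE_HANDLER_RUNS, which is the "observable
fallout" case.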
>
> - in the future, even if you change the hardware to block MCEs from
> being detected while the microcode update runs, what happens if a CPU
> encounters a hw error during that update?
I don't think there will ever be blocking of MCEs :-)
If an error happens, it leads to shutdown.
>
> You raise it immediately after? What if there are multiple MCEs? Not
> unheard of on a big machine...
Shutdown, shutdown... There is only 1 MCE no matter how many CPUs you
have. The exception is the Local MCE, which is recoverable, but only to
user space.

If you get an error in the atomic we are polling, it's a fatal error,
since SW can't recover, and we shut down.
>
> Worse, what happens if there's a bitflip in the memory where the
> to-be-updated microcode patch is?
>
> You report the error afterwards?
>
> Just thinking about this makes me real nervous.
Overthinking :-)... If there is consensus, and if Boris feels comfortable
enough, I would drop this patch.