From: Ashok Raj <ashok.raj@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Tony Luck" <tony.luck@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	"LKML Mailing List" <linux-kernel@vger.kernel.org>,
	X86-kernel <x86@kernel.org>,
	"Andy Lutomirski" <luto@amacapital.net>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Jacob Jun Pan <jacob.jun.pan@intel.com>,
	Ashok Raj <ashok.raj@intel.com>
Subject: Re: [PATCH v3 3/5] x86/microcode: Avoid any chance of MCE's during microcode update
Date: Wed, 17 Aug 2022 15:06:27 +0000
Message-ID: <Yv0D88jxFkXcc18o@araj-dh-work>
In-Reply-To: <Yvz4/ASoX4SiXbhp@zn.tnic>

On Wed, Aug 17, 2022 at 04:19:40PM +0200, Borislav Petkov wrote:
> On Wed, Aug 17, 2022 at 12:30:49PM +0000, Ashok Raj wrote:
> > You will find out when the system returns after reboot, and hopefully it
> > wasn't promoted to a cold boot, which will lose the MCE banks.
> 
> Not good enough!

I probably misread your question. Are you suggesting we add a WARN when
we initiate late_load? I thought you were asking whether the HW must signal
something and the OS should log it when an MCE happens if MCIP=1.


> 
> This should issue a warning in dmesg that a potential MCE while update
> is running would cause a lockup. That is if we don't disable MCE around
> it.
> 
> If we decide to disable MCE, it should say shutdown.

OK, that clarifies it. "IF we choose to set MCIP=1, we should tell users
that hell can break loose, get under the table" :-)

> 
> > Meaning: deal with the effect of a really rare MCE rather than trying to
> > avoid it. Taking the MCE is more important than finishing the update
> > and losing what the signaled error was trying to convey.
> 
> Right now I'm inclined to not do anything and warn of a potential rare
> situation.

Encouraging. So I'll drop that patch from the list next time around.

> 
> > > > Shutdown, shutdown.. There is only 1 MCE no matter how many CPUs you have.
> > > 
> > > Because all CPUs are executing the loop? Or how do you decide this?
> > 
> > Fatal errors signaled with PCC=1 in the MCAx.STATUS is *ALWAYS*
> 
> What does that have to do with
> 
> "There is only 1 MCE no matter how many CPUs you have."
> 
> ?
> 
> That's bullsh*t. Especially if the machine can do LMCE.

Well, not outlandish :)

LMCE is only for recoverable errors. When we have a fatal error, sometimes
the signalling and the consumption of poison go in different directions.
To minimize the chance of bad data being consumed, *ALL* Intel processors
have always broadcast fatal errors. This is the history behind why we
broadcast.

BTW: This is all legacy behavior. Nothing should come as a surprise.

LMCE is best effort. This is the current state.
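
For readers following the PCC / MCIP / LMCE terminology above, here is a
minimal sketch of how those status bits could be decoded. The bit positions
follow the machine-check architecture as documented in the Intel SDM; the
helper names are hypothetical, and this is an illustration, not the
kernel's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCG_STATUS bits (global machine-check status). */
#define MCG_STATUS_MCIP   (1ULL << 2)   /* machine check in progress */
#define MCG_STATUS_LMCE_S (1ULL << 3)   /* MCE delivered to this CPU only */

/* IA32_MCi_STATUS bits (per-bank status). */
#define MCI_STATUS_VAL    (1ULL << 63)  /* bank contains a valid error */
#define MCI_STATUS_UC     (1ULL << 61)  /* error was not corrected */
#define MCI_STATUS_PCC    (1ULL << 57)  /* processor context corrupt */

/* Hypothetical helper: a valid bank with PCC=1 reports a fatal error. */
bool mce_is_fatal(uint64_t mci_status)
{
	return (mci_status & MCI_STATUS_VAL) && (mci_status & MCI_STATUS_PCC);
}

/* Hypothetical helper: MCIP=1 means a machine check is being handled. */
bool mce_in_progress(uint64_t mcg_status)
{
	return (mcg_status & MCG_STATUS_MCIP) != 0;
}

/* Hypothetical helper: LMCE_S=1 means the event was local, not broadcast. */
bool mce_is_local(uint64_t mcg_status)
{
	return (mcg_status & MCG_STATUS_LMCE_S) != 0;
}
```

With these helpers, a bank value of VAL|UC|PCC is classified as fatal, and
a global status of MCIP without LMCE_S corresponds to the broadcast case
discussed above.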

Cheers,
Ashok

Thread overview: 32+ messages
2022-08-17  5:11 [PATCH v3 0/5] Making microcode late-load robust Ashok Raj
2022-08-17  5:11 ` [PATCH v3 1/5] x86/microcode/intel: Check against CPU signature before saving microcode Ashok Raj
2022-08-17  7:43   ` Ingo Molnar
2022-08-17 10:45     ` Ashok Raj
2022-08-19 10:24   ` Borislav Petkov
2022-08-23 11:13     ` Ashok Raj
2022-08-24 19:27       ` Borislav Petkov
2022-08-25  3:27         ` Ashok Raj
2022-08-26 16:24           ` Borislav Petkov
2022-08-26 17:18             ` Ashok Raj
2022-08-26 17:29               ` Borislav Petkov
2022-08-17  5:11 ` [PATCH v3 2/5] x86/microcode/intel: Allow a late-load only if a min rev is specified Ashok Raj
2022-08-17  7:45   ` Ingo Molnar
2022-08-19 11:11   ` Borislav Petkov
2022-08-23  0:08     ` Ashok Raj
2022-08-24 19:52       ` Borislav Petkov
2022-08-25  4:02         ` Ashok Raj
2022-08-26 12:09           ` Borislav Petkov
2022-08-17  5:11 ` [PATCH v3 3/5] x86/microcode: Avoid any chance of MCE's during microcode update Ashok Raj
2022-08-17  7:41   ` Ingo Molnar
2022-08-17  7:58     ` Ingo Molnar
2022-08-17  8:09       ` Borislav Petkov
2022-08-17 11:57         ` Ashok Raj
2022-08-17 12:10           ` Borislav Petkov
2022-08-17 12:30             ` Ashok Raj
2022-08-17 14:19               ` Borislav Petkov
2022-08-17 15:06                 ` Ashok Raj [this message]
2022-08-29 14:23                   ` Andy Lutomirski
2022-08-17 11:40     ` Ashok Raj
2022-08-17  5:11 ` [PATCH v3 4/5] x86/x2apic: Support x2apic self IPI with NMI_VECTOR Ashok Raj
2022-08-17  5:11 ` [PATCH v3 5/5] x86/microcode: Place siblings in NMI loop while update in progress Ashok Raj
2022-08-30 19:15   ` Andy Lutomirski
