From: Ashok Raj <ashok.raj@intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux regressions mailing list <regressions@lists.linux.dev>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	Yanjun Yang <yangyj.ee@gmail.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	Bagas Sanjaya <bagasdotme@gmail.com>,
	"Borislav Petkov (AMD)" <bp@alien8.de>,
	Ingo Molnar <mingo@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	Ashok Raj <ashok.raj@intel.com>
Subject: Re: [regression] some Dell systems hang at shutdown due to "x86/smp: Put CPUs into INIT on shutdown if possible" (was Fwd: Kernel 6.5 hangs on shutdown)
Date: Fri, 13 Oct 2023 11:28:22 -0700
Message-ID: <ZSmMRoftUrmIAeR/@a4bf019067fa.jf.intel.com>
In-Reply-To: <CAHk-=wj99i00K5ZD_OJj3d8rLG07bnTH=0_GxpzxrSzNF-WYQQ@mail.gmail.com>

Hi

On Fri, Oct 13, 2023 at 10:48:19AM -0700, Linus Torvalds wrote:
> On Fri, 13 Oct 2023 at 05:05, Linux regression tracking (Thorsten
> Leemhuis) <regressions@leemhuis.info> wrote:
> >
> > Thomas, turns out that bisection result was slightly wrong: a recheck
> > confirmed that the regression is actually caused by 45e34c8af58f23
> > ("x86/smp: Put CPUs into INIT on shutdown if possible") [v6.5-rc1] of
> > yours. See https://bugzilla.kernel.org/show_bug.cgi?id=217995 for details.
> 
> That commit does look pretty dangerous.
> 
> If *anything* is done through SMI after the code does that
> smp_park_other_cpus_in_init() sequence, I wouldn't be surprised in the
> least if the machine is hung.
> 
> That's made worse since it looks like the shutdown sequence isn't
> necessarily run on the boot CPU, so the boot CPU itself may be in
> INIT, and any SMI quite possibly ends up treating that CPU specially.

Sending INIT to the processor marked as BSP will tank the system.
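
For illustration only, a minimal user-space sketch of what "marked as BSP"
refers to: bit 8 of the IA32_APIC_BASE MSR (index 0x1B) is the BSP flag.
The helper names are hypothetical, and the MSR value is passed in as an
argument rather than read with rdmsr, so this is not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

#define IA32_APIC_BASE_MSR  0x1Bu        /* architectural MSR index */
#define APIC_BASE_BSP_FLAG  (1ull << 8)  /* set on the bootstrap processor */

/* Hypothetical helper: takes an already-read IA32_APIC_BASE value. */
static bool apic_base_is_bsp(uint64_t apic_base)
{
	return (apic_base & APIC_BASE_BSP_FLAG) != 0;
}

/*
 * Hypothetical policy check, following the statement above: treat a CPU
 * as a valid INIT target only if its BSP flag is clear.
 */
static bool init_target_is_safe(uint64_t target_apic_base)
{
	return !apic_base_is_bsp(target_apic_base);
}
```

A caller would run the check per target before parking it and fall back to
another stop method for the BSP.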

> 
> Who knows what SMI does, but the fact that the affected machines seem
> to be mainly from one particular manufacturer does tend to imply it's
> something like that.

There was a report (probably this same one), and it turns out it was a
bug in the BIOS SMI handler.

The client BIOSes were waiting for the CPU with the lowest APIC ID to be
the SMI rendezvous master. If this is Meteor Lake, the BSP wasn't the one
with the lowest APIC ID, and the handler tripped here.
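
A rough model of that failure, as a sketch and not the actual BIOS code.
The struct, its fields, and the function names are all hypothetical. The
`parked_in_init` field encodes an assumption: a CPU the kernel has parked
in INIT does not show up in the SMI handler. Selecting the master by BSP
flag is shown only as one possible alternative, not necessarily what the
BIOS change does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU record; field names are illustrative only. */
struct cpu_info {
	unsigned int apic_id;
	bool is_bsp;          /* BSP flag set for this CPU */
	bool parked_in_init;  /* assumed unable to join the SMI rendezvous */
};

/* Policy as described above: the master is the CPU with the lowest APIC ID. */
static const struct cpu_info *
pick_master_lowest_apicid(const struct cpu_info *cpus, size_t n)
{
	const struct cpu_info *best = NULL;

	for (size_t i = 0; i < n; i++)
		if (!best || cpus[i].apic_id < best->apic_id)
			best = &cpus[i];
	return best;
}

/* Alternative policy: the master is whichever CPU carries the BSP flag. */
static const struct cpu_info *
pick_master_bsp(const struct cpu_info *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (cpus[i].is_bsp)
			return &cpus[i];
	return NULL;
}

/* The rendezvous can only complete if the chosen master actually arrives. */
static bool rendezvous_can_complete(const struct cpu_info *master)
{
	return master != NULL && !master->parked_in_init;
}
```

With a topology where the BSP has APIC ID 0x10 and a parked AP has APIC ID
0x00, the lowest-APIC-ID policy selects the parked AP and the rendezvous
never completes, while the BSP-flag policy selects a CPU that is still
running.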

The BIOS change is also being pushed to others for assimilation :)

Server BIOSes have had this right for a while now.

> 
> And the code does do a fair amount *after* shutting down cpu's. Not
> just things like calling x86_platform.iommu_shutdown(), but also
> things like possibly the tboot shutdown sequence (which almost
> *certainly* is some SMI thing).
> 
> I dunno. Thomas - I think the argument for that commit was fairly
> theoretical, and reverting it seems the obvious thing, unless you have
> some idea of what might be wrong.
> 
>                Linus

-- 
Cheers,
Ashok

Thread overview (6+ messages):
2023-10-12  9:37 Fwd: Kernel 6.5 hangs on shutdown Bagas Sanjaya
2023-10-13 12:05 ` [regression] some Dell systems hang at shutdown due to "x86/smp: Put CPUs into INIT on shutdown if possible" (was Fwd: Kernel 6.5 hangs on shutdown) Linux regression tracking (Thorsten Leemhuis)
2023-10-13 17:48   ` Linus Torvalds
2023-10-13 18:28     ` Ashok Raj [this message]
2023-10-13 19:40     ` Thomas Gleixner
2023-10-16  8:46 ` Fwd: Kernel 6.5 hangs on shutdown Linux regression tracking #update (Thorsten Leemhuis)
