From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Mitchell, Lisa (MCLinux in Fort Collins)" <lisa.mitchell@hp.com>,
	Vivek Goyal <vgoyal@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	bhelgaas@google.com, Jingbai Ma <jingbai.ma@hp.com>
Subject: Re: [Help Test] kdump, x86, acpi: Reproduce CPU0 SMI corruption issue after unsetting BSP flag
Date: Mon, 19 Aug 2013 11:29:47 +0900
Message-ID: <5211831B.6090704@jp.fujitsu.com>
In-Reply-To: <87ob90839p.fsf@xmission.com>

(2013/08/15 4:45), Eric W. Biederman wrote:
> Jingbai Ma <jingbai.ma@hp.com> writes:
>
>> I found a side effect of unsetting BSP flag.
>> It affected system rebooting: once the BSP flag has been removed and a
>> reboot command is issued, the system hangs after the message:
>> Restarting system.
>> A hardware reset is then needed to recover it.
>>
>> I have reproduced this problem on the following systems:
>> HP EliteBook 6930p
>> HP Compaq DC7700
>> HP ProLiant DL980 (4 sockets, 40 cores)
>>
>> I have an idea: To avoid such kind of issue, we can unset BSP flag in
>> the first kernel during crash processing, and restore it in the second
>> kernel in the APs initializing.
>
> The premise was clearing BSP would not be an issue.  If we could
> reliably count on unsetting the BSP during crash processing we could
> just switch to the BSP and be done, totally avoiding this problem.
>
> Given that there are real-world issues with clearing the BSP flag,
> I believe the alternate suggestion was to simply never attempt to start
> the bootstrap processor during processor bring up.
>
> If as normal we are running on the bootstrap processor everything will
> work the same, but if we are in the kdump scenario we will be short one
> core.  Being short one core seems like a reasonable tradeoff between
> reliability and performance.
>
> Eric

Sorry Eric, I'm not clear on what you mean by ``short one core''...
Which are you suggesting? That disabling the BSP is reasonable if the crash
happens on an AP? Or that restricting CPUs to a single one, just as in the
current kdump configuration, is reasonable?

-- 
Thanks.
HATAYAMA, Daisuke


