From: "Fernando Luis Vázquez Cao" <fernando@oss.ntt.co.jp>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Don Zickus <dzickus@redhat.com>,
	akpm@linux-foundation.org, linux-tip-commits@vger.kernel.org,
	Yinghai Lu <yinghai@kernel.org>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	mingo@redhat.com, "Eric W. Biederman" <ebiederm@xmission.com>,
	tglx@linutronix.de, torvalds@linux-foundation.org, mingo@elte.hu,
	vgoyal@redhat.com
Subject: Re: [PATCH 1/2] boot: ignore early NMIs
Date: Mon, 12 Mar 2012 14:43:42 +0900
Message-ID: <4F5D8D0E.8060702@oss.ntt.co.jp>
In-Reply-To: <4F5A6D87.4050809@zytor.com>

On 03/10/2012 05:52 AM, H. Peter Anvin wrote:

> Is there a reason to not just simply block these NMIs during the kexec
> sequence?
Ok, some background:

In the reboot path to the kdump kernel we disable local interrupts
and the APICs in native_machine_crash_shutdown() and reset the IDT
in machine_kexec(), which leaves an invalid IDT installed.

However, disabling the I/O APIC involves taking a lock, which in
the event of a crash is racy and can lead to a deadlock. To
solve this issue Don wrote a patch that left the I/O APICs and
the LAPIC of the crashing CPU untouched in the kdump reboot path,
but this seemed to cause mysterious reboots on some systems.
It turned out that an NMI coming from the perf-based hardlockup
detector was causing the system to triple fault. If an NMI happens
to arrive in the window between the invalidation of the IDT in
machine_kexec() and the configuration of the final IDT we will be
in big trouble. In particular, the system will either triple fault
or halt, depending on whether the NMI arrived before or after
installing the early IDT.
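
As a rough illustration of the window described above, here is a toy
user-space model in C. It is not kernel code: the enum and function
names are hypothetical, and the mapping (triple fault while the invalid
IDT is installed, halt once the early IDT is installed) is one reading
of the paragraph above, not something taken from the patches.

```c
#include <assert.h>

/* Toy user-space model, NOT kernel code. All names are hypothetical. */

/* Which IDT is installed at a given point in the kdump reboot path. */
enum idt_state {
	IDT_CRASHED_KERNEL,	/* IDT of the crashing kernel (assumed usable) */
	IDT_INVALID,		/* after machine_kexec() resets the IDT */
	IDT_EARLY,		/* early IDT of the kdump kernel */
	IDT_FINAL		/* final IDT of the kdump kernel */
};

enum nmi_outcome {
	NMI_HANDLED,
	NMI_TRIPLE_FAULT,
	NMI_HALT
};

/* Outcome of an NMI that arrives while the given IDT is installed. */
enum nmi_outcome nmi_outcome_for(enum idt_state idt)
{
	switch (idt) {
	case IDT_INVALID:
		/* No usable gate for the NMI: the CPU ends up triple faulting. */
		return NMI_TRIPLE_FAULT;
	case IDT_EARLY:
		/* The early handler treats the NMI as fatal and halts. */
		return NMI_HALT;
	case IDT_CRASHED_KERNEL:
	case IDT_FINAL:
		return NMI_HANDLED;
	}
	assert(0 && "unknown IDT state");
	return NMI_HANDLED;
}
```

In this model the dangerous window is exactly the two middle states:
nmi_outcome_for(IDT_INVALID) and nmi_outcome_for(IDT_EARLY) are the
only calls that do not return NMI_HANDLED.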

To tackle this issue we can either stop the hardlockup detector
or disable the LAPIC (the NMIs needed by x86's hardlockup detector
are generated by performance counters and delivered through the
LAPIC), leaving the I/O APICs untouched. The second option is simpler
and I think it is the approach Don took to fix this issue in RHEL
kernels.

Unfortunately, this is not enough: we are still exposed to external
NMIs not routed through the LAPIC. In other words, we have to make
sure that we always have an IDT that is able to handle NMIs without
seemingly random reboots and lockups. To achieve this goal we need
to fix machine_kexec() and the early IDT handlers. The current patch
set takes care of the latter.
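
A minimal sketch of the idea behind the early IDT handler change,
again as a user-space model in C and not the actual patch. The real
early handlers are written in assembly; the names MODEL_NMI_VECTOR,
early_nmi_count and early_idt_dispatch() are made up for illustration.
The only hardware fact assumed is that the NMI is vector 2 on x86.

```c
#include <assert.h>

/* Toy user-space model, NOT the patch itself. All names are hypothetical. */

#define MODEL_NMI_VECTOR 2	/* the NMI is exception vector 2 on x86 */

enum early_action {
	EARLY_IGNORE,	/* return to the interrupted code */
	EARLY_HALT	/* treat the exception as fatal */
};

/* Hypothetical counter of NMIs seen while the early IDT was installed. */
unsigned int early_nmi_count;

/* Decide what the early handler does for a given exception vector. */
enum early_action early_idt_dispatch(unsigned int vector)
{
	if (vector == MODEL_NMI_VECTOR) {
		/* Ignore the NMI, but remember that it happened. */
		early_nmi_count++;
		return EARLY_IGNORE;
	}
	/* Any other early exception is still considered fatal. */
	return EARLY_HALT;
}
```

Counting the ignored NMIs, rather than dropping them silently, leaves
a trace that can be inspected once the kernel is fully up.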

- Fernando


