From: Ingo Molnar <mingo@elte.hu>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Neil Horman <nhorman@tuxdriver.com>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	mingo@redhat.com, "H. Peter Anvin" <hpa@zytor.com>,
	tglx@linutronix.de, Vivek Goyal <vgoyal@redhat.com>
Subject: Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path
Date: Thu, 7 Feb 2008 01:39:18 +0100
Message-ID: <20080207003918.GA29943@elte.hu>
In-Reply-To: <m1r6fpd2uo.fsf@ebiederm.dsl.xmission.com>


* Eric W. Biederman <ebiederm@xmission.com> wrote:

> Looking at the patch the local_irq_enable() is totally bogus.  As soon 
> as we hit machine_crash_shutdown the first thing we do is disable 
> irqs.

yeah.

> I'm wondering if someone was using the switch cpus on crash patch that 
> was floating around.  That would require the ipis to work.
> 
> I don't know if nmi_exit makes sense.  There are enough layers of 
> abstraction in that piece of code I can't quickly spot the part that 
> is banging the hardware.
> 
> The location of nmi_exit in the patch is clearly wrong.  crash_kexec 
> is a noop if we don't have a crash kernel loaded (and if we are not 
> the first cpu into it), so if we don't execute the crash code 
> something weird may happen.  Further the code is just more 
> maintainable if that kind of code lives in machine_crash_shutdown.

nmi_exit() has no hw effects - it's just our own bookkeeping.

the hw knows that we finished the NMI when we do an iret. Perhaps that's 
the bug or side-effect that made the difference: via enabling irqs we 
get an irq entry, and that does an iret and clears the NMI nested state 
- allowing the kexec context to proceed? I suspect kexec() will do an 
iret eventually (at minimum in the booted up kernel's context) - all 
NMIs are blocked up to that point and maybe the APIC doesn't really like 
being frobbed in that state? In any case, the local_irq_enable() is just 
wrong - it's the worst thing a crashing kernel can do. Perhaps doing an 
intentional iret with a prepared stack-let that just restores to 
still-irqs-off state and jumps to the next instruction could 'exit' the 
NMI context without really having to exit it in the kernel code flow?

	Ingo
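
The "intentional iret with a prepared stack-let" idea above can be 
sketched roughly as follows. This is a user-space illustration only, 
not code from the patch or from the kernel: it assumes x86-64 and 
GCC/Clang inline asm, and the names iret_to_next_insn() and iret_demo() 
are made up for this example. It builds the five-slot frame that iretq 
pops (RIP, CS, RFLAGS, RSP, SS) so that execution resumes at the next 
instruction with the same stack pointer and the flags left as they 
were. From user space it cannot show the NMI unblocking itself, and on 
other architectures the helper compiles to a no-op.

```c
/*
 * Sketch: "iret to the next instruction" with a hand-built frame.
 * Assumes x86-64 with GCC/Clang inline asm; a no-op elsewhere.
 */
static inline void iret_to_next_insn(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
	unsigned long tmp;

	__asm__ volatile(
		"subq $128, %%rsp\n\t"     /* step over the user-space red zone */
		"mov %%ss, %k0\n\t"
		"pushq %0\n\t"             /* frame slot: SS */
		"leaq 136(%%rsp), %0\n\t"  /* original stack pointer value */
		"pushq %0\n\t"             /* frame slot: RSP to restore */
		"pushfq\n\t"               /* frame slot: RFLAGS, IF unchanged */
		"mov %%cs, %k0\n\t"
		"pushq %0\n\t"             /* frame slot: CS */
		"leaq 1f(%%rip), %0\n\t"
		"pushq %0\n\t"             /* frame slot: RIP = label 1 below */
		"iretq\n\t"                /* pops the frame, resumes at 1: */
		"1:"
		: "=&r" (tmp)
		:
		: "cc", "memory");
#endif
}

/* Returns 2 if execution simply continued past the iret. */
static int iret_demo(void)
{
	volatile int steps = 0;

	steps++;
	iret_to_next_insn();
	steps++;
	return steps;
}
```

In the die_nmi case the same frame would be built in kernel mode with 
irqs still off, so the saved RFLAGS would keep IF clear; whether that 
is enough to make the APIC behave is exactly the open question above.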
