Re: [PATCH 2 of 2 V3] x86/kexec: Change NMI and MCE handling on kexec path

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH 2 of 2 V3] x86/kexec: Change NMI and MCE handling on kexec path
Date: Fri, 7 Dec 2012 21:18:53 +0000	[thread overview]
Message-ID: <50C25D3D.10407@citrix.com> (raw)
In-Reply-To: <50C1E67F02000078000AEEEF@nat28.tlf.novell.com>

On 07/12/2012 11:52, Jan Beulich wrote:
>>>> On 06.12.12 at 22:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> +
>> +    /* Change NMI trap handlers.  Non-crashing pcpus get nmi_crash which
>> +     * invokes do_nmi_crash (above), which cause them to write state and
>> +     * fall into a loop.  The crashing pcpu gets the nop handler to
>> +     * cause it to return to this function ASAP.
>> +     */
>> +    for ( i = 0; i < nr_cpu_ids; ++i )
>> +        if ( idt_tables[i] )
>> +        {
>> +
>> +            if ( i == cpu )
>> +            {
>> +                /* Disable the interrupt stack tables for this MCE and
>> +                 * NMI handler (shortly to become a nop) as there is a 1
>> +                 * instruction race window where NMIs could be
>> +                 * re-enabled and corrupt the exception frame, leaving
>> +                 * us unable to continue on this crash path (which half
>> +                 * defeats the point of using the nop handler in the
>> +                 * first place).
>> +                 *
>> +                 * This update is safe from a security point of view, as
>> +                 * this pcpu is never going to try to sysret back to a
>> +                 * PV vcpu.
>> +                 */
>> +                set_ist(&idt_tables[i][TRAP_nmi],           IST_NONE);
>> +                set_ist(&idt_tables[i][TRAP_machine_check], IST_NONE);
>> +
>> +                _set_gate(&idt_tables[i][TRAP_nmi], 14, 0, &trap_nop);
> This makes the first set_ist() above pointless, doesn't it?

No.  The set_ist() is to remove the possibility of stack corruption from
reentrant NMIs, while the trap_nop handler is so we don't get diverted
into the regular NMI handler.  There is nothing the NMI handler would do
which could alter the outcome, and there are many cases where the
regular NMI handler would try to panic, starting us reentrantly on the
kexec path again (where we would deadlock on the one_cpu_only() check).

Given that we are going to crash, any time spent in the NMI handler is
time not spent trying to kill the other pcpus, which themselves might be
heading towards a cascade failure.

>
>> +            }
>> +            else
>> +                _set_gate(&idt_tables[i][TRAP_nmi], 14, 0, &nmi_crash);
>> +        }
>> +
>>      /* Ensure the new callback function is set before sending out the NMI. */
>>      wmb();
>>  ...
>> +/* Enable NMIs.  No special register assumptions, and all preserved. */
>> +ENTRY(enable_nmis)
>> +        pushq %rax
> What's the point of saving %rax here, btw?
>
> Jan

Because at the moment I believe I might need to call it from asm
context, when doing some of the later fixes.  I figured that it was
better to make it safe now, rather than patch it up later.

~Andrew

>
>> +        movq  %rsp, %rax /* Grab RSP before pushing */
>> +
>> +        /* Set up stack frame */
>> +        pushq $0               /* SS */
>> +        pushq %rax             /* RSP */
>> +        pushfq                 /* RFLAGS */
>> +        pushq $__HYPERVISOR_CS /* CS */
>> +        leaq  1f(%rip),%rax
>> +        pushq %rax             /* RIP */
>> +
>> +/* No op trap handler.  Required for kexec crash path.
>> + * It is not used in performance critical code, and saves having a single
>> + * lone aligned iret. It does not use ENTRY to declare the symbol to avoid the
>> + * explicit alignment. */
>> +.globl trap_nop;
>> +trap_nop:
>> +
>> +        iretq /* Disable the hardware NMI latch */
>> +1:
>> +        popq %rax
>> +        retq
>> +
>>  .section .rodata, "a", @progbits
>>  
>>  ENTRY(exception_table)
>

next prev parent reply	other threads:[~2012-12-07 21:18 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-06 21:42 [PATCH 0 of 2 V3] Kexec alterations Andrew Cooper
2012-12-06 21:42 ` [PATCH 1 of 2 V3] x86/IST: Create set_ist() helper function Andrew Cooper
2012-12-07 11:45   ` Jan Beulich
2012-12-07 21:01     ` Andrew Cooper
2012-12-06 21:42 ` [PATCH 2 of 2 V3] x86/kexec: Change NMI and MCE handling on kexec path Andrew Cooper
2012-12-07 11:52   ` Jan Beulich
2012-12-07 21:18     ` Andrew Cooper [this message]
2012-12-10  9:32       ` Jan Beulich
2012-12-10 12:54         ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50C25D3D.10407@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.