Re: [PATCH V5] x86/kexec: Change NMI and MCE handling on kexec path

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Tim Deegan <tim@xen.org>
Cc: Keir <keir@xen.org>, Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH V5] x86/kexec: Change NMI and MCE handling on kexec path
Date: Thu, 13 Dec 2012 11:47:15 +0000	[thread overview]
Message-ID: <50C9C043.7090507@citrix.com> (raw)
In-Reply-To: <20121213105951.GA75286@ocelot.phlegethon.org>

On 13/12/12 10:59, Tim Deegan wrote:
> At 11:35 +0000 on 12 Dec (1355312134), Andrew Cooper wrote:
>> diff -r ef8c1b607b10 -r 96b068439bc4 xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c
>> +++ b/xen/arch/x86/crash.c
>> @@ -32,41 +32,127 @@
>>
>>   static atomic_t waiting_for_crash_ipi;
>>   static unsigned int crashing_cpu;
>> +static DEFINE_PER_CPU_READ_MOSTLY(bool_t, crash_save_done);
>>
>> -static int crash_nmi_callback(struct cpu_user_regs *regs, int cpu)
>> +/* This becomes the NMI handler for non-crashing CPUs, when Xen is crashing. */
>> +void __attribute__((noreturn)) do_nmi_crash(struct cpu_user_regs *regs)
>>   {
>> -    /* Don't do anything if this handler is invoked on crashing cpu.
>> -     * Otherwise, system will completely hang. Crashing cpu can get
>> -     * an NMI if system was initially booted with nmi_watchdog parameter.
>> +    int cpu = smp_processor_id();
>> +
>> +    /* nmi_shootdown_cpus() should ensure that this assertion is correct. */
>> +    ASSERT( cpu != crashing_cpu );
> nmi_shootdown_cpus() has a go, but I think it would have to do an atomic
> compare-exchange on crashing_cpu to actually be sure that only one CPU
> is crashing at a time.  If two CPUs try to lead crashes at the same
> time, it will deadlock here, with NMIs disabled on all CPUs.
>
> The old behaviour was also not entirely race-free, but a little more
> likey to make progress, as in most cases at least one CPU would see
> cpu == crashing_cpu here and return.
>
> Using cmpxchg in nmi_shootdown_cpus() would have the drawback that if
> one CPU tried to kexec and wedged, another CPU couldn't then have a go.
> Not sure which of all these unlikely scenarios is worst. :)

kexec_common_shutdown() calls one_cpu_only() which gives the impression 
that nmi_shootdown_cpus() can only be gotten to once.  I think this is 
race free for different pcpus attempting to crash at the same time, but 
is certainly not safe for a cascade crash on the same pcpu.

As far as I can reason, this race can be worked around by doing a 
combined atomic set of the KEXEC_IN_PROGRESS flag, and a set of the 
crashing cpu id.  However, there are also other race conditions along 
the crash path which are addressed in subsequent patches which are still 
in progress.  As the choice here is between two differently nasty race 
conditions, I would prefer to defer fixing it to later and doing it 
properly.

>
>> diff -r ef8c1b607b10 -r 96b068439bc4 xen/arch/x86/x86_64/entry.S
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -635,11 +635,45 @@ ENTRY(nmi)
>>           movl  $TRAP_nmi,4(%rsp)
>>           jmp   handle_ist_exception
>>
>> +ENTRY(nmi_crash)
>> +        cli
> Aren't interrupts disabled already because we came in through an
> interrupt gate?

I was taking the pragmatic approach that in this case, we really have no 
guarantees about Xen state.  Having said that, we will (hopefully) have 
entered here through a gate, which will be intacked, and the function 
itself is deliberately safe to interruption.  I shall remove it.

>
>> +        pushq $0
>> +        movl $TRAP_nmi,4(%rsp)
>> +        SAVE_ALL
>> +        movq %rsp,%rdi
> Why?  do_nmi_crash() doesn't actually use any of its arguments.
> AFAICT, we could pretty much use do_nmi_crash explicitly in the IDT
> entry.

I was going with the DECLARE_TRAP_HANDLER() way of doing things, for 
consistency with the other handlers.  I can change it if you wish, but 
as this is not a hotpath, I would err on the lazy side and leave it as 
it is.

>
>> +        callq do_nmi_crash /* Does not return */
>> +        ud2
>> +
>>   ENTRY(machine_check)
>>           pushq $0
>>           movl  $TRAP_machine_check,4(%rsp)
>>           jmp   handle_ist_exception
>>
>> +/* Enable NMIs.  No special register assumptions. Only %rax is not preserved. */
>> +ENTRY(enable_nmis)
>> +        movq  %rsp, %rax /* Grab RSP before pushing */
>> +
>> +        /* Set up stack frame */
>> +        pushq $0               /* SS */
>> +        pushq %rax             /* RSP */
>> +        pushfq                 /* RFLAGS */
>> +        pushq $__HYPERVISOR_CS /* CS */
>> +        leaq  1f(%rip),%rax
>> +        pushq %rax             /* RIP */
>> +
>> +        iretq /* Disable the hardware NMI latch */
>> +1:
>> +        retq
> This could be made a bit smaller by something like:
>
> 	popq  %rcx         /* Remember return address */
> 	movq  %rsp, %rax   /* Grab RSP before pushing */
> 	
> 	/* Set up stack frame */
> 	pushq $0               /* SS */
> 	pushq %rax             /* RSP */
> 	pushfq                 /* RFLAGS */
> 	pushq $__HYPERVISOR_CS /* CS */
> 	pushq %rcx             /* RIP */
> 	
> 	iretq /* Disable the hardware NMI latch and return */
>
> which also keeps the call/return counts balanced.  But tbh I'm not sure
> it's worth it.  And I suspect you'll tell me why it's wrong. :)

I cant see any problem with it.

>
>> +
>> +/* No op trap handler.  Required for kexec crash path.  This is not
>> + * declared with the ENTRY() macro to avoid wasted alignment space.
>> + */
>> +.globl trap_nop
>> +trap_nop:
>> +        iretq
> The commit message says this reuses the iretq in enable_nmis().
>
> Cheers,
>
> Tim.

Ok - I will fix the stale message.

~Andrew

next prev parent reply	other threads:[~2012-12-13 11:47 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-12 11:35 [PATCH V5] x86/kexec: Change NMI and MCE handling on kexec path Andrew Cooper
2012-12-12 12:50 ` Jan Beulich
2012-12-12 15:33   ` Keir Fraser
2012-12-12 16:12     ` Jan Beulich
2012-12-13 10:59 ` Tim Deegan
2012-12-13 11:47   ` Andrew Cooper [this message]
2012-12-13 11:55     ` Tim Deegan
2012-12-13 11:58       ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50C9C043.7090507@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.