From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH 2 of 2 V3] x86/kexec: Change NMI and MCE handling on kexec path Date: Mon, 10 Dec 2012 12:54:36 +0000 Message-ID: <50C5DB8C.4050601@citrix.com> References: <3757511a785287066cfd.1354830150@andrewcoop.uk.xensource.com> <50C1E67F02000078000AEEEF@nat28.tlf.novell.com> <50C25D3D.10407@citrix.com> <50C5BA2A02000078000AF4F4@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <50C5BA2A02000078000AF4F4@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org On 10/12/12 09:32, Jan Beulich wrote: >>>> On 07.12.12 at 22:18, Andrew Cooper wrote: >> On 07/12/2012 11:52, Jan Beulich wrote: >>>>>> On 06.12.12 at 22:42, Andrew Cooper wrote: >>>> + >>>> + /* Change NMI trap handlers. Non-crashing pcpus get nmi_crash which >>>> + * invokes do_nmi_crash (above), which cause them to write state and >>>> + * fall into a loop. The crashing pcpu gets the nop handler to >>>> + * cause it to return to this function ASAP. >>>> + */ >>>> + for ( i = 0; i< nr_cpu_ids; ++i ) >>>> + if ( idt_tables[i] ) >>>> + { >>>> + >>>> + if ( i == cpu ) >>>> + { >>>> + /* Disable the interrupt stack tables for this MCE and >>>> + * NMI handler (shortly to become a nop) as there is a 1 >>>> + * instruction race window where NMIs could be >>>> + * re-enabled and corrupt the exception frame, leaving >>>> + * us unable to continue on this crash path (which half >>>> + * defeats the point of using the nop handler in the >>>> + * first place). >>>> + * >>>> + * This update is safe from a security point of view, as >>>> + * this pcpu is never going to try to sysret back to a >>>> + * PV vcpu. >>>> + */ >>>> + set_ist(&idt_tables[i][TRAP_nmi], IST_NONE); >>>> + set_ist(&idt_tables[i][TRAP_machine_check], IST_NONE); >>>> + >>>> + _set_gate(&idt_tables[i][TRAP_nmi], 14, 0,&trap_nop); >>> This makes the first set_ist() above pointless, doesn't it? >> No. The set_ist() is to remove the possibility of stack corruption from >> reentrant NMIs, while the trap_nop handler is so we don't get diverted >> into the regular NMI handler. There is nothing the NMI handler would do >> which could alter the outcome, and there are many cases where the >> regular NMI handler would try to panic, starting us reentrantly on the >> kexec path again (where we would deadlock on the one_cpu_only() check). > I think you didn't get the point of the question: _set_gate() clears > the IST field of the descriptor anyway, so why clear it separately > first, and then overwrite it again? Ah - my apologies. I indeed was not understanding the point. I will need to fix up the calls then. Having _set_gate() change the IST as well reintroduces the security vulnerability. I will create a new function similar to _set_gate() which only changes the handler and nothing else. > >>>> + } >>>> + else >>>> + _set_gate(&idt_tables[i][TRAP_nmi], 14, 0,&nmi_crash); >>>> + } >>>> + >>>> /* Ensure the new callback function is set before sending out the NMI. >> */ >>>> wmb(); >>>> ... >>>> +/* Enable NMIs. No special register assumptions, and all preserved. */ >>>> +ENTRY(enable_nmis) >>>> + pushq %rax >>> What's the point of saving %rax here, btw? >>> >>> Jan >> Because at the moment I believe I might need to call it from asm >> context, when doing some of the later fixes. I figured that it was >> better to make it safe now, rather than patch it up later. > I don't think that's good practice - if you end up not calling the > thing from assembly code, the question on the purpose of saving > %rax will re-surface sooner or later. Plus the patch by itself > wouldn't really explain either why this is so (which might be > of interest in the context of backporting it to older trees). > > Jan > Ok - will remove. ~Andrew