[PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
@ 2012-11-13 11:23 Malcolm Crossley
  2012-11-13 11:38 ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Malcolm Crossley @ 2012-11-13 11:23 UTC (permalink / raw)
  To: xen-devel; +Cc: eddie.dong, jun.nakajima

The self_nmi() code cause's an NMI to be triggered by sending an APIC message
to the local processor. Unfortunately there is a delay in the delivery of the
APIC message and we may already have re-entered the HVM guest by the time the
NMI is taken. This results in the VMEXIT code calling the self_nmi() function
again. We have seen hundreds of iterations of this VMEXIT/VMENTER loop before
the HVM guest resumes normal operation.

Volume 3 Chapter 27 Section 1 of the Intel SDM states:

An NMI causes subsequent NMIs to be blocked, but only after the VM exit
completes.

So we believe it is safe to directly invoke the INT call as NMI's should
already be blocked.

diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
                  (X86_EVENTTYPE_NMI << 8) )
                 goto exit_and_crash;
             HVMTRACE_0D(NMI);
-            self_nmi(); /* Real NMI, vector 2: normal processing. */
+            asm("int $2"); /* Real NMI, vector 2: normal processing. */
             break;
         case TRAP_machine_check:
             HVMTRACE_0D(MCE);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 11:23 [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler Malcolm Crossley
@ 2012-11-13 11:38 ` Jan Beulich
  2012-11-13 11:47   ` Tim Deegan
  2012-11-13 11:48   ` Andrew Cooper
  0 siblings, 2 replies; 15+ messages in thread
From: Jan Beulich @ 2012-11-13 11:38 UTC (permalink / raw)
  To: Malcolm Crossley; +Cc: eddie.dong, jun.nakajima, xen-devel

>>> On 13.11.12 at 12:23, Malcolm Crossley <malcolm.crossley@citrix.com> wrote:
> The self_nmi() code cause's an NMI to be triggered by sending an APIC message
> to the local processor. Unfortunately there is a delay in the delivery of 
> the
> APIC message and we may already have re-entered the HVM guest by the time the
> NMI is taken. This results in the VMEXIT code calling the self_nmi() 
> function
> again. We have seen hundreds of iterations of this VMEXIT/VMENTER loop 
> before
> the HVM guest resumes normal operation.

But it does, eventually? I ask because ...

> Volume 3 Chapter 27 Section 1 of the Intel SDM states:
> 
> An NMI causes subsequent NMIs to be blocked, but only after the VM exit
> completes.

... this blocking of NMIs should last until you enter the guest
again, so I would actually expect this to be an infinite loop ...

> So we believe it is safe to directly invoke the INT call as NMI's should
> already be blocked.

... and this not only be safe, but actually required.

> diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
>                   (X86_EVENTTYPE_NMI << 8) )
>                  goto exit_and_crash;
>              HVMTRACE_0D(NMI);
> -            self_nmi(); /* Real NMI, vector 2: normal processing. */
> +            asm("int $2"); /* Real NMI, vector 2: normal processing. */

In any case - why can't you call do_nmi() directly from here?

Jan

>              break;
>          case TRAP_machine_check:
>              HVMTRACE_0D(MCE);
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 11:38 ` Jan Beulich
@ 2012-11-13 11:47   ` Tim Deegan
  2012-11-13 12:05     ` Tim Deegan
  2012-11-13 11:48   ` Andrew Cooper
  1 sibling, 1 reply; 15+ messages in thread
From: Tim Deegan @ 2012-11-13 11:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Malcolm Crossley, eddie.dong, jun.nakajima, xen-devel

Hi, 

Would need to go and read manuals to comment on much of this, but...

At 11:38 +0000 on 13 Nov (1352806689), Jan Beulich wrote:
> > diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
> >                   (X86_EVENTTYPE_NMI << 8) )
> >                  goto exit_and_crash;
> >              HVMTRACE_0D(NMI);
> > -            self_nmi(); /* Real NMI, vector 2: normal processing. */
> > +            asm("int $2"); /* Real NMI, vector 2: normal processing. */
> 
> In any case - why can't you call do_nmi() directly from here?

... this is my doing.  There used to be a call to do_nmi() here, but
do_nmi() doesn't block NMIs, so you can't just call it here in case you
get _another_ NMI while you're in the NMI handler.

Tim.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 11:38 ` Jan Beulich
  2012-11-13 11:47   ` Tim Deegan
@ 2012-11-13 11:48   ` Andrew Cooper
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2012-11-13 11:48 UTC (permalink / raw)
  To: xen-devel

On 13/11/12 11:38, Jan Beulich wrote:
>>>> On 13.11.12 at 12:23, Malcolm Crossley <malcolm.crossley@citrix.com> wrote:
>> The self_nmi() code cause's an NMI to be triggered by sending an APIC message
>> to the local processor. Unfortunately there is a delay in the delivery of 
>> the
>> APIC message and we may already have re-entered the HVM guest by the time the
>> NMI is taken. This results in the VMEXIT code calling the self_nmi() 
>> function
>> again. We have seen hundreds of iterations of this VMEXIT/VMENTER loop 
>> before
>> the HVM guest resumes normal operation.
> But it does, eventually? I ask because ...
>
>> Volume 3 Chapter 27 Section 1 of the Intel SDM states:
>>
>> An NMI causes subsequent NMIs to be blocked, but only after the VM exit
>> completes.
> ... this blocking of NMIs should last until you enter the guest
> again, so I would actually expect this to be an infinite loop ...

The observed behaviour was some other event like a timer interrupt
breaking it out of the loop and taking a different path through the
vmexit handler.

>
>> So we believe it is safe to directly invoke the INT call as NMI's should
>> already be blocked.
> ... and this not only be safe, but actually required.
>
>> diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
>>                   (X86_EVENTTYPE_NMI << 8) )
>>                  goto exit_and_crash;
>>              HVMTRACE_0D(NMI);
>> -            self_nmi(); /* Real NMI, vector 2: normal processing. */
>> +            asm("int $2"); /* Real NMI, vector 2: normal processing. */
> In any case - why can't you call do_nmi() directly from here?
>
> Jan

Actually, thinking about it that is a better idea than my suggestion,
because if for some reason another, different NMI occurs (the Intel SDM
is not exactly clear on exactly when NMIs are blocked till), the other
NMI will enter on the real NMI stack and not trample over itself.

~Andrew

>
>>              break;
>>          case TRAP_machine_check:
>>              HVMTRACE_0D(MCE);
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org 
>> http://lists.xen.org/xen-devel 
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 11:47   ` Tim Deegan
@ 2012-11-13 12:05     ` Tim Deegan
  2012-11-13 12:16       ` Andrew Cooper
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Tim Deegan @ 2012-11-13 12:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Malcolm Crossley, eddie.dong, jun.nakajima, xen-devel

At 11:47 +0000 on 13 Nov (1352807234), Tim Deegan wrote:
> At 11:38 +0000 on 13 Nov (1352806689), Jan Beulich wrote:
> > > diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
> > > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > > @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
> > >                   (X86_EVENTTYPE_NMI << 8) )
> > >                  goto exit_and_crash;
> > >              HVMTRACE_0D(NMI);
> > > -            self_nmi(); /* Real NMI, vector 2: normal processing. */
> > > +            asm("int $2"); /* Real NMI, vector 2: normal processing. */
> > 
> > In any case - why can't you call do_nmi() directly from here?
> 
> ... this is my doing.  There used to be a call to do_nmi() here, but
> do_nmi() doesn't block NMIs, so you can't just call it here in case you
> get _another_ NMI while you're in the NMI handler.

Oh wait, I see -- you're saying that this (20059:76a65bf2aa4d) is wrong
because NMIs are indeed blocked, and have been since the VMEXIT.

In that case, I agree that we should just run the NMI handler, but first
I would really like to know what _unblocks_ NMIs in this case.  Any of
the things I can think of (the next vmenter, the next iret, ??) will
need some handling to make sure they actually happen before, say, we
take this CPU into the idle loop...

Tim.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 12:05     ` Tim Deegan
@ 2012-11-13 12:16       ` Andrew Cooper
  2012-11-13 12:17       ` Ian Campbell
  2012-11-13 14:47       ` Jan Beulich
  2 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2012-11-13 12:16 UTC (permalink / raw)
  To: xen-devel


On 13/11/12 12:05, Tim Deegan wrote:
> At 11:47 +0000 on 13 Nov (1352807234), Tim Deegan wrote:
>> At 11:38 +0000 on 13 Nov (1352806689), Jan Beulich wrote:
>>>> diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
>>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>>> @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
>>>>                   (X86_EVENTTYPE_NMI << 8) )
>>>>                  goto exit_and_crash;
>>>>              HVMTRACE_0D(NMI);
>>>> -            self_nmi(); /* Real NMI, vector 2: normal processing. */
>>>> +            asm("int $2"); /* Real NMI, vector 2: normal processing. */
>>> In any case - why can't you call do_nmi() directly from here?
>> ... this is my doing.  There used to be a call to do_nmi() here, but
>> do_nmi() doesn't block NMIs, so you can't just call it here in case you
>> get _another_ NMI while you're in the NMI handler.
> Oh wait, I see -- you're saying that this (20059:76a65bf2aa4d) is wrong
> because NMIs are indeed blocked, and have been since the VMEXIT.
>
> In that case, I agree that we should just run the NMI handler, but first
> I would really like to know what _unblocks_ NMIs in this case.  Any of
> the things I can think of (the next vmenter, the next iret, ??) will
> need some handling to make sure they actually happen before, say, we
> take this CPU into the idle loop...
>
> Tim.

The first experiment was to have self_nmi() wait until the NMI count for
the current processor changed.  This prevented the thousands of bounces
in and out of non-root mode, but the enter delays fairly consistently in
the millions of TSC ticks (a few milliseconds on the test box), which is
a noticeable chunk of time.

~Andrew

>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 12:05     ` Tim Deegan
  2012-11-13 12:16       ` Andrew Cooper
@ 2012-11-13 12:17       ` Ian Campbell
  2012-11-13 12:39         ` Tim Deegan
  2012-11-13 14:47       ` Jan Beulich
  2 siblings, 1 reply; 15+ messages in thread
From: Ian Campbell @ 2012-11-13 12:17 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Malcolm Crossley, eddie.dong@intel.com, jun.nakajima@intel.com,
	Jan Beulich, xen-devel

On Tue, 2012-11-13 at 12:05 +0000, Tim Deegan wrote:
> At 11:47 +0000 on 13 Nov (1352807234), Tim Deegan wrote:
> > At 11:38 +0000 on 13 Nov (1352806689), Jan Beulich wrote:
> > > > diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
> > > > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > > > @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
> > > >                   (X86_EVENTTYPE_NMI << 8) )
> > > >                  goto exit_and_crash;
> > > >              HVMTRACE_0D(NMI);
> > > > -            self_nmi(); /* Real NMI, vector 2: normal processing. */
> > > > +            asm("int $2"); /* Real NMI, vector 2: normal processing. */
> > > 
> > > In any case - why can't you call do_nmi() directly from here?
> > 
> > ... this is my doing.  There used to be a call to do_nmi() here, but
> > do_nmi() doesn't block NMIs, so you can't just call it here in case you
> > get _another_ NMI while you're in the NMI handler.
> 
> Oh wait, I see -- you're saying that this (20059:76a65bf2aa4d) is wrong
> because NMIs are indeed blocked, and have been since the VMEXIT.
> 
> In that case, I agree that we should just run the NMI handler, but first
> I would really like to know what _unblocks_ NMIs in this case.  Any of
> the things I can think of (the next vmenter, the next iret, ??) will
> need some handling to make sure they actually happen before, say, we
> take this CPU into the idle loop...

What about a little stub-asm return_from_nmi / reenable_nmis with
something like:
	pushf
        pushq $__HYPERVISOR_CS
	pushq 1f
	iret
1: ...

That should re-enable NMIs at whichever point we think is appropriate?
Perhaps a little more work is needed to create a suitable frame under
this one too, not sure what a vmexit frame looks like.

I hope we aren't going to get into the NMI nesting problem described
here: http://lwn.net/Articles/484932/ (I don't think so, no page-faults
or break points in our NMI handlers)

Ian.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 12:17       ` Ian Campbell
@ 2012-11-13 12:39         ` Tim Deegan
  2012-11-13 14:21           ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Tim Deegan @ 2012-11-13 12:39 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Malcolm Crossley, eddie.dong@intel.com, Jan Beulich,
	jun.nakajima@intel.com, xen-devel

At 12:17 +0000 on 13 Nov (1352809049), Ian Campbell wrote:
> On Tue, 2012-11-13 at 12:05 +0000, Tim Deegan wrote:
> > At 11:47 +0000 on 13 Nov (1352807234), Tim Deegan wrote:
> > > At 11:38 +0000 on 13 Nov (1352806689), Jan Beulich wrote:
> > > > > diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
> > > > > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > > > > @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
> > > > >                   (X86_EVENTTYPE_NMI << 8) )
> > > > >                  goto exit_and_crash;
> > > > >              HVMTRACE_0D(NMI);
> > > > > -            self_nmi(); /* Real NMI, vector 2: normal processing. */
> > > > > +            asm("int $2"); /* Real NMI, vector 2: normal processing. */
> > > > 
> > > > In any case - why can't you call do_nmi() directly from here?
> > > 
> > > ... this is my doing.  There used to be a call to do_nmi() here, but
> > > do_nmi() doesn't block NMIs, so you can't just call it here in case you
> > > get _another_ NMI while you're in the NMI handler.
> > 
> > Oh wait, I see -- you're saying that this (20059:76a65bf2aa4d) is wrong
> > because NMIs are indeed blocked, and have been since the VMEXIT.
> > 
> > In that case, I agree that we should just run the NMI handler, but first
> > I would really like to know what _unblocks_ NMIs in this case.  Any of
> > the things I can think of (the next vmenter, the next iret, ??) will
> > need some handling to make sure they actually happen before, say, we
> > take this CPU into the idle loop...
> 
> What about a little stub-asm return_from_nmi / reenable_nmis with
> something like:
> 	pushf
>         pushq $__HYPERVISOR_CS
> 	pushq 1f
> 	iret
> 1: ...
> 
> That should re-enable NMIs at whichever point we think is appropriate?

If an iret is what's needed then replacing self_nmi() with 'int $2'
(i.e. the proposed patch) seems like a neater fix.  And section 6.7.1 of
the manual suggests that iret is indeed what we want, so:

Acked-by: Tim Deegan <tim@xen.org>

Cheers,

Tim

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 12:39         ` Tim Deegan
@ 2012-11-13 14:21           ` Jan Beulich
  2012-11-13 14:43             ` Tim Deegan
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2012-11-13 14:21 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Malcolm Crossley, eddie.dong@intel.com, Ian Campbell,
	jun.nakajima@intel.com, xen-devel

>>> On 13.11.12 at 13:39, Tim Deegan <tim@xen.org> wrote:
> At 12:17 +0000 on 13 Nov (1352809049), Ian Campbell wrote:
>> On Tue, 2012-11-13 at 12:05 +0000, Tim Deegan wrote:
>> > At 11:47 +0000 on 13 Nov (1352807234), Tim Deegan wrote:
>> > > At 11:38 +0000 on 13 Nov (1352806689), Jan Beulich wrote:
>> > > > > diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
>> > > > > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> > > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> > > > > @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
>> > > > >                   (X86_EVENTTYPE_NMI << 8) )
>> > > > >                  goto exit_and_crash;
>> > > > >              HVMTRACE_0D(NMI);
>> > > > > -            self_nmi(); /* Real NMI, vector 2: normal processing. */
>> > > > > +            asm("int $2"); /* Real NMI, vector 2: normal processing. */
>> > > > 
>> > > > In any case - why can't you call do_nmi() directly from here?
>> > > 
>> > > ... this is my doing.  There used to be a call to do_nmi() here, but
>> > > do_nmi() doesn't block NMIs, so you can't just call it here in case you
>> > > get _another_ NMI while you're in the NMI handler.
>> > 
>> > Oh wait, I see -- you're saying that this (20059:76a65bf2aa4d) is wrong
>> > because NMIs are indeed blocked, and have been since the VMEXIT.
>> > 
>> > In that case, I agree that we should just run the NMI handler, but first
>> > I would really like to know what _unblocks_ NMIs in this case.  Any of
>> > the things I can think of (the next vmenter, the next iret, ??) will
>> > need some handling to make sure they actually happen before, say, we
>> > take this CPU into the idle loop...
>> 
>> What about a little stub-asm return_from_nmi / reenable_nmis with
>> something like:
>> 	pushf
>>         pushq $__HYPERVISOR_CS
>> 	pushq 1f
>> 	iret
>> 1: ...
>> 
>> That should re-enable NMIs at whichever point we think is appropriate?
> 
> If an iret is what's needed then replacing self_nmi() with 'int $2'
> (i.e. the proposed patch) seems like a neater fix.  And section 6.7.1 of
> the manual suggests that iret is indeed what we want, so:

An IRET just cannot be the only thing that ends the NMI masked
window. At least some forms of #VMENTER should have that
effect too, as otherwise NMIs would remain masked for arbitrary
periods of time.

Further, under the assumption that the self_nmi() worked at least
in some cases (since you likely tested it), there would also be room
for the NMI not being masked when getting here. Which would get
us into the stack trouble that Andrew mentioned in his reply.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 14:21           ` Jan Beulich
@ 2012-11-13 14:43             ` Tim Deegan
  2012-11-13 14:56               ` Jan Beulich
  2012-11-13 15:14               ` Andrew Cooper
  0 siblings, 2 replies; 15+ messages in thread
From: Tim Deegan @ 2012-11-13 14:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Malcolm Crossley, eddie.dong@intel.com, Ian Campbell,
	jun.nakajima@intel.com, xen-devel

At 14:21 +0000 on 13 Nov (1352816476), Jan Beulich wrote:
> >>> On 13.11.12 at 13:39, Tim Deegan <tim@xen.org> wrote:
> > At 12:17 +0000 on 13 Nov (1352809049), Ian Campbell wrote:
> >> On Tue, 2012-11-13 at 12:05 +0000, Tim Deegan wrote:
> >> > At 11:47 +0000 on 13 Nov (1352807234), Tim Deegan wrote:
> >> > > At 11:38 +0000 on 13 Nov (1352806689), Jan Beulich wrote:
> >> > > > > diff -r 62885b3c34c8 -r ea756059a8da xen/arch/x86/hvm/vmx/vmx.c
> >> > > > > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> > > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> > > > > @@ -2442,7 +2442,7 @@ void vmx_vmexit_handler(struct cpu_user_
> >> > > > >                   (X86_EVENTTYPE_NMI << 8) )
> >> > > > >                  goto exit_and_crash;
> >> > > > >              HVMTRACE_0D(NMI);
> >> > > > > -            self_nmi(); /* Real NMI, vector 2: normal processing. */
> >> > > > > +            asm("int $2"); /* Real NMI, vector 2: normal processing. */
> >> > > > 
> >> > > > In any case - why can't you call do_nmi() directly from here?
> >> > > 
> >> > > ... this is my doing.  There used to be a call to do_nmi() here, but
> >> > > do_nmi() doesn't block NMIs, so you can't just call it here in case you
> >> > > get _another_ NMI while you're in the NMI handler.
> >> > 
> >> > Oh wait, I see -- you're saying that this (20059:76a65bf2aa4d) is wrong
> >> > because NMIs are indeed blocked, and have been since the VMEXIT.
> >> > 
> >> > In that case, I agree that we should just run the NMI handler, but first
> >> > I would really like to know what _unblocks_ NMIs in this case.  Any of
> >> > the things I can think of (the next vmenter, the next iret, ??) will
> >> > need some handling to make sure they actually happen before, say, we
> >> > take this CPU into the idle loop...
> >> 
> >> What about a little stub-asm return_from_nmi / reenable_nmis with
> >> something like:
> >> 	pushf
> >>         pushq $__HYPERVISOR_CS
> >> 	pushq 1f
> >> 	iret
> >> 1: ...
> >> 
> >> That should re-enable NMIs at whichever point we think is appropriate?
> > 
> > If an iret is what's needed then replacing self_nmi() with 'int $2'
> > (i.e. the proposed patch) seems like a neater fix.  And section 6.7.1 of
> > the manual suggests that iret is indeed what we want, so:
> 
> An IRET just cannot be the only thing that ends the NMI masked
> window.  At least some forms of #VMENTER should have that
> effect too, as otherwise NMIs would remain masked for arbitrary
> periods of time.
> 
> Further, under the assumption that the self_nmi() worked at least
> in some cases (since you likely tested it), there would also be room
> for the NMI not being masked when getting here. Which would get
> us into the stack trouble that Andrew mentioned in his reply.

I got the impression that Andrew and Malcolm were seeing a long loop of
exit/self_nmi()/enter/NMI/exit/..., eventually broken by a real
interrupt or an IPI causing (by its iret) the NMI to get delivered in
root mode.  So the NMI does get handled, just not immediately.

The fact that there was a loop and not just a delay in the NMI handling
suggests that VMENTER does indeed re-enable NMIs (or at least
NMI-exiting) but I couldn't find that in the manual.  In any case, I
think the int $2 version is correcter than the direct call as it also
disables normal interrupts.

Tim.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 12:05     ` Tim Deegan
  2012-11-13 12:16       ` Andrew Cooper
  2012-11-13 12:17       ` Ian Campbell
@ 2012-11-13 14:47       ` Jan Beulich
  2 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2012-11-13 14:47 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Malcolm Crossley, eddie.dong, jun.nakajima, xen-devel

>>> On 13.11.12 at 13:05, Tim Deegan <tim@xen.org> wrote:
> In that case, I agree that we should just run the NMI handler, but first
> I would really like to know what _unblocks_ NMIs in this case.  Any of
> the things I can think of (the next vmenter, the next iret, ??) will
> need some handling to make sure they actually happen before, say, we
> take this CPU into the idle loop...

Wouldn't we have that exact same problem on the "normal" NMI
path? We do switch away from the NMI stack when it arrived
while in guest context, but I don't think we do anything to end the
NMI masking window before going into the scheduler (or handling
other softirqs).

If so, rather than doing this potentially dangerous "int $2" (in
terms of IST stack usage), we might be better off dealing with
the ending of the mask window in a uniform way.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 14:43             ` Tim Deegan
@ 2012-11-13 14:56               ` Jan Beulich
  2012-11-13 15:21                 ` Tim Deegan
  2012-11-13 15:14               ` Andrew Cooper
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2012-11-13 14:56 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Malcolm Crossley, eddie.dong@intel.com, Ian Campbell,
	jun.nakajima@intel.com, xen-devel

>>> On 13.11.12 at 15:43, Tim Deegan <tim@xen.org> wrote:
> The fact that there was a loop and not just a delay in the NMI handling
> suggests that VMENTER does indeed re-enable NMIs (or at least
> NMI-exiting) but I couldn't find that in the manual.  In any case, I
> think the int $2 version is correcter than the direct call as it also
> disables normal interrupts.

You're not commenting on the stack aspect and previous approach
at all: Assuming your self_nmi() approach at least worked
somewhere, that somewhere would be the place where the "int $2"
approach would break. In other words - are you confident that
NMI is _always_ masked when we get there (and hence your
earlier approach _never_ worked)?

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 14:43             ` Tim Deegan
  2012-11-13 14:56               ` Jan Beulich
@ 2012-11-13 15:14               ` Andrew Cooper
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2012-11-13 15:14 UTC (permalink / raw)
  To: xen-devel

>>>> What about a little stub-asm return_from_nmi / reenable_nmis with
>>>> something like:
>>>> 	pushf
>>>>         pushq $__HYPERVISOR_CS
>>>> 	pushq 1f
>>>> 	iret
>>>> 1: ...
>>>>
>>>> That should re-enable NMIs at whichever point we think is appropriate?
>>> If an iret is what's needed then replacing self_nmi() with 'int $2'
>>> (i.e. the proposed patch) seems like a neater fix.  And section 6.7.1 of
>>> the manual suggests that iret is indeed what we want, so:
>> An IRET just cannot be the only thing that ends the NMI masked
>> window.  At least some forms of #VMENTER should have that
>> effect too, as otherwise NMIs would remain masked for arbitrary
>> periods of time.
>>
>> Further, under the assumption that the self_nmi() worked at least
>> in some cases (since you likely tested it), there would also be room
>> for the NMI not being masked when getting here. Which would get
>> us into the stack trouble that Andrew mentioned in his reply.
> I got the impression that Andrew and Malcolm were seeing a long loop of
> exit/self_nmi()/enter/NMI/exit/...,

Yes - thousands of instantaneous exits from the VM (the eip of the guest
is always the same while this is going on), and this is trivial
behaviour to demonstrate with a CPU-bound VCPU task.

>  eventually broken by a real
> interrupt or an IPI causing (by its iret) the NMI to get delivered in
> root mode.  So the NMI does get handled, just not immediately.

We are still not certain exactly what breaks the PCPU out of this cycle,
but timer interrupts were seen as the first deviation from the above cycle.

My interpretation of the Intel SDM manual along with the observable
evidence, is that NMIs are disabled from the vmexit until the vmresume
(as there appear to be no other irets on that path).  The action of
receiving a vmexit with reason NMI means that an external NMI has
occurred and we should run the handler.

With the current code, we raise a self_nmi(), which queues up a brand
new NMI to interrupt us as soon as we vmresume and NMIs are re-enabled.

>
> The fact that there was a loop and not just a delay in the NMI handling
> suggests that VMENTER does indeed re-enable NMIs (or at least
> NMI-exiting) but I couldn't find that in the manual.  In any case, I
> think the int $2 version is correcter than the direct call as it also
> disables normal interrupts.
>
> Tim.

With the asm("int $2") solution, we jump into the NMI handler (on the
NMI stack) with NMIs disabled (due to the vmexit), run the handler for
the real NMI which caused the vmexit in the first place, then iret from
the NMI handler which re-enables NMIs.  We are then free to take NMIs
for the remainder of the codepath back to the vmresume.

The question then becomes whether we wish to enable NMIs or not, which
affects whether we issue "int $2" or "do_nmi()".  Can someone from Intel
weigh in and confirm the exact behaviour of NMIs around this code?

(Irrespective of this question at hand, having read that LWN article
about nested NMIs, I am not certain that we are immune to the issue, so
should think about implementing a similar fix)

~Andrew

>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 14:56               ` Jan Beulich
@ 2012-11-13 15:21                 ` Tim Deegan
  2012-11-13 15:32                   ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Tim Deegan @ 2012-11-13 15:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Malcolm Crossley, eddie.dong@intel.com, Ian Campbell,
	jun.nakajima@intel.com, xen-devel

At 14:56 +0000 on 13 Nov (1352818572), Jan Beulich wrote:
> >>> On 13.11.12 at 15:43, Tim Deegan <tim@xen.org> wrote:
> > The fact that there was a loop and not just a delay in the NMI handling
> > suggests that VMENTER does indeed re-enable NMIs (or at least
> > NMI-exiting) but I couldn't find that in the manual.  In any case, I
> > think the int $2 version is correcter than the direct call as it also
> > disables normal interrupts.
> 
> You're not commenting on the stack aspect and previous approach
> at all: Assuming your self_nmi() approach at least worked
> somewhere, that somewhere would be the place where the "int $2"
> approach would break. In other words - are you confident that
> NMI is _always_ masked when we get there (and hence your
> earlier approach _never_ worked)?

ISWYM.  No, having thought about it a bit more, I'm not confident of
that at all -- at this point in the exit handlers, interrupts are
enabled, so we may already have had an IRET.  So we'd be risking taking
one NMI while handling another (whether we do int $2 or a direct call;
it's bad in either case). :(

I think that the NMI case needs to move to the top of the vmexit
handler, beside the machine_check cases.  Once it's there, either the
direct call (+ artifical iret to clear the masking) or int $2 should be
fine.

Tim.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler
  2012-11-13 15:21                 ` Tim Deegan
@ 2012-11-13 15:32                   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2012-11-13 15:32 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Malcolm Crossley, eddie.dong@intel.com, Ian Campbell,
	jun.nakajima@intel.com, xen-devel

>>> On 13.11.12 at 16:21, Tim Deegan <tim@xen.org> wrote:
> I think that the NMI case needs to move to the top of the vmexit
> handler, beside the machine_check cases.  Once it's there, either the
> direct call (+ artifical iret to clear the masking) or int $2 should be
> fine.

Yes, that sounds like a viable thing (whether the artificial iret is
going to be needed here [and not elsewhere] is a different thing,
see my other response regarding the PV case).

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-11-13 15:32 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-13 11:23 [PATCH] [RFC PATCH] xen: vmx: Use an INT 2 call to process real NMI's instead of self_nmi() in VMEXIT handler Malcolm Crossley
2012-11-13 11:38 ` Jan Beulich
2012-11-13 11:47   ` Tim Deegan
2012-11-13 12:05     ` Tim Deegan
2012-11-13 12:16       ` Andrew Cooper
2012-11-13 12:17       ` Ian Campbell
2012-11-13 12:39         ` Tim Deegan
2012-11-13 14:21           ` Jan Beulich
2012-11-13 14:43             ` Tim Deegan
2012-11-13 14:56               ` Jan Beulich
2012-11-13 15:21                 ` Tim Deegan
2012-11-13 15:32                   ` Jan Beulich
2012-11-13 15:14               ` Andrew Cooper
2012-11-13 14:47       ` Jan Beulich
2012-11-13 11:48   ` Andrew Cooper

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).