[PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
@ 2016-03-23  5:08 Yuki Shibuya
  2016-03-23 13:18 ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Yuki Shibuya @ 2016-03-23  5:08 UTC (permalink / raw)
  To: pbonzini@redhat.com; +Cc: kvm@vger.kernel.org, Jan Kiszka, Nobuo Yoshida

Non maskable interrupts (NMI) are preferred to interrupts in current
implementation. If a NMI is pending and NMI is blocked by the result
of nmi_allowed(), pending interrupt is not injected and
enable_irq_window() is not executed, even if interrupts injection is
allowed.

In old kernel (e.g. 2.6.32), schedule() is often called in NMI context.
In this case, interrupts are needed to execute iret that intends end
of NMI. The flag of blocking new NMI is not cleared until the guest
execute the iret, and interrupts are blocked by pending NMI. Due to
this, iret can't be invoked in the guest, and the guest is starved
until block is cleared by some events (e.g. canceling injection).

This patch injects pending interrupts, when it's allowed, even if NMI
is blocked. And, if NMI pending counter > 0, NMI is not blocked and an
interrupt is pending, NMI pending counter is cleared to execute
enable_irq_window().

Signed-off-by: Yuki Shibuya <shibuya.yk@ncos.nec.co.jp>
---
 arch/x86/kvm/x86.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7236bd3..294a977 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6087,12 +6087,18 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
 	}

 	/* try to inject new event if pending */
-	if (vcpu->arch.nmi_pending) {
-		if (kvm_x86_ops->nmi_allowed(vcpu)) {
-			--vcpu->arch.nmi_pending;
-			vcpu->arch.nmi_injected = true;
-			kvm_x86_ops->set_nmi(vcpu);
-		}
+	if (vcpu->arch.nmi_pending && kvm_x86_ops->nmi_allowed(vcpu)) {
+		--vcpu->arch.nmi_pending;
+		vcpu->arch.nmi_injected = true;
+		kvm_x86_ops->set_nmi(vcpu);
+
+		/* If nmi pending > 0 and injectable interrupts exist,
+		 * nmi pending counter is cleared to prevent skipping
+		 * injectable pending interrupts.
+		 */
+		if (vcpu->arch.nmi_pending && kvm_cpu_has_injectable_intr(vcpu)
+					&& kvm_x86_ops->interrupt_allowed(vcpu))
+			vcpu->arch.nmi_pending = 0;
 	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
 		/*
 		 * Because interrupts can be injected asynchronously, we are
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23  5:08 [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist Yuki Shibuya
@ 2016-03-23 13:18 ` Paolo Bonzini
  2016-03-23 17:21   ` Radim Krčmář
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2016-03-23 13:18 UTC (permalink / raw)
  To: Yuki Shibuya; +Cc: kvm@vger.kernel.org, Jan Kiszka, Nobuo Yoshida



On 23/03/2016 06:08, Yuki Shibuya wrote:
> +		/* If nmi pending > 0 and injectable interrupts exist,
> +		 * nmi pending counter is cleared to prevent skipping
> +		 * injectable pending interrupts.
> +		 */
> +		if (vcpu->arch.nmi_pending && kvm_cpu_has_injectable_intr(vcpu)
> +					&& kvm_x86_ops->interrupt_allowed(vcpu))
> +			vcpu->arch.nmi_pending = 0;

I am not sure I understand this.  Why is it safe to drop nmi_pending?
Can we instead do something like this in vcpu_enter_guest?

It would be nice to have a testcase for this in kvm-unit-tests.

Paolo

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7236bd3a4c3d..6c73cbc8e19a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6561,10 +6561,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (inject_pending_event(vcpu, req_int_win) != 0)
 			req_immediate_exit = true;
 		/* enable NMI/IRQ window open exits if needed */
-		else if (vcpu->arch.nmi_pending)
-			kvm_x86_ops->enable_nmi_window(vcpu);
-		else if (kvm_cpu_has_injectable_intr(vcpu) || req_int_win)
-			kvm_x86_ops->enable_irq_window(vcpu);
+		else {
+			if (vcpu->arch.nmi_pending)
+				kvm_x86_ops->enable_nmi_window(vcpu);
+			if (kvm_cpu_has_injectable_intr(vcpu) || req_int_win)
+				kvm_x86_ops->enable_irq_window(vcpu);
+		}
 
 		if (kvm_lapic_enabled(vcpu)) {
 			update_cr8_intercept(vcpu);



Paolo

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 13:18 ` Paolo Bonzini
@ 2016-03-23 17:21   ` Radim Krčmář
  2016-03-23 18:22     ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Radim Krčmář @ 2016-03-23 17:21 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Yuki Shibuya, kvm@vger.kernel.org, Jan Kiszka, Nobuo Yoshida

2016-03-23 14:18+0100, Paolo Bonzini:
> On 23/03/2016 06:08, Yuki Shibuya wrote:
>> +		/* If nmi pending > 0 and injectable interrupts exist,
>> +		 * nmi pending counter is cleared to prevent skipping
>> +		 * injectable pending interrupts.
>> +		 */
>> +		if (vcpu->arch.nmi_pending && kvm_cpu_has_injectable_intr(vcpu)
>> +					&& kvm_x86_ops->interrupt_allowed(vcpu))
>> +			vcpu->arch.nmi_pending = 0;
> 
> I am not sure I understand this.  Why is it safe to drop nmi_pending?

NMIs are latched (queue length 1) and therefore cannot be pending after
an injection.  I think we want to do it unconditionally.

> Can we instead do something like this in vcpu_enter_guest?

(We should, even if it doesn't fix the bug.  Maskable interrupts can be
 injected while NMIs are blocked.)

What the hell is 2.6.32 doing, though?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 17:21   ` Radim Krčmář
@ 2016-03-23 18:22     ` Paolo Bonzini
  2016-03-23 19:04       ` Radim Krčmář
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2016-03-23 18:22 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Yuki Shibuya, kvm@vger.kernel.org, Jan Kiszka, Nobuo Yoshida



On 23/03/2016 18:21, Radim Krčmář wrote:
> > > +		 * nmi pending counter is cleared to prevent skipping
> > > +		 * injectable pending interrupts.
> > > +		 */
> > > +		if (vcpu->arch.nmi_pending && kvm_cpu_has_injectable_intr(vcpu)
> > > +					&& kvm_x86_ops->interrupt_allowed(vcpu))
> > > +			vcpu->arch.nmi_pending = 0;
> > 
> > I am not sure I understand this.  Why is it safe to drop nmi_pending?
> 
> NMIs are latched (queue length 1) and therefore cannot be pending after
> an injection.  I think we want to do it unconditionally.

If that is right, process_nmi would be the place where you'd limit the
queue to 1.

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 18:22     ` Paolo Bonzini
@ 2016-03-23 19:04       ` Radim Krčmář
  2016-03-23 21:00         ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Radim Krčmář @ 2016-03-23 19:04 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Yuki Shibuya, kvm@vger.kernel.org, Jan Kiszka, Nobuo Yoshida

2016-03-23 19:22+0100, Paolo Bonzini:
> On 23/03/2016 18:21, Radim Krčmář wrote:
>> NMIs are latched (queue length 1) and therefore cannot be pending after
>> an injection.  I think we want to do it unconditionally.
> 
> If that is right, process_nmi would be the place where you'd limit the
> queue to 1.

You are right.

I think we can always limit the queue to 1:
process_nmi is from 7460fb4a3400 ("KVM: Fix simultaneous NMIs") and the
commit message explains
  If simultaneous NMIs happen, we're supposed to queue the second and
  next (collapsing them), but currently we sometimes collapse the second
  into the first.

I think that hardware coalesces all NMIs that arrive within one
instruction (NMI is delivered at instruction boundaries)
and one NMI is sufficient anyway (all events that triggered NMIs are
going to be handled in the first one and the second one is for nothing),
so reasons behind "supposed to" elude me.

We could overhaul NMI handling much more at that point, but it's safer
to keep it this way as there aren't major bugs. :)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 19:04       ` Radim Krčmář
@ 2016-03-23 21:00         ` Paolo Bonzini
  2016-03-23 21:06           ` Jan Kiszka
  2016-03-23 21:52           ` Radim Krčmář
  0 siblings, 2 replies; 12+ messages in thread
From: Paolo Bonzini @ 2016-03-23 21:00 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Yuki Shibuya, kvm@vger.kernel.org, Jan Kiszka, Nobuo Yoshida

On 23/03/2016 20:04, Radim Krčmář wrote:
> I think that hardware coalesces all NMIs that arrive within one
> instruction (NMI is delivered at instruction boundaries)
> and one NMI is sufficient anyway (all events that triggered NMIs are
> going to be handled in the first one and the second one is for nothing),
> so reasons behind "supposed to" elude me.

http://www.spinics.net/lists/kvm/msg61313.html provides the relevant
history lesson:

Jan: "Thinking this through again, it's actually not yet clear to me
what we are modeling here: If two NMI events arrive almost perfectly in
parallel, does the real hardware guarantee that they will always cause
two NMI events in the CPU? Then this is required."

Avi: "It's not 100% clear from the SDM, but this is what I understood
from it. And it's needed - the NMI handlers are now being reworked to
handle just one NMI source (hopefully the cheapest) in the handler, and
if we detect a back-to-back NMI, handle all possible NMI sources."

Not sure if that change actually ever happened in Linux (Avi went on
describing an application to pvspinlocks, but pvspinlocks ended up _not_
using NMIs), actually.

The Intel SDM is very obscure, AMD a bit less.  It says: "The processor
recognizes an NMI at an instruction boundary. [...] NMI cannot be
masked. However, when an NMI is recognized by the processor, recognition
of subsequent NMIs are disabled until an IRET instruction is executed."
Here recognition means delivery, acceptance need not be at instruction
boundaries (it can be at clock cycle boundary).  It might be possible to
figure this out on bare-metal by executing an expensive instruction such
as MFENCE.

There is also at least one case where you could have one pending (but
not injected) and one latched NMI at instruction boundary, and that is
the special NMI shadow from the STI instruction.

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 21:00         ` Paolo Bonzini
@ 2016-03-23 21:06           ` Jan Kiszka
  2016-03-23 21:11             ` Paolo Bonzini
  2016-03-23 21:52           ` Radim Krčmář
  1 sibling, 1 reply; 12+ messages in thread
From: Jan Kiszka @ 2016-03-23 21:06 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: Yuki Shibuya, kvm@vger.kernel.org, Nobuo Yoshida

On 2016-03-23 22:00, Paolo Bonzini wrote:
> 
> 
> On 23/03/2016 20:04, Radim Krčmář wrote:
>> I think that hardware coalesces all NMIs that arrive within one
>> instruction (NMI is delivered at instruction boundaries)
>> and one NMI is sufficient anyway (all events that triggered NMIs are
>> going to be handled in the first one and the second one is for nothing),
>> so reasons behind "supposed to" elude me.
> 
> http://www.spinics.net/lists/kvm/msg61313.html provides the relevant
> history lesson:
> 
> Jan: "Thinking this through again, it's actually not yet clear to me
> what we are modeling here: If two NMI events arrive almost perfectly in
> parallel, does the real hardware guarantee that they will always cause
> two NMI events in the CPU? Then this is required."
> 
> Avi: "It's not 100% clear from the SDM, but this is what I understood
> from it. And it's needed - the NMI handlers are now being reworked to
> handle just one NMI source (hopefully the cheapest) in the handler, and
> if we detect a back-to-back NMI, handle all possible NMI sources."
> 
> Not sure if that change actually ever happened in Linux (Avi went on
> describing an application to pvspinlocks, but pvspinlocks ended up _not_
> using NMIs), actually.
> 
> The Intel SDM is very obscure, AMD a bit less.  It says: "The processor
> recognizes an NMI at an instruction boundary. [...] NMI cannot be
> masked. However, when an NMI is recognized by the processor, recognition
> of subsequent NMIs are disabled until an IRET instruction is executed."
> Here recognition means delivery, acceptance need not be at instruction
> boundaries (it can be at clock cycle boundary).  It might be possible to
> figure this out on bare-metal by executing an expensive instruction such
> as MFENCE.
> 
> There is also at least one case where you could have one pending (but
> not injected) and one latched NMI at instruction boundary, and that is
> the special NMI shadow from the STI instruction.

This sounds to me like we should try to address the issue Yuki is seeing
without playing with the nmi_pending counter.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 21:06           ` Jan Kiszka
@ 2016-03-23 21:11             ` Paolo Bonzini
  2016-03-23 21:55               ` Radim Krčmář
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2016-03-23 21:11 UTC (permalink / raw)
  To: Jan Kiszka, Radim Krčmář
  Cc: Yuki Shibuya, kvm@vger.kernel.org, Nobuo Yoshida



On 23/03/2016 22:06, Jan Kiszka wrote:
> > There is also at least one case where you could have one pending (but
> > not injected) and one latched NMI at instruction boundary, and that is
> > the special NMI shadow from the STI instruction.
> 
> This sounds to me like we should try to address the issue Yuki is seeing
> without playing with the nmi_pending counter.

Yes, it would be nice if my suggestion worked. :)

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 21:00         ` Paolo Bonzini
  2016-03-23 21:06           ` Jan Kiszka
@ 2016-03-23 21:52           ` Radim Krčmář
  1 sibling, 0 replies; 12+ messages in thread
From: Radim Krčmář @ 2016-03-23 21:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Yuki Shibuya, kvm@vger.kernel.org, Jan Kiszka, Nobuo Yoshida

2016-03-23 22:00+0100, Paolo Bonzini:
> On 23/03/2016 20:04, Radim Krčmář wrote:
>> I think that hardware coalesces all NMIs that arrive within one
>> instruction (NMI is delivered at instruction boundaries)
>> and one NMI is sufficient anyway (all events that triggered NMIs are
>> going to be handled in the first one and the second one is for nothing),
>> so reasons behind "supposed to" elude me.
> 
> http://www.spinics.net/lists/kvm/msg61313.html provides the relevant
> history lesson:

Very nice, thanks!

> Jan: "Thinking this through again, it's actually not yet clear to me
> what we are modeling here: If two NMI events arrive almost perfectly in
> parallel, does the real hardware guarantee that they will always cause
> two NMI events in the CPU? Then this is required."
> 
> Avi: "It's not 100% clear from the SDM, but this is what I understood
> from it. And it's needed - the NMI handlers are now being reworked to
> handle just one NMI source (hopefully the cheapest) in the handler, and
> if we detect a back-to-back NMI, handle all possible NMI sources."
> 
> Not sure if that change actually ever happened in Linux (Avi went on
> describing an application to pvspinlocks, but pvspinlocks ended up _not_
> using NMIs), actually.

All NMI handlers are called now (nmi_handle) and it would be quite wrong
to do otherwise, because NMIs do collapse in both models.

> The Intel SDM is very obscure, AMD a bit less.  It says: "The processor
> recognizes an NMI at an instruction boundary. [...] NMI cannot be
> masked. However, when an NMI is recognized by the processor, recognition
> of subsequent NMIs are disabled until an IRET instruction is executed."
> Here recognition means delivery, acceptance need not be at instruction
> boundaries (it can be at clock cycle boundary).  It might be possible to
> figure this out on bare-metal by executing an expensive instruction such
> as MFENCE.

Yes. :(

> There is also at least one case where you could have one pending (but
> not injected) and one latched NMI at instruction boundary, and that is
> the special NMI shadow from the STI instruction.

I think that STI will keep the NMI latched (pending is a synonym) and
that there is nothing between injected (recognized) and latched.

APM (vol2, 15.21.5 Interrupt Shadows) says that interrupts aren't
recognized during shadow.  And that "The saved instruction pointer
points to the instruction immediately following the boundary where the
NMI was recognized." (vol2, 8.2.3 NMI—Non-Maskable-Interrupt Exception),
so recognizing the NMI after STI would mean that we'd execute some
instruction twice ...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 21:11             ` Paolo Bonzini
@ 2016-03-23 21:55               ` Radim Krčmář
  2016-03-24  5:08                 ` Yuki Shibuya
  0 siblings, 1 reply; 12+ messages in thread
From: Radim Krčmář @ 2016-03-23 21:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Jan Kiszka, Yuki Shibuya, kvm@vger.kernel.org, Nobuo Yoshida

2016-03-23 22:11+0100, Paolo Bonzini:
> On 23/03/2016 22:06, Jan Kiszka wrote:
>> > There is also at least one case where you could have one pending (but
>> > not injected) and one latched NMI at instruction boundary, and that is
>> > the special NMI shadow from the STI instruction.
>> 
>> This sounds to me like we should try to address the issue Yuki is seeing
>> without playing with the nmi_pending counter.

I agree, the discussion was mostly offtopic, sorry.

> Yes, it would be nice if my suggestion worked. :)

And if not, we'll at least learn what 2.6.32 is doing. :)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-23 21:55               ` Radim Krčmář
@ 2016-03-24  5:08                 ` Yuki Shibuya
  2016-03-24 11:09                   ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Yuki Shibuya @ 2016-03-24  5:08 UTC (permalink / raw)
  To: Radim Krčmář, pbonzini@redhat.com, Jan Kiszka
  Cc: kvm@vger.kernel.org, Nobuo Yoshida

> From: Radim Krcmar
> Sent: Thursday, March 24, 2016 6:55 AM
> 
> 2016-03-23 22:11+0100, Paolo Bonzini:
> > On 23/03/2016 22:06, Jan Kiszka wrote:
> >> > There is also at least one case where you could have one pending
> >> > (but not injected) and one latched NMI at instruction boundary, and
> >> > that is the special NMI shadow from the STI instruction.
> >>
> >> This sounds to me like we should try to address the issue Yuki is
> >> seeing without playing with the nmi_pending counter.
> 
> I agree, the discussion was mostly offtopic, sorry.
> 
> > Yes, it would be nice if my suggestion worked. :)
> 
> And if not, we'll at least learn what 2.6.32 is doing. :)

I really appreciate your cooperation and valuable discussion. I understand that 
changing nmi_pending count is not good because it is not guaranteed as safe. 
Sorry my mistake. 

I tested the logic suggested by Paolo in our environment, and it worked. I will 
post v3, replacing second part by this logic.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist
  2016-03-24  5:08                 ` Yuki Shibuya
@ 2016-03-24 11:09                   ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2016-03-24 11:09 UTC (permalink / raw)
  To: Yuki Shibuya, Radim Krčmář, Jan Kiszka
  Cc: kvm@vger.kernel.org, Nobuo Yoshida



On 24/03/2016 06:08, Yuki Shibuya wrote:
> I really appreciate your cooperation and valuable discussion. I understand that 
> changing nmi_pending count is not good because it is not guaranteed as safe. 
> Sorry my mistake. 
> 
> I tested the logic suggested by Paolo in our environment, and it worked. I will 
> post v3, replacing second part by this logic.

Great, it would be even better if you could contribute a regression test
using the kvm-unit-tests framework.

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-03-24 11:09 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-23  5:08 [PATCH v2] KVM: x86: Inject pending interrupt even if pending nmi exist Yuki Shibuya
2016-03-23 13:18 ` Paolo Bonzini
2016-03-23 17:21   ` Radim Krčmář
2016-03-23 18:22     ` Paolo Bonzini
2016-03-23 19:04       ` Radim Krčmář
2016-03-23 21:00         ` Paolo Bonzini
2016-03-23 21:06           ` Jan Kiszka
2016-03-23 21:11             ` Paolo Bonzini
2016-03-23 21:55               ` Radim Krčmář
2016-03-24  5:08                 ` Yuki Shibuya
2016-03-24 11:09                   ` Paolo Bonzini
2016-03-23 21:52           ` Radim Krčmář

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox