public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Nathan Chancellor <nathan@kernel.org>,
	Chao Gao <chao.gao@intel.com>,  Zeng Guang <guang.zeng@intel.com>
Subject: [PATCH v2 4/7] KVM: nVMX: Detect nested posted interrupt NV at nested VM-Exit injection
Date: Thu,  5 Sep 2024 21:34:10 -0700	[thread overview]
Message-ID: <20240906043413.1049633-5-seanjc@google.com> (raw)
In-Reply-To: <20240906043413.1049633-1-seanjc@google.com>

When synthensizing a nested VM-Exit due to an external interrupt, pend a
nested posted interrupt if the external interrupt vector matches L2's PI
notification vector, i.e. if the interrupt is a PI notification for L2.
This fixes a bug where KVM will incorrectly inject VM-Exit instead of
processing nested posted interrupt when IPI virtualization is enabled.

Per the SDM, detection of the notification vector doesn't occur until the
interrupt is acknowledge and deliver to the CPU core.

  If the external-interrupt exiting VM-execution control is 1, any unmasked
  external interrupt causes a VM exit (see Section 26.2). If the "process
  posted interrupts" VM-execution control is also 1, this behavior is
  changed and the processor handles an external interrupt as follows:

    1. The local APIC is acknowledged; this provides the processor core
       with an interrupt vector, called here the physical vector.
    2. If the physical vector equals the posted-interrupt notification
       vector, the logical processor continues to the next step. Otherwise,
       a VM exit occurs as it would normally due to an external interrupt;
       the vector is saved in the VM-exit interruption-information field.

For the most part, KVM has avoided problems because a PI NV for L2 that
arrives will L2 is active will be processed by hardware, and KVM checks
for a pending notification vector during nested VM-Enter.  Thus, to hit
the bug, the PI NV interrupt needs to sneak its way into L1's vIRR while
L2 is active.

Without IPI virtualization, the scenario is practically impossible to hit,
modulo L1 doing weird things (see below), as the ordering between
vmx_deliver_posted_interrupt() and nested VM-Enter effectively guarantees
that either the sender will see the vCPU as being in_guest_mode(), or the
receiver will see the interrupt in its vIRR.

With IPI virtualization, introduced by commit d588bb9be1da ("KVM: VMX:
enable IPI virtualization"), the sending CPU effectively implements a rough
equivalent of vmx_deliver_posted_interrupt(), sans the nested PI NV check.
If the target vCPU has a valid PID, the CPU will send a PI NV interrupt
based on _L1's_ PID, as the sender's because IPIv table points at L1 PIDs.

  PIR := 32 bytes at PID_ADDR;
  // under lock
  PIR[V] := 1;
  store PIR at PID_ADDR;
  // release lock

  NotifyInfo := 8 bytes at PID_ADDR + 32;
  // under lock
  IF NotifyInfo.ON = 0 AND NotifyInfo.SN = 0; THEN
    NotifyInfo.ON := 1;
    SendNotify := 1;
  ELSE
    SendNotify := 0;
  FI;
  store NotifyInfo at PID_ADDR + 32;
  // release lock

  IF SendNotify = 1; THEN
    send an IPI specified by NotifyInfo.NDST and NotifyInfo.NV;
  FI;

As a result, the target vCPU ends up receiving an interrupt on KVM's
POSTED_INTR_VECTOR while L2 is running, with an interrupt in L1's PIR for
L2's nested PI NV.  The POSTED_INTR_VECTOR interrupt triggers a VM-Exit
from L2 to L0, KVM moves the interrupt from L1's PIR to vIRR, triggers a
KVM_REQ_EVENT prior to re-entry to L2, and calls vmx_check_nested_events(),
effectively bypassing all of KVM's "early" checks on nested PI NV.

Without IPI virtualization, the bug can likely be hit only if L1 programs
an assigned device to _post_ an interrupt to L2's notification vector, by
way of L1's PID.PIR.  Doing so would allow the interrupt to get into L1's
vIRR without KVM checking vmcs12's NV.  Which is architecturally allowed,
but unlikely behavior for a hypervisor.

Cc: Zeng Guang <guang.zeng@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6b7e0ab0e45e..238c26155c2a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4307,6 +4307,20 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 		if (WARN_ON_ONCE(irq < 0))
 			goto no_vmexit;
 
+		/*
+		 * If the IRQ is L2's PI notification vector, process posted
+		 * interrupts for L2 instead of injecting VM-Exit, as the
+		 * detection/morphing architecturally occurs when the IRQ is
+		 * delivered to the CPU.  Note, only interrupts that are routed
+		 * through the local APIC trigger posted interrupt processing,
+		 * and enabling posted interrupts requires ACK-on-exit.
+		 */
+		if (irq == vmx->nested.posted_intr_nv) {
+			vmx->nested.pi_pending = true;
+			kvm_apic_clear_irr(vcpu, irq);
+			goto no_vmexit;
+		}
+
 		nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT,
 				  INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR | irq, 0);
 
-- 
2.46.0.469.g59c65b2a67-goog


  parent reply	other threads:[~2024-09-06  4:34 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-06  4:34 [PATCH v2 0/7] KVM: nVMX: Fix IPIv vs. nested posted interrupts Sean Christopherson
2024-09-06  4:34 ` [PATCH v2 1/7] KVM: x86: Move "ack" phase of local APIC IRQ delivery to separate API Sean Christopherson
2024-09-06  4:34 ` [PATCH v2 2/7] KVM: nVMX: Get to-be-acknowledge IRQ for nested VM-Exit at injection site Sean Christopherson
2024-09-06  4:34 ` [PATCH v2 3/7] KVM: nVMX: Suppress external interrupt VM-Exit injection if there's no IRQ Sean Christopherson
2024-09-06  4:34 ` Sean Christopherson [this message]
2024-09-06  4:34 ` [PATCH v2 5/7] KVM: x86: Fold kvm_get_apic_interrupt() into kvm_cpu_get_interrupt() Sean Christopherson
2024-09-06  4:34 ` [PATCH v2 6/7] KVM: nVMX: Explicitly invalidate posted_intr_nv if PI is disabled at VM-Enter Sean Christopherson
2024-09-06  4:34 ` [PATCH v2 7/7] KVM: nVMX: Assert that vcpu->mutex is held when accessing secondary VMCSes Sean Christopherson
2024-09-10  4:56 ` [PATCH v2 0/7] KVM: nVMX: Fix IPIv vs. nested posted interrupts Sean Christopherson
2024-09-10 16:22   ` Nathan Chancellor
2024-09-10 17:43     ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240906043413.1049633-5-seanjc@google.com \
    --to=seanjc@google.com \
    --cc=chao.gao@intel.com \
    --cc=guang.zeng@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nathan@kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox