* [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter
@ 2024-11-01 19:14 Sean Christopherson
2024-11-01 19:14 ` [PATCH 1/5] KVM: nVMX: Explicitly update vPPR on successful nested VM-Enter Sean Christopherson
` (5 more replies)
0 siblings, 6 replies; 9+ messages in thread
From: Sean Christopherson @ 2024-11-01 19:14 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Chao Gao
Rework and clean up KVM's event handling during nested VM-Enter emulation,
and ultimately fix a bug where KVM doesn't honor event priority when
delivering a nested posted interrupt. Specifically, if there is a posted
interrupt *notification* IRQ in L1's vIRR, the IRQ should not be acked by
the CPU if a higher priority event is recognized after VM-Enter (which
unblocks L1 IRQs).
FWIW, I don't exactly love the resulting code in vmx_check_nested_events(),
so if someone has a better idea...
Sean Christopherson (5):
KVM: nVMX: Explicitly update vPPR on successful nested VM-Enter
KVM: nVMX: Check for pending INIT/SIPI after entering non-root mode
KVM: nVMX: Drop manual vmcs01.GUEST_INTERRUPT_STATUS.RVI check at
VM-Enter
KVM: nVMX: Use vmcs01's controls shadow to check for IRQ/NMI windows
at VM-Enter
KVM: nVMX: Honor event priority when emulating PI delivery during
VM-Enter
arch/x86/kvm/vmx/nested.c | 77 ++++++++++++++++++++++-----------------
1 file changed, 44 insertions(+), 33 deletions(-)
base-commit: e466901b947d529f7b091a3b00b19d2bdee206ee
--
2.47.0.163.g1226f6d8fa-goog
* [PATCH 1/5] KVM: nVMX: Explicitly update vPPR on successful nested VM-Enter
2024-11-01 19:14 [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter Sean Christopherson
@ 2024-11-01 19:14 ` Sean Christopherson
2024-11-01 19:14 ` [PATCH 2/5] KVM: nVMX: Check for pending INIT/SIPI after entering non-root mode Sean Christopherson
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2024-11-01 19:14 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Chao Gao
Always request pending event evaluation after successful nested VM-Enter
if L1 has a pending IRQ, as KVM will effectively do so anyways when APICv
is enabled, by way of vmx_has_apicv_interrupt(). This will allow dropping
the aforementioned APICv check, and will also allow handling nested Posted
Interrupt processing entirely within vmx_check_nested_events(), which is
necessary to honor priority between concurrent events.
Note, checking for pending IRQs has a subtle side effect, as it results in
a PPR update for L1's vAPIC (PPR virtualization does happen at VM-Enter,
but for nested VM-Enter that affects L2's vAPIC, not L1's vAPIC). However,
KVM updates PPR _constantly_, even when PPR technically shouldn't be
refreshed, e.g. kvm_vcpu_has_events() re-evaluates PPR if IRQs are
unblocked, by way of the same kvm_apic_has_interrupt() check. Ditto for
nested VM-Enter itself, when nested posted interrupts are enabled. Thus,
trying to avoid a PPR update on VM-Enter just to be pedantically accurate
is ridiculous, given the behavior elsewhere in KVM.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 746cb41c5b98..84386329474b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3604,7 +3604,8 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
* effectively unblock various events, e.g. INIT/SIPI cause VM-Exit
* unconditionally.
*/
- if (unlikely(evaluate_pending_interrupts))
+ if (unlikely(evaluate_pending_interrupts) ||
+ kvm_apic_has_interrupt(vcpu))
kvm_make_request(KVM_REQ_EVENT, vcpu);
/*
--
2.47.0.163.g1226f6d8fa-goog
* [PATCH 2/5] KVM: nVMX: Check for pending INIT/SIPI after entering non-root mode
2024-11-01 19:14 [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter Sean Christopherson
2024-11-01 19:14 ` [PATCH 1/5] KVM: nVMX: Explicitly update vPPR on successful nested VM-Enter Sean Christopherson
@ 2024-11-01 19:14 ` Sean Christopherson
2024-11-01 19:14 ` [PATCH 3/5] KVM: nVMX: Drop manual vmcs01.GUEST_INTERRUPT_STATUS.RVI check at VM-Enter Sean Christopherson
` (3 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2024-11-01 19:14 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Chao Gao
Explicitly check for a pending INIT or SIPI after entering non-root mode
during nested VM-Enter emulation, as no VMCS information is queried as part
of the check, i.e. there is no need to check for INIT/SIPI while vmcs01 is
still loaded.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 84386329474b..781da9fe979f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3517,8 +3517,6 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
(CPU_BASED_INTR_WINDOW_EXITING | CPU_BASED_NMI_WINDOW_EXITING);
if (likely(!evaluate_pending_interrupts) && kvm_vcpu_apicv_active(vcpu))
evaluate_pending_interrupts |= vmx_has_apicv_interrupt(vcpu);
- if (!evaluate_pending_interrupts)
- evaluate_pending_interrupts |= kvm_apic_has_pending_init_or_sipi(vcpu);
if (!vmx->nested.nested_run_pending ||
!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
@@ -3605,6 +3603,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
* unconditionally.
*/
if (unlikely(evaluate_pending_interrupts) ||
+ kvm_apic_has_pending_init_or_sipi(vcpu) ||
kvm_apic_has_interrupt(vcpu))
kvm_make_request(KVM_REQ_EVENT, vcpu);
--
2.47.0.163.g1226f6d8fa-goog
* [PATCH 3/5] KVM: nVMX: Drop manual vmcs01.GUEST_INTERRUPT_STATUS.RVI check at VM-Enter
2024-11-01 19:14 [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter Sean Christopherson
2024-11-01 19:14 ` [PATCH 1/5] KVM: nVMX: Explicitly update vPPR on successful nested VM-Enter Sean Christopherson
2024-11-01 19:14 ` [PATCH 2/5] KVM: nVMX: Check for pending INIT/SIPI after entering non-root mode Sean Christopherson
@ 2024-11-01 19:14 ` Sean Christopherson
2024-11-01 19:14 ` [PATCH 4/5] KVM: nVMX: Use vmcs01's controls shadow to check for IRQ/NMI windows " Sean Christopherson
` (2 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2024-11-01 19:14 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Chao Gao
Drop the manual check for a pending IRQ in vmcs01's RVI field during
nested VM-Enter, as the recently added call to kvm_apic_has_interrupt()
when checking for pending events after successful VM-Enter is a superset
of the RVI check (IRQs that are pending in RVI are also pending in L1's
IRR).
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 781da9fe979f..4d20ab647876 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3467,14 +3467,6 @@ static int nested_vmx_check_permission(struct kvm_vcpu *vcpu)
return 1;
}
-static u8 vmx_has_apicv_interrupt(struct kvm_vcpu *vcpu)
-{
- u8 rvi = vmx_get_rvi();
- u8 vppr = kvm_lapic_get_reg(vcpu->arch.apic, APIC_PROCPRI);
-
- return ((rvi & 0xf0) > (vppr & 0xf0));
-}
-
static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
struct vmcs12 *vmcs12);
@@ -3515,8 +3507,6 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
evaluate_pending_interrupts = exec_controls_get(vmx) &
(CPU_BASED_INTR_WINDOW_EXITING | CPU_BASED_NMI_WINDOW_EXITING);
- if (likely(!evaluate_pending_interrupts) && kvm_vcpu_apicv_active(vcpu))
- evaluate_pending_interrupts |= vmx_has_apicv_interrupt(vcpu);
if (!vmx->nested.nested_run_pending ||
!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
--
2.47.0.163.g1226f6d8fa-goog
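
For readers who want to see the dropped comparison in isolation, here is a minimal standalone sketch. It assumes plain byte inputs instead of the VMCS and local APIC accessors used in the kernel, and the name rvi_beats_vppr() is hypothetical (it is not a KVM function). It only mirrors the arithmetic of the removed vmx_has_apicv_interrupt() helper shown in the diff above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical standalone model of the removed check: the vector in RVI is
 * treated as deliverable only if its priority class (bits 7:4) is strictly
 * greater than the priority class of the virtual PPR.
 */
static bool rvi_beats_vppr(uint8_t rvi, uint8_t vppr)
{
	return (rvi & 0xf0) > (vppr & 0xf0);
}
```

Note that the low nibble of either value never affects the result, e.g. vector 0x4f does not beat a vPPR of 0x40 because both fall in priority class 4.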
* [PATCH 4/5] KVM: nVMX: Use vmcs01's controls shadow to check for IRQ/NMI windows at VM-Enter
2024-11-01 19:14 [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter Sean Christopherson
` (2 preceding siblings ...)
2024-11-01 19:14 ` [PATCH 3/5] KVM: nVMX: Drop manual vmcs01.GUEST_INTERRUPT_STATUS.RVI check at VM-Enter Sean Christopherson
@ 2024-11-01 19:14 ` Sean Christopherson
2024-11-01 19:14 ` [PATCH 5/5] KVM: nVMX: Honor event priority when emulating PI delivery during VM-Enter Sean Christopherson
2024-12-19 2:41 ` [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter Sean Christopherson
5 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2024-11-01 19:14 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Chao Gao
Use vmcs01's execution controls shadow to check for IRQ/NMI windows after
a successful nested VM-Enter, instead of snapshotting the information prior
to emulating VM-Enter. It's quite difficult to see that the entire reason
controls are snapshotted prior to nested VM-Enter is to read them from vmcs01
(vmcs02 is loaded if nested VM-Enter is successful).
That could be solved with a comment, but explicitly using vmcs01's shadow
makes the code self-documenting to a certain extent.
No functional change intended (vmcs01's execution controls must not be
modified during emulation of nested VM-Enter).
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4d20ab647876..0540faef0c85 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3486,7 +3486,6 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
struct vcpu_vmx *vmx = to_vmx(vcpu);
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
enum vm_entry_failure_code entry_failure_code;
- bool evaluate_pending_interrupts;
union vmx_exit_reason exit_reason = {
.basic = EXIT_REASON_INVALID_STATE,
.failed_vmentry = 1,
@@ -3505,9 +3504,6 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
kvm_service_local_tlb_flush_requests(vcpu);
- evaluate_pending_interrupts = exec_controls_get(vmx) &
- (CPU_BASED_INTR_WINDOW_EXITING | CPU_BASED_NMI_WINDOW_EXITING);
-
if (!vmx->nested.nested_run_pending ||
!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
vmx->nested.pre_vmenter_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
@@ -3590,9 +3586,11 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
* Re-evaluate pending events if L1 had a pending IRQ/NMI/INIT/SIPI
* when it executed VMLAUNCH/VMRESUME, as entering non-root mode can
* effectively unblock various events, e.g. INIT/SIPI cause VM-Exit
- * unconditionally.
+ * unconditionally. Take care to pull data from vmcs01 as appropriate,
+ * e.g. when checking for interrupt windows, as vmcs02 is now loaded.
*/
- if (unlikely(evaluate_pending_interrupts) ||
+ if ((__exec_controls_get(&vmx->vmcs01) & (CPU_BASED_INTR_WINDOW_EXITING |
+ CPU_BASED_NMI_WINDOW_EXITING)) ||
kvm_apic_has_pending_init_or_sipi(vcpu) ||
kvm_apic_has_interrupt(vcpu))
kvm_make_request(KVM_REQ_EVENT, vcpu);
--
2.47.0.163.g1226f6d8fa-goog
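
The window check in the new condition boils down to a mask test on vmcs01's primary processor-based execution controls. A minimal sketch, assuming the controls are passed in as a plain 32-bit value rather than read from KVM's controls shadow. The macro and function names below are hypothetical, and the bit positions (2 for interrupt-window exiting, 22 for NMI-window exiting) are taken from my reading of the SDM rather than from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in the primary processor-based controls. */
#define SKETCH_INTR_WINDOW_EXITING	(1u << 2)
#define SKETCH_NMI_WINDOW_EXITING	(1u << 22)

/*
 * Hypothetical helper: returns true if L0 was waiting for an IRQ or NMI
 * window on behalf of L1, i.e. if L1 had a pending IRQ/NMI that was blocked
 * when L1 executed VMLAUNCH/VMRESUME.
 */
static bool l1_has_window_request(uint32_t vmcs01_exec_controls)
{
	return vmcs01_exec_controls &
	       (SKETCH_INTR_WINDOW_EXITING | SKETCH_NMI_WINDOW_EXITING);
}
```

The point of the patch is where the value comes from (vmcs01, even though vmcs02 is loaded), not the mask itself, which is unchanged.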
* [PATCH 5/5] KVM: nVMX: Honor event priority when emulating PI delivery during VM-Enter
2024-11-01 19:14 [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter Sean Christopherson
` (3 preceding siblings ...)
2024-11-01 19:14 ` [PATCH 4/5] KVM: nVMX: Use vmcs01's controls shadow to check for IRQ/NMI windows " Sean Christopherson
@ 2024-11-01 19:14 ` Sean Christopherson
2024-11-12 9:04 ` Chao Gao
2024-12-19 2:41 ` [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter Sean Christopherson
5 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2024-11-01 19:14 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Chao Gao
Move the handling of a nested posted interrupt notification that is
unblocked by nested VM-Enter (unblocks L1 IRQs when ack-on-exit is enabled
by L1) from VM-Enter emulation to vmx_check_nested_events(). To avoid a
pointless forced immediate exit, i.e. to not regress IRQ delivery latency
when a nested posted interrupt is pending at VM-Enter, block processing of
the notification IRQ if and only if KVM must block _all_ events. Unlike
injected events, KVM doesn't need to actually enter L2 before updating the
vIRR and vmcs02.GUEST_INTR_STATUS, as the resulting L2 IRQ will be blocked
by hardware itself, until VM-Enter to L2 completes.
Note, very strictly speaking, moving the IRQ from L2's PIR to IRR before
entering L2 is still technically wrong. But, practically speaking, only a
userspace that is deliberately checking KVM_STATE_NESTED_RUN_PENDING
against PIR and IRR can even notice; L2 will see architecturally correct
behavior, as KVM ensures the VM-Enter is finished before doing anything
that would effectively preempt the PIR=>IRR movement.
Reported-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 53 ++++++++++++++++++++++++++++-----------
1 file changed, 38 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 0540faef0c85..0c6c0aeaddc2 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3725,14 +3725,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
if (unlikely(status != NVMX_VMENTRY_SUCCESS))
goto vmentry_failed;
- /* Emulate processing of posted interrupts on VM-Enter. */
- if (nested_cpu_has_posted_intr(vmcs12) &&
- kvm_apic_has_interrupt(vcpu) == vmx->nested.posted_intr_nv) {
- vmx->nested.pi_pending = true;
- kvm_make_request(KVM_REQ_EVENT, vcpu);
- kvm_apic_clear_irr(vcpu, vmx->nested.posted_intr_nv);
- }
-
/* Hide L1D cache contents from the nested guest. */
vmx->vcpu.arch.l1tf_flush_l1d = true;
@@ -4194,13 +4186,25 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
*/
bool block_nested_exceptions = vmx->nested.nested_run_pending;
/*
- * New events (not exceptions) are only recognized at instruction
+ * Events that don't require injection, i.e. that are virtualized by
+ * hardware, aren't blocked by a pending VM-Enter as KVM doesn't need
+ * to regain control in order to deliver the event, and hardware will
+ * handle event ordering, e.g. with respect to injected exceptions.
+ *
+ * But, new events (not exceptions) are only recognized at instruction
* boundaries. If an event needs reinjection, then KVM is handling a
- * VM-Exit that occurred _during_ instruction execution; new events are
- * blocked until the instruction completes.
+ * VM-Exit that occurred _during_ instruction execution; new events,
+ * irrespective of whether or not they're injected, are blocked until
+ * the instruction completes.
+ */
+ bool block_non_injected_events = kvm_event_needs_reinjection(vcpu);
+ /*
+ * Injected events are blocked by nested VM-Enter, as KVM is responsible
+ * for managing priority between concurrent events, i.e. KVM needs to
+ * wait until after VM-Enter completes to deliver injected events.
*/
bool block_nested_events = block_nested_exceptions ||
- kvm_event_needs_reinjection(vcpu);
+ block_non_injected_events;
if (lapic_in_kernel(vcpu) &&
test_bit(KVM_APIC_INIT, &apic->pending_events)) {
@@ -4312,18 +4316,26 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
if (kvm_cpu_has_interrupt(vcpu) && !vmx_interrupt_blocked(vcpu)) {
int irq;
- if (block_nested_events)
- return -EBUSY;
- if (!nested_exit_on_intr(vcpu))
+ if (!nested_exit_on_intr(vcpu)) {
+ if (block_nested_events)
+ return -EBUSY;
+
goto no_vmexit;
+ }
if (!nested_exit_intr_ack_set(vcpu)) {
+ if (block_nested_events)
+ return -EBUSY;
+
nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
return 0;
}
irq = kvm_cpu_get_extint(vcpu);
if (irq != -1) {
+ if (block_nested_events)
+ return -EBUSY;
+
nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT,
INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR | irq, 0);
return 0;
@@ -4342,11 +4354,22 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
* and enabling posted interrupts requires ACK-on-exit.
*/
if (irq == vmx->nested.posted_intr_nv) {
+ /*
+ * Nested posted interrupts are delivered via RVI, i.e.
+ * aren't injected by KVM, and so can be queued even if
+ * manual event injection is disallowed.
+ */
+ if (block_non_injected_events)
+ return -EBUSY;
+
vmx->nested.pi_pending = true;
kvm_apic_clear_irr(vcpu, irq);
goto no_vmexit;
}
+ if (block_nested_events)
+ return -EBUSY;
+
nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT,
INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR | irq, 0);
--
2.47.0.163.g1226f6d8fa-goog
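
The three blocking flags and how they gate the two kinds of IRQ handling can be modeled outside the kernel. A minimal sketch, assuming the two inputs (nested_run_pending and the result of kvm_event_needs_reinjection()) are supplied as booleans. The struct and function names are hypothetical, and only the flag derivation and the -EBUSY conditions from the diff above are modeled.

```c
#include <stdbool.h>

/* Hypothetical container for the flags computed in the patched function. */
struct block_state {
	bool block_nested_exceptions;	/* a nested VM-Enter is pending */
	bool block_non_injected_events;	/* an event needs reinjection */
	bool block_nested_events;	/* either of the above */
};

static struct block_state compute_block_state(bool nested_run_pending,
					       bool needs_reinjection)
{
	struct block_state s;

	s.block_nested_exceptions = nested_run_pending;
	s.block_non_injected_events = needs_reinjection;
	s.block_nested_events = s.block_nested_exceptions ||
				s.block_non_injected_events;
	return s;
}

/* The PI notification is delivered via RVI, so only reinjection blocks it. */
static bool pi_notification_blocked(struct block_state s)
{
	return s.block_non_injected_events;
}

/* IRQs that require a VM-Exit to L1 are also blocked by a pending VM-Enter. */
static bool irq_vmexit_blocked(struct block_state s)
{
	return s.block_nested_events;
}
```

With nested_run_pending set and no reinjection needed, the notification IRQ can be processed immediately while a synthesized EXTERNAL_INTERRUPT VM-Exit still has to wait, which is the case that avoids the forced immediate exit described in the changelog.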
* Re: [PATCH 5/5] KVM: nVMX: Honor event priority when emulating PI delivery during VM-Enter
2024-11-01 19:14 ` [PATCH 5/5] KVM: nVMX: Honor event priority when emulating PI delivery during VM-Enter Sean Christopherson
@ 2024-11-12 9:04 ` Chao Gao
2024-11-14 15:27 ` Sean Christopherson
0 siblings, 1 reply; 9+ messages in thread
From: Chao Gao @ 2024-11-12 9:04 UTC (permalink / raw)
To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel
On Fri, Nov 01, 2024 at 12:14:47PM -0700, Sean Christopherson wrote:
>Move the handling of a nested posted interrupt notification that is
>unblocked by nested VM-Enter (unblocks L1 IRQs when ack-on-exit is enabled
>by L1) from VM-Enter emulation to vmx_check_nested_events(). To avoid a
>pointless forced immediate exit, i.e. to not regress IRQ delivery latency
>when a nested posted interrupt is pending at VM-Enter, block processing of
>the notification IRQ if and only if KVM must block _all_ events. Unlike
>injected events, KVM doesn't need to actually enter L2 before updating the
>vIRR and vmcs02.GUEST_INTR_STATUS, as the resulting L2 IRQ will be blocked
>by hardware itself, until VM-Enter to L2 completes.
>
>Note, very strictly speaking, moving the IRQ from L2's PIR to IRR before
>entering L2 is still technically wrong. But, practically speaking, only a
>userspace that is deliberately checking KVM_STATE_NESTED_RUN_PENDING
>against PIR and IRR can even notice; L2 will see architecturally correct
>behavior, as KVM ensure the VM-Enter is finished before doing anything
>that would effectively preempt the PIR=>IRR movement.
In my understanding, L1 can notice some priority issue in some cases. e.g.,
L1 enables NMI window VM-exit and enters L2 with a nested posted interrupt
notification. Assuming L2 doesn't block NMIs, then NMI window VM-exit should
happen immediately after nested VM-enter even before the nested posted
interrupt processing.
Another case is the nested VM-enter may inject some events (i.e.,
vmcs12->vm_entry_intr_info_field has a valid event). Event injection has
higher priority over external interrupt VM-exit. The event injection may
encounter EPT_VIOLATION which needs to be reflected to L1. In this case,
L1 is supposed to observe the EPT VIOLATION before the nested posted interrupt
processing.
* Re: [PATCH 5/5] KVM: nVMX: Honor event priority when emulating PI delivery during VM-Enter
2024-11-12 9:04 ` Chao Gao
@ 2024-11-14 15:27 ` Sean Christopherson
0 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2024-11-14 15:27 UTC (permalink / raw)
To: Chao Gao; +Cc: Paolo Bonzini, kvm, linux-kernel
On Tue, Nov 12, 2024, Chao Gao wrote:
> On Fri, Nov 01, 2024 at 12:14:47PM -0700, Sean Christopherson wrote:
> >Move the handling of a nested posted interrupt notification that is
> >unblocked by nested VM-Enter (unblocks L1 IRQs when ack-on-exit is enabled
> >by L1) from VM-Enter emulation to vmx_check_nested_events(). To avoid a
> >pointless forced immediate exit, i.e. to not regress IRQ delivery latency
> >when a nested posted interrupt is pending at VM-Enter, block processing of
> >the notification IRQ if and only if KVM must block _all_ events. Unlike
> >injected events, KVM doesn't need to actually enter L2 before updating the
> >vIRR and vmcs02.GUEST_INTR_STATUS, as the resulting L2 IRQ will be blocked
> >by hardware itself, until VM-Enter to L2 completes.
> >
> >Note, very strictly speaking, moving the IRQ from L2's PIR to IRR before
> >entering L2 is still technically wrong. But, practically speaking, only a
> >userspace that is deliberately checking KVM_STATE_NESTED_RUN_PENDING
> >against PIR and IRR can even notice; L2 will see architecturally correct
> >behavior, as KVM ensure the VM-Enter is finished before doing anything
> >that would effectively preempt the PIR=>IRR movement.
>
> In my understanding, L1 can notice some priority issue in some cases. e.g.,
> L1 enables NMI window VM-exit and enters L2 with a nested posted interrupt
> notification. Assuming L2 doesn't block NMIs, then NMI window VM-exit should
> happen immediately after nested VM-enter even before the nested posted
> interrupt processing.
>
> Another case is the nested VM-enter may inject some events (i.e.,
> vmcs12->vm_entry_intr_info_field has a valid event). Event injection has
> higher priority over external interrupt VM-exit. The event injection may
> encounter EPT_VIOLATION which needs to be reflected to L1. In this case,
> L1 is supposed to observe the EPT VIOLATION before the nested posted interrupt
> processing.
Hmm, right, L1 could also observe the PIR=>IRR movement. How about this?
Note, very strictly speaking, moving the IRQ from L2's PIR to IRR before
entering L2 is still technically wrong. But, practically speaking, only
an L1 hypervisor or an L0 userspace that is deliberately checking event
priority against PIR=>IRR processing can even notice; L2 will see
architecturally correct behavior, as KVM ensures the VM-Enter is finished
before doing anything that would effectively preempt the PIR=>IRR movement.
* Re: [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter
2024-11-01 19:14 [PATCH 0/5] KVM: nVMX: Honor event priority for PI ack at VM-Enter Sean Christopherson
` (4 preceding siblings ...)
2024-11-01 19:14 ` [PATCH 5/5] KVM: nVMX: Honor event priority when emulating PI delivery during VM-Enter Sean Christopherson
@ 2024-12-19 2:41 ` Sean Christopherson
5 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2024-12-19 2:41 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Chao Gao
On Fri, 01 Nov 2024 12:14:42 -0700, Sean Christopherson wrote:
> Rework and cleanup KVM's event handling during nested VM-Enter emulation,
> and ultimately fix a bug where KVM doesn't honor event priority when
> delivering a nested posted interrupt. Specifically, if there is a posted
> interrupt *notification* IRQ in L1's vIRR, the IRQ should not be acked by
> the CPU if a higher priority event is recognized after VM-Enter (which
> unblocks L1 IRQs).
>
> [...]
Applied to kvm-x86 vmx, thanks!
[1/5] KVM: nVMX: Explicitly update vPPR on successful nested VM-Enter
https://github.com/kvm-x86/linux/commit/637df11290b3
[2/5] KVM: nVMX: Check for pending INIT/SIPI after entering non-root mode
https://github.com/kvm-x86/linux/commit/3d0e20e45378
[3/5] KVM: nVMX: Drop manual vmcs01.GUEST_INTERRUPT_STATUS.RVI check at VM-Enter
https://github.com/kvm-x86/linux/commit/2732f6a7ccee
[4/5] KVM: nVMX: Use vmcs01's controls shadow to check for IRQ/NMI windows at VM-Enter
https://github.com/kvm-x86/linux/commit/1a265986bff6
[5/5] KVM: nVMX: Honor event priority when emulating PI delivery during VM-Enter
https://github.com/kvm-x86/linux/commit/ce5cdfb49813
--
https://github.com/kvm-x86/linux/tree/next