* [PATCH] KVM: VMX: Fix NMI event loss
@ 2023-08-28 9:07 Tianyi Liu
2023-08-28 14:04 ` Sean Christopherson
0 siblings, 1 reply; 3+ messages in thread
From: Tianyi Liu @ 2023-08-28 9:07 UTC (permalink / raw)
To: seanjc; +Cc: kvm, linux-kernel, peterz, Tianyi Liu
Hi Sean,

I have found that some PMU events are being lost in the latest version of
the kernel. Bisecting points to commit [1], which moved the handling of NMI
events from `handle_exception_irqoff` to `vmx_vcpu_enter_exit`.

If I revert this part, as done in this patch, it works correctly. However,
I'm not really familiar with KVM, and I'm not sure about the intent behind
the original patch [1]. Could you please take a look at this? Thanks a lot.

My use case is to sample the IP of the guest OS using `perf kvm`:
`perf kvm --guest record -a -g -e instructions -F 10000 -- sleep 1`

If it works correctly, it records about 10000 samples (matching `-F 10000`)
and reports:
`[ perf record: Captured and wrote 0.9 MB perf.data.guest (9729 samples) ]`

If not, it records only ~100 samples, and sometimes no samples at all.

In case it is useful for debugging, the callchain of `vmx_vcpu_enter_exit` is:
    vmx_vcpu_enter_exit
    vmx_vcpu_run
    kvm_x86_vcpu_run
    vcpu_enter_guest

While the callchain of `handle_exception_irqoff` is:

    handle_exception_irqoff
    vmx_handle_exit_irqoff
    kvm_x86_handle_exit_irqoff
    vcpu_enter_guest

[1] https://lore.kernel.org/all/20221213060912.654668-8-seanjc@google.com/
Signed-off-by: Tianyi Liu <i.pear@outlook.com>
---
arch/x86/kvm/vmx/vmx.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index df461f387e20..3a0b13867a6b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6955,6 +6955,12 @@ static void handle_exception_irqoff(struct vcpu_vmx *vmx)
 	/* Handle machine checks before interrupts are enabled */
 	else if (is_machine_check(intr_info))
 		kvm_machine_check();
+	/* We need to handle NMIs before interrupts are enabled */
+	else if (is_nmi(intr_info)) {
+		kvm_before_interrupt(&vmx->vcpu, KVM_HANDLING_NMI);
+		vmx_do_nmi_irqoff();
+		kvm_after_interrupt(&vmx->vcpu);
+	}
 }

 static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
@@ -7251,13 +7257,6 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	else
 		vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);

-	if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
-	    is_nmi(vmx_get_intr_info(vcpu))) {
-		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
-		vmx_do_nmi_irqoff();
-		kvm_after_interrupt(vcpu);
-	}
-
 	guest_state_exit_irqoff();
 }
--
2.41.0
* Re: [PATCH] KVM: VMX: Fix NMI event loss
2023-08-28 9:07 [PATCH] KVM: VMX: Fix NMI event loss Tianyi Liu
@ 2023-08-28 14:04 ` Sean Christopherson
2023-08-28 15:53 ` Tianyi Liu
0 siblings, 1 reply; 3+ messages in thread
From: Sean Christopherson @ 2023-08-28 14:04 UTC (permalink / raw)
To: Tianyi Liu; +Cc: kvm, linux-kernel, peterz
On Mon, Aug 28, 2023, Tianyi Liu wrote:
> Hi, Sean:
>
> I have found that in the latest version of the kernel, some PMU events are
> being lost. I used bisect and found out the breaking commit [1], which
> moved the handling of NMI events from `handle_exception_irqoff` to
> `vmx_vcpu_enter_exit`.
>
> If I revert this part as done in this patch, it works correctly. However,
> I'm not really familiar with KVM, and I'm not sure about the intent behind
> the original patch [1].
FWIW, the goal was to invoke vmx_do_nmi_irqoff() before leaving the "noinstr"
region. I messed up and forgot that vmx_get_intr_info() relied on metadata being
reset after VM-Exit :-/
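
To illustrate the failure mode described above, here is a toy model of a
lazily cached exit field. It is only a sketch under stated assumptions, not
KVM code: every name in it (`toy_vcpu`, `toy_get_intr_info`,
`toy_cache_reset`, the `TOY_INFO_*` values) is hypothetical, and the "later
exit handling" step is assumed to read the field again on every exit. The
point is the ordering: if the NMI check runs before the cache is invalidated,
it can see the value left over from a previous exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy values only; NOT the real VM-Exit interruption-information encoding. */
enum { TOY_INFO_OTHER = 1, TOY_INFO_NMI = 2 };

/* Hypothetical stand-in for per-vCPU state; not the kernel's struct vcpu_vmx. */
struct toy_vcpu {
	uint32_t hw_intr_info;     /* value the "hardware" holds for the latest exit */
	uint32_t cached_intr_info; /* lazily loaded software copy */
	bool cache_valid;          /* models a "register is available" bit */
};

/* Lazy read: consult the "hardware" only when the cached copy is invalid. */
static uint32_t toy_get_intr_info(struct toy_vcpu *v)
{
	if (!v->cache_valid) {
		v->cached_intr_info = v->hw_intr_info;
		v->cache_valid = true;
	}
	return v->cached_intr_info;
}

static void toy_cache_reset(struct toy_vcpu *v)
{
	v->cache_valid = false;
}

/* Problematic ordering: the NMI check runs before the cache is invalidated. */
static bool toy_exit_reset_late(struct toy_vcpu *v, uint32_t info)
{
	bool nmi;

	v->hw_intr_info = info;                     /* a new exit occurs */
	nmi = toy_get_intr_info(v) == TOY_INFO_NMI; /* may return the previous exit's value */
	toy_cache_reset(v);                         /* too late for the check above */
	(void)toy_get_intr_info(v);                 /* assumed later handling re-caches the field */
	return nmi;
}

/* Intended ordering: invalidate first, then check. */
static bool toy_exit_reset_first(struct toy_vcpu *v, uint32_t info)
{
	bool nmi;

	v->hw_intr_info = info;                     /* a new exit occurs */
	toy_cache_reset(v);                         /* drop the stale copy */
	nmi = toy_get_intr_info(v) == TOY_INFO_NMI; /* reads the current exit's value */
	(void)toy_get_intr_info(v);                 /* later handling reuses the fresh copy */
	return nmi;
}

/* Usage: feed the same exit sequence (other, NMI, other) to both orderings. */
static void toy_self_check(void)
{
	struct toy_vcpu late = {0}, first = {0};

	assert(!toy_exit_reset_late(&late, TOY_INFO_OTHER));
	assert(!toy_exit_reset_late(&late, TOY_INFO_NMI));   /* NMI is missed */
	assert(toy_exit_reset_late(&late, TOY_INFO_OTHER));  /* spurious NMI */

	assert(!toy_exit_reset_first(&first, TOY_INFO_OTHER));
	assert(toy_exit_reset_first(&first, TOY_INFO_NMI));
	assert(!toy_exit_reset_first(&first, TOY_INFO_OTHER));
}
```

In this model the late-reset ordering both misses the real NMI and reports
one for the following, unrelated exit. That would be consistent with PMU NMIs
going missing during `perf kvm` sampling, though the toy says nothing about
how often this happens in practice.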
> Could you please take a look on this? Thanks a lot.
Please try this patch, it should fix the problem but I haven't fully tested it
against an affected workload yet. I'll do that later today.
https://lore.kernel.org/all/20230825014532.2846714-1-seanjc@google.com
* Re: [PATCH] KVM: VMX: Fix NMI event loss
2023-08-28 14:04 ` Sean Christopherson
@ 2023-08-28 15:53 ` Tianyi Liu
0 siblings, 0 replies; 3+ messages in thread
From: Tianyi Liu @ 2023-08-28 15:53 UTC (permalink / raw)
To: seanjc; +Cc: i.pear, kvm, linux-kernel, peterz
On Mon, Aug 28, 2023, Sean Christopherson wrote:
> On Mon, Aug 28, 2023, Tianyi Liu wrote:
> > Hi, Sean:
> >
> > I have found that in the latest version of the kernel, some PMU events are
> > being lost. I used bisect and found out the breaking commit [1], which
> > moved the handling of NMI events from `handle_exception_irqoff` to
> > `vmx_vcpu_enter_exit`.
> >
> > If I revert this part as done in this patch, it works correctly. However,
> > I'm not really familiar with KVM, and I'm not sure about the intent behind
> > the original patch [1].
>
> FWIW, the goal was to invoke vmx_do_nmi_irqoff() before leaving the "noinstr"
> region. I messed up and forgot that vmx_get_intr_info() relied on metadata being
> reset after VM-Exit :-/
>
> > Could you please take a look on this? Thanks a lot.
>
> Please try this patch, it should fix the problem but I haven't fully tested it
> against an affected workload yet. I'll do that later today.
>
> https://lore.kernel.org/all/20230825014532.2846714-1-seanjc@google.com
This works for me, thanks.