From: Marcelo Tosatti <mtosatti@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
Alex Williamson <alex.williamson@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Pei Zhang <pezhang@redhat.com>
Subject: [patch 3/3 V2] KVM: VMX: update vcpu posted-interrupt descriptor when assigning device
Date: Wed, 26 May 2021 14:20:14 -0300
Message-ID: <20210526172014.GA29007@fuller.cnet>
In-Reply-To: <YK1WHWsA7XuwTQR3@t490s>

For VMX, when a vcpu enters HLT emulation, pi_pre_block will:

1) Add vcpu to per-cpu list of blocked vcpus.
2) Program the posted-interrupt descriptor "notification vector"
   to POSTED_INTR_WAKEUP_VECTOR.

With interrupt remapping, an interrupt will set the PIR bit for the
vector programmed for the device on the CPU, test-and-set the
ON bit on the posted interrupt descriptor, and, if the ON bit was clear,
generate an interrupt for the notification vector.

This way, the target CPU wakes upon a device interrupt and wakes up
the target vcpu.
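
The posting sequence above can be modeled in a few lines of userspace C.
This is only a sketch: struct pi_desc_model and post_interrupt() are
hypothetical names, not the kernel's struct pi_desc or any real API, and
the model ignores the suppress-notification bit.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a posted-interrupt descriptor. */
struct pi_desc_model {
	_Atomic uint64_t pir[4];	/* 256 posted-interrupt request bits */
	atomic_bool on;			/* outstanding notification */
	uint8_t nv;			/* notification vector */
};

/*
 * Model of posting a remapped device interrupt.
 * Returns the notification vector raised on the target CPU, or -1 if
 * no notification is generated because ON was already set.
 */
static int post_interrupt(struct pi_desc_model *pi, uint8_t vector)
{
	/* Record the request in PIR. */
	atomic_fetch_or(&pi->pir[vector / 64], UINT64_C(1) << (vector % 64));

	/* Test-and-set ON: only the 0 -> 1 transition notifies. */
	if (atomic_exchange(&pi->on, true))
		return -1;

	return pi->nv;
}
```

Whether the notification wakes a halted vcpu then depends entirely on
what nv holds at the time ON flips from 0 to 1.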

The problem is that pi_pre_block only programs the notification vector
if kvm_arch_has_assigned_device() is true. It's possible for the
following to happen:

1) vcpu V HLTs on pcpu P, kvm_arch_has_assigned_device is false,
   notification vector is not programmed
2) device is assigned to VM
3) device interrupts vcpu V, sets ON bit
   (notification vector not programmed, so pcpu P remains in idle)
4) vcpu 0 IPIs vcpu V (in guest), but since pi descriptor ON bit is set,
   kvm_vcpu_kick is skipped
5) vcpu 0 busy spins on vcpu V's response for several seconds, until
   RCU watchdog NMIs all vCPUs.

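
The sequence above can be replayed with a toy state machine. Everything
here is an assumption made for illustration: struct vcpu_model, the
model_* helpers and the two vector labels are hypothetical and only
mirror the steps listed, not the actual KVM code paths.

```c
#include <stdbool.h>

/* Arbitrary labels standing in for the two notification vectors. */
enum { MODEL_POSTED_VECTOR = 1, MODEL_WAKEUP_VECTOR = 2 };

struct vcpu_model {
	bool blocked;
	bool on;	/* PI descriptor ON bit */
	int nv;		/* PI descriptor notification vector */
};

/* Step 1: vcpu halts; NV is switched only if a device is already assigned. */
static void model_pre_block(struct vcpu_model *v, bool has_assigned_device)
{
	v->blocked = true;
	v->nv = has_assigned_device ? MODEL_WAKEUP_VECTOR : MODEL_POSTED_VECTOR;
}

/* Step 3: device interrupt is posted. Returns true if the vcpu is running. */
static bool model_device_interrupt(struct vcpu_model *v)
{
	bool was_on = v->on;

	v->on = true;
	/* Only a notification on the wakeup vector unblocks the vcpu. */
	if (!was_on && v->nv == MODEL_WAKEUP_VECTOR)
		v->blocked = false;
	return !v->blocked;
}

/* Step 4: the IPI sender skips the kick when ON is already set. */
static bool model_send_ipi(struct vcpu_model *v)
{
	if (!v->on) {
		v->on = true;
		v->blocked = false;	/* stands in for the kick */
	}
	return !v->blocked;
}
```

With has_assigned_device false at block time, both the device interrupt
and the later IPI leave the model vcpu blocked, which is the hang
described in steps 3 to 5.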
To fix this, use the start_assignment kvm_x86_ops callback to kick
vcpus out of the halt loop, so the notification vector is
properly reprogrammed to the wakeup vector.
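
As a rough userspace illustration of the idea (not the kernel code), the
fix amounts to setting a request bit on every vcpu and having the halt
loop test and clear it. MODEL_REQ_UNBLOCK, struct vcpu_req_model and
both helpers below are hypothetical names used only for this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_REQ_UNBLOCK (1u << 0)	/* hypothetical request bit */

struct vcpu_req_model {
	atomic_uint requests;
};

/* Assignment path: flag every vcpu so blocked ones leave the halt loop. */
static void model_start_assignment(struct vcpu_req_model *vcpus, int n)
{
	for (int i = 0; i < n; i++)
		atomic_fetch_or(&vcpus[i].requests, MODEL_REQ_UNBLOCK);
}

/* Halt-loop check: consume the request and report whether to stop blocking. */
static bool model_should_unblock(struct vcpu_req_model *v)
{
	unsigned int old = atomic_fetch_and(&v->requests, ~MODEL_REQ_UNBLOCK);

	return old & MODEL_REQ_UNBLOCK;
}
```

A vcpu that leaves the halt loop this way goes back through the blocking
path, where the assigned-device check now passes and the notification
vector is set to the wakeup vector.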
Reported-by: Pei Zhang <pezhang@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
For build error:
Reported-by: kernel test robot <lkp@intel.com>
---
v2: Add brief comment to vmx_pi_start_assignment (Peter Xu).
Export kvm_make_all_cpus_request (kernel test robot).
Index: linux-2.6/arch/x86/kvm/vmx/posted_intr.c
===================================================================
--- linux-2.6.orig/arch/x86/kvm/vmx/posted_intr.c
+++ linux-2.6/arch/x86/kvm/vmx/posted_intr.c
@@ -238,6 +238,20 @@ bool pi_has_pending_interrupt(struct kvm
/*
+ * Bail out of the block loop if the VM has an assigned
+ * device, but the blocking vCPU didn't reconfigure the
+ * PI.NV to the wakeup vector, i.e. the assigned device
+ * came along after the initial check in pi_pre_block().
+ */
+void vmx_pi_start_assignment(struct kvm *kvm)
+{
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return;
+
+	kvm_make_all_cpus_request(kvm, KVM_REQ_UNBLOCK);
+}
+
+/*
* pi_update_irte - set IRTE for Posted-Interrupts
*
* @kvm: kvm
Index: linux-2.6/arch/x86/kvm/vmx/posted_intr.h
===================================================================
--- linux-2.6.orig/arch/x86/kvm/vmx/posted_intr.h
+++ linux-2.6/arch/x86/kvm/vmx/posted_intr.h
@@ -95,5 +95,6 @@ void __init pi_init_cpu(int cpu);
bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu);
int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
bool set);
+void vmx_pi_start_assignment(struct kvm *kvm);
#endif /* __KVM_X86_VMX_POSTED_INTR_H */
Index: linux-2.6/arch/x86/kvm/vmx/vmx.c
===================================================================
--- linux-2.6.orig/arch/x86/kvm/vmx/vmx.c
+++ linux-2.6/arch/x86/kvm/vmx/vmx.c
@@ -7732,6 +7732,7 @@ static struct kvm_x86_ops vmx_x86_ops __
.nested_ops = &vmx_nested_ops,
.update_pi_irte = pi_update_irte,
+	.start_assignment = vmx_pi_start_assignment,
#ifdef CONFIG_X86_64
.set_hv_timer = vmx_set_hv_timer,
Index: linux-2.6/virt/kvm/kvm_main.c
===================================================================
--- linux-2.6.orig/virt/kvm/kvm_main.c
+++ linux-2.6/virt/kvm/kvm_main.c
@@ -307,6 +307,7 @@ bool kvm_make_all_cpus_request(struct kv
{
return kvm_make_all_cpus_request_except(kvm, req, NULL);
}
+EXPORT_SYMBOL_GPL(kvm_make_all_cpus_request);
#ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
void kvm_flush_remote_tlbs(struct kvm *kvm)