From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kiszka Subject: Re: [PATCH] KVM: nVMX: Fix direct injection of interrupts from L0 to L2 Date: Sun, 17 Feb 2013 16:31:26 +0100 Message-ID: <5120F7CE.6050905@web.de> References: <511FBD76.8010307@web.de> <20130217150721.GU9817@redhat.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="----enig2KPMCIRIPEFCRDLSOBLJO" Cc: Marcelo Tosatti , kvm , Nadav Har'El , "Nakajima, Jun" To: Gleb Natapov Return-path: Received: from mout.web.de ([212.227.17.12]:62410 "EHLO mout.web.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755572Ab3BQPbn (ORCPT ); Sun, 17 Feb 2013 10:31:43 -0500 In-Reply-To: <20130217150721.GU9817@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: This is an OpenPGP/MIME signed message (RFC 4880 and 3156) ------enig2KPMCIRIPEFCRDLSOBLJO Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 2013-02-17 16:07, Gleb Natapov wrote: > On Sat, Feb 16, 2013 at 06:10:14PM +0100, Jan Kiszka wrote: >> From: Jan Kiszka >> >> If L1 does not set PIN_BASED_EXT_INTR_MASK, we incorrectly skipped >> vmx_complete_interrupts on L2 exits. This is required because, with >> direct interrupt injection from L0 to L2, L0 has to update its pending= >> events. >> >> Also, we need to allow vmx_cancel_injection when entering L2 in we lef= t >> to L0. This condition is indirectly derived from the absence of valid >> vectoring info in vmcs12. We no explicitly clear it if we find out tha= t >> the L2 exit is not targeting L1 but L0. >> > We really need to overhaul how interrupt injection is emulated in neste= d > VMX. Why not put pending events into event queue instead of > get_vmcs12(vcpu)->idt_vectoring_info_field and inject them in usual way= =2E I was thinking about the same step but felt unsure so far if vmx_complete_interrupts & Co. do not include any assumptions about the vmcs configuration that won't match what L1 does. So I went for a different path first, specifically to avoid impact on these hairy bits for non-nested mode. >=20 >> Signed-off-by: Jan Kiszka >> --- >> arch/x86/kvm/vmx.c | 43 +++++++++++++++++++++++++++----------------= >> 1 files changed, 27 insertions(+), 16 deletions(-) >> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >> index 68a045ae..464b6a5 100644 >> --- a/arch/x86/kvm/vmx.c >> +++ b/arch/x86/kvm/vmx.c >> @@ -624,6 +624,7 @@ static void vmx_get_segment(struct kvm_vcpu *vcpu,= >> struct kvm_segment *var, int seg); >> static bool guest_state_valid(struct kvm_vcpu *vcpu); >> static u32 vmx_segment_access_rights(struct kvm_segment *var); >> +static void vmx_complete_interrupts(struct vcpu_vmx *vmx); >> =20 >> static DEFINE_PER_CPU(struct vmcs *, vmxarea); >> static DEFINE_PER_CPU(struct vmcs *, current_vmcs); >> @@ -6213,9 +6214,19 @@ static int vmx_handle_exit(struct kvm_vcpu *vcp= u) >> else >> vmx->nested.nested_run_pending =3D 0; >> =20 >> - if (is_guest_mode(vcpu) && nested_vmx_exit_handled(vcpu)) { >> - nested_vmx_vmexit(vcpu); >> - return 1; >> + if (is_guest_mode(vcpu)) { >> + if (nested_vmx_exit_handled(vcpu)) { >> + nested_vmx_vmexit(vcpu); >> + return 1; >> + } >> + /* >> + * Now it's clear, we are leaving to L0. Perform the postponed >> + * interrupt completion and clear L1's vectoring info field so >> + * that we do not overwrite what L0 wants to inject on >> + * re-entry. >> + */ >> + vmx_complete_interrupts(vmx); >> + get_vmcs12(vcpu)->idt_vectoring_info_field =3D 0; >> } >> =20 >> if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) { >> @@ -6495,8 +6506,6 @@ static void __vmx_complete_interrupts(struct vcp= u_vmx *vmx, >> =20 >> static void vmx_complete_interrupts(struct vcpu_vmx *vmx) >> { >> - if (is_guest_mode(&vmx->vcpu)) >> - return; >> __vmx_complete_interrupts(vmx, vmx->idt_vectoring_info, >> VM_EXIT_INSTRUCTION_LEN, >> IDT_VECTORING_ERROR_CODE); >> @@ -6504,7 +6513,9 @@ static void vmx_complete_interrupts(struct vcpu_= vmx *vmx) >> =20 >> static void vmx_cancel_injection(struct kvm_vcpu *vcpu) >> { >> - if (is_guest_mode(vcpu)) >> + if (is_guest_mode(vcpu) && >> + get_vmcs12(vcpu)->idt_vectoring_info_field & >> + VECTORING_INFO_VALID_MASK) > Why skip cancel_injection at all? As far as I see we can lose injected > irq if we do. Consider: >=20 > io thread vcpu in nested mode > set irr 200 > clear irr 200 set isr 200 > set 200 in VM_ENTRY_INTR_INFO= _FIELD > set irr 250 > set KVM_REQ_EVENT > if (KVM_REQ_EVENT) > vmx_cancel_injection(= ) <- does nothing No, it does cancel as vmcs12's idt_vectoring_info_field signals an invalid state then. Only if we left L2 with valid vectoring info and are about to reenter, we skip the cancellation - but in that case we didn't inject anything from L0 previously anyway. Jan >=20 > clear irr 250 set isr 250 > set 250 in VM_ENTRY_INTR_INFO= _FIELD > vmentry >=20 > So now APIC state is bogus. isr bit 200 is set but vector 200 was never= > injected and actually is lost forever. Next EOI will clear isr 250 and > isr 200 will block all lower level interrupt forever. >=20 > -- > Gleb. >=20 ------enig2KPMCIRIPEFCRDLSOBLJO Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlEg99IACgkQitSsb3rl5xThoACeLzd5/XbyvUjb04OwXxbyjjpI hTwAn08qkC7DyclzlcfwjDUTUJDWGggM =CCOs -----END PGP SIGNATURE----- ------enig2KPMCIRIPEFCRDLSOBLJO--