From: Avi Kivity <avi@redhat.com>
To: "Nadav Har'El" <nyh@il.ibm.com>
Cc: kvm@vger.kernel.org, gleb@redhat.com
Subject: Re: [PATCH 22/30] nVMX: Correct handling of interrupt injection
Date: Mon, 09 May 2011 13:57:11 +0300
Message-ID: <4DC7C887.5060402@redhat.com>
In-Reply-To: <201105080826.p488QUXA018307@rice.haifa.ibm.com>
On 05/08/2011 11:26 AM, Nadav Har'El wrote:
> When KVM wants to inject an interrupt, the guest should think a real interrupt
> has happened. Normally (in the non-nested case) this means checking that the
> guest doesn't block interrupts (and if it does, inject when it doesn't - using
> the "interrupt window" VMX mechanism), and setting up the appropriate VMCS
> fields for the guest to receive the interrupt.
>
> However, when we are running a nested guest (L2) and its hypervisor (L1)
> requested exits on interrupts (as most hypervisors do), the most efficient
> thing to do is to exit L2, telling L1 that the exit was caused by an
> interrupt, the one we were injecting; Only when L1 asked not to be notified
> of interrupts, we should inject directly to the running L2 guest (i.e.,
> the normal code path).
>
> However, properly doing what is described above requires invasive changes to
> the flow of the existing code, which we elected not to do in this stage.
> Instead we do something more simplistic and less efficient: we modify
> vmx_interrupt_allowed(), which kvm calls to see if it can inject the interrupt
> now, to exit from L2 to L1 before continuing the normal code. The normal kvm
> code then notices that L1 is blocking interrupts, and sets the interrupt
> window to inject the interrupt later to L1. Shortly after, L1 gets the
> interrupt while it is itself running, not as an exit from L2. The cost is an
> extra L1 exit (the interrupt window).
>
> Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
> ---
> arch/x86/kvm/vmx.c | 35 +++++++++++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> --- .before/arch/x86/kvm/vmx.c 2011-05-08 10:43:20.000000000 +0300
> +++ .after/arch/x86/kvm/vmx.c 2011-05-08 10:43:20.000000000 +0300
> @@ -3675,9 +3675,25 @@ out:
> return ret;
> }
>
> +/*
> + * In nested virtualization, check if L1 asked to exit on external interrupts.
> + * For most existing hypervisors, this will always return true.
> + */
> +static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
> +{
> + return get_vmcs12(vcpu)->pin_based_vm_exec_control &
> + PIN_BASED_EXT_INTR_MASK;
> +}
> +
> static void enable_irq_window(struct kvm_vcpu *vcpu)
> {
> u32 cpu_based_vm_exec_control;
> + if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
> + /* We can get here when nested_run_pending caused
> + * vmx_interrupt_allowed() to return false. In this case, do
> + * nothing - the interrupt will be injected later.
> + */
> + return;

Why not do (or schedule) the nested vmexit here? It's more natural than
in vmx_interrupt_allowed(), which from its name you'd expect to only read
stuff.

I guess it can live for now if there's some unexpected complexity there.
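
To make the suggestion concrete, here is a rough userspace sketch of that
split. It is a toy model, not kernel code: struct toy_vcpu, its fields and
the toy_* functions are hypothetical stand-ins that only loosely mirror the
flags used in the patch. The query stays read-only, and the state change
(leaving L2) happens when the interrupt window is requested.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vcpu state touched by the patch. */
struct toy_vcpu {
	bool guest_mode;         /* currently running L2 */
	bool l1_exits_on_intr;   /* L1 asked for exits on external interrupts */
	bool nested_run_pending; /* L2 entry requested but not yet performed */
	bool l1_irqs_enabled;    /* L1 is not blocking interrupts */
	bool l2_irqs_enabled;    /* L2 is not blocking interrupts */
	bool irq_window_armed;   /* interrupt-window exiting requested */
	int nested_exits;        /* number of emulated L2->L1 exits */
};

/* Emulated L2->L1 exit: afterwards the vcpu counts as running L1. */
static void toy_nested_vmexit(struct toy_vcpu *v)
{
	v->guest_mode = false;
	v->nested_exits++;
}

/* Pure query: says whether injection is possible now, modifies nothing. */
static bool toy_interrupt_allowed(const struct toy_vcpu *v)
{
	if (v->guest_mode && v->l1_exits_on_intr)
		return false; /* the interrupt is for L1, which is not running */
	return v->guest_mode ? v->l2_irqs_enabled : v->l1_irqs_enabled;
}

/* The side effect lives here: leave L2 if needed, then arm the window. */
static void toy_enable_irq_window(struct toy_vcpu *v)
{
	if (v->guest_mode && v->l1_exits_on_intr) {
		if (v->nested_run_pending)
			return; /* cannot exit yet; retry after the entry */
		toy_nested_vmexit(v);
	}
	v->irq_window_armed = true;
}
```

In this model a caller that only asks toy_interrupt_allowed() never changes
which guest level is running; the switch to L1 happens in exactly one place.
Whether the real code can be arranged this way depends on details the sketch
ignores.
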
>
> cpu_based_vm_exec_control = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);
> cpu_based_vm_exec_control |= CPU_BASED_VIRTUAL_INTR_PENDING;
> @@ -3800,6 +3816,13 @@ static void vmx_set_nmi_mask(struct kvm_
>
> static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
> {
> + if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
> + if (to_vmx(vcpu)->nested.nested_run_pending)
> + return 0;
> + nested_vmx_vmexit(vcpu, true);
> + /* fall through to normal code, but now in L1, not L2 */
> + }
> +
> return (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
> !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
> (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
> @@ -5463,6 +5486,14 @@ static int vmx_handle_exit(struct kvm_vc
> if (vmx->emulation_required && emulate_invalid_guest_state)
> return handle_invalid_guest_state(vcpu);
>
> + /*
> + * the KVM_REQ_EVENT optimization bit is only on for one entry, and if
> + * we did not inject a still-pending event to L1 now because of
> + * nested_run_pending, we need to re-enable this bit.
> + */
> + if (vmx->nested.nested_run_pending)
> + kvm_make_request(KVM_REQ_EVENT, vcpu);
> +
> if (exit_reason == EXIT_REASON_VMLAUNCH ||
> exit_reason == EXIT_REASON_VMRESUME)
> vmx->nested.nested_run_pending = 1;
> @@ -5660,6 +5691,8 @@ static void __vmx_complete_interrupts(st
>
> static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
> {
> + if (is_guest_mode(&vmx->vcpu))
> + return;
> __vmx_complete_interrupts(vmx, vmx->idt_vectoring_info,
> VM_EXIT_INSTRUCTION_LEN,
> IDT_VECTORING_ERROR_CODE);
> @@ -5667,6 +5700,8 @@ static void vmx_complete_interrupts(stru
>
> static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
> {
> + if (is_guest_mode(vcpu))
> + return;
Hmm. What if L0 injected something into L2?
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.