From: Avi Kivity <avi@redhat.com>
To: "Nadav Har'El" <nyh@il.ibm.com>
Cc: kvm@vger.kernel.org, gleb@redhat.com
Subject: Re: [PATCH 22/30] nVMX: Correct handling of interrupt injection
Date: Mon, 09 May 2011 13:57:11 +0300 [thread overview]
Message-ID: <4DC7C887.5060402@redhat.com> (raw)
In-Reply-To: <201105080826.p488QUXA018307@rice.haifa.ibm.com>
On 05/08/2011 11:26 AM, Nadav Har'El wrote:
> When KVM wants to inject an interrupt, the guest should think a real interrupt
> has happened. Normally (in the non-nested case) this means checking that the
> guest doesn't block interrupts (and if it does, inject when it doesn't - using
> the "interrupt window" VMX mechanism), and setting up the appropriate VMCS
> fields for the guest to receive the interrupt.
>
> However, when we are running a nested guest (L2) and its hypervisor (L1)
> requested exits on interrupts (as most hypervisors do), the most efficient
> thing to do is to exit L2, telling L1 that the exit was caused by an
> interrupt, the one we were injecting; Only when L1 asked not to be notified
> of interrupts, we should inject directly to the running L2 guest (i.e.,
> the normal code path).
>
> However, properly doing what is described above requires invasive changes to
> the flow of the existing code, which we elected not to do in this stage.
> Instead we do something more simplistic and less efficient: we modify
> vmx_interrupt_allowed(), which kvm calls to see if it can inject the interrupt
> now, to exit from L2 to L1 before continuing the normal code. The normal kvm
> code then notices that L1 is blocking interrupts, and sets the interrupt
> window to inject the interrupt later to L1. Shortly after, L1 gets the
> interrupt while it is itself running, not as an exit from L2. The cost is an
> extra L1 exit (the interrupt window).
>
> Signed-off-by: Nadav Har'El<nyh@il.ibm.com>
> ---
> arch/x86/kvm/vmx.c | 35 +++++++++++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> --- .before/arch/x86/kvm/vmx.c 2011-05-08 10:43:20.000000000 +0300
> +++ .after/arch/x86/kvm/vmx.c 2011-05-08 10:43:20.000000000 +0300
> @@ -3675,9 +3675,25 @@ out:
> return ret;
> }
>
> +/*
> + * In nested virtualization, check if L1 asked to exit on external interrupts.
> + * For most existing hypervisors, this will always return true.
> + */
> +static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
> +{
> + return get_vmcs12(vcpu)->pin_based_vm_exec_control&
> + PIN_BASED_EXT_INTR_MASK;
> +}
> +
> static void enable_irq_window(struct kvm_vcpu *vcpu)
> {
> u32 cpu_based_vm_exec_control;
> + if (is_guest_mode(vcpu)&& nested_exit_on_intr(vcpu))
> + /* We can get here when nested_run_pending caused
> + * vmx_interrupt_allowed() to return false. In this case, do
> + * nothing - the interrupt will be injected later.
> + */
> + return;
Why not do (or schedule) the nested vmexit here? It's more natural than
in vmx_interrupt_allowed() which from its name you'd expect to only read
stuff.
I guess it can live for now if there's some unexpected complexity there.
>
> cpu_based_vm_exec_control = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);
> cpu_based_vm_exec_control |= CPU_BASED_VIRTUAL_INTR_PENDING;
> @@ -3800,6 +3816,13 @@ static void vmx_set_nmi_mask(struct kvm_
>
> static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
> {
> + if (is_guest_mode(vcpu)&& nested_exit_on_intr(vcpu)) {
> + if (to_vmx(vcpu)->nested.nested_run_pending)
> + return 0;
> + nested_vmx_vmexit(vcpu, true);
> + /* fall through to normal code, but now in L1, not L2 */
> + }
> +
> return (vmcs_readl(GUEST_RFLAGS)& X86_EFLAGS_IF)&&
> !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO)&
> (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
> @@ -5463,6 +5486,14 @@ static int vmx_handle_exit(struct kvm_vc
> if (vmx->emulation_required&& emulate_invalid_guest_state)
> return handle_invalid_guest_state(vcpu);
>
> + /*
> + * the KVM_REQ_EVENT optimization bit is only on for one entry, and if
> + * we did not inject a still-pending event to L1 now because of
> + * nested_run_pending, we need to re-enable this bit.
> + */
> + if (vmx->nested.nested_run_pending)
> + kvm_make_request(KVM_REQ_EVENT, vcpu);
> +
> if (exit_reason == EXIT_REASON_VMLAUNCH ||
> exit_reason == EXIT_REASON_VMRESUME)
> vmx->nested.nested_run_pending = 1;
> @@ -5660,6 +5691,8 @@ static void __vmx_complete_interrupts(st
>
> static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
> {
> + if (is_guest_mode(&vmx->vcpu))
> + return;
> __vmx_complete_interrupts(vmx, vmx->idt_vectoring_info,
> VM_EXIT_INSTRUCTION_LEN,
> IDT_VECTORING_ERROR_CODE);
> @@ -5667,6 +5700,8 @@ static void vmx_complete_interrupts(stru
>
> static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
> {
> + if (is_guest_mode(vcpu))
> + return;
Hmm. What if L0 injected something into L2?
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
next prev parent reply other threads:[~2011-05-09 10:57 UTC|newest]
Thread overview: 83+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-05-08 8:15 [PATCH 0/30] nVMX: Nested VMX, v9 Nadav Har'El
2011-05-08 8:15 ` [PATCH 01/30] nVMX: Add "nested" module option to kvm_intel Nadav Har'El
2011-05-08 8:16 ` [PATCH 02/30] nVMX: Implement VMXON and VMXOFF Nadav Har'El
2011-05-08 8:16 ` [PATCH 03/30] nVMX: Allow setting the VMXE bit in CR4 Nadav Har'El
2011-05-08 8:17 ` [PATCH 04/30] nVMX: Introduce vmcs12: a VMCS structure for L1 Nadav Har'El
2011-05-08 8:17 ` [PATCH 05/30] nVMX: Implement reading and writing of VMX MSRs Nadav Har'El
2011-05-08 8:18 ` [PATCH 06/30] nVMX: Decoding memory operands of VMX instructions Nadav Har'El
2011-05-09 9:47 ` Avi Kivity
2011-05-08 8:18 ` [PATCH 07/30] nVMX: Introduce vmcs02: VMCS used to run L2 Nadav Har'El
2011-05-16 15:30 ` Marcelo Tosatti
2011-05-16 18:32 ` Nadav Har'El
2011-05-17 13:20 ` Marcelo Tosatti
2011-05-08 8:19 ` [PATCH 08/30] nVMX: Fix local_vcpus_link handling Nadav Har'El
2011-05-08 8:19 ` [PATCH 09/30] nVMX: Add VMCS fields to the vmcs12 Nadav Har'El
2011-05-08 8:20 ` [PATCH 10/30] nVMX: Success/failure of VMX instructions Nadav Har'El
2011-05-08 8:20 ` [PATCH 11/30] nVMX: Implement VMCLEAR Nadav Har'El
2011-05-08 8:21 ` [PATCH 12/30] nVMX: Implement VMPTRLD Nadav Har'El
2011-05-16 14:34 ` Marcelo Tosatti
2011-05-16 18:58 ` Nadav Har'El
2011-05-16 19:09 ` Nadav Har'El
2011-05-08 8:21 ` [PATCH 13/30] nVMX: Implement VMPTRST Nadav Har'El
2011-05-08 8:22 ` [PATCH 14/30] nVMX: Implement VMREAD and VMWRITE Nadav Har'El
2011-05-08 8:22 ` [PATCH 15/30] nVMX: Move host-state field setup to a function Nadav Har'El
2011-05-09 9:56 ` Avi Kivity
2011-05-09 10:40 ` Nadav Har'El
2011-05-08 8:23 ` [PATCH 16/30] nVMX: Move control field setup to functions Nadav Har'El
2011-05-08 8:23 ` [PATCH 17/30] nVMX: Prepare vmcs02 from vmcs01 and vmcs12 Nadav Har'El
2011-05-09 10:12 ` Avi Kivity
2011-05-09 10:27 ` Nadav Har'El
2011-05-09 10:45 ` Avi Kivity
2011-05-08 8:24 ` [PATCH 18/30] nVMX: Implement VMLAUNCH and VMRESUME Nadav Har'El
2011-05-08 8:24 ` [PATCH 19/30] nVMX: No need for handle_vmx_insn function any more Nadav Har'El
2011-05-08 8:25 ` [PATCH 20/30] nVMX: Exiting from L2 to L1 Nadav Har'El
2011-05-09 10:45 ` Avi Kivity
2011-05-08 8:25 ` [PATCH 21/30] nVMX: Deciding if L0 or L1 should handle an L2 exit Nadav Har'El
2011-05-08 8:26 ` [PATCH 22/30] nVMX: Correct handling of interrupt injection Nadav Har'El
2011-05-09 10:57 ` Avi Kivity [this message]
2011-05-08 8:27 ` [PATCH 23/30] nVMX: Correct handling of exception injection Nadav Har'El
2011-05-08 8:27 ` [PATCH 24/30] nVMX: Correct handling of idt vectoring info Nadav Har'El
2011-05-09 11:04 ` Avi Kivity
2011-05-08 8:28 ` [PATCH 25/30] nVMX: Handling of CR0 and CR4 modifying instructions Nadav Har'El
2011-05-08 8:28 ` [PATCH 26/30] nVMX: Further fixes for lazy FPU loading Nadav Har'El
2011-05-08 8:29 ` [PATCH 27/30] nVMX: Additional TSC-offset handling Nadav Har'El
2011-05-09 17:27 ` Zachary Amsden
2011-05-08 8:29 ` [PATCH 28/30] nVMX: Add VMX to list of supported cpuid features Nadav Har'El
2011-05-08 8:30 ` [PATCH 29/30] nVMX: Miscellenous small corrections Nadav Har'El
2011-05-08 8:30 ` [PATCH 30/30] nVMX: Documentation Nadav Har'El
2011-05-09 11:18 ` [PATCH 0/30] nVMX: Nested VMX, v9 Avi Kivity
2011-05-09 11:37 ` Nadav Har'El
2011-05-11 8:20 ` Gleb Natapov
2011-05-12 15:42 ` Nadav Har'El
2011-05-12 15:57 ` Gleb Natapov
2011-05-12 16:08 ` Avi Kivity
2011-05-12 16:14 ` Gleb Natapov
2011-05-12 16:31 ` Nadav Har'El
2011-05-12 16:51 ` Gleb Natapov
2011-05-12 17:00 ` Avi Kivity
2011-05-15 23:11 ` Nadav Har'El
2011-05-16 6:38 ` Gleb Natapov
2011-05-16 7:44 ` Nadav Har'El
2011-05-16 7:57 ` Gleb Natapov
2011-05-16 9:50 ` Avi Kivity
2011-05-16 10:20 ` Avi Kivity
2011-05-22 19:32 ` Nadav Har'El
2011-05-23 9:37 ` Joerg Roedel
2011-05-23 9:52 ` Avi Kivity
2011-05-23 13:02 ` Joerg Roedel
2011-05-23 13:08 ` Avi Kivity
2011-05-23 13:40 ` Joerg Roedel
2011-05-23 13:52 ` Avi Kivity
2011-05-23 14:10 ` Nadav Har'El
2011-05-23 14:32 ` Avi Kivity
2011-05-23 14:44 ` Nadav Har'El
2011-05-23 15:23 ` Avi Kivity
2011-05-23 18:06 ` Alexander Graf
2011-05-24 11:09 ` Avi Kivity
2011-05-24 13:07 ` Joerg Roedel
2011-05-23 14:28 ` Joerg Roedel
2011-05-23 14:34 ` Avi Kivity
2011-05-23 14:58 ` Joerg Roedel
2011-05-23 15:19 ` Avi Kivity
2011-05-23 13:18 ` Nadav Har'El
2011-05-12 16:18 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DC7C887.5060402@redhat.com \
--to=avi@redhat.com \
--cc=gleb@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=nyh@il.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.