All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bandan Das <bsd@redhat.com>
To: Jim Mattson <jmattson@google.com>
Cc: kvm@vger.kernel.org, David Matlack <dmatlack@google.com>,
	Peter Feiner <pfeiner@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] kvm: nVMX: Track active shadow VMCSs
Date: Mon, 18 Jul 2016 19:36:13 -0400	[thread overview]
Message-ID: <jpgwpki5xjm.fsf@linux.bootlegged.copy> (raw)
In-Reply-To: <CALMp9eQ=ketQQrLiG8OLcZNS7anzWCtf7hAjQTKaWozszGTgBw@mail.gmail.com> (Jim Mattson's message of "Mon, 18 Jul 2016 14:30:43 -0700")

Jim Mattson <jmattson@google.com> writes:

> My replies are inline.
>
> On Fri, Jul 15, 2016 at 5:50 PM, Bandan Das <bsd@redhat.com> wrote:
>>
>> Hi Jim,
>>
>> Some comments below.
>>
>> Jim Mattson <jmattson@google.com> writes:
>>
>>> If a shadow VMCS is referenced by the VMCS link pointer in the
>>> current VMCS, then VM-entry makes the shadow VMCS active on the
>>> current processor. No VMCS should ever be active on more than one
>>> processor. If a VMCS is to be migrated from one processor to
>>> another, software should execute a VMCLEAR for the VMCS on the
>>> first processor before the VMCS is made active on the second
>>> processor.
>>>
>>> We already keep track of ordinary VMCSs that are made active by
>>> VMPTRLD. Extend that tracking to handle shadow VMCSs that are
>>> made active by VM-entry to their parents.
>>>
>>> Signed-off-by: Jim Mattson <jmattson@google.com>
>>>
>>>  arch/x86/kvm/vmx.c | 73 +++++++++++++++++++++++++++++++++++++++---------------
>>>  1 file changed, 53 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index 64a79f2..dd38521 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -398,7 +398,7 @@ struct nested_vmx {
>>>       /* The host-usable pointer to the above */
>>>       struct page *current_vmcs12_page;
>>>       struct vmcs12 *current_vmcs12;
>>> -     struct vmcs *current_shadow_vmcs;
>>> +     struct loaded_vmcs current_shadow_vmcs;
>>>       /*
>>>        * Indicates if the shadow vmcs must be updated with the
>>>        * data hold by vmcs12
>>> @@ -2113,6 +2113,15 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>>>                       new.control) != old.control);
>>>  }
>>>
>>> +static void record_loaded_vmcs(struct loaded_vmcs *loaded_vmcs, int cpu)
>>> +{
>>> +       if (loaded_vmcs->vmcs) {
>>> +            list_add(&loaded_vmcs->loaded_vmcss_on_cpu_link,
>>> +                     &per_cpu(loaded_vmcss_on_cpu, cpu));
>>> +            loaded_vmcs->cpu = cpu;
>>> +       }
>>> +}
>>> +
>>>  /*
>>>   * Switches to specified vcpu, until a matching vcpu_put(), but assumes
>>>   * vcpu mutex is already taken.
>>> @@ -2124,8 +2133,11 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>>
>>>       if (!vmm_exclusive)
>>>               kvm_cpu_vmxon(phys_addr);
>>> -     else if (vmx->loaded_vmcs->cpu != cpu)
>>> +     else if (vmx->loaded_vmcs->cpu != cpu) {
>>>               loaded_vmcs_clear(vmx->loaded_vmcs);
>>> +             if (vmx->nested.current_shadow_vmcs.vmcs)
>>> +                     loaded_vmcs_clear(&vmx->nested.current_shadow_vmcs);
>>> +     }
>>>
>>>       if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
>>>               per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
>>> @@ -2147,8 +2159,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>>                */
>>>               smp_rmb();
>>>
>>> -             list_add(&vmx->loaded_vmcs->loaded_vmcss_on_cpu_link,
>>> -                      &per_cpu(loaded_vmcss_on_cpu, cpu));
>>> +             record_loaded_vmcs(vmx->loaded_vmcs, cpu);
>>> +             record_loaded_vmcs(&vmx->nested.current_shadow_vmcs, cpu);
>>>               crash_enable_local_vmclear(cpu);
>>>               local_irq_enable();
>>>
>>> @@ -2161,8 +2173,6 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>>
>>>               rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
>>>               vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
>>> -
>>> -             vmx->loaded_vmcs->cpu = cpu;
>>>       }
>>
>> Is the order of setting loaded_vmcs->cpu important here ? Your patch changed it, I can't
>> think of anything wrong with it however...
>>
>
> I can't find any references to loaded_vmcs->cpu between the new
> assignment and the original assignment. I believe this change is
> okay.
>
> However, on a related note, I have preserved the existing ordering of
> VMPTRLD followed by adding the VMCS to the loaded_vmcss_on_cpu. Is
> this ordering safe in the event of a KEXEC?

So, there might be the possibility that kexec doesn't clear the recently loaded vmcs 
since it was not on the list ? Looks like an issue to me but I have cc-ed Paolo to
comment/confirm.

>>>       /* Setup TSC multiplier */
>>> @@ -6812,6 +6822,34 @@ static int nested_vmx_check_vmptr(struct kvm_vcpu *vcpu, int exit_reason,
>>>       return 0;
>>>  }
>>>
>>> +static int setup_shadow_vmcs(struct vcpu_vmx *vmx)
>>> +{
>>> +       struct vmcs *shadow_vmcs;
>>> +       int cpu;
>>> +
>>> +       shadow_vmcs = alloc_vmcs();
>>> +       if (!shadow_vmcs)
>>> +            return -ENOMEM;
>>> +
>>> +       /* mark vmcs as shadow */
>>> +       shadow_vmcs->revision_id |= (1u << 31);
>>> +       /* init shadow vmcs */
>>> +       vmx->nested.current_shadow_vmcs.vmcs = shadow_vmcs;
>>> +       loaded_vmcs_init(&vmx->nested.current_shadow_vmcs);
>>> +
>>> +       cpu = get_cpu();
>>> +       local_irq_disable();
>>> +       crash_disable_local_vmclear(cpu);
>>> +
>>> +       record_loaded_vmcs(&vmx->nested.current_shadow_vmcs, cpu);
>>> +
>>> +       crash_enable_local_vmclear(cpu);
>>> +       local_irq_enable();
>>> +       put_cpu();
>>> +
>>> +       return 0;
>>> +}
>>> +
>>>  /*
>>>   * Emulate the VMXON instruction.
>>>   * Currently, we just remember that VMX is active, and do not save or even
>>> @@ -6824,7 +6862,6 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
>>>  {
>>>       struct kvm_segment cs;
>>>       struct vcpu_vmx *vmx = to_vmx(vcpu);
>>> -     struct vmcs *shadow_vmcs;
>>>       const u64 VMXON_NEEDED_FEATURES = FEATURE_CONTROL_LOCKED
>>>               | FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
>>>
>>> @@ -6867,14 +6904,9 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
>>>       }
>>>
>>>       if (enable_shadow_vmcs) {
>>> -             shadow_vmcs = alloc_vmcs();
>>> -             if (!shadow_vmcs)
>>> -                     return -ENOMEM;
>>> -             /* mark vmcs as shadow */
>>> -             shadow_vmcs->revision_id |= (1u << 31);
>>> -             /* init shadow vmcs */
>>> -             vmcs_clear(shadow_vmcs);
>>> -             vmx->nested.current_shadow_vmcs = shadow_vmcs;
>>> +             int ret = setup_shadow_vmcs(vmx);
>>> +             if (ret)
>>> +                     return ret;
>> Nit:
>>              if (setup_shadow_vmcs(vmx))
>>                        return -ENOMEM;
>>
>> Also, isn't it better to actually add the shadow vmcs to the list
>> in handle_vmptrld than in handle_vmon ? That is where shadow is actually
>> enabled and the link pointer set.
>>
>
> Are you are suggesting that the invariant should be that the
> shadow VMCS is on the loaded_vmcss_on_cpu list iff it is
> referenced by the link pointer in the parent? Maintaining that
> invariant will require some extra work. The shadow VMCS will have
> to be removed from the loaded_vmcss_on_cpu list in
> nested_release_vmcs12, and vmx_vcpu_load will have to check
> vmx->nested.current_vmptr before the loaded_vmcss_on_cpu
> operations involving the shadow VMCS.

Hmm... It just seemed like managing shadow_vmcs() is more of a vmptrld/vmclear
operation but that also means removing/adding vmcss everytime vmclear is called
and iirc the current code does a good job of reusing vmcss. So, I think this is
fine.

Bandan

>>>       }
>>>
>>>       INIT_LIST_HEAD(&(vmx->nested.vmcs02_pool));
>>> @@ -6959,7 +6991,7 @@ static void free_nested(struct vcpu_vmx *vmx)
>>>       free_vpid(vmx->nested.vpid02);
>>>       nested_release_vmcs12(vmx);
>>>       if (enable_shadow_vmcs)
>>> -             free_vmcs(vmx->nested.current_shadow_vmcs);
>>> +             free_loaded_vmcs(&vmx->nested.current_shadow_vmcs);
>>>       /* Unpin physical memory we referred to in current vmcs02 */
>>>       if (vmx->nested.apic_access_page) {
>>>               nested_release_page(vmx->nested.apic_access_page);
>>> @@ -7135,7 +7167,7 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
>>>       int i;
>>>       unsigned long field;
>>>       u64 field_value;
>>> -     struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs;
>>> +     struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs.vmcs;
>>>       const unsigned long *fields = shadow_read_write_fields;
>>>       const int num_fields = max_shadow_read_write_fields;
>>>
>>> @@ -7184,7 +7216,7 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
>>>       int i, q;
>>>       unsigned long field;
>>>       u64 field_value = 0;
>>> -     struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs;
>>> +     struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs.vmcs;
>>>
>>>       vmcs_load(shadow_vmcs);
>>>
>>> @@ -7364,10 +7396,11 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
>>>               vmx->nested.current_vmcs12 = new_vmcs12;
>>>               vmx->nested.current_vmcs12_page = page;
>>>               if (enable_shadow_vmcs) {
>>> +                     struct vmcs *shadow_vmcs;
>>> +                     shadow_vmcs = vmx->nested.current_shadow_vmcs.vmcs;
>>
>> Nit: struct vmcs *shadow_vmcs =
>>                  vmx->nested.current_shadow_vmcs.vmcs;
>>
>> Bandan
>>
>>>                       vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>>>                                     SECONDARY_EXEC_SHADOW_VMCS);
>>> -                     vmcs_write64(VMCS_LINK_POINTER,
>>> -                                  __pa(vmx->nested.current_shadow_vmcs));
>>> +                     vmcs_write64(VMCS_LINK_POINTER, __pa(shadow_vmcs));
>>>                       vmx->nested.sync_shadow_vmcs = true;
>>>               }
>>>       }
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2016-07-18 23:36 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-15 17:39 [PATCH] kvm: nVMX: Track active shadow VMCSs Jim Mattson
2016-07-16  0:50 ` Bandan Das
2016-07-16 14:50   ` Paolo Bonzini
2016-07-16 19:48     ` Bandan Das
2016-07-19 16:22       ` Jim Mattson
2016-07-22 18:25         ` [PATCH v2] KVM: " Jim Mattson
2016-07-27 20:45           ` Radim Krčmář
2016-07-27 21:53             ` Bandan Das
2016-07-28 13:30               ` Jim Mattson
2016-07-28 14:01                 ` Radim Krčmář
2016-10-28 15:29         ` [PATCH v2] kvm: nVMX: VMCLEAR an active shadow VMCS after last use Jim Mattson
2016-11-02 17:23           ` Paolo Bonzini
2016-11-02 17:56             ` Jim Mattson
2016-11-02 18:04               ` Paolo Bonzini
2016-07-18 21:30   ` [PATCH] kvm: nVMX: Track active shadow VMCSs Jim Mattson
2016-07-18 23:36     ` Bandan Das [this message]
2016-07-19  8:41       ` Paolo Bonzini
2016-07-19 16:13         ` Jim Mattson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jpgwpki5xjm.fsf@linux.bootlegged.copy \
    --to=bsd@redhat.com \
    --cc=dmatlack@google.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=pfeiner@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.