From: Bandan Das <bsd@redhat.com>
To: Jim Mattson <jmattson@google.com>
Cc: kvm@vger.kernel.org, David Matlack <dmatlack@google.com>,
Peter Feiner <pfeiner@google.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] kvm: nVMX: Track active shadow VMCSs
Date: Mon, 18 Jul 2016 19:36:13 -0400 [thread overview]
Message-ID: <jpgwpki5xjm.fsf@linux.bootlegged.copy> (raw)
In-Reply-To: <CALMp9eQ=ketQQrLiG8OLcZNS7anzWCtf7hAjQTKaWozszGTgBw@mail.gmail.com> (Jim Mattson's message of "Mon, 18 Jul 2016 14:30:43 -0700")
Jim Mattson <jmattson@google.com> writes:
> My replies are inline.
>
> On Fri, Jul 15, 2016 at 5:50 PM, Bandan Das <bsd@redhat.com> wrote:
>>
>> Hi Jim,
>>
>> Some comments below.
>>
>> Jim Mattson <jmattson@google.com> writes:
>>
>>> If a shadow VMCS is referenced by the VMCS link pointer in the
>>> current VMCS, then VM-entry makes the shadow VMCS active on the
>>> current processor. No VMCS should ever be active on more than one
>>> processor. If a VMCS is to be migrated from one processor to
>>> another, software should execute a VMCLEAR for the VMCS on the
>>> first processor before the VMCS is made active on the second
>>> processor.
>>>
>>> We already keep track of ordinary VMCSs that are made active by
>>> VMPTRLD. Extend that tracking to handle shadow VMCSs that are
>>> made active by VM-entry to their parents.
>>>
>>> Signed-off-by: Jim Mattson <jmattson@google.com>
>>>
>>> arch/x86/kvm/vmx.c | 73 +++++++++++++++++++++++++++++++++++++++---------------
>>> 1 file changed, 53 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index 64a79f2..dd38521 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -398,7 +398,7 @@ struct nested_vmx {
>>> /* The host-usable pointer to the above */
>>> struct page *current_vmcs12_page;
>>> struct vmcs12 *current_vmcs12;
>>> - struct vmcs *current_shadow_vmcs;
>>> + struct loaded_vmcs current_shadow_vmcs;
>>> /*
>>> * Indicates if the shadow vmcs must be updated with the
>>> * data hold by vmcs12
>>> @@ -2113,6 +2113,15 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>>> new.control) != old.control);
>>> }
>>>
>>> +static void record_loaded_vmcs(struct loaded_vmcs *loaded_vmcs, int cpu)
>>> +{
>>> + if (loaded_vmcs->vmcs) {
>>> + list_add(&loaded_vmcs->loaded_vmcss_on_cpu_link,
>>> + &per_cpu(loaded_vmcss_on_cpu, cpu));
>>> + loaded_vmcs->cpu = cpu;
>>> + }
>>> +}
>>> +
>>> /*
>>> * Switches to specified vcpu, until a matching vcpu_put(), but assumes
>>> * vcpu mutex is already taken.
>>> @@ -2124,8 +2133,11 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>>
>>> if (!vmm_exclusive)
>>> kvm_cpu_vmxon(phys_addr);
>>> - else if (vmx->loaded_vmcs->cpu != cpu)
>>> + else if (vmx->loaded_vmcs->cpu != cpu) {
>>> loaded_vmcs_clear(vmx->loaded_vmcs);
>>> + if (vmx->nested.current_shadow_vmcs.vmcs)
>>> + loaded_vmcs_clear(&vmx->nested.current_shadow_vmcs);
>>> + }
>>>
>>> if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
>>> per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
>>> @@ -2147,8 +2159,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>> */
>>> smp_rmb();
>>>
>>> - list_add(&vmx->loaded_vmcs->loaded_vmcss_on_cpu_link,
>>> - &per_cpu(loaded_vmcss_on_cpu, cpu));
>>> + record_loaded_vmcs(vmx->loaded_vmcs, cpu);
>>> + record_loaded_vmcs(&vmx->nested.current_shadow_vmcs, cpu);
>>> crash_enable_local_vmclear(cpu);
>>> local_irq_enable();
>>>
>>> @@ -2161,8 +2173,6 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>>
>>> rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
>>> vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
>>> -
>>> - vmx->loaded_vmcs->cpu = cpu;
>>> }
>>
>> Is the order of setting loaded_vmcs->cpu important here ? Your patch changed it, I can't
>> think of anything wrong with it however...
>>
>
> I can't find any references to loaded_vmcs->cpu between the new
> assignment and the original assignment. I believe this change is
> okay.
>
> However, on a related note, I have preserved the existing ordering of
> VMPTRLD followed by adding the VMCS to the loaded_vmcss_on_cpu. Is
> this ordering safe in the event of a KEXEC?
So, there might be the possibility that kexec doesn't clear the recently loaded vmcs
since it was not on the list ? Looks like an issue to me but I have cc-ed Paolo to
comment/confirm.
>>> /* Setup TSC multiplier */
>>> @@ -6812,6 +6822,34 @@ static int nested_vmx_check_vmptr(struct kvm_vcpu *vcpu, int exit_reason,
>>> return 0;
>>> }
>>>
>>> +static int setup_shadow_vmcs(struct vcpu_vmx *vmx)
>>> +{
>>> + struct vmcs *shadow_vmcs;
>>> + int cpu;
>>> +
>>> + shadow_vmcs = alloc_vmcs();
>>> + if (!shadow_vmcs)
>>> + return -ENOMEM;
>>> +
>>> + /* mark vmcs as shadow */
>>> + shadow_vmcs->revision_id |= (1u << 31);
>>> + /* init shadow vmcs */
>>> + vmx->nested.current_shadow_vmcs.vmcs = shadow_vmcs;
>>> + loaded_vmcs_init(&vmx->nested.current_shadow_vmcs);
>>> +
>>> + cpu = get_cpu();
>>> + local_irq_disable();
>>> + crash_disable_local_vmclear(cpu);
>>> +
>>> + record_loaded_vmcs(&vmx->nested.current_shadow_vmcs, cpu);
>>> +
>>> + crash_enable_local_vmclear(cpu);
>>> + local_irq_enable();
>>> + put_cpu();
>>> +
>>> + return 0;
>>> +}
>>> +
>>> /*
>>> * Emulate the VMXON instruction.
>>> * Currently, we just remember that VMX is active, and do not save or even
>>> @@ -6824,7 +6862,6 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
>>> {
>>> struct kvm_segment cs;
>>> struct vcpu_vmx *vmx = to_vmx(vcpu);
>>> - struct vmcs *shadow_vmcs;
>>> const u64 VMXON_NEEDED_FEATURES = FEATURE_CONTROL_LOCKED
>>> | FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
>>>
>>> @@ -6867,14 +6904,9 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
>>> }
>>>
>>> if (enable_shadow_vmcs) {
>>> - shadow_vmcs = alloc_vmcs();
>>> - if (!shadow_vmcs)
>>> - return -ENOMEM;
>>> - /* mark vmcs as shadow */
>>> - shadow_vmcs->revision_id |= (1u << 31);
>>> - /* init shadow vmcs */
>>> - vmcs_clear(shadow_vmcs);
>>> - vmx->nested.current_shadow_vmcs = shadow_vmcs;
>>> + int ret = setup_shadow_vmcs(vmx);
>>> + if (ret)
>>> + return ret;
>> Nit:
>> if (setup_shadow_vmcs(vmx))
>> return -ENOMEM;
>>
>> Also, isn't it better to actually add the shadow vmcs to the list
>> in handle_vmptrld than in handle_vmon ? That is where shadow is actually
>> enabled and the link pointer set.
>>
>
> Are you are suggesting that the invariant should be that the
> shadow VMCS is on the loaded_vmcss_on_cpu list iff it is
> referenced by the link pointer in the parent? Maintaining that
> invariant will require some extra work. The shadow VMCS will have
> to be removed from the loaded_vmcss_on_cpu list in
> nested_release_vmcs12, and vmx_vcpu_load will have to check
> vmx->nested.current_vmptr before the loaded_vmcss_on_cpu
> operations involving the shadow VMCS.
Hmm... It just seemed like managing shadow_vmcs() is more of a vmptrld/vmclear
operation but that also means removing/adding vmcss everytime vmclear is called
and iirc the current code does a good job of reusing vmcss. So, I think this is
fine.
Bandan
>>> }
>>>
>>> INIT_LIST_HEAD(&(vmx->nested.vmcs02_pool));
>>> @@ -6959,7 +6991,7 @@ static void free_nested(struct vcpu_vmx *vmx)
>>> free_vpid(vmx->nested.vpid02);
>>> nested_release_vmcs12(vmx);
>>> if (enable_shadow_vmcs)
>>> - free_vmcs(vmx->nested.current_shadow_vmcs);
>>> + free_loaded_vmcs(&vmx->nested.current_shadow_vmcs);
>>> /* Unpin physical memory we referred to in current vmcs02 */
>>> if (vmx->nested.apic_access_page) {
>>> nested_release_page(vmx->nested.apic_access_page);
>>> @@ -7135,7 +7167,7 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
>>> int i;
>>> unsigned long field;
>>> u64 field_value;
>>> - struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs;
>>> + struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs.vmcs;
>>> const unsigned long *fields = shadow_read_write_fields;
>>> const int num_fields = max_shadow_read_write_fields;
>>>
>>> @@ -7184,7 +7216,7 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
>>> int i, q;
>>> unsigned long field;
>>> u64 field_value = 0;
>>> - struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs;
>>> + struct vmcs *shadow_vmcs = vmx->nested.current_shadow_vmcs.vmcs;
>>>
>>> vmcs_load(shadow_vmcs);
>>>
>>> @@ -7364,10 +7396,11 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
>>> vmx->nested.current_vmcs12 = new_vmcs12;
>>> vmx->nested.current_vmcs12_page = page;
>>> if (enable_shadow_vmcs) {
>>> + struct vmcs *shadow_vmcs;
>>> + shadow_vmcs = vmx->nested.current_shadow_vmcs.vmcs;
>>
>> Nit: struct vmcs *shadow_vmcs =
>> vmx->nested.current_shadow_vmcs.vmcs;
>>
>> Bandan
>>
>>> vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>>> SECONDARY_EXEC_SHADOW_VMCS);
>>> - vmcs_write64(VMCS_LINK_POINTER,
>>> - __pa(vmx->nested.current_shadow_vmcs));
>>> + vmcs_write64(VMCS_LINK_POINTER, __pa(shadow_vmcs));
>>> vmx->nested.sync_shadow_vmcs = true;
>>> }
>>> }
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2016-07-18 23:36 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-15 17:39 [PATCH] kvm: nVMX: Track active shadow VMCSs Jim Mattson
2016-07-16 0:50 ` Bandan Das
2016-07-16 14:50 ` Paolo Bonzini
2016-07-16 19:48 ` Bandan Das
2016-07-19 16:22 ` Jim Mattson
2016-07-22 18:25 ` [PATCH v2] KVM: " Jim Mattson
2016-07-27 20:45 ` Radim Krčmář
2016-07-27 21:53 ` Bandan Das
2016-07-28 13:30 ` Jim Mattson
2016-07-28 14:01 ` Radim Krčmář
2016-10-28 15:29 ` [PATCH v2] kvm: nVMX: VMCLEAR an active shadow VMCS after last use Jim Mattson
2016-11-02 17:23 ` Paolo Bonzini
2016-11-02 17:56 ` Jim Mattson
2016-11-02 18:04 ` Paolo Bonzini
2016-07-18 21:30 ` [PATCH] kvm: nVMX: Track active shadow VMCSs Jim Mattson
2016-07-18 23:36 ` Bandan Das [this message]
2016-07-19 8:41 ` Paolo Bonzini
2016-07-19 16:13 ` Jim Mattson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=jpgwpki5xjm.fsf@linux.bootlegged.copy \
--to=bsd@redhat.com \
--cc=dmatlack@google.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=pfeiner@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox