From: Gleb Natapov <gleb@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Arthur Chunqi Li <yzt356@gmail.com>,
kvm@vger.kernel.org, jan.kiszka@web.de, yang.z.zhang@intel.com
Subject: Re: [PATCH v4] KVM: nVMX: Fully support of nested VMX preemption timer
Date: Mon, 16 Sep 2013 10:44:22 +0300 [thread overview]
Message-ID: <20130916074422.GG17294@redhat.com> (raw)
In-Reply-To: <5233481F.8060606@redhat.com>
On Fri, Sep 13, 2013 at 07:15:11PM +0200, Paolo Bonzini wrote:
> Il 06/09/2013 04:04, Arthur Chunqi Li ha scritto:
> > This patch contains the following two changes:
> > 1. Fix the bug in nested preemption timer support. If vmexit L2->L0
> > with some reasons not emulated by L1, preemption timer value should
> > be save in such exits.
> > 2. Add support of "Save VMX-preemption timer value" VM-Exit controls
> > to nVMX.
> >
> > With this patch, nested VMX preemption timer features are fully
> > supported.
> >
> > Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com>
> > ---
> > ChangeLog to v3:
> > Move nested_adjust_preemption_timer to the latest place just before vmenter.
> > Some minor changes.
> >
> > arch/x86/include/uapi/asm/msr-index.h | 1 +
> > arch/x86/kvm/vmx.c | 49 +++++++++++++++++++++++++++++++--
> > 2 files changed, 48 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
> > index bb04650..b93e09a 100644
> > --- a/arch/x86/include/uapi/asm/msr-index.h
> > +++ b/arch/x86/include/uapi/asm/msr-index.h
> > @@ -536,6 +536,7 @@
> >
> > /* MSR_IA32_VMX_MISC bits */
> > #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
> > +#define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE 0x1F
> > /* AMD-V MSRs */
> >
> > #define MSR_VM_CR 0xc0010114
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 1f1da43..f364d16 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -374,6 +374,8 @@ struct nested_vmx {
> > */
> > struct page *apic_access_page;
> > u64 msr_ia32_feature_control;
> > + /* Set if vmexit is L2->L1 */
> > + bool nested_vmx_exit;
> > };
> >
> > #define POSTED_INTR_ON 0
> > @@ -2204,7 +2206,17 @@ static __init void nested_vmx_setup_ctls_msrs(void)
> > #ifdef CONFIG_X86_64
> > VM_EXIT_HOST_ADDR_SPACE_SIZE |
> > #endif
> > - VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT;
> > + VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT |
> > + VM_EXIT_SAVE_VMX_PREEMPTION_TIMER;
> > + if (!(nested_vmx_pinbased_ctls_high &
> > + PIN_BASED_VMX_PREEMPTION_TIMER) ||
> > + !(nested_vmx_exit_ctls_high &
> > + VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)) {
>
> Align this under the other "!". Also, I prefer to have one long line
> for the whole "!(... & ...) ||" (and likewise below), but I don't know
> if Gleb agrees
>
!(... & ...) ||
!(... & ...)
fits perfectly to 80 chars.
> > + nested_vmx_exit_ctls_high &=
> > + (~VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
>
> Please remove parentheses around ~VM_EXIT_SAVE_VMX_PREEMPTION_TIMER, and
> likewise elsewhere in the patch.
>
and line break too.
> > + nested_vmx_pinbased_ctls_high &=
> > + (~PIN_BASED_VMX_PREEMPTION_TIMER);
> > + }
> > nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
> > VM_EXIT_LOAD_IA32_EFER);
> >
> > @@ -6707,6 +6719,24 @@ static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2)
> > *info2 = vmcs_read32(VM_EXIT_INTR_INFO);
> > }
> >
> > +static void nested_adjust_preemption_timer(struct kvm_vcpu *vcpu)
> > +{
> > + u64 delta_tsc_l1;
> > + u32 preempt_val_l1, preempt_val_l2, preempt_scale;
>
> Should this exit immediately if the preemption timer pin-based control
> is disabled?
>
> > + preempt_scale = native_read_msr(MSR_IA32_VMX_MISC) &
> > + MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE;
> > + preempt_val_l2 = vmcs_read32(VMX_PREEMPTION_TIMER_VALUE);
> > + delta_tsc_l1 = kvm_x86_ops->read_l1_tsc(vcpu,
> > + native_read_tsc()) - vcpu->arch.last_guest_tsc;
>
> Please format this like:
>
> delta_tsc_l1 =
> kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc())
> - vcpu->arch.last_guest_tsc;
>
And call vmx_read_l1_tsc() directly. Actually you can even use
to_vmx(vcpu)->nested.vmcs01_tsc_offset directly here since the function
will be called only when is_guest_mode() == true, but vmx_read_l1_tsc()
may be more clear here and compile should optimize second is_guest_mode()
check anyway.
> > + preempt_val_l1 = delta_tsc_l1 >> preempt_scale;
> > + if (preempt_val_l2 <= preempt_val_l1)
> > + preempt_val_l2 = 0;
> > + else
> > + preempt_val_l2 -= preempt_val_l1;
> > + vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, preempt_val_l2);
>
> Did you test that a value of 0 triggers an immediate exit, rather than
> counting down by 2^32? Perhaps it's safer to limit the value to 1
> instead of 0.
>
> > +}
> > +
> > /*
> > * The guest has exited. See if we can fix it or if we need userspace
> > * assistance.
> > @@ -6736,9 +6766,11 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
> > vmx->nested.nested_run_pending = 0;
> >
> > if (is_guest_mode(vcpu) && nested_vmx_exit_handled(vcpu)) {
> > + vmx->nested.nested_vmx_exit = true;
>
> I think this assignment should be in nested_vmx_vmexit, since it is
> called from other places as well.
>
> > nested_vmx_vmexit(vcpu);
> > return 1;
> > }
> > + vmx->nested.nested_vmx_exit = false;
> >
> > if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
> > vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> > @@ -7132,6 +7164,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > debugctlmsr = get_debugctlmsr();
> >
> > vmx->__launched = vmx->loaded_vmcs->launched;
> > + if (is_guest_mode(vcpu) && !(vmx->nested.nested_vmx_exit))
> > + nested_adjust_preemption_timer(vcpu);
>
> Please leave the assignment to __launched last, since it's already
> initializing the asm below.
>
> I don't like the is_guest_mode check here... Maybe it's
> micro-optimizing, but I wonder if we already do too many checks in
> vmx_vcpu_run... For example, is_guest_mode could be changed (I think)
> to a check for "vmx->loaded_vmcs == &vmx->vmcs1".
>
Why this will be more efficient that HF_GUEST_MASK check?
> Alternatively, we could change nested_vmx_exit to an enum in struct
> vcpu_vmx (with values for L0->L1, L0->L2, L1->L2) that is initialized in
> vmx_handle_exit. Then we could check directly for L0->L2 and not adjust
> the preemption timer in other cases. In fact, I suspect this enum could
> replace HF_GUEST_MASK altogether. However, this would require some
> other, more complicated, changes to svm.c.
>
> Gleb, what do you think?
>
I do not see why nested_vmx_exit is necessary at all yet. We can detect
all aforementioned cases without.
--
Gleb.
next prev parent reply other threads:[~2013-09-16 7:44 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-06 2:04 [PATCH v4] KVM: nVMX: Fully support of nested VMX preemption timer Arthur Chunqi Li
2013-09-13 17:15 ` Paolo Bonzini
2013-09-14 7:44 ` Jan Kiszka
2013-09-15 7:59 ` Arthur Chunqi Li
2013-09-16 5:42 ` Arthur Chunqi Li
2013-09-16 7:18 ` Jan Kiszka
2013-09-16 7:44 ` Gleb Natapov [this message]
2013-09-16 8:58 ` Paolo Bonzini
2013-09-16 9:09 ` Gleb Natapov
2013-09-16 9:49 ` Paolo Bonzini
2013-09-16 10:44 ` Gleb Natapov
2013-09-16 10:50 ` Paolo Bonzini
2013-09-15 12:31 ` Gleb Natapov
2013-09-16 5:49 ` Arthur Chunqi Li
2013-09-16 6:20 ` Gleb Natapov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130916074422.GG17294@redhat.com \
--to=gleb@redhat.com \
--cc=jan.kiszka@web.de \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=yang.z.zhang@intel.com \
--cc=yzt356@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).