From: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: Marcelo Tosatti <marcelo-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>
Cc: kvm-devel <kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>
Subject: Re: [RFC] fix VMX TSC synchronicity
Date: Tue, 15 Jan 2008 16:33:44 +0200 [thread overview]
Message-ID: <478CC448.1030901@qumranet.com> (raw)
In-Reply-To: <20080114160647.GA15919@dmt>
Marcelo Tosatti wrote:
> Hi Avi,
>
> On Sun, Jan 13, 2008 at 02:19:29PM +0200, Avi Kivity wrote:
>
>> Marcelo Tosatti wrote:
>>
>>> The boot TSC sync check is failing on recent Linux SMP guests on TSC
>>> stable hosts.
>>>
>>>
>>>
>> What about tsc unstable hosts? If your patch convinces the guest its
>> tsc is table, while the host tsc is not, then it may cause confusion
>> later on.
>>
>
> The adjustment to zero won't fool the guest because it assumes that the
> TSC's are synchronized. It will simply set the guest TSC to zero on all
> VCPUs based on the time VCPU0 was initialized.
>
> That is, setting -(vcpu[0].first_tsc) on all VCPU's will not correct any
> synchronization problem.
>
>
What I mean is, right now we present really broken tscs to the guest.
After your patch, we present less-broken tscs (at boot, they will
closely resemble stable tscs). But after the machine idles a bit and
cpufreq takes over, the tscs won't be stable any more.
>>> +static void guest_write_tsc(u64 host_tsc, u64 guest_tsc);
>>> +
>>> /*
>>> * Switches to specified vcpu, until a matching vcpu_put(), but assumes
>>> * vcpu mutex is already taken.
>>> @@ -488,7 +493,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int
>>> cpu)
>>> {
>>> struct vcpu_vmx *vmx = to_vmx(vcpu);
>>> u64 phys_addr = __pa(vmx->vmcs);
>>> - u64 tsc_this, delta;
>>> + u64 delta;
>>>
>>> if (vcpu->cpu != cpu) {
>>> vcpu_clear(vmx);
>>> @@ -511,6 +516,19 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int
>>> cpu)
>>> struct descriptor_table dt;
>>> unsigned long sysenter_esp;
>>>
>>> + if (unlikely(vcpu->cpu == -1)) {
>>>
>>>
>> This can happen after migration, I believe.
>>
>
> Right now the destination host will do ->vcpu_reset() on all VCPU's on
> startup... so its already broken. This patch is not introducing any
> issues, just changing where it happens :)
>
No, after the reset qemu will set the guest msrs including the tsc msr,
overriding the reset value.
However, by that time vcpu->cpu will no longer be -1, so there's no
issue there.
On the other hand, it may happen during cpu hotunplug (see
decache_vcpus_on_vcpu).
>
>
>>> + rdtscll(vcpu->arch.host_tsc);
>>> + vmx->tsc_this = vcpu->arch.host_tsc;
>>> + if (vcpu->vcpu_id == 0) {
>>> + guest_write_tsc(vcpu->arch.host_tsc, 0);
>>> + vmx->first_tsc = vcpu->arch.host_tsc;
>>> + } else {
>>> + struct vcpu_vmx *cpu0;
>>> + cpu0 = to_vmx(vcpu->kvm->vcpus[0]);
>>> + guest_write_tsc(cpu0->first_tsc, 0);
>>> + }
>>> + }
>>> +
>>>
>>>
>> Depending on vcpu_id == 0 can cause problems (for example, if vcpu 0 is
>> somehow not the first to run).
>>
>
> But QEMU will always run kvm_create() first (which does
> kvm_create_vcpu(0)), then start the other threads later. So I don't see
> how that can happen.
>
>
We're not developing purely for qemu (there's xenner, for example) and I
don't want to embed hidden assumptions into the API.
>>> vcpu->cpu = cpu;
>>> /*
>>> * Linux uses per-cpu TSS and GDT, so set these when
>>> switching
>>> @@ -526,8 +544,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int
>>> cpu)
>>> /*
>>> * Make sure the time stamp counter is monotonous.
>>> */
>>> - rdtscll(tsc_this);
>>> - delta = vcpu->arch.host_tsc - tsc_this;
>>> + delta = vcpu->arch.host_tsc - vmx->tsc_this;
>>> vmcs_write64(TSC_OFFSET, vmcs_read64(TSC_OFFSET) + delta);
>>>
>>>
>> This is a little roundabout, how about moving the delta calculation
>> immediately after the call to vcpu_clear()?
>>
>> I don't think this is the cause of the problem, it can't account for
>> more than a few hundred cycles, compared to the much greater vmentry cost.
>>
>
> Actually it accounts for several thousand cycles warp by the time the
> kernel checks the TSC synchronization between CPU's, which happens very
> early on boot.
>
> You saw the hang on userspace init, by then there could be many
> VCPU->CPU switchs.
>
>
Okay.
>> Anyway it should be in a separate patch.
>>
>
> How does the following look (minus proper changelog):
>
>
> Index: kvm.quilt/arch/x86/kvm/vmx.c
> ===================================================================
> --- kvm.quilt.orig/arch/x86/kvm/vmx.c
> +++ kvm.quilt/arch/x86/kvm/vmx.c
> @@ -47,6 +47,7 @@ struct vcpu_vmx {
> struct kvm_vcpu vcpu;
> int launched;
> u8 fail;
> + u64 first_tsc;
> u32 idt_vectoring_info;
> struct kvm_msr_entry *guest_msrs;
> struct kvm_msr_entry *host_msrs;
> @@ -480,6 +481,7 @@ static void vmx_load_host_state(struct v
> reload_host_efer(vmx);
> }
>
> +static void guest_write_tsc(u64 host_tsc, u64 guest_tsc);
> /*
> * Switches to specified vcpu, until a matching vcpu_put(), but assumes
> * vcpu mutex is already taken.
> @@ -511,6 +513,18 @@ static void vmx_vcpu_load(struct kvm_vcp
> struct descriptor_table dt;
> unsigned long sysenter_esp;
>
> + if (unlikely(vcpu->cpu == -1)) {
>
> + rdtscll(vcpu->arch.host_tsc);
> + if (vcpu->vcpu_id == 0) {
> + guest_write_tsc(vcpu->arch.host_tsc, 0);
> + vmx->first_tsc = vcpu->arch.host_tsc;
> + } else {
> + struct vcpu_vmx *cpu0;
> + cpu0 = to_vmx(vcpu->kvm->vcpus[0]);
> + guest_write_tsc(cpu0->first_tsc, 0);
> + }
> + }
> +
>
I don't like the dependency on vcpu 0. Also see comments above.
>
> +static void __vcpu_clear(void *arg)
> +{
> + struct vcpu_vmx *vmx = arg;
> + int cpu = raw_smp_processor_id();
> +
> + if (vmx->vcpu.cpu == cpu)
> + vmcs_clear(vmx->vmcs);
> + if (per_cpu(current_vmcs, cpu) == vmx->vmcs)
> + per_cpu(current_vmcs, cpu) = NULL;
> + rdtscll(vmx->vcpu.arch.host_tsc);
> +}
> +
> +static void vcpu_clear(struct vcpu_vmx *vmx)
> +{
> + u64 tsc_this, delta;
> +
> + if (vmx->vcpu.cpu == -1)
> + return;
> + smp_call_function_single(vmx->vcpu.cpu, __vcpu_clear, vmx, 0, 1);
> + /*
> + * Make sure the time stamp counter is monotonous.
> + */
> + rdtscll(tsc_this);
> + delta = vmx->vcpu.arch.host_tsc - tsc_this;
> + vmcs_write64(TSC_OFFSET, vmcs_read64(TSC_OFFSET) + delta);
> + vmx->launched = 0;
> +}
>
This part is fine.
--
error compiling committee.c: too many arguments to function
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
next prev parent reply other threads:[~2008-01-15 14:33 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-11 20:49 [RFC] fix VMX TSC synchronicity Marcelo Tosatti
2008-01-13 12:19 ` Avi Kivity
[not found] ` <478A01D1.7000402-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-14 16:06 ` Marcelo Tosatti
2008-01-14 20:46 ` Marcelo Tosatti
2008-01-15 14:59 ` Avi Kivity
2008-01-15 14:33 ` Avi Kivity [this message]
[not found] ` <478CC448.1030901-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-15 14:50 ` Alexander Graf
[not found] ` <478CC819.3040106-r27SGEef+tmzQB+pC5nmwQ@public.gmane.org>
2008-01-15 15:09 ` Avi Kivity
[not found] ` <478CCCA9.2080300-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-15 16:42 ` Alexander Graf
[not found] ` <478CE277.9010109-r27SGEef+tmzQB+pC5nmwQ@public.gmane.org>
2008-01-15 17:46 ` Avi Kivity
[not found] ` <478CF186.5030304-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-15 18:29 ` Amit Shah
2008-01-16 5:51 ` Andi Kleen
[not found] ` <p73ir1ul3ls.fsf-KvMlXPVkKihbpigZmTR7Iw@public.gmane.org>
2008-01-16 8:46 ` Avi Kivity
[not found] ` <478DC453.1000404-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-16 13:34 ` Andi Kleen
[not found] ` <478E08E5.2030507@qumranet.com>
[not found] ` <478E08E5.2030507-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-16 13:54 ` Andi Kleen
[not found] ` <20080116135415.GA14664-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
2008-01-16 13:56 ` Avi Kivity
2008-01-17 18:43 ` Marcelo Tosatti
2008-01-17 18:56 ` Andi Kleen
2008-01-20 14:59 ` Avi Kivity
2008-05-04 15:40 ` Avi Kivity
2008-05-06 13:55 ` Avi Kivity
-- strict thread matches above, loose matches on Subject: below --
2008-01-12 3:11 Will Trives
2008-01-12 12:28 ` Marcelo Tosatti
2008-01-12 13:48 ` Will Trives
2008-01-12 20:51 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=478CC448.1030901@qumranet.com \
--to=avi-atkuwr5tajbwk0htik3j/w@public.gmane.org \
--cc=kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org \
--cc=marcelo-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.