From: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: Marcelo Tosatti <marcelo-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>
Cc: kvm-devel <kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>
Subject: Re: [RFC] fix VMX TSC synchronicity
Date: Tue, 15 Jan 2008 16:33:44 +0200 [thread overview]
Message-ID: <478CC448.1030901@qumranet.com> (raw)
In-Reply-To: <20080114160647.GA15919@dmt>
Marcelo Tosatti wrote:
> Hi Avi,
>
> On Sun, Jan 13, 2008 at 02:19:29PM +0200, Avi Kivity wrote:
>
>> Marcelo Tosatti wrote:
>>
>>> The boot TSC sync check is failing on recent Linux SMP guests on TSC
>>> stable hosts.
>>>
>>>
>>>
>> What about tsc unstable hosts? If your patch convinces the guest its
>> tsc is table, while the host tsc is not, then it may cause confusion
>> later on.
>>
>
> The adjustment to zero won't fool the guest because it assumes that the
> TSC's are synchronized. It will simply set the guest TSC to zero on all
> VCPUs based on the time VCPU0 was initialized.
>
> That is, setting -(vcpu[0].first_tsc) on all VCPU's will not correct any
> synchronization problem.
>
>
What I mean is, right now we present really broken tscs to the guest.
After your patch, we present less-broken tscs (at boot, they will
closely resemble stable tscs). But after the machine idles a bit and
cpufreq takes over, the tscs won't be stable any more.
>>> +static void guest_write_tsc(u64 host_tsc, u64 guest_tsc);
>>> +
>>> /*
>>> * Switches to specified vcpu, until a matching vcpu_put(), but assumes
>>> * vcpu mutex is already taken.
>>> @@ -488,7 +493,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int
>>> cpu)
>>> {
>>> struct vcpu_vmx *vmx = to_vmx(vcpu);
>>> u64 phys_addr = __pa(vmx->vmcs);
>>> - u64 tsc_this, delta;
>>> + u64 delta;
>>>
>>> if (vcpu->cpu != cpu) {
>>> vcpu_clear(vmx);
>>> @@ -511,6 +516,19 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int
>>> cpu)
>>> struct descriptor_table dt;
>>> unsigned long sysenter_esp;
>>>
>>> + if (unlikely(vcpu->cpu == -1)) {
>>>
>>>
>> This can happen after migration, I believe.
>>
>
> Right now the destination host will do ->vcpu_reset() on all VCPU's on
> startup... so its already broken. This patch is not introducing any
> issues, just changing where it happens :)
>
No, after the reset qemu will set the guest msrs including the tsc msr,
overriding the reset value.
However, by that time vcpu->cpu will no longer be -1, so there's no
issue there.
On the other hand, it may happen during cpu hotunplug (see
decache_vcpus_on_vcpu).
>
>
>>> + rdtscll(vcpu->arch.host_tsc);
>>> + vmx->tsc_this = vcpu->arch.host_tsc;
>>> + if (vcpu->vcpu_id == 0) {
>>> + guest_write_tsc(vcpu->arch.host_tsc, 0);
>>> + vmx->first_tsc = vcpu->arch.host_tsc;
>>> + } else {
>>> + struct vcpu_vmx *cpu0;
>>> + cpu0 = to_vmx(vcpu->kvm->vcpus[0]);
>>> + guest_write_tsc(cpu0->first_tsc, 0);
>>> + }
>>> + }
>>> +
>>>
>>>
>> Depending on vcpu_id == 0 can cause problems (for example, if vcpu 0 is
>> somehow not the first to run).
>>
>
> But QEMU will always run kvm_create() first (which does
> kvm_create_vcpu(0)), then start the other threads later. So I don't see
> how that can happen.
>
>
We're not developing purely for qemu (there's xenner, for example) and I
don't want to embed hidden assumptions into the API.
>>> vcpu->cpu = cpu;
>>> /*
>>> * Linux uses per-cpu TSS and GDT, so set these when
>>> switching
>>> @@ -526,8 +544,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int
>>> cpu)
>>> /*
>>> * Make sure the time stamp counter is monotonous.
>>> */
>>> - rdtscll(tsc_this);
>>> - delta = vcpu->arch.host_tsc - tsc_this;
>>> + delta = vcpu->arch.host_tsc - vmx->tsc_this;
>>> vmcs_write64(TSC_OFFSET, vmcs_read64(TSC_OFFSET) + delta);
>>>
>>>
>> This is a little roundabout, how about moving the delta calculation
>> immediately after the call to vcpu_clear()?
>>
>> I don't think this is the cause of the problem, it can't account for
>> more than a few hundred cycles, compared to the much greater vmentry cost.
>>
>
> Actually it accounts for several thousand cycles warp by the time the
> kernel checks the TSC synchronization between CPU's, which happens very
> early on boot.
>
> You saw the hang on userspace init, by then there could be many
> VCPU->CPU switchs.
>
>
Okay.
>> Anyway it should be in a separate patch.
>>
>
> How does the following look (minus proper changelog):
>
>
> Index: kvm.quilt/arch/x86/kvm/vmx.c
> ===================================================================
> --- kvm.quilt.orig/arch/x86/kvm/vmx.c
> +++ kvm.quilt/arch/x86/kvm/vmx.c
> @@ -47,6 +47,7 @@ struct vcpu_vmx {
> struct kvm_vcpu vcpu;
> int launched;
> u8 fail;
> + u64 first_tsc;
> u32 idt_vectoring_info;
> struct kvm_msr_entry *guest_msrs;
> struct kvm_msr_entry *host_msrs;
> @@ -480,6 +481,7 @@ static void vmx_load_host_state(struct v
> reload_host_efer(vmx);
> }
>
> +static void guest_write_tsc(u64 host_tsc, u64 guest_tsc);
> /*
> * Switches to specified vcpu, until a matching vcpu_put(), but assumes
> * vcpu mutex is already taken.
> @@ -511,6 +513,18 @@ static void vmx_vcpu_load(struct kvm_vcp
> struct descriptor_table dt;
> unsigned long sysenter_esp;
>
> + if (unlikely(vcpu->cpu == -1)) {
>
> + rdtscll(vcpu->arch.host_tsc);
> + if (vcpu->vcpu_id == 0) {
> + guest_write_tsc(vcpu->arch.host_tsc, 0);
> + vmx->first_tsc = vcpu->arch.host_tsc;
> + } else {
> + struct vcpu_vmx *cpu0;
> + cpu0 = to_vmx(vcpu->kvm->vcpus[0]);
> + guest_write_tsc(cpu0->first_tsc, 0);
> + }
> + }
> +
>
I don't like the dependency on vcpu 0. Also see comments above.
>
> +static void __vcpu_clear(void *arg)
> +{
> + struct vcpu_vmx *vmx = arg;
> + int cpu = raw_smp_processor_id();
> +
> + if (vmx->vcpu.cpu == cpu)
> + vmcs_clear(vmx->vmcs);
> + if (per_cpu(current_vmcs, cpu) == vmx->vmcs)
> + per_cpu(current_vmcs, cpu) = NULL;
> + rdtscll(vmx->vcpu.arch.host_tsc);
> +}
> +
> +static void vcpu_clear(struct vcpu_vmx *vmx)
> +{
> + u64 tsc_this, delta;
> +
> + if (vmx->vcpu.cpu == -1)
> + return;
> + smp_call_function_single(vmx->vcpu.cpu, __vcpu_clear, vmx, 0, 1);
> + /*
> + * Make sure the time stamp counter is monotonous.
> + */
> + rdtscll(tsc_this);
> + delta = vmx->vcpu.arch.host_tsc - tsc_this;
> + vmcs_write64(TSC_OFFSET, vmcs_read64(TSC_OFFSET) + delta);
> + vmx->launched = 0;
> +}
>
This part is fine.
--
error compiling committee.c: too many arguments to function
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
next prev parent reply other threads:[~2008-01-15 14:33 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-11 20:49 [RFC] fix VMX TSC synchronicity Marcelo Tosatti
2008-01-13 12:19 ` Avi Kivity
[not found] ` <478A01D1.7000402-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-14 16:06 ` Marcelo Tosatti
2008-01-14 20:46 ` Marcelo Tosatti
2008-01-15 14:59 ` Avi Kivity
2008-01-15 14:33 ` Avi Kivity [this message]
[not found] ` <478CC448.1030901-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-15 14:50 ` Alexander Graf
[not found] ` <478CC819.3040106-r27SGEef+tmzQB+pC5nmwQ@public.gmane.org>
2008-01-15 15:09 ` Avi Kivity
[not found] ` <478CCCA9.2080300-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-15 16:42 ` Alexander Graf
[not found] ` <478CE277.9010109-r27SGEef+tmzQB+pC5nmwQ@public.gmane.org>
2008-01-15 17:46 ` Avi Kivity
[not found] ` <478CF186.5030304-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-15 18:29 ` Amit Shah
2008-01-16 5:51 ` Andi Kleen
[not found] ` <p73ir1ul3ls.fsf-KvMlXPVkKihbpigZmTR7Iw@public.gmane.org>
2008-01-16 8:46 ` Avi Kivity
[not found] ` <478DC453.1000404-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-16 13:34 ` Andi Kleen
[not found] ` <478E08E5.2030507@qumranet.com>
[not found] ` <478E08E5.2030507-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-16 13:54 ` Andi Kleen
[not found] ` <20080116135415.GA14664-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
2008-01-16 13:56 ` Avi Kivity
2008-01-17 18:43 ` Marcelo Tosatti
2008-01-17 18:56 ` Andi Kleen
2008-01-20 14:59 ` Avi Kivity
2008-05-04 15:40 ` Avi Kivity
2008-05-06 13:55 ` Avi Kivity
-- strict thread matches above, loose matches on Subject: below --
2008-01-12 3:11 Will Trives
2008-01-12 12:28 ` Marcelo Tosatti
2008-01-12 13:48 ` Will Trives
2008-01-12 20:51 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=478CC448.1030901@qumranet.com \
--to=avi-atkuwr5tajbwk0htik3j/w@public.gmane.org \
--cc=kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org \
--cc=marcelo-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox