From: Jan Kiszka <jan.kiszka@siemens.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org, qemu-devel@nongnu.org,
	Gleb Natapov <gleb@redhat.com>,
	Zachary Amsden <zamsden@redhat.com>
Subject: Re: [PATCH v3 07/10] qemu-kvm: Cleanup/fix TSC and PV clock writeback
Date: Thu, 25 Feb 2010 16:17:22 +0100
Message-ID: <4B869482.4030207@siemens.com>
In-Reply-To: <20100225150736.GA10458@amt.cnet>

Marcelo Tosatti wrote:
> On Thu, Feb 25, 2010 at 09:48:47AM +0100, Jan Kiszka wrote:
>> Marcelo Tosatti wrote:
>>> On Thu, Feb 25, 2010 at 12:58:26AM +0100, Jan Kiszka wrote:
>>>> Marcelo Tosatti wrote:
>>>>> On Thu, Feb 25, 2010 at 12:45:55AM +0100, Jan Kiszka wrote:
>>>>>> Marcelo Tosatti wrote:
>>>>>>> On Wed, Feb 24, 2010 at 03:17:55PM +0100, Jan Kiszka wrote:
>>>>>>>> Drop kvm_load_tsc in favor of level-dependent writeback in
>>>>>>>> kvm_arch_load_regs. KVM's PV clock MSRs fall in the same category and
>>>>>>>> should therefore only be written back on full sync.
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>> ---
>>>>>>>>  qemu-kvm-x86.c        |   19 +++++--------------
>>>>>>>>  qemu-kvm.h            |    4 ----
>>>>>>>>  target-i386/machine.c |    5 -----
>>>>>>>>  3 files changed, 5 insertions(+), 23 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
>>>>>>>> index 840c1c9..84fd7fa 100644
>>>>>>>> --- a/qemu-kvm-x86.c
>>>>>>>> +++ b/qemu-kvm-x86.c
>>>>>>>> @@ -965,8 +965,11 @@ void kvm_arch_load_regs(CPUState *env, int level)
>>>>>>>>          set_msr_entry(&msrs[n++], MSR_LSTAR  ,           env->lstar);
>>>>>>>>      }
>>>>>>>>  #endif
>>>>>>>> -    set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
>>>>>>>> -    set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
>>>>>>>> +    if (level == KVM_PUT_FULL_STATE) {
>>>>>>>> +        set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc);
>>>>>>>> +        set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr);
>>>>>>>> +        set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
>>>>>>>> +    }
>>>>>>> As things stand today, the TSC should only be written on migration. See
>>>>>>> 53f658b3c33616a4997ee254311b335e59063289 in the kernel.
>>>>>> Migration and power-up - that's what this patch ensures (=>
>>>>>> KVM_PUT_FULL_STATE). Or where do you see any problem?
>>>>>>
>>>>>> Jan
>>>>>>
>>>>> The problem is it should not write on power up (the kernel attempts
>>>>> to synchronize the TSCs in that case, see the commit).
>>>>>
>>>> OK, need to read this more carefully.
>>>>
>>>> I do not yet understand the difference from user space POV: it tries to
>>>> transfer the identical TSC values to all currently stopped VCPU threads.
>>> guest tsc = host tsc + offset
>>>
>>> So at the time you set_msr(TSC), the guest visible TSC starts ticking. 
>>> For SMP guests, this does not happen exactly at the same time for all
>>> vcpus.
>> Ouch.
>>
>>>> That should not be different if we are booting a fresh VM or loading a
>>>> complete state of a migrated image. If it does, it looks like a KVM
>>>> kernel deficit on first glance.
>>> Yes it is a deficit. After migration TSCs of SMP guests go out of sync.
>>> Zachary is working on that.
>>>
>> OK, so we need a workaround, ideally without reintroducing hooks. Is
>> this one acceptable?
>>
>>
>> diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
>> index 84fd7fa..285c05a 100644
>> --- a/qemu-kvm-x86.c
>> +++ b/qemu-kvm-x86.c
>> @@ -966,7 +966,15 @@ void kvm_arch_load_regs(CPUState *env, int level)
>>      }
>>  #endif
>>      if (level == KVM_PUT_FULL_STATE) {
>> -        set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc);
>> +        /*
>> +         * KVM is yet unable to synchronize TSC values of multiple VCPUs on
>> +         * writeback. Until this is fixed, we only write the offset to SMP
>> +         * guests after migration, desynchronizing the VCPUs, but avoiding
>> +         * huge jump-backs that would occur without any writeback at all.
>> +         */
>> +        if (smp_cpus == 1 || env->tsc != 0) {
>> +            set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc);
>> +        }
>>          set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr);
>>          set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
>>      }
> 
> Well, migration of the SMP TSCs is not precise, but it is better than
> zeroing the TSCs on migration. So something like a new state (migration
> only), or a hack that mimics that behaviour, is needed :(

It is not precise, but the above code shouldn't behave differently
compared to the existing one: tsc != 0 => we are loading values from
some VM that ran before, i.e. we are migrating.
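
To spell out the condition from the workaround above, here is a minimal
standalone sketch. The helper name should_write_tsc() and the example
function are hypothetical and not part of qemu-kvm; the sketch only
mirrors the "smp_cpus == 1 || env->tsc != 0" test and assumes that a
zero saved TSC means a freshly started VM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in qemu-kvm): decide whether the saved TSC
 * should be written back on a full state sync.
 */
bool should_write_tsc(int smp_cpus, uint64_t saved_tsc)
{
    /* Uniprocessor guest: there is no second TSC to fall out of sync. */
    if (smp_cpus == 1) {
        return true;
    }
    /*
     * SMP guest: a non-zero value is taken to mean the state comes from
     * a VM that ran before (migration); zero is treated as power-up.
     */
    return saved_tsc != 0;
}

/* Illustrative cases only; the TSC values are made up. */
void should_write_tsc_examples(void)
{
    assert(should_write_tsc(1, 0));        /* UP power-up: write */
    assert(!should_write_tsc(4, 0));       /* SMP power-up: skip */
    assert(should_write_tsc(4, 123456));   /* SMP migration: write */
}
```

With this, only the SMP power-up case skips the write and leaves the
TSCs to the kernel.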

If at all possible, I do not want to introduce a new writeback level
for an x86-only kernel quirk.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
