From: Jan Kiszka <jan.kiszka@web.de>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org, qemu-devel@nongnu.org,
	Gleb Natapov <gleb@redhat.com>,
	Zachary Amsden <zamsden@redhat.com>
Subject: Re: [PATCH v3 07/10] qemu-kvm: Cleanup/fix TSC and PV clock writeback
Date: Thu, 25 Feb 2010 09:48:47 +0100
Message-ID: <4B86396F.8070208@web.de>
In-Reply-To: <20100225035814.GA470@amt.cnet>

Marcelo Tosatti wrote:
> On Thu, Feb 25, 2010 at 12:58:26AM +0100, Jan Kiszka wrote:
>> Marcelo Tosatti wrote:
>>> On Thu, Feb 25, 2010 at 12:45:55AM +0100, Jan Kiszka wrote:
>>>> Marcelo Tosatti wrote:
>>>>> On Wed, Feb 24, 2010 at 03:17:55PM +0100, Jan Kiszka wrote:
>>>>>> Drop kvm_load_tsc in favor of level-dependent writeback in
>>>>>> kvm_arch_load_regs. KVM's PV clock MSRs fall in the same category and
>>>>>> should therefore only be written back on full sync.
>>>>>>
>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>> ---
>>>>>>  qemu-kvm-x86.c        |   19 +++++--------------
>>>>>>  qemu-kvm.h            |    4 ----
>>>>>>  target-i386/machine.c |    5 -----
>>>>>>  3 files changed, 5 insertions(+), 23 deletions(-)
>>>>>>
>>>>>> diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
>>>>>> index 840c1c9..84fd7fa 100644
>>>>>> --- a/qemu-kvm-x86.c
>>>>>> +++ b/qemu-kvm-x86.c
>>>>>> @@ -965,8 +965,11 @@ void kvm_arch_load_regs(CPUState *env, int level)
>>>>>>          set_msr_entry(&msrs[n++], MSR_LSTAR  ,           env->lstar);
>>>>>>      }
>>>>>>  #endif
>>>>>> -    set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
>>>>>> -    set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
>>>>>> +    if (level == KVM_PUT_FULL_STATE) {
>>>>>> +        set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc);
>>>>>> +        set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr);
>>>>>> +        set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
>>>>>> +    }
>>>>> As things stand today, the TSC should only be written on migration. See
>>>>> 53f658b3c33616a4997ee254311b335e59063289 in the kernel.
>>>> Migration and power-up - that's what this patch ensures (=>
>>>> KVM_PUT_FULL_STATE). Or where do you see any problem?
>>>>
>>>> Jan
>>>>
>>> The problem is it should not write on power up (the kernel attempts
>>> to synchronize the TSCs in that case, see the commit).
>>>
>> OK, need to read this more carefully.
>>
>> I do not yet understand the difference from user space POV: it tries to
>> transfer the identical TSC values to all currently stopped VCPU threads.
> 
> guest tsc = host tsc + offset
> 
> So at the time you set_msr(TSC), the guest visible TSC starts ticking. 
> For SMP guests, this does not happen exactly at the same time for all
> vcpus.

Ouch.

> 
>> That should not be different if we are booting a fresh VM or loading a
>> complete state of a migrated image. If it does, it looks like a KVM
>> kernel deficit on first glance.
> 
> Yes it is a deficit. After migration TSCs of SMP guests go out of sync.
> Zachary is working on that.
> 

OK, so we need a workaround, ideally without reintroducing hooks. Is
this one acceptable?


diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 84fd7fa..285c05a 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -966,7 +966,15 @@ void kvm_arch_load_regs(CPUState *env, int level)
     }
 #endif
     if (level == KVM_PUT_FULL_STATE) {
-        set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc);
+        /*
+         * KVM is not yet able to synchronize TSC values of multiple VCPUs on
+         * writeback. Until this is fixed, we only write the offset to SMP
+         * guests after migration, desynchronizing the VCPUs, but avoiding
+         * huge jump-backs that would occur without any writeback at all.
+         */
+        if (smp_cpus == 1 || env->tsc != 0) {
+            set_msr_entry(&msrs[n++], MSR_IA32_TSC, env->tsc);
+        }
         set_msr_entry(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr);
         set_msr_entry(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
     }
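
The condition in the hunk above can be read as a small predicate. The following is only a sketch for illustration: tsc_writeback_wanted() is a hypothetical helper that does not exist in qemu-kvm, and it assumes that a saved TSC of 0 indicates a VCPU coming from reset or power-up rather than from a loaded or migrated state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of qemu-kvm): mirrors the condition
 * "smp_cpus == 1 || env->tsc != 0" from the hunk above.
 *
 * Assumption: saved_tsc == 0 means the VCPU state comes from reset or
 * power-up, so the kernel should be left to synchronize the TSCs itself.
 */
static bool tsc_writeback_wanted(int smp_cpus, uint64_t saved_tsc)
{
    assert(smp_cpus >= 1);

    /* Uniprocessor guests have nothing to desynchronize: always write. */
    if (smp_cpus == 1) {
        return true;
    }
    /* SMP guests: write only when a non-zero (loaded) TSC is present. */
    return saved_tsc != 0;
}
```

Under these assumptions, an SMP guest at power-up (saved TSC 0) skips the write, while a uniprocessor guest or any guest with a restored non-zero TSC still gets it.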


Jan
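
The relation quoted above (guest tsc = host tsc + offset) explains why writing the same value to several VCPUs at slightly different times leaves them out of sync. Below is a toy model of that effect, not KVM code: struct toy_vcpu and the toy_* functions are made up for illustration, and the host TSC is passed in as a plain number instead of being read from hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: guest_tsc = host_tsc + tsc_offset. */
struct toy_vcpu {
    int64_t tsc_offset;
};

/*
 * Writing the guest TSC pins the offset against the host TSC value at the
 * moment of the write; from then on the guest TSC ticks with the host.
 */
static void toy_write_tsc(struct toy_vcpu *v, uint64_t host_tsc_now,
                          uint64_t guest_tsc)
{
    assert(v != NULL);
    v->tsc_offset = (int64_t)(guest_tsc - host_tsc_now);
}

/* Guest-visible TSC at a given host TSC value. */
static uint64_t toy_read_tsc(const struct toy_vcpu *v, uint64_t host_tsc_now)
{
    return host_tsc_now + (uint64_t)v->tsc_offset;
}

/* Difference between two VCPUs' guest TSCs at the same host instant. */
static int64_t toy_skew(const struct toy_vcpu *a, const struct toy_vcpu *b,
                        uint64_t host_tsc_now)
{
    return (int64_t)(toy_read_tsc(a, host_tsc_now) -
                     toy_read_tsc(b, host_tsc_now));
}
```

In this model, writing the value 5000 to one VCPU at host time 1000 and to a second VCPU at host time 1300 leaves the two permanently 300 ticks apart, which is the desynchronization described for SMP guests.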


