* RFC: VMX: initialize TSC offset relative to vm creation time
@ 2008-09-10 20:58 Marcelo Tosatti
2008-09-10 22:18 ` Glauber Costa
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Marcelo Tosatti @ 2008-09-10 20:58 UTC (permalink / raw)
To: kvm-devel; +Cc: David S. Ahern, Chris Wright, Glauber de Oliveira Costa
VMX initializes the TSC offset for each vcpu at a different time, and
also reinitializes it for vcpus other than 0 on the APIC SIPI message.
This bug causes the TSCs to appear unsynchronized in the guest, even if
the host TSCs are synchronized.
Older Linux kernels don't handle the situation very well, so
gettimeofday is likely to go backwards in time:
http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html
http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831
Fix it by initializing the offset of each vcpu relative to vm creation
time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the
APIC MP init path.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Index: kvm.tip/arch/x86/kvm/vmx.c
===================================================================
--- kvm.tip.orig/arch/x86/kvm/vmx.c
+++ kvm.tip/arch/x86/kvm/vmx.c
@@ -850,11 +850,8 @@ static u64 guest_read_tsc(void)
* writes 'guest_tsc' into guest's timestamp counter "register"
* guest_tsc = host_tsc + tsc_offset ==> tsc_offset = guest_tsc - host_tsc
*/
-static void guest_write_tsc(u64 guest_tsc)
+static void guest_write_tsc(u64 guest_tsc, u64 host_tsc)
{
- u64 host_tsc;
-
- rdtscll(host_tsc);
vmcs_write64(TSC_OFFSET, guest_tsc - host_tsc);
}
@@ -918,6 +915,7 @@ static int vmx_set_msr(struct kvm_vcpu *
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
struct kvm_msr_entry *msr;
+ u64 host_tsc;
int ret = 0;
switch (msr_index) {
@@ -943,7 +941,8 @@ static int vmx_set_msr(struct kvm_vcpu *
vmcs_writel(GUEST_SYSENTER_ESP, data);
break;
case MSR_IA32_TIME_STAMP_COUNTER:
- guest_write_tsc(data);
+ rdtscll(host_tsc);
+ guest_write_tsc(data, host_tsc);
break;
case MSR_P6_PERFCTR0:
case MSR_P6_PERFCTR1:
@@ -2202,6 +2201,7 @@ static int vmx_vcpu_setup(struct vcpu_vm
vmcs_writel(CR0_GUEST_HOST_MASK, ~0UL);
vmcs_writel(CR4_GUEST_HOST_MASK, KVM_GUEST_CR4_MASK);
+ guest_write_tsc(0, vmx->vcpu.kvm->arch.vm_init_tsc);
return 0;
}
@@ -2292,8 +2292,6 @@ static int vmx_vcpu_reset(struct kvm_vcp
vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0);
vmcs_write32(GUEST_PENDING_DBG_EXCEPTIONS, 0);
- guest_write_tsc(0);
-
/* Special registers */
vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
Index: kvm.tip/arch/x86/kvm/x86.c
===================================================================
--- kvm.tip.orig/arch/x86/kvm/x86.c
+++ kvm.tip/arch/x86/kvm/x86.c
@@ -4250,6 +4250,8 @@ struct kvm *kvm_arch_create_vm(void)
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);
+ rdtscll(kvm->arch.vm_init_tsc);
+
return kvm;
}
Index: kvm.tip/include/asm-x86/kvm_host.h
===================================================================
--- kvm.tip.orig/include/asm-x86/kvm_host.h
+++ kvm.tip/include/asm-x86/kvm_host.h
@@ -377,6 +377,7 @@ struct kvm_arch{
struct page *ept_identity_pagetable;
bool ept_identity_pagetable_done;
+ u64 vm_init_tsc;
};
struct kvm_vm_stat {
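The offset arithmetic behind the patch can be sketched outside the kernel. What follows is a minimal user-space model, not KVM code: the helper names and all TSC values are made up for illustration, and only the relation guest_tsc = host_tsc + tsc_offset is taken from the comment in vmx.c.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only (not KVM code): guest_tsc = host_tsc + tsc_offset. */
static uint64_t tsc_offset_for(uint64_t guest_tsc, uint64_t host_tsc)
{
    /* Unsigned subtraction wraps modulo 2^64; the addition on read undoes it. */
    return guest_tsc - host_tsc;
}

static uint64_t guest_tsc_read(uint64_t host_tsc_now, uint64_t tsc_offset)
{
    return host_tsc_now + tsc_offset;
}

/* Worked example with made-up host TSC values. */
static void tsc_offset_example(void)
{
    const uint64_t now = 2000;
    const uint64_t vm_init_tsc = 900;
    uint64_t off0, off1;

    /* Before: each vcpu samples the host TSC at its own setup time. */
    off0 = tsc_offset_for(0, 1000);
    off1 = tsc_offset_for(0, 1500);
    assert(guest_tsc_read(now, off0) == 1000);
    assert(guest_tsc_read(now, off1) == 500);   /* 500 cycles behind vcpu 0 */

    /* After: every vcpu uses the single value sampled at VM creation. */
    off0 = tsc_offset_for(0, vm_init_tsc);
    off1 = tsc_offset_for(0, vm_init_tsc);
    assert(guest_tsc_read(now, off0) == guest_tsc_read(now, off1));
    assert(guest_tsc_read(now, off0) == 1100);
}
```

In this model the skew between vcpus equals the difference between the host TSC values used as reference, so a single shared reference gives zero skew on a host whose TSCs are synchronized.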
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-09-10 20:58 RFC: VMX: initialize TSC offset relative to vm creation time Marcelo Tosatti
@ 2008-09-10 22:18 ` Glauber Costa
2008-09-11 8:32 ` Marcelo Tosatti
2008-09-11 4:58 ` David S. Ahern
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Glauber Costa @ 2008-09-10 22:18 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: kvm-devel, David S. Ahern, Chris Wright,
Glauber de Oliveira Costa
On Wed, Sep 10, 2008 at 05:58:42PM -0300, Marcelo Tosatti wrote:
>
> VMX initializes the TSC offset for each vcpu at different times, and
> also reinitializes it for vcpus other than 0 on APIC SIPI message.
>
> This bug causes the TSC's to appear unsynchronized in the guest, even if
> the host is good.
>
> Older Linux kernels don't handle the situation very well, so
> gettimeofday is likely to go backwards in time:
>
> http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html
> http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831
>
> Fix it by initializing the offset of each vcpu relative to vm creation
> time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the
> APIC MP init path.
How does it work if # vcpus > # pcpus? I remember that when I tried it, the biggest problem was
the case in which all cpus tried to sync, but they naturally wrote the value "0" at different points in time
(for obvious reasons), and so would still appear unsynchronized to the guest.
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-09-10 20:58 RFC: VMX: initialize TSC offset relative to vm creation time Marcelo Tosatti
2008-09-10 22:18 ` Glauber Costa
@ 2008-09-11 4:58 ` David S. Ahern
2008-09-13 4:55 ` Avi Kivity
2008-10-13 13:12 ` David S. Ahern
3 siblings, 0 replies; 11+ messages in thread
From: David S. Ahern @ 2008-09-11 4:58 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm-devel, Chris Wright, Glauber de Oliveira Costa
Hi Marcelo:
Dramatic improvement. The following is an example with kvm-75 and this
patch. Without cpu affinity from a kvm perspective (vcpu-to-pcpu):
cpu 0: 1221107886.020298
cpu 1: 1221107886.020290 *
cpu 2: 1221107886.020555
cpu 3: 1221107886.020549 *
cpu 0: 1221107887.030244
cpu 1: 1221107887.030236 *
cpu 2: 1221107887.030498
cpu 3: 1221107887.030493 *
cpu 0: 1221107888.040248
cpu 1: 1221107888.040262
cpu 2: 1221107888.040314
cpu 3: 1221107888.040470
cpu 0: 1221107889.050305
cpu 1: 1221107889.050300 *
cpu 2: 1221107889.050354
cpu 3: 1221107889.050394
cpu 0: 1221107890.060384
cpu 1: 1221107890.060489
cpu 2: 1221107890.060753
cpu 3: 1221107890.060918
cpu 0: 1221107891.083559
cpu 1: 1221107891.083558 *
cpu 2: 1221107891.083614
cpu 3: 1221107891.083613 *
cpu 0: 1221107892.091705
cpu 1: 1221107892.091699 *
cpu 2: 1221107892.092998
cpu 3: 1221107892.093011
With vcpu-to-pcpu affinity set well after guest startup, tracking is a bit
better (fewer backward jumps in time).
I do not believe there's a way to set affinity as kvm/qemu threads are
spawned (short of modifying qemu).
As before, RHEL3 guest. DL380G5 host.
david
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-09-10 22:18 ` Glauber Costa
@ 2008-09-11 8:32 ` Marcelo Tosatti
0 siblings, 0 replies; 11+ messages in thread
From: Marcelo Tosatti @ 2008-09-11 8:32 UTC (permalink / raw)
To: Glauber Costa
Cc: kvm-devel, David S. Ahern, Chris Wright,
Glauber de Oliveira Costa
On Wed, Sep 10, 2008 at 07:18:43PM -0300, Glauber Costa wrote:
> On Wed, Sep 10, 2008 at 05:58:42PM -0300, Marcelo Tosatti wrote:
> >
> > VMX initializes the TSC offset for each vcpu at different times, and
> > also reinitializes it for vcpus other than 0 on APIC SIPI message.
> >
> > This bug causes the TSC's to appear unsynchronized in the guest, even if
> > the host is good.
> >
> > Older Linux kernels don't handle the situation very well, so
> > gettimeofday is likely to go backwards in time:
> >
> > http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html
> > http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831
> >
> > Fix it by initializing the offset of each vcpu relative to vm creation
> > time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the
> > APIC MP init path.
>
> How does it work if # vcpu > # pcpus? I remember that when I tried it, the big biting dog were
> cases in which all cpus tried to sync, but they naturally put the value "0" in different points of time
> (for obvious reasons), and would still appear unsynchronized to the guests.
Seems to work fine. All vcpus now put the value "0" relative to VM
creation.
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-09-10 20:58 RFC: VMX: initialize TSC offset relative to vm creation time Marcelo Tosatti
2008-09-10 22:18 ` Glauber Costa
2008-09-11 4:58 ` David S. Ahern
@ 2008-09-13 4:55 ` Avi Kivity
2008-10-27 23:42 ` Marcelo Tosatti
2008-10-13 13:12 ` David S. Ahern
3 siblings, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2008-09-13 4:55 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: kvm-devel, David S. Ahern, Chris Wright,
Glauber de Oliveira Costa
Marcelo Tosatti wrote:
> VMX initializes the TSC offset for each vcpu at different times, and
> also reinitializes it for vcpus other than 0 on APIC SIPI message.
>
> This bug causes the TSC's to appear unsynchronized in the guest, even if
> the host is good.
>
> Older Linux kernels don't handle the situation very well, so
> gettimeofday is likely to go backwards in time:
>
> http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html
> http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831
>
> Fix it by initializing the offset of each vcpu relative to vm creation
> time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the
> APIC MP init path.
>
>
>
This is good in principle, but we need to detect if we're on a multiple
board host (or a host with unsynced tscs) and do something else in that
case.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-09-10 20:58 RFC: VMX: initialize TSC offset relative to vm creation time Marcelo Tosatti
` (2 preceding siblings ...)
2008-09-13 4:55 ` Avi Kivity
@ 2008-10-13 13:12 ` David S. Ahern
3 siblings, 0 replies; 11+ messages in thread
From: David S. Ahern @ 2008-10-13 13:12 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm-devel, Chris Wright, Glauber de Oliveira Costa
Marcelo:
Do you have a similar patch/idea for AMD?
Same program as before: it sets affinity to run on vcpu 0 and calls
gettimeofday(), repeats for vcpu 1, and so on up to the highest vcpu,
then calls sleep(1) and repeats the whole sequence.
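The check that such a program applies to consecutive readings can be sketched as follows. This is not the original test program, which is not shown in this thread: struct sample and both functions are hypothetical names, the pinning and sampling (for example with sched_setaffinity() and gettimeofday()) are left to the caller, and it is an assumption that the asterisks in the output mark readings earlier than the one before them.

```c
#include <assert.h>
#include <stddef.h>

/* One gettimeofday()-style reading taken while pinned to a given vcpu. */
struct sample {
    int cpu;
    long sec;
    long usec;
};

/* Returns nonzero if 'cur' is earlier than 'prev'. */
static int went_backwards(const struct sample *prev, const struct sample *cur)
{
    if (cur->sec != prev->sec)
        return cur->sec < prev->sec;
    return cur->usec < prev->usec;
}

/*
 * Sets flags[i] to 1 where sample i is earlier than sample i-1 and to 0
 * otherwise; returns the number of backward steps found.
 */
static size_t flag_backward_steps(const struct sample *s, size_t n, int *flags)
{
    size_t i, count = 0;

    assert(s != NULL && flags != NULL);
    for (i = 0; i < n; i++) {
        /* The first sample has no predecessor, so it is never flagged. */
        flags[i] = (i > 0) && went_backwards(&s[i - 1], &s[i]);
        count += (size_t)flags[i];
    }
    return count;
}
```

Feeding the readings in the order they were taken, across sleep(1) boundaries as well, lets a backward step from the last vcpu of one round to vcpu 0 of the next be flagged.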
So in the following example output the process calls sleep with affinity
set to vcpu3, and on wake sets it to vcpu0 and then calls gettimeofday.
The result is a backward jump in time going from vcpu3 to vcpu0 and then
a forward jump from vcpu0 to vcpu1:
cpu 0: 1223902798.704804 *
cpu 1: 1223902799.824095
cpu 2: 1223902799.824139
cpu 3: 1223902799.824198
(sleep 1)
cpu 0: 1223902799.714804 *
cpu 1: 1223902800.834148
cpu 2: 1223902800.834190
cpu 3: 1223902800.834231
(sleep 1)
cpu 0: 1223902800.724863 *
cpu 1: 1223902801.844156
cpu 2: 1223902801.844234
cpu 3: 1223902801.844278
...
david
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-09-13 4:55 ` Avi Kivity
@ 2008-10-27 23:42 ` Marcelo Tosatti
2008-10-28 18:36 ` David S. Ahern
0 siblings, 1 reply; 11+ messages in thread
From: Marcelo Tosatti @ 2008-10-27 23:42 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm-devel, David S. Ahern, Chris Wright,
Glauber de Oliveira Costa
On Sat, Sep 13, 2008 at 07:55:02AM +0300, Avi Kivity wrote:
> Marcelo Tosatti wrote:
> > VMX initializes the TSC offset for each vcpu at different times, and
> > also reinitializes it for vcpus other than 0 on APIC SIPI message.
> >
> > This bug causes the TSC's to appear unsynchronized in the guest, even if
> > the host is good.
> >
> > Older Linux kernels don't handle the situation very well, so
> > gettimeofday is likely to go backwards in time:
> >
> > http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html
> > http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831
> >
> > Fix it by initializing the offset of each vcpu relative to vm creation
> > time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the
> > APIC MP init path.
> >
> >
> >
>
> This is good in principle, but we need to detect if we're on a multiple
> board host (or a host with unsynced tscs) and do something else in that
> case.
I think this is a separate, and difficult, problem. For instance older
Linux guests that correct the TSC across CPU's are broken at the moment
in the unsynced TSC case.
That is, the fact that KVM does not handle unsynced TSC's on the host is
not an argument against this patch which clearly fixes a bug.
Take commit 019960ae9933161c2809fa4ee608ba30d9639fd2 for example.
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-10-27 23:42 ` Marcelo Tosatti
@ 2008-10-28 18:36 ` David S. Ahern
2008-10-30 10:20 ` Marcelo Tosatti
2008-10-30 10:34 ` Avi Kivity
0 siblings, 2 replies; 11+ messages in thread
From: David S. Ahern @ 2008-10-28 18:36 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Avi Kivity, kvm-devel, Chris Wright, Glauber de Oliveira Costa,
Benjamin Serebrin
Marcelo Tosatti wrote:
> On Sat, Sep 13, 2008 at 07:55:02AM +0300, Avi Kivity wrote:
>> Marcelo Tosatti wrote:
>>> VMX initializes the TSC offset for each vcpu at different times, and
>>> also reinitializes it for vcpus other than 0 on APIC SIPI message.
>>>
>>> This bug causes the TSC's to appear unsynchronized in the guest, even if
>>> the host is good.
>>>
>>> Older Linux kernels don't handle the situation very well, so
>>> gettimeofday is likely to go backwards in time:
>>>
>>> http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html
>>> http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831
>>>
>>> Fix it by initializing the offset of each vcpu relative to vm creation
>>> time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the
>>> APIC MP init path.
>>>
>>>
>>>
>> This is good in principle, but we need to detect if we're on a multiple
>> board host (or a host with unsynced tscs) and do something else in that
>> case.
>
> I think this is a separate, and difficult, problem. For instance older
> Linux guests that correct the TSC across CPU's are broken at the moment
> in the unsynced TSC case.
>
> That is, the fact that KVM does not handle unsynced TSC's on the host is
> not an argument against this patch which clearly fixes a bug.
>
> Take commit 019960ae9933161c2809fa4ee608ba30d9639fd2 for example.
>
Has anything changed "recently" with the TSC code? Recently here being
the past 2 months since you first crafted the patch. I ask because in
the past few runs based on kvm.git trees (e.g., as recently as a pull on
10/26), this tsc offset patch no longer fixes the problem.
The following one does fix the problem with kvm.git pulled on 10/26/08:
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 64e2439..d5da717 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -860,7 +860,7 @@ static void guest_write_tsc(u64 guest_tsc)
u64 host_tsc;
rdtscll(host_tsc);
- vmcs_write64(TSC_OFFSET, guest_tsc - host_tsc);
+ vmcs_write64(TSC_OFFSET, 0);
}
/*
This is the vmx counterpart (at least as I understand it) of a
suggestion Ben had for the svm code.
david
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-10-28 18:36 ` David S. Ahern
@ 2008-10-30 10:20 ` Marcelo Tosatti
2008-10-30 14:00 ` David S. Ahern
2008-10-30 10:34 ` Avi Kivity
1 sibling, 1 reply; 11+ messages in thread
From: Marcelo Tosatti @ 2008-10-30 10:20 UTC (permalink / raw)
To: David S. Ahern
Cc: Avi Kivity, kvm-devel, Chris Wright, Glauber de Oliveira Costa,
Benjamin Serebrin
On Tue, Oct 28, 2008 at 12:36:14PM -0600, David S. Ahern wrote:
>
> > That is, the fact that KVM does not handle unsynced TSC's on the host is
> > not an argument against this patch which clearly fixes a bug.
> >
> > Take commit 019960ae9933161c2809fa4ee608ba30d9639fd2 for example.
> >
>
> Has anything changed "recently" with the TSC code? Recently here being
> the past 2 months since you first crafted the patch. I ask because in
> the past few runs based on kvm.git trees (e.g., as recently as a pull on
> 10/26), this tsc offset patch no longer fixes the problem.
Hi David,
Can you share showtime output? Works for me.
>
> The following one does fix the problem with kvm.git pulled on 10/26/08:
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 64e2439..d5da717 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -860,7 +860,7 @@ static void guest_write_tsc(u64 guest_tsc)
> u64 host_tsc;
>
> rdtscll(host_tsc);
> - vmcs_write64(TSC_OFFSET, guest_tsc - host_tsc);
> + vmcs_write64(TSC_OFFSET, 0);
> }
>
> /*
>
> This is the vmx counterpart (or at least to my understanding) to a
> suggestion Ben had for the svm code.
>
> david
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-10-28 18:36 ` David S. Ahern
2008-10-30 10:20 ` Marcelo Tosatti
@ 2008-10-30 10:34 ` Avi Kivity
1 sibling, 0 replies; 11+ messages in thread
From: Avi Kivity @ 2008-10-30 10:34 UTC (permalink / raw)
To: David S. Ahern
Cc: Marcelo Tosatti, Avi Kivity, kvm-devel, Chris Wright,
Glauber de Oliveira Costa, Benjamin Serebrin
David S. Ahern wrote:
> Has anything changed "recently" with the TSC code? Recently here being
> the past 2 months since you first crafted the patch. I ask because in
> the past few runs based on kvm.git trees (e.g., as recently as a pull on
> 10/26), this tsc offset patch no longer fixes the problem.
>
> The following one does fix the problem with kvm.git pulled on 10/26/08:
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 64e2439..d5da717 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -860,7 +860,7 @@ static void guest_write_tsc(u64 guest_tsc)
> u64 host_tsc;
>
> rdtscll(host_tsc);
> - vmcs_write64(TSC_OFFSET, guest_tsc - host_tsc);
> + vmcs_write64(TSC_OFFSET, 0);
> }
>
That's a bit heavy-handed: it doesn't start the guest tsc from zero and
doesn't allow the guest to adjust its tsc.
But it does work for the case where the tscs are synced.
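A small user-space model may make the trade-off concrete. It is only a sketch: struct vcpu_model and both helpers are hypothetical names, and the numbers are invented. With the offset forced to 0 the guest simply reads the host TSC, so vcpus agree whenever the host TSCs agree, but a guest write has no effect.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: a vcpu whose TSC offset is always forced to 0. */
struct vcpu_model {
    uint64_t tsc_offset;
};

/* Mirrors the quoted change: the requested guest value is ignored. */
static void write_tsc_offset_zero(struct vcpu_model *v, uint64_t guest_tsc)
{
    (void)guest_tsc;
    v->tsc_offset = 0;
}

static uint64_t read_guest_tsc(const struct vcpu_model *v, uint64_t host_tsc_now)
{
    return host_tsc_now + v->tsc_offset;
}

static void offset_zero_example(void)
{
    struct vcpu_model v0 = { 0 }, v1 = { 0 };

    /* The guest asks for TSC = 0 while the host TSC is at 5000. */
    write_tsc_offset_zero(&v0, 0);
    write_tsc_offset_zero(&v1, 0);

    /* The guest counter does not start from zero... */
    assert(read_guest_tsc(&v0, 5000) == 5000);
    /* ...but both vcpus agree as long as the host TSCs agree. */
    assert(read_guest_tsc(&v0, 5000) == read_guest_tsc(&v1, 5000));

    /* A later guest adjustment is silently dropped. */
    write_tsc_offset_zero(&v0, 123);
    assert(read_guest_tsc(&v0, 6000) == 6000);
}
```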
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
* Re: RFC: VMX: initialize TSC offset relative to vm creation time
2008-10-30 10:20 ` Marcelo Tosatti
@ 2008-10-30 14:00 ` David S. Ahern
0 siblings, 0 replies; 11+ messages in thread
From: David S. Ahern @ 2008-10-30 14:00 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Avi Kivity, kvm-devel, Chris Wright, Glauber de Oliveira Costa,
Benjamin Serebrin
Marcelo Tosatti wrote:
> On Tue, Oct 28, 2008 at 12:36:14PM -0600, David S. Ahern wrote:
>>> That is, the fact that KVM does not handle unsynced TSC's on the host is
>>> not an argument against this patch which clearly fixes a bug.
>>>
>>> Take commit 019960ae9933161c2809fa4ee608ba30d9639fd2 for example.
>>>
>> Has anything changed "recently" with the TSC code? Recently here being
>> the past 2 months since you first crafted the patch. I ask because in
>> the past few runs based on kvm.git trees (e.g., as recently as a pull on
>> 10/26), this tsc offset patch no longer fixes the problem.
>
> Hi David,
>
> Can you share showtime output? Works for me.
>
Hi Marcelo:
I pulled kvm.git this morning and ran three cases:
1. kvm.git with no patches,
2. kvm.git with your TSC offset patch from September 10th,
3. kvm.git with TSC offset set to 0.
In all cases the host is a DL380G5, Fedora 9 OS, kvm-77 userspace. Guest
is running RHEL3U8. 3 samples for each case:
1. kvm.git, no patches:
cpu 0: 1225374376.351910 *
cpu 1: 1225374376.598833
cpu 2: 1225374378.154530
cpu 3: 1225374377.874563 *
sleeping 1 with affinity set to 0x8
cpu 0: 1225374377.361762 *
cpu 1: 1225374377.608669
cpu 2: 1225374379.164366
cpu 3: 1225374378.884393 *
sleeping 1 with affinity set to 0x8
cpu 0: 1225374378.371607 *
cpu 1: 1225374378.618517
cpu 2: 1225374380.174213
cpu 3: 1225374379.894246 *
2. kvm.git, Marcelo patch
cpu 0: 1225374671.069711
cpu 1: 1225374671.069711
cpu 2: 1225374671.069804
cpu 3: 1225374671.069761 *
sleeping 1 with affinity set to 0x8
cpu 0: 1225374672.079221
cpu 1: 1225374672.079220 *
cpu 2: 1225374672.079309
cpu 3: 1225374672.079267 *
sleeping 1 with affinity set to 0x8
cpu 0: 1225374673.088703
cpu 1: 1225374673.088701 *
cpu 2: 1225374673.088802
cpu 3: 1225374673.088763 *
3. tsc offset 0
cpu 0: 1225374910.953226
cpu 1: 1225374910.953307
cpu 2: 1225374910.953355
cpu 3: 1225374910.953446
sleeping 1 with affinity set to 0x8
cpu 0: 1225374911.962735
cpu 1: 1225374911.962808
cpu 2: 1225374911.962857
cpu 3: 1225374911.962949
sleeping 1 with affinity set to 0x8
cpu 0: 1225374912.972211
cpu 1: 1225374912.972284
cpu 2: 1225374912.972333
cpu 3: 1225374912.972425
I'll repeat the test later on a PowerEdge 2950 with a similar setup, but
it has the same processor as the DL380G5.
david
Thread overview: 11+ messages
2008-09-10 20:58 RFC: VMX: initialize TSC offset relative to vm creation time Marcelo Tosatti
2008-09-10 22:18 ` Glauber Costa
2008-09-11 8:32 ` Marcelo Tosatti
2008-09-11 4:58 ` David S. Ahern
2008-09-13 4:55 ` Avi Kivity
2008-10-27 23:42 ` Marcelo Tosatti
2008-10-28 18:36 ` David S. Ahern
2008-10-30 10:20 ` Marcelo Tosatti
2008-10-30 14:00 ` David S. Ahern
2008-10-30 10:34 ` Avi Kivity
2008-10-13 13:12 ` David S. Ahern