public inbox for kvm@vger.kernel.org
* Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
@ 2007-12-18 17:20 Avi Kivity
       [not found] ` <47680173.6060606-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Avi Kivity @ 2007-12-18 17:20 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner; +Cc: kvm-devel, linux-kernel

Booting RHEL 5 i386 in kvm with -no-kvm-irqchip -smp 4 will hang in 
udev.  I bisected this to a change in the _guest_ kernel:

> commit 95492e4646e5de8b43d9a7908d6177fb737b61f0
> Author: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
> Date:   Fri Feb 16 01:27:34 2007 -0800
>
>     [PATCH] x86: rewrite SMP TSC sync code
>
>     make the TSC synchronization code more robust, and unify it between
>     x86_64 and i386.
>
>     The biggest change is the removal of the 'fix up TSCs' code on x86_64
>     and i386, in some rare cases it was /causing/ time-warps on SMP systems.
>
>     The new code only checks for TSC asynchronity - and if it can prove a
>     time-warp (if it can observe the TSC going backwards when going from
>     one CPU to another within a critical section), then the TSC
>     clock-source is turned off.
>
>     The TSC synchronization-checking code also got moved into a separate
>     file.

So, guest kernels prior to this commit will hang in kvm smp; after this 
commit they will boot fine.
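
The check that the commit message describes (watch for the TSC going
backwards from one CPU to the next inside a critical section) can be
sketched in user-space C. This is only an illustration under stated
assumptions, not the kernel's code: `struct warp_state` and
`warp_check()` are hypothetical names, the TSC samples are passed in as
plain integers, and the caller is assumed to serialize the calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state shared by all CPUs taking part in the check. */
struct warp_state {
    uint64_t last_tsc;   /* most recent TSC value seen by any CPU */
    uint64_t max_warp;   /* largest backwards step observed so far */
    unsigned nr_warps;   /* number of backwards steps observed */
};

/*
 * Record one TSC sample. The caller is assumed to hold the lock that
 * forms the critical section, so samples arrive in real-time order.
 * Returns 1 if this sample is older than the previous one (a warp).
 */
static int warp_check(struct warp_state *s, uint64_t now)
{
    int warped = 0;

    assert(s != NULL);
    if (now < s->last_tsc) {
        uint64_t warp = s->last_tsc - now;

        if (warp > s->max_warp)
            s->max_warp = warp;
        s->nr_warps++;
        warped = 1;
    }
    s->last_tsc = now;
    return warped;
}
```

A non-zero `nr_warps` after such a run would be the "proof" of a
time-warp that leads to the TSC clock-source being turned off.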

While the change mentions that it fixes a time warp bug, it also says it 
should be rare.  So clearly kvm smp tsc handling is buggy.  Ingo/Thomas, 
(or anybody else), do you have any insight as to what kvm can be doing 
wrong to trigger this behavior?


-- 
error compiling committee.c: too many arguments to function


-------------------------------------------------------------------------
SF.Net email is sponsored by:
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services
for just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found] ` <47680173.6060606-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-12-18 22:19   ` Ingo Molnar
       [not found]     ` <20071218221930.GA26109-X9Un+BFzKDI@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2007-12-18 22:19 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, linux-kernel


* Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:

> Booting RHEL 5 i386 in kvm with -no-kvm-irqchip -smp 4 will hang in udev.  
> I bisected this to a change in the _guest_ kernel:
>
>> commit 95492e4646e5de8b43d9a7908d6177fb737b61f0
>> Author: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
>> Date:   Fri Feb 16 01:27:34 2007 -0800
>>
>>     [PATCH] x86: rewrite SMP TSC sync code
>>
>>     make the TSC synchronization code more robust, and unify it between
>>     x86_64 and i386.
>>
>>     The biggest change is the removal of the 'fix up TSCs' code on x86_64
>>     and i386, in some rare cases it was /causing/ time-warps on SMP systems.
>>
>>     The new code only checks for TSC asynchronity - and if it can prove a
>>     time-warp (if it can observe the TSC going backwards when going from
>>     one CPU to another within a critical section), then the TSC
>>     clock-source is turned off.
>>
>>     The TSC synchronization-checking code also got moved into a separate
>>     file.
>
> So, guest kernels prior to this commit will hang in kvm smp; after this 
> commit they will boot fine.
>
> While the change mentions that it fixes a time warp bug, it also says 
> it should be rare.  So clearly kvm smp tsc handling is buggy.  
> Ingo/Thomas, (or anybody else), do you have any insight as to what kvm 
> can be doing wrong to trigger this behavior?

hm. Those time warps were really small, due to the small imperfections 
in the "sync up all CPUs to the same moment and do a WRMSR to clear all 
their TSCs" mechanism. I.e. at most a few usec time warps. I really don't 
know how that should result in udevd hanging. Can you debug udevd in any 
way?

so the only thing that KVM might be doing incorrectly here is the 
emulation of the WRMSR that clears the TSC of each vcpu?

	Ingo


* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]     ` <20071218221930.GA26109-X9Un+BFzKDI@public.gmane.org>
@ 2007-12-19  6:33       ` Avi Kivity
       [not found]         ` <4768BB43.1000609-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Avi Kivity @ 2007-12-19  6:33 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: kvm-devel, linux-kernel

Ingo Molnar wrote:
>> While the change mentions that it fixes a time warp bug, it also says 
>> it should be rare.  So clearly kvm smp tsc handling is buggy.  
>> Ingo/Thomas, (or anybody else), do you have any insight as to what kvm 
>> can be doing wrong to trigger this behavior?
>>     
>
> hm. Those time warps were really small, due to the small imperfections 
> in the "sync up all CPUs to the same moment and do a WRMSR to clear all 
> their TSCs" mechanism. I.e. at most a few usec time warps. I really don't 
> know how that should result in udevd hanging. Can you debug udevd in any 
> way?
>
>   

Adding debug didn't help.  I'll try some sysrq keys to see what the 
guest thinks is happening.

> so the only thing that KVM might be doing incorrectly here is the 
> emulation of the WRMSR that clears the TSC of each vcpu?
>   

By inspection, it is correct.  Of course I may be missing something, so 
I'll write a unit test for it.  It should also be much slower than the 
native wrmsr.

-- 
error compiling committee.c: too many arguments to function



* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]         ` <4768BB43.1000609-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-12-19 11:19           ` Avi Kivity
  2007-12-19 11:39             ` [kvm-devel] " Avi Kivity
  0 siblings, 1 reply; 16+ messages in thread
From: Avi Kivity @ 2007-12-19 11:19 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: kvm-devel, linux-kernel

Avi Kivity wrote:
> Ingo Molnar wrote:
>   
>>> While the change mentions that it fixes a time warp bug, it also says 
>>> it should be rare.  So clearly kvm smp tsc handling is buggy.  
>>> Ingo/Thomas, (or anybody else), do you have any insight as to what kvm 
>>> can be doing wrong to trigger this behavior?
>>>     
>>>       
>> hm. Those time warps were really small, due to the small imperfections 
>> in the "sync up all CPUs to the same moment and do a WRMSR to clear all 
>> their TSCs" mechanism. I.e. at most a few usec time warps. I really don't 
>> know how that should result in udevd hanging. Can you debug udevd in any 
>> way?
>>
>>   
>>     
>
> Adding debug didn't help.  I'll try some sysrq keys to see what the 
> guest thinks is happening.
>
>   

many udev children are exiting; udevd itself is sleeping:

> udevd         S D5DCDF24  2924   573    372   594     629   535 (NOTLB)
>        d5dcdf38 00000086 00000002 d5dcdf24 d5dcdf20 00000000 d5dcdefc d6169f68
>        d7db7f68 d5dcdf68 00000001 d5dd7560 c13b8a90 749ae8d2 00000002 000326a1
>        d5dd7684 c131c700 00000003 d74f8900 892d6946 00000402 ffffffff 00000000
> Call Trace:
>  [<c060d2c9>] do_nanosleep+0x3b/0x66
>  [<c0439b20>] hrtimer_nanosleep+0x50/0x106
>  [<c04397ee>] hrtimer_wakeup+0x0/0x18
>  [<c0439c1f>] sys_nanosleep+0x49/0x59
>  [<c0404e4c>] syscall_call+0x7/0xb
>  [<c0600000>] xfrm_state_find+0x49f/0x51e

So likely sleeping is screwed up somehow (though only on smp).


>> so the only thing that KVM might be doing incorrectly here is the 
>> emulation of the WRMSR that clears the TSC of each vcpu?
>>   
>>     
>
> By inspection, it is correct.  Of course I may be missing something, so 
> I'll write a unit test for it.  It should also be much slower than the 
> native wrmsr.
>
>   

Testing shows wrmsr and rdtsc function normally.

I'll try pinning the vcpus to cpus and see if that helps.

-- 
error compiling committee.c: too many arguments to function



* Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
  2007-12-19 11:19           ` Avi Kivity
@ 2007-12-19 11:39             ` Avi Kivity
       [not found]               ` <47690304.1090903-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Avi Kivity @ 2007-12-19 11:39 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: kvm-devel, linux-kernel

Avi Kivity wrote:
>  
> Testing shows wrmsr and rdtsc function normally.
>
> I'll try pinning the vcpus to cpus and see if that helps.
>

It does.

-- 
error compiling committee.c: too many arguments to function



* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]               ` <47690304.1090903-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-12-19 14:06                 ` Ingo Molnar
  2007-12-19 14:27                   ` [kvm-devel] " Avi Kivity
       [not found]                   ` <20071219140624.GF21282-X9Un+BFzKDI@public.gmane.org>
  0 siblings, 2 replies; 16+ messages in thread
From: Ingo Molnar @ 2007-12-19 14:06 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, linux-kernel


* Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:

> Avi Kivity wrote:
>>  Testing shows wrmsr and rdtsc function normally.
>>
>> I'll try pinning the vcpus to cpus and see if that helps.
>>
>
> It does.

do we let the guest read the physical CPU's TSC? That would be trouble.

	Ingo


* Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
  2007-12-19 14:06                 ` Ingo Molnar
@ 2007-12-19 14:27                   ` Avi Kivity
       [not found]                     ` <47692A47.4040803-7k6+44Jx4zn6gbPvEgmw2w@public.gmane.org>
       [not found]                   ` <20071219140624.GF21282-X9Un+BFzKDI@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: Avi Kivity @ 2007-12-19 14:27 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Avi Kivity, kvm-devel, linux-kernel

Ingo Molnar wrote:
> * Avi Kivity <avi@qumranet.com> wrote:
>
>   
>> Avi Kivity wrote:
>>     
>>>  Testing shows wrmsr and rdtsc function normally.
>>>
>>> I'll try pinning the vcpus to cpus and see if that helps.
>>>
>>>       
>> It does.
>>     
>
> do we let the guest read the physical CPU's TSC? That would be trouble.
>
>   

vmx (and svm) allow us to add an offset to the physical tsc.  We set it 
on startup to -tsc (so that an rdtsc on boot would return 0), and 
massage it on vcpu migration so that guest rdtsc is monotonic.

The net effect is that tsc on a vcpu can experience large forward jumps 
and changes in rate, but no negative jumps.
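
The offset scheme described above can be modeled in a few lines of
user-space C. This is a sketch under assumptions, not KVM's code:
`struct vcpu_tsc` and the `vcpu_tsc_*` helpers are hypothetical names,
host TSC values are passed in as plain integers instead of being read
with rdtsc, and "correct the offset only when the guest value would
step backwards" is one reading of "massage it on vcpu migration so that
guest rdtsc is monotonic".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vcpu bookkeeping; guest TSC = host TSC + offset. */
struct vcpu_tsc {
    uint64_t offset;      /* added to the physical TSC (mod 2^64) */
    uint64_t last_guest;  /* guest TSC when the vcpu last left a CPU */
};

static uint64_t guest_rdtsc(const struct vcpu_tsc *v, uint64_t host_tsc)
{
    return host_tsc + v->offset;   /* unsigned wrap-around is intended */
}

/* At startup: offset = -tsc, so the guest's first rdtsc returns 0. */
static void vcpu_tsc_init(struct vcpu_tsc *v, uint64_t host_tsc)
{
    assert(v != NULL);
    v->offset = (uint64_t)0 - host_tsc;
    v->last_guest = 0;
}

/* Remember what the guest last saw on the CPU it is leaving. */
static void vcpu_tsc_put(struct vcpu_tsc *v, uint64_t old_host_tsc)
{
    v->last_guest = guest_rdtsc(v, old_host_tsc);
}

/*
 * On arrival at a new physical CPU: if that CPU's TSC would make the
 * guest value step backwards, raise the offset so the guest resumes at
 * last_guest. A CPU whose TSC is ahead is left alone, which is where
 * the large forward jumps come from.
 */
static void vcpu_tsc_load(struct vcpu_tsc *v, uint64_t new_host_tsc)
{
    uint64_t would_read = guest_rdtsc(v, new_host_tsc);

    if (would_read < v->last_guest)
        v->offset += v->last_guest - would_read;
}
```

Note that this model only keeps each vcpu's own TSC monotonic; it says
nothing about how the TSCs of two different vcpus compare.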

-- 
error compiling committee.c: too many arguments to function



* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]                   ` <20071219140624.GF21282-X9Un+BFzKDI@public.gmane.org>
@ 2007-12-19 14:53                     ` Avi Kivity
  2007-12-19 15:09                       ` [kvm-devel] " Ingo Molnar
  0 siblings, 1 reply; 16+ messages in thread
From: Avi Kivity @ 2007-12-19 14:53 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: kvm-devel, linux-kernel, Avi Kivity

Ingo Molnar wrote:
> * Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
>
>   
>> Avi Kivity wrote:
>>     
>>>  Testing shows wrmsr and rdtsc function normally.
>>>
>>> I'll try pinning the vcpus to cpus and see if that helps.
>>>
>>>       
>> It does.
>>     
>
> do we let the guest read the physical CPU's TSC? That would be trouble.
>
>   

vmx (and svm) allow us to add an offset to the physical tsc.  We set it
on startup to -tsc (so that an rdtsc on boot would return 0), and
massage it on vcpu migration so that guest rdtsc is monotonic.

The net effect is that tsc on a vcpu can experience large forward jumps
and changes in rate, but no negative jumps.

-- 
error compiling committee.c: too many arguments to function


* Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
  2007-12-19 14:53                     ` Guest kernel hangs in smp kvm for older kernels prior " Avi Kivity
@ 2007-12-19 15:09                       ` Ingo Molnar
       [not found]                         ` <20071219150938.GA15267-X9Un+BFzKDI@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2007-12-19 15:09 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, linux-kernel


try this test perhaps in an SMP guest:

 http://people.redhat.com/mingo/time-warp-test/time-warp-test.c

you can ignore TSC warps - but no GTOD or CLOCK warps should occur.

	Ingo


* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]                         ` <20071219150938.GA15267-X9Un+BFzKDI@public.gmane.org>
@ 2007-12-19 15:21                           ` Avi Kivity
  0 siblings, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2007-12-19 15:21 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: kvm-devel, linux-kernel

Ingo Molnar wrote:
> try this test perhaps in an SMP guest:
>
>  http://people.redhat.com/mingo/time-warp-test/time-warp-test.c
>
> you can ignore TSC warps - but no GTOD or CLOCK warps should occur.
>
>   

On a broken guest kernel, I see gtod and clock warps.  On a good guest 
kernel, I do not, presumably because the tsc clocksource is marked as 
unstable.

I see tsc warps on both.  8 threads on 4 cpus.

-- 
error compiling committee.c: too many arguments to function



* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]                     ` <47692A47.4040803-7k6+44Jx4zn6gbPvEgmw2w@public.gmane.org>
@ 2007-12-19 15:32                       ` Glauber de Oliveira Costa
  2007-12-19 15:41                         ` [kvm-devel] " Avi Kivity
       [not found]                         ` <5d6222a80712190732h515a63e6y49c64c0f572f044-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 2 replies; 16+ messages in thread
From: Glauber de Oliveira Costa @ 2007-12-19 15:32 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, linux-kernel, Avi Kivity, Gerd Hoffmann

On Dec 19, 2007 12:27 PM, Avi Kivity <avi-7k6+44Jx4zn6gbPvEgmw2w@public.gmane.org> wrote:
> Ingo Molnar wrote:
> > * Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> >
> >
> >> Avi Kivity wrote:
> >>
> >>>  Testing shows wrmsr and rdtsc function normally.
> >>>
> >>> I'll try pinning the vcpus to cpus and see if that helps.
> >>>
> >>>
> >> It does.
> >>
> >
> > do we let the guest read the physical CPU's TSC? That would be trouble.
> >
> >
>
> vmx (and svm) allow us to add an offset to the physical tsc.  We set it
> on startup to -tsc (so that an rdtsc on boot would return 0), and
> massage it on vcpu migration so that guest rdtsc is monotonic.
>
> The net effect is that tsc on a vcpu can experience large forward jumps
> and changes in rate, but no negative jumps.
>

Changes in rate do not sound good. It's possibly what's screwing up
my paravirt clock implementation in smp.
Since the host updates guest time prior to putting a vcpu to run, two
vcpus that start running at different times will have different system
time values.

Now if the vcpu that started running later probes the time first,
we'll see the time going backwards. A constant tsc rate is the only way
my limited mind sees around the problem (besides, obviously, _not_
making the system time per-vcpu).
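
The scenario can be made concrete with a small user-space model. It is
a sketch under assumptions, not the paravirt clock code: `struct
pv_time` and `pv_read_ns()` are hypothetical names, each vcpu computes
time as its last host-provided base plus a TSC delta, and the backwards
step shown here additionally assumes that the rate used for the
conversion is slightly off (2.02 GHz assumed, 2 GHz real).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vcpu record, refreshed by the host before a vcpu runs. */
struct pv_time {
    uint64_t system_time_ns;  /* host time at the last refresh */
    uint64_t tsc_stamp;       /* TSC value at the last refresh */
};

/* Guest-side read: base plus the TSC delta converted at an assumed rate. */
static uint64_t pv_read_ns(const struct pv_time *t, uint64_t tsc,
                           uint64_t tsc_khz)
{
    assert(t != NULL && tsc_khz != 0 && tsc >= t->tsc_stamp);
    /* cycles * 1e6 / kHz = nanoseconds */
    return t->system_time_ns + (tsc - t->tsc_stamp) * 1000000u / tsc_khz;
}
```

With a real rate of 2 GHz, take a vcpu refreshed at time 0 (TSC 0) and
another refreshed one second later (TSC 2000000000). If the second
reads 1 us after its refresh and the first reads 1 us after that, the
assumed rate of 2020000 kHz yields 1000000990 ns followed by
990100990 ns: the later read is about 9.9 ms behind the earlier one.
With the exact rate of 2000000 kHz the same two reads come out in
order.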


-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."


* Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
  2007-12-19 15:32                       ` Glauber de Oliveira Costa
@ 2007-12-19 15:41                         ` Avi Kivity
       [not found]                           ` <47693B9D.7080809-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
       [not found]                         ` <5d6222a80712190732h515a63e6y49c64c0f572f044-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: Avi Kivity @ 2007-12-19 15:41 UTC (permalink / raw)
  To: Glauber de Oliveira Costa
  Cc: Avi Kivity, Ingo Molnar, kvm-devel, linux-kernel, Chris Wright,
	Gerd Hoffmann

Glauber de Oliveira Costa wrote:
> Changes in rate do not sound good. It's possibly what's screwing up
> my paravirt clock implementation in smp.
>   

You should renew the timebase on vcpu migration, and hook cpufreq so 
that changes in frequency are reflected in the timebase.

> Since the host updates guest time prior to putting vcpu to run, two
> vcpus that start running at different times will have different system
> values.
>
> Now if the vcpu that started running later probes the time first,
> we'll see the time going backwards. A constant tsc rate is the only way
> my limited mind sees around the problem (besides, obviously, _not_
> making the system time per-vcpu).
>   

I tried disabling frequency scaling (rmmod acpi_cpufreq) but that didn't 
help my present problems.

-- 
error compiling committee.c: too many arguments to function



* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]                         ` <5d6222a80712190732h515a63e6y49c64c0f572f044-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-12-19 16:55                           ` Amit Shah
       [not found]                             ` <200712192225.53748.amit.shah-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Amit Shah @ 2007-12-19 16:55 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Avi Kivity, linux-kernel, Avi Kivity, Gerd Hoffmann

On Wednesday 19 December 2007 21:02:06 Glauber de Oliveira Costa wrote:
> On Dec 19, 2007 12:27 PM, Avi Kivity <avi-7k6+44Jx4zn6gbPvEgmw2w@public.gmane.org> wrote:
> > Ingo Molnar wrote:
> > > * Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> > >> Avi Kivity wrote:
> > >>>  Testing shows wrmsr and rdtsc function normally.
> > >>>
> > >>> I'll try pinning the vcpus to cpus and see if that helps.
> > >>
> > >> It does.
> > >
> > > do we let the guest read the physical CPU's TSC? That would be trouble.
> >
> > vmx (and svm) allow us to add an offset to the physical tsc.  We set it
> > on startup to -tsc (so that an rdtsc on boot would return 0), and
> > massage it on vcpu migration so that guest rdtsc is monotonic.
> >
> > The net effect is that tsc on a vcpu can experience large forward jumps
> > and changes in rate, but no negative jumps.
>
> Changes in rate do not sound good. It's possibly what's screwing up
> my paravirt clock implementation in smp.

Do you mean in the case of VM migration, or just starting them on a single 
host?

> Since the host updates guest time prior to putting vcpu to run, two
> vcpus that start running at different times will have different system
> values.
>
> Now if the vcpu that started running later probes the time first,
> we'll see the time going backwards. A constant tsc rate is the only way
> my limited mind sees around the problem (besides, obviously, _not_
> making the system time per-vcpu).


* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]                             ` <200712192225.53748.amit.shah-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-12-19 17:04                               ` Dor Laor
       [not found]                                 ` <47694F35.6070401-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Dor Laor @ 2007-12-19 17:04 UTC (permalink / raw)
  To: Amit Shah
  Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Avi Kivity,
	linux-kernel, Avi Kivity, Gerd Hoffmann

Amit Shah wrote:
>
> On Wednesday 19 December 2007 21:02:06 Glauber de Oliveira Costa wrote:
> > On Dec 19, 2007 12:27 PM, Avi Kivity <avi-7k6+44Jx4zn6gbPvEgmw2w@public.gmane.org> wrote:
> > > Ingo Molnar wrote:
> > > > * Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> > > >> Avi Kivity wrote:
> > > >>>  Testing shows wrmsr and rdtsc function normally.
> > > >>>
> > > >>> I'll try pinning the vcpus to cpus and see if that helps.
> > > >>
> > > >> It does.
> > > >
> > > > > do we let the guest read the physical CPU's TSC? That would be trouble.
> > > >
> > > > vmx (and svm) allow us to add an offset to the physical tsc.  We set it
> > > > on startup to -tsc (so that an rdtsc on boot would return 0), and
> > > > massage it on vcpu migration so that guest rdtsc is monotonic.
> > > >
> > > > The net effect is that tsc on a vcpu can experience large forward jumps
> > > > and changes in rate, but no negative jumps.
> >
> > Changes in rate do not sound good. It's possibly what's screwing up
> > my paravirt clock implementation in smp.
>
> Do you mean in the case of VM migration, or just starting them on a single
> host?
>
It's the cpu preemption stuff on local host and not VM migration
>
> > Since the host updates guest time prior to putting vcpu to run, two
> > vcpus that start running at different times will have different system
> > values.
> >
> > Now if the vcpu that started running later probes the time first,
> > we'll see the time going backwards. A constant tsc rate is the only way
> > my limited mind sees around the problem (besides, obviously, _not_
> > making the system time per-vcpu).
>



* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]                                 ` <47694F35.6070401-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-12-19 17:09                                   ` Avi Kivity
  0 siblings, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2007-12-19 17:09 UTC (permalink / raw)
  To: dor.laor-atKUWr5tajBWk0Htik3J/w
  Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, linux-kernel,
	Avi Kivity, Gerd Hoffmann

Dor Laor wrote:
>> > >
>> > > vmx (and svm) allow us to add an offset to the physical tsc.  We set it
>> > > on startup to -tsc (so that an rdtsc on boot would return 0), and
>> > > massage it on vcpu migration so that guest rdtsc is monotonic.
>> > >
>> > > The net effect is that tsc on a vcpu can experience large forward jumps
>> > > and changes in rate, but no negative jumps.
>> >
>> > Changes in rate do not sound good. It's possibly what's screwing up
>> > my paravirt clock implementation in smp.
>>
>> Do you mean in the case of VM migration, or just starting them on a single
>> host?
>>
> It's the cpu preemption stuff on local host and not VM migration

No, migrating a vcpu to another cpu.

-- 
error compiling committee.c: too many arguments to function



* Re: Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup
       [not found]                           ` <47693B9D.7080809-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-12-20 10:34                             ` Glauber de Oliveira Costa
  0 siblings, 0 replies; 16+ messages in thread
From: Glauber de Oliveira Costa @ 2007-12-20 10:34 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, linux-kernel, Avi Kivity, Gerd Hoffmann

On Dec 19, 2007 1:41 PM, Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> Glauber de Oliveira Costa wrote:
> > Changes in rate do not sound good. It's possibly what's screwing up
> > my paravirt clock implementation in smp.
> >
>
> You should renew the timebase on vcpu migration, and hook cpufreq so
> that changes in frequency are reflected in the timebase.

To be conservative, I do it in every vcpu run, and have any kind of
cpu frequency scaling disabled. And it does not work.

In a trace in the host, I see that vcpu runs happen very often in
vcpu 0 (probably because exits happen often there, so we have to go
back), and comparatively very few times in vcpu 1.

So what's probably happening is: vcpu 1 does system_time + tsc_delta,
but vcpu 0 has already updated it so many times that the tsc does not
keep up, and it ends up going backwards.

I'm running the following test in the host, upon module loading
(Ingo, please tell me if I'm doing something idiotic in it that
compromises my conclusions):

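/* Compare a udelay() of foo microseconds against the TSC-derived time. */
void test(int foo)
{
       u64 start, stop;

       start = native_read_tsc();
       udelay(foo);
       stop = native_read_tsc();
       /* Nominal delay in ns minus the elapsed ns computed from the TSC. */
       printk("%d Result: %lld\n", foo,
              (long long)foo * 1000 - (long long)cycles_2_ns(stop - start));
}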

Output is:

30 Result: -126
90 Result: 576
300 Result: 2627
1000 Result: 9381
3000 Result: 28238
5000 Result: 48086


So the delta is expected to get bigger if a vcpu goes a long time
without having its time updated.
Xen manages to keep the guest tsc stable and steady by doing
synchronization from time to time.
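
Dividing each result in the output above by its nominal delay puts the
figures on a common scale. The helper below is a hypothetical
user-space sketch (the name `drift_ppm()` is not from the test code);
it only restates the arithmetic and makes no claim about which side of
the comparison is the inaccurate one.

```c
#include <assert.h>
#include <stdint.h>

/* Mismatch in parts per million for error_ns over a delay of delay_us. */
static int64_t drift_ppm(int64_t delay_us, int64_t error_ns)
{
    assert(delay_us > 0);
    /* error_ns / (delay_us * 1000 ns) * 1e6 == error_ns * 1000 / delay_us */
    return error_ns * 1000 / delay_us;
}
```

For the three longest delays this gives 9381, 9412 and 9617 ppm
(1000, 3000 and 5000 us), i.e. a mismatch of a little under 1% that
grows in absolute terms with the length of the interval.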

We can either (if I'm right about this, of course):

* put a periodic timer in the host to update the system time from time to time;
* use some sort of global timestamp, instead of the per-cpu one; or
* do something akin to what xen does, and still rely on the tsc.

Any thoughts?
-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

