* A testing for KVM
@ 2007-06-13 8:37 Zhao, Yunfeng
[not found] ` <10EA09EFD8728347A513008B6B0DA77AA3EEA0-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 14+ messages in thread
From: Zhao, Yunfeng @ 2007-06-13 8:37 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Hi,
We tested the latest KVM to assess its quality status.
In the testing we tried to boot guests, test the basic guest devices, and
install guests.
The commit is: 08908a44630210f97b0276f6e41ff71ec04f1308
A summary of the test results:
Most IA32/IA32e UP VMX guests boot, but SMP guests, IA32e Windows, and
some Linux guests with ACPI cannot boot.
Guests with more than 2GB of memory can NOT boot.
Save/restore works on IA32e/IA32.
Live migration works on IA32e/IA32, but an IA32 guest could NOT be
migrated from an IA32 host to an IA32e host.
Basic devices (keyboard, disk, VGA, and NIC) work well, but the timer is
NOT accurate while running some workloads on guests.
Here is the issue list
~~~~~~~~~~~~~~~~~~~~~~~~~~
1) The timer is not accurate.
On an ia32e host, the timer ends up 7 secs behind after a kernel
build that lasts about 20 mins. 'cat /proc/interrupts | grep timer'
shows about 50 fewer timer interrupts per second than normal, which is
consistent with the timer running slow.
On an ia32 host, the timer ends up 3 secs behind after a kernel
build that lasts about 20 mins.
https://sourceforge.net/tracker/index.php?func=detail&aid=1736299&group_id=180599&atid=893831
2) Booting an ACPI guest fails (both rhel4u3 and Windows guests).
3) Booting an SMP guest fails (we tried a rhel4u3 guest).
4) Live migrating a 32bit guest from a 32bit host to a 64bit host fails.
The destination machine reports: migrate_incoming_fd failed(rc=231)
https://sourceforge.net/tracker/index.php?func=detail&aid=1736301&group_id=180599&atid=893831
5) Live migrating a 32bit guest between 32bit hosts makes the timer
quite inaccurate.
The guest reports that the TSC can not be used as a timesource,
and the time difference will reach up to 70 secs in one second.
https://sourceforge.net/tracker/index.php?func=detail&aid=1736305&group_id=180599&atid=893831
6) Cannot boot a guest with > 2GB mem.
https://sourceforge.net/tracker/index.php?func=detail&aid=1736307&group_id=180599&atid=893831
Detailed Test Report
~~~~~~~~~~~~~~~~~~~~~~~~~~
Ia32e host:
1) Save/restore ia32e 512M linux -- pass
2) Save/restore ia32 512M winxp -- pass
3) Live migrate ia32e 512M linux -- pass
4) Live migrate ia32 512M winxp -- pass
5) Guest startx -- pass
6) Mouse usability -- pass
7) Keyboard usability -- pass
8) scp a big file to guest -- pass
9) guest timer -- fail
After a 20-min kernel build, ntpdate shows a negative time offset of 7 secs
10) kernel build in guest -- pass
11) boot_up_acpi_win2k3_g64 -- fail
Ia32 host:
1) Save/restore ia32 512M linux -- pass
2) Save/restore ia32 512M winxp -- pass
3) Live migrate ia32 512M linux from 32bit host to 64bit host -- fail
The destination machine reports: migrate_incoming_fd failed(rc=231)
4) Live migrate ia32 512M linux from 32bit host to 32bit host -- pass
The guest reports: TSC can not be used as a timesource,
and the timer becomes quite inaccurate
5) Live migrate ia32 512M winxp -- pass
6) Guest startx -- pass
7) Mouse usability -- pass
8) Keyboard usability -- pass
9) scp a big file to guest -- pass
10) guest timer -- fail
After a 20-min kernel build, ntpdate shows a negative time offset of 3 secs
11) kernel build in guest -- pass
12) install 64bit FC6 guest -- pass
13) install 64bit RHEL4U1 guest -- pass
14) boot 64bit RHEL4U1 guest with acpi -- fail
15) boot 64bit RHEL4U1 guest without acpi -- pass
Thanks
Yunfeng
-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <10EA09EFD8728347A513008B6B0DA77AA3EEA0-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-06-13 8:48 ` Avi Kivity
[not found] ` <466FAF56.4070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 14+ messages in thread
From: Avi Kivity @ 2007-06-13 8:48 UTC (permalink / raw)
To: Zhao, Yunfeng; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Zhao, Yunfeng wrote:
> Hi,
> We ran a testing against latest KVM to know its quality status.
> In the testing we tried to boot guests, test basic devices of guest, and
> install guests.
>
Thanks for doing this -- it is enormously useful.
> Basic devices, Keybord,disk,VGA, and nic works well, but timer is NOT
> accurate while running some workload on guests.
>
We found that using an hrtimer enabled host with CONFIG_HZ=1000 improves
things. However I don't think that it's as accurate as 7 seconds in 20
minutes (that's better than 1% accuracy), so probably more work is
needed in qemu to correct time drift.
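For reference, the drift figures from the test report can be turned into a rate with a few lines of Python. This is only a worked calculation; the function name is made up for illustration, and the 20-minute duration is the approximate figure given in the report.

```python
def drift_fraction(drift_seconds: float, elapsed_seconds: float) -> float:
    """Return clock drift as a fraction of elapsed wall-clock time."""
    return drift_seconds / elapsed_seconds


# Figures from the test report: 7 s (ia32e host) and 3 s (ia32 host)
# of drift over a kernel build of roughly 20 minutes.
elapsed = 20 * 60
for host, drift in (("ia32e", 7.0), ("ia32", 3.0)):
    print(f"{host}: {drift_fraction(drift, elapsed):.2%}")
```

Both values come out below 1%, which matches the "better than 1% accuracy" remark.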
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <466FAF56.4070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-06-13 10:16 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A0198452F-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-06-13 11:02 ` Gregory Haskins
1 sibling, 1 reply; 14+ messages in thread
From: Dong, Eddie @ 2007-06-13 10:16 UTC (permalink / raw)
To: Avi Kivity, Zhao, Yunfeng; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
> We found that using an hrtimer enabled host with
> CONFIG_HZ=1000 improves
> things. However I don't think that it's as accurate as 7
> seconds in 20
> minutes (that's better than 1% accuracy), so probably more work is
> needed in qemu to correct time drift.
>
Time virtualization for HVM is always a headache; there is no simple way
to provide an accurate time source :-(
Under KVM's current time virtualization policy, dropped or redundant
jiffies may occur frequently. BTW, even in Xen we are not satisfied with
time virtualization, especially for the TSC timer.
Some modification on the guest Linux side is really needed; maybe somebody
can bring this issue to the virtualization mini summit?
Eddie
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <10EA09EFD8728347A513008B6B0DA77A0198452F-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-06-13 10:58 ` Avi Kivity
0 siblings, 0 replies; 14+ messages in thread
From: Avi Kivity @ 2007-06-13 10:58 UTC (permalink / raw)
To: Dong, Eddie; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Dong, Eddie wrote:
>> We found that using an hrtimer enabled host with
>> CONFIG_HZ=1000 improves
>> things. However I don't think that it's as accurate as 7
>> seconds in 20
>> minutes (that's better than 1% accuracy), so probably more work is
>> needed in qemu to correct time drift.
>>
>>
> Time virtualization for HVM is always a headache, no simple way can
> provide
> accurate source :-(
> Per current KVM time virtualization policy, drop of jiffies or redundant
> jiffies
> may happen frequently.
There is some code in kvm's qemu to track dropped ticks and reinject
them. The code is disabled because it doesn't work well.
> BTW, even in Xen, we are not satisfied with the
> time
> virtualization especially for TSC timer.
>
I don't think tsc can be virtualized well. It's too fast.
> Some modification in guest Linux side is really needed, maybe somebody
> can bring this issue to virtualization mini summit?
>
Linux already has support for paravirtualized time, we just need to use it.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <466FAF56.4070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-06-13 10:16 ` Dong, Eddie
@ 2007-06-13 11:02 ` Gregory Haskins
[not found] ` <1181732578.26394.9.camel-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
1 sibling, 1 reply; 14+ messages in thread
From: Gregory Haskins @ 2007-06-13 11:02 UTC (permalink / raw)
To: Avi Kivity, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Wed, 2007-06-13 at 11:48 +0300, Avi Kivity wrote:
> Zhao, Yunfeng wrote:
> > Hi,
> > We ran a testing against latest KVM to know its quality status.
> > In the testing we tried to boot guests, test basic devices of guest, and
> > install guests.
> >
>
> Thanks for doing this -- it is enormously useful.
>
> > Basic devices, Keybord,disk,VGA, and nic works well, but timer is NOT
> > accurate while running some workload on guests.
> >
>
> We found that using an hrtimer enabled host with CONFIG_HZ=1000 improves
> things. However I don't think that it's as accurate as 7 seconds in 20
> minutes (that's better than 1% accuracy), so probably more work is
> needed in qemu to correct time drift.
>
>
>
One of the things that I noticed during the development of the APIC
patchset that was quite odd:
1) Linux guest was programming the PIT for 4ms.
2) QEMU was programming the SIGALRM for 1ms.
3) SIGALRM was only delivered every 8ms (probably maximum resolution
with this setup) so the timer-wheel injected two PIT interrupts per
SIGALRM.
4) Since PICs can generally only queue a single interrupt, the second
tick was always lost.
HOWEVER (and this is where it gets really weird), linux wallclock time
in the guest runs at normal speed even with PIT at 1/2 freq. If I made
corrections such that every PIT tick is actually delivered to the guest,
wallclock runs at 2x.
So aside from the fact that I know we are losing at least 50% of our
ticks, it seems that something else was hacked to accommodate it somehow
(perhaps the RTC emulation?).
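The tick loss in points 1-4 can be reproduced with a small simulation. It is a sketch under a simplifying assumption: the emulated PIC holds at most one pending timer interrupt between two signals, so a second raise while one is pending is absorbed. The function name is made up for illustration and is not QEMU code.

```python
def simulate_pit_delivery(pit_period_ms: int, signal_period_ms: int,
                          duration_ms: int) -> tuple[int, int]:
    """Return (ticks the PIT should raise, ticks the guest receives)."""
    due = 0                      # ticks the emulated PIT should have raised
    delivered = 0                # ticks that actually reach the guest
    next_tick = pit_period_ms    # time of the next PIT expiry
    # The timer code only runs when the host signal arrives.
    for now in range(signal_period_ms, duration_ms + 1, signal_period_ms):
        pending = False
        while next_tick <= now:
            due += 1
            pending = True       # a raise while already pending is absorbed
            next_tick += pit_period_ms
        if pending:
            delivered += 1       # the guest services one interrupt
    return due, delivered


# 4ms PIT with a signal every 8ms, over one second.
print(simulate_pit_delivery(4, 8, 1000))
# Same PIT with a signal every 1ms.
print(simulate_pit_delivery(4, 1, 1000))
```

With an 8ms signal the guest receives half of the 250 ticks; with a 1ms signal it receives all of them.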
Hopefully this information might be useful to someone who wishes to
tackle the problem.
-Greg
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <1181732578.26394.9.camel-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
@ 2007-06-13 11:41 ` Dor Laor
2007-06-21 3:23 ` Dong, Eddie
1 sibling, 0 replies; 14+ messages in thread
From: Dor Laor @ 2007-06-13 11:41 UTC (permalink / raw)
To: Gregory Haskins, Avi Kivity,
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
>> Zhao, Yunfeng wrote:
>> > Hi,
>> > We ran a testing against latest KVM to know its quality status.
>> > In the testing we tried to boot guests, test basic devices of
guest,
>and
>> > install guests.
>> >
>>
>> Thanks for doing this -- it is enormously useful.
>>
>> > Basic devices, Keybord,disk,VGA, and nic works well, but timer is
NOT
>> > accurate while running some workload on guests.
>> >
>>
>> We found that using an hrtimer enabled host with CONFIG_HZ=1000
improves
>> things. However I don't think that it's as accurate as 7 seconds in
20
>> minutes (that's better than 1% accuracy), so probably more work is
>> needed in qemu to correct time drift.
>>
>>
>>
>
>One of the things that I noticed during the development of the APIC
>patchset that was quite odd:
>
>1) Linux guest was programming the PIT for 4ms.
>2) QEMU was programming the sigalarm for 1ms
>3) SIGALARM was only delivered every 8ms (probably maximum resolution
>with this setup) so the timer-wheel injected two PIT interrupts per
>SIGALARM.
>4) Since PICs can generally only queue a single interrupt, the second
>tick was always lost.
>
>HOWEVER (and this is where it gets really weird), linux wallclock time
>in the guest runs at normal speed even with PIT at 1/2 freq. If I made
>corrections such that every PIT tick is actually delivered to the
guest,
>wallclock runs at 2x.
>
>So aside from the fact that I know we are losing at least 50% of our
>ticks, it seems that something else was hacked to accommodate it
somehow
>(perhaps the RTC emulation?).
>
>Hopefully this information might be useful to someone who wishes to
>tackle the problem.
>
>-Greg
>
The current status is that the guest clock depends on qemu's PIT
interrupts, which depend on qemu's SIGALRM, which depends on the host's
HZ frequency.
There will always be situations where the guest wants a higher frequency
than the host. There will always be situations where SIGALRMs are not
accurate or, as Greg pointed out, where several signals collapse into one
irq, thus causing time drift problems in the guest.
To solve the problem we need the following:
1. Configure the host kernel with hrtimer and dyn-tick.
This increases the accuracy of SIGALRM.
2. Based on the above, implement a dyn-tick timer in qemu, and have the
PIT use qemu's dynamic timer.
I tested guests running with the x86 hrtimer/dyn-tick patch and it worked
fine with the regular qemu. Fixing '2' will make sure that a drift of 7
seconds in 20 minutes won't happen.
Dor
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <1181732578.26394.9.camel-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
2007-06-13 11:41 ` Dor Laor
@ 2007-06-21 3:23 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A01A562AD-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
1 sibling, 1 reply; 14+ messages in thread
From: Dong, Eddie @ 2007-06-21 3:23 UTC (permalink / raw)
To: Gregory Haskins, Avi Kivity,
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
[-- Attachment #1: Type: text/plain, Size: 2621 bytes --]
>
>One of the things that I noticed during the development of the APIC
>patchset that was quite odd:
>
>1) Linux guest was programming the PIT for 4ms.
>2) QEMU was programming the sigalarm for 1ms
>3) SIGALARM was only delivered every 8ms (probably maximum resolution
>with this setup) so the timer-wheel injected two PIT interrupts per
>SIGALARM.
>4) Since PICs can generally only queue a single interrupt, the second
>tick was always lost.
>
>HOWEVER (and this is where it gets really weird), linux wallclock time
>in the guest runs at normal speed even with PIT at 1/2 freq. If I made
>corrections such that every PIT tick is actually delivered to
>the guest,
>wallclock runs at 2x.
That is what I meant by the time virtualization headache for today's Linux.
We should inject all guest PIT IRQs (Xen does so), but the guest code
(probably Linux only) cross-references PIT time (port 0x40) against TSC
time. Since we do not virtualize the TSC at present, i.e. guest TSC = host
TSC, the guest OS thinks PIT IRQ ticks were lost (actually none were),
because it sees the TSC move forward while the PIT IRQ arrives late, and
so it adds back those fake lost ticks. In this way we see guest time
tick 2X faster.
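A simplified model of this double counting, in Python. It assumes a guest tick handler that credits every whole tick period the TSC says has elapsed, and at least one tick per interrupt; the constant and function names are hypothetical and this is not the actual Linux timer code.

```python
CYCLES_PER_TICK = 1_000_000  # hypothetical: TSC cycles per PIT period


def guest_ticks_counted(arrival_tsc: list[int]) -> int:
    """Jiffies a guest counts for tick IRQs arriving at the given TSC values."""
    jiffies = 0
    last_tsc = 0
    for tsc in arrival_tsc:
        elapsed = tsc - last_tsc
        # Lost-tick compensation: credit every whole period the TSC says
        # has passed, but never less than the one tick that just arrived.
        jiffies += max(1, elapsed // CYCLES_PER_TICK)
        last_tsc = tsc
    return jiffies


# On-time delivery: 8 ticks, one per period.
on_time = [CYCLES_PER_TICK * i for i in range(1, 9)]
# Delayed delivery: the same 8 ticks are held back for 8 periods and then
# injected back to back while the unvirtualized TSC keeps running.
burst = [CYCLES_PER_TICK * 8 + i for i in range(8)]

print(guest_ticks_counted(on_time), guest_ticks_counted(burst))
```

In the delayed case the first interrupt is credited with the whole gap and the remaining injected ticks are then counted again, so the guest counts nearly twice the real number of periods.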
In Xen, we virtualize the TSC too, to make sure the guest TSC time is
synchronized with guest PIT time, so the guest sees an accurate virtual
time (see my presentation from the Xen summit of September 2006). It
works well, but from time to time we see bugs due to the complicated time
virtualization mechanism. As far as I know, VMware doesn't solve this
problem either; it depends on a guest application to sync guest time with
real time (network or host).
We can use the PV time support in Linux. Another way is to persuade the
community to give up the cross-referencing of PIT and TSC, i.e. to stop
picking up lost ticks, given that today's OS is smart enough not to
disable IRQs for so long that another PIT period (1-4ms) expires. That
kind of lost-tick pickup is quite unnecessary, especially on x86-64
(faster processors).
Another big issue is that guest Linux will eventually fall back to the PIT
time source after hundreds of "lost ticks" and thus give up the TSC time
source. This is OK for normal applications but deadly for a database
server, where gettimeofday is called frequently. We saw 30-40% performance
degradation due to this alone (gettimeofday reads port 0x40 multiple
times, which is extremely slow under virtualization).
I will ask Jun and Avi to bring this issue to the virtualization mini
summit, to persuade the community to make some changes in the guest timer
IRQ handler.
[-- Attachment #2: SMP-time.pdf --]
[-- Type: application/octet-stream, Size: 381539 bytes --]
[-- Attachment #4: Type: text/plain, Size: 186 bytes --]
_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
@ 2007-06-21 5:12 Dong, Eddie
0 siblings, 0 replies; 14+ messages in thread
From: Dong, Eddie @ 2007-06-21 5:12 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
>
>One of the things that I noticed during the development of the APIC
>patchset that was quite odd:
>
>1) Linux guest was programming the PIT for 4ms.
>2) QEMU was programming the sigalarm for 1ms
>3) SIGALARM was only delivered every 8ms (probably maximum resolution
>with this setup) so the timer-wheel injected two PIT interrupts per
>SIGALARM.
>4) Since PICs can generally only queue a single interrupt, the second
>tick was always lost.
>
>HOWEVER (and this is where it gets really weird), linux wallclock time
>in the guest runs at normal speed even with PIT at 1/2 freq. If I made
>corrections such that every PIT tick is actually delivered to
>the guest,
>wallclock runs at 2x.
That is what I meant by the time virtualization headache for today's Linux.
We should inject all guest PIT IRQs (Xen does so), but the guest code
(probably Linux only) cross-references PIT time (port 0x40) against TSC
time. Since we do not virtualize the TSC at present, i.e. guest TSC = host
TSC, the guest OS thinks PIT IRQ ticks were lost (actually none were),
because it sees the TSC move forward while the PIT IRQ arrives late, and
so it adds back those fake lost ticks. In this way we see guest time
tick 2X faster.
In Xen, we virtualize the TSC too, to make sure the guest TSC time is
synchronized with guest PIT time, so the guest sees an accurate virtual
time (refer to http://www.xensource.com/files/summit_3/Xen_HVM_SMP.pdf).
It works well, but from time to time we see bugs due to the complicated
time virtualization mechanism. As far as I know, VMware doesn't solve this
problem either; it depends on a guest application to sync guest time with
real time (network or host).
We can use the PV time support in Linux. Another way is to persuade the
community to give up the cross-referencing of PIT and TSC, i.e. to stop
picking up lost ticks, given that today's OS is smart enough not to
disable IRQs for so long that another PIT period (1-4ms) expires. That
kind of lost-tick pickup is quite unnecessary, especially on x86-64
(faster processors).
Another big issue is that guest Linux will eventually fall back to the PIT
time source after hundreds of "lost ticks" and thus give up the TSC time
source. This is OK for normal applications but deadly for a database
server, where gettimeofday is called frequently. We saw 30-40% performance
degradation due to this alone (gettimeofday reads port 0x40 multiple
times, which is extremely slow under virtualization).
I will ask Jun and Avi to bring this issue to the virtualization mini
summit, to persuade the community to make some changes in the guest timer
IRQ handler.
Thx,eddie
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <10EA09EFD8728347A513008B6B0DA77A01A562AD-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-06-21 9:01 ` Avi Kivity
[not found] ` <467A3E66.4090406-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 14+ messages in thread
From: Avi Kivity @ 2007-06-21 9:01 UTC (permalink / raw)
To: Dong, Eddie; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Dong, Eddie wrote:
> In Xen, we virtualize TSC too to make sure the guest TSC time is
> synchronized with
> guest PIT time, so guest can see an accurate virtual time. (refer my
> presentation
> doc on Xen September summit 06.) It is good but time to time
> we see bugs due to the complicated time virtualization mechanism. As far
> as I know,
> Vmware doesn't solve this problem either, it depends on guest
> application to sync
> guest time with real time (network or host).
>
>
Do you mean that rdtsc is trapped? Or that you play with TSC_OFFSET so
that time is smoother?
> We can use PV time of linux, another way is to persuade community to
> give up
> the cross reference of PIT and TSC to give up picking the lost ticks
> given that
> today's OS is smart enough and won't disable IRQ for that long time
> before
> another PIT expires (1-4ms), so those kind of picking up lost ticks is
> quit
> unnecessary, especially for X86-64 (faster processor).
>
> Another big issue is that guest Linux will eventually fail back to PIT
> time source after
> hundreds of "lost ticks" and thus give up TSC time source. This is OK
> for normal application
> but dead set for database server where gettimeofday is frequently used.
> We saw
> 30-40% performance degradation due to this only. (gettimeofday will
> read port 40
> multiple times which is extremely slow under virtualization).
>
> I will ask Jun, Avi to bring this issue to virtualization mini summit to
> persuade
> communitty to do some changes in guest timer IRQ handler.
>
For newer kernels, we can supply a paravirt clocksource (as in Anthony's
patchset) which will remove the need for changing the bare hardware
time code.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <467A3E66.4090406-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-06-21 9:09 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A01A56502-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 14+ messages in thread
From: Dong, Eddie @ 2007-06-21 9:09 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
[-- Attachment #1: Type: text/plain, Size: 2743 bytes --]
>-----Original Message-----
>From: Avi Kivity [mailto:avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org]
>Sent: June 21, 2007 17:01
>To: Dong, Eddie
>Cc: Gregory Haskins; kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
>Subject: Re: [kvm-devel] A testing for KVM
>
>Dong, Eddie wrote:
>> In Xen, we virtualize TSC too to make sure the guest TSC time is
>> synchronized with
>> guest PIT time, so guest can see an accurate virtual time. (refer my
>> presentation
>> doc on Xen September summit 06.) It is good but time to time
>> we see bugs due to the complicated time virtualization
>mechanism. As far
>> as I know,
>> Vmware doesn't solve this problem either, it depends on guest
>> application to sync
>> guest time with real time (network or host).
>>
>>
>
>Do you mean that rdtsc is trapped? Or that you play with
>TSC_OFFSET so
>that time is smoother?
Yes, we change the TSC_OFFSET value when a VM is descheduled, and adjust
it back until all pending PIT IRQs have been injected.
In Xen, we keep a notion of "guest time", so every time source, whether
guest PIT, RTC, PM timer or TSC, is kept in sync with "guest time".
I mention this for reference, but I don't suggest implementing it the
same way since it is too complicated.
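The TSC_OFFSET adjustment can be pictured with a toy model. It relies only on the VMX rule that a guest TSC read returns the host TSC plus the offset field. The class and method names are hypothetical, and the catch-up policy (one tick period given back per injected PIT IRQ) is just one reading of the description above, not Xen's actual code.

```python
class GuestTsc:
    """Toy model of TSC offsetting: guest TSC = host TSC + offset."""

    def __init__(self) -> None:
        self.offset = 0   # value the hypervisor would place in TSC_OFFSET
        self.hidden = 0   # cycles currently hidden from the guest

    def read(self, host_tsc: int) -> int:
        """Value the guest would see for a given host TSC."""
        return host_tsc + self.offset

    def on_deschedule(self, cycles_away: int) -> None:
        # Hide the time the VM spent off the CPU from the guest.
        self.offset -= cycles_away
        self.hidden += cycles_away

    def on_tick_injected(self, cycles_per_tick: int) -> None:
        # Give back up to one tick period per injected PIT IRQ
        # until no hidden gap remains.
        step = min(cycles_per_tick, self.hidden)
        self.offset += step
        self.hidden -= step


tsc = GuestTsc()
tsc.on_deschedule(3000)        # VM was off the CPU for 3000 cycles
print(tsc.read(10_000))        # guest does not see the gap yet
for _ in range(3):
    tsc.on_tick_injected(1000)
print(tsc.read(10_000))        # gap fully returned after three ticks
```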
>
>> We can use PV time of linux, another way is to persuade community to
>> give up
>> the cross reference of PIT and TSC to give up picking the lost ticks
>> given that
>> today's OS is smart enough and won't disable IRQ for that long time
>> before
>> another PIT expires (1-4ms), so those kind of picking up
>lost ticks is
>> quit
>> unnecessary, especially for X86-64 (faster processor).
>>
>> Another big issue is that guest Linux will eventually fail
>back to PIT
>> time source after
>> hundreds of "lost ticks" and thus give up TSC time source. This is OK
>> for normal application
>> but dead set for database server where gettimeofday is
>frequently used.
>> We saw
>> 30-40% performance degradation due to this only. (gettimeofday will
>> read port 40
>> multiple times which is extremely slow under virtualization).
>>
>> I will ask Jun, Avi to bring this issue to virtualization
>mini summit to
>> persuade
>> communitty to do some changes in guest timer IRQ handler.
>>
>
>For newer kernels, we can supply a paravirt clocksource (as in
>Anthony's
>patchset) which will remove the need for changing the bare hardware
>time code.
Yes, but I am wondering about the performance. A hypercall to get host
time should be more expensive than a hardware-supported TSC read, which
takes about 200 cycles. I may be mistaken since I haven't gone through
the patch in detail.
gettimeofday is very important :-)
Eddie
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <10EA09EFD8728347A513008B6B0DA77A01A56502-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-06-21 9:28 ` Avi Kivity
0 siblings, 0 replies; 14+ messages in thread
From: Avi Kivity @ 2007-06-21 9:28 UTC (permalink / raw)
To: Dong, Eddie; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
[-- Attachment #1: Type: text/plain, Size: 479 bytes --]
Dong, Eddie wrote:
> Yes. but I am wondering about the performance. Hypercall to get
> host time should be expansive than hardware support TSC read which is
> about 200 cycles. I may make mistake since I didn't go through the patch
> in very detail.
>
> gettimeofday is very important :-)
>
Maybe we should have a guest-visible tsc offset that can be used to
extrapolate the last clock read using the tsc.
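A rough sketch of what such an extrapolation could look like, assuming the host publishes a snapshot of its clock together with the TSC value and TSC frequency at that moment. The record layout and names are hypothetical; nothing like this exists in kvm yet.

```python
from dataclasses import dataclass


@dataclass
class TimeSnapshot:
    """Hypothetical record the host would publish to the guest."""
    clock_ns: int   # host clock, in nanoseconds, when the snapshot was taken
    tsc: int        # guest-visible TSC value at the same moment
    tsc_hz: int     # TSC frequency in cycles per second


def extrapolate_ns(snap: TimeSnapshot, tsc_now: int) -> int:
    """Estimate the current time from the last snapshot and a TSC read."""
    delta_cycles = tsc_now - snap.tsc
    # Integer arithmetic: cycles -> nanoseconds at the published frequency.
    return snap.clock_ns + delta_cycles * 1_000_000_000 // snap.tsc_hz


snap = TimeSnapshot(clock_ns=1_000_000_000, tsc=5_000, tsc_hz=2_000_000_000)
# 2,000,000 cycles at 2 GHz is one millisecond after the snapshot.
print(extrapolate_ns(snap, 5_000 + 2_000_000))
```

The guest would then only need a TSC read and a multiplication per gettimeofday, with no hypercall on the fast path.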
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
@ 2007-06-21 11:06 Gregory Haskins
[not found] ` <467A23890200005A000262B7-Igcdv/6uVdMHoYOw/+koYqIwWpluYiW7@public.gmane.org>
0 siblings, 1 reply; 14+ messages in thread
From: Gregory Haskins @ 2007-06-21 11:06 UTC (permalink / raw)
To: eddie.dong-ral2JQCrhuEAvxtiuMwx3w
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Thu, 2007-06-21 at 11:23 +0800, Dong, Eddie wrote:
> >
> >One of the things that I noticed during the development of the APIC
> >patchset that was quite odd:
> >
> >1) Linux guest was programming the PIT for 4ms.
> >2) QEMU was programming the sigalarm for 1ms
> >3) SIGALARM was only delivered every 8ms (probably maximum resolution
> >with this setup) so the timer-wheel injected two PIT interrupts per
> >SIGALARM.
> >4) Since PICs can generally only queue a single interrupt, the second
> >tick was always lost.
> >
> >HOWEVER (and this is where it gets really weird), linux wallclock time
> >in the guest runs at normal speed even with PIT at 1/2 freq. If I made
> >corrections such that every PIT tick is actually delivered to
> >the guest,
> >wallclock runs at 2x.
>
>
> That is what I mean the time virtualization headache for today's Linux.
> We should inject all guest PIT irqs (Xen did so), but the guest code
> (probably Linux only)
> is referencing PIT time (port 40) with TSC time. Since we didn't
> virtualize TSC now,
> i.e. guest TSC = host TSC, guest OS will think there is losted PIT IRQ
> ticks(actually no),
> since guest see TSC goes forward but PIT IRQ arrives later.
> and thus add back those fake lost ticks. In this way we see guest time
> ticks 2X faster.
I'm not sure if this affects the TSC theory or not, but note that the
host and guest have 250Hz PIT configured in the config. E.g. both
kernels are programming the PIT to 250Hz, but guest is only seeing 125Hz
ticks. If I "fix" the lost interrupt to bring the ticks to a true
250Hz, wall-clock time runs at 2x. Could it still be a TSC
virtualization problem under these conditions? (I don't know much about
TSC, so I just thought I would clarify this detail.)
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: A testing for KVM
[not found] ` <467A23890200005A000262B7-Igcdv/6uVdMHoYOw/+koYqIwWpluYiW7@public.gmane.org>
@ 2007-06-21 12:41 ` Dong, Eddie
0 siblings, 0 replies; 14+ messages in thread
From: Dong, Eddie @ 2007-06-21 12:41 UTC (permalink / raw)
To: Gregory Haskins; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
> I'm not sure whether this affects the TSC theory, but note that both the
> host and the guest kernels are configured for a 250Hz PIT. I.e. both
> kernels program the PIT to 250Hz, but the guest only sees 125Hz
> ticks. If I "fix" the lost interrupt to bring the ticks to a true
> 250Hz, wall-clock time runs at 2x. Could it still be a TSC
> virtualization problem under these conditions? (I don't know much about
> the TSC, so I just thought I would clarify this detail.)
Yes, though I am not sure it is exactly 2x; in any case time runs faster
in the guest. We saw this in Xen too, but not that much, probably +20-30%.
From your observation, it looks like the guest OS adds back 100% of the
lost ticks, which is quite a bit more than I observed in Xen. It probably
has an additional issue besides the TSC. (I wouldn't call it a TSC
virtualization issue, but rather a PIT virtualization issue :-( )
Eddie
* Re: A testing for KVM
@ 2007-06-21 12:50 Gregory Haskins
0 siblings, 0 replies; 14+ messages in thread
From: Gregory Haskins @ 2007-06-21 12:50 UTC (permalink / raw)
To: eddie.dong-ral2JQCrhuEAvxtiuMwx3w
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Thu, 2007-06-21 at 20:41 +0800, Dong, Eddie wrote:
> > I'm not sure whether this affects the TSC theory, but note that both the
> > host and the guest kernels are configured for a 250Hz PIT. I.e. both
> > kernels program the PIT to 250Hz, but the guest only sees 125Hz
> > ticks. If I "fix" the lost interrupt to bring the ticks to a true
> > 250Hz, wall-clock time runs at 2x. Could it still be a TSC
> > virtualization problem under these conditions? (I don't know much about
> > the TSC, so I just thought I would clarify this detail.)
>
> Yes, though I am not sure it is exactly 2x; in any case time runs faster
> in the guest. We saw this in Xen too, but not that much, probably +20-30%.
Yeah, you are probably right. I never quantified it. I just knew the PIT
rate was 2x and the clock was faster, so I assumed 2x.
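
One way to quantify it would be to sample the guest clock and a trusted reference clock at the start and end of a run and compare the elapsed times. The helper below is only a sketch: its name and arguments are hypothetical, and it assumes all four timestamps are in seconds and that the reference clock (for example the host's) is accurate.

```python
def clock_rate_ratio(guest_start, guest_end, ref_start, ref_end):
    """Ratio of guest-elapsed time to reference-elapsed time.

    1.0 means the guest clock keeps pace with the reference, 2.0 means
    it runs twice as fast, and values below 1.0 mean it falls behind.
    """
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference interval must be positive")
    return (guest_end - guest_start) / ref_elapsed


if __name__ == "__main__":
    # A guest that shows 2400 s elapsed during a 1200 s reference interval.
    print(clock_rate_ratio(0, 2400, 0, 1200))  # 2.0
    # A guest that falls 7 s behind over a 20 minute run.
    print(clock_rate_ratio(0, 1193, 0, 1200))
```

A longer run gives a more trustworthy ratio, since any fixed error in reading the two clocks is spread over a larger interval.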
>
> From your observation, it looks like the guest OS adds back 100% of the
> lost ticks, which is quite a bit more than I observed in Xen. It probably
> has an additional issue besides the TSC. (I wouldn't call it a TSC
> virtualization issue, but rather a PIT virtualization issue :-( )
> Eddie
Thread overview: 14+ messages
2007-06-13 8:37 A testing for KVM Zhao, Yunfeng
[not found] ` <10EA09EFD8728347A513008B6B0DA77AA3EEA0-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-06-13 8:48 ` Avi Kivity
[not found] ` <466FAF56.4070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-06-13 10:16 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A0198452F-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-06-13 10:58 ` Avi Kivity
2007-06-13 11:02 ` Gregory Haskins
[not found] ` <1181732578.26394.9.camel-5CR4LY5GPkvLDviKLk5550HKjMygAv58XqFh9Ls21Oc@public.gmane.org>
2007-06-13 11:41 ` Dor Laor
2007-06-21 3:23 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A01A562AD-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-06-21 9:01 ` Avi Kivity
[not found] ` <467A3E66.4090406-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-06-21 9:09 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A01A56502-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-06-21 9:28 ` Avi Kivity
-- strict thread matches above, loose matches on Subject: below --
2007-06-21 5:12 Dong, Eddie
2007-06-21 11:06 Gregory Haskins
[not found] ` <467A23890200005A000262B7-Igcdv/6uVdMHoYOw/+koYqIwWpluYiW7@public.gmane.org>
2007-06-21 12:41 ` Dong, Eddie
2007-06-21 12:50 Gregory Haskins