* VDSO pvclock may increase host cpu consumption, is this a problem?
@ 2014-03-29 8:47 Zhanghailiang
2014-03-29 14:46 ` Marcelo Tosatti
2014-03-31 17:52 ` Andy Lutomirski
0 siblings, 2 replies; 13+ messages in thread
From: Zhanghailiang @ 2014-03-29 8:47 UTC (permalink / raw)
To: mtosatti@redhat.com, johnstul@us.ibm.com, tglx@linutronix.de,
kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Zhouxiangjiu, zhang yanying
Hi,
I found that when the Guest is idle, VDSO pvclock may increase host CPU consumption.
We can calculate it as follows; correct me if I am wrong.
(Host) 250 * update_pvclock_gtod = 1500 * gettimeofday (Guest)
On the Host, VDSO pvclock introduces a notifier chain, pvclock_gtod_chain in timekeeping.c. It consumes nearly 900 cycles per call, so at 250 Hz it may consume 225,000 cycles per second, even when no VM is created.
In the Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call, so the feature saves 150 cycles per call.
Calling gettimeofday 1500 times per second therefore saves 225,000 cycles, equal to the host-side consumption.
Both Host and Guest run linux-3.13.6.
So, is the host CPU consumption a problem?
Thanks.
Zhang hailiang
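[Editor's note: the break-even arithmetic above can be sanity-checked with a trivial calculation. The 900-, 220- and 370-cycle figures are the measurements reported in this thread for this particular setup, not universal constants:]

```c
#include <assert.h>

/* Host-side cost: the pvclock_gtod notifier fires from the timekeeping
 * update path at CONFIG_HZ (here 250), costing ~900 cycles each time. */
static long host_cycles_per_sec(long hz, long notifier_cycles)
{
	return hz * notifier_cycles;
}

/* Guest-side saving per gettimeofday(): vDSO kvmclock (~220 cycles)
 * vs. the no-kvmclock-vsyscall fallback (~370 cycles), i.e. 150 cycles
 * saved per call; how many calls/second pay for the host overhead? */
static long breakeven_calls_per_sec(long host_cost, long saved_per_call)
{
	return host_cost / saved_per_call;
}
```

With the numbers from this thread, host_cycles_per_sec(250, 900) is 225,000 and breakeven_calls_per_sec(225000, 370 - 220) is 1500, matching the 1500-calls-per-second figure quoted above.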
^ permalink raw reply [flat|nested] 13+ messages in thread

* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-29  8:47 VDSO pvclock may increase host cpu consumption, is this a problem? Zhanghailiang
@ 2014-03-29 14:46 ` Marcelo Tosatti
  2014-03-31  1:12   ` Zhanghailiang
  2014-03-31 17:52 ` Andy Lutomirski
  1 sibling, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-03-29 14:46 UTC (permalink / raw)
To: Zhanghailiang
Cc: johnstul@us.ibm.com, tglx@linutronix.de, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org, Zhouxiangjiu, zhang yanying

On Sat, Mar 29, 2014 at 08:47:27AM +0000, Zhanghailiang wrote:
> Hi,
> I found when Guest is idle, VDSO pvclock may increase host consumption.
> We can calcutate as follow, Correct me if I am wrong.
> (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
> In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
> When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
> Both Host and Guest is linux-3.13.6.
> So, whether the host cpu consumption is a problem?

Hi,

How many percents out of the total CPU cycles are 225,000 cycles, for
your CPU ?

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-29 14:46 ` Marcelo Tosatti
@ 2014-03-31  1:12   ` Zhanghailiang
  0 siblings, 0 replies; 13+ messages in thread
From: Zhanghailiang @ 2014-03-31 1:12 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: johnstul@us.ibm.com, tglx@linutronix.de, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org, Zhouxiangjiu

Hi Marcelo,

The CPU's info is:
processor       : 15
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping        : 2
microcode       : 12
cpu MHz         : 2400.125
cache size      : 12288 KB
physical id     : 1
siblings        : 8
core id         : 10
cpu cores       : 4
apicid          : 53
initial apicid  : 53
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4800.18
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

Thanks
Zhang hailiang

> On Sat, Mar 29, 2014 at 08:47:27AM +0000, Zhanghailiang wrote:
> > Hi,
> > I found when Guest is idle, VDSO pvclock may increase host consumption.
> > We can calcutate as follow, Correct me if I am wrong.
> > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
> > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
> > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
> > Both Host and Guest is linux-3.13.6.
> > So, whether the host cpu consumption is a problem?
>
> Hi,
>
> How many percents out of the total CPU cycles are 225,000 cycles, for
> your CPU ?

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-29  8:47 VDSO pvclock may increase host cpu consumption, is this a problem? Zhanghailiang
  2014-03-29 14:46 ` Marcelo Tosatti
@ 2014-03-31 17:52 ` Andy Lutomirski
  2014-03-31 21:30   ` Marcelo Tosatti
  1 sibling, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2014-03-31 17:52 UTC (permalink / raw)
To: Zhanghailiang, mtosatti@redhat.com, johnstul@us.ibm.com,
    tglx@linutronix.de, kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Zhouxiangjiu, zhang yanying

On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> Hi,
> I found when Guest is idle, VDSO pvclock may increase host consumption.
> We can calcutate as follow, Correct me if I am wrong.
> (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
> In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
> When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
> Both Host and Guest is linux-3.13.6.
> So, whether the host cpu consumption is a problem?

Does pvclock serve any real purpose on systems with fully-functional
TSCs?  The x86 guest implementation is awful, so it's about 2x slower
than TSC.  It could be improved a lot, but I'm not sure I understand why
it exists in the first place.

I certainly understand the goal of keeping the guest CLOCK_REALTIME in
sync with the host, but pvclock seems like overkill for that.

--Andy

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-31 17:52 ` Andy Lutomirski
@ 2014-03-31 21:30   ` Marcelo Tosatti
  2014-04-01  5:33     ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-03-31 21:30 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Zhanghailiang, johnstul@us.ibm.com, tglx@linutronix.de,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Zhouxiangjiu,
    zhang yanying

On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> > Hi,
> > I found when Guest is idle, VDSO pvclock may increase host consumption.
> > We can calcutate as follow, Correct me if I am wrong.
> > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
> > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
> > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
> > Both Host and Guest is linux-3.13.6.
> > So, whether the host cpu consumption is a problem?
>
> Does pvclock serve any real purpose on systems with fully-functional
> TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> than TSC.  It could be improved a lot, but I'm not sure I understand why
> it exists in the first place.

VM migration.

Can you explain why you consider it so bad ? How you think it could be
improved ?

> I certainly understand the goal of keeping the guest CLOCK_REALTIME is
> sync with the host, but pvclock seems like overkill for that.

VM migration.

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-03-31 21:30   ` Marcelo Tosatti
@ 2014-04-01  5:33     ` Andy Lutomirski
  2014-04-01 18:01       ` Marcelo Tosatti
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-01 5:33 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, zhang yanying,
    Zhouxiangjiu, kvm@vger.kernel.org, johnstul@us.ibm.com,
    Zhanghailiang

On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>
> On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> > > Hi,
> > > I found when Guest is idle, VDSO pvclock may increase host consumption.
> > > We can calcutate as follow, Correct me if I am wrong.
> > > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> > > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
> > > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
> > > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
> > > Both Host and Guest is linux-3.13.6.
> > > So, whether the host cpu consumption is a problem?
> >
> > Does pvclock serve any real purpose on systems with fully-functional
> > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> > than TSC.  It could be improved a lot, but I'm not sure I understand why
> > it exists in the first place.
>
> VM migration.

Why does that need percpu stuff?  Wouldn't it be sufficient to
interrupt all CPUs (or at least all cpus running in userspace) on
migration and update the normal timing data structures?

Even better: have the VM offer to invalidate the physical page
containing the kernel's clock data on migration and interrupt one CPU.
If another CPU races, it'll fault and wait for the guest kernel to
update its timing.

Does the current kvmclock stuff track CLOCK_MONOTONIC and
CLOCK_REALTIME separately?

> Can you explain why you consider it so bad ? How you think it could be
> improved ?

The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
getcpu call.

It would also be nice to avoid having two sets of rescalings of the
timing data.

> > I certainly understand the goal of keeping the guest CLOCK_REALTIME is
> > sync with the host, but pvclock seems like overkill for that.
>
> VM migration.

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-01  5:33     ` Andy Lutomirski
@ 2014-04-01 18:01       ` Marcelo Tosatti
  2014-04-01 19:17         ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-04-01 18:01 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, zhang yanying,
    Zhouxiangjiu, kvm@vger.kernel.org, johnstul@us.ibm.com,
    Zhanghailiang

On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
> >
> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> > > > Hi,
> > > > I found when Guest is idle, VDSO pvclock may increase host consumption.
> > > > We can calcutate as follow, Correct me if I am wrong.
> > > > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> > > > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
> > > > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
> > > > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
> > > > Both Host and Guest is linux-3.13.6.
> > > > So, whether the host cpu consumption is a problem?
> > >
> > > Does pvclock serve any real purpose on systems with fully-functional
> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
> > > it exists in the first place.
> >
> > VM migration.
>
> Why does that need percpu stuff?  Wouldn't it be sufficient to
> interrupt all CPUs (or at least all cpus running in userspace) on
> migration and update the normal timing data structures?

Are you suggesting to allow interruption of the timekeeping code
at any time to update frequency information ?

Do you want to do that as a special tsc clocksource driver ?

> Even better: have the VM offer to invalidate the physical page
> containing the kernel's clock data on migration and interrupt one CPU.
> If another CPU races, it'll fault and wait for the guest kernel to
> update its timing.

Perhaps that is a good idea.

> Does the current kvmclock stuff track CLOCK_MONOTONIC and
> CLOCK_REALTIME separately?

No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
count during vm pause).

> > Can you explain why you consider it so bad ? How you think it could be
> > improved ?
>
> The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
> getcpu call.
>
> It would also be nice to avoid having two sets of rescalings of the
> timing data.

Yep, probably good improvements, patches are welcome :-)

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-01 18:01       ` Marcelo Tosatti
@ 2014-04-01 19:17         ` Andy Lutomirski
  2014-04-02  0:12           ` Marcelo Tosatti
       [not found]           ` <20140402002926.GB31945@amt.cnet>
  0 siblings, 2 replies; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-01 19:17 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, zhang yanying,
    Zhouxiangjiu, kvm@vger.kernel.org, johnstul@us.ibm.com,
    Zhanghailiang

On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
>> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>> >
>> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
>> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
>> > > > Hi,
>> > > > I found when Guest is idle, VDSO pvclock may increase host consumption.
>> > > > We can calcutate as follow, Correct me if I am wrong.
>> > > > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
>> > > > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
>> > > > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
>> > > > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
>> > > > Both Host and Guest is linux-3.13.6.
>> > > > So, whether the host cpu consumption is a problem?
>> > >
>> > > Does pvclock serve any real purpose on systems with fully-functional
>> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
>> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
>> > > it exists in the first place.
>> >
>> > VM migration.
>>
>> Why does that need percpu stuff?  Wouldn't it be sufficient to
>> interrupt all CPUs (or at least all cpus running in userspace) on
>> migration and update the normal timing data structures?
>
> Are you suggesting to allow interruption of the timekeeping code
> at any time to update frequency information ?

I'm not sure what you mean by "interruption of the timekeeping code".
I'm suggesting sending an interrupt to the guest (via a virtio device,
presumably) to tell it that it has been paused and resumed.

This is probably worth getting John's input if you actually want to do
this.  I'm not about to :)

Is there any case in which the TSC is stable and the kvmclock data for
different cpus is actually different?

> Do you want to that as a special tsc clocksource driver ?
>
>> Even better: have the VM offer to invalidate the physical page
>> containing the kernel's clock data on migration and interrupt one CPU.
>> If another CPU races, it'll fault and wait for the guest kernel to
>> update its timing.
>
> Perhaps that is a good idea.
>
>> Does the current kvmclock stuff track CLOCK_MONOTONIC and
>> CLOCK_REALTIME separately?
>
> No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
> count during vm pause).

Makes sense.

>> > Can you explain why you consider it so bad ? How you think it could be
>> > improved ?
>>
>> The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
>> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
>> getcpu call.
>>
>> It would also be nice to avoid having two sets of rescalings of the
>> timing data.
>
> Yep, probably good improvements, patches are welcome :-)

I may get to it at some point.  No guarantees.  I did just rewrite all
the mapping-related code for every other x86 vdso timesource, so maybe
I should try to add this to the pile.  The fact that the data is a
variable number of pages makes it messy, though, and since I don't
understand why there's a separate structure for each CPU, I'm hesitant
to change it too much.

--Andy

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-01 19:17         ` Andy Lutomirski
@ 2014-04-02  0:12           ` Marcelo Tosatti
  2014-04-02  0:20             ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-04-02 0:12 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, zhang yanying,
    Zhouxiangjiu, kvm@vger.kernel.org, johnstul@us.ibm.com,
    Zhanghailiang

On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
> >> >
> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> >> > > > Hi,
> >> > > > I found when Guest is idle, VDSO pvclock may increase host consumption.
> >> > > > We can calcutate as follow, Correct me if I am wrong.
> >> > > > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> >> > > > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
> >> > > > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
> >> > > > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
> >> > > > Both Host and Guest is linux-3.13.6.
> >> > > > So, whether the host cpu consumption is a problem?
> >> > >
> >> > > Does pvclock serve any real purpose on systems with fully-functional
> >> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
> >> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
> >> > > it exists in the first place.
> >> >
> >> > VM migration.
> >>
> >> Why does that need percpu stuff?  Wouldn't it be sufficient to
> >> interrupt all CPUs (or at least all cpus running in userspace) on
> >> migration and update the normal timing data structures?
> >
> > Are you suggesting to allow interruption of the timekeeping code
> > at any time to update frequency information ?
>
> I'm not sure what you mean by "interruption of the timekeeping code".
> I'm suggesting sending an interrupt to the guest (via a virtio device,
> presumably) to tell it that it has been paused and resumed.

code:

1) disable interrupts
2) A = RDTSC
3) B = SCALE(A, TSC.FREQ)

If migration happens between 2 and 3, you've got an incorrect value.

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
  2014-04-02  0:12           ` Marcelo Tosatti
@ 2014-04-02  0:20             ` Andy Lutomirski
  0 siblings, 0 replies; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-02 0:20 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, zhang yanying,
    Zhouxiangjiu, kvm@vger.kernel.org, johnstul@us.ibm.com,
    Zhanghailiang

On Tue, Apr 1, 2014 at 5:12 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
>> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
>> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>> >> >
>> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
>> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
>> >> > > > Hi,
>> >> > > > I found when Guest is idle, VDSO pvclock may increase host consumption.
>> >> > > > We can calcutate as follow, Correct me if I am wrong.
>> >> > > > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
>> >> > > > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
>> >> > > > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
>> >> > > > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
>> >> > > > Both Host and Guest is linux-3.13.6.
>> >> > > > So, whether the host cpu consumption is a problem?
>> >> > >
>> >> > > Does pvclock serve any real purpose on systems with fully-functional
>> >> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
>> >> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
>> >> > > it exists in the first place.
>> >> >
>> >> > VM migration.
>> >>
>> >> Why does that need percpu stuff?  Wouldn't it be sufficient to
>> >> interrupt all CPUs (or at least all cpus running in userspace) on
>> >> migration and update the normal timing data structures?
>> >
>> > Are you suggesting to allow interruption of the timekeeping code
>> > at any time to update frequency information ?
>>
>> I'm not sure what you mean by "interruption of the timekeeping code".
>> I'm suggesting sending an interrupt to the guest (via a virtio device,
>> presumably) to tell it that it has been paused and resumed.
>
> code:
>
> 1) disable interrupts
> 2) A = RDTSC
> 3) B = SCALE(A, TSC.FREQ)
>
> If migration happens between 2 and 3, you've got an incorrect value.

Fair enough.  I guess

1) disable interrupts
2) A = RDTSC
3) B = SCALE(A, TSC.FREQ)

is also bad if (3) blocks due to magic invalidation of the physical
page.

--Andy

^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <20140402002926.GB31945@amt.cnet>]
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
       [not found] ` <20140402002926.GB31945@amt.cnet>
@ 2014-04-02  0:46   ` Andy Lutomirski
  2014-04-02 22:05     ` Marcelo Tosatti
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-02 0:46 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, zhang yanying,
    Zhouxiangjiu, kvm@vger.kernel.org, johnstul@us.ibm.com,
    Zhanghailiang

On Tue, Apr 1, 2014 at 5:29 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
>> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
>> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>> >> >
>> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
>> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
>> >> > > > Hi,
>> >> > > > I found when Guest is idle, VDSO pvclock may increase host consumption.
>> >> > > > We can calcutate as follow, Correct me if I am wrong.
>> >> > > > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
>> >> > > > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
>> >> > > > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
>> >> > > > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
>> >> > > > Both Host and Guest is linux-3.13.6.
>> >> > > > So, whether the host cpu consumption is a problem?
>> >> > >
>> >> > > Does pvclock serve any real purpose on systems with fully-functional
>> >> > > TSCs?  The x86 guest implementation is awful, so it's about 2x slower
>> >> > > than TSC.  It could be improved a lot, but I'm not sure I understand why
>> >> > > it exists in the first place.
>> >> >
>> >> > VM migration.
>> >>
>> >> Why does that need percpu stuff?  Wouldn't it be sufficient to
>> >> interrupt all CPUs (or at least all cpus running in userspace) on
>> >> migration and update the normal timing data structures?
>> >
>> > Are you suggesting to allow interruption of the timekeeping code
>> > at any time to update frequency information ?
>>
>> I'm not sure what you mean by "interruption of the timekeeping code".
>> I'm suggesting sending an interrupt to the guest (via a virtio device,
>> presumably) to tell it that it has been paused and resumed.
>>
>> This is probably worth getting John's input if you actually want to do
>> this.  I'm not about to :)
>
> Honestly, neither am i at the moment. But i'll think about it.
>
>> Is there any case in which the TSC is stable and the kvmclock data for
>> different cpus is actually different?
>
> No. However, kvmclock_data.flags field is an interface for watchdog
> unpause.
>
>> > Do you want to that as a special tsc clocksource driver ?
>> >
>> >> Even better: have the VM offer to invalidate the physical page
>> >> containing the kernel's clock data on migration and interrupt one CPU.
>> >> If another CPU races, it'll fault and wait for the guest kernel to
>> >> update its timing.
>> >
>> > Perhaps that is a good idea.
>> >
>> >> Does the current kvmclock stuff track CLOCK_MONOTONIC and
>> >> CLOCK_REALTIME separately?
>> >
>> > No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
>> > count during vm pause).
>>
>> Makes sense.
>>
>> > > Can you explain why you consider it so bad ? How you think it could be
>> > > improved ?
>>
>> The second rdtsc_barrier looks unnecessary.  Even better, if rdtscp is
>> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
>> getcpu call.
>>
>> It would also be nice to avoid having two sets of rescalings of the
>> timing data.
>
> Yep, probably good improvements, patches are welcome :-)
>
>> I may get to it at some point.  No guarantees.  I did just rewrite all
>> the mapping-related code for every other x86 vdso timesource, so maybe
>> I should try to add this to the pile.  The fact that the data is a
>> variable number of pages makes it messy, though, and since I don't
>> understand why there's a separate structure for each CPU, I'm hesitant
>> to change it too much.
>>
>> --Andy
>
> kvmclock.data? Because each VCPU can have different .flags fields for
> example.

It looks like the vdso kvmclock code only runs if
PVCLOCK_TSC_STABLE_BIT is set, which in turn is only the case if the
TSC is guaranteed to be monotonic across all CPUs.  If we can rely on
the fact that that bit will only be set if tsc_to_system_mul and
tsc_shift are the same on all CPUs and that (system_time -
(tsc_timestamp * mul) >> shift) is the same on all CPUs, then there
should be no reason for the vdso to read the pvclock data for anything
but CPU 0.  That will make it a lot faster and simpler.

Can we rely on that?

I wonder what happens if the guest runs ntpd or otherwise uses
adjtimex.  Presumably it starts drifting relative to the host.

--Andy

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
2014-04-02 0:46 ` Andy Lutomirski
@ 2014-04-02 22:05 ` Marcelo Tosatti
2014-04-02 22:31 ` Andy Lutomirski
0 siblings, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2014-04-02 22:05 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, zhang yanying,
Zhouxiangjiu, kvm@vger.kernel.org, johnstul@us.ibm.com, Zhanghailiang

On Tue, Apr 01, 2014 at 05:46:34PM -0700, Andy Lutomirski wrote:
> On Tue, Apr 1, 2014 at 5:29 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
> >> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
> >> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
> >> >> >
> >> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
> >> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
> >> >> > > > Hi,
> >> >> > > > I found when Guest is idle, VDSO pvclock may increase host consumption.
> >> >> > > > We can calcutate as follow, Correct me if I am wrong.
> >> >> > > > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
> >> >> > > > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
> >> >> > > > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
> >> >> > > > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
> >> >> > > > Both Host and Guest is linux-3.13.6.
> >> >> > > > So, whether the host cpu consumption is a problem?
> >> >> > >
> >> >> > > Does pvclock serve any real purpose on systems with fully-functional
> >> >> > > TSCs? The x86 guest implementation is awful, so it's about 2x slower
> >> >> > > than TSC. It could be improved a lot, but I'm not sure I understand why
> >> >> > > it exists in the first place.
> >> >> >
> >> >> > VM migration.
> >> >>
> >> >> Why does that need percpu stuff? Wouldn't it be sufficient to
> >> >> interrupt all CPUs (or at least all cpus running in userspace) on
> >> >> migration and update the normal timing data structures?
> >> >
> >> > Are you suggesting to allow interruption of the timekeeping code
> >> > at any time to update frequency information ?
> >>
> >> I'm not sure what you mean by "interruption of the timekeeping code".
> >> I'm suggesting sending an interrupt to the guest (via a virtio device,
> >> presumably) to tell it that it has been paused and resumed.
> >>
> >> This is probably worth getting John's input if you actually want to do
> >> this. I'm not about to :)
> >
> > Honestly, neither am i at the moment. But i'll think about it.
> >
> >> Is there any case in which the TSC is stable and the kvmclock data for
> >> different cpus is actually different?
> >
> > No. However, kvmclock_data.flags field is an interface for watchdog
> > unpause.
> >
> >> > Do you want to that as a special tsc clocksource driver ?
> >> >
> >> >> Even better: have the VM offer to invalidate the physical page
> >> >> containing the kernel's clock data on migration and interrupt one CPU.
> >> >> If another CPU races, it'll fault and wait for the guest kernel to
> >> >> update its timing.
> >> >
> >> > Perhaps that is a good idea.
> >> >
> >> >> Does the current kvmclock stuff track CLOCK_MONOTONIC and
> >> >> CLOCK_REALTIME separately?
> >> >
> >> > No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
> >> > count during vm pause).
> >>
> >> Makes sense.
> >>
> >> >
> >> >> > Can you explain why you consider it so bad ? How you think it could be
> >> >> > improved ?
> >> >>
> >> >> The second rdtsc_barrier looks unnecessary. Even better, if rdtscp is
> >> >> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
> >> >> getcpu call.
> >> >>
> >> >> It would also be nice to avoid having two sets of rescalings of the timing data.
> >> >
> >> > Yep, probably good improvements, patches are welcome :-)
> >> >
> >>
> >> I may get to it at some point. No guarantees. I did just rewrite all
> >> the mapping-related code for every other x86 vdso timesource, so maybe
> >> I should try to add this to the pile. The fact that the data is a
> >> variable number of pages makes it messy, though, and since I don't
> >> understand why there's a separate structure for each CPU, I'm hesitant
> >> to change it too much.
> >>
> >> --Andy
> >
> > kvmclock.data? Because each VCPU can have different .flags fields for
> > example.
>
> It looks like the vdso kvmclock code only runs if
> PVCLOCK_TSC_STABLE_BIT is set, which in turn is only the case if the
> TSC is guaranteed to be monotonic across all CPUs. If we can rely on
> the fact that that bit will only be set if tsc_to_system_mul and
> tsc_shift are the same on all CPUs and that (system_time -
> (tsc_timestamp * mul) >> shift) is the same on all CPUs, then there
> should be no reason for the vdso to read the pvclock data for anything
> but CPU 0. That will make it a lot faster and simpler.
>
> Can we rely on that?

In theory yes, but you would have to handle

PVCLOCK_TSC_STABLE_BIT set -> PVCLOCK_TSC_STABLE_BIT not set

Transition (and the other way around as well).

> I wonder what happens if the guest runs ntpd or otherwise uses
> adjtimex. Presumably it starts drifting relative to the host.

It should use ntpd and adjtimex. KVMCLOCK is the "hw" clock,
the values returned by CLOCK_REALTIME and CLOCK_GETTIME are built
by the Linux guest timekeeping subsystem on top of the "hw" clock.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: VDSO pvclock may increase host cpu consumption, is this a problem?
2014-04-02 22:05 ` Marcelo Tosatti
@ 2014-04-02 22:31 ` Andy Lutomirski
0 siblings, 0 replies; 13+ messages in thread
From: Andy Lutomirski @ 2014-04-02 22:31 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, zhang yanying,
Zhouxiangjiu, kvm@vger.kernel.org, johnstul@us.ibm.com, Zhanghailiang

On Wed, Apr 2, 2014 at 3:05 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Apr 01, 2014 at 05:46:34PM -0700, Andy Lutomirski wrote:
>> On Tue, Apr 1, 2014 at 5:29 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Tue, Apr 01, 2014 at 12:17:16PM -0700, Andy Lutomirski wrote:
>> >> On Tue, Apr 1, 2014 at 11:01 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> >> > On Mon, Mar 31, 2014 at 10:33:41PM -0700, Andy Lutomirski wrote:
>> >> >> On Mar 31, 2014 8:45 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
>> >> >> >
>> >> >> > On Mon, Mar 31, 2014 at 10:52:25AM -0700, Andy Lutomirski wrote:
>> >> >> > > On 03/29/2014 01:47 AM, Zhanghailiang wrote:
>> >> >> > > > Hi,
>> >> >> > > > I found when Guest is idle, VDSO pvclock may increase host consumption.
>> >> >> > > > We can calcutate as follow, Correct me if I am wrong.
>> >> >> > > > (Host)250 * update_pvclock_gtod = 1500 * gettimeofday(Guest)
>> >> >> > > > In Host, VDSO pvclock introduce a notifier chain, pvclock_gtod_chain in timekeeping.c. It consume nearly 900 cycles per call. So in consideration of 250 Hz, it may consume 225,000 cycles per second, even no VM is created.
>> >> >> > > > In Guest, gettimeofday consumes 220 cycles per call with VDSO pvclock. If the no-kvmclock-vsyscall is configured, gettimeofday consumes 370 cycles per call. The feature decrease 150 cycles consumption per call.
>> >> >> > > > When call gettimeofday 1500 times,it decrease 225,000 cycles,equal to the host consumption.
>> >> >> > > > Both Host and Guest is linux-3.13.6.
>> >> >> > > > So, whether the host cpu consumption is a problem?
>> >> >> > >
>> >> >> > > Does pvclock serve any real purpose on systems with fully-functional
>> >> >> > > TSCs? The x86 guest implementation is awful, so it's about 2x slower
>> >> >> > > than TSC. It could be improved a lot, but I'm not sure I understand why
>> >> >> > > it exists in the first place.
>> >> >> >
>> >> >> > VM migration.
>> >> >>
>> >> >> Why does that need percpu stuff? Wouldn't it be sufficient to
>> >> >> interrupt all CPUs (or at least all cpus running in userspace) on
>> >> >> migration and update the normal timing data structures?
>> >> >
>> >> > Are you suggesting to allow interruption of the timekeeping code
>> >> > at any time to update frequency information ?
>> >>
>> >> I'm not sure what you mean by "interruption of the timekeeping code".
>> >> I'm suggesting sending an interrupt to the guest (via a virtio device,
>> >> presumably) to tell it that it has been paused and resumed.
>> >>
>> >> This is probably worth getting John's input if you actually want to do
>> >> this. I'm not about to :)
>> >
>> > Honestly, neither am i at the moment. But i'll think about it.
>> >
>> >> Is there any case in which the TSC is stable and the kvmclock data for
>> >> different cpus is actually different?
>> >
>> > No. However, kvmclock_data.flags field is an interface for watchdog
>> > unpause.
>> >
>> >> > Do you want to that as a special tsc clocksource driver ?
>> >> >
>> >> >> Even better: have the VM offer to invalidate the physical page
>> >> >> containing the kernel's clock data on migration and interrupt one CPU.
>> >> >> If another CPU races, it'll fault and wait for the guest kernel to
>> >> >> update its timing.
>> >> >
>> >> > Perhaps that is a good idea.
>> >> >
>> >> >> Does the current kvmclock stuff track CLOCK_MONOTONIC and
>> >> >> CLOCK_REALTIME separately?
>> >> >
>> >> > No. kvmclock counting is interrupted on vm pause (the "hw" clock does not
>> >> > count during vm pause).
>> >>
>> >> Makes sense.
>> >>
>> >> >
>> >> >> > Can you explain why you consider it so bad ? How you think it could be
>> >> >> > improved ?
>> >> >>
>> >> >> The second rdtsc_barrier looks unnecessary. Even better, if rdtscp is
>> >> >> available, then rdtscp can replace rdtsc_barrier, rdtsc, and the
>> >> >> getcpu call.
>> >> >>
>> >> >> It would also be nice to avoid having two sets of rescalings of the timing data.
>> >> >
>> >> > Yep, probably good improvements, patches are welcome :-)
>> >> >
>> >>
>> >> I may get to it at some point. No guarantees. I did just rewrite all
>> >> the mapping-related code for every other x86 vdso timesource, so maybe
>> >> I should try to add this to the pile. The fact that the data is a
>> >> variable number of pages makes it messy, though, and since I don't
>> >> understand why there's a separate structure for each CPU, I'm hesitant
>> >> to change it too much.
>> >>
>> >> --Andy
>> >
>> > kvmclock.data? Because each VCPU can have different .flags fields for
>> > example.
>>
>> It looks like the vdso kvmclock code only runs if
>> PVCLOCK_TSC_STABLE_BIT is set, which in turn is only the case if the
>> TSC is guaranteed to be monotonic across all CPUs. If we can rely on
>> the fact that that bit will only be set if tsc_to_system_mul and
>> tsc_shift are the same on all CPUs and that (system_time -
>> (tsc_timestamp * mul) >> shift) is the same on all CPUs, then there
>> should be no reason for the vdso to read the pvclock data for anything
>> but CPU 0. That will make it a lot faster and simpler.
>>
>> Can we rely on that?
>
> In theory yes, but you would have to handle
>
> PVCLOCK_TSC_STABLE_BIT set -> PVCLOCK_TSC_STABLE_BIT not set
>
> Transition (and the other way around as well).

Since !STABLE already results in a real syscall for clock_gettime and
gettimeofday, I don't think this is a real hardship for the vdso.

>
>> I wonder what happens if the guest runs ntpd or otherwise uses
>> adjtimex. Presumably it starts drifting relative to the host.
>
> It should use ntpd and adjtimex. KVMCLOCK is the "hw" clock,
> the values returned by CLOCK_REALTIME and CLOCK_GETTIME are built
> by the Linux guest timekeeping subsystem on top of the "hw" clock.
>

If the kernel can guarantee that, then the timing code gets faster,
since the cyc2ns scale will be unity. Maybe this is worth a branch.

Anyway, I'll try to find some time to improve this if/when hpa picks up
my current series of vdso cleanups. I suspect that the overall effect
will be a 30-40% speedup in clock_gettime along with a decent reduction
of code complexity.

--Andy

^ permalink raw reply	[flat|nested] 13+ messages in thread
end of thread, other threads:[~2014-04-02 22:32 UTC | newest]
Thread overview: 13+ messages
2014-03-29 8:47 VDSO pvclock may increase host cpu consumption, is this a problem? Zhanghailiang
2014-03-29 14:46 ` Marcelo Tosatti
2014-03-31 1:12 ` Zhanghailiang
2014-03-31 17:52 ` Andy Lutomirski
2014-03-31 21:30 ` Marcelo Tosatti
2014-04-01 5:33 ` Andy Lutomirski
2014-04-01 18:01 ` Marcelo Tosatti
2014-04-01 19:17 ` Andy Lutomirski
2014-04-02 0:12 ` Marcelo Tosatti
2014-04-02 0:20 ` Andy Lutomirski
[not found] ` <20140402002926.GB31945@amt.cnet>
2014-04-02 0:46 ` Andy Lutomirski
2014-04-02 22:05 ` Marcelo Tosatti
2014-04-02 22:31 ` Andy Lutomirski