From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gerd Hoffmann
Subject: Re: [PATCH 0/4] paravirt clock patches
Date: Wed, 07 May 2008 20:45:12 +0200
Message-ID: <4821F8B8.7050601@redhat.com>
References: <1209026228-9113-1-git-send-email-kraxel@redhat.com> <20080428192816.GA4596@dmt>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------020903080006060505050103"
Cc: kvm-devel@lists.sourceforge.net, Glauber de Oliveira Costa
To: Marcelo Tosatti
In-Reply-To: <20080428192816.GA4596@dmt>
Sender: kvm-devel-bounces@lists.sourceforge.net
Errors-To: kvm-devel-bounces@lists.sourceforge.net
List-Id: kvm.vger.kernel.org

This is a multi-part message in MIME format.
--------------020903080006060505050103
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Marcelo Tosatti wrote:
> On Thu, Apr 24, 2008 at 10:37:04AM +0200, Gerd Hoffmann wrote:
>> Hi folks,
>>
>> My first attempt to send out a patch series with git ...
>>
>> The patches fix the kvm paravirt clocksource code to be compatible with
>> xen and they also factor out some code which can be shared into a
>> separate source files used by both kvm and xen.
>
> The issue with SMP guests is still present. Booting with "nohz=off" resolves it.
>
> Same symptoms as before, apic_timer_fn for one of the vcpu's is ticking way slower
> than the remaining ones:
>
> [root@localhost ~]# cat /proc/timer_stats | grep apic
>  391, 4125 qemu-system-x86 apic_mmio_write (apic_timer_fn)
> 2103, 4126 qemu-system-x86 apic_mmio_write (apic_timer_fn)
> 1896, 4127 qemu-system-x86 apic_mmio_write (apic_timer_fn)
> 1857, 4128 qemu-system-x86 apic_mmio_write (apic_timer_fn)
>
> Let me know what else is needed, or any patches to try.

Ok folks, here is the band aid fix for testing from the odd bugs
department.  Goes on top of the four patches of this series.  A real,
clean solution is TBD.
Tomorrow I hope (some urgent private problems are in the queue too ...).

The problem is that the per-cpu area for cpu 0 has two locations in
memory, one before and one after pda initialization.  kvmclock registers
the first due to being initialized quite early, and the paravirt clock
for cpu 0 stops seeing updates once the pda setup is done.  That makes
the TSC effectively the base for timekeeping (instead of using the TSC
for millisecond delta adjustments only).  Secondary CPUs work as intended.

This obviously screws up timekeeping on SMP guests, especially on hosts
with unstable TSC.

happy testing,
  Gerd

--
[root@localhost ~]# dmesg | grep _clock
kvm_register_clock: cpu 0 at 0:798601 (boot)
kvm_clock_read: cpu 0 at 0:140b601 (pda)
kvm_register_clock: cpu 1 at 0:1415601

--------------020903080006060505050103
Content-Type: text/plain; name="fix"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="fix"

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 867523e..43135ed 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -40,6 +40,7 @@ early_param("no-kvmclock", parse_no_kvmclock);
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct kvm_vcpu_time_info, hv_clock);
 static struct kvm_wall_clock wall_clock;
+static void *boot_clock;
 
 /*
  * The wallclock is the time of day when we booted. Since then, some time may
@@ -74,6 +75,19 @@ static cycle_t kvm_clock_read(void)
 	cycle_t ret;
 
 	src = &get_cpu_var(hv_clock);
+
+	if (boot_clock && 0 == smp_processor_id()) {
+		if (boot_clock != src) {
+			int low, high;
+			low = (int)__pa(src) | 1;
+			high = ((u64)__pa(src) >> 32);
+			printk(KERN_INFO "%s: cpu %d at %x:%x (pda)\n", __FUNCTION__,
+			       smp_processor_id(), high, low);
+			native_write_msr_safe(MSR_KVM_SYSTEM_TIME, low, high);
+			boot_clock = NULL;
+		}
+	}
+
 	ret = pvclock_clocksource_read(src);
 	put_cpu_var(hv_clock);
 	return ret;
@@ -92,12 +106,18 @@ static struct clocksource kvm_clock = {
 static int kvm_register_clock(void)
 {
 	int cpu = smp_processor_id();
+	void *ptr;
 	int low, high;
 
-	low = (int)__pa(&per_cpu(hv_clock, cpu)) | 1;
-	high = ((u64)__pa(&per_cpu(hv_clock, cpu)) >> 32);
-	printk(KERN_DEBUG "%s: cpu %d at %x:%x\n", __FUNCTION__,
-	       cpu, high, low);
+	ptr = &per_cpu(hv_clock, cpu);
+	if (0 == cpu)
+		boot_clock = ptr;
+
+	low = (int)__pa(ptr) | 1;
+	high = ((u64)__pa(ptr) >> 32);
+
+	printk(KERN_INFO "%s: cpu %d at %x:%x%s\n", __FUNCTION__,
+	       cpu, high, low, boot_clock ? " (boot)" : "");
 	return native_write_msr_safe(MSR_KVM_SYSTEM_TIME, low, high);
 }
 

--------------020903080006060505050103
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

--------------020903080006060505050103--