From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zachary Amsden
Subject: Re: RHEL5.5, 32-bit VM repeatedly locks up due to kvmclock
Date: Fri, 23 Apr 2010 11:39:54 -1000
Message-ID: <4BD213AA.7070601@redhat.com>
References: <4BD1D406.1040508@cisco.com> <201004231539.32205.iggy@theiggy.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "David S. Ahern" , kvm-devel
To: Brian Jackson
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:41103 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751920Ab0DWVj6 (ORCPT ); Fri, 23 Apr 2010 17:39:58 -0400
In-Reply-To: <201004231539.32205.iggy@theiggy.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 04/23/2010 10:39 AM, Brian Jackson wrote:
> On Friday 23 April 2010 12:08:22 David S. Ahern wrote:
>
>> After a few days of debugging I think kvmclock is the source of lockups
>> for a RHEL5.5-based VM. The VM works fine on one host, but repeatedly
>> locks up on another.
>>
>> Server 1 - VM locks up repeatedly
>> -- DL580 G5
>> -- 4 quad-core X7350 processors at 2.93GHz
>> -- 48GB RAM
>>
>> Server 2 - VM works just fine
>> -- DL380 G6
>> -- 2 quad-core E5540 processors at 2.53GHz
>> -- 24GB RAM
>>
>> Both host servers are running Fedora Core 12, 2.6.32.11-99.fc12.x86_64
>> kernel. I have tried various versions of qemu-kvm -- the version in
>> FC-12 and the version for FC-12 in virt-preview. In both cases the
>> qemu-kvm command line is identical.
>>
>> VM
>> - RHEL5.5, PAE kernel (also tried standard 32-bit)
>> - 2 vcpus
>> - 3GB RAM
>> - virtio network and disk
>>
>> When the VM locks up both vcpu threads are spinning at 100%. Changing
>> the clocksource to jiffies appears to have addressed the problem.
>>
>
> Does changing the guest to -smp 1 help?
>
>

Based on our current understanding of the problem, it should help, but it may not prevent the problem entirely.

There are three issues with kvmclock due to sampling:

1) smp clock alignment may be slightly off due to timing conditions
2) kvmclock is resampled at each switch of vcpu to another pcpu
3) kvmclock granularity exceeds that of kernel timespec, which means
   sampling errors may show even on UP

Recommend using a different clocksource (tsc is great if you have stable TSC and don't migrate across different-speed machines) until we have all the fixes in place.

Zach
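
For anyone following along, checking and switching the guest clocksource at runtime can be sketched roughly as below. This is only an illustration and not something taken from the thread: it assumes the guest kernel exposes the standard sysfs files under /sys/devices/system/clocksource/clocksource0 and that Python 3 is available in the guest. The function names are made up for the example.

```python
import os

# Standard Linux sysfs location for the clocksource controls.
# Assumption: the guest kernel exposes these files.
SYSFS_DIR = "/sys/devices/system/clocksource/clocksource0"


def available_clocksources(base=SYSFS_DIR):
    """Return the list of clocksource names the kernel offers."""
    with open(os.path.join(base, "available_clocksource")) as f:
        return f.read().split()


def current_clocksource(base=SYSFS_DIR):
    """Return the name of the clocksource currently in use."""
    with open(os.path.join(base, "current_clocksource")) as f:
        return f.read().strip()


def set_clocksource(name, base=SYSFS_DIR):
    """Switch to clocksource `name`; needs root on a real system."""
    # Refuse names the kernel does not list, rather than writing blindly.
    if name not in available_clocksources(base):
        raise ValueError("clocksource %r not offered by this kernel" % name)
    with open(os.path.join(base, "current_clocksource"), "w") as f:
        f.write(name)
```

Calling set_clocksource("tsc") as root would switch the running guest only if tsc is among the names listed by available_clocksources(). A change made this way does not survive a reboot. The `base` parameter exists only so the helpers can be pointed at a scratch directory for testing.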