From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity <avi@redhat.com>
Subject: Re: KVM host freezing
Date: Thu, 02 Jun 2011 16:41:30 +0300
Message-ID: <4DE7930A.3060704@redhat.com>
References: <20110602082512.GF14747@torres.zugschlus.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Marc Haber <mh+kvm@zugschlus.de>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:24971 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752832Ab1FBNll (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 2 Jun 2011 09:41:41 -0400
In-Reply-To: <20110602082512.GF14747@torres.zugschlus.de>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 06/02/2011 11:25 AM, Marc Haber wrote:
> Hi,
>
> I have just started deploying a host doing virtualization with KVM.
> The box has an Athlon 64 X2, 4 GB RAM and is running Debian squeeze
> with a locally built 2.6.39 kernel and backported versions of qemu-kvm
> (0.14.0) and libvirt (0.9.0) from Debian sid. The box is currently
> hosting five VMs, all of them Debian systems as well and rather
> unloaded. The only time when there is significant load is when all VMs
> are simultaneously starting up their cron jobs.
>
> When the host starts up, it immediately spews the following lines to
> the console:
>
> kvm: 2865: cpu0 unhandled rdmsr: 0xc0010048
> kvm: 2865: cpu0 unhandled wrmsr: 0xc0010048 data 2100000401
> kvm: 2865: cpu0 unhandled rdmsr: 0xc0010001
> kvm: 2849: cpu0 unhandled rdmsr: 0xc0010048
> kvm: 2849: cpu0 unhandled wrmsr: 0xc0010048 data c0579f7cc0010448
> kvm: 2849: cpu0 unhandled rdmsr: 0xc0010001
> kvm: 2950: cpu0 unhandled rdmsr: 0xc0010048
> kvm: 2950: cpu0 unhandled wrmsr: 0xc0010048 data c0579f7cc0010448
> kvm: 2849: cpu1 unhandled rdmsr: 0xc0010048
> kvm: 2963: cpu0 unhandled rdmsr: 0xc0010112
> kvm: 2963: cpu0 unhandled rdmsr: 0xc0010048
> kvm: 2963: cpu0 unhandled wrmsr: 0xc0010048 data 2100000401
> kvm: 2963: cpu0 unhandled rdmsr: 0xc0010001
> kvm: 2963: cpu1 unhandled rdmsr: 0xc0010048
> kvm: 2963: cpu1 unhandled wrmsr: 0xc0010048 data 2100000401
>
> Every few days, the system stops dead in its tracks and needs a hard
> reset to be revived. I have a serial console, which unfortunately
> disconnects me after a few minutes of inactivity, and only caches the
> last few lines of activity. Whenever I connect to the serial console
> of the frozen system, I have a few lines of the same "unhandled
> (rd|wr)msr" messages.
>
> The syslog doesn't show anything strange. The system just stops dead
> in its tracks.
>
> Is there any possibility that the freezes have to do with the
> "unhandled (rd|wr)msr" messages?

Very unlikely.

> What else could be the cause?
>
> In the meantime, I have taken the box offline and am running memtest.
> Up to now, everything seems to be fine.
>
> Any hints will be appreciated.

You might try setting up netconsole to get reliable logging.
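
As a rough sketch of what that could look like: the netconsole module
takes a parameter of the form
src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac. The helper name below is
made up for illustration, and every address, port, interface name and
MAC in the usage comment is a placeholder that has to be replaced with
values from your own network.

```shell
# Hypothetical helper: assemble the netconsole= module parameter.
# Arguments: src_port src_ip dev tgt_port tgt_ip tgt_mac
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example usage on the KVM host (run as root; all values are placeholders):
#   modprobe netconsole \
#       netconsole="$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20 00:11:22:33:44:55)"
#
# On the receiving machine, a UDP listener on the target port collects
# the messages. The exact netcat flags differ between netcat variants:
#   nc -u -l 6666 | tee netconsole.log
```

Because the messages leave the box as UDP packets, the last lines
before a freeze should end up in the receiver's log even though the
serial console only keeps a few lines.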

Do you have NMIs?  'grep NMI /proc/interrupts'.
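
If you want to compare the count over time rather than eyeball it, a
small helper along these lines might do. The function name is made up,
and it assumes the usual /proc/interrupts layout where the "NMI:" label
is followed by one counter per CPU and then a text description.

```shell
# Hypothetical helper: sum the per-CPU NMI counters.
# Reads /proc/interrupts by default, or the file given as first argument.
nmi_total() {
    awk '$1 == "NMI:" {
             # Add up the numeric per-CPU columns; stop at the description.
             for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
                 sum += $i
         }
         END { print sum + 0 }' "${1:-/proc/interrupts}"
}

# Example usage: sample twice and compare.
#   before=$(nmi_total); sleep 10; after=$(nmi_total)
#   echo "NMIs in 10s: $((after - before))"
```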

Does running 'perf top -F 10000' make the hang come sooner?

-- 
error compiling committee.c: too many arguments to function