From mboxrd@z Thu Jan  1 00:00:00 1970
From: Phil Daws <uxbod@splatnix.net>
Subject: Re: KVM Guest Lock up (100%) again!
Date: Fri, 12 Apr 2013 17:31:56 +0100 (BST)
Message-ID: <531816088.1344948.1365784316602.JavaMail.root@innovot.com>
References: <764654559.1031795.1365086171933.JavaMail.root@innovot.com> <717425441.1193987.1365451324088.JavaMail.root@innovot.com> <20130410141027.GI17919@redhat.com> <221698623.1341142.1365775843844.JavaMail.root@innovot.com> <20130412151316.GA25786@redhat.com>
Reply-To: Phil Daws <uxbod@splatnix.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Gleb Natapov <gleb@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.dc1.innovot.com ([77.73.4.109]:39754 "EHLO
	mx1.dc1.innovot.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752441Ab3DLQdT (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 12 Apr 2013 12:33:19 -0400
In-Reply-To: <20130412151316.GA25786@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

I was running two guests on kernel 3.7.10 but have now switched one to the stock 2.6.32 kernel, and neither has crashed yet.  I will leave them running as they are, and if things stay stable I will switch the other one to the stock kernel as well.  I am wondering whether I have hit a kernel bug. Thank you for the ftrace info. Have a great weekend.

----- Original Message -----
From: "Gleb Natapov" <gleb@redhat.com>
To: "Phil Daws" <uxbod@splatnix.net>
Cc: kvm@vger.kernel.org
Sent: Friday, 12 April, 2013 4:13:16 PM
Subject: Re: KVM Guest Lock up (100%) again!

On Fri, Apr 12, 2013 at 03:10:43PM +0100, Phil Daws wrote:
> Well, this is still happening ... I have tried to isolate what could be causing it, but without much luck yet.  I thought the VMs might be I/O bound, but that is not the case, and I even tried raising the vCPU allocation from one to two as there is plenty of headroom.  When it locks up I see this in strace:
> 
> [pid  1343] read(14, 0x7fff82aeb360, 4096) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  1343] read(7, "\0", 512)          = 1
> [pid  1343] read(7, 0x7fff82aec160, 512) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  1343] select(26, [7 10 13 14 16 17 22 25], [], [], {1, 0}) = 1 (in [16], left {0, 999981})
> [pid  1343] read(16, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0"..., 128) = 128
> [pid  1343] rt_sigaction(SIGALRM, NULL, {0x7f210b2c0510, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7f210ac22500}, 8) = 0
> [pid  1343] write(8, "\0", 1)           = 1
> [pid  1343] write(15, "\1\0\0\0\0\0\0\0", 8) = 8
> [pid  1343] read(16, 0x7fff82aec2d0, 128) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  1343] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 0}}) = 0
> [pid  1343] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 656000000}}, NULL) = 0
> [pid  1343] select(26, [7 10 13 14 16 17 22 25], [], [], {1, 0}) = 2 (in [7 14], left {0, 999998})
> [pid  1343] read(14, "\1\0\0\0\0\0\0\0", 4096) = 8
> [pid  1343] read(14, 0x7fff82aeb360, 4096) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  1343] read(7, "\0", 512)          = 1
> [pid  1343] read(7, 0x7fff82aec160, 512) = -1 EAGAIN (Resource temporarily unavailable)
> 
> Does that shed any light? I am trying to find a how-to for upgrading to the latest KVM/QEMU.
> 
Is the lockup happening with upstream now? strace is not very helpful for
diagnosing kvm problems. Try running ftrace: http://www.linux-kvm.org/page/Tracing

--
			Gleb.