From mboxrd@z Thu Jan  1 00:00:00 1970
From: Gleb Natapov <gleb@redhat.com>
Subject: Re: KVM Guest Lock up (100%) again!
Date: Fri, 12 Apr 2013 18:13:16 +0300
Message-ID: <20130412151316.GA25786@redhat.com>
References: <764654559.1031795.1365086171933.JavaMail.root@innovot.com>
 <717425441.1193987.1365451324088.JavaMail.root@innovot.com>
 <20130410141027.GI17919@redhat.com>
 <221698623.1341142.1365775843844.JavaMail.root@innovot.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org
To: Phil Daws <uxbod@splatnix.net>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:12783 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753831Ab3DLPNX (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 12 Apr 2013 11:13:23 -0400
Content-Disposition: inline
In-Reply-To: <221698623.1341142.1365775843844.JavaMail.root@innovot.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Fri, Apr 12, 2013 at 03:10:43PM +0100, Phil Daws wrote:
> Well, this is still happening. I have tried to isolate what could be causing it, but without much luck so far. I thought the VMs might be I/O bound, but that is not the case, and I even tried raising the vCPU allocation from one to two, as there is plenty of headroom. When it locks up, I see this in strace:
> 
> [pid  1343] read(14, 0x7fff82aeb360, 4096) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  1343] read(7, "\0", 512)          = 1
> [pid  1343] read(7, 0x7fff82aec160, 512) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  1343] select(26, [7 10 13 14 16 17 22 25], [], [], {1, 0}) = 1 (in [16], left {0, 999981})
> [pid  1343] read(16, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0"..., 128) = 128
> [pid  1343] rt_sigaction(SIGALRM, NULL, {0x7f210b2c0510, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7f210ac22500}, 8) = 0
> [pid  1343] write(8, "\0", 1)           = 1
> [pid  1343] write(15, "\1\0\0\0\0\0\0\0", 8) = 8
> [pid  1343] read(16, 0x7fff82aec2d0, 128) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  1343] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 0}}) = 0
> [pid  1343] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 656000000}}, NULL) = 0
> [pid  1343] select(26, [7 10 13 14 16 17 22 25], [], [], {1, 0}) = 2 (in [7 14], left {0, 999998})
> [pid  1343] read(14, "\1\0\0\0\0\0\0\0", 4096) = 8
> [pid  1343] read(14, 0x7fff82aeb360, 4096) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  1343] read(7, "\0", 512)          = 1
> [pid  1343] read(7, 0x7fff82aec160, 512) = -1 EAGAIN (Resource temporarily unavailable)
> 
> Does that shed any light? I am trying to find a how-to for upgrading to the latest KVM/QEMU.
> 
Does the lockup happen with upstream now? strace is not very helpful
for diagnosing KVM problems. Try running ftrace instead:
http://www.linux-kvm.org/page/Tracing
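
As a rough illustration of collecting KVM events through ftrace, here is a
minimal Python sketch. It assumes debugfs is mounted at /sys/kernel/debug,
that the host kernel exposes the kvm trace event group, and that it runs as
root on the host. The helper names set_kvm_tracing and read_trace are made up
for this example; only the control files events/kvm/enable, tracing_on and
trace belong to the ftrace interface itself.

```python
import os

# Assumption: debugfs is mounted in the usual place on the host.
DEFAULT_TRACING_DIR = "/sys/kernel/debug/tracing"


def _write(path, value):
    """Write a short control value to an ftrace control file."""
    with open(path, "w") as f:
        f.write(value)


def set_kvm_tracing(enabled, tracing_dir=DEFAULT_TRACING_DIR):
    """Switch the kvm event group and the trace ring buffer on or off.

    Hypothetical helper for illustration, not part of any library.
    """
    flag = "1" if enabled else "0"
    _write(os.path.join(tracing_dir, "events", "kvm", "enable"), flag)
    _write(os.path.join(tracing_dir, "tracing_on"), flag)


def read_trace(tracing_dir=DEFAULT_TRACING_DIR, max_lines=200):
    """Return up to max_lines event lines from the trace buffer.

    Lines starting with '#' are ftrace header comments and are skipped.
    """
    lines = []
    with open(os.path.join(tracing_dir, "trace")) as f:
        for line in f:
            if line.startswith("#"):
                continue
            lines.append(line.rstrip("\n"))
            if len(lines) >= max_lines:
                break
    return lines
```

One way to use it: call set_kvm_tracing(True) before the guest locks up, call
set_kvm_tracing(False) once it has, then print the result of read_trace() and
include that output in the report.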

--
			Gleb.