Message-ID: <5270E598.9040200@gmail.com>
Date: Wed, 30 Oct 2013 11:55:20 +0100
From: Jack Wang
Subject: Re: [Qemu-devel] [RFC] block io lost in the guest , possible related to qemu?
References: <526A87E2.9020705@gmail.com> <20131030095018.GC11994@stefanha-thinkpad.redhat.com>
In-Reply-To: <20131030095018.GC11994@stefanha-thinkpad.redhat.com>
To: Stefan Hajnoczi
Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi, Alexey Zaytsev

On 10/30/2013 10:50 AM, Stefan Hajnoczi wrote:
> On Fri, Oct 25, 2013 at 05:01:54PM +0200, Jack Wang wrote:
>> We've seen guest block I/O lost in a VM. Any response would be helpful.
>>
>> Environment:
>> guest OS: Ubuntu 13.04
>> running a busy database workload with XFS on a disk exported with virtio-blk
>>
>> The exported vdb has very high in-flight I/O, over 300. Some time later
>> a lot of I/O processes are in D state; it looks like a lot of requests
>> are lost somewhere in the storage stack below.
>
> Is the image file on a local file system or are you using a network
> storage system (e.g. NFS, Gluster, Ceph, Sheepdog)?
>
> If you run "vmstat 5" inside the guest, do you see "bi"/"bo" block I/O
> activity?  If that number is very low or zero then there may be a
> starvation problem.  If that number is reasonable then the workload is
> simply bottlenecked on disk I/O.
>
> virtio-blk only has 128 descriptors available so it's not possible to
> have 300 requests pending at the virtio-blk layer.
>
> If you suspect QEMU, try building qemu.git/master from source in case
> the bug has already been fixed.
>
> If you want to trace I/O requests, you might find this blog post on
> writing trace analysis scripts useful:
> http://blog.vmsplice.net/2011/03/how-to-write-trace-analysis-scripts-for.html
>
> Stefan
>

Thanks, Stefan, for your valuable input.

The image is on a device exported over InfiniBand SRP/SRPT.

I will follow your suggestions and investigate further.

The 300 in-flight I/Os I mentioned come from /proc/diskstats,
Field 9 -- # of I/Os currently in progress.

Jack
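
For reference, that counter can be sampled with a small script along these
lines (a minimal sketch, assuming the usual /proc/diskstats layout where the
device name is the third column and Field 9 of the stats, the in-flight I/O
count, is the twelfth column overall; the device name "vdb" and the 5-second
interval are only examples):

#!/usr/bin/env python
# Sample the "I/Os currently in progress" counter for one block device.
# Field 9 in Documentation/iostats.txt terms == 12th column of the line.
import sys
import time

def inflight(device):
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            # columns: major, minor, device name, then the 11 stat fields
            if len(fields) > 11 and fields[2] == device:
                return int(fields[11])
    raise ValueError("device %s not found in /proc/diskstats" % device)

if __name__ == "__main__":
    dev = sys.argv[1] if len(sys.argv) > 1 else "vdb"
    while True:
        print("%s in-flight: %d" % (dev, inflight(dev)))
        time.sleep(5)

Running it alongside "vmstat 5" in the guest should make it easier to see
whether the in-flight count stays high while bi/bo activity drops off.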