From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37819)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <zhanghy@sangfor.com>) id 1XH1Yr-0004S4-Vz
	for qemu-devel@nongnu.org; Mon, 11 Aug 2014 22:11:48 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <zhanghy@sangfor.com>) id 1XH1Yl-0002QF-Q6
	for qemu-devel@nongnu.org; Mon, 11 Aug 2014 22:11:41 -0400
Received: from [58.251.49.30] (port=45001 helo=mail.sangfor.com)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <zhanghy@sangfor.com>) id 1XH1Yk-0002Ox-DB
	for qemu-devel@nongnu.org; Mon, 11 Aug 2014 22:11:35 -0400
Date: Tue, 12 Aug 2014 10:09:08 +0800
From: "Zhang Haoyu" <zhanghy@sangfor.com>
References: <53E87FD1.3070600@huawei.com>,
	<20140811142136.GA496@stefanha-thinkpad.redhat.com>,
	<20140812005853.GC6226@T430.nay.redhat.com>
Message-ID: <201408121009071107044@sangfor.com>
Mime-Version: 1.0
Content-Type: text/plain;
	charset="gb2312"
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] the whole virtual machine hangs when IO does
	not come back!
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Fam Zheng <famz@redhat.com>, Stefan Hajnoczi <stefanha@gmail.com>
Cc: qemu-devel <qemu-devel@nongnu.org>, Bin Wu <wu.wubin@huawei.com>

>> > Hi,
>> > 
>> > I tested the reliability of qemu in the IPSAN environment as follows:
>> > (1) create one VM on an x86 server which is connected to an IPSAN, and the VM
>> > has only one system volume which is on the IPSAN;
>> > (2) disconnect the network between the server and the IPSAN. On the server,
>> > I have "multipath" software which can hold the IO for a long time
>> > (configurable) when the network is disconnected;
>> > (3) about 30 seconds later, the whole VM hangs there, nothing can be done to
>> > the VM!
>> > 
>> > Then I used the "gstack" tool to collect the stacks of all qemu threads. It
>> > looked like:
>> > 
>> > Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)):
>> > #0  0x00007fd84253a4f6 in poll () from /lib64/libc.so.6
>> > #1  0x00007fd84410ceff in aio_poll ()
>> > #2  0x00007fd84429bb05 in qemu_aio_wait ()
>> > #3  0x00007fd844120f51 in bdrv_drain_all ()
>> > #4  0x00007fd8441f1a4a in bmdma_cmd_writeb ()
>> > #5  0x00007fd8441f216e in bmdma_write ()
>> > #6  0x00007fd8443a93cf in memory_region_write_accessor ()
>> > #7  0x00007fd8443a94a6 in access_with_adjusted_size ()
>> > #8  0x00007fd8443a9901 in memory_region_iorange_write ()
>> > #9  0x00007fd8443a19bd in ioport_writeb_thunk ()
>> > #10 0x00007fd8443a13a8 in ioport_write ()
>> > #11 0x00007fd8443a1f55 in cpu_outb ()
>> > #12 0x00007fd8443a5b12 in kvm_handle_io ()
>> > #13 0x00007fd8443a64a9 in kvm_cpu_exec ()
>> > #14 0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
>> > #15 0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
>> > #16 0x00007fd8425439cd in clone () from /lib64/libc.so.6
>> > #17 0x0000000000000000 in ?? ()
>> 
>> Use virtio-blk.  Read, write, and flush are asynchronous in virtio-blk.
>> 
>> Note that the QEMU monitor commands are typically synchronous so they
>> will still block the VM.
>> 
>
>If some of the requests are dropped by host and never return to QEMU, I think
>bdrv_drain_all() will still cause the hang. Even with virtio-blk, reset has
>such a call. Maybe we could add some -ETIMEDOUT mechanism in QEMU's block
>layer.
>
>A workaround might be to configure the host storage to fail the IO after a
>timeout.
>
If -ETIMEDOUT is returned after a brief network disconnection, could an
unpredictable fault occur in the VM? For example, the VM may have been reading
important data (such as system data) at that moment.
Does aio replay work for this case?
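
To make the question concrete, here is a minimal sketch of the kind of bounded
wait I understand the -ETIMEDOUT suggestion to mean. It is not QEMU code:
wait_completion_with_timeout() is a hypothetical helper, and the file
descriptor only stands in for "an in-flight request has completed". The caller
would still have to decide what to report to the guest when it gets -ETIMEDOUT.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper (an assumption for illustration, not a QEMU API).
 * Wait until 'fd' becomes readable, which here stands in for the
 * completion of an in-flight request. Instead of blocking forever,
 * give up after 'timeout_ms' milliseconds.
 *
 * Returns 0 on completion, -ETIMEDOUT on timeout, or another negative
 * errno value on failure.
 */
static int wait_completion_with_timeout(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    do {
        /* Note: a retry after EINTR restarts the full timeout. */
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0) {
        return -errno;      /* poll() itself failed */
    }
    if (ret == 0) {
        return -ETIMEDOUT;  /* nothing completed within the timeout */
    }
    return 0;               /* fd reported an event: request is done */
}
```

With a short timeout, a brief network disconnection would already surface as
-ETIMEDOUT, which is exactly the case I am worried about.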

Thanks,
Zhang Haoyu

>Fam