From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37158) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XQuQM-0005Rx-7q for qemu-devel@nongnu.org; Mon, 08 Sep 2014 04:35:55 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XQuQD-00082z-7K for qemu-devel@nongnu.org; Mon, 08 Sep 2014 04:35:46 -0400
Received: from mail-wi0-x236.google.com ([2a00:1450:400c:c05::236]:54339) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XQuQD-00082n-1M for qemu-devel@nongnu.org; Mon, 08 Sep 2014 04:35:37 -0400
Received: by mail-wi0-f182.google.com with SMTP id z2so2185136wiv.9 for ; Mon, 08 Sep 2014 01:35:36 -0700 (PDT)
Date: Mon, 8 Sep 2014 09:35:33 +0100
From: Stefan Hajnoczi
Message-ID: <20140908083533.GB7638@stefanha-thinkpad.redhat.com>
References: <53E87FD1.3070600@huawei.com> <20140811142136.GA496@stefanha-thinkpad.redhat.com> <53E96982.3050704@huawei.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="lEGEL1/lMxI0MVQ2"
Content-Disposition: inline
In-Reply-To: <53E96982.3050704@huawei.com>
Subject: Re: [Qemu-devel] the whole virtual machine hangs when IO does not come back!
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Bin Wu
Cc: qemu-devel@nongnu.org, peter.huangpeng@huawei.com

--lEGEL1/lMxI0MVQ2
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Aug 12, 2014 at 09:10:26AM +0800, Bin Wu wrote:
> On 2014/8/11 22:21, Stefan Hajnoczi wrote:
> >On Mon, Aug 11, 2014 at 04:33:21PM +0800, Bin Wu wrote:
> >>Hi,
> >>
> >>I tested the reliability of qemu in the IPSAN environment as follows:
> >>(1) create one VM on a X86 server which is connected to an IPSAN, and the VM
> >>has only one system volume which is on the IPSAN;
> >>(2) disconnect the network between the server and the IPSAN.
> >>On the server,
> >>I have a "multipath" software which can hold the IO for a long time
> >>(configurable) when the network is disconnected;
> >>(3) about 30 seconds later, the whole VM hangs there, nothing can be done to
> >>the VM!
> >>
> >>Then, I used "gstack" tool to collect the stacks of all qemu threads, it
> >>looked like:
> >>
> >>Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)):
> >>#0 0x00007fd84253a4f6 in poll () from /lib64/libc.so.6
> >>#1 0x00007fd84410ceff in aio_poll ()
> >>#2 0x00007fd84429bb05 in qemu_aio_wait ()
> >>#3 0x00007fd844120f51 in bdrv_drain_all ()
> >>#4 0x00007fd8441f1a4a in bmdma_cmd_writeb ()
> >>#5 0x00007fd8441f216e in bmdma_write ()
> >>#6 0x00007fd8443a93cf in memory_region_write_accessor ()
> >>#7 0x00007fd8443a94a6 in access_with_adjusted_size ()
> >>#8 0x00007fd8443a9901 in memory_region_iorange_write ()
> >>#9 0x00007fd8443a19bd in ioport_writeb_thunk ()
> >>#10 0x00007fd8443a13a8 in ioport_write ()
> >>#11 0x00007fd8443a1f55 in cpu_outb ()
> >>#12 0x00007fd8443a5b12 in kvm_handle_io ()
> >>#13 0x00007fd8443a64a9 in kvm_cpu_exec ()
> >>#14 0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
> >>#15 0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
> >>#16 0x00007fd8425439cd in clone () from /lib64/libc.so.6
> >>#17 0x0000000000000000 in ?? ()
> >Use virtio-blk. Read, write, and flush are asynchronous in virtio-blk.
> >
> >Note that the QEMU monitor commands are typically synchronous so they
> >will still block the VM.
> >
> >Stefan
> Thank you for your attention. I tested virtio-blk and it's true that the VM
> doesn't hang.
> Why does the virtio-blk implement this in asynchronous way, but virtio-scsi
> in synchronous way?

There is no fundamental reason why virtio-scsi should be synchronous;
it's just that QEMU internally has some points (such as the
bdrv_drain_all() that Fam mentioned) that wait synchronously. Since the
SCSI cancel code path hit bdrv_drain_all(), the guest hung.
virtio-blk doesn't have a "cancel" operation and therefore doesn't hang.

Stefan

--lEGEL1/lMxI0MVQ2--
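
To illustrate the difference described above, here is a toy model in Python. It is not QEMU code: ToyBlockLayer, run_vcpu and the max_polls cut-off are hypothetical names invented for this sketch, and only the name drain_all is borrowed (loosely) from bdrv_drain_all(). The sketch assumes a backend that never completes one request, as when multipath holds the I/O, and compares a caller that only checks for completions with one that waits synchronously for every request to finish.

```python
class ToyBlockLayer:
    """Toy stand-in for a block layer's list of in-flight requests."""

    def __init__(self):
        # Each entry is [name, polls_until_done]; None means the backend
        # never answers (for example, the storage path is down).
        self.in_flight = []

    def submit(self, name, polls_until_done):
        self.in_flight.append([name, polls_until_done])

    def poll(self):
        """One event-loop pass: retire requests the backend has finished."""
        remaining = []
        for req in self.in_flight:
            if req[1] is None:
                remaining.append(req)      # still held by the backend
            elif req[1] > 1:
                req[1] -= 1
                remaining.append(req)      # needs more passes
            # otherwise the request completes on this pass and is dropped
        self.in_flight = remaining

    def drain_all(self, max_polls):
        """Synchronous wait: poll until nothing is in flight.

        max_polls exists only so this demo terminates. Returns the number
        of passes used, or None where a real synchronous wait would still
        be blocked.
        """
        polls = 0
        while self.in_flight:
            if polls == max_polls:
                return None
            self.poll()
            polls += 1
        return polls


def run_vcpu(layer, steps, sync_wait_at=None, max_polls=1000):
    """Return how many guest steps ran.

    If sync_wait_at is set, the caller performs a synchronous drain at
    that step, loosely modelling a cancel path that waits for all I/O.
    """
    executed = 0
    for step in range(steps):
        if step == sync_wait_at and layer.drain_all(max_polls) is None:
            return executed                # stuck: no further guest progress
        layer.poll()                       # asynchronous: check and carry on
        executed += 1
    return executed
```

In this model, a caller that never drains keeps making progress even while a request is stuck, whereas a caller that drains at step 4 stops there for as long as the backend holds the request.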