From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59227) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aVK05-0001l3-Pl for qemu-devel@nongnu.org; Mon, 15 Feb 2016 09:19:42 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aVK02-0003L2-Iu for qemu-devel@nongnu.org; Mon, 15 Feb 2016 09:19:41 -0500 Received: from mx1.redhat.com ([209.132.183.28]:34397) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aVK02-0003Kx-BC for qemu-devel@nongnu.org; Mon, 15 Feb 2016 09:19:38 -0500 From: Markus Armbruster References: <20160208151722.GB3022@work-vm> <20160209134706.GE6510@stefanha-x1.localdomain> <20160214062210.GD9723@ad.usersys.redhat.com> <20160215134214.GC10217@stefanha-x1.localdomain> Date: Mon, 15 Feb 2016 15:19:35 +0100 In-Reply-To: <20160215134214.GC10217@stefanha-x1.localdomain> (Stefan Hajnoczi's message of "Mon, 15 Feb 2016 13:42:14 +0000") Message-ID: <874mda2gzs.fsf@blackfin.pond.sub.org> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: [Qemu-devel] lock-free monitor? List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Stefan Hajnoczi Cc: xiecl.fnst@cn.fujitsu.com, Fam Zheng , "Dr. David Alan Gilbert" , qemu-devel@nongnu.org Stefan Hajnoczi writes: > On Sun, Feb 14, 2016 at 02:22:10PM +0800, Fam Zheng wrote: >> On Tue, 02/09 13:47, Stefan Hajnoczi wrote: >> > On Mon, Feb 08, 2016 at 03:17:23PM +0000, Dr. David Alan Gilbert wrote: >> > > Does this make sense to everyone else, or does anyone have any better >> > > suggestions? >> > >> > As a concrete example, any monitor command that calls bdrv_drain_all() >> > can hang forever with the QEMU global mutex held if I/O requests are >> > stuck (e.g. NFS mount is unreachable). >> > >> > bdrv_aio_cancel() can also hang but is mostly exposed to device >> > emulation, not the monitor. >> > >> > One solution for these block layer functions is to add a timeout >> > argument and let them return an error. This way the monitor and device >> > emulation do not hang forever. >> >> Yes, there are a few places in block layer invoking aio_poll() in a loop >> waiting for certain events, and a disconnected network link could make QEMU >> hang. In these cases a timeout is a huge improvement. Maybe we can mark the >> BDS as "hanging" (-EIO is returned for all further requests) and let >> bdrv_drain_all() return. > > No, we need to be very careful about hung requests that are still in > flight. They could complete after a long time and modify the disk or > guest RAM. > > The only thing bdrv_drain_all() can return is -ETIME. Beyond that it > cannot pretend that requests have been drained since that could lead to > data corruption/loss. > > The caller has to abort whatever it was trying to do since it has no > guarantee that requests have drained. Maybe the remaining requests will > fail and go away, maybe they will complete. We don't know... > >> > >> > The benefit of the timeout is that both monitor and device emulation >> > hangs are tackled. It also doesn't require monitor changes. >> > >> > I'm not sure who chooses the timeout value and which value makes sense >> > (policy vs mechanism separation)... >> >> Default to 30 seconds like Linux, and make it tunable through command line >> options as well as QMP? > > Yes. Either commands that block can take an optional timeout (seconds) > parameter, or we could have a global disk I/O timeout value. I prefer > passing an optional per-command value. > > Libvirt and other sophisticated clients could begin using it in a > backwards-compatible way. Existing code wouldn't use it and still > suffer from the hang problem (but at least QEMU is > backwards-compatible). I'm not arguing against timeouts, I just want to remind everyone that blocking while holding the BQL is a bug, and limiting the block time with timeouts is a work-around, no more, no less.