Date: Wed, 24 Jul 2013 09:54:39 +0200
From: Stefan Hajnoczi
Message-ID: <20130724075439.GC31445@stefanha-thinkpad.muc.redhat.com>
References: <26DE76D4FD616A2955DC1732@nimrod.local> <20130723121825.GB20857@stefanha-thinkpad.redhat.com> <2B93060044B2D160D39B27F0@nimrod.local>
In-Reply-To: <2B93060044B2D160D39B27F0@nimrod.local>
Subject: Re: [Qemu-devel] Question on aio_poll
To: Alex Bligh
Cc: qemu-devel@nongnu.org

On Tue, Jul 23, 2013 at 03:46:23PM +0100, Alex Bligh wrote:
> --On 23 July 2013 14:18:25 +0200 Stefan Hajnoczi wrote:
> >Unfortunately there is an issue with the series which I haven't had time
> >to look into yet. I don't remember the details but I think make check
> >is failing.
> >
> >The current qemu.git/master code is doing the "correct" thing though.
> >Callers of aio_poll() are using it to complete any pending I/O requests
> >and process BHs. If there is no work left, we do not want to block
> >indefinitely. Instead we want to return.
>
> If we have no work to do (no FDs) and have a timer, then this should
> wait for the timer to expire (i.e. wait until progress has been
> made). Hence without a timer, it would be peculiar if it returned
> earlier.
>
> I think it should behave like select really, i.e. if you give it
> an infinite timeout (blocking) and no descriptors to work on, it hangs
> for ever. At the very least it should warn, as this is in my opinion
> an error by the caller.
>
> I left this how it was in the end (I think), and got round it by
> creating a bogus pipe for the test to listen to.

Doing that requires the changes in my patch series, otherwise you break
aio_poll() loops that are waiting for pending I/O requests. They don't
want to wait for timers.
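To make that concrete, the kind of loop I mean looks roughly like this
(requests_pending() is a made-up placeholder for whatever tracks the
caller's in-flight requests, not a real QEMU function):

    #include <stdbool.h>
    #include "block/aio.h"

    /* Hypothetical predicate: true while our own requests are in flight. */
    extern bool requests_pending(void);

    /* Complete our pending I/O requests on this AioContext.  The loop
     * relies on aio_poll() blocking only while there is pending I/O; it
     * must not sit there waiting for some unrelated timer to expire. */
    static void drain_requests(AioContext *ctx)
    {
        while (requests_pending()) {
            aio_poll(ctx, true); /* blocking=true: wait for progress */
        }
    }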
> >>Thirdly, I don't quite understand how/why busy is being set. It seems
> >>to be set if the flush callback returns non-zero. That would imply (I
> >>think) the fd handler has something to write. But what if it is just
> >>interested in any data to read that is available (and never writes)? If
> >>this is the only fd aio_poll has, it would appear it never polls.
> >
> >The point of .io_flush() is to select file descriptors that are awaiting
> >I/O (either direction). For example, consider an iSCSI TCP socket with
> >no I/O requests pending. In that case .io_flush() returns 0 and we will
> >not block in aio_poll(). But if there is an iSCSI request pending, then
> >.io_flush() will return 1 and we'll wait for the iSCSI response to be
> >received.
> >
> >The effect of .io_flush() is that aio_poll() will return false if there
> >is no I/O pending.
>
> Right, but take that example. If the tcp socket is idle because it's an
> iSCSI server and it is waiting for an iSCSI request, then io_flush
> returns 0. That will mean busy will not be set, and if it's the only
> FD, g_poll won't be called AT ALL - forget the fact it won't block -
> because it will exit aio_poll a couple of lines before the g_poll. That
> means you'll never actually poll for the incoming iSCSI command.
> Surely that can't be right!
>
> Or are you saying that this type of FD never appears in the aio poll
> set so it is just returning for the main loop to handle them.

That happens because QEMU has two types of fd monitoring. There is
AioContext's aio_poll(), which is designed for asynchronous I/O requests
initiated by QEMU; callers use it to wait for those requests to complete.

QEMU also has the main loop's qemu_set_fd_handler() (iohandler), which is
used for server connections like the one you described. The NBD server
uses it, for example. (There is a rough sketch of this split at the end
of this mail.)

I hope we can eventually unify the event loops, and then the select
function should behave as you described. For now, though, we need to keep
the current behavior at least until my .io_flush() removal series, or
something equivalent, is merged.

> >It turned out that this behavior could be implemented at the block layer
> >instead of using the .io_flush() interface at the AioContext layer. The
> >patch series I linked to above modifies the code so AioContext can
> >eliminate the .io_flush() concept.
>
> I've just had a quick read of that.
>
> I think the key one is:
> http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg00099.html
>
> I note you've eliminated 'busy' - hurrah.
>
> I note you now have:
> if (ctx->pollfds->len == 1) {
>     return progress;
> }
>
> Is the '1' there the event notifier? How do we know there is only
> one of them?

There may be many EventNotifier instances. That's not what matters,
though. What matters is the aio_notify() EventNotifier.

Each AioContext has its own EventNotifier, which can be signalled with
aio_notify(). The purpose of this function is to kick an event loop that
is blocking in select()/poll(). This is necessary when another thread
modifies something that the AioContext needs to act upon, such as adding
or removing an fd.
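Roughly, the interaction looks like this (event_loop_thread() and
change_fd_handlers() are invented names for illustration; aio_poll() and
aio_notify() are the real functions):

    #include "block/aio.h"

    /* Hypothetical helper: whatever change the loop must notice, e.g.
     * adding or removing an fd handler on this AioContext. */
    extern void change_fd_handlers(AioContext *ctx);

    /* Event loop thread: may block in select()/poll() inside aio_poll(). */
    static void *event_loop_thread(void *opaque)
    {
        AioContext *ctx = opaque;

        for (;;) {
            aio_poll(ctx, true);
        }
        return NULL;
    }

    /* Another thread: make the change, then kick the blocked event loop
     * with aio_notify() so it wakes up and acts on it. */
    static void update_from_another_thread(AioContext *ctx)
    {
        change_fd_handlers(ctx);
        aio_notify(ctx); /* signals the AioContext's own EventNotifier */
    }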
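And going back to the two kinds of fd monitoring above, this is the rough
shape of the iohandler side (accept_client(), listen_fd and server_state
are invented names; the AioContext side is just the aio_poll() loop
sketched earlier in this mail):

    #include "qemu/main-loop.h"

    /* Hypothetical main-loop callback: the listening socket is readable,
     * i.e. a new client is connecting (NBD-style server connection). */
    static void accept_client(void *opaque)
    {
        /* accept(2) the connection and install per-client handlers here */
    }

    static void register_server(int listen_fd, void *server_state)
    {
        /* Watched by the main loop's iohandler mechanism, not by aio_poll() */
        qemu_set_fd_handler(listen_fd, accept_client, NULL, server_state);
    }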