qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Alex Bligh <alex@alex.org.uk>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: qemu-devel@nongnu.org, Alex Bligh <alex@alex.org.uk>
Subject: Re: [Qemu-devel] Question on aio_poll
Date: Tue, 23 Jul 2013 15:46:23 +0100	[thread overview]
Message-ID: <2B93060044B2D160D39B27F0@nimrod.local> (raw)
In-Reply-To: <20130723121825.GB20857@stefanha-thinkpad.redhat.com>

Stefan,

--On 23 July 2013 14:18:25 +0200 Stefan Hajnoczi <stefanha@gmail.com> wrote:

>> Secondly, the tests I am writing for the timer fail because g_poll is
>> not called, because busy is false. This is because I haven't got an
>> fd set up. But in the instance where aio_poll is called with
>> blocking=true and there are no fd's to wait on, surely aio_poll should
>> either hang indefinitely or (perhaps better) assert, rather than return
>> immediately which is a recipe for an unexpected busywait? If I have
>> timers, it should be running those timers rather than returning.
>
> FWIW this goes away in my series that gets rid of .io_flush():
>
> http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg00092.html

Oooh another overlapping patch series into the mix :-)

> Unfortunately there is an issue with the series which I haven't had time
> to look into yet.  I don't remember the details but I think make check
> is failing.
>
> The current qemu.git/master code is doing the "correct" thing though.
> Callers of aio_poll() are using it to complete any pending I/O requests
> and process BHs.  If there is no work left, we do not want to block
> indefinitely.  Instead we want to return.

If we have no work to do (no FDs) and have a timer, then this should
wait for the timer to expire (i.e. wait until progress has been
made). Hence without a timer, it would be peculiar if it returned
earlier.

I think it should behave like select really, i.e. if you give it
an infinite timeout (blocking) and no descriptors to work on, it hangs
for ever. At the very least it should warn, as this is in my opinion
an error by the caller.

I left this how it was in the end (I think), and got round it by
creating a bogus pipe for the test to listen to.

>> Thirdly, I don't quite understand how/why busy is being set. It seems
>> to be set if the flush callback returns non-zero. That would imply (I
>> think) the fd handler has something to write. But what if it is just
>> interested in any data to read that is available (and never writes)? If
>> this is the only fd aio_poll has, it would appear it never polls.
>
> The point of .io_flush() is to select file descriptors that are awaiting
> I/O (either direction).  For example, consider an iSCSI TCP socket with
> no I/O requests pending.  In that case .io_flush() returns 0 and we will
> not block in aio_poll().  But if there is an iSCSI request pending, then
> .io_flush() will return 1 and we'll wait for the iSCSI response to be
> received.
>
> The effect of .io_flush() is that aio_poll() will return false if there
> is no I/O pending.

Right, but take that example. If the tcp socket is idle because it's an
iSCSI server and it is waiting for an iSCSI request, then io_flush
returns 0. That will mean busy will not be set, and if it's the only
FD, g_poll won't be called AT ALL - forget the fact it won't block -
because it will exit aio_poll a couple of lines before the g_poll. That
means you'll never actually poll for the incoming iSCSI command.
Surely that can't be right!

Or are you saying that this type of FD never appears in the aio poll
set so it is just returning for the main loop to handle them.

> It turned out that this behavior could be implemented at the block layer
> instead of using the .io_flush() interface at the AioContext layer.  The
> patch series I linked to above modifies the code so AioContext can
> eliminate the .io_flush() concept.

I've just had a quick read of that.

I think the key one is:
  http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg00099.html

I note you've eliminated 'busy' - hurrah.

I note you now have:
     if (ctx->pollfds->len == 1) {
         return progress;
     }

Is the '1' there the event notifier? How do we know there is only
one of them?

>> When adapting this for timers, I think it should be returning true only
>> if a an AIO dispatch did something, or a BH was executed, or a timer ran.
>> Specifically if the poll simply times out, it should not be returning
>> true unless a timer ran. Correct?
>
> Yes, the return value is about making progress.  If there is still work
> pending then it must return true.  If there is no work pending it must
> return false.

Yup, you made the same fix as me at the end of aio_poll in my PATCHv2
RFC series.

On a related point, the g_source appears very fragile in respect of
false wakeups. I would not be confident that it would not busy-loop.
See the comments in the last of the patches in my series.

-- 
Alex Bligh

  reply	other threads:[~2013-07-23 14:46 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-20 13:14 [Qemu-devel] Question on aio_poll Alex Bligh
2013-07-23 12:18 ` Stefan Hajnoczi
2013-07-23 14:46   ` Alex Bligh [this message]
2013-07-24  7:54     ` Stefan Hajnoczi
2013-07-24  8:05       ` Alex Bligh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2B93060044B2D160D39B27F0@nimrod.local \
    --to=alex@alex.org.uk \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).