All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, Karl Rister <krister@redhat.com>,
	Fam Zheng <famz@redhat.com>
Subject: Re: [Qemu-devel] [RFC 1/3] aio-posix: add aio_set_poll_handler()
Date: Thu, 10 Nov 2016 08:20:18 -0500 (EST)	[thread overview]
Message-ID: <1494168529.11932093.1478784018094.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <20161110101735.GB17332@stefanha-x1.localdomain>

On Thursday, November 10, 2016 11:17:35 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > I think the question is not "is there any polling to be done" but rather
> > "is there anything that requires looking at a file descriptor".  If you
> > have e.g. an NBD device on the AioContext you cannot poll.  On the other
> > hand if all you have is bottom halves (which you can poll with
> > ctx->notified), AIO and virtio ioeventfds, you can poll.
> 
> This is a good point.  Polling should only be done if all resources in
> the AioContext benefit from polling - otherwise it adds latency to
> resources that don't support polling.
> 
> Another thing: only poll if there is work to be done.  Linux AIO must
> only poll the ring when there are >0 requests outstanding.  Currently it
> always polls (doh!).

Good idea.  So the result of polling callback could be one of ready, not
ready and not active?  Or did you have in mind something else?

> > In particular, testing for bottom halves is necessary to avoid incurring
> > extra latency on flushes, which use the thread pool.
> 
> The current code uses a half-solution: it uses aio_compute_timeout() to
> see if any existing BHs are ready to execute *before* beginning to poll.
> 
> Really we should poll BHs since they can be scheduled during the polling
> loop.

We should do so for correctness (hopefully with just ctx->notified: there
should be no need to walk the BH list during polling).  However, the user
of the BH should activate polling "manually" by registering its own
polling handler: if there are no active polling handlers, just ignore
bottom halves and do the poll().

This is because there are always a handful of registered bottom halves, but
they are not necessarily "activatable" from other threads.  For example the
thread pool always has one BH but as you noticed for Linux AIO, it may not
have any pending requests.  So the thread pool would still have to register
with aio_set_poll_handler, even if it uses bottom halves internally for
the signaling.  I guess it would not need to register an associated IOHandler,
since it can just use aio_bh_poll.

A couple more random observations:

- you can pass the output of aio_compute_timeout(ctx) to run_poll_handlers,
like MIN((uint64_t)aio_compute_timeout(ctx), (uint64_t)aio_poll_max_ns).

- since we know that all resources are pollable, we don't need to poll() at
all if polling succeeds (though we do need aio_notify_accept()+aio_bh_poll()).

> > Perhaps the poll handler could be a parameter to aio_set_event_notifier?
> >  run_poll_handlers can just set revents (to G_IO_IN for example) if the
> > polling handler returns true, and return true as well.  aio_poll can
> > then call aio_notify_accept and aio_dispatch, bypassing the poll system
> > call altogether.
> 
> This is problematic.  The poll source != file descriptor so there is a
> race condition:
> 
> 1. Guest increments virtqueue avail.idx
> 2. QEMU poll notices avail.idx update and marks fd.revents readable.
> 3. QEMU dispatches fd handler:
> 4. Guest kicks virtqueue -> ioeventfd is signalled
> 
> Unfortunately polling is "too fast" and event_notifier_test_and_clear()
> returns false; we won't process the virtqueue!
> 
> Pretending that polling is the same as fd monitoring only works when #4
> happens before #3.  We have to solve this race condition.
> 
> The simplest solution is to get rid of the if statement (i.e. enable
> spurious event processing).  Not sure if that has a drawback though.
> Do you have a nicer solution in mind?

No, I don't.  Removing the if seems sensible, but I like the polling handler
more now that I know why it's there.  The event_notifier_test_and_clear does
add a small latency.

On one hand, because you need to check if *all* "resources"
support polling, you need a common definition of "resource" (e.g.
aio_set_fd_handler).  But on the other hand it would be nice to have
a streamlined polling callback.  I guess you could add something like
aio_set_handler that takes a struct with all interesting callbacks:

- in/out callbacks (for aio_set_fd_handler)
- polling handler
- polling callback

Then there would be simplified interfaces on top, such as aio_set_fd_handler,
aio_set_event_notifier and your own aio_set_poll_handler.

Paolo

  reply	other threads:[~2016-11-10 13:20 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-09 17:13 [Qemu-devel] [RFC 0/3] aio: experimental virtio-blk polling mode Stefan Hajnoczi
2016-11-09 17:13 ` [Qemu-devel] [RFC 1/3] aio-posix: add aio_set_poll_handler() Stefan Hajnoczi
2016-11-09 17:30   ` Paolo Bonzini
2016-11-10 10:17     ` Stefan Hajnoczi
2016-11-10 13:20       ` Paolo Bonzini [this message]
2016-11-15 20:14   ` Fam Zheng
2016-11-09 17:13 ` [Qemu-devel] [RFC 2/3] virtio: poll virtqueues for new buffers Stefan Hajnoczi
2016-11-09 17:13 ` [Qemu-devel] [RFC 3/3] linux-aio: poll ring for completions Stefan Hajnoczi
2016-11-11 19:59 ` [Qemu-devel] [RFC 0/3] aio: experimental virtio-blk polling mode Karl Rister
2016-11-14 13:53   ` Fam Zheng
2016-11-14 14:52     ` Karl Rister
2016-11-14 16:56       ` Stefan Hajnoczi
2016-11-14 15:26   ` Stefan Hajnoczi
2016-11-14 15:29     ` Paolo Bonzini
2016-11-14 17:06       ` Stefan Hajnoczi
2016-11-14 17:13         ` Fam Zheng
2016-11-14 17:15         ` Paolo Bonzini
2016-11-15 10:36           ` Stefan Hajnoczi
2016-11-16  8:27       ` Fam Zheng
2016-11-14 15:36     ` Karl Rister
2016-11-14 20:12     ` Karl Rister
2016-11-14 20:52       ` Paolo Bonzini
2016-11-15 10:32         ` Stefan Hajnoczi
2016-11-15 18:45           ` Karl Rister
2016-11-13  6:20 ` no-reply
2016-11-14 14:51 ` Christian Borntraeger
2016-11-14 16:53   ` Stefan Hajnoczi
2016-11-14 14:59 ` Christian Borntraeger
2016-11-14 16:52   ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1494168529.11932093.1478784018094.JavaMail.zimbra@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=famz@redhat.com \
    --cc=krister@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.