qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"ehabkost@redhat.com" <ehabkost@redhat.com>,
	"mreitz@redhat.com" <mreitz@redhat.com>,
	"crosa@redhat.com" <crosa@redhat.com>,
	armbru@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] [RFC PATCH] coroutines: generate wrapper code
Date: Thu, 14 Feb 2019 10:14:55 +0800	[thread overview]
Message-ID: <20190214021455.GA21877@stefanha-x1.localdomain> (raw)
In-Reply-To: <20190213100956.GB994@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 7037 bytes --]

On Wed, Feb 13, 2019 at 11:09:56AM +0100, Kevin Wolf wrote:
> Am 13.02.2019 um 07:58 hat Stefan Hajnoczi geschrieben:
> > On Tue, Feb 12, 2019 at 12:58:40PM +0100, Kevin Wolf wrote:
> > > Am 12.02.2019 um 04:22 hat Stefan Hajnoczi geschrieben:
> > > > On Mon, Feb 11, 2019 at 09:38:37AM +0000, Vladimir Sementsov-Ogievskiy wrote:
> > > > > 11.02.2019 6:42, Stefan Hajnoczi wrote:
> > > > > > On Fri, Feb 08, 2019 at 05:11:22PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > > >> Hi all!
> > > > > >>
> > > > > >> We have a very frequent pattern of wrapping a coroutine_fn function
> > > > > >> to be called from non-coroutine context:
> > > > > >>
> > > > > >>    - create structure to pack parameters
> > > > > >>    - create function to call original function taking parameters from
> > > > > >>      struct
> > > > > >>    - create wrapper, which in case of non-coroutine context will
> > > > > >>      create a coroutine, enter it and start poll-loop.
> > > > > >>
> > > > > >> Here is a draft of template code + example how it can be used to drop a
> > > > > >> lot of similar code.
> > > > > >>
> > > > > >> Hope someone like it except me)
> > > > > > 
> > > > > > My 2 cents.  Cons:
> > > > > > 
> > > > > >   * Synchronous poll loops are an anti-pattern.  They block all of QEMU
> > > > > >     with the big mutex held.  Making them easier to write is
> > > > > >     questionable because we should aim to have as few of these as
> > > > > >     possible.
> > > > > 
> > > > > Understand. Do we have a concept or a kind of target for a future to get rid of
> > > > > these a lot of poll-loops? What is the right way? At least for block-layer?
> > > > 
> > > > It's non-trivial.  The nested event loop could be flattened if there was
> > > > a mechanism to stop further activity on a specific object only (e.g.
> > > > BlockDriverState).  That way the event loop can continue processing
> > > > events for other objects and device emulation could continue for other
> > > > objects.
> > > 
> > > The mechanism to stop activity on BlockDriverStates is bdrv_drain(). But
> > > I don't see how this is related. Nested event loops aren't for stopping
> > > concurrent activity (events related to async operations started earlier
> > > are still processed in nested event loops), but for making progress on
> > > the operation we're waiting for. They happen when synchronous code calls
> > > into asynchronous code.
> > > 
> > > The way to get rid of them is making their callers async. I think we
> > > would come a long way if we ran QMP command handlers (at least the block
> > > related ones) and qemu-img operations in coroutines instead of blocking
> > > while we wait for the result.
> > 
> > A difficult caller is device reset, where we need to drain all requests.
> > But even converting some sync code paths to async is a win because it
> > removes places where QEMU can get stuck.
> 
> Yes, as I said, draining a node can hang. And it can hang not because
> there are nested event loops or because of any other bad design
> decision, but because waiting for all requests to complete is required.
> 
> The only thing we could try to improve this is cancelling requests after
> a timeout (or triggered by an OOB QMP command?) during a drain
> operation, but cancelling requests hasn't really been a success story so
> far.

I agree, cancelling block I/O requests seems unworkable in the general
case because it's not supported across all protocols (POSIX files, etc).

> > Regarding block QMP handlers, do you mean suspending the monitor when
> > a command yields?  The monitor will be unresponsive to the outside
> > world, so this doesn't solve the problem from the QMP client's
> > perspective.  This is why async QMP and jobs are interesting but it's a
> > lot of work both inside QEMU and for clients like libvirt.
> 
> Yes, it wouldn't keep the monitor responsive as long as the monitor
> protocol is synchronous. But it would keep the VM running at least, the
> GUI would stay responsive etc.
> 
> Blocking the monitor is again nothing that restructuring the code could
> fix. It requires a change to the QMP protocol, but then it will easily
> fit in the current design.
> 
> Nested event loops are unrelated.

Okay.

> > > > Unfortunately there are interactions between objects like in block jobs
> > > > that act on multiple BDSes, so it becomes even tricky.
> > > > 
> > > > A simple way of imagining this is to make each object an "actor"
> > > > coroutine.  The coroutine processes a single message (request) at a time
> > > > and yields when it needs to wait.  Callers send messages and expect
> > > > asynchronous responses.  This model is bad for efficiency (parallelism
> > > > is necessary) but at least it offers a sane way of thinking about
> > > > multiple asynchronous components coordinating together.  (It's another
> > > > way of saying, let's put everything into coroutines.)
> > > > 
> > > > The advantage of a flat event loop is that a hang in one object (e.g.
> > > > I/O getting stuck in one file) doesn't freeze the entire event loop.
> > > 
> > > I think this one is more theoretical because you'll still have
> > > dependencies between the components. blk_drain_all() isn't hanging
> > > because the code is designed suboptimally, but because its semantics is
> > > to wait until all requests have completed. And it's called because this
> > > semantics is required.
> > 
> > If we try to convert everything to async there will be two cases:
> > 1. Accidental sync code which can be made async.  (Rare nowadays?)
> > 2. Fundamental synchronization points that require waiting.
> > 
> > When you reach a point that hangs there is still the possibility of a
> > timeout or an explicit cancel.  Today QEMU supports neither, so a
> > command that gets stuck will hang QEMU for as long as it takes.
> > 
> > If QMP clients want timeouts or cancel then making everything async is
> > necessary.  If not, then we can leave it as is and simply audit the code
> > for accidental sync code (there used to be a lot of this but it's rarer
> > now) and convert it.
> 
> I'm not sure what makes sync code only "accidentally" sync, but at least
> everything the monitor does is sync.
> 
> With your assessment that we can leave everything as it is if QMP
> clients don't want improvements, you're ignoring that currently, not
> only the QMP monitor becomes unresponsive, but it holds the BQL while
> doing so and brings down the whole VM this way. So even without changing
> the QMP protocol, there is something to be gained by making it async
> (i.e. executing the block layer command handlers in a coroutine).

True.  Releasing the BQL helps when the hang is brief (a few seconds).
I see that as a performance problem - fixing jitter.

Beyond a few seconds a VM that cannot be observed or controlled via the
management tools is a problem and not very useful.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

      reply	other threads:[~2019-02-14  2:15 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-08 14:11 [Qemu-devel] [RFC PATCH] coroutines: generate wrapper code Vladimir Sementsov-Ogievskiy
2019-02-08 14:13 ` Vladimir Sementsov-Ogievskiy
2019-02-11  3:42 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2019-02-11  9:38   ` Vladimir Sementsov-Ogievskiy
2019-02-12  3:22     ` Stefan Hajnoczi
2019-02-12 10:03       ` Vladimir Sementsov-Ogievskiy
2019-02-12 10:55         ` Stefan Hajnoczi
2019-02-12 11:58       ` Kevin Wolf
2019-02-13  6:58         ` Stefan Hajnoczi
2019-02-13 10:09           ` Kevin Wolf
2019-02-14  2:14             ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190214021455.GA21877@stefanha-x1.localdomain \
    --to=stefanha@redhat.com \
    --cc=armbru@redhat.com \
    --cc=crosa@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).