qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: "Daniel P. Berrange" <berrange@redhat.com>,
	Laurent Vivier <lvivier@redhat.com>, Fam Zheng <famz@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	mdroth@linux.vnet.ibm.com, Peter Xu <peterx@redhat.com>,
	qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	John Snow <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread
Date: Thu, 7 Sep 2017 19:09:00 +0100	[thread overview]
Message-ID: <20170907180900.GV2098@work-vm> (raw)
In-Reply-To: <87h8weo186.fsf@dusky.pond.sub.org>

* Markus Armbruster (armbru@redhat.com) wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> 
> > On Thu, Sep 07, 2017 at 02:59:28PM +0200, Markus Armbruster wrote:
> >> So, what exactly is going to drain the command queue?  If there's more
> >> than one consumer, how exactly are commands from the queue dispatched to
> >> the consumers?
> >
> > In terms of my proposal, for any single command there should only ever
> > be a single consumer. The default consumer would be the main event loop
> > thread, such that we have no semantic change to QMP operation from today.
> >
> > Some commands that are capable of being made "async", would have a
> > different consumer. For example, if the client requested the 'migrate-cancel'
> > be made async, this would change things such that the migration thread is
> > now responsible for consuming the "migrate-cancel" command, instead of the
> > default main loop.
> >
> >> What are the "no hang" guarantees (if any) and conditions for each of
> >> these consumers?
> >
> > The non-main thread consumers would have to have some reasonable
> > guarantee that they won't block on a lock held by the main loop,
> > otherwise the whole feature is largely useless.
> 
> Same if they block indefinitely on anything else, actually.  In other
> words, we need to talk about liveness.
> 
> Threads by themselves don't buy us liveness.  Being careful with
> operations that may block does.  That care may lead to farming out
> certain operations to other threads, where they may block without harm.
> 
> You only talk about "the non-main thread consumers".  What about the
> main thread?  Is it okay for the main thread to block?  If yes, why?

It would be great if the main thread never blocked; but IMHO that's
a huge task that we'll never get done [challenge].

> >> We can have any number of QMP monitors today.  Would each of them feed
> >> its own queue?  Would they all feed a shared queue?
> >
> > Currently with multiple QMP monitors, everything runs in the main
> > loop, so commands arriving across  multiple monitors are 100%
> > serialized and processed strictly in the order in which QEMU reads
> > them off the wire.  To maintain these semantics, we would need to
> > have a single shared queue for the default main loop consumer, so
> > that ordering does not change.
> >
> >> How exactly is opt-in asynchronous to work?  Per QMP monitor?  Per
> >> command?
> >
> > Per monitor+command. ie just because libvirt knows how to cope with
> > async execution on the monitor it has open, does not mean that a
> > different app on the 2nd monitor command can cope. So in my proposal
> > the switch to async must be scoped to the particular command only
> > for the monitor connection that requesteed it.
> >
> >> What does it mean when an asynchronous command follows a synchronous
> >> command in the same QMP monitor?  I would expect the synchronous command
> >> to complete before the asynchronous command, because that's what
> >> synchronous means, isn't it?  To keep your QMP monitor available, you
> >> then must not send synchronous commands that can hang.
> >
> > No, that is not what I described. All synchronous commands are
> > serialized wrt each other, just as today. An asychronous command
> > can run as soon as it is received, regardless of whether any
> > earlier sent sync commands are still executing or pending. This
> > is trivial to achieve when you separate monitor I/O from command
> > execution in separate threads, provided of course the async
> > command consumers are not in the main loop.
> 
> So, a synchronous command is synchronous with respect to other commands,
> except for certain non-blocking commands.  The distinctive feature of
> the latter isn't so much an asynchronous reply, but out-of-band
> dispatch.
> 
> Out-of-band dispatch of commands that cannot block in fact orthogonal to
> asynchronous replies.  I can't see why out-of-band dispatch of
> synchronous non-blocking commands wouldn't work, too.
> 
> >> How can we determine whether a certain synchronous command can hang?
> >> Note that with opt-in async, *all* commands are also synchronous
> >> commands.
> >> 
> >> In short, explain to me how exactly you plan to ensure that certain QMP
> >> commands (such as post-copy recovery) can always "get through", in the
> >> presence of multiple monitors, hanging main loop, hanging synchronous
> >> commands, hanging whatever-else-can-now-hang-in-this-post-copy-world.
> >
> > Taking migrate-cancel as the example. The migration code already has
> > a background thread doing work independantly onthe main loop. Upon
> > marking the migrate-cancel command as async, the migration control
> > thread would become the consumer of migrate-cancel.
> 
> From 30,000 feet, the QMP monitor sends a "cancel" message to the
> migration thread, and later receives a "canceled" message from the
> migration thread.
> 
> From 300 feet, we use the migrate-cancel QMP command as the cancel
> message, and its success response as the "canceled" message.
> 
> In other words, we're pressing the external QM-Protocol into service as
> internal message passing protocol.

Be careful; it's not a cancel in the postcopy recovery case, it's a
restart.  The command is very much like the migration-incoming command.
The management layer has to provide data with the request, so it's not
an internal command.

> >                                                     This allows the
> > migration operation to be cancelled immediately, regardless of whether
> > there are earlier monitor commands blocked in the main loop.
> 
> The necessary part is moving all operations that can block out of
> whatever loop runs the monitor, be it the main loop, some other event
> loop, or a dedicated monitor thread's monitor loop.
> 
> Moving out non-blocking operations isn't necessary.  migrate-cancel
> could communicate with the migration thread by any suitable mechanism or
> protocol.  It doesn't have to be QMP.  Why would we want it to be QMP?

Because why invent another wheel?
This is a command that the management layer has to issue to qemu for
it to recover, including passing data, in a way similar to other
commands - so it looks like a QMP command, so why not use QMP.

Also, I think making other commands lock-free is advantageous - 
some of the 'info' commands just dont really need locks, making them
not use locks removes latency effects caused by the management layer
prodding qemu.

> > Of course this assumes the migration control thread can't block
> > for locks held by the main thread.
> 
> Thanks for your answers, they help.
> 
> >> Now let's talk about QMP requirements.
> >> 
> >> Any addition to QMP must consider what exists already.
> >> 
> >> You may add more of the same.
> >> 
> >> You may generalize existing stuff.
> >> 
> >> You may change existing stuff if you have sufficient reason, subject to
> >> backward compatibility constraints.
> >> 
> >> But attempts to add new ways to do the same old stuff without properly
> >> integrating the existing ways are not going to fly.
> >> 
> >> In particular, any new way to start some job, monitor and control it
> >> while it lives, get notified about its state changes and so forth must
> >> integrate the existing ways.  These include block jobs (probably the
> >> most sophisticated of the lot), migration, dump-guest-memory, and
> >> possibly more.  They all work the same way: synchronous command to kick
> >> off the job, more synchronous commands to monitor and control, events to
> >> notify.  They do differ in detail.
> >> 
> >> Asynchronous commands are a new way to do this.  When you only need to
> >> be notified on "done", and don't need to monitor / control, they fit the
> >> bill quite neatly.
> >> 
> >> However, we can't just ignore the cases where we need more than that!
> >> For those, we want a single generic solution instead of the several ad
> >> hoc solutions we have now.
> >> 
> >> If we add asynchronous commands *now*, and for simple cases only, we add
> >> yet another special case for a future generic solution to integrate.
> >> I'm not going to let that happen.
> >
> > With the async commands suggestion, while it would initially not
> > provide a way to query incremental status, that could easily be
> > fitted in.
> 
> This is [*] below.
> 
> >             Because command replies from async commands may be
> > out-of-order wrt the original requests, clients would need to
> > provide a unique ID for each command run. This originally was
> > part of QMP spec but then dropped, but libvirt still actually
> > generates a uniqe ID for every QMP command.
> >
> > Given this, one option is to actually use the QMP command ID as
> > a job ID, and let you query ongoing status via some new QMP
> > command that accepts the ID of the job to be queried. A complexity
> > with this is how to make the jobs visible across multiple QMP
> > monitors. The job ID might actually have to be a combination of
> > the serial ID from the QMP command, and the ID of the monitor
> > chardev combined.
> 
> Yes.  The job ID must be unique across all QMP monitors to make
> broadcast notifications work.
> 
> >> I figure the closest to a generic solution we have is block jobs.
> >> Perhaps a generic solution could be had by abstracting away the "block"
> >> from "block jobs", leaving just "jobs".
> 
> [*] starts here:
> 
> >> Another approach is generalizing the asynchronous command proposal to
> >> fully cover the not-so-simple cases.
> 
> We know asynchronous commands "fully cover" when we can use them to
> replace all the existing job-like commands.
> 
> Until then, they enlarge rather than solve our jobs problem.
> 
> I get the need for an available monitor.  But I need to balance it with
> other needs.  Can we find a solution for our monitor availability
> problem that doesn't enlarge our jobs problem?

Hopefully!

Dave

> >> If you'd rather want to make progress on monitor availability without
> >> cracking the "jobs" problem, you're in luck!  Use your license to "add
> >> more of the same": synchronous command to start a job, query to monitor,
> >> event to notify.  
> >> 
> >> If you insist on tying your monitor availability solution to
> >> asynchronous commands, then I'm in luck!  I just found volunteers to
> >> solve the "jobs" problem for me.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2017-09-07 18:09 UTC|newest]

Thread overview: 104+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-23  6:51 [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread Peter Xu
2017-08-23  6:51 ` [Qemu-devel] [RFC v2 1/8] monitor: move skip_flush into monitor_data_init Peter Xu
2017-08-23 16:31   ` Dr. David Alan Gilbert
2017-08-23  6:51 ` [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to poll Peter Xu
2017-08-23 17:35   ` Dr. David Alan Gilbert
2017-08-25  4:25     ` Peter Xu
2017-08-25  9:30       ` Dr. David Alan Gilbert
2017-08-28  5:53         ` Peter Xu
2017-09-08 17:29           ` Dr. David Alan Gilbert
2017-08-25 15:27   ` Marc-André Lureau
2017-08-25 15:33     ` Dr. David Alan Gilbert
2017-08-25 16:07       ` Marc-André Lureau
2017-08-25 16:12         ` Dr. David Alan Gilbert
2017-08-25 16:21           ` Marc-André Lureau
2017-08-25 16:29             ` Dr. David Alan Gilbert
2017-08-26  8:33               ` Marc-André Lureau
2017-08-28  3:05         ` Peter Xu
2017-08-28 10:11           ` Marc-André Lureau
2017-08-28 12:48             ` Peter Xu
2017-09-05 18:58               ` Dr. David Alan Gilbert
2017-08-28 11:08         ` Markus Armbruster
2017-08-28 12:28           ` Marc-André Lureau
2017-08-28 16:24             ` Markus Armbruster
2017-08-28 17:24               ` Marc-André Lureau
2017-08-29  6:27                 ` Markus Armbruster
2017-08-23  6:51 ` [Qemu-devel] [RFC v2 3/8] char-io: fix possible risk on IOWatchPoll Peter Xu
2017-08-25 14:44   ` Marc-André Lureau
2017-08-26  7:19   ` Fam Zheng
2017-08-28  5:56     ` Peter Xu
2017-08-23  6:51 ` [Qemu-devel] [RFC v2 4/8] QAPI: new QMP command option "without-bql" Peter Xu
2017-08-23 17:44   ` Dr. David Alan Gilbert
2017-08-23 23:37     ` Fam Zheng
2017-08-25  5:37       ` Peter Xu
2017-08-25  9:14         ` Dr. David Alan Gilbert
2017-08-28  8:08           ` Peter Xu
2017-09-08 17:38             ` Dr. David Alan Gilbert
2017-08-25  5:35     ` Peter Xu
2017-08-25  9:06       ` Dr. David Alan Gilbert
2017-08-28  8:26         ` Peter Xu
2017-09-08 17:52           ` Dr. David Alan Gilbert
2017-08-23  6:51 ` [Qemu-devel] [RFC v2 5/8] hmp: support "without_bql" Peter Xu
2017-08-23 17:46   ` Dr. David Alan Gilbert
2017-08-25  5:44     ` Peter Xu
2017-08-23  6:51 ` [Qemu-devel] [RFC v2 6/8] migration: qmp: migrate_incoming don't need BQL Peter Xu
2017-08-23  6:51 ` [Qemu-devel] [RFC v2 7/8] migration: hmp: " Peter Xu
2017-08-23  6:51 ` [Qemu-devel] [RFC v2 8/8] migration: add incoming mgmt lock Peter Xu
2017-08-23 18:01   ` Dr. David Alan Gilbert
2017-08-25  5:49     ` Peter Xu
2017-08-25  9:34       ` Dr. David Alan Gilbert
2017-08-28  8:39         ` Peter Xu
2017-08-29 11:03 ` [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread Daniel P. Berrange
2017-08-30  7:06   ` Markus Armbruster
2017-08-30 10:13     ` Daniel P. Berrange
2017-08-31  3:31       ` Peter Xu
2017-08-31  9:14         ` Daniel P. Berrange
2017-09-06  9:48   ` Dr. David Alan Gilbert
2017-09-06 10:46     ` Daniel P. Berrange
2017-09-06 10:48       ` Dr. David Alan Gilbert
2017-09-06 10:54         ` Daniel P. Berrange
2017-09-06 10:57           ` Dr. David Alan Gilbert
2017-09-06 11:06             ` Daniel P. Berrange
2017-09-06 11:31               ` Dr. David Alan Gilbert
2017-09-06 11:54                 ` Daniel P. Berrange
2017-09-07  8:13                   ` Peter Xu
2017-09-07  8:49                     ` Stefan Hajnoczi
2017-09-07  9:18                       ` Dr. David Alan Gilbert
2017-09-07 10:19                         ` Stefan Hajnoczi
2017-09-07 10:24                         ` Peter Xu
2017-09-07  8:55                     ` Daniel P. Berrange
2017-09-07  9:19                       ` Dr. David Alan Gilbert
2017-09-07  9:22                         ` Daniel P. Berrange
2017-09-07  9:27                           ` Dr. David Alan Gilbert
2017-09-07 11:19                         ` Markus Armbruster
2017-09-07 11:31                           ` Dr. David Alan Gilbert
2017-09-07  9:15                     ` Dr. David Alan Gilbert
2017-09-07  9:25                       ` Daniel P. Berrange
2017-09-07 12:59                     ` Markus Armbruster
2017-09-07 13:22                       ` Daniel P. Berrange
2017-09-07 17:41                         ` Markus Armbruster
2017-09-07 18:09                           ` Dr. David Alan Gilbert [this message]
2017-09-08  8:41                             ` Markus Armbruster
2017-09-08  9:32                               ` Dr. David Alan Gilbert
2017-09-08 11:49                                 ` Markus Armbruster
2017-09-08 13:19                                   ` Stefan Hajnoczi
2017-09-11 10:32                                   ` Peter Xu
2017-09-11 10:36                                     ` Peter Xu
2017-09-11 10:43                                   ` Daniel P. Berrange
2017-09-08  9:27                           ` Daniel P. Berrange
2017-09-07 14:20                       ` Dr. David Alan Gilbert
2017-09-07 17:41                         ` Markus Armbruster
2017-09-07 18:04                           ` Dr. David Alan Gilbert
2017-09-07 10:04                   ` Dr. David Alan Gilbert
2017-09-07 10:08                     ` Daniel P. Berrange
2017-09-07 13:59                 ` Eric Blake
2017-09-06 14:50 ` Stefan Hajnoczi
2017-09-06 15:14   ` Dr. David Alan Gilbert
2017-09-07  7:38     ` Peter Xu
2017-09-07  8:58     ` Stefan Hajnoczi
2017-09-07  9:35       ` Dr. David Alan Gilbert
2017-09-07 10:09         ` Stefan Hajnoczi
2017-09-07 12:02           ` Peter Xu
2017-09-07 16:53             ` Stefan Hajnoczi
2017-09-07 17:14               ` Dr. David Alan Gilbert
2017-09-07 17:35                 ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170907180900.GV2098@work-vm \
    --to=dgilbert@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=famz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).