From: Kevin Wolf <kwolf@redhat.com>
To: "Marc-André Lureau" <marcandre.lureau@gmail.com>
Cc: Markus Armbruster <armbru@redhat.com>,
Jeff Cody <jcody@redhat.com>,
qemu-devel@nongnu.org, kraxel@redhat.com,
Stefan Hajnoczi <stefanha@redhat.com>,
John Snow <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 00/25] qmp: add async command type
Date: Fri, 28 Apr 2017 21:13:21 +0200
Message-ID: <20170428191321.GF4714@noname.redhat.com>
In-Reply-To: <CAJ+F1CKYUbMo-3TrkNfc+H3tzLUaN00c6sGTxdnPn5qapfA_bA@mail.gmail.com>

On 28.04.2017 at 17:55, Marc-André Lureau wrote:
> On Tue, Apr 25, 2017 at 2:23 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> > On 24.04.2017 at 21:10, Markus Armbruster wrote:
> > > With 2.9 out of the way, how can we make progress on this one?
> > >
> > > I can see two ways to get asynchronous QMP commands accepted:
> > >
> > > 1. We break QMP compatibility in QEMU 3.0 and convert all long-running
> > > tasks from "synchronous command + event" to "asynchronous command".
> > >
> > > This is design option 1 quoted below. *If* we decide to leave
> > > compatibility behind for 3.0, *and* we decide we like the
> > > asynchronous sufficiently better to put in the work, we can do it.
> > >
> > > I guess there's nothing to do here until we decide on breaking
> > > compatibility in 3.0.
> > >
> > > 2. We don't break QMP compatibility, but we add asynchronous commands
> > > anyway, because we decide that's how we want to do "jobs".
> > >
> > > This is design option 3 quoted below. As I said, I dislike its lack
> > > of orthogonality. But if asynchronous commands help us get jobs
> > > done, I can bury my dislike.
> >
> > I don't think async commands are attractive at all for doing jobs. I
>
> It's still a bit obscure to me what we mean by "jobs".
I guess the best definition that we have is: Image streaming, mirroring,
live backup, live commit and future "similar things".
> > feel they bring up more questions than they answer, for example, what
> > happens if libvirt crashes and then reconnects? Which monitor connection
> > does get the reply for an async command sent on the now disconnected
> > one?
> >
>
> The monitor to receive a reply is the one that sent the command (just
> like return today)
>
> As explained in the cover letter, an async command may cancel the
> ongoing operation on disconnect.
But that's not what you generally want. You don't want to abort your
backup just because libvirt lost its monitor connection, but qemu should
continue to copy the data, and when libvirt reconnects it should be able
to get back control of this background operation and bring it to
successful completion.
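
To make that requirement concrete, here is a minimal Python sketch of the behaviour I mean. It is a toy model, not qemu code: ToyJobServer, its methods and the connection names are all made up for illustration. The only point is that the job table lives outside any single monitor connection, so a disconnect leaves it untouched.

```python
class ToyJobServer:
    """Toy model: background jobs are global state, not tied to a connection."""

    def __init__(self):
        self.jobs = {}            # job id -> status; survives disconnects
        self.connections = set()  # currently connected monitors

    def connect(self, conn_id):
        self.connections.add(conn_id)

    def disconnect(self, conn_id):
        # Dropping a connection deliberately does not touch self.jobs.
        self.connections.discard(conn_id)

    def _check(self, conn_id):
        if conn_id not in self.connections:
            raise RuntimeError("monitor %r is not connected" % conn_id)

    def start_job(self, conn_id, job_id):
        self._check(conn_id)
        self.jobs[job_id] = "running"
        # The reply only says the job was started successfully.
        return {"return": {}}

    def query_jobs(self, conn_id):
        self._check(conn_id)
        return dict(self.jobs)


server = ToyJobServer()
server.connect("libvirt-1")
server.start_job("libvirt-1", "backup0")
server.disconnect("libvirt-1")   # management tool loses its connection
server.connect("libvirt-2")      # ... and reconnects later
jobs_after_reconnect = server.query_jobs("libvirt-2")
```

In this model the new connection still sees "backup0" as running and can go on managing it, which is what a command that cancels on disconnect cannot offer.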
> If there is a global state change, a separate event should be
> broadcasted (no change proposed here)
In a way, the existence of a block job is global state today. Not sure
if this is what you mean, though.
> > We already have a model for doing long-running jobs, and as far as I'm
> > aware, it's working and we're not fighting limitations of the design. So
> > what are we even trying to solve here? In the context of jobs, async
> > commands feel like a solution in need of a problem to me.
>
> See the cover letter for the 2 main reasons for this proposal. If your
> domain API is fine, you don't have to opt-in and you may continue to use
> the current sync model. However, I believe there is benefit in using this
> work to have a more consistent async API.
I think we need a clear understanding of what the potential use cases
are that could make good use of a new infrastructure. We don't generally
add infrastructure if we don't have a concrete idea what its users could
be. I only ruled out that the current users of block jobs are a good fit
for it, but there may be other use cases for which it works great.

If commands can opt-in or opt-out of the new model, consistency isn't a
particularly good argument, though.
> > Things may look a bit different in typically quick, but potentially
> > long-running commands. That is, anything that we currently execute
> > synchronously while holding the BQL, but that involves I/O and could
> > therefore take a while (impacting the performance of the VM) or even
> > block indefinitely.
> >
> > The first problem (we're holding the lock too long) can be addressed
> > by making things async just inside qemu and we don't need to expose
> > the change on the QMP level. The second one (blocking indefinitely)
> > requires
> >
>
> That's what I propose as 1)
>
>
> > being async on the QMP level if we want the monitor to be responsive
> > even if we're using an image on an NFS server that went down.
> >
>
> That's the 2)
>
> > On the other hand, using the traditional job infrastructure is way
> > over the top if all you want to do is 'query-block', so we need
> > something different for making it async. And if a client
> > disconnects, the 'query-block' result can just be thrown away, it's
> > much simpler than actual jobs.
>
> I agree a fully-featured job infrastructure is way over the top, and I
> believe I propose a minimal change to optionally make some QMP
> commands async.
So are commands like 'query-block' (which are typically _not_ considered
long-running) what you're aiming for with your proposal? This is a case
where I think we could consider the use of async QMP commands, but I
didn't have the impression that this kind of command was your primary
target.
> > So where I can see advantages for a new async command type is not for
> > converting real long-running commands like block jobs, but only for the
> > typically, but not necessarily quick operations. At the same time it is
> > where you're rightfully afraid that the less common case might not
> > receive much testing in management tools.
> >
>
> I believe management tools / libvirt will want to use the async variant if
> available. (the sync version is a one-command at a time constrained version
> of 'async')
The point here is rather that even async commands degenerate into sync
commands if the management tool doesn't send multiple commands in
parallel.

If sending only a single command at a time is the common case (which
appears quite plausible to me), then race conditions that only exist in
the rarer case of multiple commands in flight might go unnoticed because
nobody gave the scenario real testing.
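
As an illustration of what a client has to get right once it really has several commands in flight, here is a small Python sketch. It is not taken from libvirt or any real QMP library; match_replies() and the message contents are hypothetical. It pairs replies with requests by "id" and keeps events apart, which is exactly the code path that never gets exercised when only one command is outstanding at a time.

```python
def match_replies(pending_ids, incoming):
    """Pair each reply with its request by "id"; collect events separately.

    Returns (replies, events, unanswered), where replies maps id -> message.
    """
    pending = set(pending_ids)
    replies = {}
    events = []
    for msg in incoming:
        if "event" in msg:
            # Events can arrive between any two replies.
            events.append(msg["event"])
        elif msg.get("id") in pending:
            pending.remove(msg["id"])
            replies[msg["id"]] = msg
        else:
            raise ValueError("reply with unknown or duplicate id: %r" % (msg,))
    return replies, events, pending


# Two commands in flight; the second one happens to finish first.
incoming = [
    {"return": {"done": True}, "id": 2},
    {"event": "FOO_DONE"},
    {"return": {}, "id": 1},
]
replies, events, unanswered = match_replies([1, 2], incoming)
```

With a single pending id the loop degenerates to "wait for the next non-event message", so ordering bugs in this kind of matching would only show up in the less common parallel case.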
> > In the end, I'm unsure whether async commands are a good idea, I can
> > see good arguments for both stances. But I'm almost certain that
> > they are the wrong tool for jobs.
> >
> >
> Well, we already have 'async' commands, they are just hidden. They do
> not use QAPI/QMP facility and lack consistency.
>
> This series addresses the problem 1), internal to qemu.
>
> And also proposes to replace the idiomatic:
>
> -> { "execute": "do-foo", "id": 42 }
>
> <- { "return": {}, "id": 42 } (this is a dummy useless return)
> (foo is in fact async, you may do other commands here)
I know you like to insist on its uselessness, but no, it's not useless.
It tells the management tool that the background job has successfully
been started and block job management commands can be used with it now.
>
> <- { "event": "FOO_DONE" }
> (this is a broadcast event that other monitors may not know how to deal
> with; naming is inconsistent across the various async ops, the "id" field
> may be lost, there are no facilities in generated code, etc.)
Are these theoretical concerns or do you see them confirmed with
actually existing commands?

The broadcast is actually a feature, as mentioned above, because it
allows libvirt to reconnect after losing the connection and continue to
control the background operation.
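
A rough Python sketch of the difference, using a made-up ToyMonitorHub class (none of these names exist in qemu): a reply is routed only to the connection that sent the command, so it is lost if that connection is gone, while a broadcast event reaches whichever monitors are connected at the time.

```python
class ToyMonitorHub:
    """Toy model of reply routing versus event broadcast."""

    def __init__(self):
        self.inboxes = {}  # connection name -> list of delivered messages

    def connect(self, name):
        self.inboxes[name] = []

    def disconnect(self, name):
        self.inboxes.pop(name, None)

    def reply(self, name, msg):
        # A command reply goes only to the sender; if the sender is gone,
        # there is nobody to deliver it to.
        inbox = self.inboxes.get(name)
        if inbox is None:
            return False
        inbox.append(msg)
        return True

    def broadcast(self, msg):
        # An event goes to every monitor that is connected right now.
        for inbox in self.inboxes.values():
            inbox.append(msg)


hub = ToyMonitorHub()
hub.connect("old")
hub.disconnect("old")   # the connection that started the operation is lost
hub.connect("new")      # the management tool reconnects

reply_delivered = hub.reply("old", {"return": {}, "id": 42})
hub.broadcast({"event": "FOO_DONE"})
new_inbox = hub.inboxes["new"]
```

Here the reply addressed to "old" cannot be delivered, whereas the reconnected monitor still learns about completion through the event.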
> with a streamlined:
>
> -> { "execute": "do-foo", "id": 42 }
> (you may do other commands here)
>
>
> <- { "return": {}, "id": 42 } (returned only to the caller)
> (if there is a global state change, there should also be a FOO_DONE
> event)
>
> As pointed out in the cover letter, existing clients *have to* deal with
> dispatching unrelated messages when sending commands, because events may
> come before a return message. So they have facilities to handle async
> replies.
>
> But in any case, this streamlined version is behind an "async" QMP
> capability.
>
> I have been careful to not expose this change to qemu internal or qemu
> client if they don't want or need it.
The question is whether enough users (command implementations and
clients) need the change to justify maintaining another type of command
long term. Just not breaking existing users doesn't justify a new
feature, it's only the most basic requirement for it to even be
considered.

Kevin