From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51544)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <armbru@redhat.com>) id 1ZbU4B-0008LS-15
	for qemu-devel@nongnu.org; Mon, 14 Sep 2015 09:45:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <armbru@redhat.com>) id 1ZbU47-0001ih-OS
	for qemu-devel@nongnu.org; Mon, 14 Sep 2015 09:45:06 -0400
Received: from mx1.redhat.com ([209.132.183.28]:50561)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <armbru@redhat.com>) id 1ZbU47-0001iI-Gt
	for qemu-devel@nongnu.org; Mon, 14 Sep 2015 09:45:03 -0400
From: Markus Armbruster <armbru@redhat.com>
References: <CAJ+F1CLnB9bJ0R0sk5+VJvB582xx+0QYYrk3OieJYvtvBEpieg@mail.gmail.com>
	<20150729150531.GI16847@redhat.com>
	<CAJ+F1CLueL9JcXo3eqhVUUi6nWG4Y3+La=s8v3v99uHHCcn8-g@mail.gmail.com>
Date: Mon, 14 Sep 2015 15:45:00 +0200
In-Reply-To: <CAJ+F1CLueL9JcXo3eqhVUUi6nWG4Y3+La=s8v3v99uHHCcn8-g@mail.gmail.com>
	(=?utf-8?Q?=22Marc-Andr=C3=A9?= Lureau"'s message of "Wed, 29 Jul 2015
	17:18:10 +0200")
Message-ID: <87613dp12b.fsf@blackfin.pond.sub.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] RFC: async commands with QMP
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: =?utf-8?Q?Marc-Andr=C3=A9?= Lureau <marcandre.lureau@gmail.com>
Cc: QEMU <qemu-devel@nongnu.org>

I'm so far behind on e-mail it's not funny anymore.  My apologies...

Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi Daniel
>
> On Wed, Jul 29, 2015 at 5:05 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
>> When QMP was originally written there was some infrastructure for doing
>> async commands, but over time this has all been ripped out because it
>> was never really used, complicated the code and didn't work too well
>> either. It seems we pretty much settled on the approach that all
>> commands should be fast to execute, and where there is a long term
>> job being run, we have commands to query its status, cancel it, and
>> sometimes events too.  One of the benefits of this is that it means
>> that libvirt can determine current status of ongoing background jobs
>> when it restarts and reconnects to a previously launched QEMU, where
>> as an async command approach is tied to the specific monitor connection
>> that is open.
>
> Thanks for the historic reasons, I ignored that.
>
> Querying status is not incompatible with having async commands. It's
> not because command return immediately that the job is finished, it's
> async already!

I'm not sure I got your point here.  But see below.

>                This proposal is about having a return when the job is
> actually finished to avoid having to continuously query, that's the
> main motivation. Today's return for async commands is pretty useless,
> you can admit, it's just a ack that the command got accepted and
> eventually started...

Let's step back and consider the general life cycle of a "job".  I'm
calling it "job" to distinguish it from "QMP command".

    User orders a job to get done

    System either accepts or rejects the job
    if reject, we're done

    Job runs for a while, then either fails or succeeds

Anthony's initial QMP design wanted this wrapped in QMP as follows:

    User sends a QMP command
    If we reject the job, send a QMP error response
    Else, the job runs until it fails or succeeds
        when it fails, send a QMP error response
        when it succeeds, send a QMP success response

The idea was that *all* commands are asynchronous!  However, the initial
implementation was in fact completely synchronous.
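
To make the message flow of that original design concrete, here is a
minimal sketch in Python.  It is an illustrative model only, not QEMU
code: the function name, the shape of the `job` dict and the simplified
response format are all assumptions.  It shows ordering, not
concurrency: the client gets exactly one response per command, and it
arrives only once the job has ended.

```python
# Illustrative model only; names and message shapes are assumptions,
# not QEMU code.
def run_command_async_design(cmd_id, job):
    """Return the messages sent to the client under the original design.

    `job` is a hypothetical dict with an "accept" flag and a "run"
    callable that returns a result or raises on failure.
    """
    if not job["accept"]:
        # Rejected up front: error response, nothing runs.
        return [{"id": cmd_id, "error": {"desc": "job rejected"}}]
    try:
        result = job["run"]()  # may take arbitrarily long
    except Exception as exc:
        # The job failed: the one and only response is an error.
        return [{"id": cmd_id, "error": {"desc": str(exc)}}]
    # The job succeeded: the one and only response carries its result.
    return [{"id": cmd_id, "return": result}]
```

Note that the response doubles as the completion notice, so the command
ID is the only handle the client has on the job.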

For most jobs, the "while" in "runs for a while" is reliably very short,
so this wasn't a problem.  When the first job was added where that isn't
the case, the implementation was hacked up to support asynchronous
commands.  However, we didn't change the existing commands to become
asynchronous then.  Probably because we were pushing very hard to get
QMP reasonably complete, so clients can migrate off HMP.  In retrospect,
that was probably the last chance to go back to the original design,
because from then on, commands being synchronous became ABI real quick.
Asynchronous commands remained a freaky exception, and we never even
bothered to fix their bugs.

Instead, we went down a different route: we used QMP events to signal
job completion.  Events already existed, so this was natural.  Jobs
are then wrapped in QMP as follows:

    User sends a QMP command
    If we reject the job, send a QMP error response
    Else, send a QMP success response
        now the job runs until it fails or succeeds
        in either case, send a QMP event

For the event to make sense, the initial command usually needs to
establish a suitable job ID.
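
A minimal Python sketch of the command + event flow, again as an
illustrative model rather than QEMU code.  The `Monitor` class, its
method names and the `JOB_COMPLETED` event name are hypothetical, and
having the client supply the job ID is just one possible way to
establish it.

```python
# Illustrative model only; class, method and event names are
# hypothetical, not QEMU's.
class Monitor:
    def __init__(self):
        self.sent = []      # messages sent to the client, in order
        self._pending = {}  # job ID -> callable doing the actual work

    def start_job(self, cmd_id, job_id, work):
        """Synchronous command: respond at once, run the job later."""
        if job_id in self._pending:
            # Reject the job: error response, and we're done.
            self.sent.append(
                {"id": cmd_id, "error": {"desc": "job ID in use"}})
            return
        self._pending[job_id] = work
        # Success here means "accepted", not "finished".
        self.sent.append({"id": cmd_id, "return": {}})

    def run_jobs(self):
        """Stand-in for the main loop running jobs to completion."""
        for job_id, work in list(self._pending.items()):
            del self._pending[job_id]
            data = {"job-id": job_id}
            try:
                work()
            except Exception as exc:
                data["error"] = str(exc)
            # Success or failure, completion is signalled by an event
            # that names the job, not the command.
            self.sent.append({"event": "JOB_COMPLETED", "data": data})
```

The event carries the job ID rather than the command ID, because by the
time it is sent the command has long been answered.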

Now let's refine the life cycle some: clients want to query job status,
and cancel jobs.

The synchronous commands + events design can support it easily.  Simply
have a command to query status, taking the job ID as argument.
Likewise, have a command to cancel.  Both are synchronous.  Cancel may
fail (say because the job has completed meanwhile).
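
Sketched in Python, under the same caveat that this is an illustrative
model and not QEMU code (the `JobTable` class, its methods and the
status strings are hypothetical):

```python
# Illustrative model only; names and status strings are hypothetical.
class JobTable:
    def __init__(self):
        # job ID -> "running" | "completed" | "cancelled"
        self._status = {}

    def add(self, job_id):
        self._status[job_id] = "running"

    def complete(self, job_id):
        """Called when a running job finishes on its own."""
        if self._status.get(job_id) == "running":
            self._status[job_id] = "completed"

    def query(self, job_id):
        """Synchronous query command, keyed by job ID."""
        if job_id not in self._status:
            return {"error": {"desc": "unknown job"}}
        return {"return": {"job-id": job_id,
                           "status": self._status[job_id]}}

    def cancel(self, job_id):
        """Synchronous cancel command; fails if the job is not running."""
        status = self._status.get(job_id)
        if status is None:
            return {"error": {"desc": "unknown job"}}
        if status != "running":
            # For example, the job completed in the meantime.
            return {"error": {"desc": "job is not running"}}
        self._status[job_id] = "cancelled"
        return {"return": {}}
```

Both commands return immediately, and neither depends on which monitor
connection started the job.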

The asynchronous command design could support it as well, with one
problem: we need a job ID.  The QMP command ID is no good, because it's
tied to a connection, while a job ID must remain valid as long as the
job runs.  Solvable.

So yes, "querying status is not incompatible with having async
commands".

My point is: asynchronous commands vs. synchronous commands + events
appears to be a wash: both get the job done.  For better or worse, we
have the latter working, but not the former.  Why should we add the
former now?