From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51544)
 by lists.gnu.org with esmtp (Exim 4.71) id 1ZbU4B-0008LS-15
 for qemu-devel@nongnu.org; Mon, 14 Sep 2015 09:45:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 id 1ZbU47-0001ih-OS
 for qemu-devel@nongnu.org; Mon, 14 Sep 2015 09:45:06 -0400
Received: from mx1.redhat.com ([209.132.183.28]:50561)
 by eggs.gnu.org with esmtp (Exim 4.71) id 1ZbU47-0001iI-Gt
 for qemu-devel@nongnu.org; Mon, 14 Sep 2015 09:45:03 -0400
From: Markus Armbruster
References: <20150729150531.GI16847@redhat.com>
Date: Mon, 14 Sep 2015 15:45:00 +0200
In-Reply-To: ("Marc-André Lureau"'s message of "Wed, 29 Jul 2015 17:18:10 +0200")
Message-ID: <87613dp12b.fsf@blackfin.pond.sub.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Subject: Re: [Qemu-devel] RFC: async commands with QMP
To: Marc-André Lureau
Cc: QEMU

I'm so far behind on e-mail it's not funny anymore. My apologies...

Marc-André Lureau writes:

> Hi Daniel
>
> On Wed, Jul 29, 2015 at 5:05 PM, Daniel P. Berrange wrote:
>> When QMP was originally written there was some infrastructure for doing
>> async commands, but over time this has all been ripped out because it
>> was never really used, complicated the code and didn't work too well
>> either. It seems we pretty much settled on the approach that all
>> commands should be fast to execute, and where there is a long term
>> job being run, we have commands to query its status, cancel it, and
>> sometimes events too.
>> One of the benefits of this is that it means
>> that libvirt can determine current status of ongoing background jobs
>> when it restarts and reconnects to a previously launched QEMU, where
>> as an async command approach is tied to the specific monitor connection
>> that is open.
>
> Thanks for the historic reasons, I ignored that.
>
> Querying status is not incompatible with having async commands. It's
> not because command return immediately that the job is finished, it's
> async already!

I'm not sure I got your point here. But see below.

> This proposal is about having a return when the job is
> actually finished to avoid having to continuously query, that's the
> main motivation. Today's return for async commands is pretty useless,
> you can admit, it's just a ack that the command got accepted and
> eventually started...

Let's step back and consider the general life cycle of a "job". I'm
calling it "job" to distinguish it from "QMP command".

    User orders a job to get done
    System either accepts or rejects the job
        if reject, we're done
    Job runs for a while, then either fails or succeeds

Anthony's initial QMP design wanted this wrapped in QMP as follows:

    User sends a QMP command
    If we reject the job, send a QMP error response
    Else, the job runs until it fails or succeeds
        when it fails, send a QMP error response
        when it succeeds, send a QMP success response

The idea was that *all* commands are asynchronous!

However, the initial implementation was in fact completely synchronous.
For most jobs, the "while" in "runs for a while" is reliably very short,
so this wasn't a problem.

When the first job was added where that isn't the case, the
implementation was hacked up to support asynchronous commands.

However, we didn't change the existing commands to become asynchronous
then. Probably because we were pushing very hard to get QMP reasonably
complete, so clients can migrate off HMP.
In retrospect, that was probably the last chance to go back to the
original design, because from then on, commands being synchronous
became ABI real quick. Asynchronous commands remained a freaky
exception, and we never even bothered to fix their bugs.

Instead, we went down a different route: we used QMP events to signal
job completion. Events already existed, so this was natural. Jobs
became wrapped in QMP as follows:

    User sends a QMP command
    If we reject the job, send a QMP error response
    Else, send a QMP success response
        now the job runs until it fails or succeeds
        in either case, send a QMP event

For the event to make sense, the initial command usually needs to
establish a suitable job ID.

Now let's refine the life cycle some: clients want to query job status,
and cancel jobs.

The synchronous commands + events design can support it easily. Simply
have a command to query status, taking the job ID as argument.
Likewise, have a command to cancel. Both are synchronous. Cancel may
fail (say because the job has completed meanwhile).

The asynchronous command design could support it as well, with one
problem: we need a job ID. The QMP command ID is no good, because it's
tied to a connection, while a job ID must remain valid as long as the
job runs. Solvable.

So yes, "querying status is not incompatible with having async
commands".

My point is: asynchronous commands vs. synchronous commands + events
appears to be a wash: both get the job done. For better or worse, we
have the latter working, but not the former. Why should we add the
former now?
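
For illustration only, the synchronous commands + events life cycle
described above can be sketched as a toy model in Python. This is not
QEMU code and not the QMP wire protocol: JobMonitor, start_job,
query_job, cancel_job, finish_job and the JOB_DONE event name are all
made up for this sketch. The job table lives in the monitor object
rather than in any one connection, which is the property that lets a
reconnecting client query a job by its ID.

```python
import itertools


class JobMonitor:
    """Toy model of 'synchronous commands + events' (hypothetical names)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.jobs = {}    # job ID -> "running", "succeeded", "failed", "cancelled"
        self.events = []  # events emitted so far, oldest first

    def start_job(self, valid=True):
        """Synchronous command: accept or reject the job, return at once."""
        if not valid:
            return {"error": "job rejected"}
        job_id = f"job{next(self._ids)}"
        self.jobs[job_id] = "running"
        return {"return": {"id": job_id}}

    def query_job(self, job_id):
        """Synchronous command: report status, keyed by job ID."""
        if job_id not in self.jobs:
            return {"error": "no such job"}
        return {"return": {"id": job_id, "status": self.jobs[job_id]}}

    def cancel_job(self, job_id):
        """Synchronous command: may fail, e.g. if the job already completed."""
        if self.jobs.get(job_id) != "running":
            return {"error": "job not running"}
        self._finish(job_id, "cancelled")
        return {"return": {}}

    def finish_job(self, job_id, ok):
        """Stands in for the job failing or succeeding on its own, later."""
        if self.jobs.get(job_id) == "running":
            self._finish(job_id, "succeeded" if ok else "failed")

    def _finish(self, job_id, status):
        # In either case (success, failure, cancel), emit one event.
        self.jobs[job_id] = status
        self.events.append(
            {"event": "JOB_DONE", "data": {"id": job_id, "status": status}}
        )


mon = JobMonitor()
job_id = mon.start_job()["return"]["id"]  # success response carries the job ID
print(mon.query_job(job_id))              # job is still running
mon.finish_job(job_id, ok=True)           # job completes in the background
print(mon.events)                         # completion is signalled by an event
print(mon.cancel_job(job_id))             # cancel fails: job completed meanwhile
```

An asynchronous-command variant of the same model would hold back the
response to start_job until _finish runs, and would still need the
separate job ID for query and cancel.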