Message-ID: <53B2EBA0.8080108@redhat.com>
Date: Tue, 01 Jul 2014 19:10:56 +0200
From: Paolo Bonzini
References: <1403889855-5740-1-git-send-email-armbru@redhat.com> <1403889855-5740-2-git-send-email-armbru@redhat.com> <53B2EB0F.2050402@redhat.com>
In-Reply-To: <53B2EB0F.2050402@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 2.1 1/3] blockjob: Fix recent BLOCK_JOB_READY regression
To: Eric Blake, Markus Armbruster, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, wenchaoqemu@gmail.com, stefanha@redhat.com, lcapitulino@redhat.com

On 01/07/2014 19:08, Eric Blake wrote:
> On 06/27/2014 11:24 AM, Markus Armbruster wrote:
>> Commit bcada37 dropped the (up to now undocumented) members type, len,
>> offset, speed, breaking tests/qemu-iotests/040 and 041.
>>
>> Restore and document them.  This fixes 040, and partially fixes 041.
>>
>> Signed-off-by: Markus Armbruster
>> Tested-By: Benoit Canet
>> ---
>>  blockjob.c           |  6 +++++-
>>  qapi/block-core.json | 15 ++++++++++++++-
>>  2 files changed, 19 insertions(+), 2 deletions(-)
>
> Nothing wrong with this commit, but a design issue that I've recently
> run into:
>
> what happens if management misses the BLOCK_JOB_COMPLETED event?  How is
> it supposed to learn whether the job succeeded or failed?
> 'query-blockjobs' no longer reports the job (because it is completed),
> so all information about the job is lost.  Normally, we've tried hard to
> make sure that all information learned from an event can also be polled
> (the ideal is use of events to minimize cpu overhead, but to rely on the
> poll in situations where events may have been lost such as on a libvirtd
> restart).
>
> Should we enhance job failure to be sticky, in that it not only causes
> an event, but also remains around so that it can be reported in the next
> 'query-blockjobs'?

I think this fixes itself automatically if you use
rerror=stop/werror=stop on block jobs.  At least that was part of the
design; whether the implementation gets it right I cannot say without
looking at the code more carefully.

Paolo
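
To make the lost-event scenario above concrete, here is a toy Python
model of it.  It is only a sketch: FakeHypervisor, reconcile(),
lost_event_outcome() and the "result" field are hypothetical names
invented for illustration and do not correspond to QEMU's or libvirt's
actual interfaces.  It shows that when a finished job simply disappears
from the poll, a manager that missed the event cannot tell success from
failure, whereas a "sticky" finished job can still be polled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeHypervisor:
    """Toy stand-in for the emulator side (hypothetical, not QEMU's API)."""
    sticky: bool = False  # if True, finished jobs stay visible to polling
    running: Dict[str, str] = field(default_factory=dict)
    finished: Dict[str, str] = field(default_factory=dict)
    events: List[dict] = field(default_factory=list)

    def start_job(self, device: str) -> None:
        self.running[device] = "running"

    def finish_job(self, device: str, ok: bool) -> None:
        # The job leaves the running set and a completion event is queued.
        del self.running[device]
        result = "ok" if ok else "failed"
        self.events.append(
            {"event": "BLOCK_JOB_COMPLETED", "device": device, "result": result}
        )
        if self.sticky:
            # Sticky variant: keep the outcome around for later polls.
            self.finished[device] = result

    def query_jobs(self) -> Dict[str, str]:
        # Poll: running jobs, plus finished ones only in the sticky variant.
        jobs = dict(self.running)
        jobs.update(self.finished)
        return jobs


def reconcile(known_devices: List[str], hv: FakeHypervisor) -> Dict[str, str]:
    """After a manager restart, rebuild job state from a poll alone."""
    polled = hv.query_jobs()
    return {dev: polled.get(dev, "unknown") for dev in known_devices}


def lost_event_outcome(sticky: bool) -> Dict[str, str]:
    """Run one failing job, drop its event, then poll."""
    hv = FakeHypervisor(sticky=sticky)
    hv.start_job("drive0")
    hv.finish_job("drive0", ok=False)
    hv.events.clear()  # simulates the manager restarting and missing the event
    return reconcile(["drive0"], hv)


if __name__ == "__main__":
    print(lost_event_outcome(sticky=False))
    print(lost_event_outcome(sticky=True))
```

In the non-sticky case the poll yields "unknown" for drive0; in the
sticky case it yields "failed".  Whether stopping the job on error gives
the same visibility in practice is the open question raised above.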