From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:37966) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SwXxJ-00045E-4D for qemu-devel@nongnu.org; Wed, 01 Aug 2012 08:23:18 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SwXxF-0005xv-0T for qemu-devel@nongnu.org; Wed, 01 Aug 2012 08:23:13 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46659) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SwXxE-0005xq-OG for qemu-devel@nongnu.org; Wed, 01 Aug 2012 08:23:08 -0400 Message-ID: <50191FA7.1070309@redhat.com> Date: Wed, 01 Aug 2012 14:23:03 +0200 From: Kevin Wolf MIME-Version: 1.0 References: <1343127865-16608-1-git-send-email-pbonzini@redhat.com> <1343127865-16608-14-git-send-email-pbonzini@redhat.com> <5019016F.1050909@redhat.com> <50191044.7070109@redhat.com> <501917CB.1060704@redhat.com> <50191C97.8000005@redhat.com> In-Reply-To: <50191C97.8000005@redhat.com> Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 13/47] block: introduce block job error List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini Cc: jcody@redhat.com, eblake@redhat.com, qemu-devel@nongnu.org, stefanha@linux.vnet.ibm.com Am 01.08.2012 14:09, schrieb Paolo Bonzini: > Il 01/08/2012 13:49, Kevin Wolf ha scritto: >>> The real question is: if I remove the possibility to inspect the (so far >>> anonymous) target device with query-block-jobs, how do you read the >>> status of the target device?... >> >> You don't? :-) > > That's a possibility. :) > > You can just report it in the block job. It's more consistent with the > fact that you got a BLOCK_JOB_ERROR and not a BLOCK_IO_ERROR. So I > would do: > > + bdrv_emit_qmp_error_event(job->bs, QEVENT_BLOCK_JOB_ERROR, > + action, is_read); Isn't job->bs the source? Also, if you miss the event (e.g. libvirt crashed), can you still tell which bs caused the error? Do we need another BlockJobInfo field for the name (*cough* ;-)) of the failed bs? If I understand it right, error reporting on the source and on the target would be completely symmetrical then, which I think is a good thing. job->bs makes one image special, which isn't really useful for a generic interface. So we should keep the fact that it exists internal and get rid of it sometime. > + if (action == BDRV_ACTION_STOP) { > + block_job_pause(job); > + block_job_iostatus_set_err(job, error); > + if (bs != job->bs) { > + bdrv_iostatus_set_err(bs, error); > + } > + } > > where the bdrv_iostatus_set_err is mostly to "prepare for the future" > usage of named block devices. Again, I'd make it unconditional to get symmetric behaviour. > As you said for ENOSPC vs. EIO, management must be ready to retry > multiple times, if it has only the final state at its disposal. > > On the other hand, if you see the exact sequence of BLOCK_IO_ERROR vs. > BLOCK_JOB_ERROR you know exactly how the error happened and you can fix it. > >> Maybe we should use named block devices from the beginning. > > Hmm, but I'm a bit wary of introducing such a big change now. We know > what it makes nicer, but we don't know of anything irremediably broken > without them, and we haven't thought enough of any warts it introduces. On the one hand, I can understand your concerns, but on the other hand, introducing an API now and then making such a big change afterwards scares me much more. Kevin