From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43973) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XZM2p-0003OK-4k for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:42:29 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XZM2i-0003sP-Ng for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:42:23 -0400 Received: from dew.nodalink.com ([95.130.14.197]:39384) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XZM2i-0003rJ-F4 for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:42:16 -0400 Date: Wed, 1 Oct 2014 15:42:18 +0000 From: =?iso-8859-1?Q?Beno=EEt?= Canet Message-ID: <20141001154218.GA17552@nodalink.com> References: <1411412452-8655-1-git-send-email-benoit.canet@nodalink.com> <1411412452-8655-3-git-send-email-benoit.canet@nodalink.com> <20141001043842.GA12347@localhost.localdomain> <20141001092944.GB5378@nodalink.com> <20141001151952.GB12347@localhost.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20141001151952.GB12347@localhost.localdomain> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jeff Cody Cc: kwolf@redhat.com, famz@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org, =?iso-8859-1?Q?Beno=EEt?= Canet > >=20 > > The main purpose of this is mirror.c and commit.c would form BDS loop= s on completion. > > These callers could break the look manually but the code would fail > > if a loop is not breaked and the blocker function are called on it. > > So the blocker code have to handle recursion loops. >=20 > I think commit/mirror/etc should break any loops prior to calling > recursive functions on those loops (just like it should do before > calling bdrv_unref(), etc..). Otherwise, I think the recursive > functions make assumptions that may be true in certain contexts, but > not necessarily all. >=20 > (Hmm, I believe that Fam had a series that did some nice cleanup on > bdrv_drop_intermediate() and related areas that did some loop > breaking, IIRC). Ok I could use that as a basis. > >=20 > > I don't think this particular test is a failure point. > > >=20 > With the way it is written, the individual BDS is blocked with the > same reason pointer, but not necessarily the whole chain - unless you > make the assumption that blockers will only be used via the recursive > interface, and never individually on a node. there is no more a no recursive version with this patch so this assumptio= n will be respected. >=20 > The caller doesn't have an interface to check that the chain is not > blocked - bdrv_op_is_blocked_by() doesn't operate recursively. =20 >=20 > If we tried to block a chain that has some portion already blocked for > the same reason, shouldn't that be an error? Why should we allow this ? My understanding is that blocking something should be associated to a single operation whatever they are. So one operation to block implying one different reason is not so strange= . > > > > + > > > > + /* block first for recursion loop protection to work */ > > > > + bdrv_do_op_block(bs, op, reason); > > > > + > > > > + bdrv_op_block(bs->file, base, op, reason); > > > > + > > > > + if (bs->drv && bs->drv->supports_backing) { > > > > + bdrv_op_block(bs->backing_hd, base, op, reason); > > > > + } > > > > + > > > > + if (bs->drv && bs->drv->bdrv_op_recursive_block) { > > > > + bs->drv->bdrv_op_recursive_block(bs, base, op, reason); > > >=20 > > > Do we need to allow .bdrv_op_recursive_block() to fail and return > > > error (and handle it, of course)? > >=20 > > I don't know yet: but lets let this question float a little more in t= he mail thread. > >=20 > > >=20 > > >=20 > > >=20 > >=20 > > To reach this state the caller code would have to do the following tw= isted sequence. > >=20 > > block(image3, with_reason1) > > unblock(image2, with_reason1) > > block(image1, with_reason1) > > > > There is no such sequence in the code thanks to the base argument and= we could > > enforce that no such sequence ever get written. > > >=20 > If we ignore blockdev-add and scenarios where an image node may have > multiple overlays, we might be able to assume that such a sequence > could not exist. >=20 > But in that case, should this negative check result in an error? >=20 > It would seem at this point we would have encountered one of these > scenarios: >=20 > 1.) An invalid block/unblock state in the chain, if we assume that no > such sequence should exist >=20 > 2.) We tried to unblock more than we originally blocked >=20 > > >=20 > > > I would assume that bdrv_op_unblock(image2, NULL, reason) would sti= ll > > > unblock image1, even if image2 was unblocked. > >=20 > > Should we also assume that bdrv_op_unblock(image4, NULL, reason) with= image4 being > > image3 parent unblock everything underneath ? > >=20 >=20 > I think we either do that, or return an error. But to have > bdrv_op_unblock() (or bdrv_op_block()) silently stop at some point in > the chain prior to reaching 'base' doesn't seem correct to me. >=20 Maybe you are right. I don't mind rewriting the patchset with error handling and without the r= ecursion loop avoidance code given I find Fam's loop breaking patches on the list. I remember trying to write loop breaking by myself just before 2.1 and it= was annoying. Best regards Beno=EEt