From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:38892) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XZLhH-00056d-Bt for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:20:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XZLhB-0003ic-4I for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:20:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:34406) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XZLhA-0003iY-Sq for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:20:01 -0400 Date: Wed, 1 Oct 2014 11:19:52 -0400 From: Jeff Cody Message-ID: <20141001151952.GB12347@localhost.localdomain> References: <1411412452-8655-1-git-send-email-benoit.canet@nodalink.com> <1411412452-8655-3-git-send-email-benoit.canet@nodalink.com> <20141001043842.GA12347@localhost.localdomain> <20141001092944.GB5378@nodalink.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20141001092944.GB5378@nodalink.com> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: =?iso-8859-1?Q?Beno=EEt?= Canet Cc: kwolf@redhat.com, famz@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com On Wed, Oct 01, 2014 at 09:29:44AM +0000, Beno=EEt Canet wrote: >=20 > Thanks a lot for reviewing this patch. >=20 > Since the code is not trivial I will give my arguments for writing it > this way. > Thanks, that is very helpful. >=20 > > > +/* block recursively a BDS > > > + * > > > + * If base !=3D NULL the caller must make sure that there is no mu= ltiple child > > > + * BDS in the subtree pointed to by bs > >=20 > > In the case when base !=3D NULL, should that really matter? In a > > driver's .bdrv_op_recursive_block/unblock, if that driver has private > > children (multiple or not), shouldn't the private children be treated > > as one black box, and blocked / unblocked just like the parent > > BDS? >=20 > This is a stale comment. My bad. >=20 > >=20 > > For instance, what if we had a tree such as this: > >=20 > > [quorum1] <---- [active] > > | > > | (private) > > | > > v > > [node2]<-- [node1] <--- [imag>=20 > > if 'quorum1' was to be op blocked, and 'image1' and its children all > > comprise 'quorum1', wouldn't we always want to lock 'image1' all the > > way down to 'node2'? >=20 > That's what the patch does. >=20 OK, great - my comment above was written in response to the stale comment= :) > >=20 > > Or do we expect that someone will intentionally pass a 'base' pointer > > along that matches somewhere in the private child tree? >=20 > This is not expected by the caller but I wrote the patch so it will als= o work. >=20 > >=20 > > > + * > > > + * @bs: the BDS to start to recurse from > > > + * @base: the BDS where the recursion should end > > > + * could be NULL if the recursion should go down everywhe= re > > > + * @op: the type of op blocker to block > > > + * @reason: the error message associated with the blocking > > > + */ > > > +void bdrv_op_block(BlockDriverState *bs, BlockDriverState *base, > > > + BlockOpType op, Error *reason) > > > +{ > > > + if (!bs) { > > > + return; > > > + } > > > + > > > + /* did we recurse down to base ? */ > > > + if (bs =3D=3D base) { > > > + return; > > > + } > > > + > > > + /* prevent recursion loop */ > > > + if (bdrv_op_is_blocked_by(bs, op, reason)) { > > > + return; > > > + } > >=20 > > Should we handle this somehow (isn't this effectively an error, that > > will prematurely end the blocking of this particular line)?=20 >=20 > The main purpose of this is mirror.c and commit.c would form BDS loops = on completion. > These callers could break the look manually but the code would fail > if a loop is not breaked and the blocker function are called on it. > So the blocker code have to handle recursion loops. I think commit/mirror/etc should break any loops prior to calling recursive functions on those loops (just like it should do before calling bdrv_unref(), etc..). Otherwise, I think the recursive functions make assumptions that may be true in certain contexts, but not necessarily all. (Hmm, I believe that Fam had a series that did some nice cleanup on bdrv_drop_intermediate() and related areas that did some loop breaking, IIRC). >=20 > The bdrv_op_is_blocked_by match a reason being the criteria to test the= blocker. >=20 > So this test will trigger only if a BDS is already blocked by the same = reason > pointer. >=20 > This make the bdrv_op_block function idempotent. >=20 > note that it is the only sane way I found to make the blockers function= handle > loops. >=20 > >=20 > > If we hit this, we are going to end up in a scenario where we haven't > > blocked the chain as requested, and we don't know the state of the > > blocking below this failure point. Seems like the caller should know= , > > and we should have a way of cleaning up. >=20 > If we hit this the chain was already blocked with the same reason point= er. >=20 > >=20 > > Actually, now I am wondering if we perhaps shouldn't be building a > > list to block, and then blocking everything atomically once we know > > there are no failure points. > > >=20 > I don't think this particular test is a failure point. > With the way it is written, the individual BDS is blocked with the same reason pointer, but not necessarily the whole chain - unless you make the assumption that blockers will only be used via the recursive interface, and never individually on a node. The caller doesn't have an interface to check that the chain is not blocked - bdrv_op_is_blocked_by() doesn't operate recursively. =20 If we tried to block a chain that has some portion already blocked for the same reason, shouldn't that be an error? > > > + > > > + /* block first for recursion loop protection to work */ > > > + bdrv_do_op_block(bs, op, reason); > > > + > > > + bdrv_op_block(bs->file, base, op, reason); > > > + > > > + if (bs->drv && bs->drv->supports_backing) { > > > + bdrv_op_block(bs->backing_hd, base, op, reason); > > > + } > > > + > > > + if (bs->drv && bs->drv->bdrv_op_recursive_block) { > > > + bs->drv->bdrv_op_recursive_block(bs, base, op, reason); > >=20 > > Do we need to allow .bdrv_op_recursive_block() to fail and return > > error (and handle it, of course)? >=20 > I don't know yet: but lets let this question float a little more in the= mail thread. >=20 > >=20 > >=20 > >=20 > > s/block/unblock > Thanks >=20 > >=20 > >=20 > > > + * @reason: the error message associated with the blocking > > > + */ > > > +void bdrv_op_unblock(BlockDriverState *bs, BlockDriverState *base, > > > + BlockOpType op, Error *reason) > > > +{ > > > + if (!bs) { > > > + return; > > > + } > > > + > > > + /* did we recurse down to base ? */ > > > + if (bs =3D=3D base) { > > > + return; > > > + } > > > + > > > + /* prevent recursion loop */ > > > + if (!bdrv_op_is_blocked_by(bs, op, reason)) { > > > + return; > > > + } > >=20 > > Do we want to do this negative check here? Even if a given node is > > not blocked, its children may be, and I would assume (since this is > > recursive) that if I called bdrv_op_unblock() all of the chain down t= o > > 'base' would be unblocked for the given reason. > >=20 > > E.g.=20 > >=20 > > [image1] <-- [image2] <-- [image3] > > blocked unblocked blocked >=20 > To reach this state the caller code would have to do the following twis= ted sequence. >=20 > block(image3, with_reason1) > unblock(image2, with_reason1) > block(image1, with_reason1) > > There is no such sequence in the code thanks to the base argument and w= e could > enforce that no such sequence ever get written. > If we ignore blockdev-add and scenarios where an image node may have multiple overlays, we might be able to assume that such a sequence could not exist. But in that case, should this negative check result in an error? It would seem at this point we would have encountered one of these scenarios: 1.) An invalid block/unblock state in the chain, if we assume that no such sequence should exist 2.) We tried to unblock more than we originally blocked > >=20 > > I would assume that bdrv_op_unblock(image2, NULL, reason) would still > > unblock image1, even if image2 was unblocked. >=20 > Should we also assume that bdrv_op_unblock(image4, NULL, reason) with i= mage4 being > image3 parent unblock everything underneath ? >=20 I think we either do that, or return an error. But to have bdrv_op_unblock() (or bdrv_op_block()) silently stop at some point in the chain prior to reaching 'base' doesn't seem correct to me. > > > +static void quorum_op_recursive_block(BlockDriverState *bs, > > > + BlockDriverState *base, > > > + BlockOpType op, > > > + Error *reason) > > > +{ > > > + BDRVQuorumState *s =3D bs->opaque; > > > + int i; > > > + > > > + for (i =3D 0; i < s->num_children; i++) { > > > + bdrv_op_block(s->bs[i], base, op, reason); > > > + } > >=20 > > I am wondering if the 'base' argument above should always be NULL in > > the case of private children. I.e., the state of private children > > trees in their entirety should always reflect the state of the > > multi-children BDS parent. >=20 > Yes we could simply discard it. >=20 > > > + /* helps blockers to propagate recursively */ > > > + void (*bdrv_op_recursive_block)(BlockDriverState *bs, > > > + BlockDriverState *base, > > > + BlockOpType op, > > > + Error *reason); > > > + void (*bdrv_op_recursive_unblock)(BlockDriverState *bs, > > > + BlockDriverState *base, > > > + BlockOpType op, > > > + Error *reason); > > > + > >=20 > > Seems like these new functions would be better named '.bdrv_op_block' > > and '.bdrv_op_unblock'? That way, recursive or not, it is clear bloc= k > > drivers can implement whatever blocking is appropriate for themselves. >=20 > I agree. >=20 > Best regards >=20 > Beno=EEt