From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:38892)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jcody@redhat.com>) id 1XZLhH-00056d-Bt
	for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:20:13 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jcody@redhat.com>) id 1XZLhB-0003ic-4I
	for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:20:07 -0400
Received: from mx1.redhat.com ([209.132.183.28]:34406)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jcody@redhat.com>) id 1XZLhA-0003iY-Sq
	for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:20:01 -0400
Date: Wed, 1 Oct 2014 11:19:52 -0400
From: Jeff Cody <jcody@redhat.com>
Message-ID: <20141001151952.GB12347@localhost.localdomain>
References: <1411412452-8655-1-git-send-email-benoit.canet@nodalink.com>
	<1411412452-8655-3-git-send-email-benoit.canet@nodalink.com>
	<20141001043842.GA12347@localhost.localdomain>
	<20141001092944.GB5378@nodalink.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <20141001092944.GB5378@nodalink.com>
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: =?iso-8859-1?Q?Beno=EEt?= Canet <benoit.canet@nodalink.com>
Cc: kwolf@redhat.com, famz@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com

On Wed, Oct 01, 2014 at 09:29:44AM +0000, Beno=EEt Canet wrote:
>=20
> Thanks a lot for reviewing this patch.
>=20
> Since the code is not trivial I will give my arguments for writing it
> this way.
>

Thanks, that is very helpful.

>=20
> > > +/* block recursively a BDS
> > > + *
> > > + * If base !=3D NULL the caller must make sure that there is no mu=
ltiple child
> > > + * BDS in the subtree pointed to by bs
> >=20
> > In the case when base !=3D NULL, should that really matter?  In a
> > driver's .bdrv_op_recursive_block/unblock, if that driver has private
> > children (multiple or not), shouldn't the private children be treated
> > as one black box, and blocked / unblocked just like the parent
> > BDS?
>=20
> This is a stale comment. My bad.
>=20
> >=20
> > For instance, what if we had a tree such as this:
> >=20
> >                        [quorum1] <---- [active]
> >                            |
> >                            | (private)
> >                            |
> >                            v
> > [node2]<-- [node1] <--- [imag>=20
> > if 'quorum1' was to be op blocked, and 'image1' and its children all
> > comprise 'quorum1', wouldn't we always want to lock 'image1' all the
> > way down to 'node2'?
>=20
> That's what the patch does.
>=20

OK, great - my comment above was written in response to the stale comment=
 :)

> >=20
> > Or do we expect that someone will intentionally pass a 'base' pointer
> > along that matches somewhere in the private child tree?
>=20
> This is not expected by the caller but I wrote the patch so it will als=
o work.
>=20
> >=20
> > > + *
> > > + * @bs:     the BDS to start to recurse from
> > > + * @base:   the BDS where the recursion should end
> > > + *          could be NULL if the recursion should go down everywhe=
re
> > > + * @op:     the type of op blocker to block
> > > + * @reason: the error message associated with the blocking
> > > + */
> > > +void bdrv_op_block(BlockDriverState *bs, BlockDriverState *base,
> > > +                   BlockOpType op, Error *reason)
> > > +{
> > > +    if (!bs) {
> > > +        return;
> > > +    }
> > > +
> > > +    /* did we recurse down to base ? */
> > > +    if (bs =3D=3D base) {
> > > +        return;
> > > +    }
> > > +
> > > +    /* prevent recursion loop */
> > > +    if (bdrv_op_is_blocked_by(bs, op, reason)) {
> > > +        return;
> > > +    }
> >=20
> > Should we handle this somehow (isn't this effectively an error, that
> > will prematurely end the blocking of this particular line)?=20
>=20
> The main purpose of this is mirror.c and commit.c would form BDS loops =
on completion.
> These callers could break the look manually but the code would fail
> if a loop is not breaked and the blocker function are called on it.
> So the blocker code have to handle recursion loops.

I think commit/mirror/etc should break any loops prior to calling
recursive functions on those loops (just like it should do before
calling bdrv_unref(), etc..).  Otherwise, I think the recursive
functions make assumptions that may be true in certain contexts, but
not necessarily all.

(Hmm, I believe that Fam had a series that did some nice cleanup on
bdrv_drop_intermediate() and related areas that did some loop
breaking, IIRC).

>=20
> The bdrv_op_is_blocked_by match a reason being the criteria to test the=
 blocker.
>=20
> So this test will trigger only if a BDS is already blocked by the same =
reason
> pointer.
>=20
> This make the bdrv_op_block function idempotent.
>=20
> note that it is the only sane way I found to make the blockers function=
 handle
> loops.
>=20
> >=20
> > If we hit this, we are going to end up in a scenario where we haven't
> > blocked the chain as requested, and we don't know the state of the
> > blocking below this failure point.  Seems like the caller should know=
,
> > and we should have a way of cleaning up.
>=20
> If we hit this the chain was already blocked with the same reason point=
er.
>=20
> >=20
> > Actually, now I am wondering if we perhaps shouldn't be building a
> > list to block, and then blocking everything atomically once we know
> > there are no failure points.
> >
>=20
> I don't think this particular test is a failure point.
>

With the way it is written, the individual BDS is blocked with the
same reason pointer, but not necessarily the whole chain - unless you
make the assumption that blockers will only be used via the recursive
interface, and never individually on a node.

The caller doesn't have an interface to check that the chain is not
blocked - bdrv_op_is_blocked_by() doesn't operate recursively. =20

If we tried to block a chain that has some portion already blocked for
the same reason, shouldn't that be an error?


> > > +
> > > +    /* block first for recursion loop protection to work */
> > > +    bdrv_do_op_block(bs, op, reason);
> > > +
> > > +    bdrv_op_block(bs->file, base, op, reason);
> > > +
> > > +    if (bs->drv && bs->drv->supports_backing) {
> > > +        bdrv_op_block(bs->backing_hd, base, op, reason);
> > > +    }
> > > +
> > > +    if (bs->drv && bs->drv->bdrv_op_recursive_block) {
> > > +        bs->drv->bdrv_op_recursive_block(bs, base, op, reason);
> >=20
> > Do we need to allow .bdrv_op_recursive_block() to fail and return
> > error (and handle it, of course)?
>=20
> I don't know yet: but lets let this question float a little more in the=
 mail thread.
>=20
> >=20
> >=20
> >=20
> > s/block/unblock
> Thanks
>=20
> >=20
> >=20
> > > + * @reason: the error message associated with the blocking
> > > + */
> > > +void bdrv_op_unblock(BlockDriverState *bs, BlockDriverState *base,
> > > +                     BlockOpType op, Error *reason)
> > > +{
> > > +    if (!bs) {
> > > +        return;
> > > +    }
> > > +
> > > +    /* did we recurse down to base ? */
> > > +    if (bs =3D=3D base) {
> > > +        return;
> > > +    }
> > > +
> > > +    /* prevent recursion loop */
> > > +    if (!bdrv_op_is_blocked_by(bs, op, reason)) {
> > > +        return;
> > > +    }
> >=20
> > Do we want to do this negative check here?  Even if a given node is
> > not blocked, its children may be, and I would assume (since this is
> > recursive) that if I called bdrv_op_unblock() all of the chain down t=
o
> > 'base' would be unblocked for the given reason.
> >=20
> > E.g.=20
> >=20
> >    [image1] <-- [image2] <-- [image3]
> >    blocked      unblocked     blocked
>=20
> To reach this state the caller code would have to do the following twis=
ted sequence.
>=20
> block(image3, with_reason1)
> unblock(image2, with_reason1)
> block(image1, with_reason1)
>
> There is no such sequence in the code thanks to the base argument and w=
e could
> enforce that no such sequence ever get written.
>

If we ignore blockdev-add and scenarios where an image node may have
multiple overlays, we might be able to assume that such a sequence
could not exist.

But in that case, should this negative check result in an error?

It would seem at this point we would have encountered one of these
scenarios:

1.) An invalid block/unblock state in the chain, if we assume that no
such sequence should exist

2.) We tried to unblock more than we originally blocked

> >=20
> > I would assume that bdrv_op_unblock(image2, NULL, reason) would still
> > unblock image1, even if image2 was unblocked.
>=20
> Should we also assume that bdrv_op_unblock(image4, NULL, reason) with i=
mage4 being
> image3 parent unblock everything underneath ?
>=20

I think we either do that, or return an error.  But to have
bdrv_op_unblock() (or bdrv_op_block()) silently stop at some point in
the chain prior to reaching 'base' doesn't seem correct to me.

> > > +static void quorum_op_recursive_block(BlockDriverState *bs,
> > > +                                      BlockDriverState *base,
> > > +                                      BlockOpType op,
> > > +                                      Error *reason)
> > > +{
> > > +    BDRVQuorumState *s =3D bs->opaque;
> > > +    int i;
> > > +
> > > +    for (i =3D 0; i < s->num_children; i++) {
> > > +        bdrv_op_block(s->bs[i], base, op, reason);
> > > +    }
> >=20
> > I am wondering if the 'base' argument above should always be NULL in
> > the case of private children.  I.e., the state of private children
> > trees in their entirety should always reflect the state of the
> > multi-children BDS parent.
>=20
> Yes we could simply discard it.
>=20
> > > +    /* helps blockers to propagate recursively */
> > > +    void (*bdrv_op_recursive_block)(BlockDriverState *bs,
> > > +                                    BlockDriverState *base,
> > > +                                    BlockOpType op,
> > > +                                    Error *reason);
> > > +    void (*bdrv_op_recursive_unblock)(BlockDriverState *bs,
> > > +                                      BlockDriverState *base,
> > > +                                      BlockOpType op,
> > > +                                      Error *reason);
> > > +
> >=20
> > Seems like these new functions would be better named '.bdrv_op_block'
> > and '.bdrv_op_unblock'?  That way, recursive or not, it is clear bloc=
k
> > drivers can implement whatever blocking is appropriate for themselves.
>=20
> I agree.
>=20
> Best regards
>=20
> Beno=EEt