From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43973)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <benoit.canet@nodalink.com>) id 1XZM2p-0003OK-4k
	for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:42:29 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <benoit.canet@nodalink.com>) id 1XZM2i-0003sP-Ng
	for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:42:23 -0400
Received: from dew.nodalink.com ([95.130.14.197]:39384)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <benoit.canet@nodalink.com>) id 1XZM2i-0003rJ-F4
	for qemu-devel@nongnu.org; Wed, 01 Oct 2014 11:42:16 -0400
Date: Wed, 1 Oct 2014 15:42:18 +0000
From: =?iso-8859-1?Q?Beno=EEt?= Canet <benoit.canet@nodalink.com>
Message-ID: <20141001154218.GA17552@nodalink.com>
References: <1411412452-8655-1-git-send-email-benoit.canet@nodalink.com>
	<1411412452-8655-3-git-send-email-benoit.canet@nodalink.com>
	<20141001043842.GA12347@localhost.localdomain>
	<20141001092944.GB5378@nodalink.com>
	<20141001151952.GB12347@localhost.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <20141001151952.GB12347@localhost.localdomain>
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Jeff Cody <jcody@redhat.com>
Cc: kwolf@redhat.com, famz@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org, =?iso-8859-1?Q?Beno=EEt?= Canet <benoit.canet@nodalink.com>

> >=20
> > The main purpose of this is mirror.c and commit.c would form BDS loop=
s on completion.
> > These callers could break the look manually but the code would fail
> > if a loop is not breaked and the blocker function are called on it.
> > So the blocker code have to handle recursion loops.
>=20
> I think commit/mirror/etc should break any loops prior to calling
> recursive functions on those loops (just like it should do before
> calling bdrv_unref(), etc..).  Otherwise, I think the recursive
> functions make assumptions that may be true in certain contexts, but
> not necessarily all.
>=20
> (Hmm, I believe that Fam had a series that did some nice cleanup on
> bdrv_drop_intermediate() and related areas that did some loop
> breaking, IIRC).

Ok I could use that as a basis.

> >=20
> > I don't think this particular test is a failure point.
> >
>=20
> With the way it is written, the individual BDS is blocked with the
> same reason pointer, but not necessarily the whole chain - unless you
> make the assumption that blockers will only be used via the recursive
> interface, and never individually on a node.

there is no more a no recursive version with this patch so this assumptio=
n
will be respected.

>=20
> The caller doesn't have an interface to check that the chain is not
> blocked - bdrv_op_is_blocked_by() doesn't operate recursively. =20
>=20
> If we tried to block a chain that has some portion already blocked for
> the same reason, shouldn't that be an error?

Why should we allow this ?
My understanding is that blocking something should be associated to a
single operation whatever they are.
So one operation to block implying one different reason is not so strange=
.

> > > > +
> > > > +    /* block first for recursion loop protection to work */
> > > > +    bdrv_do_op_block(bs, op, reason);
> > > > +
> > > > +    bdrv_op_block(bs->file, base, op, reason);
> > > > +
> > > > +    if (bs->drv && bs->drv->supports_backing) {
> > > > +        bdrv_op_block(bs->backing_hd, base, op, reason);
> > > > +    }
> > > > +
> > > > +    if (bs->drv && bs->drv->bdrv_op_recursive_block) {
> > > > +        bs->drv->bdrv_op_recursive_block(bs, base, op, reason);
> > >=20
> > > Do we need to allow .bdrv_op_recursive_block() to fail and return
> > > error (and handle it, of course)?
> >=20
> > I don't know yet: but lets let this question float a little more in t=
he mail thread.
> >=20
> > >=20
> > >=20
> > >=20


> >=20
> > To reach this state the caller code would have to do the following tw=
isted sequence.
> >=20
> > block(image3, with_reason1)
> > unblock(image2, with_reason1)
> > block(image1, with_reason1)
> >
> > There is no such sequence in the code thanks to the base argument and=
 we could
> > enforce that no such sequence ever get written.
> >
>=20
> If we ignore blockdev-add and scenarios where an image node may have
> multiple overlays, we might be able to assume that such a sequence
> could not exist.
>=20
> But in that case, should this negative check result in an error?
>=20
> It would seem at this point we would have encountered one of these
> scenarios:
>=20
> 1.) An invalid block/unblock state in the chain, if we assume that no
> such sequence should exist
>=20
> 2.) We tried to unblock more than we originally blocked

>=20
> > >=20
> > > I would assume that bdrv_op_unblock(image2, NULL, reason) would sti=
ll
> > > unblock image1, even if image2 was unblocked.
> >=20
> > Should we also assume that bdrv_op_unblock(image4, NULL, reason) with=
 image4 being
> > image3 parent unblock everything underneath ?
> >=20
>=20
> I think we either do that, or return an error.  But to have
> bdrv_op_unblock() (or bdrv_op_block()) silently stop at some point in
> the chain prior to reaching 'base' doesn't seem correct to me.
>=20

Maybe you are right.
I don't mind rewriting the patchset with error handling and without the r=
ecursion
loop avoidance code given I find Fam's loop breaking patches on the list.

I remember trying to write loop breaking by myself just before 2.1 and it=
 was
annoying.

Best regards

Beno=EEt