From: Fam Zheng <famz@redhat.com>
To: Markus Armbruster <armbru@redhat.com>, jcody@redhat.com
Cc: kwolf@redhat.com, benoit.canet@irqsave.net,
qemu-devel@nongnu.org, rjones@redhat.com, ptoscano@redhat.com,
imain@redhat.com, stefanha@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH v17 08/14] block: Support dropping active in bdrv_drop_intermediate
Date: Tue, 8 Apr 2014 17:07:38 +0800 [thread overview]
Message-ID: <20140408090711.GE11793@T430.redhat.com> (raw)
In-Reply-To: <87a9bw1blj.fsf@blackfin.pond.sub.org>
On Tue, 04/08 10:15, Markus Armbruster wrote:
> Jeff Cody <jcody@redhat.com> writes:
>
> > On Mon, Mar 10, 2014 at 03:26:04PM +0800, Fam Zheng wrote:
> >> Dropping intermediate could be useful both for commit and stream, and
> >> BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs
> >> to work with op blockers.
> >>
> >> Signed-off-by: Fam Zheng <famz@redhat.com>
> >> ---
> >> block.c | 139 ++++++++++++++++++++++++++++-----------------------------
> >> block/commit.c | 2 +-
> >> 2 files changed, 70 insertions(+), 71 deletions(-)
> >>
> >> diff --git a/block.c b/block.c
> >> index 05f7766..0af7c62 100644
> >> --- a/block.c
> >> +++ b/block.c
> >> @@ -2503,115 +2503,114 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
> >> return overlay;
> >> }
> >>
> >> -typedef struct BlkIntermediateStates {
> >> - BlockDriverState *bs;
> >> - QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
> >> -} BlkIntermediateStates;
> >> -
> >> -
> >> /*
> >> - * Drops images above 'base' up to and including 'top', and sets the image
> >> - * above 'top' to have base as its backing file.
> >> + * Drops images above 'base' up to and including 'top', and sets new 'base' as
> >> + * backing_hd of top's overlay (the image orignally has 'top' as backing file).
> >> + * top's overlay may be NULL if 'top' is active, no such update needed.
> >> + * Requires that the top's overlay to 'top' is opened r/w.
> >> + *
> >> + * 1) This will convert the following chain:
> >> + *
> >> + * ... <- base <- ... <- top <- overlay <-... <- active
> >> *
> >> - * Requires that the overlay to 'top' is opened r/w, so that the backing file
> >> - * information in 'bs' can be properly updated.
> >> + * to
> >> + *
> >> + * ... <- base <- overlay <- active
> >> + *
> >> + * 2) It is allowed for bottom==base, in which case it converts:
> >> *
> >> - * E.g., this will convert the following chain:
> >> - * bottom <- base <- intermediate <- top <- active
> >> + * base <- ... <- top <- overlay <- ... <- active
> >> *
> >> * to
> >> *
> >> - * bottom <- base <- active
> >> + * base <- overlay <- active
> >> *
> >> - * It is allowed for bottom==base, in which case it converts:
> >> + * 2) It also allows active==top, in which case it converts:
> >> *
> >> - * base <- intermediate <- top <- active
> >> + * ... <- base <- ... <- top (active)
> >> *
> >> * to
> >> *
> >> - * base <- active
> >> + * ... <- base == active == top
> >> + *
> >> + * i.e. only base and lower remains: *top == *base when return.
> >> + *
> >> + * 3) If base==NULL, it will drop all the BDS below overlay and set its
> >> + * backing_hd to NULL. I.e.:
> >> *
> >> - * Error conditions:
> >> - * if active == top, that is considered an error
> >> + * base(NULL) <- ... <- overlay <- ... <- active
> >> + *
> >> + * to
> >> + *
> >> + * overlay <- ... <- active
> >> *
> >> */
> >> int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
> >> BlockDriverState *base)
> >> {
> >> - BlockDriverState *intermediate;
> >> - BlockDriverState *base_bs = NULL;
> >> - BlockDriverState *new_top_bs = NULL;
> >> - BlkIntermediateStates *intermediate_state, *next;
> >> - int ret = -EIO;
> >> -
> >> - QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
> >> - QSIMPLEQ_INIT(&states_to_delete);
> >> + BlockDriverState *drop_start, *overlay, *bs;
> >> + int ret = -EINVAL;
> >>
> >> - if (!top->drv || !base->drv) {
> >> + assert(active);
> >> + assert(top);
> >> + /* Verify that top is in backing chain of active */
> >> + bs = active;
> >> + while (bs && bs != top) {
> >> + bs = bs->backing_hd;
> >> + }
> >> + if (!bs) {
> >> goto exit;
> >> }
> >> + /* Verify that base is in backing chain of top */
> >> + if (base) {
> >> + while (bs && bs != base) {
> >> + bs = bs->backing_hd;
> >> + }
> >> + if (bs != base) {
> >> + goto exit;
> >> + }
> >> + }
> >>
> >> - new_top_bs = bdrv_find_overlay(active, top);
> >> -
> >> - if (new_top_bs == NULL) {
> >> - /* we could not find the image above 'top', this is an error */
> >> + if (!top->drv || (base && !base->drv)) {
> >> goto exit;
> >> }
> >> -
> >> - /* special case of new_top_bs->backing_hd already pointing to base - nothing
> >> - * to do, no intermediate images */
> >> - if (new_top_bs->backing_hd == base) {
> >> + if (top == base) {
> >> + ret = 0;
> >> + goto exit;
> >> + } else if (top == active) {
> >> + assert(base);
> >> + drop_start = active->backing_hd;
> >> + bdrv_swap(active, base);
> >
> > This will assert in block.c, in bdrv_swap, on the test for
> > anonymity of active. (For testing, I changed the active layer commit
> > in mirror to use bdrv_drop_intermediate()).
Jeff, you're right, because bdrv_swap requires first argument to be a "new"
BDS, while we are passing in an top BDS.
But what happens if we write bdrv_swap(base, active)?
> >
> > Unfortunately, there are other problems as well (anonymity could be
> > fixed by bdrv_make_anon(active)).
> >
> > Using line numbers from my version of block.c, lines 1957, 1959, and
> > 1960 will each cause an assert (these lines are all in bdrv_swap()):
> >
> > 1956: /* bs_new must be anonymous and shouldn't have anything fancy
> > enabled */
> > 1957: assert(bs_new->device_name[0] == '\0');
> > 1958: assert(QLIST_EMPTY(&bs_new->dirty_bitmaps));
> > 1959: assert(bs_new->job == NULL);
> > 1960: assert(bs_new->dev == NULL);
> >
> > Markus - on line 1960 above, is it safe to remove that check (and the
> > other check further down in bdrv_swap())?
>
> I guess you mean the one under /* Check a few fields that should remain
> attached to the device */ by "the other check".
>
> bdrv_swap() is a scary, scary function. It has a number of
> preconditions, and we've tried to make them explicit in assertions.
>
> The preconditions could be:
>
> (0) Implicit. Or less politely said: unstated / unknown.
>
> (1) Explicit, but wrong.
>
> (2) Restrictions: the swapping code doesn't cover this state, but it
> could be made to cover it, relaxing the precondition.
>
> (3) Fundamental: the precondition cannot or should not be relaxed.
>
> BDS member dev is is the back pointer for the device model property
> pointing to the device model's backend. It must stay on top, and
> bdrv_move_feature_fields() duly moves it, along with its buddies dev_ops
> and dev_opaque. Obviously, only one of bdrv_old and bdrv_new can be on
> top at the same time. bdrv_swap() currently assumes that bdrv_old is
> the top one on entry. Feels like an instance of (3).
>
> > Thinking about it more, there may be other landmines in bdrv_swap()
> > for this case; prior to this, bdrv_swap() was always called with
> > bs_new being a newly-created BDS. With the active layer BDS it almost
> > certainly is not 'new', and could have other "prohibited" fields set,
> > in addition to the above (e.g. io limits, etc.)
>
> Hairy...
>
> We talked about splitting BlockBackend off BDS with an axe. Do you
> think that would make this series less hairy?
Hopefully, because it will help us to get rid of bdrv_move_feature_fields. :)
But if we need to get rid of bdrv_swap, we need a generic "level of indirect"
between BDS and BDS user, so we don't need to worry about the validity of old
BDS pointers when we update the chain. Just like BlockBackend does at device
emulation side. That's an even bigger project.
Thanks,
Fam
next prev parent reply other threads:[~2014-04-08 9:07 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-10 7:25 [Qemu-devel] [PATCH v17 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
2014-03-10 7:25 ` [Qemu-devel] [PATCH v17 01/14] block: Add BlockOpType enum Fam Zheng
2014-04-06 23:47 ` Jeff Cody
2014-03-10 7:25 ` [Qemu-devel] [PATCH v17 02/14] block: Introduce op_blockers to BlockDriverState Fam Zheng
2014-04-06 23:49 ` Jeff Cody
2014-04-08 6:56 ` Fam Zheng
2014-03-10 7:25 ` [Qemu-devel] [PATCH v17 03/14] block: Replace in_use with operation blocker Fam Zheng
2014-04-07 0:10 ` Jeff Cody
2014-04-07 0:24 ` Jeff Cody
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 04/14] block: Move op_blocker check from block_job_create to its caller Fam Zheng
2014-04-06 23:50 ` Jeff Cody
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 05/14] block: Add bdrv_set_backing_hd() Fam Zheng
2014-04-07 0:01 ` Jeff Cody
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 06/14] block: Add backing_blocker in BlockDriverState Fam Zheng
2014-04-07 0:31 ` Jeff Cody
2014-04-08 7:37 ` Fam Zheng
2014-04-09 18:29 ` Jeff Cody
2014-04-10 2:36 ` Fam Zheng
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 07/14] block: Parse "backing" option to reference existing BDS Fam Zheng
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 08/14] block: Support dropping active in bdrv_drop_intermediate Fam Zheng
2014-04-07 18:47 ` Jeff Cody
2014-04-08 8:15 ` Markus Armbruster
2014-04-08 9:07 ` Fam Zheng [this message]
2014-04-09 18:12 ` Jeff Cody
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 09/14] stream: Use bdrv_drop_intermediate and drop close_unused_images Fam Zheng
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 10/14] qmp: Add command 'blockdev-backup' Fam Zheng
2014-04-07 21:07 ` Eric Blake
2014-04-08 7:00 ` Fam Zheng
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 11/14] block: Allow backup on referenced named BlockDriverState Fam Zheng
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 12/14] block: Add blockdev-backup to transaction Fam Zheng
2014-04-07 21:11 ` Eric Blake
2014-04-10 2:15 ` Fam Zheng
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 13/14] qemu-iotests: Test blockdev-backup in 055 Fam Zheng
2014-03-10 7:26 ` [Qemu-devel] [PATCH v17 14/14] qemu-iotests: Image fleecing test case 083 Fam Zheng
2014-04-02 6:01 ` [Qemu-devel] [PATCH v17 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
2014-04-03 1:53 ` Jeff Cody
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140408090711.GE11793@T430.redhat.com \
--to=famz@redhat.com \
--cc=armbru@redhat.com \
--cc=benoit.canet@irqsave.net \
--cc=imain@redhat.com \
--cc=jcody@redhat.com \
--cc=kwolf@redhat.com \
--cc=pbonzini@redhat.com \
--cc=ptoscano@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=rjones@redhat.com \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).