From: Kevin Wolf <kwolf@redhat.com>
To: Peter Krempa <pkrempa@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
John Snow <jsnow@redhat.com>, qemu-devel <qemu-devel@nongnu.org>,
Qemu-block <qemu-block@nongnu.org>, Max Reitz <mreitz@redhat.com>
Subject: Re: bitmap migration bug with -drive while block mirror runs
Date: Wed, 2 Oct 2019 13:11:47 +0200
Message-ID: <20191002111147.GB5819@localhost.localdomain>
In-Reply-To: <20191002104600.GC6129@angien.pipo.sk>
On 02.10.2019 at 12:46, Peter Krempa wrote:
> On Tue, Oct 01, 2019 at 12:07:54 -0400, John Snow wrote:
> >
> >
> > On 10/1/19 11:57 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > 01.10.2019 17:10, John Snow wrote:
> > >>
> > >>
> > >> On 10/1/19 10:00 AM, Vladimir Sementsov-Ogievskiy wrote:
> > >>>> Otherwise: I have a lot of cloudy ideas on how to solve this, but
> > >>>> ultimately what we want is to be able to find the "addressable" name for
> > >>>> the node the bitmap is attached to, which would be the name of the first
> > >>>> ancestor node that isn't a filter. (OR, the name of the block-backend
> > >>>> above that node.)
> > >>> Not the name of ancestor node, it will break mapping: it must be name of the
> > >>> node itself or name of parent (may be through several filters) block-backend
> > >>>
> > >>
> > >> Ah, you are right of course -- because block-backends are the only
> > >> "nodes" for which we actually descend the graph and add the bitmap to
> > >> its child.
> > >>
> > >> So the real back-resolution mechanism is:
> > >>
> >
> > Amendment:
> > - If our local node-name N is well-formed, use this.
>
> I'd like to re-iterate that the necessity to keep node names same on
> both sides of migration is unexpected, undocumented and in some cases
> impossible.
I think the (implicitly made) requirement is not that all node-names are
kept the same, but only the node-names of those nodes for which
migration transfers some state.
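
As a rough illustration of that narrower requirement, here is a minimal sketch in Python. It is not QEMU code: the function name and the dict layout (node-name mapped to a list of bitmap names) are made up for this example. It only shows that nodes without migrated state drop out of the set of names that would have to match.

```python
from typing import Dict, List, Set


def names_that_must_match(bitmaps_by_node: Dict[str, List[str]]) -> Set[str]:
    """Return the node-names that would have to be identical on both sides.

    Hypothetical helper: a node only matters if migration transfers some
    state for it, which in this sketch means it has at least one bitmap.
    """
    return {node for node, bitmaps in bitmaps_by_node.items() if bitmaps}


# Example graph: only "top" carries a bitmap, so only its name is constrained.
example = {"top": ["bitmap0"], "backing": [], "throttle-filter": []}
required = names_that_must_match(example)
```
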
It seems to me that bitmap migration is the first case of putting
something in the migration stream that isn't related to a frontend, but
to the backend, so the usual device hierarchy to address information
doesn't work here. And it seems the implications of this weren't really
considered sufficiently, resulting in the design problem we're
discussing now.
What we need to transfer is dirty bitmaps, which can be attached to any
node in the block graph. If we accept that the way to transfer this is
the migration stream, we need a way to tell which bitmap belongs to
which node. Matching node-name is the obvious answer, just like a
matching device tree hierarchy is used for frontends.
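
To make the matching step concrete, a small sketch of what "matching node-name" could look like on the destination. This is an illustration under assumptions, not the actual migration code: the tuple format of the incoming stream and the function name are invented here.

```python
from typing import Dict, List, Tuple


def attach_incoming_bitmaps(
    incoming: List[Tuple[str, str]],
    dest_nodes: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Attach each (node_name, bitmap_name) pair to the same-named node.

    dest_nodes maps destination node-names to their bitmap lists and is
    updated in place. Pairs whose node-name does not exist on the
    destination are returned so the caller can decide how to fail.
    """
    unmatched = []
    for node_name, bitmap_name in incoming:
        if node_name in dest_nodes:
            dest_nodes[node_name].append(bitmap_name)
        else:
            # No node of that name on the destination: nothing to attach to.
            unmatched.append((node_name, bitmap_name))
    return unmatched


dest = {"disk0": [], "disk1": []}
leftover = attach_incoming_bitmaps([("disk0", "b0"), ("gone", "b1")], dest)
```

The second pair is the problematic case from this thread: the source used a node-name that the destination does not know.
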
If we don't want to use the migration stream for backends, we would need
to find another way to transfer the bitmaps. I would welcome removing
backend data from the migration stream, but if this includes
non-persistent bitmaps, I don't see what the alternative could be.
> If you want to mandate that they must be kept the same please document
> it and also note the following:
>
> - during migrations the storage layout may change e.g. a backing chain
> may become flattened, thus keeping node names stable beyond the top
> layer is impossible
You don't want to transfer bitmaps of nodes that you're going to drop.
I'm not an expert on these bitmaps, but I think this just means you
would have to disable any bitmaps on the backing files to be dropped on
the source host before you migrate.
> - in some cases (readonly image in a cdrom not present on destination,
> thus not relevant here probably) it may even become impossible to
> create any node thus keeping the top node may be impossible
Same thing, you don't want to transfer a bitmap for a node that
disappears.
> - it should be documented when and why this happens and how management
> tools are supposed to do it
>
> - please let me know what's actually expected, since libvirt
> didn't enable blockdev yet we can fix any unexpected expectations
>
> - Document it so that the expectations don't change after this.
Yes, we need a good and ideally future-proof rule for which node-names
need to stay the same. Currently it's only bitmaps, but might we get
another feature later where we want to transfer more backend data?
> - Ideally node names will not be bound to anything and freely
> changeable. If necessary we can provide a map to qemu during migration
> which is probably less painful and more straightforward than keeping
> them in sync somehow ...
A map feels painful for the average user (and for the QEMU
implementation), even if it looks convenient for libvirt. If anything,
I'd make it optional and default to 1:1 mappings for anything that isn't
explicitly mapped.
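
A sketch of that default, assuming a plain dict as the optional map (no such interface exists in this discussion yet; the names are hypothetical):

```python
from typing import Dict, Optional


def resolve_node_name(
    source_name: str,
    name_map: Optional[Dict[str, str]] = None,
) -> str:
    """Translate a source node-name to the destination node-name.

    If no map is given, or the name is not listed in it, fall back to
    the 1:1 mapping, i.e. the same name on both sides.
    """
    if name_map is None:
        return source_name
    return name_map.get(source_name, source_name)


# A management tool maps one node explicitly; everything else stays 1:1.
mapping = {"libvirt-1-format": "libvirt-3-format"}
mapped = resolve_node_name("libvirt-1-format", mapping)
unmapped = resolve_node_name("disk0", mapping)
```

Users who keep node-names in sync never have to pass a map at all.
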
Kevin