qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: qemu block <qemu-block@nongnu.org>,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Dong Eddie <eddie.dong@intel.com>,
	qemu devel <qemu-devel@nongnu.org>, Max Reitz <mreitz@redhat.com>,
	Gonglei <arei.gonglei@huawei.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Yang Hongyang <yanghy@cn.fujitsu.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] Dynamic reconfiguration (was: qmp: add monitor command to add/remove a child)
Date: Thu, 8 Oct 2015 13:10:09 +0200	[thread overview]
Message-ID: <20151008111009.GE5379@noname.redhat.com> (raw)
In-Reply-To: <20151008110214.GD5379@noname.redhat.com>

Am 08.10.2015 um 13:02 hat Kevin Wolf geschrieben:
> Am 08.10.2015 um 08:15 hat Markus Armbruster geschrieben:
> > Max Reitz <mreitz@redhat.com> writes:
> > > E.g. you may have a block filter in the future where you want to
> > > exchange its child BDS. This exchange should be an atomic operation, so
> > > we cannot use this interface there anyway. For quorum, such an exchange
> > > does not need to be atomic, since you can just add the new child first
> > > and remove the old one afterwards.
> > >
> > > So maybe in the future we get some block driver other than quorum for
> > > which adding and removing children (as opposed to atomically exchanging
> > > them) makes sense, but for now I can only see quorum. Therefore, that
> > > this works for quorum only is in my opinion not a reason to make it
> > > experimental. I think we actually want to keep it that way.
> > 
> > Are you telling us the existing interface is insufficiently general?
> > That the general interface neeeds to support atomic replacement?
> > 
> > If yes, why isn't the general interface is what we should do for quorum?
> > Delete is atomic replacement by nothing, add is atomic replacement of
> > nothing.
> 
> The general thing is what we used to call "dynamic reconfiguration". If
> we want a single command to enable it (which would be great if we
> could), we need to some more design work first. Atomic replacement might
> be the operation we're looking for, but we have to be sure.
> 
> So far we haven't thought about dynamic reconfiguation enough that we
> would know the right solution, but enough that we know it's hard. That
> would be an argument for me that makes adding an x-* command now
> acceptable. On the other hand, the fact that we want a single command in
> the end makes me want to keep it experimental.
> 
> 
> What types of dynamic reconfiguration do we need to support? I'll start
> with a small list, feel free to extend it:
> 
> * Replace a child node with another node. This works pretty much
>   everywhere in the tree - including the root, i.e. BlockBackend!
>   Working just on BDSes doesn't seem to be enough.
> 
> * Adding a child to a node that can take additional children (e.g.
>   quorum can take an arbitrary number; but also COW image formats have
>   an option child, so you could add a backing file to a node originally
>   opened with BDRV_O_NO_BACKING)
> 
>   Same as atomically replacing nothing by a node.
> 
> * Removing an optional child from a node that remains valid with that
>   child removed. The same examples apply.
> 
>   Same as atomically replacing a child by nothing.
> 
> * Add a new node between two existing nodes. An example is taking a live
>   snapshot, which inserts a new node between the BlockBackend and the
>   first BDS. Or it could be used to insert a filter somewhere in the
>   graph.
> 
>   Same as creating the new node pointing to node B (or dynamically
>   adding it) and then atomically replacing the reference of node A that
>   pointed to B with a reference to the new node.
> 
> * Add a new node between multiple existing nodes. This is one of the
>   examples we always used to discuss with dynamic reconfiguration:
> 
>       base <- sn1 <- sn2 <--+-- virtio-blk
>                             |
>                             +-- NBD server
> 
>   Adding a filter could result in this:
> 
>         base <- sn1 <- sn2 <- throttling <--+-- virtio-blk
>                                           |
>                                           +-- NBD server
> 
>   Or this:
> 
>         base <- sn1 <- sn2 <--+-- throttling <- virtio-blk
>                               |
>                               +-- NBD server
> 
>   Or this:
> 
>         base <- sn1 <- sn2 <--+-- virtio-blk
>                               |
>                               +-- throttling <- NBD server
> 
>   All of these are different kinds of "adding a filter", and all of
>   them are valid operations that a user could want to perform.
> 
>   Case 2 and 3 are really just "add a new node between two existing
>   nodes", as explained above.
> 
>   Case 1 is new: We still create the throttling node so that it already
>   points to sn2, but now we have to atomically change the children of
>   two things (the BlockBackends of virtio-blk and the NBD server). Not a
>   problem per se because we can just do that, but it raises the question
>   whether the atomic replacement operation needs to be transactionable.
> 
> * Remove a node between two (or more) other nodes.
> 
>   Same as atomically replacing a child by a grandchild. For more than
>   two involved nodes, again a transactional version might be needed.
> 
> So at the first sight, this operation seems to work as the general
> interface for dynamic reconfiguration.

And actually, after re-reading the other branch of this email thread
where I replied to Berto that he wants to use bdrv_reopen() to change
the voting threshold for quorum, it occurred to me that bdrv_reopen()
should possibly also be this atomatic replacement function.

After all, the references to other nodes are blockdev-add options, and
if we implement bdrv_reopen() fully so that it can change any options
that can be given to blockdev-add, that means that we must be able to
replace children.

Kevin

> One problem we discussed earlier that I'm not sure whether it's related
> is filter nodes inserted automatically once we change e.g. the I/O
> throttling QMP commands to add a throttling filter BDS to the graph.
> 
> If the user creates nodes A and B, but sets throttling options, they
> might end up with a chain like this:
> 
>     A <- throttling <- B
> 
> Now imagine that they want to add another filter between A and B, let's
> say blkdebug. They would need to know that they have to insert the new
> node between A and throttling or B and throttling, but not between A and
> B. If they tried to insert it between A and B, the algorithm above says
> that they would let blkdebug point to A, and replace B's child with
> blkdebug, the resulting tree wouldn't be throttled any more!
> 
>     A <- blkdebug <- B
> 
>          throttling (ref = 0 -> delete)
> 
> Even if they knew that they have to consider the throttling node, it
> currently wouldn't have a node-name, and with Jeff's autogenerated names
> it wouldn't be predictable.
> 
> Maybe the dynamic reconfiguration interface does need to be a bit
> cleverer.
> 
> 
> Anyway, after writing all of this, I'm almost convinced now that an
> experimental interface is the right thing to do in this patch series.
> 
> Kevin
> 

  reply	other threads:[~2015-10-08 11:10 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-22  7:44 [Qemu-devel] [PATCH v5 0/4] qapi: child add/delete support Wen Congyang
2015-09-22  7:44 ` [Qemu-devel] [PATCH v5 1/4] Add new block driver interface to add/delete a BDS's child Wen Congyang
2015-10-07 13:35   ` Alberto Garcia
2015-10-08  2:05     ` Wen Congyang
2015-10-07 18:33   ` [Qemu-devel] [Qemu-block] " Max Reitz
2015-10-08  2:06     ` Wen Congyang
2015-10-07 19:00   ` [Qemu-devel] " Dr. David Alan Gilbert
2015-10-08  2:03     ` Wen Congyang
2015-10-08 18:44       ` Dr. David Alan Gilbert
2015-09-22  7:44 ` [Qemu-devel] [PATCH v5 2/4] quorum: implement bdrv_add_child() and bdrv_del_child() Wen Congyang
2015-10-07 14:12   ` Alberto Garcia
2015-10-08  2:10     ` Wen Congyang
2015-10-07 18:51   ` [Qemu-devel] [Qemu-block] " Max Reitz
2015-10-08  8:12     ` Alberto Garcia
2015-10-09 15:51       ` Max Reitz
2015-10-12 11:56         ` Alberto Garcia
2015-09-22  7:44 ` [Qemu-devel] [PATCH v5 3/4] qmp: add monitor command to add/remove a child Wen Congyang
2015-10-07 14:33   ` Alberto Garcia
2015-10-07 19:42   ` [Qemu-devel] [Qemu-block] " Max Reitz
2015-10-08  6:15     ` Markus Armbruster
2015-10-08  8:29       ` Alberto Garcia
2015-10-08 10:03         ` Kevin Wolf
2015-10-08 10:13           ` Alberto Garcia
2015-10-09 16:14         ` Max Reitz
2015-10-08 11:02       ` [Qemu-devel] Dynamic reconfiguration (was: qmp: add monitor command to add/remove a child) Kevin Wolf
2015-10-08 11:10         ` Kevin Wolf [this message]
2015-10-21  8:27           ` [Qemu-devel] [Qemu-block] Dynamic reconfiguration Markus Armbruster
2015-10-26  2:04             ` Wen Congyang
2015-10-26  7:24               ` Markus Armbruster
2015-10-26  7:25                 ` Wen Congyang
2015-10-09 16:13       ` [Qemu-devel] [Qemu-block] [PATCH v5 3/4] qmp: add monitor command to add/remove a child Max Reitz
2015-10-09 16:42         ` Dr. David Alan Gilbert
2015-10-09 18:24           ` Max Reitz
2015-10-12  8:07             ` Dr. David Alan Gilbert
2015-10-12  8:18             ` Kevin Wolf
2015-10-12  7:58           ` Markus Armbruster
2015-10-12  7:56         ` Markus Armbruster
2015-10-12 16:27     ` Max Reitz
2015-09-22  7:44 ` [Qemu-devel] [PATCH v5 4/4] hmp: " Wen Congyang
2015-10-07 14:38   ` Alberto Garcia
2015-09-22 11:15 ` [Qemu-devel] [PATCH v5 0/4] qapi: child add/delete support Dr. David Alan Gilbert
2015-09-23  1:08   ` Wen Congyang
2015-09-23  9:21     ` Dr. David Alan Gilbert
2015-09-23  9:30       ` Wen Congyang
2015-10-07  6:40 ` Wen Congyang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151008111009.GE5379@noname.redhat.com \
    --to=kwolf@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=armbru@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eddie.dong@intel.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=yanghy@cn.fujitsu.com \
    --cc=yunhong.jiang@intel.com \
    --cc=zhang.zhanghailiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).