qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Block layer meeting notes from June 10/11
@ 2014-06-13 13:28 Stefan Hajnoczi
  2014-06-13 13:40 ` Jeff Cody
  2014-06-13 13:53 ` Jeff Cody
  0 siblings, 2 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2014-06-13 13:28 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Benoît Canet, Fam Zheng, Jeff Cody,
	Markus Armbruster, Max Reitz

Kevin, Markus, Benoit, and I recently had the opportunity to meet
face-to-face to discuss ongoing design challenges in the QEMU block
layer.  I am sharing the meeting notes in this email.  Questions and
comments welcome!

These notes are somewhat sparse.  If you want full details, please add
an agenda item to the QEMU Community Call so we can have a live
discussion during the next call.  Or just reply to this email thread.

The general theme is that the "node name" concept is being introduced
into the QMP commands.  Node names allow the client to operate on
specific nodes in a BlockDriverState graph, not just the "drive" or
root node.
This is much more powerful but also adds complexity.  We need to make
sure the operations are safe, make sense, and do not destroy data.

1. I/O throttling groups (multiple disks sharing I/O throttling quota)
 * BDSes already have multiple children, do they need multiple parents
as well? (And be able to distinguish them)

Action:
 * Add bpsgroup=<name> property to -drive [Benoit]
 * First drive using bpsgroup= can set the initial group bps/iops values
 * Successive drives will get EBUSY if they try to set group bps/iops values
 * throttle_set_limits on drive0 linked to a group will update the
group bps/iops values
 * When last drive using group is deleted, the group is destroyed too
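
The group lifecycle in the action items above can be modelled roughly as
follows.  This is only a toy sketch in Python, not QEMU code: the class
and method names (ThrottleGroups, add_drive, set_limits, del_drive) are
made up for illustration, and limits are plain dicts.

```python
import errno


class ThrottleGroups:
    """Toy registry of named throttling groups shared by several drives."""

    def __init__(self):
        self.limits = {}    # group name -> dict of bps/iops limits
        self.members = {}   # group name -> set of drive ids
        self.group_of = {}  # drive id -> group name

    def add_drive(self, drive, group, limits=None):
        """Attach a drive; only the first drive may set initial limits."""
        if group in self.members:
            if limits is not None:
                # Successive drives may not set the group values.
                raise OSError(errno.EBUSY, "group limits already set")
        else:
            # First drive creates the group and sets its initial values.
            self.members[group] = set()
            self.limits[group] = dict(limits or {})
        self.members[group].add(drive)
        self.group_of[drive] = group

    def set_limits(self, drive, limits):
        """A runtime update through any member changes the whole group."""
        self.limits[self.group_of[drive]] = dict(limits)

    def del_drive(self, drive):
        """Detach a drive; destroy the group when its last member goes."""
        group = self.group_of.pop(drive)
        self.members[group].discard(drive)
        if not self.members[group]:
            del self.members[group]
            del self.limits[group]
```

In this model the EBUSY case only covers setting values while joining an
existing group; updating limits later through a member drive is allowed
and affects every drive in the group.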

2. Dynamic graph reconfiguration (e.g. adding block filters, taking
snapshots, etc.)
 * Where does the new node get inserted and how to specify how it is
linked up with the existing nodes?
  * On a given "arrow" between two nodes (only works with 1 child, 1 parent)
  * On a given set of arrows (possibly more complex than what is really needed?)
 * How does removing a node work with more than one child of the deleted node?
 * Keep using the existing QMP command for I/O throttling for now,
until we understand the general problem reasonably well

Action:
 * Figure out the general problem
 * Split I/O throttling off into own BDS [Benoît]
  * Requires some care with snapshots etc.
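
To make the simple "one arrow" case concrete, here is a toy Python model
of inserting a node on the arrow between a parent and its single child,
and of removing it again.  It is a sketch under that 1 child / 1 parent
assumption only; Node, insert_on_arrow and remove_below are hypothetical
names, and nothing here addresses the open multi-child questions.

```python
class Node:
    """Toy graph node with at most one child (the single-arrow case)."""

    def __init__(self, name, child=None):
        self.name = name
        self.child = child


def insert_on_arrow(parent, new_node):
    """Insert new_node on the arrow parent -> parent.child."""
    if new_node.child is not None:
        raise ValueError("new node must not already have a child")
    new_node.child = parent.child
    parent.child = new_node


def remove_below(parent):
    """Remove parent's child and relink parent to the grandchild."""
    removed = parent.child
    if removed is None:
        raise ValueError("parent has no child to remove")
    parent.child = removed.child
    removed.child = None
    return removed


def chain(node):
    """Return the node names from node down to the leaf."""
    names = []
    while node is not None:
        names.append(node.name)
        node = node.child
    return names
```

With more than one child the relinking step in remove_below is no longer
well defined, which is exactly the question left open above.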

3. Proper specification for blockdev-add
 * What (magic) does -drive add that blockdev-add is missing?
 * Filename parsing and protocol detection
 * Format probing
 * Desugaring -drive
 * What does BDRV_O_PROTOCOL mean?
  * Disable format probing
  * Parse protocol name from filename (but not from options QDict)
  * Put filename then into options QDict
  * Set bs->growable
  * Disable adding of a bs->file layer
  * Ignore BDRV_O_SNAPSHOT
  * Which callers need which of these properties?
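
The "parse protocol name from filename, then put the filename into the
options QDict" steps could look roughly like this.  This is a simplified
Python sketch, not the real bdrv_open() logic: the "a colon before any
slash marks a protocol prefix" rule, the "file" default, the "driver"
key, and both function names are assumptions made for illustration.

```python
def parse_protocol(filename):
    """Return the protocol prefix of filename, or "file" if there is none.

    Assumed rule: a "proto:" prefix counts only if the colon appears
    before any slash, so "/images/a:b.img" is a plain file.
    """
    colon = filename.find(":")
    slash = filename.find("/")
    if colon > 0 and (slash == -1 or colon < slash):
        return filename[:colon]
    return "file"


def desugar_filename(filename, options):
    """Translate a legacy filename into an options dict (QDict stand-in)."""
    opts = dict(options)
    if "filename" in opts:
        raise ValueError("filename given both directly and in options")
    # The protocol is taken from the filename only, never from options.
    # Assumption: an explicitly requested driver is left untouched.
    opts.setdefault("driver", parse_protocol(filename))
    opts["filename"] = filename
    return opts
```

For example, desugar_filename("nbd:localhost:10809", {}) yields an
options dict with driver "nbd" and the unmodified filename.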

Action:
 * Convert network block drivers to QDict options (keep legacy
filename parsing for compatibility) [volunteer?]
 * Add network block drivers to blockdev-add [volunteer?]
 * Translate bdrv_open() arguments into options qdict, if appropriate [Kevin]
  * Translate legacy "filename" to qdict
 * Specify bdrv_open() behavior (especially magic) [Kevin]

4. BDS graph rules and manipulating arbitrary nodes
 * A proper design: iterate children, safely manipulate graph

Action:
 * Get rid of bdrv_swap() and update child/parent pointers instead
(depends on BlockBackend) [Markus?]
 * Add notifier list to BDS so users can get updated when pointer changes:
bdrv_register_bs_pointer(bs, &mystruct->bs) /* automatically refresh pointer */
 * Add parent and child lists to BlockDriverState (could be realloc
array or just a function interface that operates on
->file/->backing_hd) [nice to have]
  * Audit drivers
   * Especially VMDK and quorum
   * Make them use generic child interface
   * Use child list where generic block layer currently hardcodes
->backing_hd and ->file
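
A "function interface that operates on ->file/->backing_hd" might be as
small as the following.  This is a toy Python sketch: the BDS class and
the children()/walk() names are hypothetical, and drivers with other
children (VMDK extents, quorum) would need to report those as well.

```python
class BDS:
    """Toy stand-in for BlockDriverState with the two hardcoded links."""

    def __init__(self, name, file=None, backing_hd=None):
        self.name = name
        self.file = file
        self.backing_hd = backing_hd

    def children(self):
        """Yield (role, child) for every child that is present."""
        for role in ("file", "backing_hd"):
            child = getattr(self, role)
            if child is not None:
                yield role, child


def walk(bs):
    """Depth-first iteration over bs and all of its descendants."""
    yield bs
    for _role, child in bs.children():
        yield from walk(child)
```

Generic code would then loop over children() instead of touching
->backing_hd and ->file directly.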

 * Mutual exclusion of operations/background jobs (bs->in_use / BlockOpType)
  * Streaming in two different parts of the backing chain - allowed?
(Benoît thought not, but does anything break?)
  * Does streaming only require that streamed images stay read-only
(i.e. backing chain segment on which the operation is performed)
  * Live commit in the opposite direction at the same time?

Action:
 * Draw up matrix of operations (mirror, stream, resize, etc)
 * Make op blocker mechanism use matrix as data instead of code
(define an array)
 * Enforce that new QMP/QAPI commands and block jobs add themselves to
the matrix
 * Recursively add blockers to child nodes (driver method?) [Benoit]
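
As an illustration of "matrix as data instead of code", a toy Python
version follows.  The matrix entries are placeholders, not an agreed
policy (the real matrix still has to be drawn up), and all names (OPS,
BLOCKS, Node, can_start, start) are made up for this sketch.

```python
OPS = ("mirror", "stream", "commit", "resize")

# Placeholder matrix: BLOCKS[running] lists the operations that may not
# start while `running` is active.  The entries are invented.
BLOCKS = {
    "mirror": {"mirror", "resize"},
    "stream": {"stream", "commit", "resize"},
    "commit": {"stream", "commit", "resize"},
    "resize": {"mirror", "stream", "commit", "resize"},
}


def check_matrix():
    """Every operation must have a row and only name known operations."""
    if set(BLOCKS) != set(OPS):
        raise ValueError("operation missing from blocker matrix")
    for row in BLOCKS.values():
        if not row <= set(OPS):
            raise ValueError("unknown operation in blocker matrix")


check_matrix()


class Node:
    """Toy graph node that records the operations running on it."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.running = set()


def subtree(node):
    """Yield node and all of its descendants."""
    yield node
    for child in node.children:
        yield from subtree(child)


def can_start(node, op):
    """True if nothing running in node's subtree blocks op."""
    return all(op not in BLOCKS[r] for n in subtree(node) for r in n.running)


def start(node, op):
    """Start op on node and recursively mark it on all child nodes."""
    if not can_start(node, op):
        raise RuntimeError("%s is blocked on %s" % (op, node.name))
    for n in subtree(node):
        n.running.add(op)
```

Because check_matrix() runs at import time, adding a new operation to
OPS without a matrix row fails immediately, which is one way to enforce
that new commands and jobs add themselves to the matrix.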

 * Arbitrary nodes
  * drive-mirror of arbitrary node
  * block-stream of arbitrary node
  * Jeff Cody's block-commit of arbitrary node patch series

Action:
 * Add base-nodename argument to block-stream command [Benoit]
 * Add top-nodename argument to block-stream command [Benoit]
  * If command can modify part of a backing chain, need to add option
to update the parent's backing filename field on disk!
  * Add optional backing-filename argument (since libvirt may use fd
passing and QEMU's filename is useless)
  * Add boolean whether to update backing file (for users who don't
need to override backing filename)
 * drive-mirror (block-mirror) of arbitrary node [Benoit]
 * Deprecate filename references in QMP commands in favour of node
names (e.g. streaming base) [Jeff?]

5. BlockBackend split
Action:
 * Split BlockBackend from BlockDriverState [Markus]
 * bdrv_new+bdrv_open and bdrv_close+bdrv_unref should be the same,
eliminate ENOMEDIUM semantics
 * Make block driver private embedded in BlockDriverState instead of
opaque pointer

6. Dataplane programming model recap
 * What do block drivers need to be careful of?
 * Any comments on new docs/dataplane.txt documentation?

Action:
 * AioContext assertions to prevent callbacks in wrong event loop [Stefan]


* Re: [Qemu-devel] Block layer meeting notes from June 10/11
  2014-06-13 13:28 [Qemu-devel] Block layer meeting notes from June 10/11 Stefan Hajnoczi
@ 2014-06-13 13:40 ` Jeff Cody
  2014-06-13 13:53 ` Jeff Cody
  1 sibling, 0 replies; 5+ messages in thread
From: Jeff Cody @ 2014-06-13 13:40 UTC
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Benoît Canet, Fam Zheng, qemu-devel,
	Markus Armbruster, Max Reitz

On Fri, Jun 13, 2014 at 09:28:28PM +0800, Stefan Hajnoczi wrote:

<snip>

>  * Arbitrary nodes
>   * drive-mirror of arbitrary node
>   * block-stream of arbitrary node
>   * Jeff Cody's block-commit of arbitrary node patch series
> 
> Action:
>  * Add base-nodename argument to block-stream command [Benoit]
>  * Add top-nodename argument to block-stream command [Benoit]
>   * If command can modify part of a backing chain, need to add option
> to update the parent's backing filename field on disk!
>   * Add optional backing-filename argument (since libvirt may use fd
> passing and QEMU's filename is useless)
>   * Add boolean whether to update backing file (for users who don't
> need to override backing filename)
>  * drive-mirror (block-mirror) of arbitrary node [Benoit]
>  * Deprecate filename references in QMP commands in favour of node
> names (e.g. streaming base) [Jeff?]

Please see my node-name patch series - it addresses block-stream in
addition to block-commit, and also allows updating the parent's
backing filenames as well.  Libvirt and OpenStack have been testing
their relative pathname functionality (including gluster protocol
relative pathnames) with these patches.

-Jeff


* Re: [Qemu-devel] Block layer meeting notes from June 10/11
  2014-06-13 13:28 [Qemu-devel] Block layer meeting notes from June 10/11 Stefan Hajnoczi
  2014-06-13 13:40 ` Jeff Cody
@ 2014-06-13 13:53 ` Jeff Cody
  2014-06-13 14:17   ` Benoît Canet
  1 sibling, 1 reply; 5+ messages in thread
From: Jeff Cody @ 2014-06-13 13:53 UTC
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Benoît Canet, Fam Zheng, qemu-devel,
	Markus Armbruster, Max Reitz

On Fri, Jun 13, 2014 at 09:28:28PM +0800, Stefan Hajnoczi wrote:
>

<snip>

>  * Mutual exclusion of operations/background jobs (bs->in_use / BlockOpType)
>   * Streaming in two different parts of the backing chain - allowed?
> (Benoît thought not, but does anything break?)
>   * Does streaming only require that streamed images stay read-only
> (i.e. backing chain segment on which the operation is performed)
>   * Live commit in the opposite direction at the same time?
> 
> Action:
>  * Draw up matrix of operations (mirror, stream, resize, etc)
>  * Make op blocker mechanism use matrix as data instead of code
> (define an array)
>  * Enforce that new QMP/QAPI commands and block jobs add themselves to
> the matrix
>  * Recursively add blockers to child nodes (driver method?) [Benoit]
> 

Benoit, you have quite a few items on your list - would it be useful
if I worked on this?  It would dovetail nicely with the node-name
commit/stream patches.

Thanks,
Jeff


* Re: [Qemu-devel] Block layer meeting notes from June 10/11
  2014-06-13 13:53 ` Jeff Cody
@ 2014-06-13 14:17   ` Benoît Canet
  2014-06-13 14:49     ` Benoît Canet
  0 siblings, 1 reply; 5+ messages in thread
From: Benoît Canet @ 2014-06-13 14:17 UTC
  To: Jeff Cody
  Cc: Kevin Wolf, Benoît Canet, Fam Zheng, Stefan Hajnoczi,
	qemu-devel, Markus Armbruster, Max Reitz

On Friday 13 Jun 2014 at 09:53:55 (-0400), Jeff Cody wrote:
> On Fri, Jun 13, 2014 at 09:28:28PM +0800, Stefan Hajnoczi wrote:
> >
> 
> <snip>
> 
> >  * Mutual exclusion of operations/background jobs (bs->in_use / BlockOpType)
> >   * Streaming in two different parts of the backing chain - allowed?
> > (Benoît thought not, but does anything break?)
> >   * Does streaming only require that streamed images stay read-only
> > (i.e. backing chain segment on which the operation is performed)
> >   * Live commit in the opposite direction at the same time?
> > 
> > Action:
> >  * Draw up matrix of operations (mirror, stream, resize, etc)
> >  * Make op blocker mechanism use matrix as data instead of code
> > (define an array)
> >  * Enforce that new QMP/QAPI commands and block jobs add themselves to
> > the matrix
> >  * Recursively add blockers to child nodes (driver method?) [Benoit]
> > 
> 
> Benoit, you have quite a few items on your list - would it be useful
> if I worked on this?  It would dovetail nicely with the node-name
> commit/stream patches.

OK, as you wish.  Tell me what you want to take and I'll do the rest.

Best regards

Benoît

> 
> Thanks,
> Jeff
> 


* Re: [Qemu-devel] Block layer meeting notes from June 10/11
  2014-06-13 14:17   ` Benoît Canet
@ 2014-06-13 14:49     ` Benoît Canet
  0 siblings, 0 replies; 5+ messages in thread
From: Benoît Canet @ 2014-06-13 14:49 UTC
  To: Benoît Canet
  Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi, Jeff Cody, qemu-devel,
	Markus Armbruster, Max Reitz

On Friday 13 Jun 2014 at 16:17:34 (+0200), Benoît Canet wrote:
> On Friday 13 Jun 2014 at 09:53:55 (-0400), Jeff Cody wrote:
> > On Fri, Jun 13, 2014 at 09:28:28PM +0800, Stefan Hajnoczi wrote:
> > >
> > 
> > <snip>
> > 
> > >  * Mutual exclusion of operations/background jobs (bs->in_use / BlockOpType)
> > >   * Streaming in two different parts of the backing chain - allowed?
> > > (Benoît thought not, but does anything break?)
> > >   * Does streaming only require that streamed images stay read-only
> > > (i.e. backing chain segment on which the operation is performed)
> > >   * Live commit in the opposite direction at the same time?
> > > 
> > > Action:
> > >  * Draw up matrix of operations (mirror, stream, resize, etc)
> > >  * Make op blocker mechanism use matrix as data instead of code
> > > (define an array)
> > >  * Enforce that new QMP/QAPI commands and block jobs add themselves to
> > > the matrix
> > >  * Recursively add blockers to child nodes (driver method?) [Benoit]
> > > 
> > 
> > Benoit, you have quite a few items on your list - would it be useful
> > if I worked on this?  It would dovetail nicely with the node-name
> > commit/stream patches.

Still I need to do some work in order to bill my customer ;)

> 
> OK, as you wish.  Tell me what you want to take and I'll do the rest.
> 
> Best regards
> 
> Benoît
> 
> > 
> > Thanks,
> > Jeff
> > 
