Re: [Qemu-devel] [PATCH] block: allow write-threshold on device name

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Kevin Wolf <kwolf@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: fromani@redhat.com, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] block: allow write-threshold on device name
Date: Wed, 10 Jun 2015 09:57:29 +0200	[thread overview]
Message-ID: <20150610075729.GB4899@noname.str.redhat.com> (raw)
In-Reply-To: <55776A49.3000705@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 4234 bytes --]

Am 10.06.2015 um 00:35 hat Eric Blake geschrieben:
> On 06/06/2015 07:38 PM, Eric Blake wrote:
> > Commit e2462113 allowed the ability to fire an event if a BDS
> > node exceeds a threshold during a write, but limited the option
> > to only work on node names.  For convenience, expand this to
> > allow a device name as a way to set the threshold on the BDS
> > at the active layer of the device.
> > 
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > ---
> >  block/write-threshold.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> 
> > @@ -110,9 +110,8 @@ void qmp_block_set_write_threshold(const char *node_name,
> >      BlockDriverState *bs;
> >      AioContext *aio_context;
> > 
> > -    bs = bdrv_find_node(node_name);
> > +    bs = bdrv_lookup_bs(node_name, node_name, errp);
> 
> Hmm, I'm not quite sure this does what I want.  If I understand
> correctly, when you open a qcow2 image, you get the following
> query-block layout (abbreviated):
> 
>             "inserted": {
> ...
>                 "image": {
> ...
>                     "backing-filename-format": "qcow2",
>                     "virtual-size": 12884901888,
>                     "filename": "/dev/sda6",
>                     "cluster-size": 65536,
>                     "format": "qcow2",
>                     "actual-size": 0,
> ...
>                 "drv": "qcow2",
>                 "iops": 0,
>                 "bps_wr": 0,
>                 "write_threshold": 0,
> ...
>                 "file": "/dev/sda6",
> 
> 
> That is, the only write_threshold reported is that of the active layer
> BDS (write_threshold of other BDS is reported through
> query-named-block-nodes, but only if those BDS have a name), but
> query-block is not listing any secondary BDS details.
> 
> Meanwhile, here is the corresponding query-blockstats layout:
> 
>             "device": "drive-virtio-disk0",
>             "parent": {
>                 "stats": {
>                     "flush_total_time_ns": 0,
>                     "wr_highest_offset": 72482304,
> ...
>             "stats": {
>                 "flush_total_time_ns": 728455560,
>                 "wr_highest_offset": 9129332224,
> 
> which DOES show the BDS chain; in particular, each qcow2 file has two
> BDS (one for the protocol, and the other ('parent') for the actual file).
> 
> The statistic I'm interested in is the allocation of the block device
> (the host offset, aka wr_highest_offset 72482304 above), and NOT the
> usage pattern of the guest (the qcow2 protocol, wr_highest_offset
> 9129332224).  But bdrv_lookup_bs() finds the qcow2 protocol layer,
> rather than the intended backing file layer; likewise, query-block is
> only reporting write_threshold for the protocol layer.
> 
> I'm wondering if, when a device name is given rather than a node name,
> it is safe to blindly follow the active layer down to its lowest member
> (or error out if there are more than one lower members, as in quorum),
> as that is the statistic that libvirt and upper layers really want ("am
> I about to exceed the allocation of my underlying storage?").  Likewise,
> on reporting, it is more useful to know the threshold of the backing
> layer if the qcow2 protocol layer does not have a threshold.  I'm
> playing with that idea before submitting a v2.

That is indeed what you need in your specific use case. However, qemu
shouldn't try to guess what management tools really want. It should
provide a clean interface that allows management tools to express
themselves what they want.

The cleanest interface that I can think of is that you access exactly
the node whose name you specified. If we do any magic like going down
the chain (which chain? What do you do with things like quorum in the
path?), we make the interface inconsistent and if anyone really wants to
know the highest offset that the guest accessed on its virtual disk, it
wouldn't even be possible any more because we said that that's not what
a management tool is interested in.

Let's stay away from such magic, as much as we can. libvirt can just
specify a node-name for the protocol layer and use that.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

next prev parent reply	other threads:[~2015-06-10  7:57 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-07  1:38 [Qemu-devel] [PATCH] block: allow write-threshold on device name Eric Blake
2015-06-07  8:53 ` Amos Kong
2015-06-08 14:35   ` Eric Blake
2015-06-09 22:35 ` Eric Blake
2015-06-10  7:57   ` Kevin Wolf [this message]
2015-06-10 13:07     ` Eric Blake
2015-06-10 13:43       ` Kevin Wolf
2015-06-10 14:53         ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150610075729.GB4899@noname.str.redhat.com \
    --to=kwolf@redhat.com \
    --cc=eblake@redhat.com \
    --cc=fromani@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).