All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Dillaman <jdillama@redhat.com>
To: Adam Wolfe Gordon <awg@digitalocean.com>, qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	"open list:All patches CC here" <qemu-devel@nongnu.org>,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] rbd: Detect rbd image resizes and propagate them
Date: Wed, 13 Sep 2017 17:21:05 -0400	[thread overview]
Message-ID: <1505337665.15211.3.camel@redhat.com> (raw)
In-Reply-To: <20170913164424.32129-1-awg@digitalocean.com>

On Wed, 2017-09-13 at 10:44 -0600, Adam Wolfe Gordon wrote:
> Register a watcher with rbd so that we get notified when an image is
> resized. Propagate resizes to parent block devices so that guest devices
> get resized without user intervention.
> 
> Signed-off-by: Adam Wolfe Gordon <awg@digitalocean.com>
> ---
> Hello!
> 
> We are using this patch in production at DigitalOcean and have tested it fairly
> extensively. However, we use the same block device configuration everywhere, so
> I'd be interested to get an answer to the question I've left in the code:
> 
> > /* NOTE: This assumes there's only one layer between us and the
> >    block-backend. Is this always true? */
> 
> If that isn't true, this patch will need a bit of work to traverse up the block
> device tree and find the block-backend.
> 
> Of course, any other feedback would be welcome too - this is my first foray into
> the qemu code.
> 
> Regards,
> Adam
> 
>  block/rbd.c | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
> 
> diff --git a/block/rbd.c b/block/rbd.c
> index 144f350e1f..1c9fcbec1f 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -96,6 +96,8 @@ typedef struct BDRVRBDState {
>      rbd_image_t image;
>      char *image_name;
>      char *snap;
> +    uint64_t watcher_handle;
> +    uint64_t size_bytes;
>  } BDRVRBDState;
>  
>  static char *qemu_rbd_next_tok(char *src, char delim, char **p)
> @@ -540,6 +542,67 @@ out:
>      return rados_str;
>  }
>  
> +/* BH for when rbd notifies us of an update. */
> +static void qemu_rbd_update_bh(void *arg)
> +{
> +    BlockDriverState *parent, *bs = arg;
> +    BDRVRBDState *s = bs->opaque;
> +    bool was_variable_length;
> +    uint64_t new_size_bytes;
> +    int64_t new_parent_len;
> +    BdrvChild *c;
> +    int r;
> +
> +    r = rbd_get_size(s->image, &new_size_bytes);
> +    if (r < 0) {
> +        error_report("error reading size for %s: %s", s->name, strerror(-r));
> +        return;
> +    }
> +
> +    /* Avoid no-op resizes on non-resize notifications. */
> +    if (new_size_bytes == s->size_bytes) {
> +        error_printf("skipping non-resize rbd cb\n");

Image update callbacks can be invoked for any number of reasons, not
just resize events. Therefore, you probably don't want to have an error
message printed out here.

> +        return;
> +    }
> +
> +    /* NOTE: This assumes there's only one layer between us and the
> +       block-backend. Is this always true? */

I don't think that can be assumed to be true.

> +    parent = bs->inherits_from;
> +    if (parent == NULL) {
> +        error_report("bs %s does not have parent", bdrv_get_device_or_node_name(bs));
> +        return;
> +    }
> +
> +    /* Force parents to re-read our size. */

Can you just invoke blk_truncate instead? 

> +    was_variable_length = bs->drv->has_variable_length;
> +    bs->drv->has_variable_length = true;
> +    new_parent_len = bdrv_getlength(parent);
> +    if (new_parent_len < 0) {
> +        error_report("getlength failed on parent %s", bdrv_get_device_or_node_name(parent));
> +        bs->drv->has_variable_length = was_variable_length;
> +        return;
> +    }
> +    bs->drv->has_variable_length = was_variable_length;
> +
> +    /* Notify block backends that that we have resized.
> +       Copied from bdrv_parent_cb_resize. */
> +    QLIST_FOREACH(c, &parent->parents, next_parent) {
> +        if (c->role->resize) {
> +            c->role->resize(c);
> +        }
> +    }
> +
> +    s->size_bytes = new_size_bytes;
> +}
> +
> +/* Called from non-qemu thread - careful! */
> +static void qemu_rbd_update_cb(void *arg)
> +{
> +    BlockDriverState *bs = arg;
> +
> +    aio_bh_schedule_oneshot(bdrv_get_aio_context(bs), qemu_rbd_update_bh, bs);
> +}
> +
>  static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
>                           Error **errp)
>  {
> @@ -672,9 +735,25 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
>          }
>      }
>  
> +    r = rbd_update_watch(s->image, &s->watcher_handle, qemu_rbd_update_cb, bs);

This API method was only added in the Ceph Jewel release just over a
year ago. This should probably gracefully degrade if compiled against an
older version of the librbd API.

> +    if (r < 0) {
> +        error_setg_errno(errp, -r, "error registering watcher on %s", s->name);
> +        goto failed_watch;
> +    }
> +
> +    r = rbd_get_size(s->image, &s->size_bytes);
> +    if (r < 0) {
> +        error_setg_errno(errp, -r, "error reading size for %s", s->name);
> +        goto failed_sz;
> +    }
> +
>      qemu_opts_del(opts);
>      return 0;
>  
> +failed_sz:
> +    rbd_update_unwatch(s->image, s->watcher_handle);
> +failed_watch:
> +    rbd_close(s->image);
>  failed_open:
>      rados_ioctx_destroy(s->io_ctx);
>  failed_shutdown:
> @@ -712,6 +791,7 @@ static void qemu_rbd_close(BlockDriverState *bs)
>  {
>      BDRVRBDState *s = bs->opaque;
>  
> +    rbd_update_unwatch(s->image, s->watcher_handle);
>      rbd_close(s->image);
>      rados_ioctx_destroy(s->io_ctx);
>      g_free(s->snap);

  parent reply	other threads:[~2017-09-13 21:21 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-13 16:44 [Qemu-devel] [PATCH] rbd: Detect rbd image resizes and propagate them Adam Wolfe Gordon
2017-09-13 20:22 ` no-reply
2017-09-13 20:22 ` no-reply
2017-09-13 20:53 ` [Qemu-devel] [Qemu-block] " John Snow
2017-09-13 21:36   ` Adam Wolfe Gordon
2017-09-14  0:47     ` John Snow
2017-09-14 22:27       ` Adam Wolfe Gordon
2017-09-13 21:21 ` Jason Dillaman [this message]
2017-09-13 21:56   ` Adam Wolfe Gordon
2017-09-15 12:33 ` [Qemu-devel] " Kevin Wolf
2017-09-15 22:08   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1505337665.15211.3.camel@redhat.com \
    --to=jdillama@redhat.com \
    --cc=awg@digitalocean.com \
    --cc=dillaman@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.