qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: armbru@redhat.com, pbonzini@redhat.com, mreitz@redhat.com,
	kwolf@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH v5 4/6] nbd/server: implement dirty bitmap export
Date: Wed, 20 Jun 2018 06:24:08 -0500	[thread overview]
Message-ID: <4ab9dd4e-1c79-fbe2-8e4b-6563e3775366@redhat.com> (raw)
In-Reply-To: <20180609151758.17343-5-vsementsov@virtuozzo.com>

On 06/09/2018 10:17 AM, Vladimir Sementsov-Ogievskiy wrote:
> Handle new NBD meta namespace: "qemu", and corresponding queries:

s/new/a new/

> "qemu:dirty-bitmap:<export bitmap name>".
> 
> With new metadata context negotiated, BLOCK_STATUS query will reply

s/new/the new/

> with dirty-bitmap data, converted to extents. New public function

s/New/The new/

> nbd_export_bitmap selects bitmap to export. For now, only one bitmap

s/bitmap/which bitmap/

> may be exported.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/block/nbd.h |   6 ++
>   nbd/server.c        | 271 ++++++++++++++++++++++++++++++++++++++++++++++++----
>   2 files changed, 259 insertions(+), 18 deletions(-)
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index fcdcd54502..a653d0fba7 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -234,6 +234,10 @@ enum {
>   #define NBD_STATE_HOLE (1 << 0)
>   #define NBD_STATE_ZERO (1 << 1)
>   
> +/* Flags for extents (NBDExtent.flags) of NBD_REPLY_TYPE_BLOCK_STATUS,
> + * for qemu:dirty-bitmap:* meta contexts */

Comment tail.

> +#define NBD_STATE_DIRTY (1 << 0)
> +
>   static inline bool nbd_reply_type_is_error(int type)
>   {
>       return type & (1 << 15);
> @@ -315,6 +319,8 @@ void nbd_client_put(NBDClient *client);
>   void nbd_server_start(SocketAddress *addr, const char *tls_creds,
>                         Error **errp);
>   
> +void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
> +                       const char *bitmap_export_name, Error **errp);
>   
>   /* nbd_read
>    * Reads @size bytes from @ioc. Returns 0 on success.
> diff --git a/nbd/server.c b/nbd/server.c
> index 2d762d7289..569e068fc2 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -23,6 +23,12 @@
>   #include "nbd-internal.h"
>   
>   #define NBD_META_ID_BASE_ALLOCATION 0
> +#define NBD_META_ID_DIRTY_BITMAP 1
> +
> +/* NBD_MAX_BITMAP_EXTENTS: 1 mb of extents data. An empirical constant. If need
> + * to increase, note that NBD protocol defines 32 mb as a limit, after which the

s/need to increase/an increase is needed/

> + * reply may be considered as a denial of service attack. */
> +#define NBD_MAX_BITMAP_EXTENTS (0x100000 / 8)
>   
>   static int system_errno_to_nbd_errno(int err)
>   {
> @@ -80,6 +86,9 @@ struct NBDExport {
>   
>       BlockBackend *eject_notifier_blk;
>       Notifier eject_notifier;
> +
> +    BdrvDirtyBitmap *export_bitmap;
> +    char *export_bitmap_context;
>   };
>   
>   static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);
> @@ -92,6 +101,7 @@ typedef struct NBDExportMetaContexts {
>       bool valid; /* means that negotiation of the option finished without
>                      errors */
>       bool base_allocation; /* export base:allocation context (block status) */
> +    bool dirty_bitmap; /* export qemu:dirty-bitmap:<export bitmap name> */
>   } NBDExportMetaContexts;
>   
>   struct NBDClient {
> @@ -810,6 +820,55 @@ static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
>                                        &meta->base_allocation, errp);
>   }
>   
> +/* nbd_meta_bitmap_query
> + *
> + * Handle query to 'qemu:' namespace.
> + * @len is the amount of text remaining to be read from the current name, after
> + * the 'qemu:' portion has been stripped.
> + *
> + * Return -errno on I/O error, 0 if option was completely handled by
> + * sending a reply about inconsistent lengths, or 1 on success. */
> +static int nbd_meta_qemu_query(NBDClient *client, NBDExportMetaContexts *meta,
> +                               uint32_t len, Error **errp)
> +{
> +    bool dirty_bitmap = false;
> +    int dirty_bitmap_len = strlen("dirty-bitmap:");

size_t is better for strlen() values.

> +    int ret;
> +
> +    if (!meta->exp->export_bitmap) {
> +        return nbd_opt_skip(client, len, errp);
> +    }

Worth a trace?...

> +
> +    if (len == 0) {
> +        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
> +            meta->dirty_bitmap = true;
> +        }
> +        trace_nbd_negotiate_meta_query_parse("empty");
> +        return 1;
> +    }
> +
> +    if (len < dirty_bitmap_len) {
> +        trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
> +        return nbd_opt_skip(client, len, errp);
> +    }
> +

...especially since other returns have traced the result.

I'm strongly thinking of adding a patch to QMP to add an x-context 
parameter when creating an NBD client, in order to at least make testing 
client/server interactions easier than just code inspection.  Does not 
have to be part of this series.

> +    len -= dirty_bitmap_len;
> +    ret = nbd_meta_pattern(client, "dirty-bitmap:", &dirty_bitmap, errp);
> +    if (ret <= 0) {
> +        return ret;
> +    }
> +    if (!dirty_bitmap) {
> +        trace_nbd_negotiate_meta_query_skip("not dirty-bitmap:");
> +        return nbd_opt_skip(client, len, errp);
> +    }
> +
> +    trace_nbd_negotiate_meta_query_parse("dirty-bitmap:");
> +
> +    return nbd_meta_empty_or_pattern(
> +            client, meta->exp->export_bitmap_context +
> +            strlen("qemu:dirty_bitmap:"), len, &meta->dirty_bitmap, errp);
> +}
> +
>   /* nbd_negotiate_meta_query
>    *
>    * Parse namespace name and call corresponding function to parse body of the
> @@ -826,8 +885,10 @@ static int nbd_negotiate_meta_query(NBDClient *client,
>                                       NBDExportMetaContexts *meta, Error **errp)
>   {
>       int ret;
> -    char query[sizeof("base:") - 1];
> -    size_t baselen = strlen("base:");
> +    size_t ns_len = 5;
> +    char ns[5]; /* Both 'qemu' and 'base' namespaces have length = 5 including a
> +                   colon. If it's needed to introduce a namespace of the other
> +                   length, this should be certainly refactored. */

s/be certainly/certainly be/

...

> @@ -1793,6 +1871,9 @@ static int blockstatus_to_extent_be(BlockDriverState *bs, uint64_t offset,
>   }
>   
>   /* nbd_co_send_extents
> + *
> + * NBD_REPLY_FLAG_DONE is not set, don't forget to send it.
> + *
>    * @extents should be in big-endian */
>   static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
>                                  NBDExtent *extents, unsigned nb_extents,
> @@ -1805,7 +1886,7 @@ static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
>           {.iov_base = extents, .iov_len = nb_extents * sizeof(extents[0])}
>       };
>   
> -    set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_BLOCK_STATUS,
> +    set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_BLOCK_STATUS,

Worth a bool parameter to send the flag automatically on the last context?

>                    handle, sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
>       stl_be_p(&chunk.context_id, context_id);
>   
> @@ -1830,6 +1911,97 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle,
>       return nbd_co_send_extents(client, handle, &extent, 1, context_id, errp);
>   }
>   
> +/* Set several extents, describing region of given @length with given @flags.
> + * Do not set more than @nb_extents, return number of set extents.
> + */
> +static unsigned add_extents(NBDExtent *extents, unsigned nb_extents,
> +                            uint64_t length, uint32_t flags)
> +{
> +    unsigned i = 0;
> +    uint32_t max_extent = QEMU_ALIGN_DOWN(INT32_MAX, BDRV_SECTOR_SIZE);

This is too small of a granularity wrong when the server advertised 4k 
alignment during NBD_OPT_GO; it should probably refer to 
bs->bl.request_alignment.

> +    uint32_t max_extent_be = cpu_to_be32(max_extent);
> +    uint32_t flags_be = cpu_to_be32(flags);
> +
> +    for (i = 0; i < nb_extents && length > max_extent;
> +         i++, length -= max_extent)
> +    {
> +        extents[i].length = max_extent_be;
> +        extents[i].flags = flags_be;
> +    }
> +
> +    if (length > 0 && i < nb_extents) {
> +        extents[i].length = cpu_to_be32(length);
> +        extents[i].flags = flags_be;
> +        i++;
> +    }
> +
> +    return i;
> +}
> +
> +static unsigned bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,
> +                                  uint64_t length, NBDExtent *extents,
> +                                  unsigned nb_extents, bool dont_fragment)
> +{
> +    uint64_t begin = offset, end;
> +    uint64_t overall_end = offset + length;
> +    unsigned i = 0;
> +    BdrvDirtyBitmapIter *it;
> +    bool dirty;
> +
> +    bdrv_dirty_bitmap_lock(bitmap);
> +
> +    it = bdrv_dirty_iter_new(bitmap);
> +    dirty = bdrv_get_dirty_locked(NULL, bitmap, offset);
> +
> +    while (begin < overall_end && i < nb_extents) {
> +        if (dirty) {
> +            end = bdrv_dirty_bitmap_next_zero(bitmap, begin);
> +        } else {
> +            bdrv_set_dirty_iter(it, begin);
> +            end = bdrv_dirty_iter_next(it);
> +        }
> +        if (end == -1) {
> +            end = bdrv_dirty_bitmap_size(bitmap);
> +        }
> +        if (dont_fragment && end > overall_end) {
> +            /* We MUST not exceed requested region if NBD_CMD_FLAG_REQ_ONE flag
> +             * is set */
> +            end = overall_end;
> +        }
> +
> +        i += add_extents(extents + i, nb_extents - i, end - begin,
> +                         dirty ? NBD_STATE_DIRTY : 0);
> +        begin = end;
> +        dirty = !dirty;
> +    }
> +
> +    bdrv_dirty_iter_free(it);
> +
> +    bdrv_dirty_bitmap_unlock(bitmap);
> +
> +    return i;
> +}
> +
> +static int nbd_co_send_bitmap(NBDClient *client, uint64_t handle,
> +                              BdrvDirtyBitmap *bitmap, uint64_t offset,
> +                              uint64_t length, bool dont_fragment,
> +                              uint32_t context_id, Error **errp)
> +{
> +    int ret;
> +    unsigned nb_extents = dont_fragment ? 1 : NBD_MAX_BITMAP_EXTENTS;
> +    NBDExtent *extents = g_new(NBDExtent, nb_extents);
> +
> +    nb_extents = bitmap_to_extents(bitmap, offset, length, extents, nb_extents,
> +                                   dont_fragment);
> +
> +    ret = nbd_co_send_extents(client, handle, extents, nb_extents, context_id,
> +                              errp);

Worth a trace when extents are sent?

> +
> +    g_free(extents);
> +
> +    return ret;
> +}
> +
>   /* nbd_co_receive_request
>    * Collect a client request. Return 0 if request looks valid, -EIO to drop
>    * connection right away, and any other negative value to report an error to
> @@ -2043,11 +2215,33 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
>                                         "discard failed", errp);
>   
>       case NBD_CMD_BLOCK_STATUS:
> -        if (client->export_meta.valid && client->export_meta.base_allocation) {
> -            return nbd_co_send_block_status(client, request->handle,
> -                                            blk_bs(exp->blk), request->from,
> -                                            request->len,
> -                                            NBD_META_ID_BASE_ALLOCATION, errp);
> +        if (client->export_meta.valid &&
> +            (client->export_meta.base_allocation ||
> +             client->export_meta.dirty_bitmap))
> +        {
> +            if (client->export_meta.base_allocation) {
> +                ret = nbd_co_send_block_status(client, request->handle,
> +                                               blk_bs(exp->blk), request->from,
> +                                               request->len,
> +                                               NBD_META_ID_BASE_ALLOCATION,
> +                                               errp);
> +                if (ret < 0) {
> +                    return ret;
> +                }
> +            }
> +
> +            if (client->export_meta.dirty_bitmap) {
> +                ret = nbd_co_send_bitmap(client, request->handle,
> +                                         client->exp->export_bitmap,
> +                                         request->from, request->len,
> +                                         request->flags & NBD_CMD_FLAG_REQ_ONE,
> +                                         NBD_META_ID_DIRTY_BITMAP, errp);
> +                if (ret < 0) {
> +                    return ret;
> +                }
> +            }
> +
> +            return nbd_co_send_structured_done(client, request->handle, errp);
>           } else {
>               return nbd_send_generic_reply(client, request->handle, -EINVAL,
>                                             "CMD_BLOCK_STATUS not negotiated",
> @@ -2199,3 +2393,44 @@ void nbd_client_new(NBDExport *exp,
>       co = qemu_coroutine_create(nbd_co_client_start, client);
>       qemu_coroutine_enter(co);
>   }
> +
> +void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
> +                       const char *bitmap_export_name, Error **errp)
> +{
> +    BdrvDirtyBitmap *bm = NULL;
> +    BlockDriverState *bs = blk_bs(exp->blk);
> +
> +    if (exp->export_bitmap) {
> +        error_setg(errp, "Export bitmap is already set");
> +        return;
> +    }
> +
> +    while (true) {
> +        bm = bdrv_find_dirty_bitmap(bs, bitmap);
> +        if (bm != NULL || bs->backing == NULL) {
> +            break;
> +        }
> +
> +        bs = bs->backing->bs;
> +    }

Is searching for the dirty bitmap in the backing chain always the wisest 
thing to do?  I guess it works, but it seems a bit odd on first glance.

> +
> +    if (bm == NULL) {
> +        error_setg(errp, "Bitmap '%s' is not found", bitmap);
> +        return;
> +    }
> +
> +    if (bdrv_dirty_bitmap_enabled(bm)) {
> +        error_setg(errp, "Bitmap '%s' is enabled", bitmap);
> +        return;
> +    }
> +
> +    if (bdrv_dirty_bitmap_qmp_locked(bm)) {
> +        error_setg(errp, "Bitmap '%s' is locked", bitmap);
> +        return;
> +    }
> +
> +    bdrv_dirty_bitmap_set_qmp_locked(bm, true);
> +    exp->export_bitmap = bm;
> +    exp->export_bitmap_context =
> +            g_strdup_printf("qemu:dirty-bitmap:%s", bitmap_export_name);
> +}
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

  reply	other threads:[~2018-06-20 11:24 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-09 15:17 [Qemu-devel] [PATCH v5 0/6] NBD export bitmaps Vladimir Sementsov-Ogievskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 1/6] nbd/server: fix trace Vladimir Sementsov-Ogievskiy
2018-06-19 18:39   ` Eric Blake
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 2/6] nbd/server: refactor NBDExportMetaContexts Vladimir Sementsov-Ogievskiy
2018-06-19 19:03   ` Eric Blake
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 3/6] nbd/server: add nbd_meta_empty_or_pattern helper Vladimir Sementsov-Ogievskiy
2018-06-19 20:24   ` Eric Blake
2018-06-20  9:43     ` Vladimir Sementsov-Ogievskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 4/6] nbd/server: implement dirty bitmap export Vladimir Sementsov-Ogievskiy
2018-06-20 11:24   ` Eric Blake [this message]
2018-06-20 14:04     ` Vladimir Sementsov-Ogievskiy
2018-06-20 15:43     ` Eric Blake
2018-06-20 15:58       ` Eric Blake
2018-06-20 16:27   ` Eric Blake
2018-06-20 17:04     ` Vladimir Sementsov-Ogievskiy
2018-06-20 18:09       ` Eric Blake
2018-06-21 10:09         ` Vladimir Sementsov-Ogievskiy
2018-09-14 16:22         ` Vladimir Sementsov-Ogievskiy
2018-11-29  4:34   ` Eric Blake
2019-01-09 19:21   ` Eric Blake
2019-01-10  7:15     ` Eric Blake
2019-01-17 21:09     ` John Snow
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 5/6] qapi: new qmp command nbd-server-add-bitmap Vladimir Sementsov-Ogievskiy
2018-06-20 11:26   ` Eric Blake
2018-06-20 14:13     ` Vladimir Sementsov-Ogievskiy
2018-06-20 18:14       ` Eric Blake
2018-06-21 10:10         ` Vladimir Sementsov-Ogievskiy
2018-06-21 10:23       ` Nikolay Shirokovskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 6/6] docs/interop: add nbd.txt Vladimir Sementsov-Ogievskiy
2018-06-20 11:33   ` Eric Blake
2018-06-20 14:16     ` Vladimir Sementsov-Ogievskiy
2018-06-20 20:58       ` [Qemu-devel] [Qemu-block] " John Snow
2018-06-21 15:59         ` Vladimir Sementsov-Ogievskiy
2018-06-21 22:10           ` [Qemu-devel] Incremental Backup Status (Was: Re: [Qemu-block] [PATCH v5 6/6] docs/interop: add nbd.txt) John Snow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ab9dd4e-1c79-fbe2-8e4b-6563e3775366@redhat.com \
    --to=eblake@redhat.com \
    --cc=armbru@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).