qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: John Snow <jsnow@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>,
	qemu-devel@nongnu.org
Cc: kwolf@redhat.com, peter.maydell@linaro.org, quintela@redhat.com,
	dgilbert@redhat.com, stefanha@redhat.com, pbonzini@redhat.com,
	amit.shah@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH RFC v3 05/14] block: add meta bitmaps
Date: Wed, 18 Feb 2015 18:45:38 -0500	[thread overview]
Message-ID: <54E52422.9050507@redhat.com> (raw)
In-Reply-To: <1424268014-13293-6-git-send-email-vsementsov@parallels.com>



On 02/18/2015 09:00 AM, Vladimir Sementsov-Ogievskiy wrote:
> Meta bitmap is a 'dirty bitmap' for the BdrvDirtyBitmap. It tracks
> changes (set/unset) of this BdrvDirtyBitmap. It is needed for live
> migration of block dirty bitmaps.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
> ---
>   block.c               | 40 ++++++++++++++++++++++++++++++++++++++++
>   include/block/block.h |  5 +++++
>   2 files changed, 45 insertions(+)
>
> diff --git a/block.c b/block.c
> index a127fd2..aaa08b8 100644
> --- a/block.c
> +++ b/block.c
> @@ -58,9 +58,15 @@
>    * (3) successor is set: frozen mode.
>    *     A frozen bitmap cannot be renamed, deleted, anonymized, cleared, set,
>    *     or enabled. A frozen bitmap can only abdicate() or reclaim().
> + *
> + * Meta bitmap:
> + * Meta bitmap is a 'dirty bitmap' for the BdrvDirtyBitmap. It tracks changes
> + * (set/unset) of this BdrvDirtyBitmap. It is needed for live migration of
> + * block dirty bitmaps.
>    */
>   struct BdrvDirtyBitmap {
>       HBitmap *bitmap;            /* Dirty sector bitmap implementation */
> +    HBitmap *meta_bitmap;       /* Meta bitmap */
>       BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
>       char *name;                 /* Optional non-empty unique ID */
>       int64_t size;               /* Size of the bitmap (Number of sectors) */
> @@ -5398,6 +5404,31 @@ void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
>       bitmap->name = NULL;
>   }
>
> +HBitmap *bdrv_create_meta_bitmap(BdrvDirtyBitmap *bitmap,
> +                                        uint64_t granularity)
> +{
> +    uint64_t sector_granularity;
> +
> +    assert((granularity & (granularity - 1)) == 0);
> +
> +    granularity *= 8 * bdrv_dirty_bitmap_granularity(bitmap);
> +    sector_granularity = granularity >> BDRV_SECTOR_BITS;
> +    assert(sector_granularity);
> +

The maths here could use a comment, I think.

the "granularity" field here actually describes the desired 
serialization buffer size; e.g. CHUNK_SIZE (1 << 20) or 1 MiB. This 
parameter should be renamed to explain what it's actually for. Something 
like "chunk_size" and a comment explaining that it is in bytes.

...

That said, let's talk about the default chunk size you're using in 
correlation with this function.

a CHUNK_SIZE of 1MiB here is going to lead us to, if we have a bitmap 
with the default granularity of 128 sectors\64KiB bytes, a granularity 
for the meta_bitmap of one billion sectors (1 << 30) or 512GiB.

That's going to be bigger than most drives entirely, which will 
generally lead us to only using a single "chunk" per drive. Which means 
we won't really get a lot of mileage out of the bulk/dirty phases most 
of the time.

It's wild to think about that the first 1,000,000,000 sectors or 
512,000,000,000 bytes will all be represented by the first single bit in 
this bitmap. If a single hair on the drive changes, we resend the 
_entire_ bitmap, possibly over and over again. Will we ever make 
progress? Should we investigate a smaller chunk size?

Here's some quick mappings of chunk size (bytes) to effective 
meta_bitmap byte granularities, assuming the meta_bitmap is tracking a 
bitmap with the default granularity of 64KiB:

(1 << 20) 1MiB   -- 512GiB  // This is too high of a granularity
(1 << 17) 128KiB --  64GiB
(1 << 15) 32KiB  --  16GiB
(1 << 11) 2KiB   --   1GiB
(1 << 10) 1KiB   -- 512MiB
(1 << 9)  512B   -- 256MiB
(1 << 8)  256B   -- 128MiB
(1 << 5)  32 B   --  16MiB  // This is too small of a chunk size.
(1 << 1)   1 B   --   1MiB

We want to make the chunk sends efficient, but we also want to make sure 
that the dirty phase doesn't resend more data than it needs to, so we 
need to strike a balance here, no?

I think arguments could be made for most granularities between 128MiB 
through 1GiB. Anything outside of that is too lopsided, IMO.

What are your thoughts on this?

> +    bitmap->meta_bitmap =
> +        hbitmap_alloc(bitmap->size, ffsll(sector_granularity) - 1);
> +
> +    return bitmap->meta_bitmap;
> +}
> +
> +void bdrv_release_meta_bitmap(BdrvDirtyBitmap *bitmap)
> +{
> +    if (bitmap->meta_bitmap) {
> +        hbitmap_free(bitmap->meta_bitmap);
> +        bitmap->meta_bitmap = NULL;
> +    }
> +}
> +
>   BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
>                                             uint32_t granularity,
>                                             const char *name,
> @@ -5532,6 +5563,9 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
>               assert(!bdrv_dirty_bitmap_frozen(bm));
>               QLIST_REMOVE(bitmap, list);
>               hbitmap_free(bitmap->bitmap);
> +            if (bitmap->meta_bitmap) {
> +                hbitmap_free(bitmap->meta_bitmap);
> +            }
>               g_free(bitmap->name);
>               g_free(bitmap);
>               return;
> @@ -5659,6 +5693,9 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
>   {
>       assert(bdrv_dirty_bitmap_enabled(bitmap));
>       hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
> +    if (bitmap->meta_bitmap) {
> +        hbitmap_set(bitmap->meta_bitmap, cur_sector, nr_sectors);
> +    }
>   }
>
>   void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
> @@ -5666,6 +5703,9 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
>   {
>       assert(bdrv_dirty_bitmap_enabled(bitmap));
>       hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
> +    if (bitmap->meta_bitmap) {
> +        hbitmap_set(bitmap->meta_bitmap, cur_sector, nr_sectors);
> +    }
>   }
>
>   void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap)
> diff --git a/include/block/block.h b/include/block/block.h
> index c6a928d..f2c62f6 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -4,6 +4,7 @@
>   #include "block/aio.h"
>   #include "qemu-common.h"
>   #include "qemu/option.h"
> +#include "qemu/hbitmap.h"
>   #include "block/coroutine.h"
>   #include "block/accounting.h"
>   #include "qapi/qmp/qobject.h"
> @@ -487,6 +488,10 @@ void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
>                                             uint64_t start, uint64_t count);
>   void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
>
> +HBitmap *bdrv_create_meta_bitmap(BdrvDirtyBitmap *bitmap,
> +                                        uint64_t granularity);
> +void bdrv_release_meta_bitmap(BdrvDirtyBitmap *bitmap);
> +
>   void bdrv_enable_copy_on_read(BlockDriverState *bs);
>   void bdrv_disable_copy_on_read(BlockDriverState *bs);
>
>

  reply	other threads:[~2015-02-18 23:45 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-18 14:00 [Qemu-devel] [PATCH RFC v3 00/14] Dirty bitmaps migration Vladimir Sementsov-Ogievskiy
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 01/14] qmp: add query-block-dirty-bitmap Vladimir Sementsov-Ogievskiy
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 02/14] hbitmap: serialization Vladimir Sementsov-Ogievskiy
2015-02-18 23:42   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 03/14] block: BdrvDirtyBitmap serialization interface Vladimir Sementsov-Ogievskiy
2015-02-18 23:43   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 04/14] block: tiny refactoring: minimize hbitmap_(set/reset) usage Vladimir Sementsov-Ogievskiy
2015-02-18 23:44   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 05/14] block: add meta bitmaps Vladimir Sementsov-Ogievskiy
2015-02-18 23:45   ` John Snow [this message]
2015-02-19 11:43     ` Vladimir Sementsov-Ogievskiy
2015-02-21  0:53       ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 06/14] block: add bdrv_next_dirty_bitmap() Vladimir Sementsov-Ogievskiy
2015-02-18 23:45   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 07/14] qapi: add dirty-bitmaps migration capability Vladimir Sementsov-Ogievskiy
2015-02-18 23:45   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 08/14] migration: add migration/block-dirty-bitmap.c Vladimir Sementsov-Ogievskiy
2015-02-18 23:47   ` John Snow
2015-02-19 13:48     ` Vladimir Sementsov-Ogievskiy
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 09/14] iotests: maintain several vms in test Vladimir Sementsov-Ogievskiy
2015-02-18 23:48   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 10/14] iotests: add add_incoming_migration to VM class Vladimir Sementsov-Ogievskiy
2015-02-18 23:48   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 11/14] iotests: add dirty bitmap migration test Vladimir Sementsov-Ogievskiy
2015-02-19 18:47   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 12/14] qapi: add md5 checksum of last dirty bitmap level to query-block Vladimir Sementsov-Ogievskiy
2015-02-19 18:53   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 13/14] iotests: add dirty bitmap migration test Vladimir Sementsov-Ogievskiy
2015-02-19 19:30   ` John Snow
2015-02-18 14:00 ` [Qemu-devel] [PATCH RFC v3 14/14] migration/qemu-file: make functions qemu_(get/put)_string public Vladimir Sementsov-Ogievskiy
2015-02-19  0:00   ` John Snow
2015-02-19  0:11 ` [Qemu-devel] [PATCH RFC v3 00/14] Dirty bitmaps migration John Snow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54E52422.9050507@redhat.com \
    --to=jsnow@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=den@openvz.org \
    --cc=dgilbert@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).