From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57051)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <vsementsov@parallels.com>) id 1YCX15-0005Mz-CA
	for qemu-devel@nongnu.org; Sat, 17 Jan 2015 12:18:34 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <vsementsov@parallels.com>) id 1YCX11-0007A9-7Z
	for qemu-devel@nongnu.org; Sat, 17 Jan 2015 12:18:31 -0500
Received: from mx2.parallels.com ([199.115.105.18]:43538)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <vsementsov@parallels.com>) id 1YCX10-0006bO-Qe
	for qemu-devel@nongnu.org; Sat, 17 Jan 2015 12:18:27 -0500
Message-ID: <54BA9926.7010509@parallels.com>
Date: Sat, 17 Jan 2015 20:17:26 +0300
From: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
MIME-Version: 1.0
References: <1418307457-25996-1-git-send-email-vsementsov@parallels.com>
	<1418307457-25996-10-git-send-email-vsementsov@parallels.com>
	<54AEFF44.5010805@redhat.com>
In-Reply-To: <54AEFF44.5010805@redhat.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 9/9] block-migration: add named dirty
 bitmaps migration
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: John Snow <jsnow@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, "Juan Q >> Juan Jose Quintela Carreira" <quintela@redhat.com>, "gilbert >> Dr. David Alan Gilbert" <dgilbert@redhat.com>, stefanha@redhat.com, Amit Shah <amit.shah@redhat.com>, den@openvz.org


Best regards,
Vladimir

On 09.01.2015 01:05, John Snow wrote:
> CCing migration maintainers, feedback otherwise in-line.
>
> On 12/11/2014 09:17 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Just migrate parts of dirty bitmaps, corresponding to the block being
>> migrated. Also, skip block migration if it is disabled (blk parameter
>> of migrate command is false).
>>
>> Skipping shared sectors: bitmaps are migrated independently of this,
>> just send blk without block data.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
>> ---
>
> In terms of general approach, migrating the dirty bitmap alongside the 
> blocks it describes makes sense when migrating both.
>
> Is this a lot of overhead when that's not the case, though? If we 
> utilize the "bitmap only" pathways added here, don't we iterate 
> through the savevm handlers a lot to only transfer very little data 
> per section?
>
> If we really need migration of bitmaps apart from the data they 
> describe, is it not worth developing a more condensed transfer 
> mechanism to get more bitmap data per iteration, instead of just one 
> block's worth?
The first stage of migration can be easily optimized in this case - just 
take a larger bitmap pieces. For the second stage, it's not as simple. 
If we will take larger pieces on each step - we will just increase dirty 
data transfer. One approach is to maintain "dirty-region" - two numbers, 
representing the region of set bits in migration dirty bitmap, and send 
data only when this region is large enough.

>
>>   block-migration.c | 173 
>> +++++++++++++++++++++++++++++++++++++++++++++++-------
>>   savevm.c          |   1 +
>>   2 files changed, 154 insertions(+), 20 deletions(-)
>>
>> diff --git a/block-migration.c b/block-migration.c
>> index 908a66d..95d54a1 100644
>> --- a/block-migration.c
>> +++ b/block-migration.c
>> @@ -32,6 +32,8 @@
>>   #define BLK_MIG_FLAG_EOS                0x02
>>   #define BLK_MIG_FLAG_PROGRESS           0x04
>>   #define BLK_MIG_FLAG_ZERO_BLOCK         0x08
>> +#define BLK_MIG_FLAG_HAS_BLOCK          0x10
>> +#define BLK_MIG_FLAG_HAS_BITMAPS        0x20
>>
>
> OK: As a result of allowing bitmaps to be migrated without the block 
> data itself, we now must acknowledge the concept that we can send 
> either block data, bitmaps, both, or neither -- so new defines are 
> added here to indicate what data can be found in the section following.
>
>>   #define MAX_IS_ALLOCATED_SEARCH 65536
>>
>> @@ -51,6 +53,8 @@ typedef struct BlkMigDevState {
>>       int shared_base;
>>       int64_t total_sectors;
>>       QSIMPLEQ_ENTRY(BlkMigDevState) entry;
>> +    int nr_bitmaps;
>> +    BdrvDirtyBitmap **dirty_bitmaps;
>>
>>       /* Only used by migration thread.  Does not need a lock. */
>>       int bulk_completed;
>> @@ -64,6 +68,11 @@ typedef struct BlkMigDevState {
>>       Error *blocker;
>>   } BlkMigDevState;
>>
>> +typedef struct BlkMigDirtyBitmap {
>> +    BdrvDirtyBitmap *bitmap;
>> +    uint8_t *buf;
>> +} BlkMigDirtyBitmap;
>> +
>>   typedef struct BlkMigBlock {
>>       /* Only used by migration thread.  */
>>       uint8_t *buf;
>> @@ -74,6 +83,9 @@ typedef struct BlkMigBlock {
>>       QEMUIOVector qiov;
>>       BlockAIOCB *aiocb;
>>
>> +    int nr_bitmaps;
>> +    BlkMigDirtyBitmap *dirty_bitmaps;
>> +
>>       /* Protected by block migration lock.  */
>>       int ret;
>>       QSIMPLEQ_ENTRY(BlkMigBlock) entry;
>> @@ -83,6 +95,7 @@ typedef struct BlkMigState {
>>       /* Written during setup phase.  Can be read without a lock.  */
>>       int blk_enable;
>>       int shared_base;
>> +    int dbm_enable;
>
> Similar to the feedback in a previous patch, we may not want to use 
> 'dbm' to mean "Dirty Bitmap" because we have not been applying the 
> abbreviation consistently.
>
> For now, the recommendation from stefan is to use the full 
> "bdrv_dirty_bitmap" or "dirty_bitmap" in function names.
>
> If we do want an acronym to refer to this particular type of dirty 
> bitmap, we should apply it consistently to all functions that work 
> with the BdrvDirtyBitmap type.
>
> For now, "bdrv_dirty_bitmap_enable" should suffice, even though it's a 
> bit wordy.
It's a flag, that enables the migration of dirty bitmaps, like 
blk_enable enables the migrtion of blocks.. Then, should it be 
"dirty_bitmap_migration_enable"? Isn't it too long for a simple flag 
variable?
>
>>       QSIMPLEQ_HEAD(bmds_list, BlkMigDevState) bmds_list;
>>       int64_t total_sector_sum;
>>       bool zero_blocks;
>> @@ -116,27 +129,63 @@ static void blk_mig_unlock(void)
>>   /* Only allocating and initializing structure fields, not copying 
>> any data. */
>>
>>   static BlkMigBlock *blk_create(BlkMigDevState *bmds, int64_t sector,
>> -                                int nr_sectors)
>> +                               int nr_sectors, bool only_bitmaps)
>>   {
>> +    int i;
>>       BlkMigBlock *blk = g_new(BlkMigBlock, 1);
>> -    blk->buf = g_malloc(BLOCK_SIZE);
>> +    blk->buf = only_bitmaps ? NULL : g_malloc(BLOCK_SIZE);
>>       blk->bmds = bmds;
>>       blk->sector = sector;
>>       blk->nr_sectors = nr_sectors;
>> +    blk->dirty_bitmaps = NULL;
>> +    blk->nr_bitmaps = 0;
>>
>>       blk->iov.iov_base = blk->buf;
>>       blk->iov.iov_len = nr_sectors * BDRV_SECTOR_SIZE;
>>       qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
>>
>
> We can probably skip this block if only_bitmaps is true.
I agree.
>
>> +    if (block_mig_state.dbm_enable) {
>> +        BlkMigDirtyBitmap *mig_bm;
>> +
>> +        blk->nr_bitmaps = bmds->nr_bitmaps;
>> +        mig_bm = blk->dirty_bitmaps = g_new(BlkMigDirtyBitmap,
>> + bmds->nr_bitmaps);
>> +        for (i = 0; i < bmds->nr_bitmaps; ++i) {
>> +            BdrvDirtyBitmap *bm = bmds->dirty_bitmaps[i];
>> +            mig_bm->buf = g_malloc(bdrv_dbm_data_size(bm, nr_sectors));
>> +            mig_bm->bitmap = bm;
>> +            mig_bm++;
>> +        }
>> +    }
>> +
>
> It's strange to me that we'd give it an "only bitmaps" boolean and 
> then rely on a different condition to populate them. Is it not worthy 
> of an error if we specify only_bitmaps when block_mig_state.dbm_enable 
> is false?
only_bitmaps means: no blocks but only bitmaps. But there is another 
case: blocks with bitmaps, and in this case this "if ()" should execute.

{only_bitmaps = true} when {block_mig_state.dbm_enable = false} - it's 
of course an error. I'll add assert() for this.
>
>>       return blk;
>>   }
>>
>>   static void blk_free(BlkMigBlock *blk)
>>   {
>> +    int i;
>>       g_free(blk->buf);
>> +
>> +    if (blk->dirty_bitmaps) {
>> +        for (i = 0; i < blk->nr_bitmaps; ++i) {
>> +            g_free(blk->dirty_bitmaps[i].buf);
>> +        }
>> +        g_free(blk->dirty_bitmaps);
>> +    }
>> +
>>       g_free(blk);
>>   }
>>
>> +static void blk_store_bitmaps(BlkMigBlock *blk)
>> +{
>> +    int i;
>> +    for (i = 0; i < blk->nr_bitmaps; ++i) {
>> +        bdrv_dbm_store_data(blk->dirty_bitmaps[i].bitmap,
>> +                            blk->dirty_bitmaps[i].buf,
>> +                            blk->sector, blk->nr_sectors);
>> +    }
>> +}
>> +
>
> I am worried about the case in which blk->nr_sectors is not an 
> integral multiple of the bitmap sector granularity, for reasons 
> discussed in the previous patches.
>
> I'm not positive it IS a problem either, but maybe you have already 
> thought about this.
Only allocated memory should be extended in the previous patch, as I've 
already proposed. As a general approach - it's ok, because we always 
calculate the real number of bit in bitmap from the virtual in the same 
way, therefore stored region will be restored in the same place.
>
>>   /* Must run outside of the iothread lock during the bulk phase,
>>    * or the VM will stall.
>>    */
>> @@ -146,9 +195,15 @@ static void blk_send(QEMUFile *f, BlkMigBlock * 
>> blk)
>>       int len;
>>       uint64_t flags = BLK_MIG_FLAG_DEVICE_BLOCK;
>>
>> -    if (block_mig_state.zero_blocks &&
>> +    if (block_mig_state.zero_blocks && blk->buf &&
>>           buffer_is_zero(blk->buf, BLOCK_SIZE)) {
>>           flags |= BLK_MIG_FLAG_ZERO_BLOCK;
>> +    } else if (blk->buf) {
>> +        flags |= BLK_MIG_FLAG_HAS_BLOCK;
>> +    }
>> +
>> +    if (block_mig_state.dbm_enable) {
>> +        flags |= BLK_MIG_FLAG_HAS_BITMAPS;
>>       }
>>
>>       /* sector number and flags */
>> @@ -160,10 +215,27 @@ static void blk_send(QEMUFile *f, BlkMigBlock * 
>> blk)
>>       qemu_put_byte(f, len);
>>       qemu_put_buffer(f, (uint8_t 
>> *)bdrv_get_device_name(blk->bmds->bs), len);
>>
>> +    /* dirty bitmaps */
>> +    if (flags & BLK_MIG_FLAG_HAS_BITMAPS) {
>> +        int i;
>> +        qemu_put_be32(f, blk->nr_bitmaps);
>> +        for (i = 0; i < blk->nr_bitmaps; ++i) {
>> +            BdrvDirtyBitmap *bm = blk->dirty_bitmaps[i].bitmap;
>> +            int buf_size = bdrv_dbm_data_size(bm, blk->nr_sectors);
>> +            const char *name = bdrv_dirty_bitmap_name(bm);
>> +            int len = strlen(name);
>> +
>> +            qemu_put_byte(f, len);
>> +            qemu_put_buffer(f, (const uint8_t *)name, len);
>
> No length in the stream for the size of the dirty bitmap buffer?
>
>> +            qemu_put_buffer(f, blk->dirty_bitmaps[i].buf, buf_size);
>> +        }
>> +    }
>> +
>
> Future patch idea: We can probably optimize this block by testing a 
> range of sectors on the bitmap and skipping bitmaps that don't have 
> data for this block.
>
> Since we populated the bitmaps previously, we should already have the 
> chance to have counted how many of them are nonempty, so we can use 
> that number instead of blk->nr_bitmaps.
>
> Not important for the series right now.
>
>>       /* if a block is zero we need to flush here since the network
>>        * bandwidth is now a lot higher than the storage device 
>> bandwidth.
>> -     * thus if we queue zero blocks we slow down the migration */
>> -    if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
>> +     * thus if we queue zero blocks we slow down the migration.
>> +     * also, skip writing block when migrate only dirty bitmaps. */
>> +    if (!(flags & BLK_MIG_FLAG_HAS_BLOCK)) {
>
> I'm not completely clear on the reasoning behind the original 
> codeblock here, but changing it from "Only when zeroes" to "Not when 
> we have data: Zeroes and Bitmaps" makes sense, since bitmap-only 
> sections are going to be very sparse too.
>
>>           qemu_fflush(f);
>>           return;
>>       }
>
> A little after this, there's a call to qemu_put_buffer(f, blk->buf, 
> BLOCK_SIZE), but we have the case where blk->buf may be NULL in bitmap 
> only cases. I think we need to guard against that.
in this case (flags & BLK_MIG_FLAG_HAS_BLOCK) should be zero. However an 
assert here won't hurt.
>
>> @@ -282,13 +354,20 @@ static int mig_save_device_bulk(QEMUFile *f, 
>> BlkMigDevState *bmds)
>>       BlockDriverState *bs = bmds->bs;
>>       BlkMigBlock *blk;
>>       int nr_sectors;
>> +    bool skip_chunk = false;
>>
>>       if (bmds->shared_base) {
>>           qemu_mutex_lock_iothread();
>> -        while (cur_sector < total_sectors &&
>> -               !bdrv_is_allocated(bs, cur_sector, 
>> MAX_IS_ALLOCATED_SEARCH,
>> -                                  &nr_sectors)) {
>> -            cur_sector += nr_sectors;
>> +        if (block_mig_state.dbm_enable) {
>> +            bdrv_is_allocated(bs, cur_sector, 
>> BDRV_SECTORS_PER_DIRTY_CHUNK,
>> +                              &nr_sectors);
>> +            skip_chunk = nr_sectors >= BDRV_SECTORS_PER_DIRTY_CHUNK;
>> +        } else {
>> +            while (cur_sector < total_sectors &&
>> +                   !bdrv_is_allocated(bs, cur_sector, 
>> MAX_IS_ALLOCATED_SEARCH,
>> +                                      &nr_sectors)) {
>> +                cur_sector += nr_sectors;
>> +            }
>
> OK, so the approach taken here is that if bitmaps are enabled, we 
> check only to see if this current chunk is allocated or not. If it 
> isn't, we declare this a "bitmap only" chunk and set skip_chunk to true.
>
> The else clause taken implies no bitmap data, because either 
> dbm_enable or blk_enable (or both) MUST be set for us to be here.
>
> (1) Why are you not checking the return value of bdrv_is_allocated? It 
> could return either true or false AND nr_sectors == 
> BDRV_SECTORS_PER_DIRTY_CHUNK, provided it was entirely allocated or 
> entirely unallocated, so I think this condition is not working as you 
> intend it to.
Yes, it's my mistake.
>
> (2) This seems slightly hackey in a sparse or no-data situation. We 
> will transmit many small data sections, each containing a chunk of 
> bitmap data, one at a time, with no data interspersed -- which leaves 
> a high metadata-to-data ratio in these cases.
>
> Would it be an involved process to alter the nr_sectors count to be 
> something that isn't a single sector in the (dbm_enable && 
> !blk_enable) case? That way we can spin until we find data worth 
> transferring and transfer all of the bitmap metadata in a larger chunk 
> all at once.
>
> Whatever approach you choose, it might be nice to have a comment 
> in-line explaining the general approach.
>
>>           }
>>           qemu_mutex_unlock_iothread();
>>       }
>> @@ -309,7 +388,8 @@ static int mig_save_device_bulk(QEMUFile *f, 
>> BlkMigDevState *bmds)
>>           nr_sectors = total_sectors - cur_sector;
>>       }
>>
>> -    blk = blk_create(bmds, cur_sector, nr_sectors);
>> +    blk = blk_create(bmds, cur_sector, nr_sectors,
>> +                     skip_chunk || !block_mig_state.blk_enable);
>>
>>       blk_mig_lock();
>>       block_mig_state.submitted++;
>> @@ -317,8 +397,16 @@ static int mig_save_device_bulk(QEMUFile *f, 
>> BlkMigDevState *bmds)
>>
>>       bdrv_reset_dirty_bitmap(bs, bmds->dirty_bitmap, cur_sector, 
>> nr_sectors);
>>
>> -    blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
>> -                                nr_sectors, blk_mig_read_cb, blk);
>> +    if (block_mig_state.dbm_enable) {
>> +        blk_store_bitmaps(blk);
>> +    }
>> +
>> +    if (block_mig_state.blk_enable && !skip_chunk) {
>> +        blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
>> +                                    nr_sectors, blk_mig_read_cb, blk);
>> +    } else if (block_mig_state.dbm_enable) {
>> +        blk_mig_read_cb(blk, 0);
>> +    }
>>
>>       bmds->cur_sector = cur_sector + nr_sectors;
>>       return (bmds->cur_sector >= total_sectors);
>> @@ -403,6 +491,8 @@ static void init_blk_migration(QEMUFile *f)
>>               DPRINTF("Start full migration for %s\n", 
>> bdrv_get_device_name(bs));
>>           }
>>
>> +        bmds->dirty_bitmaps = bdrv_dbm_find_all_named(bs, 
>> &bmds->nr_bitmaps);
>> +
>
> At this point, we should have block_mig_state.dbm_enable set (or 
> disabled), so we can skip this initialization if it is not set.
Agree.
>
>> QSIMPLEQ_INSERT_TAIL(&block_mig_state.bmds_list, bmds, entry);
>>       }
>>   }
>> @@ -481,20 +571,32 @@ static int mig_save_device_dirty(QEMUFile *f, 
>> BlkMigDevState *bmds,
>>               } else {
>>                   nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>               }
>> -            blk = blk_create(bmds, sector, nr_sectors);
>> +            blk = blk_create(bmds, sector, nr_sectors,
>> +                             !block_mig_state.blk_enable);
>> +
>> +            if (block_mig_state.dbm_enable) {
>> +                blk_store_bitmaps(blk);
>> +            }
>>
>>               if (is_async) {
>> -                blk->aiocb = bdrv_aio_readv(bmds->bs, sector, 
>> &blk->qiov,
>> -                                            nr_sectors, 
>> blk_mig_read_cb, blk);
>> +                if (block_mig_state.blk_enable) {
>> +                    blk->aiocb = bdrv_aio_readv(bmds->bs, sector, 
>> &blk->qiov,
>> +                                                nr_sectors, 
>> blk_mig_read_cb,
>> +                                                blk);
>> +                } else {
>> +                    blk_mig_read_cb(blk, 0);
>> +                }
>>
>>                   blk_mig_lock();
>>                   block_mig_state.submitted++;
>>                   bmds_set_aio_inflight(bmds, sector, nr_sectors, 1);
>>                   blk_mig_unlock();
>>               } else {
>> -                ret = bdrv_read(bmds->bs, sector, blk->buf, 
>> nr_sectors);
>> -                if (ret < 0) {
>> -                    goto error;
>> +                if (block_mig_state.blk_enable) {
>> +                    ret = bdrv_read(bmds->bs, sector, blk->buf, 
>> nr_sectors);
>> +                    if (ret < 0) {
>> +                        goto error;
>> +                    }
>>                   }
>>                   blk_send(f, blk);
>>
>> @@ -817,10 +919,39 @@ static int block_load(QEMUFile *f, void 
>> *opaque, int version_id)
>>                   nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>               }
>>
>> +            /* load dirty bitmaps */
>> +            if (flags & BLK_MIG_FLAG_HAS_BITMAPS) {
>> +                int i, nr_bitmaps = qemu_get_be32(f);
>> +
>> +                for (i = 0; i < nr_bitmaps; ++i) {
>> +                    char bitmap_name[256];
>> +                    BdrvDirtyBitmap *bitmap;
>> +                    int buf_size, len;
>> +
>> +                    len = qemu_get_byte(f);
>> +                    qemu_get_buffer(f, (uint8_t *)bitmap_name, len);
>> +                    bitmap_name[len] = '\0';
>> +
>> +                    bitmap = bdrv_find_dirty_bitmap(bs, bitmap_name);
>> +                    if (!bitmap) {
>> +                        fprintf(stderr, "Error unknown dirty bitmap\
>> +                                %s for block device %s\n",
>> +                                bitmap_name, device_name);
>> +                        return -EINVAL;
>> +                    }
>> +
>> +                    buf_size = bdrv_dbm_data_size(bitmap, nr_sectors);
>
> Oh, this is why you didn't store the length... Still, since it seems 
> as if you expect the bitmap to already exist on the destination, what 
> if the granularity or size is different? It might be nice to 
> explicitly state the size of the buffer -- one user misconfiguration 
> and you might blow the rest of the migration.
>
> This way, if the calculated size doesn't match the received size, you 
> have the chance to throw a meaningful error.
>
>> +                    buf = g_malloc(buf_size);
>> +                    qemu_get_buffer(f, buf, buf_size);
>> +                    bdrv_dbm_restore_data(bitmap, buf, addr, 
>> nr_sectors);
>> +                    g_free(buf);
>> +                }
>> +            }
>> +
>
> So with this approach, the user needs to have created the bitmaps 
> already? Would it be possible to create a mode where they don't, or am 
> I misreading something?
>
> We could re-create them on the fly whenever the offset is 0 on the 
> receiving side, the only other piece of information we'd really need 
> at that point is the granularity.
Agree.
>
>>               if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
>>                   ret = bdrv_write_zeroes(bs, addr, nr_sectors,
>>                                           BDRV_REQ_MAY_UNMAP);
>> -            } else {
>> +            } else if (flags & BLK_MIG_FLAG_HAS_BLOCK) {
>>                   buf = g_malloc(BLOCK_SIZE);
>>                   qemu_get_buffer(f, buf, BLOCK_SIZE);
>>                   ret = bdrv_write(bs, addr, buf, nr_sectors);
>> @@ -855,6 +986,7 @@ static void block_set_params(const 
>> MigrationParams *params, void *opaque)
>>   {
>>       block_mig_state.blk_enable = params->blk;
>>       block_mig_state.shared_base = params->shared;
>> +    block_mig_state.dbm_enable = params->dirty;
>>
>>       /* shared base means that blk_enable = 1 */
>>       block_mig_state.blk_enable |= params->shared;
>> @@ -862,7 +994,8 @@ static void block_set_params(const 
>> MigrationParams *params, void *opaque)
>>
>>   static bool block_is_active(void *opaque)
>>   {
>> -    return block_mig_state.blk_enable == 1;
>> +    return block_mig_state.blk_enable == 1 ||
>> +           block_mig_state.dbm_enable == 1;
>>   }
>>
>>   static SaveVMHandlers savevm_block_handlers = {
>> diff --git a/savevm.c b/savevm.c
>> index a598d1d..1299faa 100644
>> --- a/savevm.c
>> +++ b/savevm.c
>> @@ -983,6 +983,7 @@ int qemu_loadvm_state(QEMUFile *f)
>>           }
>>       }
>>
>> +    bdrv_dbm_restore_finish();
>
> Do we not have a cleaner way to call this? I guess we have no explicit 
> callback mechanisms for when a particular BDS is completely done, 
> because of the multiple passes over the data that we perform --
>
> Still, it might be nicer to have a callback for the whole of block 
> migration, and call bdrv_dbm_restore_finish() from there instead of 
> putting this implementation detail all the way up inside of 
> qemu_loadvm_state.
>
> Can we migrate an empty-data sentinel section from block migration 
> upon completion that qemu_loadvm_state launches a post_load method for 
> that can invoke bdrv_dbm_restore_finish for us?
>
> Alternatively, can this not be safely done inside of block_load? 
> somewhere around here:
>
> printf("Completed %d %%%c", (int)addr,
>                    (addr == 100) ? '\n' : '\r');
>             fflush(stdout);
>
> The "100%" completion flag is only written in block_save_complete(),
> so once we read this flag we should have necessary and sufficient 
> information to conclude that it's safe to do the BdrvDirtyBitmap 
> migration post-load stuff.
Good point, I think, I'll chose this approach.
>
>
> Of course, maybe generic section post-load callbacks are useful to 
> other devices, and I'd be fine with either approach.
>
> CCing migration people for opinions.
>
>>       cpu_synchronize_all_post_init();
>>
>>       ret = 0;
>>
>
>
> Thank you, and sorry it took me so long to digest the series!
> --John Snow
>
> P.S.: Happy New Year! I hope 2015 treats you well :~)


With all this story about not efficient migration in case of no block in 
blk, I think, that, may be, it would be better to refuse using block 
migration bitmap to migrate bitmaps, make additional dirty bitmaps for 
dirty bitmaps and migrate bitmaps in separate migration/dirty-bitmaps.c 
file (sorry for the pun).. This will additionally provide more precise 
migration in case of resetting the bitmap while migration in progress.