From: John Snow
To: Max Reitz, qemu-block@nongnu.org
Cc: kwolf@redhat.com, famz@redhat.com, qemu-devel@nongnu.org, armbru@redhat.com, vsementsov@parallels.com, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 10/21] qmp: Add support of "dirty-bitmap" sync mode for drive-backup
Date: Fri, 17 Apr 2015 12:21:09 -0400
Message-ID: <553132F5.2020605@redhat.com>
In-Reply-To: <553107F4.6040100@redhat.com>
References: <1428531604-9428-1-git-send-email-jsnow@redhat.com> <1428531604-9428-11-git-send-email-jsnow@redhat.com> <553107F4.6040100@redhat.com>

On 04/17/2015 09:17 AM, Max Reitz wrote:
> On 09.04.2015 00:19, John Snow wrote:
>> For the "dirty-bitmap" sync mode, the block job will iterate through
>> the given dirty bitmap to decide whether a sector needs backup (back
>> up all the dirty clusters and skip the clean ones), analogous to the
>> allocation checks of the "top" sync mode.
>>
>> Signed-off-by: Fam Zheng
>> Signed-off-by: John Snow
>> ---
>>  block.c                   |   9 +++
>>  block/backup.c            | 156 +++++++++++++++++++++++++++++++++++++++-------
>>  block/mirror.c            |   4 ++
>>  blockdev.c                |  18 +++++-
>>  hmp.c                     |   3 +-
>>  include/block/block.h     |   1 +
>>  include/block/block_int.h |   2 +
>>  qapi/block-core.json      |  13 ++--
>>  qmp-commands.hx           |   7 ++-
>>  9 files changed, 180 insertions(+), 33 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 9d30379..2367311 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -5717,6 +5717,15 @@ static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
>>      }
>>  }
>> +/**
>> + * Advance an HBitmapIter to an arbitrary offset.
>> + */
>> +void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
>> +{
>> +    assert(hbi->hb);
>> +    hbitmap_iter_init(hbi, hbi->hb, offset);
>> +}
>> +
>>  int64_t bdrv_get_dirty_count(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
>>  {
>>      return hbitmap_count(bitmap->bitmap);
>> diff --git a/block/backup.c b/block/backup.c
>> index 1c535b1..8513917 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -37,6 +37,8 @@ typedef struct CowRequest {
>>  typedef struct BackupBlockJob {
>>      BlockJob common;
>>      BlockDriverState *target;
>> +    /* bitmap for sync=dirty-bitmap */
>> +    BdrvDirtyBitmap *sync_bitmap;
>>      MirrorSyncMode sync_mode;
>>      RateLimit limit;
>>      BlockdevOnError on_source_error;
>> @@ -242,6 +244,92 @@ static void backup_complete(BlockJob *job, void *opaque)
>>      g_free(data);
>>  }
>> +static bool coroutine_fn yield_and_check(BackupBlockJob *job)
>> +{
>> +    if (block_job_is_cancelled(&job->common)) {
>> +        return true;
>> +    }
>> +
>> +    /* we need to yield so that qemu_aio_flush() returns.
>> +     * (without, VM does not reboot)
>> +     */
>> +    if (job->common.speed) {
>> +        uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
>> +                                                      job->sectors_read);
>> +        job->sectors_read = 0;
>> +        block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
>> +    } else {
>> +        block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
>> +    }
>> +
>> +    if (block_job_is_cancelled(&job->common)) {
>> +        return true;
>> +    }
>> +
>> +    return false;
>> +}
>> +
>> +static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
>> +{
>> +    bool error_is_read;
>> +    int ret = 0;
>> +    int clusters_per_iter;
>> +    uint32_t granularity;
>> +    int64_t sector;
>> +    int64_t cluster;
>> +    int64_t end;
>> +    int64_t last_cluster = -1;
>> +    BlockDriverState *bs = job->common.bs;
>> +    HBitmapIter hbi;
>> +
>> +    granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
>> +    clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
>
> DIV_ROUND_UP(granularity, BACKUP_CLUSTER_SIZE) would've worked, too
> (instead of the MAX()), but since both are powers of two, this is
> equivalent.
>

But this way we get to put your name in the source code.

>> +    bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
>> +
>> +    /* Find the next dirty sector(s) */
>> +    while ((sector = hbitmap_iter_next(&hbi)) != -1) {
>> +        cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
>> +
>> +        /* Fake progress updates for any clusters we skipped */
>> +        if (cluster != last_cluster + 1) {
>> +            job->common.offset += ((cluster - last_cluster - 1) *
>> +                                   BACKUP_CLUSTER_SIZE);
>> +        }
>> +
>> +        for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
>> +            if (yield_and_check(job)) {
>> +                return ret;
>> +            }
>> +
>> +            do {
>> +                ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
>> +                                    BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
>> +                if ((ret < 0) &&
>> +                    backup_error_action(job, error_is_read, -ret) ==
>> +                    BLOCK_ERROR_ACTION_REPORT) {
>> +                    return ret;
>> +                }
>
> Now that I'm reading this code again... The other backup implementation
> handles retries differently; it redoes the whole loop, with the
> effective difference being that it calls yield_and_check() before every
> retry. Would it make sense to move the yield_and_check() call into this
> loop?
>

Yes, I should be mindful of the case where we might have to copy many
clusters per dirty bit.

I don't think we lose anything by inserting it at the top of the
do{}while(), and we would potentially exit the loop more quickly when
the job is cancelled. (Rough sketch of what I have in mind below.)

>> +            } while (ret < 0);
>> +        }
>> +
>> +        /* If the bitmap granularity is smaller than the backup granularity,
>> +         * we need to advance the iterator pointer to the next cluster. */
>> +        if (granularity < BACKUP_CLUSTER_SIZE) {
>
> Actually, whenever BACKUP_CLUSTER_SIZE isn't a factor of granularity.
> Both are powers of two, though, so that's the case iff granularity <
> BACKUP_CLUSTER_SIZE. (thus, the condition is correct)
>
>> +            bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
>> +        }
>> +
>> +        last_cluster = cluster - 1;
>
> A bit awkward, but hey...
>

The inner for() advances cluster as an index, and it advances it one
extra time before exiting the loop. Either I manually do cluster-- or
just subtract one from last_cluster... either way I have to fiddle with
the index.

I could also use a separate index and only advance cluster once we make
it into the loop, but that looks awkward too, so I chose my poison.
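
Concretely, something along these lines for the retry loop (untested
sketch only; it reuses the names and error handling from the patch as
posted and is not meant as the final version):

    for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
        do {
            /* Check for cancellation and throttling before every
             * attempt, including retries, so a persistently failing
             * request cannot keep the job from being cancelled. */
            if (yield_and_check(job)) {
                return ret;
            }
            ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
                                BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
            if ((ret < 0) &&
                backup_error_action(job, error_is_read, -ret) ==
                BLOCK_ERROR_ACTION_REPORT) {
                return ret;
            }
        } while (ret < 0);
    }

The standalone yield_and_check() call at the top of the for() body goes
away, since the do{}while() now performs the same check on the first
pass as well.
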
> So, what's preventing me from giving an R-b is whether or not
> yield_and_check() should be moved.
>
> Max