From: Kevin Wolf <kwolf@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: jcody@redhat.com, eblake@redhat.com, qemu-devel@nongnu.org,
	stefanha@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH 27/47] block: introduce mirror job
Date: Thu, 13 Sep 2012 14:54:39 +0200
Message-ID: <5051D78F.90104@redhat.com>
In-Reply-To: <1343127865-16608-28-git-send-email-pbonzini@redhat.com>

On 24.07.2012 13:04, Paolo Bonzini wrote:
> This patch adds the implementation of a new job that mirrors a disk to
> a new image while letting the guest continue using the old image.
> The target is treated as a "black box" and data is copied from the
> source to the target in the background.  This can be used for several
> purposes, including storage migration, continuous replication, and
> observation of the guest I/O in an external program.  It is also a
> first step in replacing the inefficient block migration code that is
> part of QEMU.
> 
> The job is possibly never-ending, but it is logically structured into
> two phases: 1) copy all data as fast as possible until the target
> first gets in sync with the source; 2) keep target in sync and
> ensure that reopening to the target gets a correct (full) copy
> of the source data.
> 
> The second phase is indicated by the progress in "info block-jobs"
> reporting the current offset to be equal to the length of the file.
> When the job is cancelled in the second phase, QEMU will run the
> job until the source is clean and quiescent, then it will report
> successful completion of the job.
> 
> In other words, the BLOCK_JOB_CANCELLED event means that the target
> may _not_ be consistent with a past state of the source; the
> BLOCK_JOB_COMPLETED event means that the target is consistent with
> a past state of the source.  (Note that it could already happen
> that management lost the race against QEMU and got a completion
> event instead of cancellation).
> 
> It is not yet possible to complete the job and switch over to the target
> disk.  The next patches will fix this and add many refinements to the
> basic idea introduced here.  These include improved error management,
> some tunable knobs and performance optimizations.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/Makefile.objs |    2 +-
>  block/mirror.c      |  232 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  block_int.h         |   20 +++++
>  qapi-schema.json    |   17 ++++
>  trace-events        |    7 ++
>  5 files changed, 277 insertions(+), 1 deletion(-)
>  create mode 100644 block/mirror.c
> 
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index c45affc..f1a394a 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -9,4 +9,4 @@ block-obj-$(CONFIG_LIBISCSI) += iscsi.o
>  block-obj-$(CONFIG_CURL) += curl.o
>  block-obj-$(CONFIG_RBD) += rbd.o
>  
> -common-obj-y += stream.o
> +common-obj-y += stream.o mirror.o
> diff --git a/block/mirror.c b/block/mirror.c
> new file mode 100644
> index 0000000..f7d36f9
> --- /dev/null
> +++ b/block/mirror.c
> @@ -0,0 +1,232 @@
> +/*
> + * Image mirroring
> + *
> + * Copyright Red Hat, Inc. 2012
> + *
> + * Authors:
> + *  Paolo Bonzini  <pbonzini@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#include "trace.h"
> +#include "blockjob.h"
> +#include "block_int.h"
> +#include "qemu/ratelimit.h"
> +
> +enum {
> +    /*
> +     * Size of data buffer for populating the image file.  This should be large
> +     * enough to process multiple clusters in a single call, so that populating
> +     * contiguous regions of the image is efficient.
> +     */
> +    BLOCK_SIZE = 512 * BDRV_SECTORS_PER_DIRTY_CHUNK, /* in bytes */
> +};
> +
> +#define SLICE_TIME 100000000ULL /* ns */
> +
> +typedef struct MirrorBlockJob {
> +    BlockJob common;
> +    RateLimit limit;
> +    BlockDriverState *target;
> +    MirrorSyncMode mode;
> +    int64_t sector_num;
> +    uint8_t *buf;
> +} MirrorBlockJob;
> +
> +static int coroutine_fn mirror_iteration(MirrorBlockJob *s)
> +{
> +    BlockDriverState *source = s->common.bs;
> +    BlockDriverState *target = s->target;
> +    QEMUIOVector qiov;
> +    int ret, nb_sectors;
> +    int64_t end;
> +    struct iovec iov;
> +
> +    end = s->common.len >> BDRV_SECTOR_BITS;
> +    s->sector_num = bdrv_get_next_dirty(source, s->sector_num);
> +    nb_sectors = MIN(BDRV_SECTORS_PER_DIRTY_CHUNK, end - s->sector_num);
> +    bdrv_reset_dirty(source, s->sector_num, nb_sectors);
> +
> +    /* Copy the dirty cluster.  */
> +    iov.iov_base = s->buf;
> +    iov.iov_len  = nb_sectors * 512;
> +    qemu_iovec_init_external(&qiov, &iov, 1);
> +
> +    trace_mirror_one_iteration(s, s->sector_num, nb_sectors);
> +    ret = bdrv_co_readv(source, s->sector_num, nb_sectors, &qiov);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +    return bdrv_co_writev(target, s->sector_num, nb_sectors, &qiov);
> +}
> +
> +static void coroutine_fn mirror_run(void *opaque)
> +{
> +    MirrorBlockJob *s = opaque;
> +    BlockDriverState *bs = s->common.bs;
> +    int64_t sector_num, end;
> +    int ret = 0;
> +    int n;
> +    bool synced = false;
> +
> +    if (block_job_is_cancelled(&s->common)) {
> +        goto immediate_exit;
> +    }
> +
> +    s->common.len = bdrv_getlength(bs);
> +    if (s->common.len < 0) {
> +        block_job_completed(&s->common, s->common.len);
> +        return;
> +    }
> +
> +    end = s->common.len >> BDRV_SECTOR_BITS;
> +    s->buf = qemu_blockalign(bs, BLOCK_SIZE);
> +
> +    if (s->mode == MIRROR_SYNC_MODE_FULL || s->mode == MIRROR_SYNC_MODE_TOP) {

I think this is the common case, so s->mode != MIRROR_SYNC_MODE_NONE
might describe it better?
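
To illustrate, here is a minimal sketch showing that the two conditions
agree as long as there are exactly three modes. The SketchSyncMode enum
and both helper names are made up for this example and only mirror the
MIRROR_SYNC_MODE_* values used in the patch:

```c
#include <stdbool.h>

/* Hypothetical stand-in for MirrorSyncMode; not the real QAPI enum. */
typedef enum {
    SKETCH_SYNC_FULL,
    SKETCH_SYNC_TOP,
    SKETCH_SYNC_NONE,
} SketchSyncMode;

/* Condition as written in the patch: list the modes that need a scan. */
static bool scan_explicit(SketchSyncMode mode)
{
    return mode == SKETCH_SYNC_FULL || mode == SKETCH_SYNC_TOP;
}

/* Suggested form: everything except "none" needs the initial scan. */
static bool scan_negated(SketchSyncMode mode)
{
    return mode != SKETCH_SYNC_NONE;
}
```

The two forms would only diverge if another mode were added later; with
the negated form, a new mode would get the initial scan by default.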

> +        /* First part, loop on the sectors and initialize the dirty bitmap.  */
> +        BlockDriverState *base;
> +        base = s->mode == MIRROR_SYNC_MODE_FULL ? NULL : bs->backing_hd;
> +        for (sector_num = 0; sector_num < end; ) {
> +            int64_t next = (sector_num | (BDRV_SECTORS_PER_DIRTY_CHUNK - 1)) + 1;
> +            ret = bdrv_co_is_allocated_above(bs, base,
> +                                             sector_num, next - sector_num, &n);
> +
> +            if (ret < 0) {
> +                break;
> +            } else if (ret == 1) {
> +                bdrv_set_dirty(bs, sector_num, n);
> +                sector_num = next;
> +            } else {
> +                sector_num += n;
> +            }

Maybe it would be worth checking for n == 0 and returning an error in
that case. One example where this happens is when asking for the
allocation status after EOF. It shouldn't happen as long as
bdrv_truncate() is forbidden while the job runs, but an extra check
rarely hurts.
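
Roughly what I have in mind, as a standalone sketch rather than a patch.
fake_is_allocated() is a made-up stand-in for
bdrv_co_is_allocated_above() that reports n == 0 past EOF, and
CHUNK_SECTORS, image_end, scan_allocation() and the choice of -EIO are
assumptions of this example:

```c
#include <errno.h>
#include <stdint.h>

#define CHUNK_SECTORS 2048  /* stand-in for BDRV_SECTORS_PER_DIRTY_CHUNK */

/* Hypothetical image size in sectors, used only by the stub below. */
static int64_t image_end = 4096;

/*
 * Stub: every sector below image_end is allocated; past EOF the
 * function returns 0 with *n == 0, which is the case to guard against.
 */
static int fake_is_allocated(int64_t sector_num, int nb_sectors, int *n)
{
    if (sector_num >= image_end) {
        *n = 0;
        return 0;
    }
    int64_t left = image_end - sector_num;
    *n = left < nb_sectors ? (int)left : nb_sectors;
    return 1;
}

/*
 * Walk [0, end) chunk by chunk.  Returns the number of sectors that
 * would be marked dirty, or a negative errno.
 */
static int64_t scan_allocation(int64_t end)
{
    int64_t sector_num = 0, dirty = 0;
    int n, ret;

    while (sector_num < end) {
        int64_t next = (sector_num | (CHUNK_SECTORS - 1)) + 1;

        ret = fake_is_allocated(sector_num, (int)(next - sector_num), &n);
        if (ret < 0) {
            return ret;
        }
        if (n == 0) {
            /* No progress is possible: fail instead of looping forever. */
            return -EIO;
        }
        if (ret == 1) {
            dirty += n;
            sector_num = next;
        } else {
            sector_num += n;
        }
    }
    return dirty;
}
```

Without the n == 0 check, the unallocated branch would add zero to
sector_num and the loop would never terminate.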

> +        }
> +    }
> +
> +    if (ret < 0) {
> +        goto immediate_exit;
> +    }

Why not goto immediate_exit directly inside the loop, instead of using a
break; whose only purpose is to get here?

> +
> +    s->sector_num = -1;
> +    for (;;) {
> +        uint64_t delay_ns;
> +        int64_t cnt;
> +        bool should_complete;
> +
> +        cnt = bdrv_get_dirty_count(bs);
> +        if (cnt != 0) {
> +            ret = mirror_iteration(s);
> +            if (ret < 0) {
> +                break;

goto immediate_exit? It's the same now, but code after the loop may be
added in the future.
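
A small standalone sketch of the pattern, not taken from the patch.
SketchJob, sketch_iteration() and sketch_run() are hypothetical names;
the point is only that an error jumps straight to the cleanup label and
skips whatever later gets added after the loop:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical job state, loosely modelled on MirrorBlockJob. */
typedef struct SketchJob {
    int iterations;      /* number of iterations to run */
    int fail_at;         /* iteration index that fails, or -1 for none */
    bool post_loop_ran;  /* set by code that runs after the main loop */
    bool cleaned_up;     /* set by the shared cleanup path */
} SketchJob;

/* Stand-in for mirror_iteration(): fails on the configured iteration. */
static int sketch_iteration(SketchJob *s, int i)
{
    return i == s->fail_at ? -EIO : 0;
}

static int sketch_run(SketchJob *s)
{
    int ret = 0;
    unsigned char *buf = malloc(4096);

    if (!buf) {
        ret = -ENOMEM;
        goto immediate_exit;
    }

    for (int i = 0; i < s->iterations; i++) {
        ret = sketch_iteration(s, i);
        if (ret < 0) {
            /* Jump to cleanup directly; do not fall out of the loop. */
            goto immediate_exit;
        }
    }

    /* Code added after the loop only runs on the success path. */
    s->post_loop_ran = true;

immediate_exit:
    free(buf);           /* free(NULL) is a no-op */
    s->cleaned_up = true;
    return ret;
}
```

With a plain break, the post-loop code would also run after a failed
iteration unless it rechecked ret itself.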

> +            }
> +            cnt = bdrv_get_dirty_count(bs);
> +        }
> +
> +        if (cnt != 0) {
> +            should_complete = false;
> +        } else {
> +            trace_mirror_before_flush(s);
> +            bdrv_flush(s->target);

No error handling? The return value of bdrv_flush() is ignored here.
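
Something along these lines, as a sketch only. fake_flush() is a made-up
stand-in for bdrv_flush() that returns 0 or a negative errno, and
sketch_check_synced() and its return convention are assumptions of this
example, not code from the series:

```c
#include <errno.h>
#include <stdint.h>

/* Result that the stubbed flush will report; tests can change it. */
static int fake_flush_result = 0;

/* Hypothetical stand-in for bdrv_flush(s->target). */
static int fake_flush(void)
{
    return fake_flush_result;
}

/*
 * Returns 1 if the target is in sync, 0 if dirty sectors remain, or a
 * negative errno if flushing the target failed.
 */
static int sketch_check_synced(int64_t dirty_count)
{
    int ret;

    if (dirty_count != 0) {
        return 0;
    }

    ret = fake_flush();
    if (ret < 0) {
        /* A failed flush means the target may not be consistent. */
        return ret;
    }
    return 1;
}
```

The caller would treat a negative result like any other job error
instead of reporting the target as synced.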

Kevin
