qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 05/12] mirror: perform COW if the cluster size is bigger than the granularity
Date: Fri, 18 Jan 2013 16:13:14 +0100	[thread overview]
Message-ID: <50F9668A.4080308@redhat.com> (raw)
In-Reply-To: <1358357479-7912-6-git-send-email-pbonzini@redhat.com>

Am 16.01.2013 18:31, schrieb Paolo Bonzini:
> When mirroring runs, the backing files for the target may not yet be
> ready.  However, this means that a copy-on-write operation on the target
> would fill the missing sectors with zeros.  Copy-on-write only happens
> if the granularity of the dirty bitmap is smaller than the cluster size
> (and only for clusters that are allocated in the source after the job
> has started copying).  So far, the granularity was fixed to 1MB; to avoid
> the problem we detected the situation and required the backing files to
> be available in that case only.
> 
> However, we want to lower the granularity for efficiency, so we need
> a better solution.  The solution is to always copy a whole cluster the
> first time it is touched.  The code keeps a bitmap of clusters that
> have already been allocated by the mirroring job, and only does "manual"
> copy-on-write if the chunk being copied is zero in the bitmap.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>         v1->v2: rebased for moved include files
> 
>  block/mirror.c             |   60 +++++++++++++++++++++++++++++++++++++------
>  blockdev.c                 |   15 ++---------
>  tests/qemu-iotests/041     |   21 +++++++++++++++
>  tests/qemu-iotests/041.out |    4 +-
>  trace-events               |    1 +
>  5 files changed, 78 insertions(+), 23 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 20cb1e7..ee45e2e 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -15,6 +15,7 @@
>  #include "block/blockjob.h"
>  #include "block/block_int.h"
>  #include "qemu/ratelimit.h"
> +#include "qemu/bitmap.h"
>  
>  enum {
>      /*
> @@ -36,6 +37,8 @@ typedef struct MirrorBlockJob {
>      bool synced;
>      bool should_complete;
>      int64_t sector_num;
> +    size_t buf_size;
> +    unsigned long *cow_bitmap;
>      HBitmapIter hbi;
>      uint8_t *buf;
>  } MirrorBlockJob;
> @@ -60,7 +63,7 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
>      BlockDriverState *target = s->target;
>      QEMUIOVector qiov;
>      int ret, nb_sectors;
> -    int64_t end;
> +    int64_t end, sector_num, cluster_num;
>      struct iovec iov;
>  
>      s->sector_num = hbitmap_iter_next(&s->hbi);
> @@ -71,22 +74,41 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
>          assert(s->sector_num >= 0);
>      }
>  
> +    /* If we have no backing file yet in the destination, and the cluster size
> +     * is very large, we need to do COW ourselves.  The first time a cluster is
> +     * copied, copy it entirely.
> +     *
> +     * Because both BDRV_SECTORS_PER_DIRTY_CHUNK and the cluster size are
> +     * powers of two, the number of sectors to copy cannot exceed one cluster.
> +     */
> +    sector_num = s->sector_num;
> +    nb_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
> +    cluster_num = sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK;
> +    if (s->cow_bitmap && !test_bit(cluster_num, s->cow_bitmap)) {
> +        trace_mirror_cow(s, sector_num);
> +        bdrv_round_to_clusters(s->target,
> +                               sector_num, BDRV_SECTORS_PER_DIRTY_CHUNK,
> +                               &sector_num, &nb_sectors);
> +        bitmap_set(s->cow_bitmap, sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK,
> +                   nb_sectors / BDRV_SECTORS_PER_DIRTY_CHUNK);

Here the bit in the cow_bitmap is set before the COW has actually been
performed. It could still fail.

> +    }
> +
>      end = s->common.len >> BDRV_SECTOR_BITS;
> -    nb_sectors = MIN(BDRV_SECTORS_PER_DIRTY_CHUNK, end - s->sector_num);
> -    bdrv_reset_dirty(source, s->sector_num, nb_sectors);
> +    nb_sectors = MIN(nb_sectors, end - sector_num);
> +    bdrv_reset_dirty(source, sector_num, nb_sectors);
>  
>      /* Copy the dirty cluster.  */
>      iov.iov_base = s->buf;
>      iov.iov_len  = nb_sectors * 512;
>      qemu_iovec_init_external(&qiov, &iov, 1);
>  
> -    trace_mirror_one_iteration(s, s->sector_num, nb_sectors);
> -    ret = bdrv_co_readv(source, s->sector_num, nb_sectors, &qiov);
> +    trace_mirror_one_iteration(s, sector_num, nb_sectors);
> +    ret = bdrv_co_readv(source, sector_num, nb_sectors, &qiov);
>      if (ret < 0) {
>          *p_action = mirror_error_action(s, true, -ret);
>          goto fail;
>      }
> -    ret = bdrv_co_writev(target, s->sector_num, nb_sectors, &qiov);
> +    ret = bdrv_co_writev(target, sector_num, nb_sectors, &qiov);
>      if (ret < 0) {
>          *p_action = mirror_error_action(s, false, -ret);
>          s->synced = false;
> @@ -96,7 +118,7 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
>  
>  fail:
>      /* Try again later.  */
> -    bdrv_set_dirty(source, s->sector_num, nb_sectors);
> +    bdrv_set_dirty(source, sector_num, nb_sectors);

If it does, we mark the whole cluster dirty now, but in the cow_bitmap
it's still marked at present on the target. When restarting the job,
wouldn't it copy only the start of the cluster next time and corrupt the
rest of it?

Kevin

  reply	other threads:[~2013-01-18 15:13 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-16 17:31 [Qemu-devel] [PATCH v2 00/12] Drive mirroring performance improvements Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 01/12] host-utils: add ffsl Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 02/12] add hierarchical bitmap data type and test cases Paolo Bonzini
2013-01-16 22:50   ` Eric Blake
2013-01-18 13:21   ` Kevin Wolf
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 03/12] block: implement dirty bitmap using HBitmap Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 04/12] block: make round_to_clusters public Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 05/12] mirror: perform COW if the cluster size is bigger than the granularity Paolo Bonzini
2013-01-18 15:13   ` Kevin Wolf [this message]
2013-01-18 16:22     ` Paolo Bonzini
2013-01-18 17:05       ` Kevin Wolf
2013-01-18 17:33         ` Paolo Bonzini
2013-01-21 10:17           ` Kevin Wolf
2013-01-21 11:15             ` Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 06/12] block: return count of dirty sectors, not chunks Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 07/12] block: allow customizing the granularity of the dirty bitmap Paolo Bonzini
2013-01-16 23:39   ` Eric Blake
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 08/12] mirror: allow customizing the granularity Paolo Bonzini
2013-01-16 23:44   ` Eric Blake
2013-01-21 11:00   ` Kevin Wolf
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 09/12] mirror: switch mirror_iteration to AIO Paolo Bonzini
2013-01-21 11:39   ` Kevin Wolf
2013-01-21 12:09     ` Paolo Bonzini
2013-01-21 12:15       ` Kevin Wolf
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 10/12] mirror: add buf-size argument to drive-mirror Paolo Bonzini
2013-01-16 23:46   ` Eric Blake
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 11/12] mirror: support more than one in-flight AIO operation Paolo Bonzini
2013-01-21 12:35   ` Kevin Wolf
2013-01-21 12:55     ` Paolo Bonzini
2013-01-16 17:31 ` [Qemu-devel] [PATCH v2 12/12] mirror: support arbitrarily-sized iterations Paolo Bonzini
2013-01-16 23:48 ` [Qemu-devel] [PATCH v2 00/12] Drive mirroring performance improvements Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50F9668A.4080308@redhat.com \
    --to=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).