qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: qemu-block@nongnu.org
Cc: fam@euphon.net, kwolf@redhat.com, vsementsov@virtuozzo.com,
	qemu-devel@nongnu.org, mreitz@redhat.com, den@openvz.org,
	jsnow@redhat.com
Subject: [PATCH v2 4/5] block/mirror: support unaligned write in active mirror
Date: Fri, 11 Oct 2019 12:07:10 +0300	[thread overview]
Message-ID: <20191011090711.19940-5-vsementsov@virtuozzo.com> (raw)
In-Reply-To: <20191011090711.19940-1-vsementsov@virtuozzo.com>

Prior 9adc1cb49af8d do_sync_target_write had a bug: it reset aligned-up
region in the dirty bitmap, which means that we may not copy some bytes
and assume them copied, which actually leads to producing corrupted
target.

So 9adc1cb49af8d forced dirty bitmap granularity to be
request_alignment for mirror-top filter, so we are not working with
unaligned requests. However forcing large alignment obviously decreases
performance of unaligned requests.

This commit provides another solution for the problem: if unaligned
padding is already dirty, we can safely ignore it, as
1. It's dirty, it will be copied by mirror_iteration anyway
2. It's dirty, so skipping it now we don't increase dirtiness of the
   bitmap and therefore don't damage "synchronicity" of the
   write-blocking mirror.

If unaligned padding is not dirty, we just write it, no reason to touch
dirty bitmap if we succeed (on failure we'll set the whole region
ofcourse, but we loss "synchronicity" on failure anyway).

Note: we need to disable dirty_bitmap, otherwise we will not be able to
see in do_sync_target_write bitmap state before current operation. We
may of course check dirty bitmap before the operation in
bdrv_mirror_top_do_write and remember it, but we don't need active
dirty bitmap for write-blocking mirror anyway.

New code-path is unused until the following commit reverts
9adc1cb49af8d.

Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/mirror.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 68 insertions(+), 3 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 1ae57a5131..1ed1d79ff8 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1182,14 +1182,67 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
                      QEMUIOVector *qiov, int flags)
 {
     int ret;
+    size_t qiov_offset = 0;
+    int64_t bitmap_offset, bitmap_end;
 
-    bdrv_reset_dirty_bitmap(job->dirty_bitmap, offset, bytes);
+    if (!QEMU_IS_ALIGNED(offset, job->granularity) &&
+        bdrv_dirty_bitmap_get(job->dirty_bitmap, offset))
+    {
+            /*
+             * Dirty unaligned padding: ignore it.
+             *
+             * Reasoning:
+             * 1. If we copy it, we can't reset corresponding bit in
+             *    dirty_bitmap as there may be some "dirty" bytes still not
+             *    copied.
+             * 2. It's already dirty, so skipping it we don't diverge mirror
+             *    progress.
+             *
+             * Note, that because of this, guest write may have no contribution
+             * into mirror converge, but that's not bad, as we have background
+             * process of mirroring. If under some bad circumstances (high guest
+             * IO load) background process starve, we will not converge anyway,
+             * even if each write will contribute, as guest is not guaranteed to
+             * rewrite the whole disk.
+             */
+            qiov_offset = QEMU_ALIGN_UP(offset, job->granularity) - offset;
+            if (bytes <= qiov_offset) {
+                /* nothing to do after shrink */
+                return;
+            }
+            offset += qiov_offset;
+            bytes -= qiov_offset;
+    }
+
+    if (!QEMU_IS_ALIGNED(offset + bytes, job->granularity) &&
+        bdrv_dirty_bitmap_get(job->dirty_bitmap, offset + bytes - 1))
+    {
+        uint64_t tail = (offset + bytes) % job->granularity;
+
+        if (bytes <= tail) {
+            /* nothing to do after shrink */
+            return;
+        }
+        bytes -= tail;
+    }
+
+    /*
+     * Tails are either clean or shrunk, so for bitmap resetting
+     * we safely align the range down.
+     */
+    bitmap_offset = QEMU_ALIGN_UP(offset, job->granularity);
+    bitmap_end = QEMU_ALIGN_DOWN(offset + bytes, job->granularity);
+    if (bitmap_offset < bitmap_end) {
+        bdrv_reset_dirty_bitmap(job->dirty_bitmap, bitmap_offset,
+                                bitmap_end - bitmap_offset);
+    }
 
     job_progress_increase_remaining(&job->common.job, bytes);
 
     switch (method) {
     case MIRROR_METHOD_COPY:
-        ret = blk_co_pwritev(job->target, offset, bytes, qiov, flags);
+        ret = blk_co_pwritev_part(job->target, offset, bytes,
+                                  qiov, qiov_offset, flags);
         break;
 
     case MIRROR_METHOD_ZERO:
@@ -1211,7 +1264,16 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
     } else {
         BlockErrorAction action;
 
-        bdrv_set_dirty_bitmap(job->dirty_bitmap, offset, bytes);
+        /*
+         * We failed, so we should mark dirty the whole area, aligned up.
+         * Note that we don't care about shrunk tails if any: they were dirty
+         * at function start, and they must be still dirty, as we've locked
+         * the region for in-flight op.
+         */
+        bitmap_offset = QEMU_ALIGN_DOWN(offset, job->granularity);
+        bitmap_end = QEMU_ALIGN_UP(offset + bytes, job->granularity);
+        bdrv_set_dirty_bitmap(job->dirty_bitmap, bitmap_offset,
+                              bitmap_end - bitmap_offset);
         job->actively_synced = false;
 
         action = mirror_error_action(job, false, -ret);
@@ -1618,6 +1680,9 @@ static BlockJob *mirror_start_job(
     if (!s->dirty_bitmap) {
         goto fail;
     }
+    if (s->copy_mode == MIRROR_COPY_MODE_WRITE_BLOCKING) {
+        bdrv_disable_dirty_bitmap(s->dirty_bitmap);
+    }
 
     ret = block_job_add_bdrv(&s->common, "source", bs, 0,
                              BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
-- 
2.21.0



  parent reply	other threads:[~2019-10-11  9:12 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-11  9:07 [PATCH v2 0/5] active-mirror: support unaligned guest operations Vladimir Sementsov-Ogievskiy
2019-10-11  9:07 ` [PATCH v2 1/5] hbitmap: handle set/reset with zero length Vladimir Sementsov-Ogievskiy
2019-10-18 15:29   ` Max Reitz
2019-10-11  9:07 ` [PATCH v2 2/5] block/mirror: simplify do_sync_target_write Vladimir Sementsov-Ogievskiy
2019-10-11  9:07 ` [PATCH v2 3/5] block/block-backend: add blk_co_pwritev_part Vladimir Sementsov-Ogievskiy
2019-10-11  9:07 ` Vladimir Sementsov-Ogievskiy [this message]
2019-10-18 15:51   ` [PATCH v2 4/5] block/mirror: support unaligned write in active mirror Max Reitz
2019-10-11  9:07 ` [PATCH v2 5/5] Revert "mirror: Only mirror granularity-aligned chunks" Vladimir Sementsov-Ogievskiy
2019-10-18 15:54 ` [PATCH v2 0/5] active-mirror: support unaligned guest operations Max Reitz
2019-10-18 16:11   ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191011090711.19940-5-vsementsov@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=den@openvz.org \
    --cc=fam@euphon.net \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).