From: Max Reitz <mreitz@redhat.com>
To: qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: [PULL 11/69] block/mirror: support unaligned write in active mirror
Date: Mon, 28 Oct 2019 13:14:03 +0100
Message-ID: <20191028121501.15279-12-mreitz@redhat.com>
In-Reply-To: <20191028121501.15279-1-mreitz@redhat.com>

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Prior to 9adc1cb49af8d, do_sync_target_write had a bug: it reset the
aligned-up region in the dirty bitmap, which means that we could skip
copying some bytes while assuming they had been copied, which in turn
leads to a corrupted target.
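
The arithmetic behind that bug can be sketched in a few lines of standalone
C. This is only an illustration, not QEMU code: ByteRange, range_outward and
range_inward are hypothetical names, and the sketch assumes that resetting an
unaligned range clears every bitmap chunk the range touches.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open byte range [start, end); hypothetical helper type. */
typedef struct ByteRange {
    int64_t start;
    int64_t end;
} ByteRange;

static int64_t align_down(int64_t x, int64_t g) { return x - x % g; }
static int64_t align_up(int64_t x, int64_t g) { return align_down(x + g - 1, g); }

/* Every chunk touched by the write: what an "aligned-up" reset clears. */
static ByteRange range_outward(int64_t offset, int64_t bytes, int64_t g)
{
    assert(g > 0 && offset >= 0 && bytes > 0);
    ByteRange r = { align_down(offset, g), align_up(offset + bytes, g) };
    return r;
}

/* Only the chunks fully covered by the write: always safe to clear. */
static ByteRange range_inward(int64_t offset, int64_t bytes, int64_t g)
{
    assert(g > 0 && offset >= 0 && bytes > 0);
    ByteRange r = { align_up(offset, g), align_down(offset + bytes, g) };
    if (r.end < r.start) {
        r.end = r.start; /* no chunk is fully covered: empty range */
    }
    return r;
}
```

With a granularity of 4096, a 100-byte write at offset 100 touches the chunk
[0, 4096) but fully covers no chunk at all. Clearing the outward range would
mark 3996 bytes as copied that were never written to the target.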

So 9adc1cb49af8d forced the dirty bitmap granularity to be the
request_alignment for the mirror-top filter, so that we never work with
unaligned requests. However, forcing a large alignment obviously decreases
the performance of unaligned requests.

This commit provides another solution for the problem: if unaligned
padding is already dirty, we can safely ignore it, as
1. It's dirty, it will be copied by mirror_iteration anyway
2. It's dirty, so by skipping it now we don't increase the dirtiness of
   the bitmap and therefore don't damage the "synchronicity" of the
   write-blocking mirror.

If the unaligned padding is not dirty, we just write it; there is no reason
to touch the dirty bitmap if we succeed (on failure we'll set the whole
region, of course, but we lose "synchronicity" on failure anyway).
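
The head and tail handling described above can be sketched as standalone C.
This is a simplified model under stated assumptions, not the code from
block/mirror.c: ToyBitmap, toy_get, toy_shrink, TOY_GRANULARITY and
TOY_CHUNKS are all hypothetical, and the bitmap is a plain array with one
flag per chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_GRANULARITY 4096 /* hypothetical chunk size in bytes */
#define TOY_CHUNKS 8         /* hypothetical device size in chunks */

typedef struct ToyBitmap {
    bool dirty[TOY_CHUNKS];
} ToyBitmap;

/* Is the chunk containing byte 'offset' dirty? */
static bool toy_get(const ToyBitmap *bm, int64_t offset)
{
    assert(offset >= 0 && offset / TOY_GRANULARITY < TOY_CHUNKS);
    return bm->dirty[offset / TOY_GRANULARITY];
}

/*
 * Drop an unaligned head and/or tail whose chunk is already dirty.
 * Returns false if nothing is left to write (outputs are then
 * unspecified). '*skipped_head' is how far to skip into the data buffer.
 */
static bool toy_shrink(const ToyBitmap *bm, int64_t *offset, int64_t *bytes,
                       int64_t *skipped_head)
{
    const int64_t g = TOY_GRANULARITY;

    *skipped_head = 0;
    if (*offset % g != 0 && toy_get(bm, *offset)) {
        int64_t head = g - *offset % g;
        if (*bytes <= head) {
            return false;
        }
        *offset += head;
        *bytes -= head;
        *skipped_head = head;
    }
    if ((*offset + *bytes) % g != 0 && toy_get(bm, *offset + *bytes - 1)) {
        int64_t tail = (*offset + *bytes) % g;
        if (*bytes <= tail) {
            return false;
        }
        *bytes -= tail;
    }
    return true;
}
```

For example, with chunks 0 and 2 dirty, a write of 8392 bytes at offset 100
shrinks to exactly the middle chunk: offset 4096, 4096 bytes, with the first
3996 bytes of the buffer skipped. With a clean bitmap the same request is
passed through unchanged.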

Note: we need to disable the dirty bitmap; otherwise,
do_sync_target_write would not be able to see the bitmap state from
before the current operation. We could of course check the dirty bitmap
before the operation in bdrv_mirror_top_do_write and remember it, but we
don't need an active dirty bitmap for the write-blocking mirror anyway.

The new code path is unused until the following commit reverts
9adc1cb49af8d.

Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20191011090711.19940-5-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/mirror.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 68 insertions(+), 3 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 351faf9367..11d4d66f43 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1182,14 +1182,67 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
                      QEMUIOVector *qiov, int flags)
 {
     int ret;
+    size_t qiov_offset = 0;
+    int64_t bitmap_offset, bitmap_end;
 
-    bdrv_reset_dirty_bitmap(job->dirty_bitmap, offset, bytes);
+    if (!QEMU_IS_ALIGNED(offset, job->granularity) &&
+        bdrv_dirty_bitmap_get(job->dirty_bitmap, offset))
+    {
+        /*
+         * Dirty unaligned padding: ignore it.
+         *
+         * Reasoning:
+         * 1. If we copy it, we can't reset corresponding bit in
+         *    dirty_bitmap as there may be some "dirty" bytes still not
+         *    copied.
+         * 2. It's already dirty, so skipping it we don't diverge mirror
+         *    progress.
+         *
+         * Note, that because of this, guest write may have no contribution
+         * into mirror converge, but that's not bad, as we have background
+         * process of mirroring. If under some bad circumstances (high guest
+         * IO load) background process starve, we will not converge anyway,
+         * even if each write will contribute, as guest is not guaranteed to
+         * rewrite the whole disk.
+         */
+        qiov_offset = QEMU_ALIGN_UP(offset, job->granularity) - offset;
+        if (bytes <= qiov_offset) {
+            /* nothing to do after shrink */
+            return;
+        }
+        offset += qiov_offset;
+        bytes -= qiov_offset;
+    }
+
+    if (!QEMU_IS_ALIGNED(offset + bytes, job->granularity) &&
+        bdrv_dirty_bitmap_get(job->dirty_bitmap, offset + bytes - 1))
+    {
+        uint64_t tail = (offset + bytes) % job->granularity;
+
+        if (bytes <= tail) {
+            /* nothing to do after shrink */
+            return;
+        }
+        bytes -= tail;
+    }
+
+    /*
+     * Tails are either clean or shrunk, so for bitmap resetting
+     * we safely align the range down.
+     */
+    bitmap_offset = QEMU_ALIGN_UP(offset, job->granularity);
+    bitmap_end = QEMU_ALIGN_DOWN(offset + bytes, job->granularity);
+    if (bitmap_offset < bitmap_end) {
+        bdrv_reset_dirty_bitmap(job->dirty_bitmap, bitmap_offset,
+                                bitmap_end - bitmap_offset);
+    }
 
     job_progress_increase_remaining(&job->common.job, bytes);
 
     switch (method) {
     case MIRROR_METHOD_COPY:
-        ret = blk_co_pwritev(job->target, offset, bytes, qiov, flags);
+        ret = blk_co_pwritev_part(job->target, offset, bytes,
+                                  qiov, qiov_offset, flags);
         break;
 
     case MIRROR_METHOD_ZERO:
@@ -1211,7 +1264,16 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
     } else {
         BlockErrorAction action;
 
-        bdrv_set_dirty_bitmap(job->dirty_bitmap, offset, bytes);
+        /*
+         * We failed, so we should mark dirty the whole area, aligned up.
+         * Note that we don't care about shrunk tails if any: they were dirty
+         * at function start, and they must be still dirty, as we've locked
+         * the region for in-flight op.
+         */
+        bitmap_offset = QEMU_ALIGN_DOWN(offset, job->granularity);
+        bitmap_end = QEMU_ALIGN_UP(offset + bytes, job->granularity);
+        bdrv_set_dirty_bitmap(job->dirty_bitmap, bitmap_offset,
+                              bitmap_end - bitmap_offset);
         job->actively_synced = false;
 
         action = mirror_error_action(job, false, -ret);
@@ -1618,6 +1680,9 @@ static BlockJob *mirror_start_job(
     if (!s->dirty_bitmap) {
         goto fail;
     }
+    if (s->copy_mode == MIRROR_COPY_MODE_WRITE_BLOCKING) {
+        bdrv_disable_dirty_bitmap(s->dirty_bitmap);
+    }
 
     ret = block_job_add_bdrv(&s->common, "source", bs, 0,
                              BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
-- 
2.21.0


