qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, mreitz@redhat.com, kwolf@redhat.com,
	vsementsov@virtuozzo.com, den@openvz.org
Subject: [PATCH v5 6/6] qcow2: do not discard host clusters during in-flight writes
Date: Fri, 26 Mar 2021 23:00:45 +0300	[thread overview]
Message-ID: <20210326200045.363290-7-vsementsov@virtuozzo.com> (raw)
In-Reply-To: <20210326200045.363290-1-vsementsov@virtuozzo.com>

Finally, after all the preparations, when we have the whole
infrastructure of inflight-write-counters prepared in previous commits,
let's fix the following bug:

1. Start write to qcow2. Assume guest cluster G and corresponding host
   cluster is H.

2. The write requests come to the point of data writing to .file. The
   write to .file is started and qcow2 mutex is unlocked.

3. At this time refcount of H becomes 0. For example, it may be due to
   discard operation on qcow2 node, or rewriting compressed data by
   normal write, or some operation with snapshots..

4. Next, some operations occurs and leads to allocation of H for some
   other needs: it may be another write-to-qcow2-node operation, or
   allocation of L2 table or some other data or metadata cluster
   allocation.

5. So, at this point H is used for something other. Assume, L2 table is
   written into H.

6. And now, our write from [2] finishes. And pollutes L2 table in H.
   That's a bug.

To fix the bug we now have inflight-write-counters, which work in a
way that cluster is not "free" (and therefore will not be reused
and we don't fall into use-after-free described above) until both
refcount and inflight-writes-counter are zero for this cluster.

So, we have the only remaining action: actually call
qcow2_inflight_writes_inc() and qcow2_inflight_writes_dec() in
corresponding places.

Note: both functions don't rely on s->lock being held or not. But we
should call qcow2_inflight_writes_inc() in the same s->lock critical
section where we allocated host cluster (or get an offset of existing
cluster from metadata). Otherwise we risk that discard will interrupt
at s->lock unlocking (it should only schedule other coroutines to be
entered on next our yield, but let's be absolutely safe).

Iotest qcow2-discard-during-rewrite is enabled, as it works now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c                                 | 21 ++++++++++++++++++-
 .../tests/qcow2-discard-during-rewrite        |  2 +-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 0a5bd4ea4e..d021c9d3ac 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2550,6 +2550,9 @@ out_unlocked:
 
 out_locked:
     qcow2_handle_l2meta(bs, &l2meta, false);
+
+    qcow2_inflight_writes_dec(bs, host_offset, bytes);
+
     qemu_co_mutex_unlock(&s->lock);
 
     qemu_vfree(crypt_buf);
@@ -2609,6 +2612,8 @@ static coroutine_fn int qcow2_co_pwritev_part(
             goto out_locked;
         }
 
+        qcow2_inflight_writes_inc(bs, host_offset, cur_bytes);
+
         qemu_co_mutex_unlock(&s->lock);
 
         if (!aio && cur_bytes != bytes) {
@@ -4099,10 +4104,17 @@ qcow2_co_copy_range_to(BlockDriverState *bs,
             goto fail;
         }
 
+        qcow2_inflight_writes_inc(bs, host_offset, cur_bytes);
+
         qemu_co_mutex_unlock(&s->lock);
+
         ret = bdrv_co_copy_range_to(src, src_offset, s->data_file, host_offset,
                                     cur_bytes, read_flags, write_flags);
+
         qemu_co_mutex_lock(&s->lock);
+
+        qcow2_inflight_writes_dec(bs, host_offset, cur_bytes);
+
         if (ret < 0) {
             goto fail;
         }
@@ -4538,13 +4550,20 @@ qcow2_co_pwritev_compressed_task(BlockDriverState *bs,
     }
 
     ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset, out_len, true);
-    qemu_co_mutex_unlock(&s->lock);
     if (ret < 0) {
+        qemu_co_mutex_unlock(&s->lock);
         goto fail;
     }
 
+    qcow2_inflight_writes_inc(bs, cluster_offset, out_len);
+
+    qemu_co_mutex_unlock(&s->lock);
+
     BLKDBG_EVENT(s->data_file, BLKDBG_WRITE_COMPRESSED);
     ret = bdrv_co_pwrite(s->data_file, cluster_offset, out_len, out_buf, 0);
+
+    qcow2_inflight_writes_dec(bs, cluster_offset, out_len);
+
     if (ret < 0) {
         goto fail;
     }
diff --git a/tests/qemu-iotests/tests/qcow2-discard-during-rewrite b/tests/qemu-iotests/tests/qcow2-discard-during-rewrite
index 7f0d8a107a..2e2e0d2cb0 100755
--- a/tests/qemu-iotests/tests/qcow2-discard-during-rewrite
+++ b/tests/qemu-iotests/tests/qcow2-discard-during-rewrite
@@ -1,5 +1,5 @@
 #!/usr/bin/env bash
-# group: quick disabled
+# group: quick
 #
 # Test discarding (and reusing) host cluster during writing data to it.
 #
-- 
2.29.2



      parent reply	other threads:[~2021-03-26 20:55 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-26 20:00 [PATCH v5 0/6] qcow2: fix parallel rewrite and discard (lockless) Vladimir Sementsov-Ogievskiy
2021-03-26 20:00 ` [PATCH v5 1/6] iotests: add qcow2-discard-during-rewrite Vladimir Sementsov-Ogievskiy
2021-03-26 20:00 ` [PATCH v5 2/6] qcow2: fix cache discarding in update_refcount() Vladimir Sementsov-Ogievskiy
2021-04-07 16:34   ` Alberto Garcia
2021-03-26 20:00 ` [PATCH v5 3/6] qcow2: introduce is_cluster_free() helper Vladimir Sementsov-Ogievskiy
2021-03-26 20:00 ` [PATCH v5 4/6] qcow2: introduce inflight-write-counters Vladimir Sementsov-Ogievskiy
2021-03-30 12:05   ` Vladimir Sementsov-Ogievskiy
2021-03-26 20:00 ` [PATCH v5 5/6] qcow2: consider in-flight-writes when freeing clusters Vladimir Sementsov-Ogievskiy
2021-03-26 20:00 ` Vladimir Sementsov-Ogievskiy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210326200045.363290-7-vsementsov@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).