* [PATCH 0/4] improve block_status() for cbw + snapshot setup
@ 2025-05-13 1:32 Andrey Zhadchenko
2025-05-13 1:32 ` [PATCH 1/4] hbitmap: drop meta bitmap leftovers Andrey Zhadchenko
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Andrey Zhadchenko @ 2025-05-13 1:32 UTC (permalink / raw)
To: qemu-block, vsementsov, eblake
Cc: jsnow, kwolf, hreitz, qemu-devel, andrey.drobyshev, den
Recently we had some reports about stuck full backups: the running
backup job made no progress for more than an hour. This reproduced
only on large images, with at least 30TB of allocated space.
We are using snapshot filter + cbw filter + blockdev-backup. The
problem we discovered lies in the access bitmap of the cbw filter.
To build the bcs bitmap for the backup, the job repeatedly asks for
block status with a range (X, virtual_size). As we are accessing the
disk via the snapshot filter, we end up in cbw_co_snapshot_block_status()->
cbw_snapshot_read_lock(). Here we check that the access bitmap does
not have any zeroes in this range. After that, we ask the disk.
Firstly, we always look for a zero in the access bitmap over the
whole range. This is rather slow, because the bitmap is fully set and
we need to scan every bit at the lowest level. Secondly, the following
block status call to the disk may return a rather small number of
contiguous clusters, for example 512 (an empirical value; presumably
for the 'full' image it is 4K/clu_addr_size for the qcow2 driver).
This way we are doomed to re-scan the access bitmap on the next block
status request (X + 512, virtual_size).
perf tracing example:
96.67% 9581 qemu-kvm qemu-kvm [.] hbitmap_next_zero
0.33% 32 qemu-kvm qemu-kvm [.] qcow2_cache_do_get.lto_priv.0
0.10% 10 qemu-kvm [kernel.vmlinux] [k] perf_adjust_freq_unthr_context
0.08% 8 qemu-kvm qemu-kvm [.] qcow2_get_host_offset
This can be clearly observed on an image with a small clu_size=65536
and preallocated metadata:
size 10T 11T
blockdev-backup 52s 57s
cbw + snap 325s 413s
cbw + snap + patches 55s 61s
The growth is also non-linear in this case: +10% size results in
+20% time.
This patchset changes the access bitmap into a 'deny' bitmap by
reversing the bits within it and making the code look for set bits
instead. That is much faster thanks to the hbitmap levels.
Also
- update iotest 257: now the access bitmap on the cbw filter is empty,
so count is 0 instead of 67108864 (which is the image size)
- remove meta bitmap leftovers
- report block_status() up to the end of the accessible section to the
snapshot filter, instead of returning EINVAL on big requests
Andrey Zhadchenko (4):
hbitmap: drop meta bitmap leftovers
hbitmap: introduce hbitmap_reverse()
block/copy-before-write: reverse access bitmap
block/copy-before-write: report partial block status to snapshot
block/copy-before-write.c | 33 +++++++++++++++++++++------------
block/dirty-bitmap.c | 9 +++++++++
include/block/block_int-io.h | 1 +
include/qemu/hbitmap.h | 8 ++++++++
tests/qemu-iotests/257.out | 28 ++++++++++++++--------------
util/hbitmap.c | 32 +++++++++++++++++---------------
6 files changed, 70 insertions(+), 41 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/4] hbitmap: drop meta bitmap leftovers
2025-05-13 1:32 [PATCH 0/4] improve block_status() for cbw + snapshot setup Andrey Zhadchenko
@ 2025-05-13 1:32 ` Andrey Zhadchenko
2025-05-20 16:27 ` Eric Blake
2025-05-13 1:32 ` [PATCH 2/4] hbitmap: introduce hbitmap_reverse() Andrey Zhadchenko
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Andrey Zhadchenko @ 2025-05-13 1:32 UTC (permalink / raw)
To: qemu-block, vsementsov, eblake
Cc: jsnow, kwolf, hreitz, qemu-devel, andrey.drobyshev, den
API to manipulate meta bitmap was removed with commit 0c88f1970c76
Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
---
util/hbitmap.c | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)
diff --git a/util/hbitmap.c b/util/hbitmap.c
index d9a1dabc63..16674f33e4 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -84,9 +84,6 @@ struct HBitmap {
*/
int granularity;
- /* A meta dirty bitmap to track the dirtiness of bits in this HBitmap. */
- HBitmap *meta;
-
/* A number of progressively less coarse bitmaps (i.e. level 0 is the
* coarsest). Each bit in level N represents a word in level N+1 that
* has a set bit, except the last level where each bit represents the
@@ -480,10 +477,7 @@ void hbitmap_set(HBitmap *hb, uint64_t start, uint64_t count)
n = last - first + 1;
hb->count += n - hb_count_between(hb, first, last);
- if (hb_set_between(hb, HBITMAP_LEVELS - 1, first, last) &&
- hb->meta) {
- hbitmap_set(hb->meta, start, count);
- }
+ hb_set_between(hb, HBITMAP_LEVELS - 1, first, last);
}
/* Resetting works the other way round: propagate up if the new
@@ -577,10 +571,7 @@ void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count)
assert(last < hb->size);
hb->count -= hb_count_between(hb, first, last);
- if (hb_reset_between(hb, HBITMAP_LEVELS - 1, first, last) &&
- hb->meta) {
- hbitmap_set(hb->meta, start, count);
- }
+ hb_reset_between(hb, HBITMAP_LEVELS - 1, first, last);
}
void hbitmap_reset_all(HBitmap *hb)
@@ -784,7 +775,6 @@ void hbitmap_deserialize_finish(HBitmap *bitmap)
void hbitmap_free(HBitmap *hb)
{
unsigned i;
- assert(!hb->meta);
for (i = HBITMAP_LEVELS; i-- > 0; ) {
g_free(hb->levels[i]);
}
@@ -868,9 +858,6 @@ void hbitmap_truncate(HBitmap *hb, uint64_t size)
(size - old) * sizeof(*hb->levels[i]));
}
}
- if (hb->meta) {
- hbitmap_truncate(hb->meta, hb->size << hb->granularity);
- }
}
/**
--
2.43.0
* [PATCH 2/4] hbitmap: introduce hbitmap_reverse()
2025-05-13 1:32 [PATCH 0/4] improve block_status() for cbw + snapshot setup Andrey Zhadchenko
2025-05-13 1:32 ` [PATCH 1/4] hbitmap: drop meta bitmap leftovers Andrey Zhadchenko
@ 2025-05-13 1:32 ` Andrey Zhadchenko
2025-05-20 16:29 ` Eric Blake
2025-05-13 1:32 ` [PATCH 3/4] block/copy-before-write: reverse access bitmap Andrey Zhadchenko
2025-05-13 1:32 ` [PATCH 4/4] block/copy-before-write: report partial block status to snapshot Andrey Zhadchenko
3 siblings, 1 reply; 10+ messages in thread
From: Andrey Zhadchenko @ 2025-05-13 1:32 UTC (permalink / raw)
To: qemu-block, vsementsov, eblake
Cc: jsnow, kwolf, hreitz, qemu-devel, andrey.drobyshev, den
and bdrv_dirty_bitmap_reverse() helper
Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
---
block/dirty-bitmap.c | 9 +++++++++
include/block/block_int-io.h | 1 +
include/qemu/hbitmap.h | 8 ++++++++
util/hbitmap.c | 15 +++++++++++++++
4 files changed, 33 insertions(+)
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 13a1979755..c7f453fdb9 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -888,3 +888,12 @@ void bdrv_dirty_bitmap_merge_internal(BdrvDirtyBitmap *dest,
}
}
}
+
+void bdrv_dirty_bitmap_reverse(BdrvDirtyBitmap *bitmap)
+{
+ assert(!bdrv_dirty_bitmap_readonly(bitmap));
+ assert(!bdrv_dirty_bitmap_inconsistent(bitmap));
+ bdrv_dirty_bitmaps_lock(bitmap->bs);
+ hbitmap_reverse(bitmap->bitmap);
+ bdrv_dirty_bitmaps_unlock(bitmap->bs);
+}
diff --git a/include/block/block_int-io.h b/include/block/block_int-io.h
index 4a7cf2b4fd..093613e7d1 100644
--- a/include/block/block_int-io.h
+++ b/include/block/block_int-io.h
@@ -109,6 +109,7 @@ void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
void bdrv_dirty_bitmap_merge_internal(BdrvDirtyBitmap *dest,
const BdrvDirtyBitmap *src,
HBitmap **backup, bool lock);
+void bdrv_dirty_bitmap_reverse(BdrvDirtyBitmap *bitmap);
void bdrv_inc_in_flight(BlockDriverState *bs);
void bdrv_dec_in_flight(BlockDriverState *bs);
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 8136e33674..dbdc9aa2d4 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -350,4 +350,12 @@ bool hbitmap_status(const HBitmap *hb, int64_t start, int64_t count,
*/
int64_t hbitmap_iter_next(HBitmapIter *hbi);
+/**
+ * hbitmap_reverse:
+ * @bitmap: The HBitmap to operate on
+ *
+ * Reverse the bits in the bitmap.
+ */
+void hbitmap_reverse(HBitmap *bitmap);
+
#endif
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 16674f33e4..b99c4b1eec 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -940,3 +940,18 @@ char *hbitmap_sha256(const HBitmap *bitmap, Error **errp)
return hash;
}
+
+void hbitmap_reverse(HBitmap *bitmap)
+{
+ int64_t pnum, pos = 0;
+ int64_t size = bitmap->orig_size;
+
+ while (pos < size) {
+ if (hbitmap_status(bitmap, pos, size - pos, &pnum)) {
+ hbitmap_reset(bitmap, pos, pnum);
+ } else {
+ hbitmap_set(bitmap, pos, pnum);
+ }
+ pos += pnum;
+ }
+}
--
2.43.0
* [PATCH 3/4] block/copy-before-write: reverse access bitmap
2025-05-13 1:32 [PATCH 0/4] improve block_status() for cbw + snapshot setup Andrey Zhadchenko
2025-05-13 1:32 ` [PATCH 1/4] hbitmap: drop meta bitmap leftovers Andrey Zhadchenko
2025-05-13 1:32 ` [PATCH 2/4] hbitmap: introduce hbitmap_reverse() Andrey Zhadchenko
@ 2025-05-13 1:32 ` Andrey Zhadchenko
2025-05-20 17:26 ` Eric Blake
2025-05-13 1:32 ` [PATCH 4/4] block/copy-before-write: report partial block status to snapshot Andrey Zhadchenko
3 siblings, 1 reply; 10+ messages in thread
From: Andrey Zhadchenko @ 2025-05-13 1:32 UTC (permalink / raw)
To: qemu-block, vsementsov, eblake
Cc: jsnow, kwolf, hreitz, qemu-devel, andrey.drobyshev, den
HBitmaps allow us to search set bits pretty fast. On the contrary,
when searching zeroes, we may be forced to fully traverse the lower
level.
When we run blockdev-backup with mode=full on top of snapshot filter
+ cbw filter, the job fills copy bitmap by calling block_status()
with range (X, virtual_size). The problem is that we check for zeroes
in this whole range. We also hit the worst case here, as access
bitmap is fully set and we need to scan the entire lowest level.
After scanning the full bitmap we actually ask the block status of
original image, which may return significantly lower amount of empty
clusters.
Because of this, the backup job 'hangs' on block copy initialization
for a long time with 100% CPU.
Example copy bitmap buildup time for image with clu_size=65536 and
preallocated metadata
size 10T 11T
blockdev-backup 52s 57s
cbw + snap 325s 413s
cbw + snap + patch 55s 61s
To fix it, reverse the access bitmap in cbw filter: rather set it
when the user is not allowed to read the cluster.
Update qemu-iotest 257: now the access bitmap has count 0 instead of
the image size 67108864
Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
---
block/copy-before-write.c | 17 ++++++++++-------
tests/qemu-iotests/257.out | 28 ++++++++++++++--------------
2 files changed, 24 insertions(+), 21 deletions(-)
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index fd470f5f92..5f5b3e7515 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -53,7 +53,7 @@ typedef struct BDRVCopyBeforeWriteState {
CoMutex lock;
/*
- * @access_bitmap: represents areas allowed for reading by fleecing user.
+ * @access_bitmap: represents areas disallowed for reading by fleecing user.
* Reading from non-dirty areas leads to -EACCES.
*/
BdrvDirtyBitmap *access_bitmap;
@@ -220,7 +220,7 @@ cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset, int64_t bytes,
return NULL;
}
- if (bdrv_dirty_bitmap_next_zero(s->access_bitmap, offset, bytes) != -1) {
+ if (bdrv_dirty_bitmap_next_dirty(s->access_bitmap, offset, bytes) != -1) {
g_free(req);
return NULL;
}
@@ -338,8 +338,8 @@ cbw_co_pdiscard_snapshot(BlockDriverState *bs, int64_t offset, int64_t bytes)
aligned_bytes = aligned_end - aligned_offset;
WITH_QEMU_LOCK_GUARD(&s->lock) {
- bdrv_reset_dirty_bitmap(s->access_bitmap, aligned_offset,
- aligned_bytes);
+ bdrv_set_dirty_bitmap(s->access_bitmap, aligned_offset,
+ aligned_bytes);
}
block_copy_reset(s->bcs, aligned_offset, aligned_bytes);
@@ -501,9 +501,12 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
bdrv_disable_dirty_bitmap(s->access_bitmap);
- bdrv_dirty_bitmap_merge_internal(s->access_bitmap,
- block_copy_dirty_bitmap(s->bcs), NULL,
- true);
+ if (bitmap) {
+ bdrv_dirty_bitmap_merge_internal(s->access_bitmap,
+ block_copy_dirty_bitmap(s->bcs), NULL,
+ true);
+ bdrv_dirty_bitmap_reverse(s->access_bitmap);
+ }
qemu_co_mutex_init(&s->lock);
QLIST_INIT(&s->frozen_read_reqs);
diff --git a/tests/qemu-iotests/257.out b/tests/qemu-iotests/257.out
index c33dd7f3a9..55efb418e6 100644
--- a/tests/qemu-iotests/257.out
+++ b/tests/qemu-iotests/257.out
@@ -109,7 +109,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -585,7 +585,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -854,7 +854,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -1330,7 +1330,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -1599,7 +1599,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -2075,7 +2075,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -2344,7 +2344,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -2820,7 +2820,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -3089,7 +3089,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -3565,7 +3565,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -3834,7 +3834,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -4310,7 +4310,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -4579,7 +4579,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
@@ -5055,7 +5055,7 @@ write -P0x67 0x3fe0000 0x20000
"backup-top": [
{
"busy": false,
- "count": 67108864,
+ "count": 0,
"granularity": 65536,
"persistent": false,
"recording": false
--
2.43.0
* [PATCH 4/4] block/copy-before-write: report partial block status to snapshot
2025-05-13 1:32 [PATCH 0/4] improve block_status() for cbw + snapshot setup Andrey Zhadchenko
` (2 preceding siblings ...)
2025-05-13 1:32 ` [PATCH 3/4] block/copy-before-write: reverse access bitmap Andrey Zhadchenko
@ 2025-05-13 1:32 ` Andrey Zhadchenko
3 siblings, 0 replies; 10+ messages in thread
From: Andrey Zhadchenko @ 2025-05-13 1:32 UTC (permalink / raw)
To: qemu-block, vsementsov, eblake
Cc: jsnow, kwolf, hreitz, qemu-devel, andrey.drobyshev, den
until the non-accessible area
Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
---
block/copy-before-write.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 5f5b3e7515..81d7f40b13 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -207,10 +207,11 @@ static int coroutine_fn GRAPH_RDLOCK cbw_co_flush(BlockDriverState *bs)
*/
static BlockReq * coroutine_fn GRAPH_RDLOCK
cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset, int64_t bytes,
- int64_t *pnum, BdrvChild **file)
+ int64_t *pnum, BdrvChild **file, bool query)
{
BDRVCopyBeforeWriteState *s = bs->opaque;
BlockReq *req = g_new(BlockReq, 1);
+ int64_t next_dirty;
bool done;
QEMU_LOCK_GUARD(&s->lock);
@@ -220,9 +221,13 @@ cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset, int64_t bytes,
return NULL;
}
- if (bdrv_dirty_bitmap_next_dirty(s->access_bitmap, offset, bytes) != -1) {
- g_free(req);
- return NULL;
+ next_dirty = bdrv_dirty_bitmap_next_dirty(s->access_bitmap, offset, bytes);
+ if (next_dirty != -1) {
+ if (!query || next_dirty == offset) {
+ g_free(req);
+ return NULL;
+ }
+ bytes = offset + bytes - next_dirty;
}
done = bdrv_dirty_bitmap_status(s->done_bitmap, offset, bytes, pnum);
@@ -270,7 +275,8 @@ cbw_co_preadv_snapshot(BlockDriverState *bs, int64_t offset, int64_t bytes,
while (bytes) {
int64_t cur_bytes;
- req = cbw_snapshot_read_lock(bs, offset, bytes, &cur_bytes, &file);
+ req = cbw_snapshot_read_lock(bs, offset, bytes, &cur_bytes, &file,
+ false);
if (!req) {
return -EACCES;
}
@@ -302,7 +308,7 @@ cbw_co_snapshot_block_status(BlockDriverState *bs,
int64_t cur_bytes;
BdrvChild *child;
- req = cbw_snapshot_read_lock(bs, offset, bytes, &cur_bytes, &child);
+ req = cbw_snapshot_read_lock(bs, offset, bytes, &cur_bytes, &child, true);
if (!req) {
return -EACCES;
}
--
2.43.0
* Re: [PATCH 1/4] hbitmap: drop meta bitmap leftovers
2025-05-13 1:32 ` [PATCH 1/4] hbitmap: drop meta bitmap leftovers Andrey Zhadchenko
@ 2025-05-20 16:27 ` Eric Blake
0 siblings, 0 replies; 10+ messages in thread
From: Eric Blake @ 2025-05-20 16:27 UTC (permalink / raw)
To: Andrey Zhadchenko
Cc: qemu-block, vsementsov, jsnow, kwolf, hreitz, qemu-devel,
andrey.drobyshev, den
On Tue, May 13, 2025 at 03:32:35AM +0200, Andrey Zhadchenko wrote:
> API to manipulate meta bitmap was removed with commit 0c88f1970c76
>
> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
> ---
> util/hbitmap.c | 17 ++---------------
> 1 file changed, 2 insertions(+), 15 deletions(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
* Re: [PATCH 2/4] hbitmap: introduce hbitmap_reverse()
2025-05-13 1:32 ` [PATCH 2/4] hbitmap: introduce hbitmap_reverse() Andrey Zhadchenko
@ 2025-05-20 16:29 ` Eric Blake
2025-05-21 10:26 ` Andrey Zhadchenko
0 siblings, 1 reply; 10+ messages in thread
From: Eric Blake @ 2025-05-20 16:29 UTC (permalink / raw)
To: Andrey Zhadchenko
Cc: qemu-block, vsementsov, jsnow, kwolf, hreitz, qemu-devel,
andrey.drobyshev, den
On Tue, May 13, 2025 at 03:32:36AM +0200, Andrey Zhadchenko wrote:
> and bdrv_dirty_bitmap_reverse() helper
Is 'inverse' a better name than 'reverse'?
>
> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
> ---
> +++ b/util/hbitmap.c
> @@ -940,3 +940,18 @@ char *hbitmap_sha256(const HBitmap *bitmap, Error **errp)
>
> return hash;
> }
> +
> +void hbitmap_reverse(HBitmap *bitmap)
> +{
> + int64_t pnum, pos = 0;
> + int64_t size = bitmap->orig_size;
> +
> + while (pos < size) {
> + if (hbitmap_status(bitmap, pos, size - pos, &pnum)) {
> + hbitmap_reset(bitmap, pos, pnum);
> + } else {
> + hbitmap_set(bitmap, pos, pnum);
> + }
To me, reverse on 1110000 would be 0000111 (swapping the order); while
inverse would be 0001111 (swapping the bits but preserving the order).
The naming change will require respinning the series, but the concept
makes sense.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
* Re: [PATCH 3/4] block/copy-before-write: reverse access bitmap
2025-05-13 1:32 ` [PATCH 3/4] block/copy-before-write: reverse access bitmap Andrey Zhadchenko
@ 2025-05-20 17:26 ` Eric Blake
2025-05-21 10:27 ` Andrey Zhadchenko
0 siblings, 1 reply; 10+ messages in thread
From: Eric Blake @ 2025-05-20 17:26 UTC (permalink / raw)
To: Andrey Zhadchenko
Cc: qemu-block, vsementsov, jsnow, kwolf, hreitz, qemu-devel,
andrey.drobyshev, den
On Tue, May 13, 2025 at 03:32:37AM +0200, Andrey Zhadchenko wrote:
> HBitmaps allow us to search set bits pretty fast. On the contrary,
> when searching zeroes, we may be forced to fully traverse the lower
> level.
> When we run blockdev-backup with mode=full on top of snapshot filter
> + cbw filter, the job fills copy bitmap by calling block_status()
> with range (X, virtual_size). The problem is that we check for zeroes
> in this whole range. We also hit the worst case here, as access
> bitmap is fully set and we need to scan the entire lowest level.
> After scanning the full bitmap we actually ask the block status of
> original image, which may return significantly lower amount of empty
> clusters.
> Because of this, the backup job 'hangs' on block copy initialization
> for a long time with 100% CPU.
>
> Example copy bitmap buildup time for image with clu_size=65536 and
> preallocated metadata
> size 10T 11T
> blockdev-backup 52s 57s
> cbw + snap 325s 413s
> cbw + snap + patch 55s 61s
>
> To fix it, reverse the access bitmap in cbw filter: rather set it
s/reverse/invert/
> when the user is not allowed to read the cluster.
>
> Update qemu-iotest 257: now the access bitmap has count 0 instead of
> the image size 67108864
>
> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
> ---
> @@ -501,9 +501,12 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
> return -EINVAL;
> }
> bdrv_disable_dirty_bitmap(s->access_bitmap);
> - bdrv_dirty_bitmap_merge_internal(s->access_bitmap,
> - block_copy_dirty_bitmap(s->bcs), NULL,
> - true);
> + if (bitmap) {
> + bdrv_dirty_bitmap_merge_internal(s->access_bitmap,
> + block_copy_dirty_bitmap(s->bcs), NULL,
> + true);
> + bdrv_dirty_bitmap_reverse(s->access_bitmap);
Is this setting the bits correctly? Inverting a bitmap is a
reversible operation, but it looks odd that you are only reversing
once around the merge. Either the two sources of the merge have the
same sense (whether that be 0 for dirty 1 for clean, or 0 for readable
1 for avoid) and no inverting is needed before or after the merge, or
the two sources have opposite sense (in which case, I would have
expected inverting one of the bitmaps before the merge to get them to
agree on sense, then merging, then inverting back to the desired
sense). Am I missing something?
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
* Re: [PATCH 2/4] hbitmap: introduce hbitmap_reverse()
2025-05-20 16:29 ` Eric Blake
@ 2025-05-21 10:26 ` Andrey Zhadchenko
0 siblings, 0 replies; 10+ messages in thread
From: Andrey Zhadchenko @ 2025-05-21 10:26 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-block, vsementsov, jsnow, kwolf, hreitz, qemu-devel,
andrey.drobyshev, den
On 5/20/25 18:29, Eric Blake wrote:
> On Tue, May 13, 2025 at 03:32:36AM +0200, Andrey Zhadchenko wrote:
>> and bdrv_dirty_bitmap_reverse() helper
>
> Is 'inverse' a better name than 'reverse'?
Yeah, it sounds much better this way!
I will re-do the patchset
>
>>
>> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
>> ---
>> +++ b/util/hbitmap.c
>> @@ -940,3 +940,18 @@ char *hbitmap_sha256(const HBitmap *bitmap, Error **errp)
>>
>> return hash;
>> }
>> +
>> +void hbitmap_reverse(HBitmap *bitmap)
>> +{
>> + int64_t pnum, pos = 0;
>> + int64_t size = bitmap->orig_size;
>> +
>> + while (pos < size) {
>> + if (hbitmap_status(bitmap, pos, size - pos, &pnum)) {
>> + hbitmap_reset(bitmap, pos, pnum);
>> + } else {
>> + hbitmap_set(bitmap, pos, pnum);
>> + }
>
> To me, reverse on 1110000 would be 0000111 (swapping the order); while
> inverse would be 0001111 (swapping the bits but preserving the order).
>
> The naming change will require respinning the series, but the concept
> makes sense.
> > --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization: qemu.org | libguestfs.org
>
* Re: [PATCH 3/4] block/copy-before-write: reverse access bitmap
2025-05-20 17:26 ` Eric Blake
@ 2025-05-21 10:27 ` Andrey Zhadchenko
0 siblings, 0 replies; 10+ messages in thread
From: Andrey Zhadchenko @ 2025-05-21 10:27 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-block, vsementsov, jsnow, kwolf, hreitz, qemu-devel,
andrey.drobyshev, den
On 5/20/25 19:26, Eric Blake wrote:
> On Tue, May 13, 2025 at 03:32:37AM +0200, Andrey Zhadchenko wrote:
>> HBitmaps allow us to search set bits pretty fast. On the contrary,
>> when searching zeroes, we may be forced to fully traverse the lower
>> level.
>> When we run blockdev-backup with mode=full on top of snapshot filter
>> + cbw filter, the job fills copy bitmap by calling block_status()
>> with range (X, virtual_size). The problem is that we check for zeroes
>> in this whole range. We also hit the worst case here, as access
>> bitmap is fully set and we need to scan the entire lowest level.
>> After scanning the full bitmap we actually ask the block status of
>> original image, which may return significantly lower amount of empty
>> clusters.
>> Because of this, the backup job 'hangs' on block copy initialization
>> for a long time with 100% CPU.
>>
>> Example copy bitmap buildup time for image with clu_size=65536 and
>> preallocated metadata
>> size 10T 11T
>> blockdev-backup 52s 57s
>> cbw + snap 325s 413s
>> cbw + snap + patch 55s 61s
>>
>> To fix it, reverse the access bitmap in cbw filter: rather set it
>
> s/reverse/invert/
>
>> when the user is not allowed to read the cluster.
>>
>> Update qemu-iotest 257: now access bitmap have count 0 instead of
>> the image size 67108864
>>
>> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
>> ---
>> @@ -501,9 +501,12 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
>> return -EINVAL;
>> }
>> bdrv_disable_dirty_bitmap(s->access_bitmap);
>> - bdrv_dirty_bitmap_merge_internal(s->access_bitmap,
>> - block_copy_dirty_bitmap(s->bcs), NULL,
>> - true);
>> + if (bitmap) {
>> + bdrv_dirty_bitmap_merge_internal(s->access_bitmap,
>> + block_copy_dirty_bitmap(s->bcs), NULL,
>> + true);
>> + bdrv_dirty_bitmap_reverse(s->access_bitmap);
>
> Is this setting the bits correctly? Inverting a bitmap is a
> reversible operation, but it looks odd that you are only reversing
> once around the merge. Either the two sources of the merge have the
> same sense (whether that be 0 for dirty 1 for clean, or 0 for readable
> 1 for avoid) and no inverting is needed before or after the merge, or
> the two sources have opposite sense (in which case, I would have
> expected inverting one of the bitmaps before the merge to get them to
> agree on sense, then merging, then inverting back to the desired
> sense). Am I missing something?
We have just created the access bitmap a few lines above, so it is
empty: everything is accessible.
- If cbw did not have any bitmap, we don't need to do anything
- If someone set a bitmap on cbw, we used it to init the block-copy
dirty bitmap. So to get the access bitmap, we invert the bitmap that
marks to-be-copied blocks: the access bitmap now has all of them as 0
(allowed) and the not-to-be-copied ones as 1 (disallowed)
This case is also covered by iotests/image-fleecing in the snapshot
filter + cbw bitmap case: some not-in-bitmap blocks are accessed and
we expect to get errors:
--- Sanity Check ---
read -P0x5d 0 64k
read -P0xd5 1M 64k
read -P0xdc 32M 64k
read -P0xcd 0x3ff0000 64k
read -P0 0x00f8000 32k
read failed: Invalid argument
read -P0 0x2010000 32k
read failed: Invalid argument
read -P0 0x3fe0000 64k
read failed: Invalid argument
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization: qemu.org | libguestfs.org
>