* [PATCH] block: qcow2-bitmap: fix persistent bitmap corruption on ENOSPC
@ 2026-06-10 2:33 Christian Barry
2026-06-16 9:06 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 3+ messages in thread
From: Christian Barry @ 2026-06-10 2:33 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Eric Blake, John Snow, qemu-block, Hanna Reitz,
David de Barros Tadokoro, Vladimir Sementsov-Ogievskiy,
Christian Barry, Eduardo Augusto Cavalcanti
From: Christian Barry <christian.barry@usp.br>
Changed qcow2_co_remove_persistent_dirty_bitmap() to treat free_bitmap_clusters()
failure as fatal and roll back the header update through a new helper
function rollback_ext_header_and_dir_helper(), which was also applied
to update_ext_header_and_dir(), which had internal rollback logic.
Changed bitmap_list_store() to zero the cluster tail after write,
guaranteeing that qemu-img won't parse into stale bytes.
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3516
Signed-off-by: Christian Barry <christian.barry@usp.br>
Co-developed by: Eduardo Augusto Cavalcanti <eduardoaugustoabc@ime.usp.br>
Signed-off-by: Eduardo Augusto Cavalcanti <eduardoaugustoabc@ime.usp.br>
---
block/qcow2-bitmap.c | 37 ++++++++++++++++++++++++++++++++-----
1 file changed, 32 insertions(+), 5 deletions(-)
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 256ec99878..5796298157 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -737,10 +737,12 @@ static int GRAPH_RDLOCK
bitmap_list_store(BlockDriverState *bs, Qcow2BitmapList *bm_list,
uint64_t *offset, uint64_t *size, bool in_place)
{
+ BDRVQcow2State *s = bs->opaque;
int ret;
uint8_t *dir;
int64_t dir_offset = 0;
uint64_t dir_size = 0;
+ uint64_t tail_size;
Qcow2Bitmap *bm;
Qcow2BitmapDirEntry *e;
@@ -810,6 +812,14 @@ bitmap_list_store(BlockDriverState *bs, Qcow2BitmapList *bm_list,
goto fail;
}
+ tail_size = ROUND_UP(dir_size, s->cluster_size) - dir_size;
+ if (tail_size > 0) {
+ ret = bdrv_pwrite_zeroes(bs->file, dir_offset + dir_size, tail_size, 0);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
g_free(dir);
if (!in_place) {
@@ -882,6 +892,19 @@ update_ext_header_and_dir_in_place(BlockDriverState *bs,
*/
}
+static void rollback_ext_header_and_dir_helper(BlockDriverState *bs,
+ uint64_t old_offset,
+ uint64_t old_size,
+ uint32_t old_nb_bitmaps,
+ uint64_t old_autocl)
+{
+ BDRVQcow2State *s = bs->opaque;
+ s->bitmap_directory_offset = old_offset;
+ s->bitmap_directory_size = old_size;
+ s->nb_bitmaps = old_nb_bitmaps;
+ s->autoclear_features = old_autocl;
+}
+
static int GRAPH_RDLOCK
update_ext_header_and_dir(BlockDriverState *bs, Qcow2BitmapList *bm_list)
{
@@ -937,10 +960,7 @@ fail:
qcow2_free_clusters(bs, new_offset, new_size, QCOW2_DISCARD_OTHER);
}
- s->bitmap_directory_offset = old_offset;
- s->bitmap_directory_size = old_size;
- s->nb_bitmaps = old_nb_bitmaps;
- s->autoclear_features = old_autocl;
+ rollback_ext_header_and_dir_helper(bs, old_offset, old_size, old_nb_bitmaps, old_autocl);
return ret;
}
@@ -1460,6 +1480,10 @@ int coroutine_fn qcow2_co_remove_persistent_dirty_bitmap(BlockDriverState *bs,
{
int ret;
BDRVQcow2State *s = bs->opaque;
+ uint64_t old_offset = s->bitmap_directory_offset;
+ uint64_t old_size = s->bitmap_directory_size;
+ uint32_t old_nb_bitmaps = s->nb_bitmaps;
+ uint64_t old_autocl = s->autoclear_features;
Qcow2Bitmap *bm = NULL;
Qcow2BitmapList *bm_list;
@@ -1495,7 +1519,10 @@ int coroutine_fn qcow2_co_remove_persistent_dirty_bitmap(BlockDriverState *bs,
goto out;
}
- free_bitmap_clusters(bs, &bm->table);
+ ret = free_bitmap_clusters(bs, &bm->table);
+ if (ret < 0){
+ rollback_ext_header_and_dir_helper(bs, old_offset, old_size, old_nb_bitmaps, old_autocl);
+ }
out:
qemu_co_mutex_unlock(&s->lock);
--
2.43.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] block: qcow2-bitmap: fix persistent bitmap corruption on ENOSPC
2026-06-10 2:33 [PATCH] block: qcow2-bitmap: fix persistent bitmap corruption on ENOSPC Christian Barry
@ 2026-06-16 9:06 ` Vladimir Sementsov-Ogievskiy
2026-06-16 10:03 ` Kevin Wolf
0 siblings, 1 reply; 3+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-06-16 9:06 UTC (permalink / raw)
To: Christian Barry, qemu-devel
Cc: Kevin Wolf, Eric Blake, John Snow, qemu-block, Hanna Reitz,
David de Barros Tadokoro, Eduardo Augusto Cavalcanti
10.06.26 05:33, Christian Barry пишет:
> From: Christian Barry <christian.barry@usp.br>
>
> Changed qcow2_co_remove_persistent_dirty_bitmap() to treat free_bitmap_clusters()
> failure as fatal and roll back the header update through a new helper
> function rollback_ext_header_and_dir_helper(), which was also applied
> to update_ext_header_and_dir(), which had internal rollback logic.
> Changed bitmap_list_store() to zero the cluster tail after write,
> guaranteeing that qemu-img won't parse into stale bytes.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3516
Hi!
First, seem like two different fixes merged into one commit, please if
possible, make them separate two commits.
Next, please, make commit messages more descriptive, about what the
issue we solve (link is good as additional information, but the commit
message itself should be complete).
For example, I don't understand free_bitmap_clusters makes the image
unreadable. And we we fix it in one place, leaving other calls to
free_bitmap_clusters() unchanged. Could we fix free_bitmap_clusters()
itself? See also my comments below
>
> Signed-off-by: Christian Barry <christian.barry@usp.br>
> Co-developed by: Eduardo Augusto Cavalcanti <eduardoaugustoabc@ime.usp.br>
> Signed-off-by: Eduardo Augusto Cavalcanti <eduardoaugustoabc@ime.usp.br>
> ---
> block/qcow2-bitmap.c | 37 ++++++++++++++++++++++++++++++++-----
> 1 file changed, 32 insertions(+), 5 deletions(-)
>
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index 256ec99878..5796298157 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -737,10 +737,12 @@ static int GRAPH_RDLOCK
> bitmap_list_store(BlockDriverState *bs, Qcow2BitmapList *bm_list,
> uint64_t *offset, uint64_t *size, bool in_place)
> {
> + BDRVQcow2State *s = bs->opaque;
> int ret;
> uint8_t *dir;
> int64_t dir_offset = 0;
> uint64_t dir_size = 0;
> + uint64_t tail_size;
> Qcow2Bitmap *bm;
> Qcow2BitmapDirEntry *e;
>
> @@ -810,6 +812,14 @@ bitmap_list_store(BlockDriverState *bs, Qcow2BitmapList *bm_list,
> goto fail;
> }
>
> + tail_size = ROUND_UP(dir_size, s->cluster_size) - dir_size;
> + if (tail_size > 0) {
> + ret = bdrv_pwrite_zeroes(bs->file, dir_offset + dir_size, tail_size, 0);
> + if (ret < 0) {
> + goto fail;
> + }
> + }
If zeroing of unused bytes help to fix some issue, it obviously means,
that something is wrong in deeper logic.
> +
> g_free(dir);
>
> if (!in_place) {
> @@ -882,6 +892,19 @@ update_ext_header_and_dir_in_place(BlockDriverState *bs,
> */
> }
>
> +static void rollback_ext_header_and_dir_helper(BlockDriverState *bs,
> + uint64_t old_offset,
> + uint64_t old_size,
> + uint32_t old_nb_bitmaps,
> + uint64_t old_autocl)
> +{
> + BDRVQcow2State *s = bs->opaque;
> + s->bitmap_directory_offset = old_offset;
> + s->bitmap_directory_size = old_size;
> + s->nb_bitmaps = old_nb_bitmaps;
> + s->autoclear_features = old_autocl;
> +}
I think, better to inline this code, or refactor deeper. This way it only
makes code more complicated: we can't deduplicate first part of logic,
which is defining these 4 variables, but do that for the second part.
> +
> static int GRAPH_RDLOCK
> update_ext_header_and_dir(BlockDriverState *bs, Qcow2BitmapList *bm_list)
> {
> @@ -937,10 +960,7 @@ fail:
> qcow2_free_clusters(bs, new_offset, new_size, QCOW2_DISCARD_OTHER);
> }
>
> - s->bitmap_directory_offset = old_offset;
> - s->bitmap_directory_size = old_size;
> - s->nb_bitmaps = old_nb_bitmaps;
> - s->autoclear_features = old_autocl;
> + rollback_ext_header_and_dir_helper(bs, old_offset, old_size, old_nb_bitmaps, old_autocl);
>
> return ret;
> }
> @@ -1460,6 +1480,10 @@ int coroutine_fn qcow2_co_remove_persistent_dirty_bitmap(BlockDriverState *bs,
> {
> int ret;
> BDRVQcow2State *s = bs->opaque;
> + uint64_t old_offset = s->bitmap_directory_offset;
> + uint64_t old_size = s->bitmap_directory_size;
> + uint32_t old_nb_bitmaps = s->nb_bitmaps;
> + uint64_t old_autocl = s->autoclear_features;
> Qcow2Bitmap *bm = NULL;
> Qcow2BitmapList *bm_list;
>
> @@ -1495,7 +1519,10 @@ int coroutine_fn qcow2_co_remove_persistent_dirty_bitmap(BlockDriverState *bs,
> goto out;
> }
>
> - free_bitmap_clusters(bs, &bm->table);
> + ret = free_bitmap_clusters(bs, &bm->table);
> + if (ret < 0){
> + rollback_ext_header_and_dir_helper(bs, old_offset, old_size, old_nb_bitmaps, old_autocl);
> + }
Again, this looks strange too. Ignoring failure of freeing _unused_ clusters
should not be a problem. If it is, it means that things are broken earlier
and something should be fixed at earlier stage.
>
> out:
> qemu_co_mutex_unlock(&s->lock);
---
Let me try to understand the issue at https://gitlab.com/qemu-project/qemu/-/work_items/3516..
1. We failed to flush bitmaps. So, autoclear bit should be unset in the header and the whole bitmap extension considered inconsistent
2. On next qcow2 open, no bitmaps should be loaded, and we'll see message of
error_printf("Some clusters may be leaked, "
"run 'qemu-img check -r' on the image "
"file to fix.");
What's wrong with that?
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] block: qcow2-bitmap: fix persistent bitmap corruption on ENOSPC
2026-06-16 9:06 ` Vladimir Sementsov-Ogievskiy
@ 2026-06-16 10:03 ` Kevin Wolf
0 siblings, 0 replies; 3+ messages in thread
From: Kevin Wolf @ 2026-06-16 10:03 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: Christian Barry, qemu-devel, Eric Blake, John Snow, qemu-block,
Hanna Reitz, David de Barros Tadokoro, Eduardo Augusto Cavalcanti
Am 16.06.2026 um 11:06 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 10.06.26 05:33, Christian Barry пишет:
> > From: Christian Barry <christian.barry@usp.br>
> >
> > Changed qcow2_co_remove_persistent_dirty_bitmap() to treat free_bitmap_clusters()
> > failure as fatal and roll back the header update through a new helper
> > function rollback_ext_header_and_dir_helper(), which was also applied
> > to update_ext_header_and_dir(), which had internal rollback logic.
> > Changed bitmap_list_store() to zero the cluster tail after write,
> > guaranteeing that qemu-img won't parse into stale bytes.
> >
> > Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3516
>
> Hi!
>
> First, seem like two different fixes merged into one commit, please if
> possible, make them separate two commits.
>
> Next, please, make commit messages more descriptive, about what the
> issue we solve (link is good as additional information, but the commit
> message itself should be complete).
>
> For example, I don't understand free_bitmap_clusters makes the image
> unreadable. And we we fix it in one place, leaving other calls to
> free_bitmap_clusters() unchanged. Could we fix free_bitmap_clusters()
> itself? See also my comments below
I haven't looked at the patch yet, but this sounds like apart from
splitting the patches into one logical change per patch and improving
the commit message, the patch series should also include a test case
(probably qemu-iotests) that demonstrates the problem and how the patch
changes behaviour.
Kevin
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-06-16 10:04 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-10 2:33 [PATCH] block: qcow2-bitmap: fix persistent bitmap corruption on ENOSPC Christian Barry
2026-06-16 9:06 ` Vladimir Sementsov-Ogievskiy
2026-06-16 10:03 ` Kevin Wolf
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.