From: Stefan Hajnoczi <stefanha@redhat.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, hreitz@redhat.com,
kwolf@redhat.com, fam@euphon.net
Subject: Re: [PATCH] block/io: clear BDRV_BLOCK_RECURSE flag after recursing in bdrv_co_block_status
Date: Wed, 17 Jan 2024 09:32:02 -0500 [thread overview]
Message-ID: <20240117143202.GA1404231@fedora> (raw)
In-Reply-To: <20240116154839.401030-1-f.ebner@proxmox.com>
[-- Attachment #1: Type: text/plain, Size: 3438 bytes --]
On Tue, Jan 16, 2024 at 04:48:39PM +0100, Fiona Ebner wrote:
> Using fleecing backup like in [0] on a qcow2 image (with metadata
> preallocation) can lead to the following assertion failure:
>
> > bdrv_co_do_block_status: Assertion `!(ret & BDRV_BLOCK_ZERO)' failed.
>
> In the reproducer [0], it happens because the BDRV_BLOCK_RECURSE flag
> will be set by the qcow2 driver, so the caller will recursively check
> the file child. Then the BDRV_BLOCK_ZERO set too. Later up the call
> chain, in bdrv_co_do_block_status() for the snapshot-access driver,
> the assertion failure will happen, because both flags are set.
>
> To fix it, clear the recurse flag after the recursive check was done.
>
> In detail:
>
> > #0 qcow2_co_block_status
>
> Returns 0x45 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA |
> BDRV_BLOCK_OFFSET_VALID.
>
> > #1 bdrv_co_do_block_status
>
> Because of the data flag, bdrv_co_do_block_status() will now also set
> BDRV_BLOCK_ALLOCATED. Because of the recurse flag,
> bdrv_co_do_block_status() for the bdrv_file child will be called,
> which returns 0x16 = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_OFFSET_VALID |
> BDRV_BLOCK_ZERO. Now the return value inherits the zero flag.
>
> Returns 0x57 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA |
> BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_ZERO.
>
> > #2 bdrv_co_common_block_status_above
> > #3 bdrv_co_block_status_above
> > #4 bdrv_co_block_status
> > #5 cbw_co_snapshot_block_status
> > #6 bdrv_co_snapshot_block_status
> > #7 snapshot_access_co_block_status
> > #8 bdrv_co_do_block_status
>
> Return value is propagated all the way up to here, where the assertion
> failure happens, because BDRV_BLOCK_RECURSE and BDRV_BLOCK_ZERO are
> both set.
>
> > #9 bdrv_co_common_block_status_above
> > #10 bdrv_co_block_status_above
> > #11 block_copy_block_status
> > #12 block_copy_dirty_clusters
> > #13 block_copy_common
> > #14 block_copy_async_co_entry
> > #15 coroutine_trampoline
>
> [0]:
>
> > #!/bin/bash
> > rm /tmp/disk.qcow2
> > ./qemu-img create /tmp/disk.qcow2 -o preallocation=metadata -f qcow2 1G
> > ./qemu-img create /tmp/fleecing.qcow2 -f qcow2 1G
> > ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
> > ./qemu-system-x86_64 --qmp stdio \
> > --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
> > --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2 \
> > --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
> > <<EOF
> > {"execute": "qmp_capabilities"}
> > {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
> > {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } }
> > {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node1", "sync": "full", "job-id": "backup0" } }
> > EOF
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> I'm new to this part of the code, so I'm not sure if it is actually
> safe to clear the flag? Intuitively, I'd expect it to be only relevant
> until it was acted upon, but no clue.
>
> block/io.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
prev parent reply other threads:[~2024-01-17 14:32 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-16 15:48 [PATCH] block/io: clear BDRV_BLOCK_RECURSE flag after recursing in bdrv_co_block_status Fiona Ebner
2024-01-16 22:06 ` Stefan Hajnoczi
2024-01-17 12:42 ` Vladimir Sementsov-Ogievskiy
2024-01-17 14:32 ` Stefan Hajnoczi [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240117143202.GA1404231@fedora \
--to=stefanha@redhat.com \
--cc=f.ebner@proxmox.com \
--cc=fam@euphon.net \
--cc=hreitz@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.