* [PATCH 0/4] block: fix deadlock when doing blockdev-snapshot of a node with compress filter
@ 2025-06-04 12:07 Fiona Ebner
2025-06-04 12:07 ` [PATCH 1/4] block/graph-lock: fix typo in comment Fiona Ebner
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-06-04 12:07 UTC (permalink / raw)
To: qemu-block
Cc: qemu-devel, kwolf, hreitz, stefanha, fam, jsnow, vsementsov,
eblake, kbusch, its, foss
Patches 1/4 and 2/4 are tangentially related. See patches 3/4 and 4/4
for the details.
I did not CC the people for the individual block drivers. Let me know
if I should!
Fiona Ebner (4):
block/graph-lock: fix typo in comment
block-coroutine-wrapper: mark generated co-wrapper as GRAPH_UNLOCKED
block: make calling bdrv_refresh_limits() safe while holding graph
lock
iotests: add test for snapshot on top of node with compress filter
block.c | 14 ++++++++++----
block/blkio.c | 5 ++---
block/crypto.c | 8 ++++----
block/file-posix.c | 7 +++----
block/io.c | 4 ++--
block/iscsi.c | 7 +++----
block/mirror.c | 5 +++--
block/monitor/bitmap-qmp-cmds.c | 2 ++
block/qcow.c | 5 ++---
block/qcow2.c | 5 ++---
block/qed.c | 5 ++---
block/raw-format.c | 7 +++----
block/rbd.c | 5 ++---
block/vdi.c | 11 ++++++-----
block/vhdx.c | 5 ++---
block/vmdk.c | 5 ++---
block/vpc.c | 5 ++---
hw/nvme/ns.c | 2 +-
include/block/block-io.h | 9 +++++----
include/block/block_int-common.h | 4 ++--
include/block/dirty-bitmap.h | 2 +-
include/block/graph-lock.h | 2 +-
qemu-img.c | 6 ++++++
scripts/block-coroutine-wrapper.py | 2 +-
tests/qemu-iotests/085 | 18 ++++++++++++++++++
tests/qemu-iotests/085.out | 21 +++++++++++++++++++++
26 files changed, 108 insertions(+), 63 deletions(-)
--
2.39.5
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/4] block/graph-lock: fix typo in comment
2025-06-04 12:07 [PATCH 0/4] block: fix deadlock when doing blockdev-snapshot of a node with compress filter Fiona Ebner
@ 2025-06-04 12:07 ` Fiona Ebner
2025-06-17 18:36 ` Eric Blake
2025-06-04 12:07 ` [PATCH 2/4] block-coroutine-wrapper: mark generated co-wrapper as GRAPH_UNLOCKED Fiona Ebner
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2025-06-04 12:07 UTC (permalink / raw)
To: qemu-block
Cc: qemu-devel, kwolf, hreitz, stefanha, fam, jsnow, vsementsov,
eblake, kbusch, its, foss
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
include/block/graph-lock.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
index 2c26c72108..67c8d04867 100644
--- a/include/block/graph-lock.h
+++ b/include/block/graph-lock.h
@@ -169,7 +169,7 @@ bdrv_graph_rdunlock_main_loop(void);
/*
* assert_bdrv_graph_readable:
* Make sure that the reader is either the main loop,
- * or there is at least a reader helding the rdlock.
+ * or there is at least a reader holding the rdlock.
* In this way an incoming writer is aware of the read and waits.
*/
void GRAPH_RDLOCK assert_bdrv_graph_readable(void);
--
2.39.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/4] block-coroutine-wrapper: mark generated co-wrapper as GRAPH_UNLOCKED
2025-06-04 12:07 [PATCH 0/4] block: fix deadlock when doing blockdev-snapshot of a node with compress filter Fiona Ebner
2025-06-04 12:07 ` [PATCH 1/4] block/graph-lock: fix typo in comment Fiona Ebner
@ 2025-06-04 12:07 ` Fiona Ebner
2025-06-04 12:07 ` [PATCH 3/4] block: make calling bdrv_refresh_limits() safe while holding graph lock Fiona Ebner
2025-06-04 12:07 ` [PATCH 4/4] iotests: add test for snapshot on top of node with compress filter Fiona Ebner
3 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-06-04 12:07 UTC (permalink / raw)
To: qemu-block
Cc: qemu-devel, kwolf, hreitz, stefanha, fam, jsnow, vsementsov,
eblake, kbusch, its, foss
Generated co-wrappers poll, so they need to be called with the graph
unlocked.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
Unfortunately, it's not clear how to achieve something similar for the
mixed wrappers, which would've caught the issue fixed in the following
patch.
scripts/block-coroutine-wrapper.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/block-coroutine-wrapper.py b/scripts/block-coroutine-wrapper.py
index dbbde99e39..cdde957257 100644
--- a/scripts/block-coroutine-wrapper.py
+++ b/scripts/block-coroutine-wrapper.py
@@ -186,7 +186,7 @@ def create_co_wrapper(func: FuncDecl) -> str:
name = func.target_name
struct_name = func.struct_name
return f"""\
-{func.return_type} {func.name}({ func.gen_list('{decl}') })
+{func.return_type} GRAPH_UNLOCKED {func.name}({ func.gen_list('{decl}') })
{{
{struct_name} s = {{
.poll_state.ctx = qemu_get_current_aio_context(),
--
2.39.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/4] block: make calling bdrv_refresh_limits() safe while holding graph lock
2025-06-04 12:07 [PATCH 0/4] block: fix deadlock when doing blockdev-snapshot of a node with compress filter Fiona Ebner
2025-06-04 12:07 ` [PATCH 1/4] block/graph-lock: fix typo in comment Fiona Ebner
2025-06-04 12:07 ` [PATCH 2/4] block-coroutine-wrapper: mark generated co-wrapper as GRAPH_UNLOCKED Fiona Ebner
@ 2025-06-04 12:07 ` Fiona Ebner
2025-06-04 12:15 ` Klaus Jensen
2025-06-04 12:07 ` [PATCH 4/4] iotests: add test for snapshot on top of node with compress filter Fiona Ebner
3 siblings, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2025-06-04 12:07 UTC (permalink / raw)
To: qemu-block
Cc: qemu-devel, kwolf, hreitz, stefanha, fam, jsnow, vsementsov,
eblake, kbusch, its, foss
The bdrv_refresh_limits() function and driver implementations are
called with the graph lock held. The implementation for the 'compress'
filter calls bdrv_get_info(), which is a generated coroutine wrapper
and thus polls. This can lead to a deadlock when issuing a
blockdev-snapshot QMP command, when bdrv_refresh_limits() is called in
bdrv_append() while the graph lock is held exclusively. This deadlock
was introduced with commit 5661a00d40 ("block: Call transaction
callbacks with lock held").
As a solution, this reverts commit 3d47eb0a2a ("block:
Convert bdrv_get_info() to co_wrapper_mixed"). To do this, it is
necessary to have callers of bdrv_get_info() take the graph lock
themselves. None of the driver implementations rely on being run in
coroutine context and none of the callers rely on the function being
a coroutine.
All callers except one either already hold the graph lock or can claim
the graph lock via bdrv_graph_rdlock_main_loop(). As part of this,
bdrv_get_default_bitmap_granularity() is annotated with GRAPH_RDLOCK
and its callers adapted where necessary.
The single exception is the caller nvme_ns_init_format(), which can
run as a callback in an IO thread, but can also be reached via the QOM
realize handler nvme_ns_realize(). For this caller, a
bdrv_get_info_unlocked() coroutine wrapper is introduced that must be
called with the graph unlocked.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
The test added by the following patch exhibits the issue mentioned
above.
Telling from a quick search, iothread support for nvme devices has not
been merged [0]. So currently, doing the following in
nvme_ns_init_format() might also be an option to avoid the need for
the wrapper:
if (qemu_in_coroutine()) {
bdrv_graph_co_rdlock();
} else {
bdrv_graph_rdlock_main_loop()
}
But that doesn't seem future-proof.
[0]: https://lore.kernel.org/qemu-devel/20220827091258.3589230-1-fanjinhao21s@ict.ac.cn/
block.c | 14 ++++++++++----
block/blkio.c | 5 ++---
block/crypto.c | 8 ++++----
block/file-posix.c | 7 +++----
block/io.c | 4 ++--
block/iscsi.c | 7 +++----
block/mirror.c | 5 +++--
block/monitor/bitmap-qmp-cmds.c | 2 ++
block/qcow.c | 5 ++---
block/qcow2.c | 5 ++---
block/qed.c | 5 ++---
block/raw-format.c | 7 +++----
block/rbd.c | 5 ++---
block/vdi.c | 11 ++++++-----
block/vhdx.c | 5 ++---
block/vmdk.c | 5 ++---
block/vpc.c | 5 ++---
hw/nvme/ns.c | 2 +-
include/block/block-io.h | 9 +++++----
include/block/block_int-common.h | 4 ++--
include/block/dirty-bitmap.h | 2 +-
qemu-img.c | 6 ++++++
22 files changed, 67 insertions(+), 61 deletions(-)
diff --git a/block.c b/block.c
index f222e1a50a..c23550ae42 100644
--- a/block.c
+++ b/block.c
@@ -6570,7 +6570,13 @@ void bdrv_get_backing_filename(BlockDriverState *bs,
pstrcpy(filename, filename_size, bs->backing_file);
}
-int coroutine_fn bdrv_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+int coroutine_fn GRAPH_RDLOCK
+bdrv_co_get_info_unlocked(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+ return bdrv_get_info(bs, bdi);
+}
+
+int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
int ret;
BlockDriver *drv = bs->drv;
@@ -6581,15 +6587,15 @@ int coroutine_fn bdrv_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
if (!drv) {
return -ENOMEDIUM;
}
- if (!drv->bdrv_co_get_info) {
+ if (!drv->bdrv_get_info) {
BlockDriverState *filtered = bdrv_filter_bs(bs);
if (filtered) {
- return bdrv_co_get_info(filtered, bdi);
+ return bdrv_get_info(filtered, bdi);
}
return -ENOTSUP;
}
memset(bdi, 0, sizeof(*bdi));
- ret = drv->bdrv_co_get_info(bs, bdi);
+ ret = drv->bdrv_get_info(bs, bdi);
if (bdi->subcluster_size == 0) {
/*
* If the driver left this unset, subclusters are not supported.
diff --git a/block/blkio.c b/block/blkio.c
index 4142673984..4b04d530e2 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -972,8 +972,7 @@ static int coroutine_fn blkio_truncate(BlockDriverState *bs, int64_t offset,
return 0;
}
-static int coroutine_fn
-blkio_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int blkio_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
return 0;
}
@@ -1094,7 +1093,7 @@ static void blkio_refresh_limits(BlockDriverState *bs, Error **errp)
.bdrv_close = blkio_close, \
.bdrv_co_getlength = blkio_co_getlength, \
.bdrv_co_truncate = blkio_truncate, \
- .bdrv_co_get_info = blkio_co_get_info, \
+ .bdrv_get_info = blkio_get_info, \
.bdrv_attach_aio_context = blkio_attach_aio_context, \
.bdrv_detach_aio_context = blkio_detach_aio_context, \
.bdrv_co_pdiscard = blkio_co_pdiscard, \
diff --git a/block/crypto.c b/block/crypto.c
index d4226cc68a..3c19848db0 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -862,13 +862,13 @@ fail:
return ret;
}
-static int coroutine_fn GRAPH_RDLOCK
-block_crypto_co_get_info_luks(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int GRAPH_RDLOCK
+block_crypto_get_info_luks(BlockDriverState *bs, BlockDriverInfo *bdi)
{
BlockDriverInfo subbdi;
int ret;
- ret = bdrv_co_get_info(bs->file->bs, &subbdi);
+ ret = bdrv_get_info(bs->file->bs, &subbdi);
if (ret != 0) {
return ret;
}
@@ -1080,7 +1080,7 @@ static BlockDriver bdrv_crypto_luks = {
.bdrv_co_pwritev = block_crypto_co_pwritev,
.bdrv_co_getlength = block_crypto_co_getlength,
.bdrv_measure = block_crypto_measure,
- .bdrv_co_get_info = block_crypto_co_get_info_luks,
+ .bdrv_get_info = block_crypto_get_info_luks,
.bdrv_get_specific_info = block_crypto_get_specific_info_luks,
.bdrv_amend_options = block_crypto_amend_options_luks,
.bdrv_co_amend = block_crypto_co_amend_luks,
diff --git a/block/file-posix.c b/block/file-posix.c
index 9b5f08ccb2..e14d8c9213 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3743,8 +3743,7 @@ static int coroutine_fn raw_co_pwrite_zeroes(
return raw_do_pwrite_zeroes(bs, offset, bytes, flags, false);
}
-static int coroutine_fn
-raw_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
return 0;
}
@@ -4003,7 +4002,7 @@ BlockDriver bdrv_file = {
.bdrv_co_truncate = raw_co_truncate,
.bdrv_co_getlength = raw_co_getlength,
- .bdrv_co_get_info = raw_co_get_info,
+ .bdrv_get_info = raw_get_info,
.bdrv_get_specific_info = raw_get_specific_info,
.bdrv_co_get_allocated_file_size = raw_co_get_allocated_file_size,
.bdrv_get_specific_stats = raw_get_specific_stats,
@@ -4461,7 +4460,7 @@ static BlockDriver bdrv_host_device = {
.bdrv_co_truncate = raw_co_truncate,
.bdrv_co_getlength = raw_co_getlength,
- .bdrv_co_get_info = raw_co_get_info,
+ .bdrv_get_info = raw_get_info,
.bdrv_get_specific_info = raw_get_specific_info,
.bdrv_co_get_allocated_file_size = raw_co_get_allocated_file_size,
.bdrv_get_specific_stats = hdev_get_specific_stats,
diff --git a/block/io.c b/block/io.c
index 4fd7768f9c..43cd7b7507 100644
--- a/block/io.c
+++ b/block/io.c
@@ -743,7 +743,7 @@ bdrv_round_to_subclusters(BlockDriverState *bs, int64_t offset, int64_t bytes,
{
BlockDriverInfo bdi;
IO_CODE();
- if (bdrv_co_get_info(bs, &bdi) < 0 || bdi.subcluster_size == 0) {
+ if (bdrv_get_info(bs, &bdi) < 0 || bdi.subcluster_size == 0) {
*align_offset = offset;
*align_bytes = bytes;
} else {
@@ -758,7 +758,7 @@ static int coroutine_fn GRAPH_RDLOCK bdrv_get_cluster_size(BlockDriverState *bs)
BlockDriverInfo bdi;
int ret;
- ret = bdrv_co_get_info(bs, &bdi);
+ ret = bdrv_get_info(bs, &bdi);
if (ret < 0 || bdi.cluster_size == 0) {
return bs->bl.request_alignment;
} else {
diff --git a/block/iscsi.c b/block/iscsi.c
index 15b96ee880..4b000df117 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -2177,8 +2177,7 @@ static int coroutine_fn iscsi_co_truncate(BlockDriverState *bs, int64_t offset,
return 0;
}
-static int coroutine_fn
-iscsi_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int iscsi_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
IscsiLun *iscsilun = bs->opaque;
bdi->cluster_size = iscsilun->cluster_size;
@@ -2438,7 +2437,7 @@ static BlockDriver bdrv_iscsi = {
.bdrv_co_invalidate_cache = iscsi_co_invalidate_cache,
.bdrv_co_getlength = iscsi_co_getlength,
- .bdrv_co_get_info = iscsi_co_get_info,
+ .bdrv_get_info = iscsi_get_info,
.bdrv_co_truncate = iscsi_co_truncate,
.bdrv_refresh_limits = iscsi_refresh_limits,
@@ -2477,7 +2476,7 @@ static BlockDriver bdrv_iser = {
.bdrv_co_invalidate_cache = iscsi_co_invalidate_cache,
.bdrv_co_getlength = iscsi_co_getlength,
- .bdrv_co_get_info = iscsi_co_get_info,
+ .bdrv_get_info = iscsi_get_info,
.bdrv_co_truncate = iscsi_co_truncate,
.bdrv_refresh_limits = iscsi_refresh_limits,
diff --git a/block/mirror.c b/block/mirror.c
index c2c5099c95..ea32861db1 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1082,7 +1082,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
bdrv_get_backing_filename(target_bs, backing_filename,
sizeof(backing_filename));
bdrv_graph_co_rdlock();
- if (!bdrv_co_get_info(target_bs, &bdi) && bdi.cluster_size) {
+ if (!bdrv_get_info(target_bs, &bdi) && bdi.cluster_size) {
s->target_cluster_size = bdi.cluster_size;
} else {
s->target_cluster_size = BDRV_SECTOR_SIZE;
@@ -1844,6 +1844,7 @@ static BlockJob *mirror_start_job(
GLOBAL_STATE_CODE();
+ bdrv_graph_rdlock_main_loop();
if (granularity == 0) {
granularity = bdrv_get_default_bitmap_granularity(target);
}
@@ -1852,6 +1853,7 @@ static BlockJob *mirror_start_job(
if (buf_size < 0) {
error_setg(errp, "Invalid parameter 'buf-size'");
+ bdrv_graph_rdunlock_main_loop();
return NULL;
}
@@ -1859,7 +1861,6 @@ static BlockJob *mirror_start_job(
buf_size = DEFAULT_MIRROR_BUF_SIZE;
}
- bdrv_graph_rdlock_main_loop();
if (bdrv_skip_filters(bs) == bdrv_skip_filters(target)) {
error_setg(errp, "Can't mirror node into itself");
bdrv_graph_rdunlock_main_loop();
diff --git a/block/monitor/bitmap-qmp-cmds.c b/block/monitor/bitmap-qmp-cmds.c
index a738e7bbf7..27fd5c12f2 100644
--- a/block/monitor/bitmap-qmp-cmds.c
+++ b/block/monitor/bitmap-qmp-cmds.c
@@ -114,7 +114,9 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
}
} else {
/* Default to cluster size, if available: */
+ bdrv_graph_rdlock_main_loop();
granularity = bdrv_get_default_bitmap_granularity(bs);
+ bdrv_graph_rdunlock_main_loop();
}
if (!has_persistent) {
diff --git a/block/qcow.c b/block/qcow.c
index 8a3e7591a9..1e70c943cb 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -1134,8 +1134,7 @@ fail:
return ret;
}
-static int coroutine_fn
-qcow_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int qcow_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
BDRVQcowState *s = bs->opaque;
bdi->cluster_size = s->cluster_size;
@@ -1204,7 +1203,7 @@ static BlockDriver bdrv_qcow = {
.bdrv_make_empty = qcow_make_empty,
.bdrv_co_pwritev_compressed = qcow_co_pwritev_compressed,
- .bdrv_co_get_info = qcow_co_get_info,
+ .bdrv_get_info = qcow_get_info,
.create_opts = &qcow_create_opts,
.strong_runtime_opts = qcow_strong_runtime_opts,
diff --git a/block/qcow2.c b/block/qcow2.c
index 66fba89b41..9af9eea15d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -5236,8 +5236,7 @@ err:
return NULL;
}
-static int coroutine_fn
-qcow2_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int qcow2_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
BDRVQcow2State *s = bs->opaque;
bdi->cluster_size = s->cluster_size;
@@ -6174,7 +6173,7 @@ BlockDriver bdrv_qcow2 = {
.bdrv_snapshot_list = qcow2_snapshot_list,
.bdrv_snapshot_load_tmp = qcow2_snapshot_load_tmp,
.bdrv_measure = qcow2_measure,
- .bdrv_co_get_info = qcow2_co_get_info,
+ .bdrv_get_info = qcow2_get_info,
.bdrv_get_specific_info = qcow2_get_specific_info,
.bdrv_co_save_vmstate = qcow2_co_save_vmstate,
diff --git a/block/qed.c b/block/qed.c
index 4a36fb3929..11bb508f39 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1494,8 +1494,7 @@ static int64_t coroutine_fn bdrv_qed_co_getlength(BlockDriverState *bs)
return s->header.image_size;
}
-static int coroutine_fn
-bdrv_qed_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int bdrv_qed_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
BDRVQEDState *s = bs->opaque;
@@ -1664,7 +1663,7 @@ static BlockDriver bdrv_qed = {
.bdrv_co_pwrite_zeroes = bdrv_qed_co_pwrite_zeroes,
.bdrv_co_truncate = bdrv_qed_co_truncate,
.bdrv_co_getlength = bdrv_qed_co_getlength,
- .bdrv_co_get_info = bdrv_qed_co_get_info,
+ .bdrv_get_info = bdrv_qed_get_info,
.bdrv_refresh_limits = bdrv_qed_refresh_limits,
.bdrv_co_change_backing_file = bdrv_qed_co_change_backing_file,
.bdrv_co_invalidate_cache = bdrv_qed_co_invalidate_cache,
diff --git a/block/raw-format.c b/block/raw-format.c
index df16ac1ea2..473c1a6df4 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -393,10 +393,9 @@ static BlockMeasureInfo *raw_measure(QemuOpts *opts, BlockDriverState *in_bs,
return info;
}
-static int coroutine_fn GRAPH_RDLOCK
-raw_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int GRAPH_RDLOCK raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
- return bdrv_co_get_info(bs->file->bs, bdi);
+ return bdrv_get_info(bs->file->bs, bdi);
}
static void GRAPH_RDLOCK raw_refresh_limits(BlockDriverState *bs, Error **errp)
@@ -660,7 +659,7 @@ BlockDriver bdrv_raw = {
.bdrv_co_getlength = &raw_co_getlength,
.is_format = true,
.bdrv_measure = &raw_measure,
- .bdrv_co_get_info = &raw_co_get_info,
+ .bdrv_get_info = &raw_get_info,
.bdrv_refresh_limits = &raw_refresh_limits,
.bdrv_probe_blocksizes = &raw_probe_blocksizes,
.bdrv_probe_geometry = &raw_probe_geometry,
diff --git a/block/rbd.c b/block/rbd.c
index 951cd63f9a..cf9670480e 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1399,8 +1399,7 @@ coroutine_fn qemu_rbd_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
}
#endif
-static int coroutine_fn
-qemu_rbd_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi)
{
BDRVRBDState *s = bs->opaque;
bdi->cluster_size = s->object_size;
@@ -1822,7 +1821,7 @@ static BlockDriver bdrv_rbd = {
.bdrv_co_create = qemu_rbd_co_create,
.bdrv_co_create_opts = qemu_rbd_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
- .bdrv_co_get_info = qemu_rbd_co_get_info,
+ .bdrv_get_info = qemu_rbd_getinfo,
.bdrv_get_specific_info = qemu_rbd_get_specific_info,
.create_opts = &qemu_rbd_create_opts,
.bdrv_co_getlength = qemu_rbd_co_getlength,
diff --git a/block/vdi.c b/block/vdi.c
index 3ddc62a569..e68455cc4d 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -329,11 +329,12 @@ static int coroutine_fn vdi_co_check(BlockDriverState *bs, BdrvCheckResult *res,
return 0;
}
-static int coroutine_fn
-vdi_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int vdi_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
- /* TODO: vdi_co_get_info would be needed for machine snapshots.
- vm_state_offset is still missing. */
+ /*
+ * TODO: vdi_get_info would be needed for machine snapshots.
+ * vm_state_offset is still missing.
+ */
BDRVVdiState *s = (BDRVVdiState *)bs->opaque;
logout("\n");
bdi->cluster_size = s->block_size;
@@ -1052,7 +1053,7 @@ static BlockDriver bdrv_vdi = {
.bdrv_co_pwritev = vdi_co_pwritev,
#endif
- .bdrv_co_get_info = vdi_co_get_info,
+ .bdrv_get_info = vdi_get_info,
.is_format = true,
.create_opts = &vdi_create_opts,
diff --git a/block/vhdx.c b/block/vhdx.c
index b2a4b813a0..e4d7527bca 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1167,8 +1167,7 @@ static void vhdx_block_translate(BDRVVHDXState *s, int64_t sector_num,
}
-static int coroutine_fn
-vhdx_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int vhdx_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
BDRVVHDXState *s = bs->opaque;
@@ -2257,7 +2256,7 @@ static BlockDriver bdrv_vhdx = {
.bdrv_co_writev = vhdx_co_writev,
.bdrv_co_create = vhdx_co_create,
.bdrv_co_create_opts = vhdx_co_create_opts,
- .bdrv_co_get_info = vhdx_co_get_info,
+ .bdrv_get_info = vhdx_get_info,
.bdrv_co_check = vhdx_co_check,
.bdrv_has_zero_init = vhdx_has_zero_init,
diff --git a/block/vmdk.c b/block/vmdk.c
index 9c7ab037e1..3c5885ba62 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -3047,8 +3047,7 @@ static bool vmdk_extents_type_eq(const VmdkExtent *a, const VmdkExtent *b)
(a->flat || a->cluster_sectors == b->cluster_sectors);
}
-static int coroutine_fn
-vmdk_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int vmdk_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
int i;
BDRVVmdkState *s = bs->opaque;
@@ -3166,7 +3165,7 @@ static BlockDriver bdrv_vmdk = {
.bdrv_has_zero_init = vmdk_has_zero_init,
.bdrv_get_specific_info = vmdk_get_specific_info,
.bdrv_refresh_limits = vmdk_refresh_limits,
- .bdrv_co_get_info = vmdk_co_get_info,
+ .bdrv_get_info = vmdk_get_info,
.bdrv_gather_child_options = vmdk_gather_child_options,
.is_format = true,
diff --git a/block/vpc.c b/block/vpc.c
index 801ff5793f..6e98a57bda 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -605,8 +605,7 @@ fail:
return ret;
}
-static int coroutine_fn
-vpc_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
BDRVVPCState *s = (BDRVVPCState *)bs->opaque;
@@ -1246,7 +1245,7 @@ static BlockDriver bdrv_vpc = {
.bdrv_co_pwritev = vpc_co_pwritev,
.bdrv_co_block_status = vpc_co_block_status,
- .bdrv_co_get_info = vpc_co_get_info,
+ .bdrv_get_info = vpc_get_info,
.is_format = true,
.create_opts = &vpc_create_opts,
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index 6df2e8e7c5..ee3eabb1aa 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -50,7 +50,7 @@ void nvme_ns_init_format(NvmeNamespace *ns)
npdg = ns->blkconf.discard_granularity / ns->lbasz;
- ret = bdrv_get_info(blk_bs(ns->blkconf.blk), &bdi);
+ ret = bdrv_get_info_unlocked(blk_bs(ns->blkconf.blk), &bdi);
if (ret >= 0 && bdi.cluster_size > ns->blkconf.discard_granularity) {
npdg = bdi.cluster_size / ns->lbasz;
}
diff --git a/include/block/block-io.h b/include/block/block-io.h
index b99cc98d26..eb29a04d29 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -193,11 +193,12 @@ bdrv_get_device_name(const BlockDriverState *bs);
const char * GRAPH_RDLOCK
bdrv_get_device_or_node_name(const BlockDriverState *bs);
+/* The coroutine is called 'unlocked' because the wrapper is. */
int coroutine_fn GRAPH_RDLOCK
-bdrv_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
-
-int co_wrapper_mixed_bdrv_rdlock
-bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
+bdrv_co_get_info_unlocked(BlockDriverState *bs, BlockDriverInfo *bdi);
+int co_wrapper_bdrv_rdlock
+bdrv_get_info_unlocked(BlockDriverState *bs, BlockDriverInfo *bdi);
+int GRAPH_RDLOCK bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
ImageInfoSpecific * GRAPH_RDLOCK
bdrv_get_specific_info(BlockDriverState *bs, Error **errp);
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 2982dd3118..22c03c0c40 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -722,8 +722,8 @@ struct BlockDriver {
BlockDriverState *bs, int64_t offset, int64_t bytes,
QEMUIOVector *qiov, size_t qiov_offset);
- int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_get_info)(
- BlockDriverState *bs, BlockDriverInfo *bdi);
+ int GRAPH_RDLOCK_PTR (*bdrv_get_info)(BlockDriverState *bs,
+ BlockDriverInfo *bdi);
ImageInfoSpecific * GRAPH_RDLOCK_PTR (*bdrv_get_specific_info)(
BlockDriverState *bs, Error **errp);
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index fa956debfb..4f5b45d7d7 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -47,7 +47,7 @@ void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
void bdrv_enable_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap);
BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
-uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
+uint32_t GRAPH_RDLOCK bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
bool bdrv_dirty_bitmap_has_successor(BdrvDirtyBitmap *bitmap);
diff --git a/qemu-img.c b/qemu-img.c
index 139eeb5039..8c24956f39 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2506,10 +2506,12 @@ static int img_convert(int argc, char **argv)
src_bs = blk_bs(s.src[bs_i]);
s.src_alignment[bs_i] = DIV_ROUND_UP(src_bs->bl.request_alignment,
BDRV_SECTOR_SIZE);
+ bdrv_graph_rdlock_main_loop();
if (!bdrv_get_info(src_bs, &bdi)) {
s.src_alignment[bs_i] = MAX(s.src_alignment[bs_i],
bdi.cluster_size / BDRV_SECTOR_SIZE);
}
+ bdrv_graph_rdunlock_main_loop();
s.total_sectors += s.src_sectors[bs_i];
}
@@ -2768,7 +2770,9 @@ static int img_convert(int argc, char **argv)
s.target_backing_sectors = -1;
}
+ bdrv_graph_rdlock_main_loop();
ret = bdrv_get_info(out_bs, &bdi);
+ bdrv_graph_rdunlock_main_loop();
if (ret < 0) {
if (s.compressed) {
error_report("could not get block driver info");
@@ -3694,7 +3698,9 @@ static int img_rebase(int argc, char **argv)
* We need overlay subcluster size (or cluster size in case writes are
* compressed) to make sure write requests are aligned.
*/
+ bdrv_graph_rdlock_main_loop();
ret = bdrv_get_info(unfiltered_bs, &bdi);
+ bdrv_graph_rdunlock_main_loop();
if (ret < 0) {
error_report("could not get block driver info");
goto out;
--
2.39.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 4/4] iotests: add test for snapshot on top of node with compress filter
2025-06-04 12:07 [PATCH 0/4] block: fix deadlock when doing blockdev-snapshot of a node with compress filter Fiona Ebner
` (2 preceding siblings ...)
2025-06-04 12:07 ` [PATCH 3/4] block: make calling bdrv_refresh_limits() safe while holding graph lock Fiona Ebner
@ 2025-06-04 12:07 ` Fiona Ebner
3 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-06-04 12:07 UTC (permalink / raw)
To: qemu-block
Cc: qemu-devel, kwolf, hreitz, stefanha, fam, jsnow, vsementsov,
eblake, kbusch, its, foss
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
tests/qemu-iotests/085 | 18 ++++++++++++++++++
tests/qemu-iotests/085.out | 21 +++++++++++++++++++++
2 files changed, 39 insertions(+)
diff --git a/tests/qemu-iotests/085 b/tests/qemu-iotests/085
index 3fb7b0b5c8..5063e3e8d2 100755
--- a/tests/qemu-iotests/085
+++ b/tests/qemu-iotests/085
@@ -270,6 +270,24 @@ _send_qemu_cmd $h "{ 'execute': 'blockdev-snapshot',
'overlay':'snap_${SNAPSHOTS}' }
}" "error"
+echo
+echo === Add snapshot in combination with compress filter ===
+echo
+
+SNAPSHOTS=$((${SNAPSHOTS}+1))
+add_snapshot_image ${SNAPSHOTS}
+
+_send_qemu_cmd $h "{ 'execute': 'blockdev-add',
+ 'arguments': { 'driver':'compress',
+ 'node-name':'compress',
+ 'file':'virtio0' }
+ }" "return"
+
+_send_qemu_cmd $h "{ 'execute': 'blockdev-snapshot',
+ 'arguments': { 'node':'compress',
+ 'overlay':'snap_${SNAPSHOTS}' }
+ }" "return"
+
# success, all done
echo "*** done"
rm -f $seq.full
diff --git a/tests/qemu-iotests/085.out b/tests/qemu-iotests/085.out
index b543b992ff..57d68f3bb9 100644
--- a/tests/qemu-iotests/085.out
+++ b/tests/qemu-iotests/085.out
@@ -255,4 +255,25 @@ Formatting 'TEST_DIR/15-snapshot-v0.IMGFMT', fmt=IMGFMT size=134217728 backing_f
'overlay':'snap_15' }
}
{"error": {"class": "GenericError", "desc": "Cannot find device='nodevice' nor node-name='nodevice'"}}
+
+=== Add snapshot in combination with compress filter ===
+
+Formatting 'TEST_DIR/16-snapshot-v0.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/15-snapshot-v0.IMGFMT backing_fmt=IMGFMT
+{ 'execute': 'blockdev-add', 'arguments':
+ { 'driver': 'IMGFMT', 'node-name': 'snap_16', 'backing': null,
+ 'file':
+ { 'driver': 'file', 'filename': 'TEST_DIR/16-snapshot-v0.IMGFMT',
+ 'node-name': 'file_16' } } }
+{"return": {}}
+{ 'execute': 'blockdev-add',
+ 'arguments': { 'driver':'compress',
+ 'node-name':'compress',
+ 'file':'virtio0' }
+ }
+{"return": {}}
+{ 'execute': 'blockdev-snapshot',
+ 'arguments': { 'node':'compress',
+ 'overlay':'snap_16' }
+ }
+{"return": {}}
*** done
--
2.39.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 3/4] block: make calling bdrv_refresh_limits() safe while holding graph lock
2025-06-04 12:07 ` [PATCH 3/4] block: make calling bdrv_refresh_limits() safe while holding graph lock Fiona Ebner
@ 2025-06-04 12:15 ` Klaus Jensen
0 siblings, 0 replies; 7+ messages in thread
From: Klaus Jensen @ 2025-06-04 12:15 UTC (permalink / raw)
To: Fiona Ebner
Cc: qemu-block, qemu-devel, kwolf, hreitz, stefanha, fam, jsnow,
vsementsov, eblake, kbusch, foss
[-- Attachment #1: Type: text/plain, Size: 2443 bytes --]
On Jun 4 14:07, Fiona Ebner wrote:
> The bdrv_refresh_limits() function and driver implementations are
> called with the graph lock held. The implementation for the 'compress'
> filter calls bdrv_get_info(), which is a generated coroutine wrapper
> and thus polls. This can lead to a deadlock when issuing a
> blockdev-snapshot QMP command, when bdrv_refresh_limits() is called in
> bdrv_append() while the graph lock is held exclusively. This deadlock
> was introduced with commit 5661a00d40 ("block: Call transaction
> callbacks with lock held").
>
> As a solution, this reverts commit 3d47eb0a2a ("block:
> Convert bdrv_get_info() to co_wrapper_mixed"). To do this, it is
> necessary to have callers of bdrv_get_info() take the graph lock
> themselves. None of the driver implementations rely on being run in
> coroutine context and none of the callers rely on the function being
> a coroutine.
>
> All callers except one either already hold the graph lock or can claim
> the graph lock via bdrv_graph_rdlock_main_loop(). As part of this,
> bdrv_get_default_bitmap_granularity() is annotated with GRAPH_RDLOCK
> and its callers adapted where necessary.
>
> The single exception is the caller nvme_ns_init_format(), which can
> run as a callback in an IO thread, but can also be reached via the QOM
> realize handler nvme_ns_realize(). For this caller, a
> bdrv_get_info_unlocked() coroutine wrapper is introduced that must be
> called with the graph unlocked.
>
> diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
> index 6df2e8e7c5..ee3eabb1aa 100644
> --- a/hw/nvme/ns.c
> +++ b/hw/nvme/ns.c
> @@ -50,7 +50,7 @@ void nvme_ns_init_format(NvmeNamespace *ns)
>
> npdg = ns->blkconf.discard_granularity / ns->lbasz;
>
> - ret = bdrv_get_info(blk_bs(ns->blkconf.blk), &bdi);
> + ret = bdrv_get_info_unlocked(blk_bs(ns->blkconf.blk), &bdi);
> if (ret >= 0 && bdi.cluster_size > ns->blkconf.discard_granularity) {
> npdg = bdi.cluster_size / ns->lbasz;
> }
Acked-by: Klaus Jensen <k.jensen@samsung.com>
FWIW, if there is a better way to get the cluster size I'd very much
like to change what we do in hw/nvme for that. We need it to
infer/compute the "preferred deallocation granularity".
But maybe, it's just not the correct way to do this, and we shouldn't
try to do it at all and not report a preferred dealllocation
granularity. It does seem brittle, or?
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/4] block/graph-lock: fix typo in comment
2025-06-04 12:07 ` [PATCH 1/4] block/graph-lock: fix typo in comment Fiona Ebner
@ 2025-06-17 18:36 ` Eric Blake
0 siblings, 0 replies; 7+ messages in thread
From: Eric Blake @ 2025-06-17 18:36 UTC (permalink / raw)
To: Fiona Ebner
Cc: qemu-block, qemu-devel, kwolf, hreitz, stefanha, fam, jsnow,
vsementsov, kbusch, its, foss
On Wed, Jun 04, 2025 at 02:07:14PM +0200, Fiona Ebner wrote:
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> include/block/graph-lock.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
>
> diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
> index 2c26c72108..67c8d04867 100644
> --- a/include/block/graph-lock.h
> +++ b/include/block/graph-lock.h
> @@ -169,7 +169,7 @@ bdrv_graph_rdunlock_main_loop(void);
> /*
> * assert_bdrv_graph_readable:
> * Make sure that the reader is either the main loop,
> - * or there is at least a reader helding the rdlock.
> + * or there is at least a reader holding the rdlock.
> * In this way an incoming writer is aware of the read and waits.
> */
> void GRAPH_RDLOCK assert_bdrv_graph_readable(void);
> --
> 2.39.5
>
>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-06-17 18:37 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-04 12:07 [PATCH 0/4] block: fix deadlock when doing blockdev-snapshot of a node with compress filter Fiona Ebner
2025-06-04 12:07 ` [PATCH 1/4] block/graph-lock: fix typo in comment Fiona Ebner
2025-06-17 18:36 ` Eric Blake
2025-06-04 12:07 ` [PATCH 2/4] block-coroutine-wrapper: mark generated co-wrapper as GRAPH_UNLOCKED Fiona Ebner
2025-06-04 12:07 ` [PATCH 3/4] block: make calling bdrv_refresh_limits() safe while holding graph lock Fiona Ebner
2025-06-04 12:15 ` Klaus Jensen
2025-06-04 12:07 ` [PATCH 4/4] iotests: add test for snapshot on top of node with compress filter Fiona Ebner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).