* [PATCH v2 0/5] block: update inserted/removed nodes from BlockRAMRegistrar
@ 2025-10-07 18:34 Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 1/5] block: add BdrvChildClass->propagate_attach/detach() callbacks Stefan Hajnoczi
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 18:34 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Eric Blake, qemu-block, Hanna Reitz, Stefan Hajnoczi
v2:
- Add a tests/functional test case [Kevin]
This patch series fixes a bug in BlockRAMRegistrar: it currently doesn't react
to block graph changes and newly inserted nodes lack RAMBlock information
needed to map I/O buffers. This is important for vdpa-blk devices because they
rely on the ability to map I/O buffers.
Stefan Hajnoczi (5):
block: add BdrvChildClass->propagate_attach/detach() callbacks
block: add blk_add_attach/detach_notifier() APIs
block: rename RAMBlockRegistrar->notifier field
block: update inserted/removed nodes from BlockRAMRegistrar
tests/functional: add vdpa-blk blockdev-mirror test
include/block/block_int-common.h | 11 ++
include/system/block-backend-global-state.h | 9 +
include/system/block-ram-registrar.h | 4 +-
block.c | 56 ++++--
block/block-backend.c | 44 +++++
block/block-ram-registrar.c | 73 +++++++-
tests/unit/test-block-backend.c | 164 ++++++++++++++++++
tests/functional/x86_64/meson.build | 1 +
.../functional/x86_64/test_vdpa_blk_mirror.py | 118 +++++++++++++
9 files changed, 461 insertions(+), 19 deletions(-)
create mode 100755 tests/functional/x86_64/test_vdpa_blk_mirror.py
--
2.51.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 1/5] block: add BdrvChildClass->propagate_attach/detach() callbacks
2025-10-07 18:34 [PATCH v2 0/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
@ 2025-10-07 18:34 ` Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 2/5] block: add blk_add_attach/detach_notifier() APIs Stefan Hajnoczi
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 18:34 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Eric Blake, qemu-block, Hanna Reitz, Stefan Hajnoczi
bdrv_replace_child_noperm() is the heart of block graph changes. It is
called when a node is inserted, removed, or replaced. These changes
happen without notifying parents in the graph, so it is currently not
possible to monitor changes.
Add BdrvChildClass callbacks to propagate attach/detach operations to
the roots of the graph. The next commit will introduce a BlockBackend
API for monitoring changes using this mechanism.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
include/block/block_int-common.h | 11 +++++++
block.c | 56 +++++++++++++++++++++++++-------
2 files changed, 56 insertions(+), 11 deletions(-)
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 034c0634c8..0368a3191c 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -963,6 +963,17 @@ struct BdrvChildClass {
void GRAPH_RDLOCK_PTR (*activate)(BdrvChild *child, Error **errp);
int GRAPH_RDLOCK_PTR (*inactivate)(BdrvChild *child);
+ /*
+ * Optional callbacks when a descendant (child, grandchild, etc) attaches
+ * or detaches a BlockDriverState. Allows monitoring changes to the graph.
+ *
+ * Called after ->attach() and before ->detach().
+ */
+ void GRAPH_WRLOCK_PTR (*propagate_attach)(BdrvChild *self,
+ BdrvChild *descendant);
+ void GRAPH_WRLOCK_PTR (*propagate_detach)(BdrvChild *self,
+ BdrvChild *descendant);
+
void GRAPH_WRLOCK_PTR (*attach)(BdrvChild *child);
void GRAPH_WRLOCK_PTR (*detach)(BdrvChild *child);
diff --git a/block.c b/block.c
index 8848e9a7ed..e1480dda04 100644
--- a/block.c
+++ b/block.c
@@ -1497,6 +1497,32 @@ static void GRAPH_WRLOCK bdrv_child_cb_detach(BdrvChild *child)
}
}
+static void GRAPH_WRLOCK
+bdrv_child_cb_propagate_attach(BdrvChild *self, BdrvChild *descendant)
+{
+ BlockDriverState *bs = self->opaque;
+ BdrvChild *c;
+
+ QLIST_FOREACH(c, &bs->parents, next_parent) {
+ if (c->klass->propagate_attach) {
+ c->klass->propagate_attach(c, descendant);
+ }
+ }
+}
+
+static void GRAPH_WRLOCK
+bdrv_child_cb_propagate_detach(BdrvChild *self, BdrvChild *descendant)
+{
+ BlockDriverState *bs = self->opaque;
+ BdrvChild *c;
+
+ QLIST_FOREACH(c, &bs->parents, next_parent) {
+ if (c->klass->propagate_detach) {
+ c->klass->propagate_detach(c, descendant);
+ }
+ }
+}
+
static int bdrv_child_cb_update_filename(BdrvChild *c, BlockDriverState *base,
const char *filename,
bool backing_mask_protocol,
@@ -1519,17 +1545,19 @@ AioContext *child_of_bds_get_parent_aio_context(BdrvChild *c)
}
const BdrvChildClass child_of_bds = {
- .parent_is_bds = true,
- .get_parent_desc = bdrv_child_get_parent_desc,
- .inherit_options = bdrv_inherited_options,
- .drained_begin = bdrv_child_cb_drained_begin,
- .drained_poll = bdrv_child_cb_drained_poll,
- .drained_end = bdrv_child_cb_drained_end,
- .attach = bdrv_child_cb_attach,
- .detach = bdrv_child_cb_detach,
- .inactivate = bdrv_child_cb_inactivate,
- .change_aio_ctx = bdrv_child_cb_change_aio_ctx,
- .update_filename = bdrv_child_cb_update_filename,
+ .parent_is_bds = true,
+ .get_parent_desc = bdrv_child_get_parent_desc,
+ .inherit_options = bdrv_inherited_options,
+ .drained_begin = bdrv_child_cb_drained_begin,
+ .drained_poll = bdrv_child_cb_drained_poll,
+ .drained_end = bdrv_child_cb_drained_end,
+ .attach = bdrv_child_cb_attach,
+ .detach = bdrv_child_cb_detach,
+ .propagate_attach = bdrv_child_cb_propagate_attach,
+ .propagate_detach = bdrv_child_cb_propagate_detach,
+ .inactivate = bdrv_child_cb_inactivate,
+ .change_aio_ctx = bdrv_child_cb_change_aio_ctx,
+ .update_filename = bdrv_child_cb_update_filename,
.get_parent_aio_context = child_of_bds_get_parent_aio_context,
};
@@ -2967,6 +2995,9 @@ bdrv_replace_child_noperm(BdrvChild *child, BlockDriverState *new_bs)
}
if (old_bs) {
+ if (child->klass->propagate_detach) {
+ child->klass->propagate_detach(child, child);
+ }
if (child->klass->detach) {
child->klass->detach(child);
}
@@ -2980,6 +3011,9 @@ bdrv_replace_child_noperm(BdrvChild *child, BlockDriverState *new_bs)
if (child->klass->attach) {
child->klass->attach(child);
}
+ if (child->klass->propagate_attach) {
+ child->klass->propagate_attach(child, child);
+ }
}
/*
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 2/5] block: add blk_add_attach/detach_notifier() APIs
2025-10-07 18:34 [PATCH v2 0/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 1/5] block: add BdrvChildClass->propagate_attach/detach() callbacks Stefan Hajnoczi
@ 2025-10-07 18:34 ` Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 3/5] block: rename RAMBlockRegistrar->notifier field Stefan Hajnoczi
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 18:34 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Eric Blake, qemu-block, Hanna Reitz, Stefan Hajnoczi
Add an API to receive a callback when a BlockDriverState is attached or
detached from the block graph. This allows callers to track insert,
remove, and replace operations.
Note that this API might be able to replace the
blk_add_remove_bs_notifier() API if the callback checks that the root
BDS is being removed and ignores all other calls. I think this is a bit
messy though, so I kept blk_add_remove_bs_notifier().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
include/system/block-backend-global-state.h | 9 ++
block/block-backend.c | 44 ++++++
tests/unit/test-block-backend.c | 164 ++++++++++++++++++++
3 files changed, 217 insertions(+)
diff --git a/include/system/block-backend-global-state.h b/include/system/block-backend-global-state.h
index c3849640df..e828f271ee 100644
--- a/include/system/block-backend-global-state.h
+++ b/include/system/block-backend-global-state.h
@@ -97,6 +97,15 @@ void blk_remove_aio_context_notifier(BlockBackend *blk,
void (*detach_aio_context)(void *),
void *opaque);
void blk_add_remove_bs_notifier(BlockBackend *blk, Notifier *notify);
+
+typedef struct {
+ BlockBackend *blk;
+ BlockDriverState *bs;
+} BlockBackendAttachDetachArgs;
+
+void blk_add_attach_notifier(BlockBackend *blk, Notifier *notify);
+void blk_add_detach_notifier(BlockBackend *blk, Notifier *notify);
+
BlockBackendRootState *blk_get_root_state(BlockBackend *blk);
void blk_update_root_state(BlockBackend *blk);
bool blk_get_detect_zeroes_from_root_state(BlockBackend *blk);
diff --git a/block/block-backend.c b/block/block-backend.c
index f8d6ba65c1..45e31f0079 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -76,6 +76,7 @@ struct BlockBackend {
/* Protected by BQL */
NotifierList remove_bs_notifiers, insert_bs_notifiers;
+ NotifierList attach_notifiers, detach_notifiers;
QLIST_HEAD(, BlockBackendAioNotifier) aio_notifiers;
int quiesce_counter; /* atomic: written under BQL, read by other threads */
@@ -280,6 +281,30 @@ static int GRAPH_RDLOCK blk_root_inactivate(BdrvChild *child)
return 0;
}
+static void GRAPH_WRLOCK
+blk_root_propagate_attach(BdrvChild *self, BdrvChild *descendant)
+{
+ BlockBackend *blk = self->opaque;
+ BlockBackendAttachDetachArgs args = {
+ .blk = blk,
+ .bs = descendant->bs,
+ };
+
+ notifier_list_notify(&blk->attach_notifiers, &args);
+}
+
+static void GRAPH_WRLOCK
+blk_root_propagate_detach(BdrvChild *self, BdrvChild *descendant)
+{
+ BlockBackend *blk = self->opaque;
+ BlockBackendAttachDetachArgs args = {
+ .blk = blk,
+ .bs = descendant->bs,
+ };
+
+ notifier_list_notify(&blk->detach_notifiers, &args);
+}
+
static void blk_root_attach(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
@@ -333,6 +358,9 @@ static const BdrvChildClass child_root = {
.activate = blk_root_activate,
.inactivate = blk_root_inactivate,
+ .propagate_attach = blk_root_propagate_attach,
+ .propagate_detach = blk_root_propagate_detach,
+
.attach = blk_root_attach,
.detach = blk_root_detach,
@@ -374,6 +402,8 @@ BlockBackend *blk_new(AioContext *ctx, uint64_t perm, uint64_t shared_perm)
qemu_co_queue_init(&blk->queued_requests);
notifier_list_init(&blk->remove_bs_notifiers);
notifier_list_init(&blk->insert_bs_notifiers);
+ notifier_list_init(&blk->attach_notifiers);
+ notifier_list_init(&blk->detach_notifiers);
QLIST_INIT(&blk->aio_notifiers);
QTAILQ_INSERT_TAIL(&block_backends, blk, link);
@@ -492,6 +522,8 @@ static void blk_delete(BlockBackend *blk)
}
assert(QLIST_EMPTY(&blk->remove_bs_notifiers.notifiers));
assert(QLIST_EMPTY(&blk->insert_bs_notifiers.notifiers));
+ assert(QLIST_EMPTY(&blk->attach_notifiers.notifiers));
+ assert(QLIST_EMPTY(&blk->detach_notifiers.notifiers));
assert(QLIST_EMPTY(&blk->aio_notifiers));
assert(qemu_co_queue_empty(&blk->queued_requests));
qemu_mutex_destroy(&blk->queued_requests_lock);
@@ -2512,6 +2544,18 @@ void blk_add_remove_bs_notifier(BlockBackend *blk, Notifier *notify)
notifier_list_add(&blk->remove_bs_notifiers, notify);
}
+void blk_add_attach_notifier(BlockBackend *blk, Notifier *notify)
+{
+ GLOBAL_STATE_CODE();
+ notifier_list_add(&blk->attach_notifiers, notify);
+}
+
+void blk_add_detach_notifier(BlockBackend *blk, Notifier *notify)
+{
+ GLOBAL_STATE_CODE();
+ notifier_list_add(&blk->detach_notifiers, notify);
+}
+
BlockAcctStats *blk_get_stats(BlockBackend *blk)
{
IO_CODE();
diff --git a/tests/unit/test-block-backend.c b/tests/unit/test-block-backend.c
index 4257b3f815..693383f029 100644
--- a/tests/unit/test-block-backend.c
+++ b/tests/unit/test-block-backend.c
@@ -24,6 +24,7 @@
#include "qemu/osdep.h"
#include "block/block.h"
+#include "block/block_int.h"
#include "system/block-backend.h"
#include "qapi/error.h"
#include "qemu/main-loop.h"
@@ -70,6 +71,167 @@ static void test_drain_all_aio_error(void)
blk_unref(blk);
}
+static int bdrv_test_co_change_backing_file(BlockDriverState *bs,
+ const char *backing_file,
+ const char *backing_fmt)
+{
+ return 0; /* just return success so bdrv_set_backing_hd() works */
+}
+
+static BlockDriver bdrv_test = {
+ .format_name = "test",
+ .supports_backing = true,
+ .bdrv_child_perm = bdrv_default_perms,
+ .bdrv_co_change_backing_file = bdrv_test_co_change_backing_file,
+};
+
+typedef struct {
+ Notifier attach_notifier;
+ Notifier detach_notifier;
+ GArray *notifications;
+} AttachDetach;
+
+typedef enum {
+ NOTIFY_END = 0,
+ NOTIFY_ATTACH,
+ NOTIFY_DETACH,
+} NotificationType;
+
+typedef struct {
+ NotificationType type;
+ BlockDriverState *bs;
+} Notification;
+
+static void attach_detach_append(AttachDetach *ad, NotificationType type,
+ BlockBackendAttachDetachArgs *args)
+{
+ Notification n = {
+ .type = type,
+ .bs = args->bs,
+ };
+
+ g_array_append_vals(ad->notifications, &n, 1);
+}
+
+static void attach_notify(Notifier *notifier, void *data)
+{
+ AttachDetach *ad = container_of(notifier, AttachDetach, attach_notifier);
+ attach_detach_append(ad, NOTIFY_ATTACH, data);
+}
+
+static void detach_notify(Notifier *notifier, void *data)
+{
+ AttachDetach *ad = container_of(notifier, AttachDetach, attach_notifier);
+ attach_detach_append(ad, NOTIFY_DETACH, data);
+}
+
+static void attach_detach_init(AttachDetach *ad, BlockBackend *blk)
+{
+ ad->attach_notifier.notify = attach_notify;
+ ad->detach_notifier.notify = detach_notify;
+ ad->notifications = g_array_new(true, true, sizeof(Notification));
+
+ blk_add_attach_notifier(blk, &ad->attach_notifier);
+ blk_add_detach_notifier(blk, &ad->detach_notifier);
+}
+
+static void attach_detach_cleanup(AttachDetach *ad)
+{
+ g_array_free(ad->notifications, true);
+ notifier_remove(&ad->detach_notifier);
+ notifier_remove(&ad->attach_notifier);
+}
+
+/*
+ * Check that the expected notifications occurred. @expected is terminated by a
+ * NOTIFY_END element.
+ */
+static void attach_detach_expect(AttachDetach *ad, const Notification *expected)
+{
+ GArray *array = ad->notifications;
+
+ /* The array is zero terminated so there is at least one element */
+ Notification *actual = (Notification *)array->data;
+
+ while (expected->type != NOTIFY_END) {
+ g_assert_cmpint(actual->type, ==, expected->type);
+ g_assert(actual->bs == expected->bs);
+ expected++;
+ actual++;
+ }
+
+ g_assert_cmpint(actual->type, ==, NOTIFY_END);
+
+ g_array_remove_range(array, 0, array->len);
+}
+
+static void test_attach_detach_notifier(void)
+{
+ AttachDetach ad;
+ BlockDriverState *format;
+ BlockDriverState *file;
+ BlockDriverState *file2;
+ BlockBackend *blk = blk_new(qemu_get_aio_context(),
+ BLK_PERM_ALL, BLK_PERM_ALL);
+
+ attach_detach_init(&ad, blk);
+
+ format = bdrv_new_open_driver(&bdrv_test, "format", BDRV_O_RDWR,
+ &error_abort);
+ file = bdrv_new_open_driver(&bdrv_test, "file", BDRV_O_RDWR, &error_abort);
+ file2 = bdrv_new_open_driver(&bdrv_test, "file2", BDRV_O_RDWR,
+ &error_abort);
+
+ bdrv_graph_wrlock_drained();
+ bdrv_attach_child(format, file, "file", &child_of_bds,
+ BDRV_CHILD_PRIMARY | BDRV_CHILD_DATA, &error_abort);
+ bdrv_graph_wrunlock();
+
+ /* Insert format -> file */
+ blk_insert_bs(blk, format, &error_abort);
+ attach_detach_expect(&ad, (Notification[]){
+ (Notification){NOTIFY_ATTACH, format},
+ {},
+ });
+
+ /* Replace format -> file with file2 */
+ blk_replace_bs(blk, file2, &error_abort);
+ attach_detach_expect(&ad, (Notification[]){
+ (Notification){NOTIFY_DETACH, format},
+ (Notification){NOTIFY_ATTACH, file2},
+ {},
+ });
+
+ /* Remove file2 */
+ blk_remove_bs(blk);
+ attach_detach_expect(&ad, (Notification[]){
+ (Notification){NOTIFY_DETACH, file2},
+ {},
+ });
+
+ /* These BDSes were unrefed so we need new instances */
+ file = bdrv_new_open_driver(&bdrv_test, "file", BDRV_O_RDWR, &error_abort);
+ file2 = bdrv_new_open_driver(&bdrv_test, "file2", BDRV_O_RDWR,
+ &error_abort);
+
+ /* Replace a non-root node */
+ bdrv_graph_wrlock_drained();
+ bdrv_attach_child(format, file, "file", &child_of_bds,
+ BDRV_CHILD_PRIMARY | BDRV_CHILD_DATA, &error_abort);
+ bdrv_replace_node(file, file2, &error_abort);
+ bdrv_graph_wrunlock();
+ attach_detach_expect(&ad, (Notification[]){
+ (Notification){NOTIFY_ATTACH, file},
+ (Notification){NOTIFY_DETACH, file},
+ (Notification){NOTIFY_ATTACH, file2},
+ {},
+ });
+
+ attach_detach_cleanup(&ad);
+ blk_unref(blk);
+ bdrv_unref(format);
+}
+
int main(int argc, char **argv)
{
bdrv_init();
@@ -80,6 +242,8 @@ int main(int argc, char **argv)
g_test_add_func("/block-backend/drain_aio_error", test_drain_aio_error);
g_test_add_func("/block-backend/drain_all_aio_error",
test_drain_all_aio_error);
+ g_test_add_func("/block-backend/attach_detach_notifier",
+ test_attach_detach_notifier);
return g_test_run();
}
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 3/5] block: rename RAMBlockRegistrar->notifier field
2025-10-07 18:34 [PATCH v2 0/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 1/5] block: add BdrvChildClass->propagate_attach/detach() callbacks Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 2/5] block: add blk_add_attach/detach_notifier() APIs Stefan Hajnoczi
@ 2025-10-07 18:34 ` Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 4/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 5/5] tests/functional: add vdpa-blk blockdev-mirror test Stefan Hajnoczi
4 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 18:34 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Eric Blake, qemu-block, Hanna Reitz, Stefan Hajnoczi
The 'notifier' field name will be confusing when the BlockBackend attach
and detach notifiers are added in the next commit. Rename the field so
it's clear that this is the RAMBlock notifier.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
include/system/block-ram-registrar.h | 2 +-
block/block-ram-registrar.c | 14 ++++++++------
2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/include/system/block-ram-registrar.h b/include/system/block-ram-registrar.h
index d8b2f7942b..76c157bd54 100644
--- a/include/system/block-ram-registrar.h
+++ b/include/system/block-ram-registrar.h
@@ -21,7 +21,7 @@
*/
typedef struct {
BlockBackend *blk;
- RAMBlockNotifier notifier;
+ RAMBlockNotifier ram_block_notifier;
bool ok;
} BlockRAMRegistrar;
diff --git a/block/block-ram-registrar.c b/block/block-ram-registrar.c
index fcda2b86af..d5b84667a1 100644
--- a/block/block-ram-registrar.c
+++ b/block/block-ram-registrar.c
@@ -12,7 +12,8 @@
static void ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
size_t max_size)
{
- BlockRAMRegistrar *r = container_of(n, BlockRAMRegistrar, notifier);
+ BlockRAMRegistrar *r =
+ container_of(n, BlockRAMRegistrar, ram_block_notifier);
Error *err = NULL;
if (!r->ok) {
@@ -21,7 +22,7 @@ static void ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
if (!blk_register_buf(r->blk, host, max_size, &err)) {
error_report_err(err);
- ram_block_notifier_remove(&r->notifier);
+ ram_block_notifier_remove(n);
r->ok = false;
}
}
@@ -29,14 +30,15 @@ static void ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
static void ram_block_removed(RAMBlockNotifier *n, void *host, size_t size,
size_t max_size)
{
- BlockRAMRegistrar *r = container_of(n, BlockRAMRegistrar, notifier);
+ BlockRAMRegistrar *r =
+ container_of(n, BlockRAMRegistrar, ram_block_notifier);
blk_unregister_buf(r->blk, host, max_size);
}
void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk)
{
r->blk = blk;
- r->notifier = (RAMBlockNotifier){
+ r->ram_block_notifier = (RAMBlockNotifier){
.ram_block_added = ram_block_added,
.ram_block_removed = ram_block_removed,
@@ -47,12 +49,12 @@ void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk)
};
r->ok = true;
- ram_block_notifier_add(&r->notifier);
+ ram_block_notifier_add(&r->ram_block_notifier);
}
void blk_ram_registrar_destroy(BlockRAMRegistrar *r)
{
if (r->ok) {
- ram_block_notifier_remove(&r->notifier);
+ ram_block_notifier_remove(&r->ram_block_notifier);
}
}
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 4/5] block: update inserted/removed nodes from BlockRAMRegistrar
2025-10-07 18:34 [PATCH v2 0/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
` (2 preceding siblings ...)
2025-10-07 18:34 ` [PATCH v2 3/5] block: rename RAMBlockRegistrar->notifier field Stefan Hajnoczi
@ 2025-10-07 18:34 ` Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 5/5] tests/functional: add vdpa-blk blockdev-mirror test Stefan Hajnoczi
4 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 18:34 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Eric Blake, qemu-block, Hanna Reitz, Stefan Hajnoczi,
Yingshun Cui
BlockRAMRegistrar ensures that RAMBlocks are registered with
BlockDriverStates. This is essential for vdpa-blk because they need to
know the memory mappings of I/O buffers. However, BlockRAMRegistrar is
currently unaware of changes to the block graph and newly inserted nodes
have no RAMBlocks registered.
Use the new blk_add_attach_notifier() and blk_add_detach_notifier() APIs
to bring nodes up to speed when the graph changes. This fixes vdpa-blk
across mirror and other operations that modify the block graph.
Previously I/O would not succeed after a new node was inserted due to
missing memory mappings.
Buglink: https://issues.redhat.com/browse/RHEL-88175
Reported-by: Yingshun Cui <yicui@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
include/system/block-ram-registrar.h | 2 +
block/block-ram-registrar.c | 61 +++++++++++++++++++++++++++-
2 files changed, 61 insertions(+), 2 deletions(-)
diff --git a/include/system/block-ram-registrar.h b/include/system/block-ram-registrar.h
index 76c157bd54..292c197d1c 100644
--- a/include/system/block-ram-registrar.h
+++ b/include/system/block-ram-registrar.h
@@ -22,6 +22,8 @@
typedef struct {
BlockBackend *blk;
RAMBlockNotifier ram_block_notifier;
+ Notifier blk_attach_notifier;
+ Notifier blk_detach_notifier;
bool ok;
} BlockRAMRegistrar;
diff --git a/block/block-ram-registrar.c b/block/block-ram-registrar.c
index d5b84667a1..2d334c5655 100644
--- a/block/block-ram-registrar.c
+++ b/block/block-ram-registrar.c
@@ -7,7 +7,9 @@
#include "qemu/osdep.h"
#include "system/block-backend.h"
#include "system/block-ram-registrar.h"
+#include "system/ramblock.h"
#include "qapi/error.h"
+#include "trace.h"
static void ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
size_t max_size)
@@ -22,8 +24,8 @@ static void ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
if (!blk_register_buf(r->blk, host, max_size, &err)) {
error_report_err(err);
- ram_block_notifier_remove(n);
- r->ok = false;
+ blk_ram_registrar_destroy(r);
+ return;
}
}
@@ -35,6 +37,50 @@ static void ram_block_removed(RAMBlockNotifier *n, void *host, size_t size,
blk_unregister_buf(r->blk, host, max_size);
}
+static void blk_attached(Notifier *n, void *data)
+{
+ BlockRAMRegistrar *r =
+ container_of(n, BlockRAMRegistrar, blk_attach_notifier);
+ BlockBackendAttachDetachArgs *args = data;
+ BlockDriverState *bs = args->bs;
+ Error *err = NULL;
+
+ WITH_RCU_READ_LOCK_GUARD() {
+ RAMBlock *rb;
+
+ RAMBLOCK_FOREACH(rb) {
+ ram_addr_t max_size = qemu_ram_get_max_length(rb);
+ void *host = qemu_ram_get_host_addr(rb);
+
+ if (!bdrv_register_buf(bs, host, max_size, &err)) {
+ goto err;
+ }
+ }
+ }
+
+ return;
+
+err:
+ error_report_err(err);
+ blk_ram_registrar_destroy(r);
+}
+
+static void blk_detached(Notifier *n, void *data)
+{
+ BlockBackendAttachDetachArgs *args = data;
+ BlockDriverState *bs = args->bs;
+ RAMBlock *rb;
+
+ RCU_READ_LOCK_GUARD();
+
+ RAMBLOCK_FOREACH(rb) {
+ ram_addr_t max_size = qemu_ram_get_max_length(rb);
+ void *host = qemu_ram_get_host_addr(rb);
+
+ bdrv_unregister_buf(bs, host, max_size);
+ }
+}
+
void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk)
{
r->blk = blk;
@@ -47,14 +93,25 @@ void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk)
* value that does not change across resize.
*/
};
+ r->blk_attach_notifier = (Notifier){
+ .notify = blk_attached,
+ };
+ r->blk_detach_notifier = (Notifier){
+ .notify = blk_detached,
+ };
r->ok = true;
ram_block_notifier_add(&r->ram_block_notifier);
+ blk_add_attach_notifier(blk, &r->blk_attach_notifier);
+ blk_add_detach_notifier(blk, &r->blk_detach_notifier);
}
void blk_ram_registrar_destroy(BlockRAMRegistrar *r)
{
if (r->ok) {
+ notifier_remove(&r->blk_detach_notifier);
+ notifier_remove(&r->blk_attach_notifier);
ram_block_notifier_remove(&r->ram_block_notifier);
+ r->ok = false;
}
}
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 5/5] tests/functional: add vdpa-blk blockdev-mirror test
2025-10-07 18:34 [PATCH v2 0/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
` (3 preceding siblings ...)
2025-10-07 18:34 ` [PATCH v2 4/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
@ 2025-10-07 18:34 ` Stefan Hajnoczi
2025-10-08 21:04 ` Eric Blake
4 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 18:34 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Eric Blake, qemu-block, Hanna Reitz, Stefan Hajnoczi
Add a test case that reproduces
https://issues.redhat.com/browse/RHEL-88175.
When the mirror blockjob completes, it replaces the original vdpa-blk
blockdev node with a new vdpa-blk blockdev. This will only work if the
BlockRAMRegistrar populates memory mappings (see the previous commit).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
tests/functional/x86_64/meson.build | 1 +
.../functional/x86_64/test_vdpa_blk_mirror.py | 118 ++++++++++++++++++
2 files changed, 119 insertions(+)
create mode 100755 tests/functional/x86_64/test_vdpa_blk_mirror.py
diff --git a/tests/functional/x86_64/meson.build b/tests/functional/x86_64/meson.build
index f78eec5e6c..dfe0e00190 100644
--- a/tests/functional/x86_64/meson.build
+++ b/tests/functional/x86_64/meson.build
@@ -33,6 +33,7 @@ tests_x86_64_system_thorough = [
'replay',
'reverse_debug',
'tuxrun',
+ 'vdpa_blk_mirror',
'vfio_user_client',
'virtio_balloon',
'virtio_gpu',
diff --git a/tests/functional/x86_64/test_vdpa_blk_mirror.py b/tests/functional/x86_64/test_vdpa_blk_mirror.py
new file mode 100755
index 0000000000..7d52836920
--- /dev/null
+++ b/tests/functional/x86_64/test_vdpa_blk_mirror.py
@@ -0,0 +1,118 @@
+#!/usr/bin/env python3
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright Red Hat, Inc.
+#
+# vdpa-blk mirror blockjob tests
+
+
+import glob
+import os
+import subprocess
+from qemu_test import LinuxKernelTest, Asset
+from qemu_test import exec_command_and_wait_for_pattern
+
+
+def run(cmd: str) -> None:
+ '''
+ Run a shell command without capturing stdout/stderr and raise
+ subprocess.CalledProcessError on failure.
+ '''
+ subprocess.check_call(cmd, shell=True,
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL)
+
+
+class VdpaBlk(LinuxKernelTest):
+
+ KERNEL_COMMAND_LINE = 'printk.time=0 console=ttyS0 rd.rescue'
+ ASSET_KERNEL = Asset(
+ ('https://archives.fedoraproject.org/pub/archive/fedora/linux/releases'
+ '/31/Server/x86_64/os/images/pxeboot/vmlinuz'),
+ 'd4738d03dbbe083ca610d0821d0a8f1488bebbdccef54ce33e3adb35fda00129')
+ ASSET_INITRD = Asset(
+ ('https://archives.fedoraproject.org/pub/archive/fedora/linux/releases'
+ '/31/Server/x86_64/os/images/pxeboot/initrd.img'),
+ '277cd6c7adf77c7e63d73bbb2cded8ef9e2d3a2f100000e92ff1f8396513cd8b')
+ VDPA_DEV_1 = f'vdpa-{os.getpid()}-1'
+ VDPA_DEV_2 = f'vdpa-{os.getpid()}-2'
+
+ def setUp(self) -> None:
+ def create_vdpa_dev(name):
+ '''
+ Create a new vdpasim_blk device and return its vhost_vdpa device
+ path.
+ '''
+ run(f'sudo -n vdpa dev add mgmtdev vdpasim_blk name {name}')
+ sysfs_vhost_vdpa_dev_dir = \
+ glob.glob(f'/sys/bus/vdpa/devices/{name}/vhost-vdpa-*')[0]
+ vhost_dev_basename = os.path.basename(sysfs_vhost_vdpa_dev_dir)
+ vhost_dev_path = f'/dev/{vhost_dev_basename}'
+ run(f'sudo -n chown {os.getuid()}:{os.getgid()} {vhost_dev_path}')
+ return vhost_dev_path
+
+ try:
+ run('sudo -n modprobe vhost_vdpa')
+ run('sudo -n modprobe vdpa_sim_blk')
+
+ self.vhost_dev_1_path = create_vdpa_dev(self.VDPA_DEV_1)
+ self.vhost_dev_2_path = create_vdpa_dev(self.VDPA_DEV_2)
+ except subprocess.CalledProcessError:
+ self.skipTest('Failed to set up vdpa_blk device')
+
+ super().setUp()
+
+ def tearDown(self) -> None:
+ super().tearDown()
+
+ try:
+ run(f'sudo -n vdpa dev del {self.VDPA_DEV_2}')
+ run(f'sudo -n vdpa dev del {self.VDPA_DEV_1}')
+ run('sudo -n modprobe --remove vdpa_sim_blk')
+ run('sudo -n modprobe --remove vhost_vdpa')
+ except subprocess.CalledProcessError:
+ pass # ignore failures
+
+ def test_mirror(self) -> None:
+ '''
+ Check that I/O works after a mirror blockjob pivots. See
+ https://issues.redhat.com/browse/RHEL-88175.
+ '''
+ kernel_path = self.ASSET_KERNEL.fetch()
+ initrd_path = self.ASSET_INITRD.fetch()
+
+ self.vm.add_args('-m', '1G')
+ self.vm.add_args('-object', 'memory-backend-memfd,id=mem,size=1G')
+ self.vm.add_args('-machine', 'pc,accel=kvm:tcg,memory-backend=mem')
+ self.vm.add_args('-append', self.KERNEL_COMMAND_LINE)
+ self.vm.add_args('-blockdev',
+ 'virtio-blk-vhost-vdpa,node-name=vdpa-blk-0,' +
+ f'path={self.vhost_dev_1_path},cache.direct=on')
+ self.vm.add_args('-device', 'virtio-blk-pci,drive=vdpa-blk-0')
+
+ self.launch_kernel(kernel_path, initrd_path,
+ wait_for='# ')
+
+ self.vm.cmd('blockdev-add',
+ driver='virtio-blk-vhost-vdpa',
+ node_name='vdpa-blk-1',
+ path=self.vhost_dev_2_path,
+ cache={'direct': True})
+ self.vm.cmd('blockdev-mirror',
+ device='vdpa-blk-0',
+ job_id='mirror0',
+ target='vdpa-blk-1',
+ sync='full',
+ target_is_zero=True)
+ self.vm.event_wait('BLOCK_JOB_READY')
+ self.vm.cmd('block-job-complete',
+ device='mirror0')
+
+ exec_command_and_wait_for_pattern(self,
+ 'dd if=/dev/vda of=/dev/null iflag=direct bs=4k count=1',
+ '4096 bytes (4.1 kB, 4.0 KiB) copied')
+
+
+if __name__ == '__main__':
+ LinuxKernelTest.main()
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 5/5] tests/functional: add vdpa-blk blockdev-mirror test
2025-10-07 18:34 ` [PATCH v2 5/5] tests/functional: add vdpa-blk blockdev-mirror test Stefan Hajnoczi
@ 2025-10-08 21:04 ` Eric Blake
0 siblings, 0 replies; 7+ messages in thread
From: Eric Blake @ 2025-10-08 21:04 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel, Kevin Wolf, qemu-block, Hanna Reitz
On Tue, Oct 07, 2025 at 02:34:47PM -0400, Stefan Hajnoczi wrote:
> Add a test case that reproduces
> https://issues.redhat.com/browse/RHEL-88175.
>
> When the mirror blockjob completes, it replaces the original vdpa-blk
> blockdev node with a new vdpa-blk blockdev. This will only work if the
> BlockRAMRegistrar populates memory mappings (see the previous commit).
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> tests/functional/x86_64/meson.build | 1 +
> .../functional/x86_64/test_vdpa_blk_mirror.py | 118 ++++++++++++++++++
> 2 files changed, 119 insertions(+)
> create mode 100755 tests/functional/x86_64/test_vdpa_blk_mirror.py
I tried running this test to see how it fared on my system. With the
entire series applied (and after I set up passwordless sudo, to run
instead of skip), the test passed. I then reverted the rest of the
series, and the test hung, instead of timing out or printing an error
message about a specific failure. Then I tried reapplying the rest of
the series, but now the test hangs because whatever state the system
was left in after the earlier failed test is preventing the re-run
from starting; and it appears to be uninterruptible (SIGINT is not
ending the hung test, and even ctrl-z is not letting me move the test
into a background process; I had to resort to kill -9 from another
terminal).
I'm less familiar with functional tests in general, but it might be
nice to figure out a way to quickly report failure when testing
without the rest of the series, rather than leaving the system in a
wedged state.
Of course, since CI will never be running the test without the rest of
the series in place, that is not a show-stopper for accepting this
series as-is. I'm not even sure if adding a timeout to the dd command
[1] would help in tearing down the vdpa_sim_blk device on a test
failure.
And it may not even be something that QEMU can do anything about - the
whole point of the rest of the series is so that vdpa still has memory
mappings after migration so it can complete I/O; without the rest of
the series, the test is correctly proving that the migration lost the
mappings and thus can't complete I/O, even if I have no idea how to
force the kernel to relinquish the device when we know that the I/O
won't ever happen.
So, I'm fine if you add:
Tested-by: Eric Blake <eblake@redhat.com>
even though I'm not comfortable with a Reviewed-by at this time.
> diff --git a/tests/functional/x86_64/test_vdpa_blk_mirror.py b/tests/functional/x86_64/test_vdpa_blk_mirror.py
> +class VdpaBlk(LinuxKernelTest):
> +
> + KERNEL_COMMAND_LINE = 'printk.time=0 console=ttyS0 rd.rescue'
> + ASSET_KERNEL = Asset(
> + ('https://archives.fedoraproject.org/pub/archive/fedora/linux/releases'
> + '/31/Server/x86_64/os/images/pxeboot/vmlinuz'),
> + 'd4738d03dbbe083ca610d0821d0a8f1488bebbdccef54ce33e3adb35fda00129')
> + ASSET_INITRD = Asset(
> + ('https://archives.fedoraproject.org/pub/archive/fedora/linux/releases'
> + '/31/Server/x86_64/os/images/pxeboot/initrd.img'),
> + '277cd6c7adf77c7e63d73bbb2cded8ef9e2d3a2f100000e92ff1f8396513cd8b')
> + VDPA_DEV_1 = f'vdpa-{os.getpid()}-1'
> + VDPA_DEV_2 = f'vdpa-{os.getpid()}-2'
> +
> + def setUp(self) -> None:
> + def create_vdpa_dev(name):
> + '''
> + Create a new vdpasim_blk device and return its vhost_vdpa device
> + path.
> + '''
> + run(f'sudo -n vdpa dev add mgmtdev vdpasim_blk name {name}')
> + sysfs_vhost_vdpa_dev_dir = \
> + glob.glob(f'/sys/bus/vdpa/devices/{name}/vhost-vdpa-*')[0]
> + vhost_dev_basename = os.path.basename(sysfs_vhost_vdpa_dev_dir)
> + vhost_dev_path = f'/dev/{vhost_dev_basename}'
> + run(f'sudo -n chown {os.getuid()}:{os.getgid()} {vhost_dev_path}')
> + return vhost_dev_path
> +
> + try:
> + run('sudo -n modprobe vhost_vdpa')
> + run('sudo -n modprobe vdpa_sim_blk')
Once I did kill -9 on the hung test, manually trying this line fails
with:
$ sudo -n modprobe vdpa_sim_blk
modprobe: ERROR: could not insert 'vdpa_sim_blk': Device or resource busy
as my evidence that something really did get wedged in trying to clean
up after the hang. Even
$ sudo -n vdpa dev show
is hanging with no output, no response to ctrl-c or -z, and requires
kill -9. Running it under strace ends at:
...
socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC) = 3
setsockopt(3, SOL_NETLINK, NETLINK_CAP_ACK, [1], 4) = 0
setsockopt(3, SOL_NETLINK, NETLINK_EXT_ACK, [1], 4) = 0
bind(3, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 0
getsockname(3, {sa_family=AF_NETLINK, nl_pid=3694229, nl_groups=00000000}, [12]) = 0
> +
> + self.vhost_dev_1_path = create_vdpa_dev(self.VDPA_DEV_1)
> + self.vhost_dev_2_path = create_vdpa_dev(self.VDPA_DEV_2)
> + except subprocess.CalledProcessError:
> + self.skipTest('Failed to set up vdpa_blk device')
> +
> + super().setUp()
> +
> + def tearDown(self) -> None:
> + super().tearDown()
> +
> + try:
> + run(f'sudo -n vdpa dev del {self.VDPA_DEV_2}')
> + run(f'sudo -n vdpa dev del {self.VDPA_DEV_1}')
> + run('sudo -n modprobe --remove vdpa_sim_blk')
> + run('sudo -n modprobe --remove vhost_vdpa')
so these cleanups are not happening because of whatever else already
wedged.
> + except subprocess.CalledProcessError:
> + pass # ignore failures
> +
> + def test_mirror(self) -> None:
> + '''
> + Check that I/O works after a mirror blockjob pivots. See
> + https://issues.redhat.com/browse/RHEL-88175.
> + '''
> + kernel_path = self.ASSET_KERNEL.fetch()
> + initrd_path = self.ASSET_INITRD.fetch()
> +
> + self.vm.add_args('-m', '1G')
> + self.vm.add_args('-object', 'memory-backend-memfd,id=mem,size=1G')
> + self.vm.add_args('-machine', 'pc,accel=kvm:tcg,memory-backend=mem')
> + self.vm.add_args('-append', self.KERNEL_COMMAND_LINE)
> + self.vm.add_args('-blockdev',
> + 'virtio-blk-vhost-vdpa,node-name=vdpa-blk-0,' +
> + f'path={self.vhost_dev_1_path},cache.direct=on')
> + self.vm.add_args('-device', 'virtio-blk-pci,drive=vdpa-blk-0')
> +
> + self.launch_kernel(kernel_path, initrd_path,
> + wait_for='# ')
> +
> + self.vm.cmd('blockdev-add',
> + driver='virtio-blk-vhost-vdpa',
> + node_name='vdpa-blk-1',
> + path=self.vhost_dev_2_path,
> + cache={'direct': True})
> + self.vm.cmd('blockdev-mirror',
> + device='vdpa-blk-0',
> + job_id='mirror0',
> + target='vdpa-blk-1',
> + sync='full',
> + target_is_zero=True)
> + self.vm.event_wait('BLOCK_JOB_READY')
> + self.vm.cmd('block-job-complete',
> + device='mirror0')
> +
> + exec_command_and_wait_for_pattern(self,
> + 'dd if=/dev/vda of=/dev/null iflag=direct bs=4k count=1',
[1] This might be the spot where adding a timeout command would help
the guest relinquish control of the block device, but that is still
not obvious to me whether it would also be enough for the test to fail
cleanly and allow a clean restart.
> + '4096 bytes (4.1 kB, 4.0 KiB) copied')
> +
> +
> +if __name__ == '__main__':
> + LinuxKernelTest.main()
> --
> 2.51.0
>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-10-08 21:06 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-07 18:34 [PATCH v2 0/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 1/5] block: add BdrvChildClass->propagate_attach/detach() callbacks Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 2/5] block: add blk_add_attach/detach_notifier() APIs Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 3/5] block: rename RAMBlockRegistrar->notifier field Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 4/5] block: update inserted/removed nodes from BlockRAMRegistrar Stefan Hajnoczi
2025-10-07 18:34 ` [PATCH v2 5/5] tests/functional: add vdpa-blk blockdev-mirror test Stefan Hajnoczi
2025-10-08 21:04 ` Eric Blake
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).