qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PULL 0/8] Migration patches for 2025-03-07
@ 2025-03-07 18:15 Fabiano Rosas
  2025-03-07 18:15 ` [PULL 1/8] migration: Fix UAF for incoming migration on MigrationState Fabiano Rosas
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

The following changes since commit 98c7362b1efe651327385a25874a73e008c6549e:

  Merge tag 'accel-cpus-20250306' of https://github.com/philmd/qemu into staging (2025-03-07 07:39:49 +0800)

are available in the Git repository at:

  https://gitlab.com/farosas/qemu.git tags/migration-20250307-pull-request

for you to fetch changes up to 5aee8eaea8ec1d5d364f529bf044f4129286b9f1:

  migration: Add qtest for migration over RDMA (2025-03-07 12:04:58 -0300)

Patch 8/8 triggers a bogus checkpatch error that doesn't apply to
qtest, please ignore:

 ERROR: Error messages should not contain newlines

----------------------------------------------------------------
Migration pull request

- Fix use-after-free in incoming migration
- Improve cpr migration blocker for volatile ram
- Fix RDMA migration
- RDMA migration test + helper script to setup an rdma link

----------------------------------------------------------------

Li Zhijian (6):
  migration: Prioritize RDMA in ram_save_target_page()
  migration: check RDMA and capabilities are compatible on both sides
  migration: disable RDMA + postcopy-ram
  migration/rdma: Remove redundant migration_in_postcopy checks
  migration: Unfold control_save_page()
  migration: Add qtest for migration over RDMA

Peter Xu (1):
  migration: Fix UAF for incoming migration on MigrationState

Steve Sistare (1):
  migration: ram block cpr blockers

 MAINTAINERS                           |  1 +
 include/exec/memory.h                 |  3 ++
 include/exec/ramblock.h               |  1 +
 migration/migration.c                 | 70 ++++++++++++++++++++++-----
 migration/options.c                   | 25 ++++++++++
 migration/options.h                   |  1 +
 migration/ram.c                       | 41 +++++-----------
 migration/rdma.c                      | 11 ++---
 migration/rdma.h                      |  3 +-
 migration/savevm.c                    |  2 +
 scripts/rdma-migration-helper.sh      | 48 ++++++++++++++++++
 system/physmem.c                      | 66 +++++++++++++++++++++++++
 tests/qtest/migration/precopy-tests.c | 69 ++++++++++++++++++++++++++
 13 files changed, 290 insertions(+), 51 deletions(-)
 create mode 100755 scripts/rdma-migration-helper.sh

-- 
2.35.3



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PULL 1/8] migration: Fix UAF for incoming migration on MigrationState
  2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
@ 2025-03-07 18:15 ` Fabiano Rosas
  2025-03-07 18:15 ` [PULL 2/8] migration: ram block cpr blockers Fabiano Rosas
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Yan Fu

From: Peter Xu <peterx@redhat.com>

On the incoming migration side, QEMU uses a coroutine to load all the VM
states.  Inside, it may reference MigrationState on global states like
migration capabilities, parameters, error state, shared mutexes and more.

However there's nothing yet to make sure MigrationState won't get
destroyed (e.g. after migration_shutdown()).  Meanwhile there's also no API
available to remove the incoming coroutine in migration_shutdown(),
avoiding it to access the freed elements.

There's a bug report showing this can happen and crash dest QEMU when
migration is cancelled on source.

When it happens, the dest main thread is trying to cleanup everything:

  #0  qemu_aio_coroutine_enter
  #1  aio_dispatch_handler
  #2  aio_poll
  #3  monitor_cleanup
  #4  qemu_cleanup
  #5  qemu_default_main

Then it found the migration incoming coroutine, schedule it (even after
migration_shutdown()), causing crash:

  #0  __pthread_kill_implementation
  #1  __pthread_kill_internal
  #2  __GI_raise
  #3  __GI_abort
  #4  __assert_fail_base
  #5  __assert_fail
  #6  qemu_mutex_lock_impl
  #7  qemu_lockable_mutex_lock
  #8  qemu_lockable_lock
  #9  qemu_lockable_auto_lock
  #10 migrate_set_error
  #11 process_incoming_migration_co
  #12 coroutine_trampoline

To fix it, take a refcount after an incoming setup is properly done when
qmp_migrate_incoming() succeeded the 1st time.  As it's during a QMP
handler which needs BQL, it means the main loop is still alive (without
going into cleanups, which also needs BQL).

Releasing the refcount now only until the incoming migration coroutine
finished or failed.  Hence the refcount is valid for both (1) setup phase
of incoming ports, mostly IO watches (e.g. qio_channel_add_watch_full()),
and (2) the incoming coroutine itself (process_incoming_migration_co()).

Note that we can't unref in migration_incoming_state_destroy(), because
both qmp_xen_load_devices_state() and load_snapshot() will use it without
an incoming migration.  Those hold BQL so they're not prone to this issue.

PS: I suspect nobody uses Xen's command at all, as it didn't register yank,
hence AFAIU the command should crash on master when trying to unregister
yank in migration_incoming_state_destroy()..  but that's another story.

Also note that in some incoming failure cases we may not always unref the
MigrationState refcount, which is a trade-off to keep things simple.  We
could make it accurate, but it can be an overkill.  Some examples:

  - Unlike most of the rest protocols, socket_start_incoming_migration()
    may create net listener after incoming port setup sucessfully.
    It means we can't unref in migration_channel_process_incoming() as a
    generic path because socket protocol might keep using MigrationState.

  - For either socket or file, multiple IO watches might be created, it
    means logically each IO watch needs to take one refcount for
    MigrationState so as to be 100% accurate on ownership of refcount taken.

In general, we at least need per-protocol handling to make it accurate,
which can be an overkill if we know incoming failed after all.  Add a short
comment to explain that when taking the refcount in qmp_migrate_incoming().

Bugzilla: https://issues.redhat.com/browse/RHEL-69775
Tested-by: Yan Fu <yafu@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250220132459.512610-1-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 40 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 1833cfe358..d46e776e24 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -116,6 +116,27 @@ static void migration_downtime_start(MigrationState *s)
     s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 }
 
+/*
+ * This is unfortunate: incoming migration actually needs the outgoing
+ * migration state (MigrationState) to be there too, e.g. to query
+ * capabilities, parameters, using locks, setup errors, etc.
+ *
+ * NOTE: when calling this, making sure current_migration exists and not
+ * been freed yet!  Otherwise trying to access the refcount is already
+ * an use-after-free itself..
+ *
+ * TODO: Move shared part of incoming / outgoing out into separate object.
+ * Then this is not needed.
+ */
+static void migrate_incoming_ref_outgoing_state(void)
+{
+    object_ref(migrate_get_current());
+}
+static void migrate_incoming_unref_outgoing_state(void)
+{
+    object_unref(migrate_get_current());
+}
+
 static void migration_downtime_end(MigrationState *s)
 {
     int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
@@ -863,7 +884,7 @@ process_incoming_migration_co(void *opaque)
              * postcopy thread.
              */
             trace_process_incoming_migration_co_postcopy_end_main();
-            return;
+            goto out;
         }
         /* Else if something went wrong then just fall out of the normal exit */
     }
@@ -879,7 +900,8 @@ process_incoming_migration_co(void *opaque)
     }
 
     migration_bh_schedule(process_incoming_migration_bh, mis);
-    return;
+    goto out;
+
 fail:
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_FAILED);
@@ -896,6 +918,9 @@ fail:
 
         exit(EXIT_FAILURE);
     }
+out:
+    /* Pairs with the refcount taken in qmp_migrate_incoming() */
+    migrate_incoming_unref_outgoing_state();
 }
 
 /**
@@ -1901,6 +1926,17 @@ void qmp_migrate_incoming(const char *uri, bool has_channels,
         return;
     }
 
+    /*
+     * Making sure MigrationState is available until incoming migration
+     * completes.
+     *
+     * NOTE: QEMU _might_ leak this refcount in some failure paths, but
+     * that's OK.  This is the minimum change we need to at least making
+     * sure success case is clean on the refcount.  We can try harder to
+     * make it accurate for any kind of failures, but it might be an
+     * overkill and doesn't bring us much benefit.
+     */
+    migrate_incoming_ref_outgoing_state();
     once = false;
 }
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PULL 2/8] migration: ram block cpr blockers
  2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
  2025-03-07 18:15 ` [PULL 1/8] migration: Fix UAF for incoming migration on MigrationState Fabiano Rosas
@ 2025-03-07 18:15 ` Fabiano Rosas
  2025-03-26 18:46   ` Tom Lendacky
  2025-03-07 18:15 ` [PULL 3/8] migration: Prioritize RDMA in ram_save_target_page() Fabiano Rosas
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Steve Sistare, David Hildenbrand

From: Steve Sistare <steven.sistare@oracle.com>

Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
in the migration stream file and recreate them later, because the physical
memory for the blocks is pinned and registered for vfio.  Add a blocker
for volatile ram blocks.

Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd may be
sufficient for CPR, but it has not been tested yet.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 include/exec/memory.h   |  3 ++
 include/exec/ramblock.h |  1 +
 migration/savevm.c      |  2 ++
 system/physmem.c        | 66 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 72 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 78c4e0aec8..d09af58c97 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -3203,6 +3203,9 @@ bool ram_block_discard_is_disabled(void);
  */
 bool ram_block_discard_is_required(void);
 
+void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp);
+void ram_block_del_cpr_blocker(RAMBlock *rb);
+
 #endif
 
 #endif
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 0babd105c0..64484cd821 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -39,6 +39,7 @@ struct RAMBlock {
     /* RCU-enabled, writes protected by the ramlist lock */
     QLIST_ENTRY(RAMBlock) next;
     QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
+    Error *cpr_blocker;
     int fd;
     uint64_t fd_offset;
     int guest_memfd;
diff --git a/migration/savevm.c b/migration/savevm.c
index 5c4fdfd95e..ce158c3512 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3514,12 +3514,14 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
     qemu_ram_set_idstr(mr->ram_block,
                        memory_region_name(mr), dev);
     qemu_ram_set_migratable(mr->ram_block);
+    ram_block_add_cpr_blocker(mr->ram_block, &error_fatal);
 }
 
 void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
 {
     qemu_ram_unset_idstr(mr->ram_block);
     qemu_ram_unset_migratable(mr->ram_block);
+    ram_block_del_cpr_blocker(mr->ram_block);
 }
 
 void vmstate_register_ram_global(MemoryRegion *mr)
diff --git a/system/physmem.c b/system/physmem.c
index 8c1736f84e..445981a1b4 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -70,7 +70,10 @@
 
 #include "qemu/pmem.h"
 
+#include "qapi/qapi-types-migration.h"
+#include "migration/blocker.h"
 #include "migration/cpr.h"
+#include "migration/options.h"
 #include "migration/vmstate.h"
 
 #include "qemu/range.h"
@@ -1903,6 +1906,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
             qemu_mutex_unlock_ramlist();
             goto out_free;
         }
+
+        error_setg(&new_block->cpr_blocker,
+                   "Memory region %s uses guest_memfd, "
+                   "which is not supported with CPR.",
+                   memory_region_name(new_block->mr));
+        migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
+                                  MIG_MODE_CPR_TRANSFER,
+                                  -1);
     }
 
     ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
@@ -4094,3 +4105,58 @@ bool ram_block_discard_is_required(void)
     return qatomic_read(&ram_block_discard_required_cnt) ||
            qatomic_read(&ram_block_coordinated_discard_required_cnt);
 }
+
+/*
+ * Return true if ram is compatible with CPR.  Do not exclude rom,
+ * because the rom file could change in new QEMU.
+ */
+static bool ram_is_cpr_compatible(RAMBlock *rb)
+{
+    MemoryRegion *mr = rb->mr;
+
+    if (!mr || !memory_region_is_ram(mr)) {
+        return true;
+    }
+
+    /* Ram device is remapped in new QEMU */
+    if (memory_region_is_ram_device(mr)) {
+        return true;
+    }
+
+    /*
+     * A file descriptor is passed to new QEMU and remapped, or its backing
+     * file is reopened and mapped.  It must be shared to avoid COW.
+     */
+    if (rb->fd >= 0 && qemu_ram_is_shared(rb)) {
+        return true;
+    }
+
+    return false;
+}
+
+/*
+ * Add a blocker for each volatile ram block.  This function should only be
+ * called after we know that the block is migratable.  Non-migratable blocks
+ * are either re-created in new QEMU, or are handled specially, or are covered
+ * by a device-level CPR blocker.
+ */
+void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
+{
+    assert(qemu_ram_is_migratable(rb));
+
+    if (ram_is_cpr_compatible(rb)) {
+        return;
+    }
+
+    error_setg(&rb->cpr_blocker,
+               "Memory region %s is not compatible with CPR. share=on is "
+               "required for memory-backend objects, and aux-ram-share=on is "
+               "required.", memory_region_name(rb->mr));
+    migrate_add_blocker_modes(&rb->cpr_blocker, errp, MIG_MODE_CPR_TRANSFER,
+                              -1);
+}
+
+void ram_block_del_cpr_blocker(RAMBlock *rb)
+{
+    migrate_del_blocker(&rb->cpr_blocker);
+}
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PULL 3/8] migration: Prioritize RDMA in ram_save_target_page()
  2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
  2025-03-07 18:15 ` [PULL 1/8] migration: Fix UAF for incoming migration on MigrationState Fabiano Rosas
  2025-03-07 18:15 ` [PULL 2/8] migration: ram block cpr blockers Fabiano Rosas
@ 2025-03-07 18:15 ` Fabiano Rosas
  2025-03-07 18:15 ` [PULL 4/8] migration: check RDMA and capabilities are compatible on both sides Fabiano Rosas
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Li Zhijian

From: Li Zhijian <lizhijian@fujitsu.com>

Address an error in RDMA-based migration by ensuring RDMA is prioritized
when saving pages in `ram_save_target_page()`.

Previously, the RDMA protocol's page-saving step was placed after other
protocols due to a refactoring in commit bc38dc2f5f3. This led to migration
failures characterized by unknown control messages and state loading errors
destination:
(qemu) qemu-system-x86_64: Unknown control message QEMU FILE
qemu-system-x86_64: error while loading state section id 1(ram)
qemu-system-x86_64: load of migration failed: Operation not permitted
source:
(qemu) qemu-system-x86_64: RDMA is in an error state waiting migration to abort!
qemu-system-x86_64: failed to save SaveStateEntry with id(name): 1(ram): -1
qemu-system-x86_64: rdma migration: recv polling control error!
qemu-system-x86_64: warning: Early error. Sending error.
qemu-system-x86_64: warning: rdma migration: send polling control error

RDMA migration implemented its own protocol/method to send pages to
destination side, hand over to RDMA first to prevent pages being saved by
other protocol.

Fixes: bc38dc2f5f3 ("migration: refactor ram_save_target_page functions")
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-2-lizhijian@fujitsu.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/ram.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 589b6505eb..424df6d9f1 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1964,6 +1964,11 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
     ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
     int res;
 
+    /* Hand over to RDMA first */
+    if (control_save_page(pss, offset, &res)) {
+        return res;
+    }
+
     if (!migrate_multifd()
         || migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) {
         if (save_zero_page(rs, pss, offset)) {
@@ -1976,10 +1981,6 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
         return ram_save_multifd_page(block, offset);
     }
 
-    if (control_save_page(pss, offset, &res)) {
-        return res;
-    }
-
     return ram_save_page(rs, pss);
 }
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PULL 4/8] migration: check RDMA and capabilities are compatible on both sides
  2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
                   ` (2 preceding siblings ...)
  2025-03-07 18:15 ` [PULL 3/8] migration: Prioritize RDMA in ram_save_target_page() Fabiano Rosas
@ 2025-03-07 18:15 ` Fabiano Rosas
  2025-03-07 18:15 ` [PULL 5/8] migration: disable RDMA + postcopy-ram Fabiano Rosas
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Li Zhijian

From: Li Zhijian <lizhijian@fujitsu.com>

Depending on the order of starting RDMA and setting capability,
they can be categorized into the following scenarios:
Source:
 S1: [set capabilities] -> [Start RDMA outgoing]
Destination:
 D1: [set capabilities] -> [Start RDMA incoming]
 D2: [Start RDMA incoming] -> [set capabilities]

Previously, compatibility between RDMA and capabilities was verified only
in scenario D1, potentially causing migration failures in other situations.

For scenarios S1 and D1, we can seamlessly incorporate
migration_transport_compatible() to address compatibility between
channels and capabilities vs transport.

For scenario D2, ensure compatibility within migrate_caps_check().

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-3-lizhijian@fujitsu.com>
[fixup comment alignment]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 30 ++++++++++++++++++++----------
 migration/options.c   | 21 +++++++++++++++++++++
 migration/options.h   |  1 +
 3 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index d46e776e24..22dd966970 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -259,6 +259,24 @@ migration_channels_and_transport_compatible(MigrationAddress *addr,
     return true;
 }
 
+static bool
+migration_capabilities_and_transport_compatible(MigrationAddress *addr,
+                                                Error **errp)
+{
+    if (addr->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
+        return migrate_rdma_caps_check(migrate_get_current()->capabilities,
+                                       errp);
+    }
+
+    return true;
+}
+
+static bool migration_transport_compatible(MigrationAddress *addr, Error **errp)
+{
+    return migration_channels_and_transport_compatible(addr, errp) &&
+           migration_capabilities_and_transport_compatible(addr, errp);
+}
+
 static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp)
 {
     uintptr_t a = (uintptr_t) ap, b = (uintptr_t) bp;
@@ -750,7 +768,7 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
     }
 
     /* transport mechanism not suitable for migration? */
-    if (!migration_channels_and_transport_compatible(addr, errp)) {
+    if (!migration_transport_compatible(addr, errp)) {
         return;
     }
 
@@ -769,14 +787,6 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
         }
 #ifdef CONFIG_RDMA
     } else if (addr->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
-        if (migrate_xbzrle()) {
-            error_setg(errp, "RDMA and XBZRLE can't be used together");
-            return;
-        }
-        if (migrate_multifd()) {
-            error_setg(errp, "RDMA and multifd can't be used together");
-            return;
-        }
         rdma_start_incoming_migration(&addr->u.rdma, errp);
 #endif
     } else if (addr->transport == MIGRATION_ADDRESS_TYPE_EXEC) {
@@ -2208,7 +2218,7 @@ void qmp_migrate(const char *uri, bool has_channels,
     }
 
     /* transport mechanism not suitable for migration? */
-    if (!migration_channels_and_transport_compatible(addr, errp)) {
+    if (!migration_transport_compatible(addr, errp)) {
         return;
     }
 
diff --git a/migration/options.c b/migration/options.c
index b0ac2ea408..1f3602839d 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -448,6 +448,20 @@ static bool migrate_incoming_started(void)
     return !!migration_incoming_get_current()->transport_data;
 }
 
+bool migrate_rdma_caps_check(bool *caps, Error **errp)
+{
+    if (caps[MIGRATION_CAPABILITY_XBZRLE]) {
+        error_setg(errp, "RDMA and XBZRLE can't be used together");
+        return false;
+    }
+    if (caps[MIGRATION_CAPABILITY_MULTIFD]) {
+        error_setg(errp, "RDMA and multifd can't be used together");
+        return false;
+    }
+
+    return true;
+}
+
 /**
  * @migration_caps_check - check capability compatibility
  *
@@ -611,6 +625,13 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
         }
     }
 
+    /*
+     * On destination side, check the cases that capability is being set
+     * after incoming thread has started.
+     */
+    if (migrate_rdma() && !migrate_rdma_caps_check(new_caps, errp)) {
+        return false;
+    }
     return true;
 }
 
diff --git a/migration/options.h b/migration/options.h
index 762be4e641..82d839709e 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -57,6 +57,7 @@ bool migrate_tls(void);
 
 /* capabilities helpers */
 
+bool migrate_rdma_caps_check(bool *caps, Error **errp);
 bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp);
 
 /* parameters */
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PULL 5/8] migration: disable RDMA + postcopy-ram
  2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
                   ` (3 preceding siblings ...)
  2025-03-07 18:15 ` [PULL 4/8] migration: check RDMA and capabilities are compatible on both sides Fabiano Rosas
@ 2025-03-07 18:15 ` Fabiano Rosas
  2025-03-07 18:15 ` [PULL 6/8] migration/rdma: Remove redundant migration_in_postcopy checks Fabiano Rosas
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Li Zhijian

From: Li Zhijian <lizhijian@fujitsu.com>

It's believed that RDMA + postcopy-ram has been broken for a while.
Rather than spending time re-enabling it, let's simply disable it as a
trade-off.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-4-lizhijian@fujitsu.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/options.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/migration/options.c b/migration/options.c
index 1f3602839d..4ba8fcb7dc 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -458,6 +458,10 @@ bool migrate_rdma_caps_check(bool *caps, Error **errp)
         error_setg(errp, "RDMA and multifd can't be used together");
         return false;
     }
+    if (caps[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
+        error_setg(errp, "RDMA and postcopy-ram can't be used together");
+        return false;
+    }
 
     return true;
 }
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PULL 6/8] migration/rdma: Remove redundant migration_in_postcopy checks
  2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
                   ` (4 preceding siblings ...)
  2025-03-07 18:15 ` [PULL 5/8] migration: disable RDMA + postcopy-ram Fabiano Rosas
@ 2025-03-07 18:15 ` Fabiano Rosas
  2025-03-07 18:15 ` [PULL 7/8] migration: Unfold control_save_page() Fabiano Rosas
  2025-03-07 18:15 ` [PULL 8/8] migration: Add qtest for migration over RDMA Fabiano Rosas
  7 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Li Zhijian

From: Li Zhijian <lizhijian@fujitsu.com>

Since we have disabled RDMA + postcopy, it's safe to remove
the migration_in_postcopy() that follows the migrate_rdma().

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-5-lizhijian@fujitsu.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/rdma.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 76fb034923..e5b4ac599b 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3284,7 +3284,7 @@ err:
 int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                            ram_addr_t offset, size_t size)
 {
-    if (!migrate_rdma() || migration_in_postcopy()) {
+    if (!migrate_rdma()) {
         return RAM_SAVE_CONTROL_NOT_SUPP;
     }
 
@@ -3829,7 +3829,7 @@ int rdma_block_notification_handle(QEMUFile *f, const char *name)
 
 int rdma_registration_start(QEMUFile *f, uint64_t flags)
 {
-    if (!migrate_rdma() || migration_in_postcopy()) {
+    if (!migrate_rdma()) {
         return 0;
     }
 
@@ -3861,7 +3861,7 @@ int rdma_registration_stop(QEMUFile *f, uint64_t flags)
     RDMAControlHeader head = { .len = 0, .repeat = 1 };
     int ret;
 
-    if (!migrate_rdma() || migration_in_postcopy()) {
+    if (!migrate_rdma()) {
         return 0;
     }
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PULL 7/8] migration: Unfold control_save_page()
  2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
                   ` (5 preceding siblings ...)
  2025-03-07 18:15 ` [PULL 6/8] migration/rdma: Remove redundant migration_in_postcopy checks Fabiano Rosas
@ 2025-03-07 18:15 ` Fabiano Rosas
  2025-03-07 18:15 ` [PULL 8/8] migration: Add qtest for migration over RDMA Fabiano Rosas
  7 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Li Zhijian

From: Li Zhijian <lizhijian@fujitsu.com>

control_save_page() is for RDMA only, unfold it to make the code more
clear.
In addition:
 - Similar to other branches style in ram_save_target_page(), involve RDMA
   only if the condition 'migrate_rdma()' is true.
 - Further simplify the code by removing the RAM_SAVE_CONTROL_NOT_SUPP.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-6-lizhijian@fujitsu.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/ram.c  | 34 +++++++---------------------------
 migration/rdma.c |  7 ++-----
 migration/rdma.h |  3 +--
 3 files changed, 10 insertions(+), 34 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 424df6d9f1..c363034c88 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1143,32 +1143,6 @@ static int save_zero_page(RAMState *rs, PageSearchStatus *pss,
     return len;
 }
 
-/*
- * @pages: the number of pages written by the control path,
- *        < 0 - error
- *        > 0 - number of pages written
- *
- * Return true if the pages has been saved, otherwise false is returned.
- */
-static bool control_save_page(PageSearchStatus *pss,
-                              ram_addr_t offset, int *pages)
-{
-    int ret;
-
-    ret = rdma_control_save_page(pss->pss_channel, pss->block->offset, offset,
-                                 TARGET_PAGE_SIZE);
-    if (ret == RAM_SAVE_CONTROL_NOT_SUPP) {
-        return false;
-    }
-
-    if (ret == RAM_SAVE_CONTROL_DELAYED) {
-        *pages = 1;
-        return true;
-    }
-    *pages = ret;
-    return true;
-}
-
 /*
  * directly send the page to the stream
  *
@@ -1965,7 +1939,13 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
     int res;
 
     /* Hand over to RDMA first */
-    if (control_save_page(pss, offset, &res)) {
+    if (migrate_rdma()) {
+        res = rdma_control_save_page(pss->pss_channel, pss->block->offset,
+                                     offset, TARGET_PAGE_SIZE);
+
+        if (res == RAM_SAVE_CONTROL_DELAYED) {
+            res = 1;
+        }
         return res;
     }
 
diff --git a/migration/rdma.c b/migration/rdma.c
index e5b4ac599b..08eb924ffa 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3284,14 +3284,11 @@ err:
 int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                            ram_addr_t offset, size_t size)
 {
-    if (!migrate_rdma()) {
-        return RAM_SAVE_CONTROL_NOT_SUPP;
-    }
+    assert(migrate_rdma());
 
     int ret = qemu_rdma_save_page(f, block_offset, offset, size);
 
-    if (ret != RAM_SAVE_CONTROL_DELAYED &&
-        ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+    if (ret != RAM_SAVE_CONTROL_DELAYED) {
         if (ret < 0) {
             qemu_file_set_error(f, ret);
         }
diff --git a/migration/rdma.h b/migration/rdma.h
index f55f28bbed..8eeb0117b9 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -33,7 +33,6 @@ void rdma_start_incoming_migration(InetSocketAddress *host_port, Error **errp);
 #define RAM_CONTROL_ROUND     1
 #define RAM_CONTROL_FINISH    3
 
-#define RAM_SAVE_CONTROL_NOT_SUPP -1000
 #define RAM_SAVE_CONTROL_DELAYED  -2000
 
 #ifdef CONFIG_RDMA
@@ -56,7 +55,7 @@ static inline
 int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                            ram_addr_t offset, size_t size)
 {
-    return RAM_SAVE_CONTROL_NOT_SUPP;
+    g_assert_not_reached();
 }
 #endif
 #endif
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PULL 8/8] migration: Add qtest for migration over RDMA
  2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
                   ` (6 preceding siblings ...)
  2025-03-07 18:15 ` [PULL 7/8] migration: Unfold control_save_page() Fabiano Rosas
@ 2025-03-07 18:15 ` Fabiano Rosas
  2025-03-08  6:00   ` Philippe Mathieu-Daudé
  7 siblings, 1 reply; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-07 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Li Zhijian

From: Li Zhijian <lizhijian@fujitsu.com>

This qtest requires there is a RDMA(RoCE) link in the host.
In order to make the test work smoothly, introduce a
scripts/rdma-migration-helper.sh to
- setup a new Soft-RoCE(aka RXE) if it's root
- detect existing RoCE link

Test will be skipped if there is no available RoCE link.
 # Start of rdma tests
 # Running /x86_64/migration/precopy/rdma/plain
 Command 'rdma' is not available, please install it first.
 # To enable the test:
 # (1) Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
 # or
 # (2) Run the test with root privilege
 #
 ok 1 /x86_64/migration/precopy/rdma/plain # SKIP No rdma link available
 # End of rdma tests

Note: Remove the newly added RXE link by executing 'modprobe -r rdma_rxe'
or by specifying 'clean' within this script.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-7-lizhijian@fujitsu.com>
[reformated the message to be under 90 characters]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 MAINTAINERS                           |  1 +
 scripts/rdma-migration-helper.sh      | 48 +++++++++++++++++++
 tests/qtest/migration/precopy-tests.c | 69 +++++++++++++++++++++++++++
 3 files changed, 118 insertions(+)
 create mode 100755 scripts/rdma-migration-helper.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index 5df6020ed5..56e85adcfb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3517,6 +3517,7 @@ R: Li Zhijian <lizhijian@fujitsu.com>
 R: Peter Xu <peterx@redhat.com>
 S: Odd Fixes
 F: migration/rdma*
+F: scripts/rdma-migration-helper.sh
 
 Migration dirty limit and dirty page rate
 M: Hyman Huang <yong.huang@smartx.com>
diff --git a/scripts/rdma-migration-helper.sh b/scripts/rdma-migration-helper.sh
new file mode 100755
index 0000000000..08e29a52eb
--- /dev/null
+++ b/scripts/rdma-migration-helper.sh
@@ -0,0 +1,48 @@
+#!/bin/bash
+
+# Copied from blktests
+get_ipv4_addr()
+{
+    ip -4 -o addr show dev "$1" |
+        sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p' |
+        tr -d '\n'
+}
+
+has_soft_rdma()
+{
+    rdma link | grep -q " netdev $1[[:blank:]]*\$"
+}
+
+rdma_rxe_setup_detect()
+{
+    (
+        cd /sys/class/net &&
+            for i in *; do
+                [ -e "$i" ] || continue
+                [ "$i" = "lo" ] && continue
+                [ "$(<"$i/addr_len")" = 6 ] || continue
+                [ "$(<"$i/carrier")" = 1 ] || continue
+
+                has_soft_rdma "$i" && break
+                [ "$operation" = "setup" ] &&
+                    rdma link add "${i}_rxe" type rxe netdev "$i" && break
+            done
+        has_soft_rdma "$i" || return
+        get_ipv4_addr "$i"
+    )
+}
+
+operation=${1:-setup}
+
+command -v rdma >/dev/null || {
+    echo "Command 'rdma' is not available, please install it first." >&2
+    exit 1
+}
+
+if [ "$operation" == "setup" ] || [ "$operation" == "detect" ]; then
+    rdma_rxe_setup_detect
+elif [ "$operation" == "clean" ]; then
+    modprobe -r rdma_rxe
+else
+    echo "Usage: $0 [setup | detect | clean]"
+fi
diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
index ba273d10b9..f1fe34020d 100644
--- a/tests/qtest/migration/precopy-tests.c
+++ b/tests/qtest/migration/precopy-tests.c
@@ -99,6 +99,71 @@ static void test_precopy_unix_dirty_ring(void)
     test_precopy_common(&args);
 }
 
+#ifdef CONFIG_RDMA
+
+#define RDMA_MIGRATION_HELPER "scripts/rdma-migration-helper.sh"
+static int new_rdma_link(char *buffer, bool verbose)
+{
+    const char *argument = (geteuid() == 0) ? "setup" : "detect";
+    char cmd[1024];
+
+    snprintf(cmd, sizeof(cmd), "%s %s %s", RDMA_MIGRATION_HELPER, argument,
+             verbose ? "" : "2>/dev/null");
+
+    FILE *pipe = popen(cmd, "r");
+    if (pipe == NULL) {
+        perror("Failed to run script");
+        return -1;
+    }
+
+    int idx = 0;
+    while (fgets(buffer + idx, 128 - idx, pipe) != NULL) {
+        idx += strlen(buffer);
+    }
+
+    int status = pclose(pipe);
+    if (status == -1) {
+        perror("Error reported by pclose()");
+        return -1;
+    } else if (WIFEXITED(status)) {
+        return WEXITSTATUS(status);
+    }
+
+    return -1;
+}
+
+static void test_precopy_rdma_plain(void)
+{
+    char buffer[128] = {};
+    bool verbose = g_getenv("QTEST_LOG");
+
+    if (new_rdma_link(buffer, verbose)) {
+        g_test_skip("No rdma link available");
+        if (verbose) {
+            g_test_message(
+                "To enable the test:\n"
+                "(1) Run \'" RDMA_MIGRATION_HELPER
+                " setup\' with root and rerun the test\n"
+                "or\n(2) Run the test with root privilege");
+        }
+        return;
+    }
+
+    /*
+     * TODO: query a free port instead of hard code.
+     * 29200=('R'+'D'+'M'+'A')*100
+     **/
+    g_autofree char *uri = g_strdup_printf("rdma:%s:29200", buffer);
+
+    MigrateCommon args = {
+        .listen_uri = uri,
+        .connect_uri = uri,
+    };
+
+    test_precopy_common(&args);
+}
+#endif
+
 static void test_precopy_tcp_plain(void)
 {
     MigrateCommon args = {
@@ -1124,6 +1189,10 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env)
                        test_multifd_tcp_uri_none);
     migration_test_add("/migration/multifd/tcp/plain/cancel",
                        test_multifd_tcp_cancel);
+#ifdef CONFIG_RDMA
+    migration_test_add("/migration/precopy/rdma/plain",
+                       test_precopy_rdma_plain);
+#endif
 }
 
 void migration_test_add_precopy(MigrationTestEnv *env)
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PULL 8/8] migration: Add qtest for migration over RDMA
  2025-03-07 18:15 ` [PULL 8/8] migration: Add qtest for migration over RDMA Fabiano Rosas
@ 2025-03-08  6:00   ` Philippe Mathieu-Daudé
  2025-03-08  8:42     ` Stefan Hajnoczi
  2025-03-10  8:01     ` Zhijian Li (Fujitsu) via
  0 siblings, 2 replies; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-03-08  6:00 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel; +Cc: Peter Xu, Li Zhijian, Daniel P. Berrangé

Hi,

On 7/3/25 19:15, Fabiano Rosas wrote:
> From: Li Zhijian <lizhijian@fujitsu.com>
> 
> This qtest requires there is a RDMA(RoCE) link in the host.
> In order to make the test work smoothly, introduce a
> scripts/rdma-migration-helper.sh to
> - setup a new Soft-RoCE(aka RXE) if it's root
> - detect existing RoCE link
> 
> Test will be skipped if there is no available RoCE link.

Is it? Runing as user I'm getting:

   RDMA ERROR: RDMA host is not set!

Apparently called via:

qemu_start_incoming_migration()
   -> rdma_start_incoming_migration()
      -> qemu_rdma_dest_init()

>   # Start of rdma tests
>   # Running /x86_64/migration/precopy/rdma/plain
>   Command 'rdma' is not available, please install it first.
>   # To enable the test:
>   # (1) Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
>   # or
>   # (2) Run the test with root privilege

Could this might be the issue, should we skip if not root, as calling
the script in "detect" mode makes the new_rdma_link() method to succeed.

>   #
>   ok 1 /x86_64/migration/precopy/rdma/plain # SKIP No rdma link available
>   # End of rdma tests
> 
> Note: Remove the newly added RXE link by executing 'modprobe -r rdma_rxe'
> or by specifying 'clean' within this script.

qtest_add() provides both setup() / teardown() methods.

Test leaving system in different state seems bogus to me.
More even if the information is buried in a commit description...

We shouldn't merge this patch as is IMHO.

Regards,

Phil.

> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> Message-ID: <20250305062825.772629-7-lizhijian@fujitsu.com>
> [reformated the message to be under 90 characters]
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>   MAINTAINERS                           |  1 +
>   scripts/rdma-migration-helper.sh      | 48 +++++++++++++++++++
>   tests/qtest/migration/precopy-tests.c | 69 +++++++++++++++++++++++++++
>   3 files changed, 118 insertions(+)
>   create mode 100755 scripts/rdma-migration-helper.sh
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5df6020ed5..56e85adcfb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3517,6 +3517,7 @@ R: Li Zhijian <lizhijian@fujitsu.com>
>   R: Peter Xu <peterx@redhat.com>
>   S: Odd Fixes
>   F: migration/rdma*
> +F: scripts/rdma-migration-helper.sh
>   
>   Migration dirty limit and dirty page rate
>   M: Hyman Huang <yong.huang@smartx.com>
> diff --git a/scripts/rdma-migration-helper.sh b/scripts/rdma-migration-helper.sh
> new file mode 100755
> index 0000000000..08e29a52eb
> --- /dev/null
> +++ b/scripts/rdma-migration-helper.sh
> @@ -0,0 +1,48 @@
> +#!/bin/bash
> +
> +# Copied from blktests
> +get_ipv4_addr()
> +{
> +    ip -4 -o addr show dev "$1" |
> +        sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p' |
> +        tr -d '\n'
> +}
> +
> +has_soft_rdma()
> +{
> +    rdma link | grep -q " netdev $1[[:blank:]]*\$"
> +}
> +
> +rdma_rxe_setup_detect()
> +{
> +    (
> +        cd /sys/class/net &&
> +            for i in *; do
> +                [ -e "$i" ] || continue
> +                [ "$i" = "lo" ] && continue
> +                [ "$(<"$i/addr_len")" = 6 ] || continue
> +                [ "$(<"$i/carrier")" = 1 ] || continue
> +
> +                has_soft_rdma "$i" && break
> +                [ "$operation" = "setup" ] &&
> +                    rdma link add "${i}_rxe" type rxe netdev "$i" && break
> +            done
> +        has_soft_rdma "$i" || return
> +        get_ipv4_addr "$i"
> +    )
> +}
> +
> +operation=${1:-setup}
> +
> +command -v rdma >/dev/null || {
> +    echo "Command 'rdma' is not available, please install it first." >&2
> +    exit 1
> +}
> +
> +if [ "$operation" == "setup" ] || [ "$operation" == "detect" ]; then
> +    rdma_rxe_setup_detect
> +elif [ "$operation" == "clean" ]; then
> +    modprobe -r rdma_rxe
> +else
> +    echo "Usage: $0 [setup | detect | clean]"
> +fi
> diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
> index ba273d10b9..f1fe34020d 100644
> --- a/tests/qtest/migration/precopy-tests.c
> +++ b/tests/qtest/migration/precopy-tests.c
> @@ -99,6 +99,71 @@ static void test_precopy_unix_dirty_ring(void)
>       test_precopy_common(&args);
>   }
>   
> +#ifdef CONFIG_RDMA
> +
> +#define RDMA_MIGRATION_HELPER "scripts/rdma-migration-helper.sh"
> +static int new_rdma_link(char *buffer, bool verbose)
> +{
> +    const char *argument = (geteuid() == 0) ? "setup" : "detect";
> +    char cmd[1024];
> +
> +    snprintf(cmd, sizeof(cmd), "%s %s %s", RDMA_MIGRATION_HELPER, argument,
> +             verbose ? "" : "2>/dev/null");
> +
> +    FILE *pipe = popen(cmd, "r");
> +    if (pipe == NULL) {
> +        perror("Failed to run script");
> +        return -1;
> +    }
> +
> +    int idx = 0;
> +    while (fgets(buffer + idx, 128 - idx, pipe) != NULL) {
> +        idx += strlen(buffer);
> +    }
> +
> +    int status = pclose(pipe);
> +    if (status == -1) {
> +        perror("Error reported by pclose()");
> +        return -1;
> +    } else if (WIFEXITED(status)) {
> +        return WEXITSTATUS(status);
> +    }
> +
> +    return -1;
> +}
> +
> +static void test_precopy_rdma_plain(void)
> +{
> +    char buffer[128] = {};
> +    bool verbose = g_getenv("QTEST_LOG");
> +
> +    if (new_rdma_link(buffer, verbose)) {
> +        g_test_skip("No rdma link available");
> +        if (verbose) {
> +            g_test_message(
> +                "To enable the test:\n"
> +                "(1) Run \'" RDMA_MIGRATION_HELPER
> +                " setup\' with root and rerun the test\n"
> +                "or\n(2) Run the test with root privilege");
> +        }
> +        return;
> +    }
> +
> +    /*
> +     * TODO: query a free port instead of hard code.
> +     * 29200=('R'+'D'+'M'+'A')*100
> +     **/
> +    g_autofree char *uri = g_strdup_printf("rdma:%s:29200", buffer);
> +
> +    MigrateCommon args = {
> +        .listen_uri = uri,
> +        .connect_uri = uri,
> +    };
> +
> +    test_precopy_common(&args);
> +}
> +#endif
> +
>   static void test_precopy_tcp_plain(void)
>   {
>       MigrateCommon args = {
> @@ -1124,6 +1189,10 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env)
>                          test_multifd_tcp_uri_none);
>       migration_test_add("/migration/multifd/tcp/plain/cancel",
>                          test_multifd_tcp_cancel);
> +#ifdef CONFIG_RDMA
> +    migration_test_add("/migration/precopy/rdma/plain",
> +                       test_precopy_rdma_plain);
> +#endif
>   }
>   
>   void migration_test_add_precopy(MigrationTestEnv *env)



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 8/8] migration: Add qtest for migration over RDMA
  2025-03-08  6:00   ` Philippe Mathieu-Daudé
@ 2025-03-08  8:42     ` Stefan Hajnoczi
  2025-03-10  8:33       ` Zhijian Li (Fujitsu) via
  2025-03-10  8:01     ` Zhijian Li (Fujitsu) via
  1 sibling, 1 reply; 24+ messages in thread
From: Stefan Hajnoczi @ 2025-03-08  8:42 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Peter Xu, Li Zhijian, Daniel P. Berrangé,
	Philippe Mathieu-Daudé

On Sat, Mar 8, 2025 at 2:01 PM Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>
> Hi,
>
> On 7/3/25 19:15, Fabiano Rosas wrote:
> > From: Li Zhijian <lizhijian@fujitsu.com>
> >
> > This qtest requires there is a RDMA(RoCE) link in the host.
> > In order to make the test work smoothly, introduce a
> > scripts/rdma-migration-helper.sh to
> > - setup a new Soft-RoCE(aka RXE) if it's root
> > - detect existing RoCE link
> >
> > Test will be skipped if there is no available RoCE link.
>
> Is it? Runing as user I'm getting:
>
>    RDMA ERROR: RDMA host is not set!

The CI is failing too:
https://gitlab.com/qemu-project/qemu/-/jobs/9350004599#L5590

I have dropped this pull request for now. Please send a new revision
once the issue has been resolved.

Stefan

>
> Apparently called via:
>
> qemu_start_incoming_migration()
>    -> rdma_start_incoming_migration()
>       -> qemu_rdma_dest_init()
>
> >   # Start of rdma tests
> >   # Running /x86_64/migration/precopy/rdma/plain
> >   Command 'rdma' is not available, please install it first.
> >   # To enable the test:
> >   # (1) Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
> >   # or
> >   # (2) Run the test with root privilege
>
> Could this might be the issue, should we skip if not root, as calling
> the script in "detect" mode makes the new_rdma_link() method to succeed.
>
> >   #
> >   ok 1 /x86_64/migration/precopy/rdma/plain # SKIP No rdma link available
> >   # End of rdma tests
> >
> > Note: Remove the newly added RXE link by executing 'modprobe -r rdma_rxe'
> > or by specifying 'clean' within this script.
>
> qtest_add() provides both setup() / teardown() methods.
>
> Test leaving system in different state seems bogus to me.
> More even if the information is buried in a commit description...
>
> We shouldn't merge this patch as is IMHO.
>
> Regards,
>
> Phil.
>
> > Reviewed-by: Peter Xu <peterx@redhat.com>
> > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> > Message-ID: <20250305062825.772629-7-lizhijian@fujitsu.com>
> > [reformated the message to be under 90 characters]
> > Signed-off-by: Fabiano Rosas <farosas@suse.de>
> > ---
> >   MAINTAINERS                           |  1 +
> >   scripts/rdma-migration-helper.sh      | 48 +++++++++++++++++++
> >   tests/qtest/migration/precopy-tests.c | 69 +++++++++++++++++++++++++++
> >   3 files changed, 118 insertions(+)
> >   create mode 100755 scripts/rdma-migration-helper.sh
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 5df6020ed5..56e85adcfb 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -3517,6 +3517,7 @@ R: Li Zhijian <lizhijian@fujitsu.com>
> >   R: Peter Xu <peterx@redhat.com>
> >   S: Odd Fixes
> >   F: migration/rdma*
> > +F: scripts/rdma-migration-helper.sh
> >
> >   Migration dirty limit and dirty page rate
> >   M: Hyman Huang <yong.huang@smartx.com>
> > diff --git a/scripts/rdma-migration-helper.sh b/scripts/rdma-migration-helper.sh
> > new file mode 100755
> > index 0000000000..08e29a52eb
> > --- /dev/null
> > +++ b/scripts/rdma-migration-helper.sh
> > @@ -0,0 +1,48 @@
> > +#!/bin/bash
> > +
> > +# Copied from blktests
> > +get_ipv4_addr()
> > +{
> > +    ip -4 -o addr show dev "$1" |
> > +        sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p' |
> > +        tr -d '\n'
> > +}
> > +
> > +has_soft_rdma()
> > +{
> > +    rdma link | grep -q " netdev $1[[:blank:]]*\$"
> > +}
> > +
> > +rdma_rxe_setup_detect()
> > +{
> > +    (
> > +        cd /sys/class/net &&
> > +            for i in *; do
> > +                [ -e "$i" ] || continue
> > +                [ "$i" = "lo" ] && continue
> > +                [ "$(<"$i/addr_len")" = 6 ] || continue
> > +                [ "$(<"$i/carrier")" = 1 ] || continue
> > +
> > +                has_soft_rdma "$i" && break
> > +                [ "$operation" = "setup" ] &&
> > +                    rdma link add "${i}_rxe" type rxe netdev "$i" && break
> > +            done
> > +        has_soft_rdma "$i" || return
> > +        get_ipv4_addr "$i"
> > +    )
> > +}
> > +
> > +operation=${1:-setup}
> > +
> > +command -v rdma >/dev/null || {
> > +    echo "Command 'rdma' is not available, please install it first." >&2
> > +    exit 1
> > +}
> > +
> > +if [ "$operation" == "setup" ] || [ "$operation" == "detect" ]; then
> > +    rdma_rxe_setup_detect
> > +elif [ "$operation" == "clean" ]; then
> > +    modprobe -r rdma_rxe
> > +else
> > +    echo "Usage: $0 [setup | detect | clean]"
> > +fi
> > diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
> > index ba273d10b9..f1fe34020d 100644
> > --- a/tests/qtest/migration/precopy-tests.c
> > +++ b/tests/qtest/migration/precopy-tests.c
> > @@ -99,6 +99,71 @@ static void test_precopy_unix_dirty_ring(void)
> >       test_precopy_common(&args);
> >   }
> >
> > +#ifdef CONFIG_RDMA
> > +
> > +#define RDMA_MIGRATION_HELPER "scripts/rdma-migration-helper.sh"
> > +static int new_rdma_link(char *buffer, bool verbose)
> > +{
> > +    const char *argument = (geteuid() == 0) ? "setup" : "detect";
> > +    char cmd[1024];
> > +
> > +    snprintf(cmd, sizeof(cmd), "%s %s %s", RDMA_MIGRATION_HELPER, argument,
> > +             verbose ? "" : "2>/dev/null");
> > +
> > +    FILE *pipe = popen(cmd, "r");
> > +    if (pipe == NULL) {
> > +        perror("Failed to run script");
> > +        return -1;
> > +    }
> > +
> > +    int idx = 0;
> > +    while (fgets(buffer + idx, 128 - idx, pipe) != NULL) {
> > +        idx += strlen(buffer);
> > +    }
> > +
> > +    int status = pclose(pipe);
> > +    if (status == -1) {
> > +        perror("Error reported by pclose()");
> > +        return -1;
> > +    } else if (WIFEXITED(status)) {
> > +        return WEXITSTATUS(status);
> > +    }
> > +
> > +    return -1;
> > +}
> > +
> > +static void test_precopy_rdma_plain(void)
> > +{
> > +    char buffer[128] = {};
> > +    bool verbose = g_getenv("QTEST_LOG");
> > +
> > +    if (new_rdma_link(buffer, verbose)) {
> > +        g_test_skip("No rdma link available");
> > +        if (verbose) {
> > +            g_test_message(
> > +                "To enable the test:\n"
> > +                "(1) Run \'" RDMA_MIGRATION_HELPER
> > +                " setup\' with root and rerun the test\n"
> > +                "or\n(2) Run the test with root privilege");
> > +        }
> > +        return;
> > +    }
> > +
> > +    /*
> > +     * TODO: query a free port instead of hard code.
> > +     * 29200=('R'+'D'+'M'+'A')*100
> > +     **/
> > +    g_autofree char *uri = g_strdup_printf("rdma:%s:29200", buffer);
> > +
> > +    MigrateCommon args = {
> > +        .listen_uri = uri,
> > +        .connect_uri = uri,
> > +    };
> > +
> > +    test_precopy_common(&args);
> > +}
> > +#endif
> > +
> >   static void test_precopy_tcp_plain(void)
> >   {
> >       MigrateCommon args = {
> > @@ -1124,6 +1189,10 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env)
> >                          test_multifd_tcp_uri_none);
> >       migration_test_add("/migration/multifd/tcp/plain/cancel",
> >                          test_multifd_tcp_cancel);
> > +#ifdef CONFIG_RDMA
> > +    migration_test_add("/migration/precopy/rdma/plain",
> > +                       test_precopy_rdma_plain);
> > +#endif
> >   }
> >
> >   void migration_test_add_precopy(MigrationTestEnv *env)
>
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 8/8] migration: Add qtest for migration over RDMA
  2025-03-08  6:00   ` Philippe Mathieu-Daudé
  2025-03-08  8:42     ` Stefan Hajnoczi
@ 2025-03-10  8:01     ` Zhijian Li (Fujitsu) via
  2025-03-10 15:00       ` Fabiano Rosas
  1 sibling, 1 reply; 24+ messages in thread
From: Zhijian Li (Fujitsu) via @ 2025-03-10  8:01 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Fabiano Rosas, qemu-devel@nongnu.org
  Cc: Peter Xu, Daniel P. Berrangé, stefanha@gmail.com

Hi Philippe,

Thanks for your testing.


On 08/03/2025 14:00, Philippe Mathieu-Daudé wrote:
> Hi,
> 
> On 7/3/25 19:15, Fabiano Rosas wrote:
>> From: Li Zhijian <lizhijian@fujitsu.com>
>>
>> This qtest requires there is a RDMA(RoCE) link in the host.
>> In order to make the test work smoothly, introduce a
>> scripts/rdma-migration-helper.sh to
>> - setup a new Soft-RoCE(aka RXE) if it's root
>> - detect existing RoCE link
>>
>> Test will be skipped if there is no available RoCE link.
> 
> Is it? Runing as user I'm getting:
> 
>    RDMA ERROR: RDMA host is not set!


It's unexpected behavior.

It implies that the script terminated successfully with an exit
code of 0(there is a RDMA link), yet failed to display its canonical
IPv4 address.

Stefan also mentioned the same error...
https://gitlab.com/qemu-project/qemu/-/jobs/9350004599#L5590

I couldn't reproduce your error.

Could you share the output of this script with a normal user,
$ scripts/rdma-migration-helper.sh detect

if your had a rdma/RXE link, please share the output of its ip
$ ip -4 -o addr show dev <NIC>

Where the <NIC> is an interface associated with the RoCE(RXE), for example

$ rdma link
link enp2s0_rxe/1 state ACTIVE physical_state LINK_UP netdev enp2s0

then the <NIC> is enp2s0


> 
> Apparently called via:
> 
> qemu_start_incoming_migration()
>    -> rdma_start_incoming_migration()
>       -> qemu_rdma_dest_init()
> 
>>   # Start of rdma tests
>>   # Running /x86_64/migration/precopy/rdma/plain
>>   Command 'rdma' is not available, please install it first.
>>   # To enable the test:
>>   # (1) Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
>>   # or
>>   # (2) Run the test with root privilege
> 
> Could this might be the issue, should we skip if not root, as calling
> the script in "detect" mode makes the new_rdma_link() method to succeed.

It's expected the 'detect' should succeed and print a IPv4 address

> 
>>   #
>>   ok 1 /x86_64/migration/precopy/rdma/plain # SKIP No rdma link available
>>   # End of rdma tests
>>
>> Note: Remove the newly added RXE link by executing 'modprobe -r rdma_rxe'
>> or by specifying 'clean' within this script.
> 
> qtest_add() provides both setup() / teardown() methods.> 

This may require a minor refactor of the migration-test framework to
enable support for setup() and teardown() methods.

Let me see...



> Test leaving system in different state seems bogus to me.

At this point, I'm unable to refute. It indeed dirty the
It indeed might dirty the system.


A palatable compromise might be that, regardless of whether one is a
root user or not, this test is only supported on hosts with RDMA link.

Otherwise, it will provide an SKIP warning.

      # Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
      # Optional: run 'scripts/rdma-migration-helper.sh clean' to revert the 'setup'

For local users, they can independently use this script to set up and clean RDMA,
as they are aware of the modifications they have made to the system.

Thanks
Zhijian

> More even if the information is buried in a commit description...> 
> We shouldn't merge this patch as is IMHO.
> 
> Regards,
> 
> Phil.
> 
>> Reviewed-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>> Message-ID: <20250305062825.772629-7-lizhijian@fujitsu.com>
>> [reformated the message to be under 90 characters]
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>   MAINTAINERS                           |  1 +
>>   scripts/rdma-migration-helper.sh      | 48 +++++++++++++++++++
>>   tests/qtest/migration/precopy-tests.c | 69 +++++++++++++++++++++++++++
>>   3 files changed, 118 insertions(+)
>>   create mode 100755 scripts/rdma-migration-helper.sh
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 5df6020ed5..56e85adcfb 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -3517,6 +3517,7 @@ R: Li Zhijian <lizhijian@fujitsu.com>
>>   R: Peter Xu <peterx@redhat.com>
>>   S: Odd Fixes
>>   F: migration/rdma*
>> +F: scripts/rdma-migration-helper.sh
>>   Migration dirty limit and dirty page rate
>>   M: Hyman Huang <yong.huang@smartx.com>
>> diff --git a/scripts/rdma-migration-helper.sh b/scripts/rdma-migration-helper.sh
>> new file mode 100755
>> index 0000000000..08e29a52eb
>> --- /dev/null
>> +++ b/scripts/rdma-migration-helper.sh
>> @@ -0,0 +1,48 @@
>> +#!/bin/bash
>> +
>> +# Copied from blktests
>> +get_ipv4_addr()
>> +{
>> +    ip -4 -o addr show dev "$1" |
>> +        sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p' |
>> +        tr -d '\n'
>> +}
>> +
>> +has_soft_rdma()
>> +{
>> +    rdma link | grep -q " netdev $1[[:blank:]]*\$"
>> +}
>> +
>> +rdma_rxe_setup_detect()
>> +{
>> +    (
>> +        cd /sys/class/net &&
>> +            for i in *; do
>> +                [ -e "$i" ] || continue
>> +                [ "$i" = "lo" ] && continue
>> +                [ "$(<"$i/addr_len")" = 6 ] || continue
>> +                [ "$(<"$i/carrier")" = 1 ] || continue
>> +
>> +                has_soft_rdma "$i" && break
>> +                [ "$operation" = "setup" ] &&
>> +                    rdma link add "${i}_rxe" type rxe netdev "$i" && break
>> +            done
>> +        has_soft_rdma "$i" || return
>> +        get_ipv4_addr "$i"
>> +    )
>> +}
>> +
>> +operation=${1:-setup}
>> +
>> +command -v rdma >/dev/null || {
>> +    echo "Command 'rdma' is not available, please install it first." >&2
>> +    exit 1
>> +}
>> +
>> +if [ "$operation" == "setup" ] || [ "$operation" == "detect" ]; then
>> +    rdma_rxe_setup_detect
>> +elif [ "$operation" == "clean" ]; then
>> +    modprobe -r rdma_rxe
>> +else
>> +    echo "Usage: $0 [setup | detect | clean]"
>> +fi
>> diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
>> index ba273d10b9..f1fe34020d 100644
>> --- a/tests/qtest/migration/precopy-tests.c
>> +++ b/tests/qtest/migration/precopy-tests.c
>> @@ -99,6 +99,71 @@ static void test_precopy_unix_dirty_ring(void)
>>       test_precopy_common(&args);
>>   }
>> +#ifdef CONFIG_RDMA
>> +
>> +#define RDMA_MIGRATION_HELPER "scripts/rdma-migration-helper.sh"
>> +static int new_rdma_link(char *buffer, bool verbose)
>> +{
>> +    const char *argument = (geteuid() == 0) ? "setup" : "detect";
>> +    char cmd[1024];
>> +
>> +    snprintf(cmd, sizeof(cmd), "%s %s %s", RDMA_MIGRATION_HELPER, argument,
>> +             verbose ? "" : "2>/dev/null");
>> +
>> +    FILE *pipe = popen(cmd, "r");
>> +    if (pipe == NULL) {
>> +        perror("Failed to run script");
>> +        return -1;
>> +    }
>> +
>> +    int idx = 0;
>> +    while (fgets(buffer + idx, 128 - idx, pipe) != NULL) {
>> +        idx += strlen(buffer);
>> +    }
>> +
>> +    int status = pclose(pipe);
>> +    if (status == -1) {
>> +        perror("Error reported by pclose()");
>> +        return -1;
>> +    } else if (WIFEXITED(status)) {
>> +        return WEXITSTATUS(status);
>> +    }
>> +
>> +    return -1;
>> +}
>> +
>> +static void test_precopy_rdma_plain(void)
>> +{
>> +    char buffer[128] = {};
>> +    bool verbose = g_getenv("QTEST_LOG");
>> +
>> +    if (new_rdma_link(buffer, verbose)) {
>> +        g_test_skip("No rdma link available");
>> +        if (verbose) {
>> +            g_test_message(
>> +                "To enable the test:\n"
>> +                "(1) Run \'" RDMA_MIGRATION_HELPER
>> +                " setup\' with root and rerun the test\n"
>> +                "or\n(2) Run the test with root privilege");
>> +        }
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * TODO: query a free port instead of hard code.
>> +     * 29200=('R'+'D'+'M'+'A')*100
>> +     **/
>> +    g_autofree char *uri = g_strdup_printf("rdma:%s:29200", buffer);
>> +
>> +    MigrateCommon args = {
>> +        .listen_uri = uri,
>> +        .connect_uri = uri,
>> +    };
>> +
>> +    test_precopy_common(&args);
>> +}
>> +#endif
>> +
>>   static void test_precopy_tcp_plain(void)
>>   {
>>       MigrateCommon args = {
>> @@ -1124,6 +1189,10 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env)
>>                          test_multifd_tcp_uri_none);
>>       migration_test_add("/migration/multifd/tcp/plain/cancel",
>>                          test_multifd_tcp_cancel);
>> +#ifdef CONFIG_RDMA
>> +    migration_test_add("/migration/precopy/rdma/plain",
>> +                       test_precopy_rdma_plain);
>> +#endif
>>   }
>>   void migration_test_add_precopy(MigrationTestEnv *env)
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 8/8] migration: Add qtest for migration over RDMA
  2025-03-08  8:42     ` Stefan Hajnoczi
@ 2025-03-10  8:33       ` Zhijian Li (Fujitsu) via
  2025-03-10 14:36         ` Peter Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Zhijian Li (Fujitsu) via @ 2025-03-10  8:33 UTC (permalink / raw)
  To: Stefan Hajnoczi, Fabiano Rosas
  Cc: qemu-devel@nongnu.org, Peter Xu, Daniel P. Berrangé,
	Philippe Mathieu-Daudé, Alex Bennée, Thomas Huth

Hi Stefan,

Copied to gitlab CI,

On 08/03/2025 16:42, Stefan Hajnoczi wrote:
> On Sat, Mar 8, 2025 at 2:01 PM Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>
>> Hi,
>>
>> On 7/3/25 19:15, Fabiano Rosas wrote:
>>> From: Li Zhijian <lizhijian@fujitsu.com>
>>>
>>> This qtest requires there is a RDMA(RoCE) link in the host.
>>> In order to make the test work smoothly, introduce a
>>> scripts/rdma-migration-helper.sh to
>>> - setup a new Soft-RoCE(aka RXE) if it's root
>>> - detect existing RoCE link
>>>
>>> Test will be skipped if there is no available RoCE link.
>>
>> Is it? Runing as user I'm getting:
>>
>>     RDMA ERROR: RDMA host is not set!
> 
> The CI is failing too:
> https://gitlab.com/qemu-project/qemu/-/jobs/9350004599#L5590

Thanks for this info, unfortunately, there is no 'testlog.txt' in this gitlab-ci.

I learned that x86 runner worked well
https://gitlab.com/qemu-project/qemu/-/jobs/9350004633

So I doubt this is aarch64 specific, but I don't have an aarch64 in hand.


Cced @CI guys:
So I prefer to send a patch to make gitlab-ci to 'cat testlog.txt' in 'check' failure.
What are your thoughts on this?

the CI diff would like:

diff --git a/.gitlab-ci.d/buildtest-template.yml b/.gitlab-ci.d/buildtest-template.yml
index 4cc1923931..b2592faa15 100644
--- a/.gitlab-ci.d/buildtest-template.yml
+++ b/.gitlab-ci.d/buildtest-template.yml
@@ -38,7 +38,7 @@
      - section_start test "Running tests"
      - if test -n "$MAKE_CHECK_ARGS";
        then
-        $MAKE -j"$JOBS" $MAKE_CHECK_ARGS ;
+        $MAKE -j"$JOBS" $MAKE_CHECK_ARGS || { cat meson-logs/testlog.txt 2>/dev/null; false; }
        fi
      - section_end test
      - ccache --show-stats
@@ -77,7 +77,7 @@
        fi
      - section_end buildenv
      - section_start test "Running tests"
-    - $MAKE NINJA=":" $MAKE_CHECK_ARGS
+    - $MAKE NINJA=":" $MAKE_CHECK_ARGS || { cat meson-logs/testlog.txt 2>/dev/null; false; }
      - section_end test
  
  .native_test_job_template:
diff --git a/.gitlab-ci.d/crossbuild-template.yml b/.gitlab-ci.d/crossbuild-template.yml
index 303943f818..88ff592419 100644
--- a/.gitlab-ci.d/crossbuild-template.yml
+++ b/.gitlab-ci.d/crossbuild-template.yml
@@ -36,7 +36,7 @@
      - section_start test "Running tests"
      - if test -n "$MAKE_CHECK_ARGS";
        then
-        $MAKE -j"$JOBS" $MAKE_CHECK_ARGS ;
+        $MAKE -j"$JOBS" $MAKE_CHECK_ARGS || { cat meson-logs/testlog.txt 2>/dev/null; false; }
        fi
      - section_end test
      - section_start installer "Building the installer"
@@ -82,7 +82,7 @@
      - section_start test "Running tests"
      - if test -n "$MAKE_CHECK_ARGS";
        then
-        $MAKE -j"$JOBS" $MAKE_CHECK_ARGS ;
+        $MAKE -j"$JOBS" $MAKE_CHECK_ARGS || { cat meson-logs/testlog.txt 2>/dev/null; false; }
        fi
      - section_end test
  
@@ -116,7 +116,7 @@
      - section_start test "Running tests"
      - if test -n "$MAKE_CHECK_ARGS";
        then
-        $MAKE -j"$JOBS" $MAKE_CHECK_ARGS ;
+        $MAKE -j"$JOBS" $MAKE_CHECK_ARGS || { cat meson-logs/testlog.txt 2>/dev/null; false; }
        fi
      - section_end test
  
diff --git a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml
index 8727687e2b..b7a4e3c599 100644
--- a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml
+++ b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml
@@ -23,3 +23,4 @@ ubuntu-22.04-aarch32-all:
     || { cat config.log meson-logs/meson-log.txt; exit 1; }
   - make --output-sync -j`nproc --ignore=40`
   - make --output-sync -j`nproc --ignore=40` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
diff --git a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
index ca2f140471..6fcb576a59 100644
--- a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
+++ b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
@@ -22,6 +22,7 @@ ubuntu-22.04-aarch64-all-linux-static:
   - make --output-sync -j`nproc --ignore=40`
   - make check-tcg
   - make --output-sync -j`nproc --ignore=40` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-aarch64-all:
   extends: .custom_runner_template
@@ -44,6 +45,7 @@ ubuntu-22.04-aarch64-all:
     || { cat config.log meson-logs/meson-log.txt; exit 1; }
   - make --output-sync -j`nproc --ignore=40`
   - make --output-sync -j`nproc --ignore=40` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-aarch64-without-defaults:
   extends: .custom_runner_template
@@ -66,6 +68,7 @@ ubuntu-22.04-aarch64-without-defaults:
     || { cat config.log meson-logs/meson-log.txt; exit 1; }
   - make --output-sync -j`nproc --ignore=40`
   - make --output-sync -j`nproc --ignore=40` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-aarch64-alldbg:
   extends: .custom_runner_template
@@ -85,6 +88,7 @@ ubuntu-22.04-aarch64-alldbg:
   - make clean
   - make --output-sync -j`nproc --ignore=40`
   - make --output-sync -j`nproc --ignore=40` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-aarch64-clang:
   extends: .custom_runner_template
@@ -107,6 +111,7 @@ ubuntu-22.04-aarch64-clang:
     || { cat config.log meson-logs/meson-log.txt; exit 1; }
   - make --output-sync -j`nproc --ignore=40`
   - make --output-sync -j`nproc --ignore=40` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-aarch64-tci:
   needs: []
@@ -149,3 +154,4 @@ ubuntu-22.04-aarch64-notcg:
     || { cat config.log meson-logs/meson-log.txt; exit 1; }
   - make --output-sync -j`nproc --ignore=40`
   - make --output-sync -j`nproc --ignore=40` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
diff --git a/.gitlab-ci.d/custom-runners/ubuntu-22.04-s390x.yml b/.gitlab-ci.d/custom-runners/ubuntu-22.04-s390x.yml
index ca374acb8c..35e36f4124 100644
--- a/.gitlab-ci.d/custom-runners/ubuntu-22.04-s390x.yml
+++ b/.gitlab-ci.d/custom-runners/ubuntu-22.04-s390x.yml
@@ -20,6 +20,7 @@ ubuntu-22.04-s390x-all-linux:
   - make --output-sync -j`nproc`
   - make --output-sync check-tcg
   - make --output-sync -j`nproc` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-s390x-all-system:
   extends: .custom_runner_template
@@ -39,6 +40,7 @@ ubuntu-22.04-s390x-all-system:
     || { cat config.log meson-logs/meson-log.txt; exit 1; }
   - make --output-sync -j`nproc`
   - make --output-sync -j`nproc` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-s390x-alldbg:
   extends: .custom_runner_template
@@ -62,6 +64,7 @@ ubuntu-22.04-s390x-alldbg:
   - make clean
   - make --output-sync -j`nproc`
   - make --output-sync -j`nproc` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-s390x-clang:
   extends: .custom_runner_template
@@ -84,6 +87,7 @@ ubuntu-22.04-s390x-clang:
     || { cat config.log meson-logs/meson-log.txt; exit 1; }
   - make --output-sync -j`nproc`
   - make --output-sync -j`nproc` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }
  
  ubuntu-22.04-s390x-tci:
   needs: []
@@ -126,3 +130,4 @@ ubuntu-22.04-s390x-notcg:
     || { cat config.log meson-logs/meson-log.txt; exit 1; }
   - make --output-sync -j`nproc`
   - make --output-sync -j`nproc` check
+   || { cat meson-logs/testlog.txt 2>/dev/null; false; }


> 
> I have dropped this pull request for now. Please send a new revision
> once the issue has been resolved.
> 
> Stefan
> 
>>
>> Apparently called via:
>>
>> qemu_start_incoming_migration()
>>     -> rdma_start_incoming_migration()
>>        -> qemu_rdma_dest_init()
>>
>>>    # Start of rdma tests
>>>    # Running /x86_64/migration/precopy/rdma/plain
>>>    Command 'rdma' is not available, please install it first.
>>>    # To enable the test:
>>>    # (1) Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
>>>    # or
>>>    # (2) Run the test with root privilege
>>
>> Could this might be the issue, should we skip if not root, as calling
>> the script in "detect" mode makes the new_rdma_link() method to succeed.
>>
>>>    #
>>>    ok 1 /x86_64/migration/precopy/rdma/plain # SKIP No rdma link available
>>>    # End of rdma tests
>>>
>>> Note: Remove the newly added RXE link by executing 'modprobe -r rdma_rxe'
>>> or by specifying 'clean' within this script.
>>
>> qtest_add() provides both setup() / teardown() methods.
>>
>> Test leaving system in different state seems bogus to me.
>> More even if the information is buried in a commit description...
>>
>> We shouldn't merge this patch as is IMHO.
>>
>> Regards,
>>
>> Phil.
>>
>>> Reviewed-by: Peter Xu <peterx@redhat.com>
>>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>>> Message-ID: <20250305062825.772629-7-lizhijian@fujitsu.com>
>>> [reformated the message to be under 90 characters]
>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>> ---
>>>    MAINTAINERS                           |  1 +
>>>    scripts/rdma-migration-helper.sh      | 48 +++++++++++++++++++
>>>    tests/qtest/migration/precopy-tests.c | 69 +++++++++++++++++++++++++++
>>>    3 files changed, 118 insertions(+)
>>>    create mode 100755 scripts/rdma-migration-helper.sh
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 5df6020ed5..56e85adcfb 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -3517,6 +3517,7 @@ R: Li Zhijian <lizhijian@fujitsu.com>
>>>    R: Peter Xu <peterx@redhat.com>
>>>    S: Odd Fixes
>>>    F: migration/rdma*
>>> +F: scripts/rdma-migration-helper.sh
>>>
>>>    Migration dirty limit and dirty page rate
>>>    M: Hyman Huang <yong.huang@smartx.com>
>>> diff --git a/scripts/rdma-migration-helper.sh b/scripts/rdma-migration-helper.sh
>>> new file mode 100755
>>> index 0000000000..08e29a52eb
>>> --- /dev/null
>>> +++ b/scripts/rdma-migration-helper.sh
>>> @@ -0,0 +1,48 @@
>>> +#!/bin/bash
>>> +
>>> +# Copied from blktests
>>> +get_ipv4_addr()
>>> +{
>>> +    ip -4 -o addr show dev "$1" |
>>> +        sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p' |
>>> +        tr -d '\n'
>>> +}
>>> +
>>> +has_soft_rdma()
>>> +{
>>> +    rdma link | grep -q " netdev $1[[:blank:]]*\$"
>>> +}
>>> +
>>> +rdma_rxe_setup_detect()
>>> +{
>>> +    (
>>> +        cd /sys/class/net &&
>>> +            for i in *; do
>>> +                [ -e "$i" ] || continue
>>> +                [ "$i" = "lo" ] && continue
>>> +                [ "$(<"$i/addr_len")" = 6 ] || continue
>>> +                [ "$(<"$i/carrier")" = 1 ] || continue
>>> +
>>> +                has_soft_rdma "$i" && break
>>> +                [ "$operation" = "setup" ] &&
>>> +                    rdma link add "${i}_rxe" type rxe netdev "$i" && break
>>> +            done
>>> +        has_soft_rdma "$i" || return
>>> +        get_ipv4_addr "$i"
>>> +    )
>>> +}
>>> +
>>> +operation=${1:-setup}
>>> +
>>> +command -v rdma >/dev/null || {
>>> +    echo "Command 'rdma' is not available, please install it first." >&2
>>> +    exit 1
>>> +}
>>> +
>>> +if [ "$operation" == "setup" ] || [ "$operation" == "detect" ]; then
>>> +    rdma_rxe_setup_detect
>>> +elif [ "$operation" == "clean" ]; then
>>> +    modprobe -r rdma_rxe
>>> +else
>>> +    echo "Usage: $0 [setup | detect | clean]"
>>> +fi
>>> diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
>>> index ba273d10b9..f1fe34020d 100644
>>> --- a/tests/qtest/migration/precopy-tests.c
>>> +++ b/tests/qtest/migration/precopy-tests.c
>>> @@ -99,6 +99,71 @@ static void test_precopy_unix_dirty_ring(void)
>>>        test_precopy_common(&args);
>>>    }
>>>
>>> +#ifdef CONFIG_RDMA
>>> +
>>> +#define RDMA_MIGRATION_HELPER "scripts/rdma-migration-helper.sh"
>>> +static int new_rdma_link(char *buffer, bool verbose)
>>> +{
>>> +    const char *argument = (geteuid() == 0) ? "setup" : "detect";
>>> +    char cmd[1024];
>>> +
>>> +    snprintf(cmd, sizeof(cmd), "%s %s %s", RDMA_MIGRATION_HELPER, argument,
>>> +             verbose ? "" : "2>/dev/null");
>>> +
>>> +    FILE *pipe = popen(cmd, "r");
>>> +    if (pipe == NULL) {
>>> +        perror("Failed to run script");
>>> +        return -1;
>>> +    }
>>> +
>>> +    int idx = 0;
>>> +    while (fgets(buffer + idx, 128 - idx, pipe) != NULL) {
>>> +        idx += strlen(buffer);
>>> +    }
>>> +
>>> +    int status = pclose(pipe);
>>> +    if (status == -1) {
>>> +        perror("Error reported by pclose()");
>>> +        return -1;
>>> +    } else if (WIFEXITED(status)) {
>>> +        return WEXITSTATUS(status);
>>> +    }
>>> +
>>> +    return -1;
>>> +}
>>> +
>>> +static void test_precopy_rdma_plain(void)
>>> +{
>>> +    char buffer[128] = {};
>>> +    bool verbose = g_getenv("QTEST_LOG");
>>> +
>>> +    if (new_rdma_link(buffer, verbose)) {
>>> +        g_test_skip("No rdma link available");
>>> +        if (verbose) {
>>> +            g_test_message(
>>> +                "To enable the test:\n"
>>> +                "(1) Run \'" RDMA_MIGRATION_HELPER
>>> +                " setup\' with root and rerun the test\n"
>>> +                "or\n(2) Run the test with root privilege");
>>> +        }
>>> +        return;
>>> +    }
>>> +
>>> +    /*
>>> +     * TODO: query a free port instead of hard code.
>>> +     * 29200=('R'+'D'+'M'+'A')*100
>>> +     **/
>>> +    g_autofree char *uri = g_strdup_printf("rdma:%s:29200", buffer);
>>> +
>>> +    MigrateCommon args = {
>>> +        .listen_uri = uri,
>>> +        .connect_uri = uri,
>>> +    };
>>> +
>>> +    test_precopy_common(&args);
>>> +}
>>> +#endif
>>> +
>>>    static void test_precopy_tcp_plain(void)
>>>    {
>>>        MigrateCommon args = {
>>> @@ -1124,6 +1189,10 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env)
>>>                           test_multifd_tcp_uri_none);
>>>        migration_test_add("/migration/multifd/tcp/plain/cancel",
>>>                           test_multifd_tcp_cancel);
>>> +#ifdef CONFIG_RDMA
>>> +    migration_test_add("/migration/precopy/rdma/plain",
>>> +                       test_precopy_rdma_plain);
>>> +#endif
>>>    }
>>>
>>>    void migration_test_add_precopy(MigrationTestEnv *env)
>>
>>

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PULL 8/8] migration: Add qtest for migration over RDMA
  2025-03-10  8:33       ` Zhijian Li (Fujitsu) via
@ 2025-03-10 14:36         ` Peter Xu
  2025-03-11  2:06           ` Zhijian Li (Fujitsu) via
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Xu @ 2025-03-10 14:36 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu)
  Cc: Stefan Hajnoczi, Fabiano Rosas, qemu-devel@nongnu.org,
	Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Alex Bennée, Thomas Huth

On Mon, Mar 10, 2025 at 08:33:14AM +0000, Zhijian Li (Fujitsu) wrote:
> Hi Stefan,
> 
> Copied to gitlab CI,
> 
> On 08/03/2025 16:42, Stefan Hajnoczi wrote:
> > On Sat, Mar 8, 2025 at 2:01 PM Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> >>
> >> Hi,
> >>
> >> On 7/3/25 19:15, Fabiano Rosas wrote:
> >>> From: Li Zhijian <lizhijian@fujitsu.com>
> >>>
> >>> This qtest requires there is a RDMA(RoCE) link in the host.
> >>> In order to make the test work smoothly, introduce a
> >>> scripts/rdma-migration-helper.sh to
> >>> - setup a new Soft-RoCE(aka RXE) if it's root
> >>> - detect existing RoCE link
> >>>
> >>> Test will be skipped if there is no available RoCE link.
> >>
> >> Is it? Runing as user I'm getting:
> >>
> >>     RDMA ERROR: RDMA host is not set!
> > 
> > The CI is failing too:
> > https://gitlab.com/qemu-project/qemu/-/jobs/9350004599#L5590
> 
> Thanks for this info, unfortunately, there is no 'testlog.txt' in this gitlab-ci.

It has it.  Try look for "Job artifacts", then there're "Download" or
"Browse" for testlog.txt.  But there isn't much info.

# Start of rdma tests
# Running /aarch64/migration/precopy/rdma/plain
# Using machine type: virt-10.0
# starting QEMU: exec ./qemu-system-aarch64 -qtest unix:/tmp/qtest-1127030.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1127030.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine virt-10.0,gic-version=3 -name source,debug-threads=on -m 150M  -serial file:/tmp/migration-test-R1OX22/src_serial -cpu max -kernel /tmp/migration-test-R1OX22/bootsect    -accel qtest
# starting QEMU: exec ./qemu-system-aarch64 -qtest unix:/tmp/qtest-1127030.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1127030.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine virt-10.0,gic-version=3 -name target,debug-threads=on -m 150M  -serial file:/tmp/migration-test-R1OX22/dest_serial -incoming rdma::29200  -cpu max -kernel /tmp/migration-test-R1OX22/bootsect    -accel qtest
----------------------------------- stderr -----------------------------------
qemu-system-aarch64: -incoming rdma::29200: RDMA ERROR: RDMA host is not set!
Broken pipe
../tests/qtest/libqtest.c:199: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)

> 
> I learned that x86 runner worked well
> https://gitlab.com/qemu-project/qemu/-/jobs/9350004633
> 
> So I doubt this is aarch64 specific, but I don't have an aarch64 in hand.

I think it means it'll exit 0 even without ipv4 address in the script.  I
doubt whether we used to rely on:

  command -v rdma

But maybe that's available on the reproduced hosts, so it'll pass there.
OTOH, the script should fail the script if no avail ipv4 addr found.

To be explicit, the script does this:

  has_soft_rdma "$i" || return

So even if it failed to see the soft rdma and returned, IIUC
rdma_rxe_setup_detect() will still success.

Maybe it should be this instead?

  has_soft_rdma "$i" || exit -1

We could also sanity check the ipv4 address, e.g.:

  rdma_rxe_setup_detect | grep -Eo '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 8/8] migration: Add qtest for migration over RDMA
  2025-03-10  8:01     ` Zhijian Li (Fujitsu) via
@ 2025-03-10 15:00       ` Fabiano Rosas
  0 siblings, 0 replies; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-10 15:00 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu), Philippe Mathieu-Daudé,
	qemu-devel@nongnu.org
  Cc: Peter Xu, Daniel P. Berrangé, stefanha@gmail.com

"Zhijian Li (Fujitsu)" via <qemu-devel@nongnu.org> writes:

> Hi Philippe,
>
> Thanks for your testing.
>
>
> On 08/03/2025 14:00, Philippe Mathieu-Daudé wrote:
>> Hi,
>> 
>> On 7/3/25 19:15, Fabiano Rosas wrote:
>>> From: Li Zhijian <lizhijian@fujitsu.com>
>>>
>>> This qtest requires there is a RDMA(RoCE) link in the host.
>>> In order to make the test work smoothly, introduce a
>>> scripts/rdma-migration-helper.sh to
>>> - setup a new Soft-RoCE(aka RXE) if it's root
>>> - detect existing RoCE link
>>>
>>> Test will be skipped if there is no available RoCE link.
>> 
>> Is it? Runing as user I'm getting:
>> 
>>    RDMA ERROR: RDMA host is not set!
>
>
> It's unexpected behavior.
>
> It implies that the script terminated successfully with an exit
> code of 0(there is a RDMA link), yet failed to display its canonical
> IPv4 address.
>
> Stefan also mentioned the same error...
> https://gitlab.com/qemu-project/qemu/-/jobs/9350004599#L5590
>
> I couldn't reproduce your error.
>
> Could you share the output of this script with a normal user,
> $ scripts/rdma-migration-helper.sh detect
>
> if your had a rdma/RXE link, please share the output of its ip
> $ ip -4 -o addr show dev <NIC>
>
> Where the <NIC> is an interface associated with the RoCE(RXE), for example
>
> $ rdma link
> link enp2s0_rxe/1 state ACTIVE physical_state LINK_UP netdev enp2s0
>
> then the <NIC> is enp2s0
>
>
>> 
>> Apparently called via:
>> 
>> qemu_start_incoming_migration()
>>    -> rdma_start_incoming_migration()
>>       -> qemu_rdma_dest_init()
>> 
>>>   # Start of rdma tests
>>>   # Running /x86_64/migration/precopy/rdma/plain
>>>   Command 'rdma' is not available, please install it first.
>>>   # To enable the test:
>>>   # (1) Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
>>>   # or
>>>   # (2) Run the test with root privilege
>> 
>> Could this might be the issue, should we skip if not root, as calling
>> the script in "detect" mode makes the new_rdma_link() method to succeed.
>
> It's expected the 'detect' should succeed and print a IPv4 address
>
>> 
>>>   #
>>>   ok 1 /x86_64/migration/precopy/rdma/plain # SKIP No rdma link available
>>>   # End of rdma tests
>>>
>>> Note: Remove the newly added RXE link by executing 'modprobe -r rdma_rxe'
>>> or by specifying 'clean' within this script.
>> 
>> qtest_add() provides both setup() / teardown() methods.> 
>
> This may require a minor refactor of the migration-test framework to
> enable support for setup() and teardown() methods.
>
> Let me see...
>
>
>
>> Test leaving system in different state seems bogus to me.
>
> At this point, I'm unable to refute. It indeed dirty the
> It indeed might dirty the system.
>
>
> A palatable compromise might be that, regardless of whether one is a
> root user or not, this test is only supported on hosts with RDMA link.
>
> Otherwise, it will provide an SKIP warning.
>
>       # Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
>       # Optional: run 'scripts/rdma-migration-helper.sh clean' to revert the 'setup'
>
> For local users, they can independently use this script to set up and clean RDMA,
> as they are aware of the modifications they have made to the system.

Yes, let's skip it unless the user has very explicitly set things up.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 8/8] migration: Add qtest for migration over RDMA
  2025-03-10 14:36         ` Peter Xu
@ 2025-03-11  2:06           ` Zhijian Li (Fujitsu) via
  0 siblings, 0 replies; 24+ messages in thread
From: Zhijian Li (Fujitsu) via @ 2025-03-11  2:06 UTC (permalink / raw)
  To: Peter Xu
  Cc: Stefan Hajnoczi, Fabiano Rosas, qemu-devel@nongnu.org,
	Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Alex Bennée, Thomas Huth



On 10/03/2025 22:36, Peter Xu wrote:
> On Mon, Mar 10, 2025 at 08:33:14AM +0000, Zhijian Li (Fujitsu) wrote:
>> Hi Stefan,
>>
>> Copied to gitlab CI,
>>
>> On 08/03/2025 16:42, Stefan Hajnoczi wrote:
>>> On Sat, Mar 8, 2025 at 2:01 PM Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 7/3/25 19:15, Fabiano Rosas wrote:
>>>>> From: Li Zhijian <lizhijian@fujitsu.com>
>>>>>
>>>>> This qtest requires there is a RDMA(RoCE) link in the host.
>>>>> In order to make the test work smoothly, introduce a
>>>>> scripts/rdma-migration-helper.sh to
>>>>> - setup a new Soft-RoCE(aka RXE) if it's root
>>>>> - detect existing RoCE link
>>>>>
>>>>> Test will be skipped if there is no available RoCE link.
>>>>
>>>> Is it? Runing as user I'm getting:
>>>>
>>>>      RDMA ERROR: RDMA host is not set!
>>>
>>> The CI is failing too:
>>> https://gitlab.com/qemu-project/qemu/-/jobs/9350004599#L5590
>>
>> Thanks for this info, unfortunately, there is no 'testlog.txt' in this gitlab-ci.
> 
> It has it.  Try look for "Job artifacts", then there're "Download" or
> "Browse" for testlog.txt.  But there isn't much info.

Thanks for this information.


> 
> # Start of rdma tests
> # Running /aarch64/migration/precopy/rdma/plain
> # Using machine type: virt-10.0
> # starting QEMU: exec ./qemu-system-aarch64 -qtest unix:/tmp/qtest-1127030.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1127030.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine virt-10.0,gic-version=3 -name source,debug-threads=on -m 150M  -serial file:/tmp/migration-test-R1OX22/src_serial -cpu max -kernel /tmp/migration-test-R1OX22/bootsect    -accel qtest
> # starting QEMU: exec ./qemu-system-aarch64 -qtest unix:/tmp/qtest-1127030.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1127030.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine virt-10.0,gic-version=3 -name target,debug-threads=on -m 150M  -serial file:/tmp/migration-test-R1OX22/dest_serial -incoming rdma::29200  -cpu max -kernel /tmp/migration-test-R1OX22/bootsect    -accel qtest
> ----------------------------------- stderr -----------------------------------
> qemu-system-aarch64: -incoming rdma::29200: RDMA ERROR: RDMA host is not set!
> Broken pipe
> ../tests/qtest/libqtest.c:199: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
> 
>>
>> I learned that x86 runner worked well
>> https://gitlab.com/qemu-project/qemu/-/jobs/9350004633
>>
>> So I doubt this is aarch64 specific, but I don't have an aarch64 in hand.
> 
> I think it means it'll exit 0 even without ipv4 address in the script.  I
> doubt whether we used to rely on:
> 
>    command -v rdma
> 
> But maybe that's available on the reproduced hosts, so it'll pass there.
> OTOH, the script should fail the script if no avail ipv4 addr found.


Yes, I believe this is the cause. I have reproduced it with a rdma link without a ipv4 address.



> 
> To be explicit, the script does this:
> 
>    has_soft_rdma "$i" || return
> 
> So even if it failed to see the soft rdma and returned, IIUC
> rdma_rxe_setup_detect() will still success.
> 
> Maybe it should be this instead?
> 
>    has_soft_rdma "$i" || exit -1
> 
> We could also sanity check the ipv4 address, e.g.:
> 
>    rdma_rxe_setup_detect | grep -Eo '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
> 

Yeah, this can make the script more robust.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 2/8] migration: ram block cpr blockers
  2025-03-07 18:15 ` [PULL 2/8] migration: ram block cpr blockers Fabiano Rosas
@ 2025-03-26 18:46   ` Tom Lendacky
  2025-03-26 19:21     ` Tom Lendacky
  0 siblings, 1 reply; 24+ messages in thread
From: Tom Lendacky @ 2025-03-26 18:46 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Peter Xu, Steve Sistare, David Hildenbrand, Michael Roth

On 3/7/25 12:15, Fabiano Rosas wrote:
> From: Steve Sistare <steven.sistare@oracle.com>
> 
> Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
> in the migration stream file and recreate them later, because the physical
> memory for the blocks is pinned and registered for vfio.  Add a blocker
> for volatile ram blocks.
> 
> Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd may be
> sufficient for CPR, but it has not been tested yet.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  include/exec/memory.h   |  3 ++
>  include/exec/ramblock.h |  1 +
>  migration/savevm.c      |  2 ++
>  system/physmem.c        | 66 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 72 insertions(+)

This patch breaks booting an SNP guest as it triggers the following
assert:

qemu-system-x86_64: ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed.

I tracked it to the err_setg() call in ram_block_add_cpr_blocker().
It looks like the error message is unable to be printed because
rb->cpr_blocker is NULL.

Adding aux-ram-share=on to the -machine object gets me past the error and
therefore the assertion, but isn't that an incompatible change to how an
SNP guest has to be started?

Thanks,
Tom

> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 78c4e0aec8..d09af58c97 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -3203,6 +3203,9 @@ bool ram_block_discard_is_disabled(void);
>   */
>  bool ram_block_discard_is_required(void);
>  
> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp);
> +void ram_block_del_cpr_blocker(RAMBlock *rb);
> +
>  #endif
>  
>  #endif
> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> index 0babd105c0..64484cd821 100644
> --- a/include/exec/ramblock.h
> +++ b/include/exec/ramblock.h
> @@ -39,6 +39,7 @@ struct RAMBlock {
>      /* RCU-enabled, writes protected by the ramlist lock */
>      QLIST_ENTRY(RAMBlock) next;
>      QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
> +    Error *cpr_blocker;
>      int fd;
>      uint64_t fd_offset;
>      int guest_memfd;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 5c4fdfd95e..ce158c3512 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -3514,12 +3514,14 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
>      qemu_ram_set_idstr(mr->ram_block,
>                         memory_region_name(mr), dev);
>      qemu_ram_set_migratable(mr->ram_block);
> +    ram_block_add_cpr_blocker(mr->ram_block, &error_fatal);
>  }
>  
>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
>  {
>      qemu_ram_unset_idstr(mr->ram_block);
>      qemu_ram_unset_migratable(mr->ram_block);
> +    ram_block_del_cpr_blocker(mr->ram_block);
>  }
>  
>  void vmstate_register_ram_global(MemoryRegion *mr)
> diff --git a/system/physmem.c b/system/physmem.c
> index 8c1736f84e..445981a1b4 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -70,7 +70,10 @@
>  
>  #include "qemu/pmem.h"
>  
> +#include "qapi/qapi-types-migration.h"
> +#include "migration/blocker.h"
>  #include "migration/cpr.h"
> +#include "migration/options.h"
>  #include "migration/vmstate.h"
>  
>  #include "qemu/range.h"
> @@ -1903,6 +1906,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>              qemu_mutex_unlock_ramlist();
>              goto out_free;
>          }
> +
> +        error_setg(&new_block->cpr_blocker,
> +                   "Memory region %s uses guest_memfd, "
> +                   "which is not supported with CPR.",
> +                   memory_region_name(new_block->mr));
> +        migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
> +                                  MIG_MODE_CPR_TRANSFER,
> +                                  -1);
>      }
>  
>      ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
> @@ -4094,3 +4105,58 @@ bool ram_block_discard_is_required(void)
>      return qatomic_read(&ram_block_discard_required_cnt) ||
>             qatomic_read(&ram_block_coordinated_discard_required_cnt);
>  }
> +
> +/*
> + * Return true if ram is compatible with CPR.  Do not exclude rom,
> + * because the rom file could change in new QEMU.
> + */
> +static bool ram_is_cpr_compatible(RAMBlock *rb)
> +{
> +    MemoryRegion *mr = rb->mr;
> +
> +    if (!mr || !memory_region_is_ram(mr)) {
> +        return true;
> +    }
> +
> +    /* Ram device is remapped in new QEMU */
> +    if (memory_region_is_ram_device(mr)) {
> +        return true;
> +    }
> +
> +    /*
> +     * A file descriptor is passed to new QEMU and remapped, or its backing
> +     * file is reopened and mapped.  It must be shared to avoid COW.
> +     */
> +    if (rb->fd >= 0 && qemu_ram_is_shared(rb)) {
> +        return true;
> +    }
> +
> +    return false;
> +}
> +
> +/*
> + * Add a blocker for each volatile ram block.  This function should only be
> + * called after we know that the block is migratable.  Non-migratable blocks
> + * are either re-created in new QEMU, or are handled specially, or are covered
> + * by a device-level CPR blocker.
> + */
> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
> +{
> +    assert(qemu_ram_is_migratable(rb));
> +
> +    if (ram_is_cpr_compatible(rb)) {
> +        return;
> +    }
> +
> +    error_setg(&rb->cpr_blocker,
> +               "Memory region %s is not compatible with CPR. share=on is "
> +               "required for memory-backend objects, and aux-ram-share=on is "
> +               "required.", memory_region_name(rb->mr));
> +    migrate_add_blocker_modes(&rb->cpr_blocker, errp, MIG_MODE_CPR_TRANSFER,
> +                              -1);
> +}
> +
> +void ram_block_del_cpr_blocker(RAMBlock *rb)
> +{
> +    migrate_del_blocker(&rb->cpr_blocker);
> +}


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 2/8] migration: ram block cpr blockers
  2025-03-26 18:46   ` Tom Lendacky
@ 2025-03-26 19:21     ` Tom Lendacky
  2025-03-26 19:50       ` Michael Roth
  0 siblings, 1 reply; 24+ messages in thread
From: Tom Lendacky @ 2025-03-26 19:21 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Peter Xu, Steve Sistare, David Hildenbrand, Michael Roth

On 3/26/25 13:46, Tom Lendacky wrote:
> On 3/7/25 12:15, Fabiano Rosas wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>>
>> Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
>> in the migration stream file and recreate them later, because the physical
>> memory for the blocks is pinned and registered for vfio.  Add a blocker
>> for volatile ram blocks.
>>
>> Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd may be
>> sufficient for CPR, but it has not been tested yet.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>> Reviewed-by: Peter Xu <peterx@redhat.com>
>> Reviewed-by: David Hildenbrand <david@redhat.com>
>> Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  include/exec/memory.h   |  3 ++
>>  include/exec/ramblock.h |  1 +
>>  migration/savevm.c      |  2 ++
>>  system/physmem.c        | 66 +++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 72 insertions(+)
> 
> This patch breaks booting an SNP guest as it triggers the following
> assert:
> 
> qemu-system-x86_64: ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed.
> 
> I tracked it to the err_setg() call in ram_block_add_cpr_blocker().
> It looks like the error message is unable to be printed because
> rb->cpr_blocker is NULL.
> 
> Adding aux-ram-share=on to the -machine object gets me past the error and
> therefore the assertion, but isn't that an incompatible change to how an
> SNP guest has to be started?

If I update the err_setg() call to use the errp parameter that is passed
into ram_block_add_cpr_blocker(), I get the following message and then
the guest launch terminates:

qemu-system-x86_64: Memory region pc.bios is not compatible with CPR.
share=on is required for memory-backend objects, and aux-ram-share=on is
required.

The qemu parameters I used prior to this patch that allowed an SNP guest
to launch were:

-machine type=q35,confidential-guest-support=sev0,memory-backend=ram1
-object memory-backend-memfd,id=ram1,size=16G,share=true,prealloc=false

With these parameters after this patch, the launch fails.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index 78c4e0aec8..d09af58c97 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -3203,6 +3203,9 @@ bool ram_block_discard_is_disabled(void);
>>   */
>>  bool ram_block_discard_is_required(void);
>>  
>> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp);
>> +void ram_block_del_cpr_blocker(RAMBlock *rb);
>> +
>>  #endif
>>  
>>  #endif
>> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
>> index 0babd105c0..64484cd821 100644
>> --- a/include/exec/ramblock.h
>> +++ b/include/exec/ramblock.h
>> @@ -39,6 +39,7 @@ struct RAMBlock {
>>      /* RCU-enabled, writes protected by the ramlist lock */
>>      QLIST_ENTRY(RAMBlock) next;
>>      QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
>> +    Error *cpr_blocker;
>>      int fd;
>>      uint64_t fd_offset;
>>      int guest_memfd;
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 5c4fdfd95e..ce158c3512 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -3514,12 +3514,14 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
>>      qemu_ram_set_idstr(mr->ram_block,
>>                         memory_region_name(mr), dev);
>>      qemu_ram_set_migratable(mr->ram_block);
>> +    ram_block_add_cpr_blocker(mr->ram_block, &error_fatal);
>>  }
>>  
>>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
>>  {
>>      qemu_ram_unset_idstr(mr->ram_block);
>>      qemu_ram_unset_migratable(mr->ram_block);
>> +    ram_block_del_cpr_blocker(mr->ram_block);
>>  }
>>  
>>  void vmstate_register_ram_global(MemoryRegion *mr)
>> diff --git a/system/physmem.c b/system/physmem.c
>> index 8c1736f84e..445981a1b4 100644
>> --- a/system/physmem.c
>> +++ b/system/physmem.c
>> @@ -70,7 +70,10 @@
>>  
>>  #include "qemu/pmem.h"
>>  
>> +#include "qapi/qapi-types-migration.h"
>> +#include "migration/blocker.h"
>>  #include "migration/cpr.h"
>> +#include "migration/options.h"
>>  #include "migration/vmstate.h"
>>  
>>  #include "qemu/range.h"
>> @@ -1903,6 +1906,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>>              qemu_mutex_unlock_ramlist();
>>              goto out_free;
>>          }
>> +
>> +        error_setg(&new_block->cpr_blocker,
>> +                   "Memory region %s uses guest_memfd, "
>> +                   "which is not supported with CPR.",
>> +                   memory_region_name(new_block->mr));
>> +        migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
>> +                                  MIG_MODE_CPR_TRANSFER,
>> +                                  -1);
>>      }
>>  
>>      ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
>> @@ -4094,3 +4105,58 @@ bool ram_block_discard_is_required(void)
>>      return qatomic_read(&ram_block_discard_required_cnt) ||
>>             qatomic_read(&ram_block_coordinated_discard_required_cnt);
>>  }
>> +
>> +/*
>> + * Return true if ram is compatible with CPR.  Do not exclude rom,
>> + * because the rom file could change in new QEMU.
>> + */
>> +static bool ram_is_cpr_compatible(RAMBlock *rb)
>> +{
>> +    MemoryRegion *mr = rb->mr;
>> +
>> +    if (!mr || !memory_region_is_ram(mr)) {
>> +        return true;
>> +    }
>> +
>> +    /* Ram device is remapped in new QEMU */
>> +    if (memory_region_is_ram_device(mr)) {
>> +        return true;
>> +    }
>> +
>> +    /*
>> +     * A file descriptor is passed to new QEMU and remapped, or its backing
>> +     * file is reopened and mapped.  It must be shared to avoid COW.
>> +     */
>> +    if (rb->fd >= 0 && qemu_ram_is_shared(rb)) {
>> +        return true;
>> +    }
>> +
>> +    return false;
>> +}
>> +
>> +/*
>> + * Add a blocker for each volatile ram block.  This function should only be
>> + * called after we know that the block is migratable.  Non-migratable blocks
>> + * are either re-created in new QEMU, or are handled specially, or are covered
>> + * by a device-level CPR blocker.
>> + */
>> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
>> +{
>> +    assert(qemu_ram_is_migratable(rb));
>> +
>> +    if (ram_is_cpr_compatible(rb)) {
>> +        return;
>> +    }
>> +
>> +    error_setg(&rb->cpr_blocker,
>> +               "Memory region %s is not compatible with CPR. share=on is "
>> +               "required for memory-backend objects, and aux-ram-share=on is "
>> +               "required.", memory_region_name(rb->mr));
>> +    migrate_add_blocker_modes(&rb->cpr_blocker, errp, MIG_MODE_CPR_TRANSFER,
>> +                              -1);
>> +}
>> +
>> +void ram_block_del_cpr_blocker(RAMBlock *rb)
>> +{
>> +    migrate_del_blocker(&rb->cpr_blocker);
>> +}


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 2/8] migration: ram block cpr blockers
  2025-03-26 19:21     ` Tom Lendacky
@ 2025-03-26 19:50       ` Michael Roth
  2025-03-26 20:13         ` Fabiano Rosas
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Roth @ 2025-03-26 19:50 UTC (permalink / raw)
  To: Fabiano Rosas, Tom Lendacky, qemu-devel
  Cc: Peter Xu, Steve Sistare, David Hildenbrand

Quoting Tom Lendacky (2025-03-26 14:21:31)
> On 3/26/25 13:46, Tom Lendacky wrote:
> > On 3/7/25 12:15, Fabiano Rosas wrote:
> >> From: Steve Sistare <steven.sistare@oracle.com>
> >>
> >> Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
> >> in the migration stream file and recreate them later, because the physical
> >> memory for the blocks is pinned and registered for vfio.  Add a blocker
> >> for volatile ram blocks.
> >>
> >> Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd may be
> >> sufficient for CPR, but it has not been tested yet.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> >> Reviewed-by: Peter Xu <peterx@redhat.com>
> >> Reviewed-by: David Hildenbrand <david@redhat.com>
> >> Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >> ---
> >>  include/exec/memory.h   |  3 ++
> >>  include/exec/ramblock.h |  1 +
> >>  migration/savevm.c      |  2 ++
> >>  system/physmem.c        | 66 +++++++++++++++++++++++++++++++++++++++++
> >>  4 files changed, 72 insertions(+)
> > 
> > This patch breaks booting an SNP guest as it triggers the following
> > assert:
> > 
> > qemu-system-x86_64: ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed.
> > 
> > I tracked it to the err_setg() call in ram_block_add_cpr_blocker().
> > It looks like the error message is unable to be printed because
> > rb->cpr_blocker is NULL.
> > 
> > Adding aux-ram-share=on to the -machine object gets me past the error and
> > therefore the assertion, but isn't that an incompatible change to how an
> > SNP guest has to be started?
> 
> If I update the err_setg() call to use the errp parameter that is passed
> into ram_block_add_cpr_blocker(), I get the following message and then
> the guest launch terminates:
> 
> qemu-system-x86_64: Memory region pc.bios is not compatible with CPR.
> share=on is required for memory-backend objects, and aux-ram-share=on is
> required.
> 
> The qemu parameters I used prior to this patch that allowed an SNP guest
> to launch were:
> 
> -machine type=q35,confidential-guest-support=sev0,memory-backend=ram1
> -object memory-backend-memfd,id=ram1,size=16G,share=true,prealloc=false
> 
> With these parameters after this patch, the launch fails.

I think it might be failing because the caller of
ram_block_add_cpr_blocker() is passing in &error_abort, but if the
error_setg() is call on a properly initialized cpr_blocker value then
SNP is still able to boot for me. I'm not sure where the best spot is
to initialize cpr_blocker, it probably needs to be done before either
ram_block_add_cpr_blocker() or ram_block_del_cpr_blocker() are callable,
but the following avoids the reported crash at least:

diff --git a/system/physmem.c b/system/physmem.c
index 44dd129662..bff0fdcaac 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -4176,6 +4176,7 @@ void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
         return;
     }

+    rb->cpr_blocker = NULL;
     error_setg(&rb->cpr_blocker,
                "Memory region %s is not compatible with CPR. share=on is "
                "required for memory-backend objects, and aux-ram-share=on is "

-Mike

> 
> Thanks,
> Tom
> 
> > 
> > Thanks,
> > Tom
> > 
> >>
> >> diff --git a/include/exec/memory.h b/include/exec/memory.h
> >> index 78c4e0aec8..d09af58c97 100644
> >> --- a/include/exec/memory.h
> >> +++ b/include/exec/memory.h
> >> @@ -3203,6 +3203,9 @@ bool ram_block_discard_is_disabled(void);
> >>   */
> >>  bool ram_block_discard_is_required(void);
> >>  
> >> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp);
> >> +void ram_block_del_cpr_blocker(RAMBlock *rb);
> >> +
> >>  #endif
> >>  
> >>  #endif
> >> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> >> index 0babd105c0..64484cd821 100644
> >> --- a/include/exec/ramblock.h
> >> +++ b/include/exec/ramblock.h
> >> @@ -39,6 +39,7 @@ struct RAMBlock {
> >>      /* RCU-enabled, writes protected by the ramlist lock */
> >>      QLIST_ENTRY(RAMBlock) next;
> >>      QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
> >> +    Error *cpr_blocker;
> >>      int fd;
> >>      uint64_t fd_offset;
> >>      int guest_memfd;
> >> diff --git a/migration/savevm.c b/migration/savevm.c
> >> index 5c4fdfd95e..ce158c3512 100644
> >> --- a/migration/savevm.c
> >> +++ b/migration/savevm.c
> >> @@ -3514,12 +3514,14 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
> >>      qemu_ram_set_idstr(mr->ram_block,
> >>                         memory_region_name(mr), dev);
> >>      qemu_ram_set_migratable(mr->ram_block);
> >> +    ram_block_add_cpr_blocker(mr->ram_block, &error_fatal);
> >>  }
> >>  
> >>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
> >>  {
> >>      qemu_ram_unset_idstr(mr->ram_block);
> >>      qemu_ram_unset_migratable(mr->ram_block);
> >> +    ram_block_del_cpr_blocker(mr->ram_block);
> >>  }
> >>  
> >>  void vmstate_register_ram_global(MemoryRegion *mr)
> >> diff --git a/system/physmem.c b/system/physmem.c
> >> index 8c1736f84e..445981a1b4 100644
> >> --- a/system/physmem.c
> >> +++ b/system/physmem.c
> >> @@ -70,7 +70,10 @@
> >>  
> >>  #include "qemu/pmem.h"
> >>  
> >> +#include "qapi/qapi-types-migration.h"
> >> +#include "migration/blocker.h"
> >>  #include "migration/cpr.h"
> >> +#include "migration/options.h"
> >>  #include "migration/vmstate.h"
> >>  
> >>  #include "qemu/range.h"
> >> @@ -1903,6 +1906,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
> >>              qemu_mutex_unlock_ramlist();
> >>              goto out_free;
> >>          }
> >> +
> >> +        error_setg(&new_block->cpr_blocker,
> >> +                   "Memory region %s uses guest_memfd, "
> >> +                   "which is not supported with CPR.",
> >> +                   memory_region_name(new_block->mr));
> >> +        migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
> >> +                                  MIG_MODE_CPR_TRANSFER,
> >> +                                  -1);
> >>      }
> >>  
> >>      ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
> >> @@ -4094,3 +4105,58 @@ bool ram_block_discard_is_required(void)
> >>      return qatomic_read(&ram_block_discard_required_cnt) ||
> >>             qatomic_read(&ram_block_coordinated_discard_required_cnt);
> >>  }
> >> +
> >> +/*
> >> + * Return true if ram is compatible with CPR.  Do not exclude rom,
> >> + * because the rom file could change in new QEMU.
> >> + */
> >> +static bool ram_is_cpr_compatible(RAMBlock *rb)
> >> +{
> >> +    MemoryRegion *mr = rb->mr;
> >> +
> >> +    if (!mr || !memory_region_is_ram(mr)) {
> >> +        return true;
> >> +    }
> >> +
> >> +    /* Ram device is remapped in new QEMU */
> >> +    if (memory_region_is_ram_device(mr)) {
> >> +        return true;
> >> +    }
> >> +
> >> +    /*
> >> +     * A file descriptor is passed to new QEMU and remapped, or its backing
> >> +     * file is reopened and mapped.  It must be shared to avoid COW.
> >> +     */
> >> +    if (rb->fd >= 0 && qemu_ram_is_shared(rb)) {
> >> +        return true;
> >> +    }
> >> +
> >> +    return false;
> >> +}
> >> +
> >> +/*
> >> + * Add a blocker for each volatile ram block.  This function should only be
> >> + * called after we know that the block is migratable.  Non-migratable blocks
> >> + * are either re-created in new QEMU, or are handled specially, or are covered
> >> + * by a device-level CPR blocker.
> >> + */
> >> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
> >> +{
> >> +    assert(qemu_ram_is_migratable(rb));
> >> +
> >> +    if (ram_is_cpr_compatible(rb)) {
> >> +        return;
> >> +    }
> >> +
> >> +    error_setg(&rb->cpr_blocker,
> >> +               "Memory region %s is not compatible with CPR. share=on is "
> >> +               "required for memory-backend objects, and aux-ram-share=on is "
> >> +               "required.", memory_region_name(rb->mr));
> >> +    migrate_add_blocker_modes(&rb->cpr_blocker, errp, MIG_MODE_CPR_TRANSFER,
> >> +                              -1);
> >> +}
> >> +
> >> +void ram_block_del_cpr_blocker(RAMBlock *rb)
> >> +{
> >> +    migrate_del_blocker(&rb->cpr_blocker);
> >> +}
>


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PULL 2/8] migration: ram block cpr blockers
  2025-03-26 19:50       ` Michael Roth
@ 2025-03-26 20:13         ` Fabiano Rosas
  2025-03-26 21:34           ` Michael Roth
  0 siblings, 1 reply; 24+ messages in thread
From: Fabiano Rosas @ 2025-03-26 20:13 UTC (permalink / raw)
  To: Michael Roth, Tom Lendacky, qemu-devel
  Cc: Peter Xu, Steve Sistare, David Hildenbrand

Michael Roth <michael.roth@amd.com> writes:

> Quoting Tom Lendacky (2025-03-26 14:21:31)
>> On 3/26/25 13:46, Tom Lendacky wrote:
>> > On 3/7/25 12:15, Fabiano Rosas wrote:
>> >> From: Steve Sistare <steven.sistare@oracle.com>
>> >>
>> >> Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
>> >> in the migration stream file and recreate them later, because the physical
>> >> memory for the blocks is pinned and registered for vfio.  Add a blocker
>> >> for volatile ram blocks.
>> >>
>> >> Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd may be
>> >> sufficient for CPR, but it has not been tested yet.
>> >>
>> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> >> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>> >> Reviewed-by: Peter Xu <peterx@redhat.com>
>> >> Reviewed-by: David Hildenbrand <david@redhat.com>
>> >> Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
>> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> >> ---
>> >>  include/exec/memory.h   |  3 ++
>> >>  include/exec/ramblock.h |  1 +
>> >>  migration/savevm.c      |  2 ++
>> >>  system/physmem.c        | 66 +++++++++++++++++++++++++++++++++++++++++
>> >>  4 files changed, 72 insertions(+)
>> > 
>> > This patch breaks booting an SNP guest as it triggers the following
>> > assert:
>> > 
>> > qemu-system-x86_64: ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed.
>> > 

Usually this means the error has already been set previously, which is
not allowed.

>> > I tracked it to the err_setg() call in ram_block_add_cpr_blocker().
>> > It looks like the error message is unable to be printed because
>> > rb->cpr_blocker is NULL.
>> > 
>> > Adding aux-ram-share=on to the -machine object gets me past the error and
>> > therefore the assertion, but isn't that an incompatible change to how an
>> > SNP guest has to be started?
>> 
>> If I update the err_setg() call to use the errp parameter that is passed
>> into ram_block_add_cpr_blocker(), I get the following message and then
>> the guest launch terminates:
>> 

The usage at ram_block_add_cpr_blocker() is correct, the cpr_blocker
gets initialized and registered as a migration blocker. The errp only
becomes relevant later when migration starts and the error condition is
met.

>> qemu-system-x86_64: Memory region pc.bios is not compatible with CPR.
>> share=on is required for memory-backend objects, and aux-ram-share=on is
>> required.

Since errp is an &error_fatal, it causes QEMU to exit, so this^ error
message is bogus.

>> 
>> The qemu parameters I used prior to this patch that allowed an SNP guest
>> to launch were:
>> 
>> -machine type=q35,confidential-guest-support=sev0,memory-backend=ram1
>> -object memory-backend-memfd,id=ram1,size=16G,share=true,prealloc=false
>> 
>> With these parameters after this patch, the launch fails.
>
> I think it might be failing because the caller of
> ram_block_add_cpr_blocker() is passing in &error_abort, but if the
> error_setg() is call on a properly initialized cpr_blocker value then
> SNP is still able to boot for me.
> I'm not sure where the best spot is
> to initialize cpr_blocker, it probably needs to be done before either
> ram_block_add_cpr_blocker() or ram_block_del_cpr_blocker() are callable,
> but the following avoids the reported crash at least:
>
> diff --git a/system/physmem.c b/system/physmem.c
> index 44dd129662..bff0fdcaac 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -4176,6 +4176,7 @@ void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
>          return;
>      }
>
> +    rb->cpr_blocker = NULL;

Could it be the cpr_blocker already got set at ram_block_add() in the
RAM_GUEST_MEMFD path?

>      error_setg(&rb->cpr_blocker,
>                 "Memory region %s is not compatible with CPR. share=on is "
>                 "required for memory-backend objects, and aux-ram-share=on is "
>
> -Mike
>
>> 
>> Thanks,
>> Tom
>> 
>> > 
>> > Thanks,
>> > Tom
>> > 
>> >>
>> >> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> >> index 78c4e0aec8..d09af58c97 100644
>> >> --- a/include/exec/memory.h
>> >> +++ b/include/exec/memory.h
>> >> @@ -3203,6 +3203,9 @@ bool ram_block_discard_is_disabled(void);
>> >>   */
>> >>  bool ram_block_discard_is_required(void);
>> >>  
>> >> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp);
>> >> +void ram_block_del_cpr_blocker(RAMBlock *rb);
>> >> +
>> >>  #endif
>> >>  
>> >>  #endif
>> >> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
>> >> index 0babd105c0..64484cd821 100644
>> >> --- a/include/exec/ramblock.h
>> >> +++ b/include/exec/ramblock.h
>> >> @@ -39,6 +39,7 @@ struct RAMBlock {
>> >>      /* RCU-enabled, writes protected by the ramlist lock */
>> >>      QLIST_ENTRY(RAMBlock) next;
>> >>      QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
>> >> +    Error *cpr_blocker;
>> >>      int fd;
>> >>      uint64_t fd_offset;
>> >>      int guest_memfd;
>> >> diff --git a/migration/savevm.c b/migration/savevm.c
>> >> index 5c4fdfd95e..ce158c3512 100644
>> >> --- a/migration/savevm.c
>> >> +++ b/migration/savevm.c
>> >> @@ -3514,12 +3514,14 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
>> >>      qemu_ram_set_idstr(mr->ram_block,
>> >>                         memory_region_name(mr), dev);
>> >>      qemu_ram_set_migratable(mr->ram_block);
>> >> +    ram_block_add_cpr_blocker(mr->ram_block, &error_fatal);
>> >>  }
>> >>  
>> >>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
>> >>  {
>> >>      qemu_ram_unset_idstr(mr->ram_block);
>> >>      qemu_ram_unset_migratable(mr->ram_block);
>> >> +    ram_block_del_cpr_blocker(mr->ram_block);
>> >>  }
>> >>  
>> >>  void vmstate_register_ram_global(MemoryRegion *mr)
>> >> diff --git a/system/physmem.c b/system/physmem.c
>> >> index 8c1736f84e..445981a1b4 100644
>> >> --- a/system/physmem.c
>> >> +++ b/system/physmem.c
>> >> @@ -70,7 +70,10 @@
>> >>  
>> >>  #include "qemu/pmem.h"
>> >>  
>> >> +#include "qapi/qapi-types-migration.h"
>> >> +#include "migration/blocker.h"
>> >>  #include "migration/cpr.h"
>> >> +#include "migration/options.h"
>> >>  #include "migration/vmstate.h"
>> >>  
>> >>  #include "qemu/range.h"
>> >> @@ -1903,6 +1906,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>> >>              qemu_mutex_unlock_ramlist();
>> >>              goto out_free;
>> >>          }
>> >> +
>> >> +        error_setg(&new_block->cpr_blocker,
>> >> +                   "Memory region %s uses guest_memfd, "
>> >> +                   "which is not supported with CPR.",
>> >> +                   memory_region_name(new_block->mr));
>> >> +        migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
>> >> +                                  MIG_MODE_CPR_TRANSFER,
>> >> +                                  -1);
>> >>      }
>> >>  
>> >>      ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
>> >> @@ -4094,3 +4105,58 @@ bool ram_block_discard_is_required(void)
>> >>      return qatomic_read(&ram_block_discard_required_cnt) ||
>> >>             qatomic_read(&ram_block_coordinated_discard_required_cnt);
>> >>  }
>> >> +
>> >> +/*
>> >> + * Return true if ram is compatible with CPR.  Do not exclude rom,
>> >> + * because the rom file could change in new QEMU.
>> >> + */
>> >> +static bool ram_is_cpr_compatible(RAMBlock *rb)
>> >> +{
>> >> +    MemoryRegion *mr = rb->mr;
>> >> +
>> >> +    if (!mr || !memory_region_is_ram(mr)) {
>> >> +        return true;
>> >> +    }
>> >> +
>> >> +    /* Ram device is remapped in new QEMU */
>> >> +    if (memory_region_is_ram_device(mr)) {
>> >> +        return true;
>> >> +    }
>> >> +
>> >> +    /*
>> >> +     * A file descriptor is passed to new QEMU and remapped, or its backing
>> >> +     * file is reopened and mapped.  It must be shared to avoid COW.
>> >> +     */
>> >> +    if (rb->fd >= 0 && qemu_ram_is_shared(rb)) {
>> >> +        return true;
>> >> +    }
>> >> +
>> >> +    return false;
>> >> +}
>> >> +
>> >> +/*
>> >> + * Add a blocker for each volatile ram block.  This function should only be
>> >> + * called after we know that the block is migratable.  Non-migratable blocks
>> >> + * are either re-created in new QEMU, or are handled specially, or are covered
>> >> + * by a device-level CPR blocker.
>> >> + */
>> >> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
>> >> +{
>> >> +    assert(qemu_ram_is_migratable(rb));
>> >> +
>> >> +    if (ram_is_cpr_compatible(rb)) {
>> >> +        return;
>> >> +    }
>> >> +
>> >> +    error_setg(&rb->cpr_blocker,
>> >> +               "Memory region %s is not compatible with CPR. share=on is "
>> >> +               "required for memory-backend objects, and aux-ram-share=on is "
>> >> +               "required.", memory_region_name(rb->mr));
>> >> +    migrate_add_blocker_modes(&rb->cpr_blocker, errp, MIG_MODE_CPR_TRANSFER,
>> >> +                              -1);
>> >> +}
>> >> +
>> >> +void ram_block_del_cpr_blocker(RAMBlock *rb)
>> >> +{
>> >> +    migrate_del_blocker(&rb->cpr_blocker);
>> >> +}
>>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 2/8] migration: ram block cpr blockers
  2025-03-26 20:13         ` Fabiano Rosas
@ 2025-03-26 21:34           ` Michael Roth
  2025-03-27 12:27             ` Steven Sistare
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Roth @ 2025-03-26 21:34 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Tom Lendacky, qemu-devel, Peter Xu, Steve Sistare,
	David Hildenbrand

On Wed, Mar 26, 2025 at 05:13:50PM -0300, Fabiano Rosas wrote:
> Michael Roth <michael.roth@amd.com> writes:
> 
> > Quoting Tom Lendacky (2025-03-26 14:21:31)
> >> On 3/26/25 13:46, Tom Lendacky wrote:
> >> > On 3/7/25 12:15, Fabiano Rosas wrote:
> >> >> From: Steve Sistare <steven.sistare@oracle.com>
> >> >>
> >> >> Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
> >> >> in the migration stream file and recreate them later, because the physical
> >> >> memory for the blocks is pinned and registered for vfio.  Add a blocker
> >> >> for volatile ram blocks.
> >> >>
> >> >> Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd may be
> >> >> sufficient for CPR, but it has not been tested yet.
> >> >>
> >> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> >> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> >> >> Reviewed-by: Peter Xu <peterx@redhat.com>
> >> >> Reviewed-by: David Hildenbrand <david@redhat.com>
> >> >> Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
> >> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >> >> ---
> >> >>  include/exec/memory.h   |  3 ++
> >> >>  include/exec/ramblock.h |  1 +
> >> >>  migration/savevm.c      |  2 ++
> >> >>  system/physmem.c        | 66 +++++++++++++++++++++++++++++++++++++++++
> >> >>  4 files changed, 72 insertions(+)
> >> > 
> >> > This patch breaks booting an SNP guest as it triggers the following
> >> > assert:
> >> > 
> >> > qemu-system-x86_64: ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed.
> >> > 
> 
> Usually this means the error has already been set previously, which is
> not allowed.
> 
> >> > I tracked it to the err_setg() call in ram_block_add_cpr_blocker().
> >> > It looks like the error message is unable to be printed because
> >> > rb->cpr_blocker is NULL.
> >> > 
> >> > Adding aux-ram-share=on to the -machine object gets me past the error and
> >> > therefore the assertion, but isn't that an incompatible change to how an
> >> > SNP guest has to be started?
> >> 
> >> If I update the err_setg() call to use the errp parameter that is passed
> >> into ram_block_add_cpr_blocker(), I get the following message and then
> >> the guest launch terminates:
> >> 
> 
> The usage at ram_block_add_cpr_blocker() is correct, the cpr_blocker
> gets initialized and registered as a migration blocker. The errp only
> becomes relevant later when migration starts and the error condition is
> met.
> 
> >> qemu-system-x86_64: Memory region pc.bios is not compatible with CPR.
> >> share=on is required for memory-backend objects, and aux-ram-share=on is
> >> required.
> 
> Since errp is an &error_fatal, it causes QEMU to exit, so this^ error
> message is bogus.
> 
> >> 
> >> The qemu parameters I used prior to this patch that allowed an SNP guest
> >> to launch were:
> >> 
> >> -machine type=q35,confidential-guest-support=sev0,memory-backend=ram1
> >> -object memory-backend-memfd,id=ram1,size=16G,share=true,prealloc=false
> >> 
> >> With these parameters after this patch, the launch fails.
> >
> > I think it might be failing because the caller of
> > ram_block_add_cpr_blocker() is passing in &error_abort, but if the
> > error_setg() is call on a properly initialized cpr_blocker value then
> > SNP is still able to boot for me.
> > I'm not sure where the best spot is
> > to initialize cpr_blocker, it probably needs to be done before either
> > ram_block_add_cpr_blocker() or ram_block_del_cpr_blocker() are callable,
> > but the following avoids the reported crash at least:
> >
> > diff --git a/system/physmem.c b/system/physmem.c
> > index 44dd129662..bff0fdcaac 100644
> > --- a/system/physmem.c
> > +++ b/system/physmem.c
> > @@ -4176,6 +4176,7 @@ void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
> >          return;
> >      }
> >
> > +    rb->cpr_blocker = NULL;
> 
> Could it be the cpr_blocker already got set at ram_block_add() in the
> RAM_GUEST_MEMFD path?

That seems to be the case: in some cases ram_block_add() sets cpr_blocker
when (new_block->flags & RAM_GUEST_MEMFD) is true, and then soon after
when ram_block_add_cpr_blocker() is called on the same RAMBlock:

  2025-03-26T21:08:15.092427Z qemu-system-x86_64: warning: ram_block_add: new_block 0x55c247e4c880 new_block->cpr_blocker (nil) name ram1
  
  2025-03-26T21:08:15.124710Z qemu-system-x86_64: warning: ram_block_add: new_block 0x55c2480fde00 new_block->cpr_blocker (nil) name pc.bios
  2025-03-26T21:08:15.126190Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c2480fde00 rb->cpr_blocker 0x55c2480fe050 name pc.bios
  2025-03-26T21:08:15.138582Z qemu-system-x86_64: warning: ram_block_add: new_block 0x55c247e3c1e0 new_block->cpr_blocker (nil) name pc.rom
  2025-03-26T21:08:15.138938Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c247e3c1e0 rb->cpr_blocker 0x55c247e3c890 name pc.rom
  
  2025-03-26T21:08:16.185577Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c248db9200 rb->cpr_blocker (nil) name /rom@etc/acpi/tables
  2025-03-26T21:08:16.187140Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c248085620 rb->cpr_blocker (nil) name /rom@etc/table-loader
  2025-03-26T21:08:16.188029Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c2480ce220 rb->cpr_blocker (nil) name /rom@etc/acpi/rsd

-Mike

> 
> >      error_setg(&rb->cpr_blocker,
> >                 "Memory region %s is not compatible with CPR. share=on is "
> >                 "required for memory-backend objects, and aux-ram-share=on is "
> >
> > -Mike
> >
> >> 
> >> Thanks,
> >> Tom
> >> 
> >> > 
> >> > Thanks,
> >> > Tom
> >> > 
> >> >>
> >> >> diff --git a/include/exec/memory.h b/include/exec/memory.h
> >> >> index 78c4e0aec8..d09af58c97 100644
> >> >> --- a/include/exec/memory.h
> >> >> +++ b/include/exec/memory.h
> >> >> @@ -3203,6 +3203,9 @@ bool ram_block_discard_is_disabled(void);
> >> >>   */
> >> >>  bool ram_block_discard_is_required(void);
> >> >>  
> >> >> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp);
> >> >> +void ram_block_del_cpr_blocker(RAMBlock *rb);
> >> >> +
> >> >>  #endif
> >> >>  
> >> >>  #endif
> >> >> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> >> >> index 0babd105c0..64484cd821 100644
> >> >> --- a/include/exec/ramblock.h
> >> >> +++ b/include/exec/ramblock.h
> >> >> @@ -39,6 +39,7 @@ struct RAMBlock {
> >> >>      /* RCU-enabled, writes protected by the ramlist lock */
> >> >>      QLIST_ENTRY(RAMBlock) next;
> >> >>      QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
> >> >> +    Error *cpr_blocker;
> >> >>      int fd;
> >> >>      uint64_t fd_offset;
> >> >>      int guest_memfd;
> >> >> diff --git a/migration/savevm.c b/migration/savevm.c
> >> >> index 5c4fdfd95e..ce158c3512 100644
> >> >> --- a/migration/savevm.c
> >> >> +++ b/migration/savevm.c
> >> >> @@ -3514,12 +3514,14 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
> >> >>      qemu_ram_set_idstr(mr->ram_block,
> >> >>                         memory_region_name(mr), dev);
> >> >>      qemu_ram_set_migratable(mr->ram_block);
> >> >> +    ram_block_add_cpr_blocker(mr->ram_block, &error_fatal);
> >> >>  }
> >> >>  
> >> >>  void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
> >> >>  {
> >> >>      qemu_ram_unset_idstr(mr->ram_block);
> >> >>      qemu_ram_unset_migratable(mr->ram_block);
> >> >> +    ram_block_del_cpr_blocker(mr->ram_block);
> >> >>  }
> >> >>  
> >> >>  void vmstate_register_ram_global(MemoryRegion *mr)
> >> >> diff --git a/system/physmem.c b/system/physmem.c
> >> >> index 8c1736f84e..445981a1b4 100644
> >> >> --- a/system/physmem.c
> >> >> +++ b/system/physmem.c
> >> >> @@ -70,7 +70,10 @@
> >> >>  
> >> >>  #include "qemu/pmem.h"
> >> >>  
> >> >> +#include "qapi/qapi-types-migration.h"
> >> >> +#include "migration/blocker.h"
> >> >>  #include "migration/cpr.h"
> >> >> +#include "migration/options.h"
> >> >>  #include "migration/vmstate.h"
> >> >>  
> >> >>  #include "qemu/range.h"
> >> >> @@ -1903,6 +1906,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
> >> >>              qemu_mutex_unlock_ramlist();
> >> >>              goto out_free;
> >> >>          }
> >> >> +
> >> >> +        error_setg(&new_block->cpr_blocker,
> >> >> +                   "Memory region %s uses guest_memfd, "
> >> >> +                   "which is not supported with CPR.",
> >> >> +                   memory_region_name(new_block->mr));
> >> >> +        migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
> >> >> +                                  MIG_MODE_CPR_TRANSFER,
> >> >> +                                  -1);
> >> >>      }
> >> >>  
> >> >>      ram_size = (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS;
> >> >> @@ -4094,3 +4105,58 @@ bool ram_block_discard_is_required(void)
> >> >>      return qatomic_read(&ram_block_discard_required_cnt) ||
> >> >>             qatomic_read(&ram_block_coordinated_discard_required_cnt);
> >> >>  }
> >> >> +
> >> >> +/*
> >> >> + * Return true if ram is compatible with CPR.  Do not exclude rom,
> >> >> + * because the rom file could change in new QEMU.
> >> >> + */
> >> >> +static bool ram_is_cpr_compatible(RAMBlock *rb)
> >> >> +{
> >> >> +    MemoryRegion *mr = rb->mr;
> >> >> +
> >> >> +    if (!mr || !memory_region_is_ram(mr)) {
> >> >> +        return true;
> >> >> +    }
> >> >> +
> >> >> +    /* Ram device is remapped in new QEMU */
> >> >> +    if (memory_region_is_ram_device(mr)) {
> >> >> +        return true;
> >> >> +    }
> >> >> +
> >> >> +    /*
> >> >> +     * A file descriptor is passed to new QEMU and remapped, or its backing
> >> >> +     * file is reopened and mapped.  It must be shared to avoid COW.
> >> >> +     */
> >> >> +    if (rb->fd >= 0 && qemu_ram_is_shared(rb)) {
> >> >> +        return true;
> >> >> +    }
> >> >> +
> >> >> +    return false;
> >> >> +}
> >> >> +
> >> >> +/*
> >> >> + * Add a blocker for each volatile ram block.  This function should only be
> >> >> + * called after we know that the block is migratable.  Non-migratable blocks
> >> >> + * are either re-created in new QEMU, or are handled specially, or are covered
> >> >> + * by a device-level CPR blocker.
> >> >> + */
> >> >> +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
> >> >> +{
> >> >> +    assert(qemu_ram_is_migratable(rb));
> >> >> +
> >> >> +    if (ram_is_cpr_compatible(rb)) {
> >> >> +        return;
> >> >> +    }
> >> >> +
> >> >> +    error_setg(&rb->cpr_blocker,
> >> >> +               "Memory region %s is not compatible with CPR. share=on is "
> >> >> +               "required for memory-backend objects, and aux-ram-share=on is "
> >> >> +               "required.", memory_region_name(rb->mr));
> >> >> +    migrate_add_blocker_modes(&rb->cpr_blocker, errp, MIG_MODE_CPR_TRANSFER,
> >> >> +                              -1);
> >> >> +}
> >> >> +
> >> >> +void ram_block_del_cpr_blocker(RAMBlock *rb)
> >> >> +{
> >> >> +    migrate_del_blocker(&rb->cpr_blocker);
> >> >> +}
> >>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 2/8] migration: ram block cpr blockers
  2025-03-26 21:34           ` Michael Roth
@ 2025-03-27 12:27             ` Steven Sistare
  2025-03-27 13:42               ` Tom Lendacky
  2025-03-28  7:05               ` Xiaoyao Li
  0 siblings, 2 replies; 24+ messages in thread
From: Steven Sistare @ 2025-03-27 12:27 UTC (permalink / raw)
  To: Michael Roth, Fabiano Rosas
  Cc: Tom Lendacky, qemu-devel, Peter Xu, David Hildenbrand

On 3/26/2025 5:34 PM, Michael Roth wrote:
> On Wed, Mar 26, 2025 at 05:13:50PM -0300, Fabiano Rosas wrote:
>> Michael Roth <michael.roth@amd.com> writes:
>>
>>> Quoting Tom Lendacky (2025-03-26 14:21:31)
>>>> On 3/26/25 13:46, Tom Lendacky wrote:
>>>>> On 3/7/25 12:15, Fabiano Rosas wrote:
>>>>>> From: Steve Sistare <steven.sistare@oracle.com>
>>>>>>
>>>>>> Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
>>>>>> in the migration stream file and recreate them later, because the physical
>>>>>> memory for the blocks is pinned and registered for vfio.  Add a blocker
>>>>>> for volatile ram blocks.
>>>>>>
>>>>>> Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd may be
>>>>>> sufficient for CPR, but it has not been tested yet.
>>>>>>
>>>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>>>> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>>>>>> Reviewed-by: Peter Xu <peterx@redhat.com>
>>>>>> Reviewed-by: David Hildenbrand <david@redhat.com>
>>>>>> Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
>>>>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>>>>> ---
>>>>>>   include/exec/memory.h   |  3 ++
>>>>>>   include/exec/ramblock.h |  1 +
>>>>>>   migration/savevm.c      |  2 ++
>>>>>>   system/physmem.c        | 66 +++++++++++++++++++++++++++++++++++++++++
>>>>>>   4 files changed, 72 insertions(+)
>>>>>
>>>>> This patch breaks booting an SNP guest as it triggers the following
>>>>> assert:
>>>>>
>>>>> qemu-system-x86_64: ../util/error.c:68: error_setv: Assertion `*errp == NULL' failed.
>>>>>
>>
>> Usually this means the error has already been set previously, which is
>> not allowed.
>>
>>>>> I tracked it to the err_setg() call in ram_block_add_cpr_blocker().
>>>>> It looks like the error message is unable to be printed because
>>>>> rb->cpr_blocker is NULL.
>>>>>
>>>>> Adding aux-ram-share=on to the -machine object gets me past the error and
>>>>> therefore the assertion, but isn't that an incompatible change to how an
>>>>> SNP guest has to be started?
>>>>
>>>> If I update the err_setg() call to use the errp parameter that is passed
>>>> into ram_block_add_cpr_blocker(), I get the following message and then
>>>> the guest launch terminates:
>>>>
>>
>> The usage at ram_block_add_cpr_blocker() is correct, the cpr_blocker
>> gets initialized and registered as a migration blocker. The errp only
>> becomes relevant later when migration starts and the error condition is
>> met.
>>
>>>> qemu-system-x86_64: Memory region pc.bios is not compatible with CPR.
>>>> share=on is required for memory-backend objects, and aux-ram-share=on is
>>>> required.
>>
>> Since errp is an &error_fatal, it causes QEMU to exit, so this^ error
>> message is bogus.
>>
>>>>
>>>> The qemu parameters I used prior to this patch that allowed an SNP guest
>>>> to launch were:
>>>>
>>>> -machine type=q35,confidential-guest-support=sev0,memory-backend=ram1
>>>> -object memory-backend-memfd,id=ram1,size=16G,share=true,prealloc=false
>>>>
>>>> With these parameters after this patch, the launch fails.
>>>
>>> I think it might be failing because the caller of
>>> ram_block_add_cpr_blocker() is passing in &error_abort, but if the
>>> error_setg() is call on a properly initialized cpr_blocker value then
>>> SNP is still able to boot for me.
>>> I'm not sure where the best spot is
>>> to initialize cpr_blocker, it probably needs to be done before either
>>> ram_block_add_cpr_blocker() or ram_block_del_cpr_blocker() are callable,
>>> but the following avoids the reported crash at least:
>>>
>>> diff --git a/system/physmem.c b/system/physmem.c
>>> index 44dd129662..bff0fdcaac 100644
>>> --- a/system/physmem.c
>>> +++ b/system/physmem.c
>>> @@ -4176,6 +4176,7 @@ void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp)
>>>           return;
>>>       }
>>>
>>> +    rb->cpr_blocker = NULL;
>>
>> Could it be the cpr_blocker already got set at ram_block_add() in the
>> RAM_GUEST_MEMFD path?
> 
> That seems to be the case: in some cases ram_block_add() sets cpr_blocker
> when (new_block->flags & RAM_GUEST_MEMFD) is true, and then soon after
> when ram_block_add_cpr_blocker() is called on the same RAMBlock:
> 
>    2025-03-26T21:08:15.092427Z qemu-system-x86_64: warning: ram_block_add: new_block 0x55c247e4c880 new_block->cpr_blocker (nil) name ram1
>    
>    2025-03-26T21:08:15.124710Z qemu-system-x86_64: warning: ram_block_add: new_block 0x55c2480fde00 new_block->cpr_blocker (nil) name pc.bios
>    2025-03-26T21:08:15.126190Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c2480fde00 rb->cpr_blocker 0x55c2480fe050 name pc.bios
>    2025-03-26T21:08:15.138582Z qemu-system-x86_64: warning: ram_block_add: new_block 0x55c247e3c1e0 new_block->cpr_blocker (nil) name pc.rom
>    2025-03-26T21:08:15.138938Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c247e3c1e0 rb->cpr_blocker 0x55c247e3c890 name pc.rom
>    
>    2025-03-26T21:08:16.185577Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c248db9200 rb->cpr_blocker (nil) name /rom@etc/acpi/tables
>    2025-03-26T21:08:16.187140Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c248085620 rb->cpr_blocker (nil) name /rom@etc/table-loader
>    2025-03-26T21:08:16.188029Z qemu-system-x86_64: warning: ram_block_add_cpr_blocker: rb 0x55c2480ce220 rb->cpr_blocker (nil) name /rom@etc/acpi/rsd

Thanks everyone for debugging this.  To summarize, ram_block_add_cpr_blocker already blocks
guest_memfd, because rb->fd < 0.  The fix is to delete this redundant code in ram_block_add:

         error_setg(&new_block->cpr_blocker,
                    "Memory region %s uses guest_memfd, "
                    "which is not supported with CPR.",
                    memory_region_name(new_block->mr));
         migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
                                   MIG_MODE_CPR_TRANSFER, MIG_MODE_CPR_EXEC,
                                   -1);

I will submit a fix (unless Tom or Michael would prefer to author it).

- Steve


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 2/8] migration: ram block cpr blockers
  2025-03-27 12:27             ` Steven Sistare
@ 2025-03-27 13:42               ` Tom Lendacky
  2025-03-28  7:05               ` Xiaoyao Li
  1 sibling, 0 replies; 24+ messages in thread
From: Tom Lendacky @ 2025-03-27 13:42 UTC (permalink / raw)
  To: Steven Sistare, Michael Roth, Fabiano Rosas
  Cc: qemu-devel, Peter Xu, David Hildenbrand

On 3/27/25 07:27, Steven Sistare wrote:
> On 3/26/2025 5:34 PM, Michael Roth wrote:
>> On Wed, Mar 26, 2025 at 05:13:50PM -0300, Fabiano Rosas wrote:
>>> Michael Roth <michael.roth@amd.com> writes:
>>>> Quoting Tom Lendacky (2025-03-26 14:21:31)
>>>>> On 3/26/25 13:46, Tom Lendacky wrote:
>>>>>> On 3/7/25 12:15, Fabiano Rosas wrote:
>>>>>>> From: Steve Sistare <steven.sistare@oracle.com>

> 
> Thanks everyone for debugging this.  To summarize,
> ram_block_add_cpr_blocker already blocks
> guest_memfd, because rb->fd < 0.  The fix is to delete this redundant
> code in ram_block_add:
> 
>         error_setg(&new_block->cpr_blocker,
>                    "Memory region %s uses guest_memfd, "
>                    "which is not supported with CPR.",
>                    memory_region_name(new_block->mr));
>         migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
>                                   MIG_MODE_CPR_TRANSFER, MIG_MODE_CPR_EXEC,
>                                   -1);
> 
> I will submit a fix (unless Tom or Michael would prefer to author it).

Steve, please do submit the fix. Thanks!

Tom

> 
> - Steve


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PULL 2/8] migration: ram block cpr blockers
  2025-03-27 12:27             ` Steven Sistare
  2025-03-27 13:42               ` Tom Lendacky
@ 2025-03-28  7:05               ` Xiaoyao Li
  1 sibling, 0 replies; 24+ messages in thread
From: Xiaoyao Li @ 2025-03-28  7:05 UTC (permalink / raw)
  To: Steven Sistare, Michael Roth, Fabiano Rosas
  Cc: Tom Lendacky, qemu-devel, Peter Xu, David Hildenbrand

On 3/27/2025 8:27 PM, Steven Sistare wrote:
> On 3/26/2025 5:34 PM, Michael Roth wrote:
>> On Wed, Mar 26, 2025 at 05:13:50PM -0300, Fabiano Rosas wrote:
>>> Michael Roth <michael.roth@amd.com> writes:
>>>
>>>> Quoting Tom Lendacky (2025-03-26 14:21:31)
>>>>> On 3/26/25 13:46, Tom Lendacky wrote:
>>>>>> On 3/7/25 12:15, Fabiano Rosas wrote:
>>>>>>> From: Steve Sistare <steven.sistare@oracle.com>
>>>>>>>
>>>>>>> Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile 
>>>>>>> ram blocks
>>>>>>> in the migration stream file and recreate them later, because the 
>>>>>>> physical
>>>>>>> memory for the blocks is pinned and registered for vfio.  Add a 
>>>>>>> blocker
>>>>>>> for volatile ram blocks.
>>>>>>>
>>>>>>> Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd 
>>>>>>> may be
>>>>>>> sufficient for CPR, but it has not been tested yet.
>>>>>>>
>>>>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>>>>> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>>>>>>> Reviewed-by: Peter Xu <peterx@redhat.com>
>>>>>>> Reviewed-by: David Hildenbrand <david@redhat.com>
>>>>>>> Message-ID: <1740667681-257312-1-git-send-email- 
>>>>>>> steven.sistare@oracle.com>
>>>>>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>>>>>> ---
>>>>>>>   include/exec/memory.h   |  3 ++
>>>>>>>   include/exec/ramblock.h |  1 +
>>>>>>>   migration/savevm.c      |  2 ++
>>>>>>>   system/physmem.c        | 66 ++++++++++++++++++++++++++++++++++ 
>>>>>>> +++++++
>>>>>>>   4 files changed, 72 insertions(+)
>>>>>>
>>>>>> This patch breaks booting an SNP guest as it triggers the following
>>>>>> assert:
>>>>>>
>>>>>> qemu-system-x86_64: ../util/error.c:68: error_setv: Assertion 
>>>>>> `*errp == NULL' failed.
>>>>>>
>>>
>>> Usually this means the error has already been set previously, which is
>>> not allowed.
>>>
>>>>>> I tracked it to the err_setg() call in ram_block_add_cpr_blocker().
>>>>>> It looks like the error message is unable to be printed because
>>>>>> rb->cpr_blocker is NULL.
>>>>>>
>>>>>> Adding aux-ram-share=on to the -machine object gets me past the 
>>>>>> error and
>>>>>> therefore the assertion, but isn't that an incompatible change to 
>>>>>> how an
>>>>>> SNP guest has to be started?
>>>>>
>>>>> If I update the err_setg() call to use the errp parameter that is 
>>>>> passed
>>>>> into ram_block_add_cpr_blocker(), I get the following message and then
>>>>> the guest launch terminates:
>>>>>
>>>
>>> The usage at ram_block_add_cpr_blocker() is correct, the cpr_blocker
>>> gets initialized and registered as a migration blocker. The errp only
>>> becomes relevant later when migration starts and the error condition is
>>> met.
>>>
>>>>> qemu-system-x86_64: Memory region pc.bios is not compatible with CPR.
>>>>> share=on is required for memory-backend objects, and aux-ram- 
>>>>> share=on is
>>>>> required.
>>>
>>> Since errp is an &error_fatal, it causes QEMU to exit, so this^ error
>>> message is bogus.
>>>
>>>>>
>>>>> The qemu parameters I used prior to this patch that allowed an SNP 
>>>>> guest
>>>>> to launch were:
>>>>>
>>>>> -machine type=q35,confidential-guest-support=sev0,memory-backend=ram1
>>>>> -object memory-backend- 
>>>>> memfd,id=ram1,size=16G,share=true,prealloc=false
>>>>>
>>>>> With these parameters after this patch, the launch fails.
>>>>
>>>> I think it might be failing because the caller of
>>>> ram_block_add_cpr_blocker() is passing in &error_abort, but if the
>>>> error_setg() is call on a properly initialized cpr_blocker value then
>>>> SNP is still able to boot for me.
>>>> I'm not sure where the best spot is
>>>> to initialize cpr_blocker, it probably needs to be done before either
>>>> ram_block_add_cpr_blocker() or ram_block_del_cpr_blocker() are 
>>>> callable,
>>>> but the following avoids the reported crash at least:
>>>>
>>>> diff --git a/system/physmem.c b/system/physmem.c
>>>> index 44dd129662..bff0fdcaac 100644
>>>> --- a/system/physmem.c
>>>> +++ b/system/physmem.c
>>>> @@ -4176,6 +4176,7 @@ void ram_block_add_cpr_blocker(RAMBlock *rb, 
>>>> Error **errp)
>>>>           return;
>>>>       }
>>>>
>>>> +    rb->cpr_blocker = NULL;
>>>
>>> Could it be the cpr_blocker already got set at ram_block_add() in the
>>> RAM_GUEST_MEMFD path?
>>
>> That seems to be the case: in some cases ram_block_add() sets cpr_blocker
>> when (new_block->flags & RAM_GUEST_MEMFD) is true, and then soon after
>> when ram_block_add_cpr_blocker() is called on the same RAMBlock:
>>
>>    2025-03-26T21:08:15.092427Z qemu-system-x86_64: warning: 
>> ram_block_add: new_block 0x55c247e4c880 new_block->cpr_blocker (nil) 
>> name ram1
>>    2025-03-26T21:08:15.124710Z qemu-system-x86_64: warning: 
>> ram_block_add: new_block 0x55c2480fde00 new_block->cpr_blocker (nil) 
>> name pc.bios
>>    2025-03-26T21:08:15.126190Z qemu-system-x86_64: warning: 
>> ram_block_add_cpr_blocker: rb 0x55c2480fde00 rb->cpr_blocker 
>> 0x55c2480fe050 name pc.bios
>>    2025-03-26T21:08:15.138582Z qemu-system-x86_64: warning: 
>> ram_block_add: new_block 0x55c247e3c1e0 new_block->cpr_blocker (nil) 
>> name pc.rom
>>    2025-03-26T21:08:15.138938Z qemu-system-x86_64: warning: 
>> ram_block_add_cpr_blocker: rb 0x55c247e3c1e0 rb->cpr_blocker 
>> 0x55c247e3c890 name pc.rom
>>    2025-03-26T21:08:16.185577Z qemu-system-x86_64: warning: 
>> ram_block_add_cpr_blocker: rb 0x55c248db9200 rb->cpr_blocker (nil) 
>> name /rom@etc/acpi/tables
>>    2025-03-26T21:08:16.187140Z qemu-system-x86_64: warning: 
>> ram_block_add_cpr_blocker: rb 0x55c248085620 rb->cpr_blocker (nil) 
>> name /rom@etc/table-loader
>>    2025-03-26T21:08:16.188029Z qemu-system-x86_64: warning: 
>> ram_block_add_cpr_blocker: rb 0x55c2480ce220 rb->cpr_blocker (nil) 
>> name /rom@etc/acpi/rsd
> 
> Thanks everyone for debugging this.  To summarize, 
> ram_block_add_cpr_blocker already blocks
> guest_memfd, because rb->fd < 0.  The fix is to delete this redundant 
> code in ram_block_add:
> 
>          error_setg(&new_block->cpr_blocker,
>                     "Memory region %s uses guest_memfd, "
>                     "which is not supported with CPR.",
>                     memory_region_name(new_block->mr));
>          migrate_add_blocker_modes(&new_block->cpr_blocker, errp,
>                                    MIG_MODE_CPR_TRANSFER, 
> MIG_MODE_CPR_EXEC,
>                                    -1);

I just encountered the same issue with TDX guest, after rebasing TDX 
code to 10.0.0-rc0.

thank you all for the reporting and quick solution for it.

> I will submit a fix (unless Tom or Michael would prefer to author it).
> 
> - Steve
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-03-28  7:05 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-07 18:15 [PULL 0/8] Migration patches for 2025-03-07 Fabiano Rosas
2025-03-07 18:15 ` [PULL 1/8] migration: Fix UAF for incoming migration on MigrationState Fabiano Rosas
2025-03-07 18:15 ` [PULL 2/8] migration: ram block cpr blockers Fabiano Rosas
2025-03-26 18:46   ` Tom Lendacky
2025-03-26 19:21     ` Tom Lendacky
2025-03-26 19:50       ` Michael Roth
2025-03-26 20:13         ` Fabiano Rosas
2025-03-26 21:34           ` Michael Roth
2025-03-27 12:27             ` Steven Sistare
2025-03-27 13:42               ` Tom Lendacky
2025-03-28  7:05               ` Xiaoyao Li
2025-03-07 18:15 ` [PULL 3/8] migration: Prioritize RDMA in ram_save_target_page() Fabiano Rosas
2025-03-07 18:15 ` [PULL 4/8] migration: check RDMA and capabilities are compatible on both sides Fabiano Rosas
2025-03-07 18:15 ` [PULL 5/8] migration: disable RDMA + postcopy-ram Fabiano Rosas
2025-03-07 18:15 ` [PULL 6/8] migration/rdma: Remove redundant migration_in_postcopy checks Fabiano Rosas
2025-03-07 18:15 ` [PULL 7/8] migration: Unfold control_save_page() Fabiano Rosas
2025-03-07 18:15 ` [PULL 8/8] migration: Add qtest for migration over RDMA Fabiano Rosas
2025-03-08  6:00   ` Philippe Mathieu-Daudé
2025-03-08  8:42     ` Stefan Hajnoczi
2025-03-10  8:33       ` Zhijian Li (Fujitsu) via
2025-03-10 14:36         ` Peter Xu
2025-03-11  2:06           ` Zhijian Li (Fujitsu) via
2025-03-10  8:01     ` Zhijian Li (Fujitsu) via
2025-03-10 15:00       ` Fabiano Rosas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).