All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup
@ 2025-03-05  6:28 Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 1/6] migration: Prioritize RDMA in ram_save_target_page() Li Zhijian via
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Li Zhijian via @ 2025-03-05  6:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Li Zhijian

- It fix the RDMA migration broken issue
- disable RDMA + postcopy
- some cleanups
- Add a qtest for RDMA at last

Changes since V4:
 - collect Reviewed-tags
 - Address comments in patch "migration: Add qtest for migration over RDMA" from Fabiano Rosas <farosas@suse.de>

Changes since V3:
- check RDMA and capabilities are compatible on both sides # renamed from
  previous V3's "migration: Add migration_capabilities_and_transport_compatible()"

Changes since V2:
- squash previous 2/3/4 to '[PATCH v3 5/6] migration: Unfold  control_save_page()'
- reorder the patch layout to prevent recently added code from being deleted again.
- collect Reviewed tags from Peter

Changes since V1[0]:
Add some saparate patches to refactor and cleanup based on V1

[0] https://lore.kernel.org/qemu-devel/20250218074345.638203-1-lizhijian@fujitsu.com/

Li Zhijian (6):
  migration: Prioritize RDMA in ram_save_target_page()
  migration: check RDMA and capabilities are compatible on both sides
  migration: disable RDMA + postcopy-ram
  migration/rdma: Remove redundant migration_in_postcopy checks
  migration: Unfold control_save_page()
  migration: Add qtest for migration over RDMA

 MAINTAINERS                           |  1 +
 migration/migration.c                 | 30 ++++++++----
 migration/options.c                   | 25 ++++++++++
 migration/options.h                   |  1 +
 migration/ram.c                       | 41 +++++-----------
 migration/rdma.c                      | 11 ++---
 migration/rdma.h                      |  3 +-
 scripts/rdma-migration-helper.sh      | 48 +++++++++++++++++++
 tests/qtest/migration/precopy-tests.c | 67 +++++++++++++++++++++++++++
 9 files changed, 178 insertions(+), 49 deletions(-)
 create mode 100755 scripts/rdma-migration-helper.sh

-- 
2.44.0



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v5 1/6] migration: Prioritize RDMA in ram_save_target_page()
  2025-03-05  6:28 [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Li Zhijian via
@ 2025-03-05  6:28 ` Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 2/6] migration: check RDMA and capabilities are compatible on both sides Li Zhijian via
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Li Zhijian via @ 2025-03-05  6:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Li Zhijian

Address an error in RDMA-based migration by ensuring RDMA is prioritized
when saving pages in `ram_save_target_page()`.

Previously, the RDMA protocol's page-saving step was placed after other
protocols due to a refactoring in commit bc38dc2f5f3. This led to migration
failures characterized by unknown control messages and state loading errors
destination:
(qemu) qemu-system-x86_64: Unknown control message QEMU FILE
qemu-system-x86_64: error while loading state section id 1(ram)
qemu-system-x86_64: load of migration failed: Operation not permitted
source:
(qemu) qemu-system-x86_64: RDMA is in an error state waiting migration to abort!
qemu-system-x86_64: failed to save SaveStateEntry with id(name): 1(ram): -1
qemu-system-x86_64: rdma migration: recv polling control error!
qemu-system-x86_64: warning: Early error. Sending error.
qemu-system-x86_64: warning: rdma migration: send polling control error

RDMA migration implemented its own protocol/method to send pages to
destination side, hand over to RDMA first to prevent pages being saved by
other protocol.

Fixes: bc38dc2f5f3 ("migration: refactor ram_save_target_page functions")
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V3:
   collect Reviewed tags
---
 migration/ram.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 589b6505eb2..424df6d9f13 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1964,6 +1964,11 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
     ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
     int res;
 
+    /* Hand over to RDMA first */
+    if (control_save_page(pss, offset, &res)) {
+        return res;
+    }
+
     if (!migrate_multifd()
         || migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) {
         if (save_zero_page(rs, pss, offset)) {
@@ -1976,10 +1981,6 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
         return ram_save_multifd_page(block, offset);
     }
 
-    if (control_save_page(pss, offset, &res)) {
-        return res;
-    }
-
     return ram_save_page(rs, pss);
 }
 
-- 
2.44.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v5 2/6] migration: check RDMA and capabilities are compatible on both sides
  2025-03-05  6:28 [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 1/6] migration: Prioritize RDMA in ram_save_target_page() Li Zhijian via
@ 2025-03-05  6:28 ` Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 3/6] migration: disable RDMA + postcopy-ram Li Zhijian via
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Li Zhijian via @ 2025-03-05  6:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Li Zhijian

Depending on the order of starting RDMA and setting capability,
they can be categorized into the following scenarios:
Source:
 S1: [set capabilities] -> [Start RDMA outgoing]
Destination:
 D1: [set capabilities] -> [Start RDMA incoming]
 D2: [Start RDMA incoming] -> [set capabilities]

Previously, compatibility between RDMA and capabilities was verified only
in scenario D1, potentially causing migration failures in other situations.

For scenarios S1 and D1, we can seamlessly incorporate
migration_transport_compatible() to address compatibility between
channels and capabilities vs transport.

For scenario D2, ensure compatibility within migrate_caps_check().

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---

V4:
  - Remove Reviewed-tag and cover above D2 scenario

V3:
  - collect Reviewed tag
  - reorder: 5th -> 2nd
---
 migration/migration.c | 30 ++++++++++++++++++++----------
 migration/options.c   | 21 +++++++++++++++++++++
 migration/options.h   |  1 +
 3 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index c597aa707e5..0736d3a6728 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -238,6 +238,24 @@ migration_channels_and_transport_compatible(MigrationAddress *addr,
     return true;
 }
 
+static bool
+migration_capabilities_and_transport_compatible(MigrationAddress *addr,
+                                                Error **errp)
+{
+    if (addr->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
+        return migrate_rdma_caps_check(migrate_get_current()->capabilities,
+                                       errp);
+    }
+
+    return true;
+}
+
+static bool migration_transport_compatible(MigrationAddress *addr, Error **errp)
+{
+    return migration_channels_and_transport_compatible(addr, errp) &&
+           migration_capabilities_and_transport_compatible(addr, errp);
+}
+
 static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp)
 {
     uintptr_t a = (uintptr_t) ap, b = (uintptr_t) bp;
@@ -716,7 +734,7 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
     }
 
     /* transport mechanism not suitable for migration? */
-    if (!migration_channels_and_transport_compatible(addr, errp)) {
+    if (!migration_transport_compatible(addr, errp)) {
         return;
     }
 
@@ -735,14 +753,6 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
         }
 #ifdef CONFIG_RDMA
     } else if (addr->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
-        if (migrate_xbzrle()) {
-            error_setg(errp, "RDMA and XBZRLE can't be used together");
-            return;
-        }
-        if (migrate_multifd()) {
-            error_setg(errp, "RDMA and multifd can't be used together");
-            return;
-        }
         rdma_start_incoming_migration(&addr->u.rdma, errp);
 #endif
     } else if (addr->transport == MIGRATION_ADDRESS_TYPE_EXEC) {
@@ -2159,7 +2169,7 @@ void qmp_migrate(const char *uri, bool has_channels,
     }
 
     /* transport mechanism not suitable for migration? */
-    if (!migration_channels_and_transport_compatible(addr, errp)) {
+    if (!migration_transport_compatible(addr, errp)) {
         return;
     }
 
diff --git a/migration/options.c b/migration/options.c
index bb259d192a9..c6f18df5864 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -439,6 +439,20 @@ static bool migrate_incoming_started(void)
     return !!migration_incoming_get_current()->transport_data;
 }
 
+bool migrate_rdma_caps_check(bool *caps, Error **errp)
+{
+    if (caps[MIGRATION_CAPABILITY_XBZRLE]) {
+        error_setg(errp, "RDMA and XBZRLE can't be used together");
+        return false;
+    }
+    if (caps[MIGRATION_CAPABILITY_MULTIFD]) {
+        error_setg(errp, "RDMA and multifd can't be used together");
+        return false;
+    }
+
+    return true;
+}
+
 /**
  * @migration_caps_check - check capability compatibility
  *
@@ -602,6 +616,13 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
         }
     }
 
+    /*
+     * On destination side, check the cases that capability is being set
+     * after incoming thread has started.
+    */
+    if (migrate_rdma() && !migrate_rdma_caps_check(new_caps, errp)) {
+        return false;
+    }
     return true;
 }
 
diff --git a/migration/options.h b/migration/options.h
index 762be4e641a..82d839709e7 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -57,6 +57,7 @@ bool migrate_tls(void);
 
 /* capabilities helpers */
 
+bool migrate_rdma_caps_check(bool *caps, Error **errp);
 bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp);
 
 /* parameters */
-- 
2.44.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v5 3/6] migration: disable RDMA + postcopy-ram
  2025-03-05  6:28 [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 1/6] migration: Prioritize RDMA in ram_save_target_page() Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 2/6] migration: check RDMA and capabilities are compatible on both sides Li Zhijian via
@ 2025-03-05  6:28 ` Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 4/6] migration/rdma: Remove redundant migration_in_postcopy checks Li Zhijian via
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Li Zhijian via @ 2025-03-05  6:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Li Zhijian

It's believed that RDMA + postcopy-ram has been broken for a while.
Rather than spending time re-enabling it, let's simply disable it as a
trade-off.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V3:
  - collect Reviewed tag
  - reoder: 6th -> 3th
---
 migration/options.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/migration/options.c b/migration/options.c
index c6f18df5864..527ba05d413 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -449,6 +449,10 @@ bool migrate_rdma_caps_check(bool *caps, Error **errp)
         error_setg(errp, "RDMA and multifd can't be used together");
         return false;
     }
+    if (caps[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
+        error_setg(errp, "RDMA and postcopy-ram can't be used together");
+        return false;
+    }
 
     return true;
 }
-- 
2.44.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v5 4/6] migration/rdma: Remove redundant migration_in_postcopy checks
  2025-03-05  6:28 [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Li Zhijian via
                   ` (2 preceding siblings ...)
  2025-03-05  6:28 ` [PATCH v5 3/6] migration: disable RDMA + postcopy-ram Li Zhijian via
@ 2025-03-05  6:28 ` Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 5/6] migration: Unfold control_save_page() Li Zhijian via
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Li Zhijian via @ 2025-03-05  6:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Li Zhijian

Since we have disabled RDMA + postcopy, it's safe to remove
the migration_in_postcopy() that follows the migrate_rdma().

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V3:
  reorder: 7th->4th
---
 migration/rdma.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 76fb0349238..e5b4ac599b1 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3284,7 +3284,7 @@ err:
 int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                            ram_addr_t offset, size_t size)
 {
-    if (!migrate_rdma() || migration_in_postcopy()) {
+    if (!migrate_rdma()) {
         return RAM_SAVE_CONTROL_NOT_SUPP;
     }
 
@@ -3829,7 +3829,7 @@ int rdma_block_notification_handle(QEMUFile *f, const char *name)
 
 int rdma_registration_start(QEMUFile *f, uint64_t flags)
 {
-    if (!migrate_rdma() || migration_in_postcopy()) {
+    if (!migrate_rdma()) {
         return 0;
     }
 
@@ -3861,7 +3861,7 @@ int rdma_registration_stop(QEMUFile *f, uint64_t flags)
     RDMAControlHeader head = { .len = 0, .repeat = 1 };
     int ret;
 
-    if (!migrate_rdma() || migration_in_postcopy()) {
+    if (!migrate_rdma()) {
         return 0;
     }
 
-- 
2.44.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v5 5/6] migration: Unfold control_save_page()
  2025-03-05  6:28 [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Li Zhijian via
                   ` (3 preceding siblings ...)
  2025-03-05  6:28 ` [PATCH v5 4/6] migration/rdma: Remove redundant migration_in_postcopy checks Li Zhijian via
@ 2025-03-05  6:28 ` Li Zhijian via
  2025-03-05  6:28 ` [PATCH v5 6/6] migration: Add qtest for migration over RDMA Li Zhijian via
  2025-03-07 14:15 ` [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Fabiano Rosas
  6 siblings, 0 replies; 8+ messages in thread
From: Li Zhijian via @ 2025-03-05  6:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Li Zhijian

control_save_page() is for RDMA only, unfold it to make the code more
clear.
In addition:
 - Similar to other branches style in ram_save_target_page(), involve RDMA
   only if the condition 'migrate_rdma()' is true.
 - Further simplify the code by removing the RAM_SAVE_CONTROL_NOT_SUPP.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V3:
  squash previous 2nd, 3th, 4th into one patch
---
 migration/ram.c  | 34 +++++++---------------------------
 migration/rdma.c |  7 ++-----
 migration/rdma.h |  3 +--
 3 files changed, 10 insertions(+), 34 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 424df6d9f13..c363034c882 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1143,32 +1143,6 @@ static int save_zero_page(RAMState *rs, PageSearchStatus *pss,
     return len;
 }
 
-/*
- * @pages: the number of pages written by the control path,
- *        < 0 - error
- *        > 0 - number of pages written
- *
- * Return true if the pages has been saved, otherwise false is returned.
- */
-static bool control_save_page(PageSearchStatus *pss,
-                              ram_addr_t offset, int *pages)
-{
-    int ret;
-
-    ret = rdma_control_save_page(pss->pss_channel, pss->block->offset, offset,
-                                 TARGET_PAGE_SIZE);
-    if (ret == RAM_SAVE_CONTROL_NOT_SUPP) {
-        return false;
-    }
-
-    if (ret == RAM_SAVE_CONTROL_DELAYED) {
-        *pages = 1;
-        return true;
-    }
-    *pages = ret;
-    return true;
-}
-
 /*
  * directly send the page to the stream
  *
@@ -1965,7 +1939,13 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
     int res;
 
     /* Hand over to RDMA first */
-    if (control_save_page(pss, offset, &res)) {
+    if (migrate_rdma()) {
+        res = rdma_control_save_page(pss->pss_channel, pss->block->offset,
+                                     offset, TARGET_PAGE_SIZE);
+
+        if (res == RAM_SAVE_CONTROL_DELAYED) {
+            res = 1;
+        }
         return res;
     }
 
diff --git a/migration/rdma.c b/migration/rdma.c
index e5b4ac599b1..08eb924ffaa 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3284,14 +3284,11 @@ err:
 int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                            ram_addr_t offset, size_t size)
 {
-    if (!migrate_rdma()) {
-        return RAM_SAVE_CONTROL_NOT_SUPP;
-    }
+    assert(migrate_rdma());
 
     int ret = qemu_rdma_save_page(f, block_offset, offset, size);
 
-    if (ret != RAM_SAVE_CONTROL_DELAYED &&
-        ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+    if (ret != RAM_SAVE_CONTROL_DELAYED) {
         if (ret < 0) {
             qemu_file_set_error(f, ret);
         }
diff --git a/migration/rdma.h b/migration/rdma.h
index f55f28bbed1..8eeb0117b91 100644
--- a/migration/rdma.h
+++ b/migration/rdma.h
@@ -33,7 +33,6 @@ void rdma_start_incoming_migration(InetSocketAddress *host_port, Error **errp);
 #define RAM_CONTROL_ROUND     1
 #define RAM_CONTROL_FINISH    3
 
-#define RAM_SAVE_CONTROL_NOT_SUPP -1000
 #define RAM_SAVE_CONTROL_DELAYED  -2000
 
 #ifdef CONFIG_RDMA
@@ -56,7 +55,7 @@ static inline
 int rdma_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                            ram_addr_t offset, size_t size)
 {
-    return RAM_SAVE_CONTROL_NOT_SUPP;
+    g_assert_not_reached();
 }
 #endif
 #endif
-- 
2.44.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v5 6/6] migration: Add qtest for migration over RDMA
  2025-03-05  6:28 [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Li Zhijian via
                   ` (4 preceding siblings ...)
  2025-03-05  6:28 ` [PATCH v5 5/6] migration: Unfold control_save_page() Li Zhijian via
@ 2025-03-05  6:28 ` Li Zhijian via
  2025-03-07 14:15 ` [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Fabiano Rosas
  6 siblings, 0 replies; 8+ messages in thread
From: Li Zhijian via @ 2025-03-05  6:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
	Li Zhijian

This qtest requires there is a RDMA(RoCE) link in the host.
In order to make the test work smoothly, introduce a
scripts/rdma-migration-helper.sh to
- setup a new Soft-RoCE(aka RXE) if it's root
- detect existing RoCE link

Test will be skipped if there is no available RoCE link.
 # Start of rdma tests
 # Running /x86_64/migration/precopy/rdma/plain
 Command 'rdma' is not available, please install it first.
 # To enable the test:
 # (1) Run 'scripts/rdma-migration-helper.sh setup' with root and rerun the test
 # or
 # (2) Run the test with root privilege
 #
 ok 1 /x86_64/migration/precopy/rdma/plain # SKIP No rdma link available
 # End of rdma tests

Note: Remove the newly added RXE link by executing 'modprobe -r rdma_rxe'
or by specifying 'clean' within this script.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V5:
  check command 'rdma' in script
  Add clean option for the script
  report details only if QTEST_LOG=1
---
 MAINTAINERS                           |  1 +
 scripts/rdma-migration-helper.sh      | 48 +++++++++++++++++++
 tests/qtest/migration/precopy-tests.c | 67 +++++++++++++++++++++++++++
 3 files changed, 116 insertions(+)
 create mode 100755 scripts/rdma-migration-helper.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index 3848d37a38d..15360fcdc4b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3480,6 +3480,7 @@ R: Li Zhijian <lizhijian@fujitsu.com>
 R: Peter Xu <peterx@redhat.com>
 S: Odd Fixes
 F: migration/rdma*
+F: scripts/rdma-migration-helper.sh
 
 Migration dirty limit and dirty page rate
 M: Hyman Huang <yong.huang@smartx.com>
diff --git a/scripts/rdma-migration-helper.sh b/scripts/rdma-migration-helper.sh
new file mode 100755
index 00000000000..08e29a52eb8
--- /dev/null
+++ b/scripts/rdma-migration-helper.sh
@@ -0,0 +1,48 @@
+#!/bin/bash
+
+# Copied from blktests
+get_ipv4_addr()
+{
+    ip -4 -o addr show dev "$1" |
+        sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p' |
+        tr -d '\n'
+}
+
+has_soft_rdma()
+{
+    rdma link | grep -q " netdev $1[[:blank:]]*\$"
+}
+
+rdma_rxe_setup_detect()
+{
+    (
+        cd /sys/class/net &&
+            for i in *; do
+                [ -e "$i" ] || continue
+                [ "$i" = "lo" ] && continue
+                [ "$(<"$i/addr_len")" = 6 ] || continue
+                [ "$(<"$i/carrier")" = 1 ] || continue
+
+                has_soft_rdma "$i" && break
+                [ "$operation" = "setup" ] &&
+                    rdma link add "${i}_rxe" type rxe netdev "$i" && break
+            done
+        has_soft_rdma "$i" || return
+        get_ipv4_addr "$i"
+    )
+}
+
+operation=${1:-setup}
+
+command -v rdma >/dev/null || {
+    echo "Command 'rdma' is not available, please install it first." >&2
+    exit 1
+}
+
+if [ "$operation" == "setup" ] || [ "$operation" == "detect" ]; then
+    rdma_rxe_setup_detect
+elif [ "$operation" == "clean" ]; then
+    modprobe -r rdma_rxe
+else
+    echo "Usage: $0 [setup | detect | clean]"
+fi
diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
index ba273d10b9a..9a4638481b4 100644
--- a/tests/qtest/migration/precopy-tests.c
+++ b/tests/qtest/migration/precopy-tests.c
@@ -99,6 +99,69 @@ static void test_precopy_unix_dirty_ring(void)
     test_precopy_common(&args);
 }
 
+#ifdef CONFIG_RDMA
+
+#define RDMA_MIGRATION_HELPER "scripts/rdma-migration-helper.sh"
+static int new_rdma_link(char *buffer, bool verbose)
+{
+    const char *argument = (geteuid() == 0) ? "setup" : "detect";
+    char cmd[1024];
+
+    snprintf(cmd, sizeof(cmd), "%s %s %s", RDMA_MIGRATION_HELPER, argument,
+             verbose ? "" : "2>/dev/null");
+
+    FILE *pipe = popen(cmd, "r");
+    if (pipe == NULL) {
+        perror("Failed to run script");
+        return -1;
+    }
+
+    int idx = 0;
+    while (fgets(buffer + idx, 128 - idx, pipe) != NULL) {
+        idx += strlen(buffer);
+    }
+
+    int status = pclose(pipe);
+    if (status == -1) {
+        perror("Error reported by pclose()");
+        return -1;
+    } else if (WIFEXITED(status)) {
+        return WEXITSTATUS(status);
+    }
+
+    return -1;
+}
+
+static void test_precopy_rdma_plain(void)
+{
+    char buffer[128] = {};
+    bool verbose = g_getenv("QTEST_LOG");
+
+    if (new_rdma_link(buffer, verbose)) {
+        g_test_skip("No rdma link available");
+        if (verbose) {
+            g_test_message("To enable the test:\n"
+                           "(1) Run \'" RDMA_MIGRATION_HELPER " setup\' with root and rerun the test\n"
+                           "or\n(2) Run the test with root privilege\n");
+        }
+        return;
+    }
+
+    /*
+     * TODO: query a free port instead of hard code.
+     * 29200=('R'+'D'+'M'+'A')*100
+     **/
+    g_autofree char *uri = g_strdup_printf("rdma:%s:29200", buffer);
+
+    MigrateCommon args = {
+        .listen_uri = uri,
+        .connect_uri = uri,
+    };
+
+    test_precopy_common(&args);
+}
+#endif
+
 static void test_precopy_tcp_plain(void)
 {
     MigrateCommon args = {
@@ -1124,6 +1187,10 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env)
                        test_multifd_tcp_uri_none);
     migration_test_add("/migration/multifd/tcp/plain/cancel",
                        test_multifd_tcp_cancel);
+#ifdef CONFIG_RDMA
+    migration_test_add("/migration/precopy/rdma/plain",
+                       test_precopy_rdma_plain);
+#endif
 }
 
 void migration_test_add_precopy(MigrationTestEnv *env)
-- 
2.44.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup
  2025-03-05  6:28 [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Li Zhijian via
                   ` (5 preceding siblings ...)
  2025-03-05  6:28 ` [PATCH v5 6/6] migration: Add qtest for migration over RDMA Li Zhijian via
@ 2025-03-07 14:15 ` Fabiano Rosas
  6 siblings, 0 replies; 8+ messages in thread
From: Fabiano Rosas @ 2025-03-07 14:15 UTC (permalink / raw)
  To: Li Zhijian via, qemu-devel
  Cc: Peter Xu, Laurent Vivier, Paolo Bonzini, Li Zhijian

Li Zhijian via <qemu-devel@nongnu.org> writes:

> - It fix the RDMA migration broken issue
> - disable RDMA + postcopy
> - some cleanups
> - Add a qtest for RDMA at last
>
> Changes since V4:
>  - collect Reviewed-tags
>  - Address comments in patch "migration: Add qtest for migration over RDMA" from Fabiano Rosas <farosas@suse.de>
>
> Changes since V3:
> - check RDMA and capabilities are compatible on both sides # renamed from
>   previous V3's "migration: Add migration_capabilities_and_transport_compatible()"
>
> Changes since V2:
> - squash previous 2/3/4 to '[PATCH v3 5/6] migration: Unfold  control_save_page()'
> - reorder the patch layout to prevent recently added code from being deleted again.
> - collect Reviewed tags from Peter
>
> Changes since V1[0]:
> Add some saparate patches to refactor and cleanup based on V1
>
> [0] https://lore.kernel.org/qemu-devel/20250218074345.638203-1-lizhijian@fujitsu.com/
>
> Li Zhijian (6):
>   migration: Prioritize RDMA in ram_save_target_page()
>   migration: check RDMA and capabilities are compatible on both sides
>   migration: disable RDMA + postcopy-ram
>   migration/rdma: Remove redundant migration_in_postcopy checks
>   migration: Unfold control_save_page()
>   migration: Add qtest for migration over RDMA
>
>  MAINTAINERS                           |  1 +
>  migration/migration.c                 | 30 ++++++++----
>  migration/options.c                   | 25 ++++++++++
>  migration/options.h                   |  1 +
>  migration/ram.c                       | 41 +++++-----------
>  migration/rdma.c                      | 11 ++---
>  migration/rdma.h                      |  3 +-
>  scripts/rdma-migration-helper.sh      | 48 +++++++++++++++++++
>  tests/qtest/migration/precopy-tests.c | 67 +++++++++++++++++++++++++++
>  9 files changed, 178 insertions(+), 49 deletions(-)
>  create mode 100755 scripts/rdma-migration-helper.sh

Queued, thanks!


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-03-07 14:16 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-05  6:28 [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Li Zhijian via
2025-03-05  6:28 ` [PATCH v5 1/6] migration: Prioritize RDMA in ram_save_target_page() Li Zhijian via
2025-03-05  6:28 ` [PATCH v5 2/6] migration: check RDMA and capabilities are compatible on both sides Li Zhijian via
2025-03-05  6:28 ` [PATCH v5 3/6] migration: disable RDMA + postcopy-ram Li Zhijian via
2025-03-05  6:28 ` [PATCH v5 4/6] migration/rdma: Remove redundant migration_in_postcopy checks Li Zhijian via
2025-03-05  6:28 ` [PATCH v5 5/6] migration: Unfold control_save_page() Li Zhijian via
2025-03-05  6:28 ` [PATCH v5 6/6] migration: Add qtest for migration over RDMA Li Zhijian via
2025-03-07 14:15 ` [PATCH v5 0/6] migration/rdma: fixes, refactor and cleanup Fabiano Rosas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.