qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/8] Live update: tap and vhost
@ 2025-12-03 18:43 Ben Chaney
  0 siblings, 0 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:43 UTC (permalink / raw)
  To: qemu-devel

Changes since v2
- I have taken over this patch set since Steve retired
- Added comments to explain the order of events
- Remove redundant reversion to cleanup git history
- Inclusion of virtio and stub fixes

Tap and vhost devices can be preserved during cpr-transfer using
traditional live migration methods, wherein the management layer
creates new interfaces for the target and fiddles with 'ip link'
to deactivate the old interface and activate the new.

However, CPR can simply send the file descriptors to new QEMU,
with no special management actions required.  The user enables
this behavior by specifing '-netdev tap,cpr=on'.  The default
is cpr=off.

Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
Steve Sistare (8):
      migration: stop vm earlier for cpr
      migration: cpr setup notifier
      vhost: reset vhost devices for cpr
      cpr: delete all fds
      tap: common return label
      tap: cpr support
      tap: postload fix for cpr
      tap: cpr fixes

 hw/net/virtio-net.c               |  26 +++++++
 hw/vfio/device.c                  |   2 +-
 hw/virtio/vhost-backend.c         |   6 ++
 hw/virtio/vhost.c                 |  32 +++++++++
 include/hw/virtio/vhost-backend.h |   1 +
 include/hw/virtio/vhost.h         |   1 +
 include/migration/cpr.h           |   3 +-
 include/net/tap.h                 |   1 +
 io/channel-socket.c               |   4 +-
 migration/cpr.c                   |  24 +++++--
 migration/migration.c             |  69 ++++++++++++++----
 net/tap-win32.c                   |   5 ++
 net/tap.c                         | 147 +++++++++++++++++++++++++++++---------
 qapi/net.json                     |   5 +-
 stubs/cpr.c                       |   8 +++
 stubs/meson.build                 |   1 +
 16 files changed, 279 insertions(+), 56 deletions(-)
---
base-commit: 9febfa94b69b7146582c48a868bd2330ac45037f
change-id: 20251203-cpr-tap-04fd811ace03

Best regards,
-- 
Ben Chaney <bchaney@akamai.com>



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 0/8] Live update: tap and vhost
@ 2025-12-03 18:51 Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 1/8] migration: stop vm earlier for cpr Ben Chaney
                   ` (9 more replies)
  0 siblings, 10 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare, Vladimir Sementsov-Ogievskiy

Changes since v2
- I have taken over this patch set since Steve retired
- Added comments to explain the order of events
- Remove redundant reversion to cleanup git history
- Inclusion of virtio and stub fixes

Tap and vhost devices can be preserved during cpr-transfer using
traditional live migration methods, wherein the management layer
creates new interfaces for the target and fiddles with 'ip link'
to deactivate the old interface and activate the new.

However, CPR can simply send the file descriptors to new QEMU,
with no special management actions required.  The user enables
this behavior by specifing '-netdev tap,cpr=on'.  The default
is cpr=off.

Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
Steve Sistare (8):
      migration: stop vm earlier for cpr
      migration: cpr setup notifier
      vhost: reset vhost devices for cpr
      cpr: delete all fds
      tap: common return label
      tap: cpr support
      tap: postload fix for cpr
      tap: cpr fixes

 hw/net/virtio-net.c               |  26 +++++++
 hw/vfio/device.c                  |   2 +-
 hw/virtio/vhost-backend.c         |   6 ++
 hw/virtio/vhost.c                 |  32 +++++++++
 include/hw/virtio/vhost-backend.h |   1 +
 include/hw/virtio/vhost.h         |   1 +
 include/migration/cpr.h           |   3 +-
 include/net/tap.h                 |   1 +
 io/channel-socket.c               |   4 +-
 migration/cpr.c                   |  24 +++++--
 migration/migration.c             |  69 ++++++++++++++----
 net/tap-win32.c                   |   5 ++
 net/tap.c                         | 147 +++++++++++++++++++++++++++++---------
 qapi/net.json                     |   5 +-
 stubs/cpr.c                       |   8 +++
 stubs/meson.build                 |   1 +
 16 files changed, 279 insertions(+), 56 deletions(-)
---
base-commit: 9febfa94b69b7146582c48a868bd2330ac45037f
change-id: 20251203-cpr-tap-04fd811ace03

Best regards,
-- 
Ben Chaney <bchaney@akamai.com>



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 1/8] migration: stop vm earlier for cpr
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
@ 2025-12-03 18:51 ` Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 2/8] migration: cpr setup notifier Ben Chaney
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare

From: Steve Sistare <steven.sistare@oracle.com>

Stop the vm earlier for cpr, before cpr_save_state which causes new QEMU
to proceed and initialize devices.  We must guarantee devices are stopped
in old QEMU, and all source notifiers called, before they are initialized
in new QEMU.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
 migration/migration.c | 57 +++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 48 insertions(+), 9 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index c2daab6bdd..6d40697767 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1657,6 +1657,7 @@ void migration_cancel(void)
                           MIGRATION_STATUS_CANCELLED);
         cpr_state_close();
         migrate_hup_delete(s);
+        vm_resume(s->vm_old_state);
     }
 }
 
@@ -2216,6 +2217,7 @@ void qmp_migrate(const char *uri, bool has_channels,
     MigrationAddress *addr = NULL;
     MigrationChannel *channelv[MIGRATION_CHANNEL_TYPE__MAX] = { NULL };
     MigrationChannel *cpr_channel = NULL;
+    bool stopped = false;
 
     /*
      * Having preliminary checks for uri and channel
@@ -2268,6 +2270,46 @@ void qmp_migrate(const char *uri, bool has_channels,
         return;
     }
 
+    /*
+     * CPR-transfer  ordering:
+     *
+     *   SOURCE                              TARGET
+     *   ------                              ------
+     *                                       cpr_state_load() blocks
+     *   |                                        |
+     *   |  1. migration_stop_vm()                |
+     *   |     VM stopped, devices quiesced       |
+     *   |                                        | Waiting for
+     *   |  2. notifiers (PRECOPY_SETUP)          | FDs from source
+     *   |     vhost_reset_owner() releases       |
+     *   |     device ownership                   |
+     *   |                                        |
+     *   |  3. cpr_state_save() ---- FDs -------> |
+     *   |                                        |
+     *   v                                        v
+     *   postmigrate                         Device init begins
+     *                                       - cpr_find_fd()
+     *                                       - vhost_dev_init()
+     *                                       - VHOST_SET_OWNER
+     *
+     * Step 3 is the synchronization/cut-over point. Target proceeds immediately
+     * upon receiving FDs, so steps 1-2 must complete otherwise:
+     * - Target's VHOST_SET_OWNER fails with -EBUSY (source still owns)
+     * - Race between source I/O and target device init
+     *
+     *  We stop the VM early (before FD transfer) to prevent this race.
+     *  Unlike regular migration, CPR-transfer passes memory via FD (memfd)
+     *  rather than copying RAM, so early VM stop should have minimal downtime.
+     */
+    if (migrate_mode_is_cpr(s)) {
+        int ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
+        if (ret < 0) {
+            error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
+            goto out;
+        }
+        stopped = true;
+    }
+
     if (!cpr_state_save(cpr_channel, &local_err)) {
         goto out;
     }
@@ -2294,6 +2336,9 @@ out:
     if (local_err) {
         migration_connect_set_error(s, local_err);
         error_propagate(errp, local_err);
+        if (stopped) {
+            vm_resume(s->vm_old_state);
+        }
     }
 }
 
@@ -2339,6 +2384,9 @@ static void qmp_migrate_finish(MigrationAddress *addr, bool resume_requested,
         }
         migration_connect_set_error(s, local_err);
         error_propagate(errp, local_err);
+        if (migrate_mode_is_cpr(s)) {
+            vm_resume(s->vm_old_state);
+        }
         return;
     }
 }
@@ -4028,7 +4076,6 @@ void migration_connect(MigrationState *s, Error *error_in)
     Error *local_err = NULL;
     uint64_t rate_limit;
     bool resume = (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP);
-    int ret;
 
     /*
      * If there's a previous error, free it and prepare for another one.
@@ -4099,14 +4146,6 @@ void migration_connect(MigrationState *s, Error *error_in)
         return;
     }
 
-    if (migrate_mode_is_cpr(s)) {
-        ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
-        if (ret < 0) {
-            error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
-            goto fail;
-        }
-    }
-
     /*
      * Take a refcount to make sure the migration object won't get freed by
      * the main thread already in migration_shutdown().

-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 2/8] migration: cpr setup notifier
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 1/8] migration: stop vm earlier for cpr Ben Chaney
@ 2025-12-03 18:51 ` Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 3/8] vhost: reset vhost devices for cpr Ben Chaney
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare

From: Steve Sistare <steven.sistare@oracle.com>

Call MIG_EVENT_PRECOPY_SETUP earlier, before CPR starts.  An early notifier
is needed for resetting vhost devices, as explained in the next patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
 migration/migration.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 6d40697767..60850c232e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2310,7 +2310,14 @@ void qmp_migrate(const char *uri, bool has_channels,
         stopped = true;
     }
 
+    /* Notify before starting migration thread, and before starting cpr */
+    if (!resume_requested &&
+        migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP, &local_err)) {
+        goto out;
+    }
+
     if (!cpr_state_save(cpr_channel, &local_err)) {
+        migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
         goto out;
     }
 
@@ -4108,11 +4115,6 @@ void migration_connect(MigrationState *s, Error *error_in)
     } else {
         /* This is a fresh new migration */
         rate_limit = migrate_max_bandwidth();
-
-        /* Notify before starting migration thread */
-        if (migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP, &local_err)) {
-            goto fail;
-        }
     }
 
     migration_rate_set(rate_limit);

-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 3/8] vhost: reset vhost devices for cpr
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 1/8] migration: stop vm earlier for cpr Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 2/8] migration: cpr setup notifier Ben Chaney
@ 2025-12-03 18:51 ` Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 4/8] cpr: delete all fds Ben Chaney
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare

From: Steve Sistare <steven.sistare@oracle.com>

When preserving a vhost fd using CPR, call VHOST_RESET_OWNER prior to CPR
in old QEMU.  Otherwise, new QEMU will fail when it calls VHOST_SET_OWNER
during vhost_dev_init.

Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
 hw/virtio/vhost-backend.c         |  6 ++++++
 hw/virtio/vhost.c                 | 32 ++++++++++++++++++++++++++++++++
 include/hw/virtio/vhost-backend.h |  1 +
 include/hw/virtio/vhost.h         |  1 +
 4 files changed, 40 insertions(+)

diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 4367db0d95..1447d12963 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -261,6 +261,11 @@ static int vhost_kernel_set_owner(struct vhost_dev *dev)
     return vhost_kernel_call(dev, VHOST_SET_OWNER, NULL);
 }
 
+static int vhost_kernel_reset_owner(struct vhost_dev *dev)
+{
+    return vhost_kernel_call(dev, VHOST_RESET_OWNER, NULL);
+}
+
 static int vhost_kernel_get_vq_index(struct vhost_dev *dev, int idx)
 {
     assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
@@ -385,6 +390,7 @@ const VhostOps kernel_ops = {
         .vhost_get_features_ex = vhost_kernel_get_features,
         .vhost_set_backend_cap = vhost_kernel_set_backend_cap,
         .vhost_set_owner = vhost_kernel_set_owner,
+        .vhost_reset_owner = vhost_kernel_reset_owner,
         .vhost_get_vq_index = vhost_kernel_get_vq_index,
         .vhost_vsock_set_guest_cid = vhost_kernel_vsock_set_guest_cid,
         .vhost_vsock_set_running = vhost_kernel_vsock_set_running,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 266a11514a..011d73dae2 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -24,6 +24,7 @@
 #include "standard-headers/linux/vhost_types.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/mem/memory-device.h"
+#include "migration/misc.h"
 #include "migration/blocker.h"
 #include "migration/qemu-file-types.h"
 #include "system/dma.h"
@@ -1540,6 +1541,32 @@ static int vhost_dev_get_features(struct vhost_dev *hdev,
     return r;
 }
 
+static int vhost_cpr_notifier(NotifierWithReturn *notifier,
+                              MigrationEvent *e, Error **errp)
+{
+    struct vhost_dev *dev;
+    int r;
+
+    dev = container_of(notifier, struct vhost_dev, cpr_transfer_notifier);
+
+    if (dev->vhost_ops->backend_type != VHOST_BACKEND_TYPE_KERNEL) {
+        return 0;
+    }
+
+    if (e->type == MIG_EVENT_PRECOPY_SETUP) {
+        r = dev->vhost_ops->vhost_reset_owner(dev);
+        if (r < 0) {
+            VHOST_OPS_DEBUG(r, "vhost_reset_owner failed");
+        }
+    } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+        r = dev->vhost_ops->vhost_set_owner(dev);
+        if (r < 0) {
+            VHOST_OPS_DEBUG(r, "vhost_set_owner failed");
+        }
+    }
+    return 0;
+}
+
 int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
                    VhostBackendType backend_type, uint32_t busyloop_timeout,
                    Error **errp)
@@ -1550,6 +1577,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
 
     hdev->vdev = NULL;
     hdev->migration_blocker = NULL;
+    hdev->cpr_transfer_notifier.notify = NULL;
 
     r = vhost_set_backend_type(hdev, backend_type);
     assert(r >= 0);
@@ -1650,6 +1678,9 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     hdev->log_enabled = false;
     hdev->started = false;
     memory_listener_register(&hdev->memory_listener, &address_space_memory);
+    migration_add_notifier_mode(&hdev->cpr_transfer_notifier,
+                                vhost_cpr_notifier,
+                                MIG_MODE_CPR_TRANSFER);
     QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
 
     /*
@@ -1702,6 +1733,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
         QLIST_REMOVE(hdev, entry);
     }
     migrate_del_blocker(&hdev->migration_blocker);
+    migration_remove_notifier(&hdev->cpr_transfer_notifier);
     g_free(hdev->mem);
     g_free(hdev->mem_sections);
     if (hdev->vhost_ops) {
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index ff94fa1734..18ce5ea9a0 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -196,6 +196,7 @@ typedef struct VhostOps {
     vhost_get_features_op vhost_get_features;
     vhost_set_backend_cap_op vhost_set_backend_cap;
     vhost_set_owner_op vhost_set_owner;
+    vhost_set_owner_op vhost_reset_owner;
     vhost_reset_device_op vhost_reset_device;
     vhost_get_vq_index_op vhost_get_vq_index;
     vhost_set_vring_enable_op vhost_set_vring_enable;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 08bbb4dfe9..5d11a97e43 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -133,6 +133,7 @@ struct vhost_dev {
     QLIST_ENTRY(vhost_dev) logdev_entry;
     QLIST_HEAD(, vhost_iommu) iommu_list;
     IOMMUNotifier n;
+    NotifierWithReturn cpr_transfer_notifier;
     const VhostDevConfigOps *config_ops;
 };
 

-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 4/8] cpr: delete all fds
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
                   ` (2 preceding siblings ...)
  2025-12-03 18:51 ` [PATCH v3 3/8] vhost: reset vhost devices for cpr Ben Chaney
@ 2025-12-03 18:51 ` Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 5/8] tap: common return label Ben Chaney
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare

From: Steve Sistare <steven.sistare@oracle.com>

Add the cpr_delete_fd_all function to delete all fds associated with a
device.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
 include/migration/cpr.h |  1 +
 migration/cpr.c         | 13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/migration/cpr.h b/include/migration/cpr.h
index 027cb98073..d585fadc5b 100644
--- a/include/migration/cpr.h
+++ b/include/migration/cpr.h
@@ -29,6 +29,7 @@ extern CprState cpr_state;
 
 void cpr_save_fd(const char *name, int id, int fd);
 void cpr_delete_fd(const char *name, int id);
+void cpr_delete_fd_all(const char *name);
 int cpr_find_fd(const char *name, int id);
 void cpr_resave_fd(const char *name, int id, int fd);
 int cpr_open_fd(const char *path, int flags, const char *name, int id,
diff --git a/migration/cpr.c b/migration/cpr.c
index adee2a919a..c0bf93a7ba 100644
--- a/migration/cpr.c
+++ b/migration/cpr.c
@@ -85,6 +85,19 @@ void cpr_delete_fd(const char *name, int id)
     trace_cpr_delete_fd(name, id);
 }
 
+void cpr_delete_fd_all(const char *name)
+{
+    CprFd *elem, *next_elem;
+
+    QLIST_FOREACH_SAFE(elem, &cpr_state.fds, next, next_elem) {
+        if (!strcmp(elem->name, name)) {
+            QLIST_REMOVE(elem, next);
+            g_free(elem->name);
+            g_free(elem);
+        }
+    }
+}
+
 int cpr_find_fd(const char *name, int id)
 {
     CprFd *elem = find_fd(&cpr_state.fds, name, id);

-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 5/8] tap: common return label
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
                   ` (3 preceding siblings ...)
  2025-12-03 18:51 ` [PATCH v3 4/8] cpr: delete all fds Ben Chaney
@ 2025-12-03 18:51 ` Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 6/8] tap: cpr support Ben Chaney
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare

From: Steve Sistare <steven.sistare@oracle.com>

Modify net_init_tap so every return branches to a common label, for
common cleanup in a subsequent patch.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
 net/tap.c | 55 +++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 18 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index abe3b2d036..9d480574c3 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -830,7 +830,8 @@ int net_init_tap(const Netdev *netdev, const char *name,
      * For -netdev, peer is always NULL. */
     if (peer && (tap->has_queues || tap->fds || tap->vhostfds)) {
         error_setg(errp, "Multiqueue tap cannot be used with hubs");
-        return -1;
+        ret = -1;
+        goto out;
     }
 
     if (tap->fd) {
@@ -840,23 +841,27 @@ int net_init_tap(const Netdev *netdev, const char *name,
             error_setg(errp, "ifname=, script=, downscript=, vnet_hdr=, "
                        "helper=, queues=, fds=, and vhostfds= "
                        "are invalid with fd=");
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         fd = monitor_fd_param(monitor_cur(), tap->fd, errp);
         if (fd == -1) {
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         if (!qemu_set_blocking(fd, false, errp)) {
             close(fd);
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         vnet_hdr = tap_probe_vnet_hdr(fd, errp);
         if (vnet_hdr < 0) {
             close(fd);
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         net_init_tap_one(tap, peer, "tap", name, NULL,
@@ -865,7 +870,8 @@ int net_init_tap(const Netdev *netdev, const char *name,
         if (err) {
             error_propagate(errp, err);
             close(fd);
-            return -1;
+            ret = -1;
+            goto out;
         }
     } else if (tap->fds) {
         char **fds;
@@ -878,7 +884,8 @@ int net_init_tap(const Netdev *netdev, const char *name,
             error_setg(errp, "ifname=, script=, downscript=, vnet_hdr=, "
                        "helper=, queues=, and vhostfd= "
                        "are invalid with fds=");
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         fds = g_new0(char *, MAX_TAP_QUEUES);
@@ -940,29 +947,35 @@ free_fail:
         }
         g_free(fds);
         g_free(vhost_fds);
-        return ret;
+        goto out;
+
     } else if (tap->helper) {
         if (tap->ifname || tap->script || tap->downscript ||
             tap->has_vnet_hdr || tap->has_queues || tap->vhostfds) {
             error_setg(errp, "ifname=, script=, downscript=, vnet_hdr=, "
                        "queues=, and vhostfds= are invalid with helper=");
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         fd = net_bridge_run_helper(tap->helper,
                                    tap->br ?: DEFAULT_BRIDGE_INTERFACE,
                                    errp);
         if (fd == -1) {
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         if (!qemu_set_blocking(fd, false, errp)) {
-            return -1;
+            close(fd);
+            ret = -1;
+            goto out;
         }
         vnet_hdr = tap_probe_vnet_hdr(fd, errp);
         if (vnet_hdr < 0) {
             close(fd);
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         net_init_tap_one(tap, peer, "bridge", name, ifname,
@@ -971,14 +984,16 @@ free_fail:
         if (err) {
             error_propagate(errp, err);
             close(fd);
-            return -1;
+            ret = -1;
+            goto out;
         }
     } else {
         g_autofree char *default_script = NULL;
         g_autofree char *default_downscript = NULL;
         if (tap->vhostfds) {
             error_setg(errp, "vhostfds= is invalid if fds= wasn't specified");
-            return -1;
+            ret = -1;
+            goto out;
         }
 
         if (!script) {
@@ -999,14 +1014,16 @@ free_fail:
             fd = net_tap_init(tap, &vnet_hdr, i >= 1 ? "no" : script,
                               ifname, sizeof ifname, queues > 1, errp);
             if (fd == -1) {
-                return -1;
+                ret = -1;
+                goto out;
             }
 
             if (queues > 1 && i == 0 && !tap->ifname) {
                 if (tap_fd_get_ifname(fd, ifname)) {
                     error_setg(errp, "Fail to get ifname");
                     close(fd);
-                    return -1;
+                    ret = -1;
+                    goto out;
                 }
             }
 
@@ -1017,12 +1034,14 @@ free_fail:
             if (err) {
                 error_propagate(errp, err);
                 close(fd);
-                return -1;
+                ret = -1;
+                goto out;
             }
         }
     }
 
-    return 0;
+out:
+    return ret;
 }
 
 int tap_enable(NetClientState *nc)

-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 6/8] tap: cpr support
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
                   ` (4 preceding siblings ...)
  2025-12-03 18:51 ` [PATCH v3 5/8] tap: common return label Ben Chaney
@ 2025-12-03 18:51 ` Ben Chaney
  2025-12-04  8:09   ` Markus Armbruster
                     ` (2 more replies)
  2025-12-03 18:51 ` [PATCH v3 7/8] tap: postload fix for cpr Ben Chaney
                   ` (3 subsequent siblings)
  9 siblings, 3 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare

From: Steve Sistare <steven.sistare@oracle.com>

Provide the cpr=on option to preserve TAP and vhost descriptors during
cpr-transfer, so the management layer does not need to create a new
device for the target.

Save all tap fd's in canonical order, leveraging the index argument of
cpr_save_fd.  For the i'th queue, the tap device fd is saved at index 2*i,
and the vhostfd (if any) at index 2*i+1.

tap and vhost fd's are passed by name to the monitor when a NIC is hot
plugged, but the name is not known to qemu after cpr.  Allow the manager
to pass -1 for the fd "name" in the new qemu args to indicate that QEMU
should search for a saved value.  Example:

  -netdev tap,id=hostnet2,fds=-1:-1,vhostfds=-1:-1,cpr=on

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
 hw/vfio/device.c        |  2 +-
 include/migration/cpr.h |  2 +-
 migration/cpr.c         | 11 ++++----
 net/tap.c               | 73 +++++++++++++++++++++++++++++++++++++++----------
 qapi/net.json           |  5 +++-
 5 files changed, 70 insertions(+), 23 deletions(-)

diff --git a/hw/vfio/device.c b/hw/vfio/device.c
index 76869828fc..73e622f7b5 100644
--- a/hw/vfio/device.c
+++ b/hw/vfio/device.c
@@ -362,7 +362,7 @@ void vfio_device_free_name(VFIODevice *vbasedev)
 
 void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp)
 {
-    vbasedev->fd = cpr_get_fd_param(vbasedev->dev->id, str, 0, errp);
+    vbasedev->fd = cpr_get_fd_param(vbasedev->dev->id, str, 0, true, errp);
 }
 
 static VFIODeviceIOOps vfio_device_io_ops_ioctl;
diff --git a/include/migration/cpr.h b/include/migration/cpr.h
index d585fadc5b..68424b4b03 100644
--- a/include/migration/cpr.h
+++ b/include/migration/cpr.h
@@ -48,7 +48,7 @@ void cpr_state_close(void);
 struct QIOChannel *cpr_state_ioc(void);
 
 bool cpr_incoming_needed(void *opaque);
-int cpr_get_fd_param(const char *name, const char *fdname, int index,
+int cpr_get_fd_param(const char *name, const char *fdname, int index, bool cpr,
                      Error **errp);
 
 QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp);
diff --git a/migration/cpr.c b/migration/cpr.c
index c0bf93a7ba..19bd56339d 100644
--- a/migration/cpr.c
+++ b/migration/cpr.c
@@ -316,6 +316,7 @@ bool cpr_incoming_needed(void *opaque)
  * @name: CPR name for the descriptor
  * @fdname: An integer-valued string, or a name passed to a getfd command
  * @index: CPR index of the descriptor
+ * @cpr: use cpr
  * @errp: returned error message
  *
  * If CPR is not being performed, then use @fdname to find the fd.
@@ -325,22 +326,22 @@ bool cpr_incoming_needed(void *opaque)
  * On success returns the fd value, else returns -1.
  */
 int cpr_get_fd_param(const char *name, const char *fdname, int index,
-                     Error **errp)
+                     bool cpr, Error **errp)
 {
     ERRP_GUARD();
     int fd;
 
-    if (cpr_is_incoming()) {
+    if (cpr && cpr_is_incoming()) {
         fd = cpr_find_fd(name, index);
         if (fd < 0) {
             error_setg(errp, "cannot find saved value for fd %s", fdname);
         }
     } else {
         fd = monitor_fd_param(monitor_cur(), fdname, errp);
-        if (fd >= 0) {
-            cpr_save_fd(name, index, fd);
-        } else {
+        if (fd < 0) {
             error_prepend(errp, "Could not parse object fd %s:", fdname);
+        } else if (cpr) {
+            cpr_save_fd(name, index, fd);
         }
     }
     return fd;
diff --git a/net/tap.c b/net/tap.c
index 9d480574c3..79e29addd1 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -35,6 +35,7 @@
 #include "net/eth.h"
 #include "net/net.h"
 #include "clients.h"
+#include "migration/cpr.h"
 #include "monitor/monitor.h"
 #include "system/system.h"
 #include "qapi/error.h"
@@ -80,6 +81,7 @@ typedef struct TAPState {
     bool has_uso;
     bool has_tunnel;
     bool enabled;
+    bool cpr;
     VHostNetState *vhost_net;
     unsigned host_vnet_hdr_len;
     Notifier exit;
@@ -323,6 +325,9 @@ static void tap_cleanup(NetClientState *nc)
 {
     TAPState *s = DO_UPCAST(TAPState, nc, nc);
 
+    if (s->cpr) {
+        cpr_delete_fd_all(nc->name);
+    }
     if (s->vhost_net) {
         vhost_net_cleanup(s->vhost_net);
         g_free(s->vhost_net);
@@ -690,18 +695,24 @@ static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
     return fd;
 }
 
+/* CPR fd's for each queue are saved at these indices */
+#define TAP_FD_INDEX(queue)         (2 * (queue) + 0)
+#define TAP_VHOSTFD_INDEX(queue)    (2 * (queue) + 1)
+
 #define MAX_TAP_QUEUES 1024
 
 static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
                              const char *model, const char *name,
                              const char *ifname, const char *script,
                              const char *downscript, const char *vhostfdname,
-                             int vnet_hdr, int fd, Error **errp)
+                             int vnet_hdr, int fd, int index, Error **errp)
 {
     Error *err = NULL;
     TAPState *s = net_tap_fd_init(peer, model, name, fd, vnet_hdr);
+    bool cpr = tap->has_cpr ? tap->cpr : false;
     int vhostfd;
 
+    s->cpr = cpr;
     tap_set_sndbuf(s->fd, tap, &err);
     if (err) {
         error_propagate(errp, err);
@@ -736,7 +747,7 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
         }
 
         if (vhostfdname) {
-            vhostfd = monitor_fd_param(monitor_cur(), vhostfdname, &err);
+            vhostfd = cpr_get_fd_param(name, vhostfdname, index, cpr, &err);
             if (vhostfd == -1) {
                 error_propagate(errp, err);
                 goto failed;
@@ -745,13 +756,22 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
                 goto failed;
             }
         } else {
-            vhostfd = open("/dev/vhost-net", O_RDWR);
+            vhostfd = cpr ? cpr_find_fd(name, index) : -1;
+            if (vhostfd < 0) {
+                vhostfd = open("/dev/vhost-net", O_RDWR);
+                if (cpr && vhostfd >= 0) {
+                    cpr_save_fd(name, index, vhostfd);
+                }
+            }
             if (vhostfd < 0) {
                 error_setg_errno(errp, errno,
                                  "tap: open vhost char device failed");
                 goto failed;
             }
             if (!qemu_set_blocking(vhostfd, false, errp)) {
+                if (!cpr) {
+                    close(vhostfd);
+                }
                 goto failed;
             }
         }
@@ -777,6 +797,9 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
     return;
 
 failed:
+    if (cpr) {
+        cpr_delete_fd_all(name);
+    }
     qemu_del_net_client(&s->nc);
 }
 
@@ -809,7 +832,8 @@ static int get_fds(char *str, char *fds[], int max)
 int net_init_tap(const Netdev *netdev, const char *name,
                  NetClientState *peer, Error **errp)
 {
-    const NetdevTapOptions *tap;
+    const NetdevTapOptions *tap = &netdev->u.tap;
+    bool cpr = tap->has_cpr ? tap->cpr : false;
     int fd, vnet_hdr = 0, i = 0, queues;
     /* for the no-fd, no-helper case */
     const char *script;
@@ -845,7 +869,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
             goto out;
         }
 
-        fd = monitor_fd_param(monitor_cur(), tap->fd, errp);
+        fd = cpr_get_fd_param(name, tap->fd, TAP_FD_INDEX(0), cpr, errp);
         if (fd == -1) {
             ret = -1;
             goto out;
@@ -866,13 +890,14 @@ int net_init_tap(const Netdev *netdev, const char *name,
 
         net_init_tap_one(tap, peer, "tap", name, NULL,
                          script, downscript,
-                         vhostfdname, vnet_hdr, fd, &err);
+                         vhostfdname, vnet_hdr, fd, TAP_VHOSTFD_INDEX(0), &err);
         if (err) {
             error_propagate(errp, err);
             close(fd);
             ret = -1;
             goto out;
         }
+
     } else if (tap->fds) {
         char **fds;
         char **vhost_fds;
@@ -903,7 +928,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
         }
 
         for (i = 0; i < nfds; i++) {
-            fd = monitor_fd_param(monitor_cur(), fds[i], errp);
+            fd = cpr_get_fd_param(name, fds[i], TAP_FD_INDEX(i), cpr, errp);
             if (fd == -1) {
                 ret = -1;
                 goto free_fail;
@@ -930,7 +955,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
             net_init_tap_one(tap, peer, "tap", name, ifname,
                              script, downscript,
                              tap->vhostfds ? vhost_fds[i] : NULL,
-                             vnet_hdr, fd, &err);
+                             vnet_hdr, fd, TAP_VHOSTFD_INDEX(i), &err);
             if (err) {
                 error_propagate(errp, err);
                 ret = -1;
@@ -958,9 +983,15 @@ free_fail:
             goto out;
         }
 
-        fd = net_bridge_run_helper(tap->helper,
-                                   tap->br ?: DEFAULT_BRIDGE_INTERFACE,
-                                   errp);
+        fd = cpr ? cpr_find_fd(name, TAP_FD_INDEX(0)) : -1;
+        if (fd < 0) {
+            fd = net_bridge_run_helper(tap->helper,
+                                    tap->br ?: DEFAULT_BRIDGE_INTERFACE,
+                                    errp);
+            if (cpr && fd >= 0) {
+                cpr_save_fd(name, TAP_FD_INDEX(0), fd);
+            }
+        }
         if (fd == -1) {
             ret = -1;
             goto out;
@@ -980,13 +1011,14 @@ free_fail:
 
         net_init_tap_one(tap, peer, "bridge", name, ifname,
                          script, downscript, vhostfdname,
-                         vnet_hdr, fd, &err);
+                         vnet_hdr, fd, TAP_VHOSTFD_INDEX(0), &err);
         if (err) {
             error_propagate(errp, err);
             close(fd);
             ret = -1;
             goto out;
         }
+
     } else {
         g_autofree char *default_script = NULL;
         g_autofree char *default_downscript = NULL;
@@ -1011,8 +1043,14 @@ free_fail:
         }
 
         for (i = 0; i < queues; i++) {
-            fd = net_tap_init(tap, &vnet_hdr, i >= 1 ? "no" : script,
-                              ifname, sizeof ifname, queues > 1, errp);
+            fd = cpr ? cpr_find_fd(name, TAP_FD_INDEX(i)) : -1;
+            if (fd < 0) {
+                fd = net_tap_init(tap, &vnet_hdr, i >= 1 ? "no" : script,
+                                ifname, sizeof ifname, queues > 1, errp);
+                if (cpr && fd >= 0) {
+                    cpr_save_fd(name, TAP_FD_INDEX(i), fd);
+                }
+            }
             if (fd == -1) {
                 ret = -1;
                 goto out;
@@ -1030,7 +1068,9 @@ free_fail:
             net_init_tap_one(tap, peer, "tap", name, ifname,
                              i >= 1 ? "no" : script,
                              i >= 1 ? "no" : downscript,
-                             vhostfdname, vnet_hdr, fd, &err);
+                             vhostfdname, vnet_hdr,
+                             fd, TAP_VHOSTFD_INDEX(i),
+                             &err);
             if (err) {
                 error_propagate(errp, err);
                 close(fd);
@@ -1041,6 +1081,9 @@ free_fail:
     }
 
 out:
+    if (ret && cpr) {
+        cpr_delete_fd_all(name);
+    }
     return ret;
 }
 
diff --git a/qapi/net.json b/qapi/net.json
index 118bd34965..264213b5d9 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -355,6 +355,8 @@
 # @poll-us: maximum number of microseconds that could be spent on busy
 #     polling for tap (since 2.7)
 #
+# @cpr: preserve fds and vhostfds during cpr-transfer.
+#
 # Since: 1.2
 ##
 { 'struct': 'NetdevTapOptions',
@@ -373,7 +375,8 @@
     '*vhostfds':   'str',
     '*vhostforce': 'bool',
     '*queues':     'uint32',
-    '*poll-us':    'uint32'} }
+    '*poll-us':    'uint32',
+    '*cpr':        'bool'} }
 
 ##
 # @NetdevSocketOptions:

-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 7/8] tap: postload fix for cpr
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
                   ` (5 preceding siblings ...)
  2025-12-03 18:51 ` [PATCH v3 6/8] tap: cpr support Ben Chaney
@ 2025-12-03 18:51 ` Ben Chaney
  2025-12-03 18:51 ` [PATCH v3 8/8] tap: cpr fixes Ben Chaney
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare

From: Steve Sistare <steven.sistare@oracle.com>

After cpr of a multi-queue NIC, if any queues are unused, then the
corresponding tap is marked enabled in userland, but it is disabled in the
kernel for the fd that was preserved.  One cannot call tap_disable() during
postload, because that eventually calls IFF_DETACH_QUEUE, which fails
because the queue is already detached.  Define tap_disable_postload to
avoid IFF_DETACH_QUEUE.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
 hw/net/virtio-net.c | 20 ++++++++++++++++++++
 include/net/tap.h   |  1 +
 net/tap-win32.c     |  5 +++++
 net/tap.c           | 17 +++++++++++++++++
 4 files changed, 43 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 3b85560f6f..38ec7ac109 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -765,6 +765,25 @@ static int peer_detach(VirtIONet *n, int index)
     return tap_disable(nc->peer);
 }
 
+/*
+ * Set the disabled flag on unused queue pairs after vmstate load, without
+ * calling IFF_DETACH_QUEUE, which fails because the queue is already detached.
+ */
+static void virtio_net_postload_queue_pairs(VirtIONet *n)
+{
+    int i;
+    MigMode mode = migrate_mode();
+
+    if (mode == MIG_MODE_CPR_TRANSFER) {
+        for (i = n->curr_queue_pairs; i < n->max_queue_pairs; i++) {
+            NetClientState *nc = qemu_get_subqueue(n->nic, i);
+            if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_TAP) {
+                tap_disable_postload(nc->peer);
+            }
+        }
+    }
+}
+
 static void virtio_net_set_queue_pairs(VirtIONet *n)
 {
     int i;
@@ -3212,6 +3231,7 @@ static int virtio_net_post_load_device(void *opaque, int version_id)
      */
     n->saved_guest_offloads = n->curr_guest_offloads;
 
+    virtio_net_postload_queue_pairs(n);
     virtio_net_set_queue_pairs(n);
 
     /* Find the first multicast entry in the saved MAC filter */
diff --git a/include/net/tap.h b/include/net/tap.h
index 6f34f13eae..934131f551 100644
--- a/include/net/tap.h
+++ b/include/net/tap.h
@@ -30,6 +30,7 @@
 
 int tap_enable(NetClientState *nc);
 int tap_disable(NetClientState *nc);
+void tap_disable_postload(NetClientState *nc);
 
 int tap_get_fd(NetClientState *nc);
 
diff --git a/net/tap-win32.c b/net/tap-win32.c
index 38baf90e0b..efe81c54ee 100644
--- a/net/tap-win32.c
+++ b/net/tap-win32.c
@@ -766,3 +766,8 @@ int tap_disable(NetClientState *nc)
 {
     abort();
 }
+
+void tap_disable_postload(NetClientState *nc)
+{
+    abort();
+}
diff --git a/net/tap.c b/net/tap.c
index 79e29addd1..5acda81146 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -1121,3 +1121,20 @@ int tap_disable(NetClientState *nc)
         return ret;
     }
 }
+
+/*
+ * On cpr restart, the tap is marked enabled in userland, but it might be
+ * disabled in the kernel, and IFF_DETACH_QUEUE will fail because it is
+ * already detached.  This function disables without calling IFF_DETACH_QUEUE.
+ */
+void tap_disable_postload(NetClientState *nc)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+    if (!s->cpr || s->enabled == 0) {
+        return;
+    } else {
+        s->enabled = false;
+        tap_update_fd_handler(s);
+    }
+}

-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 8/8] tap: cpr fixes
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
                   ` (6 preceding siblings ...)
  2025-12-03 18:51 ` [PATCH v3 7/8] tap: postload fix for cpr Ben Chaney
@ 2025-12-03 18:51 ` Ben Chaney
  2025-12-04 17:59   ` Daniel P. Berrangé
  2025-12-04 12:52 ` [PATCH v3 0/8] Live update: tap and vhost Vladimir Sementsov-Ogievskiy
  2025-12-08 10:08 ` Cédric Le Goater
  9 siblings, 1 reply; 23+ messages in thread
From: Ben Chaney @ 2025-12-03 18:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Ben Chaney, Steve Sistare, Vladimir Sementsov-Ogievskiy

From: Steve Sistare <steven.sistare@oracle.com>

Fix "virtio_net_set_queue_pairs: Assertion `!r' failed."
Fix "virtio-net: saved image requires vnet_hdr=on"
Do not change blocking mode of incoming cpr fd's.

Reported-by: Ben Chaney <bchaney@akamai.com>
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Ben Chaney <bchaney@akamai.com>
---
 hw/net/virtio-net.c | 6 ++++++
 io/channel-socket.c | 4 +++-
 net/tap.c           | 2 ++
 stubs/cpr.c         | 8 ++++++++
 stubs/meson.build   | 1 +
 5 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 38ec7ac109..fd6b30b296 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -37,6 +37,7 @@
 #include "qapi/qapi-types-migration.h"
 #include "qapi/qapi-events-migration.h"
 #include "hw/virtio/virtio-access.h"
+#include "migration/cpr.h"
 #include "migration/misc.h"
 #include "standard-headers/linux/ethtool.h"
 #include "system/system.h"
@@ -789,6 +790,11 @@ static void virtio_net_set_queue_pairs(VirtIONet *n)
     int i;
     int r;
 
+    if (cpr_is_incoming()) {
+        /* peers are already attached, do nothing */
+        return;
+    }
+
     if (n->nic->peer_deleted) {
         return;
     }
diff --git a/io/channel-socket.c b/io/channel-socket.c
index 3053b35ad8..443ca8cb7c 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -24,6 +24,7 @@
 #include "io/channel-socket.h"
 #include "io/channel-util.h"
 #include "io/channel-watch.h"
+#include "migration/cpr.h"
 #include "trace.h"
 #include "qapi/clone-visitor.h"
 #ifdef CONFIG_LINUX
@@ -521,7 +522,8 @@ static bool qio_channel_handle_fds(int *fds, size_t nfds,
 
         if (!preserve_blocking) {
             /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
-            if (!qemu_set_blocking(*fd, true, errp)) {
+              if (!cpr_is_incoming() &&
+                  qemu_set_blocking(*fd, true, errp)) {
                 return false;
             }
         }
diff --git a/net/tap.c b/net/tap.c
index 5acda81146..5e04099c87 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -1050,6 +1050,8 @@ free_fail:
                 if (cpr && fd >= 0) {
                     cpr_save_fd(name, TAP_FD_INDEX(i), fd);
                 }
+            } else {
+                vnet_hdr = tap->has_vnet_hdr ? tap->vnet_hdr : 1;
             }
             if (fd == -1) {
                 ret = -1;
diff --git a/stubs/cpr.c b/stubs/cpr.c
new file mode 100644
index 0000000000..1a4dbbb2d7
--- /dev/null
+++ b/stubs/cpr.c
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#include "qemu/osdep.h"
+#include "migration/cpr.h"
+
+bool cpr_is_incoming(void)
+{
+    return false;
+}
diff --git a/stubs/meson.build b/stubs/meson.build
index 0b2778c568..87af733528 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -10,6 +10,7 @@ stub_ss.add(files('is-daemonized.c'))
 stub_ss.add(files('monitor-core.c'))
 stub_ss.add(files('replay-mode.c'))
 stub_ss.add(files('trace-control.c'))
+stub_ss.add(files('cpr.c'))
 
 if have_block
   stub_ss.add(files('bdrv-next-monitor-owned.c'))

-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/8] tap: cpr support
  2025-12-03 18:51 ` [PATCH v3 6/8] tap: cpr support Ben Chaney
@ 2025-12-04  8:09   ` Markus Armbruster
  2025-12-05  0:51     ` Jason Wang
  2025-12-04 17:46   ` Cédric Le Goater
  2025-12-04 17:56   ` Daniel P. Berrangé
  2 siblings, 1 reply; 23+ messages in thread
From: Markus Armbruster @ 2025-12-04  8:09 UTC (permalink / raw)
  To: Ben Chaney
  Cc: qemu-devel, Peter Xu, Fabiano Rosas, Michael S. Tsirkin,
	Stefano Garzarella, Jason Wang, Alex Williamson,
	Cédric Le Goater, Eric Blake, Stefan Weil,
	Daniel P. Berrangé, Paolo Bonzini, Hamza Khan, Mark Kanda,
	Joshua Hunt, Max Tottenham, Steve Sistare

Ben Chaney <bchaney@akamai.com> writes:

> From: Steve Sistare <steven.sistare@oracle.com>
>
> Provide the cpr=on option to preserve TAP and vhost descriptors during
> cpr-transfer, so the management layer does not need to create a new
> device for the target.
>
> Save all tap fd's in canonical order, leveraging the index argument of
> cpr_save_fd.  For the i'th queue, the tap device fd is saved at index 2*i,
> and the vhostfd (if any) at index 2*i+1.
>
> tap and vhost fd's are passed by name to the monitor when a NIC is hot
> plugged, but the name is not known to qemu after cpr.  Allow the manager
> to pass -1 for the fd "name" in the new qemu args to indicate that QEMU
> should search for a saved value.  Example:
>
>   -netdev tap,id=hostnet2,fds=-1:-1,vhostfds=-1:-1,cpr=on

Hmm.  See below.

>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Signed-off-by: Ben Chaney <bchaney@akamai.com>

[...]

> diff --git a/qapi/net.json b/qapi/net.json
> index 118bd34965..264213b5d9 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -355,6 +355,8 @@
   ##
   # @NetdevTapOptions:
   #
   # Used to configure a host TAP network interface backend.
   #
   # @ifname: interface name
   #
   # @fd: file descriptor of an already opened tap
   #
   # @fds: multiple file descriptors of already opened multiqueue capable
   #     tap

Not this patch's fault: the interface is misguided, and its
documentation inadequate.

@fds is a string of file descriptor names or numbers separated by ':'.
Not documented.  I found out by reading the code.

This violates QAPI design principle "no string parsing".  It should be
an array of strings.

Aside: get_fds() should use g_strsplit().

Your patch extends the syntax to "file descriptor names or numbers or
"-1" separated by ":".  This is problematic.

Before the patch, a file descriptor name or number is interpreted as a
file descriptor number if it starts with a digit.  "-1" doesn't, so it's
interpreted as a file descriptor name.  Yes, "-1" works as file
descriptor name.  I just verified that

    {"execute": "getfd", "arguments": {"fdname": "-1"}}

works by changing 'fdname': 'fdname' to 'fdname': '-1' in
tests/qtest/libqtest.c, and running tests/qtest/dbus-display-test with
QTEST_LOG=/dev/stdout.  The test passes using file descriptor name "-1".

Aside: not restricting the syntax of identifiers to something sensible
like "begin with a letter, and contain only ASCII letters, digits, and
hyphen" is a mistake we've make again and again.

Your patch changes the interpretation of "-1" from "file descriptor
name" to "saved file descriptor".

If it does so regardless of the value of @cpr, then this is an
incompatible change.

We normally require such changes to go through the deprecation process.
We waive that when we're *confident* not doing so will not inconvenience
any users.  Are we here?

If it does so only when @cpr is true, the semantics of "-1" depends on
@cpr.  Yuck!

We can accept "yuck!" when the alternatives are no better.  Have we
considered any?

Regardless, we clearly need to document syntax and semantics of @fds.
Please fix the doc string before this patch, then have this patch update
it.

   #
   # @script: script to initialize the interface
   #
   # @downscript: script to shut down the interface
   #
   # @br: bridge name (since 2.8)
   #
   # @helper: command to execute to configure bridge
   #
   # @sndbuf: send buffer limit.  Understands [TGMKkb] suffixes.
   #
   # @vnet_hdr: enable the IFF_VNET_HDR flag on the tap interface
   #
   # @vhost: enable vhost-net network accelerator
   #
   # @vhostfd: file descriptor of an already opened vhost net device
   #
   # @vhostfds: file descriptors of multiple already opened vhost net
   #     devices

Likewise.

   #
   # @vhostforce: vhost on for non-MSIX virtio guests
   #
   # @queues: number of queues to be created for multiqueue capable tap
   #
>  # @poll-us: maximum number of microseconds that could be spent on busy
>  #     polling for tap (since 2.7)
>  #
> +# @cpr: preserve fds and vhostfds during cpr-transfer.

The commit message explains things in a lot more detail.  Users may not
need to know all that detail.  But this feels too terse.

Please don't abbreviate "file descriptors" to "fds" in documentation
prose.

> +#
>  # Since: 1.2
>  ##
>  { 'struct': 'NetdevTapOptions',
> @@ -373,7 +375,8 @@
     'data': {
       '*ifname':     'str',
       '*fd':         'str',
       '*fds':        'str',
       '*script':     'str',
       '*downscript': 'str',
       '*br':         'str',
       '*helper':     'str',
       '*sndbuf':     'size',
       '*vnet_hdr':   'bool',
       '*vhost':      'bool',
       '*vhostfd':    'str',
>      '*vhostfds':   'str',
>      '*vhostforce': 'bool',
>      '*queues':     'uint32',
> -    '*poll-us':    'uint32'} }
> +    '*poll-us':    'uint32',
> +    '*cpr':        'bool'} }
>  
>  ##
>  # @NetdevSocketOptions:



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/8] Live update: tap and vhost
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
                   ` (7 preceding siblings ...)
  2025-12-03 18:51 ` [PATCH v3 8/8] tap: cpr fixes Ben Chaney
@ 2025-12-04 12:52 ` Vladimir Sementsov-Ogievskiy
  2025-12-08 21:03   ` Chaney, Ben
  2025-12-08 10:08 ` Cédric Le Goater
  9 siblings, 1 reply; 23+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-12-04 12:52 UTC (permalink / raw)
  To: Ben Chaney, qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Steve Sistare

On 03.12.25 21:51, Ben Chaney wrote:
> Changes since v2
> - I have taken over this patch set since Steve retired
> - Added comments to explain the order of events
> - Remove redundant reversion to cleanup git history
> - Inclusion of virtio and stub fixes
> 
> Tap and vhost devices can be preserved during cpr-transfer using
> traditional live migration methods, wherein the management layer
> creates new interfaces for the target and fiddles with 'ip link'
> to deactivate the old interface and activate the new.
> 
> However, CPR can simply send the file descriptors to new QEMU,
> with no special management actions required.  The user enables
> this behavior by specifing '-netdev tap,cpr=on'.  The default
> is cpr=off.
> 
> Signed-off-by: Ben Chaney<bchaney@akamai.com>


Hi!

Hmm, note that I have alternative in-flight series,

[PATCH v9 0/8] virtio-net: live-TAP local migration
https://patchew.org/QEMU/20251030203116.870742-1-vsementsov@yandex-team.ru/

, which bring same thing: migrate TAP device, passing FDs
though migration channel. The benefit is that it doesn't
require additional migration channel.

So, it may be used as part of CPR-migration, or with
usual migration without CPR.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/8] tap: cpr support
  2025-12-03 18:51 ` [PATCH v3 6/8] tap: cpr support Ben Chaney
  2025-12-04  8:09   ` Markus Armbruster
@ 2025-12-04 17:46   ` Cédric Le Goater
  2025-12-04 17:56   ` Daniel P. Berrangé
  2 siblings, 0 replies; 23+ messages in thread
From: Cédric Le Goater @ 2025-12-04 17:46 UTC (permalink / raw)
  To: Ben Chaney, qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Eric Blake, Markus Armbruster,
	Stefan Weil, Daniel P. Berrangé, Paolo Bonzini, Hamza Khan,
	Mark Kanda, Joshua Hunt, Max Tottenham, Steve Sistare

On 12/3/25 19:51, Ben Chaney wrote:
> From: Steve Sistare <steven.sistare@oracle.com>
> 
> Provide the cpr=on option to preserve TAP and vhost descriptors during
> cpr-transfer, so the management layer does not need to create a new
> device for the target.
> 
> Save all tap fd's in canonical order, leveraging the index argument of
> cpr_save_fd.  For the i'th queue, the tap device fd is saved at index 2*i,
> and the vhostfd (if any) at index 2*i+1.
> 
> tap and vhost fd's are passed by name to the monitor when a NIC is hot
> plugged, but the name is not known to qemu after cpr.  Allow the manager
> to pass -1 for the fd "name" in the new qemu args to indicate that QEMU
> should search for a saved value.  Example:
> 
>    -netdev tap,id=hostnet2,fds=-1:-1,vhostfds=-1:-1,cpr=on
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Signed-off-by: Ben Chaney <bchaney@akamai.com>
> ---
>   hw/vfio/device.c        |  2 +-
>   include/migration/cpr.h |  2 +-
>   migration/cpr.c         | 11 ++++----
>   net/tap.c               | 73 +++++++++++++++++++++++++++++++++++++++----------
>   qapi/net.json           |  5 +++-
>   5 files changed, 70 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/vfio/device.c b/hw/vfio/device.c
> index 76869828fc..73e622f7b5 100644
> --- a/hw/vfio/device.c
> +++ b/hw/vfio/device.c
> @@ -362,7 +362,7 @@ void vfio_device_free_name(VFIODevice *vbasedev)
>   
>   void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp)
>   {
> -    vbasedev->fd = cpr_get_fd_param(vbasedev->dev->id, str, 0, errp);
> +    vbasedev->fd = cpr_get_fd_param(vbasedev->dev->id, str, 0, true, errp);


This looks weird to me.

This calls a cpr* routine with a 'cpr' bool argument that toggles
CPR on or off. It looks a bit hacky. Could you clarify the intention?


C.

>   }
>   
>   static VFIODeviceIOOps vfio_device_io_ops_ioctl;
> diff --git a/include/migration/cpr.h b/include/migration/cpr.h
> index d585fadc5b..68424b4b03 100644
> --- a/include/migration/cpr.h
> +++ b/include/migration/cpr.h
> @@ -48,7 +48,7 @@ void cpr_state_close(void);
>   struct QIOChannel *cpr_state_ioc(void);
>   
>   bool cpr_incoming_needed(void *opaque);
> -int cpr_get_fd_param(const char *name, const char *fdname, int index,
> +int cpr_get_fd_param(const char *name, const char *fdname, int index, bool cpr,
>                        Error **errp);
>   
>   QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp);
> diff --git a/migration/cpr.c b/migration/cpr.c
> index c0bf93a7ba..19bd56339d 100644
> --- a/migration/cpr.c
> +++ b/migration/cpr.c
> @@ -316,6 +316,7 @@ bool cpr_incoming_needed(void *opaque)
>    * @name: CPR name for the descriptor
>    * @fdname: An integer-valued string, or a name passed to a getfd command
>    * @index: CPR index of the descriptor
> + * @cpr: use cpr
>    * @errp: returned error message
>    *
>    * If CPR is not being performed, then use @fdname to find the fd.
> @@ -325,22 +326,22 @@ bool cpr_incoming_needed(void *opaque)
>    * On success returns the fd value, else returns -1.
>    */
>   int cpr_get_fd_param(const char *name, const char *fdname, int index,
> -                     Error **errp)
> +                     bool cpr, Error **errp)
>   {
>       ERRP_GUARD();
>       int fd;
>   
> -    if (cpr_is_incoming()) {
> +    if (cpr && cpr_is_incoming()) {
>           fd = cpr_find_fd(name, index);
>           if (fd < 0) {
>               error_setg(errp, "cannot find saved value for fd %s", fdname);
>           }
>       } else {
>           fd = monitor_fd_param(monitor_cur(), fdname, errp);
> -        if (fd >= 0) {
> -            cpr_save_fd(name, index, fd);
> -        } else {
> +        if (fd < 0) {
>               error_prepend(errp, "Could not parse object fd %s:", fdname);
> +        } else if (cpr) {
> +            cpr_save_fd(name, index, fd);
>           }
>       }
>       return fd;
> diff --git a/net/tap.c b/net/tap.c
> index 9d480574c3..79e29addd1 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -35,6 +35,7 @@
>   #include "net/eth.h"
>   #include "net/net.h"
>   #include "clients.h"
> +#include "migration/cpr.h"
>   #include "monitor/monitor.h"
>   #include "system/system.h"
>   #include "qapi/error.h"
> @@ -80,6 +81,7 @@ typedef struct TAPState {
>       bool has_uso;
>       bool has_tunnel;
>       bool enabled;
> +    bool cpr;
>       VHostNetState *vhost_net;
>       unsigned host_vnet_hdr_len;
>       Notifier exit;
> @@ -323,6 +325,9 @@ static void tap_cleanup(NetClientState *nc)
>   {
>       TAPState *s = DO_UPCAST(TAPState, nc, nc);
>   
> +    if (s->cpr) {
> +        cpr_delete_fd_all(nc->name);
> +    }
>       if (s->vhost_net) {
>           vhost_net_cleanup(s->vhost_net);
>           g_free(s->vhost_net);
> @@ -690,18 +695,24 @@ static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
>       return fd;
>   }
>   
> +/* CPR fd's for each queue are saved at these indices */
> +#define TAP_FD_INDEX(queue)         (2 * (queue) + 0)
> +#define TAP_VHOSTFD_INDEX(queue)    (2 * (queue) + 1)
> +
>   #define MAX_TAP_QUEUES 1024
>   
>   static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
>                                const char *model, const char *name,
>                                const char *ifname, const char *script,
>                                const char *downscript, const char *vhostfdname,
> -                             int vnet_hdr, int fd, Error **errp)
> +                             int vnet_hdr, int fd, int index, Error **errp)
>   {
>       Error *err = NULL;
>       TAPState *s = net_tap_fd_init(peer, model, name, fd, vnet_hdr);
> +    bool cpr = tap->has_cpr ? tap->cpr : false;
>       int vhostfd;
>   
> +    s->cpr = cpr;
>       tap_set_sndbuf(s->fd, tap, &err);
>       if (err) {
>           error_propagate(errp, err);
> @@ -736,7 +747,7 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
>           }
>   
>           if (vhostfdname) {
> -            vhostfd = monitor_fd_param(monitor_cur(), vhostfdname, &err);
> +            vhostfd = cpr_get_fd_param(name, vhostfdname, index, cpr, &err);
>               if (vhostfd == -1) {
>                   error_propagate(errp, err);
>                   goto failed;
> @@ -745,13 +756,22 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
>                   goto failed;
>               }
>           } else {
> -            vhostfd = open("/dev/vhost-net", O_RDWR);
> +            vhostfd = cpr ? cpr_find_fd(name, index) : -1;
> +            if (vhostfd < 0) {
> +                vhostfd = open("/dev/vhost-net", O_RDWR);
> +                if (cpr && vhostfd >= 0) {
> +                    cpr_save_fd(name, index, vhostfd);
> +                }
> +            }
>               if (vhostfd < 0) {
>                   error_setg_errno(errp, errno,
>                                    "tap: open vhost char device failed");
>                   goto failed;
>               }
>               if (!qemu_set_blocking(vhostfd, false, errp)) {
> +                if (!cpr) {
> +                    close(vhostfd);
> +                }
>                   goto failed;
>               }
>           }
> @@ -777,6 +797,9 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
>       return;
>   
>   failed:
> +    if (cpr) {
> +        cpr_delete_fd_all(name);
> +    }
>       qemu_del_net_client(&s->nc);
>   }
>   
> @@ -809,7 +832,8 @@ static int get_fds(char *str, char *fds[], int max)
>   int net_init_tap(const Netdev *netdev, const char *name,
>                    NetClientState *peer, Error **errp)
>   {
> -    const NetdevTapOptions *tap;
> +    const NetdevTapOptions *tap = &netdev->u.tap;
> +    bool cpr = tap->has_cpr ? tap->cpr : false;
>       int fd, vnet_hdr = 0, i = 0, queues;
>       /* for the no-fd, no-helper case */
>       const char *script;
> @@ -845,7 +869,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
>               goto out;
>           }
>   
> -        fd = monitor_fd_param(monitor_cur(), tap->fd, errp);
> +        fd = cpr_get_fd_param(name, tap->fd, TAP_FD_INDEX(0), cpr, errp);
>           if (fd == -1) {
>               ret = -1;
>               goto out;
> @@ -866,13 +890,14 @@ int net_init_tap(const Netdev *netdev, const char *name,
>   
>           net_init_tap_one(tap, peer, "tap", name, NULL,
>                            script, downscript,
> -                         vhostfdname, vnet_hdr, fd, &err);
> +                         vhostfdname, vnet_hdr, fd, TAP_VHOSTFD_INDEX(0), &err);
>           if (err) {
>               error_propagate(errp, err);
>               close(fd);
>               ret = -1;
>               goto out;
>           }
> +
>       } else if (tap->fds) {
>           char **fds;
>           char **vhost_fds;
> @@ -903,7 +928,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
>           }
>   
>           for (i = 0; i < nfds; i++) {
> -            fd = monitor_fd_param(monitor_cur(), fds[i], errp);
> +            fd = cpr_get_fd_param(name, fds[i], TAP_FD_INDEX(i), cpr, errp);
>               if (fd == -1) {
>                   ret = -1;
>                   goto free_fail;
> @@ -930,7 +955,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
>               net_init_tap_one(tap, peer, "tap", name, ifname,
>                                script, downscript,
>                                tap->vhostfds ? vhost_fds[i] : NULL,
> -                             vnet_hdr, fd, &err);
> +                             vnet_hdr, fd, TAP_VHOSTFD_INDEX(i), &err);
>               if (err) {
>                   error_propagate(errp, err);
>                   ret = -1;
> @@ -958,9 +983,15 @@ free_fail:
>               goto out;
>           }
>   
> -        fd = net_bridge_run_helper(tap->helper,
> -                                   tap->br ?: DEFAULT_BRIDGE_INTERFACE,
> -                                   errp);
> +        fd = cpr ? cpr_find_fd(name, TAP_FD_INDEX(0)) : -1;
> +        if (fd < 0) {
> +            fd = net_bridge_run_helper(tap->helper,
> +                                    tap->br ?: DEFAULT_BRIDGE_INTERFACE,
> +                                    errp);
> +            if (cpr && fd >= 0) {
> +                cpr_save_fd(name, TAP_FD_INDEX(0), fd);
> +            }
> +        }
>           if (fd == -1) {
>               ret = -1;
>               goto out;
> @@ -980,13 +1011,14 @@ free_fail:
>   
>           net_init_tap_one(tap, peer, "bridge", name, ifname,
>                            script, downscript, vhostfdname,
> -                         vnet_hdr, fd, &err);
> +                         vnet_hdr, fd, TAP_VHOSTFD_INDEX(0), &err);
>           if (err) {
>               error_propagate(errp, err);
>               close(fd);
>               ret = -1;
>               goto out;
>           }
> +
>       } else {
>           g_autofree char *default_script = NULL;
>           g_autofree char *default_downscript = NULL;
> @@ -1011,8 +1043,14 @@ free_fail:
>           }
>   
>           for (i = 0; i < queues; i++) {
> -            fd = net_tap_init(tap, &vnet_hdr, i >= 1 ? "no" : script,
> -                              ifname, sizeof ifname, queues > 1, errp);
> +            fd = cpr ? cpr_find_fd(name, TAP_FD_INDEX(i)) : -1;
> +            if (fd < 0) {
> +                fd = net_tap_init(tap, &vnet_hdr, i >= 1 ? "no" : script,
> +                                ifname, sizeof ifname, queues > 1, errp);
> +                if (cpr && fd >= 0) {
> +                    cpr_save_fd(name, TAP_FD_INDEX(i), fd);
> +                }
> +            }
>               if (fd == -1) {
>                   ret = -1;
>                   goto out;
> @@ -1030,7 +1068,9 @@ free_fail:
>               net_init_tap_one(tap, peer, "tap", name, ifname,
>                                i >= 1 ? "no" : script,
>                                i >= 1 ? "no" : downscript,
> -                             vhostfdname, vnet_hdr, fd, &err);
> +                             vhostfdname, vnet_hdr,
> +                             fd, TAP_VHOSTFD_INDEX(i),
> +                             &err);
>               if (err) {
>                   error_propagate(errp, err);
>                   close(fd);
> @@ -1041,6 +1081,9 @@ free_fail:
>       }
>   
>   out:
> +    if (ret && cpr) {
> +        cpr_delete_fd_all(name);
> +    }
>       return ret;
>   }
>   
> diff --git a/qapi/net.json b/qapi/net.json
> index 118bd34965..264213b5d9 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -355,6 +355,8 @@
>   # @poll-us: maximum number of microseconds that could be spent on busy
>   #     polling for tap (since 2.7)
>   #
> +# @cpr: preserve fds and vhostfds during cpr-transfer.
> +#
>   # Since: 1.2
>   ##
>   { 'struct': 'NetdevTapOptions',
> @@ -373,7 +375,8 @@
>       '*vhostfds':   'str',
>       '*vhostforce': 'bool',
>       '*queues':     'uint32',
> -    '*poll-us':    'uint32'} }
> +    '*poll-us':    'uint32',
> +    '*cpr':        'bool'} }
>   
>   ##
>   # @NetdevSocketOptions:
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/8] tap: cpr support
  2025-12-03 18:51 ` [PATCH v3 6/8] tap: cpr support Ben Chaney
  2025-12-04  8:09   ` Markus Armbruster
  2025-12-04 17:46   ` Cédric Le Goater
@ 2025-12-04 17:56   ` Daniel P. Berrangé
  2 siblings, 0 replies; 23+ messages in thread
From: Daniel P. Berrangé @ 2025-12-04 17:56 UTC (permalink / raw)
  To: Ben Chaney
  Cc: qemu-devel, Peter Xu, Fabiano Rosas, Michael S. Tsirkin,
	Stefano Garzarella, Jason Wang, Alex Williamson,
	Cédric Le Goater, Eric Blake, Markus Armbruster, Stefan Weil,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Steve Sistare

On Wed, Dec 03, 2025 at 01:51:23PM -0500, Ben Chaney wrote:
> From: Steve Sistare <steven.sistare@oracle.com>
> 
> Provide the cpr=on option to preserve TAP and vhost descriptors during
> cpr-transfer, so the management layer does not need to create a new
> device for the target.
> 
> Save all tap fd's in canonical order, leveraging the index argument of
> cpr_save_fd.  For the i'th queue, the tap device fd is saved at index 2*i,
> and the vhostfd (if any) at index 2*i+1.

This interleaving feels risky from the POV of future extensibility.

Although its unlikely that we'll need a third type of FD per queue,
it would be easy to leave this possiblity open.

IOW, IMHO we should save all tap FDs, then all vhostfds with no
interleaving. If we ever get further FDs to save in future, then
they can be set to follow the vhostfds.

> 
> tap and vhost fd's are passed by name to the monitor when a NIC is hot
> plugged, but the name is not known to qemu after cpr.  Allow the manager
> to pass -1 for the fd "name" in the new qemu args to indicate that QEMU
> should search for a saved value.  Example:
> 
>   -netdev tap,id=hostnet2,fds=-1:-1,vhostfds=-1:-1,cpr=on

This syntax feels redundant.  If cpr==off then "fds" must
always be valid, or not specified at all. If cpr=on, then
"fds" will always be -1. I don't see any point in setting
the 'fds' or 'vhostfds' arg at all. It should simply be:

 -netdev tap,id=hostnet2,cpr=on

this in turn avoids introducing special syntax for allowing
-1 in 'fds' or 'vhostfds' which Markus was concerned with.


> diff --git a/include/migration/cpr.h b/include/migration/cpr.h
> index d585fadc5b..68424b4b03 100644
> --- a/include/migration/cpr.h
> +++ b/include/migration/cpr.h
> @@ -48,7 +48,7 @@ void cpr_state_close(void);
>  struct QIOChannel *cpr_state_ioc(void);
>  
>  bool cpr_incoming_needed(void *opaque);
> -int cpr_get_fd_param(const char *name, const char *fdname, int index,
> +int cpr_get_fd_param(const char *name, const char *fdname, int index, bool cpr,
>                       Error **errp);
>  
>  QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp);
> diff --git a/migration/cpr.c b/migration/cpr.c
> index c0bf93a7ba..19bd56339d 100644
> --- a/migration/cpr.c
> +++ b/migration/cpr.c
> @@ -316,6 +316,7 @@ bool cpr_incoming_needed(void *opaque)
>   * @name: CPR name for the descriptor
>   * @fdname: An integer-valued string, or a name passed to a getfd command
>   * @index: CPR index of the descriptor
> + * @cpr: use cpr

This feels wierdly redundant too.   THe method name already implies
use of 'cpr', and yet we now have another parameter to ask whether
to use 'cpr'. At the very least these semantics deserve a much
better explanation than "@cpr: use cpr", as I don't know what the
intention is here.

>   * @errp: returned error message
>   *
>   * If CPR is not being performed, then use @fdname to find the fd.
> @@ -325,22 +326,22 @@ bool cpr_incoming_needed(void *opaque)
>   * On success returns the fd value, else returns -1.
>   */
>  int cpr_get_fd_param(const char *name, const char *fdname, int index,
> -                     Error **errp)
> +                     bool cpr, Error **errp)
>  {
>      ERRP_GUARD();
>      int fd;
>  
> -    if (cpr_is_incoming()) {
> +    if (cpr && cpr_is_incoming()) {
>          fd = cpr_find_fd(name, index);
>          if (fd < 0) {
>              error_setg(errp, "cannot find saved value for fd %s", fdname);
>          }
>      } else {
>          fd = monitor_fd_param(monitor_cur(), fdname, errp);
> -        if (fd >= 0) {
> -            cpr_save_fd(name, index, fd);
> -        } else {
> +        if (fd < 0) {
>              error_prepend(errp, "Could not parse object fd %s:", fdname);
> +        } else if (cpr) {
> +            cpr_save_fd(name, index, fd);
>          }
>      }
>      return fd;


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 8/8] tap: cpr fixes
  2025-12-03 18:51 ` [PATCH v3 8/8] tap: cpr fixes Ben Chaney
@ 2025-12-04 17:59   ` Daniel P. Berrangé
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel P. Berrangé @ 2025-12-04 17:59 UTC (permalink / raw)
  To: Ben Chaney
  Cc: qemu-devel, Peter Xu, Fabiano Rosas, Michael S. Tsirkin,
	Stefano Garzarella, Jason Wang, Alex Williamson,
	Cédric Le Goater, Eric Blake, Markus Armbruster, Stefan Weil,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Joshua Hunt, Max Tottenham,
	Steve Sistare, Vladimir Sementsov-Ogievskiy

On Wed, Dec 03, 2025 at 01:51:25PM -0500, Ben Chaney wrote:
> From: Steve Sistare <steven.sistare@oracle.com>
> 
> Fix "virtio_net_set_queue_pairs: Assertion `!r' failed."
> Fix "virtio-net: saved image requires vnet_hdr=on"
> Do not change blocking mode of incoming cpr fd's.
> 
> Reported-by: Ben Chaney <bchaney@akamai.com>
> Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Signed-off-by: Ben Chaney <bchaney@akamai.com>
> ---
>  hw/net/virtio-net.c | 6 ++++++
>  io/channel-socket.c | 4 +++-
>  net/tap.c           | 2 ++
>  stubs/cpr.c         | 8 ++++++++
>  stubs/meson.build   | 1 +
>  5 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 38ec7ac109..fd6b30b296 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -37,6 +37,7 @@
>  #include "qapi/qapi-types-migration.h"
>  #include "qapi/qapi-events-migration.h"
>  #include "hw/virtio/virtio-access.h"
> +#include "migration/cpr.h"
>  #include "migration/misc.h"
>  #include "standard-headers/linux/ethtool.h"
>  #include "system/system.h"
> @@ -789,6 +790,11 @@ static void virtio_net_set_queue_pairs(VirtIONet *n)
>      int i;
>      int r;
>  
> +    if (cpr_is_incoming()) {
> +        /* peers are already attached, do nothing */
> +        return;
> +    }
> +
>      if (n->nic->peer_deleted) {
>          return;
>      }
> diff --git a/io/channel-socket.c b/io/channel-socket.c
> index 3053b35ad8..443ca8cb7c 100644
> --- a/io/channel-socket.c
> +++ b/io/channel-socket.c
> @@ -24,6 +24,7 @@
>  #include "io/channel-socket.h"
>  #include "io/channel-util.h"
>  #include "io/channel-watch.h"
> +#include "migration/cpr.h"
>  #include "trace.h"
>  #include "qapi/clone-visitor.h"
>  #ifdef CONFIG_LINUX
> @@ -521,7 +522,8 @@ static bool qio_channel_handle_fds(int *fds, size_t nfds,
>  
>          if (!preserve_blocking) {
>              /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
> -            if (!qemu_set_blocking(*fd, true, errp)) {
> +              if (!cpr_is_incoming() &&
> +                  qemu_set_blocking(*fd, true, errp)) {
>                  return false;
>              }
>          }

This is wrong - we added the QIO_CHANNEL_READ_FLAG_FD_PRESERVE_BLOCKING
flag precisely so that we don't need to add CPR hacks into QIOChannelSocket
code.

Whatever cares about this blocking state being preserved needs to pass
that QIO_CHANNEL flag when reading from the channel.

> diff --git a/net/tap.c b/net/tap.c
> index 5acda81146..5e04099c87 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -1050,6 +1050,8 @@ free_fail:
>                  if (cpr && fd >= 0) {
>                      cpr_save_fd(name, TAP_FD_INDEX(i), fd);
>                  }
> +            } else {
> +                vnet_hdr = tap->has_vnet_hdr ? tap->vnet_hdr : 1;
>              }
>              if (fd == -1) {
>                  ret = -1;
> diff --git a/stubs/cpr.c b/stubs/cpr.c
> new file mode 100644
> index 0000000000..1a4dbbb2d7
> --- /dev/null
> +++ b/stubs/cpr.c
> @@ -0,0 +1,8 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#include "qemu/osdep.h"
> +#include "migration/cpr.h"
> +
> +bool cpr_is_incoming(void)
> +{
> +    return false;
> +}
> diff --git a/stubs/meson.build b/stubs/meson.build
> index 0b2778c568..87af733528 100644
> --- a/stubs/meson.build
> +++ b/stubs/meson.build
> @@ -10,6 +10,7 @@ stub_ss.add(files('is-daemonized.c'))
>  stub_ss.add(files('monitor-core.c'))
>  stub_ss.add(files('replay-mode.c'))
>  stub_ss.add(files('trace-control.c'))
> +stub_ss.add(files('cpr.c'))
>  
>  if have_block
>    stub_ss.add(files('bdrv-next-monitor-owned.c'))
> 
> -- 
> 2.34.1
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/8] tap: cpr support
  2025-12-04  8:09   ` Markus Armbruster
@ 2025-12-05  0:51     ` Jason Wang
  2025-12-05  6:46       ` Markus Armbruster
  0 siblings, 1 reply; 23+ messages in thread
From: Jason Wang @ 2025-12-05  0:51 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Ben Chaney, qemu-devel, Peter Xu, Fabiano Rosas,
	Michael S. Tsirkin, Stefano Garzarella, Alex Williamson,
	Cédric Le Goater, Eric Blake, Stefan Weil,
	Daniel P. Berrangé, Paolo Bonzini, Hamza Khan, Mark Kanda,
	Joshua Hunt, Max Tottenham, Steve Sistare

Hi Markus,

On Thu, Dec 4, 2025 at 4:09 PM Markus Armbruster <armbru@redhat.com> wrote:
>
> Ben Chaney <bchaney@akamai.com> writes:
>
> > From: Steve Sistare <steven.sistare@oracle.com>
> >
> > Provide the cpr=on option to preserve TAP and vhost descriptors during
> > cpr-transfer, so the management layer does not need to create a new
> > device for the target.
> >
> > Save all tap fd's in canonical order, leveraging the index argument of
> > cpr_save_fd.  For the i'th queue, the tap device fd is saved at index 2*i,
> > and the vhostfd (if any) at index 2*i+1.
> >
> > tap and vhost fd's are passed by name to the monitor when a NIC is hot
> > plugged, but the name is not known to qemu after cpr.  Allow the manager
> > to pass -1 for the fd "name" in the new qemu args to indicate that QEMU
> > should search for a saved value.  Example:
> >
> >   -netdev tap,id=hostnet2,fds=-1:-1,vhostfds=-1:-1,cpr=on
>
> Hmm.  See below.
>
> >
> > Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> > Signed-off-by: Ben Chaney <bchaney@akamai.com>
>
> [...]
>
> > diff --git a/qapi/net.json b/qapi/net.json
> > index 118bd34965..264213b5d9 100644
> > --- a/qapi/net.json
> > +++ b/qapi/net.json
> > @@ -355,6 +355,8 @@
>    ##
>    # @NetdevTapOptions:
>    #
>    # Used to configure a host TAP network interface backend.
>    #
>    # @ifname: interface name
>    #
>    # @fd: file descriptor of an already opened tap
>    #
>    # @fds: multiple file descriptors of already opened multiqueue capable
>    #     tap
>
> Not this patch's fault: the interface is misguided, and its
> documentation inadequate.
>
> @fds is a string of file descriptor names or numbers separated by ':'.
> Not documented.  I found out by reading the code.
>
> This violates QAPI design principle "no string parsing".  It should be
> an array of strings.
>

I agree with your concern. Just a note that this "fds" was introduced
before QAPI if I am not wrong.

Thanks



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/8] tap: cpr support
  2025-12-05  0:51     ` Jason Wang
@ 2025-12-05  6:46       ` Markus Armbruster
  0 siblings, 0 replies; 23+ messages in thread
From: Markus Armbruster @ 2025-12-05  6:46 UTC (permalink / raw)
  To: Jason Wang
  Cc: Ben Chaney, qemu-devel, Peter Xu, Fabiano Rosas,
	Michael S. Tsirkin, Stefano Garzarella, Alex Williamson,
	Cédric Le Goater, Eric Blake, Stefan Weil,
	Daniel P. Berrangé, Paolo Bonzini, Hamza Khan, Mark Kanda,
	Joshua Hunt, Max Tottenham, Steve Sistare

Jason Wang <jasowang@redhat.com> writes:

> Hi Markus,
>
> On Thu, Dec 4, 2025 at 4:09 PM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Ben Chaney <bchaney@akamai.com> writes:
>>
>> > From: Steve Sistare <steven.sistare@oracle.com>
>> >
>> > Provide the cpr=on option to preserve TAP and vhost descriptors during
>> > cpr-transfer, so the management layer does not need to create a new
>> > device for the target.
>> >
>> > Save all tap fd's in canonical order, leveraging the index argument of
>> > cpr_save_fd.  For the i'th queue, the tap device fd is saved at index 2*i,
>> > and the vhostfd (if any) at index 2*i+1.
>> >
>> > tap and vhost fd's are passed by name to the monitor when a NIC is hot
>> > plugged, but the name is not known to qemu after cpr.  Allow the manager
>> > to pass -1 for the fd "name" in the new qemu args to indicate that QEMU
>> > should search for a saved value.  Example:
>> >
>> >   -netdev tap,id=hostnet2,fds=-1:-1,vhostfds=-1:-1,cpr=on
>>
>> Hmm.  See below.
>>
>> >
>> > Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> > Signed-off-by: Ben Chaney <bchaney@akamai.com>
>>
>> [...]
>>
>> > diff --git a/qapi/net.json b/qapi/net.json
>> > index 118bd34965..264213b5d9 100644
>> > --- a/qapi/net.json
>> > +++ b/qapi/net.json
>> > @@ -355,6 +355,8 @@
>>    ##
>>    # @NetdevTapOptions:
>>    #
>>    # Used to configure a host TAP network interface backend.
>>    #
>>    # @ifname: interface name
>>    #
>>    # @fd: file descriptor of an already opened tap
>>    #
>>    # @fds: multiple file descriptors of already opened multiqueue capable
>>    #     tap
>>
>> Not this patch's fault: the interface is misguided, and its
>> documentation inadequate.
>>
>> @fds is a string of file descriptor names or numbers separated by ':'.
>> Not documented.  I found out by reading the code.
>>
>> This violates QAPI design principle "no string parsing".  It should be
>> an array of strings.
>>
>
> I agree with your concern. Just a note that this "fds" was introduced
> before QAPI if I am not wrong.

It's from 2013 (commit 264986e2c8f).  QAPI was still young then
(netdev_add had been QAPIfied less than a year ago), we had much to
learn, and interface review barely happened.  All understandable, and no
reason to throw shade on anyone involved :)



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/8] Live update: tap and vhost
  2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
                   ` (8 preceding siblings ...)
  2025-12-04 12:52 ` [PATCH v3 0/8] Live update: tap and vhost Vladimir Sementsov-Ogievskiy
@ 2025-12-08 10:08 ` Cédric Le Goater
  2025-12-08 14:22   ` Mark Kanda
  2025-12-09 18:36   ` Chaney, Ben
  9 siblings, 2 replies; 23+ messages in thread
From: Cédric Le Goater @ 2025-12-08 10:08 UTC (permalink / raw)
  To: Ben Chaney, qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Eric Blake, Markus Armbruster,
	Stefan Weil, Daniel P. Berrangé, Paolo Bonzini, Hamza Khan,
	Mark Kanda, Joshua Hunt, Max Tottenham, Steve Sistare,
	Vladimir Sementsov-Ogievskiy

Hello,

Ben, Mark,

Since Steve retired, we have generic names under the "CheckPoint and
Restart (CPR)" entry in MAINTAINERS. Would you be willing to step forward
as Reviewers/Maintainers ?

Also, do you have a gitlab account so we can copy you on any reported
issues [1] ?

Thanks,

C.

[1] https://gitlab.com/qemu-project/qemu/-/issues/3235



On 12/3/25 19:51, Ben Chaney wrote:
> Changes since v2
> - I have taken over this patch set since Steve retired
> - Added comments to explain the order of events
> - Remove redundant reversion to cleanup git history
> - Inclusion of virtio and stub fixes
> 
> Tap and vhost devices can be preserved during cpr-transfer using
> traditional live migration methods, wherein the management layer
> creates new interfaces for the target and fiddles with 'ip link'
> to deactivate the old interface and activate the new.
> 
> However, CPR can simply send the file descriptors to new QEMU,
> with no special management actions required.  The user enables
> this behavior by specifing '-netdev tap,cpr=on'.  The default
> is cpr=off.
> 
> Signed-off-by: Ben Chaney <bchaney@akamai.com>
> ---
> Steve Sistare (8):
>        migration: stop vm earlier for cpr
>        migration: cpr setup notifier
>        vhost: reset vhost devices for cpr
>        cpr: delete all fds
>        tap: common return label
>        tap: cpr support
>        tap: postload fix for cpr
>        tap: cpr fixes
> 
>   hw/net/virtio-net.c               |  26 +++++++
>   hw/vfio/device.c                  |   2 +-
>   hw/virtio/vhost-backend.c         |   6 ++
>   hw/virtio/vhost.c                 |  32 +++++++++
>   include/hw/virtio/vhost-backend.h |   1 +
>   include/hw/virtio/vhost.h         |   1 +
>   include/migration/cpr.h           |   3 +-
>   include/net/tap.h                 |   1 +
>   io/channel-socket.c               |   4 +-
>   migration/cpr.c                   |  24 +++++--
>   migration/migration.c             |  69 ++++++++++++++----
>   net/tap-win32.c                   |   5 ++
>   net/tap.c                         | 147 +++++++++++++++++++++++++++++---------
>   qapi/net.json                     |   5 +-
>   stubs/cpr.c                       |   8 +++
>   stubs/meson.build                 |   1 +
>   16 files changed, 279 insertions(+), 56 deletions(-)
> ---
> base-commit: 9febfa94b69b7146582c48a868bd2330ac45037f
> change-id: 20251203-cpr-tap-04fd811ace03
> 
> Best regards,





^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/8] Live update: tap and vhost
  2025-12-08 10:08 ` Cédric Le Goater
@ 2025-12-08 14:22   ` Mark Kanda
  2025-12-08 14:42     ` Cédric Le Goater
  2025-12-09 18:36   ` Chaney, Ben
  1 sibling, 1 reply; 23+ messages in thread
From: Mark Kanda @ 2025-12-08 14:22 UTC (permalink / raw)
  To: Cédric Le Goater, Ben Chaney, qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Eric Blake, Markus Armbruster,
	Stefan Weil, Daniel P. Berrangé, Paolo Bonzini, Hamza Khan,
	Joshua Hunt, Max Tottenham, Steve Sistare,
	Vladimir Sementsov-Ogievskiy

[-- Attachment #1: Type: text/plain, Size: 2939 bytes --]

On 12/8/25 4:08 AM, Cédric Le Goater wrote:
> Hello,
>
> Ben, Mark,
>
> Since Steve retired, we have generic names under the "CheckPoint and
> Restart (CPR)" entry in MAINTAINERS. Would you be willing to step forward
> as Reviewers/Maintainers ?
>
You can add me as a Reviewer.

Thanks/regards,
-Mark

> Also, do you have a gitlab account so we can copy you on any reported
> issues [1] ?
>
> Thanks,
>
> C.
>
> [1] https://gitlab.com/qemu-project/qemu/-/issues/3235
>
>
>
> On 12/3/25 19:51, Ben Chaney wrote:
>> Changes since v2
>> - I have taken over this patch set since Steve retired
>> - Added comments to explain the order of events
>> - Remove redundant reversion to cleanup git history
>> - Inclusion of virtio and stub fixes
>>
>> Tap and vhost devices can be preserved during cpr-transfer using
>> traditional live migration methods, wherein the management layer
>> creates new interfaces for the target and fiddles with 'ip link'
>> to deactivate the old interface and activate the new.
>>
>> However, CPR can simply send the file descriptors to new QEMU,
>> with no special management actions required.  The user enables
>> this behavior by specifing '-netdev tap,cpr=on'.  The default
>> is cpr=off.
>>
>> Signed-off-by: Ben Chaney <bchaney@akamai.com>
>> ---
>> Steve Sistare (8):
>>        migration: stop vm earlier for cpr
>>        migration: cpr setup notifier
>>        vhost: reset vhost devices for cpr
>>        cpr: delete all fds
>>        tap: common return label
>>        tap: cpr support
>>        tap: postload fix for cpr
>>        tap: cpr fixes
>>
>>   hw/net/virtio-net.c               |  26 +++++++
>>   hw/vfio/device.c                  |   2 +-
>>   hw/virtio/vhost-backend.c         |   6 ++
>>   hw/virtio/vhost.c                 |  32 +++++++++
>>   include/hw/virtio/vhost-backend.h |   1 +
>>   include/hw/virtio/vhost.h         |   1 +
>>   include/migration/cpr.h           |   3 +-
>>   include/net/tap.h                 |   1 +
>>   io/channel-socket.c               |   4 +-
>>   migration/cpr.c                   |  24 +++++--
>>   migration/migration.c             |  69 ++++++++++++++----
>>   net/tap-win32.c                   |   5 ++
>>   net/tap.c                         | 147 
>> +++++++++++++++++++++++++++++---------
>>   qapi/net.json                     |   5 +-
>>   stubs/cpr.c                       |   8 +++
>>   stubs/meson.build                 |   1 +
>>   16 files changed, 279 insertions(+), 56 deletions(-)
>> ---
>> base-commit: 9febfa94b69b7146582c48a868bd2330ac45037f
>> change-id: 20251203-cpr-tap-04fd811ace03
>>
>> Best regards,
>
>
>

[-- Attachment #2: Type: text/html, Size: 6087 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/8] Live update: tap and vhost
  2025-12-08 14:22   ` Mark Kanda
@ 2025-12-08 14:42     ` Cédric Le Goater
  0 siblings, 0 replies; 23+ messages in thread
From: Cédric Le Goater @ 2025-12-08 14:42 UTC (permalink / raw)
  To: Mark Kanda, Ben Chaney, qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Eric Blake, Markus Armbruster,
	Stefan Weil, Daniel P. Berrangé, Paolo Bonzini, Hamza Khan,
	Joshua Hunt, Max Tottenham, Steve Sistare,
	Vladimir Sementsov-Ogievskiy

On 12/8/25 15:22, Mark Kanda wrote:
> On 12/8/25 4:08 AM, Cédric Le Goater wrote:
>> Hello,
>>
>> Ben, Mark,
>>
>> Since Steve retired, we have generic names under the "CheckPoint and
>> Restart (CPR)" entry in MAINTAINERS. Would you be willing to step forward
>> as Reviewers/Maintainers ?
>>
> You can add me as a Reviewer.

You should send a patch, such :

  https://gitlab.com/legoater/qemu/-/commits/9942c711835f

Thanks,

C.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/8] Live update: tap and vhost
  2025-12-04 12:52 ` [PATCH v3 0/8] Live update: tap and vhost Vladimir Sementsov-Ogievskiy
@ 2025-12-08 21:03   ` Chaney, Ben
  2025-12-09  7:27     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 23+ messages in thread
From: Chaney, Ben @ 2025-12-08 21:03 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel@nongnu.org
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Hunt, Joshua,
	Tottenham, Max, Steve Sistare



On 12/4/25, 7:53 AM, "Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru <mailto:vsementsov@yandex-team.ru>> wrote:

> [PATCH v9 0/8] virtio-net: live-TAP local migration
> https://urldefense.com/v3/__https://patchew.org/QEMU/20251030203116.870742-1-vsementsov@yandex-team.ru <mailto:20251030203116.870742-1-vsementsov@yandex-team.ru>/__;!!GjvTz_vk!Wv8g8JpZGOl96S-RX_T81d0fwPei5C-fKhKAUqM5DJmec3xKhFaStiinE3IFLyUwrs1UQrdQgth3VU1HRlblRjjmVA$


> , which bring same thing: migrate TAP device, passing FDs
> though migration channel. The benefit is that it doesn't
> require additional migration channel.

Hi Vladimir,
        Thanks for sending this. I tried testing your patch
set and I got the following errors from qemu

2025-12-08T20:44:31.251153Z qemu-system-x86_64: 8 != 101
2025-12-08T20:44:31.251199Z qemu-system-x86_64: Failed to load element of type uint16 equal for max_queue_pairs: -22
2025-12-08T20:44:31.251492Z qemu-system-x86_64: warning: qemu_fclose: received fd 141 was never claimed
2025-12-08T20:44:31.251497Z qemu-system-x86_64: warning: qemu_fclose: received fd 142 was never claimed
2025-12-08T20:44:31.251501Z qemu-system-x86_64: warning: qemu_fclose: received fd 143 was never claimed
2025-12-08T20:44:31.251524Z qemu-system-x86_64: load of migration failed: Invalid argument: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net': Failed to load element of type virtio for virtio: -22


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/8] Live update: tap and vhost
  2025-12-08 21:03   ` Chaney, Ben
@ 2025-12-09  7:27     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 23+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-12-09  7:27 UTC (permalink / raw)
  To: Chaney, Ben, qemu-devel@nongnu.org
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Cédric Le Goater, Eric Blake,
	Markus Armbruster, Stefan Weil, Daniel P. Berrangé,
	Paolo Bonzini, Hamza Khan, Mark Kanda, Hunt, Joshua,
	Tottenham, Max, Steve Sistare

On 09.12.25 00:03, Chaney, Ben wrote:
> 
> 
> On 12/4/25, 7:53 AM, "Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru <mailto:vsementsov@yandex-team.ru>> wrote:
> 
>> [PATCH v9 0/8] virtio-net: live-TAP local migration
>> https://urldefense.com/v3/__https://patchew.org/QEMU/20251030203116.870742-1-vsementsov@yandex-team.ru <mailto:20251030203116.870742-1-vsementsov@yandex-team.ru>/__;!!GjvTz_vk!Wv8g8JpZGOl96S-RX_T81d0fwPei5C-fKhKAUqM5DJmec3xKhFaStiinE3IFLyUwrs1UQrdQgth3VU1HRlblRjjmVA$
> 
> 
>> , which bring same thing: migrate TAP device, passing FDs
>> though migration channel. The benefit is that it doesn't
>> require additional migration channel.
> 
> Hi Vladimir,
>          Thanks for sending this. I tried testing your patch
> set and I got the following errors from qemu
> 
> 2025-12-08T20:44:31.251153Z qemu-system-x86_64: 8 != 101
> 2025-12-08T20:44:31.251199Z qemu-system-x86_64: Failed to load element of type uint16 equal for max_queue_pairs: -22
> 2025-12-08T20:44:31.251492Z qemu-system-x86_64: warning: qemu_fclose: received fd 141 was never claimed
> 2025-12-08T20:44:31.251497Z qemu-system-x86_64: warning: qemu_fclose: received fd 142 was never claimed
> 2025-12-08T20:44:31.251501Z qemu-system-x86_64: warning: qemu_fclose: received fd 143 was never claimed
> 2025-12-08T20:44:31.251524Z qemu-system-x86_64: load of migration failed: Invalid argument: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net': Failed to load element of type virtio for virtio: -22
> 

Thanks for testing!

Hmm. Migration errors was never friendly enough, to understand, what's going wrong. Could you describe, how exactly are you testing it? Also, do you use exactly my commit efb5b1a9aa839619db5  ( pushed here https://gitlab.com/vsementsov/qemu.git tag up-tap-fd-migration-v9 ) ?

Also, to check that the series basically work, you may start the included test (under sudo):

    export QEMU_TEST_QEMU_BINARY=$PWD/build/qemu-system-x86_64
    export PYTHONPATH=python:tests/functional
    python3 tests/functional/x86_64/test_tap_migration.py

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/8] Live update: tap and vhost
  2025-12-08 10:08 ` Cédric Le Goater
  2025-12-08 14:22   ` Mark Kanda
@ 2025-12-09 18:36   ` Chaney, Ben
  1 sibling, 0 replies; 23+ messages in thread
From: Chaney, Ben @ 2025-12-09 18:36 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel@nongnu.org
  Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Stefano Garzarella,
	Jason Wang, Alex Williamson, Eric Blake, Markus Armbruster,
	Stefan Weil, Daniel P. Berrangé, Paolo Bonzini, Hamza Khan,
	Mark Kanda, Hunt, Joshua, Tottenham, Max,
	Vladimir Sementsov-Ogievskiy


> Since Steve retired, we have generic names under the "CheckPoint and
> Restart (CPR)" entry in MAINTAINERS. Would you be willing to step forward
> as Reviewers/Maintainers ?


> Also, do you have a gitlab account so we can copy you on any reported
> issues [1] ?

I send a patch adding me and Mark as reviewers for CPR.

My gitlab username is benchaney

Thanks,
        Ben



^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2025-12-09 18:37 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-03 18:51 [PATCH v3 0/8] Live update: tap and vhost Ben Chaney
2025-12-03 18:51 ` [PATCH v3 1/8] migration: stop vm earlier for cpr Ben Chaney
2025-12-03 18:51 ` [PATCH v3 2/8] migration: cpr setup notifier Ben Chaney
2025-12-03 18:51 ` [PATCH v3 3/8] vhost: reset vhost devices for cpr Ben Chaney
2025-12-03 18:51 ` [PATCH v3 4/8] cpr: delete all fds Ben Chaney
2025-12-03 18:51 ` [PATCH v3 5/8] tap: common return label Ben Chaney
2025-12-03 18:51 ` [PATCH v3 6/8] tap: cpr support Ben Chaney
2025-12-04  8:09   ` Markus Armbruster
2025-12-05  0:51     ` Jason Wang
2025-12-05  6:46       ` Markus Armbruster
2025-12-04 17:46   ` Cédric Le Goater
2025-12-04 17:56   ` Daniel P. Berrangé
2025-12-03 18:51 ` [PATCH v3 7/8] tap: postload fix for cpr Ben Chaney
2025-12-03 18:51 ` [PATCH v3 8/8] tap: cpr fixes Ben Chaney
2025-12-04 17:59   ` Daniel P. Berrangé
2025-12-04 12:52 ` [PATCH v3 0/8] Live update: tap and vhost Vladimir Sementsov-Ogievskiy
2025-12-08 21:03   ` Chaney, Ben
2025-12-09  7:27     ` Vladimir Sementsov-Ogievskiy
2025-12-08 10:08 ` Cédric Le Goater
2025-12-08 14:22   ` Mark Kanda
2025-12-08 14:42     ` Cédric Le Goater
2025-12-09 18:36   ` Chaney, Ben
  -- strict thread matches above, loose matches on Subject: below --
2025-12-03 18:43 Ben Chaney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).