* [PATCH v10 0/8] virtio-net: live-TAP local migration
@ 2026-02-01 16:19 Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 1/8] net/tap: move vhost-net open() calls to tap_parse_vhost_fds() Vladimir Sementsov-Ogievskiy
` (7 more replies)
0 siblings, 8 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:19 UTC (permalink / raw)
To: jasowang, mst
Cc: pbonzini, berrange, thuth, armbru, eblake, farosas, peterx,
zhao1.liu, wangyanan55, philmd, marcel.apfelbaum, eduardo,
davydov-max, qemu-devel, vsementsov, yc-core, leiyang,
raphael.s.norwitz, bchaney
Hi all!
Here is a new migration parameter, backend-transfer, which enables
local migration of the TAP virtio-net backend (and maybe other
devices and backends in the future), including its properties and open
fds.
With this new option, management software doesn't need to initialize
a new TAP device and switch to it. Nothing needs to be done around
virtio-net in local migration: it just migrates and continues to use
the same TAP device. So we avoid extra logic in management software,
extra allocations in the kernel (for the new TAP), and the
corresponding extra delay in migration downtime.
v10:
reworked interface:
1. backend-transfer migration parameter - simple boolean
2. backend-transfer option for virtio-net device - in combination with
[1] it enables the feature for a particular virtio-net device
3. incoming-fds option for TAP - to skip any open() calls and fd
setup, and instead wait for an incoming fd.
Why do we need [1]? To be able to simply choose between local and
remote migration without touching device options.
Why do we need [2]? Because otherwise we would have to significantly
complicate the migration parameter (making it a list of QOM paths of
devices to enable backend-transfer for), which doesn't look like good
architectural design (we don't want to end up with migration
parameters like
feature1 = [list of QOM paths],
feature2 = [list of QOM paths],
...).
Why do we need [3]? To avoid the complicated logic of postponing the
whole initialization of TAP up to the pre-incoming point (see
"[PATCH v9 0/9] net/tap: postpone connect"), and to make the whole
feature more reliable: with such a parameter the user is sure that
the backend code will not do any open()/connect(), etc., which may
influence the running source VM. The automatic three-stage postponing
logic in v9 (first, we postpone initialization up to attaching to the
frontend; then, if it's a virtio-net, it takes ownership and postpones
initialization of the backend up to the pre-incoming point (where we
know the migration parameters); and finally, if it's a
backend-transfer, we postpone it up to post-load) can't be as
reliable. Misconfiguration on the target may lead to breaking the
source VM. With a separate parameter, everything becomes a lot simpler
and more reliable.
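For illustration only (this is not code from the series), the way [1]
and [2] combine can be sketched in C as below. The struct and function
names are hypothetical stand-ins rather than QEMU types; the only thing
taken from the series is that a device's backend is transferred when
both the migration parameter and the per-device option are true.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the migration parameters ([1]). */
struct mig_params {
    bool backend_transfer;
};

/* Hypothetical stand-in for one virtio-net device ([2]). */
struct vnet_dev {
    bool backend_transfer;
};

/*
 * A device's backend goes into the migration stream only when the
 * global parameter and the per-device option are both true.
 */
static bool backend_transfer_active(const struct mig_params *p,
                                    const struct vnet_dev *d)
{
    return p->backend_transfer && d->backend_transfer;
}
```

In this model, choosing between local and remote migration means
flipping only the migration parameter; the device option stays as it
was configured.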
The series is based on
"[PATCH v2 00/12] net: refactoring and fixes"
Based-on: <20260129213215.1405459-1-vsementsov@yandex-team.ru>
Vladimir Sementsov-Ogievskiy (8):
net/tap: move vhost-net open() calls to tap_parse_vhost_fds()
net/tap: move vhost initialization to tap_setup_vhost()
qapi: add backend-transfer migration parameter
net: introduce vmstate_net_peer_backend
virtio-net: support backend-transfer migration
net/tap: support backend-transfer migration
tests/functional: add skipWithoutSudo() decorator
tests/functional: add test_tap_migration
hw/core/machine.c | 4 +-
hw/net/virtio-net.c | 137 +++++-
include/hw/virtio/virtio-net.h | 2 +
include/migration/misc.h | 2 +
include/net/net.h | 6 +
migration/options.c | 18 +-
net/net.c | 47 ++
net/tap.c | 185 ++++++--
qapi/migration.json | 13 +-
qapi/net.json | 6 +-
tests/functional/qemu_test/decorators.py | 16 +
tests/functional/x86_64/meson.build | 1 +
tests/functional/x86_64/test_tap_migration.py | 401 ++++++++++++++++++
13 files changed, 792 insertions(+), 46 deletions(-)
create mode 100644 tests/functional/x86_64/test_tap_migration.py
--
2.52.0
* [PATCH v10 1/8] net/tap: move vhost-net open() calls to tap_parse_vhost_fds()
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:19 UTC (permalink / raw)
1. Simplify code path: get vhostfds for all cases in one function.
2. Prepare for the upcoming tap-fd-migration feature, where we'll need
to postpone vhost initialization up to the post-load stage.
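As a rough sketch of the decision this patch centralizes (a reduced
model, not the patch itself): an explicit vhost= setting wins,
otherwise passed vhost fds or vhostforce imply vhost, and
/dev/vhost-net has to be opened only when vhost is wanted but no fds
were given. The struct and the function names below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the vhost-related TAP options. */
struct tap_vhost_opts {
    bool has_vhost;
    bool vhost;
    const char *vhostfd;    /* NULL when not given */
    const char *vhostfds;   /* NULL when not given */
    bool has_vhostforce;
    bool vhostforce;
};

/* An explicit vhost= wins; otherwise fds or vhostforce imply vhost. */
static bool tap_need_vhost(const struct tap_vhost_opts *o)
{
    if (o->has_vhost) {
        return o->vhost;
    }
    return o->vhostfd || o->vhostfds ||
           (o->has_vhostforce && o->vhostforce);
}

/* vhost is wanted but no fds were passed: the device must be opened. */
static bool tap_must_open_vhost(const struct tap_vhost_opts *o)
{
    return tap_need_vhost(o) && !(o->vhostfd || o->vhostfds);
}
```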
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
net/tap.c | 37 +++++++++++++++++++++----------------
1 file changed, 21 insertions(+), 16 deletions(-)
diff --git a/net/tap.c b/net/tap.c
index 288cfedd81..d003b82142 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -738,8 +738,7 @@ static bool net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
}
}
- if (tap->has_vhost ? tap->vhost :
- (vhostfd != -1) || (tap->has_vhostforce && tap->vhostforce)) {
+ if (vhostfd != -1) {
VhostNetOptions options;
options.backend_type = VHOST_BACKEND_TYPE_KERNEL;
@@ -749,17 +748,6 @@ static bool net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
} else {
options.busyloop_timeout = 0;
}
-
- if (vhostfd == -1) {
- vhostfd = open("/dev/vhost-net", O_RDWR);
- if (vhostfd < 0) {
- error_setg_file_open(errp, errno, "/dev/vhost-net");
- goto failed;
- }
- if (!qemu_set_blocking(vhostfd, false, errp)) {
- goto failed;
- }
- }
options.opaque = (void *)(uintptr_t)vhostfd;
options.nvqs = 2;
options.feature_bits = kernel_feature_bits;
@@ -841,13 +829,30 @@ static int tap_parse_fds_and_queues(const NetdevTapOptions *tap, int **fds,
static bool tap_parse_vhost_fds(const NetdevTapOptions *tap, int **vhost_fds,
unsigned queues, Error **errp)
{
- if (!(tap->vhostfd || tap->vhostfds)) {
+ bool need_vhost = tap->has_vhost ? tap->vhost :
+ ((tap->vhostfd || tap->vhostfds) ||
+ (tap->has_vhostforce && tap->vhostforce));
+
+ if (!need_vhost) {
*vhost_fds = NULL;
return true;
}
- if (net_parse_fds(tap->fd ?: tap->fds, vhost_fds, queues, errp) < 0) {
- return false;
+ if (tap->vhostfd || tap->vhostfds) {
+ if (net_parse_fds(tap->fd ?: tap->fds, vhost_fds, queues, errp) < 0) {
+ return false;
+ }
+ } else if (!(tap->vhostfd || tap->vhostfds)) {
+ *vhost_fds = g_new(int, queues);
+ for (int i = 0; i < queues; i++) {
+ int vhostfd = open("/dev/vhost-net", O_RDWR);
+ if (vhostfd < 0) {
+ error_setg_file_open(errp, errno, "/dev/vhost-net");
+ net_free_fds(*vhost_fds, i);
+ return false;
+ }
+ (*vhost_fds)[i] = vhostfd;
+ }
}
if (!unblock_fds(*vhost_fds, queues, errp)) {
--
2.52.0
* [PATCH v10 2/8] net/tap: move vhost initialization to tap_setup_vhost()
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:19 UTC (permalink / raw)
Add a new helper function that can be reused later for the
TAP fd-migration feature: we'll need to initialize vhost at a later
point, when we don't have access to QAPI parameters.
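The idea, reduced to a toy model: copy what vhost setup needs out of
the user options early, so that the later step depends only on the
state. The names below are hypothetical and the "setup" is a
placeholder; the real helper is in the diff that follows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for TAPState. */
struct tap_state {
    int vhostfd;                      /* -1: nothing to set up */
    uint32_t vhost_busyloop_timeout;
    bool vhost_ready;                 /* stands in for s->vhost_net */
};

/* Early step: stash the values taken from the user options. */
static void tap_stash_vhost_params(struct tap_state *s, int vhostfd,
                                   bool has_poll_us, uint32_t poll_us)
{
    s->vhostfd = vhostfd;
    s->vhost_busyloop_timeout = has_poll_us ? poll_us : 0;
    s->vhost_ready = false;
}

/* Later step: needs only the state, not the options. */
static bool tap_setup_vhost_sketch(struct tap_state *s)
{
    if (s->vhostfd == -1) {
        return true;        /* vhost was not requested */
    }
    /* A real implementation initializes vhost here and may fail. */
    s->vhost_ready = true;
    s->vhostfd = -1;        /* fd ownership is handed over */
    return true;
}
```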
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
net/tap.c | 62 ++++++++++++++++++++++++++++++++++---------------------
1 file changed, 38 insertions(+), 24 deletions(-)
diff --git a/net/tap.c b/net/tap.c
index d003b82142..bd19c71c42 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -71,6 +71,8 @@ static const int kernel_feature_bits[] = {
typedef struct TAPState {
NetClientState nc;
int fd;
+ int vhostfd;
+ uint32_t vhost_busyloop_timeout;
char down_script[1024];
char down_script_arg[128];
uint8_t buf[NET_BUFSIZE];
@@ -704,6 +706,38 @@ static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
#define MAX_TAP_QUEUES 1024
+static bool tap_setup_vhost(TAPState *s, Error **errp)
+{
+ VhostNetOptions options;
+
+ if (s->vhostfd == -1) {
+ return true;
+ }
+
+ options.backend_type = VHOST_BACKEND_TYPE_KERNEL;
+ options.net_backend = &s->nc;
+ options.busyloop_timeout = s->vhost_busyloop_timeout;
+ options.opaque = (void *)(uintptr_t)s->vhostfd;
+ options.nvqs = 2;
+ options.feature_bits = kernel_feature_bits;
+ options.get_acked_features = NULL;
+ options.save_acked_features = NULL;
+ options.max_tx_queue_size = 0;
+ options.is_vhost_user = false;
+
+ s->vhost_net = vhost_net_init(&options);
+ if (!s->vhost_net) {
+ error_setg(errp,
+ "vhost-net requested but could not be initialized");
+ return false;
+ }
+
+ /* vhostfd ownership is passed to s->vhost_net */
+ s->vhostfd = -1;
+
+ return true;
+}
+
static bool net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
const char *name,
const char *ifname, const char *script,
@@ -738,30 +772,10 @@ static bool net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
}
}
- if (vhostfd != -1) {
- VhostNetOptions options;
-
- options.backend_type = VHOST_BACKEND_TYPE_KERNEL;
- options.net_backend = &s->nc;
- if (tap->has_poll_us) {
- options.busyloop_timeout = tap->poll_us;
- } else {
- options.busyloop_timeout = 0;
- }
- options.opaque = (void *)(uintptr_t)vhostfd;
- options.nvqs = 2;
- options.feature_bits = kernel_feature_bits;
- options.get_acked_features = NULL;
- options.save_acked_features = NULL;
- options.max_tx_queue_size = 0;
- options.is_vhost_user = false;
-
- s->vhost_net = vhost_net_init(&options);
- if (!s->vhost_net) {
- error_setg(errp,
- "vhost-net requested but could not be initialized");
- goto failed;
- }
+ s->vhostfd = vhostfd;
+ s->vhost_busyloop_timeout = tap->has_poll_us ? tap->poll_us : 0;
+ if (!tap_setup_vhost(s, errp)) {
+ return false;
}
return true;
--
2.52.0
* [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-01 16:19 [PATCH v10 0/8] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 1/8] net/tap: move vhost-net open() calls to tap_parse_vhost_fds() Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 2/8] net/tap: move vhost initialization to tap_setup_vhost() Vladimir Sementsov-Ogievskiy
@ 2026-02-01 16:19 ` Vladimir Sementsov-Ogievskiy
2026-02-04 13:08 ` Markus Armbruster
2026-02-04 17:21 ` Peter Xu
2026-02-01 16:19 ` [PATCH v10 4/8] net: introduce vmstate_net_peer_backend Vladimir Sementsov-Ogievskiy
` (4 subsequent siblings)
7 siblings, 2 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:19 UTC (permalink / raw)
We are going to implement the backend-transfer feature: some devices
will be able to transfer their backend through the migration stream
for local migration through a UNIX domain socket. For example,
virtio-net will migrate its attached TAP netdev, with all its
connected file descriptors.
In this commit we introduce a migration parameter which enables
the feature for supporting devices (none at the moment).
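For readers unfamiliar with why a UNIX domain socket is required: open
file descriptors can cross a process boundary only as SCM_RIGHTS
ancillary data on such a socket. The sketch below shows that kernel
mechanism with plain sendmsg()/recvmsg(). It is not QEMU's migration
channel code, and send_fd()/recv_fd() are hypothetical helper names.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one fd over a connected UNIX socket; 0 on success, -1 on error. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';    /* one data byte accompanies the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;   /* forces correct alignment of buf */
    } ctrl;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd; returns the new descriptor, or -1 on error. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;
    int fd;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor refers to the same open file description as
the one that was sent, which is what lets the destination keep using
the source's TAP device.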
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
include/migration/misc.h | 2 ++
migration/options.c | 18 +++++++++++++++++-
qapi/migration.json | 13 +++++++++++--
3 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index e26d418a6e..f23a4d8b59 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -152,4 +152,6 @@ bool multifd_device_state_save_thread_should_exit(void);
void multifd_abort_device_state_save_threads(void);
bool multifd_join_device_state_save_threads(void);
+bool migrate_backend_transfer(void);
+
#endif
diff --git a/migration/options.c b/migration/options.c
index 1ffe85a2d8..a4c9e68457 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -13,6 +13,7 @@
#include "qemu/osdep.h"
#include "qemu/error-report.h"
+#include "qapi/util.h"
#include "exec/target_page.h"
#include "qapi/clone-visitor.h"
#include "qapi/error.h"
@@ -24,6 +25,7 @@
#include "migration/colo.h"
#include "migration/cpr.h"
#include "migration/misc.h"
+#include "migration/options.h"
#include "migration.h"
#include "migration-stats.h"
#include "qemu-file.h"
@@ -336,6 +338,12 @@ bool migrate_mapped_ram(void)
return s->capabilities[MIGRATION_CAPABILITY_MAPPED_RAM];
}
+bool migrate_backend_transfer(void)
+{
+ MigrationState *s = migrate_get_current();
+ return s->parameters.backend_transfer;
+}
+
bool migrate_ignore_shared(void)
{
MigrationState *s = migrate_get_current();
@@ -1047,7 +1055,7 @@ static void migrate_mark_all_params_present(MigrationParameters *p)
&p->has_announce_step, &p->has_block_bitmap_mapping,
&p->has_x_vcpu_dirty_limit_period, &p->has_vcpu_dirty_limit,
&p->has_mode, &p->has_zero_page_detection, &p->has_direct_io,
- &p->has_cpr_exec_command,
+ &p->has_cpr_exec_command, &p->has_backend_transfer,
};
len = ARRAY_SIZE(has_fields);
@@ -1386,6 +1394,10 @@ static void migrate_params_test_apply(MigrationParameters *params,
if (params->has_cpr_exec_command) {
dest->cpr_exec_command = params->cpr_exec_command;
}
+
+ if (params->has_backend_transfer) {
+ dest->backend_transfer = params->backend_transfer;
+ }
}
static void migrate_params_apply(MigrationParameters *params)
@@ -1514,6 +1526,10 @@ static void migrate_params_apply(MigrationParameters *params)
s->parameters.cpr_exec_command =
QAPI_CLONE(strList, params->cpr_exec_command);
}
+
+ if (params->has_backend_transfer) {
+ s->parameters.backend_transfer = params->backend_transfer;
+ }
}
void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
diff --git a/qapi/migration.json b/qapi/migration.json
index f925e5541b..cbe88f0c91 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -828,7 +828,8 @@
'mode',
'zero-page-detection',
'direct-io',
- 'cpr-exec-command'] }
+ 'cpr-exec-command',
+ 'backend-transfer'] }
##
# @migrate-set-parameters:
@@ -1004,6 +1005,13 @@
# is @cpr-exec. The first list element is the program's filename,
# the remainder its arguments. (Since 10.2)
#
+# @backend-transfer: Enable backend-transfer feature for devices that
+# support it. In general that means that backend state and its
+# file descriptors are passed to the destination in the migration
+# channel (which must be a UNIX socket). Individual devices
+# declare support for backend-transfer via a per-device
+# backend-transfer option. (Since 11.0)
+#
# Features:
#
# @unstable: Members @x-checkpoint-delay and
@@ -1043,7 +1051,8 @@
'*mode': 'MigMode',
'*zero-page-detection': 'ZeroPageDetection',
'*direct-io': 'bool',
- '*cpr-exec-command': [ 'str' ]} }
+ '*cpr-exec-command': [ 'str' ],
+ '*backend-transfer': 'bool' } }
##
# @query-migrate-parameters:
--
2.52.0
* [PATCH v10 4/8] net: introduce vmstate_net_peer_backend
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:19 UTC (permalink / raw)
To implement backend-transfer migration in virtio-net in the next
commit, we need a generic API to migrate a net backend. Here it is.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
include/net/net.h | 4 ++++
net/net.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)
diff --git a/include/net/net.h b/include/net/net.h
index 45bc86fc86..aa34043b1a 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -5,6 +5,7 @@
#include "qapi/qapi-types-net.h"
#include "net/queue.h"
#include "hw/core/qdev-properties-system.h"
+#include "migration/vmstate.h"
#define MAC_FMT "%02X:%02X:%02X:%02X:%02X:%02X"
#define MAC_ARG(x) ((uint8_t *)(x))[0], ((uint8_t *)(x))[1], \
@@ -110,6 +111,7 @@ typedef struct NetClientInfo {
SetSteeringEBPF *set_steering_ebpf;
NetCheckPeerType *check_peer_type;
GetVHostNet *get_vhost_net;
+ const VMStateDescription *backend_vmsd;
} NetClientInfo;
struct NetClientState {
@@ -354,4 +356,6 @@ static inline bool net_peer_needs_padding(NetClientState *nc)
return nc->peer && !nc->peer->do_not_pad;
}
+extern const VMStateInfo vmstate_net_peer_backend;
+
#endif
diff --git a/net/net.c b/net/net.c
index a176936f9b..58e11b9597 100644
--- a/net/net.c
+++ b/net/net.c
@@ -58,6 +58,7 @@
#include "qapi/string-output-visitor.h"
#include "qapi/qobject-input-visitor.h"
#include "standard-headers/linux/virtio_net.h"
+#include "migration/vmstate.h"
/* Net bridge is currently not supported for W32. */
#if !defined(_WIN32)
@@ -2173,3 +2174,49 @@ int net_fill_rstate(SocketReadState *rs, const uint8_t *buf, int size)
assert(size == 0);
return 0;
}
+
+static int get_peer_backend(QEMUFile *f, void *pv, size_t size,
+ const VMStateField *field)
+{
+ NetClientState *nc = pv;
+ Error *local_err = NULL;
+ int ret;
+
+ if (!nc->peer) {
+ return -EINVAL;
+ }
+ nc = nc->peer;
+
+ ret = vmstate_load_state(f, nc->info->backend_vmsd, nc, 0, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
+
+ return ret;
+}
+
+static int put_peer_backend(QEMUFile *f, void *pv, size_t size,
+ const VMStateField *field, JSONWriter *vmdesc)
+{
+ NetClientState *nc = pv;
+ Error *local_err = NULL;
+ int ret;
+
+ if (!nc->peer) {
+ return -EINVAL;
+ }
+ nc = nc->peer;
+
+ ret = vmstate_save_state(f, nc->info->backend_vmsd, nc, 0, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
+
+ return ret;
+}
+
+const VMStateInfo vmstate_net_peer_backend = {
+ .name = "virtio-net-nic-nc-backend",
+ .get = get_peer_backend,
+ .put = put_peer_backend,
+};
--
2.52.0
* [PATCH v10 5/8] virtio-net: support backend-transfer migration
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:19 UTC (permalink / raw)
Add a virtio-net option, backend-transfer, which is true by default
but false for older machine types, which don't support the feature.
For backend-transfer migration, both the global migration parameter
backend-transfer and the virtio-net backend-transfer option should be
set to true.
With the parameters enabled (on both source and target, of course), and
with a UNIX socket used as the migration channel, we do "migrate" the
virtio-net backend (the TAP device) with all its fds.
This way the management tool need not care about creating a new TAP
device or switching to it. Migration downtime becomes shorter.
Support for TAP will come in the next commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/core/machine.c | 4 +-
hw/net/virtio-net.c | 137 ++++++++++++++++++++++++++++++++-
include/hw/virtio/virtio-net.h | 2 +
include/net/net.h | 2 +
4 files changed, 143 insertions(+), 2 deletions(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 6411e68856..cc99287232 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -38,7 +38,9 @@
#include "hw/acpi/generic_event_device.h"
#include "qemu/audio.h"
-GlobalProperty hw_compat_10_2[] = {};
+GlobalProperty hw_compat_10_2[] = {
+ { TYPE_VIRTIO_NET, "backend-transfer", "false" },
+};
const size_t hw_compat_10_2_len = G_N_ELEMENTS(hw_compat_10_2);
GlobalProperty hw_compat_10_1[] = {
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 512a7c02c9..9e3f75031a 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -38,8 +38,10 @@
#include "qapi/qapi-events-migration.h"
#include "hw/virtio/virtio-access.h"
#include "migration/misc.h"
+#include "migration/options.h"
#include "standard-headers/linux/ethtool.h"
#include "system/system.h"
+#include "system/runstate.h"
#include "system/replay.h"
#include "trace.h"
#include "monitor/qdev.h"
@@ -3061,7 +3063,17 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
n->multiqueue = multiqueue;
virtio_net_change_num_queues(n, max * 2 + 1);
- virtio_net_set_queue_pairs(n);
+ /*
+ * virtio_net_set_multiqueue() called from set_features(0) on early
+ * reset, when peer may wait for incoming (and is not initialized
+ * yet).
+ * Don't worry about it: virtio_net_set_queue_pairs() will be called
+ * later from virtio_net_post_load_device(), and anyway will be
+ * noop for local incoming migration with live backend passing.
+ */
+ if (!n->peers_wait_incoming) {
+ virtio_net_set_queue_pairs(n);
+ }
}
static int virtio_net_pre_load_queues(VirtIODevice *vdev, uint32_t n)
@@ -3090,6 +3102,17 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
virtio_add_feature_ex(features, VIRTIO_NET_F_MAC);
+ if (n->peers_wait_incoming) {
+ /*
+ * Excessive feature set is OK for early initialization when
+ * we wait for local incoming migration: actual guest-negotiated
+ * features will come with migration stream anyway. And we are sure
+ * that we support same host-features as source, because the backend
+ * is the same (the same TAP device, for example).
+ */
+ return;
+ }
+
if (!peer_has_vnet_hdr(n)) {
virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
@@ -3181,6 +3204,18 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
}
}
+static bool virtio_net_update_host_features(VirtIONet *n, Error **errp)
+{
+ ERRP_GUARD();
+ VirtIODevice *vdev = VIRTIO_DEVICE(n);
+
+ peer_test_vnet_hdr(n);
+
+ virtio_net_get_features(vdev, &vdev->host_features, errp);
+
+ return !*errp;
+}
+
static int virtio_net_post_load_device(void *opaque, int version_id)
{
VirtIONet *n = opaque;
@@ -3302,6 +3337,9 @@ struct VirtIONetMigTmp {
uint16_t curr_queue_pairs_1;
uint8_t has_ufo;
uint32_t has_vnet_hdr;
+
+ NetClientState *ncs;
+ uint32_t max_queue_pairs;
};
/* The 2nd and subsequent tx_waiting flags are loaded later than
@@ -3571,6 +3609,57 @@ static const VMStateDescription vhost_user_net_backend_state = {
}
};
+static bool virtio_net_is_backend_transfer(void *opaque, int version_id)
+{
+ VirtIONet *n = opaque;
+
+ return migrate_backend_transfer() && n->backend_transfer;
+}
+
+static int virtio_net_nic_pre_save(void *opaque)
+{
+ struct VirtIONetMigTmp *tmp = opaque;
+
+ tmp->ncs = tmp->parent->nic->ncs;
+ tmp->max_queue_pairs = tmp->parent->max_queue_pairs;
+
+ return 0;
+}
+
+static int virtio_net_nic_pre_load(void *opaque)
+{
+ /* Reuse the pointer setup from save */
+ virtio_net_nic_pre_save(opaque);
+
+ return 0;
+}
+
+static int virtio_net_nic_post_load(void *opaque, int version_id)
+{
+ struct VirtIONetMigTmp *tmp = opaque;
+ Error *local_err = NULL;
+
+ if (!virtio_net_update_host_features(tmp->parent, &local_err)) {
+ error_report_err(local_err);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static const VMStateDescription vmstate_virtio_net_nic = {
+ .name = "virtio-net-nic",
+ .pre_load = virtio_net_nic_pre_load,
+ .pre_save = virtio_net_nic_pre_save,
+ .post_load = virtio_net_nic_post_load,
+ .fields = (const VMStateField[]) {
+ VMSTATE_VARRAY_UINT32(ncs, struct VirtIONetMigTmp,
+ max_queue_pairs, 0, vmstate_net_peer_backend,
+ NetClientState),
+ VMSTATE_END_OF_LIST()
+ },
+};
+
static const VMStateDescription vmstate_virtio_net_device = {
.name = "virtio-net-device",
.version_id = VIRTIO_NET_VM_VERSION,
@@ -3602,6 +3691,9 @@ static const VMStateDescription vmstate_virtio_net_device = {
* but based on the uint.
*/
VMSTATE_BUFFER_POINTER_UNSAFE(vlans, VirtIONet, 0, MAX_VLAN >> 3),
+ VMSTATE_WITH_TMP_TEST(VirtIONet, virtio_net_is_backend_transfer,
+ struct VirtIONetMigTmp,
+ vmstate_virtio_net_nic),
VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
vmstate_virtio_net_has_vnet),
VMSTATE_UINT8(mac_table.multi_overflow, VirtIONet),
@@ -3866,6 +3958,42 @@ static bool failover_hide_primary_device(DeviceListener *listener,
return qatomic_read(&n->failover_primary_hidden);
}
+static bool virtio_net_check_peers_wait_incoming(VirtIONet *n, bool *waiting,
+ Error **errp)
+{
+ bool has_waiting = false;
+ bool has_not_waiting = false;
+
+ for (int i = 0; i < n->max_queue_pairs; i++) {
+ NetClientState *peer = n->nic->ncs[i].peer;
+ if (!peer) {
+ continue;
+ }
+
+ if (peer->info->is_wait_incoming &&
+ peer->info->is_wait_incoming(peer)) {
+ has_waiting = true;
+ } else {
+ has_not_waiting = true;
+ }
+
+ if (has_waiting && has_not_waiting) {
+ error_setg(errp, "Mixed peer states: some peers wait for incoming "
+ "migration while others don't");
+ return false;
+ }
+ }
+
+ if (has_waiting && !runstate_check(RUN_STATE_INMIGRATE)) {
+ error_setg(errp, "Peers wait for incoming, but it's not an incoming "
+ "migration.");
+ return false;
+ }
+
+ *waiting = has_waiting;
+ return true;
+}
+
static void virtio_net_device_realize(DeviceState *dev, Error **errp)
{
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -4003,6 +4131,12 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
n->nic->ncs[i].do_not_pad = true;
}
+ if (!virtio_net_check_peers_wait_incoming(n, &n->peers_wait_incoming,
+ errp)) {
+ virtio_cleanup(vdev);
+ return;
+ }
+
peer_test_vnet_hdr(n);
if (peer_has_vnet_hdr(n)) {
n->host_hdr_len = sizeof(struct virtio_net_hdr);
@@ -4314,6 +4448,7 @@ static const Property virtio_net_properties[] = {
host_features_ex,
VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM,
true),
+ DEFINE_PROP_BOOL("backend-transfer", VirtIONet, backend_transfer, true),
};
static void virtio_net_class_init(ObjectClass *klass, const void *data)
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 5b8ab7bda7..14a5c7c77b 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -231,6 +231,8 @@ struct VirtIONet {
struct EBPFRSSContext ebpf_rss;
uint32_t nr_ebpf_rss_fds;
char **ebpf_rss_fds;
+ bool peers_wait_incoming;
+ bool backend_transfer;
};
size_t virtio_net_handle_ctrl_iov(VirtIODevice *vdev,
diff --git a/include/net/net.h b/include/net/net.h
index aa34043b1a..d4cf399d4a 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -82,6 +82,7 @@ typedef void (SocketReadStateFinalize)(SocketReadState *rs);
typedef void (NetAnnounce)(NetClientState *);
typedef bool (SetSteeringEBPF)(NetClientState *, int);
typedef bool (NetCheckPeerType)(NetClientState *, ObjectClass *, Error **);
+typedef bool (IsWaitIncoming)(NetClientState *);
typedef struct vhost_net *(GetVHostNet)(NetClientState *nc);
typedef struct NetClientInfo {
@@ -110,6 +111,7 @@ typedef struct NetClientInfo {
NetAnnounce *announce;
SetSteeringEBPF *set_steering_ebpf;
NetCheckPeerType *check_peer_type;
+ IsWaitIncoming *is_wait_incoming;
GetVHostNet *get_vhost_net;
const VMStateDescription *backend_vmsd;
} NetClientInfo;
--
2.52.0
* [PATCH v10 6/8] net/tap: support backend-transfer migration
2026-02-01 16:19 [PATCH v10 0/8] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (4 preceding siblings ...)
2026-02-01 16:19 ` [PATCH v10 5/8] virtio-net: support backend-transfer migration Vladimir Sementsov-Ogievskiy
@ 2026-02-01 16:19 ` Vladimir Sementsov-Ogievskiy
2026-02-04 16:46 ` Chaney, Ben
2026-02-01 16:19 ` [PATCH v10 7/8] tests/functional: add skipWithoutSudo() decorator Vladimir Sementsov-Ogievskiy
2026-02-01 16:20 ` [PATCH v10 8/8] tests/functional: add test_tap_migration Vladimir Sementsov-Ogievskiy
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:19 UTC (permalink / raw)
Support transferring TAP state (including the open fd) through the
migration stream.
Add a new option, incoming-fds, which should be set to true to
trigger the new logic.
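A reduced model of the new logic, for orientation only: incoming-fds
excludes every option that would open or configure a TAP device, the
backend then sits with fd == -1 until the stream arrives, and an fd is
accepted only in that waiting state. All names below are hypothetical
stand-ins; the real checks are in the diff that follows.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the TAP netdev options. */
struct tap_incoming_opts {
    bool incoming_fds;
    const char *fd, *fds, *helper, *script, *downscript; /* NULL: unset */
};

/* incoming-fds cannot be combined with options that set up a TAP. */
static bool tap_incoming_opts_valid(const struct tap_incoming_opts *o)
{
    if (!o->incoming_fds) {
        return true;
    }
    return !(o->fd || o->fds || o->helper || o->script || o->downscript);
}

/* Hypothetical, reduced stand-in for TAPState. */
struct tap_wait_state {
    int fd;     /* -1 while waiting for the incoming migration stream */
};

static bool tap_waits_incoming(const struct tap_wait_state *s)
{
    return s->fd == -1;
}

/* Accept an fd from the stream only if we are still waiting for one. */
static int tap_accept_incoming_fd(struct tap_wait_state *s, int fd)
{
    if (!tap_waits_incoming(s) || fd < 0) {
        return -1;
    }
    s->fd = fd;
    return 0;
}
```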
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
net/tap.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++-
qapi/net.json | 6 +++-
2 files changed, 92 insertions(+), 2 deletions(-)
diff --git a/net/tap.c b/net/tap.c
index bd19c71c42..57d4d2d9f8 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -393,6 +393,65 @@ static VHostNetState *tap_get_vhost_net(NetClientState *nc)
return s->vhost_net;
}
+static bool tap_is_wait_incoming(NetClientState *nc)
+{
+ TAPState *s = DO_UPCAST(TAPState, nc, nc);
+ assert(nc->info->type == NET_CLIENT_DRIVER_TAP);
+ return s->fd == -1;
+}
+
+static int tap_pre_load(void *opaque)
+{
+ TAPState *s = opaque;
+
+ if (s->fd != -1) {
+ error_report(
+ "TAP is already initialized and cannot receive incoming fd");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static bool tap_setup_vhost(TAPState *s, Error **errp);
+
+static int tap_post_load(void *opaque, int version_id)
+{
+ TAPState *s = opaque;
+ Error *local_err = NULL;
+
+ tap_read_poll(s, true);
+
+ if (s->fd < 0) {
+ return -1;
+ }
+
+ if (!tap_setup_vhost(s, &local_err)) {
+ error_prepend(&local_err,
+ "Failed to setup vhost during TAP post-load: ");
+ error_report_err(local_err);
+ return -1;
+ }
+
+ return 0;
+}
+
+static const VMStateDescription vmstate_tap = {
+ .name = "net-tap",
+ .pre_load = tap_pre_load,
+ .post_load = tap_post_load,
+ .fields = (const VMStateField[]) {
+ VMSTATE_FD(fd, TAPState),
+ VMSTATE_BOOL(using_vnet_hdr, TAPState),
+ VMSTATE_BOOL(has_ufo, TAPState),
+ VMSTATE_BOOL(has_uso, TAPState),
+ VMSTATE_BOOL(has_tunnel, TAPState),
+ VMSTATE_BOOL(enabled, TAPState),
+ VMSTATE_UINT32(host_vnet_hdr_len, TAPState),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
/* fd support */
static NetClientInfo net_tap_info = {
@@ -412,7 +471,9 @@ static NetClientInfo net_tap_info = {
.set_vnet_le = tap_set_vnet_le,
.set_vnet_be = tap_set_vnet_be,
.set_steering_ebpf = tap_set_steering_ebpf,
+ .is_wait_incoming = tap_is_wait_incoming,
.get_vhost_net = tap_get_vhost_net,
+ .backend_vmsd = &vmstate_tap,
};
static TAPState *net_tap_fd_init(NetClientState *peer,
@@ -907,6 +968,14 @@ int net_init_tap(const Netdev *netdev, const char *name,
return -1;
}
+ if (tap->incoming_fds &&
+ (tap->fd || tap->fds || tap->helper || tap->script ||
+ tap->downscript)) {
+ error_setg(errp, "incoming-fds is incompatible with "
+ "fd=, fds=, helper=, script=, downscript=");
+ return -1;
+ }
+
queues = tap_parse_fds_and_queues(tap, &fds, errp);
if (queues < 0) {
return -1;
@@ -925,7 +994,24 @@ int net_init_tap(const Netdev *netdev, const char *name,
goto fail;
}
- if (fds) {
+ if (tap->incoming_fds) {
+ for (i = 0; i < queues; i++) {
+ NetClientState *nc;
+ TAPState *s;
+
+ nc = qemu_new_net_client(&net_tap_info, peer, "tap", name);
+ qemu_set_info_str(nc, "incoming");
+
+ s = DO_UPCAST(TAPState, nc, nc);
+ s->fd = -1;
+ if (vhost_fds) {
+ s->vhostfd = vhost_fds[i];
+ s->vhost_busyloop_timeout = tap->has_poll_us ? tap->poll_us : 0;
+ } else {
+ s->vhostfd = -1;
+ }
+ }
+ } else if (fds) {
for (i = 0; i < queues; i++) {
if (i == 0) {
vnet_hdr = tap_probe_vnet_hdr(fds[i], errp);
diff --git a/qapi/net.json b/qapi/net.json
index 118bd34965..79f5ce9f43 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -355,6 +355,9 @@
# @poll-us: maximum number of microseconds that could be spent on busy
# polling for tap (since 2.7)
#
+# @incoming-fds: do not open/connect any resources, instead wait for
+# TAP state from incoming migration stream. (Since 11.0)
+#
# Since: 1.2
##
{ 'struct': 'NetdevTapOptions',
@@ -373,7 +376,8 @@
'*vhostfds': 'str',
'*vhostforce': 'bool',
'*queues': 'uint32',
- '*poll-us': 'uint32'} }
+ '*poll-us': 'uint32',
+ '*incoming-fds': 'bool' } }
##
# @NetdevSocketOptions:
--
2.52.0
* [PATCH v10 7/8] tests/functional: add skipWithoutSudo() decorator
2026-02-01 16:19 [PATCH v10 0/8] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (5 preceding siblings ...)
2026-02-01 16:19 ` [PATCH v10 6/8] net/tap: " Vladimir Sementsov-Ogievskiy
@ 2026-02-01 16:19 ` Vladimir Sementsov-Ogievskiy
2026-02-01 16:20 ` [PATCH v10 8/8] tests/functional: add test_tap_migration Vladimir Sementsov-Ogievskiy
7 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:19 UTC (permalink / raw)
To: jasowang, mst
Cc: pbonzini, berrange, thuth, armbru, eblake, farosas, peterx,
zhao1.liu, wangyanan55, philmd, marcel.apfelbaum, eduardo,
davydov-max, qemu-devel, vsementsov, yc-core, leiyang,
raphael.s.norwitz, bchaney
To be used in the next commit, which adds a test for TAP
networking that needs to set up a TAP device.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
---
tests/functional/qemu_test/decorators.py | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/tests/functional/qemu_test/decorators.py b/tests/functional/qemu_test/decorators.py
index fcf236ecfd..aa135acc78 100644
--- a/tests/functional/qemu_test/decorators.py
+++ b/tests/functional/qemu_test/decorators.py
@@ -6,6 +6,7 @@
import os
import platform
import resource
+import subprocess
from unittest import skipIf, skipUnless
from .cmd import which
@@ -177,3 +178,18 @@ def skipLockedMemoryTest(locked_memory):
ulimit_memory == resource.RLIM_INFINITY or ulimit_memory >= locked_memory * 1024,
f'Test required {locked_memory} kB of available locked memory',
)
+
+'''
+Decorator to skip execution of a test if passwordless
+sudo command is not available.
+'''
+def skipWithoutSudo():
+ proc = subprocess.run(["sudo", "-n", "/bin/true"],
+ stdin=subprocess.PIPE,
+ stdout=subprocess.PIPE,
+ stderr=subprocess.STDOUT,
+ universal_newlines=True,
+ check=False)
+
+ return skipUnless(proc.returncode == 0,
+ f'requires password-less sudo access: {proc.stdout}')
--
2.52.0
* [PATCH v10 8/8] tests/functional: add test_tap_migration
2026-02-01 16:19 [PATCH v10 0/8] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (6 preceding siblings ...)
2026-02-01 16:19 ` [PATCH v10 7/8] tests/functional: add skipWithoutSudo() decorator Vladimir Sementsov-Ogievskiy
@ 2026-02-01 16:20 ` Vladimir Sementsov-Ogievskiy
7 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-01 16:20 UTC (permalink / raw)
To: jasowang, mst
Cc: pbonzini, berrange, thuth, armbru, eblake, farosas, peterx,
zhao1.liu, wangyanan55, philmd, marcel.apfelbaum, eduardo,
davydov-max, qemu-devel, vsementsov, yc-core, leiyang,
raphael.s.norwitz, bchaney
Add a test for the new backend-transfer migration of virtio-net/TAP, with fd
passing through a UNIX socket.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
tests/functional/x86_64/meson.build | 1 +
tests/functional/x86_64/test_tap_migration.py | 401 ++++++++++++++++++
2 files changed, 402 insertions(+)
create mode 100644 tests/functional/x86_64/test_tap_migration.py
diff --git a/tests/functional/x86_64/meson.build b/tests/functional/x86_64/meson.build
index f78eec5e6c..d23b0cc727 100644
--- a/tests/functional/x86_64/meson.build
+++ b/tests/functional/x86_64/meson.build
@@ -36,4 +36,5 @@ tests_x86_64_system_thorough = [
'vfio_user_client',
'virtio_balloon',
'virtio_gpu',
+ 'tap_migration',
]
diff --git a/tests/functional/x86_64/test_tap_migration.py b/tests/functional/x86_64/test_tap_migration.py
new file mode 100644
index 0000000000..0263843fd0
--- /dev/null
+++ b/tests/functional/x86_64/test_tap_migration.py
@@ -0,0 +1,401 @@
+#!/usr/bin/env python3
+#
+# Functional test that tests TAP local migration
+# with fd passing
+#
+# Copyright (c) Yandex Technologies LLC, 2026
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+import os
+import time
+import subprocess
+from subprocess import run
+import signal
+from typing import Tuple
+
+from qemu_test import (
+ LinuxKernelTest,
+ Asset,
+ exec_command_and_wait_for_pattern,
+)
+from qemu_test.decorators import skipWithoutSudo
+
+GUEST_IP = "10.0.1.2"
+GUEST_IP_MASK = f"{GUEST_IP}/24"
+GUEST_MAC = "d6:0d:75:f8:0f:b7"
+HOST_IP = "10.0.1.1"
+HOST_IP_MASK = f"{HOST_IP}/24"
+TAP_ID = "tap0"
+TAP_ID2 = "tap1"
+TAP_MAC = "e6:1d:44:b5:03:5d"
+
+
+def ip(args, check=True) -> None:
+ """Run ip command with sudo"""
+ run(["sudo", "ip"] + args, check=check)
+
+
+def del_tap(tap_name: str = TAP_ID) -> None:
+ ip(["tuntap", "del", tap_name, "mode", "tap", "multi_queue"], check=False)
+
+
+def init_tap(tap_name: str = TAP_ID, with_ip: bool = True) -> None:
+ ip(["tuntap", "add", "dev", tap_name, "mode", "tap", "multi_queue"])
+ if with_ip:
+ ip(["link", "set", "dev", tap_name, "address", TAP_MAC])
+ ip(["addr", "add", HOST_IP_MASK, "dev", tap_name])
+ ip(["link", "set", tap_name, "up"])
+
+
+def switch_network_to_tap2() -> None:
+ ip(["link", "set", TAP_ID2, "down"])
+ ip(["link", "set", TAP_ID, "down"])
+ ip(["addr", "delete", HOST_IP_MASK, "dev", TAP_ID])
+ ip(["link", "set", "dev", TAP_ID2, "address", TAP_MAC])
+ ip(["addr", "add", HOST_IP_MASK, "dev", TAP_ID2])
+ ip(["link", "set", TAP_ID2, "up"])
+
+
+def parse_ping_line(line: str) -> float:
+ # expect lines like
+ # [1748524876.590509] 64 bytes from 94.245.155.3 \
+ # (94.245.155.3): icmp_seq=1 ttl=250 time=101 ms
+ spl = line.split()
+ return float(spl[0][1:-1])
+
+
+def parse_ping_output(out) -> Tuple[bool, float, float]:
+ lines = [x for x in out.split("\n") if x.startswith("[")]
+
+ try:
+ first_no_ans = next(
+ (ind for ind in range(len(lines)) if lines[ind][20:26] == "no ans")
+ )
+ except StopIteration:
+ return False, parse_ping_line(lines[0]), parse_ping_line(lines[-1])
+
+ last_no_ans = next(
+ ind
+ for ind in range(len(lines) - 1, -1, -1)
+ if lines[ind][20:26] == "no ans"
+ )
+
+ return (
+ True,
+ parse_ping_line(lines[first_no_ans]),
+ parse_ping_line(lines[last_no_ans]),
+ )
+
+
+def wait_migration_finish(source_vm, target_vm):
+ migr_events = (
+ ("MIGRATION", {"data": {"status": "completed"}}),
+ ("MIGRATION", {"data": {"status": "failed"}}),
+ )
+
+ source_e = source_vm.events_wait(migr_events)["data"]
+ target_e = target_vm.events_wait(migr_events)["data"]
+
+ source_s = source_vm.cmd("query-status")["status"]
+ target_s = target_vm.cmd("query-status")["status"]
+
+ assert (
+ source_e["status"] == "completed"
+ and target_e["status"] == "completed"
+ and source_s == "postmigrate"
+ and target_s == "paused"
+ ), f"""Migration failed:
+ SRC status: {source_s}
+ SRC event: {source_e}
+ TGT status: {target_s}
+ TGT event: {target_e}"""
+
+
+@skipWithoutSudo()
+class TAPFdMigration(LinuxKernelTest):
+
+ ASSET_KERNEL = Asset(
+ (
+ "https://archives.fedoraproject.org/pub/archive/fedora/linux/releases"
+ "/31/Server/x86_64/os/images/pxeboot/vmlinuz"
+ ),
+ "d4738d03dbbe083ca610d0821d0a8f1488bebbdccef54ce33e3adb35fda00129",
+ )
+
+ ASSET_INITRD = Asset(
+ (
+ "https://archives.fedoraproject.org/pub/archive/fedora/linux/releases"
+ "/31/Server/x86_64/os/images/pxeboot/initrd.img"
+ ),
+ "277cd6c7adf77c7e63d73bbb2cded8ef9e2d3a2f100000e92ff1f8396513cd8b",
+ )
+
+ ASSET_ALPINE_ISO = Asset(
+ (
+ "https://dl-cdn.alpinelinux.org/"
+ "alpine/v3.22/releases/x86_64/alpine-standard-3.22.1-x86_64.iso"
+ ),
+ "96d1b44ea1b8a5a884f193526d92edb4676054e9fa903ad2f016441a0fe13089",
+ )
+
+ def setUp(self):
+ super().setUp()
+
+ init_tap()
+
+ self.outer_ping_proc = None
+ self.shm_path = None
+
+ def tearDown(self):
+ try:
+ del_tap(TAP_ID)
+ del_tap(TAP_ID2)
+
+ if self.outer_ping_proc:
+ self.stop_outer_ping()
+
+ if self.shm_path:
+ os.unlink(self.shm_path)
+ finally:
+ super().tearDown()
+
+ def start_outer_ping(self) -> None:
+ assert self.outer_ping_proc is None
+ self.outer_ping_log = self.scratch_file("ping.log")
+ with open(self.outer_ping_log, "w") as f:
+ self.outer_ping_proc = subprocess.Popen(
+ ["ping", "-i", "0", "-O", "-D", GUEST_IP],
+ text=True,
+ stdout=f,
+ )
+
+ def stop_outer_ping(self) -> str:
+ assert self.outer_ping_proc
+ self.outer_ping_proc.send_signal(signal.SIGINT)
+
+ self.outer_ping_proc.communicate(timeout=5)
+ self.outer_ping_proc = None
+
+ with open(self.outer_ping_log) as f:
+ return f.read()
+
+ def stop_ping_and_check(self, stop_time, resume_time):
+ ping_res = self.stop_outer_ping()
+
+ discon, a, b = parse_ping_output(ping_res)
+
+ if not discon:
+ text = (
+ f"STOP: {stop_time}, RESUME: {resume_time}, PING: {a} - {b}"
+ )
+ if a > stop_time or b < resume_time:
+ self.fail(f"PING failed: {text}")
+ self.log.info(f"PING: no packets lost: {text}")
+ return
+
+ text = (
+ f"STOP: {stop_time}, RESUME: {resume_time}, "
+ f"PING: disconnect: {a} - {b}"
+ )
+ self.log.info(text)
+ eps = 0.01
+ if a < stop_time - eps or b > resume_time + eps:
+ self.fail(text)
+
+ def one_ping_from_guest(self, vm) -> None:
+ exec_command_and_wait_for_pattern(
+ self,
+ f"ping -c 1 -W 1 {HOST_IP}",
+ "1 packets transmitted, 1 packets received",
+ "1 packets transmitted, 0 packets received",
+ vm=vm,
+ )
+ self.wait_for_console_pattern("# ", vm=vm)
+
+ def one_ping_from_host(self) -> None:
+ run(["ping", "-c", "1", "-W", "1", GUEST_IP])
+
+ def setup_shared_memory(self):
+ self.shm_path = f"/dev/shm/qemu_test_{os.getpid()}"
+
+ try:
+ with open(self.shm_path, "wb") as f:
+ f.write(b"\0" * (1024 * 1024 * 1024)) # 1GB
+ except Exception as e:
+ self.fail(f"Failed to create shared memory file: {e}")
+
+ def prepare_and_launch_vm(
+ self, shm_path, vhost, incoming=False, vm=None, backend_transfer=True
+ ):
+ if not vm:
+ vm = self.vm
+
+ vm.set_console()
+ vm.add_args("-accel", "kvm")
+ vm.add_args("-device", "pcie-pci-bridge,id=pci.1,bus=pcie.0")
+ vm.add_args("-m", "1G")
+
+ vm.add_args(
+ "-object",
+ f"memory-backend-file,id=ram0,size=1G,mem-path={shm_path},share=on",
+ )
+ vm.add_args("-machine", "memory-backend=ram0")
+
+ vm.add_args(
+ "-drive",
+ f"file={self.ASSET_ALPINE_ISO.fetch()},media=cdrom,format=raw",
+ )
+
+ vm.add_args("-S")
+
+ if incoming:
+ vm.add_args("-incoming", "defer")
+
+ vm_s = "target" if incoming else "source"
+ self.log.info(f"Launching {vm_s} VM")
+ vm.launch()
+
+ if not backend_transfer:
+ tap_name = TAP_ID2 if incoming else TAP_ID
+ else:
+ tap_name = TAP_ID
+
+ self.add_virtio_net(vm, vhost, tap_name, backend_transfer, incoming)
+
+ self.set_migration_capabilities(vm, backend_transfer)
+
+ def add_virtio_net(self, vm, vhost: bool, tap_name: str,
+ backend_transfer: bool, incoming: bool):
+ incoming_fds = backend_transfer and incoming
+ netdev_params = {
+ "id": "netdev.1",
+ "vhost": vhost,
+ "type": "tap",
+ "ifname": tap_name,
+ "queues": 4,
+ "vnet_hdr": True,
+ "incoming-fds": incoming_fds,
+ }
+
+ if not incoming_fds:
+ netdev_params["script"] = "no"
+ netdev_params["downscript"] = "no"
+
+ vm.cmd("netdev_add", netdev_params)
+
+ vm.cmd(
+ "device_add",
+ driver="virtio-net-pci",
+ romfile="",
+ id="vnet.1",
+ netdev="netdev.1",
+ mq=True,
+ vectors=18,
+ bus="pci.1",
+ mac=GUEST_MAC,
+ disable_legacy="off",
+ backend_transfer=backend_transfer,
+ )
+
+ def set_migration_capabilities(self, vm, backend_transfer=True):
+ vm.cmd("migrate-set-capabilities", { "capabilities": [
+ {"capability": "events", "state": True},
+ {"capability": "x-ignore-shared", "state": True},
+ ]})
+ vm.cmd("migrate-set-parameters", {
+ "backend-transfer": backend_transfer
+ })
+
+ def setup_guest_network(self) -> None:
+ exec_command_and_wait_for_pattern(self, "ip addr", "# ")
+ exec_command_and_wait_for_pattern(
+ self,
+ f"ip addr add {GUEST_IP_MASK} dev eth0 && "
+ "ip link set eth0 up && echo OK",
+ "OK",
+ )
+ self.wait_for_console_pattern("# ")
+
+ def do_test_tap_fd_migration(self, vhost, backend_transfer=True):
+ self.require_accelerator("kvm")
+ self.set_machine("q35")
+
+ socket_dir = self.socket_dir()
+ migration_socket = os.path.join(socket_dir.name, "migration.sock")
+
+ self.setup_shared_memory()
+
+ # Setup second TAP if needed
+ if not backend_transfer:
+ del_tap(TAP_ID2)
+ init_tap(TAP_ID2, with_ip=False)
+
+ self.prepare_and_launch_vm(
+ self.shm_path, vhost, backend_transfer=backend_transfer
+ )
+ self.vm.cmd("cont")
+ self.wait_for_console_pattern("login:")
+ exec_command_and_wait_for_pattern(self, "root", "# ")
+
+ self.setup_guest_network()
+
+ self.one_ping_from_guest(self.vm)
+ self.one_ping_from_host()
+ self.start_outer_ping()
+
+ # Get some successful pings before migration
+ time.sleep(0.5)
+
+ target_vm = self.get_vm(name="target")
+ self.prepare_and_launch_vm(
+ self.shm_path,
+ vhost,
+ incoming=True,
+ vm=target_vm,
+ backend_transfer=backend_transfer,
+ )
+
+ target_vm.cmd("migrate-incoming", {"uri": f"unix:{migration_socket}"})
+
+ self.log.info("Starting migration")
+ freeze_start = time.time()
+ self.vm.cmd("migrate", {"uri": f"unix:{migration_socket}"})
+
+ self.log.info("Waiting for migration completion")
+ wait_migration_finish(self.vm, target_vm)
+
+ # Switch network to tap1 if not using backend transfer
+ if not backend_transfer:
+ switch_network_to_tap2()
+
+ target_vm.cmd("cont")
+ freeze_end = time.time()
+
+ self.vm.shutdown()
+
+ self.log.info("Verifying PING on target VM after migration")
+ self.one_ping_from_guest(target_vm)
+ self.one_ping_from_host()
+
+ # And a bit more pings after source shutdown
+ time.sleep(0.3)
+ self.stop_ping_and_check(freeze_start, freeze_end)
+
+ target_vm.shutdown()
+
+ def test_tap_fd_migration(self):
+ self.do_test_tap_fd_migration(False)
+
+ def test_tap_fd_migration_vhost(self):
+ self.do_test_tap_fd_migration(True)
+
+ def test_tap_new_tap_migration(self):
+ self.do_test_tap_fd_migration(False, backend_transfer=False)
+
+ def test_tap_new_tap_migration_vhost(self):
+ self.do_test_tap_fd_migration(True, backend_transfer=False)
+
+
+if __name__ == "__main__":
+ LinuxKernelTest.main()
--
2.52.0
* Re: [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-01 16:19 ` [PATCH v10 3/8] qapi: add backend-transfer migration parameter Vladimir Sementsov-Ogievskiy
@ 2026-02-04 13:08 ` Markus Armbruster
2026-02-04 17:21 ` Peter Xu
1 sibling, 0 replies; 21+ messages in thread
From: Markus Armbruster @ 2026-02-04 13:08 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, mst, pbonzini, berrange, thuth, eblake, farosas, peterx,
zhao1.liu, wangyanan55, philmd, marcel.apfelbaum, eduardo,
davydov-max, qemu-devel, yc-core, leiyang, raphael.s.norwitz,
bchaney
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
> We are going to implement backend-transfer feature: some devices
> will be able to transfer their backend through migration stream
> for local migration through UNIX domain socket. For example,
> virtio-net will migrate its attached TAP netdev, with all its
> connected file descriptors.
>
> In this commit we introduce a migration parameter, which enables
> the feature, for supporting devices (no one at the moment).
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[...]
> void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
> diff --git a/qapi/migration.json b/qapi/migration.json
> index f925e5541b..cbe88f0c91 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -828,7 +828,8 @@
> 'mode',
> 'zero-page-detection',
> 'direct-io',
> - 'cpr-exec-command'] }
> + 'cpr-exec-command',
> + 'backend-transfer'] }
>
> ##
> # @migrate-set-parameters:
> @@ -1004,6 +1005,13 @@
> # is @cpr-exec. The first list element is the program's filename,
> # the remainder its arguments. (Since 10.2)
> #
> +# @backend-transfer: Enable backend-transfer feature for devices that
> +# supports it. In general that means that backend state and its
s/supports/support/
> +# file descriptors are passed to the destination in the migraton
> +# channel (which must be a UNIX socket). Individual devices
> +# declare the support for backend-transfer by per-device
> +# backend-transfer option. (Since 11.0)
> +#
> # Features:
> #
> # @unstable: Members @x-checkpoint-delay and
> @@ -1043,7 +1051,8 @@
> '*mode': 'MigMode',
> '*zero-page-detection': 'ZeroPageDetection',
> '*direct-io': 'bool',
> - '*cpr-exec-command': [ 'str' ]} }
> + '*cpr-exec-command': [ 'str' ],
> + '*backend-transfer': 'bool' } }
>
> ##
> # @query-migrate-parameters:
With the grammar fix, QAPI schema
Acked-by: Markus Armbruster <armbru@redhat.com>
* Re: [PATCH v10 6/8] net/tap: support backend-transfer migration
2026-02-01 16:19 ` [PATCH v10 6/8] net/tap: " Vladimir Sementsov-Ogievskiy
@ 2026-02-04 16:46 ` Chaney, Ben
2026-02-05 8:12 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 21+ messages in thread
From: Chaney, Ben @ 2026-02-04 16:46 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, jasowang@redhat.com, mst@redhat.com
Cc: pbonzini@redhat.com, berrange@redhat.com, thuth@redhat.com,
armbru@redhat.com, eblake@redhat.com, farosas@suse.de,
peterx@redhat.com, zhao1.liu@intel.com, wangyanan55@huawei.com,
philmd@linaro.org, marcel.apfelbaum@gmail.com,
eduardo@habkost.net, davydov-max@yandex-team.ru,
qemu-devel@nongnu.org, yc-core@yandex-team.ru, leiyang@redhat.com,
raphael.s.norwitz@gmail.com
On 2/1/26, 11:21 AM, "Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru> wrote:
> + if (tap->incoming_fds &&
> + (tap->fd || tap->fds || tap->helper || tap->script ||
> + tap->downscript)) {
> + error_setg(errp, "incoming-fds is incompatible with "
> + "fd=, fds=, helper=, script=, downscript=");
> + return -1;
> + }
Is it possible to relax this constraint at all? If so, I
would prefer to allow script= and downscript= parameters
to remain in place.
Thanks,
Ben
* Re: [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-01 16:19 ` [PATCH v10 3/8] qapi: add backend-transfer migration parameter Vladimir Sementsov-Ogievskiy
2026-02-04 13:08 ` Markus Armbruster
@ 2026-02-04 17:21 ` Peter Xu
2026-02-05 7:07 ` Markus Armbruster
1 sibling, 1 reply; 21+ messages in thread
From: Peter Xu @ 2026-02-04 17:21 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, mst, pbonzini, berrange, thuth, armbru, eblake, farosas,
zhao1.liu, wangyanan55, philmd, marcel.apfelbaum, eduardo,
davydov-max, qemu-devel, yc-core, leiyang, raphael.s.norwitz,
bchaney
On Sun, Feb 01, 2026 at 07:19:55PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> # @migrate-set-parameters:
> @@ -1004,6 +1005,13 @@
> # is @cpr-exec. The first list element is the program's filename,
> # the remainder its arguments. (Since 10.2)
> #
> +# @backend-transfer: Enable backend-transfer feature for devices that
> +# supports it. In general that means that backend state and its
> +# file descriptors are passed to the destination in the migraton
> +# channel (which must be a UNIX socket). Individual devices
> +# declare the support for backend-transfer by per-device
> +# backend-transfer option. (Since 11.0)
I still think it would be nice to either have "local" in the name of the parameter
or at least document it in crystal-clear terms.
I used to suggest fd-passing, but maybe you wanted to emphasize that there's
more than fds to be migrated, at least for tap? Then it can still be
"local-backend-transfer", because nothing stops a device from transferring backend
state in a remote migration either.. so "backend-transfer" seems to also
work for remote migrations, but it does not.
Or at least mention it explicitly in the comment, saying this is a local
migration / upgrade.
If you could at least update the doc on the locality attribute for the
migration (one way or another..), feel free to take:
Acked-by: Peter Xu <peterx@redhat.com>
If you decide to rename, that's even better.
Thanks,
--
Peter Xu
* Re: [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-04 17:21 ` Peter Xu
@ 2026-02-05 7:07 ` Markus Armbruster
2026-02-05 8:06 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 21+ messages in thread
From: Markus Armbruster @ 2026-02-05 7:07 UTC (permalink / raw)
To: Peter Xu
Cc: Vladimir Sementsov-Ogievskiy, jasowang, mst, pbonzini, berrange,
thuth, armbru, eblake, farosas, zhao1.liu, wangyanan55, philmd,
marcel.apfelbaum, eduardo, davydov-max, qemu-devel, yc-core,
leiyang, raphael.s.norwitz, bchaney
Peter Xu <peterx@redhat.com> writes:
> On Sun, Feb 01, 2026 at 07:19:55PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> # @migrate-set-parameters:
>> @@ -1004,6 +1005,13 @@
>> # is @cpr-exec. The first list element is the program's filename,
>> # the remainder its arguments. (Since 10.2)
>> #
>> +# @backend-transfer: Enable backend-transfer feature for devices that
>> +# supports it. In general that means that backend state and its
>> +# file descriptors are passed to the destination in the migraton
>> +# channel (which must be a UNIX socket). Individual devices
>> +# declare the support for backend-transfer by per-device
>> +# backend-transfer option. (Since 11.0)
>
> I still think it'll be nice to either have "local" in the name of parameter
> or at least document it with crystal clear terms.
>
> I used to suggest fd-passing, but maybe you wanted to emphasize there's
> more than fds to be migrated at least for tap? Then it can still be
> "local-backend-transfer", because nobody stops a device to transfer backend
> states in a remote migration either.. so "backend-transfer" seems to also
> work for remote migrations, but it is not.
>
> Or at least mentioned explicitly in the comment, saying this is a local
> migration / upgrade.
Documenting the restriction is a must regardless of naming. The
proposed text does document it, but only indirectly: "(which must be a
UNIX socket)". Could be improved, I guess.
Capturing the restriction in the name might help. But it also makes the
name longer. Feels like a matter of taste to me. You guys decide :)
> If you could at least update the doc on the locality attribute for the
> migration (one way or another..), feel free to take:
>
> Acked-by: Peter Xu <peterx@redhat.com>
>
> If you decide to rename, that's even better.
>
> Thanks,
* Re: [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-05 7:07 ` Markus Armbruster
@ 2026-02-05 8:06 ` Vladimir Sementsov-Ogievskiy
2026-02-05 16:25 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-05 8:06 UTC (permalink / raw)
To: Markus Armbruster, Peter Xu
Cc: jasowang, mst, pbonzini, berrange, thuth, eblake, farosas,
zhao1.liu, wangyanan55, philmd, marcel.apfelbaum, eduardo,
davydov-max, qemu-devel, yc-core, leiyang, raphael.s.norwitz,
bchaney
On 05.02.26 10:07, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
>
>> On Sun, Feb 01, 2026 at 07:19:55PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>> # @migrate-set-parameters:
>>> @@ -1004,6 +1005,13 @@
>>> # is @cpr-exec. The first list element is the program's filename,
>>> # the remainder its arguments. (Since 10.2)
>>> #
>>> +# @backend-transfer: Enable backend-transfer feature for devices that
>>> +# supports it. In general that means that backend state and its
>>> +# file descriptors are passed to the destination in the migraton
>>> +# channel (which must be a UNIX socket). Individual devices
>>> +# declare the support for backend-transfer by per-device
>>> +# backend-transfer option. (Since 11.0)
>>
>> I still think it'll be nice to either have "local" in the name of parameter
>> or at least document it with crystal clear terms.
>>
>> I used to suggest fd-passing, but maybe you wanted to emphasize there's
>> more than fds to be migrated at least for tap?
For vhost-user-blk it's the same: not only FDs.
>> Then it can still be
>> "local-backend-transfer", because nobody stops a device to transfer backend
>> states in a remote migration either.. so "backend-transfer" seems to also
>> work for remote migrations, but it is not.
Hmm. I imagine a mechanism where the OS supports passing FDs to another host.
This needs the OS to migrate the corresponding kernel object
automatically. But theoretically I think it can be done transparently
for the userspace QEMU process, which will simply pass FDs to some special
socket, similar to a UNIX domain socket.
So, the key aspect is that we should be able to pass FDs to the migration
channel, which currently means that it must be a UNIX domain socket, and it
must be a local migration. But in the future that may change.
And yes, the name "backend-transfer" would also work for remote migration of a backend.
If we ever implement remote backend migration, why not
reuse "backend-transfer" for it? Even if there is no transparent
support from the OS, and we implement another mechanism, we may add
a new parameter
backend-transfer-mechanism = "scm-rights" | "something-other"
(or we can put this into "backend-transfer", supporting passing a string to
it and deprecating the boolean)
Moreover, this future "remote-backend-transfer" could be used for local
migration, so again, it should be called simply "backend-transfer"..
>>
>> Or at least mentioned explicitly in the comment, saying this is a local
>> migration / upgrade.
>
> Documenting the restriction is a must regardless of naming. The
> proposed text does document it, but only indirectly: "(which must be a
> UNIX socket)". Could be improved, I guess.
>
> Capturing the restriction in the name might help. But it also makes the
> name longer. Feels like a matter of taste to me. You guys decide :)
>
>> If you could at least update the doc on the locality attribute for the
>> migration (one way or another..), feel free to take:
>>
>> Acked-by: Peter Xu <peterx@redhat.com>
>>
>> If you decide to rename, that's even better.
>>
>> Thanks,
>
I'm OK with any name, but if you agree with my arguments above, I'd keep
the "backend-transfer" name, and update the spec:
@backend-transfer: Enable backend-transfer feature for devices that
support it. In general that means that backend state and its
file descriptors are passed to the destination in the migration
channel. Individual devices declare the support for
backend-transfer by per-device backend-transfer option.
The SCM_RIGHTS mechanism is used to pass FDs, so backend-transfer
requires the migration to be local and the channel to be a UNIX
domain socket. (Since 11.0)
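
For readers unfamiliar with the mechanism named in the proposed text, here is a minimal sketch of passing a file descriptor over a UNIX domain socket with SCM_RIGHTS. It is not taken from the series: the helper names `pass_fd_over_unix_socket` and `demo` are made up for illustration, a pipe stands in for a TAP fd, and both ends live in one process for brevity. It assumes Python 3.9 or later on a Unix host, where `socket.send_fds()` and `socket.recv_fds()` are available.

```python
import os
import socket


def pass_fd_over_unix_socket(fd: int) -> int:
    """Send fd across a UNIX socketpair and return the received duplicate."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # send_fds() attaches the descriptor as SCM_RIGHTS ancillary data;
        # one byte of ordinary payload is sent along with it.
        socket.send_fds(sender, [b"x"], [fd])
        # recv_fds() returns (data, fds, msg_flags, address).
        _data, fds, _flags, _addr = socket.recv_fds(receiver, 1, 1)
        return fds[0]
    finally:
        sender.close()
        receiver.close()


def demo() -> bool:
    """Check that the received fd refers to the same open pipe."""
    read_end, write_end = os.pipe()  # stand-in for a TAP fd (assumption)
    received = pass_fd_over_unix_socket(write_end)
    try:
        os.write(received, b"ping")
        same_object = os.read(read_end, 4) == b"ping"
        # The number differs, but the underlying kernel object is shared.
        return same_object and received != write_end
    finally:
        for fd in (read_end, write_end, received):
            os.close(fd)
```

The receiver gets a new descriptor number for the same kernel object, which is why this only works between processes on one host.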
--
Best regards,
Vladimir
* Re: [PATCH v10 6/8] net/tap: support backend-transfer migration
2026-02-04 16:46 ` Chaney, Ben
@ 2026-02-05 8:12 ` Vladimir Sementsov-Ogievskiy
2026-02-05 14:51 ` Chaney, Ben
0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-05 8:12 UTC (permalink / raw)
To: Chaney, Ben, jasowang@redhat.com, mst@redhat.com
Cc: pbonzini@redhat.com, berrange@redhat.com, thuth@redhat.com,
armbru@redhat.com, eblake@redhat.com, farosas@suse.de,
peterx@redhat.com, zhao1.liu@intel.com, wangyanan55@huawei.com,
philmd@linaro.org, marcel.apfelbaum@gmail.com,
eduardo@habkost.net, davydov-max@yandex-team.ru,
qemu-devel@nongnu.org, yc-core@yandex-team.ru, leiyang@redhat.com,
raphael.s.norwitz@gmail.com
On 04.02.26 19:46, Chaney, Ben wrote:
>
> On 2/1/26, 11:21 AM, "Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru <mailto:vsementsov@yandex-team.ru>> wrote:
>
>> + if (tap->incoming_fds &&
>> + (tap->fd || tap->fds || tap->helper || tap->script ||
>> + tap->downscript)) {
>> + error_setg(errp, "incoming-fds is incompatible with "
>> + "fd=, fds=, helper=, script=, downscript=");
>> + return -1;
>> + }
>
> Is it possible to relax this constraint at all? If so, I
> would prefer to allow script= and downscript= parameters
> to remain in place.
>
That's possible, but it requires some additional logic, I think.
What if migration fails? Who should call downscript? Migration
may succeed on the source and fail on the target. In this case, the
management tool usually resumes the stopped source, and the
source should again take responsibility for calling downscript.
So, I think it should be an additional patch on top, which introduces
support for script/downscript together with backend-transfer.
--
Best regards,
Vladimir
* Re: [PATCH v10 6/8] net/tap: support backend-transfer migration
2026-02-05 8:12 ` Vladimir Sementsov-Ogievskiy
@ 2026-02-05 14:51 ` Chaney, Ben
2026-02-06 9:00 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 21+ messages in thread
From: Chaney, Ben @ 2026-02-05 14:51 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, jasowang@redhat.com, mst@redhat.com
Cc: pbonzini@redhat.com, berrange@redhat.com, thuth@redhat.com,
armbru@redhat.com, eblake@redhat.com, farosas@suse.de,
peterx@redhat.com, zhao1.liu@intel.com, wangyanan55@huawei.com,
philmd@linaro.org, marcel.apfelbaum@gmail.com,
eduardo@habkost.net, davydov-max@yandex-team.ru,
qemu-devel@nongnu.org, yc-core@yandex-team.ru, leiyang@redhat.com,
raphael.s.norwitz@gmail.com
On 2/5/26, 3:12 AM, "Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru> wrote:
> On 04.02.26 19:46, Chaney, Ben wrote:
>> Is it possible to relax this constraint at all? If so, I
>> would prefer to allow script= and downscript= parameters
>> to remain in place.
>
> That possible, but this requires some additional logic I think.
>
> What if migration fails? Who should call downscript? Migration
> may be successful on source, and fail on target.. In this case,
> management tool usually resume stopped source. And in this case
> source should get again a responsibility to call downscript.
Hmm... If script and downscript are not set, they default to
/etc/qemu-ifup and /etc/qemu-ifdown. To disable them
altogether you must pass script=no,downscript=no, which
is not currently possible with this patch.
> So, I think it should be additional patch on top, which introduce
> support for script/downscript together with backend-transfer.
Would it be possible to at least support script=no,downscript=no as
part of this patch? We may want to require it because it sounds like
we don't have the logic to call qemu-ifup and qemu-ifdown correctly
in the event of a failure.
Thanks,
Ben
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-05 8:06 ` Vladimir Sementsov-Ogievskiy
@ 2026-02-05 16:25 ` Peter Xu
2026-02-06 8:56 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2026-02-05 16:25 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: Markus Armbruster, jasowang, mst, pbonzini, berrange, thuth,
eblake, farosas, zhao1.liu, wangyanan55, philmd, marcel.apfelbaum,
eduardo, davydov-max, qemu-devel, yc-core, leiyang,
raphael.s.norwitz, bchaney
On Thu, Feb 05, 2026 at 11:06:03AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 05.02.26 10:07, Markus Armbruster wrote:
> > Peter Xu <peterx@redhat.com> writes:
> >
> > > On Sun, Feb 01, 2026 at 07:19:55PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > # @migrate-set-parameters:
> > > > @@ -1004,6 +1005,13 @@
> > > > # is @cpr-exec. The first list element is the program's filename,
> > > > # the remainder its arguments. (Since 10.2)
> > > > #
> > > > +# @backend-transfer: Enable backend-transfer feature for devices that
> > > > +# supports it. In general that means that backend state and its
> > > > +# file descriptors are passed to the destination in the migraton
> > > > +# channel (which must be a UNIX socket). Individual devices
> > > > +# declare the support for backend-transfer by per-device
> > > > +# backend-transfer option. (Since 11.0)
> > >
> > > I still think it'll be nice to either have "local" in the name of parameter
> > > or at least document it with crystal clear terms.
> > >
> > > I used to suggest fd-passing, but maybe you wanted to emphasize there's
> > > more than fds to be migrated at least for tap?
>
> For vhost-user-blk it's the same: not only FDs.
>
> > > Then it can still be
> > > "local-backend-transfer", because nobody stops a device to transfer backend
> > > states in a remote migration either.. so "backend-transfer" seems to also
> > > work for remote migrations, but it is not.
>
> Hmm. I imagine a mechanism, where OS supports passing FDs to another host.
> This needs support for actually migrating the corresponding kernel object
> by OS automatically. But theoretically I think it can be done transparently
> for userspace QEMU process, which will simply pass FDs to the some special
> socket, similar to UNIX domain socket.
>
> So, the key aspect is that we should be able to pass FDs to the migration
> channel, which currently meant that it must be UNIX domain socket, and it
> must be local migration. But in future it may change.
That's a nice vision, but IMHO we shouldn't take it into account when
defining any QEMU interface while it's purely hypothetical, unless there
is solid work in progress, or at least ideas that have been proposed or
are known to be feasible.
>
> And yes, "backend-transfer" work for remote migration of backend.
> If we ever implement remote backend migration, why not to
> reuse "backend-transfer" for it? Even if there will not be transparent
> support from OS, and we'll implement another mechanics, we may add
> new parameter
>
>
> backend-transfer-mechanism = "scm-rights" | "something-other"
Yes, this will look much better. We likely shouldn't make it "scm-rights";
it should be a generic term that applies to all platforms, like "local", even
if the implication / implementation might be different on various
platforms.
That's also the major confusion I had when I was reading the other
vhost-user-blk series: I thought it was about local migration, but it is not.
I feel it is simply wrong to make one interface cover both, or
at least it shouldn't be a boolean, as you said, because it represents more
than one use case.
If it's a boolean, it also shouldn't rely on UNIX sockets if it was trying
to describe a remote migration, right? The vhost-user-blk way of
backend-migration doesn't require a UNIX socket, or does it?
Especially, if we still want your new proposal to work for CPR
too or even replace it some day (or a continuous set of proposals in the
future, from different developers based on this feature), we need to have a
solid and clear way to represent what CPR does, which is local fd
sharing. "backend-transfer: local" or something similar can be that.
>
> (or we can put this into "backend-transfer", supporting passing string to
> it and deprecating boolean)
It can be an enum, something like NONE, LOCAL, REMOTE. But before that..
>
> More over, this future "remote-backend-transfer" could be used for local
> migration, so again, it should be called simply "backend-transfer"..
Yes, REMOTE might be slightly misleading. And considering you seem to want
to allow any of the below to work:
(1) enable fd migrations only,
(2) enable remote migrations on backends only,
(3) enable both of (1)+(2)
Maybe we should have two different feature bits? The per-device one can be
kept as backend-transfer, however we need to change the global migration
knob to something describing a local migration.
In summary, still 1 new parameter for migration, 1 new parameter for
device, but adjust to:
- Migration parameter: "local", boolean, when set, the migration must be
a local migration within host (which requires UNIX sockets on Linux)
- Per-device parameter "backend-transfer", boolean, when set, device will
migrate backends when migration happens. Otherwise, backends are not
migrated; dest QEMU needs to re-initialize it. The backends may or may
not contain FDs.
When the backend device states contain FDs and FD migrations are
required, it requires "local" set first above, or it should fail the
migration when user requested backend-transfer=on.
When it doesn't contain FD at all (or FD migration is not a must?), it
should either migrate the backend or not depending on the user's
selection.
For tap (your series here), both knobs are required and need to be set to ON.
For vhost-user-blk, only the per-device knob needs to be set to ON; the other
one shouldn't matter.
Then when we want to replace cpr, we ask people to switch (cpr-transfer
only, keeping cpr-exec / cpr-reboot aside for now) from setting
mode=cpr-transfer to local=on, which hopefully will work as before.
The per-device parameter doesn't matter in this case.
Would this be more reasonable?
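As a rough illustration of the proposed semantics, the two knobs could be modelled like this in Python. This is a sketch of the proposal only, not QEMU code; the names Outcome, decide and needs_fd_passing are invented for the example.

```python
from enum import Enum


class Outcome(Enum):
    TRANSFER = "backend state is migrated"
    REINIT = "destination re-initializes the backend"
    FAIL = "migration fails"


def decide(local: bool, backend_transfer: bool,
           needs_fd_passing: bool) -> Outcome:
    """Hypothetical model of the proposal: 'local' is the migration
    parameter, 'backend_transfer' is the per-device parameter."""
    if not backend_transfer:
        # The user did not ask for backend migration on this device.
        return Outcome.REINIT
    if needs_fd_passing and not local:
        # FDs can only be passed within one host.
        return Outcome.FAIL
    return Outcome.TRANSFER
```

Under this model tap is the needs_fd_passing=True case, while a backend that carries no FDs is transferred regardless of "local".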
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-05 16:25 ` Peter Xu
@ 2026-02-06 8:56 ` Vladimir Sementsov-Ogievskiy
2026-02-06 16:08 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-06 8:56 UTC (permalink / raw)
To: Peter Xu
Cc: Markus Armbruster, jasowang, mst, pbonzini, berrange, thuth,
eblake, farosas, zhao1.liu, wangyanan55, philmd, marcel.apfelbaum,
eduardo, davydov-max, qemu-devel, yc-core, leiyang,
raphael.s.norwitz, bchaney
On 05.02.26 19:25, Peter Xu wrote:
> On Thu, Feb 05, 2026 at 11:06:03AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 05.02.26 10:07, Markus Armbruster wrote:
>>> Peter Xu <peterx@redhat.com> writes:
>>>
>>>> On Sun, Feb 01, 2026 at 07:19:55PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>>> # @migrate-set-parameters:
>>>>> @@ -1004,6 +1005,13 @@
>>>>> # is @cpr-exec. The first list element is the program's filename,
>>>>> # the remainder its arguments. (Since 10.2)
>>>>> #
>>>>> +# @backend-transfer: Enable backend-transfer feature for devices that
>>>>> +# supports it. In general that means that backend state and its
>>>>> +# file descriptors are passed to the destination in the migraton
>>>>> +# channel (which must be a UNIX socket). Individual devices
>>>>> +# declare the support for backend-transfer by per-device
>>>>> +# backend-transfer option. (Since 11.0)
>>>>
>>>> I still think it'll be nice to either have "local" in the name of parameter
>>>> or at least document it with crystal clear terms.
>>>>
>>>> I used to suggest fd-passing, but maybe you wanted to emphasize there's
>>>> more than fds to be migrated at least for tap?
>>
>> For vhost-user-blk it's the same: not only FDs.
>>
>>>> Then it can still be
>>>> "local-backend-transfer", because nobody stops a device to transfer backend
>>>> states in a remote migration either.. so "backend-transfer" seems to also
>>>> work for remote migrations, but it is not.
>>
>> Hmm. I imagine a mechanism, where OS supports passing FDs to another host.
>> This needs support for actually migrating the corresponding kernel object
>> by OS automatically. But theoretically I think it can be done transparently
>> for userspace QEMU process, which will simply pass FDs to the some special
>> socket, similar to UNIX domain socket.
>>
>> So, the key aspect is that we should be able to pass FDs to the migration
>> channel, which currently meant that it must be UNIX domain socket, and it
>> must be local migration. But in future it may change.
>
> That's a nice vision, but IMHO we shouldn't take it into account when
> defining any QEMU interface, when it's only about pure imaginations..
> unless there is solid work in progress, or ideas proposed / known feasible
> at least.
>
>>
>> And yes, "backend-transfer" work for remote migration of backend.
>> If we ever implement remote backend migration, why not to
>> reuse "backend-transfer" for it? Even if there will not be transparent
>> support from OS, and we'll implement another mechanics, we may add
>> new parameter
>>
>>
>> backend-transfer-mechanism = "scm-rights" | "something-other"
>
> Yes, this will look much better. We likely shouldn't make it "scm-rights",
> it should be generic terms that applies to all platforms like "local", even
> if the implication / implementation might be different on various
> platforms.
>
> That's also the major confusion I got when I was reading the other
> vhost-user-blk series, thought it was a local migration but not.
>
> I feel like the interface is simply wrong to make it one covering both, or
> at least it shouldn't be a boolean as you said because it represents more
> than one use case.
>
> If it's a boolean, it also shouldn't rely on UNIX sockets if it was trying
> to describe a remote migration, right? The vhost-usr-blk way of
> backend-migration doesn't require UNIX socket, or does it?
It does require a UNIX socket too.
>
> Especially, if we still want to have your new proposal try to work for CPR
> too or even replace it some day (or a continuous set of proposals in the
> future, from different developers based on this feature), we need to have a
> solid and clear way represents what CPR does, which is to do local fd
> sharing. "backend-transfer: local" or something similar can be that.
>
>>
>> (or we can put this into "backend-transfer", supporting passing string to
>> it and deprecating boolean)
>
> It can be a enum, something like NONE, LOCAL, REMOTE. But before that..
>
>>
>> More over, this future "remote-backend-transfer" could be used for local
>> migration, so again, it should be called simply "backend-transfer"..
>
> Yes, REMOTE might be slightly misleading. And considering you seem to want
> to allow any of below to work:
>
> (1) enable fd migrations only,
> (2) enable remote migrations on backends only,
> (3) enable both of (1)+(2)
>
> Maybe we should have two different feature bits? The per-device one can be
> kept as backend-transfer, however we need to change the global migration
> knob to something describing a local migration.
>
> In summary, still 1 new parameter for migration, 1 new parameter for
> device, but adjust to:
>
> - Migration parameter: "local", boolean, when set, the migration must be
> a local migration within host (which requires UNIX sockets on Linux)
>
> - Per-device parameter "backend-transfer", boolean, when set, device will
> migrate backends when migration happens. Otherwise, backends are not
> migrated; dest QEMU needs to re-initialize it. The backends may or may
> not contain FDs.
>
> When the backend device states contain FDs and FD migrations are
> required, it requires "local" set first above, or it should fail the
> migration when user requested backend-transfer=on.
>
> When it doesn't contain FD at all (or FD migration is not a must?), it
> should either migrate the backend or not depending on the user's
> selection.
>
> For tap (your series here), you need to set both ON and required.
>
> For vhost-usr-blk, that only needs to set per-device knob to ON, the other
> one shouldn't matter.
>
> Then when we want to replace cpr, we request people switch (cpr-transfer
> only, keeping cpr-exec / cpr-reboot aside for now) from setting
> mode=cpr-transfer to local=on, which hopefully will start work as before.
> The per-device parameter doesn't matter in this case.
>
> Would this be more reasonable?
>
Hmm. So, with backend-transfer=on on the device and the local migration parameter set to false, it fails?
But this way we'll have to set backend-transfer to on/off before any migration (local or
remote) on all devices with the help of qom-set. That's not convenient.
The original idea was that backend-transfer is done for the device when both the migration
parameter and the device option are set to true. This way, before the migration (local or
remote) we only have to set the appropriate migration parameters. And the per-device
backend-transfer options can be set up once (and the same way) when starting QEMU, or
they may be inherited from the machine type. And with such logic, it's good to have
similar names for the migration parameter and the device option.
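A tiny Python model of that logic, only as a sketch (not QEMU code; the device ids and the function name are invented for illustration):

```python
def devices_to_transfer(migration_param: bool, device_options: dict) -> list:
    """Hypothetical model: a backend is transferred only when the
    migration parameter AND the device's own option are both true."""
    return sorted(name for name, enabled in device_options.items()
                  if migration_param and enabled)


# Per-device options are fixed once, at QEMU start (made-up device ids).
device_options = {"net0": True, "net1": False}

# Only the migration parameter changes between migrations.
local_migration = devices_to_transfer(True, device_options)
remote_migration = devices_to_transfer(False, device_options)
```

Here the management software never touches the device options again; it flips one migration parameter depending on whether the migration is local or remote.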
Considering all this, could we keep the logic as is (in this patch), but rename the
backend-transfer parameter to local-backend-transfer, as you proposed before?
Or turn it into "backend-transfer" = "local" | "off" (but IMHO it's too optimistic:
who knows whether we will really add anything to this enum in the future? I don't have
such plans)
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v10 6/8] net/tap: support backend-transfer migration
2026-02-05 14:51 ` Chaney, Ben
@ 2026-02-06 9:00 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-06 9:00 UTC (permalink / raw)
To: Chaney, Ben, jasowang@redhat.com, mst@redhat.com
Cc: pbonzini@redhat.com, berrange@redhat.com, thuth@redhat.com,
armbru@redhat.com, eblake@redhat.com, farosas@suse.de,
peterx@redhat.com, zhao1.liu@intel.com, wangyanan55@huawei.com,
philmd@linaro.org, marcel.apfelbaum@gmail.com,
eduardo@habkost.net, davydov-max@yandex-team.ru,
qemu-devel@nongnu.org, yc-core@yandex-team.ru, leiyang@redhat.com,
raphael.s.norwitz@gmail.com
On 05.02.26 17:51, Chaney, Ben wrote:
>
> On 2/5/26, 3:12 AM, "Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru> wrote:
>
> On 04.02.26 19:46, Chaney, Ben wrote:
>>
>>
>> Is it possible to relax this constraint at all? If so, I
>> would prefer to allow script= and downscript= parameters
>> to remain in place.
>>
>>
>> That possible, but this requires some additional logic I think.
>>
>>
>> What if migration fails? Who should call downscript? Migration
>> may be successful on source, and fail on target.. In this case,
>> management tool usually resume stopped source. And in this case
>> source should get again a responsibility to call downscript.
>
> Hmm... If script and downscript are not set, they default to
> /etc/qemu-ifup and /etc/qemu-ifdown. To disable them
> altogether you must pass script=no,downscript=no, which
> is not currently possible with this patch.
Yes, but when the new option is set, we don't use any scripts, just as
with the other options which don't allow use of script/downscript
(fd=, fds=, helper=).
But I see now that it was the wrong decision.
>
>> So, I think it should be additional patch on top, which introduce
>> support for script/downscript together with backend-transfer.
>
> Would it be possible to at least support script=no,downscript=no as
> part of this patch? We may want to require it because it sounds like
> we don't have the logic to call qemu-ifup and qemu-ifdown correctly
> in the event of a failure.
>
Agreed, that's correct. So, for this patch I should require script
and downscript to be present and set to "no", so that full support
for them can be added later.
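Roughly, the check would behave like the following Python model. It is only a sketch of the intended rule, not the actual patch; the function name and the error text are invented.

```python
def check_scripts_for_backend_transfer(script=None, downscript=None):
    """Hypothetical validation: both options must be given explicitly
    as "no". Returns an error string, or None when the options are OK."""
    for name, value in (("script", script), ("downscript", downscript)):
        if value != "no":
            # Covers both an omitted option (None) and a real script path.
            return f"{name}=no must be given explicitly (got {value!r})"
    return None
```

Rejecting the omitted case as well keeps the door open: if script support is added later, an omitted option can get its usual default meaning without changing the behaviour of existing command lines.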
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-06 8:56 ` Vladimir Sementsov-Ogievskiy
@ 2026-02-06 16:08 ` Peter Xu
2026-02-06 20:37 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2026-02-06 16:08 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: Markus Armbruster, jasowang, mst, pbonzini, berrange, thuth,
eblake, farosas, zhao1.liu, wangyanan55, philmd, marcel.apfelbaum,
eduardo, davydov-max, qemu-devel, yc-core, leiyang,
raphael.s.norwitz, bchaney
On Fri, Feb 06, 2026 at 11:56:27AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 05.02.26 19:25, Peter Xu wrote:
> > On Thu, Feb 05, 2026 at 11:06:03AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > On 05.02.26 10:07, Markus Armbruster wrote:
> > > > Peter Xu <peterx@redhat.com> writes:
> > > >
> > > > > On Sun, Feb 01, 2026 at 07:19:55PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > > > # @migrate-set-parameters:
> > > > > > @@ -1004,6 +1005,13 @@
> > > > > > # is @cpr-exec. The first list element is the program's filename,
> > > > > > # the remainder its arguments. (Since 10.2)
> > > > > > #
> > > > > > +# @backend-transfer: Enable backend-transfer feature for devices that
> > > > > > +# supports it. In general that means that backend state and its
> > > > > > +# file descriptors are passed to the destination in the migraton
> > > > > > +# channel (which must be a UNIX socket). Individual devices
> > > > > > +# declare the support for backend-transfer by per-device
> > > > > > +# backend-transfer option. (Since 11.0)
> > > > >
> > > > > I still think it'll be nice to either have "local" in the name of parameter
> > > > > or at least document it with crystal clear terms.
> > > > >
> > > > > I used to suggest fd-passing, but maybe you wanted to emphasize there's
> > > > > more than fds to be migrated at least for tap?
> > >
> > > For vhost-user-blk it's the same: not only FDs.
> > >
> > > > > Then it can still be
> > > > > "local-backend-transfer", because nobody stops a device to transfer backend
> > > > > states in a remote migration either.. so "backend-transfer" seems to also
> > > > > work for remote migrations, but it is not.
> > >
> > > Hmm. I imagine a mechanism, where OS supports passing FDs to another host.
> > > This needs support for actually migrating the corresponding kernel object
> > > by OS automatically. But theoretically I think it can be done transparently
> > > for userspace QEMU process, which will simply pass FDs to the some special
> > > socket, similar to UNIX domain socket.
> > >
> > > So, the key aspect is that we should be able to pass FDs to the migration
> > > channel, which currently meant that it must be UNIX domain socket, and it
> > > must be local migration. But in future it may change.
> >
> > That's a nice vision, but IMHO we shouldn't take it into account when
> > defining any QEMU interface, when it's only about pure imaginations..
> > unless there is solid work in progress, or ideas proposed / known feasible
> > at least.
> >
> > >
> > > And yes, "backend-transfer" work for remote migration of backend.
> > > If we ever implement remote backend migration, why not to
> > > reuse "backend-transfer" for it? Even if there will not be transparent
> > > support from OS, and we'll implement another mechanics, we may add
> > > new parameter
> > >
> > >
> > > backend-transfer-mechanism = "scm-rights" | "something-other"
> >
> > Yes, this will look much better. We likely shouldn't make it "scm-rights",
> > it should be generic terms that applies to all platforms like "local", even
> > if the implication / implementation might be different on various
> > platforms.
> >
> > That's also the major confusion I got when I was reading the other
> > vhost-user-blk series, thought it was a local migration but not.
> >
> > I feel like the interface is simply wrong to make it one covering both, or
> > at least it shouldn't be a boolean as you said because it represents more
> > than one use case.
> >
> > If it's a boolean, it also shouldn't rely on UNIX sockets if it was trying
> > to describe a remote migration, right? The vhost-usr-blk way of
> > backend-migration doesn't require UNIX socket, or does it?
>
> It does require UNIX socket too.
I'm lost once more.. :( Could you share what requires the UNIX socket for
the other work here?
https://lore.kernel.org/all/20260115081103.655749-1-dtalexundeer@yandex-team.ru/#r
There's indeed the inflight->fd, but it's not migrated; it is allocated before
taking the inflight buffer. I don't see how that requires a UNIX socket.
>
> >
> > Especially, if we still want to have your new proposal try to work for CPR
> > too or even replace it some day (or a continuous set of proposals in the
> > future, from different developers based on this feature), we need to have a
> > solid and clear way represents what CPR does, which is to do local fd
> > sharing. "backend-transfer: local" or something similar can be that.
> >
> > >
> > > (or we can put this into "backend-transfer", supporting passing string to
> > > it and deprecating boolean)
> >
> > It can be a enum, something like NONE, LOCAL, REMOTE. But before that..
> >
> > >
> > > More over, this future "remote-backend-transfer" could be used for local
> > > migration, so again, it should be called simply "backend-transfer"..
> >
> > Yes, REMOTE might be slightly misleading. And considering you seem to want
> > to allow any of below to work:
> >
> > (1) enable fd migrations only,
> > (2) enable remote migrations on backends only,
> > (3) enable both of (1)+(2)
> >
> > Maybe we should have two different feature bits? The per-device one can be
> > kept as backend-transfer, however we need to change the global migration
> > knob to something describing a local migration.
> >
> > In summary, still 1 new parameter for migration, 1 new parameter for
> > device, but adjust to:
> >
> > - Migration parameter: "local", boolean, when set, the migration must be
> > a local migration within host (which requires UNIX sockets on Linux)
> >
> > - Per-device parameter "backend-transfer", boolean, when set, device will
> > migrate backends when migration happens. Otherwise, backends are not
> > migrated; dest QEMU needs to re-initialize it. The backends may or may
> > not contain FDs.
> >
> > When the backend device states contain FDs and FD migrations are
> > required, it requires "local" set first above, or it should fail the
> > migration when user requested backend-transfer=on.
> >
> > When it doesn't contain FD at all (or FD migration is not a must?), it
> > should either migrate the backend or not depending on the user's
> > selection.
> >
> > For tap (your series here), you need to set both ON and required.
> >
> > For vhost-usr-blk, that only needs to set per-device knob to ON, the other
> > one shouldn't matter.
> >
> > Then when we want to replace cpr, we request people switch (cpr-transfer
> > only, keeping cpr-exec / cpr-reboot aside for now) from setting
> > mode=cpr-transfer to local=on, which hopefully will start work as before.
> > The per-device parameter doesn't matter in this case.
> >
> > Would this be more reasonable?
> >
>
> Hmm. So, with backend-transfer=on on device and local mig parameter set to false, it fails?
>
> But this way we'll have to set backend-transfer to on/off before any migration (local or
> remote) on all devices with help of set-qom. That's not comfortable.
Personally, as long as we can separate the two use cases with the two knobs
properly, it will look good to me. Indeed, such a conflict doesn't
strictly need to be a failure.
E.g. we can also define this case (local=off, backend-transfer=on) the
other way round if failing is not wanted; that is, allow migration to
happen but skip the part of backend transfer that requires the locality.
Fundamentally, we should accept two kinds of backend-transfer impl:
- When it is supported regardless of local=on/off. I believe that's
vhost-user-blk's case (but I'll now need to double check with you again
above on UNIX dependency). Then this only relies on the per-dev knob.
- When it is supported only if local=on (this series). This part is
where we can define the behavior of whether we fail the migration on
local=off, or we skip the feature instead.
So I think we can choose to skip it for the latter. It should almost be
the same logic as what you have done in this patchset, afaict, besides the
rename and re-definition of the migration knob.
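To spell out the "skip instead of fail" variant, here is a small Python model. It is a sketch only, not QEMU code; the function and parameter names are invented.

```python
def transfers_backend(local: bool, backend_transfer: bool,
                      requires_local: bool) -> bool:
    """Hypothetical model of the 'skip' variant: return True when the
    backend is migrated. Migration itself never fails on this check."""
    if not backend_transfer:
        return False          # per-device knob is off
    if requires_local and not local:
        return False          # feature silently skipped, migration goes on
    return True
```

With requires_local=True (this series) the result is the same as "migration knob AND device knob", which is why the logic of the patchset can mostly stay as it is.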
>
> The original idea was that backend-transfer is done for the device when both migration
> parameter and device option are set to true. This way before the migration (local or
> remote) we only have to set appropriate migration parameters. And backend-transfer
> per-device options can be setup once (and the same way) when starting the QEMU, or
> they may be inherited from Machine Type. And with such logic, it's good to have
> similar names for migration parameter and device option.
I hope the above will solve this problem. IIUC what you described should work
if we tweak the new proposal for the local=off & backend-transfer=on case.
>
> Considering all this, could we keep the logic as is (in this patch), but rename
> backend-transfer parameter to local-backend-transfer, as you proposed before?
> Or turn it into "backend-transfer" = "local" | "off" (but IMHO it's too optimistic:
> who knows, will we really add something into this enum in the future? I don't have
> such plans)
IMHO "local" would be nicer because it's very simple, generic and clear on
its own. It almost says "requires UNIX sockets" on Linux and it also opens
the door for this parameter to be reused without a backend: for
example, when some frontend or any-not-trivially-a-backend also wants to
migrate an FD in the future. I wouldn't be surprised to see it coming.
But let's finish the above discussion and see if we can get on the same page.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v10 3/8] qapi: add backend-transfer migration parameter
2026-02-06 16:08 ` Peter Xu
@ 2026-02-06 20:37 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2026-02-06 20:37 UTC (permalink / raw)
To: Peter Xu
Cc: Markus Armbruster, jasowang, mst, pbonzini, berrange, thuth,
eblake, farosas, zhao1.liu, wangyanan55, philmd, marcel.apfelbaum,
eduardo, davydov-max, qemu-devel, yc-core, leiyang,
raphael.s.norwitz, bchaney
On 06.02.26 19:08, Peter Xu wrote:
> On Fri, Feb 06, 2026 at 11:56:27AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 05.02.26 19:25, Peter Xu wrote:
>>> On Thu, Feb 05, 2026 at 11:06:03AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>> On 05.02.26 10:07, Markus Armbruster wrote:
>>>>> Peter Xu <peterx@redhat.com> writes:
>>>>>
>>>>>> On Sun, Feb 01, 2026 at 07:19:55PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> # @migrate-set-parameters:
>>>>>>> @@ -1004,6 +1005,13 @@
>>>>>>> # is @cpr-exec. The first list element is the program's filename,
>>>>>>> # the remainder its arguments. (Since 10.2)
>>>>>>> #
>>>>>>> +# @backend-transfer: Enable backend-transfer feature for devices that
>>>>>>> +# supports it. In general that means that backend state and its
>>>>>>> +# file descriptors are passed to the destination in the migraton
>>>>>>> +# channel (which must be a UNIX socket). Individual devices
>>>>>>> +# declare the support for backend-transfer by per-device
>>>>>>> +# backend-transfer option. (Since 11.0)
>>>>>>
>>>>>> I still think it'll be nice to either have "local" in the name of parameter
>>>>>> or at least document it with crystal clear terms.
>>>>>>
>>>>>> I used to suggest fd-passing, but maybe you wanted to emphasize there's
>>>>>> more than fds to be migrated at least for tap?
>>>>
>>>> For vhost-user-blk it's the same: not only FDs.
>>>>
>>>>>> Then it can still be
>>>>>> "local-backend-transfer", because nobody stops a device to transfer backend
>>>>>> states in a remote migration either.. so "backend-transfer" seems to also
>>>>>> work for remote migrations, but it is not.
>>>>
>>>> Hmm. I imagine a mechanism, where OS supports passing FDs to another host.
>>>> This needs support for actually migrating the corresponding kernel object
>>>> by OS automatically. But theoretically I think it can be done transparently
>>>> for userspace QEMU process, which will simply pass FDs to the some special
>>>> socket, similar to UNIX domain socket.
>>>>
>>>> So, the key aspect is that we should be able to pass FDs to the migration
>>>> channel, which currently meant that it must be UNIX domain socket, and it
>>>> must be local migration. But in future it may change.
>>>
>>> That's a nice vision, but IMHO we shouldn't take it into account when
>>> defining any QEMU interface, when it's only about pure imaginations..
>>> unless there is solid work in progress, or ideas proposed / known feasible
>>> at least.
>>>
>>>>
>>>> And yes, "backend-transfer" work for remote migration of backend.
>>>> If we ever implement remote backend migration, why not to
>>>> reuse "backend-transfer" for it? Even if there will not be transparent
>>>> support from OS, and we'll implement another mechanics, we may add
>>>> new parameter
>>>>
>>>>
>>>> backend-transfer-mechanism = "scm-rights" | "something-other"
>>>
>>> Yes, this will look much better. We likely shouldn't make it "scm-rights",
>>> it should be generic terms that applies to all platforms like "local", even
>>> if the implication / implementation might be different on various
>>> platforms.
>>>
>>> That's also the major confusion I got when I was reading the other
>>> vhost-user-blk series, thought it was a local migration but not.
>>>
>>> I feel like the interface is simply wrong to make it one covering both, or
>>> at least it shouldn't be a boolean as you said because it represents more
>>> than one use case.
>>>
>>> If it's a boolean, it also shouldn't rely on UNIX sockets if it was trying
>>> to describe a remote migration, right? The vhost-usr-blk way of
>>> backend-migration doesn't require UNIX socket, or does it?
>>
>> It does require UNIX socket too.
>
> I'm lost once more.. :( Could you share what requires the UNIX socket for
> the other work here?
>
> https://lore.kernel.org/all/20260115081103.655749-1-dtalexundeer@yandex-team.ru/#r
Ah sorry, I thought we were talking about my series
"[PATCH v2 00/25] vhost-user-blk: live-backend local migration"
Of course, Alexander's series doesn't need a UNIX socket.
>
> There's indeed the inflight->fd, but it's not migrated but allocated before
> taking the inflight buffer. I don't see how it requires UNIX socket.
It doesn't. But it doesn't transfer "the whole backend", only the inflight
region.
>
>>
>>>
>>> Especially, if we still want to have your new proposal try to work for CPR
>>> too or even replace it some day (or a continuous set of proposals in the
>>> future, from different developers based on this feature), we need to have a
>>> solid and clear way represents what CPR does, which is to do local fd
>>> sharing. "backend-transfer: local" or something similar can be that.
>>>
>>>>
>>>> (or we can put this into "backend-transfer", supporting passing string to
>>>> it and deprecating boolean)
>>>
>>> It can be a enum, something like NONE, LOCAL, REMOTE. But before that..
>>>
>>>>
>>>> More over, this future "remote-backend-transfer" could be used for local
>>>> migration, so again, it should be called simply "backend-transfer"..
>>>
>>> Yes, REMOTE might be slightly misleading. And considering you seem to want
>>> to allow any of the below to work:
>>>
>>> (1) enable fd migrations only,
>>> (2) enable remote migrations on backends only,
>>> (3) enable both of (1)+(2)
>>>
>>> Maybe we should have two different feature bits? The per-device one can be
>>> kept as backend-transfer, however we need to change the global migration
>>> knob to something describing a local migration.
>>>
>>> In summary, still 1 new parameter for migration, 1 new parameter for
>>> device, but adjust to:
>>>
>>> - Migration parameter: "local", boolean, when set, the migration must be
>>> a local migration within host (which requires UNIX sockets on Linux)
>>>
>>> - Per-device parameter "backend-transfer", boolean, when set, device will
>>> migrate backends when migration happens. Otherwise, backends are not
>>> migrated; dest QEMU needs to re-initialize it. The backends may or may
>>> not contain FDs.
>>>
>>> When the backend device states contain FDs and FD migrations are
>>> required, it requires "local" set first above, or it should fail the
>>> migration when user requested backend-transfer=on.
>>>
>>> When it doesn't contain FD at all (or FD migration is not a must?), it
>>> should either migrate the backend or not depending on the user's
>>> selection.
>>>
>>> For tap (your series here), both knobs are required and must be set to ON.
>>>
>>> For vhost-user-blk, only the per-device knob needs to be set to ON; the
>>> other one shouldn't matter.
>>>
>>> Then when we want to replace cpr, we ask people to switch (cpr-transfer
>>> only, keeping cpr-exec / cpr-reboot aside for now) from setting
>>> mode=cpr-transfer to local=on, which hopefully will work as before.
>>> The per-device parameter doesn't matter in this case.
>>>
>>> Would this be more reasonable?
>>>
>>
>> Hmm. So, with backend-transfer=on on device and local mig parameter set to false, it fails?
>>
>> But this way we'll have to set backend-transfer to on/off before any migration (local or
>> remote) on all devices with the help of qom-set. That's not convenient.
>
> Personally as long as we can separate the two use cases with the two knobs
> properly, then it will look good to me. It doesn't need to be strictly a
> failure on such conflicts indeed.
>
> E.g. we can also define this case (local=off, backend-transfer=on) the
> other way round if failing is not wanted; that is, allow migration to
> happen but skip the part of backend transfer that requires the locality.
>
> Fundamentally, we should accept two kinds of backend-transfer impl:
>
> - When it is supported regardless of local=on/off. I believe that's
> vhost-user-blk's case (but I'll now need to double check with you again
> above on UNIX dependency). Then this only relies on the per-dev knob.
>
> - When it is supported only if local=on (this series). This part is
> where we can define the behavior of whether we fail the migration on
> local=off, or we skip the feature instead.
>
> So I think we can choose to skip it for the latter. It should almost be
> the same logic as what you have done in this patchset, afaict, besides the
> rename and re-definition of the migration knob.
>
>>
>> The original idea was that backend-transfer is done for the device when both migration
>> parameter and device option are set to true. This way before the migration (local or
>> remote) we only have to set appropriate migration parameters. And backend-transfer
>> per-device options can be set up once (and the same way) when starting QEMU, or
>> they may be inherited from Machine Type. And with such logic, it's good to have
>> similar names for migration parameter and device option.
>
> I hope the above will solve this problem. IIUC what you described should work
> if we tweak the new proposal on the local=off & backend-transfer=on case.
>
>>
>> Considering all this, could we keep the logic as is (in this patch), but rename
>> backend-transfer parameter to local-backend-transfer, as you proposed before?
>> Or turn it into "backend-transfer" = "local" | "off" (but IMHO it's too optimistic:
>> who knows, will we really add something into this enum in the future? I don't have
>> such plans)
>
> IMHO "local" would be nicer because it's very simple, generic and clear on
> its own. It almost says "requires UNIX sockets" on Linux and it also opens
> the door for this parameter to be reused even without a backend: for
> example, when some frontend or any-not-trivially-a-backend also wants to
> migrate an FD in the future. I wouldn't be surprised to see it coming.
>
> But let's finish the above discussion and see if we can get on the same page.
>
So finally, rename to "local", and keep the logic as is, right? OK for me, will do.
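
To double-check we mean the same thing, here is the logic as I understand
it, as a small Python sketch. It is only a model of the semantics, not QEMU
code: the function name is made up, "local" is the renamed migration
parameter, and "backend_transfer" is the per-device option.

```python
def backend_is_transferred(local: bool, backend_transfer: bool) -> bool:
    """Model of the agreed logic (hypothetical helper, not a QEMU function).

    The backend, including its open fds, is migrated only when both the
    migration parameter and the per-device option are on. In every other
    combination the migration still proceeds and the destination
    re-initializes the backend itself.
    """
    return local and backend_transfer


# Print the full truth table for the two knobs.
for local in (False, True):
    for dev_opt in (False, True):
        if backend_is_transferred(local, dev_opt):
            action = "transfer backend with fds"
        else:
            action = "skip transfer, destination re-initializes backend"
        print(f"local={local} backend-transfer={dev_opt}: {action}")
```

With this, the per-device option can be set once at QEMU start (or come from
the machine type), and only the migration parameter changes between local
and remote migrations.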
--
Best regards,
Vladimir
end of thread, other threads:[~2026-02-06 20:38 UTC | newest]
Thread overview: 21+ messages
2026-02-01 16:19 [PATCH v10 0/8] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 1/8] net/tap: move vhost-net open() calls to tap_parse_vhost_fds() Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 2/8] net/tap: move vhost initialization to tap_setup_vhost() Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 3/8] qapi: add backend-transfer migration parameter Vladimir Sementsov-Ogievskiy
2026-02-04 13:08 ` Markus Armbruster
2026-02-04 17:21 ` Peter Xu
2026-02-05 7:07 ` Markus Armbruster
2026-02-05 8:06 ` Vladimir Sementsov-Ogievskiy
2026-02-05 16:25 ` Peter Xu
2026-02-06 8:56 ` Vladimir Sementsov-Ogievskiy
2026-02-06 16:08 ` Peter Xu
2026-02-06 20:37 ` Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 4/8] net: introduce vmstate_net_peer_backend Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 5/8] virtio-net: support backend-transfer migration Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 6/8] net/tap: " Vladimir Sementsov-Ogievskiy
2026-02-04 16:46 ` Chaney, Ben
2026-02-05 8:12 ` Vladimir Sementsov-Ogievskiy
2026-02-05 14:51 ` Chaney, Ben
2026-02-06 9:00 ` Vladimir Sementsov-Ogievskiy
2026-02-01 16:19 ` [PATCH v10 7/8] tests/functional: add skipWithoutSudo() decorator Vladimir Sementsov-Ogievskiy
2026-02-01 16:20 ` [PATCH v10 8/8] tests/functional: add test_tap_migration Vladimir Sementsov-Ogievskiy