* [PATCH v3 0/9] virtio-net: live-TAP local migration
@ 2025-09-05 13:50 Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 1/9] net/tap: add some trace points Vladimir Sementsov-Ogievskiy
` (8 more replies)
0 siblings, 9 replies; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
Hi all!
Here is a new migration capability "local-tap", which
allows local migration of TAP device, including its properties
and open fds.
With this new option, management software doesn't need to
initialize new TAP and do a switch to it. Nothing should be
done around virtio-net in local migration: it just migrates
and continues to use same TAP device. So we avoid extra logic
in management software, extra allocations in kenel (for new TAP),
and corresponding extra delay in migration downtime.
Note that patch 07 is reused from
[PATCH 00/33] vhost-user-blk: live-backend local migration
to not create extra dependency on a big series.
v3:
- drop tap_dump_packet (actually we already have qemu_hexdump,
which is even called in network code (under ifdef).
- rework save/load code in tap.c to VMSD
test:
- avoid using shell
- use sudo, and add skipUnlesPasswordlessSudo decorator
- don't keep extra open fd for ping.log
- use scratch file for ping.log (not hardcoded /tmp/ping.log)
Based on [PATCH v3 00/19] TAP initialization refactoring, or in other
words:
Based-on: <20250903124934.1169899-1-vsementsov@yandex-team.ru>
Vladimir Sementsov-Ogievskiy (9):
net/tap: add some trace points
net/tap: keep exit notifier only when downscript set
net/tap: refactor net_tap_setup_vhost()
qapi: add interface for local TAP migration
net/tap: implement interfaces for local migration
virtio-net: support local tap migration
tests/functional: exec_command_and_wait_for_pattern: add vm arg
tests/functional: add skipUnlessPasswordlessSudo() decorator
tests/functional: add test_x86_64_tap_fd_migration
hw/net/virtio-net.c | 100 ++++-
include/hw/virtio/virtio-net.h | 2 +
include/net/tap.h | 4 +
migration/options.c | 7 +
migration/options.h | 1 +
net/tap.c | 279 ++++++++++----
net/trace-events | 6 +
qapi/migration.json | 9 +-
qapi/net.json | 12 +-
tests/functional/qemu_test/cmd.py | 7 +-
tests/functional/qemu_test/decorators.py | 16 +
.../test_x86_64_tap_fd_migration.py | 345 ++++++++++++++++++
12 files changed, 703 insertions(+), 85 deletions(-)
create mode 100644 tests/functional/test_x86_64_tap_fd_migration.py
--
2.48.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v3 1/9] net/tap: add some trace points
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 2/9] net/tap: keep exit notifier only when downscript set Vladimir Sementsov-Ogievskiy
` (7 subsequent siblings)
8 siblings, 0 replies; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
Add trace points to simplify debugging migration.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
net/tap.c | 10 ++++++++++
net/trace-events | 6 ++++++
2 files changed, 16 insertions(+)
diff --git a/net/tap.c b/net/tap.c
index 2530627a9a..151fb73820 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -43,6 +43,7 @@
#include "qemu/main-loop.h"
#include "qemu/sockets.h"
#include "hw/virtio/vhost.h"
+#include "trace.h"
#include "net/tap.h"
@@ -148,6 +149,9 @@ static ssize_t tap_receive_iov(NetClientState *nc, const struct iovec *iov,
g_autofree struct iovec *iov_copy = NULL;
struct virtio_net_hdr hdr = { };
+ trace_tap_receive_iov(s->using_vnet_hdr, s->host_vnet_hdr_len, iovcnt,
+ iov->iov_len);
+
if (s->host_vnet_hdr_len && !s->using_vnet_hdr) {
iov_copy = g_new(struct iovec, iovcnt + 1);
iov_copy[0].iov_base = &hdr;
@@ -199,6 +203,8 @@ static void tap_send(void *opaque)
break;
}
+ trace_tap_send_packet(s->using_vnet_hdr, s->host_vnet_hdr_len, size);
+
if (s->host_vnet_hdr_len && size <= s->host_vnet_hdr_len) {
/* Invalid packet */
break;
@@ -980,6 +986,8 @@ int tap_enable(NetClientState *nc)
TAPState *s = DO_UPCAST(TAPState, nc, nc);
int ret;
+ trace_tap_enable();
+
if (s->enabled) {
return 0;
} else {
@@ -997,6 +1005,8 @@ int tap_disable(NetClientState *nc)
TAPState *s = DO_UPCAST(TAPState, nc, nc);
int ret;
+ trace_tap_disable();
+
if (s->enabled == 0) {
return 0;
} else {
diff --git a/net/trace-events b/net/trace-events
index cda960f42b..a8921763ab 100644
--- a/net/trace-events
+++ b/net/trace-events
@@ -29,3 +29,9 @@ vhost_vdpa_set_address_space_id(void *v, unsigned vq_group, unsigned asid_num) "
vhost_vdpa_net_load_cmd(void *s, uint8_t class, uint8_t cmd, int data_num, int data_size) "vdpa state: %p class: %u cmd: %u sg_num: %d size: %d"
vhost_vdpa_net_load_cmd_retval(void *s, uint8_t class, uint8_t cmd, int r) "vdpa state: %p class: %u cmd: %u retval: %d"
vhost_vdpa_net_load_mq(void *s, int ncurqps) "vdpa state: %p current_qpairs: %d"
+
+# tap.c
+tap_receive_iov(bool using_vnet_hdr, uint32_t host_vnet_hdr_len, int iovcnt, size_t iov_len) "using_vnet_hdr:%d host_vnet_hdr_len:%u iovcnt:%d iov_len:%zu"
+tap_send_packet(bool using_vnet_hdr, uint32_t host_vnet_hdr_len, int size) "using_vnet_hdr:%d host_vnet_hdr_len:%u size:%d"
+tap_enable(void) ""
+tap_disable(void) ""
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 2/9] net/tap: keep exit notifier only when downscript set
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 1/9] net/tap: add some trace points Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 3/9] net/tap: refactor net_tap_setup_vhost() Vladimir Sementsov-Ogievskiy
` (6 subsequent siblings)
8 siblings, 0 replies; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
The notifier is used only to call the downscript. Avoid extra
notifier for other cases.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
net/tap.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/tap.c b/net/tap.c
index 151fb73820..4f3512a831 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -336,8 +336,10 @@ static void tap_cleanup(NetClientState *nc)
qemu_purge_queued_packets(nc);
- tap_exit_notify(&s->exit, NULL);
- qemu_remove_exit_notifier(&s->exit);
+ if (s->exit.notify) {
+ tap_exit_notify(&s->exit, NULL);
+ qemu_remove_exit_notifier(&s->exit);
+ }
tap_read_poll(s, false);
tap_write_poll(s, false);
@@ -727,9 +729,7 @@ static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
}
tap_read_poll(s, true);
s->vhost_net = NULL;
-
- s->exit.notify = tap_exit_notify;
- qemu_add_exit_notifier(&s->exit);
+ s->exit.notify = NULL;
if (netdev->type == NET_CLIENT_DRIVER_BRIDGE) {
const NetdevBridgeOptions *bridge = &netdev->u.bridge;
@@ -757,6 +757,8 @@ static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
snprintf(s->down_script, sizeof(s->down_script), "%s", downscript);
snprintf(s->down_script_arg, sizeof(s->down_script_arg),
"%s", ifname);
+ s->exit.notify = tap_exit_notify;
+ qemu_add_exit_notifier(&s->exit);
}
}
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 3/9] net/tap: refactor net_tap_setup_vhost()
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 1/9] net/tap: add some trace points Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 2/9] net/tap: keep exit notifier only when downscript set Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 4/9] qapi: add interface for local TAP migration Vladimir Sementsov-Ogievskiy
` (5 subsequent siblings)
8 siblings, 0 replies; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
1. Move function higher. (we anyway want to split it into two functions,
and change the code a lot, so take an opportunity to get rid of extra
declaration).
2. Split into _set_options() and _init_vhost(): we'll need it later
to implement TAP local migration feature.
3. Split requires to store options somewhere, and to be able to call
_init_vhost() later (from migriation _load() in later commit), store
options in the TAPState.
4. So, take an opportunity to zero-initialize options:
- be safe (avoid uninitialized options)
- don't care to initialize half of the options to zero by hand
5. Also, don't worry too much about poll_us: absent option should be
zero-initialized anyway.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
net/tap.c | 124 +++++++++++++++++++++++++++++-------------------------
1 file changed, 67 insertions(+), 57 deletions(-)
diff --git a/net/tap.c b/net/tap.c
index 4f3512a831..cf7f704a92 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -78,6 +78,7 @@ typedef struct TAPState {
bool has_ufo;
bool has_uso;
bool enabled;
+ VhostNetOptions *vhost_options;
VHostNetState *vhost_net;
unsigned host_vnet_hdr_len;
Notifier exit;
@@ -95,8 +96,6 @@ static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
const char *downscript,
const char *vhostfdname,
int vnet_hdr, int fd, Error **errp);
-static int net_tap_setup_vhost(TAPState *s, const NetdevTapOptions *tap,
- const char *vhostfdname, Error **errp);
static void tap_update_fd_handler(TAPState *s)
{
@@ -334,6 +333,8 @@ static void tap_cleanup(NetClientState *nc)
s->vhost_net = NULL;
}
+ g_free(s->vhost_options);
+
qemu_purge_queued_packets(nc);
if (s->exit.notify) {
@@ -697,6 +698,64 @@ static int net_tap_open_one(const Netdev *netdev,
#define MAX_TAP_QUEUES 1024
+static int net_tap_set_vhost_options(TAPState *s, const NetdevTapOptions *tap,
+ const char *vhostfdname, Error **errp)
+{
+ int vhostfd;
+ bool vhost_on = tap->has_vhost ? tap->vhost :
+ vhostfdname || (tap->has_vhostforce && tap->vhostforce);
+
+ if (!vhost_on) {
+ return 0;
+ }
+
+ if (vhostfdname) {
+ vhostfd = monitor_fd_param(monitor_cur(), vhostfdname, errp);
+ if (vhostfd == -1) {
+ return -1;
+ }
+ } else {
+ vhostfd = open("/dev/vhost-net", O_RDWR);
+ if (vhostfd < 0) {
+ error_setg_errno(errp, errno,
+ "tap: open vhost char device failed");
+ return -1;
+ }
+ }
+
+ if (!qemu_set_blocking(vhostfd, false, errp)) {
+ return -1;
+ }
+
+ s->vhost_options = g_new(VhostNetOptions, 1);
+ *s->vhost_options = (VhostNetOptions) {
+ .backend_type = VHOST_BACKEND_TYPE_KERNEL,
+ .net_backend = &s->nc,
+ .busyloop_timeout = tap->poll_us,
+ .opaque = (void *)(uintptr_t)vhostfd,
+ .nvqs = 2,
+ .feature_bits = kernel_feature_bits,
+ };
+
+ return 0;
+}
+
+static int net_tap_init_vhost(TAPState *s, Error **errp)
+{
+ if (!s->vhost_options) {
+ return 0;
+ }
+
+ s->vhost_net = vhost_net_init(s->vhost_options);
+ if (!s->vhost_net) {
+ error_setg(errp,
+ "vhost-net requested but could not be initialized");
+ return -1;
+ }
+
+ return 0;
+}
+
static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
const char *model, const char *name,
const char *ifname, const char *script,
@@ -762,7 +821,12 @@ static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
}
}
- ret = net_tap_setup_vhost(s, tap, vhostfdname, errp);
+ ret = net_tap_set_vhost_options(s, tap, vhostfdname, errp);
+ if (ret < 0) {
+ goto failed;
+ }
+
+ ret = net_tap_init_vhost(s, errp);
if (ret < 0) {
goto failed;
}
@@ -774,60 +838,6 @@ failed:
return -1;
}
-static int net_tap_setup_vhost(TAPState *s, const NetdevTapOptions *tap,
- const char *vhostfdname, Error **errp)
-{
- if (tap->has_vhost ? tap->vhost :
- vhostfdname || (tap->has_vhostforce && tap->vhostforce)) {
- VhostNetOptions options;
- int vhostfd;
-
- options.backend_type = VHOST_BACKEND_TYPE_KERNEL;
- options.net_backend = &s->nc;
- if (tap->has_poll_us) {
- options.busyloop_timeout = tap->poll_us;
- } else {
- options.busyloop_timeout = 0;
- }
-
- if (vhostfdname) {
- vhostfd = monitor_fd_param(monitor_cur(), vhostfdname, errp);
- if (vhostfd == -1) {
- return -1;
- }
- if (!qemu_set_blocking(vhostfd, false, errp)) {
- return -1;
- }
- } else {
- vhostfd = open("/dev/vhost-net", O_RDWR);
- if (vhostfd < 0) {
- error_setg_errno(errp, errno,
- "tap: open vhost char device failed");
- return -1;
- }
- if (!qemu_set_blocking(vhostfd, false, errp)) {
- return -1;
- }
- }
- options.opaque = (void *)(uintptr_t)vhostfd;
- options.nvqs = 2;
- options.feature_bits = kernel_feature_bits;
- options.get_acked_features = NULL;
- options.save_acked_features = NULL;
- options.max_tx_queue_size = 0;
- options.is_vhost_user = false;
-
- s->vhost_net = vhost_net_init(&options);
- if (!s->vhost_net) {
- error_setg(errp,
- "vhost-net requested but could not be initialized");
- return -1;
- }
- }
-
- return 0;
-}
-
static int net_tap_from_monitor_fd(const Netdev *netdev, NetClientState *peer,
const char *name, const char *vhostfdname,
int *pvnet_hdr, const char *fdname,
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 4/9] qapi: add interface for local TAP migration
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (2 preceding siblings ...)
2025-09-05 13:50 ` [PATCH v3 3/9] net/tap: refactor net_tap_setup_vhost() Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-08 15:35 ` Peter Xu
2025-09-05 13:50 ` [PATCH v3 5/9] net/tap: implement interfaces for local migration Vladimir Sementsov-Ogievskiy
` (4 subsequent siblings)
8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
To migrate TAP device (including open fds) locally, user should:
1. enable local-tap migration capability both on source and target
2. use additional local-incoming=true option for tap on target
Why capability is not enough? We need an option to modify early
initialization of TAP, to avoid opening new fds. The capability
may not be set at the moment of netdev initialization.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
migration/options.c | 7 +++++++
migration/options.h | 1 +
qapi/migration.json | 9 ++++++++-
qapi/net.json | 12 +++++++++++-
4 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/migration/options.c b/migration/options.c
index 4e923a2e07..6f5af52a5c 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -262,6 +262,13 @@ bool migrate_mapped_ram(void)
return s->capabilities[MIGRATION_CAPABILITY_MAPPED_RAM];
}
+bool migrate_local_tap(void)
+{
+ MigrationState *s = migrate_get_current();
+
+ return s->capabilities[MIGRATION_CAPABILITY_LOCAL_TAP];
+}
+
bool migrate_ignore_shared(void)
{
MigrationState *s = migrate_get_current();
diff --git a/migration/options.h b/migration/options.h
index 82d839709e..08c0606072 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -30,6 +30,7 @@ bool migrate_colo(void);
bool migrate_dirty_bitmaps(void);
bool migrate_events(void);
bool migrate_mapped_ram(void);
+bool migrate_local_tap(void);
bool migrate_ignore_shared(void);
bool migrate_late_block_activate(void);
bool migrate_multifd(void);
diff --git a/qapi/migration.json b/qapi/migration.json
index 2387c21e9c..992a5b1e2b 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -517,6 +517,12 @@
# each RAM page. Requires a migration URI that supports seeking,
# such as a file. (since 9.0)
#
+# @local-tap: Migrate TAPs locally, keeping backend alive. Open file
+# descriptors and TAP-related state are migrated. Only may be
+# used when migration channel is unix socket. For target device
+# also @local-incoming option must be specified (since 10.2)
+# (since 10.2)
+#
# Features:
#
# @unstable: Members @x-colo and @x-ignore-shared are experimental.
@@ -536,7 +542,8 @@
{ 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
'validate-uuid', 'background-snapshot',
'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
- 'dirty-limit', 'mapped-ram'] }
+ 'dirty-limit', 'mapped-ram',
+ { 'name': 'local-tap', 'features': [ 'unstable' ] } ] }
##
# @MigrationCapabilityStatus:
diff --git a/qapi/net.json b/qapi/net.json
index 78bcc9871e..8f53549d58 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -353,6 +353,15 @@
# @poll-us: maximum number of microseconds that could be spent on busy
# polling for tap (since 2.7)
#
+# @local-incoming: Do load open file descriptor for that TAP
+# on incoming migration. May be used only if QEMU is started
+# for incoming migration. Will work only together with local-tap
+# migration capability enabled (default: false) (Since: 10.2)
+#
+# Features:
+#
+# @unstable: Member @local-incoming is experimental
+#
# Since: 1.2
##
{ 'struct': 'NetdevTapOptions',
@@ -371,7 +380,8 @@
'*vhostfds': 'str',
'*vhostforce': 'bool',
'*queues': 'uint32',
- '*poll-us': 'uint32'} }
+ '*poll-us': 'uint32',
+ '*local-incoming': { 'type': 'bool', 'features': [ 'unstable' ] } } }
##
# @NetdevSocketOptions:
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 5/9] net/tap: implement interfaces for local migration
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (3 preceding siblings ...)
2025-09-05 13:50 ` [PATCH v3 4/9] qapi: add interface for local TAP migration Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-08 15:42 ` Peter Xu
2025-09-05 13:50 ` [PATCH v3 6/9] virtio-net: support local tap migration Vladimir Sementsov-Ogievskiy
` (3 subsequent siblings)
8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
Handle local-incoming option:
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
include/net/tap.h | 4 ++
net/tap.c | 139 +++++++++++++++++++++++++++++++++++++++-------
2 files changed, 122 insertions(+), 21 deletions(-)
diff --git a/include/net/tap.h b/include/net/tap.h
index 6f34f13eae..3ef2e2dbae 100644
--- a/include/net/tap.h
+++ b/include/net/tap.h
@@ -30,7 +30,11 @@
int tap_enable(NetClientState *nc);
int tap_disable(NetClientState *nc);
+bool tap_local_incoming(NetClientState *nc);
int tap_get_fd(NetClientState *nc);
+int tap_load(NetClientState *nc, QEMUFile *f);
+int tap_save(NetClientState *nc, QEMUFile *f);
+
#endif /* QEMU_NET_TAP_H */
diff --git a/net/tap.c b/net/tap.c
index cf7f704a92..0fd7a7671c 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -35,6 +35,7 @@
#include "net/eth.h"
#include "net/net.h"
#include "clients.h"
+#include "migration/vmstate.h"
#include "monitor/monitor.h"
#include "system/system.h"
#include "qapi/error.h"
@@ -44,6 +45,7 @@
#include "qemu/sockets.h"
#include "hw/virtio/vhost.h"
#include "trace.h"
+#include "system/runstate.h"
#include "net/tap.h"
@@ -82,6 +84,7 @@ typedef struct TAPState {
VHostNetState *vhost_net;
unsigned host_vnet_hdr_len;
Notifier exit;
+ bool local_incoming;
} TAPState;
static void launch_script(const char *setup_script, const char *ifname,
@@ -756,6 +759,43 @@ static int net_tap_init_vhost(TAPState *s, Error **errp)
return 0;
}
+static int tap_post_load(void *opaque, int version_id)
+{
+ TAPState *s = opaque;
+
+ tap_read_poll(s, true);
+
+ return net_tap_init_vhost(s, NULL);
+}
+
+static const VMStateDescription vmstate_tap = {
+ .name = "virtio-net-device",
+ .post_load = tap_post_load,
+ .fields = (const VMStateField[]) {
+ VMSTATE_FD(fd, TAPState),
+ VMSTATE_BOOL(using_vnet_hdr, TAPState),
+ VMSTATE_BOOL(has_ufo, TAPState),
+ VMSTATE_BOOL(has_uso, TAPState),
+ VMSTATE_BOOL(enabled, TAPState),
+ VMSTATE_UINT32(host_vnet_hdr_len, TAPState),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
+int tap_save(NetClientState *nc, QEMUFile *f)
+{
+ TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+ return vmstate_save_state(f, &vmstate_tap, s, 0);
+}
+
+int tap_load(NetClientState *nc, QEMUFile *f)
+{
+ TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+ return vmstate_load_state(f, &vmstate_tap, s, 0);
+}
+
static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
const char *model, const char *name,
const char *ifname, const char *script,
@@ -763,30 +803,40 @@ static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
const char *vhostfdname,
int vnet_hdr, int fd, Error **errp)
{
- const NetdevTapOptions *tap;
+ const NetdevTapOptions *tap = NULL;
int ret;
NetClientState *nc;
TAPState *s;
+ bool local_incoming = false;
+
+ if (netdev->type == NET_CLIENT_DRIVER_TAP) {
+ tap = &netdev->u.tap;
+ local_incoming = tap->local_incoming;
+ }
nc = qemu_new_net_client(&net_tap_info, peer, model, name);
s = DO_UPCAST(TAPState, nc, nc);
-
- s->fd = fd;
- s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
- s->using_vnet_hdr = false;
- s->has_ufo = tap_probe_has_ufo(s->fd);
- s->has_uso = tap_probe_has_uso(s->fd);
- s->enabled = true;
- tap_set_offload(&s->nc, 0, 0, 0, 0, 0, 0, 0);
- /*
- * Make sure host header length is set correctly in tap:
- * it might have been modified by another instance of qemu.
- */
- if (vnet_hdr) {
- tap_fd_set_vnet_hdr_len(s->fd, s->host_vnet_hdr_len);
+ s->local_incoming = local_incoming;
+
+ if (!local_incoming) {
+ s->fd = fd;
+ s->host_vnet_hdr_len = vnet_hdr ? sizeof(struct virtio_net_hdr) : 0;
+ s->using_vnet_hdr = false;
+ s->has_ufo = tap_probe_has_ufo(s->fd);
+ s->has_uso = tap_probe_has_uso(s->fd);
+ s->enabled = true;
+ tap_set_offload(&s->nc, 0, 0, 0, 0, 0, 0, 0);
+ /*
+ * Make sure host header length is set correctly in tap:
+ * it might have been modified by another instance of qemu.
+ */
+ if (vnet_hdr) {
+ tap_fd_set_vnet_hdr_len(s->fd, s->host_vnet_hdr_len);
+ }
+ tap_read_poll(s, true);
}
- tap_read_poll(s, true);
+
s->vhost_net = NULL;
s->exit.notify = NULL;
@@ -798,9 +848,8 @@ static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
}
assert(netdev->type == NET_CLIENT_DRIVER_TAP);
- tap = &netdev->u.tap;
- if (tap_set_sndbuf(s->fd, tap, errp) < 0) {
+ if (!local_incoming && tap_set_sndbuf(s->fd, tap, errp) < 0) {
goto failed;
}
@@ -826,9 +875,11 @@ static int net_tap_fd_init_common(const Netdev *netdev, NetClientState *peer,
goto failed;
}
- ret = net_tap_init_vhost(s, errp);
- if (ret < 0) {
- goto failed;
+ if (!local_incoming) {
+ ret = net_tap_init_vhost(s, errp);
+ if (ret < 0) {
+ goto failed;
+ }
}
return 0;
@@ -895,6 +946,38 @@ static int net_tap_open(const Netdev *netdev,
return 0;
}
+static int net_init_local_incoming(const Netdev *netdev,
+ const char *name,
+ NetClientState *peer,
+ Error **errp)
+{
+ const NetdevTapOptions *tap = &netdev->u.tap;
+ const char *downscript = tap->downscript;
+ int queues = tap->has_queues ? tap->queues : 1;
+ g_autofree char *default_downscript = NULL;
+ int i, ret;
+
+ assert(netdev->type == NET_CLIENT_DRIVER_TAP);
+ assert(!tap->script);
+
+ if (!downscript) {
+ downscript = default_downscript =
+ get_relocated_path(DEFAULT_NETWORK_DOWN_SCRIPT);
+ }
+
+ for (i = 0; i < queues; i++) {
+ ret = net_tap_fd_init_common(netdev, peer, "tap", name,
+ tap->ifname ?: "",
+ "no", downscript,
+ tap->vhostfd, -1, -1, errp);
+ if (ret < 0) {
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
static int net_init_tap_fds(const Netdev *netdev, const char *name,
NetClientState *peer, Error **errp)
{
@@ -983,6 +1066,13 @@ int net_init_tap(const Netdev *netdev, const char *name,
}
return net_init_bridge(netdev, name, peer, errp);
+ } else if (tap->local_incoming) {
+ if (tap->script) {
+ error_setg(errp, "script= is invalid with local-incoming");
+ return -1;
+ }
+
+ return net_init_local_incoming(netdev, name, peer, errp);
}
if (tap->vhostfds) {
@@ -1031,3 +1121,10 @@ int tap_disable(NetClientState *nc)
return ret;
}
}
+
+bool tap_local_incoming(NetClientState *nc)
+{
+ TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+ return s->local_incoming && runstate_check(RUN_STATE_INMIGRATE);
+}
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 6/9] virtio-net: support local tap migration
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (4 preceding siblings ...)
2025-09-05 13:50 ` [PATCH v3 5/9] net/tap: implement interfaces for local migration Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-08 15:43 ` Peter Xu
2025-09-05 13:50 ` [PATCH v3 7/9] tests/functional: exec_command_and_wait_for_pattern: add vm arg Vladimir Sementsov-Ogievskiy
` (2 subsequent siblings)
8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
hw/net/virtio-net.c | 100 ++++++++++++++++++++++++++++++++-
include/hw/virtio/virtio-net.h | 2 +
2 files changed, 101 insertions(+), 1 deletion(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 6b5b5dace3..874e349fee 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -38,6 +38,8 @@
#include "qapi/qapi-events-migration.h"
#include "hw/virtio/virtio-access.h"
#include "migration/misc.h"
+#include "migration/migration.h"
+#include "migration/options.h"
#include "standard-headers/linux/ethtool.h"
#include "system/system.h"
#include "system/replay.h"
@@ -2999,7 +3001,13 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
n->multiqueue = multiqueue;
virtio_net_change_num_queues(n, max * 2 + 1);
- virtio_net_set_queue_pairs(n);
+ /*
+ * Called from set_features(0) on reset, when on target we
+ * doesn't have fds yet
+ */
+ if (!n->tap_wait_incoming) {
+ virtio_net_set_queue_pairs(n);
+ }
}
static int virtio_net_pre_load_queues(VirtIODevice *vdev, uint32_t n)
@@ -3009,6 +3017,19 @@ static int virtio_net_pre_load_queues(VirtIODevice *vdev, uint32_t n)
return 0;
}
+static int virtio_net_pre_save_device(void *opaque)
+{
+ VirtIONet *n = opaque;
+ int i, r;
+
+ for (i = 0; i < n->curr_queue_pairs; i++) {
+ r = peer_detach(n, i);
+ assert(!r);
+ }
+
+ return 0;
+}
+
static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features,
Error **errp)
{
@@ -3028,6 +3049,11 @@ static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features,
virtio_add_feature(&features, VIRTIO_NET_F_MAC);
+ if (n->tap_wait_incoming) {
+ /* Excessive feature set is OK for early initialization */
+ return features;
+ }
+
if (!peer_has_vnet_hdr(n)) {
virtio_clear_feature(&features, VIRTIO_NET_F_CSUM);
virtio_clear_feature(&features, VIRTIO_NET_F_HOST_TSO4);
@@ -3494,11 +3520,69 @@ static const VMStateDescription vhost_user_net_backend_state = {
}
};
+static int virtio_net_tap_save(QEMUFile *f, void *pv, size_t size,
+ const VMStateField *field,
+ JSONWriter *vmdesc)
+{
+ VirtIONet *n = pv;
+ int i;
+
+ for (i = 0; i < n->max_queue_pairs; i++) {
+ NetClientState *nc = qemu_get_subqueue(n->nic, i);
+ assert(nc->peer->info->type == NET_CLIENT_DRIVER_TAP);
+
+ tap_save(nc->peer, f);
+ }
+
+ return 0;
+}
+
+static int virtio_net_tap_load(QEMUFile *f, void *pv, size_t size,
+ const VMStateField *field)
+{
+ VirtIONet *n = pv;
+ VirtIODevice *vdev = VIRTIO_DEVICE(n);
+ VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+ Error *local_err = NULL;
+ int i;
+
+ for (i = 0; i < n->max_queue_pairs; i++) {
+ NetClientState *nc = qemu_get_subqueue(n->nic, i);
+ assert(nc->peer->info->type == NET_CLIENT_DRIVER_TAP);
+
+ tap_load(nc->peer, f);
+ }
+
+ peer_test_vnet_hdr(n);
+ n->tap_wait_incoming = false;
+
+ vdev->host_features = vdc->get_features(vdev, vdev->host_features,
+ &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static bool virtio_net_is_tap_local(void *opaque, int version_id)
+{
+ VirtIONet *n = opaque;
+ NetClientState *nc;
+
+ nc = qemu_get_queue(n->nic);
+
+ return migrate_local_tap() && nc->peer &&
+ nc->peer->info->type == NET_CLIENT_DRIVER_TAP;
+}
+
static const VMStateDescription vmstate_virtio_net_device = {
.name = "virtio-net-device",
.version_id = VIRTIO_NET_VM_VERSION,
.minimum_version_id = VIRTIO_NET_VM_VERSION,
.post_load = virtio_net_post_load_device,
+ .pre_save = virtio_net_pre_save_device,
.fields = (const VMStateField[]) {
VMSTATE_UINT8_ARRAY(mac, VirtIONet, ETH_ALEN),
VMSTATE_STRUCT_POINTER(vqs, VirtIONet,
@@ -3525,6 +3609,15 @@ static const VMStateDescription vmstate_virtio_net_device = {
* but based on the uint.
*/
VMSTATE_BUFFER_POINTER_UNSAFE(vlans, VirtIONet, 0, MAX_VLAN >> 3),
+ {
+ .name = "tap",
+ .info = &(const VMStateInfo) {
+ .name = "virtio-net vhost-user backend state",
+ .get = virtio_net_tap_load,
+ .put = virtio_net_tap_save,
+ },
+ .field_exists = virtio_net_is_tap_local,
+ },
VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
vmstate_virtio_net_has_vnet),
VMSTATE_UINT8(mac_table.multi_overflow, VirtIONet),
@@ -3954,6 +4047,11 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
vhost_net_set_config(get_vhost_net(nc->peer),
(uint8_t *)&netcfg, 0, ETH_ALEN, VHOST_SET_CONFIG_TYPE_FRONTEND);
}
+
+ if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_TAP) {
+ n->tap_wait_incoming = tap_local_incoming(nc->peer);
+ }
+
QTAILQ_INIT(&n->rsc_chains);
n->qdev = dev;
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 73fdefc0dc..04ae0e4c06 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -231,6 +231,8 @@ struct VirtIONet {
struct EBPFRSSContext ebpf_rss;
uint32_t nr_ebpf_rss_fds;
char **ebpf_rss_fds;
+
+ bool tap_wait_incoming;
};
size_t virtio_net_handle_ctrl_iov(VirtIODevice *vdev,
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 7/9] tests/functional: exec_command_and_wait_for_pattern: add vm arg
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (5 preceding siblings ...)
2025-09-05 13:50 ` [PATCH v3 6/9] virtio-net: support local tap migration Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-09 14:03 ` Thomas Huth
2025-09-05 13:50 ` [PATCH v3 8/9] tests/functional: add skipUnlessPasswordlessSudo() decorator Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 9/9] tests/functional: add test_x86_64_tap_fd_migration Vladimir Sementsov-Ogievskiy
8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
Allow to specify non default vm for the command.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
---
tests/functional/qemu_test/cmd.py | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tests/functional/qemu_test/cmd.py b/tests/functional/qemu_test/cmd.py
index dc5f422b77..28b36a3a54 100644
--- a/tests/functional/qemu_test/cmd.py
+++ b/tests/functional/qemu_test/cmd.py
@@ -172,7 +172,8 @@ def exec_command(test, command):
_console_interaction(test, None, None, command + '\r')
def exec_command_and_wait_for_pattern(test, command,
- success_message, failure_message=None):
+ success_message, failure_message=None,
+ vm=None):
"""
Send a command to a console (appending CRLF characters), then wait
for success_message to appear on the console, while logging the.
@@ -184,9 +185,11 @@ def exec_command_and_wait_for_pattern(test, command,
:param command: the command to send
:param success_message: if this message appears, test succeeds
:param failure_message: if this message appears, test fails
+ :param vm: the VM to use (defaults to test.vm if None)
"""
assert success_message
- _console_interaction(test, success_message, failure_message, command + '\r')
+ _console_interaction(test, success_message, failure_message, command + '\r',
+ vm=vm)
def get_qemu_img(test):
test.log.debug('Looking for and selecting a qemu-img binary')
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 8/9] tests/functional: add skipUnlessPasswordlessSudo() decorator
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (6 preceding siblings ...)
2025-09-05 13:50 ` [PATCH v3 7/9] tests/functional: exec_command_and_wait_for_pattern: add vm arg Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-08 15:49 ` Daniel P. Berrangé
2025-09-05 13:50 ` [PATCH v3 9/9] tests/functional: add test_x86_64_tap_fd_migration Vladimir Sementsov-Ogievskiy
8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
To be used in the next commit: that would be a test for TAP
networking, and it will need to setup TAP device.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
tests/functional/qemu_test/decorators.py | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/tests/functional/qemu_test/decorators.py b/tests/functional/qemu_test/decorators.py
index c0d1567b14..4b332804ef 100644
--- a/tests/functional/qemu_test/decorators.py
+++ b/tests/functional/qemu_test/decorators.py
@@ -6,6 +6,7 @@
import os
import platform
import resource
+import subprocess
from unittest import skipIf, skipUnless
from .cmd import which
@@ -149,3 +150,18 @@ def skipLockedMemoryTest(locked_memory):
ulimit_memory == resource.RLIM_INFINITY or ulimit_memory >= locked_memory * 1024,
f'Test required {locked_memory} kB of available locked memory',
)
+
+'''
+Decorator to skip execution of a test if passwordless
+sudo command is not available.
+'''
+def skipUnlessPasswordlessSudo():
+ proc = subprocess.run(["sudo", "-n", "/bin/true"],
+ stdin=subprocess.PIPE,
+ stdout=subprocess.PIPE,
+ stderr=subprocess.STDOUT,
+ universal_newlines=True,
+ check=False)
+
+ return skipUnless(proc.returncode == 0,
+ f'requires password-less sudo access: {proc.stdout}')
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v3 9/9] tests/functional: add test_x86_64_tap_fd_migration
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
` (7 preceding siblings ...)
2025-09-05 13:50 ` [PATCH v3 8/9] tests/functional: add skipUnlessPasswordlessSudo() decorator Vladimir Sementsov-Ogievskiy
@ 2025-09-05 13:50 ` Vladimir Sementsov-Ogievskiy
2025-09-08 15:58 ` Daniel P. Berrangé
8 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-05 13:50 UTC (permalink / raw)
To: jasowang
Cc: qemu-devel, vsementsov, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd, berrange
Add test for a new feature of local TAP migration with fd passing
through unix socket.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
.../test_x86_64_tap_fd_migration.py | 345 ++++++++++++++++++
1 file changed, 345 insertions(+)
create mode 100644 tests/functional/test_x86_64_tap_fd_migration.py
diff --git a/tests/functional/test_x86_64_tap_fd_migration.py b/tests/functional/test_x86_64_tap_fd_migration.py
new file mode 100644
index 0000000000..a38dba39fe
--- /dev/null
+++ b/tests/functional/test_x86_64_tap_fd_migration.py
@@ -0,0 +1,345 @@
+#!/usr/bin/env python3
+#
+# Functional test that tests TAP local migration
+# with fd passing
+#
+# Copyright (c) Yandex
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+import os
+import time
+import subprocess
+from subprocess import run
+import signal
+from typing import Tuple
+
+from qemu_test import (
+ LinuxKernelTest,
+ Asset,
+ exec_command_and_wait_for_pattern,
+)
+from qemu_test.decorators import skipUnlessPasswordlessSudo
+
+GUEST_IP = "10.0.1.2"
+GUEST_IP_MASK = f"{GUEST_IP}/24"
+GUEST_MAC = "d6:0d:75:f8:0f:b7"
+HOST_IP = "10.0.1.1"
+HOST_IP_MASK = f"{HOST_IP}/24"
+TAP_ID = "tap0"
+TAP_MAC = "e6:1d:44:b5:03:5d"
+
+
+def del_tap() -> None:
+ run(
+ ["sudo", "ip", "tuntap", "del", TAP_ID, "mode", "tap", "multi_queue"],
+ check=True,
+ )
+
+
+def init_tap() -> None:
+ run(
+ ["sudo", "ip", "tuntap", "add", "dev", TAP_ID, "mode", "tap", "multi_queue"],
+ check=True,
+ )
+ run(["sudo", "ip", "link", "set", "dev", TAP_ID, "address", TAP_MAC], check=True)
+ run(["sudo", "ip", "addr", "add", HOST_IP_MASK, "dev", TAP_ID], check=True)
+ run(["sudo", "ip", "link", "set", TAP_ID, "up"], check=True)
+
+
+def parse_ping_line(line: str) -> float:
+ # suspect lines like
+ # [1748524876.590509] 64 bytes from 94.245.155.3 \
+ # (94.245.155.3): icmp_seq=1 ttl=250 time=101 ms
+ spl = line.split()
+ return float(spl[0][1:-1])
+
+
+def parse_ping_output(out) -> Tuple[bool, float, float]:
+ lines = [x for x in out.split("\n") if x.startswith("[")]
+
+ try:
+ first_no_ans = next(
+ (ind for ind in range(len(lines)) if lines[ind][20:26] == "no ans")
+ )
+ except StopIteration:
+ return False, parse_ping_line(lines[0]), parse_ping_line(lines[-1])
+
+ last_no_ans = next(
+ (ind for ind in range(len(lines) - 1, -1, -1) if lines[ind][20:26] == "no ans")
+ )
+
+ return (
+ True,
+ parse_ping_line(lines[first_no_ans]),
+ parse_ping_line(lines[last_no_ans]),
+ )
+
+
+def wait_migration_finish(source_vm, target_vm):
+ migr_events = (
+ ("MIGRATION", {"data": {"status": "completed"}}),
+ ("MIGRATION", {"data": {"status": "failed"}}),
+ )
+
+ source_e = source_vm.events_wait(migr_events)["data"]
+ target_e = target_vm.events_wait(migr_events)["data"]
+
+ source_s = source_vm.cmd("query-status")["status"]
+ target_s = target_vm.cmd("query-status")["status"]
+
+ assert (
+ source_e["status"] == "completed"
+ and target_e["status"] == "completed"
+ and source_s == "postmigrate"
+ and target_s == "paused"
+ ), f"""Migration failed:
+ SRC status: {source_s}
+ SRC event: {source_e}
+ TGT status: {target_s}
+ TGT event:{target_e}"""
+
+
+@skipUnlessPasswordlessSudo()
+class VhostUserBlkFdMigration(LinuxKernelTest):
+
+ ASSET_KERNEL = Asset(
+ (
+ "https://archives.fedoraproject.org/pub/archive/fedora/linux/releases"
+ "/31/Server/x86_64/os/images/pxeboot/vmlinuz"
+ ),
+ "d4738d03dbbe083ca610d0821d0a8f1488bebbdccef54ce33e3adb35fda00129",
+ )
+
+ ASSET_INITRD = Asset(
+ (
+ "https://archives.fedoraproject.org/pub/archive/fedora/linux/releases"
+ "/31/Server/x86_64/os/images/pxeboot/initrd.img"
+ ),
+ "277cd6c7adf77c7e63d73bbb2cded8ef9e2d3a2f100000e92ff1f8396513cd8b",
+ )
+
+ ASSET_ALPINE_ISO = Asset(
+ (
+ "https://dl-cdn.alpinelinux.org/"
+ "alpine/v3.22/releases/x86_64/alpine-standard-3.22.1-x86_64.iso"
+ ),
+ "96d1b44ea1b8a5a884f193526d92edb4676054e9fa903ad2f016441a0fe13089",
+ )
+
+ def setUp(self):
+ super().setUp()
+
+ init_tap()
+
+ self.outer_ping_proc = None
+
+ def tearDown(self):
+ del_tap()
+
+ if self.outer_ping_proc:
+ self.stop_outer_ping()
+
+ super().tearDown()
+
+ def start_outer_ping(self) -> None:
+ assert self.outer_ping_proc is None
+ self.outer_ping_log = self.scratch_file("ping.log")
+ with open(self.outer_ping_log, "w") as f:
+ self.outer_ping_proc = subprocess.Popen(
+ ["ping", "-i", "0", "-O", "-D", GUEST_IP],
+ text=True,
+ stdout=f,
+ )
+
+ def stop_outer_ping(self) -> str:
+ assert self.outer_ping_proc
+ self.outer_ping_proc.send_signal(signal.SIGINT)
+
+ self.outer_ping_proc.communicate(timeout=5)
+ self.outer_ping_proc = None
+
+ with open(self.outer_ping_log) as f:
+ return f.read()
+
+ def stop_ping_and_check(self, stop_time, resume_time):
+ ping_res = self.stop_outer_ping()
+
+ discon, a, b = parse_ping_output(ping_res)
+
+ if not discon:
+ text = f"STOP: {stop_time}, RESUME: {resume_time}," f"PING: {a} - {b}"
+ if a > stop_time or b < resume_time:
+ self.fail(f"PING failed: {text}")
+ self.log.info(f"PING: no packets lost: {text}")
+ return
+
+ text = (
+ f"STOP: {stop_time}, RESUME: {resume_time}," f"PING: disconnect: {a} - {b}"
+ )
+ self.log.info(text)
+ eps = 0.01
+ if a < stop_time - eps or b > resume_time + eps:
+ self.fail(text)
+
+ def one_ping_from_guest(self, vm) -> None:
+ exec_command_and_wait_for_pattern(
+ self,
+ f"ping -c 1 -W 1 {HOST_IP}",
+ "1 packets transmitted, 1 packets received",
+ "1 packets transmitted, 0 packets received",
+ vm=vm,
+ )
+ self.wait_for_console_pattern("# ", vm=vm)
+
+ def one_ping_from_host(self) -> None:
+ run(["ping", "-c", "1", "-W", "1", GUEST_IP])
+
+ def setup_shared_memory(self):
+ shm_path = f"/dev/shm/qemu_test_{os.getpid()}"
+
+ try:
+ with open(shm_path, "wb") as f:
+ f.write(b"\0" * (1024 * 1024 * 1024)) # 1GB
+ except Exception as e:
+ self.fail(f"Failed to create shared memory file: {e}")
+
+ return shm_path
+
+ def prepare_and_launch_vm(self, shm_path, vhost, incoming=False, vm=None):
+ if not vm:
+ vm = self.vm
+
+ vm.set_console()
+ vm.add_args("-accel", "kvm")
+ vm.add_args("-device", "pcie-pci-bridge,id=pci.1,bus=pcie.0")
+ vm.add_args("-m", "1G")
+
+ vm.add_args(
+ "-object",
+ f"memory-backend-file,id=ram0,size=1G,mem-path={shm_path},share=on",
+ )
+ vm.add_args("-machine", "memory-backend=ram0")
+
+ vm.add_args(
+ "-drive", f"file={self.ASSET_ALPINE_ISO.fetch()},media=cdrom,format=raw"
+ )
+
+ vm.add_args("-S")
+
+ if incoming:
+ vm.add_args("-incoming", "defer")
+
+ vm_s = "target" if incoming else "source"
+ self.log.info(f"Launching {vm_s} VM")
+ vm.launch()
+
+ self.set_migration_capabilities(vm)
+ self.add_virtio_net(vm, vhost, incoming)
+
+ def add_virtio_net(self, vm, vhost: bool, incoming: bool = False):
+ netdev_params = {
+ "id": "netdev.1",
+ "vhost": vhost,
+ "type": "tap",
+ "ifname": "tap0",
+ "downscript": "no",
+ "queues": 4,
+ }
+
+ if incoming:
+ netdev_params["local-incoming"] = True
+ else:
+ netdev_params["script"] = "no"
+
+ vm.cmd("netdev_add", netdev_params)
+
+ vm.cmd(
+ "device_add",
+ driver="virtio-net-pci",
+ romfile="",
+ id="vnet.1",
+ netdev="netdev.1",
+ mq=True,
+ vectors=18,
+ bus="pci.1",
+ mac=GUEST_MAC,
+ disable_legacy="off",
+ )
+
+ def set_migration_capabilities(self, vm):
+ capabilities = [
+ {"capability": "events", "state": True},
+ {"capability": "x-ignore-shared", "state": True},
+ {"capability": "local-tap", "state": True},
+ ]
+ vm.cmd("migrate-set-capabilities", {"capabilities": capabilities})
+
+ def setup_guest_network(self) -> None:
+ exec_command_and_wait_for_pattern(self, "ip addr", "# ")
+ exec_command_and_wait_for_pattern(
+ self,
+ f"ip addr add {GUEST_IP_MASK} dev eth0 && ip link set eth0 up && echo OK",
+ "OK",
+ )
+ self.wait_for_console_pattern("# ")
+
+ def do_test_tap_fd_migration(self, vhost):
+ self.require_accelerator("kvm")
+ self.set_machine("q35")
+
+ socket_dir = self.socket_dir()
+ migration_socket = os.path.join(socket_dir.name, "migration.sock")
+
+ shm_path = self.setup_shared_memory()
+
+ self.prepare_and_launch_vm(shm_path, vhost)
+ self.vm.cmd("cont")
+ self.wait_for_console_pattern("login:")
+ exec_command_and_wait_for_pattern(self, "root", "# ")
+
+ self.setup_guest_network()
+
+ self.one_ping_from_guest(self.vm)
+ self.one_ping_from_host()
+ self.start_outer_ping()
+
+ # Get some successful pings before migration
+ time.sleep(0.5)
+
+ target_vm = self.get_vm(name="target")
+ self.prepare_and_launch_vm(shm_path, vhost, incoming=True, vm=target_vm)
+
+ target_vm.cmd("migrate-incoming", {"uri": f"unix:{migration_socket}"})
+
+ self.log.info("Starting migration")
+ freeze_start = time.time()
+ self.vm.cmd("migrate", {"uri": f"unix:{migration_socket}"})
+
+ self.log.info("Waiting for migration completion")
+ wait_migration_finish(self.vm, target_vm)
+
+ target_vm.cmd("cont")
+ freeze_end = time.time()
+
+ self.vm.shutdown()
+
+ self.log.info("Verifying PING on target VM after migration")
+ self.one_ping_from_guest(target_vm)
+ self.one_ping_from_host()
+
+ # And a bit more pings after source shutdown
+ time.sleep(0.3)
+ self.stop_ping_and_check(freeze_start, freeze_end)
+
+ target_vm.shutdown()
+
+ def test_tap_fd_migration(self):
+ self.do_test_tap_fd_migration(False)
+
+ def test_tap_fd_migration_vhost(self):
+ self.do_test_tap_fd_migration(True)
+
+
+if __name__ == "__main__":
+ LinuxKernelTest.main()
--
2.48.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v3 4/9] qapi: add interface for local TAP migration
2025-09-05 13:50 ` [PATCH v3 4/9] qapi: add interface for local TAP migration Vladimir Sementsov-Ogievskiy
@ 2025-09-08 15:35 ` Peter Xu
2025-09-08 16:38 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2025-09-08 15:35 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On Fri, Sep 05, 2025 at 04:50:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 2387c21e9c..992a5b1e2b 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -517,6 +517,12 @@
> # each RAM page. Requires a migration URI that supports seeking,
> # such as a file. (since 9.0)
> #
> +# @local-tap: Migrate TAPs locally, keeping backend alive. Open file
> +# descriptors and TAP-related state are migrated. Only may be
> +# used when migration channel is unix socket. For target device
> +# also @local-incoming option must be specified (since 10.2)
> +# (since 10.2)
IMHO we should move this into a per-device property, at least we need one
there to still control the device behavior; we had a similar discussion
recently on iterable virtio-net.
But maybe this one is slightly special? Maybe the tap device needs to at
least know whether in this specific migration, if we want to pass over FD
or not (e.g. local upgrade, or remote _real_ migration)?
If that's the case, we may consider providing a generic migration
capability, like cap-fd-passing. Nowadays since Fabiano's moving migration
capabilities all over to migration parameters, this one can start with a
parameter instead of a capability. The problem with migration capability
is (at least) that it can't by default ON on any machine types.. meanwhile
it simply looks like identital to parameters except it's always bool.
The high level rational is that we should never add a per-device cap flag
into migration framework.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 5/9] net/tap: implement interfaces for local migration
2025-09-05 13:50 ` [PATCH v3 5/9] net/tap: implement interfaces for local migration Vladimir Sementsov-Ogievskiy
@ 2025-09-08 15:42 ` Peter Xu
2025-09-08 16:48 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2025-09-08 15:42 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On Fri, Sep 05, 2025 at 04:50:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> +static const VMStateDescription vmstate_tap = {
> + .name = "virtio-net-device",
> + .post_load = tap_post_load,
> + .fields = (const VMStateField[]) {
> + VMSTATE_FD(fd, TAPState),
> + VMSTATE_BOOL(using_vnet_hdr, TAPState),
> + VMSTATE_BOOL(has_ufo, TAPState),
> + VMSTATE_BOOL(has_uso, TAPState),
> + VMSTATE_BOOL(enabled, TAPState),
> + VMSTATE_UINT32(host_vnet_hdr_len, TAPState),
> + VMSTATE_END_OF_LIST()
> + }
> +};
> +
> +int tap_save(NetClientState *nc, QEMUFile *f)
> +{
> + TAPState *s = DO_UPCAST(TAPState, nc, nc);
> +
> + return vmstate_save_state(f, &vmstate_tap, s, 0);
> +}
> +
> +int tap_load(NetClientState *nc, QEMUFile *f)
> +{
> + TAPState *s = DO_UPCAST(TAPState, nc, nc);
> +
> + return vmstate_load_state(f, &vmstate_tap, s, 0);
> +}
Instead of hard-coding vmstate_save_state() / vmstate_load_state(), could
we make tap's VMSD to be a subsection of virtio-net's?
Multifd already doesn't support qemufile, but only iochannels (which is the
internal impl of qemufiles). We might at some point start to concurrently
load devices with multifd, then anything with qemufile will be a no-go and
need to be serialized as legacy code in the main channel, or rewritten.
IMHO it'll be great if we can avoid adding new codes operating on
qemufiles, and also avoid adding any new custom VMSD fields' put()/get() if
ever possible.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 6/9] virtio-net: support local tap migration
2025-09-05 13:50 ` [PATCH v3 6/9] virtio-net: support local tap migration Vladimir Sementsov-Ogievskiy
@ 2025-09-08 15:43 ` Peter Xu
2025-09-08 16:48 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2025-09-08 15:43 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On Fri, Sep 05, 2025 at 04:50:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> @@ -3525,6 +3609,15 @@ static const VMStateDescription vmstate_virtio_net_device = {
> * but based on the uint.
> */
> VMSTATE_BUFFER_POINTER_UNSAFE(vlans, VirtIONet, 0, MAX_VLAN >> 3),
> + {
> + .name = "tap",
> + .info = &(const VMStateInfo) {
> + .name = "virtio-net vhost-user backend state",
Just randomly spot this line; likely we need s/vhost-user/tap/..
But of course it'll be much better if this can be a subsection, as
commented previous..
> + .get = virtio_net_tap_load,
> + .put = virtio_net_tap_save,
> + },
> + .field_exists = virtio_net_is_tap_local,
> + },
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 8/9] tests/functional: add skipUnlessPasswordlessSudo() decorator
2025-09-05 13:50 ` [PATCH v3 8/9] tests/functional: add skipUnlessPasswordlessSudo() decorator Vladimir Sementsov-Ogievskiy
@ 2025-09-08 15:49 ` Daniel P. Berrangé
0 siblings, 0 replies; 26+ messages in thread
From: Daniel P. Berrangé @ 2025-09-08 15:49 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd
On Fri, Sep 05, 2025 at 04:50:38PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> To be used in the next commit: that would be a test for TAP
> networking, and it will need to setup TAP device.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> tests/functional/qemu_test/decorators.py | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 9/9] tests/functional: add test_x86_64_tap_fd_migration
2025-09-05 13:50 ` [PATCH v3 9/9] tests/functional: add test_x86_64_tap_fd_migration Vladimir Sementsov-Ogievskiy
@ 2025-09-08 15:58 ` Daniel P. Berrangé
0 siblings, 0 replies; 26+ messages in thread
From: Daniel P. Berrangé @ 2025-09-08 15:58 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, peterx,
mst, farosas, eblake, armbru, thuth, philmd
On Fri, Sep 05, 2025 at 04:50:39PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Add test for a new feature of local TAP migration with fd passing
> through unix socket.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> .../test_x86_64_tap_fd_migration.py | 345 ++++++++++++++++++
> 1 file changed, 345 insertions(+)
> create mode 100644 tests/functional/test_x86_64_tap_fd_migration.py
>
> diff --git a/tests/functional/test_x86_64_tap_fd_migration.py b/tests/functional/test_x86_64_tap_fd_migration.py
> new file mode 100644
> index 0000000000..a38dba39fe
> --- /dev/null
> +++ b/tests/functional/test_x86_64_tap_fd_migration.py
> @@ -0,0 +1,345 @@
> +#!/usr/bin/env python3
> +#
> +# Functional test that tests TAP local migration
> +# with fd passing
> +#
> +# Copyright (c) Yandex
> +#
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +import os
> +import time
> +import subprocess
> +from subprocess import run
> +import signal
> +from typing import Tuple
> +
> +from qemu_test import (
> + LinuxKernelTest,
> + Asset,
> + exec_command_and_wait_for_pattern,
> +)
> +from qemu_test.decorators import skipUnlessPasswordlessSudo
> +
> +GUEST_IP = "10.0.1.2"
> +GUEST_IP_MASK = f"{GUEST_IP}/24"
> +GUEST_MAC = "d6:0d:75:f8:0f:b7"
> +HOST_IP = "10.0.1.1"
> +HOST_IP_MASK = f"{HOST_IP}/24"
> +TAP_ID = "tap0"
> +TAP_MAC = "e6:1d:44:b5:03:5d"
> +
> +
> +def del_tap() -> None:
> + run(
> + ["sudo", "ip", "tuntap", "del", TAP_ID, "mode", "tap", "multi_queue"],
> + check=True,
> + )
> +
> +
> +def init_tap() -> None:
> + run(
> + ["sudo", "ip", "tuntap", "add", "dev", TAP_ID, "mode", "tap", "multi_queue"],
> + check=True,
> + )
> + run(["sudo", "ip", "link", "set", "dev", TAP_ID, "address", TAP_MAC], check=True)
> + run(["sudo", "ip", "addr", "add", HOST_IP_MASK, "dev", TAP_ID], check=True)
> + run(["sudo", "ip", "link", "set", TAP_ID, "up"], check=True)
> +
> +
> +def parse_ping_line(line: str) -> float:
> + # suspect lines like
> + # [1748524876.590509] 64 bytes from 94.245.155.3 \
> + # (94.245.155.3): icmp_seq=1 ttl=250 time=101 ms
> + spl = line.split()
> + return float(spl[0][1:-1])
> +
> +
> +def parse_ping_output(out) -> Tuple[bool, float, float]:
> + lines = [x for x in out.split("\n") if x.startswith("[")]
> +
> + try:
> + first_no_ans = next(
> + (ind for ind in range(len(lines)) if lines[ind][20:26] == "no ans")
> + )
> + except StopIteration:
> + return False, parse_ping_line(lines[0]), parse_ping_line(lines[-1])
> +
> + last_no_ans = next(
> + (ind for ind in range(len(lines) - 1, -1, -1) if lines[ind][20:26] == "no ans")
> + )
> +
> + return (
> + True,
> + parse_ping_line(lines[first_no_ans]),
> + parse_ping_line(lines[last_no_ans]),
> + )
> +
> +
> +def wait_migration_finish(source_vm, target_vm):
> + migr_events = (
> + ("MIGRATION", {"data": {"status": "completed"}}),
> + ("MIGRATION", {"data": {"status": "failed"}}),
> + )
> +
> + source_e = source_vm.events_wait(migr_events)["data"]
> + target_e = target_vm.events_wait(migr_events)["data"]
> +
> + source_s = source_vm.cmd("query-status")["status"]
> + target_s = target_vm.cmd("query-status")["status"]
> +
> + assert (
> + source_e["status"] == "completed"
> + and target_e["status"] == "completed"
> + and source_s == "postmigrate"
> + and target_s == "paused"
> + ), f"""Migration failed:
> + SRC status: {source_s}
> + SRC event: {source_e}
> + TGT status: {target_s}
> + TGT event:{target_e}"""
> +
> +
> +@skipUnlessPasswordlessSudo()
> +class VhostUserBlkFdMigration(LinuxKernelTest):
> +
> + ASSET_KERNEL = Asset(
> + (
> + "https://archives.fedoraproject.org/pub/archive/fedora/linux/releases"
> + "/31/Server/x86_64/os/images/pxeboot/vmlinuz"
> + ),
> + "d4738d03dbbe083ca610d0821d0a8f1488bebbdccef54ce33e3adb35fda00129",
> + )
> +
> + ASSET_INITRD = Asset(
> + (
> + "https://archives.fedoraproject.org/pub/archive/fedora/linux/releases"
> + "/31/Server/x86_64/os/images/pxeboot/initrd.img"
> + ),
> + "277cd6c7adf77c7e63d73bbb2cded8ef9e2d3a2f100000e92ff1f8396513cd8b",
> + )
> +
> + ASSET_ALPINE_ISO = Asset(
> + (
> + "https://dl-cdn.alpinelinux.org/"
> + "alpine/v3.22/releases/x86_64/alpine-standard-3.22.1-x86_64.iso"
> + ),
> + "96d1b44ea1b8a5a884f193526d92edb4676054e9fa903ad2f016441a0fe13089",
> + )
> +
> + def setUp(self):
> + super().setUp()
> +
> + init_tap()
> +
> + self.outer_ping_proc = None
> +
> + def tearDown(self):
> + del_tap()
> +
> + if self.outer_ping_proc:
> + self.stop_outer_ping()
Wrap the del_tap() and onwards in a 'try' clause and then
use 'finally' to run the parent tearDown, so we're more
robust if anything throws an exception
> +
> + super().tearDown()
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 4/9] qapi: add interface for local TAP migration
2025-09-08 15:35 ` Peter Xu
@ 2025-09-08 16:38 ` Vladimir Sementsov-Ogievskiy
2025-09-08 18:47 ` Peter Xu
0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-08 16:38 UTC (permalink / raw)
To: Peter Xu
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On 08.09.25 18:35, Peter Xu wrote:
> On Fri, Sep 05, 2025 at 04:50:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index 2387c21e9c..992a5b1e2b 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -517,6 +517,12 @@
>> # each RAM page. Requires a migration URI that supports seeking,
>> # such as a file. (since 9.0)
>> #
>> +# @local-tap: Migrate TAPs locally, keeping backend alive. Open file
>> +# descriptors and TAP-related state are migrated. Only may be
>> +# used when migration channel is unix socket. For target device
>> +# also @local-incoming option must be specified (since 10.2)
>> +# (since 10.2)
>
> IMHO we should move this into a per-device property, at least we need one
> there to still control the device behavior; we had a similar discussion
> recently on iterable virtio-net.
>
> But maybe this one is slightly special? Maybe the tap device needs to at
> least know whether in this specific migration, if we want to pass over FD
> or not (e.g. local upgrade, or remote _real_ migration)?
>
> If that's the case, we may consider providing a generic migration
> capability, like cap-fd-passing. Nowadays since Fabiano's moving migration
> capabilities all over to migration parameters, this one can start with a
> parameter instead of a capability. The problem with migration capability
> is (at least) that it can't by default ON on any machine types.. meanwhile
> it simply looks like identital to parameters except it's always bool.
>
> The high level rational is that we should never add a per-device cap flag
> into migration framework.
>
Hmm.
1. Yes, we need to distinguish, is that _real_ migration or local. And setting a
special property for each device (which supports fd-migration) to turn on passing
FD to the channel seems not comfortable and error prune.
2. Initially, I decided separate "local-tap" and "local-vhost-user-blk" capabilities
just to simplify further testing/debugging in real environment: the possibility to
enable only "half of magic" helps.
So, granularity makes sence, but having local-XXX capability for each device class
looks bad.
Maybe, having generic cap-fd-passing, together with possibility to disable it on
per-device basis (like migrate-fd=false) is good compromise.
Another question is, do we need "local-incoming" option for target device.
Initially I added this because I thought: ho, I need to distinguish it initialization
time: do I need to call open(), or wait for incoming fd.
Now I see that I can just postpone this decision up to "start" point, where
- either I already have fd from incoming migration
- or have nothing: in this case, let's call open()
-
I'll try to go with one "fd-passing" capability, as any kind of granularity may be
added later on demand.
Hmm2. Probably we can avoid even adding such a capability, but just check, is migration
channel support fd passing or not? Seems too implicit for me.
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 5/9] net/tap: implement interfaces for local migration
2025-09-08 15:42 ` Peter Xu
@ 2025-09-08 16:48 ` Vladimir Sementsov-Ogievskiy
2025-09-08 20:01 ` Peter Xu
0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-08 16:48 UTC (permalink / raw)
To: Peter Xu
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On 08.09.25 18:42, Peter Xu wrote:
> On Fri, Sep 05, 2025 at 04:50:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> +static const VMStateDescription vmstate_tap = {
>> + .name = "virtio-net-device",
>> + .post_load = tap_post_load,
>> + .fields = (const VMStateField[]) {
>> + VMSTATE_FD(fd, TAPState),
>> + VMSTATE_BOOL(using_vnet_hdr, TAPState),
>> + VMSTATE_BOOL(has_ufo, TAPState),
>> + VMSTATE_BOOL(has_uso, TAPState),
>> + VMSTATE_BOOL(enabled, TAPState),
>> + VMSTATE_UINT32(host_vnet_hdr_len, TAPState),
>> + VMSTATE_END_OF_LIST()
>> + }
>> +};
>> +
>> +int tap_save(NetClientState *nc, QEMUFile *f)
>> +{
>> + TAPState *s = DO_UPCAST(TAPState, nc, nc);
>> +
>> + return vmstate_save_state(f, &vmstate_tap, s, 0);
>> +}
>> +
>> +int tap_load(NetClientState *nc, QEMUFile *f)
>> +{
>> + TAPState *s = DO_UPCAST(TAPState, nc, nc);
>> +
>> + return vmstate_load_state(f, &vmstate_tap, s, 0);
>> +}
>
> Instead of hard-coding vmstate_save_state() / vmstate_load_state(), could
> we make tap's VMSD to be a subsection of virtio-net's?
>
> Multifd already doesn't support qemufile, but only iochannels (which is the
> internal impl of qemufiles). We might at some point start to concurrently
> load devices with multifd, then anything with qemufile will be a no-go and
> need to be serialized as legacy code in the main channel, or rewritten.
>
> IMHO it'll be great if we can avoid adding new codes operating on
> qemufiles, and also avoid adding any new custom VMSD fields' put()/get() if
> ever possible.
>
Subsections are loaded after fields.
And virtio-net already has fields
VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
vmstate_virtio_net_has_vnet),
and
VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
vmstate_virtio_net_has_ufo),
Which do check on virtio-net level some parameters, which should come from local migration of TAP.
That's why I made TAP a field, and put it before these two ones. This way these two checks works.
Still, from your comment I understand that hard-coding save/load is worse problem. So I can just
skip checking in vmstate_virtio_net_has_vnet / vmstate_virtio_net_has_ufo with enabled "local-tap"
(or "fd-passing") capability (or better migration parameter). This way TAP may be a subsection.
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 6/9] virtio-net: support local tap migration
2025-09-08 15:43 ` Peter Xu
@ 2025-09-08 16:48 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-08 16:48 UTC (permalink / raw)
To: Peter Xu
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On 08.09.25 18:43, Peter Xu wrote:
> On Fri, Sep 05, 2025 at 04:50:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> @@ -3525,6 +3609,15 @@ static const VMStateDescription vmstate_virtio_net_device = {
>> * but based on the uint.
>> */
>> VMSTATE_BUFFER_POINTER_UNSAFE(vlans, VirtIONet, 0, MAX_VLAN >> 3),
>> + {
>> + .name = "tap",
>> + .info = &(const VMStateInfo) {
>> + .name = "virtio-net vhost-user backend state",
>
> Just randomly spot this line; likely we need s/vhost-user/tap/..
>
Of-course
> But of course it'll be much better if this can be a subsection, as
> commented previous..
>
Will do.
>> + .get = virtio_net_tap_load,
>> + .put = virtio_net_tap_save,
>> + },
>> + .field_exists = virtio_net_is_tap_local,
>> + },
>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 4/9] qapi: add interface for local TAP migration
2025-09-08 16:38 ` Vladimir Sementsov-Ogievskiy
@ 2025-09-08 18:47 ` Peter Xu
2025-09-09 7:40 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2025-09-08 18:47 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On Mon, Sep 08, 2025 at 07:38:45PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.09.25 18:35, Peter Xu wrote:
> > On Fri, Sep 05, 2025 at 04:50:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > index 2387c21e9c..992a5b1e2b 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -517,6 +517,12 @@
> > > # each RAM page. Requires a migration URI that supports seeking,
> > > # such as a file. (since 9.0)
> > > #
> > > +# @local-tap: Migrate TAPs locally, keeping backend alive. Open file
> > > +# descriptors and TAP-related state are migrated. Only may be
> > > +# used when migration channel is unix socket. For target device
> > > +# also @local-incoming option must be specified (since 10.2)
> > > +# (since 10.2)
> >
> > IMHO we should move this into a per-device property, at least we need one
> > there to still control the device behavior; we had a similar discussion
> > recently on iterable virtio-net.
> >
> > But maybe this one is slightly special? Maybe the tap device needs to at
> > least know whether in this specific migration, if we want to pass over FD
> > or not (e.g. local upgrade, or remote _real_ migration)?
> >
> > If that's the case, we may consider providing a generic migration
> > capability, like cap-fd-passing. Nowadays since Fabiano's moving migration
> > capabilities all over to migration parameters, this one can start with a
> > parameter instead of a capability. The problem with migration capability
> > is (at least) that it can't by default ON on any machine types.. meanwhile
> > it simply looks like identital to parameters except it's always bool.
> >
> > The high level rational is that we should never add a per-device cap flag
> > into migration framework.
> >
>
> Hmm.
>
> 1. Yes, we need to distinguish, is that _real_ migration or local. And setting a
> special property for each device (which supports fd-migration) to turn on passing
> FD to the channel seems not comfortable and error prune.
>
> 2. Initially, I decided separate "local-tap" and "local-vhost-user-blk" capabilities
> just to simplify further testing/debugging in real environment: the possibility to
> enable only "half of magic" helps.
>
> So, granularity makes sence, but having local-XXX capability for each device class
> looks bad.
>
> Maybe, having generic cap-fd-passing, together with possibility to disable it on
> per-device basis (like migrate-fd=false) is good compromise.
>
>
> Another question is, do we need "local-incoming" option for target device.
>
> Initially I added this because I thought: ho, I need to distinguish it initialization
> time: do I need to call open(), or wait for incoming fd.
>
> Now I see that I can just postpone this decision up to "start" point, where
>
> - either I already have fd from incoming migration
> - or have nothing: in this case, let's call open()
>
> -
>
> I'll try to go with one "fd-passing" capability, as any kind of granularity may be
> added later on demand.
>
>
> Hmm2. Probably we can avoid even adding such a capability, but just check, is migration
> channel support fd passing or not? Seems too implicit for me.
If we want to expose a feature internally, IIUC we can use QAPI "features"
like this:
https://lore.kernel.org/all/20250603013810.4772-17-farosas@suse.de/
But I'm not yet sure whether it's useful..
In this case the "capability" itself should almost always be present when
using unix sockets.. The problem is, IIUC we're not trying to describe a
capability, but a choice the user made.
For example, when unix socket is the transport, we can still decide to not
use fd passing even if it's fully supported in the current QEMU binary for
any devices that are involved, because any of: (1) it could be a unix
socket to a proxy daemon (of a container?) when fd passing isn't supported
in the daemon, or (2) as you mentioned above, for debugging purpose when we
want to triage whether a bug is relevant to fd-passing. Maybe more.
The per-device granularity you mentioned also makes sense to me.
An use case is when, imagine, we have a QEMU that (1) supports tap local
migration, but (2) doesn't yet support virtio-blk local migration. Then we
want to be able to enable the fd-passing for tap/virtio-net, but not for
virtio-blk (even if the src QEMU in the context might support both)?
IOW, it makes sense to me to have two layers of controls here:
(a) Migration new parameter, "migrate-fds" (or any better name..).
When set, it enables all devices that supports fd-passing to migrate
the fds directly. OTOH, when not set, even if all devices enabled
fd-passing, it should still do a full migration. This one is the
user knob saying "I want to migrate with fd migrated".
This should imply unix sockets for sure as the transport, and should
fail upfront if it's not a unix socket.
We should also auto-select this with cpr migrations.. then in any
code path (whenever such path exists?) that the fds can be either
migrated from cpr or main channels.
(b) Device new parameter, "migrate-fds" (or any better name..).
When set, the device will declare support migrating fds "whenever the
migration applies", aka, when above (a) is selected first.
Taking tap device as example here, setting it ON here means "please
enable fd-passing whenever the user enables this migration option".
So in tap code, it should migrate fd if both (a) and (b) are ON.
When migrating to e.g. old QEMUs, here (b) should be OFF even if (a)
is ON.
Would above make sense?
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 5/9] net/tap: implement interfaces for local migration
2025-09-08 16:48 ` Vladimir Sementsov-Ogievskiy
@ 2025-09-08 20:01 ` Peter Xu
2025-09-09 7:44 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2025-09-08 20:01 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On Mon, Sep 08, 2025 at 07:48:09PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.09.25 18:42, Peter Xu wrote:
> > On Fri, Sep 05, 2025 at 04:50:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > +static const VMStateDescription vmstate_tap = {
> > > + .name = "virtio-net-device",
> > > + .post_load = tap_post_load,
> > > + .fields = (const VMStateField[]) {
> > > + VMSTATE_FD(fd, TAPState),
> > > + VMSTATE_BOOL(using_vnet_hdr, TAPState),
> > > + VMSTATE_BOOL(has_ufo, TAPState),
> > > + VMSTATE_BOOL(has_uso, TAPState),
> > > + VMSTATE_BOOL(enabled, TAPState),
> > > + VMSTATE_UINT32(host_vnet_hdr_len, TAPState),
> > > + VMSTATE_END_OF_LIST()
> > > + }
> > > +};
> > > +
> > > +int tap_save(NetClientState *nc, QEMUFile *f)
> > > +{
> > > + TAPState *s = DO_UPCAST(TAPState, nc, nc);
> > > +
> > > + return vmstate_save_state(f, &vmstate_tap, s, 0);
> > > +}
> > > +
> > > +int tap_load(NetClientState *nc, QEMUFile *f)
> > > +{
> > > + TAPState *s = DO_UPCAST(TAPState, nc, nc);
> > > +
> > > + return vmstate_load_state(f, &vmstate_tap, s, 0);
> > > +}
> >
> > Instead of hard-coding vmstate_save_state() / vmstate_load_state(), could
> > we make tap's VMSD to be a subsection of virtio-net's?
> >
> > Multifd already doesn't support qemufile, but only iochannels (which is the
> > internal impl of qemufiles). We might at some point start to concurrently
> > load devices with multifd, then anything with qemufile will be a no-go and
> > need to be serialized as legacy code in the main channel, or rewritten.
> >
> > IMHO it'll be great if we can avoid adding new codes operating on
> > qemufiles, and also avoid adding any new custom VMSD fields' put()/get() if
> > ever possible.
> >
>
> Subsections are loaded after fields.
>
> And virtio-net already has fields
>
> VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
> vmstate_virtio_net_has_vnet),
>
> and
>
> VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
> vmstate_virtio_net_has_ufo),
Side note: I'm actually a bit confused on why it needs to use
VMSTATE_WITH_TMP(), or say, the get()/put() directly.
Taking example of vmstate_virtio_net_has_ufo, afaiu, the only point here is
that virtio_net_ufo_post_load() would check peer UFO support. I wonder if
that should work too to check that in virtio_net_post_load_device(), and
fail there if anything is wrong.. then would has_ufo be able to be
migrated as a VMSTATE_U8 field?
>
> Which do check on virtio-net level some parameters, which should come from local migration of TAP.
>
> That's why I made TAP a field, and put it before these two ones. This way these two checks works.
>
>
> Still, from your comment I understand that hard-coding save/load is worse problem. So I can just
> skip checking in vmstate_virtio_net_has_vnet / vmstate_virtio_net_has_ufo with enabled "local-tap"
> (or "fd-passing") capability (or better migration parameter). This way TAP may be a subsection.
That'll be nice, thanks. Or would a VMSTATE_STRUCT() for TAP to work (so
that it can also be put before the two _TMPs, but avoid raw get()/put())?
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 4/9] qapi: add interface for local TAP migration
2025-09-08 18:47 ` Peter Xu
@ 2025-09-09 7:40 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-09 7:40 UTC (permalink / raw)
To: Peter Xu
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On 08.09.25 21:47, Peter Xu wrote:
> On Mon, Sep 08, 2025 at 07:38:45PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 08.09.25 18:35, Peter Xu wrote:
>>> On Fri, Sep 05, 2025 at 04:50:34PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>>> index 2387c21e9c..992a5b1e2b 100644
>>>> --- a/qapi/migration.json
>>>> +++ b/qapi/migration.json
>>>> @@ -517,6 +517,12 @@
>>>> # each RAM page. Requires a migration URI that supports seeking,
>>>> # such as a file. (since 9.0)
>>>> #
>>>> +# @local-tap: Migrate TAPs locally, keeping backend alive. Open file
>>>> +# descriptors and TAP-related state are migrated. Only may be
>>>> +# used when migration channel is unix socket. For target device
>>>> +# also @local-incoming option must be specified (since 10.2)
>>>> +# (since 10.2)
>>>
>>> IMHO we should move this into a per-device property, at least we need one
>>> there to still control the device behavior; we had a similar discussion
>>> recently on iterable virtio-net.
>>>
>>> But maybe this one is slightly special? Maybe the tap device needs to at
>>> least know whether in this specific migration, if we want to pass over FD
>>> or not (e.g. local upgrade, or remote _real_ migration)?
>>>
>>> If that's the case, we may consider providing a generic migration
>>> capability, like cap-fd-passing. Nowadays since Fabiano's moving migration
>>> capabilities all over to migration parameters, this one can start with a
>>> parameter instead of a capability. The problem with migration capability
>>> is (at least) that it can't by default ON on any machine types.. meanwhile
>>> it simply looks like identital to parameters except it's always bool.
>>>
>>> The high level rational is that we should never add a per-device cap flag
>>> into migration framework.
>>>
>>
>> Hmm.
>>
>> 1. Yes, we need to distinguish, is that _real_ migration or local. And setting a
>> special property for each device (which supports fd-migration) to turn on passing
>> FD to the channel seems not comfortable and error prune.
>>
>> 2. Initially, I decided separate "local-tap" and "local-vhost-user-blk" capabilities
>> just to simplify further testing/debugging in real environment: the possibility to
>> enable only "half of magic" helps.
>>
>> So, granularity makes sence, but having local-XXX capability for each device class
>> looks bad.
>>
>> Maybe, having generic cap-fd-passing, together with possibility to disable it on
>> per-device basis (like migrate-fd=false) is good compromise.
>>
>>
>> Another question is, do we need "local-incoming" option for target device.
>>
>> Initially I added this because I thought: ho, I need to distinguish it initialization
>> time: do I need to call open(), or wait for incoming fd.
>>
>> Now I see that I can just postpone this decision up to "start" point, where
>>
>> - either I already have fd from incoming migration
>> - or have nothing: in this case, let's call open()
>>
>> -
>>
>> I'll try to go with one "fd-passing" capability, as any kind of granularity may be
>> added later on demand.
>>
>>
>> Hmm2. Probably we can avoid even adding such a capability, but just check, is migration
>> channel support fd passing or not? Seems too implicit for me.
>
> If we want to expose a feature internally, IIUC we can use QAPI "features"
> like this:
>
> https://lore.kernel.org/all/20250603013810.4772-17-farosas@suse.de/
>
> But I'm not yet sure whether it's useful..
>
> In this case the "capability" itself should almost always be present when
> using unix sockets.. The problem is, IIUC we're not trying to describe a
> capability, but a choice the user made.
>
> For example, when unix socket is the transport, we can still decide to not
> use fd passing even if it's fully supported in the current QEMU binary for
> any devices that are involved, because any of: (1) it could be a unix
> socket to a proxy daemon (of a container?) when fd passing isn't supported
> in the daemon, or (2) as you mentioned above, for debugging purpose when we
> want to triage whether a bug is relevant to fd-passing. Maybe more.
>
> The per-device granularity you mentioned also makes sense to me.
>
> An use case is when, imagine, we have a QEMU that (1) supports tap local
> migration, but (2) doesn't yet support virtio-blk local migration. Then we
> want to be able to enable the fd-passing for tap/virtio-net, but not for
> virtio-blk (even if the src QEMU in the context might support both)?
>
> IOW, it makes sense to me to have two layers of controls here:
>
> (a) Migration new parameter, "migrate-fds" (or any better name..).
>
> When set, it enables all devices that supports fd-passing to migrate
> the fds directly. OTOH, when not set, even if all devices enabled
> fd-passing, it should still do a full migration. This one is the
> user knob saying "I want to migrate with fd migrated".
>
> This should imply unix sockets for sure as the transport, and should
> fail upfront if it's not a unix socket.
>
> We should also auto-select this with cpr migrations.. then in any
> code path (whenever such path exists?) that the fds can be either
> migrated from cpr or main channels.
>
> (b) Device new parameter, "migrate-fds" (or any better name..).
>
> When set, the device will declare support migrating fds "whenever the
> migration applies", aka, when above (a) is selected first.
>
> Taking tap device as example here, setting it ON here means "please
> enable fd-passing whenever the user enables this migration option".
> So in tap code, it should migrate fd if both (a) and (b) are ON.
> When migrating to e.g. old QEMUs, here (b) should be OFF even if (a)
> is ON.
>
> Would above make sense?
>
Yes, I meant something like this, sounds good.
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 5/9] net/tap: implement interfaces for local migration
2025-09-08 20:01 ` Peter Xu
@ 2025-09-09 7:44 ` Vladimir Sementsov-Ogievskiy
2025-09-09 14:56 ` Peter Xu
0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-09 7:44 UTC (permalink / raw)
To: Peter Xu
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On 08.09.25 23:01, Peter Xu wrote:
> On Mon, Sep 08, 2025 at 07:48:09PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 08.09.25 18:42, Peter Xu wrote:
>>> On Fri, Sep 05, 2025 at 04:50:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>> +static const VMStateDescription vmstate_tap = {
>>>> + .name = "virtio-net-device",
>>>> + .post_load = tap_post_load,
>>>> + .fields = (const VMStateField[]) {
>>>> + VMSTATE_FD(fd, TAPState),
>>>> + VMSTATE_BOOL(using_vnet_hdr, TAPState),
>>>> + VMSTATE_BOOL(has_ufo, TAPState),
>>>> + VMSTATE_BOOL(has_uso, TAPState),
>>>> + VMSTATE_BOOL(enabled, TAPState),
>>>> + VMSTATE_UINT32(host_vnet_hdr_len, TAPState),
>>>> + VMSTATE_END_OF_LIST()
>>>> + }
>>>> +};
>>>> +
>>>> +int tap_save(NetClientState *nc, QEMUFile *f)
>>>> +{
>>>> + TAPState *s = DO_UPCAST(TAPState, nc, nc);
>>>> +
>>>> + return vmstate_save_state(f, &vmstate_tap, s, 0);
>>>> +}
>>>> +
>>>> +int tap_load(NetClientState *nc, QEMUFile *f)
>>>> +{
>>>> + TAPState *s = DO_UPCAST(TAPState, nc, nc);
>>>> +
>>>> + return vmstate_load_state(f, &vmstate_tap, s, 0);
>>>> +}
>>>
>>> Instead of hard-coding vmstate_save_state() / vmstate_load_state(), could
>>> we make tap's VMSD to be a subsection of virtio-net's?
>>>
>>> Multifd already doesn't support qemufile, but only iochannels (which is the
>>> internal impl of qemufiles). We might at some point start to concurrently
>>> load devices with multifd, then anything with qemufile will be a no-go and
>>> need to be serialized as legacy code in the main channel, or rewritten.
>>>
>>> IMHO it'll be great if we can avoid adding new codes operating on
>>> qemufiles, and also avoid adding any new custom VMSD fields' put()/get() if
>>> ever possible.
>>>
>>
>> Subsections are loaded after fields.
>>
>> And virtio-net already has fields
>>
>> VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
>> vmstate_virtio_net_has_vnet),
>>
>> and
>>
>> VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
>> vmstate_virtio_net_has_ufo),
>
> Side note: I'm actually a bit confused on why it needs to use
> VMSTATE_WITH_TMP(), or say, the get()/put() directly.
>
> Taking example of vmstate_virtio_net_has_ufo, afaiu, the only point here is
> that virtio_net_ufo_post_load() would check peer UFO support. I wonder if
> that should work too to check that in virtio_net_post_load_device(), and
> fail there if anything is wrong.. then would has_ufo be able to be
> migrated as a VMSTATE_U8 field?
>
>>
>> Which do check on virtio-net level some parameters, which should come from local migration of TAP.
>>
>> That's why I made TAP a field, and put it before these two ones. This way these two checks works.
>>
>>
>> Still, from your comment I understand that hard-coding save/load is worse problem. So I can just
>> skip checking in vmstate_virtio_net_has_vnet / vmstate_virtio_net_has_ufo with enabled "local-tap"
>> (or "fd-passing") capability (or better migration parameter). This way TAP may be a subsection.
>
> That'll be nice, thanks. Or would a VMSTATE_STRUCT() for TAP to work (so
> that it can also be put before the two _TMPs, but avoid raw get()/put())?
>
I considered using VMSTATE_STRUCT, but it would mean, I should access TAPState directly from virtio-net
code. That's not possible now, as TAPState is a static type in tap.c, and think it's better to keep
it static. So, in this case, subsection should be better, if I understand correctly.
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 7/9] tests/functional: exec_command_and_wait_for_pattern: add vm arg
2025-09-05 13:50 ` [PATCH v3 7/9] tests/functional: exec_command_and_wait_for_pattern: add vm arg Vladimir Sementsov-Ogievskiy
@ 2025-09-09 14:03 ` Thomas Huth
2025-09-09 14:09 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 26+ messages in thread
From: Thomas Huth @ 2025-09-09 14:03 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, jasowang
Cc: qemu-devel, leiyang, steven.sistare, yc-core, peterx, mst,
farosas, eblake, armbru, philmd, berrange
On 05/09/2025 15.50, Vladimir Sementsov-Ogievskiy wrote:
> Allow to specify non default vm for the command.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> ---
> tests/functional/qemu_test/cmd.py | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/tests/functional/qemu_test/cmd.py b/tests/functional/qemu_test/cmd.py
> index dc5f422b77..28b36a3a54 100644
> --- a/tests/functional/qemu_test/cmd.py
> +++ b/tests/functional/qemu_test/cmd.py
> @@ -172,7 +172,8 @@ def exec_command(test, command):
> _console_interaction(test, None, None, command + '\r')
>
> def exec_command_and_wait_for_pattern(test, command,
> - success_message, failure_message=None):
> + success_message, failure_message=None,
> + vm=None):
> """
> Send a command to a console (appending CRLF characters), then wait
> for success_message to appear on the console, while logging the.
> @@ -184,9 +185,11 @@ def exec_command_and_wait_for_pattern(test, command,
> :param command: the command to send
> :param success_message: if this message appears, test succeeds
> :param failure_message: if this message appears, test fails
> + :param vm: the VM to use (defaults to test.vm if None)
> """
> assert success_message
> - _console_interaction(test, success_message, failure_message, command + '\r')
> + _console_interaction(test, success_message, failure_message, command + '\r',
> + vm=vm)
Hi,
FYI, there was another patch on the list that does something very similar,
and I picked it for my pull request today (since it fixed some more spots):
https://lore.kernel.org/qemu-devel/20250909135147.612345-20-thuth@redhat.com/
... so once that got merged, you can likely drop this patch from your series.
Regards,
Thomas
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 7/9] tests/functional: exec_command_and_wait_for_pattern: add vm arg
2025-09-09 14:03 ` Thomas Huth
@ 2025-09-09 14:09 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-09 14:09 UTC (permalink / raw)
To: Thomas Huth, jasowang
Cc: qemu-devel, leiyang, steven.sistare, yc-core, peterx, mst,
farosas, eblake, armbru, philmd, berrange
On 09.09.25 17:03, Thomas Huth wrote:
> On 05/09/2025 15.50, Vladimir Sementsov-Ogievskiy wrote:
>> Allow to specify non default vm for the command.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> Reviewed-by: Thomas Huth <thuth@redhat.com>
>> ---
>> tests/functional/qemu_test/cmd.py | 7 +++++--
>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/functional/qemu_test/cmd.py b/tests/functional/qemu_test/cmd.py
>> index dc5f422b77..28b36a3a54 100644
>> --- a/tests/functional/qemu_test/cmd.py
>> +++ b/tests/functional/qemu_test/cmd.py
>> @@ -172,7 +172,8 @@ def exec_command(test, command):
>> _console_interaction(test, None, None, command + '\r')
>> def exec_command_and_wait_for_pattern(test, command,
>> - success_message, failure_message=None):
>> + success_message, failure_message=None,
>> + vm=None):
>> """
>> Send a command to a console (appending CRLF characters), then wait
>> for success_message to appear on the console, while logging the.
>> @@ -184,9 +185,11 @@ def exec_command_and_wait_for_pattern(test, command,
>> :param command: the command to send
>> :param success_message: if this message appears, test succeeds
>> :param failure_message: if this message appears, test fails
>> + :param vm: the VM to use (defaults to test.vm if None)
>> """
>> assert success_message
>> - _console_interaction(test, success_message, failure_message, command + '\r')
>> + _console_interaction(test, success_message, failure_message, command + '\r',
>> + vm=vm)
>
> Hi,
>
> FYI, there was another patch on the list that does something very similar, and I picked it for my pull request today (since it fixed some more spots):
>
> https://lore.kernel.org/qemu-devel/20250909135147.612345-20-thuth@redhat.com/
>
> ... so once that got merged, you can likely drop this patch from your series.
>
Good, thanks!
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 5/9] net/tap: implement interfaces for local migration
2025-09-09 7:44 ` Vladimir Sementsov-Ogievskiy
@ 2025-09-09 14:56 ` Peter Xu
2025-09-09 15:08 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2025-09-09 14:56 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On Tue, Sep 09, 2025 at 10:44:22AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.09.25 23:01, Peter Xu wrote:
> > On Mon, Sep 08, 2025 at 07:48:09PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > On 08.09.25 18:42, Peter Xu wrote:
> > > > On Fri, Sep 05, 2025 at 04:50:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > > +static const VMStateDescription vmstate_tap = {
> > > > > + .name = "virtio-net-device",
> > > > > + .post_load = tap_post_load,
> > > > > + .fields = (const VMStateField[]) {
> > > > > + VMSTATE_FD(fd, TAPState),
> > > > > + VMSTATE_BOOL(using_vnet_hdr, TAPState),
> > > > > + VMSTATE_BOOL(has_ufo, TAPState),
> > > > > + VMSTATE_BOOL(has_uso, TAPState),
> > > > > + VMSTATE_BOOL(enabled, TAPState),
> > > > > + VMSTATE_UINT32(host_vnet_hdr_len, TAPState),
> > > > > + VMSTATE_END_OF_LIST()
> > > > > + }
> > > > > +};
> > > > > +
> > > > > +int tap_save(NetClientState *nc, QEMUFile *f)
> > > > > +{
> > > > > + TAPState *s = DO_UPCAST(TAPState, nc, nc);
> > > > > +
> > > > > + return vmstate_save_state(f, &vmstate_tap, s, 0);
> > > > > +}
> > > > > +
> > > > > +int tap_load(NetClientState *nc, QEMUFile *f)
> > > > > +{
> > > > > + TAPState *s = DO_UPCAST(TAPState, nc, nc);
> > > > > +
> > > > > + return vmstate_load_state(f, &vmstate_tap, s, 0);
> > > > > +}
> > > >
> > > > Instead of hard-coding vmstate_save_state() / vmstate_load_state(), could
> > > > we make tap's VMSD to be a subsection of virtio-net's?
> > > >
> > > > Multifd already doesn't support qemufile, but only iochannels (which is the
> > > > internal impl of qemufiles). We might at some point start to concurrently
> > > > load devices with multifd, then anything with qemufile will be a no-go and
> > > > need to be serialized as legacy code in the main channel, or rewritten.
> > > >
> > > > IMHO it'll be great if we can avoid adding new codes operating on
> > > > qemufiles, and also avoid adding any new custom VMSD fields' put()/get() if
> > > > ever possible.
> > > >
> > >
> > > Subsections are loaded after fields.
> > >
> > > And virtio-net already has fields
> > >
> > > VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
> > > vmstate_virtio_net_has_vnet),
> > >
> > > and
> > >
> > > VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
> > > vmstate_virtio_net_has_ufo),
> >
> > Side note: I'm actually a bit confused on why it needs to use
> > VMSTATE_WITH_TMP(), or say, the get()/put() directly.
> >
> > Taking example of vmstate_virtio_net_has_ufo, afaiu, the only point here is
> > that virtio_net_ufo_post_load() would check peer UFO support. I wonder if
> > that should work too to check that in virtio_net_post_load_device(), and
> > fail there if anything is wrong.. then would has_ufo be able to be
> > migrated as a VMSTATE_U8 field?
[1]
> >
> > >
> > > Which do check on virtio-net level some parameters, which should come from local migration of TAP.
> > >
> > > That's why I made TAP a field, and put it before these two ones. This way these two checks works.
> > >
> > >
> > > Still, from your comment I understand that hard-coding save/load is worse problem. So I can just
> > > skip checking in vmstate_virtio_net_has_vnet / vmstate_virtio_net_has_ufo with enabled "local-tap"
> > > (or "fd-passing") capability (or better migration parameter). This way TAP may be a subsection.
> >
> > That'll be nice, thanks. Or would a VMSTATE_STRUCT() for TAP to work (so
> > that it can also be put before the two _TMPs, but avoid raw get()/put())?
> >
>
> I considered using VMSTATE_STRUCT, but it would mean, I should access TAPState directly from virtio-net
> code. That's not possible now, as TAPState is a static type in tap.c, and think it's better to keep
> it static. So, in this case, subsection should be better, if I understand correctly.
Ah OK. Then I wonder if this is a good chance too to move the peer feature
checks above [1] directly over to virtio-net's post_load as a pre-requisite
change, which should also be after the subsections. Then the hope is it'll
also help removing some _TMP users.
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3 5/9] net/tap: implement interfaces for local migration
2025-09-09 14:56 ` Peter Xu
@ 2025-09-09 15:08 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 26+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-09-09 15:08 UTC (permalink / raw)
To: Peter Xu
Cc: jasowang, qemu-devel, leiyang, steven.sistare, yc-core, mst,
farosas, eblake, armbru, thuth, philmd, berrange
On 09.09.25 17:56, Peter Xu wrote:
> On Tue, Sep 09, 2025 at 10:44:22AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 08.09.25 23:01, Peter Xu wrote:
>>> On Mon, Sep 08, 2025 at 07:48:09PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>> On 08.09.25 18:42, Peter Xu wrote:
>>>>> On Fri, Sep 05, 2025 at 04:50:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> +static const VMStateDescription vmstate_tap = {
>>>>>> + .name = "virtio-net-device",
>>>>>> + .post_load = tap_post_load,
>>>>>> + .fields = (const VMStateField[]) {
>>>>>> + VMSTATE_FD(fd, TAPState),
>>>>>> + VMSTATE_BOOL(using_vnet_hdr, TAPState),
>>>>>> + VMSTATE_BOOL(has_ufo, TAPState),
>>>>>> + VMSTATE_BOOL(has_uso, TAPState),
>>>>>> + VMSTATE_BOOL(enabled, TAPState),
>>>>>> + VMSTATE_UINT32(host_vnet_hdr_len, TAPState),
>>>>>> + VMSTATE_END_OF_LIST()
>>>>>> + }
>>>>>> +};
>>>>>> +
>>>>>> +int tap_save(NetClientState *nc, QEMUFile *f)
>>>>>> +{
>>>>>> + TAPState *s = DO_UPCAST(TAPState, nc, nc);
>>>>>> +
>>>>>> + return vmstate_save_state(f, &vmstate_tap, s, 0);
>>>>>> +}
>>>>>> +
>>>>>> +int tap_load(NetClientState *nc, QEMUFile *f)
>>>>>> +{
>>>>>> + TAPState *s = DO_UPCAST(TAPState, nc, nc);
>>>>>> +
>>>>>> + return vmstate_load_state(f, &vmstate_tap, s, 0);
>>>>>> +}
>>>>>
>>>>> Instead of hard-coding vmstate_save_state() / vmstate_load_state(), could
>>>>> we make tap's VMSD to be a subsection of virtio-net's?
>>>>>
>>>>> Multifd already doesn't support qemufile, but only iochannels (which is the
>>>>> internal impl of qemufiles). We might at some point start to concurrently
>>>>> load devices with multifd, then anything with qemufile will be a no-go and
>>>>> need to be serialized as legacy code in the main channel, or rewritten.
>>>>>
>>>>> IMHO it'll be great if we can avoid adding new codes operating on
>>>>> qemufiles, and also avoid adding any new custom VMSD fields' put()/get() if
>>>>> ever possible.
>>>>>
>>>>
>>>> Subsections are loaded after fields.
>>>>
>>>> And virtio-net already has fields
>>>>
>>>> VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
>>>> vmstate_virtio_net_has_vnet),
>>>>
>>>> and
>>>>
>>>> VMSTATE_WITH_TMP(VirtIONet, struct VirtIONetMigTmp,
>>>> vmstate_virtio_net_has_ufo),
>>>
>>> Side note: I'm actually a bit confused on why it needs to use
>>> VMSTATE_WITH_TMP(), or say, the get()/put() directly.
>>>
>>> Taking example of vmstate_virtio_net_has_ufo, afaiu, the only point here is
>>> that virtio_net_ufo_post_load() would check peer UFO support. I wonder if
>>> that should work too to check that in virtio_net_post_load_device(), and
>>> fail there if anything is wrong.. then would has_ufo be able to be
>>> migrated as a VMSTATE_U8 field?
>
> [1]
>
>>>
>>>>
>>>> Which do check on virtio-net level some parameters, which should come from local migration of TAP.
>>>>
>>>> That's why I made TAP a field, and put it before these two ones. This way these two checks works.
>>>>
>>>>
>>>> Still, from your comment I understand that hard-coding save/load is worse problem. So I can just
>>>> skip checking in vmstate_virtio_net_has_vnet / vmstate_virtio_net_has_ufo with enabled "local-tap"
>>>> (or "fd-passing") capability (or better migration parameter). This way TAP may be a subsection.
>>>
>>> That'll be nice, thanks. Or would a VMSTATE_STRUCT() for TAP to work (so
>>> that it can also be put before the two _TMPs, but avoid raw get()/put())?
>>>
>>
>> I considered using VMSTATE_STRUCT, but it would mean, I should access TAPState directly from virtio-net
>> code. That's not possible now, as TAPState is a static type in tap.c, and think it's better to keep
>> it static. So, in this case, subsection should be better, if I understand correctly.
>
> Ah OK. Then I wonder if this is a good chance too to move the peer feature
> checks above [1] directly over to virtio-net's post_load as a pre-requisite
> change, which should also be after the subsections. Then the hope is it'll
> also help removing some _TMP users.
>
Sounds good, will try.
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2025-09-09 15:10 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-05 13:50 [PATCH v3 0/9] virtio-net: live-TAP local migration Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 1/9] net/tap: add some trace points Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 2/9] net/tap: keep exit notifier only when downscript set Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 3/9] net/tap: refactor net_tap_setup_vhost() Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 4/9] qapi: add interface for local TAP migration Vladimir Sementsov-Ogievskiy
2025-09-08 15:35 ` Peter Xu
2025-09-08 16:38 ` Vladimir Sementsov-Ogievskiy
2025-09-08 18:47 ` Peter Xu
2025-09-09 7:40 ` Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 5/9] net/tap: implement interfaces for local migration Vladimir Sementsov-Ogievskiy
2025-09-08 15:42 ` Peter Xu
2025-09-08 16:48 ` Vladimir Sementsov-Ogievskiy
2025-09-08 20:01 ` Peter Xu
2025-09-09 7:44 ` Vladimir Sementsov-Ogievskiy
2025-09-09 14:56 ` Peter Xu
2025-09-09 15:08 ` Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 6/9] virtio-net: support local tap migration Vladimir Sementsov-Ogievskiy
2025-09-08 15:43 ` Peter Xu
2025-09-08 16:48 ` Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 7/9] tests/functional: exec_command_and_wait_for_pattern: add vm arg Vladimir Sementsov-Ogievskiy
2025-09-09 14:03 ` Thomas Huth
2025-09-09 14:09 ` Vladimir Sementsov-Ogievskiy
2025-09-05 13:50 ` [PATCH v3 8/9] tests/functional: add skipUnlessPasswordlessSudo() decorator Vladimir Sementsov-Ogievskiy
2025-09-08 15:49 ` Daniel P. Berrangé
2025-09-05 13:50 ` [PATCH v3 9/9] tests/functional: add test_x86_64_tap_fd_migration Vladimir Sementsov-Ogievskiy
2025-09-08 15:58 ` Daniel P. Berrangé
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).