* [RFC PATCH] virtio-net: introduce strict peer feature check
@ 2025-11-07 2:01 Jason Wang
2025-11-12 21:55 ` Peter Xu
2025-11-13 16:09 ` Michael S. Tsirkin
0 siblings, 2 replies; 26+ messages in thread
From: Jason Wang @ 2025-11-07 2:01 UTC (permalink / raw)
To: eduardo, marcel.apfelbaum, philmd, wangyanan55, zhao1.liu, mst,
jasowang, qemu-devel, peterx
Cc: farosas, jinpu.wang, thuth, berrange
We used to clear features silently in virtio_net_get_features() even
if it is required. This complicates the live migration compatibility
as the management layer may think the feature is enabled but in fact
not.
Let's add a strict feature check to make sure if there's a mismatch
between the required feature and peer, fail the get_features()
immediately instead of waiting until the migration to fail. This
offload the migration compatibility completely to the management
layer.
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
hw/core/machine.c | 1 +
hw/net/virtio-net.c | 153 +++++++++++++++++++++++++--------
include/hw/virtio/virtio-net.h | 1 +
3 files changed, 119 insertions(+), 36 deletions(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 681adbb7ac..a9e43c4990 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -40,6 +40,7 @@
GlobalProperty hw_compat_10_1[] = {
{ TYPE_ACPI_GED, "x-has-hest-addr", "false" },
+ { TYPE_VIRTIO_NET, "strict-peer-feature-check", "false"},
};
const size_t hw_compat_10_1_len = G_N_ELEMENTS(hw_compat_10_1);
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 33116712eb..3acc5ed4a6 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3090,53 +3090,120 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
virtio_add_feature_ex(features, VIRTIO_NET_F_MAC);
if (!peer_has_vnet_hdr(n)) {
- virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO6);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_ECN);
-
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_ECN);
-
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
-
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
- virtio_clear_feature_ex(features,
- VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
- virtio_clear_feature_ex(features,
- VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
+ if (n->strict_peer_feature_check) {
+ if (virtio_has_feature_ex(features, VIRTIO_NET_F_CSUM) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_TSO4) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_TSO6) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_ECN) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_ECN) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_USO) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO4) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO6) |
+ virtio_has_feature_ex(features,
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) |
+ virtio_has_feature_ex(features,
+ VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO) |
+ virtio_has_feature_ex(features,
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM) |
+ virtio_has_feature_ex(features,
+ VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM) |
+ virtio_has_feature_ex(features,
+ VIRTIO_NET_F_HASH_REPORT)) {
+ error_setg(errp, "virtio_net: peer doesn't support vnet hdr");
+ return;
+ }
+ } else {
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO6);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_ECN);
+
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_ECN);
+
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
+
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
+ virtio_clear_feature_ex(features,
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
+ virtio_clear_feature_ex(features,
+ VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
+ }
}
if (!peer_has_vnet_hdr(n) || !peer_has_ufo(n)) {
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UFO);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UFO);
+ if (n->strict_peer_feature_check) {
+ if (virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_UFO) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_UFO)) {
+ error_setg(errp, "virtio_net: peer doesn't support UFO");
+ return;
+ }
+ } else {
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UFO);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UFO);
+ }
}
if (!peer_has_uso(n)) {
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
+ if (n->strict_peer_feature_check) {
+ if (virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_USO) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO4) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO6)) {
+ error_setg(errp, "virtio_net: peer doesn't support USO");
+ return;
+ }
+ } else {
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
+ }
}
if (!peer_has_tunnel(n)) {
- virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
- virtio_clear_feature_ex(features,
- VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
- virtio_clear_feature_ex(features,
- VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
+ if (n->strict_peer_feature_check) {
+ if (virtio_has_feature_ex(features,
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) |
+ virtio_has_feature_ex(features,
+ VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO) |
+ virtio_has_feature_ex(features,
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM) |
+ virtio_has_feature_ex(features,
+ VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM)) {
+ error_setg(errp, "virtio_net: peer doesn't support tunnel GSO");
+ return;
+ }
+ } else {
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
+ virtio_clear_feature_ex(features,
+ VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
+ virtio_clear_feature_ex(features,
+ VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
+ }
}
if (!get_vhost_net(nc->peer)) {
if (!use_own_hash) {
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
- virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
+ if (n->strict_peer_feature_check) {
+ if (virtio_has_feature_ex(features, VIRTIO_NET_F_HASH_REPORT) |
+ virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
+ error_setg(errp,
+ "virtio_net: peer doesn't support RSS/HASH_REPORT");
+ return;
+ }
+ } else {
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
+ }
} else if (virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
virtio_net_load_ebpf(n, errp);
}
@@ -3145,14 +3212,26 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
}
if (!use_peer_hash) {
- virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
+ if (n->strict_peer_feature_check &&
+ virtio_has_feature_ex(features, VIRTIO_NET_F_HASH_REPORT)) {
+ error_setg(errp, "virtio_net: peer doesn't HASH_REPORT");
+ return;
+ } else {
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
+ }
if (!use_own_hash || !virtio_net_attach_ebpf_to_backend(n->nic, -1)) {
if (!virtio_net_load_ebpf(n, errp)) {
return;
}
- virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
+ if (n->strict_peer_feature_check &&
+ virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
+ error_setg(errp, "virtio_net: fail to attach eBPF for RSS");
+ return;
+ } else {
+ virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
+ }
}
}
@@ -4313,6 +4392,8 @@ static const Property virtio_net_properties[] = {
host_features_ex,
VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM,
false),
+ DEFINE_PROP_BOOL("strict-peer-feature-check", VirtIONet,
+ strict_peer_feature_check, true),
};
static void virtio_net_class_init(ObjectClass *klass, const void *data)
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 5b8ab7bda7..abd4ca4bb0 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -222,6 +222,7 @@ struct VirtIONet {
/* primary failover device is hidden*/
bool failover_primary_hidden;
bool failover;
+ bool strict_peer_feature_check;
DeviceListener primary_listener;
QDict *primary_opts;
bool primary_opts_from_json;
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-07 2:01 [RFC PATCH] virtio-net: introduce strict peer feature check Jason Wang
@ 2025-11-12 21:55 ` Peter Xu
2025-11-13 0:31 ` Jason Wang
2025-11-13 8:53 ` Daniel P. Berrangé
2025-11-13 16:09 ` Michael S. Tsirkin
1 sibling, 2 replies; 26+ messages in thread
From: Peter Xu @ 2025-11-12 21:55 UTC (permalink / raw)
To: Jason Wang
Cc: eduardo, marcel.apfelbaum, philmd, wangyanan55, zhao1.liu, mst,
qemu-devel, farosas, jinpu.wang, thuth, berrange
On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> We used to clear features silently in virtio_net_get_features() even
> if it is required. This complicates the live migration compatibility
> as the management layer may think the feature is enabled but in fact
> not.
>
> Let's add a strict feature check to make sure if there's a mismatch
> between the required feature and peer, fail the get_features()
> immediately instead of waiting until the migration to fail. This
> offload the migration compatibility completely to the management
> layer.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
Jason, thanks for help looking into the problem!
Am I right that after this patch applied, whenever a new QEMU boots with
the new machine types (e.g. having USO* by default ON), will fail to boot
on an old kernel that doesn't support USO*, but ask the users to turn off
USO* features explicitly in the virtio-net devices?
Thanks,
> ---
> hw/core/machine.c | 1 +
> hw/net/virtio-net.c | 153 +++++++++++++++++++++++++--------
> include/hw/virtio/virtio-net.h | 1 +
> 3 files changed, 119 insertions(+), 36 deletions(-)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 681adbb7ac..a9e43c4990 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -40,6 +40,7 @@
>
> GlobalProperty hw_compat_10_1[] = {
> { TYPE_ACPI_GED, "x-has-hest-addr", "false" },
> + { TYPE_VIRTIO_NET, "strict-peer-feature-check", "false"},
> };
> const size_t hw_compat_10_1_len = G_N_ELEMENTS(hw_compat_10_1);
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 33116712eb..3acc5ed4a6 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -3090,53 +3090,120 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
> virtio_add_feature_ex(features, VIRTIO_NET_F_MAC);
>
> if (!peer_has_vnet_hdr(n)) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO6);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_ECN);
> -
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_ECN);
> -
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> -
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> - virtio_clear_feature_ex(features,
> - VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> - virtio_clear_feature_ex(features,
> - VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features, VIRTIO_NET_F_CSUM) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_TSO4) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_TSO6) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_ECN) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_ECN) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_USO) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO4) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO6) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HASH_REPORT)) {
> + error_setg(errp, "virtio_net: peer doesn't support vnet hdr");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO6);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_ECN);
> +
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_ECN);
> +
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> +
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> + virtio_clear_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> + virtio_clear_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
>
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + }
> }
>
> if (!peer_has_vnet_hdr(n) || !peer_has_ufo(n)) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UFO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UFO);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_UFO) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_UFO)) {
> + error_setg(errp, "virtio_net: peer doesn't support UFO");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UFO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UFO);
> + }
> }
> if (!peer_has_uso(n)) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_USO) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO4) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO6)) {
> + error_setg(errp, "virtio_net: peer doesn't support USO");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> + }
> }
>
> if (!peer_has_tunnel(n)) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> - virtio_clear_feature_ex(features,
> - VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> - virtio_clear_feature_ex(features,
> - VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM)) {
> + error_setg(errp, "virtio_net: peer doesn't support tunnel GSO");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> + virtio_clear_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> + virtio_clear_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> + }
> }
>
> if (!get_vhost_net(nc->peer)) {
> if (!use_own_hash) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features, VIRTIO_NET_F_HASH_REPORT) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> + error_setg(errp,
> + "virtio_net: peer doesn't support RSS/HASH_REPORT");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> + }
> } else if (virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> virtio_net_load_ebpf(n, errp);
> }
> @@ -3145,14 +3212,26 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
> }
>
> if (!use_peer_hash) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + if (n->strict_peer_feature_check &&
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HASH_REPORT)) {
> + error_setg(errp, "virtio_net: peer doesn't HASH_REPORT");
> + return;
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + }
>
> if (!use_own_hash || !virtio_net_attach_ebpf_to_backend(n->nic, -1)) {
> if (!virtio_net_load_ebpf(n, errp)) {
> return;
> }
>
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> + if (n->strict_peer_feature_check &&
> + virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> + error_setg(errp, "virtio_net: fail to attach eBPF for RSS");
> + return;
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> + }
> }
> }
>
> @@ -4313,6 +4392,8 @@ static const Property virtio_net_properties[] = {
> host_features_ex,
> VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM,
> false),
> + DEFINE_PROP_BOOL("strict-peer-feature-check", VirtIONet,
> + strict_peer_feature_check, true),
> };
>
> static void virtio_net_class_init(ObjectClass *klass, const void *data)
> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> index 5b8ab7bda7..abd4ca4bb0 100644
> --- a/include/hw/virtio/virtio-net.h
> +++ b/include/hw/virtio/virtio-net.h
> @@ -222,6 +222,7 @@ struct VirtIONet {
> /* primary failover device is hidden*/
> bool failover_primary_hidden;
> bool failover;
> + bool strict_peer_feature_check;
> DeviceListener primary_listener;
> QDict *primary_opts;
> bool primary_opts_from_json;
> --
> 2.34.1
>
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-12 21:55 ` Peter Xu
@ 2025-11-13 0:31 ` Jason Wang
2025-11-13 15:51 ` Peter Xu
2025-11-13 8:53 ` Daniel P. Berrangé
1 sibling, 1 reply; 26+ messages in thread
From: Jason Wang @ 2025-11-13 0:31 UTC (permalink / raw)
To: Peter Xu
Cc: eduardo, marcel.apfelbaum, philmd, wangyanan55, zhao1.liu, mst,
qemu-devel, farosas, jinpu.wang, thuth, berrange
On Thu, Nov 13, 2025 at 5:55 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > We used to clear features silently in virtio_net_get_features() even
> > if it is required. This complicates the live migration compatibility
> > as the management layer may think the feature is enabled but in fact
> > not.
> >
> > Let's add a strict feature check to make sure if there's a mismatch
> > between the required feature and peer, fail the get_features()
> > immediately instead of waiting until the migration to fail. This
> > offload the migration compatibility completely to the management
> > layer.
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>
> Jason, thanks for help looking into the problem!
>
> Am I right that after this patch applied, whenever a new QEMU boots with
> the new machine types (e.g. having USO* by default ON), will fail to boot
> on an old kernel that doesn't support USO*, but ask the users to turn off
> USO* features explicitly in the virtio-net devices?
>
> Thanks,
Yes, I wonder if this can help in dealing with migration compatibility issues.
Thanks
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-12 21:55 ` Peter Xu
2025-11-13 0:31 ` Jason Wang
@ 2025-11-13 8:53 ` Daniel P. Berrangé
2025-11-13 15:58 ` Peter Xu
1 sibling, 1 reply; 26+ messages in thread
From: Daniel P. Berrangé @ 2025-11-13 8:53 UTC (permalink / raw)
To: Peter Xu
Cc: Jason Wang, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, mst, qemu-devel, farosas, jinpu.wang, thuth
On Wed, Nov 12, 2025 at 04:55:42PM -0500, Peter Xu wrote:
> On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > We used to clear features silently in virtio_net_get_features() even
> > if it is required. This complicates the live migration compatibility
> > as the management layer may think the feature is enabled but in fact
> > not.
> >
> > Let's add a strict feature check to make sure if there's a mismatch
> > between the required feature and peer, fail the get_features()
> > immediately instead of waiting until the migration to fail. This
> > offload the migration compatibility completely to the management
> > layer.
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>
> Jason, thanks for help looking into the problem!
>
> Am I right that after this patch applied, whenever a new QEMU boots with
> the new machine types (e.g. having USO* by default ON), will fail to boot
> on an old kernel that doesn't support USO*, but ask the users to turn off
> USO* features explicitly in the virtio-net devices?
What kernel version are we talking about where there will be
incompatibility ? Is it old enough that it pre-dates our
platform support matrix requirements ? Ubuntu 22.04 and
RHEL-9 are currently our targets with the oldest kernels
that we need to retain compatibility with as the bare min.
I would expect machine types to work on these old platforms
without users to having to manually disable default set
features.
>
> Thanks,
>
> > ---
> > hw/core/machine.c | 1 +
> > hw/net/virtio-net.c | 153 +++++++++++++++++++++++++--------
> > include/hw/virtio/virtio-net.h | 1 +
> > 3 files changed, 119 insertions(+), 36 deletions(-)
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 681adbb7ac..a9e43c4990 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -40,6 +40,7 @@
> >
> > GlobalProperty hw_compat_10_1[] = {
> > { TYPE_ACPI_GED, "x-has-hest-addr", "false" },
> > + { TYPE_VIRTIO_NET, "strict-peer-feature-check", "false"},
> > };
> > const size_t hw_compat_10_1_len = G_N_ELEMENTS(hw_compat_10_1);
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 33116712eb..3acc5ed4a6 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -3090,53 +3090,120 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
> > virtio_add_feature_ex(features, VIRTIO_NET_F_MAC);
> >
> > if (!peer_has_vnet_hdr(n)) {
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO6);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_ECN);
> > -
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_ECN);
> > -
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> > -
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> > - virtio_clear_feature_ex(features,
> > - VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> > - virtio_clear_feature_ex(features,
> > - VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> > + if (n->strict_peer_feature_check) {
> > + if (virtio_has_feature_ex(features, VIRTIO_NET_F_CSUM) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_TSO4) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_TSO6) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_ECN) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_ECN) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_USO) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO4) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO6) |
> > + virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) |
> > + virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO) |
> > + virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM) |
> > + virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM) |
> > + virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_HASH_REPORT)) {
> > + error_setg(errp, "virtio_net: peer doesn't support vnet hdr");
> > + return;
> > + }
> > + } else {
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO6);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_ECN);
> > +
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_ECN);
> > +
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> > +
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> > + virtio_clear_feature_ex(features,
> > + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> > + virtio_clear_feature_ex(features,
> > + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> >
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> > + }
> > }
> >
> > if (!peer_has_vnet_hdr(n) || !peer_has_ufo(n)) {
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UFO);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UFO);
> > + if (n->strict_peer_feature_check) {
> > + if (virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_UFO) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_UFO)) {
> > + error_setg(errp, "virtio_net: peer doesn't support UFO");
> > + return;
> > + }
> > + } else {
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UFO);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UFO);
> > + }
> > }
> > if (!peer_has_uso(n)) {
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> > + if (n->strict_peer_feature_check) {
> > + if (virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_USO) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO4) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO6)) {
> > + error_setg(errp, "virtio_net: peer doesn't support USO");
> > + return;
> > + }
> > + } else {
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> > + }
> > }
> >
> > if (!peer_has_tunnel(n)) {
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> > - virtio_clear_feature_ex(features,
> > - VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> > - virtio_clear_feature_ex(features,
> > - VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> > + if (n->strict_peer_feature_check) {
> > + if (virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) |
> > + virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO) |
> > + virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM) |
> > + virtio_has_feature_ex(features,
> > + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM)) {
> > + error_setg(errp, "virtio_net: peer doesn't support tunnel GSO");
> > + return;
> > + }
> > + } else {
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> > + virtio_clear_feature_ex(features,
> > + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> > + virtio_clear_feature_ex(features,
> > + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> > + }
> > }
> >
> > if (!get_vhost_net(nc->peer)) {
> > if (!use_own_hash) {
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> > + if (n->strict_peer_feature_check) {
> > + if (virtio_has_feature_ex(features, VIRTIO_NET_F_HASH_REPORT) |
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> > + error_setg(errp,
> > + "virtio_net: peer doesn't support RSS/HASH_REPORT");
> > + return;
> > + }
> > + } else {
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> > + }
> > } else if (virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> > virtio_net_load_ebpf(n, errp);
> > }
> > @@ -3145,14 +3212,26 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
> > }
> >
> > if (!use_peer_hash) {
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> > + if (n->strict_peer_feature_check &&
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_HASH_REPORT)) {
> > + error_setg(errp, "virtio_net: peer doesn't HASH_REPORT");
> > + return;
> > + } else {
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> > + }
> >
> > if (!use_own_hash || !virtio_net_attach_ebpf_to_backend(n->nic, -1)) {
> > if (!virtio_net_load_ebpf(n, errp)) {
> > return;
> > }
> >
> > - virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> > + if (n->strict_peer_feature_check &&
> > + virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> > + error_setg(errp, "virtio_net: fail to attach eBPF for RSS");
> > + return;
> > + } else {
> > + virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> > + }
> > }
> > }
> >
> > @@ -4313,6 +4392,8 @@ static const Property virtio_net_properties[] = {
> > host_features_ex,
> > VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM,
> > false),
> > + DEFINE_PROP_BOOL("strict-peer-feature-check", VirtIONet,
> > + strict_peer_feature_check, true),
> > };
> >
> > static void virtio_net_class_init(ObjectClass *klass, const void *data)
> > diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> > index 5b8ab7bda7..abd4ca4bb0 100644
> > --- a/include/hw/virtio/virtio-net.h
> > +++ b/include/hw/virtio/virtio-net.h
> > @@ -222,6 +222,7 @@ struct VirtIONet {
> > /* primary failover device is hidden*/
> > bool failover_primary_hidden;
> > bool failover;
> > + bool strict_peer_feature_check;
> > DeviceListener primary_listener;
> > QDict *primary_opts;
> > bool primary_opts_from_json;
> > --
> > 2.34.1
> >
>
> --
> Peter Xu
>
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 0:31 ` Jason Wang
@ 2025-11-13 15:51 ` Peter Xu
0 siblings, 0 replies; 26+ messages in thread
From: Peter Xu @ 2025-11-13 15:51 UTC (permalink / raw)
To: Jason Wang
Cc: eduardo, marcel.apfelbaum, philmd, wangyanan55, zhao1.liu, mst,
qemu-devel, farosas, jinpu.wang, thuth, berrange
On Thu, Nov 13, 2025 at 08:31:14AM +0800, Jason Wang wrote:
> On Thu, Nov 13, 2025 at 5:55 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > We used to clear features silently in virtio_net_get_features() even
> > > if it is required. This complicates the live migration compatibility
> > > as the management layer may think the feature is enabled but in fact
> > > not.
> > >
> > > Let's add a strict feature check to make sure if there's a mismatch
> > > between the required feature and peer, fail the get_features()
> > > immediately instead of waiting until the migration to fail. This
> > > offload the migration compatibility completely to the management
> > > layer.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> >
> > Jason, thanks for help looking into the problem!
> >
> > Am I right that after this patch applied, whenever a new QEMU boots with
> > the new machine types (e.g. having USO* by default ON), will fail to boot
> > on an old kernel that doesn't support USO*, but ask the users to turn off
> > USO* features explicitly in the virtio-net devices?
> >
> > Thanks,
>
> Yes, I wonder if this can help in dealing with migration compatibility issues.
Yes I think so.
The only thing I do not know well is how much risk new qemu will start to
fail boot on old kernels. One thing we can do here is be less aggressive
on set default-ON to new network features.
E.g. for features like USO* we can avoid turning it ON by default when
introduced, but wait for a few more releases. Distros / Downstreams can
still be aggresive though to tweak the default if they know exactly the
kernels to be run on top will be new enough.
Thanks again for the proposal, Jason. Anyway, from migration side, feel
free to take:
Acked-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 8:53 ` Daniel P. Berrangé
@ 2025-11-13 15:58 ` Peter Xu
0 siblings, 0 replies; 26+ messages in thread
From: Peter Xu @ 2025-11-13 15:58 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: Jason Wang, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, mst, qemu-devel, farosas, jinpu.wang, thuth
On Thu, Nov 13, 2025 at 08:53:30AM +0000, Daniel P. Berrangé wrote:
> On Wed, Nov 12, 2025 at 04:55:42PM -0500, Peter Xu wrote:
> > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > We used to clear features silently in virtio_net_get_features() even
> > > if it is required. This complicates the live migration compatibility
> > > as the management layer may think the feature is enabled but in fact
> > > not.
> > >
> > > Let's add a strict feature check to make sure if there's a mismatch
> > > between the required feature and peer, fail the get_features()
> > > immediately instead of waiting until the migration to fail. This
> > > offload the migration compatibility completely to the management
> > > layer.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> >
> > Jason, thanks for help looking into the problem!
> >
> > Am I right that after this patch applied, whenever a new QEMU boots with
> > the new machine types (e.g. having USO* by default ON), will fail to boot
> > on an old kernel that doesn't support USO*, but ask the users to turn off
> > USO* features explicitly in the virtio-net devices?
>
> What kernel version are we talking about where there will be
> incompatibility ? Is it old enough that it pre-dates our
> platform support matrix requirements ? Ubuntu 22.04 and
> RHEL-9 are currently our targets with the oldest kernels
> that we need to retain compatibility with as the bare min.
> I would expect machine types to work on these old platforms
> without users to having to manually disable default set
> features.
Jason's proposal should have kept the behavior for old machine types so the
strict checks are bypassed, so at least existing running VMs with old
machine types should not be affected on booting.
New machine types may suffer from this indeed, that when running on old
kernels it may needs some tweak on cmdlines.
There's another alternative, which we can introduce an option to allow QEMU
boot but forbidding migration (or at least show a warning to user that
migration may not work properly). Then everything can be auto-probed like
before, because migration ABI is not necessary.
Personally, I think Jason's proposal is a good trade-off we can consider,
if we think migration should by default supported on any QEMUs that would
boot up properly.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-07 2:01 [RFC PATCH] virtio-net: introduce strict peer feature check Jason Wang
2025-11-12 21:55 ` Peter Xu
@ 2025-11-13 16:09 ` Michael S. Tsirkin
2025-11-13 16:37 ` Peter Xu
1 sibling, 1 reply; 26+ messages in thread
From: Michael S. Tsirkin @ 2025-11-13 16:09 UTC (permalink / raw)
To: Jason Wang
Cc: eduardo, marcel.apfelbaum, philmd, wangyanan55, zhao1.liu,
qemu-devel, peterx, farosas, jinpu.wang, thuth, berrange
On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> We used to clear features silently in virtio_net_get_features() even
> if it is required. This complicates the live migration compatibility
> as the management layer may think the feature is enabled but in fact
> not.
>
> Let's add a strict feature check to make sure if there's a mismatch
> between the required feature and peer, fail the get_features()
> immediately instead of waiting until the migration to fail. This
> offload the migration compatibility completely to the management
> layer.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
This is not really useful - how do users know how to tweak their
command lines?
We discussed this many times.
To try and solve this you need a tool that will tell you how to start
VM on X to make it migrateable to Y or Z.
More importantly,
migration is a niche thing and breaking booting perfectly good VMs
just for that seems wrong.
If you want to keep this off by default, and have management
enable this if it knows what it's doing, then I don't really
care.
> ---
> hw/core/machine.c | 1 +
> hw/net/virtio-net.c | 153 +++++++++++++++++++++++++--------
> include/hw/virtio/virtio-net.h | 1 +
> 3 files changed, 119 insertions(+), 36 deletions(-)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 681adbb7ac..a9e43c4990 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -40,6 +40,7 @@
>
> GlobalProperty hw_compat_10_1[] = {
> { TYPE_ACPI_GED, "x-has-hest-addr", "false" },
> + { TYPE_VIRTIO_NET, "strict-peer-feature-check", "false"},
> };
> const size_t hw_compat_10_1_len = G_N_ELEMENTS(hw_compat_10_1);
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 33116712eb..3acc5ed4a6 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -3090,53 +3090,120 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
> virtio_add_feature_ex(features, VIRTIO_NET_F_MAC);
>
> if (!peer_has_vnet_hdr(n)) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO6);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_ECN);
> -
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_ECN);
> -
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> -
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> - virtio_clear_feature_ex(features,
> - VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> - virtio_clear_feature_ex(features,
> - VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features, VIRTIO_NET_F_CSUM) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_TSO4) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_TSO6) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_ECN) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_ECN) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_USO) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO4) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO6) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HASH_REPORT)) {
> + error_setg(errp, "virtio_net: peer doesn't support vnet hdr");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_CSUM);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO4);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_TSO6);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_ECN);
> +
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_CSUM);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO4);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_TSO6);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_ECN);
> +
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> +
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> + virtio_clear_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> + virtio_clear_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
>
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + }
> }
>
> if (!peer_has_vnet_hdr(n) || !peer_has_ufo(n)) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UFO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UFO);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_UFO) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_UFO)) {
> + error_setg(errp, "virtio_net: peer doesn't support UFO");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UFO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UFO);
> + }
> }
> if (!peer_has_uso(n)) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features, VIRTIO_NET_F_HOST_USO) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO4) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_GUEST_USO6)) {
> + error_setg(errp, "virtio_net: peer doesn't support USO");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_USO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO4);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_USO6);
> + }
> }
>
> if (!peer_has_tunnel(n)) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> - virtio_clear_feature_ex(features,
> - VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> - virtio_clear_feature_ex(features,
> - VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM) |
> + virtio_has_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM)) {
> + error_setg(errp, "virtio_net: peer doesn't support tunnel GSO");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO);
> + virtio_clear_feature_ex(features,
> + VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM);
> + virtio_clear_feature_ex(features,
> + VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM);
> + }
> }
>
> if (!get_vhost_net(nc->peer)) {
> if (!use_own_hash) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> + if (n->strict_peer_feature_check) {
> + if (virtio_has_feature_ex(features, VIRTIO_NET_F_HASH_REPORT) |
> + virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> + error_setg(errp,
> + "virtio_net: peer doesn't support RSS/HASH_REPORT");
> + return;
> + }
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> + }
> } else if (virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> virtio_net_load_ebpf(n, errp);
> }
> @@ -3145,14 +3212,26 @@ static void virtio_net_get_features(VirtIODevice *vdev, uint64_t *features,
> }
>
> if (!use_peer_hash) {
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + if (n->strict_peer_feature_check &&
> + virtio_has_feature_ex(features, VIRTIO_NET_F_HASH_REPORT)) {
> + error_setg(errp, "virtio_net: peer doesn't HASH_REPORT");
> + return;
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_HASH_REPORT);
> + }
>
> if (!use_own_hash || !virtio_net_attach_ebpf_to_backend(n->nic, -1)) {
> if (!virtio_net_load_ebpf(n, errp)) {
> return;
> }
>
> - virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> + if (n->strict_peer_feature_check &&
> + virtio_has_feature_ex(features, VIRTIO_NET_F_RSS)) {
> + error_setg(errp, "virtio_net: fail to attach eBPF for RSS");
> + return;
> + } else {
> + virtio_clear_feature_ex(features, VIRTIO_NET_F_RSS);
> + }
> }
> }
>
> @@ -4313,6 +4392,8 @@ static const Property virtio_net_properties[] = {
> host_features_ex,
> VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM,
> false),
> + DEFINE_PROP_BOOL("strict-peer-feature-check", VirtIONet,
> + strict_peer_feature_check, true),
> };
>
> static void virtio_net_class_init(ObjectClass *klass, const void *data)
> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> index 5b8ab7bda7..abd4ca4bb0 100644
> --- a/include/hw/virtio/virtio-net.h
> +++ b/include/hw/virtio/virtio-net.h
> @@ -222,6 +222,7 @@ struct VirtIONet {
> /* primary failover device is hidden*/
> bool failover_primary_hidden;
> bool failover;
> + bool strict_peer_feature_check;
> DeviceListener primary_listener;
> QDict *primary_opts;
> bool primary_opts_from_json;
> --
> 2.34.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 16:09 ` Michael S. Tsirkin
@ 2025-11-13 16:37 ` Peter Xu
2025-11-13 16:47 ` Michael S. Tsirkin
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2025-11-13 16:37 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jason Wang, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > We used to clear features silently in virtio_net_get_features() even
> > if it is required. This complicates the live migration compatibility
> > as the management layer may think the feature is enabled but in fact
> > not.
> >
> > Let's add a strict feature check to make sure if there's a mismatch
> > between the required feature and peer, fail the get_features()
> > immediately instead of waiting until the migration to fail. This
> > offload the migration compatibility completely to the management
> > layer.
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>
> This is not really useful - how do users know how to tweak their
> command lines?
> We discussed this many times.
> To try and solve this you need a tool that will tell you how to start
> VM on X to make it migrateable to Y or Z.
>
>
> More importantly,
> migration is a niche thing and breaking booting perfectly good VMs
> just for that seems wrong.
IMHO Jason's proposal is useful in that it now provides a way to provide
ABI stablility but allows auto-ON to exist.
If we think migration is optional, we could add a migration blocker where
strict check flag is set to OFF, as I mentioned in the email reply to Dan.
As that implies the VM ABI is not guaranteed.
Thanks,
>
>
> If you want to keep this off by default, and have management
> enable this if it knows what it's doing, then I don't really
> care.
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 16:37 ` Peter Xu
@ 2025-11-13 16:47 ` Michael S. Tsirkin
2025-11-13 17:12 ` Peter Xu
0 siblings, 1 reply; 26+ messages in thread
From: Michael S. Tsirkin @ 2025-11-13 16:47 UTC (permalink / raw)
To: Peter Xu
Cc: Jason Wang, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Thu, Nov 13, 2025 at 11:37:25AM -0500, Peter Xu wrote:
> On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > We used to clear features silently in virtio_net_get_features() even
> > > if it is required. This complicates the live migration compatibility
> > > as the management layer may think the feature is enabled but in fact
> > > not.
> > >
> > > Let's add a strict feature check to make sure if there's a mismatch
> > > between the required feature and peer, fail the get_features()
> > > immediately instead of waiting until the migration to fail. This
> > > offload the migration compatibility completely to the management
> > > layer.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> >
> > This is not really useful - how do users know how to tweak their
> > command lines?
> > We discussed this many times.
> > To try and solve this you need a tool that will tell you how to start
> > VM on X to make it migrateable to Y or Z.
> >
> >
> > More importantly,
> > migration is a niche thing and breaking booting perfectly good VMs
> > just for that seems wrong.
>
> IMHO Jason's proposal is useful in that it now provides a way to provide
> ABI stablility but allows auto-ON to exist.
>
> If we think migration is optional, we could add a migration blocker where
> strict check flag is set to OFF, as I mentioned in the email reply to Dan.
> As that implies the VM ABI is not guaranteed.
>
> Thanks,
All you have to do is avoid changing the kernel and ABI is stable.
Downstreams already do this.
> >
> >
> > If you want to keep this off by default, and have management
> > enable this if it knows what it's doing, then I don't really
> > care.
>
> --
> Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 16:47 ` Michael S. Tsirkin
@ 2025-11-13 17:12 ` Peter Xu
2025-11-13 17:46 ` Michael S. Tsirkin
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2025-11-13 17:12 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jason Wang, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Thu, Nov 13, 2025 at 11:47:51AM -0500, Michael S. Tsirkin wrote:
> On Thu, Nov 13, 2025 at 11:37:25AM -0500, Peter Xu wrote:
> > On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> > > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > > We used to clear features silently in virtio_net_get_features() even
> > > > if it is required. This complicates the live migration compatibility
> > > > as the management layer may think the feature is enabled but in fact
> > > > not.
> > > >
> > > > Let's add a strict feature check to make sure if there's a mismatch
> > > > between the required feature and peer, fail the get_features()
> > > > immediately instead of waiting until the migration to fail. This
> > > > offload the migration compatibility completely to the management
> > > > layer.
> > > >
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > >
> > > This is not really useful - how do users know how to tweak their
> > > command lines?
> > > We discussed this many times.
> > > To try and solve this you need a tool that will tell you how to start
> > > VM on X to make it migrateable to Y or Z.
> > >
> > >
> > > More importantly,
> > > migration is a niche thing and breaking booting perfectly good VMs
> > > just for that seems wrong.
> >
> > IMHO Jason's proposal is useful in that it now provides a way to provide
> > ABI stablility but allows auto-ON to exist.
> >
> > If we think migration is optional, we could add a migration blocker where
> > strict check flag is set to OFF, as I mentioned in the email reply to Dan.
> > As that implies the VM ABI is not guaranteed.
> >
> > Thanks,
>
>
> All you have to do is avoid changing the kernel and ABI is stable.
> Downstreams already do this.
But the whole point of migration is allowing VMs to move between hosts..
hence AFAIU kernel can change.
Downstream will still have problem if some network features will be
optionally supported in some of the RHEL-N branches, because machine types
are defined the same in any RHEL-N, so IIUC it's also possible a VM booting
on a latest RHEL-X.Y qemu/kernel hit issues migrating back to an older
RHEL-X.(Y-1) qemu/kernel if RHEL-X.(Y-1) kernel doesn't have the network
feature available..
It's also not good IMHO to only fix downstream but having upstream face
such problems, even if there's a downstream fix...
This thread was revived only because Jinpu hit similar issues. IMHO we
should still try to provide a generic solution upstream for everyone.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 17:12 ` Peter Xu
@ 2025-11-13 17:46 ` Michael S. Tsirkin
2025-11-13 19:32 ` Peter Xu
2025-11-14 1:32 ` Jason Wang
0 siblings, 2 replies; 26+ messages in thread
From: Michael S. Tsirkin @ 2025-11-13 17:46 UTC (permalink / raw)
To: Peter Xu
Cc: Jason Wang, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Thu, Nov 13, 2025 at 12:12:38PM -0500, Peter Xu wrote:
> On Thu, Nov 13, 2025 at 11:47:51AM -0500, Michael S. Tsirkin wrote:
> > On Thu, Nov 13, 2025 at 11:37:25AM -0500, Peter Xu wrote:
> > > On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> > > > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > > > We used to clear features silently in virtio_net_get_features() even
> > > > > if it is required. This complicates the live migration compatibility
> > > > > as the management layer may think the feature is enabled but in fact
> > > > > not.
> > > > >
> > > > > Let's add a strict feature check to make sure if there's a mismatch
> > > > > between the required feature and peer, fail the get_features()
> > > > > immediately instead of waiting until the migration to fail. This
> > > > > offload the migration compatibility completely to the management
> > > > > layer.
> > > > >
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > >
> > > > This is not really useful - how do users know how to tweak their
> > > > command lines?
> > > > We discussed this many times.
> > > > To try and solve this you need a tool that will tell you how to start
> > > > VM on X to make it migrateable to Y or Z.
> > > >
> > > >
> > > > More importantly,
> > > > migration is a niche thing and breaking booting perfectly good VMs
> > > > just for that seems wrong.
> > >
> > > IMHO Jason's proposal is useful in that it now provides a way to provide
> > > ABI stablility but allows auto-ON to exist.
> > >
> > > If we think migration is optional, we could add a migration blocker where
> > > strict check flag is set to OFF, as I mentioned in the email reply to Dan.
> > > As that implies the VM ABI is not guaranteed.
> > >
> > > Thanks,
> >
> >
> > All you have to do is avoid changing the kernel and ABI is stable.
> > Downstreams already do this.
>
> But the whole point of migration is allowing VMs to move between hosts..
> hence AFAIU kernel can change.
>
> Downstream will still have problem if some network features will be
> optionally supported in some of the RHEL-N branches, because machine types
> are defined the same in any RHEL-N, so IIUC it's also possible a VM booting
> on a latest RHEL-X.Y qemu/kernel hit issues migrating back to an older
> RHEL-X.(Y-1) qemu/kernel if RHEL-X.(Y-1) kernel doesn't have the network
> feature available..
>
> It's also not good IMHO to only fix downstream but having upstream face
> such problems, even if there's a downstream fix...
>
> This thread was revived only because Jinpu hit similar issues. IMHO we
> should still try to provide a generic solution upstream for everyone.
>
> Thanks,
>
> --
> Peter Xu
failing to start a perfectly good qemu which used to work
because you changed kernels is better than failing to migrate how?
graceful downgrade with old kernels is the basics of good userspace
behaviour and has been for decades.
sure, let's work on a solution, just erroring out is more about blaming
the user. what is the user supposed to do when qemu fails to start?
first, formulate what exactly do you want to enable.
for example, you have a set of boxes and you want a set of flags
to supply to guarantee qemu can migrate between them. is that it?
--
MST
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 17:46 ` Michael S. Tsirkin
@ 2025-11-13 19:32 ` Peter Xu
2025-11-14 1:51 ` Jason Wang
2025-11-14 5:48 ` Thomas Huth
2025-11-14 1:32 ` Jason Wang
1 sibling, 2 replies; 26+ messages in thread
From: Peter Xu @ 2025-11-13 19:32 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jason Wang, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Thu, Nov 13, 2025 at 12:46:55PM -0500, Michael S. Tsirkin wrote:
> failing to start a perfectly good qemu which used to work
> because you changed kernels is better than failing to migrate how?
>
I agree this is not pretty.
The very original proposal was having extra features to be OFF by default,
only allow explicit selections to enable them when the mgmt / user is aware
of the possible hosts to run on top. That'll guarantee:
(1) explicit failure whenever some unsupported cap is chosen on boot,
(2) default setup should always assume no kernel dependency hence booting
should be all fine,
(3) since all features will be by default OFF or selected by the user with
explicit cmdlines, VM ABI is guaranteed so that migration will work.
But unfortunately that proposal was rejected.
>
> graceful downgrade with old kernels is the basics of good userspace
> behaviour and has been for decades.
>
>
> sure, let's work on a solution, just erroring out is more about blaming
> the user. what is the user supposed to do when qemu fails to start?
This is indeed a good question. If with strict checks maybe we would at
least want to make sure we throw explicit messages to let user know what to
turn off.
>
>
> first, formulate what exactly do you want to enable.
>
>
>
> for example, you have a set of boxes and you want a set of flags
> to supply to guarantee qemu can migrate between them. is that it?
Yes I think that's the case.
That's also why I think the very original proposal still makes sense
(having all defaults OFF when dependent on kernel), because only the mgmt
knows the details about the cluster, so it may make more sense to select
from the top which has the full knowledge base, explicitly enable some sets
of features (not only network, but also CPU feature bits and else). Then
the mgmt boots the VM, also knows where it can migrate explicitly.
If all things are hidden then the mgmt is almost out of control of this.
That was rejected because there's the need to by default enable new
features if ever possible. In that case, IMHO Jason's soluion is spot on
where it sits in the middle ground of both, allowing both to happen
(auto-enable of new feats, while keeping VM ABI stablility).
So IIUC there will be a cluster, it may contain different groups of hosts,
each group should have similar setups so that VMs can freely migrate
between each other within the same group (but may not easily migratable
across groups?). But I don't think I know well on that part in practise.
Dan might be a great source of input from that level.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 17:46 ` Michael S. Tsirkin
2025-11-13 19:32 ` Peter Xu
@ 2025-11-14 1:32 ` Jason Wang
2025-11-16 6:52 ` Michael S. Tsirkin
1 sibling, 1 reply; 26+ messages in thread
From: Jason Wang @ 2025-11-14 1:32 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Fri, Nov 14, 2025 at 1:47 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Nov 13, 2025 at 12:12:38PM -0500, Peter Xu wrote:
> > On Thu, Nov 13, 2025 at 11:47:51AM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Nov 13, 2025 at 11:37:25AM -0500, Peter Xu wrote:
> > > > On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> > > > > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > > > > We used to clear features silently in virtio_net_get_features() even
> > > > > > if it is required. This complicates the live migration compatibility
> > > > > > as the management layer may think the feature is enabled but in fact
> > > > > > not.
> > > > > >
> > > > > > Let's add a strict feature check to make sure if there's a mismatch
> > > > > > between the required feature and peer, fail the get_features()
> > > > > > immediately instead of waiting until the migration to fail. This
> > > > > > offload the migration compatibility completely to the management
> > > > > > layer.
> > > > > >
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > >
> > > > > This is not really useful - how do users know how to tweak their
> > > > > command lines?
> > > > > We discussed this many times.
> > > > > To try and solve this you need a tool that will tell you how to start
> > > > > VM on X to make it migrateable to Y or Z.
> > > > >
> > > > >
> > > > > More importantly,
> > > > > migration is a niche thing and breaking booting perfectly good VMs
> > > > > just for that seems wrong.
> > > >
> > > > IMHO Jason's proposal is useful in that it now provides a way to provide
> > > > ABI stablility but allows auto-ON to exist.
> > > >
> > > > If we think migration is optional, we could add a migration blocker where
> > > > strict check flag is set to OFF, as I mentioned in the email reply to Dan.
> > > > As that implies the VM ABI is not guaranteed.
> > > >
> > > > Thanks,
> > >
> > >
> > > All you have to do is avoid changing the kernel and ABI is stable.
> > > Downstreams already do this.
> >
> > But the whole point of migration is allowing VMs to move between hosts..
> > hence AFAIU kernel can change.
> >
> > Downstream will still have problem if some network features will be
> > optionally supported in some of the RHEL-N branches, because machine types
> > are defined the same in any RHEL-N, so IIUC it's also possible a VM booting
> > on a latest RHEL-X.Y qemu/kernel hit issues migrating back to an older
> > RHEL-X.(Y-1) qemu/kernel if RHEL-X.(Y-1) kernel doesn't have the network
> > feature available..
> >
> > It's also not good IMHO to only fix downstream but having upstream face
> > such problems, even if there's a downstream fix...
> >
> > This thread was revived only because Jinpu hit similar issues. IMHO we
> > should still try to provide a generic solution upstream for everyone.
> >
> > Thanks,
> >
> > --
> > Peter Xu
>
> failing to start a perfectly good qemu which used to work
> because you changed kernels is better than failing to migrate how?
It doesn't:
1) the strict feature check will be only enabled in new machine types
2) if kernel ABI is stable, qemu will keep working after upgrading
kernel even with strict check otherwise it would be a bug of kernel
So I don't see it breaking anything if we make it start to work at 11.0?
>
>
>
> graceful downgrade with old kernels is the basics of good userspace
> behaviour and has been for decades.
Peter has given the example of how hard we can define gracefulness
(e.g migrate from a kernel w/ USO to a kernel w/o USO) and fix.
Maybe we can think of a usersapce fallback to emulation of USO or
others, but I'm not sure if it's an overkill.
>
>
> sure, let's work on a solution, just erroring out is more about blaming
> the user. what is the user supposed to do when qemu fails to start?
It's the first step as it's much better than silently clearing the
feature which may confuse both user and migration. We can use warnings
instead of errors but I'm not sure how much it can help.
>
>
> first, formulate what exactly do you want to enable.
>
>
>
> for example, you have a set of boxes and you want a set of flags
> to supply to guarantee qemu can migrate between them. is that it?
Mostly, it should work as a CPU cluster. So it's the responsibility of
the management layer, maybe we can develop some tool to report this or
via qemu introspection ("query-tap" ?). Or if the management can do
this now, we don't even need to bother (or it can help to uncover
bugs). Anyhow, clearing a feature silently is not good and can cover
bugs of various layers.
Note that this issue is not specific to TAP, we may meet this for
vDPA/VFIO live migration as well. Basically, it should be the
responsibility of the management layer to deal with those migration
compatibility policies instead of using hard coded policies inside
Qemu. For qemu, it can simply error out when there's a mismatch
between features that are supported and features that are asked to
enable. We've suffered a lot in the past when trying to deal with this
by Qemu.
Thanks
>
>
>
> --
> MST
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 19:32 ` Peter Xu
@ 2025-11-14 1:51 ` Jason Wang
2025-11-16 6:45 ` Michael S. Tsirkin
2025-11-14 5:48 ` Thomas Huth
1 sibling, 1 reply; 26+ messages in thread
From: Jason Wang @ 2025-11-14 1:51 UTC (permalink / raw)
To: Peter Xu
Cc: Michael S. Tsirkin, eduardo, marcel.apfelbaum, philmd,
wangyanan55, zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth,
berrange
On Fri, Nov 14, 2025 at 3:32 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Thu, Nov 13, 2025 at 12:46:55PM -0500, Michael S. Tsirkin wrote:
> > failing to start a perfectly good qemu which used to work
> > because you changed kernels is better than failing to migrate how?
> >
>
> I agree this is not pretty.
>
> The very original proposal was having extra features to be OFF by default,
> only allow explicit selections to enable them when the mgmt / user is aware
> of the possible hosts to run on top. That'll guarantee:
>
> (1) explicit failure whenever some unsupported cap is chosen on boot,
>
> (2) default setup should always assume no kernel dependency hence booting
> should be all fine,
>
> (3) since all features will be by default OFF or selected by the user with
> explicit cmdlines, VM ABI is guaranteed so that migration will work.
>
> But unfortunately that proposal was rejected.
>
> >
> > graceful downgrade with old kernels is the basics of good userspace
> > behaviour and has been for decades.
> >
> >
> > sure, let's work on a solution, just erroring out is more about blaming
> > the user. what is the user supposed to do when qemu fails to start?
>
> This is indeed a good question. If with strict checks maybe we would at
> least want to make sure we throw explicit messages to let user know what to
> turn off.
>
> >
> >
> > first, formulate what exactly do you want to enable.
> >
> >
> >
> > for example, you have a set of boxes and you want a set of flags
> > to supply to guarantee qemu can migrate between them. is that it?
>
> Yes I think that's the case.
>
> That's also why I think the very original proposal still makes sense
> (having all defaults OFF when dependent on kernel), because only the mgmt
> knows the details about the cluster, so it may make more sense to select
> from the top which has the full knowledge base, explicitly enable some sets
> of features (not only network, but also CPU feature bits and else). Then
> the mgmt boots the VM, also knows where it can migrate explicitly.
>
> If all things are hidden then the mgmt is almost out of control of this.
+1
>
> That was rejected because there's the need to by default enable new
> features if ever possible. In that case, IMHO Jason's soluion is spot on
> where it sits in the middle ground of both, allowing both to happen
> (auto-enable of new feats, while keeping VM ABI stablility).
Yes.
>
> So IIUC there will be a cluster, it may contain different groups of hosts,
> each group should have similar setups so that VMs can freely migrate
> between each other within the same group (but may not easily migratable
> across groups?). But I don't think I know well on that part in practise.
Towards this, we may need to develop tools somewhere to report TAP
capability. Or as replied in another thread, developing software
fallback for new features, but it seems a burden.
>
> Dan might be a great source of input from that level.
>
> Thanks,
>
> --
> Peter Xu
>
Thanks
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-13 19:32 ` Peter Xu
2025-11-14 1:51 ` Jason Wang
@ 2025-11-14 5:48 ` Thomas Huth
2025-11-14 9:53 ` Jinpu Wang
2025-11-14 15:47 ` Peter Xu
1 sibling, 2 replies; 26+ messages in thread
From: Thomas Huth @ 2025-11-14 5:48 UTC (permalink / raw)
To: Peter Xu, Michael S. Tsirkin
Cc: Jason Wang, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, berrange
On 13/11/2025 20.32, Peter Xu wrote:
> On Thu, Nov 13, 2025 at 12:46:55PM -0500, Michael S. Tsirkin wrote:
>> failing to start a perfectly good qemu which used to work
>> because you changed kernels is better than failing to migrate how?
>>
>
> I agree this is not pretty.
>
> The very original proposal was having extra features to be OFF by default,
> only allow explicit selections to enable them when the mgmt / user is aware
> of the possible hosts to run on top.
Could it maybe be tied to the "-nodefaults" option of QEMU? If you run QEMU
with "-nodefaults" (which you should do when planning a migration later),
these extra features that depend on the kernel version stay OFF. If you run
QEMU without "-nodefaults", QEMU could enable them if supported by the
kernel. So that would benefit both, the people running QEMU via management
layers (using -nodefaults), and the people who just want to quickly launch
QEMU on the command line. WDYT?
Thomas
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-14 5:48 ` Thomas Huth
@ 2025-11-14 9:53 ` Jinpu Wang
2025-11-14 15:47 ` Peter Xu
1 sibling, 0 replies; 26+ messages in thread
From: Jinpu Wang @ 2025-11-14 9:53 UTC (permalink / raw)
To: Thomas Huth
Cc: Peter Xu, Michael S. Tsirkin, Jason Wang, eduardo,
marcel.apfelbaum, philmd, wangyanan55, zhao1.liu, qemu-devel,
farosas, berrange
Hello all,
My apologies for the slow reply; I have been heavily involved in other
tasks recently.
I wanted to chime in to confirm that, as a Cloud Service Provider
(CSP), we rely heavily on live migration, particularly during the
rollout of new host operating system versions or for resource
balancing across our infrastructure. Consequently, it is a critical
requirement for us that the live migration process remains robust and
does not experience failures simply because new features are
automatically enabled by default when running on a newer kernel
version.
From the CSP perspective, having a strict feature check that results
in a failure at boot (or feature negotiation) seems significantly
preferable to the previous behavior of silently clearing features and
only failing much later during a migration attempt. A strict check
provides immediate feedback that the requested feature set is
incompatible with the peer/environment, allowing us to address the
configuration issue proactively before the VM is put into production
or before a migration is attempted.
I must admit I don't know the code base well enough to comment on the
technical implementation, but the principle behind the proposed strict
check appears to align well with our operational needs for managing
live migration compatibility and stability.
Thank you very much, Jason, for proposing and working on this solution.
Best regards,
Jinpu
On Fri, Nov 14, 2025 at 6:48 AM Thomas Huth <thuth@redhat.com> wrote:
>
> On 13/11/2025 20.32, Peter Xu wrote:
> > On Thu, Nov 13, 2025 at 12:46:55PM -0500, Michael S. Tsirkin wrote:
> >> failing to start a perfectly good qemu which used to work
> >> because you changed kernels is better than failing to migrate how?
> >>
> >
> > I agree this is not pretty.
> >
> > The very original proposal was having extra features to be OFF by default,
> > only allow explicit selections to enable them when the mgmt / user is aware
> > of the possible hosts to run on top.
>
> Could it maybe be tied to the "-nodefaults" option of QEMU? If you run QEMU
> with "-nodefaults" (which you should do when planning a migration later),
> these extra features that depend on the kernel version stay OFF. If you run
> QEMU without "-nodefaults", QEMU could enable them if supported by the
> kernel. So that would benefit both, the people running QEMU via management
> layers (using -nodefaults), and the people who just want to quickly launch
> QEMU on the command line. WDYT?
>
> Thomas
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-14 5:48 ` Thomas Huth
2025-11-14 9:53 ` Jinpu Wang
@ 2025-11-14 15:47 ` Peter Xu
1 sibling, 0 replies; 26+ messages in thread
From: Peter Xu @ 2025-11-14 15:47 UTC (permalink / raw)
To: Thomas Huth
Cc: Michael S. Tsirkin, Jason Wang, eduardo, marcel.apfelbaum, philmd,
wangyanan55, zhao1.liu, qemu-devel, farosas, jinpu.wang, berrange
On Fri, Nov 14, 2025 at 06:48:42AM +0100, Thomas Huth wrote:
> On 13/11/2025 20.32, Peter Xu wrote:
> > On Thu, Nov 13, 2025 at 12:46:55PM -0500, Michael S. Tsirkin wrote:
> > > failing to start a perfectly good qemu which used to work
> > > because you changed kernels is better than failing to migrate how?
> > >
> >
> > I agree this is not pretty.
> >
> > The very original proposal was having extra features to be OFF by default,
> > only allow explicit selections to enable them when the mgmt / user is aware
> > of the possible hosts to run on top.
>
> Could it maybe be tied to the "-nodefaults" option of QEMU? If you run QEMU
> with "-nodefaults" (which you should do when planning a migration later),
> these extra features that depend on the kernel version stay OFF. If you run
> QEMU without "-nodefaults", QEMU could enable them if supported by the
> kernel. So that would benefit both, the people running QEMU via management
> layers (using -nodefaults), and the people who just want to quickly launch
> QEMU on the command line. WDYT?
Are the "default set of devices" when without -nodefaults more or less
stable (aka, still live migratable)? If so, I wonder if there're still
people relying on migrations but using default devices.
The other question is, such proposal also means auto-probe will be OFF for
all serious users. I am personally OK with such, however it means it'll
also reduce the test coverage that Michael was looking for on new network
features, when QEMU is running on new kernels.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-14 1:51 ` Jason Wang
@ 2025-11-16 6:45 ` Michael S. Tsirkin
2025-11-19 2:06 ` Jason Wang
0 siblings, 1 reply; 26+ messages in thread
From: Michael S. Tsirkin @ 2025-11-16 6:45 UTC (permalink / raw)
To: Jason Wang
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Fri, Nov 14, 2025 at 09:51:00AM +0800, Jason Wang wrote:
> > So IIUC there will be a cluster, it may contain different groups of hosts,
> > each group should have similar setups so that VMs can freely migrate
> > between each other within the same group (but may not easily migratable
> > across groups?). But I don't think I know well on that part in practise.
>
> Towards this, we may need to develop tools somewhere to report TAP
> capability. Or as replied in another thread, developing software
> fallback for new features, but it seems a burden.
Or more generally, host capability from QEMU POV.
--
MST
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-14 1:32 ` Jason Wang
@ 2025-11-16 6:52 ` Michael S. Tsirkin
2025-11-17 4:31 ` Jason Wang
0 siblings, 1 reply; 26+ messages in thread
From: Michael S. Tsirkin @ 2025-11-16 6:52 UTC (permalink / raw)
To: Jason Wang
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Fri, Nov 14, 2025 at 09:32:47AM +0800, Jason Wang wrote:
> On Fri, Nov 14, 2025 at 1:47 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 12:12:38PM -0500, Peter Xu wrote:
> > > On Thu, Nov 13, 2025 at 11:47:51AM -0500, Michael S. Tsirkin wrote:
> > > > On Thu, Nov 13, 2025 at 11:37:25AM -0500, Peter Xu wrote:
> > > > > On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> > > > > > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > > > > > We used to clear features silently in virtio_net_get_features() even
> > > > > > > if it is required. This complicates the live migration compatibility
> > > > > > > as the management layer may think the feature is enabled but in fact
> > > > > > > not.
> > > > > > >
> > > > > > > Let's add a strict feature check to make sure if there's a mismatch
> > > > > > > between the required feature and peer, fail the get_features()
> > > > > > > immediately instead of waiting until the migration to fail. This
> > > > > > > offload the migration compatibility completely to the management
> > > > > > > layer.
> > > > > > >
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > >
> > > > > > This is not really useful - how do users know how to tweak their
> > > > > > command lines?
> > > > > > We discussed this many times.
> > > > > > To try and solve this you need a tool that will tell you how to start
> > > > > > VM on X to make it migrateable to Y or Z.
> > > > > >
> > > > > >
> > > > > > More importantly,
> > > > > > migration is a niche thing and breaking booting perfectly good VMs
> > > > > > just for that seems wrong.
> > > > >
> > > > > IMHO Jason's proposal is useful in that it now provides a way to provide
> > > > > ABI stablility but allows auto-ON to exist.
> > > > >
> > > > > If we think migration is optional, we could add a migration blocker where
> > > > > strict check flag is set to OFF, as I mentioned in the email reply to Dan.
> > > > > As that implies the VM ABI is not guaranteed.
> > > > >
> > > > > Thanks,
> > > >
> > > >
> > > > All you have to do is avoid changing the kernel and ABI is stable.
> > > > Downstreams already do this.
> > >
> > > But the whole point of migration is allowing VMs to move between hosts..
> > > hence AFAIU kernel can change.
> > >
> > > Downstream will still have problem if some network features will be
> > > optionally supported in some of the RHEL-N branches, because machine types
> > > are defined the same in any RHEL-N, so IIUC it's also possible a VM booting
> > > on a latest RHEL-X.Y qemu/kernel hit issues migrating back to an older
> > > RHEL-X.(Y-1) qemu/kernel if RHEL-X.(Y-1) kernel doesn't have the network
> > > feature available..
> > >
> > > It's also not good IMHO to only fix downstream but having upstream face
> > > such problems, even if there's a downstream fix...
> > >
> > > This thread was revived only because Jinpu hit similar issues. IMHO we
> > > should still try to provide a generic solution upstream for everyone.
> > >
> > > Thanks,
> > >
> > > --
> > > Peter Xu
> >
> > failing to start a perfectly good qemu which used to work
> > because you changed kernels is better than failing to migrate how?
>
> It doesn't:
>
> 1) the strict feature check will be only enabled in new machine types
> 2) if kernel ABI is stable, qemu will keep working after upgrading
> kernel even with strict check otherwise it would be a bug of kernel
>
> So I don't see it breaking anything if we make it start to work at 11.0?
Using QEMU from git suddenly requires upgrading the kernel or figuring
out obscure flags? Ugh.
> >
> >
> >
> > graceful downgrade with old kernels is the basics of good userspace
> > behaviour and has been for decades.
>
> Peter has given the example of how hard we can define gracefulness
> (e.g migrate from a kernel w/ USO to a kernel w/o USO) and fix.
>
> Maybe we can think of a usersapce fallback to emulation of USO or
> others, but I'm not sure if it's an overkill.
>
> >
> >
> > sure, let's work on a solution, just erroring out is more about blaming
> > the user. what is the user supposed to do when qemu fails to start?
>
> It's the first step as it's much better than silently clearing the
> feature which may confuse both user and migration. We can use warnings
> instead of errors but I'm not sure how much it can help.
Well with this first step we have successfully blamed the user and
the second step won't ever be taken.
> >
> >
> > first, formulate what exactly do you want to enable.
> >
> >
> >
> > for example, you have a set of boxes and you want a set of flags
> > to supply to guarantee qemu can migrate between them. is that it?
>
> Mostly, it should work as a CPU cluster.
the reason it kinda works with CPU cluster is simply because
there is a final set of CPU models and you can not easily
switch your CPU to a different model.
> So it's the responsibility of
> the management layer, maybe we can develop some tool to report this or
> via qemu introspection ("query-tap" ?). Or if the management can do
> this now, we don't even need to bother (or it can help to uncover
> bugs). Anyhow, clearing a feature silently is not good and can cover
> bugs of various layers.
>
> Note that this issue is not specific to TAP, we may meet this for
> vDPA/VFIO live migration as well. Basically, it should be the
> responsibility of the management layer to deal with those migration
> compatibility policies instead of using hard coded policies inside
> Qemu. For qemu, it can simply error out when there's a mismatch
> between features that are supported and features that are asked to
> enable. We've suffered a lot in the past when trying to deal with this
> by Qemu.
>
> Thanks
Yes but QEMU currently gives management no tools to figure out
what is important for it.
> >
> >
> >
> > --
> > MST
> >
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-16 6:52 ` Michael S. Tsirkin
@ 2025-11-17 4:31 ` Jason Wang
2025-11-17 8:57 ` Michael S. Tsirkin
0 siblings, 1 reply; 26+ messages in thread
From: Jason Wang @ 2025-11-17 4:31 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Sun, Nov 16, 2025 at 2:53 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Nov 14, 2025 at 09:32:47AM +0800, Jason Wang wrote:
> > On Fri, Nov 14, 2025 at 1:47 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Nov 13, 2025 at 12:12:38PM -0500, Peter Xu wrote:
> > > > On Thu, Nov 13, 2025 at 11:47:51AM -0500, Michael S. Tsirkin wrote:
> > > > > On Thu, Nov 13, 2025 at 11:37:25AM -0500, Peter Xu wrote:
> > > > > > On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> > > > > > > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > > > > > > We used to clear features silently in virtio_net_get_features() even
> > > > > > > > if it is required. This complicates the live migration compatibility
> > > > > > > > as the management layer may think the feature is enabled but in fact
> > > > > > > > not.
> > > > > > > >
> > > > > > > > Let's add a strict feature check to make sure if there's a mismatch
> > > > > > > > between the required feature and peer, fail the get_features()
> > > > > > > > immediately instead of waiting until the migration to fail. This
> > > > > > > > offload the migration compatibility completely to the management
> > > > > > > > layer.
> > > > > > > >
> > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > >
> > > > > > > This is not really useful - how do users know how to tweak their
> > > > > > > command lines?
> > > > > > > We discussed this many times.
> > > > > > > To try and solve this you need a tool that will tell you how to start
> > > > > > > VM on X to make it migrateable to Y or Z.
> > > > > > >
> > > > > > >
> > > > > > > More importantly,
> > > > > > > migration is a niche thing and breaking booting perfectly good VMs
> > > > > > > just for that seems wrong.
> > > > > >
> > > > > > IMHO Jason's proposal is useful in that it now provides a way to provide
> > > > > > ABI stablility but allows auto-ON to exist.
> > > > > >
> > > > > > If we think migration is optional, we could add a migration blocker where
> > > > > > strict check flag is set to OFF, as I mentioned in the email reply to Dan.
> > > > > > As that implies the VM ABI is not guaranteed.
> > > > > >
> > > > > > Thanks,
> > > > >
> > > > >
> > > > > All you have to do is avoid changing the kernel and ABI is stable.
> > > > > Downstreams already do this.
> > > >
> > > > But the whole point of migration is allowing VMs to move between hosts..
> > > > hence AFAIU kernel can change.
> > > >
> > > > Downstream will still have problem if some network features will be
> > > > optionally supported in some of the RHEL-N branches, because machine types
> > > > are defined the same in any RHEL-N, so IIUC it's also possible a VM booting
> > > > on a latest RHEL-X.Y qemu/kernel hit issues migrating back to an older
> > > > RHEL-X.(Y-1) qemu/kernel if RHEL-X.(Y-1) kernel doesn't have the network
> > > > feature available..
> > > >
> > > > It's also not good IMHO to only fix downstream but having upstream face
> > > > such problems, even if there's a downstream fix...
> > > >
> > > > This thread was revived only because Jinpu hit similar issues. IMHO we
> > > > should still try to provide a generic solution upstream for everyone.
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Peter Xu
> > >
> > > failing to start a perfectly good qemu which used to work
> > > because you changed kernels is better than failing to migrate how?
> >
> > It doesn't:
> >
> > 1) the strict feature check will be only enabled in new machine types
> > 2) if kernel ABI is stable, qemu will keep working after upgrading
> > kernel even with strict check otherwise it would be a bug of kernel
> >
> > So I don't see it breaking anything if we make it start to work at 11.0?
>
> Using QEMU from git suddenly requires upgrading the kernel or figuring
> out obscure flags? Ugh.
Only the setups are buggy that might meet this.
>
>
> > >
> > >
> > >
> > > graceful downgrade with old kernels is the basics of good userspace
> > > behaviour and has been for decades.
> >
> > Peter has given the example of how hard we can define gracefulness
> > (e.g migrate from a kernel w/ USO to a kernel w/o USO) and fix.
> >
> > Maybe we can think of a usersapce fallback to emulation of USO or
> > others, but I'm not sure if it's an overkill.
> >
> > >
> > >
> > > sure, let's work on a solution, just erroring out is more about blaming
> > > the user. what is the user supposed to do when qemu fails to start?
> >
> > It's the first step as it's much better than silently clearing the
> > feature which may confuse both user and migration. We can use warnings
> > instead of errors but I'm not sure how much it can help.
>
>
> Well with this first step we have successfully blamed the user and
> the second step won't ever be taken.
Are you suggesting to fix the management? E.g patching libvirt to
probe tap features?
>
> > >
> > >
> > > first, formulate what exactly do you want to enable.
> > >
> > >
> > >
> > > for example, you have a set of boxes and you want a set of flags
> > > to supply to guarantee qemu can migrate between them. is that it?
> >
> > Mostly, it should work as a CPU cluster.
>
> the reason it kinda works with CPU cluster is simply because
> there is a final set of CPU models and you can not easily
> switch your CPU to a different model.
We can define a set of TAP features as well, but I'm not sure it's
worthwhile to do this.
>
> > So it's the responsibility of
> > the management layer, maybe we can develop some tool to report this or
> > via qemu introspection ("query-tap" ?). Or if the management can do
> > this now, we don't even need to bother (or it can help to uncover
> > bugs). Anyhow, clearing a feature silently is not good and can cover
> > bugs of various layers.
> >
> > Note that this issue is not specific to TAP, we may meet this for
> > vDPA/VFIO live migration as well. Basically, it should be the
> > responsibility of the management layer to deal with those migration
> > compatibility policies instead of using hard coded policies inside
> > Qemu. For qemu, it can simply error out when there's a mismatch
> > between features that are supported and features that are asked to
> > enable. We've suffered a lot in the past when trying to deal with this
> > by Qemu.
> >
> > Thanks
>
> Yes but QEMU currently gives management no tools to figure out
> what is important for it.
Using Qemu might be problematic as usually it doesn't not have privilege.
We can extend iproute, or a dedicated tool or ask libvirt to do this.
If libvirt could do the probe by itself, could we start from that?
Thanks
>
>
>
> > >
> > >
> > >
> > > --
> > > MST
> > >
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-17 4:31 ` Jason Wang
@ 2025-11-17 8:57 ` Michael S. Tsirkin
2025-11-19 2:49 ` Jason Wang
0 siblings, 1 reply; 26+ messages in thread
From: Michael S. Tsirkin @ 2025-11-17 8:57 UTC (permalink / raw)
To: Jason Wang
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Mon, Nov 17, 2025 at 12:31:47PM +0800, Jason Wang wrote:
> On Sun, Nov 16, 2025 at 2:53 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Nov 14, 2025 at 09:32:47AM +0800, Jason Wang wrote:
> > > On Fri, Nov 14, 2025 at 1:47 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Thu, Nov 13, 2025 at 12:12:38PM -0500, Peter Xu wrote:
> > > > > On Thu, Nov 13, 2025 at 11:47:51AM -0500, Michael S. Tsirkin wrote:
> > > > > > On Thu, Nov 13, 2025 at 11:37:25AM -0500, Peter Xu wrote:
> > > > > > > On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> > > > > > > > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > > > > > > > We used to clear features silently in virtio_net_get_features() even
> > > > > > > > > if it is required. This complicates the live migration compatibility
> > > > > > > > > as the management layer may think the feature is enabled but in fact
> > > > > > > > > not.
> > > > > > > > >
> > > > > > > > > Let's add a strict feature check to make sure if there's a mismatch
> > > > > > > > > between the required feature and peer, fail the get_features()
> > > > > > > > > immediately instead of waiting until the migration to fail. This
> > > > > > > > > offload the migration compatibility completely to the management
> > > > > > > > > layer.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > >
> > > > > > > > This is not really useful - how do users know how to tweak their
> > > > > > > > command lines?
> > > > > > > > We discussed this many times.
> > > > > > > > To try and solve this you need a tool that will tell you how to start
> > > > > > > > VM on X to make it migrateable to Y or Z.
> > > > > > > >
> > > > > > > >
> > > > > > > > More importantly,
> > > > > > > > migration is a niche thing and breaking booting perfectly good VMs
> > > > > > > > just for that seems wrong.
> > > > > > >
> > > > > > > IMHO Jason's proposal is useful in that it now provides a way to provide
> > > > > > > ABI stablility but allows auto-ON to exist.
> > > > > > >
> > > > > > > If we think migration is optional, we could add a migration blocker where
> > > > > > > strict check flag is set to OFF, as I mentioned in the email reply to Dan.
> > > > > > > As that implies the VM ABI is not guaranteed.
> > > > > > >
> > > > > > > Thanks,
> > > > > >
> > > > > >
> > > > > > All you have to do is avoid changing the kernel and ABI is stable.
> > > > > > Downstreams already do this.
> > > > >
> > > > > But the whole point of migration is allowing VMs to move between hosts..
> > > > > hence AFAIU kernel can change.
> > > > >
> > > > > Downstream will still have problem if some network features will be
> > > > > optionally supported in some of the RHEL-N branches, because machine types
> > > > > are defined the same in any RHEL-N, so IIUC it's also possible a VM booting
> > > > > on a latest RHEL-X.Y qemu/kernel hit issues migrating back to an older
> > > > > RHEL-X.(Y-1) qemu/kernel if RHEL-X.(Y-1) kernel doesn't have the network
> > > > > feature available..
> > > > >
> > > > > It's also not good IMHO to only fix downstream but having upstream face
> > > > > such problems, even if there's a downstream fix...
> > > > >
> > > > > This thread was revived only because Jinpu hit similar issues. IMHO we
> > > > > should still try to provide a generic solution upstream for everyone.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > --
> > > > > Peter Xu
> > > >
> > > > failing to start a perfectly good qemu which used to work
> > > > because you changed kernels is better than failing to migrate how?
> > >
> > > It doesn't:
> > >
> > > 1) the strict feature check will be only enabled in new machine types
> > > 2) if kernel ABI is stable, qemu will keep working after upgrading
> > > kernel even with strict check otherwise it would be a bug of kernel
> > >
> > > So I don't see it breaking anything if we make it start to work at 11.0?
> >
> > Using QEMU from git suddenly requires upgrading the kernel or figuring
> > out obscure flags? Ugh.
>
> Only the setups are buggy that might meet this.
I do git pull on qemu and have an old kernel. My setup just
became buggy. No?
> >
> >
> > > >
> > > >
> > > >
> > > > graceful downgrade with old kernels is the basics of good userspace
> > > > behaviour and has been for decades.
> > >
> > > Peter has given the example of how hard we can define gracefulness
> > > (e.g migrate from a kernel w/ USO to a kernel w/o USO) and fix.
> > >
> > > Maybe we can think of a usersapce fallback to emulation of USO or
> > > others, but I'm not sure if it's an overkill.
> > >
> > > >
> > > >
> > > > sure, let's work on a solution, just erroring out is more about blaming
> > > > the user. what is the user supposed to do when qemu fails to start?
> > >
> > > It's the first step as it's much better than silently clearing the
> > > feature which may confuse both user and migration. We can use warnings
> > > instead of errors but I'm not sure how much it can help.
> >
> >
> > Well with this first step we have successfully blamed the user and
> > the second step won't ever be taken.
>
> Are you suggesting to fix the management? E.g patching libvirt to
> probe tap features?
host features generally.
> >
> > > >
> > > >
> > > > first, formulate what exactly do you want to enable.
> > > >
> > > >
> > > >
> > > > for example, you have a set of boxes and you want a set of flags
> > > > to supply to guarantee qemu can migrate between them. is that it?
> > >
> > > Mostly, it should work as a CPU cluster.
> >
> > the reason it kinda works with CPU cluster is simply because
> > there is a final set of CPU models and you can not easily
> > switch your CPU to a different model.
>
> We can define a set of TAP features as well, but I'm not sure it's
> worthwhile to do this.
>
> >
> > > So it's the responsibility of
> > > the management layer, maybe we can develop some tool to report this or
> > > via qemu introspection ("query-tap" ?). Or if the management can do
> > > this now, we don't even need to bother (or it can help to uncover
> > > bugs). Anyhow, clearing a feature silently is not good and can cover
> > > bugs of various layers.
> > >
> > > Note that this issue is not specific to TAP, we may meet this for
> > > vDPA/VFIO live migration as well. Basically, it should be the
> > > responsibility of the management layer to deal with those migration
> > > compatibility policies instead of using hard coded policies inside
> > > Qemu. For qemu, it can simply error out when there's a mismatch
> > > between features that are supported and features that are asked to
> > > enable. We've suffered a lot in the past when trying to deal with this
> > > by Qemu.
> > >
> > > Thanks
> >
> > Yes but QEMU currently gives management no tools to figure out
> > what is important for it.
>
> Using Qemu might be problematic as usually it doesn't not have privilege.
>
> We can extend iproute, or a dedicated tool or ask libvirt to do this.
> If libvirt could do the probe by itself, could we start from that?
>
> Thanks
All I am saying is that I want to see how is management supposed to
know what to enable.
qemu already probes tap features. To me, it seems natural
for management to do the probing through qemu.
in fact your patch is a way to do that, is it not?
what it lacks though is a structured way to tell management how
to fix the problem.
>
> >
> >
> >
> > > >
> > > >
> > > >
> > > > --
> > > > MST
> > > >
> >
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-16 6:45 ` Michael S. Tsirkin
@ 2025-11-19 2:06 ` Jason Wang
2025-11-19 6:31 ` Michael S. Tsirkin
0 siblings, 1 reply; 26+ messages in thread
From: Jason Wang @ 2025-11-19 2:06 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Sun, Nov 16, 2025 at 2:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Nov 14, 2025 at 09:51:00AM +0800, Jason Wang wrote:
> > > So IIUC there will be a cluster, it may contain different groups of hosts,
> > > each group should have similar setups so that VMs can freely migrate
> > > between each other within the same group (but may not easily migratable
> > > across groups?). But I don't think I know well on that part in practise.
> >
> > Towards this, we may need to develop tools somewhere to report TAP
> > capability. Or as replied in another thread, developing software
> > fallback for new features, but it seems a burden.
>
> Or more generally, host capability from QEMU POV.
Or you mean managment POV actually, for example TAP is usually created
by libvirt.
Thanks
>
> --
> MST
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-17 8:57 ` Michael S. Tsirkin
@ 2025-11-19 2:49 ` Jason Wang
2025-11-19 8:07 ` Michael S. Tsirkin
0 siblings, 1 reply; 26+ messages in thread
From: Jason Wang @ 2025-11-19 2:49 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Mon, Nov 17, 2025 at 4:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Nov 17, 2025 at 12:31:47PM +0800, Jason Wang wrote:
> > On Sun, Nov 16, 2025 at 2:53 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Fri, Nov 14, 2025 at 09:32:47AM +0800, Jason Wang wrote:
> > > > On Fri, Nov 14, 2025 at 1:47 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Thu, Nov 13, 2025 at 12:12:38PM -0500, Peter Xu wrote:
> > > > > > On Thu, Nov 13, 2025 at 11:47:51AM -0500, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Nov 13, 2025 at 11:37:25AM -0500, Peter Xu wrote:
> > > > > > > > On Thu, Nov 13, 2025 at 11:09:32AM -0500, Michael S. Tsirkin wrote:
> > > > > > > > > On Fri, Nov 07, 2025 at 10:01:49AM +0800, Jason Wang wrote:
> > > > > > > > > > We used to clear features silently in virtio_net_get_features() even
> > > > > > > > > > if it is required. This complicates the live migration compatibility
> > > > > > > > > > as the management layer may think the feature is enabled but in fact
> > > > > > > > > > not.
> > > > > > > > > >
> > > > > > > > > > Let's add a strict feature check to make sure if there's a mismatch
> > > > > > > > > > between the required feature and peer, fail the get_features()
> > > > > > > > > > immediately instead of waiting until the migration to fail. This
> > > > > > > > > > offload the migration compatibility completely to the management
> > > > > > > > > > layer.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > >
> > > > > > > > > This is not really useful - how do users know how to tweak their
> > > > > > > > > command lines?
> > > > > > > > > We discussed this many times.
> > > > > > > > > To try and solve this you need a tool that will tell you how to start
> > > > > > > > > VM on X to make it migrateable to Y or Z.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > More importantly,
> > > > > > > > > migration is a niche thing and breaking booting perfectly good VMs
> > > > > > > > > just for that seems wrong.
> > > > > > > >
> > > > > > > > IMHO Jason's proposal is useful in that it now provides a way to provide
> > > > > > > > ABI stablility but allows auto-ON to exist.
> > > > > > > >
> > > > > > > > If we think migration is optional, we could add a migration blocker where
> > > > > > > > strict check flag is set to OFF, as I mentioned in the email reply to Dan.
> > > > > > > > As that implies the VM ABI is not guaranteed.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > >
> > > > > > >
> > > > > > > All you have to do is avoid changing the kernel and ABI is stable.
> > > > > > > Downstreams already do this.
> > > > > >
> > > > > > But the whole point of migration is allowing VMs to move between hosts..
> > > > > > hence AFAIU kernel can change.
> > > > > >
> > > > > > Downstream will still have problem if some network features will be
> > > > > > optionally supported in some of the RHEL-N branches, because machine types
> > > > > > are defined the same in any RHEL-N, so IIUC it's also possible a VM booting
> > > > > > on a latest RHEL-X.Y qemu/kernel hit issues migrating back to an older
> > > > > > RHEL-X.(Y-1) qemu/kernel if RHEL-X.(Y-1) kernel doesn't have the network
> > > > > > feature available..
> > > > > >
> > > > > > It's also not good IMHO to only fix downstream but having upstream face
> > > > > > such problems, even if there's a downstream fix...
> > > > > >
> > > > > > This thread was revived only because Jinpu hit similar issues. IMHO we
> > > > > > should still try to provide a generic solution upstream for everyone.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > --
> > > > > > Peter Xu
> > > > >
> > > > > failing to start a perfectly good qemu which used to work
> > > > > because you changed kernels is better than failing to migrate how?
> > > >
> > > > It doesn't:
> > > >
> > > > 1) the strict feature check will be only enabled in new machine types
> > > > 2) if kernel ABI is stable, qemu will keep working after upgrading
> > > > kernel even with strict check otherwise it would be a bug of kernel
> > > >
> > > > So I don't see it breaking anything if we make it start to work at 11.0?
> > >
> > > Using QEMU from git suddenly requires upgrading the kernel or figuring
> > > out obscure flags? Ugh.
> >
> > Only the setups are buggy that might meet this.
>
> I do git pull on qemu and have an old kernel. My setup just
> became buggy. No?
>
>
> > >
> > >
> > > > >
> > > > >
> > > > >
> > > > > graceful downgrade with old kernels is the basics of good userspace
> > > > > behaviour and has been for decades.
> > > >
> > > > Peter has given the example of how hard we can define gracefulness
> > > > (e.g migrate from a kernel w/ USO to a kernel w/o USO) and fix.
> > > >
> > > > Maybe we can think of a usersapce fallback to emulation of USO or
> > > > others, but I'm not sure if it's an overkill.
> > > >
> > > > >
> > > > >
> > > > > sure, let's work on a solution, just erroring out is more about blaming
> > > > > the user. what is the user supposed to do when qemu fails to start?
> > > >
> > > > It's the first step as it's much better than silently clearing the
> > > > feature which may confuse both user and migration. We can use warnings
> > > > instead of errors but I'm not sure how much it can help.
> > >
> > >
> > > Well with this first step we have successfully blamed the user and
> > > the second step won't ever be taken.
> >
> > Are you suggesting to fix the management? E.g patching libvirt to
> > probe tap features?
>
> host features generally.
>
>
> > >
> > > > >
> > > > >
> > > > > first, formulate what exactly do you want to enable.
> > > > >
> > > > >
> > > > >
> > > > > for example, you have a set of boxes and you want a set of flags
> > > > > to supply to guarantee qemu can migrate between them. is that it?
> > > >
> > > > Mostly, it should work as a CPU cluster.
> > >
> > > the reason it kinda works with CPU cluster is simply because
> > > there is a final set of CPU models and you can not easily
> > > switch your CPU to a different model.
> >
> > We can define a set of TAP features as well, but I'm not sure it's
> > worthwhile to do this.
> >
> > >
> > > > So it's the responsibility of
> > > > the management layer, maybe we can develop some tool to report this or
> > > > via qemu introspection ("query-tap" ?). Or if the management can do
> > > > this now, we don't even need to bother (or it can help to uncover
> > > > bugs). Anyhow, clearing a feature silently is not good and can cover
> > > > bugs of various layers.
> > > >
> > > > Note that this issue is not specific to TAP, we may meet this for
> > > > vDPA/VFIO live migration as well. Basically, it should be the
> > > > responsibility of the management layer to deal with those migration
> > > > compatibility policies instead of using hard coded policies inside
> > > > Qemu. For qemu, it can simply error out when there's a mismatch
> > > > between features that are supported and features that are asked to
> > > > enable. We've suffered a lot in the past when trying to deal with this
> > > > by Qemu.
> > > >
> > > > Thanks
> > >
> > > Yes but QEMU currently gives management no tools to figure out
> > > what is important for it.
> >
> > Using Qemu might be problematic as usually it doesn't not have privilege.
> >
> > We can extend iproute, or a dedicated tool or ask libvirt to do this.
> > If libvirt could do the probe by itself, could we start from that?
> >
> > Thanks
>
> All I am saying is that I want to see how is management supposed to
> know what to enable.
>
> qemu already probes tap features.
Only part of the features.
> To me, it seems natural
> for management to do the probing through qemu.
> in fact your patch is a way to do that, is it not?
Yes and no.
> what it lacks though is a structured way to tell management how
> to fix the problem.
Probing through management seems to be better. For example it can
calculate the cluster in advance without the need to launch qemu
everywhere.
Or consider the case when USO is not supported by the kernel in the
destination, even if qemu reports this, I'm not sure what is expected
to be done in the management layer?
Thanks
>
> >
> > >
> > >
> > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > MST
> > > > >
> > >
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-19 2:06 ` Jason Wang
@ 2025-11-19 6:31 ` Michael S. Tsirkin
0 siblings, 0 replies; 26+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19 6:31 UTC (permalink / raw)
To: Jason Wang
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Wed, Nov 19, 2025 at 10:06:16AM +0800, Jason Wang wrote:
> On Sun, Nov 16, 2025 at 2:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Nov 14, 2025 at 09:51:00AM +0800, Jason Wang wrote:
> > > > So IIUC there will be a cluster, it may contain different groups of hosts,
> > > > each group should have similar setups so that VMs can freely migrate
> > > > between each other within the same group (but may not easily migratable
> > > > across groups?). But I don't think I know well on that part in practise.
> > >
> > > Towards this, we may need to develop tools somewhere to report TAP
> > > capability. Or as replied in another thread, developing software
> > > fallback for new features, but it seems a burden.
> >
> > Or more generally, host capability from QEMU POV.
>
> Or you mean managment POV actually, for example TAP is usually created
> by libvirt.
>
> Thanks
So what you propose is really this:
1- create tap
2- run qemu check that it did not fail
3- if it fails guess what the right set of flags is to make it not fail
and switch to that
and I am saying qemu should help with 3.
> >
> > --
> > MST
> >
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-19 2:49 ` Jason Wang
@ 2025-11-19 8:07 ` Michael S. Tsirkin
2025-11-20 1:45 ` Jason Wang
0 siblings, 1 reply; 26+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19 8:07 UTC (permalink / raw)
To: Jason Wang
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Wed, Nov 19, 2025 at 10:49:11AM +0800, Jason Wang wrote:
> > qemu already probes tap features.
>
> Only part of the features.
the part we care about?
> > To me, it seems natural
> > for management to do the probing through qemu.
> > in fact your patch is a way to do that, is it not?
>
> Yes and no.
>
> > what it lacks though is a structured way to tell management how
> > to fix the problem.
>
> Probing through management seems to be better. For example it can
> calculate the cluster in advance without the need to launch qemu
> everywhere.
it is basically replicating qemu code then.
what's the big deal to launch qemu?
> Or consider the case when USO is not supported by the kernel in the
> destination, even if qemu reports this, I'm not sure what is expected
> to be done in the management layer?
>
> Thanks
reports what?
--
MST
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC PATCH] virtio-net: introduce strict peer feature check
2025-11-19 8:07 ` Michael S. Tsirkin
@ 2025-11-20 1:45 ` Jason Wang
0 siblings, 0 replies; 26+ messages in thread
From: Jason Wang @ 2025-11-20 1:45 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Peter Xu, eduardo, marcel.apfelbaum, philmd, wangyanan55,
zhao1.liu, qemu-devel, farosas, jinpu.wang, thuth, berrange
On Wed, Nov 19, 2025 at 4:07 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Nov 19, 2025 at 10:49:11AM +0800, Jason Wang wrote:
> > > qemu already probes tap features.
> >
> > Only part of the features.
>
>
> the part we care about?
Yes and technically Qemu can probe all.
>
> > > To me, it seems natural
> > > for management to do the probing through qemu.
> > > in fact your patch is a way to do that, is it not?
> >
> > Yes and no.
> >
> > > what it lacks though is a structured way to tell management how
> > > to fix the problem.
> >
> > Probing through management seems to be better. For example it can
> > calculate the cluster in advance without the need to launch qemu
> > everywhere.
>
> it is basically replicating qemu code then.
Yes but libvirt has duplicated somehow, e.g create and destroy TAP and
it even uses TUNGETFEATURES in virNetDevMacVLanTapSetup(). But I'm not
even sure this is the work of the libvirt. I think it should be the
job of the one who is in charge of managing the cpu cluster.
>
> what's the big deal to launch qemu?
It looks more heavyweight than probing by libvirt but I'm not sure, we
can listen to libvirt guys.
>
>
>
> > Or consider the case when USO is not supported by the kernel in the
> > destination, even if qemu reports this, I'm not sure what is expected
> > to be done in the management layer?
> >
> > Thanks
>
> reports what?
USO is not supported by this kernel.
Btw, to not make things worse, I would revert this until we find a
solution. Do you agree?
commit 1c79ab6937ae938d3dfd4da1c01afc7eb599857e
Author: Paolo Abeni <pabeni@redhat.com>
Date: Fri Oct 10 16:12:57 2025 +0200
virtio-net: Advertise UDP tunnel GSO support by default
Allow bidirectional aggregated traffic for UDP encapsulated flows.
Add the needed compatibility entries to avoid migration issues
vs older QEMU instances.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id:
<9c500fbcd2cf29afd1826b1ac906f9d5beac3601.1760104079.git.pabeni@redhat.com>
Otherwise we will suffer from a similar issue soon.
Thanks
>
> --
> MST
>
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2025-11-20 1:47 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-07 2:01 [RFC PATCH] virtio-net: introduce strict peer feature check Jason Wang
2025-11-12 21:55 ` Peter Xu
2025-11-13 0:31 ` Jason Wang
2025-11-13 15:51 ` Peter Xu
2025-11-13 8:53 ` Daniel P. Berrangé
2025-11-13 15:58 ` Peter Xu
2025-11-13 16:09 ` Michael S. Tsirkin
2025-11-13 16:37 ` Peter Xu
2025-11-13 16:47 ` Michael S. Tsirkin
2025-11-13 17:12 ` Peter Xu
2025-11-13 17:46 ` Michael S. Tsirkin
2025-11-13 19:32 ` Peter Xu
2025-11-14 1:51 ` Jason Wang
2025-11-16 6:45 ` Michael S. Tsirkin
2025-11-19 2:06 ` Jason Wang
2025-11-19 6:31 ` Michael S. Tsirkin
2025-11-14 5:48 ` Thomas Huth
2025-11-14 9:53 ` Jinpu Wang
2025-11-14 15:47 ` Peter Xu
2025-11-14 1:32 ` Jason Wang
2025-11-16 6:52 ` Michael S. Tsirkin
2025-11-17 4:31 ` Jason Wang
2025-11-17 8:57 ` Michael S. Tsirkin
2025-11-19 2:49 ` Jason Wang
2025-11-19 8:07 ` Michael S. Tsirkin
2025-11-20 1:45 ` Jason Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).