From: "Cédric Le Goater" <clg@redhat.com>
To: Ben Chaney <bchaney@akamai.com>, qemu-devel@nongnu.org
Cc: "Peter Xu" <peterx@redhat.com>, "Fabiano Rosas" <farosas@suse.de>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Stefano Garzarella" <sgarzare@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Alex Williamson" <alex@shazbot.org>,
	"Eric Blake" <eblake@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Stefan Weil" <sw@weilnetz.de>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Hamza Khan" <hamza.khan@nutanix.com>,
	"Mark Kanda" <mark.kanda@oracle.com>,
	"Joshua Hunt" <johunt@akamai.com>,
	"Max Tottenham" <mtottenh@akamai.com>,
	"Steve Sistare" <steven.sistare@oracle.com>
Subject: Re: [PATCH v3 6/8] tap: cpr support
Date: Thu, 4 Dec 2025 18:46:21 +0100
Message-ID: <fbc8007b-2667-42c6-9fdf-56147cae664d@redhat.com>
In-Reply-To: <20251203-cpr-tap-v3-6-3c12e0a61f8e@akamai.com>

On 12/3/25 19:51, Ben Chaney wrote:
> From: Steve Sistare <steven.sistare@oracle.com>
> 
> Provide the cpr=on option to preserve TAP and vhost descriptors during
> cpr-transfer, so the management layer does not need to create a new
> device for the target.
> 
> Save all tap fd's in canonical order, leveraging the index argument of
> cpr_save_fd.  For the i'th queue, the tap device fd is saved at index 2*i,
> and the vhostfd (if any) at index 2*i+1.
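
The index layout described above can be sketched as a standalone C
snippet. The two macros are copied from the patch further down;
print_cpr_slots() is a hypothetical helper added only for illustration
and is not part of the series.

```c
#include <stdio.h>

/* Same layout as the macros in the patch: two CPR slots per queue. */
#define TAP_FD_INDEX(queue)         (2 * (queue) + 0)
#define TAP_VHOSTFD_INDEX(queue)    (2 * (queue) + 1)

/* Hypothetical helper (not in the patch): print the slot used by each fd. */
static void print_cpr_slots(int queues)
{
    for (int i = 0; i < queues; i++) {
        printf("queue %d: tap fd -> index %d, vhostfd -> index %d\n",
               i, TAP_FD_INDEX(i), TAP_VHOSTFD_INDEX(i));
    }
}
```

Interleaving the two kinds of fd this way keeps both descriptors of a
queue adjacent under the same CPR name, so no second name is needed for
the vhost fds.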
> 
> tap and vhost fd's are passed by name to the monitor when a NIC is hot
> plugged, but the name is not known to qemu after cpr.  Allow the manager
> to pass -1 for the fd "name" in the new qemu args to indicate that QEMU
> should search for a saved value.  Example:
> 
>    -netdev tap,id=hostnet2,fds=-1:-1,vhostfds=-1:-1,cpr=on
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Signed-off-by: Ben Chaney <bchaney@akamai.com>
> ---
>   hw/vfio/device.c        |  2 +-
>   include/migration/cpr.h |  2 +-
>   migration/cpr.c         | 11 ++++----
>   net/tap.c               | 73 +++++++++++++++++++++++++++++++++++++++----------
>   qapi/net.json           |  5 +++-
>   5 files changed, 70 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/vfio/device.c b/hw/vfio/device.c
> index 76869828fc..73e622f7b5 100644
> --- a/hw/vfio/device.c
> +++ b/hw/vfio/device.c
> @@ -362,7 +362,7 @@ void vfio_device_free_name(VFIODevice *vbasedev)
>   
>   void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp)
>   {
> -    vbasedev->fd = cpr_get_fd_param(vbasedev->dev->id, str, 0, errp);
> +    vbasedev->fd = cpr_get_fd_param(vbasedev->dev->id, str, 0, true, errp);


This looks odd to me.

The call passes a 'cpr' bool to a cpr_* routine just to switch CPR
handling on or off, which seems a bit hacky. Could you clarify the intention?
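
For reference, one reading of the control flow that the new flag gives
cpr_get_fd_param(), as a standalone sketch. Everything prefixed model_
is a toy stand-in invented for this illustration (a fixed array instead
of the CPR state, an integer argument instead of monitor_fd_param());
none of it is QEMU code, and error reporting is left out.

```c
#include <stdbool.h>

/* Toy stand-ins for QEMU state; assumptions made for illustration only. */
#define MODEL_SLOTS 8
static int  model_saved[MODEL_SLOTS] = { -1, -1, -1, -1, -1, -1, -1, -1 };
static bool model_incoming;     /* stands in for cpr_is_incoming() */

/* Stands in for cpr_find_fd(): return the saved fd, or -1 if none. */
static int model_find_fd(int index)
{
    return (index >= 0 && index < MODEL_SLOTS) ? model_saved[index] : -1;
}

/* Stands in for cpr_save_fd(): remember fd at the given index. */
static void model_save_fd(int index, int fd)
{
    if (index >= 0 && index < MODEL_SLOTS) {
        model_saved[index] = fd;
    }
}

/*
 * Mirrors the branch structure of the patched cpr_get_fd_param().
 * fd_from_monitor stands in for the result of monitor_fd_param().
 */
static int model_get_fd_param(int fd_from_monitor, int index, bool cpr)
{
    int fd;

    if (cpr && model_incoming) {
        /* Incoming CPR with cpr enabled: ignore the name, use saved fd. */
        fd = model_find_fd(index);
    } else {
        /* Otherwise resolve via the monitor; save only if cpr is set. */
        fd = fd_from_monitor;
        if (fd >= 0 && cpr) {
            model_save_fd(index, fd);
        }
    }
    return fd;
}
```

With cpr set to false the function reduces to a plain monitor lookup,
even during an incoming CPR, while the vfio caller passes true and so
keeps its previous behavior.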


C.

>   }
>   
>   static VFIODeviceIOOps vfio_device_io_ops_ioctl;
> diff --git a/include/migration/cpr.h b/include/migration/cpr.h
> index d585fadc5b..68424b4b03 100644
> --- a/include/migration/cpr.h
> +++ b/include/migration/cpr.h
> @@ -48,7 +48,7 @@ void cpr_state_close(void);
>   struct QIOChannel *cpr_state_ioc(void);
>   
>   bool cpr_incoming_needed(void *opaque);
> -int cpr_get_fd_param(const char *name, const char *fdname, int index,
> +int cpr_get_fd_param(const char *name, const char *fdname, int index, bool cpr,
>                        Error **errp);
>   
>   QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp);
> diff --git a/migration/cpr.c b/migration/cpr.c
> index c0bf93a7ba..19bd56339d 100644
> --- a/migration/cpr.c
> +++ b/migration/cpr.c
> @@ -316,6 +316,7 @@ bool cpr_incoming_needed(void *opaque)
>    * @name: CPR name for the descriptor
>    * @fdname: An integer-valued string, or a name passed to a getfd command
>    * @index: CPR index of the descriptor
> + * @cpr: use cpr
>    * @errp: returned error message
>    *
>    * If CPR is not being performed, then use @fdname to find the fd.
> @@ -325,22 +326,22 @@ bool cpr_incoming_needed(void *opaque)
>    * On success returns the fd value, else returns -1.
>    */
>   int cpr_get_fd_param(const char *name, const char *fdname, int index,
> -                     Error **errp)
> +                     bool cpr, Error **errp)
>   {
>       ERRP_GUARD();
>       int fd;
>   
> -    if (cpr_is_incoming()) {
> +    if (cpr && cpr_is_incoming()) {
>           fd = cpr_find_fd(name, index);
>           if (fd < 0) {
>               error_setg(errp, "cannot find saved value for fd %s", fdname);
>           }
>       } else {
>           fd = monitor_fd_param(monitor_cur(), fdname, errp);
> -        if (fd >= 0) {
> -            cpr_save_fd(name, index, fd);
> -        } else {
> +        if (fd < 0) {
>               error_prepend(errp, "Could not parse object fd %s:", fdname);
> +        } else if (cpr) {
> +            cpr_save_fd(name, index, fd);
>           }
>       }
>       return fd;
> diff --git a/net/tap.c b/net/tap.c
> index 9d480574c3..79e29addd1 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -35,6 +35,7 @@
>   #include "net/eth.h"
>   #include "net/net.h"
>   #include "clients.h"
> +#include "migration/cpr.h"
>   #include "monitor/monitor.h"
>   #include "system/system.h"
>   #include "qapi/error.h"
> @@ -80,6 +81,7 @@ typedef struct TAPState {
>       bool has_uso;
>       bool has_tunnel;
>       bool enabled;
> +    bool cpr;
>       VHostNetState *vhost_net;
>       unsigned host_vnet_hdr_len;
>       Notifier exit;
> @@ -323,6 +325,9 @@ static void tap_cleanup(NetClientState *nc)
>   {
>       TAPState *s = DO_UPCAST(TAPState, nc, nc);
>   
> +    if (s->cpr) {
> +        cpr_delete_fd_all(nc->name);
> +    }
>       if (s->vhost_net) {
>           vhost_net_cleanup(s->vhost_net);
>           g_free(s->vhost_net);
> @@ -690,18 +695,24 @@ static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
>       return fd;
>   }
>   
> +/* CPR fd's for each queue are saved at these indices */
> +#define TAP_FD_INDEX(queue)         (2 * (queue) + 0)
> +#define TAP_VHOSTFD_INDEX(queue)    (2 * (queue) + 1)
> +
>   #define MAX_TAP_QUEUES 1024
>   
>   static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
>                                const char *model, const char *name,
>                                const char *ifname, const char *script,
>                                const char *downscript, const char *vhostfdname,
> -                             int vnet_hdr, int fd, Error **errp)
> +                             int vnet_hdr, int fd, int index, Error **errp)
>   {
>       Error *err = NULL;
>       TAPState *s = net_tap_fd_init(peer, model, name, fd, vnet_hdr);
> +    bool cpr = tap->has_cpr ? tap->cpr : false;
>       int vhostfd;
>   
> +    s->cpr = cpr;
>       tap_set_sndbuf(s->fd, tap, &err);
>       if (err) {
>           error_propagate(errp, err);
> @@ -736,7 +747,7 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
>           }
>   
>           if (vhostfdname) {
> -            vhostfd = monitor_fd_param(monitor_cur(), vhostfdname, &err);
> +            vhostfd = cpr_get_fd_param(name, vhostfdname, index, cpr, &err);
>               if (vhostfd == -1) {
>                   error_propagate(errp, err);
>                   goto failed;
> @@ -745,13 +756,22 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
>                   goto failed;
>               }
>           } else {
> -            vhostfd = open("/dev/vhost-net", O_RDWR);
> +            vhostfd = cpr ? cpr_find_fd(name, index) : -1;
> +            if (vhostfd < 0) {
> +                vhostfd = open("/dev/vhost-net", O_RDWR);
> +                if (cpr && vhostfd >= 0) {
> +                    cpr_save_fd(name, index, vhostfd);
> +                }
> +            }
>               if (vhostfd < 0) {
>                   error_setg_errno(errp, errno,
>                                    "tap: open vhost char device failed");
>                   goto failed;
>               }
>               if (!qemu_set_blocking(vhostfd, false, errp)) {
> +                if (!cpr) {
> +                    close(vhostfd);
> +                }
>                   goto failed;
>               }
>           }
> @@ -777,6 +797,9 @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
>       return;
>   
>   failed:
> +    if (cpr) {
> +        cpr_delete_fd_all(name);
> +    }
>       qemu_del_net_client(&s->nc);
>   }
>   
> @@ -809,7 +832,8 @@ static int get_fds(char *str, char *fds[], int max)
>   int net_init_tap(const Netdev *netdev, const char *name,
>                    NetClientState *peer, Error **errp)
>   {
> -    const NetdevTapOptions *tap;
> +    const NetdevTapOptions *tap = &netdev->u.tap;
> +    bool cpr = tap->has_cpr ? tap->cpr : false;
>       int fd, vnet_hdr = 0, i = 0, queues;
>       /* for the no-fd, no-helper case */
>       const char *script;
> @@ -845,7 +869,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
>               goto out;
>           }
>   
> -        fd = monitor_fd_param(monitor_cur(), tap->fd, errp);
> +        fd = cpr_get_fd_param(name, tap->fd, TAP_FD_INDEX(0), cpr, errp);
>           if (fd == -1) {
>               ret = -1;
>               goto out;
> @@ -866,13 +890,14 @@ int net_init_tap(const Netdev *netdev, const char *name,
>   
>           net_init_tap_one(tap, peer, "tap", name, NULL,
>                            script, downscript,
> -                         vhostfdname, vnet_hdr, fd, &err);
> +                         vhostfdname, vnet_hdr, fd, TAP_VHOSTFD_INDEX(0), &err);
>           if (err) {
>               error_propagate(errp, err);
>               close(fd);
>               ret = -1;
>               goto out;
>           }
> +
>       } else if (tap->fds) {
>           char **fds;
>           char **vhost_fds;
> @@ -903,7 +928,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
>           }
>   
>           for (i = 0; i < nfds; i++) {
> -            fd = monitor_fd_param(monitor_cur(), fds[i], errp);
> +            fd = cpr_get_fd_param(name, fds[i], TAP_FD_INDEX(i), cpr, errp);
>               if (fd == -1) {
>                   ret = -1;
>                   goto free_fail;
> @@ -930,7 +955,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
>               net_init_tap_one(tap, peer, "tap", name, ifname,
>                                script, downscript,
>                                tap->vhostfds ? vhost_fds[i] : NULL,
> -                             vnet_hdr, fd, &err);
> +                             vnet_hdr, fd, TAP_VHOSTFD_INDEX(i), &err);
>               if (err) {
>                   error_propagate(errp, err);
>                   ret = -1;
> @@ -958,9 +983,15 @@ free_fail:
>               goto out;
>           }
>   
> -        fd = net_bridge_run_helper(tap->helper,
> -                                   tap->br ?: DEFAULT_BRIDGE_INTERFACE,
> -                                   errp);
> +        fd = cpr ? cpr_find_fd(name, TAP_FD_INDEX(0)) : -1;
> +        if (fd < 0) {
> +            fd = net_bridge_run_helper(tap->helper,
> +                                    tap->br ?: DEFAULT_BRIDGE_INTERFACE,
> +                                    errp);
> +            if (cpr && fd >= 0) {
> +                cpr_save_fd(name, TAP_FD_INDEX(0), fd);
> +            }
> +        }
>           if (fd == -1) {
>               ret = -1;
>               goto out;
> @@ -980,13 +1011,14 @@ free_fail:
>   
>           net_init_tap_one(tap, peer, "bridge", name, ifname,
>                            script, downscript, vhostfdname,
> -                         vnet_hdr, fd, &err);
> +                         vnet_hdr, fd, TAP_VHOSTFD_INDEX(0), &err);
>           if (err) {
>               error_propagate(errp, err);
>               close(fd);
>               ret = -1;
>               goto out;
>           }
> +
>       } else {
>           g_autofree char *default_script = NULL;
>           g_autofree char *default_downscript = NULL;
> @@ -1011,8 +1043,14 @@ free_fail:
>           }
>   
>           for (i = 0; i < queues; i++) {
> -            fd = net_tap_init(tap, &vnet_hdr, i >= 1 ? "no" : script,
> -                              ifname, sizeof ifname, queues > 1, errp);
> +            fd = cpr ? cpr_find_fd(name, TAP_FD_INDEX(i)) : -1;
> +            if (fd < 0) {
> +                fd = net_tap_init(tap, &vnet_hdr, i >= 1 ? "no" : script,
> +                                ifname, sizeof ifname, queues > 1, errp);
> +                if (cpr && fd >= 0) {
> +                    cpr_save_fd(name, TAP_FD_INDEX(i), fd);
> +                }
> +            }
>               if (fd == -1) {
>                   ret = -1;
>                   goto out;
> @@ -1030,7 +1068,9 @@ free_fail:
>               net_init_tap_one(tap, peer, "tap", name, ifname,
>                                i >= 1 ? "no" : script,
>                                i >= 1 ? "no" : downscript,
> -                             vhostfdname, vnet_hdr, fd, &err);
> +                             vhostfdname, vnet_hdr,
> +                             fd, TAP_VHOSTFD_INDEX(i),
> +                             &err);
>               if (err) {
>                   error_propagate(errp, err);
>                   close(fd);
> @@ -1041,6 +1081,9 @@ free_fail:
>       }
>   
>   out:
> +    if (ret && cpr) {
> +        cpr_delete_fd_all(name);
> +    }
>       return ret;
>   }
>   
> diff --git a/qapi/net.json b/qapi/net.json
> index 118bd34965..264213b5d9 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -355,6 +355,8 @@
>   # @poll-us: maximum number of microseconds that could be spent on busy
>   #     polling for tap (since 2.7)
>   #
> +# @cpr: preserve fds and vhostfds during cpr-transfer.
> +#
>   # Since: 1.2
>   ##
>   { 'struct': 'NetdevTapOptions',
> @@ -373,7 +375,8 @@
>       '*vhostfds':   'str',
>       '*vhostforce': 'bool',
>       '*queues':     'uint32',
> -    '*poll-us':    'uint32'} }
> +    '*poll-us':    'uint32',
> +    '*cpr':        'bool'} }
>   
>   ##
>   # @NetdevSocketOptions:
> 


