[Qemu-devel] [RFC] net: Peer with existing NIC in netdev

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
@ 2012-10-24 12:49 Stefan Hajnoczi
  2012-10-30 15:24 ` Michael S. Tsirkin
  2012-11-02 10:34 ` Stefan Hajnoczi
  0 siblings, 2 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-10-24 12:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: Stefan Hajnoczi, Laine Stump

Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:

  (qemu) info network
  virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
   \ netdev0: type=user,net=10.0.2.0,restrict=off

  (qemu) netdev_del netdev0

  (qemu) netdev_add socket,id=netdev0,listen=:1234

  (qemu) info network
  virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
   \ netdev0: type=socket,

This makes it possible to switch netdev while the guest is running.  It
is not necessary to reset the NIC.

Note that the NIC's link goes down in netdev_del and back up again in
netdev_add.  Therefore the guest becomes aware that the network has
changed, although this depends on the emulated NIC model providing link
status change interrupts.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
This patch addresses the netdev hotplug issue that has been discussed
previously on the mailing list:

  https://www.redhat.com/archives/libvir-list/2012-October/msg00629.html

Initially I thought we need to add a new monitor command.  After looking at the
code it seems that simply allowing netdev_del followed by netdev_add (as
suggested by Laine) is the simplest way to solve the problem.  No new monitor
commands are necessary.

The current code is not safe with virtio-net + tap, where the virtio-net device
reports offload features from the tap device to the guest.  If you try to
switch to a -netdev user or -netdev socket device, those offload capabilities
are not present and network communication will fail.

Laine: Please try this out and see if it works for your use case.

 net.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/net.c b/net.c
index ae4bc0d..d7c60b0 100644
--- a/net.c
+++ b/net.c
@@ -619,6 +619,19 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
         [NET_CLIENT_OPTIONS_KIND_HUBPORT]   = net_init_hubport,
 };
 
+static NICState *net_find_nic_for_peername(const char *name)
+{
+    NetClientState *nc;
+    QTAILQ_FOREACH(nc, &net_clients, next) {
+        if (nc->info->type != NET_CLIENT_OPTIONS_KIND_NIC) {
+            continue;
+        }
+        if (nc->peer && strcmp(nc->peer->name, name) == 0) {
+            return (NICState *)nc;
+        }
+    }
+    return NULL;
+}
 
 static int net_client_init1(const void *object, int is_netdev, Error **errp)
 {
@@ -678,6 +691,25 @@ static int net_client_init1(const void *object, int is_netdev, Error **errp)
                       NetClientOptionsKind_lookup[opts->kind]);
             return -1;
         }
+
+        /* Peer with existing nic if netdev name matches */
+        if (is_netdev && opts->kind != NET_CLIENT_OPTIONS_KIND_NIC) {
+            NICState *nic = net_find_nic_for_peername(name);
+            if (nic && nic->peer_deleted) {
+                NetClientState *netdev = qemu_find_netdev(name);
+
+                /* TODO vnet_hdr compatibility check */
+
+                qemu_free_net_client(nic->nc.peer);
+                nic->nc.peer = netdev;
+                netdev->peer = &nic->nc;
+                nic->peer_deleted = false;
+                nic->nc.link_down = false;
+                if (nic->nc.info->link_status_changed) {
+                    nic->nc.info->link_status_changed(&nic->nc);
+                }
+            }
+        }
     }
     return 0;
 }
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-10-24 12:49 [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add Stefan Hajnoczi
@ 2012-10-30 15:24 ` Michael S. Tsirkin
  2012-10-31  8:07   ` Stefan Hajnoczi
  2012-11-02 10:34 ` Stefan Hajnoczi
  1 sibling, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2012-10-30 15:24 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Laine Stump

On Wed, Oct 24, 2012 at 02:49:21PM +0200, Stefan Hajnoczi wrote:
> Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:
> 
>   (qemu) info network
>   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
>    \ netdev0: type=user,net=10.0.2.0,restrict=off
> 
>   (qemu) netdev_del netdev0
> 
>   (qemu) netdev_add socket,id=netdev0,listen=:1234
> 
>   (qemu) info network
>   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
>    \ netdev0: type=socket,
> 
> This makes it possible to switch netdev while the guest is running.  It
> is not necessary to reset the NIC.
> 
> Note that the NIC's link goes down in netdev_del and back up again in
> netdev_add.  Therefore the guest becomes aware that the network has
> changed, although this depends on the emulated NIC model providing link
> status change interrupts.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

I'd be surprised if this patch worked when one or both backends are tap.
tap supports offloads but slirp doesn't, since guest
probes offloads at startup, it assumes it can use offloads.
We also program tap during device operation e.g. on set features.
vhost operation could also be interesting, have not looked into it.

> ---
> This patch addresses the netdev hotplug issue that has been discussed
> previously on the mailing list:
> 
>   https://www.redhat.com/archives/libvir-list/2012-October/msg00629.html

I actually think both we and libvirt should address (2):
	(2) because qemu has no asynchronous event to notify libvirt
	when the guest's PCI detach has actually completed, I have to stick in
	an arbitrary call to sleep() which is generally *way* too long, but may
	be too short in some cases of extremely high load.

libvirt should verify device is gone after calling sleep,
if not yet gone - sleep again.

qemu should report an event when device is ejected.

> Initially I thought we need to add a new monitor command.  After looking at the
> code it seems that simply allowing netdev_del followed by netdev_add (as
> suggested by Laine) is the simplest way to solve the problem.  No new monitor
> commands are necessary.
> 
> The current code is not safe with virtio-net + tap, where the virtio-net device
> reports offload features from the tap device to the guest.  If you try to
> switch to a -netdev user or -netdev socket device, those offload capabilities
> are not present and network communication will fail.
> 
> Laine: Please try this out and see if it works for your use case.
> 
>  net.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/net.c b/net.c
> index ae4bc0d..d7c60b0 100644
> --- a/net.c
> +++ b/net.c
> @@ -619,6 +619,19 @@ static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
>          [NET_CLIENT_OPTIONS_KIND_HUBPORT]   = net_init_hubport,
>  };
>  
> +static NICState *net_find_nic_for_peername(const char *name)
> +{
> +    NetClientState *nc;
> +    QTAILQ_FOREACH(nc, &net_clients, next) {
> +        if (nc->info->type != NET_CLIENT_OPTIONS_KIND_NIC) {
> +            continue;
> +        }
> +        if (nc->peer && strcmp(nc->peer->name, name) == 0) {
> +            return (NICState *)nc;
> +        }
> +    }
> +    return NULL;
> +}
>  
>  static int net_client_init1(const void *object, int is_netdev, Error **errp)
>  {
> @@ -678,6 +691,25 @@ static int net_client_init1(const void *object, int is_netdev, Error **errp)
>                        NetClientOptionsKind_lookup[opts->kind]);
>              return -1;
>          }
> +
> +        /* Peer with existing nic if netdev name matches */
> +        if (is_netdev && opts->kind != NET_CLIENT_OPTIONS_KIND_NIC) {
> +            NICState *nic = net_find_nic_for_peername(name);
> +            if (nic && nic->peer_deleted) {
> +                NetClientState *netdev = qemu_find_netdev(name);
> +
> +                /* TODO vnet_hdr compatibility check */
> +
> +                qemu_free_net_client(nic->nc.peer);
> +                nic->nc.peer = netdev;
> +                netdev->peer = &nic->nc;
> +                nic->peer_deleted = false;
> +                nic->nc.link_down = false;
> +                if (nic->nc.info->link_status_changed) {
> +                    nic->nc.info->link_status_changed(&nic->nc);
> +                }
> +            }
> +        }
>      }
>      return 0;
>  }
> -- 
> 1.7.11.7
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-10-30 15:24 ` Michael S. Tsirkin
@ 2012-10-31  8:07   ` Stefan Hajnoczi
  2012-10-31  8:57     ` Michael S. Tsirkin
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-10-31  8:07 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Laine Stump, qemu-devel, Stefan Hajnoczi

On Tue, Oct 30, 2012 at 05:24:06PM +0200, Michael S. Tsirkin wrote:
> On Wed, Oct 24, 2012 at 02:49:21PM +0200, Stefan Hajnoczi wrote:
> > Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:
> > 
> >   (qemu) info network
> >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> >    \ netdev0: type=user,net=10.0.2.0,restrict=off
> > 
> >   (qemu) netdev_del netdev0
> > 
> >   (qemu) netdev_add socket,id=netdev0,listen=:1234
> > 
> >   (qemu) info network
> >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> >    \ netdev0: type=socket,
> > 
> > This makes it possible to switch netdev while the guest is running.  It
> > is not necessary to reset the NIC.
> > 
> > Note that the NIC's link goes down in netdev_del and back up again in
> > netdev_add.  Therefore the guest becomes aware that the network has
> > changed, although this depends on the emulated NIC model providing link
> > status change interrupts.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> I'd be surprised if this patch worked when one or both backends are tap.
> tap supports offloads but slirp doesn't, since guest
> probes offloads at startup, it assumes it can use offloads.
> We also program tap during device operation e.g. on set features.
> vhost operation could also be interesting, have not looked into it.

Yes, I left a TODO in the RFC patch and described the issue below.
We'll have to reject incompatible netdevs.

At this stage I'd like to see if this approach works for Laine.  If it
looks good I'll figure out how we can make vhost and offload checks.

> > ---
> > This patch addresses the netdev hotplug issue that has been discussed
> > previously on the mailing list:
> > 
> >   https://www.redhat.com/archives/libvir-list/2012-October/msg00629.html
> 
> I actually think both we and libvirt should address (2):
> 	(2) because qemu has no asynchronous event to notify libvirt
> 	when the guest's PCI detach has actually completed, I have to stick in
> 	an arbitrary call to sleep() which is generally *way* too long, but may
> 	be too short in some cases of extremely high load.
> 
> libvirt should verify device is gone after calling sleep,
> if not yet gone - sleep again.
> 
> qemu should report an event when device is ejected.

I agree.  The hotplug implementation and how it is driven by QMP clients
is really ugly today.

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-10-31  8:07   ` Stefan Hajnoczi
@ 2012-10-31  8:57     ` Michael S. Tsirkin
  2012-10-31 14:51       ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2012-10-31  8:57 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Laine Stump, qemu-devel, Stefan Hajnoczi

On Wed, Oct 31, 2012 at 09:07:27AM +0100, Stefan Hajnoczi wrote:
> On Tue, Oct 30, 2012 at 05:24:06PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Oct 24, 2012 at 02:49:21PM +0200, Stefan Hajnoczi wrote:
> > > Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:
> > > 
> > >   (qemu) info network
> > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > >    \ netdev0: type=user,net=10.0.2.0,restrict=off
> > > 
> > >   (qemu) netdev_del netdev0
> > > 
> > >   (qemu) netdev_add socket,id=netdev0,listen=:1234
> > > 
> > >   (qemu) info network
> > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > >    \ netdev0: type=socket,
> > > 
> > > This makes it possible to switch netdev while the guest is running.  It
> > > is not necessary to reset the NIC.
> > > 
> > > Note that the NIC's link goes down in netdev_del and back up again in
> > > netdev_add.  Therefore the guest becomes aware that the network has
> > > changed, although this depends on the emulated NIC model providing link
> > > status change interrupts.
> > > 
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > 
> > I'd be surprised if this patch worked when one or both backends are tap.
> > tap supports offloads but slirp doesn't, since guest
> > probes offloads at startup, it assumes it can use offloads.
> > We also program tap during device operation e.g. on set features.
> > vhost operation could also be interesting, have not looked into it.
> 
> Yes, I left a TODO in the RFC patch and described the issue below.
> We'll have to reject incompatible netdevs.

Ideally, we'd probe all backend capabilities at init time.
However, looks like we allowed netdev and device creation in any order.
Can we change this and require netdev always be there before device?

> At this stage I'd like to see if this approach works for Laine.  If it
> looks good I'll figure out how we can make vhost and offload checks.
> 
> > > ---
> > > This patch addresses the netdev hotplug issue that has been discussed
> > > previously on the mailing list:
> > > 
> > >   https://www.redhat.com/archives/libvir-list/2012-October/msg00629.html
> > 
> > I actually think both we and libvirt should address (2):
> > 	(2) because qemu has no asynchronous event to notify libvirt
> > 	when the guest's PCI detach has actually completed, I have to stick in
> > 	an arbitrary call to sleep() which is generally *way* too long, but may
> > 	be too short in some cases of extremely high load.
> > 
> > libvirt should verify device is gone after calling sleep,
> > if not yet gone - sleep again.
> > 
> > qemu should report an event when device is ejected.
> 
> I agree.  The hotplug implementation and how it is driven by QMP clients
> is really ugly today.
> 
> Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-10-31  8:57     ` Michael S. Tsirkin
@ 2012-10-31 14:51       ` Stefan Hajnoczi
  2012-10-31 16:34         ` Michael S. Tsirkin
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-10-31 14:51 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel, Laine Stump

On Wed, Oct 31, 2012 at 10:57:24AM +0200, Michael S. Tsirkin wrote:
> On Wed, Oct 31, 2012 at 09:07:27AM +0100, Stefan Hajnoczi wrote:
> > On Tue, Oct 30, 2012 at 05:24:06PM +0200, Michael S. Tsirkin wrote:
> > > On Wed, Oct 24, 2012 at 02:49:21PM +0200, Stefan Hajnoczi wrote:
> > > > Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:
> > > > 
> > > >   (qemu) info network
> > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > > >    \ netdev0: type=user,net=10.0.2.0,restrict=off
> > > > 
> > > >   (qemu) netdev_del netdev0
> > > > 
> > > >   (qemu) netdev_add socket,id=netdev0,listen=:1234
> > > > 
> > > >   (qemu) info network
> > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > > >    \ netdev0: type=socket,
> > > > 
> > > > This makes it possible to switch netdev while the guest is running.  It
> > > > is not necessary to reset the NIC.
> > > > 
> > > > Note that the NIC's link goes down in netdev_del and back up again in
> > > > netdev_add.  Therefore the guest becomes aware that the network has
> > > > changed, although this depends on the emulated NIC model providing link
> > > > status change interrupts.
> > > > 
> > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > 
> > > I'd be surprised if this patch worked when one or both backends are tap.
> > > tap supports offloads but slirp doesn't, since guest
> > > probes offloads at startup, it assumes it can use offloads.
> > > We also program tap during device operation e.g. on set features.
> > > vhost operation could also be interesting, have not looked into it.
> > 
> > Yes, I left a TODO in the RFC patch and described the issue below.
> > We'll have to reject incompatible netdevs.
> 
> Ideally, we'd probe all backend capabilities at init time.
> However, looks like we allowed netdev and device creation in any order.
> Can we change this and require netdev always be there before device?

I don't think the order is a problem.  The relaxed order is only
relevant during startup from main() - but in that case we have no
constraints yet anyway.

The problem only occurs when netdev_add is used to create an
incompatible netdev after devices have initialized.  We should be able
to check and error out in the code that my RFC patch modifies.  If
constraints are violated then netdev_add can fail with an error (the new
netdev is not created and the QMP client needs to try again with a
compatible netdev configuration).

Maybe I'm misunderstanding your point?

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-10-31 14:51       ` Stefan Hajnoczi
@ 2012-10-31 16:34         ` Michael S. Tsirkin
  2012-11-01  9:53           ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2012-10-31 16:34 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, qemu-devel, Laine Stump

On Wed, Oct 31, 2012 at 03:51:08PM +0100, Stefan Hajnoczi wrote:
> On Wed, Oct 31, 2012 at 10:57:24AM +0200, Michael S. Tsirkin wrote:
> > On Wed, Oct 31, 2012 at 09:07:27AM +0100, Stefan Hajnoczi wrote:
> > > On Tue, Oct 30, 2012 at 05:24:06PM +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Oct 24, 2012 at 02:49:21PM +0200, Stefan Hajnoczi wrote:
> > > > > Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:
> > > > > 
> > > > >   (qemu) info network
> > > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > > > >    \ netdev0: type=user,net=10.0.2.0,restrict=off
> > > > > 
> > > > >   (qemu) netdev_del netdev0
> > > > > 
> > > > >   (qemu) netdev_add socket,id=netdev0,listen=:1234
> > > > > 
> > > > >   (qemu) info network
> > > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > > > >    \ netdev0: type=socket,
> > > > > 
> > > > > This makes it possible to switch netdev while the guest is running.  It
> > > > > is not necessary to reset the NIC.
> > > > > 
> > > > > Note that the NIC's link goes down in netdev_del and back up again in
> > > > > netdev_add.  Therefore the guest becomes aware that the network has
> > > > > changed, although this depends on the emulated NIC model providing link
> > > > > status change interrupts.
> > > > > 
> > > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > 
> > > > I'd be surprised if this patch worked when one or both backends are tap.
> > > > tap supports offloads but slirp doesn't, since guest
> > > > probes offloads at startup, it assumes it can use offloads.
> > > > We also program tap during device operation e.g. on set features.
> > > > vhost operation could also be interesting, have not looked into it.
> > > 
> > > Yes, I left a TODO in the RFC patch and described the issue below.
> > > We'll have to reject incompatible netdevs.
> > 
> > Ideally, we'd probe all backend capabilities at init time.
> > However, looks like we allowed netdev and device creation in any order.
> > Can we change this and require netdev always be there before device?
> 
> I don't think the order is a problem.  The relaxed order is only
> relevant during startup from main() - but in that case we have no
> constraints yet anyway.
> The problem only occurs when netdev_add is used to create an
> incompatible netdev after devices have initialized.  We should be able
> to check and error out in the code that my RFC patch modifies.  If
> constraints are violated then netdev_add can fail with an error (the new
> netdev is not created and the QMP client needs to try again with a
> compatible netdev configuration).
> 
> Maybe I'm misunderstanding your point?
> 
> Stefan

OK so if we basically require same type backend then I think it's mostly
fine.  I was trying to think of a way to allow changing backend type,
this becomes messy very quickly.  In partuclar macvtap probably
shouldn't be swapped with tap even though they are the same type
formally.

-- 
MST

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-10-31 16:34         ` Michael S. Tsirkin
@ 2012-11-01  9:53           ` Stefan Hajnoczi
  2012-11-01 11:31             ` Michael S. Tsirkin
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-11-01  9:53 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel, Laine Stump

On Wed, Oct 31, 2012 at 06:34:07PM +0200, Michael S. Tsirkin wrote:
> On Wed, Oct 31, 2012 at 03:51:08PM +0100, Stefan Hajnoczi wrote:
> > On Wed, Oct 31, 2012 at 10:57:24AM +0200, Michael S. Tsirkin wrote:
> > > On Wed, Oct 31, 2012 at 09:07:27AM +0100, Stefan Hajnoczi wrote:
> > > > On Tue, Oct 30, 2012 at 05:24:06PM +0200, Michael S. Tsirkin wrote:
> > > > > On Wed, Oct 24, 2012 at 02:49:21PM +0200, Stefan Hajnoczi wrote:
> > > > > > Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:
> > > > > > 
> > > > > >   (qemu) info network
> > > > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > > > > >    \ netdev0: type=user,net=10.0.2.0,restrict=off
> > > > > > 
> > > > > >   (qemu) netdev_del netdev0
> > > > > > 
> > > > > >   (qemu) netdev_add socket,id=netdev0,listen=:1234
> > > > > > 
> > > > > >   (qemu) info network
> > > > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > > > > >    \ netdev0: type=socket,
> > > > > > 
> > > > > > This makes it possible to switch netdev while the guest is running.  It
> > > > > > is not necessary to reset the NIC.
> > > > > > 
> > > > > > Note that the NIC's link goes down in netdev_del and back up again in
> > > > > > netdev_add.  Therefore the guest becomes aware that the network has
> > > > > > changed, although this depends on the emulated NIC model providing link
> > > > > > status change interrupts.
> > > > > > 
> > > > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > 
> > > > > I'd be surprised if this patch worked when one or both backends are tap.
> > > > > tap supports offloads but slirp doesn't, since guest
> > > > > probes offloads at startup, it assumes it can use offloads.
> > > > > We also program tap during device operation e.g. on set features.
> > > > > vhost operation could also be interesting, have not looked into it.
> > > > 
> > > > Yes, I left a TODO in the RFC patch and described the issue below.
> > > > We'll have to reject incompatible netdevs.
> > > 
> > > Ideally, we'd probe all backend capabilities at init time.
> > > However, looks like we allowed netdev and device creation in any order.
> > > Can we change this and require netdev always be there before device?
> > 
> > I don't think the order is a problem.  The relaxed order is only
> > relevant during startup from main() - but in that case we have no
> > constraints yet anyway.
> > The problem only occurs when netdev_add is used to create an
> > incompatible netdev after devices have initialized.  We should be able
> > to check and error out in the code that my RFC patch modifies.  If
> > constraints are violated then netdev_add can fail with an error (the new
> > netdev is not created and the QMP client needs to try again with a
> > compatible netdev configuration).
> > 
> > Maybe I'm misunderstanding your point?
> > 
> > Stefan
> 
> OK so if we basically require same type backend then I think it's mostly
> fine.  I was trying to think of a way to allow changing backend type,
> this becomes messy very quickly.  In partuclar macvtap probably
> shouldn't be swapped with tap even though they are the same type
> formally.

As long as they are offload-compatible, I think they can be swapped.
It's up to the user or the management stack to make sure switching
netdevs makes "sense".  So the network may be different and the guest
needs to DHCP again, but that's the user's problem.

Is there a reason why macvtap <-> tap won't work given compatible
vnet_hdr offload?

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-11-01  9:53           ` Stefan Hajnoczi
@ 2012-11-01 11:31             ` Michael S. Tsirkin
  2012-11-02 10:32               ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2012-11-01 11:31 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, qemu-devel, Laine Stump

On Thu, Nov 01, 2012 at 10:53:52AM +0100, Stefan Hajnoczi wrote:
> On Wed, Oct 31, 2012 at 06:34:07PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Oct 31, 2012 at 03:51:08PM +0100, Stefan Hajnoczi wrote:
> > > On Wed, Oct 31, 2012 at 10:57:24AM +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Oct 31, 2012 at 09:07:27AM +0100, Stefan Hajnoczi wrote:
> > > > > On Tue, Oct 30, 2012 at 05:24:06PM +0200, Michael S. Tsirkin wrote:
> > > > > > On Wed, Oct 24, 2012 at 02:49:21PM +0200, Stefan Hajnoczi wrote:
> > > > > > > Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:
> > > > > > > 
> > > > > > >   (qemu) info network
> > > > > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > > > > > >    \ netdev0: type=user,net=10.0.2.0,restrict=off
> > > > > > > 
> > > > > > >   (qemu) netdev_del netdev0
> > > > > > > 
> > > > > > >   (qemu) netdev_add socket,id=netdev0,listen=:1234
> > > > > > > 
> > > > > > >   (qemu) info network
> > > > > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
> > > > > > >    \ netdev0: type=socket,
> > > > > > > 
> > > > > > > This makes it possible to switch netdev while the guest is running.  It
> > > > > > > is not necessary to reset the NIC.
> > > > > > > 
> > > > > > > Note that the NIC's link goes down in netdev_del and back up again in
> > > > > > > netdev_add.  Therefore the guest becomes aware that the network has
> > > > > > > changed, although this depends on the emulated NIC model providing link
> > > > > > > status change interrupts.
> > > > > > > 
> > > > > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > > 
> > > > > > I'd be surprised if this patch worked when one or both backends are tap.
> > > > > > tap supports offloads but slirp doesn't, since guest
> > > > > > probes offloads at startup, it assumes it can use offloads.
> > > > > > We also program tap during device operation e.g. on set features.
> > > > > > vhost operation could also be interesting, have not looked into it.
> > > > > 
> > > > > Yes, I left a TODO in the RFC patch and described the issue below.
> > > > > We'll have to reject incompatible netdevs.
> > > > 
> > > > Ideally, we'd probe all backend capabilities at init time.
> > > > However, looks like we allowed netdev and device creation in any order.
> > > > Can we change this and require netdev always be there before device?
> > > 
> > > I don't think the order is a problem.  The relaxed order is only
> > > relevant during startup from main() - but in that case we have no
> > > constraints yet anyway.
> > > The problem only occurs when netdev_add is used to create an
> > > incompatible netdev after devices have initialized.  We should be able
> > > to check and error out in the code that my RFC patch modifies.  If
> > > constraints are violated then netdev_add can fail with an error (the new
> > > netdev is not created and the QMP client needs to try again with a
> > > compatible netdev configuration).
> > > 
> > > Maybe I'm misunderstanding your point?
> > > 
> > > Stefan
> > 
> > OK so if we basically require same type backend then I think it's mostly
> > fine.  I was trying to think of a way to allow changing backend type,
> > this becomes messy very quickly.  In partuclar macvtap probably
> > shouldn't be swapped with tap even though they are the same type
> > formally.
> 
> As long as they are offload-compatible, I think they can be swapped.
> It's up to the user or the management stack to make sure switching
> netdevs makes "sense".  So the network may be different and the guest
> needs to DHCP again, but that's the user's problem.

I think a simple rule like "use same backend type" is better than
an opaque one "are offload-compatible" - user has no idea
which offloads do each of the frontends and backends support.
Also if in future we add offloads to backend X suddenly we
break ability to swap with backend Y.
Let's keep it simple.

> Is there a reason why macvtap <-> tap won't work given compatible
> vnet_hdr offload?
> 
> Stefan

There's no guarantee they will support same offload options in all
kernels, in fact thats' not the case in some kernel.org kernels.


-- 
MST

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-11-01 11:31             ` Michael S. Tsirkin
@ 2012-11-02 10:32               ` Stefan Hajnoczi
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-11-02 10:32 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Laine Stump, qemu-devel, Stefan Hajnoczi

On Thu, Nov 1, 2012 at 12:31 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Nov 01, 2012 at 10:53:52AM +0100, Stefan Hajnoczi wrote:
>> On Wed, Oct 31, 2012 at 06:34:07PM +0200, Michael S. Tsirkin wrote:
>> > On Wed, Oct 31, 2012 at 03:51:08PM +0100, Stefan Hajnoczi wrote:
>> > > On Wed, Oct 31, 2012 at 10:57:24AM +0200, Michael S. Tsirkin wrote:
>> > > > On Wed, Oct 31, 2012 at 09:07:27AM +0100, Stefan Hajnoczi wrote:
>> > > > > On Tue, Oct 30, 2012 at 05:24:06PM +0200, Michael S. Tsirkin wrote:
>> > > > > > On Wed, Oct 24, 2012 at 02:49:21PM +0200, Stefan Hajnoczi wrote:
>> > > > > > > Allow netdev_del followed by netdev_add to re-peer a NIC and its netdev:
>> > > > > > >
>> > > > > > >   (qemu) info network
>> > > > > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
>> > > > > > >    \ netdev0: type=user,net=10.0.2.0,restrict=off
>> > > > > > >
>> > > > > > >   (qemu) netdev_del netdev0
>> > > > > > >
>> > > > > > >   (qemu) netdev_add socket,id=netdev0,listen=:1234
>> > > > > > >
>> > > > > > >   (qemu) info network
>> > > > > > >   virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
>> > > > > > >    \ netdev0: type=socket,
>> > > > > > >
>> > > > > > > This makes it possible to switch netdev while the guest is running.  It
>> > > > > > > is not necessary to reset the NIC.
>> > > > > > >
>> > > > > > > Note that the NIC's link goes down in netdev_del and back up again in
>> > > > > > > netdev_add.  Therefore the guest becomes aware that the network has
>> > > > > > > changed, although this depends on the emulated NIC model providing link
>> > > > > > > status change interrupts.
>> > > > > > >
>> > > > > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> > > > > >
>> > > > > > I'd be surprised if this patch worked when one or both backends are tap.
>> > > > > > tap supports offloads but slirp doesn't, since guest
>> > > > > > probes offloads at startup, it assumes it can use offloads.
>> > > > > > We also program tap during device operation e.g. on set features.
>> > > > > > vhost operation could also be interesting, have not looked into it.
>> > > > >
>> > > > > Yes, I left a TODO in the RFC patch and described the issue below.
>> > > > > We'll have to reject incompatible netdevs.
>> > > >
>> > > > Ideally, we'd probe all backend capabilities at init time.
>> > > > However, looks like we allowed netdev and device creation in any order.
>> > > > Can we change this and require netdev always be there before device?
>> > >
>> > > I don't think the order is a problem.  The relaxed order is only
>> > > relevant during startup from main() - but in that case we have no
>> > > constraints yet anyway.
>> > > The problem only occurs when netdev_add is used to create an
>> > > incompatible netdev after devices have initialized.  We should be able
>> > > to check and error out in the code that my RFC patch modifies.  If
>> > > constraints are violated then netdev_add can fail with an error (the new
>> > > netdev is not created and the QMP client needs to try again with a
>> > > compatible netdev configuration).
>> > >
>> > > Maybe I'm misunderstanding your point?
>> > >
>> > > Stefan
>> >
>> > OK so if we basically require same type backend then I think it's mostly
>> > fine.  I was trying to think of a way to allow changing backend type,
>> > this becomes messy very quickly.  In partuclar macvtap probably
>> > shouldn't be swapped with tap even though they are the same type
>> > formally.
>>
>> As long as they are offload-compatible, I think they can be swapped.
>> It's up to the user or the management stack to make sure switching
>> netdevs makes "sense".  So the network may be different and the guest
>> needs to DHCP again, but that's the user's problem.
>
> I think a simple rule like "use same backend type" is better than
> an opaque one "are offload-compatible" - user has no idea
> which offloads do each of the frontends and backends support.
> Also if in future we add offloads to backend X suddenly we
> break ability to swap with backend Y.
> Let's keep it simple.

Okay, that's a safe constraint that we can start with.  If users
request more freedom later we can get fancy.

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add
  2012-10-24 12:49 [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add Stefan Hajnoczi
  2012-10-30 15:24 ` Michael S. Tsirkin
@ 2012-11-02 10:34 ` Stefan Hajnoczi
  1 sibling, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-11-02 10:34 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Laine Stump

On Wed, Oct 24, 2012 at 2:49 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> Laine: Please try this out and see if it works for your use case.

Waiting for your feedback before I prepare a final patch that can go into QEMU.

There's no time pressure from my side to get this feature in so take
as much time as you need.

Stefan

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-11-02 10:35 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-24 12:49 [Qemu-devel] [RFC] net: Peer with existing NIC in netdev_add Stefan Hajnoczi
2012-10-30 15:24 ` Michael S. Tsirkin
2012-10-31  8:07   ` Stefan Hajnoczi
2012-10-31  8:57     ` Michael S. Tsirkin
2012-10-31 14:51       ` Stefan Hajnoczi
2012-10-31 16:34         ` Michael S. Tsirkin
2012-11-01  9:53           ` Stefan Hajnoczi
2012-11-01 11:31             ` Michael S. Tsirkin
2012-11-02 10:32               ` Stefan Hajnoczi
2012-11-02 10:34 ` Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).