* [PULL 1/5] net: parameterize the removing client from nc list
2025-03-10 12:22 [PULL 0/5] Net patches Jason Wang
@ 2025-03-10 12:22 ` Jason Wang
2025-03-10 12:22 ` [PULL 2/5] net: move backend cleanup to NIC cleanup Jason Wang
` (4 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Jason Wang @ 2025-03-10 12:22 UTC (permalink / raw)
To: qemu-devel; +Cc: Eugenio Pérez, Si-Wei Liu, Jason Wang
From: Eugenio Pérez <eperezma@redhat.com>
This change is used in later commits so we can avoid the removal of the
netclient if it is delayed.
No functional change intended.
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
net/net.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/net/net.c b/net/net.c
index a3996d5c62..4eb78a1299 100644
--- a/net/net.c
+++ b/net/net.c
@@ -381,9 +381,12 @@ NetClientState *qemu_get_peer(NetClientState *nc, int queue_index)
return ncs->peer;
}
-static void qemu_cleanup_net_client(NetClientState *nc)
+static void qemu_cleanup_net_client(NetClientState *nc,
+ bool remove_from_net_clients)
{
- QTAILQ_REMOVE(&net_clients, nc, next);
+ if (remove_from_net_clients) {
+ QTAILQ_REMOVE(&net_clients, nc, next);
+ }
if (nc->info->cleanup) {
nc->info->cleanup(nc);
@@ -442,14 +445,14 @@ void qemu_del_net_client(NetClientState *nc)
}
for (i = 0; i < queues; i++) {
- qemu_cleanup_net_client(ncs[i]);
+ qemu_cleanup_net_client(ncs[i], true);
}
return;
}
for (i = 0; i < queues; i++) {
- qemu_cleanup_net_client(ncs[i]);
+ qemu_cleanup_net_client(ncs[i], true);
qemu_free_net_client(ncs[i]);
}
}
@@ -474,7 +477,7 @@ void qemu_del_nic(NICState *nic)
for (i = queues - 1; i >= 0; i--) {
NetClientState *nc = qemu_get_subqueue(nic, i);
- qemu_cleanup_net_client(nc);
+ qemu_cleanup_net_client(nc, true);
qemu_free_net_client(nc);
}
--
2.42.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PULL 2/5] net: move backend cleanup to NIC cleanup
2025-03-10 12:22 [PULL 0/5] Net patches Jason Wang
2025-03-10 12:22 ` [PULL 1/5] net: parameterize the removing client from nc list Jason Wang
@ 2025-03-10 12:22 ` Jason Wang
2025-08-19 16:13 ` David Woodhouse
2025-03-10 12:22 ` [PULL 3/5] util/iov: Do not assert offset is in iov Jason Wang
` (3 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Jason Wang @ 2025-03-10 12:22 UTC (permalink / raw)
To: qemu-devel; +Cc: Eugenio Pérez, Lei Yang, Jonah Palmer, Jason Wang
From: Eugenio Pérez <eperezma@redhat.com>
Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net
structures if peer nic is present") effectively delayed the backend
cleanup, allowing the frontend or the guest to access it resources as
long as the frontend is still visible to the guest.
However it does not clean up the resources until the qemu process is
over. This causes an effective leak if the device is deleted with
device_del, as there is no way to close the vdpa device. This makes
impossible to re-add that device to this or other QEMU instances until
the first instance of QEMU is finished.
Move the cleanup from qemu_cleanup to the NIC deletion and to
net_cleanup.
Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present")
Reported-by: Lei Yang <leiyang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
net/net.c | 33 +++++++++++++++++++++++++++------
net/vhost-vdpa.c | 8 --------
2 files changed, 27 insertions(+), 14 deletions(-)
diff --git a/net/net.c b/net/net.c
index 4eb78a1299..39d6f28158 100644
--- a/net/net.c
+++ b/net/net.c
@@ -428,7 +428,13 @@ void qemu_del_net_client(NetClientState *nc)
object_unparent(OBJECT(nf));
}
- /* If there is a peer NIC, delete and cleanup client, but do not free. */
+ /*
+ * If there is a peer NIC, transfer ownership to it. Delete the client
+ * from net_client list but do not cleanup nor free. This way NIC can
+ * still access to members of the backend.
+ *
+ * The cleanup and free will be done when the NIC is free.
+ */
if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_NIC) {
NICState *nic = qemu_get_nic(nc->peer);
if (nic->peer_deleted) {
@@ -438,16 +444,13 @@ void qemu_del_net_client(NetClientState *nc)
for (i = 0; i < queues; i++) {
ncs[i]->peer->link_down = true;
+ QTAILQ_REMOVE(&net_clients, ncs[i], next);
}
if (nc->peer->info->link_status_changed) {
nc->peer->info->link_status_changed(nc->peer);
}
- for (i = 0; i < queues; i++) {
- qemu_cleanup_net_client(ncs[i], true);
- }
-
return;
}
@@ -465,8 +468,12 @@ void qemu_del_nic(NICState *nic)
for (i = 0; i < queues; i++) {
NetClientState *nc = qemu_get_subqueue(nic, i);
- /* If this is a peer NIC and peer has already been deleted, free it now. */
+ /*
+ * If this is a peer NIC and peer has already been deleted, clean it up
+ * and free it now.
+ */
if (nic->peer_deleted) {
+ qemu_cleanup_net_client(nc->peer, false);
qemu_free_net_client(nc->peer);
} else if (nc->peer) {
/* if there are RX packets pending, complete them */
@@ -1681,6 +1688,9 @@ void net_cleanup(void)
* of the latest NET_CLIENT_DRIVER_NIC, and operate on *p as we walk
* the list.
*
+ * However, the NIC may have peers that trust to be clean beyond this
+ * point. For example, if they have been removed with device_del.
+ *
* The 'nc' variable isn't part of the list traversal; it's purely
* for convenience as too much '(*p)->' has a tendency to make the
* readers' eyes bleed.
@@ -1688,6 +1698,17 @@ void net_cleanup(void)
while (*p) {
nc = *p;
if (nc->info->type == NET_CLIENT_DRIVER_NIC) {
+ NICState *nic = qemu_get_nic(nc);
+
+ if (nic->peer_deleted) {
+ int queues = MAX(nic->conf->peers.queues, 1);
+
+ for (int i = 0; i < queues; i++) {
+ nc = qemu_get_subqueue(nic, i);
+ qemu_cleanup_net_client(nc->peer, false);
+ }
+ }
+
/* Skip NET_CLIENT_DRIVER_NIC entries */
p = &QTAILQ_NEXT(nc, next);
} else {
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index bd01866878..f7a54f46aa 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -224,14 +224,6 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
{
VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
- /*
- * If a peer NIC is attached, do not cleanup anything.
- * Cleanup will happen as a part of qemu_cleanup() -> net_cleanup()
- * when the guest is shutting down.
- */
- if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_NIC) {
- return;
- }
munmap(s->cvq_cmd_out_buffer, vhost_vdpa_net_cvq_cmd_page_len());
munmap(s->status, vhost_vdpa_net_cvq_cmd_page_len());
if (s->vhost_net) {
--
2.42.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PULL 2/5] net: move backend cleanup to NIC cleanup
2025-03-10 12:22 ` [PULL 2/5] net: move backend cleanup to NIC cleanup Jason Wang
@ 2025-08-19 16:13 ` David Woodhouse
2025-08-20 2:34 ` Jason Wang
0 siblings, 1 reply; 21+ messages in thread
From: David Woodhouse @ 2025-08-19 16:13 UTC (permalink / raw)
To: Jason Wang, qemu-devel; +Cc: Eugenio Pérez, Lei Yang, Jonah Palmer
[-- Attachment #1: Type: text/plain, Size: 4121 bytes --]
On Mon, 2025-03-10 at 20:22 +0800, Jason Wang wrote:
> From: Eugenio Pérez <eperezma@redhat.com>
>
> Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net
> structures if peer nic is present") effectively delayed the backend
> cleanup, allowing the frontend or the guest to access it resources as
> long as the frontend is still visible to the guest.
>
> However it does not clean up the resources until the qemu process is
> over. This causes an effective leak if the device is deleted with
> device_del, as there is no way to close the vdpa device. This makes
> impossible to re-add that device to this or other QEMU instances until
> the first instance of QEMU is finished.
>
> Move the cleanup from qemu_cleanup to the NIC deletion and to
> net_cleanup.
>
> Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present")
> Reported-by: Lei Yang <leiyang@redhat.com>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
This crashes QEMU when I launch an emulated Xen guest with a Xen PV
NIC, and quit (using Ctrl-A x on the monitor).
$ gdb --args ./qemu-system-x86_64 --accel kvm,xen-version=0x40011,kernel-irqchip=split -serial mon:stdio -display none -m 1G -kernel /boot/vmlinuz-6.13.8-200.fc41.x86_64/boot/vmlinuz-6.13.8-200.fc41.x86_64 -append "console=ttyS0"
(gdb) handle SIGUSR1 pass nostop noprint
(gdb) run
[ 0.000000] Linux version 6.13.8-200.fc41.x86_64 (mockbuild@cdcecfee8b71420eb7f55566b7811804) (gcc (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7), GNU ld version 2.43.1-5.fc41) #1 SMP PREEMPT_DYNAMIC Sun Mar 23 05:03:09 UTC 2025[ 0.000000] Linux version 6.13.8-200.fc41.x86_64 (mockbuild@cdcecfee8b71420eb7f55566b7811804) (gcc (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7), GNU ld version 2.43.1-5.fc41) #1 SMP PREEMPT_DYNAMIC Sun Mar 23 05:03:09 UTC 2025
[ 0.000000] Command line: console=ttyS0
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000003ffdffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000003ffe0000-0x000000003fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000feff8000-0x00000000feffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] APIC: Static calls initialized
[ 0.000000] SMBIOS 2.8 present.
[ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 0.000000] DMI: Memory slots populated: 1/1
[ 0.000000] Hypervisor detected: Xen HVM
…
<Ctrl-A x>
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x000055555584821c in net_hub_port_cleanup (nc=0x555557ce23d0) at ../net/hub.c:132
132 QLIST_REMOVE(port, next);
(gdb) bt
#0 0x000055555584821c in net_hub_port_cleanup (nc=0x555557ce23d0) at ../net/hub.c:132
#1 0x000055555584a8c9 in qemu_cleanup_net_client (nc=<optimized out>, remove_from_net_clients=false)
at ../net/net.c:392
#2 qemu_del_nic (nic=0x555558866400) at ../net/net.c:476
#3 0x00005555557cae41 in xen_device_unrealize (dev=<optimized out>) at ../hw/xen/xen-bus.c:988
#4 0x0000555555c414ff in notifier_list_notify
(list=list@entry=0x5555570671f0 <exit_notifiers>, data=data@entry=0x0) at ../util/notify.c:39
#5 0x00005555557efd2c in qemu_run_exit_notifiers () at ../system/runstate.c:854
#6 0x00007ffff52996c1 in __run_exit_handlers () at /lib64/libc.so.6
#7 0x00007ffff529978e in exit () at /lib64/libc.so.6
#8 0x0000555555b90a5c in qemu_default_main (opaque=opaque@entry=0x0) at ../system/main.c:52
#9 0x00005555555622d0 in main (argc=<optimized out>, argv=<optimized out>) at ../system/main.c:76
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PULL 2/5] net: move backend cleanup to NIC cleanup
2025-08-19 16:13 ` David Woodhouse
@ 2025-08-20 2:34 ` Jason Wang
2025-08-20 6:46 ` David Woodhouse
0 siblings, 1 reply; 21+ messages in thread
From: Jason Wang @ 2025-08-20 2:34 UTC (permalink / raw)
To: Eugenio Pérez, Jonah Palmer; +Cc: qemu-devel, Lei Yang, David Woodhouse
On Wed, Aug 20, 2025 at 12:13 AM David Woodhouse <dwmw2@infradead.org> wrote:
>
> On Mon, 2025-03-10 at 20:22 +0800, Jason Wang wrote:
> > From: Eugenio Pérez <eperezma@redhat.com>
> >
> > Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net
> > structures if peer nic is present") effectively delayed the backend
> > cleanup, allowing the frontend or the guest to access it resources as
> > long as the frontend is still visible to the guest.
> >
> > However it does not clean up the resources until the qemu process is
> > over. This causes an effective leak if the device is deleted with
> > device_del, as there is no way to close the vdpa device. This makes
> > impossible to re-add that device to this or other QEMU instances until
> > the first instance of QEMU is finished.
> >
> > Move the cleanup from qemu_cleanup to the NIC deletion and to
> > net_cleanup.
> >
> > Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present")
> > Reported-by: Lei Yang <leiyang@redhat.com>
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>
> This crashes QEMU when I launch an emulated Xen guest with a Xen PV
> NIC, and quit (using Ctrl-A x on the monitor).
Eugenio and Jonah, any thoughts on this? It looks like the code
doesn't deal with hub correctly.
Thanks
>
> $ gdb --args ./qemu-system-x86_64 --accel kvm,xen-version=0x40011,kernel-irqchip=split -serial mon:stdio -display none -m 1G -kernel /boot/vmlinuz-6.13.8-200.fc41.x86_64/boot/vmlinuz-6.13.8-200.fc41.x86_64 -append "console=ttyS0"
> (gdb) handle SIGUSR1 pass nostop noprint
> (gdb) run
> [ 0.000000] Linux version 6.13.8-200.fc41.x86_64 (mockbuild@cdcecfee8b71420eb7f55566b7811804) (gcc (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7), GNU ld version 2.43.1-5.fc41) #1 SMP PREEMPT_DYNAMIC Sun Mar 23 05:03:09 UTC 2025[ 0.000000] Linux version 6.13.8-200.fc41.x86_64 (mockbuild@cdcecfee8b71420eb7f55566b7811804) (gcc (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7), GNU ld version 2.43.1-5.fc41) #1 SMP PREEMPT_DYNAMIC Sun Mar 23 05:03:09 UTC 2025
> [ 0.000000] Command line: console=ttyS0
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000003ffdffff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000003ffe0000-0x000000003fffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000feff8000-0x00000000feffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] APIC: Static calls initialized
> [ 0.000000] SMBIOS 2.8 present.
> [ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> [ 0.000000] DMI: Memory slots populated: 1/1
> [ 0.000000] Hypervisor detected: Xen HVM
> …
> <Ctrl-A x>
> Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
> 0x000055555584821c in net_hub_port_cleanup (nc=0x555557ce23d0) at ../net/hub.c:132
> 132 QLIST_REMOVE(port, next);
> (gdb) bt
> #0 0x000055555584821c in net_hub_port_cleanup (nc=0x555557ce23d0) at ../net/hub.c:132
> #1 0x000055555584a8c9 in qemu_cleanup_net_client (nc=<optimized out>, remove_from_net_clients=false)
> at ../net/net.c:392
> #2 qemu_del_nic (nic=0x555558866400) at ../net/net.c:476
> #3 0x00005555557cae41 in xen_device_unrealize (dev=<optimized out>) at ../hw/xen/xen-bus.c:988
> #4 0x0000555555c414ff in notifier_list_notify
> (list=list@entry=0x5555570671f0 <exit_notifiers>, data=data@entry=0x0) at ../util/notify.c:39
> #5 0x00005555557efd2c in qemu_run_exit_notifiers () at ../system/runstate.c:854
> #6 0x00007ffff52996c1 in __run_exit_handlers () at /lib64/libc.so.6
> #7 0x00007ffff529978e in exit () at /lib64/libc.so.6
> #8 0x0000555555b90a5c in qemu_default_main (opaque=opaque@entry=0x0) at ../system/main.c:52
> #9 0x00005555555622d0 in main (argc=<optimized out>, argv=<optimized out>) at ../system/main.c:76
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PULL 2/5] net: move backend cleanup to NIC cleanup
2025-08-20 2:34 ` Jason Wang
@ 2025-08-20 6:46 ` David Woodhouse
2025-08-20 7:14 ` Eugenio Perez Martin
0 siblings, 1 reply; 21+ messages in thread
From: David Woodhouse @ 2025-08-20 6:46 UTC (permalink / raw)
To: Jason Wang, Eugenio Pérez, Jonah Palmer; +Cc: qemu-devel, Lei Yang
[-- Attachment #1: Type: text/plain, Size: 1996 bytes --]
On Wed, 2025-08-20 at 10:34 +0800, Jason Wang wrote:
> On Wed, Aug 20, 2025 at 12:13 AM David Woodhouse <dwmw2@infradead.org> wrote:
> >
> > On Mon, 2025-03-10 at 20:22 +0800, Jason Wang wrote:
> > > From: Eugenio Pérez <eperezma@redhat.com>
> > >
> > > Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net
> > > structures if peer nic is present") effectively delayed the backend
> > > cleanup, allowing the frontend or the guest to access it resources as
> > > long as the frontend is still visible to the guest.
> > >
> > > However it does not clean up the resources until the qemu process is
> > > over. This causes an effective leak if the device is deleted with
> > > device_del, as there is no way to close the vdpa device. This makes
> > > impossible to re-add that device to this or other QEMU instances until
> > > the first instance of QEMU is finished.
> > >
> > > Move the cleanup from qemu_cleanup to the NIC deletion and to
> > > net_cleanup.
> > >
> > > Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present")
> > > Reported-by: Lei Yang <leiyang@redhat.com>
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> >
> > This crashes QEMU when I launch an emulated Xen guest with a Xen PV
> > NIC, and quit (using Ctrl-A x on the monitor).
>
> Eugenio and Jonah, any thoughts on this? It looks like the code
> doesn't deal with hub correctly.
The interesting part about Xen netback is that it does its own teardown
from xen_device_unrealize() because in the case of running under true
Xen, it needs to clean up XenStore nodes which are externally visible.
It doesn't all just go away when QEMU exits.
We fixed a potentially similar issue in commit 84f85eb95f14a ("net: do
not delete nics in net_cleanup()")?
I should add a test case for this...
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PULL 2/5] net: move backend cleanup to NIC cleanup
2025-08-20 6:46 ` David Woodhouse
@ 2025-08-20 7:14 ` Eugenio Perez Martin
2025-08-20 12:22 ` Jonah Palmer
0 siblings, 1 reply; 21+ messages in thread
From: Eugenio Perez Martin @ 2025-08-20 7:14 UTC (permalink / raw)
To: David Woodhouse; +Cc: Jason Wang, Jonah Palmer, qemu-devel, Lei Yang
On Wed, Aug 20, 2025 at 8:47 AM David Woodhouse <dwmw2@infradead.org> wrote:
>
> On Wed, 2025-08-20 at 10:34 +0800, Jason Wang wrote:
> > On Wed, Aug 20, 2025 at 12:13 AM David Woodhouse <dwmw2@infradead.org> wrote:
> > >
> > > On Mon, 2025-03-10 at 20:22 +0800, Jason Wang wrote:
> > > > From: Eugenio Pérez <eperezma@redhat.com>
> > > >
> > > > Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net
> > > > structures if peer nic is present") effectively delayed the backend
> > > > cleanup, allowing the frontend or the guest to access it resources as
> > > > long as the frontend is still visible to the guest.
> > > >
> > > > However it does not clean up the resources until the qemu process is
> > > > over. This causes an effective leak if the device is deleted with
> > > > device_del, as there is no way to close the vdpa device. This makes
> > > > impossible to re-add that device to this or other QEMU instances until
> > > > the first instance of QEMU is finished.
> > > >
> > > > Move the cleanup from qemu_cleanup to the NIC deletion and to
> > > > net_cleanup.
> > > >
> > > > Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present")
> > > > Reported-by: Lei Yang <leiyang@redhat.com>
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > >
> > > This crashes QEMU when I launch an emulated Xen guest with a Xen PV
> > > NIC, and quit (using Ctrl-A x on the monitor).
> >
> > Eugenio and Jonah, any thoughts on this? It looks like the code
> > doesn't deal with hub correctly.
>
> The interesting part about Xen netback is that it does its own teardown
> from xen_device_unrealize() because in the case of running under true
> Xen, it needs to clean up XenStore nodes which are externally visible.
> It doesn't all just go away when QEMU exits.
>
> We fixed a potentially similar issue in commit 84f85eb95f14a ("net: do
> not delete nics in net_cleanup()")?
>
I was not aware of that code to be honest :(.
Jonah, could you take a look at this? It should be a matter of
replicating what the series does in the list on NICs, in this linked
list.
Thanks!
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PULL 2/5] net: move backend cleanup to NIC cleanup
2025-08-20 7:14 ` Eugenio Perez Martin
@ 2025-08-20 12:22 ` Jonah Palmer
2025-08-20 18:06 ` David Woodhouse
0 siblings, 1 reply; 21+ messages in thread
From: Jonah Palmer @ 2025-08-20 12:22 UTC (permalink / raw)
To: Eugenio Perez Martin, David Woodhouse; +Cc: Jason Wang, qemu-devel, Lei Yang
On 8/20/25 3:14 AM, Eugenio Perez Martin wrote:
> On Wed, Aug 20, 2025 at 8:47 AM David Woodhouse <dwmw2@infradead.org> wrote:
>>
>> On Wed, 2025-08-20 at 10:34 +0800, Jason Wang wrote:
>>> On Wed, Aug 20, 2025 at 12:13 AM David Woodhouse <dwmw2@infradead.org> wrote:
>>>>
>>>> On Mon, 2025-03-10 at 20:22 +0800, Jason Wang wrote:
>>>>> From: Eugenio Pérez <eperezma@redhat.com>
>>>>>
>>>>> Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net
>>>>> structures if peer nic is present") effectively delayed the backend
>>>>> cleanup, allowing the frontend or the guest to access it resources as
>>>>> long as the frontend is still visible to the guest.
>>>>>
>>>>> However it does not clean up the resources until the qemu process is
>>>>> over. This causes an effective leak if the device is deleted with
>>>>> device_del, as there is no way to close the vdpa device. This makes
>>>>> impossible to re-add that device to this or other QEMU instances until
>>>>> the first instance of QEMU is finished.
>>>>>
>>>>> Move the cleanup from qemu_cleanup to the NIC deletion and to
>>>>> net_cleanup.
>>>>>
>>>>> Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present")
>>>>> Reported-by: Lei Yang <leiyang@redhat.com>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>
>>>> This crashes QEMU when I launch an emulated Xen guest with a Xen PV
>>>> NIC, and quit (using Ctrl-A x on the monitor).
>>>
>>> Eugenio and Jonah, any thoughts on this? It looks like the code
>>> doesn't deal with hub correctly.
>>
>> The interesting part about Xen netback is that it does its own teardown
>> from xen_device_unrealize() because in the case of running under true
>> Xen, it needs to clean up XenStore nodes which are externally visible.
>> It doesn't all just go away when QEMU exits.
>>
>> We fixed a potentially similar issue in commit 84f85eb95f14a ("net: do
>> not delete nics in net_cleanup()")?
>>
>
> I was not aware of that code to be honest :(.
>
> Jonah, could you take a look at this? It should be a matter of
> replicating what the series does in the list on NICs, in this linked
> list.
>
> Thanks!
>
Sure, I'll see what I can do here.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PULL 2/5] net: move backend cleanup to NIC cleanup
2025-08-20 12:22 ` Jonah Palmer
@ 2025-08-20 18:06 ` David Woodhouse
0 siblings, 0 replies; 21+ messages in thread
From: David Woodhouse @ 2025-08-20 18:06 UTC (permalink / raw)
To: Jonah Palmer, Eugenio Perez Martin; +Cc: Jason Wang, qemu-devel, Lei Yang
[-- Attachment #1: Type: text/plain, Size: 1579 bytes --]
On Wed, 2025-08-20 at 08:22 -0400, Jonah Palmer wrote:
> > > > > > From: Eugenio Pérez <eperezma@redhat.com>
> > > > > >
> > > > > > Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net
> > > > > > structures if peer nic is present") effectively delayed the backend
> > > > > > cleanup, allowing the frontend or the guest to access it resources as
> > > > > > long as the frontend is still visible to the guest.
> > > > > >
> > > > > > However it does not clean up the resources until the qemu process is
> > > > > > over. This causes an effective leak if the device is deleted with
> > > > > > device_del, as there is no way to close the vdpa device. This makes
> > > > > > impossible to re-add that device to this or other QEMU instances until
> > > > > > the first instance of QEMU is finished.
> > > > > >
> > > > > > Move the cleanup from qemu_cleanup to the NIC deletion and to
> > > > > > net_cleanup.
> > > > > >
> > > > > > Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the
> > > > > > vdpa/vhost-net structures if peer nic is present")
> > > > > > Reported-by: Lei Yang <leiyang@redhat.com>
> > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
Is it just that VDPA devices should have an unrealize() method to
properly tear themselves down? The problem appears to happen to the Xen
netdev when it gets torn down twice, precisely because it *was* doing
the right thing for itself?
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PULL 3/5] util/iov: Do not assert offset is in iov
2025-03-10 12:22 [PULL 0/5] Net patches Jason Wang
2025-03-10 12:22 ` [PULL 1/5] net: parameterize the removing client from nc list Jason Wang
2025-03-10 12:22 ` [PULL 2/5] net: move backend cleanup to NIC cleanup Jason Wang
@ 2025-03-10 12:22 ` Jason Wang
2025-03-10 12:22 ` [PULL 4/5] Revert "hw/net/net_tx_pkt: Fix overrun in update_sctp_checksum()" Jason Wang
` (2 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Jason Wang @ 2025-03-10 12:22 UTC (permalink / raw)
To: qemu-devel; +Cc: Akihiko Odaki, Jason Wang
From: Akihiko Odaki <akihiko.odaki@daynix.com>
iov_from_buf(), iov_to_buf(), iov_memset(), and iov_copy() asserts
that the given offset fits in the iov while tolerating the specified
number of bytes to operate with to be greater than the size of iov.
This is inconsistent so remove the assertions.
Asserting the offset fits in the iov makes sense if it is expected that
there are other operations that process the content before the offset
and the content is processed in order. Under this expectation, the
offset should point to the end of bytes that are previously processed
and fit in the iov. However, this expectation depends on the details of
the caller, and did not hold true at least one case and required code to
check iov_size(), which is added with commit 83ddb3dbba2e
("hw/net/net_tx_pkt: Fix overrun in update_sctp_checksum()").
Adding such a check is inefficient and error-prone. These functions
already tolerate the specified number of bytes to operate with to be
greater than the size of iov to avoid such checks so remove the
assertions to tolerate invalid offset as well. They return the number of
bytes they operated with so their callers can still check the returned
value to ensure there are sufficient space at the given offset.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/qemu/iov.h | 5 +++--
util/iov.c | 5 -----
2 files changed, 3 insertions(+), 7 deletions(-)
diff --git a/include/qemu/iov.h b/include/qemu/iov.h
index 44f9db5cee..9535673c13 100644
--- a/include/qemu/iov.h
+++ b/include/qemu/iov.h
@@ -31,7 +31,7 @@ size_t iov_size(const struct iovec *iov, const unsigned int iov_cnt);
* only part of data will be copied, up to the end of the iovec.
* Number of bytes actually copied will be returned, which is
* min(bytes, iov_size(iov)-offset)
- * `Offset' must point to the inside of iovec.
+ * Returns 0 when `offset' points to the outside of iovec.
*/
size_t iov_from_buf_full(const struct iovec *iov, unsigned int iov_cnt,
size_t offset, const void *buf, size_t bytes);
@@ -67,11 +67,12 @@ iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt,
/**
* Set data bytes pointed out by iovec `iov' of size `iov_cnt' elements,
* starting at byte offset `start', to value `fillc', repeating it
- * `bytes' number of times. `Offset' must point to the inside of iovec.
+ * `bytes' number of times.
* If `bytes' is large enough, only last bytes portion of iovec,
* up to the end of it, will be filled with the specified value.
* Function return actual number of bytes processed, which is
* min(size, iov_size(iov) - offset).
+ * Returns 0 when `offset' points to the outside of iovec.
*/
size_t iov_memset(const struct iovec *iov, const unsigned int iov_cnt,
size_t offset, int fillc, size_t bytes);
diff --git a/util/iov.c b/util/iov.c
index 7777116123..f8536f0474 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -37,7 +37,6 @@ size_t iov_from_buf_full(const struct iovec *iov, unsigned int iov_cnt,
offset -= iov[i].iov_len;
}
}
- assert(offset == 0);
return done;
}
@@ -56,7 +55,6 @@ size_t iov_to_buf_full(const struct iovec *iov, const unsigned int iov_cnt,
offset -= iov[i].iov_len;
}
}
- assert(offset == 0);
return done;
}
@@ -75,7 +73,6 @@ size_t iov_memset(const struct iovec *iov, const unsigned int iov_cnt,
offset -= iov[i].iov_len;
}
}
- assert(offset == 0);
return done;
}
@@ -277,7 +274,6 @@ unsigned iov_copy(struct iovec *dst_iov, unsigned int dst_iov_cnt,
bytes -= len;
offset = 0;
}
- assert(offset == 0);
return j;
}
@@ -348,7 +344,6 @@ size_t qemu_iovec_concat_iov(QEMUIOVector *dst,
soffset -= src_iov[i].iov_len;
}
}
- assert(soffset == 0); /* offset beyond end of src */
return done;
}
--
2.42.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PULL 4/5] Revert "hw/net/net_tx_pkt: Fix overrun in update_sctp_checksum()"
2025-03-10 12:22 [PULL 0/5] Net patches Jason Wang
` (2 preceding siblings ...)
2025-03-10 12:22 ` [PULL 3/5] util/iov: Do not assert offset is in iov Jason Wang
@ 2025-03-10 12:22 ` Jason Wang
2025-03-10 12:22 ` [PULL 5/5] tap-linux: Open ipvtap and macvtap Jason Wang
2025-03-11 5:03 ` [PULL 0/5] Net patches Stefan Hajnoczi
5 siblings, 0 replies; 21+ messages in thread
From: Jason Wang @ 2025-03-10 12:22 UTC (permalink / raw)
To: qemu-devel; +Cc: Akihiko Odaki, Philippe Mathieu-Daudé, Jason Wang
From: Akihiko Odaki <akihiko.odaki@daynix.com>
This reverts commit 83ddb3dbba2ee0f1767442ae6ee665058aeb1093.
The added check is no longer necessary due to a change of
iov_from_buf().
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
hw/net/net_tx_pkt.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 1f79b82b77..903238dca2 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -141,10 +141,6 @@ bool net_tx_pkt_update_sctp_checksum(struct NetTxPkt *pkt)
uint32_t csum = 0;
struct iovec *pl_start_frag = pkt->vec + NET_TX_PKT_PL_START_FRAG;
- if (iov_size(pl_start_frag, pkt->payload_frags) < 8 + sizeof(csum)) {
- return false;
- }
-
if (iov_from_buf(pl_start_frag, pkt->payload_frags, 8, &csum, sizeof(csum)) < sizeof(csum)) {
return false;
}
--
2.42.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PULL 5/5] tap-linux: Open ipvtap and macvtap
2025-03-10 12:22 [PULL 0/5] Net patches Jason Wang
` (3 preceding siblings ...)
2025-03-10 12:22 ` [PULL 4/5] Revert "hw/net/net_tx_pkt: Fix overrun in update_sctp_checksum()" Jason Wang
@ 2025-03-10 12:22 ` Jason Wang
2025-03-11 5:03 ` [PULL 0/5] Net patches Stefan Hajnoczi
5 siblings, 0 replies; 21+ messages in thread
From: Jason Wang @ 2025-03-10 12:22 UTC (permalink / raw)
To: qemu-devel; +Cc: Akihiko Odaki, Jason Wang
From: Akihiko Odaki <akihiko.odaki@daynix.com>
ipvtap and macvtap create a file for each interface unlike tuntap, which
creates one file shared by all interfaces. Try to open a file dedicated
to the interface first for ipvtap and macvtap.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
net/tap-linux.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/net/tap-linux.c b/net/tap-linux.c
index 1226d5fda2..22ec2f45d2 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -45,10 +45,21 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
int len = sizeof(struct virtio_net_hdr);
unsigned int features;
- fd = RETRY_ON_EINTR(open(PATH_NET_TUN, O_RDWR));
+
+ ret = if_nametoindex(ifname);
+ if (ret) {
+ g_autofree char *file = g_strdup_printf("/dev/tap%d", ret);
+ fd = open(file, O_RDWR);
+ } else {
+ fd = -1;
+ }
+
if (fd < 0) {
- error_setg_errno(errp, errno, "could not open %s", PATH_NET_TUN);
- return -1;
+ fd = RETRY_ON_EINTR(open(PATH_NET_TUN, O_RDWR));
+ if (fd < 0) {
+ error_setg_errno(errp, errno, "could not open %s", PATH_NET_TUN);
+ return -1;
+ }
}
memset(&ifr, 0, sizeof(ifr));
ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
--
2.42.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PULL 0/5] Net patches
2025-03-10 12:22 [PULL 0/5] Net patches Jason Wang
` (4 preceding siblings ...)
2025-03-10 12:22 ` [PULL 5/5] tap-linux: Open ipvtap and macvtap Jason Wang
@ 2025-03-11 5:03 ` Stefan Hajnoczi
5 siblings, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2025-03-11 5:03 UTC (permalink / raw)
To: Jason Wang; +Cc: qemu-devel, Jason Wang
[-- Attachment #1: Type: text/plain, Size: 116 bytes --]
Applied, thanks.
Please update the changelog at https://wiki.qemu.org/ChangeLog/10.0 for any user-visible changes.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread