* [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing @ 2025-07-02 10:37 Zigit Zo 2025-07-10 8:30 ` Paolo Abeni 2025-07-15 9:31 ` Jason Wang 0 siblings, 2 replies; 6+ messages in thread From: Zigit Zo @ 2025-07-02 10:37 UTC (permalink / raw) To: mst, jasowang, xuanzhuo, eperezma Cc: zuozhijie, virtualization, netdev, linux-kernel This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> --- v1 -> v2: - Check vi->status in virtnet_open(). v1: - https://lore.kernel.org/netdev/20250630095109.214013-1-zuozhijie@bytedance.com/ --- drivers/net/virtio_net.c | 43 ++++++++++++++++++++++++---------------- 1 file changed, 26 insertions(+), 17 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index e53ba600605a..859add98909b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3151,6 +3151,10 @@ static int virtnet_open(struct net_device *dev) if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) { if (vi->status & VIRTIO_NET_S_LINK_UP) netif_carrier_on(vi->dev); + if (vi->status & VIRTIO_NET_S_ANNOUNCE) { + vi->status &= ~VIRTIO_NET_S_ANNOUNCE; + schedule_work(&vi->config_work); + } virtio_config_driver_enable(vi->vdev); } else { vi->status = VIRTIO_NET_S_LINK_UP; @@ -6215,33 +6219,34 @@ static void virtnet_config_changed_work(struct work_struct *work) { struct virtnet_info *vi = container_of(work, struct virtnet_info, config_work); - u16 v; + u16 v, changed; if (virtio_cread_feature(vi->vdev, VIRTIO_NET_F_STATUS, struct virtio_net_config, status, &v) < 0) return; - if (v & VIRTIO_NET_S_ANNOUNCE) { + changed = vi->status ^ v; + + /* Assume the device will clear announce status when ack received. */ + if ((changed & VIRTIO_NET_S_ANNOUNCE) && (v & VIRTIO_NET_S_ANNOUNCE)) { netdev_notify_peers(vi->dev); virtnet_ack_link_announce(vi); + v &= ~VIRTIO_NET_S_ANNOUNCE; } - /* Ignore unknown (future) status bits */ - v &= VIRTIO_NET_S_LINK_UP; - - if (vi->status == v) - return; - - vi->status = v; - - if (vi->status & VIRTIO_NET_S_LINK_UP) { - virtnet_update_settings(vi); - netif_carrier_on(vi->dev); - netif_tx_wake_all_queues(vi->dev); - } else { - netif_carrier_off(vi->dev); - netif_tx_stop_all_queues(vi->dev); + if (changed & VIRTIO_NET_S_LINK_UP) { + if (v & VIRTIO_NET_S_LINK_UP) { + virtnet_update_settings(vi); + netif_carrier_on(vi->dev); + netif_tx_wake_all_queues(vi->dev); + } else { + netif_carrier_off(vi->dev); + netif_tx_stop_all_queues(vi->dev); + } } + + /* Ignore unknown (future) status bits */ + vi->status = v & (VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE); } static void virtnet_config_changed(struct virtio_device *vdev) @@ -7030,6 +7035,10 @@ static int virtnet_probe(struct virtio_device *vdev) otherwise get link status from config. */ netif_carrier_off(dev); if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) { + /* If there's (rarely) an announcement, the actual work will be + * scheduled on ndo_open() to avoid recursive rtnl_lock() here. + */ + vi->status = VIRTIO_NET_S_ANNOUNCE; virtnet_config_changed_work(&vi->config_work); } else { vi->status = VIRTIO_NET_S_LINK_UP; base-commit: 34a500caf48c47d5171f4aa1f237da39b07c6157 -- 2.49.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing 2025-07-02 10:37 [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing Zigit Zo @ 2025-07-10 8:30 ` Paolo Abeni 2025-07-15 9:31 ` Jason Wang 1 sibling, 0 replies; 6+ messages in thread From: Paolo Abeni @ 2025-07-10 8:30 UTC (permalink / raw) To: Zigit Zo, mst, jasowang, xuanzhuo, eperezma Cc: virtualization, netdev, linux-kernel On 7/2/25 12:37 PM, Zigit Zo wrote: > This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while > the virtio-net driver is still probing with rtnl_lock() hold, this will > cause a recursive mutex in netdev_notify_peers(). > > Fix it by temporarily save the announce status while probing, and then in > virtnet_open(), if it sees a delayed announce work is there, it starts to > schedule the virtnet_config_changed_work(). > > Another possible solution is to directly check whether rtnl_is_locked() > and call __netdev_notify_peers(), but in that way means we need to relies > on netdev_queue to schedule the arp packets after ndo_open(), which we > thought is not very intuitive. > > We've observed a softlockup with Ubuntu 24.04, and can be reproduced with > QEMU sending the announce_self rapidly while booting. > > [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. > [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu > [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 > [ 494.168260] Call Trace: > [ 494.168329] <TASK> > [ 494.168389] __schedule+0x27c/0x6b0 > [ 494.168495] schedule+0x33/0x110 > [ 494.168585] schedule_preempt_disabled+0x15/0x30 > [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 > [ 494.168835] __mutex_lock_slowpath+0x13/0x20 > [ 494.168949] mutex_lock+0x3c/0x50 > [ 494.169039] rtnl_lock+0x15/0x20 > [ 494.169128] netdev_notify_peers+0x12/0x30 > [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 > [ 494.169377] virtnet_probe+0xa48/0xe00 > [ 494.169484] ? vp_get+0x4d/0x100 > [ 494.169574] virtio_dev_probe+0x1e9/0x310 > [ 494.169682] really_probe+0x1c7/0x410 > [ 494.169783] __driver_probe_device+0x8c/0x180 > [ 494.169901] driver_probe_device+0x24/0xd0 > [ 494.170011] __driver_attach+0x10b/0x210 > [ 494.170117] ? __pfx___driver_attach+0x10/0x10 > [ 494.170237] bus_for_each_dev+0x8d/0xf0 > [ 494.170341] driver_attach+0x1e/0x30 > [ 494.170440] bus_add_driver+0x14e/0x290 > [ 494.170548] driver_register+0x5e/0x130 > [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 > [ 494.170788] register_virtio_driver+0x20/0x40 > [ 494.170905] virtio_net_driver_init+0x97/0xb0 > [ 494.171022] do_one_initcall+0x5e/0x340 > [ 494.171128] do_initcalls+0x107/0x230 > [ 494.171228] ? __pfx_kernel_init+0x10/0x10 > [ 494.171340] kernel_init_freeable+0x134/0x210 > [ 494.171462] kernel_init+0x1b/0x200 > [ 494.171560] ret_from_fork+0x47/0x70 > [ 494.171659] ? __pfx_kernel_init+0x10/0x10 > [ 494.171769] ret_from_fork_asm+0x1b/0x30 > [ 494.171875] </TASK> > > Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") > Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> @Micheal: I think this addresses your concerns on v1, WDYT? /P ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing 2025-07-02 10:37 [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing Zigit Zo 2025-07-10 8:30 ` Paolo Abeni @ 2025-07-15 9:31 ` Jason Wang 2025-07-15 10:59 ` Zigit Zo 1 sibling, 1 reply; 6+ messages in thread From: Jason Wang @ 2025-07-15 9:31 UTC (permalink / raw) To: Zigit Zo; +Cc: mst, xuanzhuo, eperezma, virtualization, netdev, linux-kernel On Wed, Jul 2, 2025 at 6:37 PM Zigit Zo <zuozhijie@bytedance.com> wrote: > > This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while > the virtio-net driver is still probing with rtnl_lock() hold, this will > cause a recursive mutex in netdev_notify_peers(). > > Fix it by temporarily save the announce status while probing, and then in > virtnet_open(), if it sees a delayed announce work is there, it starts to > schedule the virtnet_config_changed_work(). > > Another possible solution is to directly check whether rtnl_is_locked() > and call __netdev_notify_peers(), but in that way means we need to relies > on netdev_queue to schedule the arp packets after ndo_open(), which we > thought is not very intuitive. > > We've observed a softlockup with Ubuntu 24.04, and can be reproduced with > QEMU sending the announce_self rapidly while booting. > > [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. > [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu > [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 > [ 494.168260] Call Trace: > [ 494.168329] <TASK> > [ 494.168389] __schedule+0x27c/0x6b0 > [ 494.168495] schedule+0x33/0x110 > [ 494.168585] schedule_preempt_disabled+0x15/0x30 > [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 > [ 494.168835] __mutex_lock_slowpath+0x13/0x20 > [ 494.168949] mutex_lock+0x3c/0x50 > [ 494.169039] rtnl_lock+0x15/0x20 > [ 494.169128] netdev_notify_peers+0x12/0x30 > [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 > [ 494.169377] virtnet_probe+0xa48/0xe00 > [ 494.169484] ? vp_get+0x4d/0x100 > [ 494.169574] virtio_dev_probe+0x1e9/0x310 > [ 494.169682] really_probe+0x1c7/0x410 > [ 494.169783] __driver_probe_device+0x8c/0x180 > [ 494.169901] driver_probe_device+0x24/0xd0 > [ 494.170011] __driver_attach+0x10b/0x210 > [ 494.170117] ? __pfx___driver_attach+0x10/0x10 > [ 494.170237] bus_for_each_dev+0x8d/0xf0 > [ 494.170341] driver_attach+0x1e/0x30 > [ 494.170440] bus_add_driver+0x14e/0x290 > [ 494.170548] driver_register+0x5e/0x130 > [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 > [ 494.170788] register_virtio_driver+0x20/0x40 > [ 494.170905] virtio_net_driver_init+0x97/0xb0 > [ 494.171022] do_one_initcall+0x5e/0x340 > [ 494.171128] do_initcalls+0x107/0x230 > [ 494.171228] ? __pfx_kernel_init+0x10/0x10 > [ 494.171340] kernel_init_freeable+0x134/0x210 > [ 494.171462] kernel_init+0x1b/0x200 > [ 494.171560] ret_from_fork+0x47/0x70 > [ 494.171659] ? __pfx_kernel_init+0x10/0x10 > [ 494.171769] ret_from_fork_asm+0x1b/0x30 > [ 494.171875] </TASK> > > Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") > Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> > --- > v1 -> v2: > - Check vi->status in virtnet_open(). > v1: > - https://lore.kernel.org/netdev/20250630095109.214013-1-zuozhijie@bytedance.com/ > --- > drivers/net/virtio_net.c | 43 ++++++++++++++++++++++++---------------- > 1 file changed, 26 insertions(+), 17 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index e53ba600605a..859add98909b 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -3151,6 +3151,10 @@ static int virtnet_open(struct net_device *dev) > if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) { > if (vi->status & VIRTIO_NET_S_LINK_UP) > netif_carrier_on(vi->dev); > + if (vi->status & VIRTIO_NET_S_ANNOUNCE) { > + vi->status &= ~VIRTIO_NET_S_ANNOUNCE; > + schedule_work(&vi->config_work); > + } > virtio_config_driver_enable(vi->vdev); Instead of doing tricks like this. I wonder if the fix is as simple as calling virtio_config_driver_disable() before init_vqs()? Thanks ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Re: [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing 2025-07-15 9:31 ` Jason Wang @ 2025-07-15 10:59 ` Zigit Zo 2025-07-16 2:47 ` Jason Wang 0 siblings, 1 reply; 6+ messages in thread From: Zigit Zo @ 2025-07-15 10:59 UTC (permalink / raw) To: Jason Wang; +Cc: mst, xuanzhuo, eperezma, virtualization, netdev, linux-kernel On 7/15/25 5:31 PM, Jason Wang wrote: > On Wed, Jul 2, 2025 at 6:37 PM Zigit Zo <zuozhijie@bytedance.com> wrote: >> >> This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while >> the virtio-net driver is still probing with rtnl_lock() hold, this will >> cause a recursive mutex in netdev_notify_peers(). >> >> Fix it by temporarily save the announce status while probing, and then in >> virtnet_open(), if it sees a delayed announce work is there, it starts to >> schedule the virtnet_config_changed_work(). >> >> Another possible solution is to directly check whether rtnl_is_locked() >> and call __netdev_notify_peers(), but in that way means we need to relies >> on netdev_queue to schedule the arp packets after ndo_open(), which we >> thought is not very intuitive. >> >> We've observed a softlockup with Ubuntu 24.04, and can be reproduced with >> QEMU sending the announce_self rapidly while booting. >> >> [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. >> [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu >> [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 >> [ 494.168260] Call Trace: >> [ 494.168329] <TASK> >> [ 494.168389] __schedule+0x27c/0x6b0 >> [ 494.168495] schedule+0x33/0x110 >> [ 494.168585] schedule_preempt_disabled+0x15/0x30 >> [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 >> [ 494.168835] __mutex_lock_slowpath+0x13/0x20 >> [ 494.168949] mutex_lock+0x3c/0x50 >> [ 494.169039] rtnl_lock+0x15/0x20 >> [ 494.169128] netdev_notify_peers+0x12/0x30 >> [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 >> [ 494.169377] virtnet_probe+0xa48/0xe00 >> [ 494.169484] ? vp_get+0x4d/0x100 >> [ 494.169574] virtio_dev_probe+0x1e9/0x310 >> [ 494.169682] really_probe+0x1c7/0x410 >> [ 494.169783] __driver_probe_device+0x8c/0x180 >> [ 494.169901] driver_probe_device+0x24/0xd0 >> [ 494.170011] __driver_attach+0x10b/0x210 >> [ 494.170117] ? __pfx___driver_attach+0x10/0x10 >> [ 494.170237] bus_for_each_dev+0x8d/0xf0 >> [ 494.170341] driver_attach+0x1e/0x30 >> [ 494.170440] bus_add_driver+0x14e/0x290 >> [ 494.170548] driver_register+0x5e/0x130 >> [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 >> [ 494.170788] register_virtio_driver+0x20/0x40 >> [ 494.170905] virtio_net_driver_init+0x97/0xb0 >> [ 494.171022] do_one_initcall+0x5e/0x340 >> [ 494.171128] do_initcalls+0x107/0x230 >> [ 494.171228] ? __pfx_kernel_init+0x10/0x10 >> [ 494.171340] kernel_init_freeable+0x134/0x210 >> [ 494.171462] kernel_init+0x1b/0x200 >> [ 494.171560] ret_from_fork+0x47/0x70 >> [ 494.171659] ? __pfx_kernel_init+0x10/0x10 >> [ 494.171769] ret_from_fork_asm+0x1b/0x30 >> [ 494.171875] </TASK> >> >> Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") >> Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> >> --- >> v1 -> v2: >> - Check vi->status in virtnet_open(). >> v1: >> - https://lore.kernel.org/netdev/20250630095109.214013-1-zuozhijie@bytedance.com/ >> --- >> drivers/net/virtio_net.c | 43 ++++++++++++++++++++++++---------------- >> 1 file changed, 26 insertions(+), 17 deletions(-) >> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >> index e53ba600605a..859add98909b 100644 >> --- a/drivers/net/virtio_net.c >> +++ b/drivers/net/virtio_net.c >> @@ -3151,6 +3151,10 @@ static int virtnet_open(struct net_device *dev) >> if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) { >> if (vi->status & VIRTIO_NET_S_LINK_UP) >> netif_carrier_on(vi->dev); >> + if (vi->status & VIRTIO_NET_S_ANNOUNCE) { >> + vi->status &= ~VIRTIO_NET_S_ANNOUNCE; >> + schedule_work(&vi->config_work); >> + } >> virtio_config_driver_enable(vi->vdev); > > Instead of doing tricks like this. > > I wonder if the fix is as simple as calling > virtio_config_driver_disable() before init_vqs()? > > Thanks > That might not work as the device like QEMU will set the VIRTIO_NET_S_ANNOUNCE regardless of most of the driver status, QEMU only checks whether the driver has finalized it's features with VIRTIO_NET_F_GUEST_ANNOUNCE & VIRTIO_NET_F_CTRL_VQ. We've made a little patch to verify, don't know if it matches your thought, but it does not seem to work :( diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index e53ba600605a..f309ce3fe243 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -6903,6 +6903,9 @@ static int virtnet_probe(struct virtio_device *vdev) vi->curr_queue_pairs = num_online_cpus(); vi->max_queue_pairs = max_queue_pairs; + /* Disable config change notification until ndo_open. */ + virtio_config_driver_disable(vi->vdev); + /* Allocate/initialize the rx/tx queues, and invoke find_vqs */ err = init_vqs(vi); if (err) @@ -6965,9 +6968,6 @@ static int virtnet_probe(struct virtio_device *vdev) goto free_failover; } - /* Disable config change notification until ndo_open. */ - virtio_config_driver_disable(vi->vdev); - virtio_device_ready(vdev); if (vi->has_rss || vi->has_rss_hash_report) { For reproduce details, 1. Spawn qemu with monitor, like `-monitor unix:qemu.sock,server` 2. In another window, run `while true; echo "announce_self"; end | socat - unix-connect:qemu.sock > /dev/null` 3. The boot up will get hanged when probing the virtio_net The simplest version we've made is to revert the usage of `virtnet_config_changed_work()` back to the `schedule_work()`, but as in v1, we're still trying to understand the impact, making sure that it won't break other things. Regards, ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Re: [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing 2025-07-15 10:59 ` Zigit Zo @ 2025-07-16 2:47 ` Jason Wang 2025-07-16 11:20 ` Zigit Zo 0 siblings, 1 reply; 6+ messages in thread From: Jason Wang @ 2025-07-16 2:47 UTC (permalink / raw) To: Zigit Zo; +Cc: mst, xuanzhuo, eperezma, virtualization, netdev, linux-kernel Hi Zigit: On Tue, Jul 15, 2025 at 7:00 PM Zigit Zo <zuozhijie@bytedance.com> wrote: > > On 7/15/25 5:31 PM, Jason Wang wrote: > > On Wed, Jul 2, 2025 at 6:37 PM Zigit Zo <zuozhijie@bytedance.com> wrote: > >> > >> This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while > >> the virtio-net driver is still probing with rtnl_lock() hold, this will > >> cause a recursive mutex in netdev_notify_peers(). > >> > >> Fix it by temporarily save the announce status while probing, and then in > >> virtnet_open(), if it sees a delayed announce work is there, it starts to > >> schedule the virtnet_config_changed_work(). > >> > >> Another possible solution is to directly check whether rtnl_is_locked() > >> and call __netdev_notify_peers(), but in that way means we need to relies > >> on netdev_queue to schedule the arp packets after ndo_open(), which we > >> thought is not very intuitive. > >> > >> We've observed a softlockup with Ubuntu 24.04, and can be reproduced with > >> QEMU sending the announce_self rapidly while booting. > >> > >> [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. > >> [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu > >> [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > >> [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 > >> [ 494.168260] Call Trace: > >> [ 494.168329] <TASK> > >> [ 494.168389] __schedule+0x27c/0x6b0 > >> [ 494.168495] schedule+0x33/0x110 > >> [ 494.168585] schedule_preempt_disabled+0x15/0x30 > >> [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 > >> [ 494.168835] __mutex_lock_slowpath+0x13/0x20 > >> [ 494.168949] mutex_lock+0x3c/0x50 > >> [ 494.169039] rtnl_lock+0x15/0x20 > >> [ 494.169128] netdev_notify_peers+0x12/0x30 > >> [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 > >> [ 494.169377] virtnet_probe+0xa48/0xe00 > >> [ 494.169484] ? vp_get+0x4d/0x100 > >> [ 494.169574] virtio_dev_probe+0x1e9/0x310 > >> [ 494.169682] really_probe+0x1c7/0x410 > >> [ 494.169783] __driver_probe_device+0x8c/0x180 > >> [ 494.169901] driver_probe_device+0x24/0xd0 > >> [ 494.170011] __driver_attach+0x10b/0x210 > >> [ 494.170117] ? __pfx___driver_attach+0x10/0x10 > >> [ 494.170237] bus_for_each_dev+0x8d/0xf0 > >> [ 494.170341] driver_attach+0x1e/0x30 > >> [ 494.170440] bus_add_driver+0x14e/0x290 > >> [ 494.170548] driver_register+0x5e/0x130 > >> [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 > >> [ 494.170788] register_virtio_driver+0x20/0x40 > >> [ 494.170905] virtio_net_driver_init+0x97/0xb0 > >> [ 494.171022] do_one_initcall+0x5e/0x340 > >> [ 494.171128] do_initcalls+0x107/0x230 > >> [ 494.171228] ? __pfx_kernel_init+0x10/0x10 > >> [ 494.171340] kernel_init_freeable+0x134/0x210 > >> [ 494.171462] kernel_init+0x1b/0x200 > >> [ 494.171560] ret_from_fork+0x47/0x70 > >> [ 494.171659] ? __pfx_kernel_init+0x10/0x10 > >> [ 494.171769] ret_from_fork_asm+0x1b/0x30 > >> [ 494.171875] </TASK> > >> > >> Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") > >> Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> > >> --- > >> v1 -> v2: > >> - Check vi->status in virtnet_open(). > >> v1: > >> - https://lore.kernel.org/netdev/20250630095109.214013-1-zuozhijie@bytedance.com/ > >> --- > >> drivers/net/virtio_net.c | 43 ++++++++++++++++++++++++---------------- > >> 1 file changed, 26 insertions(+), 17 deletions(-) > >> > >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > >> index e53ba600605a..859add98909b 100644 > >> --- a/drivers/net/virtio_net.c > >> +++ b/drivers/net/virtio_net.c > >> @@ -3151,6 +3151,10 @@ static int virtnet_open(struct net_device *dev) > >> if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) { > >> if (vi->status & VIRTIO_NET_S_LINK_UP) > >> netif_carrier_on(vi->dev); > >> + if (vi->status & VIRTIO_NET_S_ANNOUNCE) { > >> + vi->status &= ~VIRTIO_NET_S_ANNOUNCE; > >> + schedule_work(&vi->config_work); > >> + } > >> virtio_config_driver_enable(vi->vdev); > > > > Instead of doing tricks like this. > > > > I wonder if the fix is as simple as calling > > virtio_config_driver_disable() before init_vqs()? > > > > Thanks > > > > That might not work as the device like QEMU will set the VIRTIO_NET_S_ANNOUNCE > regardless of most of the driver status, QEMU only checks whether the driver has > finalized it's features with VIRTIO_NET_F_GUEST_ANNOUNCE & VIRTIO_NET_F_CTRL_VQ. > > We've made a little patch to verify, don't know if it matches your thought, but > it does not seem to work :( > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index e53ba600605a..f309ce3fe243 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -6903,6 +6903,9 @@ static int virtnet_probe(struct virtio_device *vdev) > vi->curr_queue_pairs = num_online_cpus(); > vi->max_queue_pairs = max_queue_pairs; > > + /* Disable config change notification until ndo_open. */ > + virtio_config_driver_disable(vi->vdev); > + > /* Allocate/initialize the rx/tx queues, and invoke find_vqs */ > err = init_vqs(vi); > if (err) > @@ -6965,9 +6968,6 @@ static int virtnet_probe(struct virtio_device *vdev) > goto free_failover; > } > > - /* Disable config change notification until ndo_open. */ > - virtio_config_driver_disable(vi->vdev); > - > virtio_device_ready(vdev); > > if (vi->has_rss || vi->has_rss_hash_report) { > > For reproduce details, > > 1. Spawn qemu with monitor, like `-monitor unix:qemu.sock,server` > 2. In another window, run `while true; echo "announce_self"; end | socat - unix-connect:qemu.sock > /dev/null` > 3. The boot up will get hanged when probing the virtio_net > > The simplest version we've made is to revert the usage of > `virtnet_config_changed_work()` back to the `schedule_work()`, but as in v1, > we're still trying to understand the impact, making sure that it won't break > other things. > > Regards, > Thanks for the clarification. Now I see the issue. It looks like the root cause is to call virtio_config_changed_work() directly during probe(). Let's switch to use virtio_config_changed() instead so that we can properly check the config_driver_disabled. Thanks ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Re: Re: [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing 2025-07-16 2:47 ` Jason Wang @ 2025-07-16 11:20 ` Zigit Zo 0 siblings, 0 replies; 6+ messages in thread From: Zigit Zo @ 2025-07-16 11:20 UTC (permalink / raw) To: Jason Wang Cc: mst, xuanzhuo, eperezma, virtualization, netdev, linux-kernel, zuozhijie On 7/16/25 10:47 AM, Jason Wang wrote: > Hi Zigit: > > On Tue, Jul 15, 2025 at 7:00 PM Zigit Zo <zuozhijie@bytedance.com> wrote: >> >> On 7/15/25 5:31 PM, Jason Wang wrote: >>> On Wed, Jul 2, 2025 at 6:37 PM Zigit Zo <zuozhijie@bytedance.com> wrote: >>>> >>>> This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while >>>> the virtio-net driver is still probing with rtnl_lock() hold, this will >>>> cause a recursive mutex in netdev_notify_peers(). >>>> >>>> Fix it by temporarily save the announce status while probing, and then in >>>> virtnet_open(), if it sees a delayed announce work is there, it starts to >>>> schedule the virtnet_config_changed_work(). >>>> >>>> Another possible solution is to directly check whether rtnl_is_locked() >>>> and call __netdev_notify_peers(), but in that way means we need to relies >>>> on netdev_queue to schedule the arp packets after ndo_open(), which we >>>> thought is not very intuitive. >>>> >>>> We've observed a softlockup with Ubuntu 24.04, and can be reproduced with >>>> QEMU sending the announce_self rapidly while booting. >>>> >>>> [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. >>>> [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu >>>> [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >>>> [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 >>>> [ 494.168260] Call Trace: >>>> [ 494.168329] <TASK> >>>> [ 494.168389] __schedule+0x27c/0x6b0 >>>> [ 494.168495] schedule+0x33/0x110 >>>> [ 494.168585] schedule_preempt_disabled+0x15/0x30 >>>> [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 >>>> [ 494.168835] __mutex_lock_slowpath+0x13/0x20 >>>> [ 494.168949] mutex_lock+0x3c/0x50 >>>> [ 494.169039] rtnl_lock+0x15/0x20 >>>> [ 494.169128] netdev_notify_peers+0x12/0x30 >>>> [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 >>>> [ 494.169377] virtnet_probe+0xa48/0xe00 >>>> [ 494.169484] ? vp_get+0x4d/0x100 >>>> [ 494.169574] virtio_dev_probe+0x1e9/0x310 >>>> [ 494.169682] really_probe+0x1c7/0x410 >>>> [ 494.169783] __driver_probe_device+0x8c/0x180 >>>> [ 494.169901] driver_probe_device+0x24/0xd0 >>>> [ 494.170011] __driver_attach+0x10b/0x210 >>>> [ 494.170117] ? __pfx___driver_attach+0x10/0x10 >>>> [ 494.170237] bus_for_each_dev+0x8d/0xf0 >>>> [ 494.170341] driver_attach+0x1e/0x30 >>>> [ 494.170440] bus_add_driver+0x14e/0x290 >>>> [ 494.170548] driver_register+0x5e/0x130 >>>> [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 >>>> [ 494.170788] register_virtio_driver+0x20/0x40 >>>> [ 494.170905] virtio_net_driver_init+0x97/0xb0 >>>> [ 494.171022] do_one_initcall+0x5e/0x340 >>>> [ 494.171128] do_initcalls+0x107/0x230 >>>> [ 494.171228] ? __pfx_kernel_init+0x10/0x10 >>>> [ 494.171340] kernel_init_freeable+0x134/0x210 >>>> [ 494.171462] kernel_init+0x1b/0x200 >>>> [ 494.171560] ret_from_fork+0x47/0x70 >>>> [ 494.171659] ? __pfx_kernel_init+0x10/0x10 >>>> [ 494.171769] ret_from_fork_asm+0x1b/0x30 >>>> [ 494.171875] </TASK> >>>> >>>> Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") >>>> Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> >>>> --- >>>> v1 -> v2: >>>> - Check vi->status in virtnet_open(). >>>> v1: >>>> - https://lore.kernel.org/netdev/20250630095109.214013-1-zuozhijie@bytedance.com/ >>>> --- >>>> drivers/net/virtio_net.c | 43 ++++++++++++++++++++++++---------------- >>>> 1 file changed, 26 insertions(+), 17 deletions(-) >>>> >>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>>> index e53ba600605a..859add98909b 100644 >>>> --- a/drivers/net/virtio_net.c >>>> +++ b/drivers/net/virtio_net.c >>>> @@ -3151,6 +3151,10 @@ static int virtnet_open(struct net_device *dev) >>>> if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) { >>>> if (vi->status & VIRTIO_NET_S_LINK_UP) >>>> netif_carrier_on(vi->dev); >>>> + if (vi->status & VIRTIO_NET_S_ANNOUNCE) { >>>> + vi->status &= ~VIRTIO_NET_S_ANNOUNCE; >>>> + schedule_work(&vi->config_work); >>>> + } >>>> virtio_config_driver_enable(vi->vdev); >>> >>> Instead of doing tricks like this. >>> >>> I wonder if the fix is as simple as calling >>> virtio_config_driver_disable() before init_vqs()? >>> >>> Thanks >>> >> >> That might not work as the device like QEMU will set the VIRTIO_NET_S_ANNOUNCE >> regardless of most of the driver status, QEMU only checks whether the driver has >> finalized it's features with VIRTIO_NET_F_GUEST_ANNOUNCE & VIRTIO_NET_F_CTRL_VQ. >> >> We've made a little patch to verify, don't know if it matches your thought, but >> it does not seem to work :( >> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >> index e53ba600605a..f309ce3fe243 100644 >> --- a/drivers/net/virtio_net.c >> +++ b/drivers/net/virtio_net.c >> @@ -6903,6 +6903,9 @@ static int virtnet_probe(struct virtio_device *vdev) >> vi->curr_queue_pairs = num_online_cpus(); >> vi->max_queue_pairs = max_queue_pairs; >> >> + /* Disable config change notification until ndo_open. */ >> + virtio_config_driver_disable(vi->vdev); >> + >> /* Allocate/initialize the rx/tx queues, and invoke find_vqs */ >> err = init_vqs(vi); >> if (err) >> @@ -6965,9 +6968,6 @@ static int virtnet_probe(struct virtio_device *vdev) >> goto free_failover; >> } >> >> - /* Disable config change notification until ndo_open. */ >> - virtio_config_driver_disable(vi->vdev); >> - >> virtio_device_ready(vdev); >> >> if (vi->has_rss || vi->has_rss_hash_report) { >> >> For reproduce details, >> >> 1. Spawn qemu with monitor, like `-monitor unix:qemu.sock,server` >> 2. In another window, run `while true; echo "announce_self"; end | socat - unix-connect:qemu.sock > /dev/null` >> 3. The boot up will get hanged when probing the virtio_net >> >> The simplest version we've made is to revert the usage of >> `virtnet_config_changed_work()` back to the `schedule_work()`, but as in v1, >> we're still trying to understand the impact, making sure that it won't break >> other things. >> >> Regards, >> > > Thanks for the clarification. Now I see the issue. > > It looks like the root cause is to call virtio_config_changed_work() > directly during probe(). > > Let's switch to use virtio_config_changed() instead so that we can > properly check the config_driver_disabled. > > Thanks > Yes, we just wonder why there's a change to `virtnet_config_changed_work()` in commit df28de7b0050, and therefore try to keep this behavior as much as possible, to avoid breaking something. Anyway, v3 will be sent off soon, thanks for the reviewing! Regards, ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-07-16 11:20 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-02 10:37 [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during probing Zigit Zo 2025-07-10 8:30 ` Paolo Abeni 2025-07-15 9:31 ` Jason Wang 2025-07-15 10:59 ` Zigit Zo 2025-07-16 2:47 ` Jason Wang 2025-07-16 11:20 ` Zigit Zo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).