public inbox for virtualization@lists.linux-foundation.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Zigit Zo <zuozhijie@bytedance.com>
Cc: jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
	eperezma@redhat.com, virtualization@lists.linux.dev,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net] virtio-net: fix a rtnl_lock() deadlock during probing
Date: Mon, 30 Jun 2025 10:50:52 -0400	[thread overview]
Message-ID: <20250630103240-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20250630095109.214013-1-zuozhijie@bytedance.com>

On Mon, Jun 30, 2025 at 05:51:09PM +0800, Zigit Zo wrote:
> This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while
> the virtio-net driver is still probing with rtnl_lock() hold, this will
> cause a recursive mutex in netdev_notify_peers().
> 
> Fix it by skip acking the annouce in virtnet_config_changed_work() when
> probing. The annouce will still get done when ndo_open() enables the
> virtio_config_driver_enable().

I am not so sure it will be - while driver is not loaded, device does
not have to send interrupts, and there's no rule I'm aware of that says
we'll get one after DRIVER_OK.

How about, we instead just schedule the work to do it later?


Also, there is another bug here.
If ndo_open did not run, we actually should not send any announcements.

Do we care if carrier on is set on probe or on open?
If not, let's just defer this to ndo_open?


> We've observed a softlockup with Ubuntu 24.04, and can be reproduced with
> QEMU sending the announce_self rapidly while booting.
> 
> [  494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
> [  494.167667]       Not tainted 6.8.0-57-generic #59-Ubuntu
> [  494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  494.168015] task:swapper/0       state:D stack:0     pid:1     tgid:1     ppid:0      flags:0x00004000
> [  494.168260] Call Trace:
> [  494.168329]  <TASK>
> [  494.168389]  __schedule+0x27c/0x6b0
> [  494.168495]  schedule+0x33/0x110
> [  494.168585]  schedule_preempt_disabled+0x15/0x30
> [  494.168709]  __mutex_lock.constprop.0+0x42f/0x740
> [  494.168835]  __mutex_lock_slowpath+0x13/0x20
> [  494.168949]  mutex_lock+0x3c/0x50
> [  494.169039]  rtnl_lock+0x15/0x20
> [  494.169128]  netdev_notify_peers+0x12/0x30
> [  494.169240]  virtnet_config_changed_work+0x152/0x1a0
> [  494.169377]  virtnet_probe+0xa48/0xe00
> [  494.169484]  ? vp_get+0x4d/0x100
> [  494.169574]  virtio_dev_probe+0x1e9/0x310
> [  494.169682]  really_probe+0x1c7/0x410
> [  494.169783]  __driver_probe_device+0x8c/0x180
> [  494.169901]  driver_probe_device+0x24/0xd0
> [  494.170011]  __driver_attach+0x10b/0x210
> [  494.170117]  ? __pfx___driver_attach+0x10/0x10
> [  494.170237]  bus_for_each_dev+0x8d/0xf0
> [  494.170341]  driver_attach+0x1e/0x30
> [  494.170440]  bus_add_driver+0x14e/0x290
> [  494.170548]  driver_register+0x5e/0x130
> [  494.170651]  ? __pfx_virtio_net_driver_init+0x10/0x10
> [  494.170788]  register_virtio_driver+0x20/0x40
> [  494.170905]  virtio_net_driver_init+0x97/0xb0
> [  494.171022]  do_one_initcall+0x5e/0x340
> [  494.171128]  do_initcalls+0x107/0x230
> [  494.171228]  ? __pfx_kernel_init+0x10/0x10
> [  494.171340]  kernel_init_freeable+0x134/0x210
> [  494.171462]  kernel_init+0x1b/0x200
> [  494.171560]  ret_from_fork+0x47/0x70
> [  494.171659]  ? __pfx_kernel_init+0x10/0x10
> [  494.171769]  ret_from_fork_asm+0x1b/0x30
> [  494.171875]  </TASK>
> 
> Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down")
> Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
> ---
>  drivers/net/virtio_net.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index e53ba600605a..0290d289ebee 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -6211,7 +6211,8 @@ static const struct net_device_ops virtnet_netdev = {
>  	.ndo_tx_timeout		= virtnet_tx_timeout,
>  };
>  
> -static void virtnet_config_changed_work(struct work_struct *work)
> +static void __virtnet_config_changed_work(struct work_struct *work,
> +					  bool check_announce)
>  {
>  	struct virtnet_info *vi =
>  		container_of(work, struct virtnet_info, config_work);

So this will be schedule_announce instead of check_announce?




> @@ -6221,7 +6222,7 @@ static void virtnet_config_changed_work(struct work_struct *work)
>  				 struct virtio_net_config, status, &v) < 0)
>  		return;
>  
> -	if (v & VIRTIO_NET_S_ANNOUNCE) {
> +	if (check_announce && (v & VIRTIO_NET_S_ANNOUNCE)) {
>  		netdev_notify_peers(vi->dev);
>  		virtnet_ack_link_announce(vi);
>  	}
> @@ -6244,6 +6245,11 @@ static void virtnet_config_changed_work(struct work_struct *work)
>  	}
>  }
>  
> +static void virtnet_config_changed_work(struct work_struct *work)
> +{
> +	__virtnet_config_changed_work(work, true);
> +}
> +
>  static void virtnet_config_changed(struct virtio_device *vdev)
>  {
>  	struct virtnet_info *vi = vdev->priv;
> @@ -7030,7 +7036,10 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	   otherwise get link status from config. */
>  	netif_carrier_off(dev);
>  	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
> -		virtnet_config_changed_work(&vi->config_work);
> +		/* The check_annouce work will get scheduled when ndo_open()
> +		 * doing the virtio_config_driver_enable().
> +		 */
> +		__virtnet_config_changed_work(&vi->config_work, false);
>  	} else {
>  		vi->status = VIRTIO_NET_S_LINK_UP;
>  		virtnet_update_settings(vi);
> 
> base-commit: 2def09ead4ad5907988b655d1e1454003aaf8297
> -- 
> 2.49.0


  parent reply	other threads:[~2025-06-30 14:50 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-30  9:51 [PATCH net] virtio-net: fix a rtnl_lock() deadlock during probing Zigit Zo
2025-06-30 10:00 ` Markus Elfring
2025-06-30 14:50 ` Michael S. Tsirkin [this message]
2025-06-30 14:54   ` Michael S. Tsirkin
2025-07-01  7:48     ` [External] " Zigit Zo
2025-07-01  7:52       ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250630103240-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=eperezma@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=virtualization@lists.linux.dev \
    --cc=xuanzhuo@linux.alibaba.com \
    --cc=zuozhijie@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox