From: "Michael S. Tsirkin" <mst@redhat.com>
To: Zigit Zo <zuozhijie@bytedance.com>
Cc: jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
eperezma@redhat.com, virtualization@lists.linux.dev,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [External] Re: [PATCH net] virtio-net: fix a rtnl_lock() deadlock during probing
Date: Tue, 1 Jul 2025 03:52:40 -0400 [thread overview]
Message-ID: <20250701035110-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <f1f68fbd-e2cf-40c5-a6b8-533cb3ec798f@bytedance.com>
On Tue, Jul 01, 2025 at 03:48:41PM +0800, Zigit Zo wrote:
> On 6/30/25 10:54 PM, Michael S. Tsirkin wrote:
> > On Mon, Jun 30, 2025 at 10:50:55AM -0400, Michael S. Tsirkin wrote:
> >> On Mon, Jun 30, 2025 at 05:51:09PM +0800, Zigit Zo wrote:
> >>> This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while
> >>> the virtio-net driver is still probing with rtnl_lock() hold, this will
> >>> cause a recursive mutex in netdev_notify_peers().
> >>>
> >>> Fix it by skip acking the annouce in virtnet_config_changed_work() when
> >>> probing. The annouce will still get done when ndo_open() enables the
> >>> virtio_config_driver_enable().
> >>
> >> I am not so sure it will be - while driver is not loaded, device does
> >> not have to send interrupts, and there's no rule I'm aware of that says
> >> we'll get one after DRIVER_OK.
>
> Yep, at first we're thinking that when the VIRTIO_NET_S_ANNOUNCE flag set,
> we can always assure an interrupt has fired by VMM, to notify the driver
> to do the announcement.
>
> But later we realized that the S_ANNOUNCE flag can be sent before the
> driver's probing, and for QEMU seems to set the status flag regardless of
> whether driver is ready, so the problem you're talking still may happens.
> >> How about, we instead just schedule the work to do it later?I'm not sure if scheduling the work later will break df28de7b0050, the work
> was being scheduled before that commit, and we have no much idea of why that
> commit removes the schedule_work, we just keep it for safe...
Well managing async things is always tricky. Direct call is safer.
If you reintroduce it, you need to audit all call paths for safely.
> Then, for plan A, we change the check_announce to schedule_announce, and if
> that's true, we do another schedule_work to call virtnet_config_changed_work
> again to finish the announcement, like
>
> if (v & VIRTIO_NET_S_ANNOUNCE) {
> if (unlikely(schedule_announce))
> schedule_work(&vi->config_work);
> else {
> netdev_notify_peers(vi->dev);
> virtnet_ack_link_announce(vi);
> }
> }
>
> >>
> >> Also, there is another bug here.
> >> If ndo_open did not run, we actually should not send any announcements.
> >>
> >> Do we care if carrier on is set on probe or on open?
> >> If not, let's just defer this to ndo_open?
> >
> > Hmm yes I think we do, device is visible to userspace is it not?
> >
> > Hmm. We can keep the announce bit set in vi->status and on open, check
> > it and then schedule a work to do the announcement.
>
> Okay, so there's a plan B, we save the bit and re-check it in ndo_open, like
>
> /* __virtnet_config_changed_work() */
> if (v & VIRTIO_NET_S_ANNOUNCE) {
> vi->status |= VIRTIO_NET_S_ANNOUNCE;
> if (unlikely(!check_announce))
> goto check_link;
>
> netdev_notify_peers(vi->dev);
> virtnet_ack_link_announce(vi);
> vi->status &= ~VIRTIO_NET_S_ANNOUNCE;
> }
>
> /* virtnet_open() */
> if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
> if (vi->status & VIRTIO_NET_S_LINK_UP)
> netif_carrier_on(vi->dev);
> if
> if (vi->status & VIRTIO_NET_S_ANNOUNCE)
> schedule_work(&vi->config_work);
> virtio_config_driver_enable(vi->vdev);
> }
>
> This is a dirty demo, any ideas are welcomed :)
>
> (I think in virtnet_open() we can make the S_LINK_UP being scheduled as well?)
prev parent reply other threads:[~2025-07-01 7:52 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-30 9:51 [PATCH net] virtio-net: fix a rtnl_lock() deadlock during probing Zigit Zo
2025-06-30 10:00 ` Markus Elfring
2025-06-30 14:50 ` Michael S. Tsirkin
2025-06-30 14:54 ` Michael S. Tsirkin
2025-07-01 7:48 ` [External] " Zigit Zo
2025-07-01 7:52 ` Michael S. Tsirkin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250701035110-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=eperezma@redhat.com \
--cc=jasowang@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
--cc=zuozhijie@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox