From: "Michael S. Tsirkin" <mst@redhat.com>
To: Bui Quang Minh <minhquangbui99@gmail.com>
Cc: "Jason Wang" <jasowang@redhat.com>,
netdev@vger.kernel.org, "Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
"Andrew Lunn" <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
"Stanislav Fomichev" <sdf@fomichev.me>,
virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH net v2] virtio-net: enable all napis before scheduling refill work
Date: Sun, 21 Dec 2025 08:42:51 -0500 [thread overview]
Message-ID: <20251221084218-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <5434a67e-dd6e-4cd1-870b-fdd32ad34a28@gmail.com>
On Fri, Dec 19, 2025 at 12:03:29PM +0700, Bui Quang Minh wrote:
> On 12/17/25 09:58, Jason Wang wrote:
> > On Wed, Dec 17, 2025 at 12:23 AM Bui Quang Minh
> > <minhquangbui99@gmail.com> wrote:
> > > On 12/16/25 11:16, Jason Wang wrote:
> > > > On Fri, Dec 12, 2025 at 11:28 PM Bui Quang Minh
> > > > <minhquangbui99@gmail.com> wrote:
> > > > > Calling napi_disable() on an already disabled napi can cause the
> > > > > deadlock. In commit 4bc12818b363 ("virtio-net: disable delayed refill
> > > > > when pausing rx"), to avoid the deadlock, when pausing the RX in
> > > > > virtnet_rx_pause[_all](), we disable and cancel the delayed refill work.
> > > > > However, in the virtnet_rx_resume_all(), we enable the delayed refill
> > > > > work too early before enabling all the receive queue napis.
> > > > >
> > > > > The deadlock can be reproduced by running
> > > > > selftests/drivers/net/hw/xsk_reconfig.py with multiqueue virtio-net
> > > > > device and inserting a cond_resched() inside the for loop in
> > > > > virtnet_rx_resume_all() to increase the success rate. Because the worker
> > > > > processing the delayed refilled work runs on the same CPU as
> > > > > virtnet_rx_resume_all(), a reschedule is needed to cause the deadlock.
> > > > > In real scenario, the contention on netdev_lock can cause the
> > > > > reschedule.
> > > > >
> > > > > This fixes the deadlock by ensuring all receive queue's napis are
> > > > > enabled before we enable the delayed refill work in
> > > > > virtnet_rx_resume_all() and virtnet_open().
> > > > >
> > > > > Fixes: 4bc12818b363 ("virtio-net: disable delayed refill when pausing rx")
> > > > > Reported-by: Paolo Abeni <pabeni@redhat.com>
> > > > > Closes: https://netdev-ctrl.bots.linux.dev/logs/vmksft/drv-hw-dbg/results/400961/3-xdp-py/stderr
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
> > > > > ---
> > > > > Changes in v2:
> > > > > - Move try_fill_recv() before rx napi_enable()
> > > > > - Link to v1: https://lore.kernel.org/netdev/20251208153419.18196-1-minhquangbui99@gmail.com/
> > > > > ---
> > > > > drivers/net/virtio_net.c | 71 +++++++++++++++++++++++++---------------
> > > > > 1 file changed, 45 insertions(+), 26 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > index 8e04adb57f52..4e08880a9467 100644
> > > > > --- a/drivers/net/virtio_net.c
> > > > > +++ b/drivers/net/virtio_net.c
> > > > > @@ -3214,21 +3214,31 @@ static void virtnet_update_settings(struct virtnet_info *vi)
> > > > > static int virtnet_open(struct net_device *dev)
> > > > > {
> > > > > struct virtnet_info *vi = netdev_priv(dev);
> > > > > + bool schedule_refill = false;
> > > > > int i, err;
> > > > >
> > > > > - enable_delayed_refill(vi);
> > > > > -
> > > > > + /* - We must call try_fill_recv before enabling napi of the same receive
> > > > > + * queue so that it doesn't race with the call in virtnet_receive.
> > > > > + * - We must enable and schedule delayed refill work only when we have
> > > > > + * enabled all the receive queue's napi. Otherwise, in refill_work, we
> > > > > + * have a deadlock when calling napi_disable on an already disabled
> > > > > + * napi.
> > > > > + */
> > > > > for (i = 0; i < vi->max_queue_pairs; i++) {
> > > > > if (i < vi->curr_queue_pairs)
> > > > > /* Make sure we have some buffers: if oom use wq. */
> > > > > if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
> > > > > - schedule_delayed_work(&vi->refill, 0);
> > > > > + schedule_refill = true;
> > > > >
> > > > > err = virtnet_enable_queue_pair(vi, i);
> > > > > if (err < 0)
> > > > > goto err_enable_qp;
> > > > > }
> > > > So NAPI could be scheduled and it may want to refill but since refill
> > > > is not enabled, there would be no refill work.
> > > >
> > > > Is this a problem?
> > > You are right. It is indeed a problem.
> > >
> > > I think we can unconditionally schedule the delayed refill after
> > > enabling all the RX NAPIs (don't check the boolean schedule_refill
> > > anymore) to ensure that we will have refill work. We can still keep the
> > > try_fill_recv here to fill the receive buffer earlier in normal case.
> > > What do you think?
> > Or we can have a reill_pending
>
> Okay, let me implement this in the next version.
>
> > but basically I think we need something
> > that is much more simple. That is, using a per rq work instead of a
> > global one?
>
> I think we can leave this in a net-next patch later.
>
> Thanks,
> Quang Minh
i am not sure per rq is not simpler than this pile of tricks.
> >
> > Thanks
> >
> > > >
> > > > > + enable_delayed_refill(vi);
> > > > > + if (schedule_refill)
> > > > > + schedule_delayed_work(&vi->refill, 0);
> > > > > +
> > > > > if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
> > > > > if (vi->status & VIRTIO_NET_S_LINK_UP)
> > > > > netif_carrier_on(vi->dev);
> > > > > @@ -3463,39 +3473,48 @@ static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
> > > > > __virtnet_rx_pause(vi, rq);
> > > > > }
> > > > >
> > > > Thanks
> > > >
> > > Thanks,
> > > Quang Minh.
> > >
next prev parent reply other threads:[~2025-12-21 13:42 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-12 15:27 [PATCH net v2] virtio-net: enable all napis before scheduling refill work Bui Quang Minh
2025-12-16 4:16 ` Jason Wang
2025-12-16 16:23 ` Bui Quang Minh
2025-12-17 2:58 ` Jason Wang
2025-12-19 5:03 ` Bui Quang Minh
2025-12-21 13:42 ` Michael S. Tsirkin [this message]
2025-12-22 11:58 ` Paolo Abeni
2025-12-22 23:11 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251221084218-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=andrew+netdev@lunn.ch \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eperezma@redhat.com \
--cc=hawk@kernel.org \
--cc=jasowang@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=minhquangbui99@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=stable@vger.kernel.org \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.