From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: "Bui Quang Minh" <minhquangbui99@gmail.com>,
"Jason Wang" <jasowang@redhat.com>,
netdev@vger.kernel.org, "Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
"Andrew Lunn" <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
"Stanislav Fomichev" <sdf@fomichev.me>,
virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH net v2] virtio-net: enable all napis before scheduling refill work
Date: Mon, 22 Dec 2025 18:11:31 -0500 [thread overview]
Message-ID: <20251222180931-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <8e69a404-18bf-4c91-a6c7-59d5ae831591@redhat.com>
On Mon, Dec 22, 2025 at 12:58:01PM +0100, Paolo Abeni wrote:
> On 12/21/25 2:42 PM, Michael S. Tsirkin wrote:
> > On Fri, Dec 19, 2025 at 12:03:29PM +0700, Bui Quang Minh wrote:
> >> On 12/17/25 09:58, Jason Wang wrote:
> >>> On Wed, Dec 17, 2025 at 12:23 AM Bui Quang Minh
> >>> <minhquangbui99@gmail.com> wrote:
> >>>> I think we can unconditionally schedule the delayed refill after
> >>>> enabling all the RX NAPIs (don't check the boolean schedule_refill
> >>>> anymore) to ensure that we will have refill work. We can still keep the
> >>>> try_fill_recv here to fill the receive buffer earlier in normal case.
> >>>> What do you think?
> >>> Or we can have a reill_pending
> >>
> >> Okay, let me implement this in the next version.
> >>
> >>> but basically I think we need something
> >>> that is much more simple. That is, using a per rq work instead of a
> >>> global one?
> >>
> >> I think we can leave this in a net-next patch later.
> >>
> >> Thanks,
> >> Quang Minh
> >
> > i am not sure per rq is not simpler than this pile of tricks.
> FWIW, I agree with Michael: the diffstat of the current patch is quite
> scaring, I don't think a per RQ work would be significantly larger, but
> should be significantly simpler to review and maintain.
>
> I suggest doing directly the per RQ work implementation.
>
> Thanks!
>
> Paolo
I mean a stupidly mechanical move of all refill to per RQ is
rather trivial (CB). Compiled only, feel free to reuse.
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0369dda5ed60..4eb90b4e7b0f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -379,6 +379,15 @@ struct receive_queue {
struct xdp_rxq_info xsk_rxq_info;
struct xdp_buff **xsk_buffs;
+
+ /* Work struct for delayed refilling if we run low on memory. */
+ struct delayed_work refill;
+
+ /* Is delayed refill enabled? */
+ bool refill_enabled;
+
+ /* The lock to synchronize the access to refill_enabled */
+ spinlock_t refill_lock;
};
#define VIRTIO_NET_RSS_MAX_KEY_SIZE 40
@@ -441,9 +450,6 @@ struct virtnet_info {
/* Packet virtio header size */
u8 hdr_len;
- /* Work struct for delayed refilling if we run low on memory. */
- struct delayed_work refill;
-
/* UDP tunnel support */
bool tx_tnl;
@@ -451,12 +457,6 @@ struct virtnet_info {
bool rx_tnl_csum;
- /* Is delayed refill enabled? */
- bool refill_enabled;
-
- /* The lock to synchronize the access to refill_enabled */
- spinlock_t refill_lock;
-
/* Work struct for config space updates */
struct work_struct config_work;
@@ -720,18 +720,11 @@ static void virtnet_rq_free_buf(struct virtnet_info *vi,
put_page(virt_to_head_page(buf));
}
-static void enable_delayed_refill(struct virtnet_info *vi)
+static void disable_delayed_refill(struct receive_queue *rq)
{
- spin_lock_bh(&vi->refill_lock);
- vi->refill_enabled = true;
- spin_unlock_bh(&vi->refill_lock);
-}
-
-static void disable_delayed_refill(struct virtnet_info *vi)
-{
- spin_lock_bh(&vi->refill_lock);
- vi->refill_enabled = false;
- spin_unlock_bh(&vi->refill_lock);
+ spin_lock_bh(&rq->refill_lock);
+ rq->refill_enabled = false;
+ spin_unlock_bh(&rq->refill_lock);
}
static void enable_rx_mode_work(struct virtnet_info *vi)
@@ -2935,38 +2928,20 @@ static void virtnet_napi_disable(struct receive_queue *rq)
static void refill_work(struct work_struct *work)
{
- struct virtnet_info *vi =
- container_of(work, struct virtnet_info, refill.work);
+ struct receive_queue *rq = container_of(work, struct receive_queue,
+ refill.work);
+ struct virtnet_info *vi = rq->vq->vdev->priv;
bool still_empty;
- int i;
- for (i = 0; i < vi->curr_queue_pairs; i++) {
- struct receive_queue *rq = &vi->rq[i];
+ napi_disable(&rq->napi);
+ still_empty = !try_fill_recv(vi, rq, GFP_KERNEL);
+ virtnet_napi_do_enable(rq->vq, &rq->napi);
- /*
- * When queue API support is added in the future and the call
- * below becomes napi_disable_locked, this driver will need to
- * be refactored.
- *
- * One possible solution would be to:
- * - cancel refill_work with cancel_delayed_work (note:
- * non-sync)
- * - cancel refill_work with cancel_delayed_work_sync in
- * virtnet_remove after the netdev is unregistered
- * - wrap all of the work in a lock (perhaps the netdev
- * instance lock)
- * - check netif_running() and return early to avoid a race
- */
- napi_disable(&rq->napi);
- still_empty = !try_fill_recv(vi, rq, GFP_KERNEL);
- virtnet_napi_do_enable(rq->vq, &rq->napi);
-
- /* In theory, this can happen: if we don't get any buffers in
- * we will *never* try to fill again.
- */
- if (still_empty)
- schedule_delayed_work(&vi->refill, HZ/2);
- }
+ /* In theory, this can happen: if we don't get any buffers in
+ * we will *never* try to fill again.
+ */
+ if (still_empty)
+ schedule_delayed_work(&rq->refill, HZ / 2);
}
static int virtnet_receive_xsk_bufs(struct virtnet_info *vi,
@@ -3033,10 +3008,10 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) {
if (!try_fill_recv(vi, rq, GFP_ATOMIC)) {
- spin_lock(&vi->refill_lock);
- if (vi->refill_enabled)
- schedule_delayed_work(&vi->refill, 0);
- spin_unlock(&vi->refill_lock);
+ spin_lock(&rq->refill_lock);
+ if (rq->refill_enabled)
+ schedule_delayed_work(&rq->refill, 0);
+ spin_unlock(&rq->refill_lock);
}
}
@@ -3216,13 +3191,14 @@ static int virtnet_open(struct net_device *dev)
struct virtnet_info *vi = netdev_priv(dev);
int i, err;
- enable_delayed_refill(vi);
-
for (i = 0; i < vi->max_queue_pairs; i++) {
+ /* Enable refill work before enabling NAPI */
+ vi->rq[i].refill_enabled = true;
+
if (i < vi->curr_queue_pairs)
/* Make sure we have some buffers: if oom use wq. */
if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
- schedule_delayed_work(&vi->refill, 0);
+ schedule_delayed_work(&vi->rq[i].refill, 0);
err = virtnet_enable_queue_pair(vi, i);
if (err < 0)
@@ -3241,11 +3217,10 @@ static int virtnet_open(struct net_device *dev)
return 0;
err_enable_qp:
- disable_delayed_refill(vi);
- cancel_delayed_work_sync(&vi->refill);
-
for (i--; i >= 0; i--) {
+ disable_delayed_refill(&vi->rq[i]);
virtnet_disable_queue_pair(vi, i);
+ cancel_delayed_work_sync(&vi->rq[i].refill);
virtnet_cancel_dim(vi, &vi->rq[i].dim);
}
@@ -3437,29 +3412,19 @@ static void __virtnet_rx_pause(struct virtnet_info *vi,
}
}
+static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
+{
+ disable_delayed_refill(rq);
+ cancel_delayed_work_sync(&rq->refill);
+ __virtnet_rx_pause(vi, rq);
+}
+
static void virtnet_rx_pause_all(struct virtnet_info *vi)
{
int i;
- /*
- * Make sure refill_work does not run concurrently to
- * avoid napi_disable race which leads to deadlock.
- */
- disable_delayed_refill(vi);
- cancel_delayed_work_sync(&vi->refill);
for (i = 0; i < vi->max_queue_pairs; i++)
- __virtnet_rx_pause(vi, &vi->rq[i]);
-}
-
-static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
-{
- /*
- * Make sure refill_work does not run concurrently to
- * avoid napi_disable race which leads to deadlock.
- */
- disable_delayed_refill(vi);
- cancel_delayed_work_sync(&vi->refill);
- __virtnet_rx_pause(vi, rq);
+ virtnet_rx_pause(vi, &vi->rq[i]);
}
static void __virtnet_rx_resume(struct virtnet_info *vi,
@@ -3474,15 +3439,17 @@ static void __virtnet_rx_resume(struct virtnet_info *vi,
if (running)
virtnet_napi_enable(rq);
+ spin_lock_bh(&rq->refill_lock);
+ rq->refill_enabled = true;
if (schedule_refill)
- schedule_delayed_work(&vi->refill, 0);
+ schedule_delayed_work(&rq->refill, 0);
+ spin_unlock_bh(&rq->refill_lock);
}
static void virtnet_rx_resume_all(struct virtnet_info *vi)
{
int i;
- enable_delayed_refill(vi);
for (i = 0; i < vi->max_queue_pairs; i++) {
if (i < vi->curr_queue_pairs)
__virtnet_rx_resume(vi, &vi->rq[i], true);
@@ -3493,7 +3460,6 @@ static void virtnet_rx_resume_all(struct virtnet_info *vi)
static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq)
{
- enable_delayed_refill(vi);
__virtnet_rx_resume(vi, rq, true);
}
@@ -3768,6 +3734,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs)
struct virtio_net_rss_config_trailer old_rss_trailer;
struct net_device *dev = vi->dev;
struct scatterlist sg;
+ int i;
if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ))
return 0;
@@ -3821,10 +3788,14 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs)
succ:
vi->curr_queue_pairs = queue_pairs;
/* virtnet_open() will refill when device is going to up. */
- spin_lock_bh(&vi->refill_lock);
- if (dev->flags & IFF_UP && vi->refill_enabled)
- schedule_delayed_work(&vi->refill, 0);
- spin_unlock_bh(&vi->refill_lock);
+ if (dev->flags & IFF_UP) {
+ for (i = 0; i < vi->curr_queue_pairs; i++) {
+ spin_lock_bh(&vi->rq[i].refill_lock);
+ if (vi->rq[i].refill_enabled)
+ schedule_delayed_work(&vi->rq[i].refill, 0);
+ spin_unlock_bh(&vi->rq[i].refill_lock);
+ }
+ }
return 0;
}
@@ -3834,10 +3805,6 @@ static int virtnet_close(struct net_device *dev)
struct virtnet_info *vi = netdev_priv(dev);
int i;
- /* Make sure NAPI doesn't schedule refill work */
- disable_delayed_refill(vi);
- /* Make sure refill_work doesn't re-enable napi! */
- cancel_delayed_work_sync(&vi->refill);
/* Prevent the config change callback from changing carrier
* after close
*/
@@ -3848,7 +3815,9 @@ static int virtnet_close(struct net_device *dev)
cancel_work_sync(&vi->config_work);
for (i = 0; i < vi->max_queue_pairs; i++) {
+ disable_delayed_refill(&vi->rq[i]);
virtnet_disable_queue_pair(vi, i);
+ cancel_delayed_work_sync(&vi->rq[i].refill);
virtnet_cancel_dim(vi, &vi->rq[i].dim);
}
@@ -5793,7 +5762,6 @@ static int virtnet_restore_up(struct virtio_device *vdev)
virtio_device_ready(vdev);
- enable_delayed_refill(vi);
enable_rx_mode_work(vi);
if (netif_running(vi->dev)) {
@@ -6550,7 +6518,6 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
if (!vi->rq)
goto err_rq;
- INIT_DELAYED_WORK(&vi->refill, refill_work);
for (i = 0; i < vi->max_queue_pairs; i++) {
vi->rq[i].pages = NULL;
netif_napi_add_config(vi->dev, &vi->rq[i].napi, virtnet_poll,
@@ -6567,6 +6534,8 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
u64_stats_init(&vi->rq[i].stats.syncp);
u64_stats_init(&vi->sq[i].stats.syncp);
mutex_init(&vi->rq[i].dim_lock);
+ INIT_DELAYED_WORK(&vi->rq[i].refill, refill_work);
+ spin_lock_init(&vi->rq[i].refill_lock);
}
return 0;
@@ -6892,7 +6861,6 @@ static int virtnet_probe(struct virtio_device *vdev)
INIT_WORK(&vi->config_work, virtnet_config_changed_work);
INIT_WORK(&vi->rx_mode_work, virtnet_rx_mode_work);
- spin_lock_init(&vi->refill_lock);
if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) {
vi->mergeable_rx_bufs = true;
@@ -7156,7 +7124,8 @@ static int virtnet_probe(struct virtio_device *vdev)
net_failover_destroy(vi->failover);
free_vqs:
virtio_reset_device(vdev);
- cancel_delayed_work_sync(&vi->refill);
+ for (i = 0; i < vi->max_queue_pairs; i++)
+ cancel_delayed_work_sync(&vi->rq[i].refill);
free_receive_page_frags(vi);
virtnet_del_vqs(vi);
free:
prev parent reply other threads:[~2025-12-22 23:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-12 15:27 [PATCH net v2] virtio-net: enable all napis before scheduling refill work Bui Quang Minh
2025-12-16 4:16 ` Jason Wang
2025-12-16 16:23 ` Bui Quang Minh
2025-12-17 2:58 ` Jason Wang
2025-12-19 5:03 ` Bui Quang Minh
2025-12-21 13:42 ` Michael S. Tsirkin
2025-12-22 11:58 ` Paolo Abeni
2025-12-22 23:11 ` Michael S. Tsirkin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251222180931-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=andrew+netdev@lunn.ch \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eperezma@redhat.com \
--cc=hawk@kernel.org \
--cc=jasowang@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=minhquangbui99@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=stable@vger.kernel.org \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.