* [PATCH net-next v2 0/2] virtio_net: xsk: rx and tx wake up @ 2026-06-11 2:56 menglong8.dong 2026-06-11 2:56 ` [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx " menglong8.dong 2026-06-11 2:56 ` [PATCH net-next v2 2/2] virtio-net: xsk: support tx " menglong8.dong 0 siblings, 2 replies; 4+ messages in thread From: menglong8.dong @ 2026-06-11 2:56 UTC (permalink / raw) To: xuanzhuo, eperezma Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni, minhquangbui99, kerneljasonxing, netdev, virtualization, linux-kernel From: Menglong Dong <dongml2@chinatelecom.cn> In the first patch, we fix a race condition for the xsk rx wake up in virtio-net. In the second patch, we support xsk tx wake up for virtio-net. Changes since v1: - split the rx and tx into two patch - add the Fixes tag Menglong Dong (2): virtio_net: xsk: fix race in rx wake up virtio-net: xsk: support tx wake up drivers/net/virtio_net.c | 52 ++++++++++++++++++++++++++++++++++------ 1 file changed, 45 insertions(+), 7 deletions(-) -- 2.54.0 ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx wake up 2026-06-11 2:56 [PATCH net-next v2 0/2] virtio_net: xsk: rx and tx wake up menglong8.dong @ 2026-06-11 2:56 ` menglong8.dong 2026-06-11 16:24 ` Bui Quang Minh 2026-06-11 2:56 ` [PATCH net-next v2 2/2] virtio-net: xsk: support tx " menglong8.dong 1 sibling, 1 reply; 4+ messages in thread From: menglong8.dong @ 2026-06-11 2:56 UTC (permalink / raw) To: xuanzhuo, eperezma Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni, minhquangbui99, kerneljasonxing, netdev, virtualization, linux-kernel From: Menglong Dong <dongml2@chinatelecom.cn> During packet receiving in virtio-net, the rq can be empty, which means "rq->vq->num_free == virtqueue_get_vring_size(rq->vq)", in virtnet_add_recvbuf_xsk(), if we are using xsk. Meanwhile, the fill ring can be empty too, which means we can't allocate anything from xsk_buff_alloc_batch(). Then, we will set the XDP_RING_NEED_WAKEUP flag. However, if the user clean all the data in rx ring and fill the "fill ring" and check the XDP_RING_NEED_WAKEUP flag after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), then the rx napi will never be scheduled: the rx ring is empty, which means we will never receive a packet to trigger the further recv fill. The rx ring is empty now, so the user will not check the flag too. Fix this by set the XDP_RING_NEED_WAKEUP flag before xsk_buff_alloc_batch() if both rq->vq and fill ring are empty. Meanwhile, set the XDP_RING_NEED_WAKEUP flag if we have any free entry in rq->vq. Fixes: e3f8800aa243 ("virtio-net: xsk: Support wakeup on RX side") Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> --- drivers/net/virtio_net.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index f4adcfee7a80..4b5b3fa62008 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -1323,16 +1323,27 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue struct xsk_buff_pool *pool, gfp_t gfp) { struct xdp_buff **xsk_buffs; + bool need_wakeup; dma_addr_t addr; int err = 0; u32 len, i; int num; + need_wakeup = xsk_uses_need_wakeup(pool); xsk_buffs = rq->xsk_buffs; + /* If both rq->vq and fill ring are empty, and then the user submit + * all the chunks to the fill ring and check the wake up flag + * after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), + * we will lose the chance to wake up the rx napi, so we have to + * set the need_wakeup flag here. + */ + if (need_wakeup && virtqueue_get_vring_size(rq->vq) == rq->vq->num_free) + xsk_set_rx_need_wakeup(pool); + num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->vq->num_free); if (!num) { - if (xsk_uses_need_wakeup(pool)) { + if (need_wakeup) { xsk_set_rx_need_wakeup(pool); /* Return 0 instead of -ENOMEM so that NAPI is * descheduled. @@ -1341,8 +1352,6 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue } return -ENOMEM; - } else { - xsk_clear_rx_need_wakeup(pool); } len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len; @@ -1363,6 +1372,16 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue goto err; } + if (need_wakeup) { + if (rq->vq->num_free) + /* We have free buffers, so we'd better wake up the + * rx napi as soon as possible. + */ + xsk_set_rx_need_wakeup(pool); + else + xsk_clear_rx_need_wakeup(pool); + } + return num; err: -- 2.54.0 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx wake up 2026-06-11 2:56 ` [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx " menglong8.dong @ 2026-06-11 16:24 ` Bui Quang Minh 0 siblings, 0 replies; 4+ messages in thread From: Bui Quang Minh @ 2026-06-11 16:24 UTC (permalink / raw) To: menglong8.dong, xuanzhuo, eperezma Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni, kerneljasonxing, netdev, virtualization, linux-kernel On 6/11/26 09:56, menglong8.dong@gmail.com wrote: > From: Menglong Dong <dongml2@chinatelecom.cn> > > During packet receiving in virtio-net, the rq can be empty, which means > "rq->vq->num_free == virtqueue_get_vring_size(rq->vq)", in > virtnet_add_recvbuf_xsk(), if we are using xsk. Meanwhile, the fill ring > can be empty too, which means we can't allocate anything from > xsk_buff_alloc_batch(). Then, we will set the XDP_RING_NEED_WAKEUP flag. > > However, if the user clean all the data in rx ring and fill the > "fill ring" and check the XDP_RING_NEED_WAKEUP flag after > xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), then the rx > napi will never be scheduled: the rx ring is empty, which means we will > never receive a packet to trigger the further recv fill. The rx ring is > empty now, so the user will not check the flag too. > > Fix this by set the XDP_RING_NEED_WAKEUP flag before > xsk_buff_alloc_batch() if both rq->vq and fill ring are empty. > > Meanwhile, set the XDP_RING_NEED_WAKEUP flag if we have any free entry in > rq->vq. > > Fixes: e3f8800aa243 ("virtio-net: xsk: Support wakeup on RX side") > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> > --- > drivers/net/virtio_net.c | 25 ++++++++++++++++++++++--- > 1 file changed, 22 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index f4adcfee7a80..4b5b3fa62008 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -1323,16 +1323,27 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue > struct xsk_buff_pool *pool, gfp_t gfp) > { > struct xdp_buff **xsk_buffs; > + bool need_wakeup; > dma_addr_t addr; > int err = 0; > u32 len, i; > int num; > > + need_wakeup = xsk_uses_need_wakeup(pool); > xsk_buffs = rq->xsk_buffs; > > + /* If both rq->vq and fill ring are empty, and then the user submit > + * all the chunks to the fill ring and check the wake up flag > + * after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), > + * we will lose the chance to wake up the rx napi, so we have to > + * set the need_wakeup flag here. > + */ > + if (need_wakeup && virtqueue_get_vring_size(rq->vq) == rq->vq->num_free) > + xsk_set_rx_need_wakeup(pool); I think when polling the receive queue, the userspace program needs to check the XDP_RING_NEED_WAKEUP flag if it does not see any packets. The flag check is quite lightweight in my opinion. Here are some examples I find - https://github.com/xdp-project/xdp-tools/blob/e9469501622aa22a7e452a671000bec8685edcde/lib/util/xdpsock.c#L1206 - https://github.com/xdp-project/bpf-examples/blob/43e565901c4287efa863edca7f0e6cd6e35ed896/AF_XDP-forwarding/xsk_fwd.c#L540 Furthermore, the XDP_RING_NEED_WAKEUP flag related functions does not provide any memory orderings. So even with your patch, I'm worried that this case is possible kernel userspace xsk_buff_alloc_batch -> failed submit fill ring flag != XDP_RING_NEED_WAKEUP // reordering due to lack of memory orderings xsk_set_rx_need_wakeup I'm not expert here, so correct me if I'm wrong. I think the wake up flag is designed with no orderings so we cannot rely on it to reason and skip further checks. > + > num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->vq->num_free); > if (!num) { > - if (xsk_uses_need_wakeup(pool)) { > + if (need_wakeup) { > xsk_set_rx_need_wakeup(pool); > /* Return 0 instead of -ENOMEM so that NAPI is > * descheduled. > @@ -1341,8 +1352,6 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue > } > > return -ENOMEM; > - } else { > - xsk_clear_rx_need_wakeup(pool); > } > > len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len; > @@ -1363,6 +1372,16 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue > goto err; > } > > + if (need_wakeup) { > + if (rq->vq->num_free) > + /* We have free buffers, so we'd better wake up the > + * rx napi as soon as possible. > + */ > + xsk_set_rx_need_wakeup(pool); > + else > + xsk_clear_rx_need_wakeup(pool); > + } > + Why do we need to set XDP_RING_NEED_WAKEUP even when xsk_buff_alloc_batch succeeds? > return num; > > err: Thanks, Quang Minh. ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH net-next v2 2/2] virtio-net: xsk: support tx wake up 2026-06-11 2:56 [PATCH net-next v2 0/2] virtio_net: xsk: rx and tx wake up menglong8.dong 2026-06-11 2:56 ` [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx " menglong8.dong @ 2026-06-11 2:56 ` menglong8.dong 1 sibling, 0 replies; 4+ messages in thread From: menglong8.dong @ 2026-06-11 2:56 UTC (permalink / raw) To: xuanzhuo, eperezma Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni, minhquangbui99, kerneljasonxing, netdev, virtualization, linux-kernel From: Menglong Dong <dongml2@chinatelecom.cn> For now, XDP_RING_NEED_WAKEUP is not supported properly by the virtio-net in the tx path for example: we set xsk_set_tx_need_wakeup() in virtnet_xsk_xmit(), but we didn't call xsk_clear_tx_need_wakeup() anywhere, which means the user will call send() for every packet. We call xsk_set_tx_need_wakeup() after virtnet_xsk_xmit_batch() if sq->vq is empty, as we can't be wakeup by the skb_xmit_done() in this case. Otherwise, we will clear the wakeup flag. Race condition is considered for tx path. Fixes: 89f86675cb03 ("virtio_net: xsk: tx: support xmit xsk buffer") Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> --- drivers/net/virtio_net.c | 27 +++++++++++++++++++++++---- 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 4b5b3fa62008..86b5c1ca568c 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -1459,8 +1459,9 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool, struct virtnet_info *vi = sq->vq->vdev->priv; struct virtnet_sq_free_stats stats = {}; struct net_device *dev = vi->dev; + int sent, vring_size; + bool need_wakeup; u64 kicks = 0; - int sent; /* Avoid to wakeup napi meanless, so call __free_old_xmit instead of * free_old_xmit(). @@ -1470,8 +1471,29 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool, if (stats.xsk) xsk_tx_completed(sq->xsk_pool, stats.xsk); + vring_size = virtqueue_get_vring_size(sq->vq); + need_wakeup = xsk_uses_need_wakeup(pool); + /* If the sq->vq is empty, and the tx ring is empty, and the user + * submit an entry to the tx ring after virtnet_xsk_xmit_batch() and + * before xsk_set_tx_need_wakeup(), we will lose the chance to wake + * up the tx napi, so we have to set the need_wakeup flag here. + */ + if (need_wakeup && vring_size == sq->vq->num_free) + xsk_set_tx_need_wakeup(pool); + sent = virtnet_xsk_xmit_batch(sq, pool, budget, &kicks); + if (need_wakeup) { + if (vring_size == sq->vq->num_free) + /* we can't wake up by ourself, and it should be done + * by the user. + */ + xsk_set_tx_need_wakeup(pool); + else + /* we can wake up from skb_xmit_done() */ + xsk_clear_tx_need_wakeup(pool); + } + if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq)) check_sq_full_and_disable(vi, vi->dev, sq); @@ -1489,9 +1511,6 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool, u64_stats_add(&sq->stats.xdp_tx, sent); u64_stats_update_end(&sq->stats.syncp); - if (xsk_uses_need_wakeup(pool)) - xsk_set_tx_need_wakeup(pool); - return sent; } -- 2.54.0 ^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-06-11 16:24 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-11 2:56 [PATCH net-next v2 0/2] virtio_net: xsk: rx and tx wake up menglong8.dong 2026-06-11 2:56 ` [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx " menglong8.dong 2026-06-11 16:24 ` Bui Quang Minh 2026-06-11 2:56 ` [PATCH net-next v2 2/2] virtio-net: xsk: support tx " menglong8.dong
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox