Netdev List
* [PATCH net-next v3] virtio-net: xsk: support tx wake up
@ 2026-06-16 11:59 Menglong Dong
  2026-06-21 22:06 ` Jakub Kicinski
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Menglong Dong @ 2026-06-16 11:59 UTC (permalink / raw)
  To: xuanzhuo, eperezma
  Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni,
	netdev, virtualization, linux-kernel

Currently, virtio-net does not properly support XDP_RING_NEED_WAKEUP in
the tx path: we call xsk_set_tx_need_wakeup() in virtnet_xsk_xmit(), but
never call xsk_clear_tx_need_wakeup() anywhere, which means the user has
to call send() for every packet.

Call xsk_set_tx_need_wakeup() after virtnet_xsk_xmit_batch() if sq->vq
is empty, as we can't be woken up by skb_xmit_done() in this case.
Otherwise, clear the wakeup flag.

To close a race with user space in the tx path, also set the flag before
virtnet_xsk_xmit_batch() when sq->vq is empty.

Fixes: 89f86675cb03 ("virtio_net: xsk: tx: support xmit xsk buffer")
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
v3:
- remove the confusing comment

v2:
- add the Fixes tag
---
 drivers/net/virtio_net.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f4adcfee7a80..6e099edef6e9 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1440,8 +1440,9 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
 	struct virtnet_info *vi = sq->vq->vdev->priv;
 	struct virtnet_sq_free_stats stats = {};
 	struct net_device *dev = vi->dev;
+	int sent, vring_size;
+	bool need_wakeup;
 	u64 kicks = 0;
-	int sent;
 
 	/* Avoid to wakeup napi meanless, so call __free_old_xmit instead of
 	 * free_old_xmit().
@@ -1451,8 +1452,25 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
 	if (stats.xsk)
 		xsk_tx_completed(sq->xsk_pool, stats.xsk);
 
+	vring_size = virtqueue_get_vring_size(sq->vq);
+	need_wakeup = xsk_uses_need_wakeup(pool);
+
+	if (need_wakeup && vring_size == sq->vq->num_free)
+		xsk_set_tx_need_wakeup(pool);
+
 	sent = virtnet_xsk_xmit_batch(sq, pool, budget, &kicks);
 
+	if (need_wakeup) {
+		if (vring_size == sq->vq->num_free)
+			/* we can't wake up by ourself, and it should be done
+			 * by the user.
+			 */
+			xsk_set_tx_need_wakeup(pool);
+		else
+			/* we can wake up from skb_xmit_done() */
+			xsk_clear_tx_need_wakeup(pool);
+	}
+
 	if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq))
 		check_sq_full_and_disable(vi, vi->dev, sq);
 
@@ -1470,9 +1488,6 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
 	u64_stats_add(&sq->stats.xdp_tx,  sent);
 	u64_stats_update_end(&sq->stats.syncp);
 
-	if (xsk_uses_need_wakeup(pool))
-		xsk_set_tx_need_wakeup(pool);
-
 	return sent;
 }
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next v3] virtio-net: xsk: support tx wake up
  2026-06-16 11:59 [PATCH net-next v3] virtio-net: xsk: support tx wake up Menglong Dong
@ 2026-06-21 22:06 ` Jakub Kicinski
  2026-06-22 12:38   ` Menglong Dong
  2026-06-21 22:31 ` Michael S. Tsirkin
  2026-06-22  2:40 ` Xuan Zhuo
  2 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2026-06-21 22:06 UTC (permalink / raw)
  To: xuanzhuo
  Cc: Menglong Dong, eperezma, mst, jasowang, andrew+netdev, davem,
	edumazet, pabeni, netdev, virtualization, linux-kernel

On Tue, 16 Jun 2026 19:59:12 +0800 Menglong Dong wrote:
> For now, XDP_RING_NEED_WAKEUP is not supported properly by the virtio-net
> in the tx path for example: we set xsk_set_tx_need_wakeup() in
> virtnet_xsk_xmit(), but we didn't call xsk_clear_tx_need_wakeup()
> anywhere, which means the user will call send() for every packet.
> 
> We call xsk_set_tx_need_wakeup() after virtnet_xsk_xmit_batch() if sq->vq
> is empty, as we can't be wakeup by the skb_xmit_done() in this case.
> Otherwise, we will clear the wakeup flag.
> 
> Race condition is considered for tx path.

Seems to follow what mlx5 does so presumably this is fine but IDK if
there's anything virtio-specific that we need to be worried about.

Xuan Zhuo, please TAL?
-- 
mping: VIRTIO NET DRIVER


* Re: [PATCH net-next v3] virtio-net: xsk: support tx wake up
  2026-06-16 11:59 [PATCH net-next v3] virtio-net: xsk: support tx wake up Menglong Dong
  2026-06-21 22:06 ` Jakub Kicinski
@ 2026-06-21 22:31 ` Michael S. Tsirkin
  2026-06-22 12:27   ` Menglong Dong
  2026-06-22  2:40 ` Xuan Zhuo
  2 siblings, 1 reply; 8+ messages in thread
From: Michael S. Tsirkin @ 2026-06-21 22:31 UTC (permalink / raw)
  To: Menglong Dong
  Cc: xuanzhuo, eperezma, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, netdev, virtualization, linux-kernel

On Tue, Jun 16, 2026 at 07:59:12PM +0800, Menglong Dong wrote:
> For now, XDP_RING_NEED_WAKEUP is not supported properly by the virtio-net
> in the tx path for example: we set xsk_set_tx_need_wakeup() in
> virtnet_xsk_xmit(), but we didn't call xsk_clear_tx_need_wakeup()
> anywhere, which means the user will call send() for every packet.
> 
> We call xsk_set_tx_need_wakeup() after virtnet_xsk_xmit_batch() if sq->vq
> is empty, as we can't be wakeup by the skb_xmit_done() in this case.
> Otherwise, we will clear the wakeup flag.
> 
> Race condition is considered for tx path.
> 
> Fixes: 89f86675cb03 ("virtio_net: xsk: tx: support xmit xsk buffer")
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>

Thanks for the patch! Yes, this is something to improve.

> ---
> v3:
> - remove the confusing comment
> 
> v2:
> - add the Fixes tag
> ---
>  drivers/net/virtio_net.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f4adcfee7a80..6e099edef6e9 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1440,8 +1440,9 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
>  	struct virtnet_info *vi = sq->vq->vdev->priv;
>  	struct virtnet_sq_free_stats stats = {};
>  	struct net_device *dev = vi->dev;
> +	int sent, vring_size;
> +	bool need_wakeup;
>  	u64 kicks = 0;
> -	int sent;
>  
>  	/* Avoid to wakeup napi meanless, so call __free_old_xmit instead of
>  	 * free_old_xmit().
> @@ -1451,8 +1452,25 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
>  	if (stats.xsk)
>  		xsk_tx_completed(sq->xsk_pool, stats.xsk);
>  
> +	vring_size = virtqueue_get_vring_size(sq->vq);
> +	need_wakeup = xsk_uses_need_wakeup(pool);
> +
> +	if (need_wakeup && vring_size == sq->vq->num_free)
> +		xsk_set_tx_need_wakeup(pool);
> +

Why are we doing this here?
Is the check after virtnet_xsk_xmit_batch() not enough?
I vaguely think it's some kind of race we are closing?
Please add a comment to explain.

>  	sent = virtnet_xsk_xmit_batch(sq, pool, budget, &kicks);
>  
> +	if (need_wakeup) {
> +		if (vring_size == sq->vq->num_free)
> +			/* we can't wake up by ourself, and it should be done
> +			 * by the user.
> +			 */
> +			xsk_set_tx_need_wakeup(pool);
> +		else
> +			/* we can wake up from skb_xmit_done() */
> +			xsk_clear_tx_need_wakeup(pool);

But what if we don't have get tx napi so no wakeup in skb_xmit_done?


> +	}
> +
>  	if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq))
>  		check_sq_full_and_disable(vi, vi->dev, sq);
>  
> @@ -1470,9 +1488,6 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
>  	u64_stats_add(&sq->stats.xdp_tx,  sent);
>  	u64_stats_update_end(&sq->stats.syncp);
>  
> -	if (xsk_uses_need_wakeup(pool))
> -		xsk_set_tx_need_wakeup(pool);
> -
>  	return sent;
>  }
>  
> -- 
> 2.54.0



* Re: [PATCH net-next v3] virtio-net: xsk: support tx wake up
  2026-06-16 11:59 [PATCH net-next v3] virtio-net: xsk: support tx wake up Menglong Dong
  2026-06-21 22:06 ` Jakub Kicinski
  2026-06-21 22:31 ` Michael S. Tsirkin
@ 2026-06-22  2:40 ` Xuan Zhuo
  2026-06-22 12:28   ` Menglong Dong
  2 siblings, 1 reply; 8+ messages in thread
From: Xuan Zhuo @ 2026-06-22  2:40 UTC (permalink / raw)
  To: Menglong Dong
  Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni,
	netdev, virtualization, linux-kernel, eperezma

On Tue, 16 Jun 2026 19:59:12 +0800, Menglong Dong <menglong8.dong@gmail.com> wrote:
> For now, XDP_RING_NEED_WAKEUP is not supported properly by the virtio-net
> in the tx path for example: we set xsk_set_tx_need_wakeup() in
> virtnet_xsk_xmit(), but we didn't call xsk_clear_tx_need_wakeup()
> anywhere, which means the user will call send() for every packet.
>
> We call xsk_set_tx_need_wakeup() after virtnet_xsk_xmit_batch() if sq->vq
> is empty, as we can't be wakeup by the skb_xmit_done() in this case.
> Otherwise, we will clear the wakeup flag.
>
> Race condition is considered for tx path.
>
> Fixes: 89f86675cb03 ("virtio_net: xsk: tx: support xmit xsk buffer")

This is not a bug, so we do not need this tag.
Besides, you posted this to net-next.


> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
> v3:
> - remove the confusing comment
>
> v2:
> - add the Fixes tag
> ---
>  drivers/net/virtio_net.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f4adcfee7a80..6e099edef6e9 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1440,8 +1440,9 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
>  	struct virtnet_info *vi = sq->vq->vdev->priv;
>  	struct virtnet_sq_free_stats stats = {};
>  	struct net_device *dev = vi->dev;
> +	int sent, vring_size;
> +	bool need_wakeup;
>  	u64 kicks = 0;
> -	int sent;
>
>  	/* Avoid to wakeup napi meanless, so call __free_old_xmit instead of
>  	 * free_old_xmit().
> @@ -1451,8 +1452,25 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
>  	if (stats.xsk)
>  		xsk_tx_completed(sq->xsk_pool, stats.xsk);
>
> +	vring_size = virtqueue_get_vring_size(sq->vq);
> +	need_wakeup = xsk_uses_need_wakeup(pool);
> +
> +	if (need_wakeup && vring_size == sq->vq->num_free)
> +		xsk_set_tx_need_wakeup(pool);

You need to comment this.


> +
>  	sent = virtnet_xsk_xmit_batch(sq, pool, budget, &kicks);
>
> +	if (need_wakeup) {
> +		if (vring_size == sq->vq->num_free)
> +			/* we can't wake up by ourself, and it should be done
> +			 * by the user.
> +			 */
> +			xsk_set_tx_need_wakeup(pool);
> +		else
> +			/* we can wake up from skb_xmit_done() */
> +			xsk_clear_tx_need_wakeup(pool);
> +	}
> +
>  	if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq))
>  		check_sq_full_and_disable(vi, vi->dev, sq);


After fixing the above comments, you can add:

Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

Thanks.


>
> @@ -1470,9 +1488,6 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
>  	u64_stats_add(&sq->stats.xdp_tx,  sent);
>  	u64_stats_update_end(&sq->stats.syncp);
>
> -	if (xsk_uses_need_wakeup(pool))
> -		xsk_set_tx_need_wakeup(pool);
> -
>  	return sent;
>  }
>
> --
> 2.54.0
>


* Re: [PATCH net-next v3] virtio-net: xsk: support tx wake up
  2026-06-21 22:31 ` Michael S. Tsirkin
@ 2026-06-22 12:27   ` Menglong Dong
  2026-06-22 13:24     ` Michael S. Tsirkin
  0 siblings, 1 reply; 8+ messages in thread
From: Menglong Dong @ 2026-06-22 12:27 UTC (permalink / raw)
  To: Menglong Dong, Michael S. Tsirkin
  Cc: xuanzhuo, eperezma, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, netdev, virtualization, linux-kernel

On 2026/6/22 06:31 Michael S. Tsirkin <mst@redhat.com> wrote:
> On Tue, Jun 16, 2026 at 07:59:12PM +0800, Menglong Dong wrote:
[...]
> >  
> > +	vring_size = virtqueue_get_vring_size(sq->vq);
> > +	need_wakeup = xsk_uses_need_wakeup(pool);
> > +
> > +	if (need_wakeup && vring_size == sq->vq->num_free)
> > +		xsk_set_tx_need_wakeup(pool);
> > +
> 
> why are we doing this here?
> the check after virtnet_xsk_xmit_batch not enough?
> I vaguely think it's some kind of race we are closing?
> Pls add a comment to explain.

Hi, Michael. Thanks for your review.

Yeah, it's for a race condition between user space and kernel
space. I added a comment in V2, but it was too confusing, so
I removed it 😢. I'll make it clearer and add it back in V4. The
original comment is:

 * If the sq->vq is empty, and the tx ring is empty, and the user
 * submit an entry to the tx ring after virtnet_xsk_xmit_batch() and
 * before xsk_set_tx_need_wakeup(), we will lose the chance to wake
 * up the tx napi, so we have to set the need_wakeup flag here.

And the logic is like this:

Kernel: tx NAPI is woken up from skb_xmit_done() ->
Kernel: sq->vq and xsk->tx_ring are both empty ->
Kernel: call virtnet_xsk_xmit_batch()

    User: submit an entry to the xsk->tx_ring
    User: check the wakeup flag
    User: wakeup flag is not set, skip send()

Kernel: call xsk_set_tx_need_wakeup(), because sq->vq is empty

If we don't send more data, the data in the xsk->tx_ring will
never be sent.
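
The interleaving above can be replayed in a small single-threaded toy
model. This is only a sketch under stated assumptions: struct toy_queue
and every function name below are hypothetical and are not part of
virtio_net.c, and real concurrency and memory ordering are ignored. It
only shows why setting the flag before the batch keeps the descriptor
from being stranded.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not driver code: all names here are hypothetical. */
struct toy_queue {
	int tx_ring_pending;	/* descriptors queued by user space */
	int vq_in_flight;	/* buffers the device still owns in sq->vq */
	bool need_wakeup;	/* stand-in for XDP_RING_NEED_WAKEUP */
};

/* Kernel, step 1: optionally set the flag before draining the tx ring. */
void toy_kernel_pre_check(struct toy_queue *q, bool early_set)
{
	assert(q);
	if (early_set && q->vq_in_flight == 0)
		q->need_wakeup = true;
}

/* Kernel, step 2: move everything from the tx ring into the virtqueue. */
void toy_kernel_xmit_batch(struct toy_queue *q)
{
	assert(q);
	q->vq_in_flight += q->tx_ring_pending;
	q->tx_ring_pending = 0;
}

/* Kernel, step 3: final decision after the batch. */
void toy_kernel_post_check(struct toy_queue *q)
{
	assert(q);
	q->need_wakeup = (q->vq_in_flight == 0);
}

/* User: queue one descriptor; returns true if it would call send(). */
bool toy_user_submit(struct toy_queue *q)
{
	assert(q);
	q->tx_ring_pending++;
	return q->need_wakeup;
}

/*
 * Replay the interleaving: the user submits between the batch and the
 * post-check. Returns true if the descriptor is stranded, i.e. the user
 * skipped send() and nothing is in flight to trigger a completion.
 */
bool toy_descriptor_stranded(bool early_set)
{
	struct toy_queue q = { 0, 0, false };
	bool kicked;

	toy_kernel_pre_check(&q, early_set);
	toy_kernel_xmit_batch(&q);	/* finds nothing to send */
	kicked = toy_user_submit(&q);	/* races in before the post-check */
	toy_kernel_post_check(&q);

	return !kicked && q.vq_in_flight == 0 && q.tx_ring_pending > 0;
}
```

In this model, toy_descriptor_stranded(false) reports a stranded
descriptor, while toy_descriptor_stranded(true) does not, because the
user sees the flag and would call send().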

> 
> >  	sent = virtnet_xsk_xmit_batch(sq, pool, budget, &kicks);
> >  
> > +	if (need_wakeup) {
> > +		if (vring_size == sq->vq->num_free)
> > +			/* we can't wake up by ourself, and it should be done
> > +			 * by the user.
> > +			 */
> > +			xsk_set_tx_need_wakeup(pool);
> > +		else
> > +			/* we can wake up from skb_xmit_done() */
> > +			xsk_clear_tx_need_wakeup(pool);
> 
> But what if we don't have get tx napi so no wakeup in skb_xmit_done?

Sorry, I'm not sure what "get tx napi" means here ;(

There are entries in sq->vq, so skb_xmit_done() will be called after
the entries in the ring are consumed by the host, right?
Then the corresponding sq->napi will be scheduled, since this commit
ensures that tx napi is always enabled (i.e. napi->weight is not zero):
1df5116a41a8 ("virtio_net: xsk: prevent disable tx napi")

Right?

Thanks!
Menglong Dong

> 
> 
> > +	}
> > +
> >  	if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq))
> >  		check_sq_full_and_disable(vi, vi->dev, sq);
> >  
> > @@ -1470,9 +1488,6 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
> >  	u64_stats_add(&sq->stats.xdp_tx,  sent);
> >  	u64_stats_update_end(&sq->stats.syncp);
> >  
> > -	if (xsk_uses_need_wakeup(pool))
> > -		xsk_set_tx_need_wakeup(pool);
> > -
> >  	return sent;
> >  }
> >  
> > -- 
> > 2.54.0
> 
> 
> 






* Re: [PATCH net-next v3] virtio-net: xsk: support tx wake up
  2026-06-22  2:40 ` Xuan Zhuo
@ 2026-06-22 12:28   ` Menglong Dong
  0 siblings, 0 replies; 8+ messages in thread
From: Menglong Dong @ 2026-06-22 12:28 UTC (permalink / raw)
  To: Menglong Dong, Xuan Zhuo
  Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni,
	netdev, virtualization, linux-kernel, eperezma

On 2026/6/22 10:40 Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> On Tue, 16 Jun 2026 19:59:12 +0800, Menglong Dong <menglong8.dong@gmail.com> wrote:
> > For now, XDP_RING_NEED_WAKEUP is not supported properly by the virtio-net
> > in the tx path for example: we set xsk_set_tx_need_wakeup() in
> > virtnet_xsk_xmit(), but we didn't call xsk_clear_tx_need_wakeup()
> > anywhere, which means the user will call send() for every packet.
> >
> > We call xsk_set_tx_need_wakeup() after virtnet_xsk_xmit_batch() if sq->vq
> > is empty, as we can't be wakeup by the skb_xmit_done() in this case.
> > Otherwise, we will clear the wakeup flag.
> >
> > Race condition is considered for tx path.
> >
> > Fixes: 89f86675cb03 ("virtio_net: xsk: tx: support xmit xsk buffer")
> 
> This is not a bug, so we do not need this.
> And you post this to net-next.

Okay, I'll remove this tag in V4.

> 
> 
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > ---
> > v3:
[...]
> > +
> > +	if (need_wakeup && vring_size == sq->vq->num_free)
> > +		xsk_set_tx_need_wakeup(pool);
> 
> You need to comment this.

Ack!

> 
> 
> > +
[...]
> > +
> >  	if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq))
> >  		check_sq_full_and_disable(vi, vi->dev, sq);
> 
> 
> After fixed above comments, you can add:
> 
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

OK! Thanks for the review :)

> 
> Thanks.
> 
> 
> >
> > @@ -1470,9 +1488,6 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
> >  	u64_stats_add(&sq->stats.xdp_tx,  sent);
> >  	u64_stats_update_end(&sq->stats.syncp);
> >
> > -	if (xsk_uses_need_wakeup(pool))
> > -		xsk_set_tx_need_wakeup(pool);
> > -
> >  	return sent;
> >  }
> >
> > --
> > 2.54.0
> >
> 
> 






* Re: [PATCH net-next v3] virtio-net: xsk: support tx wake up
  2026-06-21 22:06 ` Jakub Kicinski
@ 2026-06-22 12:38   ` Menglong Dong
  0 siblings, 0 replies; 8+ messages in thread
From: Menglong Dong @ 2026-06-22 12:38 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: xuanzhuo, Menglong Dong, eperezma, mst, jasowang, andrew+netdev,
	davem, edumazet, pabeni, netdev, virtualization, linux-kernel

On 2026/6/22 06:06 Jakub Kicinski <kuba@kernel.org> wrote:
> On Tue, 16 Jun 2026 19:59:12 +0800 Menglong Dong wrote:
> > For now, XDP_RING_NEED_WAKEUP is not supported properly by the virtio-net
> > in the tx path for example: we set xsk_set_tx_need_wakeup() in
> > virtnet_xsk_xmit(), but we didn't call xsk_clear_tx_need_wakeup()
> > anywhere, which means the user will call send() for every packet.
> > 
> > We call xsk_set_tx_need_wakeup() after virtnet_xsk_xmit_batch() if sq->vq
> > is empty, as we can't be wakeup by the skb_xmit_done() in this case.
> > Otherwise, we will clear the wakeup flag.
> > 
> > Race condition is considered for tx path.
> 
> Seems to follow what mlx5 does so presumably this is fine but IDK if

Yeah, I followed the logic of mlx5. It's amazing that you found it :)

> there's anything virtio-specific that we need to be worried about.
> 
> Xuan Zhuo, please TAL?
> -- 
> mping: VIRTIO NET DRIVER
> 
> 






* Re: [PATCH net-next v3] virtio-net: xsk: support tx wake up
  2026-06-22 12:27   ` Menglong Dong
@ 2026-06-22 13:24     ` Michael S. Tsirkin
  0 siblings, 0 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2026-06-22 13:24 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Menglong Dong, xuanzhuo, eperezma, jasowang, andrew+netdev, davem,
	edumazet, kuba, pabeni, netdev, virtualization, linux-kernel

On Mon, Jun 22, 2026 at 08:27:12PM +0800, Menglong Dong wrote:
> On 2026/6/22 06:31 Michael S. Tsirkin <mst@redhat.com> write:
> > On Tue, Jun 16, 2026 at 07:59:12PM +0800, Menglong Dong wrote:
> [...]
> > >  
> > > +	vring_size = virtqueue_get_vring_size(sq->vq);
> > > +	need_wakeup = xsk_uses_need_wakeup(pool);
> > > +
> > > +	if (need_wakeup && vring_size == sq->vq->num_free)
> > > +		xsk_set_tx_need_wakeup(pool);
> > > +
> > 
> > why are we doing this here?
> > the check after virtnet_xsk_xmit_batch not enough?
> > I vaguely think it's some kind of race we are closing?
> > Pls add a comment to explain.
> 
> Hi, Michael. Thanks for your review.
> 
> Yeah, it's for a race condition between user space and kernel
> space. I added a comment in V2, which is too confusing, and
> I removed it 😢. I'll make it more clear and add it in the V4. The
> origin comment is:
> 
>  * If the sq->vq is empty, and the tx ring is empty, and the user
>  * submit an entry to the tx ring after virtnet_xsk_xmit_batch() and
>  * before xsk_set_tx_need_wakeup(), we will lose the chance to wake
>  * up the tx napi, so we have to set the need_wakeup flag here.
> 
> And the logic is like this:
> 
> Kernel: tx NAPI is waked up from skb_xmit_done() ->
> Kernel: sq->vq and xsk->tx_ring are both empty ->
> Kernel: call virtnet_xsk_xmit_batch()
> 
>     User: submit a entry to the xsk->tx_ring
>     User: check the wakeup flag
>     User: wakeup flag is not set, skip send()
> 
> Kernel: call xsk_set_tx_need_wakeup(), because sq->vq is empty
> 
> If we don't send more data, the data in the xsk->tx_ring will
> not be sent forever.

I'm not 100% sure I understand, but when someone fixes cross-CPU races
with no synchronization or CPU memory barriers, just with extra checks,
it always gives me pause.

AI helped me write this example:
  1. Kernel: xsk_set_tx_need_wakeup stores NEED_WAKEUP (sits in store buffer)
  2. Kernel: xsk_tx_peek_release_desc_batch - load, sees empty (reordered before the store is globally visible)
  3. Kernel: peek finds nothing, returns 0
  4. Userspace: stores entry + producer
  5. Userspace: loads flags - doesn't see NEED_WAKEUP yet (still in kernel's store buffer)
  6. Userspace: skips send()
  7. Kernel: NEED_WAKEUP store finally becomes visible - too late

Seems legit?
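
The scenario above is the classic store-buffering pattern: each side
stores to one location and then loads the other. A minimal user-space
sketch in C11 atomics follows, assuming a full fence between the store
and the load on both sides. Everything here (struct toy_ring,
TOY_NEED_WAKEUP, the function names) is hypothetical and is not the
xsk or virtio-net API; whether the existing kernel helpers already
provide equivalent ordering is not established in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NEED_WAKEUP 1u	/* hypothetical stand-in for the ring flag bit */

/* Hypothetical shared state: ring flags plus the user's producer index. */
struct toy_ring {
	atomic_uint flags;	/* written by the kernel side only */
	atomic_uint producer;	/* written by the user side only */
};

/*
 * Kernel side: publish the flag, then look for work.
 * Returns true if a new descriptor was observed.
 */
bool toy_kernel_set_then_peek(struct toy_ring *r, unsigned int consumed)
{
	assert(r);
	atomic_store_explicit(&r->flags, TOY_NEED_WAKEUP, memory_order_relaxed);
	/* Full fence: the load below may not be ordered before the store. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&r->producer, memory_order_relaxed) != consumed;
}

/*
 * User side: publish one descriptor, then check the flag.
 * Returns true if the user would have to call send().
 */
bool toy_user_submit_needs_kick(struct toy_ring *r)
{
	unsigned int p;

	assert(r);
	p = atomic_load_explicit(&r->producer, memory_order_relaxed);
	atomic_store_explicit(&r->producer, p + 1, memory_order_relaxed);
	/* Matching full fence on the user side. */
	atomic_thread_fence(memory_order_seq_cst);
	return (atomic_load_explicit(&r->flags, memory_order_relaxed) &
		TOY_NEED_WAKEUP) != 0;
}
```

With both fences in place, the C11 memory model rules out the outcome
where the kernel side misses the new descriptor and the user side misses
the flag at the same time. Drop either fence and that outcome becomes
possible again, which is the failure described in steps 1-7.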



> > 
> > >  	sent = virtnet_xsk_xmit_batch(sq, pool, budget, &kicks);
> > >  
> > > +	if (need_wakeup) {
> > > +		if (vring_size == sq->vq->num_free)
> > > +			/* we can't wake up by ourself, and it should be done
> > > +			 * by the user.
> > > +			 */
> > > +			xsk_set_tx_need_wakeup(pool);
> > > +		else
> > > +			/* we can wake up from skb_xmit_done() */
> > > +			xsk_clear_tx_need_wakeup(pool);
> > 
> > But what if we don't have get tx napi so no wakeup in skb_xmit_done?
> 
> Sorry that I'm not sure what "get tx napi" means here ;(
> 
> There are entry in sq->vq, so skb_xmit_done() will be called after
> the entries in the ring is consumed by the HOST, right?
> Then, the corresponding sq->napi will be scheduled, as we ensure
> that tx napi is always enabled, which means napi->weight is not
> zero, in this commit:
> 1df5116a41a8 ("virtio_net: xsk: prevent disable tx napi")

Oh I forgot we did that. But can xsk bind when tx napi has already
been disabled previously?


> Right?
> 
> Thanks!
> Menglong Dong
> 
> > 
> > 
> > > +	}
> > > +
> > >  	if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq))
> > >  		check_sq_full_and_disable(vi, vi->dev, sq);
> > >  
> > > @@ -1470,9 +1488,6 @@ static bool virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
> > >  	u64_stats_add(&sq->stats.xdp_tx,  sent);
> > >  	u64_stats_update_end(&sq->stats.syncp);
> > >  
> > > -	if (xsk_uses_need_wakeup(pool))
> > > -		xsk_set_tx_need_wakeup(pool);
> > > -
> > >  	return sent;
> > >  }
> > >  
> > > -- 
> > > 2.54.0
> > 
> > 
> > 
> 
> 
> 



end of thread, other threads:[~2026-06-22 13:24 UTC | newest]

Thread overview: 8+ messages
2026-06-16 11:59 [PATCH net-next v3] virtio-net: xsk: support tx wake up Menglong Dong
2026-06-21 22:06 ` Jakub Kicinski
2026-06-22 12:38   ` Menglong Dong
2026-06-21 22:31 ` Michael S. Tsirkin
2026-06-22 12:27   ` Menglong Dong
2026-06-22 13:24     ` Michael S. Tsirkin
2026-06-22  2:40 ` Xuan Zhuo
2026-06-22 12:28   ` Menglong Dong
