Netdev List
From: Bui Quang Minh <minhquangbui99@gmail.com>
To: menglong8.dong@gmail.com, xuanzhuo@linux.alibaba.com,
	eperezma@redhat.com
Cc: mst@redhat.com, jasowang@redhat.com, andrew+netdev@lunn.ch,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, kerneljasonxing@gmail.com,
	netdev@vger.kernel.org, virtualization@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx wake up
Date: Thu, 11 Jun 2026 23:24:44 +0700
Message-ID: <41eefa1d-99bf-450d-988e-7dec67c6b61e@gmail.com>
In-Reply-To: <20260611025644.2431148-2-dongml2@chinatelecom.cn>

On 6/11/26 09:56, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <dongml2@chinatelecom.cn>
>
> During packet receiving in virtio-net, the rq can be empty, which means
> "rq->vq->num_free == virtqueue_get_vring_size(rq->vq)", in
> virtnet_add_recvbuf_xsk(), if we are using xsk. Meanwhile, the fill ring
> can be empty too, which means we can't allocate anything from
> xsk_buff_alloc_batch(). Then, we will set the XDP_RING_NEED_WAKEUP flag.
>
> However, if the user clean all the data in rx ring and fill the
> "fill ring" and check the XDP_RING_NEED_WAKEUP flag after
> xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), then the rx
> napi will never be scheduled: the rx ring is empty, which means we will
> never receive a packet to trigger the further recv fill. The rx ring is
> empty now, so the user will not check the flag too.
>
> Fix this by set the XDP_RING_NEED_WAKEUP flag before
> xsk_buff_alloc_batch() if both rq->vq and fill ring are empty.
>
> Meanwhile, set the XDP_RING_NEED_WAKEUP flag if we have any free entry in
> rq->vq.
>
> Fixes: e3f8800aa243 ("virtio-net: xsk: Support wakeup on RX side")
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
>   drivers/net/virtio_net.c | 25 ++++++++++++++++++++++---
>   1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f4adcfee7a80..4b5b3fa62008 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1323,16 +1323,27 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>   				   struct xsk_buff_pool *pool, gfp_t gfp)
>   {
>   	struct xdp_buff **xsk_buffs;
> +	bool need_wakeup;
>   	dma_addr_t addr;
>   	int err = 0;
>   	u32 len, i;
>   	int num;
>   
> +	need_wakeup = xsk_uses_need_wakeup(pool);
>   	xsk_buffs = rq->xsk_buffs;
>   
> +	/* If both rq->vq and fill ring are empty, and then the user submit
> +	 * all the chunks to the fill ring and check the wake up flag
> +	 * after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(),
> +	 * we will lose the chance to wake up the rx napi, so we have to
> +	 * set the need_wakeup flag here.
> +	 */
> +	if (need_wakeup && virtqueue_get_vring_size(rq->vq) == rq->vq->num_free)
> +		xsk_set_rx_need_wakeup(pool);

I think that when polling the receive queue, the userspace program needs
to check the XDP_RING_NEED_WAKEUP flag whenever it does not see any
packets. The flag check is quite lightweight in my opinion. Here are
some examples I found:

- https://github.com/xdp-project/xdp-tools/blob/e9469501622aa22a7e452a671000bec8685edcde/lib/util/xdpsock.c#L1206
- https://github.com/xdp-project/bpf-examples/blob/43e565901c4287efa863edca7f0e6cd6e35ed896/AF_XDP-forwarding/xsk_fwd.c#L540
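
To make the userspace side concrete, here is a minimal self-contained
model of the check I mean. It is only a sketch: struct demo_xsk,
demo_kick_rx() and DEMO_NEED_WAKEUP are hypothetical stand-ins for the
real socket state, the recvfrom()/poll() wakeup call and
XDP_RING_NEED_WAKEUP, not the libxdp API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for XDP_RING_NEED_WAKEUP. */
#define DEMO_NEED_WAKEUP 1u

/* Hypothetical model of the state an AF_XDP application looks at. */
struct demo_xsk {
	_Atomic uint32_t fill_flags;	/* flags word of the fill ring */
	uint32_t rx_pending;		/* descriptors waiting in the rx ring */
	uint32_t wakeups;		/* number of simulated wakeup syscalls */
};

/* Stand-in for the recvfrom()/poll() call that schedules rx NAPI. */
static void demo_kick_rx(struct demo_xsk *xsk)
{
	xsk->wakeups++;
}

/* One iteration of the rx loop; returns the number of packets consumed. */
static uint32_t demo_rx_poll_once(struct demo_xsk *xsk)
{
	uint32_t rcvd;

	assert(xsk);
	rcvd = xsk->rx_pending;

	if (!rcvd) {
		/* Nothing received: check the flag before going idle. */
		if (atomic_load_explicit(&xsk->fill_flags,
					 memory_order_relaxed) & DEMO_NEED_WAKEUP)
			demo_kick_rx(xsk);
		return 0;
	}

	/* Packets are available, so no flag check is needed on this path. */
	xsk->rx_pending = 0;
	return rcvd;
}
```

With this shape, an empty rx ring always leads to one more look at the
flag, so a flag that is set slightly late is still noticed on the next
poll iteration.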

Furthermore, the XDP_RING_NEED_WAKEUP flag related functions do not
provide any memory ordering guarantees. So even with your patch, I'm
worried that this case is still possible:

kernel                                    userspace

xsk_buff_alloc_batch -> failed
                                          submit fill ring
                                          flag != XDP_RING_NEED_WAKEUP
// reordering due to lack of memory ordering
xsk_set_rx_need_wakeup

I'm not an expert here, so correct me if I'm wrong. I think the wake up
flag is designed without any ordering guarantees, so we cannot rely on
it to reason about what the other side has seen and skip further checks.
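
For reference, the pattern with your patch applied is the classic
"store, then load the other side's variable" shape on both sides. Below
is a self-contained C11 model of what I think would be required for the
reasoning to hold. All names (struct sb_ring, sb_kernel_refill(),
sb_user_submit(), SB_NEED_WAKEUP) are hypothetical, and the seq_cst
fences are an assumption standing in for full barriers that, as far as I
can tell, the current flag helpers do not have.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for XDP_RING_NEED_WAKEUP. */
#define SB_NEED_WAKEUP 1u

/* Hypothetical model of the two words both sides touch. */
struct sb_ring {
	_Atomic uint32_t fill_prod;	/* entries published to the fill ring */
	_Atomic uint32_t flags;		/* need_wakeup flag word */
};

/* Kernel side as in the patch: set the flag first, then try to allocate.
 * Returns true if it saw buffers in the fill ring.
 */
static bool sb_kernel_refill(struct sb_ring *r)
{
	assert(r);
	atomic_store_explicit(&r->flags, SB_NEED_WAKEUP, memory_order_relaxed);
	/* Assumed full barrier: keeps the flag store ahead of the load. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&r->fill_prod, memory_order_relaxed) != 0;
}

/* Userspace side: publish buffers, then check the flag.
 * Returns true if it saw the flag and therefore must kick the kernel.
 */
static bool sb_user_submit(struct sb_ring *r, uint32_t n)
{
	assert(r);
	atomic_fetch_add_explicit(&r->fill_prod, n, memory_order_relaxed);
	/* Assumed full barrier: keeps the publish ahead of the flag load. */
	atomic_thread_fence(memory_order_seq_cst);
	return (atomic_load_explicit(&r->flags, memory_order_relaxed) &
		SB_NEED_WAKEUP) != 0;
}
```

With both fences in place, at least one of the two functions returns
true, so either the kernel sees the new buffers or userspace sees the
flag. If either fence is dropped, both may return false, which is the
lost wakeup from the diagram above.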

> +
>   	num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->vq->num_free);
>   	if (!num) {
> -		if (xsk_uses_need_wakeup(pool)) {
> +		if (need_wakeup) {
>   			xsk_set_rx_need_wakeup(pool);
>   			/* Return 0 instead of -ENOMEM so that NAPI is
>   			 * descheduled.
> @@ -1341,8 +1352,6 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>   		}
>   
>   		return -ENOMEM;
> -	} else {
> -		xsk_clear_rx_need_wakeup(pool);
>   	}
>   
>   	len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len;
> @@ -1363,6 +1372,16 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>   			goto err;
>   	}
>   
> +	if (need_wakeup) {
> +		if (rq->vq->num_free)
> +			/* We have free buffers, so we'd better wake up the
> +			 * rx napi as soon as possible.
> +			 */
> +			xsk_set_rx_need_wakeup(pool);
> +		else
> +			xsk_clear_rx_need_wakeup(pool);
> +	}
> +

Why do we need to set XDP_RING_NEED_WAKEUP even when 
xsk_buff_alloc_batch succeeds?

>   	return num;
>   
>   err:

Thanks,
Quang Minh.


