From: Bui Quang Minh <minhquangbui99@gmail.com>
To: menglong8.dong@gmail.com, xuanzhuo@linux.alibaba.com,
	eperezma@redhat.com
Cc: mst@redhat.com, jasowang@redhat.com, andrew+netdev@lunn.ch,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, kerneljasonxing@gmail.com,
	netdev@vger.kernel.org, virtualization@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx wake up
Date: Thu, 11 Jun 2026 23:24:44 +0700
Message-ID: <41eefa1d-99bf-450d-988e-7dec67c6b61e@gmail.com>
In-Reply-To: <20260611025644.2431148-2-dongml2@chinatelecom.cn>

On 6/11/26 09:56, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <dongml2@chinatelecom.cn>
>
> During packet receiving in virtio-net, the rq can be empty, which means
> "rq->vq->num_free == virtqueue_get_vring_size(rq->vq)", in
> virtnet_add_recvbuf_xsk(), if we are using xsk. Meanwhile, the fill ring
> can be empty too, which means we can't allocate anything from
> xsk_buff_alloc_batch(). Then, we will set the XDP_RING_NEED_WAKEUP flag.
>
> However, if the user cleans all the data in the rx ring, fills the
> "fill ring", and checks the XDP_RING_NEED_WAKEUP flag after
> xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), then the rx
> napi will never be scheduled: the rx ring is empty, which means we will
> never receive a packet to trigger a further recv fill. The rx ring is
> empty now, so the user will not check the flag either.
>
> Fix this by setting the XDP_RING_NEED_WAKEUP flag before
> xsk_buff_alloc_batch() if both rq->vq and the fill ring are empty.
>
> Meanwhile, set the XDP_RING_NEED_WAKEUP flag if we have any free entry in
> rq->vq.
>
> Fixes: e3f8800aa243 ("virtio-net: xsk: Support wakeup on RX side")
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
>   drivers/net/virtio_net.c | 25 ++++++++++++++++++++++---
>   1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f4adcfee7a80..4b5b3fa62008 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1323,16 +1323,27 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>   				   struct xsk_buff_pool *pool, gfp_t gfp)
>   {
>   	struct xdp_buff **xsk_buffs;
> +	bool need_wakeup;
>   	dma_addr_t addr;
>   	int err = 0;
>   	u32 len, i;
>   	int num;
>   
> +	need_wakeup = xsk_uses_need_wakeup(pool);
>   	xsk_buffs = rq->xsk_buffs;
>   
> +	/* If both rq->vq and fill ring are empty, and then the user submit
> +	 * all the chunks to the fill ring and check the wake up flag
> +	 * after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(),
> +	 * we will lose the chance to wake up the rx napi, so we have to
> +	 * set the need_wakeup flag here.
> +	 */
> +	if (need_wakeup && virtqueue_get_vring_size(rq->vq) == rq->vq->num_free)
> +		xsk_set_rx_need_wakeup(pool);

I think that when polling the receive queue, the userspace program needs to
check the XDP_RING_NEED_WAKEUP flag whenever it does not see any packets. The
flag check is quite lightweight in my opinion. Here are some examples I found:

- https://github.com/xdp-project/xdp-tools/blob/e9469501622aa22a7e452a671000bec8685edcde/lib/util/xdpsock.c#L1206
- https://github.com/xdp-project/bpf-examples/blob/43e565901c4287efa863edca7f0e6cd6e35ed896/AF_XDP-forwarding/xsk_fwd.c#L540
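
To make the userspace side concrete, here is a minimal sketch of the check I
mean. It is not taken from either project above: struct fill_ring_view and
rx_should_kick() are hypothetical names, and RING_NEED_WAKEUP is redefined
locally (same value as XDP_RING_NEED_WAKEUP in <linux/if_xdp.h>) only so the
sketch builds without kernel headers. A real program would use the libxdp
helpers and kick the kernel with a syscall such as recvfrom() or poll().

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as XDP_RING_NEED_WAKEUP in <linux/if_xdp.h>; redefined here
 * only so that the sketch compiles without kernel headers.
 */
#define RING_NEED_WAKEUP (1u << 0)

/* Hypothetical stand-in for the fill ring state shared with the kernel. */
struct fill_ring_view {
	const volatile uint32_t *flags;	/* flags word written by the kernel */
};

/* Read the shared flags word and test the need_wakeup bit. */
static bool fill_ring_needs_wakeup(const struct fill_ring_view *fq)
{
	return (*fq->flags & RING_NEED_WAKEUP) != 0;
}

/* Hypothetical helper: decide whether userspace should kick the kernel
 * after one rx poll. `rcvd` is the number of descriptors found in the rx
 * ring. The flag is only consulted on the empty-poll path, which is the
 * path discussed in this thread.
 */
static bool rx_should_kick(const struct fill_ring_view *fq, unsigned int rcvd)
{
	if (rcvd)
		return false;	/* packets were seen; process them first */

	return fill_ring_needs_wakeup(fq);
}
```

The point is that the check costs one load on the empty-poll path, so doing it
every time no packet is seen should be cheap.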

Furthermore, the XDP_RING_NEED_WAKEUP flag related functions do not
provide any memory ordering. So even with your patch, I'm worried that
the following case is still possible:

kernel                                  userspace

xsk_buff_alloc_batch -> failed
                                        submit fill ring
                                        flag != XDP_RING_NEED_WAKEUP
// reordering due to lack of memory ordering
xsk_set_rx_need_wakeup

I'm not an expert here, so correct me if I'm wrong. I think the wake up
flag is designed without any ordering guarantees, so we cannot rely on it
to reason about the other side's state and skip further checks.
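
To illustrate the ordering concern, here is a generic C11 sketch of the
pattern. It is not the kernel's code: fill_prod, need_wakeup and both
functions are hypothetical stand-ins for the fill ring producer index and the
wake up flag. Each side stores its own variable and then loads the other's.
With a full fence between the store and the load on both sides, at least one
side is guaranteed to observe the other's store. Without the fences, both
loads may return stale values, which is the lost wake up I'm worried about.
Whether the xsk helpers give an equivalent guarantee is exactly what I'm
unsure of.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state standing in for the fill ring producer index
 * and the need_wakeup flag.
 */
static atomic_uint fill_prod;	/* buffers published by userspace */
static atomic_uint need_wakeup;	/* 1 when the kernel asks for a kick */

/* Kernel side: announce that a wake up is needed, then look at the fill
 * ring. Returns true if buffers were observed.
 */
static bool kernel_set_flag_then_check(void)
{
	atomic_store_explicit(&need_wakeup, 1, memory_order_relaxed);
	/* Full fence: the store above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&fill_prod, memory_order_relaxed) != 0;
}

/* User side: publish n buffers, then check the flag. Returns true if the
 * caller must kick the kernel.
 */
static bool user_fill_then_check(unsigned int n)
{
	atomic_fetch_add_explicit(&fill_prod, n, memory_order_relaxed);
	/* Full fence: the update above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&need_wakeup, memory_order_relaxed) != 0;
}
```

If either fence is dropped, the case where the kernel sees an empty fill ring
and userspace sees a clear flag becomes possible again.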

> +
>   	num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->vq->num_free);
>   	if (!num) {
> -		if (xsk_uses_need_wakeup(pool)) {
> +		if (need_wakeup) {
>   			xsk_set_rx_need_wakeup(pool);
>   			/* Return 0 instead of -ENOMEM so that NAPI is
>   			 * descheduled.
> @@ -1341,8 +1352,6 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>   		}
>   
>   		return -ENOMEM;
> -	} else {
> -		xsk_clear_rx_need_wakeup(pool);
>   	}
>   
>   	len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len;
> @@ -1363,6 +1372,16 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
>   			goto err;
>   	}
>   
> +	if (need_wakeup) {
> +		if (rq->vq->num_free)
> +			/* We have free buffers, so we'd better wake up the
> +			 * rx napi as soon as possible.
> +			 */
> +			xsk_set_rx_need_wakeup(pool);
> +		else
> +			xsk_clear_rx_need_wakeup(pool);
> +	}
> +

Why do we need to set XDP_RING_NEED_WAKEUP even when 
xsk_buff_alloc_batch succeeds?

>   	return num;
>   
>   err:

Thanks,
Quang Minh.


