From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: <davem@davemloft.net>, <edumazet@google.com>, <kuba@kernel.org>,
<pabeni@redhat.com>, <bjorn@kernel.org>,
<magnus.karlsson@intel.com>, <jonathan.lemon@gmail.com>,
<sdf@fomichev.me>, <ast@kernel.org>, <daniel@iogearbox.net>,
<hawk@kernel.org>, <john.fastabend@gmail.com>, <joe@dama.to>,
<willemdebruijn.kernel@gmail.com>, <bpf@vger.kernel.org>,
<netdev@vger.kernel.org>, Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH net-next v4 1/2] net: xsk: update tx queue consumer immediately after transmission
Date: Mon, 30 Jun 2025 13:49:11 +0200 [thread overview]
Message-ID: <aGJ5t2hvN9wX+vxh@boxer> (raw)
In-Reply-To: <20250627085745.53173-2-kerneljasonxing@gmail.com>
On Fri, Jun 27, 2025 at 04:57:44PM +0800, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> For afxdp, the return value of sendto() syscall doesn't reflect how many
> descs handled in the kernel. One of use cases is that when user-space
> application tries to know the number of transmitted skbs and then decides
> if it continues to send, say, is it stopped due to max tx budget?
>
> The following formular can be used after sending to learn how many
> skbs/descs the kernel takes care of:
>
> tx_queue.consumers_before - tx_queue.consumers_after
>
> Prior to the current patch, in non-zc mode, the consumer of tx queue is
> not immediately updated at the end of each sendto syscall when error
> occurs, which leads to the consumer value out-of-dated from the perspective
> of user space. So this patch requires store operation to pass the cached
> value to the shared value to handle the problem.
>
> More than those explicit errors appearing in the while() loop in
> __xsk_generic_xmit(), there are a few possible error cases that might
> be neglected in the following call trace:
> __xsk_generic_xmit()
> xskq_cons_peek_desc()
> xskq_cons_read_desc()
> xskq_cons_is_valid_desc()
> It will also cause the premature exit in the while() loop even if not
> all the descs are consumed.
>
> Based on the above analysis, using @sent_frame could cover all the possible
> cases where it might lead to out-of-dated global state of consumer after
> finishing __xsk_generic_xmit().
>
> The patch also adds a common helper __xsk_tx_release() to keep align
> with the zc mode usage in xsk_tx_release().
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
> v4
> Link: https://lore.kernel.org/all/20250625101014.45066-1-kerneljasonxing@gmail.com/
> 1. use the common helper
> 2. keep align with the zc mode usage in xsk_tx_release()
>
> v3
> Link: https://lore.kernel.org/all/20250623073129.23290-1-kerneljasonxing@gmail.com/
> 1. use xskq_has_descs helper.
> 2. add selftest
>
> V2
> Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
> 1. filter out those good cases because only those that return error need
> updates.
> Side note:
> 1. in non-batched zero copy mode, at the end of every caller of
> xsk_tx_peek_desc(), there is always a xsk_tx_release() function that used
> to update the local consumer to the global state of consumer. So for the
> zero copy mode, no need to change at all.
> 2. Actually I have no strong preference between v1 (see the above link)
> and v2 because smp_store_release() shouldn't cause side effect.
> Considering the exactitude of writing code, v2 is a more preferable
> one.
> ---
> net/xdp/xsk.c | 17 ++++++++++-------
> 1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 72c000c0ae5f..bd61b0bc9c24 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -300,6 +300,13 @@ static bool xsk_tx_writeable(struct xdp_sock *xs)
> return true;
> }
>
> +static void __xsk_tx_release(struct xdp_sock *xs)
> +{
> + __xskq_cons_release(xs->tx);
> + if (xsk_tx_writeable(xs))
> + xs->sk.sk_write_space(&xs->sk);
> +}
> +
> static bool xsk_is_bound(struct xdp_sock *xs)
> {
> if (READ_ONCE(xs->state) == XSK_BOUND) {
> @@ -407,11 +414,8 @@ void xsk_tx_release(struct xsk_buff_pool *pool)
> struct xdp_sock *xs;
>
> rcu_read_lock();
> - list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
> - __xskq_cons_release(xs->tx);
> - if (xsk_tx_writeable(xs))
> - xs->sk.sk_write_space(&xs->sk);
> - }
> + list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list)
> + __xsk_tx_release(xs);
> rcu_read_unlock();
> }
> EXPORT_SYMBOL(xsk_tx_release);
> @@ -858,8 +862,7 @@ static int __xsk_generic_xmit(struct sock *sk)
>
> out:
> if (sent_frame)
> - if (xsk_tx_writeable(xs))
> - sk->sk_write_space(sk);
> + __xsk_tx_release(xs);
>
> mutex_unlock(&xs->mutex);
> return err;
> --
> 2.41.3
>
next prev parent reply other threads:[~2025-06-30 11:50 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-27 8:57 [PATCH net-next v4 0/2] net: xsk: update tx queue consumer Jason Xing
2025-06-27 8:57 ` [PATCH net-next v4 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
2025-06-30 11:49 ` Maciej Fijalkowski [this message]
2025-06-30 15:23 ` Stanislav Fomichev
2025-06-27 8:57 ` [PATCH net-next v4 2/2] selftests/bpf: check if the global consumer updates in time Jason Xing
2025-06-30 12:06 ` Maciej Fijalkowski
2025-06-30 12:28 ` Jason Xing
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aGJ5t2hvN9wX+vxh@boxer \
--to=maciej.fijalkowski@intel.com \
--cc=ast@kernel.org \
--cc=bjorn@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hawk@kernel.org \
--cc=joe@dama.to \
--cc=john.fastabend@gmail.com \
--cc=jonathan.lemon@gmail.com \
--cc=kerneljasonxing@gmail.com \
--cc=kernelxing@tencent.com \
--cc=kuba@kernel.org \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=willemdebruijn.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.