bpf.vger.kernel.org archive mirror
From: Jason Xing <kerneljasonxing@gmail.com>
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, bjorn@kernel.org, magnus.karlsson@intel.com,
	maciej.fijalkowski@intel.com, jonathan.lemon@gmail.com,
	sdf@fomichev.me, ast@kernel.org, daniel@iogearbox.net,
	hawk@kernel.org, john.fastabend@gmail.com, joe@dama.to,
	willemdebruijn.kernel@gmail.com
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Jason Xing <kernelxing@tencent.com>
Subject: [PATCH net-next v3 1/2] net: xsk: update tx queue consumer immediately after transmission
Date: Wed, 25 Jun 2025 18:10:13 +0800
Message-ID: <20250625101014.45066-2-kerneljasonxing@gmail.com>
In-Reply-To: <20250625101014.45066-1-kerneljasonxing@gmail.com>

From: Jason Xing <kernelxing@tencent.com>

For AF_XDP, the return value of the sendto() syscall doesn't reflect how
many descs were handled in the kernel. One use case is a user-space
application that wants to know the number of transmitted skbs in order to
decide whether to keep sending, e.g. was the send stopped because the max
tx budget was reached?

The following formula can be used after sending to learn how many
skbs/descs the kernel took care of:

  tx_queue.consumers_after - tx_queue.consumers_before
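
As a rough user-space illustration of that bookkeeping, the sketch below
snapshots the shared tx consumer index around a send and takes the
difference. The helper names and the shared_cons pointer (assumed to point
at the consumer field of the mmap'ed tx ring) are hypothetical and not part
of this patch. Since the consumer index only moves forward, the count is
taken as the later snapshot minus the earlier one, and unsigned 32-bit
arithmetic keeps the result correct across index wraparound.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read the tx consumer index that the kernel
 * publishes in the shared ring. shared_cons is assumed to point at the
 * consumer field of the mmap'ed tx ring.
 */
static inline uint32_t tx_cons_snapshot(const uint32_t *shared_cons)
{
	return __atomic_load_n(shared_cons, __ATOMIC_ACQUIRE);
}

/*
 * Hypothetical helper: number of descs the kernel consumed between two
 * snapshots. Unsigned subtraction handles wraparound of the index.
 */
static inline uint32_t tx_descs_consumed(uint32_t before, uint32_t after)
{
	return after - before;
}
```

An application would take one snapshot before sendto() and one after it
returns, then compare the result against the number of descs it queued to
decide whether to call sendto() again.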

Prior to this patch, in non-zc mode, the consumer of the tx queue is not
updated immediately at the end of a sendto() syscall when an error
occurs, which leaves the consumer value stale from the perspective of
user space. So this patch adds a store operation that passes the cached
value to the shared value to fix the problem.

Besides the explicit errors that appear in the while() loop in
__xsk_generic_xmit(), there are a few possible error cases that might
be overlooked in the following call trace:
__xsk_generic_xmit()
    xskq_cons_peek_desc()
        xskq_cons_read_desc()
            xskq_cons_is_valid_desc()
They also cause a premature exit from the while() loop even if not
all the descs are consumed.

Based on the above analysis, checking 'cached_prod != cached_cons' covers
all the possible cases, because it indicates that there are remaining
descs that have not been handled and that cached_cons has not yet been
written back to the global consumer state.
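
The effect of the check can be shown with a toy user-space model of the
queue state. This is only a sketch under simplifying assumptions, not the
kernel implementation: struct model_queue and the model_* functions are
hypothetical stand-ins, the only error modelled is running out of budget,
and the drained case is assumed to publish the consumer on the normal path.

```c
#include <errno.h>
#include <stdint.h>

/* Toy stand-in for the tx queue: two cached indices plus the shared one. */
struct model_queue {
	uint32_t cached_prod;	/* local copy of the producer index */
	uint32_t cached_cons;	/* local consumer, advanced per desc */
	uint32_t shared_cons;	/* consumer index visible to user space */
};

/* Same condition as in the commit message: descs remain unhandled. */
static int model_has_descs(const struct model_queue *q)
{
	return q->cached_cons != q->cached_prod;
}

/* Publish the cached consumer to the shared location. */
static void model_cons_release(struct model_queue *q)
{
	__atomic_store_n(&q->shared_cons, q->cached_cons, __ATOMIC_RELEASE);
}

/* Consume descs until the queue drains or the budget runs out. */
static int model_xmit(struct model_queue *q, uint32_t budget)
{
	int err = 0;

	while (model_has_descs(q)) {
		if (budget == 0) {
			err = -EAGAIN;	/* assumed error code for this model */
			goto out;
		}
		budget--;
		q->cached_cons++;
	}
	/* Drained: assumed to be published by the normal path. */
	model_cons_release(q);
	return 0;

out:
	/* Early exit with descs left: publish what was consumed so far. */
	if (model_has_descs(q))
		model_cons_release(q);
	return err;
}
```

With 8 descs queued and a budget of 3, the model returns early but
shared_cons still ends up at 3, so a user-space reader sees how far the
send got.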

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
v3
Link: https://lore.kernel.org/all/20250623073129.23290-1-kerneljasonxing@gmail.com/
1. use xskq_has_descs helper.
2. add selftest

v2
Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
1. filter out the good cases because only those that return an error need
updates.
Side note:
1. in non-batched zero copy mode, at the end of every caller of
xsk_tx_peek_desc(), there is always an xsk_tx_release() call that is used
to publish the local consumer to the global consumer state. So for the
zero copy mode, no change is needed at all.
2. Actually I have no strong preference between v1 (see the above link)
and v2 because smp_store_release() shouldn't cause side effects.
For the sake of precise code, v2 is the preferable one.
---
 net/xdp/xsk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 5542675dffa9..ab6351b24ac8 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -856,6 +856,9 @@ static int __xsk_generic_xmit(struct sock *sk)
 	}
 
 out:
+	if (xskq_has_descs(xs->tx))
+		__xskq_cons_release(xs->tx);
+
 	if (sent_frame)
 		if (xsk_tx_writeable(xs))
 			sk->sk_write_space(sk);
-- 
2.41.3

