* [PATCH net-next v2] net: xsk: update tx queue consumer immediately after transmission
@ 2025-06-23 7:31 Jason Xing
2025-06-23 14:35 ` Stanislav Fomichev
0 siblings, 1 reply; 3+ messages in thread
From: Jason Xing @ 2025-06-23 7:31 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
For afxdp, the return value of sendto() syscall doesn't reflect how many
descs handled in the kernel. One of use cases is that when user-space
application tries to know the number of transmitted skbs and then decides
if it continues to send, say, is it stopped due to max tx budget?
The following formular can be used after sending to learn how many
skbs/descs the kernel takes care of:
tx_queue.consumers_before - tx_queue.consumers_after
Prior to the current patch, in non-zc mode, the consumer of tx queue is
not immediately updated at the end of each sendto syscall when error
occurs, which leads to the consumer value out-of-dated from the perspective
of user space. So this patch requires store operation to pass the cached
value to the shared value to handle the problem.
More than those explicit errors appearing in the while() loop in
__xsk_generic_xmit(), there are a few possible error cases that might
be neglected in the following call trace:
__xsk_generic_xmit()
xskq_cons_peek_desc()
xskq_cons_read_desc()
xskq_cons_is_valid_desc()
It will also cause the premature exit in the while() loop even if not
all the descs are consumed.
Based on the above analysis, using 'cached_prod != cached_cons' could
cover all the possible cases because it represents there are remaining
descs that are not handled and cached_cons are not updated to the global
state of consumer at this time.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
V2
Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
1. filter out those good cases because only those that return error need
updates.
Side note:
1. in non-batched zero copy mode, at the end of every caller of
xsk_tx_peek_desc(), there is always a xsk_tx_release() function that used
to update the local consumer to the global state of consumer. So for the
zero copy mode, no need to change at all.
2. Actually I have no strong preference between v1 (see the above link)
and v2 because smp_store_release() shouldn't cause side effect.
Considering the exactitude of writing code, v2 is a more preferable
one.
---
net/xdp/xsk.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 5542675dffa9..b9223a2a6ada 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -856,6 +856,9 @@ static int __xsk_generic_xmit(struct sock *sk)
}
out:
+ if (xs->tx->cached_prod != xs->tx->cached_cons)
+ __xskq_cons_release(xs->tx);
+
if (sent_frame)
if (xsk_tx_writeable(xs))
sk->sk_write_space(sk);
--
2.43.5
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH net-next v2] net: xsk: update tx queue consumer immediately after transmission
2025-06-23 7:31 [PATCH net-next v2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
@ 2025-06-23 14:35 ` Stanislav Fomichev
2025-06-24 12:53 ` Jason Xing
0 siblings, 1 reply; 3+ messages in thread
From: Stanislav Fomichev @ 2025-06-23 14:35 UTC (permalink / raw)
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel, bpf, netdev,
Jason Xing
On 06/23, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> For afxdp, the return value of sendto() syscall doesn't reflect how many
> descs handled in the kernel. One of use cases is that when user-space
> application tries to know the number of transmitted skbs and then decides
> if it continues to send, say, is it stopped due to max tx budget?
>
> The following formular can be used after sending to learn how many
> skbs/descs the kernel takes care of:
>
> tx_queue.consumers_before - tx_queue.consumers_after
>
> Prior to the current patch, in non-zc mode, the consumer of tx queue is
> not immediately updated at the end of each sendto syscall when error
> occurs, which leads to the consumer value out-of-dated from the perspective
> of user space. So this patch requires store operation to pass the cached
> value to the shared value to handle the problem.
>
> More than those explicit errors appearing in the while() loop in
> __xsk_generic_xmit(), there are a few possible error cases that might
> be neglected in the following call trace:
> __xsk_generic_xmit()
> xskq_cons_peek_desc()
> xskq_cons_read_desc()
> xskq_cons_is_valid_desc()
> It will also cause the premature exit in the while() loop even if not
> all the descs are consumed.
>
> Based on the above analysis, using 'cached_prod != cached_cons' could
> cover all the possible cases because it represents there are remaining
> descs that are not handled and cached_cons are not updated to the global
> state of consumer at this time.
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> V2
> Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
> 1. filter out those good cases because only those that return error need
> updates.
> Side note:
> 1. in non-batched zero copy mode, at the end of every caller of
> xsk_tx_peek_desc(), there is always a xsk_tx_release() function that used
> to update the local consumer to the global state of consumer. So for the
> zero copy mode, no need to change at all.
> 2. Actually I have no strong preference between v1 (see the above link)
> and v2 because smp_store_release() shouldn't cause side effect.
> Considering the exactitude of writing code, v2 is a more preferable
> one.
> ---
> net/xdp/xsk.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 5542675dffa9..b9223a2a6ada 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -856,6 +856,9 @@ static int __xsk_generic_xmit(struct sock *sk)
> }
>
> out:
> + if (xs->tx->cached_prod != xs->tx->cached_cons)
Can we use xskq_has_descs() here instead?
And still would be nice to verify this with a test. Should not be much work,
most of the things (setup/etc) are already there, so you won't have
to write everything from scratch...
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH net-next v2] net: xsk: update tx queue consumer immediately after transmission
2025-06-23 14:35 ` Stanislav Fomichev
@ 2025-06-24 12:53 ` Jason Xing
0 siblings, 0 replies; 3+ messages in thread
From: Jason Xing @ 2025-06-24 12:53 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel, bpf, netdev,
Jason Xing
On Mon, Jun 23, 2025 at 10:35 PM Stanislav Fomichev
<stfomichev@gmail.com> wrote:
>
> On 06/23, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > For afxdp, the return value of sendto() syscall doesn't reflect how many
> > descs handled in the kernel. One of use cases is that when user-space
> > application tries to know the number of transmitted skbs and then decides
> > if it continues to send, say, is it stopped due to max tx budget?
> >
> > The following formular can be used after sending to learn how many
> > skbs/descs the kernel takes care of:
> >
> > tx_queue.consumers_before - tx_queue.consumers_after
> >
> > Prior to the current patch, in non-zc mode, the consumer of tx queue is
> > not immediately updated at the end of each sendto syscall when error
> > occurs, which leads to the consumer value out-of-dated from the perspective
> > of user space. So this patch requires store operation to pass the cached
> > value to the shared value to handle the problem.
> >
> > More than those explicit errors appearing in the while() loop in
> > __xsk_generic_xmit(), there are a few possible error cases that might
> > be neglected in the following call trace:
> > __xsk_generic_xmit()
> > xskq_cons_peek_desc()
> > xskq_cons_read_desc()
> > xskq_cons_is_valid_desc()
> > It will also cause the premature exit in the while() loop even if not
> > all the descs are consumed.
> >
> > Based on the above analysis, using 'cached_prod != cached_cons' could
> > cover all the possible cases because it represents there are remaining
> > descs that are not handled and cached_cons are not updated to the global
> > state of consumer at this time.
>
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> > V2
> > Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
> > 1. filter out those good cases because only those that return error need
> > updates.
> > Side note:
> > 1. in non-batched zero copy mode, at the end of every caller of
> > xsk_tx_peek_desc(), there is always a xsk_tx_release() function that used
> > to update the local consumer to the global state of consumer. So for the
> > zero copy mode, no need to change at all.
> > 2. Actually I have no strong preference between v1 (see the above link)
> > and v2 because smp_store_release() shouldn't cause side effect.
> > Considering the exactitude of writing code, v2 is a more preferable
> > one.
> > ---
> > net/xdp/xsk.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index 5542675dffa9..b9223a2a6ada 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -856,6 +856,9 @@ static int __xsk_generic_xmit(struct sock *sk)
> > }
> >
> > out:
> > + if (xs->tx->cached_prod != xs->tx->cached_cons)
>
> Can we use xskq_has_descs() here instead?
Sure, thanks.
>
> And still would be nice to verify this with a test. Should not be much work,
> most of the things (setup/etc) are already there, so you won't have
> to write everything from scratch...
Ah, finally I managed to make it work but a bit ugly... Let me try if
I can find a
better way to write the test patch.
Thanks,
Jason
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-06-24 12:54 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-23 7:31 [PATCH net-next v2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
2025-06-23 14:35 ` Stanislav Fomichev
2025-06-24 12:53 ` Jason Xing
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).