* [PATCH net-next v3 0/2] net: xsk: update tx queue consumer
@ 2025-06-25 10:10 Jason Xing
2025-06-25 10:10 ` [PATCH net-next v3 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
2025-06-25 10:10 ` [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call Jason Xing
0 siblings, 2 replies; 11+ messages in thread
From: Jason Xing @ 2025-06-25 10:10 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
Patch 1 makes sure the consumer is updated at the end of xsk xmit.
Patch 2 adds corresponding test.
Jason Xing (2):
net: xsk: update tx queue consumer immediately after transmission
selftests/bpf: check if the global consumer of tx queue updates after
send call
net/xdp/xsk.c | 3 +++
tools/testing/selftests/bpf/xskxceiver.c | 30 ++++++++++++++++++++++--
2 files changed, 31 insertions(+), 2 deletions(-)
--
2.41.3
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH net-next v3 1/2] net: xsk: update tx queue consumer immediately after transmission
2025-06-25 10:10 [PATCH net-next v3 0/2] net: xsk: update tx queue consumer Jason Xing
@ 2025-06-25 10:10 ` Jason Xing
2025-06-25 11:09 ` Maciej Fijalkowski
2025-06-25 10:10 ` [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call Jason Xing
1 sibling, 1 reply; 11+ messages in thread
From: Jason Xing @ 2025-06-25 10:10 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
For afxdp, the return value of sendto() syscall doesn't reflect how many
descs handled in the kernel. One of use cases is that when user-space
application tries to know the number of transmitted skbs and then decides
if it continues to send, say, is it stopped due to max tx budget?
The following formular can be used after sending to learn how many
skbs/descs the kernel takes care of:
tx_queue.consumers_before - tx_queue.consumers_after
Prior to the current patch, in non-zc mode, the consumer of tx queue is
not immediately updated at the end of each sendto syscall when error
occurs, which leads to the consumer value out-of-dated from the perspective
of user space. So this patch requires store operation to pass the cached
value to the shared value to handle the problem.
More than those explicit errors appearing in the while() loop in
__xsk_generic_xmit(), there are a few possible error cases that might
be neglected in the following call trace:
__xsk_generic_xmit()
xskq_cons_peek_desc()
xskq_cons_read_desc()
xskq_cons_is_valid_desc()
It will also cause the premature exit in the while() loop even if not
all the descs are consumed.
Based on the above analysis, using 'cached_prod != cached_cons' could
cover all the possible cases because it represents there are remaining
descs that are not handled and cached_cons are not updated to the global
state of consumer at this time.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
v3
Link: https://lore.kernel.org/all/20250623073129.23290-1-kerneljasonxing@gmail.com/
1. use xskq_has_descs helper.
2. add selftest
V2
Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
1. filter out those good cases because only those that return error need
updates.
Side note:
1. in non-batched zero copy mode, at the end of every caller of
xsk_tx_peek_desc(), there is always a xsk_tx_release() function that used
to update the local consumer to the global state of consumer. So for the
zero copy mode, no need to change at all.
2. Actually I have no strong preference between v1 (see the above link)
and v2 because smp_store_release() shouldn't cause side effect.
Considering the exactitude of writing code, v2 is a more preferable
one.
---
net/xdp/xsk.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 5542675dffa9..ab6351b24ac8 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -856,6 +856,9 @@ static int __xsk_generic_xmit(struct sock *sk)
}
out:
+ if (xskq_has_descs(xs->tx))
+ __xskq_cons_release(xs->tx);
+
if (sent_frame)
if (xsk_tx_writeable(xs))
sk->sk_write_space(sk);
--
2.41.3
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call
2025-06-25 10:10 [PATCH net-next v3 0/2] net: xsk: update tx queue consumer Jason Xing
2025-06-25 10:10 ` [PATCH net-next v3 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
@ 2025-06-25 10:10 ` Jason Xing
2025-06-25 12:19 ` Maciej Fijalkowski
1 sibling, 1 reply; 11+ messages in thread
From: Jason Xing @ 2025-06-25 10:10 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
The subtest sends 33 packets at one time on purpose to see if xsk
exitting __xsk_generic_xmit() updates the global consumer of tx queue
when reaching the max loop (max_tx_budget, 32 by default). The number 33
can avoid xskq_cons_peek_desc() updates the consumer, to accurately
check if the issue that the first patch resolves remains.
Speaking of the selftest implementation, it's not possible to use the
normal validation_func to check if the issue happens because the whole
send packets logic will call the sendto multiple times such that we're
unable to detect in time.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
tools/testing/selftests/bpf/xskxceiver.c | 30 ++++++++++++++++++++++--
1 file changed, 28 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 0ced4026ee44..f7aa83706bc7 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -109,6 +109,8 @@
#include <network_helpers.h>
+#define MAX_TX_BUDGET_DEFAULT 32
+
static bool opt_verbose;
static bool opt_print_tests;
static enum test_mode opt_mode = TEST_MODE_ALL;
@@ -1323,7 +1325,8 @@ static int receive_pkts(struct test_spec *test)
return TEST_PASS;
}
-static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, bool timeout)
+static int __send_pkts(struct test_spec *test, struct ifobject *ifobject,
+ struct xsk_socket_info *xsk, bool timeout)
{
u32 i, idx = 0, valid_pkts = 0, valid_frags = 0, buffer_len;
struct pkt_stream *pkt_stream = xsk->pkt_stream;
@@ -1437,9 +1440,21 @@ static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, b
}
if (!timeout) {
+ int prev_tx_consumer;
+
+ if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE))
+ prev_tx_consumer = *xsk->tx.consumer;
+
if (complete_pkts(xsk, i))
return TEST_FAILURE;
+ if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE)) {
+ int delta = *xsk->tx.consumer - prev_tx_consumer;
+
+ if (delta != MAX_TX_BUDGET_DEFAULT)
+ return TEST_FAILURE;
+ }
+
usleep(10);
return TEST_PASS;
}
@@ -1492,7 +1507,7 @@ static int send_pkts(struct test_spec *test, struct ifobject *ifobject)
__set_bit(i, bitmap);
continue;
}
- ret = __send_pkts(ifobject, &ifobject->xsk_arr[i], timeout);
+ ret = __send_pkts(test, ifobject, &ifobject->xsk_arr[i], timeout);
if (ret == TEST_CONTINUE && !test->fail)
continue;
@@ -2613,6 +2628,16 @@ static int testapp_adjust_tail_grow_mb(struct test_spec *test)
XSK_UMEM__LARGE_FRAME_SIZE * 2);
}
+static int testapp_tx_queue_consumer(struct test_spec *test)
+{
+ int nr_packets = MAX_TX_BUDGET_DEFAULT + 1;
+
+ pkt_stream_replace(test, nr_packets, MIN_PKT_SIZE);
+ test->ifobj_tx->xsk->batch_size = nr_packets;
+
+ return testapp_validate_traffic(test);
+}
+
static void run_pkt_test(struct test_spec *test)
{
int ret;
@@ -2723,6 +2748,7 @@ static const struct test_spec tests[] = {
{.name = "XDP_ADJUST_TAIL_SHRINK_MULTI_BUFF", .test_func = testapp_adjust_tail_shrink_mb},
{.name = "XDP_ADJUST_TAIL_GROW", .test_func = testapp_adjust_tail_grow},
{.name = "XDP_ADJUST_TAIL_GROW_MULTI_BUFF", .test_func = testapp_adjust_tail_grow_mb},
+ {.name = "TX_QUEUE_CONSUMER", .test_func = testapp_tx_queue_consumer},
};
static void print_tests(void)
--
2.41.3
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v3 1/2] net: xsk: update tx queue consumer immediately after transmission
2025-06-25 10:10 ` [PATCH net-next v3 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
@ 2025-06-25 11:09 ` Maciej Fijalkowski
2025-06-25 12:49 ` Jason Xing
0 siblings, 1 reply; 11+ messages in thread
From: Maciej Fijalkowski @ 2025-06-25 11:09 UTC (permalink / raw)
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe,
willemdebruijn.kernel, bpf, netdev, Jason Xing
On Wed, Jun 25, 2025 at 06:10:13PM +0800, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> For afxdp, the return value of sendto() syscall doesn't reflect how many
> descs handled in the kernel. One of use cases is that when user-space
> application tries to know the number of transmitted skbs and then decides
> if it continues to send, say, is it stopped due to max tx budget?
>
> The following formular can be used after sending to learn how many
> skbs/descs the kernel takes care of:
>
> tx_queue.consumers_before - tx_queue.consumers_after
>
> Prior to the current patch, in non-zc mode, the consumer of tx queue is
> not immediately updated at the end of each sendto syscall when error
> occurs, which leads to the consumer value out-of-dated from the perspective
> of user space. So this patch requires store operation to pass the cached
> value to the shared value to handle the problem.
>
> More than those explicit errors appearing in the while() loop in
> __xsk_generic_xmit(), there are a few possible error cases that might
> be neglected in the following call trace:
> __xsk_generic_xmit()
> xskq_cons_peek_desc()
> xskq_cons_read_desc()
> xskq_cons_is_valid_desc()
> It will also cause the premature exit in the while() loop even if not
> all the descs are consumed.
>
> Based on the above analysis, using 'cached_prod != cached_cons' could
> cover all the possible cases because it represents there are remaining
> descs that are not handled and cached_cons are not updated to the global
> state of consumer at this time.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> v3
> Link: https://lore.kernel.org/all/20250623073129.23290-1-kerneljasonxing@gmail.com/
> 1. use xskq_has_descs helper.
> 2. add selftest
>
> V2
> Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
> 1. filter out those good cases because only those that return error need
> updates.
> Side note:
> 1. in non-batched zero copy mode, at the end of every caller of
> xsk_tx_peek_desc(), there is always a xsk_tx_release() function that used
> to update the local consumer to the global state of consumer. So for the
> zero copy mode, no need to change at all.
> 2. Actually I have no strong preference between v1 (see the above link)
> and v2 because smp_store_release() shouldn't cause side effect.
> Considering the exactitude of writing code, v2 is a more preferable
> one.
> ---
> net/xdp/xsk.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 5542675dffa9..ab6351b24ac8 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -856,6 +856,9 @@ static int __xsk_generic_xmit(struct sock *sk)
> }
>
> out:
> + if (xskq_has_descs(xs->tx))
> + __xskq_cons_release(xs->tx);
> +
> if (sent_frame)
> if (xsk_tx_writeable(xs))
> sk->sk_write_space(sk);
Hi Jason,
IMHO below should be enough to address the issue:
if (sent_frame) {
__xskq_cons_release(xs->tx);
if (xsk_tx_writeable(xs))
sk->sk_write_space(sk);
}
which basically is what xsk_tx_release() does for each tx socket in list.
zc drivers call it whenever there was a single descriptor produced to HW
ring. So should we on generic xmit side, based on @sent_frame.
We could even wrap these 3 lines onto internal function, say
__xsk_tx_release() and use it in xsk_tx_release() as well.
> --
> 2.41.3
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call
2025-06-25 10:10 ` [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call Jason Xing
@ 2025-06-25 12:19 ` Maciej Fijalkowski
2025-06-25 12:58 ` Jason Xing
0 siblings, 1 reply; 11+ messages in thread
From: Maciej Fijalkowski @ 2025-06-25 12:19 UTC (permalink / raw)
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe,
willemdebruijn.kernel, bpf, netdev, Jason Xing
On Wed, Jun 25, 2025 at 06:10:14PM +0800, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> The subtest sends 33 packets at one time on purpose to see if xsk
> exitting __xsk_generic_xmit() updates the global consumer of tx queue
> when reaching the max loop (max_tx_budget, 32 by default). The number 33
> can avoid xskq_cons_peek_desc() updates the consumer, to accurately
> check if the issue that the first patch resolves remains.
>
> Speaking of the selftest implementation, it's not possible to use the
> normal validation_func to check if the issue happens because the whole
> send packets logic will call the sendto multiple times such that we're
> unable to detect in time.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> tools/testing/selftests/bpf/xskxceiver.c | 30 ++++++++++++++++++++++--
> 1 file changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
> index 0ced4026ee44..f7aa83706bc7 100644
> --- a/tools/testing/selftests/bpf/xskxceiver.c
> +++ b/tools/testing/selftests/bpf/xskxceiver.c
> @@ -109,6 +109,8 @@
>
> #include <network_helpers.h>
>
> +#define MAX_TX_BUDGET_DEFAULT 32
and what if in the future you would increase the generic xmit budget on
the system? it would be better to wait with test addition when you
introduce the setsockopt patch.
plus keep in mind that xskxceiver tests ZC drivers as well. so either we
should have a test that serves all modes or keep it for skb mode only.
> +
> static bool opt_verbose;
> static bool opt_print_tests;
> static enum test_mode opt_mode = TEST_MODE_ALL;
> @@ -1323,7 +1325,8 @@ static int receive_pkts(struct test_spec *test)
> return TEST_PASS;
> }
>
> -static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, bool timeout)
> +static int __send_pkts(struct test_spec *test, struct ifobject *ifobject,
> + struct xsk_socket_info *xsk, bool timeout)
> {
> u32 i, idx = 0, valid_pkts = 0, valid_frags = 0, buffer_len;
> struct pkt_stream *pkt_stream = xsk->pkt_stream;
> @@ -1437,9 +1440,21 @@ static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, b
> }
>
> if (!timeout) {
> + int prev_tx_consumer;
> +
> + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE))
> + prev_tx_consumer = *xsk->tx.consumer;
> +
> if (complete_pkts(xsk, i))
> return TEST_FAILURE;
>
> + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE)) {
> + int delta = *xsk->tx.consumer - prev_tx_consumer;
hacking the data path logic for single test purpose is rather not good.
I am also not really sure if this deserves a standalone test case or could
we just introduce a check in data path in appropriate place.
> +
> + if (delta != MAX_TX_BUDGET_DEFAULT)
> + return TEST_FAILURE;
> + }
> +
> usleep(10);
> return TEST_PASS;
> }
> @@ -1492,7 +1507,7 @@ static int send_pkts(struct test_spec *test, struct ifobject *ifobject)
> __set_bit(i, bitmap);
> continue;
> }
> - ret = __send_pkts(ifobject, &ifobject->xsk_arr[i], timeout);
> + ret = __send_pkts(test, ifobject, &ifobject->xsk_arr[i], timeout);
> if (ret == TEST_CONTINUE && !test->fail)
> continue;
>
> @@ -2613,6 +2628,16 @@ static int testapp_adjust_tail_grow_mb(struct test_spec *test)
> XSK_UMEM__LARGE_FRAME_SIZE * 2);
> }
>
> +static int testapp_tx_queue_consumer(struct test_spec *test)
> +{
> + int nr_packets = MAX_TX_BUDGET_DEFAULT + 1;
> +
> + pkt_stream_replace(test, nr_packets, MIN_PKT_SIZE);
> + test->ifobj_tx->xsk->batch_size = nr_packets;
> +
> + return testapp_validate_traffic(test);
> +}
> +
> static void run_pkt_test(struct test_spec *test)
> {
> int ret;
> @@ -2723,6 +2748,7 @@ static const struct test_spec tests[] = {
> {.name = "XDP_ADJUST_TAIL_SHRINK_MULTI_BUFF", .test_func = testapp_adjust_tail_shrink_mb},
> {.name = "XDP_ADJUST_TAIL_GROW", .test_func = testapp_adjust_tail_grow},
> {.name = "XDP_ADJUST_TAIL_GROW_MULTI_BUFF", .test_func = testapp_adjust_tail_grow_mb},
> + {.name = "TX_QUEUE_CONSUMER", .test_func = testapp_tx_queue_consumer},
> };
>
> static void print_tests(void)
> --
> 2.41.3
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v3 1/2] net: xsk: update tx queue consumer immediately after transmission
2025-06-25 11:09 ` Maciej Fijalkowski
@ 2025-06-25 12:49 ` Jason Xing
0 siblings, 0 replies; 11+ messages in thread
From: Jason Xing @ 2025-06-25 12:49 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe,
willemdebruijn.kernel, bpf, netdev, Jason Xing
Hi Maciej,
On Wed, Jun 25, 2025 at 7:09 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Wed, Jun 25, 2025 at 06:10:13PM +0800, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > For afxdp, the return value of sendto() syscall doesn't reflect how many
> > descs handled in the kernel. One of use cases is that when user-space
> > application tries to know the number of transmitted skbs and then decides
> > if it continues to send, say, is it stopped due to max tx budget?
> >
> > The following formular can be used after sending to learn how many
> > skbs/descs the kernel takes care of:
> >
> > tx_queue.consumers_before - tx_queue.consumers_after
> >
> > Prior to the current patch, in non-zc mode, the consumer of tx queue is
> > not immediately updated at the end of each sendto syscall when error
> > occurs, which leads to the consumer value out-of-dated from the perspective
> > of user space. So this patch requires store operation to pass the cached
> > value to the shared value to handle the problem.
> >
> > More than those explicit errors appearing in the while() loop in
> > __xsk_generic_xmit(), there are a few possible error cases that might
> > be neglected in the following call trace:
> > __xsk_generic_xmit()
> > xskq_cons_peek_desc()
> > xskq_cons_read_desc()
> > xskq_cons_is_valid_desc()
> > It will also cause the premature exit in the while() loop even if not
> > all the descs are consumed.
> >
> > Based on the above analysis, using 'cached_prod != cached_cons' could
> > cover all the possible cases because it represents there are remaining
> > descs that are not handled and cached_cons are not updated to the global
> > state of consumer at this time.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> > v3
> > Link: https://lore.kernel.org/all/20250623073129.23290-1-kerneljasonxing@gmail.com/
> > 1. use xskq_has_descs helper.
> > 2. add selftest
> >
> > V2
> > Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
> > 1. filter out those good cases because only those that return error need
> > updates.
> > Side note:
> > 1. in non-batched zero copy mode, at the end of every caller of
> > xsk_tx_peek_desc(), there is always a xsk_tx_release() function that used
> > to update the local consumer to the global state of consumer. So for the
> > zero copy mode, no need to change at all.
> > 2. Actually I have no strong preference between v1 (see the above link)
> > and v2 because smp_store_release() shouldn't cause side effect.
> > Considering the exactitude of writing code, v2 is a more preferable
> > one.
> > ---
> > net/xdp/xsk.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index 5542675dffa9..ab6351b24ac8 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -856,6 +856,9 @@ static int __xsk_generic_xmit(struct sock *sk)
> > }
> >
> > out:
> > + if (xskq_has_descs(xs->tx))
> > + __xskq_cons_release(xs->tx);
> > +
> > if (sent_frame)
> > if (xsk_tx_writeable(xs))
> > sk->sk_write_space(sk);
>
> Hi Jason,
> IMHO below should be enough to address the issue:
Sure, it can.
Can I ask one more thing? Technically it's not considered a bug,
right? I'm not sure if it's worth telling the stable team to backport
in older versions.
>
> if (sent_frame) {
Using this condition means the consumer is updated in majority cases
including those good cases [1]. The intention of the current patch is
to update the consumer only when the error occurs because in other
cases xskq_cons_peek_desc() does it.
[1]: https://lore.kernel.org/all/aFVr60tw3QJopcOo@mini-arch/
> __xskq_cons_release(xs->tx);
> if (xsk_tx_writeable(xs))
> sk->sk_write_space(sk);
> }
>
> which basically is what xsk_tx_release() does for each tx socket in list.
> zc drivers call it whenever there was a single descriptor produced to HW
> ring. So should we on generic xmit side, based on @sent_frame.
As you said, they would be the same :)
>
> We could even wrap these 3 lines onto internal function, say
> __xsk_tx_release() and use it in xsk_tx_release() as well.
I can do it in the next respin.
But I have no obvious opinion on how to write it. If no one is opposed
to the taste of patch, I will follow your advice. Thanks.
Thanks,
Jason
>
> > --
> > 2.41.3
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call
2025-06-25 12:19 ` Maciej Fijalkowski
@ 2025-06-25 12:58 ` Jason Xing
2025-06-25 15:00 ` Stanislav Fomichev
0 siblings, 1 reply; 11+ messages in thread
From: Jason Xing @ 2025-06-25 12:58 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe,
willemdebruijn.kernel, bpf, netdev, Jason Xing
On Wed, Jun 25, 2025 at 8:19 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Wed, Jun 25, 2025 at 06:10:14PM +0800, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > The subtest sends 33 packets at one time on purpose to see if xsk
> > exitting __xsk_generic_xmit() updates the global consumer of tx queue
> > when reaching the max loop (max_tx_budget, 32 by default). The number 33
> > can avoid xskq_cons_peek_desc() updates the consumer, to accurately
> > check if the issue that the first patch resolves remains.
> >
> > Speaking of the selftest implementation, it's not possible to use the
> > normal validation_func to check if the issue happens because the whole
> > send packets logic will call the sendto multiple times such that we're
> > unable to detect in time.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> > tools/testing/selftests/bpf/xskxceiver.c | 30 ++++++++++++++++++++++--
> > 1 file changed, 28 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
> > index 0ced4026ee44..f7aa83706bc7 100644
> > --- a/tools/testing/selftests/bpf/xskxceiver.c
> > +++ b/tools/testing/selftests/bpf/xskxceiver.c
> > @@ -109,6 +109,8 @@
> >
> > #include <network_helpers.h>
> >
> > +#define MAX_TX_BUDGET_DEFAULT 32
>
> and what if in the future you would increase the generic xmit budget on
> the system? it would be better to wait with test addition when you
> introduce the setsockopt patch.
>
> plus keep in mind that xskxceiver tests ZC drivers as well. so either we
> should have a test that serves all modes or keep it for skb mode only.
>
> > +
> > static bool opt_verbose;
> > static bool opt_print_tests;
> > static enum test_mode opt_mode = TEST_MODE_ALL;
> > @@ -1323,7 +1325,8 @@ static int receive_pkts(struct test_spec *test)
> > return TEST_PASS;
> > }
> >
> > -static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, bool timeout)
> > +static int __send_pkts(struct test_spec *test, struct ifobject *ifobject,
> > + struct xsk_socket_info *xsk, bool timeout)
> > {
> > u32 i, idx = 0, valid_pkts = 0, valid_frags = 0, buffer_len;
> > struct pkt_stream *pkt_stream = xsk->pkt_stream;
> > @@ -1437,9 +1440,21 @@ static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, b
> > }
> >
> > if (!timeout) {
> > + int prev_tx_consumer;
> > +
> > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE))
> > + prev_tx_consumer = *xsk->tx.consumer;
> > +
> > if (complete_pkts(xsk, i))
> > return TEST_FAILURE;
> >
> > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE)) {
> > + int delta = *xsk->tx.consumer - prev_tx_consumer;
>
> hacking the data path logic for single test purpose is rather not good.
> I am also not really sure if this deserves a standalone test case or could
> we just introduce a check in data path in appropriate place.
The big headache is that if we expect to detect such a case, we have
to re-invent a similar send packet logic or hack the data path (a bit
like this patch). I admit it's ugly as I mentioned yesterday.
Sorry, Stanislav, no offense here. If you read this, please don't
blame me. I know you wish me to add one related test case. So here we
are. Since Maciej brought up the similar thought, I keep wondering if
we should give up such a standalone test patch? Honestly it already
involved more time than expected. The primary reason for me is that
the issue doesn't cause much trouble to the application.
Thanks,
Jason
>
> > +
> > + if (delta != MAX_TX_BUDGET_DEFAULT)
> > + return TEST_FAILURE;
> > + }
> > +
> > usleep(10);
> > return TEST_PASS;
> > }
> > @@ -1492,7 +1507,7 @@ static int send_pkts(struct test_spec *test, struct ifobject *ifobject)
> > __set_bit(i, bitmap);
> > continue;
> > }
> > - ret = __send_pkts(ifobject, &ifobject->xsk_arr[i], timeout);
> > + ret = __send_pkts(test, ifobject, &ifobject->xsk_arr[i], timeout);
> > if (ret == TEST_CONTINUE && !test->fail)
> > continue;
> >
> > @@ -2613,6 +2628,16 @@ static int testapp_adjust_tail_grow_mb(struct test_spec *test)
> > XSK_UMEM__LARGE_FRAME_SIZE * 2);
> > }
> >
> > +static int testapp_tx_queue_consumer(struct test_spec *test)
> > +{
> > + int nr_packets = MAX_TX_BUDGET_DEFAULT + 1;
> > +
> > + pkt_stream_replace(test, nr_packets, MIN_PKT_SIZE);
> > + test->ifobj_tx->xsk->batch_size = nr_packets;
> > +
> > + return testapp_validate_traffic(test);
> > +}
> > +
> > static void run_pkt_test(struct test_spec *test)
> > {
> > int ret;
> > @@ -2723,6 +2748,7 @@ static const struct test_spec tests[] = {
> > {.name = "XDP_ADJUST_TAIL_SHRINK_MULTI_BUFF", .test_func = testapp_adjust_tail_shrink_mb},
> > {.name = "XDP_ADJUST_TAIL_GROW", .test_func = testapp_adjust_tail_grow},
> > {.name = "XDP_ADJUST_TAIL_GROW_MULTI_BUFF", .test_func = testapp_adjust_tail_grow_mb},
> > + {.name = "TX_QUEUE_CONSUMER", .test_func = testapp_tx_queue_consumer},
> > };
> >
> > static void print_tests(void)
> > --
> > 2.41.3
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call
2025-06-25 12:58 ` Jason Xing
@ 2025-06-25 15:00 ` Stanislav Fomichev
2025-06-25 15:35 ` Jason Xing
0 siblings, 1 reply; 11+ messages in thread
From: Stanislav Fomichev @ 2025-06-25 15:00 UTC (permalink / raw)
To: Jason Xing
Cc: Maciej Fijalkowski, davem, edumazet, kuba, pabeni, bjorn,
magnus.karlsson, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel, bpf, netdev,
Jason Xing
On 06/25, Jason Xing wrote:
> On Wed, Jun 25, 2025 at 8:19 PM Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > On Wed, Jun 25, 2025 at 06:10:14PM +0800, Jason Xing wrote:
> > > From: Jason Xing <kernelxing@tencent.com>
> > >
> > > The subtest sends 33 packets at one time on purpose to see if xsk
> > > exitting __xsk_generic_xmit() updates the global consumer of tx queue
> > > when reaching the max loop (max_tx_budget, 32 by default). The number 33
> > > can avoid xskq_cons_peek_desc() updates the consumer, to accurately
> > > check if the issue that the first patch resolves remains.
> > >
> > > Speaking of the selftest implementation, it's not possible to use the
> > > normal validation_func to check if the issue happens because the whole
> > > send packets logic will call the sendto multiple times such that we're
> > > unable to detect in time.
> > >
> > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > ---
> > > tools/testing/selftests/bpf/xskxceiver.c | 30 ++++++++++++++++++++++--
> > > 1 file changed, 28 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
> > > index 0ced4026ee44..f7aa83706bc7 100644
> > > --- a/tools/testing/selftests/bpf/xskxceiver.c
> > > +++ b/tools/testing/selftests/bpf/xskxceiver.c
> > > @@ -109,6 +109,8 @@
> > >
> > > #include <network_helpers.h>
> > >
> > > +#define MAX_TX_BUDGET_DEFAULT 32
> >
> > and what if in the future you would increase the generic xmit budget on
> > the system? it would be better to wait with test addition when you
> > introduce the setsockopt patch.
We can always update it to follow new budget. The purpose of the test
is to document/verify userspace expectations. Sincle even with the
setsockopt we are still gonna have the default budget.
> > plus keep in mind that xskxceiver tests ZC drivers as well. so either we
> > should have a test that serves all modes or keep it for skb mode only.
> >
> > > +
> > > static bool opt_verbose;
> > > static bool opt_print_tests;
> > > static enum test_mode opt_mode = TEST_MODE_ALL;
> > > @@ -1323,7 +1325,8 @@ static int receive_pkts(struct test_spec *test)
> > > return TEST_PASS;
> > > }
> > >
> > > -static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, bool timeout)
> > > +static int __send_pkts(struct test_spec *test, struct ifobject *ifobject,
> > > + struct xsk_socket_info *xsk, bool timeout)
> > > {
> > > u32 i, idx = 0, valid_pkts = 0, valid_frags = 0, buffer_len;
> > > struct pkt_stream *pkt_stream = xsk->pkt_stream;
> > > @@ -1437,9 +1440,21 @@ static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, b
> > > }
> > >
> > > if (!timeout) {
> > > + int prev_tx_consumer;
> > > +
> > > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE))
> > > + prev_tx_consumer = *xsk->tx.consumer;
> > > +
> > > if (complete_pkts(xsk, i))
> > > return TEST_FAILURE;
> > >
> > > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE)) {
> > > + int delta = *xsk->tx.consumer - prev_tx_consumer;
> >
> > hacking the data path logic for single test purpose is rather not good.
> > I am also not really sure if this deserves a standalone test case or could
> > we just introduce a check in data path in appropriate place.
>
> The big headache is that if we expect to detect such a case, we have
> to re-invent a similar send packet logic or hack the data path (a bit
> like this patch). I admit it's ugly as I mentioned yesterday.
>
> Sorry, Stanislav, no offense here. If you read this, please don't
> blame me. I know you wish me to add one related test case. So here we
> are. Since Maciej brought up the similar thought, I keep wondering if
> we should give up such a standalone test patch? Honestly it already
> involved more time than expected. The primary reason for me is that
> the issue doesn't cause much trouble to the application.
IIUC, Maciej does not suggest to completely drop the test but rather
to move this check (unconditionally and only for skb mode) somewhere
into __send_pkts/complete_pkts to make sure the number of completed
packets is always <= budget. Maciej correct me if I misread..
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call
2025-06-25 15:00 ` Stanislav Fomichev
@ 2025-06-25 15:35 ` Jason Xing
2025-06-25 23:41 ` Stanislav Fomichev
0 siblings, 1 reply; 11+ messages in thread
From: Jason Xing @ 2025-06-25 15:35 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Maciej Fijalkowski, davem, edumazet, kuba, pabeni, bjorn,
magnus.karlsson, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel, bpf, netdev,
Jason Xing
On Wed, Jun 25, 2025 at 11:00 PM Stanislav Fomichev
<stfomichev@gmail.com> wrote:
>
> On 06/25, Jason Xing wrote:
> > On Wed, Jun 25, 2025 at 8:19 PM Maciej Fijalkowski
> > <maciej.fijalkowski@intel.com> wrote:
> > >
> > > On Wed, Jun 25, 2025 at 06:10:14PM +0800, Jason Xing wrote:
> > > > From: Jason Xing <kernelxing@tencent.com>
> > > >
> > > > The subtest sends 33 packets at one time on purpose to see if xsk
> > > > exitting __xsk_generic_xmit() updates the global consumer of tx queue
> > > > when reaching the max loop (max_tx_budget, 32 by default). The number 33
> > > > can avoid xskq_cons_peek_desc() updates the consumer, to accurately
> > > > check if the issue that the first patch resolves remains.
> > > >
> > > > Speaking of the selftest implementation, it's not possible to use the
> > > > normal validation_func to check if the issue happens because the whole
> > > > send packets logic will call the sendto multiple times such that we're
> > > > unable to detect in time.
> > > >
> > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > ---
> > > > tools/testing/selftests/bpf/xskxceiver.c | 30 ++++++++++++++++++++++--
> > > > 1 file changed, 28 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
> > > > index 0ced4026ee44..f7aa83706bc7 100644
> > > > --- a/tools/testing/selftests/bpf/xskxceiver.c
> > > > +++ b/tools/testing/selftests/bpf/xskxceiver.c
> > > > @@ -109,6 +109,8 @@
> > > >
> > > > #include <network_helpers.h>
> > > >
> > > > +#define MAX_TX_BUDGET_DEFAULT 32
> > >
> > > and what if in the future you would increase the generic xmit budget on
> > > the system? it would be better to wait with test addition when you
> > > introduce the setsockopt patch.
>
> We can always update it to follow new budget. The purpose of the test
> is to document/verify userspace expectations. Sincle even with the
> setsockopt we are still gonna have the default budget.
>
> > > plus keep in mind that xskxceiver tests ZC drivers as well. so either we
> > > should have a test that serves all modes or keep it for skb mode only.
> > >
> > > > +
> > > > static bool opt_verbose;
> > > > static bool opt_print_tests;
> > > > static enum test_mode opt_mode = TEST_MODE_ALL;
> > > > @@ -1323,7 +1325,8 @@ static int receive_pkts(struct test_spec *test)
> > > > return TEST_PASS;
> > > > }
> > > >
> > > > -static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, bool timeout)
> > > > +static int __send_pkts(struct test_spec *test, struct ifobject *ifobject,
> > > > + struct xsk_socket_info *xsk, bool timeout)
> > > > {
> > > > u32 i, idx = 0, valid_pkts = 0, valid_frags = 0, buffer_len;
> > > > struct pkt_stream *pkt_stream = xsk->pkt_stream;
> > > > @@ -1437,9 +1440,21 @@ static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, b
> > > > }
> > > >
> > > > if (!timeout) {
> > > > + int prev_tx_consumer;
> > > > +
> > > > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE))
> > > > + prev_tx_consumer = *xsk->tx.consumer;
> > > > +
> > > > if (complete_pkts(xsk, i))
> > > > return TEST_FAILURE;
> > > >
> > > > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE)) {
> > > > + int delta = *xsk->tx.consumer - prev_tx_consumer;
> > >
> > > hacking the data path logic for single test purpose is rather not good.
> > > I am also not really sure if this deserves a standalone test case or could
> > > we just introduce a check in data path in appropriate place.
> >
> > The big headache is that if we expect to detect such a case, we have
> > to re-invent a similar send packet logic or hack the data path (a bit
> > like this patch). I admit it's ugly as I mentioned yesterday.
> >
> > Sorry, Stanislav, no offense here. If you read this, please don't
> > blame me. I know you wish me to add one related test case. So here we
> > are. Since Maciej brought up the similar thought, I keep wondering if
> > we should give up such a standalone test patch? Honestly it already
> > involved more time than expected. The primary reason for me is that
> > the issue doesn't cause much trouble to the application.
>
> IIUC, Maciej does not suggest to completely drop the test but rather
> to move this check (unconditionally and only for skb mode) somewhere
I prefer the former: make it suitable for all the cases. Whether it's
zero copy mode or non-zc one, the behaviour of the consumer should be
the same: update when finishing sending packets. I will give it a try
again tomorrow since it's too late for me right now. Sorry.
Thanks,
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call
2025-06-25 15:35 ` Jason Xing
@ 2025-06-25 23:41 ` Stanislav Fomichev
2025-06-25 23:55 ` Jason Xing
0 siblings, 1 reply; 11+ messages in thread
From: Stanislav Fomichev @ 2025-06-25 23:41 UTC (permalink / raw)
To: Jason Xing
Cc: Maciej Fijalkowski, davem, edumazet, kuba, pabeni, bjorn,
magnus.karlsson, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel, bpf, netdev,
Jason Xing
On 06/25, Jason Xing wrote:
> On Wed, Jun 25, 2025 at 11:00 PM Stanislav Fomichev
> <stfomichev@gmail.com> wrote:
> >
> > On 06/25, Jason Xing wrote:
> > > On Wed, Jun 25, 2025 at 8:19 PM Maciej Fijalkowski
> > > <maciej.fijalkowski@intel.com> wrote:
> > > >
> > > > On Wed, Jun 25, 2025 at 06:10:14PM +0800, Jason Xing wrote:
> > > > > From: Jason Xing <kernelxing@tencent.com>
> > > > >
> > > > > The subtest sends 33 packets at one time on purpose to see if xsk
> > > > > exitting __xsk_generic_xmit() updates the global consumer of tx queue
> > > > > when reaching the max loop (max_tx_budget, 32 by default). The number 33
> > > > > can avoid xskq_cons_peek_desc() updates the consumer, to accurately
> > > > > check if the issue that the first patch resolves remains.
> > > > >
> > > > > Speaking of the selftest implementation, it's not possible to use the
> > > > > normal validation_func to check if the issue happens because the whole
> > > > > send packets logic will call the sendto multiple times such that we're
> > > > > unable to detect in time.
> > > > >
> > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > > ---
> > > > > tools/testing/selftests/bpf/xskxceiver.c | 30 ++++++++++++++++++++++--
> > > > > 1 file changed, 28 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
> > > > > index 0ced4026ee44..f7aa83706bc7 100644
> > > > > --- a/tools/testing/selftests/bpf/xskxceiver.c
> > > > > +++ b/tools/testing/selftests/bpf/xskxceiver.c
> > > > > @@ -109,6 +109,8 @@
> > > > >
> > > > > #include <network_helpers.h>
> > > > >
> > > > > +#define MAX_TX_BUDGET_DEFAULT 32
> > > >
> > > > and what if in the future you would increase the generic xmit budget on
> > > > the system? it would be better to wait with test addition when you
> > > > introduce the setsockopt patch.
> >
> > We can always update it to follow new budget. The purpose of the test
> > is to document/verify userspace expectations. Sincle even with the
> > setsockopt we are still gonna have the default budget.
> >
> > > > plus keep in mind that xskxceiver tests ZC drivers as well. so either we
> > > > should have a test that serves all modes or keep it for skb mode only.
> > > >
> > > > > +
> > > > > static bool opt_verbose;
> > > > > static bool opt_print_tests;
> > > > > static enum test_mode opt_mode = TEST_MODE_ALL;
> > > > > @@ -1323,7 +1325,8 @@ static int receive_pkts(struct test_spec *test)
> > > > > return TEST_PASS;
> > > > > }
> > > > >
> > > > > -static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, bool timeout)
> > > > > +static int __send_pkts(struct test_spec *test, struct ifobject *ifobject,
> > > > > + struct xsk_socket_info *xsk, bool timeout)
> > > > > {
> > > > > u32 i, idx = 0, valid_pkts = 0, valid_frags = 0, buffer_len;
> > > > > struct pkt_stream *pkt_stream = xsk->pkt_stream;
> > > > > @@ -1437,9 +1440,21 @@ static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, b
> > > > > }
> > > > >
> > > > > if (!timeout) {
> > > > > + int prev_tx_consumer;
> > > > > +
> > > > > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE))
> > > > > + prev_tx_consumer = *xsk->tx.consumer;
> > > > > +
> > > > > if (complete_pkts(xsk, i))
> > > > > return TEST_FAILURE;
> > > > >
> > > > > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE)) {
> > > > > + int delta = *xsk->tx.consumer - prev_tx_consumer;
> > > >
> > > > hacking the data path logic for single test purpose is rather not good.
> > > > I am also not really sure if this deserves a standalone test case or could
> > > > we just introduce a check in data path in appropriate place.
> > >
> > > The big headache is that if we expect to detect such a case, we have
> > > to re-invent a similar send packet logic or hack the data path (a bit
> > > like this patch). I admit it's ugly as I mentioned yesterday.
> > >
> > > Sorry, Stanislav, no offense here. If you read this, please don't
> > > blame me. I know you wish me to add one related test case. So here we
> > > are. Since Maciej brought up the similar thought, I keep wondering if
> > > we should give up such a standalone test patch? Honestly it already
> > > involved more time than expected. The primary reason for me is that
> > > the issue doesn't cause much trouble to the application.
> >
> > IIUC, Maciej does not suggest to completely drop the test but rather
> > to move this check (unconditionally and only for skb mode) somewhere
>
> I prefer the former: make it suitable for all the cases. Whether it's
> zero copy mode or non-zc one, the behaviour of the consumer should be
But it does not work the same in zc and non-zc modes :-( There are
a bunch of (uncodumented) quirks here and there. With a test case we
can at least highlight/document them.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call
2025-06-25 23:41 ` Stanislav Fomichev
@ 2025-06-25 23:55 ` Jason Xing
0 siblings, 0 replies; 11+ messages in thread
From: Jason Xing @ 2025-06-25 23:55 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Maciej Fijalkowski, davem, edumazet, kuba, pabeni, bjorn,
magnus.karlsson, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend, joe, willemdebruijn.kernel, bpf, netdev,
Jason Xing
On Thu, Jun 26, 2025 at 7:41 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
>
> On 06/25, Jason Xing wrote:
> > On Wed, Jun 25, 2025 at 11:00 PM Stanislav Fomichev
> > <stfomichev@gmail.com> wrote:
> > >
> > > On 06/25, Jason Xing wrote:
> > > > On Wed, Jun 25, 2025 at 8:19 PM Maciej Fijalkowski
> > > > <maciej.fijalkowski@intel.com> wrote:
> > > > >
> > > > > On Wed, Jun 25, 2025 at 06:10:14PM +0800, Jason Xing wrote:
> > > > > > From: Jason Xing <kernelxing@tencent.com>
> > > > > >
> > > > > > The subtest sends 33 packets at one time on purpose to see if xsk
> > > > > > exitting __xsk_generic_xmit() updates the global consumer of tx queue
> > > > > > when reaching the max loop (max_tx_budget, 32 by default). The number 33
> > > > > > can avoid xskq_cons_peek_desc() updates the consumer, to accurately
> > > > > > check if the issue that the first patch resolves remains.
> > > > > >
> > > > > > Speaking of the selftest implementation, it's not possible to use the
> > > > > > normal validation_func to check if the issue happens because the whole
> > > > > > send packets logic will call the sendto multiple times such that we're
> > > > > > unable to detect in time.
> > > > > >
> > > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > > > ---
> > > > > > tools/testing/selftests/bpf/xskxceiver.c | 30 ++++++++++++++++++++++--
> > > > > > 1 file changed, 28 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
> > > > > > index 0ced4026ee44..f7aa83706bc7 100644
> > > > > > --- a/tools/testing/selftests/bpf/xskxceiver.c
> > > > > > +++ b/tools/testing/selftests/bpf/xskxceiver.c
> > > > > > @@ -109,6 +109,8 @@
> > > > > >
> > > > > > #include <network_helpers.h>
> > > > > >
> > > > > > +#define MAX_TX_BUDGET_DEFAULT 32
> > > > >
> > > > > and what if in the future you would increase the generic xmit budget on
> > > > > the system? it would be better to wait with test addition when you
> > > > > introduce the setsockopt patch.
> > >
> > > We can always update it to follow new budget. The purpose of the test
> > > is to document/verify userspace expectations. Sincle even with the
> > > setsockopt we are still gonna have the default budget.
> > >
> > > > > plus keep in mind that xskxceiver tests ZC drivers as well. so either we
> > > > > should have a test that serves all modes or keep it for skb mode only.
> > > > >
> > > > > > +
> > > > > > static bool opt_verbose;
> > > > > > static bool opt_print_tests;
> > > > > > static enum test_mode opt_mode = TEST_MODE_ALL;
> > > > > > @@ -1323,7 +1325,8 @@ static int receive_pkts(struct test_spec *test)
> > > > > > return TEST_PASS;
> > > > > > }
> > > > > >
> > > > > > -static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, bool timeout)
> > > > > > +static int __send_pkts(struct test_spec *test, struct ifobject *ifobject,
> > > > > > + struct xsk_socket_info *xsk, bool timeout)
> > > > > > {
> > > > > > u32 i, idx = 0, valid_pkts = 0, valid_frags = 0, buffer_len;
> > > > > > struct pkt_stream *pkt_stream = xsk->pkt_stream;
> > > > > > @@ -1437,9 +1440,21 @@ static int __send_pkts(struct ifobject *ifobject, struct xsk_socket_info *xsk, b
> > > > > > }
> > > > > >
> > > > > > if (!timeout) {
> > > > > > + int prev_tx_consumer;
> > > > > > +
> > > > > > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE))
> > > > > > + prev_tx_consumer = *xsk->tx.consumer;
> > > > > > +
> > > > > > if (complete_pkts(xsk, i))
> > > > > > return TEST_FAILURE;
> > > > > >
> > > > > > + if (!strncmp("TX_QUEUE_CONSUMER", test->name, MAX_TEST_NAME_SIZE)) {
> > > > > > + int delta = *xsk->tx.consumer - prev_tx_consumer;
> > > > >
> > > > > hacking the data path logic for single test purpose is rather not good.
> > > > > I am also not really sure if this deserves a standalone test case or could
> > > > > we just introduce a check in data path in appropriate place.
> > > >
> > > > The big headache is that if we expect to detect such a case, we have
> > > > to re-invent a similar send packet logic or hack the data path (a bit
> > > > like this patch). I admit it's ugly as I mentioned yesterday.
> > > >
> > > > Sorry, Stanislav, no offense here. If you read this, please don't
> > > > blame me. I know you wish me to add one related test case. So here we
> > > > are. Since Maciej brought up the similar thought, I keep wondering if
> > > > we should give up such a standalone test patch? Honestly it already
> > > > involved more time than expected. The primary reason for me is that
> > > > the issue doesn't cause much trouble to the application.
> > >
> > > IIUC, Maciej does not suggest to completely drop the test but rather
> > > to move this check (unconditionally and only for skb mode) somewhere
> >
> > I prefer the former: make it suitable for all the cases. Whether it's
> > zero copy mode or non-zc one, the behaviour of the consumer should be
>
> But it does not work the same in zc and non-zc modes :-( There are
> a bunch of (uncodumented) quirks here and there. With a test case we
> can at least highlight/document them.
Oh, sorry, I didn't explain myself well. I was trying to say I would
add in the data path so that the new test patch can easily recognize
the problem for generic xmit path (in either mode just for
compatibility).
Do you have any good advice to continue to do such a thing you mentioned?
Thanks,
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-06-25 23:55 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-25 10:10 [PATCH net-next v3 0/2] net: xsk: update tx queue consumer Jason Xing
2025-06-25 10:10 ` [PATCH net-next v3 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
2025-06-25 11:09 ` Maciej Fijalkowski
2025-06-25 12:49 ` Jason Xing
2025-06-25 10:10 ` [PATCH net-next v3 2/2] selftests/bpf: check if the global consumer of tx queue updates after send call Jason Xing
2025-06-25 12:19 ` Maciej Fijalkowski
2025-06-25 12:58 ` Jason Xing
2025-06-25 15:00 ` Stanislav Fomichev
2025-06-25 15:35 ` Jason Xing
2025-06-25 23:41 ` Stanislav Fomichev
2025-06-25 23:55 ` Jason Xing
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).