* [PATCH net-next v6 0/2] net: xsk: update tx queue consumer
@ 2025-07-03 14:17 Jason Xing
  2025-07-03 14:17 ` [PATCH net-next v6 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Jason Xing @ 2025-07-03 14:17 UTC
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson, maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe, willemdebruijn.kernel
Cc: bpf, netdev, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

Patch 1 makes sure the consumer is updated at the end of generic xmit.
Patch 2 adds the corresponding test.

Jason Xing (2):
  net: xsk: update tx queue consumer immediately after transmission
  selftests/bpf: add a new test to check the consumer update case

 net/xdp/xsk.c                            | 17 ++++---
 tools/testing/selftests/bpf/xskxceiver.c | 56 +++++++++++++++++++++++-
 tools/testing/selftests/bpf/xskxceiver.h |  1 +
 3 files changed, 66 insertions(+), 8 deletions(-)

--
2.41.3

* [PATCH net-next v6 1/2] net: xsk: update tx queue consumer immediately after transmission
  2025-07-03 14:17 [PATCH net-next v6 0/2] net: xsk: update tx queue consumer Jason Xing
@ 2025-07-03 14:17 ` Jason Xing
  2025-07-03 14:17 ` [PATCH net-next v6 2/2] selftests/bpf: add a new test to check the consumer update case Jason Xing
  2025-07-09  1:40 ` [PATCH net-next v6 0/2] net: xsk: update tx queue consumer patchwork-bot+netdevbpf
  2 siblings, 0 replies; 6+ messages in thread
From: Jason Xing @ 2025-07-03 14:17 UTC
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson, maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe, willemdebruijn.kernel
Cc: bpf, netdev, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

For AF_XDP, the return value of the sendto() syscall doesn't reflect how
many descs were handled in the kernel. One use case is a user-space
application that wants to know the number of transmitted skbs in order
to decide whether to continue sending, e.g. was transmission stopped
because the max tx budget was hit?

The following formula can be used after sending to learn how many
skbs/descs the kernel took care of:

  tx_queue.consumers_after - tx_queue.consumers_before

Prior to this patch, in non-zc mode, the consumer of the tx queue is not
updated immediately at the end of each sendto syscall when an error
occurs, which leaves the consumer value outdated from the perspective of
user space. So this patch adds a store operation that passes the cached
value to the shared value to handle the problem.

Besides the explicit errors appearing in the while() loop in
__xsk_generic_xmit(), there are a few possible error cases that might
be overlooked in the following call trace:

__xsk_generic_xmit()
    xskq_cons_peek_desc()
        xskq_cons_read_desc()
            xskq_cons_is_valid_desc()

They also cause a premature exit from the while() loop even if not all
the descs are consumed.

Based on the above analysis, using @sent_frame covers all the possible
cases that might leave the global state of the consumer outdated after
finishing __xsk_generic_xmit().

The patch also adds a common helper __xsk_tx_release() to stay aligned
with the zc mode usage in xsk_tx_release().

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
---
v5
Link: https://lore.kernel.org/all/20250627085745.53173-1-kerneljasonxing@gmail.com/
1. add acked-by tags

v4
Link: https://lore.kernel.org/all/20250625101014.45066-1-kerneljasonxing@gmail.com/
1. use the common helper
2. stay aligned with the zc mode usage in xsk_tx_release()

v3
Link: https://lore.kernel.org/all/20250623073129.23290-1-kerneljasonxing@gmail.com/
1. use xskq_has_descs helper.
2. add selftest

v2
Link: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/
1. filter out the good cases because only those that return an error
need updates.

Side note:
1. In non-batched zero copy mode, at the end of every caller of
xsk_tx_peek_desc(), there is always a xsk_tx_release() call that
updates the global state of the consumer from the local consumer. So
the zero copy mode needs no change at all.
2. Actually I have no strong preference between v1 (see the above link)
and v2 because smp_store_release() shouldn't cause side effects. For the
sake of precise code, v2 is preferable.
---
 net/xdp/xsk.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 72c000c0ae5f..bd61b0bc9c24 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -300,6 +300,13 @@ static bool xsk_tx_writeable(struct xdp_sock *xs)
 	return true;
 }
 
+static void __xsk_tx_release(struct xdp_sock *xs)
+{
+	__xskq_cons_release(xs->tx);
+	if (xsk_tx_writeable(xs))
+		xs->sk.sk_write_space(&xs->sk);
+}
+
 static bool xsk_is_bound(struct xdp_sock *xs)
 {
 	if (READ_ONCE(xs->state) == XSK_BOUND) {
@@ -407,11 +414,8 @@ void xsk_tx_release(struct xsk_buff_pool *pool)
 	struct xdp_sock *xs;
 
 	rcu_read_lock();
-	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
-		__xskq_cons_release(xs->tx);
-		if (xsk_tx_writeable(xs))
-			xs->sk.sk_write_space(&xs->sk);
-	}
+	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list)
+		__xsk_tx_release(xs);
 	rcu_read_unlock();
 }
 EXPORT_SYMBOL(xsk_tx_release);
@@ -858,8 +862,7 @@ static int __xsk_generic_xmit(struct sock *sk)
 
 out:
 	if (sent_frame)
-		if (xsk_tx_writeable(xs))
-			sk->sk_write_space(sk);
+		__xsk_tx_release(xs);
 
 	mutex_unlock(&xs->mutex);
 	return err;
--
2.41.3

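The accounting that the commit message of patch 1 describes can be sketched from the user-space side as follows. This is only an illustration under stated assumptions: the application is assumed to hold a pointer to the TX ring's shared consumer index (comparable to xsk->tx.consumer in the selftest), and the helper names ring_index_load() and tx_descs_consumed() are hypothetical, not part of the kernel or of any library. The consumer is snapshotted before and after sendto(), and the difference is taken with unsigned arithmetic so that a wrap of the 32-bit index does not distort the result.

```c
#include <stdint.h>

/* Hypothetical helper: acquire-load of a shared ring index, the same
 * kind of load the selftest performs with __atomic_load_n().
 */
static uint32_t ring_index_load(uint32_t *idx)
{
	return __atomic_load_n(idx, __ATOMIC_ACQUIRE);
}

/* Hypothetical helper: number of descs the kernel consumed between two
 * snapshots of the consumer index. The index only moves forward, so the
 * count is "after - before"; unsigned subtraction stays correct even if
 * the 32-bit index wrapped between the snapshots.
 */
static uint32_t tx_descs_consumed(uint32_t cons_before, uint32_t cons_after)
{
	return cons_after - cons_before;
}
```

An application would call ring_index_load() on the consumer index, issue sendto(), call ring_index_load() again, and pass both snapshots to tx_descs_consumed(). If the result is smaller than the number of descs it had queued, the kernel stopped early, for example because the tx budget was exhausted.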
* [PATCH net-next v6 2/2] selftests/bpf: add a new test to check the consumer update case
  2025-07-03 14:17 [PATCH net-next v6 0/2] net: xsk: update tx queue consumer Jason Xing
  2025-07-03 14:17 ` [PATCH net-next v6 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
@ 2025-07-03 14:17 ` Jason Xing
  2025-07-03 15:10   ` Maciej Fijalkowski
  2025-07-03 15:45   ` Stanislav Fomichev
  2025-07-09  1:40 ` [PATCH net-next v6 0/2] net: xsk: update tx queue consumer patchwork-bot+netdevbpf
  2 siblings, 2 replies; 6+ messages in thread
From: Jason Xing @ 2025-07-03 14:17 UTC
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson, maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe, willemdebruijn.kernel
Cc: bpf, netdev, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

The subtest deliberately sends 33 packets at one time to see whether xsk,
when exiting __xsk_generic_xmit(), updates the global consumer of the tx
queue after reaching the max loop count (max_tx_budget, 32 by default).
Using 33 packets prevents xskq_cons_peek_desc() from updating the
consumer when it is about to quit sending, so the test accurately checks
whether the issue that the first patch resolves remains. The new case
does not check this issue in zero copy mode.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
v6
Link: https://lore.kernel.org/all/20250702112815.50746-1-kerneljasonxing@gmail.com/
1. filter out and skip TEST_MODE_ZC test.

v5
Link: https://lore.kernel.org/all/20250627085745.53173-1-kerneljasonxing@gmail.com/
1. use the initial approach to add a new testcase
2. add a new flag 'check_consumer' to see if the check is needed
---
 tools/testing/selftests/bpf/xskxceiver.c | 56 +++++++++++++++++++++++-
 tools/testing/selftests/bpf/xskxceiver.h |  1 +
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 0ced4026ee44..a29de0713f19 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -109,6 +109,8 @@
 
 #include <network_helpers.h>
 
+#define MAX_TX_BUDGET_DEFAULT 32
+
 static bool opt_verbose;
 static bool opt_print_tests;
 static enum test_mode opt_mode = TEST_MODE_ALL;
@@ -1091,11 +1093,45 @@ static bool is_pkt_valid(struct pkt *pkt, void *buffer, u64 addr, u32 len)
 	return true;
 }
 
+static u32 load_value(u32 *counter)
+{
+	return __atomic_load_n(counter, __ATOMIC_ACQUIRE);
+}
+
+static bool kick_tx_with_check(struct xsk_socket_info *xsk, int *ret)
+{
+	u32 max_budget = MAX_TX_BUDGET_DEFAULT;
+	u32 cons, ready_to_send;
+	int delta;
+
+	cons = load_value(xsk->tx.consumer);
+	ready_to_send = load_value(xsk->tx.producer) - cons;
+	*ret = sendto(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, 0);
+
+	delta = load_value(xsk->tx.consumer) - cons;
+	/* By default, xsk should consume exact @max_budget descs at one
+	 * send in this case where hitting the max budget limit in while
+	 * loop is triggered in __xsk_generic_xmit(). Please make sure that
+	 * the number of descs to be sent is larger than @max_budget, or
+	 * else the tx.consumer will be updated in xskq_cons_peek_desc()
+	 * in time which hides the issue we try to verify.
+	 */
+	if (ready_to_send > max_budget && delta != max_budget)
+		return false;
+
+	return true;
+}
+
 static int kick_tx(struct xsk_socket_info *xsk)
 {
 	int ret;
 
-	ret = sendto(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, 0);
+	if (xsk->check_consumer) {
+		if (!kick_tx_with_check(xsk, &ret))
+			return TEST_FAILURE;
+	} else {
+		ret = sendto(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, 0);
+	}
 	if (ret >= 0)
 		return TEST_PASS;
 	if (errno == ENOBUFS || errno == EAGAIN || errno == EBUSY || errno == ENETDOWN) {
@@ -2613,6 +2649,23 @@ static int testapp_adjust_tail_grow_mb(struct test_spec *test)
 						   XSK_UMEM__LARGE_FRAME_SIZE * 2);
 }
 
+static int testapp_tx_queue_consumer(struct test_spec *test)
+{
+	int nr_packets;
+
+	if (test->mode == TEST_MODE_ZC) {
+		ksft_test_result_skip("Can not run TX_QUEUE_CONSUMER test for ZC mode\n");
+		return TEST_SKIP;
+	}
+
+	nr_packets = MAX_TX_BUDGET_DEFAULT + 1;
+	pkt_stream_replace(test, nr_packets, MIN_PKT_SIZE);
+	test->ifobj_tx->xsk->batch_size = nr_packets;
+	test->ifobj_tx->xsk->check_consumer = true;
+
+	return testapp_validate_traffic(test);
+}
+
 static void run_pkt_test(struct test_spec *test)
 {
 	int ret;
@@ -2723,6 +2776,7 @@ static const struct test_spec tests[] = {
 	{.name = "XDP_ADJUST_TAIL_SHRINK_MULTI_BUFF", .test_func = testapp_adjust_tail_shrink_mb},
 	{.name = "XDP_ADJUST_TAIL_GROW", .test_func = testapp_adjust_tail_grow},
 	{.name = "XDP_ADJUST_TAIL_GROW_MULTI_BUFF", .test_func = testapp_adjust_tail_grow_mb},
+	{.name = "TX_QUEUE_CONSUMER", .test_func = testapp_tx_queue_consumer},
 };
 
 static void print_tests(void)
diff --git a/tools/testing/selftests/bpf/xskxceiver.h b/tools/testing/selftests/bpf/xskxceiver.h
index 67fc44b2813b..4df3a5d329ac 100644
--- a/tools/testing/selftests/bpf/xskxceiver.h
+++ b/tools/testing/selftests/bpf/xskxceiver.h
@@ -95,6 +95,7 @@ struct xsk_socket_info {
 	u32 batch_size;
 	u8 dst_mac[ETH_ALEN];
 	u8 src_mac[ETH_ALEN];
+	bool check_consumer;
 };
 
 struct pkt {
--
2.41.3

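The reason patch 2 queues 33 descs rather than 32 can be illustrated with a small user-space model. This is a simplified sketch, not kernel code: struct model_queue, model_peek(), model_xmit() and model_run() are hypothetical names, and the model assumes only what the thread describes, namely that the peek step publishes the cached consumer to the shared one when the cached view of the ring is exhausted, and that the transmit loop stops once the budget is used up. The release_at_exit flag stands in for the behavior patch 1 adds at the out: label.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of a tx ring as seen by the consumer side. */
struct model_queue {
	uint32_t producer;	/* shared: written by user space */
	uint32_t consumer;	/* shared: what user space can observe */
	uint32_t cached_prod;	/* consumer-local copy of producer */
	uint32_t cached_cons;	/* consumer-local progress */
};

/* Model of the peek step: when the cached view is exhausted, publish the
 * cached consumer and refresh the cached producer, then report whether a
 * desc is available.
 */
static bool model_peek(struct model_queue *q)
{
	if (q->cached_prod == q->cached_cons) {
		q->consumer = q->cached_cons;
		q->cached_prod = q->producer;
	}
	return q->cached_prod != q->cached_cons;
}

/* Model of the transmit loop: send until the ring is empty or the budget
 * is exhausted. Returns how far the shared consumer moved.
 */
static uint32_t model_xmit(struct model_queue *q, uint32_t budget,
			   bool release_at_exit)
{
	uint32_t before = q->consumer;
	bool sent_frame = false;

	while (model_peek(q)) {
		if (budget-- == 0)
			break;		/* budget hit: leave without publishing */
		q->cached_cons++;	/* one desc transmitted */
		sent_frame = true;
	}

	/* Stand-in for the release that patch 1 adds on the exit path. */
	if (sent_frame && release_at_exit)
		q->consumer = q->cached_cons;

	return q->consumer - before;
}

/* Run one send on a fresh queue holding @queued descs. */
static uint32_t model_run(uint32_t queued, uint32_t budget, bool release_at_exit)
{
	struct model_queue q = { .producer = queued };

	return model_xmit(&q, budget, release_at_exit);
}
```

In this model, model_run(32, 32, false) reports 32 because the final peek finds the cached view exhausted and publishes the consumer, which would hide the problem. model_run(33, 32, false) reports 0 because the loop leaves on the budget check with one desc still cached, and model_run(33, 32, true) reports 32, which matches the delta == max_budget condition the selftest checks.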
* Re: [PATCH net-next v6 2/2] selftests/bpf: add a new test to check the consumer update case
  2025-07-03 14:17 ` [PATCH net-next v6 2/2] selftests/bpf: add a new test to check the consumer update case Jason Xing
@ 2025-07-03 15:10   ` Maciej Fijalkowski
  2025-07-03 15:45   ` Stanislav Fomichev
  1 sibling, 0 replies; 6+ messages in thread
From: Maciej Fijalkowski @ 2025-07-03 15:10 UTC
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson, jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe, willemdebruijn.kernel, bpf, netdev, Jason Xing

On Thu, Jul 03, 2025 at 10:17:12PM +0800, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> The subtest sends 33 packets at one time on purpose to see if xsk
> exiting __xsk_generic_xmit() updates the global consumer of tx queue
> when reaching the max loop (max_tx_budget, 32 by default). The number 33
> can avoid xskq_cons_peek_desc() updates the consumer when it's about to
> quit sending, to accurately check if the issue that the first patch
> resolves remains. The new case will not check this issue in zero copy
> mode.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>

Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

> ---
> v6
> Link: https://lore.kernel.org/all/20250702112815.50746-1-kerneljasonxing@gmail.com/
> 1. filter out and skip TEST_MODE_ZC test.
>
> v5
> Link: https://lore.kernel.org/all/20250627085745.53173-1-kerneljasonxing@gmail.com/
> 1. use the initial approach to add a new testcase
> 2. add a new flag 'check_consumer' to see if the check is needed
> ---
>  tools/testing/selftests/bpf/xskxceiver.c | 56 +++++++++++++++++++++++-
>  tools/testing/selftests/bpf/xskxceiver.h |  1 +
>  2 files changed, 56 insertions(+), 1 deletion(-)
>
> [...]

* Re: [PATCH net-next v6 2/2] selftests/bpf: add a new test to check the consumer update case
  2025-07-03 14:17 ` [PATCH net-next v6 2/2] selftests/bpf: add a new test to check the consumer update case Jason Xing
  2025-07-03 15:10   ` Maciej Fijalkowski
@ 2025-07-03 15:45   ` Stanislav Fomichev
  1 sibling, 0 replies; 6+ messages in thread
From: Stanislav Fomichev @ 2025-07-03 15:45 UTC
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson, maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe, willemdebruijn.kernel, bpf, netdev, Jason Xing

On 07/03, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> The subtest sends 33 packets at one time on purpose to see if xsk
> exiting __xsk_generic_xmit() updates the global consumer of tx queue
> when reaching the max loop (max_tx_budget, 32 by default). The number 33
> can avoid xskq_cons_peek_desc() updates the consumer when it's about to
> quit sending, to accurately check if the issue that the first patch
> resolves remains. The new case will not check this issue in zero copy
> mode.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> v6
> Link: https://lore.kernel.org/all/20250702112815.50746-1-kerneljasonxing@gmail.com/
> 1. filter out and skip TEST_MODE_ZC test.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>

* Re: [PATCH net-next v6 0/2] net: xsk: update tx queue consumer
  2025-07-03 14:17 [PATCH net-next v6 0/2] net: xsk: update tx queue consumer Jason Xing
  2025-07-03 14:17 ` [PATCH net-next v6 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
  2025-07-03 14:17 ` [PATCH net-next v6 2/2] selftests/bpf: add a new test to check the consumer update case Jason Xing
@ 2025-07-09  1:40 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-07-09 1:40 UTC
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson, maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk, john.fastabend, joe, willemdebruijn.kernel, bpf, netdev, kernelxing

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 3 Jul 2025 22:17:10 +0800 you wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> Patch 1 makes sure the consumer is updated at the end of generic xmit.
> Patch 2 adds corresponding test.
>
> Jason Xing (2):
>   net: xsk: update tx queue consumer immediately after transmission
>   selftests/bpf: add a new test to check the consumer update case
>
> [...]

Here is the summary with links:
  - [net-next,v6,1/2] net: xsk: update tx queue consumer immediately after transmission
    https://git.kernel.org/netdev/net-next/c/1eb8b0dac189
  - [net-next,v6,2/2] selftests/bpf: add a new test to check the consumer update case
    https://git.kernel.org/netdev/net-next/c/680acde13ffd

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Thread overview: 6+ messages
2025-07-03 14:17 [PATCH net-next v6 0/2] net: xsk: update tx queue consumer Jason Xing
2025-07-03 14:17 ` [PATCH net-next v6 1/2] net: xsk: update tx queue consumer immediately after transmission Jason Xing
2025-07-03 14:17 ` [PATCH net-next v6 2/2] selftests/bpf: add a new test to check the consumer update case Jason Xing
2025-07-03 15:10   ` Maciej Fijalkowski
2025-07-03 15:45   ` Stanislav Fomichev
2025-07-09  1:40 ` [PATCH net-next v6 0/2] net: xsk: update tx queue consumer patchwork-bot+netdevbpf