* [PATCH net-next] net: add prefetch() in skb_defer_free_flush()
@ 2025-11-06 8:55 Eric Dumazet
2025-11-06 9:05 ` Paolo Abeni
2025-11-08 3:10 ` patchwork-bot+netdevbpf
0 siblings, 2 replies; 5+ messages in thread
From: Eric Dumazet @ 2025-11-06 8:55 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, netdev, eric.dumazet, Eric Dumazet
skb_defer_free_flush() is becoming more important these days.
Add a prefetch operation to reduce latency a bit on some
platforms like AMD EPYC 7B12.
On more recent cpus, a stall happens when reading skb_shinfo().
Avoiding it will require a more elaborate strategy.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/dev.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index 537aa43edff0e4bfedb42593146cfdf7511d8c37..69515edd17bc6a157046f31b3dd343a59ae192ab 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6782,6 +6782,7 @@ static void skb_defer_free_flush(void)
 		free_list = llist_del_all(&sdn->defer_list);
 		llist_for_each_entry_safe(skb, next, free_list, ll_node) {
+			prefetch(next);
 			napi_consume_skb(skb, 1);
 		}
 	}
--
2.51.2.1026.g39e6a42477-goog
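
For readers outside the kernel, the idea in the patch can be sketched in user space. This is only an illustration under stated assumptions: `struct node`, `push()` and `free_all()` are hypothetical names, and GCC/Clang's `__builtin_prefetch()` stands in for the kernel's `prefetch()`. The loop saves the next pointer before freeing the current element (as `llist_for_each_entry_safe()` does) and issues a prefetch hint for it, so the next element may already be in cache by the time it is touched.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type standing in for an skb linked through ll_node. */
struct node {
	struct node *next;
	int payload;
};

/* Push a new node on the front of the list and return the new head.
 * On allocation failure the list is returned unchanged.
 */
static struct node *push(struct node *head, int payload)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->payload = payload;
	n->next = head;
	return n;
}

/* Free every node and return the sum of the payloads seen. */
static long free_all(struct node *head)
{
	struct node *n, *next;
	long sum = 0;

	for (n = head; n; n = next) {
		next = n->next;           /* saved first: n is freed below */
		__builtin_prefetch(next); /* hint only; a NULL address does not fault */
		sum += n->payload;
		free(n);
	}
	return sum;
}
```

Whether the hint helps depends on how much work is done per element and on the CPU, which matches the changelog's remark that the gain shows up on some platforms only.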
* Re: [PATCH net-next] net: add prefetch() in skb_defer_free_flush()
2025-11-06 8:55 [PATCH net-next] net: add prefetch() in skb_defer_free_flush() Eric Dumazet
@ 2025-11-06 9:05 ` Paolo Abeni
2025-11-06 9:13 ` Eric Dumazet
2025-11-08 3:10 ` patchwork-bot+netdevbpf
1 sibling, 1 reply; 5+ messages in thread
From: Paolo Abeni @ 2025-11-06 9:05 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski
Cc: Simon Horman, netdev, eric.dumazet
On 11/6/25 9:55 AM, Eric Dumazet wrote:
> skb_defer_free_flush() is becoming more important these days.
>
> Add a prefetch operation to reduce latency a bit on some
> platforms like AMD EPYC 7B12.
>
> On more recent cpus, a stall happens when reading skb_shinfo().
> Avoiding it will require a more elaborate strategy.
For my education, how do you catch such stalls? looking for specific
perf events? Or just based on cycles spent in a given function/chunk of
code?
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Just to avoid doubts on my thoughts about this patch:
Acked-by: Paolo Abeni <pabeni@redhat.com>
* Re: [PATCH net-next] net: add prefetch() in skb_defer_free_flush()
2025-11-06 9:05 ` Paolo Abeni
@ 2025-11-06 9:13 ` Eric Dumazet
2025-11-06 15:04 ` Paolo Abeni
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2025-11-06 9:13 UTC (permalink / raw)
To: Paolo Abeni
Cc: David S . Miller, Jakub Kicinski, Simon Horman, netdev,
eric.dumazet
On Thu, Nov 6, 2025 at 1:05 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 11/6/25 9:55 AM, Eric Dumazet wrote:
> > skb_defer_free_flush() is becoming more important these days.
> >
> > Add a prefetch operation to reduce latency a bit on some
> > platforms like AMD EPYC 7B12.
> >
> > On more recent cpus, a stall happens when reading skb_shinfo().
> > Avoiding it will require a more elaborate strategy.
>
> For my education, how do you catch such stalls? looking for specific
> perf events? Or just based on cycles spent in a given function/chunk of
> code?
In this case, I was focusing on a NIC driver handling both RX and TX
from a single cpu.
I am using "perf record -g -C one_of_the_hot_cpu sleep 5; perf report
--no-children"
I am working on an issue with napi_consume_skb(), which has no NUMA awareness.
With the following WIP series, I can push 115 Mpps UDP packets
(instead of 80Mpps) on IDPF.
I need more tests before pushing it for review, but the prefetch()
from skb_defer_free_flush() is a no-brainer.
git diff d24e4780d5783b8eecd33aab03bd4efd24703c65..
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5b4bc8b1c7d5674c19b64f8b15685d74632048fe..7ac5f8aa1235a55db02b40b5a0f51bb3fa53fa03 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1149,11 +1149,10 @@ void skb_release_head_state(struct sk_buff *skb)
skb);
#endif
+ skb->destructor = NULL;
}
-#if IS_ENABLED(CONFIG_NF_CONNTRACK)
- nf_conntrack_put(skb_nfct(skb));
-#endif
- skb_ext_put(skb);
+ nf_reset_ct(skb);
+ skb_ext_reset(skb);
}
/* Free everything but the sk_buff shell. */
@@ -1477,6 +1476,11 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+ if (skb->alloc_cpu != smp_processor_id() && !skb_shared(skb)) {
+ skb_release_head_state(skb);
+ return skb_attempt_defer_free(skb);
+ }
+
if (!skb_unref(skb))
return;
commit df7dacc619117ebab7ea330ccc6390618f04dff3
Author: Eric Dumazet <edumazet@google.com>
Date: Wed Nov 5 17:02:20 2025 +0000
net: fix napi_consume_skb() with alien skbs
There is a lack of NUMA awareness and more generally lack
of slab caches affinity on TX completion path.
Modern drivers are using napi_consume_skb(), hoping to cache sk_buff
in per-cpu caches so that they can be recycled in RX path.
Only allow this if the skb was allocated on the same cpu,
otherwise use skb_attempt_defer_free() so that the skb
is freed on the original cpu.
This removes contention on SLUB spinlocks and data structures.
After this patch, I get a 40% improvement for a UDP tx workload
on an AMD EPYC 9B45 (IDPF 200Gbit NIC with 32 TX queues).
80 Mpps -> 115 Mpps.
Signed-off-by: Eric Dumazet <edumazet@google.com>
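
The decision described in this changelog can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel code: `struct buf`, `struct cpu_state`, `cpus[]` and `consume_buf()` are hypothetical names, the model is single-threaded (the kernel uses lock-free lists), and the `skb_shared()` check from the WIP diff is left out. A buffer completed on the CPU that allocated it goes to that CPU's local cache; an "alien" buffer is handed back to its owner's deferred-free list instead.

```c
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical, kept small for illustration */

struct buf {
	struct buf *next;
	int alloc_cpu;		/* CPU that allocated the buffer */
};

struct cpu_state {
	struct buf *local_cache;	/* buffers recycled by this CPU */
	struct buf *defer_list;		/* buffers handed back by other CPUs */
};

static struct cpu_state cpus[NR_CPUS];

/* Called on 'this_cpu' at TX completion.
 * Returns 1 if the buffer was deferred to its owner, 0 if cached locally.
 */
static int consume_buf(struct buf *b, int this_cpu)
{
	if (b->alloc_cpu != this_cpu) {
		struct cpu_state *owner = &cpus[b->alloc_cpu];

		/* Alien buffer: let the allocating CPU free it later. */
		b->next = owner->defer_list;
		owner->defer_list = b;
		return 1;
	}

	/* Local buffer: keep it for reuse on this CPU. */
	b->next = cpus[this_cpu].local_cache;
	cpus[this_cpu].local_cache = b;
	return 0;
}
```

The point of the split is that each CPU only ever frees or recycles memory it allocated itself, which is what removes the cross-CPU allocator contention mentioned above.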
commit 42593ad5f2bed6abd3a6cce3483e2980b114cbd9
Author: Eric Dumazet <edumazet@google.com>
Date: Wed Nov 5 16:50:29 2025 +0000
net: allow skb_release_head_state() to be called multiple times
Currently, only skb dst is cleared (thanks to skb_dst_drop())
Make sure skb->destructor, conntrack and extensions are cleared.
Signed-off-by: Eric Dumazet <edumazet@google.com>
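
The "callable multiple times" property amounts to clearing each field right after releasing what it refers to. A minimal user-space sketch, assuming a hypothetical `struct buf` with a destructor callback and a pointer to a reference count (neither is the real sk_buff layout):

```c
#include <stddef.h>

struct buf;
typedef void (*destructor_fn)(struct buf *);

/* Hypothetical stand-in for the head state of an skb. */
struct buf {
	destructor_fn destructor;
	int *ext_refcnt;	/* reference count of an attached extension */
};

static int destructor_calls;

/* Example destructor that only counts how often it runs. */
static void count_destructor(struct buf *b)
{
	(void)b;
	destructor_calls++;
}

/* Release head state; safe to call more than once on the same buffer. */
static void release_head_state(struct buf *b)
{
	if (b->destructor) {
		b->destructor(b);
		b->destructor = NULL;	/* a second call must not run it again */
	}
	if (b->ext_refcnt) {
		(*b->ext_refcnt)--;	/* drop our reference */
		b->ext_refcnt = NULL;	/* forget it so the put is not repeated */
	}
}
```

With the fields cleared, a caller such as the alien-skb path can release the head state early and still pass the buffer to a later free path that releases it again.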
* Re: [PATCH net-next] net: add prefetch() in skb_defer_free_flush()
2025-11-06 9:13 ` Eric Dumazet
@ 2025-11-06 15:04 ` Paolo Abeni
0 siblings, 0 replies; 5+ messages in thread
From: Paolo Abeni @ 2025-11-06 15:04 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Jakub Kicinski, Simon Horman, netdev,
eric.dumazet
On 11/6/25 10:13 AM, Eric Dumazet wrote:
> On Thu, Nov 6, 2025 at 1:05 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>
>> On 11/6/25 9:55 AM, Eric Dumazet wrote:
>>> skb_defer_free_flush() is becoming more important these days.
>>>
>>> Add a prefetch operation to reduce latency a bit on some
>>> platforms like AMD EPYC 7B12.
>>>
>>> On more recent cpus, a stall happens when reading skb_shinfo().
>>> Avoiding it will require a more elaborate strategy.
>>
>> For my education, how do you catch such stalls? looking for specific
>> perf events? Or just based on cycles spent in a given function/chunk of
>> code?
>
> In this case, I was focusing on a NIC driver handling both RX and TX
> from a single cpu.
>
> I am using "perf record -g -C one_of_the_hot_cpu sleep 5; perf report
> --no-children"
>
> I am working on an issue with napi_consume_skb(), which has no NUMA awareness.
Many thanks for sharing!
> With the following WIP series, I can push 115 Mpps UDP packets
> (instead of 80Mpps) on IDPF.
> I need more tests before pushing it for review, but the prefetch()
> from skb_defer_free_flush()
> is a no-brainer.
FWIW, the napi_consume_skb() change makes sense to me, looking forward to it!
/P
* Re: [PATCH net-next] net: add prefetch() in skb_defer_free_flush()
2025-11-06 8:55 [PATCH net-next] net: add prefetch() in skb_defer_free_flush() Eric Dumazet
2025-11-06 9:05 ` Paolo Abeni
@ 2025-11-08 3:10 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-11-08 3:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, kuba, pabeni, horms, netdev, eric.dumazet
Hello:
This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Thu, 6 Nov 2025 08:55:00 +0000 you wrote:
> skb_defer_free_flush() is becoming more important these days.
>
> Add a prefetch operation to reduce latency a bit on some
> platforms like AMD EPYC 7B12.
>
> On more recent cpus, a stall happens when reading skb_shinfo().
> Avoiding it will require a more elaborate strategy.
>
> [...]
Here is the summary with links:
- [net-next] net: add prefetch() in skb_defer_free_flush()
https://git.kernel.org/netdev/net-next/c/fd9557c3606b
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html