netdev.vger.kernel.org archive mirror
* [PATCH net-next v4 0/2] optimise local CPU skb_attempt_defer_free
@ 2024-04-10  1:28 Pavel Begunkov
  2024-04-10  1:28 ` [PATCH net-next v4 1/2] net: cache for same cpu skb_attempt_defer_free Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Pavel Begunkov @ 2024-04-10  1:28 UTC (permalink / raw)
  To: netdev; +Cc: edumazet, davem, dsahern, pabeni, kuba, Pavel Begunkov

Optimise the case when an skb comes to skb_attempt_defer_free()
on the same CPU it was allocated on. Patch 1 enables skb caches
and gives frags a chance to hit the page pool's fast path.
CPU-bound benchmarking with perfect skb_attempt_defer_free()
gives around 1% extra throughput.

v4: SKB_DROP_REASON_NOT_SPECIFIED -> SKB_CONSUMED
v3: rebased, no changes otherwise
v2: pass @napi_safe=true by using __napi_kfree_skb()

Pavel Begunkov (2):
  net: cache for same cpu skb_attempt_defer_free
  net: use SKB_CONSUMED in skb_attempt_defer_free()

 net/core/skbuff.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

-- 
2.44.0



* [PATCH net-next v4 1/2] net: cache for same cpu skb_attempt_defer_free
  2024-04-10  1:28 [PATCH net-next v4 0/2] optimise local CPU skb_attempt_defer_free Pavel Begunkov
@ 2024-04-10  1:28 ` Pavel Begunkov
  2024-04-10 18:47   ` Eric Dumazet
  2024-04-11  2:54   ` Jason Xing
  2024-04-10  1:28 ` [PATCH net-next v4 2/2] net: use SKB_CONSUMED in skb_attempt_defer_free() Pavel Begunkov
  2024-04-11  2:30 ` [PATCH net-next v4 0/2] optimise local CPU skb_attempt_defer_free patchwork-bot+netdevbpf
  2 siblings, 2 replies; 7+ messages in thread
From: Pavel Begunkov @ 2024-04-10  1:28 UTC (permalink / raw)
  To: netdev; +Cc: edumazet, davem, dsahern, pabeni, kuba, Pavel Begunkov

Optimise skb_attempt_defer_free() when run by the same CPU the skb was
allocated on. Instead of __kfree_skb() -> kmem_cache_free() we can
disable softirqs and put the buffer into cpu local caches.

CPU-bound TCP ping-pong style benchmarking (i.e. netbench) showed a 1%
throughput increase (392.2 -> 396.4 Krps). Cross-checking with profiles,
the total CPU share of skb_attempt_defer_free() dropped by 0.6%. Note that
I'd expect the win to double with rx-only benchmarks, as the optimisation
is for the receive path, but the test spends >55% of CPU doing writes.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/core/skbuff.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 21cd01641f4c..62b07ed3af98 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6974,6 +6974,19 @@ void __skb_ext_put(struct skb_ext *ext)
 EXPORT_SYMBOL(__skb_ext_put);
 #endif /* CONFIG_SKB_EXTENSIONS */
 
+static void kfree_skb_napi_cache(struct sk_buff *skb)
+{
+	/* if SKB is a clone, don't handle this case */
+	if (skb->fclone != SKB_FCLONE_UNAVAILABLE) {
+		__kfree_skb(skb);
+		return;
+	}
+
+	local_bh_disable();
+	__napi_kfree_skb(skb, SKB_DROP_REASON_NOT_SPECIFIED);
+	local_bh_enable();
+}
+
 /**
  * skb_attempt_defer_free - queue skb for remote freeing
  * @skb: buffer
@@ -6992,7 +7005,7 @@ void skb_attempt_defer_free(struct sk_buff *skb)
 	if (WARN_ON_ONCE(cpu >= nr_cpu_ids) ||
 	    !cpu_online(cpu) ||
 	    cpu == raw_smp_processor_id()) {
-nodefer:	__kfree_skb(skb);
+nodefer:	kfree_skb_napi_cache(skb);
 		return;
 	}
 
-- 
2.44.0
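
For readers following the control flow, here is a minimal userspace
sketch of the decision the patch above introduces. It is a toy model,
not kernel code: every model_* name is hypothetical, the per-CPU cache
is reduced to a counter, and the other "nodefer" conditions (offline or
out-of-range CPU) are left out. It only illustrates which free path an
skb would take under those assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
enum model_free_path {
	MODEL_FREE_SLAB,        /* models __kfree_skb() -> kmem_cache_free() */
	MODEL_FREE_LOCAL_CACHE, /* models the __napi_kfree_skb() path */
	MODEL_FREE_DEFERRED,    /* models queueing for the allocating CPU */
};

struct model_skb {
	int alloc_cpu;  /* CPU the buffer was allocated on */
	bool is_fclone; /* models skb->fclone != SKB_FCLONE_UNAVAILABLE */
};

struct model_cpu_cache {
	size_t cached;  /* buffers parked in this CPU's local cache */
};

/* Models kfree_skb_napi_cache(): fclones keep using the slab path. */
static enum model_free_path model_free_local(struct model_cpu_cache *cache,
					     const struct model_skb *skb)
{
	if (skb->is_fclone)
		return MODEL_FREE_SLAB;

	/* The kernel brackets this step with local_bh_disable()/enable(). */
	cache->cached++;
	return MODEL_FREE_LOCAL_CACHE;
}

/* Models only the same-CPU check of skb_attempt_defer_free(). */
static enum model_free_path
model_attempt_defer_free(struct model_cpu_cache *cache,
			 const struct model_skb *skb, int this_cpu)
{
	if (skb->alloc_cpu == this_cpu)
		return model_free_local(cache, skb);

	return MODEL_FREE_DEFERRED;
}
```

In this model, an skb freed on its allocating CPU lands in the local
cache unless it is an fclone, and an skb freed on any other CPU is
deferred, which matches the behaviour described in the commit message.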



* [PATCH net-next v4 2/2] net: use SKB_CONSUMED in skb_attempt_defer_free()
  2024-04-10  1:28 [PATCH net-next v4 0/2] optimise local CPU skb_attempt_defer_free Pavel Begunkov
  2024-04-10  1:28 ` [PATCH net-next v4 1/2] net: cache for same cpu skb_attempt_defer_free Pavel Begunkov
@ 2024-04-10  1:28 ` Pavel Begunkov
  2024-04-10 18:47   ` Eric Dumazet
  2024-04-11  2:30 ` [PATCH net-next v4 0/2] optimise local CPU skb_attempt_defer_free patchwork-bot+netdevbpf
  2 siblings, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2024-04-10  1:28 UTC (permalink / raw)
  To: netdev; +Cc: edumazet, davem, dsahern, pabeni, kuba, Pavel Begunkov,
	Jason Xing

skb_attempt_defer_free() is used to free already processed skbs, so pass
SKB_CONSUMED as the reason in kfree_skb_napi_cache().

Suggested-by: Jason Xing <kerneljasonxing@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/core/skbuff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 62b07ed3af98..dd266f44aaff 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6983,7 +6983,7 @@ static void kfree_skb_napi_cache(struct sk_buff *skb)
 	}
 
 	local_bh_disable();
-	__napi_kfree_skb(skb, SKB_DROP_REASON_NOT_SPECIFIED);
+	__napi_kfree_skb(skb, SKB_CONSUMED);
 	local_bh_enable();
 }
 
-- 
2.44.0



* Re: [PATCH net-next v4 1/2] net: cache for same cpu skb_attempt_defer_free
  2024-04-10  1:28 ` [PATCH net-next v4 1/2] net: cache for same cpu skb_attempt_defer_free Pavel Begunkov
@ 2024-04-10 18:47   ` Eric Dumazet
  2024-04-11  2:54   ` Jason Xing
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2024-04-10 18:47 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: netdev, davem, dsahern, pabeni, kuba

On Wed, Apr 10, 2024 at 3:28 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> Optimise skb_attempt_defer_free() when run by the same CPU the skb was
> allocated on. Instead of __kfree_skb() -> kmem_cache_free() we can
> disable softirqs and put the buffer into cpu local caches.
>
> CPU bound TCP ping pong style benchmarking (i.e. netbench) showed a 1%
> throughput increase (392.2 -> 396.4 Krps). Cross checking with profiles,
> the total CPU share of skb_attempt_defer_free() dropped by 0.6%. Note,
> I'd expect the win doubled with rx only benchmarks, as the optimisation
> is for the receive path, but the test spends >55% of CPU doing writes.
>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH net-next v4 2/2] net: use SKB_CONSUMED in skb_attempt_defer_free()
  2024-04-10  1:28 ` [PATCH net-next v4 2/2] net: use SKB_CONSUMED in skb_attempt_defer_free() Pavel Begunkov
@ 2024-04-10 18:47   ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2024-04-10 18:47 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: netdev, davem, dsahern, pabeni, kuba, Jason Xing

On Wed, Apr 10, 2024 at 3:28 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> skb_attempt_defer_free() is used to free already processed skbs, so pass
> SKB_CONSUMED as the reason in kfree_skb_napi_cache().
>
> Suggested-by: Jason Xing <kerneljasonxing@gmail.com>
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---

Reviewed-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH net-next v4 0/2] optimise local CPU skb_attempt_defer_free
  2024-04-10  1:28 [PATCH net-next v4 0/2] optimise local CPU skb_attempt_defer_free Pavel Begunkov
  2024-04-10  1:28 ` [PATCH net-next v4 1/2] net: cache for same cpu skb_attempt_defer_free Pavel Begunkov
  2024-04-10  1:28 ` [PATCH net-next v4 2/2] net: use SKB_CONSUMED in skb_attempt_defer_free() Pavel Begunkov
@ 2024-04-11  2:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-04-11  2:30 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: netdev, edumazet, davem, dsahern, pabeni, kuba

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 10 Apr 2024 02:28:08 +0100 you wrote:
> Optimise the case when an skb comes to skb_attempt_defer_free()
> on the same CPU it was allocated on. The patch 1 enables skb caches
> and gives frags a chance to hit the page pool's fast path.
> CPU bound benchmarking with perfect skb_attempt_defer_free()
> gives around 1% of extra throughput.
> 
> v4: SKB_DROP_REASON_NOT_SPECIFIED -> SKB_CONSUMED
> v3: rebased, no changes otherwise
> v2: pass @napi_safe=true by using __napi_kfree_skb()
> 
> [...]

Here is the summary with links:
  - [net-next,v4,1/2] net: cache for same cpu skb_attempt_defer_free
    https://git.kernel.org/netdev/net-next/c/7cb31c46b9cc
  - [net-next,v4,2/2] net: use SKB_CONSUMED in skb_attempt_defer_free()
    https://git.kernel.org/netdev/net-next/c/d8415a165c43

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net-next v4 1/2] net: cache for same cpu skb_attempt_defer_free
  2024-04-10  1:28 ` [PATCH net-next v4 1/2] net: cache for same cpu skb_attempt_defer_free Pavel Begunkov
  2024-04-10 18:47   ` Eric Dumazet
@ 2024-04-11  2:54   ` Jason Xing
  1 sibling, 0 replies; 7+ messages in thread
From: Jason Xing @ 2024-04-11  2:54 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: netdev, edumazet, davem, dsahern, pabeni, kuba

On Wed, Apr 10, 2024 at 9:28 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> Optimise skb_attempt_defer_free() when run by the same CPU the skb was
> allocated on. Instead of __kfree_skb() -> kmem_cache_free() we can
> disable softirqs and put the buffer into cpu local caches.
>
> CPU bound TCP ping pong style benchmarking (i.e. netbench) showed a 1%
> throughput increase (392.2 -> 396.4 Krps). Cross checking with profiles,
> the total CPU share of skb_attempt_defer_free() dropped by 0.6%. Note,
> I'd expect the win doubled with rx only benchmarks, as the optimisation
> is for the receive path, but the test spends >55% of CPU doing writes.
>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>

Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>

