* [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c
@ 2026-01-23 11:16 Eric Dumazet
2026-01-23 15:52 ` Neal Cardwell
2026-01-27 14:10 ` patchwork-bot+netdevbpf
0 siblings, 2 replies; 8+ messages in thread
From: Eric Dumazet @ 2026-01-23 11:16 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Neal Cardwell, Kuniyuki Iwashima, netdev,
eric.dumazet, Eric Dumazet
TCP fast path can (auto)inline this helper, instead
of (auto)inlining it from tcp_send_fin().
No change of overall code size, but tcp_sendmsg() is faster.
$ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
Function old new delta
tcp_stream_alloc_skb 216 357 +141
tcp_send_fin 688 548 -140
Total: Before=22236729, After=22236730, chg +0.00%
BTW, we might change tcp_send_fin() to use tcp_stream_alloc_skb().
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp.c | 27 +++++++++++++++++++++++++++
net/ipv4/tcp_output.c | 27 ---------------------------
2 files changed, 27 insertions(+), 27 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 148cdf3cd6233add37ea52e273cb4fb3e75fcbcb..87acd01274601eca799e42b8f895fff4c35cf25a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -902,6 +902,33 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
}
EXPORT_IPV6_MOD(tcp_splice_read);
+/* We allow to exceed memory limits for FIN packets to expedite
+ * connection tear down and (memory) recovery.
+ * Otherwise tcp_send_fin() could be tempted to either delay FIN
+ * or even be forced to close flow without any FIN.
+ * In general, we want to allow one skb per socket to avoid hangs
+ * with edge trigger epoll()
+ */
+void sk_forced_mem_schedule(struct sock *sk, int size)
+{
+ int delta, amt;
+
+ delta = size - sk->sk_forward_alloc;
+ if (delta <= 0)
+ return;
+
+ amt = sk_mem_pages(delta);
+ sk_forward_alloc_add(sk, amt << PAGE_SHIFT);
+
+ if (mem_cgroup_sk_enabled(sk))
+ mem_cgroup_sk_charge(sk, amt, gfp_memcg_charge() | __GFP_NOFAIL);
+
+ if (sk->sk_bypass_prot_mem)
+ return;
+
+ sk_memory_allocated_add(sk, amt);
+}
+
struct sk_buff *tcp_stream_alloc_skb(struct sock *sk, gfp_t gfp,
bool force_schedule)
{
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 256b669e8d3b4a4d191e61e79784e412aaef8965..597e888af36d8697cd696964077b423e79fdb83e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3767,33 +3767,6 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
inet_csk(sk)->icsk_rto, true);
}
-/* We allow to exceed memory limits for FIN packets to expedite
- * connection tear down and (memory) recovery.
- * Otherwise tcp_send_fin() could be tempted to either delay FIN
- * or even be forced to close flow without any FIN.
- * In general, we want to allow one skb per socket to avoid hangs
- * with edge trigger epoll()
- */
-void sk_forced_mem_schedule(struct sock *sk, int size)
-{
- int delta, amt;
-
- delta = size - sk->sk_forward_alloc;
- if (delta <= 0)
- return;
-
- amt = sk_mem_pages(delta);
- sk_forward_alloc_add(sk, amt << PAGE_SHIFT);
-
- if (mem_cgroup_sk_enabled(sk))
- mem_cgroup_sk_charge(sk, amt, gfp_memcg_charge() | __GFP_NOFAIL);
-
- if (sk->sk_bypass_prot_mem)
- return;
-
- sk_memory_allocated_add(sk, amt);
-}
-
/* Send a FIN. The caller locks the socket for us.
* We should try to send a FIN packet really hard, but eventually give up.
*/
--
2.52.0.457.g6b5491de43-goog
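For readers unfamiliar with the accounting in the moved helper, here is a hedged userspace sketch (plain Python, assuming 4 KiB pages; `forced_mem_schedule` and its return shape are hypothetical names for illustration) of the arithmetic sk_forced_mem_schedule() performs: it charges only when the request exceeds the socket's forward allocation, and rounds the shortfall up to whole pages the way the kernel's sk_mem_pages() does. The cgroup charge and protocol memory accounting are omitted.

```python
PAGE_SHIFT = 12            # assumed 4 KiB pages; arch-dependent in the kernel
PAGE_SIZE = 1 << PAGE_SHIFT

def sk_mem_pages(amt: int) -> int:
    """Round a byte amount up to whole pages (mirrors the kernel helper)."""
    return (amt + PAGE_SIZE - 1) >> PAGE_SHIFT

def forced_mem_schedule(forward_alloc: int, size: int) -> tuple[int, int]:
    """Model of the forced charge: returns (new forward_alloc, pages charged).

    Hypothetical simplification of sk_forced_mem_schedule(); the real
    function also charges the memcg and the protocol memory counters.
    """
    delta = size - forward_alloc
    if delta <= 0:
        return forward_alloc, 0   # enough headroom already, charge nothing
    amt = sk_mem_pages(delta)
    return forward_alloc + (amt << PAGE_SHIFT), amt
```

With a fresh socket (forward_alloc == 0), scheduling even a 1-byte FIN skb charges one full page of forward allocation; a socket that already has headroom charges nothing, which is the fast-path case the patch optimizes for.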
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c
2026-01-23 11:16 [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c Eric Dumazet
@ 2026-01-23 15:52 ` Neal Cardwell
2026-01-26 8:18 ` Paolo Abeni
2026-01-27 14:10 ` patchwork-bot+netdevbpf
1 sibling, 1 reply; 8+ messages in thread
From: Neal Cardwell @ 2026-01-23 15:52 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
Kuniyuki Iwashima, netdev, eric.dumazet
On Fri, Jan 23, 2026 at 6:16 AM Eric Dumazet <edumazet@google.com> wrote:
>
> TCP fast path can (auto)inline this helper, instead
> of (auto)inlining it from tcp_send_fin().
>
> No change of overall code size, but tcp_sendmsg() is faster.
>
> $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
> Function old new delta
> tcp_stream_alloc_skb 216 357 +141
> tcp_send_fin 688 548 -140
> Total: Before=22236729, After=22236730, chg +0.00%
>
> BTW, we might change tcp_send_fin() to use tcp_stream_alloc_skb().
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Thanks, Eric!
neal
* Re: [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c
2026-01-23 15:52 ` Neal Cardwell
@ 2026-01-26 8:18 ` Paolo Abeni
2026-01-26 8:45 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2026-01-26 8:18 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Jakub Kicinski, Simon Horman, Kuniyuki Iwashima,
netdev, eric.dumazet, Neal Cardwell
Hi Eric,
On Fri, Jan 23, 2026 at 6:16 AM Eric Dumazet <edumazet@google.com> wrote:
> TCP fast path can (auto)inline this helper, instead
> of (auto)inlining it from tcp_send_fin().
>
> No change of overall code size, but tcp_sendmsg() is faster.
>
> $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
> Function old new delta
> tcp_stream_alloc_skb 216 357 +141
> tcp_send_fin 688 548 -140
> Total: Before=22236729, After=22236730, chg +0.00%
>
> BTW, we might change tcp_send_fin() to use tcp_stream_alloc_skb().
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
I've a question out of sheer curiosity: are you using some specific tool
to look for inlining opportunities, or "just" careful code and/or objdump
analysis?
Thanks,
Paolo
* Re: [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c
2026-01-26 8:18 ` Paolo Abeni
@ 2026-01-26 8:45 ` Eric Dumazet
2026-01-27 1:56 ` Jason Xing
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2026-01-26 8:45 UTC (permalink / raw)
To: Paolo Abeni
Cc: David S . Miller, Jakub Kicinski, Simon Horman, Kuniyuki Iwashima,
netdev, eric.dumazet, Neal Cardwell
On Mon, Jan 26, 2026 at 9:18 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> Hi Eric,
>
> On Fri, Jan 23, 2026 at 6:16 AM Eric Dumazet <edumazet@google.com> wrote:
> > TCP fast path can (auto)inline this helper, instead
> > of (auto)inlining it from tcp_send_fin().
> >
> > No change of overall code size, but tcp_sendmsg() is faster.
> >
> > $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> > add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
> > Function old new delta
> > tcp_stream_alloc_skb 216 357 +141
> > tcp_send_fin 688 548 -140
> > Total: Before=22236729, After=22236730, chg +0.00%
> >
> > BTW, we might change tcp_send_fin() to use tcp_stream_alloc_skb().
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>
> I've a question out of sheer curiosity: are you using some specific tool
> to look for inlining opportunities, or "just" careful code and/or objdump
> analysis?
I am studying performance profiles on a stress test using Google
production kernels, on platforms that are a big chunk of the fleet.
These kernels are very close to upstream (at least for the core and TCP
networking stacks), and use clang and FDO.
I am currently focusing on some functions that even FDO does not
inline, for some reason.
Thanks !
* Re: [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c
2026-01-26 8:45 ` Eric Dumazet
@ 2026-01-27 1:56 ` Jason Xing
2026-01-27 2:06 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Jason Xing @ 2026-01-27 1:56 UTC (permalink / raw)
To: Eric Dumazet
Cc: Paolo Abeni, David S . Miller, Jakub Kicinski, Simon Horman,
Kuniyuki Iwashima, netdev, eric.dumazet, Neal Cardwell
Hi Eric,
On Mon, Jan 26, 2026 at 4:45 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Jan 26, 2026 at 9:18 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > Hi Eric,
> >
> > On Fri, Jan 23, 2026 at 6:16 AM Eric Dumazet <edumazet@google.com> wrote:
> > > TCP fast path can (auto)inline this helper, instead
> > > of (auto)inlining it from tcp_send_fin().
> > >
> > > No change of overall code size, but tcp_sendmsg() is faster.
> > >
> > > $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> > > add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
> > > Function old new delta
> > > tcp_stream_alloc_skb 216 357 +141
> > > tcp_send_fin 688 548 -140
> > > Total: Before=22236729, After=22236730, chg +0.00%
> > >
> > > BTW, we might change tcp_send_fin() to use tcp_stream_alloc_skb().
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> >
> > I've a question out of sheer curiosity: are you using some specific tool
> > to look for inlining opportunities, or "just" careful code and/or objdump
> > analysis?
Paolo, right, you beat me to it! I'm also curious :)
>
> I am studying performance profiles on a stress test using Google
> production kernels, on platforms that are a big chunk of the fleet.
> These kernels are very close to upstream (at least for the core and TCP
> networking stacks), and use clang and FDO.
>
> I am currently focusing on some functions that even FDO does not
> inline, for some reason.
Eric, could you share more of your valuable experience and the
methodology behind this?
Prior to your recent work, I completely missed that we could even do
something about inlining these functions. Normally perf doesn't give us
a clear micro-level view of those functions, and in particular no hints
on whether we should inline them. If there is a tool for this, it would
be super awesome!
Let me guess: are you trying to avoid function call overhead as much as
you can in the hot path, based on your deep understanding of
ingress/egress performance?
Thanks,
Jason
* Re: [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c
2026-01-27 1:56 ` Jason Xing
@ 2026-01-27 2:06 ` Eric Dumazet
2026-01-27 6:15 ` Jason Xing
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2026-01-27 2:06 UTC (permalink / raw)
To: Jason Xing
Cc: Paolo Abeni, David S . Miller, Jakub Kicinski, Simon Horman,
Kuniyuki Iwashima, netdev, eric.dumazet, Neal Cardwell
On Tue, Jan 27, 2026 at 2:56 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> Hi Eric,
>
> On Mon, Jan 26, 2026 at 4:45 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, Jan 26, 2026 at 9:18 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > >
> > > Hi Eric,
> > >
> > > On Fri, Jan 23, 2026 at 6:16 AM Eric Dumazet <edumazet@google.com> wrote:
> > > > TCP fast path can (auto)inline this helper, instead
> > > > of (auto)inlining it from tcp_send_fin().
> > > >
> > > > No change of overall code size, but tcp_sendmsg() is faster.
> > > >
> > > > $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> > > > add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
> > > > Function old new delta
> > > > tcp_stream_alloc_skb 216 357 +141
> > > > tcp_send_fin 688 548 -140
> > > > Total: Before=22236729, After=22236730, chg +0.00%
> > > >
> > > > BTW, we might change tcp_send_fin() to use tcp_stream_alloc_skb().
> > > >
> > > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > >
> > > I've a question out of sheer curiosity: are you using some specific tool
> > > to look for inlining opportunities, or "just" careful code and/or objdump
> > > analysis?
>
> Paolo, right, you beat me to it! I'm also curious :)
>
> >
> > I am studying performance profiles on a stress test using Google
> > production kernels, on platforms that are a big chunk of the fleet.
> > These kernels are very close to upstream (at least for the core and TCP
> > networking stacks), and use clang and FDO.
> >
> > I am currently focusing on some functions that even FDO does not
> > inline, for some reason.
>
> Eric, could you share more of your valuable experience and the
> methodology behind this?
>
> Prior to your recent work, I completely missed that we could even do
> something about inlining these functions. Normally perf doesn't give us
> a clear micro-level view of those functions, and in particular no hints
> on whether we should inline them. If there is a tool for this, it would
> be super awesome!
>
perf is your friend, and a deep knowledge of your cpu-du-jour behavior
(assembly)
perf record -C xxx -g sleep 10
perf report --no-children
<Use the perf UI appropriately>
> Let me guess: are you trying to avoid function call overhead as much as
> you can in the hot path, based on your deep understanding of
> ingress/egress performance?
I think you could answer the question by parsing the changelogs of my
recent commits ;)
* Re: [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c
2026-01-27 2:06 ` Eric Dumazet
@ 2026-01-27 6:15 ` Jason Xing
0 siblings, 0 replies; 8+ messages in thread
From: Jason Xing @ 2026-01-27 6:15 UTC (permalink / raw)
To: Eric Dumazet
Cc: Paolo Abeni, David S . Miller, Jakub Kicinski, Simon Horman,
Kuniyuki Iwashima, netdev, eric.dumazet, Neal Cardwell
On Tue, Jan 27, 2026 at 10:06 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Jan 27, 2026 at 2:56 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > Hi Eric,
> >
> > On Mon, Jan 26, 2026 at 4:45 PM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Mon, Jan 26, 2026 at 9:18 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > > >
> > > > Hi Eric,
> > > >
> > > > On Fri, Jan 23, 2026 at 6:16 AM Eric Dumazet <edumazet@google.com> wrote:
> > > > > TCP fast path can (auto)inline this helper, instead
> > > > > of (auto)inlining it from tcp_send_fin().
> > > > >
> > > > > No change of overall code size, but tcp_sendmsg() is faster.
> > > > >
> > > > > $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> > > > > add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
> > > > > Function old new delta
> > > > > tcp_stream_alloc_skb 216 357 +141
> > > > > tcp_send_fin 688 548 -140
> > > > > Total: Before=22236729, After=22236730, chg +0.00%
> > > > >
> > > > > BTW, we might change tcp_send_fin() to use tcp_stream_alloc_skb().
> > > > >
> > > > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > >
> > > > I've a question out of sheer curiosity: are you using some specific tool
> > > > to look for inlining opportunities, or "just" careful code and/or objdump
> > > > analysis?
> >
> > Paolo, right, you beat me to it! I'm also curious :)
> >
> > >
> > > I am studying performance profiles on a stress test using Google
> > > production kernels, on platforms that are a big chunk of the fleet.
> > > These kernels are very close to upstream (at least for the core and TCP
> > > networking stacks), and use clang and FDO.
> > >
> > > I am currently focusing on some functions that even FDO does not
> > > inline, for some reason.
> >
> > Eric, could you share more of your valuable experience and the
> > methodology behind this?
> >
> > Prior to your recent work, I completely missed that we could even do
> > something about inlining these functions. Normally perf doesn't give us
> > a clear micro-level view of those functions, and in particular no hints
> > on whether we should inline them. If there is a tool for this, it would
> > be super awesome!
> >
>
> perf is your friend, and a deep knowledge of your cpu-du-jour behavior
> (assembly)
>
> perf record -C xxx -g sleep 10
> perf report --no-children
>
> <Use the perf UI appropriately>
>
> > Let me guess: are you trying to avoid function call overhead as much as
> > you can in the hot path, based on your deep understanding of
> > ingress/egress performance?
>
> I think you could answer the question by parsing the changelogs of my
> recent commits ;)
Right, I've already picked up some insightful knowledge from your
commits :) Thanks for the interesting work!
Thanks,
Jason
* Re: [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c
2026-01-23 11:16 [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c Eric Dumazet
2026-01-23 15:52 ` Neal Cardwell
@ 2026-01-27 14:10 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-01-27 14:10 UTC (permalink / raw)
To: Eric Dumazet
Cc: davem, kuba, pabeni, horms, ncardwell, kuniyu, netdev,
eric.dumazet
Hello:
This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Fri, 23 Jan 2026 11:16:05 +0000 you wrote:
> TCP fast path can (auto)inline this helper, instead
> of (auto)inlining it from tcp_send_fin().
>
> No change of overall code size, but tcp_sendmsg() is faster.
>
> $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
> Function old new delta
> tcp_stream_alloc_skb 216 357 +141
> tcp_send_fin 688 548 -140
> Total: Before=22236729, After=22236730, chg +0.00%
>
> [...]
Here is the summary with links:
- [net-next] tcp: move sk_forced_mem_schedule() to tcp.c
https://git.kernel.org/netdev/net-next/c/a18056a6c11c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html