* [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete}
@ 2026-01-18 17:52 Eric Dumazet
2026-01-18 17:52 ` [PATCH v2 net-next 1/3] net: always inline __skb_incr_checksum_unnecessary() Eric Dumazet
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-01-18 17:52 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, netdev, eric.dumazet, Eric Dumazet
On some platforms, GRO stack is too deep and causes cpu stalls.
Decreasing call depths by one shows a 1.5 % gain on Zen2 cpus.
(32 RX queues, 100Gbit NIC, RFS enabled, tcp_rr with 128 threads and 10,000 flows)
We can go further by inlining ipv6_gro_{receive,complete}
and take care of IPv4 if there is interest.
Note: two temporary __always_inline will be replaced with
inline_for_performance when available.
v2: dealt with udp6_gro_receive()/udp6_gro_complete()
missing declarations (kernel test robot <lkp@intel.com>)
for CONFIG_MITIGATION_RETPOLINE=n
Cumulative size increase for this series (of 3):
$ scripts/bloat-o-meter -t vmlinux.0 vmlinux.3
add/remove: 2/2 grow/shrink: 5/1 up/down: 1572/-471 (1101)
Function old new delta
ipv6_gro_receive 1069 1846 +777
ipv6_gro_complete 433 733 +300
tcp6_check_fraglist_gro - 272 +272
tcp6_gro_complete 227 306 +79
tcp4_gro_complete 325 397 +72
ipv6_offload_init 218 274 +56
__pfx_tcp6_check_fraglist_gro - 16 +16
__pfx___skb_incr_checksum_unnecessary 32 - -32
__skb_incr_checksum_unnecessary 186 - -186
tcp6_gro_receive 959 706 -253
Total: Before=22592724, After=22593825, chg +0.00%
Eric Dumazet (3):
net: always inline __skb_incr_checksum_unnecessary()
gro: inline tcp6_gro_receive()
gro: inline tcp6_gro_complete()
include/linux/skbuff.h | 2 +-
include/net/gro.h | 5 ++---
include/net/tcp.h | 2 --
net/ipv6/Makefile | 2 +-
net/ipv6/ip6_offload.c | 43 ++++++++++++++++++++--------------------
net/ipv6/tcpv6_offload.c | 12 +++++------
6 files changed, 31 insertions(+), 35 deletions(-)
--
2.52.0.457.g6b5491de43-goog
* [PATCH v2 net-next 1/3] net: always inline __skb_incr_checksum_unnecessary()
2026-01-18 17:52 [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete} Eric Dumazet
@ 2026-01-18 17:52 ` Eric Dumazet
2026-01-18 17:52 ` [PATCH v2 net-next 2/3] gro: inline tcp6_gro_receive() Eric Dumazet
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-01-18 17:52 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, netdev, eric.dumazet, Eric Dumazet
clang does not inline this helper in GRO fast path.
We can save space and cpu cycles.
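A minimal userspace sketch of the idea, not kernel code: `struct fake_skb`, the `FAKE_*` constants and `fake_gro_complete()` are hypothetical stand-ins, and `my_always_inline` spells out the GCC/clang attribute directly. The helper body follows the shape visible in the hunk below; the `else` branch is an assumption added to make the example complete.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's checksum states. */
enum fake_ip_summed { FAKE_CHECKSUM_NONE = 0, FAKE_CHECKSUM_UNNECESSARY = 1 };
#define FAKE_MAX_CSUM_LEVEL 3

struct fake_skb {
	uint8_t ip_summed;
	uint8_t csum_level;
};

/* GCC/clang spelling of a forced-inline helper (assumed equivalent here). */
#define my_always_inline inline __attribute__((__always_inline__))

/* Small helper that the compiler must now fold into every caller. */
static my_always_inline void fake_incr_checksum_unnecessary(struct fake_skb *skb)
{
	if (skb->ip_summed == FAKE_CHECKSUM_UNNECESSARY) {
		if (skb->csum_level < FAKE_MAX_CSUM_LEVEL)
			skb->csum_level++;
	} else if (skb->ip_summed == FAKE_CHECKSUM_NONE) {
		skb->ip_summed = FAKE_CHECKSUM_UNNECESSARY;
		skb->csum_level = 0;
	}
}

/* Hot-path caller: no call/return pair is emitted for the helper. */
void fake_gro_complete(struct fake_skb *skb)
{
	fake_incr_checksum_unnecessary(skb);
}
```

Each caller grows by the size of the helper body, which is the trade-off the bloat-o-meter output below quantifies.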
$ scripts/bloat-o-meter -t vmlinux.0 vmlinux.1
add/remove: 0/2 grow/shrink: 2/0 up/down: 156/-218 (-62)
Function old new delta
tcp6_gro_complete 227 311 +84
tcp4_gro_complete 325 397 +72
__pfx___skb_incr_checksum_unnecessary 32 - -32
__skb_incr_checksum_unnecessary 186 - -186
Total: Before=22592724, After=22592662, chg -0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/skbuff.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 86737076101d4a8452e90fe78adcdcfdefb79169..e6bfe5d0c5252b2e7540e1fef9317aab83feced2 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4763,7 +4763,7 @@ static inline void __skb_decr_checksum_unnecessary(struct sk_buff *skb)
}
}
-static inline void __skb_incr_checksum_unnecessary(struct sk_buff *skb)
+static __always_inline void __skb_incr_checksum_unnecessary(struct sk_buff *skb)
{
if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
--
2.52.0.457.g6b5491de43-goog
* [PATCH v2 net-next 2/3] gro: inline tcp6_gro_receive()
2026-01-18 17:52 [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete} Eric Dumazet
2026-01-18 17:52 ` [PATCH v2 net-next 1/3] net: always inline __skb_incr_checksum_unnecessary() Eric Dumazet
@ 2026-01-18 17:52 ` Eric Dumazet
2026-01-18 17:52 ` [PATCH v2 net-next 3/3] gro: inline tcp6_gro_complete() Eric Dumazet
2026-01-20 15:30 ` [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete} Jakub Kicinski
3 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-01-18 17:52 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, netdev, eric.dumazet, Eric Dumazet
FDO/LTO are unable to inline tcp6_gro_receive() from ipv6_gro_receive().
Make sure tcp6_check_fraglist_gro() is only called when needed,
so that the compiler can leave it out-of-line.
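The pattern described above can be sketched in userspace C, under stated assumptions: every `fake_*` name and `FAKE_F_GRO_FRAGLIST` is hypothetical, and the explicit `noinline` attribute is used only to make the intent visible (the patch itself leaves that decision to the compiler). The rare-feature test moves into the inlined caller, so the common case never pays for a call.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_F_GRO_FRAGLIST (1u << 0)	/* hypothetical feature bit */
#define fake_unlikely(x) __builtin_expect(!!(x), 0)

struct fake_pkt {
	uint32_t dev_features;
	bool is_flist;
	int slow_path_calls;	/* instrumentation for the example only */
};

/* Rare path: kept out of line so it does not bloat the inlined caller. */
static __attribute__((__noinline__)) void fake_check_fraglist(struct fake_pkt *pkt)
{
	pkt->slow_path_calls++;
	pkt->is_flist = true;
}

/* L4 handler, forced inline into the L3 handler below. */
static inline __attribute__((__always_inline__)) int fake_l4_receive(struct fake_pkt *pkt)
{
	/* Feature test hoisted to the caller: no call in the common case. */
	if (fake_unlikely(pkt->dev_features & FAKE_F_GRO_FRAGLIST))
		fake_check_fraglist(pkt);
	return pkt->is_flist ? 1 : 0;
}

/* L3 handler: one stack frame covers both layers. */
int fake_l3_receive(struct fake_pkt *pkt)
{
	return fake_l4_receive(pkt);
}
```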
$ scripts/bloat-o-meter -t vmlinux.1 vmlinux.2
add/remove: 2/0 grow/shrink: 3/1 up/down: 1123/-253 (870)
Function old new delta
ipv6_gro_receive 1069 1846 +777
tcp6_check_fraglist_gro - 272 +272
ipv6_offload_init 218 274 +56
__pfx_tcp6_check_fraglist_gro - 16 +16
ipv6_gro_complete 433 435 +2
tcp6_gro_receive 959 706 -253
Total: Before=22592662, After=22593532, chg +0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/net/gro.h | 3 +--
include/net/tcp.h | 1 -
net/ipv6/Makefile | 2 +-
net/ipv6/ip6_offload.c | 22 +++++++++++++---------
net/ipv6/tcpv6_offload.c | 10 ++++------
5 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/include/net/gro.h b/include/net/gro.h
index b65f631c521d7d9741ef86781add0038c9ce4055..85e5eeed4c90feef9440c57af9382b0e9ead1219 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -405,8 +405,7 @@ INDIRECT_CALLABLE_DECLARE(struct sk_buff *udp4_gro_receive(struct list_head *,
struct sk_buff *));
INDIRECT_CALLABLE_DECLARE(int udp4_gro_complete(struct sk_buff *, int));
-INDIRECT_CALLABLE_DECLARE(struct sk_buff *udp6_gro_receive(struct list_head *,
- struct sk_buff *));
+struct sk_buff *udp6_gro_receive(struct list_head *, struct sk_buff *);
INDIRECT_CALLABLE_DECLARE(int udp6_gro_complete(struct sk_buff *, int));
#define indirect_call_gro_receive_inet(cb, f2, f1, head, skb) \
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 15f9b20f851fe322f4417ff403c3965436aa3f9f..3b94c84888a884d9ca8eb602ad1f7d4f941f3ef9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2327,7 +2327,6 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb,
INDIRECT_CALLABLE_DECLARE(int tcp4_gro_complete(struct sk_buff *skb, int thoff));
INDIRECT_CALLABLE_DECLARE(struct sk_buff *tcp4_gro_receive(struct list_head *head, struct sk_buff *skb));
INDIRECT_CALLABLE_DECLARE(int tcp6_gro_complete(struct sk_buff *skb, int thoff));
-INDIRECT_CALLABLE_DECLARE(struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb));
#ifdef CONFIG_INET
void tcp_gro_complete(struct sk_buff *skb);
#else
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index d283c59df4c1c421bc043056fe11e5437cc4aece..0492f1a0b4918ada8c56cf649fbec04c7114863a 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -45,7 +45,7 @@ obj-$(CONFIG_IPV6_FOU) += fou6.o
obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
obj-$(CONFIG_INET) += output_core.o protocol.o \
- ip6_offload.o tcpv6_offload.o exthdrs_offload.o
+ ip6_offload.o exthdrs_offload.o
obj-$(subst m,y,$(CONFIG_IPV6)) += inet6_hashtables.o
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index fce91183797a60fcbf271c73e086aeb0aa9d40c6..4d96154c0dcd019322908ab6ddaa663a2a565e44 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -19,6 +19,7 @@
#include <net/gso.h>
#include "ip6_offload.h"
+#include "tcpv6_offload.c"
/* All GRO functions are always builtin, except UDP over ipv6, which lays in
* ipv6 module, as it depends on UDPv6 lookup function, so we need special care
@@ -30,13 +31,6 @@
#define INDIRECT_CALL_L4(f, f2, f1, ...) INDIRECT_CALL_1(f, f2, __VA_ARGS__)
#endif
-#define indirect_call_gro_receive_l4(f2, f1, cb, head, skb) \
-({ \
- unlikely(gro_recursion_inc_test(skb)) ? \
- NAPI_GRO_CB(skb)->flush |= 1, NULL : \
- INDIRECT_CALL_L4(cb, f2, f1, head, skb); \
-})
-
static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
{
const struct net_offload *ops = NULL;
@@ -298,9 +292,19 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
skb_gro_postpull_rcsum(skb, iph, nlen);
- pp = indirect_call_gro_receive_l4(tcp6_gro_receive, udp6_gro_receive,
- ops->callbacks.gro_receive, head, skb);
+ if (unlikely(gro_recursion_inc_test(skb))) {
+ flush = 1;
+ goto out;
+ }
+ if (likely(proto == IPPROTO_TCP))
+ pp = tcp6_gro_receive(head, skb);
+#if IS_BUILTIN(CONFIG_IPV6)
+ else if (likely(proto == IPPROTO_UDP))
+ pp = udp6_gro_receive(head, skb);
+#endif
+ else
+ pp = ops->callbacks.gro_receive(head, skb);
out:
skb_gro_flush_final(skb, pp, flush);
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index effeba58630b5ac2593b824bd8fc10a473954b6c..7f19ce423058870f285b7f8ae2a4d116d783f9fb 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -24,9 +24,6 @@ static void tcp6_check_fraglist_gro(struct list_head *head, struct sk_buff *skb,
struct net *net;
int iif, sdif;
- if (likely(!(skb->dev->features & NETIF_F_GRO_FRAGLIST)))
- return;
-
p = tcp_gro_lookup(head, th);
if (p) {
NAPI_GRO_CB(skb)->is_flist = NAPI_GRO_CB(p)->is_flist;
@@ -45,8 +42,8 @@ static void tcp6_check_fraglist_gro(struct list_head *head, struct sk_buff *skb,
#endif /* IS_ENABLED(CONFIG_IPV6) */
}
-INDIRECT_CALLABLE_SCOPE
-struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
+static __always_inline struct sk_buff *tcp6_gro_receive(struct list_head *head,
+ struct sk_buff *skb)
{
struct tcphdr *th;
@@ -60,7 +57,8 @@ struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
if (!th)
goto flush;
- tcp6_check_fraglist_gro(head, skb, th);
+ if (unlikely(skb->dev->features & NETIF_F_GRO_FRAGLIST))
+ tcp6_check_fraglist_gro(head, skb, th);
return tcp_gro_receive(head, skb, th);
--
2.52.0.457.g6b5491de43-goog
* [PATCH v2 net-next 3/3] gro: inline tcp6_gro_complete()
2026-01-18 17:52 [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete} Eric Dumazet
2026-01-18 17:52 ` [PATCH v2 net-next 1/3] net: always inline __skb_incr_checksum_unnecessary() Eric Dumazet
2026-01-18 17:52 ` [PATCH v2 net-next 2/3] gro: inline tcp6_gro_receive() Eric Dumazet
@ 2026-01-18 17:52 ` Eric Dumazet
2026-01-20 15:30 ` [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete} Jakub Kicinski
3 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-01-18 17:52 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, netdev, eric.dumazet, Eric Dumazet
Remove one function call from GRO stack for native IPv6 + TCP packets.
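A rough userspace sketch of the dispatch this patch moves to, assuming a simplified ops table: all `fake_*` names are hypothetical, and the `-1` error value is a placeholder rather than the kernel's return code. The common ops pointer is compared against a known object and handled by an inlined body; everything else still goes through the function pointer.

```c
#include <stddef.h>

struct fake_offload {
	int (*gro_complete)(int nhoff);
};

/* Hot handler: forced inline so the direct path adds no stack frame. */
static inline __attribute__((__always_inline__)) int fake_tcp6_complete(int nhoff)
{
	return nhoff + 1;
}

/* Out-of-line wrapper kept only so the ops table has an address to store. */
static int fake_tcp6_complete_cb(int nhoff)
{
	return fake_tcp6_complete(nhoff);
}

static int fake_other_complete(int nhoff)
{
	return nhoff + 100;
}

const struct fake_offload fake_tcpv6_offload = { fake_tcp6_complete_cb };
const struct fake_offload fake_other_offload = { fake_other_complete };

int fake_ipv6_complete(const struct fake_offload *ops, int nhoff)
{
	/* Common case: pointer compare, then the inlined handler body. */
	if (__builtin_expect(ops == &fake_tcpv6_offload, 1))
		return fake_tcp6_complete(nhoff);

	/* Everything else keeps the generic indirect call. */
	if (!ops || !ops->gro_complete)
		return -1;	/* placeholder error value */
	return ops->gro_complete(nhoff);
}
```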
$ scripts/bloat-o-meter -t vmlinux.2 vmlinux.3
add/remove: 0/0 grow/shrink: 1/1 up/down: 298/-5 (293)
Function old new delta
ipv6_gro_complete 435 733 +298
tcp6_gro_complete 311 306 -5
Total: Before=22593532, After=22593825, chg +0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/net/gro.h | 2 +-
include/net/tcp.h | 1 -
net/ipv6/ip6_offload.c | 21 +++++++++------------
net/ipv6/tcpv6_offload.c | 2 +-
4 files changed, 11 insertions(+), 15 deletions(-)
diff --git a/include/net/gro.h b/include/net/gro.h
index 85e5eeed4c90feef9440c57af9382b0e9ead1219..2300b6da05b2728ec40f42228f8fa9c195d8479c 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -406,7 +406,7 @@ INDIRECT_CALLABLE_DECLARE(struct sk_buff *udp4_gro_receive(struct list_head *,
INDIRECT_CALLABLE_DECLARE(int udp4_gro_complete(struct sk_buff *, int));
struct sk_buff *udp6_gro_receive(struct list_head *, struct sk_buff *);
-INDIRECT_CALLABLE_DECLARE(int udp6_gro_complete(struct sk_buff *, int));
+int udp6_gro_complete(struct sk_buff *, int);
#define indirect_call_gro_receive_inet(cb, f2, f1, head, skb) \
({ \
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3b94c84888a884d9ca8eb602ad1f7d4f941f3ef9..ebdf59d435b8002ca9b90803f40720a58ce3e809 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2326,7 +2326,6 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb,
struct tcphdr *th);
INDIRECT_CALLABLE_DECLARE(int tcp4_gro_complete(struct sk_buff *skb, int thoff));
INDIRECT_CALLABLE_DECLARE(struct sk_buff *tcp4_gro_receive(struct list_head *head, struct sk_buff *skb));
-INDIRECT_CALLABLE_DECLARE(int tcp6_gro_complete(struct sk_buff *skb, int thoff));
#ifdef CONFIG_INET
void tcp_gro_complete(struct sk_buff *skb);
#else
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 4d96154c0dcd019322908ab6ddaa663a2a565e44..32a104ead8760d33e152e0b0a6a6896d70d155b5 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -21,16 +21,6 @@
#include "ip6_offload.h"
#include "tcpv6_offload.c"
-/* All GRO functions are always builtin, except UDP over ipv6, which lays in
- * ipv6 module, as it depends on UDPv6 lookup function, so we need special care
- * when ipv6 is built as a module
- */
-#if IS_BUILTIN(CONFIG_IPV6)
-#define INDIRECT_CALL_L4(f, f2, f1, ...) INDIRECT_CALL_2(f, f2, f1, __VA_ARGS__)
-#else
-#define INDIRECT_CALL_L4(f, f2, f1, ...) INDIRECT_CALL_1(f, f2, __VA_ARGS__)
-#endif
-
static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
{
const struct net_offload *ops = NULL;
@@ -383,11 +373,18 @@ INDIRECT_CALLABLE_SCOPE int ipv6_gro_complete(struct sk_buff *skb, int nhoff)
}
nhoff += sizeof(*iph) + ipv6_exthdrs_len(iph, &ops);
+
+ if (likely(ops == &net_hotdata.tcpv6_offload))
+ return tcp6_gro_complete(skb, nhoff);
+#if IS_BUILTIN(CONFIG_IPV6)
+ if (ops == &net_hotdata.udpv6_offload)
+ return udp6_gro_complete(skb, nhoff);
+#endif
+
if (WARN_ON(!ops || !ops->callbacks.gro_complete))
goto out;
- err = INDIRECT_CALL_L4(ops->callbacks.gro_complete, tcp6_gro_complete,
- udp6_gro_complete, skb, nhoff);
+ err = ops->callbacks.gro_complete(skb, nhoff);
out:
return err;
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index 7f19ce423058870f285b7f8ae2a4d116d783f9fb..46fa2069d321663ed232e2836db77e3fcb1f4f07 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -67,7 +67,7 @@ static __always_inline struct sk_buff *tcp6_gro_receive(struct list_head *head,
return NULL;
}
-INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int thoff)
+static __always_inline int tcp6_gro_complete(struct sk_buff *skb, int thoff)
{
const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
--
2.52.0.457.g6b5491de43-goog
* Re: [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete}
2026-01-18 17:52 [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete} Eric Dumazet
` (2 preceding siblings ...)
2026-01-18 17:52 ` [PATCH v2 net-next 3/3] gro: inline tcp6_gro_complete() Eric Dumazet
@ 2026-01-20 15:30 ` Jakub Kicinski
2026-01-20 15:41 ` Eric Dumazet
3 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2026-01-20 15:30 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Paolo Abeni, Simon Horman, netdev, eric.dumazet
On Sun, 18 Jan 2026 17:52:12 +0000 Eric Dumazet wrote:
> On some platforms, GRO stack is too deep and causes cpu stalls.
>
> Decreasing call depths by one shows a 1.5 % gain on Zen2 cpus.
> (32 RX queues, 100Gbit NIC, RFS enabled, tcp_rr with 128 threads and 10,000 flows)
>
> We can go further by inlining ipv6_gro_{receive,complete}
> and take care of IPv4 if there is interest.
>
> Note: two temporary __always_inline will be replaced with
> inline_for_performance when available.
>
> v2: dealt with udp6_gro_receive()/udp6_gro_complete()
> missing declarations (kernel test robot <lkp@intel.com>)
> for CONFIG_MITIGATION_RETPOLINE=n
Still not good?
net/ipv6/udp_offload.c:136:17: error: static declaration of ‘udp6_gro_receive’ follows non-static declaration
136 | struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
| ^~~~~~~~~~~~~~~~
In file included from net/ipv6/udp_offload.c:16:
./include/net/gro.h:408:17: note: previous declaration of ‘udp6_gro_receive’ with type ‘struct sk_buff *(struct list_head *, struct sk_buff *)’
408 | struct sk_buff *udp6_gro_receive(struct list_head *, struct sk_buff *);
| ^~~~~~~~~~~~~~~~
net/ipv6/udp_offload.c:168:29: error: static declaration of ‘udp6_gro_complete’ follows non-static declaration
168 | INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
| ^~~~~~~~~~~~~~~~~
./include/net/gro.h:409:5: note: previous declaration of ‘udp6_gro_complete’ with type ‘int(struct sk_buff *, int)’
409 | int udp6_gro_complete(struct sk_buff *, int);
| ^~~~~~~~~~~~~~~~~
--
pw-bot: cr
* Re: [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete}
2026-01-20 15:30 ` [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete} Jakub Kicinski
@ 2026-01-20 15:41 ` Eric Dumazet
2026-01-20 15:44 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-01-20 15:41 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David S . Miller, Paolo Abeni, Simon Horman, netdev, eric.dumazet
On Tue, Jan 20, 2026 at 4:30 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sun, 18 Jan 2026 17:52:12 +0000 Eric Dumazet wrote:
> > On some platforms, GRO stack is too deep and causes cpu stalls.
> >
> > Decreasing call depths by one shows a 1.5 % gain on Zen2 cpus.
> > (32 RX queues, 100Gbit NIC, RFS enabled, tcp_rr with 128 threads and 10,000 flows)
> >
> > We can go further by inlining ipv6_gro_{receive,complete}
> > and take care of IPv4 if there is interest.
> >
> > Note: two temporary __always_inline will be replaced with
> > inline_for_performance when available.
> >
> > v2: dealt with udp6_gro_receive()/udp6_gro_complete()
> > missing declarations (kernel test robot <lkp@intel.com>)
> > for CONFIG_MITIGATION_RETPOLINE=n
>
> Still not good?
>
> net/ipv6/udp_offload.c:136:17: error: static declaration of ‘udp6_gro_receive’ follows non-static declaration
> 136 | struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
> | ^~~~~~~~~~~~~~~~
> In file included from net/ipv6/udp_offload.c:16:
> ./include/net/gro.h:408:17: note: previous declaration of ‘udp6_gro_receive’ with type ‘struct sk_buff *(struct list_head *, struct sk_buff *)’
> 408 | struct sk_buff *udp6_gro_receive(struct list_head *, struct sk_buff *);
> | ^~~~~~~~~~~~~~~~
> net/ipv6/udp_offload.c:168:29: error: static declaration of ‘udp6_gro_complete’ follows non-static declaration
> 168 | INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
> | ^~~~~~~~~~~~~~~~~
> ./include/net/gro.h:409:5: note: previous declaration of ‘udp6_gro_complete’ with type ‘int(struct sk_buff *, int)’
> 409 | int udp6_gro_complete(struct sk_buff *, int);
> | ^~~~~~~~~~~~~~~~~
Oh well, I thought I tested this stuff.
* Re: [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete}
2026-01-20 15:41 ` Eric Dumazet
@ 2026-01-20 15:44 ` Eric Dumazet
2026-01-20 16:29 ` Jakub Kicinski
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-01-20 15:44 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David S . Miller, Paolo Abeni, Simon Horman, netdev, eric.dumazet
On Tue, Jan 20, 2026 at 4:41 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Jan 20, 2026 at 4:30 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Sun, 18 Jan 2026 17:52:12 +0000 Eric Dumazet wrote:
> > > On some platforms, GRO stack is too deep and causes cpu stalls.
> > >
> > > Decreasing call depths by one shows a 1.5 % gain on Zen2 cpus.
> > > (32 RX queues, 100Gbit NIC, RFS enabled, tcp_rr with 128 threads and 10,000 flows)
> > >
> > > We can go further by inlining ipv6_gro_{receive,complete}
> > > and take care of IPv4 if there is interest.
> > >
> > > Note: two temporary __always_inline will be replaced with
> > > inline_for_performance when available.
> > >
> > > v2: dealt with udp6_gro_receive()/udp6_gro_complete()
> > > missing declarations (kernel test robot <lkp@intel.com>)
> > > for CONFIG_MITIGATION_RETPOLINE=n
> >
> > Still not good?
> >
> > net/ipv6/udp_offload.c:136:17: error: static declaration of ‘udp6_gro_receive’ follows non-static declaration
> > 136 | struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
> > | ^~~~~~~~~~~~~~~~
> > In file included from net/ipv6/udp_offload.c:16:
> > ./include/net/gro.h:408:17: note: previous declaration of ‘udp6_gro_receive’ with type ‘struct sk_buff *(struct list_head *, struct sk_buff *)’
> > 408 | struct sk_buff *udp6_gro_receive(struct list_head *, struct sk_buff *);
> > | ^~~~~~~~~~~~~~~~
> > net/ipv6/udp_offload.c:168:29: error: static declaration of ‘udp6_gro_complete’ follows non-static declaration
> > 168 | INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
> > | ^~~~~~~~~~~~~~~~~
> > ./include/net/gro.h:409:5: note: previous declaration of ‘udp6_gro_complete’ with type ‘int(struct sk_buff *, int)’
> > 409 | int udp6_gro_complete(struct sk_buff *, int);
> > | ^~~~~~~~~~~~~~~~~
>
> Oh well, I thought I tested this stuff.
Interesting... clang (our default compiler for kernel) does not complain at all.
* Re: [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete}
2026-01-20 15:44 ` Eric Dumazet
@ 2026-01-20 16:29 ` Jakub Kicinski
2026-01-20 16:38 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2026-01-20 16:29 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Paolo Abeni, Simon Horman, netdev, eric.dumazet
On Tue, 20 Jan 2026 16:44:52 +0100 Eric Dumazet wrote:
> On Tue, Jan 20, 2026 at 4:41 PM Eric Dumazet <edumazet@google.com> wrote:
> > > Still not good?
> > >
> > > net/ipv6/udp_offload.c:136:17: error: static declaration of ‘udp6_gro_receive’ follows non-static declaration
> > > 136 | struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
> > > | ^~~~~~~~~~~~~~~~
> > > In file included from net/ipv6/udp_offload.c:16:
> > > ./include/net/gro.h:408:17: note: previous declaration of ‘udp6_gro_receive’ with type ‘struct sk_buff *(struct list_head *, struct sk_buff *)’
> > > 408 | struct sk_buff *udp6_gro_receive(struct list_head *, struct sk_buff *);
> > > | ^~~~~~~~~~~~~~~~
> > > net/ipv6/udp_offload.c:168:29: error: static declaration of ‘udp6_gro_complete’ follows non-static declaration
> > > 168 | INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
> > > | ^~~~~~~~~~~~~~~~~
> > > ./include/net/gro.h:409:5: note: previous declaration of ‘udp6_gro_complete’ with type ‘int(struct sk_buff *, int)’
> > > 409 | int udp6_gro_complete(struct sk_buff *, int);
> > > | ^~~~~~~~~~~~~~~~~
> >
> > Oh well, I thought I tested this stuff.
>
> Interesting... clang (our default compiler for kernel) does not complain at all.
Well, at least I _think_ it's this series, haven't tested.
It breaks in the kselftests, not allmodconfig, here's the full config:
https://netdev-ctrl.bots.linux.dev/logs/vmksft/packetdrill-dbg/results/482021/config
Also possible that it's a silent conflict with another pending series.
* Re: [PATCH v2 net-next 0/3] gro: inline tcp6_gro_{receive,complete}
2026-01-20 16:29 ` Jakub Kicinski
@ 2026-01-20 16:38 ` Eric Dumazet
0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-01-20 16:38 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David S . Miller, Paolo Abeni, Simon Horman, netdev, eric.dumazet
On Tue, Jan 20, 2026 at 5:29 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 20 Jan 2026 16:44:52 +0100 Eric Dumazet wrote:
> > On Tue, Jan 20, 2026 at 4:41 PM Eric Dumazet <edumazet@google.com> wrote:
> > > > Still not good?
> > > >
> > > > net/ipv6/udp_offload.c:136:17: error: static declaration of ‘udp6_gro_receive’ follows non-static declaration
> > > > 136 | struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
> > > > | ^~~~~~~~~~~~~~~~
> > > > In file included from net/ipv6/udp_offload.c:16:
> > > > ./include/net/gro.h:408:17: note: previous declaration of ‘udp6_gro_receive’ with type ‘struct sk_buff *(struct list_head *, struct sk_buff *)’
> > > > 408 | struct sk_buff *udp6_gro_receive(struct list_head *, struct sk_buff *);
> > > > | ^~~~~~~~~~~~~~~~
> > > > net/ipv6/udp_offload.c:168:29: error: static declaration of ‘udp6_gro_complete’ follows non-static declaration
> > > > 168 | INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
> > > > | ^~~~~~~~~~~~~~~~~
> > > > ./include/net/gro.h:409:5: note: previous declaration of ‘udp6_gro_complete’ with type ‘int(struct sk_buff *, int)’
> > > > 409 | int udp6_gro_complete(struct sk_buff *, int);
> > > > | ^~~~~~~~~~~~~~~~~
> > >
> > > Oh well, I thought I tested this stuff.
> >
> > Interesting... clang (our default compiler for kernel) does not complain at all.
>
> Well, at least I _think_ it's this series, haven't tested.
> It breaks in the kselftests, no allmodconfig, here's the full config:
>
> https://netdev-ctrl.bots.linux.dev/logs/vmksft/packetdrill-dbg/results/482021/config
>
> Also possible that it's a silent conflict with another pending series.
To clarify: clang does not see an error, gcc does.
I removed the INDIRECT_CALLABLE_SCOPE from both functions for v3.
Thanks.
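For readers following the gcc diagnostic quoted above, a minimal sketch of the mismatch and the fix, assuming the scope macro expands to `static` in some configurations (as the "static declaration ... follows non-static declaration" error suggests). `FAKE_CALLABLE_SCOPE`, `FAKE_RETPOLINE` and `fake_udp6_complete()` are hypothetical names, not the kernel's.

```c
/* Hypothetical scope macro: empty when FAKE_RETPOLINE is defined,
 * `static` otherwise. It is defined here only to show the broken form.
 */
#ifdef FAKE_RETPOLINE
#define FAKE_CALLABLE_SCOPE
#else
#define FAKE_CALLABLE_SCOPE static
#endif

/* Plain prototype with external linkage, as a shared header would carry it. */
int fake_udp6_complete(int nhoff);

/* Broken form: with FAKE_RETPOLINE undefined, the line below would read
 * `static int fake_udp6_complete(...)` and clash with the prototype:
 *
 *   FAKE_CALLABLE_SCOPE int fake_udp6_complete(int nhoff) { ... }
 */

/* Fixed form: no scope macro, so the definition's linkage matches. */
int fake_udp6_complete(int nhoff)
{
	return nhoff + 8;
}
```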