* [PATCH net-next 0/4] net: gro: cleanups and fast path refinement
@ 2024-03-01 19:37 Eric Dumazet
2024-03-01 19:37 ` [PATCH net-next 1/4] net: gro: rename skb_gro_header_hard() Eric Dumazet
` (5 more replies)
0 siblings, 6 replies; 12+ messages in thread
From: Eric Dumazet @ 2024-03-01 19:37 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Richard Gobert, netdev, eric.dumazet, Eric Dumazet
Current GRO stack has a 'fast path' for a subset of drivers,
users of napi_frags_skb().
With TCP zerocopy/direct uses, header split at receive is becoming
more important, and GRO fast path is disabled.
This series makes GRO (a bit) more efficient for almost all use cases.
Eric Dumazet (4):
net: gro: rename skb_gro_header_hard()
net: gro: change skb_gro_network_header()
net: gro: enable fast path for more cases
tcp: gro: micro optimizations in tcp[4]_gro_complete()
drivers/net/geneve.c | 2 +-
include/net/gro.h | 34 ++++++++++++++++------------------
net/core/gro.c | 25 +++++++++++++++++--------
net/ipv4/fou_core.c | 2 +-
net/ipv4/gre_offload.c | 2 +-
net/ipv4/tcp_offload.c | 19 ++++++++++---------
6 files changed, 46 insertions(+), 38 deletions(-)
--
2.44.0.278.ge034bb2e1d-goog
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next 1/4] net: gro: rename skb_gro_header_hard()
2024-03-01 19:37 [PATCH net-next 0/4] net: gro: cleanups and fast path refinement Eric Dumazet
@ 2024-03-01 19:37 ` Eric Dumazet
2024-03-01 19:37 ` [PATCH net-next 2/4] net: gro: change skb_gro_network_header() Eric Dumazet
` (4 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2024-03-01 19:37 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Richard Gobert, netdev, eric.dumazet, Eric Dumazet
skb_gro_header_hard() is renamed to skb_gro_may_pull() to match
the convention used by common helpers like pskb_may_pull().
This means the condition is inverted:
if (skb_gro_header_hard(skb, hlen))
slow_path();
becomes:
if (!skb_gro_may_pull(skb, hlen))
slow_path();
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
drivers/net/geneve.c | 2 +-
include/net/gro.h | 7 ++++---
net/core/gro.c | 2 +-
net/ipv4/fou_core.c | 2 +-
net/ipv4/gre_offload.c | 2 +-
net/ipv4/tcp_offload.c | 2 +-
6 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 6f3f9b446b1d202f6c71a20ce48088691e9120bf..e25e0a31126c1527f5b4f61c83d99f0d9481e58f 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -508,7 +508,7 @@ static struct sk_buff *geneve_gro_receive(struct sock *sk,
gh_len = geneve_hlen(gh);
hlen = off_gnv + gh_len;
- if (skb_gro_header_hard(skb, hlen)) {
+ if (!skb_gro_may_pull(skb, hlen)) {
gh = skb_gro_header_slow(skb, hlen, off_gnv);
if (unlikely(!gh))
goto out;
diff --git a/include/net/gro.h b/include/net/gro.h
index b435f0ddbf64f7bf740b7e479a1b28bcdef122c6..ffc2c96d263b0399a81465d903a6181271b4a3f7 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -145,9 +145,10 @@ static inline void *skb_gro_header_fast(struct sk_buff *skb,
return NAPI_GRO_CB(skb)->frag0 + offset;
}
-static inline int skb_gro_header_hard(struct sk_buff *skb, unsigned int hlen)
+static inline bool skb_gro_may_pull(const struct sk_buff *skb,
+ unsigned int hlen)
{
- return NAPI_GRO_CB(skb)->frag0_len < hlen;
+ return hlen <= NAPI_GRO_CB(skb)->frag0_len;
}
static inline void skb_gro_frag0_invalidate(struct sk_buff *skb)
@@ -172,7 +173,7 @@ static inline void *skb_gro_header(struct sk_buff *skb,
void *ptr;
ptr = skb_gro_header_fast(skb, offset);
- if (skb_gro_header_hard(skb, hlen))
+ if (!skb_gro_may_pull(skb, hlen))
ptr = skb_gro_header_slow(skb, hlen, offset);
return ptr;
}
diff --git a/net/core/gro.c b/net/core/gro.c
index 0759277dc14ee65d0a5376d48694cc1cccaee959..927ccf68149093d6dfd66a622a7db5215a483876 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -700,7 +700,7 @@ static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
skb_reset_mac_header(skb);
skb_gro_reset_offset(skb, hlen);
- if (unlikely(skb_gro_header_hard(skb, hlen))) {
+ if (unlikely(!skb_gro_may_pull(skb, hlen))) {
eth = skb_gro_header_slow(skb, hlen, 0);
if (unlikely(!eth)) {
net_warn_ratelimited("%s: dropping impossible skb from %s\n",
diff --git a/net/ipv4/fou_core.c b/net/ipv4/fou_core.c
index 0c41076e31edadd16f8e55ebc50f84db262a2f0d..a8494f796dca336ca4b30fdbc2f91f3a7e6631fb 100644
--- a/net/ipv4/fou_core.c
+++ b/net/ipv4/fou_core.c
@@ -351,7 +351,7 @@ static struct sk_buff *gue_gro_receive(struct sock *sk,
optlen = guehdr->hlen << 2;
len += optlen;
- if (skb_gro_header_hard(skb, len)) {
+ if (!skb_gro_may_pull(skb, len)) {
guehdr = skb_gro_header_slow(skb, len, off);
if (unlikely(!guehdr))
goto out;
diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
index 311e70bfce407a2cadaa33fbef9a3976375711f4..5028c72d494abdbf890b3270a4849b2f5c3834a3 100644
--- a/net/ipv4/gre_offload.c
+++ b/net/ipv4/gre_offload.c
@@ -174,7 +174,7 @@ static struct sk_buff *gre_gro_receive(struct list_head *head,
grehlen += GRE_HEADER_SECTION;
hlen = off + grehlen;
- if (skb_gro_header_hard(skb, hlen)) {
+ if (!skb_gro_may_pull(skb, hlen)) {
greh = skb_gro_header_slow(skb, hlen, off);
if (unlikely(!greh))
goto out;
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 8311c38267b55ba97e59924c3c1c5b59f133fdcd..875456efc92ddd546e13232dd775aaaf1093ce4f 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -204,7 +204,7 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb)
goto out;
hlen = off + thlen;
- if (skb_gro_header_hard(skb, hlen)) {
+ if (!skb_gro_may_pull(skb, hlen)) {
th = skb_gro_header_slow(skb, hlen, off);
if (unlikely(!th))
goto out;
--
2.44.0.278.ge034bb2e1d-goog
* [PATCH net-next 2/4] net: gro: change skb_gro_network_header()
2024-03-01 19:37 [PATCH net-next 0/4] net: gro: cleanups and fast path refinement Eric Dumazet
2024-03-01 19:37 ` [PATCH net-next 1/4] net: gro: rename skb_gro_header_hard() Eric Dumazet
@ 2024-03-01 19:37 ` Eric Dumazet
2024-03-04 8:28 ` Paolo Abeni
2024-03-01 19:37 ` [PATCH net-next 3/4] net: gro: enable fast path for more cases Eric Dumazet
` (3 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2024-03-01 19:37 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Richard Gobert, netdev, eric.dumazet, Eric Dumazet
Change skb_gro_network_header() to accept a const sk_buff
and to no longer check whether frag0 is NULL.
This allows us to remove skb_gro_frag0_invalidate(),
which shows up in profiles when header-split is enabled.
The sk_buff parameter is constified for skb_gro_header_fast(),
inet_gro_compute_pseudo() and ip6_gro_compute_pseudo().
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/net/gro.h | 27 ++++++++++++---------------
1 file changed, 12 insertions(+), 15 deletions(-)
diff --git a/include/net/gro.h b/include/net/gro.h
index ffc2c96d263b0399a81465d903a6181271b4a3f7..3c3666e46b3055caa619f2da0b6b8b20192a03b4 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -139,7 +139,7 @@ static inline void skb_gro_pull(struct sk_buff *skb, unsigned int len)
NAPI_GRO_CB(skb)->data_offset += len;
}
-static inline void *skb_gro_header_fast(struct sk_buff *skb,
+static inline void *skb_gro_header_fast(const struct sk_buff *skb,
unsigned int offset)
{
return NAPI_GRO_CB(skb)->frag0 + offset;
@@ -151,24 +151,17 @@ static inline bool skb_gro_may_pull(const struct sk_buff *skb,
return hlen <= NAPI_GRO_CB(skb)->frag0_len;
}
-static inline void skb_gro_frag0_invalidate(struct sk_buff *skb)
-{
- NAPI_GRO_CB(skb)->frag0 = NULL;
- NAPI_GRO_CB(skb)->frag0_len = 0;
-}
-
static inline void *skb_gro_header_slow(struct sk_buff *skb, unsigned int hlen,
unsigned int offset)
{
if (!pskb_may_pull(skb, hlen))
return NULL;
- skb_gro_frag0_invalidate(skb);
return skb->data + offset;
}
-static inline void *skb_gro_header(struct sk_buff *skb,
- unsigned int hlen, unsigned int offset)
+static inline void *skb_gro_header(struct sk_buff *skb, unsigned int hlen,
+ unsigned int offset)
{
void *ptr;
@@ -178,13 +171,16 @@ static inline void *skb_gro_header(struct sk_buff *skb,
return ptr;
}
-static inline void *skb_gro_network_header(struct sk_buff *skb)
+static inline void *skb_gro_network_header(const struct sk_buff *skb)
{
- return (NAPI_GRO_CB(skb)->frag0 ?: skb->data) +
- skb_network_offset(skb);
+ if (skb_gro_may_pull(skb, skb_gro_offset(skb)))
+ return skb_gro_header_fast(skb, skb_network_offset(skb));
+
+ return skb_network_header(skb);
}
-static inline __wsum inet_gro_compute_pseudo(struct sk_buff *skb, int proto)
+static inline __wsum inet_gro_compute_pseudo(const struct sk_buff *skb,
+ int proto)
{
const struct iphdr *iph = skb_gro_network_header(skb);
@@ -422,7 +418,8 @@ static inline struct udphdr *udp_gro_udphdr(struct sk_buff *skb)
return uh;
}
-static inline __wsum ip6_gro_compute_pseudo(struct sk_buff *skb, int proto)
+static inline __wsum ip6_gro_compute_pseudo(const struct sk_buff *skb,
+ int proto)
{
const struct ipv6hdr *iph = skb_gro_network_header(skb);
--
2.44.0.278.ge034bb2e1d-goog
* [PATCH net-next 3/4] net: gro: enable fast path for more cases
2024-03-01 19:37 [PATCH net-next 0/4] net: gro: cleanups and fast path refinement Eric Dumazet
2024-03-01 19:37 ` [PATCH net-next 1/4] net: gro: rename skb_gro_header_hard() Eric Dumazet
2024-03-01 19:37 ` [PATCH net-next 2/4] net: gro: change skb_gro_network_header() Eric Dumazet
@ 2024-03-01 19:37 ` Eric Dumazet
2024-03-01 19:37 ` [PATCH net-next 4/4] tcp: gro: micro optimizations in tcp[4]_gro_complete() Eric Dumazet
` (2 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2024-03-01 19:37 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Richard Gobert, netdev, eric.dumazet, Eric Dumazet
Currently the so-called GRO fast path is only enabled for
napi_frags_skb() callers.
After the prior patch, we no longer have to clear frag0 whenever
bytes are pulled to skb->head.
We can therefore initialize frag0 to skb->data so that the GRO
fast path can be used in the following additional cases:
- Drivers using header split (populating skb->data with headers,
and having the payload in one or more page fragments).
- Drivers not using any page frag (the entire packet is in skb->data).
Add a likely() in skb_gro_may_pull() to help the compiler
generate better code where possible.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/net/gro.h | 2 +-
net/core/gro.c | 23 ++++++++++++++++-------
2 files changed, 17 insertions(+), 8 deletions(-)
diff --git a/include/net/gro.h b/include/net/gro.h
index 3c3666e46b3055caa619f2da0b6b8b20192a03b4..2b58671a65492bf3f9dabf1e7a2d985cee007e11 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -148,7 +148,7 @@ static inline void *skb_gro_header_fast(const struct sk_buff *skb,
static inline bool skb_gro_may_pull(const struct sk_buff *skb,
unsigned int hlen)
{
- return hlen <= NAPI_GRO_CB(skb)->frag0_len;
+ return likely(hlen <= NAPI_GRO_CB(skb)->frag0_len);
}
static inline void *skb_gro_header_slow(struct sk_buff *skb, unsigned int hlen,
diff --git a/net/core/gro.c b/net/core/gro.c
index 927ccf68149093d6dfd66a622a7db5215a483876..6a0edbd826a17573b51c5f71e20ff0c09364fc21 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -369,15 +369,21 @@ static void gro_list_prepare(const struct list_head *head,
static inline void skb_gro_reset_offset(struct sk_buff *skb, u32 nhoff)
{
- const struct skb_shared_info *pinfo = skb_shinfo(skb);
- const skb_frag_t *frag0 = &pinfo->frags[0];
+ const struct skb_shared_info *pinfo;
+ const skb_frag_t *frag0;
+ unsigned int headlen;
NAPI_GRO_CB(skb)->data_offset = 0;
- NAPI_GRO_CB(skb)->frag0 = NULL;
- NAPI_GRO_CB(skb)->frag0_len = 0;
+ headlen = skb_headlen(skb);
+ NAPI_GRO_CB(skb)->frag0 = skb->data;
+ NAPI_GRO_CB(skb)->frag0_len = headlen;
+ if (headlen)
+ return;
+
+ pinfo = skb_shinfo(skb);
+ frag0 = &pinfo->frags[0];
- if (!skb_headlen(skb) && pinfo->nr_frags &&
- !PageHighMem(skb_frag_page(frag0)) &&
+ if (pinfo->nr_frags && !PageHighMem(skb_frag_page(frag0)) &&
(!NET_IP_ALIGN || !((skb_frag_off(frag0) + nhoff) & 3))) {
NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,
@@ -710,7 +716,10 @@ static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
}
} else {
eth = (const struct ethhdr *)skb->data;
- gro_pull_from_frag0(skb, hlen);
+
+ if (NAPI_GRO_CB(skb)->frag0 != skb->data)
+ gro_pull_from_frag0(skb, hlen);
+
NAPI_GRO_CB(skb)->frag0 += hlen;
NAPI_GRO_CB(skb)->frag0_len -= hlen;
}
--
2.44.0.278.ge034bb2e1d-goog
* [PATCH net-next 4/4] tcp: gro: micro optimizations in tcp[4]_gro_complete()
2024-03-01 19:37 [PATCH net-next 0/4] net: gro: cleanups and fast path refinement Eric Dumazet
` (2 preceding siblings ...)
2024-03-01 19:37 ` [PATCH net-next 3/4] net: gro: enable fast path for more cases Eric Dumazet
@ 2024-03-01 19:37 ` Eric Dumazet
2024-03-04 10:30 ` [PATCH net-next 0/4] net: gro: cleanups and fast path refinement Paolo Abeni
2024-03-05 12:50 ` patchwork-bot+netdevbpf
5 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2024-03-01 19:37 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Richard Gobert, netdev, eric.dumazet, Eric Dumazet
In tcp_gro_complete(), moving the skb->inner_transport_header
assignment allows the compiler to reuse the previously loaded value
of skb->transport_header.
Caching skb_shinfo() avoids duplicate loads as well.
In tcp4_gro_complete(), performing a single update of
skb_shinfo(skb)->gso_type also generates better code.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_offload.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 875456efc92ddd546e13232dd775aaaf1093ce4f..b955ab3b236d965a38054efa004fe12f03074c70 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -299,18 +299,20 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb)
void tcp_gro_complete(struct sk_buff *skb)
{
struct tcphdr *th = tcp_hdr(skb);
+ struct skb_shared_info *shinfo;
+
+ if (skb->encapsulation)
+ skb->inner_transport_header = skb->transport_header;
skb->csum_start = (unsigned char *)th - skb->head;
skb->csum_offset = offsetof(struct tcphdr, check);
skb->ip_summed = CHECKSUM_PARTIAL;
- skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
+ shinfo = skb_shinfo(skb);
+ shinfo->gso_segs = NAPI_GRO_CB(skb)->count;
if (th->cwr)
- skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
-
- if (skb->encapsulation)
- skb->inner_transport_header = skb->transport_header;
+ shinfo->gso_type |= SKB_GSO_TCP_ECN;
}
EXPORT_SYMBOL(tcp_gro_complete);
@@ -335,10 +337,9 @@ INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
iph->daddr, 0);
- skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4;
- if (NAPI_GRO_CB(skb)->is_atomic)
- skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_FIXEDID;
+ skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4 |
+ (NAPI_GRO_CB(skb)->is_atomic * SKB_GSO_TCP_FIXEDID);
tcp_gro_complete(skb);
return 0;
--
2.44.0.278.ge034bb2e1d-goog
* Re: [PATCH net-next 2/4] net: gro: change skb_gro_network_header()
2024-03-01 19:37 ` [PATCH net-next 2/4] net: gro: change skb_gro_network_header() Eric Dumazet
@ 2024-03-04 8:28 ` Paolo Abeni
2024-03-04 9:06 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Paolo Abeni @ 2024-03-04 8:28 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski
Cc: Richard Gobert, netdev, eric.dumazet
On Fri, 2024-03-01 at 19:37 +0000, Eric Dumazet wrote:
> Change skb_gro_network_header() to accept a const sk_buff
> and to no longer check if frag0 is NULL or not.
>
> This allows us to remove skb_gro_frag0_invalidate(),
> which shows up in profiles when header-split is enabled.
I have a few questions to help me understand this patchset better:
skb_gro_frag0_invalidate() shows up in profiles (for non napi_frags_skb
callers?) because it's called multiple times for each aggregated packet,
right? I would have guessed that writing the same cacheline multiple
times should not, per se, be too expensive.
perf here did not allow me to easily observe the mentioned cost,
because the function is inlined in many different places; I'm wondering
how you noticed?
Thanks!
Paolo
* Re: [PATCH net-next 2/4] net: gro: change skb_gro_network_header()
2024-03-04 8:28 ` Paolo Abeni
@ 2024-03-04 9:06 ` Eric Dumazet
2024-03-04 10:29 ` Paolo Abeni
2024-03-04 13:04 ` Richard Gobert
0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2024-03-04 9:06 UTC (permalink / raw)
To: Paolo Abeni
Cc: David S . Miller, Jakub Kicinski, Richard Gobert, netdev,
eric.dumazet
On Mon, Mar 4, 2024 at 9:28 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Fri, 2024-03-01 at 19:37 +0000, Eric Dumazet wrote:
> > Change skb_gro_network_header() to accept a const sk_buff
> > and to no longer check if frag0 is NULL or not.
> >
> > This allows us to remove skb_gro_frag0_invalidate(),
> > which shows up in profiles when header-split is enabled.
>
> I have a few questions to help me understand this patchset better:
>
> skb_gro_frag0_invalidate() shows up in profiles (for non napi_frags_skb
> callers?) because it's called multiple times for each aggregated packet,
> right? I would have guessed that writing the same cacheline multiple
> times should not, per se, be too expensive.
Apparently some (not very recent) Intel CPUs have issues (at least
with clang-generated code) with immediate reloads after a write.
I also saw some strange artifacts on ARM64 CPUs, but it is hard to say;
I found perf to be not very precise on them.
>
> perf here did not allow me to easily observe the mentioned cost,
> because the function is inlined in many different places; I'm wondering
> how you noticed?
It is more about the whole patchset really; this gave me about a 4%
improvement on a saturated CPU
(RFS enabled, Intel(R) Xeon(R) Gold 6268L CPU @ 2.80GHz).
One TCP flow : (1500 MTU)
New profile (6,233,000 pkts per second )
19.76% [kernel] [k] gq_rx_napi_handler
11.19% [kernel] [k] dev_gro_receive
8.05% [kernel] [k] ipv6_gro_receive
7.98% [kernel] [k] tcp_gro_receive
7.25% [kernel] [k] skb_gro_receive
5.47% [kernel] [k] gq_rx_prep_buffers
4.39% [kernel] [k] skb_release_data
3.91% [kernel] [k] tcp6_gro_receive
3.55% [kernel] [k] csum_ipv6_magic
3.06% [kernel] [k] napi_gro_frags
2.76% [kernel] [k] napi_reuse_skb
Old profile (5,950,000 pkts per second)
17.92% [kernel] [k] gq_rx_napi_handler
10.22% [kernel] [k] dev_gro_receive
8.60% [kernel] [k] tcp_gro_receive
8.09% [kernel] [k] ipv6_gro_receive
8.06% [kernel] [k] skb_gro_receive
6.74% [kernel] [k] gq_rx_prep_buffers
4.82% [kernel] [k] skb_release_data
3.82% [kernel] [k] tcp6_gro_receive
3.76% [kernel] [k] csum_ipv6_magic
2.97% [kernel] [k] napi_gro_frags
2.57% [kernel] [k] napi_reuse_skb
* Re: [PATCH net-next 2/4] net: gro: change skb_gro_network_header()
2024-03-04 9:06 ` Eric Dumazet
@ 2024-03-04 10:29 ` Paolo Abeni
2024-03-04 13:04 ` Richard Gobert
1 sibling, 0 replies; 12+ messages in thread
From: Paolo Abeni @ 2024-03-04 10:29 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Jakub Kicinski, Richard Gobert, netdev,
eric.dumazet
On Mon, 2024-03-04 at 10:06 +0100, Eric Dumazet wrote:
> On Mon, Mar 4, 2024 at 9:28 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > On Fri, 2024-03-01 at 19:37 +0000, Eric Dumazet wrote:
> > > Change skb_gro_network_header() to accept a const sk_buff
> > > and to no longer check if frag0 is NULL or not.
> > >
> > > This allows us to remove skb_gro_frag0_invalidate(),
> > > which shows up in profiles when header-split is enabled.
> >
> > I have a few questions to help me understand this patchset better:
> >
> > skb_gro_frag0_invalidate() shows up in profiles (for non napi_frags_skb
> > callers?) because it's called multiple times for each aggregated packet,
> > right? I would have guessed that writing the same cacheline multiple
> > times should not, per se, be too expensive.
>
> Apparently some (not very recent) Intel CPUs have issues (at least
> with clang-generated code) with
> immediate reloads after a write.
Ah! I *think* I have observed the same even with gcc (accessing some CB
fields just after the initial zeroing popped up in the perf report).
> I also saw some strange artifacts on ARM64 CPUs, but it is hard to say;
> I found perf to be not very precise on them.
>
> >
> > perf here did not allow me to easily observe the mentioned cost,
> > because the function is inlined in many different places; I'm wondering
> > how you noticed?
>
> It is more about the whole patchset really; this gave me about a 4%
> improvement on a saturated CPU
> (RFS enabled, Intel(R) Xeon(R) Gold 6268L CPU @ 2.80GHz).
>
> One TCP flow : (1500 MTU)
>
> New profile (6,233,000 pkts per second )
> 19.76% [kernel] [k] gq_rx_napi_handler
> 11.19% [kernel] [k] dev_gro_receive
> 8.05% [kernel] [k] ipv6_gro_receive
> 7.98% [kernel] [k] tcp_gro_receive
> 7.25% [kernel] [k] skb_gro_receive
> 5.47% [kernel] [k] gq_rx_prep_buffers
> 4.39% [kernel] [k] skb_release_data
> 3.91% [kernel] [k] tcp6_gro_receive
> 3.55% [kernel] [k] csum_ipv6_magic
> 3.06% [kernel] [k] napi_gro_frags
> 2.76% [kernel] [k] napi_reuse_skb
>
> Old profile (5,950,000 pkts per second)
> 17.92% [kernel] [k] gq_rx_napi_handler
> 10.22% [kernel] [k] dev_gro_receive
> 8.60% [kernel] [k] tcp_gro_receive
> 8.09% [kernel] [k] ipv6_gro_receive
> 8.06% [kernel] [k] skb_gro_receive
> 6.74% [kernel] [k] gq_rx_prep_buffers
> 4.82% [kernel] [k] skb_release_data
> 3.82% [kernel] [k] tcp6_gro_receive
> 3.76% [kernel] [k] csum_ipv6_magic
> 2.97% [kernel] [k] napi_gro_frags
> 2.57% [kernel] [k] napi_reuse_skb
Thanks for the detailed info! I'll try to benchmark this on a non
napi_gro_frags-enabled driver, but don't hold your breath in the meantime!
Cheers,
Paolo
* Re: [PATCH net-next 0/4] net: gro: cleanups and fast path refinement
2024-03-01 19:37 [PATCH net-next 0/4] net: gro: cleanups and fast path refinement Eric Dumazet
` (3 preceding siblings ...)
2024-03-01 19:37 ` [PATCH net-next 4/4] tcp: gro: micro optimizations in tcp[4]_gro_complete() Eric Dumazet
@ 2024-03-04 10:30 ` Paolo Abeni
2024-03-05 12:50 ` patchwork-bot+netdevbpf
5 siblings, 0 replies; 12+ messages in thread
From: Paolo Abeni @ 2024-03-04 10:30 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski
Cc: Richard Gobert, netdev, eric.dumazet
On Fri, 2024-03-01 at 19:37 +0000, Eric Dumazet wrote:
> Current GRO stack has a 'fast path' for a subset of drivers,
> users of napi_frags_skb().
>
> With TCP zerocopy/direct uses, header split at receive is becoming
> more important, and GRO fast path is disabled.
>
> This series makes GRO (a bit) more efficient for almost all use cases.
>
> Eric Dumazet (4):
> net: gro: rename skb_gro_header_hard()
> net: gro: change skb_gro_network_header()
> net: gro: enable fast path for more cases
> tcp: gro: micro optimizations in tcp[4]_gro_complete()
>
> drivers/net/geneve.c | 2 +-
> include/net/gro.h | 34 ++++++++++++++++------------------
> net/core/gro.c | 25 +++++++++++++++++--------
> net/ipv4/fou_core.c | 2 +-
> net/ipv4/gre_offload.c | 2 +-
> net/ipv4/tcp_offload.c | 19 ++++++++++---------
> 6 files changed, 46 insertions(+), 38 deletions(-)
Many thanks, looks great!
Acked-by: Paolo Abeni <pabeni@redhat.com>
* Re: [PATCH net-next 2/4] net: gro: change skb_gro_network_header()
2024-03-04 9:06 ` Eric Dumazet
2024-03-04 10:29 ` Paolo Abeni
@ 2024-03-04 13:04 ` Richard Gobert
2024-03-04 13:14 ` Eric Dumazet
1 sibling, 1 reply; 12+ messages in thread
From: Richard Gobert @ 2024-03-04 13:04 UTC (permalink / raw)
To: Eric Dumazet, Paolo Abeni
Cc: David S . Miller, Jakub Kicinski, netdev, eric.dumazet
Eric Dumazet wrote:
> New profile (6,233,000 pkts per second )
> 19.76% [kernel] [k] gq_rx_napi_handler
> 11.19% [kernel] [k] dev_gro_receive
> 8.05% [kernel] [k] ipv6_gro_receive
> 7.98% [kernel] [k] tcp_gro_receive
> 7.25% [kernel] [k] skb_gro_receive
> 5.47% [kernel] [k] gq_rx_prep_buffers
> 4.39% [kernel] [k] skb_release_data
> 3.91% [kernel] [k] tcp6_gro_receive
> 3.55% [kernel] [k] csum_ipv6_magic
> 3.06% [kernel] [k] napi_gro_frags
> 2.76% [kernel] [k] napi_reuse_skb
>
> Old profile (5,950,000 pkts per second)
> 17.92% [kernel] [k] gq_rx_napi_handler
> 10.22% [kernel] [k] dev_gro_receive
> 8.60% [kernel] [k] tcp_gro_receive
> 8.09% [kernel] [k] ipv6_gro_receive
> 8.06% [kernel] [k] skb_gro_receive
> 6.74% [kernel] [k] gq_rx_prep_buffers
> 4.82% [kernel] [k] skb_release_data
> 3.82% [kernel] [k] tcp6_gro_receive
> 3.76% [kernel] [k] csum_ipv6_magic
> 2.97% [kernel] [k] napi_gro_frags
> 2.57% [kernel] [k] napi_reuse_skb
Overall this looks like a great gain for GRO, with less code for handling frag0 :)
Could you please share how to measure a <10% gain in pps in a stable
manner? While perf top is stable for me when testing CPU-bound tasks,
netperf pps measurements between 2 physical machines generate ~5-7%
noise when I try to measure.
Thanks
* Re: [PATCH net-next 2/4] net: gro: change skb_gro_network_header()
2024-03-04 13:04 ` Richard Gobert
@ 2024-03-04 13:14 ` Eric Dumazet
0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2024-03-04 13:14 UTC (permalink / raw)
To: Richard Gobert
Cc: Paolo Abeni, David S . Miller, Jakub Kicinski, netdev,
eric.dumazet
On Mon, Mar 4, 2024 at 2:04 PM Richard Gobert <richardbgobert@gmail.com> wrote:
> Overall looks like a great gain for GRO, less code for handling frag0 :)
>
> Could you please share how to measure a <10% gain in pps in a stable
> manner? While perf top is stable for me when testing CPU-bound tasks,
> netperf pps measurements between 2 physical machines generate ~5-7%
> noise when I try to measure.
The pps are measured without "perf top" running,
using "sar -n DEV 5 5" or another non-intrusive monitoring tool.
Quite often, noise comes from NUMA hosts, because the NIC is directly
attached to one of the nodes.
* Re: [PATCH net-next 0/4] net: gro: cleanups and fast path refinement
2024-03-01 19:37 [PATCH net-next 0/4] net: gro: cleanups and fast path refinement Eric Dumazet
` (4 preceding siblings ...)
2024-03-04 10:30 ` [PATCH net-next 0/4] net: gro: cleanups and fast path refinement Paolo Abeni
@ 2024-03-05 12:50 ` patchwork-bot+netdevbpf
5 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-03-05 12:50 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, kuba, pabeni, richardbgobert, netdev, eric.dumazet
Hello:
This series was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Fri, 1 Mar 2024 19:37:36 +0000 you wrote:
> Current GRO stack has a 'fast path' for a subset of drivers,
> users of napi_frags_skb().
>
> With TCP zerocopy/direct uses, header split at receive is becoming
> more important, and GRO fast path is disabled.
>
> This series makes GRO (a bit) more efficient for almost all use cases.
>
> [...]
Here is the summary with links:
- [net-next,1/4] net: gro: rename skb_gro_header_hard()
https://git.kernel.org/netdev/net-next/c/93e16ea025d2
- [net-next,2/4] net: gro: change skb_gro_network_header()
https://git.kernel.org/netdev/net-next/c/bd56a29c7a4e
- [net-next,3/4] net: gro: enable fast path for more cases
https://git.kernel.org/netdev/net-next/c/c7583e9f768e
- [net-next,4/4] tcp: gro: micro optimizations in tcp[4]_gro_complete()
https://git.kernel.org/netdev/net-next/c/8f78010b701d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html