* [PATCH net-next] net: gro: allow to build full sized skb
@ 2013-10-08 16:02 Eric Dumazet
2013-10-08 16:14 ` Ben Greear
2013-10-10 4:17 ` David Miller
0 siblings, 2 replies; 4+ messages in thread
From: Eric Dumazet @ 2013-10-08 16:02 UTC (permalink / raw)
To: David Miller; +Cc: netdev
From: Eric Dumazet <edumazet@google.com>
skb_gro_receive() is currently limited to 16 or 17 MSS per GRO skb,
typically 24616 bytes, because it fills up to MAX_SKB_FRAGS frags.
It's relatively easy to extend the skb using frag_list to allow
more frags to be appended into the last sk_buff.
This still builds very efficient skbs, and allows reaching 45 MSS per
skb.
(45 MSS GRO packet uses one skb plus a frag_list containing 2 additional
sk_buff)
High speed TCP flows benefit from this extension by lowering TCP stack
cpu usage (less packets stored in receive queue, less ACK packets
processed)
Forwarding setups could be hurt, as such skbs will need to be
linearized, although its not a new problem, as GRO could already
provide skbs with a frag_list.
We could make the 65536 bytes threshold a tunable to mitigate this.
(First time we need to linearize skb in skb_needs_linearize(), we could
lower the tunable to ~16*1460 so that following skb_gro_receive() calls
build smaller skbs)
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/skbuff.c | 43 ++++++++++++++++++++++++++-----------------
1 file changed, 26 insertions(+), 17 deletions(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d81cff1..8ead744 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2936,32 +2936,30 @@ EXPORT_SYMBOL_GPL(skb_segment);
int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
{
- struct sk_buff *p = *head;
- struct sk_buff *nskb;
- struct skb_shared_info *skbinfo = skb_shinfo(skb);
- struct skb_shared_info *pinfo = skb_shinfo(p);
- unsigned int headroom;
- unsigned int len = skb_gro_len(skb);
+ struct skb_shared_info *pinfo, *skbinfo = skb_shinfo(skb);
unsigned int offset = skb_gro_offset(skb);
unsigned int headlen = skb_headlen(skb);
+ struct sk_buff *nskb, *lp, *p = *head;
+ unsigned int len = skb_gro_len(skb);
unsigned int delta_truesize;
+ unsigned int headroom;
- if (p->len + len >= 65536)
+ if (unlikely(p->len + len >= 65536))
return -E2BIG;
- if (pinfo->frag_list)
- goto merge;
- else if (headlen <= offset) {
+ lp = NAPI_GRO_CB(p)->last ?: p;
+ pinfo = skb_shinfo(lp);
+
+ if (headlen <= offset) {
skb_frag_t *frag;
skb_frag_t *frag2;
int i = skbinfo->nr_frags;
int nr_frags = pinfo->nr_frags + i;
- offset -= headlen;
-
if (nr_frags > MAX_SKB_FRAGS)
- return -E2BIG;
+ goto merge;
+ offset -= headlen;
pinfo->nr_frags = nr_frags;
skbinfo->nr_frags = 0;
@@ -2992,7 +2990,7 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
unsigned int first_offset;
if (nr_frags + 1 + skbinfo->nr_frags > MAX_SKB_FRAGS)
- return -E2BIG;
+ goto merge;
first_offset = skb->data -
(unsigned char *)page_address(page) +
@@ -3010,7 +3008,10 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
delta_truesize = skb->truesize - SKB_DATA_ALIGN(sizeof(struct sk_buff));
NAPI_GRO_CB(skb)->free = NAPI_GRO_FREE_STOLEN_HEAD;
goto done;
- } else if (skb_gro_len(p) != pinfo->gso_size)
+ }
+ if (pinfo->frag_list)
+ goto merge;
+ if (skb_gro_len(p) != pinfo->gso_size)
return -E2BIG;
headroom = skb_headroom(p);
@@ -3062,16 +3063,24 @@ merge:
__skb_pull(skb, offset);
- NAPI_GRO_CB(p)->last->next = skb;
+ if (!NAPI_GRO_CB(p)->last)
+ skb_shinfo(p)->frag_list = skb;
+ else
+ NAPI_GRO_CB(p)->last->next = skb;
NAPI_GRO_CB(p)->last = skb;
skb_header_release(skb);
+ lp = p;
done:
NAPI_GRO_CB(p)->count++;
p->data_len += len;
p->truesize += delta_truesize;
p->len += len;
-
+ if (lp != p) {
+ lp->data_len += len;
+ lp->truesize += delta_truesize;
+ lp->len += len;
+ }
NAPI_GRO_CB(skb)->same_flow = 1;
return 0;
}
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] net: gro: allow to build full sized skb
2013-10-08 16:02 [PATCH net-next] net: gro: allow to build full sized skb Eric Dumazet
@ 2013-10-08 16:14 ` Ben Greear
2013-10-08 16:50 ` Eric Dumazet
2013-10-10 4:17 ` David Miller
1 sibling, 1 reply; 4+ messages in thread
From: Ben Greear @ 2013-10-08 16:14 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
On 10/08/2013 09:02 AM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> skb_gro_receive() is currently limited to 16 or 17 MSS per GRO skb,
> typically 24616 bytes, because it fills up to MAX_SKB_FRAGS frags.
>
> It's relatively easy to extend the skb using frag_list to allow
> more frags to be appended into the last sk_buff.
>
> This still builds very efficient skbs, and allows reaching 45 MSS per
> skb.
>
> (45 MSS GRO packet uses one skb plus a frag_list containing 2 additional
> sk_buff)
>
> High speed TCP flows benefit from this extension by lowering TCP stack
> cpu usage (less packets stored in receive queue, less ACK packets
> processed)
>
> Forwarding setups could be hurt, as such skbs will need to be
> linearized, although its not a new problem, as GRO could already
> provide skbs with a frag_list.
>
> We could make the 65536 bytes threshold a tunable to mitigate this.
>
> (First time we need to linearize skb in skb_needs_linearize(), we could
> lower the tunable to ~16*1460 so that following skb_gro_receive() calls
> build smaller skbs)
On multi-homed boxes you could easily have paths that would never need linearize
and other paths that always need it, for whatever reason.
Maybe a per-netdev check for needs linearize instead of something global
would be better...and maybe let users over-ride the default behaviour
regardless?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] net: gro: allow to build full sized skb
2013-10-08 16:14 ` Ben Greear
@ 2013-10-08 16:50 ` Eric Dumazet
0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2013-10-08 16:50 UTC (permalink / raw)
To: Ben Greear; +Cc: David Miller, netdev
On Tue, 2013-10-08 at 09:14 -0700, Ben Greear wrote:
> On multi-homed boxes you could easily have paths that would never need linearize
> and other paths that always need it, for whatever reason.
>
> Maybe a per-netdev check for needs linearize instead of something global
> would be better...and maybe let users over-ride the default behaviour
> regardless?
Sure. Note that this has yet to be discussed.
I mentioned this in the changelog for completeness.
I know people just disabling GRO in forwarding setups, probably because
we lack control of latencies / bursts.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] net: gro: allow to build full sized skb
2013-10-08 16:02 [PATCH net-next] net: gro: allow to build full sized skb Eric Dumazet
2013-10-08 16:14 ` Ben Greear
@ 2013-10-10 4:17 ` David Miller
1 sibling, 0 replies; 4+ messages in thread
From: David Miller @ 2013-10-10 4:17 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 08 Oct 2013 09:02:23 -0700
> From: Eric Dumazet <edumazet@google.com>
>
> skb_gro_receive() is currently limited to 16 or 17 MSS per GRO skb,
> typically 24616 bytes, because it fills up to MAX_SKB_FRAGS frags.
>
> It's relatively easy to extend the skb using frag_list to allow
> more frags to be appended into the last sk_buff.
>
> This still builds very efficient skbs, and allows reaching 45 MSS per
> skb.
>
> (45 MSS GRO packet uses one skb plus a frag_list containing 2 additional
> sk_buff)
>
> High speed TCP flows benefit from this extension by lowering TCP stack
> cpu usage (less packets stored in receive queue, less ACK packets
> processed)
>
> Forwarding setups could be hurt, as such skbs will need to be
> linearized, although its not a new problem, as GRO could already
> provide skbs with a frag_list.
>
> We could make the 65536 bytes threshold a tunable to mitigate this.
>
> (First time we need to linearize skb in skb_needs_linearize(), we could
> lower the tunable to ~16*1460 so that following skb_gro_receive() calls
> build smaller skbs)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, thanks Eric.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-10-10 4:17 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-08 16:02 [PATCH net-next] net: gro: allow to build full sized skb Eric Dumazet
2013-10-08 16:14 ` Ben Greear
2013-10-08 16:50 ` Eric Dumazet
2013-10-10 4:17 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).