From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
To: Florian Westphal <fw@strlen.de>,
"David S. Miller" <davem@davemloft.net>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
Eric Dumazet <edumazet@google.com>,
Herbert Xu <herbert@gondor.apana.org.au>
Cc: netdev@vger.kernel.org
Subject: Re: [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu
Date: Thu, 25 Aug 2016 12:05:33 +0300 [thread overview]
Message-ID: <20160825120533.352bbd1b@pixies> (raw)
In-Reply-To: <20160822125842.GF6199@breakpoint.cc>
Hi,
On Mon, 22 Aug 2016 14:58:42 +0200, fw@strlen.de wrote:
> Shmulik Ladkani <shmulik.ladkani@gmail.com> wrote:
> > There are cases where gso skbs (which originate from an ingress
> > interface) have a gso_size value that exceeds the output dst mtu:
> >
> > - ipv4 forwarding middlebox having in/out interfaces with different mtus
> > addressed by fe6cc55f3a 'net: ip, ipv6: handle gso skbs in forwarding path'
> > - bridge having a tunnel member interface stacked over a device with small mtu
> > addressed by b8247f095e 'net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs'
> >
> > In both cases, such skbs are identified, then go through early software
> > segmentation+fragmentation as part of ip_finish_output_gso.
> >
> > Another approach is to shrink the gso_size to a value suitable so
> > resulting segments are smaller than dst mtu, as suggeted by Eric
> > Dumazet (as part of [1]) and Florian Westphal (as part of [2]).
> >
> > This will void the need for software segmentation/fragmentation at
> > ip_finish_output_gso, thus significantly improve throughput and lower
> > cpu load.
> >
> > This RFC patch attempts to implement this gso_size clamping.
> >
> > [1] https://patchwork.ozlabs.org/patch/314327/
> > [2] https://patchwork.ozlabs.org/patch/644724/
> >
> > Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> > ---
> >
> > Florian, in fe6cc55f you described a BUG due to gso_size decrease.
> > I've tested both bridged and routed cases, but in my setups failed to
> > hit the issue; Appreciate if you can provide some hints.
>
> Still get the BUG, I applied this patch on top of net-next.
The BUG occurs when GRO occurs on the ingress, and only if GRO merges
skbs into the frag_list (OTOH when segments are only placed into frags[]
of a single skb, skb_segment succeeds even if gso_size was altered).
This is due to an assumption that the frag_list members terminate on
exact MSS boundaries (which no longer holds during gso_size clamping).
The assumption is dated back to original support of 'frag_list' to
skb_segment:
89319d3801 net: Add frag_list support to skb_segment
This patch adds limited support for handling frag_list packets in
skb_segment. The intention is to support GRO (Generic Receive Offload)
packets which will be constructed by chaining normal packets using
frag_list.
As such we require all frag_list members terminate on exact MSS
boundaries. This is checked using BUG_ON.
As there should only be one producer in the kernel of such packets,
namely GRO, this requirement should not be difficult to maintain.
We have few alternatives for gso_size clamping:
1 Fix 'skb_segment' arithmentics to support inputs that do not match
the "frag_list members terminate on exact MSS" assumption.
2 Perform gso_size clamping in 'ip_finish_output_gso' for non-GROed skbs.
Other usecases will still benefit: (a) packets arriving from
virtualized interfaces, e.g. tap and friends; (b) packets arriving from
veth of other namespaces (packets are locally generated by TCP stack
on a different netns).
3 Ditch the idea, again ;)
We can persue (1), especially if there are other benefits doing so.
OTOH due to the current complexity of 'skb_segment' this is bit risky.
Going with (2) could be reasonable too, as it brings value for
the virtualized environmnets that need gso_size clamping, while
presenting minimal risk.
Appreciate your opinions.
Regards,
Shmulik
next prev parent reply other threads:[~2016-08-25 9:05 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-22 12:06 [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu Shmulik Ladkani
2016-08-22 12:58 ` Florian Westphal
2016-08-22 13:05 ` Shmulik Ladkani
2016-08-24 14:53 ` Shmulik Ladkani
2016-08-24 14:59 ` Florian Westphal
2016-08-25 9:05 ` Shmulik Ladkani [this message]
2016-08-26 11:19 ` Herbert Xu
2016-09-09 5:48 ` Shmulik Ladkani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160825120533.352bbd1b@pixies \
--to=shmulik.ladkani@gmail.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=fw@strlen.de \
--cc=hannes@stressinduktion.org \
--cc=herbert@gondor.apana.org.au \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.