From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
To: Florian Westphal <fw@strlen.de>,
"David S. Miller" <davem@davemloft.net>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
Eric Dumazet <edumazet@google.com>,
Herbert Xu <herbert@gondor.apana.org.au>
Cc: netdev@vger.kernel.org
Subject: Re: [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu
Date: Thu, 25 Aug 2016 12:05:33 +0300 [thread overview]
Message-ID: <20160825120533.352bbd1b@pixies> (raw)
In-Reply-To: <20160822125842.GF6199@breakpoint.cc>
Hi,
On Mon, 22 Aug 2016 14:58:42 +0200, fw@strlen.de wrote:
> Shmulik Ladkani <shmulik.ladkani@gmail.com> wrote:
> > There are cases where gso skbs (which originate from an ingress
> > interface) have a gso_size value that exceeds the output dst mtu:
> >
> > - ipv4 forwarding middlebox having in/out interfaces with different mtus
> > addressed by fe6cc55f3a 'net: ip, ipv6: handle gso skbs in forwarding path'
> > - bridge having a tunnel member interface stacked over a device with small mtu
> > addressed by b8247f095e 'net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs'
> >
> > In both cases, such skbs are identified, then go through early software
> > segmentation+fragmentation as part of ip_finish_output_gso.
> >
> > Another approach is to shrink the gso_size to a value suitable so
> > resulting segments are smaller than dst mtu, as suggeted by Eric
> > Dumazet (as part of [1]) and Florian Westphal (as part of [2]).
> >
> > This will void the need for software segmentation/fragmentation at
> > ip_finish_output_gso, thus significantly improve throughput and lower
> > cpu load.
> >
> > This RFC patch attempts to implement this gso_size clamping.
> >
> > [1] https://patchwork.ozlabs.org/patch/314327/
> > [2] https://patchwork.ozlabs.org/patch/644724/
> >
> > Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> > ---
> >
> > Florian, in fe6cc55f you described a BUG due to gso_size decrease.
> > I've tested both bridged and routed cases, but in my setups failed to
> > hit the issue; Appreciate if you can provide some hints.
>
> Still get the BUG, I applied this patch on top of net-next.
The BUG occurs when GRO occurs on the ingress, and only if GRO merges
skbs into the frag_list (OTOH when segments are only placed into frags[]
of a single skb, skb_segment succeeds even if gso_size was altered).
This is due to an assumption that the frag_list members terminate on
exact MSS boundaries (which no longer holds during gso_size clamping).
The assumption is dated back to original support of 'frag_list' to
skb_segment:
89319d3801 net: Add frag_list support to skb_segment
This patch adds limited support for handling frag_list packets in
skb_segment. The intention is to support GRO (Generic Receive Offload)
packets which will be constructed by chaining normal packets using
frag_list.
As such we require all frag_list members terminate on exact MSS
boundaries. This is checked using BUG_ON.
As there should only be one producer in the kernel of such packets,
namely GRO, this requirement should not be difficult to maintain.
We have few alternatives for gso_size clamping:
1 Fix 'skb_segment' arithmentics to support inputs that do not match
the "frag_list members terminate on exact MSS" assumption.
2 Perform gso_size clamping in 'ip_finish_output_gso' for non-GROed skbs.
Other usecases will still benefit: (a) packets arriving from
virtualized interfaces, e.g. tap and friends; (b) packets arriving from
veth of other namespaces (packets are locally generated by TCP stack
on a different netns).
3 Ditch the idea, again ;)
We can persue (1), especially if there are other benefits doing so.
OTOH due to the current complexity of 'skb_segment' this is bit risky.
Going with (2) could be reasonable too, as it brings value for
the virtualized environmnets that need gso_size clamping, while
presenting minimal risk.
Appreciate your opinions.
Regards,
Shmulik
next prev parent reply other threads:[~2016-08-25 9:05 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-22 12:06 [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu Shmulik Ladkani
2016-08-22 12:58 ` Florian Westphal
2016-08-22 13:05 ` Shmulik Ladkani
2016-08-24 14:53 ` Shmulik Ladkani
2016-08-24 14:59 ` Florian Westphal
2016-08-25 9:05 ` Shmulik Ladkani [this message]
2016-08-26 11:19 ` Herbert Xu
2016-09-09 5:48 ` Shmulik Ladkani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160825120533.352bbd1b@pixies \
--to=shmulik.ladkani@gmail.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=fw@strlen.de \
--cc=hannes@stressinduktion.org \
--cc=herbert@gondor.apana.org.au \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox