From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: David Miller <davem@davemloft.net>,
wenxu@ucloud.cn, kuznet@ms2.inr.ac.ru, jmorris@namei.org,
kaber@trash.net, yoshfuji@linux-ipv6.org, netdev@vger.kernel.org,
wenx05124561@163.com
Subject: Re: [PATCH] net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for gre tunneled skbs
Date: Mon, 15 Aug 2016 14:16:39 +0300
Message-ID: <20160815141639.35e940b4@pixies>
In-Reply-To: <dfa12e1b-54dc-4e20-8819-cb2400a693d1@stressinduktion.org>
Hi,
On Fri, 12 Aug 2016 13:11:50 +0200, hannes@stressinduktion.org wrote:
> I really would not like to see this expanded to gre and other protocols.
> All switches drop packets where the packets are exceeding the MTU,
> bridges and also openvswitch should behave the same.
>
> Unfortunately we already had this loophole in the kernel that vxlan udp
> output path could fragment the packet again, even in case of switches.
> But this stopped working for GSO packets, which violates another rule in
> the kernel, GSO should always be transparent and user space should never
> have to care if a packet is GSO or not.
>
> Because we couldn't a) roll back the change that we fragment packets in
> UDP output paths and b) should not violate GSO transparency rule, I
> strongly believed it would be better to only change the kernel in a way
> that it transparently works with GSO, too. If we argue that a VTEP is
> its own UDP endpoint which is set up after the bridge, I still can sleep
> well. :)
>
> My understanding was that GRE failed consistently, GSO as well as
> non-GSO packets are dropped, which would be the correct behavior for me.
> I don't want to change this. A good argument against this would be if we
> violate the GSO transparency rule again. But when I looked into the code
> I couldn't see that.
I completely agree with your arguments.
I think we may run into the same GSO vs Non-GSO anomaly if one uses
a "nopmtudisc" tunnel, or a gre tunnel in "collect_md" mode, where the
encapsulating iphdr 'df' is derived from 'tun_flags&TUNNEL_DONT_FRAGMENT'
(e.g. in case DF is not set).
I suspect OvS's vport-gre does exactly that, so I assume this is the
reason why the change was suggested.
Maybe we can change our criteria in the following manner:
-	if (skb_iif && proto == IPPROTO_UDP) {
+	if (skb_iif && !(df & htons(IP_DF))) {
 		IPCB(skb)->flags |= IPSKB_FRAG_SEGS;
This way, any tunnel explicitly providing DF is NOT allowed to
further fragment the resulting segments (leading to tx segments being
dropped).
Other tunnels that do not care (e.g. vxlan and geneve, and probably
ovs vport-gre, or other ovs encap vports, in df_default=false mode)
will behave the same for gso and non-gso.
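To make the difference between the two criteria concrete, here is a rough
userspace sketch (not kernel code). The struct, the MODEL_* constants and
the function names are made up for illustration, and the DF bit is kept in
host byte order for simplicity, whereas the real check compares against
htons(IP_DF):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for kernel definitions (assumptions, userspace only). */
#define MODEL_IPPROTO_UDP 17
#define MODEL_IPPROTO_GRE 47
#define MODEL_IP_DF 0x4000 /* host byte order in this model */

/* Hypothetical summary of what the tunnel xmit path knows about a packet. */
struct model_pkt {
	int skb_iif;   /* nonzero: skb was received on some interface (forwarded) */
	uint8_t proto; /* outer IP protocol of the encapsulation */
	uint16_t df;   /* DF bit requested by the tunnel, host order here */
};

/* Current criterion: only UDP-based tunnels may fragment GSO segments. */
static bool frag_segs_current(const struct model_pkt *p)
{
	return p->skb_iif && p->proto == MODEL_IPPROTO_UDP;
}

/* Proposed criterion: any tunnel that does not set DF may fragment them. */
static bool frag_segs_proposed(const struct model_pkt *p)
{
	return p->skb_iif && !(p->df & MODEL_IP_DF);
}

/* Sanity check of the one case the proposal is meant to change. */
static void model_self_check(void)
{
	struct model_pkt gre_no_df = { 1, MODEL_IPPROTO_GRE, 0 };

	assert(!frag_segs_current(&gre_no_df));
	assert(frag_segs_proposed(&gre_no_df));
}
```

In this model a forwarded GRE packet without DF is the case that flips from
"drop" to "fragment the segments", while a UDP tunnel that does set DF flips
the other way. Locally generated packets (skb_iif == 0) are unaffected by
either criterion.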
WDYT? Am I missing something here?
Thanks,
Shmulik