netdev.vger.kernel.org archive mirror
From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: David Miller <davem@davemloft.net>,
	wenxu@ucloud.cn, kuznet@ms2.inr.ac.ru, jmorris@namei.org,
	kaber@trash.net, yoshfuji@linux-ipv6.org, netdev@vger.kernel.org,
	wenx05124561@163.com
Subject: Re: [PATCH] net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for gre tunneled skbs
Date: Fri, 19 Aug 2016 10:26:04 +0300
Message-ID: <20160819102604.271c89d9@halley>
In-Reply-To: <20160815141639.35e940b4@pixies>

On Mon, 15 Aug 2016 14:16:39 +0300 Shmulik Ladkani <shmulik.ladkani@gmail.com> wrote:
> On Fri, 12 Aug 2016 13:11:50 +0200, hannes@stressinduktion.org wrote:
> > I really would not like to see this expanded to gre and other protocols.
> > All switches drop packets where the packets are exceeding the MTU,
> > bridges and also openvswitch should behave the same.
> > 
> > Unfortunately we already had this loophole in the kernel that vxlan udp
> > output path could fragment the packet again, even in case of switches.
> > But this stopped working for GSO packets, which violates another rule in
> > the kernel, GSO should always be transparent and user space should never
> > have to care if a packet is GSO or not.
> > 
> > Because we couldn't a) roll back the change that we fragment packets in
> > UDP output paths and b) should not violate GSO transparency rule, I
> > strongly believed it would be better to only change the kernel in a way
> > that it transparently works with GSO, too. If we argue that a VTEP is
> > its own UDP endpoint which is set up after the bridge, I still can sleep
> > well. :)
> > 
> > My understanding was that GRE failed consistently, GSO as well as
> > non-GSO packets are dropped, which would be the correct behavior for me.
> > I don't want to change this. A good argument against this would be if we
> > violate the GSO transparency rule again. But when I looked into the code
> > I couldn't see that.  
> 
> I completely agree with your arguments.
> 
> I think we may run into the same GSO vs Non-GSO anomaly if one uses
> a "nopmtudisc" tunnel, or a gre tunnel in "collect_md" mode, where the
> encapsulating iphdr 'df' is derived from 'tun_flags&TUNNEL_DONT_FRAGMENT'
> (e.g. in case DF is not set).
> 
> I suspect OvS's vport-gre does exactly that, so I assume this is the
> reason why the change was suggested.
> 
> Maybe we can change our criteria in the following manner:
>  
> -	if (skb_iif && proto == IPPROTO_UDP) {
> +	if (skb_iif && !(df & htons(IP_DF))) {
> 		IPCB(skb)->flags |= IPSKB_FRAG_SEGS;
> 
> This way, any tunnel explicitly providing DF is NOT allowed to
> further fragment the resulting segments (leading to tx segments being
> dropped).
> Other tunnels that do not care (e.g. vxlan and geneve, and probably
> ovs vport-gre, or other ovs encap vports, in df_default=false mode),
> will behave the same for GSO and non-GSO.
> 
> WDYT? Am I missing something here?
> 

ping..
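
For readers following the thread, here is a minimal userspace sketch of the
two criteria discussed in the quoted message. It is not kernel code: the
MODEL_* constants and the helper names are assumptions made up for this
illustration (MODEL_TUNNEL_DONT_FRAGMENT in particular is only a stand-in
for the tunnel flag), and the functions only model the condition that
decides whether IPSKB_FRAG_SEGS would be set.

```c
#include <arpa/inet.h>   /* htons() */
#include <stdbool.h>
#include <stdint.h>

/* Constants local to this sketch (assumptions, not kernel headers). */
#define MODEL_IP_DF                0x4000 /* IPv4 Don't Fragment bit, host order */
#define MODEL_PROTO_UDP            17
#define MODEL_PROTO_GRE            47
#define MODEL_TUNNEL_DONT_FRAGMENT 0x0100 /* hypothetical stand-in flag value */

/* Outer-header df derived from the tunnel flags, as described for
 * "collect_md" style tunnels: DF only if the flag asks for it. */
static inline uint16_t model_df_from_tun_flags(uint16_t tun_flags)
{
	return (tun_flags & MODEL_TUNNEL_DONT_FRAGMENT) ? htons(MODEL_IP_DF) : 0;
}

/* Existing criterion: only forwarded UDP-encapsulated skbs may have
 * their segments fragmented further. */
static inline bool model_old_criterion(int skb_iif, uint8_t proto)
{
	return skb_iif && proto == MODEL_PROTO_UDP;
}

/* Proposed criterion: any forwarded skb whose outer header does not
 * carry DF may have its segments fragmented further. */
static inline bool model_new_criterion(int skb_iif, uint16_t df)
{
	return skb_iif && !(df & htons(MODEL_IP_DF));
}
```

Under this model, a forwarded GRE skb is rejected by the old check regardless
of DF, while the new check accepts it only when the tunnel did not request DF.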