All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
To: netdev@vger.kernel.org
Cc: Florian Westphal <fw@strlen.de>,
	"David S. Miller" <davem@davemloft.net>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Eric Dumazet <edumazet@google.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Alexander Duyck <alexander.h.duyck@intel.com>
Subject: Re: [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu
Date: Fri, 9 Sep 2016 08:48:34 +0300	[thread overview]
Message-ID: <20160909084834.7067784c@halley> (raw)
In-Reply-To: <20160825120533.352bbd1b@pixies>

On Thu, 25 Aug 2016 12:05:33 +0300 Shmulik Ladkani <shmulik.ladkani@gmail.com> wrote:
> The BUG occurs when GRO occurs on the ingress, and only if GRO merges
> skbs into the frag_list (OTOH when segments are only placed into frags[]
> of a single skb, skb_segment succeeds even if gso_size was altered).
> 
> This is due to an assumption that the frag_list members terminate on
> exact MSS boundaries (which no longer holds during gso_size clamping).
> 
> We have few alternatives for gso_size clamping:
> 
> 1 Fix 'skb_segment' arithmentics to support inputs that do not match
>   the "frag_list members terminate on exact MSS" assumption.
> 
> 2 Perform gso_size clamping in 'ip_finish_output_gso' for non-GROed skbs.
>   Other usecases will still benefit: (a) packets arriving from
>   virtualized interfaces, e.g. tap and friends; (b) packets arriving from
>   veth of other namespaces (packets are locally generated by TCP stack
>   on a different netns).
> 
> 3 Ditch the idea, again ;)
> 
> We can persue (1), especially if there are other benefits doing so.
> OTOH due to the current complexity of 'skb_segment' this is bit risky.
> 
> Going with (2) could be reasonable too, as it brings value for
> the virtualized environmnets that need gso_size clamping, while
> presenting minimal risk.

Summarizing actions taken, in case someone refers to this thread.

- Re (1): Spent a short while massaging skb_segment().
  Code is not prepared to support various gso_size inputs.
  Main issue is that if nskb's frags[] get exausted (but original
  frag_skb's frags[] not yet fully traversed), there's no generation of
  a new skb. Code expects interation of both nskb's frags[] and
  frag_skb's frags[] to terminate together; the following allocated new
  skb is always a clone of next frag_skb in the original head_skb.
  Supporting various gso_size inputs required an intrusive rewrite.

- Re (2): There's no easy way for ip_finish_output_gso() to detect that
  the skb is safe for "gso_size clamping" while preserving GSO/GRO
  transparency:
  We can know it is "gso_size clamping safe" PER SKB, but it doesn't
  suffice; to preserve GRO transparecy rule, we must know skb arrived
  from a code flow that is ALWAYS safe for gso_size clamping.

So I ended up identifying the relevant code-flow of the use-case I'm
interested on, verified it is indeed safe for altering gso_size (while
taking a slight risk that this might not hold true in the future).
I've used that mark as the criteria for safe "gso_size clamping" in
'ip_finish_output_gso'. Yep, not too elegant.

Regards,
Shmulik

      parent reply	other threads:[~2016-09-09  5:48 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-22 12:06 [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu Shmulik Ladkani
2016-08-22 12:58 ` Florian Westphal
2016-08-22 13:05   ` Shmulik Ladkani
2016-08-24 14:53   ` Shmulik Ladkani
2016-08-24 14:59     ` Florian Westphal
2016-08-25  9:05   ` Shmulik Ladkani
2016-08-26 11:19     ` Herbert Xu
2016-09-09  5:48     ` Shmulik Ladkani [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160909084834.7067784c@halley \
    --to=shmulik.ladkani@gmail.com \
    --cc=alexander.h.duyck@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=fw@strlen.de \
    --cc=hannes@stressinduktion.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.