From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
To: netdev@vger.kernel.org
Cc: Florian Westphal <fw@strlen.de>,
"David S. Miller" <davem@davemloft.net>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
Eric Dumazet <edumazet@google.com>,
Herbert Xu <herbert@gondor.apana.org.au>,
Alexander Duyck <alexander.h.duyck@intel.com>
Subject: Re: [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu
Date: Fri, 9 Sep 2016 08:48:34 +0300 [thread overview]
Message-ID: <20160909084834.7067784c@halley> (raw)
In-Reply-To: <20160825120533.352bbd1b@pixies>
On Thu, 25 Aug 2016 12:05:33 +0300 Shmulik Ladkani <shmulik.ladkani@gmail.com> wrote:
> The BUG occurs when GRO occurs on the ingress, and only if GRO merges
> skbs into the frag_list (OTOH when segments are only placed into frags[]
> of a single skb, skb_segment succeeds even if gso_size was altered).
>
> This is due to an assumption that the frag_list members terminate on
> exact MSS boundaries (which no longer holds during gso_size clamping).
>
> We have few alternatives for gso_size clamping:
>
> 1 Fix 'skb_segment' arithmentics to support inputs that do not match
> the "frag_list members terminate on exact MSS" assumption.
>
> 2 Perform gso_size clamping in 'ip_finish_output_gso' for non-GROed skbs.
> Other usecases will still benefit: (a) packets arriving from
> virtualized interfaces, e.g. tap and friends; (b) packets arriving from
> veth of other namespaces (packets are locally generated by TCP stack
> on a different netns).
>
> 3 Ditch the idea, again ;)
>
> We can persue (1), especially if there are other benefits doing so.
> OTOH due to the current complexity of 'skb_segment' this is bit risky.
>
> Going with (2) could be reasonable too, as it brings value for
> the virtualized environmnets that need gso_size clamping, while
> presenting minimal risk.
Summarizing actions taken, in case someone refers to this thread.
- Re (1): Spent a short while massaging skb_segment().
Code is not prepared to support various gso_size inputs.
Main issue is that if nskb's frags[] get exausted (but original
frag_skb's frags[] not yet fully traversed), there's no generation of
a new skb. Code expects interation of both nskb's frags[] and
frag_skb's frags[] to terminate together; the following allocated new
skb is always a clone of next frag_skb in the original head_skb.
Supporting various gso_size inputs required an intrusive rewrite.
- Re (2): There's no easy way for ip_finish_output_gso() to detect that
the skb is safe for "gso_size clamping" while preserving GSO/GRO
transparency:
We can know it is "gso_size clamping safe" PER SKB, but it doesn't
suffice; to preserve GRO transparecy rule, we must know skb arrived
from a code flow that is ALWAYS safe for gso_size clamping.
So I ended up identifying the relevant code-flow of the use-case I'm
interested on, verified it is indeed safe for altering gso_size (while
taking a slight risk that this might not hold true in the future).
I've used that mark as the criteria for safe "gso_size clamping" in
'ip_finish_output_gso'. Yep, not too elegant.
Regards,
Shmulik
prev parent reply other threads:[~2016-09-09 5:48 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-22 12:06 [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu Shmulik Ladkani
2016-08-22 12:58 ` Florian Westphal
2016-08-22 13:05 ` Shmulik Ladkani
2016-08-24 14:53 ` Shmulik Ladkani
2016-08-24 14:59 ` Florian Westphal
2016-08-25 9:05 ` Shmulik Ladkani
2016-08-26 11:19 ` Herbert Xu
2016-09-09 5:48 ` Shmulik Ladkani [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160909084834.7067784c@halley \
--to=shmulik.ladkani@gmail.com \
--cc=alexander.h.duyck@intel.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=fw@strlen.de \
--cc=hannes@stressinduktion.org \
--cc=herbert@gondor.apana.org.au \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).