From: Florian Westphal <fw@strlen.de>
To: David Miller <davem@davemloft.net>
Cc: fw@strlen.de, netdev@vger.kernel.org, hannes@stressinduktion.org,
edumazet@google.com, herbert@gondor.apana.org.au
Subject: Re: [PATCH -next] net: preserve geometry of fragment sizes when forwarding
Date: Mon, 18 May 2015 23:33:29 +0200
Message-ID: <20150518213329.GA2335@breakpoint.cc>
In-Reply-To: <20150518.165550.359134808190719687.davem@davemloft.net>
David Miller <davem@davemloft.net> wrote:
> From: Florian Westphal <fw@strlen.de>
> Date: Mon, 18 May 2015 22:40:49 +0200
>
> > But, to the best of my understanding, what you ask will push a lot of
> > non-trivial code into the kernel for no functional gain over
> > what has been proposed.
>
> The functional gain is that we stop linearizing the packet, which
> involves memory allocation and copying the entire packet.
AFAICS ipv4 and ipv6 defragmentation does not linearize or reallocate
anything?
> I am very confident that the performance gains would be non-trivial
> and quite measurable.
Are fragmented packets that common?
I don't have any real data on this; the box sending this email has

    55965898 incoming packets delivered
    62 reassemblies required

... but it is just an end host.
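
For anyone who wants to check their own box: the two counters above are what netstat -s prints, and on Linux they are also exposed on the "Ip:" lines of /proc/net/snmp as InDelivers and ReasmReqds. A minimal sketch of computing the ratio follows; the helper names parse_ip_counters and reassembly_ratio are made up for illustration.

```python
def parse_ip_counters(snmp_text):
    """Parse the 'Ip:' header/value line pair from /proc/net/snmp-style text.

    Returns a dict mapping counter name -> integer value.
    """
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Ip:")]
    if len(rows) < 2:
        raise ValueError("expected an 'Ip:' header line and an 'Ip:' value line")
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def reassembly_ratio(counters):
    """Fraction of locally delivered packets that needed reassembly."""
    return counters["ReasmReqds"] / counters["InDelivers"]


# Sample input using the numbers quoted above; a real /proc/net/snmp has
# many more columns on the Ip: lines.
SAMPLE = (
    "Ip: Forwarding DefaultTTL InDelivers ReasmReqds\n"
    "Ip: 2 64 55965898 62\n"
)
```

On a live system the text would come from open("/proc/net/snmp").read(). With the sample above the ratio is 62 / 55965898, i.e. roughly one reassembly per 900k delivered packets.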
TCP shouldn't be a problem thanks to pmtud, and for high-volume
fragmented ipv4 flows I'd expect poor performance due to the 16 bit ID space
limitations long before processing becomes the bottleneck.
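
To put rough numbers on the ID space point, here is a back-of-the-envelope sketch. It assumes a single flow (same source, destination and protocol) that uses a fresh ID for every datagram, and a 30 second reassembly timeout, which I believe is the Linux default for net.ipv4.ipfrag_time. The function names are only illustrative.

```python
ID_SPACE = 1 << 16  # the IPv4 Identification field is 16 bits wide


def datagram_rate(link_bits_per_second, datagram_bytes):
    """Datagrams per second needed to fill a link with datagrams of this size."""
    return link_bits_per_second / (datagram_bytes * 8)


def id_wrap_seconds(datagrams_per_second):
    """Seconds until one flow has used every possible ID once."""
    if datagrams_per_second <= 0:
        raise ValueError("rate must be positive")
    return ID_SPACE / datagrams_per_second


def max_rate_without_reuse(reassembly_timeout_s=30.0):
    """Highest datagram rate at which an ID is not reused while fragments
    of an older datagram with the same ID may still sit in the
    reassembly queue (assumed timeout, see above)."""
    return ID_SPACE / reassembly_timeout_s
```

With a 30 second timeout, max_rate_without_reuse() comes out at about 2184 datagrams per second per flow. Above that, a lost fragment can lead to fragments of different datagrams being spliced together, which is the kind of breakage I would expect to show up before the CPU cost does.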
> You'd also be able to trivially respect the geometry of the original
> incoming packet stream.
True. OTOH, the patch proposed in this thread would have done the same
with a lot less code. (I admit that removing Eric's optimization
once nf_defrag is loaded is not desirable, but I did not find a solution
to this problem aside from doing a route lookup or a tentative 'forward is
off' check, which I did not like.)
Another alternative might be to delay Eric's 'coalescing' step and move
it into the ip stack, after the 'local delivery' decision was taken.
I can investigate this if you think it's worth it.
> Every objection has been of the form "this special case" (this time
> SIP) is not easy.
Yes, but these objections are not just random hand-waving.
They present us with real dilemmas, e.g. a single udp packet split into
fragments of these sizes:
1280 1280 1280 542
The sip nat helper has to do nat/pat and replace 10.2.3.4 with 192.168.2.3
(let's assume we had helpers that can deal with addresses split over 2
fragment skbs, so we can handle 10.2 appearing in fragment #2
and .3.4 in fragment #3).
We can then end up with something like
1283 1281 1282 542
... and what should we do then?
Shuffle payload via memcpy()/memmove() so that only the last fragment grows?
This will not be a hot path or common by any means.
But nevertheless, this can happen, and we need to deal with it.
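
To make the 'only grow the last fragment' option concrete, here is a small sketch that works on sizes only. It treats the numbers above as bytes per fragment and ignores headers and the 8-byte alignment of fragment offsets, so it is an illustration of the bookkeeping, not of the actual skb manipulation. push_overflow_to_tail is a made-up name.

```python
def push_overflow_to_tail(sizes, limit):
    """Move bytes that exceed `limit` from each fragment into the next one.

    `sizes` are per-fragment byte counts after mangling; `limit` is the
    size no fragment except possibly a new trailing one may exceed.
    Returns the new list of sizes; the total byte count is unchanged.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    out = []
    carry = 0  # bytes spilled over from the previous fragment
    for size in sizes:
        size += carry
        carry = max(0, size - limit)
        out.append(size - carry)
    # If even the last fragment overflows, extra fragments are needed.
    while carry > 0:
        chunk = min(carry, limit)
        out.append(chunk)
        carry -= chunk
    return out


# The example from above: three fragments grew past 1280 after mangling.
print(push_overflow_to_tail([1283, 1281, 1282, 542], 1280))
```

For the example this gives 1280 1280 1280 548: the geometry of the first three fragments is restored, but bytes from every fragment had to be moved into its successor to get there.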
> If I were doing this, I would implement something that handles the
> normal cases properly. And then take it from there.
What is a 'normal case'?
And how do you propose we deal with the 'non-normal' cases?
I assume you mean to e.g. linearize for edge cases and then refragment?
If that's true, then we'd still need one of the proposed solutions to
get packets we can send out without breaking geometry or growing
fragments to a larger mtu.
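
A sketch of why the fallback still needs that information: if the packet is linearized and then refragmented purely against the egress limit, fragments can come out larger than anything the sender emitted. The code below assumes the only thing remembered from reassembly is the largest original fragment size (max_seen); the names and the 1480-byte egress payload limit are illustrative assumptions, and sizes are payload bytes only.

```python
def refragment(payload_len, egress_limit, max_seen=None):
    """Split `payload_len` bytes into fragment payload sizes.

    Every fragment except the last carries a multiple of 8 bytes, since
    fragment offsets are expressed in 8-byte units. If `max_seen` is
    given, no fragment exceeds the largest fragment originally received.
    """
    limit = egress_limit if max_seen is None else min(egress_limit, max_seen)
    limit -= limit % 8  # round down to an 8-byte boundary
    if limit <= 0:
        raise ValueError("limit too small to carry any payload")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(remaining, limit)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


total = 1283 + 1281 + 1282 + 542      # mangled payload from the example
print(refragment(total, 1480))        # egress limit only
print(refragment(total, 1480, 1280))  # capped by the largest original fragment
```

Without the cap this yields 1480 1480 1428, i.e. fragments larger than the 1280 the sender used; with the cap it yields 1280 1280 1280 548.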
> If you try to imagine the totality of it and all the edge cases
> and details from the beginning, yes it will look impossible.
Hmm... correct, but I still believe we're talking immense pain
for very little gain.
Thanks for spending time on this.