From: Alexander Duyck <alexander.h.duyck@intel.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Kieran Mansley <kmansley@solarflare.com>,
Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
Ben Hutchings <bhutchings@solarflare.com>,
netdev@vger.kernel.org
Subject: Re: TCPBacklogDrops during aggressive bursts of traffic
Date: Wed, 23 May 2012 10:57:49 -0700 [thread overview]
Message-ID: <4FBD251D.8090807@intel.com> (raw)
In-Reply-To: <1337793866.3361.3090.camel@edumazet-glaptop>
On 05/23/2012 10:24 AM, Eric Dumazet wrote:
> On Wed, 2012-05-23 at 09:58 -0700, Alexander Duyck wrote:
>
>> Right, but the problem is that in order to make this work the we are
>> dropping the padding for head and hoping to have room for shared info.
>> This is going to kill performance for things like routing workloads
>> since the entire head is going to have to be copied over to make space
>> for NET_SKB_PAD.
> Hey I said that one of the point I have to add to my patch. Please read
> it again ;)
I'm aware of that, but still it seems like we are getting ahead of
ourselves. This fix is so specific to just the socket backlog case that
I think we are missing the fact that it is going to have huge
performance repercussions elsewhere.
> By the way, we can also add code doing the ksb->head upgrade to fragment
> again in case we need to add a tunnel header, instead of full copy.
>
> So maybe the NET_SKB_PAD is not really needed anymore.
>
> Anyway, a router host could use a different allocation strategy (going
> back to current one)
The thing I don't like is that we are adding extra memcpy calls to all
of these paths. We cannot change the head without having to copy the
shared info and there is going to be a cost for that. I would prefer to
only generate the shared info once and not have to relocate it every
time I want to run a router or tunnel.
>> Also this assumes no RSC being enabled. RSC is
>> normally enabled by default. If it is turned on we are going to start
>> receiving full 2K buffers which will cause even more issues since there
>> wouldn't be any room for shared info in the 2K frame.
>>
> Hey his is one of the point I have to address, also mentioned.
>
> Its almost trivial to check len (if we have room for shared info, take
> it, if not allocate the head as before)
I know you mentioned this as well. The thing is I would prefer not to
add code where we are branching in so many different directions on what
the header actually looks like. It tends to open a lot of opportunities
for bugs when someone makes a change and doesn't take one of the
possible head and fragment combinations into account.
>> The way the driver is currently written probably provides the optimal
>> setup for truesize given the circumstances.
> It unfortunate the hardware has 1KB granularity.
Agreed, I would have preferred 512B granularity, but we are locked in at
1K for now.. :-/
> Problem is skb_try_coalesce() is not used when we store packets in
> socket backlog, and only used for TCP at this moment.
One piece of low hanging fruit that is available to help with some of
this is to drop the Rx header size for ixgbe to 256. That should at
least cut the total truesize for the buffer and sk_buff to 960 or so
which is at least a step in the right direction.
Thanks,
Alex
next prev parent reply other threads:[~2012-05-23 17:57 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-15 14:38 TCPBacklogDrops during aggressive bursts of traffic Kieran Mansley
2012-05-15 14:56 ` Eric Dumazet
2012-05-15 15:00 ` Eric Dumazet
2012-05-15 16:29 ` Kieran Mansley
2012-05-15 16:34 ` Eric Dumazet
2012-05-15 16:47 ` Ben Hutchings
2012-05-15 17:01 ` Eric Dumazet
2012-05-15 17:23 ` Eric Dumazet
2012-05-17 16:31 ` Kieran Mansley
2012-05-17 16:37 ` Eric Dumazet
2012-05-18 15:45 ` Kieran Mansley
2012-05-18 15:49 ` Eric Dumazet
2012-05-18 15:53 ` Kieran Mansley
2012-05-18 18:40 ` Eric Dumazet
2012-05-22 8:20 ` Kieran Mansley
2012-05-22 9:25 ` Eric Dumazet
2012-05-22 9:30 ` Eric Dumazet
2012-05-22 15:09 ` Kieran Mansley
2012-05-22 16:12 ` Eric Dumazet
2012-05-22 16:32 ` Kieran Mansley
2012-05-22 16:45 ` Eric Dumazet
2012-05-22 20:54 ` Eric Dumazet
2012-05-23 9:44 ` Eric Dumazet
2012-05-23 12:09 ` Eric Dumazet
2012-05-23 16:04 ` Alexander Duyck
2012-05-23 16:12 ` Eric Dumazet
2012-05-23 16:39 ` Eric Dumazet
2012-05-23 17:10 ` Alexander Duyck
2012-05-23 21:19 ` Alexander Duyck
2012-05-23 21:37 ` Eric Dumazet
2012-05-23 22:03 ` Alexander Duyck
2012-05-23 16:58 ` Alexander Duyck
2012-05-23 17:24 ` Eric Dumazet
2012-05-23 17:57 ` Alexander Duyck [this message]
2012-05-23 17:34 ` David Miller
2012-05-23 17:46 ` Eric Dumazet
2012-05-23 17:57 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4FBD251D.8090807@intel.com \
--to=alexander.h.duyck@intel.com \
--cc=bhutchings@solarflare.com \
--cc=eric.dumazet@gmail.com \
--cc=jeffrey.t.kirsher@intel.com \
--cc=kmansley@solarflare.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).