* Peak TCP performance
From: Kyle Hubert @ 2013-10-11 2:34 UTC
To: netdev
I'm working on a device, and I consistently get 27 Gbps via netperf-2.6.
UDP reports 54 Gbps.

TCP is maxed out at 100% CPU on the transmit side; the receive side
sits at 40% CPU. Thus, I didn't believe I could eke any more
performance out of it.

However, very oddly, if I enable bridged mode to forward some
packets, TCP performance goes up to 32 Gbps. The thing that bothers me
is that transmit CPU utilization drops to 65% while receive CPU
utilization increases to 60%.

What happens when the device becomes bridged that yields so much more
performance? Also, can I now take advantage of the extra CPU time
available to drive more traffic? No tunable seems to have any effect
(except a negative one).
Any ideas?
Cheers,
-Kyle
* Re: Peak TCP performance
From: Eric Dumazet @ 2013-10-11 3:21 UTC
To: Kyle Hubert; +Cc: netdev
On Thu, 2013-10-10 at 22:34 -0400, Kyle Hubert wrote:
> I'm working on a device, and I consistently get 27 Gbps via netperf-2.6.
> UDP reports 54 Gbps.
>
> TCP is maxed out at 100% CPU on the transmit side; the receive side
> sits at 40% CPU. Thus, I didn't believe I could eke any more
> performance out of it.
>
> However, very oddly, if I enable bridged mode to forward some
> packets, TCP performance goes up to 32 Gbps. The thing that bothers me
> is that transmit CPU utilization drops to 65% while receive CPU
> utilization increases to 60%.
>
> What happens when the device becomes bridged that yields so much more
> performance? Also, can I now take advantage of the extra CPU time
> available to drive more traffic? No tunable seems to have any effect
> (except a negative one).
You do not say which version of Linux you use, but my guess is that
using the latest trees should help, because of
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=c9eeec26e32e087359160406f96e0949b3cc6f10

Also try disabling tx-nocache-copy:

ethtool -K eth0 tx-nocache-copy off
* Re: Peak TCP performance
From: Kyle Hubert @ 2013-10-11 3:33 UTC
To: Eric Dumazet; +Cc: netdev
On Thu, Oct 10, 2013 at 11:21 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2013-10-10 at 22:34 -0400, Kyle Hubert wrote:
>> I'm working on a device, and I consistently get 27 Gbps via netperf-2.6.
>> UDP reports 54 Gbps.
>>
>> TCP is maxed out at 100% CPU on the transmit side; the receive side
>> sits at 40% CPU. Thus, I didn't believe I could eke any more
>> performance out of it.
>>
>> However, very oddly, if I enable bridged mode to forward some
>> packets, TCP performance goes up to 32 Gbps. The thing that bothers me
>> is that transmit CPU utilization drops to 65% while receive CPU
>> utilization increases to 60%.
>>
>> What happens when the device becomes bridged that yields so much more
>> performance? Also, can I now take advantage of the extra CPU time
>> available to drive more traffic? No tunable seems to have any effect
>> (except a negative one).
>
>
> You do not say which version of Linux you use, but my guess is that
> using the latest trees should help, because of
> http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=c9eeec26e32e087359160406f96e0949b3cc6f10
>
> Also try disabling tx-nocache-copy:
>
> ethtool -K eth0 tx-nocache-copy off
Ah, yes, sorry. I'm on 3.0.80 (SLES SP2). And, no, I do not have that
commit. In fact, the code looks quite different. There must have been
a lot of changes?
Also, my copy of ethtool does not recognize tx-nocache-copy. However,
I do have control over the net device. Is there something there I can
set, or is tx-nocache-copy also a new feature? I'll start digging.
Thanks for your response.
-Kyle
* Re: Peak TCP performance
From: Eric Dumazet @ 2013-10-11 3:44 UTC
To: Kyle Hubert; +Cc: netdev
On Thu, 2013-10-10 at 23:33 -0400, Kyle Hubert wrote:
> Ah, yes, sorry. I'm on 3.0.80 (SLES SP2). And, no, I do not have that
> commit. In fact, the code looks quite different. There must have been
> a lot of changes?
>
Probably ;)
> Also, my copy of ethtool does not recognize tx-nocache-copy. However,
> I do have control over the net device. Is there something there I can
> set, or is tx-nocache-copy also a new feature? I'll start digging.
>
nocache-copy was added in 3.0, but I find it's not a gain for current
CPUs.

You could get a fresh copy of the ethtool sources:
git clone git://git.kernel.org/pub/scm/network/ethtool/ethtool.git
cd ethtool
./autogen.sh ...
* Re: Peak TCP performance
From: Kyle Hubert @ 2013-10-11 4:21 UTC
To: Eric Dumazet; +Cc: netdev
On Thu, Oct 10, 2013 at 11:44 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> Also, my copy of ethtool does not recognize tx-nocache-copy. However,
>> I do have control over the net device. Is there something there I can
>> set, or is tx-nocache-copy also a new feature? I'll start digging.
>>
>
> nocache-copy was added in 3.0, but I find it's not a gain for current
> CPUs.
>
> You could get a fresh copy of the ethtool sources:
>
> git clone git://git.kernel.org/pub/scm/network/ethtool/ethtool.git
> cd ethtool
> ./autogen.sh ...
That did the trick. Thanks for the help! Is there somewhere I can read
up on this feature? A lot of the netdev features are opaque to me.
Also, I can control NETIF_F_NOCACHE_COPY in netdev->features to make
this the default, yes?
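A minimal sketch of one way a driver could make that the default,
assuming a 3.0-era kernel where register_netdevice() turns
NETIF_F_NOCACHE_COPY on when the device advertises checksum offload;
the helper name below is hypothetical, not from this thread. The bit is
cleared again after register_netdev() succeeds, while it stays in
hw_features so "ethtool -K ethX tx-nocache-copy on" can still re-enable it:

#include <linux/netdevice.h>
#include <linux/rtnetlink.h>

/* Sketch only: make "tx-nocache-copy off" the default for this device.
 * Call once from the driver's probe path, after register_netdev() succeeds.
 */
static void mydrv_default_nocache_copy_off(struct net_device *netdev)
{
        rtnl_lock();
        /* Drop the bit from wanted_features and let the core recompute
         * netdev->features; the bit remains in hw_features, so it can
         * still be flipped at runtime with ethtool -K.
         */
        netdev->wanted_features &= ~NETIF_F_NOCACHE_COPY;
        netdev_update_features(netdev);
        rtnl_unlock();
}
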
Turning tx-nocache-copy off at least mirrors the performance improvement
I see when forwarding; however, I still see spare CPU time. Is there a
way to push it even further?
-Kyle
* Re: Peak TCP performance
From: Rick Jones @ 2013-10-11 17:46 UTC
To: Kyle Hubert, Eric Dumazet; +Cc: netdev
On 10/10/2013 09:21 PM, Kyle Hubert wrote:
> On Thu, Oct 10, 2013 at 11:44 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> Also, my copy of ethtool does not recognize tx-nocache-copy. However,
>>> I do have control over the net device. Is there something there I can
>>> set, or is tx-nocache-copy also a new feature? I'll start digging.
>>>
>>
>> nocache-copy was added in 3.0, but I find it's not a gain for current
>> CPUs.
>>
>> You could get a fresh copy of the ethtool sources:
>>
>> git clone git://git.kernel.org/pub/scm/network/ethtool/ethtool.git
>> cd ethtool
>> ./autogen.sh ...
>
> That did the trick. Thanks for the help! Is there somewhere I can read
> up on this feature? A lot of the netdev features are opaque to me.
> Also, I can control NETIF_F_NOCACHE_COPY in netdev->features to make
> this the default, yes?
>
> Turning tx-nocache-copy off at least mirrors the performance improvement
> I see when forwarding; however, I still see spare CPU time. Is there a
> way to push it even further?
Thought I would point out that unless you take concrete steps to make it
behave otherwise, netperf will constantly present the same set of
cache-clean buffers to the transport. The size of those buffers will be
determined by some heuristics and will depend on the socket buffer size
at the time the data socket is created, which itself will depend on
whether or not you have used a test-specific -s option. The
test-specific -m option will also come into play.
If I am recalling correctly, the number of buffers will be one more than:

  initial SO_SNDBUF / send_size

though you can control that with the global -W option.
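(As a purely illustrative example: a 64 KB initial SO_SNDBUF with 16 KB
sends would give 65536 / 16384 + 1 = 5 distinct buffers being cycled
through; those numbers are hypothetical, not netperf defaults.)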
happy benchmarking,
rick jones