netdev.vger.kernel.org archive mirror
* [Question] TCP stack performance decrease since 3.14
@ 2015-04-15 19:31 rapier
  2015-04-15 21:01 ` Eric Dumazet
  2015-04-15 21:31 ` Rick Jones
  0 siblings, 2 replies; 5+ messages in thread
From: rapier @ 2015-04-15 19:31 UTC (permalink / raw)
  To: netdev

All,

First, my apologies if this came up previously but I couldn't find 
anything using a keyword search of the mailing list archive.

As part of the ongoing work with web10g I need to come up with baseline 
TCP stack performance for various kernel revisions. Using netperf and 
super_netperf* I've found that performance for TCP_CC, TCP_RR, and 
TCP_CRR has decreased since 3.14.

	3.14	3.18	4.0 	decrease %
TCP_CC	183945	179222	175793	4.4%
TCP_RR	594495	585484	561365	5.6%
TCP_CRR	98677	96726	93026	5.7%
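(For reference, the "decrease %" column follows from the 3.14 and 4.0 
columns; a quick recomputation as a shell check:)

```shell
# Recompute the "decrease %" column from the 3.14 and 4.0 figures above.
decrease() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'; }
decrease 183945 175793   # TCP_CC  -> 4.4
decrease 594495 561365   # TCP_RR  -> 5.6
decrease  98677  93026   # TCP_CRR -> 5.7
```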

Stream tests have remained the same from 3.14 through 4.0.

All tests were conducted on the same platform, from a clean boot, with 
stock kernels.

So my questions are:

Has anyone else seen this, or is it a result of some weirdness on my 
system or an artifact of my tests?

If others have seen this, or if it is simply to be expected (from new 
features and the like), is it due to the TCP stack itself or to other 
changes in the kernel?

If so, is there any way to mitigate the effect via stack tuning, 
kernel configuration, etc.?

Thanks!

Chris


* The above results are the average of 10 iterations of super_netperf 
for each test. I can run more iterations to verify the results, but they 
seem consistent. The number of parallel processes for each test was 
tuned to produce the maximum result: enough to push things, but not 
enough to cause performance hits from being CPU/memory/etc. bound. If 
anyone wants the full results and test scripts, just let me know.
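(For anyone reproducing this, the per-cell procedure described above 
looks roughly like the following sketch; "testhost", the stream count, 
and the 60-second run length are placeholders, not the exact parameters 
used:)

```shell
# Rough sketch of producing one table cell: N iterations of
# super_netperf, with the per-iteration totals averaged.
run_test() {    # usage: run_test <netperf-test> <parallel-streams> <iterations>
    for i in $(seq "$3"); do
        ./super_netperf "$2" -H testhost -t "$1" -l 60
    done | awk '{ sum += $1 } END { printf "%.0f\n", sum / NR }'
}
# e.g.: run_test TCP_RR 16 10
```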

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Question] TCP stack performance decrease since 3.14
  2015-04-15 19:31 [Question] TCP stack performance decrease since 3.14 rapier
@ 2015-04-15 21:01 ` Eric Dumazet
  2015-04-15 21:38   ` rapier
  2015-04-15 21:31 ` Rick Jones
  1 sibling, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2015-04-15 21:01 UTC (permalink / raw)
  To: rapier; +Cc: netdev

On Wed, 2015-04-15 at 15:31 -0400, rapier wrote:
> All,
> 
> As part of the ongoing work with web10g I need to come up with baseline 
> TCP stack performance for various kernel revisions. Using netperf and 
> super_netperf* I've found that performance for TCP_CC, TCP_RR, and 
> TCP_CRR has decreased since 3.14.
> 
> [snip]

Make sure you do not hit a c-state issue.

I've seen improvements in the stack translate into longer idle wait
times, and the CPU then takes longer to exit deep c-states.
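(One quick way to check is the cpuidle residency counters in sysfs; a 
sketch, noting the paths exist only when cpuidle is active:)

```shell
# Print per-state entry counts and residency for CPU0. Compare the
# deep-state "usage"/"time" numbers before and after a benchmark run.
show_cstates() {
    for d in /sys/devices/system/cpu/cpu0/cpuidle/state*/; do
        [ -e "${d}name" ] || { echo "cpuidle not available"; return 0; }
        printf '%s: usage=%s time_us=%s\n' \
            "$(cat "${d}name")" "$(cat "${d}usage")" "$(cat "${d}time")"
    done
}
show_cstates
```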


* Re: [Question] TCP stack performance decrease since 3.14
  2015-04-15 19:31 [Question] TCP stack performance decrease since 3.14 rapier
  2015-04-15 21:01 ` Eric Dumazet
@ 2015-04-15 21:31 ` Rick Jones
  1 sibling, 0 replies; 5+ messages in thread
From: Rick Jones @ 2015-04-15 21:31 UTC (permalink / raw)
  To: rapier, netdev

On 04/15/2015 12:31 PM, rapier wrote:
> All,
>
> First, my apologies if this came up previously but I couldn't find
> anything using a keyword search of the mailing list archive.
>
> As part of the ongoing work with web10g I need to come up with baseline
> TCP stack performance for various kernel revisions. Using netperf and
> super_netperf* I've found that performance for TCP_CC, TCP_RR, and
> TCP_CRR has decreased since 3.14.
>
>      3.14    3.18    4.0     decrease %
> TCP_CC    183945    179222    175793    4.4%
> TCP_RR    594495    585484    561365    5.6%
> TCP_CRR    98677    96726    93026    5.7%
>
> Stream tests have remained the same from 3.14 through 4.0.

Have the service demands (usec of CPU consumed per KB) remained the same 
on the stream tests?  Even then, stateless offloads can help hide a 
multitude of path-length sins.

> All tests were conducted on the same platform from clean boot with stock
> kernels.
>
> So my questions are:
>
> Has anyone else seen this, or is it a result of some weirdness on my
> system or an artifact of my tests?

I've wondered if such a thing might be taking place but never had a 
chance to check.

One thing you might consider is "perf" profiling to see how the CPU 
consumption breakdown has changed.
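(A minimal version of that suggestion, assuming perf is installed and 
the actual benchmark command is substituted for the placeholder "sleep 1"; 
a sketch only:)

```shell
# System-wide profile of a benchmark run, then the hottest entries.
# Replace "sleep 1" with the real netperf/super_netperf invocation.
profile_run() {
    command -v perf >/dev/null || { echo "perf not installed"; return 0; }
    perf record -a -g -o /tmp/perf.data -- "$@" 2>/dev/null ||
        { echo "perf record failed (system-wide mode usually needs root)"; return 0; }
    perf report --stdio -i /tmp/perf.data 2>/dev/null | head -20
}
profile_run sleep 1
```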

happy benchmarking,

rick jones



* Re: [Question] TCP stack performance decrease since 3.14
  2015-04-15 21:01 ` Eric Dumazet
@ 2015-04-15 21:38   ` rapier
  2015-04-15 22:13     ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: rapier @ 2015-04-15 21:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev



On 4/15/15 5:01 PM, Eric Dumazet wrote:
> On Wed, 2015-04-15 at 15:31 -0400, rapier wrote:
>> [original report snipped]
>
> Make sure you do not hit a c-state issue.
>
> I've seen improvements in the stack translate to longer wait times, and
> cpu takes longer to exit deep c-state.

I believe I properly disabled CPU power management in the BIOS (the 
Lenovo BIOS isn't terribly clear on this). I then booted with 
processor.max_cstate=1 idle=poll (and also tried 
intel_idle.max_cstate=0 and combinations thereof). I'm still seeing 
reduced performance in comparison to 3.14. I'll try using 
/dev/cpu_dma_latency instead when I get in tomorrow. If you have other 
suggestions for verifying c-state behavior I'd be happy to hear them.
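(For reference, the /dev/cpu_dma_latency approach amounts to holding the 
device open with a 0-microsecond target for the duration of the run; a 
sketch, which needs root on a real system; the device path is a 
parameter here purely for illustration:)

```shell
# Request a 0us PM QoS latency target. The kernel drops the request
# when the file descriptor is closed, so hold fd 3 open across the run.
hold_cpu_dma_latency() {    # usage: hold_cpu_dma_latency <device-path>
    exec 3> "$1"
    printf '\0\0\0\0' >&3   # 0, written as a 32-bit binary value
}
# On a real system:
#   hold_cpu_dma_latency /dev/cpu_dma_latency
#   ... run the benchmarks ...
#   exec 3>&-               # release the latency request
```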

As a note, 3.2 tests more than 18% faster in the above categories.

Chris


* Re: [Question] TCP stack performance decrease since 3.14
  2015-04-15 21:38   ` rapier
@ 2015-04-15 22:13     ` Eric Dumazet
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2015-04-15 22:13 UTC (permalink / raw)
  To: rapier; +Cc: netdev

On Wed, 2015-04-15 at 17:38 -0400, rapier wrote:

> I believe I properly disabled CPU power management in the BIOS (the 
> Lenovo BIOS isn't terribly clear on this). I then booted with 
> processor.max_cstate=1 idle=poll (and also tried 
> intel_idle.max_cstate=0 and combinations thereof). I'm still seeing 
> reduced performance in comparison to 3.14. I'll try using 
> /dev/cpu_dma_latency instead when I get in tomorrow. If you have other 
> suggestions for verifying c-state behavior I'd be happy to hear them.
> 
> As a note, 3.2 tests more than 18% faster in the above categories.

Wait a minute, are you testing TCP on a single laptop over loopback?

Multiple netperf instances consume a lot of RAM, and any change in
vmlinux size can impact performance simply because your CPU cache is too
small to cope with the increase.

Between the 3.2 and 4.0 kernels, TCP networking changes are maybe 2% of
the overall changes.

A 2.6.1 kernel in 32-bit mode would be much faster, you know.

Maybe 40% faster.

# ls -l /boot/vmlinuz-*
-rwxr-xr-x 1 root root 3719696 2015-04-15 14:09 /boot/vmlinuz-3.14.0-smp
-rwxr-xr-x 1 root root 3911520 2015-04-15 14:30 /boot/vmlinuz-4.0.0-smp
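(For what it's worth, the relative growth between those two images is 
roughly the same magnitude as the regressions reported above, which is 
at least consistent with the cache-footprint point:)

```shell
# Relative vmlinuz size growth from 3.14 to 4.0, per the listing above.
awk 'BEGIN { printf "%.1f%%\n", (3911520 - 3719696) / 3719696 * 100 }'   # -> 5.2%
```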

^ permalink raw reply	[flat|nested] 5+ messages in thread
