AIM9 regression

netdev.vger.kernel.org archive mirror

* AIM9 regression
@ 2008-09-23 18:14 Christoph Lameter
  2008-09-23 20:36 ` Stephen Hemminger
  2008-09-24  5:12 ` Herbert Xu
  0 siblings, 2 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-23 18:14 UTC (permalink / raw)
  To: David Miller; +Cc: Netdev, Herbert Xu

I just don't seem to be able to get 2.6.27 to behave in a speedy way
network-wise. I configured out various components (netfilter, etc.) but I
still keep getting these aim9 results against 2.6.22:

47 misc_rtns_1     448038.00   430118.00  -17920.00  -4.00% Auxiliary Loops/second
48 dir_rtns_1     2412587.41  2723000.00  310412.59  12.87% Directory Operations/second
49 shell_rtns_1       364.30      345.80     -18.50  -5.08% Shell Scripts/second
50 shell_rtns_2       364.20      355.34      -8.86  -2.43% Shell Scripts/second
51 shell_rtns_3       363.30      353.60      -9.70  -2.67% Shell Scripts/second
52 series_1       6694290.00  6706690.00   12400.00   0.19% Series Evaluations/second
53 shared_memory  1042900.00  1080630.00   37730.00   3.62% Shared Memory Operations/second
54 tcp_test        352035.00   278442.00  -73593.00 -20.91% TCP/IP Messages/second
55 udp_test        640940.00   585570.00  -55370.00  -8.64% UDP/IP DataGrams/second
56 fifo_test       772440.00   932330.00  159890.00  20.70% FIFO Messages/second
57 stream_pipe    1222870.00  1230140.00    7270.00   0.59% Stream Pipe Messages/second
58 dgram_pipe     1143106.89  1152730.00    9623.11   0.84% DataGram Pipe Messages/second
59 pipe_cpy        867850.00  1065430.00  197580.00  22.77% Pipe Messages/second
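
For reference, the delta and percentage columns can be reproduced from the two result columns. This is only a minimal sketch: it assumes the first column is the 2.6.22 baseline and the second is 2.6.27 (the post does not label them), and the function name is made up for illustration.

```python
def relative_change(baseline, current):
    """Return (delta, percent change) of current relative to baseline."""
    delta = current - baseline
    return delta, 100.0 * delta / baseline


# Values copied from the tcp_test and udp_test rows of the table.
# Assumption: first number is the 2.6.22 result, second is 2.6.27.
rows = {
    "tcp_test": (352035.00, 278442.00),
    "udp_test": (640940.00, 585570.00),
}

for name, (baseline, current) in rows.items():
    delta, pct = relative_change(baseline, current)
    print(f"{name}: {delta:+.2f} ({pct:+.2f}%)")
```

Under that assumption the printed deltas and percentages should line up with the third and fourth numeric columns of the same rows.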


* Re: AIM9 regression
  2008-09-23 18:14 AIM9 regression Christoph Lameter
@ 2008-09-23 20:36 ` Stephen Hemminger
  2008-09-23 20:40   ` Christoph Lameter
  2008-09-24  5:12 ` Herbert Xu
  1 sibling, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-23 20:36 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, Netdev, Herbert Xu

On Tue, 23 Sep 2008 13:14:27 -0500
Christoph Lameter <cl@linux-foundation.org> wrote:

> I just don't seem to be able to get 2.6.27 to behave in a speedy way network
> wise. Configured out various components (netfilter, etc etc) but I still keep
> getting these aim9 results against 2.6.22:

Hardware configuration please?


* Re: AIM9 regression
  2008-09-23 20:36 ` Stephen Hemminger
@ 2008-09-23 20:40   ` Christoph Lameter
  2008-09-23 20:43     ` Christoph Lameter
  2008-09-24  1:20     ` Jeff Garzik
  0 siblings, 2 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-23 20:40 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, Netdev, Herbert Xu

Stephen Hemminger wrote:

> Hardware configuration please?

Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667MHz memory.


* Re: AIM9 regression
  2008-09-23 20:40   ` Christoph Lameter
@ 2008-09-23 20:43     ` Christoph Lameter
  2008-09-24  1:20     ` Jeff Garzik
  1 sibling, 0 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-23 20:43 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, Netdev, Herbert Xu

Christoph Lameter wrote:

> Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667MHz memory.

So 8G RAM, 8 processors total.


* Re: AIM9 regression
  2008-09-23 20:40   ` Christoph Lameter
  2008-09-23 20:43     ` Christoph Lameter
@ 2008-09-24  1:20     ` Jeff Garzik
  2008-09-24  3:11       ` David Miller
  1 sibling, 1 reply; 20+ messages in thread
From: Jeff Garzik @ 2008-09-24  1:20 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Stephen Hemminger, David Miller, Netdev, Herbert Xu

Christoph Lameter wrote:
> Stephen Hemminger wrote:
> 
>> Hardware configuration please?
> 
> Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667MHz memory.

Network hardware configuration?

Or is the TCP test over loopback?




* Re: AIM9 regression
  2008-09-24  1:20     ` Jeff Garzik
@ 2008-09-24  3:11       ` David Miller
  2008-09-24 14:20         ` Christoph Lameter
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24  3:11 UTC (permalink / raw)
  To: jeff; +Cc: cl, shemminger, netdev, herbert

From: Jeff Garzik <jeff@garzik.org>
Date: Tue, 23 Sep 2008 21:20:40 -0400

> Christoph Lameter wrote:
> > Stephen Hemminger wrote:
> > 
> >> Hardware configuration please?
> > Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667MHz memory.
> 
> Network hardware configuration?
> 
> Or is the TCP test over loopback?

I'm pretty sure it's over loopback :-)


* Re: AIM9 regression
  2008-09-24  3:11       ` David Miller
@ 2008-09-24 14:20         ` Christoph Lameter
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-24 14:20 UTC (permalink / raw)
  To: David Miller; +Cc: jeff, shemminger, netdev, herbert

David Miller wrote:
> From: Jeff Garzik <jeff@garzik.org>
> Date: Tue, 23 Sep 2008 21:20:40 -0400
> 
>> Christoph Lameter wrote:
>>> Stephen Hemminger wrote:
>>>
>>>> Hardware configuration please?
>>> Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667MHz memory.
>> Network hardware configuration?
>>
>> Or is the TCP test over loopback?
> 
> I'm pretty sure it's over loopback :-)

Correct.



* Re: AIM9 regression
  2008-09-23 18:14 AIM9 regression Christoph Lameter
  2008-09-23 20:36 ` Stephen Hemminger
@ 2008-09-24  5:12 ` Herbert Xu
  2008-09-24  5:18   ` David Miller
  1 sibling, 1 reply; 20+ messages in thread
From: Herbert Xu @ 2008-09-24  5:12 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, Netdev

On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> I just don't seem to be able to get 2.6.27 to behave in a speedy way network
> wise. Configured out various components (netfilter, etc etc) but I still keep
> getting these aim9 results against 2.6.22:

Could you please compare this against something less ancient,
like 2.6.26 perhaps?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: AIM9 regression
  2008-09-24  5:12 ` Herbert Xu
@ 2008-09-24  5:18   ` David Miller
  2008-09-24 15:16     ` Stephen Hemminger
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24  5:18 UTC (permalink / raw)
  To: herbert; +Cc: cl, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 24 Sep 2008 13:12:37 +0800

> On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > I just don't seem to be able to get 2.6.27 to behave in a speedy way network
> > wise. Configured out various components (netfilter, etc etc) but I still keep
> > getting these aim9 results against 2.6.22:
> 
> Could you please compare this against something less ancient,
> like 2.6.26 perhaps?

Herbert, this is part of the tbench regression issues.  Christoph
took tbench from 2.6.22 until 2.6.27 and at basically every release
tbench performance suffered noticeably.

Now, he's taking the AIM9 benchmark networking numbers and showing
that the same exact effect is seen there too.

It really behooves us to start doing something proactive about this
blindingly obvious set of networking performance regressions through
the past 6 or so releases instead of barking at the reporters saying
things like "try this, try that, what's your config" etc.

:-)


* Re: AIM9 regression
  2008-09-24  5:18   ` David Miller
@ 2008-09-24 15:16     ` Stephen Hemminger
  2008-09-24 19:10       ` Christoph Lameter
  2008-09-24 19:36       ` David Miller
  0 siblings, 2 replies; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-24 15:16 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, cl, netdev

On Tue, 23 Sep 2008 22:18:31 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Wed, 24 Sep 2008 13:12:37 +0800
> 
> > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > > I just don't seem to be able to get 2.6.27 to behave in a speedy way network
> > > wise. Configured out various components (netfilter, etc etc) but I still keep
> > > getting these aim9 results against 2.6.22:
> > 
> > Could you please compare this against something less ancient,
> > like 2.6.26 perhaps?
> 
> Herbert, this is part of the tbench regression issues.  Christoph
> took tbench from 2.6.22 until 2.6.27 and at basically every release
> tbench performance suffered noticeably.
> 
> Now, he's taking the AIM9 benchmark networking numbers and showing
> that the same exact effect is seen there too.
> 
> It really behooves us to start doing something proactive about this
> blindingly obvious set of networking performance regressions through
> the past 6 or so releases instead of barking at the reporters saying
> things like "try this, try that, what's your config" etc.
> 
> :-)

These loopback benchmarks are often more sensitive to scheduler than networking
changes.


* Re: AIM9 regression
  2008-09-24 15:16     ` Stephen Hemminger
@ 2008-09-24 19:10       ` Christoph Lameter
  2008-09-24 19:53         ` David Miller
  2008-09-24 19:36       ` David Miller
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Lameter @ 2008-09-24 19:10 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, herbert, netdev

Stephen Hemminger wrote:

> These loopback benchmarks are often more sensitive to scheduler than networking
> changes.

Just ran a test with real NICs which show the same issues. I guess I need to
get familiar with the network stack and start hacking on it. Sigh.




* Re: AIM9 regression
  2008-09-24 19:10       ` Christoph Lameter
@ 2008-09-24 19:53         ` David Miller
  2008-09-24 21:34           ` Stephen Hemminger
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24 19:53 UTC (permalink / raw)
  To: cl; +Cc: shemminger, herbert, netdev

From: Christoph Lameter <cl@linux-foundation.org>
Date: Wed, 24 Sep 2008 14:10:54 -0500

> Stephen Hemminger wrote:
> 
> > These loopback benchmarks are often more sensitive to scheduler than networking
> > changes.
> 
> Just ran a test with real NICs which show the same issues. I guess I need to
> get familiar with the network stack and start hacking on it. Sigh.

I feel your pain, I think people are being very unreasonable in their
analysis of your numbers, and for this I want to personally apologize.

It's clearly a networking issue in my eyes, and I wish my co-developers
in networking would treat it as such instead of pushing the blame under
the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit
without any facts on this specific case to back up such accusations.


* Re: AIM9 regression
  2008-09-24 19:53         ` David Miller
@ 2008-09-24 21:34           ` Stephen Hemminger
  2008-09-24 22:26             ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-24 21:34 UTC (permalink / raw)
  To: David Miller; +Cc: cl, herbert, netdev

On Wed, 24 Sep 2008 12:53:46 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Christoph Lameter <cl@linux-foundation.org>
> Date: Wed, 24 Sep 2008 14:10:54 -0500
> 
> > Stephen Hemminger wrote:
> > 
> > > These loopback benchmarks are often more sensitive to scheduler than networking
> > > changes.
> > 
> > Just ran a test with real NICs which show the same issues. I guess I need to
> > get familiar with the network stack and start hacking on it. Sigh.
> 
> I feel your pain, I think people are being very unreasonable in their
> analysis of your numbers, and for this I want to personally apologize.
> 
> It's clearly a networking issue in my eyes, and I wish my co-developers
> in networking would treat it as such instead of pushing the blame under
> the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit
> without any facts on this specific case to back up such accusations.

Is this a one time change, or has networking been getting slower over time?


* Re: AIM9 regression
  2008-09-24 21:34           ` Stephen Hemminger
@ 2008-09-24 22:26             ` David Miller
  0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2008-09-24 22:26 UTC (permalink / raw)
  To: shemminger; +Cc: cl, herbert, netdev

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 24 Sep 2008 14:34:19 -0700

> On Wed, 24 Sep 2008 12:53:46 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 
> > From: Christoph Lameter <cl@linux-foundation.org>
> > Date: Wed, 24 Sep 2008 14:10:54 -0500
> > 
> > > Stephen Hemminger wrote:
> > > 
> > > > These loopback benchmarks are often more sensitive to scheduler than networking
> > > > changes.
> > > 
> > > Just ran a test with real NICs which show the same issues. I guess I need to
> > > get familiar with the network stack and start hacking on it. Sigh.
> > 
> > I feel your pain, I think people are being very unreasonable in their
> > analysis of your numbers, and for this I want to personally apologize.
> > 
> > It's clearly a networking issue in my eyes, and I wish my co-developers
> > in networking would treat it as such instead of pushing the blame under
> > the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit
> > without any facts on this specific case to back up such accusations.
> 
> Is this a one time change, or has networking been getting slower over time?

As per the tbench thread, it's been getting slower and slower, every
single release, since as far back as people have tested, which seems
to be 2.6.22 or thereabouts.


* Re: AIM9 regression
  2008-09-24 15:16     ` Stephen Hemminger
  2008-09-24 19:10       ` Christoph Lameter
@ 2008-09-24 19:36       ` David Miller
  2008-09-29 14:24         ` Ilpo Järvinen
  1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24 19:36 UTC (permalink / raw)
  To: shemminger; +Cc: herbert, cl, netdev

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 24 Sep 2008 08:16:03 -0700

> On Tue, 23 Sep 2008 22:18:31 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Wed, 24 Sep 2008 13:12:37 +0800
> > 
> > > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > > > I just don't seem to be able to get 2.6.27 to behave in a speedy way network
> > > > wise. Configured out various components (netfilter, etc etc) but I still keep
> > > > getting these aim9 results against 2.6.22:
> > > 
> > > Could you please compare this against something less ancient,
> > > like 2.6.26 perhaps?
> > 
> > Herbert, this is part of the tbench regression issues.  Christoph
> > took tbench from 2.6.22 until 2.6.27 and at basically every release
> > tbench performance suffered noticeably.
> > 
> > Now, he's taking the AIM9 benchmark networking numbers and showing
> > that the same exact effect is seen there too.
> > 
> > It really behooves us to start doing something proactive about this
> > blindingly obvious set of networking performance regressions through
> > the past 6 or so releases instead of barking at the reporters saying
> > things like "try this, try that, what's your config" etc.
> > 
> > :-)
> 
> These loopback benchmarks are often more sensitive to scheduler than networking
> changes.

When it gets to 20%, I strongly start to doubt that, and this is exactly
what's happening here.

What is it going to take to actually get someone to start profiling and
analyzing this?  :-)



* Re: AIM9 regression
  2008-09-24 19:36       ` David Miller
@ 2008-09-29 14:24         ` Ilpo Järvinen
  2008-09-29 14:54           ` Christoph Lameter
  0 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2008-09-29 14:24 UTC (permalink / raw)
  To: David Miller; +Cc: shemminger, herbert, cl, netdev

On Wed, 24 Sep 2008, David Miller wrote:

> From: Stephen Hemminger <shemminger@vyatta.com>
> Date: Wed, 24 Sep 2008 08:16:03 -0700
> 
> > On Tue, 23 Sep 2008 22:18:31 -0700 (PDT)
> > David Miller <davem@davemloft.net> wrote:
> > 
> > > From: Herbert Xu <herbert@gondor.apana.org.au>
> > > Date: Wed, 24 Sep 2008 13:12:37 +0800
> > > 
> > > > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > > > > I just don't seem to be able to get 2.6.27 to behave in a speedy way network
> > > > > wise. Configured out various components (netfilter, etc etc) but I still keep
> > > > > getting these aim9 results against 2.6.22:
> > > > 
> > > > Could you please compare this against something less ancient,
> > > > like 2.6.26 perhaps?
> > > 
> > > Herbert, this is part of the tbench regression issues.  Christoph
> > > took tbench from 2.6.22 until 2.6.27 and at basically every release
> > > tbench performance suffered noticeably.
> > > 
> > > Now, he's taking the AIM9 benchmark networking numbers and showing
> > > that the same exact effect is seen there too.
> > > 
> > > It really behooves us to start doing something proactive about this
> > > blindingly obvious set of networking performance regressions through
> > > the past 6 or so releases instead of barking at the reporters saying
> > > things like "try this, try that, what's your config" etc.
> > > 
> > > :-)
> > 
> > These loopback benchmarks are often more sensitive to scheduler than 
> > networking changes.
> 
> When it gets to 20%, I strongly start to doubt that, and this is exactly
> what's happening here.
>
> What is it going to take to actually get someone to start profiling and
> analyzing this?  :-)

...Earlier I was going to answer "time?", but now that I've been there,
"more time" seems more appropriate... So far I haven't been able to find a
way to produce a reproducible series of result numbers with aim9
tcp_test... the results seem to vary within that (at least) 20% margin. Can
Christoph actually get stable numbers out of it with the 27-rcs? (I haven't
extensively tested .22 with long test durations yet, but the same problem
seems to occur there as well when short tests are used.)

...And one thing I've learned: I couldn't even finish a test run with
conntrack and default settings, as ipv4 conntrack ran out of entries :-).

Oh, I almost forgot: I did get a stable regression with lockdep, though.
I hope we've gained some more detection power in return for the lost
performance.

These are the functions with the largest variation (in absolute sample
counts) across three consecutive 1000-second runs of aim9 tcp_test, with
aim9 and its data on tmpfs (nodebug-nonf config). Columns: three oprofile
runs as (absolute samples, %) pairs, function, max-min, (max-min)/min:

266288	1.0221	420190	1.6457	614494	2.4039	vfs_read	348206	1.30763
233649	0.8968	317763	1.2446	508838	1.9906	vfs_write	275189	1.17779
228732	0.8779	496359	1.9440	324747	1.2704	dnotify_parent	267627	1.17005
671548	2.5776	592604	2.3210	445792	1.7440	inet_csk_get_port	225756	0.506416
392960	1.5083	362665	1.4204	491234	1.9217	netif_rx	128569	0.354512
121337	0.4657	208314	0.8159	249783	0.9772	do_sync_write	128446	1.05859
164951	0.6331	168276	0.6591	285451	1.1167	loopback_xmit	120500	0.73052
359659	1.3805	242133	0.9483	256785	1.0046	__tcp_select_window	117526	0.485378
876319	3.3636	762690	2.9872	772554	3.0223	tcp_sendmsg	113629	0.148985
266895	1.0244	199204	0.7802	176985	0.6924	tcp_established_options	89910	0.508009
689652	2.6471	647962	2.5378	608943	2.3822	dev_queue_xmit	80709	0.132539
206754	0.7936	265523	1.0400	284087	1.1114	__kmalloc_track_caller	77333	0.374034
544026	2.0882	496654	1.9452	571982	2.2376	tcp_recvmsg	75328	0.151671
600414	2.3046	525704	2.0590	567588	2.2204	ip_queue_xmit	74710	0.142114
131820	0.5060	59259	0.2321	121586	0.4757	getnstimeofday	72561	1.22447
67061	0.2574	132155	0.5176	137914	0.5395	rw_verify_area	70853	1.05655
129676	0.4977	60652	0.2376	98307	0.3846	sock_rfree	69024	1.13803
535701	2.0562	586248	2.2961	517563	2.0247	ip_finish_output	68685	0.132708
692187	2.6568	634962	2.4869	623888	2.4407	tcp_rcv_established	68299	0.109473
949233	3.6435	900741	3.5279	882256	3.4514	tcp_transmit_skb	66977	0.0759156

...as said, the variation in the aim9 results was ~20% at most.
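
The last two columns of the profile table follow directly from the three absolute sample counts in each row. A minimal sketch of that arithmetic, using two rows copied from the table (the helper name `spread` is only illustrative):

```python
def spread(samples):
    """Return (max - min, (max - min) / min) for a list of sample counts."""
    lo, hi = min(samples), max(samples)
    return hi - lo, (hi - lo) / lo


# Absolute oprofile sample counts of the three runs, copied from the table.
runs = {
    "vfs_read": [266288, 420190, 614494],
    "tcp_transmit_skb": [949233, 900741, 882256],
}

for func, samples in runs.items():
    diff, rel = spread(samples)
    print(f"{func}: max-min={diff} (max-min)/min={rel:.6g}")
```

Sorting the rows by the first returned value gives the "top variations in absolute numbers" ordering used in the table.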


-- 
 i.


* Re: AIM9 regression
  2008-09-29 14:24         ` Ilpo Järvinen
@ 2008-09-29 14:54           ` Christoph Lameter
  2008-09-29 15:12             ` Ilpo Järvinen
                               ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-29 14:54 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: David Miller, shemminger, herbert, netdev

Ilpo Järvinen wrote:

> ...I was thinking earlier to answer "time?", but now once been there, it 
> seems that more time is more appropriate... So far I haven't been able to 
> find a way to create a reproducible series of result numbers with aim9 
> tcp_test... it seems that the results vary within that (at least) 20% 
> margin. Can Christoph actually get stable numbers out of it with 27-rcs
> (I haven't extensively tested .22 yet with long test durations but it 
> seems that same problem occurs with it as well if short tests were used)?

Results fluctuate by 10 - 25%. The problem occurs with the short
durations as well. If this is due to the additional code complexity in later
kernels, as we suspect, then it may be an issue with CPU cache effectiveness.

Going to 64-bit binaries also yields a significant hit (as high as 30%), which
also indicates caching issues.

Both 64-bit kernels and later kernels cause the variability of results to
increase. Going to 64-bit has double the effect of going to a 2.6.27 kernel.
These are all indications of CPU caching issues. The L1 cache may become
ineffective due to the increased cache footprint.


* Re: AIM9 regression
  2008-09-29 14:54           ` Christoph Lameter
@ 2008-09-29 15:12             ` Ilpo Järvinen
  2008-09-29 15:36             ` Stephen Hemminger
  2008-10-31 14:57             ` Ilpo Järvinen
  2 siblings, 0 replies; 20+ messages in thread
From: Ilpo Järvinen @ 2008-09-29 15:12 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, shemminger, Herbert Xu, Netdev


On Mon, 29 Sep 2008, Christoph Lameter wrote:

> Ilpo Järvinen wrote:
> 
> > So far I haven't been able to 
> > find a way to create a reproducible series of result numbers with aim9 
> > tcp_test... it seems that the results vary within that (at least) 20% 
> > margin. Can Christoph actually get stable numbers out of it with 27-rcs
> > (I haven't extensively tested .22 yet with long test durations but it 
> > seems that same problem occurs with it as well if short tests were used)?
> 
> Results fluctuate between 10 - 25%. The problem occurs with the short
> durations as well. If this is due to the additional code complexity in later
> kernels as we suspect then it may be an issue with cpu cache effectiveness.

Hmm... I'll try to extract some very raw (and possibly somewhat skewed) 
numbers out of that based on the profiles I have and some of acme's tools.

> Going to 64 bit binaries also yields a significant hit (as high as 30%) 
> which also indicates caching issues.
>
> Both 64 bit kernels and later kernels cause the variability of results to
> increase. 64 bit has double the effect than a 2.6.27 kernel. All indications
> of cpu caching issues. The L1 cache may become ineffective due to the
> increased cache footprint.

Ok. I was testing 64-bit only... Next I'll probably try a vast number of
very short tests to see if the results converge after enough samples are
taken; let's hope I don't need many days to get to such a point... :-)

-- 
 i.


* Re: AIM9 regression
  2008-09-29 14:54           ` Christoph Lameter
  2008-09-29 15:12             ` Ilpo Järvinen
@ 2008-09-29 15:36             ` Stephen Hemminger
  2008-10-31 14:57             ` Ilpo Järvinen
  2 siblings, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-29 15:36 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, herbert, netdev, Ilpo Järvinen


----- Original Message -----
From: "Christoph Lameter" <cl@linux-foundation.org>
To: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Cc: "David Miller" <davem@davemloft.net>, shemminger@vyatta.com, herbert@gondor.apana.org.au, netdev@vger.kernel.org
Sent: Monday, September 29, 2008 4:54:11 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: AIM9 regression

Ilpo Järvinen wrote:

> ...I was thinking earlier to answer "time?", but now once been there, it 
> seems that more time is more appropriate... So far I haven't been able to 
> find a way to create a reproducible series of result numbers with aim9 
> tcp_test... it seems that the results vary within that (at least) 20% 
> margin. Can Christoph actually get stable numbers out of it with 27-rcs
> (I haven't extensively tested .22 yet with long test durations but it 
> seems that same problem occurs with it as well if short tests were used)?

Results fluctuate between 10 - 25%. The problem occurs with the short
durations as well. If this is due to the additional code complexity in later
kernels as we suspect then it may be an issue with cpu cache effectiveness.

Going to 64 bit binaries also yields a significant hit (as high as 30%) which
also indicates caching issues.

Both 64 bit kernels and later kernels cause the variability of results to
increase. 64 bit has double the effect than a 2.6.27 kernel. All indications
of cpu caching issues. The L1 cache may become ineffective due to the
increased cache footprint.

-------------
One of the items showing up in the profile is the local side port allocation.
Is the ephemeral port range getting full? If it is then the random port scan
could take a long time to find the next free slot, especially now that source
ports are randomized.
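
To get a feel for the question above, here is a rough sketch of why a nearly full ephemeral range makes random port selection expensive. It is a simplified model (independent uniform probes), not the kernel's actual search, and the sample range string is only an example value; the real one can be read from /proc/sys/net/ipv4/ip_local_port_range on Linux.

```python
def parse_port_range(text):
    """Parse the 'low high' pair as found in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def read_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the current ephemeral port range (Linux only)."""
    with open(path) as f:
        return parse_port_range(f.read())


def expected_probes(total_ports, ports_in_use):
    """Expected number of uniform random probes until a free port is hit.

    Simplified model: each probe is independent, so the count is geometric
    with success probability free/total. The kernel's real search differs.
    """
    free = total_ports - ports_in_use
    if free <= 0:
        raise ValueError("no free ports left")
    return total_ports / free


# Example value only; assumed here to be an inclusive range.
low, high = parse_port_range("32768 61000")
total = high - low + 1
for used_fraction in (0.0, 0.5, 0.9, 0.99):
    in_use = int(total * used_fraction)
    print(f"{used_fraction:.0%} in use: ~{expected_probes(total, in_use):.1f} probes")
```

The point of the model is only the shape of the curve: the cost stays near one probe while the range is mostly empty and grows sharply as it fills up, for example with sockets lingering in TIME_WAIT.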





* Re: AIM9 regression
  2008-09-29 14:54           ` Christoph Lameter
  2008-09-29 15:12             ` Ilpo Järvinen
  2008-09-29 15:36             ` Stephen Hemminger
@ 2008-10-31 14:57             ` Ilpo Järvinen
  2 siblings, 0 replies; 20+ messages in thread
From: Ilpo Järvinen @ 2008-10-31 14:57 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, shemminger, Herbert Xu, Netdev


On Mon, 29 Sep 2008, Christoph Lameter wrote:

> Ilpo Järvinen wrote:
> 
> > ...I was thinking earlier to answer "time?", but now once been there, it 
> > seems that more time is more appropriate... So far I haven't been able to 
> > find a way to create a reproducible series of result numbers with aim9 
> > tcp_test... it seems that the results vary within that (at least) 20% 
> > margin. Can Christoph actually get stable numbers out of it with 27-rcs
> > (I haven't extensively tested .22 yet with long test durations but it 
> > seems that same problem occurs with it as well if short tests were used)?
> 
> Results fluctuate between 10 - 25%. The problem occurs with the short
> durations as well. If this is due to the additional code complexity in later
> kernels as we suspect then it may be an issue with cpu cache effectiveness.
> 
> Going to 64 bit binaries also yields a significant hit (as high as 30%) which
> also indicates caching issues.
> 
> Both 64 bit kernels and later kernels cause the variability of results to
> increase. 64 bit has double the effect than a 2.6.27 kernel. All indications
> of cpu caching issues. The L1 cache may become ineffective due to the
> increased cache footprint.

I experimented with it some and changed tcp_test to bind to a supplied port
instead of relying on the port allocator's randomness; both the server and
the client port were done like that. However, I had to turn tcp_tw_recycle
on to get the test to actually return instead of failing with -ESOMETHING.
In addition, I did sync & drop_caches before each run (I'm not sure whether
it actually reduced variation a bit or whether I just imagined it; if it did
anything, I'd expect it to damp artifacts caused by the test harness), plus
a sleep 20 before each 20-second test.
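
The idea of binding both ends to supplied ports can be sketched with plain sockets. This is not the actual change made to aim9's tcp_test (which is C and is not shown in this thread), just a hypothetical illustration; the function names and the loopback default are assumptions.

```python
import socket


def listen_on(port, host="127.0.0.1"):
    """Create a listening TCP socket bound to an explicit port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def connect_from_fixed_port(server_port, client_port, host="127.0.0.1"):
    """Connect to host:server_port with the client bound to client_port."""
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Lets bind() succeed even if an earlier socket on this port lingers.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Explicit bind: the kernel does not have to pick an ephemeral port.
    client.bind((host, client_port))
    client.connect((host, server_port))
    return client
```

With both ports fixed, every connection reuses the same address and port pair, so a new connect can fail while the previous connection is still in TIME_WAIT. That may be the reason a TIME_WAIT-related knob was needed to get the test to return, though that is a guess.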

Port allocator could be benchmarked separately if so desired.

Here are my current numbers with 64-bit (nodebug & nonf):

 .22   .28-rc2-gsmthg
          GSO/TSO
         off    on
240700 232398 224194
241187 236722 227610
243940 237388 229472
244367 237469 229576
246134 238569 229680
246211 238680 229999
246400 238693 230262
248761 239076 230404
250934 239107 231404
251203 239152 231562
251572 239215 231912
254158 239863 232744
256407 239912 234017
257329 240022 -EINTR
259560 241352 -EINTR

http://www.cs.helsinki.fi/u/ijjarvin/aim9/res.png

TSO/GSO does modulo operations every so often, but Dave is currently
evaluating how to get rid of that, as discussed here:
http://marc.info/?t=122411618000004&r=1&w=2
...There is still some uncertainty about where the rest of Evgeniy's
GSO/TSO off/on difference comes from.

2.6.27-rc7 has basically the same numbers as 2.6.28-rc2, though
I accidentally had ftrace on there, so some extra nops were present.

There is still some regression to attack, but it seems to be considerably
less than 20% when the testing of net_random()'s output is removed.
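
One way to put a number on the remaining regression is to compare a summary statistic of each column in the results table against .22. The sketch below uses the median, which is a choice made here for illustration and not necessarily how the plot linked above was produced; the two -EINTR runs are simply left out.

```python
from statistics import median

# Columns copied from the results table (.22, .28-rc2 GSO/TSO off, on).
kernel_22 = [240700, 241187, 243940, 244367, 246134, 246211, 246400, 248761,
             250934, 251203, 251572, 254158, 256407, 257329, 259560]
gso_off = [232398, 236722, 237388, 237469, 238569, 238680, 238693, 239076,
           239107, 239152, 239215, 239863, 239912, 240022, 241352]
# The two runs that ended with -EINTR are omitted.
gso_on = [224194, 227610, 229472, 229576, 229680, 229999, 230262, 230404,
          231404, 231562, 231912, 232744, 234017]


def pct_vs(baseline, values):
    """Percent change of median(values) relative to median(baseline)."""
    base = median(baseline)
    return 100.0 * (median(values) - base) / base


print(f"GSO/TSO off vs .22: {pct_vs(kernel_22, gso_off):+.2f}%")
print(f"GSO/TSO on  vs .22: {pct_vs(kernel_22, gso_on):+.2f}%")
```

By this measure the medians differ by roughly 4% (GSO/TSO off) and 7% (GSO/TSO on), which is consistent with the remaining regression being well under 20%.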

-- 
 i.
