public inbox for linux-kernel@vger.kernel.org
* reaim - 2.6.3-mm1 IO performance down.
@ 2004-02-25  0:20 cliff white
  2004-02-25  1:03 ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: cliff white @ 2004-02-25  0:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm



Running the reaim 'new_fserver' workload, we now see a performance drop
on 2.6.3-mm1, ext3 filesystem

Kernel          JPM       Max Users  Percent change
linux-2.6.3     10347.87  172          0.00
2.6.3-rc3-mm1    9826.35  164         -5.07
2.6.3-mm1        8938.17  140 (run 1) -13.65
2.6.3-mm1        9100.39  136 (run 2) -12.08

I have put some comparison graphs here, with readprofile data:
http://developer.osdl.org/cliffw/reaim/compares/r_comp/2.6.3_vs_mm1_r1/index.html
http://developer.osdl.org/cliffw/reaim/compares/r_comp/2.6.3_vs_mm1_r2/index.html
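For reference, the percent-change column can be recomputed from the JPM
figures (a quick sketch; it comes out a few hundredths off the posted
column, presumably because the published JPM values are rounded):

```shell
# Percent change in JPM relative to the linux-2.6.3 baseline (10347.87),
# recomputed from the rounded JPM values in the table above.
base=10347.87
for jpm in 9826.35 8938.17 9100.39; do
    awk -v b="$base" -v j="$jpm" 'BEGIN { printf "%+.2f%%\n", (j - b) / b * 100 }'
done
```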

The interesting graph is the bottom one on the page, comparing client
run time to number of children. The -mm1 kernel has a spike in run time,
becoming very noticeable around 60-80 users.

Kernel       Users  JPM       Run Time
linux-2.6.3  60     10327.87  34.16 seconds
2.6.3-mm1    60      8279.75  42.61 seconds

linux-2.6.3  80     10275.23  45.78 seconds
2.6.3-mm1    80      7841.31  59.99 seconds


For the same test on the same machine, results from 2.6.2-rc1-mm2 and 2.6.2-rc3-mm1
were within 1.0% of the linux-2.6.2 runs. So this is new. 

More data and tests if requested - are there some patch sets we should try reverting?
cliffw

-- 
The church is near, but the road is icy.
The bar is far, but i will walk carefully. - Russian proverb

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: reaim - 2.6.3-mm1 IO performance down.
  2004-02-25  0:20 cliff white
@ 2004-02-25  1:03 ` Andrew Morton
  2004-02-25 17:40   ` Cliff White
  2004-02-25 23:36   ` Cliff White
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2004-02-25  1:03 UTC (permalink / raw)
  To: cliff white; +Cc: linux-kernel

cliff white <cliffw@osdl.org> wrote:
>
> For the same test on the same machine, results from 2.6.2-rc1-mm2 and 2.6.2-rc3-mm1
> were within 1.0% of the linux-2.6.2 runs. So this is new. 
> 
> More data and tests if requested - are there some patch sets we should try reverting?

Thanks.  You could try reverting adaptive-lazy-readahead.patch.  If it is
not that I'd be suspecting CPU scheduler changes.  Do you have uniprocessor
test results?



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: reaim - 2.6.3-mm1 IO performance down.
@ 2004-02-25  2:00 Con Kolivas
  2004-02-25  3:31 ` Nick Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Con Kolivas @ 2004-02-25  2:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Cliff White, Andrew Morton



>Running the reaim 'new_fserver' workload, we now see a performance drop on
>2.6.3-mm1, ext3 filesystem

I observed a serious slowdown on non-NUMA, non-SMT machines with kernbench
and the scheduler changes (I posted results a week ago here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=107719112225482&w=2 )

A summary of those results, at half job load (-j4 on an 8-way):
2.6.3: Elapsed Time 231.274
2.6.3-mm1: Elapsed Time 273.688
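For scale, the gap between those two elapsed times works out to roughly an
18% slowdown (a quick recomputation from the figures above):

```shell
# Kernbench elapsed-time regression, 2.6.3 -> 2.6.3-mm1, from the
# 231.274s and 273.688s figures quoted above.
awk 'BEGIN { printf "%.1f%% slower\n", (273.688 - 231.274) / 231.274 * 100 }'
```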

The drop in reaim performance is possibly related.

Con

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: reaim - 2.6.3-mm1 IO performance down.
  2004-02-25  2:00 reaim - 2.6.3-mm1 IO performance down Con Kolivas
@ 2004-02-25  3:31 ` Nick Piggin
  0 siblings, 0 replies; 8+ messages in thread
From: Nick Piggin @ 2004-02-25  3:31 UTC (permalink / raw)
  To: Con Kolivas; +Cc: linux-kernel, Cliff White, Andrew Morton

Con Kolivas wrote:

>
>>Running the reaim 'new_fserver' workload, we now see a performance drop on
>>2.6.3-mm1, ext3 filesystem
>>
>
>I observed a serious slowdown on non numa, non smt machines with kernbench and
>the scheduler changes (posted results a week ago here:
>http://marc.theaimsgroup.com/?l=linux-kernel&m=107719112225482&w=2 )
>
>A summary of those results is half job load (-j4 on 8x):
>2.6.3: Elapsed Time 231.274
>2.6.3-mm1: Elapsed Time 273.688
>
>The drop in reaim performance is possibly related.
>
>

I have been meaning to look at that. The STP wasn't working
for me for a couple of days, but Cliff fixed that so I'll get
on it.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: reaim - 2.6.3-mm1 IO performance down.
  2004-02-25  1:03 ` Andrew Morton
@ 2004-02-25 17:40   ` Cliff White
  2004-02-25 19:16     ` Andrew Morton
  2004-02-25 23:36   ` Cliff White
  1 sibling, 1 reply; 8+ messages in thread
From: Cliff White @ 2004-02-25 17:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

> cliff white <cliffw@osdl.org> wrote:
> >
> > For the same test on the same machine, results from 2.6.2-rc1-mm2 and 2.6.2-rc3-mm1
> > were within 1.0% of the linux-2.6.2 runs. So this is new.
> >
> > More data and tests if requested - are there some patch sets we should try reverting?
> 
> Thanks.  You could try reverting adaptive-lazy-readahead.patch.  If it is
> not that I'd be suspecting CPU scheduler changes.  Do you have uniprocessor
> test results?

I have them for 2.6.3-mm3, and am re-running 2.6.3-mm1 right now.
Gross results are within 1%, but looking at the detail, I do see badness.
For example:

Kernel     Users  Run time
2.6.3      20     32.11
2.6.3-mm3  20     35.47

2.6.3      40     63.64
2.6.3-mm3  40     66.33

Again, this shows up best on the bottom graph on the page.
Graph of 2.6.3 vs 2.6.3-mm3:
http://developer.osdl.org/cliffw/reaim/compares/r_comp/2.6.3_vs_mm1_1cpu/index.html
cliffw

> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: reaim - 2.6.3-mm1 IO performance down.
  2004-02-25 17:40   ` Cliff White
@ 2004-02-25 19:16     ` Andrew Morton
  2004-02-25 20:52       ` Cliff White
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2004-02-25 19:16 UTC (permalink / raw)
  To: Cliff White; +Cc: linux-kernel

Cliff White <cliffw@osdl.org> wrote:
>
>  > Thanks.  You could try reverting adaptive-lazy-readahead.patch.  If it is
>  > not that I'd be suspecting CPU scheduler changes.  Do you have uniprocessor
>  > test results?
> 
>  I have them for 2.6.3-mm3, am re-running 2.6.3-mm1 right now.
>  Gross results are within 1%, but looking at the detail, i do see badness,

OK, I'll do the bsearch.  Again.  Could you please tell me what would be an
appropriate reaim command line with which to reproduce this?
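The "bsearch" Andrew mentions is the usual pre-git-bisect routine:
binary-search over how many patches from the -mm series file are applied,
rebuilding and re-benchmarking at each step. A minimal sketch, with
apply_first_n and run_benchmark as hypothetical stand-ins for the real
reset/patch/build/boot/reaim cycle (here they just simulate the offending
patch sitting at index 37 of a 100-patch series):

```shell
# Simulated bisection over an -mm patch series.  In a real run,
# apply_first_n would reset the tree and apply patches 1..$1 from the
# series file, and run_benchmark would build, boot, and run reaim,
# returning success if JPM is still near the 2.6.3 baseline.
BAD=37
apply_first_n() { APPLIED=$1; }
run_benchmark() { [ "$APPLIED" -lt "$BAD" ]; }   # good while patch 37 is unapplied

lo=0      # known good: plain 2.6.3, no -mm patches
hi=100    # known bad: full -mm series applied
while [ $((hi - lo)) -gt 1 ]; do
    mid=$(( (lo + hi) / 2 ))
    apply_first_n "$mid"
    if run_benchmark; then lo=$mid; else hi=$mid; fi
done
echo "first bad patch index: $hi"
```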


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: reaim - 2.6.3-mm1 IO performance down.
  2004-02-25 19:16     ` Andrew Morton
@ 2004-02-25 20:52       ` Cliff White
  0 siblings, 0 replies; 8+ messages in thread
From: Cliff White @ 2004-02-25 20:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

> Cliff White <cliffw@osdl.org> wrote:
> >
> >  > Thanks.  You could try reverting adaptive-lazy-readahead.patch.  If it is
> >  > not that I'd be suspecting CPU scheduler changes.  Do you have uniprocessor
> >  > test results?
> > 
> >  I have them for 2.6.3-mm3, am re-running 2.6.3-mm1 right now.
> >  Gross results are within 1%, but looking at the detail, i do see badness,
> 
> OK, I'll do the bsearch.  Again.  Could you please tell me what would be an
> appropriate reaim command line with which to reproduce this?

Sure.
We use the 'workfile.new_fserver' for this load; it comes in the kit.
This test does IO, so you'll need to list some directories and a filepool
size in a text file. Ours looks like this:
stp.config
-------------------
FILESIZE 80k
POOLSIZE 1024k
DISKDIR /mnt/disk1
DISKDIR /mnt/disk2
DISKDIR /mnt/disk3
DISKDIR /mnt/disk4
--------------------

If you want the shortest possible run, shrink POOLSIZE and
go with the defaults:
reaim -fworkfile.new_fserver -l./stp.config

That will run 1->5 users, and should be < 1 egg timer in duration
on modern iron. I dunno if that'll be a useful data point.
To replicate at a specific number of users, pick a number
(example: 20) and use this:

reaim -s20 -e20 -fworkfile.new_fserver -l./stp.config

That will do one pass at 20 users. Should be quick, maybe 2-4
egg timers, and will give you one of the data points I reported.
If you have more time, this runs range 20->80 with a 10 user increment:

reaim -s20 -e80 -i10 -t -fworkfile.new_fserver -l./stp.config
( the -t turns off the AIM7 adaptive increment )

The STP runs are very long - many egg timers.
We invoke them like this, where NCPU is the number of CPUs we have.
The 'quick convergence' run (repeats three times):
reaim -s$NCPU -q -t -i$NCPU -fworkfile.new_fserver -r3 -b -l./stp.config

This is the longest, 'maximum' run:
reaim -s$NCPU -s -t -i$NCPU -fworkfile.new_fserver -r3 -b -l./stp.config

If you want those, it's quicker to request them via STP/PLM -
cliffw 
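Putting the pieces together, a minimal driver script along the lines of
the above (the DISKDIR mount points and pool size are Cliff's values;
the reaim invocation itself is left commented since it needs the reaim
kit and its workfile installed):

```shell
# Write the stp.config shown in the message, then do the one-pass
# 20-user run that produces one of the reported data points.
cat > stp.config <<'EOF'
FILESIZE 80k
POOLSIZE 1024k
DISKDIR /mnt/disk1
DISKDIR /mnt/disk2
DISKDIR /mnt/disk3
DISKDIR /mnt/disk4
EOF
# Requires the reaim kit; uncomment on a machine with real mounts:
# reaim -s20 -e20 -fworkfile.new_fserver -l./stp.config
```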






^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: reaim - 2.6.3-mm1 IO performance down.
  2004-02-25  1:03 ` Andrew Morton
  2004-02-25 17:40   ` Cliff White
@ 2004-02-25 23:36   ` Cliff White
  1 sibling, 0 replies; 8+ messages in thread
From: Cliff White @ 2004-02-25 23:36 UTC (permalink / raw)
  To: Andrew Morton; +Cc: cliff white, linux-kernel, cliffw

> cliff white <cliffw@osdl.org> wrote:
> >
> > For the same test on the same machine, results from 2.6.2-rc1-mm2 and 2.6.2-rc3-mm1
> > were within 1.0% of the linux-2.6.2 runs. So this is new.
> >
> > More data and tests if requested - are there some patch sets we should try reverting?
> 
> Thanks.  You could try reverting adaptive-lazy-readahead.patch.  If it is
> not that I'd be suspecting CPU scheduler changes.  Do you have uniprocessor
> test results?

adaptive-lazy-readahead reverted, not really much change here:
Kernel           Users  JPM       Run Time
2.6.3            60     10327.87  34.16 seconds
2.6.3-mm1        60      8279.75  42.61 seconds
2.6.3-mm1-noalr  60      7731.76  45.63 seconds

2.6.3            80     10275.23  45.78 seconds
2.6.3-mm1        80      7841.31  59.99 seconds
2.6.3-mm1-noalr  80      8565.19  54.92 seconds

Full details, reaim tarball:
http://developer.osdl.org/cliffw

cliffw



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2004-02-25 23:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-02-25  2:00 reaim - 2.6.3-mm1 IO performance down Con Kolivas
2004-02-25  3:31 ` Nick Piggin
  -- strict thread matches above, loose matches on Subject: below --
2004-02-25  0:20 cliff white
2004-02-25  1:03 ` Andrew Morton
2004-02-25 17:40   ` Cliff White
2004-02-25 19:16     ` Andrew Morton
2004-02-25 20:52       ` Cliff White
2004-02-25 23:36   ` Cliff White
