public inbox for linux-kernel@vger.kernel.org
* Re: 2.5.8 final - another data point
  2002-04-14 21:06 2.5.8 final - J Sloan
@ 2002-04-15  5:46 ` J Sloan
  2002-04-15  6:35   ` J Sloan
  2002-04-15  7:18   ` Andrew Morton
  0 siblings, 2 replies; 10+ messages in thread
From: J Sloan @ 2002-04-15  5:46 UTC (permalink / raw)
  To: linux kernel; +Cc: J Sloan

J Sloan wrote:

> Observations -
>
> The UP fix for the setup_per_cpu_areas compile
> issue apparently didn't make it into 2.5.8-final,
> so we had to apply the patch from 2.5.8-pre3
> to get it to compile.
>
> That said, however, everything works, all services
> are running, all devices working, Xfree is happy.

Stop me if you've heard this one before -

But there is one additional observation:

dbench performance has regressed significantly
since 2.5.8-pre1; performance is equivalent
up to 8 instances, but at 16 and above, 2.5.8-final
takes a nosedive. Performance at 128 instances
is approximately 20% of the throughput of
2.5.8-pre1 - which is in turn not up to 2.4.xx
performance levels. I realize that the BIO code has
been through heavy surgery and is nowhere near
optimized, but this is just a data point...

hdparm -t shows normal performance levels,
for what it's worth.

2.5.8-pre1
--------------
Throughput 151.152 MB/sec (NB=188.94 MB/sec  1511.52 MBit/sec)  1 procs
Throughput 152.177 MB/sec (NB=190.221 MB/sec  1521.77 MBit/sec)  2 procs
Throughput 151.965 MB/sec (NB=189.957 MB/sec  1519.65 MBit/sec)  4 procs
Throughput 151.068 MB/sec (NB=188.835 MB/sec  1510.68 MBit/sec)  8 procs
Throughput 43.0191 MB/sec (NB=53.7738 MB/sec  430.191 MBit/sec)  16 procs
Throughput 9.65171 MB/sec (NB=12.0646 MB/sec  96.5171 MBit/sec)  32 procs
Throughput 37.8267 MB/sec (NB=47.2833 MB/sec  378.267 MBit/sec)  64 procs
Throughput 14.0459 MB/sec (NB=17.5573 MB/sec  140.459 MBit/sec)  80 procs
Throughput 16.2971 MB/sec (NB=20.3714 MB/sec  162.971 MBit/sec)  128 procs

2.5.8-final
---------------
Throughput 152.948 MB/sec (NB=191.185 MB/sec  1529.48 MBit/sec)  1 procs
Throughput 151.597 MB/sec (NB=189.497 MB/sec  1515.97 MBit/sec)  2 procs
Throughput 150.377 MB/sec (NB=187.972 MB/sec  1503.77 MBit/sec)  4 procs
Throughput 150.159 MB/sec (NB=187.698 MB/sec  1501.59 MBit/sec)  8 procs
Throughput 7.25691 MB/sec (NB=9.07113 MB/sec  72.5691 MBit/sec)  16 procs
Throughput 6.36332 MB/sec (NB=7.95415 MB/sec  63.6332 MBit/sec)  32 procs
Throughput 5.55008 MB/sec (NB=6.9376 MB/sec  55.5008 MBit/sec)  64 procs
Throughput 5.82333 MB/sec (NB=7.27916 MB/sec  58.2333 MBit/sec)  80 procs
Throughput 3.40741 MB/sec (NB=4.25926 MB/sec  34.0741 MBit/sec)  128 procs
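
(Checking that 20% figure against the tables: 3.40741 / 16.2971 ~= 0.21,
so at 128 instances 2.5.8-final indeed delivers roughly a fifth of the
2.5.8-pre1 throughput.)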


* Re: 2.5.8 final - another data point
  2002-04-15  5:46 ` 2.5.8 final - another data point J Sloan
@ 2002-04-15  6:35   ` J Sloan
  2002-04-15  7:27     ` Andrew Morton
  2002-04-15  7:18   ` Andrew Morton
  1 sibling, 1 reply; 10+ messages in thread
From: J Sloan @ 2002-04-15  6:35 UTC (permalink / raw)
  To: J Sloan; +Cc: linux kernel

FWIW -

One other observation was the numerous
syslog entries generated during the test,
which were as follows:


Apr 14 20:40:35 neo kernel: invalidate: busy buffer
Apr 14 20:41:15 neo last message repeated 72 times
Apr 14 20:44:41 neo last message repeated 36 times
Apr 14 20:45:24 neo last message repeated 47 times


J Sloan wrote:

> dbench performance has regressed significantly
> since 2.5.8-pre1; 

>
> 2.5.8-pre1
> --------------
> Throughput 37.8267 MB/sec (NB=47.2833 MB/sec  378.267 MBit/sec)  64 procs
> Throughput 14.0459 MB/sec (NB=17.5573 MB/sec  140.459 MBit/sec)  80 procs
> Throughput 16.2971 MB/sec (NB=20.3714 MB/sec  162.971 MBit/sec)  128 procs
>
> 2.5.8-final
> ---------------
> Throughput 5.55008 MB/sec (NB=6.9376 MB/sec  55.5008 MBit/sec)  64 procs
> Throughput 5.82333 MB/sec (NB=7.27916 MB/sec  58.2333 MBit/sec)  80 procs
> Throughput 3.40741 MB/sec (NB=4.25926 MB/sec  34.0741 MBit/sec)  128 procs
>


* Re: 2.5.8 final - another data point
  2002-04-15  5:46 ` 2.5.8 final - another data point J Sloan
  2002-04-15  6:35   ` J Sloan
@ 2002-04-15  7:18   ` Andrew Morton
  2002-04-15  8:14     ` J Sloan
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2002-04-15  7:18 UTC (permalink / raw)
  To: J Sloan; +Cc: linux kernel

J Sloan wrote:
> 
> ...
> dbench performance has regressed significantly
> since 2.5.8-pre1; performance is equivalent
> up to 8 instances, but at 16 and above, 2.5.8-final
> takes a nosedive. Performance at 128 instances
> is approximately 20% of the throughput of
> 2.5.8-pre1 - which is in turn not up to 2.4.xx
> performance levels. I realize that the BIO code has
> been through heavy surgery and is nowhere near
> optimized, but this is just a data point...

It's not related to BIO.  dbench is all about higher-level
memory management, high-level IO scheduling and butterfly
wings.
 
> ...
> Throughput 151.068 MB/sec (NB=188.835 MB/sec  1510.68 MBit/sec)  8 procs
> Throughput 43.0191 MB/sec (NB=53.7738 MB/sec  430.191 MBit/sec)  16 procs
> Throughput 9.65171 MB/sec (NB=12.0646 MB/sec  96.5171 MBit/sec)  32 procs
> Throughput 37.8267 MB/sec (NB=47.2833 MB/sec  378.267 MBit/sec)  64 procs

Consider that 32 proc line for a while.

>....
> 2.5.8-final
> ---------------
> Throughput 152.948 MB/sec (NB=191.185 MB/sec  1529.48 MBit/sec)  1 procs
> Throughput 151.597 MB/sec (NB=189.497 MB/sec  1515.97 MBit/sec)  2 procs
> Throughput 150.377 MB/sec (NB=187.972 MB/sec  1503.77 MBit/sec)  4 procs
> Throughput 150.159 MB/sec (NB=187.698 MB/sec  1501.59 MBit/sec)  8 procs
> Throughput 7.25691 MB/sec (NB=9.07113 MB/sec  72.5691 MBit/sec)  16 procs
> Throughput 6.36332 MB/sec (NB=7.95415 MB/sec  63.6332 MBit/sec)  32 procs

It's obviously fallen over some cliff.  Conceivably the larger readahead
window causes this.  How much memory does the machine have? `dbench 64'
on a 512 meg setup certainly causes readahead thrashing.  You can
stick a `printk("ouch");' into handle_ra_thrashing() and watch it...
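
For reference, a minimal sketch of that one-liner - the file,
signature and surroundings of handle_ra_thrashing() here are
assumptions from memory, not verbatim 2.5.8 source:

	/* sketch only - instrument the readahead-thrash path */
	static void handle_ra_thrashing(struct file *file)
	{
		printk("ouch");		/* logged once per thrash event */
		/* ... existing window-shrinking logic ... */
	}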


But really, all this stuff is in churn at present. I have patches here
which take `dbench 64' on 512 megs from this:


2.5.8:
Throughput 12.7343 MB/sec (NB=15.9179 MB/sec  127.343 MBit/sec)

to this:

2.5.8-akpm:
Throughput 49.2223 MB/sec (NB=61.5278 MB/sec  492.223 MBit/sec)

This is partly by just throwing more memory at it.  The gap
widens on highmem...

And that code isn't tuned yet - I do know that threads are getting
blocked by each other at the inode level.  And that ext2 is serialising
itself at the lock_super() level, and that if you fix that,
threads serialise on slab's cache_chain_sem (which is pretty
amazing...).
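
To make the ext2 point concrete, a from-memory sketch of the shape of
that serialisation - the function name, signature and file are
approximations, not verbatim 2.5.8 source:

	/* fs/ext2/balloc.c, approximately: every block allocation
	 * takes the per-filesystem superblock lock, so N dbench
	 * writers on one fs queue up behind a single semaphore */
	int ext2_new_block(struct inode *inode, unsigned long goal, int *err)
	{
		struct super_block *sb = inode->i_sb;
		int block;

		lock_super(sb);
		block = 0;	/* ... scan bitmaps, claim a free block ... */
		unlock_super(sb);
		return block;
	}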

Patience.  2.5.later-on will perform well.  :)

-


* Re: 2.5.8 final - another data point
  2002-04-15  6:35   ` J Sloan
@ 2002-04-15  7:27     ` Andrew Morton
  2002-04-15  8:02       ` J Sloan
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2002-04-15  7:27 UTC (permalink / raw)
  To: J Sloan; +Cc: linux kernel

J Sloan wrote:
> 
> FWIW -
> 
> One other observation was the numerous
> syslog entries generated during the test,
> which were as follows:
> 
> Apr 14 20:40:35 neo kernel: invalidate: busy buffer
> Apr 14 20:41:15 neo last message repeated 72 times
> Apr 14 20:44:41 neo last message repeated 36 times
> Apr 14 20:45:24 neo last message repeated 47 times
> 

If that is happening during the dbench run, then something
is wrong.

What filesystem and I/O drivers are you using?  LVM?
RAID?

Please replace that line in fs/buffer.c:invalidate_bdev()
with a BUG() or show_stack(0), and send the ksymoops output.
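
Something like this, at the spot that currently does the printk - the
surrounding test is reconstructed from memory and may not match the
2.5.8 source exactly:

	/* fs/buffer.c:invalidate_bdev() - sketch only */
	if (atomic_read(&bh->b_count)) {
		show_stack(0);		/* or: BUG(); for a full oops */
		/* was: printk("invalidate: busy buffer\n"); */
	}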

Thanks.

-


* Re: 2.5.8 final - another data point
  2002-04-15  7:27     ` Andrew Morton
@ 2002-04-15  8:02       ` J Sloan
  0 siblings, 0 replies; 10+ messages in thread
From: J Sloan @ 2002-04-15  8:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux kernel

Andrew Morton wrote:

>J Sloan wrote:
>
>>
>>Apr 14 20:40:35 neo kernel: invalidate: busy buffer
>>
>
>If that is happening during the dbench run, then something
>is wrong.
>
I am reasonably sure that's when it was happening.

>
>
>What filesystem and I/O drivers are you using?  LVM?
>RAID?
>
Actually just plain old ext2 on IDE drives -

>
>Please replace that line in fs/buffer.c:invalidate_bdev()
>with a BUG() or show_stack(0), and send the ksymoops output.
>
OK, will do -

Joe


* Re: 2.5.8 final - another data point
  2002-04-15  7:18   ` Andrew Morton
@ 2002-04-15  8:14     ` J Sloan
  0 siblings, 0 replies; 10+ messages in thread
From: J Sloan @ 2002-04-15  8:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux kernel

Andrew Morton wrote:

>It's not related to BIO.  dbench is all about higher-level
>memory management, high-level IO scheduling and butterfly
>wings.
>
Yes, no doubt, and a lot of other deep magic
which is only dimly perceived by the likes
of yours truly...

>>
>>Throughput 150.159 MB/sec (NB=187.698 MB/sec  1501.59 MBit/sec)  8 procs
>>Throughput 7.25691 MB/sec (NB=9.07113 MB/sec  72.5691 MBit/sec)  16 procs
>>Throughput 6.36332 MB/sec (NB=7.95415 MB/sec  63.6332 MBit/sec)  32 procs
>>
>
>It's obviously fallen over some cliff.  Conceivably the larger readahead
>window causes this.  How much memory does the machine have? 
>
The box has 512 MB RAM -

>`dbench 64'
>on a 512 meg setup certainly causes readahead thrashing.  You can
>stick a `printk("ouch");' into handle_ra_thrashing() and watch it...
>
hmm - OK, will try that -

Just for giggles, same machine with 2.4.19-pre4-ac4 -

Throughput 150.979 MB/sec (NB=188.723 MB/sec  1509.79 MBit/sec)  1 procs
Throughput 150.796 MB/sec (NB=188.496 MB/sec  1507.96 MBit/sec)  2 procs
Throughput 151.185 MB/sec (NB=188.982 MB/sec  1511.85 MBit/sec)  4 procs
Throughput 141.255 MB/sec (NB=176.568 MB/sec  1412.55 MBit/sec)  8 procs
Throughput 105.066 MB/sec (NB=131.332 MB/sec  1050.66 MBit/sec)  16 procs
Throughput 69.3542 MB/sec (NB=86.6928 MB/sec  693.542 MBit/sec)  32 procs
Throughput 32.4904 MB/sec (NB=40.613 MB/sec  324.904 MBit/sec)  64 procs
Throughput 30.4824 MB/sec (NB=38.103 MB/sec  304.824 MBit/sec)  80 procs
Throughput 19.0265 MB/sec (NB=23.7832 MB/sec  190.265 MBit/sec)  128 procs

>
>
>Patience.  2.5.later-on will perform well.  :)
>
Oh, yes -

It's already quite usable for some workloads, and the
latency for workstation use is quite good - I am looking
forward to the maturation of this diamond in the rough.

:-)

Joe


* Re: 2.5.8 final - another data point
@ 2002-04-16 12:42 rwhron
  2002-04-16 18:31 ` J Sloan
  0 siblings, 1 reply; 10+ messages in thread
From: rwhron @ 2002-04-16 12:42 UTC (permalink / raw)
  To: joe, akpm; +Cc: linux-kernel

>>Patience.  2.5.later-on will perform well.  :)

> It's already quite usable for some workloads, and the
> latency for workstation use is quite good - I am looking
> forward to the maturation of this diamond in the rough.

I noticed a dbench regression in 2.5.8 too.  The light at 
the end of the tunnel looks bright and close though. :)
(reference to near-death experience - not a train. :)

Running dbench 128 on ext2 mounted with delalloc and Andrew's
patches from http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8/
was 7.5x faster than 2.5.8 vanilla and 1.5x faster than 
2.4.19pre6aa1.

It will be fun to see what the other I/O benchmarks
and OSDB do with Andrew's delalloc patches.

-- 
Randy Hron



* Re: 2.5.8 final - another data point
  2002-04-16 12:42 2.5.8 final - another data point rwhron
@ 2002-04-16 18:31 ` J Sloan
  0 siblings, 0 replies; 10+ messages in thread
From: J Sloan @ 2002-04-16 18:31 UTC (permalink / raw)
  To: rwhron; +Cc: Linux kernel

rwhron@earthlink.net wrote:

>Running dbench 128 on ext2 mounted with delalloc and Andrew's
>patches from http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8/
>was 7.5x faster than 2.5.8 vanilla and 1.5x faster than 
>2.4.19pre6aa1.
>
Wow, good stuff - I'll have to pull those down
and give them a go - looks like that light is closer
than I thought :-)

Once TUX is ported to 2.5 I'll really be a happy
camper.

Joe



* Re: 2.5.8 final - another data point
@ 2002-04-16 21:48 rwhron
  2002-04-16 22:02 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: rwhron @ 2002-04-16 21:48 UTC (permalink / raw)
  To: jjs, akpm; +Cc: linux-kernel

>>Running dbench 128 on ext2 mounted with delalloc and Andrew's
>>patches from http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8/
>>was 7.5x faster than 2.5.8 vanilla and 1.5x faster than

> Wow, good stuff - I'll have to pull those down

Hmm, I had to run e2fsck -f twice on the filesystem that ran
dbench, tiobench, bonnie++ over NFS, and OSDB.  The filesystem
was showing 52% used, where it is normally 1% used before/after
testing.  There were no big files on the fs.  The directory where
bonnie++ over NFS runs had some temporary directories that
were not deletable.  A bunch of files/directories were in
lost+found after e2fsck.  After removing the files, the
fs was back to 1% used.

I backed up and did mke2fs in case there was any
pre-existing/lingering corruption.  So keep your karma
up and test on a test box. :)

-- 
Randy Hron



* Re: 2.5.8 final - another data point
  2002-04-16 21:48 rwhron
@ 2002-04-16 22:02 ` Andrew Morton
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2002-04-16 22:02 UTC (permalink / raw)
  To: rwhron; +Cc: jjs, linux-kernel

rwhron@earthlink.net wrote:
> 
> >>Running dbench 128 on ext2 mounted with delalloc and Andrew's
> >>patches from http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8/
> >>was 7.5x faster than 2.5.8 vanilla and 1.5x faster than
> 
> > Wow, good stuff - I'll have to pull those down
> 
> Hmm, I had to run e2fsck -f twice on the filesystem that ran
> dbench, tiobench, bonnie++ over NFS, and OSDB.  The filesystem
> was showing 52% used, where it is normally 1% used before/after
> testing.  There were no big files on the fs.  The directory where
> bonnie++ over NFS runs had some temporary directories that
> were not deletable.  A bunch of files/directories were in
> lost+found after e2fsck.  After removing the files, the
> fs was back to 1% used.
> 

Ho-hum.  Presumably an unreservepage() got lost somewhere
in the diff shuffling.

All I'm doing with the delayed-allocation code at present
is keeping the diffs up to date.  I haven't even compiled
that stuff for over a week.  All work at present is against
dallocbase-70-writeback.   It's probably not a good use of
your time to test anything beyond that.  Sorry about that.

I'll leave the later diffs available so anyone who's interested
can see the multipage bio assembly stuff, but "don't use".

-

