* 2.5.8 final -
@ 2002-04-14 21:06 J Sloan
2002-04-15 5:46 ` 2.5.8 final - another data point J Sloan
2002-04-15 14:15 ` 2.5.8 final - Luigi Genoni
0 siblings, 2 replies; 13+ messages in thread
From: J Sloan @ 2002-04-14 21:06 UTC (permalink / raw)
To: linux kernel
Observations -
The UP fix for the setup_per_cpu_areas compile
issue apparently didn't make it into 2.5.8-final,
so we had to apply the patch from 2.5.8-pre3
to get it to compile.
That said, everything works: all services
are running, all devices are working, and XFree86 is happy.
P4-B/1600, genuine Intel mobo running RH 7.2+rawhide
It also passes the q3a test with snappy results
:-)
Joe
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: 2.5.8 final - another data point
2002-04-14 21:06 2.5.8 final - J Sloan
@ 2002-04-15 5:46 ` J Sloan
2002-04-15 6:35 ` J Sloan
2002-04-15 7:18 ` Andrew Morton
2002-04-15 14:15 ` 2.5.8 final - Luigi Genoni
1 sibling, 2 replies; 13+ messages in thread
From: J Sloan @ 2002-04-15 5:46 UTC (permalink / raw)
To: linux kernel; +Cc: J Sloan
J Sloan wrote:
> Observations -
>
> The UP fix for the setup_per_cpu_areas compile
> issue apparently didn't make it into 2.5.8-final,
> so we had to apply the patch from 2.5.8-pre3
> to get it to compile.
>
> That said, everything works: all services
> are running, all devices are working, and XFree86 is happy.
Stop me if you've heard this one before,
but there is one additional observation:
dbench performance has regressed significantly
since 2.5.8-pre1. Performance is equivalent
up to 8 instances, but at 16 and above 2.5.8-final
takes a nosedive: throughput at 128 instances
is approximately 20% of 2.5.8-pre1's, which is
in turn not up to 2.4.xx performance levels.
I realize the BIO layer has been through heavy
surgery and is nowhere near optimized, but this
is just a data point...
hdparm -t shows normal performance levels,
for what it's worth
2.5.8-pre1
--------------
Throughput 151.152 MB/sec (NB=188.94 MB/sec 1511.52 MBit/sec) 1 procs
Throughput 152.177 MB/sec (NB=190.221 MB/sec 1521.77 MBit/sec) 2 procs
Throughput 151.965 MB/sec (NB=189.957 MB/sec 1519.65 MBit/sec) 4 procs
Throughput 151.068 MB/sec (NB=188.835 MB/sec 1510.68 MBit/sec) 8 procs
Throughput 43.0191 MB/sec (NB=53.7738 MB/sec 430.191 MBit/sec) 16 procs
Throughput 9.65171 MB/sec (NB=12.0646 MB/sec 96.5171 MBit/sec) 32 procs
Throughput 37.8267 MB/sec (NB=47.2833 MB/sec 378.267 MBit/sec) 64 procs
Throughput 14.0459 MB/sec (NB=17.5573 MB/sec 140.459 MBit/sec) 80 procs
Throughput 16.2971 MB/sec (NB=20.3714 MB/sec 162.971 MBit/sec) 128 procs
2.5.8-final
---------------
Throughput 152.948 MB/sec (NB=191.185 MB/sec 1529.48 MBit/sec) 1 procs
Throughput 151.597 MB/sec (NB=189.497 MB/sec 1515.97 MBit/sec) 2 procs
Throughput 150.377 MB/sec (NB=187.972 MB/sec 1503.77 MBit/sec) 4 procs
Throughput 150.159 MB/sec (NB=187.698 MB/sec 1501.59 MBit/sec) 8 procs
Throughput 7.25691 MB/sec (NB=9.07113 MB/sec 72.5691 MBit/sec) 16 procs
Throughput 6.36332 MB/sec (NB=7.95415 MB/sec 63.6332 MBit/sec) 32 procs
Throughput 5.55008 MB/sec (NB=6.9376 MB/sec 55.5008 MBit/sec) 64 procs
Throughput 5.82333 MB/sec (NB=7.27916 MB/sec 58.2333 MBit/sec) 80 procs
Throughput 3.40741 MB/sec (NB=4.25926 MB/sec 34.0741 MBit/sec) 128 procs
* Re: 2.5.8 final - another data point
2002-04-15 5:46 ` 2.5.8 final - another data point J Sloan
@ 2002-04-15 6:35 ` J Sloan
2002-04-15 7:27 ` Andrew Morton
2002-04-15 7:18 ` Andrew Morton
1 sibling, 1 reply; 13+ messages in thread
From: J Sloan @ 2002-04-15 6:35 UTC (permalink / raw)
To: J Sloan; +Cc: linux kernel
FWIW -
One other observation was the numerous
syslog entries generated during the test,
which were as follows:
Apr 14 20:40:35 neo kernel: invalidate: busy buffer
Apr 14 20:41:15 neo last message repeated 72 times
Apr 14 20:44:41 neo last message repeated 36 times
Apr 14 20:45:24 neo last message repeated 47 times
J Sloan wrote:
> dbench performance has regressed significantly
> since 2.5.8-pre1;
>
> 2.5.8-pre1
> --------------
> Throughput 37.8267 MB/sec (NB=47.2833 MB/sec 378.267 MBit/sec) 64 procs
> Throughput 14.0459 MB/sec (NB=17.5573 MB/sec 140.459 MBit/sec) 80 procs
> Throughput 16.2971 MB/sec (NB=20.3714 MB/sec 162.971 MBit/sec) 128 procs
>
> 2.5.8-final
> ---------------
> Throughput 5.55008 MB/sec (NB=6.9376 MB/sec 55.5008 MBit/sec) 64 procs
> Throughput 5.82333 MB/sec (NB=7.27916 MB/sec 58.2333 MBit/sec) 80 procs
> Throughput 3.40741 MB/sec (NB=4.25926 MB/sec 34.0741 MBit/sec) 128 procs
>
* Re: 2.5.8 final - another data point
2002-04-15 5:46 ` 2.5.8 final - another data point J Sloan
2002-04-15 6:35 ` J Sloan
@ 2002-04-15 7:18 ` Andrew Morton
2002-04-15 8:14 ` J Sloan
1 sibling, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2002-04-15 7:18 UTC (permalink / raw)
To: J Sloan; +Cc: linux kernel
J Sloan wrote:
>
> ...
> dbench performance has regressed significantly
> since 2.5.8-pre1; the performance is equivalent
> up to 8 instances, but at 16 and above, 2.5.8 final
> takes a nosedive. Performance at 128 instances
> is approximately 20% of the throughput of
> 2.5.8-pre1 - which is in turn not up to 2.4.xx
> performance levels. I realize that the BIO has
> been through heavy surgery, and nowhere near
> optimized, but this is just a data point...
It's not related to BIO. dbench is all about higher-level
memory management, high-level IO scheduling and butterfly
wings.
> ...
> Throughput 151.068 MB/sec (NB=188.835 MB/sec 1510.68 MBit/sec) 8 procs
> Throughput 43.0191 MB/sec (NB=53.7738 MB/sec 430.191 MBit/sec) 16 procs
> Throughput 9.65171 MB/sec (NB=12.0646 MB/sec 96.5171 MBit/sec) 32 procs
> Throughput 37.8267 MB/sec (NB=47.2833 MB/sec 378.267 MBit/sec) 64 procs
Consider that 32 proc line for a while.
>....
> 2.5.8-final
> ---------------
> Throughput 152.948 MB/sec (NB=191.185 MB/sec 1529.48 MBit/sec) 1 procs
> Throughput 151.597 MB/sec (NB=189.497 MB/sec 1515.97 MBit/sec) 2 procs
> Throughput 150.377 MB/sec (NB=187.972 MB/sec 1503.77 MBit/sec) 4 procs
> Throughput 150.159 MB/sec (NB=187.698 MB/sec 1501.59 MBit/sec) 8 procs
> Throughput 7.25691 MB/sec (NB=9.07113 MB/sec 72.5691 MBit/sec) 16 procs
> Throughput 6.36332 MB/sec (NB=7.95415 MB/sec 63.6332 MBit/sec) 32 procs
It's obviously fallen over some cliff. Conceivably the larger readahead
window causes this. How much memory does the machine have? `dbench 64'
on a 512 meg setup certainly causes readahead thrashing. You can
stick a `printk("ouch");' into handle_ra_thrashing() and watch it...
But really, all this stuff is in churn at present. I have patches here
which take `dbench 64' on 512 megs from this:
2.5.8:
Throughput 12.7343 MB/sec (NB=15.9179 MB/sec 127.343 MBit/sec)
to this:
2.5.8-akpm:
Throughput 49.2223 MB/sec (NB=61.5278 MB/sec 492.223 MBit/sec)
This is partly by just throwing more memory at it. The gap
widens on highmem...
And that code isn't tuned yet - I do know that threads are getting
blocked by each other at the inode level. And that ext2 is serialising
itself at the lock_super() level, and that if you fix that,
threads serialise on slab's cache_chain_sem (which is pretty
amazing...).
Patience. 2.5.later-on will perform well. :)
-
* Re: 2.5.8 final - another data point
2002-04-15 6:35 ` J Sloan
@ 2002-04-15 7:27 ` Andrew Morton
2002-04-15 8:02 ` J Sloan
0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2002-04-15 7:27 UTC (permalink / raw)
To: J Sloan; +Cc: linux kernel
J Sloan wrote:
>
> FWIW -
>
> One other observation was the numerous
> syslog entries generated during the test,
> which were as follows:
>
> Apr 14 20:40:35 neo kernel: invalidate: busy buffer
> Apr 14 20:41:15 neo last message repeated 72 times
> Apr 14 20:44:41 neo last message repeated 36 times
> Apr 14 20:45:24 neo last message repeated 47 times
>
If that is happening during the dbench run, then something
is wrong.
What filesystem and I/O drivers are you using? LVM?
RAID?
Please replace that line in fs/buffer.c:invalidate_bdev()
with a BUG() or show_stack(0), and send the ksymoops output.
Thanks.
-
* Re: 2.5.8 final - another data point
2002-04-15 7:27 ` Andrew Morton
@ 2002-04-15 8:02 ` J Sloan
0 siblings, 0 replies; 13+ messages in thread
From: J Sloan @ 2002-04-15 8:02 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux kernel
Andrew Morton wrote:
>J Sloan wrote:
>
>>
>>Apr 14 20:40:35 neo kernel: invalidate: busy buffer
>>
>
>If that is happening during the dbench run, then something
>is wrong.
>
I am reasonably sure that's when it was happening.
>
>
>What filesystem and I/O drivers are you using? LVM?
>RAID?
>
Actually just plain old ext2 on IDE drives -
>
>Please replace that line in fs/buffer.c:invalidate_bdev()
>with a BUG() or show_stack(0), and send the ksymoops output.
>
OK, will do -
Joe
* Re: 2.5.8 final - another data point
2002-04-15 7:18 ` Andrew Morton
@ 2002-04-15 8:14 ` J Sloan
0 siblings, 0 replies; 13+ messages in thread
From: J Sloan @ 2002-04-15 8:14 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux kernel
Andrew Morton wrote:
>It's not related to BIO. dbench is all about higher-level
>memory management, high-level IO scheduling and butterfly
>wings.
>
Yes, no doubt, along with a lot of other deep magic
which is only dimly perceived by the likes
of yours truly...
>>
>>Throughput 150.159 MB/sec (NB=187.698 MB/sec 1501.59 MBit/sec) 8 procs
>>Throughput 7.25691 MB/sec (NB=9.07113 MB/sec 72.5691 MBit/sec) 16 procs
>>Throughput 6.36332 MB/sec (NB=7.95415 MB/sec 63.6332 MBit/sec) 32 procs
>>
>
>It's obviously fallen over some cliff. Conceivably the larger readahead
>window causes this. How much memory does the machine have?
>
The box has 512 MB RAM -
>`dbench 64'
>on a 512 meg setup certainly causes readahead thrashing. You can
>stick a `printk("ouch");' into handle_ra_thrashing() and watch it...
>
hmm - OK, will try that -
Just for giggles, same machine with 2.4.19-pre4-ac4 -
Throughput 150.979 MB/sec (NB=188.723 MB/sec 1509.79 MBit/sec) 1 procs
Throughput 150.796 MB/sec (NB=188.496 MB/sec 1507.96 MBit/sec) 2 procs
Throughput 151.185 MB/sec (NB=188.982 MB/sec 1511.85 MBit/sec) 4 procs
Throughput 141.255 MB/sec (NB=176.568 MB/sec 1412.55 MBit/sec) 8 procs
Throughput 105.066 MB/sec (NB=131.332 MB/sec 1050.66 MBit/sec) 16 procs
Throughput 69.3542 MB/sec (NB=86.6928 MB/sec 693.542 MBit/sec) 32 procs
Throughput 32.4904 MB/sec (NB=40.613 MB/sec 324.904 MBit/sec) 64 procs
Throughput 30.4824 MB/sec (NB=38.103 MB/sec 304.824 MBit/sec) 80 procs
Throughput 19.0265 MB/sec (NB=23.7832 MB/sec 190.265 MBit/sec) 128 procs
>
>
>Patience. 2.5.later-on will perform well. :)
>
Oh, yes -
It's already quite usable for some workloads, and the
latency for workstation use is quite good - I am looking
forward to the maturation of this diamond in the rough
:-)
Joe
* Re: 2.5.8 final -
2002-04-14 21:06 2.5.8 final - J Sloan
2002-04-15 5:46 ` 2.5.8 final - another data point J Sloan
@ 2002-04-15 14:15 ` Luigi Genoni
2002-04-15 14:55 ` David S. Miller
1 sibling, 1 reply; 13+ messages in thread
From: Luigi Genoni @ 2002-04-15 14:15 UTC (permalink / raw)
To: J Sloan; +Cc: linux kernel
Oh well, on sparc64 setup_per_cpu_areas() simply is
not declared, since sparc64 does not use GENERIC_PER_CPU.
Then asm/cacheflush.h, required by linux/highmem.h,
does not exist.
And then PREEMPT_ACTIVE is not defined...
It seems I cannot test this release on sparc64 either, sigh!
On Sun, 14 Apr 2002, J Sloan wrote:
> Observations -
>
> The up-fix for the setup_per_cpu_areas compile
> issue apparently didn't make it into 2.5.8-final,
> so we had to apply the patch from 2.5.8-pre3
> to get it to compile.
>
> That said, however, everything works, all services
> are running, all devices working, Xfree is happy.
>
> P4-B/1600, genuine intel mobo running RH 7.2+rawhide
>
> It also passes the q3a test with snappy results
>
> :-)
>
> Joe
* Re: 2.5.8 final -
2002-04-15 14:15 ` 2.5.8 final - Luigi Genoni
@ 2002-04-15 14:55 ` David S. Miller
0 siblings, 0 replies; 13+ messages in thread
From: David S. Miller @ 2002-04-15 14:55 UTC (permalink / raw)
To: kernel; +Cc: joe, linux-kernel
From: Luigi Genoni <kernel@Expansa.sns.it>
Date: Mon, 15 Apr 2002 16:15:04 +0200 (CEST)
Oh well, on sparc64 setup_per_cpu_areas() simply is
not declared, since sparc64 does not use GENERIC_PER_CPU.
Then asm/cacheflush.h, required by linux/highmem.h,
does not exist.
And then PREEMPT_ACTIVE is not defined...
It seems I cannot test this release on sparc64 either, sigh!
I just haven't pushed my tree yet, it will be fixed soon.
I've been busy with other things this weekend...
* Re: 2.5.8 final - another data point
@ 2002-04-16 12:42 rwhron
2002-04-16 18:31 ` J Sloan
0 siblings, 1 reply; 13+ messages in thread
From: rwhron @ 2002-04-16 12:42 UTC (permalink / raw)
To: joe, akpm; +Cc: linux-kernel
>>Patience. 2.5.later-on will perform well. :)
> It's already quite usable for some workloads, and the
> latency for workstation use is quite good - I am looking
> forward to the maturation of this diamond in the rough
I noticed a dbench regression in 2.5.8 too. The light at
the end of the tunnel looks bright and close though. :)
(reference to near-death experience - not a train. :)
Running dbench 128 on ext2 mounted with delalloc and Andrew's
patches from http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8/
was 7.5x faster than 2.5.8 vanilla and 1.5x faster than
2.4.19pre6aa1.
It will be fun to see what the other I/O benchmarks
and OSDB do with Andrew's delalloc patches.
--
Randy Hron
* Re: 2.5.8 final - another data point
2002-04-16 12:42 2.5.8 final - another data point rwhron
@ 2002-04-16 18:31 ` J Sloan
0 siblings, 0 replies; 13+ messages in thread
From: J Sloan @ 2002-04-16 18:31 UTC (permalink / raw)
To: rwhron; +Cc: Linux kernel
rwhron@earthlink.net wrote:
>Running dbench 128 on ext2 mounted with delalloc and Andrew's
>patches from http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8/
>was 7.5x faster than 2.5.8 vanilla and 1.5x faster than
>2.4.19pre6aa1.
>
Wow, good stuff - I'll have to pull those down
and give them a go - looks like that light is closer
than I thought :-)
Once tux is ported to 2.5 I'll really be a happy
camper.
Joe
* Re: 2.5.8 final - another data point
@ 2002-04-16 21:48 rwhron
2002-04-16 22:02 ` Andrew Morton
0 siblings, 1 reply; 13+ messages in thread
From: rwhron @ 2002-04-16 21:48 UTC (permalink / raw)
To: jjs, akpm; +Cc: linux-kernel
>>Running dbench 128 on ext2 mounted with delalloc and Andrew's
>>patches from http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8/
>>was 7.5x faster than 2.5.8 vanilla and 1.5x faster than
> Wow, good stuff - I'll have to pull those down
Hmm, I had to run e2fsck -f twice on the filesystem that ran
dbench, tiobench, bonnie++ on nfs, and osdb. The filesystem
was showing 52% used and is normally 1% used before/after
testing. No big files on the fs. The directory where
bonnie++ on nfs runs had some temporary directories that
were not deletable. A bunch of files/directories were in
lost+found after e2fsck. After removing the files, the
fs was back to 1% used.
I backed up and did mke2fs in case there was any
pre-existing/lingering corruption. So keep your karma
up and test on a test box. :)
--
Randy Hron
* Re: 2.5.8 final - another data point
2002-04-16 21:48 rwhron
@ 2002-04-16 22:02 ` Andrew Morton
0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2002-04-16 22:02 UTC (permalink / raw)
To: rwhron; +Cc: jjs, linux-kernel
rwhron@earthlink.net wrote:
>
> >>Running dbench 128 on ext2 mounted with delalloc and Andrew's
> >>patches from http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.8/
> >>was 7.5x faster than 2.5.8 vanilla and 1.5x faster than
>
> > Wow, good stuff - I'll have to pull those down
>
> Hmm, I had to run e2fsck -f twice on the filesystem that ran
> dbench, tiobench, bonnie++ on nfs, and osdb. The filesystem
> was showing 52% used and is normally 1% used before/after
> testing. No big files on the fs. The directory where
> bonnie++ on nfs runs had some temporary directories that
> were not deletable. A bunch of files/directories were in
> lost+found after e2fsck. After removing the files, the
> fs was back to 1% used.
>
ho-hum. Presumably an unreservepage() got lost somewhere
in the diff shuffling.
All I'm doing with the delayed-allocation code at present
is keeping the diffs up to date. I haven't even compiled
that stuff for over a week. All work at present is against
dallocbase-70-writeback. It's probably not a good use of
your time to test anything beyond that. Sorry about that.
I'll leave the later diffs available so anyone who's interested
can see the multipage BIO assembly stuff, but "don't use".
-