* Minutes from 10/1 LSE Call
@ 2003-10-01 19:19 Hanna Linder
2003-10-01 23:29 ` Andrew Morton
0 siblings, 1 reply; 13+ messages in thread
From: Hanna Linder @ 2003-10-01 19:19 UTC (permalink / raw)
To: lse-tech, linux-kernel
LSE Call Minutes from 10/1
I. Sylvain Jeaugey and Simon Derr: CPUSETS, Controlling CPU placement.
http://marc.theaimsgroup.com/?l=lse-tech&m=106441942222186&w=2
Processes attach to cpusets. Made changes to system calls.
New file /proc/cpusets shows which cpus are in the sets; also
/proc/pid/ info. Only cpus in the current cpuset are available to
the processes attached to the set. Hooks into fork and other system calls.
There are man pages on the web site for all of these interfaces.
www.opensource.org/cpuset
----
Paul Jackson - Q about restriction to certain cpus, i.e. cpus 3 & 5 in the
set. Does the process see the real numbers or are they renumbered?
A - They are not renumbered in the /proc info, but for the sched_setaffinity
call they are renumbered starting at 0.
Q - So an application could become confused?
A - No, the application doesn't need to know those details.
Paul - Sometimes applications do want to know which cpu they are on.
----
Hubertus Franke - Do you do any optimization based on cpu usage? Because it
is a virtualization, they have the power to assign any cpu at will.
A - Right now we just have a basic view of the machine. There is some
awareness of NUMA.
Paul J- I would guess we would put the optimization at a higher layer
than the virtualization layer.
Hubertus- I think a good idea would be to not have a set mapping but
let the kernel pick the best mapping based on load or some intelligent
entity. That could be an interesting extension to this.
---
Mike Raymond - Can threads within the set leave?
A - No.
Q - Can threads outside the cpuset join?
A - No, unless their cpusets overlap. Every thread is in a cpuset;
no thread can leave its cpuset.
Q - Is there some option, i.e. strict, that tells whether or not some
other thread can steal resources?
A - Only threads within the cpuset can use the processor.
The parent is always a superset of the child.
Q - I can imagine we would want the cpus for certain daemons
to be spun up very early.
It is more difficult to move memory than to reschedule a task.
But that is a bigger topic, not for this call.
====================
II. Mark Gross: Real-Time applications needs when system is stressed.
http://marc.theaimsgroup.com/?l=lse-tech&m=106494685313136&w=2
Looking at the performance robustness of Linux as the workload
gets heavy. Most of this work is being motivated by telco software
vendors as they try to use Linux. It works OK until the workload
gets large. At least that is what we keep hearing.
We are trying to isolate the bottlenecks
and create some microbenchmarks that expose them.
We are also looking into developing tools to identify these
bottlenecks. I believe they are coming from semaphore locks
in the kernel. Would like a tool, such as lockmeter, that
could work with semaphore locks.
We got about 100 MB/sec using the Bonnie benchmark for block io writes;
for writes we hit a ceiling around 100-120 MB/sec. It stopped scaling
after about 3 spindles.
Tried to focus on the block io part of it. Have not tried
direct or raw io yet. With block io we got about 133 MB/sec
doing a simple dd to /dev/null from multiple spindles. This
was on 2.6-test3.
Jesse Barnes recommends trying direct or raw io. They saw some
higher numbers. Raw and O_DIRECT avoid the cpu more than block io does.
Block io goes through the buffer cache instead of the page cache,
which only comes out of lowmem, which really can restrict you.
Badari Pulavarty said that is by design and there is no work
going on to change that to the page cache. It is buffer IO; it is
supposed to go through the cache now. There was some talk but
nobody cared enough to change it. No one is saying to go through
block io right now anyway.
Q - Is using direct IO an option for POSIX apps?
A - Not right now. There are a lot of apps.
It is great that direct io kicks ass but we should make sure
file io kicks butt too.
O_DIRECT on large block sizes has low cpu utilization (3-5%).
However with buffered IO we can easily get to 100% cpu utilization.
If you look at the profile most of that is in the copy_to_user function.
Dave Howell - So these apps don't run very well on Linux right
now?
A - Yes.
Q- Did you see any spinlock contention with lockmeter?
A - Not really, there aren't many spinlocks with this app.
Mark Gross Q- Is there a Kernprof for 2.6? Andrew Morton
has lockmeter in his tree but not kernprof.
John Hawkes A- No. Will require a different approach in 2.6 than 2.4.
Could use some help on this.
Badari - Going back to the 100% utilization issue with block io:
that is probably caused by the lowmem problem. It would be good
to change the app to not use block io.
A - It is a whole suite of telco apps that use POSIX, so that is a
difficult option.
=======================
III. Steve Pratt: IO performance in mm tree vs mainline.
http://marc.theaimsgroup.com/?l=lse-tech&m=106495069918264&w=2
Got some interesting results comparing IO throughput on the mainline
tree vs the mm tree doing random and sequential reads with different
block sizes.
With large block sizes and random reads, the aio readahead speedup patch
was giving significant throughput improvements. However, with sequential
reads we saw a 10% overhead in cpu utilization with no throughput increase.
So there are some drawbacks to the readahead patch if doing both random and
sequential reads.
In mainline, once block size is over 32k our throughput actually drops off.
It levels off around 128k but at a greater cpu utilization.
Don't really understand why that is.
The mm tree maintains throughput for all block sizes, but cpu utilization
keeps going up to do the same throughput. Readprofile shows the extra time
is being spent in copy_to_user (in the mm tree). Backing out the readahead
patch reduces cpu by 10% for all block sizes but still shows the spike.
So that isn't the main problem.
Badari - Are you using the serveraid or ips driver?
A - Yes. We are planning on moving to the qlogic driver but need to get
different disks and get it all set up. But I wouldn't expect that to
kick in until 128k. So the fact it starts at 32k makes me think that
isn't the issue. So 64k is less than the max throughput.
Q- What tool are you using for these measurements?
A - It is an internal tool called Raw Read that has been open sourced
and is available on IBM's open source site, under developerWorks
under Linux Perf. Will get the latest version out soon.
Mark Gross - We saw a similar performance degradation as we scaled
by spindles.
Steve - I can add spindles no problem, we actually have 80 spindles
in a raid array that looks like 20 spindles.
---
minutes compiled by Hanna Linder 10/01/03
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Minutes from 10/1 LSE Call
2003-10-01 19:19 Minutes from 10/1 LSE Call Hanna Linder
@ 2003-10-01 23:29 ` Andrew Morton
2003-10-01 23:38 ` Larry McVoy
2003-10-02 19:21 ` [Lse-tech] " Steven Pratt
0 siblings, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2003-10-01 23:29 UTC (permalink / raw)
To: Hanna Linder; +Cc: lse-tech, linux-kernel
Hanna Linder <hannal@us.ibm.com> wrote:
>
> In mainline, once block size is over 32k our throuput actually drops off.
> It levels off around 128k but at a greater cpu utilization.
> Dont really understand why that is.
Probably thrashing the CPU's L1 cache.
> In mm tree, maintains throughput for all block size but the cpu utilzation
> keeps going up to do the same throughput. Readprofile shows the extra time
> is being spent in copy_to_user (in mm tree). Backing out readahead patch
> reduces cpu by 10% for all block sizes but still shows the spike. So that
> isnt the main problem.
If you have a loop like:
char *buf;
for (lots) {
read(fd, buf, size);
}
the optimum value of `size' is small: as little as 8k. Once `size' gets
close to half the size of the L1 cache you end up pushing the memory at
`buf' out of CPU cache all the time.
* Re: Minutes from 10/1 LSE Call
2003-10-01 23:29 ` Andrew Morton
@ 2003-10-01 23:38 ` Larry McVoy
2003-10-02 0:23 ` Jeff Garzik
2003-10-02 19:21 ` [Lse-tech] " Steven Pratt
1 sibling, 1 reply; 13+ messages in thread
From: Larry McVoy @ 2003-10-01 23:38 UTC (permalink / raw)
To: Andrew Morton; +Cc: Hanna Linder, lse-tech, linux-kernel
On Wed, Oct 01, 2003 at 04:29:16PM -0700, Andrew Morton wrote:
> If you have a loop like:
>
> char *buf;
>
> for (lots) {
> read(fd, buf, size);
> }
>
> the optimum value of `size' is small: as little as 8k. Once `size' gets
> close to half the size of the L1 cache you end up pushing the memory at
> `buf' out of CPU cache all the time.
I've seen this too, not that Andrew needs me to back him up, but in many
cases even 4k is big enough. Linux has a very thin system call layer so
it is OK, good even, to use reasonable buffer sizes.
--
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: Minutes from 10/1 LSE Call
2003-10-01 23:38 ` Larry McVoy
@ 2003-10-02 0:23 ` Jeff Garzik
2003-10-02 18:56 ` insecure
0 siblings, 1 reply; 13+ messages in thread
From: Jeff Garzik @ 2003-10-02 0:23 UTC (permalink / raw)
To: Larry McVoy; +Cc: Andrew Morton, Hanna Linder, lse-tech, linux-kernel
Larry McVoy wrote:
> On Wed, Oct 01, 2003 at 04:29:16PM -0700, Andrew Morton wrote:
>
>>If you have a loop like:
>>
>> char *buf;
>>
>> for (lots) {
>> read(fd, buf, size);
>> }
>>
>>the optimum value of `size' is small: as little as 8k. Once `size' gets
>>close to half the size of the L1 cache you end up pushing the memory at
>>`buf' out of CPU cache all the time.
>
>
> I've seen this too, not that Andrew needs me to back him up, but in many
> cases even 4k is big enough. Linux has a very thin system call layer so
> it is OK, good even, to use reasonable buffer sizes.
Slight tangent, FWIW... Back when I was working on my "race-free
userland" project, I noticed that the fastest cp(1) implementation was
GNU's: read/write from a single, statically allocated, page-aligned 4K
buffer. I experimented with various buffer sizes, mmap-based copies,
and even with sendfile(2) where both arguments were files.
read(2)/write(2) of a single 4K buffer was always the fastest.
Jeff
* Re: Minutes from 10/1 LSE Call
2003-10-02 0:23 ` Jeff Garzik
@ 2003-10-02 18:56 ` insecure
2003-10-02 19:10 ` Jeff Garzik
0 siblings, 1 reply; 13+ messages in thread
From: insecure @ 2003-10-02 18:56 UTC (permalink / raw)
To: Jeff Garzik, Larry McVoy
Cc: Andrew Morton, Hanna Linder, lse-tech, linux-kernel
On Thursday 02 October 2003 03:23, Jeff Garzik wrote:
> Larry McVoy wrote:
> > On Wed, Oct 01, 2003 at 04:29:16PM -0700, Andrew Morton wrote:
> >>If you have a loop like:
> >>
> >> char *buf;
> >>
> >> for (lots) {
> >> read(fd, buf, size);
> >> }
> >>
> >>the optimum value of `size' is small: as little as 8k. Once `size' gets
> >>close to half the size of the L1 cache you end up pushing the memory at
> >>`buf' out of CPU cache all the time.
> >
> > I've seen this too, not that Andrew needs me to back him up, but in many
> > cases even 4k is big enough. Linux has a very thin system call layer so
> > it is OK, good even, to use reasonable buffer sizes.
>
> Slight tangent, FWIW... Back when I was working on my "race-free
> userland" project, I noticed that the fastest cp(1) implementation was
> GNU's: read/write from a single, statically allocated, page-aligned 4K
> buffer. I experimented with various buffer sizes, mmap-based copies,
> and even with sendfile(2) where both arguments were files.
> read(2)/write(2) of a single 4K buffer was always the fastest.
That sounds reasonable, but today's RAM throughput is on the order
of 1GB/s, not 100Mb/s. The 'out of L1' theory can't explain the 100Mb/s
ceiling, it seems.
--
vda
* Re: Minutes from 10/1 LSE Call
2003-10-02 18:56 ` insecure
@ 2003-10-02 19:10 ` Jeff Garzik
2003-10-02 22:38 ` insecure
0 siblings, 1 reply; 13+ messages in thread
From: Jeff Garzik @ 2003-10-02 19:10 UTC (permalink / raw)
To: insecure; +Cc: Larry McVoy, Andrew Morton, Hanna Linder, lse-tech, linux-kernel
insecure wrote:
> That sounds reasonable, but today's RAM throughput is on the order
> of 1GB/s, not 100Mb/s. 'Out of L1' theory can't explain 100Mb/s ceiling
> it seems.
cp(1) data, at least, will never ever be in L1. When copying data you need
to look at the ends of the pipeline -- hard drive throughput, PCI bus
bandwidth, FSB bandwidth, speed at which ext2/3 allocates blocks, and
similar things are likely bottlenecks.
You'll never hit RAM bandwidth limits, unless your copies are extremely
tiny, and entirely in L2 or pagecache.
Jeff
* Re: Minutes from 10/1 LSE Call
2003-10-02 19:10 ` Jeff Garzik
@ 2003-10-02 22:38 ` insecure
2003-10-02 22:45 ` Hanna Linder
2003-10-05 5:38 ` Andrew Morton
0 siblings, 2 replies; 13+ messages in thread
From: insecure @ 2003-10-02 22:38 UTC (permalink / raw)
To: Jeff Garzik
Cc: Larry McVoy, Andrew Morton, Hanna Linder, lse-tech, linux-kernel
On Thursday 02 October 2003 22:10, Jeff Garzik wrote:
> insecure wrote:
> > That sounds reasonable, but today's RAM throughput is on the order
> > of 1GB/s, not 100Mb/s. 'Out of L1' theory can't explain 100Mb/s ceiling
> > it seems.
>
> cp(1) data, at least, will never ever be in L1. Copying data you need
> to look at the ends of the pipeline -- hard drive throughput, PCI bus
> bandwidth, FSB bandwidth, speed at which ext2/3 allocates blocks, and
> similar things are likely bottlenecks.
Hmm.
On Wednesday 01 October 2003 22:19, Hanna Linder wrote:
> We got about 100 mb/sec using the bonie benchmark for block io writes,
> for writes we hit a ceiling around 100-120 mb/sec. Stopped scaling
> after about 3 spindles.
>
> Tried to focus on the block io part of it. Have not tried
> direct or raw io yet. With block IO we got about 133 mb/sec
> doing a simple dd to dev/null from multiple spindles. This
> was on the 2.6 test 3.
> ....
> Odirect on large block sizes has low cpu utilization ( 3-5% ).
> However with buffered IO we can easily get to 100% cpu utilization.
> If you look at the profile most of that is in the copy_to_user function.
So:
* we hit a ceiling of ~133 Mb/s, no matter how many disks
* CPU utilization is 100%, spent mostly in copy_to_user
* RAM bandwidth is >1Gb/s
These can't be true at once.
At least one of these three statements must be false (imho).
--
vda
* Re: Minutes from 10/1 LSE Call
2003-10-02 22:38 ` insecure
@ 2003-10-02 22:45 ` Hanna Linder
2003-10-05 5:38 ` Andrew Morton
1 sibling, 0 replies; 13+ messages in thread
From: Hanna Linder @ 2003-10-02 22:45 UTC (permalink / raw)
To: insecure, Jeff Garzik
Cc: Larry McVoy, Andrew Morton, Hanna Linder, lse-tech, linux-kernel
Ahh. I see the confusion. These are excerpts from two different speakers.
Notice the line of ===== separating speakers in the minutes.
This part is from Steve Pratt's mm vs mainline comparisons:
> On Thursday 02 October 2003 22:10, Jeff Garzik wrote:
>> insecure wrote:
>> > That sounds reasonable, but today's RAM throughput is on the order
>> > of 1GB/s, not 100Mb/s. 'Out of L1' theory can't explain 100Mb/s ceiling
>> > it seems.
>>
>> cp(1) data, at least, will never ever be in L1. Copying data you need
>> to look at the ends of the pipeline -- hard drive throughput, PCI bus
>> bandwidth, FSB bandwidth, speed at which ext2/3 allocates blocks, and
>> similar things are likely bottlenecks.
>
> Hmm.
>
This part is from Mark Gross's Real Time Application issues discussion:
> On Wednesday 01 October 2003 22:19, Hanna Linder wrote:
>> We got about 100 mb/sec using the bonie benchmark for block io writes,
>> for writes we hit a ceiling around 100-120 mb/sec. Stopped scaling
>> after about 3 spindles.
>>
>> Tried to focus on the block io part of it. Have not tried
>> direct or raw io yet. With block IO we got about 133 mb/sec
>> doing a simple dd to dev/null from multiple spindles. This
>> was on the 2.6 test 3.
>> ....
>> Odirect on large block sizes has low cpu utilization ( 3-5% ).
>> However with buffered IO we can easily get to 100% cpu utilization.
>> If you look at the profile most of that is in the copy_to_user function.
>
> So:
> * we hit a ceiling of ~133 Mb/s, no matter how many disks
> * CPU utilization is 100%, spent mostly in copy_to_user
> * RAM bandwidth is >1Gb/s
Hanna
* Re: Minutes from 10/1 LSE Call
2003-10-02 22:38 ` insecure
2003-10-02 22:45 ` Hanna Linder
@ 2003-10-05 5:38 ` Andrew Morton
1 sibling, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2003-10-05 5:38 UTC (permalink / raw)
To: insecure; +Cc: jgarzik, lm, hannal, lse-tech, linux-kernel
insecure <insecure@mail.od.ua> wrote:
>
> So:
> * we hit a ceiling of ~133 Mb/s, no matter how many disks
> * CPU utilization is 100%, spent mostly in copy_to_user
> * RAM bandwidth is >1Gb/s
>
> These can't be true at once.
True. But bear in mind that the data crosses the memory busses up to three
times: disk to pagecache, pagecache to CPU, CPU to user memory.
So top speed may be as little as 300 MB/sec.
* Re: [Lse-tech] Re: Minutes from 10/1 LSE Call
2003-10-01 23:29 ` Andrew Morton
2003-10-01 23:38 ` Larry McVoy
@ 2003-10-02 19:21 ` Steven Pratt
2003-10-02 19:36 ` Andrew Morton
1 sibling, 1 reply; 13+ messages in thread
From: Steven Pratt @ 2003-10-02 19:21 UTC (permalink / raw)
To: Andrew Morton; +Cc: Hanna Linder, lse-tech, linux-kernel
Andrew Morton wrote:
>Hanna Linder <hannal@us.ibm.com> wrote:
>
>
>>In mainline, once block size is over 32k our throuput actually drops off.
>>It levels off around 128k but at a greater cpu utilization.
>>Dont really understand why that is.
>>
>>
>
>Probably thrashing the CPU's L1 cache.
>
>
>>In mm tree, maintains throughput for all block size but the cpu utilzation
>>keeps going up to do the same throughput. Readprofile shows the extra time
>>is being spent in copy_to_user (in mm tree). Backing out readahead patch
>>reduces cpu by 10% for all block sizes but still shows the spike. So that
>>isnt the main problem.
>>
>>
>
>If you have a loop like:
>
> char *buf;
>
> for (lots) {
> read(fd, buf, size);
> }
>
>
>the optimum value of `size' is small: as little as 8k. Once `size' gets
>close to half the size of the L1 cache you end up pushing the memory at
>`buf' out of CPU cache all the time.
>
>
>
Sure, but why do I only see this in the mm tree, and not the mainline
tree? Also, there are 160 threads doing this loop, so even at a 32k block
size the working set of 'buf's is over 5MB with only a 2MB L2 cache.
Steve
* Re: [Lse-tech] Re: Minutes from 10/1 LSE Call
2003-10-02 19:21 ` [Lse-tech] " Steven Pratt
@ 2003-10-02 19:36 ` Andrew Morton
2003-10-03 19:33 ` Steven Pratt
0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2003-10-02 19:36 UTC (permalink / raw)
To: Steven Pratt; +Cc: hannal, lse-tech, linux-kernel
Steven Pratt <slpratt@austin.ibm.com> wrote:
>
> Sure, but why do I only see this is the mm tree, and not the mainline
> tree.
Please send a full description of how to reproduce it and I'll take a look.
* Re: [Lse-tech] Re: Minutes from 10/1 LSE Call
2003-10-02 19:36 ` Andrew Morton
@ 2003-10-03 19:33 ` Steven Pratt
2003-10-03 20:13 ` Andrew Morton
0 siblings, 1 reply; 13+ messages in thread
From: Steven Pratt @ 2003-10-03 19:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: hannal, lse-tech, linux-kernel
Andrew Morton wrote:
>Steven Pratt <slpratt@austin.ibm.com> wrote:
>
>
>> Sure, but why do I only see this is the mm tree, and not the mainline
>> tree.
>>
>>
>
>Please send a full description of how to reproduce it and I'll take a look.
>
>
>
Get the latest rawread from
http://www-124.ibm.com/developerworks/opensource/linuxperf/rawread/rawread.html
mkfs the devices and mount on /mnt/mntN, where N is an increasing index.
Create file 'foo' in each filesystem of size 1GB (for this example).
Unmount and remount the partitions/devices to flush the cache.
Filesystems are also umounted and re-mounted between each test run.
The following rawread commands will run the tests for block sizes
ranging from 1k-512k. The "-d 1" parameter assumes that you mounted
starting at /mnt/mnt1, and the "-m 2 -p 16" say to run 8 threads on each
of 2 devices, /mnt/mnt1 and /mnt/mnt2.
rawread -m 2 -p 16 -d 6 -n 20480 -f -c -t 0 -s 1024
rawread -m 2 -p 16 -d 6 -n 10240 -f -c -t 0 -s 2048
rawread -m 2 -p 16 -d 6 -n 5120 -f -c -t 0 -s 4096
rawread -m 2 -p 16 -d 6 -n 2560 -f -c -t 0 -s 8192
rawread -m 2 -p 16 -d 1 -n 1280 -f -c -t 0 -s 16384
rawread -m 2 -p 16 -d 1 -n 640 -f -c -t 0 -s 32768
rawread -m 2 -p 16 -d 1 -n 320 -f -c -t 0 -s 65536
rawread -m 2 -p 16 -d 1 -n 160 -f -c -t 0 -s 131072
rawread -m 2 -p 16 -d 1 -n 80 -f -c -t 0 -s 262144
rawread -m 2 -p 16 -d 1 -n 40 -f -c -t 0 -s 524288
2 devices is the smallest number I have been able to run which shows
this problem. With only 1 device I did not see it. My original tests
were done with 20 devices. One thing of interest is that with only 2
devices the point at which CPU starts to increase again is at 128k
instead of at 32k, which I saw with 20 devices. This would support your
theory that this is caused by cache misses with more/larger buffers.
I'm still not sure this accounts for all of the extra CPU usage, but I
am less worried about it.
But as long as I have your attention, there is one other thing about
these runs which bothers me, which is that the mm tree is doing horribly
at 1k and 2k block sizes. It looks like readahead is not functioning
properly for these request sizes.
Here is a comparison for 2 devices between test6 and test6mm1. You can
see that the mm1 tree does great at larger block sizes, but poorly at
small ones.
Results:seqread-_vs_.seqread-
tolerance = 0.00 + 3.00% of A
test6 test6-mm1
Blocksize KBs/sec KBs/sec %diff diff tolerance
---------- ------------ ------------ -------- ------------ ------------
1024 44083 22641 -48.64 -21442.00 1322.49 *
2048 45276 26371 -41.76 -18905.00 1358.28 *
4096 44024 45260 2.81 1236.00 1320.72
8192 44519 50073 12.48 5554.00 1335.57 *
16384 46869 51528 9.94 4659.00 1406.07 *
32768 47900 52231 9.04 4331.00 1437.00 *
65536 42803 52183 21.91 9380.00 1284.09 *
131072 36525 49724 36.14 13199.00 1095.75 *
262144 34628 46192 33.39 11564.00 1038.84 *
524288 28997 48005 65.55 19008.00 869.91 *
Results:seqread-_vs_.seqread-
tolerance = 0.50 + 3.00% of A
test6 test6-mm1
Blocksize %CPU %CPU %diff diff tolerance
---------- ------------ ------------ -------- ------------ ------------
1024 27.87 11.72 -57.95 -16.15 1.34 *
2048 13.77 8.84 -35.80 -4.93 0.91 *
4096 9 9.99 11.00 0.99 0.77 *
8192 8.07 8.31 2.97 0.24 0.74
16384 5.7 6.63 16.32 0.93 0.67 *
32768 4.93 5.59 13.39 0.66 0.65 *
65536 3.76 4.7 25.00 0.94 0.61 *
131072 3.25 4.53 39.38 1.28 0.60 *
262144 3.23 6.15 90.40 2.92 0.60 *
524288 2.97 8.19 175.76 5.22 0.59 *
Steve