* RE: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
@ 2005-04-21 17:41 Luck, Tony
2005-04-21 17:48 ` David Mosberger
2005-04-21 18:24 ` Stan Bubrouski
0 siblings, 2 replies; 9+ messages in thread
From: Luck, Tony @ 2005-04-21 17:41 UTC (permalink / raw)
To: davidm
Cc: akpm, Andreas Hirstius, Bartlomiej ZOLNIERKIEWICZ,
Gelato technical, linux-kernel
>Yeah, I'm facing the same issue. I started playing with git last
>night. Apart from disk-space usage, it's very nice, though I really
>hope someone puts together a web-interface on top of git soon so we
>can see what changed when and by whom.
Disk space issues? A complete git repository of the Linux kernel with
all changesets back to 2.4.0 takes just over 3G ... which is big compared
to BK, but 3G of disk only costs about $1 (for IDE ... if you want 15K rpm
SCSI, then you'll pay a lot more). Network bandwidth is likely to be a
bigger problem.
There's a prototype web i/f at http://grmso.net:8090/ that's already looking
fairly slick.
-Tony
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
2005-04-21 17:41 [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later Luck, Tony
@ 2005-04-21 17:48 ` David Mosberger
2005-04-21 18:24 ` Stan Bubrouski
1 sibling, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-04-21 17:48 UTC (permalink / raw)
To: Luck, Tony
Cc: davidm, akpm, Andreas Hirstius, Bartlomiej ZOLNIERKIEWICZ,
Gelato technical, linux-kernel
>>>>> On Thu, 21 Apr 2005 10:41:52 -0700, "Luck, Tony" <tony.luck@intel.com> said:
Tony> Disk space issues? A complete git repository of the Linux
Tony> kernel with all changesets back to 2.4.0 takes just over 3G
Tony> ... which is big compared to BK, but 3G of disk only costs
Tony> about $1 (for IDE ... if you want 15K rpm SCSI, then you'll
Tony> pay a lot more). Network bandwidth is likely to be a bigger
Tony> problem.
Ever heard that data is a gas? My disks always fill up in no time at
all, no matter how big they are. I agree that network bandwidth is a
bigger issue, though.
Tony> There's a prototype web i/f at http://grmso.net:8090/ that's
Tony> already looking fairly slick.
Indeed. Plus it has a cool name, too. Thanks for the pointer.
--david
* Re: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
2005-04-21 17:41 [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later Luck, Tony
2005-04-21 17:48 ` David Mosberger
@ 2005-04-21 18:24 ` Stan Bubrouski
1 sibling, 0 replies; 9+ messages in thread
From: Stan Bubrouski @ 2005-04-21 18:24 UTC (permalink / raw)
To: Luck, Tony
Cc: davidm, akpm, Andreas Hirstius, Bartlomiej ZOLNIERKIEWICZ,
Gelato technical, linux-kernel
Luck, Tony wrote:
>>Yeah, I'm facing the same issue. I started playing with git last
>>night. Apart from disk-space usage, it's very nice, though I really
>>hope someone puts together a web-interface on top of git soon so we
>>can see what changed when and by whom.
>
>
> Disk space issues? A complete git repository of the Linux kernel with
> all changesets back to 2.4.0 takes just over 3G ... which is big compared
> to BK, but 3G of disk only costs about $1 (for IDE ... if you want 15K rpm
> SCSI, then you'll pay a lot more). Network bandwidth is likely to be a
> bigger problem.
>
That said, is there any plan to change how this functions in the future
to solve these problems? I.e. have it not use so much disk space and
thus use less bandwidth. Am I misunderstanding in assuming that after
say 1000 commits go into the tree it could end up several megs or gigs
bigger?
If that is the case might it not be more prudent to sort this out now?
> There's a prototype web i/f at http://grmso.net:8090/ that's already looking
> fairly slick.
>
Yes it is very slick. Kudos to the creator.
-sb
> -Tony
* RE: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
@ 2005-04-21 18:29 Luck, Tony
2005-04-21 18:40 ` Stan Bubrouski
0 siblings, 1 reply; 9+ messages in thread
From: Luck, Tony @ 2005-04-21 18:29 UTC (permalink / raw)
To: Stan Bubrouski
Cc: davidm, akpm, Andreas Hirstius, Bartlomiej ZOLNIERKIEWICZ,
Gelato technical, linux-kernel
>That said, is there any plan to change how this functions in the future
>to solve these problems? I.e. have it not use so much disk space and
>thus use less bandwidth. Am I misunderstanding in assuming that after
>say 1000 commits go into the tree it could end up several megs or gigs
>bigger?
>
>If that is the case might it not be more prudent to sort this out now?
Only a new user would have to pull the whole history ... and for most
uses it is sufficient to just pull the current top of the tree. Linus'
own tree only has a history going back to 2.6.12-rc2 (when he started
using git).
Someday there might be a server daemon that can batch up the changes for
a "pull" to conserve network bandwidth.
There is a mailing list "git@vger.kernel.org" where these issues are
discussed. Archives are available at marc.theaimsgroup.com and gelato.
-Tony
* Re: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
2005-04-21 18:29 Luck, Tony
@ 2005-04-21 18:40 ` Stan Bubrouski
0 siblings, 0 replies; 9+ messages in thread
From: Stan Bubrouski @ 2005-04-21 18:40 UTC (permalink / raw)
To: Luck, Tony
Cc: davidm, akpm, Andreas Hirstius, Bartlomiej ZOLNIERKIEWICZ,
Gelato technical, linux-kernel
Luck, Tony wrote:
<SNIP>
> Only a new user would have to pull the whole history ... and for most
> uses it is sufficient to just pull the current top of the tree. Linus'
> own tree only has a history going back to 2.6.12-rc2 (when he started
> using git).
>
> Someday there might be a server daemon that can batch up the changes for
> a "pull" to conserve network bandwidth.
>
> There is a mailing list "git@vger.kernel.org" where these issues are
> discussed. Archives are available at marc.theaimsgroup.com and gelato.
>
Thanks Tony, I wasn't aware of the list; I'll look there for git info
from now on.
Best Regards,
Stan
> -Tony
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>
* RE: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
@ 2005-04-21 17:19 Luck, Tony
2005-04-21 17:33 ` David Mosberger
0 siblings, 1 reply; 9+ messages in thread
From: Luck, Tony @ 2005-04-21 17:19 UTC (permalink / raw)
To: davidm, akpm
Cc: Andreas Hirstius, Bartlomiej ZOLNIERKIEWICZ, Gelato technical,
linux-kernel
>I just checked 2.6.12-rc3 and the fls() fix is indeed missing. Do you
>know what happened?
If BitKeeper were still in use, I'd have dropped that patch into my
"release" tree and asked Linus to "pull" ... but it's not, and I was
stalled. I should have a "git" tree up and running in the next couple
of days. I'll make sure that the fls fix goes in early.
-Tony
* RE: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
2005-04-21 17:19 Luck, Tony
@ 2005-04-21 17:33 ` David Mosberger
2005-04-21 17:39 ` Randy.Dunlap
0 siblings, 1 reply; 9+ messages in thread
From: David Mosberger @ 2005-04-21 17:33 UTC (permalink / raw)
To: Luck, Tony
Cc: davidm, akpm, Andreas Hirstius, Bartlomiej ZOLNIERKIEWICZ,
Gelato technical, linux-kernel
>>>>> On Thu, 21 Apr 2005 10:19:28 -0700, "Luck, Tony" <tony.luck@intel.com> said:
>> I just checked 2.6.12-rc3 and the fls() fix is indeed missing.
>> Do you know what happened?
Tony> If BitKeeper were still in use, I'd have dropped that patch
Tony> into my "release" tree and asked Linus to "pull" ... but it's
Tony> not, and I was stalled. I should have a "git" tree up and
Tony> running in the next couple of days. I'll make sure that the
Tony> fls fix goes in early.
Yeah, I'm facing the same issue. I started playing with git last
night. Apart from disk-space usage, it's very nice, though I really
hope someone puts together a web-interface on top of git soon so we
can see what changed when and by whom.
--david
* Re: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
2005-04-21 17:33 ` David Mosberger
@ 2005-04-21 17:39 ` Randy.Dunlap
0 siblings, 0 replies; 9+ messages in thread
From: Randy.Dunlap @ 2005-04-21 17:39 UTC (permalink / raw)
To: davidm
Cc: davidm, tony.luck, akpm, Andreas.Hirstius,
Bartlomiej.Zolnierkiewicz, gelato-technical, linux-kernel
On Thu, 21 Apr 2005 10:33:29 -0700 David Mosberger wrote:
| >>>>> On Thu, 21 Apr 2005 10:19:28 -0700, "Luck, Tony" <tony.luck@intel.com> said:
|
| >> I just checked 2.6.12-rc3 and the fls() fix is indeed missing.
| >> Do you know what happened?
|
| Tony> If BitKeeper were still in use, I'd have dropped that patch
| Tony> into my "release" tree and asked Linus to "pull" ... but it's
| Tony> not, and I was stalled. I should have a "git" tree up and
| Tony> running in the next couple of days. I'll make sure that the
| Tony> fls fix goes in early.
|
| Yeah, I'm facing the same issue. I started playing with git last
| night. Apart from disk-space usage, it's very nice, though I really
| hope someone puts together a web-interface on top of git soon so we
| can see what changed when and by whom.
Two people have already done that. Examples:
http://ehlo.org/~kay/gitweb.pl
and
http://grmso.net:8090/
and the commits mailing list is now working.
A script to show nightly (or daily:) commits and make
a daily patch tarball is also close to ready.
---
~Randy
* Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
@ 2005-04-20 17:37 Andreas Hirstius
2005-04-20 16:55 ` jmerkey
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Hirstius @ 2005-04-20 17:37 UTC (permalink / raw)
To: linux-kernel
Hi,
We have a rx4640 with 3x 3Ware 9500 SATA controllers and 24x WD740GD HDD
in a software RAID0 configuration (using md).
With kernel 2.6.11 the read performance on the md is reduced by a factor
of 20 (!!) compared to previous kernels.
The write rate to the md doesn't change!! (it actually improves a bit).
The config for the kernels are basically identical.
Here is some vmstat output:
kernel 2.6.9: ~1GB/s read
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy wa id
1 1 0 12672 6592 15914112 0 0 1081344 56 15719 1583 0 11 14 74
1 0 0 12672 6592 15915200 0 0 1130496 0 15996 1626 0 11 14 74
0 1 0 12672 6592 15914112 0 0 1081344 0 15891 1570 0 11 14 74
0 1 0 12480 6592 15914112 0 0 1081344 0 15855 1537 0 11 14 74
1 0 0 12416 6592 15914112 0 0 1130496 0 16006 1586 0 12 14 74
kernel 2.6.11: ~55MB/s read
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy wa id
1 1 0 24448 37568 15905984 0 0 56934 0 5166 1862 0 1 24 75
0 1 0 20672 37568 15909248 0 0 57280 0 5168 1871 0 1 24 75
0 1 0 22848 37568 15907072 0 0 57306 0 5173 1874 0 1 24 75
0 1 0 25664 37568 15903808 0 0 57190 0 5171 1870 0 1 24 75
0 1 0 21952 37568 15908160 0 0 57267 0 5168 1871 0 1 24 75
Because the filesystem might have an impact on the measurement, "dd" on /dev/md0
was used to get information about the performance.
This also opens the possibility to test with block sizes larger than the page size.
And it appears that the performance with kernel 2.6.11 is closely
related to the block size.
For example, if the block size is exactly a multiple (>=2) of the page
size, the performance is back to ~1.1GB/s.
The general behaviour is a bit more complicated:
1. bs <= 1.5 * ps : ~27-57MB/s (differs with ps)
2. bs > 1.5 * ps && bs < 2 * ps : rate increases to max. rate
3. bs = n * ps ; (n >= 2) : ~1.1GB/s (== max. rate)
4. bs > n * ps && bs < ~(n+0.5) * ps ; (n > 2) : ~27-70MB/s (differs with ps)
5. bs > ~(n+0.5) * ps && bs < (n+1) * ps ; (n > 2) : rate increases in
   several more or less distinct steps (e.g. 1/3 of max. rate, then 2/3
   of max. rate for 64k pages)
I've tested all four possible page sizes on Itanium (4k, 8k, 16k and 64k) and the pattern is
always the same!!
With kernel 2.6.9 (any kernel before 2.6.10-bk6) the read rate is always at ~1.1GB/s,
independent of the block size.
This simple patch solves the problem, but I have no idea of possible side-effects ...
--- linux-2.6.12-rc2_orig/mm/filemap.c 2005-04-04 18:40:05.000000000 +0200
+++ linux-2.6.12-rc2/mm/filemap.c 2005-04-20 10:27:42.000000000 +0200
@@ -719,7 +719,7 @@
index = *ppos >> PAGE_CACHE_SHIFT;
next_index = index;
prev_index = ra.prev_page;
- last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
+ last_index = (*ppos + desc->count + PAGE_CACHE_SIZE) >> PAGE_CACHE_SHIFT;
offset = *ppos & ~PAGE_CACHE_MASK;
isize = i_size_read(inode);
--- linux-2.6.12-rc2_orig/mm/readahead.c 2005-04-04 18:40:05.000000000 +0200
+++ linux-2.6.12-rc2/mm/readahead.c 2005-04-20 18:37:04.000000000 +0200
@@ -70,7 +70,7 @@
*/
static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
{
- unsigned long newsize = roundup_pow_of_two(size);
+ unsigned long newsize = size;
if (newsize <= max / 64)
newsize = newsize * newsize;
In order to keep this mail short, I've created a webpage that contains
all the detailed information and some plots:
http://www.cern.ch/openlab-debugging/raid
Regards,
Andreas Hirstius
* Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
2005-04-20 17:37 Andreas Hirstius
@ 2005-04-20 16:55 ` jmerkey
2005-04-21 8:32 ` Andreas Hirstius
0 siblings, 1 reply; 9+ messages in thread
From: jmerkey @ 2005-04-20 16:55 UTC (permalink / raw)
To: Andreas Hirstius; +Cc: linux-kernel
For 3Ware, you need to change the queue depths, and you will see
dramatically improved performance. 3Ware can take requests
a lot faster than Linux pushes them out. Try changing this instead, you
won't be going to sleep all the time waiting on the read/write
request queues to get "unstarved".
/linux/include/linux/blkdev.h
//#define BLKDEV_MIN_RQ 4
//#define BLKDEV_MAX_RQ 128 /* Default maximum */
#define BLKDEV_MIN_RQ 4096
#define BLKDEV_MAX_RQ 8192 /* Default maximum */
Jeff
Andreas Hirstius wrote:
> Hi,
>
>
> We have a rx4640 with 3x 3Ware 9500 SATA controllers and 24x WD740GD
> HDD in a software RAID0 configuration (using md).
> With kernel 2.6.11 the read performance on the md is reduced by a
> factor of 20 (!!) compared to previous kernels.
> The write rate to the md doesn't change!! (it actually improves a bit).
>
> The config for the kernels are basically identical.
>
> Here is some vmstat output:
>
> kernel 2.6.9: ~1GB/s read
> procs memory swap io system cpu
> r b swpd free buff cache si so bi bo in cs us sy wa id
> 1 1 0 12672 6592 15914112 0 0 1081344 56 15719 1583 0 11 14 74
> 1 0 0 12672 6592 15915200 0 0 1130496 0 15996 1626 0 11 14 74
> 0 1 0 12672 6592 15914112 0 0 1081344 0 15891 1570 0 11 14 74
> 0 1 0 12480 6592 15914112 0 0 1081344 0 15855 1537 0 11 14 74
> 1 0 0 12416 6592 15914112 0 0 1130496 0 16006 1586 0 12 14 74
>
>
> kernel 2.6.11: ~55MB/s read
> procs memory swap io system cpu
> r b swpd free buff cache si so bi bo in cs us sy wa id
> 1 1 0 24448 37568 15905984 0 0 56934 0 5166 1862 0 1 24 75
> 0 1 0 20672 37568 15909248 0 0 57280 0 5168 1871 0 1 24 75
> 0 1 0 22848 37568 15907072 0 0 57306 0 5173 1874 0 1 24 75
> 0 1 0 25664 37568 15903808 0 0 57190 0 5171 1870 0 1 24 75
> 0 1 0 21952 37568 15908160 0 0 57267 0 5168 1871 0 1 24 75
>
>
> Because the filesystem might have an impact on the measurement, "dd"
> on /dev/md0
> was used to get information about the performance. This also opens the
> possibility to test with block sizes larger than the page size.
> And it appears that the performance with kernel 2.6.11 is closely
> related to the block size.
> For example, if the block size is exactly a multiple (>=2) of the page
> size, the performance is back to ~1.1GB/s.
> The general behaviour is a bit more complicated:
> 1. bs <= 1.5 * ps : ~27-57MB/s (differs with ps)
> 2. bs > 1.5 * ps && bs < 2 * ps : rate increases to max. rate
> 3. bs = n * ps ; (n >= 2) : ~1.1GB/s (== max. rate)
> 4. bs > n * ps && bs < ~(n+0.5) * ps ; (n > 2) : ~27-70MB/s (differs
> with ps)
> 5. bs > ~(n+0.5) * ps && bs < (n+1) * ps ; (n > 2) : increasing rate
> in several, more or
> less, distinct steps (e.g. 1/3 of max. rate and then 2/3 of max rate
> for 64k pages)
>
> I've tested all four possible page sizes on Itanium (4k, 8k, 16k and
> 64k) and the pattern is always the same!!
>
> With kernel 2.6.9 (any kernel before 2.6.10-bk6) the read rate is
> always at ~1.1GB/s,
> independent of the block size.
>
>
> This simple patch solves the problem, but I have no idea of possible
> side-effects ...
>
> --- linux-2.6.12-rc2_orig/mm/filemap.c 2005-04-04 18:40:05.000000000
> +0200
> +++ linux-2.6.12-rc2/mm/filemap.c 2005-04-20 10:27:42.000000000 +0200
> @@ -719,7 +719,7 @@
> index = *ppos >> PAGE_CACHE_SHIFT;
> next_index = index;
> prev_index = ra.prev_page;
> - last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >>
> PAGE_CACHE_SHIFT;
> + last_index = (*ppos + desc->count + PAGE_CACHE_SIZE) >>
> PAGE_CACHE_SHIFT;
> offset = *ppos & ~PAGE_CACHE_MASK;
>
> isize = i_size_read(inode);
> --- linux-2.6.12-rc2_orig/mm/readahead.c 2005-04-04 18:40:05.000000000
> +0200
> +++ linux-2.6.12-rc2/mm/readahead.c 2005-04-20 18:37:04.000000000 +0200
> @@ -70,7 +70,7 @@
> */
> static unsigned long get_init_ra_size(unsigned long size, unsigned
> long max)
> {
> - unsigned long newsize = roundup_pow_of_two(size);
> + unsigned long newsize = size;
>
> if (newsize <= max / 64)
> newsize = newsize * newsize;
>
>
>
> In order to keep this mail short, I've created a webpage that contains
> all the detailed information and some plots:
> http://www.cern.ch/openlab-debugging/raid
>
>
> Regards,
>
> Andreas Hirstius
>
>
* Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
2005-04-20 16:55 ` jmerkey
@ 2005-04-21 8:32 ` Andreas Hirstius
[not found] ` <58cb370e05042102272ce70f2@mail.gmail.com>
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Hirstius @ 2005-04-21 8:32 UTC (permalink / raw)
To: Gelato technical; +Cc: linux-kernel
A small update.
Patching mm/filemap.c is not necessary in order to get the improved
performance!
It's sufficient to remove roundup_pow_of_two from get_init_ra_size ...
So a simple one-liner changes the picture dramatically.
But why ?!?!?
Andreas
jmerkey wrote:
>
>
> For 3Ware, you need to change the queue depths, and you will see
> dramatically improved performance. 3Ware can take requests
> a lot faster than Linux pushes them out. Try changing this instead,
> you won't be going to sleep all the time waiting on the read/write
> request queues to get "unstarved".
>
>
> /linux/include/linux/blkdev.h
>
> //#define BLKDEV_MIN_RQ 4
> //#define BLKDEV_MAX_RQ 128 /* Default maximum */
> #define BLKDEV_MIN_RQ 4096
> #define BLKDEV_MAX_RQ 8192 /* Default maximum */
>
>
> Jeff
>
> Andreas Hirstius wrote:
>
>> <SNIP>
>
end of thread, other threads: [~2005-04-21 18:39 UTC | newest]
Thread overview: 9+ messages
-- links below jump to the message on this page --
2005-04-21 17:41 [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later Luck, Tony
2005-04-21 17:48 ` David Mosberger
2005-04-21 18:24 ` Stan Bubrouski
-- strict thread matches above, loose matches on Subject: below --
2005-04-21 18:29 Luck, Tony
2005-04-21 18:40 ` Stan Bubrouski
2005-04-21 17:19 Luck, Tony
2005-04-21 17:33 ` David Mosberger
2005-04-21 17:39 ` Randy.Dunlap
2005-04-20 17:37 Andreas Hirstius
2005-04-20 16:55 ` jmerkey
2005-04-21 8:32 ` Andreas Hirstius
[not found] ` <58cb370e05042102272ce70f2@mail.gmail.com>
2005-04-21 9:42 ` Bartlomiej ZOLNIERKIEWICZ
2005-04-21 11:30 ` Andreas Hirstius
2005-04-21 15:05 ` [Gelato-technical] " David Mosberger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox