From: Andreas Hirstius <Andreas.Hirstius@cern.ch>
To: jmerkey <jmerkey@utah-nac.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
Date: Wed, 20 Apr 2005 20:04:40 +0200 [thread overview]
Message-ID: <426699B8.3020504@cern.ch> (raw)
In-Reply-To: <42668977.5060708@utah-nac.org>
Just tried it, but the performance problem remains :-(
(actually, why should it change? This part of the code didn't change so
much between 2.6.10-bk6 and -bk7...)
Andreas
jmerkey wrote:
>
>
> For 3Ware, you need to chage the queue depths, and you will see
> dramatically improved performance. 3Ware can take requests
> a lot faster than Linux pushes them out. Try changing this instead,
> you won't be going to sleep all the time waiting on the read/write
> request queues to get "unstarved".
>
>
> /linux/include/linux/blkdev.h
>
> //#define BLKDEV_MIN_RQ 4
> //#define BLKDEV_MAX_RQ 128 /* Default maximum */
> #define BLKDEV_MIN_RQ 4096
> #define BLKDEV_MAX_RQ 8192 /* Default maximum */
>
>
> Jeff
>
> Andreas Hirstius wrote:
>
>> Hi,
>>
>>
>> We have a rx4640 with 3x 3Ware 9500 SATA controllers and 24x WD740GD
>> HDD in a software RAID0 configuration (using md).
>> With kernel 2.6.11 the read performance on the md is reduced by a
>> factor of 20 (!!) compared to previous kernels.
>> The write rate to the md doesn't change!! (it actually improves a bit).
>>
>> The config for the kernels are basically identical.
>>
>> Here is some vmstat output:
>>
>> kernel 2.6.9: ~1GB/s read
>> procs memory swap io system cpu
>> r b swpd free buff cache si so bi bo in cs us sy wa id
>> 1 1 0 12672 6592 15914112 0 0 1081344 56 15719 1583 0 11 14 74
>> 1 0 0 12672 6592 15915200 0 0 1130496 0 15996 1626 0 11 14 74
>> 0 1 0 12672 6592 15914112 0 0 1081344 0 15891 1570 0 11 14 74
>> 0 1 0 12480 6592 15914112 0 0 1081344 0 15855 1537 0 11 14 74
>> 1 0 0 12416 6592 15914112 0 0 1130496 0 16006 1586 0 12 14 74
>>
>>
>> kernel 2.6.11: ~55MB/s read
>> procs memory swap io system cpu
>> r b swpd free buff cache si so bi bo in cs us sy wa id
>> 1 1 0 24448 37568 15905984 0 0 56934 0 5166 1862 0 1 24 75
>> 0 1 0 20672 37568 15909248 0 0 57280 0 5168 1871 0 1 24 75
>> 0 1 0 22848 37568 15907072 0 0 57306 0 5173 1874 0 1 24 75
>> 0 1 0 25664 37568 15903808 0 0 57190 0 5171 1870 0 1 24 75
>> 0 1 0 21952 37568 15908160 0 0 57267 0 5168 1871 0 1 24 75
>>
>>
>> Because the filesystem might have an impact on the measurement, "dd"
>> on /dev/md0
>> was used to get information about the performance. This also opens
>> the possibility to test with block sizes larger than the page size.
>> And it appears that the performance with kernel 2.6.11 is closely
>> related to the block size.
>> For example if the block size is exactly a multiple (>2) of the page
>> size the performance is back to ~1.1GB/s.
>> The general behaviour is a bit more complicated:
>> 1. bs <= 1.5 * ps : ~27-57MB/s (differs with ps)
>> 2. bs > 1.5 * ps && bs < 2 * ps : rate increases to max. rate
>> 3. bs = n * ps ; (n >= 2) : ~1.1GB/s (== max. rate)
>> 4. bs > n * ps && bs < ~(n+0.5) * ps ; (n > 2) : ~27-70MB/s (differs
>> with ps)
>> 5. bs > ~(n+0.5) * ps && bs < (n+1) * ps ; (n > 2) : increasing rate
>> in several, more or
>> less, distinct steps (e.g. 1/3 of max. rate and then 2/3 of max rate
>> for 64k pages)
>>
>> I've tested all four possible page sizes on Itanium (4k, 8k, 16k and
>> 64k) and the pattern is always the same!!
>>
>> With kernel 2.6.9 (any kernel before 2.6.10-bk6) the read rate is
>> always at ~1.1GB/s,
>> independent of the block size.
>>
>>
>> This simple patch solves the problem, but I have no idea of possible
>> side-effects ...
>>
>> --- linux-2.6.12-rc2_orig/mm/filemap.c 2005-04-04 18:40:05.000000000
>> +0200
>> +++ linux-2.6.12-rc2/mm/filemap.c 2005-04-20 10:27:42.000000000 +0200
>> @@ -719,7 +719,7 @@
>> index = *ppos >> PAGE_CACHE_SHIFT;
>> next_index = index;
>> prev_index = ra.prev_page;
>> - last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >>
>> PAGE_CACHE_SHIFT;
>> + last_index = (*ppos + desc->count + PAGE_CACHE_SIZE) >>
>> PAGE_CACHE_SHIFT;
>> offset = *ppos & ~PAGE_CACHE_MASK;
>>
>> isize = i_size_read(inode);
>> --- linux-2.6.12-rc2_orig/mm/readahead.c 2005-04-04
>> 18:40:05.000000000 +0200
>> +++ linux-2.6.12-rc2/mm/readahead.c 2005-04-20 18:37:04.000000000 +0200
>> @@ -70,7 +70,7 @@
>> */
>> static unsigned long get_init_ra_size(unsigned long size, unsigned
>> long max)
>> {
>> - unsigned long newsize = roundup_pow_of_two(size);
>> + unsigned long newsize = size;
>>
>> if (newsize <= max / 64)
>> newsize = newsize * newsize;
>>
>>
>>
>> In order to keep this mail short, I've created a webpage that
>> contains all the detailed information and some plots:
>> http://www.cern.ch/openlab-debugging/raid
>>
>>
>> Regards,
>>
>> Andreas Hirstius
>>
>>
>> -
>> To unsubscribe from this list: send the line "unsubscribe
>> linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>>
>
next prev parent reply other threads:[~2005-04-20 18:04 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-04-20 17:37 Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later Andreas Hirstius
2005-04-20 16:55 ` jmerkey
2005-04-20 18:04 ` Andreas Hirstius [this message]
2005-04-20 18:24 ` Andreas Hirstius
2005-04-20 19:17 ` jmerkey
2005-04-21 1:11 ` Nick Piggin
2005-04-21 8:32 ` Andreas Hirstius
[not found] ` <58cb370e05042102272ce70f2@mail.gmail.com>
2005-04-21 9:42 ` Bartlomiej ZOLNIERKIEWICZ
2005-04-21 11:30 ` Andreas Hirstius
2005-04-21 15:05 ` [Gelato-technical] " David Mosberger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=426699B8.3020504@cern.ch \
--to=andreas.hirstius@cern.ch \
--cc=jmerkey@utah-nac.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox