public inbox for linux-kernel@vger.kernel.org
From: jmerkey <jmerkey@utah-nac.org>
To: Andreas Hirstius <Andreas.Hirstius@cern.ch>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later
Date: Wed, 20 Apr 2005 10:55:19 -0600
Message-ID: <42668977.5060708@utah-nac.org>
In-Reply-To: <42669357.9080604@cern.ch>



For 3Ware, you need to change the queue depths, and you will see
dramatically improved performance. 3Ware can take requests
a lot faster than Linux pushes them out. Try changing this instead; you
won't be going to sleep all the time waiting on the read/write
request queues to get "unstarved".


/linux/include/linux/blkdev.h

//#define BLKDEV_MIN_RQ 4
//#define BLKDEV_MAX_RQ 128 /* Default maximum */
#define BLKDEV_MIN_RQ 4096
#define BLKDEV_MAX_RQ 8192 /* Default maximum */
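
As a hedged aside: BLKDEV_MAX_RQ only sets the default for each queue's
nr_requests, and 2.6 kernels also expose that value per device in sysfs, so
it may be possible to try a deeper queue without rebuilding. A minimal
sketch for reading the current depth follows. read_nr_requests is a
hypothetical helper written for this example, and the "sda" in the path
mentioned afterwards is only an assumption about the device name.

```c
#include <stdio.h>

/* Hypothetical helper: parse the queue depth from a sysfs-style file
 * such as /sys/block/<dev>/queue/nr_requests.
 * Returns -1 if the file cannot be opened or does not hold a number. */
long read_nr_requests(const char *path)
{
	FILE *f = fopen(path, "r");
	long depth = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &depth) != 1)
		depth = -1;
	fclose(f);
	return depth;
}
```

Calling read_nr_requests("/sys/block/sda/queue/nr_requests") should return
the depth currently in effect for that device (substitute the device behind
your 3Ware unit for sda). Writing a new number to the same file as root
should change it at runtime.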


Jeff

Andreas Hirstius wrote:

> Hi,
>
>
> We have a rx4640 with 3x 3Ware 9500 SATA controllers and 24x WD740GD 
> HDD in a software RAID0 configuration (using md).
> With kernel 2.6.11 the read performance on the md is reduced by a 
> factor of 20 (!!) compared to previous kernels.
> The write rate to the md doesn't change!! (it actually improves a bit).
>
> The configs for the kernels are basically identical.
>
> Here is some vmstat output:
>
> kernel 2.6.9: ~1GB/s read
> procs memory                    swap  io         system     cpu
>  r  b swpd  free  buff    cache si so      bi bo    in   cs us sy wa id
>  1  1    0 12672  6592 15914112  0  0 1081344 56 15719 1583  0 11 14 74
>  1  0    0 12672  6592 15915200  0  0 1130496  0 15996 1626  0 11 14 74
>  0  1    0 12672  6592 15914112  0  0 1081344  0 15891 1570  0 11 14 74
>  0  1    0 12480  6592 15914112  0  0 1081344  0 15855 1537  0 11 14 74
>  1  0    0 12416  6592 15914112  0  0 1130496  0 16006 1586  0 12 14 74
>
>
> kernel 2.6.11: ~55MB/s read
> procs memory                    swap  io         system     cpu
>  r  b swpd  free  buff    cache si so      bi bo    in   cs us sy wa id
>  1  1    0 24448 37568 15905984  0  0   56934  0  5166 1862  0  1 24 75
>  0  1    0 20672 37568 15909248  0  0   57280  0  5168 1871  0  1 24 75
>  0  1    0 22848 37568 15907072  0  0   57306  0  5173 1874  0  1 24 75
>  0  1    0 25664 37568 15903808  0  0   57190  0  5171 1870  0  1 24 75
>  0  1    0 21952 37568 15908160  0  0   57267  0  5168 1871  0  1 24 75
>
>
> Because the filesystem might have an impact on the measurement, "dd"
> on /dev/md0 was used to get information about the performance. This
> also makes it possible to test with block sizes larger than the page
> size. And it appears that the performance with kernel 2.6.11 is
> closely related to the block size.
> For example if the block size is exactly a multiple (>2) of the page 
> size the performance is back to ~1.1GB/s.
> The general behaviour is a bit more complicated:
> 1. bs <= 1.5 * ps : ~27-57MB/s (differs with ps)
> 2. bs > 1.5 * ps && bs < 2 * ps : rate increases to max. rate
> 3. bs = n * ps ; (n >= 2) : ~1.1GB/s (== max. rate)
> 4. bs > n * ps && bs < ~(n+0.5) * ps ; (n > 2) : ~27-70MB/s (differs with ps)
> 5. bs > ~(n+0.5) * ps && bs < (n+1) * ps ; (n > 2) : increasing rate in
>    several more or less distinct steps (e.g. 1/3 of max. rate and then
>    2/3 of max. rate for 64k pages)
>
> I've tested all four possible page sizes on Itanium (4k, 8k, 16k and 
> 64k) and the pattern is always the same!!
>
> With kernel 2.6.9 (any kernel before 2.6.10-bk6) the read rate is 
> always at ~1.1GB/s,
> independent of the block size.
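
For anyone who wants to repeat the block-size sweep without dd, here is a
rough userspace sketch of the same kind of measurement. It is not what was
used for the numbers above. timed_read is a hypothetical helper that issues
a fixed number of read() calls of a chosen size and reports how many bytes
came back and how long it took.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: issue up to `count` read() calls of `bs` bytes each
 * against `path`, roughly like "dd if=path of=/dev/null bs=... count=...".
 * Returns the bytes actually read (or -1 on error) and stores the elapsed
 * wall-clock time in *seconds. */
long long timed_read(const char *path, size_t bs, long count, double *seconds)
{
	struct timespec t0, t1;
	long long total = 0;
	char *buf;
	int fd;
	long i;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	buf = malloc(bs);
	if (!buf) {
		close(fd);
		return -1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < count; i++) {
		ssize_t n = read(fd, buf, bs);

		if (n < 0) {
			total = -1;
			break;
		}
		if (n == 0)	/* end of file or device */
			break;
		total += n;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	*seconds = (double)(t1.tv_sec - t0.tv_sec) +
		   (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	free(buf);
	close(fd);
	return total;
}
```

Dividing the returned byte count by the elapsed seconds gives the rate for
one block size; calling it for bs = ps, 1.5 * ps, 2 * ps and so on against
/dev/md0 should show whether the same steps appear. Reads that are served
from the page cache will inflate the number, so each run should cover data
that has not been read recently.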
>
>
> This simple patch solves the problem, but I have no idea of possible 
> side-effects ...
>
> --- linux-2.6.12-rc2_orig/mm/filemap.c 2005-04-04 18:40:05.000000000 +0200
> +++ linux-2.6.12-rc2/mm/filemap.c 2005-04-20 10:27:42.000000000 +0200
> @@ -719,7 +719,7 @@
>  	index = *ppos >> PAGE_CACHE_SHIFT;
>  	next_index = index;
>  	prev_index = ra.prev_page;
> -	last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
> +	last_index = (*ppos + desc->count + PAGE_CACHE_SIZE) >> PAGE_CACHE_SHIFT;
>  	offset = *ppos & ~PAGE_CACHE_MASK;
>
>  	isize = i_size_read(inode);
> --- linux-2.6.12-rc2_orig/mm/readahead.c 2005-04-04 18:40:05.000000000 +0200
> +++ linux-2.6.12-rc2/mm/readahead.c 2005-04-20 18:37:04.000000000 +0200
> @@ -70,7 +70,7 @@
>   */
>  static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
>  {
> -	unsigned long newsize = roundup_pow_of_two(size);
> +	unsigned long newsize = size;
>
>  	if (newsize <= max / 64)
>  		newsize = newsize * newsize;
>
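
To make the arithmetic in the patch above concrete, here is a small
userspace sketch. It is not kernel code. It assumes 4k pages, and
PAGE_SHIFT_EX, PAGE_SIZE_EX, last_index_orig, last_index_patched and
roundup_pow2 are names made up for the example.

```c
#define PAGE_SHIFT_EX 12UL			/* assumption: 4k pages */
#define PAGE_SIZE_EX  (1UL << PAGE_SHIFT_EX)

/* last_index as computed before the patch: (pos + count) rounded up
 * to a whole number of pages. */
unsigned long last_index_orig(unsigned long pos, unsigned long count)
{
	return (pos + count + PAGE_SIZE_EX - 1) >> PAGE_SHIFT_EX;
}

/* last_index as computed with the patch: one more than the original
 * whenever pos + count is page-aligned, identical otherwise. */
unsigned long last_index_patched(unsigned long pos, unsigned long count)
{
	return (pos + count + PAGE_SIZE_EX) >> PAGE_SHIFT_EX;
}

/* Userspace stand-in for the kernel's roundup_pow_of_two().
 * Assumes n is small enough that the next power of two fits in
 * an unsigned long. */
unsigned long roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}
```

With these definitions, a read of 8192 bytes at offset 0 gives a last_index
of 2 before the patch and 3 after it, while a 6000-byte read gives 2 in both
cases. So, as far as this sketch goes, the filemap.c hunk only changes reads
that end exactly on a page boundary. Likewise roundup_pow2(3) is 4 and
roundup_pow2(5) is 8, which is the rounding that the readahead.c hunk drops.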
>
>
> In order to keep this mail short, I've created a webpage that contains 
> all the detailed information and some plots:
> http://www.cern.ch/openlab-debugging/raid
>
>
> Regards,
>
> Andreas Hirstius

