From: jmerkey <jmerkey@utah-nac.org>
To: Tom Callahan <callahant@tessco.com>
Cc: Jens Axboe <axboe@suse.de>, Holger Kiehl <Holger.Kiehl@dwd.de>,
Vojtech Pavlik <vojtech@suse.cz>,
linux-raid <linux-raid@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Where is the performance bottleneck?
Date: Wed, 31 Aug 2005 09:47:14 -0600
Message-ID: <4315D102.1080909@utah-nac.org>
In-Reply-To: <4315E199.3060003@tessco.com>
I'll try this approach as well. On 2.4.X kernels, I had to change
nr_requests to achieve performance, but
I noticed it didn't seem to work as well on 2.6.X. I'll retry the
change with nr_requests on 2.6.X.
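
For reference, the runtime change Tom describes below looks roughly like
this on 2.6.X (sda and 1024 are only example values, not a recommendation):

   cat /sys/block/sda/queue/nr_requests          # current value, 128 by default
   echo 1024 > /sys/block/sda/queue/nr_requests  # example device and depth

It takes effect immediately, with no kernel rebuild needed.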
Thanks
Jeff
Tom Callahan wrote:
>From linux-kernel mailing list.....
>
>Don't do this. BLKDEV_MIN_RQ sets the size of the mempool reserved
>requests and will only get slightly used in low memory conditions, so
>most memory will probably be wasted.....
>
>Change /sys/block/xxx/queue/nr_requests
>
>Tom Callahan
>TESSCO Technologies
>(443)-506-6216
>callahant@tessco.com
>
>
>
>jmerkey wrote:
>
>
>
>>While testing 2.6.X series kernels I have seen an 80MB/s limitation in the
>>SCSI I/O layer with 3Ware and other controllers unless this value is
>>changed.
>>
>>Change these values in include/linux/blkdev.h and performance goes from
>>80MB/S to over 670MB/S on the 3Ware controller.
>>
>>
>>//#define BLKDEV_MIN_RQ 4
>>//#define BLKDEV_MAX_RQ 128 /* Default maximum */
>>#define BLKDEV_MIN_RQ 4096
>>#define BLKDEV_MAX_RQ 8192 /* Default maximum */
>>
>>Jeff
>>
>>
>>
>>Jens Axboe wrote:
>>
>>>On Wed, Aug 31 2005, Holger Kiehl wrote:
>>>
>>>>On Wed, 31 Aug 2005, Jens Axboe wrote:
>>>>
>>>>>Nothing sticks out here either. There's plenty of idle time. It smells
>>>>>like a driver issue. Can you try the same dd test, but read from the
>>>>>drives instead? Use a bigger blocksize here, 128 or 256k.
>>>>>
>>>>I used the following command reading from all 8 disks in parallel:
>>>>
>>>> dd if=/dev/sd?1 of=/dev/null bs=256k count=78125
>>>>
>>>>Here vmstat output (I just cut something out in the middle):
>>>>
>>>>procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>>> r  b   swpd   free    buff  cache   si   so     bi    bo   in    cs us sy id wa
>>>> 3  7   4348  42640 7799984   9612    0    0 322816     0 3532  4987  0 22  0 78
>>>> 1  7   4348  42136 7800624   9584    0    0 322176     0 3526  4987  0 23  4 74
>>>> 0  8   4348  39912 7802648   9668    0    0 322176     0 3525  4955  0 22 12 66
>>>> 1  7   4348  38912 7803700   9636    0    0 322432     0 3526  5078  0 23
>>>>
>>>Ok, so that's somewhat better than the writes but still off from what
>>>the individual drives can do in total.
>>>
>>>>>You might want to try the same with direct io, just to eliminate the
>>>>>costly user copy. I don't expect it to make much of a difference though,
>>>>>feels like the problem is elsewhere (driver, most likely).
>>>>>
>>>>Sorry, I don't know how to do this. Do you mean using a C program
>>>>that sets some flag to do direct io, or how can I do that?
>>>>
>>>I've attached a little sample for you, just run ala
>>>
>>># ./oread /dev/sdX
>>>
>>>and it will read 128k chunks direct from that device. Run on the same
>>>drives as above, reply with the vmstat info again.
>>>
>>>
>>>------------------------------------------------------------------------
>>>
>>>#include <stdio.h>
>>>#include <stdlib.h>
>>>#define __USE_GNU
>>>#include <fcntl.h>
>>>#include <stdlib.h>
>>>#include <unistd.h>
>>>
>>>#define BS (131072)
>>>#define ALIGN(buf)  (char *) (((unsigned long) (buf) + 4095) & ~(4095))
>>>#define BLOCKS (8192)
>>>
>>>int main(int argc, char *argv[])
>>>{
>>> char *p;
>>> int fd, i;
>>>
>>> if (argc < 2) {
>>> printf("%s: <dev>\n", argv[0]);
>>> return 1;
>>> }
>>>
>>> fd = open(argv[1], O_RDONLY | O_DIRECT);
>>> if (fd == -1) {
>>> perror("open");
>>> return 1;
>>> }
>>>
>>> p = ALIGN(malloc(BS + 4095));
>>> for (i = 0; i < BLOCKS; i++) {
>>> int r = read(fd, p, BS);
>>>
>>> if (r == BS)
>>> continue;
>>> else {
>>> if (r == -1)
>>> perror("read");
>>>
>>> break;
>>> }
>>> }
>>>
>>> return 0;
>>>}

Thread overview: 41+ messages in thread
2005-08-29 18:20 Where is the performance bottleneck? Holger Kiehl
2005-08-29 19:54 ` Mark Hahn
2005-08-30 19:08 ` Holger Kiehl
2005-08-30 23:05 ` Guy
2005-09-28 20:04 ` Bill Davidsen
2005-09-30 4:52 ` Guy
2005-09-30 5:19 ` dean gaudet
2005-10-06 21:15 ` Bill Davidsen
2005-08-29 20:10 ` Al Boldi
2005-08-30 19:18 ` Holger Kiehl
2005-08-31 10:30 ` Al Boldi
2005-08-29 23:09 ` Peter Chubb
[not found] ` <20050829202529.GA32214@midnight.suse.cz>
2005-08-30 20:06 ` Holger Kiehl
2005-08-31 7:11 ` Vojtech Pavlik
2005-08-31 7:26 ` Jens Axboe
2005-08-31 11:54 ` Holger Kiehl
2005-08-31 12:07 ` Jens Axboe
2005-08-31 13:55 ` Holger Kiehl
2005-08-31 14:24 ` Dr. David Alan Gilbert
2005-08-31 20:56 ` Holger Kiehl
2005-08-31 21:16 ` Dr. David Alan Gilbert
2005-08-31 16:20 ` Jens Axboe
2005-08-31 15:16 ` jmerkey
2005-08-31 16:58 ` Tom Callahan
2005-08-31 15:47 ` jmerkey [this message]
2005-08-31 17:11 ` Jens Axboe
2005-08-31 15:59 ` jmerkey
2005-08-31 17:32 ` Jens Axboe
2005-08-31 16:51 ` Holger Kiehl
2005-08-31 17:35 ` Jens Axboe
2005-08-31 19:00 ` Holger Kiehl
2005-08-31 18:06 ` Michael Tokarev
2005-08-31 18:52 ` Ming Zhang
2005-08-31 18:57 ` Ming Zhang
2005-08-31 12:24 ` Nick Piggin
2005-08-31 16:25 ` Holger Kiehl
2005-08-31 17:25 ` Nick Piggin
2005-08-31 21:57 ` Holger Kiehl
2005-09-01 9:12 ` Holger Kiehl
2005-09-02 14:28 ` Al Boldi
2005-08-31 13:38 ` Holger Kiehl