From: Tom Callahan <callahant@tessco.com>
To: jmerkey <jmerkey@utah-nac.org>
Cc: Jens Axboe <axboe@suse.de>, Holger Kiehl <Holger.Kiehl@dwd.de>,
	Vojtech Pavlik <vojtech@suse.cz>,
	linux-raid <linux-raid@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Where is the performance bottleneck?
Date: Wed, 31 Aug 2005 12:58:01 -0400
Message-ID: <4315E199.3060003@tessco.com>
In-Reply-To: <4315C9EB.2030506@utah-nac.org>

From linux-kernel mailing list.....

Don't do this. BLKDEV_MIN_RQ sets the size of the mempool of reserved
requests, which is only drawn on (and then only lightly) in low-memory
conditions, so most of that memory will probably be wasted.

Change /sys/block/xxx/queue/nr_requests

Tom Callahan
TESSCO Technologies
(443)-506-6216
callahant@tessco.com



jmerkey wrote:

>I have seen an 80MB/sec limitation in the kernel unless this value is
>changed in the SCSI I/O layer for 3Ware and other controllers during
>testing of 2.6.X series kernels.
>
>Change these values in include/linux/blkdev.h and performance goes from 
>80MB/S to over 670MB/S on the 3Ware controller.
>
>
>//#define BLKDEV_MIN_RQ    4
>//#define BLKDEV_MAX_RQ    128    /* Default maximum */
>#define BLKDEV_MIN_RQ    4096
>#define BLKDEV_MAX_RQ    8192    /* Default maximum */
>
>Jeff
>
>
>
>Jens Axboe wrote:
>
>>On Wed, Aug 31 2005, Holger Kiehl wrote:
>>
>>>On Wed, 31 Aug 2005, Jens Axboe wrote:
>>>
>>>>Nothing sticks out here either. There's plenty of idle time. It smells
>>>>like a driver issue. Can you try the same dd test, but read from the
>>>>drives instead? Use a bigger blocksize here, 128 or 256k.
>>>>
>>>I used the following command reading from all 8 disks in parallel:
>>>
>>>  dd if=/dev/sd?1 of=/dev/null bs=256k count=78125
>>>
>>>Here vmstat output (I just cut something out in the middle):
>>>
>>>procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>>r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>>>3  7   4348  42640 7799984   9612    0    0 322816     0 3532  4987  0 22  0 78
>>>1  7   4348  42136 7800624   9584    0    0 322176     0 3526  4987  0 23  4 74
>>>0  8   4348  39912 7802648   9668    0    0 322176     0 3525  4955  0 22 12 66
>>>1  7   4348  38912 7803700   9636    0    0 322432     0 3526  5078  0 23
>>>
>>Ok, so that's somewhat better than the writes but still off from what
>>the individual drives can do in total.
>>
>>>>You might want to try the same with direct io, just to eliminate the
>>>>costly user copy. I don't expect it to make much of a difference though,
>>>>feels like the problem is elsewhere (driver, most likely).
>>>>
>>>Sorry, I don't know how to do this. Do you mean using a C program
>>>that sets some flag to do direct io, or how can I do that?
>>>
>>I've attached a little sample for you, just run ala
>>
>># ./oread /dev/sdX
>>
>>and it will read 128k chunks direct from that device. Run on the same
>>drives as above, reply with the vmstat info again.
>>
>>------------------------------------------------------------------------
>>
>>#define _GNU_SOURCE		/* must precede the includes to expose O_DIRECT */
>>#include <stdio.h>
>>#include <stdlib.h>
>>#include <fcntl.h>
>>#include <unistd.h>
>>
>>#define BS		(131072)	/* bytes per read: 128k */
>>#define ALIGN(buf)	(char *) (((unsigned long) (buf) + 4095) & ~(4095))
>>#define BLOCKS		(8192)	/* number of reads before exiting */
>>
>>int main(int argc, char *argv[])
>>{
>>	char *p, *mem;
>>	int fd, i;
>>
>>	if (argc < 2) {
>>		printf("%s: <dev>\n", argv[0]);
>>		return 1;
>>	}
>>
>>	/* O_DIRECT bypasses the page cache, avoiding the user copy */
>>	fd = open(argv[1], O_RDONLY | O_DIRECT);
>>	if (fd == -1) {
>>		perror("open");
>>		return 1;
>>	}
>>
>>	/* over-allocate, then round up to a 4096-byte boundary */
>>	mem = malloc(BS + 4095);
>>	if (!mem) {
>>		perror("malloc");
>>		return 1;
>>	}
>>	p = ALIGN(mem);
>>
>>	for (i = 0; i < BLOCKS; i++) {
>>		int r = read(fd, p, BS);
>>
>>		if (r == BS)
>>			continue;
>>		else {
>>			if (r == -1)
>>				perror("read");
>>
>>			break;
>>		}
>>	}
>>
>>	return 0;
>>}

Thread overview: 43+ messages
2005-08-29 18:20 Where is the performance bottleneck? Holger Kiehl
2005-08-29 19:54 ` Mark Hahn
2005-08-30 19:08   ` Holger Kiehl
2005-08-30 23:05     ` Guy
2005-09-28 20:04       ` Bill Davidsen
2005-09-30  4:52         ` Guy
2005-09-30  5:19           ` dean gaudet
2005-10-06 21:15           ` Bill Davidsen
2005-08-29 20:10 ` Al Boldi
2005-08-30 19:18   ` Holger Kiehl
2005-08-31 10:30     ` Al Boldi
2005-08-29 20:25 ` Vojtech Pavlik
2005-08-30 20:06   ` Holger Kiehl
2005-08-31  7:11     ` Vojtech Pavlik
2005-08-31  7:26       ` Jens Axboe
2005-08-31 11:54         ` Holger Kiehl
2005-08-31 12:07           ` Jens Axboe
2005-08-31 13:55             ` Holger Kiehl
2005-08-31 14:24               ` Dr. David Alan Gilbert
2005-08-31 20:56                 ` Holger Kiehl
2005-08-31 21:16                   ` Dr. David Alan Gilbert
2005-08-31 16:20               ` Jens Axboe
2005-08-31 15:16                 ` jmerkey
2005-08-31 16:58                   ` Tom Callahan [this message]
2005-08-31 16:58                     ` Tom Callahan
2005-08-31 15:47                     ` jmerkey
2005-08-31 17:11                   ` Jens Axboe
2005-08-31 15:59                     ` jmerkey
2005-08-31 17:32                       ` Jens Axboe
2005-08-31 16:51                 ` Holger Kiehl
2005-08-31 17:35                   ` Jens Axboe
2005-08-31 19:00                     ` Holger Kiehl
2005-08-31 18:06                   ` Michael Tokarev
2005-08-31 18:52                     ` Ming Zhang
2005-08-31 18:57                       ` Ming Zhang
2005-08-31 12:24           ` Nick Piggin
2005-08-31 16:25             ` Holger Kiehl
2005-08-31 17:25               ` Nick Piggin
2005-08-31 21:57                 ` Holger Kiehl
2005-09-01  9:12                   ` Holger Kiehl
2005-09-02 14:28                     ` Al Boldi
2005-08-31 13:38       ` Holger Kiehl
2005-08-29 23:09 ` Peter Chubb
