public inbox for linux-kernel@vger.kernel.org
* Tuning Linux for high-speed disk subsystems
@ 2001-11-13 14:29 Roy Sigurd Karlsbakk
  2001-11-13 16:43 ` Ragnar Kjørstad
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Roy Sigurd Karlsbakk @ 2001-11-13 14:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: lars.nakkerud

Hi all

After some testing at Compaq's lab in Oslo, I've come to the conclusion
that Linux cannot scale higher than about 30-40MB/sec in or out of a
hardware or software RAID-0 set with several stripe/chunk sizes tried out.
The set is based on 5 18GB 10k disks running SCSI-3 (160MBps) alone on a
32bit/33MHz PCI bus.

After speaking to the storage guys here, I was told the problem generally
was that the OS should send the data requests at 256kB block sizes, as the
drives (10k) could handle 100 I/O operations per second, and thereby could
give a total of (256*100)kB/sec per spindle. When using smaller block
sizes, the speed would decrease in a linear fashion.
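
The estimate above can be written out as a small calculation. This is only a sketch of the linear model as described (throughput = request size times operations per second); the function names and the five-disk total are illustrative assumptions, not measurements from the test.

```c
#include <assert.h>

/*
 * Estimated throughput of one spindle in kB/sec under the simple linear
 * model: the drive completes `iops` requests per second and every request
 * moves `request_kb` kilobytes. Hypothetical helper, not a benchmark.
 */
unsigned long spindle_kb_per_sec(unsigned long request_kb, unsigned long iops)
{
	assert(request_kb > 0 && iops > 0);	/* the model needs non-zero inputs */
	return request_kb * iops;
}

/* Aggregate estimate for a RAID-0 set of `disks` identical spindles. */
unsigned long raid0_kb_per_sec(unsigned long request_kb, unsigned long iops,
			       unsigned int disks)
{
	return spindle_kb_per_sec(request_kb, iops) * disks;
}
```

With 256kB requests and 100 operations per second the model gives 25600 kB/sec per spindle and 128000 kB/sec for five spindles; with 4kB requests it gives only 400 kB/sec per spindle, which is the linear decrease mentioned above.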

Does anyone know this stuff well enough to help me tune the system?
PS: The CPUs were almost idle during the test. Tested file system was
ext2.

Regards

roy

--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.




* Re: Tuning Linux for high-speed disk subsystems
@ 2001-11-13 15:58 Jesse Pollard
  0 siblings, 0 replies; 16+ messages in thread
From: Jesse Pollard @ 2001-11-13 15:58 UTC (permalink / raw)
  To: roy, linux-kernel

Roy Sigurd Karlsbakk <roy@karlsbakk.net>:
> 
> Hi all
> 
> After some testing at Compaq's lab in Oslo, I've come to the conclusion
> that Linux cannot scale higher than about 30-40MB/sec in or out of a
> hardware or software RAID-0 set with several stripe/chunk sizes tried out.
> The set is based on 5 18GB 10k disks running SCSI-3 (160MBps) alone on a
> 32bit/33MHz PCI bus.
> 
> After speaking to the storage guys here, I was told the problem generally
> was that the OS should send the data requests at 256kB block sizes, as the
> drives (10k) could handle 100 I/O operations per second, and thereby could
> give a total of (256*100)kB/sec per spindle. When using smaller block
> sizes, the speed would decrease in a linear fashion.
> 
> Does anyone know this stuff well enough to help me tune the system?
> PS: The CPUs were almost idle during the test. Tested file system was
> ext2.

I shouldn't be the authoritative answer on this, but to start with:

a. You don't provide enough info on the hardware configuration:

	a. are all of the drives on one SCSI controller?
	b. is there only one PCI?
	c. since you mention "CPUs", how many, and which ones
	d. which chipset?
	e. what was used for the benchmark?
	f. which hardware raids were tested?

b. Your mentioned limit (40MB/sec) sounds like it is really a
   memory<->bridge<->PCI<->controller bandwidth limit - this is about what I
   get from a SCSI-3 alone on 33MHz bus (I use SCSI 3 for system disk, SCSI 2
   for audio/CDRW/tape drive).

c. Based on the statement that the "CPUs were almost idle", it sounds like
   the limit is outside the OS. If you are trying to set up a disk server then
   you should check into multiple PCI busses @ 66MHz, and multiple disk
   controllers.
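
For reference, the theoretical ceiling of a parallel PCI bus is one bus-width transfer per clock. The sketch below only illustrates that arithmetic; the function name is hypothetical and the clock is rounded to whole MHz, so it yields 132 rather than the usually quoted 133 MB/sec. Sustained rates are lower because of arbitration, address phases and bridge overhead.

```c
#include <assert.h>

/*
 * Theoretical peak transfer rate of a parallel PCI bus in MB/sec
 * (10^6 bytes per second): one transfer of `width_bits` per clock cycle.
 * This is an upper bound only, not an achievable sustained rate.
 */
unsigned int pci_peak_mb_per_sec(unsigned int width_bits, unsigned int clock_mhz)
{
	assert(width_bits % 8 == 0);	/* bus width must be a whole number of bytes */
	return (width_bits / 8) * clock_mhz;
}
```

A 32bit/33MHz bus comes out at 132 MB/sec, already below the 160 MB/sec the SCSI channel is rated for, while a 64bit/66MHz bus comes out at 528 MB/sec.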

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.


* Re: Tuning Linux for high-speed disk subsystems
  2001-11-13 17:59 ` Craig I. Hagan
@ 2001-11-13 16:18   ` Marcelo Tosatti
  2001-11-14 10:33   ` Roy Sigurd Karlsbakk
  2001-11-14 10:35   ` Roy Sigurd Karlsbakk
  2 siblings, 0 replies; 16+ messages in thread
From: Marcelo Tosatti @ 2001-11-13 16:18 UTC (permalink / raw)
  To: Craig I. Hagan; +Cc: Roy Sigurd Karlsbakk, linux-kernel, lars.nakkerud



On Tue, 13 Nov 2001, Craig I. Hagan wrote:

> > After some testing at Compaq's lab in Oslo, I've come to the conclusion
> > that Linux cannot scale higher than about 30-40MB/sec in or out of a
> > hardware or software RAID-0 set with several stripe/chunk sizes tried out.
> > The set is based on 5 18GB 10k disks running SCSI-3 (160MBps) alone on a
> > 32bit/33MHz PCI bus.
> 
> this isn't quite true. use either the RH kernel, the -ac series, or the
> attached patch (for 2.4.15-pre4). Then set /proc/sys/vm/max-readahead to 511 or
> 1023 (power of 2 minus 1)
> 
> this should allow you to generate large enough io's for streaming reads to do
> what you are looking for.

Craig,

This patch is already on my pending list. 

So if Linus does not apply it, I will.



* Re: Tuning Linux for high-speed disk subsystems
  2001-11-13 14:29 Roy Sigurd Karlsbakk
@ 2001-11-13 16:43 ` Ragnar Kjørstad
  2001-11-13 16:51 ` Alan Cox
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Ragnar Kjørstad @ 2001-11-13 16:43 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: linux-kernel, lars.nakkerud

On Tue, Nov 13, 2001 at 03:29:13PM +0100, Roy Sigurd Karlsbakk wrote:
> After some testing at Compaq's lab in Oslo, I've come to the conclusion
> that Linux cannot scale higher than about 30-40MB/sec in or out of a
> hardware or software RAID-0 set with several stripe/chunk sizes tried out.

Eh, we do 60-70 MB/s reads and 110-120 MB/s writes on our RAIDs... from
linux.


> Does anyone know this stuff well enough to help me tune the system?
> PS: The CPUs were almost idle during the test. Tested file system was
> ext2.

I'd say you should get rid of your Compaq RAID controller and use a
regular SCSI controller - 66MHz, 64-bit (e.g. an Adaptec).



-- 
Ragnar Kjørstad
Big Storage


* Re: Tuning Linux for high-speed disk subsystems
  2001-11-13 14:29 Roy Sigurd Karlsbakk
  2001-11-13 16:43 ` Ragnar Kjørstad
@ 2001-11-13 16:51 ` Alan Cox
  2001-11-13 17:59 ` Craig I. Hagan
  2001-11-13 20:00 ` Dan Hollis
  3 siblings, 0 replies; 16+ messages in thread
From: Alan Cox @ 2001-11-13 16:51 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: linux-kernel, lars.nakkerud

> After some testing at Compaq's lab in Oslo, I've come to the conclusion
> that Linux cannot scale higher than about 30-40MB/sec in or out of a
> hardware or software RAID-0 set with several stripe/chunk sizes tried out.
> The set is based on 5 18GB 10k disks running SCSI-3 (160MBps) alone on a
> 32bit/33MHz PCI bus.

I'm beating that with IDE 8)

> After speaking to the storage guys here, I was told the problem generally
> was that the OS should send the data requests at 256kB block sizes, as the
> drives (10k) could handle 100 I/O operations per second, and thereby could

Right now we tend to queue 128 blocks per write. That can be tuned if you
want to play with it. 

> Does anyone know this stuff well enough to help me tune the system?
> PS: The CPUs were almost idle during the test. Tested file system was
> ext2.

I'm not sure of the best way to get big linear blocks in the ext2 layout, or
if perhaps XFS would do that job better, but the physical layer comes
down to the block limit, the SCSI max sectors per I/O set by the controller,
and to an extent the VM readahead (tunable in -ac kernels - the patch
to md.c should tell you how to hack md for that)



* Re: Tuning Linux for high-speed disk subsystems
  2001-11-13 14:29 Roy Sigurd Karlsbakk
  2001-11-13 16:43 ` Ragnar Kjørstad
  2001-11-13 16:51 ` Alan Cox
@ 2001-11-13 17:59 ` Craig I. Hagan
  2001-11-13 16:18   ` Marcelo Tosatti
                     ` (2 more replies)
  2001-11-13 20:00 ` Dan Hollis
  3 siblings, 3 replies; 16+ messages in thread
From: Craig I. Hagan @ 2001-11-13 17:59 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: linux-kernel, lars.nakkerud

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1055 bytes --]

> After some testing at Compaq's lab in Oslo, I've come to the conclusion
> that Linux cannot scale higher than about 30-40MB/sec in or out of a
> hardware or software RAID-0 set with several stripe/chunk sizes tried out.
> The set is based on 5 18GB 10k disks running SCSI-3 (160MBps) alone on a
> 32bit/33MHz PCI bus.

this isn't quite true. use either the RH kernel, the -ac series, or the
attached patch (for 2.4.15-pre4). Then set /proc/sys/vm/max-readahead to 511 or
1023 (power of 2 minus 1)

this should allow you to generate large enough io's for streaming reads to do
what you are looking for.
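
To put those values in perspective: the attached patch describes the old constants as "read-ahead in pages", and md.c derives its window as vm_max_readahead*(PAGE_SIZE/512). The sketch below converts a page count into KiB and into 512-byte sectors. The helper names are hypothetical, and the 4096-byte page used in the figures afterwards is an assumption (it is the i386 page size).

```c
#include <assert.h>

/* Size in KiB of a readahead window of `pages` pages of `page_size` bytes. */
unsigned long readahead_kib(unsigned long pages, unsigned long page_size)
{
	assert(page_size % 1024 == 0);	/* page sizes are multiples of 1 KiB */
	return pages * (page_size / 1024);
}

/*
 * The same window expressed in 512-byte sectors, mirroring the expression
 * vm_max_readahead*(PAGE_SIZE/512) used for the md window in the patch.
 */
unsigned long readahead_sectors(unsigned long pages, unsigned long page_size)
{
	assert(page_size % 512 == 0);
	return pages * (page_size / 512);
}
```

Assuming 4096-byte pages, the default of 31 pages is 124 KiB (248 sectors), 511 pages is 2044 KiB and 1023 pages is 4092 KiB.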
	
-- craig


-------------------------------------------------------------------------------
Craig I. Hagan     "It's a small world, but I wouldn't want to back it up"
hagan(at)cih.com        "True hackers don't die, their ttl expires"
  	"It takes a village to raise an idiot, but an idiot can raze a village"

	Stop the spread of spam, use a sendmail condom!
	     http://www.cih.com/~hagan/smtpd-hacks

                       In Bandwidth we trust

[-- Attachment #2: Type: TEXT/PLAIN, Size: 4745 bytes --]

diff -ur linux-orig/drivers/md/md.c linux/drivers/md/md.c
--- linux-orig/drivers/md/md.c	Mon Nov 12 23:34:52 2001
+++ linux/drivers/md/md.c	Mon Nov 12 23:31:17 2001
@@ -3398,7 +3398,7 @@
 	/*
 	 * Tune reconstruction:
 	 */
-	window = MAX_READAHEAD*(PAGE_SIZE/512);
+	window = vm_max_readahead*(PAGE_SIZE/512);
 	printk(KERN_INFO "md: using %dk window, over a total of %d blocks.\n",
 	       window/2,max_sectors/2);
 
diff -ur linux-orig/include/linux/blkdev.h linux/include/linux/blkdev.h
--- linux-orig/include/linux/blkdev.h	Mon Nov 12 23:33:20 2001
+++ linux/include/linux/blkdev.h	Mon Nov 12 23:31:17 2001
@@ -180,10 +180,6 @@
 
 #define PageAlignSize(size) (((size) + PAGE_SIZE -1) & PAGE_MASK)
 
-/* read-ahead in pages.. */
-#define MAX_READAHEAD	31
-#define MIN_READAHEAD	3
-
 #define blkdev_entry_to_request(entry) list_entry((entry), struct request, queue)
 #define blkdev_entry_next_request(entry) blkdev_entry_to_request((entry)->next)
 #define blkdev_entry_prev_request(entry) blkdev_entry_to_request((entry)->prev)
diff -ur linux-orig/include/linux/mm.h linux/include/linux/mm.h
--- linux-orig/include/linux/mm.h	Mon Nov 12 23:33:38 2001
+++ linux/include/linux/mm.h	Mon Nov 12 23:31:17 2001
@@ -111,6 +111,10 @@
 #define VM_SequentialReadHint(v)	((v)->vm_flags & VM_SEQ_READ)
 #define VM_RandomReadHint(v)		((v)->vm_flags & VM_RAND_READ)
 
+/* read ahead limits */
+extern int vm_min_readahead;
+extern int vm_max_readahead;
+
 /*
  * mapping from the currently active vm_flags protection bits (the
  * low four bits) to a page protection mask..
diff -ur linux-orig/include/linux/raid/md_k.h linux/include/linux/raid/md_k.h
--- linux-orig/include/linux/raid/md_k.h	Mon Nov 12 23:33:02 2001
+++ linux/include/linux/raid/md_k.h	Mon Nov 12 23:31:17 2001
@@ -91,7 +91,7 @@
 /*
  * default readahead
  */
-#define MD_READAHEAD	MAX_READAHEAD
+#define MD_READAHEAD	vm_max_readahead
 
 static inline int disk_faulty(mdp_disk_t * d)
 {
diff -ur linux-orig/include/linux/sysctl.h linux/include/linux/sysctl.h
--- linux-orig/include/linux/sysctl.h	Mon Nov 12 23:34:36 2001
+++ linux/include/linux/sysctl.h	Mon Nov 12 23:31:21 2001
@@ -139,7 +139,9 @@
 	VM_PAGECACHE=7,		/* struct: Set cache memory thresholds */
 	VM_PAGERDAEMON=8,	/* struct: Control kswapd behaviour */
 	VM_PGT_CACHE=9,		/* struct: Set page table cache parameters */
-	VM_PAGE_CLUSTER=10	/* int: set number of pages to swap together */
+	VM_PAGE_CLUSTER=10,	/* int: set number of pages to swap together */
+        VM_MIN_READAHEAD=12,    /* Min file readahead */
+        VM_MAX_READAHEAD=13     /* Max file readahead */
 };
 
 
diff -ur linux-orig/kernel/sysctl.c linux/kernel/sysctl.c
--- linux-orig/kernel/sysctl.c	Mon Nov 12 23:32:28 2001
+++ linux/kernel/sysctl.c	Mon Nov 12 23:31:17 2001
@@ -270,6 +270,10 @@
 	 &pgt_cache_water, 2*sizeof(int), 0644, NULL, &proc_dointvec},
 	{VM_PAGE_CLUSTER, "page-cluster", 
 	 &page_cluster, sizeof(int), 0644, NULL, &proc_dointvec},
+	{VM_MIN_READAHEAD, "min-readahead",
+	&vm_min_readahead,sizeof(int), 0644, NULL, &proc_dointvec},
+	{VM_MAX_READAHEAD, "max-readahead",
+	&vm_max_readahead,sizeof(int), 0644, NULL, &proc_dointvec},
 	{0}
 };
 
diff -ur linux-orig/mm/filemap.c linux/mm/filemap.c
--- linux-orig/mm/filemap.c	Mon Nov 12 23:32:44 2001
+++ linux/mm/filemap.c	Mon Nov 12 23:31:21 2001
@@ -47,6 +47,12 @@
 unsigned int page_hash_bits;
 struct page **page_hash_table;
 
+int vm_max_readahead = 31;
+int vm_min_readahead = 3;
+EXPORT_SYMBOL(vm_max_readahead);
+EXPORT_SYMBOL(vm_min_readahead);
+
+
 spinlock_t pagecache_lock ____cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 /*
  * NOTE: to avoid deadlocking you must never acquire the pagemap_lru_lock 
@@ -1129,7 +1135,7 @@
 static inline int get_max_readahead(struct inode * inode)
 {
 	if (!inode->i_dev || !max_readahead[MAJOR(inode->i_dev)])
-		return MAX_READAHEAD;
+		return vm_max_readahead;
 	return max_readahead[MAJOR(inode->i_dev)][MINOR(inode->i_dev)];
 }
 
@@ -1312,8 +1318,8 @@
 		if (filp->f_ramax < needed)
 			filp->f_ramax = needed;
 
-		if (reada_ok && filp->f_ramax < MIN_READAHEAD)
-				filp->f_ramax = MIN_READAHEAD;
+		if (reada_ok && filp->f_ramax < vm_min_readahead)
+				filp->f_ramax = vm_min_readahead;
 		if (filp->f_ramax > max_readahead)
 			filp->f_ramax = max_readahead;
 	}
--- linux-orig/drivers/ide/ide-probe.c	Mon Nov 12 23:49:38 2001
+++ linux/drivers/ide/ide-probe.c	Mon Nov 12 23:50:18 2001
@@ -779,7 +779,7 @@
 		/* IDE can do up to 128K per request. */
 		*max_sect++ = 255;
 #endif
-		*max_ra++ = MAX_READAHEAD;
+		*max_ra++ = vm_max_readahead;
 	}
 
 	for (unit = 0; unit < units; ++unit)


* Re: Tuning Linux for high-speed disk subsystems
  2001-11-13 14:29 Roy Sigurd Karlsbakk
                   ` (2 preceding siblings ...)
  2001-11-13 17:59 ` Craig I. Hagan
@ 2001-11-13 20:00 ` Dan Hollis
  3 siblings, 0 replies; 16+ messages in thread
From: Dan Hollis @ 2001-11-13 20:00 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: linux-kernel, lars.nakkerud

On Tue, 13 Nov 2001, Roy Sigurd Karlsbakk wrote:
> After some testing at Compaq's lab in Oslo, I've come to the conclusion
> that Linux cannot scale higher than about 30-40MB/sec in or out of a
> hardware or software RAID-0 set with several stripe/chunk sizes tried out.

We managed >100MB/sec from a RAID-5 IDE setup, SMP Athlon on Tyan S2460
with Promise controllers.

-Dan
-- 
[-] Omae no subete no kichi wa ore no mono da. [-]



* RE: Tuning Linux for high-speed disk subsystems
@ 2001-11-13 20:38 Torrey Hoffman
  0 siblings, 0 replies; 16+ messages in thread
From: Torrey Hoffman @ 2001-11-13 20:38 UTC (permalink / raw)
  To: 'Roy Sigurd Karlsbakk', linux-kernel; +Cc: lars.nakkerud

Roy Sigurd Karlsbakk wrote:

> After some testing at Compaq's lab in Oslo, I've come to the 
> conclusion
> that Linux cannot scale higher than about 30-40MB/sec in or out of a
> hardware or software RAID-0 set with several stripe/chunk 
> sizes tried out.

Hmmm. I saw "dbench 32" results of 73 MB / second using Linux
software RAID-0 and IDE.  However, I suppose some of that 
was due to caching, and not hardware throughput.

Details: 2.4.9-ac17, 4 x Maxtor 5400 RPM, 60 GB hard drives, 
2 x Promise TX-2 controllers, using UDMA-100, one drive / cable,
dual PIII-800, reiserfs, RAID - 0 with chunk-size = 1024

Torrey





* Re: Tuning Linux for high-speed disk subsystems
  2001-11-13 17:59 ` Craig I. Hagan
  2001-11-13 16:18   ` Marcelo Tosatti
@ 2001-11-14 10:33   ` Roy Sigurd Karlsbakk
  2001-11-14 10:35   ` Roy Sigurd Karlsbakk
  2 siblings, 0 replies; 16+ messages in thread
From: Roy Sigurd Karlsbakk @ 2001-11-14 10:33 UTC (permalink / raw)
  To: Craig I. Hagan; +Cc: linux-kernel, lars.nakkerud

> this isn't quite true. use either the RH kernel, the -ac series, or the
> attached patch (for 2.4.15-pre4). Then set /proc/sys/vm/max-readahead to 511 or
> 1023 (power of 2 minus 1)
>
> this should allow you to generate large enough io's for streaming reads to do
> what you are looking for.

What does the setting mean? The number of pages?
--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.



* Re: Tuning Linux for high-speed disk subsystems
  2001-11-13 17:59 ` Craig I. Hagan
  2001-11-13 16:18   ` Marcelo Tosatti
  2001-11-14 10:33   ` Roy Sigurd Karlsbakk
@ 2001-11-14 10:35   ` Roy Sigurd Karlsbakk
  2 siblings, 0 replies; 16+ messages in thread
From: Roy Sigurd Karlsbakk @ 2001-11-14 10:35 UTC (permalink / raw)
  To: Craig I. Hagan; +Cc: linux-kernel, lars.nakkerud

> this isn't quite true. use either the RH kernel, the -ac series, or the
> attached patch (for 2.4.15-pre4). Then set /proc/sys/vm/max-readahead to 511 or
> 1023 (power of 2 minus 1)
>
> this should allow you to generate large enough io's for streaming reads to do
> what you are looking for.

How does this work when using software RAID-0 or 5?
--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.



* RE: Tuning Linux for high-speed disk subsystems
@ 2001-11-16  2:56 Dieter Nützel
  2001-11-16 11:51 ` Roy Sigurd Karlsbakk
  0 siblings, 1 reply; 16+ messages in thread
From: Dieter Nützel @ 2001-11-16  2:56 UTC (permalink / raw)
  To: Linux Kernel List

The heroinewarrior.com (Broadcast 2000) guys came to the following with the
Tyan Thunder K7 (2 x 1.0 GHz Athlon MP) dual channel U160 (Adaptec) and
RAID 0. http://heroinewarrior.com/athlon.php3

[-]
 As for performance our experiences are biased because this system is almost 
exclusively used for video software development not games like most. It needs 
a reliable operating system like Linux and very fast media storage drives.

 The inverse telecine, a grueling memory exercise which takes 3 hours on a 
dual PIII 933 and 2 hours on a dual Alpha, takes about 2 hours on the dual 
Athlon. 

Our 100 Gig SCSI raid, consisting of 6 15,000 rpm drives on the motherboard's 
two SCSI 160 channels gives a full 110MB/sec read and write with RAID 0. With 
RAID chunks set to 1MB the write accesses go to 160MB/sec and read accesses 
go to 90MB/sec sustained. This system would make a good motion capture tool. 
Previous Intel attempts at onboard disk I/O would give 50MB/sec.
[-]

-Dieter


* RE: Tuning Linux for high-speed disk subsystems
  2001-11-16  2:56 Dieter Nützel
@ 2001-11-16 11:51 ` Roy Sigurd Karlsbakk
  2001-11-16 15:24   ` Dieter Nützel
  0 siblings, 1 reply; 16+ messages in thread
From: Roy Sigurd Karlsbakk @ 2001-11-16 11:51 UTC (permalink / raw)
  To: Dieter Nützel; +Cc: Linux Kernel List

> Our 100 Gig SCSI raid, consisting of 6 15,000 rpm drives on the motherboard's
> two SCSI 160 channels gives a full 110MB/sec read and write with RAID 0. With
> RAID chunks set to 1MB the write accesses go to 160MB/sec and read accesses
> go to 90MB/sec sustained. This system would make a good motion capture tool.
> Previous Intel attempts at onboard disk I/O would give 50MB/sec.

How much do you think I can get out of 2x6 15k disks - each 6 disks are on
their own SCSI-3/160 bus.
--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA

Computers are like air conditioners.
They stop working when you open Windows.



* Re: Tuning Linux for high-speed disk subsystems
  2001-11-16 11:51 ` Roy Sigurd Karlsbakk
@ 2001-11-16 15:24   ` Dieter Nützel
  0 siblings, 0 replies; 16+ messages in thread
From: Dieter Nützel @ 2001-11-16 15:24 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: Linux Kernel List

On Friday, 16 November 2001 12:51, Roy Sigurd Karlsbakk wrote:
> > Our 100 Gig SCSI raid, consisting of 6 15,000 rpm drives on the
> > motherboard's two SCSI 160 channels gives a full 110MB/sec read and write
> > with RAID 0. With RAID chunks set to 1MB the write accesses go to
> > 160MB/sec and read accesses go to 90MB/sec sustained. This system would
> > make a good motion capture tool. Previous Intel attempts at onboard disk
> > I/O would give 50MB/sec.
>
> How much do you think I can get out of 2x6 15k disks - each 6 disks are on
> their own SCSI-3/160 bus.

By my count of your disks, maybe double that in the best case. I read a post
here on LKML in which someone claimed that W2K delivers 250 MB/s with such a
configuration. Linux 2.4 should do the same. Ask the SCSI gurus.

Regards,
	Dieter

-- 
Dieter Nützel
Graduate Student, Computer Science
@home: Dieter.Nuetzel@hamburg.de


* Re: Tuning Linux for high-speed disk subsystems
       [not found] <3B6867E6CB09B24385A73719A50C7C9A79791F@athena.boxxtech.com>
@ 2001-11-16 16:53 ` Marvin Justice
  0 siblings, 0 replies; 16+ messages in thread
From: Marvin Justice @ 2001-11-16 16:53 UTC (permalink / raw)
  To: Dieter Nützel, Roy Sigurd Karlsbakk; +Cc: Linux Kernel List


> By my count of your disks, maybe double that in the best case. I read a post
> here on LKML in which someone claimed that W2K delivers 250 MB/s with such a
> configuration. Linux 2.4 should do the same. Ask the SCSI gurus.
>

That may have been my post you refer to. With 2x5 disks, each capable of
50 MB/s by itself, we can stream 255 MB/s very smoothly in either direction 
with W2K --- as long as FILE_FLAG_NO_BUFFERING is used. With standard
reads the number is more like 100 MB/s if I recall correctly, so the buffer
cache can definitely get in the way.

With Linux + XFS I was getting 250 MB/s read and 220 MB/s write (with a
bit less smoothness than W2K) using O_DIRECT and no high mem to avoid
bounce buffer copies. Using standard reads the numbers drop to around 
120 MB/s. That was a couple of weeks ago and I want to try tweaking some
more but a co-worker has "borrowed" pieces of the hardware for the moment.
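
For readers unfamiliar with O_DIRECT, the sketch below shows roughly what an uncached read looks like on a current Linux system with glibc. It is not the benchmark code behind the numbers above: the function name, the 4096-byte alignment and the fallback to a buffered read when the filesystem refuses O_DIRECT are all assumptions made for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* exposes O_DIRECT in <fcntl.h> with glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0		/* not available here: degrade to buffered I/O */
#endif

/* Assumed alignment; the real requirement depends on device and filesystem. */
#define DIRECT_ALIGN 4096

/*
 * Read up to `len` bytes from the start of `path` into `out`, bypassing the
 * page cache with O_DIRECT where possible. If O_DIRECT is rejected with
 * EINVAL, retry once as an ordinary buffered read.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t read_uncached(const char *path, void *out, size_t len)
{
	/* O_DIRECT wants aligned buffers and lengths, so round the request up. */
	size_t padded = (len + DIRECT_ALIGN - 1) / DIRECT_ALIGN * DIRECT_ALIGN;
	int flags[2] = { O_RDONLY | O_DIRECT, O_RDONLY };
	void *buf = NULL;
	ssize_t n = -1;

	assert(len > 0);
	if (posix_memalign(&buf, DIRECT_ALIGN, padded) != 0)
		return -1;

	for (int i = 0; i < 2; i++) {
		int fd = open(path, flags[i]);
		if (fd >= 0) {
			n = read(fd, buf, padded);
			int saved = errno;	/* close() may clobber errno */
			close(fd);
			errno = saved;
		}
		/* Only fall through to the buffered attempt on EINVAL. */
		if (n >= 0 || errno != EINVAL)
			break;
	}

	if (n > (ssize_t)len)
		n = (ssize_t)len;	/* never copy more than the caller asked for */
	if (n > 0)
		memcpy(out, buf, (size_t)n);
	free(buf);
	return n;
}
```

A streaming benchmark would call read() in a loop with a much larger aligned buffer; the single read here only illustrates the alignment handling.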

-- 
Marvin Justice
Software Developer
BOXX Technologies
www.boxxtech.com
mjustice@boxxtech.com
512-235-6318 (V)
512-835-0434 (F)


* Re: Tuning Linux for high-speed disk subsystems
@ 2001-11-22 10:15 Martin Knoblauch
  2001-11-22 16:30 ` Andreas Dilger
  0 siblings, 1 reply; 16+ messages in thread
From: Martin Knoblauch @ 2001-11-22 10:15 UTC (permalink / raw)
  To: mjustice; +Cc: linux-kernel@vger.kernel.org, S.Akhtary

> Re: Tuning Linux for high-speed disk subsystems
> 
> 
> > By my count of your disks, maybe double that in the best case. I read a post
> > here on LKML in which someone claimed that W2K delivers 250 MB/s with such a
> > configuration. Linux 2.4 should do the same. Ask the SCSI gurus.
> >
> 
> That may have been my post you refer to. With 2x5 disks, each capable of
> 50 MB/s by itself, we can stream 255 MB/s very smoothly in either direction
> with W2K --- as long as FILE_FLAG_NO_BUFFERING is used. With standard
> reads the number is more like 100 MB/s if I recall correctly, so the buffer
> cache can definitely get in the way.
> 
> With Linux + XFS I was getting 250 MB/s read and 220 MB/s write (with a
> bit less smoothness than W2K) using O_DIRECT and no high mem to avoid
> bounce buffer copies. Using standard reads the numbers drop to around
> 120 MB/s. That was a couple of weeks ago and I want to try tweaking some
> more but a co-worker has "borrowed" pieces of the hardware for the moment.
> 
Marvin,

 could you elaborate a bit more :-), or point me/us to your post
(couldn't find it). We are currently evaluating solutions for doing HDTV
playback for one of our customers. This will need about 300-320 MB/sec
read. We know (at least someone claims so) that you can do it with SGI
equipment at a price. The goal for the customer is to definitely beat
that price :-))

Martin
-- 
------------------------------------------------------------------
Martin Knoblauch         |    email:  Martin.Knoblauch@TeraPort.de
TeraPort GmbH            |    Phone:  +49-89-510857-309
C+ITS                    |    Fax:    +49-89-510857-111
http://www.teraport.de   |    Mobile: +49-170-4904759


* Re: Tuning Linux for high-speed disk subsystems
  2001-11-22 10:15 Martin Knoblauch
@ 2001-11-22 16:30 ` Andreas Dilger
  0 siblings, 0 replies; 16+ messages in thread
From: Andreas Dilger @ 2001-11-22 16:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: S.Akhtary, mjustice

> Re: Tuning Linux for high-speed disk subsystems
> > By my count of your disks, maybe double that in the best case. I read a post
> > here on LKML in which someone claimed that W2K delivers 250 MB/s with such a
> > configuration. Linux 2.4 should do the same. Ask the SCSI gurus.
> 
> That may have been my post you refer to. With 2x5 disks, each capable of
> 50 MB/s by itself, we can stream 255 MB/s very smoothly in either direction
> with W2K --- as long as FILE_FLAG_NO_BUFFERING is used. With standard
> reads the number is more like 100 MB/s if I recall correctly, so the buffer
> cache can definitely get in the way.
> 
> With Linux + XFS I was getting 250 MB/s read and 220 MB/s write (with a
> bit less smoothness than W2K) using O_DIRECT and no high mem to avoid
> bounce buffer copies. Using standard reads the numbers drop to around
> 120 MB/s. That was a couple of weeks ago and I want to try tweaking some
> more but a co-worker has "borrowed" pieces of the hardware for the moment.

Just FYI, Linus announced that he had returned Andrea's O_DIRECT support
to the most recent 2.4.15-pre kernel, so you are no longer restricted to
using XFS for no-cache I/O.  Whether you will be able to beat XFS for
speed using any other filesystem is another question.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


