* Re: 3ware 9650 tips [not found] <alpine.LRH.0.999.0707131356520.25773@chaos.egr.duke.edu> @ 2007-07-13 18:35 ` Justin Piszcz 2007-07-13 18:54 ` Jon Collette ` (3 more replies) 0 siblings, 4 replies; 26+ messages in thread From: Justin Piszcz @ 2007-07-13 18:35 UTC (permalink / raw) To: Joshua Baker-LePain; +Cc: linux-ide-arrays, linux-raid On Fri, 13 Jul 2007, Joshua Baker-LePain wrote: > My new system has a 3ware 9650SE-24M8 controller hooked to 24 500GB WD > drives. The controller is set up as a RAID6 w/ a hot spare. OS is CentOS 5 > x86_64. It's all running on a couple of Xeon 5130s on a Supermicro X7DBE > motherboard w/ 4GB of RAM. > > Trying to stick with a supported config as much as possible, I need to run > ext3. As per usual, though, initial ext3 numbers are less than impressive. > Using bonnie++ to get a baseline, I get (after doing 'blockdev --setra 65536' > on the device): > Write: 136MB/s > Read: 384MB/s > > Proving it's not the hardware, with XFS the numbers look like: > Write: 333MB/s > Read: 465MB/s > > How many folks are using these? Any tuning tips? > > Thanks. > > -- > Joshua Baker-LePain > Department of Biomedical Engineering > Duke University > Let's try that again with the right address :) You are using HW RAID then? Those numbers seem pretty awful for that setup. I'm including linux-raid@ even though it appears you're running HW RAID, because this is rather peculiar. To give you an example, I get 464MB/s write and 627MB/s read with a 10-disk Raptor software RAID5. Justin. ^ permalink raw reply [flat|nested] 26+ messages in thread
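For reference, the `blockdev --setra 65536` setting quoted above is expressed in 512-byte sectors, so it is worth converting to a size before comparing it with other tunings. The following is a minimal sketch of that arithmetic only; the function name is made up for illustration and nothing here talks to a real device.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_mib(setra_sectors: int) -> float:
    """Convert a --setra value (in sectors) to MiB of readahead."""
    return setra_sectors * SECTOR_BYTES / (1024 * 1024)


# The value used in the original post.
print(readahead_mib(65536))  # 32.0
```

So the baseline numbers in this thread were taken with 32 MiB of readahead on the array device.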
* Re: 3ware 9650 tips 2007-07-13 18:35 ` 3ware 9650 tips Justin Piszcz @ 2007-07-13 18:54 ` Jon Collette 2007-07-13 19:36 ` Justin Piszcz 2007-07-13 19:04 ` Joshua Baker-LePain ` (2 subsequent siblings) 3 siblings, 1 reply; 26+ messages in thread From: Jon Collette @ 2007-07-13 18:54 UTC (permalink / raw) To: Justin Piszcz; +Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid Wouldn't RAID 6 be slower than RAID 5 because of the extra fault tolerance? This article reports a 20% drop: http://www.enterprisenetworksandservers.com/monthly/art.php?1754 His 500GB WD drives are 7200RPM, compared to the Raptors' 10K, so his numbers will be slower. Justin, what file system do you have running on the Raptors? I think that's an interesting point made by Joshua. Justin Piszcz wrote: > > > On Fri, 13 Jul 2007, Joshua Baker-LePain wrote: > >> My new system has a 3ware 9650SE-24M8 controller hooked to 24 500GB >> WD drives. The controller is set up as a RAID6 w/ a hot spare. OS >> is CentOS 5 x86_64. It's all running on a couple of Xeon 5130s on a >> Supermicro X7DBE motherboard w/ 4GB of RAM. >> >> Trying to stick with a supported config as much as possible, I need >> to run ext3. As per usual, though, initial ext3 numbers are less >> than impressive. Using bonnie++ to get a baseline, I get (after doing >> 'blockdev --setra 65536' on the device): >> Write: 136MB/s >> Read: 384MB/s >> >> Proving it's not the hardware, with XFS the numbers look like: >> Write: 333MB/s >> Read: 465MB/s >> >> How many folks are using these? Any tuning tips? >> >> Thanks. >> >> -- >> Joshua Baker-LePain >> Department of Biomedical Engineering >> Duke University >> > > Let's try that again with the right address :) > > > You are using HW RAID then? Those numbers seem pretty awful for that > setup, including linux-raid@ even it though it appears you're running > HW raid, > this is rather peculiar. > > To give you an example I get 464MB/s write and 627MB/s with a 10 disk > raptor software raid5.
> > Justin. > > - > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-13 18:54 ` Jon Collette @ 2007-07-13 19:36 ` Justin Piszcz 2007-07-16 2:41 ` David Chinner 0 siblings, 1 reply; 26+ messages in thread From: Justin Piszcz @ 2007-07-13 19:36 UTC (permalink / raw) To: Jon Collette; +Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid, xfs On Fri, 13 Jul 2007, Jon Collette wrote: > Wouldn't Raid 6 be slower than Raid 5 because of the extra fault tolerance? > http://www.enterprisenetworksandservers.com/monthly/art.php?1754 - 20% > drop according to this article > > His 500GB WD drives are 7200RPM compared to the Raptors 10K. So his numbers > will be slower. > Justin what file system do you have running on the Raptors? I think thats an > interesting point made by Joshua. I use XFS. I also run several optimizations for SW RAID and my overall configuration. The mkfs.xfs options, however, are tuned automatically for whatever (SW) RAID the filesystem sits on. If XFS cannot see the disk layout underneath the HW RAID, I do not think those optimizations will be applied(?), which means you'd have to set sunit and swidth appropriately. > > > Justin Piszcz wrote: >> >> >> On Fri, 13 Jul 2007, Joshua Baker-LePain wrote: >> >>> My new system has a 3ware 9650SE-24M8 controller hooked to 24 500GB WD >>> drives. The controller is set up as a RAID6 w/ a hot spare. OS is CentOS >>> 5 x86_64. It's all running on a couple of Xeon 5130s on a Supermicro >>> X7DBE motherboard w/ 4GB of RAM. >>> >>> Trying to stick with a supported config as much as possible, I need to run >>> ext3. As per usual, though, initial ext3 numbers are less than >>> impressive.
Using bonnie++ to get a baseline, I get (after doing 'blockdev >>> --setra 65536' on the device): >>> Write: 136MB/s >>> Read: 384MB/s >>> >>> Proving it's not the hardware, with XFS the numbers look like: >>> Write: 333MB/s >>> Read: 465MB/s >>> >>> How many folks are using these? Any tuning tips? >>> >>> Thanks. >>> >>> -- >>> Joshua Baker-LePain >>> Department of Biomedical Engineering >>> Duke University >>> >> >> Let's try that again with the right address :) >> >> >> You are using HW RAID then? Those numbers seem pretty awful for that >> setup, including linux-raid@ even it though it appears you're running HW >> raid, >> this is rather peculiar. >> >> To give you an example I get 464MB/s write and 627MB/s with a 10 disk >> raptor software raid5. >> >> Justin. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-13 19:36 ` Justin Piszcz @ 2007-07-16 2:41 ` David Chinner 2007-07-16 12:22 ` David Chinner 2007-07-16 15:43 ` Joshua Baker-LePain 0 siblings, 2 replies; 26+ messages in thread From: David Chinner @ 2007-07-16 2:41 UTC (permalink / raw) To: Justin Piszcz Cc: Jon Collette, Joshua Baker-LePain, linux-ide-arrays, linux-raid, xfs On Fri, Jul 13, 2007 at 03:36:46PM -0400, Justin Piszcz wrote: > On Fri, 13 Jul 2007, Jon Collette wrote: > > >Wouldn't Raid 6 be slower than Raid 5 because of the extra fault tolerance? > > http://www.enterprisenetworksandservers.com/monthly/art.php?1754 - 20% > >drop according to this article > > > >His 500GB WD drives are 7200RPM compared to the Raptors 10K. So his > >numbers will be slower. > >Justin what file system do you have running on the Raptors? I think thats > >an interesting point made by Joshua. > > I use XFS: When it comes to bandwidth, there is good reason for that. > >>>Trying to stick with a supported config as much as possible, I need to > >>>run ext3. As per usual, though, initial ext3 numbers are less than > >>>impressive. Using bonnie++ to get a baseline, I get (after doing > >>>'blockdev --setra 65536' on the device): > >>>Write: 136MB/s > >>>Read: 384MB/s > >>> > >>>Proving it's not the hardware, with XFS the numbers look like: > >>>Write: 333MB/s > >>>Read: 465MB/s > >>> Those are pretty typical numbers. In my experience, ext3 is limited to about 250MB/s buffered write speed. It's not disk limited, it's design limited. e.g. on a disk subsystem where XFS was getting 4-5GB/s buffered write, ext3 was doing 250MB/s. http://oss.sgi.com/projects/xfs/papers/ols2006/ols-2006-paper.pdf If you've got any sort of serious disk array, ext3 is not the filesystem to use.... > >>>How many folks are using these? Any tuning tips? Make sure you tell XFS the correct sunit/swidth. For hardware raid5/6, sunit = per-disk chunksize, swidth = number of *data* disks in array. Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 26+ messages in thread
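The sunit/swidth advice above can be turned into numbers for the array described at the start of the thread. This is a rough sketch under stated assumptions: the hot spare is taken to sit outside the RAID6 set, RAID6 is taken to cost two disks of parity, the 64 KiB per-disk chunk size is a guess (the thread never states it), and `/dev/sdX` is a placeholder device name. The `su`/`sw` form of the mkfs.xfs option takes the chunk size and the number of data disks directly, while `sunit`/`swidth` are counted in 512-byte sectors.

```python
SECTOR_BYTES = 512  # mkfs.xfs takes sunit/swidth in 512-byte sectors


def stripe_geometry(total_drives, hot_spares, parity_disks, chunk_kib):
    """Return stripe parameters for a hardware RAID array (illustrative helper)."""
    data_disks = total_drives - hot_spares - parity_disks
    if data_disks < 1:
        raise ValueError("array has no data disks")
    sunit = chunk_kib * 1024 // SECTOR_BYTES  # per-disk chunk, in sectors
    swidth = sunit * data_disks               # full data stripe, in sectors
    return {"data_disks": data_disks, "sunit": sunit, "swidth": swidth}


def mkfs_xfs_command(device, chunk_kib, data_disks):
    """Build the equivalent mkfs.xfs invocation using the su/sw form."""
    return f"mkfs.xfs -d su={chunk_kib}k,sw={data_disks} {device}"


# 24 drives, 1 hot spare, RAID6 (2 parity disks); 64 KiB chunk is assumed.
geometry = stripe_geometry(total_drives=24, hot_spares=1, parity_disks=2, chunk_kib=64)
print(geometry)
print(mkfs_xfs_command("/dev/sdX", 64, geometry["data_disks"]))
```

With those assumptions the array has 21 data disks. The chunk size actually configured on the controller should be substituted before using the resulting command.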
* Re: 3ware 9650 tips 2007-07-16 2:41 ` David Chinner @ 2007-07-16 12:22 ` David Chinner 2007-07-16 12:39 ` Bernd Schubert 2007-07-16 15:50 ` Eric Sandeen 2007-07-16 15:43 ` Joshua Baker-LePain 1 sibling, 2 replies; 26+ messages in thread From: David Chinner @ 2007-07-16 12:22 UTC (permalink / raw) To: David Chinner Cc: Justin Piszcz, Jon Collette, Joshua Baker-LePain, linux-ide-arrays, linux-raid, xfs On Mon, Jul 16, 2007 at 12:41:15PM +1000, David Chinner wrote: > On Fri, Jul 13, 2007 at 03:36:46PM -0400, Justin Piszcz wrote: > > On Fri, 13 Jul 2007, Jon Collette wrote: > > > > >Wouldn't Raid 6 be slower than Raid 5 because of the extra fault tolerance? > > > http://www.enterprisenetworksandservers.com/monthly/art.php?1754 - 20% > > >drop according to this article > > > > > >His 500GB WD drives are 7200RPM compared to the Raptors 10K. So his > > >numbers will be slower. > > >Justin what file system do you have running on the Raptors? I think thats > > >an interesting point made by Joshua. > > > > I use XFS: > > When it comes to bandwidth, there is good reason for that. > > > >>>Trying to stick with a supported config as much as possible, I need to > > >>>run ext3. As per usual, though, initial ext3 numbers are less than > > >>>impressive. Using bonnie++ to get a baseline, I get (after doing > > >>>'blockdev --setra 65536' on the device): > > >>>Write: 136MB/s > > >>>Read: 384MB/s > > >>> > > >>>Proving it's not the hardware, with XFS the numbers look like: > > >>>Write: 333MB/s > > >>>Read: 465MB/s > > >>> > > Those are pretty typical numbers. In my experience, ext3 is limited to about > 250MB/s buffered write speed. It's not disk limited, it's design limited. e.g. > on a disk subsystem where XFS was getting 4-5GB/s buffered write, ext3 was doing > 250MB/s. > > http://oss.sgi.com/projects/xfs/papers/ols2006/ols-2006-paper.pdf > > If you've got any sort of serious disk array, ext3 is not the filesystem > to use.... 
To show what the difference is, I used blktrace and Chris Mason's seekwatcher script on a simple, single threaded dd command on a 12 disk dm RAID0 stripe: # dd if=/dev/zero of=/mnt/scratch/fred bs=1024k count=10k; sync http://oss.sgi.com/~dgc/writes/ext3_write.png http://oss.sgi.com/~dgc/writes/xfs_write.png You can see from the ext3 graph that it comes to a screeching halt every 5s (probably when pdflush runs) and at all other times the seek rate is >10,000 seeks/s. That's pretty bad for a brand new, empty filesystem and the only way it is sustained is the fact that the disks have their write caches turned on. ext4 will probably show better results, but I haven't got any of the tools installed to be able to test it.... The XFS pattern shows consistently an order of magnitude less seeks and consistent throughput above 600MB/s. To put the number of seeks in context, XFS is doing 512k I/Os at about 1200-1300 per second. The number of seeks? A bit above 10^3 per second or roughly 1 seek per I/O which is pretty much optimal. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 26+ messages in thread
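The figures in the message above are consistent with each other, which a few lines of arithmetic can confirm. This is only a sketch of the calculation: the I/O size and I/O rates come from the message, while the seek rate of 1000/s is a lower-bound reading of "a bit above 10^3" chosen for illustration, not a measured value.

```python
IO_SIZE_KIB = 512  # XFS I/O size reported in the message


def throughput_mib_s(ios_per_sec: float) -> float:
    """Throughput in MiB/s for a given rate of 512 KiB I/Os."""
    return ios_per_sec * IO_SIZE_KIB / 1024


def seeks_per_io(seeks_per_sec: float, ios_per_sec: float) -> float:
    """Average number of seeks issued per I/O."""
    return seeks_per_sec / ios_per_sec


for rate in (1200, 1300):
    print(rate, "IO/s ->", throughput_mib_s(rate), "MiB/s")

# Illustrative lower bound for the XFS seek rate (assumption, see above).
print(round(seeks_per_io(1000, 1250), 2), "seeks per I/O")
```

At 1200 to 1300 I/Os per second the throughput works out to 600 to 650 MiB/s, matching the "consistent throughput above 600MB/s" read off the graph.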
* Re: 3ware 9650 tips 2007-07-16 12:22 ` David Chinner @ 2007-07-16 12:39 ` Bernd Schubert 2007-07-16 15:50 ` Eric Sandeen 1 sibling, 0 replies; 26+ messages in thread From: Bernd Schubert @ 2007-07-16 12:39 UTC (permalink / raw) To: David Chinner Cc: Justin Piszcz, Jon Collette, Joshua Baker-LePain, linux-ide-arrays, linux-raid, xfs On Monday 16 July 2007 14:22:25 David Chinner wrote: > You can see from the ext3 graph that it comes to a screeching halt > every 5s (probably when pdflush runs) and at all other times the > seek rate is >10,000 seeks/s. That's pretty bad for a brand new, > empty filesystem and the only way it is sustained is the fact that > the disks have their write caches turned on. ext4 will probably show > better results, but I haven't got any of the tools installed to be > able to test it.... I recently did some filesystem throughput tests; you can find the results here: http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/lustre/performance/ ldiskfs is ext3+extents+mballoc+some-smaller-patches, so it is almost ext4 (delayed allocation is still missing, but the clusterfs/lustre people didn't port it and I'm afraid of hard-to-detect filesystem corruption if I include it myself). Write performance is still slower than with xfs, and I'm seriously considering trying xfs in lustre. Cheers, Bernd -- Bernd Schubert Q-Leap Networks GmbH ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-16 12:22 ` David Chinner 2007-07-16 12:39 ` Bernd Schubert @ 2007-07-16 15:50 ` Eric Sandeen 2007-07-16 22:21 ` David Chinner 1 sibling, 1 reply; 26+ messages in thread From: Eric Sandeen @ 2007-07-16 15:50 UTC (permalink / raw) To: David Chinner Cc: Justin Piszcz, Jon Collette, Joshua Baker-LePain, linux-ide-arrays, linux-raid, xfs David Chinner wrote: > On Mon, Jul 16, 2007 at 12:41:15PM +1000, David Chinner wrote: >> On Fri, Jul 13, 2007 at 03:36:46PM -0400, Justin Piszcz wrote: ... >> If you've got any sort of serious disk array, ext3 is not the filesystem >> to use.... > > To show what the difference is, I used blktrace and Chris Mason's > seekwatcher script on a simple, single threaded dd command on > a 12 disk dm RAID0 stripe: > > # dd if=/dev/zero of=/mnt/scratch/fred bs=1024k count=10k; sync > > http://oss.sgi.com/~dgc/writes/ext3_write.png > http://oss.sgi.com/~dgc/writes/xfs_write.png Were those all with default mkfs & mount options? ext3 in writeback mode might be an interesting comparison too. -Eric ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-16 15:50 ` Eric Sandeen @ 2007-07-16 22:21 ` David Chinner 0 siblings, 0 replies; 26+ messages in thread From: David Chinner @ 2007-07-16 22:21 UTC (permalink / raw) To: Eric Sandeen Cc: David Chinner, Justin Piszcz, Jon Collette, Joshua Baker-LePain, linux-ide-arrays, linux-raid, xfs On Mon, Jul 16, 2007 at 10:50:34AM -0500, Eric Sandeen wrote: > David Chinner wrote: > > On Mon, Jul 16, 2007 at 12:41:15PM +1000, David Chinner wrote: > >> On Fri, Jul 13, 2007 at 03:36:46PM -0400, Justin Piszcz wrote: > ... > >> If you've got any sort of serious disk array, ext3 is not the filesystem > >> to use.... > > > > To show what the difference is, I used blktrace and Chris Mason's > > seekwatcher script on a simple, single threaded dd command on > > a 12 disk dm RAID0 stripe: > > > > # dd if=/dev/zero of=/mnt/scratch/fred bs=1024k count=10k; sync > > > > http://oss.sgi.com/~dgc/writes/ext3_write.png > > http://oss.sgi.com/~dgc/writes/xfs_write.png > > Were those all with default mkfs & mount options? ext3 in writeback > mode might be an interesting comparison too. Defaults. i.e. # mkfs.ext3 /dev/mapper/dm0 # mkfs.xfs /dev/mapper/dm0 The mkfs.xfs picked up sunit/swidth correctly from the dm volume. Last time I checked, writeback made little difference to ext3 throughput; maybe 5-10% at most. I'll run it again later today... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-16 2:41 ` David Chinner 2007-07-16 12:22 ` David Chinner @ 2007-07-16 15:43 ` Joshua Baker-LePain 2007-07-16 17:15 ` [Advocacy] " Bryan J. Smith 2007-07-16 17:34 ` Stuart Levy 1 sibling, 2 replies; 26+ messages in thread From: Joshua Baker-LePain @ 2007-07-16 15:43 UTC (permalink / raw) To: David Chinner Cc: Justin Piszcz, Jon Collette, linux-ide-arrays, linux-raid, xfs On Mon, 16 Jul 2007 at 12:41pm, David Chinner wrote > If you've got any sort of serious disk array, ext3 is not the filesystem > to use.... I do so wish that RedHat shared this view... -- Joshua Baker-LePain Department of Biomedical Engineering Duke University ^ permalink raw reply [flat|nested] 26+ messages in thread
* [Advocacy] Re: 3ware 9650 tips 2007-07-16 15:43 ` Joshua Baker-LePain @ 2007-07-16 17:15 ` Bryan J. Smith 2007-07-16 17:40 ` Al Boldi 2007-07-16 17:34 ` Stuart Levy 1 sibling, 1 reply; 26+ messages in thread From: Bryan J. Smith @ 2007-07-16 17:15 UTC (permalink / raw) To: Joshua Baker-LePain Cc: David Chinner, Justin Piszcz, Jon Collette, linux-ide-arrays, linux-raid, xfs Off-topic, advocacy-level response ... On Mon, 2007-07-16 at 11:43 -0400, Joshua Baker-LePain wrote: > I do so wish that RedHat shared this view... I've been trying to convince them since Red Hat Linux 7 (and, later, 9) that they need to realize the limits of Ext3 at the enterprise end of the scalability spectrum -- you know, that whole market they are seemingly saying they are the king of and a replacement for Sun? ;-> The problem with Red Hat is that when anyone brings up an alternative to Ext3, Red Hat falls back to arguments against other filesystems, which is rather easy given the various compatibility issues with JFS (ported from OS/2, requiring a lot of inode compatibility hacks -- don't get me started with my experiences) and ReiserFS (utter lack of inode compatibility in structures, requiring kernel-level emulation, etc... that never seems to work, regardless of what the advocates say, let alone the almost always "out-of-sync" off-line repair tools). But when you bring up XFS and its history of a stable, but advanced inode structure, quota support from day 1, POSIX ACLs from nearly day 1, and all the SGI team put into 2.5.3+ that is now stock kernel, they still try to dance. One thing I always get is "oh, its extents don't perform well for /tmp or /var" or countless other arguments, to which I merely respond, "all the more reason to use Ext3 for those few filesystems, and XFS when Ext3 doesn't scale -- like for large /home, /export, etc... filesystems."
No matter how many times I put forth the argument that XFS complements Ext3, they seem to treat it as yet another JFS/ReiserFS argument. Hopeless? -- Bryan "one of the reasons I still deploy Solaris instead of RHEL for fileservers, even though RHL7+XFS and RHL9+XFS rocked (and are still rocking!)" Smith -- Bryan J. Smith Professional, Technical Annoyance mailto:b.j.smith@ieee.org http://thebs413.blogspot.com -------------------------------------------------------- Fission Power: An Inconvenient Solution ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Advocacy] Re: 3ware 9650 tips 2007-07-16 17:15 ` [Advocacy] " Bryan J. Smith @ 2007-07-16 17:40 ` Al Boldi 2007-07-16 17:48 ` Matthew Wilcox 0 siblings, 1 reply; 26+ messages in thread From: Al Boldi @ 2007-07-16 17:40 UTC (permalink / raw) To: Bryan J. Smith, Joshua Baker-LePain Cc: David Chinner, Justin Piszcz, Jon Collette, linux-ide-arrays, linux-raid, xfs, linux-fsdevel Bryan J. Smith wrote: > Off-topic, advocacy-level response ... > > On Mon, 2007-07-16 at 11:43 -0400, Joshua Baker-LePain wrote: > > I do so wish that RedHat shared this view... > > I've been trying to convince them since Red Hat Linux 7 (and, later, 9) > that they need to realize the limits of Ext3 at the enterprise end of > the scalability spectrum -- you know, that whole market they are > seemingly saying they are the king of and a replacement for Sun? ;-> > > The problem with Red Hat is that when anyone brings up an alternative to > Ext3, Red Hat falls back to arguments against other filesystems, which > is rather easy given the various compatibility issues with JFS (ported > from OS/2, requiring a lot of inode compatibility hacks -- don't get me > started with my experiences) and ReiserFS (utter lack of inode > compatibility in structures, requiring kernel-level emulation, etc... > that never seems to work, regardless of what the advocates say, let > alone the almsota always "out-of-sync" off-line repair tools). > > But when you bring up XFS and its history of a stable, but advanced > inode structure, quota support from day 1, POSIX ACLs from nearly day 1, > and all the SGI team put into 2.5.3+ that is now stock kernel, they > still try to dance. One thing I always get is "oh, its extents don't > perform well for /tmp or /var" or countless other arguments, of which I > merely respond, "all the more reason to use Ext3 for those few > filesystems, and XFS when Ext3 doesn't scale -- like for > large /home, /export, etc... filesystems." 
No matter how many times I > put forth the argument that XFS complements Ext3, they seem to treat it > as yet another JFS/ReiserFS argument. > > Hopeless? > > -- Bryan "one of the reasons I still deploy Solaris instead of RHEL for > fileservers, even though RHL7+XFS and RHL9+XFS rocked (and are still > rocking!)" Smith XFS surely rocks, but it's missing one critical component: data=ordered And that's one component that's just too critical to overlook for an enterprise environment that is built on data-integrity over performance. So that's the secret why people still use ext3, and XFS' reliance on external hardware to ensure integrity is really misplaced. Now, maybe when we get the data=ordered onto the VFS level, then maybe XFS may become viable for the enterprise, and ext3 may cease to be KING. Thanks! -- Al ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Advocacy] Re: 3ware 9650 tips 2007-07-16 17:40 ` Al Boldi @ 2007-07-16 17:48 ` Matthew Wilcox 2007-07-16 18:28 ` [RFC] VFS: data=ordered (was: [Advocacy] Re: 3ware 9650 tips) Al Boldi 2007-07-16 18:38 ` [Advocacy] Re: 3ware 9650 tips Bryan J. Smith 0 siblings, 2 replies; 26+ messages in thread From: Matthew Wilcox @ 2007-07-16 17:48 UTC (permalink / raw) To: Al Boldi Cc: Bryan J. Smith, Joshua Baker-LePain, David Chinner, Justin Piszcz, Jon Collette, linux-ide-arrays, linux-raid, xfs, linux-fsdevel On Mon, Jul 16, 2007 at 08:40:00PM +0300, Al Boldi wrote: > XFS surely rocks, but it's missing one critical component: data=ordered > And that's one component that's just too critical to overlook for an > enterprise environment that is built on data-integrity over performance. > > So that's the secret why people still use ext3, and XFS' reliance on external > hardware to ensure integrity is really misplaced. > > Now, maybe when we get the data=ordered onto the VFS level, then maybe XFS > may become viable for the enterprise, and ext3 may cease to be KING. Wow, thanks for bringing an advocacy thread onto linux-fsdevel. Just what we wanted. Do you have any insight into how to "get the data=ordered onto the VFS level"? Because to me, that sounds like pure nonsense. -- "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 26+ messages in thread
* [RFC] VFS: data=ordered (was: [Advocacy] Re: 3ware 9650 tips) 2007-07-16 17:48 ` Matthew Wilcox @ 2007-07-16 18:28 ` Al Boldi 2007-07-16 19:02 ` Matthew Wilcox 2007-07-16 18:38 ` [Advocacy] Re: 3ware 9650 tips Bryan J. Smith 1 sibling, 1 reply; 26+ messages in thread From: Al Boldi @ 2007-07-16 18:28 UTC (permalink / raw) To: Matthew Wilcox Cc: Bryan J. Smith, Joshua Baker-LePain, David Chinner, Justin Piszcz, Jon Collette, linux-ide-arrays, linux-raid, xfs, linux-fsdevel Matthew Wilcox wrote: > On Mon, Jul 16, 2007 at 08:40:00PM +0300, Al Boldi wrote: > > XFS surely rocks, but it's missing one critical component: data=ordered > > And that's one component that's just too critical to overlook for an > > enterprise environment that is built on data-integrity over performance. > > > > So that's the secret why people still use ext3, and XFS' reliance on > > external hardware to ensure integrity is really misplaced. > > > > Now, maybe when we get the data=ordered onto the VFS level, then maybe > > XFS may become viable for the enterprise, and ext3 may cease to be KING. > > Wow, thanks for bringing an advocacy thread onto linux-fsdevel. Just what > we wanted. Do you have any insight into how to "get the data=ordered > onto the VFS level"? Because to me, that sounds like pure nonsense. Well, conceptually it sounds like a piece of cake, technically your guess is as good as mine. IIRC, akpm once mentioned something like this. But seriously, can you think of a technical reason why it shouldn't be possible to abstract data=ordered mode out into the VFS? Thanks! -- Al ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC] VFS: data=ordered (was: [Advocacy] Re: 3ware 9650 tips) 2007-07-16 18:28 ` [RFC] VFS: data=ordered (was: [Advocacy] Re: 3ware 9650 tips) Al Boldi @ 2007-07-16 19:02 ` Matthew Wilcox 0 siblings, 0 replies; 26+ messages in thread From: Matthew Wilcox @ 2007-07-16 19:02 UTC (permalink / raw) To: Al Boldi Cc: Bryan J. Smith, Joshua Baker-LePain, David Chinner, Justin Piszcz, Jon Collette, linux-ide-arrays, linux-raid, xfs, linux-fsdevel On Mon, Jul 16, 2007 at 09:28:08PM +0300, Al Boldi wrote: > Well, conceptually it sounds like a piece of cake, technically your guess is > as good as mine. IIRC, akpm once mentioned something like this. How much have you looked at the VFS? There's nothing journalling-related in the VFS right now. ext3 and XFS share no common journalling code, nor do I think that would be possible, due to the very different concepts they have of journalling. Here's a good hint: $ find fs -type f |xargs grep -l journal_start fs/ext3/acl.c fs/ext3/inode.c fs/ext3/ioctl.c fs/ext3/namei.c fs/ext3/resize.c fs/ext3/super.c fs/ext3/xattr.c fs/ext4/acl.c fs/ext4/extents.c fs/ext4/inode.c fs/ext4/ioctl.c fs/ext4/namei.c fs/ext4/resize.c fs/ext4/super.c fs/ext4/xattr.c fs/jbd/journal.c fs/jbd/transaction.c fs/jbd2/journal.c fs/jbd2/transaction.c fs/ocfs2/journal.c fs/ocfs2/super.c JBD and JBD2 provide a journalling implementation that ext3, ext4 and ocfs2 use. Note that XFS doesn't, it has its own journalling code. If you want XFS to support data=ordered, talk to the XFS folks. Or start picking through XFS yourself, of course -- you do have the source code. -- "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 26+ messages in thread
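The `find fs -type f | xargs grep -l journal_start` check above simply lists every file under a directory that mentions a given string. For readers without a shell at hand, here is a small Python sketch of the same search. The function name is hypothetical, and the `fs` directory in the usage note assumes the current directory is a kernel source tree.

```python
import os


def files_mentioning(root: str, needle: str) -> list:
    """Return sorted paths of regular files under root that contain needle."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Ignore undecodable bytes so binary files do not abort the scan.
                with open(path, "r", errors="ignore") as handle:
                    if any(needle in line for line in handle):
                        matches.append(path)
            except OSError:
                continue  # unreadable file: skip it, as grep would warn and move on
    return sorted(matches)
```

Calling `files_mentioning("fs", "journal_start")` from the top of a kernel tree should list the same kind of files as the shell pipeline, which makes it easy to see which filesystems go through JBD and which, like XFS, do not.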
* Re: [Advocacy] Re: 3ware 9650 tips 2007-07-16 17:48 ` Matthew Wilcox 2007-07-16 18:28 ` [RFC] VFS: data=ordered (was: [Advocacy] Re: 3ware 9650 tips) Al Boldi @ 2007-07-16 18:38 ` Bryan J. Smith 1 sibling, 0 replies; 26+ messages in thread From: Bryan J. Smith @ 2007-07-16 18:38 UTC (permalink / raw) To: Matthew Wilcox Cc: Al Boldi, Joshua Baker-LePain, David Chinner, Justin Piszcz, Jon Collette, linux-ide-arrays, linux-raid, xfs, linux-fsdevel On Mon, 2007-07-16 at 11:48 -0600, Matthew Wilcox wrote: > Wow, thanks for bringing an advocacy thread onto linux-fsdevel. Just what > we wanted. Do you have any insight into how to "get the data=ordered > onto the VFS level"? Because to me, that sounds like pure nonsense. First off, I have no idea who decided to respond to my post and CC: linux-fsdevel on it. In retrospect, secondly, I should have not posted my post to linux-raid in the first place (is that list now mirrored to linux-fsdevel or something?). I was just sharing in my frustration of the lack of XFS support by Red Hat. So, lastly and in any case, my apologies to all, even if I did not proliferate it to linux-fsdevel, it was probably not ideal for me to post such to anything on vger.kernel.org (like linux-raid) in the first place. -- Bryan J. Smith Professional, Technical Annoyance mailto:b.j.smith@ieee.org http://thebs413.blogspot.com -------------------------------------------------------- Fission Power: An Inconvenient Solution ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-16 15:43 ` Joshua Baker-LePain 2007-07-16 17:15 ` [Advocacy] " Bryan J. Smith @ 2007-07-16 17:34 ` Stuart Levy 1 sibling, 0 replies; 26+ messages in thread From: Stuart Levy @ 2007-07-16 17:34 UTC (permalink / raw) To: Joshua Baker-LePain Cc: David Chinner, Justin Piszcz, Jon Collette, linux-ide-arrays, linux-raid, xfs On Mon, Jul 16, 2007 at 11:43:24AM -0400, Joshua Baker-LePain wrote: > On Mon, 16 Jul 2007 at 12:41pm, David Chinner wrote > > >If you've got any sort of serious disk array, ext3 is not the filesystem > >to use.... > > I do so wish that RedHat shared this view... So they support XFS in Fedora, but not in RHEL?? (I've been using Fedora...) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-13 18:35 ` 3ware 9650 tips Justin Piszcz 2007-07-13 18:54 ` Jon Collette @ 2007-07-13 19:04 ` Joshua Baker-LePain 2007-07-13 23:30 ` Michael Tokarev 2007-07-14 1:23 ` Andrew Klaassen 2007-07-14 9:04 ` Mikael Abrahamsson 3 siblings, 1 reply; 26+ messages in thread From: Joshua Baker-LePain @ 2007-07-13 19:04 UTC (permalink / raw) To: Justin Piszcz; +Cc: linux-ide-arrays, linux-raid On Fri, 13 Jul 2007 at 2:35pm, Justin Piszcz wrote > On Fri, 13 Jul 2007, Joshua Baker-LePain wrote: > >> My new system has a 3ware 9650SE-24M8 controller hooked to 24 500GB WD >> drives. The controller is set up as a RAID6 w/ a hot spare. OS is CentOS >> 5 x86_64. It's all running on a couple of Xeon 5130s on a Supermicro X7DBE >> motherboard w/ 4GB of RAM. >> >> Trying to stick with a supported config as much as possible, I need to run >> ext3. As per usual, though, initial ext3 numbers are less than impressive. >> Using bonnie++ to get a baseline, I get (after doing 'blockdev --setra >> 65536' on the device): >> Write: 136MB/s >> Read: 384MB/s >> >> Proving it's not the hardware, with XFS the numbers look like: >> Write: 333MB/s >> Read: 465MB/s >> >> How many folks are using these? Any tuning tips? >> >> Thanks. > > You are using HW RAID then? Those numbers seem pretty awful for that > setup, including linux-raid@ even it though it appears you're running HW > raid, > this is rather peculiar. Yep, hardware RAID -- I need the hot swappability (which, AFAIK, is still an issue with md). -- Joshua Baker-LePain Department of Biomedical Engineering Duke University ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-13 19:04 ` Joshua Baker-LePain @ 2007-07-13 23:30 ` Michael Tokarev 0 siblings, 0 replies; 26+ messages in thread From: Michael Tokarev @ 2007-07-13 23:30 UTC (permalink / raw) To: Joshua Baker-LePain; +Cc: Justin Piszcz, linux-ide-arrays, linux-raid Joshua Baker-LePain wrote: [] > Yep, hardware RAID -- I need the hot swappability (which, AFAIK, is > still an issue with md). Just out of curiosity, what do you mean by "swappability"? We have been using Linux software RAID for many years and have had no problems with "swappability" of the component drives (in case of drive failures and whatnot). With non-hotswappable drives (old SCSI and IDE ones), a reboot is needed for the system to recognize the drives. With modern SAS/SATA drives, I can replace a faulty drive without anyone noticing... Maybe you're referring to something else? Thanks. /mjt ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-13 18:35 ` 3ware 9650 tips Justin Piszcz 2007-07-13 18:54 ` Jon Collette 2007-07-13 19:04 ` Joshua Baker-LePain @ 2007-07-14 1:23 ` Andrew Klaassen 2007-07-14 8:08 ` Justin Piszcz 2007-07-14 9:04 ` Mikael Abrahamsson 3 siblings, 1 reply; 26+ messages in thread From: Andrew Klaassen @ 2007-07-14 1:23 UTC (permalink / raw) To: Justin Piszcz, Joshua Baker-LePain; +Cc: linux-ide-arrays, linux-raid --- Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > To give you an example I get 464MB/s write and > 627MB/s with a 10 disk > raptor software raid5. Is that with the 9650? Andrew ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-14 1:23 ` Andrew Klaassen @ 2007-07-14 8:08 ` Justin Piszcz 2007-07-14 16:10 ` Andrew Klaassen 0 siblings, 1 reply; 26+ messages in thread From: Justin Piszcz @ 2007-07-14 8:08 UTC (permalink / raw) To: Andrew Klaassen; +Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid On Fri, 13 Jul 2007, Andrew Klaassen wrote: > --- Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > >> To give you an example I get 464MB/s write and >> 627MB/s with a 10 disk >> raptor software raid5. > > Is that with the 9650? > > Andrew > > Sorry, no, it's with software RAID 5 on the 965 chipset plus three SATA PCI-e cards. Justin. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-14 8:08 ` Justin Piszcz @ 2007-07-14 16:10 ` Andrew Klaassen 2007-07-14 16:11 ` Justin Piszcz 0 siblings, 1 reply; 26+ messages in thread From: Andrew Klaassen @ 2007-07-14 16:10 UTC (permalink / raw) To: Justin Piszcz; +Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid --- Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > > > On Fri, 13 Jul 2007, Andrew Klaassen wrote: > > > --- Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > > > >> To give you an example I get 464MB/s write and > >> 627MB/s with a 10 disk > >> raptor software raid5. > > > > Is that with the 9650? > > > > Andrew > > > > > > Sorry no, its with software raid 5 and the 965 > chipset + three SATA PCI-e > cards. Which cards? Those are pretty good numbers, so I'm interested. Andrew ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-14 16:10 ` Andrew Klaassen @ 2007-07-14 16:11 ` Justin Piszcz 2007-07-14 16:14 ` Andrew Klaassen 0 siblings, 1 reply; 26+ messages in thread From: Justin Piszcz @ 2007-07-14 16:11 UTC (permalink / raw) To: Andrew Klaassen; +Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid On Sat, 14 Jul 2007, Andrew Klaassen wrote: > --- Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > >> >> >> On Fri, 13 Jul 2007, Andrew Klaassen wrote: >> >>> --- Justin Piszcz <jpiszcz@lucidpixels.com> wrote: >>> >>>> To give you an example I get 464MB/s write and >>>> 627MB/s with a 10 disk >>>> raptor software raid5. >>> >>> Is that with the 9650? >>> >>> Andrew >>> >>> >> >> Sorry no, its with software raid 5 and the 965 >> chipset + three SATA PCI-e >> cards. > > Which cards? Those are pretty good numbers, so I'm > interested. > > Andrew 03:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01) $19.99 2 port SYBA cards (Silicon Image 3132s) http://www.directron.com/sdsa2pex2ir.html ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-14 16:11 ` Justin Piszcz @ 2007-07-14 16:14 ` Andrew Klaassen 2007-07-14 16:18 ` Justin Piszcz 0 siblings, 1 reply; 26+ messages in thread From: Andrew Klaassen @ 2007-07-14 16:14 UTC (permalink / raw) To: Justin Piszcz; +Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid --- Justin Piszcz <jpiszcz@lucidpixels.com> wrote: > 03:00.0 RAID bus controller: Silicon Image, Inc. SiI > 3132 Serial ATA Raid > II Controller (rev 01) > > $19.99 2 port SYBA cards (Silicon Image 3132s) > > http://www.directron.com/sdsa2pex2ir.html Cool, thanks. What are your bonnie++ rewrite numbers? Andrew ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: 3ware 9650 tips 2007-07-14 16:14 ` Andrew Klaassen @ 2007-07-14 16:18 ` Justin Piszcz 0 siblings, 0 replies; 26+ messages in thread From: Justin Piszcz @ 2007-07-14 16:18 UTC (permalink / raw) To: Andrew Klaassen; +Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid

On Sat, 14 Jul 2007, Andrew Klaassen wrote:

> --- Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>
>> 03:00.0 RAID bus controller: Silicon Image, Inc. SiI
>> 3132 Serial ATA Raid
>> II Controller (rev 01)
>>
>> $19.99 2 port SYBA cards (Silicon Image 3132s)
>>
>> http://www.directron.com/sdsa2pex2ir.html
>
> Cool, thanks.
>
> What are your bonnie++ rewrite numbers?
>
> Andrew

3 runs:

p34,16G,77169,99,413276,80,155348,26,78932,99,535482,41,607.0,0,16:100000:16/64,1500,12,4886,15,1790,16,1821,17,6081,19,2159,19
p34,16G,77659,99,451593,87,167267,28,79058,99,584310,45,613.1,0,16:100000:16/64,1843,15,6006,31,1325,11,1204,12,3629,12,3324,31
p34,16G,77873,99,441881,87,166384,28,75182,99,566384,43,619.4,0,16:100000:16/64,1537,13,4474,15,1827,18,880,8,7658,22,3864,36

avg:

p34,16G,77567,99,435583,84.6667,163000,27.3333,77724,99,562059,43,613.167,0,16:100000:16/64,1626.67,13.3333,5122,20.3333,1647.33,15,1301.67,12.3333,5789.33,17.6667,3115.67,28.6667

rewrite: 163000 KiB/s

When tarring 4.4GB (of backup files) it takes about 20 seconds on XFS.

Seems to vary as I change my configuration a lot:

Here is from a while back:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
p34-raid5    15696M 73755  99 411445 75 198639 34 78721  99 483169 39 584.8   0
                    ------Sequential Create------ --------Random Create--------
p34-raid5,15696M,73755,99,411445,75,198639,34,78721,99,483169,39,584.8,0,16:100000:16/64,919,8,9940,28,2841,18,922,8,3225,10,2422,18

Justin.

^ permalink raw reply [flat|nested] 26+ messages in thread
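The averages Justin quotes can be reproduced from his three bonnie++ CSV lines. A minimal Python sketch, not from the thread: the field positions are an assumption read off the bonnie++ 1.03 table in the same message (block write, rewrite and block read as the 5th, 7th and 11th comma-separated fields), and the names are made up for illustration.

```python
# The three CSV result lines from Justin's message, copied verbatim.
RUNS = [
    "p34,16G,77169,99,413276,80,155348,26,78932,99,535482,41,607.0,0,"
    "16:100000:16/64,1500,12,4886,15,1790,16,1821,17,6081,19,2159,19",
    "p34,16G,77659,99,451593,87,167267,28,79058,99,584310,45,613.1,0,"
    "16:100000:16/64,1843,15,6006,31,1325,11,1204,12,3629,12,3324,31",
    "p34,16G,77873,99,441881,87,166384,28,75182,99,566384,43,619.4,0,"
    "16:100000:16/64,1537,13,4474,15,1827,18,880,8,7658,22,3864,36",
]

# Assumed 0-based field positions (K/sec columns of the 1.03 layout).
BLOCK_WRITE, REWRITE, BLOCK_READ = 4, 6, 10


def mean_field(lines, index):
    """Average one numeric CSV field across several bonnie++ runs."""
    values = [float(line.split(",")[index]) for line in lines]
    return sum(values) / len(values)


if __name__ == "__main__":
    for label, index in (("block write", BLOCK_WRITE),
                         ("rewrite", REWRITE),
                         ("block read", BLOCK_READ)):
        print(f"{label}: {mean_field(RUNS, index):.0f} K/sec")
```

Rounded to whole K/sec, the three means match the "avg:" line in the message (435583, 163000 and 562059).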
* Re: 3ware 9650 tips 2007-07-13 18:35 ` 3ware 9650 tips Justin Piszcz ` (2 preceding siblings ...) 2007-07-14 1:23 ` Andrew Klaassen @ 2007-07-14 9:04 ` Mikael Abrahamsson 2007-07-14 16:11 ` Andrew Klaassen 3 siblings, 1 reply; 26+ messages in thread From: Mikael Abrahamsson @ 2007-07-14 9:04 UTC (permalink / raw) To: Justin Piszcz; +Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid On Fri, 13 Jul 2007, Justin Piszcz wrote: > You are using HW RAID then? Those numbers seem pretty awful for that > setup, including linux-raid@ even it though it appears you're running HW > raid, this is rather peculiar. No, it has been discussed numerous times on this list. SW raid is faster because it has access to (often) gigabytes of block cache, which the HW raid controller doesn't have. SW raid is therefore able to avoid a lot of reads when it needs to write, speeding things up considerably. I always use 3ware HW-raid though as I consider it more reliable. Since most of my access is "write once, read many" write speed isn't as important to me as data integrity. Take your 3ware HW-raid, do a dd (read or write) to the device and see it being very quick (because it can fit all the data into its cache as it either reads or writes), then put a filesystem on it and do writes there, especially journaled writes, and see write speed go down to 1/10 or so. -- Mikael Abrahamsson email: swmike@swm.pp.se ^ permalink raw reply [flat|nested] 26+ messages in thread
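Mikael's point about block cache can be illustrated with a toy count of disk operations for a small (sub-stripe) write on a parity RAID. This is a simplified textbook model, not a description of how md or the 3ware firmware actually schedule I/O, and the function name is made up for illustration.

```python
from typing import Tuple


def small_write_ios(parity_disks: int, old_blocks_cached: bool) -> Tuple[int, int]:
    """Return (reads, writes) for updating one data chunk in a stripe.

    Read-modify-write: the old data chunk and each old parity chunk are
    needed to compute the new parity, then data and parity are rewritten.
    If the old blocks are already in cache, the reads are skipped.
    """
    writes = 1 + parity_disks
    reads = 0 if old_blocks_cached else 1 + parity_disks
    return reads, writes


if __name__ == "__main__":
    for name, parity in (("RAID5", 1), ("RAID6", 2)):
        for cached in (False, True):
            reads, writes = small_write_ios(parity, cached)
            print(f"{name} cached={cached}: {reads} reads, {writes} writes")
```

In this model an uncached small write costs 4 operations on RAID5 and 6 on RAID6, while a cache hit on the old blocks leaves only the writes. That is the mechanism Mikael credits for software RAID's advantage when the host has gigabytes of RAM available as block cache.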
* Re: 3ware 9650 tips 2007-07-14 9:04 ` Mikael Abrahamsson @ 2007-07-14 16:11 ` Andrew Klaassen 0 siblings, 0 replies; 26+ messages in thread From: Andrew Klaassen @ 2007-07-14 16:11 UTC (permalink / raw) To: Mikael Abrahamsson, Justin Piszcz Cc: Joshua Baker-LePain, linux-ide-arrays, linux-raid --- Mikael Abrahamsson <swmike@swm.pp.se> wrote: > Take your 3ware HW-raid, do a dd (read or write) to > the device and see it > being very quick (because it can fit all the data > into its cache as it > either reads or writes), then put a filesystem on it > and do writes there, > especially journaled writes, and see write speed go > down to 1/10 or so. How does non-cached performance tend to compare? Andrew ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2007-07-16 22:21 UTC | newest]
Thread overview: 26+ messages
[not found] <alpine.LRH.0.999.0707131356520.25773@chaos.egr.duke.edu>
2007-07-13 18:35 ` 3ware 9650 tips Justin Piszcz
2007-07-13 18:54 ` Jon Collette
2007-07-13 19:36 ` Justin Piszcz
2007-07-16 2:41 ` David Chinner
2007-07-16 12:22 ` David Chinner
2007-07-16 12:39 ` Bernd Schubert
2007-07-16 15:50 ` Eric Sandeen
2007-07-16 22:21 ` David Chinner
2007-07-16 15:43 ` Joshua Baker-LePain
2007-07-16 17:15 ` [Advocacy] " Bryan J. Smith
2007-07-16 17:40 ` Al Boldi
2007-07-16 17:48 ` Matthew Wilcox
2007-07-16 18:28 ` [RFC] VFS: data=ordered (was: [Advocacy] Re: 3ware 9650 tips) Al Boldi
2007-07-16 19:02 ` Matthew Wilcox
2007-07-16 18:38 ` [Advocacy] Re: 3ware 9650 tips Bryan J. Smith
2007-07-16 17:34 ` Stuart Levy
2007-07-13 19:04 ` Joshua Baker-LePain
2007-07-13 23:30 ` Michael Tokarev
2007-07-14 1:23 ` Andrew Klaassen
2007-07-14 8:08 ` Justin Piszcz
2007-07-14 16:10 ` Andrew Klaassen
2007-07-14 16:11 ` Justin Piszcz
2007-07-14 16:14 ` Andrew Klaassen
2007-07-14 16:18 ` Justin Piszcz
2007-07-14 9:04 ` Mikael Abrahamsson
2007-07-14 16:11 ` Andrew Klaassen