* Linux MD? Or an H710p?

From: Steve Bergman @ 2013-10-20 0:49 UTC
To: linux-raid

Hello,

I'm configuring a PowerEdge R520 that I'll be installing RHEL 6.4 on next month. (Actually, Scientific Linux 6.4) I'll be upgrading to RHEL (SL) 7 when it's available, which is looking like it might default to XFS.

This will be a 6 drive RAID10 set up for ~100 Gnome (freenx) desktop users and a virtual Windows 2008 Server guest running MS-SQL, so there is plenty of opportunity for i/o parallelism. This seems a good fit for XFS.

My preference would be to use Linux MD RAID10. But the Dell configurator seems strongly inclined to force me towards hardware RAID.

My choices would be to get a PERC H310 controller that I don't need, plus a SAS controller that the drives would actually connect to, and use Linux md. Or I can go with a PERC H710p w/1GB NV cache running hardware RAID10. (Dell says their RAID cards have to function as RAID controllers, and cannot act as simple SAS controllers.)

I also have a choice between 600GB 15k drives and 600GB 10k "HYB CARR" drives, which I take to be 2.5" hybrid SSD/Rotational drives in a 3.5" mounting adapter.

Any comments on any of this? This is a bit fancier than what I usually configure. And I'm not sure what the performance and operational differences would be. I know that I'm familiar with Linux's software RAID tools. And I know I like the way I can replace a drive and have it sync up transparently in the background while the server is operational. I don't yet know if I can do that with the H710p card. I also like how I just *know* that XFS is configuring stride, etc. properly with MD. With the H710p, I don't know what, if anything, the card is telling the OS about the underlying RAID configuration. I also just plain like MD.

I like the 1GB NV cache I get if I go hardware RAID, which I don't get with the simple SAS controller. (I could turn off barriers.) I also like the fact that it seems a more standard Dell configuration. (They won't even connect the drives to the SAS controller at the factory.)

Any general guidance would be appreciated. We'll probably be keeping this server for 7 years, and it's pretty important to us. So I'm really wanting to get this right.

Thanks,
Steve Bergman
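For reference, the md side of what Steve describes would look roughly like the sketch below. It is not taken from the thread; the device names and the 512KiB chunk size are assumptions. mkfs.xfs normally reads the stripe geometry straight from the md device, which is the "stride" behaviour Steve refers to.

    # 6-drive md RAID10 (device names are placeholders)
    mdadm --create /dev/md0 --level=10 --raid-devices=6 /dev/sd[b-g]

    # mkfs.xfs picks up the chunk size and data-disk count from md
    # and sets sunit/swidth accordingly
    mkfs.xfs /dev/md0

    # the same geometry stated by hand: 512KiB chunk, 3 data spindles
    mkfs.xfs -f -d su=512k,sw=3 /dev/md0

With a hardware RAID controller the geometry is not exported to the kernel, so equivalent su/sw values would have to be given explicitly, as in the last line.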
* Re: Linux MD? Or an H710p?

From: Stan Hoeppner @ 2013-10-20 7:37 UTC
To: Steve Bergman, linux-raid

On 10/19/2013 7:49 PM, Steve Bergman wrote:
> Hello,
>
> I'm configuring a PowerEdge R520 that I'll be installing RHEL 6.4 on next month. (Actually, Scientific Linux 6.4) I'll be upgrading to RHEL (SL) 7 when it's available, which is looking like it might default to XFS.
>
> This will be a 6 drive RAID10 set up for ~100 Gnome (freenx) desktop users and a virtual Windows 2008 Server guest running MS-SQL, so there is plenty of opportunity for i/o parallelism. This seems a good fit for XFS.
>
> My preference would be to use Linux MD RAID10. But the Dell configurator seems strongly inclined to force me towards hardware RAID.
>
> My choices would be to get a PERC H310 controller that I don't need, plus a SAS controller that the drives would actually connect to, and use Linux md. Or I can go with a PERC H710p w/1GB NV cache running hardware RAID10. (Dell says their RAID cards have to function as RAID controllers, and cannot act as simple SAS controllers.)
>
> I also have a choice between 600GB 15k drives and 600GB 10k "HYB CARR" drives, which I take to be 2.5" hybrid SSD/Rotational drives in a 3.5" mounting adapter.
>
> Any comments on any of this? This is a bit fancier than what I usually configure. And I'm not sure what the performance and operational differences would be. I know that I'm familiar with Linux's software RAID tools. And I know I like the way I can replace a drive and have it sync up transparently in the background while the server is operational. I don't yet know if I can do that with the H710p card. I also like how I just *know* that XFS is configuring stride, etc. properly with MD. With the H710p, I don't know what, if anything, the card is telling the OS about the underlying RAID configuration. I also just plain like MD.
>
> I like the 1GB NV cache I get if I go hardware RAID, which I don't get with the simple SAS controller. (I could turn off barriers.) I also like the fact that it seems a more standard Dell configuration. (They won't even connect the drives to the SAS controller at the factory.)
>
> Any general guidance would be appreciated. We'll probably be keeping this server for 7 years, and it's pretty important to us. So I'm really wanting to get this right.

Do what everyone else does in this situation: buy the box with everything you want minus the disk controller. Purchase an LSI 9211-8i and cables, pop the lid and install it; takes 5 minutes tops. Runs $300 for the KIT, about $250 if you buy the OEM card and 2x .5M cables separately.

http://www.lsi.com/products/host-bus-adapters/pages/lsi-sas-9211-8i.aspx

-- 
Stan
* Re: Linux MD? Or an H710p?

From: Mikael Abrahamsson @ 2013-10-20 8:50 UTC
To: Steve Bergman; Cc: linux-raid

On Sat, 19 Oct 2013, Steve Bergman wrote:

> Any general guidance would be appreciated. We'll probably be keeping this server for 7 years, and it's pretty important to us. So I'm really wanting to get this right.

I prefer to use hardware RAID for the boot device, because it can be a mess to make sure grub can boot off either of the two boot drives, and to keep that true over time.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
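The usual workaround for an md-mirrored /boot is simply to install the boot loader on every member of the mirror, and to do it again after every drive swap. A minimal sketch for RHEL 6-era grub legacy; the device names are examples:

    # /boot lives on an md RAID1 built from sda1 and sdb1
    grub-install /dev/sda
    grub-install /dev/sdb
    # ...and again on the replacement drive after any disk swap

The "again after any disk swap" step is the part that tends to be forgotten over time, which is Mikael's point.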
* Re: Linux MD? Or an H710p?

From: John Stoffel @ 2013-10-21 14:18 UTC
To: Steve Bergman; Cc: linux-raid

Steve> I'm configuring a PowerEdge R520 that I'll be installing RHEL 6.4 on next month. (Actually, Scientific Linux 6.4) I'll be upgrading to RHEL (SL) 7 when it's available, which is looking like it might default to XFS.

Steve> This will be a 6 drive RAID10 set up for ~100 Gnome (freenx) desktop users and a virtual Windows 2008 Server guest running MS-SQL, so there is plenty of opportunity for i/o parallelism. This seems a good fit for XFS.

So are you keeping home directories on here as well? And how busy will the MS-SQL server be? That's probably where most of your IO will come from, I suspect.

Also, make sure you get lots of memory. The more your freenx server can cache in memory, the better things will be.

I also note that under CentOS 6.4, Firefox 22 has a tendency to grow without bound, sucking up all the memory and causing the system to bog down. I admit I'm reading email via OWA, using Service Now, and lots of tabs, but basically memory usage sucks. And I'm using freenx as well to access my desktop. I do admit I'm using a 3rd party repo, so I'm running: firefox-22.0-1.el6.remi.x86_64

Steve> My preference would be to use Linux MD RAID10. But the Dell configurator seems strongly inclined to force me towards hardware RAID.

Skip the configurator and just buy a controller 3rd hand.

Steve> My choices would be to get a PERC H310 controller that I don't need, plus a SAS controller that the drives would actually connect to, and use Linux md. Or I can go with a PERC H710p w/1GB NV cache running hardware RAID10. (Dell says their RAID cards have to function as RAID controllers, and cannot act as simple SAS controllers.)

Steve> I also have a choice between 600GB 15k drives and 600GB 10k "HYB CARR" drives, which I take to be 2.5" hybrid SSD/Rotational drives in a 3.5" mounting adapter.

Is your key metric latency, or throughput?

John
* Re: Linux MD? Or an H710p?

From: Steve Bergman @ 2013-10-22 0:36 UTC
Cc: linux-raid

First of all, thank you Stan, Mikael, and John for your replies.

Stan,

I had made a private bet with myself that Stan Hoeppner would be the first to respond to my query. And I was not disappointed. In fact, I was hoping for advice from you. We're getting the 7 yr hardware support contract from Dell, and I'm a little concerned about "finger-pointing" issues with regards to putting in a non-Dell SAS controller. Network card? No problem. But drive controller? Forgive me for "white-knuckling" on this a bit. But I have gotten an OK to order the server with both the H710p and the mystery "SAS 6Gbps HBA External Controller [$148.55]" for which no one at Dell seems to be able to tell me the pedigree. So I can configure both ways and see which I like. I do find that 1GB NV cache with barriers turned off to be attractive.

But hey, this is going to be a very nice opportunity for observing XFS's savvy with parallel i/o. And I'm looking forward to it. BTW, it's the problematic COBOL Point of Sale app that didn't do fsyncs that is being migrated to its Windows-only MS-SQL version in the virtualized instance of Windows 2008 Server. At least it will be a virtualized instance on this server if I get my way. Essentially, our core business is moving from Linux to Windows in this move. C'est la vie. I did my best. NCR won.

Mikael,

That's a good point. I know that at one time RHEL didn't get that right in its Grub config. I've been assuming that in 2013 it's a "taken for granted" thing, with the caveat that nothing involving the bootloader and boot sectors can ever be completely taken for granted.

John,

First, let me get an embarrassing misinterpretation out of the way. "HYB CARR" stands for "hybrid carrier", which is a fancy name for a 2.5" -> 3.5" drive mounting adapter.

Fortunately, this is a workload (varied as it is) with which I am extremely familiar. Yes, Firefox uses (abuses?) memory aggressively. But if necessary, I can control that with system-wide lockprefs. This server, which ended up being a Dell R720, will have an insane 256GB of memory in a mirrored configuration, resulting in an effective (and half as insane) 128GB visible to the OS. In 7 years' time that should seem only about 1/25th as insane. And we'll just have to see about the 50% memory bandwidth hit we see for mirroring. But anyway, I know that 16GB was iffy for the same workload 5 years ago. And we've expanded a bit. I think I could reasonably run what we're doing now on 24GB. Which means that we'd probably need something between that and 32GB, because my brain tends to underestimate these things. We currently are running on 48GB, which is so roomy that it makes it hard to tell.

-Steve
* Re: Linux MD? Or an H710p?

From: David Brown @ 2013-10-22 7:24 UTC
To: Steve Bergman; Cc: linux-raid

On 22/10/13 02:36, Steve Bergman wrote:

<snip>

> But hey, this is going to be a very nice opportunity for observing XFS's savvy with parallel i/o.

You mentioned using a 6-drive RAID10 in your first email, with XFS on top of that. Stan is the expert here, but my understanding is that you should go for three 2-drive RAID1 pairs, and then use an md linear "raid" for these pairs and put XFS on top of that in order to get the full benefits of XFS parallelism.

mvh.,

David
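Spelled out, the layout David suggests would be built something like this. It is only a sketch, not from the thread; the device names and AG count are assumptions:

    # three RAID1 pairs
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
    mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdd /dev/sde
    mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sdf /dev/sdg

    # concatenate the pairs with an md linear array
    mdadm --create /dev/md0 --level=linear --raid-devices=3 \
        /dev/md1 /dev/md2 /dev/md3

    # a small whole multiple of AGs per spindle pair keeps the mapping clean
    mkfs.xfs -d agcount=6 /dev/md0

Since there is no striping, no su/sw values are given; the parallelism comes from XFS spreading directories across allocation groups, and hence across the pairs.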
* Re: Linux MD? Or an H710p?

From: keld @ 2013-10-22 15:29 UTC
To: David Brown; Cc: Steve Bergman, linux-raid

It would be nice if we could get some benchmarks on this. I would also be interested in figures from a standard raid10,far configuration.

best regards
keld

On Tue, Oct 22, 2013 at 09:24:57AM +0200, David Brown wrote:
> On 22/10/13 02:36, Steve Bergman wrote:
>
> <snip>
>
> > But hey, this is going to be a very nice opportunity for observing XFS's savvy with parallel i/o.
>
> You mentioned using a 6-drive RAID10 in your first email, with XFS on top of that. Stan is the expert here, but my understanding is that you should go for three 2-drive RAID1 pairs, and then use an md linear "raid" for these pairs and put XFS on top of that in order to get the full benefits of XFS parallelism.
>
> mvh.,
>
> David
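For completeness, the raid10,far variant keld mentions, and one way such numbers might be collected. This is only a sketch; the device names and fio parameters are arbitrary choices, not figures from the thread:

    # 6-drive RAID10 using the "far 2" layout
    mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=6 /dev/sd[b-g]

    # a simple mixed random-I/O benchmark on the mounted filesystem
    fio --name=mixed --directory=/mnt/test --size=4g --rw=randrw \
        --bs=8k --iodepth=16 --ioengine=libaio --direct=1

The far layout tends to read like a RAID0 across all drives at the cost of longer write seeks, which is why comparative numbers against the near layout and the concat would be interesting.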
* Re: Linux MD? Or an H710p?

From: Stan Hoeppner @ 2013-10-22 16:56 UTC
To: David Brown, Steve Bergman; Cc: linux-raid

On 10/22/2013 2:24 AM, David Brown wrote:
> On 22/10/13 02:36, Steve Bergman wrote:
>
> <snip>
>
>> But hey, this is going to be a very nice opportunity for observing XFS's savvy with parallel i/o.
>
> You mentioned using a 6-drive RAID10 in your first email, with XFS on top of that. Stan is the expert here, but my understanding is that you should go for three 2-drive RAID1 pairs, and then use an md linear "raid" for these pairs and put XFS on top of that in order to get the full benefits of XFS parallelism.

XFS on a concatenation, which is what you described above, is a very workload specific storage architecture. It is not a general use architecture, and almost never good for database workloads. Here most of the data is stored in a single file or a small set of files, in a single directory. With such a DB workload and 3 concatenated mirrors, only 1/3rd of the spindles would see the vast majority of the IO.

-- 
Stan
* Re: Linux MD? Or an H710p?

From: David Brown @ 2013-10-23 7:03 UTC
To: stan; Cc: Steve Bergman, linux-raid

On 22/10/13 18:56, Stan Hoeppner wrote:
> On 10/22/2013 2:24 AM, David Brown wrote:
>> On 22/10/13 02:36, Steve Bergman wrote:
>>
>> <snip>
>>
>>> But hey, this is going to be a very nice opportunity for observing XFS's savvy with parallel i/o.
>>
>> You mentioned using a 6-drive RAID10 in your first email, with XFS on top of that. Stan is the expert here, but my understanding is that you should go for three 2-drive RAID1 pairs, and then use an md linear "raid" for these pairs and put XFS on top of that in order to get the full benefits of XFS parallelism.
>
> XFS on a concatenation, which is what you described above, is a very workload specific storage architecture. It is not a general use architecture, and almost never good for database workloads. Here most of the data is stored in a single file or a small set of files, in a single directory. With such a DB workload and 3 concatenated mirrors, only 1/3rd of the spindles would see the vast majority of the IO.

That's a good point - while I had noted that the OP was running a database, I forgot it was a virtual Windows machine and MS SQL database. The virtual machine will use a single large file for its virtual harddisk image, and so RAID10 + XFS will beat RAID1 + concat + XFS.

On the other hand, he is also serving 100+ freenx desktop users. As far as I understand it (and I'm very happy for corrections if I'm wrong), that will mean a /home directory with 100+ sub-directories for the different users - and that /is/ one of the ideal cases for concat+XFS parallelism.

Only the OP can say which type of access is going to dominate and where the balance should go.

As a more general point, I don't know that you can generalise that database workloads normally store data in a single big file or a small set of files. I haven't worked with many databases, and none more than a few hundred MB, so I am theorising here on things I have read rather than personal practice. But certainly with postgresql the data is split into multiple files - each database has its own directory and each table its own file. Very big tables are split into 1GB segments - and at some point, they will hit the allocation group size and then be split over multiple AG's, leading to parallelism (with a bit of luck). I am guessing other databases are somewhat similar. Of course, like any database tuning, this will all be highly load-dependent.
* Re: Linux MD? Or an H710p?

From: Stan Hoeppner @ 2013-10-24 6:23 UTC
To: David Brown; Cc: Steve Bergman, linux-raid

On 10/23/2013 2:03 AM, David Brown wrote:

> On the other hand, he is also serving 100+ freenx desktop users. As far as I understand it (and I'm very happy for corrections if I'm wrong), that will mean a /home directory with 100+ sub-directories for the different users - and that /is/ one of the ideal cases for concat+XFS parallelism.

No, it is /not/. Homedir storage is not an ideal use case. It's not even in the ballpark. There's simply not enough parallelism nor IOPS involved, and file sizes can vary substantially, so the workload is not deterministic, i.e. it is "general". Recall I said in my last reply that this "is a very workload specific storage architecture"?

Workloads that benefit from XFS over concatenated disks are those that:

1.  Expose inherent limitations and/or inefficiencies of striping, at the filesystem, elevator, and/or hardware level

2.  Exhibit a high degree of directory level parallelism

3.  Exhibit high IOPS or data rates

4.  Most importantly, exhibit relatively deterministic IO patterns

Typical homedir storage meets none of these criteria. Homedir files on a GUI desktop terminal server are not 'typical', but the TS workload doesn't meet these criteria either.

-- 
Stan
* Re: Linux MD? Or an H710p?

From: David Brown @ 2013-10-24 7:26 UTC
To: stan; Cc: Steve Bergman, linux-raid

On 24/10/13 08:23, Stan Hoeppner wrote:
> On 10/23/2013 2:03 AM, David Brown wrote:
>
>> On the other hand, he is also serving 100+ freenx desktop users. As far as I understand it (and I'm very happy for corrections if I'm wrong), that will mean a /home directory with 100+ sub-directories for the different users - and that /is/ one of the ideal cases for concat+XFS parallelism.
>
> No, it is /not/. Homedir storage is not an ideal use case. It's not even in the ballpark. There's simply not enough parallelism nor IOPS involved, and file sizes can vary substantially, so the workload is not deterministic, i.e. it is "general". Recall I said in my last reply that this "is a very workload specific storage architecture"?
>
> Workloads that benefit from XFS over concatenated disks are those that:
>
> 1. Expose inherent limitations and/or inefficiencies of striping, at the filesystem, elevator, and/or hardware level
>
> 2. Exhibit a high degree of directory level parallelism
>
> 3. Exhibit high IOPS or data rates
>
> 4. Most importantly, exhibit relatively deterministic IO patterns
>
> Typical homedir storage meets none of these criteria. Homedir files on a GUI desktop terminal server are not 'typical', but the TS workload doesn't meet these criteria either.

I am trying to learn from your experience and knowledge here, so thank you for your time so far. Hopefully it is also of use and interest to others - that's one of the beauties of public mailing lists.

Am I correct in thinking that a common "ideal use case" is a mail server with lots of accounts, especially with maildir structures, so that accesses are spread across lots of directories with typically many parallel accesses to many small files?

First, to make sure I am not making any technical errors here, I believe that when you make your XFS over a linear concat, the allocation groups are spread evenly across the parts of the concat so that logically (by number) adjacent AG's will be on different underlying disks. When you make a new directory on the filesystem, it gets put in a different AG (wrapping around, of course, and overflowing when necessary). Thus if you make three directories, and put a file in each directory, then each file will be on a different disk. (I believe older XFS only allocated different top-level directories to different AG's, but current XFS does so for all directories.)

I have been thinking about what XFS over concat gives you compared to XFS over raid0 on the same disks (or raid1 pairs - the details don't matter much).

First, consider small files. Access to small files (smaller than the granularity of the raid0 chunks) will usually only involve one disk of the raid0 stripe, and will /definitely/ only involve one disk of the concat. You should be able to access multiple small files in parallel, if you are lucky in the mix (with raid0, this "luck" will be mostly random, while with concat it will depend on the mix of files within directories. In particular, multiple files within the same directory will not be paralleled). With a concat, all relevant accesses such as directory reads and inode table access will be within the same disk as the file, while with raid0 it could easily be a different disk - but such accesses are often cached in ram. With raid0 you have the chance of the small file spanning two disks, leading to longer latency for that file and for other parallel accesses.

All in all, small file access should not be /too/ different - but my guess is concat has the edge for lowest overall latency with multiple parallel accesses, as I think concat will avoid jumps between disks better.

For large files, there is a bigger difference. Raid0 gives striping for higher throughput - but these accesses block the parallel accesses to other files. Concat has slower throughput as there is no striping, but the other disks are free for parallel accesses (big or small).

To my mind, this boils down to a question of balancing - concat gives lower average latencies with highly parallel accesses, but sacrifices maximum throughput of large files. If you don't have lots of parallel accesses, then concat gains little or nothing compared to raid0.

If I try to match this up with the points you made, point 1 about striping is clear - this is a major difference between concat and raid0. Points 2 and 3 about parallelism and high IOPS (and therefore low latency) are also clear - if you don't need such access, concat will give you nothing.

Only the OP can decide if his usage will meet these points.

But I am struggling with point 4 - "most importantly, exhibit relatively deterministic IO patterns". All you need is to have your file accesses spread amongst a range of directories. If the number of (roughly) parallel accesses is big enough, you'll get a fairly even spread across the disks - and if it is not big enough for that, you haven't matched point 2. This is not really much different from raid0 - small accesses will be scattered across the different disks. The big difference comes when there is a large file access - with raid0, you will block /all/ other accesses for a time, while with concat (over three disks) you will block one third of the accesses for three times as long.
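Where a given file actually landed - which AG, and therefore which leg of a concat - can be inspected directly, which is a practical way to test the placement assumptions discussed above. A sketch only; the paths are examples:

    # AG count and size for the filesystem
    xfs_info /home

    # per-extent listing for a file, including the AG each extent sits in
    xfs_bmap -v /home/someuser/somefile

On a linear concat of equal-sized members, integer-dividing the AG number by (agcount / number of members) indicates which underlying mirror pair holds the extent.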
* Re: Linux MD? Or an H710p?

From: Stan Hoeppner @ 2013-10-25 9:34 UTC
To: David Brown; Cc: Steve Bergman, linux-raid

On 10/24/2013 2:26 AM, David Brown wrote:
> On 24/10/13 08:23, Stan Hoeppner wrote:
>> On 10/23/2013 2:03 AM, David Brown wrote:
>>
>>> On the other hand, he is also serving 100+ freenx desktop users. As far as I understand it (and I'm very happy for corrections if I'm wrong), that will mean a /home directory with 100+ sub-directories for the different users - and that /is/ one of the ideal cases for concat+XFS parallelism.
>>
>> No, it is /not/. Homedir storage is not an ideal use case. It's not even in the ballpark. There's simply not enough parallelism nor IOPS involved, and file sizes can vary substantially, so the workload is not deterministic, i.e. it is "general". Recall I said in my last reply that this "is a very workload specific storage architecture"?
>>
>> Workloads that benefit from XFS over concatenated disks are those that:
>>
>> 1. Expose inherent limitations and/or inefficiencies of striping, at the filesystem, elevator, and/or hardware level
>>
>> 2. Exhibit a high degree of directory level parallelism
>>
>> 3. Exhibit high IOPS or data rates
>>
>> 4. Most importantly, exhibit relatively deterministic IO patterns
>>
>> Typical homedir storage meets none of these criteria. Homedir files on a GUI desktop terminal server are not 'typical', but the TS workload doesn't meet these criteria either.

If you could sum up everything below into a couple of short, direct, coherent questions you have, I'd be glad to address them.

> I am trying to learn from your experience and knowledge here, so thank you for your time so far. Hopefully it is also of use and interest to others - that's one of the beauties of public mailing lists.
>
> Am I correct in thinking that a common "ideal use case" is a mail server with lots of accounts, especially with maildir structures, so that accesses are spread across lots of directories with typically many parallel accesses to many small files?
>
> First, to make sure I am not making any technical errors here, I believe that when you make your XFS over a linear concat, the allocation groups are spread evenly across the parts of the concat so that logically (by number) adjacent AG's will be on different underlying disks. When you make a new directory on the filesystem, it gets put in a different AG (wrapping around, of course, and overflowing when necessary). Thus if you make three directories, and put a file in each directory, then each file will be on a different disk. (I believe older XFS only allocated different top-level directories to different AG's, but current XFS does so for all directories.)
>
> I have been thinking about what XFS over concat gives you compared to XFS over raid0 on the same disks (or raid1 pairs - the details don't matter much).
>
> First, consider small files. Access to small files (smaller than the granularity of the raid0 chunks) will usually only involve one disk of the raid0 stripe, and will /definitely/ only involve one disk of the concat. You should be able to access multiple small files in parallel, if you are lucky in the mix (with raid0, this "luck" will be mostly random, while with concat it will depend on the mix of files within directories. In particular, multiple files within the same directory will not be paralleled). With a concat, all relevant accesses such as directory reads and inode table access will be within the same disk as the file, while with raid0 it could easily be a different disk - but such accesses are often cached in ram. With raid0 you have the chance of the small file spanning two disks, leading to longer latency for that file and for other parallel accesses.
>
> All in all, small file access should not be /too/ different - but my guess is concat has the edge for lowest overall latency with multiple parallel accesses, as I think concat will avoid jumps between disks better.
>
> For large files, there is a bigger difference. Raid0 gives striping for higher throughput - but these accesses block the parallel accesses to other files. Concat has slower throughput as there is no striping, but the other disks are free for parallel accesses (big or small).
>
> To my mind, this boils down to a question of balancing - concat gives lower average latencies with highly parallel accesses, but sacrifices maximum throughput of large files. If you don't have lots of parallel accesses, then concat gains little or nothing compared to raid0.
>
> If I try to match this up with the points you made, point 1 about striping is clear - this is a major difference between concat and raid0. Points 2 and 3 about parallelism and high IOPS (and therefore low latency) are also clear - if you don't need such access, concat will give you nothing.
>
> Only the OP can decide if his usage will meet these points.
>
> But I am struggling with point 4 - "most importantly, exhibit relatively deterministic IO patterns". All you need is to have your file accesses spread amongst a range of directories. If the number of (roughly) parallel accesses is big enough, you'll get a fairly even spread across the disks - and if it is not big enough for that, you haven't matched point 2. This is not really much different from raid0 - small accesses will be scattered across the different disks. The big difference comes when there is a large file access - with raid0, you will block /all/ other accesses for a time, while with concat (over three disks) you will block one third of the accesses for three times as long.
* Re: Linux MD? Or an H710p?

From: David Brown @ 2013-10-25 11:42 UTC
To: stan; Cc: Steve Bergman, linux-raid

On 25/10/13 11:34, Stan Hoeppner wrote:
> On 10/24/2013 2:26 AM, David Brown wrote:
>> On 24/10/13 08:23, Stan Hoeppner wrote:
>>> On 10/23/2013 2:03 AM, David Brown wrote:
>>>
>>>> On the other hand, he is also serving 100+ freenx desktop users. As far as I understand it (and I'm very happy for corrections if I'm wrong), that will mean a /home directory with 100+ sub-directories for the different users - and that /is/ one of the ideal cases for concat+XFS parallelism.
>>>
>>> No, it is /not/. Homedir storage is not an ideal use case. It's not even in the ballpark. There's simply not enough parallelism nor IOPS involved, and file sizes can vary substantially, so the workload is not deterministic, i.e. it is "general". Recall I said in my last reply that this "is a very workload specific storage architecture"?
>>>
>>> Workloads that benefit from XFS over concatenated disks are those that:
>>>
>>> 1. Expose inherent limitations and/or inefficiencies of striping, at the filesystem, elevator, and/or hardware level
>>>
>>> 2. Exhibit a high degree of directory level parallelism
>>>
>>> 3. Exhibit high IOPS or data rates
>>>
>>> 4. Most importantly, exhibit relatively deterministic IO patterns
>>>
>>> Typical homedir storage meets none of these criteria. Homedir files on a GUI desktop terminal server are not 'typical', but the TS workload doesn't meet these criteria either.
>
> If you could sum up everything below into a couple of short, direct, coherent questions you have, I'd be glad to address them.

Maybe I've been rambling a bit much. I am not sure I can be very short while still explaining my reasoning, but these are the three most important paragraphs. They are statements that I hope to get confirmed or corrected, rather than questions as such.

First, to make sure I am not making any technical errors here, I believe that when you make your XFS over a linear concat, the allocation groups are spread evenly across the parts of the concat so that logically (by number) adjacent AG's will be on different underlying disks. When you make a new directory on the filesystem, it gets put in a different AG (wrapping around, of course, and overflowing when necessary). Thus if you make three directories, and put a file in each directory, then each file will be on a different disk. (I believe older XFS only allocated different top-level directories to different AG's, but current XFS does so for all directories.)

<snip>

To my mind, this boils down to a question of balancing - concat gives lower average latencies with highly parallel accesses, but sacrifices maximum throughput of large files. If you don't have lots of parallel accesses, then concat gains little or nothing compared to raid0.

<snip>

But I am struggling with point 4 - "most importantly, exhibit relatively deterministic IO patterns". All you need is to have your file accesses spread amongst a range of directories. If the number of (roughly) parallel accesses is big enough, you'll get a fairly even spread across the disks - and if it is not big enough for that, you haven't matched point 2. This is not really much different from raid0 - small accesses will be scattered across the different disks. The big difference comes when there is a large file access - with raid0, you will block /all/ other accesses for a time, while with concat (over three disks) you will block one third of the accesses for three times as long.

mvh.,

David
* Re: Linux MD? Or an H710p?

From: Stan Hoeppner @ 2013-10-26 9:37 UTC
To: David Brown; Cc: Steve Bergman, linux-raid

On 10/25/2013 6:42 AM, David Brown wrote:
> On 25/10/13 11:34, Stan Hoeppner wrote:
...
>>>> Workloads that benefit from XFS over concatenated disks are those that:
>>>>
>>>> 1. Expose inherent limitations and/or inefficiencies of striping, at the filesystem, elevator, and/or hardware level
>>>>
>>>> 2. Exhibit a high degree of directory level parallelism
>>>>
>>>> 3. Exhibit high IOPS or data rates
>>>>
>>>> 4. Most importantly, exhibit relatively deterministic IO patterns
...

> allocation groups are spread evenly across the parts of the concat so that logically (by number) adjacent AG's will be on different underlying disks.

This is not correct. The LBA sectors are numbered linearly, hence the md name "linear", from the first sector of the first disk (or partition) to the last sector of the last disk, creating one large virtual disk. Thus mkfs.xfs divides the disk into equal sized AGs from beginning to end. So if you have 4 exactly equal sized disks in the concatenation and default mkfs.xfs creates 8 AGs, then AG0/1 would be on the first disk, AG2/3 would be on the second, and so on. If the disks (or partitions) are not precisely the same number of sectors you will end up with portions of AGs laying across physical disk boundaries. The AGs are NOT adjacently interleaved across disks as you suggest.

> To my mind, this boils down to a question of balancing - concat gives lower average latencies with highly parallel accesses, but

That's too general a statement. Again, it depends on the workload, and the type of parallel access. For some parallel small file workloads with high DLP, then yes. For a parallel DB workload with a single table file, no. See #2 and #4 above.

> sacrifices maximum throughput of large files.

Not true. There are large file streaming workloads that perform better with XFS over concatenation than with striped RAID. Again, this is workload dependent. See #1-4 above.

> If you don't have lots of parallel accesses, then concat gains little or nothing compared to raid0.

You just repeated #2-3.

> But I am struggling with point 4 - "most importantly, exhibit relatively deterministic IO patterns".

It means exactly what it says. In the parallel workload, the file sizes, IOPS, and/or data rate to each AG needs to be roughly equal. Ergo the IO pattern is "deterministic". Deterministic means we know what the IO pattern is before we build the storage system and run the application on it. Again, this is a "workload specific storage architecture".

> All you need is to have your file accesses spread amongst a range of directories. If the number of (roughly) parallel accesses is big enough, you'll get a fairly even spread across the disks - and if it is not big enough for that, you haven't matched point 2.

And if you aim a shotgun at a flock of geese you might hit a couple. This is not deterministic.

> This is not really much different from raid0 - small accesses will be scattered across the different disks.

It's very different. And no they won't be scattered across the disks with a striped array. When aligned to a striped array, XFS will allocate all files at the start of a stripe. If the file is smaller than sunit it will reside entirely on the first disk. This creates a massive IO hotspot. If the workload consists of files that are all or mostly smaller than sunit, all other disks in the striped array will sit idle until the filesystem is sufficiently full that no virgin stripes remain. At this point all allocation will become unaligned, or aligned to sunit boundaries if possible, with new files being allocated into the massive fragmented free space. Performance can't be any worse than this scenario.

You can format XFS without alignment on a striped array and avoid the single drive hotspot above. However, file placement within the AGs and thus on the stripe is non-deterministic, because you're not aligned. XFS doesn't know where the chunk and stripe boundaries are. So you'll still end up with hot spots, some disks more active than others.

This is where a properly designed XFS over concatenation may help. I say "may" because if you're not hitting #2-3 it doesn't matter. The load may not be sufficient to expose the architectural defect in either of the striped architectures above.

So, again, use of XFS over concatenation is workload specific. And 4 of the criteria to evaluate whether it should be used are above.

> The big difference comes when there is a large file access - with raid0, you will block /all/ other accesses for a time, while with concat (over three disks) you will block one third of the accesses for three times as long.

You're assuming a mixed workload. Again, XFS over concatenation is never used with a mixed, i.e. non-deterministic, workload. It is used only with workloads that exhibit determinism.

Once again: "This is a very workload specific storage architecture"

How many times have I repeated this on this list? Apparently not enough.

-- 
Stan
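The AG-to-disk arithmetic Stan describes, and the alignment choices he contrasts, are both fixed at mkfs time. The following is a sketch only; the AG count, chunk size and device name are illustrative, not values from the thread:

    # concat of equal-sized members: with agcount a multiple of the member
    # count, whole AGs land on each disk (e.g. 8 AGs over 4 disks = 2 per disk)
    mkfs.xfs -d agcount=8 /dev/md0
    xfs_info /dev/md0                 # shows the agcount and agsize actually used

    # striped array with the geometry declared (aligned allocation)
    mkfs.xfs -f -d su=512k,sw=3 /dev/md0

    # striped array with stripe alignment deliberately suppressed
    mkfs.xfs -f -d noalign /dev/md0

Which of the last two behaves better is exactly the workload question being argued here.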
* Re: Linux MD? Or an H710p?

From: David Brown @ 2013-10-27 22:08 UTC
To: stan; Cc: Steve Bergman, linux-raid

On 26/10/13 11:37, Stan Hoeppner wrote:
> On 10/25/2013 6:42 AM, David Brown wrote:
>> On 25/10/13 11:34, Stan Hoeppner wrote:
> ...
>>>>> Workloads that benefit from XFS over concatenated disks are those that:
>>>>>
>>>>> 1. Expose inherent limitations and/or inefficiencies of striping, at the filesystem, elevator, and/or hardware level
>>>>>
>>>>> 2. Exhibit a high degree of directory level parallelism
>>>>>
>>>>> 3. Exhibit high IOPS or data rates
>>>>>
>>>>> 4. Most importantly, exhibit relatively deterministic IO patterns
> ...
>
>> allocation groups are spread evenly across the parts of the concat so that logically (by number) adjacent AG's will be on different underlying disks.
>
> This is not correct. The LBA sectors are numbered linearly, hence the md name "linear", from the first sector of the first disk (or partition) to the last sector of the last disk, creating one large virtual disk. Thus mkfs.xfs divides the disk into equal sized AGs from beginning to end. So if you have 4 exactly equal sized disks in the concatenation and default mkfs.xfs creates 8 AGs, then AG0/1 would be on the first disk, AG2/3 would be on the second, and so on. If the disks (or partitions) are not precisely the same number of sectors you will end up with portions of AGs laying across physical disk boundaries. The AGs are NOT adjacently interleaved across disks as you suggest.

OK.

>> To my mind, this boils down to a question of balancing - concat gives lower average latencies with highly parallel accesses, but
>
> That's too general a statement. Again, it depends on the workload, and the type of parallel access. For some parallel small file workloads with high DLP, then yes. For a parallel DB workload with a single table file, no. See #2 and #4 above.

Fair enough. I was thinking of parallel accesses to /different/ files, in different directories. I think if I had said that, we would be closer here.

>> sacrifices maximum throughput of large files.
>
> Not true. There are large file streaming workloads that perform better with XFS over concatenation than with striped RAID. Again, this is workload dependent. See #1-4 above.

That would be workloads where you have parallel accesses to large files in different directories?

>> If you don't have lots of parallel accesses, then concat gains little or nothing compared to raid0.
>
> You just repeated #2-3.

Yes.

>> But I am struggling with point 4 - "most importantly, exhibit relatively deterministic IO patterns".
>
> It means exactly what it says. In the parallel workload, the file sizes, IOPS, and/or data rate to each AG needs to be roughly equal. Ergo the IO pattern is "deterministic". Deterministic means we know what the IO pattern is before we build the storage system and run the application on it.

I know what deterministic means, and I know what you are saying here. I just did not understand why you felt it mattered so much - but your answer below makes it much clearer.

> Again, this is a "workload specific storage architecture".

No doubts there!

>> All you need is to have your file accesses spread amongst a range of directories. If the number of (roughly) parallel accesses is big enough, you'll get a fairly even spread across the disks - and if it is not big enough for that, you haven't matched point 2.
>
> And if you aim a shotgun at a flock of geese you might hit a couple. This is not deterministic.

I think you would be hard pushed to get better than "random with known characteristics" for most workloads (as always, there are exceptions where the workload is known very accurately). Enough independent random accesses and tight enough characteristics will give you the determinism you are looking for. (If 50 people aim shotguns at a flock of geese, it doesn't matter if they aim randomly or at carefully assigned targets - the result is a fairly even spread across the flock.)

>> This is not really much different from raid0 - small accesses will be scattered across the different disks.
>
> It's very different. And no they won't be scattered across the disks with a striped array. When aligned to a striped array, XFS will allocate all files at the start of a stripe. If the file is smaller than sunit it will reside entirely on the first disk. This creates a massive IO hotspot. If the workload consists of files that are all or mostly smaller than sunit, all other disks in the striped array will sit idle until the filesystem is sufficiently full that no virgin stripes remain. At this point all allocation will become unaligned, or aligned to sunit boundaries if possible, with new files being allocated into the massive fragmented free space. Performance can't be any worse than this scenario.

/This/ is a key point that is new to me. It is a specific detail of XFS that I was not aware of, and I fully agree it makes a very significant difference. I am trying to think /why/ XFS does it this way. I assume there is a good reason. Could it be the general point that big files usually start as small files, and that by allocating in this way XFS aims to reduce fragmentation and maximise stripe throughput as the file grows?

One thing I get from this is that if your workload is mostly small files (smaller than sunit), then linear concat is going to give you better performance than raid0 even if the accesses are not very evenly spread across allocation groups - pretty much anything is better than concentrating everything on the first disk only. (Of course, if you are only accessing small files and you /don't/ have a lot of parallelism, then performance is unlikely to matter much.)

> You can format XFS without alignment on a striped array and avoid the single drive hotspot above. However, file placement within the AGs and thus on the stripe is non-deterministic, because you're not aligned. XFS doesn't know where the chunk and stripe boundaries are. So you'll still end up with hot spots, some disks more active than others.
>
> This is where a properly designed XFS over concatenation may help. I say "may" because if you're not hitting #2-3 it doesn't matter. The load may not be sufficient to expose the architectural defect in either of the striped architectures above.
>
> So, again, use of XFS over concatenation is workload specific. And 4 of the criteria to evaluate whether it should be used are above.
>
>> The big difference comes when there is a large file access - with raid0, you will block /all/ other accesses for a time, while with concat (over three disks) you will block one third of the accesses for three times as long.
>
> You're assuming a mixed workload. Again, XFS over concatenation is never used with a mixed, i.e. non-deterministic, workload. It is used only with workloads that exhibit determinism.

Yes, I am assuming a mixed workload (partly because that's what the OP has).

> Once again: "This is a very workload specific storage architecture"

I think most people, including me, understand that it is workload-specific. What we are learning is exactly what kinds of workload are best suited to which layout, and why. The ideal situation is to be able to test out many different layouts under real-life loads, but I think that's unrealistic in most cases. So the best we can do is try to learn the theory.

> How many times have I repeated this on this list? Apparently not enough.

I try to listen in to most of these threads, and sometimes I join in. Usually I learn a little more each time. I hope the same applies to others here. The general point - that filesystem and raid layout is workload dependent - is one of these things that cannot be repeated too often, I think.

Thanks,

David
* Re: Linux MD? Or an H710p?

From: Stan Hoeppner @ 2013-10-22 16:43 UTC
To: Steve Bergman; Cc: linux-raid

On 10/21/2013 7:36 PM, Steve Bergman wrote:
> First of all, thank you Stan, Mikael, and John for your replies.
>
> Stan,
>
> I had made a private bet with myself that Stan Hoeppner would be the first to respond to my query. And I was not disappointed. In fact, I was hoping for advice from you.

No need to bet. Just assume. ;)

> We're getting the 7 yr hardware support contract from Dell,

Insane, 7 years is. Exist then, Dell may not. Clouded, Dell's future is.

> and I'm a little concerned about "finger-pointing" issues with regards to putting in a non-Dell SAS controller.

Then use a Dell HBA. They're all LSI products anyway, and have been since the mid 90s when Dell re-badged their first AMI MegaRAID cards as "PowerEdge RAID Controller". The PERC H310 is a no-cache RAID HBA, i.e. a fancy SAS/SATA HBA with extremely low performance firmware based RAID5. Its hardware RAID1/10 performance isn't bad, and it allows booting from an array device sans the headaches of booting md based RAID. It literally is the Dell OEM LSI 9240-8i, identical but for Dell branded firmware and the PCB. You can use it in JBOD mode, i.e. as a vanilla SAS/SATA HBA. See page 43:

ftp://ftp.dell.com/manuals/all-products/esuprt_ser_stor_net/esuprt_dell_adapters/poweredge-rc-h310_User%27s%20Guide_en-us.pdf

You can also use it in mixed mode, configuring two disk drives as a hardware RAID1 set, and booting from it, and configuring the other drives as non-virtual disks, i.e. standalone drives for md/RAID use. This requires 8 drives if you want a 6 disk md/RAID10. Why, you ask? Because you cannot intermix hardware RAID and software RAID on any given drive, obviously.

Frankly, if you plan to buy only 6 drives for a single RAID10 volume, there is no reason to use md/RAID at all. It will provide no advantage for your stated application, as the firmware RAID executes plenty fast enough on the LSISAS2008 ASIC of the H310 to handle the striping/mirroring of 6 disks, with no appreciable decrease in IOPS. Though for another $190 you can have the H710 with 512MB NVWC. The extra 512MB of the 710P won't gain you anything, yet costs an extra $225. The $190 above the H310 is a great investment for the occasion that your UPS takes a big dump and downs the server. With the H310 you will lose data, corrupting users' Gnome config files, and possibly suffer filesystem corruption. The H710 will also give a bump to write IOPS, i.e. end user responsiveness with your workload.

All things considered, my advice is to buy the H710 at $375 and use hardware RAID10 on 6 disks. Make /boot, root, /home, etc. on the single RAID disk device. I didn't give you my advice in my first reply, as you seemed set on using md.

> Network card? No problem. But drive controller? Forgive me for "white-knuckling" on this a bit. But I have gotten an OK to order the server with both the H710p and the mystery "SAS 6Gbps HBA External Controller [$148.55]" for

Note "External". You don't know what an SFF8088 port is. See:

http://www.backupworks.com/productimages/lsilogic/lsias9205-8e.jpg

You do not plan to connect an MD1200/1220/3200 JBOD chassis. You don't need, nor want, this "External" SAS HBA.

> which no one at Dell seems to be able to tell me the pedigree. So I can

It's a Dell OEM card, sourced from LSI. But for $150 I'd say it's a 4 port card, w/ single SFF8088 connector. Doesn't matter. You can't use it.

> configure both ways and see which I like.

Again, you'll need 8 drives for the md solution.

> I do find that 1GB NV cache with barriers turned off to be attractive.

Make sure you use kernel 3.0.0 or later, and edit fstab with the inode64 mount option, as well as nobarrier.

> But hey, this is going to be a very nice opportunity for observing XFS's savvy with parallel i/o. And I'm looking forward to it.

Well, given that you've provided zero detail about the workload in this thread, I can't comment.

> BTW, it's the problematic COBOL Point of Sale app

Oh God... you're *that* guy? ;)

> that didn't do fsyncs that is being migrated to its Windows-only MS-SQL version in the virtualized instance

Ok, so now we have the reason for the Windows VM and MSSQL.

> of Windows 2008 Server. At least it will be a virtualized instance on this server if I get my way.

Did you happen to notice during your virtual machine educational excursions that fsync is typically treated as a noop by many hypervisors? I'd definitely opt for a persistent cache RAID controller.

> Essentially, our core business is moving from Linux to Windows in this move. C'est la vie. I did my best. NCR won.

It's really difficult to believe POS vendors are moving away from some of the most proprietary, and secure (if not just obscure) systems on the planet, for decades running System/36, AT&T SYS V, SCO, Linux, and now to... Windows?

-- 
Stan
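For the mount options Stan mentions, a sketch of the relevant /etc/fstab line. The device name and mount point are examples, and nobarrier is only a reasonable choice because the H710's write cache is non-volatile:

    # XFS on the H710 virtual disk: inode64 spreads inodes and data across
    # all AGs; nobarrier relies on the controller's protected write cache
    /dev/sdb1   /home   xfs   inode64,nobarrier   0 0

On a controller or plain HBA without protected write cache, barriers should stay enabled.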
* Re: Linux MD? Or an H710p?

From: Drew @ 2013-10-23 19:05 UTC
To: David Brown; Cc: Stan Hoeppner, Steve Bergman, Linux RAID Mailing List
> As a more general point, I don't know that you can generalise that
> database workloads normally store data in a single big file or a small
> set of files. I haven't worked with many databases, and none more than
> a few hundred MB, so I am theorising here on things I have read rather
> than personal practice. But certainly with postgresql the data is split
> into multiple files - each database has its own directory and each table
> its own file. Very big tables are split into 1GB segments - and at some point,
> they will hit the allocation group size and then be split over multiple
> AG's, leading to parallelism (with a bit of luck). I am guessing other
> databases are somewhat similar. Of course, like any database tuning,
> this will all be highly load-dependent.
MS SQL Server does tend to store each database in its own file no
matter the size. Ran into this with a VMware ESXi cluster maintained
by a vCenter instance running SQL Server Express on Windows Server
2008r2.
Both SQL Server Express 2005 & 2008 store the entire DB in one large
file. I know this because I ran up against a file size limitation on
Express '05 when the DB tables storing performance data grew to the
allowed max of '05. I had to upgrade to '08 and clean out old
performance data to make vCenter happy again.
--
Drew
"Nothing in life is to be feared. It is only to be understood."
--Marie Curie
"This started out as a hobby and spun horribly out of control."
-Unknown