* SCSI HA problems
From: Michael Robbert @ 2011-10-22 0:08 UTC
To: linux-scsi@vger.kernel.org

I want some technical advice from the SCSI/SAS/SATA experts. We are
trying to set up a low cost HA storage system with multiple servers
that have SAS HBAs, SAS JBOD, and desktop SATA disks. When we first
set it up everything appeared to work. I created RAID6 on one host and
put it into a corosync/pacemaker config. I was then able to migrate
the RAID from one host to another. A short while later a failover
failed and I noticed that some of the drives became inaccessible on
one of the hosts. The kernel was showing timeouts to the device.

Oct 21 17:55:34 haraid-12-1 kernel: sd 1:0:3:0: timing out command, waited 10s

So, my question is this: is this setup technically possible, or are the
2 HBAs going to conflict with each other when talking over the same
SAS bus to the SATA drives?

Thanks,
Mike
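When comparing logs from both hosts, it can help to pull the SCSI address out of each timeout message. The sketch below is only an illustration: it assumes the message format is exactly the one quoted above, reads the four numbers after `sd` as host:channel:target:LUN, and uses a made-up helper name, `parse_timeout`.

```python
import re

# Matches the sd driver timeout message in the form quoted in this thread.
# The four numbers after "sd" are host:channel:target:lun.
SD_TIMEOUT = re.compile(
    r"kernel: sd (?P<host>\d+):(?P<channel>\d+):(?P<target>\d+):(?P<lun>\d+): "
    r"timing out command, waited (?P<seconds>\d+)s"
)


def parse_timeout(line):
    """Return the SCSI address and wait time from a timeout line, or None."""
    match = SD_TIMEOUT.search(line)
    if match is None:
        return None
    # Every captured field is a decimal number, so convert them all to int.
    return {name: int(value) for name, value in match.groupdict().items()}


line = ("Oct 21 17:55:34 haraid-12-1 kernel: sd 1:0:3:0: "
        "timing out command, waited 10s")
print(parse_timeout(line))
```

Grouping the parsed results by target on each host would show whether the same drives time out on both initiators or only on one.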
* Re: SCSI HA problems
From: Emmanuel Florac @ 2011-10-22 8:50 UTC
To: Michael Robbert
Cc: linux-scsi@vger.kernel.org

On Fri, 21 Oct 2011 18:08:37 -0600 you wrote:

> So, my question is this, is this setup technically possible or are
> the 2 HBAs going to conflict with each other when talking over the
> same SAS bus to the SATA drives?

Your explanation lacks important information, like the hardware in use
(controllers, JBODs, drives, cabling, etc.), kernel version, RAID (is it
Linux software RAID you're using?), etc. However:

First, you shouldn't be using desktop drives because it's extremely
dangerous (search the web and you'll find countless horror stories of
catastrophic failures, particularly with WD desktop drives).

Second, normally for a SAS HA configuration, you must use SAS drives; the
main difference being that SAS drives have dual attachment, and can
manage commands coming from dual sources (controllers). SATA drives
lack the second path and can't be reliably driven from 2 different
controllers at once, unless you add a SAS to SATA adapter to them.

Third, your SAS controller must be able to work in a multi-host
configuration. Most PCIe SAS controllers (3Ware, Adaptec, Areca,
HighPoint) can't do that at all. AFAIK only some LSI controllers are
multi-host aware, and this is a software option you must buy in
addition to the controller.

Fourth, for dual attachment you need to use both SAS data paths to
both hosts, which would quickly make clear you can't use SATA drives
(because they simply won't show up at all on the second path).

Fifth, if you're actually using the Linux md RAID driver, I don't think
it is multi-host capable in any way. So that would be a definitive
dead end.

My advice: the only reliable way to achieve HA using SATA drives and
common SAS controllers is to use DRBD or some similar replication
mechanism. Yes, that means you'll need a second JBOD and twice the
number of drives. But it will _just_ _work_, with either hardware or
software RAID.

You may need a pair of 10 GbE or IB cards for data synchronisation
between hosts to perform well enough. Modern hardware can easily
replicate over DRBD at several hundred MB per second.

Don't forget: "cheap, good, fast: choose two." In the case of large,
important, valuable data, "good" isn't really an option you may go
without anyway.

-- 
Emmanuel Florac | Direction technique | Intellique
<eflorac@intellique.com> | +33 1 78 94 84 02
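As a rough way to size the replication link mentioned above, here is a back-of-the-envelope sketch. It assumes writes are replicated synchronously, so the link's usable rate caps sustained write throughput; the 90% efficiency figure and the function names are illustrative assumptions, not measurements or DRBD parameters.

```python
def link_mb_per_s(gbit_per_s, efficiency=0.9):
    """Approximate usable payload rate of a link in MB/s (1 MB = 10**6 bytes).

    `efficiency` is an assumed allowance for protocol overhead.
    """
    return gbit_per_s * 1000 / 8 * efficiency


def link_is_enough(required_write_mb_s, gbit_per_s, efficiency=0.9):
    """True if the link can carry the required sustained write rate."""
    return link_mb_per_s(gbit_per_s, efficiency) >= required_write_mb_s


# Compare a 1 Gbit/s and a 10 Gbit/s link against a hypothetical
# 300 MB/s sustained write load.
for gbit in (1, 10):
    usable = link_mb_per_s(gbit)
    verdict = "enough" if link_is_enough(300, gbit) else "too slow"
    print(f"{gbit} Gbit/s: about {usable:.0f} MB/s usable, {verdict} for 300 MB/s")
```

Under these assumptions a single gigabit link falls well short of "several hundred MB per second", which is the reason for suggesting 10 GbE or IB.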
* Re: SCSI HA problems
From: Stan Hoeppner @ 2011-10-23 0:26 UTC
To: Emmanuel Florac
Cc: Michael Robbert, linux-scsi@vger.kernel.org

On 10/22/2011 3:50 AM, Emmanuel Florac wrote:
> On Fri, 21 Oct 2011 18:08:37 -0600 you wrote:
>> So, my question is this, is this setup technically possible or are
>> the 2 HBAs going to conflict with each other when talking over the
>> same SAS bus to the SATA drives?
>
> Your explanation lacks important information, like the hardware in use
> (controllers, jbods, drives, cabling, etc), kernel version, RAID ( is it
> linux software RAID you're using?) etc. However:
>
> First, you shouldn't be using desktop drives because it's extremely
> dangerous (search the web and you'll find countless horror stories of
> catastrophic failures, particularly with WD desktop drives).

Agreed. Particularly the "Green" drives from any manufacturer. Keep in
mind that when a "home brew" system of this nature takes a catastrophic
nose dive, you may spend a couple of days or more trying to hunt down
the problem and fix it. And therein lies the rub: cheap SATA drives
will drop en masse from arrays, inexplicably, and will test good in
isolation on the bench. Now what? The problem isn't the drives but the
entire low cost architecture. The only fix is to replace _everything_
if you want it to be reliable.

Or, fix it by avoiding such a thing in the first place. Use
"enterprise" quality SAS or SATA drives from day one, and use a good
quality SAS controller, such as LSI.

> Second, normally for SAS HA configuration, you must use SAS drives; the
> main difference being that SAS drives have dual attachment, and can
> manage commands coming from dual sources (controllers). SATA drives
> lack the second path and can't be reliably driven from 2 different
> controllers at once, unless you added a SAS to SATA adapter to them.
>
> Third, your SAS controller must be able to work in multi-host
> configuration. Most PCIe SAS controllers (3Ware, Adaptec, Areca,
> HighPoint) can't do that at all. AFAIK only some LSI controllers are
> multi-host aware, and this is a software option you must buy in
> addition to the controller.

You'll also want, if not outright need, a SAS switch, such as the LSI
SAS6160. It runs about $2000 USD from resellers. You'll also want a
quality JBOD chassis with expander at about $2k each.

> Fourth, for a dual attachment you need to use both SAS data path to
> both hosts, which would quickly make clear you can't use SATA drives
> (because they simply won't show up at all on the second path).

Which is why hardware RAID enclosures and cluster filesystems and/or NFS
servers are much more popular than this type of shared SAS cluster.

> Fifth, if you're actually using linux md raid driver, I don't think
> it to be in any manner multi-host capable. So that would be a
> definitive dead end.

Yep.

> My advice : the only reliable way to achieve HA using SATA drives and
> common SAS controllers is to use DRBD or some similar replication
> mechanism. Yes, that means you'll need a second JBOD and twice the
> number of drives. But it will _just_ _work_, both with hardware or
> software RAID.

I see it as the only way to get away with using cheap SATA drives
(which I still wouldn't recommend). If this is a lab exercise that's
one thing. If this will be a production system, stay away from consumer
class drives.

> If necessary, you may need a pair of 10 Ge or IB cards for data
> synchronisation between hosts to perform well enough. Modern hardware
> can easily replicate over DRBD at several hundred MB per second.

The replication link bandwidth depends entirely on the target
application and expected filesystem bandwidth required, which the OP
didn't state IIRC. That omission leads me to believe this is a research
project/exercise, with no actual goal to realize.

> Don't forget : "cheap, good, fast: choose two." In the case of large,
> important, valuable data, "good" isn't really an option you may go
> without anyway.

Good advice.

-- 
Stan
* Re: SCSI HA problems
From: James Bottomley @ 2011-10-22 13:31 UTC
To: Michael Robbert
Cc: linux-scsi@vger.kernel.org

On Fri, 2011-10-21 at 18:08 -0600, Michael Robbert wrote:
> I want some technical advice from the SCSI/SAS/SATA experts. We are
> trying to set up a low cost HA storage system with multiple servers
> that have SAS HBAs, SAS JBOD, and desktop SATA disks. When we first
> set it up everything appeared to work. I created RAID6 on one host and
> put it into a corosync/pacemaker config. I was then able to migrate
> the RAID from one host to another. A short while later a failover
> failed and I noticed that some of the drives became inaccessible on
> one of the hosts. The kernel was showing timeouts to the device.
>
> Oct 21 17:55:34 haraid-12-1 kernel: sd 1:0:3:0: timing out command, waited 10s
>
> So, my question is this, is this setup technically possible or are the
> 2 HBAs going to conflict with each other when talking over the same
> SAS bus to the SATA drives?

Well, yes, but the devil is in the details.

Firstly, it's only possible with SAS; SATA controllers don't really do
multi-initiator and SATA disks don't have the cluster commands that SAS
disks do. It is theoretically possible with a SAS controller and SATA
disk provided the cluster software doesn't use any of the SCSI commands
for clustering.

The reason your setup likely doesn't work is the expander. SAS
expanders are complex beasts: any port can be device, table or
subtractively routed. You don't usually see it, but a JBOD ships with a
subtractive port for the HBA connection and a device port for
everything else. You can't just plug another HBA into a random device
port because the routing won't get device replies back to you (hence
the timeout).

The way I've got this set up at home uses a 12x expander with 4x
subtractive and 9x table routing phys which group in clusters of up to
four for ports. I can plug the second HBA into one of the table routed
ports because the SCSI transport class installs the correct route
tables, so I know it works for aic9xxx and mvsas.

James