* SCSI HA problems
@ 2011-10-22  0:08 Michael Robbert
  2011-10-22  8:50 ` Emmanuel Florac
  2011-10-22 13:31 ` James Bottomley
  0 siblings, 2 replies; 4+ messages in thread
From: Michael Robbert @ 2011-10-22  0:08 UTC (permalink / raw)
  To: linux-scsi@vger.kernel.org

I would like some technical advice from the SCSI/SAS/SATA experts. We are trying to set up a low-cost HA storage system with multiple servers that have SAS HBAs, a SAS JBOD, and desktop SATA disks. When we first set it up, everything appeared to work. I created a RAID6 array on one host and put it into a corosync/pacemaker configuration. I was then able to migrate the RAID from one host to another. A short while later a failover failed, and I noticed that some of the drives had become inaccessible on one of the hosts. The kernel was showing timeouts to the device:

Oct 21 17:55:34 haraid-12-1 kernel: sd 1:0:3:0: timing out command, waited 10s
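
The "1:0:3:0" in that message is the usual SCSI host:channel:target:lun
address of the device that timed out. As a minimal sketch (the function
name and regular expression are illustrative assumptions, not part of
any kernel tooling), such lines can be picked apart like this when
checking which disks are affected on which host:

```python
import re

# Matches sd messages of the form "sd H:C:T:L: <text>".
SD_LINE = re.compile(r"sd (\d+):(\d+):(\d+):(\d+): (.*)$")

def parse_sd_line(line):
    """Return (host, channel, target, lun, message), or None if no match."""
    m = SD_LINE.search(line)
    if m is None:
        return None
    host, channel, target, lun = (int(g) for g in m.groups()[:4])
    return host, channel, target, lun, m.group(5)

line = ("Oct 21 17:55:34 haraid-12-1 kernel: "
        "sd 1:0:3:0: timing out command, waited 10s")
print(parse_sd_line(line))
# (1, 0, 3, 0, 'timing out command, waited 10s')
```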

So my question is this: is this setup technically possible, or are the 2 HBAs going to conflict with each other when talking over the same SAS bus to the SATA drives?

Thanks,
Mike



* Re: SCSI HA problems
  2011-10-22  0:08 SCSI HA problems Michael Robbert
@ 2011-10-22  8:50 ` Emmanuel Florac
  2011-10-23  0:26   ` Stan Hoeppner
  2011-10-22 13:31 ` James Bottomley
  1 sibling, 1 reply; 4+ messages in thread
From: Emmanuel Florac @ 2011-10-22  8:50 UTC (permalink / raw)
  To: Michael Robbert; +Cc: linux-scsi@vger.kernel.org

On Fri, 21 Oct 2011 18:08:37 -0600 you wrote:

> 
> So, my question is this, is this setup technically possible or are
> the 2 HBAs going to conflict with each other when talking over the
> same SAS bus to the SATA drives?

Your explanation lacks important information, such as the hardware in
use (controllers, JBODs, drives, cabling, etc.), the kernel version, the
RAID type (is it Linux software RAID you're using?), etc. However:

First, you shouldn't be using desktop drives because it's extremely
dangerous (search the web and you'll find countless horror stories of
catastrophic failures, particularly with WD desktop drives).

Second, normally for SAS HA configuration, you must use SAS drives; the
main difference being that SAS drives have dual attachment, and can
manage commands coming from dual sources (controllers). SATA drives
lack the second path and can't be reliably driven from 2 different
controllers at once, unless you added a SAS to SATA adapter to them.

Third, your SAS controller must be able to work in multi-host
configuration. Most PCIe SAS controllers (3Ware, Adaptec, Areca,
HighPoint) can't do that at all. AFAIK only some LSI controllers are
multi-host aware, and this is a software option you must buy in
addition to the controller.

Fourth, for a dual attachment you need to use both SAS data paths to
both hosts, which would quickly make it clear that you can't use SATA
drives (because they simply won't show up at all on the second path).

Fifth, if you're actually using the Linux md RAID driver, I don't think
it is multi-host capable in any way. So that would be a definite dead
end.

My advice: the only reliable way to achieve HA using SATA drives and
common SAS controllers is to use DRBD or some similar replication
mechanism. Yes, that means you'll need a second JBOD and twice the
number of drives. But it will _just_ _work_, with either hardware or
software RAID.

You may need a pair of 10 GbE or IB cards for data synchronisation
between hosts to perform well enough. Modern hardware can easily
replicate over DRBD at several hundred MB per second.
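
To put rough numbers on the link choice, here is a back-of-the-envelope
sketch. The 10 TB capacity and the 300 MB/s sustained rate are
hypothetical values picked purely for illustration, and protocol
overhead is ignored:

```python
def line_rate_mb_s(gbit_per_s):
    """Raw line rate of a link in MB/s (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8

def full_sync_hours(capacity_tb, rate_mb_s):
    """Hours needed to copy capacity_tb (decimal TB) at rate_mb_s."""
    return capacity_tb * 1_000_000 / rate_mb_s / 3600

# Raw ceilings: 1 Gbit/s is 125 MB/s, 10 Gbit/s is 1250 MB/s.
for gbit in (1, 10):
    print(f"{gbit} Gbit/s link: {line_rate_mb_s(gbit):.0f} MB/s raw")

# Hypothetical array: 10 TB resynchronised at a sustained 300 MB/s.
print(f"full resync: {full_sync_hours(10, 300):.1f} hours")
```

A single gigabit link tops out well below "several hundred MB per
second", which is why the faster interconnect matters once the array
itself can sustain that rate.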

Don't forget: "cheap, good, fast: choose two." In the case of large,
important, valuable data, "good" isn't really an option you may go
without anyway.

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: SCSI HA problems
  2011-10-22  0:08 SCSI HA problems Michael Robbert
  2011-10-22  8:50 ` Emmanuel Florac
@ 2011-10-22 13:31 ` James Bottomley
  1 sibling, 0 replies; 4+ messages in thread
From: James Bottomley @ 2011-10-22 13:31 UTC (permalink / raw)
  To: Michael Robbert; +Cc: linux-scsi@vger.kernel.org

On Fri, 2011-10-21 at 18:08 -0600, Michael Robbert wrote:
> I want some technical advice from the SCSI/SAS/SATA experts. We are
> trying to setup a low cost HA storage system with multiple servers
> that have SAS HBAs, SAS JBOD, and desktop SATA disks. When we first
> set it up everything appeared to work. I created RAID6 on one host and
> put it into a corosync/pacemaker config. I was then able to migrate
> the RAID from one host to another. A short while later a failover
> failed and I noticed that some of the drives became inaccessible on
> one of the hosts. The kernel was showing timeouts to the device. 
> 
> Oct 21 17:55:34 haraid-12-1 kernel: sd 1:0:3:0: timing out command, waited 10s
> 
> So, my question is this, is this setup technically possible or are the
> 2 HBAs going to conflict with each other when talking over the same
> SAS bus to the SATA drives?

Well, yes, but the devil is in the details.  Firstly it's only possible
with SAS; SATA controllers don't really do multi-initiator and SATA
disks don't have the cluster commands that SAS disks do.  It is
theoretically possible with a SAS controller and SATA disk provided the
cluster software doesn't use any of the SCSI commands for clustering.

The reason your setup likely doesn't work is because of the expander.
SAS expanders are complex beasts:  Any port can be device, table or
subtractively routed.  You don't usually see it, but a JBOD ships with a
subtractive port for the HBA connection and a device port for everything
else.  You can't just plug another HBA into a random device port because
the routing won't get device replies back to you (hence the timeout).

The way I've got this set up at home uses a 12x expander with 4x
subtractive and 9x table routing phys which group in clusters of up to
four for ports.  I can plug the second HBA into one of the table routed
ports because the SCSI transport class installs the correct route
tables, so I know it works for aic9xxx and mvsas.
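
As a toy illustration of the routing point above (a deliberately
simplified model, not the full algorithm from the SAS specification;
the class, function and address names are all hypothetical): assume an
expander sends a frame to a device-routed phy if the destination is
attached there, otherwise to a table-routed phy that has a route entry
for it, and otherwise out of the subtractive phy.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class Phy:
    ident: int
    routing: str                     # "device", "table" or "subtractive"
    attached: Optional[str] = None   # address seen on a device-routed phy
    table: Set[str] = field(default_factory=set)  # route entries (table phys)

def route(phys, dest):
    """Pick the phy a frame addressed to `dest` leaves through."""
    for p in phys:
        if p.routing == "device" and p.attached == dest:
            return p.ident
    for p in phys:
        if p.routing == "table" and dest in p.table:
            return p.ident
    for p in phys:
        if p.routing == "subtractive":
            return p.ident
    return None

phys = [
    Phy(0, "subtractive"),               # uplink towards the first HBA
    Phy(1, "table"),                     # second HBA plugged in here
    Phy(2, "device", attached="disk3"),  # a disk slot
]
print(route(phys, "hba_b"))  # no route entry yet: falls through to phy 0
phys[1].table.add("hba_b")   # route entry installed for the second HBA
print(route(phys, "hba_b"))  # now leaves through phy 1
```

In this model a reply addressed to the second HBA heads out of the
subtractive port until a route entry exists for it, which from the
second host's point of view looks like a command that never completes.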

James




* Re: SCSI HA problems
  2011-10-22  8:50 ` Emmanuel Florac
@ 2011-10-23  0:26   ` Stan Hoeppner
  0 siblings, 0 replies; 4+ messages in thread
From: Stan Hoeppner @ 2011-10-23  0:26 UTC (permalink / raw)
  To: Emmanuel Florac; +Cc: Michael Robbert, linux-scsi@vger.kernel.org

On 10/22/2011 3:50 AM, Emmanuel Florac wrote:
> On Fri, 21 Oct 2011 18:08:37 -0600 you wrote:

>> So, my question is this, is this setup technically possible or are
>> the 2 HBAs going to conflict with each other when talking over the
>> same SAS bus to the SATA drives?
> 
> Your explanation lacks important information, like the hardware in use
> (controllers, jbods, drives, cabling, etc), kernel version, RAID ( is it
> linux software RAID you're using?) etc. However:
> 
> First, you shouldn't be using desktop drives because it's extremely
> dangerous (search the web and you'll find countless horror stories of
> catastrophic failures, particularly with WD desktop drives).

Agreed.  Particularly the "Green" drives from any manufacturer.  Keep in
mind that when a "home brew" system of this nature takes a catastrophic
nose dive, you may spend a couple of days or more trying to hunt down
the problem and fix it.  And therein lies the rub:  cheap SATA drives
will drop en masse from arrays, inexplicably, and will test good in
isolation on the bench.  Now what?  The problem isn't the drives but the
entire low cost architecture.  The only fix is to replace _everything_
if you want it to be reliable.  Or, fix it by avoiding such a thing in
the first place.  Use "enterprise" quality SAS or SATA drives from day
one, and use a good quality SAS controller, such as LSI.

> Second, normally for SAS HA configuration, you must use SAS drives; the
> main difference being that SAS drives have dual attachment, and can
> manage commands coming from dual sources (controllers). SATA drives
> lack the second path and can't be reliably driven from 2 different
> controllers at once, unless you added a SAS to SATA adapter to them.
> 
> Third, your SAS controller must be able to work in multi-host
> configuration. Most PCIe SAS controllers (3Ware, Adaptec, Areca,
> HighPoint) can't do that at all. AFAIK only some LSI controllers are
> multi-host aware, and this is a software option you must buy in
> addition to the controller.

You'll also want, if not outright need, an SAS switch, such as the LSI
SAS6160.  Runs about $2000 USD from resellers.  You'll also want a
quality JBOD chassis w/expander at about $2k each.

> Fourth, for a dual attachment you need to use both SAS data paths to
> both hosts, which would quickly make it clear that you can't use SATA
> drives (because they simply won't show up at all on the second path).

Which is why hardware RAID enclosures and cluster filesystems and/or NFS
servers are much more popular than this type of shared SAS cluster.

> Fifth, if you're actually using linux md raid driver, I don't think
> it to be in any manner multi-host capable. So that would be a
> definitive dead end.

Yep.

> My advice : the only reliable way to achieve HA using SATA drives and
> common SAS controllers is to use DRBD or some similar replication
> mechanism. Yes, that means you'll need a second JBOD and twice the
> number of drives. But it will _just_ _work_, both with hardware or
> software RAID.

I see it as the only way to get away with using cheap SATA drives (which
I still wouldn't recommend).   If this is a lab exercise that's one
thing.  If this will be a production system, stay away from consumer
class drives.

> If necessary, you may need a pair of 10 Ge or IB cards for data
> synchronisation between hosts to perform well enough. Modern hardware
> can easily replicate over DRBD at several hundred MB per second.

The replication link bandwidth depends entirely on the target
application and expected filesystem bandwidth required, which the OP
didn't state IIRC.  That omission leads me to believe this is a research
project/exercise, with no actual goal to realize.

> Don't forget : "cheap, good, fast: choose two." In the case of large,
> important, valuable data, "good" isn't really an option you may go
> without anyway.

Good advice.

-- 
Stan

