From: Douglas Gilbert <dgilbert@interlog.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Roland Dreier <roland@kernel.org>,
	linux-scsi@vger.kernel.org, eric@purestorage.com
Subject: Re: Handling multiple paths to enclosure devices?
Date: Thu, 28 Jul 2011 18:01:48 -0400
Message-ID: <4E31DC4C.1050509@interlog.com>
In-Reply-To: <1311885202.5464.14.camel@mulgrave>

On 11-07-28 04:33 PM, James Bottomley wrote:
> On Thu, 2011-07-28 at 10:05 -0700, Roland Dreier wrote:
>> Hi,
>>
>> I'm seeing an issue with the current design of our enclosure
>> handling.  In a system with a bunch of drives in an enclosure, it's
>> definitely helpful to have a way to go from sdXX to which slot in the
>> enclosure that drive is in, and that's what the symlink
>> /sys/block/sdXX/device/enclosure_device:NN provides.
>>
>> However, in a system with multiple paths to the enclosure, eg an HBA
>> with two external SAS ports, both connected to a SAS expander in a
>> JBOD, ie in lame ASCII graphics, something like:
>>
>>       +-----+                 /-- drv1
>>       |     |       +-----+  /---  .
>>       |     |==SAS==|     |-/----  .
>>       | HBA |       | exp |------  .
>>       |     |==SAS==|     |-\----  .
>>       |     |       +-----+  \---  .
>>       +-----+                 \-- drvN
>
> So this configuration should form a single wide port and thus not
> actually be multiple paths.  However, if you have two HBAs instead of
> one (or a non-SAS HBA), I grant it becomes multipath.

Here are some conflicting results for the single HBA case.
I tested two LSI HBAs separately, each with 8 phys, and wired 5
of those phys to an LSI SAS-2 expander.

Here are the results for a SAS-1.1 (3 Gbps) 3444E HBA as seen
from an SMP DISCOVER on the expander:
   phy   5:T:attached:[5001517e85c3efe5:00  t(SATA)]  3 Gbps
   phy   7:T:attached:[5000c500215725bd:00  t(SSP)]  6 Gbps
   phy  12:T:attached:[500605b00006f263:03  i(SSP+STP+SMP)]  3 Gbps
   phy  20:T:attached:[500605b00006f264:04  i(SSP+STP+SMP)]  3 Gbps
   phy  21:T:attached:[500605b00006f264:05  i(SSP+STP+SMP)]  3 Gbps
   phy  22:T:attached:[500605b00006f264:06  i(SSP+STP+SMP)]  3 Gbps
   phy  23:T:attached:[500605b00006f264:07  i(SSP+STP+SMP)]  3 Gbps
   phy  24:D:attached:[5001517e85c3effd:00  V i(SSP+SMP) t(SSP)]  6 Gbps

That is two ports:
   - a narrow port [HBA phy 3 to expander phy 12]
     note the different SAS address of HBA phy 3: 500605b00006f263
   - a wide port [HBA phys 4-7 to expander phys 20-23]
Should the HBA report as two hosts?? [It only reports as one host
in my test.]

So if a SAS HBA can't handle a wide port with more than
4 phys, it can just change the SAS addresses on one or more
phys at the HBA end.
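
To make the port counting above mechanical, here is a rough sketch
that groups expander phys by attached SAS address. It assumes the
one-line-per-phy summary format shown in the 3444E listing; the regex
and the group_ports() name are my own illustration, not part of any
tool. It only looks at the attached (far end) address, on the
assumption that the expander phys involved all share one SAS address.

```python
import re
from collections import defaultdict

# Summary lines copied from the 3444E listing above.
DISCOVER = """\
   phy   5:T:attached:[5001517e85c3efe5:00  t(SATA)]  3 Gbps
   phy   7:T:attached:[5000c500215725bd:00  t(SSP)]  6 Gbps
   phy  12:T:attached:[500605b00006f263:03  i(SSP+STP+SMP)]  3 Gbps
   phy  20:T:attached:[500605b00006f264:04  i(SSP+STP+SMP)]  3 Gbps
   phy  21:T:attached:[500605b00006f264:05  i(SSP+STP+SMP)]  3 Gbps
   phy  22:T:attached:[500605b00006f264:06  i(SSP+STP+SMP)]  3 Gbps
   phy  23:T:attached:[500605b00006f264:07  i(SSP+STP+SMP)]  3 Gbps
   phy  24:D:attached:[5001517e85c3effd:00  V i(SSP+SMP) t(SSP)]  6 Gbps
"""

# Assumed line shape: "phy <n>:<routing>:attached:[<sas addr>:<attached phy> ...]"
LINE = re.compile(r"phy\s+(\d+):\w:attached:\[([0-9a-f]{16}):(\d+)")


def group_ports(text):
    """Return {attached SAS address: [(expander phy, attached phy), ...]}."""
    ports = defaultdict(list)
    for m in LINE.finditer(text):
        exp_phy, addr, att_phy = m.groups()
        ports[addr].append((int(exp_phy), int(att_phy)))
    return dict(ports)


if __name__ == "__main__":
    for addr, phys in group_ports(DISCOVER).items():
        kind = "wide" if len(phys) > 1 else "narrow"
        print(f"{addr}: {kind} port, expander phys {[p for p, _ in phys]}")
```

Fed the 9212 listing instead, the same grouping should put expander
phys 12 and 20-23 under the single address 500605b001d0d3e0.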


Here are the results for a SAS-2 (6 Gbps) 9212 HBA as seen
from an SMP DISCOVER on the same expander:
   phy   5:T:attached:[5001517e85c3efe5:00  t(SATA)]  3 Gbps
   phy   7:T:attached:[5000c500215725bd:00  t(SSP)]  6 Gbps
   phy  12:T:attached:[500605b001d0d3e0:04  i(SSP+STP+SMP)]  6 Gbps
   phy  20:T:attached:[500605b001d0d3e0:03  i(SSP+STP+SMP)]  6 Gbps
   phy  21:T:attached:[500605b001d0d3e0:02  i(SSP+STP+SMP)]  6 Gbps
   phy  22:T:attached:[500605b001d0d3e0:01  i(SSP+STP+SMP)]  6 Gbps
   phy  23:T:attached:[500605b001d0d3e0:00  i(SSP+STP+SMP)]  6 Gbps
   phy  24:D:attached:[5001517e85c3effd:00  V i(SSP+SMP) t(SSP)]  6 Gbps

Now that is a single wide port (5 phys wide) since all
5 expander phys have the same SAS address (not shown)
and all 5 HBA phys have the same address.

>> we have two paths to each drive, so each gets two names, sdXX and
>> sdYY.  However, in drivers/misc/enclosure.c, the code only allows one
>> device in each component and so what happens is that sdXX gets
>> discovered, then gets an enclosure_device:NN link, then sdYY is
>> discovered, so sdXX's enclosure_device:NN link is removed and one is
>> added for sdYY.  And so if I want to figure out which enclosure slot
>> sdXX is in, I'm in for a hard time.
>>
>> It would be a simple matter of writing code to allow all the block
>> devices in a slot to link back to that slot -- we would have to be a
>> bit more careful of keeping track of what links exist, but it should
>> be doable.
>>
>> The wrinkle is that there are also /sys/class/enclosure/ZZZ/NN/device
>> symlinks that allow going the other way.  And it's harder to see how
>> to express multiple block devices in one enclosure slot.
>>
>> Thoughts on how to improve our enclosure handling?

In SAS-2 the SMP DISCOVER response should contain slot information.
Looking at sysfs for my test system:

# lsscsi
[1:0:0:0]    disk    ATA      ST3320620AS      3.AA  /dev/sda
[6:0:0:0]    disk    ATA      ST31000528AS     CC38  /dev/sdb
[6:0:1:0]    disk    SEAGATE  ST32000444SS     0006  /dev/sdc
[6:0:2:0]    enclosu Intel    RES2SV240        0600  -

then fetching the corresponding SAS port addresses:

# lsscsi -t
[1:0:0:0]    disk    sata:                           /dev/sda
[6:0:0:0]    disk    sas:0x5001517e85c3efe5          /dev/sdb
[6:0:1:0]    disk    sas:0x5000c500215725bd          /dev/sdc
[6:0:2:0]    enclosu sas:0x5001517e85c3effd          -

Referring to the SMP DISCOVER response above, it can be seen
that /dev/sdc is connected to expander phy 7. So getting the
long form output for an SMP DISCOVER on phy 7:

# smp_discover -p 7 /dev/bsg/expander-6\:0
Discover response:
   ...
   attached SAS address: 0x5000c500215725bd
   attached phy identifier: 0
   ...
   device slot number: 255
   device slot group number: 255
   device slot group output connector:

Sadly, a value of 255 means "not available". YMMV.
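
The lookup I did by eye above (device node to SAS address to expander
phy) could be scripted roughly as follows. This is only a sketch: it
assumes the "lsscsi -t" and SMP DISCOVER summary layouts shown in this
message, and the helper names (sas_addresses, phy_by_address,
device_to_phy, slot_number) are made up for illustration. The slot
helper treats 255 as "not available", as noted above.

```python
import re

# Output copied from "lsscsi -t" above.
LSSCSI_T = """\
[1:0:0:0]    disk    sata:                           /dev/sda
[6:0:0:0]    disk    sas:0x5001517e85c3efe5          /dev/sdb
[6:0:1:0]    disk    sas:0x5000c500215725bd          /dev/sdc
[6:0:2:0]    enclosu sas:0x5001517e85c3effd          -
"""

# Target phys copied from the SMP DISCOVER summary above.
DISCOVER = """\
   phy   5:T:attached:[5001517e85c3efe5:00  t(SATA)]  3 Gbps
   phy   7:T:attached:[5000c500215725bd:00  t(SSP)]  6 Gbps
   phy  24:D:attached:[5001517e85c3effd:00  V i(SSP+SMP) t(SSP)]  6 Gbps
"""


def sas_addresses(lsscsi_text):
    """Return {device node: SAS address} for lines that have both."""
    out = {}
    for line in lsscsi_text.splitlines():
        m = re.search(r"sas:0x([0-9a-f]{16})\s+(\S+)", line)
        if m and m.group(2) != "-":  # "-" means no device node
            out[m.group(2)] = m.group(1)
    return out


def phy_by_address(discover_text):
    """Return {attached SAS address: expander phy}."""
    pat = re.compile(r"phy\s+(\d+):\w:attached:\[([0-9a-f]{16}):")
    return {m.group(2): int(m.group(1)) for m in pat.finditer(discover_text)}


def device_to_phy(lsscsi_text, discover_text):
    """Join the two listings on SAS address."""
    phys = phy_by_address(discover_text)
    return {dev: phys[addr]
            for dev, addr in sas_addresses(lsscsi_text).items()
            if addr in phys}


def slot_number(long_discover_text):
    """Return the device slot number, or None if absent or 255."""
    m = re.search(r"device slot number:\s*(\d+)", long_discover_text)
    if not m or int(m.group(1)) == 255:
        return None
    return int(m.group(1))


if __name__ == "__main__":
    print(device_to_phy(LSSCSI_T, DISCOVER))
```

With the listings from this message that maps /dev/sdb to expander
phy 5 and /dev/sdc to expander phy 7. Note that each address maps to
one phy here only because these targets sit on narrow ports.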

> My initial thought is that in a multi-path situation, as above, we get
> two enclosures appearing as well (one down each path).  If we
> incorporated the idea of topological subtrees into the identity matching
> code, we'd end up filling each of the enclosures with the path connected
> devices.  That seems to be an easy situation for multi-path drivers to
> sort out and one requiring no alteration of the existing enclosure code
> (except to do the topological subtree search).
>
> How does that sound?

Does this also solve the problem reported a few weeks back
in which a SES logical unit reported duplicate element
descriptor names?

Doug Gilbert
