From: Douglas Gilbert <dgilbert@interlog.com>
To: SCSI development list <linux-scsi@vger.kernel.org>
Subject: mpt2sas,mpt3sas: SATA affiliations
Date: Wed, 12 Nov 2014 14:59:11 -0500
Message-ID: <5463BC0F.4050605@interlog.com>

From a correspondent's reports and my own testing, I have seen
far too many of these messages in the log:
    log_info(0x31160000): originator(PL), code(0x16), sub_code(0x0000)

That message comes from either the mpt2sas or the mpt3sas driver
and may indicate a problem in how those drivers interact with the
SCSI error handler (EH). In one case, those messages go on forever,
requiring a reboot; in my testing (with sg_readcap) the command
timeout (60 seconds) stopped them.
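
For readers trying to match such lines in their own logs, here is a
minimal sketch of how the 32-bit log_info word appears to break down.
The field layout (4-bit bus type, 4-bit originator, 8-bit code, 16-bit
sub_code) and the originator names are assumptions inferred from the
printed line above, not taken from firmware documentation, and the
function names are made up for illustration.

```python
# Assumed layout, most significant bits first:
#   bus_type (4 bits) | originator (4 bits) | code (8 bits) | sub_code (16 bits)
# Originator names are an assumption; only "PL" appears in the log above.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_log_info(value):
    """Split a 32-bit log_info word into its assumed fields."""
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


def format_log_info(value):
    """Rebuild the log line in the same shape the driver printed it."""
    f = decode_log_info(value)
    return "log_info(0x%08x): originator(%s), code(0x%02x), sub_code(0x%04x)" % (
        value, f["originator"], f["code"], f["sub_code"])


if __name__ == "__main__":
    print(format_log_info(0x31160000))
```

Feeding it 0x31160000 reproduces the fields shown in the log line
quoted above.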


How they occur needs a bit of explaining: ATA disks are designed
to have only one initiator (host). So if you build a SAS fabric
including at least two initiators, an expander and one SATA disk,
then there is potentially a problem, which SAS expanders address
with "affiliations". An affiliation is a mechanism for the
expander to remember the SAS address of the initiator (host)
that first "grabbed" the SATA disk, and to reject any other
initiator that tries to access that SATA disk.
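
The affiliation rule described above can be sketched as a toy model.
This is not expander firmware; the class name and the SAS addresses
are hypothetical, and the model only captures "first initiator wins,
everyone else is rejected".

```python
class ExpanderPhy:
    """Toy model of an expander phy with one SATA disk attached."""

    def __init__(self):
        # SAS address of the initiator holding the affiliation, if any.
        self.affiliated = None

    def open_stp(self, initiator):
        """Handle an STP connection request from the given initiator."""
        if self.affiliated is None:
            # The first initiator to "grab" the disk gains the affiliation.
            self.affiliated = initiator
        if self.affiliated == initiator:
            return "OPEN_ACCEPT"
        # Any other initiator is turned away while the affiliation is held.
        return "OPEN_REJECT (STP RESOURCES BUSY)"


if __name__ == "__main__":
    phy = ExpanderPhy()
    host_a = 0x500000000000000A  # hypothetical SAS addresses
    host_b = 0x500000000000000B
    print(phy.open_stp(host_a))
    print(phy.open_stp(host_b))
    print(phy.open_stp(host_a))
```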

That rejection, in the link layer in SAS for the STP protocol,
is an OPEN_REJECT (STP RESOURCES BUSY) response. That is *not*
a retry-able error (so the use of "busy" is unfortunate).
FreeBSD handles this correctly; Linux in some cases retries,
which results in chaos plus bloated logs.

There are mechanisms for the owner of the affiliation to clear
it so another initiator can claim it. However, affiliations are
designed to thwart brute-force attempts by non-owners. At best,
non-owners should get one log message along the lines of:
   cannot access SATA disk xxxx since another machine/HBA is
   affiliated with it
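
To make the desired non-owner behaviour concrete, here is a minimal
sketch of a "fail at once, log once" policy. It is not how Linux or
FreeBSD implement this; the class, the return values and the list
standing in for the kernel log are all hypothetical.

```python
STP_BUSY = "OPEN_REJECT (STP RESOURCES BUSY)"


class NonOwnerPolicy:
    """Hypothetical policy: never retry an affiliation reject, log once."""

    def __init__(self):
        self.warned = set()  # disks that have already been reported
        self.log = []        # stand-in for the kernel log

    def handle(self, disk, open_result):
        """Classify one connection attempt as 'ok', 'fail' or 'other'."""
        if open_result == "OPEN_ACCEPT":
            return "ok"
        if open_result == STP_BUSY:
            if disk not in self.warned:
                self.warned.add(disk)
                self.log.append(
                    "cannot access SATA disk %s since another "
                    "machine/HBA is affiliated with it" % disk)
            # Not retry-able: report failure without trying again.
            return "fail"
        # Other reject reasons are not modelled in this sketch.
        return "other"


if __name__ == "__main__":
    policy = NonOwnerPolicy()
    for _ in range(3):
        print(policy.handle("xxxx", STP_BUSY))
    print(len(policy.log), "log message(s)")
```

However many times the reject arrives, the log gains a single line per
disk, which is the opposite of the retry loop that bloats the logs.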

Linux properly handles SATA affiliations when it comes across
them in normal device discovery. It is the "surprise"
disappearance of an affiliation that causes instability. That
surprise is caused by a utility like smp_phy_control telling
the expander to clear the affiliation, followed by a rescan on
the other machine to claim it.
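
The order of events behind that surprise can be walked through with a
small simulation. It is only an illustration of the sequence described
above: the dict-based phy state, the host names and the helper
functions are hypothetical, and no real expander or utility is driven.

```python
STP_BUSY = "OPEN_REJECT (STP RESOURCES BUSY)"


def stp_open(phy, initiator):
    """Open an STP connection; the first opener gains the affiliation."""
    owner = phy.setdefault("affiliation", initiator)
    return "OPEN_ACCEPT" if owner == initiator else STP_BUSY


def clear_affiliation(phy):
    """Stand-in for a utility asking the expander to drop the affiliation."""
    phy.pop("affiliation", None)


def surprise_timeline():
    """Replay the sequence: A owns, affiliation cleared, B claims, A returns."""
    phy = {}
    events = []
    events.append(("host_A discovers the disk", stp_open(phy, "host_A")))
    clear_affiliation(phy)
    events.append(("host_B rescans and claims", stp_open(phy, "host_B")))
    # host_A still believes it owns the disk; its next access is rejected.
    events.append(("host_A issues more I/O", stp_open(phy, "host_A")))
    return events


if __name__ == "__main__":
    for step, result in surprise_timeline():
        print("%-28s %s" % (step, result))
```

The last step is the interesting one: the original owner, which did
nothing wrong, now sees rejects on a disk it had been using normally.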

Doug Gilbert


