* Should multipath detect changed path UIDs?
@ 2005-01-26 21:49 goggin, edward
2005-01-30 6:14 ` Tim Pepper
0 siblings, 1 reply; 4+ messages in thread
From: goggin, edward @ 2005-01-26 21:49 UTC (permalink / raw)
To: 'dm-devel, 'linux-scsi@vger.kernel.org'
Cc: 'christophe varoqui'
Simply pulling two FC cables from a host and re-cabling incorrectly
(think a switch of two HBA cables) or a surprise reconfiguration of a
storage system target could lead to a situation where the potential for
data corruption is ripe. While it may not be reasonably possible to
prevent data corruption in this scenario (think I/O already queued to
target devices underneath the multipath target driver), the prudent
course of action may be to try to prevent data corruption whenever
the potential is discovered using reasonable means.
Besides such pilot error as described above, the pre-requisites for
the cable switch scenario include at least (1) having the cable disconnect
and subsequent re-connect events occur while no user or path checker
initiated I/O occurs to paths using the switched cables and (2) the
target-side connectivity for the two cables is asymmetrical.
Possible solutions could involve (1) detecting Fibre Channel disconnect
hotplug events and acting upon them or (2) modifying multipath checker
functions to verify a path's UID remains consistent in addition to verifying
path connectivity.
EMC's multipathing product PowerPath uses the latter approach for
testing paths to SCSI logical units on both Symmetrix and CLARiiON
storage systems. Although using a single I/O for path testing may not
be possible for many storage systems, the PowerPath checker function
for both Symmetrix and CLARiiON storage determines the UID for the
SCSI logical unit from reply (or sense) information returned by a single
test I/O and fails a path if the UID is not consistent.
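The second option could be sketched roughly as follows. This is only an
illustration in Python, not PowerPath or multipath-tools code: `Path`,
`PathState`, `check_path` and the `probe` callable are hypothetical names,
and `probe` stands in for whatever single test I/O yields the logical
unit's UID (or fails).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class PathState(Enum):
    UP = "up"
    DOWN = "down"                # test I/O failed: no connectivity
    UID_CHANGED = "uid_changed"  # connected, but to a different logical unit


@dataclass
class Path:
    name: str
    expected_uid: str  # UID recorded when the path was first added to the map


def check_path(path: Path, probe: Callable[[str], Optional[str]]) -> PathState:
    """Check connectivity and UID consistency with one probe call.

    probe(name) is a hypothetical helper: it returns the UID reported by
    the device behind the path, or None if the test I/O failed.
    """
    uid = probe(path.name)
    if uid is None:
        return PathState.DOWN
    if uid != path.expected_uid:
        # The path answers, but it no longer leads to the same logical unit.
        return PathState.UID_CHANGED
    return PathState.UP
```

A caller would presumably fail the path on either `DOWN` or `UID_CHANGED`,
but treat the second case as something to report rather than silently
reinstate once the test I/O succeeds again.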
* Should multipath detect changed path UIDs?
@ 2005-01-26 21:50 goggin, edward
2005-01-26 22:13 ` Lars Marowsky-Bree
0 siblings, 1 reply; 4+ messages in thread
From: goggin, edward @ 2005-01-26 21:50 UTC (permalink / raw)
To: 'dm-devel@redhat.com'
Simply pulling two FC cables from a host and re-cabling incorrectly
(think a switch of two HBA cables) or a surprise reconfiguration of a
storage system target could lead to a situation where the potential for
data corruption is ripe. While it may not be reasonably possible to
prevent data corruption in this scenario (think I/O already queued to
target devices underneath the multipath target driver), the prudent
course of action may be to try to prevent data corruption whenever
the potential is discovered using reasonable means.
Besides such pilot error as described above, the pre-requisites for
the cable switch scenario include at least (1) having the cable disconnect
and subsequent re-connect events occur while no user or path checker
initiated I/O occurs to paths using the switched cables and (2) the
target-side connectivity for the two cables is asymmetrical.
Possible solutions could involve (1) detecting Fibre Channel disconnect
hotplug events and acting upon them or (2) modifying multipath checker
functions to verify a path's UID remains consistent in addition to verifying
path connectivity.
EMC's multipathing product PowerPath uses the latter approach for
testing paths to SCSI logical units on both Symmetrix and CLARiiON
storage systems. Although using a single I/O for path testing may not
be possible for many storage systems, the PowerPath checker function
for both Symmetrix and CLARiiON storage determines the UID for the
SCSI logical unit from reply (or sense) information returned by a single
test I/O and fails a path if the UID is not consistent.
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
* Re: Should multipath detect changed path UIDs?
2005-01-26 21:50 goggin, edward
@ 2005-01-26 22:13 ` Lars Marowsky-Bree
0 siblings, 0 replies; 4+ messages in thread
From: Lars Marowsky-Bree @ 2005-01-26 22:13 UTC (permalink / raw)
To: device-mapper development
On 2005-01-26T16:50:50, "goggin, edward" <egoggin@emc.com> wrote:
> Possible solutions could involve (1) detecting fiber channel disconnect
> hotplug events and acting upon them or (2) modifying multipath checker
> functions to verify a path's UID remains consistent in addition to verifying
> path connectivity.
If you look at the emc_clariion checker in multipath-tools, you'll find
it does that.
I've this handy paper called "Developing Multipath Software for EMC
CLARiiON Arrays" in front of me and believe that we implement it
already, as far as possible. ;-)
(There's one missing piece, it's handling the specific sense codes, but
you'll find that they are already coded in dm-emc.c in the kernel, just
disabled until we finally get the patch to get at the SCSI SENSE data in
the bio struct. I need to pester axboe about it again.)
Maybe you could add the Symmetrix specific handling to the emc_clariion
path checker and the dm-emc.c kernel module? I'd be glad to make it more
powerful than "just" the CX/AX arrays, alas, I "only" have a CX-500 for
playing with ;-)
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
* Re: Should multipath detect changed path UIDs?
2005-01-26 21:49 Should multipath detect changed path UIDs? goggin, edward
@ 2005-01-30 6:14 ` Tim Pepper
0 siblings, 0 replies; 4+ messages in thread
From: Tim Pepper @ 2005-01-30 6:14 UTC (permalink / raw)
To: goggin, edward
Cc: 'dm-devel, linux-scsi@vger.kernel.org, christophe varoqui
Having the checker check for this change isn't a solution...it
shouldn't prevent any data corruption. Even if you check prior to
every IO, you still have an obvious race between the check and the
subsequent IO. You gain the ability to recognise the problem and
somehow magically warn the user (e.g. disallow all subsequent IO to
catch their attention). So then what is a reasonable frequency
for the check? How much corruption will you tolerate, because a lot
of IO can go out an FC connection in a short time.
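To put a rough number on that window, here is a back-of-the-envelope
sketch in Python. The figures are assumptions for illustration only (a
link sustaining about 200 MB/s and a checker that runs every 5 seconds);
`exposure_bytes` is a hypothetical helper, not part of any tool.

```python
def exposure_bytes(throughput_mb_per_s: float, check_interval_s: float) -> float:
    """Upper bound on the bytes that can go out between two consecutive checks."""
    return throughput_mb_per_s * 1_000_000 * check_interval_s


# Assumed figures: ~200 MB/s sustained, checker interval of 5 s.
print(exposure_bytes(200, 5) / 1e9, "GB")  # prints "1.0 GB"
```

Even a much shorter interval only shrinks the window; it does not close
the race between the check and the next I/O.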
Also, I'd think your #1 pre-requisite isn't quite right. I'd say the
hole is wider and that you probably could have IO running heavily, as
long as none of it is failed by the lower layers in response to the
missing cable (and the default retries/timeouts could easily mask even
a clumsy cable swap).
Personally I'm also not inclined to think triggering hot plug events
on the cable removal/replacement is ideal. I think fibre channel
isn't exactly intended as a dynamic environment and the SAN topology
is static or expected to be static from a given host's perspective.
(I could be wrong.) If this is the case, it would be helpful for a
host (or its admin) to be able to know that topology and its health
over time instead of only knowing the healthy portion of the topology
because the unhealthy parts are removed from its mappings. Problem
determination seems like it's easier if you have a list of bad parts
instead of just a shrinking list of good parts.
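As a rough sketch of what "a list of bad parts" might look like, the
Python below keeps every known path and records state changes rather
than dropping failed paths. All names here (`PathRecord`, `Topology`,
the path labels) are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PathRecord:
    name: str
    healthy: bool = True
    # (timestamp, event) pairs, oldest first
    history: List[Tuple[float, str]] = field(default_factory=list)


class Topology:
    """Keeps every known path, including failed ones, with its history."""

    def __init__(self) -> None:
        self.paths: Dict[str, PathRecord] = {}

    def add(self, name: str, when: float) -> None:
        self.paths[name] = PathRecord(name, history=[(when, "added")])

    def mark(self, name: str, healthy: bool, when: float, event: str) -> None:
        # Record the state change instead of removing the path.
        rec = self.paths[name]
        rec.healthy = healthy
        rec.history.append((when, event))

    def bad_paths(self) -> List[str]:
        return [p.name for p in self.paths.values() if not p.healthy]
```

With this shape, a lost link stays visible through `bad_paths()` and its
history, instead of simply vanishing from the set of usable paths.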
Shouldn't the HBA hardware/software be able to recognise and discern
between different classes of fabric failures, HBA port link loss and
the return of a link that puts the port in a different place in the
SAN? Does the HBA API give a common way to get detailed info out to
userspace daemons?