From: Hannes Reinecke <hare@suse.de>
To: device-mapper development <dm-devel@redhat.com>
Subject: Re: Failed path will not be recovered when disabling/enabling remote port
Date: Thu, 02 Jul 2009 15:16:57 +0200
Message-ID: <4A4CB349.2050601@suse.de>
In-Reply-To: <20090702130619.GA21821@mars.virtualiron.com>

Hi all,

Konrad Rzeszutek wrote:
> On Thu, Jul 02, 2009 at 01:44:18PM +0200, Hannes Reinecke wrote:
>> Christian May wrote:
>>> Hi,
>>>
>>> I've set up an IBM z10 LPAR (mainframe server) with a 2.6.30 kernel.
>>> Attached to the System z10 was an IBM DS8000 storage server. 10x SCSI
>>> LUNs were assigned to the LPAR via two paths:
>>>
>>> Example:
>>> 36005076303ffc1040000000000001269 dm-9 IBM,2107900
>>> size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
>>> `-+- policy='round-robin 0' prio=-2 status=active
>>> |- 0:0:0:1080639506 sdw 65:96 active undef running
>>> `- 1:0:1:1080639506 sdt 65:48 active undef running
>>>
>>> Special parameter setting: dev_loss_tmo=90sec; fast_io_fail_tmo=5sec
>>>
>>> multipath tools: multipath-tools v0.4.9 (04/04, 2009)
>>> device-mapper: device-mapper-1.02.27-7.fc10.s390x,
>>> device-mapper-libs-1.02.27-7.fc10.s390x
>>>
>>> When removing a remote port (disabling a port on the BROCADE FC switch)
>>> one path failed.
>>>
>>> [root@h42lp26/ESAME:~]
>>>> multipath -l
>>> 36005076303ffc1040000000000001268 dm-8 ,
>>> size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
>>> `-+- policy='round-robin 0' prio=-2 status=active
>>> |- #:#:#:# - #:# failed undef running
>>> `- 1:0:1:1080573970 sdr 65:16 active undef running
>>>
>>> After a while (>90sec) SCSI LUNs were removed from system:
>>>
>> [ .. ]
>>> When re-enabling the path, SCSI LUNS were reassigned to system but path
>>> didn't recover:
>>>
>> [ .. ]
>>
>>>
>>> [root@h42lp26/ESAME:~]
>>>> multipath -l
>>> 36005076303ffc1040000000000001268 dm-8 ,
>>> size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
>>> `-+- policy='round-robin 0' prio=-2 status=active
>>> |- #:#:#:# - #:# failed undef running
>>> `- 1:0:1:1080573970 sdr 65:16 active undef running
>>>
>>>
>>> Running the "multipath" command recovers the failed path, but that's not
>>> the way it should be...can somebody help to fix this? Why is the path not
>>> recovered automatically?
>>>
>> It should, really.
>>
>> The problem is that the paths have _not_ been reconnected;
>> the hashes indicate that the in-kernel multipath code references
>> a device for which no information is available.
>> And the new device has _not_ been reconnected, as otherwise
>> you'd end up with _three_ paths here.
>>
>> Probably missing udev integration.
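
For illustration, the placeholder entries described above can be spotted mechanically. What follows is a minimal Python sketch, not part of multipath-tools: the function name `find_orphaned_paths` and the parsing approach are assumptions, and it relies only on the `multipath -l` layout quoted in this thread, where a path without device information shows up as `#:#:#:#`.

```python
import re

# A path line whose H:C:T:L field is the '#:#:#:#' placeholder,
# as seen in the output quoted in this thread.
_PLACEHOLDER = re.compile(r"#:#:#:#")
# A map header line looks like: "<wwid> dm-N <vendor>,<product>"
_MAP_HEADER = re.compile(r"^(\S+)\s+(dm-\d+)\b")


def find_orphaned_paths(multipath_output: str) -> dict:
    """Return {dm name: number of placeholder path lines} for each map.

    Hypothetical helper: maps with no placeholder lines are omitted.
    """
    result = {}
    current = None
    for line in multipath_output.splitlines():
        header = _MAP_HEADER.match(line)
        if header:
            # Remember which map the following path lines belong to.
            current = header.group(2)
            continue
        if current is not None and _PLACEHOLDER.search(line):
            result[current] = result.get(current, 0) + 1
    return result
```

Fed the `multipath -l` text above, this would report one placeholder path for `dm-8`. It only detects the symptom; it says nothing about why the new device was not picked up.
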
>
> Could also be a race condition that is present in SLES10 + RHEL5
> kernels, where the SysFS directories are created (and the udev event is
> sent out), but the kernel hasn't populated the SysFS directories yet. So
> when multipathd tries to read them it finds no pertinent information and
> shoves the path off to the 'orphan' state.
>
Really? With SLES10? Have you actually observed this?
We're running multipath _after_ udev has processed the event.
And udev already waited for sysfs, so we should be safe there.
It might be applicable to mainline multipath-tools, but
the SLES10 one ... I'd be surprised.
Well, reasonably surprised. multipath keeps on throwing
an amazing number of issues still.
Do you have more information here?
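
To make the suspected race concrete: a reader that arrives right after the udev event might find the sysfs attribute files missing or still empty. Below is a minimal Python sketch of a bounded wait. It is an illustration only, not how udev or multipathd is implemented; the function name `wait_for_attrs`, the choice of attributes, and the timeout values are all assumptions.

```python
import os
import time


def _attr_ready(path):
    """True if the attribute file can be read and is non-empty."""
    try:
        with open(path, "r") as f:
            return f.read().strip() != ""
    except OSError:
        # File not there yet, or not readable yet.
        return False


def wait_for_attrs(device_dir, attrs, timeout=5.0, interval=0.1):
    """Poll until every attribute in device_dir is populated.

    Returns True once all attributes are readable and non-empty,
    False if the timeout (an assumed value) expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if all(_attr_ready(os.path.join(device_dir, a)) for a in attrs):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A call such as `wait_for_attrs("/sys/block/sdw/device", ["vendor", "model", "state"])` would then return False if the directory exists but the attributes never fill in, which is the situation described in the quoted text. The device name is taken from the output earlier in this thread purely as an example.
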
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)