From: Jason Yan <yanaijie@huawei.com>
To: John Garry <john.garry@huawei.com>,
martin.petersen@oracle.com, jejb@linux.vnet.ibm.com
Cc: linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
zhaohongjiang@huawei.com, hare@suse.com,
dan.j.williams@intel.com, jthumshirn@suse.de, hch@lst.de,
huangdaode@hisilicon.com, chenxiang66@hisilicon.com,
xiexiuqi@huawei.com, tj@kernel.org, miaoxie@huawei.com,
Xiaofei Tan <tanxiaofei@huawei.com>,
Ewan Milne <emilne@redhat.com>, Tomas Henzl <thenzl@redhat.com>
Subject: Re: [PATCH v2 7/7] scsi: libsas: fix issue of swapping two sas disks
Date: Fri, 1 Feb 2019 10:04:31 +0800 [thread overview]
Message-ID: <5C53A92F.50608@huawei.com> (raw)
In-Reply-To: <1135102e-e563-d3ab-9b44-d8691c3e6ccb@huawei.com>
On 2019/2/1 0:34, John Garry wrote:
> On 31/01/2019 02:55, Jason Yan wrote:
>>
>>
>> On 2019/1/31 1:53, John Garry wrote:
>>> On 30/01/2019 08:24, Jason Yan wrote:
>>>> The work flow of revalidation now is scanning expander phy by the
>>>> sequence of the phy and check if the phy have changed. This will leads
>>>> to an issue of swapping two sas disks on one expander.
>>>>
>>>> Assume we have two sas disks, connected with expander phy10 and phy11:
>>>>
>>>> phy10: 5000cca04eb1001d port-0:0:10
>>>> phy11: 5000cca04eb043ad port-0:0:11
>>>>
>>>> Swap these two disks, and imaging the following scenario:
>>>>
>>>> revalidation 1:
>>>
>>> What does "revalidation 1" actually mean?
>>
>> 'revalidation 1' means one entry in sas_discover_domain().
>>
>>>
>>>> -->phy10: 0 --> delete phy10 domain device
>>>> -->phy11: 5000cca04eb043ad (no change)
>>>
>>> so is disk 11 still inserted at this stage?
>>
>> Maybe, but that's what we read from the hardware.
>>
>>>
>>>> revalidation done
>>>>
>>>> revalidation 2:
>>>
>>> is port-0:0:10 deleted now?
>>>
>>
>> Yes. But we don't care about it.
>>
>>>> -->step 1, check phy10:
>>>> -->phy10: 5000cca04eb043ad --> add to wide port(port-0:0:11) (phy11
>>>> address is still 5000cca04eb043ad now)
>
> We do not want this to happen and it seems to be the crux of the problem.
>
> As an alternate to your solution, how about check if the PHY is an end
> device. If so, it should not form/join a wideport; that is, apart from
> dual-port disks, which I am not sure about - I think each port still has
> a unique WWN, so should be ok.
>
If the PHY do not join a wideport, then it have to form a wideport of
it's own. I'm not sure if we can have two ports with the same address
and do not break anything?
>>>
>>> So this should not happen. How are you physcially swapping them such
>>> that phy11 address is still 5000cca04eb043ad? I don't see how this would
>>> be true at revalidation 1.
>>>
>>
>> This issue is because we always process the PHYs from 0 to max phy
>> number. And please be aware of the real physcial address of the PHY and
>> the address stored in the memory is not always the same.
>> Actually when you checking phy10, phy11 physcial address is not
>> 5000cca04eb043ad. But the address stored in domain device is still
>> 5000cca04eb043ad. We have not get a chance to to read it because we are
>> processing phy10 now, right?
>>
>
> I see.
>
>> It's very easy to reproduce. I suggest you to do it yourself and look at
>> the logs.
>>
>
> I can't physically access the backpane, and this is not the sort of
> thing which is easy to fake by hacking the driver.
>
> And the log which you provided internally does not have much - if any -
> libsas logs to help me understand it.
>
>>>>
>>>> -->step 2, check phy11:
>>>> -->phy11: 0 --> phy11 address is 0 now, but it's part of wide
>>>> port(port-0:0:11), the domain device will not be deleted.
>>>> revalidation done
>>>>
>>>> revalidation 3:
>>>> -->phy10, 5000cca04eb043ad (no change)
>>>> -->phy11: 5000cca04eb1001d --> try to add port-0:0:11 but failed,
>>>> port-0:0:11 already exist, trigger a warning as follows
>>>> revalidation done
>>>>
>>>> [14790.189699] sysfs: cannot create duplicate filename
>>>> '/devices/pci0000:74/0000:74:02.0/host0/port-0:0/expander-0:0/port-0:0:11'
>>>>
>>>>
>>>>
>>>> [14790.201081] CPU: 25 PID: 5031 Comm: kworker/u192:3 Not tainted
>>>> 4.16.0-rc1-191134-g138f084-dirty #228
>>>> [14790.210199] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 EC
>>>> UEFI
>>>> Nemo 2.0 RC0 - B303 05/16/2018
>>>> [14790.219323] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain
>>>> [14790.225404] Call trace:
>>>> [14790.227842] dump_backtrace+0x0/0x18c
>>>> [14790.231492] show_stack+0x14/0x1c
>>>> [14790.234798] dump_stack+0x88/0xac
>>>> [14790.238101] sysfs_warn_dup+0x64/0x7c
>>>> [14790.241751] sysfs_create_dir_ns+0x90/0xa0
>>>> [14790.245835] kobject_add_internal+0xa0/0x284
>>>> [14790.250092] kobject_add+0xb8/0x11c
>>>> [14790.253570] device_add+0xe8/0x598
>>>> [14790.256960] sas_port_add+0x24/0x50
>>>> [14790.260436] sas_ex_discover_devices+0xb10/0xc30
>>>> [14790.265040] sas_ex_revalidate_domain+0x1d8/0x518
>>>> [14790.269731] sas_revalidate_domain+0x12c/0x154
>>>> [14790.274163] process_one_work+0x128/0x2b0
>>>> [14790.278160] worker_thread+0x14c/0x408
>>>> [14790.281897] kthread+0xfc/0x128
>>>> [14790.285026] ret_from_fork+0x10/0x18
>>>> [14790.288598] ------------[ cut here ]------------
>>>>
>>>> At last, the disk 5000cca04eb1001d is lost.
>
>
> .
>
next prev parent reply other threads:[~2019-02-01 2:04 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-30 8:24 [PATCH v2 0/7] libsas: fix issue of swapping or replacing disks Jason Yan
2019-01-30 8:24 ` [PATCH v2 1/7] scsi: libsas: reset the negotiated_linkrate when phy is down Jason Yan
2019-01-30 13:08 ` John Garry
2019-01-31 1:11 ` Jason Yan
2019-01-31 9:00 ` John Garry
2019-01-30 8:24 ` [PATCH v2 2/7] scsi: libsas: only clear phy->in_shutdown after shutdown event done Jason Yan
2019-01-30 16:26 ` John Garry
2019-01-31 1:13 ` Jason Yan
2019-01-30 8:24 ` [PATCH v2 3/7] scsi: libsas: optimize the debug print of the revalidate process Jason Yan
2019-01-30 16:41 ` John Garry
2019-01-31 1:31 ` Jason Yan
2019-01-31 10:25 ` John Garry
2019-01-30 8:24 ` [PATCH v2 4/7] scsi: libsas: split the replacement of sas disks in two steps Jason Yan
2019-01-30 17:22 ` John Garry
2019-01-31 2:04 ` Jason Yan
2019-01-31 10:29 ` John Garry
2019-01-31 16:38 ` John Garry
2019-02-01 1:58 ` Jason Yan
2019-02-01 9:34 ` John Garry
2019-01-30 8:24 ` [PATCH v2 5/7] scsi: libsas: check if the same device when flutter Jason Yan
2019-01-30 8:24 ` [PATCH v2 6/7] scsi: libsas: reset the phy address if discover failed Jason Yan
2019-01-30 17:36 ` John Garry
2019-01-31 2:13 ` Jason Yan
2019-01-31 9:10 ` John Garry
2019-01-30 8:24 ` [PATCH v2 7/7] scsi: libsas: fix issue of swapping two sas disks Jason Yan
2019-01-30 17:53 ` John Garry
2019-01-31 2:55 ` Jason Yan
2019-01-31 16:34 ` John Garry
2019-02-01 2:04 ` Jason Yan [this message]
2019-02-01 9:27 ` John Garry
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5C53A92F.50608@huawei.com \
--to=yanaijie@huawei.com \
--cc=chenxiang66@hisilicon.com \
--cc=dan.j.williams@intel.com \
--cc=emilne@redhat.com \
--cc=hare@suse.com \
--cc=hch@lst.de \
--cc=huangdaode@hisilicon.com \
--cc=jejb@linux.vnet.ibm.com \
--cc=john.garry@huawei.com \
--cc=jthumshirn@suse.de \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=miaoxie@huawei.com \
--cc=tanxiaofei@huawei.com \
--cc=thenzl@redhat.com \
--cc=tj@kernel.org \
--cc=xiexiuqi@huawei.com \
--cc=zhaohongjiang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox