public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: John Garry <john.g.garry@oracle.com>
To: yangxingui <yangxingui@huawei.com>,
	yanaijie@huawei.com, jejb@linux.ibm.com,
	martin.petersen@oracle.com, damien.lemoal@opensource.wdc.com
Cc: linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
	linuxarm@huawei.com, prime.zeng@hisilicon.com,
	chenxiang66@hisilicon.com, kangfenglong@huawei.com
Subject: Re: [PATCH v2] scsi: libsas: Fix exp-attached end device cannot be scanned in again after probe failed
Date: Tue, 21 May 2024 07:47:49 -0400	[thread overview]
Message-ID: <afa06fba-43b4-4068-9639-ef4a6dfd3cef@oracle.com> (raw)
In-Reply-To: <c1835d80-ca48-766e-c174-d94a2d357925@huawei.com>

On 20/05/2024 09:29, yangxingui wrote:
> Friendly ping ...

I'll check when I return from vacation in a couple of days

> 
> On 2024/4/24 16:08, Xingui Yang wrote:
>> We found that it is judged as broadcast flutter when the exp-attached end
>> device reconnects after probe failed, as follows:
>>
>> [78779.654026] sas: broadcast received: 0
>> [78779.654037] sas: REVALIDATING DOMAIN on port 0, pid:10
>> [78779.654680] sas: ex 500e004aaaaaaa1f phy05 change count has changed
>> [78779.662977] sas: ex 500e004aaaaaaa1f phy05 originated 
>> BROADCAST(CHANGE)
>> [78779.662986] sas: ex 500e004aaaaaaa1f phy05 new device attached
>> [78779.663079] sas: ex 500e004aaaaaaa1f phy05:U:8 attached: 
>> 500e004aaaaaaa05 (stp)
>> [78779.693542] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] found
>> [78779.701155] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0
>> [78779.707864] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
>> ...
>> [78835.161307] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 
>> tries: 1
>> [78835.171344] sas: sas_probe_sata: for exp-attached device 
>> 500e004aaaaaaa05 returned -19
>> [78835.180879] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] is gone
>> [78835.187487] sas: broadcast received: 0
>> [78835.187504] sas: REVALIDATING DOMAIN on port 0, pid:10
>> [78835.188263] sas: ex 500e004aaaaaaa1f phy05 change count has changed
>> [78835.195870] sas: ex 500e004aaaaaaa1f phy05 originated 
>> BROADCAST(CHANGE)
>> [78835.195875] sas: ex 500e004aaaaaaa1f rediscovering phy05
>> [78835.196022] sas: ex 500e004aaaaaaa1f phy05:U:A attached: 
>> 500e004aaaaaaa05 (stp)
>> [78835.196026] sas: ex 500e004aaaaaaa1f phy05 broadcast flutter
>> [78835.197615] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0
>>
>> The cause of the problem is that the related ex_phy's 
>> attached_sas_addr was
>> not cleared after the end device probe failed. In order to solve the 
>> above
>> problem, a function sas_ex_unregister_end_dev() is defined to clear the
>> ex_phy information and unregister the end device after the 
>> exp-attached end
>> device probe failed.
>>
>> As the sata device is an asynchronous probe, the sata device may probe

SATA, almost always capitalize acronyms

>> failed after done REVALIDATING DOMAIN. Then after its port is added to 
>> the
>> sas_port_del_list, the port will not be deleted until the end of the next
>> REVALIDATING DOMAIN and sas_destruct_ports() is called. A warning about
>> creating a duplicate port will occur in the new REVALIDATING DOMAIN when
>> the end device reconnects. Therefore, the previous destroy_list and
>> sas_port_del_list should be handled before REVALIDATING DOMAIN.
>>
>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>> ---
>> Changes since v1:
>> - Simplify the process of getting ex_phy id based on Jason's suggestion.
>> - Update commit information.
>> ---
>>   drivers/scsi/libsas/sas_discover.c | 2 ++
>>   drivers/scsi/libsas/sas_expander.c | 8 ++++++++
>>   drivers/scsi/libsas/sas_internal.h | 6 +++++-
>>   3 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/libsas/sas_discover.c 
>> b/drivers/scsi/libsas/sas_discover.c
>> index 8fb7c41c0962..aae90153f4c6 100644
>> --- a/drivers/scsi/libsas/sas_discover.c
>> +++ b/drivers/scsi/libsas/sas_discover.c
>> @@ -517,6 +517,8 @@ static void sas_revalidate_domain(struct 
>> work_struct *work)
>>       struct sas_ha_struct *ha = port->ha;
>>       struct domain_device *ddev = port->port_dev;
>> +    sas_destruct_devices(port);
>> +    sas_destruct_ports(port);
>>       /* prevent revalidation from finding sata links in recovery */
>>       mutex_lock(&ha->disco_mutex);
>>       if (test_bit(SAS_HA_ATA_EH_ACTIVE, &ha->state)) {
>> diff --git a/drivers/scsi/libsas/sas_expander.c 
>> b/drivers/scsi/libsas/sas_expander.c
>> index f6e6db8b8aba..45793c10009b 100644
>> --- a/drivers/scsi/libsas/sas_expander.c
>> +++ b/drivers/scsi/libsas/sas_expander.c
>> @@ -1856,6 +1856,14 @@ static void sas_unregister_devs_sas_addr(struct 
>> domain_device *parent,
>>       }
>>   }
>> +void sas_ex_unregister_end_dev(struct domain_device *dev)
>> +{
>> +    struct domain_device *parent = dev->parent;
>> +    struct sas_phy *phy = dev->phy;
>> +
>> +    sas_unregister_devs_sas_addr(parent, phy->number, true);
>> +}
>> +
>>   static int sas_discover_bfs_by_root_level(struct domain_device *root,
>>                         const int level)
>>   {
>> diff --git a/drivers/scsi/libsas/sas_internal.h 
>> b/drivers/scsi/libsas/sas_internal.h
>> index 3804aef165ad..434f928c2ed8 100644
>> --- a/drivers/scsi/libsas/sas_internal.h
>> +++ b/drivers/scsi/libsas/sas_internal.h
>> @@ -50,6 +50,7 @@ void sas_discover_event(struct asd_sas_port *port, 
>> enum discover_event ev);
>>   void sas_init_dev(struct domain_device *dev);
>>   void sas_unregister_dev(struct asd_sas_port *port, struct 
>> domain_device *dev);
>> +void sas_ex_unregister_end_dev(struct domain_device *dev);
>>   void sas_scsi_recover_host(struct Scsi_Host *shost);
>> @@ -145,7 +146,10 @@ static inline void sas_fail_probe(struct 
>> domain_device *dev, const char *func, i
>>           func, dev->parent ? "exp-attached" :
>>           "direct-attached",
>>           SAS_ADDR(dev->sas_addr), err);
>> -    sas_unregister_dev(dev->port, dev);
>> +    if (dev->parent && !dev_is_expander(dev->dev_type))
>> +        sas_ex_unregister_end_dev(dev);
>> +    else
>> +        sas_unregister_dev(dev->port, dev);
>>   }
>>   static inline void sas_fill_in_rphy(struct domain_device *dev,
>>


  reply	other threads:[~2024-05-21 11:48 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-24  8:08 [PATCH v2] scsi: libsas: Fix exp-attached end device cannot be scanned in again after probe failed Xingui Yang
2024-05-20 13:29 ` yangxingui
2024-05-21 11:47   ` John Garry [this message]
2024-05-22  3:51   ` Jason Yan
2024-05-24  8:36 ` John Garry
2024-05-25  3:08   ` yangxingui
2024-05-28 10:11     ` John Garry
2024-05-31  8:53       ` yangxingui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afa06fba-43b4-4068-9639-ef4a6dfd3cef@oracle.com \
    --to=john.g.garry@oracle.com \
    --cc=chenxiang66@hisilicon.com \
    --cc=damien.lemoal@opensource.wdc.com \
    --cc=jejb@linux.ibm.com \
    --cc=kangfenglong@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=martin.petersen@oracle.com \
    --cc=prime.zeng@hisilicon.com \
    --cc=yanaijie@huawei.com \
    --cc=yangxingui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox