From: Bjorn Helgaas <helgaas@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: Yihang Li <liyihang9@huawei.com>,
cassel@kernel.org, James.Bottomley@hansenpartnership.com,
martin.petersen@oracle.com, john.g.garry@oracle.com,
yanaijie@huawei.com, linux-kernel@vger.kernel.org,
linux-scsi@vger.kernel.org, linuxarm@huawei.com,
chenxiang66@hisilicon.com, prime.zeng@huawei.com,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
Bjorn Helgaas <bhelgaas@google.com>
Subject: Re: [bug report] scsi: SATA devices missing after FLR is triggered during HBA suspended
Date: Wed, 26 Jun 2024 10:15:46 -0500
Message-ID: <20240626151546.GA1466906@bhelgaas>
In-Reply-To: <b39b4a5b-07b7-483b-9c42-3ac80503120d@kernel.org>
On Mon, Jun 24, 2024 at 09:10:41AM +0900, Damien Le Moal wrote:
> On 6/22/24 12:31 PM, Yihang Li wrote:
> > Hi Damien,
> >
> > Thanks for your reply.
> >
> > On 2024/6/19 7:11, Damien Le Moal wrote:
> >> On 6/18/24 22:29, Yihang Li wrote:
> >>> Hi Damien,
> >>>
> >>> I found out that two issues are caused by commits 0c76106cb975 ("scsi: sd:
> >>> Fix TCG OPAL unlock on system resume") and 626b13f015e0 ("scsi: Do not
> >>> rescan devices with a suspended queue").
> >>>
> >>> The two issues are as follows, for the case where ATA disks are
> >>> connected to a SAS controller:
> >>
> >> Which controller ? What is the driver ?
> >
> > I'm using the hisi_sas_v3_hw driver and it supports HiSilicon's SAS controller.
>
> I do not have access to this HBA, but I have one that uses libsas/pm8001 driver
> so I will try to test with that.
>
> >>> (1) FLR is triggered after all disks and controller are suspended. As a
> >>> result, the number of disks is abnormal.
> >>
> >> I am assuming here that FLR means PCI "Function Level Reset" ?
> >
> > Yes, I am talking about the PCI "Function Level Reset"
> >
> >> FLR and disk/controller suspend execution timing are unrelated. FLR can be
> >> triggered at any time through sysfs. So please give details here. Why is FLR
> >> done when the system is being suspended ?
> >
> > Yes, it is because FLR can be triggered at any time that we are testing the
> > reliability of executing FLR commands after disk/controller suspended.
>
> "can be triggered" ? FLR is not a random asynchronous event. It is an action
> that is *issued* by a user with sys admin rights. And such users can do a lot
> of things that can break a machine...
>
> I fail to see the point of doing a function reset while the device is
> suspended. But granted, I guess the device should come back up in such a case,
> though I would like to hear what the PCI guys have to say about this.
>
> Bjorn,
>
> Is resetting a suspended PCI device something that should be/is supported ?

I doubt it. The PCI core should be preserving all the generic PCI
state across suspend/resume. The driver should only need to
save/restore device-specific things the PCI core doesn't know about.

A reset will clear out most state, and the driver doesn't know the
reset happened, so it will expect most device state to have been
preserved.

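To make the division of labor concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the classes, the register names (`command`, `bar0`, `irq_mask`, `phy_enabled`) and the ordering of the callbacks are all hypothetical and do not correspond to the real PCI core, to hisi_sas_v3_hw, or to any real device. It only illustrates why a driver that restores just the state it saved can end up with a half-configured device if a reset happens behind its back.

```python
from dataclasses import dataclass, field

# Hypothetical register names and reset values; not taken from any real device.
GENERIC_RESET = {"command": 0x0, "bar0": 0x0}
SPECIFIC_RESET = {"irq_mask": 0x0, "phy_enabled": False}


@dataclass
class ToyDevice:
    generic: dict = field(default_factory=lambda: dict(GENERIC_RESET))
    specific: dict = field(default_factory=lambda: dict(SPECIFIC_RESET))

    def reset(self):
        """Model a function reset: all state returns to its reset value."""
        self.generic = dict(GENERIC_RESET)
        self.specific = dict(SPECIFIC_RESET)


class ToyCore:
    """Stands in for the PCI core: saves and restores generic state only."""

    def __init__(self):
        self.saved = {}

    def suspend(self, dev):
        self.saved = dict(dev.generic)

    def resume(self, dev):
        dev.generic = dict(self.saved)


class ToyDriver:
    """Stands in for a driver that saves only what it thinks it must."""

    def bring_up(self, dev):
        dev.generic.update(command=0x6, bar0=0xF0000000)
        dev.specific.update(irq_mask=0xFF, phy_enabled=True)

    def suspend(self, dev):
        # Assumption: the driver saves irq_mask and expects the device
        # to keep phy_enabled on its own while suspended.
        self.saved_irq_mask = dev.specific["irq_mask"]

    def resume(self, dev):
        dev.specific["irq_mask"] = self.saved_irq_mask


def suspend_resume_cycle(reset_while_suspended):
    """Run one suspend/resume cycle and return the device afterwards."""
    dev, core, drv = ToyDevice(), ToyCore(), ToyDriver()
    drv.bring_up(dev)
    drv.suspend(dev)
    core.suspend(dev)
    if reset_while_suspended:
        dev.reset()  # nobody tells the driver this happened
    core.resume(dev)
    drv.resume(dev)
    return dev
```

In this model, `suspend_resume_cycle(False)` returns a device with `phy_enabled` still set, while `suspend_resume_cycle(True)` returns one whose generic state and `irq_mask` were restored but whose `phy_enabled` is back at its reset value, because nothing saved it and the driver had no reason to reprogram it.
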
Bjorn