From: Damien Le Moal <dlemoal@kernel.org>
To: Kai-Heng Feng <kai.heng.feng@canonical.com>,
Alan Stern <stern@rowland.harvard.edu>
Cc: linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org,
"Martin K . Petersen" <martin.petersen@oracle.com>,
Joe Breuer <linux-kernel@jmbreuer.net>,
Hannes Reinecke <hare@suse.de>,
"Rafael J . Wysocki" <rafael@kernel.org>
Subject: Re: [PATCH] ata: libata-scsi: Avoid deadlock on rescan after device resume
Date: Fri, 16 Jun 2023 14:28:18 +0900
Message-ID: <a0b41e35-0955-1e8f-6654-9a0c8559e9a4@kernel.org>
In-Reply-To: <CAAd53p7+-uXf+wiZkAxSPnjSY7oC6crtfKURptuWCuM7vDAMZw@mail.gmail.com>
On 6/16/23 12:32, Kai-Heng Feng wrote:
> On Thu, Jun 15, 2023 at 10:50 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>>
>> On Thu, Jun 15, 2023 at 05:33:26PM +0900, Damien Le Moal wrote:
>>> When an ATA port is resumed from sleep, the port is reset and a power
>>> management request is issued to libata EH to reset the port and rescan
>>> the device(s) attached to the port. Device rescanning is done by
>>> scheduling an ata_scsi_dev_rescan() work, which will execute
>>> scsi_rescan_device().
>>>
>>> However, scsi_rescan_device() takes the generic device lock, which is
>>> also taken by dpm_resume() when the SCSI device is resumed as well. If
>>> a device rescan execution starts before the completion of the SCSI
>>> device resume, the RCU locking used to refresh the cached VPD pages of
>>> the device, combined with the generic device locking from
>>> scsi_rescan_device() and from dpm_resume() can cause a deadlock.
>>>
>>> Avoid this situation by changing struct ata_port scsi_rescan_task to be
>>> a delayed work instead of a simple work_struct. ata_scsi_dev_rescan() is
>>> modified to check if the SCSI device associated with the ATA device that
>>> must be rescanned is not suspended. If the SCSI device is still
>>> suspended, ata_scsi_dev_rescan() returns early and reschedules itself
>>> for execution after an arbitrary delay of 5ms.
>>
>> I don't understand the nature of the relationship between the ATA port
>> and the corresponding SCSI device. Maybe you could explain it more
>> fully, if you have time.
>>
>> But in any case, this approach seems like a layering violation. Why not
>> instead call a SCSI utility routine to set a "needs_rescan" flag in the
>> scsi_device structure? Then scsi_device_resume() could automatically
>> call scsi_rescan_device() -- or rather an internal version that assumes
>> the device lock is already held -- if the flag is set. Or it could
>> queue a non-delayed work routine to do this. (Is it important to have
>> the rescan finish before userspace starts up and tries to access the ATA
>> device again?)
>>
>> That, combined with a guaranteed order of resuming, would do what you
>> want, right?
>
> What you are suggesting is pretty much like my previous approach:
> https://lore.kernel.org/all/20230502150435.423770-2-kai.heng.feng@canonical.com/
Not really. We need more than what you did. See my reply to Alan.
Your solution is rather similar to what I did, but it delayed the rescan until
after the entire system had resumed (pm_suspend_target_state != PM_SUSPEND_ON),
which is really a heavy hammer and would significantly slow down resuming.
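
For illustration only, here is a minimal userspace sketch of the
"return early and retry after 5ms while the SCSI device is still suspended"
pattern described in the quoted commit message. It is not the kernel patch:
struct fake_sdev, try_rescan(), run_rescan_work() and RESCAN_RETRY_DELAY_MS
are hypothetical names, and the delayed work is modeled with a simulated
clock instead of a real workqueue.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the SCSI device; not a kernel structure. */
struct fake_sdev {
	bool suspended;	/* true while the device resume has not completed */
};

/* The 5ms retry delay mentioned in the commit message. */
enum { RESCAN_RETRY_DELAY_MS = 5 };

/*
 * Models the early-return check: the rescan is only performed once the
 * device is no longer suspended. Returns true if the rescan ran.
 */
bool try_rescan(const struct fake_sdev *sdev)
{
	if (sdev->suspended)
		return false;	/* caller must reschedule */
	return true;		/* a real rescan would happen here */
}

/*
 * Simulates the delayed work rescheduling itself. The device is assumed
 * to finish resuming at resume_after_ms on a simulated clock. Returns the
 * simulated time (in ms) at which the rescan finally ran.
 */
int run_rescan_work(struct fake_sdev *sdev, int resume_after_ms)
{
	int now_ms = 0;

	for (;;) {
		/* Update the simulated device state for the current time. */
		sdev->suspended = now_ms < resume_after_ms;
		if (try_rescan(sdev))
			return now_ms;
		/* Still suspended: retry after the fixed delay. */
		now_ms += RESCAN_RETRY_DELAY_MS;
	}
}
```

In this model, a device that finishes resuming at 12ms is rescanned at 15ms,
the first retry at or after the resume completes, so the added latency is
bounded by one retry interval rather than by the whole system resume.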
>
> Kai-Heng
>
>>
>> Alan Stern
--
Damien Le Moal
Western Digital Research