linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>,
	Alan Stern <stern@rowland.harvard.edu>
Cc: Hannes Reinecke <hare@suse.de>,
	Joe Breuer <linux-kernel@jmbreuer.net>,
	Bagas Sanjaya <bagasdotme@gmail.com>, Pavel Machek <pavel@ucw.cz>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Len Brown <len.brown@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Kees Cook <keescook@chromium.org>,
	Tony Luck <tony.luck@intel.com>,
	"Guilherme G. Piccoli" <gpiccoli@igalia.com>,
	Thorsten Leemhuis <linux@leemhuis.info>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Phillip Potter <phil@philpotter.co.uk>,
	Linux Power Management <linux-pm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Hardening <linux-hardening@vger.kernel.org>,
	Linux Regressions <regressions@lists.linux.dev>,
	Linux SCSI <linux-scsi@vger.kernel.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Hannes Reinecke <hare@suse.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Martin Kepplinger <martin.kepplinger@puri.sm>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>
Subject: Re: Fwd: Waking up from resume locks up on sr device
Date: Thu, 15 Jun 2023 07:44:39 +0900	[thread overview]
Message-ID: <bf142e7d-178e-43a8-32e8-7e9e396eeee7@kernel.org> (raw)
In-Reply-To: <3f85cb4a-8b14-623f-eb4e-40baab1ed888@acm.org>

On 6/15/23 03:04, Bart Van Assche wrote:
> On 6/14/23 07:26, Alan Stern wrote:
>> On Wed, Jun 14, 2023 at 04:35:50PM +0900, Damien Le Moal wrote:
>>> Or... Why the heck scsi_rescan_device() is calling device_lock() ? This
>>> is the only place in scsi code I can see that takes this lock. I suspect
>>> this is to serialize either rescans, or serialize with resume, or both.
>>> For serializing rescans, we can use another lock. For serializing with
>>> PM, we should wait for PM transitions...
>>> Something is not right here.
>>
>> Here's what commit e27829dc92e5 ("scsi: serialize ->rescan against
>> ->remove", written by Christoph Hellwig) says:
>>
>>      Lock the device embedded in the scsi_device to protect against
>>      concurrent calls to ->remove.
>>
>> That's the commit which added the device_lock() call.
> 
> Even if scsi_rescan_device() would use another mechanism for 
> serialization against sd_remove() and sr_remove(), we still need to 
> solve the issue that the ATA code calls scsi_rescan_device() before 
> resuming has finished. scsi_rescan_device() issues I/O. Issuing I/O to a 
> device is not allowed before that device has been resumed.

I am not convinced of that: scsi suspend quiecse the queue, thus preventing IOs
from the block layer, but not internale scsi ml commands, which is what
scsi_rescan_device() issues.

In any case, I am thinking that best (and quickest) fix for this issue for now
is to have libata define a device link to make the scsi device a "parent" of the
ata device (which is the ata link as of now). This way, PM operation ordering
will ensure that the scsi device resume will be done before the ata device. What
I really do not like about this though is that the suspend side would be done in
the reverse order: ata first and then scsi, but we really want the reverse here
to ensure that the request queue is quiesced before we suspend ata. That said,
there is no such synchronization right now and so this is probably happening
already without raising issues apparently.

So ideally:
1) Make the ata device the parent of the scsi device using a device link
2) For suspend, the scsi device suspend will be done first, followed by the ata
device, which is what we want.
3) For resume, ata device will be first, followed by scsi device. The call to
scsi_rescan_device() from libata being in a work task, asynchronous from the ata
resume context, we need to synchronize that work to wait for the scsi device
resume to complete. (but do we really given that we are going to issue internal
commands only ?)

Alan, Rafael,

For the synchronization of step (3), if I understand the pm code correctly,
using device_pm_wait_for_dev() would work only if async resume is on. This would
be ineffective for the sync case. How can we best deal with this ?


-- 
Damien Le Moal
Western Digital Research


  reply	other threads:[~2023-06-14 22:44 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-09 11:04 Fwd: Waking up from resume locks up on sr device Bagas Sanjaya
2023-06-10  6:38 ` Bagas Sanjaya
2023-06-10  8:55   ` Pavel Machek
2023-06-10 13:27     ` Bagas Sanjaya
2023-06-10 15:03       ` Bart Van Assche
2023-06-11  9:05         ` Joe Breuer
2023-06-11 11:31           ` Bagas Sanjaya
2023-06-14  4:49           ` Damien Le Moal
2023-06-14  5:37             ` Kai-Heng Feng
2023-06-14  6:31               ` Damien Le Moal
2023-06-14  7:22               ` Damien Le Moal
2023-06-14  6:57             ` Hannes Reinecke
2023-06-14  7:35               ` Damien Le Moal
2023-06-14 14:26                 ` Alan Stern
2023-06-14 14:40                   ` Rafael J. Wysocki
2023-06-14 18:04                   ` Bart Van Assche
2023-06-14 22:44                     ` Damien Le Moal [this message]
2023-06-15  0:10                   ` Damien Le Moal
2023-06-15  4:40                     ` Christoph Hellwig
2023-06-15  4:57                       ` Damien Le Moal
2023-06-15  5:09                         ` Christoph Hellwig
2023-06-12  3:09         ` Damien Le Moal
2023-06-12  6:09           ` Hannes Reinecke
2023-06-12  7:22             ` Damien Le Moal
2023-06-12  7:36               ` Kai-Heng Feng
2023-06-12  7:47                 ` Damien Le Moal
2023-06-12 14:33                   ` Alan Stern
2023-06-12 15:37                     ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bf142e7d-178e-43a8-32e8-7e9e396eeee7@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=bagasdotme@gmail.com \
    --cc=bvanassche@acm.org \
    --cc=dan.j.williams@intel.com \
    --cc=gpiccoli@igalia.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hare@suse.com \
    --cc=hare@suse.de \
    --cc=jejb@linux.ibm.com \
    --cc=kai.heng.feng@canonical.com \
    --cc=keescook@chromium.org \
    --cc=len.brown@intel.com \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@jmbreuer.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linux@leemhuis.info \
    --cc=martin.kepplinger@puri.sm \
    --cc=martin.petersen@oracle.com \
    --cc=pavel@ucw.cz \
    --cc=phil@philpotter.co.uk \
    --cc=rafael@kernel.org \
    --cc=regressions@lists.linux.dev \
    --cc=stern@rowland.harvard.edu \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).