linux-pm.vger.kernel.org archive mirror
From: Damien Le Moal <dlemoal@kernel.org>
To: Hannes Reinecke <hare@suse.de>,
	Joe Breuer <linux-kernel@jmbreuer.net>,
	Bart Van Assche <bvanassche@acm.org>,
	Bagas Sanjaya <bagasdotme@gmail.com>, Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Len Brown <len.brown@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Kees Cook <keescook@chromium.org>,
	Tony Luck <tony.luck@intel.com>,
	"Guilherme G. Piccoli" <gpiccoli@igalia.com>,
	Thorsten Leemhuis <linux@leemhuis.info>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Phillip Potter <phil@philpotter.co.uk>,
	Linux Power Management <linux-pm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Hardening <linux-hardening@vger.kernel.org>,
	Linux Regressions <regressions@lists.linux.dev>,
	Linux SCSI <linux-scsi@vger.kernel.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	Dan Williams <dan.j.williams@intel.com>,
	Hannes Reinecke <hare@suse.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Martin Kepplinger <martin.kepplinger@puri.sm>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>
Subject: Re: Fwd: Waking up from resume locks up on sr device
Date: Wed, 14 Jun 2023 16:35:50 +0900
Message-ID: <b0fdf454-b2f7-c273-66f5-efe42fbc2807@kernel.org>
In-Reply-To: <37ed36f0-6f72-115c-85fb-62ef5ad72e76@suse.de>

On 6/14/23 15:57, Hannes Reinecke wrote:
> On 6/14/23 06:49, Damien Le Moal wrote:
>> On 6/11/23 18:05, Joe Breuer wrote:
>>> I'm the reporter of this issue.
>>>
>>> I just tried this patch against 6.3.4, and it completely fixes my
>>> suspend/resume issue.
>>>
>>> The optical drive stays usable after resume, even suspending/resuming
>>> during playback of CDDA content works flawlessly and playback resumes
>>> seamlessly after system resume.
>>>
>>> So, from my perspective: Good one!
>>
>> In place of Bart's fix, could you please try this patch ?
>>
>> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
>> index b80e68000dd3..a81eb4f882ab 100644
>> --- a/drivers/ata/libata-eh.c
>> +++ b/drivers/ata/libata-eh.c
>> @@ -4006,9 +4006,32 @@ static void ata_eh_handle_port_resume(struct ata_port *ap)
>>          /* tell ACPI that we're resuming */
>>          ata_acpi_on_resume(ap);
>>
>> -       /* update the flags */
>>          spin_lock_irqsave(ap->lock, flags);
>> +
>> +       /* Update the flags */
>>          ap->pflags &= ~(ATA_PFLAG_PM_PENDING | ATA_PFLAG_SUSPENDED);
>> +
>> +       /*
>> +        * Resuming the port will trigger a rescan of the ATA device(s)
>> +        * connected to it. Before scheduling the rescan, make sure that
>> +        * the associated scsi device(s) are fully resumed as well.
>> +        */
>> +       ata_for_each_link(link, ap, HOST_FIRST) {
>> +               ata_for_each_dev(dev, link, ENABLED) {
>> +                       struct scsi_device *sdev = dev->sdev;
>> +
>> +                       if (!sdev)
>> +                               continue;
>> +                       if (scsi_device_get(sdev))
>> +                               continue;
>> +
>> +                       spin_unlock_irqrestore(ap->lock, flags);
>> +                       device_pm_wait_for_dev(&ap->tdev,
>> +                                              &sdev->sdev_gendev);
>> +                       scsi_device_put(sdev);
>> +                       spin_lock_irqsave(ap->lock, flags);
>> +               }
>> +       }
>>          spin_unlock_irqrestore(ap->lock, flags);
>>   }
>>   #endif /* CONFIG_PM */
>>
>> Thanks !
>>
> Well; not sure if that'll work out.
> The whole reason why we initiate a rescan is that we need to check if the
> ports are still connected, and whether the devices react.
> So we can't iterate the ports here as this is the very thing which gets
> checked during EH.

Hmmm... Right. So we need to move that loop into ata_scsi_dev_rescan(),
which itself already loops over the port devices anyway.

> We really should claim resume to be finished as soon as we can talk with
> the HBA, and kick off EH asynchronously to let it finish the job after
> resume has completed.

That is what's done already:

static int ata_port_pm_resume(struct device *dev)
{
	ata_port_resume_async(to_ata_port(dev), PMSG_RESUME);
	pm_runtime_disable(dev);
	pm_runtime_set_active(dev);
	pm_runtime_enable(dev);
	return 0;
}

EH is kicked by ata_port_resume_async() -> ata_port_request_pm() and it
is async. There is no synchronization in EH with the PM side though. We
probably should have EH check that the port resume is done first, which
can be done in ata_eh_handle_port_resume() since that is the first thing
done when entering EH.

The problem remains though that we *must* wait for the scsi device
resume to be done before calling scsi_rescan_device(), which is done
asynchronously from EH, as a different work. So that one needs to wait
for the scsi side resume to be done.
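
As a rough illustration of that ordering constraint only, here is a userspace
sketch using pthreads. It is not kernel code: struct model_dev, model_resume(),
model_rescan() and run_demo() are hypothetical names made up for this example,
and a condition variable merely stands in for "wait until the scsi device
resume is done". The rescan thread is started first, yet it cannot proceed
until the resume thread has marked the device as resumed.

```c
/* Userspace model only; these names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct model_dev {
	pthread_mutex_t lock;
	pthread_cond_t resumed_cv;
	bool resumed;    /* true once the modeled device resume has finished */
	int seq;         /* event counter, protected by lock */
	int resume_seq;  /* sequence number at which the resume completed */
	int rescan_seq;  /* sequence number at which the rescan ran */
};

/* Models the PM side: finish the resume, then wake any waiter. */
static void *model_resume(void *arg)
{
	struct model_dev *d = arg;

	pthread_mutex_lock(&d->lock);
	d->resumed = true;
	d->resume_seq = ++d->seq;
	pthread_cond_broadcast(&d->resumed_cv);
	pthread_mutex_unlock(&d->lock);
	return NULL;
}

/* Models the rescan work: block until the resume is done, then rescan. */
static void *model_rescan(void *arg)
{
	struct model_dev *d = arg;

	pthread_mutex_lock(&d->lock);
	while (!d->resumed)
		pthread_cond_wait(&d->resumed_cv, &d->lock);
	d->rescan_seq = ++d->seq;
	pthread_mutex_unlock(&d->lock);
	return NULL;
}

/* Returns 1 if the rescan was ordered after the resume, 0 otherwise. */
static int run_demo(void)
{
	struct model_dev d = { .resumed = false };
	pthread_t rescan, resume;
	int ret;

	ret = pthread_mutex_init(&d.lock, NULL);
	assert(ret == 0);
	ret = pthread_cond_init(&d.resumed_cv, NULL);
	assert(ret == 0);

	/* Start the rescan first, as if it were queued before resume completes. */
	ret = pthread_create(&rescan, NULL, model_rescan, &d);
	assert(ret == 0);
	ret = pthread_create(&resume, NULL, model_resume, &d);
	assert(ret == 0);

	pthread_join(rescan, NULL);
	pthread_join(resume, NULL);
	pthread_cond_destroy(&d.resumed_cv);
	pthread_mutex_destroy(&d.lock);
	(void)ret;

	return d.rescan_seq > d.resume_seq;
}
```

Calling run_demo() from a main() and building with -pthread is enough to try
it. The sketch says nothing about which kernel primitive is appropriate for
the real fix; it only shows the "rescan waits for resume" ordering.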

I also thought of triggering the rescan from the scsi side, but since
the resume may be asynchronous, we could end up triggering it with the
ata side not yet resumed... That would only turn the problem around
instead of solving it.

Or... Why the heck is scsi_rescan_device() calling device_lock()? This
is the only place in scsi code I can see that takes this lock. I suspect
this is to serialize either rescans, or serialize with resume, or both.
For serializing rescans, we can use another lock. For serializing with
PM, we should wait for PM transitions...
Something is not right here.

-- 
Damien Le Moal
Western Digital Research


Thread overview: 28+ messages
2023-06-09 11:04 Fwd: Waking up from resume locks up on sr device Bagas Sanjaya
2023-06-10  6:38 ` Bagas Sanjaya
2023-06-10  8:55   ` Pavel Machek
2023-06-10 13:27     ` Bagas Sanjaya
2023-06-10 15:03       ` Bart Van Assche
2023-06-11  9:05         ` Joe Breuer
2023-06-11 11:31           ` Bagas Sanjaya
2023-06-14  4:49           ` Damien Le Moal
2023-06-14  5:37             ` Kai-Heng Feng
2023-06-14  6:31               ` Damien Le Moal
2023-06-14  7:22               ` Damien Le Moal
2023-06-14  6:57             ` Hannes Reinecke
2023-06-14  7:35               ` Damien Le Moal [this message]
2023-06-14 14:26                 ` Alan Stern
2023-06-14 14:40                   ` Rafael J. Wysocki
2023-06-14 18:04                   ` Bart Van Assche
2023-06-14 22:44                     ` Damien Le Moal
2023-06-15  0:10                   ` Damien Le Moal
2023-06-15  4:40                     ` Christoph Hellwig
2023-06-15  4:57                       ` Damien Le Moal
2023-06-15  5:09                         ` Christoph Hellwig
2023-06-12  3:09         ` Damien Le Moal
2023-06-12  6:09           ` Hannes Reinecke
2023-06-12  7:22             ` Damien Le Moal
2023-06-12  7:36               ` Kai-Heng Feng
2023-06-12  7:47                 ` Damien Le Moal
2023-06-12 14:33                   ` Alan Stern
2023-06-12 15:37                     ` Rafael J. Wysocki
