All of lore.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Rodrigo Vivi <rodrigo.vivi@kernel.org>
Cc: linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	Paul Ausbeck <paula@soe.ucsc.edu>,
	Thorsten Leemhuis <regressions@leemhuis.info>,
	TW <dalzot@gmail.com>,
	regressions@lists.linux.dev, Bart Van Assche <bvanassche@acm.org>
Subject: Re: [PATCH] ata,scsi: do not issue START STOP UNIT on resume
Date: Tue, 29 Aug 2023 15:17:38 +0900	[thread overview]
Message-ID: <ccf3d87c-6517-6f01-a32a-4c98b841c7d4@kernel.org> (raw)
In-Reply-To: <ZOjgJl4nlieu3+kL@rdvivi-mobl4>

On 8/26/23 02:09, Rodrigo Vivi wrote:
>>> So, maybe we have some kind of disks/configuration out there where this
>>> start upon resume is needed? Maybe it is just a matter of timming to
>>> ensure some firmware underneath is up and back to life?
>>
>> I do not think so. Suspend will issue a start stop unit command to put the drive
>> to sleep and resume will reset the port (which should wake up the drive) and
>> then issue an IDENTIFY command (which will also wake up the drive) and other
>> read logs etc to rescan the drive.
>> In both cases, if the commands do not complete, we would see errors/timeout and
>> likely port reset/drive gone events. So I think this is likely another subtle
>> race between scsi suspend and ata suspend that is causing a deadlock.
>>
>> The main issue I think is that there is no direct ancestry between the ata port
>> (device) and scsi device, so the change to scsi async pm ops made a mess of the
>> suspend/resume operations ordering. For suspend, scsi device (child of ata port)
>> should be first, then ata port device (parent). For resume, the reverse order is
>> needed. PM normally ensures that parent/child ordering, but we lack that
>> parent/child relationship. I am working on fixing that but it is very slow
>> progress because I have been so far enable to recreate any of the issues that
>> have been reported. I am patching "blind"...
> 
> I believe your suspicious makes sense. And on these lines, that patch you
> attached earlier would fix that. However my initial tries of that didn't
> help. I'm going to run more tests and get back to you.

Rodrigo,

I pushed the resume-v2 branch to libata tree:

git@gitolite.kernel.org:pub/scm/linux/kernel/git/dlemoal/libata
(or https://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata.git)

This branch adds 13 patches on top of 6.5.0 to cleanup libata suspend/resume and
other device shutdown issues. The first 4 patches are the main ones to fix
suspend resume. I tested that on 2 different machines with different drives and
with qemu. All seems fine.

Could you try to run this through your CI ? I am very interested in seeing if it
survives your suspend/resume tests.

If you can confirm that all issues are fixed, I will rebase this on for-next and
post.

Thanks !


-- 
Damien Le Moal
Western Digital Research


  parent reply	other threads:[~2023-08-29  6:18 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-31  0:39 [PATCH] ata,scsi: do not issue START STOP UNIT on resume Damien Le Moal
2023-07-31  3:48 ` TW
2023-07-31  4:44   ` Damien Le Moal
2023-07-31  5:47 ` Tanner Watkins
2023-07-31 16:13 ` Hannes Reinecke
2023-08-01  3:44   ` Damien Le Moal
2023-08-01  6:16     ` Hannes Reinecke
2023-07-31 19:43 ` Paul Ausbeck
2023-08-01 18:36 ` Bart Van Assche
2023-08-02  8:05 ` Damien Le Moal
2023-08-24 18:28 ` Rodrigo Vivi
2023-08-24 23:42   ` Damien Le Moal
2023-08-25  1:31     ` Martin K. Petersen
2023-08-25  1:33       ` Damien Le Moal
2023-08-25 17:09     ` Rodrigo Vivi
2023-08-25 22:06       ` Damien Le Moal
2023-08-29  6:17       ` Damien Le Moal [this message]
2023-08-30 22:14         ` Rodrigo Vivi
2023-08-31  0:32           ` Damien Le Moal
2023-08-31  1:48             ` Vivi, Rodrigo
2023-08-31  3:06               ` Damien Le Moal
2023-09-05  5:20               ` Damien Le Moal
2023-09-05 17:17                 ` Rodrigo Vivi
2023-09-06  1:07                   ` Damien Le Moal
2023-08-31  6:55       ` Damien Le Moal
2023-08-25 12:19   ` Damien Le Moal
2023-09-12 17:39 ` Geert Uytterhoeven
2023-09-12 22:58   ` Damien Le Moal
2023-09-13 10:21     ` Geert Uytterhoeven
2023-09-13 10:34       ` Geert Uytterhoeven
2023-09-13 22:07         ` Damien Le Moal
2023-09-14  6:59           ` Geert Uytterhoeven
2023-09-13 22:03       ` Damien Le Moal
2023-09-14  6:53         ` Geert Uytterhoeven
2023-09-14  6:58           ` Damien Le Moal
2023-09-14 15:29         ` Phillip Susi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ccf3d87c-6517-6f01-a32a-4c98b841c7d4@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=bvanassche@acm.org \
    --cc=dalzot@gmail.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=paula@soe.ucsc.edu \
    --cc=regressions@leemhuis.info \
    --cc=regressions@lists.linux.dev \
    --cc=rodrigo.vivi@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.