public inbox for linux-ide@vger.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: "regressions@leemhuis.info" <regressions@leemhuis.info>,
	"dalzot@gmail.com" <dalzot@gmail.com>,
	"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"paula@soe.ucsc.edu" <paula@soe.ucsc.edu>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>,
	"bvanassche@acm.org" <bvanassche@acm.org>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>
Subject: Re: [PATCH] ata,scsi: do not issue START STOP UNIT on resume
Date: Wed, 6 Sep 2023 10:07:07 +0900	[thread overview]
Message-ID: <a4255590-26ef-56ed-8574-5297ac0ef40e@kernel.org> (raw)
In-Reply-To: <ZPditFZNQWQw5yp3@intel.com>

On 9/6/23 02:17, Rodrigo Vivi wrote:
>> I think I have now figured it out, and fixed. I could reliably recreate the same
>> hang both with qemu using a failed suspend (using a device not supporting
>> suspend) and real hardware with a short rtc wake.
>>
>> It turns out that the root cause of the hang is ata_scsi_dev_rescan(), which is
>> scheduled asynchronously from PM context on resume. With quick suspend after a
>> resume, suspend may win the race against that ata_scsi_dev_rescan() task
>> execution and we endup calling scsi_rescan_device() on a suspended device,
>> causing that function to wait with the device_lock() held, which causes PM to
>> deadlock when it needs to resume the scsi device. The recent commit 6aa0365a3c85
>> ("ata: libata-scsi: Avoid deadlock on rescan after device resume") was intended
>> to fix that, but it did so less than ideally and the fix has a race on the scsi
>> power state check, thus not always preventing the resume hang.
>>
>> I pushed a new patch series that goes on top of 6.5.0: resume-v3 branch in the
>> libata tree:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata.git
>>
>> This works very well for me. Using this script on real hardware:
>>
>> for (( i=0; i<20; i++ )); do
>> 	echo "+2" > /sys/class/rtc/rtc0/wakealarm
>> 	echo mem > /sys/power/state
>> done
>>
>> The system repeatedly suspends and resumes and comes back OK. Of note is that if
>> I set the delay to +1 second, then I sometime do not see the system resume and
>> the script stops. But using wakeup-on-lan (wol command) from another machine to
>> wake it up, the machine resumes normally and continues executing the script. So
>> it seems that setting the rtc alarm unreasonably early result in it being lost
>> and the system suspending wating to be woken up.
>>
>> I also tested this in qemu. As mentioned before, I cannot get rtc alarm to wake
>> up the VM guest though. However, using a virtio device that does not support
>> suspend, resume strats in the middle of the suspend operation due to the suspend
>> error reported by that device. And it turns out that systemd really insists on
>> suspending the system despite the error, so when running "systemctl suspend" I
>> see a retry for suspend right after the first failed one. That is enough to
>> trigger the issue without the patches.
>>
>> Please test !
> 
> \o/ works for me!
> 
> Feel free to use:
> Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Awesome ! Thank you for testing. I will rebase the patches and post the official
version for 6.6 fixes (and the other cleanup patches for 6.7), after retesting
again. Never know :)

-- 
Damien Le Moal
Western Digital Research


  reply	other threads:[~2023-09-06  1:07 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-31  0:39 [PATCH] ata,scsi: do not issue START STOP UNIT on resume Damien Le Moal
2023-07-31  3:48 ` TW
2023-07-31  4:44   ` Damien Le Moal
2023-07-31  5:47 ` Tanner Watkins
2023-07-31 16:13 ` Hannes Reinecke
2023-08-01  3:44   ` Damien Le Moal
2023-08-01  6:16     ` Hannes Reinecke
2023-07-31 19:43 ` Paul Ausbeck
2023-08-01 18:36 ` Bart Van Assche
2023-08-02  8:05 ` Damien Le Moal
2023-08-24 18:28 ` Rodrigo Vivi
2023-08-24 23:42   ` Damien Le Moal
2023-08-25  1:31     ` Martin K. Petersen
2023-08-25  1:33       ` Damien Le Moal
2023-08-25 17:09     ` Rodrigo Vivi
2023-08-25 22:06       ` Damien Le Moal
2023-08-29  6:17       ` Damien Le Moal
2023-08-30 22:14         ` Rodrigo Vivi
2023-08-31  0:32           ` Damien Le Moal
2023-08-31  1:48             ` Vivi, Rodrigo
2023-08-31  3:06               ` Damien Le Moal
2023-09-05  5:20               ` Damien Le Moal
2023-09-05 17:17                 ` Rodrigo Vivi
2023-09-06  1:07                   ` Damien Le Moal [this message]
2023-08-31  6:55       ` Damien Le Moal
2023-08-25 12:19   ` Damien Le Moal
2023-09-12 17:39 ` Geert Uytterhoeven
2023-09-12 22:58   ` Damien Le Moal
2023-09-13 10:21     ` Geert Uytterhoeven
2023-09-13 10:34       ` Geert Uytterhoeven
2023-09-13 22:07         ` Damien Le Moal
2023-09-14  6:59           ` Geert Uytterhoeven
2023-09-13 22:03       ` Damien Le Moal
2023-09-14  6:53         ` Geert Uytterhoeven
2023-09-14  6:58           ` Damien Le Moal
2023-09-14 15:29         ` Phillip Susi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a4255590-26ef-56ed-8574-5297ac0ef40e@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=bvanassche@acm.org \
    --cc=dalzot@gmail.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=paula@soe.ucsc.edu \
    --cc=regressions@leemhuis.info \
    --cc=regressions@lists.linux.dev \
    --cc=rodrigo.vivi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox