public inbox for linux-ide@vger.kernel.org
From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>, linux-ide@vger.kernel.org
Cc: linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	John Garry <john.g.garry@oracle.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Paul Ausbeck <paula@soe.ucsc.edu>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Joe Breuer <linux-kernel@jmbreuer.net>
Subject: Re: [PATCH 05/19] ata: libata-scsi: Fix delayed scsi_rescan_device() execution
Date: Fri, 15 Sep 2023 07:05:25 +0900
Message-ID: <bf2f2365-7c30-d6f0-ad13-8a0c85bb5e91@kernel.org>
In-Reply-To: <f45fb721-aa64-4a10-952e-cf5236a5d1e3@acm.org>

On 9/15/23 02:25, Bart Van Assche wrote:
> On 9/10/23 21:02, Damien Le Moal wrote:
>> Commit 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after
>> device resume") modified ata_scsi_dev_rescan() to check the scsi device
>> "is_suspended" power field to ensure that the scsi device associated
>> with an ATA device is fully resumed when scsi_rescan_device() is
>> executed. However, this fix is problematic as:
>> 1) it relies on a PM internal field that should not be used without PM
>>     device locking protection.
>> 2) The check for is_suspended and the call to ata_scsi_dev_rescan() are
>>     not atomic and a suspend PM event may be triggered between them,
>>     causing ata_scsi_dev_rescan() to be called on a suspended device,
>>     resulting in that function blocking while holding the scsi device
>>     lock, which would deadlock a following resume operation.
>> These problems can trigger PM deadlocks on resume, especially with
>> resume operations triggered quickly after or during suspend operations.
>> E.g., a simple bash script like:
>>
>> for (( i=0; i<10; i++ )); do
>> 	echo "+2" > /sys/class/rtc/rtc0/wakealarm
>> 	echo mem > /sys/power/state
>> done
>>
>> that triggers a resume 2 seconds after starting to suspend a system can
>> quickly lead to a PM deadlock preventing the system from correctly
>> resuming.
>>
>> Fix this by replacing the check on is_suspended with a check on the scsi
>> device state inside ata_scsi_dev_rescan(), while holding the scsi device
>> lock, thus making the device rescan atomic with regard to PM operations.
>> Additionally, make sure that scheduled rescan tasks are first cancelled
>> before suspending an ata port.
> 
> One patch per subsystem please. I think this patch can be split easily 
> into an ATA patch and a SCSI core patch.

In general, I agree that should be done. But this is a bug fix, and splitting
it in two risks breaking something if only one half is reverted, and could also
give bad bisect results. So I would rather not do that.
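
For readers following the thread, the idea in the quoted commit message (check
the device state and do the rescan under the same lock, so a suspend cannot
slip in between the two) can be sketched in userspace C. This is only an
illustrative model, not the patch itself: struct model_dev, the MODEL_* states
and the model_*() functions are hypothetical names, a pthread mutex stands in
for the scsi device lock, and "return false so the caller can retry later" is
an assumption about how a skipped rescan would be handled.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_state { MODEL_RUNNING, MODEL_QUIESCED };

struct model_dev {
	pthread_mutex_t lock;	/* stands in for the scsi device lock */
	enum model_state state;	/* stands in for the scsi device state */
	int rescans;		/* number of rescans actually performed */
};

/* Suspend path: the state changes under the same lock the rescan takes. */
static void model_suspend(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->state = MODEL_QUIESCED;
	pthread_mutex_unlock(&dev->lock);
}

/* Resume path: mark the device as running again. */
static void model_resume(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->state = MODEL_RUNNING;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Rescan path: the state check and the rescan share one critical section,
 * so the state cannot change between the check and the work.
 * Returns true if the rescan ran, false if the device was not running
 * (assumption: the caller retries later in that case).
 */
static bool model_rescan(struct model_dev *dev)
{
	bool done = false;

	pthread_mutex_lock(&dev->lock);
	if (dev->state == MODEL_RUNNING) {
		dev->rescans++;	/* placeholder for the real rescan work */
		done = true;
	}
	pthread_mutex_unlock(&dev->lock);

	return done;
}
```

In this model, a model_rescan() call made after model_suspend() returns false
without doing any work, and succeeds again once model_resume() has run. The
unlocked "check a flag, then call the rescan" pattern that the commit message
criticizes has no such guarantee.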

> 
> Thanks,
> 
> Bart.

-- 
Damien Le Moal
Western Digital Research


