From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>, linux-ide@vger.kernel.org
Cc: linux-scsi@vger.kernel.org,
"Martin K . Petersen" <martin.petersen@oracle.com>,
John Garry <john.g.garry@oracle.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>,
Paul Ausbeck <paula@soe.ucsc.edu>,
Kai-Heng Feng <kai.heng.feng@canonical.com>,
Joe Breuer <linux-kernel@jmbreuer.net>,
Geert Uytterhoeven <geert@linux-m68k.org>,
Chia-Lin Kao <acelan.kao@canonical.com>
Subject: Re: [PATCH v6 09/23] scsi: sd: Do not issue commands to suspended disks on shutdown
Date: Tue, 26 Sep 2023 08:00:23 +0200 [thread overview]
Message-ID: <2b3ceca3-9e1c-7266-1f60-19e5f032c3e3@kernel.org> (raw)
In-Reply-To: <ca064bd3-2496-4d79-b68c-beff775228c3@acm.org>
On 2023/09/25 22:22, Bart Van Assche wrote:
> On 9/22/23 17:29, Damien Le Moal wrote:
>> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
>> index 5eea762f84d1..4d42392fae07 100644
>> --- a/drivers/scsi/sd.h
>> +++ b/drivers/scsi/sd.h
>> @@ -150,6 +150,7 @@ struct scsi_disk {
>> unsigned urswrz : 1;
>> unsigned security : 1;
>> unsigned ignore_medium_access_errors : 1;
>> + unsigned suspended : 1;
>> };
>> #define to_scsi_disk(obj) container_of(obj, struct scsi_disk, disk_dev)
>
> If the 'suspended' member is retained, please do not use a bitfield for the
> but use type 'bool' instead. Updates of instances of type 'bool' are atomic
> while there is no guarantee in the C standard that bitfield updates will be
> atomic. Bitfield updates are typically translated into a combination of &,
> | and ~ operations.
Sure, I can make it a bool.
> Additionally, I'm not convinced that we need the new 'suspended' member.
> The Linux kernel runtime PM subsystem serializes I/O and system-wide power
> operations. No I/O happens during system-wide suspend or resume operations
> and no system-wide suspend or resume callbacks are invoked while I/O is
> ongoing. The only exception is I/O that is initiated as the result of error
> handling by suspend or resume callbacks, e.g. the SCSI commands submitted
> by sd_shutdown(). Even if sd_shutdown() is called indirectly by a suspend
> or resume callback, I don't think that it can happen that a suspend or
> resume operation is ongoing for the device sd_shutdown() operates on. If
Yes, but that is not what this patch addresses.
> scsi_remove_host() is called from inside a resume callback, resuming of the
> devices affected by sd_shutdown() will only be attempted after the host
> adapter resume callback has finished.
No it will not because the commands issued in sd_shutdown() are synchronous, so
the adapter resume will wait for these to complete. But they will never complete
as the adapter itself is not fully resumed, AND the disk may not be in a state
that allows commands to be executed. Deadlock.
It is easy to recreate this issue if you have a pm8001 adapter: remove that fix
patch I sent to correctly re-allocate IRQs on resume and do a suspend-resume
cycle: on resume, the adapter fails to allocate IRQs and gives up, calling
scsi_remove_host(). The system end being stuck in resume context with no forward
progress ever made.
It seems that you are suggesting that we should use some information from the
scsi_device->power structure to detect the suspended state... But as mentioned
before, these are PM internal and should not be touched without the device lock
held. So the little "suspended" falg simplifies things a lot.
>
> Thanks,
>
> Bart.
--
Damien Le Moal
Western Digital Research
next prev parent reply other threads:[~2023-09-26 6:00 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-23 0:29 [PATCH v6 00/23] Fix libata suspend/resume handling and code cleanup Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 01/23] ata: libata-core: Fix ata_port_request_pm() locking Damien Le Moal
2023-09-26 17:18 ` Bart Van Assche
2023-09-23 0:29 ` [PATCH v6 02/23] ata: libata-core: Fix port and device removal Damien Le Moal
2023-09-26 17:28 ` Bart Van Assche
2023-09-23 0:29 ` [PATCH v6 03/23] ata: libata-scsi: link ata port and scsi device Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 04/23] scsi: sd: Differentiate system and runtime start/stop management Damien Le Moal
2023-09-26 18:07 ` Bart Van Assche
2023-09-23 0:29 ` [PATCH v6 05/23] ata: libata-scsi: Disable scsi device manage_system_start_stop Damien Le Moal
2023-09-25 14:27 ` Phillip Susi
2023-09-26 6:19 ` Damien Le Moal
2023-09-26 6:34 ` Damien Le Moal
2023-09-26 15:25 ` Phillip Susi
2023-09-23 0:29 ` [PATCH v6 06/23] scsi: Do not attempt to rescan suspended devices Damien Le Moal
2023-09-26 18:10 ` Bart Van Assche
2023-09-23 0:29 ` [PATCH v6 07/23] ata: libata-scsi: Fix delayed scsi_rescan_device() execution Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 08/23] ata: libata-core: Do not register PM operations for SAS ports Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 09/23] scsi: sd: Do not issue commands to suspended disks on shutdown Damien Le Moal
2023-09-25 20:22 ` Bart Van Assche
2023-09-26 6:00 ` Damien Le Moal [this message]
2023-09-26 14:51 ` Bart Van Assche
2023-09-26 23:30 ` Bart Van Assche
2023-09-23 0:29 ` [PATCH v6 10/23] ata: libata-core: Fix compilation warning in ata_dev_config_ncq() Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 11/23] ata: libata-eh: Fix compilation warning in ata_eh_link_report() Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 12/23] scsi: Remove scsi device no_start_on_resume flag Damien Le Moal
2023-09-26 20:42 ` Bart Van Assche
2023-09-23 0:29 ` [PATCH v6 13/23] ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat() Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 14/23] ata: libata-core: Synchronize ata_port_detach() with hotplug Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 15/23] ata: libata-core: Detach a port devices on shutdown Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 16/23] ata: libata-core: Remove ata_port_suspend_async() Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 17/23] ata: libata-core: Remove ata_port_resume_async() Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 18/23] ata: libata-core: Do not poweroff runtime suspended ports Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 19/23] ata: libata-core: Do not resume " Damien Le Moal
2023-09-25 17:26 ` Phillip Susi
2023-09-26 6:27 ` Damien Le Moal
2023-09-26 15:01 ` Phillip Susi
2023-09-23 0:29 ` [PATCH v6 20/23] ata: libata-sata: Improve ata_sas_slave_configure() Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 21/23] ata: libata-eh: Improve reset error messages Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 22/23] ata: libata-eh: Reduce "disable device" message verbosity Damien Le Moal
2023-09-23 0:29 ` [PATCH v6 23/23] ata: libata: Cleanup inline DMA helper functions Damien Le Moal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2b3ceca3-9e1c-7266-1f60-19e5f032c3e3@kernel.org \
--to=dlemoal@kernel.org \
--cc=acelan.kao@canonical.com \
--cc=bvanassche@acm.org \
--cc=geert@linux-m68k.org \
--cc=john.g.garry@oracle.com \
--cc=kai.heng.feng@canonical.com \
--cc=linux-ide@vger.kernel.org \
--cc=linux-kernel@jmbreuer.net \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=paula@soe.ucsc.edu \
--cc=rodrigo.vivi@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox