public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup
@ 2023-09-12  0:56 Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 01/21] ata: libata-core: Fix ata_port_request_pm() locking Damien Le Moal
                   ` (20 more replies)
  0 siblings, 21 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

The first 7 patch of this series fix several issues with suspend/resume
power management operations in libata. The most important change
introduced is in patch 4, where the manage_start_stop of scsi devices
associated with ata devices is disabled and this functionality replaced
by a direct handling within libata EH of device suspend/resume (i.e.
spin down for suspend and spin up for resume).

The remaining patches are code cleanup that do not introduce any
significant functional change.

This series was tested on qemu and on various PCs and servers. I am
CC-ing people who recently reported issues with suspend/resume.
Additional testing would be much appreciated.

Of note is that there is no change to the overall suspend/resume model
of libata where suspend and resume operations are tied to the ata port.
This is somewhat broken as a port may have multiple devices (e.g. pata
and port multiplier cases). Fixing this model (to be more similar to
what libsas does) will be the next step.

Changes from v1:
 * Added patch 8 and 9 to fix compilation warnings with W=1
 * Addressed John comment in patch 19
 * Fixed patch 20 commit message (Sergei)
 * Added Hannes Review tag

Damien Le Moal (21):
  ata: libata-core: Fix ata_port_request_pm() locking
  ata: libata-core: Fix port and device removal
  ata: libata-scsi: link ata port and scsi device
  ata: libata-scsi: Disable scsi device manage_start_stop
  ata: libata-scsi: Fix delayed scsi_rescan_device() execution
  ata: libata-core: Do not register PM operations for SAS ports
  scsi: sd: Do not issue commands to suspended disks on remove
  ata: libata-core: Fix compilation warning in ata_dev_config_ncq()
  ata: libata-eh: Fix compilation warning in ata_eh_link_report()
  scsi: Remove scsi device no_start_on_resume flag
  ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat()
  ata: libata-core: Synchronize ata_port_detach() with hotplug
  ata: libata-core: Detach a port devices on shutdown
  ata: libata-core: Remove ata_port_suspend_async()
  ata: libata-core: Remove ata_port_resume_async()
  ata: libata-core: skip poweroff for devices that are runtime suspended
  ata: libata-core: Do not resume ports that have been runtime suspended
  ata: libata-sata: Improve ata_sas_slave_configure()
  ata: libata-eh: Improve reset error messages
  ata: libata-eh: Reduce "disable device" message verbosity
  ata: libata: Cleanup inline DMA helper functions

 drivers/ata/libata-core.c      | 241 +++++++++++++++++++++++++--------
 drivers/ata/libata-eh.c        |  76 +++++++++--
 drivers/ata/libata-sata.c      |   5 +-
 drivers/ata/libata-scsi.c      | 143 ++++++++++---------
 drivers/ata/libata-transport.c |   9 +-
 drivers/ata/libata.h           |   7 +
 drivers/ata/pata_macio.c       |   1 +
 drivers/ata/sata_mv.c          |   1 +
 drivers/ata/sata_nv.c          |   2 +
 drivers/ata/sata_sil24.c       |   1 +
 drivers/scsi/scsi_scan.c       |  12 +-
 drivers/scsi/sd.c              |  10 +-
 include/linux/libata.h         |  27 ++--
 include/scsi/scsi_device.h     |   1 -
 include/scsi/scsi_host.h       |   2 +-
 15 files changed, 372 insertions(+), 166 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 01/21] ata: libata-core: Fix ata_port_request_pm() locking
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 02/21] ata: libata-core: Fix port and device removal Damien Le Moal
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

The function ata_port_request_pm() checks the port flag
ATA_PFLAG_PM_PENDING and calls ata_port_wait_eh() if this flag is set to
ensure that power management operations for a port are not secheduled
simultaneously. However, this flag check is done without holding the
port lock.

Fix this by taking the port lock on entry to the function and checking
the flag under this lock. The lock is released and re-taken if
ata_port_wait_eh() needs to be called.

Fixes: 5ef41082912b ("ata: add ata port system PM callbacks")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 74314311295f..c4898483d716 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5040,17 +5040,20 @@ static void ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
 	struct ata_link *link;
 	unsigned long flags;
 
-	/* Previous resume operation might still be in
-	 * progress.  Wait for PM_PENDING to clear.
+	spin_lock_irqsave(ap->lock, flags);
+
+	/*
+	 * A previous PM operation might still be in progress. Wait for
+	 * ATA_PFLAG_PM_PENDING to clear.
 	 */
 	if (ap->pflags & ATA_PFLAG_PM_PENDING) {
+		spin_unlock_irqrestore(ap->lock, flags);
 		ata_port_wait_eh(ap);
+		spin_lock_irqsave(ap->lock, flags);
 		WARN_ON(ap->pflags & ATA_PFLAG_PM_PENDING);
 	}
 
-	/* request PM ops to EH */
-	spin_lock_irqsave(ap->lock, flags);
-
+	/* Request PM operation to EH */
 	ap->pm_mesg = mesg;
 	ap->pflags |= ATA_PFLAG_PM_PENDING;
 	ata_for_each_link(link, ap, HOST_FIRST) {
@@ -5062,10 +5065,8 @@ static void ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
 
 	spin_unlock_irqrestore(ap->lock, flags);
 
-	if (!async) {
+	if (!async)
 		ata_port_wait_eh(ap);
-		WARN_ON(ap->pflags & ATA_PFLAG_PM_PENDING);
-	}
 }
 
 /*
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 02/21] ata: libata-core: Fix port and device removal
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 01/21] ata: libata-core: Fix ata_port_request_pm() locking Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device Damien Le Moal
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

Whenever an ATA adapter driver is removed (e.g. rmmod),
ata_port_detach() is called repeatedly for all the adapter ports to
remove (unload) the devices attached to the port and delete the port
device itself. Removing of devices is done using libata EH with the
ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute
ata_eh_unload() which disables all devices attached to the port.

ata_port_detach() finishes by calling scsi_remove_host() to remove the
scsi host associated with the port. This function will trigger the
removal of all scsi devices attached to the host and in the case of
disks, calls to sd_shutdown() which will flush the device write cache
and stop the device. However, given that the devices were already
disabled by ata_eh_unload(), the synchronize write cache command and
start stop unit commands fail. E.g. running "rmmod ahci" with first
removing sd_mod results in error messages like:

ata13.00: disable device
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] Stopping disk
sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

Fix this by removing all scsi devices of the ata devices connected to
the port before scheduling libata EH to disable the ATA devices.

Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index c4898483d716..693cb3cd70cd 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5952,11 +5952,30 @@ static void ata_port_detach(struct ata_port *ap)
 	struct ata_link *link;
 	struct ata_device *dev;
 
-	/* tell EH we're leaving & flush EH */
+	/* Wait for any ongoing EH */
+	ata_port_wait_eh(ap);
+
+	mutex_lock(&ap->scsi_scan_mutex);
 	spin_lock_irqsave(ap->lock, flags);
+
+	/* Remove scsi devices */
+	ata_for_each_link(link, ap, HOST_FIRST) {
+		ata_for_each_dev(dev, link, ALL) {
+			if (dev->sdev) {
+				spin_unlock_irqrestore(ap->lock, flags);
+				scsi_remove_device(dev->sdev);
+				spin_lock_irqsave(ap->lock, flags);
+				dev->sdev = NULL;
+			}
+		}
+	}
+
+	/* Tell EH to disable all devices */
 	ap->pflags |= ATA_PFLAG_UNLOADING;
 	ata_port_schedule_eh(ap);
+
 	spin_unlock_irqrestore(ap->lock, flags);
+	mutex_unlock(&ap->scsi_scan_mutex);
 
 	/* wait till EH commits suicide */
 	ata_port_wait_eh(ap);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 01/21] ata: libata-core: Fix ata_port_request_pm() locking Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 02/21] ata: libata-core: Fix port and device removal Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-13 10:25   ` Geert Uytterhoeven
  2023-09-12  0:56 ` [PATCH v2 04/21] ata: libata-scsi: Disable scsi device manage_start_stop Damien Le Moal
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

There is no direct device ancestry defined between an ata_device and
its scsi device which prevents the power management code from correctly
ordering suspend and resume operations. Create such ancestry with the
ata device as the parent to ensure that the scsi device (child) is
suspended before the ata device and that resume handles the ata device
before the scsi device.

The parent-child (supplier-consumer) relationship is established between
the ata_port (parent) and the scsi device (child) with the function
device_add_link(). The parent used is not the ata_device as the PM
operations are defined per port and the status of all devices connected
through that port is controlled from the port operations.

The device link is established with the new function
ata_scsi_dev_alloc(). This function is used to define the ->slave_alloc
callback of the scsi host template of most drivers.

Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-scsi.c | 46 ++++++++++++++++++++++++++++++++++-----
 drivers/ata/libata.h      |  1 +
 drivers/ata/pata_macio.c  |  1 +
 drivers/ata/sata_mv.c     |  1 +
 drivers/ata/sata_nv.c     |  2 ++
 drivers/ata/sata_sil24.c  |  1 +
 include/linux/libata.h    |  3 +++
 7 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index d3f28b82c97b..f63cf6e7332e 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1089,6 +1089,45 @@ int ata_scsi_dev_config(struct scsi_device *sdev, struct ata_device *dev)
 	return 0;
 }
 
+int ata_scsi_dev_alloc(struct scsi_device *sdev, struct ata_port *ap)
+{
+	struct device_link *link;
+
+	ata_scsi_sdev_config(sdev);
+
+	/*
+	 * Create a link from the ata_port device to the scsi device to ensure
+	 * that PM does suspend/resume in the correct order: the scsi device is
+	 * consumer (child) and the ata port the supplier (parent).
+	 */
+	link = device_link_add(&sdev->sdev_gendev, &ap->tdev,
+			       DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
+	if (!link) {
+		ata_port_err(ap, "Failed to create link to scsi device %s\n",
+			     dev_name(&sdev->sdev_gendev));
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+/**
+ *	ata_scsi_slave_alloc - Early setup of SCSI device
+ *	@sdev: SCSI device to examine
+ *
+ *	This is called from scsi_alloc_sdev() when the scsi device
+ *	associated with an ATA device is scanned on a port.
+ *
+ *	LOCKING:
+ *	Defined by SCSI layer.  We don't really care.
+ */
+
+int ata_scsi_slave_alloc(struct scsi_device *sdev)
+{
+	return ata_scsi_dev_alloc(sdev, ata_shost_to_port(sdev->host));
+}
+EXPORT_SYMBOL_GPL(ata_scsi_slave_alloc);
+
 /**
  *	ata_scsi_slave_config - Set SCSI device attributes
  *	@sdev: SCSI device to examine
@@ -1105,14 +1144,11 @@ int ata_scsi_slave_config(struct scsi_device *sdev)
 {
 	struct ata_port *ap = ata_shost_to_port(sdev->host);
 	struct ata_device *dev = __ata_scsi_find_dev(ap, sdev);
-	int rc = 0;
-
-	ata_scsi_sdev_config(sdev);
 
 	if (dev)
-		rc = ata_scsi_dev_config(sdev, dev);
+		return ata_scsi_dev_config(sdev, dev);
 
-	return rc;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(ata_scsi_slave_config);
 
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index 6e7d352803bd..079981e7156a 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -111,6 +111,7 @@ extern struct ata_device *ata_scsi_find_dev(struct ata_port *ap,
 extern int ata_scsi_add_hosts(struct ata_host *host,
 			      const struct scsi_host_template *sht);
 extern void ata_scsi_scan_host(struct ata_port *ap, int sync);
+extern int ata_scsi_dev_alloc(struct scsi_device *sdev, struct ata_port *ap);
 extern int ata_scsi_offline_dev(struct ata_device *dev);
 extern bool ata_scsi_sense_is_valid(u8 sk, u8 asc, u8 ascq);
 extern void ata_scsi_set_sense(struct ata_device *dev,
diff --git a/drivers/ata/pata_macio.c b/drivers/ata/pata_macio.c
index 17f6ccee53c7..32968b4cf8e4 100644
--- a/drivers/ata/pata_macio.c
+++ b/drivers/ata/pata_macio.c
@@ -918,6 +918,7 @@ static const struct scsi_host_template pata_macio_sht = {
 	 * use 64K minus 256
 	 */
 	.max_segment_size	= MAX_DBDMA_SEG,
+	.slave_alloc		= ata_scsi_slave_alloc,
 	.slave_configure	= pata_macio_slave_config,
 	.sdev_groups		= ata_common_sdev_groups,
 	.can_queue		= ATA_DEF_QUEUE,
diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index d105db5c7d81..353ac7b2f14a 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -673,6 +673,7 @@ static const struct scsi_host_template mv6_sht = {
 	.sdev_groups		= ata_ncq_sdev_groups,
 	.change_queue_depth	= ata_scsi_change_queue_depth,
 	.tag_alloc_policy	= BLK_TAG_ALLOC_RR,
+	.slave_alloc		= ata_scsi_slave_alloc,
 	.slave_configure	= ata_scsi_slave_config
 };
 
diff --git a/drivers/ata/sata_nv.c b/drivers/ata/sata_nv.c
index 0a0cee755bde..5428dc2ec5e3 100644
--- a/drivers/ata/sata_nv.c
+++ b/drivers/ata/sata_nv.c
@@ -380,6 +380,7 @@ static const struct scsi_host_template nv_adma_sht = {
 	.can_queue		= NV_ADMA_MAX_CPBS,
 	.sg_tablesize		= NV_ADMA_SGTBL_TOTAL_LEN,
 	.dma_boundary		= NV_ADMA_DMA_BOUNDARY,
+	.slave_alloc		= ata_scsi_slave_alloc,
 	.slave_configure	= nv_adma_slave_config,
 	.sdev_groups		= ata_ncq_sdev_groups,
 	.change_queue_depth     = ata_scsi_change_queue_depth,
@@ -391,6 +392,7 @@ static const struct scsi_host_template nv_swncq_sht = {
 	.can_queue		= ATA_MAX_QUEUE - 1,
 	.sg_tablesize		= LIBATA_MAX_PRD,
 	.dma_boundary		= ATA_DMA_BOUNDARY,
+	.slave_alloc		= ata_scsi_slave_alloc,
 	.slave_configure	= nv_swncq_slave_config,
 	.sdev_groups		= ata_ncq_sdev_groups,
 	.change_queue_depth     = ata_scsi_change_queue_depth,
diff --git a/drivers/ata/sata_sil24.c b/drivers/ata/sata_sil24.c
index 142e70bfc498..e0b1b3625031 100644
--- a/drivers/ata/sata_sil24.c
+++ b/drivers/ata/sata_sil24.c
@@ -381,6 +381,7 @@ static const struct scsi_host_template sil24_sht = {
 	.tag_alloc_policy	= BLK_TAG_ALLOC_FIFO,
 	.sdev_groups		= ata_ncq_sdev_groups,
 	.change_queue_depth	= ata_scsi_change_queue_depth,
+	.slave_alloc		= ata_scsi_slave_alloc,
 	.slave_configure	= ata_scsi_slave_config
 };
 
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 52d58b13e5ee..c8cfea386c16 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -1144,6 +1144,7 @@ extern int ata_std_bios_param(struct scsi_device *sdev,
 			      struct block_device *bdev,
 			      sector_t capacity, int geom[]);
 extern void ata_scsi_unlock_native_capacity(struct scsi_device *sdev);
+extern int ata_scsi_slave_alloc(struct scsi_device *sdev);
 extern int ata_scsi_slave_config(struct scsi_device *sdev);
 extern void ata_scsi_slave_destroy(struct scsi_device *sdev);
 extern int ata_scsi_change_queue_depth(struct scsi_device *sdev,
@@ -1401,12 +1402,14 @@ extern const struct attribute_group *ata_common_sdev_groups[];
 	__ATA_BASE_SHT(drv_name),				\
 	.can_queue		= ATA_DEF_QUEUE,		\
 	.tag_alloc_policy	= BLK_TAG_ALLOC_RR,		\
+	.slave_alloc		= ata_scsi_slave_alloc,		\
 	.slave_configure	= ata_scsi_slave_config
 
 #define ATA_SUBBASE_SHT_QD(drv_name, drv_qd)			\
 	__ATA_BASE_SHT(drv_name),				\
 	.can_queue		= drv_qd,			\
 	.tag_alloc_policy	= BLK_TAG_ALLOC_RR,		\
+	.slave_alloc		= ata_scsi_slave_alloc,		\
 	.slave_configure	= ata_scsi_slave_config
 
 #define ATA_BASE_SHT(drv_name)					\
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 04/21] ata: libata-scsi: Disable scsi device manage_start_stop
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (2 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 05/21] ata: libata-scsi: Fix delayed scsi_rescan_device() execution Damien Le Moal
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

The introduction of a device link to create a consumer/supplier
relationship between the scsi device of an ATA device and the ATA port
of the ATA device fixed the ordering of the suspend and resume
operations. For suspend, the scsi device is suspended first and the ata
port after it. This is fine as this allows the synchronize cache and
START STOP UNIT commands issued by the scsi disk driver to be executed
before the ata port is disabled.

For resume operations, the ata port is resumed first, followed
by the scsi device. This allows having the request queue of the scsi
device to be unfrozen after the ata port restart is scheduled in EH,
thus avoiding to see new requests issued to the ATA device prematurely.
However, since libata sets manage_start_stop to 1, the scsi disk resume
operation also results in issuing a START STOP UNIT command to wakeup
the device. This is too late and that must be done before libata EH
resume handling starts revalidating the drive with IDENTIFY etc
commands. Commit 0a8589055936 ("ata,scsi: do not issue START STOP UNIT
on resume") disabled issuing the START STOP UNIT command to avoid
issues with it. However, this is incorrect as transitioning a device to
the active power mode from the standby power mode set on suspend
requires a media access command. The device link reset and subsequent
SET FEATURES, IDENTIFY and READ LOG commands executed in libata EH
context triggered by the ata port resume operation may thus fail.

Fix this by handling a device power mode transitions for suspend and
resume in libata EH context without relying on the scsi disk management
triggered with the manage_start_stop flag.

To do this, the following libata helper functions are introduced:

1) ata_dev_power_set_standby():

This function issues a STANDBY IMMEDIATE command to transitiom a device
to the standby power mode. For HDDs, this spins down the disks. This
function applies only to ATA and ZAC devices and does nothing otherwise.
This function also does nothing for devices that have the
ATA_FLAG_NO_POWEROFF_SPINDOWN or ATA_FLAG_NO_HIBERNATE_SPINDOWN flag
set.

For suspend, call ata_dev_power_set_standby() in
ata_eh_handle_port_suspend() before the port is disabled and frozen.
ata_eh_unload() is also modified to transition all enabled devices to
the standby power mode when the system is shutdown or devices removed.

2) ata_dev_power_set_active() and

This function applies to ATA or ZAC devices and issues a VERIFY command
for 1 sector at LBA 0 to transition the device to the active power mode.
For HDDs, since this function will complete only once the disk spin up.
Its execution uses the same timeouts as for reset, to give the drive
enough time to complete spinup without triggering a command timeout.

For resume, call ata_dev_power_set_active() in
ata_eh_revalidate_and_attach() after the port has been enabled and
before any other command is issued to the device.

With these changes, the manage_start_stop flag does not need to be set
in ata_scsi_dev_config().

Fixes: 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume")
Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 90 +++++++++++++++++++++++++++++++++++++++
 drivers/ata/libata-eh.c   | 46 +++++++++++++++++++-
 drivers/ata/libata-scsi.c | 14 ++----
 drivers/ata/libata.h      |  2 +
 include/linux/libata.h    |  6 ++-
 5 files changed, 144 insertions(+), 14 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 693cb3cd70cd..0cf0caf77907 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1972,6 +1972,96 @@ int ata_dev_read_id(struct ata_device *dev, unsigned int *p_class,
 	return rc;
 }
 
+/**
+ *	ata_dev_power_set_standby - Set a device power mode to standby
+ *	@dev: target device
+ *
+ *	Issue a STANDBY IMMEDIATE command to set a device power mode to standby.
+ *	For an HDD device, this spins down the disks.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ */
+void ata_dev_power_set_standby(struct ata_device *dev)
+{
+	unsigned long ap_flags = dev->link->ap->flags;
+	struct ata_taskfile tf;
+	unsigned int err_mask;
+
+	/* Issue STANDBY IMMEDIATE command only if supported by the device */
+	if (dev->class != ATA_DEV_ATA && dev->class != ATA_DEV_ZAC)
+		return;
+
+	/*
+	 * Some odd clown BIOSes issue spindown on power off (ACPI S4 or S5)
+	 * causing some drives to spin up and down again. For these, do nothing
+	 * if we are being called on shutdown.
+	 */
+	if ((ap_flags & ATA_FLAG_NO_POWEROFF_SPINDOWN) &&
+	    system_state == SYSTEM_POWER_OFF)
+		return;
+
+	if ((ap_flags & ATA_FLAG_NO_HIBERNATE_SPINDOWN) &&
+	    system_entering_hibernation())
+		return;
+
+	ata_tf_init(dev, &tf);
+	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
+	tf.protocol = ATA_PROT_NODATA;
+	tf.command = ATA_CMD_STANDBYNOW1;
+
+	ata_dev_notice(dev, "Entering standby power mode\n");
+
+	err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0, 0);
+	if (err_mask)
+		ata_dev_err(dev, "STANDBY IMMEDIATE failed (err_mask=0x%x)\n",
+			    err_mask);
+}
+
+/**
+ *	ata_dev_power_set_active -  Set a device power mode to active
+ *	@dev: target device
+ *
+ *	Issue a VERIFY command to enter to ensure that the device is in the
+ *	active power mode. For a spun-down HDD (standby or idle power mode),
+ *	the VERIFY command will complete after the disk spins up.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ */
+void ata_dev_power_set_active(struct ata_device *dev)
+{
+	struct ata_taskfile tf;
+	unsigned int err_mask;
+
+	/*
+	 * Issue READ VERIFY SECTORS command for 1 sector at lba=0 only
+	 * if supported by the device.
+	 */
+	if (dev->class != ATA_DEV_ATA && dev->class != ATA_DEV_ZAC)
+		return;
+
+	ata_tf_init(dev, &tf);
+	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
+	tf.protocol = ATA_PROT_NODATA;
+	tf.command = ATA_CMD_VERIFY;
+	tf.nsect = 1;
+	if (dev->flags & ATA_DFLAG_LBA) {
+		tf.flags |= ATA_TFLAG_LBA;
+		tf.device |= ATA_LBA;
+	} else {
+		/* CHS */
+		tf.lbal = 0x1; /* sect */
+	}
+
+	ata_dev_notice(dev, "Entering active power mode\n");
+
+	err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0, 0);
+	if (err_mask)
+		ata_dev_err(dev, "VERIFY failed (err_mask=0x%x)\n",
+			    err_mask);
+}
+
 /**
  *	ata_read_log_page - read a specific log page
  *	@dev: target device
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 159ba6ba19eb..43b0a1509548 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -147,6 +147,8 @@ ata_eh_cmd_timeout_table[ATA_EH_CMD_TIMEOUT_TABLE_SIZE] = {
 	  .timeouts = ata_eh_other_timeouts, },
 	{ .commands = CMDS(ATA_CMD_FLUSH, ATA_CMD_FLUSH_EXT),
 	  .timeouts = ata_eh_flush_timeouts },
+	{ .commands = CMDS(ATA_CMD_VERIFY),
+	  .timeouts = ata_eh_reset_timeouts },
 };
 #undef CMDS
 
@@ -498,7 +500,19 @@ static void ata_eh_unload(struct ata_port *ap)
 	struct ata_device *dev;
 	unsigned long flags;
 
-	/* Restore SControl IPM and SPD for the next driver and
+	/*
+	 * Unless we are restarting, transition all enabled devices to
+	 * standby power mode.
+	 */
+	if (system_state != SYSTEM_RESTART) {
+		ata_for_each_link(link, ap, PMP_FIRST) {
+			ata_for_each_dev(dev, link, ENABLED)
+				ata_dev_power_set_standby(dev);
+		}
+	}
+
+	/*
+	 * Restore SControl IPM and SPD for the next driver and
 	 * disable attached devices.
 	 */
 	ata_for_each_link(link, ap, PMP_FIRST) {
@@ -684,6 +698,10 @@ void ata_scsi_port_error_handler(struct Scsi_Host *host, struct ata_port *ap)
 			ehc->saved_xfer_mode[devno] = dev->xfer_mode;
 			if (ata_ncq_enabled(dev))
 				ehc->saved_ncq_enabled |= 1 << devno;
+
+			/* If we are resuming, wake up the device */
+			if (ap->pflags & ATA_PFLAG_RESUMING)
+				ehc->i.dev_action[devno] |= ATA_EH_SET_ACTIVE;
 		}
 	}
 
@@ -743,6 +761,8 @@ void ata_scsi_port_error_handler(struct Scsi_Host *host, struct ata_port *ap)
 	/* clean up */
 	spin_lock_irqsave(ap->lock, flags);
 
+	ap->pflags &= ~ATA_PFLAG_RESUMING;
+
 	if (ap->pflags & ATA_PFLAG_LOADING)
 		ap->pflags &= ~ATA_PFLAG_LOADING;
 	else if ((ap->pflags & ATA_PFLAG_SCSI_HOTPLUG) &&
@@ -1218,6 +1238,13 @@ void ata_eh_detach_dev(struct ata_device *dev)
 	struct ata_eh_context *ehc = &link->eh_context;
 	unsigned long flags;
 
+	/*
+	 * If the device is still enabled, transition it to standby power mode
+	 * (i.e. spin down HDDs).
+	 */
+	if (ata_dev_enabled(dev))
+		ata_dev_power_set_standby(dev);
+
 	ata_dev_disable(dev);
 
 	spin_lock_irqsave(ap->lock, flags);
@@ -3026,6 +3053,15 @@ static int ata_eh_revalidate_and_attach(struct ata_link *link,
 		if (ehc->i.flags & ATA_EHI_DID_RESET)
 			readid_flags |= ATA_READID_POSTRESET;
 
+		/*
+		 * When resuming, before executing any command, make sure to
+		 * transition the device to the active power mode.
+		 */
+		if ((action & ATA_EH_SET_ACTIVE) && ata_dev_enabled(dev)) {
+			ata_dev_power_set_active(dev);
+			ata_eh_done(link, dev, ATA_EH_SET_ACTIVE);
+		}
+
 		if ((action & ATA_EH_REVALIDATE) && ata_dev_enabled(dev)) {
 			WARN_ON(dev->class == ATA_DEV_PMP);
 
@@ -3999,6 +4035,7 @@ static void ata_eh_handle_port_suspend(struct ata_port *ap)
 	unsigned long flags;
 	int rc = 0;
 	struct ata_device *dev;
+	struct ata_link *link;
 
 	/* are we suspending? */
 	spin_lock_irqsave(ap->lock, flags);
@@ -4011,6 +4048,12 @@ static void ata_eh_handle_port_suspend(struct ata_port *ap)
 
 	WARN_ON(ap->pflags & ATA_PFLAG_SUSPENDED);
 
+	/* Set all devices attached to the port in standby mode */
+	ata_for_each_link(link, ap, HOST_FIRST) {
+		ata_for_each_dev(dev, link, ENABLED)
+			ata_dev_power_set_standby(dev);
+	}
+
 	/*
 	 * If we have a ZPODD attached, check its zero
 	 * power ready status before the port is frozen.
@@ -4093,6 +4136,7 @@ static void ata_eh_handle_port_resume(struct ata_port *ap)
 	/* update the flags */
 	spin_lock_irqsave(ap->lock, flags);
 	ap->pflags &= ~(ATA_PFLAG_PM_PENDING | ATA_PFLAG_SUSPENDED);
+	ap->pflags |= ATA_PFLAG_RESUMING;
 	spin_unlock_irqrestore(ap->lock, flags);
 }
 #endif /* CONFIG_PM */
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index f63cf6e7332e..9bb1ace8bf79 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1050,14 +1050,6 @@ int ata_scsi_dev_config(struct scsi_device *sdev, struct ata_device *dev)
 		}
 	} else {
 		sdev->sector_size = ata_id_logical_sector_size(dev->id);
-		/*
-		 * Stop the drive on suspend but do not issue START STOP UNIT
-		 * on resume as this is not necessary and may fail: the device
-		 * will be woken up by ata_port_pm_resume() with a port reset
-		 * and device revalidation.
-		 */
-		sdev->manage_start_stop = 1;
-		sdev->no_start_on_resume = 1;
 	}
 
 	/*
@@ -1231,7 +1223,7 @@ static unsigned int ata_scsi_start_stop_xlat(struct ata_queued_cmd *qc)
 	}
 
 	if (cdb[4] & 0x1) {
-		tf->nsect = 1;	/* 1 sector, lba=0 */
+		tf->nsect = 1;  /* 1 sector, lba=0 */
 
 		if (qc->dev->flags & ATA_DFLAG_LBA) {
 			tf->flags |= ATA_TFLAG_LBA;
@@ -1247,7 +1239,7 @@ static unsigned int ata_scsi_start_stop_xlat(struct ata_queued_cmd *qc)
 			tf->lbah = 0x0; /* cyl high */
 		}
 
-		tf->command = ATA_CMD_VERIFY;	/* READ VERIFY */
+		tf->command = ATA_CMD_VERIFY;   /* READ VERIFY */
 	} else {
 		/* Some odd clown BIOSen issue spindown on power off (ACPI S4
 		 * or S5) causing some drives to spin up and down again.
@@ -1257,7 +1249,7 @@ static unsigned int ata_scsi_start_stop_xlat(struct ata_queued_cmd *qc)
 			goto skip;
 
 		if ((qc->ap->flags & ATA_FLAG_NO_HIBERNATE_SPINDOWN) &&
-		     system_entering_hibernation())
+		    system_entering_hibernation())
 			goto skip;
 
 		/* Issue ATA STANDBY IMMEDIATE command */
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index 079981e7156a..a5ee06f0234a 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -60,6 +60,8 @@ extern int ata_dev_reread_id(struct ata_device *dev, unsigned int readid_flags);
 extern int ata_dev_revalidate(struct ata_device *dev, unsigned int new_class,
 			      unsigned int readid_flags);
 extern int ata_dev_configure(struct ata_device *dev);
+extern void ata_dev_power_set_standby(struct ata_device *dev);
+extern void ata_dev_power_set_active(struct ata_device *dev);
 extern int sata_down_spd_limit(struct ata_link *link, u32 spd_limit);
 extern int ata_down_xfermask_limit(struct ata_device *dev, unsigned int sel);
 extern unsigned int ata_dev_set_feature(struct ata_device *dev,
diff --git a/include/linux/libata.h b/include/linux/libata.h
index c8cfea386c16..6593c79b7290 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -192,6 +192,7 @@ enum {
 	ATA_PFLAG_UNLOADING	= (1 << 9), /* driver is being unloaded */
 	ATA_PFLAG_UNLOADED	= (1 << 10), /* driver is unloaded */
 
+	ATA_PFLAG_RESUMING	= (1 << 16),  /* port is being resumed */
 	ATA_PFLAG_SUSPENDED	= (1 << 17), /* port is suspended (power) */
 	ATA_PFLAG_PM_PENDING	= (1 << 18), /* PM operation pending */
 	ATA_PFLAG_INIT_GTM_VALID = (1 << 19), /* initial gtm data valid */
@@ -314,9 +315,10 @@ enum {
 	ATA_EH_ENABLE_LINK	= (1 << 3),
 	ATA_EH_PARK		= (1 << 5), /* unload heads and stop I/O */
 	ATA_EH_GET_SUCCESS_SENSE = (1 << 6), /* Get sense data for successful cmd */
+	ATA_EH_SET_ACTIVE	= (1 << 7), /* Set a device to active power mode */
 
 	ATA_EH_PERDEV_MASK	= ATA_EH_REVALIDATE | ATA_EH_PARK |
-				  ATA_EH_GET_SUCCESS_SENSE,
+				  ATA_EH_GET_SUCCESS_SENSE | ATA_EH_SET_ACTIVE,
 	ATA_EH_ALL_ACTIONS	= ATA_EH_REVALIDATE | ATA_EH_RESET |
 				  ATA_EH_ENABLE_LINK,
 
@@ -353,7 +355,7 @@ enum {
 	/* This should match the actual table size of
 	 * ata_eh_cmd_timeout_table in libata-eh.c.
 	 */
-	ATA_EH_CMD_TIMEOUT_TABLE_SIZE = 7,
+	ATA_EH_CMD_TIMEOUT_TABLE_SIZE = 8,
 
 	/* Horkage types. May be set by libata or controller on drives
 	   (some horkage may be drive/controller pair dependent */
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 05/21] ata: libata-scsi: Fix delayed scsi_rescan_device() execution
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (3 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 04/21] ata: libata-scsi: Disable scsi device manage_start_stop Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 06/21] ata: libata-core: Do not register PM operations for SAS ports Damien Le Moal
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

Commit 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after
device resume") modified ata_scsi_dev_rescan() to check the scsi device
"is_suspended" power field to ensure that the scsi device associated
with an ATA device is fully resumed when scsi_rescan_device() is
executed. However, this fix is problematic as:
1) it relies on a PM internal field that should not be used without PM
   device locking protection.
2) The check for is_suspended and the call to ata_scsi_dev_rescan() are
   not atomic and a suspend PM even may be triggered between them,
   casuing ata_scsi_dev_rescan() to be called on a suspended device,
   resulting in that function blocking while holding the scsi device
   lock, which would deadlock a following resume operation.
These problems can trigger PM deadlocks on resume, especially with
resume operations triggered quickly after or during suspend operations.
E.g., a simple bash script like:

for (( i=0; i<10; i++ )); do
	echo "+2 > /sys/class/rtc/rtc0/wakealarm
	echo mem > /sys/power/state
done

that triggers a resume 2 seconds after starting suspending a system can
quickly lead to a PM deadlock preventing the system from correctly
resuming.

Fix this by replacing the check on is_suspended with a check on the scsi
device state inside ata_scsi_dev_rescan(), while holding the scsi device
lock, thus making the device rescan atomic with regard to PM operations.
Additionnly, make sure that scheduled rescan tasks are first cancelled
before suspending an ata port.

Fixes: 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 16 ++++++++++++++++
 drivers/ata/libata-scsi.c | 36 ++++++++++++++++++------------------
 drivers/scsi/scsi_scan.c  | 12 +++++++++++-
 include/scsi/scsi_host.h  |  2 +-
 4 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 0cf0caf77907..0479493e54bd 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5172,11 +5172,27 @@ static const unsigned int ata_port_suspend_ehi = ATA_EHI_QUIET
 
 static void ata_port_suspend(struct ata_port *ap, pm_message_t mesg)
 {
+	/*
+	 * We are about to suspend the port, so we do not care about
+	 * scsi_rescan_device() calls scheduled by previous resume operations.
+	 * The next resume will schedule the rescan again. So cancel any rescan
+	 * that is not done yet.
+	 */
+	cancel_delayed_work_sync(&ap->scsi_rescan_task);
+
 	ata_port_request_pm(ap, mesg, 0, ata_port_suspend_ehi, false);
 }
 
 static void ata_port_suspend_async(struct ata_port *ap, pm_message_t mesg)
 {
+	/*
+	 * We are about to suspend the port, so we do not care about
+	 * scsi_rescan_device() calls scheduled by previous resume operations.
+	 * The next resume will schedule the rescan again. So cancel any rescan
+	 * that is not done yet.
+	 */
+	cancel_delayed_work_sync(&ap->scsi_rescan_task);
+
 	ata_port_request_pm(ap, mesg, 0, ata_port_suspend_ehi, true);
 }
 
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 9bb1ace8bf79..f2d4460ab450 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4750,7 +4750,7 @@ void ata_scsi_dev_rescan(struct work_struct *work)
 	struct ata_link *link;
 	struct ata_device *dev;
 	unsigned long flags;
-	bool delay_rescan = false;
+	int ret = 0;
 
 	mutex_lock(&ap->scsi_scan_mutex);
 	spin_lock_irqsave(ap->lock, flags);
@@ -4759,37 +4759,37 @@ void ata_scsi_dev_rescan(struct work_struct *work)
 		ata_for_each_dev(dev, link, ENABLED) {
 			struct scsi_device *sdev = dev->sdev;
 
+			/*
+			 * If the port was suspended before this was scheduled,
+			 * bail out.
+			 */
+			if (ap->pflags & ATA_PFLAG_SUSPENDED)
+				goto unlock;
+
 			if (!sdev)
 				continue;
 			if (scsi_device_get(sdev))
 				continue;
 
-			/*
-			 * If the rescan work was scheduled because of a resume
-			 * event, the port is already fully resumed, but the
-			 * SCSI device may not yet be fully resumed. In such
-			 * case, executing scsi_rescan_device() may cause a
-			 * deadlock with the PM code on device_lock(). Prevent
-			 * this by giving up and retrying rescan after a short
-			 * delay.
-			 */
-			delay_rescan = sdev->sdev_gendev.power.is_suspended;
-			if (delay_rescan) {
-				scsi_device_put(sdev);
-				break;
-			}
-
 			spin_unlock_irqrestore(ap->lock, flags);
-			scsi_rescan_device(sdev);
+			ret = scsi_rescan_device(sdev);
 			scsi_device_put(sdev);
 			spin_lock_irqsave(ap->lock, flags);
+
+			if (ret)
+				goto unlock;
 		}
 	}
 
+unlock:
 	spin_unlock_irqrestore(ap->lock, flags);
 	mutex_unlock(&ap->scsi_scan_mutex);
 
-	if (delay_rescan)
+	/*
+	 * Reschedule with a delay if scsi_rescan_device() returned an error
+	 * and the port has not been suspended.
+	 */
+	if (ret && !(ap->pflags & ATA_PFLAG_SUSPENDED))
 		schedule_delayed_work(&ap->scsi_rescan_task,
 				      msecs_to_jiffies(5));
 }
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 52014b2d39e1..6650f63afec9 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1619,12 +1619,18 @@ int scsi_add_device(struct Scsi_Host *host, uint channel,
 }
 EXPORT_SYMBOL(scsi_add_device);
 
-void scsi_rescan_device(struct scsi_device *sdev)
+int scsi_rescan_device(struct scsi_device *sdev)
 {
 	struct device *dev = &sdev->sdev_gendev;
+	int ret = 0;
 
 	device_lock(dev);
 
+	if (sdev->sdev_state != SDEV_RUNNING) {
+		ret = -ENXIO;
+		goto unlock;
+	}
+
 	scsi_attach_vpd(sdev);
 	scsi_cdl_check(sdev);
 
@@ -1638,7 +1644,11 @@ void scsi_rescan_device(struct scsi_device *sdev)
 			drv->rescan(dev);
 		module_put(dev->driver->owner);
 	}
+
+unlock:
 	device_unlock(dev);
+
+	return ret;
 }
 EXPORT_SYMBOL(scsi_rescan_device);
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 49f768d0ff37..4c2dc8150c6d 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -764,7 +764,7 @@ scsi_template_proc_dir(const struct scsi_host_template *sht);
 #define scsi_template_proc_dir(sht) NULL
 #endif
 extern void scsi_scan_host(struct Scsi_Host *);
-extern void scsi_rescan_device(struct scsi_device *);
+extern int scsi_rescan_device(struct scsi_device *sdev);
 extern void scsi_remove_host(struct Scsi_Host *);
 extern struct Scsi_Host *scsi_host_get(struct Scsi_Host *);
 extern int scsi_host_busy(struct Scsi_Host *shost);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 06/21] ata: libata-core: Do not register PM operations for SAS ports
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (4 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 05/21] ata: libata-scsi: Fix delayed scsi_rescan_device() execution Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 07/21] scsi: sd: Do not issue commands to suspended disks on remove Damien Le Moal
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

libsas does its own domain based power management of ports. For such
ports, libata should not use a device type defining power management
operations as executing these operations for suspend/resume in addition
to libsas calls to ata_sas_port_suspend() and ata_sas_port_resume() is
not necessary (and likely dangerous to do, even though problems are not
seen currently).

Introduce the new ata_port_sas_type device_type for ports managed by
libsas. This new device type is used in ata_tport_add() and is defined
without power management operations.

Fixes: 2fcbdcb4c802 ("[SCSI] libata: export ata_port suspend/resume infrastructure for sas")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c      | 2 +-
 drivers/ata/libata-transport.c | 9 ++++++++-
 drivers/ata/libata.h           | 2 ++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 0479493e54bd..18b2a0da9e54 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5339,7 +5339,7 @@ EXPORT_SYMBOL_GPL(ata_host_resume);
 #endif
 
 const struct device_type ata_port_type = {
-	.name = "ata_port",
+	.name = ATA_PORT_TYPE_NAME,
 #ifdef CONFIG_PM
 	.pm = &ata_port_pm_ops,
 #endif
diff --git a/drivers/ata/libata-transport.c b/drivers/ata/libata-transport.c
index e4fb9d1b9b39..3e49a877500e 100644
--- a/drivers/ata/libata-transport.c
+++ b/drivers/ata/libata-transport.c
@@ -266,6 +266,10 @@ void ata_tport_delete(struct ata_port *ap)
 	put_device(dev);
 }
 
+static const struct device_type ata_port_sas_type = {
+	.name = ATA_PORT_TYPE_NAME,
+};
+
 /** ata_tport_add - initialize a transport ATA port structure
  *
  * @parent:	parent device
@@ -283,7 +287,10 @@ int ata_tport_add(struct device *parent,
 	struct device *dev = &ap->tdev;
 
 	device_initialize(dev);
-	dev->type = &ata_port_type;
+	if (ap->flags & ATA_FLAG_SAS_HOST)
+		dev->type = &ata_port_sas_type;
+	else
+		dev->type = &ata_port_type;
 
 	dev->parent = parent;
 	ata_host_get(ap->host);
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index a5ee06f0234a..c57e094c3af9 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -30,6 +30,8 @@ enum {
 	ATA_DNXFER_QUIET	= (1 << 31),
 };
 
+#define ATA_PORT_TYPE_NAME	"ata_port"
+
 extern atomic_t ata_print_id;
 extern int atapi_passthru16;
 extern int libata_fua;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 07/21] scsi: sd: Do not issue commands to suspended disks on remove
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (5 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 06/21] ata: libata-core: Do not register PM operations for SAS ports Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-14 14:48   ` Bart Van Assche
  2023-09-12  0:56 ` [PATCH v2 08/21] ata: libata-core: Fix compilation warning in ata_dev_config_ncq() Damien Le Moal
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

If an error occurs when resuming a host adapter before the devices
attached to the adapter are resumed, the adapter low level driver may
remove the scsi host, resulting in a call to sd_remove() for the
disks of the host. However, since this function calls sd_shutdown(),
a synchronize cache command and a start stop unit may be issued with the
drive still sleeping and the HBA non-functional. This causes PM resume
to hang, forcing a reset of the machine to recover.

Fix this by checking a device host state in sd_shutdown() and by
returning early doing nothing if the host state is not SHOST_RUNNING.

Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/sd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index c92a317ba547..a415abb721d3 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3763,7 +3763,8 @@ static void sd_shutdown(struct device *dev)
 	if (!sdkp)
 		return;         /* this can happen */
 
-	if (pm_runtime_suspended(dev))
+	if (pm_runtime_suspended(dev) ||
+	    sdkp->device->host->shost_state != SHOST_RUNNING)
 		return;
 
 	if (sdkp->WCE && sdkp->media_present) {
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 08/21] ata: libata-core: Fix compilation warning in ata_dev_config_ncq()
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (6 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 07/21] scsi: sd: Do not issue commands to suspended disks on remove Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  6:14   ` Hannes Reinecke
  2023-09-12  0:56 ` [PATCH v2 09/21] ata: libata-eh: Fix compilation warning in ata_eh_link_report() Damien Le Moal
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

The 24 bytes length allocated to the ncq_desc string in
ata_dev_config_lba() for ata_dev_config_ncq() to use is too short,
causing the following gcc compilation warnings when compiling with W=1:

drivers/ata/libata-core.c: In function ‘ata_dev_configure’:
drivers/ata/libata-core.c:2378:56: warning: ‘%d’ directive output may be truncated writing between 1 and 2 bytes into a region of size between 1 and 11 [-Wformat-truncation=]
 2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
      |                                                        ^~
In function ‘ata_dev_config_ncq’,
    inlined from ‘ata_dev_config_lba’ at drivers/ata/libata-core.c:2649:8,
    inlined from ‘ata_dev_configure’ at drivers/ata/libata-core.c:2952:9:
drivers/ata/libata-core.c:2378:41: note: directive argument in the range [1, 32]
 2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
      |                                         ^~~~~~~~~~~~~~~~~~~~~
drivers/ata/libata-core.c:2378:17: note: ‘snprintf’ output between 16 and 31 bytes into a destination of size 24
 2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2379 |                         ddepth, aa_desc);
      |                         ~~~~~~~~~~~~~~~~

Avoid these warnings and the potential truncation by changing the size
of the ncq_desc string to 32 characters.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
 drivers/ata/libata-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 18b2a0da9e54..2405ac8b53f0 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2619,7 +2619,7 @@ static int ata_dev_config_lba(struct ata_device *dev)
 {
 	const u16 *id = dev->id;
 	const char *lba_desc;
-	char ncq_desc[24];
+	char ncq_desc[32];
 	int ret;
 
 	dev->flags |= ATA_DFLAG_LBA;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 09/21] ata: libata-eh: Fix compilation warning in ata_eh_link_report()
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (7 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 08/21] ata: libata-core: Fix compilation warning in ata_dev_config_ncq() Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  6:14   ` Hannes Reinecke
  2023-09-12  0:56 ` [PATCH v2 10/21] scsi: Remove scsi device no_start_on_resume flag Damien Le Moal
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

The 6 bytes length of the tries_buf string in ata_eh_link_report() is
too short and results in a gcc compilation warning with W-!:

drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’:
drivers/ata/libata-eh.c:2371:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-truncation=]
 2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
      |                                                           ^~
drivers/ata/libata-eh.c:2371:56: note: directive argument in the range [-2147483648, 4]
 2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
      |                                                        ^~~~~~
drivers/ata/libata-eh.c:2371:17: note: ‘snprintf’ output between 4 and 14 bytes into a destination of size 6
 2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2372 |                          ap->eh_tries);
      |                          ~~~~~~~~~~~~~

Avoid this warning by increasing the string size to 16B.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
 drivers/ata/libata-eh.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 43b0a1509548..03c45630a35b 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2332,7 +2332,7 @@ static void ata_eh_link_report(struct ata_link *link)
 	struct ata_eh_context *ehc = &link->eh_context;
 	struct ata_queued_cmd *qc;
 	const char *frozen, *desc;
-	char tries_buf[6] = "";
+	char tries_buf[16] = "";
 	int tag, nr_failed = 0;
 
 	if (ehc->i.flags & ATA_EHI_QUIET)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 10/21] scsi: Remove scsi device no_start_on_resume flag
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (8 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 09/21] ata: libata-eh: Fix compilation warning in ata_eh_link_report() Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 11/21] ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat() Damien Le Moal
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

The scsi device flag no_start_on_resume is not set by any scsi low
level driver. Remove it.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/sd.c          | 7 ++-----
 include/scsi/scsi_device.h | 1 -
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index a415abb721d3..b9504bb3cf4d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3846,11 +3846,8 @@ static int sd_resume(struct device *dev)
 	if (!sdkp->device->manage_start_stop)
 		return 0;
 
-	if (!sdkp->device->no_start_on_resume) {
-		sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
-		ret = sd_start_stop_device(sdkp, 1);
-	}
-
+	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
+	ret = sd_start_stop_device(sdkp, 1);
 	if (!ret)
 		opal_unlock_from_suspend(sdkp->opal_dev);
 	return ret;
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index b9230b6add04..75b2235b99e2 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -194,7 +194,6 @@ struct scsi_device {
 	unsigned no_start_on_add:1;	/* do not issue start on add */
 	unsigned allow_restart:1; /* issue START_UNIT in error handler */
 	unsigned manage_start_stop:1;	/* Let HLD (sd) manage start/stop */
-	unsigned no_start_on_resume:1; /* Do not issue START_STOP_UNIT on resume */
 	unsigned start_stop_pwr_cond:1;	/* Set power cond. in START_STOP_UNIT */
 	unsigned no_uld_attach:1; /* disable connecting to upper level drivers */
 	unsigned select_no_atn:1;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 11/21] ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat()
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (9 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 10/21] scsi: Remove scsi device no_start_on_resume flag Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 12/21] ata: libata-core: Synchronize ata_port_detach() with hotplug Damien Le Moal
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

Now that libata does its own internal device power mode management
through libata EH, the scsi disk driver will not issue START STOP UNIT
commands anymore. We can receive this command only from user passthrough
operations. So there is no need to consider the system state and ATA
port flags for suspend to translate the command.

Since setting up the taskfile for the verify and standby
immediate commands is the same as done in ata_dev_power_set_active()
and ata_dev_power_set_standby(), factor out this code into the helper
function ata_dev_power_init_tf() to simplify ata_scsi_start_stop_xlat()
as well as ata_dev_power_set_active() and ata_dev_power_set_standby().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 55 +++++++++++++++++++++++----------------
 drivers/ata/libata-scsi.c | 53 +++++++------------------------------
 drivers/ata/libata.h      |  2 ++
 3 files changed, 44 insertions(+), 66 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 2405ac8b53f0..8d1949302a8a 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1972,6 +1972,35 @@ int ata_dev_read_id(struct ata_device *dev, unsigned int *p_class,
 	return rc;
 }
 
+bool ata_dev_power_init_tf(struct ata_device *dev, struct ata_taskfile *tf,
+			   bool set_active)
+{
+	/* Only applies to ATA and ZAC devices */
+	if (dev->class != ATA_DEV_ATA && dev->class != ATA_DEV_ZAC)
+		return false;
+
+	ata_tf_init(dev, tf);
+	tf->flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
+	tf->protocol = ATA_PROT_NODATA;
+
+	if (set_active) {
+		/* VERIFY for 1 sector at lba=0 */
+		tf->command = ATA_CMD_VERIFY;
+		tf->nsect = 1;
+		if (dev->flags & ATA_DFLAG_LBA) {
+			tf->flags |= ATA_TFLAG_LBA;
+			tf->device |= ATA_LBA;
+		} else {
+			/* CHS */
+			tf->lbal = 0x1; /* sect */
+		}
+	} else {
+		tf->command = ATA_CMD_STANDBYNOW1;
+	}
+
+	return true;
+}
+
 /**
  *	ata_dev_power_set_standby - Set a device power mode to standby
  *	@dev: target device
@@ -1988,10 +2017,6 @@ void ata_dev_power_set_standby(struct ata_device *dev)
 	struct ata_taskfile tf;
 	unsigned int err_mask;
 
-	/* Issue STANDBY IMMEDIATE command only if supported by the device */
-	if (dev->class != ATA_DEV_ATA && dev->class != ATA_DEV_ZAC)
-		return;
-
 	/*
 	 * Some odd clown BIOSes issue spindown on power off (ACPI S4 or S5)
 	 * causing some drives to spin up and down again. For these, do nothing
@@ -2005,10 +2030,9 @@ void ata_dev_power_set_standby(struct ata_device *dev)
 	    system_entering_hibernation())
 		return;
 
-	ata_tf_init(dev, &tf);
-	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
-	tf.protocol = ATA_PROT_NODATA;
-	tf.command = ATA_CMD_STANDBYNOW1;
+	/* Issue STANDBY IMMEDIATE command only if supported by the device */
+	if (!ata_dev_power_init_tf(dev, &tf, false))
+		return;
 
 	ata_dev_notice(dev, "Entering standby power mode\n");
 
@@ -2038,22 +2062,9 @@ void ata_dev_power_set_active(struct ata_device *dev)
 	 * Issue READ VERIFY SECTORS command for 1 sector at lba=0 only
 	 * if supported by the device.
 	 */
-	if (dev->class != ATA_DEV_ATA && dev->class != ATA_DEV_ZAC)
+	if (!ata_dev_power_init_tf(dev, &tf, true))
 		return;
 
-	ata_tf_init(dev, &tf);
-	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
-	tf.protocol = ATA_PROT_NODATA;
-	tf.command = ATA_CMD_VERIFY;
-	tf.nsect = 1;
-	if (dev->flags & ATA_DFLAG_LBA) {
-		tf.flags |= ATA_TFLAG_LBA;
-		tf.device |= ATA_LBA;
-	} else {
-		/* CHS */
-		tf.lbal = 0x1; /* sect */
-	}
-
 	ata_dev_notice(dev, "Entering active power mode\n");
 
 	err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0, 0);
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index f2d4460ab450..92ae4b4f30ac 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1196,7 +1196,6 @@ EXPORT_SYMBOL_GPL(ata_scsi_slave_destroy);
 static unsigned int ata_scsi_start_stop_xlat(struct ata_queued_cmd *qc)
 {
 	struct scsi_cmnd *scmd = qc->scsicmd;
-	struct ata_taskfile *tf = &qc->tf;
 	const u8 *cdb = scmd->cmnd;
 	u16 fp;
 	u8 bp = 0xff;
@@ -1206,54 +1205,24 @@ static unsigned int ata_scsi_start_stop_xlat(struct ata_queued_cmd *qc)
 		goto invalid_fld;
 	}
 
-	tf->flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
-	tf->protocol = ATA_PROT_NODATA;
-	if (cdb[1] & 0x1) {
-		;	/* ignore IMMED bit, violates sat-r05 */
-	}
+	/* LOEJ bit set not supported */
 	if (cdb[4] & 0x2) {
 		fp = 4;
 		bp = 1;
-		goto invalid_fld;       /* LOEJ bit set not supported */
+		goto invalid_fld;
 	}
+
+	/* Power conditions not supported */
 	if (((cdb[4] >> 4) & 0xf) != 0) {
 		fp = 4;
 		bp = 3;
-		goto invalid_fld;       /* power conditions not supported */
+		goto invalid_fld;
 	}
 
-	if (cdb[4] & 0x1) {
-		tf->nsect = 1;  /* 1 sector, lba=0 */
-
-		if (qc->dev->flags & ATA_DFLAG_LBA) {
-			tf->flags |= ATA_TFLAG_LBA;
-
-			tf->lbah = 0x0;
-			tf->lbam = 0x0;
-			tf->lbal = 0x0;
-			tf->device |= ATA_LBA;
-		} else {
-			/* CHS */
-			tf->lbal = 0x1; /* sect */
-			tf->lbam = 0x0; /* cyl low */
-			tf->lbah = 0x0; /* cyl high */
-		}
-
-		tf->command = ATA_CMD_VERIFY;   /* READ VERIFY */
-	} else {
-		/* Some odd clown BIOSen issue spindown on power off (ACPI S4
-		 * or S5) causing some drives to spin up and down again.
-		 */
-		if ((qc->ap->flags & ATA_FLAG_NO_POWEROFF_SPINDOWN) &&
-		    system_state == SYSTEM_POWER_OFF)
-			goto skip;
-
-		if ((qc->ap->flags & ATA_FLAG_NO_HIBERNATE_SPINDOWN) &&
-		    system_entering_hibernation())
-			goto skip;
-
-		/* Issue ATA STANDBY IMMEDIATE command */
-		tf->command = ATA_CMD_STANDBYNOW1;
+	/* Ignore IMMED bit (cdb[1] & 0x1), violates sat-r05 */
+	if (!ata_dev_power_init_tf(qc->dev, &qc->tf, cdb[4] & 0x1)) {
+		ata_scsi_set_sense(qc->dev, scmd, ABORTED_COMMAND, 0, 0);
+		return 1;
 	}
 
 	/*
@@ -1268,12 +1237,8 @@ static unsigned int ata_scsi_start_stop_xlat(struct ata_queued_cmd *qc)
  invalid_fld:
 	ata_scsi_set_invalid_field(qc->dev, scmd, fp, bp);
 	return 1;
- skip:
-	scmd->result = SAM_STAT_GOOD;
-	return 1;
 }
 
-
 /**
  *	ata_scsi_flush_xlat - Translate SCSI SYNCHRONIZE CACHE command
  *	@qc: Storage for translated ATA taskfile
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index c57e094c3af9..af0e718f2b72 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -62,6 +62,8 @@ extern int ata_dev_reread_id(struct ata_device *dev, unsigned int readid_flags);
 extern int ata_dev_revalidate(struct ata_device *dev, unsigned int new_class,
 			      unsigned int readid_flags);
 extern int ata_dev_configure(struct ata_device *dev);
+extern bool ata_dev_power_init_tf(struct ata_device *dev,
+				  struct ata_taskfile *tf, bool set_active);
 extern void ata_dev_power_set_standby(struct ata_device *dev);
 extern void ata_dev_power_set_active(struct ata_device *dev);
 extern int sata_down_spd_limit(struct ata_link *link, u32 spd_limit);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 12/21] ata: libata-core: Synchronize ata_port_detach() with hotplug
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (10 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 11/21] ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat() Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 13/21] ata: libata-core: Detach a port devices on shutdown Damien Le Moal
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

The call to async_synchronize_cookie() to synchronize a port removal
and hotplug probe is done in ata_host_detach() right before calling
ata_port_detach(). Move this call at the beginning of ata_port_detach()
to ensure that this operation is always synchronized with probe.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 8d1949302a8a..9f05ad187c9e 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6069,6 +6069,9 @@ static void ata_port_detach(struct ata_port *ap)
 	struct ata_link *link;
 	struct ata_device *dev;
 
+	/* Ensure ata_port probe has completed */
+	async_synchronize_cookie(ap->cookie + 1);
+
 	/* Wait for any ongoing EH */
 	ata_port_wait_eh(ap);
 
@@ -6133,11 +6136,8 @@ void ata_host_detach(struct ata_host *host)
 {
 	int i;
 
-	for (i = 0; i < host->n_ports; i++) {
-		/* Ensure ata_port probe has completed */
-		async_synchronize_cookie(host->ports[i]->cookie + 1);
+	for (i = 0; i < host->n_ports; i++)
 		ata_port_detach(host->ports[i]);
-	}
 
 	/* the host is dead now, dissociate ACPI */
 	ata_acpi_dissociate(host);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 13/21] ata: libata-core: Detach a port devices on shutdown
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (11 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 12/21] ata: libata-core: Synchronize ata_port_detach() with hotplug Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 14/21] ata: libata-core: Remove ata_port_suspend_async() Damien Le Moal
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

Modify ata_pci_shutdown_one() to schedule EH to unload a port devices
before freezing and thawing the port. This ensures that drives are
cleanly disabled and transitioned to standby power mode when
a PCI adapter is removed or the system is powered off.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 9f05ad187c9e..7d110e5454bf 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6168,10 +6168,24 @@ EXPORT_SYMBOL_GPL(ata_pci_remove_one);
 void ata_pci_shutdown_one(struct pci_dev *pdev)
 {
 	struct ata_host *host = pci_get_drvdata(pdev);
+	struct ata_port *ap;
+	unsigned long flags;
 	int i;
 
+	/* Tell EH to disable all devices */
 	for (i = 0; i < host->n_ports; i++) {
-		struct ata_port *ap = host->ports[i];
+		ap = host->ports[i];
+		spin_lock_irqsave(ap->lock, flags);
+		ap->pflags |= ATA_PFLAG_UNLOADING;
+		ata_port_schedule_eh(ap);
+		spin_unlock_irqrestore(ap->lock, flags);
+	}
+
+	for (i = 0; i < host->n_ports; i++) {
+		ap = host->ports[i];
+
+		/* Wait for EH to complete before freezing the port */
+		ata_port_wait_eh(ap);
 
 		ap->pflags |= ATA_PFLAG_FROZEN;
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 14/21] ata: libata-core: Remove ata_port_suspend_async()
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (12 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 13/21] ata: libata-core: Detach a port devices on shutdown Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 15/21] ata: libata-core: Remove ata_port_resume_async() Damien Le Moal
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

ata_port_suspend_async() is only called by ata_sas_port_suspend().
Modify ata_port_suspend() with an additional bool argument indicating an
asynchronous or synchronous suspend to allow removing that helper
function. With this change, the variable ata_port_resume_ehi can also be
removed and its value (ATA_EHI_XXX flags passed directly to
ata_port_request_pm().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 46 +++++++++++++++------------------------
 1 file changed, 17 insertions(+), 29 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 7d110e5454bf..cdfe71ae6a6c 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5170,18 +5170,8 @@ static void ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
 		ata_port_wait_eh(ap);
 }
 
-/*
- * On some hardware, device fails to respond after spun down for suspend.  As
- * the device won't be used before being resumed, we don't need to touch the
- * device.  Ask EH to skip the usual stuff and proceed directly to suspend.
- *
- * http://thread.gmane.org/gmane.linux.ide/46764
- */
-static const unsigned int ata_port_suspend_ehi = ATA_EHI_QUIET
-						 | ATA_EHI_NO_AUTOPSY
-						 | ATA_EHI_NO_RECOVERY;
-
-static void ata_port_suspend(struct ata_port *ap, pm_message_t mesg)
+static void ata_port_suspend(struct ata_port *ap, pm_message_t mesg,
+			     bool async)
 {
 	/*
 	 * We are about to suspend the port, so we do not care about
@@ -5191,20 +5181,18 @@ static void ata_port_suspend(struct ata_port *ap, pm_message_t mesg)
 	 */
 	cancel_delayed_work_sync(&ap->scsi_rescan_task);
 
-	ata_port_request_pm(ap, mesg, 0, ata_port_suspend_ehi, false);
-}
-
-static void ata_port_suspend_async(struct ata_port *ap, pm_message_t mesg)
-{
 	/*
-	 * We are about to suspend the port, so we do not care about
-	 * scsi_rescan_device() calls scheduled by previous resume operations.
-	 * The next resume will schedule the rescan again. So cancel any rescan
-	 * that is not done yet.
+	 * On some hardware, device fails to respond after spun down for
+	 * suspend. As the device wil not t be used until being resumed, we
+	 * do not need to touch the device. Ask EH to skip the usual stuff
+	 * and proceed directly to suspend.
+	 *
+	 * http://thread.gmane.org/gmane.linux.ide/46764
 	 */
-	cancel_delayed_work_sync(&ap->scsi_rescan_task);
-
-	ata_port_request_pm(ap, mesg, 0, ata_port_suspend_ehi, true);
+	ata_port_request_pm(ap, mesg, 0,
+			    ATA_EHI_QUIET | ATA_EHI_NO_AUTOPSY |
+			    ATA_EHI_NO_RECOVERY,
+			    async);
 }
 
 static int ata_port_pm_suspend(struct device *dev)
@@ -5214,7 +5202,7 @@ static int ata_port_pm_suspend(struct device *dev)
 	if (pm_runtime_suspended(dev))
 		return 0;
 
-	ata_port_suspend(ap, PMSG_SUSPEND);
+	ata_port_suspend(ap, PMSG_SUSPEND, false);
 	return 0;
 }
 
@@ -5225,13 +5213,13 @@ static int ata_port_pm_freeze(struct device *dev)
 	if (pm_runtime_suspended(dev))
 		return 0;
 
-	ata_port_suspend(ap, PMSG_FREEZE);
+	ata_port_suspend(ap, PMSG_FREEZE, false);
 	return 0;
 }
 
 static int ata_port_pm_poweroff(struct device *dev)
 {
-	ata_port_suspend(to_ata_port(dev), PMSG_HIBERNATE);
+	ata_port_suspend(to_ata_port(dev), PMSG_HIBERNATE, false);
 	return 0;
 }
 
@@ -5283,7 +5271,7 @@ static int ata_port_runtime_idle(struct device *dev)
 
 static int ata_port_runtime_suspend(struct device *dev)
 {
-	ata_port_suspend(to_ata_port(dev), PMSG_AUTO_SUSPEND);
+	ata_port_suspend(to_ata_port(dev), PMSG_AUTO_SUSPEND, false);
 	return 0;
 }
 
@@ -5313,7 +5301,7 @@ static const struct dev_pm_ops ata_port_pm_ops = {
  */
 void ata_sas_port_suspend(struct ata_port *ap)
 {
-	ata_port_suspend_async(ap, PMSG_SUSPEND);
+	ata_port_suspend(ap, PMSG_SUSPEND, true);
 }
 EXPORT_SYMBOL_GPL(ata_sas_port_suspend);
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 15/21] ata: libata-core: Remove ata_port_resume_async()
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (13 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 14/21] ata: libata-core: Remove ata_port_suspend_async() Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 16/21] ata: libata-core: skip poweroff for devices that are runtime suspended Damien Le Moal
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

Remove ata_port_resume_async() and replace it with a modified
ata_port_resume() taking an additional bool argument indicating if
ata EH resume operation should be executed synchronously or
asynchronously. With this change, the variable ata_port_resume_ehi is
not longer necessary and its value (ATA_EHI_XXX flags) passed directly
to ata_port_request_pm().

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index cdfe71ae6a6c..a4389dd807e5 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5223,22 +5223,17 @@ static int ata_port_pm_poweroff(struct device *dev)
 	return 0;
 }
 
-static const unsigned int ata_port_resume_ehi = ATA_EHI_NO_AUTOPSY
-						| ATA_EHI_QUIET;
-
-static void ata_port_resume(struct ata_port *ap, pm_message_t mesg)
+static void ata_port_resume(struct ata_port *ap, pm_message_t mesg,
+			    bool async)
 {
-	ata_port_request_pm(ap, mesg, ATA_EH_RESET, ata_port_resume_ehi, false);
-}
-
-static void ata_port_resume_async(struct ata_port *ap, pm_message_t mesg)
-{
-	ata_port_request_pm(ap, mesg, ATA_EH_RESET, ata_port_resume_ehi, true);
+	ata_port_request_pm(ap, mesg, ATA_EH_RESET,
+			    ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET,
+			    async);
 }
 
 static int ata_port_pm_resume(struct device *dev)
 {
-	ata_port_resume_async(to_ata_port(dev), PMSG_RESUME);
+	ata_port_resume(to_ata_port(dev), PMSG_RESUME, true);
 	pm_runtime_disable(dev);
 	pm_runtime_set_active(dev);
 	pm_runtime_enable(dev);
@@ -5277,7 +5272,7 @@ static int ata_port_runtime_suspend(struct device *dev)
 
 static int ata_port_runtime_resume(struct device *dev)
 {
-	ata_port_resume(to_ata_port(dev), PMSG_AUTO_RESUME);
+	ata_port_resume(to_ata_port(dev), PMSG_AUTO_RESUME, false);
 	return 0;
 }
 
@@ -5307,7 +5302,7 @@ EXPORT_SYMBOL_GPL(ata_sas_port_suspend);
 
 void ata_sas_port_resume(struct ata_port *ap)
 {
-	ata_port_resume_async(ap, PMSG_RESUME);
+	ata_port_resume(ap, PMSG_RESUME, true);
 }
 EXPORT_SYMBOL_GPL(ata_sas_port_resume);
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 16/21] ata: libata-core: skip poweroff for devices that are runtime suspended
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (14 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 15/21] ata: libata-core: Remove ata_port_resume_async() Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 17/21] ata: libata-core: Do not resume ports that have been " Damien Le Moal
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

When powering off, there is no need to suspend a port that has already
been runtime suspended. Skip the EH PM request in ata_port_pm_poweroff()
in this case.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index a4389dd807e5..c3adaa01cbe3 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5219,7 +5219,8 @@ static int ata_port_pm_freeze(struct device *dev)
 
 static int ata_port_pm_poweroff(struct device *dev)
 {
-	ata_port_suspend(to_ata_port(dev), PMSG_HIBERNATE, false);
+	if (!pm_runtime_suspended(dev))
+		ata_port_suspend(to_ata_port(dev), PMSG_HIBERNATE, false);
 	return 0;
 }
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 17/21] ata: libata-core: Do not resume ports that have been runtime suspended
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (15 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 16/21] ata: libata-core: skip poweroff for devices that are runtime suspended Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 18/21] ata: libata-sata: Improve ata_sas_slave_configure() Damien Le Moal
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

The scsi disk driver does not resume disks that have been runtime
suspended by the user. To be consistent with this behavior, do the same
for ata ports and skip the PM request in ata_port_pm_resume() if the
port was already runtime suspended. With this change, it is no longer
necessary to for the PM state of the port to ACTIVE as the PM core code
will take care of that when handling runtime resume.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-core.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index c3adaa01cbe3..080b451e7ddd 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5234,10 +5234,8 @@ static void ata_port_resume(struct ata_port *ap, pm_message_t mesg,
 
 static int ata_port_pm_resume(struct device *dev)
 {
-	ata_port_resume(to_ata_port(dev), PMSG_RESUME, true);
-	pm_runtime_disable(dev);
-	pm_runtime_set_active(dev);
-	pm_runtime_enable(dev);
+	if (!pm_runtime_suspended(dev))
+		ata_port_resume(to_ata_port(dev), PMSG_RESUME, true);
 	return 0;
 }
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 18/21] ata: libata-sata: Improve ata_sas_slave_configure()
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (16 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 17/21] ata: libata-core: Do not resume ports that have been " Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  7:43   ` John Garry
  2023-09-12  0:56 ` [PATCH v2 19/21] ata: libata-eh: Improve reset error messages Damien Le Moal
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

Change ata_sas_slave_configure() to return the return value of
ata_scsi_dev_config() to ensure that any error from that function is
propagated to libsas.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-sata.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
index 5d31c08be013..0748e9ea4f5f 100644
--- a/drivers/ata/libata-sata.c
+++ b/drivers/ata/libata-sata.c
@@ -1169,8 +1169,8 @@ EXPORT_SYMBOL_GPL(ata_sas_tport_delete);
 int ata_sas_slave_configure(struct scsi_device *sdev, struct ata_port *ap)
 {
 	ata_scsi_sdev_config(sdev);
-	ata_scsi_dev_config(sdev, ap->link.device);
-	return 0;
+
+	return ata_scsi_dev_config(sdev, ap->link.device);
 }
 EXPORT_SYMBOL_GPL(ata_sas_slave_configure);
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 19/21] ata: libata-eh: Improve reset error messages
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (17 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 18/21] ata: libata-sata: Improve ata_sas_slave_configure() Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 20/21] ata: libata-eh: Reduce "disable device" message verbosity Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 21/21] ata: libata: Cleanup inline DMA helper functions Damien Le Moal
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

Some drives are really slow to spinup on resume, resulting is a very
slow response to COMRESET and to error messages such as:

ata1: COMRESET failed (errno=-16)
ata1: link is slow to respond, please be patient (ready=0)
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: configured for UDMA/133

Given that the slowness of the response is indicated with the message
"link is slow to respond..." and that resets are retried until the
device is detected as online after up to 1min (ata_eh_reset_timeouts),
there is no point in printing the "COMRESET failed" error message. Let's
not scare the user with non fatal errors and only warn about reset
failures in ata_eh_reset() when all reset retries have been exhausted.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-eh.c   | 2 ++
 drivers/ata/libata-sata.c | 1 -
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 03c45630a35b..2b933e7a357a 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2919,6 +2919,8 @@ int ata_eh_reset(struct ata_link *link, int classify,
 		 */
 		if (ata_is_host_link(link))
 			ata_eh_thaw_port(ap);
+		ata_link_warn(link, "%s failed\n",
+			      reset == hardreset ? "hardreset" : "softreset");
 		goto out;
 	}
 
diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
index 0748e9ea4f5f..00674aae1696 100644
--- a/drivers/ata/libata-sata.c
+++ b/drivers/ata/libata-sata.c
@@ -608,7 +608,6 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
 		/* online is set iff link is online && reset succeeded */
 		if (online)
 			*online = false;
-		ata_link_err(link, "COMRESET failed (errno=%d)\n", rc);
 	}
 	return rc;
 }
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 20/21] ata: libata-eh: Reduce "disable device" message verbosity
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (18 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 19/21] ata: libata-eh: Improve reset error messages Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  2023-09-12  0:56 ` [PATCH v2 21/21] ata: libata: Cleanup inline DMA helper functions Damien Le Moal
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

There is no point in warning about a device being disabled when we
expect it to be, that is, on suspend, shutdown or when detaching the
device.

Suppress the message "disable device" for these cases by introducing the
EH static function ata_eh_dev_disable() and by using it in
ata_eh_unload() and ata_eh_detach_dev(). ata_dev_disable() code is
modified to call this new function after printing the "disable device"
message.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-eh.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 2b933e7a357a..9cad16bcbebc 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -494,6 +494,18 @@ void ata_eh_release(struct ata_port *ap)
 	mutex_unlock(&ap->host->eh_mutex);
 }
 
+static void ata_eh_dev_disable(struct ata_device *dev)
+{
+	ata_acpi_on_disable(dev);
+	ata_down_xfermask_limit(dev, ATA_DNXFER_FORCE_PIO0 | ATA_DNXFER_QUIET);
+	dev->class++;
+
+	/* From now till the next successful probe, ering is used to
+	 * track probe failures.  Clear accumulated device error info.
+	 */
+	ata_ering_clear(&dev->ering);
+}
+
 static void ata_eh_unload(struct ata_port *ap)
 {
 	struct ata_link *link;
@@ -517,8 +529,8 @@ static void ata_eh_unload(struct ata_port *ap)
 	 */
 	ata_for_each_link(link, ap, PMP_FIRST) {
 		sata_scr_write(link, SCR_CONTROL, link->saved_scontrol & 0xff0);
-		ata_for_each_dev(dev, link, ALL)
-			ata_dev_disable(dev);
+		ata_for_each_dev(dev, link, ENABLED)
+			ata_eh_dev_disable(dev);
 	}
 
 	/* freeze and set UNLOADED */
@@ -1211,14 +1223,8 @@ void ata_dev_disable(struct ata_device *dev)
 		return;
 
 	ata_dev_warn(dev, "disable device\n");
-	ata_acpi_on_disable(dev);
-	ata_down_xfermask_limit(dev, ATA_DNXFER_FORCE_PIO0 | ATA_DNXFER_QUIET);
-	dev->class++;
 
-	/* From now till the next successful probe, ering is used to
-	 * track probe failures.  Clear accumulated device error info.
-	 */
-	ata_ering_clear(&dev->ering);
+	ata_eh_dev_disable(dev);
 }
 EXPORT_SYMBOL_GPL(ata_dev_disable);
 
@@ -1240,12 +1246,12 @@ void ata_eh_detach_dev(struct ata_device *dev)
 
 	/*
 	 * If the device is still enabled, transition it to standby power mode
-	 * (i.e. spin down HDDs).
+	 * (i.e. spin down HDDs) and disable it.
 	 */
-	if (ata_dev_enabled(dev))
+	if (ata_dev_enabled(dev)) {
 		ata_dev_power_set_standby(dev);
-
-	ata_dev_disable(dev);
+		ata_eh_dev_disable(dev);
+	}
 
 	spin_lock_irqsave(ap->lock, flags);
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 21/21] ata: libata: Cleanup inline DMA helper functions
  2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
                   ` (19 preceding siblings ...)
  2023-09-12  0:56 ` [PATCH v2 20/21] ata: libata-eh: Reduce "disable device" message verbosity Damien Le Moal
@ 2023-09-12  0:56 ` Damien Le Moal
  20 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  0:56 UTC (permalink / raw)
  To: linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

Simplify the inline DMA helper functions ata_using_mwdma(),
ata_using_udma() and ata_dma_enabled() to directly return as a boolean
the result of their test condition.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 include/linux/libata.h | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/include/linux/libata.h b/include/linux/libata.h
index 6593c79b7290..f48fe27dae94 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -1878,23 +1878,21 @@ static inline unsigned long ata_deadline(unsigned long from_jiffies,
    change in future hardware and specs, secondly 0xFF means 'no DMA' but is
    > UDMA_0. Dyma ddreigiau */
 
-static inline int ata_using_mwdma(struct ata_device *adev)
+static inline bool ata_using_mwdma(struct ata_device *adev)
 {
-	if (adev->dma_mode >= XFER_MW_DMA_0 && adev->dma_mode <= XFER_MW_DMA_4)
-		return 1;
-	return 0;
+	return adev->dma_mode >= XFER_MW_DMA_0 &&
+		adev->dma_mode <= XFER_MW_DMA_4;
 }
 
-static inline int ata_using_udma(struct ata_device *adev)
+static inline bool ata_using_udma(struct ata_device *adev)
 {
-	if (adev->dma_mode >= XFER_UDMA_0 && adev->dma_mode <= XFER_UDMA_7)
-		return 1;
-	return 0;
+	return adev->dma_mode >= XFER_UDMA_0 &&
+		adev->dma_mode <= XFER_UDMA_7;
 }
 
-static inline int ata_dma_enabled(struct ata_device *adev)
+static inline bool ata_dma_enabled(struct ata_device *adev)
 {
-	return (adev->dma_mode == 0xFF ? 0 : 1);
+	return adev->dma_mode != 0xFF;
 }
 
 /**************************************************************************
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 08/21] ata: libata-core: Fix compilation warning in ata_dev_config_ncq()
  2023-09-12  0:56 ` [PATCH v2 08/21] ata: libata-core: Fix compilation warning in ata_dev_config_ncq() Damien Le Moal
@ 2023-09-12  6:14   ` Hannes Reinecke
  0 siblings, 0 replies; 33+ messages in thread
From: Hannes Reinecke @ 2023-09-12  6:14 UTC (permalink / raw)
  To: Damien Le Moal, linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

On 9/12/23 02:56, Damien Le Moal wrote:
> The 24 bytes length allocated to the ncq_desc string in
> ata_dev_config_lba() for ata_dev_config_ncq() to use is too short,
> causing the following gcc compilation warnings when compiling with W=1:
> 
> drivers/ata/libata-core.c: In function ‘ata_dev_configure’:
> drivers/ata/libata-core.c:2378:56: warning: ‘%d’ directive output may be truncated writing between 1 and 2 bytes into a region of size between 1 and 11 [-Wformat-truncation=]
>   2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
>        |                                                        ^~
> In function ‘ata_dev_config_ncq’,
>      inlined from ‘ata_dev_config_lba’ at drivers/ata/libata-core.c:2649:8,
>      inlined from ‘ata_dev_configure’ at drivers/ata/libata-core.c:2952:9:
> drivers/ata/libata-core.c:2378:41: note: directive argument in the range [1, 32]
>   2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
>        |                                         ^~~~~~~~~~~~~~~~~~~~~
> drivers/ata/libata-core.c:2378:17: note: ‘snprintf’ output between 16 and 31 bytes into a destination of size 24
>   2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
>        |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   2379 |                         ddepth, aa_desc);
>        |                         ~~~~~~~~~~~~~~~~
> 
> Avoid these warnings and the potential truncation by changing the size
> of the ncq_desc string to 32 characters.
> 
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
>   drivers/ata/libata-core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 18b2a0da9e54..2405ac8b53f0 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -2619,7 +2619,7 @@ static int ata_dev_config_lba(struct ata_device *dev)
>   {
>   	const u16 *id = dev->id;
>   	const char *lba_desc;
> -	char ncq_desc[24];
> +	char ncq_desc[32];
>   	int ret;
>   
>   	dev->flags |= ATA_DFLAG_LBA;
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 09/21] ata: libata-eh: Fix compilation warning in ata_eh_link_report()
  2023-09-12  0:56 ` [PATCH v2 09/21] ata: libata-eh: Fix compilation warning in ata_eh_link_report() Damien Le Moal
@ 2023-09-12  6:14   ` Hannes Reinecke
  0 siblings, 0 replies; 33+ messages in thread
From: Hannes Reinecke @ 2023-09-12  6:14 UTC (permalink / raw)
  To: Damien Le Moal, linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

On 9/12/23 02:56, Damien Le Moal wrote:
> The 6 bytes length of the tries_buf string in ata_eh_link_report() is
> too short and results in a gcc compilation warning with W-!:
> 
> drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’:
> drivers/ata/libata-eh.c:2371:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-truncation=]
>   2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
>        |                                                           ^~
> drivers/ata/libata-eh.c:2371:56: note: directive argument in the range [-2147483648, 4]
>   2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
>        |                                                        ^~~~~~
> drivers/ata/libata-eh.c:2371:17: note: ‘snprintf’ output between 4 and 14 bytes into a destination of size 6
>   2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
>        |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   2372 |                          ap->eh_tries);
>        |                          ~~~~~~~~~~~~~
> 
> Avoid this warning by increasing the string size to 16B.
> 
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
>   drivers/ata/libata-eh.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index 43b0a1509548..03c45630a35b 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -2332,7 +2332,7 @@ static void ata_eh_link_report(struct ata_link *link)
>   	struct ata_eh_context *ehc = &link->eh_context;
>   	struct ata_queued_cmd *qc;
>   	const char *frozen, *desc;
> -	char tries_buf[6] = "";
> +	char tries_buf[16] = "";
>   	int tag, nr_failed = 0;
>   
>   	if (ehc->i.flags & ATA_EHI_QUIET)

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 18/21] ata: libata-sata: Improve ata_sas_slave_configure()
  2023-09-12  0:56 ` [PATCH v2 18/21] ata: libata-sata: Improve ata_sas_slave_configure() Damien Le Moal
@ 2023-09-12  7:43   ` John Garry
  2023-09-12  7:52     ` Damien Le Moal
  0 siblings, 1 reply; 33+ messages in thread
From: John Garry @ 2023-09-12  7:43 UTC (permalink / raw)
  To: Damien Le Moal, linux-ide
  Cc: linux-scsi, Martin K . Petersen, Rodrigo Vivi, Paul Ausbeck,
	Kai-Heng Feng, Joe Breuer

On 12/09/2023 01:56, Damien Le Moal wrote:
> Change ata_sas_slave_configure() to return the return value of
> ata_scsi_dev_config() to ensure that any error from that function is
> propagated to libsas.

This seems reasonable, but does libsas even check the return code? From 
a glance, I don't think that it does...

> 
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.de>

Reviewed-by: John Garry <john.g.garry@oracle.com>

> ---
>   drivers/ata/libata-sata.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
> index 5d31c08be013..0748e9ea4f5f 100644
> --- a/drivers/ata/libata-sata.c
> +++ b/drivers/ata/libata-sata.c
> @@ -1169,8 +1169,8 @@ EXPORT_SYMBOL_GPL(ata_sas_tport_delete);
>   int ata_sas_slave_configure(struct scsi_device *sdev, struct ata_port *ap)
>   {
>   	ata_scsi_sdev_config(sdev);
> -	ata_scsi_dev_config(sdev, ap->link.device);
> -	return 0;
> +
> +	return ata_scsi_dev_config(sdev, ap->link.device);
>   }
>   EXPORT_SYMBOL_GPL(ata_sas_slave_configure);
>   


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 18/21] ata: libata-sata: Improve ata_sas_slave_configure()
  2023-09-12  7:43   ` John Garry
@ 2023-09-12  7:52     ` Damien Le Moal
  0 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-12  7:52 UTC (permalink / raw)
  To: John Garry, linux-ide
  Cc: linux-scsi, Martin K . Petersen, Rodrigo Vivi, Paul Ausbeck,
	Kai-Heng Feng, Joe Breuer

On 9/12/23 16:43, John Garry wrote:
> On 12/09/2023 01:56, Damien Le Moal wrote:
>> Change ata_sas_slave_configure() to return the return value of
>> ata_scsi_dev_config() to ensure that any error from that function is
>> propagated to libsas.
> 
> This seems reasonable, but does libsas even check the return code? From 
> a glance, I don't think that it does...

Indeed it does not. This functions still always return 0 at present, so not a
big deal. But for consistency , I will add the check in libsas.

> 
>>
>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>> Reviewed-by: Hannes Reinecke <hare@suse.de>
> 
> Reviewed-by: John Garry <john.g.garry@oracle.com>
> 
>> ---
>>   drivers/ata/libata-sata.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
>> index 5d31c08be013..0748e9ea4f5f 100644
>> --- a/drivers/ata/libata-sata.c
>> +++ b/drivers/ata/libata-sata.c
>> @@ -1169,8 +1169,8 @@ EXPORT_SYMBOL_GPL(ata_sas_tport_delete);
>>   int ata_sas_slave_configure(struct scsi_device *sdev, struct ata_port *ap)
>>   {
>>   	ata_scsi_sdev_config(sdev);
>> -	ata_scsi_dev_config(sdev, ap->link.device);
>> -	return 0;
>> +
>> +	return ata_scsi_dev_config(sdev, ap->link.device);
>>   }
>>   EXPORT_SYMBOL_GPL(ata_sas_slave_configure);
>>   
> 

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device
  2023-09-12  0:56 ` [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device Damien Le Moal
@ 2023-09-13 10:25   ` Geert Uytterhoeven
  2023-09-14  7:08     ` Geert Uytterhoeven
  0 siblings, 1 reply; 33+ messages in thread
From: Geert Uytterhoeven @ 2023-09-13 10:25 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-ide, linux-scsi, Martin K . Petersen, John Garry,
	Rodrigo Vivi, Paul Ausbeck, Kai-Heng Feng, Joe Breuer,
	linux-renesas-soc

 	Hi Damien,

On Tue, 12 Sep 2023, Damien Le Moal wrote:
> There is no direct device ancestry defined between an ata_device and
> its scsi device which prevents the power management code from correctly
> ordering suspend and resume operations. Create such ancestry with the
> ata device as the parent to ensure that the scsi device (child) is
> suspended before the ata device and that resume handles the ata device
> before the scsi device.
>
> The parent-child (supplier-consumer) relationship is established between
> the ata_port (parent) and the scsi device (child) with the function
> device_add_link(). The parent used is not the ata_device as the PM
> operations are defined per port and the status of all devices connected
> through that port is controlled from the port operations.
>
> The device link is established with the new function
> ata_scsi_dev_alloc(). This function is used to define the ->slave_alloc
> callback of the scsi host template of most drivers.
>
> Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
> Cc: stable@vger.kernel.org
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.de>

Thanks for your patch, which is now commit 99626085d036ec32 ("ata:
libata-scsi: link ata port and scsi device") in libata/for-next.

This patch causes /dev/sda to disappear on Renesas Salvator-XS with
R-Car H3 ES2.0.  Changes to dmesg before/after:

      sata_rcar ee300000.sata: ignoring dependency for device, assuming no driver
      scsi host0: sata_rcar
     -ata1: SATA max UDMA/133 irq 184 lpm-pol 0
     +ata1: SATA max UDMA/133 irq 179 lpm-pol 0
      ata1: link resume succeeded after 1 retries
      ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      ata1.00: ATA-7: Maxtor 6L160M0, BANC1G10, max UDMA/133
      ata1.00: 320173056 sectors, multi 0: LBA48 NCQ (not used)
      ata1.00: configured for UDMA/133
      scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6L160M0   1G10 PQ: 0 ANSI: 5
     -sd 0:0:0:0: [sda] 320173056 512-byte logical blocks: (164 GB/153 GiB)
     -sd 0:0:0:0: [sda] Write Protect is off
     -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
     -sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
     - sda: sda1
     -sd 0:0:0:0: [sda] Attached SCSI disk

Gr{oetje,eeting}s,

 						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
 							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device
  2023-09-13 10:25   ` Geert Uytterhoeven
@ 2023-09-14  7:08     ` Geert Uytterhoeven
  2023-09-14  7:18       ` Damien Le Moal
  2023-09-14 13:18       ` Damien Le Moal
  0 siblings, 2 replies; 33+ messages in thread
From: Geert Uytterhoeven @ 2023-09-14  7:08 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-ide, linux-scsi, Martin K . Petersen, John Garry,
	Rodrigo Vivi, Paul Ausbeck, Kai-Heng Feng, Joe Breuer,
	linux-renesas-soc

Hi Damien,

On Wed, Sep 13, 2023 at 12:27 PM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> On Tue, 12 Sep 2023, Damien Le Moal wrote:
> > There is no direct device ancestry defined between an ata_device and
> > its scsi device which prevents the power management code from correctly
> > ordering suspend and resume operations. Create such ancestry with the
> > ata device as the parent to ensure that the scsi device (child) is
> > suspended before the ata device and that resume handles the ata device
> > before the scsi device.
> >
> > The parent-child (supplier-consumer) relationship is established between
> > the ata_port (parent) and the scsi device (child) with the function
> > device_add_link(). The parent used is not the ata_device as the PM
> > operations are defined per port and the status of all devices connected
> > through that port is controlled from the port operations.
> >
> > The device link is established with the new function
> > ata_scsi_dev_alloc(). This function is used to define the ->slave_alloc
> > callback of the scsi host template of most drivers.
> >
> > Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> > Reviewed-by: Hannes Reinecke <hare@suse.de>
>
> Thanks for your patch, which is now commit 99626085d036ec32 ("ata:
> libata-scsi: link ata port and scsi device") in libata/for-next.
>
> This patch causes /dev/sda to disappear on Renesas Salvator-XS with
> R-Car H3 ES2.0.  Changes to dmesg before/after:
>
>       sata_rcar ee300000.sata: ignoring dependency for device, assuming no driver
>       scsi host0: sata_rcar
>      -ata1: SATA max UDMA/133 irq 184 lpm-pol 0
>      +ata1: SATA max UDMA/133 irq 179 lpm-pol 0
>       ata1: link resume succeeded after 1 retries
>       ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>       ata1.00: ATA-7: Maxtor 6L160M0, BANC1G10, max UDMA/133
>       ata1.00: 320173056 sectors, multi 0: LBA48 NCQ (not used)
>       ata1.00: configured for UDMA/133
>       scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6L160M0   1G10 PQ: 0 ANSI: 5
>      -sd 0:0:0:0: [sda] 320173056 512-byte logical blocks: (164 GB/153 GiB)
>      -sd 0:0:0:0: [sda] Write Protect is off
>      -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>      -sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>      -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
>      - sda: sda1
>      -sd 0:0:0:0: [sda] Attached SCSI disk

I see the same issue on SH/Landisk, which has CompactFLASH:

    -ata1: PATA max PIO0 ioport cmd 0xc0023040 ctl 0xc002302c irq 26
    +ata1: PATA max PIO0 ioport cmd 0xc0023040 ctl 0xc002302c irq 26 lpm-pol 0
     ata1.00: CFA: TS8GCF133, 20171204, max UDMA/100
     ata1.00: 15662304 sectors, multi 0: LBA48
     ata1.00: configured for PIO
     scsi 0:0:0:0: Direct-Access     ATA      TS8GCF133        1204
PQ: 0 ANSI: 5
    -sd 0:0:0:0: [sda] 15662304 512-byte logical blocks: (8.02 GB/7.47 GiB)
    -sd 0:0:0:0: [sda] Write Protect is off
    -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
    -sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled,
doesn't support DPO or FUA
    -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
    - sda: sda1 sda2 sda3
    -sd 0:0:0:0: [sda] Attached SCSI removable disk

and m68k/ARAnyM:

     atari-falcon-ide atari-falcon-ide: Atari Falcon and Q40/Q60 PATA controller
     scsi host0: pata_falcon
     ata1: PATA max PIO4 cmd fff00000 ctl fff00038 data fff00000 no
IRQ, using PIO polling
     ata1.00: ATA-2: Sarge m68k, , max PIO2
     ata1.00: 2118816 sectors, multi 0: LBA
     ata1.00: configured for PIO
     scsi 0:0:0:0: Direct-Access     ATA      Sarge m68k       n/a
PQ: 0 ANSI: 5
    -sd 0:0:0:0: [sda] 2118816 512-byte logical blocks: (1.08 GB/1.01 GiB)
    -sd 0:0:0:0: [sda] Write Protect is off
    -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
    -sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled,
doesn't support DPO or FUA
    -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
    - sda: AHDI sda1 sda2
    -sd 0:0:0:0: [sda] Attached SCSI disk

Reverting 99626085d036ec32 fixes the issue.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device
  2023-09-14  7:08     ` Geert Uytterhoeven
@ 2023-09-14  7:18       ` Damien Le Moal
  2023-09-14 13:18       ` Damien Le Moal
  1 sibling, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-14  7:18 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: linux-ide, linux-scsi, Martin K . Petersen, John Garry,
	Rodrigo Vivi, Paul Ausbeck, Kai-Heng Feng, Joe Breuer,
	linux-renesas-soc

On 9/14/23 16:08, Geert Uytterhoeven wrote:
> Hi Damien,
> 
> On Wed, Sep 13, 2023 at 12:27 PM Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
>> On Tue, 12 Sep 2023, Damien Le Moal wrote:
>>> There is no direct device ancestry defined between an ata_device and
>>> its scsi device which prevents the power management code from correctly
>>> ordering suspend and resume operations. Create such ancestry with the
>>> ata device as the parent to ensure that the scsi device (child) is
>>> suspended before the ata device and that resume handles the ata device
>>> before the scsi device.
>>>
>>> The parent-child (supplier-consumer) relationship is established between
>>> the ata_port (parent) and the scsi device (child) with the function
>>> device_add_link(). The parent used is not the ata_device as the PM
>>> operations are defined per port and the status of all devices connected
>>> through that port is controlled from the port operations.
>>>
>>> The device link is established with the new function
>>> ata_scsi_dev_alloc(). This function is used to define the ->slave_alloc
>>> callback of the scsi host template of most drivers.
>>>
>>> Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>>
>> Thanks for your patch, which is now commit 99626085d036ec32 ("ata:
>> libata-scsi: link ata port and scsi device") in libata/for-next.
>>
>> This patch causes /dev/sda to disappear on Renesas Salvator-XS with
>> R-Car H3 ES2.0.  Changes to dmesg before/after:
>>
>>       sata_rcar ee300000.sata: ignoring dependency for device, assuming no driver
>>       scsi host0: sata_rcar
>>      -ata1: SATA max UDMA/133 irq 184 lpm-pol 0
>>      +ata1: SATA max UDMA/133 irq 179 lpm-pol 0
>>       ata1: link resume succeeded after 1 retries
>>       ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>       ata1.00: ATA-7: Maxtor 6L160M0, BANC1G10, max UDMA/133
>>       ata1.00: 320173056 sectors, multi 0: LBA48 NCQ (not used)
>>       ata1.00: configured for UDMA/133
>>       scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6L160M0   1G10 PQ: 0 ANSI: 5
>>      -sd 0:0:0:0: [sda] 320173056 512-byte logical blocks: (164 GB/153 GiB)
>>      -sd 0:0:0:0: [sda] Write Protect is off
>>      -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>>      -sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>      -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
>>      - sda: sda1
>>      -sd 0:0:0:0: [sda] Attached SCSI disk
> 
> I see the same issue on SH/Landisk, which has CompactFLASH:
> 
>     -ata1: PATA max PIO0 ioport cmd 0xc0023040 ctl 0xc002302c irq 26
>     +ata1: PATA max PIO0 ioport cmd 0xc0023040 ctl 0xc002302c irq 26 lpm-pol 0
>      ata1.00: CFA: TS8GCF133, 20171204, max UDMA/100
>      ata1.00: 15662304 sectors, multi 0: LBA48
>      ata1.00: configured for PIO
>      scsi 0:0:0:0: Direct-Access     ATA      TS8GCF133        1204
> PQ: 0 ANSI: 5
>     -sd 0:0:0:0: [sda] 15662304 512-byte logical blocks: (8.02 GB/7.47 GiB)
>     -sd 0:0:0:0: [sda] Write Protect is off
>     -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>     -sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled,
> doesn't support DPO or FUA
>     -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
>     - sda: sda1 sda2 sda3
>     -sd 0:0:0:0: [sda] Attached SCSI removable disk
> 
> and m68k/ARAnyM:
> 
>      atari-falcon-ide atari-falcon-ide: Atari Falcon and Q40/Q60 PATA controller
>      scsi host0: pata_falcon
>      ata1: PATA max PIO4 cmd fff00000 ctl fff00038 data fff00000 no
> IRQ, using PIO polling
>      ata1.00: ATA-2: Sarge m68k, , max PIO2
>      ata1.00: 2118816 sectors, multi 0: LBA
>      ata1.00: configured for PIO
>      scsi 0:0:0:0: Direct-Access     ATA      Sarge m68k       n/a
> PQ: 0 ANSI: 5
>     -sd 0:0:0:0: [sda] 2118816 512-byte logical blocks: (1.08 GB/1.01 GiB)
>     -sd 0:0:0:0: [sda] Write Protect is off
>     -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>     -sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled,
> doesn't support DPO or FUA
>     -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
>     - sda: AHDI sda1 sda2
>     -sd 0:0:0:0: [sda] Attached SCSI disk
> 
> Reverting 99626085d036ec32 fixes the issue.

Yes. I can confirm this. I can recreate the issue, because I have a major
screw-up with that patch: the device_link_add() done in the new ->slave_alloc
operation for the scsi_host_template for the driver was in fact NOT set for
AHCI. So all my testing confirming that suspend/resume is OK was done *without*
that device link being created... That is embarrassing :)

The sata_rcar driver does get that slave_alloc method set through ATA_BASE_SHT()
-> ATA_SUBBASE_SHT(), and you get the problem. If I fix AHCI_SHT() macro to set
slave_alloc, I do get the issue as well: no scsi disk device is created after
the port scan. No clue why.

And I also now have no clue how suddenly the suspend/resume operations get
magically ordered correctly... I could recreate various issues before the
patches in for-6.7.

So going back to the beginning to sort things out. I probably "inadvertently"
fixed the issues with another change that has implications I am not seeing.

Will get back to you ASAP.

> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device
  2023-09-14  7:08     ` Geert Uytterhoeven
  2023-09-14  7:18       ` Damien Le Moal
@ 2023-09-14 13:18       ` Damien Le Moal
  2023-09-14 13:43         ` Geert Uytterhoeven
  1 sibling, 1 reply; 33+ messages in thread
From: Damien Le Moal @ 2023-09-14 13:18 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: linux-ide, linux-scsi, Martin K . Petersen, John Garry,
	Rodrigo Vivi, Paul Ausbeck, Kai-Heng Feng, Joe Breuer,
	linux-renesas-soc

On 9/14/23 16:08, Geert Uytterhoeven wrote:
> Hi Damien,
> 
> On Wed, Sep 13, 2023 at 12:27 PM Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
>> On Tue, 12 Sep 2023, Damien Le Moal wrote:
>>> There is no direct device ancestry defined between an ata_device and
>>> its scsi device which prevents the power management code from correctly
>>> ordering suspend and resume operations. Create such ancestry with the
>>> ata device as the parent to ensure that the scsi device (child) is
>>> suspended before the ata device and that resume handles the ata device
>>> before the scsi device.
>>>
>>> The parent-child (supplier-consumer) relationship is established between
>>> the ata_port (parent) and the scsi device (child) with the function
>>> device_add_link(). The parent used is not the ata_device as the PM
>>> operations are defined per port and the status of all devices connected
>>> through that port is controlled from the port operations.
>>>
>>> The device link is established with the new function
>>> ata_scsi_dev_alloc(). This function is used to define the ->slave_alloc
>>> callback of the scsi host template of most drivers.
>>>
>>> Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>>
>> Thanks for your patch, which is now commit 99626085d036ec32 ("ata:
>> libata-scsi: link ata port and scsi device") in libata/for-next.
>>
>> This patch causes /dev/sda to disappear on Renesas Salvator-XS with
>> R-Car H3 ES2.0.  Changes to dmesg before/after:
>>
>>       sata_rcar ee300000.sata: ignoring dependency for device, assuming no driver
>>       scsi host0: sata_rcar
>>      -ata1: SATA max UDMA/133 irq 184 lpm-pol 0
>>      +ata1: SATA max UDMA/133 irq 179 lpm-pol 0
>>       ata1: link resume succeeded after 1 retries
>>       ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>       ata1.00: ATA-7: Maxtor 6L160M0, BANC1G10, max UDMA/133
>>       ata1.00: 320173056 sectors, multi 0: LBA48 NCQ (not used)
>>       ata1.00: configured for UDMA/133
>>       scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6L160M0   1G10 PQ: 0 ANSI: 5
>>      -sd 0:0:0:0: [sda] 320173056 512-byte logical blocks: (164 GB/153 GiB)
>>      -sd 0:0:0:0: [sda] Write Protect is off
>>      -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>>      -sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>      -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
>>      - sda: sda1
>>      -sd 0:0:0:0: [sda] Attached SCSI disk
> 
> I see the same issue on SH/Landisk, which has CompactFLASH:
> 
>     -ata1: PATA max PIO0 ioport cmd 0xc0023040 ctl 0xc002302c irq 26
>     +ata1: PATA max PIO0 ioport cmd 0xc0023040 ctl 0xc002302c irq 26 lpm-pol 0
>      ata1.00: CFA: TS8GCF133, 20171204, max UDMA/100
>      ata1.00: 15662304 sectors, multi 0: LBA48
>      ata1.00: configured for PIO
>      scsi 0:0:0:0: Direct-Access     ATA      TS8GCF133        1204
> PQ: 0 ANSI: 5
>     -sd 0:0:0:0: [sda] 15662304 512-byte logical blocks: (8.02 GB/7.47 GiB)
>     -sd 0:0:0:0: [sda] Write Protect is off
>     -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>     -sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled,
> doesn't support DPO or FUA
>     -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
>     - sda: sda1 sda2 sda3
>     -sd 0:0:0:0: [sda] Attached SCSI removable disk
> 
> and m68k/ARAnyM:
> 
>      atari-falcon-ide atari-falcon-ide: Atari Falcon and Q40/Q60 PATA controller
>      scsi host0: pata_falcon
>      ata1: PATA max PIO4 cmd fff00000 ctl fff00038 data fff00000 no
> IRQ, using PIO polling
>      ata1.00: ATA-2: Sarge m68k, , max PIO2
>      ata1.00: 2118816 sectors, multi 0: LBA
>      ata1.00: configured for PIO
>      scsi 0:0:0:0: Direct-Access     ATA      Sarge m68k       n/a
> PQ: 0 ANSI: 5
>     -sd 0:0:0:0: [sda] 2118816 512-byte logical blocks: (1.08 GB/1.01 GiB)
>     -sd 0:0:0:0: [sda] Write Protect is off
>     -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>     -sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled,
> doesn't support DPO or FUA
>     -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
>     - sda: AHDI sda1 sda2
>     -sd 0:0:0:0: [sda] Attached SCSI disk
> 
> Reverting 99626085d036ec32 fixes the issue.

Without reverting, can you try this incremental update ?

diff --git a/drivers/ata/ahci.h b/drivers/ata/ahci.h
index 4bae95b06ae3..72085756f4ba 100644
--- a/drivers/ata/ahci.h
+++ b/drivers/ata/ahci.h
@@ -398,6 +398,7 @@ extern const struct attribute_group *ahci_sdev_groups[];
        .sdev_groups            = ahci_sdev_groups,                     \
        .change_queue_depth     = ata_scsi_change_queue_depth,          \
        .tag_alloc_policy       = BLK_TAG_ALLOC_RR,                     \
+       .slave_alloc            = ata_scsi_slave_alloc,                 \
        .slave_configure        = ata_scsi_slave_config
 
 extern struct ata_port_operations ahci_ops;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 7aa70af1fc07..5a0513452150 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1093,6 +1093,7 @@ int ata_scsi_dev_alloc(struct scsi_device *sdev, struct ata_port *ap)
         * consumer (child) and the ata port the supplier (parent).
         */
        link = device_link_add(&sdev->sdev_gendev, &ap->tdev,
+                              DL_FLAG_STATELESS |
                               DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
        if (!link) {
                ata_port_err(ap, "Failed to create link to scsi device %s\n",
@@ -1164,6 +1165,8 @@ void ata_scsi_slave_destroy(struct scsi_device *sdev)
        unsigned long flags;
        struct ata_device *dev;
 
+       device_link_remove(&sdev->sdev_gendev, &ap->tdev);
+
        spin_lock_irqsave(ap->lock, flags);
        dev = __ata_scsi_find_dev(ap, sdev);
        if (dev && dev->sdev) {

This solves the issue for me. If you confirm it works for you, I will squash
this into 99626085d036ec32.

Thanks !

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device
  2023-09-14 13:18       ` Damien Le Moal
@ 2023-09-14 13:43         ` Geert Uytterhoeven
  0 siblings, 0 replies; 33+ messages in thread
From: Geert Uytterhoeven @ 2023-09-14 13:43 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-ide, linux-scsi, Martin K . Petersen, John Garry,
	Rodrigo Vivi, Paul Ausbeck, Kai-Heng Feng, Joe Breuer,
	linux-renesas-soc

Hi Damien,

On Thu, Sep 14, 2023 at 3:18 PM Damien Le Moal <dlemoal@kernel.org> wrote:
> On 9/14/23 16:08, Geert Uytterhoeven wrote:
> > On Wed, Sep 13, 2023 at 12:27 PM Geert Uytterhoeven
> > <geert@linux-m68k.org> wrote:
> >> On Tue, 12 Sep 2023, Damien Le Moal wrote:
> >>> There is no direct device ancestry defined between an ata_device and
> >>> its scsi device which prevents the power management code from correctly
> >>> ordering suspend and resume operations. Create such ancestry with the
> >>> ata device as the parent to ensure that the scsi device (child) is
> >>> suspended before the ata device and that resume handles the ata device
> >>> before the scsi device.
> >>>
> >>> The parent-child (supplier-consumer) relationship is established between
> >>> the ata_port (parent) and the scsi device (child) with the function
> >>> device_add_link(). The parent used is not the ata_device as the PM
> >>> operations are defined per port and the status of all devices connected
> >>> through that port is controlled from the port operations.
> >>>
> >>> The device link is established with the new function
> >>> ata_scsi_dev_alloc(). This function is used to define the ->slave_alloc
> >>> callback of the scsi host template of most drivers.
> >>>
> >>> Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
> >>> Cc: stable@vger.kernel.org
> >>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> >>> Reviewed-by: Hannes Reinecke <hare@suse.de>
> >>
> >> Thanks for your patch, which is now commit 99626085d036ec32 ("ata:
> >> libata-scsi: link ata port and scsi device") in libata/for-next.
> >>
> >> This patch causes /dev/sda to disappear on Renesas Salvator-XS with
> >> R-Car H3 ES2.0.  Changes to dmesg before/after:
> >>
> >>       sata_rcar ee300000.sata: ignoring dependency for device, assuming no driver
> >>       scsi host0: sata_rcar
> >>      -ata1: SATA max UDMA/133 irq 184 lpm-pol 0
> >>      +ata1: SATA max UDMA/133 irq 179 lpm-pol 0
> >>       ata1: link resume succeeded after 1 retries
> >>       ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> >>       ata1.00: ATA-7: Maxtor 6L160M0, BANC1G10, max UDMA/133
> >>       ata1.00: 320173056 sectors, multi 0: LBA48 NCQ (not used)
> >>       ata1.00: configured for UDMA/133
> >>       scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6L160M0   1G10 PQ: 0 ANSI: 5
> >>      -sd 0:0:0:0: [sda] 320173056 512-byte logical blocks: (164 GB/153 GiB)
> >>      -sd 0:0:0:0: [sda] Write Protect is off
> >>      -sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> >>      -sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> >>      -sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
> >>      - sda: sda1
> >>      -sd 0:0:0:0: [sda] Attached SCSI disk
> >
> > I see the same issue on SH/Landisk, which has CompactFLASH:
> > and m68k/ARAnyM:

> > Reverting 99626085d036ec32 fixes the issue.
>
> Without reverting, can you try this incremental update ?
>
> diff --git a/drivers/ata/ahci.h b/drivers/ata/ahci.h
> index 4bae95b06ae3..72085756f4ba 100644
> --- a/drivers/ata/ahci.h
> +++ b/drivers/ata/ahci.h
> @@ -398,6 +398,7 @@ extern const struct attribute_group *ahci_sdev_groups[];
>         .sdev_groups            = ahci_sdev_groups,                     \
>         .change_queue_depth     = ata_scsi_change_queue_depth,          \
>         .tag_alloc_policy       = BLK_TAG_ALLOC_RR,                     \
> +       .slave_alloc            = ata_scsi_slave_alloc,                 \
>         .slave_configure        = ata_scsi_slave_config
>
>  extern struct ata_port_operations ahci_ops;
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 7aa70af1fc07..5a0513452150 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -1093,6 +1093,7 @@ int ata_scsi_dev_alloc(struct scsi_device *sdev, struct ata_port *ap)
>          * consumer (child) and the ata port the supplier (parent).
>          */
>         link = device_link_add(&sdev->sdev_gendev, &ap->tdev,
> +                              DL_FLAG_STATELESS |
>                                DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
>         if (!link) {
>                 ata_port_err(ap, "Failed to create link to scsi device %s\n",
> @@ -1164,6 +1165,8 @@ void ata_scsi_slave_destroy(struct scsi_device *sdev)
>         unsigned long flags;
>         struct ata_device *dev;
>
> +       device_link_remove(&sdev->sdev_gendev, &ap->tdev);
> +
>         spin_lock_irqsave(ap->lock, flags);
>         dev = __ata_scsi_find_dev(ap, sdev);
>         if (dev && dev->sdev) {
>
> This solves the issue for me. If you confirm it works for you, I will squash
> this into 99626085d036ec32.

Thank you, /dev/sda is back into business on R-Car SATA, Landisk, and
ARAnyM.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 07/21] scsi: sd: Do not issue commands to suspended disks on remove
  2023-09-12  0:56 ` [PATCH v2 07/21] scsi: sd: Do not issue commands to suspended disks on remove Damien Le Moal
@ 2023-09-14 14:48   ` Bart Van Assche
  2023-09-14 22:06     ` Damien Le Moal
  0 siblings, 1 reply; 33+ messages in thread
From: Bart Van Assche @ 2023-09-14 14:48 UTC (permalink / raw)
  To: Damien Le Moal, linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

On 9/11/23 17:56, Damien Le Moal wrote:
> If an error occurs when resuming a host adapter before the devices
> attached to the adapter are resumed, the adapter low level driver may
> remove the scsi host, resulting in a call to sd_remove() for the
> disks of the host. However, since this function calls sd_shutdown(),
> a synchronize cache command and a start stop unit may be issued with the
> drive still sleeping and the HBA non-functional. This causes PM resume
> to hang, forcing a reset of the machine to recover.
> 
> Fix this by checking a device host state in sd_shutdown() and by
> returning early doing nothing if the host state is not SHOST_RUNNING.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/scsi/sd.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index c92a317ba547..a415abb721d3 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3763,7 +3763,8 @@ static void sd_shutdown(struct device *dev)
>   	if (!sdkp)
>   		return;         /* this can happen */
>   
> -	if (pm_runtime_suspended(dev))
> +	if (pm_runtime_suspended(dev) ||
> +	    sdkp->device->host->shost_state != SHOST_RUNNING)
>   		return;
>   
>   	if (sdkp->WCE && sdkp->media_present) {

The above seems wrong to me because no SYNCHRONIZE CACHE command will be
sent even if the device is still present, if the SCSI error handler is
active and if it will succeed at a later time. How about replacing the
above patch with something like the untested patch below?

Thanks,

Bart.


diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 4cd281368826..c0e069d9d58e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3689,12 +3689,14 @@ static int sd_probe(struct device *dev)
  static int sd_remove(struct device *dev)
  {
  	struct scsi_disk *sdkp = dev_get_drvdata(dev);
+	int rpm_get_res;

-	scsi_autopm_get_device(sdkp->device);
+	rpm_get_res = scsi_autopm_get_device(sdkp->device);

  	device_del(&sdkp->disk_dev);
  	del_gendisk(sdkp->disk);
-	sd_shutdown(dev);
+	if (rpm_get_res >= 0)
+		sd_shutdown(dev);

  	put_disk(sdkp->disk);
  	return 0;


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 07/21] scsi: sd: Do not issue commands to suspended disks on remove
  2023-09-14 14:48   ` Bart Van Assche
@ 2023-09-14 22:06     ` Damien Le Moal
  0 siblings, 0 replies; 33+ messages in thread
From: Damien Le Moal @ 2023-09-14 22:06 UTC (permalink / raw)
  To: Bart Van Assche, linux-ide
  Cc: linux-scsi, Martin K . Petersen, John Garry, Rodrigo Vivi,
	Paul Ausbeck, Kai-Heng Feng, Joe Breuer

On 9/14/23 23:48, Bart Van Assche wrote:
> On 9/11/23 17:56, Damien Le Moal wrote:
>> If an error occurs when resuming a host adapter before the devices
>> attached to the adapter are resumed, the adapter low level driver may
>> remove the scsi host, resulting in a call to sd_remove() for the
>> disks of the host. However, since this function calls sd_shutdown(),
>> a synchronize cache command and a start stop unit may be issued with the
>> drive still sleeping and the HBA non-functional. This causes PM resume
>> to hang, forcing a reset of the machine to recover.
>>
>> Fix this by checking a device host state in sd_shutdown() and by
>> returning early doing nothing if the host state is not SHOST_RUNNING.
>>
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/scsi/sd.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index c92a317ba547..a415abb721d3 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -3763,7 +3763,8 @@ static void sd_shutdown(struct device *dev)
>>   	if (!sdkp)
>>   		return;         /* this can happen */
>>   
>> -	if (pm_runtime_suspended(dev))
>> +	if (pm_runtime_suspended(dev) ||
>> +	    sdkp->device->host->shost_state != SHOST_RUNNING)
>>   		return;
>>   
>>   	if (sdkp->WCE && sdkp->media_present) {
> 
> The above seems wrong to me because no SYNCHRONIZE CACHE command will be
> sent even if the device is still present, if the SCSI error handler is
> active and if it will succeed at a later time. How about replacing the
> above patch with something like the untested patch below?
> 
> Thanks,
> 
> Bart.
> 
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 4cd281368826..c0e069d9d58e 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3689,12 +3689,14 @@ static int sd_probe(struct device *dev)
>   static int sd_remove(struct device *dev)
>   {
>   	struct scsi_disk *sdkp = dev_get_drvdata(dev);
> +	int rpm_get_res;
> 
> -	scsi_autopm_get_device(sdkp->device);
> +	rpm_get_res = scsi_autopm_get_device(sdkp->device);
> 
>   	device_del(&sdkp->disk_dev);
>   	del_gendisk(sdkp->disk);
> -	sd_shutdown(dev);
> +	if (rpm_get_res >= 0)
> +		sd_shutdown(dev);
> 
>   	put_disk(sdkp->disk);
>   	return 0;

OK. Let me try this.

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2023-09-14 22:07 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-12  0:56 [PATCH v2 00/21] Fix libata suspend/resume handling and code cleanup Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 01/21] ata: libata-core: Fix ata_port_request_pm() locking Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 02/21] ata: libata-core: Fix port and device removal Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 03/21] ata: libata-scsi: link ata port and scsi device Damien Le Moal
2023-09-13 10:25   ` Geert Uytterhoeven
2023-09-14  7:08     ` Geert Uytterhoeven
2023-09-14  7:18       ` Damien Le Moal
2023-09-14 13:18       ` Damien Le Moal
2023-09-14 13:43         ` Geert Uytterhoeven
2023-09-12  0:56 ` [PATCH v2 04/21] ata: libata-scsi: Disable scsi device manage_start_stop Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 05/21] ata: libata-scsi: Fix delayed scsi_rescan_device() execution Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 06/21] ata: libata-core: Do not register PM operations for SAS ports Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 07/21] scsi: sd: Do not issue commands to suspended disks on remove Damien Le Moal
2023-09-14 14:48   ` Bart Van Assche
2023-09-14 22:06     ` Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 08/21] ata: libata-core: Fix compilation warning in ata_dev_config_ncq() Damien Le Moal
2023-09-12  6:14   ` Hannes Reinecke
2023-09-12  0:56 ` [PATCH v2 09/21] ata: libata-eh: Fix compilation warning in ata_eh_link_report() Damien Le Moal
2023-09-12  6:14   ` Hannes Reinecke
2023-09-12  0:56 ` [PATCH v2 10/21] scsi: Remove scsi device no_start_on_resume flag Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 11/21] ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat() Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 12/21] ata: libata-core: Synchronize ata_port_detach() with hotplug Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 13/21] ata: libata-core: Detach a port devices on shutdown Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 14/21] ata: libata-core: Remove ata_port_suspend_async() Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 15/21] ata: libata-core: Remove ata_port_resume_async() Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 16/21] ata: libata-core: skip poweroff for devices that are runtime suspended Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 17/21] ata: libata-core: Do not resume ports that have been " Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 18/21] ata: libata-sata: Improve ata_sas_slave_configure() Damien Le Moal
2023-09-12  7:43   ` John Garry
2023-09-12  7:52     ` Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 19/21] ata: libata-eh: Improve reset error messages Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 20/21] ata: libata-eh: Reduce "disable device" message verbosity Damien Le Moal
2023-09-12  0:56 ` [PATCH v2 21/21] ata: libata: Cleanup inline DMA helper functions Damien Le Moal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox