public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Hard disk S3 resume time optimization
@ 2013-05-16 22:29 Brandt, Todd E
  2013-05-16 22:44 ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Brandt, Todd E @ 2013-05-16 22:29 UTC (permalink / raw)
  To: linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org
  Cc: Jeff Garzik, Jens Axboe, Greg Kroah-Hartman,
	arjan@linux.intel.com

Hello, my name's Todd Brandt and I'm the project owner for Intel Open Source Technology Center's new project to optimize suspend/resume time.
Our website is: https://01.org/suspendresume

[The Problem]
The vast majority of time spent in S3 resume is consumed by the ATA subsystem as it resumes the computer's hard drives. For large hard disks this time can be upwards of 10 seconds, which makes S3 suspend/resume too costly to use frequently. This time needs to be reduced. I've written a blog entry describing the details at this url:
https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-worst-offender

[Proposed Solution]
I've implemented a potential solution in this patch, which effectively reduces total resume time from multiple seconds to less than 700ms. The idea is to allow the ATA/SCSI subsystems to resume asynchronously without blocking system resume completion. The dpm_resume call currently waits for all asynchronous devices to finish resuming before it gives control back to the user. But in the case of ATA/SCSI we can give control back immediately, since the hard disks won't be needed immediately in lieu of RAM and cache.

Patch1 goes through and sets the power.async_suspend flag for every device in the ATA/SCSI resume path. This includes the ata port, link, and dev devices, the scsi host and target devices, all their associated transport devices, the block devices, and block partitions. This allows the entire ATA resume path to be added to the async device queue in drivers/base/power/main.c. Without this, the ATA resume path ends up in both queues (sync and async), which causes interdependencies and backs up the two queues. It's also needed for the second patch to have any effect.

Patch2 updates the drivers/base/power subsystem to allow any devices which have registered as asynchronous and who have not registered "compete" callbacks to be non-blocking. As it happens, the ATA subsystem devices don't register complete callbacks, so this is a simple way of adding the new functionality and getting ATA to conform immediately. 

----------------------------------------------------------------

[PATCH1 - ata_resume_async.patch]

Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
---
 block/genhd.c                      |    2 ++
 block/partition-generic.c          |    1 +
 drivers/ata/libata-transport.c     |    4 +++-
 drivers/base/attribute_container.c |    1 +
 drivers/base/core.c                |    7 ++++++-
 drivers/scsi/scsi_sysfs.c          |    2 +-
 drivers/scsi/sd.c                  |    3 +++
 7 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 7dcfdd8..8b88c05 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -525,6 +525,8 @@ static void register_disk(struct gendisk *disk)
 	/* delay uevents, until we scanned partition table */
 	dev_set_uevent_suppress(ddev, 1);
 
+	device_enable_async_suspend(ddev);
+
 	if (device_add(ddev))
 		return;
 	if (!sysfs_deprecated) {
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 1cb4dec..7f136d1 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -325,6 +325,7 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
 	pdev->class = &block_class;
 	pdev->type = &part_type;
 	pdev->parent = ddev;
+	pdev->power.async_suspend = true;
 
 	err = blk_alloc_devt(p, &devt);
 	if (err)
diff --git a/drivers/ata/libata-transport.c b/drivers/ata/libata-transport.c
index c04d393..493f5ce 100644
--- a/drivers/ata/libata-transport.c
+++ b/drivers/ata/libata-transport.c
@@ -285,13 +285,13 @@ int ata_tport_add(struct device *parent,
 	dev->parent = get_device(parent);
 	dev->release = ata_tport_release;
 	dev_set_name(dev, "ata%d", ap->print_id);
+	device_enable_async_suspend(dev);
 	transport_setup_device(dev);
 	error = device_add(dev);
 	if (error) {
 		goto tport_err;
 	}
 
-	device_enable_async_suspend(dev);
 	pm_runtime_set_active(dev);
 	pm_runtime_enable(dev);
 	pm_runtime_forbid(dev);
@@ -414,6 +414,7 @@ int ata_tlink_add(struct ata_link *link)
         else
 		dev_set_name(dev, "link%d.%d", ap->print_id, link->pmp);
 
+	device_enable_async_suspend(dev);
 	transport_setup_device(dev);
 
 	error = device_add(dev);
@@ -642,6 +643,7 @@ static int ata_tdev_add(struct ata_device *ata_dev)
         else
 		dev_set_name(dev, "dev%d.%d.0", ap->print_id, link->pmp);
 
+	device_enable_async_suspend(dev);
 	transport_setup_device(dev);
 	error = device_add(dev);
 	if (error) {
diff --git a/drivers/base/attribute_container.c b/drivers/base/attribute_container.c
index d78b204..49cc02d 100644
--- a/drivers/base/attribute_container.c
+++ b/drivers/base/attribute_container.c
@@ -349,6 +349,7 @@ attribute_container_add_attrs(struct device *classdev)
 int
 attribute_container_add_class_device(struct device *classdev)
 {
+	classdev->power.async_suspend = true;
 	int error = device_add(classdev);
 	if (error)
 		return error;
diff --git a/drivers/base/core.c b/drivers/base/core.c
index a235085..6c1cf6c 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1020,7 +1020,7 @@ int device_add(struct device *dev)
 		goto name_error;
 	}
 
-	pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
+	pr_debug("device: '%s': %s, %s\n", dev_name(dev), __func__,(dev->power.async_suspend)?"ASYNC":"SYNC");
 
 	parent = get_device(dev->parent);
 	kobj = get_device_parent(dev, parent);
@@ -1558,6 +1558,11 @@ struct device *device_create_vargs(struct class *class, struct device *parent,
 		goto error;
 	}
 
+	if (parent)
+		dev->power.async_suspend = parent->power.async_suspend;
+	else
+		dev->power.async_suspend = true;
+
 	dev->devt = devt;
 	dev->class = class;
 	dev->parent = parent;
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 931a7d9..22b5a5a 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -838,6 +838,7 @@ static int scsi_target_add(struct scsi_target *starget)
 	if (starget->state != STARGET_CREATED)
 		return 0;
 
+	device_enable_async_suspend(&starget->dev);
 	error = device_add(&starget->dev);
 	if (error) {
 		dev_err(&starget->dev, "target device_add failed, error %d\n", error);
@@ -848,7 +849,6 @@ static int scsi_target_add(struct scsi_target *starget)
 
 	pm_runtime_set_active(&starget->dev);
 	pm_runtime_enable(&starget->dev);
-	device_enable_async_suspend(&starget->dev);
 
 	return 0;
 }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 7992635..3a412ea 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2924,6 +2924,9 @@ static int sd_probe(struct device *dev)
 	sdkp->dev.class = &sd_disk_class;
 	dev_set_name(&sdkp->dev, dev_name(dev));
 
+	if (dev)
+		sdkp->dev.power.async_suspend = dev->power.async_suspend;
+
 	if (device_add(&sdkp->dev))
 		goto out_free_index;

----------------------------------------------------------------

[PATCH2 - pm_resume_async_non_blocking.patch]

Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
---
 drivers/base/power/main.c |   18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 2b7f77d..280065b 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -713,7 +713,6 @@ void dpm_resume(pm_message_t state)
 		put_device(dev);
 	}
 	mutex_unlock(&dpm_list_mtx);
-	async_synchronize_full();
 	dpm_show_time(starttime, state, NULL);
 }
 
@@ -726,11 +725,14 @@ static void device_complete(struct device *dev, pm_message_t state)
 {
 	void (*callback)(struct device *) = NULL;
 	char *info = NULL;
+	bool hascb = false;
 
 	if (dev->power.syscore)
 		return;
 
-	device_lock(dev);
+ DoComplete:
+	if (hascb)
+		device_lock(dev);
 
 	if (dev->pm_domain) {
 		info = "completing power domain ";
@@ -751,13 +753,19 @@ static void device_complete(struct device *dev, pm_message_t state)
 		callback = dev->driver->pm->complete;
 	}
 
+	/* if a callback exists, lock the device and call it */
+	/* otherwise don't even lock/unlock the device */
 	if (callback) {
+		if (!hascb) {
+			hascb = true;
+			goto DoComplete;
+		}
+
 		pm_dev_dbg(dev, state, info);
 		callback(dev);
+		device_unlock(dev);
 	}
 
-	device_unlock(dev);
-
 	pm_runtime_put_sync(dev);
 }
 
@@ -1180,6 +1188,8 @@ int dpm_suspend(pm_message_t state)
 	might_sleep();
 
 	mutex_lock(&dpm_list_mtx);
+	/* wait for any processes still resuming */
+	async_synchronize_full();
 	pm_transition = state;
 	async_error = 0;
 	while (!list_empty(&dpm_prepared_list)) {

-----------------------------------------------------------------------

Todd Brandt
Linux Kernel Developer, Intel OTC, Hillsboro OR






^ permalink raw reply related	[flat|nested] 8+ messages in thread
* [PATCH] Hard disk S3 resume time optimization
@ 2013-08-01 23:40 Brandt, Todd E
  2013-08-02 12:42 ` Oliver Neukum
  0 siblings, 1 reply; 8+ messages in thread
From: Brandt, Todd E @ 2013-08-01 23:40 UTC (permalink / raw)
  To: linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org

This patch is a potential way to reduce the S3 resume time for SATA drives. Essentially this patch removes the hard disk resume time from the total system resume time, with the disks still taking as long to come back online but in the background.

The major bottleneck is in the ata port resume which sends out a wakeup command and then waits up to several seconds for the port to resume. This patch changes the ata_port_resume_common function to be non blocking. The other bottleneck is in the scsi disk resume code, which issues a a command to startup the disk with blk_execute_rq, which then waits for the command to finish (which also depends on the ata port being fully resumed). The patch switches the sd_resume function to use blk_execute_rq_nowait instead (the underlying code is identical to sd_resume, but is changed to non-blocking).

Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
---
 drivers/ata/libata-core.c |  4 +++-
 drivers/scsi/sd.c         | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index c24354d..3585092 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -118,6 +118,7 @@ struct ata_force_ent {
 
 static struct ata_force_ent *ata_force_tbl;
 static int ata_force_tbl_size;
+int ata_resume_status;
 
 static char ata_force_param_buf[PAGE_SIZE] __initdata;
 /* param_buf is thrown away after initialization, disallow read */
@@ -5398,7 +5399,8 @@ static int ata_port_resume_common(struct device *dev, pm_message_t mesg)
 {
 	struct ata_port *ap = to_ata_port(dev);
 
-	return __ata_port_resume_common(ap, mesg, NULL);
+	ata_resume_status = 0;
+	return __ata_port_resume_common(ap, mesg, &ata_resume_status);
 }
 
 static int ata_port_resume(struct device *dev)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 86fcf2c..ee4d7e2 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -107,6 +107,7 @@ static int  sd_remove(struct device *);
 static void sd_shutdown(struct device *);
 static int sd_suspend(struct device *);
 static int sd_resume(struct device *);
+static int sd_resume_async(struct device *);
 static void sd_rescan(struct device *);
 static int sd_done(struct scsi_cmnd *);
 static int sd_eh_action(struct scsi_cmnd *, unsigned char *, int, int);
@@ -484,7 +485,7 @@ static struct class sd_disk_class = {
 
 static const struct dev_pm_ops sd_pm_ops = {
 	.suspend		= sd_suspend,
-	.resume			= sd_resume,
+	.resume			= sd_resume_async,
 	.poweroff		= sd_suspend,
 	.restore		= sd_resume,
 	.runtime_suspend	= sd_suspend,
@@ -3137,6 +3138,60 @@ done:
 	return ret;
 }
 
+static void sd_resume_async_end(struct request *rq, int error)
+{
+	struct scsi_disk *sdkp = rq->end_io_data;
+
+	sd_printk(KERN_NOTICE, sdkp, "Starting disk complete (async)\n");
+	rq->end_io_data = NULL;
+	__blk_put_request(rq->q, rq);
+}
+
+static int sd_resume_async(struct device *dev)
+{
+	unsigned char cmd[6] = { START_STOP };	/* START_VALID */
+	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+	struct request *req;
+	int ret = 0;
+
+	if (!sdkp->device->manage_start_stop)
+		goto done;
+
+	sd_printk(KERN_NOTICE, sdkp, "Starting disk (async)\n");
+
+	cmd[4] |= 1;	/* START */
+
+	if (sdkp->device->start_stop_pwr_cond)
+		cmd[4] |= 1 << 4;	/* Active or Standby */
+
+	if (!scsi_device_online(sdkp->device)) {
+		ret = -ENODEV;
+		goto done;
+	}
+
+	req = blk_get_request(sdkp->device->request_queue, 0, __GFP_WAIT);
+	if (!req) {
+		ret = DRIVER_ERROR << 24;
+		goto done;
+	}
+
+	req->cmd_len = COMMAND_SIZE(cmd[0]);
+	memcpy(req->cmd, cmd, req->cmd_len);
+	req->sense = NULL;
+	req->sense_len = 0;
+	req->retries = SD_MAX_RETRIES;
+	req->timeout = SD_TIMEOUT;
+	req->cmd_type = REQ_TYPE_BLOCK_PC;
+	req->cmd_flags |= REQ_PM | REQ_QUIET | REQ_PREEMPT;
+
+	req->end_io_data = sdkp;
+	blk_execute_rq_nowait(req->q, NULL, req, 1, sd_resume_async_end);
+
+done:
+	scsi_disk_put(sdkp);
+	return ret;
+}
+
 /**
  *	init_sd - entry point for this driver (both when built in or when
  *	a module).



Todd Brandt
Linux Kernel Developer OTC, Hillsboro OR

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-08-02 17:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-16 22:29 [PATCH] Hard disk S3 resume time optimization Brandt, Todd E
2013-05-16 22:44 ` Tejun Heo
2013-05-16 22:47   ` Tejun Heo
  -- strict thread matches above, loose matches on Subject: below --
2013-08-01 23:40 Brandt, Todd E
2013-08-02 12:42 ` Oliver Neukum
2013-08-02 16:41   ` Brandt, Todd E
2013-08-02 17:05     ` Oliver Neukum
2013-08-02 17:12       ` Brandt, Todd E

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox