* [PATCH 0/2] Support runtime power off of HDD
@ 2012-09-13 7:40 Aaron Lu
2012-09-13 7:40 ` [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk Aaron Lu
2012-09-13 7:40 ` [PATCH 2/2] libata: acpi: set can_power_off for both ODD and HDD Aaron Lu
0 siblings, 2 replies; 24+ messages in thread
From: Aaron Lu @ 2012-09-13 7:40 UTC (permalink / raw)
To: Alan Stern, Jeff Garzik, James Bottomley
Cc: Aaron Lu, Jack Wang, Shane Huang, Oliver Neukum, linux-scsi,
linux-ide, linux-pm, linux-acpi, Aaron Lu
This patch set is based on the v7 ZPODD patches.
Aaron Lu (2):
scsi: sd: set ready_to_power_off for scsi disk
libata: acpi: set can_power_off for both ODD and HDD
drivers/ata/libata-acpi.c | 25 +++++++++++++++++--------
drivers/scsi/sd.c | 1 +
2 files changed, 18 insertions(+), 8 deletions(-)
--
1.7.12.21.g871e293
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 7:40 [PATCH 0/2] Support runtime power off of HDD Aaron Lu
@ 2012-09-13 7:40 ` Aaron Lu
2012-09-13 8:14 ` James Bottomley
2012-09-13 7:40 ` [PATCH 2/2] libata: acpi: set can_power_off for both ODD and HDD Aaron Lu
1 sibling, 1 reply; 24+ messages in thread
From: Aaron Lu @ 2012-09-13 7:40 UTC (permalink / raw)
To: Alan Stern, Jeff Garzik, James Bottomley
Cc: Aaron Lu, Jack Wang, Shane Huang, Oliver Neukum, linux-scsi,
linux-ide, linux-pm, linux-acpi, Aaron Lu
The ready_to_power_off flag is used to give indication to ATA layer
if this device's power can be removed when runtime suspended.
This flag is determined by individual SCSI driver like sr, sd.
This flag is introduced to support zero power ODD. When ODD
is runtime suspended, it may not be OK to remove its power.
But for disk, it is always OK to be powered off, so set this flag.
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
---
drivers/scsi/sd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 4df73e5..de786cf 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2638,6 +2638,7 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
sd_printk(KERN_NOTICE, sdkp, "Attached SCSI %sdisk\n",
sdp->removable ? "removable " : "");
+ sdp->ready_to_power_off = 1;
scsi_autopm_put_device(sdp);
put_device(&sdkp->dev);
}
--
1.7.12.21.g871e293
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/2] libata: acpi: set can_power_off for both ODD and HDD
2012-09-13 7:40 [PATCH 0/2] Support runtime power off of HDD Aaron Lu
2012-09-13 7:40 ` [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk Aaron Lu
@ 2012-09-13 7:40 ` Aaron Lu
1 sibling, 0 replies; 24+ messages in thread
From: Aaron Lu @ 2012-09-13 7:40 UTC (permalink / raw)
To: Alan Stern, Jeff Garzik, James Bottomley
Cc: Aaron Lu, Jack Wang, Shane Huang, Oliver Neukum, linux-scsi,
linux-ide, linux-pm, linux-acpi, Aaron Lu
Hard disks may also be runtime powered off, so set the can_power_off flag
for them too when the conditions are met, so that the may_power_off sysfs
file is created and the user gets a chance to disable runtime power
off.
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
---
drivers/ata/libata-acpi.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/drivers/ata/libata-acpi.c b/drivers/ata/libata-acpi.c
index 24347e0..443c3f2 100644
--- a/drivers/ata/libata-acpi.c
+++ b/drivers/ata/libata-acpi.c
@@ -1015,7 +1015,7 @@ static void ata_acpi_add_pm_notifier(struct ata_device *dev)
if (ACPI_FAILURE(status))
return;
- if (dev->sdev->can_power_off) {
+ if (dev->class == ATA_DEV_ATAPI && dev->sdev->can_power_off) {
acpi_install_notify_handler(handle, ACPI_SYSTEM_NOTIFY,
ata_acpi_wake_dev, dev);
device_set_run_wake(&dev->sdev->sdev_gendev, true);
@@ -1036,7 +1036,7 @@ static void ata_acpi_remove_pm_notifier(struct ata_device *dev)
if (ACPI_FAILURE(status))
return;
- if (dev->sdev->can_power_off) {
+ if (dev->class == ATA_DEV_ATAPI && dev->sdev->can_power_off) {
device_set_run_wake(&dev->sdev->sdev_gendev, false);
acpi_remove_notify_handler(handle, ACPI_SYSTEM_NOTIFY,
ata_acpi_wake_dev);
@@ -1140,14 +1140,23 @@ static int ata_acpi_bind_device(struct ata_port *ap, struct scsi_device *sdev,
/*
* If firmware has _PS3 or _PR3 for this device,
- * and this ata ODD device support device attention,
- * it means this device can be powered off
+ * it means this device can be powered off runtime
*/
states = acpi_dev->power.states;
- if ((states[ACPI_STATE_D3_HOT].flags.valid ||
- states[ACPI_STATE_D3_COLD].flags.explicit_set) &&
- ata_dev->flags & ATA_DFLAG_DA)
- sdev->can_power_off = 1;
+ if (states[ACPI_STATE_D3_HOT].flags.valid ||
+ states[ACPI_STATE_D3_COLD].flags.explicit_set) {
+ /*
+ * For ODD, it needs to support device attention or
+ * it can't be powered up back by user
+ */
+ if (ata_dev->class == ATA_DEV_ATAPI &&
+ ata_dev->flags & ATA_DFLAG_DA)
+ sdev->can_power_off = 1;
+
+ /* No requirement for hard disk */
+ if (ata_dev->class == ATA_DEV_ATA)
+ sdev->can_power_off = 1;
+ }
return 0;
}
--
1.7.12.21.g871e293
^ permalink raw reply related [flat|nested] 24+ messages in thread
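The decision that patch 2/2 adds to ata_acpi_bind_device() can be
sketched outside the kernel as a small standalone function. This is
only an illustration under stated assumptions: dev_class,
DEV_ATA/DEV_ATAPI/DEV_OTHER and the can_power_off() helper are
hypothetical stand-ins, not the libata or ACPI API. The has_d3
parameter models "firmware has _PS3 or _PR3" and device_attention
models ATA_DFLAG_DA.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ATA device classes used in the patch. */
enum dev_class { DEV_ATA, DEV_ATAPI, DEV_OTHER };

/*
 * Simplified model of the check in patch 2/2 (not kernel code).
 *
 * has_d3:            firmware provides _PS3 or _PR3 for the device
 * device_attention:  the drive supports device attention (ATA_DFLAG_DA)
 */
bool can_power_off(enum dev_class cls, bool has_d3, bool device_attention)
{
	/* Without firmware support there is no way to remove power. */
	if (!has_d3)
		return false;

	/* An ODD needs device attention, or the user cannot wake it up. */
	if (cls == DEV_ATAPI)
		return device_attention;

	/* No extra requirement for a hard disk. */
	return cls == DEV_ATA;
}
```

With this model a hard disk qualifies as soon as the firmware offers a
D3 method, while an optical drive additionally needs device attention.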
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 7:40 ` [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk Aaron Lu
@ 2012-09-13 8:14 ` James Bottomley
2012-09-13 8:23 ` Aaron Lu
0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2012-09-13 8:14 UTC (permalink / raw)
To: Aaron Lu
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, 2012-09-13 at 15:40 +0800, Aaron Lu wrote:
> The ready_to_power_off flag is used to give indication to ATA layer
> if this device's power can be removed when runtime suspended.
>
> This flag is determined by individual SCSI driver like sr, sd.
>
> This flag is introduced to support zero power ODD. When ODD
> is runtime suspended, it may not be OK to remove its power.
>
> But for disk, it is always OK to be powered off, so set this flag.
It is? I may have missed this, but where do you flush the cache of write
back cache devices you're about to power off?
James
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 8:14 ` James Bottomley
@ 2012-09-13 8:23 ` Aaron Lu
2012-09-13 8:37 ` James Bottomley
0 siblings, 1 reply; 24+ messages in thread
From: Aaron Lu @ 2012-09-13 8:23 UTC (permalink / raw)
To: James Bottomley
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On 09/13/2012 04:14 PM, James Bottomley wrote:
> On Thu, 2012-09-13 at 15:40 +0800, Aaron Lu wrote:
>> The ready_to_power_off flag is used to give indication to ATA layer
>> if this device's power can be removed when runtime suspended.
>>
>> This flag is determined by individual SCSI driver like sr, sd.
>>
>> This flag is introduced to support zero power ODD. When ODD
>> is runtime suspended, it may not be OK to remove its power.
>>
>> But for disk, it is always OK to be powered off, so set this flag.
>
> It is? I may have missed this, but where do you flush the cache of write
> back cache devices you're about to power off?
I suppose that is handled in sd_suspend callback, the power off happens
after a device is runtime suspended.
Thanks,
Aaron
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 8:23 ` Aaron Lu
@ 2012-09-13 8:37 ` James Bottomley
2012-09-13 8:49 ` Aaron Lu
0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2012-09-13 8:37 UTC (permalink / raw)
To: Aaron Lu
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, 2012-09-13 at 16:23 +0800, Aaron Lu wrote:
> On 09/13/2012 04:14 PM, James Bottomley wrote:
> > On Thu, 2012-09-13 at 15:40 +0800, Aaron Lu wrote:
> >> The ready_to_power_off flag is used to give indication to ATA layer
> >> if this device's power can be removed when runtime suspended.
> >>
> >> This flag is determined by individual SCSI driver like sr, sd.
> >>
> >> This flag is introduced to support zero power ODD. When ODD
> >> is runtime suspended, it may not be OK to remove its power.
> >>
> >> But for disk, it is always OK to be powered off, so set this flag.
> >
> > It is? I may have missed this, but where do you flush the cache of write
> > back cache devices you're about to power off?
>
> I suppose that is handled in sd_suspend callback, the power off happens
> after a device is runtime suspended.
Well that would mean something is wrong somewhere: For runtime power
management using idle timers and forced standby, there's no need to
flush the cache (if the drive goes into standby on its own as a result
of an idle timeout, the cache will never flush). The cache needs to
flush before we power off the device: that's before the system goes into
S3, or now before you power it off at runtime. Flushing the cache on
runtime transitions to standby will likely cause performance problems
since that happens quite often.
James
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 8:37 ` James Bottomley
@ 2012-09-13 8:49 ` Aaron Lu
2012-09-13 8:56 ` James Bottomley
0 siblings, 1 reply; 24+ messages in thread
From: Aaron Lu @ 2012-09-13 8:49 UTC (permalink / raw)
To: James Bottomley
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On 09/13/2012 04:37 PM, James Bottomley wrote:
> On Thu, 2012-09-13 at 16:23 +0800, Aaron Lu wrote:
>> On 09/13/2012 04:14 PM, James Bottomley wrote:
>>> On Thu, 2012-09-13 at 15:40 +0800, Aaron Lu wrote:
>>>> The ready_to_power_off flag is used to give indication to ATA layer
>>>> if this device's power can be removed when runtime suspended.
>>>>
>>>> This flag is determined by individual SCSI driver like sr, sd.
>>>>
>>>> This flag is introduced to support zero power ODD. When ODD
>>>> is runtime suspended, it may not be OK to remove its power.
>>>>
>>>> But for disk, it is always OK to be powered off, so set this flag.
>>>
>>> It is? I may have missed this, but where do you flush the cache of write
>>> back cache devices you're about to power off?
>>
>> I suppose that is handled in sd_suspend callback, the power off happens
>> after a device is runtime suspended.
>
> Well that would mean something is wrong somewhere: For runtime power
> management using idle timers and forced standby, there's no need to
The current mechanism for scsi disk runtime pm is based on open/close.
If some process has opened the block device, it stays in the active
state; only when all openers have closed it does it enter the runtime
suspend state.
> flush the cache (if the drive goes into standby on its own as a result
> of an idle timeout, the cache will never flush). The cache needs to
> flush before we power off the device: that's before the system goes into
> S3, or now before you power it off at runtime. Flushing the cache on
> runtime transitions to standby will likely cause performance problems
> since that happens quite often.
As explained above, it doesn't happen that often, especially for a user
who has only one disk: the disk will be mounted, so it can never
enter the runtime suspend state.
Thanks,
Aaron
^ permalink raw reply [flat|nested] 24+ messages in thread
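The open/close based scheme described in the message above can be
modelled with a plain counter. This is a simplified sketch, not the sd
driver: struct disk_pm, disk_open() and disk_close() are hypothetical
names, and the model suspends immediately on the last close, which
ignores any autosuspend delay a real system may apply.

```c
#include <stdbool.h>

/* Hypothetical model of a disk whose runtime PM follows open/close. */
struct disk_pm {
	int open_count;   /* number of current openers (a mount counts as one) */
	bool suspended;   /* true while the disk is runtime suspended */
};

/* Any opener brings the disk to the active state. */
void disk_open(struct disk_pm *d)
{
	d->open_count++;
	d->suspended = false;
}

/* Only the last close lets the disk enter runtime suspend. */
void disk_close(struct disk_pm *d)
{
	if (d->open_count > 0 && --d->open_count == 0)
		d->suspended = true;
}
```

In this model a mounted root disk never reaches open_count == 0, which
is why a single-disk system would never see a runtime suspend.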
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 8:49 ` Aaron Lu
@ 2012-09-13 8:56 ` James Bottomley
2012-09-13 9:07 ` Aaron Lu
0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2012-09-13 8:56 UTC (permalink / raw)
To: Aaron Lu
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, 2012-09-13 at 16:49 +0800, Aaron Lu wrote:
> On 09/13/2012 04:37 PM, James Bottomley wrote:
> > On Thu, 2012-09-13 at 16:23 +0800, Aaron Lu wrote:
> >> On 09/13/2012 04:14 PM, James Bottomley wrote:
> >>> On Thu, 2012-09-13 at 15:40 +0800, Aaron Lu wrote:
> >>>> The ready_to_power_off flag is used to give indication to ATA layer
> >>>> if this device's power can be removed when runtime suspended.
> >>>>
> >>>> This flag is determined by individual SCSI driver like sr, sd.
> >>>>
> >>>> This flag is introduced to support zero power ODD. When ODD
> >>>> is runtime suspended, it may not be OK to remove its power.
> >>>>
> >>>> But for disk, it is always OK to be powered off, so set this flag.
> >>>
> >>> It is? I may have missed this, but where do you flush the cache of write
> >>> back cache devices you're about to power off?
> >>
> >> I suppose that is handled in sd_suspend callback, the power off happens
> >> after a device is runtime suspended.
> >
> > Well that would mean something is wrong somewhere: For runtime power
> > management using idle timers and forced standby, there's no need to
>
> The current mechanism for scsi disk runtime pm is based on open/close.
> If there is some process opened this block device, it will be in active
> state; only when all opened session exited, it will enter runtime
> suspend state.
A mounted disk is open for the period of the mount. I thought the use
case for runtime PM was the laptop one but most laptops have a single
device to use as root, so if you never use runtime PM on an open device,
you never use it on 99% of our target systems ... doesn't that make the
feature a bit useless?
> > flush the cache (if the drive goes into standby on its own as a result
> > of an idle timeout, the cache will never flush). The cache needs to
> > flush before we power off the device: that's before the system goes into
> > S3, or now before you power it off at runtime. Flushing the cache on
> > runtime transitions to standby will likely cause performance problems
> > since that happens quite often.
>
> As explained above, it didn't happen that often, especially for user who
> has only one disk, the disk will be mounted, which makes it never be
> able to enter runtime suspend state.
So what's the target audience for the feature? If it isn't laptops or
standard desktops, is it the enterprise?
James
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 8:56 ` James Bottomley
@ 2012-09-13 9:07 ` Aaron Lu
2012-09-13 9:26 ` James Bottomley
0 siblings, 1 reply; 24+ messages in thread
From: Aaron Lu @ 2012-09-13 9:07 UTC (permalink / raw)
To: James Bottomley
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On 09/13/2012 04:56 PM, James Bottomley wrote:
> On Thu, 2012-09-13 at 16:49 +0800, Aaron Lu wrote:
>> On 09/13/2012 04:37 PM, James Bottomley wrote:
>>> On Thu, 2012-09-13 at 16:23 +0800, Aaron Lu wrote:
>>>> On 09/13/2012 04:14 PM, James Bottomley wrote:
>>>>> On Thu, 2012-09-13 at 15:40 +0800, Aaron Lu wrote:
>>>>>> The ready_to_power_off flag is used to give indication to ATA layer
>>>>>> if this device's power can be removed when runtime suspended.
>>>>>>
>>>>>> This flag is determined by individual SCSI driver like sr, sd.
>>>>>>
>>>>>> This flag is introduced to support zero power ODD. When ODD
>>>>>> is runtime suspended, it may not be OK to remove its power.
>>>>>>
>>>>>> But for disk, it is always OK to be powered off, so set this flag.
>>>>>
>>>>> It is? I may have missed this, but where do you flush the cache of write
>>>>> back cache devices you're about to power off?
>>>>
>>>> I suppose that is handled in sd_suspend callback, the power off happens
>>>> after a device is runtime suspended.
>>>
>>> Well that would mean something is wrong somewhere: For runtime power
>>> management using idle timers and forced standby, there's no need to
>>
>> The current mechanism for scsi disk runtime pm is based on open/close.
>> If there is some process opened this block device, it will be in active
>> state; only when all opened session exited, it will enter runtime
>> suspend state.
>
> A mounted disk is open for the period of the mount. I thought the use
> case for runtime PM was the laptop one but most laptops have a single
> device to use as root, so if you never use runtime PM on an open device,
> you never use it on 99% of our target systems ... doesn't that make the
> feature a bit useless?
I agree, but it may be helpful in some cases.
>
>>> flush the cache (if the drive goes into standby on its own as a result
>>> of an idle timeout, the cache will never flush). The cache needs to
>>> flush before we power off the device: that's before the system goes into
>>> S3, or now before you power it off at runtime. Flushing the cache on
>>> runtime transitions to standby will likely cause performance problems
>>> since that happens quite often.
>>
>> As explained above, it didn't happen that often, especially for user who
>> has only one disk, the disk will be mounted, which makes it never be
>> able to enter runtime suspend state.
>
> So what's the target audience for the feature. If it isn't laptops or
> standard desktops, is it the enterprise?
To make this feature useful for normal laptop users, a better mechanism
for scsi disk runtime pm is needed. Alan Stern and Lin Ming have been
working on this, and I'll see if I can make that patch work later.
So I think this is basically 2 things, one is the runtime suspend of the
disk, another is when it is runtime suspended, how to remove its power.
I'm currently doing the latter one, which is simpler, so I want to do it
first :-)
And there may be cases where this is helpful, e.g. if a user has 2 or
more disks attached and is only using one of them, or some other
corner cases that I don't know of.
Considering that the effort to implement this feature is pretty small,
and that it shouldn't cause trouble for existing systems, I think this
may be worth it.
Thanks,
Aaron
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 9:07 ` Aaron Lu
@ 2012-09-13 9:26 ` James Bottomley
2012-09-13 10:16 ` Oliver Neukum
2012-09-14 5:20 ` Aaron Lu
0 siblings, 2 replies; 24+ messages in thread
From: James Bottomley @ 2012-09-13 9:26 UTC (permalink / raw)
To: Aaron Lu
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, 2012-09-13 at 17:07 +0800, Aaron Lu wrote:
> On 09/13/2012 04:56 PM, James Bottomley wrote:
> > So what's the target audience for the feature. If it isn't laptops or
> > standard desktops, is it the enterprise?
>
> To make this feature useful for normal laptop user, a better mechanism
> for scsi disk runtime pm is needed. Alan Stern and Lin Ming has been
> working on this, and I'll see if I can make that patch work later.
>
> So I think this is basically 2 things, one is the runtime suspend of the
> disk, another is when it is runtime suspended, how to remove its power.
> I'm currently doing the latter one, which is simpler, so I want to do it
> first :-)
Well, I don't like the way the interaction of the patches is going.
You're the one proposing powering down the device outside of the
standards defined transitions, so you need to be responsible for the
actions that necessitates, including synchronizing the cache. The specs
(SPC-4) say that cache management is explicitly unnecessary for the
standard SCSI power states (Active, Idle, Standby and Stopped), so
someone at some point is going to read that and remove the unnecessary
cache sync in the code. When that happens, you'll start getting data
loss.
James
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 9:26 ` James Bottomley
@ 2012-09-13 10:16 ` Oliver Neukum
2012-09-13 10:51 ` James Bottomley
2012-09-14 5:20 ` Aaron Lu
1 sibling, 1 reply; 24+ messages in thread
From: Oliver Neukum @ 2012-09-13 10:16 UTC (permalink / raw)
To: James Bottomley
Cc: Aaron Lu, Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang,
Shane Huang, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thursday 13 September 2012 10:26:44 James Bottomley wrote:
> On Thu, 2012-09-13 at 17:07 +0800, Aaron Lu wrote:
> > So I think this is basically 2 things, one is the runtime suspend of the
> > disk, another is when it is runtime suspended, how to remove its power.
> > I'm currently doing the latter one, which is simpler, so I want to do it
> > first :-)
>
> Well, I don't like the way the interaction of the patches is going.
> You're the one proposing powering down the device outside of the
> standards defined transitions, so you need to be responsible for the
> actions that necessitates, including synchronizing the cache. The specs
> (SPC-4) say that cache management is explicitly unnecessary for the
> standard SCSI power states (Active, Idle, Standby and Stopped), so
> someone at some point is going to read that and remove the unnecessary
> cache sync in the code. When that happens, you'll start getting data
> loss.
The cache is handled identically in sd_suspend() and sd_shutdown().
In fact sd_shutdown() will skip handling it if the device has already been
suspended, so the assumption is built into the code and has been so
for a long time.
Though it wouldn't hurt to add a comment that says that the system going
to S3 or S4 will cut power to a lot of disks, so the cache needs to be synced
even if the spec says we need not. Runtime PM doesn't much alter the
situation.
Regards
Oliver
^ permalink raw reply [flat|nested] 24+ messages in thread
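The relationship between the two paths described in the message above
can be sketched as follows. This is a standalone model, not the sd
driver source: struct disk_model, model_suspend() and model_shutdown()
are hypothetical names, and the flushes counter stands in for issuing
SYNCHRONIZE CACHE.

```c
#include <stdbool.h>

/* Hypothetical model of the state the two paths look at. */
struct disk_model {
	bool write_cache_enabled;  /* drive has a write-back cache enabled */
	bool runtime_suspended;    /* set once the suspend path has run */
	int flushes;               /* counts modelled SYNCHRONIZE CACHE commands */
};

/* Suspend path: flush the cache, then mark the device suspended. */
void model_suspend(struct disk_model *d)
{
	if (d->write_cache_enabled)
		d->flushes++;
	d->runtime_suspended = true;
}

/*
 * Shutdown path: same cache handling, but skipped entirely when the
 * device is already suspended, on the assumption that the suspend
 * path has flushed it.
 */
void model_shutdown(struct disk_model *d)
{
	if (d->runtime_suspended)
		return;
	if (d->write_cache_enabled)
		d->flushes++;
}
```

The model makes the built-in assumption visible: if the flush were ever
removed from the suspend path, the shutdown path would still skip it
for a suspended device.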
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 10:16 ` Oliver Neukum
@ 2012-09-13 10:51 ` James Bottomley
2012-09-13 12:34 ` Oliver Neukum
0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2012-09-13 10:51 UTC (permalink / raw)
To: Oliver Neukum
Cc: Aaron Lu, Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang,
Shane Huang, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, 2012-09-13 at 12:16 +0200, Oliver Neukum wrote:
> On Thursday 13 September 2012 10:26:44 James Bottomley wrote:
> > On Thu, 2012-09-13 at 17:07 +0800, Aaron Lu wrote:
>
> > > So I think this is basically 2 things, one is the runtime suspend of the
> > > disk, another is when it is runtime suspended, how to remove its power.
> > > I'm currently doing the latter one, which is simpler, so I want to do it
> > > first :-)
> >
> > Well, I don't like the way the interaction of the patches is going.
> > You're the one proposing powering down the device outside of the
> > standards defined transitions, so you need to be responsible for the
> > actions that necessitates, including synchronizing the cache. The specs
> > (SPC-4) say that cache management is explicitly unnecessary for the
> > standard SCSI power states (Active, Idle, Standby and Stopped), so
> > someone at some point is going to read that and remove the unnecessary
> > cache sync in the code. When that happens, you'll start getting data
> > loss.
>
> The cache is handled identically in sd_suspend() and sd_shutdown().
> In fact sd_shutdown() will skip handling it if the device has already been
> suspended, so the assumption is built into the code and has been so
> for a long time.
>
> Though it wouldn't hurt to add a comment that says that the system going
> to S3 or S4 will cut power to a lot of disk so that the cache needs to be synced
> even if the spec says we need not. Runtime PM doesn't much alter the
> situation.
I think you're confusing two things. Sleep states (S3 and S4) aren't
spec'd in SCSI, so we have to take care of everything (including the
cache before power off) because they're done invisibly to the disk. The
same tends to go for link power management, which was previously our
only form of runtime PM, but which doesn't actually affect the disk at
all and, of course, ACPI power off of devices (ZPDD).
Disk runtime power states are defined in the standard and so we rely on
the standard taking care of the cache. I suspect the most efficient use
may be via the power management mode page, which does everything
automatically on timers (you just get to set the timer interval, plus
some transports *may* require an initialising command which we already
have some provision for) than doing it all ourselves from block.
James
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 10:51 ` James Bottomley
@ 2012-09-13 12:34 ` Oliver Neukum
2012-09-13 16:24 ` Alan Stern
0 siblings, 1 reply; 24+ messages in thread
From: Oliver Neukum @ 2012-09-13 12:34 UTC (permalink / raw)
To: James Bottomley
Cc: Aaron Lu, Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang,
Shane Huang, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thursday 13 September 2012 11:51:07 James Bottomley wrote:
> On Thu, 2012-09-13 at 12:16 +0200, Oliver Neukum wrote:
> > On Thursday 13 September 2012 10:26:44 James Bottomley wrote:
> > > On Thu, 2012-09-13 at 17:07 +0800, Aaron Lu wrote:
> >
> > > > So I think this is basically 2 things, one is the runtime suspend of the
> > > > disk, another is when it is runtime suspended, how to remove its power.
> > > > I'm currently doing the latter one, which is simpler, so I want to do it
> > > > first :-)
> > >
> > > Well, I don't like the way the interaction of the patches is going.
> > > You're the one proposing powering down the device outside of the
> > > standards defined transitions, so you need to be responsible for the
> > > actions that necessitates, including synchronizing the cache. The specs
> > > (SPC-4) say that cache management is explicitly unnecessary for the
> > > standard SCSI power states (Active, Idle, Standby and Stopped), so
> > > someone at some point is going to read that and remove the unnecessary
> > > cache sync in the code. When that happens, you'll start getting data
> > > loss.
> >
> > The cache is handled identically in sd_suspend() and sd_shutdown().
> > In fact sd_shutdown() will skip handling it if the device has already been
> > suspended, so the assumption is built into the code and has been so
> > for a long time.
> >
> > Though it wouldn't hurt to add a comment that says that the system going
> > to S3 or S4 will cut power to a lot of disk so that the cache needs to be synced
> > even if the spec says we need not. Runtime PM doesn't much alter the
> > situation.
>
> I think you're confusing two things. Sleep states (S3 and S4) aren't
> spec'd in SCSI, so we have to take care of everything (including the
> cache before power off) because they're done invisibly to the disk. The
Yes, but this confusion is necessary. The driver core is supposed to
be generic and knows strictly speaking only suspended and active.
It is a driver's job to do what needs to be done and translate this
into the appropriate device states.
> same tends to go for link power management, which was previously our
> only form of runtime PM, but which doesn't actually affect the disk at
> all and, of course, ACPI power off of devices (ZPDD).
The latter however does cut power to the drive. So the driver should do
what it does when other operations that affect power are done.
> Disk runtime power states are defined in the standard and so we rely on
> the standard taking care of the cache. I suspect the most efficient use
> may be via the power management mode page, which does everything
> automatically on timers (you just get to set the timer interval, plus
> some transports *may* require an initialising command which we already
> have some provision for) than doing it all ourselves from block.
Well, yes, but we need to support modes of power management that cut off
power to the disk in any case, so what does it matter if we also do it for
runtime PM?
Are you concerned about layering?
Regards
Oliver
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 12:34 ` Oliver Neukum
@ 2012-09-13 16:24 ` Alan Stern
2012-09-13 20:18 ` Oliver Neukum
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: Alan Stern @ 2012-09-13 16:24 UTC (permalink / raw)
To: Oliver Neukum
Cc: James Bottomley, Aaron Lu, Jeff Garzik, Aaron Lu, Jack Wang,
Shane Huang, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, 13 Sep 2012, Oliver Neukum wrote:
> > > > Well, I don't like the way the interaction of the patches is going.
> > > > You're the one proposing powering down the device outside of the
> > > > standards defined transitions, so you need to be responsible for the
> > > > actions that necessitates, including synchronizing the cache. The specs
> > > > (SPC-4) say that cache management is explicitly unnecessary for the
> > > > standard SCSI power states (Active, Idle, Standby and Stopped), so
> > > > someone at some point is going to read that and remove the unnecessary
> > > > cache sync in the code. When that happens, you'll start getting data
> > > > loss.
> > >
> > > The cache is handled identically in sd_suspend() and sd_shutdown().
> > > In fact sd_shutdown() will skip handling it if the device has already been
> > > suspended, so the assumption is built into the code and has been so
> > > for a long time.
> > >
> > > Though it wouldn't hurt to add a comment that says that the system going
> > > to S3 or S4 will cut power to a lot of disk so that the cache needs to be synced
> > > even if the spec says we need not. Runtime PM doesn't much alter the
> > > situation.
> >
> > I think you're confusing two things. Sleep states (S3 and S4) aren't
> > spec'd in SCSI, so we have to take care of everything (including the
> > cache before power off) because they're done invisibly to the disk. The
>
> Yes, but this confusion is necessary. The driver core is supposed to
> be generic and knows strictly speaking only suspended and active.
> It is a driver's job to do what needs to be done and translate this
> into the appropriate device states.
Currently the sd driver's suspend routine is not very sophisticated.
It needs to become smarter about the differences between system
suspend, runtime suspend, and power off.
> > same tends to go for link power management, which was previously our
> > only form of runtime PM, but which doesn't actually affect the disk at
> > all and, of course, ACPI power off of devices (ZPDD).
>
> The latter however does cut power to the drive. So the driver should do
> what it does when other operations that affect power are done.
>
> > Disk runtime power states are defined in the standard and so we rely on
> > the standard taking care of the cache. I suspect the most efficient use
> > may be via the power management mode page, which does everything
> > automatically on timers (you just get to set the timer interval, plus
> > some transports *may* require an initialising command which we already
> > have some provision for) than doing it all ourselves from block.
>
> Well, yes, but we need support modes of power management that cut off
> power to the disk in any case, so what does it matter if we also do it for
> runtime PM?
>
> Are you concerned about layering?
It sounds like James is partly concerned about efficiency. If Lin
Ming's patches are merged then we will be doing runtime suspend
relatively often, not just when the device file is closed. The
sd_suspend routine should know when SYNCHRONIZE CACHE is needed and
when it can be skipped.
From what I gather of this discussion, we can avoid flushing the cache
during (1) a runtime suspend provided (2) the drive isn't going to be
powered down. If either (1) or (2) doesn't hold then the cache needs
to be synchronized.
The problem with relying on the internal timers and the power
management mode page is that the transitions take place automatically
and the host system doesn't know about them. We _want_ to know about
them so that the higher layers of the device tree can go to low power
when the disk does.
On the other hand, perhaps sd_suspend/sd_resume could use the mode page
by telling it to go into or out of Stopped mode immediately.
Alan Stern
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 16:24 ` Alan Stern
@ 2012-09-13 20:18 ` Oliver Neukum
2012-09-13 20:46 ` Alan Stern
2012-09-14 6:57 ` Aaron Lu
2012-09-14 8:15 ` James Bottomley
2 siblings, 1 reply; 24+ messages in thread
From: Oliver Neukum @ 2012-09-13 20:18 UTC (permalink / raw)
To: Alan Stern
Cc: James Bottomley, Aaron Lu, Jeff Garzik, Aaron Lu, Jack Wang,
Shane Huang, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thursday 13 September 2012 12:24:46 Alan Stern wrote:
> On Thu, 13 Sep 2012, Oliver Neukum wrote:
> > Yes, but this confusion is necessary. The driver core is supposed to
> > be generic and knows strictly speaking only suspended and active.
> > It is a driver's job to do what needs to be done and translate this
> > into the appropriate device states.
>
> Currently the sd driver's suspend routine is not very sophisticated.
> It needs to become smarter about the differences between system
> suspend, runtime suspend, and power off.
In what way?
> > Well, yes, but we need to support modes of power management that cut off
> > power to the disk in any case, so what does it matter if we also do it for
> > runtime PM?
> >
> > Are you concerned about layering?
>
> It sounds like James is partly concerned about efficiency. If Lin
> Ming's patches are merged then we will be doing runtime suspend
> relatively often, not just when the device file is closed. The
> sd_suspend routine should know when SYNCHRONIZE CACHE is needed and
> when it can be skipped.
How? This depends on the hardware?
> From what I gather of this discussion, we can avoid flushing the cache
> during (1) a runtime suspend provided (2) the drive isn't going to be
> powered down. If either (1) or (2) doesn't hold then the cache needs
> to be synchronized.
This is true, but how is it relevant?
> The problem with relying on the internal timers and the power
> management mode page is that the transitions take place automatically
> and the host system doesn't know about them. We _want_ to know about
> them so that the higher layers of the device tree can go to low power
> when the disk does.
Why would you want that to correlate? The operation of the controller
and the driver is independent of the state.
And what would it tell us, as the driver knows about all IO anyway?
Regards
Oliver
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 20:18 ` Oliver Neukum
@ 2012-09-13 20:46 ` Alan Stern
0 siblings, 0 replies; 24+ messages in thread
From: Alan Stern @ 2012-09-13 20:46 UTC (permalink / raw)
To: Oliver Neukum
Cc: James Bottomley, Aaron Lu, Jeff Garzik, Aaron Lu, Jack Wang,
Shane Huang, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, 13 Sep 2012, Oliver Neukum wrote:
> On Thursday 13 September 2012 12:24:46 Alan Stern wrote:
> > On Thu, 13 Sep 2012, Oliver Neukum wrote:
>
> > > Yes, but this confusion is necessary. The driver core is supposed to
> > > be generic and knows strictly speaking only suspended and active.
> > > It is a driver's job to do what needs to be done and translate this
> > > into the appropriate device states.
> >
> > Currently the sd driver's suspend routine is not very sophisticated.
> > It needs to become smarter about the differences between system
> > suspend, runtime suspend, and power off.
>
> In what way?
sd_suspend should know whether or not to issue the SYNCHRONIZE CACHE
command.
> > It sounds like James is partly concerned about efficiency. If Lin
> > Ming's patches are merged then we will be doing runtime suspend
> > relatively often, not just when the device file is closed. The
> > sd_suspend routine should know when SYNCHRONIZE CACHE is needed and
> > when it can be skipped.
>
> How? This depends on the hardware?
It depends partly on the hardware, partly on the type of suspend, and
partly on the flag settings in sysfs.
> > From what I gather of this discussion, we can avoid flushing the cache
> > during (1) a runtime suspend provided (2) the drive isn't going to be
> > powered down. If either (1) or (2) doesn't hold then the cache needs
> > to be synchronized.
>
> This is true, but how is it relevant?
This, or something like it, is the algorithm sd_suspend should use for
determining whether or not to issue SYNCHRONIZE CACHE.
> > The problem with relying on the internal timers and the power
> > management mode page is that the transitions take place automatically
> > and the host system doesn't know about them. We _want_ to know about
> > them so that the higher layers of the device tree can go to low power
> > when the disk does.
>
> Why would you want that to correlate? The operation of the controller
> and the driver is independent of the state.
That's the problem -- I would like them not to be so independent. The
reason stated above: If we know when the controller puts the drive in a
low-power state then we can tell the higher layers of the device tree
to go to low power at those times.
> And what would it tell us, as the driver knows about all IO anyway?
But the driver doesn't know when the controller has spun down the disk.
That's something else sd_suspend has to worry about.
Alan Stern
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 9:26 ` James Bottomley
2012-09-13 10:16 ` Oliver Neukum
@ 2012-09-14 5:20 ` Aaron Lu
2012-09-14 8:17 ` James Bottomley
1 sibling, 1 reply; 24+ messages in thread
From: Aaron Lu @ 2012-09-14 5:20 UTC (permalink / raw)
To: James Bottomley
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, Sep 13, 2012 at 10:26:44AM +0100, James Bottomley wrote:
> On Thu, 2012-09-13 at 17:07 +0800, Aaron Lu wrote:
> > So I think this is basically 2 things, one is the runtime suspend of the
> > disk, another is when it is runtime suspended, how to remove its power.
> > I'm currently doing the latter one, which is simpler, so I want to do it
> > first :-)
>
> Well, I don't like the way the interaction of the patches is going.
> You're the one proposing powering down the device outside of the
> standards defined transitions, so you need to be responsible for the
> actions that necessitates, including synchronizing the cache. The specs
OK, I'll update the code.
> (SPC-4) say that cache management is explicitly unnecessary for the
> standard SCSI power states (Active, Idle, Standby and Stopped), so
Just read the SPC-4 spec, in section 5.12.3, it has words like this:
Logical units that contain cache memory shall write all cached data to
the medium for the logical unit(e.g., as a logical unit would do in
response to a SYNCHRONIZE CACHE command as described SBC-3) prior to
entering into any power condition that prevents accessing the
media(e.g., before a hard drive stops its spindle motor during a change
to the standby power condition).
So it looks like the cache needs to be synced before the device enters the
standby/stopped power condition. Or am I missing something?
> someone at some point is going to read that and remove the unnecessary
> cache sync in the code. When that happens, you'll start getting data
> loss.
Indeed, I'll make sure the cache gets synced when we are about to power off
the device. Thanks for the reminder.
-Aaron
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 16:24 ` Alan Stern
2012-09-13 20:18 ` Oliver Neukum
@ 2012-09-14 6:57 ` Aaron Lu
2012-09-14 8:15 ` James Bottomley
2 siblings, 0 replies; 24+ messages in thread
From: Aaron Lu @ 2012-09-14 6:57 UTC (permalink / raw)
To: Alan Stern
Cc: Oliver Neukum, James Bottomley, Jeff Garzik, Aaron Lu, Jack Wang,
Shane Huang, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, Sep 13, 2012 at 12:24:46PM -0400, Alan Stern wrote:
> > > Disk runtime power states are defined in the standard and so we rely on
> > > the standard taking care of the cache. I suspect the most efficient use
> > > may be via the power management mode page, which does everything
> > > automatically on timers (you just get to set the timer interval, plus
> > > some transports *may* require an initialising command which we already
> > > have some provision for) than doing it all ourselves from block.
> >
> > Well, yes, but we need to support modes of power management that cut off
> > power to the disk in any case, so what does it matter if we also do it for
> > runtime PM?
> >
> > Are you concerned about layering?
>
> It sounds like James is partly concerned about efficiency. If Lin
> Ming's patches are merged then we will be doing runtime suspend
> relatively often, not just when the device file is closed. The
> sd_suspend routine should know when SYNCHRONIZE CACHE is needed and
> when it can be skipped.
>
> From what I gather of this discussion, we can avoid flushing the cache
> during (1) a runtime suspend provided (2) the drive isn't going to be
> powered down. If either (1) or (2) doesn't hold then the cache needs
> to be synchronized.
Agree.
>
> The problem with relying on the internal timers and the power
> management mode page is that the transitions take place automatically
> and the host system doesn't know about them. We _want_ to know about
> them so that the higher layers of the device tree can go to low power
> when the disk does.
Looks like it's not easy to know when the device entered a low power
state. Constantly polling with request sense doesn't seem to be a good
idea.
This will make upper layer devices not able to enter runtime suspend
state and device's power can't be cut.
>
> On the other hand, perhaps sd_suspend/sd_resume could use the mode page
> by telling it to go into or out of Stopped mode immediately.
BTW, is it necessary to issue the stop command before we cut its power
either due to runtime power off or system entering S3/S4/S5?
Thanks,
Aaron
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-13 16:24 ` Alan Stern
2012-09-13 20:18 ` Oliver Neukum
2012-09-14 6:57 ` Aaron Lu
@ 2012-09-14 8:15 ` James Bottomley
2 siblings, 0 replies; 24+ messages in thread
From: James Bottomley @ 2012-09-14 8:15 UTC (permalink / raw)
To: Alan Stern
Cc: Oliver Neukum, Aaron Lu, Jeff Garzik, Aaron Lu, Jack Wang,
Shane Huang, linux-scsi, linux-ide, linux-pm, linux-acpi
On Thu, 2012-09-13 at 12:24 -0400, Alan Stern wrote:
> On Thu, 13 Sep 2012, Oliver Neukum wrote:
> > > Disk runtime power states are defined in the standard and so we rely on
> > > the standard taking care of the cache. I suspect the most efficient use
> > > may be via the power management mode page, which does everything
> > > automatically on timers (you just get to set the timer interval, plus
> > > some transports *may* require an initialising command which we already
> > > have some provision for) than doing it all ourselves from block.
> >
> > Well, yes, but we need to support modes of power management that cut off
> > power to the disk in any case, so what does it matter if we also do it for
> > runtime PM?
> >
> > Are you concerned about layering?
>
> It sounds like James is partly concerned about efficiency.
Sort of, but my main worry is correctness: I don't want a path in
runtime suspend that requires a cache flush to be dependent on the flush
being in a path which doesn't because efficiency dictates that at some
time or other the unnecessary flush will get removed (and then we'll
start corrupting data).
> If Lin
> Ming's patches are merged then we will be doing runtime suspend
> relatively often, not just when the device file is closed. The
> sd_suspend routine should know when SYNCHRONIZE CACHE is needed and
> when it can be skipped.
Keeping the flush in sd_suspend and making sure we know when to use it
would be fine by me as well ... I just need all the independent runtime
suspend patch authors to agree on this scheme.
> From what I gather of this discussion, we can avoid flushing the cache
> during (1) a runtime suspend provided (2) the drive isn't going to be
> powered down. If either (1) or (2) doesn't hold then the cache needs
> to be synchronized.
>
> The problem with relying on the internal timers and the power
> management mode page is that the transitions take place automatically
> and the host system doesn't know about them. We _want_ to know about
> them so that the higher layers of the device tree can go to low power
> when the disk does.
Sigh ... the standards guys didn't help there then, since SPC-4
specifically says there will be no notifications.
> On the other hand, perhaps sd_suspend/sd_resume could use the mode page
> by telling it to go into or out of Stopped mode immediately.
That's perfectly legal. Even if you use timer based power state
management afforded by the mode page you can still preempt the timer
with an explicit go into this power state command.
James
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-14 5:20 ` Aaron Lu
@ 2012-09-14 8:17 ` James Bottomley
2012-09-14 8:48 ` Aaron Lu
0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2012-09-14 8:17 UTC (permalink / raw)
To: Aaron Lu
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On Fri, 2012-09-14 at 13:20 +0800, Aaron Lu wrote:
> On Thu, Sep 13, 2012 at 10:26:44AM +0100, James Bottomley wrote:
> > On Thu, 2012-09-13 at 17:07 +0800, Aaron Lu wrote:
> > > So I think this is basically 2 things, one is the runtime suspend of the
> > > disk, another is when it is runtime suspended, how to remove its power.
> > > I'm currently doing the latter one, which is simpler, so I want to do it
> > > first :-)
> >
> > Well, I don't like the way the interaction of the patches is going.
> > You're the one proposing powering down the device outside of the
> > standards defined transitions, so you need to be responsible for the
> > actions that necessitates, including synchronizing the cache. The specs
>
> OK, I'll update the code.
>
> > (SPC-4) say that cache management is explicitly unnecessary for the
> > standard SCSI power states (Active, Idle, Standby and Stopped), so
>
> Just read the SPC-4 spec, in section 5.12.3, it has words like this:
>
> Logical units that contain cache memory shall write all cached data to
> the medium for the logical unit(e.g., as a logical unit would do in
> response to a SYNCHRONIZE CACHE command as described SBC-3) prior to
> entering into any power condition that prevents accessing the
> media(e.g., before a hard drive stops its spindle motor during a change
> to the standby power condition).
>
> So it looks like the cache needs to be synced before the device enters the
> standby/stopped power condition. Or am I missing something?
Um, no it says the device shall do the sync on its own (as though it
received a sync cache). That section says the device shall be
responsible for cache management in the power states.
> > someone at some point is going to read that and remove the unnecessary
> > cache sync in the code. When that happens, you'll start getting data
> > loss.
>
> Indeed, I'll make sure the cache gets synced when we are about to power off
> the device. Thanks for the reminder.
Great, thanks.
James
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-14 8:17 ` James Bottomley
@ 2012-09-14 8:48 ` Aaron Lu
2012-09-14 10:26 ` James Bottomley
0 siblings, 1 reply; 24+ messages in thread
From: Aaron Lu @ 2012-09-14 8:48 UTC (permalink / raw)
To: James Bottomley
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On 09/14/2012 04:17 PM, James Bottomley wrote:
>> Just read the SPC-4 spec, in section 5.12.3, it has words like this:
>>
>> Logical units that contain cache memory shall write all cached data to
>> the medium for the logical unit(e.g., as a logical unit would do in
>> response to a SYNCHRONIZE CACHE command as described SBC-3) prior to
>> entering into any power condition that prevents accessing the
>> media(e.g., before a hard drive stops its spindle motor during a change
>> to the standby power condition).
>>
>> So it looks like the cache needs to be synced before the device enters the
>> standby/stopped power condition. Or am I missing something?
>
> Um, no it says the device shall do the sync on its own (as though it
> received a sync cache). That section says the device shall be
> responsible for cache management in the power states.
Oh, I thought it was the host software's responsibility, thanks for the
explanation.
So if we program the device to let it enter standby/stopped power
condition with the start_stop_unit command, do we need to sync the
cache?
Thanks,
Aaron
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-14 8:48 ` Aaron Lu
@ 2012-09-14 10:26 ` James Bottomley
2012-09-14 13:54 ` Aaron Lu
0 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2012-09-14 10:26 UTC (permalink / raw)
To: Aaron Lu
Cc: Alan Stern, Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang,
Oliver Neukum, linux-scsi, linux-ide, linux-pm, linux-acpi
On Fri, 2012-09-14 at 16:48 +0800, Aaron Lu wrote:
> On 09/14/2012 04:17 PM, James Bottomley wrote:
> >> Just read the SPC-4 spec, in section 5.12.3, it has words like this:
> >>
> >> Logical units that contain cache memory shall write all cached data to
> >> the medium for the logical unit(e.g., as a logical unit would do in
> >> response to a SYNCHRONIZE CACHE command as described SBC-3) prior to
> >> entering into any power condition that prevents accessing the
> >> media(e.g., before a hard drive stops its spindle motor during a change
> >> to the standby power condition).
> >>
> >> So it looks like the cache needs to be synced before the device enters the
> >> standby/stopped power condition. Or am I missing something?
> >
> > Um, no it says the device shall do the sync on its own (as though it
> > received a sync cache). That section says the device shall be
> > responsible for cache management in the power states.
>
> Oh, I thought it was the host software's responsibility, thanks for the
> explanation.
>
> So if we program the device to let it enter standby/stopped power
> condition with the start_stop_unit command, do we need to sync the
> cache?
No, that's what the spec says. The device must manage the cache in both
the forced (start stop unit) and timed (power control mode page) cases.
The reason is the spec doesn't define what idle and standby actually
mean (just that they're "lower" power states). So the device
implementers get to choose if they stop the platter or power off the
motor. The spec just means that if they do anything that causes danger
to data in the cache, they have to deal with it themselves.
James
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-14 10:26 ` James Bottomley
@ 2012-09-14 13:54 ` Aaron Lu
2012-09-17 15:01 ` Aaron Lu
0 siblings, 1 reply; 24+ messages in thread
From: Aaron Lu @ 2012-09-14 13:54 UTC (permalink / raw)
To: James Bottomley, Alan Stern
Cc: Jeff Garzik, Aaron Lu, Jack Wang, Shane Huang, Oliver Neukum,
linux-scsi, linux-ide, linux-pm, linux-acpi
On Fri, Sep 14, 2012 at 11:26:29AM +0100, James Bottomley wrote:
> On Fri, 2012-09-14 at 16:48 +0800, Aaron Lu wrote:
> > So if we program the device to let it enter standby/stopped power
> > condition with the start_stop_unit command, do we need to sync the
> > cache?
>
> No, that's what the spec says. The device must manage the cache in both
> the forced (start stop unit) and timed (power control mode page) cases.
>
> The reason is the spec doesn't define what idle and standby actually
> mean (just that they're "lower" power states). So the device
> implementers get to choose if they stop the platter or power off the
> motor. The spec just means that if they do anything that causes danger
> to data in the cache, they have to deal with it themselves.
Thanks for the clear explanation.
So what about the following change? In sd_suspend, if the device supports
the start_stop command, we only need to issue that command, and both the
runtime suspend case and the system S3/S4 case are OK: once the device
enters the stopped power condition, it should have taken care of its
internal cache and is ready to be powered off. If the device does not
support the start_stop command, we sync the cache when we are doing S3/S4,
or a runtime suspend in which power is going to be removed.
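The sd_suspend decision described above can be modelled outside the kernel
as a small pure function. This is only a sketch of the proposal: the enum,
the struct and the function name are hypothetical, and the three fields
stand in for sdkp->device->manage_start_stop, sdkp->WCE and the proposed
may_power_off flag.

```c
#include <stdbool.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum sd_action {
	SD_ACT_NONE,		/* nothing to send to the device */
	SD_ACT_STOP_UNIT,	/* START STOP UNIT with START = 0 */
	SD_ACT_SYNC_CACHE,	/* SYNCHRONIZE CACHE */
};

struct sd_model {
	bool manage_start_stop;	/* device is managed with START STOP UNIT */
	bool wce;		/* write cache enabled */
	bool may_power_off;	/* power may be cut after runtime suspend */
};

/*
 * Pick the command for a suspend. runtime_suspend models
 * PMSG_IS_AUTO(mesg); system_sleep models mesg.event & PM_EVENT_SLEEP.
 */
static enum sd_action sd_pick_suspend_action(const struct sd_model *d,
					     bool runtime_suspend,
					     bool system_sleep)
{
	/* A stopped device is expected to manage its own cache. */
	if (d->manage_start_stop)
		return SD_ACT_STOP_UNIT;

	if (d->wce &&
	    ((runtime_suspend && d->may_power_off) || system_sleep))
		return SD_ACT_SYNC_CACHE;

	return SD_ACT_NONE;
}
```

Note that this model assumes the device really does flush its cache on
stop, which is the point under discussion for ATA devices.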
The sd_shutdown is changed accordingly, when device is already runtime
suspended:
1 If it supports start_stop, it should be in stopped power condition, no
more action required;
2 If it is already powered off, no more action required.
Otherwise, we runtime resume the device and sync cache for WCE device.
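The sd_shutdown rule in the list above, written as a standalone predicate.
This is an illustrative sketch; the function name is hypothetical, and the
parameters stand in for pm_runtime_suspended(dev), manage_start_stop, the
proposed powered_off flag and sdkp->WCE.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: does shutdown have to resume the device and
 * issue SYNCHRONIZE CACHE?
 */
static bool sd_shutdown_needs_sync(bool runtime_suspended,
				   bool manage_start_stop,
				   bool powered_off,
				   bool write_cache_enabled)
{
	/*
	 * Already runtime suspended and either stopped via START STOP UNIT
	 * or already without power: no more action required.
	 */
	if (runtime_suspended && (manage_start_stop || powered_off))
		return false;

	/* Otherwise only a write-cached device needs the flush. */
	return write_cache_enabled;
}
```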
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 4df73e5..760ce5b 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2845,18 +2845,18 @@ static void sd_shutdown(struct device *dev)
if (!sdkp)
return; /* this can happen */
- if (pm_runtime_suspended(dev))
+ if (pm_runtime_suspended(dev)
+ && (sdkp->device->manage_start_stop || sdkp->device->powered_off))
goto exit;
+ scsi_autopm_get_device(sdkp->device);
+
if (sdkp->WCE) {
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
sd_sync_cache(sdkp);
}
- if (system_state != SYSTEM_RESTART && sdkp->device->manage_start_stop) {
- sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n");
- sd_start_stop_device(sdkp, 0);
- }
+ scsi_autopm_put_device(sdkp->device);
exit:
scsi_disk_put(sdkp);
@@ -2870,16 +2870,18 @@ static int sd_suspend(struct device *dev, pm_message_t mesg)
if (!sdkp)
return 0; /* this can happen */
- if (sdkp->WCE) {
- sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
- ret = sd_sync_cache(sdkp);
- if (ret)
- goto done;
- }
-
- if ((mesg.event & PM_EVENT_SLEEP) && sdkp->device->manage_start_stop) {
+ if (sdkp->device->manage_start_stop) {
sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n");
ret = sd_start_stop_device(sdkp, 0);
+ goto done;
+ }
+
+ if (sdkp->WCE) {
+ if ((PMSG_IS_AUTO(mesg) && sdkp->device->may_power_off) ||
+ (mesg.event & PM_EVENT_SLEEP)) {
+ sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
+ ret = sd_sync_cache(sdkp);
+ }
}
done:
Thanks,
Aaron
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk
2012-09-14 13:54 ` Aaron Lu
@ 2012-09-17 15:01 ` Aaron Lu
0 siblings, 0 replies; 24+ messages in thread
From: Aaron Lu @ 2012-09-17 15:01 UTC (permalink / raw)
To: Jeff Garzik, James Bottomley, Alan Stern
Cc: Aaron Lu, Jack Wang, Shane Huang, Oliver Neukum, linux-scsi,
linux-ide, linux-pm, linux-acpi
On Fri, Sep 14, 2012 at 09:54:16PM +0800, Aaron Lu wrote:
> On Fri, Sep 14, 2012 at 11:26:29AM +0100, James Bottomley wrote:
> > On Fri, 2012-09-14 at 16:48 +0800, Aaron Lu wrote:
> > > So if we program the device to let it enter standby/stopped power
> > > condition with the start_stop_unit command, do we need to sync the
> > > cache?
> >
> > No, that's what the spec says. The device must manage the cache in both
> > the forced (start stop unit) and timed (power control mode page) cases.
> >
> > The reason is the spec doesn't define what idle and standby actually
> > mean (just that they're "lower" power states). So the device
> > implementers get to choose if they stop the platter or power off the
> > motor. The spec just means that if they do anything that causes danger
> > to data in the cache, they have to deal with it themselves.
>
> Thanks for the clear explanation.
>
> So what about the following change? In sd_suspend, if the device supports
> the start_stop command, we only need to issue that command, and both the
> runtime suspend case and the system S3/S4 case are OK: once the device
> enters the stopped power condition, it should have taken care of its
> internal cache and is ready to be powered off. And if device
This is not the case for ATA devices, so the SCSI stop command should
translate to two ATA commands: flush cache + enter standby.
If the cache flush is skipped in the sd driver, then the ATA SCSI
translation will need to be updated to address this issue.
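As a rough illustration of that translation, here is a standalone model of
the command sequence. The opcode values are the ones defined in
include/linux/ata.h; the function name and its boolean parameters are
hypothetical simplifications (in libata the decisions would come from the
device's flags and ID data).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ATA opcodes, same values as in include/linux/ata.h. */
#define ATA_CMD_FLUSH		0xE7
#define ATA_CMD_FLUSH_EXT	0xEA
#define ATA_CMD_STANDBYNOW1	0xE0

/*
 * Hypothetical helper: fill out[] with the ATA opcodes that a SCSI stop
 * should expand to, and return how many were written (1 or 2).
 */
static size_t ata_stop_sequence(bool needs_flush, bool has_flush_ext,
				uint8_t out[2])
{
	size_t n = 0;

	/* Flush first, so no dirty data is left when the disk spins down. */
	if (needs_flush)
		out[n++] = has_flush_ext ? ATA_CMD_FLUSH_EXT : ATA_CMD_FLUSH;

	out[n++] = ATA_CMD_STANDBYNOW1;
	return n;
}
```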
A possible change (not tested) looks like this:
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 8ec81ca..2de5fac 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1759,6 +1759,19 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
ata_qc_free(qc);
}
+static int ata_flush_cache(struct ata_device *dev)
+{
+ u8 cmd;
+
+ if (dev->flags & ATA_DFLAG_FLUSH_EXT)
+ cmd = ATA_CMD_FLUSH_EXT;
+ else
+ cmd = ATA_CMD_FLUSH;
+
+ return ata_do_simple_cmd(dev, cmd);
+}
+
+
/**
* ata_scsi_translate - Translate then issue SCSI command to ATA device
* @dev: ATA device to which the command is addressed
@@ -1816,6 +1829,13 @@ static int ata_scsi_translate(struct ata_device *dev, struct scsi_cmnd *cmd,
if (xlat_func(qc))
goto early_finish;
+ /* scsi stop cmd = flush cache + standby */
+ if (qc->tf.command == ATA_CMD_STANDBYNOW1 && ata_try_flush_cache(dev)) {
+ rc = ata_flush_cache(dev);
+ if (rc)
+ goto err_did;
+ }
+
if (ap->ops->qc_defer) {
if ((rc = ap->ops->qc_defer(qc)))
goto defer;
Thanks,
Aaron
^ permalink raw reply related [flat|nested] 24+ messages in thread
Thread overview: 24+ messages
2012-09-13 7:40 [PATCH 0/2] Support runtime power off of HDD Aaron Lu
2012-09-13 7:40 ` [PATCH 1/2] scsi: sd: set ready_to_power_off for scsi disk Aaron Lu
2012-09-13 8:14 ` James Bottomley
2012-09-13 8:23 ` Aaron Lu
2012-09-13 8:37 ` James Bottomley
2012-09-13 8:49 ` Aaron Lu
2012-09-13 8:56 ` James Bottomley
2012-09-13 9:07 ` Aaron Lu
2012-09-13 9:26 ` James Bottomley
2012-09-13 10:16 ` Oliver Neukum
2012-09-13 10:51 ` James Bottomley
2012-09-13 12:34 ` Oliver Neukum
2012-09-13 16:24 ` Alan Stern
2012-09-13 20:18 ` Oliver Neukum
2012-09-13 20:46 ` Alan Stern
2012-09-14 6:57 ` Aaron Lu
2012-09-14 8:15 ` James Bottomley
2012-09-14 5:20 ` Aaron Lu
2012-09-14 8:17 ` James Bottomley
2012-09-14 8:48 ` Aaron Lu
2012-09-14 10:26 ` James Bottomley
2012-09-14 13:54 ` Aaron Lu
2012-09-17 15:01 ` Aaron Lu
2012-09-13 7:40 ` [PATCH 2/2] libata: acpi: set can_power_off for both ODD and HDD Aaron Lu