Re: sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: sd_resume redundant?  [was: [PATCH] libata: implement ata_wait_after_reset()]
       [not found]         ` <fa.jSmqI/kJPff3zaVtC404yNL5/Qk@ifi.uio.no>
@ 2007-05-20 17:45           ` Robert Hancock
  2007-05-20 18:53             ` Randy Dunlap
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Hancock @ 2007-05-20 17:45 UTC (permalink / raw)
  To: Indan Zupancic, linux-kernel; +Cc: Tejun Heo

Indan Zupancic wrote:
>>> Everything seems to work fine without sd_resume(), so why is it needed?
>> Because not all disks spin up without being told to do so and like it or
>> not spinning disks up on resume is the default behavior.  As I wrote in
>> the other reply, it would be worthwhile to make it configurable.
> 
> Not even after they receive a read command? Ugh.

ATA disks are supposed to spin up, yes. SCSI disks require a command to 
tell them to spin up if they're in the "stopped" state.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: sd_resume redundant?  [was: [PATCH] libata: implement ata_wait_after_reset()]
  2007-05-20 17:45           ` sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()] Robert Hancock
@ 2007-05-20 18:53             ` Randy Dunlap
  2007-05-20 22:25               ` Robert Hancock
  0 siblings, 1 reply; 8+ messages in thread
From: Randy Dunlap @ 2007-05-20 18:53 UTC (permalink / raw)
  To: Robert Hancock, ide; +Cc: Indan Zupancic, linux-kernel, Tejun Heo

On Sun, 20 May 2007 11:45:03 -0600 Robert Hancock wrote:

> Indan Zupancic wrote:
> >>> Everything seems to work fine without sd_resume(), so why is it needed?
> >> Because not all disks spin up without being told to do so and like it or
> >> not spinning disks up on resume is the default behavior.  As I wrote in
> >> the other reply, it would be worthwhile to make it configurable.
> > 
> > Not even after they receive a read command? Ugh.
> 
> ATA disks are supposed to spin up, yes. SCSI disks require a command to 
> tell them to spin up if they're in the "stopped" state.

Good info, but linux-ide was dropped.  Is that due to lack of
reply-to-all or is it a newsgroup thing or what?


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: sd_resume redundant?  [was: [PATCH] libata: implement ata_wait_after_reset()]
  2007-05-20 18:53             ` Randy Dunlap
@ 2007-05-20 22:25               ` Robert Hancock
  0 siblings, 0 replies; 8+ messages in thread
From: Robert Hancock @ 2007-05-20 22:25 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: ide, Indan Zupancic, linux-kernel, Tejun Heo

Randy Dunlap wrote:
> On Sun, 20 May 2007 11:45:03 -0600 Robert Hancock wrote:
> 
>> Indan Zupancic wrote:
>>>>> Everything seems to work fine without sd_resume(), so why is it needed?
>>>> Because not all disks spin up without being told to do so and like it or
>>>> not spinning disks up on resume is the default behavior.  As I wrote in
>>>> the other reply, it would be worthwhile to make it configurable.
>>> Not even after they receive a read command? Ugh.
>> ATA disks are supposed to spin up, yes. SCSI disks require a command to 
>> tell them to spin up if they're in the "stopped" state.
> 
> Good info, but linux-ide was dropped.  Is that due to lack of
> reply-to-all or is it a newsgroup thing or what?

That would be a newsgroup thing. It seems that sometimes CCs get dropped 
when the posts are forwarded to fa.linux.kernel where I normally read them.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/


^ permalink raw reply	[flat|nested] 8+ messages in thread

* libata reset-seq merge broke sata_sil on sh
@ 2007-05-10  7:20 Paul Mundt
  2007-05-10 11:28 ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Mundt @ 2007-05-10  7:20 UTC (permalink / raw)
  To: jeff; +Cc: linux-ide, linux-kernel

Current git fails to boot via SATA on SH with the recent libata merge:

sata_sil 0000:00:01.0: cache line size not set.  Driver may not function
scsi0 : sata_sil
scsi1 : sata_sil
ata1: SATA max UDMA/100 cmd 0xfd000280 ctl 0xfd00028a bmdma 0xfd000200 irq 0
ata2: SATA max UDMA/100 cmd 0xfd0002c0 ctl 0xfd0002ca bmdma 0xfd000208 irq 0
ata1: device not ready (errno=-19), forcing hardreset
ata1: COMRESET failed (errno=-19)
ata1: reset failed (errno=-19), retrying in 10 secs
ata1: COMRESET failed (errno=-19)
ata1: reset failed (errno=-19), retrying in 10 secs
ata1: COMRESET failed (errno=-19)
ata1: reset failed (errno=-19), retrying in 35 secs
ata1: COMRESET failed (errno=-19)
ata1: reset failed, giving up
ata2: SATA link down (SStatus 0 SControl 310)
...

bisect points to the reset-seq merge as the bad change. Going back
a bit further from this, these are where the problems first started
showing up (but still managed to find the disk, which current git is
not able to).

At b8cffc6ad8c000410186815b7bcc6b76ef1bbb13 it's still working, even
though it's started complaining about the reset..

sata_sil 0000:00:01.0: version 2.2
sata_sil 0000:00:01.0: cache line size not set.  Driver may not function
scsi0 : sata_sil
scsi1 : sata_sil
ata1: SATA max UDMA/100 cmd 0xfd000280 ctl 0xfd00028a bmdma 0xfd000200 irq 0
ata2: SATA max UDMA/100 cmd 0xfd0002c0 ctl 0xfd0002ca bmdma 0xfd000208 irq 0
ata1: device not ready (errno=-19), forcing hardreset
ata1: COMRESET failed (errno=-19)
ata1: hardreset failed, retrying in 5 secs
ata1: COMRESET failed (errno=-19)
ata1: hardreset failed, retrying in 5 secs
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: ata_hpa_resize 1: sectors = 39070080, hpa_sectors = 39070080
ata1.00: ATA-5: HHD424020F7SV00, 00MLA0A5, max UDMA/100
ata1.00: 39070080 sectors, multi 0: LBA
ata1.00: applying bridge limits
ata1.00: ata_hpa_resize 1: sectors = 39070080, hpa_sectors = 39070080
ata1.00: configured for UDMA/100
ata2: SATA link down (SStatus 0 SControl 310)
isa bounce pool size: 16 pages
scsi 0:0:0:0: Direct-Access     ATA      HHD424020F7SV00  00ML PQ: 0 ANSI: 5
SCSI device sda: 39070080 512-byte hdwr sectors (20004 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sda: 39070080 512-byte hdwr sectors (20004 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 sda4
sd 0:0:0:0: Attached scsi disk sda

at 27c78b372d05e47bbd059c9bb003c6d716abff54 the retry time was changed
to 10 seconds, which still manages to find the device..

sata_sil 0000:00:01.0: cache line size not set.  Driver may not function
scsi0 : sata_sil
scsi1 : sata_sil
ata1: SATA max UDMA/100 cmd 0xfd000280 ctl 0xfd00028a bmdma 0xfd000200 irq 0
ata2: SATA max UDMA/100 cmd 0xfd0002c0 ctl 0xfd0002ca bmdma 0xfd000208 irq 0
ata1: device not ready (errno=-19), forcing hardreset
ata1: COMRESET failed (errno=-19)
ata1: reset failed (errno=-19), retrying in 10 secs
ata1: COMRESET failed (errno=-19)
ata1: reset failed (errno=-19), retrying in 10 secs
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: ata_hpa_resize 1: sectors = 39070080, hpa_sectors = 39070080
ata1.00: ATA-5: HHD424020F7SV00, 00MLA0A5, max UDMA/100
ata1.00: 39070080 sectors, multi 0: LBA
ata1.00: applying bridge limits
ata1.00: ata_hpa_resize 1: sectors = 39070080, hpa_sectors = 39070080
ata1.00: configured for UDMA/100
ata2: SATA link down (SStatus 0 SControl 310)
isa bounce pool size: 16 pages
scsi 0:0:0:0: Direct-Access     ATA      HHD424020F7SV00  00ML PQ: 0 ANSI: 5
SCSI device sda: 39070080 512-byte hdwr sectors (20004 MB)
sda: Write Protect is off
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sda: 39070080 512-byte hdwr sectors (20004 MB)
sda: Write Protect is off
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 sda4
sd 0:0:0:0: Attached scsi disk sda

However, if I go back before any of the reset changes were introduced,
things were working fine, and there were no problems with waiting for
the reset. Ideas?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: libata reset-seq merge broke sata_sil on sh
  2007-05-10  7:20 libata reset-seq merge broke sata_sil on sh Paul Mundt
@ 2007-05-10 11:28 ` Tejun Heo
  2007-05-10 11:53   ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2007-05-10 11:28 UTC (permalink / raw)
  To: Paul Mundt, jeff, linux-ide, linux-kernel

Hello,

Paul Mundt wrote:
[--snip, thanks a lot for the detailed report--]
> However, if I go back before any of the reset changes were introduced,
> things were working fine, and there were no problems with waiting for
> the reset. Ideas?

Hmm... It worked so well on all my sil's.  I'm a bit puzzled because we
also failed the same condition before the change too.  The only change
is we're less patient with the initial tries but in the end we give more
than enough time to the device to prepare itself.

* Does your drive start spun down when it boots?  Can you post dmesg
with printk timestamp turned on with kernel prior to reset-seq merge?

* In ata_bus_softreset() and sata_std_hardreset(), there's msleep(150)
delay before checking the status post-reset.  Does increasing the delay
make any difference?  Please try to increase it exponentially till it
reaches 10sec.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: libata reset-seq merge broke sata_sil on sh
  2007-05-10 11:28 ` Tejun Heo
@ 2007-05-10 11:53   ` Tejun Heo
  2007-05-10 12:46     ` Paul Mundt
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2007-05-10 11:53 UTC (permalink / raw)
  To: Paul Mundt, jeff, linux-ide, linux-kernel

Some more thoughts.

Tejun Heo wrote:
> Hello,
> 
> Paul Mundt wrote:
> [--snip, thanks a lot for the detailed report--]
>> However, if I go back before any of the reset changes were introduced,
>> things were working fine, and there were no problems with waiting for
>> the reset. Ideas?
> 
> Hmm... It worked so well on all my sil's.  I'm a bit puzzled because we
> also failed the same condition before the change too.  The only change
> is we're less patient with the initial tries but in the end we give more
> than enough time to the device to prepare itself.
> 
> * Does your drive start spun down when it boots?  Can you post dmesg
> with printk timestamp turned on with kernel prior to reset-seq merge?
> 
> * In ata_bus_softreset() and sata_std_hardreset(), there's msleep(150)
> delay before checking the status post-reset.  Does increasing the delay
> make any difference?  Please try to increase it exponentially till it
> reaches 10sec.

* According to the report, things still work till 27c78b37 - commit for
the actual merge and there's no related change till the current master
from there.  So, which commit actually breaks detection?  Or is
detection just flaky after b8cffc6a?

* How does things work on 9b89391c?

Also, please turn on printk timestamp on all reports so that we can now
the timeline of things.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: libata reset-seq merge broke sata_sil on sh
  2007-05-10 11:28 ` Tejun Heo
@ 2007-05-10 12:46     ` Paul Mundt
  2007-05-10 13:08       ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Mundt @ 2007-05-10 12:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: jeff, linux-ide, linux-kernel

On Thu, May 10, 2007 at 01:28:19PM +0200, Tejun Heo wrote:
> Paul Mundt wrote:
> [--snip, thanks a lot for the detailed report--]
> > However, if I go back before any of the reset changes were introduced,
> > things were working fine, and there were no problems with waiting for
> > the reset. Ideas?
> 
> * Does your drive start spun down when it boots?  Can you post dmesg
> with printk timestamp turned on with kernel prior to reset-seq merge?
> 
Yes, it's spun down at boot. I'll get the logs with the timestamps and
the results of the mdelay() incrementing in the morning when I've got the
board handy again.

On Thu, May 10, 2007 at 01:53:48PM +0200, Tejun Heo wrote:
> * According to the report, things still work till 27c78b37 - commit for
> the actual merge and there's no related change till the current master
> from there.  So, which commit actually breaks detection?  Or is
> detection just flaky after b8cffc6a?
> 
The detection is simply flaky after that point, however before the
current master it never hit the 35 second point (and thus never implied
that the link was down). I'll double check the bisect log to see if there
was anything beyond that that may have caused it.

The -ENODEV at least implies that the SRST fails, so at least that's a
starting point.

One other curious thing is that it seems to misreport the IRQ. In this
case the log indicates 0, whereas it's actually on IRQ 66. When the
system is booted, /proc/interrupts reflects that sata_sil is on the
proper IRQ (even when it's reported as 0 in the boot printk()).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: libata reset-seq merge broke sata_sil on sh
  2007-05-10 12:46     ` Paul Mundt
@ 2007-05-10 13:08       ` Tejun Heo
       [not found]         ` <20070511005217.GA23186@li <464B3505.20004@gmail.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2007-05-10 13:08 UTC (permalink / raw)
  To: Paul Mundt, Tejun Heo, jeff, linux-ide, linux-kernel

Paul Mundt wrote:
> Yes, it's spun down at boot. I'll get the logs with the timestamps and
> the results of the mdelay() incrementing in the morning when I've got the
> board handy again.

I see.  That's where the expected initial prereset failure comes from.

> On Thu, May 10, 2007 at 01:53:48PM +0200, Tejun Heo wrote:
>> * According to the report, things still work till 27c78b37 - commit for
>> the actual merge and there's no related change till the current master
>> from there.  So, which commit actually breaks detection?  Or is
>> detection just flaky after b8cffc6a?
>>
> The detection is simply flaky after that point, however before the
> current master it never hit the 35 second point (and thus never implied
> that the link was down). I'll double check the bisect log to see if there
> was anything beyond that that may have caused it.
> 
> The -ENODEV at least implies that the SRST fails, so at least that's a
> starting point.

If prereset() fails to get the initial DRDY before 10secs, it assumes
something went wrong and escalates to hardreset.  sil family of
controllers report 0xff status while the link is broken and it seems
that your particular drive needs more than the current 150ms to recover
phy link.  It probably went unnoticed till now because the device was
never hardreset before.  If the diagnosis is correct, increasing the
delay in hardreset should fix the problem.  Well, let's see.  :-)

> One other curious thing is that it seems to misreport the IRQ. In this
> case the log indicates 0, whereas it's actually on IRQ 66. When the
> system is booted, /proc/interrupts reflects that sata_sil is on the
> proper IRQ (even when it's reported as 0 in the boot printk()).

That's somthing I've missed during init model conversion.  It's just
printk problem.  I'll fix it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

[parent not found: <20070511005217.GA23186@li <464B3505.20004@gmail.com>]

* Re: [PATCH] libata: implement ata_wait_after_reset()
@ 2007-05-19 16:39           ` Indan Zupancic
  2007-05-19 18:43             ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Indan Zupancic @ 2007-05-19 16:39 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Paul Mundt, Tejun Heo, jeff, linux-ide, linux-kernel, garyhade

Hello,

Disabling sd_resume() gives me a "quick" resume again (few seconds),
though it doesn't get rid of the COMRESET errors:

[    2.476417] ata1: device not ready (errno=-19), forcing hardreset
[    2.476440] ata2: SATA link down (SStatus 0 SControl 310)
[    2.824896] usb_endpoint usbdev2.3_ep00: PM: resume from 0, parent 2-1 still 2
[    2.824902] usb 2-1:1.0: PM: resume from 2, parent 2-1 still 2
[    2.824906] usb 2-1:1.1: PM: resume from 2, parent 2-1 still 2
[    2.824910] usb 2-1:1.2: PM: resume from 2, parent 2-1 still 2
[    2.824914] usb_endpoint usbdev2.3_ep05: PM: resume from 0, parent 2-1:1.2 still 2
[    2.824919] usb_endpoint usbdev2.3_ep85: PM: resume from 0, parent 2-1:1.2 still 2
[    2.824923] usb_endpoint usbdev2.3_ep81: PM: resume from 0, parent 2-1:1.0 still 2
[    2.825392] Restarting tasks ... done.
[    2.931812] ata1: COMRESET failed (errno=-19)
[    2.931880] ata1: reset failed (errno=-19), retrying in 10 secs
[    3.136543] usb 2-1: USB disconnect, address 3
[    3.249382] usb 2-1: new full speed USB device using ohci_hcd and address 4
[    3.396768] usb 2-1: configuration #1 chosen from 1 choice
[    4.225560] usb 2-1: reset full speed USB device using ohci_hcd and address 4
[    4.372298] usbcore: registered new interface driver speedtch
[    4.456118] speedtch 2-1:1.0: found stage 1 firmware speedtch-1.bin.0.00
[    4.621056] speedtch 2-1:1.0: found stage 2 firmware speedtch-2.bin.0.00
[    9.153941] ATM dev 0: ADSL line is synchronising
[   12.918440] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   12.921971] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
[   12.924924] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648

Doesn't the controller generate an interrupt when it detects a harddisk?
Is it really needed to do polling?

Idea: What about implementing a sata_sil specific .check_status function,
instead of using the generic ata_check_status()? That one could check other
registers to find out what's really going on in the case ENODEV is returned
by ata_check_status(), or something like that.

Greetings,

Indan



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libata: implement ata_wait_after_reset()
  2007-05-19 16:39           ` [PATCH] libata: implement ata_wait_after_reset() Indan Zupancic
@ 2007-05-19 18:43             ` Tejun Heo
  2007-05-19 19:04               ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2007-05-19 18:43 UTC (permalink / raw)
  To: Indan Zupancic; +Cc: Paul Mundt, jeff, linux-ide, linux-kernel, garyhade

Indan Zupancic wrote:
> Doesn't the controller generate an interrupt when it detects a harddisk?
> Is it really needed to do polling?

It depends on the controller but the closest thing is usually PHY status
changed interrupt and PHY readiness doesn't imply device readiness.  On
some controllers, you can play smart and try to do these things by
interrupt but polling is much more reliable for this kind of stuff.

> Idea: What about implementing a sata_sil specific .check_status function,
> instead of using the generic ata_check_status()? That one could check other
> registers to find out what's really going on in the case ENODEV is returned
> by ata_check_status(), or something like that.

Yeah, if SCR registers are accessible, 0xff doesn't indicate the device
isn't there, so the whole skip-0xff logic probably shouldn't apply in
such cases, but we can also achieve pretty good result by just making
the first reset tries a bit more aggressive.

The first timeout of 10secs was chosen because most desktop drives take
somewhere around 7~10 secs to spin up, so retrying before that didn't
make much sense, but mobile devices usually spin up faster, so it might
be worth to try one more time inbetween.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libata: implement ata_wait_after_reset()
  2007-05-19 18:43             ` Tejun Heo
@ 2007-05-19 19:04               ` Tejun Heo
  2007-05-19 22:33                 ` sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()] Indan Zupancic
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2007-05-19 19:04 UTC (permalink / raw)
  To: Indan Zupancic, Paul Mundt; +Cc: jeff, linux-ide, linux-kernel, garyhade

[-- Attachment #1: Type: text/plain, Size: 553 bytes --]

Tejun Heo wrote:
> Yeah, if SCR registers are accessible, 0xff doesn't indicate the device
> isn't there, so the whole skip-0xff logic probably shouldn't apply in
> such cases, but we can also achieve pretty good result by just making
> the first reset tries a bit more aggressive.

So, here's the patch.

Paul, can you please test this patch without the previous patch?  Indan,
this should reduce the resume delay.  Please test.  But you'll still
feel some added delay compared to 2.6.20 due to the mentioned
suspend/resume change.

Thanks.

-- 
tejun

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 411 bytes --]

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index d5939e6..27ddc39 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3031,7 +3031,7 @@ int ata_wait_ready(struct ata_port *ap,
 
 		if (!(status & ATA_BUSY))
 			return 0;
-		if (status == 0xff)
+		if (!ata_port_online(ap) && status == 0xff)
 			return -ENODEV;
 		if (time_after(now, deadline))
 			return -EBUSY;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* sd_resume redundant?  [was: [PATCH] libata: implement  ata_wait_after_reset()]
  2007-05-19 19:04               ` Tejun Heo
@ 2007-05-19 22:33                 ` Indan Zupancic
  2007-05-20  9:54                   ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Indan Zupancic @ 2007-05-19 22:33 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Paul Mundt, jeff, linux-ide, linux-kernel, garyhade

On Sat, May 19, 2007 21:04, Tejun Heo wrote:
> Tejun Heo wrote:
>> Yeah, if SCR registers are accessible, 0xff doesn't indicate the device
>> isn't there, so the whole skip-0xff logic probably shouldn't apply in
>> such cases, but we can also achieve pretty good result by just making
>> the first reset tries a bit more aggressive.
>
> So, here's the patch.
>
> Paul, can you please test this patch without the previous patch?  Indan,
> this should reduce the resume delay.  Please test.  But you'll still
> feel some added delay compared to 2.6.20 due to the mentioned
> suspend/resume change.

This removed the COMRESET errors indeed, and with sd_resume()
disabled everything is speedy again (2s or so. Still a desktop pc).
I didn't try with sd_resume enabled.

Everything seems to work fine without sd_resume(), so why is it needed?

Greetings,

Indan



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: sd_resume redundant?  [was: [PATCH] libata: implement      ata_wait_after_reset()]
  2007-05-19 22:33                 ` sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()] Indan Zupancic
@ 2007-05-20  9:54                   ` Tejun Heo
  2007-05-20 14:27                     ` Indan Zupancic
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2007-05-20  9:54 UTC (permalink / raw)
  To: Indan Zupancic; +Cc: Paul Mundt, jeff, linux-ide, linux-kernel, garyhade

Indan Zupancic wrote:
> On Sat, May 19, 2007 21:04, Tejun Heo wrote:
>> Tejun Heo wrote:
>>> Yeah, if SCR registers are accessible, 0xff doesn't indicate the device
>>> isn't there, so the whole skip-0xff logic probably shouldn't apply in
>>> such cases, but we can also achieve pretty good result by just making
>>> the first reset tries a bit more aggressive.
>> So, here's the patch.
>>
>> Paul, can you please test this patch without the previous patch?  Indan,
>> this should reduce the resume delay.  Please test.  But you'll still
>> feel some added delay compared to 2.6.20 due to the mentioned
>> suspend/resume change.
> 
> This removed the COMRESET errors indeed, and with sd_resume()
> disabled everything is speedy again (2s or so. Still a desktop pc).
> I didn't try with sd_resume enabled.

Can you try to measure with sd_resume in place?

> Everything seems to work fine without sd_resume(), so why is it needed?

Because not all disks spin up without being told to do so and like it or
not spinning disks up on resume is the default behavior.  As I wrote in
the other reply, it would be worthwhile to make it configurable.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: sd_resume redundant?  [was: [PATCH] libata: implement       ata_wait_after_reset()]
  2007-05-20  9:54                   ` Tejun Heo
@ 2007-05-20 14:27                     ` Indan Zupancic
  2007-05-20 17:17                       ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Indan Zupancic @ 2007-05-20 14:27 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Paul Mundt, jeff, linux-ide, linux-kernel, garyhade

On Sun, May 20, 2007 11:54, Tejun Heo wrote:
> Indan Zupancic wrote:
>> On Sat, May 19, 2007 21:04, Tejun Heo wrote:
>>> Tejun Heo wrote:
>>>> Yeah, if SCR registers are accessible, 0xff doesn't indicate the device
>>>> isn't there, so the whole skip-0xff logic probably shouldn't apply in
>>>> such cases, but we can also achieve pretty good result by just making
>>>> the first reset tries a bit more aggressive.
>>> So, here's the patch.
>>>
>>> Paul, can you please test this patch without the previous patch?  Indan,
>>> this should reduce the resume delay.  Please test.  But you'll still
>>> feel some added delay compared to 2.6.20 due to the mentioned
>>> suspend/resume change.
>>
>> This removed the COMRESET errors indeed, and with sd_resume()
>> disabled everything is speedy again (2s or so. Still a desktop pc).
>> I didn't try with sd_resume enabled.
>
> Can you try to measure with sd_resume in place?

[    2.173366] sd 0:0:0:0: [sda] Starting disk
[    2.475422] ata2: SATA link down (SStatus 0 SControl 310)
[    5.478403] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[    5.481928] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
[    5.485904] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
[    5.485913] ata1.00: configured for UDMA/100
[    5.505109] sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
[    5.505461] sd 0:0:0:0: [sda] Write Protect is off
[    5.505465] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    5.505612] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or
FUA
...
[    6.157259] Restarting tasks ... done.


And with echo 0 > /sys/class/scsi_disk/0\:0\:0\:0/manage_start_stop:

[    2.476476] ata2: SATA link down (SStatus 0 SControl 310)
...
[    2.825479] Restarting tasks ... done.
...
[    5.022076] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[    5.025605] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
[    5.028594] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
[    5.028606] ata1.00: configured for UDMA/100
[    5.028720] sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
[    5.028767] sd 0:0:0:0: [sda] Write Protect is off
[    5.028773] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    5.028831] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or
FUA

So over all it takes half a second longer to detect the disk, but because everything waits on it,
it takes more than three seconds longer to resume.

Setting manage_start_stop to 0 fixes it and is good enough for me, I didn't notice anything bad yet
because of the unmanaged stop. Implementing background spin up will fix it too.


>> Everything seems to work fine without sd_resume(), so why is it needed?
>
> Because not all disks spin up without being told to do so and like it or
> not spinning disks up on resume is the default behavior.  As I wrote in
> the other reply, it would be worthwhile to make it configurable.

Not even after they receive a read command? Ugh.

Greeting,

Indan



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: sd_resume redundant?  [was: [PATCH] libata: implement        ata_wait_after_reset()]
  2007-05-20 14:27                     ` Indan Zupancic
@ 2007-05-20 17:17                       ` Tejun Heo
  2007-05-20 19:47                         ` Indan Zupancic
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2007-05-20 17:17 UTC (permalink / raw)
  To: Indan Zupancic; +Cc: Paul Mundt, jeff, linux-ide, linux-kernel, garyhade

Indan Zupancic wrote:
>> Can you try to measure with sd_resume in place?
> 
> [    2.173366] sd 0:0:0:0: [sda] Starting disk
> [    2.475422] ata2: SATA link down (SStatus 0 SControl 310)
> [    5.478403] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [    5.481928] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
> [    5.485904] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
> [    5.485913] ata1.00: configured for UDMA/100
> [    5.505109] sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
> [    5.505461] sd 0:0:0:0: [sda] Write Protect is off
> [    5.505465] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [    5.505612] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or
> FUA
> ...
> [    6.157259] Restarting tasks ... done.
> 
> 
> And with echo 0 > /sys/class/scsi_disk/0\:0\:0\:0/manage_start_stop:
> 
> [    2.476476] ata2: SATA link down (SStatus 0 SControl 310)
> ...
> [    2.825479] Restarting tasks ... done.
> ...
> [    5.022076] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [    5.025605] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
> [    5.028594] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
> [    5.028606] ata1.00: configured for UDMA/100
> [    5.028720] sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
> [    5.028767] sd 0:0:0:0: [sda] Write Protect is off
> [    5.028773] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [    5.028831] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or
> FUA
> 
> So over all it takes half a second longer to detect the disk, but
> because everything waits on it, it takes more than three seconds
> longer to resume.

Eeeek.  Extra three secs doesn't sound too hot.  :-(

> Setting manage_start_stop to 0 fixes it and is good enough for me, I
> didn't notice anything bad yet because of the unmanaged
> stop. Implementing background spin up will fix it too.

Just commenting out sd_resume() would be a better solution for your
case tho.

>>> Everything seems to work fine without sd_resume(), so why is it needed?
>> Because not all disks spin up without being told to do so and like it or
>> not spinning disks up on resume is the default behavior.  As I wrote in
>> the other reply, it would be worthwhile to make it configurable.
> 
> Not even after they receive a read command? Ugh.

After receiving a command which requires media access, they do.  What
I was saying is that the current default behavior is to spin up all
devices on resume and part of that is achieved by sd_resume().

Hmmm... skipping START_STOP during sd_resume() actually is a pretty
good solution for ATA devices.  I'll think about it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: sd_resume redundant?  [was: [PATCH] libata: implement           ata_wait_after_reset()]
  2007-05-20 17:17                       ` Tejun Heo
@ 2007-05-20 19:47                         ` Indan Zupancic
  0 siblings, 0 replies; 8+ messages in thread
From: Indan Zupancic @ 2007-05-20 19:47 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Paul Mundt, jeff, linux-ide, linux-kernel, garyhade

On Sun, May 20, 2007 19:17, Tejun Heo wrote:
> Indan Zupancic wrote:
>> So over all it takes half a second longer to detect the disk, but
>> because everything waits on it, it takes more than three seconds
>> longer to resume.
>
> Eeeek.  Extra three secs doesn't sound too hot.  :-(

Hey, without your reset fix it was 11 seconds slower. ;-)

Well, it used to be around 1.7 seconds to resume, and now (after the fixes)
it's 2.8 (according to the dmesg timestamps). But the first timestamp printed
was 0.7, and now it's 2.1, so I don't know if it's a real slowdown or that the timer
is set correctly earlier. That was with 2.19 btw, I don't know what happened in
the meantime. I suspect the timer changes. To know for sure I'd need to boot
an older kernel and measure resume time with a stopwatch.

>> Setting manage_start_stop to 0 fixes it and is good enough for me, I
>> didn't notice anything bad yet because of the unmanaged
>> stop. Implementing background spin up will fix it too.
>
> Just commenting out sd_resume() would be a better solution for your
> case tho.

I like your idea below more. ;-)

>>>> Everything seems to work fine without sd_resume(), so why is it needed?
>>> Because not all disks spin up without being told to do so and like it or
>>> not spinning disks up on resume is the default behavior.  As I wrote in
>>> the other reply, it would be worthwhile to make it configurable.
>>
>> Not even after they receive a read command? Ugh.
>
> After receiving a command which requires media access, they do.  What
> I was saying is that the current default behavior is to spin up all
> devices on resume and part of that is achieved by sd_resume().
>
> Hmmm... skipping START_STOP during sd_resume() actually is a pretty
> good solution for ATA devices.  I'll think about it.

I can't speak for others, but for me this would fix all regressions and even
disk spin up seems half a second faster without the START_STOP.
It would also solve the laptopmode thing, not unnecessarily spinning up
disks that don't need to be (if the hardware doesn't do it).

Greetings,

Indan

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-05-20 22:25 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <fa.M0o+I253i7tt/LF97sml0AM+Gec@ifi.uio.no>
     [not found] ` <fa.j/gDhS9FZwWkX/stLhREkTHNbFs@ifi.uio.no>
     [not found]   ` <fa.EalQUMJMBHX/KTGV2QFQLG6tYus@ifi.uio.no>
     [not found]     ` <fa.Zh3ttZJ9A5tObhEjMLSALUpxCuQ@ifi.uio.no>
     [not found]       ` <fa.tQ9LHpM0TOYAYEug36HHMvEVOO0@ifi.uio.no>
     [not found]         ` <fa.jSmqI/kJPff3zaVtC404yNL5/Qk@ifi.uio.no>
2007-05-20 17:45           ` sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()] Robert Hancock
2007-05-20 18:53             ` Randy Dunlap
2007-05-20 22:25               ` Robert Hancock
2007-05-10  7:20 libata reset-seq merge broke sata_sil on sh Paul Mundt
2007-05-10 11:28 ` Tejun Heo
2007-05-10 11:53   ` Tejun Heo
2007-05-10 12:46     ` Paul Mundt
2007-05-10 13:08       ` Tejun Heo
     [not found]         ` <20070511005217.GA23186@li <464B3505.20004@gmail.com>
2007-05-19 16:39           ` [PATCH] libata: implement ata_wait_after_reset() Indan Zupancic
2007-05-19 18:43             ` Tejun Heo
2007-05-19 19:04               ` Tejun Heo
2007-05-19 22:33                 ` sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()] Indan Zupancic
2007-05-20  9:54                   ` Tejun Heo
2007-05-20 14:27                     ` Indan Zupancic
2007-05-20 17:17                       ` Tejun Heo
2007-05-20 19:47                         ` Indan Zupancic

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).