linux-ide.vger.kernel.org archive mirror
* smartd causing SATA timeouts on sleeping drives
@ 2007-10-06  1:38 Andrew Paprocki
  2007-10-06 20:15 ` Tejun Heo
  2007-10-10 19:46 ` Bruce Allen
  0 siblings, 2 replies; 13+ messages in thread
From: Andrew Paprocki @ 2007-10-06  1:38 UTC (permalink / raw)
  To: linux-ide; +Cc: Tejun Heo, Bruce Allen

Tejun/Bruce,

I tracked down the source of timeouts I have been frequently getting.
It appears smartd is not properly handling drives that are spun down
by the BIOS ACPI settings. I have SATA timeouts which occur every half
hour (the default -i 1800 in smartd) that do not occur when smartd is
not running. The drives smartd is configured to look at have a sleep
time configured in the BIOS. When the drives are asleep, I get a soft
reset every half hour as smartd attempts to access the drives. While
in this state, smartd also reports bad state to syslog (e.g.
temperature changes to 200C). Just for comparison, hddtemp knows the
drives are sleeping:

# hddtemp /dev/sda
/dev/sda: Hitachi HDS721010KLA330                 : drive is sleeping
# ls /storage
... wakes up the drives ...
# hddtemp /dev/sda
/dev/sda: Hitachi HDS721010KLA330                 :  29 C or  F

I'm pasting the example cmd / timeout error / soft reset below. Also,
I'm pasting the invalid settings which smartd detects when in this
state. What needs to change for smartd to recognize drives are
sleeping and either not perform its checks, or forcefully wake them up
to perform them? (Should that be a configuration parameter in smartd?)

Thanks,
-Andrew

# uname -a
Linux (none) 2.6.22.6 #5 Mon Sep 10 02:15:22 EDT 2007 i586 unknown
(Using sata_sil on 3114 chips)

# smartctl -V
smartmontools release 5.38 dated 2006/12/20 at 20:37:59 UTC
...
smartctl compile dated Sep 17 2007 at 13:47:25
(repository code checked out on Sep 17th)

# cat /var/run/smartd.conf
/dev/sda -d ata -a -S on -s (S/../.././02|L/../../6/03)
/dev/sdb -d ata -a -S on -s (S/../.././02|L/../../6/03)

What happens every 30 minutes when drives are sleeping:

Oct  6 01:05:48 (none) user.err kernel: ata2.00: exception Emask 0x0
SAct 0x0 SErr 0x0 action 0x2 frozen
Oct  6 01:05:48 (none) user.err kernel: ata2.00: cmd
b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
Oct  6 01:05:48 (none) user.warn kernel:          res
40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct  6 01:05:53 (none) user.warn kernel: ata2: port is slow to
respond, please be patient (Status 0xd0)
Oct  6 01:05:55 (none) user.info kernel: ata2: soft resetting port
Oct  6 01:05:56 (none) user.info kernel: ata2: SATA link up 1.5 Gbps
(SStatus 113 SControl 310)
Oct  6 01:05:56 (none) user.info kernel: ata2.00: configured for UDMA/100
Oct  6 01:05:56 (none) user.info kernel: ata2: EH complete
Oct  6 01:05:56 (none) user.notice kernel: sd 1:0:0:0: [sdb]
1953525168 512-byte hardware sectors (1000205 MB)
Oct  6 01:05:56 (none) user.notice kernel: sd 1:0:0:0: [sdb] Write
Protect is off
Oct  6 01:05:56 (none) user.debug kernel: sd 1:0:0:0: [sdb] Mode
Sense: 00 3a 00 00
Oct  6 01:05:56 (none) user.notice kernel: sd 1:0:0:0: [sdb] Write
cache: enabled, read cache: enabled, doesn't support DPO or FUA

Invalid attribute values:

Oct  2 22:35:21 (none) daemon.info smartd[585]: Device: /dev/sda,
SMART Prefailure Attribute: 7 Seek_Error_Rate changed from 87 to 86
Oct  2 23:35:21 (none) daemon.info smartd[585]: Device: /dev/sda,
SMART Prefailure Attribute: 7 Seek_Error_Rate changed from 86 to 85
Oct  5 20:05:56 (none) daemon.info smartd[585]: Device: /dev/sdb,
SMART Prefailure Attribute: 3 Spin_Up_Time changed from 84 to 85
Oct  6 01:05:38 (none) daemon.info smartd[585]: Device: /dev/sda,
SMART Usage Attribute: 194 Temperature_Celsius changed from 200 to 206
Oct  6 01:05:56 (none) daemon.info smartd[585]: Device: /dev/sdb,
SMART Usage Attribute: 194 Temperature_Celsius changed from 193 to 200

Once the drives are started up, those values report:

  3 Spin_Up_Time            0x0007   085   085   024    Pre-fail
Always       -       821 (Average 820)
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail
Always       -       0
194 Temperature_Celsius     0x0002   193   193   000    Old_age
Always       -       31 (Lifetime Min/Max 24/67)
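
For reference, the "cmd" line in the kernel log above can be decoded by hand. The sketch below assumes the field order libata uses when it prints a taskfile (command/feature:nsect:lbal:lbam:lbah, then five HOB bytes, then device). The function names and the small lookup tables are illustrative only and are not part of any tool mentioned in this thread.

```python
# Illustrative decoder for the taskfile string libata prints on an exception.
# Assumed field order: command/feature:nsect:lbal:lbam:lbah/<5 HOB bytes>/device

# Only the handful of opcodes relevant to this thread; not a complete table.
ATA_COMMANDS = {0xB0: "SMART", 0xE5: "CHECK POWER MODE", 0xE6: "SLEEP"}
SMART_FEATURES = {0xD0: "READ DATA", 0xDA: "RETURN STATUS"}


def decode_taskfile(text):
    """Split a string such as 'b0/da:00:00:4f:c2/00:00:00:00:00/00' into named bytes."""
    command, low, hob, device = text.strip().split("/")
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    return {
        "command": int(command, 16),
        "feature": feature,
        "nsect": nsect,
        "lbal": lbal,
        "lbam": lbam,
        "lbah": lbah,
        "hob": [int(b, 16) for b in hob.split(":")],
        "device": int(device, 16),
    }


def describe(text):
    """Return a human-readable name for the command in a 'cmd' taskfile string."""
    tf = decode_taskfile(text)
    name = ATA_COMMANDS.get(tf["command"], "unknown (0x%02x)" % tf["command"])
    if tf["command"] == 0xB0:
        # For SMART, the feature register selects the subcommand.
        sub = SMART_FEATURES.get(tf["feature"], "subcommand 0x%02x" % tf["feature"])
        name = "%s %s" % (name, sub)
    return name


print(describe("b0/da:00:00:4f:c2/00:00:00:00:00/00"))
```

For the log above this gives SMART RETURN STATUS, with the 0x4f/0xc2 SMART signature in the LBA mid/high bytes. Note that on the "res" line the first two bytes are the status and error registers rather than command and feature, so describe() should only be applied to "cmd" lines.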


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-06  1:38 smartd causing SATA timeouts on sleeping drives Andrew Paprocki
@ 2007-10-06 20:15 ` Tejun Heo
  2007-10-08  5:51   ` Andrew Paprocki
  2007-10-10 19:42   ` Bruce Allen
  2007-10-10 19:46 ` Bruce Allen
  1 sibling, 2 replies; 13+ messages in thread
From: Tejun Heo @ 2007-10-06 20:15 UTC (permalink / raw)
  To: Andrew Paprocki; +Cc: linux-ide, Bruce Allen

Andrew Paprocki wrote:
> Tejun/Bruce,
> 
> I tracked down the source of timeouts I have been frequently getting.
> It appears smartd is not properly handling drives that are spun down
> by the BIOS ACPI settings. I have SATA timeouts which occur every half
> hour (the default -i 1800 in smartd) that do not occur when smartd is
> not running. The drives smartd is configured to look at have a sleep
> time configured in the BIOS. When the drives are asleep, I get a soft
> reset every half hour as smartd attempts to access the drives. While
> in this state, smartd also reports bad state to syslog (e.g.
> temperature changes to 200C). Just for comparison, hddtemp knows the
> drives are sleeping:
> 
> # hddtemp /dev/sda
> /dev/sda: Hitachi HDS721010KLA330                 : drive is sleeping
> # ls /storage
> ... wakes up the drives ...
> # hddtemp /dev/sda
> /dev/sda: Hitachi HDS721010KLA330                 :  29 C or  F
> 
> I'm pasting the example cmd / timeout error / soft reset below. Also,
> I'm pasting the invalid settings which smartd detects when in this
> state. What needs to change for smartd to recognize drives are
> sleeping and either not perform its checks, or forcefully wake them up
> to perform them? (Should that be a configuration parameter in smartd?)

smartd should probably issue CHECK POWER MODE (0xe5) before issuing
other commands.  Bruce?

Thanks.

-- 
tejun


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-06 20:15 ` Tejun Heo
@ 2007-10-08  5:51   ` Andrew Paprocki
  2007-10-08  6:06     ` Tejun Heo
  2007-10-10 19:42   ` Bruce Allen
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Paprocki @ 2007-10-08  5:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Bruce Allen

I found out after posting that this is governed by the -n parameter to
smartd. The default behavior is "-n never", which means smartd will
send the cmds regardless of the drive status. The man page indicates
that this may cause the drive to spin up to answer the cmds. It appears
that for some drives (?) the cmds just time out and libata performs a
soft reset. I'm going to change my setup to "-n standby", but it seems
strange to me that "-n never" is the default if it has such a drastic
result (at least under Linux). Is there any way to know if the drive
will actually spin up as a result of the cmd instead of timing out?

On 10/6/07, Tejun Heo <htejun@gmail.com> wrote:
> smartd should probably issue CHECK POWER MODE (0xe5) before issuing
> other commands.  Bruce?


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-08  5:51   ` Andrew Paprocki
@ 2007-10-08  6:06     ` Tejun Heo
  2007-10-08  6:32       ` Andrew Paprocki
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2007-10-08  6:06 UTC (permalink / raw)
  To: Andrew Paprocki; +Cc: linux-ide, Bruce Allen

Andrew Paprocki wrote:
> I found out after posting that this is governed by the -n parameter to
> smartd. The default behavior is "-n never" which means smartd will
> send the cmds regardless of the drive status. The man page indicates
> that may cause the drive to spin-up to answer the cmds. It appears for
> some drives (?) the cmds just timeout and libata performs a soft
> reset. I'm going to change my setup to "-n standby", but it seems
> strange to me that "-n never" is the default if it has this drastic of
> a result (at least under Linux). Is there any way to know if the drive
> will actually spin up as a result of the cmd instead of timing out?

If in standby mode, the drive would automatically spin up to process
the command.  If in sleep mode, it needs SRST to spin back up.  Was your
drive in sleep mode?

-- 
tejun
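
The standby/sleep distinction above can be made concrete with a small sketch. It assumes the sector-count values CHECK POWER MODE (0xe5) returns in the ATA command set (0x00 standby, 0x80 idle, 0xff active or idle) and models "no reply" as None. The function names and the skip policy are hypothetical, loosely modelled on the "-n standby" idea discussed in this thread; this is not smartd's code.

```python
# Sector-count values returned by CHECK POWER MODE (0xe5).
POWER_MODES = {0x00: "STANDBY", 0x80: "IDLE", 0xFF: "ACTIVE or IDLE"}


def classify_power_mode(sector_count):
    """Map a CHECK POWER MODE result to a name.

    sector_count is None when the command got no reply at all (for example
    it timed out), which is what a drive in SLEEP mode would look like.
    """
    if sector_count is None:
        return "NO RESPONSE"
    return POWER_MODES.get(sector_count, "UNKNOWN (0x%02x)" % sector_count)


def should_poll(sector_count, skip_modes=("STANDBY", "NO RESPONSE")):
    """Hypothetical policy: only send further commands if the drive is awake."""
    return classify_power_mode(sector_count) not in skip_modes


for value in (0xFF, 0x80, 0x00, None):
    print(classify_power_mode(value), should_poll(value))
```

A drive in standby still answers the query, so a monitor can skip it cleanly. A drive in sleep mode does not answer, so the query itself is what times out.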


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-08  6:06     ` Tejun Heo
@ 2007-10-08  6:32       ` Andrew Paprocki
  2007-10-10 19:39         ` Bruce Allen
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Paprocki @ 2007-10-08  6:32 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Bruce Allen

Yes, the drives were in sleep mode. That is the only case where these
timeouts/resets occur. It seems like the "-n never" mode of smartd
should send the SRST if the drive is truly sleeping, otherwise libata
will soft reset the drive when it sees the timeout. The "-n standby"
option sounds like a more sane default, but there might be legacy
reasons why it isn't configured that way.

On 10/8/07, Tejun Heo <htejun@gmail.com> wrote:
> Andrew Paprocki wrote:
> > I found out after posting that this is governed by the -n parameter to
> > smartd. The default behavior is "-n never" which means smartd will
> > send the cmds regardless of the drive status. The man page indicates
> > that may cause the drive to spin-up to answer the cmds. It appears for
> > some drives (?) the cmds just timeout and libata performs a soft
> > reset. I'm going to change my setup to "-n standby", but it seems
> > strange to me that "-n never" is the default if it has this drastic of
> > a result (at least under Linux). Is there any way to know if the drive
> > will actually spin up as a result of the cmd instead of timing out?
>
> If in standby mode, the drive would automatically spin up to process
> command.  If in sleep mode, it needs SRST to spin back up.  Was your
> drive in sleep mode?


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-08  6:32       ` Andrew Paprocki
@ 2007-10-10 19:39         ` Bruce Allen
  2007-10-11  2:02           ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Bruce Allen @ 2007-10-10 19:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Smartmontools Developers List, Smartmontools Mailing List,
	Andrew Paprocki, linux-ide

Tejun,

Hmm, it sounds as if smartmontools should send a SRST to spin up the 
drive, but I do not know enough to be sure.

Could I add you to the developers list and give you CVS write access? 
This might make it easier for you to fix the various little smartmontools 
problems like this that keep cropping up!  Just fixing the code might be a 
lot faster than explaining it and sending patches...

If this is OK with you, please send me your sourceforge username, and I'll 
add you to the developers list.

Cheers,
 	Bruce


On Mon, 8 Oct 2007, Andrew Paprocki wrote:

> Yes, the drives were in sleep mode. That is the only case where these
> timeouts/resets occur. It seems like the "-n never" mode of smartd
> should send the SRST if the drive is truly sleeping, otherwise libata
> will soft reset the drive when it sees the timeout. The "-n standby"
> option sounds like a more sane default, but there might be legacy
> reasons why it isn't configured that way.
>
> On 10/8/07, Tejun Heo <htejun@gmail.com> wrote:
>> Andrew Paprocki wrote:
>>> I found out after posting that this is governed by the -n parameter to
>>> smartd. The default behavior is "-n never" which means smartd will
>>> send the cmds regardless of the drive status. The man page indicates
>>> that may cause the drive to spin-up to answer the cmds. It appears for
>>> some drives (?) the cmds just timeout and libata performs a soft
>>> reset. I'm going to change my setup to "-n standby", but it seems
>>> strange to me that "-n never" is the default if it has this drastic of
>>> a result (at least under Linux). Is there any way to know if the drive
>>> will actually spin up as a result of the cmd instead of timing out?
>>
>> If in standby mode, the drive would automatically spin up to process
>> command.  If in sleep mode, it needs SRST to spin back up.  Was your
>> drive in sleep mode?
>


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-06 20:15 ` Tejun Heo
  2007-10-08  5:51   ` Andrew Paprocki
@ 2007-10-10 19:42   ` Bruce Allen
  1 sibling, 0 replies; 13+ messages in thread
From: Bruce Allen @ 2007-10-10 19:42 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Andrew Paprocki, linux-ide

<SNIP>

> smartd should probably issue CHECK POWER MODE (0xe5) before issuing 
> other commands.  Bruce?

Hi Tejun,

Yes, this seems very reasonable.

I hope you say 'yes' to my offer from five minutes ago.

Cheers,
 	Bruce


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-06  1:38 smartd causing SATA timeouts on sleeping drives Andrew Paprocki
  2007-10-06 20:15 ` Tejun Heo
@ 2007-10-10 19:46 ` Bruce Allen
  1 sibling, 0 replies; 13+ messages in thread
From: Bruce Allen @ 2007-10-10 19:46 UTC (permalink / raw)
  To: Andrew Paprocki; +Cc: linux-ide, Tejun Heo

Andrew,

I forgot to say 'thank you' for tracking this down.

Thank you!

Cheers,
 	Bruce


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-10 19:39         ` Bruce Allen
@ 2007-10-11  2:02           ` Tejun Heo
  2007-10-11  2:46             ` Andrew Paprocki
                               ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Tejun Heo @ 2007-10-11  2:02 UTC (permalink / raw)
  To: Bruce Allen
  Cc: Smartmontools Developers List, Smartmontools Mailing List,
	Andrew Paprocki, linux-ide

Hello, Bruce.

Bruce Allen wrote:
> Hmm, it sounds as if smartmontools should send a SRST to spin up the
> drive, but I do not know enough to be sure.

Eh... now that I think about it, I don't think there's a way to work
around this from userland.  smartmontools doesn't know the current power
mode (a sleeping drive doesn't even respond to CHECK POWER MODE), so it
can't determine whether the device needs SRST or not, and issuing SRST
unconditionally would cause a lot more problems.

Maybe what should be done is to track sleep mode in libata and issue
SRST automatically if a command is issued to a sleeping drive.  I'll
work on it.

> Could I add you to the developers list and give you CVS write access?
> This might make it easier for you to fix the various little
> smartmontools problems like this that keep cropping up!  Just fixing the
> code might be a lot faster than explaining it and sending patches...
> 
> If this is OK with you, please send me your sourceforge username, and
> I'll add you to the developers list.

I'll stay chicken for the time being and only send patches.  :-)

Thanks.

-- 
tejun
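
The idea of tracking sleep mode in the driver and resetting before the next command can be shown with a toy model. This is only a sketch of the concept in Python, not libata code. The class and method names are hypothetical, and it assumes the driver sees the SLEEP command (0xe6) when it is issued, which may not hold if something else puts the drive to sleep.

```python
ATA_CMD_SLEEP = 0xE6  # ATA SLEEP opcode


class ToyPort:
    """Toy model: remember that the device was put to sleep, reset before reuse."""

    def __init__(self):
        self.sleeping = False
        self.log = []

    def soft_reset(self):
        # Stand-in for SRST; a sleeping drive needs this before it will
        # accept commands again.
        self.log.append("SRST")
        self.sleeping = False

    def issue(self, command):
        if self.sleeping:
            # Wake the drive first instead of letting the command time out.
            self.soft_reset()
        self.log.append("cmd 0x%02x" % command)
        if command == ATA_CMD_SLEEP:
            self.sleeping = True


port = ToyPort()
port.issue(ATA_CMD_SLEEP)  # drive goes to sleep
port.issue(0xB0)           # SMART command: reset is issued first
port.issue(0xB0)           # drive already awake: no reset
print(port.log)
```

The reset happens exactly once, before the first command that follows the sleep, so the caller never has to wait for a timeout.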


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-11  2:02           ` Tejun Heo
@ 2007-10-11  2:46             ` Andrew Paprocki
  2007-10-11  3:06               ` Tejun Heo
  2007-10-11  4:00             ` Andrew Paprocki
  2007-10-12  9:15             ` [smartmontools-devel] " Bruce Allen
  2 siblings, 1 reply; 13+ messages in thread
From: Andrew Paprocki @ 2007-10-11  2:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Bruce Allen, linux-ide

On 10/10/07, Tejun Heo <htejun@gmail.com> wrote:
> Maybe what should be done is to track sleep mode in libata and issue
> SRST automatically if a command is issued to a sleeping drive.  I'll
> work on it.

Another tidbit of info.. I just went through the pain of tracking down
everything in my system (system apps as well as my own code)
responsible for waking up sleeping drives. My end goal was to make
sure sleeping drives stayed asleep to reduce power consumption and
wear due to unnecessary spin-ups. I'm sure distros targeting laptops
or embedded systems that use live disks go through this pain
frequently.

Would all SRST cmds sent from libata come from the ata_std_softreset()
call? Could something like SystemTap be used without modifying libata
to track all pids which cause that function to be called? If that
would work, it could be an easy way to do what I did manually. That
is, unless someone knows of an easier way that I'm overlooking.. :) I
might give that a try to see if it works well and document the result.

-Andrew


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-11  2:46             ` Andrew Paprocki
@ 2007-10-11  3:06               ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2007-10-11  3:06 UTC (permalink / raw)
  To: Andrew Paprocki; +Cc: Bruce Allen, linux-ide

Andrew Paprocki wrote:
> On 10/10/07, Tejun Heo <htejun@gmail.com> wrote:
>> Maybe what should be done is to track sleep mode in libata and issue
>> SRST automatically if a command is issued to a sleeping drive.  I'll
>> work on it.
> 
> Another tidbit of info.. I just went through the pain of tracking down
> everything in my system (system apps as well as my own code)
> responsible for waking up sleeping drives. My end goal was to make
> sure sleeping drives stayed asleep to reduce power consumption and
> wear due to unnecessary spin-ups. I'm sure distros targeting laptops
> or embedded systems that use live disks go through this pain
> frequently.
> 
> Would all SRST cmds sent from libata come from the ata_std_softreset()
> call? Could something like SystemTap be used without modifying libata
> to track all pids which cause that function to be called? If that
> would work, it could be an easy way to do what I did manually. That
> is, unless someone knows of an easier way that I'm overlooking.. :) I
> might give that a try to see if it works well and document the result.

All resets come from ata_eh_reset() and you can attach probes to it, but
the problem is that you can't identify the cause this way: the libata EH
thread would always be the issuing thread.  I think the best way to
track this is to use blktrace and look at which processes issue what
requests.

-- 
tejun
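
As a rough illustration of the blktrace approach, the sketch below counts queued requests per process from blkparse text output. It assumes blkparse's default line layout (device, CPU, sequence, timestamp, PID, action, RWBS, then the request details and the process name in square brackets). The sample lines and process names are made up for the example and are not taken from a real trace.

```python
from collections import Counter


def count_queued_requests(lines):
    """Count 'Q' (queued) events per (pid, process name) in blkparse output."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Expected layout: dev cpu seq time pid action rwbs ... [process]
        if len(fields) < 7 or fields[5] != "Q":
            continue
        if "[" not in line or "]" not in line:
            continue
        # Take the text inside the last pair of brackets as the process name.
        name = line[line.rindex("[") + 1:line.rindex("]")]
        counts[(int(fields[4]), name)] += 1
    return counts


# Made-up sample in the assumed default blkparse format.
SAMPLE = """\
  8,0    0        1     0.000000000   585  Q   R 1024 + 8 [proc_a]
  8,0    0        2     0.000004000   585  G   R 1024 + 8 [proc_a]
  8,0    1        3     1.250000000   912  Q   W 2048 + 16 [proc_b]
  8,0    0        4     2.500000000   585  Q   R 4096 + 8 [proc_a]
"""

for (pid, name), n in count_queued_requests(SAMPLE.splitlines()).most_common():
    print(pid, name, n)
```

Counting only 'Q' events avoids counting the same request several times as it moves through the block layer.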


* Re: smartd causing SATA timeouts on sleeping drives
  2007-10-11  2:02           ` Tejun Heo
  2007-10-11  2:46             ` Andrew Paprocki
@ 2007-10-11  4:00             ` Andrew Paprocki
  2007-10-12  9:15             ` [smartmontools-devel] " Bruce Allen
  2 siblings, 0 replies; 13+ messages in thread
From: Andrew Paprocki @ 2007-10-11  4:00 UTC (permalink / raw)
  To: Bruce Allen, Tejun Heo
  Cc: Smartmontools Developers List, Smartmontools Mailing List,
	linux-ide

Bruce/Tejun,

Just so you both know, even when specifying '-n standby,q' in smartd,
it still triggers timeouts on my system. The timeouts are no longer
coming from the default half-hour checks, but from my configured
self-test times with the '-s' option. It appears smartd overrides the
'-n' parameter in this case, triggering the libata soft reset. This is
another case that would be fixed if libata does the SRST
automatically.

Thanks,
-Andrew

Oct 11 02:16:52 (none) daemon.info smartd[23848]: Device: /dev/sdb,
STANDBY mode ignored due to scheduled self test (47 checks skipped)
Oct 11 02:17:03 (none) user.err kernel: ata2.00: exception Emask 0x0
SAct 0x0 SErr 0x0 action 0x2 frozen
Oct 11 02:17:03 (none) user.err kernel: ata2.00: cmd
b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
Oct 11 02:17:03 (none) user.warn kernel:          res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 11 02:17:08 (none) user.warn kernel: ata2: port is slow to
respond, please be patient (Status 0xd0)
Oct 11 02:17:10 (none) user.info kernel: ata2: soft resetting port
Oct 11 02:17:10 (none) user.info kernel: ata2: SATA link up 1.5 Gbps
(SStatus 113 SControl 310)
Oct 11 02:17:10 (none) user.info kernel: ata2.00: configured for UDMA/100
Oct 11 02:17:10 (none) user.info kernel: ata2: EH complete

On 10/10/07, Tejun Heo <htejun@gmail.com> wrote:
> Maybe what should be done is to track sleep mode in libata and issue
> SRST automatically if a command is issued to a sleeping drive.  I'll
> work on it.


* Re: [smartmontools-devel] smartd causing SATA timeouts on sleeping drives
  2007-10-11  2:02           ` Tejun Heo
  2007-10-11  2:46             ` Andrew Paprocki
  2007-10-11  4:00             ` Andrew Paprocki
@ 2007-10-12  9:15             ` Bruce Allen
  2 siblings, 0 replies; 13+ messages in thread
From: Bruce Allen @ 2007-10-12  9:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-ide, Smartmontools Developers List,
	Smartmontools Mailing List, Andrew Paprocki

>> Hmm, it sounds as if smartmontools should send a SRST to spin up the
>> drive, but I do not know enough to be sure.
>
> Eh... Now that I think about it.  I don't think there's a way to work
> around this from userland.  smartmontools doesn't know the current power
> mode (sleeping drive doesn't even respond to CHECK POWER MODE), so it
> can't determine whether the device needs SRST or not and issuing SRST
> unconditionally would cause a lot more problems.

OK, makes sense.

> Maybe what should be done is to track sleep mode in libata and issue 
> SRST automatically if a command is issued to a sleeping drive.  I'll 
> work on it.

Thank you!

>> Could I add you to the developers list and give you CVS write access? 
>> This might make it easier for you to fix the various little 
>> smartmontools problems like this that keep cropping up!  Just fixing 
>> the code might be a lot faster than explaining it and sending 
>> patches...
>>
>> If this is OK with you, please send me your sourceforge username, and 
>> I'll add you to the developers list.
>
> I'll stay chicken for the time being and only send patches.  :-)

I was really really hoping you would say 'yes'. Oh well...

Cheers,
 	Bruce

