Re: scrub implies failing drive - smartctl blissfully unaware

From: Robert White <rwhite@pobox.com>
To: Phillip Susi <psusi@ubuntu.com>, Duncan <1i5t5.duncan@cox.net>,
	linux-btrfs@vger.kernel.org
Subject: Re: scrub implies failing drive - smartctl blissfully unaware
Date: Wed, 19 Nov 2014 13:05:13 -0800
Message-ID: <546D0609.9040105@pobox.com>
In-Reply-To: <546CC04F.6040207@ubuntu.com>

On 11/19/2014 08:07 AM, Phillip Susi wrote:
> On 11/18/2014 9:46 PM, Duncan wrote:
>> I'm not sure about normal operation, but certainly, many drives
>> take longer than 30 seconds to stabilize after power-on, and I
>> routinely see resets during this time.
>
> As far as I have seen, typical drive spin up time is on the order of
> 3-7 seconds.  Hell, I remember my pair of first generation seagate
> cheetah 15,000 rpm drives seemed to take *forever* to spin up and that
> still was maybe only 15 seconds.  If a drive takes longer than 30
> seconds, then there is something wrong with it.  I figure there is a
> reason why spin up time is tracked by SMART so it seems like long spin
> up time is a sign of a sick drive.

I was recently refactoring the Underdog (http://underdog.sourceforge.net)
startup scripts to separate out the various startup domains (e.g. lvm,
luks, mdadm) in the prototype init.

So I notice you (Duncan) use the word "stabilize", as do a small number 
of drivers in the linux kernel. This word has very little to do with 
"disks" per se.

Between SCSI probing LUNs (where the controller tries every theoretical 
address and gives a potential device ample time to reply), and 
usb-storage having a simple timer delay set for each volume it sees, 
there is a lot of "waiting in the name of safety" going on in the linux 
kernel at device initialization.

When I added "scanning /dev/sd??" messages to the startup sequence,
printed as I iterate through the disks and partitions present, I
discovered that the first time I called blkid (e.g. right between
/dev/sda and /dev/sda1) I'd get a huge hit of many human seconds (I
didn't time it, but I'd say eight or so) just for having a 2TB My Book
WD 3.0 disk enclosure attached as /dev/sdc. That this enclosure had
already "spun up" in the previous boot cycle, and that this was only a
soft reboot, was immaterial. In this case usb-storage is going to take
its time and do its deal regardless of the state of the physical drive
itself.
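
A rough way to put numbers on that kind of stall is to time each probe
separately. What follows is only a minimal sketch, not the Underdog
script: `time_probes` and `blkid_probe` are hypothetical names, and it
assumes `blkid` is on the PATH and that the caller is allowed to read
the devices.

```python
import subprocess
import time


def blkid_probe(device):
    """Run blkid on one device; output is discarded, only the wait matters."""
    subprocess.run(["blkid", device], stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)


def time_probes(devices, probe=blkid_probe, clock=time.monotonic):
    """Return a list of (device, seconds) pairs, one per probed device."""
    timings = []
    for device in devices:
        start = clock()
        probe(device)  # blocks for as long as the probe takes
        timings.append((device, clock() - start))
    return timings
```

Passing it something like `sorted(glob.glob("/dev/sd*"))` and printing
the pairs would show whether the first probe absorbs the whole delay.
The `probe` and `clock` arguments are replaceable so the timing logic
can be exercised without touching real hardware.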

So there are _lots_ of places where you are going to get delays and very 
few of them involve the disk itself going from power-off to ready.

You said it yourself with respect to SSDs.

It's cheaper, and less error prone, and less likely to generate customer 
returns if the generic controller chips just "send init, wait a fixed 
delay, then request a status" compared to trying to "are-you-there-yet" 
poll each device like a nagging child. And you are going to see that at 
every level. And you are going to see it multiply with _sparsely_ 
provisioned buses where the cycle is going to be retried for absent LUNs 
(one disk on a Wide SCSI bus and a controller set to probe all LUNs is
particularly egregious).
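
To see how quickly that multiplies, here is a back-of-the-envelope
sketch. `scan_time` is a hypothetical helper, and the wait times, retry
count and slot count are illustrative assumptions, not measurements
from any real controller.

```python
def scan_time(slots, present, fixed_wait, ready_time, retries=0):
    """Estimate bus scan time for two probing strategies.

    slots      -- number of addresses the controller tries
    present    -- set of addresses that actually hold a device
    fixed_wait -- seconds waited per attempt before asking for status
    ready_time -- seconds a present device really needs to answer
    retries    -- extra attempts made on an address that never answers
    Returns (fixed_delay_total, polling_total) in seconds.
    """
    absent = slots - len(present)
    # Empty addresses cost the full wait on every attempt, either way.
    absent_cost = absent * fixed_wait * (1 + retries)
    fixed_total = len(present) * fixed_wait + absent_cost
    # Polling only helps on addresses where something answers early.
    polling_total = len(present) * min(ready_time, fixed_wait) + absent_cost
    return fixed_total, polling_total
```

With one disk on a 16-address bus, a 2 second wait and one retry,
`scan_time(16, {0}, 2.0, 0.5, retries=1)` comes out at 62 seconds for
the fixed delay and 60.5 for polling. Under these assumptions nearly
all of the time goes to addresses with nothing on them.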

One of the reasons that the whole industry has started favoring 
point-to-point (SATA, SAS) or physical intercessor chaining 
point-to-point (eSATA) buses is to remove a lot of those wait-and-see 
delays.

That said, you should not see a drive (or target enclosure, or
controller) "reset" during spin up. In a SCSI setting this is almost
always a cabling, termination, or addressing issue. In IDE it's a jumper
mismatch (master vs slave vs cable-select). Less often it's a
partitioning issue (trying to access sectors beyond the end of the drive).

Another strong factor is selecting the wrong storage controller chipset
driver. In that case you may be falling back from the high-end device
you think it is, through an intermediate chipset, and back to ACPI or
BIOS emulation.

Another common cause is having a dedicated hardware RAID controller
(Dell likes to put LSI MegaRAID controllers in their boxes, for example,
and many motherboards have hardware RAID support available through the
BIOS), leaving that feature active, and then adding a drive and _not_
initializing that drive with the RAID controller's disk setup. In
this case the controller is going to repeatedly probe the drive for its
proprietary controller signature blocks (and reset the drive after each
attempt) and then finally fall back to raw block pass-through. This can
take a long time (thirty seconds to a minute).

But seriously, if you are seeing "reset" anywhere in any storage chain
during a normal power-on cycle then you've got a problem with geometry
or configuration.
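
Checking for that is easy enough to script. This is a minimal sketch
that assumes the kernel log has already been captured as lines of text
(from `dmesg`, say). `find_resets` is a hypothetical name, and matching
on the bare word "reset" is a deliberately crude assumption that will
also catch unrelated messages.

```python
import re

# Matches "reset", "Reset", "resetting", but not words like "preset".
RESET_RE = re.compile(r"\breset", re.IGNORECASE)


def find_resets(log_lines):
    """Return (line_number, text) for every log line that mentions a reset."""
    return [(number, line.rstrip("\n"))
            for number, line in enumerate(log_lines, start=1)
            if RESET_RE.search(line)]
```

Anything it returns for the window between power-on and the root
filesystem mounting is worth reading by hand. The regex only narrows
down where to look.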
