Re: RAID halting - Roger Heflin

From: Roger Heflin <rogerheflin@gmail.com>
To: lrhorer@satx.rr.com
Cc: 'Linux RAID' <linux-raid@vger.kernel.org>
Subject: Re: RAID halting
Date: Sat, 04 Apr 2009 19:57:50 -0500
Message-ID: <49D8020E.3010705@gmail.com>
In-Reply-To: <20090405000728.GGPW19140.cdptpa-omta03.mail.rr.com@Leslie>

Leslie Rhorer wrote:
>> If one of your disks was clearing bad sectors then things get messy
>> and when it hits one of these bad sectors that it can successfully
>> move you would get a delay almost every time.
> 
> Yes, but in that case two things would be true:
> 
> 1.  Any write of any sort could readily trigger an event.  The system quite
> regularly writes more than 5000 sectors / second, but never do any of these
> writes trigger an event except in the case where it is a file creation.
> Like I said, the drives have no idea whether the sector they are attempting
> to write is a new file or not, or part of a directory structure or not.

Writes don't trigger this sort of event; only reads do.  And are you 
sure the data that you wrote is still readable?
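
One way to check whether written data is still readable is to read it 
back and time each chunk.  The following is only a minimal sketch: the 
function name, chunk size, and one-second threshold are illustrative 
assumptions, and reads served from the page cache never reach the disk, 
so the file should be larger than RAM or the cache dropped first.

```python
import time


def slow_reads(path, chunk_size=1 << 20, threshold_s=1.0):
    """Read a file sequentially and report chunks that were slow to read.

    Returns (total_bytes_read, [(offset, seconds), ...]).
    The threshold is an arbitrary assumption, not a drive specification.
    """
    slow = []
    offset = 0
    # buffering=0 avoids Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            start = time.monotonic()
            chunk = f.read(chunk_size)
            elapsed = time.monotonic() - start
            if not chunk:
                break
            if elapsed >= threshold_s:
                slow.append((offset, elapsed))
            offset += len(chunk)
    return offset, slow
```

A read error surfaces as an OSError from f.read(), while a long stall 
with no error shows up as an entry in the returned list.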

> 
> 2.  The kernel would be reporting SMART errors.  It isn't.

SMART has never really worked as well as the disk makers claim.  I 
have tested SMART on sets of >1000 drives, and its accuracy at 
detecting bad-sector issues was almost useless.  I had 50 known bad 
drives in the set; SMART flagged only 15 of them as bad, and on top of 
that it flagged another 15-20 drives as bad that did not appear to 
fail at all after months of usage following the SMART warning.  
Basically SMART is useful, but it cannot really be trusted.  If you 
don't believe me, see Google's similar study on large numbers of 
drives.
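
The figures above can be turned into detection rates.  This is a rough 
sketch: the function name is made up for illustration, and it assumes 
the 15-20 flagged drives that kept working really were healthy.

```python
def smart_scorecard(known_bad, flagged_and_bad, flagged_but_ok):
    """Return (recall, precision) for SMART's bad-drive warnings."""
    # Recall: share of the known-bad drives that SMART actually flagged.
    recall = flagged_and_bad / known_bad
    # Precision: share of SMART's warnings that pointed at a bad drive.
    precision = flagged_and_bad / (flagged_and_bad + flagged_but_ok)
    return recall, precision


# Numbers from the message: 50 known bad, 15 caught, 15-20 false alarms.
for false_alarms in (15, 20):
    recall, precision = smart_scorecard(50, 15, false_alarms)
    print(f"false alarms={false_alarms}: "
          f"recall={recall:.0%}, precision={precision:.0%}")
```

Under that assumption SMART caught 30% of the bad drives, and only 
about 43% to 50% of its warnings pointed at a drive that was bad.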

> 
> Finally, as you said yourself, the situation would result in a delay almost
> every time, yet there are significant stretches of time when every single
> file creation works just fine.  Also, it doesn't take a drive 40 seconds,
> let alone 2 minutes, to mark a sector bad.  The array chassis I had
> previously had some sort of problem which made the drives think there were
> bad sectors, when there weren't.  It caused one drive to be marked with more
> than a million bad sectors.  It never paused like this, however.
>

And what I said, if you read it carefully, is that *WHEN* you hit a 
bad sector it will cause a delay almost every time, not that you will 
hit a delay every time you read the disk.

It will only result in a delay if you hit the magic bad sector.  On 
reads the drive cannot mark the sector bad until it successfully reads 
the sector, so it tries really hard and takes a long time trying.  
Once it reads that sector successfully, it will rewrite it elsewhere 
and mark the sector bad.  When you hit the next bad sector the same 
thing will happen again.  How bad an issue you have depends on whether 
the number of bad sectors on the disk is growing: if you only have 20 
bad ones, eventually they will all get reread (maybe) and relocated; 
if you have a few more showing up each day, things will never get any 
better.
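
The difference between a fixed set of bad sectors and a growing one can 
be sketched with a toy model.  Everything here is an assumption made 
for illustration: the function name, the daily rates, and the idea that 
a steady number of bad sectors gets reread and relocated each day.

```python
def pending_bad_sectors(days, initial_bad, new_per_day, relocated_per_day):
    """Return the count of unrelocated bad sectors at the end of each day."""
    pending = initial_bad
    history = []
    for _ in range(days):
        # New bad sectors appear first...
        pending += new_per_day
        # ...then some get reread successfully and relocated (never below zero).
        pending = max(0, pending - relocated_per_day)
        history.append(pending)
    return history


# Fixed set of 20 bad sectors, 2 relocated per day: drains to zero.
print(pending_bad_sectors(10, 20, 0, 2))
# Same drive, but 3 new bad sectors appear per day: the backlog grows.
print(pending_bad_sectors(10, 20, 3, 2))
```

In the first case the delays eventually stop; in the second there is 
always another bad sector waiting to be hit.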

When the array chassis had its issue, the chassis likely decided the 
sectors were bad after getting a successful read: the read came back 
quickly, and the chassis decided the sector was bad and marked it as 
such.  The *DRIVE* has to think the sector is bad to get the delay.  
In the array chassis case the drive knew the sector was just fine; the 
array chassis misinterpreted what the drive was telling it and decided 
the sector was bad.
