linux-raid.vger.kernel.org archive mirror
From: "Graham Mitchell" <gmitch@woodlea.com>
To: 'Neil Brown' <neilb@suse.de>
Cc: 'Brett Russ' <bruss@netezza.com>, linux-raid@vger.kernel.org
Subject: RE: [md PATCH 00/16] bad block list management for md and RAID1
Date: Fri, 18 Jun 2010 00:30:08 -0400
Message-ID: <002c01cb0e9e$f4ba41e0$de2ec5a0$@woodlea.com>
In-Reply-To: <20100618135818.79967b83@notabene.brown>

> I think it would be a mistake to incorporate bad-block detection
> functionality into md or mdadm.  We already have a program which does that
> and probably does it better than I could code.  Best to try to leverage what
> already exists.

I agree - I was thinking along the lines of maintenance type cases, where we
currently run an array check once a week - we could also schedule a full
'non-destructive badblocks -w' type test once a month (say), to catch disks
which are starting to go bad. Since mdadm understands the RAID layout, it
could migrate/redirect a stripe or block to another area, run badblocks on
each of the disks specifying the start and end sectors, and if the area on
one of the disks was bad, mark the area as bad - and since the data has been
redirected, we don't lose anything. If the area is good, then the data gets
moved back to its original location, and mdadm moves on to the next
stripe/block. I really think you'd need to do a fully destructive write test
on the drive though - I've actually just finished testing a Spinpoint F3
this evening, which has shown up 5 bad sectors, all on the 4th write pass
(0x00), so a quick read test probably wouldn't have shown them up.
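
To make the redirect/test/restore loop above concrete, here is a rough sketch in Python. It is only a toy model under stated assumptions: the disks are in-memory lists, `Disk`, `write_test` and `scrub` are made-up names, and the "write test" is simulated rather than calling badblocks. It is not how md or mdadm work today.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Toy in-memory disk (hypothetical): sector payloads plus sectors that fail a write test."""
    sectors: list
    failing: set = field(default_factory=set)

    def write_test(self, start, end):
        """Simulated destructive test over [start, end): wipes the range, returns bad sectors."""
        for s in range(start, end):
            self.sectors[s] = 0
        return [s for s in range(start, end) if s in self.failing]


def scrub(disks, stripe_sectors):
    """Walk the array stripe by stripe: park the data, test each disk, restore or mark bad."""
    bad = {}         # disk index -> list of bad sectors found
    redirected = {}  # stripe start -> parked copy of the data that was not moved back
    total = len(disks[0].sectors)
    for start in range(0, total, stripe_sectors):
        end = min(start + stripe_sectors, total)
        # 1. Redirect: copy this stripe's data aside so the test may overwrite it.
        parked = [list(d.sectors[start:end]) for d in disks]
        # 2. Test only this sector range on every member disk.
        found = {i: d.write_test(start, end) for i, d in enumerate(disks)}
        found = {i: s for i, s in found.items() if s}
        if found:
            # 3a. Something was bad: record it and leave the data redirected.
            for i, s in found.items():
                bad.setdefault(i, []).extend(s)
            redirected[start] = parked
        else:
            # 3b. Area is good: move the data back to its original location.
            for d, saved in zip(disks, parked):
                d.sectors[start:end] = saved
    return bad, redirected
```

In a real tool the parked copy would have to live somewhere durable on the array, and the range would need to be fenced off from normal I/O while the test runs; the sketch ignores both.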


> I'm not sure I see the logic though.  Surely if a drive has any errors when
> new, then you don't want to trust it at all and cascading failure is likely
> and tomorrow there will be more errors.  So it would be best to do the
> badblock scan first and only add it to the array if it were completely
> successful.

Agreed Neil - I guess I am thinking more of the maintenance type cases, but
it would be nice to have mdadm check the drive when it's added to the array.
You could just blindly add the drive, and immediately schedule a full
badblocks test - but I guess I would still be paranoid, and still check the
disk before adding it.
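
For the "check the disk before adding it" case, something like the following wrapper would do. This is a sketch only: the helper names are made up, and `/dev/sdX`, `/dev/md0` and `bb.log` are placeholders. It relies on the documented badblocks options (-w destructive write test, -n non-destructive read-write test, -s progress, -v verbose, -o output file, with optional last-block and first-block arguments after the device) and on the output file listing one bad block number per line. It only builds the command lines; running them is left to the caller.

```python
def badblocks_cmd(device, log_path, destructive=False, first=None, last=None):
    """Build a badblocks command line; an optional block range is given as last, then first."""
    if first is not None and last is None:
        raise ValueError("badblocks needs last-block before first-block")
    mode = "-w" if destructive else "-n"
    cmd = ["badblocks", mode, "-s", "-v", "-o", log_path, device]
    if last is not None:
        cmd.append(str(last))
        if first is not None:
            cmd.append(str(first))
    return cmd


def parse_bad_blocks(log_text):
    """Parse the contents of the -o output file: one bad block number per line."""
    return [int(tok) for tok in log_text.split()]


def add_cmd_if_clean(array, device, log_text):
    """Return the mdadm command to add the disk, or None if the scan found any bad block."""
    if parse_bad_blocks(log_text):
        return None
    return ["mdadm", "--manage", array, "--add", device]
```

So a new drive would get `badblocks_cmd("/dev/sdX", "bb.log", destructive=True)` run against it first, and only an empty bb.log would lead to the add.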




