From: Neil Brown <neilb@suse.de>
To: Graham Mitchell <gmitch@woodlea.com>
Cc: 'Brett Russ' <bruss@netezza.com>, linux-raid@vger.kernel.org
Subject: Re: [md PATCH 00/16] bad block list management for md and RAID1
Date: Fri, 18 Jun 2010 13:58:18 +1000
Message-ID: <20100618135818.79967b83@notabene.brown>
In-Reply-To: <003901cb0e35$422f9020$c68eb060$@woodlea.com>

On Thu, 17 Jun 2010 11:53:40 -0400
"Graham Mitchell" <gmitch@woodlea.com> wrote:

> > This is a worthwhile addition, I think.  However, one concern we have is there
> > appears to be no distinction between media errors (i.e. bad
> > blocks) and other SCSI errors.
> 
> One thing I'd like to see would be being able to import a list of bad blocks from badblocks, and also have the ability for mdadm to be able to run a 'destructive' badblocks on the drives in the array, either at create/grow time, or on demand.

Importing a list of bad blocks would be quite trivial - you could write a
perl script to do it, though it might be nice to include it in mdadm.
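
As a rough illustration of such an import script (in Python rather than
perl), the sketch below turns badblocks output into (start, length) ranges.
It assumes badblocks prints one block number per line in units of its block
size (1024 bytes by default), that md counts in 512-byte sectors, and that a
range may hold at most 512 sectors, as mentioned later in this message.  The
function name and the range representation are hypothetical; how the ranges
are handed to md is left out, since that interface is not described here.

```python
from typing import Iterable, List, Tuple

MAX_RANGE_LEN = 512  # per-range cap taken from this thread (assumed to be sectors)


def coalesce_badblocks(lines: Iterable[str],
                       block_size: int = 1024,
                       max_len: int = MAX_RANGE_LEN) -> List[Tuple[int, int]]:
    """Convert badblocks output lines into (start_sector, length) ranges.

    Hypothetical helper: block numbers are scaled to 512-byte sectors,
    adjacent sectors are merged, and no range grows beyond max_len.
    """
    per_block = block_size // 512
    sectors = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        first = int(line) * per_block
        sectors.update(range(first, first + per_block))

    ranges: List[Tuple[int, int]] = []
    for sector in sorted(sectors):
        if ranges:
            start, length = ranges[-1]
            # Extend the previous range if this sector is adjacent and
            # the range still has room.
            if start + length == sector and length < max_len:
                ranges[-1] = (start, length + 1)
                continue
        ranges.append((sector, 1))
    return ranges
```

For example, blocks 10, 11 and 20 at the default 1024-byte block size become
sectors 20-23 and 40-41, i.e. the ranges (20, 4) and (40, 2).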

> 
> I say 'destructive' since it would be a bad thing (tm) if it truly were destructive on a live array, but it would be nice for mdadm to do the full destructive aa/55/ff/00 write/read/compare cycle on each disk, without actually being destructive to the data that's there. I am slightly paranoid (having been bitten in the bum in the past), so I do a full destructive badblocks on every disk BEFORE I add it to an array (and yes, it can take days when I have 3 or 4 1TB drives to add). It would be nice to be able to add the disks to the server untested, and let mdadm do the testing when it was doing the grow.

I think it would be a mistake to incorporate bad-block detection
functionality into md or mdadm.  We already have a program which does that
and probably does it better than I could code.  Best to try to leverage what
already exists.

I'm not sure I see the logic though.  Surely if a drive has any errors when
new, then you don't want to trust it at all: cascading failure is likely,
and tomorrow there will be more errors.  So it would be best to do the
badblock scan first and only add the drive to the array if the scan were
completely successful.

However if you really want to, you could tell md that all blocks were bad,
then have the badblock scan run and, after it finishes with some section, tell
md that section was OK and move on.
The current badblock list format allows ranges of blocks, but it is currently
limited to 512 ranges each of at most 512 blocks.  I could probably relax
that without too much effort, so that a single range could cover the whole
device... if we really thought that was a good idea.
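
To put the current limit in perspective, here is a back-of-the-envelope
sketch.  It assumes a "block" in the list is a 512-byte sector and takes the
1TB drive size from the quoted message as 10**12 bytes; the function names
are made up for illustration.

```python
SECTOR_BYTES = 512   # assumption: one list "block" is a 512-byte sector
MAX_RANGES = 512     # limit on the number of ranges, from this thread
MAX_RANGE_LEN = 512  # limit on blocks per range, from this thread


def max_coverage_bytes(ranges: int = MAX_RANGES,
                       range_len: int = MAX_RANGE_LEN,
                       sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest amount of a device that a full list could mark as bad."""
    return ranges * range_len * sector_bytes


def ranges_needed(device_bytes: int,
                  range_len: int = MAX_RANGE_LEN,
                  sector_bytes: int = SECTOR_BYTES) -> int:
    """Ranges required to mark a whole device bad at the current range size."""
    sectors = -(-device_bytes // sector_bytes)  # ceiling division
    return -(-sectors // range_len)


print(max_coverage_bytes() // 2**20, "MiB coverable")
print(ranges_needed(10**12), "ranges needed for a 1TB drive")
```

Under those assumptions a full list covers 128 MiB, while marking a whole 1TB
drive bad would take several million ranges, which is why the scheme only
works if a single range can span the whole device.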

Not convinced....

NeilBrown
