From: NeilBrown <neilb@suse.de>
To: Namhyung Kim <namhyung@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [md PATCH 09/36] md: make it easier to wait for bad blocks to be acknowledged.
Date: Wed, 27 Jul 2011 11:18:22 +1000
Message-ID: <20110727111822.518dae92@notabene.brown>
In-Reply-To: <87livldo5c.fsf@gmail.com>
On Wed, 27 Jul 2011 01:04:15 +0900 Namhyung Kim <namhyung@gmail.com> wrote:
> NeilBrown <neilb@suse.de> writes:
>
> > It is only safe to choose not to write to a bad block if that bad
> > block is safely recorded in metadata - i.e. if it has been
> > 'acknowledged'.
> >
> > If it hasn't we need to wait for the acknowledgement.
> >
> > We support that using rdev->blocked_wait and
> > md_wait_for_blocked_rdev by introducing a new device flag
> > 'BlockedBadBlocks'.
> >
> > This flag is only advisory.
> > It is cleared whenever we acknowledge a bad block, so that a waiter
> > can re-check the particular bad blocks that it is interested in.
> >
> > It should be set by a caller when they find they need to wait.
> > This (set after test) is inherently racy, but as
> > md_wait_for_blocked_rdev already has a timeout, losing the race will
> > have minimal impact.
> >
> > When we clear "Blocked" we also clear "BlockedBadBlocks" in case it
> > was set incorrectly (see above race).
> >
> > We also modify the way we manage 'Blocked' to fit better with the new
> > handling of 'BlockedBadBlocks' and to make it consistent between
> > externally managed and internally managed metadata. This requires
> > that each raidXd loop checks if the metadata needs to be written and
> > triggers a write (md_check_recovery) if needed. Otherwise a queued
> > write request might cause raidXd to wait for the metadata to write,
> > and only that thread can write it.
> >
> > Before writing metadata, we set FaultRecorded for all devices that
> > are Faulty, then after writing the metadata we clear Blocked for any
> > device for which the Fault was certainly Recorded.
> >
> > The 'faulty' device flag now appears in sysfs if the device is faulty
> > *or* it has unacknowledged bad blocks. So user-space which does not
> > understand bad blocks can continue to function correctly.
> > User space which does, should not assume a device is faulty until it
> > sees the 'faulty' flag, and then sees the list of unacknowledged bad
> > blocks is empty.
> >
> > Signed-off-by: NeilBrown <neilb@suse.de>
>
> Probably you also need this patch:
>
> From 76320c4fdaed91f26a083a9337bb5a5503300e0e Mon Sep 17 00:00:00 2001
> From: Namhyung Kim <namhyung@gmail.com>
> Date: Wed, 27 Jul 2011 00:59:26 +0900
> Subject: [PATCH] md: update documentation for md/rdev/state sysfs interface
>
> Previous patches in the bad block series extended behavior of
> rdev's 'state' interface but lacked documentation update.
> Fix it.
>
> Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Applied, thanks.
NeilBrown
> ---
> Documentation/md.txt | 14 +++++++++-----
> 1 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/md.txt b/Documentation/md.txt
> index 923a6bddce7c..fc94770f44ab 100644
> --- a/Documentation/md.txt
> +++ b/Documentation/md.txt
> @@ -360,18 +360,20 @@ Each directory contains:
> A file recording the current state of the device in the array
> which can be a comma separated list of
> faulty - device has been kicked from active use due to
> - a detected fault
> + a detected fault or it has unacknowledged bad
> + blocks
> in_sync - device is a fully in-sync member of the array
> writemostly - device will only be subject to read
> requests if there are no other options.
> This applies only to raid1 arrays.
> - blocked - device has failed, metadata is "external",
> - and the failure hasn't been acknowledged yet.
> + blocked - device has failed, and the failure hasn't been
> + acknowledged yet by the metadata handler.
> Writes that would write to this device if
> it were not faulty are blocked.
> spare - device is working, but not a full member.
> This includes spares that are in the process
> of being recovered to
> + write_error - device has ever seen a write error.
> This list may grow in future.
> This can be written to.
> Writing "faulty" simulates a failure on the device.
> @@ -379,9 +381,11 @@ Each directory contains:
> Writing "writemostly" sets the writemostly flag.
> Writing "-writemostly" clears the writemostly flag.
> Writing "blocked" sets the "blocked" flag.
> - Writing "-blocked" clears the "blocked" flag and allows writes
> - to complete.
> + Writing "-blocked" clears the "blocked" flags and allows writes
> + to complete and possibly simulates an error.
> Writing "in_sync" sets the in_sync flag.
> + Writing "write_error" sets writeerrorseen flag.
> + Writing "-write_error" clears writeerrorseen flag.
>
> This file responds to select/poll. Any change to 'faulty'
> or 'blocked' causes an event.
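
For user space that does understand bad blocks, the rule in the changelog
above (see 'faulty' first, then check that the list of unacknowledged bad
blocks is empty) could be sketched roughly as below. This is only an
illustration in Python, not code from md or mdadm. The function names are
made up, and it assumes the caller has already read the rdev 'state' file
and, after that, the list of unacknowledged bad blocks (assumed here to come
from the 'unacknowledged_bad_blocks' sysfs file added by the bad block log
patches) into strings.

```python
def parse_state(text):
    """Split the comma separated contents of an rdev 'state' file into flags."""
    return {flag.strip() for flag in text.split(",") if flag.strip()}


def is_really_faulty(state_text, unacked_bad_blocks_text):
    """Decide whether a device should be treated as failed.

    Both arguments are raw file contents. The caller is assumed to have
    read 'state' first and the unacknowledged bad block list second,
    matching the order given in the changelog.
    """
    flags = parse_state(state_text)
    if "faulty" not in flags:
        return False
    # 'faulty' alone may only mean there are unacknowledged bad blocks,
    # so the device counts as failed only if that list is empty.
    return unacked_bad_blocks_text.strip() == ""
```

A tool that does not know about bad blocks would simply stop at the 'faulty'
check, which is the behaviour the patch preserves for it.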