From: "Keld Jørn Simonsen" <keld@keldix.com>
To: David Brown <david.brown@hesbynett.no>
Cc: linux-raid@vger.kernel.org
Subject: Re: md road-map: 2011
Date: Thu, 17 Feb 2011 02:04:55 +0100
Message-ID: <20110217010455.GA16324@www2.open-std.org>
In-Reply-To: <ijhq7p$pjv$1@dough.gmane.org>
On Thu, Feb 17, 2011 at 01:30:49AM +0100, David Brown wrote:
> On 17/02/11 00:01, NeilBrown wrote:
> >On Wed, 16 Feb 2011 23:34:43 +0100 David Brown<david.brown@hesbynett.no>
> >wrote:
> >
> >>I thought there was some mechanism for block devices to report bad
> >>blocks back to the file system, and that file systems tracked bad block
> >>lists. Modern drives automatically relocate bad blocks (at least, they
> >>do if they can), but there was a time when they did not and it was up to
> >>the file system to track these. Whether that still applies to modern
> >>file systems, I do not know - the only file system I have studied in
> >>low-level detail is FAT16.
> >
> >When the block device reports an error the filesystem can certainly record
> >that information in a bad-block list, and possibly does.
> >
> >However I thought you were suggesting a situation where the block device
> >could succeed with the request, but knew that area of the device was of low
> >quality.
>
> I guess that is what I was trying to suggest, though not very clearly.
>
> >e.g. IO to a block on a stripe which had one 'bad block'. The IO should
> >succeed, but the data isn't as safe as elsewhere. It would be nice if we
> >could tell the filesystem that fact, and if it could make use of it. But we
> >currently cannot. We can say "success" or "failure", but we cannot say
> >"success, but you might not be so lucky next time".
> >
>
> Do filesystems re-try reads when there is a failure? Could you return
> fail on one read, then success on a re-read, which could be interpreted
> as "dying, but not yet dead" by the file system?
This should not be a file system feature. The file system is built on
top of the RAID, and in mirrored RAID types like raid1 and raid10, and
also in other RAID types, you cannot be sure which specific drive and
sector the data was read from - it could be one of several (typically
two) places. So the bad blocks of a RAID are a property of the RAID and
its individual drives, not of the file system. If they were a property of
the file system, then the fs would have to be aware of the underlying RAID
topology, and know whether a block was a parity block or a data block of
raid5 or raid6, or which mirror instance of a raid1/10 array was involved.
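
To illustrate the point, here is a rough Python sketch of a toy two-way
mirror. It is not how md is implemented; the names Member, Raid1 and
degraded_sectors are made up for this example. The read path hands back
only the data for a logical sector, so the caller (the file system) never
learns which drive served the request, while the per-drive bad-sector
sets live entirely inside the RAID layer.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """One drive in the mirror (hypothetical toy model, not md's structures)."""
    name: str
    blocks: dict                              # sector -> bytes
    bad: set = field(default_factory=set)     # bad sectors known to the RAID layer

    def read(self, sector):
        if sector in self.bad:
            raise IOError(f"{self.name}: bad sector {sector}")
        return self.blocks[sector]


class Raid1:
    """Toy mirror: the caller sees only logical sectors and data."""

    def __init__(self, members):
        self.members = members
        self._next = 0                        # simple round-robin read balancing

    def read(self, sector):
        n = len(self.members)
        for i in range(n):
            member = self.members[(self._next + i) % n]
            try:
                data = member.read(sector)
            except IOError:
                continue                      # bad copy skipped, invisible to the fs
            self._next = (self._next + i + 1) % n
            return data                       # no drive name or physical sector returned
        raise IOError(f"logical sector {sector}: no good copy on any mirror")

    def degraded_sectors(self):
        """Sectors with at least one bad copy: knowledge only the RAID layer has."""
        return set().union(*(m.bad for m in self.members))
```

A read of a sector that is bad on one drive still succeeds from the other
copy, and nothing in the return value says which copy was used. Only
something like degraded_sectors(), which belongs to the RAID layer, can
tell that the sector has lost redundancy.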
Best regards
keld