From: Neil Brown <neilb@suse.de>
To: Theodore Tso <tytso@mit.edu>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Ric Wheeler <ric@emc.com>,
Linux-ide <linux-ide@vger.kernel.org>,
linux-scsi <linux-scsi@vger.kernel.org>,
linux-raid@vger.kernel.org, Tejun Heo <htejun@gmail.com>,
James Bottomley <James.Bottomley@SteelEye.com>,
Mark Lord <mlord@pobox.com>, Neil Brown <neilb@suse.de>,
Jens Axboe <jens.axboe@oracle.com>,
"Clark, Nathan" <Clark_Nathan@emc.com>,
"Singh, Arvinder" <Singh_Arvinder@emc.com>,
"De Smet, Jochen" <DeSmet_Jochen@emc.com>,
"Farmer, Matt" <Farmer_Matt@emc.com>,
linux-fsdevel@vger.kernel.org,
"Mizar, Sunita" <Mizar_Sunita@emc.com>
Subject: Re: end to end error recovery musings
Date: Mon, 26 Feb 2007 16:33:37 +1100
Message-ID: <17890.28977.989203.938339@notabene.brown>
In-Reply-To: message from Theodore Tso on Friday February 23
On Friday February 23, tytso@mit.edu wrote:
> On Fri, Feb 23, 2007 at 05:37:23PM -0700, Andreas Dilger wrote:
> > > Probably the only sane thing to do is to remember the bad sectors and
> > > avoid attempting reading them; that would mean marking "automatic"
> > > versus "explicitly requested" requests to determine whether or not to
> > > filter them against a list of discovered bad blocks.
> >
> > And clearing this list when the sector is overwritten, as it will almost
> > certainly be relocated at the disk level. For that matter, a huge win
> > would be to have the MD RAID layer rewrite only the bad sector (in hopes
> > of the disk relocating it) instead of failing the whole disk. Otherwise,
> > a few read errors on different disks in a RAID set can take the whole
> > system offline. Apologies if this is already done in recent kernels...
Yes, current md does this.
>
> And having a way of making this list available to both the filesystem
> and to a userspace utility, so they can more easily deal with doing a
> forced rewrite of the bad sector, after determining which file is
> involved and perhaps doing something intelligent (up to and including
> automatically requesting a backup system to fetch a backup version of
> the file, and if it can be determined that the file shouldn't have
> been changed since the last backup, automatically fixing up the
> corrupted data block :-).
>
> - Ted
So we want a clear path for media read errors from the device up to
user-space. Stacked devices (like md) would perhaps do the appropriate
mappings (for raid0/linear at least; other levels wouldn't tolerate
errors).
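
As a rough illustration of the mapping a stacked device would have to do,
here is a sketch in Python (not kernel code). It translates a bad sector
reported by one member device into the sector number seen on the array. The
function names are hypothetical, and the striped case assumes that all members
are the same size and that data starts at sector 0 of each member, which real
md arrays do not guarantee.

```python
def linear_to_array_sector(member_sizes, member_index, member_sector):
    """Map a sector on one member of a linear (concatenated) array
    to the corresponding sector of the array itself."""
    if not 0 <= member_sector < member_sizes[member_index]:
        raise ValueError("sector lies outside the member device")
    # Every earlier member contributes its full size to the offset.
    return sum(member_sizes[:member_index]) + member_sector


def raid0_to_array_sector(n_members, chunk_sectors, member_index, member_sector):
    """Same mapping for a striped array (assumption: equal-sized members)."""
    # Which chunk on this member, and how far into that chunk.
    stripe, offset = divmod(member_sector, chunk_sectors)
    # Chunks are dealt out round-robin across the members.
    return (stripe * n_members + member_index) * chunk_sectors + offset
```

For a two-member stripe with 8-sector chunks, sector 10 of the second member
falls in the fourth chunk of the array, so it maps to array sector 26.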
There would need to be a limit on the number of 'bad blocks' that are
recorded. Maybe a mechanism to clear old bad blocks from the list is
needed.
Maybe if generic_make_request gets a request for a block which
overlaps a 'bad-block', it returns an error immediately.
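
A minimal sketch of how such a list might behave, in Python rather than kernel
code. The class and function names are hypothetical, the limit of 512 entries
is an arbitrary assumption, and "oldest entry is dropped first" is just one
possible policy for clearing old bad blocks. The `explicit` flag stands in for
the "automatic" versus "explicitly requested" distinction mentioned earlier in
the thread.

```python
from collections import OrderedDict


class BadBlockList:
    """Bounded record of known-bad sectors; the oldest entries go first."""

    def __init__(self, limit=512):
        self.limit = limit
        self._bad = OrderedDict()  # sector -> None, kept in discovery order

    def record(self, sector):
        self._bad.pop(sector, None)        # re-recording refreshes the age
        self._bad[sector] = None
        while len(self._bad) > self.limit:
            self._bad.popitem(last=False)  # evict the oldest entry

    def overlaps(self, start, nr_sectors):
        end = start + nr_sectors
        return any(start <= s < end for s in self._bad)

    def clear_range(self, start, nr_sectors):
        end = start + nr_sectors
        for s in [s for s in self._bad if start <= s < end]:
            del self._bad[s]


def submit(bad, op, start, nr_sectors, explicit=False):
    """Fail automatic reads of known-bad sectors without touching the device."""
    if op == "write":
        # Assumption: the write succeeds and the drive relocates the sector,
        # so the entries can be forgotten.
        bad.clear_range(start, nr_sectors)
        return "submitted"
    if not explicit and bad.overlaps(start, nr_sectors):
        return "EIO"
    return "submitted"
```

An explicitly requested read is still passed down, so a userspace tool can
retry a sector that the list says is bad.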
Do we want a path in the other direction to handle write errors? The
file system could say "Don't worry too much if this block cannot be
written, just return an error and I will write it somewhere else"?
This might allow md not to fail a whole drive if there is a single
write error.
Or is that completely unnecessary as all modern devices do bad-block
relocation for us?
Is there any need for a bad-block-relocating layer in md or dm?
What about corrected-error counts? Drives provide them with SMART.
The SCSI layer could provide some as well. md can do a similar thing
to some extent. Whether these are actually useful predictors of pending
failure is unclear, but there could be some value.
e.g. after a certain number of recovered errors raid5 could trigger a
background consistency check, or a filesystem could trigger a
background fsck should it support that.
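
The threshold idea could look roughly like this (a Python sketch with
hypothetical names, not an existing kernel or md interface): count recovered
errors and fire a callback once a chosen number has been seen. The threshold
value and what the callback actually does are left to the caller.

```python
class RecoveredErrorMonitor:
    """Count corrected errors and trigger a check every `threshold` errors."""

    def __init__(self, threshold, on_threshold):
        self.threshold = threshold
        self.on_threshold = on_threshold  # e.g. start a consistency check
        self.count = 0

    def note_recovered_error(self):
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0                # start counting towards the next check
            self.on_threshold()
```

For md, the callback might do the equivalent of writing "check" to the array's
md/sync_action file in sysfs; that wiring is deliberately left out here.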
Lots of interesting questions... not so many answers.
NeilBrown