From: NeilBrown <neilb@suse.de>
To: Phil Turmel <philip@turmel.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: md road-map: 2011
Date: Thu, 17 Feb 2011 08:44:03 +1100
Message-ID: <20110217084403.46b4d8ee@notabene.brown>
In-Reply-To: <4D5C2776.2080609@turmel.org>
On Wed, 16 Feb 2011 14:37:26 -0500 Phil Turmel <philip@turmel.org> wrote:
> Hi Neil,
>
> On 02/16/2011 05:27 AM, NeilBrown wrote:
> >
> > Hi all,
> > I wrote this today and posted it at
> > http://neil.brown.name/blog/20110216044002
> >
> > I thought it might be worth posting it here too...
> >
> > NeilBrown
> >
> >
> > -------------------------
> >
> >
> > It is about 2 years since I last published a road-map[1] for md/raid
> > so I thought it was time for another one. Unfortunately quite a few
> > things on the previous list remain undone, but there has been some
> > progress.
> >
> > I think one of the problems with some to-do lists is that they aren't
> > detailed enough. High-level design, low level design, implementation,
> > and testing are all very different sorts of tasks that seem to require
> > different styles of thinking and so are best done separately. As
> > writing up a road-map is a high-level design task it makes sense to do
> > the full high-level design at that point so that the tasks are
> > detailed enough to be addressed individually with little reference to
> > the other tasks in the list (except what is explicit in the road map).
> >
> > A particular need I am finding for this road map is to make explicit
> > the required ordering and interdependence of certain tasks. Hopefully
> > that will make it easier to address them in an appropriate order, and
> > mean that I waste less time saying "this is too hard, I might go read
> > some email instead".
> >
> > So the following is a detailed road-map for md raid for the coming
> > months.
> >
> > [1] http://neil.brown.name/blog/20090129234603
> >
> > Bad Block Log
> > -------------
> [trim /]
> > Bitmap of non-sync regions.
> > ---------------------------
> [trim /]
>
> It occurred to me that if you go to the trouble (and space and performance)
> to create and maintain metadata for lists of bad blocks, and separate
> metadata for sync status aka "trim", or hot-replace status, or reshape-status,
> or whatever features are dreamt up later, why not create an infrastructure to
> carry all of it efficiently?
>
> David Brown suggested a multi-level metadata structure. I concur, but somewhat
> more generic:
> Level 1: Coarse bitmap, set bit indicates 'look at level 2'
> Level 2: Fine bitmap, set bit indicates 'look at level 3'
> Level 3: Extent list, with starting block, length, and feature payload
>
> The bitmap levels are purely for hot-path performance.
>
> As an option, it should be possible to spread the detailed metadata through the
> data area, possibly in chunk-sized areas spread out at some user-defined
> interval. "meta-span", perhaps. Then resizing partitions that compose an
> array would be less likely to bump up against metadata size limits. The coarse
> bitmap should stay near the superblock, of course.
This is starting to sound a lot more like a filesystem than a RAID system.
I really don't want there to be so much metadata that I am tempted to spread
it out among the data. I think that implies too much complexity.
Maybe that is a good place to draw the line: if some metadata doesn't fit
easily at the start or end of the devices, it has no place in RAID - you
should add it to a filesystem instead.
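
To make the structure under discussion concrete, the three-level lookup
Phil describes above might be sketched roughly as follows. This is purely
illustrative and is not md code: the names MetaIndex, Extent, FINE_BLOCKS
and COARSE_FINE, the region sizes, and the use of Python integers as
bitmaps are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumed granularities; the thread does not specify any sizes.
FINE_BLOCKS = 128   # blocks covered by one level-2 (fine) bit
COARSE_FINE = 64    # fine bits covered by one level-1 (coarse) bit


@dataclass
class Extent:
    start: int      # first block of the extent
    length: int     # number of blocks (assumed > 0)
    payload: str    # feature payload, e.g. "bad-block" or "non-sync"


@dataclass
class MetaIndex:
    coarse: int = 0                               # level 1 bitmap
    fine: int = 0                                 # level 2 bitmap
    extents: list = field(default_factory=list)   # level 3 extent list

    def add(self, start, length, payload):
        """Record an extent and set the bits that cover it."""
        self.extents.append(Extent(start, length, payload))
        first = start // FINE_BLOCKS
        last = (start + length - 1) // FINE_BLOCKS
        for f in range(first, last + 1):
            self.fine |= 1 << f
            self.coarse |= 1 << (f // COARSE_FINE)

    def lookup(self, block):
        """Return the extents covering a block, checking bitmaps first."""
        f = block // FINE_BLOCKS
        if not (self.coarse >> (f // COARSE_FINE)) & 1:
            return []   # level 1 clear: nothing recorded in this region
        if not (self.fine >> f) & 1:
            return []   # level 2 clear: nothing recorded in this sub-region
        # Only now walk the (potentially long) extent list.
        return [e for e in self.extents
                if e.start <= block < e.start + e.length]
```

In this sketch most lookups return after one or two bit tests, which is
the "hot-path" role Phil assigns to the bitmap levels; only blocks whose
fine bit is set pay for a walk of the extent list.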
>
> Personally, I'd like to see the bad-block feature actually perform block
> remapping, much like hard drives themselves do, but with the option to unmap the
> block if a later write succeeds. Using one retry per array restart as you
> described makes a lot of sense. In any case, remapping would retain redundancy
> where applicable short of full drive failure or remap overflow.
If the hard drives already do this, why should md try to do it as well??
If a hard drive has had so many write errors that it has used up all of its
spare space, then it is long past time to replace it.
>
> My $0.02, of course.
Here in .au, the smallest legal tender is $0.05 - but thanks anyway :-)
NeilBrown
>
> Phil