From: Scott Long <scott_long@adaptec.com>
To: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Cc: Jeff Garzik <jgarzik@pobox.com>,
	"Justin T. Gibbs" <gibbs@scsiguy.com>,
	linux-raid@vger.kernel.org, "Gibbs,
	Justin" <justin_gibbs@adaptec.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: "Enhanced" MD code avaible for review
Date: Wed, 17 Mar 2004 17:23:08 -0700
Message-ID: <4058EBEC.8070309@adaptec.com>
In-Reply-To: <200403172245.31842.bzolnier@elka.pw.edu.pl>

Bartlomiej Zolnierkiewicz wrote:
> On Wednesday 17 of March 2004 22:18, Scott Long wrote:
>  > Jeff Garzik wrote:
>  > > Justin T. Gibbs wrote:
>  > >  > [ I tried sending this last night from my Adaptec email address and
>  > >  > have yet to see it on the list.  Sorry if this is dup for any of
>  > >  > you. ]
>  > >
>  > > Included linux-kernel in the CC (and also bounced this post there).
>  > >
>  > >  > For the past few months, Adaptec Inc, has been working to
>  > >  > enhance MD.
>  > >
>  > > The FAQ from several corners is going to be "why not DM?", so I would
>  > > humbly request that you (or Scott Long) re-post some of that rationale
>  > > here...
> 
> This is #1 question so... why not DM?  8)
> 
> Regards,
> Bartlomiej
> 


The primary feature of any RAID implementation is reliability. 
Reliability is a surprisingly hard goal.  Making sure that your
data is available and trustworthy under real-world scenarios is
a lot harder than it sounds.  This has been a significant focus
of ours on MD, and is the primary reason why we chose MD as the
foundation of our work.

Storage is the foundation of everything that you do with your
computer.  It needs to work regardless of what happened to your
filesystem on the last crash, regardless of whether or not you
have the latest initrd tools, regardless of what RPMs you've kept
up to date, regardless of whether your userland works, regardless
of what libc you are using this week, etc.

With DM, what happens when your initrd gets accidentally corrupted?
What happens when the kernel and userland pieces get out of sync?
Maybe you are booting off of a single drive and only using DM arrays
for secondary storage, but maybe you're not.  If something goes wrong
with DM, how do you boot?

Secondly, our target here is to interoperate with hardware components
that run outside the scope of Linux.  The HostRAID or DDF BIOS is
going to create an array using its own format.  It's not going to
have any knowledge of DM config files, initrd, ramfs, etc.  However,
the end user is still going to expect to be able to seamlessly install
onto that newly created array, maybe move that array to another system,
whatever, and have it all Just Work.  Has anyone heard of a hardware
RAID card that requires you to run OS-specific commands in order to
access the arrays on it?  Of course not.  The point here is to make
software RAID just as easy for the end user.

The third, and arguably most important, issue is the need for reliable
error recovery.  With the DM model, error recovery would be done in
userland.  Errors generated during I/O would be kicked to a userland
app that would then drive the recovery, spare-activation, and rebuild
sequence.  That's fine, but what if something happens that prevents
the userland tool from running?  Maybe it was a daemon that became
idle and got swapped out to disk, but now you can't swap it back in
because your I/O is failing.  Or maybe it needs to activate a helper
module or read a config file, but again it can't because I/O is
failing.  What if it crashes?  What if the source code gets out of sync
with the kernel interface?  What if you upgrade glibc and it stops
working for whatever unknown reason?

Some have suggested in the past that these userland tools get put into
ramfs and locked into memory.  If you do that, then it might as well be
part of the kernel anyway.  It's consuming at least as much memory as
the equivalent code in the kernel (likely a lot more, since you'd have
to statically link it).  And you still have the downside of it
possibly getting out of date with the kernel.  So what are the upsides?

MD is not terribly heavyweight.  As a monolithic module of
DDF+ASR+R0+R1 it's about 65k in size.  That's half the size of your
average SCSI driver these days, and no one is advocating putting those
into userland.  It just doesn't make sense to sacrifice reliability
for the phantom goal of 'reducing kernel bloat'.

Scott

