The Linux Kernel Mailing List
From: Luca Berra <bluca@comedia.it>
To: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Joe Thornber <joe@fib011235813.fsnet.co.uk>,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: RFC - new raid superblock layout for md driver
Date: Tue, 3 Dec 2002 09:24:33 +0100
Message-ID: <20021203082432.GB11862@percy.comedia.it>
In-Reply-To: <15851.53969.794768.513642@notabene.cse.unsw.edu.au>

On Tue, Dec 03, 2002 at 08:38:25AM +1100, Neil Brown wrote:
>> 1) Building a mirror is essentially just copying large amounts of data
>>    around, exactly what is needed to implement move functionality for
>>    arbitrarily remapping volumes.  (see
>>    http://people.sistina.com/~thornber/pvmove_outline.txt).
>
>Building a mirror may be just moving data around. But the interesting
>issues in raid1 are more about maintaining a mirror:  read balancing,
>retry on error, hot spares, etc.

true, that's why LVM (dm) should use md for the raid work.

>> 
>> 2) Extending raid 5 volumes becomes very simple if they are dm targets
>>    since you just add another segment, this new segment could even
>>    have different numbers of stripes.  eg,
>> 
>But is this something that you would *want* to do???
>
>To my mind, the raid1/raid5 almost always lives below any LVM or
>partitioning scheme.  You use raid1/raid5 to combine drives (real,
>physical drives) into virtual drives that are more reliable, and then
>you partition them or whatever you want to do.  raid1 and raid5 on top
>of LVM bits just doesn't make sense to me.
well, to me it does:
- you might want to split a mirror of a portion of data for backup
  purposes (when snapshots won't do) or for safety before attempting a
  risky operation.
- you might also want to have different raid strategies for different
  data. Think of a medium-sized storage setup with oracle: you might
  want a fast mirror for online redo logs(1) and raid5 for datafiles(2).
- you might want to add mirroring after having put data on your disks.
  the current way to do it with MD on partitions is complex, and with
  LVM over MD it is really hard to do right.
- stackable devices are harder to maintain, a single interface to deal
  with mirroring and volume management would be easier.
- we won't have any more problems with 'switching cache buffer size' :))))

(1) yes i know they are mirrored by oracle, but having a fs unavailable
due to disk failure is a pita anyway
(2) a dba will tell you to use different disks, but i never found
anyone willing to use four 73GB disks for redo logs
  

>[[ I just had this really sick idea of creating a raid level that did
>data duplication (aka mirroring) for the first N stripes, and
I had another sick idea of teaching lilo how to do raid5, but it won't
fit in 512 bytes. anyway, for the normal MD-on-partitions case one can
create one n-way raid1 for /boot and raid5 for the rest

>I really think the raid1/raid5 parts of MD are distinctly different
>from DM, and they should remain separate.  However I am quite happy to
>improve the interfaces so that seamless connections can be presented
>by user-space tools.

reading this, it looks like the only way dm could get raid is by
reimplementing or copying code from the existing md, thus duplicating
code in the kernel.

>To summarise:  If you want tighter integration between MD and DM, do it
>by defining useful interfaces, not by trying to tie them both together
>into one big lump.

we can think of md as split into these major areas:
1 the superblock interface, which i believe we all agree should go to
  user mode for all the array setup functions, keeping only the
  portion that updates the superblock in kernel space.
2 the raid logic code
3 the interface to the lower block device layer
4 the interface to the upper block device layer
(in md these 3 are tightly coupled)

some of these areas overlap with dm and it could be possible to merge
the duplicated functionality.

having said that, and having looked 'briefly' at the code, i believe
that doing something like this would mean completely reworking the
logic behind md and adding some major parts to dm, or better, to a
separate module.

in my idea we would have:
a core that handles request mapping
metadata plugins for both md superblock format and lvm metadata (those
        would deal with keeping the metadata current with the array
        current status)
layout plugins for raid?, striping, linear, multipath (does this belong
        here or at a different level?)
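
to make the split a bit more concrete, here is a rough userspace sketch
in C. every name in it (layout_ops, metadata_ops, core_map_request and
so on) is hypothetical and made up for illustration; none of it is
existing md or dm code, and only the linear and striping layouts are
shown because they map one logical sector to exactly one device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* all names here are hypothetical; this is not existing md or dm code */
typedef uint64_t sector_t;

struct target { size_t dev; sector_t sector; };  /* where a request lands */

struct array;

/* layout plugin: turns a logical sector into (device, sector) */
struct layout_ops {
    const char *name;
    struct target (*map)(const struct array *a, sector_t logical);
};

/* metadata plugin: keeps the metadata current with the array status */
struct metadata_ops {
    const char *name;
    void (*mark_failed)(struct array *a, size_t dev);
};

struct array {
    size_t ndevs;
    sector_t dev_sectors;   /* usable sectors per device */
    sector_t chunk;         /* chunk size in sectors (striping only) */
    unsigned failed_mask;   /* stand-in for on-disk superblock state */
    const struct layout_ops *layout;
    const struct metadata_ops *meta;
};

/* linear: devices are simply concatenated */
static struct target linear_map(const struct array *a, sector_t s)
{
    struct target t = { (size_t)(s / a->dev_sectors), s % a->dev_sectors };
    return t;
}

/* striping: chunks are dealt round-robin across the devices */
static struct target stripe_map(const struct array *a, sector_t s)
{
    sector_t chunk_no = s / a->chunk;
    struct target t = {
        (size_t)(chunk_no % a->ndevs),
        (chunk_no / a->ndevs) * a->chunk + s % a->chunk
    };
    return t;
}

/* a real plugin would write the superblock; here we only flip a bit */
static void sb_mark_failed(struct array *a, size_t dev)
{
    a->failed_mask |= 1u << dev;
}

static const struct layout_ops linear_layout = { "linear", linear_map };
static const struct layout_ops stripe_layout = { "striping", stripe_map };
static const struct metadata_ops md_sb_meta = { "md-superblock", sb_mark_failed };

/* core: maps requests without knowing the layout or metadata format */
static struct target core_map_request(const struct array *a, sector_t logical)
{
    assert(logical < a->ndevs * a->dev_sectors);
    return a->layout->map(a, logical);
}

static void core_device_failed(struct array *a, size_t dev)
{
    assert(dev < a->ndevs);
    a->meta->mark_failed(a, dev);
}
```

the core only ever calls through the two ops tables, so switching an
array from striping to linear, or from md superblocks to lvm metadata,
would mean swapping a pointer. a raid1 or raid5 layout would need map
to return more than one target (mirrors, parity), so a real interface
would have to be richer than this.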

L.

-- 
Luca Berra -- bluca@comedia.it
        Communication Media & Services S.r.l.
 /"\
 \ /     ASCII RIBBON CAMPAIGN
  X        AGAINST HTML MAIL
 / \
