From: Christoph Hellwig <hch@caldera.de>
To: unlisted-recipients:; (no To-header on input)@localhost.localdomain
Cc: linux-kernel@vger.kernel.org
Subject: Re: IBMs LVM ?
Date: Tue, 11 Sep 2001 21:57:35 +0200
Message-ID: <20010911215735.A14883@caldera.de>
In-Reply-To: <3B9E255C.8943D6BB@uni-mb.si> <200109111526.f8BFQLr25266@ns.caldera.de> <20010911115713.D29347@turbolinux.com> <20010911200633.A5816@caldera.de> <20010911131531.E29347@turbolinux.com>
In-Reply-To: <20010911131531.E29347@turbolinux.com>; from adilger@turbolabs.com on Tue, Sep 11, 2001 at 01:15:31PM -0600

On Tue, Sep 11, 2001 at 01:15:31PM -0600, Andreas Dilger wrote:
> Well, you don't get anything for free.  You need to have some abstraction
> in order to handle multiple LVM-type systems in a useful way, I don't see
> any way around that.  As for the kernel code, I don't see TOO much
> abstraction.
> 
> - You need code to read each separate on-disk format (partition headers,
>   LVM metadata, etc)
> - You need code to combine each of these into some useful "volume"
> - You need code to do mapping from input volume/block to output disk/block
> - You need code to be able to modify a volume metadata.

Yupp.
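
To make the third item in the list concrete: a minimal sketch of the
volume/block to disk/block mapping, for the simplest case of a volume that
is a plain concatenation of extents.  This is Python for illustration only;
it is not kernel code and is not taken from EVMS, LVM or MD.  The names
Extent, LinearVolume and map_block are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on one underlying disk (hypothetical type)."""
    disk: str    # label of the underlying disk, e.g. "sda"
    start: int   # first block of the run on that disk
    length: int  # number of blocks in the run


class LinearVolume:
    """A volume built by concatenating extents in the order given."""

    def __init__(self, extents):
        self.extents = list(extents)
        # offsets[i] is the first volume block covered by extents[i]
        self.offsets = []
        total = 0
        for ext in self.extents:
            self.offsets.append(total)
            total += ext.length
        self.size = total

    def map_block(self, block):
        """Translate a volume block number into (disk, block on that disk)."""
        if not 0 <= block < self.size:
            raise ValueError(f"block {block} outside volume of {self.size} blocks")
        # Find the last extent whose starting offset is <= block.
        i = bisect_right(self.offsets, block) - 1
        ext = self.extents[i]
        return ext.disk, ext.start + (block - self.offsets[i])
```

Striping or mirroring would change only map_block; reading the on-disk
metadata and building the extent list are the separate steps named in the
other list items.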


> You need code to do this for each different metadata format or partition
> type.  Luckily (hopefully?) most people won't have systems with more than
> a couple different metadata formats at a time.  HOWEVER, you NEED to be
> able to migrate from one format to another, so you will have periods when
> you have extra modules loaded during a migration.

No - there is no reason we need support for migration in our volume
management system.  The reason to have a framework is exactly that we
can have multiple different formats.  If there is a tool to provide
migration it's nice, but it shouldn't complicate the actual volume
management core.

> > god-mode that allows even root to do anything and we come back.
> 
> It is not clear if you are for or against "god-mode"?  Would you
> like it so that it is easy to shoot themselves in the head (or it
> is so complex/manual that it is hard not to), or rather everyone in
> straight-jackets in rubber rooms (ala Windows)?  Some safety is needed,
> but there are also reasons to bypass that safety.  I don't know what
> the current state of "god-mode" is in the EVMS code (if it is there,
> needed, whatever, so it is a moot point).

In Unix we have the superuser (or, more precisely, the capabilities model),
no need for a damn god mode.

> Exactly.  It was me and the EVMS LVM-emulation author that pointed out to
> the Sistina folks that there was NO need to have an incompatible change
> to the on-disk layout.  In fact, the EVMS code can handle BOTH formats,
> and did so before LVM did.

Of course they don't, but Sistina doesn't like staying with one version
more than a few months.  See my compat code to support 0.8 volumes on 0.9
that got rejected.

> But, strangely it didn't make it into the LVM codebase.  Why is that?

Why didn't all your nice stuff get merged? :P

> I'm not blaming you, of course.  EVMS is at least a SourceForge project.

With the problem that the IBM guys ignore me since someone told them that
I'm "not part of the OpenSource movement".  I started sending fixes in
the very beginning..

> Well, there are obvious reasons for that.  Some people won't want to have
> EVMS in the kernel (clearly you are one of them), so their patch can't
> just rip out the existing partition scanning code.  Maybe there will be a
> config option to not compile it later, or there will be an "EVMS-lite"
> which handles basic partitions, or whatever.

The naming doesn't matter.  (BTW, EVMS is a damn stupid name..).
Either we do the - as you say - VFS for block devices or we don't - it doesn't
make sense to do it conditionally.  Whether the complicated drivers are actually
used, and whether people want to use the integrated userspace solutions, is
a very different question.

> 
> In any case, I doubt an EVMS patch would be accepted if it removed that
> code to start with.

I think such a patch would be much more likely to be accepted.  You know Linus
(and we all :)) likes ripping out code.

If you come and say: this patch nukes all special cases for MD, LVM and
partition handling, the code is now X lines shorter, I bet he will like it.

> HOWEVER, since EVMS can handle ALL disk devices,

A few days ago I read that thread on the EVMS list about only detecting
SCSI and IDE drives so far - I probably got that wrong?..

> even just regular partitions, at some point it COULD be possible to get
> rid of the mess of different major/minor numbers for different disk types
> (hdX, sdX, cciss, rd, ida, etc) and assign all of them to EVMS.  Since EVMS
> only needs a minor number for the end-result volume (which may represent
> many individual disks, and doesn't need a minor for unused partitions),
> we would likely not have any shortage of block major numbers.  Consider
> 10*hdX*256 + 8*sdX*256 + MD*256 + 8*CCISS*256 + 8*DAC960*256 + 8*IDA*256...
> and it is a long time until you have 10k+ VISIBLE volumes on a single system.

Yes, that's a nice thing and in fact Linus already posted his opinion
about all disk drivers sharing the major numbers (for 2.5?).

But that just needs a small assignment layer (like sound_core), not
something as complex as EVMS.  Or a bug-free devfs.
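
A rough sketch of what such an assignment layer amounts to, again in
Python for illustration only.  It is not the sound_core interface; the
MinorAllocator class and its register/unregister methods are hypothetical.
The major counts are the ones from the quoted estimate, and 256 minors per
major assumes the 8-bit minor numbers of current kernels.

```python
MINORS_PER_MAJOR = 256  # assumption: 8-bit minor numbers

# Majors counted in the quoted estimate: 10 for hdX, 8 for sdX, 1 for MD,
# and 8 each for CCISS, DAC960 and IDA.
QUOTED_MAJORS = {"hd": 10, "sd": 8, "md": 1, "cciss": 8, "dac960": 8, "ida": 8}
TOTAL_MINORS = sum(QUOTED_MAJORS.values()) * MINORS_PER_MAJOR  # 43 * 256 = 11008


class MinorAllocator:
    """Hands out (major, minor) pairs from a pool of shared majors."""

    def __init__(self, majors):
        self.owner = {}  # (major, minor) -> name of the driver holding it
        # Free list kept in reverse so that pop() returns the lowest pair first.
        self.free = [(maj, mnr)
                     for maj in majors
                     for mnr in range(MINORS_PER_MAJOR)]
        self.free.reverse()

    def register(self, driver):
        """Assign the next free device number to the named driver."""
        if not self.free:
            raise RuntimeError("no free minor numbers left")
        dev = self.free.pop()
        self.owner[dev] = driver
        return dev

    def unregister(self, dev):
        """Return a device number to the pool; it is reused first."""
        del self.owner[dev]
        self.free.append(dev)
```

With the 43 majors above shared this way there are 11008 device numbers
in one pool, which matches the "10k+" figure in the quoted text.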

> 
> > Yes, because it is a Meta-LVM, not an actual implementation.
> > I _really_ want something like that in 2.5 - but not this horrible IBM
> > implementation.
> 
> Maybe you can help them work on it?

If they stop ignoring me - sure.

> They have recently just redone a lot
> of the code, partly based on input from Andrew Clausen.  I don't see any
> other similar projects out there, and I think it is a waste of effort to
> complain about work that is done rather than fix it.  I tried for a long
> time (while it was still my job to do so) to fix the current Linux-LVM
> code, and I only had very minimal impact.  Even so, Linux-LVM will never
> become a Meta-LVM without AT LEAST as much "mess" as EVMS, and probably
> will contain a lot more.

Linux-LVM doesn't even try to be a "Meta-LVM".

> 
> IBM is at least doing something about this by putting their code where
> their mouth is.  Nothing is ever perfect.  Consider MD RAID - it isn't
> even compatible between "stock" 2.2 and 2.2+Mingo-0.9RAID patches (which
> is one reason why Alan never accepted the 0.9 MD RAID into 2.2, despite the
> fact that ALL distros used the 0.9 MD RAID patches).

It's a different issue.  For 2.4 it should be very easy to support both;
2.2 is missing a generic block remapping interface.

> > Nope - I'm currently working on implementing VxVM support, and I have
> > to redo all the RAID stuff because it is so incompatible.
> 
> The question is - will the VxVM support be yet ANOTHER separate code base
> in the kernel?

Of course I plug into one of the many generic volume management frameworks
that are available - that's why we have this thread, heh?

> You complain about code bloat in EVMS, but having LVM, MD,
> VxVM, NT LDM, etc. all separate is also code bloat,

I think all these together are smaller than EVMS.
At least they don't have their own 200k linked list implementation :P

> > If you have a design to share the RAID engine for very different layouts:
> > nice - but I don't see the relation to EVMS.
> 
> No, I can't say I do - it was purely speculation based on the concept that
> RAID-1 and RAID-5 are GENERALLY the same concept.  Maybe the reason VxVM
> is so different is that it also incorporates "LVM" style function as well?

Yes.  It has a completely different layering.  FreeBSD Vinum has a structure
that is somewhat similar if you want to take a look at another open source
project.

> Note that NT-LDM ALSO has RAID-0/1/5 modes (not yet supported by the LDM
> driver) which will eventually need support.  Do you want the number of RAID
> implementations in the kernel to grow without bound, or rather try and
> consolidate them (where possible) into a generic interface like EVMS?

You forgot ATA fakeraid by arjan :)

	Christoph

-- 
Of course it doesn't work. We've performed a software upgrade.
