From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
avishay@gmail.com, jeff@garzik.org, viro@ZenIV.linux.org.uk,
linux-fsdevel@vger.kernel.org, osd-dev@open-osd.org,
linux-kernel@vger.kernel.org,
linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH 7/9] exofs: mkexofs
Date: Wed, 31 Dec 2008 15:57:32 +0000 [thread overview]
Message-ID: <1230739053.3408.74.camel@localhost.localdomain> (raw)
In-Reply-To: <495B8D90.1090004@panasas.com>
On Wed, 2008-12-31 at 17:19 +0200, Boaz Harrosh wrote:
> Andrew Morton wrote:
> > On Tue, 16 Dec 2008 17:33:48 +0200
> > Boaz Harrosh <bharrosh@panasas.com> wrote:
> >
> >> We need a mechanism to prepare the file system (mkfs).
> >> I chose to implement that by means of a couple of
> >> mount-options. Because there is no user-mode API for committing
> >> OSD commands. And also, all this stuff is highly internal to
> >> the file system itself.
> >>
> >> - Added two mount options mkfs=0/1,format=capacity_in_meg, so mkfs/format
> >> can be executed by kernel code just before mount. An mkexofs utility
> >> can now be implemented by means of a script that mounts and unmount the
> >> file system with proper options.
> >
> > Doing mkfs in-kernel is unusual. I don't think the above description
> > sufficiently helps the uninitiated understand why mkfs cannot be done
> > in userspace as usual. Please flesh it out a bit.
>
> There are a few main reasons.
> - There is no user-mode API for initiating OSD commands. Such a subsystem
> would be hundredfold bigger then the mkfs code submitted. I think it would be
> hard and stupid to maintain a complex user-mode API just for creating
> a couple of objects and writing a couple of on disk structures.
This is really a reflection of the whole problem with the OSD paradigm.
In theory, a filesystem on OSD is a thin layer of metadata mapping
objects to files. Get this right and the storage will manage things,
like security and access and attributes (there's even a natural mapping
to the VFS concept of extended attributes). Plus, the storage has
enough information to manage persistence, backups and replication.
The real problem is that no-one has actually managed to come up with a
useful VFS<->OSD mapping layer (even by extending or altering the VFS).
Every filesystem that currently uses OSD has a separate direct OSD
speaking interface (i.e. it slices out the block layer to do this and
talks directly to the storage).
I suppose this could be taken to show that such a layer is impossibly
complex, as you assert, but its lack is reflected in strange looking
design decisions like in-kernel mkfs. It would also mean that there
would be very little layered code sharing between ODS based filesystems.
> - I intend to refactor the code further to make use of more super.c services,
> so to make this addition even smaller. Also future direction of raid over
> multiple objects will make even more kernel infrastructure needed which
> will need even more user-mode code duplication.
> - I anticipate problems that are not yet addressed in this body of work
> but will be in the future, mainly that a single OSD-target (lun) can
> be shared by lots of FSs, and a single FS can span many OSD-targets.
> Some central management is much easier to do in Kernel.
>
> >
> > What are the dependencies for this filesystem code? I assume that it
> > depends on various block- and scsi-level patches? Which ones, and
> > what is their status, and is this code even compileable without them?
> >
>
> This OSD-based file system is dependent on the open-osd initiator library
> code that I've submitted for inclusion for 2.6.29. It has been sitting
> in linux-next for a while now, and has not been receiving any comments
> for the last two updated patchsets I've sent to scsi-misc/lkml. However
> it has not yet been submitted into Jame's scsi-misc git tree, and James
> is the ultimate maintainer that should submit this work. I hope it will
> still be submitted into 2.6.29, as this code is totally self sufficient
> and does not endangers or changes any other Kernel subsystems.
> (All the needed ground work was already submitted to Linus since 2.6.26)
> So why should it not?
I don't like it mainly because it's not truly a useful general framework
for others to build on. However, as argued above, there might not
actually be such a useful framework, so as long as the only two
consumers (you and Lustre) want an interface like this, I'll put it in.
James
next prev parent reply other threads:[~2008-12-31 15:57 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-16 14:48 [PATCHSET 0/9] exofs (was osdfs) Boaz Harrosh
2008-12-16 14:52 ` [PATCH 1/9] exofs: osd Swiss army knife Boaz Harrosh
2008-12-29 20:29 ` Andrew Morton
2008-12-31 15:33 ` Boaz Harrosh
2008-12-31 19:26 ` Andrew Morton
2009-01-01 14:44 ` Boaz Harrosh
2009-01-02 16:52 ` Pavel Machek
2009-01-04 8:43 ` Boaz Harrosh
2009-01-04 20:03 ` Pavel Machek
2009-01-05 9:01 ` Boaz Harrosh
2009-01-05 9:36 ` Pavel Machek
2008-12-16 15:15 ` Boaz Harrosh
2009-01-07 15:47 ` [osd-dev] " Benny Halevy
2009-01-13 13:55 ` Alan Cox
2009-01-13 14:43 ` Boaz Harrosh
2009-01-13 14:52 ` Boaz Harrosh
2009-01-13 15:09 ` Jamie Lokier
2009-01-13 15:17 ` Jeff Garzik
2009-01-13 15:28 ` Benny Halevy
2008-12-16 15:17 ` [PATCH 2/9] exofs: file and file_inode operations Boaz Harrosh
2008-12-29 20:34 ` Andrew Morton
2008-12-31 15:36 ` Boaz Harrosh
2008-12-16 15:21 ` [PATCH 3/9] exofs: symlink_inode and fast_symlink_inode operations Boaz Harrosh
2008-12-29 20:35 ` Andrew Morton
2008-12-16 15:22 ` [PATCH 4/9] exofs: address_space_operations Boaz Harrosh
2008-12-29 20:45 ` Andrew Morton
2008-12-31 15:35 ` Boaz Harrosh
2008-12-16 15:28 ` [PATCH 5/9] exofs: dir_inode and directory operations Boaz Harrosh
2008-12-29 20:47 ` Andrew Morton
2008-12-31 15:33 ` Boaz Harrosh
2008-12-16 15:31 ` [PATCH 6/9] exofs: super_operations and file_system_type Boaz Harrosh
2008-12-17 22:23 ` Marcin Slusarz
2008-12-18 8:41 ` Boaz Harrosh
2008-12-29 20:50 ` Andrew Morton
2008-12-16 15:33 ` [PATCH 7/9] exofs: mkexofs Boaz Harrosh
2008-12-29 20:14 ` Andrew Morton
2008-12-31 15:19 ` Boaz Harrosh
2008-12-31 15:57 ` James Bottomley [this message]
2009-01-01 9:22 ` [osd-dev] " Benny Halevy
2009-01-01 9:54 ` Jeff Garzik
2009-01-01 14:23 ` Benny Halevy
2009-01-01 14:28 ` Matthew Wilcox
2009-01-01 18:12 ` Jörn Engel
2009-01-01 23:26 ` J. Bruce Fields
2009-01-02 7:14 ` Benny Halevy
2009-01-04 15:20 ` Boaz Harrosh
2009-01-04 15:38 ` Christoph Hellwig
2009-01-12 18:12 ` James Bottomley
2009-01-12 19:23 ` Jeff Garzik
2009-01-12 19:56 ` James Bottomley
2009-01-12 20:22 ` Jeff Garzik
2009-01-12 23:25 ` James Bottomley
2009-01-13 13:03 ` [osd-dev] " Benny Halevy
2009-01-13 13:24 ` Jeff Garzik
2009-01-13 13:32 ` Benny Halevy
2009-01-13 13:44 ` Jeff Garzik
2009-01-13 14:03 ` Alan Cox
2009-01-13 14:17 ` Jeff Garzik
2009-01-13 16:14 ` Alan Cox
2009-01-13 17:21 ` Boaz Harrosh
2009-01-21 18:13 ` Jeff Garzik
2009-01-21 18:44 ` Boaz Harrosh
2009-01-12 22:48 ` Jamie Lokier
2009-01-06 8:40 ` Andreas Dilger
2008-12-31 19:25 ` Andrew Morton
2009-01-01 13:33 ` Boaz Harrosh
2009-01-02 22:46 ` James Bottomley
2009-01-04 8:59 ` Boaz Harrosh
2008-12-16 15:36 ` [PATCH 8/9] exofs: Documentation Boaz Harrosh
2008-12-18 7:47 ` Pavel Machek
2008-12-18 8:32 ` Boaz Harrosh
2008-12-16 15:38 ` [PATCH 9/9] fs: Add exofs to Kernel build Boaz Harrosh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1230739053.3408.74.camel@localhost.localdomain \
--to=james.bottomley@hansenpartnership.com \
--cc=akpm@linux-foundation.org \
--cc=avishay@gmail.com \
--cc=bharrosh@panasas.com \
--cc=jeff@garzik.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=osd-dev@open-osd.org \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox