public inbox for linux-kernel@vger.kernel.org
* GFS2 Filesystem [0/16]
@ 2006-02-24 14:48 Steven Whitehouse
  2006-02-24 21:35 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Steven Whitehouse @ 2006-02-24 14:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Teigland, linux-kernel

Hi,

The following 16 patches make up the GFS2 filesystem as contained in the
git tree at:

http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=summary

Please consider GFS2 for inclusion in your -mm series of kernel patches.
The DLM is not included in this patch series since that is already in
-mm. The patches are relative to Linus' latest kernel (well, as of
yesterday, when I last updated the git tree).

There are some slight differences between the DLM in the git tree and
that in -mm. At Ingo Molnar's suggestion, it has been moved to the
fs/dlm directory in the git tree, and the (unused) range locking
feature has been removed in the git tree version. Otherwise the two are
identical.

Below are some release notes which explain a bit more about GFS2 along
with pointers to documentation etc. I believe that we've taken into
account all the points which were raised in the comments from our last
posting to linux-kernel but see below for the detailed list,

Steve.

-------------------------------------------------------------------------------------------------
Release notes / State of the Union for GFS2

1. Relationship with GFS1
2. New features
3. Known issues (to be fixed before submission to Linus)
4. Some items from our TODO list
5. Where to find things....

1. Relationship with GFS1

Following a review of the metadata in GFS2, most of the metadata is now
compatible between GFS1 and GFS2, making the writing of an upgrade tool
a relatively trivial operation. The differences in the ondisk metadata
between GFS1 and GFS2 are:

 a) The superblock has different magic numbers to indicate the new
    filesystem format.
 b) The indirect pointer blocks have pointers starting at a different
    offset to GFS1.
 c) The addition of the .gfs2_admin directory means that some new
    inodes would need to be added in order to upgrade a filesystem.
    The journals are now represented on disk as normal inodes as opposed
    to the extent based system of GFS1.
 d) The ondisk format for data has been changed _only_ in the case where
    that data is journaled. The new format for journaled data is in fact
    identical to the format for non-journaled data (i.e. the metadata
    header which used to be at the start of every journaled block is now
    no longer used for data blocks). Note that this change has resulted
    in a number of advantages outlined below (see 2(a)).
 e) In some cases, fields used in GFS1 are no longer used. These are
    left as padding fields in order to ease the upgrade procedure.


2. New features (since last posting to the kernel list)

 a) Journaled data files can now be:
    i) mmap()ed
   ii) exported via NFS
  iii) converted to/from normal files at any time
       (N.B. GFS1 had a restriction that conversion could only happen
       when files were zero sized)

 b) The .gfs2_admin directory exposes the internal files that GFS uses
    to store various bits of file system related information. This means
    that we've been able to remove virtually all the ioctl() calls from
    GFS2. There is one ioctl() call left which relates to
    getting/setting GFS2 specific flags on files. The various GFS2 tools
    will be updated in due course to use this new interface.

 c) Sparse annotation for the ondisk structures. (See also 3(e))

 d) vm_walk() and friends removed. All I/O is via the page cache now
    (aside from direct I/O of course).

 e) Recovery should be slightly faster since we now no longer need to
    read disk blocks from the journal which appear in the revoke list
    at recovery time.

 f) Many minor bug fixes and cleanups

 g) The code has also got smaller since the last posting to linux-kernel
    by approx 40k
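
As a rough illustration of item 2(e), the sketch below models a journal
replay pass that consults the revoke list before touching a journaled
block, so revoked blocks are never read at all. This is not the GFS2
recovery code: struct journal_entry, is_revoked() and replay_journal()
are hypothetical names, and the revoke list is assumed to be an array of
block numbers sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one journaled block. Only its destination
 * block number matters for this sketch. */
struct journal_entry {
    uint64_t blkno;
};

/* Binary search over a revoke list sorted in ascending order.
 * Returns 1 if blkno is revoked, 0 otherwise. */
static int is_revoked(const uint64_t *revokes, size_t n, uint64_t blkno)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (revokes[mid] == blkno)
            return 1;
        if (revokes[mid] < blkno)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}

/* Walk the journal and "read" only the blocks that are not revoked.
 * out[] receives the block numbers that would be replayed and must
 * have room for n_entries items. Returns the number of blocks read. */
static size_t replay_journal(const struct journal_entry *entries,
                             size_t n_entries,
                             const uint64_t *revokes, size_t n_revokes,
                             uint64_t *out)
{
    size_t i, nread = 0;

    assert(entries != NULL || n_entries == 0);
    for (i = 0; i < n_entries; i++) {
        if (is_revoked(revokes, n_revokes, entries[i].blkno))
            continue;   /* skip the disk read entirely */
        out[nread++] = entries[i].blkno;
    }
    return nread;
}
```

The saving comes from doing the revoke check before the read rather
than after it; the lookup structure itself (sorted array here) is an
arbitrary choice for the sketch.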

3. Known issues (to be fixed before submission to Linus)

 a) Deadlock between page locks and GFS2's glocks.
    We intend fixing this in the same way that the OCFS2 file system
    does, i.e. adding the AOP_TRUNCATED_PAGE return code into the
    glock code at a suitable point.

 b) Protection of GFS2 system files under .gfs2_admin. Currently, due
    to the way in which GFS2's locking works, it's possible under some
    circumstances to hang a process by accessing a system file that's
    in use. This is mainly a problem with the journal files. We intend
    to add some special casing to prevent this from happening.

 c) SELinux support will be integrated.

 d) Various userland tools are to be updated; currently mkfs is the only
    working userland program for GFS2.

 e) Remove the remainder of the endian conversion functions which are
    in ondisk.c (quite a few have gone already) in favour of changing
    the fields directly. This will remove a lot of sparse annotation
    warnings.
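
To make item 3(e) concrete, here is a minimal userspace sketch of the
two styles being contrasted: a conversion function that copies a whole
ondisk structure into a CPU-endian mirror, versus converting only the
field being touched. It assumes big-endian ondisk fields. All names
(struct disk_header, header_in(), header_set_type(), load_be32(),
store_be32()) are hypothetical, and the byte-array fields are a
stand-in for the endian-annotated types and be32_to_cpu()-style helpers
that kernel code would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ondisk header; each field is stored big-endian. */
struct disk_header {
    uint8_t magic[4];
    uint8_t type[4];
};

/* CPU-endian mirror used by the "convert everything" style. */
struct host_header {
    uint32_t magic;
    uint32_t type;
};

/* Portable big-endian load. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Portable big-endian store. */
static void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Style 1: whole-structure conversion into a host-endian copy. */
static void header_in(struct host_header *out, const struct disk_header *in)
{
    assert(out != NULL && in != NULL);
    out->magic = load_be32(in->magic);
    out->type  = load_be32(in->type);
}

/* Style 2: change one field directly in the ondisk buffer. */
static void header_set_type(struct disk_header *hdr, uint32_t type)
{
    assert(hdr != NULL);
    store_be32(hdr->type, type);
}
```

With the second style there is no intermediate copy to keep in sync,
and each conversion sits next to the field access it belongs to.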

4. Some items from our TODO list (probably post-integration, but things
   we would like to do)

 a) Support for denying of write access to currently executing binaries.
    (Currently only works correctly on single node file systems, see the
    thread "Re: FMODE_EXEC or alike?" on linux-kernel/linux-fsdevel)

 b) Moving list of resource groups into a tree or similar structure
    sorted by disk location. This should then allow removal of the
    various sorts done in the deallocation code (since the resource
    groups will be pre-sorted) and also remove the requirement for
    the associated memory allocations.
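
A minimal sketch of the idea in 4(b), assuming a sorted array stands in
for the "tree or similar structure" and that resource groups do not
overlap. struct rgrp, struct rgrp_index, rgrp_insert() and
rgrp_lookup() are hypothetical names, not the GFS2 ones. Because
entries are kept ordered by disk location at insertion time, a later
walk over them needs neither a sort nor a temporary allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RGRPS 64   /* arbitrary capacity for the sketch */

/* Hypothetical resource group descriptor: a run of blocks on disk. */
struct rgrp {
    uint64_t start;   /* first block of the group */
    uint64_t len;     /* number of blocks in the group */
};

struct rgrp_index {
    struct rgrp v[MAX_RGRPS];
    size_t n;
};

/* Insert while keeping v[] ordered by start block.
 * Returns 0 on success, -1 if the index is full. */
static int rgrp_insert(struct rgrp_index *idx, struct rgrp rg)
{
    size_t i;

    assert(idx != NULL);
    if (idx->n == MAX_RGRPS)
        return -1;
    i = idx->n;
    while (i > 0 && idx->v[i - 1].start > rg.start) {
        idx->v[i] = idx->v[i - 1];   /* shift larger entries up */
        i--;
    }
    idx->v[i] = rg;
    idx->n++;
    return 0;
}

/* Find the group containing blkno by binary search, or NULL. */
static const struct rgrp *rgrp_lookup(const struct rgrp_index *idx,
                                      uint64_t blkno)
{
    size_t lo = 0, hi = idx->n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const struct rgrp *rg = &idx->v[mid];

        if (blkno < rg->start)
            hi = mid;
        else if (blkno >= rg->start + rg->len)
            lo = mid + 1;
        else
            return rg;
    }
    return NULL;
}
```

A balanced tree would give cheaper insertion than the array shifting
shown here; the point of the sketch is only that ordering is paid for
once, at insertion.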

5. Where to find things....

GFS2 and DLM kernel code is in a GIT tree at kernel.org:

http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=summary

The mkfs program is currently in the CVS head, details can be found at:

http://sources.redhat.com/cluster/

Also, I'll put a tarball version of mkfs in my directory on kernel.org.
mkfs is not currently hooked into the build system in CVS. Just a simple
make, make install (after editing the Makefile to point it at your
kernel source) should do the trick. This is all you should need to test
GFS in single node mode.

To use GFS2 in clustered mode, see the more detailed instructions on
the cluster page (url above).






* Re: GFS2 Filesystem [0/16]
  2006-02-24 14:48 GFS2 Filesystem [0/16] Steven Whitehouse
@ 2006-02-24 21:35 ` Christoph Hellwig
  2006-02-27  9:03   ` Steven Whitehouse
  2006-02-28 17:18   ` Phillip Susi
  2006-02-24 21:36 ` Christoph Hellwig
  2006-02-24 23:52 ` Andrew Morton
  2 siblings, 2 replies; 8+ messages in thread
From: Christoph Hellwig @ 2006-02-24 21:35 UTC (permalink / raw)
  To: Steven Whitehouse; +Cc: Andrew Morton, David Teigland, linux-kernel

>  b) The .gfs2_admin directory exposes the internal files that GFS uses
>     to store various bits of file system related information. This means
>     that we've been able to remove virtually all the ioctl() calls from
>     GFS2. There is one ioctl() call left which relates to
>     getting/setting GFS2 specific flags on files. The various GFS2 tools
>     will be updated in due course to use this new interface.

Without even looking at the code, a strong NACK here.  This pollutes
the namespace, which is not acceptable.  Please implement a second
filesystem type, gfsmeta, to do this kind of admin work.  Search for
ext2meta, which did something similar.  Or use a completely different
approach; I'd need to look at the actual functionality provided to give
better advice, but currently I lack the time for that.



* Re: GFS2 Filesystem [0/16]
  2006-02-24 14:48 GFS2 Filesystem [0/16] Steven Whitehouse
  2006-02-24 21:35 ` Christoph Hellwig
@ 2006-02-24 21:36 ` Christoph Hellwig
  2006-02-24 23:52 ` Andrew Morton
  2 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2006-02-24 21:36 UTC (permalink / raw)
  To: Steven Whitehouse; +Cc: Andrew Morton, David Teigland, linux-kernel

Oh, and please look at Andrew's guidelines for submitting patches.
Giving every mail the same subject, modulo the patch numbering, is
not exactly helpful.


* Re: GFS2 Filesystem [0/16]
  2006-02-24 14:48 GFS2 Filesystem [0/16] Steven Whitehouse
  2006-02-24 21:35 ` Christoph Hellwig
  2006-02-24 21:36 ` Christoph Hellwig
@ 2006-02-24 23:52 ` Andrew Morton
  2 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2006-02-24 23:52 UTC (permalink / raw)
  To: Steven Whitehouse; +Cc: teigland, linux-kernel

Steven Whitehouse <swhiteho@redhat.com> wrote:
>
> The following 16 patches make up the GFS2 filesystem as contained in the
> git tree at:
> 
> http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=summary
> 
> Please consider GFS2 for inclusion in your -mm series of kernel patches.

Once the various review comments are sorted out I'd prefer that both DLM
and GFS be maintained by you in your git tree (like OCFS2 prior to and
after merge).

That's the most convenient thing for both you and me.  It has the downside
that putting things into git trees tends to hide them from view.  And GFS
needs a lot of viewing before it can proceed further.  That's an ongoing
problem with the git trees.

So, in a way, maintaining DLM and GFS in git trees as I suggest is likely
to retard an upstream merge.   But it would be more convenient.

Helpful, aren't I?


* Re: GFS2 Filesystem [0/16]
  2006-02-24 21:35 ` Christoph Hellwig
@ 2006-02-27  9:03   ` Steven Whitehouse
  2006-02-28 17:18   ` Phillip Susi
  1 sibling, 0 replies; 8+ messages in thread
From: Steven Whitehouse @ 2006-02-27  9:03 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Andrew Morton, David Teigland, linux-kernel

Hi,

On Fri, 2006-02-24 at 21:35 +0000, Christoph Hellwig wrote:
> >  b) The .gfs2_admin directory exposes the internal files that GFS uses
> >     to store various bits of file system related information. This means
> >     that we've been able to remove virtually all the ioctl() calls from
> >     GFS2. There is one ioctl() call left which relates to
> >     getting/setting GFS2 specific flags on files. The various GFS2 tools
> >     will be updated in due course to use this new interface.
> 
> Without even looking at the code a strong NACK here.  This is polluting
> the namespace which is not acceptable.  Please implement a second
> filesystem type gfsmeta to do this kind of admin work.  Search for ext2meta
> which did something similar.  Or use a completely different approach,
> I'd need to look at the actual functionality provided to give a better
> advice, but currently I'm lacking the time for that.
> 
Of all the comments we've received so far, this one raises the most
issues for us. Let me think about this one for a day or two and I'll get
back to you. Ideally we'd like to do it the way you propose, but I need
to check that it doesn't raise any other problems before I commit to
actually doing it,

Steve.




* Re: GFS2 Filesystem [0/16]
  2006-02-24 21:35 ` Christoph Hellwig
  2006-02-27  9:03   ` Steven Whitehouse
@ 2006-02-28 17:18   ` Phillip Susi
  2006-03-02 10:12     ` Steven Whitehouse
  1 sibling, 1 reply; 8+ messages in thread
From: Phillip Susi @ 2006-02-28 17:18 UTC (permalink / raw)
  To: Christoph Hellwig, Steven Whitehouse, Andrew Morton,
	David Teigland, linux-kernel

I'm a bit confused.  Why exactly is this unacceptable, and what exactly 
do you propose instead?  Having an entirely separate mount point that is 
sort of parallel to the main one, but with extra metadata exposed?  So 
instead of /path/to/foo/.gfs2_admin/metafile you'd prefer having a 
separate mount point like /proc/fs/gfs/path/to/foo/metafile?


Christoph Hellwig wrote:
>>  b) The .gfs2_admin directory exposes the internal files that GFS uses
>>     to store various bits of file system related information. This means
>>     that we've been able to remove virtually all the ioctl() calls from
>>     GFS2. There is one ioctl() call left which relates to
>>     getting/setting GFS2 specific flags on files. The various GFS2 tools
>>     will be updated in due course to use this new interface.
> 
> Without even looking at the code a strong NACK here.  This is polluting
> the namespace which is not acceptable.  Please implement a second
> filesystem type gfsmeta to do this kind of admin work.  Search for ext2meta
> which did something similar.  Or use a completely different approach,
> I'd need to look at the actual functionality provided to give a better
> advice, but currently I'm lacking the time for that.
> 



* Re: GFS2 Filesystem [0/16]
  2006-02-28 17:18   ` Phillip Susi
@ 2006-03-02 10:12     ` Steven Whitehouse
  2006-03-02 10:36       ` Al Viro
  0 siblings, 1 reply; 8+ messages in thread
From: Steven Whitehouse @ 2006-03-02 10:12 UTC (permalink / raw)
  To: Phillip Susi
  Cc: Christoph Hellwig, Steven Whitehouse, Andrew Morton,
	David Teigland, linux-kernel

Hi,

On Tue, Feb 28, 2006 at 12:18:31PM -0500, Phillip Susi wrote:
> I'm a bit confused.  Why exactly is this unacceptable, and what exactly 
> do you propose instead?  Having an entirely separate mount point that is 
> sort of parallel to the main one, but with extra metadata exposed?  So 
> instead of /path/to/foo/.gfs2_admin/metafile you'd prefer having a 
> separate mount point like /proc/fs/gfs/path/to/foo/metafile?
>
I believe that is what Christoph is proposing. It does simplify certain
things, not least preventing someone from moving the .gfs2_admin directory
to somewhere other than the root directory of the filesystem or even
removing it completely which would otherwise need to be added as special
cases.

On the other hand, it's not clear to me at the moment exactly how to
implement this, bearing in mind that both the "normal" filesystem and
the metadata filesystem are really one and the same as far as journaling
and locking are concerned. Perhaps what's needed is one fs with two
different roots. I'm still looking into the best way to do this,

Steve.
 
> 
> Christoph Hellwig wrote:
> >> b) The .gfs2_admin directory exposes the internal files that GFS uses
> >>    to store various bits of file system related information. This means
> >>    that we've been able to remove virtually all the ioctl() calls from
> >>    GFS2. There is one ioctl() call left which relates to
> >>    getting/setting GFS2 specific flags on files. The various GFS2 tools
> >>    will be updated in due course to use this new interface.
> >
> >Without even looking at the code a strong NACK here.  This is polluting
> >the namespace which is not acceptable.  Please implement a second
> >filesystem type gfsmeta to do this kind of admin work.  Search for ext2meta
> >which did something similar.  Or use a completely different approach,
> >I'd need to look at the actual functionality provided to give a better
> >advice, but currently I'm lacking the time for that.
> >


* Re: GFS2 Filesystem [0/16]
  2006-03-02 10:12     ` Steven Whitehouse
@ 2006-03-02 10:36       ` Al Viro
  0 siblings, 0 replies; 8+ messages in thread
From: Al Viro @ 2006-03-02 10:36 UTC (permalink / raw)
  To: Steven Whitehouse
  Cc: Phillip Susi, Christoph Hellwig, Steven Whitehouse, Andrew Morton,
	David Teigland, linux-kernel

On Thu, Mar 02, 2006 at 10:12:19AM +0000, Steven Whitehouse wrote:
> Hi,
> 
> On Tue, Feb 28, 2006 at 12:18:31PM -0500, Phillip Susi wrote:
> > I'm a bit confused.  Why exactly is this unacceptable, and what exactly 
> > do you propose instead?  Having an entirely separate mount point that is 
> > sort of parallel to the main one, but with extra metadata exposed?  So 
> > instead of /path/to/foo/.gfs2_admin/metafile you'd prefer having a 
> > separate mount point like /proc/fs/gfs/path/to/foo/metafile?
> >
> I believe that is what Christoph is proposing. It does simplify certain
> things, not least preventing someone from moving the .gfs2_admin directory
> to somewhere other than the root directory of the filesystem or even
> removing it completely which would otherwise need to be added as special
> cases.
> 
> On the otherhand, its not clear to me at the moment, exactly how to
> implement this bearing in mind that both the "normal" filesystem and
> the metadata filesystem are really one and the same as far as journaling
> and locking are concerned. Perhaps what's needed is one fs with two
> different roots. I'm still looking into the best way to do this,

Two superblocks, one keeping a reference to the other.  The filesystem
driver is, of course, a single piece of code, with common locking.  There's
no need to have a common struct super_block for that and no benefit in
doing so - only extra complications.  You can easily register two
filesystem types in the same driver and have ->get_sb() for your metadata
fs parse its arguments in any way it likes.  E.g. by doing pathname lookup
on what would normally be a device name and seeing if it's on a filesystem
of the primary type; if it is - grab a reference to struct super_block of
that fs and work with it.
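
The arrangement described above can be modelled in userspace roughly as
follows. This is a toy sketch, not kernel code: struct sb and struct
fs_type only imitate struct super_block and struct file_system_type,
the get_sb hook here has a simplified, made-up signature, the names
"toyfs" and "toyfsmeta" are invented, and the "pathname lookup" is
reduced to an exact string match against the primary mount point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MOUNTS 8   /* arbitrary table size for the sketch */

struct fs_type;

/* Toy stand-in for struct super_block. */
struct sb {
    const struct fs_type *type;   /* NULL marks a free slot */
    const char *mountpoint;       /* primary sb only */
    struct sb *primary;           /* meta sb only: the sb it refers to */
    int refcount;
};

/* Toy stand-in for struct file_system_type. */
struct fs_type {
    const char *name;
    struct sb *(*get_sb)(const struct fs_type *type, const char *source);
};

static struct sb mounts[MAX_MOUNTS];

static struct sb *primary_get_sb(const struct fs_type *type, const char *source);
static struct sb *meta_get_sb(const struct fs_type *type, const char *source);

/* Two filesystem types provided by the same "driver". */
static const struct fs_type primary_fs = { "toyfs", primary_get_sb };
static const struct fs_type meta_fs    = { "toyfsmeta", meta_get_sb };

static struct sb *alloc_sb(const struct fs_type *type, const char *where)
{
    size_t i;

    for (i = 0; i < MAX_MOUNTS; i++) {
        if (mounts[i].type == NULL) {
            mounts[i].type = type;
            mounts[i].mountpoint = where;
            mounts[i].primary = NULL;
            mounts[i].refcount = 1;
            return &mounts[i];
        }
    }
    return NULL;
}

/* Primary fs: for simplicity, "source" doubles as the mount point. */
static struct sb *primary_get_sb(const struct fs_type *type, const char *source)
{
    assert(source != NULL);
    return alloc_sb(type, source);
}

/* Meta fs: treat "source" as a path, check that it sits on a primary
 * filesystem, then take a reference to that superblock. */
static struct sb *meta_get_sb(const struct fs_type *type, const char *source)
{
    size_t i;

    assert(source != NULL);
    for (i = 0; i < MAX_MOUNTS; i++) {
        struct sb *p = &mounts[i];

        if (p->type == &primary_fs && strcmp(p->mountpoint, source) == 0) {
            struct sb *m = alloc_sb(type, NULL);

            if (m == NULL)
                return NULL;
            m->primary = p;
            p->refcount++;   /* meta sb pins the primary sb */
            return m;
        }
    }
    return NULL;   /* not on a filesystem of the primary type */
}
```

The two superblocks stay distinct objects, and the only link between
them is the counted reference held by the metadata one.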
