* [Cluster-devel] future plans for gfs2-utils: mount.gfs2 and the metafs
From: Steven Whitehouse @ 2009-06-05 10:00 UTC
To: cluster-devel.redhat.com

Hi,

A little while back Fabio suggested that we (as the gfs2 team) should
come up with a medium/long term plan for the gfs2-utils. We already have
a number of projects that we are working on, such as
internationalisation, cleaning up libgfs2 and turning it into a proper
library, removing obsolete options from gfs2_tool, and so forth. This is
an attempt to set out a vision for the future and to maybe generate some
discussion.

I would like to eventually remove gfs2_tool completely, replacing it
with a command called something like tunegfs2 which would take arguments
compatible with the ext2 tune2fs where similar functions make sense. Its
only real function would be to manipulate the on-disk superblock.

Any "tune" parameters which currently exist and are useful would be
replaced by mount arguments. There are very few of these - maybe only
one or two and I think they are all related to quotas. Some more work
remains to be done to verify that.

Another issue is how we mount gfs2 filesystems. I would like to try and
get rid of the mount.gfs2 helper for several reasons. Currently we are
using a different fstype (gfs2meta) to allow access to the GFS2 meta
filesystem. In reality though, we don't mount a different filesystem
type, but the same filesystem type as the "normal" filesystem, but with
a different root. We have also more recently supported the "-o meta"
mount option to mount the meta root directly, but with some
restrictions. Bearing in mind how easy it is to lift those restrictions
(something that I've been discussing with Christoph) I'd like to raise
the possibility of replacing the mount.gfs2 helper with a system which
is very similar to that which we used to replace the umount.gfs2 helper
for similar reasons.

So the plan would be to enhance the mount function of GFS2 so that it is
possible to mount a GFS2 filesystem by allowing multiple mounts
(effectively a bind mount) of that block device with or without the
"-o meta" argument which is used to choose the filesystem root. The
problem of course, is the mount.gfs2 will then not know whether it is
the first mount of the fs, or a further mount of an existing fs unless
it's keeping count of mounts per block device internally.

The solution would be to use the uevent mechanism (probably the DLM's
uevents, but it could be done via the GFS2 ones too I think) to trigger
the loading of the DLM's config, setting of the journal id and whatever
else needs to be done, in a similar way that we use GFS2's umount uevent
to trigger leaving the cpg. It would have a number of advantages:

1. Less userland code to maintain (the changes to gfs_controld would be
   fairly minor)
2. A clean mount interface with no restrictions as to the ordering of
   "normal" and meta mounts
3. gfs2-utils would not need to depend on libgfscontrol - in fact that's
   the only dependency on that library in the utils. Maybe we wouldn't
   need libgfscontrol either...
4. gfs_controld would be able to talk directly to gfs2 and deal with its
   uevents without needing any support from the mount helpers. That
   gives a clean and simple interface to gfs_controld
5. The whole thing could stay backwards compatible too (we'd retain the
   current gfs2meta filesystem type for the time being to ensure that
   all existing userland will continue to work)

It is possible to make these changes a bit at a time, provided we are
careful about the ordering. There are of course a number of details
still to be worked out, but this gives a rough outline of my current
thoughts.

Also, are there any other changes we should be considering making in
gfs2-utils?

Steve.
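
[Editorial sketch, not part of the original message.] The uevent-driven
idea above could look roughly like the following. This is a minimal
illustration under stated assumptions: it relies only on the general
kernel uevent wire format (an "action@devpath" header followed by
NUL-separated KEY=VALUE fields). The ControlDaemon class, the LOCKTABLE
key used for dispatch, and the sample payload are hypothetical and are
not taken from gfs_controld or the GFS2 sources.

```python
def parse_uevent(payload: bytes) -> dict:
    """Split a raw uevent message into a dict of its fields.

    The message is a sequence of NUL-terminated strings: first an
    "action@devpath" header, then KEY=VALUE pairs.
    """
    fields = [f for f in payload.split(b"\0") if f]
    action, _, devpath = fields[0].decode().partition("@")
    event = {"ACTION": action, "DEVPATH": devpath}
    for field in fields[1:]:
        key, sep, value = field.decode().partition("=")
        if sep:  # ignore anything that is not KEY=VALUE
            event[key] = value
    return event


class ControlDaemon:
    """Toy stand-in for a cluster control daemon (hypothetical)."""

    def __init__(self):
        self.joined = set()  # lock tables we have done cluster setup for
        self.log = []        # record of the actions taken

    def handle(self, event: dict) -> None:
        table = event.get("LOCKTABLE")  # assumed key, for illustration
        if table is None:
            return
        if event["ACTION"] == "add" and table not in self.joined:
            # First mount: load lock manager config, set journal id, etc.
            self.joined.add(table)
            self.log.append("join " + table)
        elif event["ACTION"] == "remove" and table in self.joined:
            # Final unmount: undo the cluster setup.
            self.joined.remove(table)
            self.log.append("leave " + table)


# Hypothetical payload, made up for this example.
SAMPLE_ADD = b"add@/fs/gfs2/unity:myfs\0ACTION=add\0LOCKTABLE=unity:myfs\0"
```

Feeding SAMPLE_ADD through parse_uevent() and ControlDaemon.handle()
records a single join for "unity:myfs"; whether the real daemon could be
driven this way is exactly what the thread goes on to discuss.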
* [Cluster-devel] Re: future plans for gfs2-utils: mount.gfs2 and the metafs
From: David Teigland @ 2009-06-05 14:54 UTC
To: cluster-devel.redhat.com

On Fri, Jun 05, 2009 at 11:00:17AM +0100, Steven Whitehouse wrote:
> Another issue is how we mount gfs2 filesystems. I would like to try and
> get rid of the mount.gfs2 helper for several reasons. Currently we are
> using a different fstype (gfs2meta) to allow access to the GFS2 meta
> filesystem. In reality though, we don't mount a different filesystem
> type, but the same filesystem type as the "normal" filesystem, but with
> a different root. We have also more recently supported the "-o meta"
> mount option to mount the meta root directly, but with some
> restrictions. Bearing in mind how easy it is to lift those restrictions
> (something that I've been discussing with Christoph) I'd like to raise
> the possibility of replacing the mount.gfs2 helper with a system which
> is very similar to that which we used to replace the umount.gfs2 helper
> for similar reasons.
>
> So the plan would be to enhance the mount function of GFS2 so that it is
> possible to mount a GFS2 filesystem by allowing multiple mounts
> (effectively a bind mount) of that block device with or without the
> "-o meta" argument which is used to choose the filesystem root. The
> problem of course, is the mount.gfs2 will then not know whether it is
> the first mount of the fs, or a further mount of an existing fs unless
> it's keeping count of mounts per block device internally.

I don't follow your problem description there, could you state it more
explicitly? Give an example (sequence of commands), to demonstrate the
problem (e.g. which command fails or doesn't do the right thing).

> The solution would be to use the uevent mechanism (probably the DLM's
> uevents, but it could be done via the GFS2 ones too I think) to trigger
> the loading of the DLM's config, setting of the journal id and whatever
> else needs to be done, in a similar way that we use GFS2's umount uevent
> to trigger leaving the cpg. It would have a number of advantages:

I'm familiar with using uevents to mount, that's the way it originally
worked in 2005 (gfs Groundhog Day continues):

http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commit;h=2ec0da360f4eba591ecbf5e4dc8ed35b82f4142c

Dave
* [Cluster-devel] Re: future plans for gfs2-utils: mount.gfs2 and the metafs
From: Steven Whitehouse @ 2009-06-05 15:32 UTC
To: cluster-devel.redhat.com

Hi,

On Fri, 2009-06-05 at 09:54 -0500, David Teigland wrote:
> On Fri, Jun 05, 2009 at 11:00:17AM +0100, Steven Whitehouse wrote:
> > Another issue is how we mount gfs2 filesystems. I would like to try and
> > get rid of the mount.gfs2 helper for several reasons. Currently we are
> > using a different fstype (gfs2meta) to allow access to the GFS2 meta
> > filesystem. In reality though, we don't mount a different filesystem
> > type, but the same filesystem type as the "normal" filesystem, but with
> > a different root. We have also more recently supported the "-o meta"
> > mount option to mount the meta root directly, but with some
> > restrictions. Bearing in mind how easy it is to lift those restrictions
> > (something that I've been discussing with Christoph) I'd like to raise
> > the possibility of replacing the mount.gfs2 helper with a system which
> > is very similar to that which we used to replace the umount.gfs2 helper
> > for similar reasons.
> >
> > So the plan would be to enhance the mount function of GFS2 so that it is
> > possible to mount a GFS2 filesystem by allowing multiple mounts
> > (effectively a bind mount) of that block device with or without the
> > "-o meta" argument which is used to choose the filesystem root. The
> > problem of course, is the mount.gfs2 will then not know whether it is
> > the first mount of the fs, or a further mount of an existing fs unless
> > it's keeping count of mounts per block device internally.
>
> I don't follow your problem description there, could you state it more
> explicitly? Give an example (sequence of commands), to demonstrate the
> problem (e.g. which command fails or doesn't do the right thing).
>
I have an fstab entry like this:
/dev/sda7 /mnt/gfs0 gfs2 noauto,rw,data=ordered,lockproto=lock_dlm,locktable=unity:myfs,quota=on,meta 1 2

and then I do:
[root@men-an-tol gfs2-2.6-fixes.git]# mount /mnt/gfs0
[root@men-an-tol gfs2-2.6-fixes.git]# mount -t gfs2 /mnt/gfs0 /mnt/gfs1
/sbin/mount.gfs2: bad read: Invalid argument on line 263 of file
/builddir/build/BUILD/cluster-2.99.12/gfs2/mount/util.c
[root@men-an-tol gfs2-2.6-fixes.git]# mount -t gfs2 /mnt/gfs0 /mnt/gfs1 -o meta
/sbin/mount.gfs2: bad read: Invalid argument on line 263 of file
/builddir/build/BUILD/cluster-2.99.12/gfs2/mount/util.c

which works on a single node lock_nolock without the mount helper.

> > The solution would be to use the uevent mechanism (probably the DLM's
> > uevents, but it could be done via the GFS2 ones too I think) to trigger
> > the loading of the DLM's config, setting of the journal id and whatever
> > else needs to be done, in a similar way that we use GFS2's umount uevent
> > to trigger leaving the cpg. It would have a number of advantages:
>
> I'm familiar with using uevents to mount, that's the way it originally worked
> in 2005 (gfs Groundhog Day continues):
>
> http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commit;h=2ec0da360f4eba591ecbf5e4dc8ed35b82f4142c
>
> Dave
>
Then the question arises, why was it changed?

Steve.
* [Cluster-devel] Re: future plans for gfs2-utils: mount.gfs2 and the metafs
From: David Teigland @ 2009-06-05 16:00 UTC
To: cluster-devel.redhat.com

On Fri, Jun 05, 2009 at 04:32:42PM +0100, Steven Whitehouse wrote:
> Hi,
>
> On Fri, 2009-06-05 at 09:54 -0500, David Teigland wrote:
> > On Fri, Jun 05, 2009 at 11:00:17AM +0100, Steven Whitehouse wrote:
> > > Another issue is how we mount gfs2 filesystems. I would like to try and
> > > get rid of the mount.gfs2 helper for several reasons. Currently we are
> > > using a different fstype (gfs2meta) to allow access to the GFS2 meta
> > > filesystem. In reality though, we don't mount a different filesystem
> > > type, but the same filesystem type as the "normal" filesystem, but with
> > > a different root. We have also more recently supported the "-o meta"
> > > mount option to mount the meta root directly, but with some
> > > restrictions. Bearing in mind how easy it is to lift those restrictions
> > > (something that I've been discussing with Christoph) I'd like to raise
> > > the possibility of replacing the mount.gfs2 helper with a system which
> > > is very similar to that which we used to replace the umount.gfs2 helper
> > > for similar reasons.
> > >
> > > So the plan would be to enhance the mount function of GFS2 so that it is
> > > possible to mount a GFS2 filesystem by allowing multiple mounts
> > > (effectively a bind mount) of that block device with or without the
> > > "-o meta" argument which is used to choose the filesystem root. The
> > > problem of course, is the mount.gfs2 will then not know whether it is
> > > the first mount of the fs, or a further mount of an existing fs unless
> > > it's keeping count of mounts per block device internally.
> >
> > I don't follow your problem description there, could you state it more
> > explicitly? Give an example (sequence of commands), to demonstrate the
> > problem (e.g. which command fails or doesn't do the right thing).
> >
> I have an fstab entry like this:
> /dev/sda7 /mnt/gfs0 gfs2 noauto,rw,data=ordered,lockproto=lock_dlm,locktable=unity:myfs,quota=on,meta 1 2
>
> and then I do:
> [root@men-an-tol gfs2-2.6-fixes.git]# mount /mnt/gfs0
> [root@men-an-tol gfs2-2.6-fixes.git]# mount -t gfs2 /mnt/gfs0 /mnt/gfs1

I don't know what kind of mount <dir1> <dir2> is, some form of bind mount?

> /sbin/mount.gfs2: bad read: Invalid argument on line 263 of file
> /builddir/build/BUILD/cluster-2.99.12/gfs2/mount/util.c

I can't find that line number in any code (version 2.99.12 appears to be
from Oct 2008!?) But, isn't it simply complaining that you've provided
two dirs as input args?

> > I'm familiar with using uevents to mount, that's the way it originally
> > worked in 2005 (gfs Groundhog Day continues):
> >
> > http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commit;h=2ec0da360f4eba591ecbf5e4dc8ed35b82f4142c
> >
> Then the question arises, why was it changed?

Much cleaner and works better. Changing it back again now would require
you to first rewrite most of gfs_controld, and then face up to all the
new problems that doing it differently would present. It would be
rearranging the deck chairs on the Titanic; if you're dying to do major
work on this, just toss out the whole thing and design a much simpler
system that doesn't require so much complex user/kernel interaction (see
ocfs2).

Dave
* [Cluster-devel] Re: future plans for gfs2-utils: mount.gfs2 and the metafs
From: Steven Whitehouse @ 2009-06-05 16:38 UTC
To: cluster-devel.redhat.com

Hi,

On Fri, 2009-06-05 at 11:00 -0500, David Teigland wrote:
> On Fri, Jun 05, 2009 at 04:32:42PM +0100, Steven Whitehouse wrote:
> > Hi,
> >
> > On Fri, 2009-06-05 at 09:54 -0500, David Teigland wrote:
> > > On Fri, Jun 05, 2009 at 11:00:17AM +0100, Steven Whitehouse wrote:
> > > > Another issue is how we mount gfs2 filesystems. I would like to try and
> > > > get rid of the mount.gfs2 helper for several reasons. Currently we are
> > > > using a different fstype (gfs2meta) to allow access to the GFS2 meta
> > > > filesystem. In reality though, we don't mount a different filesystem
> > > > type, but the same filesystem type as the "normal" filesystem, but with
> > > > a different root. We have also more recently supported the "-o meta"
> > > > mount option to mount the meta root directly, but with some
> > > > restrictions. Bearing in mind how easy it is to lift those restrictions
> > > > (something that I've been discussing with Christoph) I'd like to raise
> > > > the possibility of replacing the mount.gfs2 helper with a system which
> > > > is very similar to that which we used to replace the umount.gfs2 helper
> > > > for similar reasons.
> > > >
> > > > So the plan would be to enhance the mount function of GFS2 so that it is
> > > > possible to mount a GFS2 filesystem by allowing multiple mounts
> > > > (effectively a bind mount) of that block device with or without the
> > > > "-o meta" argument which is used to choose the filesystem root. The
> > > > problem of course, is the mount.gfs2 will then not know whether it is
> > > > the first mount of the fs, or a further mount of an existing fs unless
> > > > it's keeping count of mounts per block device internally.
> > >
> > > I don't follow your problem description there, could you state it more
> > > explicitly? Give an example (sequence of commands), to demonstrate the
> > > problem (e.g. which command fails or doesn't do the right thing).
> > >
> > I have an fstab entry like this:
> > /dev/sda7 /mnt/gfs0 gfs2 noauto,rw,data=ordered,lockproto=lock_dlm,locktable=unity:myfs,quota=on,meta 1 2
> >
> > and then I do:
> > [root@men-an-tol gfs2-2.6-fixes.git]# mount /mnt/gfs0
> > [root@men-an-tol gfs2-2.6-fixes.git]# mount -t gfs2 /mnt/gfs0 /mnt/gfs1
>
> I don't know what kind of mount <dir1> <dir2> is, some form of bind mount?
>
Yes. Identical to what was being done with the gfs2meta filesystem type
previously. It's only using /mnt/gfs0 to grab an inode from which to
find out what the block device is. Doing it this way eliminates any
races in mounting the second fs root based upon the first.

> > /sbin/mount.gfs2: bad read: Invalid argument on line 263 of file
> > /builddir/build/BUILD/cluster-2.99.12/gfs2/mount/util.c
>
> I can't find that line number in any code (version 2.99.12 appears to be from
> Oct 2008!?) But, isn't it simply complaining that you've provided two dirs as
> input args?
>
Yes, the message isn't great and we are in the process of cleaning
things like that up. It might be that it is the case, and we can simply
fix that, in which case it makes things easy.

The question is though, if I mount the same fs multiple times, does
mount.gfs2 realise that it's the same fs each time? If so, then I think
we are done in that area.

> > > I'm familiar with using uevents to mount, that's the way it originally
> > > worked in 2005 (gfs Groundhog Day continues):
> > >
> > > http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commit;h=2ec0da360f4eba591ecbf5e4dc8ed35b82f4142c
> > >
> > Then the question arises, why was it changed?
>
> Much cleaner and works better. Changing it back again now would require you
> to first rewrite most of gfs_controld, and then face up to all the new
> problems that doing it differently would present. It would be rearranging the
> deck chairs on the Titanic; if you're dying to do major work on this, just
> toss out the whole thing and design a much simpler system that doesn't require
> so much complex user/kernel interaction (see ocfs2).
>
> Dave
>
Well we can certainly look at what they are doing. I don't think we can
easily change the recovery at this stage, but journal id allocation
might not be so tricky. Are there any docs which describe how it works?

Steve.
* [Cluster-devel] Re: future plans for gfs2-utils: mount.gfs2 and the metafs
From: David Teigland @ 2009-06-05 18:46 UTC
To: cluster-devel.redhat.com

On Fri, Jun 05, 2009 at 05:38:02PM +0100, Steven Whitehouse wrote:
> > I don't know what kind of mount <dir1> <dir2> is, some form of bind mount?
> >
> Yes. Identical to what was being done with the gfs2meta filesystem type
> previously. It's only using /mnt/gfs0 to grab an inode from which to find
> out what the block device is. Doing it this way eliminates any races in
> mounting the second fs root based upon the first.

Ah, I see what you're trying to do now. You're trying to mount the meta
fs by mounting the normal fs with a new option. That's a relatively
small special case that shouldn't be hard to deal with. It's just
"delicate" :-) Reworking major designs on the scale you suggest is
killing a flea with a sledgehammer. Spend more time studying the
details, there's sure to be a small, targeted way of handling it.

> > > /sbin/mount.gfs2: bad read: Invalid argument on line 263 of file
> > > /builddir/build/BUILD/cluster-2.99.12/gfs2/mount/util.c
> >
> > I can't find that line number in any code (version 2.99.12 appears to be
> > from Oct 2008!?) But, isn't it simply complaining that you've provided
> > two dirs as input args?
> >
> Yes, the message isn't great and we are in the process of cleaning things
> like that up. It might be that it is the case, and we can simply fix that,
> in which case it makes things easy.
>
> The question is though, if I mount the same fs multiple times, does
> mount.gfs2 realise that it's the same fs each time? If so, then I think we
> are done in that area.

Mount code (both user and kernel) obviously needs to be adapted to deal
with the special meta option, the user side has never had to before (the
separate gfs2meta type means it never touches the cluster
infrastructure). Eliminating the mount helper and entirely reworking how
mounting happens is not the way to think about this, focus on the
specific problem and make minor changes to deal with it directly.

The mount.gfs/gfs_controld method is pretty simple: the first time a
given fs is mounted, cluster stuff happens prior to mount(2). Subsequent
mounts of the same fs don't involve any cluster activity, and go right
to mount(2).

Per http://www.mail-archive.com/cluster-devel@redhat.com/msg02568.html

we don't do any tracking of multiple mounts of the same fs or reference
counting, we leave it all to the vfs, and reverse the "cluster stuff" on
the final unmount, which the vfs tells us about via uevent.

Dave
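
[Editorial sketch, not part of the original message.] The flow Dave
describes can be modelled in a few lines. This is a toy simulation under
stated assumptions, not the real mount.gfs2 or gfs_controld code: the
ToyVfs and ToyControld classes and the mount_helper() function are
hypothetical names. The point it illustrates is that only the VFS side
counts mounts; the daemon side merely knows whether it has already
joined for a given fs, and it leaves when told about the final unmount.

```python
class ToyControld:
    """Hypothetical stand-in for the cluster daemon."""

    def __init__(self):
        self.joined = set()  # filesystems with cluster setup done
        self.log = []

    def join(self, dev):
        # Cluster activity happens only for the first mount of a given fs.
        if dev in self.joined:
            return
        self.joined.add(dev)
        self.log.append(("join", dev))

    def on_final_unmount(self, dev):
        # Stands in for the uevent sent on the final unmount.
        self.joined.discard(dev)
        self.log.append(("leave", dev))


class ToyVfs:
    """Toy VFS: the only place that tracks how often a device is mounted."""

    def __init__(self, notify_final_unmount):
        self.mounts = {}  # device -> list of mount points
        self.notify_final_unmount = notify_final_unmount

    def mount(self, dev, mountpoint):
        self.mounts.setdefault(dev, []).append(mountpoint)

    def umount(self, dev, mountpoint):
        self.mounts[dev].remove(mountpoint)
        if not self.mounts[dev]:
            del self.mounts[dev]
            self.notify_final_unmount(dev)


def mount_helper(controld, vfs, dev, mountpoint):
    """Cluster setup first (a no-op after the first mount), then mount."""
    controld.join(dev)
    vfs.mount(dev, mountpoint)
```

Mounting /dev/sda7 on two mount points in this model logs one join;
unmounting the first leaves the log unchanged, and unmounting the second
logs the leave.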
* [Cluster-devel] Re: future plans for gfs2-utils: mount.gfs2 and the metafs
From: Steven Whitehouse @ 2009-06-08 8:27 UTC
To: cluster-devel.redhat.com

Hi,

On Fri, 2009-06-05 at 13:46 -0500, David Teigland wrote:
> On Fri, Jun 05, 2009 at 05:38:02PM +0100, Steven Whitehouse wrote:
> > > I don't know what kind of mount <dir1> <dir2> is, some form of bind mount?
> > >
> > Yes. Identical to what was being done with the gfs2meta filesystem type
> > previously. It's only using /mnt/gfs0 to grab an inode from which to find
> > out what the block device is. Doing it this way eliminates any races in
> > mounting the second fs root based upon the first.
>
> Ah, I see what you're trying to do now. You're trying to mount the meta fs by
> mounting the normal fs with a new option. That's a relatively small special
> case that shouldn't be hard to deal with. It's just "delicate" :-) Reworking
> major designs on the scale you suggest is killing a flea with a sledgehammer.
> Spend more time studying the details, there's sure to be a small, targeted way
> of handling it.
>
Yes, in fact we've supported that for some time, but just haven't been
very good at telling people about it. At the original merge we used to
have two fstypes, but only because it was at that time impossible to use
the "two roots" solution without extra exports/changes to core code
which were rather frowned upon.

Since other fs have started using multiple roots, it's now easy and it
was changed some time back so even if you request a mount of type
gfs2meta, you actually get a mount of type gfs2, but with a different
root. Eventually we should be able to drop the second fstype and just
use the "-o meta" option instead.

> > > > /sbin/mount.gfs2: bad read: Invalid argument on line 263 of file
> > > > /builddir/build/BUILD/cluster-2.99.12/gfs2/mount/util.c
> > >
> > > I can't find that line number in any code (version 2.99.12 appears to be
> > > from Oct 2008!?) But, isn't it simply complaining that you've provided
> > > two dirs as input args?
> > >
> > Yes, the message isn't great and we are in the process of cleaning things
> > like that up. It might be that it is the case, and we can simply fix that,
> > in which case it makes things easy.
> >
> > The question is though, if I mount the same fs multiple times, does
> > mount.gfs2 realise that it's the same fs each time? If so, then I think we
> > are done in that area.
>
> Mount code (both user and kernel) obviously needs to be adapted to deal with
> the special meta option, the user side has never had to before (the separate
> gfs2meta type means it never touches the cluster infrastructure). Eliminating
> the mount helper and entirely reworking how mounting happens is not the way to
> think about this, focus on the specific problem and make minor changes to deal
> with it directly.
>
> The mount.gfs/gfs_controld method is pretty simple: the first time a given fs
> is mounted, cluster stuff happens prior to mount(2). Subsequent mounts of the
> same fs don't involve any cluster activity, and go right to mount(2).
>
> Per http://www.mail-archive.com/cluster-devel@redhat.com/msg02568.html
>
> we don't do any tracking of multiple mounts of the same fs or reference
> counting, we leave it all to the vfs, and reverse the "cluster stuff" on the
> final unmount, which the vfs tells us about via uevent.
>
> Dave
>
OK, then it sounds like we don't need to make any changes here at the
moment.

Steve.