* bug with multiple mounts of filesystems in 2.6
@ 2004-07-26 16:29 John S J Anderson
2004-07-26 19:37 ` Trond Myklebust
0 siblings, 1 reply; 7+ messages in thread
From: John S J Anderson @ 2004-07-26 16:29 UTC (permalink / raw)
To: linux-kernel
Hi --
We're working on migrating to the 2.6 kernel series, and one big
problem has popped up: we have a number of NFS mounts that are
mounted read-only in one location and read-write in a distinct
location (on the same machine). With 2.4 series kernels, this worked
without issue, but with 2.6, it doesn't: it's not possible to mount
the same filesystem twice with different options for each mount; the
two mount points have to share the same mount options.
Furthermore, if you do mount one filesystem at two different places,
changing the mount options on one mount point ('mount -o remount,rw
$MOUNT1', for example) also results in the mount options on the
other mount point being changed. Finally, the information returned
by 'mount' about the mount points is wrong -- 'mount' will show (for
example) one mount point being 'rw' and the other being 'ro', when
in fact attempting to use the 'rw' mount point will show it to be
read-only. /proc/mounts correctly indicates that both mounts are
read-only.
Further experimentation has shown:
* the NFS layer is not involved; it is possible to reproduce this
problem using just locally-attached filesystems.
* this affects at least 2.6.3, 2.6.5, and 2.6.7.
* it happens with mount version 2.11r and 2.12.
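The mismatch between what 'mount' prints and what /proc/mounts reports can be checked mechanically. The following is a minimal Python sketch, not a tool from this thread. It assumes that 'mount' takes its listing from /etc/mtab and that both tables use the usual fstab-style columns (device, mount point, type, options). The function names and sample data are made up for illustration.

```python
def ro_rw_by_mount_point(table_text):
    """Map each mount point in an fstab-style table to 'ro' or 'rw'."""
    state = {}
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        state[mount_point] = "ro" if "ro" in options else "rw"
    return state


def stale_entries(mtab_text, proc_mounts_text):
    """Return {mount_point: (mtab_state, kernel_state)} where the two disagree."""
    user = ro_rw_by_mount_point(mtab_text)
    kernel = ro_rw_by_mount_point(proc_mounts_text)
    return {
        mp: (user[mp], kernel[mp])
        for mp in user
        if mp in kernel and user[mp] != kernel[mp]
    }


# Hypothetical contents mirroring the report: mtab claims /mnt/b is rw,
# while the kernel's table shows both mount points as ro.
SAMPLE_MTAB = "/dev/sdb1 /mnt/a ext3 ro 0 0\n/dev/sdb1 /mnt/b ext3 rw 0 0\n"
SAMPLE_PROC = "/dev/sdb1 /mnt/a ext3 ro 0 0\n/dev/sdb1 /mnt/b ext3 ro 0 0\n"

if __name__ == "__main__":
    for mp, (claimed, actual) in stale_entries(SAMPLE_MTAB, SAMPLE_PROC).items():
        print(f"{mp}: mount says {claimed}, kernel says {actual}")
```

On a real machine the two strings would be read from /etc/mtab and /proc/mounts instead of the samples.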
Because the options of the first mount point "win" (i.e., mounting a
filesystem once read-write, and then a second time as read-only,
leaves both mounts in a read-write state), it seems like there's
some sort of caching optimization going on -- but I haven't looked
into the code to find out if that guess is correct.
Any pointers towards restoring the 2.4 behavior welcomed.
thanks,
john.
--
Perhaps you have forgotten that this is an engineering discipline, not
some sort of black magic.
Mark Jason Dominus's Good Advice #11946
* Re: bug with multiple mounts of filesystems in 2.6
2004-07-26 16:29 bug with multiple mounts of filesystems in 2.6 John S J Anderson
@ 2004-07-26 19:37 ` Trond Myklebust
2004-07-26 21:33 ` Mike Waychison
0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2004-07-26 19:37 UTC (permalink / raw)
To: John S J Anderson; +Cc: linux-kernel
On Mon, 26/07/2004 at 12:29, John S J Anderson wrote:
> Hi --
>
> We're working on migrating to the 2.6 kernel series, and one big
> problem has popped up: we have a number of NFS mounts that are
> mounted read-only in one location and read-write in a distinct
> location (on the same machine). With 2.4 series kernels, this worked
> without issue, but with 2.6, it doesn't: it's not possible to mount
> the same filesystem twice with different options for each mount; the
> two mount points have to share the same mount options.
That behaviour is no longer supported as it meant that you would have
different superblocks (and hence different out-of-sync caches) between
the two mount points. It is in any case not a behaviour that is supported on
any other Linux filesystem.
If you want readonly to be an exception, then you will have to move the
MS_RDONLY flag from being a superblock option to being a vfsmount
option, then propagate that vfsmount information down to all the tests
of IS_RDONLY(inode). Not a trivial task, and not one that looms high on
my list of priorities...
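To make the distinction concrete, here is a toy Python model of the two levels. It is only a sketch of the idea, not kernel code. The classes SuperBlock, VfsMount and MountTable and the per-mount read_only field are hypothetical stand-ins for the superblock, the vfsmount, and a per-vfsmount read-only flag.

```python
class SuperBlock:
    """One per filesystem instance; shared by every mount of that device."""

    def __init__(self, device, read_only):
        self.device = device
        self.read_only = read_only  # stands in for MS_RDONLY on the superblock


class VfsMount:
    """One per mount point; several may reference the same SuperBlock."""

    def __init__(self, sb, mount_point):
        self.sb = sb
        self.mount_point = mount_point
        self.read_only = False  # hypothetical per-mount flag (the proposal)


class MountTable:
    def __init__(self):
        self._supers = {}

    def mount(self, device, mount_point, read_only=False):
        sb = self._supers.get(device)
        if sb is None:
            # First mount of this device: its options create the superblock.
            sb = self._supers[device] = SuperBlock(device, read_only)
        # Later mounts reuse the superblock, so their ro/rw request is dropped.
        return VfsMount(sb, mount_point)

    def remount(self, mnt, read_only):
        # Remount edits the shared superblock, so every mount point sees it.
        mnt.sb.read_only = read_only


def is_rdonly(mnt):
    """Superblock-only check: all mounts of a device share one answer."""
    return mnt.sb.read_only


def is_rdonly_per_mount(mnt):
    """Sketch of the proposed check: either level can force read-only."""
    return mnt.sb.read_only or mnt.read_only
```

With this model, mounting a device read-write and then again read-only leaves both mounts writable under is_rdonly(), and a remount through either mount point flips both. Only is_rdonly_per_mount() can tell the two mount points apart, which is why every read-only test would need access to the mount as well as the inode.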
Cheers,
Trond
* Re: bug with multiple mounts of filesystems in 2.6
2004-07-26 19:37 ` Trond Myklebust
@ 2004-07-26 21:33 ` Mike Waychison
2004-07-26 22:30 ` Herbert Poetzl
2004-07-26 22:35 ` Trond Myklebust
0 siblings, 2 replies; 7+ messages in thread
From: Mike Waychison @ 2004-07-26 21:33 UTC (permalink / raw)
To: Trond Myklebust; +Cc: John S J Anderson, linux-kernel, viro, Herbert Poetzl
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Trond Myklebust wrote:
> On Mon, 26/07/2004 at 12:29, John S J Anderson wrote:
>
>> Hi --
>>
>> We're working on migrating to the 2.6 kernel series, and one big
>> problem has popped up: we have a number of NFS mounts that are
>> mounted read-only in one location and read-write in a distinct
>> location (on the same machine). With 2.4 series kernels, this worked
>> without issue, but with 2.6, it doesn't: it's not possible to mount
>> the same filesystem twice with different options for each mount; the
>> two mount points have to share the same mount options.
>
>
> That behaviour is no longer supported as it meant that you would have
> different superblocks (and hence different out-of-sync caches) between
> the two mount points. It is in any case not a behaviour that is supported on
> any other Linux filesystem.
>
How is this any different than having two separate nfs clients accessing
the same nfs export?
> If you want readonly to be an exception, then you will have to move the
> MS_RDONLY flag from being a superblock option to being a vfsmount
> option, then propagate that vfsmount information down to all the tests
> of IS_RDONLY(inode). Not a trivial task, and not one that looms high on
> my list of priorities...
>
Whatever happened to the bind ro patches that were floating around a
couple of months ago?
(http://marc.theaimsgroup.com/?t=107932320200005&r=1&w=2)
What is left in getting this done? Just the touch_file bit Viro
commented on?
- --
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
http://www.sun.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE: The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
iD4DBQFBBXiTdQs4kOxk3/MRAp2CAJ9hLA43GX7breEAuFJp++noSX7hAQCYn7yw
FXXelAMC/NCetjqwC8Q67g==
=rQRx
-----END PGP SIGNATURE-----
* Re: bug with multiple mounts of filesystems in 2.6
2004-07-26 21:33 ` Mike Waychison
@ 2004-07-26 22:30 ` Herbert Poetzl
2004-07-26 22:35 ` Trond Myklebust
1 sibling, 0 replies; 7+ messages in thread
From: Herbert Poetzl @ 2004-07-26 22:30 UTC (permalink / raw)
To: Mike Waychison; +Cc: Trond Myklebust, John S J Anderson, linux-kernel, viro
On Mon, Jul 26, 2004 at 05:33:07PM -0400, Mike Waychison wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Trond Myklebust wrote:
> > On Mon, 26/07/2004 at 12:29, John S J Anderson wrote:
> >
> >> Hi --
> >>
> >> We're working on migrating to the 2.6 kernel series, and one big
> >> problem has popped up: we have a number of NFS mounts that are
> >> mounted read-only in one location and read-write in a distinct
> >> location (on the same machine). With 2.4 series kernels, this worked
> >> without issue, but with 2.6, it doesn't: it's not possible to mount
> >> the same filesystem twice with different options for each mount; the
> >> two mount points have to share the same mount options.
> >
> > That behaviour is no longer supported as it meant that you would have
> > different superblocks (and hence different out-of-sync caches) between
> > the two mount points. It is in any case not a behaviour that is supported on
> > any other Linux filesystem.
>
> How is this any different than having two separate nfs clients accessing
> the same nfs export?
>
> > If you want readonly to be an exception, then you will have to move the
> > MS_RDONLY flag from being a superblock option to being a vfsmount
> > option, then propagate that vfsmount information down to all the tests
> > of IS_RDONLY(inode). Not a trivial task, and not one that looms high on
> > my list of priorities...
>
> Whatever happened to the bind ro patches that were floating around a
> couple of months ago?
> (http://marc.theaimsgroup.com/?t=107932320200005&r=1&w=2)
>
> What is left in getting this done? Just the touch_file bit Viro
> commented on?
I started by submitting the noatime/nodiratime part for inclusion,
but that patch was neither commented on nor merged, so
I put the work on hold. I'm currently planning to update
it for 2.6.8; we'll see.
best,
Herbert
> - --
> Mike Waychison
> Sun Microsystems, Inc.
> 1 (650) 352-5299 voice
> 1 (416) 202-8336 voice
> http://www.sun.com
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> NOTICE: The opinions expressed in this email are held by me,
> and may not represent the views of Sun Microsystems, Inc.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.4 (GNU/Linux)
>
> iD4DBQFBBXiTdQs4kOxk3/MRAp2CAJ9hLA43GX7breEAuFJp++noSX7hAQCYn7yw
> FXXelAMC/NCetjqwC8Q67g==
> =rQRx
> -----END PGP SIGNATURE-----
* Re: bug with multiple mounts of filesystems in 2.6
2004-07-26 21:33 ` Mike Waychison
2004-07-26 22:30 ` Herbert Poetzl
@ 2004-07-26 22:35 ` Trond Myklebust
2004-07-27 0:56 ` Mike Waychison
1 sibling, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2004-07-26 22:35 UTC (permalink / raw)
To: Mike Waychison
Cc: John S J Anderson, linux-kernel, Alexander Viro, Herbert Poetzl
On Mon, 26/07/2004 at 17:33, Mike Waychison wrote:
> How is this any different than having two separate nfs clients accessing
> the same nfs export?
It isn't, but why do you think that should be a reason for allowing it?
By all means feel free to add "mount --bind -oro" capabilities, but it
is neither useful nor necessary to break the NFS caching model in
order to do so.
Cheers,
Trond
* Re: bug with multiple mounts of filesystems in 2.6
2004-07-26 22:35 ` Trond Myklebust
@ 2004-07-27 0:56 ` Mike Waychison
2004-07-27 2:51 ` Trond Myklebust
0 siblings, 1 reply; 7+ messages in thread
From: Mike Waychison @ 2004-07-27 0:56 UTC (permalink / raw)
To: Trond Myklebust
Cc: John S J Anderson, linux-kernel, Alexander Viro, Herbert Poetzl
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Trond Myklebust wrote:
> On Mon, 26/07/2004 at 17:33, Mike Waychison wrote:
>
>
>>How is this any different than having two separate nfs clients accessing
>>the same nfs export?
>
>
> It isn't, but why do you think that should be a reason for allowing it?
>
> By all means feel free to add "mount --bind -oro" capabilities, but it
> is neither useful nor necessary to break the NFS caching model in
> order to do so.
>
Agreed. The two problems are orthogonal. [1]
One example where sharing the super_block is wrong (albeit probably
just an oversight) is that the protocols (udp vs tcp) are not compared
in nfs_compare_super. You could argue that the client fhandles should
be different though; I'm not sure.
Another 'bind mount extension' that would be nice to have at the
vfsmount level might be rsize/wsize, but that is probably a very intrusive
change for nfs and probably not possible. Thoughts?
[1] - I haven't tested mounting nfs ro, and then mounting nfs rw using
the bind extensions. Does nfs make any assumptions about the mount
being ro?
- --
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
http://www.sun.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE: The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
iD8DBQFBBahXdQs4kOxk3/MRAhTCAKCJGOaemEdeDrmtp/tG5Y6fHe+BTgCgkh8v
312wdekZsxms1ShJciogYRQ=
=7Hm7
-----END PGP SIGNATURE-----
* Re: bug with multiple mounts of filesystems in 2.6
2004-07-27 0:56 ` Mike Waychison
@ 2004-07-27 2:51 ` Trond Myklebust
0 siblings, 0 replies; 7+ messages in thread
From: Trond Myklebust @ 2004-07-27 2:51 UTC (permalink / raw)
To: Mike Waychison
Cc: John S J Anderson, linux-kernel, Alexander Viro, Herbert Poetzl
On Mon, 26/07/2004 at 20:56, Mike Waychison wrote:
> One example where sharing the super_block is wrong (albeit probably
> just an oversight) is that the protocols (udp vs tcp) are not compared
> in nfs_compare_super. You could argue that the client fhandles should
> be different though; I'm not sure.
Not an oversight. It's just that we don't really have a model for
trunking a single cache over several different RPC connections.
Basically, it all boils down to the problem that inodes and dentries
have no idea of which namespace you used to access them, and hence even
if you did have space in the vfsmount in which to store an rpc_client
struct and perhaps rsize/wsize ... you still have a reconstruction job
to do.
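As a rough illustration of what sharing a superblock means for per-mount transport settings, here is a toy Python sketch. It is not the kernel's nfs_compare_super. It assumes the comparison key is the server plus the root file handle, and that the transport protocol and rsize stay with whichever mount created the superblock. The names NfsMount, same_super and effective_settings are hypothetical.

```python
from typing import NamedTuple


class NfsMount(NamedTuple):
    server: str
    root_fhandle: bytes
    proto: str  # "udp" or "tcp"
    rsize: int


def same_super(a, b):
    """Toy comparison: proto and rsize are deliberately not part of the key."""
    return a.server == b.server and a.root_fhandle == b.root_fhandle


def effective_settings(requests):
    """Return the (proto, rsize) each mount request actually ends up with."""
    owners = []  # the first request seen for each distinct superblock
    effective = []
    for req in requests:
        owner = next((o for o in owners if same_super(o, req)), None)
        if owner is None:
            owners.append(req)
            owner = req
        # Every mount sharing the superblock uses the owner's connection.
        effective.append((owner.proto, owner.rsize))
    return effective
```

In this model a second mount of the same export that asks for tcp still gets the first mount's udp settings, because there is only one cache and one connection behind the shared superblock.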
> Another 'bind mount extension' that would be nice to have at the
> vfsmount level might be rsize/wsize, but that is probably a very intrusive
> change for nfs and probably not possible. Thoughts?
The "struct nfs_open_context" I introduce in the latest NFS4_ALL patches
could probably be modified to include vfsmount information.
My question is why do people need to do this? What problems does it
solve?
Normally, rsize/wsize are per-server parameters. Their optimal value
depends above all upon the quality of the network link between client
and server. Ditto for the choice of UDP vs TCP: why would you want to
choose to use both options against a given server?
> [1] - I haven't tested mounting nfs ro, and then mounting nfs rw using
> the bind extensions. Does nfs make any assumptions about the mount
> being ro?
Nope, however the VFS support for this is missing. That was part of what
Herbert was doing in his patches.
Cheers,
Trond