* NFS4 mount problem
@ 2005-04-15 11:57 David Howells
2005-04-15 12:21 ` Stephen Rothwell
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: David Howells @ 2005-04-15 11:57 UTC (permalink / raw)
To: linux-fsdevel; +Cc: steved
We've come across an interesting problem with NFS4 mount on a PPC64 box. If
the mount program is compiled as PPC32, then the mount() syscall returns
EFAULT.
It turns out that NFS4 requires potentially more data than can be put in a
single page, something sys_mount() enforces as a hard limit. This being the
case, nfs4_get_sb() expects the mount data to contain auxiliary pointers.
Unfortunately, the PPC32 userspace inserts 32-bit pointers, whilst the PPC64
kernel expects 64-bit pointers...
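As an illustration of the mismatch described above, here is a minimal sketch in C. The structure names (demo_string32, demo_string64) and their fields are hypothetical stand-ins, not the real nfs4_mount_data layout; fixed-width integers model how a 32-bit and a 64-bit ABI would lay out the same "length plus user pointer" pair.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "length + user pointer" pair as 32-bit userspace lays it out. */
struct demo_string32 {
    uint32_t len;
    uint32_t data;   /* models a 32-bit user pointer */
};

/* The same pair as a 64-bit kernel would expect to find it. */
struct demo_string64 {
    uint32_t len;
    uint64_t data;   /* models a 64-bit user pointer */
};

/* Offset at which the 32-bit layout places the pointer field. */
size_t demo_ptr_offset32(void)
{
    return offsetof(struct demo_string32, data);
}

/* Offset at which the 64-bit layout expects the pointer field. */
size_t demo_ptr_offset64(void)
{
    return offsetof(struct demo_string64, data);
}

/* Number of bytes by which the two layouts differ in total size. */
size_t demo_layout_size_gap(void)
{
    return sizeof(struct demo_string64) - sizeof(struct demo_string32);
}
```

Because the two layouts differ in size (and usually in field offset), a kernel reading the 64-bit layout from a buffer filled with the 32-bit one picks up bytes that were never meant to be an address.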
I can think of several ways to deal with this:
(1) Provide a PPC32 mount on ppc32 and a PPC64 mount on ppc64, and require
that these not be mixed.
This is something we'd like to avoid since we can otherwise run almost
entirely with a ppc32 userspace on a ppc64 kernel, making installation
less complex.
(2) Have the PPC32 mount program detect the fact that it's running under a
64-bit kernel and doctor the nfs4 mount data appropriately.
(3) Have the PPC32 mount program run the PPC64 mount program if it detects a
64-bit kernel.
(4) Have the ppc64 kernel detect whether it's a 32-bit or a 64-bit userspace
and translate the NFS4 mount() syscall if necessary.
(5) Introduce a new variation on the mount() syscall that can take more than
a page of data.
I would prefer option (4), otherwise we have to find _all_ usages of the
mount() system calls and fix them.
This has the potential to affect other places where we can run 32-bit
userspace under 64-bit kernels: i386 under x86_64 and s390 under s390x for
example.
David
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-15 11:57 NFS4 mount problem David Howells
@ 2005-04-15 12:21 ` Stephen Rothwell
2005-04-15 19:51 ` Bryan Henderson
2005-04-17 19:33 ` Trond Myklebust
2005-04-18 10:34 ` David Howells
2 siblings, 1 reply; 24+ messages in thread
From: Stephen Rothwell @ 2005-04-15 12:21 UTC (permalink / raw)
To: David Howells; +Cc: linux-fsdevel, steved
On Fri, 15 Apr 2005 12:57:48 +0100 David Howells <dhowells@redhat.com> wrote:
>
> (4) Have the ppc64 kernel detect whether it's a 32-bit or a 64-bit userspace
> and translate the NFS4 mount() syscall if necessary.
>
> I would prefer option (4), otherwise we have to find _all_ usages of the
> mount() system calls and fix them.
>
> This has the potential to affect other places where we can run 32-bit
> userspace under 64-bit kernels: i386 under x86_64 and s390 under s390x for
> example.
We already have compat_sys_mount that treats the mount data for smbfs and
ncpfs specially, so could you add an nfsv4-specific bit there?
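A rough sketch of what such a filesystem-specific translation step could look like, written as ordinary userspace C rather than kernel code. The structures demo_mount32 and demo_mount64 and both function names are hypothetical; the real compat_sys_mount and the real NFSv4 mount data have more fields and different names. The filesystem type string "nfs4" is assumed here for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mount data as written by a 32-bit mount program. */
struct demo_mount32 {
    uint32_t version;
    uint32_t hostname_len;
    uint32_t hostname_ptr;   /* 32-bit user pointer */
};

/* The same data in the layout a 64-bit kernel would work with. */
struct demo_mount64 {
    uint32_t version;
    uint32_t hostname_len;
    uint64_t hostname_ptr;   /* 64-bit user pointer */
};

/* Decide by filesystem type name whether the data needs translating. */
int demo_fstype_needs_compat(const char *fstype)
{
    return strcmp(fstype, "smbfs") == 0 ||
           strcmp(fstype, "ncpfs") == 0 ||
           strcmp(fstype, "nfs4") == 0;   /* the proposed addition */
}

/* Copy field by field, zero-extending the embedded pointer. */
void demo_mount_widen(const struct demo_mount32 *in, struct demo_mount64 *out)
{
    memset(out, 0, sizeof(*out));          /* also clears any padding */
    out->version = in->version;
    out->hostname_len = in->hostname_len;
    out->hostname_ptr = (uint64_t)in->hostname_ptr;
}
```

The point of the sketch is only that every pointer-carrying field has to be widened individually, which is why each such filesystem needs its own case.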
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-15 12:21 ` Stephen Rothwell
@ 2005-04-15 19:51 ` Bryan Henderson
2005-04-15 20:22 ` David S. Miller
0 siblings, 1 reply; 24+ messages in thread
From: Bryan Henderson @ 2005-04-15 19:51 UTC (permalink / raw)
To: Stephen Rothwell
Cc: David Howells, linux-fsdevel, linux-fsdevel-owner, steved
>We already have compat_sys_mount that treats the mount data for smbfs and
>ncpfs specially, so could you add an nfsv4-specific bit there?
Do we really want to pile filesystem-type-specific stuff into fs/compat.c?
It's bad enough that it's there for smbfs and ncpfs (and similar stuff
for NFS server). It's only going to get worse.
fs/compat.c is fine for interfaces implemented by fs/ code, but the 32/64
bit translations for other interfaces ought to be done by the modules that
know those interfaces.
A mount option structure that contains addresses should contain
information as to whether it's in 32-bit-address format or 64-bit-address
format. The nfsv4 read_super method can use that to translate its own
mount options.
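One way to picture a self-describing mount option structure is sketched below. The header layout, the enum values and the function name are all hypothetical and exist only for illustration; no existing mount data format is claimed to look like this.

```c
#include <stdint.h>

/* Hypothetical tag values recording which pointer width the data uses. */
enum demo_addr_format {
    DEMO_ADDR_32 = 32,
    DEMO_ADDR_64 = 64
};

/* Hypothetical fixed-size header placed at the start of the mount data. */
struct demo_mount_header {
    uint32_t version;
    uint32_t addr_format;   /* DEMO_ADDR_32 or DEMO_ADDR_64 */
};

/*
 * Return the width in bytes of the addresses that follow the header,
 * or -1 if the header carries a format this code does not know.
 */
int demo_pointer_width(const struct demo_mount_header *hdr)
{
    switch (hdr->addr_format) {
    case DEMO_ADDR_32:
        return 4;
    case DEMO_ADDR_64:
        return 8;
    default:
        return -1;
    }
}
```

With a tag like this, the filesystem's own code could choose the right translation without fs/compat.c knowing anything about the structure.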
Another option would be for Linux to pass that information (essentially,
whether the mount() system call is being handled by sys_mount() or
compat_sys_mount()) as another argument to read_super. This would allow
better backward compatibility with user space binaries, if there are
already 32-bit and 64-bit binaries using indistinguishable mount option
structures.
The same issue, by the way, applies to ioctls, some of which have an
argument which is the address of a block of memory that contains other
addresses. fs/compat.c approaches these in a more
filesystem-type-independent way than it does mount(), but still not
independent enough.
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-15 19:51 ` Bryan Henderson
@ 2005-04-15 20:22 ` David S. Miller
2005-04-15 22:07 ` Bryan Henderson
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: David S. Miller @ 2005-04-15 20:22 UTC (permalink / raw)
To: Bryan Henderson; +Cc: sfr, dhowells, linux-fsdevel, linux-fsdevel-owner, steved
Make a ->compat_read_super() just like we have a ->compat_ioctl()
method for files, if you want to suggest a solution like what
you describe.
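The shape of that suggestion can be sketched in plain C with a table of function pointers. Everything here is a userspace model with hypothetical names (demo_fs_type, demo_mount_dispatch and the two sample methods); falling back to the native method when no compat method is supplied is an assumption of this sketch, not a statement about how the kernel treats ->compat_ioctl().

```c
#include <stddef.h>

/* Hypothetical per-filesystem method table. */
struct demo_fs_type {
    const char *name;
    int (*read_super)(const void *data);         /* native layout */
    int (*compat_read_super)(const void *data);  /* 32-bit layout, may be NULL */
};

/* Sample methods: each just reports which layout it was asked to handle. */
int demo_native_read_super(const void *data)
{
    (void)data;
    return 64;
}

int demo_compat_read_super(const void *data)
{
    (void)data;
    return 32;
}

/*
 * Route a mount request to the compat method when the caller is a
 * 32-bit process and the filesystem provides one; otherwise use the
 * native method.
 */
int demo_mount_dispatch(const struct demo_fs_type *fs, const void *data,
                        int caller_is_32bit)
{
    if (caller_is_32bit && fs->compat_read_super != NULL)
        return fs->compat_read_super(data);
    return fs->read_super(data);
}
```

The translation logic then lives next to the code that defines the structure, which is the property being argued for in this part of the thread.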
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-15 20:22 ` David S. Miller
@ 2005-04-15 22:07 ` Bryan Henderson
2005-04-17 13:55 ` Christoph Hellwig
2005-04-18 10:36 ` David Howells
2 siblings, 0 replies; 24+ messages in thread
From: Bryan Henderson @ 2005-04-15 22:07 UTC (permalink / raw)
To: David S. Miller; +Cc: dhowells, linux-fsdevel, sfr, steved
>Make a ->compat_read_super() just like we have a ->compat_ioctl()
>method for files, if you want to suggest a solution like what
>you describe.
Even better.
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-15 20:22 ` David S. Miller
2005-04-15 22:07 ` Bryan Henderson
@ 2005-04-17 13:55 ` Christoph Hellwig
2005-04-18 17:07 ` Bryan Henderson
2005-04-18 10:36 ` David Howells
2 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2005-04-17 13:55 UTC (permalink / raw)
To: David S. Miller
Cc: Bryan Henderson, sfr, dhowells, linux-fsdevel,
linux-fsdevel-owner, steved
On Fri, Apr 15, 2005 at 01:22:59PM -0700, David S. Miller wrote:
>
> Make a ->compat_read_super() just like we have a ->compat_ioctl()
> method for files, if you want to suggest a solution like what
> you describe.
I don't think we should encourage filesystem writers to do such stupid
things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
down that road.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-15 11:57 NFS4 mount problem David Howells
2005-04-15 12:21 ` Stephen Rothwell
@ 2005-04-17 19:33 ` Trond Myklebust
2005-04-18 17:17 ` Bryan Henderson
2005-04-18 10:34 ` David Howells
2 siblings, 1 reply; 24+ messages in thread
From: Trond Myklebust @ 2005-04-17 19:33 UTC (permalink / raw)
To: David Howells; +Cc: Linux Filesystem Development, Steve Dickson
On Fri, 15.04.2005 at 12:57 (+0100), David Howells wrote:
> We've come across an interesting problem with NFS4 mount on a PPC64 box. If
> the mount program is compiled as PPC32, then the mount() syscall is returned
> EFAULT.
So, why is this not a case of "Doctor it hurts..."?
In exactly which case do people have absolutely no alternative but to
run a 32-bit version of the mount program on top of a kernel that was
compiled as PPC64?
A simple script that runs "uname -r" and switches a soft-link between
the 32-bit and 64-bit version of "mount" is not rocket science.
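For comparison, the same check can be sketched in C with uname(2). This is only an illustration: the function names and the list of 64-bit machine names are assumptions made for this example, and the field that identifies the architecture is the machine name (what "uname -m" prints; "uname -r" prints the kernel release). A 32-bit process running under a changed personality may see a 32-bit machine name even on a 64-bit kernel, so the result is a hint rather than a guarantee.

```c
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/* Return 1 if the machine name is in a small, non-exhaustive 64-bit list. */
int demo_machine_is_64bit(const char *machine)
{
    static const char *const known64[] = {
        "ppc64", "x86_64", "s390x", "sparc64", "mips64"
    };
    size_t i;

    for (i = 0; i < sizeof(known64) / sizeof(known64[0]); i++) {
        if (strcmp(machine, known64[i]) == 0)
            return 1;
    }
    return 0;
}

/* Return 1 or 0 for the running kernel, or -1 if uname() fails. */
int demo_running_on_64bit_kernel(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return demo_machine_is_64bit(u.machine);
}
```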
> I would prefer option (4), otherwise we have to find _all_ usages of the
> mount() system calls and fix them.
mount() is not a documented syscall. The binary formats for filesystems
like NFS are only documented inside the kernels to which they apply.
There should therefore be exactly ONE instance of usage, and that is in
the "mount" program itself.
Cheers,
Trond
--
Trond Myklebust <trond.myklebust@fys.uio.no>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-15 11:57 NFS4 mount problem David Howells
2005-04-15 12:21 ` Stephen Rothwell
2005-04-17 19:33 ` Trond Myklebust
@ 2005-04-18 10:34 ` David Howells
2005-04-18 14:49 ` Trond Myklebust
` (2 more replies)
2 siblings, 3 replies; 24+ messages in thread
From: David Howells @ 2005-04-18 10:34 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Linux Filesystem Development, Steve Dickson
Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > We've come across an interesting problem with NFS4 mount on a PPC64 box. If
> > the mount program is compiled as PPC32, then the mount() syscall is returned
> > EFAULT.
>
> So, why is this not a case of "Doctor it hurts..."?
Because:
(1) The kernel is returning EFAULT to the 32-bit userspace; this implies that
userspace is handing over a bad address. It isn't; the kernel is
malfunctioning as it stands.
(2) The kernel API does not prohibit 32-bit userspace calling mount() under a
64-bit kernel. All other filesystems cope with it (AFAIK), so NFS4 must
too.
Either the kernel should return ENOSYS for any 32-bit mount on a 64-bit kernel
or it must support it fully. I think the latter is the right thing to do;
despite what you'd prefer, there are other callers of the mount syscall out
there.
> There should therefore be exactly ONE instance of usage, and that is in
> the "mount" program itself.
Exactly. That should then be the ppc32 mount; which should work equally well
with a ppc32 or a ppc64 kernel.
David
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-15 20:22 ` David S. Miller
2005-04-15 22:07 ` Bryan Henderson
2005-04-17 13:55 ` Christoph Hellwig
@ 2005-04-18 10:36 ` David Howells
2005-04-18 18:37 ` David S. Miller
2 siblings, 1 reply; 24+ messages in thread
From: David Howells @ 2005-04-18 10:36 UTC (permalink / raw)
To: Christoph Hellwig
Cc: David S. Miller, Bryan Henderson, sfr, linux-fsdevel,
linux-fsdevel-owner, steved
Christoph Hellwig <hch@infradead.org> wrote:
> I don't think we should encourage filesystem writers to do such stupid
> things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
> down that road.
The problem with NFS4, I think, is that the mount syscall sets a hard limit on
the amount of mount data that's insufficiently large.
David
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 10:34 ` David Howells
@ 2005-04-18 14:49 ` Trond Myklebust
2005-04-18 22:07 ` Bryan Henderson
2005-04-18 15:23 ` David Howells
2005-04-18 21:50 ` Bryan Henderson
2 siblings, 1 reply; 24+ messages in thread
From: Trond Myklebust @ 2005-04-18 14:49 UTC (permalink / raw)
To: David Howells; +Cc: Linux Filesystem Development, Steve Dickson
On Mon, 18.04.2005 at 11:34 (+0100), David Howells wrote:
> (1) The kernel is returning EFAULT to the 32-bit userspace; this implies that
> userspace is handing over a bad address. It isn't, the kernel is
> malfunctioning as it stands.
>
> (2) The kernel API does not prohibit 32-bit userspace calling mount() under a
> 64-bit kernel. All other filesystems cope with it (AFAIK), so NFS4 must
> too.
>
> Either the kernel should return ENOSYS for any 32-bit mount on a 64-bit kernel
> or it must support it fully. I think the latter is the right thing to do;
> despite what you'd prefer, there are other callers of the mount syscall out
> there.
No. If you want generalized support for mounting NFS filesystems, then
the right thing to do is to create a userland library that can translate
the mount options, set up the binary structure with sane defaults etc.
Without such a library, it is pointless to contemplate "other callers".
With such a library, you will have a single point for switching between
32bit and 64 bit.
My concern is that we are slowly but surely building up a bigger
in-kernel library for parsing the binary structure than it would take to
parse the naked mount option string. There are only 2 reasons for doing
that parsing in userland:
1) DNS lookups
2) Keeping the kernel parsing code small
> > There should therefore be exactly ONE instance of usage, and that is in
> > the "mount" program itself.
>
> Exactly. That should then be the ppc32 mount; which should work equally well
> with a ppc32 or a ppc64 kernel.
Then can we kill the PPC64 binary structure and substitute PPC32?
Cheers,
Trond
--
Trond Myklebust <trond.myklebust@fys.uio.no>
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 10:34 ` David Howells
2005-04-18 14:49 ` Trond Myklebust
@ 2005-04-18 15:23 ` David Howells
2005-04-18 15:45 ` Trond Myklebust
2005-04-18 21:50 ` Bryan Henderson
2 siblings, 1 reply; 24+ messages in thread
From: David Howells @ 2005-04-18 15:23 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Linux Filesystem Development, Steve Dickson
Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> Without such a library, it is pointless to contemplate "other callers".
> With such a library, you will have a single point for switching between
> 32bit and 64 bit.
"Other callers" include such as busybox, sash and uClinux. I'm not sure about
such as Perl, but Perl is hardly in the same class as the other three.
Admittedly, a library is probably the right way to do it - libmount or some
such thing.
> Then can we kill the PPC64 binary structure and substitute PPC32?
Whilst I might be happy to, I'm not sure I can speak for everyone. You can
also use i386 mount on x86_64 for instance; and possibly s390 on s390x,
sparc32 on sparc64 and mips32 on mips64.
David
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 15:23 ` David Howells
@ 2005-04-18 15:45 ` Trond Myklebust
0 siblings, 0 replies; 24+ messages in thread
From: Trond Myklebust @ 2005-04-18 15:45 UTC (permalink / raw)
To: David Howells; +Cc: Linux Filesystem Development, Steve Dickson
On Mon, 18.04.2005 at 16:23 (+0100), David Howells wrote:
> Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> > Without such a library, it is pointless to contemplate "other callers".
> > With such a library, you will have a single point for switching between
> > 32bit and 64 bit.
>
> "Other callers" include such as busybox, sash and uClinux. I'm not sure about
> such as Perl, but Perl is hardly in the same class as the other three.
> Admittedly, a library is probably the right way to do it - libmount or some
> such thing.
Hmm... Nope. None of the above have support for NFSv4 AFAICS. busybox
has support for NFSv2/v3, but not v4.
Note you are starting to convince me that the correct way to do this is
to bite the bullet, move all the binary stuff into a compat library that
we can drop at some later time, and then rebase on the NFSroot parser.
Cheers,
Trond
--
Trond Myklebust <trond.myklebust@fys.uio.no>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-17 13:55 ` Christoph Hellwig
@ 2005-04-18 17:07 ` Bryan Henderson
2005-04-18 17:16 ` Al Viro
2005-04-18 17:33 ` David Howells
0 siblings, 2 replies; 24+ messages in thread
From: Bryan Henderson @ 2005-04-18 17:07 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David S. Miller, dhowells, linux-fsdevel, sfr, steved
>On Fri, Apr 15, 2005 at 01:22:59PM -0700, David S. Miller wrote:
>>
>> Make a ->compat_read_super() just like we have a ->compat_ioctl()
>> method for files, if you want to suggest a solution like what
>> you describe.
>
>I don't think we should encourage filesystem writers to do such stupid
>things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
>down that road.
Which road is that?
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 17:07 ` Bryan Henderson
@ 2005-04-18 17:16 ` Al Viro
2005-04-18 17:33 ` David Howells
1 sibling, 0 replies; 24+ messages in thread
From: Al Viro @ 2005-04-18 17:16 UTC (permalink / raw)
To: Bryan Henderson
Cc: Christoph Hellwig, David S. Miller, dhowells, linux-fsdevel, sfr,
steved
On Mon, Apr 18, 2005 at 10:07:14AM -0700, Bryan Henderson wrote:
> >On Fri, Apr 15, 2005 at 01:22:59PM -0700, David S. Miller wrote:
> >>
> >> Make a ->compat_read_super() just like we have a ->compat_ioctl()
> >> method for files, if you want to suggest a solution like what
> >> you describe.
> >
> >I don't think we should encourage filesystem writers to do such stupid
> >things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
> >down that road.
>
> Which road is that?
Architecture-dependent blob passed to mount(2) (aka nfs4_mount_data).
If you want it to be a blob, at least have the decency to use an encoding
that would not depend on alignment rules and word size. Hell, you
could use XDR - it's not as if nfs would need something new to handle
it. Or, better yet, use a normal string.
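To make the idea of an encoding that does not depend on alignment rules or word size concrete, here is a small sketch of XDR-style encoding in C: integers are written as four big-endian bytes, and strings as a four-byte length followed by the bytes, zero-padded to a multiple of four. The function names are hypothetical, and this is a hand-rolled illustration rather than the kernel's or any library's XDR code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write v as 4 big-endian bytes. Returns 4, or 0 if buf is too small. */
size_t demo_put_u32(unsigned char *buf, size_t cap, uint32_t v)
{
    if (cap < 4)
        return 0;
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
    return 4;
}

/*
 * Write s as a 4-byte length followed by its bytes, zero-padded to a
 * multiple of 4. Returns the bytes written, or 0 if it does not fit.
 */
size_t demo_put_string(unsigned char *buf, size_t cap, const char *s)
{
    size_t len = strlen(s);
    size_t padded = (len + 3) & ~(size_t)3;

    if ((uint64_t)len > UINT32_MAX || cap < 4 || cap - 4 < padded)
        return 0;
    demo_put_u32(buf, cap, (uint32_t)len);
    memcpy(buf + 4, s, len);
    memset(buf + 4 + len, 0, padded - len);
    return 4 + padded;
}
```

A buffer produced this way has the same bytes whether the writer is a 32-bit or a 64-bit program, on either byte order, so the reader needs no per-architecture variant.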
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-17 19:33 ` Trond Myklebust
@ 2005-04-18 17:17 ` Bryan Henderson
2005-04-18 17:59 ` Trond Myklebust
0 siblings, 1 reply; 24+ messages in thread
From: Bryan Henderson @ 2005-04-18 17:17 UTC (permalink / raw)
To: Trond Myklebust
Cc: David Howells, Linux Filesystem Development, Steve Dickson
>mount() is not a documented syscall. The binary formats for filesystems
>like NFS are only documented inside the kernels to which they apply.
What _is_ a documented system call? Linux is famous for not having
documented interfaces (or, put another way, not distinguishing between an
interface you can read in an official document and one you discover by
reading kernel source code). But of all interfaces in Linux, the system
call interface is probably the most accepted as one a user of the kernel
can rely on.
I don't think a filesystem driver designer should expect mount options to
be private to one particular user space program. Especially one that
isn't even packaged with the driver.
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 17:07 ` Bryan Henderson
2005-04-18 17:16 ` Al Viro
@ 2005-04-18 17:33 ` David Howells
2005-04-18 17:43 ` Al Viro
2005-04-18 17:52 ` Bryan Henderson
1 sibling, 2 replies; 24+ messages in thread
From: David Howells @ 2005-04-18 17:33 UTC (permalink / raw)
To: Al Viro
Cc: Bryan Henderson, Christoph Hellwig, David S. Miller,
linux-fsdevel, sfr, steved
Al Viro <viro@parcelfarce.linux.theplanet.co.uk> wrote:
>
> Architecture-dependent blob passed to mount(2) (aka nfs4_mount_data).
> If you want it to be a blob, at least have a decency to use encoding
> that would not depend on alignment rules and word size. Hell, you
> could use XDR - it's not that nfs would need something new to handle
> it. Or, better yet, use a normal string.
Mount doesn't appear to permit a big enough blob though. It has a hard limit
of PAGE_SIZE.
David
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 17:33 ` David Howells
@ 2005-04-18 17:43 ` Al Viro
2005-04-18 17:52 ` Bryan Henderson
1 sibling, 0 replies; 24+ messages in thread
From: Al Viro @ 2005-04-18 17:43 UTC (permalink / raw)
To: David Howells
Cc: Bryan Henderson, Christoph Hellwig, David S. Miller,
linux-fsdevel, sfr, steved
On Mon, Apr 18, 2005 at 06:33:09PM +0100, David Howells wrote:
> Al Viro <viro@parcelfarce.linux.theplanet.co.uk> wrote:
>
> >
> > Architecture-dependent blob passed to mount(2) (aka nfs4_mount_data).
> > If you want it to be a blob, at least have a decency to use encoding
> > that would not depend on alignment rules and word size. Hell, you
> > could use XDR - it's not that nfs would need something new to handle
> > it. Or, better yet, use a normal string.
>
> Mount doesn't appear to permit a big enough blob though. It has a hard limit
> of PAGE_SIZE.
Excuse me? Would the use of fixed offsets, field sizes and endianness
make the blob bigger? And as for the length of string representation
going past 4Kb... that could be easily dealt with in sys_mount() if it
really becomes a problem.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 17:33 ` David Howells
2005-04-18 17:43 ` Al Viro
@ 2005-04-18 17:52 ` Bryan Henderson
1 sibling, 0 replies; 24+ messages in thread
From: Bryan Henderson @ 2005-04-18 17:52 UTC (permalink / raw)
To: David Howells
Cc: David S. Miller, Christoph Hellwig, linux-fsdevel, sfr, steved,
Al Viro
>> Architecture-dependent blob passed to mount(2) (aka nfs4_mount_data).
>> If you want it to be a blob, at least have a decency to use encoding
>> that would not depend on alignment rules and word size. Hell, you
>> could use XDR - it's not that nfs would need something new to handle
>> it. Or, better yet, use a normal string.
>
>Mount doesn't appear to permit a big enough blob though. It has a hard limit
>of PAGE_SIZE.
That seems to me to be orthogonal to Al's point. You could make an
architecture-independent format for that page that still contains
addresses in user space of additional information. Which would presumably
also have an architecture-independent format.
But why is mount() special here? It's ancient tradition for Linux system
calls to take as parameters, and return as results, in-memory structures
that are dependent on local word size and endianness. Lots of them do.
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 17:17 ` Bryan Henderson
@ 2005-04-18 17:59 ` Trond Myklebust
2005-04-20 10:57 ` Andries Brouwer
0 siblings, 1 reply; 24+ messages in thread
From: Trond Myklebust @ 2005-04-18 17:59 UTC (permalink / raw)
To: Bryan Henderson
Cc: David Howells, Linux Filesystem Development, Steve Dickson
On Mon, 18.04.2005 at 10:17 (-0700), Bryan Henderson wrote:
> >mount() is not a documented syscall. The binary formats for filesystems
> >like NFS are only documented inside the kernels to which they apply.
>
> What _is_ a documented system call? Linux is famous for not having
> documented interfaces (or, put another way, not distinguishing between an
> interface you can read in an official document and one you discover by
> reading kernel source code). But of all interfaces in Linux, the system
> call interface is probably the most accepted as one a user of the kernel
> can rely on.
>
> I don't think a filesystem driver designer should expect mount options to
> be private to one particular user space program. Especially one that
> isn't even packaged with the driver.
If people really do need a fully documented NFS mount interface, then
the only one that makes sense is a string interface. Looking back at the
manpages, the string mount options are the only thing that has remained
constant over the last 10 years.
We're already up to version 6 of the binary interfaces for v2/v3, and if
you count NFSv4 too, then that makes 7. Choice of which binary interface
to use is entirely dependent on the kernel revision. Good luck fitting
all that (plus future revisions) into something like sash without
doubling its size...
Cheers,
Trond
--
Trond Myklebust <trond.myklebust@fys.uio.no>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 10:36 ` David Howells
@ 2005-04-18 18:37 ` David S. Miller
0 siblings, 0 replies; 24+ messages in thread
From: David S. Miller @ 2005-04-18 18:37 UTC (permalink / raw)
To: David Howells
Cc: hch, hbryan, sfr, linux-fsdevel, linux-fsdevel-owner, steved
On Mon, 18 Apr 2005 11:36:25 +0100
David Howells <dhowells@redhat.com> wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
>
> > I don't think we should encourage filesystem writers to do such stupid
> > things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
> > down that road.
>
> The problem with NFS4, I think, is that the mount syscall sets a hard limit on
> the amount of mount data that's insufficiently large.
That's correct, it currently cannot support more than one page
of data. Even worse, that makes the limit platform dependent.
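The platform dependence is easy to see from userspace, since the page size is a runtime property of the system. The sketch below uses sysconf(_SC_PAGESIZE); the two function names are hypothetical, and treating exactly one page as the limit follows the description in this thread rather than a reading of any particular kernel version.

```c
#include <stddef.h>
#include <unistd.h>

/* Size of one page on the running system, or -1 if it cannot be queried. */
long demo_mount_data_limit(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Return 1 if len bytes of mount data would fit in a single page. */
int demo_fits_in_mount_page(size_t len)
{
    long page = demo_mount_data_limit();

    return page > 0 && len <= (size_t)page;
}
```

Mount data that fits on a system with large pages can therefore be rejected or truncated on one with smaller pages, even though the binaries are otherwise identical.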
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 10:34 ` David Howells
2005-04-18 14:49 ` Trond Myklebust
2005-04-18 15:23 ` David Howells
@ 2005-04-18 21:50 ` Bryan Henderson
2 siblings, 0 replies; 24+ messages in thread
From: Bryan Henderson @ 2005-04-18 21:50 UTC (permalink / raw)
To: David Howells
Cc: Linux Filesystem Development, Steve Dickson, Trond Myklebust
>(1) The kernel is returning EFAULT to the 32-bit userspace; this implies that
> userspace is handing over a bad address. It isn't, the kernel is
> malfunctioning as it stands.
>...
>Either the kernel should return ENOSYS for any 32-bit mount on a 64-bit kernel
>or it must support it fully.
So this point is just the error code? If so, where do you get ENOSYS? A
more usual errno for where a particular filesystem type can't be mounted
is ENODEV. Choosing errnos is a pretty whimsical thing anyway, since
there are so many more kinds of errors than the authors of the errno space
contemplated, but EFAULT and ENOSYS are two that have a pretty solid
definition. ENOSYS is for when an entire system call type is missing.
I'm not sure we can complain about EFAULT, though, because you really are
supplying an invalid address. You're doing it because you're using the
wrong mount option format, so what you think of as 4 bytes of flags
followed by 4 bytes of address is really 8 bytes of address.
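The reinterpretation described above can be worked through in a few lines of C. The function name and the two-field layout are hypothetical; the bytes are assembled by hand in big-endian order (as on PPC) so that the result does not depend on the machine running the example.

```c
#include <stdint.h>

/*
 * Lay out a 32-bit flags word followed by a 32-bit address, as a
 * big-endian 32-bit program would, then read the same 8 bytes back
 * as one big-endian 64-bit value, as a 64-bit kernel expecting a
 * pointer at that position would.
 */
uint64_t demo_misread_as_pointer(uint32_t flags, uint32_t addr32)
{
    unsigned char raw[8];
    uint64_t value = 0;
    int i;

    raw[0] = (unsigned char)(flags >> 24);
    raw[1] = (unsigned char)(flags >> 16);
    raw[2] = (unsigned char)(flags >> 8);
    raw[3] = (unsigned char)flags;
    raw[4] = (unsigned char)(addr32 >> 24);
    raw[5] = (unsigned char)(addr32 >> 16);
    raw[6] = (unsigned char)(addr32 >> 8);
    raw[7] = (unsigned char)addr32;

    for (i = 0; i < 8; i++)
        value = (value << 8) | raw[i];
    return value;
}
```

With flags of 0x1 and an address of 0x10002000, the 64-bit reader sees 0x0000000110002000, an address the process never supplied, which is consistent with the EFAULT being reported.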
I do understand the more important issue of there being a kernel that
understands both mount option formats; but since you enumerated the errno
issue, I wanted to comment on that one independently.
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 14:49 ` Trond Myklebust
@ 2005-04-18 22:07 ` Bryan Henderson
2005-04-18 23:34 ` Trond Myklebust
0 siblings, 1 reply; 24+ messages in thread
From: Bryan Henderson @ 2005-04-18 22:07 UTC (permalink / raw)
To: Trond Myklebust
Cc: David Howells, Linux Filesystem Development, Steve Dickson
>My concern is that we are slowly but surely building up a bigger
>in-kernel library for parsing the binary structure than it would take to
>parse the naked mount option string.
>
>...
>If people really do need a fully documented NFS mount interface, then
>the only one that makes sense is a string interface. Looking back at the
>manpages, the string mount options are the only thing that have remained
>constant over the last 10 years.
>
>We're already up to version 6 of the binary interfaces for v2/v3, and if
>you count NFSv4 too, then that makes 7.
I don't know the NFS mount option format, but I'm having a hard time
imagining how a string-based format can take less code to parse and be
more forward compatible than a binary one. People don't even use the term
"parse" for binary structures, because parsing typically means turning
strings into binary structures.
Having 6 separate formats isn't the only way to have an evolving binary
interface. People do make extensible binary formats.
>There are only 2 reasons for doing
>that parsing in userland:
>
> 1) DNS lookups
> 2) Keeping the kernel parsing code small
I personally almost never worry about the number of bytes of code, but I
worry a lot about its simplicity. User space code is less costly to
develop and less risky to make a mistake in. I would add,
3) Keeping the kernel parsing code simple.
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 22:07 ` Bryan Henderson
@ 2005-04-18 23:34 ` Trond Myklebust
0 siblings, 0 replies; 24+ messages in thread
From: Trond Myklebust @ 2005-04-18 23:34 UTC (permalink / raw)
To: Bryan Henderson
Cc: David Howells, Linux Filesystem Development, Steve Dickson
On Mon, 18.04.2005 at 15:07 (-0700), Bryan Henderson wrote:
> >We're already up to version 6 of the binary interfaces for v2/v3, and if
> >you count NFSv4 too, then that makes 7.
>
> I don't know the NFS mount option format, but I'm having a hard time
> imagining how a string-based format can take less code to parse and be
> more forward compatible than a binary one. People don't even use the term
> "parse" for binary structures, because parsing typically means turning
> strings into binary structures.
The string based parser (based, BTW, on the generic string parser in
lib/parser.c) for NFS mount options is already in the kernel, thanks to
NFSroot, and already needs to be maintained. As is the NFSv2/v3 "mount"
RPC code, and everything else that the kernel needs to take over that
duty.
The only extra information we need from userland is the DNS lookup of
the server hostname (NFSv2/v3/v4) and the client IP address (NFSv4 only).
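To give a feel for how little code a string interface needs, here is a self-contained userspace sketch that parses a comma-separated option string. It does not use lib/parser.c or the NFSroot code; the structure and function names are hypothetical, and only a handful of well-known options (rsize, wsize, hard, soft) are handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result structure for the few options handled here. */
struct demo_nfs_opts {
    unsigned long rsize;
    unsigned long wsize;
    int hard;
};

/* Parse a non-empty, all-digits decimal string. Returns 0 on success. */
int demo_parse_ulong(const char *s, unsigned long *out)
{
    char *end;

    if (*s < '0' || *s > '9')
        return -1;
    *out = strtoul(s, &end, 10);
    return *end == '\0' ? 0 : -1;
}

/* Parse e.g. "rsize=8192,wsize=4096,hard". Returns 0 on success. */
int demo_parse_opts(const char *str, struct demo_nfs_opts *o)
{
    memset(o, 0, sizeof(*o));
    while (*str != '\0') {
        char tok[64];
        size_t n = strcspn(str, ",");   /* length of the next token */

        if (n == 0 || n >= sizeof(tok))
            return -1;
        memcpy(tok, str, n);
        tok[n] = '\0';

        if (strncmp(tok, "rsize=", 6) == 0) {
            if (demo_parse_ulong(tok + 6, &o->rsize) != 0)
                return -1;
        } else if (strncmp(tok, "wsize=", 6) == 0) {
            if (demo_parse_ulong(tok + 6, &o->wsize) != 0)
                return -1;
        } else if (strcmp(tok, "hard") == 0) {
            o->hard = 1;
        } else if (strcmp(tok, "soft") == 0) {
            o->hard = 0;
        } else {
            return -1;                  /* unknown option */
        }

        str += n;
        if (*str == ',')
            str++;
    }
    return 0;
}
```

Nothing in the string depends on pointer width, alignment or byte order, so the same option string works unchanged from a 32-bit or a 64-bit mount program.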
> Having 6 separate formats isn't the only way to have an evolving binary
> interface. People do make extensible binary formats.
I never said they were 6 _separate_ formats. The NFSv2/v3 stuff is one
constantly "extending" binary format. See include/linux/nfs_mount.h.
Note how 4 of those fields are currently entirely obsolete (fd,
old_root, namlen, bsize) and how one more cannot be extended to cope
with IPv6 and other new transports (addr), and how one more (root)
cannot be used for NFSv4 mounts, which had to add in at least 2 more
fields that are unused by NFSv2/v3...
Sure, we could indeed have developed more sensible binary formats if our
1992 crystal ball had told us all about NFSv3 (RFC dates from 1995) and
NFSv4 (RFC dates from 2003). Not to forget lockd, statd, nfsacl, IPv6,
etc...
> I personally almost never worry about the number of bytes of code, but I
> worry a lot about its simplicity. User space code is less costly to
> develop and less risky to make a mistake in. I would add,
>
> 3) Keeping the kernel parsing code simple.
No.
3) Keeping the kernel parsing code _maintainable_
...and keeping around parsers for all these different formats and fields
and now extra 32-bit counterparts isn't my idea of code simplicity, code
compactness, or code maintainability.
Cheers,
Trond
--
Trond Myklebust <trond.myklebust@fys.uio.no>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: NFS4 mount problem
2005-04-18 17:59 ` Trond Myklebust
@ 2005-04-20 10:57 ` Andries Brouwer
0 siblings, 0 replies; 24+ messages in thread
From: Andries Brouwer @ 2005-04-20 10:57 UTC (permalink / raw)
To: Trond Myklebust
Cc: Bryan Henderson, David Howells, Linux Filesystem Development,
Steve Dickson
On Mon, Apr 18, 2005 at 01:59:28PM -0400, Trond Myklebust wrote:
> If people really do need a fully documented NFS mount interface, then
> the only one that makes sense is a string interface.
You can omit the "If" part and retain the conclusion.
The binary interface for nfs and family is really a pain.
Andries
^ permalink raw reply [flat|nested] 24+ messages in thread