* NFS4 mount problem
@ 2005-04-15 11:57 David Howells
2005-04-15 12:21 ` Stephen Rothwell
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: David Howells @ 2005-04-15 11:57 UTC (permalink / raw)
To: linux-fsdevel; +Cc: steved
We've come across an interesting problem with NFS4 mount on a PPC64 box. If
the mount program is compiled as PPC32, then the mount() syscall fails with
EFAULT.
It turns out that NFS4 requires potentially more data than can be put in a
single page, something sys_mount() enforces as a hard limit. This being the
case, nfs4_get_sb() expects the mount data to contain auxiliary pointers.
Unfortunately, the PPC32 userspace inserts 32-bit pointers, whilst the PPC64
kernel expects 64-bit pointers...
I can think of several ways to deal with this:
(1) Provide a PPC32 mount on ppc32 and a PPC64 mount on ppc64, and require
that these not be mixed.
This is something we'd like to avoid since we can otherwise run almost
entirely with a ppc32 userspace on a ppc64 kernel, making installation
less complex.
(2) Have the PPC32 mount program detect the fact that it's running under a
64-bit kernel and doctor the nfs4 mount data appropriately.
(3) Have the PPC32 mount program run the PPC64 mount program if it detects a
64-bit kernel.
(4) Have the ppc64 kernel detect whether it's a 32-bit or a 64-bit userspace
and translate the NFS4 mount() syscall if necessary.
(5) Introduce a new variation on the mount() syscall that can take more than
a page of data.
I would prefer option (4), otherwise we have to find _all_ users of the
mount() system call and fix them.
This has the potential to affect other places where we can run 32-bit
userspace under 64-bit kernels: i386 under x86_64 and s390 under s390x for
example.
David
* Re: NFS4 mount problem
From: Stephen Rothwell @ 2005-04-15 12:21 UTC (permalink / raw)
To: David Howells; +Cc: linux-fsdevel, steved

On Fri, 15 Apr 2005 12:57:48 +0100 David Howells <dhowells@redhat.com> wrote:
>
> (4) Have the ppc64 kernel detect whether it's a 32-bit or a 64-bit userspace
>     and translate the NFS4 mount() syscall if necessary.
>
> I would prefer option (4), otherwise we have to find _all_ usages of the
> mount() system calls and fix them.
>
> This has the potential to affect other places where we can run 32-bit
> userspace under 64-bit kernels: i386 under x86_64 and s390 under s390x for
> example.

We already have compat_sys_mount that treats the mount data for smbfs and
ncpfs specially, so could you add an nfsv4-specific bit there?

--
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
* Re: NFS4 mount problem
From: Bryan Henderson @ 2005-04-15 19:51 UTC (permalink / raw)
To: Stephen Rothwell
Cc: David Howells, linux-fsdevel, linux-fsdevel-owner, steved

>We already have compat_sys_mount that treats the mount data for smbfs and
>ncpfs specially, so could you add an nfsv4-specific bit there?

Do we really want to pile filesystem-type-specific stuff into fs/compat.c?
It's bad enough that it's there for smbfs and ncpfs (and similar stuff for
NFS server). It's only going to get worse.

fs/compat.c is fine for interfaces implemented by fs/ code, but the 32/64 bit
translations for other interfaces ought to be done by the modules that know
those interfaces.

A mount option structure that contains addresses should contain information
as to whether it's in 32-bit-address format or 64-bit-address format. The
nfsv4 read_super method can use that to translate its own mount options.

Another option would be for Linux to pass that information (essentially,
whether the mount() system call is being handled by sys_mount() or
compat_sys_mount()) as another argument to read_super. This would allow
better backward compatibility with user space binaries, if there are already
32-bit and 64-bit binaries using indistinguishable mount option structures.

The same issue, by the way, applies to ioctls, some of which have an argument
which is the address of a block of memory that contains other addresses.
fs/compat.c approaches these in a more filesystem-type-independent way than
it does mount(), but still not independent enough.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems
* Re: NFS4 mount problem
From: David S. Miller @ 2005-04-15 20:22 UTC (permalink / raw)
To: Bryan Henderson; +Cc: sfr, dhowells, linux-fsdevel, linux-fsdevel-owner, steved

Make a ->compat_read_super() just like we have a ->compat_ioctl()
method for files, if you want to suggest a solution like what
you describe.
* Re: NFS4 mount problem
From: Bryan Henderson @ 2005-04-15 22:07 UTC (permalink / raw)
To: David S. Miller; +Cc: dhowells, linux-fsdevel, sfr, steved

>Make a ->compat_read_super() just like we have a ->compat_ioctl()
>method for files, if you want to suggest a solution like what
>you describe.

Even better.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems
* Re: NFS4 mount problem
From: Christoph Hellwig @ 2005-04-17 13:55 UTC (permalink / raw)
To: David S. Miller
Cc: Bryan Henderson, sfr, dhowells, linux-fsdevel, linux-fsdevel-owner, steved

On Fri, Apr 15, 2005 at 01:22:59PM -0700, David S. Miller wrote:
>
> Make a ->compat_read_super() just like we have a ->compat_ioctl()
> method for files, if you want to suggest a solution like what
> you describe.

I don't think we should encourage filesystem writers to do such stupid
things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
down that road.
* Re: NFS4 mount problem
From: Bryan Henderson @ 2005-04-18 17:07 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David S. Miller, dhowells, linux-fsdevel, sfr, steved

>On Fri, Apr 15, 2005 at 01:22:59PM -0700, David S. Miller wrote:
>>
>> Make a ->compat_read_super() just like we have a ->compat_ioctl()
>> method for files, if you want to suggest a solution like what
>> you describe.
>
>I don't think we should encourage filesystem writers to do such stupid
>things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
>down that road.

Which road is that?

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems
* Re: NFS4 mount problem
From: Al Viro @ 2005-04-18 17:16 UTC (permalink / raw)
To: Bryan Henderson
Cc: Christoph Hellwig, David S. Miller, dhowells, linux-fsdevel, sfr, steved

On Mon, Apr 18, 2005 at 10:07:14AM -0700, Bryan Henderson wrote:
> >On Fri, Apr 15, 2005 at 01:22:59PM -0700, David S. Miller wrote:
> >>
> >> Make a ->compat_read_super() just like we have a ->compat_ioctl()
> >> method for files, if you want to suggest a solution like what
> >> you describe.
> >
> >I don't think we should encourage filesystem writers to do such stupid
> >things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
> >down that road.
>
> Which road is that?

Architecture-dependent blob passed to mount(2) (aka nfs4_mount_data).
If you want it to be a blob, at least have the decency to use an encoding
that would not depend on alignment rules and word size. Hell, you
could use XDR - it's not as if nfs would need something new to handle
it. Or, better yet, use a normal string.
* Re: NFS4 mount problem
From: David Howells @ 2005-04-18 17:33 UTC (permalink / raw)
To: Al Viro
Cc: Bryan Henderson, Christoph Hellwig, David S. Miller, linux-fsdevel, sfr, steved

Al Viro <viro@parcelfarce.linux.theplanet.co.uk> wrote:

> Architecture-dependent blob passed to mount(2) (aka nfs4_mount_data).
> If you want it to be a blob, at least have a decency to use encoding
> that would not depend on alignment rules and word size. Hell, you
> could use XDR - it's not that nfs would need something new to handle
> it. Or, better yet, use a normal string.

Mount doesn't appear to permit a big enough blob though. It has a hard limit
of PAGE_SIZE.

David
* Re: NFS4 mount problem
From: Al Viro @ 2005-04-18 17:43 UTC (permalink / raw)
To: David Howells
Cc: Bryan Henderson, Christoph Hellwig, David S. Miller, linux-fsdevel, sfr, steved

On Mon, Apr 18, 2005 at 06:33:09PM +0100, David Howells wrote:
> Al Viro <viro@parcelfarce.linux.theplanet.co.uk> wrote:
>
> > Architecture-dependent blob passed to mount(2) (aka nfs4_mount_data).
> > If you want it to be a blob, at least have a decency to use encoding
> > that would not depend on alignment rules and word size. Hell, you
> > could use XDR - it's not that nfs would need something new to handle
> > it. Or, better yet, use a normal string.
>
> Mount doesn't appear to permit a big enough blob though. It has a hard limit
> of PAGE_SIZE.

Excuse me? Would the use of fixed offsets, field sizes and endianness
make the blob bigger?

And as for the length of string representation going past 4Kb... that
could be easily dealt with in sys_mount() if it really becomes a problem.
* Re: NFS4 mount problem
From: Bryan Henderson @ 2005-04-18 17:52 UTC (permalink / raw)
To: David Howells
Cc: David S. Miller, Christoph Hellwig, linux-fsdevel, sfr, steved, Al Viro

>> Architecture-dependent blob passed to mount(2) (aka nfs4_mount_data).
>> If you want it to be a blob, at least have a decency to use encoding
>> that would not depend on alignment rules and word size. Hell, you
>> could use XDR - it's not that nfs would need something new to handle
>> it. Or, better yet, use a normal string.
>
>Mount doesn't appear to permit a big enough blob though. It has a hard limit
>of PAGE_SIZE.

That seems to me to be orthogonal to Al's point. You could make an
architecture-independent format for that page that still contains addresses
in user space of additional information. Which would presumably also have
an architecture-independent format.

But why is mount() special here? It's ancient tradition for Linux system
calls to take as parameters, and return as results, in-memory structures
that are dependent on local word size and endianness. Lots of them do.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems
* Re: NFS4 mount problem
From: David Howells @ 2005-04-18 10:36 UTC (permalink / raw)
To: Christoph Hellwig
Cc: David S. Miller, Bryan Henderson, sfr, linux-fsdevel, linux-fsdevel-owner, steved

Christoph Hellwig <hch@infradead.org> wrote:

> I don't think we should encourage filesystem writers to do such stupid
> things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
> down that road.

The problem with NFS4, I think, is that the mount syscall sets a hard limit on
the amount of mount data that's insufficiently large.

David
* Re: NFS4 mount problem
From: David S. Miller @ 2005-04-18 18:37 UTC (permalink / raw)
To: David Howells
Cc: hch, hbryan, sfr, linux-fsdevel, linux-fsdevel-owner, steved

On Mon, 18 Apr 2005 11:36:25 +0100 David Howells <dhowells@redhat.com> wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
>
> > I don't think we should encourage filesystem writers to do such stupid
> > things as ncpfs/smbfs do. In fact I'm totally unhappy that nfs4 went
> > down that road.
>
> The problem with NFS4, I think, is that the mount syscall sets a hard limit on
> the amount of mount data that's insufficiently large.

That's correct, it currently cannot support more than one page
of data. Even worse, that makes the limit platform dependent.
* Re: NFS4 mount problem
From: Trond Myklebust @ 2005-04-17 19:33 UTC (permalink / raw)
To: David Howells; +Cc: Linux Filesystem Development, Steve Dickson

On Fri, 15.04.2005 at 12:57 (+0100), David Howells wrote:
> We've come across an interesting problem with NFS4 mount on a PPC64 box. If
> the mount program is compiled as PPC32, then the mount() syscall is returned
> EFAULT.

So, why is this not a case of "Doctor it hurts..."?

In exactly which case do people have absolutely no alternative but to
run a 32-bit version of the mount program on top of a kernel that was
compiled as PPC64? A simple script that runs "uname -r" and switches a
soft-link between the 32-bit and 64-bit version of "mount" is not rocket
science.

> I would prefer option (4), otherwise we have to find _all_ usages of the
> mount() system calls and fix them.

mount() is not a documented syscall. The binary formats for filesystems
like NFS are only documented inside the kernels to which they apply.
There should therefore be exactly ONE instance of usage, and that is in
the "mount" program itself.

Cheers,
  Trond

--
Trond Myklebust <trond.myklebust@fys.uio.no>
* Re: NFS4 mount problem
From: Bryan Henderson @ 2005-04-18 17:17 UTC (permalink / raw)
To: Trond Myklebust
Cc: David Howells, Linux Filesystem Development, Steve Dickson

>mount() is not a documented syscall. The binary formats for filesystems
>like NFS are only documented inside the kernels to which they apply.

What _is_ a documented system call? Linux is famous for not having
documented interfaces (or, put another way, not distinguishing between an
interface you can read in an official document and one you discover by
reading kernel source code). But of all interfaces in Linux, the system
call interface is probably the most accepted as one a user of the kernel
can rely on.

I don't think a filesystem driver designer should expect mount options to
be private to one particular user space program. Especially one that
isn't even packaged with the driver.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems
* Re: NFS4 mount problem
From: Trond Myklebust @ 2005-04-18 17:59 UTC (permalink / raw)
To: Bryan Henderson
Cc: David Howells, Linux Filesystem Development, Steve Dickson

On Mon, 18.04.2005 at 10:17 (-0700), Bryan Henderson wrote:
> >mount() is not a documented syscall. The binary formats for filesystems
> >like NFS are only documented inside the kernels to which they apply.
>
> What _is_ a documented system call? Linux is famous for not having
> documented interfaces (or, put another way, not distinguishing between an
> interface you can read in an official document and one you discover by
> reading kernel source code). But of all interfaces in Linux, the system
> call interface is probably the most accepted as one a user of the kernel
> can rely on.
>
> I don't think a filesystem driver designer should expect mount options to
> be private to one particular user space program. Especially one that
> isn't even packaged with the driver.

If people really do need a fully documented NFS mount interface, then
the only one that makes sense is a string interface. Looking back at the
manpages, the string mount options are the only thing that has remained
constant over the last 10 years.

We're already up to version 6 of the binary interfaces for v2/v3, and if
you count NFSv4 too, then that makes 7. Choice of which binary interface
to use is entirely dependent on the kernel revision.
Good luck fitting all that (plus future revisions) into something like
sash without doubling its size...

Cheers,
  Trond

--
Trond Myklebust <trond.myklebust@fys.uio.no>
* Re: NFS4 mount problem
From: Andries Brouwer @ 2005-04-20 10:57 UTC (permalink / raw)
To: Trond Myklebust
Cc: Bryan Henderson, David Howells, Linux Filesystem Development, Steve Dickson

On Mon, Apr 18, 2005 at 01:59:28PM -0400, Trond Myklebust wrote:

> If people really do need a fully documented NFS mount interface, then
> the only one that makes sense is a string interface.

You can omit the "If" part and retain the conclusion.
The binary interface for nfs and family is really a pain.

Andries
* Re: NFS4 mount problem
From: David Howells @ 2005-04-18 10:34 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Linux Filesystem Development, Steve Dickson

Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> > We've come across an interesting problem with NFS4 mount on a PPC64 box. If
> > the mount program is compiled as PPC32, then the mount() syscall is returned
> > EFAULT.
>
> So, why is this not a case of "Doctor it hurts..."?

Because:

 (1) The kernel is returning EFAULT to the 32-bit userspace; this implies that
     userspace is handing over a bad address. It isn't, the kernel is
     malfunctioning as it stands.

 (2) The kernel API does not prohibit 32-bit userspace calling mount() under a
     64-bit kernel. All other filesystems cope with it (AFAIK), so NFS4 must
     too.

Either the kernel should return ENOSYS for any 32-bit mount on a 64-bit kernel
or it must support it fully. I think the latter is the right thing to do;
despite what you'd prefer, there are other callers of the mount syscall out
there.

> There should therefore be exactly ONE instance of usage, and that is in
> the "mount" program itself.

Exactly. That should then be the ppc32 mount; which should work equally well
with a ppc32 or a ppc64 kernel.

David
* Re: NFS4 mount problem
From: Trond Myklebust @ 2005-04-18 14:49 UTC (permalink / raw)
To: David Howells; +Cc: Linux Filesystem Development, Steve Dickson

On Mon, 18.04.2005 at 11:34 (+0100), David Howells wrote:

> (1) The kernel is returning EFAULT to the 32-bit userspace; this implies that
>     userspace is handing over a bad address. It isn't, the kernel is
>     malfunctioning as it stands.
>
> (2) The kernel API does not prohibit 32-bit userspace calling mount() under a
>     64-bit kernel. All other filesystems cope with it (AFAIK), so NFS4 must
>     too.
>
> Either the kernel should return ENOSYS for any 32-bit mount on a 64-bit kernel
> or it must support it fully. I think the latter is the right thing to do;
> despite what you'd prefer, there are other callers of the mount syscall out
> there.

No. If you want generalized support for mounting NFS filesystems, then
the right thing to do is to create a userland library that can translate
the mount options, set up the binary structure with sane defaults etc.

Without such a library, it is pointless to contemplate "other callers".
With such a library, you will have a single point for switching between
32bit and 64 bit.

My concern is that we are slowly but surely building up a bigger
in-kernel library for parsing the binary structure than it would take to
parse the naked mount option string. There are only 2 reasons for doing
that parsing in userland:

  1) DNS lookups
  2) Keeping the kernel parsing code small

> > There should therefore be exactly ONE instance of usage, and that is in
> > the "mount" program itself.
>
> Exactly. That should then be the ppc32 mount; which should work equally well
> with a ppc32 or a ppc64 kernel.

Then can we kill the PPC64 binary structure and substitute PPC32?

Cheers,
  Trond

--
Trond Myklebust <trond.myklebust@fys.uio.no>
* Re: NFS4 mount problem
From: Bryan Henderson @ 2005-04-18 22:07 UTC (permalink / raw)
To: Trond Myklebust
Cc: David Howells, Linux Filesystem Development, Steve Dickson

>My concern is that we are slowly but surely building up a bigger
>in-kernel library for parsing the binary structure than it would take to
>parse the naked mount option string.
>
>...
>If people really do need a fully documented NFS mount interface, then
>the only one that makes sense is a string interface. Looking back at the
>manpages, the string mount options are the only thing that have remained
>constant over the last 10 years.
>
>We're already up to version 6 of the binary interfaces for v2/v3, and if
>you count NFSv4 too, then that makes 7.

I don't know the NFS mount option format, but I'm having a hard time
imagining how a string-based format can take less code to parse and be
more forward compatible than a binary one. People don't even use the term
"parse" for binary structures, because parsing typically means turning
strings into binary structures.

Having 6 separate formats isn't the only way to have an evolving binary
interface. People do make extensible binary formats.

>There are only 2 reasons for doing
>that parsing in userland:
>
>  1) DNS lookups
>  2) Keeping the kernel parsing code small

I personally almost never worry about the number of bytes of code, but I
worry a lot about its simplicity. User space code is less costly to
develop and less risky to make a mistake in. I would add,

  3) Keeping the kernel parsing code simple.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems
* Re: NFS4 mount problem
From: Trond Myklebust @ 2005-04-18 23:34 UTC (permalink / raw)
To: Bryan Henderson
Cc: David Howells, Linux Filesystem Development, Steve Dickson

On Mon, 18.04.2005 at 15:07 (-0700), Bryan Henderson wrote:
> >We're already up to version 6 of the binary interfaces for v2/v3, and if
> >you count NFSv4 too, then that makes 7.
>
> I don't know the NFS mount option format, but I'm having a hard time
> imagining how a string-based format can take less code to parse and be
> more forward compatible than a binary one. People don't even use the term
> "parse" for binary structures, because parsing typically means turning
> strings into binary structures.

The string based parser (based, BTW, on the generic string parser in
lib/parser.c) for NFS mount options is already in the kernel, thanks to
NFSroot, and already needs to be maintained. As is the NFSv2/v3 "mount"
RPC code, and everything else that the kernel needs to take over that
duty.
The only extra information we need from userland is the DNS lookup of
the server hostname (NFSv2/v3/v4) and the client IP address (NFSv4 only).

> Having 6 separate formats isn't the only way to have an evolving binary
> interface. People do make extensible binary formats.

I never said they were 6 _separate_ formats. The NFSv2/v3 stuff is one
constantly "extending" binary format. See include/linux/nfs_mount.h.

Note how 4 of those fields are currently entirely obsolete (fd,
old_root, namlen, bsize) and how one more cannot be extended to cope
with IPv6 and other new transports (addr), and how one more (root)
cannot be used for NFSv4 mounts, which had to add in at least 2 more
fields that are unused by NFSv2/v3...

Sure, we could indeed have developed more sensible binary formats if our
1992 crystal ball had told us all about NFSv3 (RFC dates from 1995) and
NFSv4 (RFC dates from 2003). Not to forget lockd, statd, nfsacl, IPv6,
etc...

> I personally almost never worry about the number of bytes of code, but I
> worry a lot about its simplicity. User space code is less costly to
> develop and less risky to make a mistake in. I would add,
>
>   3) Keeping the kernel parsing code simple.

No.

    3) Keeping the kernel parsing code _maintainable_

...and keeping around parsers for all these different formats and fields
and now extra 32-bit counterparts isn't my idea of code simplicity, code
compactness, or code maintainability.

Cheers,
  Trond

--
Trond Myklebust <trond.myklebust@fys.uio.no>
* Re: NFS4 mount problem
From: David Howells @ 2005-04-18 15:23 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Linux Filesystem Development, Steve Dickson

Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> Without such a library, it is pointless to contemplate "other callers".
> With such a library, you will have a single point for switching between
> 32bit and 64 bit.

"Other callers" include such as busybox, sash and uClinux. I'm not sure about
such as Perl, but Perl is hardly in the same class as the other three.
Admittedly, a library is probably the right way to do it - libmount or some
such thing.

> Then can we kill the PPC64 binary structure and substitute PPC32?

Whilst I might be happy to, I'm not sure I can speak for everyone. You can
also use i386 mount on x86_64 for instance; and possibly s390 on s390x,
sparc32 on sparc64 and mips32 on mips64.

David
* Re: NFS4 mount problem
From: Trond Myklebust @ 2005-04-18 15:45 UTC (permalink / raw)
To: David Howells; +Cc: Linux Filesystem Development, Steve Dickson

On Mon, 18.04.2005 at 16:23 (+0100), David Howells wrote:
> Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> > Without such a library, it is pointless to contemplate "other callers".
> > With such a library, you will have a single point for switching between
> > 32bit and 64 bit.
>
> "Other callers" include such as busybox, sash and uClinux. I'm not sure about
> such as Perl, but Perl is hardly in the same class as the other three.
> Admittedly, a library is probably the right way to do it - libmount or some
> such thing.

Hmm... Nope. None of the above have support for NFSv4 AFAICS. busybox
has support for NFSv2/v3, but not v4.

Note you are starting to convince me that the correct way to do this is
to bite the bullet, move all the binary stuff into a compat library that
we can drop at some later time, and then rebase on the NFSroot parser.

Cheers,
  Trond

--
Trond Myklebust <trond.myklebust@fys.uio.no>
* Re: NFS4 mount problem
From: Bryan Henderson @ 2005-04-18 21:50 UTC (permalink / raw)
To: David Howells
Cc: Linux Filesystem Development, Steve Dickson, Trond Myklebust

>(1) The kernel is returning EFAULT to the 32-bit userspace; this implies that
>    userspace is handing over a bad address. It isn't, the kernel is
>    malfunctioning as it stands.
>...
>Either the kernel should return ENOSYS for any 32-bit mount on a 64-bit kernel
>or it must support it fully.

So this point is just the error code? If so, where do you get ENOSYS? A
more usual errno for where a particular filesystem type can't be mounted
is ENODEV.

Choosing errnos is a pretty whimsical thing anyway, since there are so
many more kinds of errors than the authors of the errno space
contemplated, but EFAULT and ENOSYS are two that have a pretty solid
definition. ENOSYS is for when an entire system call type is missing.

I'm not sure we can complain about EFAULT, though, because you really are
supplying an invalid address. You're doing it because you're using the
wrong mount option format, so what you think of as 4 bytes of flags
followed by 4 bytes of address is really 8 bytes of address.

I do understand the more important issue of there being a kernel that
understands both mount option formats; but since you enumerated the errno
issue, I wanted to comment on that one independently.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems