From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: trond.myklebust@primarydata.com, linux-nfs@vger.kernel.org,
Eric Paris <eparis@parisplace.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v1 38/38] nfs: add a Kconfig option for NFS reexporting and documentation
Date: Thu, 14 Jan 2016 17:21:27 -0500
Message-ID: <20160114222127.GA4177@fieldses.org>
In-Reply-To: <20151119192849.30cb4549@tlielax.poochiereds.net>
On Thu, Nov 19, 2015 at 07:28:49PM -0500, Jeff Layton wrote:
> On Thu, 19 Nov 2015 19:04:15 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
>
> > On Wed, Nov 18, 2015 at 04:15:21PM -0500, Jeff Layton wrote:
> > > On Wed, 18 Nov 2015 15:22:20 -0500
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > >
> > > > On Tue, Nov 17, 2015 at 06:53:00AM -0500, Jeff Layton wrote:
> > > > > +Filehandle size:
> > > > > +----------------
> > > > > +The maximum filehandle size is governed by the NFS version. Version 2
> > > > > +used fixed 32 byte filehandles. Version 3 moved to variable length
> > > > > +filehandles that can be up to 64 bytes in size. NFSv4 increased that
> > > > > +maximum to 128 bytes.
> > > > > +
> > > > > +When reexporting an NFS filesystem, the underlying filehandle from the
> > > > > +server must be embedded inside the filehandles presented to clients.
> > > > > +Thus if the underlying server presents filehandles that are too big, the
> > > > > +reexporting server can fail to encode them. This can lead to
> > > > > +NFSERR_OPNOTSUPP errors being returned to clients.
> > > > > +
> > > > > > +This is not a trivial thing to programmatically determine ahead of time
> > > > > +(and it can vary even within the same server), so some foreknowledge of
> > > > > +how the underlying server constructs filehandles, and their maximum
> > > > > +size is a must.
> > > >
> > > > This is the trickiest one, since it depends on an undocumented
> > > > implementation detail of the server.
> > > >
> > >
> > > Yes, indeed...
> > >
> > > > Do we even know if this works for all the exportable Linux filesystems?
> > > >
> > > > If proxying NFSv4.x servers is actually useful, could we add a per-fs
> > > > maximum-filesystem-size attribute to the protocol?
> > > >
> > >
> > > Erm, I think you mean maximum-filehandle-size, but I get your point...
> >
> > Whoops, thanks.
> >
> > > It's tough to do more than a quick survey, but looking at new-style fh:
> > >
> > > The max fsid len seems to be 28 bytes (FSID_UUID16_INUM), though you
> > > can get that down to 8 bytes if you specify the fsid directly. The fsid
> > > choice is weird, because it sort of depends on the filehandle sent by
> > > the client (which is used as a template), so I guess we really do need
> > > to assume worst-case.
> >
> > The client can only ever use filehandles it's been given, so if the
> > backend server's always been configured to use a certain kind (e.g. if
> > the exports have fsid= set), then we're OK, we're not responsible for
> > clients that guess random filehandles.
> >
> > > Once that's done, the encode_fh routines add the fileid part. btrfs has
> > > a pretty large maximum one: 40 bytes. That brings the max size up to 68
> > > bytes, which is already too large for NFSv3, before we ever get to
> > > the part where we embed that inside another fh. We require another 12
> > > bytes on top of the "underlying" filehandle for reexporting.
> >
> > So it's not necessarily that bad for nfsd, though of course it makes it
> > more complicated to configure the backend server. Well, and knfsd has
> > v3 support so this is all a bit academic I guess.
> >
>
> You just have to make sure you vet the filehandle size on the stuff
> you're reexporting. In our use-case, we know that the backend server's
> filehandles are well under 42 bytes, so we're well under the max size.
>
> One thing we could consider is promoting the dprintk in nfs_encode_fh
> when this occurs to a pr_err or something. That would at least make
> it very obvious when that occurs...
>
> > So I'm having trouble weighing the benefits of this patch set against
> > the risks.
> >
> > It's not even necessarily true that filehandles on a given filesystem
> > need be constant length. In theory a server could decide to start
> > giving out bigger filehandles some day (as long as it continued to
> > respect the old ones), and the proxy would break. In practice maybe
> > nobody does that.
> >
>
> Hard to say. There are a lot of oddball servers out there. There
> certainly are risks involved in reexporting, particularly if you don't
> heed the caveats. It's for good reason this Kconfig option defaults to
> "n". ;)
>
> OTOH, the kernel shouldn't crash or anything if that occurs. If your
> filehandles are too large to be embedded, then you just end up getting
> back FILEID_INVALID on the encode_fh. That sucks if it occurs, but
> it shouldn't happen if you're careful about what gets reexported.

OK, sorry for the long silence on this.

Basically I'm having trouble making the case to myself here:

	- On the one hand, having you guys carry all this stuff is
	  annoying, I'd rather our code bases were closer.
	- On the other hand, I can't see taking something that's in
	  practice basically only useful for one proprietary server,
	  which is the way it looks to me right now.
	- Also, "NFS proxying" *sounds* much more general than it really
	  is, and I fear a lot of people are going to fall into that
	  trap no matter how we warn them.

Gah.

Anyway, for now I should take the one tracepoint patch at least (and
shouldn't some of the fs patches go in regardless?) but I'm punting on
the rest.

--b.
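
The size arithmetic quoted earlier in the thread (28-byte worst-case fsid,
40-byte btrfs fileid, 12 bytes of reexport overhead) can be checked with a
rough sketch like the one below. It is only an illustration: the helper names
are hypothetical, the byte counts are the ones quoted in the thread rather
than values read from kernel headers, and it ignores any additional header
the reexporting knfsd puts around the handle, so real handles may be larger.

```python
# Maximum variable-length filehandle sizes per NFS version, as stated in the
# quoted documentation (v2 is omitted because its handles are fixed-size).
MAX_FH_SIZE = {3: 64, 4: 128}

# Extra bytes needed on top of the underlying filehandle when reexporting
# (the "another 12 bytes" figure from the thread; treated as an assumption).
REEXPORT_OVERHEAD = 12


def embedded_fh_size(fsid_len, fileid_len):
    """Size of the underlying handle plus the reexport overhead.

    Simplification: the underlying handle is modelled as fsid + fileid only,
    following the arithmetic used in the thread.
    """
    return fsid_len + fileid_len + REEXPORT_OVERHEAD


def fits(fsid_len, fileid_len, nfs_version):
    """True if the wrapped handle stays within the version's maximum."""
    return embedded_fh_size(fsid_len, fileid_len) <= MAX_FH_SIZE[nfs_version]


if __name__ == "__main__":
    # Worst case from the thread: 28-byte fsid, 40-byte btrfs fileid.
    for version in (3, 4):
        print(version, embedded_fh_size(28, 40), fits(28, 40, version))
    # Backend exports configured with fsid= (8-byte fsid in the thread).
    print(3, embedded_fh_size(8, 40), fits(8, 40, 3))
```

Under these assumptions the worst case needs 80 bytes, which does not fit in
an NFSv3 handle but does fit in an NFSv4 one, while a backend that always
uses the 8-byte fsid form comes to 60 bytes.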
Thread overview: 52+ messages
2015-11-17 11:52 [PATCH v1 00/38] Allow NFS filesystems to be reexported via knfsd Jeff Layton
2015-11-17 11:52 ` [PATCH v1 01/38] nfsd: add new io class tracepoint Jeff Layton
2015-11-17 11:52 ` [PATCH v1 02/38] fs: have flush_delayed_fput flush the workqueue job Jeff Layton
2015-11-17 11:52 ` [PATCH v1 03/38] fs: add a kerneldoc header to fput Jeff Layton
2015-11-17 11:52 ` [PATCH v1 04/38] fs: rename "delayed_fput" infrastructure to "fput_global" Jeff Layton
2015-11-17 11:52 ` [PATCH v1 05/38] fs: add fput_global Jeff Layton
2015-11-17 11:52 ` [PATCH v1 06/38] fsnotify: fix a sparse warning Jeff Layton
2015-11-17 11:52 ` [PATCH v1 07/38] fsnotify: export several symbols Jeff Layton
2015-11-17 11:52 ` [PATCH v1 08/38] fsnotify: destroy marks with call_srcu instead of dedicated thread Jeff Layton
2015-11-17 11:52 ` [PATCH v1 09/38] fsnotify: add a srcu barrier for fsnotify Jeff Layton
2015-11-17 11:52 ` [PATCH v1 10/38] locks: create a new notifier chain for lease attempts Jeff Layton
2015-11-17 11:52 ` [PATCH v1 11/38] sunrpc: add a new cache_detail operation for when a cache is flushed Jeff Layton
2015-11-17 11:52 ` [PATCH v1 12/38] nfsd: add a new struct file caching facility to nfsd Jeff Layton
2015-11-17 11:52 ` [PATCH v1 13/38] nfsd: keep some rudimentary stats on nfsd_file cache Jeff Layton
2015-11-17 11:52 ` [PATCH v1 14/38] nfsd: allow filecache open to skip fh_verify check Jeff Layton
2015-11-17 11:52 ` [PATCH v1 15/38] nfsd: hook up nfsd_write to the new nfsd_file cache Jeff Layton
2015-11-17 11:52 ` [PATCH v1 16/38] nfsd: hook up nfsd_read to the " Jeff Layton
2015-11-17 11:52 ` [PATCH v1 17/38] nfsd: hook nfsd_commit up " Jeff Layton
2015-11-17 11:52 ` [PATCH v1 18/38] nfsd: convert nfs4_file->fi_fds array to use nfsd_files Jeff Layton
2015-11-17 11:52 ` [PATCH v1 19/38] nfsd: have nfsd_test_lock use the nfsd_file cache Jeff Layton
2015-11-17 11:52 ` [PATCH v1 20/38] nfsd: convert fi_deleg_file and ls_file fields to nfsd_file Jeff Layton
2015-11-17 11:52 ` [PATCH v1 21/38] nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache Jeff Layton
2015-11-17 11:52 ` [PATCH v1 22/38] nfsd: rip out the raparms cache Jeff Layton
2015-11-17 11:52 ` [PATCH v1 23/38] nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations Jeff Layton
2015-11-17 11:52 ` [PATCH v1 24/38] nfsd: allow lockd to be forcibly disabled Jeff Layton
2015-11-17 11:52 ` [PATCH v1 25/38] nfsd: add errno mapping for EREMOTEIO Jeff Layton
2015-11-17 11:52 ` [PATCH v1 26/38] nfsd: return EREMOTE if we find an S_AUTOMOUNT inode Jeff Layton
2015-11-17 11:52 ` [PATCH v1 27/38] nfsd: allow filesystems to opt out of subtree checking Jeff Layton
2015-11-17 22:53 ` Jeff Layton
2015-11-17 11:52 ` [PATCH v1 28/38] nfsd: close cached files prior to a REMOVE or RENAME that would replace target Jeff Layton
2015-11-17 11:52 ` [PATCH v1 29/38] nfsd: retry once in nfsd_open on an -EOPENSTALE return Jeff Layton
2015-11-17 11:52 ` [PATCH v1 30/38] nfsd: close cached file when underlying file systems says no such file Jeff Layton
2015-11-17 11:52 ` [PATCH v1 31/38] nfs: replace d_add with d_splice_alias in atomic_open Jeff Layton
2015-11-19 20:06 ` J. Bruce Fields
2015-11-19 20:52 ` Trond Myklebust
2015-11-19 20:59 ` Jeff Layton
2015-11-19 22:32 ` J. Bruce Fields
2015-11-17 11:52 ` [PATCH v1 32/38] nfs: add encode_fh export op Jeff Layton
2015-11-17 11:52 ` [PATCH v1 33/38] nfs: add fh_to_dentry " Jeff Layton
2015-11-17 11:52 ` [PATCH v1 34/38] nfs: nfs_fh_to_dentry() make use of inode cache Jeff Layton
2015-11-17 11:52 ` [PATCH v1 35/38] nfs4: add NFSv4 LOOKUPP handlers Jeff Layton
2015-11-17 11:52 ` [PATCH v1 36/38] nfs: add a get_parent export operation for NFS Jeff Layton
2015-11-17 11:52 ` [PATCH v1 37/38] nfs: set export ops Jeff Layton
2015-11-17 11:53 ` [PATCH v1 38/38] nfs: add a Kconfig option for NFS reexporting and documentation Jeff Layton
2015-11-18 20:22 ` J. Bruce Fields
2015-11-18 21:15 ` Jeff Layton
2015-11-18 22:30 ` Frank Filz
2015-11-19 14:01 ` Jeff Layton
2015-11-20 0:04 ` J. Bruce Fields
2015-11-20 0:28 ` Jeff Layton
2016-01-14 22:21 ` J. Bruce Fields [this message]
2016-01-15 16:00 ` Jeff Layton