Re: overlayfs NFS export - Trond Myklebust

From: Trond Myklebust <trondmy@primarydata.com>
To: "miklos@szeredi.hu" <miklos@szeredi.hu>,
	"amir73il@gmail.com" <amir73il@gmail.com>
Cc: "bfields@fieldses.org" <bfields@fieldses.org>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"jlayton@poochiereds.net" <jlayton@poochiereds.net>,
	Trond Myklebust <trondmy@primarydata.com>,
	"linux-unionfs@vger.kernel.org" <linux-unionfs@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: overlayfs NFS export
Date: Fri, 7 Apr 2017 14:57:37 +0000
Message-ID: <1491577054.10609.1.camel@primarydata.com>
In-Reply-To: <CAOQ4uxhpKFhksVJTgWyy5YRzunLiQjf9TTgFO_EfBSdWvBfxWg@mail.gmail.com>

On Fri, 2017-04-07 at 17:29 +0300, Amir Goldstein wrote:
> [changing the subject and adding more NFS guys so they can shoot my
> idea down if it is too dumb to live]
> 
> On Fri, Apr 7, 2017 at 4:03 PM, Miklos Szeredi <miklos@szeredi.hu>
> wrote:
> > On Fri, Apr 7, 2017 at 12:47 PM, Amir Goldstein
> > <amir73il@gmail.com> wrote:
> > 
> > > Come to think about it, NFS export of a regular file doesn't need
> > > to follow renames at all:
> > > - The handle for a regular file is always the handle for the real
> > >   lower or upper inode
> > > - To decode a handle, create an O_TMPFILE style overlay dentry,
> > >   which is not linked to any path in overlay, but has the
> > >   _upperdentry/lowerstack setup
> > 
> > I don't think nfs will allow such a scheme.  NFS3 server is
> > stateless, which means there's no open/close in the protocol.
> > Hence we can't copy-up on open(O_WR*) and return a different file
> > handle for writing.  If client looks up a file currently on lower
> > and we return file handle based on lower file, then we must be able
> > to decode that handle after the file has been copied up and even
> > after rename.  And this must work reliably even if the overlay
> > dentry is no longer in the dcache.
> > 
> > So there's no option, other than to have a reverse mapping
> > somewhere.
> > 
> 
> Either I am missing something or you are.
> 
> Consider this scenario:
> 
> On server:
> - touch a
> - ln a b
> 
> On NFS client:
> - rofd = open("a", O_RDONLY)
> 
> On server:
> - rm a
> - reboot
> 
> NFS client must be able to continue to work with rofd
> even after reboot and even after original file was unlinked.
> 
> Furthermore, on server:
> - rm b
> 
> NFS client must continue to work with rofd even though
> NFS server is stateless and even though inode is now
> nlink = 0.
> That is possible because fs will instantiate a disconnected
> dentry when decoding the file handle if there is no dentry
> already in cache.
> 
> So what I am saying is that when nfsd tries to decode
> a handle from overlay mount, and there is no matching
> overlay dentry in cache (with the lower ino of course)
> then we instantiate a new disconnected dentry without
> lookup and set its _upperdentry or lowerstack according
> to the knowledge that we found the handle in underlying
> fs and we checked if it is a descendant of lower_mnt[i] or
> upper_mnt.
> 
> When NFS client opens a new rwfd it WILL get a different
> handle, but it WILL really be a different file than the rofd,
> so that sounds like a good thing?
> 
> I realize it may sound complicated, but redirect_fh patch
> already has all the needed parts in place for this, so the
> proof of whether what I am saying is right or wrong will be
> whether or not I am able to present a working POC...
> 

What is the problem you are trying to solve?

As far as the NFS protocol is concerned, if 2 filehandles that
originate from the same server are equal (i.e. a byte-by-byte
comparison of their contents shows a match), then they MUST refer to
the same file.
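
A minimal sketch of that rule, for illustration only. This is not NFS client or server code; the function name `same_file` and the handle bytes are hypothetical. The point is just that equality is decided on the opaque bytes and nothing else:

```python
def same_file(fh_a: bytes, fh_b: bytes) -> bool:
    """Return True only when two opaque filehandles match byte for byte."""
    return fh_a == fh_b


# Hypothetical handle contents, made up for this example.
fh_lower = bytes.fromhex("01000000a1b2c3d4")
fh_lower_again = bytes.fromhex("01000000a1b2c3d4")
fh_upper = bytes.fromhex("02000000a1b2c3d4")

print(same_file(fh_lower, fh_lower_again))  # identical bytes
print(same_file(fh_lower, fh_upper))        # one byte differs
```

The comparison never looks inside the handle, so a server that hands out identical bytes for two objects has promised that they are one file.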

On the other hand, if the filehandles differ, then they are _treated_
by both the client and the server as if they were pointing to different
files (even if that is not the case). That principle is even encoded in
the NFSv4 protocol, where it is a requirement that file state is
attached to the filehandle.
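
To illustrate the consequence, here is a toy sketch of state keyed by the raw filehandle bytes. It is an assumption-laden model, not how nfsd or the Linux client stores state; `HandleStateTable`, `record_open`, `opens_for`, and both handle values are hypothetical names invented for this example:

```python
from collections import defaultdict


class HandleStateTable:
    """Toy table that tracks open state keyed by raw filehandle bytes."""

    def __init__(self):
        self._state = defaultdict(list)

    def record_open(self, fh: bytes, mode: str) -> None:
        # State is attached to the handle bytes, not to a path or inode.
        self._state[fh].append(mode)

    def opens_for(self, fh: bytes) -> list:
        # Use .get so that a lookup does not create an empty entry.
        return list(self._state.get(fh, []))


table = HandleStateTable()
fh_ro = b"\x01lower-inode"  # hypothetical handle handed out before copy-up
fh_rw = b"\x02upper-inode"  # hypothetical handle handed out after copy-up

table.record_open(fh_ro, "r")
table.record_open(fh_rw, "rw")

print(table.opens_for(fh_ro))
print(table.opens_for(fh_rw))
```

In this model the two handles accumulate unrelated state even if both happen to name the same underlying file, which is the behaviour described above for differing filehandles.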

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com


Thread overview: 15+ messages
2017-04-07 14:29 overlayfs NFS export Amir Goldstein
2017-04-07 14:53 ` Jeff Layton
2017-04-07 15:26   ` Amir Goldstein
2017-04-07 14:57 ` Trond Myklebust [this message]
2017-04-07 15:28   ` Miklos Szeredi
2017-04-07 15:45     ` Amir Goldstein
2017-04-07 15:58       ` Trond Myklebust
2017-04-07 16:10         ` Amir Goldstein
2017-04-07 16:21           ` Trond Myklebust
2017-04-07 18:43             ` Amir Goldstein
2017-04-07 16:47           ` Jeff Layton
2017-04-07 18:53             ` Amir Goldstein
2017-04-07 15:46     ` Trond Myklebust
2017-04-07 15:58       ` Amir Goldstein
2017-04-07 16:02         ` Trond Myklebust
