All of lore.kernel.org
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>, NFS <linux-nfs@vger.kernel.org>
Subject: Re: [patch/rfc] allow exported (and *not* exported) filesystems to be unmounted.
Date: Mon, 8 Jul 2013 17:30:03 +1000	[thread overview]
Message-ID: <20130708173003.131cd901@notabene.brown> (raw)
In-Reply-To: <20130702155059.GD31697@fieldses.org>

[-- Attachment #1: Type: text/plain, Size: 6561 bytes --]

On Tue, 2 Jul 2013 11:50:59 -0400 "J. Bruce Fields" <bfields@fieldses.org>
wrote:

> On Tue, Jul 02, 2013 at 08:24:13AM +1000, NeilBrown wrote:
> > On Mon, 1 Jul 2013 15:12:38 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > wrote:
> > 
> > > On Thu, Jun 06, 2013 at 10:05:12AM +1000, NeilBrown wrote:
> > > > On Wed, 5 Jun 2013 09:36:58 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > > > wrote:
> > > > 
> > > > > On Wed, Jun 05, 2013 at 04:19:34PM +1000, NeilBrown wrote:
> > > > > > On Wed, 5 Jun 2013 04:41:15 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote:
> > > > > > 
> > > > > > > On Wed, Jun 05, 2013 at 01:05:41PM +1000, NeilBrown wrote:
> > > > > > > > 
> > > > > > > > Hi Bruce,
> > > > > > > >  this is a little issue that seems to keep coming up so I thought it might be
> > > > > > > >  time to fix it.
> > > > > > > > 
> > > > > > > >  As you know, a filesystem that is exported cannot be unmounted as the export
> > > > > > > >  cache holds a reference to it.  Though if it hasn't been accessed for a
> > > > > > > >  while then it can.
> > > > > > > > 
> > > > > > > >  As I hadn't realised before sometimes *non* exported filesystems can be
> > > > > > > >  pinned to.  A negative entry in the cache can pin a filesystem just as
> > > > > > > >  easily as a positive entry.
> > > > > > > >  An amusing, if somewhat contrived, example is that if you export '/' with
> > > > > > > >  crossmnt and:
> > > > > > > > 
> > > > > > > >     mount localhost:/ /mnt
> > > > > > > >     ls -l /
> > > > > > > >     umount /mnt
> > > > > > > > 
> > > > > > > >  the umount might fail.  This is because the "ls -l" tried to export every
> > > > > > > >  filesystem found mounted in '/'.  The export of "/mnt" failed of course
> > > > > > > >  because you cannot re-export an NFS filesystem.  But it is still in the
> > > > > > > >  cache.
> > > > > > > >  An 'exportfs -f' fixes this, but shouldn't be necessary.
> > > > > 
> > > > > Yeah, ugh.  As a less contrived example, can the default v4 root export
> > > > > lead to arbitrary filesystems being pinned just because a client tried
> > > > > to mount the wrong path?
> > > > 
> > > > I think it can only pin filesystems that are exported, either explicitly or
> > > > via a parent being exported with 'crossmnt'.
> > > 
> > > But see utils/mountd/v4root.c, and:
> > > 
> > > 	[root@server ~]# exportfs -v
> > > 	/export <world>(rw,...)
> > > 	[root@server ~]# mount /mnt
> > > 
> > > 		[root@pip4 ~]# mount pip4:/ /mnt/
> > > 		[root@pip4 ~]# ls -l /mnt/
> > > 		total 4
> > > 		drwxrwxrwt 3 root root 4096 Jun  7 10:34 export
> > > 		[root@pip4 ~]# 
> > > 
> > > 	[root@server ~]# umount /mnt/
> > > 	umount: /mnt: target is busy.
> > > 	...
> > > 	[root@server ~]# grep /mnt /proc/net/rpc/nfsd.export/content 
> > > 	# /mnt	*()
> > 
> > You make a good point.  I didn't think that would happen, and I think I could
> > argue that it is a bug.
> 
> Definitely looks like a bug to me.
> 
> > I have no idea how easy it would be to "fix" without
> > pouring over the code for a while though.  Or whether it is worth "fixing".
> 
> As long as clients are mostly just LOOKUPing down to the export they
> care about people may not hit this too much, but no doubt somebody will
> hit this eventually.
> 
> > > > > Could the export cache be indexed on a path string instead of a struct
> > > > > path?
> > > > 
> > > > Yes.  It would mean lots of extra pathname lookups and possibly lots of
> > > > "d_path()" calls.
> > > 
> > > Right.  Ugh.  Still, struct path seems wrong as it's not something gssd
> > > knows about, and there's not really a 1-1 mapping between the two (see
> > > e.g. the recent complaint about the case where the struct path
> > > represents a lazy-unmounted export
> > > http://mid.gmane.org/<20130625191008.GA20277@us.ibm.com> ).
> > 
> > I noticed that but haven't really followed it (though I just checked and
> > there isn't much to follow...)
> > 
> > What do we *want* to happen in this case?  I would argue that when the
> > filesystem is detached the export should immediately become invalid.
> > 
> > We could possibly put a check in exp_find_key() to see if ek->ek_path
> > was still attached (not sure how to do that) and if it is: invalidate the ek
> > before calling cache_check().  If the "is path detach" test is cheap, we
> > should probably do this.
> > 
> > Or - we could flush relevant entries from the cache whenever there is a
> > change in the mount table.  That is certainly the preferred option if the "is
> > path detached" test is at all expensive.  But it seems it isn't completely
> > clear how that flushing should be triggered...
> 
> There are probably other ways to get this kind of hang too: mounting
> over an export? mount --move?

I could argue that "mount --move" should  trigger a flush just like umount
does.  Mounting over an export (or mounting over a parent of an export) is
not so easy.

The core problem which causes the hang is that the path requested by
svc_export_request cannot be used to refer to the exp->ex_path.
This is something that svc_export_request could check.
e.g. after "d_path", it could "kern_path(pth, 0, &old_path);" and if that
fails or old_path doesn't match ex_path, then somehow invalidate the cache
entry....  though that is a bit awkward. We really want to catch the problem
at exp_find_key time.  Doing it there would be more of a performance burden.

Can d_path tell us that the path isn't reachable?
I think __d_path can, but that isn't exported (yet).  Using that could avoid
the "kern_path()" call.

So I'm thinking:

 - in svc_export_request() use __d_path (which needs to be exported)
 - if that fails, set a new CACHE_STALE flag on the cache_head
 - cache_read notices CACHE_STALE and calls cache_fresh_unlocked() (I think)
   to remove it from the queue.
 - cache_is_valid treats CACHE_STALE like CACHE_NEGATIVE only -ESTALE is
   returned
 - if exp_find() gets -ESTALE from exp_get_by_name, it sets CACHE_STALE on
   'ek'.

It gets a bit complicated doesn't it?

An alternative is that if rpc.mountd gets a request for a path that doesn't
exist, or that when it responds to a request, the request remains in the
channel, then it simply flushes everything.

This is a bit heavy handed, but it is a rare occurrence so maybe that doesn't
matter.

It doesn't solve my desire for umount of exported filesystems to "just work"
though.

NeilBrown


 - 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

  reply	other threads:[~2013-07-08  7:30 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-05  3:05 [patch/rfc] allow exported (and *not* exported) filesystems to be unmounted NeilBrown
2013-06-05  3:41 ` Al Viro
2013-06-05  6:19   ` NeilBrown
2013-06-05 13:36     ` J. Bruce Fields
2013-06-06  0:05       ` NeilBrown
2013-07-01 19:12         ` J. Bruce Fields
2013-07-01 22:24           ` NeilBrown
2013-07-02 15:50             ` J. Bruce Fields
2013-07-08  7:30               ` NeilBrown [this message]
2013-07-08 20:04                 ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130708173003.131cd901@notabene.brown \
    --to=neilb@suse.de \
    --cc=bfields@fieldses.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.