linux-nfs.vger.kernel.org archive mirror
* Rename dir on server can cause client to get ESTALE
@ 2011-11-14  2:19 NeilBrown
  2011-12-01  1:49 ` Rename dir on server can cause client to get ESTALE - this time with PATCH NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2011-11-14  2:19 UTC (permalink / raw)
  To: Trond Myklebust, NFS, Alexander Viro



hi,
  I've run into another issue that seems to be related to FS_REVAL_DOT.

The script below makes the details precise, but the essence is that if I 'cd'
into a directory on the client, then rename it on the server, then it is
possible that the client will start getting ESTALE when accessing '.' - even
though the directory still exists.

The ESTALE is generated because nfs_lookup_revalidate fails on the dentry, so 
complete_walk (in fs/namei.c) gets failure from d_revalidate() and so sets the
status to -ESTALE.

nfs_lookup_revalidate fails because when it repeats the lookup it sees a
different directory (as you will see the script creates a new directory with
the old name).

I think it only makes sense to do a ->lookup revalidate of the dentry at the
end of the path when there was a real non '.' or '..' name leading to the
dentry.  If we were just looking up '.', we want to revalidate the inode, but
not the dentry.
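
To make the distinction concrete, here is a toy userspace model in C. It is
not kernel code: toy_server, toy_dentry and both functions are invented for
illustration, and a file handle is reduced to a plain number. It only shows
why a check keyed on the name fails after the rename while a check keyed on
the inode would still pass.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: one server directory holding up to 8 entries. */
struct toy_entry { const char *name; unsigned long fh; };	/* fh: file handle */
struct toy_server { struct toy_entry ent[8]; size_t n; };

/* What the client cached when it did the 'cd'. */
struct toy_dentry { const char *name; unsigned long fh; };

/* Revalidate the name: repeat the lookup and compare file handles. */
static bool toy_revalidate_name(const struct toy_server *s,
				const struct toy_dentry *d)
{
	for (size_t i = 0; i < s->n; i++)
		if (strcmp(s->ent[i].name, d->name) == 0)
			return s->ent[i].fh == d->fh;
	return false;	/* the name no longer exists on the server */
}

/* Revalidate the inode: does the object still exist, under any name? */
static bool toy_revalidate_inode(const struct toy_server *s,
				 const struct toy_dentry *d)
{
	for (size_t i = 0; i < s->n; i++)
		if (s->ent[i].fh == d->fh)
			return true;
	return false;
}
```

With a server holding { "adir.moved" -> 100, "adir" -> 200 } and a cached
dentry { "adir", 100 }, the name check fails (the name now leads to handle
200) while the inode check succeeds (handle 100 is still there).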

Unfortunately I cannot see how that distinction could be introduced into the
current path-walk code.

Any ideas?

Thanks,
NeilBrown


SERVER=eli     # name of server.  ssh access required.
DIR=/home      # directory on server to mount
MPOINT=/mnt    # location on client to mount it.
TMP=/neilb/tmp # path to scratch area in $DIR

sudo umount $MPOINT
sudo mount -o vers=3 $SERVER:$DIR $MPOINT

cd /
ssh $SERVER "rm -r $DIR$TMP/*dir*"
ssh $SERVER "mkdir $DIR$TMP/adir"
while [ ! -d $MPOINT$TMP/adir ];
do echo -n . ; sleep 2;
done
cd $MPOINT$TMP/adir || exit
echo "Entered directory"
ls -la > /dev/null
ssh $SERVER "cd $DIR$TMP; mv adir adir.moved"
echo "Moved directory on server"
ls -la > /dev/null
echo -n "Waiting for move to be visible on client"
while ls -la $MPOINT$TMP/adir >/dev/null 2>&1
do echo -n . 
   sleep 3
   (cd / ; ssh $SERVER "cd $DIR$TMP; mkdir bdir ; rmdir bdir" )
done
echo
echo "Make replacement directory on server"
(cd / ; ssh $SERVER "cd $DIR$TMP; mkdir adir")
ls -la $MPOINT$TMP/adir
ls -la




* Re: Rename dir on server can cause client to get ESTALE - this time with PATCH
  2011-11-14  2:19 Rename dir on server can cause client to get ESTALE NeilBrown
@ 2011-12-01  1:49 ` NeilBrown
  2011-12-01  2:12   ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2011-12-01  1:49 UTC (permalink / raw)
  To: Trond Myklebust, Alexander Viro; +Cc: NFS


On Mon, 14 Nov 2011 13:19:29 +1100 NeilBrown <neilb@suse.de> wrote:

> 
> hi,
>   I've run into another issue that seems to be related to FS_REVAL_DOT.
> 
> The script below makes the details precise, but the essence is that if I 'cd'
> into a directory on the client, then rename it on the server, then it is
> possible that the client will start getting ESTALE when accessing '.' - even
> though the directory still exists.
> 
> The ESTALE is generated because nfs_lookup_revalidate fails on the dentry, so 
> complete_walk (in fs/namei.c) gets failure from d_revalidate() and so sets the
> status to -ESTALE.
> 
> nfs_lookup_revalidate fails because when it repeats the lookup it sees a
> different directory (as you will see the script creates a new directory with
> the old name).
> 
> I think it only makes sense to do a ->lookup revalidate of the dentry at the
> end of the path when there was a real non '.' or '..' name leading to the
> dentry.  If we were just looking up '.', we want to revalidate the inode, but
> not the dentry.
> 
> Unfortunately I cannot see how that distinction could be introduced into the
> current path-walk code.
> 
> Any ideas?
> 
> Thanks,
> NeilBrown
> 
> 
> SERVER=eli     # name of server.  ssh access required.
> DIR=/home      # directory on server to mount
> MPOINT=/mnt    # location on client to mount it.
> TMP=/neilb/tmp # path to scratch area in $DIR
> 
> sudo umount $MPOINT
> sudo mount -o vers=3 $SERVER:$DIR $MPOINT
> 
> cd /
> ssh $SERVER "rm -r $DIR$TMP/*dir*"
> ssh $SERVER "mkdir $DIR$TMP/adir"
> while [ ! -d $MPOINT$TMP/adir ];
> do echo -n . ; sleep 2;
> done
> cd $MPOINT$TMP/adir || exit
> echo "Entered directory"
> ls -la > /dev/null
> ssh $SERVER "cd $DIR$TMP; mv adir adir.moved"
> echo "Moved directory on server"
> ls -la > /dev/null
> echo -n "Waiting for move to be visible on client"
> while ls -la $MPOINT$TMP/adir >/dev/null 2>&1
> do echo -n . 
>    sleep 3
>    (cd / ; ssh $SERVER "cd $DIR$TMP; mkdir bdir ; rmdir bdir" )
> done
> echo
> echo "Make replacement directory on server"
> (cd / ; ssh $SERVER "cd $DIR$TMP; mkdir adir")
> ls -la $MPOINT$TMP/adir
> ls -la
> 


.. but answer came there none....

I've looked some more at the code and now would like to propose a patch.
This fixes it for me and feels right.

Opinions?

Thanks,
NeilBrown

From 7abb2d77b4c8d8ca340e372447467d8a47241f83 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Wed, 30 Nov 2011 18:35:13 +1100
Subject: [PATCH] nfs - handle d_revalidate of 'dot' correctly.

When d_revalidate is called on a dentry because FS_REVAL_DOT is set
it isn't really appropriate to revalidate the name.

If the path was simply ".", then the current-working-directory could
have been renamed on the server and should still be accessible as "."
even if it has a new name.

If the path was "/some/long/path/.", then the final component ("path" in
this case) has already been revalidated and there is no particular
need to do it again.

If we change nd->last_type to refer to "the last component looked at"
rather than just "the last component", then these cases can be
detected by "nd->last_type != LAST_NORM".

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/namei.c   |    2 +-
 fs/nfs/dir.c |    9 +++++++++
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 5008f01..6a720f7 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1434,6 +1434,7 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 			}
 		}
 
+		nd->last_type = type;
 		/* remove trailing slashes? */
 		if (!c)
 			goto last_component;
@@ -1458,7 +1459,6 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 
 last_component:
 		nd->last = this;
-		nd->last_type = type;
 		return 0;
 	}
 	terminate_walk(nd);
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index ac28990..f62827a 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1137,6 +1137,15 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 	if (NFS_STALE(inode))
 		goto out_bad;
 
+	if (nd->last_type != LAST_NORM) {
+		/* name not relevant, just inode */
+		error = nfs_revalidate_inode(NFS_SERVER(inode), inode);
+		if (error)
+			goto out_bad;
+		else
+			goto out_valid;
+	}
+
 	error = -ENOMEM;
 	fhandle = nfs_alloc_fhandle();
 	fattr = nfs_alloc_fattr();
-- 
1.7.7.3
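
For readers who want to look at the "nd->last_type != LAST_NORM" test in
isolation, the following is a rough userspace sketch, not the kernel's
path walker. The toy_* names are made up here (the enum values are only
named after the kernel's LAST_* constants), and symlinks, mounts and error
handling are ignored entirely.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins named after the kernel's LAST_* values; toy code only. */
enum toy_last_type { TOY_LAST_NORM, TOY_LAST_ROOT, TOY_LAST_DOT, TOY_LAST_DOTDOT };

/* Classify one path component of the given length. */
static enum toy_last_type toy_classify(const char *name, size_t len)
{
	if (len == 1 && name[0] == '.')
		return TOY_LAST_DOT;
	if (len == 2 && name[0] == '.' && name[1] == '.')
		return TOY_LAST_DOTDOT;
	return TOY_LAST_NORM;
}

/* Walk a path and report the type of the last component looked at. */
static enum toy_last_type toy_last_type_of(const char *path)
{
	enum toy_last_type type = TOY_LAST_ROOT;	/* nothing walked yet */

	while (*path) {
		while (*path == '/')	/* skip separators */
			path++;
		if (!*path)
			break;		/* only trailing slashes were left */
		size_t len = strcspn(path, "/");
		type = toy_classify(path, len);
		path += len;
	}
	return type;
}

/* Toy form of the check added to nfs_lookup_revalidate by the patch. */
static int toy_skip_name_revalidation(const char *path)
{
	return toy_last_type_of(path) != TOY_LAST_NORM;
}
```

In this sketch "." and "/some/long/path/." both end on a dot component, so
only the inode would be revalidated, whereas "/some/long/path" ends on a
real name and would take the normal lookup-based revalidation.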







* Re: Rename dir on server can cause client to get ESTALE - this time with PATCH
  2011-12-01  1:49 ` Rename dir on server can cause client to get ESTALE - this time with PATCH NeilBrown
@ 2011-12-01  2:12   ` Al Viro
  2011-12-01  2:24     ` Trond Myklebust
  0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2011-12-01  2:12 UTC (permalink / raw)
  To: NeilBrown; +Cc: Trond Myklebust, NFS

On Thu, Dec 01, 2011 at 12:49:22PM +1100, NeilBrown wrote:

> If the path was "/some/long/path/.", then the final component ("path" in
> this case) has already been revalidated and there is no particular
> need to do it again.
> 
> If we change nd->last_type to refer to "the last component looked at"
> rather than just "the last component", then these cases can be
> detected by "nd->last_type != LAST_NORM".

This is just plain wrong.  Let's *not* bring more dependencies on
nameidata into ->d_revalidate().  The goal is to get rid of it there...

FWIW, if you want a really nasty bug in that area, consider this:

mkdir /tmp/a
mkdir /tmp/b
echo "local file" >/tmp/x
mount -t nfs4 $SOMETHING /tmp/a
mount -t nfs4 $SOMETHING /tmp/b
echo "NFS file" >/tmp/a/x
mount --bind /tmp/x /tmp/a/x

now try opening /tmp/b/x.  And watch the NFS traffic; there won't be OPEN
request for x on server.  Why?  Because NFS sees that x is a mountpoint in
*some* instance of that filesystem.  And decides that opening it would be
wrong.  And so it would, if we were asked to open /tmp/a/x.  Alas, in this
case, while dentry is the same, it does *not* have anything mounted on it.
What we get is ->d_revalidate() returning without issuing OPEN and ->open()
being called - again, without issuing OPEN, since it assumes that ->lookup()
or ->d_revalidate() had done it for us.
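
The difference between "mounted on in some instance" and "mounted on here"
can be sketched with a toy userspace model. None of this is kernel code:
the toy_* structures and functions are invented for illustration, with a
per-dentry flag standing in for the check that goes wrong and a
(mount, dentry) table standing in for the check that would be right.

```c
#include <stdbool.h>
#include <stddef.h>

/* One dentry shared by every mount of the same filesystem (toy model). */
struct toy_dentry { const char *name; bool mounted_somewhere; };

/* One instance of the filesystem in the namespace, e.g. /tmp/a or /tmp/b. */
struct toy_vfsmount { const char *where; };

/* "Something is mounted on this dentry as seen through this mount." */
struct toy_mount_entry {
	const struct toy_vfsmount *parent;
	const struct toy_dentry *mountpoint;
};

/* Per-dentry check: true if anything is mounted on it in ANY instance. */
static bool toy_dentry_is_mountpoint(const struct toy_dentry *d)
{
	return d->mounted_somewhere;
}

/* Per-path check: true only if something is mounted on (mnt, dentry). */
static bool toy_path_is_mountpoint(const struct toy_mount_entry *tab, size_t n,
				   const struct toy_vfsmount *mnt,
				   const struct toy_dentry *d)
{
	for (size_t i = 0; i < n; i++)
		if (tab[i].parent == mnt && tab[i].mountpoint == d)
			return true;
	return false;
}
```

With one table entry for (/tmp/a, x), the per-dentry check says "mountpoint"
for x no matter which mount it was reached through, while the per-path check
says so only for /tmp/a and not for /tmp/b.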

Plain IO on resulting descriptor will work and work correctly (you'll get
"NFS file\n" read from it), but try to do F_SETLK on it and it'll fail
since that requires the server to have seen an OPEN.
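
A minimal sketch of a helper for trying this, assuming nothing beyond the
standard fcntl() locking interface. The name try_write_lock is made up; the
caller is assumed to have opened /tmp/b/x read-write already, and on that
descriptor this is the call that would be expected to fail as described.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to take a non-blocking write lock covering the whole file.
 * Returns 0 on success, -1 on failure with errno set by fcntl().
 */
static int try_write_lock(int fd)
{
	struct flock fl = {
		.l_type = F_WRLCK,	/* exclusive lock */
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,		/* 0 means "to end of file" */
	};

	return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}
```

On a local file the call should normally succeed, which makes for a useful
control when comparing against the NFS case.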

As far as I can tell, the idea of open done in ->d_revalidate() is
unsalvageable.  It's simply the wrong place for that.  Note that NFS
is the only filesystem trying to do atomic open stuff in its ->d_revalidate()
and it's not succeeding.


* Re: Rename dir on server can cause client to get ESTALE - this time with PATCH
  2011-12-01  2:12   ` Al Viro
@ 2011-12-01  2:24     ` Trond Myklebust
  2011-12-01  2:47       ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2011-12-01  2:24 UTC (permalink / raw)
  To: Al Viro; +Cc: NeilBrown, NFS

On Thu, 2011-12-01 at 02:12 +0000, Al Viro wrote: 
> On Thu, Dec 01, 2011 at 12:49:22PM +1100, NeilBrown wrote:
> 
> > If the path was "/some/long/path/.", then the final component ("path" in
> > this case) has already been revalidated and there is no particular
> > need to do it again.
> > 
> > If we change nd->last_type to refer to "the last component looked at"
> > rather than just "the last component", then these cases can be
> > detected by "nd->last_type != LAST_NORM".
> 
> This is just plain wrong.  Let's *not* bring more dependencies on
> nameidata into ->d_revalidate().  The goal is to get rid of it there...
> 
> FWIW, if you want a really nasty bug in that area, consider this:
> 
> mkdir /tmp/a
> mkdir /tmp/b
> echo "local file" >/tmp/x
> mount -t nfs4 $SOMETHING /tmp/a
> mount -t nfs4 $SOMETHING /tmp/b
> echo "NFS file" >/tmp/a/x
> mount --bind /tmp/x /tmp/a/x
> 
> now try opening /tmp/b/x.  And watch the NFS traffic; there won't be OPEN
> request for x on server.  Why?  Because NFS sees that x is a mountpoint in
> *some* instance of that filesystem.  And decides that opening it would be
> wrong.  And so it would, if we were asked to open /tmp/a/x.  Alas, in this
> case, while dentry is the same, it does *not* have anything mounted on it.
> What we get is ->d_revalidate() returning without issuing OPEN and ->open()
> being called - again, without issuing OPEN, since it assumes that ->lookup()
> or ->d_revalidate() had done it for us.
> 
> Plain IO on resulting descriptor will work and work correctly (you'll get
> "NFS file\n" read from it), but try to do F_SETLK on it and it'll fail
> since that requires the server to have seen an OPEN.

We can possibly fix this for the NFSv4.1 case since that adds support
for open-by-filehandle. However, I agree that NFSv4.0 is unfixable: all
OPENs are required to do the equivalent of a lookup, which isn't
possible in the bind mount case.

> As far as I can tell, the idea of open done in ->d_revalidate() is
> unsalvageable.  It's simply the wrong place for that.  Note that NFS
> is the only filesystem trying to do atomic open stuff in its ->d_revalidate()
> and it's not succeeding.

Not doing an open there is prohibitively expensive, though: you are
likely to see your cached inode flushed down the toilet if you just drop
the dentry...

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com



* Re: Rename dir on server can cause client to get ESTALE - this time with PATCH
  2011-12-01  2:24     ` Trond Myklebust
@ 2011-12-01  2:47       ` Al Viro
  0 siblings, 0 replies; 5+ messages in thread
From: Al Viro @ 2011-12-01  2:47 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: NeilBrown, NFS

On Wed, Nov 30, 2011 at 09:24:18PM -0500, Trond Myklebust wrote:

> > As far as I can tell, the idea of open done in ->d_revalidate() is
> > unsalvageable.  It's simply the wrong place for that.  Note that NFS
> > is the only filesystem trying to do atomic open stuff in its ->d_revalidate()
> > and it's not succeeding.
> 
> Not doing an open there is prohibitively expensive, though: you are
> likely to see your cached inode flushed down the toilet if you just drop
> the dentry...

Wrong.  All you really need is to have that attempt to issue OPEN shifted
into ->open() itself.  The only interesting part is that we might need
to drop the original dentry and use a new one for ->f_path.dentry.

Don't drop that dentry; after the case in ->d_revalidate() that would have
attempted that OPEN you would either cross into covering vfsmount (in which
case dentry should be left alone as you are doing now) or issue ->open().
If it's really not valid (i.e. if OPEN yields a different inode), we can
deal with that in ->open() just fine.  The *only* subtle part is how to
deal with "it's a symlink, go away" from the server.  Which will require
changes in do_last().  I have that stuff; it'll need debugging and serious
review once posted.  Which I'm going to do over the weekend.

