* [PATCH 0/2] vfs: plug some holes involving LAST_BIND symlinks and file bind mounts (try #6)
@ 2009-12-02 19:59 Jeff Layton
2009-12-02 19:59 ` [PATCH 1/2] vfs: force reval of target when following LAST_BIND symlinks Jeff Layton
2009-12-02 19:59 ` [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems Jeff Layton
0 siblings, 2 replies; 15+ messages in thread
From: Jeff Layton @ 2009-12-02 19:59 UTC (permalink / raw)
To: linux-fsdevel, linux-kernel; +Cc: jamie, pavel, miklos, viro, duaneg, ebiederm
This patchset is another attempt to add missing dentry revalidations to
certain places in the lookup codepath. The main difference from the last
patchset is:
* I've dropped the patch that added permissions checks when chasing
LAST_BIND symlinks. There's a lot disagreement about whether the
current behavior is even a bug. I'd prefer to see more concensus on
that point before we do anything here.
* I've rejiggered the error handling in this codepath to ensure that the
bind mounts could still be unmounted. Testing showed that returning an
error in do_lookup when a bind-mounted dentry went stale made it
un-unmountable.
Comments and suggestions appreciated...
Jeff Layton (2):
vfs: force reval of target when following LAST_BIND symlinks
vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT
filesystems
fs/namei.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 55 insertions(+), 1 deletions(-)
^ permalink raw reply [flat|nested] 15+ messages in thread* [PATCH 1/2] vfs: force reval of target when following LAST_BIND symlinks 2009-12-02 19:59 [PATCH 0/2] vfs: plug some holes involving LAST_BIND symlinks and file bind mounts (try #6) Jeff Layton @ 2009-12-02 19:59 ` Jeff Layton 2009-12-02 23:53 ` Eric W. Biederman 2009-12-03 10:39 ` Miklos Szeredi 2009-12-02 19:59 ` [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems Jeff Layton 1 sibling, 2 replies; 15+ messages in thread From: Jeff Layton @ 2009-12-02 19:59 UTC (permalink / raw) To: linux-fsdevel, linux-kernel; +Cc: jamie, pavel, miklos, viro, duaneg, ebiederm procfs-style symlinks return a last_type of LAST_BIND without an actual path string. This causes __follow_link to skip calling __vfs_follow_link and so the dentry isn't revalidated. This is a problem when the link target sits on NFSv4 as it depends on the VFS to revalidate the dentry before using it on an open call. Ensure that this occurs by forcing a revalidation of the target dentry of LAST_BIND symlinks. Signed-off-by: Jeff Layton <jlayton@redhat.com> --- fs/namei.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 45 insertions(+), 0 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 3374917..339789e 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -414,6 +414,46 @@ do_revalidate(struct dentry *dentry, struct nameidata *nd) } /* + * force_reval_path - force revalidation of a dentry + * + * In some situations the path walking code will trust dentries without + * revalidating them. This causes problems for filesystems that depend on + * d_revalidate to handle file opens (e.g. NFSv4). When FS_REVAL_DOT is set + * (which indicates that it's possible for the dentry to go stale), force + * a d_revalidate call before proceeding. + * + * Returns 0 if the revalidation was successful. If the revalidation fails, + * either return the error returned by d_revalidate or -ESTALE if the + * revalidation it just returned 0. If d_revalidate returns 0, we attempt to + * invalidate the dentry. It's up to the caller to handle putting references + * to the path if necessary. + */ +static int +force_reval_path(struct path *path, struct nameidata *nd) +{ + int status; + struct dentry *dentry = path->dentry; + + /* + * only check on filesystems where it's possible for the dentry to + * become stale. It's assumed that if this flag is set then the + * d_revalidate op will also be defined. + */ + if (!(dentry->d_sb->s_type->fs_flags & FS_REVAL_DOT)) + return 0; + + status = dentry->d_op->d_revalidate(dentry, nd); + if (status > 0) + return 0; + + if (!status) { + d_invalidate(dentry); + status = -ESTALE; + } + return status; +} + +/* * Internal lookup() using the new generic dcache. * SMP-safe */ @@ -641,6 +681,11 @@ static __always_inline int __do_follow_link(struct path *path, struct nameidata error = 0; if (s) error = __vfs_follow_link(nd, s); + else if (nd->last_type == LAST_BIND) { + error = force_reval_path(&nd->path, nd); + if (error) + path_put(&nd->path); + } if (dentry->d_inode->i_op->put_link) dentry->d_inode->i_op->put_link(dentry, nd, cookie); } -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] vfs: force reval of target when following LAST_BIND symlinks 2009-12-02 19:59 ` [PATCH 1/2] vfs: force reval of target when following LAST_BIND symlinks Jeff Layton @ 2009-12-02 23:53 ` Eric W. Biederman 2009-12-03 10:39 ` Miklos Szeredi 1 sibling, 0 replies; 15+ messages in thread From: Eric W. Biederman @ 2009-12-02 23:53 UTC (permalink / raw) To: Jeff Layton Cc: linux-fsdevel, linux-kernel, jamie, pavel, miklos, viro, duaneg Jeff Layton <jlayton@redhat.com> writes: > procfs-style symlinks return a last_type of LAST_BIND without an actual > path string. This causes __follow_link to skip calling __vfs_follow_link > and so the dentry isn't revalidated. > > This is a problem when the link target sits on NFSv4 as it depends on > the VFS to revalidate the dentry before using it on an open call. Ensure > that this occurs by forcing a revalidation of the target dentry of > LAST_BIND symlinks. This seems reasonable to me, and in a quick run through I can't see any problems. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > Signed-off-by: Jeff Layton <jlayton@redhat.com> > --- > fs/namei.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 45 insertions(+), 0 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index 3374917..339789e 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -414,6 +414,46 @@ do_revalidate(struct dentry *dentry, struct nameidata *nd) > } > > /* > + * force_reval_path - force revalidation of a dentry > + * > + * In some situations the path walking code will trust dentries without > + * revalidating them. This causes problems for filesystems that depend on > + * d_revalidate to handle file opens (e.g. NFSv4). When FS_REVAL_DOT is set > + * (which indicates that it's possible for the dentry to go stale), force > + * a d_revalidate call before proceeding. > + * > + * Returns 0 if the revalidation was successful. If the revalidation fails, > + * either return the error returned by d_revalidate or -ESTALE if the > + * revalidation it just returned 0. If d_revalidate returns 0, we attempt to > + * invalidate the dentry. It's up to the caller to handle putting references > + * to the path if necessary. > + */ > +static int > +force_reval_path(struct path *path, struct nameidata *nd) > +{ > + int status; > + struct dentry *dentry = path->dentry; > + > + /* > + * only check on filesystems where it's possible for the dentry to > + * become stale. It's assumed that if this flag is set then the > + * d_revalidate op will also be defined. > + */ > + if (!(dentry->d_sb->s_type->fs_flags & FS_REVAL_DOT)) > + return 0; > + > + status = dentry->d_op->d_revalidate(dentry, nd); > + if (status > 0) > + return 0; > + > + if (!status) { > + d_invalidate(dentry); > + status = -ESTALE; > + } > + return status; > +} > + > +/* > * Internal lookup() using the new generic dcache. > * SMP-safe > */ > @@ -641,6 +681,11 @@ static __always_inline int __do_follow_link(struct path *path, struct nameidata > error = 0; > if (s) > error = __vfs_follow_link(nd, s); > + else if (nd->last_type == LAST_BIND) { > + error = force_reval_path(&nd->path, nd); > + if (error) > + path_put(&nd->path); > + } > if (dentry->d_inode->i_op->put_link) > dentry->d_inode->i_op->put_link(dentry, nd, cookie); > } > -- > 1.5.5.6 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] vfs: force reval of target when following LAST_BIND symlinks 2009-12-02 19:59 ` [PATCH 1/2] vfs: force reval of target when following LAST_BIND symlinks Jeff Layton 2009-12-02 23:53 ` Eric W. Biederman @ 2009-12-03 10:39 ` Miklos Szeredi 1 sibling, 0 replies; 15+ messages in thread From: Miklos Szeredi @ 2009-12-03 10:39 UTC (permalink / raw) To: Jeff Layton Cc: linux-fsdevel, linux-kernel, jamie, pavel, miklos, viro, duaneg, ebiederm On Wed, 2 Dec 2009, Jeff Layton wrote: > procfs-style symlinks return a last_type of LAST_BIND without an actual > path string. This causes __follow_link to skip calling __vfs_follow_link > and so the dentry isn't revalidated. > > This is a problem when the link target sits on NFSv4 as it depends on > the VFS to revalidate the dentry before using it on an open call. Ensure > that this occurs by forcing a revalidation of the target dentry of > LAST_BIND symlinks. An added advantage is that the symlinks in /proc will show the invalidated path as "(deleted)". Maybe proc_pid_readlink() should also call force_reval_path() to get an up-to-date state of the path? Acked-by: Miklos Szeredi <mszeredi@suse.cz> > > Signed-off-by: Jeff Layton <jlayton@redhat.com> > --- > fs/namei.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 45 insertions(+), 0 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index 3374917..339789e 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -414,6 +414,46 @@ do_revalidate(struct dentry *dentry, struct nameidata *nd) > } > > /* > + * force_reval_path - force revalidation of a dentry > + * > + * In some situations the path walking code will trust dentries without > + * revalidating them. This causes problems for filesystems that depend on > + * d_revalidate to handle file opens (e.g. NFSv4). When FS_REVAL_DOT is set > + * (which indicates that it's possible for the dentry to go stale), force > + * a d_revalidate call before proceeding. > + * > + * Returns 0 if the revalidation was successful. If the revalidation fails, > + * either return the error returned by d_revalidate or -ESTALE if the > + * revalidation it just returned 0. If d_revalidate returns 0, we attempt to > + * invalidate the dentry. It's up to the caller to handle putting references > + * to the path if necessary. > + */ > +static int > +force_reval_path(struct path *path, struct nameidata *nd) > +{ > + int status; > + struct dentry *dentry = path->dentry; > + > + /* > + * only check on filesystems where it's possible for the dentry to > + * become stale. It's assumed that if this flag is set then the > + * d_revalidate op will also be defined. > + */ > + if (!(dentry->d_sb->s_type->fs_flags & FS_REVAL_DOT)) > + return 0; > + > + status = dentry->d_op->d_revalidate(dentry, nd); > + if (status > 0) > + return 0; > + > + if (!status) { > + d_invalidate(dentry); > + status = -ESTALE; > + } > + return status; > +} > + > +/* > * Internal lookup() using the new generic dcache. > * SMP-safe > */ > @@ -641,6 +681,11 @@ static __always_inline int __do_follow_link(struct path *path, struct nameidata > error = 0; > if (s) > error = __vfs_follow_link(nd, s); > + else if (nd->last_type == LAST_BIND) { > + error = force_reval_path(&nd->path, nd); > + if (error) > + path_put(&nd->path); > + } > if (dentry->d_inode->i_op->put_link) > dentry->d_inode->i_op->put_link(dentry, nd, cookie); > } > -- > 1.5.5.6 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-02 19:59 [PATCH 0/2] vfs: plug some holes involving LAST_BIND symlinks and file bind mounts (try #6) Jeff Layton 2009-12-02 19:59 ` [PATCH 1/2] vfs: force reval of target when following LAST_BIND symlinks Jeff Layton @ 2009-12-02 19:59 ` Jeff Layton 2009-12-03 0:01 ` Eric W. Biederman 2009-12-03 10:58 ` Miklos Szeredi 1 sibling, 2 replies; 15+ messages in thread From: Jeff Layton @ 2009-12-02 19:59 UTC (permalink / raw) To: linux-fsdevel, linux-kernel; +Cc: jamie, pavel, miklos, viro, duaneg, ebiederm In the case of a bind mounted file, the path walking code will assume that the cached dentry that was bind mounted is valid. This is a problem problem for NFSv4 in a way that's similar to LAST_BIND symlinks. Fix this by revalidating the dentry if FS_FOLLOW_DOT is set and __follow_mount returns true. Note that in the non-open codepath, we cannot return an error to the lookup if the revalidation fails. Doing so will leave a bind mount in a state such that we can't unmount it. In that case we'll just have to settle for d_invalidating it (which should mostly turn out to be a d_drop in this case) and returning success. Signed-off-by: Jeff Layton <jlayton@redhat.com> --- fs/namei.c | 11 ++++++++++- 1 files changed, 10 insertions(+), 1 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 339789e..0d55b6f 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -851,7 +851,13 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, done: path->mnt = mnt; path->dentry = dentry; - __follow_mount(path); + + /* + * cannot return the error returned by force_reval_path as that can + * make it impossible to unmount a bind mounted dentry if it's stale. + */ + if (__follow_mount(path)) + force_reval_path(path, nd); return 0; need_lookup: @@ -1840,6 +1846,9 @@ do_last: error = -ELOOP; if (flag & O_NOFOLLOW) goto exit_dput; + error = force_reval_path(&path, &nd); + if (error) + goto exit_dput; } error = -ENOENT; -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-02 19:59 ` [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems Jeff Layton @ 2009-12-03 0:01 ` Eric W. Biederman 2009-12-03 1:23 ` Jeff Layton 2009-12-03 10:58 ` Miklos Szeredi 1 sibling, 1 reply; 15+ messages in thread From: Eric W. Biederman @ 2009-12-03 0:01 UTC (permalink / raw) To: Jeff Layton Cc: linux-fsdevel, linux-kernel, jamie, pavel, miklos, viro, duaneg Jeff Layton <jlayton@redhat.com> writes: > In the case of a bind mounted file, the path walking code will assume > that the cached dentry that was bind mounted is valid. This is a problem > problem for NFSv4 in a way that's similar to LAST_BIND symlinks. > > Fix this by revalidating the dentry if FS_FOLLOW_DOT is set and > __follow_mount returns true. > > Note that in the non-open codepath, we cannot return an error to the > lookup if the revalidation fails. Doing so will leave a bind mount in > a state such that we can't unmount it. In that case we'll just have to > settle for d_invalidating it (which should mostly turn out to be a > d_drop in this case) and returning success. Looks reasonable to me. I wonder a little if we care about follow_mount as well as __follow_mount. This seems reasonable to me. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > Signed-off-by: Jeff Layton <jlayton@redhat.com> > --- > fs/namei.c | 11 ++++++++++- > 1 files changed, 10 insertions(+), 1 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index 339789e..0d55b6f 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -851,7 +851,13 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, > done: > path->mnt = mnt; > path->dentry = dentry; > - __follow_mount(path); > + > + /* > + * cannot return the error returned by force_reval_path as that can > + * make it impossible to unmount a bind mounted dentry if it's stale. > + */ > + if (__follow_mount(path)) > + force_reval_path(path, nd); > return 0; > > need_lookup: > @@ -1840,6 +1846,9 @@ do_last: > error = -ELOOP; > if (flag & O_NOFOLLOW) > goto exit_dput; > + error = force_reval_path(&path, &nd); > + if (error) > + goto exit_dput; > } > > error = -ENOENT; > -- > 1.5.5.6 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-03 0:01 ` Eric W. Biederman @ 2009-12-03 1:23 ` Jeff Layton 0 siblings, 0 replies; 15+ messages in thread From: Jeff Layton @ 2009-12-03 1:23 UTC (permalink / raw) To: Eric W. Biederman Cc: linux-fsdevel, linux-kernel, jamie, pavel, miklos, viro, duaneg On Wed, 02 Dec 2009 16:01:59 -0800 ebiederm@xmission.com (Eric W. Biederman) wrote: > Jeff Layton <jlayton@redhat.com> writes: > > > In the case of a bind mounted file, the path walking code will assume > > that the cached dentry that was bind mounted is valid. This is a problem > > problem for NFSv4 in a way that's similar to LAST_BIND symlinks. > > > > Fix this by revalidating the dentry if FS_FOLLOW_DOT is set and > > __follow_mount returns true. > > > > Note that in the non-open codepath, we cannot return an error to the > > lookup if the revalidation fails. Doing so will leave a bind mount in > > a state such that we can't unmount it. In that case we'll just have to > > settle for d_invalidating it (which should mostly turn out to be a > > d_drop in this case) and returning success. > > Looks reasonable to me. I wonder a little if we care about follow_mount > as well as __follow_mount. > Good question. It does sort of seem like it. There's also follow_down too... OTOH, we have a reproducible testcases that the two patches in this series fix. I'm not aware of anything that's technically "broken" by the lack of revalidations in the other codepaths so I'm hesitant to add them in unless and until we do. > This seems reasonable to me. > > Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > > > > Signed-off-by: Jeff Layton <jlayton@redhat.com> > > --- > > fs/namei.c | 11 ++++++++++- > > 1 files changed, 10 insertions(+), 1 deletions(-) > > > > diff --git a/fs/namei.c b/fs/namei.c > > index 339789e..0d55b6f 100644 > > --- a/fs/namei.c > > +++ b/fs/namei.c > > @@ -851,7 +851,13 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, > > done: > > path->mnt = mnt; > > path->dentry = dentry; > > - __follow_mount(path); > > + > > + /* > > + * cannot return the error returned by force_reval_path as that can > > + * make it impossible to unmount a bind mounted dentry if it's stale. > > + */ > > + if (__follow_mount(path)) > > + force_reval_path(path, nd); > > return 0; > > > > need_lookup: > > @@ -1840,6 +1846,9 @@ do_last: > > error = -ELOOP; > > if (flag & O_NOFOLLOW) > > goto exit_dput; > > + error = force_reval_path(&path, &nd); > > + if (error) > > + goto exit_dput; > > } > > > > error = -ENOENT; > > -- > > 1.5.5.6 -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-02 19:59 ` [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems Jeff Layton 2009-12-03 0:01 ` Eric W. Biederman @ 2009-12-03 10:58 ` Miklos Szeredi 2009-12-03 11:11 ` Eric W. Biederman 2009-12-03 15:20 ` Jeff Layton 1 sibling, 2 replies; 15+ messages in thread From: Miklos Szeredi @ 2009-12-03 10:58 UTC (permalink / raw) To: Jeff Layton Cc: linux-fsdevel, linux-kernel, jamie, pavel, miklos, viro, duaneg, ebiederm On Wed, 2 Dec 2009, Jeff Layton wrote: > In the case of a bind mounted file, the path walking code will assume > that the cached dentry that was bind mounted is valid. This is a problem > problem for NFSv4 in a way that's similar to LAST_BIND symlinks. > > Fix this by revalidating the dentry if FS_FOLLOW_DOT is set and > __follow_mount returns true. > > Note that in the non-open codepath, we cannot return an error to the > lookup if the revalidation fails. Doing so will leave a bind mount in > a state such that we can't unmount it. In that case we'll just have to > settle for d_invalidating it (which should mostly turn out to be a > d_drop in this case) and returning success. The only worry I have is that this adds an extra branch in a very hot codepath (do_lookup). An error can't be returned, as you note, and for bind mounted directories d_invalidate() will not succeed: the directory is busy, it's referenced by the mount. So basically the only thing this does is working around the NFSv4 issue. But Trond has a proper solution to that, and a temporary solution could be added to do_filp_open() rather than burdening do_lookup() with it, no? Thanks, Miklos > > Signed-off-by: Jeff Layton <jlayton@redhat.com> > --- > fs/namei.c | 11 ++++++++++- > 1 files changed, 10 insertions(+), 1 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index 339789e..0d55b6f 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -851,7 +851,13 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, > done: > path->mnt = mnt; > path->dentry = dentry; > - __follow_mount(path); > + > + /* > + * cannot return the error returned by force_reval_path as that can > + * make it impossible to unmount a bind mounted dentry if it's stale. > + */ > + if (__follow_mount(path)) > + force_reval_path(path, nd); > return 0; > > need_lookup: > @@ -1840,6 +1846,9 @@ do_last: > error = -ELOOP; > if (flag & O_NOFOLLOW) > goto exit_dput; > + error = force_reval_path(&path, &nd); > + if (error) > + goto exit_dput; > } > > error = -ENOENT; > -- > 1.5.5.6 > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-03 10:58 ` Miklos Szeredi @ 2009-12-03 11:11 ` Eric W. Biederman 2009-12-03 11:19 ` Miklos Szeredi 2009-12-03 15:20 ` Jeff Layton 1 sibling, 1 reply; 15+ messages in thread From: Eric W. Biederman @ 2009-12-03 11:11 UTC (permalink / raw) To: Miklos Szeredi Cc: Jeff Layton, linux-fsdevel, linux-kernel, jamie, pavel, viro, duaneg Miklos Szeredi <miklos@szeredi.hu> writes: > On Wed, 2 Dec 2009, Jeff Layton wrote: >> In the case of a bind mounted file, the path walking code will assume >> that the cached dentry that was bind mounted is valid. This is a problem >> problem for NFSv4 in a way that's similar to LAST_BIND symlinks. >> >> Fix this by revalidating the dentry if FS_FOLLOW_DOT is set and >> __follow_mount returns true. >> >> Note that in the non-open codepath, we cannot return an error to the >> lookup if the revalidation fails. Doing so will leave a bind mount in >> a state such that we can't unmount it. In that case we'll just have to >> settle for d_invalidating it (which should mostly turn out to be a >> d_drop in this case) and returning success. > > The only worry I have is that this adds an extra branch in a very hot > codepath (do_lookup). An error can't be returned, as you note, and > for bind mounted directories d_invalidate() will not succeed: the > directory is busy, it's referenced by the mount. Not true. d_mountpoint is false, so d_invalidate can succeed. > So basically the > only thing this does is working around the NFSv4 issue. No, this should catch other cases where we have a dentry goes stale as well, and lets the distributed filesystem handle it. It is probably worth a benchmark to ease the concerns about the hotpath. I expect the cpu will predict the branch as unlikely and we won't see any difference. Eric ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-03 11:11 ` Eric W. Biederman @ 2009-12-03 11:19 ` Miklos Szeredi 2009-12-03 11:56 ` Eric W. Biederman 0 siblings, 1 reply; 15+ messages in thread From: Miklos Szeredi @ 2009-12-03 11:19 UTC (permalink / raw) To: Eric W. Biederman Cc: miklos, jlayton, linux-fsdevel, linux-kernel, jamie, pavel, viro, duaneg On Thu, 03 Dec 2009, Eric W. Biederman wrote: > > The only worry I have is that this adds an extra branch in a very hot > > codepath (do_lookup). An error can't be returned, as you note, and > > for bind mounted directories d_invalidate() will not succeed: the > > directory is busy, it's referenced by the mount. > > Not true. d_mountpoint is false, so d_invalidate can succeed. Have a look at the code. d_invalidate() doesn't check for a mountpoint, it checks the refcount. It needs to keep the directory dentry hashed if it's in any way reachable other than from the cache (file descriptor, cwd, mount, etc). Thanks, Miklos ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-03 11:19 ` Miklos Szeredi @ 2009-12-03 11:56 ` Eric W. Biederman 2009-12-03 15:21 ` Miklos Szeredi 0 siblings, 1 reply; 15+ messages in thread From: Eric W. Biederman @ 2009-12-03 11:56 UTC (permalink / raw) To: Miklos Szeredi Cc: jlayton, linux-fsdevel, linux-kernel, jamie, pavel, viro, duaneg Miklos Szeredi <miklos@szeredi.hu> writes: > On Thu, 03 Dec 2009, Eric W. Biederman wrote: >> > The only worry I have is that this adds an extra branch in a very hot >> > codepath (do_lookup). An error can't be returned, as you note, and >> > for bind mounted directories d_invalidate() will not succeed: the >> > directory is busy, it's referenced by the mount. >> >> Not true. d_mountpoint is false, so d_invalidate can succeed. > > Have a look at the code. d_invalidate() doesn't check for a > mountpoint, it checks the refcount. It needs to keep the directory > dentry hashed if it's in any way reachable other than from the cache > (file descriptor, cwd, mount, etc). Ah. I thought you were thinking about the mandatory have_submounts() check in dentry->d_op->d_revalidate(). I expect the generic d_invalidate will simply hit the: spin_lock(&dcache_lock); if (d_unhashed(dentry)) { spin_unlock(&dcache_lock); return 0; } After the distributed filesystem has called d_drop in dentry->d_op->d_revalidate (when appropriate. Eric ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-03 11:56 ` Eric W. Biederman @ 2009-12-03 15:21 ` Miklos Szeredi 0 siblings, 0 replies; 15+ messages in thread From: Miklos Szeredi @ 2009-12-03 15:21 UTC (permalink / raw) To: Eric W. Biederman Cc: miklos, jlayton, linux-fsdevel, linux-kernel, jamie, pavel, viro, duaneg On Thu, 03 Dec 2009, Eric W. Biederman wrote: > Ah. I thought you were thinking about the mandatory have_submounts() > check in dentry->d_op->d_revalidate(). > > I expect the generic d_invalidate will simply hit the: > spin_lock(&dcache_lock); > if (d_unhashed(dentry)) { > spin_unlock(&dcache_lock); > return 0; > } > > After the distributed filesystem has called d_drop in > dentry->d_op->d_revalidate (when appropriate. Ah, right, NFS's d_revalidate does do d_drop(). Okay, I have no problem with the patch, as long as path lookup performance doesn't show a regression. Thanks, Miklos ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-03 10:58 ` Miklos Szeredi 2009-12-03 11:11 ` Eric W. Biederman @ 2009-12-03 15:20 ` Jeff Layton 2009-12-03 18:35 ` Eric W. Biederman 1 sibling, 1 reply; 15+ messages in thread From: Jeff Layton @ 2009-12-03 15:20 UTC (permalink / raw) To: Miklos Szeredi Cc: linux-fsdevel, linux-kernel, jamie, pavel, miklos, viro, duaneg, ebiederm, Trond Myklebust On Thu, 03 Dec 2009 11:58:43 +0100 Miklos Szeredi <miklos@szeredi.hu> wrote: > On Wed, 2 Dec 2009, Jeff Layton wrote: > > In the case of a bind mounted file, the path walking code will assume > > that the cached dentry that was bind mounted is valid. This is a problem > > problem for NFSv4 in a way that's similar to LAST_BIND symlinks. > > > > Fix this by revalidating the dentry if FS_FOLLOW_DOT is set and > > __follow_mount returns true. > > > > Note that in the non-open codepath, we cannot return an error to the > > lookup if the revalidation fails. Doing so will leave a bind mount in > > a state such that we can't unmount it. In that case we'll just have to > > settle for d_invalidating it (which should mostly turn out to be a > > d_drop in this case) and returning success. > > The only worry I have is that this adds an extra branch in a very hot > codepath (do_lookup). An error can't be returned, as you note, and > for bind mounted directories d_invalidate() will not succeed: the > directory is busy, it's referenced by the mount. So basically the > only thing this does is working around the NFSv4 issue. But Trond has > a proper solution to that, and a temporary solution could be added to > do_filp_open() rather than burdening do_lookup() with it, no? > (re-adding Trond. I forgot to cc him on this latest set) Self-NAK on this patch... That's my main worry too, and sadly it doesn't seem to be unfounded. This patch adds a lot of extra d_revalidate calls here. I think it's going to be too expensive to do this. Unfortunately, adding the code to do_filp_open alone doesn't really help. The path walking code in there is just for the O_CREAT case. If we're not creating the file, we call into path_lookup_open which eventually calls do_lookup. What we probably need to do is only make it d_revalidate when it looks like the final dentry is bind mounted. I'm not sure how to tell that though and even if I could, I'm still leery of adding this to such a hot codepath. The only problem I've identified that this fixes is with file bind mounts and I don't get the impression they're that common. Maybe the best thing is to just fix the LAST_BIND symlink case for now and wait for Trond or Al's overhaul of this code. -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-03 15:20 ` Jeff Layton @ 2009-12-03 18:35 ` Eric W. Biederman 2009-12-03 19:15 ` Jeff Layton 0 siblings, 1 reply; 15+ messages in thread From: Eric W. Biederman @ 2009-12-03 18:35 UTC (permalink / raw) To: Jeff Layton Cc: Miklos Szeredi, linux-fsdevel, linux-kernel, jamie, pavel, viro, duaneg, Trond Myklebust Jeff Layton <jlayton@redhat.com> writes: > On Thu, 03 Dec 2009 11:58:43 +0100 > Miklos Szeredi <miklos@szeredi.hu> wrote: > >> On Wed, 2 Dec 2009, Jeff Layton wrote: >> > In the case of a bind mounted file, the path walking code will assume >> > that the cached dentry that was bind mounted is valid. This is a problem >> > problem for NFSv4 in a way that's similar to LAST_BIND symlinks. >> > >> > Fix this by revalidating the dentry if FS_FOLLOW_DOT is set and >> > __follow_mount returns true. >> > >> > Note that in the non-open codepath, we cannot return an error to the >> > lookup if the revalidation fails. Doing so will leave a bind mount in >> > a state such that we can't unmount it. In that case we'll just have to >> > settle for d_invalidating it (which should mostly turn out to be a >> > d_drop in this case) and returning success. >> >> The only worry I have is that this adds an extra branch in a very hot >> codepath (do_lookup). An error can't be returned, as you note, and >> for bind mounted directories d_invalidate() will not succeed: the >> directory is busy, it's referenced by the mount. So basically the >> only thing this does is working around the NFSv4 issue. But Trond has >> a proper solution to that, and a temporary solution could be added to >> do_filp_open() rather than burdening do_lookup() with it, no? >> > > (re-adding Trond. I forgot to cc him on this latest set) > > Self-NAK on this patch... > > That's my main worry too, and sadly it doesn't seem to be unfounded. > This patch adds a lot of extra d_revalidate calls here. I think it's > going to be too expensive to do this. How so? We should only see extra calls if we follow a mount point. Currently we call d_revalidate on every path component. > The only problem I've identified that this fixes is with file bind > mounts and I don't get the impression they're that common. Maybe the > best thing is to just fix the LAST_BIND symlink case for now and wait > for Trond or Al's overhaul of this code. Well right now following mount points breaks the VFS contract that we will revalidate all dentries before we use them. That breaking of the contract breaks NFS. I don't know what else d_revalidate is good for. On the sysfs side I only use it to unhash the dentry. Something we don't care about from the do_lookup side of things if we have a bind mount. I'm not clear what kind of changes revalidating a deleted but open file will give you on NFS. Eric ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems 2009-12-03 18:35 ` Eric W. Biederman @ 2009-12-03 19:15 ` Jeff Layton 0 siblings, 0 replies; 15+ messages in thread From: Jeff Layton @ 2009-12-03 19:15 UTC (permalink / raw) To: Eric W. Biederman Cc: Miklos Szeredi, linux-fsdevel, linux-kernel, jamie, pavel, viro, duaneg, Trond Myklebust On Thu, 03 Dec 2009 10:35:21 -0800 ebiederm@xmission.com (Eric W. Biederman) wrote: > Jeff Layton <jlayton@redhat.com> writes: > > > On Thu, 03 Dec 2009 11:58:43 +0100 > > Miklos Szeredi <miklos@szeredi.hu> wrote: > > > >> On Wed, 2 Dec 2009, Jeff Layton wrote: > >> > In the case of a bind mounted file, the path walking code will assume > >> > that the cached dentry that was bind mounted is valid. This is a problem > >> > problem for NFSv4 in a way that's similar to LAST_BIND symlinks. > >> > > >> > Fix this by revalidating the dentry if FS_FOLLOW_DOT is set and > >> > __follow_mount returns true. > >> > > >> > Note that in the non-open codepath, we cannot return an error to the > >> > lookup if the revalidation fails. Doing so will leave a bind mount in > >> > a state such that we can't unmount it. In that case we'll just have to > >> > settle for d_invalidating it (which should mostly turn out to be a > >> > d_drop in this case) and returning success. > >> > >> The only worry I have is that this adds an extra branch in a very hot > >> codepath (do_lookup). An error can't be returned, as you note, and > >> for bind mounted directories d_invalidate() will not succeed: the > >> directory is busy, it's referenced by the mount. So basically the > >> only thing this does is working around the NFSv4 issue. But Trond has > >> a proper solution to that, and a temporary solution could be added to > >> do_filp_open() rather than burdening do_lookup() with it, no? > >> > > > > (re-adding Trond. I forgot to cc him on this latest set) > > > > Self-NAK on this patch... > > > > That's my main worry too, and sadly it doesn't seem to be unfounded. > > This patch adds a lot of extra d_revalidate calls here. I think it's > > going to be too expensive to do this. > > How so? We should only see extra calls if we follow a mount point. > Currently we call d_revalidate on every path component. > > > The only problem I've identified that this fixes is with file bind > > mounts and I don't get the impression they're that common. Maybe the > > best thing is to just fix the LAST_BIND symlink case for now and wait > > for Trond or Al's overhaul of this code. > > Well right now following mount points breaks the VFS contract that we > will revalidate all dentries before we use them. That breaking of the > contract breaks NFS. > > I don't know what else d_revalidate is good for. On the sysfs side > I only use it to unhash the dentry. Something we don't care about > from the do_lookup side of things if we have a bind mount. > > I'm not clear what kind of changes revalidating a deleted but open > file will give you on NFS. > My concern here is based mainly on a simple experiment I did to fire off a printk and do a dump_stack every time we call d_revalidate from force_reval_path. Even a very small set of operations caused that to get called many, many times, mostly from do_lookup. You're correct that the current behavior breaks the d_revalidate "contract". What I'm not sure of is whether that truly breaks anything beyond file bind mounts. If it doesn't then we have to ask ourselves -- is it worth a potential performance hit in a very hot codepath to fix bind mounted files that live on NFSv4? My current thinking is "no". Much of that decision is based upon an assumption that file bind mounts are rarely ever used. If that's the case, then it's probably more prudent to wait until the VFS has been fixed so that NFSv4 no longer needs to depend on d_revalidate this way. The LAST_BIND symlink case is a little more straightforward since the fix is more targeted. I think that should be reasonable for 2.6.33. Cheers, -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2009-12-03 19:15 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-12-02 19:59 [PATCH 0/2] vfs: plug some holes involving LAST_BIND symlinks and file bind mounts (try #6) Jeff Layton 2009-12-02 19:59 ` [PATCH 1/2] vfs: force reval of target when following LAST_BIND symlinks Jeff Layton 2009-12-02 23:53 ` Eric W. Biederman 2009-12-03 10:39 ` Miklos Szeredi 2009-12-02 19:59 ` [PATCH 2/2] vfs: force reval on dentry of bind mounted files on FS_REVAL_DOT filesystems Jeff Layton 2009-12-03 0:01 ` Eric W. Biederman 2009-12-03 1:23 ` Jeff Layton 2009-12-03 10:58 ` Miklos Szeredi 2009-12-03 11:11 ` Eric W. Biederman 2009-12-03 11:19 ` Miklos Szeredi 2009-12-03 11:56 ` Eric W. Biederman 2009-12-03 15:21 ` Miklos Szeredi 2009-12-03 15:20 ` Jeff Layton 2009-12-03 18:35 ` Eric W. Biederman 2009-12-03 19:15 ` Jeff Layton
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).