* [PATCH 0/7] VFS prep for union mounts/writable overlays @ 2009-12-23 23:36 Valerie Aurora 2009-12-23 23:36 ` [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info Valerie Aurora 0 siblings, 1 reply; 27+ messages in thread From: Valerie Aurora @ 2009-12-23 23:36 UTC (permalink / raw) To: linux-fsdevel; +Cc: Jan Blunck, Alexander Viro, Valerie Aurora Writable overlays/union mounts requires some generic VFS improvements and changes. The patches in this set are independent of union mounts and should merged immediately. Jan Blunck (6): autofs4: Save autofs trigger's vfsmount in super block info VFS: Make lookup_hash() return a struct path VFS: Make real_lookup() return a struct path VFS: Propagate mnt_flags into do_loopback VFS: BUG_ON() rehash of an already hashed dentry VFS: Remove unnecessary micro-optimization in cached_lookup() Valerie Aurora (1): VFS: Add read-only users count to superblock fs/autofs4/autofs_i.h | 1 + fs/autofs4/init.c | 11 +++- fs/autofs4/root.c | 6 ++ fs/dcache.c | 1 + fs/namei.c | 217 ++++++++++++++++++++++++++----------------------- fs/namespace.c | 7 +- fs/super.c | 18 ++++- include/linux/fs.h | 5 + 8 files changed, 158 insertions(+), 108 deletions(-) ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2009-12-23 23:36 [PATCH 0/7] VFS prep for union mounts/writable overlays Valerie Aurora @ 2009-12-23 23:36 ` Valerie Aurora 2009-12-23 23:36 ` [PATCH 2/7] VFS: Make lookup_hash() return a struct path Valerie Aurora 2010-01-02 0:44 ` [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info Ian Kent 0 siblings, 2 replies; 27+ messages in thread From: Valerie Aurora @ 2009-12-23 23:36 UTC (permalink / raw) To: linux-fsdevel Cc: Jan Blunck, Alexander Viro, Valerie Aurora, Ian Kent, autofs From: Jan Blunck <jblunck@suse.de> This is a bugfix/replacement for commit 051d381259eb57d6074d02a6ba6e90e744f1a29f: During a path walk if an autofs trigger is mounted on a dentry, when the follow_link method is called, the nameidata struct contains the vfsmount and mountpoint dentry of the parent mount while the dentry that is passed in is the root of the autofs trigger mount. I believe it is impossible to get the vfsmount of the trigger mount, within the follow_link method, when only the parent vfsmount and the root dentry of the trigger mount are known. The solution in this commit was to replace the path embedded in the parent's nameidata with the path of the link itself in __do_follow_link(). This is a relatively harmless misuse of the field, but union mounts ran into a bug during follow_link() caused by the nameidata containing the wrong path (we count on it being what it is all other places - the path of the parent). A cleaner and easier to understand solution is to save the necessary vfsmount in the autofs superblock info when it is mounted. Then we can easily update the vfsmount in autofs4_follow_link(). Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Valerie Aurora <vaurora@redhat.com> Cc: Ian Kent <raven@themaw.net> Cc: autofs@linux.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> --- fs/autofs4/autofs_i.h | 1 + fs/autofs4/init.c | 11 ++++++++++- fs/autofs4/root.c | 6 ++++++ fs/namei.c | 7 ++----- 4 files changed, 19 insertions(+), 6 deletions(-) diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index 8f7cdde..db2bfce 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -130,6 +130,7 @@ struct autofs_sb_info { int reghost_enabled; int needs_reghost; struct super_block *sb; + struct vfsmount *mnt; struct mutex wq_mutex; spinlock_t fs_lock; struct autofs_wait_queue *queues; /* Wait queue pointer */ diff --git a/fs/autofs4/init.c b/fs/autofs4/init.c index 9722e4b..5e0dcd7 100644 --- a/fs/autofs4/init.c +++ b/fs/autofs4/init.c @@ -17,7 +17,16 @@ static int autofs_get_sb(struct file_system_type *fs_type, int flags, const char *dev_name, void *data, struct vfsmount *mnt) { - return get_sb_nodev(fs_type, flags, data, autofs4_fill_super, mnt); + struct autofs_sb_info *sbi; + int ret; + + ret = get_sb_nodev(fs_type, flags, data, autofs4_fill_super, mnt); + if (ret) + return ret; + + sbi = autofs4_sbi(mnt->mnt_sb); + sbi->mnt = mnt; + return 0; } static struct file_system_type autofs_fs_type = { diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index b96a3c5..cb991b8 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -179,6 +179,12 @@ static void *autofs4_follow_link(struct dentry *dentry, struct nameidata *nd) DPRINTK("dentry=%p %.*s oz_mode=%d nd->flags=%d", dentry, dentry->d_name.len, dentry->d_name.name, oz_mode, nd->flags); + + dput(nd->path.dentry); + mntput(nd->path.mnt); + nd->path.mnt = mntget(sbi->mnt); + nd->path.dentry = dget(dentry); + /* * For an expire of a covered direct or offset mount we need * to break out of follow_down() at the autofs mount trigger diff --git a/fs/namei.c b/fs/namei.c index d11f404..c768444 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -629,11 +629,8 @@ static __always_inline int __do_follow_link(struct path *path, struct nameidata touch_atime(path->mnt, dentry); nd_set_link(nd, NULL); - if (path->mnt != nd->path.mnt) { - path_to_nameidata(path, nd); - dget(dentry); - } - mntget(path->mnt); + if (path->mnt == nd->path.mnt) + mntget(nd->path.mnt); cookie = dentry->d_inode->i_op->follow_link(dentry, nd); error = PTR_ERR(cookie); if (!IS_ERR(cookie)) { -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 2/7] VFS: Make lookup_hash() return a struct path 2009-12-23 23:36 ` [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info Valerie Aurora @ 2009-12-23 23:36 ` Valerie Aurora 2009-12-23 23:36 ` [PATCH 3/7] VFS: Make real_lookup() " Valerie Aurora 2010-01-02 0:44 ` [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info Ian Kent 1 sibling, 1 reply; 27+ messages in thread From: Valerie Aurora @ 2009-12-23 23:36 UTC (permalink / raw) To: linux-fsdevel; +Cc: Jan Blunck, Alexander Viro, Valerie Aurora From: Jan Blunck <jblunck@suse.de> This patch changes lookup_hash() into returning a struct path. Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Valerie Aurora <vaurora@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> --- fs/namei.c | 115 ++++++++++++++++++++++++++++++----------------------------- 1 files changed, 58 insertions(+), 57 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index c768444..0ddfefb 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1169,7 +1169,7 @@ static int path_lookup_open(int dfd, const char *name, } static struct dentry *__lookup_hash(struct qstr *name, - struct dentry *base, struct nameidata *nd) + struct dentry *base, struct nameidata *nd) { struct dentry *dentry; struct inode *inode; @@ -1216,14 +1216,22 @@ out: * needs parent already locked. Doesn't follow mounts. * SMP-safe. */ -static struct dentry *lookup_hash(struct nameidata *nd) +static int lookup_hash(struct nameidata *nd, struct qstr *name, + struct path *path) { int err; err = inode_permission(nd->path.dentry->d_inode, MAY_EXEC); if (err) - return ERR_PTR(err); - return __lookup_hash(&nd->last, nd->path.dentry, nd); + return err; + path->mnt = nd->path.mnt; + path->dentry = __lookup_hash(name, nd->path.dentry, nd); + if (IS_ERR(path->dentry)) { + err = PTR_ERR(path->dentry); + path->dentry = NULL; + path->mnt = NULL; + } + return err; } static int __lookup_one_len(const char *name, struct qstr *this, @@ -1735,12 +1743,10 @@ struct file *do_filp_open(int dfd, const char *pathname, if (flag & O_EXCL) nd.flags |= LOOKUP_EXCL; mutex_lock(&dir->d_inode->i_mutex); - path.dentry = lookup_hash(&nd); - path.mnt = nd.path.mnt; + error = lookup_hash(&nd, &nd.last, &path); do_last: - error = PTR_ERR(path.dentry); - if (IS_ERR(path.dentry)) { + if (error) { mutex_unlock(&dir->d_inode->i_mutex); goto exit; } @@ -1901,8 +1907,7 @@ do_link: } dir = nd.path.dentry; mutex_lock(&dir->d_inode->i_mutex); - path.dentry = lookup_hash(&nd); - path.mnt = nd.path.mnt; + error = lookup_hash(&nd, &nd.last, &path); __putname(nd.last.name); goto do_last; } @@ -1936,7 +1941,8 @@ EXPORT_SYMBOL(filp_open); */ struct dentry *lookup_create(struct nameidata *nd, int is_dir) { - struct dentry *dentry = ERR_PTR(-EEXIST); + struct path path; + int err; mutex_lock_nested(&nd->path.dentry->d_inode->i_mutex, I_MUTEX_PARENT); /* @@ -1944,7 +1950,7 @@ struct dentry *lookup_create(struct nameidata *nd, int is_dir) * (foo/., foo/.., /////) */ if (nd->last_type != LAST_NORM) - goto fail; + return ERR_PTR(-EEXIST); nd->flags &= ~LOOKUP_PARENT; nd->flags |= LOOKUP_CREATE | LOOKUP_EXCL; nd->intent.open.flags = O_EXCL; @@ -1952,11 +1958,11 @@ struct dentry *lookup_create(struct nameidata *nd, int is_dir) /* * Do the final lookup. */ - dentry = lookup_hash(nd); - if (IS_ERR(dentry)) - goto fail; + err = lookup_hash(nd, &nd->last, &path); + if (err) + return ERR_PTR(err); - if (dentry->d_inode) + if (path.dentry->d_inode) goto eexist; /* * Special case - lookup gave negative, but... we had foo/bar/ @@ -1965,15 +1971,14 @@ struct dentry *lookup_create(struct nameidata *nd, int is_dir) * been asking for (non-existent) directory. -ENOENT for you. */ if (unlikely(!is_dir && nd->last.name[nd->last.len])) { - dput(dentry); - dentry = ERR_PTR(-ENOENT); + dput(path.dentry); + return ERR_PTR(-ENOENT); } - return dentry; + + return path.dentry; eexist: - dput(dentry); - dentry = ERR_PTR(-EEXIST); -fail: - return dentry; + path_put_conditional(&path, nd); + return ERR_PTR(-EEXIST); } EXPORT_SYMBOL_GPL(lookup_create); @@ -2210,7 +2215,7 @@ static long do_rmdir(int dfd, const char __user *pathname) { int error = 0; char * name; - struct dentry *dentry; + struct path path; struct nameidata nd; error = user_path_parent(dfd, pathname, &nd, &name); @@ -2232,21 +2237,20 @@ static long do_rmdir(int dfd, const char __user *pathname) nd.flags &= ~LOOKUP_PARENT; mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT); - dentry = lookup_hash(&nd); - error = PTR_ERR(dentry); - if (IS_ERR(dentry)) + error = lookup_hash(&nd, &nd.last, &path); + if (error) goto exit2; error = mnt_want_write(nd.path.mnt); if (error) goto exit3; - error = security_path_rmdir(&nd.path, dentry); + error = security_path_rmdir(&nd.path, path.dentry); if (error) goto exit4; - error = vfs_rmdir(nd.path.dentry->d_inode, dentry); + error = vfs_rmdir(nd.path.dentry->d_inode, path.dentry); exit4: mnt_drop_write(nd.path.mnt); exit3: - dput(dentry); + path_put_conditional(&path, &nd); exit2: mutex_unlock(&nd.path.dentry->d_inode->i_mutex); exit1: @@ -2301,7 +2305,7 @@ static long do_unlinkat(int dfd, const char __user *pathname) { int error; char *name; - struct dentry *dentry; + struct path path; struct nameidata nd; struct inode *inode = NULL; @@ -2316,26 +2320,25 @@ static long do_unlinkat(int dfd, const char __user *pathname) nd.flags &= ~LOOKUP_PARENT; mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT); - dentry = lookup_hash(&nd); - error = PTR_ERR(dentry); - if (!IS_ERR(dentry)) { + error = lookup_hash(&nd, &nd.last, &path); + if (!error) { /* Why not before? Because we want correct error value */ if (nd.last.name[nd.last.len]) goto slashes; - inode = dentry->d_inode; + inode = path.dentry->d_inode; if (inode) atomic_inc(&inode->i_count); error = mnt_want_write(nd.path.mnt); if (error) goto exit2; - error = security_path_unlink(&nd.path, dentry); + error = security_path_unlink(&nd.path, path.dentry); if (error) goto exit3; - error = vfs_unlink(nd.path.dentry->d_inode, dentry); + error = vfs_unlink(nd.path.dentry->d_inode, path.dentry); exit3: mnt_drop_write(nd.path.mnt); exit2: - dput(dentry); + path_put_conditional(&path, &nd); } mutex_unlock(&nd.path.dentry->d_inode->i_mutex); if (inode) @@ -2346,8 +2349,8 @@ exit1: return error; slashes: - error = !dentry->d_inode ? -ENOENT : - S_ISDIR(dentry->d_inode->i_mode) ? -EISDIR : -ENOTDIR; + error = !path.dentry->d_inode ? -ENOENT : + S_ISDIR(path.dentry->d_inode->i_mode) ? -EISDIR : -ENOTDIR; goto exit2; } @@ -2687,7 +2690,7 @@ SYSCALL_DEFINE4(renameat, int, olddfd, const char __user *, oldname, int, newdfd, const char __user *, newname) { struct dentry *old_dir, *new_dir; - struct dentry *old_dentry, *new_dentry; + struct path old, new; struct dentry *trap; struct nameidata oldnd, newnd; char *from; @@ -2721,16 +2724,15 @@ SYSCALL_DEFINE4(renameat, int, olddfd, const char __user *, oldname, trap = lock_rename(new_dir, old_dir); - old_dentry = lookup_hash(&oldnd); - error = PTR_ERR(old_dentry); - if (IS_ERR(old_dentry)) + error = lookup_hash(&oldnd, &oldnd.last, &old); + if (error) goto exit3; /* source must exist */ error = -ENOENT; - if (!old_dentry->d_inode) + if (!old.dentry->d_inode) goto exit4; /* unless the source is a directory trailing slashes give -ENOTDIR */ - if (!S_ISDIR(old_dentry->d_inode->i_mode)) { + if (!S_ISDIR(old.dentry->d_inode->i_mode)) { error = -ENOTDIR; if (oldnd.last.name[oldnd.last.len]) goto exit4; @@ -2739,32 +2741,31 @@ SYSCALL_DEFINE4(renameat, int, olddfd, const char __user *, oldname, } /* source should not be ancestor of target */ error = -EINVAL; - if (old_dentry == trap) + if (old.dentry == trap) goto exit4; - new_dentry = lookup_hash(&newnd); - error = PTR_ERR(new_dentry); - if (IS_ERR(new_dentry)) + error = lookup_hash(&newnd, &newnd.last, &new); + if (error) goto exit4; /* target should not be an ancestor of source */ error = -ENOTEMPTY; - if (new_dentry == trap) + if (new.dentry == trap) goto exit5; error = mnt_want_write(oldnd.path.mnt); if (error) goto exit5; - error = security_path_rename(&oldnd.path, old_dentry, - &newnd.path, new_dentry); + error = security_path_rename(&oldnd.path, old.dentry, + &newnd.path, new.dentry); if (error) goto exit6; - error = vfs_rename(old_dir->d_inode, old_dentry, - new_dir->d_inode, new_dentry); + error = vfs_rename(old_dir->d_inode, old.dentry, + new_dir->d_inode, new.dentry); exit6: mnt_drop_write(oldnd.path.mnt); exit5: - dput(new_dentry); + path_put_conditional(&new, &newnd); exit4: - dput(old_dentry); + path_put_conditional(&old, &oldnd); exit3: unlock_rename(new_dir, old_dir); exit2: -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 3/7] VFS: Make real_lookup() return a struct path 2009-12-23 23:36 ` [PATCH 2/7] VFS: Make lookup_hash() return a struct path Valerie Aurora @ 2009-12-23 23:36 ` Valerie Aurora 2009-12-23 23:37 ` [PATCH 4/7] VFS: Propagate mnt_flags into do_loopback Valerie Aurora 0 siblings, 1 reply; 27+ messages in thread From: Valerie Aurora @ 2009-12-23 23:36 UTC (permalink / raw) To: linux-fsdevel; +Cc: Jan Blunck, Alexander Viro, Valerie Aurora From: Jan Blunck <jblunck@suse.de> This patch changes real_lookup() into returning a struct path. Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Valerie Aurora <vaurora@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> --- fs/namei.c | 82 +++++++++++++++++++++++++++++++++++++---------------------- 1 files changed, 51 insertions(+), 31 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 0ddfefb..3f645ce 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -473,10 +473,11 @@ ok: * make sure that nobody added the entry to the dcache in the meantime.. * SMP-safe */ -static struct dentry * real_lookup(struct dentry * parent, struct qstr * name, struct nameidata *nd) +static int real_lookup(struct nameidata *nd, struct qstr *name, + struct path *path) { - struct dentry * result; - struct inode *dir = parent->d_inode; + struct inode *dir = nd->path.dentry->d_inode; + int res = 0; mutex_lock(&dir->i_mutex); /* @@ -493,27 +494,36 @@ static struct dentry * real_lookup(struct dentry * parent, struct qstr * name, s * * so doing d_lookup() (with seqlock), instead of lockfree __d_lookup */ - result = d_lookup(parent, name); - if (!result) { + path->dentry = d_lookup(nd->path.dentry, name); + path->mnt = nd->path.mnt; + if (!path->dentry) { struct dentry *dentry; /* Don't create child dentry for a dead directory. */ - result = ERR_PTR(-ENOENT); - if (IS_DEADDIR(dir)) + if (IS_DEADDIR(dir)) { + res = -ENOENT; goto out_unlock; + } - dentry = d_alloc(parent, name); - result = ERR_PTR(-ENOMEM); + dentry = d_alloc(nd->path.dentry, name); if (dentry) { - result = dir->i_op->lookup(dir, dentry, nd); - if (result) + path->dentry = dir->i_op->lookup(dir, dentry, nd); + if (path->dentry) { dput(dentry); - else - result = dentry; + if (IS_ERR(path->dentry)) { + res = PTR_ERR(path->dentry); + path->dentry = NULL; + path->mnt = NULL; + } + } else + path->dentry = dentry; + } else { + res = -ENOMEM; + path->mnt = NULL; } out_unlock: mutex_unlock(&dir->i_mutex); - return result; + return res; } /* @@ -521,12 +531,20 @@ out_unlock: * we waited on the semaphore. Need to revalidate. */ mutex_unlock(&dir->i_mutex); - if (result->d_op && result->d_op->d_revalidate) { - result = do_revalidate(result, nd); - if (!result) - result = ERR_PTR(-ENOENT); + if (path->dentry->d_op && path->dentry->d_op->d_revalidate) { + path->dentry = do_revalidate(path->dentry, nd); + if (!path->dentry) { + res = -ENOENT; + path->mnt = NULL; + } + if (IS_ERR(path->dentry)) { + res = PTR_ERR(path->dentry); + path->dentry = NULL; + path->mnt = NULL; + } } - return result; + + return res; } /* @@ -793,35 +811,37 @@ static __always_inline void follow_dotdot(struct nameidata *nd) static int do_lookup(struct nameidata *nd, struct qstr *name, struct path *path) { - struct vfsmount *mnt = nd->path.mnt; - struct dentry *dentry = __d_lookup(nd->path.dentry, name); + int err; - if (!dentry) + path->dentry = __d_lookup(nd->path.dentry, name); + path->mnt = nd->path.mnt; + if (!path->dentry) goto need_lookup; - if (dentry->d_op && dentry->d_op->d_revalidate) + if (path->dentry->d_op && path->dentry->d_op->d_revalidate) goto need_revalidate; + done: - path->mnt = mnt; - path->dentry = dentry; __follow_mount(path); return 0; need_lookup: - dentry = real_lookup(nd->path.dentry, name, nd); - if (IS_ERR(dentry)) + err = real_lookup(nd, name, path); + if (err) goto fail; goto done; need_revalidate: - dentry = do_revalidate(dentry, nd); - if (!dentry) + path->dentry = do_revalidate(path->dentry, nd); + if (!path->dentry) goto need_lookup; - if (IS_ERR(dentry)) + if (IS_ERR(path->dentry)) { + err = PTR_ERR(path->dentry); goto fail; + } goto done; fail: - return PTR_ERR(dentry); + return err; } /* -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 4/7] VFS: Propagate mnt_flags into do_loopback 2009-12-23 23:36 ` [PATCH 3/7] VFS: Make real_lookup() " Valerie Aurora @ 2009-12-23 23:37 ` Valerie Aurora 2009-12-23 23:37 ` [PATCH 5/7] VFS: Add read-only users count to superblock Valerie Aurora 0 siblings, 1 reply; 27+ messages in thread From: Valerie Aurora @ 2009-12-23 23:37 UTC (permalink / raw) To: linux-fsdevel; +Cc: Jan Blunck, Alexander Viro, Valerie Aurora From: Jan Blunck <jblunck@suse.de> The mnt_flags are propagated into do_loopback(), so that they can be checked when mounting something loopback into a union. Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Valerie Aurora <vaurora@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> --- fs/namespace.c | 7 ++++--- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index bdc3cb4..2244801 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1446,8 +1446,8 @@ static int do_change_type(struct path *path, int flag) /* * do loopback mount. */ -static int do_loopback(struct path *path, char *old_name, - int recurse) +static int do_loopback(struct path *path, char *old_name, int recurse, + int mnt_flags) { struct path old_path; struct vfsmount *mnt = NULL; @@ -1959,7 +1959,8 @@ long do_mount(char *dev_name, char *dir_name, char *type_page, retval = do_remount(&path, flags & ~MS_REMOUNT, mnt_flags, data_page); else if (flags & MS_BIND) - retval = do_loopback(&path, dev_name, flags & MS_REC); + retval = do_loopback(&path, dev_name, flags & MS_REC, + mnt_flags); else if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE)) retval = do_change_type(&path, flags); else if (flags & MS_MOVE) -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 5/7] VFS: Add read-only users count to superblock 2009-12-23 23:37 ` [PATCH 4/7] VFS: Propagate mnt_flags into do_loopback Valerie Aurora @ 2009-12-23 23:37 ` Valerie Aurora 2009-12-23 23:37 ` [PATCH 6/7] VFS: BUG_ON() rehash of an already hashed dentry Valerie Aurora 0 siblings, 1 reply; 27+ messages in thread From: Valerie Aurora @ 2009-12-23 23:37 UTC (permalink / raw) To: linux-fsdevel; +Cc: Jan Blunck, Alexander Viro, Valerie Aurora While we can check if a file system is currently read-only, we can't guarantee that it will stay read-only. The file system can be remounted read-write at any time; it's also conceivable that a file system can be mounted a second time and converted to read-write if the underlying fs allows it. This is a problem for union mounts, which require the underlying file system be read-only. Add a read-only users count and don't allow remounts to change the file system to read-write or read-write mounts if there are any read-only users. Signed-off-by: Valerie Aurora <vaurora@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> --- fs/super.c | 18 ++++++++++++++++-- include/linux/fs.h | 5 +++++ 2 files changed, 21 insertions(+), 2 deletions(-) diff --git a/fs/super.c b/fs/super.c index 19eb70b..8a12bab 100644 --- a/fs/super.c +++ b/fs/super.c @@ -583,8 +583,10 @@ int do_remount_sb(struct super_block *sb, int flags, void *data, int force) shrink_dcache_sb(sb); sync_filesystem(sb); - /* If we are remounting RDONLY and current sb is read/write, - make sure there are no rw files opened */ + /* + * If we are remounting RDONLY and current sb is read/write, + * make sure there are no rw files opened. + */ if ((flags & MS_RDONLY) && !(sb->s_flags & MS_RDONLY)) { if (force) mark_files_ro(sb); @@ -596,6 +598,14 @@ int do_remount_sb(struct super_block *sb, int flags, void *data, int force) } remount_rw = !(flags & MS_RDONLY) && (sb->s_flags & MS_RDONLY); + /* + * If we are remounting read/write, deny write access if the + * file system is being used by anything that requires it to + * stay read-only (e.g., union mounts). + */ + if (remount_rw && sb->s_readonly_users) + return -EROFS; + if (sb->s_op->remount_fs) { retval = sb->s_op->remount_fs(sb, &flags, data); if (retval) @@ -953,6 +963,10 @@ vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void WARN((mnt->mnt_sb->s_maxbytes < 0), "%s set sb->s_maxbytes to " "negative value (%lld)\n", type->name, mnt->mnt_sb->s_maxbytes); + error = -EROFS; + if (mnt->mnt_sb->s_readonly_users && !(flags & MS_RDONLY)) + goto out_sb; + mnt->mnt_mountpoint = mnt->mnt_root; mnt->mnt_parent = mnt; up_write(&mnt->mnt_sb->s_umount); diff --git a/include/linux/fs.h b/include/linux/fs.h index 2620a8c..4070eac 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1380,6 +1380,11 @@ struct super_block { * generic_show_options() */ char *s_options; + + /* + * Users who require read-only access - e.g., union mounts + */ + int s_readonly_users; }; extern struct timespec current_fs_time(struct super_block *sb); -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 6/7] VFS: BUG_ON() rehash of an already hashed dentry 2009-12-23 23:37 ` [PATCH 5/7] VFS: Add read-only users count to superblock Valerie Aurora @ 2009-12-23 23:37 ` Valerie Aurora 2009-12-23 23:37 ` [PATCH 7/7] VFS: Remove unnecessary micro-optimization in cached_lookup() Valerie Aurora 0 siblings, 1 reply; 27+ messages in thread From: Valerie Aurora @ 2009-12-23 23:37 UTC (permalink / raw) To: linux-fsdevel; +Cc: Jan Blunck, Alexander Viro, Valerie Aurora From: Jan Blunck <jblunck@suse.de> BUG_ON() rehash of an already hashed dentry. For debugging of dcache-related development. Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Valerie Aurora <vaurora@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> --- fs/dcache.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index a100fa3..74d4ca9 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1551,6 +1551,7 @@ void d_rehash(struct dentry * entry) { spin_lock(&dcache_lock); spin_lock(&entry->d_lock); + BUG_ON(!d_unhashed(entry)); _d_rehash(entry); spin_unlock(&entry->d_lock); spin_unlock(&dcache_lock); -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 7/7] VFS: Remove unnecessary micro-optimization in cached_lookup() 2009-12-23 23:37 ` [PATCH 6/7] VFS: BUG_ON() rehash of an already hashed dentry Valerie Aurora @ 2009-12-23 23:37 ` Valerie Aurora 0 siblings, 0 replies; 27+ messages in thread From: Valerie Aurora @ 2009-12-23 23:37 UTC (permalink / raw) To: linux-fsdevel; +Cc: Jan Blunck, Alexander Viro, Valerie Aurora From: Jan Blunck <jblunck@suse.de> d_lookup() takes rename_lock which is a seq_lock. This is so cheap it's not worth calling lockless __d_lookup() first from cache_lookup(). Rename cached_lookup() to cache_lookup() while we're there. Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Valerie Aurora <vaurora@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> --- fs/namei.c | 13 ++++--------- 1 files changed, 4 insertions(+), 9 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 3f645ce..ae200ba 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -417,15 +417,10 @@ do_revalidate(struct dentry *dentry, struct nameidata *nd) * Internal lookup() using the new generic dcache. * SMP-safe */ -static struct dentry * cached_lookup(struct dentry * parent, struct qstr * name, struct nameidata *nd) +static struct dentry *cache_lookup(struct dentry *parent, struct qstr *name, + struct nameidata *nd) { - struct dentry * dentry = __d_lookup(parent, name); - - /* lockess __d_lookup may fail due to concurrent d_move() - * in some unrelated directory, so try with d_lookup - */ - if (!dentry) - dentry = d_lookup(parent, name); + struct dentry *dentry = d_lookup(parent, name); if (dentry && dentry->d_op && dentry->d_op->d_revalidate) dentry = do_revalidate(dentry, nd); @@ -1208,7 +1203,7 @@ static struct dentry *__lookup_hash(struct qstr *name, goto out; } - dentry = cached_lookup(base, name, nd); + dentry = cache_lookup(base, name, nd); if (!dentry) { struct dentry *new; -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2009-12-23 23:36 ` [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info Valerie Aurora 2009-12-23 23:36 ` [PATCH 2/7] VFS: Make lookup_hash() return a struct path Valerie Aurora @ 2010-01-02 0:44 ` Ian Kent 2010-01-14 5:43 ` Al Viro 1 sibling, 1 reply; 27+ messages in thread From: Ian Kent @ 2010-01-02 0:44 UTC (permalink / raw) To: Valerie Aurora; +Cc: linux-fsdevel, Jan Blunck, Alexander Viro, autofs On Wed, Dec 23, 2009 at 03:36:57PM -0800, Valerie Aurora wrote: > From: Jan Blunck <jblunck@suse.de> > > This is a bugfix/replacement for commit > 051d381259eb57d6074d02a6ba6e90e744f1a29f: > > During a path walk if an autofs trigger is mounted on a dentry, > when the follow_link method is called, the nameidata struct > contains the vfsmount and mountpoint dentry of the parent mount > while the dentry that is passed in is the root of the autofs > trigger mount. I believe it is impossible to get the vfsmount of > the trigger mount, within the follow_link method, when only the > parent vfsmount and the root dentry of the trigger mount are > known. > > The solution in this commit was to replace the path embedded in the > parent's nameidata with the path of the link itself in > __do_follow_link(). This is a relatively harmless misuse of the > field, but union mounts ran into a bug during follow_link() caused by > the nameidata containing the wrong path (we count on it being what it > is all other places - the path of the parent). > > A cleaner and easier to understand solution is to save the necessary > vfsmount in the autofs superblock info when it is mounted. Then we > can easily update the vfsmount in autofs4_follow_link(). > > Signed-off-by: Jan Blunck <jblunck@suse.de> > Signed-off-by: Valerie Aurora <vaurora@redhat.com> Acked-by: <raven@themaw.net> Don't know how I missed such an obvious solution when I did this. Thanks, Ian > Cc: Ian Kent <raven@themaw.net> > Cc: autofs@linux.kernel.org > Cc: Alexander Viro <viro@zeniv.linux.org.uk> > --- > fs/autofs4/autofs_i.h | 1 + > fs/autofs4/init.c | 11 ++++++++++- > fs/autofs4/root.c | 6 ++++++ > fs/namei.c | 7 ++----- > 4 files changed, 19 insertions(+), 6 deletions(-) > > diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h > index 8f7cdde..db2bfce 100644 > --- a/fs/autofs4/autofs_i.h > +++ b/fs/autofs4/autofs_i.h > @@ -130,6 +130,7 @@ struct autofs_sb_info { > int reghost_enabled; > int needs_reghost; > struct super_block *sb; > + struct vfsmount *mnt; > struct mutex wq_mutex; > spinlock_t fs_lock; > struct autofs_wait_queue *queues; /* Wait queue pointer */ > diff --git a/fs/autofs4/init.c b/fs/autofs4/init.c > index 9722e4b..5e0dcd7 100644 > --- a/fs/autofs4/init.c > +++ b/fs/autofs4/init.c > @@ -17,7 +17,16 @@ > static int autofs_get_sb(struct file_system_type *fs_type, > int flags, const char *dev_name, void *data, struct vfsmount *mnt) > { > - return get_sb_nodev(fs_type, flags, data, autofs4_fill_super, mnt); > + struct autofs_sb_info *sbi; > + int ret; > + > + ret = get_sb_nodev(fs_type, flags, data, autofs4_fill_super, mnt); > + if (ret) > + return ret; > + > + sbi = autofs4_sbi(mnt->mnt_sb); > + sbi->mnt = mnt; > + return 0; > } > > static struct file_system_type autofs_fs_type = { > diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c > index b96a3c5..cb991b8 100644 > --- a/fs/autofs4/root.c > +++ b/fs/autofs4/root.c > @@ -179,6 +179,12 @@ static void *autofs4_follow_link(struct dentry *dentry, struct nameidata *nd) > DPRINTK("dentry=%p %.*s oz_mode=%d nd->flags=%d", > dentry, dentry->d_name.len, dentry->d_name.name, oz_mode, > nd->flags); > + > + dput(nd->path.dentry); > + mntput(nd->path.mnt); > + nd->path.mnt = mntget(sbi->mnt); > + nd->path.dentry = dget(dentry); > + > /* > * For an expire of a covered direct or offset mount we need > * to break out of follow_down() at the autofs mount trigger > diff --git a/fs/namei.c b/fs/namei.c > index d11f404..c768444 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -629,11 +629,8 @@ static __always_inline int __do_follow_link(struct path *path, struct nameidata > touch_atime(path->mnt, dentry); > nd_set_link(nd, NULL); > > - if (path->mnt != nd->path.mnt) { > - path_to_nameidata(path, nd); > - dget(dentry); > - } > - mntget(path->mnt); > + if (path->mnt == nd->path.mnt) > + mntget(nd->path.mnt); > cookie = dentry->d_inode->i_op->follow_link(dentry, nd); > error = PTR_ERR(cookie); > if (!IS_ERR(cookie)) { > -- > 1.5.6.5 > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-02 0:44 ` [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info Ian Kent @ 2010-01-14 5:43 ` Al Viro 2010-01-14 19:18 ` Valerie Aurora 0 siblings, 1 reply; 27+ messages in thread From: Al Viro @ 2010-01-14 5:43 UTC (permalink / raw) To: Ian Kent; +Cc: Valerie Aurora, linux-fsdevel, Jan Blunck, autofs On Sat, Jan 02, 2010 at 08:44:25AM +0800, Ian Kent wrote: > On Wed, Dec 23, 2009 at 03:36:57PM -0800, Valerie Aurora wrote: > > From: Jan Blunck <jblunck@suse.de> > > > > This is a bugfix/replacement for commit > > 051d381259eb57d6074d02a6ba6e90e744f1a29f: > > > > During a path walk if an autofs trigger is mounted on a dentry, > > when the follow_link method is called, the nameidata struct > > contains the vfsmount and mountpoint dentry of the parent mount > > while the dentry that is passed in is the root of the autofs > > trigger mount. I believe it is impossible to get the vfsmount of > > the trigger mount, within the follow_link method, when only the > > parent vfsmount and the root dentry of the trigger mount are > > known. > > > > The solution in this commit was to replace the path embedded in the > > parent's nameidata with the path of the link itself in > > __do_follow_link(). This is a relatively harmless misuse of the > > field, but union mounts ran into a bug during follow_link() caused by > > the nameidata containing the wrong path (we count on it being what it > > is all other places - the path of the parent). > > > > A cleaner and easier to understand solution is to save the necessary > > vfsmount in the autofs superblock info when it is mounted. Then we > > can easily update the vfsmount in autofs4_follow_link(). > > > > Signed-off-by: Jan Blunck <jblunck@suse.de> > > Signed-off-by: Valerie Aurora <vaurora@redhat.com> > Acked-by: <raven@themaw.net> > > Don't know how I missed such an obvious solution when I did this. > Thanks, Ian TBH, I don't like either variant (both the in-tree one and that). The reason why vfsmount does *NOT* belong in superblock, TYVM: you've messed the lifetime rules. You can't pin it down, or the damn thing will be impossible to kill. OTOH, you have no promise whatsoever that superblock won't outlive the initial vfsmount. You might get another vfsmount over the same thing and once the original one is gone... So this is simply broken. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-14 5:43 ` Al Viro @ 2010-01-14 19:18 ` Valerie Aurora 2010-01-15 6:05 ` Ian Kent 0 siblings, 1 reply; 27+ messages in thread From: Valerie Aurora @ 2010-01-14 19:18 UTC (permalink / raw) To: Al Viro; +Cc: Ian Kent, linux-fsdevel, Jan Blunck, autofs On Thu, Jan 14, 2010 at 05:43:22AM +0000, Al Viro wrote: > On Sat, Jan 02, 2010 at 08:44:25AM +0800, Ian Kent wrote: > > On Wed, Dec 23, 2009 at 03:36:57PM -0800, Valerie Aurora wrote: > > > From: Jan Blunck <jblunck@suse.de> > > > > > > This is a bugfix/replacement for commit > > > 051d381259eb57d6074d02a6ba6e90e744f1a29f: > > > > > > During a path walk if an autofs trigger is mounted on a dentry, > > > when the follow_link method is called, the nameidata struct > > > contains the vfsmount and mountpoint dentry of the parent mount > > > while the dentry that is passed in is the root of the autofs > > > trigger mount. I believe it is impossible to get the vfsmount of > > > the trigger mount, within the follow_link method, when only the > > > parent vfsmount and the root dentry of the trigger mount are > > > known. > > > > > > The solution in this commit was to replace the path embedded in the > > > parent's nameidata with the path of the link itself in > > > __do_follow_link(). This is a relatively harmless misuse of the > > > field, but union mounts ran into a bug during follow_link() caused by > > > the nameidata containing the wrong path (we count on it being what it > > > is all other places - the path of the parent). > > > > > > A cleaner and easier to understand solution is to save the necessary > > > vfsmount in the autofs superblock info when it is mounted. Then we > > > can easily update the vfsmount in autofs4_follow_link(). > > > > > > Signed-off-by: Jan Blunck <jblunck@suse.de> > > > Signed-off-by: Valerie Aurora <vaurora@redhat.com> > > Acked-by: <raven@themaw.net> > > > > Don't know how I missed such an obvious solution when I did this. > > Thanks, Ian > > TBH, I don't like either variant (both the in-tree one and that). > The reason why vfsmount does *NOT* belong in superblock, TYVM: you've > messed the lifetime rules. You can't pin it down, or the damn thing will > be impossible to kill. OTOH, you have no promise whatsoever that superblock > won't outlive the initial vfsmount. You might get another vfsmount over > the same thing and once the original one is gone... > > So this is simply broken. Ian, you're the expert - any ideas? What are the constraints here? -VAL ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-14 19:18 ` Valerie Aurora @ 2010-01-15 6:05 ` Ian Kent 2010-01-15 8:03 ` Al Viro 2010-01-15 14:55 ` David Howells 0 siblings, 2 replies; 27+ messages in thread From: Ian Kent @ 2010-01-15 6:05 UTC (permalink / raw) To: Valerie Aurora; +Cc: Al Viro, linux-fsdevel, Jan Blunck, autofs On 01/15/2010 03:18 AM, Valerie Aurora wrote: > On Thu, Jan 14, 2010 at 05:43:22AM +0000, Al Viro wrote: >> On Sat, Jan 02, 2010 at 08:44:25AM +0800, Ian Kent wrote: >>> On Wed, Dec 23, 2009 at 03:36:57PM -0800, Valerie Aurora wrote: >>>> From: Jan Blunck <jblunck@suse.de> >>>> >>>> This is a bugfix/replacement for commit >>>> 051d381259eb57d6074d02a6ba6e90e744f1a29f: >>>> >>>> During a path walk if an autofs trigger is mounted on a dentry, >>>> when the follow_link method is called, the nameidata struct >>>> contains the vfsmount and mountpoint dentry of the parent mount >>>> while the dentry that is passed in is the root of the autofs >>>> trigger mount. I believe it is impossible to get the vfsmount of >>>> the trigger mount, within the follow_link method, when only the >>>> parent vfsmount and the root dentry of the trigger mount are >>>> known. >>>> >>>> The solution in this commit was to replace the path embedded in the >>>> parent's nameidata with the path of the link itself in >>>> __do_follow_link(). This is a relatively harmless misuse of the >>>> field, but union mounts ran into a bug during follow_link() caused by >>>> the nameidata containing the wrong path (we count on it being what it >>>> is all other places - the path of the parent). >>>> >>>> A cleaner and easier to understand solution is to save the necessary >>>> vfsmount in the autofs superblock info when it is mounted. Then we >>>> can easily update the vfsmount in autofs4_follow_link(). >>>> >>>> Signed-off-by: Jan Blunck <jblunck@suse.de> >>>> Signed-off-by: Valerie Aurora <vaurora@redhat.com> >>> Acked-by: <raven@themaw.net> >>> >>> Don't know how I missed such an obvious solution when I did this. >>> Thanks, Ian >> >> TBH, I don't like either variant (both the in-tree one and that). >> The reason why vfsmount does *NOT* belong in superblock, TYVM: you've >> messed the lifetime rules. You can't pin it down, or the damn thing will >> be impossible to kill. OTOH, you have no promise whatsoever that superblock >> won't outlive the initial vfsmount. You might get another vfsmount over >> the same thing and once the original one is gone... >> >> So this is simply broken. > > Ian, you're the expert - any ideas? What are the constraints here? It looks to me like the struct path coming to __do_follow_link() has what I need. A potentially big change, but using the struct path instead of the dentry in the inode op follow_link() call would get me what I need in my follow_link() call without having to store anything. Ian ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-15 6:05 ` Ian Kent @ 2010-01-15 8:03 ` Al Viro 2010-01-15 17:36 ` Steve French 2010-01-18 5:08 ` Ian Kent 2010-01-15 14:55 ` David Howells 1 sibling, 2 replies; 27+ messages in thread From: Al Viro @ 2010-01-15 8:03 UTC (permalink / raw) To: Ian Kent Cc: Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, dhowells, Trond.Myklebust [AFS, CIFS and NFS maintainers added to Cc] On Fri, Jan 15, 2010 at 02:05:24PM +0800, Ian Kent wrote: > >> So this is simply broken. > > > > Ian, you're the expert - any ideas? What are the constraints here? > > It looks to me like the struct path coming to __do_follow_link() has > what I need. A potentially big change, but using the struct path instead > of the dentry in the inode op follow_link() call would get me what I > need in my follow_link() call without having to store anything. Yuck. If we go for change of ->follow_link() prototype, we might as well DTRT and split the damn thing instead of trying to overload an unsuitable method. Let's see what's really going on there. AFAICT, we have several, er, oddities in the area: * abuse of mnt_devname by nfs; do we ever want it to differ (for nfs automounting purposes) for different vfsmounts over the same superblock? * inconsistent behaviour when something had managed to mount first. Some expect the possibility of -EBUSY, some do not. * WTF is CIFS doing setting MNT_SHRINKABLE on *parent* vfsmount? Blindly, not even bothering to cooperate with mnt_make_readonly() and its ilk. Most of all, _why_? It's child that should be marked so, not the parent... * speaking of which, what should happen to other vfsmount flags? NFS seems to want them all inherited and MNT_SHRINKABLE set, which is reasonable enough; CIFS sets MNT_SHRINKABLE on parent and inherits everything; AFS can't be arsed and just sets MNT_SHRINKABLE, all flags on parent be damned. * what's going on with use of intent flags in case of autofs4? Other than that, it'd seem that we would be better off if we treated these guys as "if we try to follow mountpoint and that sucker turns out to be the end of the road, do something fs-specific and either try to go further or see if we could attach the mountpoint given to us here". Which might be better done in VFS, with fs method returning vfsmount, NULL or ERR_PTR. And yes, it'd mean rework of vfsmount expiry interface; we'd need to put the vfsmount on expiry list before attaching it anywhere. Not a big deal, that... Comments? It's obviously not a solution yet and this direction might very well turn out to be unfeasible, but let's at least see where that might lead. There are, after all, 4 users of that stuff. Should be enough to figure out what we really need and come up with a sane design... ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-15 8:03 ` Al Viro @ 2010-01-15 17:36 ` Steve French 2010-01-18 5:08 ` Ian Kent 1 sibling, 0 replies; 27+ messages in thread From: Steve French @ 2010-01-15 17:36 UTC (permalink / raw) To: Al Viro Cc: autofs, dhowells, Trond.Myklebust, sfrench, Valerie Aurora, linux-fsdevel, Jan Blunck, Ian Kent [-- Attachment #1.1: Type: text/plain, Size: 481 bytes --] On Fri, Jan 15, 2010 at 2:03 AM, Al Viro <viro@zeniv.linux.org.uk> wrote: > [AFS, CIFS and NFS maintainers added to Cc] > * WTF is CIFS doing setting MNT_SHRINKABLE on *parent* vfsmount? > Blindly, not even bothering to cooperate with mnt_make_readonly() and > its ilk. Most of all, _why_? It's child that should be marked so, not > the parent... > Yes. Looks like a bug. Thanks for spotting this. Doublechecking with Igor (who wrote this section). -- Thanks, Steve [-- Attachment #1.2: Type: text/html, Size: 793 bytes --] [-- Attachment #2: Type: text/plain, Size: 140 bytes --] _______________________________________________ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-15 8:03 ` Al Viro 2010-01-15 17:36 ` Steve French @ 2010-01-18 5:08 ` Ian Kent 1 sibling, 0 replies; 27+ messages in thread From: Ian Kent @ 2010-01-18 5:08 UTC (permalink / raw) To: Al Viro Cc: Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, dhowells, Trond.Myklebust On 01/15/2010 04:03 PM, Al Viro wrote: > * what's going on with use of intent flags in case of autofs4? It's the same old story. autofs needs to take care not to trigger mounts for certain system calls when we have reached the last path component. AFAIK the only way to do that is to check the flags. > > Other than that, it'd seem that we would be better off if we treated these > guys as "if we try to follow mountpoint and that sucker turns out to be > the end of the road, do something fs-specific and either try to go further > or see if we could attach the mountpoint given to us here". Which might > be better done in VFS, with fs method returning vfsmount, NULL or ERR_PTR. Forgive my thickness but am I understanding this correctly? Are you just saying it would be better to add a file system method to cater for the case where we might need to do further resolution on the mount root we have just arrived at before continuing into it (mmm ... now it feels like I've stated the obvious, but anyway ...)? Would that be before following the link or after (not that it would matter for my needs)? Ian ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-15 6:05 ` Ian Kent 2010-01-15 8:03 ` Al Viro @ 2010-01-15 14:55 ` David Howells 2010-01-15 16:58 ` Al Viro 2010-01-15 17:08 ` David Howells 1 sibling, 2 replies; 27+ messages in thread From: David Howells @ 2010-01-15 14:55 UTC (permalink / raw) To: Al Viro Cc: dhowells, Ian Kent, Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust Al Viro <viro@ZenIV.linux.org.uk> wrote: > AFS can't be arsed and just sets MNT_SHRINKABLE, all flags on parent be > damned. Why's that wrong? This is AFS's automounter, and it seems quite reasonable to have the outer edges pruned to make space. David ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-15 14:55 ` David Howells @ 2010-01-15 16:58 ` Al Viro 2010-01-15 17:08 ` David Howells 1 sibling, 0 replies; 27+ messages in thread From: Al Viro @ 2010-01-15 16:58 UTC (permalink / raw) To: David Howells Cc: Ian Kent, Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust On Fri, Jan 15, 2010 at 02:55:23PM +0000, David Howells wrote: > Al Viro <viro@ZenIV.linux.org.uk> wrote: > > > AFS can't be arsed and just sets MNT_SHRINKABLE, all flags on parent be > > damned. > > Why's that wrong? This is AFS's automounter, and it seems quite reasonable to > have the outer edges pruned to make space. Do you want it to inerit e.g. nosuid? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-15 14:55 ` David Howells 2010-01-15 16:58 ` Al Viro @ 2010-01-15 17:08 ` David Howells 2010-01-15 17:26 ` Al Viro 1 sibling, 1 reply; 27+ messages in thread From: David Howells @ 2010-01-15 17:08 UTC (permalink / raw) To: Al Viro Cc: dhowells, Ian Kent, Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust Al Viro <viro@ZenIV.linux.org.uk> wrote: > Do you want it to inerit e.g. nosuid? Are you just talking about MNT_SHRINKABLE? Or all the other flags? Should I be passing: nd->path.mnt->mnt_flags | MNT_SHRINKABLE instead? David ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-15 17:08 ` David Howells @ 2010-01-15 17:26 ` Al Viro 2010-01-16 10:17 ` Al Viro 0 siblings, 1 reply; 27+ messages in thread From: Al Viro @ 2010-01-15 17:26 UTC (permalink / raw) To: David Howells Cc: Ian Kent, Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust On Fri, Jan 15, 2010 at 05:08:28PM +0000, David Howells wrote: > Al Viro <viro@ZenIV.linux.org.uk> wrote: > > > Do you want it to inerit e.g. nosuid? > > Are you just talking about MNT_SHRINKABLE? Or all the other flags? > > Should I be passing: > > nd->path.mnt->mnt_flags | MNT_SHRINKABLE > > instead? Maybe, maybe not... BTW, even that leaves an unpleasant race with mnt_make_readonly() (CIFS and NFS seem to be suffering from one). Which flags do we want to be inherited? Grabbing MNT_WRITE_HOLD, for example, would obviously be a bad idea... ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-15 17:26 ` Al Viro @ 2010-01-16 10:17 ` Al Viro 2010-01-17 17:57 ` Al Viro 0 siblings, 1 reply; 27+ messages in thread From: Al Viro @ 2010-01-16 10:17 UTC (permalink / raw) To: David Howells Cc: Ian Kent, Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust On Fri, Jan 15, 2010 at 05:26:33PM +0000, Al Viro wrote: > Maybe, maybe not... BTW, even that leaves an unpleasant race with > mnt_make_readonly() (CIFS and NFS seem to be suffering from one). > Which flags do we want to be inherited? Grabbing MNT_WRITE_HOLD, > for example, would obviously be a bad idea... Oh, man... After doing code review re locking rules for mnt_flags and related stuff, a bunch of races had shown up. I'll put fixes into #for-linus tomorrow. Shit galore: * may_umount() ought to take namespace_sem (shared). Otherwise we race with clone_mnt() doing add_list() to ->mnt_share/->mnt_slave. * attach_recursive_mnt() ought to take vfsmount_lock around its loop that does set_mnt_shared(); otherwise mount --move can race with e.g. mount -o remount. * do_remount() ought to take vfsmount_lock around the assigment to mnt_flags *and* take care to leave MNT_SHARED and MNT_UNBINDABLE alone, especially the former. * CIFS shouldn't step on mnt_flags * NFS, AFS and CIFS should *not* leave MNT_SHARED and MNT_WRITE_HOLD in flags passed to do_add_mount(); alternatively, do_add_mount() might trim those. * [unsolved, to be dealt along with per-superblock write counts] do_remount() plays fast and loose with MNT_READONLY for !MS_BIND case. * [*really* unsolved] it remains to be seen whether we want to propagate modifications of mount flags via shared subtree stuff. For most of those it's trivial (and arguably the right thing to do), but ro/rw is really nasty. Nick's mnt_want_write() implementation will need very careful analysis. * [#for-next fodder] pnode.c:get_source() cleanup; right now it is correct, but PITA to prove the correctness. Incidentally, CL_PROPAGATION is gone after that one. Not #for-linus stuff, but I'll be really happier with it for post-.33 * [#for-next, but might go into #for-linus as well] explicit documentation of invariants added to Documentation/filesystems/sharedsubtree.txt Part of that can be deduced from what's already there, but I'd rather have it explicit and one crucial bit is simply missing ("if mnt->mnt_master != NULL, the entire mnt->mnt_share list consists of adjacent elements of mnt->mnt_slaves, in the same order") and proving correctness without it is Not Fun(tm). Hell, I've spent an hour figuring out whether it's broken or not and I'd been through all that code back when it had been written. In details. I'll give that another look after I get some sleep, then to public tree it goes... ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-16 10:17 ` Al Viro @ 2010-01-17 17:57 ` Al Viro 2010-01-18 4:21 ` Ian Kent 0 siblings, 1 reply; 27+ messages in thread From: Al Viro @ 2010-01-17 17:57 UTC (permalink / raw) To: Ian Kent Cc: Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust, David Howells On Sat, Jan 16, 2010 at 10:17:14AM +0000, Al Viro wrote: > * [unsolved, to be dealt along with per-superblock write counts] > do_remount() plays fast and loose with MNT_READONLY for !MS_BIND case. > * [*really* unsolved] it remains to be seen whether we want to > propagate modifications of mount flags via shared subtree stuff. For most > of those it's trivial (and arguably the right thing to do), but ro/rw is > really nasty. Nick's mnt_want_write() implementation will need very careful > analysis. Speaking of autofs4, what the hell is going on in autofs_dev_ioctl_ismountpoint()? Checks in there make no sense, both the "could that dentry be negative?" and whatever the hell it is trying to do with mnt_mountpoint. Ian? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-17 17:57 ` Al Viro @ 2010-01-18 4:21 ` Ian Kent 2010-01-18 5:59 ` Al Viro 0 siblings, 1 reply; 27+ messages in thread From: Ian Kent @ 2010-01-18 4:21 UTC (permalink / raw) To: Al Viro Cc: Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust, David Howells On 01/18/2010 01:57 AM, Al Viro wrote: > On Sat, Jan 16, 2010 at 10:17:14AM +0000, Al Viro wrote: >> * [unsolved, to be dealt along with per-superblock write counts] >> do_remount() plays fast and loose with MNT_READONLY for !MS_BIND case. >> * [*really* unsolved] it remains to be seen whether we want to >> propagate modifications of mount flags via shared subtree stuff. For most >> of those it's trivial (and arguably the right thing to do), but ro/rw is >> really nasty. Nick's mnt_want_write() implementation will need very careful >> analysis. > > Speaking of autofs4, what the hell is going on in > autofs_dev_ioctl_ismountpoint()? Checks in there make no sense, > both the "could that dentry be negative?" and whatever the hell > it is trying to do with mnt_mountpoint. Ian? In that case we may find an autofs mount that has something mounted on top of it and user space wants to know the super of the covering mount. If there is something mounted on top of it user space needs to know if it is another autofs file system or some other type of file system. So if the nameidata path, located by autofs_dev_ioctl_find_super(), is not the top (or bottom, depending on the terminology you prefer) then we need to follow the mount and return the magic of the thing mounted on top of it. Ian ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-18 4:21 ` Ian Kent @ 2010-01-18 5:59 ` Al Viro 2010-01-18 9:14 ` Ian Kent 0 siblings, 1 reply; 27+ messages in thread From: Al Viro @ 2010-01-18 5:59 UTC (permalink / raw) To: Ian Kent Cc: Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust, David Howells On Mon, Jan 18, 2010 at 12:21:09PM +0800, Ian Kent wrote: > In that case we may find an autofs mount that has something mounted on > top of it and user space wants to know the super of the covering mount. > > If there is something mounted on top of it user space needs to know if > it is another autofs file system or some other type of file system. So > if the nameidata path, located by autofs_dev_ioctl_find_super(), is not > the top (or bottom, depending on the terminology you prefer) then we > need to follow the mount and return the magic of the thing mounted on > top of it. IDGI. What you are doing there is if (path.mnt->mnt_mountpoint != path.mnt->mnt_root) { if (follow_down(&path)) magic = path.mnt->mnt_sb->s_magic; } and I don't think it means what you think it means. Just what is that mnt_mountpoint check about? Before that point we'd found the autofs vfsmount M that 1) M is mounted on <name> 2) M->mnt_sb has the right s_dev 3) M is the closest one to root in mount tree out of vfsmounts satisfying (1) and (2) Now we check that 4) the mountpoint M is attached to has dentry different from M->mnt_root. That's an interesting thing to check, seeing that the only way to get it false is to have mount --bind name name, with name not being the mountpoint before that. And M being the result of that mount --bind. 5) something is mounted on top of root of M. Then we proceed to return the s_magic of that something. For one thing, if there *are* several vfsmounts satisfying (1,2), which one do we really want? For another, what's the intent of (4)? It looks very odd; what's really being checked there? In another branch we have if (path.dentry->d_inode && path.mnt->mnt_root == path.dentry) { err = 1; magic = path.dentry->d_inode->i_sb->s_magic; } and AFAICT, path.dentry->d_inode == NULL is impossible there. Besides, path.mnt->mnt_sb->s_magic would be simpler anyway (and evaluate to the same thing). ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-18 5:59 ` Al Viro @ 2010-01-18 9:14 ` Ian Kent 2010-01-18 10:27 ` Al Viro 0 siblings, 1 reply; 27+ messages in thread From: Ian Kent @ 2010-01-18 9:14 UTC (permalink / raw) To: Al Viro Cc: Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust, David Howells On 01/18/2010 01:59 PM, Al Viro wrote: > On Mon, Jan 18, 2010 at 12:21:09PM +0800, Ian Kent wrote: > >> In that case we may find an autofs mount that has something mounted on >> top of it and user space wants to know the super of the covering mount. >> >> If there is something mounted on top of it user space needs to know if >> it is another autofs file system or some other type of file system. So >> if the nameidata path, located by autofs_dev_ioctl_find_super(), is not >> the top (or bottom, depending on the terminology you prefer) then we >> need to follow the mount and return the magic of the thing mounted on >> top of it. > > IDGI. What you are doing there is > if (path.mnt->mnt_mountpoint != path.mnt->mnt_root) { > if (follow_down(&path)) > magic = path.mnt->mnt_sb->s_magic; > } > and I don't think it means what you think it means. Just what is that > mnt_mountpoint check about? Before that point we'd found the autofs > vfsmount M that > 1) M is mounted on <name> > 2) M->mnt_sb has the right s_dev > 3) M is the closest one to root in mount tree out of vfsmounts > satisfying (1) and (2) > Now we check that > 4) the mountpoint M is attached to has dentry different from > M->mnt_root. That's an interesting thing to check, seeing that the > only way to get it false is to have mount --bind name name, with name > not being the mountpoint before that. And M being the result of > that mount --bind. > 5) something is mounted on top of root of M. > > Then we proceed to return the s_magic of that something. For one thing, > if there *are* several vfsmounts satisfying (1,2), which one do we really > want? For another, what's the intent of (4)? It looks very odd; what's > really being checked there? The code here has changed a little from what I originally posted but, assuming for now the functionality is equivalent, I can't see what the point of that check is and now I can't remember if there was some odd special case. follow_mount() should be sufficient. I'll fix this. The possibility of more than one vfsmount being present is, as you say possible, but it is not legal for autofs (and last time I checked I concluded it wasn't possible for me to veto bind mount requests). Other than bind mounts I'm struggling to think of a case where I would have more than one autofs fs mount with the same s_dev. > > In another branch we have > if (path.dentry->d_inode && > path.mnt->mnt_root == path.dentry) { > err = 1; > magic = path.dentry->d_inode->i_sb->s_magic; > } > and AFAICT, path.dentry->d_inode == NULL is impossible there. Besides, > path.mnt->mnt_sb->s_magic would be simpler anyway (and evaluate to > the same thing). Yes, the dentry should always be positive here but let me think about it a little more in case I'm missing something. And yes, using the vfsmount super block pointer would be better. I'll fix these too. Ian ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-18 9:14 ` Ian Kent @ 2010-01-18 10:27 ` Al Viro 2010-01-18 19:35 ` Trond Myklebust 2010-01-19 7:05 ` Ian Kent 0 siblings, 2 replies; 27+ messages in thread From: Al Viro @ 2010-01-18 10:27 UTC (permalink / raw) To: Ian Kent Cc: autofs, David Howells, Trond.Myklebust, sfrench, Valerie Aurora, linux-fsdevel, Jan Blunck On Mon, Jan 18, 2010 at 05:14:58PM +0800, Ian Kent wrote: > The possibility of more than one vfsmount being present is, as you say > possible, but it is not legal for autofs (and last time I checked I > concluded it wasn't possible for me to veto bind mount requests). Other > than bind mounts I'm struggling to think of a case where I would have > more than one autofs fs mount with the same s_dev. That's interesting... Other place where we go through the stack of mounts is where we look for one with given bits in type; do we have a possibility of multiple candidates there and which one do we really want? Same function, case when we have ioctlfd == -1, !autofs_type_any(type). Current code (as well as original) goes for one closest to root; it would certainly be simpler to go for one on top of stack... > Yes, the dentry should always be positive here but let me think about it > a little more in case I'm missing something. And yes, using the vfsmount > super block pointer would be better. I'll fix these too. FWIW, I'd rather have that stuff in vfs tree; that's not the only patch eliminating mnt_mountpoint use. AFAICT, none of the uses outside of core VFS code are legitimate (and BTW, NFSv4 one is contrary to RFC - it should do follow_down() in loop before reporting mount_fileid instead of just picking the immediate ancestor for one thing and I'm not at all sure that check around that thing is correct; it ignores export path.dentry and looks at path.mnt.root instead, which seems to be wrong). Other places would be just as happy with mnt/mnt->mnt_root instead of mnt->mnt_parent/ mnt->mnt_mountpoint or, worse, mnt/mnt->mnt_mountpoint. Pohmelfs is even nastier, with its d_path() abuses, but that's a separate story. IMO we ought to get rid of mnt_mountpoint/mnt_parent uses outside of core VFS and I'd rather do that in one patch queue. Back to another question: which syscalls should and which syscalls should not trigger automount on the last component? Note that it's something we'd better be consistent about between autofs4 and cifs/afs/nfs... ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-18 10:27 ` Al Viro @ 2010-01-18 19:35 ` Trond Myklebust 2010-01-19 7:05 ` Ian Kent 1 sibling, 0 replies; 27+ messages in thread From: Trond Myklebust @ 2010-01-18 19:35 UTC (permalink / raw) To: Al Viro Cc: Ian Kent, Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, David Howells On Mon, 2010-01-18 at 10:27 +0000, Al Viro wrote: > Back to another question: which syscalls should and which syscalls should not > trigger automount on the last component? Note that it's something we'd better > be consistent about between autofs4 and cifs/afs/nfs... In addition to the ones that trigger automounts now: One syscall that we've had a lot of complaints about is 'stat()'. Since it doesn't follow symlinks, it will fail to trigger the automount, and will return bogus values for st_dev. This again confuses some versions of the 'du' utility that check the value of st_dev before and after entering the subdir. (see https://bugzilla.redhat.com/show_bug.cgi?id=431166 ) In the cifs/afs/nfs case, statfs() should really trigger automounts too. Not sure if we want that for the autofs case. Finally, it makes little sense for cifs/afs/nfs to allow open() on the underlying directory, even if O_NOFOLLOW is set. Again, autofs might be an exception due to its reliance on ioctls as a method of communication with the kernel. Cheers Trond ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info 2010-01-18 10:27 ` Al Viro 2010-01-18 19:35 ` Trond Myklebust @ 2010-01-19 7:05 ` Ian Kent 1 sibling, 0 replies; 27+ messages in thread From: Ian Kent @ 2010-01-19 7:05 UTC (permalink / raw) To: Al Viro Cc: Valerie Aurora, linux-fsdevel, Jan Blunck, autofs, sfrench, Trond.Myklebust, David Howells On 01/18/2010 06:27 PM, Al Viro wrote: > On Mon, Jan 18, 2010 at 05:14:58PM +0800, Ian Kent wrote: > >> The possibility of more than one vfsmount being present is, as you say >> possible, but it is not legal for autofs (and last time I checked I >> concluded it wasn't possible for me to veto bind mount requests). Other >> than bind mounts I'm struggling to think of a case where I would have >> more than one autofs fs mount with the same s_dev. > > That's interesting... Other place where we go through the stack of mounts > is where we look for one with given bits in type; do we have a possibility > of multiple candidates there and which one do we really want? Same function, > case when we have ioctlfd == -1, !autofs_type_any(type). Current code > (as well as original) goes for one closest to root; it would certainly be > simpler to go for one on top of stack... We shouldn't have multiple candidates for the same reason autofs can't support bind mounts. The mounts in an autofs tree of mounts are tied to a map of mounts in user space which means they are tied to a specific path. And, we won't (or at least shouldn't unless user space uses the facility incorrectly) have multiple mounts of the same autofs type on the stack due to the way autofs must use these mounts. As far as going from the top of the stack goes I don't think it makes any real difference since the stack is small (2 or possibly 3). The find_autofs_mount() function does go from the (terminology?) bottom of the mount stack using follow_up() whereas the original code went from the top using the path_lookup() to get the parent and then follow_down(). So there is additional following with the current code since kern_path() will go down to the bottom of the stack and then we proceed to go back up in find_autofs_mount(). > >> Yes, the dentry should always be positive here but let me think about it >> a little more in case I'm missing something. And yes, using the vfsmount >> super block pointer would be better. I'll fix these too. > > FWIW, I'd rather have that stuff in vfs tree; that's not the only patch > eliminating mnt_mountpoint use. AFAICT, none of the uses outside of > core VFS code are legitimate (and BTW, NFSv4 one is contrary to RFC - it > should do follow_down() in loop before reporting mount_fileid instead of > just picking the immediate ancestor for one thing and I'm not at all sure > that check around that thing is correct; it ignores export path.dentry > and looks at path.mnt.root instead, which seems to be wrong). Other places > would be just as happy with mnt/mnt->mnt_root instead of mnt->mnt_parent/ > mnt->mnt_mountpoint or, worse, mnt/mnt->mnt_mountpoint. Pohmelfs is even > nastier, with its d_path() abuses, but that's a separate story. > > IMO we ought to get rid of mnt_mountpoint/mnt_parent uses outside of core > VFS and I'd rather do that in one patch queue. > > Back to another question: which syscalls should and which syscalls should not > trigger automount on the last component? Note that it's something we'd better > be consistent about between autofs4 and cifs/afs/nfs... That's actually a difficult question. stat(2) is the most problematic call for much the same reasons as described by Trond. A (hopefully fairly simple) description of a couple of specific cases will be useful to clarify the situation. First, terminology, an "indirect autofs mount" is an autofs fs mount that implements single path components as mount points within the autofs fs directory. There are two cases, indirect mounts that are "nobrose" and ones that "browse"able. The later is just an autofs indirect mount with pre-created mount point directories so they can be seen by listing the directory whereas a nobrowse mount corresponds to an autofs mount we have seen historically where a directory listing of an autofs mount with no current active automounted mounts returns nothing. There is no problem with stat(2) for nobrowse autofs indirect mounts because directories that can potentially trigger mounts simply don't exist and the user space tools tend to behave as people expect. But, for browse autofs indirect mounts we get into trouble. For example a long or colour ls will stat(2) each directory to get needed information causing all the potential mounts to be mounted. Which is an pbvious problem for autofs maps that have more than a few entries and a disaster for anything with more than a hundred or so entries. Currently stat(2) is honoured for the nobrowse case (the case were we are creating a dentry) and not for the browse case, unless of course the mount has already been triggered. This gives rise to the stat() -> chdir() (or other mount triggering call) -> stat() inconsistency we see in user space utilities. In Solaris browse is the default for indirect mounts and we see the same lies in directory listings as mounts aren't triggered in that case either. statfs(2) is an interesting case for autofs direct mounts. More terminology, a "direct autofs mount" is an autofs mount that implements a mount trigger for a given path. An autofs direct map consists of entries with a full and a target mount and so each map entry is a distinct mount within the host file system (mmm ... sounds a bit like the description in the Solaris man page). When a direct mount is crossed the corresponding mount is triggered and is mounted on top of the autofs direct mount point. We probably should honour statfs(2) calls but without also necessarily honouring stat(2) calls. Currently the statfs(2) calls get caught by the exclusion above for stat(2) calls but we don't see many complaints because we don't list autofs mounts in /etc/mtab, where the user space utilities usually look for their information. Once again, this approach is similar to what we see in Solaris. In Solaris /etc/mnttab is similar to our /proc/mounts but there is a pseudo option ("hide" or something similar, I can't remember now) used to prevent user space utilities from considering entries that shouldn't be reported. So the stat(2) call is the big problem while other related system calls (in that they use very similar lookup flags), like statfs(2) have the opposite problem. This probably isn't really what you wanted to hear but I've tried to give a decent warts and all description of the difficulty. Ian ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2010-01-19 7:05 UTC | newest] Thread overview: 27+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-12-23 23:36 [PATCH 0/7] VFS prep for union mounts/writable overlays Valerie Aurora 2009-12-23 23:36 ` [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info Valerie Aurora 2009-12-23 23:36 ` [PATCH 2/7] VFS: Make lookup_hash() return a struct path Valerie Aurora 2009-12-23 23:36 ` [PATCH 3/7] VFS: Make real_lookup() " Valerie Aurora 2009-12-23 23:37 ` [PATCH 4/7] VFS: Propagate mnt_flags into do_loopback Valerie Aurora 2009-12-23 23:37 ` [PATCH 5/7] VFS: Add read-only users count to superblock Valerie Aurora 2009-12-23 23:37 ` [PATCH 6/7] VFS: BUG_ON() rehash of an already hashed dentry Valerie Aurora 2009-12-23 23:37 ` [PATCH 7/7] VFS: Remove unnecessary micro-optimization in cached_lookup() Valerie Aurora 2010-01-02 0:44 ` [PATCH 1/7] autofs4: Save autofs trigger's vfsmount in super block info Ian Kent 2010-01-14 5:43 ` Al Viro 2010-01-14 19:18 ` Valerie Aurora 2010-01-15 6:05 ` Ian Kent 2010-01-15 8:03 ` Al Viro 2010-01-15 17:36 ` Steve French 2010-01-18 5:08 ` Ian Kent 2010-01-15 14:55 ` David Howells 2010-01-15 16:58 ` Al Viro 2010-01-15 17:08 ` David Howells 2010-01-15 17:26 ` Al Viro 2010-01-16 10:17 ` Al Viro 2010-01-17 17:57 ` Al Viro 2010-01-18 4:21 ` Ian Kent 2010-01-18 5:59 ` Al Viro 2010-01-18 9:14 ` Ian Kent 2010-01-18 10:27 ` Al Viro 2010-01-18 19:35 ` Trond Myklebust 2010-01-19 7:05 ` Ian Kent
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).