* [PATCH -V20 00/12] Generic name to handle and open by handle syscalls
@ 2010-09-28 19:36 Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 01/12] exportfs: Return the minimum required handle size Aneesh Kumar K.V
` (11 more replies)
0 siblings, 12 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel
Hi,
The patches below implement open-by-handle support using the exportfs
operations. This allows a user space application to map a file name to a file
handle and later open the file using that handle. This should be usable
for a userspace NFS server [1] and a 9P server [2]. XFS already supports this with the ioctls
XFS_IOC_PATH_TO_HANDLE and XFS_IOC_OPEN_BY_HANDLE.
[1] http://nfs-ganesha.sourceforge.net/
[2] http://thread.gmane.org/gmane.comp.emulators.qemu/68992
The git repo for the patchset is at:
git://git.kernel.org/pub/scm/linux/kernel/git/kvaneesh/linux-open-handle.git open-by-handle
The test case can be found at:
http://git.kernel.org/?p=fs/ext2/kvaneesh/handle-test.git
git://git.kernel.org/pub/scm/fs/ext2/kvaneesh/handle-test.git
Changes from V19:
a) Drop handle based chown, xattr, utimes syscalls
b) Rebased to latest linus kernel (050026feae5bd4fe2db4096b63b15abce7c47faa)
-aneesh
* [PATCH -V20 01/12] exportfs: Return the minimum required handle size
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:52 ` J. Bruce Fields
2010-09-28 19:36 ` [PATCH -V20 02/12] vfs: Add name to file handle conversion support Aneesh Kumar K.V
` (10 subsequent siblings)
11 siblings, 1 reply; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
The exportfs encode handle function should return the minimum required
handle size. This lets the caller discover the handle size by passing a
handle size of 0 in the first call and then repeating the call with
the returned handle size value.
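As an illustration of the two-step calling convention described above, here
is a minimal user-space sketch. It is not code from this patch:
mock_encode_fh() and encode_two_step() are hypothetical helpers, and the
sizes (2 and 4 words) and type values (1 and 2) only mimic the generic
export_encode_fh() case. Lengths are in 4-byte units and 255 signals that
the buffer was too small.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_FH_TOO_SMALL 255	/* hypothetical name for the error return */

/*
 * Hypothetical stand-in for an encode_fh() implementation that follows
 * the convention of this patch: on a short buffer it stores the required
 * length (in 4-byte units) in *max_len and returns 255.
 */
int mock_encode_fh(uint32_t *fh, int *max_len, int connectable)
{
	int need = connectable ? 4 : 2;

	if (*max_len < need) {
		*max_len = need;
		return SKETCH_FH_TOO_SMALL;
	}
	fh[0] = 42;		/* illustrative inode number */
	fh[1] = 7;		/* illustrative generation */
	if (connectable) {
		fh[2] = 1;	/* illustrative parent inode number */
		fh[3] = 3;	/* illustrative parent generation */
	}
	*max_len = need;
	return connectable ? 2 : 1;	/* illustrative fileid types */
}

/*
 * Step 1: probe with a length of 0 to learn the required size.
 * Step 2: allocate that many words and repeat the call.
 * Returns a malloc'ed buffer the caller must free, or NULL on failure.
 */
uint32_t *encode_two_step(int connectable, int *len_out, int *type_out)
{
	int len = 0;
	int type;
	uint32_t *buf;

	type = mock_encode_fh(NULL, &len, connectable);
	if (type != SKETCH_FH_TOO_SMALL)
		return NULL;	/* a zero-length probe should never succeed */

	buf = calloc(len, sizeof(*buf));
	if (!buf)
		return NULL;

	type = mock_encode_fh(buf, &len, connectable);
	assert(type != SKETCH_FH_TOO_SMALL);

	*len_out = len;
	*type_out = type;
	return buf;
}
```

A caller would use encode_two_step(0, &len, &type) for a non-connectable
handle and free() the returned buffer when done.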
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/btrfs/export.c | 8 ++++++--
fs/exportfs/expfs.c | 9 +++++++--
fs/fat/inode.c | 4 +++-
fs/fuse/inode.c | 4 +++-
fs/gfs2/export.c | 8 ++++++--
fs/isofs/export.c | 8 ++++++--
fs/ocfs2/export.c | 8 +++++++-
fs/reiserfs/inode.c | 7 ++++++-
fs/udf/namei.c | 7 ++++++-
fs/xfs/linux-2.6/xfs_export.c | 4 +++-
include/linux/exportfs.h | 6 ++++--
mm/shmem.c | 4 +++-
12 files changed, 60 insertions(+), 17 deletions(-)
diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
index 951ef09..5f8ee5a 100644
--- a/fs/btrfs/export.c
+++ b/fs/btrfs/export.c
@@ -21,9 +21,13 @@ static int btrfs_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
int len = *max_len;
int type;
- if ((len < BTRFS_FID_SIZE_NON_CONNECTABLE) ||
- (connectable && len < BTRFS_FID_SIZE_CONNECTABLE))
+ if (connectable && (len < BTRFS_FID_SIZE_CONNECTABLE)) {
+ *max_len = BTRFS_FID_SIZE_CONNECTABLE;
return 255;
+ } else if (len < BTRFS_FID_SIZE_NON_CONNECTABLE) {
+ *max_len = BTRFS_FID_SIZE_NON_CONNECTABLE;
+ return 255;
+ }
len = BTRFS_FID_SIZE_NON_CONNECTABLE;
type = FILEID_BTRFS_WITHOUT_PARENT;
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index e9e1759..cfee0f0 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -319,9 +319,14 @@ static int export_encode_fh(struct dentry *dentry, struct fid *fid,
struct inode * inode = dentry->d_inode;
int len = *max_len;
int type = FILEID_INO32_GEN;
-
- if (len < 2 || (connectable && len < 4))
+
+ if (connectable && (len < 4)) {
+ *max_len = 4;
+ return 255;
+ } else if (len < 2) {
+ *max_len = 2;
return 255;
+ }
len = 2;
fid->i32.ino = inode->i_ino;
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 8300580..0812d29 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -759,8 +759,10 @@ fat_encode_fh(struct dentry *de, __u32 *fh, int *lenp, int connectable)
struct inode *inode = de->d_inode;
u32 ipos_h, ipos_m, ipos_l;
- if (len < 5)
+ if (len < 5) {
+ *lenp = 5;
return 255; /* no room */
+ }
ipos_h = MSDOS_I(inode)->i_pos >> 8;
ipos_m = (MSDOS_I(inode)->i_pos & 0xf0) << 24;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index da9e6e1..52adfcd 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -640,8 +640,10 @@ static int fuse_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
u64 nodeid;
u32 generation;
- if (*max_len < len)
+ if (*max_len < len) {
+ *max_len = len;
return 255;
+ }
nodeid = get_fuse_inode(inode)->nodeid;
generation = inode->i_generation;
diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index dfe237a..bd0fd68 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -36,9 +36,13 @@ static int gfs2_encode_fh(struct dentry *dentry, __u32 *p, int *len,
struct super_block *sb = inode->i_sb;
struct gfs2_inode *ip = GFS2_I(inode);
- if (*len < GFS2_SMALL_FH_SIZE ||
- (connectable && *len < GFS2_LARGE_FH_SIZE))
+ if (connectable && (*len < GFS2_LARGE_FH_SIZE)) {
+ *len = GFS2_LARGE_FH_SIZE;
return 255;
+ } else if (*len < GFS2_SMALL_FH_SIZE) {
+ *len = GFS2_SMALL_FH_SIZE;
+ return 255;
+ }
fh[0] = cpu_to_be32(ip->i_no_formal_ino >> 32);
fh[1] = cpu_to_be32(ip->i_no_formal_ino & 0xFFFFFFFF);
diff --git a/fs/isofs/export.c b/fs/isofs/export.c
index ed752cb..dd4687f 100644
--- a/fs/isofs/export.c
+++ b/fs/isofs/export.c
@@ -124,9 +124,13 @@ isofs_export_encode_fh(struct dentry *dentry,
* offset of the inode and the upper 16 bits of fh32[1] to
* hold the offset of the parent.
*/
-
- if (len < 3 || (connectable && len < 5))
+ if (connectable && (len < 5)) {
+ *max_len = 5;
+ return 255;
+ } else if (len < 3) {
+ *max_len = 3;
return 255;
+ }
len = 3;
fh32[0] = ei->i_iget5_block;
diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c
index 19ad145..250a347 100644
--- a/fs/ocfs2/export.c
+++ b/fs/ocfs2/export.c
@@ -201,8 +201,14 @@ static int ocfs2_encode_fh(struct dentry *dentry, u32 *fh_in, int *max_len,
dentry->d_name.len, dentry->d_name.name,
fh, len, connectable);
- if (len < 3 || (connectable && len < 6)) {
+ if (connectable && (len < 6)) {
mlog(ML_ERROR, "fh buffer is too small for encoding\n");
+ *max_len = 6;
+ type = 255;
+ goto bail;
+ } else if (len < 3) {
+ mlog(ML_ERROR, "fh buffer is too small for encoding\n");
+ *max_len = 3;
type = 255;
goto bail;
}
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index caa7583..44eebc7 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -1595,8 +1595,13 @@ int reiserfs_encode_fh(struct dentry *dentry, __u32 * data, int *lenp,
struct inode *inode = dentry->d_inode;
int maxlen = *lenp;
- if (maxlen < 3)
+ if (need_parent && (maxlen < 5)) {
+ *lenp = 5;
return 255;
+ } else if (maxlen < 3) {
+ *lenp = 3;
+ return 255;
+ }
data[0] = inode->i_ino;
data[1] = le32_to_cpu(INODE_PKEY(inode)->k_dir_id);
diff --git a/fs/udf/namei.c b/fs/udf/namei.c
index bf5fc67..20db42f 100644
--- a/fs/udf/namei.c
+++ b/fs/udf/namei.c
@@ -1336,8 +1336,13 @@ static int udf_encode_fh(struct dentry *de, __u32 *fh, int *lenp,
struct fid *fid = (struct fid *)fh;
int type = FILEID_UDF_WITHOUT_PARENT;
- if (len < 3 || (connectable && len < 5))
+ if (connectable && (len < 5)) {
+ *lenp = 5;
+ return 255;
+ } else if (len < 3) {
+ *lenp = 3;
return 255;
+ }
*lenp = 3;
fid->udf.block = location.logicalBlockNum;
diff --git a/fs/xfs/linux-2.6/xfs_export.c b/fs/xfs/linux-2.6/xfs_export.c
index 3764d74..7132d7c 100644
--- a/fs/xfs/linux-2.6/xfs_export.c
+++ b/fs/xfs/linux-2.6/xfs_export.c
@@ -81,8 +81,10 @@ xfs_fs_encode_fh(
* seven combinations work. The real answer is "don't use v2".
*/
len = xfs_fileid_length(fileid_type);
- if (*max_len < len)
+ if (*max_len < len) {
+ *max_len = len;
return 255;
+ }
*max_len = len;
switch (fileid_type) {
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index a9cd507..acd0b2d 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -108,8 +108,10 @@ struct fid {
* set, the encode_fh() should store sufficient information so that a good
* attempt can be made to find not only the file but also it's place in the
* filesystem. This typically means storing a reference to de->d_parent in
- * the filehandle fragment. encode_fh() should return the number of bytes
- * stored or a negative error code such as %-ENOSPC
+ * the filehandle fragment. encode_fh() should return the fileid_type on
+ * success and on error returns 255 (if the space needed to encode fh is
+ * greater than @max_len*4 bytes). On error @max_len contains the minimum
+ * size (in 4-byte units) needed to encode the file handle.
*
* fh_to_dentry:
* @fh_to_dentry is given a &struct super_block (@sb) and a file handle
diff --git a/mm/shmem.c b/mm/shmem.c
index 080b09a..9e6d86f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2143,8 +2143,10 @@ static int shmem_encode_fh(struct dentry *dentry, __u32 *fh, int *len,
{
struct inode *inode = dentry->d_inode;
- if (*len < 3)
+ if (*len < 3) {
+ *len = 3;
return 255;
+ }
if (hlist_unhashed(&inode->i_hash)) {
/* Unfortunately insert_inode_hash is not idempotent,
--
1.7.0.4
* [PATCH -V20 02/12] vfs: Add name to file handle conversion support
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 01/12] exportfs: Return the minimum required handle size Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 20:30 ` J. Bruce Fields
2010-09-28 19:36 ` [PATCH -V20 03/12] vfs: Add open by file handle support Aneesh Kumar K.V
` (9 subsequent siblings)
11 siblings, 1 reply; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
The syscall also returns the mount id, which can be used
to look up file system specific information such as the uuid
in /proc/<pid>/mountinfo.
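One way a caller might use the returned mount id is to scan
/proc/self/mountinfo for the line whose first field matches it. The helper
below is a hypothetical sketch, not part of this patch. It assumes the
documented mountinfo field order (mount ID, parent ID, major:minor, root,
mount point, ...) and extracts only the mount point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: check whether one mountinfo line describes the
 * mount with id mnt_id. Returns 1 and copies the mount point on a match,
 * 0 if the line belongs to another mount, -1 if the line cannot be parsed.
 */
int mountinfo_match(const char *line, int mnt_id,
		    char *mount_point, size_t size)
{
	int id;
	char mp[256];

	assert(size > 0);
	/* fields: mount-id parent-id major:minor root mount-point ... */
	if (sscanf(line, "%d %*d %*s %*s %255s", &id, mp) != 2)
		return -1;
	if (id != mnt_id)
		return 0;
	snprintf(mount_point, size, "%s", mp);
	return 1;
}
```

In practice the caller would read /proc/self/mountinfo line by line with
fgets() and stop at the first line for which the helper returns 1.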
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/open.c | 129 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/exportfs.h | 3 +
include/linux/fs.h | 7 +++
include/linux/syscalls.h | 5 ++-
4 files changed, 143 insertions(+), 1 deletions(-)
diff --git a/fs/open.c b/fs/open.c
index d74e198..9d5823b 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -30,6 +30,7 @@
#include <linux/fs_struct.h>
#include <linux/ima.h>
#include <linux/dnotify.h>
+#include <linux/exportfs.h>
#include "internal.h"
@@ -1042,3 +1043,131 @@ int nonseekable_open(struct inode *inode, struct file *filp)
}
EXPORT_SYMBOL(nonseekable_open);
+
+#ifdef CONFIG_EXPORTFS
+static long do_sys_name_to_handle(struct path *path,
+ struct file_handle __user *ufh,
+ int __user *mnt_id)
+{
+ long retval;
+ int handle_size;
+ struct file_handle f_handle;
+ struct file_handle *handle = NULL;
+
+ if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
+ retval = -EFAULT;
+ goto err_out;
+ }
+ if (f_handle.handle_size > MAX_HANDLE_SZ) {
+ retval = -EINVAL;
+ goto err_out;
+ }
+ handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
+ GFP_KERNEL);
+ if (!handle) {
+ retval = -ENOMEM;
+ goto err_out;
+ }
+
+ /* convert handle size to multiple of sizeof(u32) */
+ handle_size = f_handle.handle_size >> 2;
+
+ /* we ask for a non connected handle */
+ retval = exportfs_encode_fh(path->dentry,
+ (struct fid *)handle->f_handle,
+ &handle_size, 0);
+ /* convert handle size to bytes */
+ handle_size *= sizeof(u32);
+ handle->handle_type = retval;
+ handle->handle_size = handle_size;
+ if (handle_size > f_handle.handle_size) {
+ /*
+ * set the handle_size to zero so we copy only
+ * non variable part of the file_handle
+ */
+ handle_size = 0;
+ retval = -EOVERFLOW;
+ } else
+ retval = 0;
+ /* copy the mount id */
+ if (copy_to_user(mnt_id, &path->mnt->mnt_id, sizeof(*mnt_id))) {
+ retval = -EFAULT;
+ goto err_free_out;
+ }
+ if (copy_to_user(ufh, handle,
+ sizeof(struct file_handle) + handle_size))
+ retval = -EFAULT;
+err_free_out:
+ kfree(handle);
+err_out:
+ return retval;
+}
+
+/**
+ * sys_name_to_handle_at: convert name to handle
+ * @dfd: directory relative to which name is interpreted if not absolute
+ * @name: name that should be converted to handle.
+ * @handle: resulting file handle
+ * @mnt_id: mount id of the file system containing the file
+ * @flag: flag value to indicate whether to follow symlink or not
+ *
+ * @handle->handle_size indicates the space available to store the
+ * variable part of the file handle in bytes. If there is not
+ * enough space, the field is updated to return the minimum
+ * value required.
+ */
+SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
+ struct file_handle __user *, handle, int __user*, mnt_id,
+ int, flag)
+{
+
+ int follow;
+ int fput_needed;
+ long ret = -EINVAL;
+ struct path path, *pp;
+ struct file *file = NULL;
+
+ if ((flag & ~AT_SYMLINK_FOLLOW) != 0)
+ goto err_out;
+
+ if (name == NULL && dfd != AT_FDCWD) {
+ file = fget_light(dfd, &fput_needed);
+ if (file) {
+ pp = &file->f_path;
+ ret = 0;
+ } else
+ ret = -EBADF;
+ } else {
+ follow = (flag & AT_SYMLINK_FOLLOW) ? LOOKUP_FOLLOW : 0;
+ ret = user_path_at(dfd, name, follow, &path);
+ pp = &path;
+ }
+ if (ret)
+ goto err_out;
+ /*
+ * We need to make sure that the file system
+ * supports decoding of the file handle
+ */
+ if (!pp->mnt->mnt_sb->s_export_op ||
+ !pp->mnt->mnt_sb->s_export_op->fh_to_dentry) {
+ ret = -EOPNOTSUPP;
+ goto out_path;
+ }
+ ret = do_sys_name_to_handle(pp, handle, mnt_id);
+
+out_path:
+ if (file)
+ fput_light(file, fput_needed);
+ else
+ path_put(&path);
+err_out:
+ return ret;
+}
+#else
+SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
+ struct file_handle __user *, handle, int __user *, mnt_id,
+ int, flag)
+{
+ return -ENOSYS;
+}
+#endif
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index acd0b2d..1a6d72f 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -8,6 +8,9 @@ struct inode;
struct super_block;
struct vfsmount;
+/* limit the handle size to some value */
+#define MAX_HANDLE_SZ 4096
+
/*
* The fileid_type identifies how the file within the filesystem is encoded.
* In theory this is freely set and parsed by the filesystem, but we try to
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 63d069b..b64c160 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -948,6 +948,13 @@ struct file {
#endif
};
+struct file_handle {
+ int handle_size;
+ int handle_type;
+ /* file identifier */
+ unsigned char f_handle[0];
+};
+
#define get_file(x) atomic_long_inc(&(x)->f_count)
#define fput_atomic(x) atomic_long_add_unless(&(x)->f_count, -1, 1)
#define file_count(x) atomic_long_read(&(x)->f_count)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e6319d1..6ab4d07 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -62,6 +62,7 @@ struct robust_list_head;
struct getcpu_cache;
struct old_linux_dirent;
struct perf_event_attr;
+struct file_handle;
#include <linux/types.h>
#include <linux/aio_abi.h>
@@ -831,5 +832,7 @@ asmlinkage long sys_mmap_pgoff(unsigned long addr, unsigned long len,
unsigned long prot, unsigned long flags,
unsigned long fd, unsigned long pgoff);
asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
-
+asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
+ struct file_handle __user *handle,
+ int __user *mnt_id, int flag);
#endif
--
1.7.0.4
* [PATCH -V20 03/12] vfs: Add open by file handle support
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 01/12] exportfs: Return the minimum required handle size Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 02/12] vfs: Add name to file handle conversion support Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-29 5:27 ` Aneesh Kumar K. V
2010-09-28 19:36 ` [PATCH -V20 04/12] vfs: Add handle based readlink syscall Aneesh Kumar K.V
` (8 subsequent siblings)
11 siblings, 1 reply; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/compat.c | 11 +++
fs/exportfs/expfs.c | 2 +
fs/namei.c | 223 +++++++++++++++++++++++++++++++++++++++++++---
fs/open.c | 32 ++++++-
include/linux/fs.h | 10 ++-
include/linux/namei.h | 1 +
include/linux/syscalls.h | 3 +
7 files changed, 263 insertions(+), 19 deletions(-)
diff --git a/fs/compat.c b/fs/compat.c
index 0644a15..4a423fa 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -2334,3 +2334,14 @@ asmlinkage long compat_sys_timerfd_gettime(int ufd,
}
#endif /* CONFIG_TIMERFD */
+
+/*
+ * Exactly like fs/open.c:sys_open_by_handle_at(), except that it
+ * doesn't set the O_LARGEFILE flag.
+ */
+asmlinkage long
+compat_sys_open_by_handle_at(int mountdirfd,
+ struct file_handle __user *handle, int flags)
+{
+ return do_handle_open(mountdirfd, handle, flags);
+}
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index cfee0f0..05a1179 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -373,6 +373,8 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
/*
* Try to get any dentry for the given file handle from the filesystem.
*/
+ if (!nop || !nop->fh_to_dentry)
+ return ERR_PTR(-ESTALE);
result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type);
if (!result)
result = ERR_PTR(-ESTALE);
diff --git a/fs/namei.c b/fs/namei.c
index 24896e8..3439962 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -32,6 +32,7 @@
#include <linux/fcntl.h>
#include <linux/device_cgroup.h>
#include <linux/fs_struct.h>
+#include <linux/exportfs.h>
#include <asm/uaccess.h>
#include "internal.h"
@@ -1050,6 +1051,29 @@ out_fail:
return retval;
}
+struct vfsmount *get_vfsmount_from_fd(int fd)
+{
+ int fput_needed;
+ struct path path;
+ struct file *filep;
+
+ if (fd == AT_FDCWD) {
+ struct fs_struct *fs = current->fs;
+ spin_lock(&fs->lock);
+ path = fs->pwd;
+ mntget(path.mnt);
+ spin_unlock(&fs->lock);
+ } else {
+ filep = fget_light(fd, &fput_needed);
+ if (!filep)
+ return ERR_PTR(-EBADF);
+ path = filep->f_path;
+ mntget(path.mnt);
+ fput_light(filep, fput_needed);
+ }
+ return path.mnt;
+}
+
/* Returns 0 and nd will be valid on success; Retuns error, otherwise. */
static int do_path_lookup(int dfd, const char *name,
unsigned int flags, struct nameidata *nd)
@@ -1537,26 +1561,30 @@ static int open_will_truncate(int flag, struct inode *inode)
return (flag & O_TRUNC);
}
-static struct file *finish_open(struct nameidata *nd,
+static struct file *finish_open(struct file *filp, struct path *path,
int open_flag, int acc_mode)
{
- struct file *filp;
- int will_truncate;
int error;
+ int will_truncate;
- will_truncate = open_will_truncate(open_flag, nd->path.dentry->d_inode);
+ will_truncate = open_will_truncate(open_flag, path->dentry->d_inode);
if (will_truncate) {
- error = mnt_want_write(nd->path.mnt);
+ error = mnt_want_write(path->mnt);
if (error)
goto exit;
}
- error = may_open(&nd->path, acc_mode, open_flag);
+ error = may_open(path, acc_mode, open_flag);
if (error) {
if (will_truncate)
- mnt_drop_write(nd->path.mnt);
+ mnt_drop_write(path->mnt);
goto exit;
}
- filp = nameidata_to_filp(nd);
+ /* Has the filesystem initialised the file for us? */
+ if (filp->f_path.dentry == NULL)
+ filp = __dentry_open(path->dentry, path->mnt, filp,
+ NULL, current_cred());
+ else
+ path_put(path);
if (!IS_ERR(filp)) {
error = ima_file_check(filp, acc_mode);
if (error) {
@@ -1566,7 +1594,7 @@ static struct file *finish_open(struct nameidata *nd,
}
if (!IS_ERR(filp)) {
if (will_truncate) {
- error = handle_truncate(&nd->path);
+ error = handle_truncate(path);
if (error) {
fput(filp);
filp = ERR_PTR(error);
@@ -1579,13 +1607,17 @@ static struct file *finish_open(struct nameidata *nd,
* on its behalf.
*/
if (will_truncate)
- mnt_drop_write(nd->path.mnt);
+ mnt_drop_write(path->mnt);
return filp;
exit:
- if (!IS_ERR(nd->intent.open.file))
- release_open_intent(nd);
- path_put(&nd->path);
+ if (!IS_ERR(filp)) {
+ if (filp->f_path.dentry == NULL)
+ put_filp(filp);
+ else
+ fput(filp);
+ }
+ path_put(path);
return ERR_PTR(error);
}
@@ -1719,7 +1751,9 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
if (S_ISDIR(path->dentry->d_inode->i_mode))
goto exit;
ok:
- filp = finish_open(nd, open_flag, acc_mode);
+ filp = finish_open(nd->intent.open.file, &nd->path,
+ open_flag, acc_mode);
+
return filp;
exit_mutex_unlock:
@@ -1892,6 +1926,167 @@ struct file *filp_open(const char *filename, int flags, int mode)
}
EXPORT_SYMBOL(filp_open);
+#ifdef CONFIG_EXPORTFS
+static int vfs_dentry_acceptable(void *context, struct dentry *dentry)
+{
+ return 1;
+}
+
+static int do_handle_to_path(int mountdirfd, struct file_handle *handle,
+ struct path *path)
+{
+ int retval = 0;
+ int handle_size;
+
+ path->mnt = get_vfsmount_from_fd(mountdirfd);
+ if (IS_ERR(path->mnt)) {
+ retval = PTR_ERR(path->mnt);
+ goto out_err;
+ }
+ /* change the handle size to multiple of sizeof(u32) */
+ handle_size = handle->handle_size >> 2;
+ path->dentry = exportfs_decode_fh(path->mnt,
+ (struct fid *)handle->f_handle,
+ handle_size, handle->handle_type,
+ vfs_dentry_acceptable, NULL);
+ if (IS_ERR(path->dentry)) {
+ retval = PTR_ERR(path->dentry);
+ goto out_mnt;
+ }
+ return 0;
+out_mnt:
+ mntput(path->mnt);
+out_err:
+ return retval;
+}
+
+int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
+ struct path *path)
+{
+ int retval = 0;
+ struct file_handle f_handle;
+ struct file_handle *handle = NULL;
+
+ /*
+ * With handle we don't look at the execute bit on the
+ * directory. Ideally we would like CAP_DAC_SEARCH,
+ * but we don't have that.
+ */
+ if (!capable(CAP_DAC_READ_SEARCH)) {
+ retval = -EPERM;
+ goto out_err;
+ }
+ if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
+ retval = -EFAULT;
+ goto out_err;
+ }
+ if ((f_handle.handle_size > MAX_HANDLE_SZ) ||
+ (f_handle.handle_size <= 0)) {
+ retval = -EINVAL;
+ goto out_err;
+ }
+ handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
+ GFP_KERNEL);
+ if (!handle) {
+ retval = -ENOMEM;
+ goto out_err;
+ }
+ /* copy the full handle */
+ if (copy_from_user(handle, ufh,
+ sizeof(struct file_handle) +
+ f_handle.handle_size)) {
+ retval = -EFAULT;
+ goto out_handle;
+ }
+
+ retval = do_handle_to_path(mountdirfd, handle, path);
+
+out_handle:
+ kfree(handle);
+out_err:
+ return retval;
+}
+#else
+int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
+ struct path *path)
+{
+ return -ENOSYS;
+}
+#endif
+
+long do_handle_open(int mountdirfd,
+ struct file_handle __user *ufh, int open_flag)
+{
+ long retval = 0;
+ int fd, acc_mode;
+ struct path path;
+ struct file *filp;
+
+ /* can't use O_CREAT with open_by_handle */
+ if (open_flag & O_CREAT) {
+ retval = -EINVAL;
+ goto out_err;
+ }
+ retval = handle_to_path(mountdirfd, ufh, &path);
+ if (retval)
+ goto out_err;
+
+ if ((open_flag & O_DIRECTORY) &&
+ !S_ISDIR(path.dentry->d_inode->i_mode)) {
+ retval = -ENOTDIR;
+ goto out_path;
+ }
+ /*
+ * O_SYNC is implemented as __O_SYNC|O_DSYNC. As many places only
+ * check for O_DSYNC if they need any syncing at all we enforce it's
+ * always set instead of having to deal with possibly weird behaviour
+ * for malicious applications setting only __O_SYNC.
+ */
+ if (open_flag & __O_SYNC)
+ open_flag |= O_DSYNC;
+
+ acc_mode = MAY_OPEN | ACC_MODE(open_flag);
+
+ /* O_TRUNC implies we need access checks for write permissions */
+ if (open_flag & O_TRUNC)
+ acc_mode |= MAY_WRITE;
+ /*
+ * Allow the LSM permission hook to distinguish append
+ * access from general write access.
+ */
+ if (open_flag & O_APPEND)
+ acc_mode |= MAY_APPEND;
+
+ fd = get_unused_fd_flags(open_flag);
+ if (fd < 0) {
+ retval = fd;
+ goto out_path;
+ }
+ filp = get_empty_filp();
+ if (!filp) {
+ retval = -ENFILE;
+ goto out_free_fd;
+ }
+ filp->f_flags = open_flag;
+ filp = finish_open(filp, &path, open_flag, acc_mode);
+ if (IS_ERR(filp)) {
+ put_unused_fd(fd);
+ retval = PTR_ERR(filp);
+ } else {
+ retval = fd;
+ fsnotify_open(filp);
+ fd_install(fd, filp);
+ }
+ return retval;
+
+out_free_fd:
+ put_unused_fd(fd);
+out_path:
+ path_put(&path);
+out_err:
+ return retval;
+}
+
/**
* lookup_create - lookup a dentry, creating it if it doesn't exist
* @nd: nameidata info
diff --git a/fs/open.c b/fs/open.c
index 9d5823b..a0239cb 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -652,10 +652,10 @@ static inline int __get_file_write_access(struct inode *inode,
return error;
}
-static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
- struct file *f,
- int (*open)(struct inode *, struct file *),
- const struct cred *cred)
+struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
+ struct file *f,
+ int (*open)(struct inode *, struct file *),
+ const struct cred *cred)
{
struct inode *inode;
int error;
@@ -1171,3 +1171,27 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
return -ENOSYS;
}
#endif
+
+/**
+ * sys_open_by_handle_at: Open the file handle
+ * @mountdirfd: directory file descriptor
+ * @handle: file handle to be opened
+ * @flags: open flags.
+ *
+ * @mountdirfd indicates the directory file descriptor
+ * of the mount point. The file handle is decoded relative
+ * to the vfsmount pointed to by @mountdirfd. The @flags
+ * value is the same as the open(2) flags.
+ */
+SYSCALL_DEFINE3(open_by_handle_at, int, mountdirfd,
+ struct file_handle __user *, handle,
+ int, flags)
+{
+ long ret;
+
+ if (force_o_largefile())
+ flags |= O_LARGEFILE;
+
+ ret = do_handle_open(mountdirfd, handle, flags);
+ return ret;
+}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b64c160..63c2fd1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1937,6 +1937,10 @@ extern int do_fallocate(struct file *file, int mode, loff_t offset,
extern long do_sys_open(int dfd, const char __user *filename, int flags,
int mode);
extern struct file *filp_open(const char *, int, int);
+struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
+ struct file *f,
+ int (*open)(struct inode *, struct file *),
+ const struct cred *cred);
extern struct file * dentry_open(struct dentry *, struct vfsmount *, int,
const struct cred *);
extern int filp_close(struct file *, fl_owner_t id);
@@ -2148,11 +2152,15 @@ extern void free_write_pipe(struct file *);
extern struct file *do_filp_open(int dfd, const char *pathname,
int open_flag, int mode, int acc_mode);
+extern int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
+ struct path *path);
+extern long do_handle_open(int mountdirfd,
+ struct file_handle __user *ufh, int open_flag);
extern int may_open(struct path *, int, int);
extern int kernel_read(struct file *, loff_t, char *, unsigned long);
extern struct file * open_exec(const char *);
-
+
/* fs/dcache.c -- generic fs support functions */
extern int is_subdir(struct dentry *, struct dentry *);
extern int path_is_under(struct path *, struct path *);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 05b441d..827aef0 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -64,6 +64,7 @@ extern int user_path_at(int, const char __user *, unsigned, struct path *);
#define user_path_dir(name, path) \
user_path_at(AT_FDCWD, name, LOOKUP_FOLLOW | LOOKUP_DIRECTORY, path)
+extern struct vfsmount *get_vfsmount_from_fd(int);
extern int kern_path(const char *, unsigned, struct path *);
extern int path_lookup(const char *, unsigned, struct nameidata *);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 6ab4d07..89a0ade 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -835,4 +835,7 @@ asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
struct file_handle __user *handle,
int __user *mnt_id, int flag);
+asmlinkage long sys_open_by_handle_at(int mountdirfd,
+ struct file_handle __user *handle,
+ int flags);
#endif
--
1.7.0.4
* [PATCH -V20 04/12] vfs: Add handle based readlink syscall
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (2 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 03/12] vfs: Add open by file handle support Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 05/12] vfs: Add handle based stat syscall Aneesh Kumar K.V
` (7 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/stat.c | 28 ++++++++++++++++++++++++++++
include/linux/syscalls.h | 3 +++
2 files changed, 31 insertions(+), 0 deletions(-)
diff --git a/fs/stat.c b/fs/stat.c
index 12e90e2..29052eb 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -319,6 +319,34 @@ SYSCALL_DEFINE3(readlink, const char __user *, path, char __user *, buf,
return sys_readlinkat(AT_FDCWD, path, buf, bufsiz);
}
+SYSCALL_DEFINE4(handle_readlink, int, mountdirfd,
+ struct file_handle __user *, ufh,
+ char __user *, buf, int, bufsiz)
+{
+ long retval = 0;
+ struct path path;
+ struct inode *inode;
+
+ if (bufsiz <= 0)
+ return -EINVAL;
+ retval = handle_to_path(mountdirfd, ufh, &path);
+ if (retval)
+ goto out_err;
+
+ inode = path.dentry->d_inode;
+ retval = -EINVAL;
+ if (inode->i_op->readlink) {
+ retval = security_inode_readlink(path.dentry);
+ if (!retval) {
+ touch_atime(path.mnt, path.dentry);
+ retval = inode->i_op->readlink(path.dentry,
+ buf, bufsiz);
+ }
+ }
+ path_put(&path);
+out_err:
+ return retval;
+}
/* ---------- LFS-64 ----------- */
#ifdef __ARCH_WANT_STAT64
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 89a0ade..bf03e4a 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -838,4 +838,7 @@ asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
asmlinkage long sys_open_by_handle_at(int mountdirfd,
struct file_handle __user *handle,
int flags);
+asmlinkage long sys_handle_readlink(int mountdirfd,
+ struct file_handle __user *ufh,
+ char __user *buf, int bufsiz);
#endif
--
1.7.0.4
* [PATCH -V20 05/12] vfs: Add handle based stat syscall
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (3 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 04/12] vfs: Add handle based readlink syscall Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 06/12] vfs: Add handle based link syscall Aneesh Kumar K.V
` (6 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/x86/ia32/sys_ia32.c | 13 +++++++++++++
fs/stat.c | 42 ++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 3 +++
include/linux/syscalls.h | 9 +++++++++
4 files changed, 67 insertions(+), 0 deletions(-)
diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index 849813f..7f679e4 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -138,6 +138,19 @@ asmlinkage long sys32_fstatat(unsigned int dfd, const char __user *filename,
return cp_stat64(statbuf, &stat);
}
+asmlinkage long sys32_handle_stat64(int mountdirfd,
+ struct file_handle __user *ufh,
+ struct stat64 __user *statbuf)
+{
+ struct kstat stat;
+ int error;
+
+ error = do_handle_stat(mountdirfd, ufh, &stat);
+ if (error)
+ return error;
+ return cp_stat64(statbuf, &stat);
+}
+
/*
* Linux/i386 didn't use to be able to handle more than
* 4 system call parameters, so these system calls used a memory
diff --git a/fs/stat.c b/fs/stat.c
index 29052eb..d448876 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -286,6 +286,35 @@ SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct stat __user *, statbuf)
return error;
}
+int do_handle_stat(int mountdirfd, struct file_handle __user *ufh,
+ struct kstat *stat)
+{
+ struct path path;
+ int error = -EINVAL;
+
+ error = handle_to_path(mountdirfd, ufh, &path);
+ if (error)
+ goto out;
+
+ error = vfs_getattr(path.mnt, path.dentry, stat);
+ path_put(&path);
+out:
+ return error;
+}
+
+SYSCALL_DEFINE3(handle_stat, int, mountdirfd,
+ struct file_handle __user *, ufh,
+ struct stat __user *, statbuf)
+{
+ struct kstat stat;
+ int error;
+
+ error = do_handle_stat(mountdirfd, ufh, &stat);
+ if (error)
+ return error;
+ return cp_new_stat(&stat, statbuf);
+}
+
SYSCALL_DEFINE4(readlinkat, int, dfd, const char __user *, pathname,
char __user *, buf, int, bufsiz)
{
@@ -434,6 +463,19 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, const char __user *, filename,
return error;
return cp_new_stat64(&stat, statbuf);
}
+
+SYSCALL_DEFINE3(handle_stat64, int, mountdirfd,
+ struct file_handle __user *, ufh,
+ struct stat64 __user *, statbuf)
+{
+ struct kstat stat;
+ int error;
+
+ error = do_handle_stat(mountdirfd, ufh, &stat);
+ if (error)
+ return error;
+ return cp_new_stat64(&stat, statbuf);
+}
#endif /* __ARCH_WANT_STAT64 */
/* Caller is here responsible for sufficient locking (ie. inode->i_lock) */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 63c2fd1..0bac293 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2335,6 +2335,9 @@ extern int vfs_stat(const char __user *, struct kstat *);
extern int vfs_lstat(const char __user *, struct kstat *);
extern int vfs_fstat(unsigned int, struct kstat *);
extern int vfs_fstatat(int , const char __user *, struct kstat *, int);
+extern int do_handle_stat(int mountdirfd,
+ struct file_handle __user *ufh,
+ struct kstat *stat);
extern int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd,
unsigned long arg);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index bf03e4a..de4f242 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -841,4 +841,13 @@ asmlinkage long sys_open_by_handle_at(int mountdirfd,
asmlinkage long sys_handle_readlink(int mountdirfd,
struct file_handle __user *ufh,
char __user *buf, int bufsiz);
+#if BITS_PER_LONG == 32
+asmlinkage long sys_handle_stat64(int mountdirfd,
+ struct file_handle __user *ufh,
+ struct stat64 __user *statbuf);
+#else
+asmlinkage long sys_handle_stat(int mountdirfd,
+ struct file_handle __user *ufh,
+ struct stat __user *statbuf);
+#endif
#endif
--
1.7.0.4
* [PATCH -V20 06/12] vfs: Add handle based link syscall
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (4 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 05/12] vfs: Add handle based stat syscall Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 07/12] x86: Add new syscalls for x86_32 Aneesh Kumar K.V
` (5 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/namei.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/syscalls.h | 2 ++
2 files changed, 47 insertions(+), 0 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 3439962..141cc78 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2697,6 +2697,51 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
return sys_linkat(AT_FDCWD, oldname, AT_FDCWD, newname, 0);
}
+SYSCALL_DEFINE4(handle_link, int, mountdirfd, struct file_handle __user *, uofh,
+ int, newdfd, const char __user *, newname)
+{
+ char *to;
+ int error;
+ struct dentry *new_dentry;
+ struct nameidata nd;
+ struct path old_path;
+
+ error = handle_to_path(mountdirfd, uofh, &old_path);
+ if (error)
+ return error;
+
+ error = user_path_parent(newdfd, newname, &nd, &to);
+ if (error)
+ goto out;
+ error = -EXDEV;
+ if (old_path.mnt != nd.path.mnt)
+ goto out_release;
+ new_dentry = lookup_create(&nd, 0);
+ error = PTR_ERR(new_dentry);
+ if (IS_ERR(new_dentry))
+ goto out_unlock;
+ error = mnt_want_write(nd.path.mnt);
+ if (error)
+ goto out_dput;
+ error = security_path_link(old_path.dentry, &nd.path, new_dentry);
+ if (error)
+ goto out_drop_write;
+ error = vfs_link(old_path.dentry, nd.path.dentry->d_inode, new_dentry);
+out_drop_write:
+ mnt_drop_write(nd.path.mnt);
+out_dput:
+ dput(new_dentry);
+out_unlock:
+ mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
+out_release:
+ path_put(&nd.path);
+ putname(to);
+out:
+ path_put(&old_path);
+
+ return error;
+}
+
/*
* The worst of all namespace operations - renaming directory. "Perverted"
* doesn't even start to describe it. Somebody in UCB had a heck of a trip...
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index de4f242..278d2ae 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -850,4 +850,6 @@ asmlinkage long sys_handle_stat(int mountdirfd,
struct file_handle __user *ufh,
struct stat __user *statbuf);
#endif
+asmlinkage long sys_handle_link(int mountdirfd, struct file_handle __user *uofh,
+ int newfd, const char __user *newname);
#endif
--
1.7.0.4
* [PATCH -V20 07/12] x86: Add new syscalls for x86_32
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (5 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 06/12] vfs: Add handle based link syscall Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 08/12] x86: Add new syscalls for x86_64 Aneesh Kumar K.V
` (4 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
This patch adds new syscalls to x86_32
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/x86/include/asm/unistd_32.h | 7 ++++++-
arch/x86/kernel/syscall_table_32.S | 5 +++++
2 files changed, 11 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/unistd_32.h b/arch/x86/include/asm/unistd_32.h
index b766a5e..6e4d5b5 100644
--- a/arch/x86/include/asm/unistd_32.h
+++ b/arch/x86/include/asm/unistd_32.h
@@ -346,10 +346,15 @@
#define __NR_fanotify_init 338
#define __NR_fanotify_mark 339
#define __NR_prlimit64 340
+#define __NR_name_to_handle_at 341
+#define __NR_open_by_handle_at 342
+#define __NR_readlink_by_handle 343
+#define __NR_stat64_by_handle 344
+#define __NR_link_by_handle 345
#ifdef __KERNEL__
-#define NR_syscalls 341
+#define NR_syscalls 346
#define __ARCH_WANT_IPC_PARSE_VERSION
#define __ARCH_WANT_OLD_READDIR
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index b35786d..17b3a03 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -340,3 +340,8 @@ ENTRY(sys_call_table)
.long sys_fanotify_init
.long sys_fanotify_mark
.long sys_prlimit64 /* 340 */
+ .long sys_name_to_handle_at
+ .long sys_open_by_handle_at
+ .long sys_handle_readlink
+ .long sys_handle_stat64
+ .long sys_handle_link /* 345 */
--
1.7.0.4
* [PATCH -V20 08/12] x86: Add new syscalls for x86_64
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (6 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 07/12] x86: Add new syscalls for x86_32 Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 09/12] unistd.h: Add new syscalls numbers to asm-generic Aneesh Kumar K.V
` (3 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
This patch adds new syscalls to x86_64
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/x86/ia32/ia32entry.S | 5 +++++
arch/x86/include/asm/unistd_64.h | 10 ++++++++++
2 files changed, 15 insertions(+), 0 deletions(-)
diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 518bb99..4649908 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -851,4 +851,9 @@ ia32_sys_call_table:
.quad sys_fanotify_init
.quad sys32_fanotify_mark
.quad sys_prlimit64 /* 340 */
+ .quad sys_name_to_handle_at
+ .quad compat_sys_open_by_handle_at
+ .quad sys_handle_readlink
+ .quad sys32_handle_stat64
+ .quad sys_handle_link /* 345 */
ia32_syscall_end:
diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index 363e9b8..4e13e76 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -669,6 +669,16 @@ __SYSCALL(__NR_fanotify_init, sys_fanotify_init)
__SYSCALL(__NR_fanotify_mark, sys_fanotify_mark)
#define __NR_prlimit64 302
__SYSCALL(__NR_prlimit64, sys_prlimit64)
+#define __NR_name_to_handle_at 303
+__SYSCALL(__NR_name_to_handle_at, sys_name_to_handle_at)
+#define __NR_open_by_handle_at 304
+__SYSCALL(__NR_open_by_handle_at, sys_open_by_handle_at)
+#define __NR_readlink_by_handle 305
+__SYSCALL(__NR_readlink_by_handle, sys_handle_readlink)
+#define __NR_stat_by_handle 306
+__SYSCALL(__NR_stat_by_handle, sys_handle_stat)
+#define __NR_link_by_handle 307
+__SYSCALL(__NR_link_by_handle, sys_handle_link)
#ifndef __NO_STUBS
#define __ARCH_WANT_OLD_READDIR
--
1.7.0.4
* [PATCH -V20 09/12] unistd.h: Add new syscalls numbers to asm-generic
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (7 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 08/12] x86: Add new syscalls for x86_64 Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 10/12] vfs: Export file system uuid via /proc/<pid>/mountinfo Aneesh Kumar K.V
` (2 subsequent siblings)
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
include/asm-generic/unistd.h | 12 +++++++++++-
1 files changed, 11 insertions(+), 1 deletions(-)
diff --git a/include/asm-generic/unistd.h b/include/asm-generic/unistd.h
index b969770..9469305 100644
--- a/include/asm-generic/unistd.h
+++ b/include/asm-generic/unistd.h
@@ -646,9 +646,19 @@ __SYSCALL(__NR_prlimit64, sys_prlimit64)
__SYSCALL(__NR_fanotify_init, sys_fanotify_init)
#define __NR_fanotify_mark 263
__SYSCALL(__NR_fanotify_mark, sys_fanotify_mark)
+#define __NR_name_to_handle_at 264
+__SYSCALL(__NR_name_to_handle_at, sys_name_to_handle_at)
+#define __NR_open_by_handle_at 265
+__SYSCALL(__NR_open_by_handle_at, sys_open_by_handle_at)
+#define __NR_readlink_by_handle 266
+__SYSCALL(__NR_readlink_by_handle, sys_handle_readlink)
+#define __NR_stat64_by_handle 267
+__SYSCALL(__NR_stat64_by_handle, sys_handle_stat64)
+#define __NR_link_by_handle 268
+__SYSCALL(__NR_link_by_handle, sys_handle_link)
#undef __NR_syscalls
-#define __NR_syscalls 264
+#define __NR_syscalls 269
/*
* All syscalls below here should go away really,
--
1.7.0.4
* [PATCH -V20 10/12] vfs: Export file system uuid via /proc/<pid>/mountinfo
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (8 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 09/12] unistd.h: Add new syscalls numbers to asm-generic Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 11/12] ext3: Copy fs UUID to superblock Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 12/12] ext4: " Aneesh Kumar K.V
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
We add a per-superblock uuid field. File systems should
set the uuid in their fill_super callback.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/namespace.c | 16 ++++++++++++++++
include/linux/fs.h | 1 +
2 files changed, 17 insertions(+), 0 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index a72eaab..27fd1b8 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -871,6 +871,18 @@ const struct seq_operations mounts_op = {
.show = show_vfsmnt
};
+static int uuid_is_nil(u8 *uuid)
+{
+ int i;
+ u8 *cp = (u8 *)uuid;
+
+ for (i = 0; i < 16; i++) {
+ if (*cp++)
+ return 0;
+ }
+ return 1;
+}
+
static int show_mountinfo(struct seq_file *m, void *v)
{
struct proc_mounts *p = m->private;
@@ -909,6 +921,10 @@ static int show_mountinfo(struct seq_file *m, void *v)
if (IS_MNT_UNBINDABLE(mnt))
seq_puts(m, " unbindable");
+ if (!uuid_is_nil(mnt->mnt_sb->s_uuid))
+ /* print the uuid */
+ seq_printf(m, " uuid:%pU", mnt->mnt_sb->s_uuid);
+
/* Filesystem specific data */
seq_puts(m, " - ");
show_type(m, sb);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0bac293..36c1339 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1367,6 +1367,7 @@ struct super_block {
wait_queue_head_t s_wait_unfrozen;
char s_id[32]; /* Informational name */
+ u8 s_uuid[16]; /* UUID */
void *s_fs_info; /* Filesystem private info */
fmode_t s_mode;
--
1.7.0.4
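For illustration, here is a minimal userspace sketch of the optional field that the show_mountinfo() hunk above emits. It assumes that the kernel's default %pU rendering is lowercase hex with the bytes in stored order, grouped 8-4-4-4-12. The helper name format_uuid_field() is hypothetical and not part of the patch.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 when all 16 UUID bytes are zero, mirroring the patch's uuid_is_nil(). */
static int uuid_is_nil(const unsigned char *uuid)
{
	int i;

	for (i = 0; i < 16; i++) {
		if (uuid[i])
			return 0;
	}
	return 1;
}

/*
 * Hypothetical helper (not in the patch): builds the " uuid:..." field that
 * would be appended to a mountinfo line, or an empty string for a nil UUID.
 * out needs room for at least 43 bytes (6 + 36 characters plus the NUL).
 */
static void format_uuid_field(const unsigned char *uuid, char *out, size_t outlen)
{
	if (outlen == 0)
		return;
	if (uuid_is_nil(uuid)) {
		out[0] = '\0';	/* nil UUID: the field is omitted entirely */
		return;
	}
	/* Assumed layout of the default %pU output: 8-4-4-4-12 hex digits. */
	snprintf(out, outlen,
		 " uuid:%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
		 "%02x%02x%02x%02x%02x%02x",
		 uuid[0], uuid[1], uuid[2], uuid[3],
		 uuid[4], uuid[5], uuid[6], uuid[7],
		 uuid[8], uuid[9], uuid[10], uuid[11],
		 uuid[12], uuid[13], uuid[14], uuid[15]);
}
```

A file system that never fills in s_uuid leaves it all zero, so its mountinfo line stays unchanged.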
* [PATCH -V20 11/12] ext3: Copy fs UUID to superblock.
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (9 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 10/12] vfs: Export file system uuid via /proc/<pid>/mountinfo Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 12/12] ext4: " Aneesh Kumar K.V
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
The file system UUID is made available to applications
via /proc/<pid>/mountinfo.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/ext3/super.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index 5dbf4db..6dda322 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -1918,6 +1918,7 @@ static int ext3_fill_super (struct super_block *sb, void *data, int silent)
sb->s_qcop = &ext3_qctl_operations;
sb->dq_op = &ext3_quota_operations;
#endif
+ memcpy(sb->s_uuid, es->s_uuid, sizeof(es->s_uuid));
INIT_LIST_HEAD(&sbi->s_orphan); /* unlinked but open files */
mutex_init(&sbi->s_orphan_lock);
mutex_init(&sbi->s_resize_lock);
--
1.7.0.4
* [PATCH -V20 12/12] ext4: Copy fs UUID to superblock
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
` (10 preceding siblings ...)
2010-09-28 19:36 ` [PATCH -V20 11/12] ext3: Copy fs UUID to superblock Aneesh Kumar K.V
@ 2010-09-28 19:36 ` Aneesh Kumar K.V
11 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K.V @ 2010-09-28 19:36 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
Aneesh Kumar K.V
The file system UUID is made available to applications
via /proc/<pid>/mountinfo.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/ext4/super.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 2614774..b46a78c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2941,6 +2941,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
sb->s_qcop = &ext4_qctl_operations;
sb->dq_op = &ext4_quota_operations;
#endif
+ memcpy(sb->s_uuid, es->s_uuid, sizeof(es->s_uuid));
+
INIT_LIST_HEAD(&sbi->s_orphan); /* unlinked but open files */
mutex_init(&sbi->s_orphan_lock);
mutex_init(&sbi->s_resize_lock);
--
1.7.0.4
* Re: [PATCH -V20 01/12] exportfs: Return the minimum required handle size
2010-09-28 19:36 ` [PATCH -V20 01/12] exportfs: Return the minimum required handle size Aneesh Kumar K.V
@ 2010-09-28 19:52 ` J. Bruce Fields
2010-09-29 5:34 ` Aneesh Kumar K. V
0 siblings, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-09-28 19:52 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, miklos,
linux-fsdevel, sfrench, philippe.deniel, linux-kernel
On Wed, Sep 29, 2010 at 01:06:39AM +0530, Aneesh Kumar K.V wrote:
> The exportfs encode handle function should return the minimum required
> handle size. This helps the user find the handle size by passing a
> handle size of 0 in the first step and then repeating the call with
> the returned handle size value.
Just nits; seems OK otherwise:
> diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c
> index 19ad145..250a347 100644
> --- a/fs/ocfs2/export.c
> +++ b/fs/ocfs2/export.c
> @@ -201,8 +201,14 @@ static int ocfs2_encode_fh(struct dentry *dentry, u32 *fh_in, int *max_len,
> dentry->d_name.len, dentry->d_name.name,
> fh, len, connectable);
>
> - if (len < 3 || (connectable && len < 6)) {
> + if (connectable && (len < 6)) {
> mlog(ML_ERROR, "fh buffer is too small for encoding\n");
Should that really be a printk(KERN_ERR, ...) if this is an expected use
of the interface?
> + *max_len = 6;
> + type = 255;
> + goto bail;
> + } else if (len < 3) {
> + mlog(ML_ERROR, "fh buffer is too small for encoding\n");
Ditto.
> + *max_len = 3;
> type = 255;
> goto bail;
> }
> diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> index a9cd507..acd0b2d 100644
> --- a/include/linux/exportfs.h
> +++ b/include/linux/exportfs.h
> @@ -108,8 +108,10 @@ struct fid {
> * set, the encode_fh() should store sufficient information so that a good
> * attempt can be made to find not only the file but also it's place in the
> * filesystem. This typically means storing a reference to de->d_parent in
> - * the filehandle fragment. encode_fh() should return the number of bytes
> - * stored or a negative error code such as %-ENOSPC
> + * the filehandle fragment. encode_fh() should return the fileid_type on
> + * success and on error returns 255 (if the space needed to encode fh is
> + * greater than @max_len*4 bytes). On error @max_len contain the minimum
s/contain/contains/.
> + * size(in 4 byte unit) needed to encode the file handle.
> *
> * fh_to_dentry:
> * @fh_to_dentry is given a &struct super_block (@sb) and a file handle
--b.
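
To make the contract discussed above concrete, here is a minimal userspace sketch of an encode_fh()-style callback that reports the minimum required length on overflow. mock_encode_fh() and probe_handle_words() are hypothetical stand-ins, not kernel functions. The 3-word and 6-word requirements are borrowed from the ocfs2 hunk, and the success type values are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

#define FH_OVERFLOW 255	/* value the patch uses to signal "buffer too small" */

/*
 * Hypothetical stand-in for a filesystem's encode_fh() callback.
 * *max_len is in 4-byte units. On success it is set to the number of
 * words written; on overflow it is set to the number of words required.
 */
static int mock_encode_fh(uint32_t *fh, int *max_len, int connectable)
{
	int need = connectable ? 6 : 3;
	int i;

	if (*max_len < need) {
		*max_len = need;	/* tell the caller how much space to bring */
		return FH_OVERFLOW;
	}
	for (i = 0; i < need; i++)
		fh[i] = 0x1000u + (uint32_t)i;	/* placeholder payload */
	*max_len = need;
	return connectable ? 2 : 1;	/* arbitrary non-255 type values */
}

/* Step one of the two-step protocol: probe with a zero-length buffer. */
static int probe_handle_words(int connectable)
{
	int words = 0;

	/* need > 0, so the callback returns before touching the NULL buffer */
	mock_encode_fh(NULL, &words, connectable);
	return words;
}
```

A caller would allocate probe_handle_words() * 4 bytes and then repeat the call. Because the probe is an expected use of the interface, logging an error on that path would be noisy.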
* Re: [PATCH -V20 02/12] vfs: Add name to file handle conversion support
2010-09-28 19:36 ` [PATCH -V20 02/12] vfs: Add name to file handle conversion support Aneesh Kumar K.V
@ 2010-09-28 20:30 ` J. Bruce Fields
2010-09-29 8:16 ` Aneesh Kumar K. V
0 siblings, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-09-28 20:30 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, miklos,
linux-fsdevel, sfrench, philippe.deniel, linux-kernel
On Wed, Sep 29, 2010 at 01:06:40AM +0530, Aneesh Kumar K.V wrote:
> @@ -1042,3 +1043,131 @@ int nonseekable_open(struct inode *inode, struct file *filp)
> }
>
> EXPORT_SYMBOL(nonseekable_open);
> +
> +#ifdef CONFIG_EXPORTFS
> +static long do_sys_name_to_handle(struct path *path,
> + struct file_handle __user *ufh,
> + int __user *mnt_id)
> +{
> + long retval;
> + int handle_size;
> + struct file_handle f_handle;
> + struct file_handle *handle = NULL;
> +
> + if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
> + retval = -EFAULT;
> + goto err_out;
> + }
> + if (f_handle.handle_size > MAX_HANDLE_SZ) {
Couldn't handle_size also be negative?:
> +struct file_handle {
> + int handle_size;
Say the user passes in -1.
> + retval = -EINVAL;
> + goto err_out;
> + }
> + handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
> + GFP_KERNEL);
This succeeds, but allocates too little memory.
> + if (!handle) {
> + retval = -ENOMEM;
> + goto err_out;
> + }
> +
> + /* convert handle size to multiple of sizeof(u32) */
> + handle_size = f_handle.handle_size >> 2;
Now handle_size is a large positive number.
> +
> + /* we ask for a non connected handle */
> + retval = exportfs_encode_fh(path->dentry,
> + (struct fid *)handle->f_handle,
> + &handle_size, 0);
So this succeeds, and writes past the end of the allocated handle.
As long as the interface is privileged hopefully this would be hard to
abuse. But how about just defining handle.handle_size and handle_size
as unsigned?
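
The signedness problem can be reproduced in plain userspace C. The sketch below is illustrative only: the MAX_HANDLE_SZ value is an assumption, and the helper names are hypothetical. It shows that a size of -1 slips past the signed upper-bound check and shrinks the computed allocation length, while the same check on an unsigned field rejects it.

```c
#include <stddef.h>

#define MAX_HANDLE_SZ 4096	/* assumed limit, for illustration only */

/* The check as written in the patch: handle_size is a signed int. */
static int passes_signed_check(int handle_size)
{
	return !(handle_size > MAX_HANDLE_SZ);
}

/* The same check with an unsigned field, as suggested above. */
static int passes_unsigned_check(unsigned int handle_size)
{
	return !(handle_size > MAX_HANDLE_SZ);
}

/*
 * Allocation length in the style of sizeof(header) + handle_size.
 * The signed value is converted to size_t, so a negative size wraps
 * around and the sum ends up smaller than the header itself.
 */
static size_t alloc_len(size_t header_len, int handle_size)
{
	return header_len + (size_t)handle_size;
}
```

With an 8-byte header, alloc_len(8, -1) evaluates to 7, which is the "allocates too little memory" case.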
The u32/bytes thing seems an easy source of mistakes. Would it be
possible to use "bytes" or "words" everywhere in place of "size" or
"SZ"? And, where possible, store only one or the other in a given
variable. (So do stuff like:
handle_words = f_handle.handle_bytes >> 2;
retval = exportfs_encode_fh(.,., &handle_words,.);
handle->handle_type = retval;
handle->handle_bytes = handle_words << 2;
if (handle->handle_bytes > f_handle.handle_bytes) {
...
)
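
A self-contained version of that naming scheme might look like the sketch below. It is only an illustration: fake_encode() is a hypothetical encoder that needs a fixed number of words, and struct encode_result is not a kernel type. Each variable holds either bytes or words, never both, and each conversion happens exactly once.

```c
#include <stdint.h>

struct encode_result {
	int handle_type;		/* encoder's return value (255 on overflow) */
	unsigned int handle_bytes;	/* bytes used, or bytes required on overflow */
	int overflow;			/* nonzero if the caller's buffer was too small */
};

/* Hypothetical encoder: needs need_words 32-bit words, reports them in *words. */
static int fake_encode(uint32_t *buf, unsigned int *words, unsigned int need_words)
{
	unsigned int i;

	if (*words < need_words) {
		*words = need_words;
		return 255;
	}
	for (i = 0; i < need_words; i++)
		buf[i] = i;
	*words = need_words;
	return 1;
}

static struct encode_result encode_handle(uint32_t *buf, unsigned int user_bytes,
					  unsigned int need_words)
{
	struct encode_result res;
	unsigned int handle_words = user_bytes >> 2;	/* bytes -> words, once */

	res.handle_type = fake_encode(buf, &handle_words, need_words);
	res.handle_bytes = handle_words << 2;		/* words -> bytes, once */
	res.overflow = res.handle_bytes > user_bytes;
	return res;
}
```

Using unsigned types throughout also sidesteps the negative-size case raised earlier in this message.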
By the way, apologies, I can't remember from last time: did you decide
that overflow was really the only case when 255 would be returned from
exportfs_encode_fh()?
--b.
> + /* convert handle size to bytes */
> + handle_size *= sizeof(u32);
> + handle->handle_type = retval;
> + handle->handle_size = handle_size;
> + if (handle_size > f_handle.handle_size) {
> + /*
> + * set the handle_size to zero so we copy only
> + * non variable part of the file_handle
> + */
> + handle_size = 0;
> + retval = -EOVERFLOW;
> + } else
> + retval = 0;
> + /* copy the mount id */
> + if (copy_to_user(mnt_id, &path->mnt->mnt_id, sizeof(*mnt_id))) {
> + retval = -EFAULT;
> + goto err_free_out;
> + }
> + if (copy_to_user(ufh, handle,
> + sizeof(struct file_handle) + handle_size))
> + retval = -EFAULT;
> +err_free_out:
> + kfree(handle);
> +err_out:
> + return retval;
> +}
* Re: [PATCH -V20 03/12] vfs: Add open by file handle support
2010-09-28 19:36 ` [PATCH -V20 03/12] vfs: Add open by file handle support Aneesh Kumar K.V
@ 2010-09-29 5:27 ` Aneesh Kumar K. V
0 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K. V @ 2010-09-29 5:27 UTC (permalink / raw)
To: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, bfields,
miklos
Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel
On Wed, 29 Sep 2010 01:06:41 +0530, "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> wrote:
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
> fs/compat.c | 11 +++
> fs/exportfs/expfs.c | 2 +
> fs/namei.c | 223 +++++++++++++++++++++++++++++++++++++++++++---
> fs/open.c | 32 ++++++-
> include/linux/fs.h | 10 ++-
> include/linux/namei.h | 1 +
> include/linux/syscalls.h | 3 +
> 7 files changed, 263 insertions(+), 19 deletions(-)
>
> diff --git a/fs/compat.c b/fs/compat.c
> index 0644a15..4a423fa 100644
> --- a/fs/compat.c
> +++ b/fs/compat.c
> @@ -2334,3 +2334,14 @@ asmlinkage long compat_sys_timerfd_gettime(int ufd,
> }
>
> #endif /* CONFIG_TIMERFD */
> +
> +/*
> + * Exactly like fs/open.c:sys_open_by_handle_at(), except that it
> + * doesn't set the O_LARGEFILE flag.
> + */
> +asmlinkage long
> +compat_sys_open_by_handle_at(int mountdirfd,
> + struct file_handle __user *handle, int flags)
> +{
> + return do_handle_open(mountdirfd, handle, flags);
> +}
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index cfee0f0..05a1179 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -373,6 +373,8 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
> /*
> * Try to get any dentry for the given file handle from the filesystem.
> */
> + if (!nop || !nop->fh_to_dentry)
> + return ERR_PTR(-ESTALE);
> result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type);
> if (!result)
> result = ERR_PTR(-ESTALE);
> diff --git a/fs/namei.c b/fs/namei.c
> index 24896e8..3439962 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -32,6 +32,7 @@
> #include <linux/fcntl.h>
> #include <linux/device_cgroup.h>
> #include <linux/fs_struct.h>
> +#include <linux/exportfs.h>
> #include <asm/uaccess.h>
>
> #include "internal.h"
> @@ -1050,6 +1051,29 @@ out_fail:
> return retval;
> }
>
> +struct vfsmount *get_vfsmount_from_fd(int fd)
> +{
> + int fput_needed;
> + struct path path;
> + struct file *filep;
> +
> + if (fd == AT_FDCWD) {
> + struct fs_struct *fs = current->fs;
> + spin_lock(&fs->lock);
> + path = fs->pwd;
> + mntget(path.mnt);
> + spin_lock(&fs->lock);
That should be spin_unlock. I missed an stg refresh before sending the
patch series.
-aneesh
* Re: [PATCH -V20 01/12] exportfs: Return the minimum required handle size
2010-09-28 19:52 ` J. Bruce Fields
@ 2010-09-29 5:34 ` Aneesh Kumar K. V
0 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K. V @ 2010-09-29 5:34 UTC (permalink / raw)
To: J. Bruce Fields
Cc: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, miklos,
linux-fsdevel, sfrench, philippe.deniel, linux-kernel
On Tue, 28 Sep 2010 15:52:50 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Wed, Sep 29, 2010 at 01:06:39AM +0530, Aneesh Kumar K.V wrote:
> > The exportfs encode handle function should return the minimum required
> > handle size. This helps the user find the handle size by passing a
> > handle size of 0 in the first step and then repeating the call with
> > the returned handle size value.
>
> Just nits; seems OK otherwise:
>
> > diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c
> > index 19ad145..250a347 100644
> > --- a/fs/ocfs2/export.c
> > +++ b/fs/ocfs2/export.c
> > @@ -201,8 +201,14 @@ static int ocfs2_encode_fh(struct dentry *dentry, u32 *fh_in, int *max_len,
> > dentry->d_name.len, dentry->d_name.name,
> > fh, len, connectable);
> >
> > - if (len < 3 || (connectable && len < 6)) {
> > + if (connectable && (len < 6)) {
> > mlog(ML_ERROR, "fh buffer is too small for encoding\n");
>
> Should that really be a printk(KERN_ERR, ...) if this is an expected use
> of the interface?
>
> > + *max_len = 6;
> > + type = 255;
> > + goto bail;
> > + } else if (len < 3) {
> > + mlog(ML_ERROR, "fh buffer is too small for encoding\n");
>
> Ditto.
I removed the mlog(..) call at both sites. Considering that one would
use this call path to find out the minimum required handle size, I guess
we don't want logging there.
>
> > + *max_len = 3;
> > type = 255;
> > goto bail;
> > }
> > diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> > index a9cd507..acd0b2d 100644
> > --- a/include/linux/exportfs.h
> > +++ b/include/linux/exportfs.h
> > @@ -108,8 +108,10 @@ struct fid {
> > * set, the encode_fh() should store sufficient information so that a good
> > * attempt can be made to find not only the file but also it's place in the
> > * filesystem. This typically means storing a reference to de->d_parent in
> > - * the filehandle fragment. encode_fh() should return the number of bytes
> > - * stored or a negative error code such as %-ENOSPC
> > + * the filehandle fragment. encode_fh() should return the fileid_type on
> > + * success and on error returns 255 (if the space needed to encode fh is
> > + * greater than @max_len*4 bytes). On error @max_len contain the minimum
>
> s/contain/contains/.
Fixed
>
> > + * size(in 4 byte unit) needed to encode the file handle.
> > *
> > * fh_to_dentry:
> > * @fh_to_dentry is given a &struct super_block (@sb) and a file handle
>
-aneesh
* Re: [PATCH -V20 02/12] vfs: Add name to file handle conversion support
2010-09-28 20:30 ` J. Bruce Fields
@ 2010-09-29 8:16 ` Aneesh Kumar K. V
2010-09-29 17:26 ` Sage Weil
0 siblings, 1 reply; 20+ messages in thread
From: Aneesh Kumar K. V @ 2010-09-29 8:16 UTC (permalink / raw)
To: J. Bruce Fields, sage
Cc: hch, viro, adilger, corbet, neilb, npiggin, hooanon05, miklos,
linux-fsdevel, sfrench, philippe.deniel, linux-kernel
On Tue, 28 Sep 2010 16:30:45 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Wed, Sep 29, 2010 at 01:06:40AM +0530, Aneesh Kumar K.V wrote:
> > @@ -1042,3 +1043,131 @@ int nonseekable_open(struct inode *inode, struct file *filp)
> > }
> >
> > EXPORT_SYMBOL(nonseekable_open);
> > +
> > +#ifdef CONFIG_EXPORTFS
> > +static long do_sys_name_to_handle(struct path *path,
> > + struct file_handle __user *ufh,
> > + int __user *mnt_id)
> > +{
> > + long retval;
> > + int handle_size;
> > + struct file_handle f_handle;
> > + struct file_handle *handle = NULL;
> > +
> > + if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
> > + retval = -EFAULT;
> > + goto err_out;
> > + }
> > + if (f_handle.handle_size > MAX_HANDLE_SZ) {
>
> Couldn't handle_size also be negative?:
>
> > +struct file_handle {
> > + int handle_size;
>
> Say the user passes in -1.
>
> > + retval = -EINVAL;
> > + goto err_out;
> > + }
> > + handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
> > + GFP_KERNEL);
>
> This succeeds, but allocates too little memory.
>
> > + if (!handle) {
> > + retval = -ENOMEM;
> > + goto err_out;
> > + }
> > +
> > + /* convert handle size to multiple of sizeof(u32) */
> > + handle_size = f_handle.handle_size >> 2;
>
> Now handle_size is a large positive number.
>
> > +
> > + /* we ask for a non connected handle */
> > + retval = exportfs_encode_fh(path->dentry,
> > + (struct fid *)handle->f_handle,
> > + &handle_size, 0);
>
> So this succeeds, and writes past the end of the allocated handle.
>
> As long as the interface is privileged hopefully this would be hard to
> abuse. But how about just defining handle.handle_size and handle_size
> as unsigned?
>
> The u32/bytes thing seems an easy source of mistakes. Would it be
> possible to use "bytes" or "words" everywhere in place of "size" or
> "SZ"? And, where possible, store only one or the other in a given
> variable. (So do stuff like:
>
> handle_words = f_handle.handle_bytes >> 2;
> retval = exportfs_encode_fh(.,., &handle_words,.);
> handle->handle_type = retval;
> handle->handle_bytes = handle_words << 2;
> if (handle->handle_bytes > f_handle.handle_bytes) {
> ...
> )
Updated the patch to do this. Instead of handle_words I used handle_dwords.
>
> By the way, apologies, I can't remember from last time: did you decide
> that overflow was really the only case when 255 would be returned from
> exportfs_encode_fh()?
>
All in-kernel file systems other than ceph return 255 on overflow.
ceph returns -ENOSPC in the overflow case. (I also need to fix ceph to
return the minimum handle size.) I guess the ceph usage was correct as per
the existing documentation, but the current documentation is wrong: all
the other file systems return 255 instead of ENOSPC.

We look at the handle size returned by exportfs_encode_fh and detect
the overflow case in the open-by-handle code. Maybe I should fix that to
check for both 255 and ENOSPC?

	if ((handle->handle_bytes > f_handle.handle_bytes) ||
	    (retval == 255) || (retval == -ENOSPC)) {
		/*
		 * As per the old exportfs_encode_fh documentation a file
		 * system could return ENOSPC to indicate overflow, but
		 * file systems always returned 255. So handle both values.
		 */
		/*
		 * Set handle_bytes to zero so we copy only the
		 * non-variable part of the file_handle.
		 */
		handle->handle_bytes = 0;
		retval = -EOVERFLOW;
	}

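
The check above can be modelled in user space roughly as follows. This is only a sketch under assumptions: FILEID_OVERFLOW, struct handle_result and normalize_encode_result() are hypothetical names, and the pass-through of other negative return values is my guess at the surrounding code, which is not shown in this mail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the 255 that file systems return on overflow. */
#define FILEID_OVERFLOW 255

/* Hypothetical stand-in for the struct file_handle fields used here. */
struct handle_result {
	uint32_t handle_bytes;	/* bytes of variable data to copy back */
	int handle_type;
};

/*
 * Normalize the encoder result.
 * retval:     return value of the encode call (type, 255, or -errno)
 * dwords:     length reported back by the encoder, in 4-byte units
 * user_bytes: buffer size supplied by the caller, in bytes
 * Returns 0 on success, -EOVERFLOW if the buffer was too small, and
 * (assumption) any other negative errno unchanged.
 */
int normalize_encode_result(int retval, uint32_t dwords, uint32_t user_bytes,
			    struct handle_result *out)
{
	assert(out != NULL);

	out->handle_type = retval;
	out->handle_bytes = dwords << 2;

	if (out->handle_bytes > user_bytes ||
	    retval == FILEID_OVERFLOW || retval == -ENOSPC) {
		/* copy only the fixed-size part of the handle */
		out->handle_bytes = 0;
		return -EOVERFLOW;
	}
	return retval < 0 ? retval : 0;
}
```

Treating 255 and -ENOSPC the same way keeps the caller independent of which convention a given file system follows.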
Attached ceph change below
diff --git a/fs/ceph/export.c b/fs/ceph/export.c
index 4480cb1..e38423e 100644
--- a/fs/ceph/export.c
+++ b/fs/ceph/export.c
@@ -42,32 +42,37 @@ struct ceph_nfs_confh {
 static int ceph_encode_fh(struct dentry *dentry, u32 *rawfh, int *max_len,
 			  int connectable)
 {
+	int type;
 	struct ceph_nfs_fh *fh = (void *)rawfh;
 	struct ceph_nfs_confh *cfh = (void *)rawfh;
 	struct dentry *parent = dentry->d_parent;
 	struct inode *inode = dentry->d_inode;
-	int type;
+	int connected_handle_length = sizeof(*cfh)/4;
+	int handle_length = sizeof(*fh)/4;
 
 	/* don't re-export snaps */
 	if (ceph_snap(inode) != CEPH_NOSNAP)
 		return -EINVAL;
 
-	if (*max_len >= sizeof(*cfh)) {
+	if (*max_len >= connected_handle_length) {
 		dout("encode_fh %p connectable\n", dentry);
 		cfh->ino = ceph_ino(dentry->d_inode);
 		cfh->parent_ino = ceph_ino(parent->d_inode);
 		cfh->parent_name_hash = parent->d_name.hash;
-		*max_len = sizeof(*cfh);
+		*max_len = connected_handle_length;
 		type = 2;
-	} else if (*max_len > sizeof(*fh)) {
-		if (connectable)
-			return -ENOSPC;
+	} else if (*max_len >= handle_length) {
+		if (connectable) {
+			*max_len = connected_handle_length;
+			return 255;
+		}
 		dout("encode_fh %p\n", dentry);
 		fh->ino = ceph_ino(dentry->d_inode);
-		*max_len = sizeof(*fh);
+		*max_len = handle_length;
 		type = 1;
 	} else {
-		return -ENOSPC;
+		*max_len = handle_length;
+		return 255;
 	}
 	return type;
 }
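
For illustration, the length handling in the patch above can be exercised outside the kernel with a small model. This is a sketch under assumptions: struct model_fh and struct model_confh are hypothetical packed stand-ins assumed to be 8 and 20 bytes (2 and 5 dwords), and model_encode_len() keeps only the size logic, with the inode accesses left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two handle layouts (8 and 20 bytes). */
struct model_fh {
	uint64_t ino;
} __attribute__((packed));

struct model_confh {
	uint64_t ino, parent_ino;
	uint32_t parent_name_hash;
} __attribute__((packed));

/*
 * Length logic only. *max_len is in 4-byte units on entry and on return.
 * Returns the handle type (1 or 2), or 255 when the buffer is too small,
 * in which case *max_len is set to the length that would be needed.
 */
int model_encode_len(int *max_len, int connectable)
{
	int connected_handle_length = sizeof(struct model_confh) / 4;
	int handle_length = sizeof(struct model_fh) / 4;

	assert(max_len != NULL);

	if (*max_len >= connected_handle_length) {
		*max_len = connected_handle_length;
		return 2;
	}
	if (*max_len >= handle_length) {
		if (connectable) {
			/* room for a plain handle, but not a connectable one */
			*max_len = connected_handle_length;
			return 255;
		}
		*max_len = handle_length;
		return 1;
	}
	*max_len = handle_length;
	return 255;
}
```

Comparing *max_len against lengths in dwords rather than sizeof() in bytes is the unit fix, and writing the required length back on the 255 paths is what lets a caller retry with a large enough buffer.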
-aneesh
* Re: [PATCH -V20 02/12] vfs: Add name to file handle conversion support
2010-09-29 8:16 ` Aneesh Kumar K. V
@ 2010-09-29 17:26 ` Sage Weil
2010-09-30 5:26 ` Aneesh Kumar K. V
0 siblings, 1 reply; 20+ messages in thread
From: Sage Weil @ 2010-09-29 17:26 UTC (permalink / raw)
To: Aneesh Kumar K. V
Cc: J. Bruce Fields, hch, viro, adilger, corbet, neilb, npiggin,
hooanon05, miklos, linux-fsdevel, sfrench, philippe.deniel,
linux-kernel
Hi Aneesh,
On Wed, 29 Sep 2010, Aneesh Kumar K. V wrote:
> On Tue, 28 Sep 2010 16:30:45 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > By the way, apologies, I can't remember from last time: did you decide
> > that overflow was really the only case when 255 would be returned from
> > exportfs_encode_fh()?
> >
>
> All in-kernel file systems other than ceph return 255 on overflow.
> ceph returns -ENOSPC in the overflow case. (I also need to fix ceph to
> return the minimum handle size.) I guess the ceph usage was correct as per
> the existing documentation, but the current documentation is wrong: all
> the other file systems return 255 instead of ENOSPC.
>
> We look at the handle size returned by exportfs_encode_fh and detect
> the overflow case in the open-by-handle code. Maybe I should fix that to
> check for both 255 and ENOSPC?
>
> if ((handle->handle_bytes > f_handle.handle_bytes) ||
> (retval == 255) || (retval == -ENOSPC)) {
> /* As per old exportfs_encode_fh documentation
> * we could return ENOSPC to indicate overflow
> * But file system returned 255 always. So handle
> * both the values
> */
> /*
> * set the handle_size to zero so we copy only
> * non variable part of the file_handle
> */
> handle->handle_bytes = 0;
> retval = -EOVERFLOW;
> }
>
> Attached ceph change below
This looks good to me. If ceph is the only one returning ENOSPC we may as
well fix it there (and in the documentation) and avoid adding an
additional return code check. Unless you're worried about out of tree
file systems?
In any case, I'll add the below patch to the ceph tree.
Thanks!
sage
>
> diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> index 4480cb1..e38423e 100644
> --- a/fs/ceph/export.c
> +++ b/fs/ceph/export.c
> @@ -42,32 +42,37 @@ struct ceph_nfs_confh {
> static int ceph_encode_fh(struct dentry *dentry, u32 *rawfh, int *max_len,
> int connectable)
> {
> + int type;
> struct ceph_nfs_fh *fh = (void *)rawfh;
> struct ceph_nfs_confh *cfh = (void *)rawfh;
> struct dentry *parent = dentry->d_parent;
> struct inode *inode = dentry->d_inode;
> - int type;
> + int connected_handle_length = sizeof(*cfh)/4;
> + int handle_length = sizeof(*fh)/4;
>
> /* don't re-export snaps */
> if (ceph_snap(inode) != CEPH_NOSNAP)
> return -EINVAL;
>
> - if (*max_len >= sizeof(*cfh)) {
> + if (*max_len >= connected_handle_length) {
> dout("encode_fh %p connectable\n", dentry);
> cfh->ino = ceph_ino(dentry->d_inode);
> cfh->parent_ino = ceph_ino(parent->d_inode);
> cfh->parent_name_hash = parent->d_name.hash;
> - *max_len = sizeof(*cfh);
> + *max_len = connected_handle_length;
> type = 2;
> - } else if (*max_len > sizeof(*fh)) {
> - if (connectable)
> - return -ENOSPC;
> + } else if (*max_len >= handle_length) {
> + if (connectable) {
> + *max_len = connected_handle_length;
> + return 255;
> + }
> dout("encode_fh %p\n", dentry);
> fh->ino = ceph_ino(dentry->d_inode);
> - *max_len = sizeof(*fh);
> + *max_len = handle_length;
> type = 1;
> } else {
> - return -ENOSPC;
> + *max_len = handle_length;
> + return 255;
> }
> return type;
> }
>
> -aneesh
* Re: [PATCH -V20 02/12] vfs: Add name to file handle conversion support
2010-09-29 17:26 ` Sage Weil
@ 2010-09-30 5:26 ` Aneesh Kumar K. V
0 siblings, 0 replies; 20+ messages in thread
From: Aneesh Kumar K. V @ 2010-09-30 5:26 UTC (permalink / raw)
To: Sage Weil
Cc: J. Bruce Fields, hch, viro, adilger, corbet, neilb, npiggin,
hooanon05, miklos, linux-fsdevel, sfrench, philippe.deniel,
linux-kernel
On Wed, 29 Sep 2010 10:26:32 -0700 (PDT), Sage Weil <sage@newdream.net> wrote:
> Hi Aneesh,
>
> On Wed, 29 Sep 2010, Aneesh Kumar K. V wrote:
>
> > On Tue, 28 Sep 2010 16:30:45 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > > By the way, apologies, I can't remember from last time: did you decide
> > > that overflow was really the only case when 255 would be returned from
> > > exportfs_encode_fh()?
> > >
> >
> > All in-kernel file systems other than ceph return 255 on overflow.
> > ceph returns -ENOSPC in the overflow case. (I also need to fix ceph to
> > return the minimum handle size.) I guess the ceph usage was correct as per
> > the existing documentation, but the current documentation is wrong: all
> > the other file systems return 255 instead of ENOSPC.
> >
> > We look at the handle size returned by exportfs_encode_fh and detect
> > the overflow case in the open-by-handle code. Maybe I should fix that to
> > check for both 255 and ENOSPC?
> >
> > if ((handle->handle_bytes > f_handle.handle_bytes) ||
> > (retval == 255) || (retval == -ENOSPC)) {
> > /* As per old exportfs_encode_fh documentation
> > * we could return ENOSPC to indicate overflow
> > * But file system returned 255 always. So handle
> > * both the values
> > */
> > /*
> > * set the handle_size to zero so we copy only
> > * non variable part of the file_handle
> > */
> > handle->handle_bytes = 0;
> > retval = -EOVERFLOW;
> > }
> >
> > Attached ceph change below
>
> This looks good to me. If ceph is the only one returning ENOSPC we may as
> well fix it there (and in the documentation) and avoid adding an
> additional return code check. Unless you're worried about out of tree
> file systems?
>
> In any case, I'll add the below patch to the ceph tree.
>
> Thanks!
> sage
>
>
>
>
> >
> > diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> > index 4480cb1..e38423e 100644
> > --- a/fs/ceph/export.c
> > +++ b/fs/ceph/export.c
> > @@ -42,32 +42,37 @@ struct ceph_nfs_confh {
> > static int ceph_encode_fh(struct dentry *dentry, u32 *rawfh, int *max_len,
> > int connectable)
> > {
> > + int type;
> > struct ceph_nfs_fh *fh = (void *)rawfh;
> > struct ceph_nfs_confh *cfh = (void *)rawfh;
> > struct dentry *parent = dentry->d_parent;
> > struct inode *inode = dentry->d_inode;
> > - int type;
> > + int connected_handle_length = sizeof(*cfh)/4;
> > + int handle_length = sizeof(*fh)/4;
> >
> > /* don't re-export snaps */
> > if (ceph_snap(inode) != CEPH_NOSNAP)
> > return -EINVAL;
> >
> > - if (*max_len >= sizeof(*cfh)) {
> > + if (*max_len >= connected_handle_length) {
> > dout("encode_fh %p connectable\n", dentry);
> > cfh->ino = ceph_ino(dentry->d_inode);
> > cfh->parent_ino = ceph_ino(parent->d_inode);
> > cfh->parent_name_hash = parent->d_name.hash;
> > - *max_len = sizeof(*cfh);
> > + *max_len = connected_handle_length;
> > type = 2;
> > - } else if (*max_len > sizeof(*fh)) {
> > - if (connectable)
> > - return -ENOSPC;
> > + } else if (*max_len >= handle_length) {
> > + if (connectable) {
> > + *max_len = connected_handle_length;
> > + return 255;
> > + }
> > dout("encode_fh %p\n", dentry);
> > fh->ino = ceph_ino(dentry->d_inode);
> > - *max_len = sizeof(*fh);
> > + *max_len = handle_length;
> > type = 1;
> > } else {
> > - return -ENOSPC;
> > + *max_len = handle_length;
> > + return 255;
> > }
> > return type;
> > }
> >
The part of the patch that updates *max_len on error is added by the
open-by-handle patch series for the other file systems. If I split that
part out of the patch above, it will create a merge conflict later, so I am
not sure how to handle that. But if you are OK with everything in a single
patch, you can add:

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-aneesh
Thread overview: 20+ messages
2010-09-28 19:36 [PATCH -V20 00/12] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 01/12] exportfs: Return the minimum required handle size Aneesh Kumar K.V
2010-09-28 19:52 ` J. Bruce Fields
2010-09-29 5:34 ` Aneesh Kumar K. V
2010-09-28 19:36 ` [PATCH -V20 02/12] vfs: Add name to file handle conversion support Aneesh Kumar K.V
2010-09-28 20:30 ` J. Bruce Fields
2010-09-29 8:16 ` Aneesh Kumar K. V
2010-09-29 17:26 ` Sage Weil
2010-09-30 5:26 ` Aneesh Kumar K. V
2010-09-28 19:36 ` [PATCH -V20 03/12] vfs: Add open by file handle support Aneesh Kumar K.V
2010-09-29 5:27 ` Aneesh Kumar K. V
2010-09-28 19:36 ` [PATCH -V20 04/12] vfs: Add handle based readlink syscall Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 05/12] vfs: Add handle based stat syscall Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 06/12] vfs: Add handle based link syscall Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 07/12] x86: Add new syscalls for x86_32 Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 08/12] x86: Add new syscalls for x86_64 Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 09/12] unistd.h: Add new syscalls numbers to asm-generic Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 10/12] vfs: Export file system uuid via /proc/<pid>/mountinfo Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 11/12] ext3: Copy fs UUID to superblock Aneesh Kumar K.V
2010-09-28 19:36 ` [PATCH -V20 12/12] ext4: " Aneesh Kumar K.V