linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] union mounts: In-kernel copyup v1
@ 2010-03-30 21:16 Valerie Aurora
  2010-03-30 21:16 ` [PATCH 1/4] VFS: Split inode_permission() and create path_permission() Valerie Aurora
  0 siblings, 1 reply; 5+ messages in thread
From: Valerie Aurora @ 2010-03-30 21:16 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-fsdevel

Well, not really v1, but gotta start numbering somewhere. :)

This is a proof of concept implementation of in-kernel copyup for
union mounts.  It implements copy up of files on chown(), which is in
some ways harder than rename() or link() because it doesn't lookup the
parent separately.

The idea behind this version is to keep the <mnt,dentry> pair of the
parent directory of the chown() target around in case we need to copy
it up.  If we do need to copy up, we already have the path of the
directory for the topmost file system, instead of having to look it up
again.

Valerie Aurora (4):
  VFS: Split inode_permission() and create path_permission()
  VFS: create user_path_and_parent()
  union-mount: In-kernel copyup routines
  union-mount: Implement chown()

 fs/namei.c            |  125 +++++++++++++++++++++++++++++++----
 fs/open.c             |   33 ++++++++--
 fs/union.c            |  172 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h    |    1 +
 include/linux/namei.h |    2 +
 include/linux/union.h |    2 +
 6 files changed, 315 insertions(+), 20 deletions(-)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/4] VFS: Split inode_permission() and create path_permission()
  2010-03-30 21:16 [RFC PATCH 0/4] union mounts: In-kernel copyup v1 Valerie Aurora
@ 2010-03-30 21:16 ` Valerie Aurora
  2010-03-30 21:16   ` [PATCH 2/4] VFS: create user_path_and_parent() Valerie Aurora
  0 siblings, 1 reply; 5+ messages in thread
From: Valerie Aurora @ 2010-03-30 21:16 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-fsdevel

Split inode_permission() into inode and file-system-dependent parts.
Create path_permission() to check permission based on the path to the
inode.  This is for union mounts, in which an inode can be located on
a read-only lower layer file system but is still writable, since we
will copy it up to the writable top layer file system.  So in that
case, we want to ignore MS_RDONLY on the lower layer.  To make this
decision, we must know the path (vfsmount, dentry) of both the target
and its parent.
---
 fs/namei.c         |   92 ++++++++++++++++++++++++++++++++++++++++++++--------
 include/linux/fs.h |    1 +
 2 files changed, 79 insertions(+), 14 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index e72d990..b4ba87d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -241,29 +241,20 @@ int generic_permission(struct inode *inode, int mask,
 }
 
 /**
- * inode_permission  -  check for access rights to a given inode
+ * __inode_permission  -  check for access rights to a given inode
  * @inode:	inode to check permission on
  * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
  *
  * Used to check for read/write/execute permissions on an inode.
- * We use "fsuid" for this, letting us set arbitrary permissions
- * for filesystem access without changing the "normal" uids which
- * are used for other things.
+ *
+ * This does not check for a read-only file system.  You probably want
+ * inode_permission().
  */
-int inode_permission(struct inode *inode, int mask)
+static int __inode_permission(struct inode *inode, int mask)
 {
 	int retval;
 
 	if (mask & MAY_WRITE) {
-		umode_t mode = inode->i_mode;
-
-		/*
-		 * Nobody gets write access to a read-only fs.
-		 */
-		if (IS_RDONLY(inode) &&
-		    (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
-			return -EROFS;
-
 		/*
 		 * Nobody gets write access to an immutable file.
 		 */
@@ -288,6 +279,79 @@ int inode_permission(struct inode *inode, int mask)
 }
 
 /**
+ * sb_permission  -  check superblock-level permissions
+ * @sb: superblock of inode to check permission on
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ *
+ * Separate out file-system wide checks from inode-specific permission
+ * checks.  In particular, union mounts want to check the read-only
+ * status of the top-level file system, not the lower.
+ */
+int sb_permission(struct super_block *sb, struct inode *inode, int mask)
+{
+	if (mask & MAY_WRITE) {
+		umode_t mode = inode->i_mode;
+
+		/*
+		 * Nobody gets write access to a read-only fs.
+		 */
+		if ((sb->s_flags & MS_RDONLY) &&
+		    (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
+			return -EROFS;
+	}
+	return 0;
+}
+
+/**
+ * inode_permission  -  check for access rights to a given inode
+ * @inode:	inode to check permission on
+ * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ *
+ * Used to check for read/write/execute permissions on an inode.
+ * We use "fsuid" for this, letting us set arbitrary permissions
+ * for filesystem access without changing the "normal" uids which
+ * are used for other things.
+ */
+int inode_permission(struct inode *inode, int mask)
+{
+	int retval;
+
+	retval = sb_permission(inode->i_sb, inode, mask);
+	if (retval)
+		return retval;
+	return __inode_permission(inode, mask);
+}
+
+/**
+ * path_permission - check for inode access rights depending on path
+ * @path: path of inode to check
+ * @parent_path: path of inode's parent
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ *
+ * Like inode_permission, but used to check for permission when the
+ * file may potentially be copied up between union layers.
+ */
+
+int path_permission(struct path *path, struct path *parent_path, int mask)
+{
+	struct vfsmount *mnt;
+	int retval;
+
+	/* Catch some reversal of args */
+	BUG_ON(!S_ISDIR(parent_path->dentry->d_inode->i_mode));
+
+	if (IS_MNT_UNION(parent_path->mnt))
+		mnt = parent_path->mnt;
+	else
+		mnt = path->mnt;
+
+	retval = sb_permission(mnt->mnt_sb, path->dentry->d_inode, mask);
+	if (retval)
+		return retval;
+	return __inode_permission(path->dentry->d_inode, mask);
+}
+
+/**
  * file_permission  -  check for additional access rights to a given file
  * @file:	file to check access rights for
  * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4dae882..0b1c811 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2114,6 +2114,7 @@ extern sector_t bmap(struct inode *, sector_t);
 #endif
 extern int notify_change(struct dentry *, struct iattr *);
 extern int inode_permission(struct inode *, int);
+extern int path_permission(struct path *, struct path *, int);
 extern int generic_permission(struct inode *, int,
 		int (*check_acl)(struct inode *, int));
 
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/4] VFS: create user_path_and_parent()
  2010-03-30 21:16 ` [PATCH 1/4] VFS: Split inode_permission() and create path_permission() Valerie Aurora
@ 2010-03-30 21:16   ` Valerie Aurora
  2010-03-30 21:16     ` [PATCH 3/4] union-mount: In-kernel copyup routines Valerie Aurora
  0 siblings, 1 reply; 5+ messages in thread
From: Valerie Aurora @ 2010-03-30 21:16 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-fsdevel

Proof-of-concept implementation of user_path_and_parent().  Lookup
both the parent and the target of a user-supplied filename, to supply
later to path_permission().  This is for union mounts, in which the
write permission of a file depends on the path used to access it.
---
 fs/namei.c            |   33 +++++++++++++++++++++++++++++++++
 include/linux/namei.h |    2 ++
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index b4ba87d..b4b8594 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1577,6 +1577,39 @@ static int user_path_parent(int dfd, const char __user *path,
 	return error;
 }
 
+int user_path_and_parent(int dfd, const char __user *filename,
+			 unsigned flags, struct path *parent,
+			 struct path *child)
+{
+	struct nameidata parent_nd;
+	struct nameidata child_nd;
+	char *s = getname(filename);
+	int error;
+
+	if (IS_ERR(s))
+		return PTR_ERR(s);
+
+	/* Lookup parent */
+	error = do_path_lookup(dfd, s, LOOKUP_PARENT, &parent_nd);
+	if (error)
+		goto out_putname;
+	*parent = parent_nd.path;
+
+	/* Lookup child - XXX optimize, racy */
+	error = do_path_lookup(dfd, s, flags, &child_nd);
+	if (error)
+		goto out_path_put;
+	*child = child_nd.path;
+	putname(s);
+	return 0;
+
+out_path_put:
+	path_put(&parent_nd.path);
+out_putname:
+	putname(s);
+	return error;
+}
+
 /*
  * It's inline, so penalty for filesystems that don't use sticky bit is
  * minimal.
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 05b441d..1c81b17 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -58,6 +58,8 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
 #define LOOKUP_RENAME_TARGET	0x0800
 
 extern int user_path_at(int, const char __user *, unsigned, struct path *);
+extern int user_path_and_parent(int , const char __user *, unsigned,
+				struct path *, struct path *);
 
 #define user_path(name, path) user_path_at(AT_FDCWD, name, LOOKUP_FOLLOW, path)
 #define user_lpath(name, path) user_path_at(AT_FDCWD, name, 0, path)
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/4] union-mount: In-kernel copyup routines
  2010-03-30 21:16   ` [PATCH 2/4] VFS: create user_path_and_parent() Valerie Aurora
@ 2010-03-30 21:16     ` Valerie Aurora
  2010-03-30 21:16       ` [PATCH 4/4] union-mount: Implement chown() Valerie Aurora
  0 siblings, 1 reply; 5+ messages in thread
From: Valerie Aurora @ 2010-03-30 21:16 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-fsdevel

When a file on the read-only layer of a union mount is altered, it
must be copied up to the topmost read-write layer.  This patch creates
union_copyup_path() and its supporting routines.
---
 fs/union.c            |  172 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/union.h |    2 +
 2 files changed, 174 insertions(+), 0 deletions(-)

diff --git a/fs/union.c b/fs/union.c
index e2384ad..2b12085 100644
--- a/fs/union.c
+++ b/fs/union.c
@@ -26,6 +26,7 @@
 #include <linux/namei.h>
 #include <linux/file.h>
 #include <linux/security.h>
+#include <linux/splice.h>
 
 /*
  * This is borrowed from fs/inode.c. The hashtable for lookups. Somebody
@@ -633,3 +634,174 @@ out_fput:
 	mnt_drop_write(topmost_path->mnt);
 	return res;
 }
+
+/**
+ * union_create_file
+ *
+ * @parent: path of the parent dir
+ * @old: path of the source file
+ * @new: path of the new file, negative dentry
+ *
+ * Must already have mnt_want_write() on the mnt and the parent's
+ * i_mutex.
+ */
+
+static int union_create_file(struct path *parent, struct path *old,
+			     struct dentry *new)
+{
+	BUG_ON(!mutex_is_locked(&parent->dentry->d_inode->i_mutex));
+
+	/*
+	 * XXX - Need to initialize real nameidata, or else pass it
+	 * all the way through from original user_path_and_parent()
+	 * lookup.
+	 */
+	return vfs_create(parent->dentry->d_inode, new,
+			  old->dentry->d_inode->i_mode, NULL);
+}
+
+/**
+ * union_copyup_data - Copy up len bytes of old's data to new
+ *
+ * @old: source file
+ * @new: target file
+ * @len: number of bytes to copy
+ */
+
+static int union_copyup_data(struct path *old, struct vfsmount *new_mnt,
+			     struct dentry *new_dentry, size_t len)
+{
+	struct file *old_file;
+	struct file *new_file;
+	const struct cred *cred = current_cred();
+	loff_t offset = 0;
+	long bytes;
+	int error;
+
+	if (len == 0)
+		return 0;
+
+	/* Get reference to balance later fput() */
+	path_get(old);
+	old_file = dentry_open(old->dentry, old->mnt, O_RDONLY, cred);
+	if (IS_ERR(old_file))
+		return PTR_ERR(old_file);
+
+	dget(new_dentry);
+	mntget(new_mnt);
+	new_file = dentry_open(new_dentry, new_mnt, O_WRONLY, cred);
+	if (IS_ERR(new_file)) {
+		error = PTR_ERR(new_file);
+		goto out_fput;
+	}
+
+	bytes = do_splice_direct(old_file, &offset, new_file, len,
+				 SPLICE_F_MOVE);
+	if (bytes < 0)
+		error = bytes;
+
+	fput(new_file);
+out_fput:
+	fput(old_file);
+	return error;
+}
+
+/**
+ * union_copyup_path_len - Copy up a file and len bytes of data
+ *
+ * @parent: parent directory's path
+ * @path: path of file to be copied up
+ * @len: number of bytes of file data to copy up
+ *
+ * Parent's i_mutex must be held by caller.  Newly copied up path is
+ * returned in @path and original is path_put().
+ */
+
+int union_copyup_path_len(struct path *parent, struct path *path, size_t len)
+{
+	struct dentry *dentry;
+	int error;
+
+	BUG_ON(!mutex_is_locked(&parent->dentry->d_inode->i_mutex));
+
+	dentry = lookup_one_len(path->dentry->d_name.name, parent->dentry,
+				path->dentry->d_name.len);
+	if (IS_ERR(dentry))
+		return PTR_ERR(dentry);
+
+	if (dentry->d_inode) {
+		/*
+		 * We raced with someone else and "lost."  That's
+		 * okay, they did all the work of copying up the file.
+		 * Note that currently data copyup happens under the
+		 * parent dir's i_mutex.  If we move it outside that,
+		 * we'll need some way of waiting for the data copyup
+		 * to complete here.
+		 */
+		error = 0;
+		goto out_newpath;
+	}
+	/* Create file */
+	error = union_create_file(parent, path, dentry);
+	if (error)
+		goto out_dput;
+	/* Copyup data */
+	error = union_copyup_data(path, parent->mnt, dentry, len);
+	if (error) {
+		/* Most likely error: ENOSPC */
+		vfs_unlink(parent->dentry->d_inode, dentry);
+		goto out_dput;
+	}
+	/* XXX Copyup xattrs and any other dangly bits */
+out_newpath:
+	/* path_put() of original must happen before we copy in new */
+	path_put(path);
+	path->dentry = dentry;
+	path->mnt = mntget(parent->mnt);
+	return error;
+out_dput:
+	/* Don't path_put(path), let caller unwind */
+	dput(dentry);
+	return error;
+}
+
+/**
+ * union_copyup_path - Copy up a file given its path (and its parent's)
+ *
+ * @parent: parent directory's path
+ * @path: path of file to be copied up
+ * @newpath: return path of newly copied up file
+ */
+
+int union_copyup_path(struct path *parent, struct path *path)
+{
+	size_t len;
+	int error;
+
+	if (!IS_MNT_UNION(parent->mnt))
+		return 0;
+	if (parent->mnt == path->mnt)
+		return 0;
+
+	/* Catch swapped arguments.  At this point, path can only be a file. */
+	BUG_ON(!S_ISDIR(parent->dentry->d_inode->i_mode));
+	BUG_ON(!S_ISREG(path->dentry->d_inode->i_mode));
+
+	mutex_lock(&parent->dentry->d_inode->i_mutex);
+	if (IS_DEADDIR(parent->dentry->d_inode)) {
+		error = -ENOENT;
+		goto out_unlock;
+	}
+
+	/* XXX - need more locking than parent i_mutex? */
+	len = i_size_read(path->dentry->d_inode);
+	if (((size_t)len != len) || ((ssize_t)len != len)) {
+		error = -EFBIG;
+		goto out_unlock;
+	}
+
+	error = union_copyup_path_len(parent, path, len);
+out_unlock:
+	mutex_unlock(&parent->dentry->d_inode->i_mutex);
+	return error;
+}
diff --git a/include/linux/union.h b/include/linux/union.h
index 66deeb2..5ff59c6 100644
--- a/include/linux/union.h
+++ b/include/linux/union.h
@@ -52,6 +52,7 @@ extern struct dentry * union_create_topmost_dir(struct path *, struct qstr *,
 extern int attach_mnt_union(struct vfsmount *, struct vfsmount *);
 extern void detach_mnt_union(struct vfsmount *);
 extern int union_copyup_dir(struct path *);
+extern int union_copyup_path(struct path *, struct path *);
 
 #else /* CONFIG_UNION_MOUNT */
 
@@ -66,6 +67,7 @@ extern int union_copyup_dir(struct path *);
 #define attach_mnt_union(x, y)		do { } while (0)
 #define detach_mnt_union(x)		do { } while (0)
 #define union_copyup_dir(x)		({ BUG(); (0); })
+#define union_copyup_path(x, y)		({ BUG(); (NULL); })
 
 #endif	/* CONFIG_UNION_MOUNT */
 #endif	/* __KERNEL__ */
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 4/4] union-mount: Implement chown()
  2010-03-30 21:16     ` [PATCH 3/4] union-mount: In-kernel copyup routines Valerie Aurora
@ 2010-03-30 21:16       ` Valerie Aurora
  0 siblings, 0 replies; 5+ messages in thread
From: Valerie Aurora @ 2010-03-30 21:16 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-fsdevel

Proof-of-concept implementation of chown() for union mounts.  Copies
up the target file even if later permission checks fail.
---
 fs/open.c |   33 +++++++++++++++++++++++++++------
 1 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index e17f544..55d6b5b 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -30,6 +30,7 @@
 #include <linux/falloc.h>
 #include <linux/fs_struct.h>
 #include <linux/ima.h>
+#include <linux/union.h>
 
 #include "internal.h"
 
@@ -704,19 +705,39 @@ static int chown_common(struct path *path, uid_t user, gid_t group)
 
 SYSCALL_DEFINE3(chown, const char __user *, filename, uid_t, user, gid_t, group)
 {
-	struct path path;
+	struct path parent;
+	struct path child;
+	struct vfsmount *mnt;
 	int error;
 
-	error = user_path(filename, &path);
+	error = user_path_and_parent(AT_FDCWD, filename, 0, &parent, &child);
 	if (error)
 		goto out;
-	error = mnt_want_write(path.mnt);
+
+	if (IS_MNT_UNION(parent.mnt))
+		mnt = parent.mnt;
+	else
+		mnt = child.mnt;
+
+	error = mnt_want_write(mnt);
 	if (error)
 		goto out_release;
-	error = chown_common(&path, user, group);
-	mnt_drop_write(path.mnt);
+	/*
+	 * XXX Will copy up when operation fails, e.g., EPERM.  We
+	 * have to either (1) separate chown_common() into chown_ok()
+	 * and chown_change(), or (2) push both parent and child path
+	 * down into every check that could return failure.  But this
+	 * is good enough for a proof of concept.
+	 */
+	error = union_copyup_path(&parent, &child);
+	if (error)
+		goto out_drop_write;
+	error = chown_common(&child, user, group);
+out_drop_write:
+	mnt_drop_write(mnt);
 out_release:
-	path_put(&path);
+	path_put(&child);
+	path_put(&parent);
 out:
 	return error;
 }
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-03-30 21:17 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-30 21:16 [RFC PATCH 0/4] union mounts: In-kernel copyup v1 Valerie Aurora
2010-03-30 21:16 ` [PATCH 1/4] VFS: Split inode_permission() and create path_permission() Valerie Aurora
2010-03-30 21:16   ` [PATCH 2/4] VFS: create user_path_and_parent() Valerie Aurora
2010-03-30 21:16     ` [PATCH 3/4] union-mount: In-kernel copyup routines Valerie Aurora
2010-03-30 21:16       ` [PATCH 4/4] union-mount: Implement chown() Valerie Aurora

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).