linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 00/23] Mount writer count API (read-only bind mounts prep)
@ 2007-07-12  0:17 Dave Hansen
  2007-07-12  0:17 ` [PATCH 01/23] rearrange may_open() to be r/o friendly Dave Hansen
                   ` (22 more replies)
  0 siblings, 23 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen

The most contentious part of the r/o bind mount patches is
actually implementing the count tracking.  It has NUMA and
SMP implications, and that one patch is going to need a
whole discussion of its own.

These patches, on the other hand, simply introduce a new
API: mnt_want_write() and mnt_drop_write().  They do not
functionally change the kernel, just alter the way in
which we check filesystems for our ability to write to
them.

These functions should be used in place of IS_RDONLY(inode)
as they explicitly spell out when a mount is expected to
_stay_ r/w instead of simply checking at a single point
in time.

It should take a very small number (like 3) of small
patches to actually implement read-only bind mounts on
top of this new API.

These apply to current -git (as of July 11th, 2007).

---

Why do we need r/o bind mounts?

This feature allows a read-only view into a read-write filesystem.
In the process of doing that, it also provides infrastructure for
keeping track of the number of writers to any given mount.

This has a number of uses.  It allows chroots to have parts of
filesystems writable.  It will be useful for containers in the future
because users may have root inside a container, but should not
be allowed to write to some filesystems.  This also replaces
patches that vserver has had out of the tree for several years.

It allows security enhancement by making sure that parts of
your filesystem are read-only (such as when you don't trust your
FTP server), when you don't want to have entire new filesystems
mounted, or when you want atime selectively updated.

I've been using the following script to test that the feature is
working as desired.  It takes a directory and makes a regular
bind and a r/o bind mount of it.  It then performs some normal
filesystem operations on the three directories, including ones
that are expected to fail, like creating a file on the r/o
mount.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>


* [PATCH 01/23] rearrange may_open() to be r/o friendly
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 02/23] create cleanup helper svc_msnfs() Dave Hansen
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


may_open() calls vfs_permission() before it does checks for
IS_RDONLY(inode).  It checks _again_ inside of vfs_permission().

The check inside of vfs_permission() is going away eventually.
With the mnt_want/drop_write() functions, all of the r/o
checks (except for this one) are consistently done before
calling permission().  Because of this, I'd like to use
permission() to hold a debugging check to make sure that
the mnt_want/drop_write() calls are actually being made.

So, to do this:
1. remove the IS_RDONLY() check from permission()
2. enforce that you must mnt_want_write() before
   even calling permission()
3. enable a debugging check in permission()
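
To make the end state of those three steps concrete, here is a
rough userspace model, not kernel code.  Every model_* name is
made up for illustration, and the writer counter is an assumption:
the stubs in this series do not count anything yet.  The model
permission() no longer does an r/o check itself; it only notes
when a write-type check arrives without the caller having taken
write access first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAY_WRITE 2

/* Hypothetical stand-in for struct vfsmount. */
struct model_mnt {
	int readonly;	/* models the r/o state of the mount */
	int writers;	/* assumed writer count, not in this series */
};

/* Number of times the debugging check fired. */
static int model_debug_hits;

static int model_want_write(struct model_mnt *mnt)
{
	if (mnt->readonly)
		return -EROFS;
	mnt->writers++;
	return 0;
}

static void model_drop_write(struct model_mnt *mnt)
{
	mnt->writers--;
}

/*
 * Step 1: no r/o check in here any more.
 * Step 3: a debugging check that the caller did step 2.
 */
static int model_permission(struct model_mnt *mnt, int mask)
{
	if ((mask & MODEL_MAY_WRITE) && mnt && mnt->writers == 0)
		model_debug_hits++;	/* a kernel might WARN_ON() here */
	return 0;
}

/* Step 2: take write access before even calling permission(). */
static int model_write_op(struct model_mnt *mnt)
{
	int err = model_want_write(mnt);

	if (err)
		return err;
	err = model_permission(mnt, MODEL_MAY_WRITE);
	model_drop_write(mnt);
	return err;
}
```

A caller that goes through model_write_op() never trips the check;
calling model_permission() for write on its own does.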

We need to rearrange may_open().  Here's the patch.

---

 lxc-dave/fs/namei.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff -puN fs/namei.c~rearrange-permission-and-ro-checks-in-may_open fs/namei.c
--- lxc/fs/namei.c~rearrange-permission-and-ro-checks-in-may_open	2007-07-10 12:46:01.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-07-10 12:46:01.000000000 -0700
@@ -228,6 +228,10 @@ int permission(struct inode *inode, int 
 {
 	umode_t mode = inode->i_mode;
 	int retval, submask;
+	struct vfsmount *mnt = NULL;
+
+	if (nd)
+		mnt = nd->mnt;
 
 	if (mask & MAY_WRITE) {
 
@@ -252,7 +256,7 @@ int permission(struct inode *inode, int 
 	 * the fs is mounted with the "noexec" flag.
 	 */
 	if ((mask & MAY_EXEC) && S_ISREG(mode) && (!(mode & S_IXUGO) ||
-			(nd && nd->mnt && (nd->mnt->mnt_flags & MNT_NOEXEC))))
+			(mnt && (mnt->mnt_flags & MNT_NOEXEC))))
 		return -EACCES;
 
 	/* Ordinary permission routines do not understand MAY_APPEND. */
@@ -1546,10 +1550,6 @@ int may_open(struct nameidata *nd, int a
 	if (S_ISDIR(inode->i_mode) && (flag & FMODE_WRITE))
 		return -EISDIR;
 
-	error = vfs_permission(nd, acc_mode);
-	if (error)
-		return error;
-
 	/*
 	 * FIFO's, sockets and device files are special: they don't
 	 * actually live on the filesystem itself, and as such you
@@ -1564,6 +1564,10 @@ int may_open(struct nameidata *nd, int a
 		flag &= ~O_TRUNC;
 	} else if (IS_RDONLY(inode) && (flag & FMODE_WRITE))
 		return -EROFS;
+
+	error = vfs_permission(nd, acc_mode);
+	if (error)
+		return error;
 	/*
 	 * An append-only file must be opened in append mode for writing.
 	 */
_


* [PATCH 02/23] create cleanup helper svc_msnfs()
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
  2007-07-12  0:17 ` [PATCH 01/23] rearrange may_open() to be r/o friendly Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 03/23] filesystem helpers for custom 'struct file's Dave Hansen
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


I'm going to be modifying nfsd_rename() shortly to support
read-only bind mounts.  This #ifdef is around the area I'm
patching, and it starts to get really ugly if I just try
to add my new code by itself.  Using this little helper
makes things a lot cleaner to use.
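
The general shape of this cleanup, as a standalone sketch: the
#ifdef lives in one small inline helper and every caller becomes
an ordinary if().  The MODEL_* names and the flag value below are
placeholders for illustration, not the real nfsd definitions.

```c
#include <assert.h>

#define MODEL_EXP_MSNFS 0x1000	/* placeholder flag value */

/* Hypothetical stand-in for the export structure. */
struct model_export {
	int ex_flags;
};

/* The only place that knows about the config option. */
static inline int model_msnfs(const struct model_export *exp)
{
#ifdef MODEL_MSNFS
	return (exp->ex_flags & MODEL_EXP_MSNFS) != 0;
#else
	(void)exp;	/* option compiled out: never MSNFS */
	return 0;
#endif
}

/* A caller: returns 1 if the read has to be refused. */
static int model_read_blocked(const struct model_export *exp,
			      int lock_allows_read)
{
	return model_msnfs(exp) && !lock_allows_read;
}
```

With the option compiled out the helper is a constant 0, so the
compiler can drop the whole condition, much as the #ifdef did.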

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/nfsd/vfs.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff -puN fs/nfsd/vfs.c~create-svc_msnfs-helper fs/nfsd/vfs.c
--- lxc/fs/nfsd/vfs.c~create-svc_msnfs-helper	2007-07-10 12:46:02.000000000 -0700
+++ lxc-dave/fs/nfsd/vfs.c	2007-07-10 12:46:02.000000000 -0700
@@ -837,6 +837,15 @@ nfsd_read_actor(read_descriptor_t *desc,
 	return size;
 }
 
+static inline int svc_msnfs(struct svc_fh *ffhp)
+{
+#ifdef MSNFS
+	return (ffhp->fh_export->ex_flags & NFSEXP_MSNFS);
+#else
+	return 0;
+#endif
+}
+
 static __be32
 nfsd_vfs_read(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
               loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
@@ -849,11 +858,9 @@ nfsd_vfs_read(struct svc_rqst *rqstp, st
 
 	err = nfserr_perm;
 	inode = file->f_path.dentry->d_inode;
-#ifdef MSNFS
-	if ((fhp->fh_export->ex_flags & NFSEXP_MSNFS) &&
-		(!lock_may_read(inode, offset, *count)))
+
+	if (svc_msnfs(fhp) && !lock_may_read(inode, offset, *count))
 		goto out;
-#endif
 
 	/* Get readahead parameters */
 	ra = nfsd_get_raparms(inode->i_sb->s_dev, inode->i_ino);
_


* [PATCH 03/23] filesystem helpers for custom 'struct file's
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
  2007-07-12  0:17 ` [PATCH 01/23] rearrange may_open() to be r/o friendly Dave Hansen
  2007-07-12  0:17 ` [PATCH 02/23] create cleanup helper svc_msnfs() Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 04/23] r/o bind mounts: stub functions Dave Hansen
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


Christoph H. says this stands on its own and can go in before the
rest of the r/o bind mount set.  

---

Some filesystems forgo the vfs and may_open() and create their
own 'struct file's.

This patch creates a couple of helper functions which can be
used by these filesystems, and will provide a unified place
which the r/o bind mount code may patch.

Also, rename an existing, static-scope init_file() to a less
generic name.
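
The split between the two helpers can be modelled in userspace
roughly like this.  It is only a sketch: the model_* types are
invented, calloc() stands in for get_empty_filp(), and a plain
counter stands in for mntget()/mntput().

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_FMODE_READ  1u
#define MODEL_FMODE_WRITE 2u

/* Hypothetical stand-ins for vfsmount and struct file. */
struct model_mnt {
	int refs;
};

struct model_file {
	struct model_mnt *mnt;
	const char *name;
	unsigned int mode;
};

/*
 * For callers that already have a file and complex failure
 * paths: fill in the fields they used to set by hand.
 */
static void model_init_file(struct model_file *f, struct model_mnt *mnt,
			    const char *name, unsigned int mode)
{
	mnt->refs++;		/* models mntget() */
	f->mnt = mnt;
	f->name = name;
	f->mode = mode;
}

/* Preferred form: allocate and initialise in one step. */
static struct model_file *model_alloc_file(struct model_mnt *mnt,
					   const char *name,
					   unsigned int mode)
{
	struct model_file *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;
	model_init_file(f, mnt, name, mode);
	return f;
}

static void model_put_file(struct model_file *f)
{
	f->mnt->refs--;		/* models mntput() */
	free(f);
}
```

Because every such file now passes through one init routine,
later per-mount bookkeeping only has to be added in one place.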

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/configfs/dir.c    |    5 +++--
 lxc-dave/fs/file_table.c      |   34 ++++++++++++++++++++++++++++++++++
 lxc-dave/fs/hugetlbfs/inode.c |   22 +++++++++-------------
 lxc-dave/include/linux/file.h |    9 +++++++++
 lxc-dave/ipc/shm.c            |   13 +++++--------
 lxc-dave/mm/shmem.c           |    7 ++-----
 lxc-dave/mm/tiny-shmem.c      |   19 +++++++------------
 lxc-dave/net/socket.c         |   18 +++++++++---------
 8 files changed, 78 insertions(+), 49 deletions(-)

diff -puN fs/configfs/dir.c~filesystem-helpers-for-custom-struct-file-s fs/configfs/dir.c
--- lxc/fs/configfs/dir.c~filesystem-helpers-for-custom-struct-file-s	2007-07-10 12:46:03.000000000 -0700
+++ lxc-dave/fs/configfs/dir.c	2007-07-10 12:46:03.000000000 -0700
@@ -142,7 +142,7 @@ static int init_dir(struct inode * inode
 	return 0;
 }
 
-static int init_file(struct inode * inode)
+static int configfs_init_file(struct inode * inode)
 {
 	inode->i_size = PAGE_SIZE;
 	inode->i_fop = &configfs_file_operations;
@@ -283,7 +283,8 @@ static int configfs_attach_attr(struct c
 
 	dentry->d_fsdata = configfs_get(sd);
 	sd->s_dentry = dentry;
-	error = configfs_create(dentry, (attr->ca_mode & S_IALLUGO) | S_IFREG, init_file);
+	error = configfs_create(dentry, (attr->ca_mode & S_IALLUGO) | S_IFREG,
+				configfs_init_file);
 	if (error) {
 		configfs_put(sd);
 		return error;
diff -puN fs/file_table.c~filesystem-helpers-for-custom-struct-file-s fs/file_table.c
--- lxc/fs/file_table.c~filesystem-helpers-for-custom-struct-file-s	2007-07-10 12:46:03.000000000 -0700
+++ lxc-dave/fs/file_table.c	2007-07-10 12:46:03.000000000 -0700
@@ -138,6 +138,40 @@ fail:
 
 EXPORT_SYMBOL(get_empty_filp);
 
+struct file *alloc_file(struct vfsmount *mnt, struct dentry *dentry,
+		mode_t mode, const struct file_operations *fop)
+{
+	struct file *file;
+	struct path;
+
+	file = get_empty_filp();
+	if (!file)
+		return NULL;
+
+	init_file(file, mnt, dentry, mode, fop);
+	return file;
+}
+EXPORT_SYMBOL(alloc_file);
+
+/*
+ * Note: This is a crappy interface.  It is here to make
+ * merging with the existing users of get_empty_filp()
+ * who have complex failure logic easier.  All users
+ * of this should be moving to alloc_file().
+ */
+int init_file(struct file *file, struct vfsmount *mnt, struct dentry *dentry,
+	   mode_t mode, const struct file_operations *fop)
+{
+	int error = 0;
+	file->f_path.dentry = dentry;
+	file->f_path.mnt = mntget(mnt);
+	file->f_mapping = dentry->d_inode->i_mapping;
+	file->f_mode = mode;
+	file->f_op = fop;
+	return error;
+}
+EXPORT_SYMBOL(init_file);
+
 void fastcall fput(struct file *file)
 {
 	if (atomic_dec_and_test(&file->f_count))
diff -puN fs/hugetlbfs/inode.c~filesystem-helpers-for-custom-struct-file-s fs/hugetlbfs/inode.c
--- lxc/fs/hugetlbfs/inode.c~filesystem-helpers-for-custom-struct-file-s	2007-07-10 12:46:03.000000000 -0700
+++ lxc-dave/fs/hugetlbfs/inode.c	2007-07-10 12:46:03.000000000 -0700
@@ -761,16 +761,11 @@ struct file *hugetlb_file_setup(const ch
 	if (!dentry)
 		goto out_shm_unlock;
 
-	error = -ENFILE;
-	file = get_empty_filp();
-	if (!file)
-		goto out_dentry;
-
 	error = -ENOSPC;
 	inode = hugetlbfs_get_inode(root->d_sb, current->fsuid,
 				current->fsgid, S_IFREG | S_IRWXUGO, 0);
 	if (!inode)
-		goto out_file;
+		goto out_dentry;
 
 	error = -ENOMEM;
 	if (hugetlb_reserve_pages(inode, 0, size >> HPAGE_SHIFT))
@@ -779,17 +774,18 @@ struct file *hugetlb_file_setup(const ch
 	d_instantiate(dentry, inode);
 	inode->i_size = size;
 	inode->i_nlink = 0;
-	file->f_path.mnt = mntget(hugetlbfs_vfsmount);
-	file->f_path.dentry = dentry;
-	file->f_mapping = inode->i_mapping;
-	file->f_op = &hugetlbfs_file_operations;
-	file->f_mode = FMODE_WRITE | FMODE_READ;
+
+	error = -ENFILE;
+	file = alloc_file(hugetlbfs_vfsmount, dentry,
+			FMODE_WRITE | FMODE_READ,
+			&hugetlbfs_file_operations);
+	if (!file)
+		goto out_inode;
+
 	return file;
 
 out_inode:
 	iput(inode);
-out_file:
-	put_filp(file);
 out_dentry:
 	dput(dentry);
 out_shm_unlock:
diff -puN include/linux/file.h~filesystem-helpers-for-custom-struct-file-s include/linux/file.h
--- lxc/include/linux/file.h~filesystem-helpers-for-custom-struct-file-s	2007-07-10 12:46:03.000000000 -0700
+++ lxc-dave/include/linux/file.h	2007-07-10 12:46:03.000000000 -0700
@@ -62,6 +62,15 @@ extern struct kmem_cache *filp_cachep;
 extern void FASTCALL(__fput(struct file *));
 extern void FASTCALL(fput(struct file *));
 
+struct file_operations;
+struct vfsmount;
+struct dentry;
+extern int init_file(struct file *, struct vfsmount *mnt,
+		struct dentry *dentry, mode_t mode,
+		const struct file_operations *fop);
+extern struct file *alloc_file(struct vfsmount *, struct dentry *dentry,
+		mode_t mode, const struct file_operations *fop);
+
 static inline void fput_light(struct file *file, int fput_needed)
 {
 	if (unlikely(fput_needed))
diff -puN ipc/shm.c~filesystem-helpers-for-custom-struct-file-s ipc/shm.c
--- lxc/ipc/shm.c~filesystem-helpers-for-custom-struct-file-s	2007-07-10 12:46:03.000000000 -0700
+++ lxc-dave/ipc/shm.c	2007-07-10 12:46:03.000000000 -0700
@@ -906,7 +906,7 @@ long do_shmat(int shmid, char __user *sh
 		goto out_unlock;
 
 	path.dentry = dget(shp->shm_file->f_path.dentry);
-	path.mnt    = mntget(shp->shm_file->f_path.mnt);
+	path.mnt    = shp->shm_file->f_path.mnt;
 	shp->shm_nattch++;
 	size = i_size_read(path.dentry->d_inode);
 	shm_unlock(shp);
@@ -914,18 +914,16 @@ long do_shmat(int shmid, char __user *sh
 	err = -ENOMEM;
 	sfd = kzalloc(sizeof(*sfd), GFP_KERNEL);
 	if (!sfd)
-		goto out_put_path;
+		goto out_put_dentry;
 
 	err = -ENOMEM;
-	file = get_empty_filp();
+
+	file = alloc_file(path.mnt, path.dentry, f_mode, &shm_file_operations);
 	if (!file)
 		goto out_free;
 
-	file->f_op = &shm_file_operations;
 	file->private_data = sfd;
-	file->f_path = path;
 	file->f_mapping = shp->shm_file->f_mapping;
-	file->f_mode = f_mode;
 	sfd->id = shp->id;
 	sfd->ns = get_ipc_ns(ns);
 	sfd->file = shp->shm_file;
@@ -976,9 +974,8 @@ out_unlock:
 
 out_free:
 	kfree(sfd);
-out_put_path:
+out_put_dentry:
 	dput(path.dentry);
-	mntput(path.mnt);
 	goto out_nattch;
 }
 
diff -puN mm/shmem.c~filesystem-helpers-for-custom-struct-file-s mm/shmem.c
--- lxc/mm/shmem.c~filesystem-helpers-for-custom-struct-file-s	2007-07-10 12:46:03.000000000 -0700
+++ lxc-dave/mm/shmem.c	2007-07-10 12:46:03.000000000 -0700
@@ -2566,11 +2566,8 @@ struct file *shmem_file_setup(char *name
 	d_instantiate(dentry, inode);
 	inode->i_size = size;
 	inode->i_nlink = 0;	/* It is unlinked */
-	file->f_path.mnt = mntget(shm_mnt);
-	file->f_path.dentry = dentry;
-	file->f_mapping = inode->i_mapping;
-	file->f_op = &shmem_file_operations;
-	file->f_mode = FMODE_WRITE | FMODE_READ;
+	init_file(file, shm_mnt, dentry, FMODE_WRITE | FMODE_READ,
+			&shmem_file_operations);
 	return file;
 
 close_file:
diff -puN mm/tiny-shmem.c~filesystem-helpers-for-custom-struct-file-s mm/tiny-shmem.c
--- lxc/mm/tiny-shmem.c~filesystem-helpers-for-custom-struct-file-s	2007-07-10 12:46:03.000000000 -0700
+++ lxc-dave/mm/tiny-shmem.c	2007-07-10 12:46:03.000000000 -0700
@@ -66,24 +66,19 @@ struct file *shmem_file_setup(char *name
 	if (!dentry)
 		goto put_memory;
 
-	error = -ENFILE;
-	file = get_empty_filp();
-	if (!file)
-		goto put_dentry;
-
 	error = -ENOSPC;
 	inode = ramfs_get_inode(root->d_sb, S_IFREG | S_IRWXUGO, 0);
 	if (!inode)
-		goto close_file;
+		goto put_dentry;
 
 	d_instantiate(dentry, inode);
-	inode->i_nlink = 0;	/* It is unlinked */
+	error = -ENFILE;
+	file = alloc_file(shm_mnt, dentry, FMODE_WRITE | FMODE_READ,
+			&ramfs_file_operations);
+	if (!file)
+		goto put_dentry;
 
-	file->f_path.mnt = mntget(shm_mnt);
-	file->f_path.dentry = dentry;
-	file->f_mapping = inode->i_mapping;
-	file->f_op = &ramfs_file_operations;
-	file->f_mode = FMODE_WRITE | FMODE_READ;
+	inode->i_nlink = 0;	/* It is unlinked */
 
 	/* notify everyone as to the change of file size */
 	error = do_truncate(dentry, size, 0, file);
diff -puN net/socket.c~filesystem-helpers-for-custom-struct-file-s net/socket.c
--- lxc/net/socket.c~filesystem-helpers-for-custom-struct-file-s	2007-07-10 12:46:03.000000000 -0700
+++ lxc-dave/net/socket.c	2007-07-10 12:46:03.000000000 -0700
@@ -364,26 +364,26 @@ static int sock_alloc_fd(struct file **f
 
 static int sock_attach_fd(struct socket *sock, struct file *file)
 {
+	struct dentry *dentry;
 	struct qstr name = { .name = "" };
 
-	file->f_path.dentry = d_alloc(sock_mnt->mnt_sb->s_root, &name);
-	if (unlikely(!file->f_path.dentry))
+	dentry = d_alloc(sock_mnt->mnt_sb->s_root, &name);
+	if (unlikely(!dentry))
 		return -ENOMEM;
 
-	file->f_path.dentry->d_op = &sockfs_dentry_operations;
+	dentry->d_op = &sockfs_dentry_operations;
 	/*
 	 * We dont want to push this dentry into global dentry hash table.
 	 * We pretend dentry is already hashed, by unsetting DCACHE_UNHASHED
 	 * This permits a working /proc/$pid/fd/XXX on sockets
 	 */
-	file->f_path.dentry->d_flags &= ~DCACHE_UNHASHED;
-	d_instantiate(file->f_path.dentry, SOCK_INODE(sock));
-	file->f_path.mnt = mntget(sock_mnt);
-	file->f_mapping = file->f_path.dentry->d_inode->i_mapping;
+	dentry->d_flags &= ~DCACHE_UNHASHED;
+	d_instantiate(dentry, SOCK_INODE(sock));
 
 	sock->file = file;
-	file->f_op = SOCK_INODE(sock)->i_fop = &socket_file_ops;
-	file->f_mode = FMODE_READ | FMODE_WRITE;
+	init_file(file, sock_mnt, dentry, FMODE_READ | FMODE_WRITE,
+		  &socket_file_ops);
+	SOCK_INODE(sock)->i_fop = &socket_file_ops;
 	file->f_flags = O_RDWR;
 	file->f_pos = 0;
 	file->private_data = sock;
_


* [PATCH 04/23] r/o bind mounts: stub functions
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
                   ` (2 preceding siblings ...)
  2007-07-12  0:17 ` [PATCH 03/23] filesystem helpers for custom 'struct file's Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 05/23] elevate write count open()'d files Dave Hansen
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


This patch adds two functions, mnt_want_write() and
mnt_drop_write().  These are used like a lock pair around
any fs operations that might cause a write to the filesystem.

Before these can become useful, we must first cover each
place in the VFS where writes are performed with a
want/drop pair.  When that is complete, we can actually
introduce code that will safely check the counts before
allowing r/w<->r/o transitions to occur.
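
As a picture of where this is heading, here is a userspace model
of the pair once counting exists.  This is an assumption about the
later patches, not what the stubs below do: the counter, the
-EBUSY return and model_make_readonly() are all invented for
illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct vfsmount. */
struct model_mnt {
	int readonly;
	int writers;	/* assumed counter, not in this patch */
};

/* Succeeds only on a r/w mount, and records the writer. */
static int model_want_write(struct model_mnt *mnt)
{
	if (mnt->readonly)
		return -EROFS;
	mnt->writers++;
	return 0;
}

/* Must be paired with a successful model_want_write(). */
static void model_drop_write(struct model_mnt *mnt)
{
	mnt->writers--;
}

/*
 * Assumed r/w -> r/o transition: refused while any writer
 * still holds the mount.
 */
static int model_make_readonly(struct model_mnt *mnt)
{
	if (mnt->writers > 0)
		return -EBUSY;
	mnt->readonly = 1;
	return 0;
}
```

The point of the pair is visible here: between want and drop the
mount cannot go r/o, which a one-off IS_RDONLY() test cannot
promise.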

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/namespace.c        |   46 +++++++++++++++++++++++++++++++++++++++++
 lxc-dave/include/linux/mount.h |    3 ++
 2 files changed, 49 insertions(+)

diff -puN fs/namespace.c~add-vfsmount-writer-count fs/namespace.c
--- lxc/fs/namespace.c~add-vfsmount-writer-count	2007-07-10 12:46:04.000000000 -0700
+++ lxc-dave/fs/namespace.c	2007-07-10 12:46:04.000000000 -0700
@@ -76,6 +76,52 @@ struct vfsmount *alloc_vfsmnt(const char
 	return mnt;
 }
 
+/*
+ * Most r/o checks on a fs are for operations that take
+ * discrete amounts of time, like a write() or unlink().
+ * We must keep track of when those operations start
+ * (for permission checks) and when they end, so that
+ * we can determine when writes are able to occur to
+ * a filesystem.
+ */
+/*
+ * This tells the low-level filesystem that a write is
+ * about to be performed to it, and makes sure that
+ * writes are allowed before returning success.  When
+ * the write operation is finished, mnt_drop_write()
+ * must be called.  This is effectively a refcount.
+ */
+int mnt_want_write(struct vfsmount *mnt)
+{
+	if (__mnt_is_readonly(mnt))
+		return -EROFS;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mnt_want_write);
+
+/*
+ * Tells the low-level filesystem that we are done
+ * performing a write to it.  Must be matched with
+ * mnt_want_write() call above.
+ */
+void mnt_drop_write(struct vfsmount *mnt)
+{
+}
+EXPORT_SYMBOL_GPL(mnt_drop_write);
+
+/*
+ * This shouldn't be used directly outside of the VFS.
+ * It does not guarantee that the filesystem will stay
+ * r/w, just that it is right *now*.
+ * mnt_want/drop_write() will _keep_ the filesystem
+ * r/w.
+ */
+int __mnt_is_readonly(struct vfsmount *mnt)
+{
+	return (mnt->mnt_sb->s_flags & MS_RDONLY);
+}
+EXPORT_SYMBOL_GPL(__mnt_is_readonly);
+
 int simple_set_mnt(struct vfsmount *mnt, struct super_block *sb)
 {
 	mnt->mnt_sb = sb;
diff -puN include/linux/mount.h~add-vfsmount-writer-count include/linux/mount.h
--- lxc/include/linux/mount.h~add-vfsmount-writer-count	2007-07-10 12:46:04.000000000 -0700
+++ lxc-dave/include/linux/mount.h	2007-07-10 12:46:04.000000000 -0700
@@ -70,9 +70,12 @@ static inline struct vfsmount *mntget(st
 	return mnt;
 }
 
+extern int mnt_want_write(struct vfsmount *mnt);
+extern void mnt_drop_write(struct vfsmount *mnt);
 extern void mntput_no_expire(struct vfsmount *mnt);
 extern void mnt_pin(struct vfsmount *mnt);
 extern void mnt_unpin(struct vfsmount *mnt);
+extern int __mnt_is_readonly(struct vfsmount *mnt);
 
 static inline void mntput(struct vfsmount *mnt)
 {
_


* [PATCH 05/23] elevate write count open()'d files
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
                   ` (3 preceding siblings ...)
  2007-07-12  0:17 ` [PATCH 04/23] r/o bind mounts: stub functions Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 06/23] r/o bind mounts: elevate write count for some ioctls Dave Hansen
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


This is the first really tricky patch in the series.  It
elevates the writer count on a mount each time a
non-special file is opened for write.

This is not completely apparent in the patch because the
two if() conditions in may_open() above the
mnt_want_write() call are, combined, equivalent to
special_file().

There is also an elevated count around the vfs_create()
call in open_namei().  The count needs to be kept elevated
all the way into the may_open() call.  Otherwise, when the
write is dropped, a rw->ro transition could occur.  This
would lead to having rw access on the newly created file,
while the vfsmount is ro.  That is bad.

Some filesystems forgo the use of normal vfs calls to create
struct files.  Make sure that these users elevate the mnt writer
count because they will get __fput(), and we need to make
sure they're balanced.
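
The balancing rule can be modelled in userspace as follows.  This
is a sketch only: the model_* names are invented, and the counter
is an assumption since the stubs do not count yet.  A write open
of a non-special file takes a count, and the final put drops it
under exactly the same condition.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_FMODE_READ  1u
#define MODEL_FMODE_WRITE 2u

/* Hypothetical stand-ins for vfsmount and struct file. */
struct model_mnt {
	int readonly;
	int writers;	/* assumed counter */
};

struct model_file {
	struct model_mnt *mnt;
	unsigned int mode;
	int special;	/* models special_file(): device, FIFO, socket */
};

/* Models the may_open() side. */
static int model_open(struct model_file *f, struct model_mnt *mnt,
		      unsigned int mode, int special)
{
	if ((mode & MODEL_FMODE_WRITE) && !special) {
		if (mnt->readonly)
			return -EROFS;
		mnt->writers++;
	}
	f->mnt = mnt;
	f->mode = mode;
	f->special = special;
	return 0;
}

/* Models the __fput() side: same condition, opposite action. */
static void model_fput(struct model_file *f)
{
	if ((f->mode & MODEL_FMODE_WRITE) && !f->special)
		f->mnt->writers--;
}
```

Special files opened for write never touch the count on either
side, which is why they still open on a r/o mount.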

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/file_table.c |    9 ++++++++-
 lxc-dave/fs/namei.c      |   20 ++++++++++++++++----
 lxc-dave/ipc/mqueue.c    |    3 +++
 3 files changed, 27 insertions(+), 5 deletions(-)

diff -puN fs/file_table.c~tricky-elevate-write-count-files-are-open-ed fs/file_table.c
--- lxc/fs/file_table.c~tricky-elevate-write-count-files-are-open-ed	2007-07-10 12:46:04.000000000 -0700
+++ lxc-dave/fs/file_table.c	2007-07-10 12:46:04.000000000 -0700
@@ -168,6 +168,10 @@ int init_file(struct file *file, struct 
 	file->f_mapping = dentry->d_inode->i_mapping;
 	file->f_mode = mode;
 	file->f_op = fop;
+	if (mode & FMODE_WRITE) {
+		error = mnt_want_write(mnt);
+		WARN_ON(error);
+	}
 	return error;
 }
 EXPORT_SYMBOL(init_file);
@@ -205,8 +209,11 @@ void fastcall __fput(struct file *file)
 	if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL))
 		cdev_put(inode->i_cdev);
 	fops_put(file->f_op);
-	if (file->f_mode & FMODE_WRITE)
+	if (file->f_mode & FMODE_WRITE) {
 		put_write_access(inode);
+		if (!special_file(inode->i_mode))
+			mnt_drop_write(mnt);
+	}
 	put_pid(file->f_owner.pid);
 	file_kill(file);
 	file->f_path.dentry = NULL;
diff -puN fs/namei.c~tricky-elevate-write-count-files-are-open-ed fs/namei.c
--- lxc/fs/namei.c~tricky-elevate-write-count-files-are-open-ed	2007-07-10 12:46:04.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-07-10 12:46:04.000000000 -0700
@@ -1562,8 +1562,15 @@ int may_open(struct nameidata *nd, int a
 			return -EACCES;
 
 		flag &= ~O_TRUNC;
-	} else if (IS_RDONLY(inode) && (flag & FMODE_WRITE))
-		return -EROFS;
+	} else if (flag & FMODE_WRITE) {
+		/*
+		 * effectively: !special_file()
+		 * balanced by __fput()
+		 */
+		error = mnt_want_write(nd->mnt);
+		if (error)
+			return error;
+	}
 
 	error = vfs_permission(nd, acc_mode);
 	if (error)
@@ -1706,14 +1713,17 @@ do_last:
 	}
 
 	if (IS_ERR(nd->intent.open.file)) {
-		mutex_unlock(&dir->d_inode->i_mutex);
 		error = PTR_ERR(nd->intent.open.file);
-		goto exit_dput;
+		goto exit_mutex_unlock;
 	}
 
 	/* Negative dentry, just create the file */
 	if (!path.dentry->d_inode) {
+		error = mnt_want_write(nd->mnt);
+		if (error)
+			goto exit_mutex_unlock;
 		error = open_namei_create(nd, &path, flag, mode);
+		mnt_drop_write(nd->mnt);
 		if (error)
 			goto exit;
 		return 0;
@@ -1751,6 +1761,8 @@ ok:
 		goto exit;
 	return 0;
 
+exit_mutex_unlock:
+	mutex_unlock(&dir->d_inode->i_mutex);
 exit_dput:
 	dput_path(&path, nd);
 exit:
diff -puN ipc/mqueue.c~tricky-elevate-write-count-files-are-open-ed ipc/mqueue.c
--- lxc/ipc/mqueue.c~tricky-elevate-write-count-files-are-open-ed	2007-07-10 12:46:04.000000000 -0700
+++ lxc-dave/ipc/mqueue.c	2007-07-10 12:46:04.000000000 -0700
@@ -686,6 +686,9 @@ asmlinkage long sys_mq_open(const char _
 				goto out;
 			filp = do_open(dentry, oflag);
 		} else {
+			error = mnt_want_write(mqueue_mnt);
+			if (error)
+				goto out;
 			filp = do_create(mqueue_mnt->mnt_root, dentry,
 						oflag, mode, u_attr);
 		}
_


* [PATCH 06/23] r/o bind mounts: elevate write count for some ioctls
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
                   ` (4 preceding siblings ...)
  2007-07-12  0:17 ` [PATCH 05/23] elevate write count open()'d files Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 07/23] elevate writer count for chown and friends Dave Hansen
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


Some ioctl()s can cause writes to the filesystem.  Take
these, and make them use mnt_want/drop_write() instead of
checking IS_RDONLY() directly.

We need to pass the filp one layer deeper in XFS, but
somebody _just_ pulled it out in February because nobody
was using it, so I don't feel guilty for adding it back.
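
Most of the churn in this patch is one mechanical conversion:
once write access has been taken, every early "return -EFOO"
has to become "set the error and goto out", so that the drop
always runs.  A userspace model of that shape, with invented
model_* names and an assumed counter:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vfsmount. */
struct model_mnt {
	int readonly;
	int writers;	/* assumed counter */
};

static int model_want_write(struct model_mnt *mnt)
{
	if (mnt->readonly)
		return -EROFS;
	mnt->writers++;
	return 0;
}

static void model_drop_write(struct model_mnt *mnt)
{
	mnt->writers--;
}

/* Models a "set flags" ioctl after the conversion. */
static int model_setflags(struct model_mnt *mnt, int is_owner,
			  const unsigned int *user_arg,
			  unsigned int *inode_flags)
{
	int ret = model_want_write(mnt);

	if (ret)
		return ret;	/* nothing taken, plain return is fine */

	if (!is_owner) {
		ret = -EACCES;
		goto out;
	}
	if (!user_arg) {	/* models a failing get_user() */
		ret = -EFAULT;
		goto out;
	}
	*inode_flags = *user_arg;
out:
	model_drop_write(mnt);
	return ret;
}
```

Whatever path is taken, the count is back where it started when
the function returns.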

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/ext2/ioctl.c              |   46 +++++++++-----
 lxc-dave/fs/ext3/ioctl.c              |  100 +++++++++++++++++++++-----------
 lxc-dave/fs/ext4/ioctl.c              |  105 +++++++++++++++++++++-------------
 lxc-dave/fs/fat/file.c                |   10 +--
 lxc-dave/fs/hfsplus/ioctl.c           |   39 +++++++-----
 lxc-dave/fs/jfs/ioctl.c               |   33 ++++++----
 lxc-dave/fs/ocfs2/ioctl.c             |   11 +--
 lxc-dave/fs/reiserfs/ioctl.c          |   53 +++++++++++------
 lxc-dave/fs/xfs/linux-2.6/xfs_ioctl.c |   15 +++-
 lxc-dave/fs/xfs/linux-2.6/xfs_iops.c  |    7 --
 lxc-dave/fs/xfs/linux-2.6/xfs_lrw.c   |    9 ++
 11 files changed, 272 insertions(+), 156 deletions(-)

diff -puN fs/ext2/ioctl.c~ioctl-mnt-takers fs/ext2/ioctl.c
--- lxc/fs/ext2/ioctl.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/ext2/ioctl.c	2007-07-10 12:46:05.000000000 -0700
@@ -12,6 +12,7 @@
 #include <linux/time.h>
 #include <linux/sched.h>
 #include <linux/compat.h>
+#include <linux/mount.h>
 #include <linux/smp_lock.h>
 #include <asm/current.h>
 #include <asm/uaccess.h>
@@ -22,6 +23,7 @@ int ext2_ioctl (struct inode * inode, st
 {
 	struct ext2_inode_info *ei = EXT2_I(inode);
 	unsigned int flags;
+	int ret;
 
 	ext2_debug ("cmd = %u, arg = %lu\n", cmd, arg);
 
@@ -33,14 +35,19 @@ int ext2_ioctl (struct inode * inode, st
 	case EXT2_IOC_SETFLAGS: {
 		unsigned int oldflags;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
-
-		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
-			return -EACCES;
+		ret = mnt_want_write(filp->f_vfsmnt);
+		if (ret)
+			return ret;
+
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) {
+			ret = -EACCES;
+			goto setflags_out;
+		}
 
-		if (get_user(flags, (int __user *) arg))
-			return -EFAULT;
+		if (get_user(flags, (int __user *) arg)) {
+			ret = -EFAULT;
+			goto setflags_out;
+		}
 
 		if (!S_ISDIR(inode->i_mode))
 			flags &= ~EXT2_DIRSYNC_FL;
@@ -57,7 +64,8 @@ int ext2_ioctl (struct inode * inode, st
 		if ((flags ^ oldflags) & (EXT2_APPEND_FL | EXT2_IMMUTABLE_FL)) {
 			if (!capable(CAP_LINUX_IMMUTABLE)) {
 				mutex_unlock(&inode->i_mutex);
-				return -EPERM;
+				ret = -EPERM;
+				goto setflags_out;
 			}
 		}
 
@@ -69,20 +77,26 @@ int ext2_ioctl (struct inode * inode, st
 		ext2_set_inode_flags(inode);
 		inode->i_ctime = CURRENT_TIME_SEC;
 		mark_inode_dirty(inode);
-		return 0;
+setflags_out:
+		mnt_drop_write(filp->f_vfsmnt);
+		return ret;
 	}
 	case EXT2_IOC_GETVERSION:
 		return put_user(inode->i_generation, (int __user *) arg);
 	case EXT2_IOC_SETVERSION:
 		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
 			return -EPERM;
-		if (IS_RDONLY(inode))
-			return -EROFS;
-		if (get_user(inode->i_generation, (int __user *) arg))
-			return -EFAULT;	
-		inode->i_ctime = CURRENT_TIME_SEC;
-		mark_inode_dirty(inode);
-		return 0;
+		ret = mnt_want_write(filp->f_vfsmnt);
+		if (ret)
+			return ret;
+		if (get_user(inode->i_generation, (int __user *) arg)) {
+			ret = -EFAULT;
+		} else {
+			inode->i_ctime = CURRENT_TIME_SEC;
+			mark_inode_dirty(inode);
+		}
+		mnt_drop_write(filp->f_vfsmnt);
+		return ret;
 	default:
 		return -ENOTTY;
 	}
diff -puN fs/ext3/ioctl.c~ioctl-mnt-takers fs/ext3/ioctl.c
--- lxc/fs/ext3/ioctl.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/ext3/ioctl.c	2007-07-10 12:46:05.000000000 -0700
@@ -12,6 +12,7 @@
 #include <linux/capability.h>
 #include <linux/ext3_fs.h>
 #include <linux/ext3_jbd.h>
+#include <linux/mount.h>
 #include <linux/time.h>
 #include <linux/compat.h>
 #include <linux/smp_lock.h>
@@ -38,14 +39,19 @@ int ext3_ioctl (struct inode * inode, st
 		unsigned int oldflags;
 		unsigned int jflag;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
 
-		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
-			return -EACCES;
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) {
+			err = -EACCES;
+			goto flags_out;
+		}
 
-		if (get_user(flags, (int __user *) arg))
-			return -EFAULT;
+		if (get_user(flags, (int __user *) arg)) {
+			err = -EFAULT;
+			goto flags_out;
+		}
 
 		if (!S_ISDIR(inode->i_mode))
 			flags &= ~EXT3_DIRSYNC_FL;
@@ -65,7 +71,8 @@ int ext3_ioctl (struct inode * inode, st
 		if ((flags ^ oldflags) & (EXT3_APPEND_FL | EXT3_IMMUTABLE_FL)) {
 			if (!capable(CAP_LINUX_IMMUTABLE)) {
 				mutex_unlock(&inode->i_mutex);
-				return -EPERM;
+				err = -EPERM;
+				goto flags_out;
 			}
 		}
 
@@ -76,7 +83,8 @@ int ext3_ioctl (struct inode * inode, st
 		if ((jflag ^ oldflags) & (EXT3_JOURNAL_DATA_FL)) {
 			if (!capable(CAP_SYS_RESOURCE)) {
 				mutex_unlock(&inode->i_mutex);
-				return -EPERM;
+				err = -EPERM;
+				goto flags_out;
 			}
 		}
 
@@ -84,7 +92,8 @@ int ext3_ioctl (struct inode * inode, st
 		handle = ext3_journal_start(inode, 1);
 		if (IS_ERR(handle)) {
 			mutex_unlock(&inode->i_mutex);
-			return PTR_ERR(handle);
+			err = PTR_ERR(handle);
+			goto flags_out;
 		}
 		if (IS_SYNC(inode))
 			handle->h_sync = 1;
@@ -110,6 +119,8 @@ flags_err:
 		if ((jflag ^ oldflags) & (EXT3_JOURNAL_DATA_FL))
 			err = ext3_change_inode_journal_flag(inode, jflag);
 		mutex_unlock(&inode->i_mutex);
+flags_out:
+		mnt_drop_write(filp->f_vfsmnt);
 		return err;
 	}
 	case EXT3_IOC_GETVERSION:
@@ -124,14 +135,18 @@ flags_err:
 
 		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
 			return -EPERM;
-		if (IS_RDONLY(inode))
-			return -EROFS;
-		if (get_user(generation, (int __user *) arg))
-			return -EFAULT;
-
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
+		if (get_user(generation, (int __user *) arg)) {
+			err = -EFAULT;
+			goto setversion_out;
+		}
 		handle = ext3_journal_start(inode, 1);
-		if (IS_ERR(handle))
-			return PTR_ERR(handle);
+		if (IS_ERR(handle)) {
+			err = PTR_ERR(handle);
+			goto setversion_out;
+		}
 		err = ext3_reserve_inode_write(handle, inode, &iloc);
 		if (err == 0) {
 			inode->i_ctime = CURRENT_TIME_SEC;
@@ -139,6 +154,8 @@ flags_err:
 			err = ext3_mark_iloc_dirty(handle, inode, &iloc);
 		}
 		ext3_journal_stop(handle);
+setversion_out:
+		mnt_drop_write(filp->f_vfsmnt);
 		return err;
 	}
 #ifdef CONFIG_JBD_DEBUG
@@ -174,18 +191,24 @@ flags_err:
 		}
 		return -ENOTTY;
 	case EXT3_IOC_SETRSVSZ: {
+		int err;
 
 		if (!test_opt(inode->i_sb, RESERVATION) ||!S_ISREG(inode->i_mode))
 			return -ENOTTY;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
 
-		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
-			return -EACCES;
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) {
+			err = -EACCES;
+			goto setrsvsz_out;
+		}
 
-		if (get_user(rsv_window_size, (int __user *)arg))
-			return -EFAULT;
+		if (get_user(rsv_window_size, (int __user *)arg)) {
+			err = -EFAULT;
+			goto setrsvsz_out;
+		}
 
 		if (rsv_window_size > EXT3_MAX_RESERVE_BLOCKS)
 			rsv_window_size = EXT3_MAX_RESERVE_BLOCKS;
@@ -203,7 +226,9 @@ flags_err:
 			rsv->rsv_goal_size = rsv_window_size;
 		}
 		mutex_unlock(&ei->truncate_mutex);
-		return 0;
+setrsvsz_out:
+		mnt_drop_write(filp->f_vfsmnt);
+		return err;
 	}
 	case EXT3_IOC_GROUP_EXTEND: {
 		ext3_fsblk_t n_blocks_count;
@@ -213,17 +238,20 @@ flags_err:
 		if (!capable(CAP_SYS_RESOURCE))
 			return -EPERM;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
-
-		if (get_user(n_blocks_count, (__u32 __user *)arg))
-			return -EFAULT;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
 
+		if (get_user(n_blocks_count, (__u32 __user *)arg)) {
+			err = -EFAULT;
+			goto group_extend_out;
+		}
 		err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
 		journal_lock_updates(EXT3_SB(sb)->s_journal);
 		journal_flush(EXT3_SB(sb)->s_journal);
 		journal_unlock_updates(EXT3_SB(sb)->s_journal);
-
+group_extend_out:
+		mnt_drop_write(filp->f_vfsmnt);
 		return err;
 	}
 	case EXT3_IOC_GROUP_ADD: {
@@ -234,18 +262,22 @@ flags_err:
 		if (!capable(CAP_SYS_RESOURCE))
 			return -EPERM;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
 
 		if (copy_from_user(&input, (struct ext3_new_group_input __user *)arg,
-				sizeof(input)))
-			return -EFAULT;
+				sizeof(input))) {
+			err = -EFAULT;
+			goto group_add_out;
+		}
 
 		err = ext3_group_add(sb, &input);
 		journal_lock_updates(EXT3_SB(sb)->s_journal);
 		journal_flush(EXT3_SB(sb)->s_journal);
 		journal_unlock_updates(EXT3_SB(sb)->s_journal);
-
+group_add_out:
+		mnt_drop_write(filp->f_vfsmnt);
 		return err;
 	}
 
diff -puN fs/ext4/ioctl.c~ioctl-mnt-takers fs/ext4/ioctl.c
--- lxc/fs/ext4/ioctl.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/ext4/ioctl.c	2007-07-10 12:46:05.000000000 -0700
@@ -12,6 +12,7 @@
 #include <linux/capability.h>
 #include <linux/ext4_fs.h>
 #include <linux/ext4_jbd2.h>
+#include <linux/mount.h>
 #include <linux/time.h>
 #include <linux/compat.h>
 #include <linux/smp_lock.h>
@@ -37,15 +38,19 @@ int ext4_ioctl (struct inode * inode, st
 		unsigned int oldflags;
 		unsigned int jflag;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
-
-		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
-			return -EACCES;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
 
-		if (get_user(flags, (int __user *) arg))
-			return -EFAULT;
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) {
+			err = -EACCES;
+			goto flags_out;
+		}
 
+		if (get_user(flags, (int __user *) arg)) {
+			err = -EFAULT;
+			goto flags_out;
+		}
 		if (!S_ISDIR(inode->i_mode))
 			flags &= ~EXT4_DIRSYNC_FL;
 
@@ -64,7 +69,8 @@ int ext4_ioctl (struct inode * inode, st
 		if ((flags ^ oldflags) & (EXT4_APPEND_FL | EXT4_IMMUTABLE_FL)) {
 			if (!capable(CAP_LINUX_IMMUTABLE)) {
 				mutex_unlock(&inode->i_mutex);
-				return -EPERM;
+				err = -EPERM;
+				goto flags_out;
 			}
 		}
 
@@ -75,7 +81,8 @@ int ext4_ioctl (struct inode * inode, st
 		if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
 			if (!capable(CAP_SYS_RESOURCE)) {
 				mutex_unlock(&inode->i_mutex);
-				return -EPERM;
+				err = -EPERM;
+				goto flags_out;
 			}
 		}
 
@@ -83,7 +90,8 @@ int ext4_ioctl (struct inode * inode, st
 		handle = ext4_journal_start(inode, 1);
 		if (IS_ERR(handle)) {
 			mutex_unlock(&inode->i_mutex);
-			return PTR_ERR(handle);
+			err = PTR_ERR(handle);
+			goto flags_out;
 		}
 		if (IS_SYNC(inode))
 			handle->h_sync = 1;
@@ -103,12 +111,14 @@ flags_err:
 		ext4_journal_stop(handle);
 		if (err) {
 			mutex_unlock(&inode->i_mutex);
-			return err;
+			goto flags_out;
 		}
 
 		if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL))
 			err = ext4_change_inode_journal_flag(inode, jflag);
 		mutex_unlock(&inode->i_mutex);
+flags_out:
+		mnt_drop_write(filp->f_vfsmnt);
 		return err;
 	}
 	case EXT4_IOC_GETVERSION:
@@ -123,14 +133,18 @@ flags_err:
 
 		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
 			return -EPERM;
-		if (IS_RDONLY(inode))
-			return -EROFS;
-		if (get_user(generation, (int __user *) arg))
-			return -EFAULT;
-
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
+		if (get_user(generation, (int __user *) arg)) {
+			err = -EFAULT;
+			goto setversion_out;
+		}
 		handle = ext4_journal_start(inode, 1);
-		if (IS_ERR(handle))
-			return PTR_ERR(handle);
+		if (IS_ERR(handle)) {
+			err = PTR_ERR(handle);
+			goto setversion_out;
+		}
 		err = ext4_reserve_inode_write(handle, inode, &iloc);
 		if (err == 0) {
 			inode->i_ctime = CURRENT_TIME_SEC;
@@ -138,6 +152,8 @@ flags_err:
 			err = ext4_mark_iloc_dirty(handle, inode, &iloc);
 		}
 		ext4_journal_stop(handle);
+setversion_out:
+		mnt_drop_write(filp->f_vfsmnt);
 		return err;
 	}
 #ifdef CONFIG_JBD_DEBUG
@@ -173,19 +189,23 @@ flags_err:
 		}
 		return -ENOTTY;
 	case EXT4_IOC_SETRSVSZ: {
+		int err;
 
 		if (!test_opt(inode->i_sb, RESERVATION) ||!S_ISREG(inode->i_mode))
 			return -ENOTTY;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
-
-		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
-			return -EACCES;
-
-		if (get_user(rsv_window_size, (int __user *)arg))
-			return -EFAULT;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
 
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) {
+			err = -EACCES;
+			goto setrsvsz_out;
+		}
+		if (get_user(rsv_window_size, (int __user *)arg)) {
+			err = -EFAULT;
+			goto setrsvsz_out;
+		}
 		if (rsv_window_size > EXT4_MAX_RESERVE_BLOCKS)
 			rsv_window_size = EXT4_MAX_RESERVE_BLOCKS;
 
@@ -202,7 +222,9 @@ flags_err:
 			rsv->rsv_goal_size = rsv_window_size;
 		}
 		mutex_unlock(&ei->truncate_mutex);
-		return 0;
+setrsvsz_out:
+		mnt_drop_write(filp->f_vfsmnt);
+		return err;
 	}
 	case EXT4_IOC_GROUP_EXTEND: {
 		ext4_fsblk_t n_blocks_count;
@@ -212,17 +234,21 @@ flags_err:
 		if (!capable(CAP_SYS_RESOURCE))
 			return -EPERM;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
-
-		if (get_user(n_blocks_count, (__u32 __user *)arg))
-			return -EFAULT;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
 
+		if (get_user(n_blocks_count, (__u32 __user *)arg)) {
+			err = -EFAULT;
+			goto group_extend_out;
+		}
 		err = ext4_group_extend(sb, EXT4_SB(sb)->s_es, n_blocks_count);
 		jbd2_journal_lock_updates(EXT4_SB(sb)->s_journal);
 		jbd2_journal_flush(EXT4_SB(sb)->s_journal);
 		jbd2_journal_unlock_updates(EXT4_SB(sb)->s_journal);
 
+group_extend_out:
+		mnt_drop_write(filp->f_vfsmnt);
 		return err;
 	}
 	case EXT4_IOC_GROUP_ADD: {
@@ -233,18 +259,21 @@ flags_err:
 		if (!capable(CAP_SYS_RESOURCE))
 			return -EPERM;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
 
 		if (copy_from_user(&input, (struct ext4_new_group_input __user *)arg,
-				sizeof(input)))
-			return -EFAULT;
-
+				sizeof(input))) {
+			err = -EFAULT;
+			goto group_add_out;
+		}
 		err = ext4_group_add(sb, &input);
 		jbd2_journal_lock_updates(EXT4_SB(sb)->s_journal);
 		jbd2_journal_flush(EXT4_SB(sb)->s_journal);
 		jbd2_journal_unlock_updates(EXT4_SB(sb)->s_journal);
-
+group_add_out:
+		mnt_drop_write(filp->f_vfsmnt);
 		return err;
 	}
 
diff -puN fs/fat/file.c~ioctl-mnt-takers fs/fat/file.c
--- lxc/fs/fat/file.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/fat/file.c	2007-07-10 12:46:05.000000000 -0700
@@ -8,6 +8,7 @@
 
 #include <linux/capability.h>
 #include <linux/module.h>
+#include <linux/mount.h>
 #include <linux/time.h>
 #include <linux/msdos_fs.h>
 #include <linux/smp_lock.h>
@@ -46,10 +47,9 @@ int fat_generic_ioctl(struct inode *inod
 
 		mutex_lock(&inode->i_mutex);
 
-		if (IS_RDONLY(inode)) {
-			err = -EROFS;
-			goto up;
-		}
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			goto up_no_drop_write;
 
 		/*
 		 * ATTR_VOLUME and ATTR_DIR cannot be changed; this also
@@ -106,6 +106,8 @@ int fat_generic_ioctl(struct inode *inod
 		MSDOS_I(inode)->i_attrs = attr & ATTR_UNUSED;
 		mark_inode_dirty(inode);
 	up:
+		mnt_drop_write(filp->f_vfsmnt);
+	up_no_drop_write:
 		mutex_unlock(&inode->i_mutex);
 		return err;
 	}
diff -puN fs/hfsplus/ioctl.c~ioctl-mnt-takers fs/hfsplus/ioctl.c
--- lxc/fs/hfsplus/ioctl.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/hfsplus/ioctl.c	2007-07-10 12:46:05.000000000 -0700
@@ -35,25 +35,32 @@ int hfsplus_ioctl(struct inode *inode, s
 			flags |= FS_NODUMP_FL; /* EXT2_NODUMP_FL */
 		return put_user(flags, (int __user *)arg);
 	case HFSPLUS_IOC_EXT2_SETFLAGS: {
-		if (IS_RDONLY(inode))
-			return -EROFS;
-
-		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
-			return -EACCES;
-
-		if (get_user(flags, (int __user *)arg))
-			return -EFAULT;
-
+		int err = 0;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
+
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) {
+			err = -EACCES;
+			goto setflags_out;
+		}
+		if (get_user(flags, (int __user *)arg)) {
+			err = -EFAULT;
+			goto setflags_out;
+		}
 		if (flags & (FS_IMMUTABLE_FL|FS_APPEND_FL) ||
 		    HFSPLUS_I(inode).rootflags & (HFSPLUS_FLG_IMMUTABLE|HFSPLUS_FLG_APPEND)) {
-			if (!capable(CAP_LINUX_IMMUTABLE))
-				return -EPERM;
+			if (!capable(CAP_LINUX_IMMUTABLE)) {
+				err = -EPERM;
+				goto setflags_out;
+			}
 		}
 
 		/* don't silently ignore unsupported ext2 flags */
-		if (flags & ~(FS_IMMUTABLE_FL|FS_APPEND_FL|FS_NODUMP_FL))
-			return -EOPNOTSUPP;
-
+		if (flags & ~(FS_IMMUTABLE_FL|FS_APPEND_FL|FS_NODUMP_FL)) {
+			err = -EOPNOTSUPP;
+			goto setflags_out;
+		}
 		if (flags & FS_IMMUTABLE_FL) { /* EXT2_IMMUTABLE_FL */
 			inode->i_flags |= S_IMMUTABLE;
 			HFSPLUS_I(inode).rootflags |= HFSPLUS_FLG_IMMUTABLE;
@@ -75,7 +82,9 @@ int hfsplus_ioctl(struct inode *inode, s
 
 		inode->i_ctime = CURRENT_TIME_SEC;
 		mark_inode_dirty(inode);
-		return 0;
+setflags_out:
+		mnt_drop_write(filp->f_vfsmnt);
+		return err;
 	}
 	default:
 		return -ENOTTY;
diff -puN fs/jfs/ioctl.c~ioctl-mnt-takers fs/jfs/ioctl.c
--- lxc/fs/jfs/ioctl.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/jfs/ioctl.c	2007-07-10 12:46:05.000000000 -0700
@@ -8,6 +8,7 @@
 #include <linux/fs.h>
 #include <linux/ctype.h>
 #include <linux/capability.h>
+#include <linux/mount.h>
 #include <linux/time.h>
 #include <linux/sched.h>
 #include <asm/current.h>
@@ -65,16 +66,20 @@ int jfs_ioctl(struct inode * inode, stru
 		return put_user(flags, (int __user *) arg);
 	case JFS_IOC_SETFLAGS: {
 		unsigned int oldflags;
+		int err;
 
-		if (IS_RDONLY(inode))
-			return -EROFS;
-
-		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
-			return -EACCES;
-
-		if (get_user(flags, (int __user *) arg))
-			return -EFAULT;
-
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
+
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) {
+			err = -EACCES;
+			goto setflags_out;
+		}
+		if (get_user(flags, (int __user *) arg)) {
+			err = -EFAULT;
+			goto setflags_out;
+		}
 		flags = jfs_map_ext2(flags, 1);
 		if (!S_ISDIR(inode->i_mode))
 			flags &= ~JFS_DIRSYNC_FL;
@@ -89,8 +94,10 @@ int jfs_ioctl(struct inode * inode, stru
 		if ((oldflags & JFS_IMMUTABLE_FL) ||
 			((flags ^ oldflags) &
 			(JFS_APPEND_FL | JFS_IMMUTABLE_FL))) {
-			if (!capable(CAP_LINUX_IMMUTABLE))
-				return -EPERM;
+			if (!capable(CAP_LINUX_IMMUTABLE)) {
+				err = -EPERM;
+				goto setflags_out;
+			}
 		}
 
 		flags = flags & JFS_FL_USER_MODIFIABLE;
@@ -100,7 +107,9 @@ int jfs_ioctl(struct inode * inode, stru
 		jfs_set_inode_flags(inode);
 		inode->i_ctime = CURRENT_TIME_SEC;
 		mark_inode_dirty(inode);
-		return 0;
+setflags_out:
+		mnt_drop_write(filp->f_vfsmnt);
+		return err;
 	}
 	default:
 		return -ENOTTY;
diff -puN fs/ocfs2/ioctl.c~ioctl-mnt-takers fs/ocfs2/ioctl.c
--- lxc/fs/ocfs2/ioctl.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/ocfs2/ioctl.c	2007-07-10 12:46:05.000000000 -0700
@@ -57,10 +57,6 @@ static int ocfs2_set_inode_attr(struct i
 		goto bail;
 	}
 
-	status = -EROFS;
-	if (IS_RDONLY(inode))
-		goto bail_unlock;
-
 	status = -EACCES;
 	if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
 		goto bail_unlock;
@@ -128,8 +124,13 @@ int ocfs2_ioctl(struct inode * inode, st
 		if (get_user(flags, (int __user *) arg))
 			return -EFAULT;
 
-		return ocfs2_set_inode_attr(inode, flags,
+		status = mnt_want_write(filp->f_vfsmnt);
+		if (status)
+			return status;
+		status = ocfs2_set_inode_attr(inode, flags,
 			OCFS2_FL_MODIFIABLE);
+		mnt_drop_write(filp->f_vfsmnt);
+		return status;
 	default:
 		return -ENOTTY;
 	}
diff -puN fs/reiserfs/ioctl.c~ioctl-mnt-takers fs/reiserfs/ioctl.c
--- lxc/fs/reiserfs/ioctl.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/reiserfs/ioctl.c	2007-07-10 12:46:05.000000000 -0700
@@ -4,6 +4,7 @@
 
 #include <linux/capability.h>
 #include <linux/fs.h>
+#include <linux/mount.h>
 #include <linux/reiserfs_fs.h>
 #include <linux/time.h>
 #include <asm/uaccess.h>
@@ -25,6 +26,7 @@ int reiserfs_ioctl(struct inode *inode, 
 		   unsigned long arg)
 {
 	unsigned int flags;
+	int err = 0;
 
 	switch (cmd) {
 	case REISERFS_IOC_UNPACK:
@@ -48,48 +50,61 @@ int reiserfs_ioctl(struct inode *inode, 
 			if (!reiserfs_attrs(inode->i_sb))
 				return -ENOTTY;
 
-			if (IS_RDONLY(inode))
-				return -EROFS;
+			err = mnt_want_write(filp->f_vfsmnt);
+			if (err)
+				return err;
 
 			if ((current->fsuid != inode->i_uid)
-			    && !capable(CAP_FOWNER))
-				return -EPERM;
-
-			if (get_user(flags, (int __user *)arg))
-				return -EFAULT;
-
+			    && !capable(CAP_FOWNER)) {
+				err = -EPERM;
+				goto setflags_out;
+			}
+			if (get_user(flags, (int __user *)arg)) {
+				err = -EFAULT;
+				goto setflags_out;
+			}
 			if (((flags ^ REISERFS_I(inode)->
 			      i_attrs) & (REISERFS_IMMUTABLE_FL |
 					  REISERFS_APPEND_FL))
-			    && !capable(CAP_LINUX_IMMUTABLE))
-				return -EPERM;
-
+			    && !capable(CAP_LINUX_IMMUTABLE)) {
+				err = -EPERM;
+				goto setflags_out;
+			}
 			if ((flags & REISERFS_NOTAIL_FL) &&
 			    S_ISREG(inode->i_mode)) {
 				int result;
 
 				result = reiserfs_unpack(inode, filp);
-				if (result)
-					return result;
+				if (result) {
+					err = result;
+					goto setflags_out;
+				}
 			}
 			sd_attrs_to_i_attrs(flags, inode);
 			REISERFS_I(inode)->i_attrs = flags;
 			inode->i_ctime = CURRENT_TIME_SEC;
 			mark_inode_dirty(inode);
-			return 0;
+setflags_out:
+			mnt_drop_write(filp->f_vfsmnt);
+			return err;
 		}
 	case REISERFS_IOC_GETVERSION:
 		return put_user(inode->i_generation, (int __user *)arg);
 	case REISERFS_IOC_SETVERSION:
 		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
 			return -EPERM;
-		if (IS_RDONLY(inode))
-			return -EROFS;
-		if (get_user(inode->i_generation, (int __user *)arg))
-			return -EFAULT;
+		err = mnt_want_write(filp->f_vfsmnt);
+		if (err)
+			return err;
+		if (get_user(inode->i_generation, (int __user *)arg)) {
+			err = -EFAULT;
+			goto setversion_out;
+		}
 		inode->i_ctime = CURRENT_TIME_SEC;
 		mark_inode_dirty(inode);
-		return 0;
+setversion_out:
+		mnt_drop_write(filp->f_vfsmnt);
+		return err;
 	default:
 		return -ENOTTY;
 	}
diff -puN fs/xfs/linux-2.6/xfs_ioctl.c~ioctl-mnt-takers fs/xfs/linux-2.6/xfs_ioctl.c
--- lxc/fs/xfs/linux-2.6/xfs_ioctl.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/xfs/linux-2.6/xfs_ioctl.c	2007-07-10 12:46:05.000000000 -0700
@@ -526,8 +526,6 @@ xfs_attrmulti_attr_set(
 	char			*kbuf;
 	int			error = EFAULT;
 
-	if (IS_RDONLY(&vp->v_inode))
-		return -EROFS;
 	if (IS_IMMUTABLE(&vp->v_inode) || IS_APPEND(&vp->v_inode))
 		return EPERM;
 	if (len > XATTR_SIZE_MAX)
@@ -553,8 +551,6 @@ xfs_attrmulti_attr_remove(
 	char			*name,
 	__uint32_t		flags)
 {
-	if (IS_RDONLY(&vp->v_inode))
-		return -EROFS;
 	if (IS_IMMUTABLE(&vp->v_inode) || IS_APPEND(&vp->v_inode))
 		return EPERM;
 	return bhv_vop_attr_remove(vp, name, flags, NULL);
@@ -564,6 +560,7 @@ STATIC int
 xfs_attrmulti_by_handle(
 	xfs_mount_t		*mp,
 	void			__user *arg,
+	struct file		*parfilp,
 	struct inode		*parinode)
 {
 	int			error;
@@ -618,13 +615,21 @@ xfs_attrmulti_by_handle(
 					&ops[i].am_length, ops[i].am_flags);
 			break;
 		case ATTR_OP_SET:
+			ops[i].am_error = mnt_want_write(parfilp->f_vfsmnt);
+			if (ops[i].am_error)
+				break;
 			ops[i].am_error = xfs_attrmulti_attr_set(vp,
 					attr_name, ops[i].am_attrvalue,
 					ops[i].am_length, ops[i].am_flags);
+			mnt_drop_write(parfilp->f_vfsmnt);
 			break;
 		case ATTR_OP_REMOVE:
+			ops[i].am_error = mnt_want_write(parfilp->f_vfsmnt);
+			if (ops[i].am_error)
+				break;
 			ops[i].am_error = xfs_attrmulti_attr_remove(vp,
 					attr_name, ops[i].am_flags);
+			mnt_drop_write(parfilp->f_vfsmnt);
 			break;
 		default:
 			ops[i].am_error = EINVAL;
@@ -804,7 +809,7 @@ xfs_ioctl(
 		return xfs_attrlist_by_handle(mp, arg, inode);
 
 	case XFS_IOC_ATTRMULTI_BY_HANDLE:
-		return xfs_attrmulti_by_handle(mp, arg, inode);
+		return xfs_attrmulti_by_handle(mp, arg, filp, inode);
 
 	case XFS_IOC_SWAPEXT: {
 		error = xfs_swapext((struct xfs_swapext __user *)arg);
diff -puN fs/xfs/linux-2.6/xfs_iops.c~ioctl-mnt-takers fs/xfs/linux-2.6/xfs_iops.c
--- lxc/fs/xfs/linux-2.6/xfs_iops.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/xfs/linux-2.6/xfs_iops.c	2007-07-10 12:46:05.000000000 -0700
@@ -156,13 +156,6 @@ xfs_ichgtime_fast(
 	 */
 	ASSERT((flags & XFS_ICHGTIME_ACC) == 0);
 
-	/*
-	 * We're not supposed to change timestamps in readonly-mounted
-	 * filesystems.  Throw it away if anyone asks us.
-	 */
-	if (unlikely(IS_RDONLY(inode)))
-		return;
-
 	if (flags & XFS_ICHGTIME_MOD) {
 		tvp = &inode->i_mtime;
 		ip->i_d.di_mtime.t_sec = (__int32_t)tvp->tv_sec;
diff -puN fs/xfs/linux-2.6/xfs_lrw.c~ioctl-mnt-takers fs/xfs/linux-2.6/xfs_lrw.c
--- lxc/fs/xfs/linux-2.6/xfs_lrw.c~ioctl-mnt-takers	2007-07-10 12:46:05.000000000 -0700
+++ lxc-dave/fs/xfs/linux-2.6/xfs_lrw.c	2007-07-10 12:46:05.000000000 -0700
@@ -50,6 +50,7 @@
 #include "xfs_iomap.h"
 
 #include <linux/capability.h>
+#include <linux/mount.h>
 #include <linux/writeback.h>
 
 
@@ -761,10 +762,16 @@ start:
 	if (new_size > xip->i_size)
 		io->io_new_size = new_size;
 
-	if (likely(!(ioflags & IO_INVIS))) {
+	/*
+	 * We're not supposed to change timestamps in readonly-mounted
+	 * filesystems.  Throw it away if anyone asks us.
+	 */
+	if (likely(!(ioflags & IO_INVIS) &&
+		   !mnt_want_write(file->f_vfsmnt))) {
 		file_update_time(file);
 		xfs_ichgtime_fast(xip, inode,
 				  XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+		mnt_drop_write(file->f_vfsmnt);
 	}
 
 	/*
_


* [PATCH 07/23] elevate writer count for chown and friends
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
                   ` (5 preceding siblings ...)
  2007-07-12  0:17 ` [PATCH 06/23] r/o bind mounts: elevate write count for some ioctls Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 08/23] make access() use mnt check Dave Hansen
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


chown, chmod, etc. don't call permission() in the same way
that the normal "open for write" paths do.  They still
write to the filesystem, so bump the write count during
these operations.
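
The pairing used in the hunks below can be sketched in userspace as
follows.  This is only an illustration, not kernel code: struct
mock_mount, mock_want_write(), mock_drop_write(), mock_chown_common()
and mock_chown() are hypothetical stand-ins, and returning -EROFS from
the "want" call on a read-only mount is an assumption carried over from
the IS_RDONLY() checks being replaced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct vfsmount (not the kernel type). */
struct mock_mount {
	bool readonly;	/* mount is read-only */
	int writers;	/* writers currently holding the mount r/w */
};

/* Stand-in for mnt_want_write(): refuse on a r/o mount, else count a writer. */
static int mock_want_write(struct mock_mount *mnt)
{
	if (mnt->readonly)
		return -EROFS;	/* assumed error code, see lead-in */
	mnt->writers++;
	return 0;
}

/* Stand-in for mnt_drop_write(): release the writer taken above. */
static void mock_drop_write(struct mock_mount *mnt)
{
	mnt->writers--;
}

/* Stand-in for chown_common(): fails with -EPERM for an immutable inode. */
static int mock_chown_common(bool immutable)
{
	return immutable ? -EPERM : 0;
}

/* Same shape as the sys_chown() hunk: want, operate, drop, on every path. */
static int mock_chown(struct mock_mount *mnt, bool immutable)
{
	int error;

	error = mock_want_write(mnt);
	if (error)
		goto out;	/* nothing was taken, so nothing to drop */
	error = mock_chown_common(immutable);
	mock_drop_write(mnt);	/* dropped on success and on failure */
out:
	return error;
}
```

The point of the sketch is that the writer count returns to zero on
every exit path, including the one where the operation itself fails
after the write access was granted.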

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/open.c |   39 ++++++++++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 9 deletions(-)

diff -puN fs/open.c~elevate-writer-count-for-chown-and-friends fs/open.c
--- lxc/fs/open.c~elevate-writer-count-for-chown-and-friends	2007-07-10 12:46:06.000000000 -0700
+++ lxc-dave/fs/open.c	2007-07-10 12:46:06.000000000 -0700
@@ -510,12 +510,12 @@ asmlinkage long sys_fchmod(unsigned int 
 
 	audit_inode(NULL, inode);
 
-	err = -EROFS;
-	if (IS_RDONLY(inode))
+	err = mnt_want_write(file->f_vfsmnt);
+	if (err)
 		goto out_putf;
 	err = -EPERM;
 	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
-		goto out_putf;
+		goto out_drop_write;
 	mutex_lock(&inode->i_mutex);
 	if (mode == (mode_t) -1)
 		mode = inode->i_mode;
@@ -524,6 +524,8 @@ asmlinkage long sys_fchmod(unsigned int 
 	err = notify_change(dentry, &newattrs);
 	mutex_unlock(&inode->i_mutex);
 
+out_drop_write:
+	mnt_drop_write(file->f_vfsmnt);
 out_putf:
 	fput(file);
 out:
@@ -543,13 +545,13 @@ asmlinkage long sys_fchmodat(int dfd, co
 		goto out;
 	inode = nd.dentry->d_inode;
 
-	error = -EROFS;
-	if (IS_RDONLY(inode))
+	error = mnt_want_write(nd.mnt);
+	if (error)
 		goto dput_and_out;
 
 	error = -EPERM;
 	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
-		goto dput_and_out;
+		goto out_drop_write;
 
 	mutex_lock(&inode->i_mutex);
 	if (mode == (mode_t) -1)
@@ -559,6 +561,8 @@ asmlinkage long sys_fchmodat(int dfd, co
 	error = notify_change(nd.dentry, &newattrs);
 	mutex_unlock(&inode->i_mutex);
 
+out_drop_write:
+	mnt_drop_write(nd.mnt);
 dput_and_out:
 	path_release(&nd);
 out:
@@ -581,9 +585,6 @@ static int chown_common(struct dentry * 
 		printk(KERN_ERR "chown_common: NULL inode\n");
 		goto out;
 	}
-	error = -EROFS;
-	if (IS_RDONLY(inode))
-		goto out;
 	error = -EPERM;
 	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
 		goto out;
@@ -613,7 +614,12 @@ asmlinkage long sys_chown(const char __u
 	error = user_path_walk(filename, &nd);
 	if (error)
 		goto out;
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		goto out_release;
 	error = chown_common(nd.dentry, user, group);
+	mnt_drop_write(nd.mnt);
+out_release:
 	path_release(&nd);
 out:
 	return error;
@@ -633,7 +639,12 @@ asmlinkage long sys_fchownat(int dfd, co
 	error = __user_walk_fd(dfd, filename, follow, &nd);
 	if (error)
 		goto out;
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		goto out_release;
 	error = chown_common(nd.dentry, user, group);
+	mnt_drop_write(nd.mnt);
+out_release:
 	path_release(&nd);
 out:
 	return error;
@@ -647,7 +658,12 @@ asmlinkage long sys_lchown(const char __
 	error = user_path_walk_link(filename, &nd);
 	if (error)
 		goto out;
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		goto out_release;
 	error = chown_common(nd.dentry, user, group);
+	mnt_drop_write(nd.mnt);
+out_release:
 	path_release(&nd);
 out:
 	return error;
@@ -664,9 +680,14 @@ asmlinkage long sys_fchown(unsigned int 
 	if (!file)
 		goto out;
 
+	error = mnt_want_write(file->f_vfsmnt);
+	if (error)
+		goto out_fput;
 	dentry = file->f_path.dentry;
 	audit_inode(NULL, dentry->d_inode);
 	error = chown_common(dentry, user, group);
+	mnt_drop_write(file->f_vfsmnt);
+out_fput:
 	fput(file);
 out:
 	return error;
_


* [PATCH 08/23] make access() use mnt check
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
                   ` (6 preceding siblings ...)
  2007-07-12  0:17 ` [PATCH 07/23] elevate writer count for chown and friends Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 09/23] elevate mnt writers for callers of vfs_mkdir() Dave Hansen
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


It is OK to let access() go without using a mnt_want/drop_write()
pair because it doesn't actually do writes to the filesystem,
and it is inherently racy anyway.  This is a rare case when it is
OK to use __mnt_is_readonly() directly.
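
To illustrate why a bare check is racy, here is a small userspace
sketch.  The names struct mock_mount, mock_is_readonly() and
mock_access_write() are hypothetical and only mimic the shape of the
hunk below; nothing here is the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct vfsmount (not the kernel type). */
struct mock_mount {
	bool readonly;	/* mount is read-only */
	int writers;	/* writers currently holding the mount r/w */
};

/* Stand-in for __mnt_is_readonly(): a point-in-time look at the flag. */
static bool mock_is_readonly(const struct mock_mount *mnt)
{
	return mnt->readonly;
}

/*
 * access()-style check: it only reports the state at this instant and
 * takes no writer count, so nothing keeps the mount r/w afterwards.
 */
static int mock_access_write(const struct mock_mount *mnt)
{
	return mock_is_readonly(mnt) ? -EROFS : 0;
}
```

A caller that gets 0 from mock_access_write() has learned nothing
durable: the mount may be flipped to read-only right after the call,
and the writer count never moved to prevent or record that.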

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/open.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff -puN fs/open.c~make-access-use-helper fs/open.c
--- lxc/fs/open.c~make-access-use-helper	2007-07-10 12:46:07.000000000 -0700
+++ lxc-dave/fs/open.c	2007-07-10 12:46:07.000000000 -0700
@@ -396,8 +396,17 @@ asmlinkage long sys_faccessat(int dfd, c
 	if(res || !(mode & S_IWOTH) ||
 	   special_file(nd.dentry->d_inode->i_mode))
 		goto out_path_release;
-
-	if(IS_RDONLY(nd.dentry->d_inode))
+	/*
+	 * This is a rare case where using __mnt_is_readonly()
+	 * is OK without a mnt_want/drop_write() pair.  Since
+	 * no actual write to the fs is performed here, we do
+	 * not need to telegraph that to anyone.
+	 *
+	 * By doing this, we accept that this access is
+	 * inherently racy and know that the fs may change
+	 * state before we even see this result.
+	 */
+	if (__mnt_is_readonly(nd.mnt))
 		res = -EROFS;
 
 out_path_release:
_


* [PATCH 09/23] elevate mnt writers for callers of vfs_mkdir()
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
                   ` (7 preceding siblings ...)
  2007-07-12  0:17 ` [PATCH 08/23] make access() use mnt check Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 10/23] elevate write count during entire ncp_ioctl() Dave Hansen
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


Pretty self-explanatory.  Fits in with the rest of the series.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/namei.c            |    5 +++++
 lxc-dave/fs/nfsd/nfs4recover.c |    4 ++++
 2 files changed, 9 insertions(+)

diff -puN fs/namei.c~elevate-mnt-writers-for-callers-of-vfs-mkdir fs/namei.c
--- lxc/fs/namei.c~elevate-mnt-writers-for-callers-of-vfs-mkdir	2007-07-10 12:46:07.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-07-10 12:46:07.000000000 -0700
@@ -1993,7 +1993,12 @@ asmlinkage long sys_mkdirat(int dfd, con
 
 	if (!IS_POSIXACL(nd.dentry->d_inode))
 		mode &= ~current->fs->umask;
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		goto out_dput;
 	error = vfs_mkdir(nd.dentry->d_inode, dentry, mode);
+	mnt_drop_write(nd.mnt);
+out_dput:
 	dput(dentry);
 out_unlock:
 	mutex_unlock(&nd.dentry->d_inode->i_mutex);
diff -puN fs/nfsd/nfs4recover.c~elevate-mnt-writers-for-callers-of-vfs-mkdir fs/nfsd/nfs4recover.c
--- lxc/fs/nfsd/nfs4recover.c~elevate-mnt-writers-for-callers-of-vfs-mkdir	2007-07-10 12:46:07.000000000 -0700
+++ lxc-dave/fs/nfsd/nfs4recover.c	2007-07-10 12:46:07.000000000 -0700
@@ -156,7 +156,11 @@ nfsd4_create_clid_dir(struct nfs4_client
 		dprintk("NFSD: nfsd4_create_clid_dir: DIRECTORY EXISTS\n");
 		goto out_put;
 	}
+	status = mnt_want_write(rec_dir.mnt);
+	if (status)
+		goto out_put;
 	status = vfs_mkdir(rec_dir.dentry->d_inode, dentry, S_IRWXU);
+	mnt_drop_write(rec_dir.mnt);
 out_put:
 	dput(dentry);
 out_unlock:
_


* [PATCH 10/23] elevate write count during entire ncp_ioctl()
  2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
                   ` (8 preceding siblings ...)
  2007-07-12  0:17 ` [PATCH 09/23] elevate mnt writers for callers of vfs_mkdir() Dave Hansen
@ 2007-07-12  0:17 ` Dave Hansen
  2007-07-12  0:17 ` [PATCH 11/23] elevate write count for link and symlink calls Dave Hansen
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


Some ioctls need write access, but others don't.  Add a helper
function that decides when write access is needed, and hold that
access for the duration of the ioctl.
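
The wrapper shape can be sketched in userspace like this.  Everything
here is hypothetical (enum mock_cmd, struct mock_mount and the mock_*
functions are not ncpfs names), and unlike the hunk below the sketch
evaluates the helper once and reuses the result.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct vfsmount (not the kernel type). */
struct mock_mount {
	bool readonly;
	int writers;
};

/* Hypothetical command set; not the NCP_IOC_* values. */
enum mock_cmd { MOCK_CMD_QUERY, MOCK_CMD_MODIFY, MOCK_CMD_UNKNOWN };

static int mock_want_write(struct mock_mount *mnt)
{
	if (mnt->readonly)
		return -EROFS;
	mnt->writers++;
	return 0;
}

static void mock_drop_write(struct mock_mount *mnt)
{
	mnt->writers--;
}

/* Decide per command whether write access must be held. */
static bool mock_ioctl_need_write(enum mock_cmd cmd)
{
	switch (cmd) {
	case MOCK_CMD_MODIFY:
		return true;
	case MOCK_CMD_QUERY:
		return false;
	default:
		/* unknown command: assume it writes */
		return true;
	}
}

/* Stand-in for the real ioctl body: reports the writers held while it runs. */
static int mock_ioctl_body(struct mock_mount *mnt, enum mock_cmd cmd)
{
	(void)cmd;
	return mnt->writers;
}

/* Wrapper: take write access only when needed, and always drop it again. */
static int mock_ioctl(struct mock_mount *mnt, enum mock_cmd cmd)
{
	bool need_write = mock_ioctl_need_write(cmd);
	int ret;

	if (need_write && mock_want_write(mnt))
		return -EACCES;	/* mirrors the error choice in the patch */
	ret = mock_ioctl_body(mnt, cmd);
	if (need_write)
		mock_drop_write(mnt);
	return ret;
}
```

Keeping the decision in one helper means the body of the ioctl does not
need a want/drop pair at each of its many return sites.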

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/ncpfs/ioctl.c |   55 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff -puN fs/ncpfs/ioctl.c~elevate-write-count-during-entire-ncp-ioctl fs/ncpfs/ioctl.c
--- lxc/fs/ncpfs/ioctl.c~elevate-write-count-during-entire-ncp-ioctl	2007-07-10 12:46:08.000000000 -0700
+++ lxc-dave/fs/ncpfs/ioctl.c	2007-07-10 12:46:08.000000000 -0700
@@ -14,6 +14,7 @@
 #include <linux/ioctl.h>
 #include <linux/time.h>
 #include <linux/mm.h>
+#include <linux/mount.h>
 #include <linux/highuid.h>
 #include <linux/smp_lock.h>
 #include <linux/vmalloc.h>
@@ -261,7 +262,7 @@ ncp_get_charsets(struct ncp_server* serv
 }
 #endif /* CONFIG_NCPFS_NLS */
 
-int ncp_ioctl(struct inode *inode, struct file *filp,
+static int __ncp_ioctl(struct inode *inode, struct file *filp,
 	      unsigned int cmd, unsigned long arg)
 {
 	struct ncp_server *server = NCP_SERVER(inode);
@@ -822,6 +823,58 @@ outrel:			
 	return -EINVAL;
 }
 
+static int ncp_ioctl_need_write(unsigned int cmd)
+{
+	switch (cmd) {
+	case NCP_IOC_GET_FS_INFO:
+	case NCP_IOC_GET_FS_INFO_V2:
+	case NCP_IOC_NCPREQUEST:
+	case NCP_IOC_SETDENTRYTTL:
+	case NCP_IOC_SIGN_INIT:
+	case NCP_IOC_LOCKUNLOCK:
+	case NCP_IOC_SET_SIGN_WANTED:
+		return 1;
+	case NCP_IOC_GETOBJECTNAME:
+	case NCP_IOC_SETOBJECTNAME:
+	case NCP_IOC_GETPRIVATEDATA:
+	case NCP_IOC_SETPRIVATEDATA:
+	case NCP_IOC_SETCHARSETS:
+	case NCP_IOC_GETCHARSETS:
+	case NCP_IOC_CONN_LOGGED_IN:
+	case NCP_IOC_GETDENTRYTTL:
+	case NCP_IOC_GETMOUNTUID2:
+	case NCP_IOC_SIGN_WANTED:
+	case NCP_IOC_GETROOT:
+	case NCP_IOC_SETROOT:
+		return 0;
+	default:
+		/* unknown IOCTL command, assume write */
+		WARN_ON(1);
+	}
+	return 1;
+}
+
+int ncp_ioctl(struct inode *inode, struct file *filp,
+	      unsigned int cmd, unsigned long arg)
+{
+	int ret;
+
+	if (ncp_ioctl_need_write(cmd)) {
+		/*
+		 * inside the ioctl(), any failures which
+		 * are because of file_permission() are
+		 * -EACCES, so it seems consistent to keep
+		 * that here.
+		 */
+		if (mnt_want_write(filp->f_vfsmnt))
+			return -EACCES;
+	}
+	ret = __ncp_ioctl(inode, filp, cmd, arg);
+	if (ncp_ioctl_need_write(cmd))
+		mnt_drop_write(filp->f_vfsmnt);
+	return ret;
+}
+
 #ifdef CONFIG_COMPAT
 long ncp_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
_
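
The wrapper above can be sketched in userspace as follows. This is only a minimal illustration, not the kernel code: `struct ioc_mount`, `ioc_want_write()`, `ioc_drop_write()` and the two command numbers are hypothetical stand-ins for the vfsmount, mnt_want_write(), mnt_drop_write() and the NCP_IOC_* values. The one liberty taken is that the classification is evaluated once and remembered, so the take and the drop are guaranteed to pair up.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a mount; not the kernel's struct vfsmount. */
struct ioc_mount {
	int readonly;	/* nonzero: writers are refused */
	int writers;	/* write references currently held */
};

/* Stand-in for mnt_want_write(): take a write reference or fail. */
static int ioc_want_write(struct ioc_mount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

/* Stand-in for mnt_drop_write(): give the reference back. */
static void ioc_drop_write(struct ioc_mount *m)
{
	assert(m->writers > 0);
	m->writers--;
}

/* Hypothetical command numbers, for illustration only. */
enum { IOC_GET_INFO = 1, IOC_SET_INFO = 2 };

/* Classify a command; unknown commands are treated as writes. */
static int ioc_need_write(unsigned int cmd)
{
	switch (cmd) {
	case IOC_GET_INFO:
		return 0;
	case IOC_SET_INFO:
	default:
		return 1;
	}
}

/* The "real" ioctl body; it just reports how many writers are held. */
static int ioc_do(struct ioc_mount *m, unsigned int cmd)
{
	(void)cmd;
	return m->writers;
}

/* Wrapper: classify once, take the reference, call, drop the reference. */
static int ioc_ioctl(struct ioc_mount *m, unsigned int cmd)
{
	int need_write = ioc_need_write(cmd);
	int ret;

	if (need_write && ioc_want_write(m))
		return -EACCES;
	ret = ioc_do(m, cmd);
	if (need_write)
		ioc_drop_write(m);
	return ret;
}
```

Read-only commands never touch the writer count, so they keep working on a mount that refuses writers.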

* [PATCH 11/23] elevate write count for link and symlink calls
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen



Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/namei.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff -puN fs/namei.c~elevate-write-count-for-link-and-symlink-calls fs/namei.c
--- lxc/fs/namei.c~elevate-write-count-for-link-and-symlink-calls	2007-07-10 12:46:09.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-07-10 12:46:09.000000000 -0700
@@ -2266,7 +2266,12 @@ asmlinkage long sys_symlinkat(const char
 	if (IS_ERR(dentry))
 		goto out_unlock;
 
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		goto out_dput;
 	error = vfs_symlink(nd.dentry->d_inode, dentry, from, S_IALLUGO);
+	mnt_drop_write(nd.mnt);
+out_dput:
 	dput(dentry);
 out_unlock:
 	mutex_unlock(&nd.dentry->d_inode->i_mutex);
@@ -2361,7 +2366,12 @@ asmlinkage long sys_linkat(int olddfd, c
 	error = PTR_ERR(new_dentry);
 	if (IS_ERR(new_dentry))
 		goto out_unlock;
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		goto out_dput;
 	error = vfs_link(old_nd.dentry, nd.dentry->d_inode, new_dentry);
+	mnt_drop_write(nd.mnt);
+out_dput:
 	dput(new_dentry);
 out_unlock:
 	mutex_unlock(&nd.dentry->d_inode->i_mutex);
_

* [PATCH 12/23] elevate mount count for extended attributes
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


This basically audits the callers of xattr_permission(), which
calls permission() and can perform writes to the filesystem.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/nfsd/nfs4proc.c |    7 ++++++-
 lxc-dave/fs/xattr.c         |   16 ++++++++++++++--
 2 files changed, 20 insertions(+), 3 deletions(-)

diff -puN fs/nfsd/nfs4proc.c~elevate-mount-count-for-extended-attributes fs/nfsd/nfs4proc.c
--- lxc/fs/nfsd/nfs4proc.c~elevate-mount-count-for-extended-attributes	2007-07-10 12:46:10.000000000 -0700
+++ lxc-dave/fs/nfsd/nfs4proc.c	2007-07-10 12:46:10.000000000 -0700
@@ -626,14 +626,19 @@ nfsd4_setattr(struct svc_rqst *rqstp, st
 			return status;
 		}
 	}
+	status = mnt_want_write(cstate->current_fh.fh_export->ex_mnt);
+	if (status)
+		return status;
 	status = nfs_ok;
 	if (setattr->sa_acl != NULL)
 		status = nfsd4_set_nfs4_acl(rqstp, &cstate->current_fh,
 					    setattr->sa_acl);
 	if (status)
-		return status;
+		goto out;
 	status = nfsd_setattr(rqstp, &cstate->current_fh, &setattr->sa_iattr,
 				0, (time_t)0);
+out:
+	mnt_drop_write(cstate->current_fh.fh_export->ex_mnt);
 	return status;
 }
 
diff -puN fs/xattr.c~elevate-mount-count-for-extended-attributes fs/xattr.c
--- lxc/fs/xattr.c~elevate-mount-count-for-extended-attributes	2007-07-10 12:46:10.000000000 -0700
+++ lxc-dave/fs/xattr.c	2007-07-10 12:46:10.000000000 -0700
@@ -11,6 +11,7 @@
 #include <linux/slab.h>
 #include <linux/file.h>
 #include <linux/xattr.h>
+#include <linux/mount.h>
 #include <linux/namei.h>
 #include <linux/security.h>
 #include <linux/syscalls.h>
@@ -32,8 +33,6 @@ xattr_permission(struct inode *inode, co
 	 * filesystem  or on an immutable / append-only inode.
 	 */
 	if (mask & MAY_WRITE) {
-		if (IS_RDONLY(inode))
-			return -EROFS;
 		if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
 			return -EPERM;
 	}
@@ -236,7 +235,11 @@ sys_setxattr(char __user *path, char __u
 	error = user_path_walk(path, &nd);
 	if (error)
 		return error;
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		return error;
 	error = setxattr(nd.dentry, name, value, size, flags);
+	mnt_drop_write(nd.mnt);
 	path_release(&nd);
 	return error;
 }
@@ -251,7 +254,11 @@ sys_lsetxattr(char __user *path, char __
 	error = user_path_walk_link(path, &nd);
 	if (error)
 		return error;
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		return error;
 	error = setxattr(nd.dentry, name, value, size, flags);
+	mnt_drop_write(nd.mnt);
 	path_release(&nd);
 	return error;
 }
@@ -267,9 +274,14 @@ sys_fsetxattr(int fd, char __user *name,
 	f = fget(fd);
 	if (!f)
 		return error;
+	error = mnt_want_write(f->f_vfsmnt);
+	if (error)
+		goto out_fput;
 	dentry = f->f_path.dentry;
 	audit_inode(NULL, dentry->d_inode);
 	error = setxattr(dentry, name, value, size, flags);
+	mnt_drop_write(f->f_vfsmnt);
+out_fput:
 	fput(f);
 	return error;
 }
_
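
One thing to watch in callers shaped like sys_setxattr(): once the path lookup has succeeded, the path reference still has to be released when the write reference is refused, and the early `return error` in the hunks above appears to skip path_release(). A minimal userspace sketch of the ordering, with every name (`struct xa_mount`, `struct xa_path`, `xa_lookup()`, and so on) being a hypothetical stand-in rather than kernel API:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins; not the kernel's vfsmount or nameidata. */
struct xa_mount {
	int readonly;
	int writers;
};

struct xa_path {
	struct xa_mount *mnt;
	int refs;	/* references taken by lookups and not yet released */
};

static int xa_want_write(struct xa_mount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

static void xa_drop_write(struct xa_mount *m)
{
	assert(m->writers > 0);
	m->writers--;
}

/* Stand-in for the path walk: takes a reference on success. */
static int xa_lookup(struct xa_path *p)
{
	p->refs++;
	return 0;
}

/* Stand-in for path_release(). */
static void xa_release(struct xa_path *p)
{
	assert(p->refs > 0);
	p->refs--;
}

static int xa_setxattr(struct xa_path *p)
{
	int error;

	error = xa_lookup(p);
	if (error)
		return error;	/* nothing acquired yet */
	error = xa_want_write(p->mnt);
	if (error)
		goto out_release;	/* path is held: must release it */
	/* the attribute update itself would go here */
	xa_drop_write(p->mnt);
out_release:
	xa_release(p);
	return error;
}
```

Both the success path and the refused-write path leave `refs` and `writers` at zero.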

* [PATCH 13/23] elevate write count for file_update_time()
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen



Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/inode.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff -puN fs/inode.c~elevate-write-count-for-file_update_time fs/inode.c
--- lxc/fs/inode.c~elevate-write-count-for-file_update_time	2007-07-10 12:46:10.000000000 -0700
+++ lxc-dave/fs/inode.c	2007-07-10 12:46:10.000000000 -0700
@@ -1221,10 +1221,19 @@ void file_update_time(struct file *file)
 	struct inode *inode = file->f_path.dentry->d_inode;
 	struct timespec now;
 	int sync_it = 0;
+	int err = 0;
 
 	if (IS_NOCMTIME(inode))
 		return;
-	if (IS_RDONLY(inode))
+	/*
+	 * Ideally, we want to guarantee that 'f_vfsmnt'
+	 * is non-NULL here.  But, NFS exports need to
+	 * be fixed up before we can do that.  So, check
+	 * it for now. - Dave Hansen
+	 */
+	if (file->f_vfsmnt)
+		err = mnt_want_write(file->f_vfsmnt);
+	if (err)
 		return;
 
 	now = current_fs_time(inode->i_sb);
@@ -1240,6 +1249,8 @@ void file_update_time(struct file *file)
 
 	if (sync_it)
 		mark_inode_dirty_sync(inode);
+	if (file->f_vfsmnt)
+		mnt_drop_write(file->f_vfsmnt);
 }
 
 EXPORT_SYMBOL(file_update_time);
_
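
The "mount may be NULL" handling above can be folded into a pair of tolerant helpers. The following is a userspace sketch under stated assumptions only: `struct opt_mount`, `opt_want_write()`, `opt_drop_write()` and `opt_update_time()` are made-up names, and a NULL mount is simply treated as "nothing to count".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a mount. */
struct opt_mount {
	int readonly;
	int writers;
};

/* A NULL mount is tolerated and treated as writable. */
static int opt_want_write(struct opt_mount *m)
{
	if (!m)
		return 0;
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

static void opt_drop_write(struct opt_mount *m)
{
	if (!m)
		return;
	assert(m->writers > 0);
	m->writers--;
}

/* Returns 1 if the timestamp was updated, 0 if the update was skipped. */
static int opt_update_time(struct opt_mount *m, long *stamp, long now)
{
	if (opt_want_write(m))
		return 0;	/* read-only mount: skip quietly */
	*stamp = now;
	opt_drop_write(m);
	return 1;
}
```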

* [PATCH 14/23] mount_is_safe(): add comment
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


This area of code is currently #ifdef'd out, so add a comment
for the time when it is actually used.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/namespace.c |    4 ++++
 1 file changed, 4 insertions(+)

diff -puN fs/namespace.c~mount-is-safe-add-comment fs/namespace.c
--- lxc/fs/namespace.c~mount-is-safe-add-comment	2007-07-10 12:46:11.000000000 -0700
+++ lxc-dave/fs/namespace.c	2007-07-10 12:46:11.000000000 -0700
@@ -728,6 +728,10 @@ static int mount_is_safe(struct nameidat
 		if (current->uid != nd->dentry->d_inode->i_uid)
 			return -EPERM;
 	}
+	/*
+	 * We will eventually check for the mnt->writer_count here,
+	 * but since the code is not used now, skip it - Dave Hansen
+	 */
 	if (vfs_permission(nd, MAY_WRITE))
 		return -EPERM;
 	return 0;
_

* [PATCH 15/23] unix_find_other() elevate write count for touch_atime()
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen



Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/net/unix/af_unix.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff -puN net/unix/af_unix.c~unix-find-other-elevate-write-count-for-touch-atime net/unix/af_unix.c
--- lxc/net/unix/af_unix.c~unix-find-other-elevate-write-count-for-touch-atime	2007-07-10 12:46:11.000000000 -0700
+++ lxc-dave/net/unix/af_unix.c	2007-07-10 12:46:11.000000000 -0700
@@ -702,21 +702,27 @@ static struct sock *unix_find_other(stru
 		err = path_lookup(sunname->sun_path, LOOKUP_FOLLOW, &nd);
 		if (err)
 			goto fail;
+
+		err = mnt_want_write(nd.mnt);
+		if (err)
+			goto put_path_fail;
+
 		err = vfs_permission(&nd, MAY_WRITE);
 		if (err)
-			goto put_fail;
+			goto mnt_drop_write_fail;
 
 		err = -ECONNREFUSED;
 		if (!S_ISSOCK(nd.dentry->d_inode->i_mode))
-			goto put_fail;
+			goto mnt_drop_write_fail;
 		u=unix_find_socket_byinode(nd.dentry->d_inode);
 		if (!u)
-			goto put_fail;
+			goto mnt_drop_write_fail;
 
 		if (u->sk_type == type)
 			touch_atime(nd.mnt, nd.dentry);
 
+		mnt_drop_write(nd.mnt);
 		path_release(&nd);
 
 		err=-EPROTOTYPE;
 		if (u->sk_type != type) {
@@ -736,7 +742,9 @@ static struct sock *unix_find_other(stru
 	}
 	return u;
 
-put_fail:
+mnt_drop_write_fail:
+	mnt_drop_write(nd.mnt);
+put_path_fail:
 	path_release(&nd);
 fail:
 	*error=err;
_

* [PATCH 16/23] elevate write count over calls to vfs_rename()
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


This also uses the svc_msnfs() helper, introduced in patch
02 of this series, to make an if() in the NFS code a little
bit less ugly.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/namei.c    |    4 ++++
 lxc-dave/fs/nfsd/vfs.c |   15 +++++++++++----
 2 files changed, 15 insertions(+), 4 deletions(-)

diff -puN fs/namei.c~elevate-write-count-over-calls-to-vfs-rename fs/namei.c
--- lxc/fs/namei.c~elevate-write-count-over-calls-to-vfs-rename	2007-07-10 12:46:12.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-07-10 12:46:12.000000000 -0700
@@ -2597,8 +2597,12 @@ static int do_rename(int olddfd, const c
 	if (new_dentry == trap)
 		goto exit5;
 
+	error = mnt_want_write(oldnd.mnt);
+	if (error)
+		goto exit5;
 	error = vfs_rename(old_dir->d_inode, old_dentry,
 				   new_dir->d_inode, new_dentry);
+	mnt_drop_write(oldnd.mnt);
 exit5:
 	dput(new_dentry);
 exit4:
diff -puN fs/nfsd/vfs.c~elevate-write-count-over-calls-to-vfs-rename fs/nfsd/vfs.c
--- lxc/fs/nfsd/vfs.c~elevate-write-count-over-calls-to-vfs-rename	2007-07-10 12:46:12.000000000 -0700
+++ lxc-dave/fs/nfsd/vfs.c	2007-07-10 12:46:12.000000000 -0700
@@ -1622,13 +1622,20 @@ nfsd_rename(struct svc_rqst *rqstp, stru
 	if (ndentry == trap)
 		goto out_dput_new;
 
-#ifdef MSNFS
-	if ((ffhp->fh_export->ex_flags & NFSEXP_MSNFS) &&
+	if (svc_msnfs(ffhp) &&
 		((atomic_read(&odentry->d_count) > 1)
 		 || (atomic_read(&ndentry->d_count) > 1))) {
 			host_err = -EPERM;
-	} else
-#endif
+			goto out_dput_new;
+	}
+
+	host_err = -EXDEV;
+	if (ffhp->fh_export->ex_mnt != tfhp->fh_export->ex_mnt)
+		goto out_dput_new;
+	host_err = mnt_want_write(ffhp->fh_export->ex_mnt);
+	if (host_err)
+		goto out_dput_new;
+
 	host_err = vfs_rename(fdir, odentry, tdir, ndentry);
 	if (!host_err && EX_ISSYNC(tfhp->fh_export)) {
 		host_err = nfsd_sync_dir(tdentry);
_
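
The order of checks in the nfsd_rename() hunk (refuse a cross-mount rename first, then take the write reference) can be sketched in userspace as below. All names are hypothetical stand-ins. Note that the sketch also drops the reference after the rename; the hunk above does not show a matching mnt_drop_write(), so treat the drop here as an assumption about the intended pairing.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a mount. */
struct rn_mount {
	int readonly;
	int writers;
};

static int rn_want_write(struct rn_mount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

static void rn_drop_write(struct rn_mount *m)
{
	assert(m->writers > 0);
	m->writers--;
}

static int rn_rename(struct rn_mount *from, struct rn_mount *to)
{
	int err;

	/* A rename cannot cross mounts, so one write reference suffices. */
	if (from != to)
		return -EXDEV;
	err = rn_want_write(from);
	if (err)
		return err;
	/* the rename itself would go here */
	rn_drop_write(from);
	return 0;
}
```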

* [PATCH 17/23] nfs: check mnt instead of superblock directly
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


If we depend on the inodes for writeability, we will not
catch r/o bind mounts once they are implemented.

This patch uses __mnt_is_readonly().  It does not guarantee
that the mount will stay writeable after the check.  But,
this is OK for one of the checks because it is just for a
printk().

The other two are probably unnecessary and duplicate existing
checks in the VFS.  This won't make them better checks than
before, but it will make them detect r/o mounts.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/nfs/dir.c  |    3 ++-
 lxc-dave/fs/nfsd/vfs.c |    4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff -puN fs/nfs/dir.c~nfs-check-mnt-instead-of-sb fs/nfs/dir.c
--- lxc/fs/nfs/dir.c~nfs-check-mnt-instead-of-sb	2007-07-10 12:46:13.000000000 -0700
+++ lxc-dave/fs/nfs/dir.c	2007-07-10 12:46:13.000000000 -0700
@@ -998,7 +998,8 @@ static int is_atomic_open(struct inode *
 	if (nd->flags & LOOKUP_DIRECTORY)
 		return 0;
 	/* Are we trying to write to a read only partition? */
-	if (IS_RDONLY(dir) && (nd->intent.open.flags & (O_CREAT|O_TRUNC|FMODE_WRITE)))
+	if (__mnt_is_readonly(nd->mnt) &&
+	    (nd->intent.open.flags & (O_CREAT|O_TRUNC|FMODE_WRITE)))
 		return 0;
 	return 1;
 }
diff -puN fs/nfsd/vfs.c~nfs-check-mnt-instead-of-sb fs/nfsd/vfs.c
--- lxc/fs/nfsd/vfs.c~nfs-check-mnt-instead-of-sb	2007-07-10 12:46:13.000000000 -0700
+++ lxc-dave/fs/nfsd/vfs.c	2007-07-10 12:46:13.000000000 -0700
@@ -1810,7 +1810,7 @@ nfsd_permission(struct svc_export *exp, 
 		inode->i_mode,
 		IS_IMMUTABLE(inode)?	" immut" : "",
 		IS_APPEND(inode)?	" append" : "",
-		IS_RDONLY(inode)?	" ro" : "");
+		__mnt_is_readonly(exp->ex_mnt)?	" ro" : "");
 	dprintk("      owner %d/%d user %d/%d\n",
 		inode->i_uid, inode->i_gid, current->fsuid, current->fsgid);
 #endif
@@ -1821,7 +1821,7 @@ nfsd_permission(struct svc_export *exp, 
 	 */
 	if (!(acc & MAY_LOCAL_ACCESS))
 		if (acc & (MAY_WRITE | MAY_SATTR | MAY_TRUNC)) {
-			if (EX_RDONLY(exp) || IS_RDONLY(inode))
+			if (EX_RDONLY(exp) || __mnt_is_readonly(exp->ex_mnt))
 				return nfserr_rofs;
 			if (/* (acc & MAY_WRITE) && */ IS_IMMUTABLE(inode))
 				return nfserr_perm;
_
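
The difference between a point-in-time check and a held write reference can be illustrated with a small userspace sketch. Everything here is hypothetical: `pin_is_readonly()` stands in for the one-shot check, `pin_want_write()` and `pin_drop_write()` for the counted API, and `pin_make_readonly()` models one assumed way a later implementation might refuse to flip a mount to r/o while writers exist. The series itself does not define that behaviour.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a mount. */
struct pin_mount {
	int readonly;
	int writers;
};

/* One-shot check: the answer can be stale as soon as it is returned. */
static int pin_is_readonly(const struct pin_mount *m)
{
	return m->readonly;
}

static int pin_want_write(struct pin_mount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

static void pin_drop_write(struct pin_mount *m)
{
	assert(m->writers > 0);
	m->writers--;
}

/* Assumed policy: a mount with active writers cannot become r/o. */
static int pin_make_readonly(struct pin_mount *m)
{
	if (m->writers > 0)
		return -EBUSY;
	m->readonly = 1;
	return 0;
}
```

A caller that only used the one-shot check would have no way to stop the mount from going read-only between the check and the write; a caller holding a write reference does.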

* [PATCH 18/23] elevate writer count for do_sys_truncate()
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen



Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/open.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff -puN fs/open.c~elevate-writer-count-for-do-sys-truncate fs/open.c
--- lxc/fs/open.c~elevate-writer-count-for-do-sys-truncate	2007-07-10 12:46:13.000000000 -0700
+++ lxc-dave/fs/open.c	2007-07-10 12:46:13.000000000 -0700
@@ -243,28 +243,28 @@ static long do_sys_truncate(const char _
 	if (!S_ISREG(inode->i_mode))
 		goto dput_and_out;
 
-	error = vfs_permission(&nd, MAY_WRITE);
+	error = mnt_want_write(nd.mnt);
 	if (error)
 		goto dput_and_out;
 
-	error = -EROFS;
-	if (IS_RDONLY(inode))
-		goto dput_and_out;
+	error = vfs_permission(&nd, MAY_WRITE);
+	if (error)
+		goto mnt_drop_write_and_out;
 
 	error = -EPERM;
 	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
-		goto dput_and_out;
+		goto mnt_drop_write_and_out;
 
 	/*
 	 * Make sure that there are no leases.
 	 */
 	error = break_lease(inode, FMODE_WRITE);
 	if (error)
-		goto dput_and_out;
+		goto mnt_drop_write_and_out;
 
 	error = get_write_access(inode);
 	if (error)
-		goto dput_and_out;
+		goto mnt_drop_write_and_out;
 
 	error = locks_verify_truncate(inode, NULL, length);
 	if (!error) {
@@ -273,6 +273,8 @@ static long do_sys_truncate(const char _
 	}
 	put_write_access(inode);
 
+mnt_drop_write_and_out:
+	mnt_drop_write(nd.mnt);
 dput_and_out:
 	path_release(&nd);
 out:
_

* [PATCH 19/23] elevate write count for do_utimes()
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen



Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/utimes.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff -puN fs/utimes.c~elevate-write-count-for-do-utimes fs/utimes.c
--- lxc/fs/utimes.c~elevate-write-count-for-do-utimes	2007-07-10 12:46:14.000000000 -0700
+++ lxc-dave/fs/utimes.c	2007-07-10 12:46:14.000000000 -0700
@@ -2,6 +2,7 @@
 #include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/linkage.h>
+#include <linux/mount.h>
 #include <linux/namei.h>
 #include <linux/sched.h>
 #include <linux/stat.h>
@@ -75,8 +76,8 @@ long do_utimes(int dfd, char __user *fil
 
 	inode = dentry->d_inode;
 
-	error = -EROFS;
-	if (IS_RDONLY(inode))
+	error = mnt_want_write(nd.mnt);
+	if (error)
 		goto dput_and_out;
 
 	/* Don't worry, the checks are done in inode_change_ok() */
@@ -84,7 +85,7 @@ long do_utimes(int dfd, char __user *fil
 	if (times) {
 		error = -EPERM;
                 if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
-                        goto dput_and_out;
+			goto mnt_drop_write_and_out;
 
 		if (times[0].tv_nsec == UTIME_OMIT)
 			newattrs.ia_valid &= ~ATTR_ATIME;
@@ -104,22 +105,24 @@ long do_utimes(int dfd, char __user *fil
 	} else {
 		error = -EACCES;
                 if (IS_IMMUTABLE(inode))
-                        goto dput_and_out;
+			goto mnt_drop_write_and_out;
 
 		if (current->fsuid != inode->i_uid) {
 			if (f) {
 				if (!(f->f_mode & FMODE_WRITE))
-					goto dput_and_out;
+					goto mnt_drop_write_and_out;
 			} else {
 				error = vfs_permission(&nd, MAY_WRITE);
 				if (error)
-					goto dput_and_out;
+					goto mnt_drop_write_and_out;
 			}
 		}
 	}
 	mutex_lock(&inode->i_mutex);
 	error = notify_change(dentry, &newattrs);
 	mutex_unlock(&inode->i_mutex);
+mnt_drop_write_and_out:
+	mnt_drop_write(nd.mnt);
 dput_and_out:
 	if (f)
 		fput(f);
_

* [PATCH 20/23] elevate write count for do_sys_utime() and touch_atime()
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen



Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/inode.c |   20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff -puN fs/inode.c~elevate-write-count-for-do-sys-utime-and-touch-atime fs/inode.c
--- lxc/fs/inode.c~elevate-write-count-for-do-sys-utime-and-touch-atime	2007-07-10 12:46:15.000000000 -0700
+++ lxc-dave/fs/inode.c	2007-07-10 12:46:15.000000000 -0700
@@ -1165,22 +1165,23 @@ void touch_atime(struct vfsmount *mnt, s
 	struct inode *inode = dentry->d_inode;
 	struct timespec now;
 
-	if (inode->i_flags & S_NOATIME)
+	if (mnt && mnt_want_write(mnt))
 		return;
+	if (inode->i_flags & S_NOATIME)
+		goto out;
 	if (IS_NOATIME(inode))
-		return;
+		goto out;
 	if ((inode->i_sb->s_flags & MS_NODIRATIME) && S_ISDIR(inode->i_mode))
-		return;
+		goto out;
 
 	/*
 	 * We may have a NULL vfsmount when coming from NFSD
 	 */
 	if (mnt) {
 		if (mnt->mnt_flags & MNT_NOATIME)
-			return;
+			goto out;
 		if ((mnt->mnt_flags & MNT_NODIRATIME) && S_ISDIR(inode->i_mode))
-			return;
-
+			goto out;
 		if (mnt->mnt_flags & MNT_RELATIME) {
 			/*
 			 * With relative atime, only update atime if the
@@ -1191,16 +1192,19 @@ void touch_atime(struct vfsmount *mnt, s
 						&inode->i_atime) < 0 &&
 			    timespec_compare(&inode->i_ctime,
 						&inode->i_atime) < 0)
-				return;
+				goto out;
 		}
 	}
 
 	now = current_fs_time(inode->i_sb);
 	if (timespec_equal(&inode->i_atime, &now))
-		return;
+		goto out;
 
 	inode->i_atime = now;
 	mark_inode_dirty_sync(inode);
+out:
+	if (mnt)
+		mnt_drop_write(mnt);
 }
 EXPORT_SYMBOL(touch_atime);
 
_

* [PATCH 21/23] sys_mknodat(): elevate write count for vfs_mknod/create()
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


This takes care of all of the direct callers of vfs_mknod().
Since a few of these cases also handle normal file creation,
this also covers some calls to vfs_create().

So that we don't have to make three mnt_want/drop_write()
calls inside of the switch statement, we move some of its
logic outside of the switch.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/namei.c         |   32 +++++++++++++++++++++-----------
 lxc-dave/fs/nfsd/vfs.c      |    4 ++++
 lxc-dave/net/unix/af_unix.c |    4 ++++
 3 files changed, 29 insertions(+), 11 deletions(-)

diff -puN fs/namei.c~sys-mknodat-elevate-write-count-for-vfs-mknod-create fs/namei.c
--- lxc/fs/namei.c~sys-mknodat-elevate-write-count-for-vfs-mknod-create	2007-07-10 12:46:15.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-07-11 10:10:55.000000000 -0700
@@ -1912,12 +1912,25 @@ asmlinkage long sys_mknodat(int dfd, con
 	if (error)
 		goto out;
 	dentry = lookup_create(&nd, 0);
-	error = PTR_ERR(dentry);
-
+	if (IS_ERR(dentry)) {
+		error = PTR_ERR(dentry);
+		goto out_unlock;
+	}
 	if (!IS_POSIXACL(nd.dentry->d_inode))
 		mode &= ~current->fs->umask;
-	if (!IS_ERR(dentry)) {
-		switch (mode & S_IFMT) {
+	if (S_ISDIR(mode)) {
+		error = -EPERM;
+		goto out_dput;
+	}
+	if (!S_ISREG(mode)  && !S_ISCHR(mode)  && !S_ISBLK(mode) &&
+	    !S_ISFIFO(mode) && !S_ISSOCK(mode) && (mode & S_IFMT) != 0) {
+		error = -EINVAL;
+		goto out_dput;
+	}
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		goto out_dput;
+	switch (mode & S_IFMT) {
 		case 0: case S_IFREG:
 			error = vfs_create(nd.dentry->d_inode,dentry,mode,&nd);
 			break;
@@ -1928,14 +1941,11 @@ asmlinkage long sys_mknodat(int dfd, con
 		case S_IFIFO: case S_IFSOCK:
 			error = vfs_mknod(nd.dentry->d_inode,dentry,mode,0);
 			break;
-		case S_IFDIR:
-			error = -EPERM;
-			break;
-		default:
-			error = -EINVAL;
-		}
-		dput(dentry);
 	}
+	mnt_drop_write(nd.mnt);
+out_dput:
+	dput(dentry);
+out_unlock:
 	mutex_unlock(&nd.dentry->d_inode->i_mutex);
 	path_release(&nd);
 out:
diff -puN fs/nfsd/vfs.c~sys-mknodat-elevate-write-count-for-vfs-mknod-create fs/nfsd/vfs.c
--- lxc/fs/nfsd/vfs.c~sys-mknodat-elevate-write-count-for-vfs-mknod-create	2007-07-10 12:46:15.000000000 -0700
+++ lxc-dave/fs/nfsd/vfs.c	2007-07-10 12:46:15.000000000 -0700
@@ -1199,7 +1199,11 @@ nfsd_create(struct svc_rqst *rqstp, stru
 	case S_IFBLK:
 	case S_IFIFO:
 	case S_IFSOCK:
+		host_err = mnt_want_write(fhp->fh_export->ex_mnt);
+		if (host_err)
+			break;
 		host_err = vfs_mknod(dirp, dchild, iap->ia_mode, rdev);
+		mnt_drop_write(fhp->fh_export->ex_mnt);
 		break;
 	default:
 	        printk("nfsd: bad file type %o in nfsd_create\n", type);
diff -puN net/unix/af_unix.c~sys-mknodat-elevate-write-count-for-vfs-mknod-create net/unix/af_unix.c
--- lxc/net/unix/af_unix.c~sys-mknodat-elevate-write-count-for-vfs-mknod-create	2007-07-10 12:46:15.000000000 -0700
+++ lxc-dave/net/unix/af_unix.c	2007-07-10 12:46:15.000000000 -0700
@@ -815,7 +815,11 @@ static int unix_bind(struct socket *sock
 		 */
 		mode = S_IFSOCK |
 		       (SOCK_INODE(sock)->i_mode & ~current->fs->umask);
+		err = mnt_want_write(nd.mnt);
+		if (err)
+			goto out_mknod_dput;
 		err = vfs_mknod(nd.dentry->d_inode, dentry, mode, 0);
+		mnt_drop_write(nd.mnt);
 		if (err)
 			goto out_mknod_dput;
 		mutex_unlock(&nd.dentry->d_inode->i_mutex);
_

* [PATCH 22/23] elevate mnt writers for vfs_unlink() callers
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen



Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/namei.c   |    4 ++++
 lxc-dave/ipc/mqueue.c |    5 ++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff -puN fs/namei.c~elevate-mnt-writers-for-vfs-unlink-callers fs/namei.c
--- lxc/fs/namei.c~elevate-mnt-writers-for-vfs-unlink-callers	2007-07-10 12:46:16.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-07-10 12:46:16.000000000 -0700
@@ -2195,7 +2195,11 @@ static long do_unlinkat(int dfd, const c
 		inode = dentry->d_inode;
 		if (inode)
 			atomic_inc(&inode->i_count);
+		error = mnt_want_write(nd.mnt);
+		if (error)
+			goto exit2;
 		error = vfs_unlink(nd.dentry->d_inode, dentry);
+		mnt_drop_write(nd.mnt);
 	exit2:
 		dput(dentry);
 	}
diff -puN ipc/mqueue.c~elevate-mnt-writers-for-vfs-unlink-callers ipc/mqueue.c
--- lxc/ipc/mqueue.c~elevate-mnt-writers-for-vfs-unlink-callers	2007-07-10 12:46:16.000000000 -0700
+++ lxc-dave/ipc/mqueue.c	2007-07-10 12:46:16.000000000 -0700
@@ -750,8 +750,11 @@ asmlinkage long sys_mq_unlink(const char
 	inode = dentry->d_inode;
 	if (inode)
 		atomic_inc(&inode->i_count);
-
+	err = mnt_want_write(mqueue_mnt);
+	if (err)
+		goto out_err;
 	err = vfs_unlink(dentry->d_parent->d_inode, dentry);
+	mnt_drop_write(mqueue_mnt);
 out_err:
 	dput(dentry);
 
_

* [PATCH 23/23] do_rmdir(): elevate write count
From: Dave Hansen @ 2007-07-12  0:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, viro, hch, Dave Hansen


Elevate the write count during the vfs_rmdir() call.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
---

 lxc-dave/fs/namei.c |    5 +++++
 1 file changed, 5 insertions(+)

diff -puN fs/namei.c~do-rmdir-elevate-write-count fs/namei.c
--- lxc/fs/namei.c~do-rmdir-elevate-write-count	2007-07-10 12:46:16.000000000 -0700
+++ lxc-dave/fs/namei.c	2007-07-10 12:46:16.000000000 -0700
@@ -2115,7 +2115,12 @@ static long do_rmdir(int dfd, const char
 	error = PTR_ERR(dentry);
 	if (IS_ERR(dentry))
 		goto exit2;
+	error = mnt_want_write(nd.mnt);
+	if (error)
+		goto exit3;
 	error = vfs_rmdir(nd.dentry->d_inode, dentry);
+	mnt_drop_write(nd.mnt);
+exit3:
 	dput(dentry);
 exit2:
 	mutex_unlock(&nd.dentry->d_inode->i_mutex);
_

Thread overview: 24+ messages
2007-07-12  0:17 [PATCH 00/23] Mount writer count API (read-only bind mounts prep) Dave Hansen
2007-07-12  0:17 ` [PATCH 01/23] rearrange may_open() to be r/o friendly Dave Hansen
2007-07-12  0:17 ` [PATCH 02/23] create cleanup helper svc_msnfs() Dave Hansen
2007-07-12  0:17 ` [PATCH 03/23] filesystem helpers for custom 'struct file's Dave Hansen
2007-07-12  0:17 ` [PATCH 04/23] r/o bind mounts: stub functions Dave Hansen
2007-07-12  0:17 ` [PATCH 05/23] elevate write count open()'d files Dave Hansen
2007-07-12  0:17 ` [PATCH 06/23] r/o bind mounts: elevate write count for some ioctls Dave Hansen
2007-07-12  0:17 ` [PATCH 07/23] elevate writer count for chown and friends Dave Hansen
2007-07-12  0:17 ` [PATCH 08/23] make access() use mnt check Dave Hansen
2007-07-12  0:17 ` [PATCH 09/23] elevate mnt writers for callers of vfs_mkdir() Dave Hansen
2007-07-12  0:17 ` [PATCH 10/23] elevate write count during entire ncp_ioctl() Dave Hansen
2007-07-12  0:17 ` [PATCH 11/23] elevate write count for link and symlink calls Dave Hansen
2007-07-12  0:17 ` [PATCH 12/23] elevate mount count for extended attributes Dave Hansen
2007-07-12  0:17 ` [PATCH 13/23] elevate write count for file_update_time() Dave Hansen
2007-07-12  0:17 ` [PATCH 14/23] mount_is_safe(): add comment Dave Hansen
2007-07-12  0:17 ` [PATCH 15/23] unix_find_other() elevate write count for touch_atime() Dave Hansen
2007-07-12  0:17 ` [PATCH 16/23] elevate write count over calls to vfs_rename() Dave Hansen
2007-07-12  0:17 ` [PATCH 17/23] nfs: check mnt instead of superblock directly Dave Hansen
2007-07-12  0:17 ` [PATCH 18/23] elevate writer count for do_sys_truncate() Dave Hansen
2007-07-12  0:17 ` [PATCH 19/23] elevate write count for do_utimes() Dave Hansen
2007-07-12  0:17 ` [PATCH 20/23] elevate write count for do_sys_utime() and touch_atime() Dave Hansen
2007-07-12  0:17 ` [PATCH 21/23] sys_mknodat(): elevate write count for vfs_mknod/create() Dave Hansen
2007-07-12  0:17 ` [PATCH 22/23] elevate mnt writers for vfs_unlink() callers Dave Hansen
2007-07-12  0:17 ` [PATCH 23/23] do_rmdir(): elevate write count Dave Hansen
