linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] fs: fix i_writecount on shmem and friends
@ 2014-03-03 15:16 David Herrmann
  2014-03-11 19:05 ` Linus Torvalds
  0 siblings, 1 reply; 11+ messages in thread
From: David Herrmann @ 2014-03-03 15:16 UTC
  To: linux-kernel
  Cc: linux-fsdevel, Linus Torvalds, Andrew Morton, David Herrmann,
	Al Viro, David Howells, Oleg Nesterov, stable

VM_DENYWRITE currently relies on i_writecount. As long as there is an active
writable reference to an inode, VM_DENYWRITE is not allowed.
Unfortunately, alloc_file() does not increase i_writecount and therefore
does not prevent a subsequent VM_DENYWRITE, even though the new file might
have been opened with FMODE_WRITE. However, callers of alloc_file() expect
the file object to be fully instantiated so they can call fput() on it. We
could now either fix all callers to do a get_write_access() if opened
with FMODE_WRITE, or simply fix alloc_file() to do that. I chose the
latter.
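
For reference, the counter semantics at play can be modelled in user space.
The sketch below is not kernel code: struct model_inode and the model_*
helpers are hypothetical stand-ins that mirror how get_write_access() and
deny_write_access() treat i_writecount (positive means active writers,
negative means write access is denied), with a plain int in place of the
atomic counter.

```c
#include <errno.h>

/* Hypothetical user-space stand-in for an inode's i_writecount. */
struct model_inode {
	int i_writecount;	/* > 0: active writers, < 0: writes denied */
};

/* Models get_write_access(): refused while writes are denied. */
static int model_get_write_access(struct model_inode *inode)
{
	if (inode->i_writecount < 0)
		return -ETXTBSY;
	inode->i_writecount++;
	return 0;
}

/* Models deny_write_access(): refused while a writer holds a reference. */
static int model_deny_write_access(struct model_inode *inode)
{
	if (inode->i_writecount > 0)
		return -ETXTBSY;
	inode->i_writecount--;
	return 0;
}

/* Writable file set up without get_write_access(), as alloc_file() does today. */
static int deny_after_unfixed_alloc(void)
{
	struct model_inode inode = { 0 };

	return model_deny_write_access(&inode);	/* wrongly succeeds */
}

/* Writable file set up with get_write_access(), as this patch does. */
static int deny_after_fixed_alloc(void)
{
	struct model_inode inode = { 0 };

	if (model_get_write_access(&inode))
		return -1;
	return model_deny_write_access(&inode);	/* correctly refused */
}
```

In this model, deny_after_unfixed_alloc() returns 0 (the deny goes through
despite the writable file), while deny_after_fixed_alloc() returns -ETXTBSY.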

Note that this bug allows some rather subtle misbehavior. The following
sequence of calls should work just fine, but currently fails:
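    int p[2], ro, rw;
    char buf[128];

    pipe(p);
    /* re-open the write end of the pipe read-only via procfs */
    sprintf(buf, "/proc/self/fd/%d", p[1]);
    ro = open(buf, O_RDONLY);
    close(p[1]);
    /* re-open the same inode read-write via the new descriptor */
    sprintf(buf, "/proc/self/fd/%d", ro);
    rw = open(buf, O_RDWR);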

The final open() cannot succeed as close(p[1]) caused an integer underflow
on i_writecount, effectively causing VM_DENYWRITE on the inode. The open
will fail with -ETXTBSY.
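
The underflow can be traced step by step with a small user-space model.
This is a sketch under assumptions, not kernel code: model_writecount and
the model_* functions are hypothetical names, and a plain int stands in for
the atomic i_writecount. It assumes, as described above, that pipe() creates
its write end via alloc_file(), that an open() for writing goes through
get_write_access(), and that releasing a writable file drops the count again.

```c
#include <errno.h>

/* Hypothetical stand-in for i_writecount of the pipe inode. */
static int model_writecount;

/* Models get_write_access(): refused while the count is negative. */
static int model_get_write_access(void)
{
	if (model_writecount < 0)
		return -ETXTBSY;
	model_writecount++;
	return 0;
}

/* Models put_write_access(): unconditional decrement on release. */
static void model_put_write_access(void)
{
	model_writecount--;
}

/*
 * Replays the call sequence. alloc_takes_write selects whether the
 * modelled alloc_file() takes write access (patched) or not (current).
 * Returns the result of the final open(..., O_RDWR).
 */
static int model_replay(int alloc_takes_write)
{
	model_writecount = 0;

	/* pipe(p): write end is created with FMODE_WRITE */
	if (alloc_takes_write && model_get_write_access())
		return -1;

	/* ro = open(..., O_RDONLY): read-only, count is untouched */

	/* close(p[1]): releasing the writable file drops the count */
	model_put_write_access();

	/* rw = open(..., O_RDWR): needs write access */
	return model_get_write_access();
}
```

With model_replay(0) the count goes from 0 to -1 on close and the final open
returns -ETXTBSY; with model_replay(1) it goes 1, 0, 1 and the open returns 0.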

It's a rather odd sequence of calls and given that open() doesn't use
alloc_file() (and is thus not affected by this bug), it's rather unlikely
that this is a serious issue. But stuff like anon_inode shares a *single*
inode across a huge set of interfaces. If any of these is broken like
pipe(), it will affect all of them (ranging from dma-buf to epoll).

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
---
 fs/file_table.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 5fff903..e3c8dd0 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -167,6 +167,7 @@ struct file *alloc_file(struct path *path, fmode_t mode,
 		const struct file_operations *fop)
 {
 	struct file *file;
+	int error;
 
 	file = get_empty_filp();
 	if (IS_ERR(file))
@@ -178,15 +179,23 @@ struct file *alloc_file(struct path *path, fmode_t mode,
 	file->f_mode = mode;
 	file->f_op = fop;
 
-	/*
-	 * These mounts don't really matter in practice
-	 * for r/o bind mounts.  They aren't userspace-
-	 * visible.  We do this for consistency, and so
-	 * that we can do debugging checks at __fput()
-	 */
-	if ((mode & FMODE_WRITE) && !special_file(path->dentry->d_inode->i_mode)) {
-		file_take_write(file);
-		WARN_ON(mnt_clone_write(path->mnt));
+	if (mode & FMODE_WRITE) {
+		error = get_write_access(path->dentry->d_inode);
+		if (error) {
+			put_filp(file);
+			return ERR_PTR(error);
+		}
+
+		/*
+		 * These mounts don't really matter in practice
+		 * for r/o bind mounts.  They aren't userspace-
+		 * visible.  We do this for consistency, and so
+		 * that we can do debugging checks at __fput()
+		 */
+		if (!special_file(path->dentry->d_inode->i_mode)) {
+			file_take_write(file);
+			WARN_ON(mnt_clone_write(path->mnt));
+		}
 	}
 	if ((mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
 		i_readcount_inc(path->dentry->d_inode);
-- 
1.9.0


Thread overview: 11+ messages
2014-03-03 15:16 [PATCH] fs: fix i_writecount on shmem and friends David Herrmann
2014-03-11 19:05 ` Linus Torvalds
2014-03-12 18:19   ` Al Viro
2014-03-12 22:30     ` David Herrmann
2014-03-13  0:37       ` Al Viro
2014-03-13 11:03         ` David Herrmann
2014-03-20 11:13         ` David Herrmann
2014-03-13  4:08     ` NeilBrown
2014-03-13  4:29       ` Al Viro
2014-03-13  5:55         ` NeilBrown
2014-03-14  4:51           ` Al Viro
