* [PATCH v5] fs: clear file privilege bits when mmap writing
@ 2015-12-10 22:33 Kees Cook
2016-01-07 19:36 ` Kees Cook
2016-01-08 0:30 ` Andy Lutomirski
0 siblings, 2 replies; 3+ messages in thread
From: Kees Cook @ 2015-12-10 22:33 UTC (permalink / raw)
To: Alexander Viro
Cc: Jan Kara, yalin wang, Willy Tarreau, Andrew Morton, linux-fsdevel,
linux-arch, linux-api
Normally, when a user can modify a file that has setuid or setgid bits,
those bits are cleared when they are not the file owner or a member
of the group. This is enforced when using write and truncate but not
when writing to a shared mmap on the file. This could allow the file
writer to gain privileges by changing a binary without losing the
setuid/setgid/caps bits.
Changing the bits requires holding inode->i_mutex, so it cannot be done
during the page fault (due to mmap_sem being held during the fault). We
could do this during vm_mmap_pgoff, but that would need coverage in
mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem
again. We could clear at open() time, but it's possible things are
accidentally opening with O_RDWR and only reading. Better to clear on
close and error failures (i.e. an improvement over now, which is not
clearing at all).
Instead, detect the need to clear the bits during the page fault, and
actually remove the bits during final fput. Since the file was open for
writing, it wouldn't have been possible to execute it yet.
Signed-off-by: Kees Cook <keescook@chromium.org>
---
v5:
- add to f_flags instead, viro
- add i_mutex during __fput, jack
v4:
- delay removal instead of still needing mmap_sem for mprotect, yalin
v3:
- move outside of mmap_sem for real now, fengguang
- check return code of file_remove_privs, akpm
v2:
- move to mmap from fault handler, jack
---
fs/file_table.c | 11 +++++++++++
fs/open.c | 2 +-
include/uapi/asm-generic/fcntl.h | 4 ++++
mm/memory.c | 5 +++++
4 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/fs/file_table.c b/fs/file_table.c
index ad17e05ebf95..4a8b0b4553e9 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -191,6 +191,17 @@ static void __fput(struct file *file)
might_sleep();
+ /*
+ * XXX: While avoiding mmap_sem, we've already been written to.
+ * We must ignore the return value, since we can't reject the
+ * write.
+ */
+ if (unlikely(file->f_flags & O_REMOVEPRIV)) {
+ mutex_lock(&inode->i_mutex);
+ file_remove_privs(file);
+ mutex_unlock(&inode->i_mutex);
+ }
+
fsnotify_close(file);
/*
* The function eventpoll_release() should be the first called
diff --git a/fs/open.c b/fs/open.c
index b6f1e96a7c0b..89069d16ca80 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -895,7 +895,7 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
op->mode = 0;
/* Must never be set by userspace */
- flags &= ~FMODE_NONOTIFY & ~O_CLOEXEC;
+ flags &= ~FMODE_NONOTIFY & ~O_CLOEXEC & ~O_REMOVEPRIV;
/*
* O_SYNC is implemented as __O_SYNC|O_DSYNC. As many places only
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index e063effe0cc1..096c4b3afe6a 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -88,6 +88,10 @@
#define __O_TMPFILE 020000000
#endif
+#ifndef O_REMOVEPRIV
+#define O_REMOVEPRIV 040000000
+#endif
+
/* a horrid kludge trying to make sure that this will fail on old kernels */
#define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
#define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT)
diff --git a/mm/memory.c b/mm/memory.c
index c387430f06c3..ad4188a8f279 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2036,6 +2036,11 @@ static inline int wp_page_reuse(struct mm_struct *mm,
if (!page_mkwrite)
file_update_time(vma->vm_file);
+ if (unlikely((vma->vm_file->f_flags & O_REMOVEPRIV) == 0)) {
+ spin_lock(&vma->vm_file->f_lock);
+ vma->vm_file->f_flags |= O_REMOVEPRIV;
+ spin_unlock(&vma->vm_file->f_lock);
+ }
}
return VM_FAULT_WRITE;
--
2.6.3
--
Kees Cook
Chrome OS & Brillo Security
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v5] fs: clear file privilege bits when mmap writing
2015-12-10 22:33 [PATCH v5] fs: clear file privilege bits when mmap writing Kees Cook
@ 2016-01-07 19:36 ` Kees Cook
2016-01-08 0:30 ` Andy Lutomirski
1 sibling, 0 replies; 3+ messages in thread
From: Kees Cook @ 2016-01-07 19:36 UTC (permalink / raw)
To: Alexander Viro
Cc: Jan Kara, yalin wang, Willy Tarreau, Andrew Morton,
linux-fsdevel@vger.kernel.org, linux-arch, Linux API
On Thu, Dec 10, 2015 at 2:33 PM, Kees Cook <keescook@chromium.org> wrote:
> Normally, when a user can modify a file that has setuid or setgid bits,
> those bits are cleared when they are not the file owner or a member
> of the group. This is enforced when using write and truncate but not
> when writing to a shared mmap on the file. This could allow the file
> writer to gain privileges by changing a binary without losing the
> setuid/setgid/caps bits.
>
> Changing the bits requires holding inode->i_mutex, so it cannot be done
> during the page fault (due to mmap_sem being held during the fault). We
> could do this during vm_mmap_pgoff, but that would need coverage in
> mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem
> again. We could clear at open() time, but it's possible things are
> accidentally opening with O_RDWR and only reading. Better to clear on
> close and error failures (i.e. an improvement over now, which is not
> clearing at all).
>
> Instead, detect the need to clear the bits during the page fault, and
> actually remove the bits during final fput. Since the file was open for
> writing, it wouldn't have been possible to execute it yet.
>
> Signed-off-by: Kees Cook <keescook@chromium.org>
Al, how does this look?
-Kees
> ---
> v5:
> - add to f_flags instead, viro
> - add i_mutex during __fput, jack
> v4:
> - delay removal instead of still needing mmap_sem for mprotect, yalin
> v3:
> - move outside of mmap_sem for real now, fengguang
> - check return code of file_remove_privs, akpm
> v2:
> - move to mmap from fault handler, jack
> ---
> fs/file_table.c | 11 +++++++++++
> fs/open.c | 2 +-
> include/uapi/asm-generic/fcntl.h | 4 ++++
> mm/memory.c | 5 +++++
> 4 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/fs/file_table.c b/fs/file_table.c
> index ad17e05ebf95..4a8b0b4553e9 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -191,6 +191,17 @@ static void __fput(struct file *file)
>
> might_sleep();
>
> + /*
> + * XXX: While avoiding mmap_sem, we've already been written to.
> + * We must ignore the return value, since we can't reject the
> + * write.
> + */
> + if (unlikely(file->f_flags & O_REMOVEPRIV)) {
> + mutex_lock(&inode->i_mutex);
> + file_remove_privs(file);
> + mutex_unlock(&inode->i_mutex);
> + }
> +
> fsnotify_close(file);
> /*
> * The function eventpoll_release() should be the first called
> diff --git a/fs/open.c b/fs/open.c
> index b6f1e96a7c0b..89069d16ca80 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -895,7 +895,7 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
> op->mode = 0;
>
> /* Must never be set by userspace */
> - flags &= ~FMODE_NONOTIFY & ~O_CLOEXEC;
> + flags &= ~FMODE_NONOTIFY & ~O_CLOEXEC & ~O_REMOVEPRIV;
>
> /*
> * O_SYNC is implemented as __O_SYNC|O_DSYNC. As many places only
> diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> index e063effe0cc1..096c4b3afe6a 100644
> --- a/include/uapi/asm-generic/fcntl.h
> +++ b/include/uapi/asm-generic/fcntl.h
> @@ -88,6 +88,10 @@
> #define __O_TMPFILE 020000000
> #endif
>
> +#ifndef O_REMOVEPRIV
> +#define O_REMOVEPRIV 040000000
> +#endif
> +
> /* a horrid kludge trying to make sure that this will fail on old kernels */
> #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
> #define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT)
> diff --git a/mm/memory.c b/mm/memory.c
> index c387430f06c3..ad4188a8f279 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2036,6 +2036,11 @@ static inline int wp_page_reuse(struct mm_struct *mm,
>
> if (!page_mkwrite)
> file_update_time(vma->vm_file);
> + if (unlikely((vma->vm_file->f_flags & O_REMOVEPRIV) == 0)) {
> + spin_lock(&vma->vm_file->f_lock);
> + vma->vm_file->f_flags |= O_REMOVEPRIV;
> + spin_unlock(&vma->vm_file->f_lock);
> + }
> }
>
> return VM_FAULT_WRITE;
> --
> 2.6.3
>
>
> --
> Kees Cook
> Chrome OS & Brillo Security
--
Kees Cook
Chrome OS & Brillo Security
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v5] fs: clear file privilege bits when mmap writing
2015-12-10 22:33 [PATCH v5] fs: clear file privilege bits when mmap writing Kees Cook
2016-01-07 19:36 ` Kees Cook
@ 2016-01-08 0:30 ` Andy Lutomirski
1 sibling, 0 replies; 3+ messages in thread
From: Andy Lutomirski @ 2016-01-08 0:30 UTC (permalink / raw)
To: Kees Cook
Cc: Alexander Viro, Jan Kara, yalin wang, Willy Tarreau,
Andrew Morton, Linux FS Devel, linux-arch, Linux API
On Thu, Dec 10, 2015 at 2:33 PM, Kees Cook <keescook@chromium.org> wrote:
> Instead, detect the need to clear the bits during the page fault, and
> actually remove the bits during final fput. Since the file was open for
> writing, it wouldn't have been possible to execute it yet.
This is cute but mysterious. Could you add a comment?
>
> + /*
> + * XXX: While avoiding mmap_sem, we've already been written to.
> + * We must ignore the return value, since we can't reject the
> + * write.
> + */
e.g. here?
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-01-08 0:30 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-10 22:33 [PATCH v5] fs: clear file privilege bits when mmap writing Kees Cook
2016-01-07 19:36 ` Kees Cook
2016-01-08 0:30 ` Andy Lutomirski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).