linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9] fs: clear file privilege bits when mmap writing
@ 2016-01-14 21:22 Kees Cook
  2016-01-15  5:55 ` Konstantin Khlebnikov
  0 siblings, 1 reply; 8+ messages in thread
From: Kees Cook @ 2016-01-14 21:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Konstantin Khlebnikov, Andy Lutomirski, Jan Kara,
	yalin wang, Willy Tarreau, linux-mm, linux-kernel

Normally, when a user can modify a file that has setuid or setgid bits,
those bits are cleared when they are not the file owner or a member
of the group. This is enforced when using write and truncate but not
when writing to a shared mmap on the file. This could allow the file
writer to gain privileges by changing a binary without losing the
setuid/setgid/caps bits.

Changing the bits requires holding inode->i_mutex, so it cannot be done
during the page fault (due to mmap_sem being held during the fault).
Instead, clear the bits if PROT_WRITE is being used at mmap open time,
or added at mprotect time.

Since we can't do the check in the right place inside mmap (due to
holding mmap_sem), we have to do it before holding mmap_sem, which
means duplicating some checks, which have to be available to the non-MMU
builds too.

When walking VMAs during mprotect, we need to drop mmap_sem (while
holding a file reference) and restart the walk after clearing privileges.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
v9:
- use file_needs_remove_privs, jack & koct9i
v8:
- use mmap/mprotect method, with mprotect walk restart, thanks to koct9i
v7:
- document and avoid arch-specific O_* values, viro
v6:
- clarify ETXTBSY situation in comments, luto
v5:
- add to f_flags instead, viro
- add i_mutex during __fput, jack
v4:
- delay removal instead of still needing mmap_sem for mprotect, yalin
v3:
- move outside of mmap_sem for real now, fengguang
- check return code of file_remove_privs, akpm
v2:
- move to mmap from fault handler, jack
---
 include/linux/mm.h |  1 +
 mm/mmap.c          | 20 ++++----------------
 mm/mprotect.c      | 24 ++++++++++++++++++++++++
 mm/util.c          | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 79 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 00bad7793788..b264c8be7114 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1912,6 +1912,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff);
+extern int do_mmap_shared_checks(struct file *file, unsigned long prot);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate);
diff --git a/mm/mmap.c b/mm/mmap.c
index 2ce04a649f6b..b3424db0a29e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1320,25 +1320,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		return -EAGAIN;
 
 	if (file) {
-		struct inode *inode = file_inode(file);
+		int err;
 
 		switch (flags & MAP_TYPE) {
 		case MAP_SHARED:
-			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
-				return -EACCES;
-
-			/*
-			 * Make sure we don't allow writing to an append-only
-			 * file..
-			 */
-			if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
-				return -EACCES;
-
-			/*
-			 * Make sure there are no mandatory locks on the file.
-			 */
-			if (locks_verify_locked(file))
-				return -EAGAIN;
+			err = do_mmap_shared_checks(file, prot);
+			if (err)
+				return err;
 
 			vm_flags |= VM_SHARED | VM_MAYSHARE;
 			if (!(file->f_mode & FMODE_WRITE))
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ef5be8eaab00..57cb81c11668 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -12,6 +12,7 @@
 #include <linux/hugetlb.h>
 #include <linux/shm.h>
 #include <linux/mman.h>
+#include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/highmem.h>
 #include <linux/security.h>
@@ -375,6 +376,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
 
 	vm_flags = calc_vm_prot_bits(prot);
 
+restart:
 	down_write(&current->mm->mmap_sem);
 
 	vma = find_vma(current->mm, start);
@@ -416,6 +418,28 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
 			goto out;
 		}
 
+		/*
+		 * If we're adding write permissions to a shared file,
+		 * we must clear privileges (like done at mmap time),
+		 * but we have to juggle the locks to avoid holding
+		 * mmap_sem while holding i_mutex.
+		 */
+		if ((vma->vm_flags & VM_SHARED) && vma->vm_file &&
+		    (newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE) &&
+		    file_needs_remove_privs(vma->vm_file)) {
+			struct file *file = get_file(vma->vm_file);
+
+			start = vma->vm_start;
+			up_write(&current->mm->mmap_sem);
+			mutex_lock(&file_inode(file)->i_mutex);
+			error = file_remove_privs(file);
+			mutex_unlock(&file_inode(file)->i_mutex);
+			fput(file);
+			if (error)
+				return error;
+			goto restart;
+		}
+
 		error = security_file_mprotect(vma, reqprot, prot);
 		if (error)
 			goto out;
diff --git a/mm/util.c b/mm/util.c
index 9af1c12b310c..1882eaf33a37 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -283,6 +283,29 @@ int __weak get_user_pages_fast(unsigned long start,
 }
 EXPORT_SYMBOL_GPL(get_user_pages_fast);
 
+int do_mmap_shared_checks(struct file *file, unsigned long prot)
+{
+	struct inode *inode = file_inode(file);
+
+	if ((prot & PROT_WRITE) && !(file->f_mode & FMODE_WRITE))
+		return -EACCES;
+
+	/*
+	 * Make sure we don't allow writing to an append-only
+	 * file..
+	 */
+	if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
+		return -EACCES;
+
+	/*
+	 * Make sure there are no mandatory locks on the file.
+	 */
+	if (locks_verify_locked(file))
+		return -EAGAIN;
+
+	return 0;
+}
+
 unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot,
 	unsigned long flag, unsigned long pgoff)
@@ -291,6 +314,33 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 	struct mm_struct *mm = current->mm;
 	unsigned long populate;
 
+	/*
+	 * If we must remove privs, we do it here since doing it during
+	 * page fault may be expensive and cannot hold inode->i_mutex,
+	 * since mm->mmap_sem is already held.
+	 */
+	if (file && (flag & MAP_TYPE) == MAP_SHARED && (prot & PROT_WRITE)) {
+		struct inode *inode = file_inode(file);
+		int err;
+
+		if (!IS_NOSEC(inode)) {
+			/*
+			 * Make sure we can't strip privs from a file that
+			 * wouldn't otherwise be allowed to be mmapped.
+			 */
+			err = do_mmap_shared_checks(file, prot);
+			if (err)
+				return err;
+
+			mutex_lock(&inode->i_mutex);
+			err = file_remove_privs(file);
+			mutex_unlock(&inode->i_mutex);
+
+			if (err)
+				return err;
+		}
+	}
+
 	ret = security_mmap_file(file, prot, flag);
 	if (!ret) {
 		down_write(&mm->mmap_sem);
-- 
2.6.3


-- 
Kees Cook
Chrome OS & Brillo Security

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v9] fs: clear file privilege bits when mmap writing
  2016-01-14 21:22 [PATCH v9] fs: clear file privilege bits when mmap writing Kees Cook
@ 2016-01-15  5:55 ` Konstantin Khlebnikov
  2016-01-15  6:18   ` Andy Lutomirski
  2016-05-25 21:36   ` Kees Cook
  0 siblings, 2 replies; 8+ messages in thread
From: Konstantin Khlebnikov @ 2016-01-15  5:55 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Morton, Alexander Viro, Andy Lutomirski, Jan Kara,
	yalin wang, Willy Tarreau, linux-mm@kvack.org,
	Linux Kernel Mailing List

On Fri, Jan 15, 2016 at 12:22 AM, Kees Cook <keescook@chromium.org> wrote:
> Normally, when a user can modify a file that has setuid or setgid bits,
> those bits are cleared when they are not the file owner or a member
> of the group. This is enforced when using write and truncate but not
> when writing to a shared mmap on the file. This could allow the file
> writer to gain privileges by changing a binary without losing the
> setuid/setgid/caps bits.
>
> Changing the bits requires holding inode->i_mutex, so it cannot be done
> during the page fault (due to mmap_sem being held during the fault).
> Instead, clear the bits if PROT_WRITE is being used at mmap open time,
> or added at mprotect tLooks good to me.ime.
>
> Since we can't do the check in the right place inside mmap (due to
> holding mmap_sem), we have to do it before holding mmap_sem, which
> means duplicating some checks, which have to be available to the non-MMU
> builds too.
>
> When walking VMAs during mprotect, we need to drop mmap_sem (while
> holding a file reference) and restart the walk after clearing privileges.
>
> Signed-off-by: Kees Cook <keescook@chromium.org>

Looks good. Ack.

> ---
> v9:
> - use file_needs_remove_privs, jack & koct9i
> v8:
> - use mmap/mprotect method, with mprotect walk restart, thanks to koct9i
> v7:
> - document and avoid arch-specific O_* values, viro
> v6:
> - clarify ETXTBSY situation in comments, luto
> v5:
> - add to f_flags instead, viro
> - add i_mutex during __fput, jack
> v4:
> - delay removal instead of still needing mmap_sem for mprotect, yalin
> v3:
> - move outside of mmap_sem for real now, fengguang
> - check return code of file_remove_privs, akpm
> v2:
> - move to mmap from fault handler, jack
> ---
>  include/linux/mm.h |  1 +
>  mm/mmap.c          | 20 ++++----------------
>  mm/mprotect.c      | 24 ++++++++++++++++++++++++
>  mm/util.c          | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 79 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 00bad7793788..b264c8be7114 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1912,6 +1912,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
>
>  extern unsigned long mmap_region(struct file *file, unsigned long addr,
>         unsigned long len, vm_flags_t vm_flags, unsigned long pgoff);
> +extern int do_mmap_shared_checks(struct file *file, unsigned long prot);
>  extern unsigned long do_mmap(struct file *file, unsigned long addr,
>         unsigned long len, unsigned long prot, unsigned long flags,
>         vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2ce04a649f6b..b3424db0a29e 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1320,25 +1320,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>                 return -EAGAIN;
>
>         if (file) {
> -               struct inode *inode = file_inode(file);
> +               int err;
>
>                 switch (flags & MAP_TYPE) {
>                 case MAP_SHARED:
> -                       if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
> -                               return -EACCES;
> -
> -                       /*
> -                        * Make sure we don't allow writing to an append-only
> -                        * file..
> -                        */
> -                       if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
> -                               return -EACCES;
> -
> -                       /*
> -                        * Make sure there are no mandatory locks on the file.
> -                        */
> -                       if (locks_verify_locked(file))
> -                               return -EAGAIN;
> +                       err = do_mmap_shared_checks(file, prot);
> +                       if (err)
> +                               return err;
>
>                         vm_flags |= VM_SHARED | VM_MAYSHARE;
>                         if (!(file->f_mode & FMODE_WRITE))
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index ef5be8eaab00..57cb81c11668 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -12,6 +12,7 @@
>  #include <linux/hugetlb.h>
>  #include <linux/shm.h>
>  #include <linux/mman.h>
> +#include <linux/file.h>
>  #include <linux/fs.h>
>  #include <linux/highmem.h>
>  #include <linux/security.h>
> @@ -375,6 +376,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>
>         vm_flags = calc_vm_prot_bits(prot);
>
> +restart:
>         down_write(&current->mm->mmap_sem);
>
>         vma = find_vma(current->mm, start);
> @@ -416,6 +418,28 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>                         goto out;
>                 }
>
> +               /*
> +                * If we're adding write permissions to a shared file,
> +                * we must clear privileges (like done at mmap time),
> +                * but we have to juggle the locks to avoid holding
> +                * mmap_sem while holding i_mutex.
> +                */
> +               if ((vma->vm_flags & VM_SHARED) && vma->vm_file &&
> +                   (newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE) &&
> +                   file_needs_remove_privs(vma->vm_file)) {
> +                       struct file *file = get_file(vma->vm_file);
> +
> +                       start = vma->vm_start;
> +                       up_write(&current->mm->mmap_sem);
> +                       mutex_lock(&file_inode(file)->i_mutex);
> +                       error = file_remove_privs(file);
> +                       mutex_unlock(&file_inode(file)->i_mutex);
> +                       fput(file);
> +                       if (error)
> +                               return error;
> +                       goto restart;
> +               }
> +
>                 error = security_file_mprotect(vma, reqprot, prot);
>                 if (error)
>                         goto out;
> diff --git a/mm/util.c b/mm/util.c
> index 9af1c12b310c..1882eaf33a37 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -283,6 +283,29 @@ int __weak get_user_pages_fast(unsigned long start,
>  }
>  EXPORT_SYMBOL_GPL(get_user_pages_fast);
>
> +int do_mmap_shared_checks(struct file *file, unsigned long prot)
> +{
> +       struct inode *inode = file_inode(file);
> +
> +       if ((prot & PROT_WRITE) && !(file->f_mode & FMODE_WRITE))
> +               return -EACCES;
> +
> +       /*
> +        * Make sure we don't allow writing to an append-only
> +        * file..
> +        */
> +       if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
> +               return -EACCES;
> +
> +       /*
> +        * Make sure there are no mandatory locks on the file.
> +        */
> +       if (locks_verify_locked(file))
> +               return -EAGAIN;
> +
> +       return 0;
> +}
> +
>  unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
>         unsigned long len, unsigned long prot,
>         unsigned long flag, unsigned long pgoff)
> @@ -291,6 +314,33 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
>         struct mm_struct *mm = current->mm;
>         unsigned long populate;
>
> +       /*
> +        * If we must remove privs, we do it here since doing it during
> +        * page fault may be expensive and cannot hold inode->i_mutex,
> +        * since mm->mmap_sem is already held.
> +        */
> +       if (file && (flag & MAP_TYPE) == MAP_SHARED && (prot & PROT_WRITE)) {
> +               struct inode *inode = file_inode(file);
> +               int err;
> +
> +               if (!IS_NOSEC(inode)) {
> +                       /*
> +                        * Make sure we can't strip privs from a file that
> +                        * wouldn't otherwise be allowed to be mmapped.
> +                        */
> +                       err = do_mmap_shared_checks(file, prot);
> +                       if (err)
> +                               return err;
> +
> +                       mutex_lock(&inode->i_mutex);
> +                       err = file_remove_privs(file);
> +                       mutex_unlock(&inode->i_mutex);
> +
> +                       if (err)
> +                               return err;
> +               }
> +       }
> +
>         ret = security_mmap_file(file, prot, flag);
>         if (!ret) {
>                 down_write(&mm->mmap_sem);
> --
> 2.6.3
>
>
> --
> Kees Cook
> Chrome OS & Brillo Security

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v9] fs: clear file privilege bits when mmap writing
  2016-01-15  5:55 ` Konstantin Khlebnikov
@ 2016-01-15  6:18   ` Andy Lutomirski
  2016-01-15  6:36     ` Konstantin Khlebnikov
  2016-05-25 21:36   ` Kees Cook
  1 sibling, 1 reply; 8+ messages in thread
From: Andy Lutomirski @ 2016-01-15  6:18 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Kees Cook, Andrew Morton, Alexander Viro, Jan Kara, yalin wang,
	Willy Tarreau, linux-mm@kvack.org, Linux Kernel Mailing List

On Thu, Jan 14, 2016 at 9:55 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
> On Fri, Jan 15, 2016 at 12:22 AM, Kees Cook <keescook@chromium.org> wrote:
>> Normally, when a user can modify a file that has setuid or setgid bits,
>> those bits are cleared when they are not the file owner or a member
>> of the group. This is enforced when using write and truncate but not
>> when writing to a shared mmap on the file. This could allow the file
>> writer to gain privileges by changing a binary without losing the
>> setuid/setgid/caps bits.
>>
>> Changing the bits requires holding inode->i_mutex, so it cannot be done
>> during the page fault (due to mmap_sem being held during the fault).
>> Instead, clear the bits if PROT_WRITE is being used at mmap open time,
>> or added at mprotect tLooks good to me.ime.
>>
>> Since we can't do the check in the right place inside mmap (due to
>> holding mmap_sem), we have to do it before holding mmap_sem, which
>> means duplicating some checks, which have to be available to the non-MMU
>> builds too.
>>
>> When walking VMAs during mprotect, we need to drop mmap_sem (while
>> holding a file reference) and restart the walk after clearing privileges.
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>
> Looks good. Ack.

While we're at it:

int should_remove_suid(struct dentry *dentry)
{
        umode_t mode = d_inode(dentry)->i_mode;
        int kill = 0;

        /* suid always must be killed */
        if (unlikely(mode & S_ISUID))
                kill = ATTR_KILL_SUID;

        /*
         * sgid without any exec bits is just a mandatory locking mark; leave
         * it alone.  If some exec bits are set, it's a real sgid; kill it.
         */
        if (unlikely((mode & S_ISGID) && (mode & S_IXGRP)))
                kill |= ATTR_KILL_SGID;

        if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode)))
                return kill;

        return 0;
}
EXPORT_SYMBOL(should_remove_suid);

Oh wait, is that an implicit use of current_cred in vfs_write?  No, it
couldn't be.  Kernel developers *never* make that mistake.

This is, of course, totally fucked because this function doesn't have
access to a struct file and therefore can't see f_cred.  I'm not going
to look in to this right now, but I swear I saw an exploit that took
advantage of this bug recently.  Anyone want to try to fix it?

FWIW, posix says (man 3p write):

       Upon  successful  completion,  where  nbyte  is greater than 0, write()
       shall mark for update the last data modification and last  file  status
       change  timestamps  of the file, and if the file is a regular file, the
       S_ISUID and S_ISGID bits of the file mode may be cleared.

so maybe the thing to do is just drop the capable check entirely and
cross our fingers that nothing was relying on it.

--Andy

>
>> ---
>> v9:
>> - use file_needs_remove_privs, jack & koct9i
>> v8:
>> - use mmap/mprotect method, with mprotect walk restart, thanks to koct9i
>> v7:
>> - document and avoid arch-specific O_* values, viro
>> v6:
>> - clarify ETXTBSY situation in comments, luto
>> v5:
>> - add to f_flags instead, viro
>> - add i_mutex during __fput, jack
>> v4:
>> - delay removal instead of still needing mmap_sem for mprotect, yalin
>> v3:
>> - move outside of mmap_sem for real now, fengguang
>> - check return code of file_remove_privs, akpm
>> v2:
>> - move to mmap from fault handler, jack
>> ---
>>  include/linux/mm.h |  1 +
>>  mm/mmap.c          | 20 ++++----------------
>>  mm/mprotect.c      | 24 ++++++++++++++++++++++++
>>  mm/util.c          | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 79 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 00bad7793788..b264c8be7114 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1912,6 +1912,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
>>
>>  extern unsigned long mmap_region(struct file *file, unsigned long addr,
>>         unsigned long len, vm_flags_t vm_flags, unsigned long pgoff);
>> +extern int do_mmap_shared_checks(struct file *file, unsigned long prot);
>>  extern unsigned long do_mmap(struct file *file, unsigned long addr,
>>         unsigned long len, unsigned long prot, unsigned long flags,
>>         vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate);
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 2ce04a649f6b..b3424db0a29e 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1320,25 +1320,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>>                 return -EAGAIN;
>>
>>         if (file) {
>> -               struct inode *inode = file_inode(file);
>> +               int err;
>>
>>                 switch (flags & MAP_TYPE) {
>>                 case MAP_SHARED:
>> -                       if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
>> -                               return -EACCES;
>> -
>> -                       /*
>> -                        * Make sure we don't allow writing to an append-only
>> -                        * file..
>> -                        */
>> -                       if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
>> -                               return -EACCES;
>> -
>> -                       /*
>> -                        * Make sure there are no mandatory locks on the file.
>> -                        */
>> -                       if (locks_verify_locked(file))
>> -                               return -EAGAIN;
>> +                       err = do_mmap_shared_checks(file, prot);
>> +                       if (err)
>> +                               return err;
>>
>>                         vm_flags |= VM_SHARED | VM_MAYSHARE;
>>                         if (!(file->f_mode & FMODE_WRITE))
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index ef5be8eaab00..57cb81c11668 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -12,6 +12,7 @@
>>  #include <linux/hugetlb.h>
>>  #include <linux/shm.h>
>>  #include <linux/mman.h>
>> +#include <linux/file.h>
>>  #include <linux/fs.h>
>>  #include <linux/highmem.h>
>>  #include <linux/security.h>
>> @@ -375,6 +376,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>>
>>         vm_flags = calc_vm_prot_bits(prot);
>>
>> +restart:
>>         down_write(&current->mm->mmap_sem);
>>
>>         vma = find_vma(current->mm, start);
>> @@ -416,6 +418,28 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>>                         goto out;
>>                 }
>>
>> +               /*
>> +                * If we're adding write permissions to a shared file,
>> +                * we must clear privileges (like done at mmap time),
>> +                * but we have to juggle the locks to avoid holding
>> +                * mmap_sem while holding i_mutex.
>> +                */
>> +               if ((vma->vm_flags & VM_SHARED) && vma->vm_file &&
>> +                   (newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE) &&
>> +                   file_needs_remove_privs(vma->vm_file)) {
>> +                       struct file *file = get_file(vma->vm_file);
>> +
>> +                       start = vma->vm_start;
>> +                       up_write(&current->mm->mmap_sem);
>> +                       mutex_lock(&file_inode(file)->i_mutex);
>> +                       error = file_remove_privs(file);
>> +                       mutex_unlock(&file_inode(file)->i_mutex);
>> +                       fput(file);
>> +                       if (error)
>> +                               return error;
>> +                       goto restart;
>> +               }
>> +
>>                 error = security_file_mprotect(vma, reqprot, prot);
>>                 if (error)
>>                         goto out;
>> diff --git a/mm/util.c b/mm/util.c
>> index 9af1c12b310c..1882eaf33a37 100644
>> --- a/mm/util.c
>> +++ b/mm/util.c
>> @@ -283,6 +283,29 @@ int __weak get_user_pages_fast(unsigned long start,
>>  }
>>  EXPORT_SYMBOL_GPL(get_user_pages_fast);
>>
>> +int do_mmap_shared_checks(struct file *file, unsigned long prot)
>> +{
>> +       struct inode *inode = file_inode(file);
>> +
>> +       if ((prot & PROT_WRITE) && !(file->f_mode & FMODE_WRITE))
>> +               return -EACCES;
>> +
>> +       /*
>> +        * Make sure we don't allow writing to an append-only
>> +        * file..
>> +        */
>> +       if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
>> +               return -EACCES;
>> +
>> +       /*
>> +        * Make sure there are no mandatory locks on the file.
>> +        */
>> +       if (locks_verify_locked(file))
>> +               return -EAGAIN;
>> +
>> +       return 0;
>> +}
>> +
>>  unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
>>         unsigned long len, unsigned long prot,
>>         unsigned long flag, unsigned long pgoff)
>> @@ -291,6 +314,33 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
>>         struct mm_struct *mm = current->mm;
>>         unsigned long populate;
>>
>> +       /*
>> +        * If we must remove privs, we do it here since doing it during
>> +        * page fault may be expensive and cannot hold inode->i_mutex,
>> +        * since mm->mmap_sem is already held.
>> +        */
>> +       if (file && (flag & MAP_TYPE) == MAP_SHARED && (prot & PROT_WRITE)) {
>> +               struct inode *inode = file_inode(file);
>> +               int err;
>> +
>> +               if (!IS_NOSEC(inode)) {
>> +                       /*
>> +                        * Make sure we can't strip privs from a file that
>> +                        * wouldn't otherwise be allowed to be mmapped.
>> +                        */
>> +                       err = do_mmap_shared_checks(file, prot);
>> +                       if (err)
>> +                               return err;
>> +
>> +                       mutex_lock(&inode->i_mutex);
>> +                       err = file_remove_privs(file);
>> +                       mutex_unlock(&inode->i_mutex);
>> +
>> +                       if (err)
>> +                               return err;
>> +               }
>> +       }
>> +
>>         ret = security_mmap_file(file, prot, flag);
>>         if (!ret) {
>>                 down_write(&mm->mmap_sem);
>> --
>> 2.6.3
>>
>>
>> --
>> Kees Cook
>> Chrome OS & Brillo Security



-- 
Andy Lutomirski
AMA Capital Management, LLC

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v9] fs: clear file privilege bits when mmap writing
  2016-01-15  6:18   ` Andy Lutomirski
@ 2016-01-15  6:36     ` Konstantin Khlebnikov
  0 siblings, 0 replies; 8+ messages in thread
From: Konstantin Khlebnikov @ 2016-01-15  6:36 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Kees Cook, Andrew Morton, Alexander Viro, Jan Kara, yalin wang,
	Willy Tarreau, linux-mm@kvack.org, Linux Kernel Mailing List

On Fri, Jan 15, 2016 at 9:18 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Jan 14, 2016 at 9:55 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>> On Fri, Jan 15, 2016 at 12:22 AM, Kees Cook <keescook@chromium.org> wrote:
>>> Normally, when a user can modify a file that has setuid or setgid bits,
>>> those bits are cleared when they are not the file owner or a member
>>> of the group. This is enforced when using write and truncate but not
>>> when writing to a shared mmap on the file. This could allow the file
>>> writer to gain privileges by changing a binary without losing the
>>> setuid/setgid/caps bits.
>>>
>>> Changing the bits requires holding inode->i_mutex, so it cannot be done
>>> during the page fault (due to mmap_sem being held during the fault).
>>> Instead, clear the bits if PROT_WRITE is being used at mmap open time,
>>> or added at mprotect tLooks good to me.ime.
>>>
>>> Since we can't do the check in the right place inside mmap (due to
>>> holding mmap_sem), we have to do it before holding mmap_sem, which
>>> means duplicating some checks, which have to be available to the non-MMU
>>> builds too.
>>>
>>> When walking VMAs during mprotect, we need to drop mmap_sem (while
>>> holding a file reference) and restart the walk after clearing privileges.
>>>
>>> Signed-off-by: Kees Cook <keescook@chromium.org>
>>
>> Looks good. Ack.
>
> While we're at it:
>
> int should_remove_suid(struct dentry *dentry)
> {
>         umode_t mode = d_inode(dentry)->i_mode;
>         int kill = 0;
>
>         /* suid always must be killed */
>         if (unlikely(mode & S_ISUID))
>                 kill = ATTR_KILL_SUID;
>
>         /*
>          * sgid without any exec bits is just a mandatory locking mark; leave
>          * it alone.  If some exec bits are set, it's a real sgid; kill it.
>          */
>         if (unlikely((mode & S_ISGID) && (mode & S_IXGRP)))
>                 kill |= ATTR_KILL_SGID;
>
>         if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode)))
>                 return kill;
>
>         return 0;
> }
> EXPORT_SYMBOL(should_remove_suid);
>
> Oh wait, is that an implicit use of current_cred in vfs_write?  No, it
> couldn't be.  Kernel developers *never* make that mistake.
>
> This is, of course, totally fucked because this function doesn't have
> access to a struct file and therefore can't see f_cred.  I'm not going
> to look in to this right now, but I swear I saw an exploit that took
> advantage of this bug recently.  Anyone want to try to fix it?

Good point. it's here since 2.3.43.
As I see file->f_cred is reachable in all places.

>
> FWIW, posix says (man 3p write):
>
>        Upon  successful  completion,  where  nbyte  is greater than 0, write()
>        shall mark for update the last data modification and last  file  status
>        change  timestamps  of the file, and if the file is a regular file, the
>        S_ISUID and S_ISGID bits of the file mode may be cleared.
>
> so maybe the thing to do is just drop the capable check entirely and
> cross our fingers that nothing was relying on it.
>
> --Andy
>
>>
>>> ---
>>> v9:
>>> - use file_needs_remove_privs, jack & koct9i
>>> v8:
>>> - use mmap/mprotect method, with mprotect walk restart, thanks to koct9i
>>> v7:
>>> - document and avoid arch-specific O_* values, viro
>>> v6:
>>> - clarify ETXTBSY situation in comments, luto
>>> v5:
>>> - add to f_flags instead, viro
>>> - add i_mutex during __fput, jack
>>> v4:
>>> - delay removal instead of still needing mmap_sem for mprotect, yalin
>>> v3:
>>> - move outside of mmap_sem for real now, fengguang
>>> - check return code of file_remove_privs, akpm
>>> v2:
>>> - move to mmap from fault handler, jack
>>> ---
>>>  include/linux/mm.h |  1 +
>>>  mm/mmap.c          | 20 ++++----------------
>>>  mm/mprotect.c      | 24 ++++++++++++++++++++++++
>>>  mm/util.c          | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  4 files changed, 79 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 00bad7793788..b264c8be7114 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -1912,6 +1912,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
>>>
>>>  extern unsigned long mmap_region(struct file *file, unsigned long addr,
>>>         unsigned long len, vm_flags_t vm_flags, unsigned long pgoff);
>>> +extern int do_mmap_shared_checks(struct file *file, unsigned long prot);
>>>  extern unsigned long do_mmap(struct file *file, unsigned long addr,
>>>         unsigned long len, unsigned long prot, unsigned long flags,
>>>         vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate);
>>> diff --git a/mm/mmap.c b/mm/mmap.c
>>> index 2ce04a649f6b..b3424db0a29e 100644
>>> --- a/mm/mmap.c
>>> +++ b/mm/mmap.c
>>> @@ -1320,25 +1320,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>>>                 return -EAGAIN;
>>>
>>>         if (file) {
>>> -               struct inode *inode = file_inode(file);
>>> +               int err;
>>>
>>>                 switch (flags & MAP_TYPE) {
>>>                 case MAP_SHARED:
>>> -                       if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
>>> -                               return -EACCES;
>>> -
>>> -                       /*
>>> -                        * Make sure we don't allow writing to an append-only
>>> -                        * file..
>>> -                        */
>>> -                       if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
>>> -                               return -EACCES;
>>> -
>>> -                       /*
>>> -                        * Make sure there are no mandatory locks on the file.
>>> -                        */
>>> -                       if (locks_verify_locked(file))
>>> -                               return -EAGAIN;
>>> +                       err = do_mmap_shared_checks(file, prot);
>>> +                       if (err)
>>> +                               return err;
>>>
>>>                         vm_flags |= VM_SHARED | VM_MAYSHARE;
>>>                         if (!(file->f_mode & FMODE_WRITE))
>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>> index ef5be8eaab00..57cb81c11668 100644
>>> --- a/mm/mprotect.c
>>> +++ b/mm/mprotect.c
>>> @@ -12,6 +12,7 @@
>>>  #include <linux/hugetlb.h>
>>>  #include <linux/shm.h>
>>>  #include <linux/mman.h>
>>> +#include <linux/file.h>
>>>  #include <linux/fs.h>
>>>  #include <linux/highmem.h>
>>>  #include <linux/security.h>
>>> @@ -375,6 +376,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>>>
>>>         vm_flags = calc_vm_prot_bits(prot);
>>>
>>> +restart:
>>>         down_write(&current->mm->mmap_sem);
>>>
>>>         vma = find_vma(current->mm, start);
>>> @@ -416,6 +418,28 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>>>                         goto out;
>>>                 }
>>>
>>> +               /*
>>> +                * If we're adding write permissions to a shared file,
>>> +                * we must clear privileges (like done at mmap time),
>>> +                * but we have to juggle the locks to avoid holding
>>> +                * mmap_sem while holding i_mutex.
>>> +                */
>>> +               if ((vma->vm_flags & VM_SHARED) && vma->vm_file &&
>>> +                   (newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE) &&
>>> +                   file_needs_remove_privs(vma->vm_file)) {
>>> +                       struct file *file = get_file(vma->vm_file);
>>> +
>>> +                       start = vma->vm_start;
>>> +                       up_write(&current->mm->mmap_sem);
>>> +                       mutex_lock(&file_inode(file)->i_mutex);
>>> +                       error = file_remove_privs(file);
>>> +                       mutex_unlock(&file_inode(file)->i_mutex);
>>> +                       fput(file);
>>> +                       if (error)
>>> +                               return error;
>>> +                       goto restart;
>>> +               }
>>> +
>>>                 error = security_file_mprotect(vma, reqprot, prot);
>>>                 if (error)
>>>                         goto out;
>>> diff --git a/mm/util.c b/mm/util.c
>>> index 9af1c12b310c..1882eaf33a37 100644
>>> --- a/mm/util.c
>>> +++ b/mm/util.c
>>> @@ -283,6 +283,29 @@ int __weak get_user_pages_fast(unsigned long start,
>>>  }
>>>  EXPORT_SYMBOL_GPL(get_user_pages_fast);
>>>
>>> +int do_mmap_shared_checks(struct file *file, unsigned long prot)
>>> +{
>>> +       struct inode *inode = file_inode(file);
>>> +
>>> +       if ((prot & PROT_WRITE) && !(file->f_mode & FMODE_WRITE))
>>> +               return -EACCES;
>>> +
>>> +       /*
>>> +        * Make sure we don't allow writing to an append-only
>>> +        * file..
>>> +        */
>>> +       if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
>>> +               return -EACCES;
>>> +
>>> +       /*
>>> +        * Make sure there are no mandatory locks on the file.
>>> +        */
>>> +       if (locks_verify_locked(file))
>>> +               return -EAGAIN;
>>> +
>>> +       return 0;
>>> +}
>>> +
>>>  unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
>>>         unsigned long len, unsigned long prot,
>>>         unsigned long flag, unsigned long pgoff)
>>> @@ -291,6 +314,33 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
>>>         struct mm_struct *mm = current->mm;
>>>         unsigned long populate;
>>>
>>> +       /*
>>> +        * If we must remove privs, we do it here since doing it during
>>> +        * page fault may be expensive and cannot hold inode->i_mutex,
>>> +        * since mm->mmap_sem is already held.
>>> +        */
>>> +       if (file && (flag & MAP_TYPE) == MAP_SHARED && (prot & PROT_WRITE)) {
>>> +               struct inode *inode = file_inode(file);
>>> +               int err;
>>> +
>>> +               if (!IS_NOSEC(inode)) {
>>> +                       /*
>>> +                        * Make sure we can't strip privs from a file that
>>> +                        * wouldn't otherwise be allowed to be mmapped.
>>> +                        */
>>> +                       err = do_mmap_shared_checks(file, prot);
>>> +                       if (err)
>>> +                               return err;
>>> +
>>> +                       mutex_lock(&inode->i_mutex);
>>> +                       err = file_remove_privs(file);
>>> +                       mutex_unlock(&inode->i_mutex);
>>> +
>>> +                       if (err)
>>> +                               return err;
>>> +               }
>>> +       }
>>> +
>>>         ret = security_mmap_file(file, prot, flag);
>>>         if (!ret) {
>>>                 down_write(&mm->mmap_sem);
>>> --
>>> 2.6.3
>>>
>>>
>>> --
>>> Kees Cook
>>> Chrome OS & Brillo Security
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v9] fs: clear file privilege bits when mmap writing
  2016-01-15  5:55 ` Konstantin Khlebnikov
  2016-01-15  6:18   ` Andy Lutomirski
@ 2016-05-25 21:36   ` Kees Cook
  2016-05-25 21:49     ` Al Viro
  1 sibling, 1 reply; 8+ messages in thread
From: Kees Cook @ 2016-05-25 21:36 UTC (permalink / raw)
  To: Alexander Viro, Andrew Morton
  Cc: Konstantin Khlebnikov, Andy Lutomirski, Jan Kara, yalin wang,
	Willy Tarreau, linux-mm@kvack.org, Linux Kernel Mailing List

On Thu, Jan 14, 2016 at 9:55 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
> On Fri, Jan 15, 2016 at 12:22 AM, Kees Cook <keescook@chromium.org> wrote:
>> Normally, when a user can modify a file that has setuid or setgid bits,
>> those bits are cleared when they are not the file owner or a member
>> of the group. This is enforced when using write and truncate but not
>> when writing to a shared mmap on the file. This could allow the file
>> writer to gain privileges by changing a binary without losing the
>> setuid/setgid/caps bits.
>>
>> Changing the bits requires holding inode->i_mutex, so it cannot be done
>> during the page fault (due to mmap_sem being held during the fault).
>> Instead, clear the bits if PROT_WRITE is being used at mmap open time,
>> or added at mprotect tLooks good to me.ime.
>>
>> Since we can't do the check in the right place inside mmap (due to
>> holding mmap_sem), we have to do it before holding mmap_sem, which
>> means duplicating some checks, which have to be available to the non-MMU
>> builds too.
>>
>> When walking VMAs during mprotect, we need to drop mmap_sem (while
>> holding a file reference) and restart the walk after clearing privileges.
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>
> Looks good. Ack.

Hm, this didn't end up getting picked up. (This jumped out at me again
because i_mutex just vanished...)

Al, what's the right way to update the locking in this patch?

-Kees

>
>> ---
>> v9:
>> - use file_needs_remove_privs, jack & koct9i
>> v8:
>> - use mmap/mprotect method, with mprotect walk restart, thanks to koct9i
>> v7:
>> - document and avoid arch-specific O_* values, viro
>> v6:
>> - clarify ETXTBSY situation in comments, luto
>> v5:
>> - add to f_flags instead, viro
>> - add i_mutex during __fput, jack
>> v4:
>> - delay removal instead of still needing mmap_sem for mprotect, yalin
>> v3:
>> - move outside of mmap_sem for real now, fengguang
>> - check return code of file_remove_privs, akpm
>> v2:
>> - move to mmap from fault handler, jack
>> ---
>>  include/linux/mm.h |  1 +
>>  mm/mmap.c          | 20 ++++----------------
>>  mm/mprotect.c      | 24 ++++++++++++++++++++++++
>>  mm/util.c          | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 79 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 00bad7793788..b264c8be7114 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1912,6 +1912,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
>>
>>  extern unsigned long mmap_region(struct file *file, unsigned long addr,
>>         unsigned long len, vm_flags_t vm_flags, unsigned long pgoff);
>> +extern int do_mmap_shared_checks(struct file *file, unsigned long prot);
>>  extern unsigned long do_mmap(struct file *file, unsigned long addr,
>>         unsigned long len, unsigned long prot, unsigned long flags,
>>         vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate);
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 2ce04a649f6b..b3424db0a29e 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1320,25 +1320,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>>                 return -EAGAIN;
>>
>>         if (file) {
>> -               struct inode *inode = file_inode(file);
>> +               int err;
>>
>>                 switch (flags & MAP_TYPE) {
>>                 case MAP_SHARED:
>> -                       if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
>> -                               return -EACCES;
>> -
>> -                       /*
>> -                        * Make sure we don't allow writing to an append-only
>> -                        * file..
>> -                        */
>> -                       if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
>> -                               return -EACCES;
>> -
>> -                       /*
>> -                        * Make sure there are no mandatory locks on the file.
>> -                        */
>> -                       if (locks_verify_locked(file))
>> -                               return -EAGAIN;
>> +                       err = do_mmap_shared_checks(file, prot);
>> +                       if (err)
>> +                               return err;
>>
>>                         vm_flags |= VM_SHARED | VM_MAYSHARE;
>>                         if (!(file->f_mode & FMODE_WRITE))
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index ef5be8eaab00..57cb81c11668 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -12,6 +12,7 @@
>>  #include <linux/hugetlb.h>
>>  #include <linux/shm.h>
>>  #include <linux/mman.h>
>> +#include <linux/file.h>
>>  #include <linux/fs.h>
>>  #include <linux/highmem.h>
>>  #include <linux/security.h>
>> @@ -375,6 +376,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>>
>>         vm_flags = calc_vm_prot_bits(prot);
>>
>> +restart:
>>         down_write(&current->mm->mmap_sem);
>>
>>         vma = find_vma(current->mm, start);
>> @@ -416,6 +418,28 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>>                         goto out;
>>                 }
>>
>> +               /*
>> +                * If we're adding write permissions to a shared file,
>> +                * we must clear privileges (like done at mmap time),
>> +                * but we have to juggle the locks to avoid holding
>> +                * mmap_sem while holding i_mutex.
>> +                */
>> +               if ((vma->vm_flags & VM_SHARED) && vma->vm_file &&
>> +                   (newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE) &&
>> +                   file_needs_remove_privs(vma->vm_file)) {
>> +                       struct file *file = get_file(vma->vm_file);
>> +
>> +                       start = vma->vm_start;
>> +                       up_write(&current->mm->mmap_sem);
>> +                       mutex_lock(&file_inode(file)->i_mutex);
>> +                       error = file_remove_privs(file);
>> +                       mutex_unlock(&file_inode(file)->i_mutex);
>> +                       fput(file);
>> +                       if (error)
>> +                               return error;
>> +                       goto restart;
>> +               }
>> +
>>                 error = security_file_mprotect(vma, reqprot, prot);
>>                 if (error)
>>                         goto out;
>> diff --git a/mm/util.c b/mm/util.c
>> index 9af1c12b310c..1882eaf33a37 100644
>> --- a/mm/util.c
>> +++ b/mm/util.c
>> @@ -283,6 +283,29 @@ int __weak get_user_pages_fast(unsigned long start,
>>  }
>>  EXPORT_SYMBOL_GPL(get_user_pages_fast);
>>
>> +int do_mmap_shared_checks(struct file *file, unsigned long prot)
>> +{
>> +       struct inode *inode = file_inode(file);
>> +
>> +       if ((prot & PROT_WRITE) && !(file->f_mode & FMODE_WRITE))
>> +               return -EACCES;
>> +
>> +       /*
>> +        * Make sure we don't allow writing to an append-only
>> +        * file..
>> +        */
>> +       if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
>> +               return -EACCES;
>> +
>> +       /*
>> +        * Make sure there are no mandatory locks on the file.
>> +        */
>> +       if (locks_verify_locked(file))
>> +               return -EAGAIN;
>> +
>> +       return 0;
>> +}
>> +
>>  unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
>>         unsigned long len, unsigned long prot,
>>         unsigned long flag, unsigned long pgoff)
>> @@ -291,6 +314,33 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
>>         struct mm_struct *mm = current->mm;
>>         unsigned long populate;
>>
>> +       /*
>> +        * If we must remove privs, we do it here since doing it during
>> +        * page fault may be expensive and cannot hold inode->i_mutex,
>> +        * since mm->mmap_sem is already held.
>> +        */
>> +       if (file && (flag & MAP_TYPE) == MAP_SHARED && (prot & PROT_WRITE)) {
>> +               struct inode *inode = file_inode(file);
>> +               int err;
>> +
>> +               if (!IS_NOSEC(inode)) {
>> +                       /*
>> +                        * Make sure we can't strip privs from a file that
>> +                        * wouldn't otherwise be allowed to be mmapped.
>> +                        */
>> +                       err = do_mmap_shared_checks(file, prot);
>> +                       if (err)
>> +                               return err;
>> +
>> +                       mutex_lock(&inode->i_mutex);
>> +                       err = file_remove_privs(file);
>> +                       mutex_unlock(&inode->i_mutex);
>> +
>> +                       if (err)
>> +                               return err;
>> +               }
>> +       }
>> +
>>         ret = security_mmap_file(file, prot, flag);
>>         if (!ret) {
>>                 down_write(&mm->mmap_sem);
>> --
>> 2.6.3
>>
>>
>> --
>> Kees Cook
>> Chrome OS & Brillo Security



-- 
Kees Cook
Chrome OS & Brillo Security

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v9] fs: clear file privilege bits when mmap writing
  2016-05-25 21:36   ` Kees Cook
@ 2016-05-25 21:49     ` Al Viro
  2017-01-28  3:47       ` Andy Lutomirski
  0 siblings, 1 reply; 8+ messages in thread
From: Al Viro @ 2016-05-25 21:49 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Morton, Konstantin Khlebnikov, Andy Lutomirski, Jan Kara,
	yalin wang, Willy Tarreau, linux-mm@kvack.org,
	Linux Kernel Mailing List

On Wed, May 25, 2016 at 02:36:57PM -0700, Kees Cook wrote:

> Hm, this didn't end up getting picked up. (This jumped out at me again
> because i_mutex just vanished...)
> 
> Al, what's the right way to update the locking in this patch?

->i_mutex is dealt with just by using lock_inode(inode)/unlock_inode(inode);
I hadn't looked at the rest of the locking in there.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v9] fs: clear file privilege bits when mmap writing
  2016-05-25 21:49     ` Al Viro
@ 2017-01-28  3:47       ` Andy Lutomirski
  2017-02-07 20:18         ` Kees Cook
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Lutomirski @ 2017-01-28  3:47 UTC (permalink / raw)
  To: Al Viro
  Cc: Kees Cook, Andrew Morton, Konstantin Khlebnikov, Jan Kara,
	yalin wang, Willy Tarreau, linux-mm@kvack.org,
	Linux Kernel Mailing List

On Wed, May 25, 2016 at 2:49 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Wed, May 25, 2016 at 02:36:57PM -0700, Kees Cook wrote:
>
>> Hm, this didn't end up getting picked up. (This jumped out at me again
>> because i_mutex just vanished...)
>>
>> Al, what's the right way to update the locking in this patch?
>
> ->i_mutex is dealt with just by using lock_inode(inode)/unlock_inode(inode);
> I hadn't looked at the rest of the locking in there.

Ping?

If this is too messy, I'm wondering if we could get away with a much
simpler approach: clear the suid and sgid bits when the file is opened
for write.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v9] fs: clear file privilege bits when mmap writing
  2017-01-28  3:47       ` Andy Lutomirski
@ 2017-02-07 20:18         ` Kees Cook
  0 siblings, 0 replies; 8+ messages in thread
From: Kees Cook @ 2017-02-07 20:18 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Al Viro, Andrew Morton, Konstantin Khlebnikov, Jan Kara,
	yalin wang, Willy Tarreau, linux-mm@kvack.org,
	Linux Kernel Mailing List

On Fri, Jan 27, 2017 at 7:47 PM, Andy Lutomirski <luto@kernel.org> wrote:
> On Wed, May 25, 2016 at 2:49 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Wed, May 25, 2016 at 02:36:57PM -0700, Kees Cook wrote:
>>
>>> Hm, this didn't end up getting picked up. (This jumped out at me again
>>> because i_mutex just vanished...)
>>>
>>> Al, what's the right way to update the locking in this patch?
>>
>> ->i_mutex is dealt with just by using lock_inode(inode)/unlock_inode(inode);
>> I hadn't looked at the rest of the locking in there.
>
> Ping?
>
> If this is too messy, I'm wondering if we could get away with a much
> simpler approach: clear the suid and sgid bits when the file is opened
> for write.

I think that'll break something, but I don't have any actual examples.
Regardless, I'd still like to see this hole fixed...

-Kees

-- 
Kees Cook
Pixel Security

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-02-07 20:18 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-01-14 21:22 [PATCH v9] fs: clear file privilege bits when mmap writing Kees Cook
2016-01-15  5:55 ` Konstantin Khlebnikov
2016-01-15  6:18   ` Andy Lutomirski
2016-01-15  6:36     ` Konstantin Khlebnikov
2016-05-25 21:36   ` Kees Cook
2016-05-25 21:49     ` Al Viro
2017-01-28  3:47       ` Andy Lutomirski
2017-02-07 20:18         ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).