* Re: [PATCH v6] fs: clear file privilege bits when mmap writing
[not found] ` <20160110211051.GH17997@ZenIV.linux.org.uk>
@ 2016-01-10 22:30 ` Konstantin Khlebnikov
0 siblings, 0 replies; 4+ messages in thread
From: Konstantin Khlebnikov @ 2016-01-10 22:30 UTC (permalink / raw)
To: Al Viro
Cc: Kees Cook, Andy Lutomirski, Jan Kara, yalin wang, Willy Tarreau,
Andrew Morton, linux-fsdevel, linux-arch, Linux API,
linux-mm@kvack.org
On Mon, Jan 11, 2016 at 12:10 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Sun, Jan 10, 2016 at 10:51:52PM +0300, Konstantin Khlebnikov wrote:
>> On Sun, Jan 10, 2016 at 10:30 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> > On Sun, Jan 10, 2016 at 06:48:32PM +0300, Konstantin Khlebnikov wrote:
>> >> I think this should be done in mmap/mprotect. Code in sys_mmap is trivial.
>> >>
>> >> In sys_mprotect you can check file_needs_remove_privs() and VM_SHARED
>> >> under mmap_sem, then if needed grab reference to struct file from vma and
>> >> clear suid after unlocking mmap_sem.
>> >
>> > Which vma? mprotect(2) can cover more than one mapping... You'd have to
>> > play interesting games to collect the set of affected struct file; it
>> > _might_ be doable (e.g. by using task_work_add() to have the damn thing
>> > trigger on the way to userland), but it would require some care to avoid
>> > hitting the same file more than once - it might, after all, be mmapped
>> > in more than one process, so racing mprotect() would need to be taken
>> > into account. Hell knows - might be doable, but I'm not sure it'll be
>> > any prettier.
>>
>> Ok, I didn't thought about that. mprotect don't have to be atomic for whole
>> range -- we could drop mmap_sem, clear suid from one file and restart it
>> for next vma and so on.
>
> Won't be fun. Even aside of the user-visible behaviour changes, you'll have
> a lot of new corner cases, starting with the fact that you can't hold onto
> vma - virtual address is the best you can do and vma you find after regaining
> mmap_sem might start at lower address than one where you are restarting;
> getting the splitting-related logics right will be interesting, to put it
> mildly.
I don't see any problems here -- in this case mprotect virtually turns
into series of
indendent mprotect calls. Yes, we have to find vma again. Not a big deal.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v6] fs: clear file privilege bits when mmap writing
[not found] ` <CALYGNiPC224w7-xeo9NOX9nrHH84o+_KXBtKWtd4TPXQyQMq2w@mail.gmail.com>
@ 2016-01-11 22:45 ` Kees Cook
2016-01-11 23:16 ` Konstantin Khlebnikov
0 siblings, 1 reply; 4+ messages in thread
From: Kees Cook @ 2016-01-11 22:45 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Alexander Viro, Andy Lutomirski, Jan Kara, yalin wang,
Willy Tarreau, Andrew Morton, linux-fsdevel, linux-arch,
Linux API, linux-mm@kvack.org
On Mon, Jan 11, 2016 at 2:39 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
> On Mon, Jan 11, 2016 at 10:38 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Sun, Jan 10, 2016 at 7:48 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>> On Sat, Jan 9, 2016 at 2:27 AM, Kees Cook <keescook@chromium.org> wrote:
>>>> Normally, when a user can modify a file that has setuid or setgid bits,
>>>> those bits are cleared when they are not the file owner or a member
>>>> of the group. This is enforced when using write and truncate but not
>>>> when writing to a shared mmap on the file. This could allow the file
>>>> writer to gain privileges by changing a binary without losing the
>>>> setuid/setgid/caps bits.
>>>>
>>>> Changing the bits requires holding inode->i_mutex, so it cannot be done
>>>> during the page fault (due to mmap_sem being held during the fault). We
>>>> could do this during vm_mmap_pgoff, but that would need coverage in
>>>> mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem
>>>> again. We could clear at open() time, but it's possible things are
>>>> accidentally opening with O_RDWR and only reading. Better to clear on
>>>> close and error failures (i.e. an improvement over now, which is not
>>>> clearing at all).
>>>
>>> I think this should be done in mmap/mprotect. Code in sys_mmap is trivial.
>>>
>>> In sys_mprotect you can check file_needs_remove_privs() and VM_SHARED
>>> under mmap_sem, then if needed grab reference to struct file from vma and
>>> clear suid after unlocking mmap_sem.
>>>
>>> I haven't seen previous iterations, probably this approach has known flaws.
>>
>> mmap_sem is still needed in mprotect (to find and hold the vma), so
>> it's not possible. I'd love to be proven wrong, but I didn't see a
>> way.
>
> something like this
>
> @@ -375,6 +376,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>
> vm_flags = calc_vm_prot_bits(prot);
>
> +restart:
> down_write(¤t->mm->mmap_sem);
>
> vma = find_vma(current->mm, start);
> @@ -416,6 +418,21 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start,
> size_t, len,
> goto out;
> }
>
> + if ((newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE) &&
> + vma->vm_file && file_needs_remove_privs(vma->vm_file)) {
> + struct file *file = get_file(vma->vm_file);
> +
> + start = vma->vm_start;
> + up_write(¤t->mm->mmap_sem);
> + mutex_lock(&file_inode(file)->i_mutex);
> + error = file_remove_privs(file);
> + mutex_unlock(&file_inode(file)->i_mutex);
> + fput(file);
> + if (error)
> + return error;
> + goto restart;
> + }
> +
Is this safe against the things Al mentioned? I still don't like the
mmap/mprotect approach because it makes the change before anything was
actually written...
-Kees
>
>
>>
>> -Kees
>>
>> --
>> Kees Cook
>> Chrome OS & Brillo Security
--
Kees Cook
Chrome OS & Brillo Security
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v6] fs: clear file privilege bits when mmap writing
2016-01-11 22:45 ` Kees Cook
@ 2016-01-11 23:16 ` Konstantin Khlebnikov
2016-01-11 23:19 ` Kees Cook
0 siblings, 1 reply; 4+ messages in thread
From: Konstantin Khlebnikov @ 2016-01-11 23:16 UTC (permalink / raw)
To: Kees Cook
Cc: Alexander Viro, Andy Lutomirski, Jan Kara, yalin wang,
Willy Tarreau, Andrew Morton, linux-fsdevel, linux-arch,
Linux API, linux-mm@kvack.org
On Tue, Jan 12, 2016 at 1:45 AM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, Jan 11, 2016 at 2:39 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>> On Mon, Jan 11, 2016 at 10:38 PM, Kees Cook <keescook@chromium.org> wrote:
>>> On Sun, Jan 10, 2016 at 7:48 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>> On Sat, Jan 9, 2016 at 2:27 AM, Kees Cook <keescook@chromium.org> wrote:
>>>>> Normally, when a user can modify a file that has setuid or setgid bits,
>>>>> those bits are cleared when they are not the file owner or a member
>>>>> of the group. This is enforced when using write and truncate but not
>>>>> when writing to a shared mmap on the file. This could allow the file
>>>>> writer to gain privileges by changing a binary without losing the
>>>>> setuid/setgid/caps bits.
>>>>>
>>>>> Changing the bits requires holding inode->i_mutex, so it cannot be done
>>>>> during the page fault (due to mmap_sem being held during the fault). We
>>>>> could do this during vm_mmap_pgoff, but that would need coverage in
>>>>> mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem
>>>>> again. We could clear at open() time, but it's possible things are
>>>>> accidentally opening with O_RDWR and only reading. Better to clear on
>>>>> close and error failures (i.e. an improvement over now, which is not
>>>>> clearing at all).
>>>>
>>>> I think this should be done in mmap/mprotect. Code in sys_mmap is trivial.
>>>>
>>>> In sys_mprotect you can check file_needs_remove_privs() and VM_SHARED
>>>> under mmap_sem, then if needed grab reference to struct file from vma and
>>>> clear suid after unlocking mmap_sem.
>>>>
>>>> I haven't seen previous iterations, probably this approach has known flaws.
>>>
>>> mmap_sem is still needed in mprotect (to find and hold the vma), so
>>> it's not possible. I'd love to be proven wrong, but I didn't see a
>>> way.
>>
>> something like this
>>
>> @@ -375,6 +376,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>>
>> vm_flags = calc_vm_prot_bits(prot);
>>
>> +restart:
>> down_write(¤t->mm->mmap_sem);
>>
>> vma = find_vma(current->mm, start);
>> @@ -416,6 +418,21 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start,
>> size_t, len,
>> goto out;
>> }
>>
>> + if ((newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE) &&
>> + vma->vm_file && file_needs_remove_privs(vma->vm_file)) {
>> + struct file *file = get_file(vma->vm_file);
>> +
>> + start = vma->vm_start;
>> + up_write(¤t->mm->mmap_sem);
>> + mutex_lock(&file_inode(file)->i_mutex);
>> + error = file_remove_privs(file);
>> + mutex_unlock(&file_inode(file)->i_mutex);
>> + fput(file);
>> + if (error)
>> + return error;
>> + goto restart;
>> + }
>> +
>
> Is this safe against the things Al mentioned? I still don't like the
> mmap/mprotect approach because it makes the change before anything was
> actually written...
(I forgot to check VM_SHARED)
Yep, this should be safe.
I think suid should be cleared before any possible change of data.
New content could hit the disk but suid never be cleared,
for example if system suddenly crashed or rebooted.
>
> -Kees
>
>>
>>
>>>
>>> -Kees
>>>
>>> --
>>> Kees Cook
>>> Chrome OS & Brillo Security
>
>
>
> --
> Kees Cook
> Chrome OS & Brillo Security
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v6] fs: clear file privilege bits when mmap writing
2016-01-11 23:16 ` Konstantin Khlebnikov
@ 2016-01-11 23:19 ` Kees Cook
0 siblings, 0 replies; 4+ messages in thread
From: Kees Cook @ 2016-01-11 23:19 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Alexander Viro, Andy Lutomirski, Jan Kara, yalin wang,
Willy Tarreau, Andrew Morton, linux-fsdevel, linux-arch,
Linux API, linux-mm@kvack.org
On Mon, Jan 11, 2016 at 3:16 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
> On Tue, Jan 12, 2016 at 1:45 AM, Kees Cook <keescook@chromium.org> wrote:
>> On Mon, Jan 11, 2016 at 2:39 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>> On Mon, Jan 11, 2016 at 10:38 PM, Kees Cook <keescook@chromium.org> wrote:
>>>> On Sun, Jan 10, 2016 at 7:48 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>>> On Sat, Jan 9, 2016 at 2:27 AM, Kees Cook <keescook@chromium.org> wrote:
>>>>>> Normally, when a user can modify a file that has setuid or setgid bits,
>>>>>> those bits are cleared when they are not the file owner or a member
>>>>>> of the group. This is enforced when using write and truncate but not
>>>>>> when writing to a shared mmap on the file. This could allow the file
>>>>>> writer to gain privileges by changing a binary without losing the
>>>>>> setuid/setgid/caps bits.
>>>>>>
>>>>>> Changing the bits requires holding inode->i_mutex, so it cannot be done
>>>>>> during the page fault (due to mmap_sem being held during the fault). We
>>>>>> could do this during vm_mmap_pgoff, but that would need coverage in
>>>>>> mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem
>>>>>> again. We could clear at open() time, but it's possible things are
>>>>>> accidentally opening with O_RDWR and only reading. Better to clear on
>>>>>> close and error failures (i.e. an improvement over now, which is not
>>>>>> clearing at all).
>>>>>
>>>>> I think this should be done in mmap/mprotect. Code in sys_mmap is trivial.
>>>>>
>>>>> In sys_mprotect you can check file_needs_remove_privs() and VM_SHARED
>>>>> under mmap_sem, then if needed grab reference to struct file from vma and
>>>>> clear suid after unlocking mmap_sem.
>>>>>
>>>>> I haven't seen previous iterations, probably this approach has known flaws.
>>>>
>>>> mmap_sem is still needed in mprotect (to find and hold the vma), so
>>>> it's not possible. I'd love to be proven wrong, but I didn't see a
>>>> way.
>>>
>>> something like this
>>>
>>> @@ -375,6 +376,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
>>>
>>> vm_flags = calc_vm_prot_bits(prot);
>>>
>>> +restart:
>>> down_write(¤t->mm->mmap_sem);
>>>
>>> vma = find_vma(current->mm, start);
>>> @@ -416,6 +418,21 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start,
>>> size_t, len,
>>> goto out;
>>> }
>>>
>>> + if ((newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE) &&
>>> + vma->vm_file && file_needs_remove_privs(vma->vm_file)) {
>>> + struct file *file = get_file(vma->vm_file);
>>> +
>>> + start = vma->vm_start;
>>> + up_write(¤t->mm->mmap_sem);
>>> + mutex_lock(&file_inode(file)->i_mutex);
>>> + error = file_remove_privs(file);
>>> + mutex_unlock(&file_inode(file)->i_mutex);
>>> + fput(file);
>>> + if (error)
>>> + return error;
>>> + goto restart;
>>> + }
>>> +
>>
>> Is this safe against the things Al mentioned? I still don't like the
>> mmap/mprotect approach because it makes the change before anything was
>> actually written...
>
> (I forgot to check VM_SHARED)
>
> Yep, this should be safe.
>
> I think suid should be cleared before any possible change of data.
> New content could hit the disk but suid never be cleared,
> for example if system suddenly crashed or rebooted.
Oooh, very good point. Yeah, that's enough to convince me. :) Ignore my v7...
Al, are you okay with this semantic change?
-Kees
--
Kees Cook
Chrome OS & Brillo Security
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2016-01-11 23:19 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20160108232727.GA23490@www.outflux.net>
[not found] ` <CALYGNiOUL7ewU3+5Zoi_9qofYWwF0vpqMy=A0wS=jUFZ11haCg@mail.gmail.com>
[not found] ` <20160110193044.GG17997@ZenIV.linux.org.uk>
[not found] ` <CALYGNiOxyXX2dpiPoGQUz0CDsvZtH57CO7gE2rAmTQWLigeL1w@mail.gmail.com>
[not found] ` <20160110211051.GH17997@ZenIV.linux.org.uk>
2016-01-10 22:30 ` [PATCH v6] fs: clear file privilege bits when mmap writing Konstantin Khlebnikov
[not found] ` <CAGXu5jJaoZC7WL=MndBr915XhEpn9n3HOOhB-ue1xqyKFWxxzQ@mail.gmail.com>
[not found] ` <CALYGNiPC224w7-xeo9NOX9nrHH84o+_KXBtKWtd4TPXQyQMq2w@mail.gmail.com>
2016-01-11 22:45 ` Kees Cook
2016-01-11 23:16 ` Konstantin Khlebnikov
2016-01-11 23:19 ` Kees Cook
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).