* uid=0 inside user-namespace and procfs file permissions
From: Aditya Kali @ 2014-10-01 0:22 UTC
To: Serge Hallyn, Eric W. Biederman, linux-kernel@vger.kernel.org,
linux-security-module
Hi all,
I am trying to run a process with uid=0 inside a userns. But when
I also do a capset() after setresuid(0, 0, 0), I see inconsistent
proc file permissions. Almost all the files in /proc/<pid>/ have global
'root' as owner and group even though the actual process uid is correctly
changed.
I wrote a simple program that demonstrates the issue:
1. parent, as global root (uid=0 in init_user_ns) fork()s a child
2. child:
a) unshare(CLONE_NEWUSER)
b) [wait for parent to write uid_map]
c) setresgid(id, id, id) ; setresuid(0, 0, 0);
d) conditionally call capset() to clear capabilities
e) execve(/bin/sleep)
3. parent:
a) populates child's uid_map and maps some uid to 0 inside userns. ex:
0 99 1
b) waitpid()
(the actual program can be found at http://pastebin.com/f4P17VFn for
your reference).
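For readers unfamiliar with the uid_map format used in step 3a, here is a minimal sketch of how a line such as "0 99 1" translates a uid inside the namespace to a uid in the parent namespace. The function names are hypothetical and this only models the arithmetic; it does not touch /proc or create a namespace.

```python
from typing import List, Optional, Tuple

# Each entry is (first uid inside the namespace, first uid outside, range length),
# mirroring the three columns of a uid_map line.
UidMap = List[Tuple[int, int, int]]


def parse_uid_map(text: str) -> UidMap:
    """Parse the contents of a uid_map file into a list of entries."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(f) for f in fields)
        entries.append((inside, outside, length))
    return entries


def to_outside_uid(uid_map: UidMap, inside_uid: int) -> Optional[int]:
    """Return the parent-namespace uid for inside_uid, or None if unmapped."""
    for inside, outside, length in uid_map:
        if inside <= inside_uid < inside + length:
            return outside + (inside_uid - inside)
    return None


uid_map = parse_uid_map("0 99 1\n")
print(to_outside_uid(uid_map, 0))  # 99
print(to_outside_uid(uid_map, 1))  # None
```

With the map from this report, uid 0 inside the namespace is uid 99 outside, and every other uid inside the namespace is unmapped.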
When there is no capset() call after setresuid(0,0,0), everything is
fine. But when I do a capset() to clear all capabilities, the 'owner'
and 'group' of all the files under /proc/<child_pid>/ of the child
process are reverted to global 'root' user.
# without capset (2.d):
root@vm1# id
uid=0(root) gid=0(root) groups=0(root)
root@vm1# ./userns_uid0
child_pid: 24277
proc_file: /proc/24277/uid_map
proc_file: /proc/24277/gid_map
child resuming
^Z
[1]+ Stopped ./userns_uid0
root@vm1# cat /proc/24277/uid_map
0 99 1
root@vm1# cat /proc/24277/status | grep -e "Uid:" -e "Gid:"
Uid: 99 99 99 99
Gid: 99 99 99 99
root@vm1# ls -l /proc/24277/
total 0
dr-xr-xr-x 2 nobody nobody 0 2014-09-30 16:31 attr
-r-------- 1 nobody nobody 0 2014-09-30 16:31 auxv
-r--r--r-- 1 nobody nobody 0 2014-09-30 16:31 cgroup
--w------- 1 nobody nobody 0 2014-09-30 16:31 clear_refs
-r--r--r-- 1 nobody nobody 0 2014-09-30 16:31 cmdline
-rw-r--r-- 1 nobody nobody 0 2014-09-30 16:31 comm
-rw-r--r-- 1 nobody nobody 0 2014-09-30 16:31 coredump_filter
-r--r--r-- 1 nobody nobody 0 2014-09-30 16:31 cpuset
...
[All files have owner='nobody' and group='nobody' .. same as that of
the process]
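The Uid:/Gid: check in the transcript above can also be scripted. This is a small sketch, assuming the usual four-column layout of those lines in /proc/<pid>/status (real, effective, saved, filesystem); parse_ids is a hypothetical helper, and the sample text stands in for a real status file.

```python
import re
from typing import Dict, Tuple


def parse_ids(status_text: str) -> Dict[str, Tuple[int, ...]]:
    """Return {'Uid': (real, effective, saved, fs), 'Gid': (...)} from status text."""
    ids = {}
    for line in status_text.splitlines():
        match = re.match(r"^(Uid|Gid):\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)", line)
        if match:
            ids[match.group(1)] = tuple(int(v) for v in match.groups()[1:])
    return ids


# Sample text modelled on the output shown in this report.
sample = "Name:\tsleep\nUid:\t99\t99\t99\t99\nGid:\t99\t99\t99\t99\n"
ids = parse_ids(sample)
print(ids["Uid"][1])  # effective uid: 99
```

On a live system the same function could be fed the contents of /proc/<pid>/status, and the effective uid compared with the owner that ls -l reports for the files in that directory.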
With the additional capset() call, the files under /proc/<child_pid>/
are now owned by global root:
root@vm1# ./userns_uid0 resetcaps
child_pid: 24706
proc_file: /proc/24706/uid_map
proc_file: /proc/24706/gid_map
child resuming
resetting caps
^Z
[2]+ Stopped ./userns_uid0 resetcaps
root@vm1# cat /proc/24706/uid_map
0 99 1
root@vm1# cat /proc/24706/status | grep -e "Uid:" -e "Gid:"
Uid: 99 99 99 99
Gid: 99 99 99 99
[Everything as before till now]
root@vm1# ls -l /proc/24706/
total 0
dr-xr-xr-x 2 nobody nobody 0 2014-09-30 16:47 attr
-r-------- 1 root root 0 2014-09-30 16:47 auxv
-r--r--r-- 1 root root 0 2014-09-30 16:47 cgroup
--w------- 1 root root 0 2014-09-30 16:47 clear_refs
-r--r--r-- 1 root root 0 2014-09-30 16:47 cmdline
-rw-r--r-- 1 root root 0 2014-09-30 16:47 comm
-rw-r--r-- 1 root root 0 2014-09-30 16:47 coredump_filter
-r--r--r-- 1 root root 0 2014-09-30 16:47 cpuset
...
-r--r--r-- 1 root root 0 2014-09-30 16:47 mountinfo
-r--r--r-- 1 root root 0 2014-09-30 16:47 mounts
-r-------- 1 root root 0 2014-09-30 16:47 mountstats
dr-xr-xr-x 5 nobody nobody 0 2014-09-30 16:47 net
dr-x--x--x 2 root root 0 2014-09-30 16:47 ns
-r--r--r-- 1 root root 0 2014-09-30 16:47 numa_maps
...
-r--r--r-- 1 root root 0 2014-09-30 16:47 status
-r-------- 1 root root 0 2014-09-30 16:47 syscall
dr-xr-xr-x 3 nobody nobody 0 2014-09-30 16:47 task
..
Only the directories 'attr', 'net' and 'task' are owned by uid=99.
All the other files are owned by global root.
This behavior seems inconsistent. I ran this on a 3.17 kernel. Can
someone with expertise in this area explain if this is expected?
Thanks,
--
Aditya
* Re: uid=0 inside user-namespace and procfs file permissions
From: Eric W. Biederman @ 2014-10-01 0:35 UTC
To: Aditya Kali
Cc: Serge Hallyn, linux-kernel@vger.kernel.org, linux-security-module
So I am not quite certain what you are seeing.
In general proc files are expected to be owned by the euid of a process.
However when the task_dumpable is cleared the files become owned by the
global root user. We have considered relaxing that to the namespace
root user but so far implementing a more granular task_dumpable has not
been done.
The directories are world readable so they don't matter.
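The rule described above can be written down as a toy model. This is only a sketch of the behaviour as stated in this thread, not kernel code. The special case for directories with mode dr-xr-xr-x is an assumption inferred from the listing in the report (attr, net and task keep uid 99, while ns, which is dr-x--x--x, does not); the names expected_owner and WORLD_READABLE_DIR are hypothetical.

```python
import stat

GLOBAL_ROOT_UID = 0
# Assumption drawn from the listing: only directories with exactly this mode
# (dr-xr-xr-x) keep the task's euid when the task is not dumpable.
WORLD_READABLE_DIR = stat.S_IFDIR | 0o555


def expected_owner(mode: int, dumpable: bool, euid: int) -> int:
    """Model the owner reported for an entry under /proc/<pid>/."""
    if dumpable or mode == WORLD_READABLE_DIR:
        return euid
    return GLOBAL_ROOT_UID


# Non-dumpable task with euid 99, as in the second transcript of the report.
print(expected_owner(stat.S_IFDIR | 0o555, False, 99))  # attr, net, task: 99
print(expected_owner(stat.S_IFDIR | 0o511, False, 99))  # ns: 0
print(expected_owner(stat.S_IFREG | 0o444, False, 99))  # status, cmdline: 0
```

Under this model, both listings in the report are what one would see for a task with euid 99: everything owned by uid 99 while the task is dumpable, and only the dr-xr-xr-x directories owned by uid 99 once it is not.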
What puzzles me is that you have directories owned by nobody, and you
are talking about uid = 99 and gid = 99. Nobody is traditionally
(u16_t)-2 and should never actually be used by anyone. It is
used as the default value for unmapped uids and gids.
It looks like you are doing something weird with nobody so I don't have
a clue what is actually going on.
Eric
* Re: uid=0 inside user-namespace and procfs file permissions
From: Aditya Kali @ 2014-10-01 0:51 UTC
To: Eric W. Biederman
Cc: Serge Hallyn, linux-kernel@vger.kernel.org, linux-security-module
On Tue, Sep 30, 2014 at 5:35 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> So I am not quite certain what you are seeing.
>
> In general proc files are expected to be owned by the euid of a process.
> However when the task_dumpable is cleared the files become owned by the
> global root user. We have considered relaxing that to the namespace
> root user but so far implementing a more granular task_dumpable has not
> been done.
>
I tried explicitly setting PR_SET_DUMPABLE before execve(), but that
didn't help either.
> The directories are world readable so they don't matter.
>
> What puzzles me is that you have directories owned by nobody, and you
> are talking about uid = 99 and gid = 99. Nobody is traditionally
> (u16_t)-2 and should never actually be used by anyone. It is
> used as the default value for unmapped uids and gids.
>
> It looks like you are doing something weird with nobody so I don't have
> a clue what is actually going on.
>
The issue is not specific to uid 99 or "nobody". It's just a dummy user
I have for testing. The issue happens with any user with a non-zero uid.
> Eric
Thanks,
--
Aditya
* Re: uid=0 inside user-namespace and procfs file permissions
From: Eric W. Biederman @ 2014-10-01 2:38 UTC
To: Aditya Kali
Cc: Serge Hallyn, linux-kernel@vger.kernel.org, linux-security-module
Aditya Kali <adityakali@google.com> writes:
> On Tue, Sep 30, 2014 at 5:35 PM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> So I am not quite certain what you are seeing.
>>
>> In general proc files are expected to be owned by the euid of a process.
>> However when the task_dumpable is cleared the files become owned by the
>> global root user. We have considered relaxing that to the namespace
>> root user but so far implementing a more granular task_dumpable has not
>> been done.
>>
>
> I tried explicitly setting PR_SET_DUMPABLE before execve(), but that
> didn't help either.
>
>> The directories are world readable so they don't matter.
>>
>> What puzzles me is that you have directories owned by nobody, and you
>> are talking about uid = 99 and gid = 99. Nobody is traditionally
>> (u16_t)-2 and should never actually be used by anyone. It is
>> used as the default value for unmapped uids and gids.
>>
>> It looks like you are doing something weird with nobody so I don't have
>> a clue what is actually going on.
>>
>
> The issue is not specific to uid 99 or "nobody". It's just a dummy user
> I have for testing. The issue happens with any user with a non-zero uid.
But my issue with reading your directory listings of proc is this:
I can't tell if you are giving me a listing of proc from a process in
the user namespace or outside of the user namespace.
If the process 24706 had uid == 99 and gid == 99 (outside of the user
namespace). And you are listing the files from outside of the user
namespace. And uid 99 is mapped to nobody in /etc/passwd and
gid 99 is mapped to nobody in /etc/group. And your ls process is
not running in your user namespace. Then this looks like proper
handling of dumpable. Otherwise I don't have a clue what is going on
because I can't make sense of your directory listings.
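One way to settle whether the observer (for example the shell running ls) shares a user namespace with the target is to compare the /proc/<pid>/ns/user links of the two processes. A minimal sketch follows, assuming a kernel that exposes that link and a caller with permission to read it; both function names are hypothetical, and read_link is injectable only so the logic can be exercised without a live /proc.

```python
import os
from typing import Callable, Union

Pid = Union[int, str]  # a numeric pid, or the string "self"


def user_namespace_id(pid: Pid, read_link: Callable[[str], str] = os.readlink) -> str:
    """Return the link target of /proc/<pid>/ns/user, e.g. 'user:[4026531837]'."""
    return read_link(f"/proc/{pid}/ns/user")


def same_user_namespace(pid_a: Pid, pid_b: Pid,
                        read_link: Callable[[str], str] = os.readlink) -> bool:
    """True if both processes report the same user namespace link target."""
    return user_namespace_id(pid_a, read_link) == user_namespace_id(pid_b, read_link)
```

For the case in this thread, same_user_namespace("self", 24706) run from the shell that produced the listing would be expected to return False, since the child called unshare(CLONE_NEWUSER) and the shell did not.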
Eric
* Re: uid=0 inside user-namespace and procfs file permissions
From: Aditya Kali @ 2014-10-01 5:28 UTC
To: Eric W. Biederman
Cc: Serge Hallyn, linux-kernel@vger.kernel.org, linux-security-module
On Tue, Sep 30, 2014 at 7:38 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Aditya Kali <adityakali@google.com> writes:
>
>> On Tue, Sep 30, 2014 at 5:35 PM, Eric W. Biederman
>> <ebiederm@xmission.com> wrote:
>>> So I am not quite certain what you are seeing.
>>>
>>> In general proc files are expected to be owned by the euid of a process.
>>> However when the task_dumpable is cleared the files become owned by the
>>> global root user. We have considered relaxing that to the namespace
>>> root user but so far implementing a more granular task_dumpable has not
>>> been done.
>>>
>>
>> I tried explicitly setting PR_SET_DUMPABLE before execve(), but that
>> didn't help either.
>>
>>> The directories are world readable so they don't matter.
>>>
>>> What puzzles me is that you have directories owned by nobody, and you
>>> are talking about uid = 99 and gid = 99. Nobody is traditionally
>>> (u16_t)-2 and should never actually be used by anyone. It is
>>> used as the default value for unmapped uids and gids.
>>>
>>> It looks like you are doing something weird with nobody so I don't have
>>> a clue what is actually going on.
>>>
>>
>> The issue is not specific to uid 99 or "nobody". It's just a dummy user
>> I have for testing. The issue happens with any user with a non-zero uid.
>
> But my issue with reading your directory listings of proc is this:
>
> I can't tell if you are giving me a listing of proc from a process in
> the user namespace or outside of the user namespace.
The listing is as seen from outside the user namespace.
>
> If the process 24706 had uid == 99 and gid == 99 (outside of the user
> namespace). And you are listing the files from outside of the user
> namespace. And uid 99 is mapped to nobody in /etc/passwd and
> gid 99 is mapped to nobody in /etc/group. And your ls process is
> not running in your user namespace.
All of the above is correct.
> Then this looks like proper
> handling of dumpable. Otherwise I don't have a clue what is going on
> because I can't make sense of your directory listings.
>
So you are saying this is expected behavior? My experiment with
prctl(PR_SET_DUMPABLE, 1) didn't help either. I expected the owner and
group in the proc file listing (as seen from init_user_ns) to be
'nobody' since the process is really running as uid=99 ("nobody") in
the init_user_ns. What am I missing?
I will try to go over the set_dumpable() call-sites tomorrow and get more info.
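For anyone repeating the experiment, the dumpable flag of the calling process can be read and set with prctl(2). The sketch below does this from Python through ctypes, assuming Linux with a libc that exports prctl; the wrapper names are hypothetical. It only inspects the calling process and does not show what happens to the flag across setresuid() or execve(), which is the open question in this thread.

```python
import ctypes

# Option values from <linux/prctl.h>.
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4

# CDLL(None) resolves symbols already loaded into this process, including libc.
libc = ctypes.CDLL(None, use_errno=True)


def get_dumpable() -> int:
    """Return the dumpable flag of the calling process."""
    value = libc.prctl(PR_GET_DUMPABLE, 0, 0, 0, 0)
    if value < 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_GET_DUMPABLE) failed")
    return value


def set_dumpable(flag: int) -> None:
    """Set the dumpable flag of the calling process (0 or 1)."""
    if libc.prctl(PR_SET_DUMPABLE, flag, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_DUMPABLE) failed")


set_dumpable(1)
print(get_dumpable())
```

Printing get_dumpable() at each step of the test program (after setresuid(), after capset(), and in the exec'ed program itself) would show where the flag changes, which is more direct than inferring it from the ownership of the proc files.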
> Eric
Thanks,
--
Aditya