From mboxrd@z Thu Jan 1 00:00:00 1970 From: Michal Hocko Subject: Re: [PATCH] procfs: change the owner of non-dumpable and writeable files Date: Thu, 19 Jan 2017 10:29:31 +0100 Message-ID: <20170119092930.GJ30786@dhcp22.suse.cz> References: <20170118040159.4751-1-asarai@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <20170118040159.4751-1-asarai-l3A5Bk7waGM@public.gmane.org> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org To: Aleksa Sarai Cc: Kees Cook , linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org, Oleg Nesterov , dev-IGmTWi+3HBZvNhPySn5qfx2eb7JE58TQ@public.gmane.org, John Stultz , Al Viro , Andrew Morton , Janis Danisevskis , "Eric W. Biederman" List-Id: containers.vger.kernel.org Cc Eric On Wed 18-01-17 15:01:59, Aleksa Sarai wrote: > In order to protect against ptrace(2) and similar attacks on container > runtimes when they join namespaces, many runtimes set mm->dumpable to > SUID_DUMP_DISABLE. However, doing this means that attempting to set up > an unprivileged user namespace will fail because an unprivileged process > can no longer access /proc/self/{setgroups,{uid,gid}_map} for the > container process (which is the same uid as the runtime process). > > Fix this by changing pid_getattr to *also* change the owner of regular > files that have a mode of 0644 (when the process is not dumpable). This > ensures that the important /proc/[pid]/... files mentioned above are > properly accessible by a container runtime in a rootless container > context. > > The most blantant issue is that a non-dumpable process in a rootless > container context is unable to open /proc/self/setgroups, because it > doesn't own the file. > > int main(void) > { > prctl(PR_SET_DUMPABLE, 0, 0, 0, 0); > unshare(CLONE_NEWUSER); > > /* This will fail. */ > int fd = open("/proc/self/setgroups", O_WRONLY); > if (fd < 0) > abort(); > > return 0; > } I do agree that failing to open anything in /proc/self/ is more than unexepcted! I cannot judge the patch but my gut feeling tells me that the fix should be somewhere in the open handler. One nit below > > Cc: dev-IGmTWi+3HBZvNhPySn5qfx2eb7JE58TQ@public.gmane.org > Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org > Signed-off-by: Aleksa Sarai > --- > fs/proc/base.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/fs/proc/base.c b/fs/proc/base.c > index ca651ac00660..ebabb12f4536 100644 > --- a/fs/proc/base.c > +++ b/fs/proc/base.c > @@ -1729,6 +1729,7 @@ int pid_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat) > return -ENOENT; > } > if ((inode->i_mode == (S_IFDIR|S_IRUGO|S_IXUGO)) || > + (inode->i_mode == (S_IFREG|S_IRUGO|S_IWUSR)) || > task_dumpable(task)) { > cred = __task_cred(task); > stat->uid = cred->euid; > @@ -1770,6 +1771,7 @@ int pid_revalidate(struct dentry *dentry, unsigned int flags) > > if (task) { > if ((inode->i_mode == (S_IFDIR|S_IRUGO|S_IXUGO)) || > + (inode->i_mode == (S_IFREG|S_IRUGO|S_IWUSR)) || > task_dumpable(task)) { > rcu_read_lock(); > cred = __task_cred(task); > @@ -2394,7 +2396,7 @@ static int proc_pident_instantiate(struct inode *dir, > return -ENOENT; > } > > -static struct dentry *proc_pident_lookup(struct inode *dir, > +static struct dentry *proc_pident_lookup(struct inode *dir, > struct dentry *dentry, > const struct pid_entry *ents, > unsigned int nents) > @@ -2536,7 +2538,7 @@ static const struct pid_entry attr_dir_stuff[] = { > > static int proc_attr_dir_readdir(struct file *file, struct dir_context *ctx) > { > - return proc_pident_readdir(file, ctx, > + return proc_pident_readdir(file, ctx, > attr_dir_stuff, ARRAY_SIZE(attr_dir_stuff)); > } this two are just whitespace noise -- Michal Hocko SUSE Labs