From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Lezcano
Subject: Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.
Date: Mon, 08 Mar 2010 22:12:50 +0100
Message-ID: <4B956852.7050804@free.fr>
References: <4B88E431.6040609@parallels.com> <4B8AE8C1.1030305@free.fr> <4B8D28CF.8060304@parallels.com> <20100302211942.GA17816@us.ibm.com> <20100303000743.GA13744@us.ibm.com> <4B8E9370.3050300@parallels.com> <4B9158F5.5040205@parallels.com> <4B926B1B.5070207@free.fr> <4B92C886.9020507@free.fr> <4B952BBE.6070507@free.fr> <4B9556A9.60206@free.fr> <4B95611C.5060403@free.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Pavel Emelyanov, Sukadev Bhattiprolu, Serge Hallyn, Linux Netdev List, containers@lists.linux-foundation.org, Netfilter Development Mailinglist, Ben Greear
To: "Eric W. Biederman"
Return-path:
Received: from mtagate2.uk.ibm.com ([194.196.100.162]:34607 "EHLO mtagate2.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754774Ab0CHVM4 (ORCPT ); Mon, 8 Mar 2010 16:12:56 -0500
In-Reply-To:
Sender: netfilter-devel-owner@vger.kernel.org
List-ID:

Eric W. Biederman wrote:
> Daniel Lezcano writes:
>
>> Eric W. Biederman wrote:
>>
>>> Daniel Lezcano writes:
>>>
>>>> Eric W. Biederman wrote:
>>>>
>>>>> Daniel Lezcano writes:
>>>>>
>>>>>> Eric W. Biederman wrote:
>>>>>>
>>>>>>> I have taken a snapshot of my development tree and placed it at:
>>>>>>>
>>>>>>> git://git.kernel.org/pub/scm/linux/people/ebiederm/linux-2.6.33-nsfd-v5.git
>>>>>>>
>>>>>> Hi Eric,
>>>>>>
>>>>>> thanks for the pointer.
>>>>>>
>>>>>> I tried to boot the kernel under qemu and I got this oops:
>>>>>>
>>>>> I am clearly running an old userspace on my test machine. No udev.
>>>>> It looks like udev has a long standing netlink misfeature, where
>>>>> it does not initialize NETLINK_CB....
>>>>>
>>>>> >From 8d85e3ab88718eda3d94cf8e1be14b69dae2b8f1 Mon Sep 17 00:00:00 2001
>>>>> From: Eric W. Biederman
>>>>> Date: Mon, 8 Mar 2010 09:25:20 -0800
>>>>> Subject: [PATCH] kobject_uevent: Use the netlink allocator helper...
>>>>>
>>>>> Signed-off-by: Eric W. Biederman
>>>>>
>>>> Thanks.
>>>>
>>>> I was able to boot but I have the following warning:
>>>>
>>> Thanks for the bug report.
>>>
>> Thanks to you for the patchset :)
>>
>>> For the moment you might want to drop:
>>> af_netlink: Allow credentials to work across namespaces.
>>> af_netlink: Debugging in case I have missed something.
>>>
>>> Although I am curious if you hit my debugging messages in
>>> netlink recv.
>>>
>> No, it does not appear (looked for "missing NETLINK_CB proto").
>>
>>> I guess if the goal is to test my nsfd bits you can drop everything
>>> starting with my 'scm: Reorder scm_cookie.' commit. The rest is what
>>> it takes to get uids, gids and pids translated when they cross
>>> namespaces on an af_unix or an af_netlink socket.
>>>
>>> At least in the af_netlink case it appears clear I have missed
>>> something.
>>>
>>> This is a warning that netlink throws when the packet accounting is
>>> messed up. So it sounds like you are exercising another path that I
>>> failed to exercise and fix.
>>>
>> I will keep looking to see if I can find more clues about this warning.
>>
>> In the meantime I was able to enter the container with the following
>> ugly program:
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <unistd.h>
>> #include <fcntl.h>
>> #include <sys/types.h>
>> #include <sys/stat.h>
>> #include <sys/param.h>
>> #include <sys/syscall.h>
>>
>> #define __NR_setns 300
>>
>> int setns(int nstype, int fd)
>> {
>>     return syscall (__NR_setns, nstype, fd);
>> }
>>
>> int main(int argc, char *argv[])
>> {
>>     char path[MAXPATHLEN];
>>     char *ns[] = { "pid", "mnt", "net", "pid", "uts" };
>>     const int size = sizeof(ns) / sizeof(char *);
>>     int fd[size];
>>     int i;
>>
>>     if (argc != 3) {
>>         fprintf(stderr, "mynsenter <pid> <command>\n");
>>         exit(1);
>>     }
>>
>>     /* open every namespace file of the target pid first */
>>     for (i = 0; i < size; i++) {
>>         sprintf(path, "/proc/%s/ns/%s", argv[1], ns[i]);
>>
>>         fd[i] = open(path, O_RDONLY);
>>         if (fd[i] < 0) {
>>             perror("open");
>>             return -1;
>>         }
>>     }
>>
>>     /* then join each namespace in turn */
>>     for (i = 0; i < size; i++) {
>>         if (setns(0, fd[i])) {
>>             perror("setns");
>>             return -1;
>>         }
>>     }
>>
>>     execve(argv[2], &argv[2], NULL);
>>     perror("execve");
>>
>>     return 0;
>> }
>>
>> At first glance, no problem :)
>>
>
> No fork() so your process is completely in the pid namespace?
>

What I do is attach "/bin/sh" to the container with this program. The
container is a VPS running busybox with full isolation.

echo $$ gives the real pid.

All the forked processes appear in the pid namespace; they are visible
through /proc with their virtual pids.

I am not able to change to the /proc/self directory (I assume this is
normal).
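
For reference, the open-then-join sequence of the program above can be sketched against the setns() interface as it was later merged (Linux 3.0, glibc 2.14), where the argument order is setns(fd, nstype), the reverse of the prototype used in this patchset. This is only a sketch under assumptions: the helper names ns_path() and enter_namespaces() are made up for illustration, and "ipc" in the list is a guess at what the first of the two "pid" entries was meant to be.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Namespaces to join. "ipc" is an assumption: the original list has "pid" twice. */
static const char *const ns_names[] = { "ipc", "net", "pid", "uts", "mnt" };
#define NS_COUNT (sizeof(ns_names) / sizeof(ns_names[0]))

/* Hypothetical helper: write "/proc/<pid>/ns/<name>" into buf.
 * Returns 0 on success, -1 if the path does not fit. */
int ns_path(char *buf, size_t len, const char *pid, const char *name)
{
    int n = snprintf(buf, len, "/proc/%s/ns/%s", pid, name);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: open every namespace file of <pid> first, then
 * join them one by one. Returns 0 on success, -1 on the first failure. */
int enter_namespaces(const char *pid)
{
    char path[256];
    int fd[NS_COUNT];
    size_t i, opened = 0;
    int ret = -1;

    /* Open everything before joining anything, so that the /proc
     * lookups happen while we are still in the original namespaces. */
    for (i = 0; i < NS_COUNT; i++) {
        if (ns_path(path, sizeof(path), pid, ns_names[i]) < 0)
            goto out;
        fd[i] = open(path, O_RDONLY | O_CLOEXEC);
        if (fd[i] < 0) {
            perror(path);
            goto out;
        }
        opened++;
    }

    /* nstype 0 means "accept whatever namespace type the fd refers to". */
    for (i = 0; i < NS_COUNT; i++) {
        if (setns(fd[i], 0) < 0) {
            perror("setns");
            goto out;
        }
    }
    ret = 0;
out:
    for (i = 0; i < opened; i++)
        close(fd[i]);
    return ret;
}
```

A caller would run enter_namespaces(argv[1]) and then exec the command, as the program above does. In the merged interface, joining a pid namespace only affects children created afterwards, so a fork() before the exec is needed for the command itself to run inside the target pid namespace.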