From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
Subject: Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.
Date: Tue, 09 Mar 2010 02:13:08 -0800
Message-ID:
References: <4B88E431.6040609@parallels.com> <20100303000743.GA13744@us.ibm.com> <4B8E9370.3050300@parallels.com> <4B9158F5.5040205@parallels.com> <4B926B1B.5070207@free.fr> <4B92C886.9020507@free.fr> <4B952BBE.6070507@free.fr> <4B9556A9.60206@free.fr> <4B95611C.5060403@free.fr> <4B956852.7050804@free.fr> <4B961D09.4010802@free.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Pavel Emelyanov, Sukadev Bhattiprolu, Serge Hallyn, Linux Netdev List, containers@lists.linux-foundation.org, Netfilter Development Mailinglist, Ben Greear
To: Daniel Lezcano
Return-path:
In-Reply-To: <4B961D09.4010802@free.fr> (Daniel Lezcano's message of "Tue\, 09 Mar 2010 11\:03\:53 +0100")
Sender: netdev-owner@vger.kernel.org
List-Id: netfilter-devel.vger.kernel.org

Daniel Lezcano writes:

> Eric W. Biederman wrote:
>
> [ ... ]
>> I guess my meaning is I was expecting.
>> child = fork();
>> if (child == 0) {
>> 	execve(...);
>> }
>> waitpid(child);
>>
>> This puts /bin/sh in the container as well.
>>
> #include
> #include
> #include
> #include
> #include
> #include
> #include
> #include
>
> #define __NR_setns 300
>
> int setns(int nstype, int fd)
> {
> 	return syscall (__NR_setns, nstype, fd);
> }
>
> int main(int argc, char *argv[])
> {
> 	char path[MAXPATHLEN];
> 	char *ns[] = { "pid", "mnt", "net", "pid", "uts" };
> 	const int size = sizeof(ns) / sizeof(char *);
> 	int fd[size];
> 	int i;
> 	pid_t pid;
> 	if (argc != 3) {
> 		fprintf(stderr, "mynsenter \n");
> 		exit(1);
> 	}
>
> 	for (i = 0; i < size; i++) {
> 		sprintf(path, "/proc/%s/ns/%s", argv[1], ns[i]);
>
> 		fd[i] = open(path, O_RDONLY| FD_CLOEXEC);
> 		if (fd[i] < 0) {
> 			perror("open");
> 			return -1;
> 		}
>
> 	}
> 	for (i = 0; i < size; i++)
> 		if (setns(0, fd[i])) {
> 			perror("setns");
> 			return -1;
> 		}
>
> 	pid = fork();
> 	if (!pid) {
>
> 		fprintf(stderr, "mypid is %d\n", syscall(__NR_getpid));
>
> 		execve(argv[2], &argv[2], NULL);
> 		perror("execve");
>
> 	}
>
> 	if (pid < 0) {
> 		perror("fork");
> 		return -1;
> 	}
>
> 	if (waitpid(&pid, NULL, 0) < 0) {
> 		perror("waitpid");
> 	}
>
> 	return 0;
> }

&pid ??? Isn't that a type error?

> Waitpid returns an error:
>
> waitpid: No child processes
>
> The pid number returned by fork is the pid from the init pid namespace but it
> seems waitpid is waiting for a pid belonging to the child pid namespace.
>
> waitpid
>  -> wait4
>    -> find_get_pid
>      -> find_vpid
>        -> find_pid_ns(nr, current->nsproxy->pid_ns);

But it isn't.  It is:
	find_pid_ns(nr, task_active_pid_ns(current));
Which is:
	find_pid_ns(nr, ns_of_pid(task_pid(current)));

Which is a value that doesn't change when we attach to a pid namespace.

> The current->nsproxy->pid_ns is the one of the namespace we attached to. So the
> real pid returned by the fork does not exist in this pid namespace.
> Maybe fork should return a pid number belonging to the current pid namespace we
> are attached no ?

Do you not have my patch that changed that?

Eric
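
For reference, the fork/exec/wait pattern discussed above can be sketched
as follows, with waitpid() given the pid by value rather than &pid.  This
is only a minimal sketch: run_and_wait() is a hypothetical helper name, not
something from the quoted program or the patch, and it leaves out the
namespace handling entirely.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the quoted program): fork, exec
 * argv[0] in the child, and wait for that child in the parent.
 * Returns the child's exit status, or -1 on error or abnormal exit.
 */
int run_and_wait(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0) {
		perror("fork");
		return -1;
	}

	if (pid == 0) {
		execv(argv[0], argv);
		/* Only reached if execv() failed. */
		perror("execv");
		_exit(127);
	}

	/* waitpid() takes the pid itself, not a pointer to it. */
	if (waitpid(pid, &status, 0) < 0) {
		perror("waitpid");
		return -1;
	}

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument vector such as { "/bin/sh", "-c", "exit 3", NULL }
should hand back the shell's exit status.  The child calls _exit() after a
failed exec so that it does not fall through into the parent's wait path,
which the quoted program would do.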