From mboxrd@z Thu Jan 1 00:00:00 1970
From: Serge Hallyn
Subject: Re: setns vs unshare bug
Date: Fri, 10 Aug 2012 10:00:46 -0500
Message-ID: <20120810150046.GA20449@sergelap>
References: <502520E8.5040401@parallels.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <502520E8.5040401-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: Pavel Emelyanov
Cc: Linux Containers, "Eric W. Biederman"
List-Id: containers.vger.kernel.org

Hi Pavel,

I don't believe this is a bug. The fd is to a specific network namespace.
If the target task later changes his namespace, that doesn't change the
fact that you asked for access to the old namespace.

You're worried about a race?

-serge

Quoting Pavel Emelyanov (xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org):
> Hi, Eric!
>
> There's an issue with setns versus unshare syscall which I consider
> to be worth looking at. Look -- when you open some task's namespace file,
> e.g. /proc//ns/net, the net namespace is cached on the proc inode.
>
> If later the task with the pid unshares the namespace in question
> (in this case -- net ns) the subsequent openings of this task's proc ns
> file will result in old namespace obtained and the setns call will not
> work as expected.
> Here's a simple proggie which demonstrates this:
>
> int main(void)
> {
> 	int pid, fd;
> 	char path[64];
>
> 	pid = fork();
> 	if (!pid) {
> 		fd = open("/proc/self/ns/net", O_RDONLY);
> 		close(fd);
> 		unshare(CLONE_NEWNET);
> 		printf("New net:\n");
> 		system("ip l");
> 		sleep(1);
> 	} else {
> 		sleep(1);
> 		printf("Old net:\n");
> 		system("ip l");
> 		sprintf(path, "/proc/%d/ns/net", pid);
> 		fd = open(path, O_RDONLY);
> 		set_ns(fd, CLONE_NEWNET);
> 		printf("New net 2:\n");
> 		system("ip l");
> 	}
>
> 	return 0;
> }
>
> The "else" branch after set_ns expects the net it set to be the new one (and
> contain a lo device only), but it's not so -- after the setns syscall the net
> namespace isn't changed! If you comment out the "if" branch's open and close
> calls (thus avoiding the ns caching) the setns works as expected.
>
> I assume you're aware of this problem, so do you have plans to fix this?
>
> Thanks,
> Pavel
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers
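
For readers following the thread, here is a minimal sketch in C of one way
to check whether two namespace file descriptors refer to the same namespace,
which is the distinction drawn above between "the fd" and "the task". It
assumes a kernel (3.8 or later, which postdates this thread) where the
/proc/<pid>/ns/* entries carry device and inode numbers that identify the
namespace. The helper names same_namespace() and open_net_ns() are
hypothetical and do not come from the original mails.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>	/* close(), getpid() for callers */

/*
 * Hypothetical helper: compare two namespace fds.
 * Returns 1 if both refer to the same namespace, 0 if they differ,
 * and -1 if either fstat() fails.
 * Assumption: (st_dev, st_ino) identifies the namespace (Linux >= 3.8).
 */
static int same_namespace(int fd_a, int fd_b)
{
	struct stat a, b;

	if (fstat(fd_a, &a) == -1 || fstat(fd_b, &b) == -1)
		return -1;

	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/*
 * Hypothetical helper: open the net namespace file of a task.
 * Returns an fd, or -1 with errno set by open().
 */
static int open_net_ns(pid_t pid)
{
	char path[64];
	int n;

	n = snprintf(path, sizeof(path), "/proc/%d/ns/net", (int)pid);
	assert(n > 0 && (size_t)n < sizeof(path));

	return open(path, O_RDONLY);
}
```

A caller could open the target's net namespace file once before and once
after the target calls unshare(CLONE_NEWNET), then pass both descriptors to
same_namespace(). A result of 0 would suggest that the task has moved on
while the first fd still names the old namespace.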