* setns vs unshare bug
@ 2012-08-10 14:55 Pavel Emelyanov
[not found] ` <502520E8.5040401-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Pavel Emelyanov @ 2012-08-10 14:55 UTC (permalink / raw)
To: Eric W. Biederman, Linux Containers
Hi, Eric!

There's an issue with the setns versus unshare syscalls which I consider
to be worth looking at. Look -- when you open some task's namespace file,
e.g. /proc/<pid>/ns/net, the net namespace is cached on the proc inode.

If the task with pid <pid> later unshares the namespace in question
(in this case -- net ns), subsequent opens of this task's proc ns file
still return the old namespace, and the setns call does not work as
expected. Here's a simple proggie which demonstrates this:
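
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>

/* set_ns: assumed to be a thin wrapper around the setns syscall */
static int set_ns(int fd, int nstype)
{
	return syscall(SYS_setns, fd, nstype);
}

int main(void)
{
	int pid, fd;
	char path[64];

	pid = fork();
	if (!pid) {
		/* child: open the ns file first so the old netns gets cached */
		fd = open("/proc/self/ns/net", O_RDONLY);
		close(fd);
		unshare(CLONE_NEWNET);
		printf("New net:\n");
		system("ip l");
		sleep(1);
	} else {
		/* parent: open the child's ns file after it has unshared */
		sleep(1);
		printf("Old net:\n");
		system("ip l");
		sprintf(path, "/proc/%d/ns/net", pid);
		fd = open(path, O_RDONLY);
		set_ns(fd, CLONE_NEWNET);
		printf("New net 2:\n");
		system("ip l");
	}

	return 0;
}
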
The "else" branch after set_ns expects the net it set to be the new one (and
contain a lo device only), but it's not so -- after the setns syscall the net
namespace isn't changed! If you comment out the "if" branch's open and close
calls (thus avoiding the ns caching) the setns works as expected.
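
One way to check from userspace which namespace a proc ns file actually
refers to, sketched under the assumption of a kernel where the
/proc/<pid>/ns/* entries carry per-namespace inode numbers (this is not
true of every kernel that has setns): two entries naming the same
namespace should report the same st_dev and st_ino. The helper names
below are made up for illustration.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Return 1 if both paths resolve to the same object, 0 if they differ,
 * and -1 if either path cannot be examined.
 *
 * Assumption: for /proc/<pid>/ns/* entries, equal (st_dev, st_ino)
 * means "same namespace".
 */
int same_namespace(const char *path_a, const char *path_b)
{
	struct stat a, b;

	if (stat(path_a, &a) < 0 || stat(path_b, &b) < 0)
		return -1;

	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Hypothetical helper: does the caller share its net namespace with pid? */
int shares_netns_with(pid_t pid)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%d/ns/net", (int)pid);
	return same_namespace("/proc/self/ns/net", path);
}
```

Calling shares_netns_with(pid) in the "else" branch before and after
set_ns would show whether the switch took effect: 1 means the caller and
<pid> are in the same net namespace, 0 means they are not.
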
I assume you're aware of this problem, so do you have plans to fix this?
Thanks,
Pavel
^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <502520E8.5040401-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>]
* Re: setns vs unshare bug
[not found] ` <502520E8.5040401-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
@ 2012-08-10 15:00 ` Serge Hallyn
2012-08-10 15:08 ` Pavel Emelyanov
0 siblings, 1 reply; 5+ messages in thread
From: Serge Hallyn @ 2012-08-10 15:00 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: Linux Containers, Eric W. Biederman

Hi Pavel,

I don't believe this is a bug. The fd is to a specific network
namespace. If the target task later changes his namespace, that
doesn't change the fact that you asked for access to the old
namespace.

You're worried about a race?

-serge
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: setns vs unshare bug
2012-08-10 15:00 ` Serge Hallyn
@ 2012-08-10 15:08 ` Pavel Emelyanov
[not found] ` <502523DF.4040103-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Pavel Emelyanov @ 2012-08-10 15:08 UTC (permalink / raw)
To: Serge Hallyn; +Cc: Linux Containers, Eric W. Biederman

On 08/10/2012 07:00 PM, Serge Hallyn wrote:
> Hi Pavel,
>
> I don't believe this is a bug. The fd is to a specific network
> namespace. If the target task later changes his namespace, that
> doesn't change the fact that you asked for access to the old
> namespace.
>
> You're worried about a race?

No, it's not a race. The proc ns file doesn't reflect the actual state
of a task it belongs to, but instead has some internal state which is
not observable/controllable from the outside. Look at my proggie -- the
"else" branch does expect that setns will bring it into a new net, but
it only does so if proc dcache is empty!

Thanks,
Pavel
^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <502523DF.4040103-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>]
* Re: setns vs unshare bug
[not found] ` <502523DF.4040103-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
@ 2012-08-10 15:17 ` Pavel Emelyanov
[not found] ` <502525FC.3060608-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Pavel Emelyanov @ 2012-08-10 15:17 UTC (permalink / raw)
To: Serge Hallyn; +Cc: Linux Containers, Eric W. Biederman

On 08/10/2012 07:08 PM, Pavel Emelyanov wrote:
> On 08/10/2012 07:00 PM, Serge Hallyn wrote:
>> Hi Pavel,
>>
>> I don't believe this is a bug. The fd is to a specific network
>> namespace. If the target task later changes his namespace, that
>> doesn't change the fact that you asked for access to the old
>> namespace.
>>
>> You're worried about a race?
>
> No, it's not a race. The proc ns file doesn't reflect the actual state
> of a task it belongs to, but instead has some internal state which is
> not observable/controllable from the outside. Look at my proggie -- the
> "else" branch does expect that setns will bring it into a new net, but
> it only does so if proc dcache is empty!

I mean -- I open a task's proc ns file _strictly_ _after_ that task called
unshare, but happen to obtain an _old_ net namespace, because this old netns
was cached on the task's proc file.

Hope this explains better what I'm concerned about.

Thanks,
Pavel
^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <502525FC.3060608-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>]
* Re: setns vs unshare bug
[not found] ` <502525FC.3060608-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
@ 2012-08-10 15:24 ` Serge Hallyn
0 siblings, 0 replies; 5+ messages in thread
From: Serge Hallyn @ 2012-08-10 15:24 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: Linux Containers, Eric W. Biederman

Quoting Pavel Emelyanov (xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org):
> I mean -- I open a task's proc ns file _strictly_ _after_ that task called
> unshare, but happen to obtain an _old_ net namespace, because this old netns
> was cached on the task's proc file.
>
> Hope this explains better what I'm concerned about.

Ooh! Sorry, now I see.

-serge
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-08-10 15:24 UTC | newest]
Thread overview: 5+ messages
2012-08-10 14:55 setns vs unshare bug Pavel Emelyanov
[not found] ` <502520E8.5040401-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-08-10 15:00 ` Serge Hallyn
2012-08-10 15:08 ` Pavel Emelyanov
[not found] ` <502523DF.4040103-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-08-10 15:17 ` Pavel Emelyanov
[not found] ` <502525FC.3060608-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-08-10 15:24 ` Serge Hallyn