Linux Container Development
* setns vs unshare bug
From: Pavel Emelyanov @ 2012-08-10 14:55 UTC
  To: Eric W. Biederman, Linux Containers

Hi, Eric!

There's an issue with how the setns and unshare syscalls interact which I
consider worth looking at. When you open some task's namespace file,
e.g. /proc/<pid>/ns/net, the net namespace is cached on the proc inode.

If the task with pid <pid> later unshares the namespace in question
(in this case the net ns), subsequent opens of this task's proc ns
file still return the old namespace, and the setns call does not
work as expected. Here's a simple proggie which demonstrates this:

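#define _GNU_SOURCE	/* for unshare() and setns() */
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	int pid, fd;
	char path[64];

	pid = fork();
	if (!pid) {
		/* Child: open and close our own ns file, which caches the netns */
		fd = open("/proc/self/ns/net", O_RDONLY);
		close(fd);
		/* Move the child into a fresh net namespace (needs CAP_SYS_ADMIN) */
		unshare(CLONE_NEWNET);
		printf("New net:\n");
		system("ip l");
		sleep(1);
	} else {
		/* Parent: give the child time to unshare */
		sleep(1);
		printf("Old net:\n");
		system("ip l");
		/* Open the child's ns file after its unshare */
		snprintf(path, sizeof(path), "/proc/%d/ns/net", pid);
		fd = open(path, O_RDONLY);
		/* setns() has a glibc wrapper since glibc 2.14 */
		setns(fd, CLONE_NEWNET);
		printf("New net 2:\n");
		system("ip l");
	}

	return 0;
}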

The "else" branch expects the net namespace it enters via setns to be the
new one (containing only a lo device), but it's not so: after the setns
syscall the net namespace isn't changed! If you comment out the open and
close calls in the "if" branch (thus avoiding the ns caching), setns works
as expected.
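
One way to see which namespace a proc ns file resolves to, without
relying on the "ip l" output, is to read the file's link target. The
following is only a minimal sketch. It assumes a kernel on which the
/proc/<pid>/ns/* entries are symbolic links of the form "net:[<inode>]"
(Linux 3.8 and later, which is newer than this thread), and the helper
name ns_identity is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the link target of a proc ns file,
 * e.g. "net:[4026531992]", into buf as a NUL-terminated string.
 * Returns 0 on success, -1 if readlink() fails.
 * Assumes the ns files are symbolic links (Linux 3.8 and later).
 */
int ns_identity(const char *path, char *buf, size_t len)
{
	ssize_t n;

	assert(path && buf && len > 1);
	/* Leave room for the terminator: readlink() does not add one */
	n = readlink(path, buf, len - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return 0;
}
```

Calling this on /proc/<pid>/ns/net and on /proc/self/ns/net and getting
identical strings means both files resolve to the same namespace.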

I assume you're aware of this problem, so do you have plans to fix this?

Thanks,
Pavel


* Re: setns vs unshare bug
From: Serge Hallyn @ 2012-08-10 15:00 UTC
  To: Pavel Emelyanov; +Cc: Linux Containers, Eric W. Biederman

Hi Pavel,

I don't believe this is a bug.  The fd refers to a specific network
namespace.  If the target task later changes its namespace, that
doesn't change the fact that you asked for access to the old
namespace.
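
The idea that a descriptor names one specific namespace can be checked
from user space. This is a rough sketch, not anything from the kernel
sources: it assumes a kernel (3.8 or later) on which two descriptors for
the same namespace report the same st_dev and st_ino from fstat(), as
described in namespaces(7). The helper names open_ns_fd and
same_namespace are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: open /proc/<pid>/ns/<name>; returns an fd or -1 */
int open_ns_fd(pid_t pid, const char *name)
{
	char path[64];

	assert(name);
	snprintf(path, sizeof(path), "/proc/%d/ns/%s", (int)pid, name);
	return open(path, O_RDONLY);
}

/*
 * Hypothetical helper: returns 1 if both descriptors refer to the same
 * object (same device ID and inode number), 0 if they differ, and -1 if
 * fstat() fails on either of them.
 */
int same_namespace(int fd_a, int fd_b)
{
	struct stat a, b;

	if (fstat(fd_a, &a) < 0 || fstat(fd_b, &b) < 0)
		return -1;
	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

A descriptor obtained with open_ns_fd(pid, "net") before the target
unshares, compared against one obtained afterwards, would be expected to
yield 0 as long as the proc file tracks the task's current namespace.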

You're worried about a race?

-serge



* Re: setns vs unshare bug
From: Pavel Emelyanov @ 2012-08-10 15:08 UTC
  To: Serge Hallyn; +Cc: Linux Containers, Eric W. Biederman

On 08/10/2012 07:00 PM, Serge Hallyn wrote:
> Hi Pavel,
> 
> I don't believe this is a bug.  The fd is to a specific network
> namespace.  If the target task later changes his namespace, that
> doesn't change the fact that you asked for access to the old
> namespace.
> 
> You're worried about a race?

No, it's not a race. The proc ns file doesn't reflect the actual state
of the task it belongs to; instead it has some internal state which is
not observable or controllable from the outside. Look at my proggie -- the
"else" branch expects that setns will bring it into the new net, but
it only does so if the proc dcache is empty!

Thanks,
Pavel



* Re: setns vs unshare bug
From: Pavel Emelyanov @ 2012-08-10 15:17 UTC
  To: Serge Hallyn; +Cc: Linux Containers, Eric W. Biederman

On 08/10/2012 07:08 PM, Pavel Emelyanov wrote:
> On 08/10/2012 07:00 PM, Serge Hallyn wrote:
>> Hi Pavel,
>>
>> I don't believe this is a bug.  The fd is to a specific network
>> namespace.  If the target task later changes his namespace, that
>> doesn't change the fact that you asked for access to the old
>> namespace.
>>
>> You're worried about a race?
> 
> No, it's not a race. The proc ns file doesn't reflect the actual state
> of a task it belongs to, but instead has some internal state which is
> not observable/controllable from the outside. Look at my proggie -- the
> "else" branch expects that setns will bring it into a new net, but
> it only does so if proc dcache is empty!

I mean -- I open a task's proc ns file _strictly_ _after_ that task called
unshare, but happen to obtain an _old_ net namespace, because this old netns
was cached on the task's proc file.

Hope this explains better what I'm concerned about.

Thanks,
Pavel


* Re: setns vs unshare bug
From: Serge Hallyn @ 2012-08-10 15:24 UTC
  To: Pavel Emelyanov; +Cc: Linux Containers, Eric W. Biederman

Quoting Pavel Emelyanov (xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org):
> On 08/10/2012 07:08 PM, Pavel Emelyanov wrote:
> > On 08/10/2012 07:00 PM, Serge Hallyn wrote:
> >> Hi Pavel,
> >>
> >> I don't believe this is a bug.  The fd is to a specific network
> >> namespace.  If the target task later changes his namespace, that
> >> doesn't change the fact that you asked for access to the old
> >> namespace.
> >>
> >> You're worried about a race?
> > 
> > No, it's not a race. The proc ns file doesn't reflect the actual state
> > of a task it belongs to, but instead has some internal state which is
> > not observable/controllable from the outside. Look at my proggie -- the
> > "else" branch expects that setns will bring it into a new net, but
> > it only does so if proc dcache is empty!
> 
> I mean -- I open a task's proc ns file _strictly_ _after_ that task called
> unshare, but happen to obtain an _old_ net namespace, because this old netns
> was cached on the task's proc file.
> 
> Hope this explains better what I'm concerned about.

Ooh!  Sorry, now I see.

-serge
