netns: Issues with deleting virtual interfaces during namespace cleanup

* netns: Issues with deleting virtual interfaces during namespace cleanup
@ 2011-02-26 16:59 Ward, David - 0663 - MITLL
       [not found] ` <4D69316F.4000606-OVIABD91gjs3uPMLIKxrzw@public.gmane.org>
       [not found] ` <4D697F6A.9000907@free.fr>
  0 siblings, 2 replies; 6+ messages in thread
From: Ward, David - 0663 - MITLL @ 2011-02-26 16:59 UTC (permalink / raw)
  To: Daniel Lezcano, Eric W. Biederman, Pavel Emelyanov


[-- Attachment #1.1: Type: text/plain, Size: 2410 bytes --]

(Apologies for the cross-post, but Thunderbird messed up the formatting 
when I sent this originally, and then I realized I sent it to the wrong 
list.)

A patch was applied to the kernel in November 2008 that deletes virtual 
network interfaces when network namespaces are cleaned up 
(d0c082cea6dfb9b674b4f6e1e84025662dbd24e8).  A discussion about this 
patch took place on this list 
(https://lists.linux-foundation.org/pipermail/containers/2008-October/013460.html), 
where Daniel Lezcano wrote:

 > After discussing with Benjamin, this patch means an user can no longer
 > manage a pool of virtual devices because they will be automatically
 > destroyed when the namespace exits. I don't think it is a big concern,
 > but just in case I am asking :)

I currently have two use cases where this behavior is not desirable:

   1. I use a veth pair device to connect two containers together (as
      opposed to connecting a container to the host).  To do this, I
      create the veth pair device manually in the host with iproute2
      ("ip link add type veth").  Then when I start each container, it
      pulls in one of the interfaces of the veth pair device with
      "lxc.network.type = phys".  When I stop one of the containers, its
      interface to the veth pair device is deleted instead of moved back
      to the host, so I can not just start the stopped container again
      and re-establish the same link.
   2. I start a process in the host that creates a TUN/TAP interface,
      such as a VPN client.  I pull the TUN/TAP interface into the
      container with "lxc.network.type = phys".  When the container
      exits, the TUN/TAP interface is deleted because it is a virtual
      interface, while the VPN client process continues to run in the
      host.  Again I can not just start the container again with the
      same connection; I have to restart the VPN client.

It makes sense that virtual network interfaces that get created inside a 
container should be deleted when the container exits.  However, I feel 
that network interfaces from the host that get assigned to the container 
should be returned to the host when the container exits, whether they 
are physical or virtual.

Can the kernel distinguish between network interfaces that were created 
inside the namespace, and network interfaces that were moved there?
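
The behavior described above can be sketched as a small Python model. It
is a simplified illustration, not kernel code: the class and function
names, device names, and ifindex values are hypothetical, "virtual" stands
in for the device having rtnl_link_ops set, and unmovable devices such as
loopback are ignored. The point it shows is that nothing records where a
device was created, so a virtual device imported from the host is treated
the same as one created inside the container.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    ifindex: int
    virtual: bool  # stands in for "rtnl_link_ops is set" (veth, tun/tap, ...)


@dataclass
class NetNS:
    name: str
    devices: dict = field(default_factory=dict)

    def add(self, dev: Device) -> None:
        self.devices[dev.name] = dev


def move(dev: Device, src: NetNS, dst: NetNS) -> None:
    """Move a device between namespaces; its origin is not recorded anywhere."""
    del src.devices[dev.name]
    dst.add(dev)


def cleanup(ns: NetNS, init_net: NetNS) -> dict:
    """Return {device name: outcome} for a namespace that is exiting."""
    outcome = {}
    for dev in list(ns.devices.values()):
        del ns.devices[dev.name]
        if dev.virtual:
            # Virtual devices are left to the generic cleanup, i.e. deleted.
            outcome[dev.name] = "destroyed"
            continue
        # Other devices are pushed back to init_net; the "dev<ifindex>"
        # fallback name is only used if the current name is already taken.
        original = dev.name
        if dev.name in init_net.devices:
            dev.name = f"dev{dev.ifindex}"
        init_net.add(dev)
        outcome[original] = f"moved to {init_net.name} as {dev.name}"
    return outcome


host = NetNS("init_net")
container = NetNS("container")
tap0 = Device("tap0", ifindex=7, virtual=True)   # created in the host by a VPN client
eth1 = Device("eth1", ifindex=3, virtual=False)  # physical NIC
for dev in (tap0, eth1):
    host.add(dev)
    move(dev, host, container)

result = cleanup(container, host)
print(result)
```

In this model the physical eth1 returns to the host while tap0 is lost,
which matches the second use case.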

David


[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: netns: Issues with deleting virtual interfaces during namespace cleanup
       [not found] ` <4D69316F.4000606-OVIABD91gjs3uPMLIKxrzw@public.gmane.org>
@ 2011-02-26 22:32   ` Daniel Lezcano
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Lezcano @ 2011-02-26 22:32 UTC (permalink / raw)
  To: Ward, David - 0663 - MITLL
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Eric W. Biederman, Lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	Pavel Emelyanov

On 02/26/2011 05:59 PM, Ward, David - 0663 - MITLL wrote:
> (Apologies for the cross-post, but Thunderbird messed up the formatting
> when I sent this originally, and then I realized I sent it to the wrong
> list.)
>
> A patch was applied to the kernel in November 2008 that deletes virtual
> network interfaces when network namespaces are cleaned up
> (d0c082cea6dfb9b674b4f6e1e84025662dbd24e8). A discussion about this
> patch took place on this list
> (https://lists.linux-foundation.org/pipermail/containers/2008-October/013460.html),
> where Daniel Lezcano wrote:
>
>  > After discussing with Benjamin, this patch means an user can no longer
>  > manage a pool of virtual devices because they will be automatically
>  > destroyed when the namespace exits. I don't think it is a big concern,
>  > but just in case I am asking :)
>
> I currently have two use cases where this behavior is not desirable:
>
> 1. I use a veth pair device to connect two containers together (as
> opposed to connecting a container to the host). To do this, I
> create the veth pair device manually in the host with iproute2
> ("ip link add type veth"). Then when I start each container, it
> pulls in one of the interfaces of the veth pair device with
> "lxc.network.type = phys". When I stop one of the containers, its
> interface to the veth pair device is deleted instead of moved back
> to the host, so I can not just start the stopped container again
> and re-establish the same link.

Maybe you can rely on the lxc configuration to do that, assuming you
always create the two containers in the same order.

The first one:

lxc.network.type=veth
lxc.network.veth.pair=vethX

The second one:

lxc.network.type=phys
lxc.network.link=vethX

The drawback is you have to stop / start both of them.


Otherwise, why don't you use the macvlan configuration?

For both containers:

lxc.network.type=macvlan
lxc.network.macvlan.mode=bridge
lxc.network.link=dummy0


> 2. I start a process in the host that creates a TUN/TAP interface,
> such as a VPN client. I pull the TUN/TAP interface into the
> container with "lxc.network.type = phys". When the container
> exits, the TUN/TAP interface is deleted because it is a virtual
> interface, while the VPN client process continues to run in the
> host. Again I can not just start the container again with the
> same connection; I have to restart the VPN client.
>
> It makes sense that virtual network interfaces that get created inside a
> container should be deleted when the container exits. However, I feel
> that network interfaces from the host that get assigned to the container
> should be returned to the host when the container exits, whether they
> are physical or virtual.

Wouldn't it make sense to add a configuration option for lxc to create
such a device and handle the VPN client?

There is the lxc.network.script.up option, from which you can launch your
VPN client. If the tun/tap interface were added as a network type, lxc
would create it for you and, once it is up, invoke the up script that
launches the VPN client.

The lxc.network.script.down option does not exist yet, but it would be
quite easy to add.

What do you think?

> Can the kernel distinguish between network interfaces that were created
> inside the namespace, and network interfaces that were moved there?

IMHO that will add more complexity to the network namespace code,
especially to handle nested namespaces. Furthermore, it will impact the
current design. I am not really in favor of it, as that was the initial
behavior and there were limitations.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: netns: Issues with deleting virtual interfaces during namespace cleanup
       [not found]   ` <4D697F6A.9000907-GANU6spQydw@public.gmane.org>
@ 2011-02-27  5:16     ` Renato Westphal
       [not found]       ` <AANLkTinQQHKiujHNet07kbK5eqYvp6-2iBnn27v2-85+-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
       [not found]       ` <4D6A1726.1010400@free.fr>
  0 siblings, 2 replies; 6+ messages in thread
From: Renato Westphal @ 2011-02-27  5:16 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Pavel Emelyanov, Ward, David - 0663 - MITLL,
	Lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Eric W. Biederman

[-- Attachment #1: Type: text/plain, Size: 4497 bytes --]

Hello David,

You may try the patch below (kernel v2.6.35) and see if that helps. It
basically does what you asked for: during namespace cleanup, move back the
virtual interfaces to their original namespaces. I did some tests with veth
pairs and nested netns's and everything worked fine.

I think this should be the default behaviour. I would like it if someone
could review/fix this patch and push it upstream.

Have a good day,
Renato.




-- 
Renato Westphal

[-- Attachment #2: commit-4b938c0.patch --]
[-- Type: text/x-patch, Size: 2144 bytes --]

commit 4b938c007d9a20d7ee6753083d7a9c6b1f098671
Author: Renato Westphal <rwestphal@inf.ufrgs.br>
Date:   Sun Feb 27 02:07:56 2011 -0300

    netns: Preserve imported virtual interfaces during namespace cleanup

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b21e405..7cce799 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1019,6 +1019,8 @@ struct net_device {
 #ifdef CONFIG_NET_NS
 	/* Network namespace this network device is inside */
 	struct net		*nd_net;
+	/* Initial network namespace of this network device */
+	struct net		*nd_init_net;
 #endif
 
 	/* mid-layer private */
diff --git a/net/core/dev.c b/net/core/dev.c
index f3a24c4..16d9bc4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5830,6 +5830,7 @@ static struct pernet_operations __net_initdata netdev_net_ops = {
 static void __net_exit default_device_exit(struct net *net)
 {
 	struct net_device *dev, *aux;
+	struct net *dest_net;
 	/*
 	 * Push all migratable network devices back to the
 	 * initial network namespace
@@ -5844,12 +5845,13 @@ static void __net_exit default_device_exit(struct net *net)
 			continue;
 
 		/* Leave virtual devices for the generic cleanup */
-		if (dev->rtnl_link_ops)
+		if (dev->rtnl_link_ops && dev->nd_net == dev->nd_init_net)
 			continue;
 
 		/* Push remaing network devices to init_net */
+		dest_net = dev->rtnl_link_ops ? dev->nd_init_net : &init_net;
 		snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
-		err = dev_change_net_namespace(dev, &init_net, fb_name);
+		err = dev_change_net_namespace(dev, dest_net, fb_name);
 		if (err) {
 			printk(KERN_EMERG "%s: failed to move %s to init_net: %d\n",
 				__func__, dev->name, err);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 19bedd5..b2e3155 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1394,6 +1394,7 @@ struct net_device *rtnl_create_link(struct net *src_net, struct net *net,
 		goto err;
 
 	dev_net_set(dev, net);
+	dev->nd_init_net = dev_net(dev);
 	dev->rtnl_link_ops = ops;
 	dev->rtnl_link_state = RTNL_LINK_INITIALIZING;
 	dev->real_num_tx_queues = real_num_queues;
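
The decision logic of the patch above can be modeled in a few lines of
Python. This is a sketch under stated assumptions rather than a
translation of the kernel code: the classes and device names are
hypothetical, "ns" and "init_ns" stand in for nd_net and the new
nd_init_net field, and only devices created through rtnl_create_link() are
covered, since that is the only place the patch sets nd_init_net.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NetNS:
    name: str
    devices: dict = field(default_factory=dict)


@dataclass
class Device:
    name: str
    virtual: bool                     # models dev->rtnl_link_ops being set
    ns: NetNS                         # models dev->nd_net
    init_ns: Optional[NetNS] = None   # models the new dev->nd_init_net


def create_link(name: str, ns: NetNS) -> Device:
    """Models rtnl_create_link(): remember the creating namespace."""
    dev = Device(name, virtual=True, ns=ns, init_ns=ns)
    ns.devices[name] = dev
    return dev


def move(dev: Device, dst: NetNS) -> None:
    del dev.ns.devices[dev.name]
    dst.devices[dev.name] = dev
    dev.ns = dst


def patched_cleanup(ns: NetNS, init_net: NetNS) -> dict:
    """Return {device name: outcome} under the patched semantics."""
    outcome = {}
    for dev in list(ns.devices.values()):
        if dev.virtual and dev.ns is dev.init_ns:
            # Created in this namespace: still left to the generic cleanup.
            del ns.devices[dev.name]
            outcome[dev.name] = "destroyed"
            continue
        # Imported virtual devices go back to their namespace of origin;
        # everything else goes to init_net as before.
        dest = dev.init_ns if dev.virtual else init_net
        move(dev, dest)
        outcome[dev.name] = f"moved to {dest.name}"
    return outcome


host = NetNS("init_net")
c1 = NetNS("container1")
veth_a = create_link("vethA", host)   # created in the host, then imported
move(veth_a, c1)
create_link("inner0", c1)             # created inside the container

result = patched_cleanup(c1, host)
print(result)
```

Note that the model, like the patch, assumes the namespace of origin still
exists when the device is sent back.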


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: netns: Issues with deleting virtual interfaces during namespace cleanup
       [not found]       ` <AANLkTinQQHKiujHNet07kbK5eqYvp6-2iBnn27v2-85+-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-02-27  9:02         ` Eric W. Biederman
  2011-02-27  9:19         ` Daniel Lezcano
  1 sibling, 0 replies; 6+ messages in thread
From: Eric W. Biederman @ 2011-02-27  9:02 UTC (permalink / raw)
  To: Renato Westphal
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Pavel Emelyanov, Lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	Ward, David - 0663 - MITLL

Renato Westphal <renatowestphal-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Hello David,
>
> You may try the patch below (kernel v2.6.35) and see if that helps. It
> basically does what you asked for: during namespace cleanup, move back the
> virtual interfaces to their original namespaces. I did some tests with veth
> pairs and nested netns's and everything worked fine.
>
> I think this should be the default behaviour, I would like if someone could
> review/fix this patch and push it upstream.

I think this approach of pushing virtual network devices back where they
came from is a bad idea.  All of the desired benefits can be obtained by
using an extra veth pair and ethernet bridging.  The current semantics
make it difficult to leak virtual network devices by accident.  The
suggested patch fails hard when the originating network namespace exits
before the target network namespace, and I would contend that is a
fundamentally hard problem and will lead to complicated code.  Finally I
don't see what is gained by changing the current semantics.
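
One possible reading of the "extra veth pair and ethernet bridging"
suggestion is sketched below: each container gets its own veth pair, and
the host ends are attached to a bridge that outlives any single container.
The Python only builds the iproute2 command strings and does not run them
(running them needs root and a current iproute2). The function names, the
bridge name br0, the interface naming scheme, and the container labels and
PIDs are all assumptions made for illustration.

```python
def attach_commands(bridge: str, label: str, netns: str) -> list:
    """Commands to give one container its own veth pair, bridged in the host.

    `netns` is whatever `ip link set ... netns` accepts: the PID of a
    process in the target namespace, or a name registered with `ip netns`.
    """
    host_end, inner_end = f"veth-{label}", f"{label}-eth0"
    return [
        f"ip link add {host_end} type veth peer name {inner_end}",
        f"ip link set dev {inner_end} netns {netns}",
        f"ip link set dev {host_end} master {bridge}",
        f"ip link set dev {host_end} up",
    ]


def bridged_link_commands(bridge: str, containers: dict) -> list:
    """Commands to create the bridge once and attach every container to it."""
    cmds = [
        f"ip link add name {bridge} type bridge",
        f"ip link set dev {bridge} up",
    ]
    for label, netns in containers.items():
        cmds += attach_commands(bridge, label, netns)
    return cmds


# Hypothetical example: two containers identified by the PIDs of their init.
cmds = bridged_link_commands("br0", {"c1": "1234", "c2": "5678"})
for line in cmds:
    print(line)
```

With this layout, stopping one container destroys only that container's
veth pair. The bridge and the other container's link stay in place, and
restarting the container only requires running attach_commands() for it
again.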

Eric

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: netns: Issues with deleting virtual interfaces during namespace cleanup
       [not found]       ` <AANLkTinQQHKiujHNet07kbK5eqYvp6-2iBnn27v2-85+-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2011-02-27  9:02         ` Eric W. Biederman
@ 2011-02-27  9:19         ` Daniel Lezcano
  1 sibling, 0 replies; 6+ messages in thread
From: Daniel Lezcano @ 2011-02-27  9:19 UTC (permalink / raw)
  To: Renato Westphal
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Pavel Emelyanov, Ward, David - 0663 - MITLL,
	Lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Eric W. Biederman

On 02/27/2011 06:16 AM, Renato Westphal wrote:
> Hello David,
>
> You may try the patch below (kernel v2.6.35) and see if that helps. It
> basically does what you asked for: during namespace cleanup, move back the
> virtual interfaces to their original namespaces. I did some tests with veth
> pairs and nested netns's and everything worked fine.
>
> I think this should be the default behaviour, I would like if someone could
> review/fix this patch and push it upstream.

I don't think you should modify this. The automatic destruction behavior
has been in place for a couple of years now, and userspace components
rely on it.

Moreover, that will add extra complexity to the kernel, especially with
nested namespaces. For example, suppose netns1 and netns2 are created,
where netns2 is a child of netns1. You create a device in netns1, move it
to netns2, and then netns1 exits. What happens to the device in netns2
when netns2 is destroyed? You have to track the network namespace life
cycle to keep the device's namespace of origin consistent and to decide
whether that namespace is dead or not.
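
The scenario above can be replayed with a small Python model. It is only
an illustration with hypothetical names: "origin" stands in for the
nd_init_net pointer from the proposed patch, which is stored without
taking a reference on the namespace, and the "alive" flag stands in for
the namespace life cycle tracking that would be needed.

```python
class NetNS:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.devices = {}


class Device:
    def __init__(self, name, origin):
        self.name = name
        self.origin = origin  # bare pointer to the creating namespace
        origin.devices[name] = self


def move(dev, src, dst):
    del src.devices[dev.name]
    dst.devices[dev.name] = dev


def exit_ns(ns):
    """Namespace exit under 'return to origin' semantics (simplified)."""
    problems = []
    for dev in list(ns.devices.values()):
        del ns.devices[dev.name]
        if dev.origin is ns:
            continue  # created here: destroyed with the namespace
        if not dev.origin.alive:
            # In the kernel this would be a stale pointer, not a clean error.
            problems.append(f"{dev.name}: origin {dev.origin.name} is already gone")
            continue
        dev.origin.devices[dev.name] = dev
    ns.alive = False
    return problems


netns1, netns2 = NetNS("netns1"), NetNS("netns2")
dev = Device("veth0", origin=netns1)
move(dev, netns1, netns2)

first = exit_ns(netns1)   # veth0 lives in netns2 by now, nothing to clean up
second = exit_ns(netns2)  # veth0 has no valid namespace to return to
print(first, second)
```

The second exit is the case that needs a policy decision (destroy the
device, or fall back to some other namespace), which is the extra
complexity being objected to.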

No, really, I am not in favor of that.

However, you could provide an interface to the device, e.g. a sysfs
attribute, to flag it as non-destroyable-at-exit, so that it is kept
untouched and moved back to the init_net_ns.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: netns: Issues with deleting virtual interfaces during namespace cleanup
       [not found]         ` <4D6A1726.1010400-GANU6spQydw@public.gmane.org>
@ 2011-02-27 15:28           ` Renato Westphal
  0 siblings, 0 replies; 6+ messages in thread
From: Renato Westphal @ 2011-02-27 15:28 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Pavel Emelyanov, Ward, David - 0663 - MITLL,
	Lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Eric W. Biederman

Daniel/Eric,

You're completely right. This patch adds more problems than it solves.

I have a problem similar to that of David, but now I'm convinced that
it is better to deal with it with the userspace tools.

Renato.




-- 
Renato Westphal

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-02-27 15:28 UTC | newest]

