* uevent when moving nic between network namespaces?
From: Serge Hallyn @ 2012-10-12 3:13 UTC
To: ebiederm-aS9lmoZGLiVWk0Htik3J/w, Dan Kegel, Stéphane Graber,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Daniel Lezcano,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Hi,
Dan Kegel (cc:d) found an interesting nuisance relating to upstart
and network interfaces with lxc containers. In particular, when you
start a container, two veths are created. A uevent for their creation
is sent, and so a 'network-interface' upstart job is created for each.
One of the veths is passed into the container. When the container
shuts down, the veth in the init-net-ns gets a net-device-removed
uevent, so the network-interface upstart job goes away. But the veth
in the container doesn't cause a net-device-removed upstart uevent
to be sent. So its network-interface upstart job sticks around.
The details are at:
https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589
I notice that when simply renaming a netdev (sudo ip link set veth1 name
veth2) then udevadm monitor shows:

    KERNEL[17945.234850] move /devices/virtual/net/veth2 (net)
    UDEV [17945.235758] move /devices/virtual/net/veth2 (net)

but when I do 'sudo ip link set veth2 netns 27689' then 'udevadm
monitor' shows nothing.

When I do

    sudo ip link set veth1 netns 32296
    (in process 32296) sudo ip link set veth1 name veth2

then, again udevadm monitor shows nothing.
So the question is, should the kernel be sending uevents for
net-device-removed and then net-device-added when a nic is moved
between network namespaces? Or should lxc just fake that?
-serge
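
For readers who want to watch these events without udevadm: a kernel
uevent, as delivered on a NETLINK_KOBJECT_UEVENT socket, begins with an
"ACTION@DEVPATH" string (for example "move@/devices/virtual/net/veth2").
The following is a minimal sketch, not code from this thread, of how a
listener might split that header and pick out events under the virtual
net device tree. The struct and function names are hypothetical, and the
socket handling itself is left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the two fields of a uevent header. */
struct uevent_hdr {
	char action[16];
	char devpath[256];
};

/*
 * Parse the leading "ACTION@DEVPATH" string of a kernel uevent message.
 * Returns 0 on success, -1 if the header is malformed or does not fit.
 */
static int parse_uevent_hdr(const char *msg, struct uevent_hdr *out)
{
	const char *at;
	size_t alen, plen;

	assert(msg != NULL && out != NULL);

	at = strchr(msg, '@');
	if (at == NULL)
		return -1;

	alen = (size_t)(at - msg);
	plen = strlen(at + 1);
	if (alen == 0 || alen >= sizeof(out->action) ||
	    plen == 0 || plen >= sizeof(out->devpath))
		return -1;

	memcpy(out->action, msg, alen);
	out->action[alen] = '\0';
	memcpy(out->devpath, at + 1, plen + 1); /* includes the trailing NUL */
	return 0;
}

/* True if the devpath lies under the virtual net device tree (veth etc.). */
static int is_virtual_netdev(const struct uevent_hdr *ev)
{
	static const char prefix[] = "/devices/virtual/net/";

	return strncmp(ev->devpath, prefix, sizeof(prefix) - 1) == 0;
}
```

A listener built on this would only see the events delivered to the
network namespace its socket was opened in, which is why the choice of
where to run the monitor matters in the experiments above.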
* Re: uevent when moving nic between network namespaces?
From: Eric W. Biederman @ 2012-10-12 3:26 UTC
To: Serge Hallyn
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stéphane Graber, Daniel Lezcano,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Kegel
Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
> Hi,
>
> Dan Kegel (cc:d) found an interesting nuisance relating to upstart
> and network interfaces with lxc containers. In particular, when you
> start a container, two veths are created. A uevent for their creation
> is sent, and so a 'network-interface' upstart job is created for each.
> One of the veths is passed into the container. When the container
> shuts down, the veth in the init-net-ns gets a net-device-removed
> uevent, so the network-interface upstart job goes away. But the veth
> in the container doesn't cause a net-device-removed upstart uevent
> to be sent. So its network-interface upstart job sticks around.
>
> The details are at:
>
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589
>
> I notice that when simply renaming a netdev (sudo ip link set veth1 name
> veth2) then udevadm monitor shows:
>
> KERNEL[17945.234850] move /devices/virtual/net/veth2 (net)
> UDEV [17945.235758] move /devices/virtual/net/veth2 (net)
>
> but when I do 'sudo ip link set veth2 netns 27689' then 'udevadm
> monitor' shows nothing.
>
> When I do
>
> sudo ip link set veth1 netns 32296
> (in process 32296) sudo ip link set veth1 name veth2
>
> then, again udevadm monitor shows nothing.
>
> So the question is, should the kernel be sending uevents for
> net-device-removed and then net-device-added when a nic is moved
> between network namespaces? Or should lxc just fake that?
To the best of my memory I wired up those events, and they should be
delivered. Now the uevents will only be delivered in the relevant
network namespace.
Hmm. But the relevant code in the kernel is device_rename, and it
happens after we switch the network namespace on the device.
Which probably means that in practice only the new network namespace is
seeing uevents.
Grr.
Eric
* Re: uevent when moving nic between network namespaces?
From: Serge Hallyn @ 2012-10-12 19:18 UTC
To: Eric W. Biederman
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stéphane Graber, Daniel Lezcano,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Kegel
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
>
> > Hi,
> >
> > Dan Kegel (cc:d) found an interesting nuisance relating to upstart
> > and network interfaces with lxc containers. In particular, when you
> > start a container, two veths are created. A uevent for their creation
> > is sent, and so a 'network-interface' upstart job is created for each.
> > One of the veths is passed into the container. When the container
> > shuts down, the veth in the init-net-ns gets a net-device-removed
> > uevent, so the network-interface upstart job goes away. But the veth
> > in the container doesn't cause a net-device-removed upstart uevent
> > to be sent. So its network-interface upstart job sticks around.
> >
> > The details are at:
> >
> > https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589
> >
> > I notice that when simply renaming a netdev (sudo ip link set veth1 name
> > veth2) then udevadm monitor shows:
> >
> > KERNEL[17945.234850] move /devices/virtual/net/veth2 (net)
> > UDEV [17945.235758] move /devices/virtual/net/veth2 (net)
> >
> > but when I do 'sudo ip link set veth2 netns 27689' then 'udevadm
> > monitor' shows nothing.
> >
> > When I do
> >
> > sudo ip link set veth1 netns 32296
> > (in process 32296) sudo ip link set veth1 name veth2
> >
> > then, again udevadm monitor shows nothing.
> >
> > So the question is, should the kernel be sending uevents for
> > net-device-removed and then net-device-added when a nic is moved
> > between network namespaces? Or should lxc just fake that?
>
> To the best of my memory I wired up those events, and they should be
> delivered. Now the uevents will only be delivered in the relevant
> network namespace.
>
> Hmm. But the relevant code in the kernel is device_rename, and it
> happens after we switch the network namespace on the device.
>
> Which probably means that in practice only the new network namespace is
> seeing uevents.
>
> Grr.
Ah, indeed. A few more experiments show that:

1. 'sudo ip link add type veth' on the host ends up with some kernel
   messages, namely

    KERNEL[389.393581] add /devices/virtual/net/veth1/queues/rx-0 (queues)
    KERNEL[389.394953] add /devices/virtual/net/veth1/queues/tx-0 (queues)

   sent to all namespaces - though the

    UDEV [389.405255] add /devices/virtual/net/veth1 (net)

   only gets sent to the initial namespace.

2. Then when I 'sudo ip link set veth1 netns <pid-in-container>', I get

    KERNEL[405.041296] move /devices/virtual/net/veth2 (net)

   only in the container's namespace - exactly as you said above should
   happen.

Eric, are you working on a patch for this? Should we just explicitly
add a remove uevent before doing the transition, or is it more
complicated than that?
thanks,
-serge
* Re: uevent when moving nic between network namespaces?
From: Eric W. Biederman @ 2012-10-12 19:38 UTC
To: Serge Hallyn
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stéphane Graber, Daniel Lezcano,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Kegel
Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
> Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>> Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
>>
>> > Hi,
>> >
>> > Dan Kegel (cc:d) found an interesting nuisance relating to upstart
>> > and network interfaces with lxc containers. In particular, when you
>> > start a container, two veths are created. A uevent for their creation
>> > is sent, and so a 'network-interface' upstart job is created for each.
>> > One of the veths is passed into the container. When the container
>> > shuts down, the veth in the init-net-ns gets a net-device-removed
>> > uevent, so the network-interface upstart job goes away. But the veth
>> > in the container doesn't cause a net-device-removed upstart uevent
>> > to be sent. So its network-interface upstart job sticks around.
>> >
>> > The details are at:
>> >
>> > https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589
>> >
>> > I notice that when simply renaming a netdev (sudo ip link set veth1 name
>> > veth2) then udevadm monitor shows:
>> >
>> > KERNEL[17945.234850] move /devices/virtual/net/veth2 (net)
>> > UDEV [17945.235758] move /devices/virtual/net/veth2 (net)
>> >
>> > but when I do 'sudo ip link set veth2 netns 27689' then 'udevadm
>> > monitor' shows nothing.
>> >
>> > When I do
>> >
>> > sudo ip link set veth1 netns 32296
>> > (in process 32296) sudo ip link set veth1 name veth2
>> >
>> > then, again udevadm monitor shows nothing.
>> >
>> > So the question is, should the kernel be sending uevents for
>> > net-device-removed and then net-device-added when a nic is moved
>> > between network namespaces? Or should lxc just fake that?
>>
>> To the best of my memory I wired up those events, and they should be
>> delivered. Now the uevents will only be delivered in the relevant
>> network namespace.
>>
>> Hmm. But the relevant code in the kernel is device_rename, and it
>> happens after we switch the network namespace on the device.
>>
>> Which probably means that in practice only the new network namespace is
>> seeing uevents.
>>
>> Grr.
>
> Ah, indeed. A few more experiments show that:
>
> 1. 'sudo ip link add type veth' on the host ends up with some kernel
> messages, namely
>
> KERNEL[389.393581] add /devices/virtual/net/veth1/queues/rx-0 (queues)
> KERNEL[389.394953] add /devices/virtual/net/veth1/queues/tx-0 (queues)
>
> sent to all namespaces - though the
Yes. The queue uevents are not currently network namespace aware. That
is a bug I would be happy to see fixed.
> UDEV [389.405255] add /devices/virtual/net/veth1 (net)
>
> only gets sent to the initial namespace.
>
> 2. Then when I 'sudo ip link set veth1 netns <pid-in-container>', I get
>
> KERNEL[405.041296] move /devices/virtual/net/veth2 (net)
>
> only in the container's namespace - exactly as you said above should
> happen.
>
> Eric, are you working on a patch for this? Should we just explicitly
> add a remove uevent before doing the transition, or is it more
> complicated than that?
I am not currently working on a patch for this, but I will be happy to
review one. At a quick glance it looks like this could just be as
simple as calling kobject_uevent at the proper time, but testing and
reading through the relevant code paths is probably a good idea as there
always seems to be gotchas in that code.
Eric
* Re: uevent when moving nic between network namespaces?
From: Serge Hallyn @ 2012-10-12 21:56 UTC
To: Eric W. Biederman
Cc: Daniel Lezcano,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stefan Bader, Stéphane Graber, Dan Kegel,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> I am not currently working on a patch for this, but I will be happy to
> review one. At a quick glance it looks like this could just be as
> simple as calling kobject_uevent at the proper time, but testing and
> reading through the relevant code paths is probably a good idea as there
> always seems to be gotchas in that code.
>
> Eric
This (the simple fix) works for me, actually.
I do notice the ifdef shouldn't be needed, all the better.
From b436802aa8ae80f699b3d7bcf584d3e86af7355a Mon Sep 17 00:00:00 2001
From: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Date: Fri, 12 Oct 2012 21:42:05 +0100
Subject: [PATCH 1/1] dev_change_net_namespace: send a KOBJ_REMOVE to
original netns
Signed-off-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
net/core/dev.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index e2215ee..8062a5a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6172,6 +6172,10 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 	dev_uc_flush(dev);
 	dev_mc_flush(dev);

+	/* Send a netdev-removed uevent to the old namespace */
+#ifdef CONFIG_HOTPLUG
+	kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE);
+#endif
 	/* Actually switch the network namespace */
 	dev_net_set(dev, net);
--
1.7.9.5
* Re: uevent when moving nic between network namespaces?
From: Eric W. Biederman @ 2012-10-12 22:08 UTC
To: Serge Hallyn
Cc: Daniel Lezcano,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stefan Bader, Stéphane Graber, Dan Kegel,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
> Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>> I am not currently working on a patch for this, but I will be happy to
>> review one. At a quick glance it looks like this could just be as
>> simple as calling kobject_uevent at the proper time, but testing and
>> reading through the relevant code paths is probably a good idea as there
>> always seems to be gotchas in that code.
>>
>> Eric
>
> This (the simple fix) works for me, actually.
>
> I do notice the ifdef shouldn't be needed, all the better.
Should we have a KOBJ_ADD in the new network namespace or is the
KOBJ_MOVE sufficient?
Eric
> From b436802aa8ae80f699b3d7bcf584d3e86af7355a Mon Sep 17 00:00:00 2001
> From: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> Date: Fri, 12 Oct 2012 21:42:05 +0100
> Subject: [PATCH 1/1] dev_change_net_namespace: send a KOBJ_REMOVED to
> original netns
>
> Signed-off-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> ---
> net/core/dev.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index e2215ee..8062a5a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6172,6 +6172,10 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
> dev_uc_flush(dev);
> dev_mc_flush(dev);
>
> + /* Send a netdev-removed uevent to the old namespace */
> +#ifdef CONFIG_HOTPLUG
> + kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE);
> +#endif
> /* Actually switch the network namespace */
> dev_net_set(dev, net);
* Re: uevent when moving nic between network namespaces?
From: Serge Hallyn @ 2012-10-12 22:17 UTC
To: Eric W. Biederman
Cc: Daniel Lezcano,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stefan Bader, Stéphane Graber, Dan Kegel,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
>
> > Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> >> I am not currently working on a patch for this, but I will be happy to
> >> review one. At a quick glance it looks like this could just be as
> >> simple as calling kobject_uevent at the proper time, but testing and
> >> reading through the relevant code paths is probably a good idea as there
> >> always seems to be gotchas in that code.
> >>
> >> Eric
> >
> > This (the simple fix) works for me, actually.
> >
> > I do notice the ifdef shouldn't be needed, all the better.
>
> Should we have a KOBJ_ADD in the new network namespace or is the
> KOBJ_MOVE sufficient?
I was wondering about that... the KOBJ_MOVE is technically not sufficient
imo, since a MOVE (for a device which udev/upstart has never seen before)
doesn't necessarily mean "configure this." So when I pass one end of a
veth into a running ubuntu container, there is no network-interface or
network-interface-security upstart job for it, whereas if I do an
'ip link add type veth' inside the container, those do get the jobs.

Now, ISTM passing an endpoint into a container is mainly done at
startup, and upstart will end up configuring it anyway. Nothing is
really breaking in any of the container usages I've seen because of this.
But it would definitely be cleaner to pass a KOBJ_ADD before the KOBJ_MOVE.
Otherwise, udev has to guess what the MOVE meant.

If there's no objection, I'll add that (and test it) and send to netdev
on monday.
-serge
* Re: uevent when moving nic between network namespaces?
From: Eric W. Biederman @ 2012-10-12 22:29 UTC
To: Serge Hallyn
Cc: Daniel Lezcano,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stefan Bader, Stéphane Graber, Dan Kegel,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
> Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>> Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
>>
>> > Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>> >> I am not currently working on a patch for this, but I will be happy to
>> >> review one. At a quick glance it looks like this could just be as
>> >> simple as calling kobject_uevent at the proper time, but testing and
>> >> reading through the relevant code paths is probably a good idea as there
>> >> always seems to be gotchas in that code.
>> >>
>> >> Eric
>> >
>> > This (the simple fix) works for me, actually.
>> >
>> > I do notice the ifdef shouldn't be needed, all the better.
>>
>> Should we have a KOBJ_ADD in the new network namespace or is the
>> KOBJ_MOVE sufficient?
>
> I was wondering about that... the KOBJ_MOVE is technically not sufficient
> imo, since a MOVE (for a device which udev/upstart has never seen before)
> doesn't necessarily mean "configure this." So when I pass one end of a
> veth into a running ubuntu container, there is no network-interface or
> network-interface-security upstart job for it, whereas if I do an
> 'ip link add type veth' inside the container, those do get the jobs.
>
> Now, ISTM passing an endpoint into a container is mainly done at
> startup, and upstart will end up configuring it anyway. Nothing is
> really breaking in any of the container usages I've seen because of this.
> But it would definitely be cleaner to pass a KOBJ_ADD before the KOBJ_MOVE.
> Otherwise, udev has to guess what the MOVE meant.
>
> If there's no objection, I'll add that (and test it) and send to netdev
> on monday.
Sounds good. Right now I have the suspicion we might want our own
variant on sysfs_move that sends these instead of the move...
But let's confirm things work better with add/remove before we go crazy
on the best way to generate maintainable code.
Eric
* Re: uevent when moving nic between network namespaces?
From: Serge Hallyn @ 2012-10-13 5:17 UTC
To: Eric W. Biederman
Cc: Daniel Lezcano,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stefan Bader, Stéphane Graber, Dan Kegel,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
>
> > Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> >> Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> writes:
> >>
> >> > Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> >> >> I am not currently working on a patch for this, but I will be happy to
> >> >> review one. At a quick glance it looks like this could just be as
> >> >> simple as calling kobject_uevent at the proper time, but testing and
> >> >> reading through the relevant code paths is probably a good idea as there
> >> >> always seems to be gotchas in that code.
> >> >>
> >> >> Eric
> >> >
> >> > This (the simple fix) works for me, actually.
> >> >
> >> > I do notice the ifdef shouldn't be needed, all the better.
> >>
> >> Should we have a KOBJ_ADD in the new network namespace or is the
> >> KOBJ_MOVE sufficient?
> >
> > I was wondering about that... the KOBJ_MOVE is technically not sufficient
> > imo, since a MOVE (for a device which udev/upstart has never seen before)
> > doesn't necessarily mean "configure this." So when I pass one end of a
> > veth into a running ubuntu container, there is no network-interface or
> > network-interface-security upstart job for it, whereas if I do an
> > 'ip link add type veth' inside the container, those do get the jobs.
> >
> > Now, ISTM passing an endpoint into a container is mainly done at
> > startup, and upstart will end up configuring it anyway. Nothing is
> > really breaking in any of the container usages I've seen because of this.
> > But it would definitely be cleaner to pass a KOBJ_ADD before the KOBJ_MOVE.
> > Otherwise, udev has to guess what the MOVE meant.
> >
> > If there's no objection, I'll add that (and test it) and send to netdev
> > on monday.
>
> Sounds good. Right now I have the suspicion we might want our own
> variant on sysfs_move that sends these instead of the move...
>
> But let's confirm things work better with add/remove before we go crazy
> on the best way to generate maintainable code.
Yup all still looks good with the following trivial patch. And now when
I pass a netdev into a running container, it gets a network-interface
upstart job just as it does on a real host.
And no network-interface jobs stick around after the container shuts
down, meaning this solves the kernel part of bug 1065589
(https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589).
(Pre-existing nics don't get a network-interface job - the fact that lxc
first passes in the netdevs and then execs init therefore still causes
some asymmetry wrt a real host, where netdevs always come up after init
starts. AFAIK we don't care, but Stéphane might know of a reason why we
do - in either case it's not the kernel's problem)
From 01dc08273fa63a50f6dbb7377397ec52a7a337f8 Mon Sep 17 00:00:00 2001
From: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Date: Fri, 12 Oct 2012 21:42:05 +0100
Subject: [PATCH 1/1] dev_change_net_namespace: send a KOBJ_REMOVE to
original netns
v2: also send KOBJ_ADD to new netns. There will then be a
_MOVE event from the device_rename() call, but that should
be innocuous.
Signed-off-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
net/core/dev.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index e2215ee..2c43aaf 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6172,6 +6172,9 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 	dev_uc_flush(dev);
 	dev_mc_flush(dev);

+	/* Send a netdev-removed uevent to the old namespace */
+	kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE);
+
 	/* Actually switch the network namespace */
 	dev_net_set(dev, net);

@@ -6183,6 +6186,9 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 			dev->iflink = dev->ifindex;
 	}

+	/* Send a netdev-add uevent to the new namespace */
+	kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
+
 	/* Fixup kobjects */
 	err = device_rename(&dev->dev, dev->name);
 	WARN_ON(err);
--
1.7.10.4
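
The ordering argument behind the patch can be illustrated with a small
userspace toy model. This is only a sketch under a simplifying
assumption, namely that an event is delivered to whichever namespace the
device belongs to at the moment the event is emitted. It is not kernel
code, and every name in it (toy_dev, toy_uevent, and so on) is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum toy_action { TOY_ADD, TOY_REMOVE, TOY_MOVE };

/* One recorded event: what happened and which namespace saw it. */
struct toy_event {
	enum toy_action action;
	int netns;
};

/* A toy device that remembers its namespace and the events it emitted. */
struct toy_dev {
	int netns;
	struct toy_event log[8];
	size_t nlog;
};

/* Record an event tagged with the namespace the device is in right now. */
static void toy_uevent(struct toy_dev *dev, enum toy_action action)
{
	assert(dev != NULL);
	if (dev->nlog < sizeof(dev->log) / sizeof(dev->log[0])) {
		dev->log[dev->nlog].action = action;
		dev->log[dev->nlog].netns = dev->netns;
		dev->nlog++;
	}
}

/* Unpatched ordering: only the rename emits an event, after the switch. */
static void toy_change_netns_unpatched(struct toy_dev *dev, int new_ns)
{
	dev->netns = new_ns;
	toy_uevent(dev, TOY_MOVE);
}

/* v2 ordering: REMOVE in the old ns, switch, then ADD and MOVE in the new. */
static void toy_change_netns_patched(struct toy_dev *dev, int new_ns)
{
	toy_uevent(dev, TOY_REMOVE);
	dev->netns = new_ns;
	toy_uevent(dev, TOY_ADD);
	toy_uevent(dev, TOY_MOVE);
}
```

In this model the unpatched path leaves the old namespace with no event
at all, which matches the stale network-interface job on the host, while
the patched path gives the old namespace a REMOVE and the new one an ADD
followed by the MOVE from the rename.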
* Re: uevent when moving nic between network namespaces?
From: Eric W. Biederman @ 2012-10-13 5:27 UTC
To: Serge Hallyn
Cc: Daniel Lezcano,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Stefan Bader, Stéphane Graber, Dan Kegel,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Serge Hallyn <serge.hallyn@canonical.com> writes:
[snip old comments]
> Yup all still looks good with the following trivial patch. And now when
> I pass a netdev into a running container, it gets a network-interface
> upstart job just as it does on a real host.
>
> And no network-interface jobs stick around after the container shuts
> down, meaning this solves the kernel part of bug 1065589
> (https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589).
>
> (Pre-existing nics don't get a network-interface job - the fact that lxc
> first passes in the netdevs and then execs init therefore still causes
> some asymmetry wrt a real host, where netdevs always come up after init
> starts. AFAIK we don't care, but Stéphane might know of a reason why we
> do - in either case it's not the kernel's problem)
>
> From 01dc08273fa63a50f6dbb7377397ec52a7a337f8 Mon Sep 17 00:00:00 2001
> From: Serge Hallyn <serge.hallyn@canonical.com>
> Date: Fri, 12 Oct 2012 21:42:05 +0100
> Subject: [PATCH 1/1] dev_change_net_namespace: send a KOBJ_REMOVED to
> original netns
>
> v2: also send KOBJ_ADD to new netns. There will then be a
> _MOVE event from the device_rename() call, but that should
> be innocuous.
This patch looks reasonable to me. I would add to the changelog the
motivation. Something like your comments just above this patch.
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
> Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
> ---
> net/core/dev.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index e2215ee..2c43aaf 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6172,6 +6172,9 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
> dev_uc_flush(dev);
> dev_mc_flush(dev);
>
> + /* Send a netdev-removed uevent to the old namespace */
> + kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE);
> +
> /* Actually switch the network namespace */
> dev_net_set(dev, net);
>
> @@ -6183,6 +6186,9 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
> dev->iflink = dev->ifindex;
> }
>
> + /* Send a netdev-add uevent to the new namespace */
> + kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
> +
> /* Fixup kobjects */
> err = device_rename(&dev->dev, dev->name);
> WARN_ON(err);