netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Lezcano <dlezcano@fr.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Denis V. Lunev" <den@sw.ru>,
	davem@davemloft.net, netdev@vger.kernel.org, xemul@openvz.org,
	containers@lists.osdl.org, yoshfuji@linux-ipv6.org,
	Benjamin Thery <benjamin.thery@bull.net>
Subject: Re: [patch 1/1][NETNS][IPV6] protect addrconf from loopback registration
Date: Tue, 13 Nov 2007 11:55:32 +0100	[thread overview]
Message-ID: <473982A4.5020905@fr.ibm.com> (raw)
In-Reply-To: <m16407450i.fsf@ebiederm.dsl.xmission.com>

Eric W. Biederman wrote:
> Daniel Lezcano <dlezcano@fr.ibm.com> writes:
> 
>> Eric W. Biederman wrote:
>>> "Denis V. Lunev" <den@sw.ru> writes:
>>>
>>>>> Index: linux-2.6-netns/net/ipv6/addrconf.c
>>>>> ===================================================================
>>>>> --- linux-2.6-netns.orig/net/ipv6/addrconf.c
>>>>> +++ linux-2.6-netns/net/ipv6/addrconf.c
>>>>> @@ -2272,7 +2272,8 @@ static int addrconf_notify(struct notifi
>>>>>   	switch(event) {
>>>>>  	case NETDEV_REGISTER:
>>>>> -		if (!idev && dev->mtu >= IPV6_MIN_MTU) {
>>>>> +		if (!(dev->flags & IFF_LOOPBACK) &&
>>>>> +		    !idev && dev->mtu >= IPV6_MIN_MTU) {
>>> It is idev being true here for the loopback device that would
>>> prevent things not missing the REGISTER event.
>>>
>>> Hmm.  But we do call ipv6_add_dev on loopback and now the loopback
>>> device is practically guaranteed to be the first device so we can
>>> probably just remove the special case in addrconf_init.
>>>
>>> Anyway Daniels patch makes increasingly less sense the more I look
>>> at it.
>> Let me try to clarify:
>>
>>  * when the init network namespace is created, the loopback is created first,
>> before ipv6, and the notifier call chain for ipv6 is not setup, so the protocol
>> does not receive the REGISTER event
>>
>>  * when the init network namespace is destroyed during shutdown, the loopback is
>> not unregistered, so there is no UNREGISTER event
> 
> * When addrconf_init calls register_netdevice_notifier we receive
>   NETDEV_REGISTER and NETDEV_UP for all network devices that are in
>   the system including the loopback device.

Thanks for the information. Effectively, I missed this :|

>>  * when we create a new network namespace, a new instance of the loopback is
>> created and a NETDEV_REGISTER is sent to ipv6 because the notifier call chain
>> has been setup by the init netns (while ipv6 protocol is not yet configured for
>> the namespace which is being created)
> 
> Possibly there may be some ordering issues here.
> 
>>  * when the network namespace exits, the loopback is unregistered after the ipv6
>> protocol but the NETDEV_UNREGISTER is sent to addrconf_notify while the ipv6
>> protocol has been destroyed.
>>
>>
>> The objective of the patch is to discard these events because they were never
>> taken into account and they are not expected to be receive by ipv6 protocol.
> 
> My opinion is that both your analysis is slightly off (as to the cause
> of your problems) and that your approach to fix your problem is wrong
> because you don't untangle the knot you keep it.

Yes, I will look at how to fix that properly.

> ...
> I have register_pernet_subsys and register_per_net_device to ensure
> that when we create a new network namespace all of the subsystems are
> initialized before the network devices are initialize.  So ipv6 should
> be ready before we initialize the new loopback device comes into
> existence.
> 
> The preservation of the order of the network namespace callbacks
> ensures that the loopback device will be the first network device
> registered, and if it helps we can take advantage of that in reference
> to the weirdness from the comment below.
> 
> 	/* The addrconf netdev notifier requires that loopback_dev
> 	 * has it's ipv6 private information allocated and setup
> 	 * before it can bring up and give link-local addresses
> 	 * to other devices which are up.
> 	 *
> 	 * Unfortunately, loopback_dev is not necessarily the first
> 	 * entry in the global dev_base list of net devices.  In fact,
> 	 * it is likely to be the very last entry on that list.
> 	 * So this causes the notifier registry below to try and
> 	 * give link-local addresses to all devices besides loopback_dev
> 	 * first, then loopback_dev, which cases all the non-loopback_dev
> 	 * devices to fail to get a link-local address.
> 	 *
> 	 * So, as a temporary fix, allocate the ipv6 structure for
> 	 * loopback_dev first by hand.
> 	 * Longer term, all of the dependencies ipv6 has upon the loopback
> 	 * device and it being up should be removed.
> 	 */
> 
> We can just special case registration of the loopback device to
> do:
> 	ip6_null_entry.u.dst.dev = init_net.loopback_dev;
> 	ip6_null_entry.rt6i_idev = in6_dev_get(init_net.loopback_dev);
> #ifdef CONFIG_IPV6_MULTIPLE_TABLES
> 	ip6_prohibit_entry.u.dst.dev = init_net.loopback_dev;
> 	ip6_prohibit_entry.rt6i_idev = in6_dev_get(init_net.loopback_dev);
> 	ip6_blk_hole_entry.u.dst.dev = init_net.loopback_dev;
> 	ip6_blk_hole_entry.rt6i_idev = in6_dev_get(init_net.loopback_dev);
> #endif
> 
> Which would remove the special case from addrconf_init.



  parent reply	other threads:[~2007-11-13 11:00 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20071112151953.052335971@mai.toulouse-stg.fr.ibm.com>
2007-11-12 15:19 ` [patch 1/1][NETNS][IPV6] protect addrconf from loopback registration Daniel Lezcano
2007-11-12 16:05   ` Denis V. Lunev
2007-11-12 16:11     ` Daniel Lezcano
2007-11-12 16:49       ` Denis V. Lunev
2007-11-12 16:59         ` Eric W. Biederman
2007-11-12 22:24         ` David Miller
2007-11-13 12:59           ` Eric W. Biederman
     [not found]     ` <473879C3.5020301-3ImXcnM4P+0@public.gmane.org>
2007-11-12 16:51       ` Eric W. Biederman
2007-11-12 17:01         ` Daniel Lezcano
2007-11-12 19:50           ` Eric W. Biederman
2007-11-13  1:52             ` YOSHIFUJI Hideaki / 吉藤英明
2007-11-13 13:11               ` Eric W. Biederman
2007-11-13 10:55             ` Daniel Lezcano [this message]
2007-11-12 21:00           ` Denis V. Lunev
2007-11-12 16:40   ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=473982A4.5020905@fr.ibm.com \
    --to=dlezcano@fr.ibm.com \
    --cc=benjamin.thery@bull.net \
    --cc=containers@lists.osdl.org \
    --cc=davem@davemloft.net \
    --cc=den@sw.ru \
    --cc=ebiederm@xmission.com \
    --cc=netdev@vger.kernel.org \
    --cc=xemul@openvz.org \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).