netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] IPv6 start/stop problems
@ 2008-03-18 14:32 Denis V. Lunev
       [not found] ` <1205850761.9163.21.camel-aPCOdVxUTlgvJsYlp49lxw@public.gmane.org>
  2008-03-31  8:38 ` [PATCH 0/3] IPv6 start/stop problems Denis V. Lunev
  0 siblings, 2 replies; 12+ messages in thread
From: Denis V. Lunev @ 2008-03-18 14:32 UTC (permalink / raw)
  To: David Miller
  Cc: yoshfuji-VfPWfsRibaP+Ru+s062T9g, Netdev List, Linux Containers

Hello, Dave!

We have faced a several problems with IPv6 start/stop on 2.6.18 RHEL5
kernel in OpenVz. The code in the 2.6.25 does not differ from 2.6.18 in
respect to this.

Regards,
	Den

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used.
       [not found] ` <1205850761.9163.21.camel-aPCOdVxUTlgvJsYlp49lxw@public.gmane.org>
@ 2008-03-18 14:35   ` Denis V. Lunev
  2008-03-23  0:38     ` David Miller
  2008-03-18 14:35   ` [PATCH 2/3] [IPV6]: inet6_dev on loopback should be kept until namespace stop Denis V. Lunev
  2008-03-18 14:35   ` [PATCH 3/3] [IPV6]: Fix refcounting for anycast dst entries Denis V. Lunev
  2 siblings, 1 reply; 12+ messages in thread
From: Denis V. Lunev @ 2008-03-18 14:35 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: yoshfuji-VfPWfsRibaP+Ru+s062T9g, netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-qjLDD68F18O7TbgM5vRIOg, Denis V. Lunev,
	benjamin.thery-6ktuUTfB/bM

addrconf_ifdown is broken in respect to the usage of how parameter. This
function is called with (event != NETDEV_DOWN) and (2) on the IPv6 stop.
It the latter case inet6_dev from loopback device should be destroyed.

Signed-off-by: Denis V. Lunev <den-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
---
 net/ipv6/addrconf.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 4b86d38..d68e8f5 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2457,7 +2457,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 	/* Step 1: remove reference to ipv6 device from parent device.
 		   Do not dev_put!
 	 */
-	if (how == 1) {
+	if (how) {
 		idev->dead = 1;
 
 		/* protected by rtnl_lock */
@@ -2489,12 +2489,12 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 	write_lock_bh(&idev->lock);
 
 	/* Step 3: clear flags for stateless addrconf */
-	if (how != 1)
+	if (!how)
 		idev->if_flags &= ~(IF_RS_SENT|IF_RA_RCVD|IF_READY);
 
 	/* Step 4: clear address list */
 #ifdef CONFIG_IPV6_PRIVACY
-	if (how == 1 && del_timer(&idev->regen_timer))
+	if (how && del_timer(&idev->regen_timer))
 		in6_dev_put(idev);
 
 	/* clear tempaddr list */
@@ -2531,7 +2531,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 
 	/* Step 5: Discard multicast list */
 
-	if (how == 1)
+	if (how)
 		ipv6_mc_destroy_dev(idev);
 	else
 		ipv6_mc_down(idev);
@@ -2540,7 +2540,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 
 	/* Shot the device (if unregistered) */
 
-	if (how == 1) {
+	if (how) {
 		addrconf_sysctl_unregister(idev);
 		neigh_parms_release(&nd_tbl, idev->nd_parms);
 		neigh_ifdown(&nd_tbl, dev);
-- 
1.5.3.rc5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] [IPV6]: inet6_dev on loopback should be kept until namespace stop.
       [not found] ` <1205850761.9163.21.camel-aPCOdVxUTlgvJsYlp49lxw@public.gmane.org>
  2008-03-18 14:35   ` [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used Denis V. Lunev
@ 2008-03-18 14:35   ` Denis V. Lunev
  2008-03-18 14:35   ` [PATCH 3/3] [IPV6]: Fix refcounting for anycast dst entries Denis V. Lunev
  2 siblings, 0 replies; 12+ messages in thread
From: Denis V. Lunev @ 2008-03-18 14:35 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: yoshfuji-VfPWfsRibaP+Ru+s062T9g, netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-qjLDD68F18O7TbgM5vRIOg, Denis V. Lunev,
	benjamin.thery-6ktuUTfB/bM

In the other case it will be destroyed when last address will be removed
from lo inside a namespace. This will break IPv6 in several places. The
most obvious one is ip6_dst_ifdown.

Signed-off-by: Denis V. Lunev <den-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
---
 net/ipv6/addrconf.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index d68e8f5..40784ea 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2444,7 +2444,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 
 	ASSERT_RTNL();
 
-	if (dev == init_net.loopback_dev && how == 1)
+	if ((dev->flags & IFF_LOOPBACK) && how == 1)
 		how = 0;
 
 	rt6_ifdown(net, dev);
-- 
1.5.3.rc5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] [IPV6]: Fix refcounting for anycast dst entries.
       [not found] ` <1205850761.9163.21.camel-aPCOdVxUTlgvJsYlp49lxw@public.gmane.org>
  2008-03-18 14:35   ` [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used Denis V. Lunev
  2008-03-18 14:35   ` [PATCH 2/3] [IPV6]: inet6_dev on loopback should be kept until namespace stop Denis V. Lunev
@ 2008-03-18 14:35   ` Denis V. Lunev
  2 siblings, 0 replies; 12+ messages in thread
From: Denis V. Lunev @ 2008-03-18 14:35 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: yoshfuji-VfPWfsRibaP+Ru+s062T9g, netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-qjLDD68F18O7TbgM5vRIOg, Denis V. Lunev,
	benjamin.thery-6ktuUTfB/bM

Anycast DST entries allocated inside ipv6_dev_ac_inc are leaked when
network device is stopped without removing IPv6 addresses from it. The
bug has been observed in the reality on 2.6.18-rhel5 kernel.

In the above case addrconf_ifdown marks all entries as obsolete and
ip6_del_rt called from __ipv6_dev_ac_dec returns ENOENT. The referrence is
not dropped.

The fix is simple. DST entry should not keep referrence when stored in the
FIB6 tree.

Signed-off-by: Denis V. Lunev <den-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
---
 net/ipv6/anycast.c |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 96868b9..7bc0469 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -334,9 +334,7 @@ int ipv6_dev_ac_inc(struct net_device *dev, struct in6_addr *addr)
 	idev->ac_list = aca;
 	write_unlock_bh(&idev->lock);
 
-	dst_hold(&rt->u.dst);
-	if (ip6_ins_rt(rt))
-		dst_release(&rt->u.dst);
+	ip6_ins_rt(rt);
 
 	addrconf_join_solict(dev, &aca->aca_addr);
 
@@ -378,10 +376,7 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, struct in6_addr *addr)
 	addrconf_leave_solict(idev, &aca->aca_addr);
 
 	dst_hold(&aca->aca_rt->u.dst);
-	if (ip6_del_rt(aca->aca_rt))
-		dst_free(&aca->aca_rt->u.dst);
-	else
-		dst_release(&aca->aca_rt->u.dst);
+	ip6_del_rt(aca->aca_rt);
 
 	aca_put(aca);
 	return 0;
-- 
1.5.3.rc5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used.
  2008-03-18 14:35   ` [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used Denis V. Lunev
@ 2008-03-23  0:38     ` David Miller
  2008-03-23  8:13       ` Denis V. Lunev
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2008-03-23  0:38 UTC (permalink / raw)
  To: den; +Cc: yoshfuji, netdev, kaber, containers, dlezcano, benjamin.thery

From: "Denis V. Lunev" <den@openvz.org>
Date: Tue, 18 Mar 2008 17:35:23 +0300

> addrconf_ifdown is broken in respect to the usage of how parameter. This
> function is called with (event != NETDEV_DOWN) and (2) on the IPv6 stop.
> It the latter case inet6_dev from loopback device should be destroyed.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>

The code purposefully treats "2" specially because when IPV6 routes
are destroyed they are changed to point to the loopback device's
inet6_dev object.

This allows statistic bumping code to not have to check if it has a
NULL inet6_dev pointer or not, because that's now impossible.

Since ipv6 is not unloadable, addrconf_cleanup(), and thus the
"how == 2" case can only occur when ipv6 fails to load properly.
The only real consequence of this bug is that if ipv6 fails
to load properly, a subsequent successfull load of ipv6 will
leak the loopback device's inet6_dev object, which isn't that
much of a big deal.

I understand that for namespaces you have to deal with multiple
loopback devices, but you'll need to solve that problem while
still handling the wish of the ipv6 stack for inet6_dev objects
of loopback devices to be permanent and guarenteed to always
be around for the sake of statistics bumping.

I thus can't apply any of these patches until those issues are
resolved.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used.
  2008-03-23  0:38     ` David Miller
@ 2008-03-23  8:13       ` Denis V. Lunev
  2008-03-23 10:17         ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Denis V. Lunev @ 2008-03-23  8:13 UTC (permalink / raw)
  To: David Miller
  Cc: yoshfuji, netdev, kaber, containers, dlezcano, benjamin.thery

On Sat, 2008-03-22 at 17:38 -0700, David Miller wrote:
> From: "Denis V. Lunev" <den@openvz.org>
> Date: Tue, 18 Mar 2008 17:35:23 +0300
> 
> > addrconf_ifdown is broken in respect to the usage of how parameter. This
> > function is called with (event != NETDEV_DOWN) and (2) on the IPv6 stop.
> > It the latter case inet6_dev from loopback device should be destroyed.
> > 
> > Signed-off-by: Denis V. Lunev <den@openvz.org>
> 
> The code purposefully treats "2" specially because when IPV6 routes
> are destroyed they are changed to point to the loopback device's
> inet6_dev object.
> 
> This allows statistic bumping code to not have to check if it has a
> NULL inet6_dev pointer or not, because that's now impossible.
> 
> Since ipv6 is not unloadable, addrconf_cleanup(), and thus the
> "how == 2" case can only occur when ipv6 fails to load properly.
> The only real consequence of this bug is that if ipv6 fails
> to load properly, a subsequent successfull load of ipv6 will
> leak the loopback device's inet6_dev object, which isn't that
> much of a big deal.
> 
> I understand that for namespaces you have to deal with multiple
> loopback devices, but you'll need to solve that problem while
> still handling the wish of the ipv6 stack for inet6_dev objects
> of loopback devices to be permanent and guarenteed to always
> be around for the sake of statistics bumping.

First, this behaviour is broken for a namespace right now in the 2.6.26
tree. inet6_dev pointer will be NULL for a loopback inside the
namespace. The case is simple. Just remove all INET6 addresses from a
loopback device inside a VE. This will call
  inet6_addr_del
    addrconf_ifdown(dev, 1);
       if (dev == init_net.loopback_dev && how == 1)
                how = 0;
the condition will be false and how will not be changed here.

Pls note, that ip6_dst_ifdown deals with a namespace loopback rather
than init_net loopback to track referrences of the namespace objects.
This allows us to catch refcounting bugs smoothly (see patch 3 in the
set).

That's why I have extended a special "2" case to really destroy
inet6_dev to have a way to destroy it. Generic code should not suffer
from this from my POW.

> I thus can't apply any of these patches until those issues are
> resolved.

IMHO special "2" case was intended to have a stub to unload the module
in the future.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used.
  2008-03-23  8:13       ` Denis V. Lunev
@ 2008-03-23 10:17         ` David Miller
  2008-03-23 14:34           ` Denis V. Lunev
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2008-03-23 10:17 UTC (permalink / raw)
  To: den; +Cc: yoshfuji, netdev, kaber, containers, dlezcano, benjamin.thery

From: "Denis V. Lunev" <den@openvz.org>
Date: Sun, 23 Mar 2008 11:13:16 +0300

> First, this behaviour is broken for a namespace right now in the 2.6.26
> tree. inet6_dev pointer will be NULL for a loopback inside the
> namespace. The case is simple. Just remove all INET6 addresses from a
> loopback device inside a VE. This will call
>   inet6_addr_del
>     addrconf_ifdown(dev, 1);
>        if (dev == init_net.loopback_dev && how == 1)
>                 how = 0;
> the condition will be false and how will not be changed here.

That's a bug.

You can't mark any namespace's loopback device's inet6_dev as NULL
until you know that all routes, devices, and packets referring to such
devices and routes in that namespace are %100 gone and unreferenced.

It is now obviously apparent that there are several severe errors
here.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used.
  2008-03-23 10:17         ` David Miller
@ 2008-03-23 14:34           ` Denis V. Lunev
  2008-03-24  5:49             ` David Miller
  2008-04-03 20:33             ` David Miller
  0 siblings, 2 replies; 12+ messages in thread
From: Denis V. Lunev @ 2008-03-23 14:34 UTC (permalink / raw)
  To: David Miller
  Cc: yoshfuji, netdev, kaber, containers, dlezcano, benjamin.thery

On Sun, 2008-03-23 at 03:17 -0700, David Miller wrote:
> From: "Denis V. Lunev" <den@openvz.org>
> Date: Sun, 23 Mar 2008 11:13:16 +0300
> 
> > First, this behaviour is broken for a namespace right now in the 2.6.26
> > tree. inet6_dev pointer will be NULL for a loopback inside the
> > namespace. The case is simple. Just remove all INET6 addresses from a
> > loopback device inside a VE. This will call
> >   inet6_addr_del
> >     addrconf_ifdown(dev, 1);
> >        if (dev == init_net.loopback_dev && how == 1)
> >                 how = 0;
> > the condition will be false and how will not be changed here.
> 
> That's a bug.
> 
> You can't mark any namespace's loopback device's inet6_dev as NULL
> until you know that all routes, devices, and packets referring to such
> devices and routes in that namespace are %100 gone and unreferenced.
> 
> It is now obviously apparent that there are several severe errors
> here.

You are perfectly correct and the place in addrconf_cleanup is that
place when we believe that we should destroy all the staff.

You see, it is pretty useless to call addrconf_ifdown(dev, 2) after
addrconf_dev(dev, 0) for a loopback in the current code! No new cleanups
will be performed for 2, pls check :)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used.
  2008-03-23 14:34           ` Denis V. Lunev
@ 2008-03-24  5:49             ` David Miller
  2008-04-03 20:33             ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2008-03-24  5:49 UTC (permalink / raw)
  To: den; +Cc: yoshfuji, netdev, kaber, containers, dlezcano, benjamin.thery

From: "Denis V. Lunev" <den@openvz.org>
Date: Sun, 23 Mar 2008 17:34:59 +0300

> You are perfectly correct and the place in addrconf_cleanup is that
> place when we believe that we should destroy all the staff.
> 
> You see, it is pretty useless to call addrconf_ifdown(dev, 2) after
> addrconf_dev(dev, 0) for a loopback in the current code! No new cleanups
> will be performed for 2, pls check :)

Ok, I'll take another close look at this and apply your patches
if I agree with you :-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] IPv6 start/stop problems
  2008-03-18 14:32 [PATCH 0/3] IPv6 start/stop problems Denis V. Lunev
       [not found] ` <1205850761.9163.21.camel-aPCOdVxUTlgvJsYlp49lxw@public.gmane.org>
@ 2008-03-31  8:38 ` Denis V. Lunev
  2008-03-31  8:41   ` David Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Denis V. Lunev @ 2008-03-31  8:38 UTC (permalink / raw)
  To: David Miller; +Cc: yoshfuji, Netdev List, Patrick McHardy, Linux Containers

On Tue, 2008-03-18 at 17:32 +0300, Denis V. Lunev wrote:
> Hello, Dave!
> 
> We have faced a several problems with IPv6 start/stop on 2.6.18 RHEL5
> kernel in OpenVz. The code in the 2.6.25 does not differ from 2.6.18 in
> respect to this.

Hi, Dave!

Have you changed you mind about this?

Regards,
	Den


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] IPv6 start/stop problems
  2008-03-31  8:38 ` [PATCH 0/3] IPv6 start/stop problems Denis V. Lunev
@ 2008-03-31  8:41   ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2008-03-31  8:41 UTC (permalink / raw)
  To: den; +Cc: yoshfuji, netdev, kaber, containers

From: "Denis V. Lunev" <den@parallels.com>
Date: Mon, 31 Mar 2008 12:38:01 +0400

> On Tue, 2008-03-18 at 17:32 +0300, Denis V. Lunev wrote:
> > Hello, Dave!
> > 
> > We have faced a several problems with IPv6 start/stop on 2.6.18 RHEL5
> > kernel in OpenVz. The code in the 2.6.25 does not differ from 2.6.18 in
> > respect to this.
> 
> Hi, Dave!
> 
> Have you changed you mind about this?

I still haven't gotten around to reviewing this stuff yet.
When I do, I'll be sure to let you know. :-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used.
  2008-03-23 14:34           ` Denis V. Lunev
  2008-03-24  5:49             ` David Miller
@ 2008-04-03 20:33             ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2008-04-03 20:33 UTC (permalink / raw)
  To: den; +Cc: yoshfuji, netdev, kaber, containers, dlezcano, benjamin.thery

From: "Denis V. Lunev" <den@openvz.org>
Date: Sun, 23 Mar 2008 17:34:59 +0300

> You are perfectly correct and the place in addrconf_cleanup is that
> place when we believe that we should destroy all the staff.
> 
> You see, it is pretty useless to call addrconf_ifdown(dev, 2) after
> addrconf_dev(dev, 0) for a loopback in the current code! No new cleanups
> will be performed for 2, pls check :)

I've rereviewed these three patches and I agree with your
assesment.

Therefore, I've applied these three patches to net-2.6 and
will push them out after some build validation.

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2008-04-03 20:33 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-03-18 14:32 [PATCH 0/3] IPv6 start/stop problems Denis V. Lunev
     [not found] ` <1205850761.9163.21.camel-aPCOdVxUTlgvJsYlp49lxw@public.gmane.org>
2008-03-18 14:35   ` [PATCH 1/3] [IPV6]: Event type in addrconf_ifdown is mis-used Denis V. Lunev
2008-03-23  0:38     ` David Miller
2008-03-23  8:13       ` Denis V. Lunev
2008-03-23 10:17         ` David Miller
2008-03-23 14:34           ` Denis V. Lunev
2008-03-24  5:49             ` David Miller
2008-04-03 20:33             ` David Miller
2008-03-18 14:35   ` [PATCH 2/3] [IPV6]: inet6_dev on loopback should be kept until namespace stop Denis V. Lunev
2008-03-18 14:35   ` [PATCH 3/3] [IPV6]: Fix refcounting for anycast dst entries Denis V. Lunev
2008-03-31  8:38 ` [PATCH 0/3] IPv6 start/stop problems Denis V. Lunev
2008-03-31  8:41   ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).