* PROBLEM: Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache
@ 2013-07-02 22:04 Pierre Emeriaud
  2013-07-03  5:00 ` Hannes Frederic Sowa
  0 siblings, 1 reply; 7+ messages in thread
From: Pierre Emeriaud @ 2013-07-02 22:04 UTC
  To: netdev

Hello,

Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache.

When adding a route to 2000::/3 with a next-hop that is not in the
neighbor cache, the route is not preferred over the default.

Tested against 3.9.1-debian and 3.9.8-arch. I was not able to reproduce it with 3.8.0.

$ cat /proc/version
Linux version 3.9.8-1-ARCH (tobias@testing-i686) (gcc version 4.8.1 (GCC) ) #1 SMP PREEMPT Fri Jun 28 07:43:59 CEST 2013

How to reproduce:

# ip -6 route
2001:db8:ee8c:180::/64 dev eth0  proto kernel  metric 256  expires 86176sec
fe80::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth1  proto kernel  metric 256
default via fe80::f6ca:e5ff:fe43:d114 dev eth0  proto ra  metric 1024  expires 1576sec

# ip -6 addr add 2001:db8::1/64 dev eth1
# ip -6 route add 2000::/3 via 2001:db8::2

$ ip -6 route show
2001:db8::/64 dev eth1  proto kernel  metric 256
2001:db8:ee8c:180::/64 dev eth0  proto kernel  metric 256  expires 86360sec
2000::/3 via 2001:db8::2 dev eth1  metric 1024
fe80::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth1  proto kernel  metric 256
default via fe80::f6ca:e5ff:fe43:d114 dev eth0  proto ra  metric 1024  expires 1760sec


$ ip -6 route get 2a00:1450:4007:806::1001
2a00:1450:4007:806::1001 from :: via fe80::f6ca:e5ff:fe43:d114 dev eth0  src 2001:db8:ee8c:180:21b:77ff:fe30:9e36  metric 0

# ip -6 neig add 2001:db8::2 dev eth1 lladdr 86:74:4c:45:18:f2 nud permanent

# ip -6 route get 2a00:1450:4007:806::1001
2a00:1450:4007:806::1001 from :: via 2001:db8::2 dev eth1  src 2001:db8::1  metric 0


Workaround:
The next hop has to be in the neighbor cache, as either a static or a
dynamic entry. Sending a ping6 to the next hop is enough to create one.


Regards,
Pierre.


* Re: PROBLEM: Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache
  2013-07-02 22:04 PROBLEM: Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache Pierre Emeriaud
@ 2013-07-03  5:00 ` Hannes Frederic Sowa
  2013-07-03 18:15   ` Hannes Frederic Sowa
  0 siblings, 1 reply; 7+ messages in thread
From: Hannes Frederic Sowa @ 2013-07-03  5:00 UTC
  To: Pierre Emeriaud; +Cc: netdev, yoshfuji

[Cc YOSHIFUJI Hideaki because of commit
887c95cc1da53f66a5890fdeab13414613010097 ("ipv6: Complete neighbour entry
removal from dst_entry.")]

On Wed, Jul 03, 2013 at 12:04:33AM +0200, Pierre Emeriaud wrote:
> Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache.
> 
> When adding a route to 2000::/3 with a next-hop that is not in the
> neighbor cache, the route is not preferred over the default.

Thanks for the report!

Well.

We ignore this route because of rt6_score_route returning -1 in this case.
This traces down to rt6_check_neigh returning false.

Before the above-mentioned commit, we kicked off some logic in
ip6_route_add to create a neighbour entry. Now we end up with neigh == NULL.

This is a hotfix but I need to do more research regarding
CONFIG_IPV6_ROUTER_PREF and further expectations of neigh != NULL (you
can try this at your own risk ;):

--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -547,6 +547,10 @@ static inline bool rt6_check_neigh(struct rt6_info *rt)
                        ret = true;
 #endif
                read_unlock(&neigh->lock);
+       } else {
+#ifdef CONFIG_IPV6_ROUTER_PREF
+               ret = true;
+#endif
        }
        rcu_read_unlock_bh();
 

Greetings,

  Hannes
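
To make the effect of the else branch in the diff above easier to follow, here is a small userspace model of the check. It is only a sketch and not the kernel function: the names check_neigh_model, neigh_model and nud_model are made up for illustration, the locking and the neighbour lookup are left out, and the handling of an existing entry with router preference enabled (anything other than a failed state counts as usable) is an assumption inferred from the patch context.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in NUD states; the kernel keeps bit flags in neigh->nud_state. */
enum nud_model { NUD_MODEL_VALID, NUD_MODEL_PROBING, NUD_MODEL_FAILED };

struct neigh_model {
	enum nud_model state;
};

/*
 * Model of the neighbour check for a gateway route.
 *   neigh       - neighbour cache entry for the next hop, NULL if none exists
 *   router_pref - models CONFIG_IPV6_ROUTER_PREF being enabled
 *   hotfix      - models the extra else branch from the diff
 */
static bool check_neigh_model(const struct neigh_model *neigh,
			      bool router_pref, bool hotfix)
{
	bool ret = false;

	if (neigh) {
		if (neigh->state == NUD_MODEL_VALID)
			ret = true;
		else if (router_pref && neigh->state != NUD_MODEL_FAILED)
			ret = true;	/* assumption, see the text above */
	} else if (hotfix && router_pref) {
		ret = true;	/* no NUD information: assume reachable */
	}
	return ret;
}
```

In this model, check_neigh_model(NULL, true, false) is false, which corresponds to the reported behaviour, and check_neigh_model(NULL, true, true) is true. Without router preference a missing entry stays unusable in both variants.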


* Re: PROBLEM: Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache
  2013-07-03  5:00 ` Hannes Frederic Sowa
@ 2013-07-03 18:15   ` Hannes Frederic Sowa
  2013-07-03 18:21     ` Sergei Shtylyov
  0 siblings, 1 reply; 7+ messages in thread
From: Hannes Frederic Sowa @ 2013-07-03 18:15 UTC
  To: Pierre Emeriaud, netdev, yoshfuji

On Wed, Jul 03, 2013 at 07:00:07AM +0200, Hannes Frederic Sowa wrote:
> [Cc YOSHIFUJI Hideaki because of commit
> 887c95cc1da53f66a5890fdeab13414613010097 ("ipv6: Complete neighbour entry
> removal from dst_entry.")]
> 
> On Wed, Jul 03, 2013 at 12:04:33AM +0200, Pierre Emeriaud wrote:
> > Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache.
> > 
> > When adding a route to 2000::/3 with a next-hop that is not in the
> > neighbor cache, the route is not preferred over the default.
> 
> Thanks for the report!
> 
> Well.
> 
> We ignore this route because of rt6_score_route returning -1 in this case.
> This traces down to rt6_check_neigh returning false.
> 
> Before the above mentioned commit we kicked off some logic to create a
> neighbour entry in ip6_route_add. Now we end up with neigh == NULL.
> 
> This is a hotfix but I need to do more research regarding
> CONFIG_IPV6_ROUTER_PREF and further expectations of neigh != NULL (you
> can try this at your own risk ;):

I looked up the relevant RFCs and do think this is the proper fix. Could you
give it a test?

[PATCH net] ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available

After the removal of rt->n we do not create a neighbour entry at route
insertion time (rt6_bind_neighbour is gone). As long as no neighbour is
created because of "useful traffic" we skip this routing entry because
rt6_check_neigh cannot pick up a valid neighbour (neigh == NULL) and
thus returns false.

This change was introduced by commit
887c95cc1da53f66a5890fdeab13414613010097 ("ipv6: Complete neighbour
entry removal from dst_entry.")

To quote RFC4191:
"If the host has no information about the router's reachability, then
the host assumes the router is reachable."

and also:
"A host MUST NOT probe a router's reachability in the absence of useful
traffic that the host would have sent to the router if it were reachable."

So, just assume the router is reachable and let rt6_probe do the
rest. We don't need to create a neighbour at route insertion time.

If we don't compile with CONFIG_IPV6_ROUTER_PREF (RFC4191 support),
a neighbour is only valid if its nud_state is NUD_VALID. I did not find
any reference in the other RFCs saying that we should probe the router
at route insertion time. So skip this route in that case.

Reported-by: Pierre Emeriaud <petrus.lt@gmail.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/route.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index ad0aa6b..450979d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -547,6 +547,10 @@ static inline bool rt6_check_neigh(struct rt6_info *rt)
 			ret = true;
 #endif
 		read_unlock(&neigh->lock);
+	} else {
+#ifdef CONFIG_IPV6_ROUTER_PREF
+		ret = true;
+#endif
 	}
 	rcu_read_unlock_bh();
 
-- 
1.8.1.4


* Re: PROBLEM: Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache
  2013-07-03 18:15   ` Hannes Frederic Sowa
@ 2013-07-03 18:21     ` Sergei Shtylyov
  2013-07-03 18:45       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 7+ messages in thread
From: Sergei Shtylyov @ 2013-07-03 18:21 UTC
  To: Pierre Emeriaud, netdev, yoshfuji

Hello.

On 07/03/2013 10:15 PM, Hannes Frederic Sowa wrote:

> I looked up the relevant RFCs and do think this is the proper fix. Could you
> give it a test?

> [PATCH net] ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available

> After the removal of rt->n we do not create a neighbour entry at route
> insertion time (rt6_bind_neighbour is gone). As long as no neighbour is
> created because of "useful traffic" we skip this routing entry because
> rt6_check_neigh cannot pick up a valid neighbour (neigh == NULL) and
> thus returns false.

> This change was introduced by commit
> 887c95cc1da53f66a5890fdeab13414613010097 ("ipv6: Complete neighbour
> entry removal from dst_entry.")

> To quote RFC4191:
> "If the host has no information about the router's reachability, then
> the host assumes the router is reachable."

> and also:
> "A host MUST NOT probe a router's reachability in the absence of useful
> traffic that the host would have sent to the router if it were reachable."

> So, just assume the router is reachable and let rt6_probe do the
> rest. We don't need to create a neighbour at route insertion time.

> If we don't compile with CONFIG_IPV6_ROUTER_PREF (RFC4191 support),
> a neighbour is only valid if its nud_state is NUD_VALID. I did not find
> any reference in the other RFCs saying that we should probe the router
> at route insertion time. So skip this route in that case.

> Reported-by: Pierre Emeriaud <petrus.lt@gmail.com>
> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
>   net/ipv6/route.c | 4 ++++
>   1 file changed, 4 insertions(+)

> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index ad0aa6b..450979d 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -547,6 +547,10 @@ static inline bool rt6_check_neigh(struct rt6_info *rt)
>   			ret = true;
>   #endif
>   		read_unlock(&neigh->lock);
> +	} else {
> +#ifdef CONFIG_IPV6_ROUTER_PREF

    How about:

	} else if (IS_ENABLED(CONFIG_IPV6_ROUTER_PREF)) {

> +		ret = true;
> +#endif
>   	}

WBR, Sergei
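
For readers unfamiliar with the suggestion, the following is a rough userspace sketch of the difference between the two styles. The MODEL_ and model_ names are hypothetical and exist only for this example. The macro is modeled on the placeholder-argument trick used by the kernel's kconfig.h, simplified here, and should not be read as the kernel's actual definition.

```c
#include <stdbool.h>

/* Hypothetical option, standing in for a Kconfig symbol set to y. */
#define MODEL_CONFIG_ROUTER_PREF 1

/*
 * MODEL_IS_ENABLED(x) expands to 1 if x is defined as 1, otherwise to 0.
 * If x is 1, the pasted token becomes MODEL_ARG_PLACEHOLDER_1, which
 * expands to "0," and shifts the literal 1 into the second argument slot.
 */
#define MODEL_ARG_PLACEHOLDER_1 0,
#define model_take_second(ignored, val, ...) val
#define model_is_defined_3(arg1_or_junk) \
	model_take_second(arg1_or_junk 1, 0, 0)
#define model_is_defined_2(val) \
	model_is_defined_3(MODEL_ARG_PLACEHOLDER_##val)
#define MODEL_IS_ENABLED(option) model_is_defined_2(option)

/* Preprocessor style: the disabled code is never seen by the compiler. */
static bool no_neigh_with_ifdef(void)
{
	bool ret = false;
#ifdef MODEL_CONFIG_ROUTER_PREF
	ret = true;
#endif
	return ret;
}

/* Expression style: both arms are parsed and type-checked, and the
 * compiler can drop the dead arm because the condition is a constant. */
static bool no_neigh_with_is_enabled(void)
{
	bool ret = false;

	if (MODEL_IS_ENABLED(MODEL_CONFIG_ROUTER_PREF))
		ret = true;
	return ret;
}
```

Both functions return the same value for a given configuration. The expression form keeps the control flow readable as a plain else-if, which is what the one-line suggestion above relies on.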


* Re: PROBLEM: Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache
  2013-07-03 18:21     ` Sergei Shtylyov
@ 2013-07-03 18:45       ` Hannes Frederic Sowa
  2013-07-04  0:51         ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Hannes Frederic Sowa @ 2013-07-03 18:45 UTC
  To: Sergei Shtylyov; +Cc: Pierre Emeriaud, netdev, yoshfuji

On Wed, Jul 03, 2013 at 10:21:19PM +0400, Sergei Shtylyov wrote:
>    How about:
> 
> 	} else if (IS_ENABLED(CONFIG_IPV6_ROUTER_PREF)) {
> 
> >+		ret = true;
> >+#endif
> >  	}

Definitely an improvement, thanks!

[PATCH net v2] ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available

After the removal of rt->n we do not create a neighbour entry at route
insertion time (rt6_bind_neighbour is gone). As long as no neighbour is
created because of "useful traffic" we skip this routing entry because
rt6_check_neigh cannot pick up a valid neighbour (neigh == NULL) and
thus returns false.

This change was introduced by commit
887c95cc1da53f66a5890fdeab13414613010097 ("ipv6: Complete neighbour
entry removal from dst_entry.")

To quote RFC4191:
"If the host has no information about the router's reachability, then
the host assumes the router is reachable."

and also:
"A host MUST NOT probe a router's reachability in the absence of useful
traffic that the host would have sent to the router if it were reachable."

So, just assume the router is reachable and let rt6_probe do the
rest. We don't need to create a neighbour at route insertion time.

If we don't compile with CONFIG_IPV6_ROUTER_PREF (RFC4191 support),
a neighbour is only valid if its nud_state is NUD_VALID. I did not find
any reference in the other RFCs saying that we should probe the router
at route insertion time. So skip this route in that case.

v2:
a) use IS_ENABLED instead of #ifdefs (thanks to Sergei Shtylyov)

Reported-by: Pierre Emeriaud <petrus.lt@gmail.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/route.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index ad0aa6b..7f1332f 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -547,6 +547,8 @@ static inline bool rt6_check_neigh(struct rt6_info *rt)
 			ret = true;
 #endif
 		read_unlock(&neigh->lock);
+	} else if (IS_ENABLED(CONFIG_IPV6_ROUTER_PREF)) {
+		ret = true;
 	}
 	rcu_read_unlock_bh();
 
-- 
1.8.1.4


* Re: PROBLEM: Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache
  2013-07-03 18:45       ` Hannes Frederic Sowa
@ 2013-07-04  0:51         ` David Miller
  2013-07-04  5:20           ` Pierre Emeriaud
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2013-07-04  0:51 UTC
  To: hannes; +Cc: sergei.shtylyov, petrus.lt, netdev, yoshfuji

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Wed, 3 Jul 2013 20:45:04 +0200

> On Wed, Jul 03, 2013 at 10:21:19PM +0400, Sergei Shtylyov wrote:
>>    How about:
>> 
>> 	} else if (IS_ENABLED(CONFIG_IPV6_ROUTER_PREF)) {
>> 
>> >+		ret = true;
>> >+#endif
>> >  	}
> 
> Definitely an improvement, thanks!
> 
> [PATCH net v2] ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available

Applied and queued up for -stable, thanks everyone.


* Re: PROBLEM: Linux 3.9 more-specific ipv6 route ignored until next-hop is in neighbor cache
  2013-07-04  0:51         ` David Miller
@ 2013-07-04  5:20           ` Pierre Emeriaud
  0 siblings, 0 replies; 7+ messages in thread
From: Pierre Emeriaud @ 2013-07-04  5:20 UTC
  To: David Miller; +Cc: hannes, Sergei Shtylyov, netdev, yoshfuji

2013/7/4 David Miller <davem@davemloft.net>:
>>
>> [PATCH net v2] ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
>
> Applied and queued up for -stable, thanks everyone.

I just tested the patch and it works fine! The more-specific route is
now preferred over default even if the next-hop is not (yet) known in
the neighbor cache.

Thanks everyone!

-pierre.
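
As a way to picture the before and after behaviour described in this thread, here is a small userspace model of the route choice. It is a sketch under loud assumptions: route_model, prefix_match and select_route are invented names, the linear scan stands in for the kernel's much more involved lookup, and the usable flag simply represents the outcome of the neighbour check for each route.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct route_model {
	uint8_t prefix[16];	/* network byte order */
	unsigned int plen;	/* prefix length in bits, 0..128 */
	bool usable;		/* outcome of the neighbour check */
	const char *dev;
};

/* Return true if the first plen bits of addr equal those of prefix. */
static bool prefix_match(const uint8_t addr[16], const uint8_t prefix[16],
			 unsigned int plen)
{
	unsigned int full = plen / 8, rest = plen % 8;

	for (unsigned int i = 0; i < full; i++)
		if (addr[i] != prefix[i])
			return false;
	if (rest) {
		uint8_t mask = (uint8_t)(0xff << (8 - rest));

		if ((addr[full] & mask) != (prefix[full] & mask))
			return false;
	}
	return true;
}

/* Pick the longest matching prefix among usable routes, or NULL. */
static const struct route_model *select_route(const struct route_model *rt,
					      size_t n, const uint8_t dst[16])
{
	const struct route_model *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!rt[i].usable ||
		    !prefix_match(dst, rt[i].prefix, rt[i].plen))
			continue;
		if (!best || rt[i].plen > best->plen)
			best = &rt[i];
	}
	return best;
}
```

With a 2000::/3 entry on eth1 and a ::/0 entry on eth0, a lookup for 2a00:1450:4007:806::1001 falls back to the eth0 default while the /3 entry is marked unusable, and picks eth1 once it is marked usable. That mirrors the two "ip -6 route get" results in the original report.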
