From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: hannes@stressinduktion.org
Cc: netdev@vger.kernel.org, yoshfuji@linux-ipv6.org,
petrus.lt@gmail.com, davem@davemloft.net
Subject: Re: [PATCH RFC] ipv6: fix route selection if kernel is not compiled with CONFIG_IPV6_ROUTER_PREF
Date: Wed, 10 Jul 2013 16:30:35 +0200
Message-ID: <51DD700B.4060504@6wind.com>
In-Reply-To: <20130710134951.GE15411@order.stressinduktion.org>

On 10/07/2013 15:49, Hannes Frederic Sowa wrote:
> On Wed, Jul 10, 2013 at 03:17:41PM +0200, Hannes Frederic Sowa wrote:
>> On Wed, Jul 10, 2013 at 02:08:42PM +0200, Nicolas Dichtel wrote:
>>> On 10/07/2013 13:15, Hannes Frederic Sowa wrote:
>>>> On Wed, Jul 10, 2013 at 09:54:58AM +0200, Nicolas Dichtel wrote:
>>>>> On 09/07/2013 23:57, Hannes Frederic Sowa wrote:
>>>>>> Are we sure we decrement all sibling's rt6i_nsiblings? Shouldn't we
>>>>>> start iterating from fn->leaf? But this does not seem to cause it,
>>>>>> because my trace does not report any calls to fib6_del_route.
>>>>> Not sure I follow you, but all siblings are listed in rt6i_siblings, so
>>>>> that should be enough.
>>>>
>>>> My hunch was to iterate over fn->leaf->rt_next and compare the metrics like
>>>> we do when adding a new route. Then take that rt6_info->rt6i_siblings
>>>> list_head to iterate over the remaining siblings. But I did not review that
>>>> part carefully, need to check later.
>>>>
>>>>>> You could try to reproduce it by having an interface autoconfigured with
>>>>>> a default router with a NUD_VALID neighbour. I then added an unused vlan
>>>>>> interface (vid 100 in my case) and added the following ip addresses:
>>>>>>
>>>>>> ip -6 a a 2001:ffff::1/64 dev eth0.100
>>>>>> ip -6 r a 2000::/3 nexthop via 2001:ffff::30 nexthop via 2001:ffff::31 \
>>>>>>     nexthop via 2001:ffff::32 nexthop via 2001:ffff::33
>>>>>>
>>>>>> (none of the nexthops should be reachable)
>>>>>>
>>>>>> After starting a ping6 2000::1 the box should panic soon, after the
>>>>>> first nexthop entry times out.
>>>>>>
>>>>>> Perhaps you could give me a hint?
>>>>> I will run some tests with your patch. Will see.
>>>>>
>>>>> I assume you didn't reproduce this without your patch.
>>>>
>>>> The current kernel does not correctly select more specific routes, so these
>>>> routes are not even tried and the logic should not be exercised.
>>>>
>>>> Ah, sorry, you should also compile your kernel without
>>>> CONFIG_IPV6_ROUTER_PREF if you try to reproduce it.
>>> I've done this.
>>>
>>> My conf (eth1 autoconfigured, I use net-next + your patch):
>>> vconfig add eth1 100
>>> ifconfig eth1.100 up
>>> ip -6 a a 2001:ffff::1/64 dev eth1.100
>>> ip -6 r a 2000::/3 nexthop via 2001:ffff::30 nexthop via 2001:ffff::31 \
>>>     nexthop via 2001:ffff::32 nexthop via 2001:ffff::33
>>> ping6 2000::1
>>
>> Hm, I see. I suspect a timing issue. I also use net-next, with one added
>> function, dump_route, sprinkled at a few points.
>>
>> When I copy&pasted your calls I could not reproduce it. After a reboot when
>> just applying the commands from my history (which I did a lot faster), I got
>> the panic again.
>>
>> I'll remove the dump_routes and recheck later.
>
> This patch on top
>
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -46,6 +46,16 @@
> #define RT6_TRACE(x...) do { ; } while (0)
> #endif
>
> +static void dump_route(struct rt6_info *rt, const char *prefix)
> +{
> + u32 f = rt->rt6i_flags;
> + struct rt6key *k = &rt->rt6i_dst;
> + printk(KERN_INFO "%s: %p dst %pI6c plen %d gateway %pI6c, siblings %d, metric %d, expires %d gateway %d idev6 %p dev %p\n", prefix,
> + rt, &k->addr, k->plen, &rt->rt6i_gateway, rt->rt6i_nsiblings, rt->rt6i_metric, f&RTF_EXPIRES, f&RTF_GATEWAY, rt->rt6i_idev, rt->dst.dev);
> +}
> +
> +
> +
> static struct kmem_cache * fib6_node_kmem __read_mostly;
>
> enum fib_walk_state_t
> @@ -693,8 +703,11 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
> */
> if (rt->rt6i_flags & RTF_GATEWAY &&
> !(rt->rt6i_flags & RTF_EXPIRES) &&
> - !(iter->rt6i_flags & RTF_EXPIRES))
> + !(iter->rt6i_flags & RTF_EXPIRES)) {
> rt->rt6i_nsiblings++;
> + dump_route(rt, "(rt)");
> + dump_route(iter, "(iter)");
> + }
> }
>
> if (iter->rt6i_metric > rt->rt6i_metric)
> @@ -718,6 +731,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
> if (sibling->rt6i_metric == rt->rt6i_metric) {
> list_add_tail(&rt->rt6i_siblings,
> &sibling->rt6i_siblings);
> + dump_route(sibling, "(sibling)");
> break;
> }
> sibling = sibling->dst.rt6_next;
> @@ -730,6 +744,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
> list_for_each_entry_safe(sibling, temp_sibling,
> &rt->rt6i_siblings, rt6i_siblings) {
> sibling->rt6i_nsiblings++;
> + dump_route(sibling, "(sibling increment)");
> BUG_ON(sibling->rt6i_nsiblings != rt->rt6i_nsiblings);
> rt6i_nsiblings++;
> }
>
> produces this panic:
>
> [ 59.234779] (rt): ffff880113242000 dst 2000::1 plen 128 gateway 2001:ffff::33, siblings 1, metric 0, expires 0 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [ 59.243794] (iter): ffff880117e7b680 dst 2000::1 plen 128 gateway 2001:ffff::31, siblings 2, metric 0, expires 0 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [ 59.261383] (rt): ffff880113242000 dst 2000::1 plen 128 gateway 2001:ffff::33, siblings 2, metric 0, expires 0 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [ 59.270030] (iter): ffff880117e7bb00 dst 2000::1 plen 128 gateway 2001:ffff::32, siblings 2, metric 0, expires 0 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [ 59.291933] (sibling): ffff880117e62480 dst 2000::1 plen 128 gateway 2001:ffff::30, siblings 2, metric 0, expires 4194304 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [ 59.306893] (sibling increment): ffff880117e62480 dst 2000::1 plen 128 gateway 2001:ffff::30, siblings 3, metric 0, expires 4194304 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
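
A note on reading these dumps: dump_route() prints f&RTF_EXPIRES and
f&RTF_GATEWAY unshifted, so "expires 4194304" is 0x00400000 (RTF_EXPIRES set)
and "gateway 2" is RTF_GATEWAY. In the trace above, only the ::30 entry has
RTF_EXPIRES set. Below is a small userspace sketch (not part of the patch) that
decodes the two fields and repeats the check guarding rt->rt6i_nsiblings++ in
the quoted hunk. flags_from_dump() and counted_as_sibling() are hypothetical
names used only for illustration; the flag values are taken from
include/uapi/linux/ipv6_route.h.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/ipv6_route.h. */
#define RTF_GATEWAY 0x0002u
#define RTF_EXPIRES 0x00400000u

/*
 * Hypothetical helper (not in the patch): rebuild the two flag bits from
 * the "expires %d" and "gateway %d" fields printed by dump_route().
 */
static uint32_t flags_from_dump(uint32_t expires_field, uint32_t gateway_field)
{
	return (expires_field & RTF_EXPIRES) | (gateway_field & RTF_GATEWAY);
}

/*
 * Same condition as the one guarding rt->rt6i_nsiblings++ in the quoted
 * hunk: rt must be a non-expiring gateway route and iter must not expire.
 * Returns 1 if iter would be counted as a sibling of rt, 0 otherwise.
 */
static int counted_as_sibling(uint32_t rt_flags, uint32_t iter_flags)
{
	return (rt_flags & RTF_GATEWAY) &&
	       !(rt_flags & RTF_EXPIRES) &&
	       !(iter_flags & RTF_EXPIRES);
}
```

With the ::30 line from the trace (expires 4194304, gateway 2),
counted_as_sibling() returns 0 for a non-expiring gateway rt, which is
consistent with (rt) being dumped only against ::31 and ::32.
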
I don't have the same output:
[ 97.945170] (rt): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.948117] (iter): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 0, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.951207] (sibling): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 0, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.954272] (sibling increment): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.957545] (rt): f1a02c80 dst 2000:: plen 3 gateway 2001:660:1234:5678::32, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.960376] (iter): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.961902] (rt): f1a02c80 dst 2000:: plen 3 gateway 2001:660:1234:5678::32, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.963095] (iter): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.964354] (sibling): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.965604] (sibling increment): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.966916] (sibling increment): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.968254] (rt): f1a02b80 dst 2000:: plen 3 gateway 2001:660:1234:5678::33, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.969467] (iter): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.970702] (rt): f1a02b80 dst 2000:: plen 3 gateway 2001:660:1234:5678::33, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.971895] (iter): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.973137] (rt): f1a02b80 dst 2000:: plen 3 gateway 2001:660:1234:5678::33, siblings 3, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.974331] (iter): f1a02c80 dst 2000:: plen 3 gateway 2001:660:1234:5678::32, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.975542] (sibling): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.976808] (sibling increment): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 3, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.978126] (sibling increment): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 3, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[ 97.979453] (sibling increment): f1a02c80 dst 2000:: plen 3 gateway 2001:660:1234:5678::32, siblings 3, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000

Can you send me the output of:
ip -6 r
ip -6 a
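
One possible reading of the difference between the two traces, offered as a
guess and not a confirmed diagnosis: in the quoted hunks the counting step
skips entries with RTF_EXPIRES, while the linking step matches on metric only.
If an expiring entry is already on the sibling list, the counters could then
disagree at the BUG_ON(). The userspace model below is a sketch under that
assumption, not kernel code. struct model_rt and model_add() are hypothetical,
and the model further assumes that every same-metric entry of the node sits on
one sibling list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/ipv6_route.h. */
#define RTF_GATEWAY 0x0002u
#define RTF_EXPIRES 0x00400000u

/* Simplified stand-in for struct rt6_info (model only). */
struct model_rt {
	uint32_t flags;
	uint32_t metric;
	unsigned int nsiblings;
};

/*
 * Model of the steps visible in the quoted fib6_add_rt2node() hunks:
 *  1. count: rt->nsiblings++ per same-metric entry, skipping expiring ones;
 *  2. link:  join the sibling list, matched on metric only;
 *  3. bump:  increment each list member and compare with rt->nsiblings.
 * Assumption: all same-metric entries in node[] are on one sibling list.
 * Returns 1 if the BUG_ON() comparison would fail, 0 otherwise.
 */
static int model_add(struct model_rt *node, size_t n, struct model_rt *rt)
{
	size_t i;

	/* Step 1: count siblings, ignoring entries that expire. */
	rt->nsiblings = 0;
	for (i = 0; i < n; i++) {
		if (node[i].metric == rt->metric &&
		    (rt->flags & RTF_GATEWAY) &&
		    !(rt->flags & RTF_EXPIRES) &&
		    !(node[i].flags & RTF_EXPIRES))
			rt->nsiblings++;
	}
	if (rt->nsiblings == 0)
		return 0;	/* no siblings counted, nothing is linked */

	/* Steps 2 and 3: the list is matched on metric only. */
	for (i = 0; i < n; i++) {
		if (node[i].metric != rt->metric)
			continue;
		node[i].nsiblings++;
		if (node[i].nsiblings != rt->nsiblings)
			return 1;	/* BUG_ON() would fire here */
	}
	return 0;
}
```

Fed with three non-expiring entries at metric 1024 (as in my trace), the model
ends with every counter at 3 and no mismatch. Fed with an expiring ::30 entry
ahead of ::31 and ::32, all at nsiblings 2 (as in your trace), rt is counted up
to 2 while ::30 is bumped to 3, and the model reports the mismatch.
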