netdev.vger.kernel.org archive mirror
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: hannes@stressinduktion.org
Cc: netdev@vger.kernel.org, yoshfuji@linux-ipv6.org,
	petrus.lt@gmail.com, davem@davemloft.net
Subject: Re: [PATCH RFC] ipv6: fix route selection if kernel is not compiled with CONFIG_IPV6_ROUTER_PREF
Date: Wed, 10 Jul 2013 16:30:35 +0200
Message-ID: <51DD700B.4060504@6wind.com>
In-Reply-To: <20130710134951.GE15411@order.stressinduktion.org>

On 10/07/2013 15:49, Hannes Frederic Sowa wrote:
> On Wed, Jul 10, 2013 at 03:17:41PM +0200, Hannes Frederic Sowa wrote:
>> On Wed, Jul 10, 2013 at 02:08:42PM +0200, Nicolas Dichtel wrote:
>>> On 10/07/2013 13:15, Hannes Frederic Sowa wrote:
>>>> On Wed, Jul 10, 2013 at 09:54:58AM +0200, Nicolas Dichtel wrote:
>>>>> On 09/07/2013 23:57, Hannes Frederic Sowa wrote:
>>>>>> Are we sure we decrement all siblings' rt6i_nsiblings? Shouldn't we
>>>>>> start iterating from fn->leaf? But this does not seem to cause it,
>>>>>> because my trace does not report any calls to fib6_del_route.
>>>>> Not sure I follow you, but all siblings are listed in rt6i_siblings, so
>>>>> that should be enough.
>>>>
>>>> My hunch was to iterate over fn->leaf->rt_next and compare the metrics like we
>>>> do when adding a new route. Then take that rt6_info->rt6i_siblings list_head
>>>> to iterate over the remaining siblings. But I did not review that part
>>>> carefully, need to check later.
>>>>
>>>>>> You could try to reproduce it by having an interface autoconfigured with
>>>>>> a default router with a NUD_VALID neighbour. I then added an unused vlan
>>>>>> interface (vid 100 in my case) and added the following ip addresses:
>>>>>>
>>>>>> ip -6 a a 2001:ffff::1/64 dev eth0.100
>>>>>> ip -6 r a 2000::/3 nexthop via 2001:ffff::30 nexthop via 2001:ffff::31 nexthop via 2001:ffff::32 nexthop via 2001:ffff::33
>>>>>>
>>>>>> (none of the nexthops should be reachable)
>>>>>>
>>>>>> After starting a ping6 2000::1 the box should panic soon, after the
>>>>>> first nexthop entry times out.
>>>>>>
>>>>>> Perhaps you could give me a hint?
>>>>> I will run some tests with your patch. Will see.
>>>>>
>>>>> I assume you didn't reproduce this without your patch.
>>>>
>>>> The current kernel does not correctly select more specific routes, so these
>>>> routes are not even tried and the logic should not be exercised.
>>>>
>>>> Ah, sorry, you should also compile your kernel without
>>>> CONFIG_IPV6_ROUTER_PREF if you try to reproduce it.
>>> I've done this.
>>>
>>> My conf (eth1 autoconfigured, I use net-next + your patch):
>>> vconfig add eth1 100
>>> ifconfig eth1.100 up
>>> ip -6 a a 2001:ffff::1/64 dev eth1.100
>>> ip -6 r a 2000::/3 nexthop via 2001:ffff::30 nexthop via 2001:ffff::31 nexthop via 2001:ffff::32 nexthop via 2001:ffff::33
>>> ping6 2000::1
>>
>> Hm, I see. I suspect something with timing. I, too, use net-next, with one
>> added function, dump_route, sprinkled in at a few points.
>>
>> When I copy&pasted your calls I could not reproduce it. After a reboot when
>> just applying the commands from my history (which I did a lot faster), I got
>> the panic again.
>>
>> I'll remove the dump_routes and recheck later.
>
> This patch on top
>
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -46,6 +46,16 @@
>   #define RT6_TRACE(x...) do { ; } while (0)
>   #endif
>
> +static void dump_route(struct rt6_info *rt, const char *prefix)
> +{
> +       u32 f = rt->rt6i_flags;
> +       struct rt6key *k = &rt->rt6i_dst;
> +       printk(KERN_INFO "%s: %p dst %pI6c plen %d gateway %pI6c, siblings %d, metric %d, expires %d gateway %d idev6 %p dev %p\n", prefix,
> +              rt, &k->addr, k->plen, &rt->rt6i_gateway, rt->rt6i_nsiblings, rt->rt6i_metric, f&RTF_EXPIRES, f&RTF_GATEWAY, rt->rt6i_idev, rt->dst.dev);
> +}
> +
> +
> +
>   static struct kmem_cache * fib6_node_kmem __read_mostly;
>
>   enum fib_walk_state_t
> @@ -693,8 +703,11 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
>                           */
>                          if (rt->rt6i_flags & RTF_GATEWAY &&
>                              !(rt->rt6i_flags & RTF_EXPIRES) &&
> -                           !(iter->rt6i_flags & RTF_EXPIRES))
> +                           !(iter->rt6i_flags & RTF_EXPIRES)) {
>                                  rt->rt6i_nsiblings++;
> +                               dump_route(rt, "(rt)");
> +                               dump_route(iter, "(iter)");
> +                       }
>                  }
>
>                  if (iter->rt6i_metric > rt->rt6i_metric)
> @@ -718,6 +731,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
>                          if (sibling->rt6i_metric == rt->rt6i_metric) {
>                                  list_add_tail(&rt->rt6i_siblings,
>                                                &sibling->rt6i_siblings);
> +                               dump_route(sibling, "(sibling)");
>                                  break;
>                          }
>                          sibling = sibling->dst.rt6_next;
> @@ -730,6 +744,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
>                  list_for_each_entry_safe(sibling, temp_sibling,
>                                           &rt->rt6i_siblings, rt6i_siblings) {
>                          sibling->rt6i_nsiblings++;
> +                       dump_route(sibling, "(sibling increment)");
>                          BUG_ON(sibling->rt6i_nsiblings != rt->rt6i_nsiblings);
>                          rt6i_nsiblings++;
>                  }
>
> produces this panic:
>
> [   59.234779] (rt): ffff880113242000 dst 2000::1 plen 128 gateway 2001:ffff::33, siblings 1, metric 0, expires 0 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [   59.243794] (iter): ffff880117e7b680 dst 2000::1 plen 128 gateway 2001:ffff::31, siblings 2, metric 0, expires 0 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [   59.261383] (rt): ffff880113242000 dst 2000::1 plen 128 gateway 2001:ffff::33, siblings 2, metric 0, expires 0 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [   59.270030] (iter): ffff880117e7bb00 dst 2000::1 plen 128 gateway 2001:ffff::32, siblings 2, metric 0, expires 0 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [   59.291933] (sibling): ffff880117e62480 dst 2000::1 plen 128 gateway 2001:ffff::30, siblings 2, metric 0, expires 4194304 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000
> [   59.306893] (sibling increment): ffff880117e62480 dst 2000::1 plen 128 gateway 2001:ffff::30, siblings 3, metric 0, expires 4194304 gateway 2 idev6 ffff8801131ab000 dev ffff88011816d000

I don't have the same output:
[   97.945170] (rt): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.948117] (iter): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 0, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.951207] (sibling): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 0, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.954272] (sibling increment): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.957545] (rt): f1a02c80 dst 2000:: plen 3 gateway 2001:660:1234:5678::32, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.960376] (iter): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.961902] (rt): f1a02c80 dst 2000:: plen 3 gateway 2001:660:1234:5678::32, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.963095] (iter): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.964354] (sibling): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.965604] (sibling increment): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.966916] (sibling increment): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.968254] (rt): f1a02b80 dst 2000:: plen 3 gateway 2001:660:1234:5678::33, siblings 1, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.969467] (iter): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.970702] (rt): f1a02b80 dst 2000:: plen 3 gateway 2001:660:1234:5678::33, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.971895] (iter): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.973137] (rt): f1a02b80 dst 2000:: plen 3 gateway 2001:660:1234:5678::33, siblings 3, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.974331] (iter): f1a02c80 dst 2000:: plen 3 gateway 2001:660:1234:5678::32, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.975542] (sibling): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 2, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.976808] (sibling increment): f1a02e80 dst 2000:: plen 3 gateway 2001:660:1234:5678::30, siblings 3, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.978126] (sibling increment): f1a02d80 dst 2000:: plen 3 gateway 2001:660:1234:5678::31, siblings 3, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000
[   97.979453] (sibling increment): f1a02c80 dst 2000:: plen 3 gateway 2001:660:1234:5678::32, siblings 3, metric 1024, expires 0 gateway 2 idev6 f7507e00 dev f7793000

Can you send me the output of:
ip -6 r
ip -6 a
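
To make the two traces easier to compare, here is a minimal userspace sketch in C of the counting step and the consistency check visible in the quoted fib6_add_rt2node hunks. It is a toy model, not kernel code: struct toy_route and toy_add_route are hypothetical names, the sibling list is simplified to "every existing route with the same metric", and the flag values are simply the ones the traces print (gateway 2, expires 4194304). Under those assumptions, a same-metric route carrying RTF_EXPIRES is skipped while counting but still has its counter bumped, which is one possible reading of why the quoted panic trace ends with a counter of 3 against 2. Without an expiring route, as in the plen 3 trace, the counters agree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RTF_GATEWAY 0x0002u
#define RTF_EXPIRES 0x00400000u /* 4194304, the "expires" value in the panic trace */

/* Hypothetical stand-in for struct rt6_info: only the fields the check uses. */
struct toy_route {
    unsigned int metric;
    unsigned int flags;
    unsigned int nsiblings;
};

/*
 * Toy version of the two steps shown in the quoted hunks:
 *  1. count same-metric routes as siblings of rt, skipping any pair in
 *     which either route has RTF_EXPIRES set;
 *  2. bump the counter of every same-metric route and compare it with
 *     rt's counter (the kernel uses BUG_ON for this comparison).
 * Assumption: all same-metric routes are treated as linked siblings.
 * Returns true when every counter matches rt->nsiblings.
 */
bool toy_add_route(struct toy_route *existing, size_t n, struct toy_route *rt)
{
    bool consistent = true;
    size_t i;

    assert(rt != NULL);

    for (i = 0; i < n; i++) {
        if (existing[i].metric == rt->metric &&
            (rt->flags & RTF_GATEWAY) &&
            !(rt->flags & RTF_EXPIRES) &&
            !(existing[i].flags & RTF_EXPIRES))
            rt->nsiblings++;
    }

    if (rt->nsiblings == 0)
        return true; /* no siblings, nothing to link or check */

    for (i = 0; i < n; i++) {
        if (existing[i].metric != rt->metric)
            continue;
        existing[i].nsiblings++;
        if (existing[i].nsiblings != rt->nsiblings)
            consistent = false; /* the kernel would hit BUG_ON here */
    }
    return consistent;
}
```

Feeding it three existing routes with nsiblings 2, the first of which has RTF_EXPIRES set, makes toy_add_route return false for a fourth route. Clearing that flag makes it return true with every counter at 3.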
