From: Ido Schimmel <idosch@idosch.org>
To: Eric Dumazet <eric.dumazet@gmail.com>, Thomas.Winter@alliedtelesis.co.nz
Cc: David Ahern <dsahern@gmail.com>,
	Ido Schimmel <idosch@mellanox.com>,
	netdev@vger.kernel.org, davem@davemloft.net,
	roopa@cumulusnetworks.com, nikolay@cumulusnetworks.com,
	pch@ordbogen.com, jkbs@redhat.com, yoshfuji@linux-ipv6.org,
	mlxsw@mellanox.com
Subject: Re: [PATCH net-next 1/4] ipv6: Calculate hash thresholds for IPv6 nexthops
Date: Wed, 2 May 2018 21:53:10 +0300
Message-ID: <20180502185310.GA31998@splinter>
In-Reply-To: <20180502175244.GA14587@splinter>

On Wed, May 02, 2018 at 08:52:44PM +0300, Ido Schimmel wrote:
> On Wed, May 02, 2018 at 08:21:06PM +0300, Ido Schimmel wrote:
> > On Wed, May 02, 2018 at 09:43:50AM -0700, Eric Dumazet wrote:
> > > 
> > > 
> > > On 01/09/2018 07:43 PM, David Ahern wrote:
> > > > On 1/9/18 7:40 AM, Ido Schimmel wrote:
> > > >> Before we convert IPv6 to use hash-threshold instead of modulo-N, we
> > > >> first need each nexthop to store its region boundary in the hash
> > > >> function's output space.
> > > >>
> > > >> The boundary is calculated by dividing the output space equally between
> > > >> the active nexthops, that is, nexthops that are neither dead nor
> > > >> linkdown.
> > > >>
> > > >> The boundaries are rebalanced whenever a nexthop is added to or removed
> > > >> from a multipath route and whenever a nexthop becomes active or inactive.
> > > >>
> > > >> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> > > >> ---
> > > >>  include/net/ip6_fib.h   |  1 +
> > > >>  include/net/ip6_route.h |  7 ++++
> > > >>  net/ipv6/ip6_fib.c      |  8 ++---
> > > >>  net/ipv6/route.c        | 96 +++++++++++++++++++++++++++++++++++++++++++++++++
> > > >>  4 files changed, 106 insertions(+), 6 deletions(-)
> > > >>
> > > > 
> > > > LGTM.
> > > > Acked-by: David Ahern <dsahern@gmail.com>
> > > > 
> > > 
> > > For some reason I get a divide-by-zero error when booting my hosts with the latest net tree.
> > > 
> > > What guarantee do we have that total is not zero when rt6_upper_bound_set() is called?
> > 
> > Thanks for the report, Eric. I believe I didn't cover all the cases and
> > 'rt6i_nh_weight' might be 0 in some cases. I'll try to reproduce and
> > work on a fix.
> 
> Hmmm, I think it's due to commit edd7ceb78296 ("ipv6: Allow non-gateway
> ECMP for IPv6") which allows routes without a gateway (such as those
> configured using SLAAC) to have siblings.
> 
> Can you please check if reverting the patch / applying the below fixes
> the issue?

So this fixes the issue for me. To reproduce:

# ip -6 address add 2001:db8::1/64 dev dummy0
# ip -6 address add 2001:db8::1/64 dev dummy1

This reproduces the issue because, due to the above commit, both local
routes are considered siblings... :/

local 2001:db8::1 proto kernel metric 0 
        nexthop dev dummy0 weight 1 
        nexthop dev dummy1 weight 1 pref medium

I think it's best to revert the patch and have Thomas submit a fixed
version to net-next. I was actually surprised to see it applied to net.

> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index f4d61736c41a..129dd4f4b264 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -3606,6 +3606,7 @@ struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev,
>  	rt->dst.input = ip6_input;
>  	rt->dst.output = ip6_output;
>  	rt->rt6i_idev = idev;
> +	rt->rt6i_nh_weight = 1;
>  
>  	rt->rt6i_protocol = RTPROT_KERNEL;
>  	rt->rt6i_flags = RTF_UP | RTF_NONEXTHOP;
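
For illustration only, here is a minimal userspace sketch of the
hash-threshold split described in the quoted commit message, and of why a
total weight of zero is a problem. It is not the code in net/ipv6/route.c:
struct nh, rebalance() and select_nh() are hypothetical names, and the
rounding is an assumption. Each active nexthop gets an inclusive upper
bound in a 31-bit hash space, in proportion to its weight. When every
sibling has weight 0 (the role 'rt6i_nh_weight' plays for the two local
routes above), the total is 0, so the sketch refuses to divide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a multipath sibling; not struct rt6_info. */
struct nh {
	int weight;          /* plays the role of rt6i_nh_weight */
	int active;          /* 0 if the nexthop is dead or linkdown */
	int32_t upper_bound; /* last hash value owned by this nexthop, -1 if none */
};

/*
 * Split the hash space [0, 2^31 - 1] between the active nexthops in
 * proportion to their weights. Returns -1 without dividing when the
 * total weight is 0, which is the case an unguarded division trips over.
 */
static int rebalance(struct nh *nhs, int n)
{
	uint64_t total = 0, acc = 0;
	int i;

	assert(n >= 0);

	for (i = 0; i < n; i++)
		if (nhs[i].active)
			total += (uint64_t)nhs[i].weight;

	if (total == 0)
		return -1;

	for (i = 0; i < n; i++) {
		if (!nhs[i].active) {
			nhs[i].upper_bound = -1; /* never matches a hash >= 0 */
			continue;
		}
		acc += (uint64_t)nhs[i].weight;
		/* Round to closest, then subtract one for an inclusive bound. */
		nhs[i].upper_bound =
			(int32_t)((int64_t)(((acc << 31) + total / 2) / total) - 1);
	}
	return 0;
}

/* Pick the first active nexthop whose region contains the 31-bit hash. */
static int select_nh(const struct nh *nhs, int n, int32_t hash)
{
	int i;

	for (i = 0; i < n; i++)
		if (nhs[i].active && hash <= nhs[i].upper_bound)
			return i;
	return -1;
}
```

With two active nexthops of weight 1 each, rebalance() assigns the bounds
1073741823 and 2147483647, so select_nh() maps the lower half of the hash
space to the first nexthop and the upper half to the second. With both
weights left at 0, rebalance() returns -1 instead of dividing by zero.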


Thread overview: 24+ messages
2018-01-09 14:40 [PATCH net-next 0/4] ipv6: Add support for non-equal-cost multipath Ido Schimmel
2018-01-09 14:40 ` [PATCH net-next 1/4] ipv6: Calculate hash thresholds for IPv6 nexthops Ido Schimmel
2018-01-10  3:43   ` David Ahern
2018-05-02 16:43     ` Eric Dumazet
2018-05-02 17:21       ` Ido Schimmel
2018-05-02 17:52         ` Ido Schimmel
2018-05-02 18:53           ` Ido Schimmel [this message]
2018-05-02 18:58             ` David Ahern
2018-05-02 19:04               ` Ido Schimmel
2018-05-02 20:48                 ` Thomas Winter
2018-05-02 20:56                   ` David Ahern
2018-05-04  1:13                     ` David Ahern
2018-01-09 14:40 ` [PATCH net-next 2/4] ipv6: Use a 31-bit multipath hash Ido Schimmel
2018-01-10  3:43   ` David Ahern
2018-01-09 14:40 ` [PATCH net-next 3/4] ipv6: Use hash-threshold instead of modulo-N Ido Schimmel
2018-01-10  3:54   ` David Ahern
2018-01-10 12:02     ` Ido Schimmel
2018-01-09 14:40 ` [PATCH net-next 4/4] ipv6: Add support for non-equal-cost multipath Ido Schimmel
2018-01-10  3:48   ` David Ahern
2018-01-10 11:47     ` Ido Schimmel
2018-01-10 15:53       ` David Ahern
2018-01-10  4:38 ` [PATCH net-next 0/4] " David Ahern
2018-01-10 12:31   ` Ido Schimmel
2018-01-10 20:15 ` David Miller
