From: Ido Schimmel <idosch@idosch.org>
To: David Ahern <dsahern@gmail.com>
Cc: Ido Schimmel <idosch@mellanox.com>,
    netdev@vger.kernel.org, davem@davemloft.net,
    roopa@cumulusnetworks.com, nicolas.dichtel@6wind.com,
    mlxsw@mellanox.com
Subject: Re: [RFC PATCH net-next 03/19] ipv6: Clear nexthop flags upon netdev up
Date: Wed, 3 Jan 2018 19:40:25 +0200
Message-ID: <20180103174025.GA6584@splinter>
In-Reply-To: <2a815a43-a9e9-10bd-7b4b-a3996b6952d2@gmail.com>

On Wed, Jan 03, 2018 at 09:56:02AM -0700, David Ahern wrote:
> On 1/3/18 9:43 AM, Ido Schimmel wrote:
> > On Wed, Jan 03, 2018 at 08:32:51AM -0700, David Ahern wrote:
> >> On 1/3/18 12:44 AM, Ido Schimmel wrote:
> >>>>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> >>>>> index ed06b1190f05..b6405568ed7b 100644
> >>>>> --- a/net/ipv6/addrconf.c
> >>>>> +++ b/net/ipv6/addrconf.c
> >>>>> @@ -3484,6 +3484,9 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event,
> >>>>>  			if (run_pending)
> >>>>>  				addrconf_dad_run(idev);
> >>>>>
> >>>>> +			/* Device has an address by now */
> >>>>> +			rt6_sync_up(dev, RTNH_F_DEAD);
> >>>>> +
> >>>>
> >>>> Seems like this should be in the NETDEV_UP section, say after
> >>>> addrconf_permanent_addr.
> >>>
> >>> Unless the `keep_addr_on_down` sysctl is set, then at this stage the
> >>> netdev doesn't have an IP address and we shouldn't clear the dead flag
> >>> just yet.
> >>>
> >>> This is consistent with IPv4 that clears the dead flag from nexthops in
> >>> a multipath route only if the nexthop device has an IP address. When the
> >>> last IPv4 address is removed from a netdev all the routes using it are
> >>> flushed and there's nothing to clear upon NETDEV_UP.
> >>
> >> I have a bug about that IPv4 handling from the FRR team:
> >>
> >> $ ip link add dummy1 type dummy
> >> $ ip li set dummy1 up
> >> $ ip route add 1.1.1.0/24 dev dummy1
> >>
> >> $ ip addr add dev dummy1 2.2.2.1/24
> >> $ ip ro ls | grep dummy1
> >> 1.1.1.0/24 dev dummy1 scope link
> >> 2.2.2.0/24 dev dummy1 proto kernel scope link src 2.2.2.1
> >>
> >> $ ip addr del dev dummy1 2.2.2.1/24
> >> $ ip ro ls | grep dummy1
> >> <no output>
> >>
> >> The 1.1.1.0/24 route was removed as well as the 2.2.2.0 connected route.
> >
> > If you're going to skip the flushing in this case, at least mark the
> > nexthops as dead.
>
> On a down event, yes. If the device is still up then a route such as:
> $ ip route add 1.1.1.0/24 dev dummy1
> should still be usable even without an address on it.

mlxsw will trap all the packets hitting the route until you assign an IP
address to dummy1.

> > And this is my second reason to have rt6_sync_up() where I put it. I'm
> > preparing another set which sends FIB_EVENT_NH_ADD events from
> > rt6_sync_up() similar to what we've in fib_sync_up(). When mlxsw (others
>
> On a tangent here, but I have been meaning to ask why you have
> FIB_EVENT_NH_ADD events as opposed to handling netdev events. What does
> a FIB_EVENT_NH_ADD provide that you can't do from a netdev event handler?

Handling netdev events directly would make switch drivers more complex
than they already are. Why should every driver have to duplicate the
logic in call_fib_nh_notifiers()?

> > in the future) processes the event it needs to add the nexthop back to
> > the forwarding plane. To do that, it needs to have a RIF for the
> > nexthop device. For the nexthop device to have a RIF, it needs at least
> > one IP address configured on the netdev.
>
> Why is that?
> $ ip addr sh dev swp1s0.51
> 44: swp1s0.51@swp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
> noqueue master vrf1101 state UP group default qlen 1000
> link/ether 7c:fe:90:e8:3a:7d brd ff:ff:ff:ff:ff:ff
>
> $ ip ro add vrf vrf1101 1.1.1.0/24 dev swp1s0.51
>
> $ ip ro ls vrf vrf1101
> unreachable default metric 8192
> 1.1.1.0/24 dev swp1s0.51 scope link offload
>
> In this case, I take it mlxsw allocates a rif because of the vlan. The
> above does not work on just swp1s0 -- ie., that route is not offloaded:
>
> $ # ip ro ls
> ...
> 1.1.1.0/24 dev swp1s0 scope link
> ...
>
> Interesting.

It allocates the RIF because of the enslavement to a VRF, which is an
explicit indication the user wants to use the interface for L3
forwarding.
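
To make the rule concrete, here is a toy model in plain C of the
behaviour described in this thread. It is not mlxsw code: struct
toy_netdev, toy_has_rif() and toy_route_action() are made-up names for
illustration, and the model only encodes the two conditions mentioned
here (an IP address on the netdev, or enslavement to a VRF). The real
driver may well take more into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a netdev; not a kernel structure. */
struct toy_netdev {
	const char *name;
	int num_addrs;		/* IP addresses configured on the netdev */
	bool vrf_slave;		/* true if enslaved to a VRF device */
};

enum toy_action {
	TOY_TRAP,		/* packets hitting the route are trapped */
	TOY_OFFLOAD,		/* route is forwarded by the hardware */
};

/* In this model a RIF exists if the netdev has an address or a VRF master. */
static bool toy_has_rif(const struct toy_netdev *dev)
{
	assert(dev != NULL);
	return dev->num_addrs > 0 || dev->vrf_slave;
}

/* A route through a netdev without a RIF cannot be offloaded. */
static enum toy_action toy_route_action(const struct toy_netdev *dev)
{
	return toy_has_rif(dev) ? TOY_OFFLOAD : TOY_TRAP;
}
```

In this model, swp1s0 from the example above (no address, no VRF) yields
TOY_TRAP, while swp1s0.51 (enslaved to vrf1101) yields TOY_OFFLOAD even
though it has no address either.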

David, can we please get back to the issue at hand? What's the problem
with the location of the call to rt6_sync_up()?
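
For reference, the ordering argument made earlier in the thread can be
sketched as a small stand-alone C model. This is only an illustration
under stated assumptions, not the patch: struct toy_dev, struct toy_nh,
TOY_NH_DEAD and toy_sync_up() are hypothetical names, and the address
check is folded into the helper for the sake of the sketch. In the
patch the same effect comes from where rt6_sync_up() is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stands in for RTNH_F_DEAD; the value here is purely illustrative. */
#define TOY_NH_DEAD 0x1u

/* Hypothetical, simplified netdev and nexthop; not kernel structures. */
struct toy_dev {
	bool up;
	int num_addrs;		/* addresses currently on the device */
};

struct toy_nh {
	const struct toy_dev *dev;
	unsigned int flags;
};

/*
 * Clear the dead flag on every nexthop that uses @dev, but only once
 * the device is up and has at least one address. Returns the number
 * of nexthops whose dead flag was cleared.
 */
static int toy_sync_up(struct toy_nh *nhs, size_t n, const struct toy_dev *dev)
{
	int revived = 0;

	assert(nhs != NULL && dev != NULL);

	/* No address yet: too early to clear the dead flag. */
	if (!dev->up || dev->num_addrs == 0)
		return 0;

	for (size_t i = 0; i < n; i++) {
		if (nhs[i].dev == dev && (nhs[i].flags & TOY_NH_DEAD)) {
			nhs[i].flags &= ~TOY_NH_DEAD;
			revived++;
		}
	}
	return revived;
}
```

Calling toy_sync_up() right after the device comes up, while num_addrs
is still 0, leaves the nexthops dead. Calling it again once an address
is present clears the flag, which is the ordering the patch aims for.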