* "ip route show dev enp0s9" does not show all routes for enp0s9
@ 2024-06-28 0:01 Muggeridge, Matt
2024-06-28 2:36 ` Stephen Hemminger
0 siblings, 1 reply; 9+ messages in thread
From: Muggeridge, Matt @ 2024-06-28 0:01 UTC (permalink / raw)
To: netdev@vger.kernel.org
Hi,
This looks like a problem in "iproute2". This was observed on a fresh install of Ubuntu 24.04, with Linux 6.8.0-36-generic.
NOTE: I first raised this in https://bugs.launchpad.net/ubuntu/+source/iproute2/+bug/2070412, then later found https://github.com/iproute2/iproute2/blob/main/README.devel.
* PROBLEM
Compare the outputs:
$ ip -6 route show dev enp0s9
2001:2:0:1000::/64 proto ra metric 1024 expires 65518sec pref medium
fe80::/64 proto kernel metric 256 pref medium
$ ip -6 route
2001:2:0:1000::/64 dev enp0s9 proto ra metric 1024 expires 65525sec pref medium
fe80::/64 dev enp0s3 proto kernel metric 256 pref medium
fe80::/64 dev enp0s9 proto kernel metric 256 pref medium
default proto ra metric 1024 expires 589sec pref medium
	nexthop via fe80::200:10ff:fe10:1060 dev enp0s9 weight 1
	nexthop via fe80::200:10ff:fe10:1061 dev enp0s9 weight 1
The default route is associated with enp0s9, yet the first command above does not show it.
FWIW, the two default route entries were created by two separate routers on the network, each sending their RA.
* REPRODUCER
Statically configure systemd-networkd with two route entries, similar to the following:
$ networkctl cat 10-enp0s9.network
# /etc/systemd/network/10-enp0s9.network
[Match]
Name=enp0s9
[Link]
RequiredForOnline=no
[Network]
Description="Internal Network: Private VM-to-VM IPv6 interface"
DHCP=no
LLDP=no
EmitLLDP=no
# /etc/systemd/network/10-enp0s9.network.d/address.conf
[Network]
Address=2001:2:0:1000:a00:27ff:fe5f:f72d/64
# /etc/systemd/network/10-enp0s9.network.d/route-1060.conf
[Route]
Gateway=fe80::200:10ff:fe10:1060
GatewayOnLink=true
# /etc/systemd/network/10-enp0s9.network.d/route-1061.conf
[Route]
Gateway=fe80::200:10ff:fe10:1061
GatewayOnLink=true
Now reload and reconfigure the interface and you will see two routes.
$ networkctl reload
$ networkctl reconfigure enp0s9
$ ip -6 r
$ ip -6 r show dev enp0s9 # the routes are not shown
Matt.
* Re: "ip route show dev enp0s9" does not show all routes for enp0s9
2024-06-28 0:01 "ip route show dev enp0s9" does not show all routes for enp0s9 Muggeridge, Matt
@ 2024-06-28 2:36 ` Stephen Hemminger
2024-06-28 2:54 ` Muggeridge, Matt
0 siblings, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2024-06-28 2:36 UTC (permalink / raw)
To: Muggeridge, Matt; +Cc: netdev@vger.kernel.org
"Don't blame the messenger", the ip command only reports what the kernel
sends. So it is likely a route semantics issue in the kernel.
* RE: "ip route show dev enp0s9" does not show all routes for enp0s9
2024-06-28 2:36 ` Stephen Hemminger
@ 2024-06-28 2:54 ` Muggeridge, Matt
2024-06-30 10:39 ` Ido Schimmel
0 siblings, 1 reply; 9+ messages in thread
From: Muggeridge, Matt @ 2024-06-28 2:54 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev@vger.kernel.org
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Friday, June 28, 2024 12:37 PM
>
> "Don't blame the messenger", the ip command only reports what the kernel
> sends. So it is likely a route semantics issue in the kernel.
Thanks Stephen.
Ok, I have reported it on my distro in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2071406.
I guess the kernel netdev folks will see this thread and can comment too?
Cheers,
Matt.
* Re: "ip route show dev enp0s9" does not show all routes for enp0s9
2024-06-28 2:54 ` Muggeridge, Matt
@ 2024-06-30 10:39 ` Ido Schimmel
2024-06-30 16:23 ` Stephen Hemminger
0 siblings, 1 reply; 9+ messages in thread
From: Ido Schimmel @ 2024-06-30 10:39 UTC (permalink / raw)
To: Muggeridge, Matt; +Cc: Stephen Hemminger, netdev@vger.kernel.org
On Fri, Jun 28, 2024 at 02:54:58AM +0000, Muggeridge, Matt wrote:
> Thanks Stephen.
>
> Ok, I have reported it on my distro in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2071406.
>
> I guess the kernel netdev folks will see this thread and can comment too?
The problem seems to be in iproute2 and not in the kernel. Both IPv4 and
IPv6 will dump the route if at least one of the nexthop devices is the
one specified by user space. You can see the routes in the strace output
below.
ip link add name dummy1 up type dummy
ip link add name dummy2 up type dummy
ip address add 192.0.2.1/28 dev dummy1
ip address add 192.0.2.17/28 dev dummy2
ip address add 2001:db8:1::1/64 dev dummy1
ip address add 2001:db8:2::1/64 dev dummy2
ip route add 198.51.100.0/24 nexthop via 192.0.2.2 dev dummy1 nexthop via 192.0.2.18 dev dummy2
ip route add 2001:db8:10::/64 nexthop via 2001:db8:1::2 dev dummy1 nexthop via 2001:db8:2::2 dev dummy2
# strace -e network ip -4 route show dev dummy1
[...]
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=60, nlmsg_type=RTM_NEWROUTE, nlmsg_flags=NLM_F_MULTI|NLM_F_DUMP_FILTERED, nlmsg_seq=1719737009, nlmsg_pid=704}, {rtm_family=AF_INET, rtm_dst_len=28, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_KERNEL, rtm_scope=RT_SCOPE_LINK, rtm_type=RTN_UNICAST, rtm_flags=0}, [[{nla_len=8, nla_type=RTA_TABLE}, RT_TABLE_MAIN], [{nla_len=8, nla_type=RTA_DST}, inet_addr("192.0.2.0")], [{nla_len=8, nla_type=RTA_PREFSRC}, inet_addr("192.0.2.1")], [{nla_len=8, nla_type=RTA_OIF}, if_nametoindex("dummy1")]]], [{nlmsg_len=80, nlmsg_type=RTM_NEWROUTE, nlmsg_flags=NLM_F_MULTI|NLM_F_DUMP_FILTERED, nlmsg_seq=1719737009, nlmsg_pid=704}, {rtm_family=AF_INET, rtm_dst_len=24, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_BOOT, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNICAST, rtm_flags=0}, [[{nla_len=8, nla_type=RTA_TABLE}, RT_TABLE_MAIN], [{nla_len=8, nla_type=RTA_DST}, inet_addr("198.51.100.0")], [{nla_len=36, nla_type=RTA_MULTIPATH}, [[{rtnh_len=16, rtnh_flags=0, rtnh_hops=0, rtnh_ifindex=if_nametoindex("dummy1")}, [{nla_len=8, nla_type=RTA_GATEWAY}, inet_addr("192.0.2.2")]], [{rtnh_len=16, rtnh_flags=0, rtnh_hops=0, rtnh_ifindex=if_nametoindex("dummy2")}, [{nla_len=8, nla_type=RTA_GATEWAY}, inet_addr("192.0.2.18")]]]]]]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 140
# strace -e network ip -6 route show dev dummy1
[...]
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=116, nlmsg_type=RTM_NEWROUTE, nlmsg_flags=NLM_F_MULTI|NLM_F_DUMP_FILTERED, nlmsg_seq=1719737009, nlmsg_pid=708}, {rtm_family=AF_INET6, rtm_dst_len=64, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_KERNEL, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNICAST, rtm_flags=0}, [[{nla_len=8, nla_type=RTA_TABLE}, RT_TABLE_MAIN], [{nla_len=20, nla_type=RTA_DST}, inet_pton(AF_INET6, "2001:db8:1::")], [{nla_len=8, nla_type=RTA_PRIORITY}, 256], [{nla_len=8, nla_type=RTA_OIF}, if_nametoindex("dummy1")], [{nla_len=36, nla_type=RTA_CACHEINFO}, {rta_clntref=0, rta_lastuse=0, rta_expires=0, rta_error=0, rta_used=0, rta_id=0, rta_ts=0, rta_tsage=0}], [{nla_len=5, nla_type=RTA_PREF}, 0]]], [{nlmsg_len=168, nlmsg_type=RTM_NEWROUTE, nlmsg_flags=NLM_F_MULTI|NLM_F_DUMP_FILTERED, nlmsg_seq=1719737009, nlmsg_pid=708}, {rtm_family=AF_INET6, rtm_dst_len=64, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_BOOT, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNICAST, rtm_flags=0}, [[{nla_len=8, nla_type=RTA_TABLE}, RT_TABLE_MAIN], [{nla_len=20, nla_type=RTA_DST}, inet_pton(AF_INET6, "2001:db8:10::")], [{nla_len=8, nla_type=RTA_PRIORITY}, 1024], [{nla_len=60, nla_type=RTA_MULTIPATH}, [[{rtnh_len=28, rtnh_flags=0, rtnh_hops=0, rtnh_ifindex=if_nametoindex("dummy1")}, [{nla_len=20, nla_type=RTA_GATEWAY}, inet_pton(AF_INET6, "2001:db8:1::2")]], [{rtnh_len=28, rtnh_flags=0, rtnh_hops=0, rtnh_ifindex=if_nametoindex("dummy2")}, [{nla_len=20, nla_type=RTA_GATEWAY}, inet_pton(AF_INET6, "2001:db8:2::2")]]]], [{nla_len=36, nla_type=RTA_CACHEINFO}, {rta_clntref=0, rta_lastuse=0, rta_expires=0, rta_error=0, rta_used=0, rta_id=0, rta_ts=0, rta_tsage=0}], [{nla_len=5, nla_type=RTA_PREF}, 0]]], [{nlmsg_len=116, nlmsg_type=RTM_NEWROUTE, nlmsg_flags=NLM_F_MULTI|NLM_F_DUMP_FILTERED, nlmsg_seq=1719737009, nlmsg_pid=708}, {rtm_family=AF_INET6, 
rtm_dst_len=64, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_KERNEL, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNICAST, rtm_flags=0}, [[{nla_len=8, nla_type=RTA_TABLE}, RT_TABLE_MAIN], [{nla_len=20, nla_type=RTA_DST}, inet_pton(AF_INET6, "fe80::")], [{nla_len=8, nla_type=RTA_PRIORITY}, 256], [{nla_len=8, nla_type=RTA_OIF}, if_nametoindex("dummy1")], [{nla_len=36, nla_type=RTA_CACHEINFO}, {rta_clntref=0, rta_lastuse=0, rta_expires=0, rta_error=0, rta_used=0, rta_id=0, rta_ts=0, rta_tsage=0}], [{nla_len=5, nla_type=RTA_PREF}, 0]]]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 400
Following patch works for me [1], but it is missing support for
RTA_GATEWAY which is also present in the RTA_MULTIPATH nest.
[1]
diff --git a/ip/iproute.c b/ip/iproute.c
index b53046116826..3999853a1455 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -310,12 +310,28 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
 			return 0;
 	}
 	if (filter.oifmask) {
-		int oif = 0;
+		if (tb[RTA_OIF]) {
+			int oif = rta_getattr_u32(tb[RTA_OIF]);
 
-		if (tb[RTA_OIF])
-			oif = rta_getattr_u32(tb[RTA_OIF]);
-		if ((oif^filter.oif)&filter.oifmask)
-			return 0;
+			if ((oif ^ filter.oif) & filter.oifmask)
+				return 0;
+		} else if (tb[RTA_MULTIPATH]) {
+			const struct rtnexthop *nh = RTA_DATA(tb[RTA_MULTIPATH]);
+			int len = RTA_PAYLOAD(tb[RTA_MULTIPATH]);
+			bool dev_match = false;
+
+			while (len >= sizeof(*nh)) {
+				if (nh->rtnh_ifindex == filter.oif) {
+					dev_match = true;
+					break;
+				}
+
+				len -= NLMSG_ALIGN(nh->rtnh_len);
+				nh = RTNH_NEXT(nh);
+			}
+			if (!dev_match)
+				return 0;
+		}
 	}
 	if (filter.markmask) {
 		int mark = 0;
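As an aside, the reason the unpatched filter hides the multipath route can be sketched in isolation. The snippet below is a minimal illustration and not iproute2 code: `struct demo_filter` and `old_filter_keeps()` are hypothetical names, and it assumes `oifmask` is all ones whenever a device is given on the command line. A multipath route carries its devices inside RTA_MULTIPATH and has no top-level RTA_OIF, so the old logic compares ifindex 0 against the requested device and drops the route.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two filter fields used by the check. */
struct demo_filter {
	int oif;	/* ifindex requested with "dev NAME" */
	int oifmask;	/* assumed to be all ones when a device was given */
};

/*
 * Pre-patch behaviour: a route without a top-level RTA_OIF attribute
 * is compared as if its output interface index were 0.
 */
static bool old_filter_keeps(const struct demo_filter *f, bool has_oif, int oif)
{
	int val = has_oif ? oif : 0;

	return !((val ^ f->oif) & f->oifmask);
}
```

With `oif = 3` and `oifmask = -1`, a route whose RTA_OIF is 3 is kept, while a multipath route (no RTA_OIF) is evaluated as ifindex 0 and filtered out, which matches the missing default route in the original report.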
* Re: "ip route show dev enp0s9" does not show all routes for enp0s9
2024-06-30 10:39 ` Ido Schimmel
@ 2024-06-30 16:23 ` Stephen Hemminger
2024-07-01 7:17 ` Ido Schimmel
0 siblings, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2024-06-30 16:23 UTC (permalink / raw)
To: Ido Schimmel; +Cc: Muggeridge, Matt, netdev@vger.kernel.org
On Sun, 30 Jun 2024 13:39:35 +0300
Ido Schimmel <idosch@idosch.org> wrote:
> Following patch works for me [1], but it is missing support for
> RTA_GATEWAY which is also present in the RTA_MULTIPATH nest.
>
> [1]
> diff --git a/ip/iproute.c b/ip/iproute.c
> index b53046116826..3999853a1455 100644
> --- a/ip/iproute.c
> +++ b/ip/iproute.c
> @@ -310,12 +310,28 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
> return 0;
> }
> if (filter.oifmask) {
> - int oif = 0;
> + if (tb[RTA_OIF]) {
> + int oif = rta_getattr_u32(tb[RTA_OIF]);
>
> - if (tb[RTA_OIF])
> - oif = rta_getattr_u32(tb[RTA_OIF]);
> - if ((oif^filter.oif)&filter.oifmask)
> - return 0;
> + if ((oif ^ filter.oif) & filter.oifmask)
> + return 0;
> + } else if (tb[RTA_MULTIPATH]) {
> + const struct rtnexthop *nh = RTA_DATA(tb[RTA_MULTIPATH]);
> + int len = RTA_PAYLOAD(tb[RTA_MULTIPATH]);
> + bool dev_match = false;
> +
> + while (len >= sizeof(*nh)) {
> + if (nh->rtnh_ifindex == filter.oif) {
> + dev_match = true;
> + break;
> + }
> +
> + len -= NLMSG_ALIGN(nh->rtnh_len);
> + nh = RTNH_NEXT(nh);
> + }
> + if (!dev_match)
> + return 0;
> + }
> }
> if (filter.markmask) {
> int mark = 0;
Good catch, the original code did not handle multipath in filtering.
I suggest moving the loop into a helper function for clarity:
diff --git a/ip/iproute.c b/ip/iproute.c
index b5304611..44666240 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -154,6 +154,24 @@ static int flush_update(void)
 	return 0;
 }
 
+static bool filter_multipath(const struct rtattr *rta)
+{
+	const struct rtnexthop *nh = RTA_DATA(rta);
+	int len = RTA_PAYLOAD(rta);
+
+	while (len >= sizeof(*nh)) {
+		if (nh->rtnh_len > len)
+			break;
+
+		if (!((nh->rtnh_ifindex ^ filter.oif) & filter.oifmask))
+			return true;
+
+		len -= NLMSG_ALIGN(nh->rtnh_len);
+		nh = RTNH_NEXT(nh);
+	}
+	return false;
+}
+
 static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
 {
 	struct rtmsg *r = NLMSG_DATA(n);
@@ -310,12 +328,15 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
 			return 0;
 	}
 	if (filter.oifmask) {
-		int oif = 0;
+		if (tb[RTA_OIF]) {
+			int oif = rta_getattr_u32(tb[RTA_OIF]);
 
-		if (tb[RTA_OIF])
-			oif = rta_getattr_u32(tb[RTA_OIF]);
-		if ((oif^filter.oif)&filter.oifmask)
-			return 0;
+			if ((oif ^ filter.oif) & filter.oifmask)
+				return 0;
+		} else if (tb[RTA_MULTIPATH]) {
+			if (!filter_multipath(tb[RTA_MULTIPATH]))
+				return 0;
+		}
 	}
 	if (filter.markmask) {
 		int mark = 0;
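The nexthop walk in the helper can also be tried outside iproute2. The following is a rough, self-contained sketch under stated assumptions: `struct demo_rtnexthop` is a local mirror of the kernel's `struct rtnexthop` header, `DEMO_ALIGN` stands in for the 4-byte netlink alignment macros, and `demo_multipath_has_dev()` is a hypothetical name. It is not the code that was submitted, and it copies each header with `memcpy` only to stay independent of buffer alignment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local mirror of the kernel's struct rtnexthop header (assumed layout). */
struct demo_rtnexthop {
	unsigned short rtnh_len;	/* length of this nexthop, nested attributes included */
	unsigned char rtnh_flags;
	unsigned char rtnh_hops;
	int rtnh_ifindex;		/* output interface of this nexthop */
};

/* Round a length up to the 4-byte boundary used between nexthops. */
#define DEMO_ALIGN(len) (((len) + 3U) & ~3U)

/*
 * Return true if any nexthop in the multipath payload uses an interface
 * that matches oif under oifmask. Malformed lengths end the walk early.
 */
static bool demo_multipath_has_dev(const void *payload, int len, int oif, int oifmask)
{
	const unsigned char *p = payload;

	while (len >= (int)sizeof(struct demo_rtnexthop)) {
		struct demo_rtnexthop nh;
		int step;

		memcpy(&nh, p, sizeof(nh));
		if ((size_t)nh.rtnh_len < sizeof(nh) || nh.rtnh_len > len)
			break;

		if (!((nh.rtnh_ifindex ^ oif) & oifmask))
			return true;

		step = (int)DEMO_ALIGN(nh.rtnh_len);
		len -= step;
		p += step;
	}
	return false;
}
```

Given a payload with two nexthops on ifindex 4 and ifindex 7, a filter on either index keeps the route, and a filter on any other index drops it. The extra lower bound on `rtnh_len` is an addition of this sketch so that a zero-length entry cannot stall the loop.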
* Re: "ip route show dev enp0s9" does not show all routes for enp0s9
2024-06-30 16:23 ` Stephen Hemminger
@ 2024-07-01 7:17 ` Ido Schimmel
0 siblings, 0 replies; 9+ messages in thread
From: Ido Schimmel @ 2024-07-01 7:17 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Muggeridge, Matt, netdev@vger.kernel.org
On Sun, Jun 30, 2024 at 09:23:08AM -0700, Stephen Hemminger wrote:
> Good catch, original code did not handle multipath in filtering.
>
> Suggest moving the loop into helper function for clarity
Thanks, looks good. Do you want to submit it?
You can add:
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
* Re: "ip route show dev enp0s9" does not show all routes for enp0s9
@ 2024-07-03 4:00 Muggeridge, Matt
2024-07-03 5:50 ` Stephen Hemminger
0 siblings, 1 reply; 9+ messages in thread
From: Muggeridge, Matt @ 2024-07-03 4:00 UTC (permalink / raw)
To: netdev@vger.kernel.org
> On Sun, Jun 30, 2024 at 09:23:08AM -0700, Stephen Hemminger wrote:
> > Good catch, original code did not handle multipath in filtering.
> >
> > Suggest moving the loop into helper function for clarity
>
> Thanks, looks good. Do you want to submit it?
>
> You can add:
>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Just wondering which repo this will find its way into. I sleuthed
your repos and the iproute2 repo but could not find it.
Thanks,
Matt.
* Re: "ip route show dev enp0s9" does not show all routes for enp0s9
2024-07-03 4:00 Muggeridge, Matt
@ 2024-07-03 5:50 ` Stephen Hemminger
2024-07-03 19:46 ` Muggeridge, Matt
0 siblings, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2024-07-03 5:50 UTC (permalink / raw)
To: Muggeridge, Matt; +Cc: netdev@vger.kernel.org
On Wed, 3 Jul 2024 04:00:44 +0000
"Muggeridge, Matt" <matt.muggeridge2@hpe.com> wrote:
> > On Sun, Jun 30, 2024 at 09:23:08AM -0700, Stephen Hemminger wrote:
> > > Good catch, original code did not handle multipath in filtering.
> > >
> > > Suggest moving the loop into helper function for clarity
> >
> > Thanks, looks good. Do you want to submit it?
> >
> > You can add:
> >
> > Reviewed-by: Ido Schimmel <idosch@nvidia.com>
>
> Just wondering which repo this will find its way into. I sleuthed
> your repos and the iproute2 repo but could not find it.
>
> Thanks,
> Matt.
>
>
It would go into iproute2, but it was not an official patch since it was not
tested. Since you are doing multipath routing, could you please make sure
it works?
I suppose a test with dummy devices is possible, but it would be somewhat artificial.
* RE: "ip route show dev enp0s9" does not show all routes for enp0s9
2024-07-03 5:50 ` Stephen Hemminger
@ 2024-07-03 19:46 ` Muggeridge, Matt
0 siblings, 0 replies; 9+ messages in thread
From: Muggeridge, Matt @ 2024-07-03 19:46 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev@vger.kernel.org
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday, July 3, 2024 3:50 PM
> To: Muggeridge, Matt <matt.muggeridge2@hpe.com>
> Cc: netdev@vger.kernel.org
> Subject: Re: "ip route show dev enp0s9" does not show all routes for enp0s9
>
> It would go in iproute2 but was not an official patch since it was not tested.
> Since you are doing multipath routing, could you please make sure it works.
>
> Suppose a test with dummy devices is possible, but somewhat artificial
I have tested it, and it worked.
# Using my distro's "ip" command
$ ip -6 r show dev enp0s9
2001:2:0:1000::/64 proto ra metric 2048 expires 65480sec pref medium
fe80::/64 proto kernel metric 256 pref medium
# Using the patched "./ip" command
~/work/iproute2/ip (main)$ ./ip -6 r show dev enp0s9
2001:2:0:1000::/64 proto ra metric 2048 expires 65478sec pref medium
fe80::/64 proto kernel metric 256 pref medium
default proto ra metric 2048 expires 538sec pref medium
	nexthop via fe80::200:10ff:fe10:1060 dev enp0s9 weight 1
	nexthop via fe80::200:10ff:fe10:1061 dev enp0s9 weight 1
All the best!
Matt.