* [PATCH net 00/12] net: iflink and link-netnsid fixes
@ 2020-10-01 7:59 Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 01/12] ipvlan: add get_link_net Sabrina Dubroca
` (12 more replies)
0 siblings, 13 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Nicolas Dichtel, Marek Lindner, Simon Wunderlich,
Antonio Quartulli, Sven Eckelmann, b.a.t.m.a.n, Roopa Prabhu,
Nikolay Aleksandrov
In a lot of places, we use this kind of comparison to detect if a
device has a lower link:
dev->ifindex != dev_get_iflink(dev)
This seems to be a leftover of the pre-netns days, when the ifindex
was unique over the whole system. Nowadays, with network namespaces,
it's very easy to create a device with the same ifindex as its lower
link:
ip netns add main
ip netns add peer
ip -net main link add dummy0 type dummy
ip -net main link add link dummy0 macvlan0 netns peer type macvlan
ip -net main link show type dummy
9: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop ...
ip -net peer link show type macvlan
9: macvlan0@if9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop ...
To detect if a device has a lower link, we can simply check the
existence of the dev->netdev_ops->ndo_get_iflink operation, instead of
checking its return value. In particular, I attempted to fix one of
these checks in commit feadc4b6cf42 ("rtnetlink: always put IFLA_LINK
for links with a link-netnsid"), but this patch isn't correct, since
tunnel devices can export IFLA_LINK_NETNSID without IFLA_LINK. That
patch needs to be reverted.
This series will fix all those bogus comparisons, and export missing
IFLA_LINK_NETNSID attributes in bridge and ipv6 dumps.
ipvlan and geneve are also missing the get_link_net operation, so
userspace can't know when those device are cross-netns. There are a
couple of other device types that have an ndo_get_iflink op but no
get_link_net (virt_wifi, ipoib), and should probably also have a
get_link_net.
Sabrina Dubroca (12):
ipvlan: add get_link_net
geneve: add get_link_net
Revert "rtnetlink: always put IFLA_LINK for links with a link-netnsid"
rtnetlink: always put IFLA_LINK for links with ndo_get_iflink
bridge: always put IFLA_LINK for ports with ndo_get_iflink
bridge: advertise IFLA_LINK_NETNSID when dumping bridge ports
ipv6: always put IFLA_LINK for devices with ndo_get_iflink
ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses
net: link_watch: fix operstate when the link has the same index as the
device
net: link_watch: fix detection of urgent events
batman-adv: fix iflink detection in batadv_is_on_batman_iface
batman-adv: fix detection of lower link in batadv_get_real_netdevice
drivers/net/can/vxcan.c | 2 +-
drivers/net/geneve.c | 8 ++++++++
drivers/net/ipvlan/ipvlan_main.c | 9 +++++++++
drivers/net/veth.c | 2 +-
include/net/rtnetlink.h | 4 ++++
net/batman-adv/hard-interface.c | 4 ++--
net/bridge/br_netlink.c | 4 +++-
net/core/link_watch.c | 4 ++--
net/core/rtnetlink.c | 25 ++++++++++++-------------
net/ipv6/addrconf.c | 11 ++++++++++-
10 files changed, 52 insertions(+), 21 deletions(-)
--
2.28.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH net 01/12] ipvlan: add get_link_net
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 8:19 ` Eric Dumazet
2020-10-01 7:59 ` [PATCH net 02/12] geneve: " Sabrina Dubroca
` (11 subsequent siblings)
12 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca
Currently, ipvlan devices don't advertise a link-netnsid. We can get
it from the lower device, like macvlan does.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
drivers/net/ipvlan/ipvlan_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 5bca94c99006..a81bb68a5713 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -678,6 +678,14 @@ void ipvlan_link_setup(struct net_device *dev)
}
EXPORT_SYMBOL_GPL(ipvlan_link_setup);
+static struct net *ipvlan_get_link_net(const struct net_device *dev)
+{
+ struct ipvl_dev *ipvlan = netdev_priv(dev);
+ struct net_device *phy_dev = ipvlan->phy_dev;
+
+ return dev_net(phy_dev);
+}
+
static const struct nla_policy ipvlan_nl_policy[IFLA_IPVLAN_MAX + 1] =
{
[IFLA_IPVLAN_MODE] = { .type = NLA_U16 },
@@ -691,6 +699,7 @@ static struct rtnl_link_ops ipvlan_link_ops = {
.setup = ipvlan_link_setup,
.newlink = ipvlan_link_new,
.dellink = ipvlan_link_delete,
+ .get_link_net = ipvlan_get_link_net,
};
int ipvlan_link_register(struct rtnl_link_ops *ops)
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 02/12] geneve: add get_link_net
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 01/12] ipvlan: add get_link_net Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 03/12] Revert "rtnetlink: always put IFLA_LINK for links with a link-netnsid" Sabrina Dubroca
` (10 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca
Currently, geneve devices don't advertise a link netns. Similarly to
VXLAN, we can get it from geneve_dev->net.
Fixes: 2d07dc79fe04 ("geneve: add initial netdev driver for GENEVE tunnels")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
drivers/net/geneve.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 974a244f45ba..cd47940bfcbe 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1791,6 +1791,13 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
return -EMSGSIZE;
}
+static struct net *geneve_get_link_net(const struct net_device *dev)
+{
+ struct geneve_dev *geneve = netdev_priv(dev);
+
+ return geneve->net;
+}
+
static struct rtnl_link_ops geneve_link_ops __read_mostly = {
.kind = "geneve",
.maxtype = IFLA_GENEVE_MAX,
@@ -1803,6 +1810,7 @@ static struct rtnl_link_ops geneve_link_ops __read_mostly = {
.dellink = geneve_dellink,
.get_size = geneve_get_size,
.fill_info = geneve_fill_info,
+ .get_link_net = geneve_get_link_net,
};
struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 03/12] Revert "rtnetlink: always put IFLA_LINK for links with a link-netnsid"
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 01/12] ipvlan: add get_link_net Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 02/12] geneve: " Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 04/12] rtnetlink: always put IFLA_LINK for links with ndo_get_iflink Sabrina Dubroca
` (9 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca
This reverts commit feadc4b6cf42a53a8a93c918a569a0b7e62bd350.
It fixed the particular issue that was seen in the OpenShift setup,
but ignored the case of tunnels like VXLAN, which export a
IFLA_LINK_NETNSID attribute but don't have an IFLA_LINK.
In case a vxlan device is created in one netns, then moved to another,
we end up seeing:
# ip -net foo link
15: vxlan1@if15: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default qlen 1000
^
The next patch will fix the original problem properly.
Fixes: feadc4b6cf42 ("rtnetlink: always put IFLA_LINK for links with a link-netnsid")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/core/rtnetlink.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 68e0682450c6..c35b3f02b4f9 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1547,15 +1547,14 @@ static int put_master_ifindex(struct sk_buff *skb, struct net_device *dev)
return ret;
}
-static int nla_put_iflink(struct sk_buff *skb, const struct net_device *dev,
- bool force)
+static int nla_put_iflink(struct sk_buff *skb, const struct net_device *dev)
{
int ifindex = dev_get_iflink(dev);
- if (force || dev->ifindex != ifindex)
- return nla_put_u32(skb, IFLA_LINK, ifindex);
+ if (dev->ifindex == ifindex)
+ return 0;
- return 0;
+ return nla_put_u32(skb, IFLA_LINK, ifindex);
}
static noinline_for_stack int nla_put_ifalias(struct sk_buff *skb,
@@ -1572,8 +1571,6 @@ static int rtnl_fill_link_netnsid(struct sk_buff *skb,
const struct net_device *dev,
struct net *src_net, gfp_t gfp)
{
- bool put_iflink = false;
-
if (dev->rtnl_link_ops && dev->rtnl_link_ops->get_link_net) {
struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
@@ -1582,12 +1579,10 @@ static int rtnl_fill_link_netnsid(struct sk_buff *skb,
if (nla_put_s32(skb, IFLA_LINK_NETNSID, id))
return -EMSGSIZE;
-
- put_iflink = true;
}
}
- return nla_put_iflink(skb, dev, put_iflink);
+ return 0;
}
static int rtnl_fill_link_af(struct sk_buff *skb,
@@ -1738,6 +1733,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
#ifdef CONFIG_RPS
nla_put_u32(skb, IFLA_NUM_RX_QUEUES, dev->num_rx_queues) ||
#endif
+ nla_put_iflink(skb, dev) ||
put_master_ifindex(skb, dev) ||
nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) ||
(dev->qdisc &&
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 04/12] rtnetlink: always put IFLA_LINK for links with ndo_get_iflink
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (2 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 03/12] Revert "rtnetlink: always put IFLA_LINK for links with a link-netnsid" Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 05/12] bridge: always put IFLA_LINK for ports " Sabrina Dubroca
` (8 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca
The test that nla_put_iflink() uses to detect whether a device has a
lower link doesn't work with network namespaces, as a device can have
the same ifindex as its parent:
ip netns add main
ip netns add peer
ip -net main link add dummy0 type dummy
ip -net main link add link dummy0 macvlan0 netns peer type macvlan
ip -net main link show type dummy
# 9: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop ...
ip -net peer link show type macvlan
# 9: macvlan0@if9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop ...
Instead of calling dev_get_iflink(), we can use the existence of the
ndo_get_iflink operation (which dev_get_iflink would call) to check if
a device has a lower link.
I previously tried to fix this with commit feadc4b6cf42 ("rtnetlink:
always put IFLA_LINK for links with a link-netnsid") but didn't get to
the root of the problem.
Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/core/rtnetlink.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index c35b3f02b4f9..a8459fb59ccd 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1549,12 +1549,13 @@ static int put_master_ifindex(struct sk_buff *skb, struct net_device *dev)
static int nla_put_iflink(struct sk_buff *skb, const struct net_device *dev)
{
- int ifindex = dev_get_iflink(dev);
+ if (dev->netdev_ops && dev->netdev_ops->ndo_get_iflink) {
+ int ifindex = dev->netdev_ops->ndo_get_iflink(dev);
- if (dev->ifindex == ifindex)
- return 0;
+ return nla_put_u32(skb, IFLA_LINK, ifindex);
+ }
- return nla_put_u32(skb, IFLA_LINK, ifindex);
+ return 0;
}
static noinline_for_stack int nla_put_ifalias(struct sk_buff *skb,
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 05/12] bridge: always put IFLA_LINK for ports with ndo_get_iflink
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (3 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 04/12] rtnetlink: always put IFLA_LINK for links with ndo_get_iflink Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 06/12] bridge: advertise IFLA_LINK_NETNSID when dumping bridge ports Sabrina Dubroca
` (7 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca, Roopa Prabhu, Nikolay Aleksandrov
Bridge devices try to detect if devices have a lower link when dumping
information about their ports, but that detection doesn't work when
the device and its link have the same ifindex:
For a bridge with the following ports:
20: veth1@if20: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master bridge0 ...
22: veth2@if21: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master bridge0 ...
We get this output with the "bridge link" command:
20: veth1: <BROADCAST,MULTICAST> mtu 1500 master bridge0 ...
22: veth2@if21: <BROADCAST,MULTICAST> mtu 1500 master bridge0 ...
veth1 should also have "@if20" in bridge link.
Instead of calling dev_get_iflink(), we can use the existence of the
ndo_get_iflink operation (which dev_get_iflink would call) to check if
a device has a lower link.
Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/bridge/br_netlink.c | 2 +-
net/core/rtnetlink.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 147d52596e17..6af5d62ddf7b 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -410,7 +410,7 @@ static int br_fill_ifinfo(struct sk_buff *skb,
nla_put_u8(skb, IFLA_OPERSTATE, operstate) ||
(dev->addr_len &&
nla_put(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr)) ||
- (dev->ifindex != dev_get_iflink(dev) &&
+ (dev->netdev_ops && dev->netdev_ops->ndo_get_iflink &&
nla_put_u32(skb, IFLA_LINK, dev_get_iflink(dev))))
goto nla_put_failure;
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a8459fb59ccd..3d8051158890 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4625,7 +4625,7 @@ int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
nla_put_u32(skb, IFLA_MASTER, br_dev->ifindex)) ||
(dev->addr_len &&
nla_put(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr)) ||
- (dev->ifindex != dev_get_iflink(dev) &&
+ (dev->netdev_ops && dev->netdev_ops->ndo_get_iflink &&
nla_put_u32(skb, IFLA_LINK, dev_get_iflink(dev))))
goto nla_put_failure;
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 06/12] bridge: advertise IFLA_LINK_NETNSID when dumping bridge ports
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (4 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 05/12] bridge: always put IFLA_LINK for ports " Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 07/12] ipv6: always put IFLA_LINK for devices with ndo_get_iflink Sabrina Dubroca
` (6 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca, Roopa Prabhu, Nikolay Aleksandrov
Currently, we're not advertising link-netnsid for bridge ports, so the
"bridge link" command will not correctly interpret the value of the
IFLA_LINK attribute.
With this setup (ip link output):
9: bridge0: <BROADCAST,MULTICAST> mtu 1500 ...
10: veth0@if10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master bridge0 ...
11: veth1@if9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master bridge0 ...
we'll get:
10: veth0: <BROADCAST,MULTICAST> mtu 1500 master bridge0 ...
11: veth1@bridge0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 master ...
instead of:
10: veth0@if10: <BROADCAST,MULTICAST> mtu 1500 master bridge0 ...
11: veth1@if9: <BROADCAST,MULTICAST> mtu 1500 master bridge0 ...
br_fill_ifinfo can be called without RTNL (from
br_forward_delay_timer_expired), so we need to change get_link_net
callbacks to use rcu_dereference_rtnl instead of rtnl_dereference.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
drivers/net/can/vxcan.c | 2 +-
drivers/net/veth.c | 2 +-
include/net/rtnetlink.h | 4 ++++
net/bridge/br_netlink.c | 2 ++
net/core/rtnetlink.c | 8 +++++---
5 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index d6ba9426be4d..870109d38b28 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -276,7 +276,7 @@ static const struct nla_policy vxcan_policy[VXCAN_INFO_MAX + 1] = {
static struct net *vxcan_get_link_net(const struct net_device *dev)
{
struct vxcan_priv *priv = netdev_priv(dev);
- struct net_device *peer = rtnl_dereference(priv->peer);
+ struct net_device *peer = rcu_dereference_rtnl(priv->peer);
return peer ? dev_net(peer) : dev_net(dev);
}
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index a475f48d43c4..5f814620d97e 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1433,7 +1433,7 @@ static const struct nla_policy veth_policy[VETH_INFO_MAX + 1] = {
static struct net *veth_get_link_net(const struct net_device *dev)
{
struct veth_priv *priv = netdev_priv(dev);
- struct net_device *peer = rtnl_dereference(priv->peer);
+ struct net_device *peer = rcu_dereference_rtnl(priv->peer);
return peer ? dev_net(peer) : dev_net(dev);
}
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index e2091bb2b3a8..c37cb3d98c7c 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -168,6 +168,10 @@ int rtnl_nla_parse_ifla(struct nlattr **tb, const struct nlattr *head, int len,
struct netlink_ext_ack *exterr);
struct net *rtnl_get_net_ns_capable(struct sock *sk, int netnsid);
+int rtnl_fill_link_netnsid(struct sk_buff *skb,
+ const struct net_device *dev,
+ struct net *src_net, gfp_t gfp);
+
#define MODULE_ALIAS_RTNL_LINK(kind) MODULE_ALIAS("rtnl-link-" kind)
#endif
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 6af5d62ddf7b..81ea4e89edba 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -164,6 +164,7 @@ static inline size_t br_nlmsg_size(struct net_device *dev, u32 filter_mask)
+ nla_total_size(4) /* IFLA_MASTER */
+ nla_total_size(4) /* IFLA_MTU */
+ nla_total_size(4) /* IFLA_LINK */
+ + nla_total_size(4) /* IFLA_LINK_NETNSID */
+ nla_total_size(1) /* IFLA_OPERSTATE */
+ nla_total_size(br_port_info_size()) /* IFLA_PROTINFO */
+ nla_total_size(br_get_link_af_size_filtered(dev,
@@ -410,6 +411,7 @@ static int br_fill_ifinfo(struct sk_buff *skb,
nla_put_u8(skb, IFLA_OPERSTATE, operstate) ||
(dev->addr_len &&
nla_put(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr)) ||
+ rtnl_fill_link_netnsid(skb, dev, dev_net(br->dev), GFP_ATOMIC) ||
(dev->netdev_ops && dev->netdev_ops->ndo_get_iflink &&
nla_put_u32(skb, IFLA_LINK, dev_get_iflink(dev))))
goto nla_put_failure;
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 3d8051158890..26ce9fafc379 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1568,9 +1568,9 @@ static noinline_for_stack int nla_put_ifalias(struct sk_buff *skb,
return ret > 0 ? nla_put_string(skb, IFLA_IFALIAS, buf) : 0;
}
-static int rtnl_fill_link_netnsid(struct sk_buff *skb,
- const struct net_device *dev,
- struct net *src_net, gfp_t gfp)
+int rtnl_fill_link_netnsid(struct sk_buff *skb,
+ const struct net_device *dev,
+ struct net *src_net, gfp_t gfp)
{
if (dev->rtnl_link_ops && dev->rtnl_link_ops->get_link_net) {
struct net *link_net = dev->rtnl_link_ops->get_link_net(dev);
@@ -1585,6 +1585,7 @@ static int rtnl_fill_link_netnsid(struct sk_buff *skb,
return 0;
}
+EXPORT_SYMBOL_GPL(rtnl_fill_link_netnsid);
static int rtnl_fill_link_af(struct sk_buff *skb,
const struct net_device *dev,
@@ -4625,6 +4626,7 @@ int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
nla_put_u32(skb, IFLA_MASTER, br_dev->ifindex)) ||
(dev->addr_len &&
nla_put(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr)) ||
+ rtnl_fill_link_netnsid(skb, dev, dev_net(br_dev), GFP_ATOMIC) ||
(dev->netdev_ops && dev->netdev_ops->ndo_get_iflink &&
nla_put_u32(skb, IFLA_LINK, dev_get_iflink(dev))))
goto nla_put_failure;
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 07/12] ipv6: always put IFLA_LINK for devices with ndo_get_iflink
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (5 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 06/12] bridge: advertise IFLA_LINK_NETNSID when dumping bridge ports Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 08/12] ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses Sabrina Dubroca
` (5 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca
IPv6 tries to detect if devices have a lower link when dumping
addresses, but that detection doesn't work when the device and its
link have the same ifindex.
In this setup:
ip netns add main
ip netns add peer
ip -net main link add dummy0 type dummy # ifidx 9
ip -net main link add link dummy0 macvlan0 up netns peer type macvlan # ifidx 9
We'll get:
ip -net peer -6 a
9: macvlan0: <snip>
Instead of:
ip -net peer -6 a
9: macvlan0@if9: <snip>
Instead of calling dev_get_iflink(), we can use the existence of the
ndo_get_iflink operation (which dev_get_iflink would call) to check if
a device has a lower link.
Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/ipv6/addrconf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 01146b66d666..688e441a8699 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -5835,7 +5835,7 @@ static int inet6_fill_ifinfo(struct sk_buff *skb, struct inet6_dev *idev,
(dev->addr_len &&
nla_put(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr)) ||
nla_put_u32(skb, IFLA_MTU, dev->mtu) ||
- (dev->ifindex != dev_get_iflink(dev) &&
+ (dev->netdev_ops && dev->netdev_ops->ndo_get_iflink &&
nla_put_u32(skb, IFLA_LINK, dev_get_iflink(dev))) ||
nla_put_u8(skb, IFLA_OPERSTATE,
netif_running(dev) ? dev->operstate : IF_OPER_DOWN))
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 08/12] ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (6 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 07/12] ipv6: always put IFLA_LINK for devices with ndo_get_iflink Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 15:58 ` Nicolas Dichtel
2020-10-01 7:59 ` [PATCH net 09/12] net: link_watch: fix operstate when the link has the same index as the device Sabrina Dubroca
` (4 subsequent siblings)
12 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca
Currently, we're not advertising link-netnsid when dumping IPv6
addresses, so the "ip -6 addr" command will not correctly interpret
the value of the IFLA_LINK attribute.
For example, we'll get:
9: macvlan0@macvlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
<snip>
Instead of:
9: macvlan0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 link-netns main
<snip>
ndisc_ifinfo_sysctl_change calls inet6_fill_ifinfo without rcu or
rtnl, so I'm adding rcu_read_lock around rtnl_fill_link_netnsid.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/ipv6/addrconf.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 688e441a8699..fb95c0227dfe 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -5533,6 +5533,7 @@ static inline size_t inet6_if_nlmsg_size(void)
+ nla_total_size(MAX_ADDR_LEN) /* IFLA_ADDRESS */
+ nla_total_size(4) /* IFLA_MTU */
+ nla_total_size(4) /* IFLA_LINK */
+ + nla_total_size(4) /* IFLA_LINK_NETNSID */
+ nla_total_size(1) /* IFLA_OPERSTATE */
+ nla_total_size(inet6_ifla6_size()); /* IFLA_PROTINFO */
}
@@ -5840,6 +5841,14 @@ static int inet6_fill_ifinfo(struct sk_buff *skb, struct inet6_dev *idev,
nla_put_u8(skb, IFLA_OPERSTATE,
netif_running(dev) ? dev->operstate : IF_OPER_DOWN))
goto nla_put_failure;
+
+ rcu_read_lock();
+ if (rtnl_fill_link_netnsid(skb, dev, dev_net(dev), GFP_ATOMIC)) {
+ rcu_read_unlock();
+ goto nla_put_failure;
+ }
+ rcu_read_unlock();
+
protoinfo = nla_nest_start_noflag(skb, IFLA_PROTINFO);
if (!protoinfo)
goto nla_put_failure;
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 09/12] net: link_watch: fix operstate when the link has the same index as the device
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (7 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 08/12] ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 11:00 ` Taehee Yoo
2020-10-01 7:59 ` [PATCH net 10/12] net: link_watch: fix detection of urgent events Sabrina Dubroca
` (3 subsequent siblings)
12 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca
When we create a macvlan device on top of a bond, the macvlan device
should always start with IF_OPER_LOWERLAYERDOWN if the bond is
down. Currently, this doesn't happen if the macvlan device gets the
same ifindex as the bond, which can happen because different
namespaces assign the ifindex independently:
ip netns add main
ip netns add peer
ip -net main link add type bond # idx 9
ip -net main link add link bond0 netns peer type macvlan # idx 10
ip -net main link add link bond0 type macvlan # idx 9
ip -net main link show type macvlan # M-DOWN since bond0 is DOWN
10: macvlan0@bond0: <BROADCAST,MULTICAST,M-DOWN> ...
ip -net peer link show type macvlan # should also be M-DOWN
9: macvlan0@if9: <BROADCAST,MULTICAST> ...
Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/core/link_watch.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index 75431ca9300f..6932ad51aa4a 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -38,7 +38,7 @@ static unsigned char default_operstate(const struct net_device *dev)
return IF_OPER_TESTING;
if (!netif_carrier_ok(dev))
- return (dev->ifindex != dev_get_iflink(dev) ?
+ return (dev->netdev_ops && dev->netdev_ops->ndo_get_iflink ?
IF_OPER_LOWERLAYERDOWN : IF_OPER_DOWN);
if (netif_dormant(dev))
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 10/12] net: link_watch: fix detection of urgent events
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (8 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 09/12] net: link_watch: fix operstate when the link has the same index as the device Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 11/12] batman-adv: fix iflink detection in batadv_is_on_batman_iface Sabrina Dubroca
` (2 subsequent siblings)
12 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca
linkwatch_urgent_event can miss urgent events when the device and its
link have the same ifindex, which can happen when those devices are in
different network namespaces.
With this setup, the vlan0 device can remain in LOWERLAYERDOWN state
for a full second (the linkwatch delay for non-urgent events), while
the vlan1 device will come up immediately:
ip netns add a
ip netns add b
ip -net a link add dummy0 type dummy
ip -net a link add link dummy0 vlan0 netns b type vlan id 1
ip -net a link add link dummy0 vlan1 netns b type vlan id 2
ip -net a link set dummy0 up
ip -net b link set vlan1 down ; ip -net b link set vlan0 down
sleep 2
ip -net b link set vlan1 up ; ip -net b link set vlan0 up
Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/core/link_watch.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index 6932ad51aa4a..8a5d0ab820f9 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -92,7 +92,7 @@ static bool linkwatch_urgent_event(struct net_device *dev)
if (!netif_running(dev))
return false;
- if (dev->ifindex != dev_get_iflink(dev))
+ if (dev->netdev_ops && dev->netdev_ops->ndo_get_iflink)
return true;
if (netif_is_lag_port(dev) || netif_is_lag_master(dev))
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 11/12] batman-adv: fix iflink detection in batadv_is_on_batman_iface
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (9 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 10/12] net: link_watch: fix detection of urgent events Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2022-05-14 10:21 ` Sven Eckelmann
2020-10-01 7:59 ` [PATCH net 12/12] batman-adv: fix detection of lower link in batadv_get_real_netdevice Sabrina Dubroca
2020-10-01 21:25 ` [PATCH net 00/12] net: iflink and link-netnsid fixes Stephen Hemminger
12 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Marek Lindner, Simon Wunderlich,
Antonio Quartulli, Sven Eckelmann, b.a.t.m.a.n
BATMAN compares ifindex with dev_get_iflink to detect devices that
don't have a parent, but that's wrong, since a device can have the
same index as its parent if it's created in a different network
namespace:
ip netns add main
ip netns add peer
ip -net main link add dummy0 type dummy
# keep ifindex in sync between the namespaces
ip -net peer link add eatidx type dummy
ip netns exec main batctl if add dummy0
# macsec0 and bat0 have the same ifindex
ip -net main link add link bat0 netns peer type macsec
ip netns exec peer batctl if add macsec0
That last command would fail if we didn't keep the ifindex in sync
between the two namespaces, and should also fail when the macsec0
device has the same ifindex as its link. Let's use the presence of a
ndo_get_iflink operation, rather than the value it returns, to detect
a device without a link.
Fixes: b7eddd0b3950 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/batman-adv/hard-interface.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index fa06b51c0144..0d87c5d56844 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -159,7 +159,7 @@ static bool batadv_is_on_batman_iface(const struct net_device *net_dev)
/* no more parents..stop recursion */
if (dev_get_iflink(net_dev) == 0 ||
- dev_get_iflink(net_dev) == net_dev->ifindex)
+ !(net_dev->netdev_ops && net_dev->netdev_ops->ndo_get_iflink))
return false;
parent_net = batadv_getlink_net(net_dev, net);
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH net 12/12] batman-adv: fix detection of lower link in batadv_get_real_netdevice
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (10 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 11/12] batman-adv: fix iflink detection in batadv_is_on_batman_iface Sabrina Dubroca
@ 2020-10-01 7:59 ` Sabrina Dubroca
2020-10-01 21:25 ` [PATCH net 00/12] net: iflink and link-netnsid fixes Stephen Hemminger
12 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-01 7:59 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Marek Lindner, Simon Wunderlich,
Antonio Quartulli, Sven Eckelmann, b.a.t.m.a.n
Currently, batadv_get_real_netdevice can return different results in
this situation:
ip netns add main
ip netns add peer
ip -net main link add dummy1 type dummy
ip -net main link add link dummy1 netns peer type macsec # same ifindex as dummy1
ip -net main link add link dummy1 netns peer type macsec port 2
Let's use the presence of a ndo_get_iflink operation, rather than the
value it returns, to detect a device without a link.
Fixes: 5ed4a460a1d3 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/batman-adv/hard-interface.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index 0d87c5d56844..8f7d2dd37321 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -223,7 +223,7 @@ static struct net_device *batadv_get_real_netdevice(struct net_device *netdev)
if (!netdev)
return NULL;
- if (netdev->ifindex == dev_get_iflink(netdev)) {
+ if (!(netdev->netdev_ops && netdev->netdev_ops->ndo_get_iflink)) {
dev_hold(netdev);
return netdev;
}
--
2.28.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH net 01/12] ipvlan: add get_link_net
2020-10-01 7:59 ` [PATCH net 01/12] ipvlan: add get_link_net Sabrina Dubroca
@ 2020-10-01 8:19 ` Eric Dumazet
0 siblings, 0 replies; 21+ messages in thread
From: Eric Dumazet @ 2020-10-01 8:19 UTC (permalink / raw)
To: Sabrina Dubroca, netdev
On 10/1/20 9:59 AM, Sabrina Dubroca wrote:
> Currently, ipvlan devices don't advertise a link-netnsid. We can get
> it from the lower device, like macvlan does.
>
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> ---
> drivers/net/ipvlan/ipvlan_main.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
> index 5bca94c99006..a81bb68a5713 100644
> --- a/drivers/net/ipvlan/ipvlan_main.c
> +++ b/drivers/net/ipvlan/ipvlan_main.c
> @@ -678,6 +678,14 @@ void ipvlan_link_setup(struct net_device *dev)
> }
> EXPORT_SYMBOL_GPL(ipvlan_link_setup);
>
> +static struct net *ipvlan_get_link_net(const struct net_device *dev)
> +{
> + struct ipvl_dev *ipvlan = netdev_priv(dev);
> + struct net_device *phy_dev = ipvlan->phy_dev;
> +
> + return dev_net(phy_dev);
> +}
> +
> static const struct nla_policy ipvlan_nl_policy[IFLA_IPVLAN_MAX + 1] =
> {
> [IFLA_IPVLAN_MODE] = { .type = NLA_U16 },
> @@ -691,6 +699,7 @@ static struct rtnl_link_ops ipvlan_link_ops = {
> .setup = ipvlan_link_setup,
> .newlink = ipvlan_link_new,
> .dellink = ipvlan_link_delete,
> + .get_link_net = ipvlan_get_link_net,
> };
>
> int ipvlan_link_register(struct rtnl_link_ops *ops)
>
This conflicts with a patch in net-next...
commit 0bad834ca7bf9999ed9841f2bf9f5f07fbe02136
Author: Taehee Yoo <ap420073@gmail.com>
Date: Fri Aug 21 17:47:32 2020 +0000
ipvlan: advertise link netns via netlink
I think you should take it as is.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net 09/12] net: link_watch: fix operstate when the link has the same index as the device
2020-10-01 7:59 ` [PATCH net 09/12] net: link_watch: fix operstate when the link has the same index as the device Sabrina Dubroca
@ 2020-10-01 11:00 ` Taehee Yoo
0 siblings, 0 replies; 21+ messages in thread
From: Taehee Yoo @ 2020-10-01 11:00 UTC (permalink / raw)
To: Sabrina Dubroca; +Cc: Netdev
On Thu, 1 Oct 2020 at 17:09, Sabrina Dubroca <sd@queasysnail.net> wrote:
>
> When we create a macvlan device on top of a bond, the macvlan device
> should always start with IF_OPER_LOWERLAYERDOWN if the bond is
> down. Currently, this doesn't happen if the macvlan device gets the
> same ifindex as the bond, which can happen because different
> namespaces assign the ifindex independently:
>
> ip netns add main
> ip netns add peer
> ip -net main link add type bond # idx 9
> ip -net main link add link bond0 netns peer type macvlan # idx 10
> ip -net main link add link bond0 type macvlan # idx 9
>
> ip -net main link show type macvlan # M-DOWN since bond0 is DOWN
> 10: macvlan0@bond0: <BROADCAST,MULTICAST,M-DOWN> ...
>
> ip -net peer link show type macvlan # should also be M-DOWN
> 9: macvlan0@if9: <BROADCAST,MULTICAST> ...
>
Hi Sabrina!
I think this patch will fix another bug.
Test commands:
ip link add dummy0 up type dummy
while :
do
ip link add vlan1 up link dummy0 type vlan id 10
for i in {2..7}
do
let p=$i-1
ip link add vlan$i up link vlan$p type vlan id 10
done
modprobe -rv 8021q
done
Splat looks like:
[ 65.259829][ T656] BUG: unable to handle page fault for address:
ffffa13971571100
[ 65.261719][ T656] #PF: supervisor read access in kernel mode
[ 65.263213][ T656] #PF: error_code(0x0000) - not-present page
[ 65.264685][ T656] PGD 14801067 P4D 14801067 PUD 14805067 PMD
13ff7b067 PTE 800ffffecea8e060
[ 65.266851][ T656] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
[ 65.268278][ T656] CPU: 5 PID: 656 Comm: modprobe Not tainted
5.9.0-rc6+ #745
[ 65.270099][ T656] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 65.272532][ T656] RIP: 0010:vlan_dev_get_iflink+0xc/0x20 [8021q]
[ 65.274106][ T656] Code: 00 ff ff ff ff 48 89 46 04 31 c0 c3 66 90
0f 1f 44 00 00 f3 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b
87 f0 0c 00 00 <8b> 80 00 01 00 00 c3 0f 1f 00 66 2e 0f 1f 84 00 00 00
00 00 0f 1f
[ 65.278942][ T656] RSP: 0018:ffffa139690e7e10 EFLAGS: 00010282
[ 65.280455][ T656] RAX: ffffa13971571000 RBX: ffffa13972281000
RCX: 0000000000000006
[ 65.282434][ T656] RDX: 0000000000000000 RSI: 0000000000000000
RDI: ffffa13972281000
[ 65.284404][ T656] RBP: 0000000000000008 R08: 0000000000000001
R09: 0000000000000001
[ 65.286381][ T656] R10: 0000000076c2727a R11: 00000000ffe5333b
R12: 00000000fffc68b0
[ 65.288363][ T656] R13: 0000000000000000 R14: ffffa139690e7e40
R15: dead000000000100
[ 65.290338][ T656] FS: 00007fce35d80540(0000)
GS:ffffa1397aa00000(0000) knlGS:0000000000000000
[ 65.292546][ T656] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 65.294187][ T656] CR2: ffffa13971571100 CR3: 0000000128e76004
CR4: 00000000003706e0
[ 65.296167][ T656] DR0: 0000000000000000 DR1: 0000000000000000
DR2: 0000000000000000
[ 65.298183][ T656] DR3: 0000000000000000 DR6: 00000000fffe0ff0
DR7: 0000000000000400
[ 65.300160][ T656] Call Trace:
[ 65.300979][ T656] rfc2863_policy+0x7f/0xa0
[ 65.302107][ T656] linkwatch_do_dev+0x13/0x50
[ 65.303279][ T656] netdev_run_todo+0x156/0x350
[ 65.304461][ T656] ? __rtnl_link_unregister+0x93/0xd0
[ 65.305801][ T656] rtnl_link_unregister+0xdc/0x100
[ 65.307077][ T656] ? do_wait_intr_irq+0xb0/0xb0
[ 65.308285][ T656] vlan_cleanup_module+0xc/0x33 [8021q]
[ 65.309654][ T656] __x64_sys_delete_module+0x13f/0x210
[ ... ]
When module is deleted, interfaces of its module are deleted automatically.
Each interface will be deleted in order and linkwatch event of
each interface will be called in order too.
linkwatch event internally calls rfc2863_policy() and it internally calls
->ndo_get_iflink().
->ndo_get_iflink() dereferences lower interface pointer but at this
moment, the lower interface could be already deleted.
So, it could dereference the freed memory so that kernel panic could
occur.
Interface graph:
vlan3
|
vlan2
|
vlan1
|
dummy0
When the 8021q module is being removed,
1. flush vlan1's linkwatch event(dereferences dummy0)
2. delete vlan1
3. flush vlan2's linkwatch event(dereference vlan1)
At this point, vlan1 was already deleted so ->ndo_get_iflink()
will dereference the freed memory.
But this patch removes dev_get_iflink() from default_operstate().
So, this panic will not happen.
I tested it, it works well!
So, please add the below tags.
Reported-by: syzbot+95eec132c4bd9b1d8430@syzkaller.appspotmail.com
Reported-by: syzbot+d702fd2351989927037c@syzkaller.appspotmail.com
Fixes: a54acb3a6f85 ("dev: introduce dev_get_iflink()")
Taehee Yoo
> Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.")
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> ---
> net/core/link_watch.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/link_watch.c b/net/core/link_watch.c
> index 75431ca9300f..6932ad51aa4a 100644
> --- a/net/core/link_watch.c
> +++ b/net/core/link_watch.c
> @@ -38,7 +38,7 @@ static unsigned char default_operstate(const struct net_device *dev)
> return IF_OPER_TESTING;
>
> if (!netif_carrier_ok(dev))
> - return (dev->ifindex != dev_get_iflink(dev) ?
> + return (dev->netdev_ops && dev->netdev_ops->ndo_get_iflink ?
> IF_OPER_LOWERLAYERDOWN : IF_OPER_DOWN);
>
> if (netif_dormant(dev))
> --
> 2.28.0
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net 08/12] ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses
2020-10-01 7:59 ` [PATCH net 08/12] ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses Sabrina Dubroca
@ 2020-10-01 15:58 ` Nicolas Dichtel
2020-10-02 9:03 ` Sabrina Dubroca
0 siblings, 1 reply; 21+ messages in thread
From: Nicolas Dichtel @ 2020-10-01 15:58 UTC (permalink / raw)
To: Sabrina Dubroca, netdev
Le 01/10/2020 à 09:59, Sabrina Dubroca a écrit :
> Currently, we're not advertising link-netnsid when dumping IPv6
> addresses, so the "ip -6 addr" command will not correctly interpret
> the value of the IFLA_LINK attribute.
>
> For example, we'll get:
> 9: macvlan0@macvlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
> <snip>
>
> Instead of:
> 9: macvlan0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 link-netns main
> <snip>
>
> ndisc_ifinfo_sysctl_change calls inet6_fill_ifinfo without rcu or
> rtnl, so I'm adding rcu_read_lock around rtnl_fill_link_netnsid.
I don't think this is needed.
ndisc_ifinfo_sysctl_change() takes a reference on the idev (with in6_dev_get(dev)).
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net 00/12] net: iflink and link-netnsid fixes
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
` (11 preceding siblings ...)
2020-10-01 7:59 ` [PATCH net 12/12] batman-adv: fix detection of lower link in batadv_get_real_netdevice Sabrina Dubroca
@ 2020-10-01 21:25 ` Stephen Hemminger
2020-10-02 9:07 ` Sabrina Dubroca
12 siblings, 1 reply; 21+ messages in thread
From: Stephen Hemminger @ 2020-10-01 21:25 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: netdev, Nicolas Dichtel, Marek Lindner, Simon Wunderlich,
Antonio Quartulli, Sven Eckelmann, b.a.t.m.a.n, Roopa Prabhu,
Nikolay Aleksandrov
On Thu, 1 Oct 2020 09:59:24 +0200
Sabrina Dubroca <sd@queasysnail.net> wrote:
> In a lot of places, we use this kind of comparison to detect if a
> device has a lower link:
>
> dev->ifindex != dev_get_iflink(dev)
Since this is a common operation, it would be good to add a new
helper function in netdevice.h
In your patch set, you are copying the same code snippet which
seems to indicate that it should be a helper.
Something like:
static inline bool netdev_has_link(const struct net_device *dev)
{
const struct net_device_ops *ops = dev->netdev_ops;
return ops && ops->ndo_get_iflink;
}
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net 08/12] ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses
2020-10-01 15:58 ` Nicolas Dichtel
@ 2020-10-02 9:03 ` Sabrina Dubroca
2020-10-05 15:16 ` Nicolas Dichtel
0 siblings, 1 reply; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-02 9:03 UTC (permalink / raw)
To: Nicolas Dichtel; +Cc: netdev
2020-10-01, 17:58:40 +0200, Nicolas Dichtel wrote:
> Le 01/10/2020 à 09:59, Sabrina Dubroca a écrit :
> > Currently, we're not advertising link-netnsid when dumping IPv6
> > addresses, so the "ip -6 addr" command will not correctly interpret
> > the value of the IFLA_LINK attribute.
> >
> > For example, we'll get:
> > 9: macvlan0@macvlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
> > <snip>
> >
> > Instead of:
> > 9: macvlan0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 link-netns main
> > <snip>
> >
> > ndisc_ifinfo_sysctl_change calls inet6_fill_ifinfo without rcu or
> > rtnl, so I'm adding rcu_read_lock around rtnl_fill_link_netnsid.
> I don't think this is needed.
> ndisc_ifinfo_sysctl_change() takes a reference on the idev (with in6_dev_get(dev)).
The problem is veth's get_link_net implementation, even after my change in patch 6:
static struct net *veth_get_link_net(const struct net_device *dev)
{
struct veth_priv *priv = netdev_priv(dev);
struct net_device *peer = rcu_dereference_rtnl(priv->peer);
return peer ? dev_net(peer) : dev_net(dev);
}
These commands:
ip link add type veth
sysctl net.ipv6.neigh.veth0.retrans_time_ms=2000
cause this splat:
[ 91.426764] =============================
[ 91.427445] WARNING: suspicious RCU usage
[ 91.428129] 5.9.0-rc6-net-00331-gae48bef8808b-dirty #266 Not tainted
[ 91.429209] -----------------------------
[ 91.433898] drivers/net/veth.c:1436 suspicious rcu_dereference_check() usage!
[ 91.435127]
other info that might help us debug this:
[ 91.436515]
rcu_scheduler_active = 2, debug_locks = 1
[ 91.437636] 1 lock held by sysctl/3718:
[ 91.438310] #0: ffff88806488c430 (sb_writers#5){.+.+}-{0:0}, at: vfs_write+0x2a7/0x350
[ 91.439769]
stack backtrace:
[ 91.440552] CPU: 2 PID: 3718 Comm: sysctl Not tainted 5.9.0-rc6-net-00331-gae48bef8808b-dirty #266
[ 91.442132] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ArchLinux 1.14.0-1 04/01/2014
[ 91.443742] Call Trace:
[ 91.444204] dump_stack+0x9a/0xd0
[ 91.444810] veth_get_link_net+0xa6/0xb0
[ 91.445534] rtnl_fill_link_netnsid+0xa2/0x130
[ 91.446330] ? rtnl_put_cacheinfo+0x190/0x190
[ 91.447120] ? memcpy+0x39/0x60
[ 91.447717] inet6_fill_ifinfo+0x2f7/0x480
I guess I could push the rcu_read_lock down into veth and vxcan's
handlers instead of the rcu_dereference_rtnl change in patch 6 and
adding this rcu_read_lock.
--
Sabrina
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net 00/12] net: iflink and link-netnsid fixes
2020-10-01 21:25 ` [PATCH net 00/12] net: iflink and link-netnsid fixes Stephen Hemminger
@ 2020-10-02 9:07 ` Sabrina Dubroca
0 siblings, 0 replies; 21+ messages in thread
From: Sabrina Dubroca @ 2020-10-02 9:07 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, Nicolas Dichtel, Marek Lindner, Simon Wunderlich,
Antonio Quartulli, Sven Eckelmann, b.a.t.m.a.n, Roopa Prabhu,
Nikolay Aleksandrov
2020-10-01, 14:25:38 -0700, Stephen Hemminger wrote:
> On Thu, 1 Oct 2020 09:59:24 +0200
> Sabrina Dubroca <sd@queasysnail.net> wrote:
>
> > In a lot of places, we use this kind of comparison to detect if a
> > device has a lower link:
> >
> > dev->ifindex != dev_get_iflink(dev)
>
>
> Since this is a common operation, it would be good to add a new
> helper function in netdevice.h
>
> In your patch set, you are copying the same code snippet which
> seems to indicate that it should be a helper.
>
> Something like:
>
> static inline bool netdev_has_link(const struct net_device *dev)
> {
> const struct net_device_ops *ops = dev->netdev_ops;
>
> return ops && ops->ndo_get_iflink;
> }
Good idea, I'll add that in v2.
--
Sabrina
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net 08/12] ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses
2020-10-02 9:03 ` Sabrina Dubroca
@ 2020-10-05 15:16 ` Nicolas Dichtel
0 siblings, 0 replies; 21+ messages in thread
From: Nicolas Dichtel @ 2020-10-05 15:16 UTC (permalink / raw)
To: Sabrina Dubroca; +Cc: netdev
Le 02/10/2020 à 11:03, Sabrina Dubroca a écrit :
[snip]
> I guess I could push the rcu_read_lock down into veth and vxcan's
> handlers instead of the rcu_dereference_rtnl change in patch 6 and
> adding this rcu_read_lock.
>
Yes, I think it would avoid having this problem later, when someone else will
use this helper.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH net 11/12] batman-adv: fix iflink detection in batadv_is_on_batman_iface
2020-10-01 7:59 ` [PATCH net 11/12] batman-adv: fix iflink detection in batadv_is_on_batman_iface Sabrina Dubroca
@ 2022-05-14 10:21 ` Sven Eckelmann
0 siblings, 0 replies; 21+ messages in thread
From: Sven Eckelmann @ 2022-05-14 10:21 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: Marek Lindner, Simon Wunderlich, Antonio Quartulli, b.a.t.m.a.n,
netdev
[-- Attachment #1: Type: text/plain, Size: 1500 bytes --]
On Thursday, 1 October 2020 09:59:35 CEST Sabrina Dubroca wrote:
> device has the same ifindex as its link. Let's use the presence of a
> ndo_get_iflink operation, rather than the value it returns, to detect
> a device without a link.
There wasn't any activity in this patchset since a while, it doesn't apply
anymore and the assumptions made here doesn't seem to be reflect the current
situation in the kernel. See commit 6c1f41afc1db ("batman-adv: Don't expect
inter-netns unique iflink indices"):
> But only checking for dev->netdev_ops->ndo_get_iflink is also not an option
> because ipoib_get_iflink implements it even when it sometimes returns an
> iflink != ifindex and sometimes iflink == ifindex. The caller must
> therefore make sure itself to check both netns and iflink + ifindex for
> equality. Only when they are equal, a "physical" interface was detected
> which should stop the traversal. On the other hand, vxcan_get_iflink can
> also return 0 in case there was currently no valid peer. In this case, it
> is still necessary to stop.
It would would be nice when the situation would be better but the proposed
patches don't solve it. So I will mark the two patches as "Rejected" (from
"Changes requested") in batadv's patchwork. It is not meant as sign of
disapproval of someone working in this area to improve the situation - I just
don't want to wait for the v2 [1] anymore.
Kind regards,
Sven
[1] https://lore.kernel.org/all/20201002090703.GD3565727@bistromath.localdomain/
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2022-05-14 10:21 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-10-01 7:59 [PATCH net 00/12] net: iflink and link-netnsid fixes Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 01/12] ipvlan: add get_link_net Sabrina Dubroca
2020-10-01 8:19 ` Eric Dumazet
2020-10-01 7:59 ` [PATCH net 02/12] geneve: " Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 03/12] Revert "rtnetlink: always put IFLA_LINK for links with a link-netnsid" Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 04/12] rtnetlink: always put IFLA_LINK for links with ndo_get_iflink Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 05/12] bridge: always put IFLA_LINK for ports " Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 06/12] bridge: advertise IFLA_LINK_NETNSID when dumping bridge ports Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 07/12] ipv6: always put IFLA_LINK for devices with ndo_get_iflink Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 08/12] ipv6: advertise IFLA_LINK_NETNSID when dumping ipv6 addresses Sabrina Dubroca
2020-10-01 15:58 ` Nicolas Dichtel
2020-10-02 9:03 ` Sabrina Dubroca
2020-10-05 15:16 ` Nicolas Dichtel
2020-10-01 7:59 ` [PATCH net 09/12] net: link_watch: fix operstate when the link has the same index as the device Sabrina Dubroca
2020-10-01 11:00 ` Taehee Yoo
2020-10-01 7:59 ` [PATCH net 10/12] net: link_watch: fix detection of urgent events Sabrina Dubroca
2020-10-01 7:59 ` [PATCH net 11/12] batman-adv: fix iflink detection in batadv_is_on_batman_iface Sabrina Dubroca
2022-05-14 10:21 ` Sven Eckelmann
2020-10-01 7:59 ` [PATCH net 12/12] batman-adv: fix detection of lower link in batadv_get_real_netdevice Sabrina Dubroca
2020-10-01 21:25 ` [PATCH net 00/12] net: iflink and link-netnsid fixes Stephen Hemminger
2020-10-02 9:07 ` Sabrina Dubroca
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).