* [PATCH net-next 0/1] rtnetlink: request RTM_GETLINK by pid or fd
@ 2018-01-18 20:21 Christian Brauner
2018-01-18 20:21 ` [PATCH net-next 1/1] " Christian Brauner
0 siblings, 1 reply; 23+ messages in thread
From: Christian Brauner @ 2018-01-18 20:21 UTC (permalink / raw)
To: davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski,
vyasevich, jbenc, netdev, linux-kernel
Cc: Christian Brauner
Hey everyone,
This makes it possible to identify the target network namespace of a
RTM_GETLINK message by pid or fd.
Often userspace tools that make heavy use of network namespaces need a simple
and cheap way of querying network devices and network device properties. This
becomes even more crucial when the network namespaces in question are
transient. In such scenarios setting a netns id property is not really wanted
and it is preferable to avoid the hit of (possibly multiple) setns() syscalls
(e.g. attaching to the target network namespace and back to the original
network namespace.). This commit lets userspace set the IFLA_NET_NS_{FD,PID}
property to identify a target network namespace where the device in question is
to be queried.
I couldn't find any obvious reason why this shouldn't be allowed but I haven't
done a deep dive into the possible security implications. So if I missed a very
obvious point why this wasn't possible so far, I'm sorry.
Christian
Christian Brauner (1):
rtnetlink: request RTM_GETLINK by pid or fd
net/core/rtnetlink.c | 63 +++++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 50 insertions(+), 13 deletions(-)
--
2.14.1
^ permalink raw reply [flat|nested] 23+ messages in thread* [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-18 20:21 [PATCH net-next 0/1] rtnetlink: request RTM_GETLINK by pid or fd Christian Brauner @ 2018-01-18 20:21 ` Christian Brauner 2018-01-18 20:29 ` Jiri Benc 0 siblings, 1 reply; 23+ messages in thread From: Christian Brauner @ 2018-01-18 20:21 UTC (permalink / raw) To: davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, jbenc, netdev, linux-kernel Cc: Christian Brauner This makes it possible to identify the target network namespace of a RTM_GETLINK message by pid or fd. Often userspace tools that make heavy use of network namespaces need a simple and cheap way of querying network devices and network device properties. This becomes even more crucial when the network namespaces in question are transient. In such scenarios setting a netns id property is not really wanted and it is preferable to avoid the hit of (possibly multiple) setns() syscalls (e.g. attaching to the target network namespace and back to the original network namespace.). This commit lets userspace set the IFLA_NET_NS_{FD,PID} property to identify a target network namespace where the device in question is to be queried. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> --- net/core/rtnetlink.c | 63 +++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 50 insertions(+), 13 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 16d644a4f974..5210448dcf1f 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1509,7 +1509,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, struct net *src_net, int type, u32 pid, u32 seq, u32 change, unsigned int flags, u32 ext_filter_mask, - u32 event, int *new_nsid, int tgt_netnsid) + u32 event, int *new_nsid, int tgt_netnsid, + int tgt_netnsid_type) { struct ifinfomsg *ifm; struct nlmsghdr *nlh; @@ -1527,8 +1528,23 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, ifm->ifi_flags = dev_get_flags(dev); ifm->ifi_change = change; - if (tgt_netnsid >= 0 && nla_put_s32(skb, IFLA_IF_NETNSID, tgt_netnsid)) - goto nla_put_failure; + if (tgt_netnsid >= 0) { + u32 err = -1; + + switch (tgt_netnsid_type) { + case IFLA_IF_NETNSID: + err = nla_put_s32(skb, IFLA_IF_NETNSID, tgt_netnsid); + break; + case IFLA_NET_NS_PID: + err = nla_put_s32(skb, IFLA_NET_NS_PID, tgt_netnsid); + break; + case IFLA_NET_NS_FD: + err = nla_put_s32(skb, IFLA_NET_NS_FD, tgt_netnsid); + break; + } + if (err) + goto nla_put_failure; + } if (nla_put_string(skb, IFLA_IFNAME, dev->name) || nla_put_u32(skb, IFLA_TXQLEN, dev->tx_queue_len) || @@ -1791,6 +1807,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) unsigned int flags = NLM_F_MULTI; int master_idx = 0; int netnsid = -1; + int netnsid_type = -1; int err; int hdrlen; @@ -1810,12 +1827,22 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) if (nlmsg_parse(cb->nlh, hdrlen, tb, IFLA_MAX, ifla_policy, NULL) >= 0) { if (tb[IFLA_IF_NETNSID]) { + netnsid_type = IFLA_IF_NETNSID; netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]); tgt_net = get_target_net(skb->sk, netnsid); - if (IS_ERR(tgt_net)) { - tgt_net = net; - netnsid = -1; - } + } else if (tb[IFLA_NET_NS_PID]) { + netnsid_type = IFLA_NET_NS_PID; + netnsid = nla_get_s32(tb[IFLA_NET_NS_PID]); + tgt_net = get_net_ns_by_pid(netnsid); + } else if (tb[IFLA_NET_NS_FD]) { + netnsid_type = IFLA_NET_NS_FD; + netnsid = nla_get_s32(tb[IFLA_NET_NS_FD]); + tgt_net = get_net_ns_by_fd(netnsid); + } + if (IS_ERR(tgt_net)) { + tgt_net = net; + netnsid = -1; + netnsid_type = -1; } if (tb[IFLA_EXT_MASK]) @@ -1845,7 +1872,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) cb->nlh->nlmsg_seq, 0, flags, ext_filter_mask, 0, NULL, - netnsid); + netnsid, netnsid_type); if (err < 0) { if (likely(skb->len)) @@ -2984,6 +3011,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh, struct net_device *dev = NULL; struct sk_buff *nskb; int netnsid = -1; + int netnsid_type = -1; int err; u32 ext_filter_mask = 0; @@ -2992,11 +3020,20 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh, return err; if (tb[IFLA_IF_NETNSID]) { + netnsid_type = IFLA_IF_NETNSID; netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]); tgt_net = get_target_net(NETLINK_CB(skb).sk, netnsid); - if (IS_ERR(tgt_net)) - return PTR_ERR(tgt_net); - } + } else if (tb[IFLA_NET_NS_PID]) { + netnsid_type = IFLA_NET_NS_PID; + netnsid = nla_get_s32(tb[IFLA_NET_NS_PID]); + tgt_net = get_net_ns_by_pid(netnsid); + } else if (tb[IFLA_NET_NS_FD]) { + netnsid_type = IFLA_NET_NS_FD; + netnsid = nla_get_s32(tb[IFLA_NET_NS_FD]); + tgt_net = get_net_ns_by_fd(netnsid); + } + if (IS_ERR(tgt_net)) + return PTR_ERR(tgt_net); if (tb[IFLA_IFNAME]) nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ); @@ -3025,7 +3062,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh, err = rtnl_fill_ifinfo(nskb, dev, net, RTM_NEWLINK, NETLINK_CB(skb).portid, nlh->nlmsg_seq, 0, 0, ext_filter_mask, - 0, NULL, netnsid); + 0, NULL, netnsid, netnsid_type); if (err < 0) { /* -EMSGSIZE implies BUG in if_nlmsg_size */ WARN_ON(err == -EMSGSIZE); @@ -3134,7 +3171,7 @@ struct sk_buff *rtmsg_ifinfo_build_skb(int type, struct net_device *dev, err = rtnl_fill_ifinfo(skb, dev, dev_net(dev), type, 0, 0, change, 0, 0, event, - new_nsid, -1); + new_nsid, -1, -1); if (err < 0) { /* -EMSGSIZE implies BUG in if_nlmsg_size() */ WARN_ON(err == -EMSGSIZE); -- 2.14.1 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-18 20:21 ` [PATCH net-next 1/1] " Christian Brauner @ 2018-01-18 20:29 ` Jiri Benc 2018-01-18 20:55 ` Christian Brauner 0 siblings, 1 reply; 23+ messages in thread From: Jiri Benc @ 2018-01-18 20:29 UTC (permalink / raw) To: Christian Brauner Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel On Thu, 18 Jan 2018 21:21:24 +0100, Christian Brauner wrote: > In such scenarios setting a netns id property is > not really wanted Why? I think that's what you should do if you want to avoid setns. Just use netnsid. I don't see any problem with that and you didn't provide any explanation. Jiri ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-18 20:29 ` Jiri Benc @ 2018-01-18 20:55 ` Christian Brauner 2018-01-22 21:00 ` Jiri Benc 0 siblings, 1 reply; 23+ messages in thread From: Christian Brauner @ 2018-01-18 20:55 UTC (permalink / raw) To: Jiri Benc Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Thu, Jan 18, 2018 at 09:29:14PM +0100, Jiri Benc wrote: > On Thu, 18 Jan 2018 21:21:24 +0100, Christian Brauner wrote: > > In such scenarios setting a netns id property is > > not really wanted > > Why? I think that's what you should do if you want to avoid setns. Just > use netnsid. I don't see any problem with that and you didn't provide > any explanation. Ah, sorry, maybe I was to brief. When creating and destroying a lot of short-lived network namespaces it seems unnecessary to give them all label/netns id. Using a netns id makes much more sense when you want a persistent, long-living network namespace. For example, iproute2 where you want to create a persistent network namespace that sticks around via ip netns add bla && ip netns set bla 5. A more concrete scenario is creating a network namespace, moving a device into it via RTM_SETLINK which also supports IFLA_NET_NS_{FD,PID} and then wanting to query the device. This would be very easy to do if one could reuse the IFLA_NET_NS_{FD,PID} without having to set a netnsid. Christian ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-18 20:55 ` Christian Brauner @ 2018-01-22 21:00 ` Jiri Benc 2018-01-22 21:23 ` Christian Brauner 0 siblings, 1 reply; 23+ messages in thread From: Jiri Benc @ 2018-01-22 21:00 UTC (permalink / raw) To: Christian Brauner Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Thu, 18 Jan 2018 21:55:53 +0100, Christian Brauner wrote: > A more concrete scenario is creating a network namespace, moving a > device into it via RTM_SETLINK which also supports IFLA_NET_NS_{FD,PID} > and then wanting to query the device. This would be very easy to do if > one could reuse the IFLA_NET_NS_{FD,PID} without having to set a > netnsid. Wouldn't be a better solution to have a way to request the netnsid allocation (and return it) right away when creating the name space, then? Jiri ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-22 21:00 ` Jiri Benc @ 2018-01-22 21:23 ` Christian Brauner 2018-01-22 22:06 ` Jiri Benc 0 siblings, 1 reply; 23+ messages in thread From: Christian Brauner @ 2018-01-22 21:23 UTC (permalink / raw) To: Jiri Benc Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Mon, Jan 22, 2018 at 10:00:46PM +0100, Jiri Benc wrote: > On Thu, 18 Jan 2018 21:55:53 +0100, Christian Brauner wrote: > > A more concrete scenario is creating a network namespace, moving a > > device into it via RTM_SETLINK which also supports IFLA_NET_NS_{FD,PID} > > and then wanting to query the device. This would be very easy to do if > > one could reuse the IFLA_NET_NS_{FD,PID} without having to set a > > netnsid. > > Wouldn't be a better solution to have a way to request the netnsid > allocation (and return it) right away when creating the name space, > then? That is certainly a good idea and I'm happy to send a follow-up patch for that! But there's still value in being able to use IFLA_NET_NS_{FD,PID} in scenarios where the network namespace has been created by another process. In this case we don't know what its netnsid is and not even if it had been assigned one at creation time or not. In this case it would be useful to refer to the netns via a pid or fd. A more concrete and frequent example is querying a network namespace of a (sorry for the buzzword :)) container for all defined network interfaces. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-22 21:23 ` Christian Brauner @ 2018-01-22 22:06 ` Jiri Benc 2018-01-22 22:25 ` Christian Brauner 2018-01-23 16:37 ` Nicolas Dichtel 0 siblings, 2 replies; 23+ messages in thread From: Jiri Benc @ 2018-01-22 22:06 UTC (permalink / raw) To: Christian Brauner Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Mon, 22 Jan 2018 22:23:54 +0100, Christian Brauner wrote: > That is certainly a good idea and I'm happy to send a follow-up patch > for that! Note that I haven't looked into that and I don't know whether it is easily possible. I'll appreciate if you could try that. > But there's still value in being able to use > IFLA_NET_NS_{FD,PID} in scenarios where the network namespace has been > created by another process. In this case we don't know what its netnsid > is and not even if it had been assigned one at creation time or not. In > this case it would be useful to refer to the netns via a pid or fd. A > more concrete and frequent example is querying a network namespace of a > (sorry for the buzzword :)) container for all defined network > interfaces. That's what spurred my original comment. If you don't know the netnsid in such case, we're missing something in uAPI but at a different point than RTM_GETLINK. When you find yourself in a need to query an interface in another netns, you had to learn about that interface in the first place. Meaning you got its ifindex (or ifname, perhaps) somehow. My point is, you should have learned the netnsid at the same time. ifindex alone is not an unique identifier of an interface. The (ifindex, netnsid) pair is. (Also, note that ifindex can change when moving the interface to a different netns.) So you should never get ifindex alone when the interface is in another netns than the current one. If that happens, that is the uAPI that needs to be fixed. You have to always get the (ifindex, netnsid) pair. And that's also the way how it should operate in the other direction: (ifindex, netnsid) is the identifier to query interface in another netns. Note that many APIs in networking are based around netnsid - look at NETLINK_LISTEN_ALL_NSID. It allows you to keep track of interfaces as they are moved from name spaces to name spaces using a single socket: you get notifications about interfaces disappearing or appearing in "watched" name spaces. The name spaces are, of course, referenced by their netnsid. In order to add another "watched" name space, just assign it (or, more typically, let the kernel assign) a netnsid. Btw, we have one missing piece here: when an interface is moved to a name space that does not have netnsid attached, we want to find out where the interface was moved to. But there's currently no reliable way to do it. For veth, the other end can be used to get the netnsid (note that RTM_GETLINK will return the correct link netnsid even when the queried veth interface is in a different name space), for openvswitch, we can now use genetlink, etc., but using different ways for different interface types is not the best API ever and for standalone interfaces we have nothing. I'd like to see something added to uAPI to cover this in a generic way. But as for this patch, I don't think it's the correct way. We do have missing pieces in uAPI wrt. netns support but I think they're at different places. If you have a counter example, please speak up. Thanks, Jiri ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-22 22:06 ` Jiri Benc @ 2018-01-22 22:25 ` Christian Brauner 2018-01-23 9:30 ` Jiri Benc 2018-01-23 16:37 ` Nicolas Dichtel 1 sibling, 1 reply; 23+ messages in thread From: Christian Brauner @ 2018-01-22 22:25 UTC (permalink / raw) To: Jiri Benc Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Mon, Jan 22, 2018 at 11:06:16PM +0100, Jiri Benc wrote: > On Mon, 22 Jan 2018 22:23:54 +0100, Christian Brauner wrote: > > That is certainly a good idea and I'm happy to send a follow-up patch > > for that! > > Note that I haven't looked into that and I don't know whether it is > easily possible. I'll appreciate if you could try that. > > > But there's still value in being able to use > > IFLA_NET_NS_{FD,PID} in scenarios where the network namespace has been > > created by another process. In this case we don't know what its netnsid > > is and not even if it had been assigned one at creation time or not. In > > this case it would be useful to refer to the netns via a pid or fd. A > > more concrete and frequent example is querying a network namespace of a > > (sorry for the buzzword :)) container for all defined network > > interfaces. > > That's what spurred my original comment. If you don't know the netnsid > in such case, we're missing something in uAPI but at a different point > than RTM_GETLINK. > > When you find yourself in a need to query an interface in another > netns, you had to learn about that interface in the first place. > Meaning you got its ifindex (or ifname, perhaps) somehow. My point is, > you should have learned the netnsid at the same time. <snip> > different places. If you have a counter example, please speak up. This is not necessarily true in scenarios where I move a network device via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't created. Here is an example: nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; nlmsghdr->nlmsg_type = RTM_NEWLINK; /* move to network namespace of pid */ nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid) /* give interface new name */ nla_put_string(nlmsg, IFLA_IFNAME, ifname) The only thing I have is the pid that identifies the network namespace. There's no non-syscall way to learn the netnsid. In this case I just want to be able to do the analogue to RTM_NEWLINK + IFLA_NET_NS_PID aka RTM_GETLINK + IFLA_NET_NS_PID to e.g. retrieve the list of all network devices in the network namespace identified by pid. Christian ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-22 22:25 ` Christian Brauner @ 2018-01-23 9:30 ` Jiri Benc 2018-01-23 10:26 ` Wolfgang Bumiller 0 siblings, 1 reply; 23+ messages in thread From: Jiri Benc @ 2018-01-23 9:30 UTC (permalink / raw) To: Christian Brauner Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote: > This is not necessarily true in scenarios where I move a network device > via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't > created. Here is an example: > > nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; > nlmsghdr->nlmsg_type = RTM_NEWLINK; > /* move to network namespace of pid */ > nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid) > /* give interface new name */ > nla_put_string(nlmsg, IFLA_IFNAME, ifname) > > The only thing I have is the pid that identifies the network namespace. How do you know the interface did not get renamed in the new netns? This is racy and won't work reliably. You really need to know the netnsid before moving the interface to the netns to be able to do meaningful queries. You may argue that for your case, you're fine with the race. But I know about use cases where it matters a lot: those are tools that show network topology including changes in real time, such as Skydive. It's important to have the uAPI designed right. And we don't want two different APIs for the same thing. If you want to do any watching over the interfaces (as opposed to "move to the netns and forget"), you really have to work with netnsids. Let's focus on how to do that more easily. We don't return netnsid at all places where we should return it and we need to fix that. > There's no non-syscall way to learn the netnsid. And that is the primary problem. Jiri ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-23 9:30 ` Jiri Benc @ 2018-01-23 10:26 ` Wolfgang Bumiller 2018-01-23 10:42 ` Jiri Benc 2018-01-23 16:50 ` [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd Nicolas Dichtel 0 siblings, 2 replies; 23+ messages in thread From: Wolfgang Bumiller @ 2018-01-23 10:26 UTC (permalink / raw) To: Jiri Benc Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Tue, Jan 23, 2018 at 10:30:09AM +0100, Jiri Benc wrote: > On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote: > > This is not necessarily true in scenarios where I move a network device > > via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't > > created. Here is an example: > > > > nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; > > nlmsghdr->nlmsg_type = RTM_NEWLINK; > > /* move to network namespace of pid */ > > nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid) > > /* give interface new name */ > > nla_put_string(nlmsg, IFLA_IFNAME, ifname) > > > > The only thing I have is the pid that identifies the network namespace. > > How do you know the interface did not get renamed in the new netns? > > This is racy and won't work reliably. You really need to know the > netnsid before moving the interface to the netns to be able to do > meaningful queries. Even if you know the netnsid, do the mentioned watches work for nested/child namespaces if eg. a container creates new namespace before and/or after the watch was established and moves interfaces to these child namespaces, would you just see them disappear, or can you keep track of them later on as well? Even if that works, from what the documentation tells me netlink is an unreliable protocol, so if my watcher's socket buffer is full, wouldn't I be losing important tracking information? I think one possible solution to tracking interfaces would be to have a unique identifier that never changes (even if it's just a simple uint64_t incremented whenever an interface is created). But since they're not local to the current namespace that may require a lot of extra permission checks (but I'm just speculating here...). In any case, IFLA_NET_NS_FD/PID are already there and I had been wondering previously why they couldn't be used with RTM_GETLINK, it would just make sense. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-23 10:26 ` Wolfgang Bumiller @ 2018-01-23 10:42 ` Jiri Benc [not found] ` <20180123114218.vsm5nu2jajrqjvko@gmail.com> 2018-01-23 16:50 ` [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd Nicolas Dichtel 1 sibling, 1 reply; 23+ messages in thread From: Jiri Benc @ 2018-01-23 10:42 UTC (permalink / raw) To: Wolfgang Bumiller Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Tue, 23 Jan 2018 11:26:58 +0100, Wolfgang Bumiller wrote: > Even if you know the netnsid, do the mentioned watches work for > nested/child namespaces if eg. a container creates new namespace before > and/or after the watch was established and moves interfaces to these > child namespaces, would you just see them disappear, or can you keep > track of them later on as well? What do you mean by "nested namespaces"? There's no such thing for net name spaces. As for missing API to get netnsid of the netns the interface is moved to, see my previous emails in this thread. This needs to be added. > Even if that works, from what the documentation tells me netlink is an > unreliable protocol, so if my watcher's socket buffer is full, wouldn't > I be losing important tracking information? Sure. But that's fundamentally unfixable independently on netlink, the kernel needs to take an action if a program is not reading its messages. Either some messages get dropped or the program is killed or infinite amount of memory is consumed. This has nothing to do with uAPI design. > I think one possible solution to tracking interfaces would be to have a > unique identifier that never changes (even if it's just a simple > uint64_t incremented whenever an interface is created). But since > they're not local to the current namespace that may require a lot of > extra permission checks (but I'm just speculating here...). You'll get a hard NACK from CRIU folks if you try to propose this. > In any case, IFLA_NET_NS_FD/PID are already there and I had been > wondering previously why they couldn't be used with RTM_GETLINK, it > would just make sense. Those predate netnsids and we can't get rid of them now, since they're part of uAPI. But we can (and should) make sure we don't add more of those. Jiri ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <20180123114218.vsm5nu2jajrqjvko@gmail.com>]
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd [not found] ` <20180123114218.vsm5nu2jajrqjvko@gmail.com> @ 2018-01-23 12:22 ` Jiri Benc 2018-01-23 16:55 ` Nicolas Dichtel ` (3 more replies) 0 siblings, 4 replies; 23+ messages in thread From: Jiri Benc @ 2018-01-23 12:22 UTC (permalink / raw) To: Christian Brauner; +Cc: netdev (Christian, I'm adding back the netdev list, there's no reason not to have the discussion in open.) On Tue, 23 Jan 2018 12:42:19 +0100, Christian Brauner wrote: > Thanks for the comments and discussion. Sorry, for not going through the > list but I just have a quick question that doesn't deserve to spam the > whole netdev list. :) I have written another set of patches that would > add support for IFLA_IF_NETNSID to RTM_{N,S}ETLINK and friends which I > planned to send out after the pid and fd ones. Would you be interested > in those if I sent them out over the next week? Yes, please CC me. I will have limited time next week (whole day meetings for the most of the week) but I'll try to review. In general, I see value in adding netnsid support to setlink, newlink and dellink. It's a bit of minefield, though. Look for and be sure to resolve: - Backwards compatibility with old kernels. Old kernels will ignore the IFLA_IF_NETNSID attribute that is unknown to them. And it's certainly undesirable for an application to attempt to set/create an interface in a given netns and instead the kernel sets/creates an interface in the current netns. In 79e1ad148c844, I put into place means to handle that: any kernel that understands IFLA_IF_NETNSID but does not support setlink/newlink/dellink will return -EOPNOTSUPP. This reduces the problem to detecting whether the kernel understands IFLA_IF_NETNSID. This can be done by invoking getlink with IFLA_IF_NETNSID and looking whether the reply contains IFLA_IF_NETNSID. The current getlink calls are also limited to CAP_NET_ADMIN in the *target* netns. If/when this is relaxed in the future, we still need to stay backwards compatible. So, the application needs to do this: 1. call RTM_GETLINK with IFLA_IF_NETNSID on lo (or do a dump) 2. if getting EACCESS, assume IFLA_IF_NETNSID is not supported => fall back to other means 2. if the reply does not contain IFLA_IF_NETNSID, the kernel does not support IFLA_IF_NETNSID => fall back to other means 3. try RTM_SETLINK/NEWLINK/DELLINK and check for EOPNOTSUPP => fall back to other means What is important for your kernel patches is to not break this. If you implement only partial support for some of the operation, think about the future extensibility. Applications have to be able to reliably detect what is supported on the running kernel, regardless of the kernel version. - Access rights. This is security sensitive stuff. I took the big hammer and limited everything to CAP_NET_ADMIN in the target netns. If you want to relax this, be sure to get review from people involved in name space security. Jiri ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-23 12:22 ` Jiri Benc @ 2018-01-23 16:55 ` Nicolas Dichtel 2018-01-23 18:05 ` Christian Brauner 2018-01-24 11:32 ` [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK Christian Brauner ` (2 subsequent siblings) 3 siblings, 1 reply; 23+ messages in thread From: Nicolas Dichtel @ 2018-01-23 16:55 UTC (permalink / raw) To: Jiri Benc, Christian Brauner; +Cc: netdev Le 23/01/2018 à 13:22, Jiri Benc a écrit : > (Christian, I'm adding back the netdev list, there's no reason not to > have the discussion in open.) > > On Tue, 23 Jan 2018 12:42:19 +0100, Christian Brauner wrote: >> Thanks for the comments and discussion. Sorry, for not going through the >> list but I just have a quick question that doesn't deserve to spam the >> whole netdev list. :) I have written another set of patches that would >> add support for IFLA_IF_NETNSID to RTM_{N,S}ETLINK and friends which I >> planned to send out after the pid and fd ones. Would you be interested >> in those if I sent them out over the next week? > > Yes, please CC me. I will have limited time next week (whole day > meetings for the most of the week) but I'll try to review. CC me also please. Regards, Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-23 16:55 ` Nicolas Dichtel @ 2018-01-23 18:05 ` Christian Brauner 0 siblings, 0 replies; 23+ messages in thread From: Christian Brauner @ 2018-01-23 18:05 UTC (permalink / raw) To: Nicolas Dichtel; +Cc: Jiri Benc, netdev On Tue, Jan 23, 2018 at 05:55:27PM +0100, Nicolas Dichtel wrote: > Le 23/01/2018 à 13:22, Jiri Benc a écrit : > > (Christian, I'm adding back the netdev list, there's no reason not to > > have the discussion in open.) > > > > On Tue, 23 Jan 2018 12:42:19 +0100, Christian Brauner wrote: > >> Thanks for the comments and discussion. Sorry, for not going through the > >> list but I just have a quick question that doesn't deserve to spam the > >> whole netdev list. :) I have written another set of patches that would > >> add support for IFLA_IF_NETNSID to RTM_{N,S}ETLINK and friends which I > >> planned to send out after the pid and fd ones. Would you be interested > >> in those if I sent them out over the next week? > > > > Yes, please CC me. I will have limited time next week (whole day > > meetings for the most of the week) but I'll try to review. > CC me also please. Of course, will do. I plan on sending out the patches for RTM_SETLINK and RTM_DELLINK tomorrow after I've done some more testing on it. Thanks! Christian ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK 2018-01-23 12:22 ` Jiri Benc 2018-01-23 16:55 ` Nicolas Dichtel @ 2018-01-24 11:32 ` Christian Brauner 2018-01-24 11:32 ` [PATCH net-next 1/3] rtnetlink: enable IFLA_IF_NETNSID in do_setlink() Christian Brauner 2018-01-24 11:52 ` [PATCH net-next 2/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK Christian Brauner 2018-01-24 11:53 ` [PATCH net-next 3/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK Christian Brauner 3 siblings, 1 reply; 23+ messages in thread From: Christian Brauner @ 2018-01-24 11:32 UTC (permalink / raw) To: netdev Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, linux-kernel, stephen, jbenc, w.bumiller, nicolas.dichtel, Christian Brauner Hi, Based on the previous discussion this enables passing a IFLA_IF_NETNSID property along with RTM_SETLINK and RTM_DELLINK requests. The patch for RTM_NEWLINK will be sent out in a separate patch since there are more corner-cases to think about. Best, Christian Christian Brauner (3): rtnetlink: enable IFLA_IF_NETNSID in do_setlink() rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK net/core/rtnetlink.c | 97 ++++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 76 insertions(+), 21 deletions(-) -- 2.14.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH net-next 1/3] rtnetlink: enable IFLA_IF_NETNSID in do_setlink() 2018-01-24 11:32 ` [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK Christian Brauner @ 2018-01-24 11:32 ` Christian Brauner 0 siblings, 0 replies; 23+ messages in thread From: Christian Brauner @ 2018-01-24 11:32 UTC (permalink / raw) To: netdev Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, linux-kernel, stephen, jbenc, w.bumiller, nicolas.dichtel, Christian Brauner RTM_{NEW,SET}LINK already allow operations on other network namespaces by identifying the target network namespace through IFLA_NET_NS_{FD,PID} properties. This is done by looking for the corresponding properties in do_setlink(). Extend do_setlink() to also look for the IFLA_IF_NETNSID property. This introduces no functional changes since all callers of do_setlink() currently block IFLA_IF_NETNSID by reporting an error before they reach do_setlink(). This introduces the helpers: static struct net *rtnl_link_get_net_by_nlattr(struct net *src_net, struct nlattr *tb[]) static struct net *rtnl_link_get_net_capable(const struct sk_buff *skb, struct net *src_net, struct nlattr *tb[], int cap) to simplify permission checks and target network namespace retrieval for RTM_* requests that already support IFLA_NET_NS_{FD,PID} but get extended to IFLA_IF_NETNSID. To perserve backwards compatibility the helpers look for IFLA_NET_NS_{FD,PID} properties first before checking for IFLA_IF_NETNSID. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> --- net/core/rtnetlink.c | 54 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 47 insertions(+), 7 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 16d644a4f974..54134187485b 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1893,6 +1893,49 @@ struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[]) } EXPORT_SYMBOL(rtnl_link_get_net); +/* Figure out which network namespace we are talking about by + * examining the link attributes in the following order: + * + * 1. IFLA_NET_NS_PID + * 2. IFLA_NET_NS_FD + * 3. IFLA_IF_NETNSID + */ +static struct net *rtnl_link_get_net_by_nlattr(struct net *src_net, + struct nlattr *tb[]) +{ + struct net *net; + + if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD]) + return rtnl_link_get_net(src_net, tb); + + if (!tb[IFLA_IF_NETNSID]) + return get_net(src_net); + + net = get_net_ns_by_id(src_net, nla_get_u32(tb[IFLA_IF_NETNSID])); + if (!net) + return ERR_PTR(-EINVAL); + + return net; +} + +static struct net *rtnl_link_get_net_capable(const struct sk_buff *skb, + struct net *src_net, + struct nlattr *tb[], int cap) +{ + struct net *net; + + net = rtnl_link_get_net_by_nlattr(src_net, tb); + if (IS_ERR(net)) + return net; + + if (!netlink_ns_capable(skb, net->user_ns, cap)) { + put_net(net); + return ERR_PTR(-EPERM); + } + + return net; +} + static int validate_linkmsg(struct net_device *dev, struct nlattr *tb[]) { if (dev) { @@ -2155,17 +2198,14 @@ static int do_setlink(const struct sk_buff *skb, const struct net_device_ops *ops = dev->netdev_ops; int err; - if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD]) { - struct net *net = rtnl_link_get_net(dev_net(dev), tb); + if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD] || tb[IFLA_IF_NETNSID]) { + struct net *net = rtnl_link_get_net_capable(skb, dev_net(dev), + tb, CAP_NET_ADMIN); if (IS_ERR(net)) { err = PTR_ERR(net); goto errout; } - if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) { - put_net(net); - err = -EPERM; - goto errout; - } + err = dev_change_net_namespace(dev, net, ifname); put_net(net); if (err) -- 2.14.1 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH net-next 2/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK 2018-01-23 12:22 ` Jiri Benc 2018-01-23 16:55 ` Nicolas Dichtel 2018-01-24 11:32 ` [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK Christian Brauner @ 2018-01-24 11:52 ` Christian Brauner 2018-01-24 11:53 ` [PATCH net-next 3/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK Christian Brauner 3 siblings, 0 replies; 23+ messages in thread From: Christian Brauner @ 2018-01-24 11:52 UTC (permalink / raw) To: netdev Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, linux-kernel, jbenc, w.bumiller, nicolas.dichtel, Christian Brauner - Backwards Compatibility: If userspace wants to determine whether RTM_SETLINK supports the IFLA_IF_NETNSID property they should first send an RTM_GETLINK request with IFLA_IF_NETNSID on lo. If either EACCESS is returned or the reply does not include IFLA_IF_NETNSID userspace should assume that IFLA_IF_NETNSID is not supported on this kernel. If the reply does contain an IFLA_IF_NETNSID property userspace can send an RTM_SETLINK with a IFLA_IF_NETNSID property. If they receive EOPNOTSUPP then the kernel does not support the IFLA_IF_NETNSID property with RTM_SETLINK. Userpace should then fallback to other means. To retain backwards compatibility the kernel will first check whether a IFLA_NET_NS_PID or IFLA_NET_NS_FD property has been passed. If either one is found it will be used to identify the target network namespace. This implies that users who do not care whether their running kernel supports IFLA_IF_NETNSID with RTM_SETLINK can pass both IFLA_NET_NS_{FD,PID} and IFLA_IF_NETNSID referring to the same network namespace. - Security: Callers must have CAP_NET_ADMIN in the owning user namespace of the target network namespace. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> --- net/core/rtnetlink.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 54134187485b..a4d4409685e3 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -2546,9 +2546,6 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, if (err < 0) goto errout; - if (tb[IFLA_IF_NETNSID]) - return -EOPNOTSUPP; - if (tb[IFLA_IFNAME]) nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ); else -- 2.14.1 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH net-next 3/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK 2018-01-23 12:22 ` Jiri Benc ` (2 preceding siblings ...) 2018-01-24 11:52 ` [PATCH net-next 2/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK Christian Brauner @ 2018-01-24 11:53 ` Christian Brauner 3 siblings, 0 replies; 23+ messages in thread From: Christian Brauner @ 2018-01-24 11:53 UTC (permalink / raw) To: netdev Cc: davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, linux-kernel, jbenc, w.bumiller, nicolas.dichtel, Christian Brauner - Backwards Compatibility: If userspace wants to determine whether RTM_DELLINK supports the IFLA_IF_NETNSID property they should first send an RTM_GETLINK request with IFLA_IF_NETNSID on lo. If either EACCESS is returned or the reply does not include IFLA_IF_NETNSID userspace should assume that IFLA_IF_NETNSID is not supported on this kernel. If the reply does contain an IFLA_IF_NETNSID property userspace can send an RTM_DELLINK with a IFLA_IF_NETNSID property. If they receive EOPNOTSUPP then the kernel does not support the IFLA_IF_NETNSID property with RTM_DELLINK. Userpace should then fallback to other means. - Security: Callers must have CAP_NET_ADMIN in the owning user namespace of the target network namespace. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> --- net/core/rtnetlink.c | 40 +++++++++++++++++++++++++++++----------- 1 file changed, 29 insertions(+), 11 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index a4d4409685e3..17f7c3508a0c 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -2630,36 +2630,54 @@ static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh, struct netlink_ext_ack *extack) { struct net *net = sock_net(skb->sk); - struct net_device *dev; + struct net *tgt_net = net; + struct net_device *dev = NULL; struct ifinfomsg *ifm; char ifname[IFNAMSIZ]; struct nlattr *tb[IFLA_MAX+1]; int err; + int netnsid = -1; err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy, extack); if (err < 0) return err; - if (tb[IFLA_IF_NETNSID]) - return -EOPNOTSUPP; - if (tb[IFLA_IFNAME]) nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ); + if (tb[IFLA_IF_NETNSID]) { + netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]); + tgt_net = get_target_net(NETLINK_CB(skb).sk, netnsid); + if (IS_ERR(tgt_net)) + return PTR_ERR(tgt_net); + } + + err = -EINVAL; ifm = nlmsg_data(nlh); if (ifm->ifi_index > 0) - dev = __dev_get_by_index(net, ifm->ifi_index); + dev = __dev_get_by_index(tgt_net, ifm->ifi_index); else if (tb[IFLA_IFNAME]) - dev = __dev_get_by_name(net, ifname); + dev = __dev_get_by_name(tgt_net, ifname); else if (tb[IFLA_GROUP]) - return rtnl_group_dellink(net, nla_get_u32(tb[IFLA_GROUP])); + err = rtnl_group_dellink(tgt_net, nla_get_u32(tb[IFLA_GROUP])); else - return -EINVAL; + goto out; - if (!dev) - return -ENODEV; + if (!dev) { + if (tb[IFLA_GROUP]) + goto out; - return rtnl_delete_link(dev); + err = -ENODEV; + goto out; + } + + err = rtnl_delete_link(dev); + +out: + if (netnsid >= 0) + put_net(tgt_net); + + return err; } int rtnl_configure_link(struct net_device *dev, const struct ifinfomsg *ifm) -- 2.14.1 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-23 10:26 ` Wolfgang Bumiller 2018-01-23 10:42 ` Jiri Benc @ 2018-01-23 16:50 ` Nicolas Dichtel 1 sibling, 0 replies; 23+ messages in thread From: Nicolas Dichtel @ 2018-01-23 16:50 UTC (permalink / raw) To: Wolfgang Bumiller, Jiri Benc Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen Le 23/01/2018 à 11:26, Wolfgang Bumiller a écrit : > On Tue, Jan 23, 2018 at 10:30:09AM +0100, Jiri Benc wrote: >> On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote: >>> This is not necessarily true in scenarios where I move a network device >>> via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't >>> created. Here is an example: >>> >>> nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; >>> nlmsghdr->nlmsg_type = RTM_NEWLINK; >>> /* move to network namespace of pid */ >>> nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid) >>> /* give interface new name */ >>> nla_put_string(nlmsg, IFLA_IFNAME, ifname) >>> >>> The only thing I have is the pid that identifies the network namespace. >> >> How do you know the interface did not get renamed in the new netns? >> >> This is racy and won't work reliably. You really need to know the >> netnsid before moving the interface to the netns to be able to do >> meaningful queries. > > Even if you know the netnsid, do the mentioned watches work for > nested/child namespaces if eg. a container creates new namespace before > and/or after the watch was established and moves interfaces to these > child namespaces, would you just see them disappear, or can you keep > track of them later on as well? nsid can be monitored (see ip monitor nsid). > > Even if that works, from what the documentation tells me netlink is an > unreliable protocol, so if my watcher's socket buffer is full, wouldn't > I be losing important tracking information? You can track socket error statistics. In case of error, you can start a dump to ensure that you have the right view of the system. > > I think one possible solution to tracking interfaces would be to have a > unique identifier that never changes (even if it's just a simple > uint64_t incremented whenever an interface is created). But since > they're not local to the current namespace that may require a lot of > extra permission checks (but I'm just speculating here...). It's not possible to have unique identifiers. With CRIU, you need to be able to reassign all existing ids. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-22 22:06 ` Jiri Benc 2018-01-22 22:25 ` Christian Brauner @ 2018-01-23 16:37 ` Nicolas Dichtel 2018-01-23 17:08 ` Jiri Benc 1 sibling, 1 reply; 23+ messages in thread From: Nicolas Dichtel @ 2018-01-23 16:37 UTC (permalink / raw) To: Jiri Benc, Christian Brauner Cc: Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen Le 22/01/2018 à 23:06, Jiri Benc a écrit : [snip] > Btw, we have one missing piece here: when an interface is moved to a > name space that does not have netnsid attached, we want to find out > where the interface was moved to. But there's currently no reliable way > to do it. For veth, the other end can be used to get the netnsid (note > that RTM_GETLINK will return the correct link netnsid even when the > queried veth interface is in a different name space), for openvswitch, > we can now use genetlink, etc., but using different ways for different > interface types is not the best API ever and for standalone interfaces > we have nothing. I'd like to see something added to uAPI to cover this > in a generic way. When a virtual interface moves to another netns, the netlink RTM_DELLINK message contains the attribute IFLA_NEW_NETNSID, which identifies where the interface moves. The nsid may be allocated if needed. I don't know if it's acceptable to also allocate an nsid in case of a physical interface. Regards, Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-23 16:37 ` Nicolas Dichtel @ 2018-01-23 17:08 ` Jiri Benc 2018-01-24 10:53 ` Nicolas Dichtel 0 siblings, 1 reply; 23+ messages in thread From: Jiri Benc @ 2018-01-23 17:08 UTC (permalink / raw) To: Nicolas Dichtel Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Tue, 23 Jan 2018 17:37:11 +0100, Nicolas Dichtel wrote: > When a virtual interface moves to another netns, the netlink RTM_DELLINK message > contains the attribute IFLA_NEW_NETNSID, which identifies where the interface > moves. The nsid may be allocated if needed. The problem is that ifindex may change and it's not announced. The only way is to track both ifindex and ifname, watch for the ifname to appear in the target netns and update the application's view of ifindex. It would be much better if the whole (ifindex, netnsid) pair was returned. I think it could be added. > I don't know if it's acceptable to also allocate an nsid in case of a physical > interface. I think we need that. Thanks, Jiri ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-23 17:08 ` Jiri Benc @ 2018-01-24 10:53 ` Nicolas Dichtel 2018-01-24 11:03 ` Jiri Benc 0 siblings, 1 reply; 23+ messages in thread From: Nicolas Dichtel @ 2018-01-24 10:53 UTC (permalink / raw) To: Jiri Benc Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen Le 23/01/2018 à 18:08, Jiri Benc a écrit : > On Tue, 23 Jan 2018 17:37:11 +0100, Nicolas Dichtel wrote: >> When a virtual interface moves to another netns, the netlink RTM_DELLINK message >> contains the attribute IFLA_NEW_NETNSID, which identifies where the interface >> moves. The nsid may be allocated if needed. > > The problem is that ifindex may change and it's not announced. The only > way is to track both ifindex and ifname, watch for the ifname to appear > in the target netns and update the application's view of ifindex. Yes, you're right. > > It would be much better if the whole (ifindex, netnsid) pair was > returned. I think it could be added. Sure. Do you plan to send a patch? > >> I don't know if it's acceptable to also allocate an nsid in case of a physical >> interface. > > I think we need that. If you agree, I can send a patch to remove this limitation. Regards, Nicolas ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd 2018-01-24 10:53 ` Nicolas Dichtel @ 2018-01-24 11:03 ` Jiri Benc 0 siblings, 0 replies; 23+ messages in thread From: Jiri Benc @ 2018-01-24 11:03 UTC (permalink / raw) To: Nicolas Dichtel Cc: Christian Brauner, Christian Brauner, davem, dsahern, fw, daniel, lucien.xin, mschiffer, jakub.kicinski, vyasevich, netdev, linux-kernel, stephen On Wed, 24 Jan 2018 11:53:11 +0100, Nicolas Dichtel wrote: > Le 23/01/2018 à 18:08, Jiri Benc a écrit : > > It would be much better if the whole (ifindex, netnsid) pair was > > returned. I think it could be added. > Sure. Do you plan to send a patch? I can do that but it will take a while. I'll be happy to leave that to someone else but if there's no one, I'll eventually do it. > > I think we need that. > If you agree, I can send a patch to remove this limitation. That would be great. Thanks! Jiri ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2018-01-24 11:53 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-18 20:21 [PATCH net-next 0/1] rtnetlink: request RTM_GETLINK by pid or fd Christian Brauner
2018-01-18 20:21 ` [PATCH net-next 1/1] " Christian Brauner
2018-01-18 20:29 ` Jiri Benc
2018-01-18 20:55 ` Christian Brauner
2018-01-22 21:00 ` Jiri Benc
2018-01-22 21:23 ` Christian Brauner
2018-01-22 22:06 ` Jiri Benc
2018-01-22 22:25 ` Christian Brauner
2018-01-23 9:30 ` Jiri Benc
2018-01-23 10:26 ` Wolfgang Bumiller
2018-01-23 10:42 ` Jiri Benc
[not found] ` <20180123114218.vsm5nu2jajrqjvko@gmail.com>
2018-01-23 12:22 ` Jiri Benc
2018-01-23 16:55 ` Nicolas Dichtel
2018-01-23 18:05 ` Christian Brauner
2018-01-24 11:32 ` [PATCH net-next 0/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_{DEL,SET}LINK Christian Brauner
2018-01-24 11:32 ` [PATCH net-next 1/3] rtnetlink: enable IFLA_IF_NETNSID in do_setlink() Christian Brauner
2018-01-24 11:52 ` [PATCH net-next 2/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK Christian Brauner
2018-01-24 11:53 ` [PATCH net-next 3/3] rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK Christian Brauner
2018-01-23 16:50 ` [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd Nicolas Dichtel
2018-01-23 16:37 ` Nicolas Dichtel
2018-01-23 17:08 ` Jiri Benc
2018-01-24 10:53 ` Nicolas Dichtel
2018-01-24 11:03 ` Jiri Benc
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).