netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: Christian Brauner <christian.brauner@canonical.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>,
	netdev@vger.kernel.org, stephen@networkplumber.org,
	w.bumiller@proxmox.com, ebiederm@xmission.com, jbenc@redhat.com,
	nicolas.dichtel@6wind.com, linux-kernel@vger.kernel.org,
	dsahern@gmail.com, davem@davemloft.net
Subject: Re: [PATCH net 1/1 v2] rtnetlink: require unique netns identifier
Date: Tue, 6 Feb 2018 13:49:10 +0300	[thread overview]
Message-ID: <2ca51c73-71c3-7d4c-d6fb-a550038518bc@virtuozzo.com> (raw)
In-Reply-To: <20180205232438.GA8695@gmail.com>

Hi, Christian,

On 06.02.2018 02:24, Christian Brauner wrote:
> On Tue, Feb 06, 2018 at 12:47:46AM +0300, Kirill Tkhai wrote:
>> On 05.02.2018 18:55, Christian Brauner wrote:
>>> Since we've added support for IFLA_IF_NETNSID for RTM_{DEL,GET,SET,NEW}LINK
>>> it is possible for userspace to send us requests with three different
>>> properties to identify a target network namespace. This affects at least
>>> RTM_{NEW,SET}LINK. Each of them could potentially refer to a different
>>> network namespace which is confusing. For legacy reasons the kernel will
>>> pick the IFLA_NET_NS_PID property first and then look for the
>>> IFLA_NET_NS_FD property but there is no reason to extend this type of
>>> behavior to network namespace ids. The regression potential is quite
>>> minimal since the rtnetlink requests in question either won't allow
>>> IFLA_IF_NETNSID requests before 4.16 is out (RTM_{NEW,SET}LINK) or don't
>>> support IFLA_NET_NS_{PID,FD} (RTM_{DEL,GET}LINK) in the first place.
>>>> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
>>> ---
>>> ChangeLog v1->v2:
>>> * return errno when the specified network namespace id is invalid
>>> * fill in struct netlink_ext_ack if the network namespace id is invalid
>>> * rename rtnl_ensure_unique_netns_attr() to rtnl_ensure_unique_netns() to
>>>   indicate that a request without any network namespace identifying attributes
>>>   is also considered valid.
>>>
>>> ChangeLog v0->v1:
>>> * report a descriptive error to userspace via struct netlink_ext_ack
>>> * do not fail when multiple properties specifiy the same network namespace
>>> ---
>>>  net/core/rtnetlink.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 69 insertions(+)
>>>
>>> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
>>> index 56af8e41abfc..c096c4ff9a00 100644
>>> --- a/net/core/rtnetlink.c
>>> +++ b/net/core/rtnetlink.c
>>> @@ -1951,6 +1951,59 @@ static struct net *rtnl_link_get_net_capable(const struct sk_buff *skb,
>>>  	return net;
>>>  }
>>>  
>>> +/* Verify that rtnetlink requests supporting network namespace ids
>>> + * do not pass additional properties referring to different network
>>> + * namespaces.
>>> + */
>>> +static int rtnl_ensure_unique_netns(const struct sock *sk, struct nlattr *tb[],
>>> +				    struct netlink_ext_ack *extack)
>>> +{
>>> +	int ret = -EINVAL;
>>> +	struct net *net = NULL, *unique_net = NULL;
>>> +
>>> +	/* Requests without network namespace ids have been able to specify
>>> +	 * multiple properties referring to different network namespaces so
>>> +	 * don't regress them.
>>> +	 */
>>> +	if (!tb[IFLA_IF_NETNSID])
>>> +		return 0;
>>> +
>>> +	/* Caller operates on the current network namespace. */
>>> +	if (!tb[IFLA_NET_NS_PID] && !tb[IFLA_NET_NS_FD])
>>> +		return 0;
>>> +
>>> +	unique_net = get_net_ns_by_id(sock_net(sk), nla_get_s32(tb[IFLA_IF_NETNSID]));
>>> +	if (!unique_net) {
>>> +		NL_SET_ERR_MSG(extack, "invalid network namespace id");
>>> +		return ret;
>>> +	}
>>> +
>>> +	if (tb[IFLA_NET_NS_PID]) {
>>> +		net = get_net_ns_by_pid(nla_get_u32(tb[IFLA_NET_NS_PID]));
>>> +		if (net != unique_net)
>>> +			goto on_error;
>>> +	}
>>> +
>>> +	if (tb[IFLA_NET_NS_FD]) {
>>> +		net = get_net_ns_by_fd(nla_get_u32(tb[IFLA_NET_NS_FD]));
>>> +		if (net != unique_net)
>>> +			goto on_error;
>>> +	}
>>> +
>>> +	ret = 0;
>>> +
>>> +on_error:
>>> +	put_net(unique_net);
>>> +
>>> +	if (net && !IS_ERR(net))
>>> +		put_net(net);
>>
>> 1)When we have tb[IFLA_NET_NS_PID and tb[IFLA_NET_NS_FD] both set and pointing
>> to the same net, this function increments net::count in get_net_ns_by_pid() and
>> in get_net_ns_by_fd(), i.e. twice. But only single put_net(net) will be called.
>> So, after this function net::count will be incremented by 1, and it never will
>> die.
> 
> Thanks for spotting this, Kirill.
> 
>>
>> 2)The whole approach does not seem good for me. The first reason is it's racy.
>> Even if rtnl_ensure_unique_netns() returns 0, this does not guarantees that
>> tb[IFLA_IF_NETNSID] and tb[IFLA_NET_NS_PID] will be point the same net later,
>> as the pid may die or do setns(). Racy check is worse than no check at all.
>>
>> The second reason is after this patch get_net_ns_by_id/get_net_ns_by_pid()/
>> get_net_ns_by_fd() will be called twice: the first time is in your check
>> and the second time is where they are actually used. This is not good for
>> performance.
> 
> If this is really a performance problem we can simply fix this by
> performing the check when the target network namespace is retrieved in
> each request. The intention for doing it in one function at the
> beginning of each request was to make it generic and easily
> understandable.

I haven't measured the performance with stopwatch, of course, but this is
additional operations, and we should not use them unless they are really need.
The approach with get_net()/put_net() is racy and it does not solve the
problem. So it does not seem good for me despite it is generic.

>>
>> What is the problem people pass several different tb[xxx] in one call? We
>> may just describe the order of tb[xxx] in man page and their priorities,
>> and ignore the rest after the first not zero tb[xxx] is found, and do that
>> in the place, where net from tb[xxx] in actually used. This is the thing
>> we already do.
>>
>> Comparing to classic Linux interface such as syscalls, it's usual behavior
>> for them to ignore one argument, when another is set. Nobody confuses.
> 
> From what I gather from recent discussions I had here using pids and
> fds to perform operations on network namespaces in netlink requests is
> not the future. Specifically, using pids and fds will not be extended to
> existing or future requests that do not already support it.
> 
> It also very much smells like a security liability if what you've
> outlined above is true: a user sends a request with a pid and the task
> dies and the pid gets recycled. Now, we can't easily fix this by simply
> ignoring pids and fds from here on since this would likely break a bunch
> of userspace programs but we can ensure that if a network namespace
> identifier is passed that no other way of retrieving the target network
> namespace is passed. Especially with requests that already support pids
> and fds. It's either that or reversing the order meaning that if a
> network namespace identifier is passed then it should take precedence
> over the other identifiers. Furthermore, this would also clearly
> indicate that netns ids are the preferred way to perform operations on
> network namespaces via netlink requests.

If we really need this, can't we simply zero excess identifiers instead?

void rtnl_kill_them_all(struct nlattr *tb[])
{
	if (!tb[IFLA_IF_NETNSID])
		return;
	tb[IFLA_NET_NS_PID] = tb[IFLA_NET_NS_FD] = NULL;
}

It's not racy and solves the problem you are solving.

> I also do not think that your suggestion of making guarantees in what
> order additional netlink properties are evaluated is a good one. I don't
> think we want to give userspace the impression that sticking a pid, fd,
> and netnsid into the same netlink request is something that we actively
> support.
> 
> What is certainly a good point is that if pids and fds are as you said
> inherently racy then we shouldn't perform the check but do what my
> original patch did and simply refuse to combine netns ids with pids
> and/or fds.

Thanks,
Kirill

  reply	other threads:[~2018-02-06 10:49 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-05 15:55 [PATCH net 0/1 v2] rtnetlink: require unique netns identifier Christian Brauner
2018-02-05 15:55 ` [PATCH net 1/1 " Christian Brauner
2018-02-05 16:28   ` David Ahern
2018-02-05 21:47   ` Kirill Tkhai
2018-02-05 23:24     ` Christian Brauner
2018-02-06 10:49       ` Kirill Tkhai [this message]
2018-02-06 12:18         ` Christian Brauner
2018-02-06 22:31       ` Eric W. Biederman
2018-02-07 11:13         ` Jiri Benc
2018-02-07  4:54   ` kbuild test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2ca51c73-71c3-7d4c-d6fb-a550038518bc@virtuozzo.com \
    --to=ktkhai@virtuozzo.com \
    --cc=christian.brauner@canonical.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=jbenc@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nicolas.dichtel@6wind.com \
    --cc=stephen@networkplumber.org \
    --cc=w.bumiller@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).