From mboxrd@z Thu Jan 1 00:00:00 1970 From: ebiederm@xmission.com (Eric W. Biederman) Subject: Re: [PATCH 1/3] net: Fix possible race in peernet2id_alloc() Date: Thu, 21 Dec 2017 11:39:44 -0600 Message-ID: <87ind0ou8v.fsf@xmission.com> References: <151386201910.3724.7199367937841370542.stgit@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain Cc: netdev@vger.kernel.org, davem@davemloft.net, eric.dumazet@gmail.com To: Kirill Tkhai Return-path: Received: from out02.mta.xmission.com ([166.70.13.232]:59371 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752483AbdLURkM (ORCPT ); Thu, 21 Dec 2017 12:40:12 -0500 In-Reply-To: <151386201910.3724.7199367937841370542.stgit@localhost.localdomain> (Kirill Tkhai's message of "Thu, 21 Dec 2017 16:13:52 +0300") Sender: netdev-owner@vger.kernel.org List-ID: Kirill Tkhai writes: > peernet2id_alloc() is racy without rtnl_lock() as atomic_read(&peer->count) > under net->nsid_lock does not guarantee, peer is alive: > > rcu_read_lock() > peernet2id_alloc() .. > spin_lock_bh(&net->nsid_lock) .. > atomic_read(&peer->count) == 1 .. > .. put_net() > .. cleanup_net() > .. for_each_net(tmp) > .. spin_lock_bh(&tmp->nsid_lock) > .. __peernet2id(tmp, net) == -1 > .. .. > .. .. > __peernet2id_alloc(alloc == true) .. > .. .. > rcu_read_unlock() .. > .. synchronize_rcu() > .. kmem_cache_free(net) > > After the above situation, net::netns_id contains id pointing to freed memory, > and any other dereferencing by the id will operate with this freed memory. > > Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except > ovs_vport_cmd_fill_info(), and this race can't occur. But peernet2id_alloc() > is generic interface, and better we fix it before someone really starts > use it in wrong context. So it comes down to this piece of code from ovs and just let me say ick. if (!net_eq(net, dev_net(vport->dev))) { int id = peernet2id_alloc(net, dev_net(vport->dev)); if (nla_put_s32(skb, OVS_VPORT_ATTR_NETNSID, id)) goto nla_put_failure; } Without the rtnl lock dev_net can cange between the test and the call of peernet2id_alloc. At first glance it looks like the bug is that we are running a control path of the networking stack without the rtnl lock. So it may be that ASSERT_RTNL() is the better fix. Given that it would be nice to reduce the scope of the rtnl lock this might not be a bad direction. Let me see. Is rtnl_notify safe without the rtnl lock? > > Signed-off-by: Kirill Tkhai > --- > net/core/net_namespace.c | 23 +++++++++++++++++++---- > 1 file changed, 19 insertions(+), 4 deletions(-) > > diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c > index 60a71be75aea..6a4eab438221 100644 > --- a/net/core/net_namespace.c > +++ b/net/core/net_namespace.c > @@ -221,17 +221,32 @@ static void rtnl_net_notifyid(struct net *net, int cmd, int id); > */ > int peernet2id_alloc(struct net *net, struct net *peer) > { > - bool alloc; > + bool alloc = false, alive = false; > int id; ^^^ Perhaps we want "ASSERT_RTNL();" here? > > - if (atomic_read(&net->count) == 0) > - return NETNSA_NSID_NOT_ASSIGNED; Moving this hunk is of no benefit. The code must be called with a valid reference to net. Which means net->count is a fancy way of testing to see if the code is in cleanup_net. In all other cases net->count should be non-zero and it should remain that way because of our caller must keep a reference. > spin_lock_bh(&net->nsid_lock); > - alloc = atomic_read(&peer->count) == 0 ? false : true; > + /* Spinlock guarantees we never hash a peer to net->netns_ids > + * after idr_destroy(&net->netns_ids) occurs in cleanup_net(). > + */ > + if (atomic_read(&net->count) == 0) { > + id = NETNSA_NSID_NOT_ASSIGNED; > + goto unlock; > + } > + /* > + * When peer is obtained from RCU lists, we may race with > + * its cleanup. Check whether it's alive, and this guarantees > + * we never hash a peer back to net->netns_ids, after it has > + * just been idr_remove()'d from there in cleanup_net(). > + */ > + if (maybe_get_net(peer)) > + alive = alloc = true; Yes this does seem reasonable. The more obvious looking code which would return NETNSA_NSID_NOT_ASSIGNED if the peer has a count of 0, is silly as it makes would make it appear that a peer is momentary outside of a network namespace when the peer is in fact moving from one network namespace to another. > id = __peernet2id_alloc(net, peer, &alloc); > +unlock: > spin_unlock_bh(&net->nsid_lock); > if (alloc && id >= 0) > rtnl_net_notifyid(net, RTM_NEWNSID, id); ^^^^^^ Is this safe without the rtnl lock? > + if (alive) > + put_net(peer); > return id; > } > EXPORT_SYMBOL_GPL(peernet2id_alloc); Eric