netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/3] net: Fix possible race in peernet2id_alloc()
@ 2018-01-09 15:00 Kirill Tkhai
  2018-01-09 15:00 ` [PATCH v2 2/3] net: Add BUG_ON() to get_net() Kirill Tkhai
  2018-01-09 15:00 ` [PATCH v2 3/3] net: Remove spinlock from get_net_ns_by_id() Kirill Tkhai
  0 siblings, 2 replies; 7+ messages in thread
From: Kirill Tkhai @ 2018-01-09 15:00 UTC (permalink / raw)
  To: netdev, davem; +Cc: eric.dumazet, ktkhai, ebiederm

peernet2id_alloc() is racy without rtnl_lock() as atomic_read(&peer->count)
under net->nsid_lock does not guarantee, peer is alive:

rcu_read_lock()
peernet2id_alloc()                            ..
  spin_lock_bh(&net->nsid_lock)               ..
  atomic_read(&peer->count) == 1              ..
  ..                                          put_net()
  ..                                            cleanup_net()
  ..                                              for_each_net(tmp)
  ..                                                spin_lock_bh(&tmp->nsid_lock)
  ..                                                __peernet2id(tmp, net) == -1
  ..                                                    ..
  ..                                                    ..
    __peernet2id_alloc(alloc == true)                   ..
  ..                                                    ..
rcu_read_unlock()                                       ..
..                                                synchronize_rcu()
..                                                kmem_cache_free(net)

After the above situation, net::netns_id contains id pointing to freed memory,
and any other dereferencing by the id will operate with this freed memory.

Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except
ovs_vport_cmd_fill_info(), and this race can't occur. But peernet2id_alloc()
is generic interface, and better we fix it before someone really starts
use it in wrong context.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 60a71be75aea..497135f314ce 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -221,17 +221,26 @@ static void rtnl_net_notifyid(struct net *net, int cmd, int id);
  */
 int peernet2id_alloc(struct net *net, struct net *peer)
 {
-	bool alloc;
+	bool alloc = false, alive = false;
 	int id;
 
 	if (atomic_read(&net->count) == 0)
 		return NETNSA_NSID_NOT_ASSIGNED;
 	spin_lock_bh(&net->nsid_lock);
-	alloc = atomic_read(&peer->count) == 0 ? false : true;
+	/*
+	 * When peer is obtained from RCU lists, we may race with
+	 * its cleanup. Check whether it's alive, and this guarantees
+	 * we never hash a peer back to net->netns_ids, after it has
+	 * just been idr_remove()'d from there in cleanup_net().
+	 */
+	if (maybe_get_net(peer))
+		alive = alloc = true;
 	id = __peernet2id_alloc(net, peer, &alloc);
 	spin_unlock_bh(&net->nsid_lock);
 	if (alloc && id >= 0)
 		rtnl_net_notifyid(net, RTM_NEWNSID, id);
+	if (alive)
+		put_net(peer);
 	return id;
 }
 EXPORT_SYMBOL_GPL(peernet2id_alloc);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/3] net: Add BUG_ON() to get_net()
  2018-01-09 15:00 [PATCH v2 1/3] net: Fix possible race in peernet2id_alloc() Kirill Tkhai
@ 2018-01-09 15:00 ` Kirill Tkhai
  2018-01-09 18:52   ` Eric Dumazet
  2018-01-09 15:00 ` [PATCH v2 3/3] net: Remove spinlock from get_net_ns_by_id() Kirill Tkhai
  1 sibling, 1 reply; 7+ messages in thread
From: Kirill Tkhai @ 2018-01-09 15:00 UTC (permalink / raw)
  To: netdev, davem; +Cc: eric.dumazet, ktkhai, ebiederm

Since people may mistakenly obtain destroying net
from net_namespace_list and from net::netns_ids
without checking for its net::counter, let's protect
against such situations and insert BUG_ON() to stop
move on after this.

Panic is better, than memory corruption and undefined
behavior.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/net/net_namespace.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 10f99dafd5ac..ff0e47471d5b 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -195,7 +195,7 @@ void __put_net(struct net *net);
 
 static inline struct net *get_net(struct net *net)
 {
-	atomic_inc(&net->count);
+	BUG_ON(atomic_inc_return(&net->count) <= 1);
 	return net;
 }
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 3/3] net: Remove spinlock from get_net_ns_by_id()
  2018-01-09 15:00 [PATCH v2 1/3] net: Fix possible race in peernet2id_alloc() Kirill Tkhai
  2018-01-09 15:00 ` [PATCH v2 2/3] net: Add BUG_ON() to get_net() Kirill Tkhai
@ 2018-01-09 15:00 ` Kirill Tkhai
  1 sibling, 0 replies; 7+ messages in thread
From: Kirill Tkhai @ 2018-01-09 15:00 UTC (permalink / raw)
  To: netdev, davem; +Cc: eric.dumazet, ktkhai, ebiederm

idr_find() is safe under rcu_read_lock() and
maybe_get_net() guarantees that net is alive.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 497135f314ce..5f9e18b31a1a 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -273,11 +273,9 @@ struct net *get_net_ns_by_id(struct net *net, int id)
 		return NULL;
 
 	rcu_read_lock();
-	spin_lock_bh(&net->nsid_lock);
 	peer = idr_find(&net->netns_ids, id);
 	if (peer)
 		peer = maybe_get_net(peer);
-	spin_unlock_bh(&net->nsid_lock);
 	rcu_read_unlock();
 
 	return peer;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/3] net: Add BUG_ON() to get_net()
  2018-01-09 15:00 ` [PATCH v2 2/3] net: Add BUG_ON() to get_net() Kirill Tkhai
@ 2018-01-09 18:52   ` Eric Dumazet
  2018-01-10  7:37     ` Kirill Tkhai
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2018-01-09 18:52 UTC (permalink / raw)
  To: Kirill Tkhai, netdev, davem; +Cc: ebiederm

On Tue, 2018-01-09 at 18:00 +0300, Kirill Tkhai wrote:
> Since people may mistakenly obtain destroying net
> from net_namespace_list and from net::netns_ids
> without checking for its net::counter, let's protect
> against such situations and insert BUG_ON() to stop
> move on after this.
> 
> Panic is better, than memory corruption and undefined
> behavior.
> 
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  include/net/net_namespace.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index 10f99dafd5ac..ff0e47471d5b 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -195,7 +195,7 @@ void __put_net(struct net *net);
>  
>  static inline struct net *get_net(struct net *net)
>  {
> -	atomic_inc(&net->count);
> +	BUG_ON(atomic_inc_return(&net->count) <= 1);
>  	return net;
>  }


Why not simply use refcount_t instead of duplicating its logic ?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/3] net: Add BUG_ON() to get_net()
  2018-01-09 18:52   ` Eric Dumazet
@ 2018-01-10  7:37     ` Kirill Tkhai
  2018-01-10  9:58       ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Kirill Tkhai @ 2018-01-10  7:37 UTC (permalink / raw)
  To: Eric Dumazet, netdev, davem; +Cc: ebiederm

On 09.01.2018 21:52, Eric Dumazet wrote:
> On Tue, 2018-01-09 at 18:00 +0300, Kirill Tkhai wrote:
>> Since people may mistakenly obtain destroying net
>> from net_namespace_list and from net::netns_ids
>> without checking for its net::counter, let's protect
>> against such situations and insert BUG_ON() to stop
>> move on after this.
>>
>> Panic is better, than memory corruption and undefined
>> behavior.
>>
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> ---
>>  include/net/net_namespace.h |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
>> index 10f99dafd5ac..ff0e47471d5b 100644
>> --- a/include/net/net_namespace.h
>> +++ b/include/net/net_namespace.h
>> @@ -195,7 +195,7 @@ void __put_net(struct net *net);
>>  
>>  static inline struct net *get_net(struct net *net)
>>  {
>> -	atomic_inc(&net->count);
>> +	BUG_ON(atomic_inc_return(&net->count) <= 1);
>>  	return net;
>>  }
> 
> 
> Why not simply use refcount_t instead of duplicating its logic?

The main goal of the change is to catch rare races happening on production nodes
with real load and to prevent memory corruption. You can't simply use refcount_t
primitives, as there is no appropriate primitive with BUG_ON() among them. WARN_ON()
from the primitives doesn't protect from memory corruption.

Also, keep in mind, that CONFIG_REFCOUNT_FULL is usually disabled on no-debug kernel.
I've checked both Fedora and Debian. So, the only possibility to catch such the races,
if someone really happy meets them on test kernel and test workload, which is usually
is very unlikely.

Kirill

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/3] net: Add BUG_ON() to get_net()
  2018-01-10  7:37     ` Kirill Tkhai
@ 2018-01-10  9:58       ` Eric Dumazet
  2018-01-10 10:48         ` Kirill Tkhai
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2018-01-10  9:58 UTC (permalink / raw)
  To: Kirill Tkhai, netdev, davem; +Cc: ebiederm

On Wed, 2018-01-10 at 10:37 +0300, Kirill Tkhai wrote:
> On 09.01.2018 21:52, Eric Dumazet wrote:
> > On Tue, 2018-01-09 at 18:00 +0300, Kirill Tkhai wrote:
> > > Since people may mistakenly obtain destroying net
> > > from net_namespace_list and from net::netns_ids
> > > without checking for its net::counter, let's protect
> > > against such situations and insert BUG_ON() to stop
> > > move on after this.
> > > 
> > > Panic is better, than memory corruption and undefined
> > > behavior.
> > > 
> > > Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> > > ---
> > >  include/net/net_namespace.h |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> > > index 10f99dafd5ac..ff0e47471d5b 100644
> > > --- a/include/net/net_namespace.h
> > > +++ b/include/net/net_namespace.h
> > > @@ -195,7 +195,7 @@ void __put_net(struct net *net);
> > >  
> > >  static inline struct net *get_net(struct net *net)
> > >  {
> > > -	atomic_inc(&net->count);
> > > +	BUG_ON(atomic_inc_return(&net->count) <= 1);
> > >  	return net;
> > >  }
> > 
> > 
> > Why not simply use refcount_t instead of duplicating its logic?
> 
> The main goal of the change is to catch rare races happening on production nodes
> with real load and to prevent memory corruption. You can't simply use refcount_t
> primitives, as there is no appropriate primitive with BUG_ON() among them. WARN_ON()
> from the primitives doesn't protect from memory corruption.
> 
> Also, keep in mind, that CONFIG_REFCOUNT_FULL is usually disabled on no-debug kernel.
> I've checked both Fedora and Debian. So, the only possibility to catch such the races,
> if someone really happy meets them on test kernel and test workload, which is usually
> is very unlikely.
> 

Keep in mind that most of these bugs are found by syzkaller or other
fuzzer bots, that have CONFIG_REFCOUNT_FULL enabled.

Do not rely on production workload to find such bug for you, coverage
is very very low.

Resistance is futile, because this refcount will eventually be
converted some day.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/3] net: Add BUG_ON() to get_net()
  2018-01-10  9:58       ` Eric Dumazet
@ 2018-01-10 10:48         ` Kirill Tkhai
  0 siblings, 0 replies; 7+ messages in thread
From: Kirill Tkhai @ 2018-01-10 10:48 UTC (permalink / raw)
  To: Eric Dumazet, netdev, davem; +Cc: ebiederm

On 10.01.2018 12:58, Eric Dumazet wrote:
> On Wed, 2018-01-10 at 10:37 +0300, Kirill Tkhai wrote:
>> On 09.01.2018 21:52, Eric Dumazet wrote:
>>> On Tue, 2018-01-09 at 18:00 +0300, Kirill Tkhai wrote:
>>>> Since people may mistakenly obtain destroying net
>>>> from net_namespace_list and from net::netns_ids
>>>> without checking for its net::counter, let's protect
>>>> against such situations and insert BUG_ON() to stop
>>>> move on after this.
>>>>
>>>> Panic is better, than memory corruption and undefined
>>>> behavior.
>>>>
>>>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>>>> ---
>>>>  include/net/net_namespace.h |    2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
>>>> index 10f99dafd5ac..ff0e47471d5b 100644
>>>> --- a/include/net/net_namespace.h
>>>> +++ b/include/net/net_namespace.h
>>>> @@ -195,7 +195,7 @@ void __put_net(struct net *net);
>>>>  
>>>>  static inline struct net *get_net(struct net *net)
>>>>  {
>>>> -	atomic_inc(&net->count);
>>>> +	BUG_ON(atomic_inc_return(&net->count) <= 1);
>>>>  	return net;
>>>>  }
>>>
>>>
>>> Why not simply use refcount_t instead of duplicating its logic?
>>
>> The main goal of the change is to catch rare races happening on production nodes
>> with real load and to prevent memory corruption. You can't simply use refcount_t
>> primitives, as there is no appropriate primitive with BUG_ON() among them. WARN_ON()
>> from the primitives doesn't protect from memory corruption.
>>
>> Also, keep in mind, that CONFIG_REFCOUNT_FULL is usually disabled on no-debug kernel.
>> I've checked both Fedora and Debian. So, the only possibility to catch such the races,
>> if someone really happy meets them on test kernel and test workload, which is usually
>> is very unlikely.
>>
> 
> Keep in mind that most of these bugs are found by syzkaller or other
> fuzzer bots, that have CONFIG_REFCOUNT_FULL enabled.
> 
> Do not rely on production workload to find such bug for you, coverage
> is very very low.
> 
> Resistance is futile, because this refcount will eventually be
> converted some day.

OK, I'll do v3.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-01-10 10:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-09 15:00 [PATCH v2 1/3] net: Fix possible race in peernet2id_alloc() Kirill Tkhai
2018-01-09 15:00 ` [PATCH v2 2/3] net: Add BUG_ON() to get_net() Kirill Tkhai
2018-01-09 18:52   ` Eric Dumazet
2018-01-10  7:37     ` Kirill Tkhai
2018-01-10  9:58       ` Eric Dumazet
2018-01-10 10:48         ` Kirill Tkhai
2018-01-09 15:00 ` [PATCH v2 3/3] net: Remove spinlock from get_net_ns_by_id() Kirill Tkhai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).