public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] net: dev: fix the incorrect hold of net namespace's lo device
@ 2012-08-22  8:31 Gao feng
  2012-08-22  8:39 ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Gao feng @ 2012-08-22  8:31 UTC (permalink / raw)
  To: ebiederm; +Cc: davem, eric.dumazet, netdev, Gao feng

in dst_dev_event,we get the still referenced dst entries
from dst_garbage list,and call dst_ifdown to change these
dst entries' device to the net namesapce's lo device.

when we moving a net device(A) to another net namespace,
because free_fib_info_rcu is called after a grace period,
we may call dst_dev_event before free_fib_info_rcu putting
dst_entry into the dst_garbage list.

so in dst_dev_event, we can't see these dst entries through
dst_garbage list, and without changing their device to the
old net namespace's lo device. after a grace period, these
dst entries which dst->dev is device A will in the dst_garbage
list, and the device A will belong to the new net namespcae.

then we exit from this new net namespace, the dst_dev_event
is called again,it will get these dst entries from dst_garbage
list,and call dst_ifdown to hold the new net namespace's lo
device incorrectly and put the device A.

so it will tigger the emg message in netdev_wait_allrefs like
below.
unregister_netdevice: waiting for lo to become free. Usage count = 1

fix this problem by adding rcu_barrier() in dst_dev_event
when event is NETDEV_UNREGISTER.
with this,dst_ifdown will be called after the dst_garbage list
beeing updated.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/core/dst.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/dst.c b/net/core/dst.c
index 56d6361..38c2199 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -375,6 +375,7 @@ static int dst_dev_event(struct notifier_block *this, unsigned long event,
 
 	switch (event) {
 	case NETDEV_UNREGISTER:
+		rcu_barrier();
 	case NETDEV_DOWN:
 		mutex_lock(&dst_gc_mutex);
 		for (dst = dst_busy_list; dst; dst = dst->next) {
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] net: dev: fix the incorrect hold of net namespace's lo device
  2012-08-22  8:31 [PATCH] net: dev: fix the incorrect hold of net namespace's lo device Gao feng
@ 2012-08-22  8:39 ` Eric Dumazet
  2012-08-22 11:00   ` Gao feng
  2012-08-23  3:13   ` Eric Dumazet
  0 siblings, 2 replies; 6+ messages in thread
From: Eric Dumazet @ 2012-08-22  8:39 UTC (permalink / raw)
  To: Gao feng; +Cc: ebiederm, davem, netdev

On Wed, 2012-08-22 at 16:31 +0800, Gao feng wrote:
> in dst_dev_event,we get the still referenced dst entries
> from dst_garbage list,and call dst_ifdown to change these
> dst entries' device to the net namesapce's lo device.
> 
> when we moving a net device(A) to another net namespace,
> because free_fib_info_rcu is called after a grace period,
> we may call dst_dev_event before free_fib_info_rcu putting
> dst_entry into the dst_garbage list.
> 
> so in dst_dev_event, we can't see these dst entries through
> dst_garbage list, and without changing their device to the
> old net namespace's lo device. after a grace period, these
> dst entries which dst->dev is device A will in the dst_garbage
> list, and the device A will belong to the new net namespcae.
> 
> then we exit from this new net namespace, the dst_dev_event
> is called again,it will get these dst entries from dst_garbage
> list,and call dst_ifdown to hold the new net namespace's lo
> device incorrectly and put the device A.
> 
> so it will tigger the emg message in netdev_wait_allrefs like
> below.
> unregister_netdevice: waiting for lo to become free. Usage count = 1
> 
> fix this problem by adding rcu_barrier() in dst_dev_event
> when event is NETDEV_UNREGISTER.
> with this,dst_ifdown will be called after the dst_garbage list
> beeing updated.
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> ---
>  net/core/dst.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/net/core/dst.c b/net/core/dst.c
> index 56d6361..38c2199 100644
> --- a/net/core/dst.c
> +++ b/net/core/dst.c
> @@ -375,6 +375,7 @@ static int dst_dev_event(struct notifier_block *this, unsigned long event,
>  
>  	switch (event) {
>  	case NETDEV_UNREGISTER:
> +		rcu_barrier();
>  	case NETDEV_DOWN:
>  		mutex_lock(&dst_gc_mutex);
>  		for (dst = dst_busy_list; dst; dst = dst->next) {


Did you miss http://patchwork.ozlabs.org/patch/176517/  or is this patch
an alternative ?

rcu_barrier() at this place will kill some workloads.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] net: dev: fix the incorrect hold of net namespace's lo device
  2012-08-22  8:39 ` Eric Dumazet
@ 2012-08-22 11:00   ` Gao feng
  2012-08-22 11:24     ` Eric Dumazet
  2012-08-23  3:13   ` Eric Dumazet
  1 sibling, 1 reply; 6+ messages in thread
From: Gao feng @ 2012-08-22 11:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: ebiederm, davem, netdev

On Wed, 2012-08-22 at 16:39 +0800, Eric Dumazet wrote:
> On Wed, 2012-08-22 at 16:31 +0800, Gao feng wrote:
>> in dst_dev_event,we get the still referenced dst entries
>> from dst_garbage list,and call dst_ifdown to change these
>> dst entries' device to the net namesapce's lo device.
>>
>> when we moving a net device(A) to another net namespace,
>> because free_fib_info_rcu is called after a grace period,
>> we may call dst_dev_event before free_fib_info_rcu putting
>> dst_entry into the dst_garbage list.
>>
>> so in dst_dev_event, we can't see these dst entries through
>> dst_garbage list, and without changing their device to the
>> old net namespace's lo device. after a grace period, these
>> dst entries which dst->dev is device A will in the dst_garbage
>> list, and the device A will belong to the new net namespcae.
>>
>> then we exit from this new net namespace, the dst_dev_event
>> is called again,it will get these dst entries from dst_garbage
>> list,and call dst_ifdown to hold the new net namespace's lo
>> device incorrectly and put the device A.
>>
>> so it will tigger the emg message in netdev_wait_allrefs like
>> below.
>> unregister_netdevice: waiting for lo to become free. Usage count = 1
>>
>> fix this problem by adding rcu_barrier() in dst_dev_event
>> when event is NETDEV_UNREGISTER.
>> with this,dst_ifdown will be called after the dst_garbage list
>> beeing updated.
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  net/core/dst.c |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/core/dst.c b/net/core/dst.c
>> index 56d6361..38c2199 100644
>> --- a/net/core/dst.c
>> +++ b/net/core/dst.c
>> @@ -375,6 +375,7 @@ static int dst_dev_event(struct notifier_block *this, unsigned long event,
>>  
>>  	switch (event) {
>>  	case NETDEV_UNREGISTER:
>> +		rcu_barrier();
>>  	case NETDEV_DOWN:
>>  		mutex_lock(&dst_gc_mutex);
>>  		for (dst = dst_busy_list; dst; dst = dst->next) {
> 
> 
> Did you miss http://patchwork.ozlabs.org/patch/176517/  or is this patch
> an alternative ?
> 

Hi Eric

I saw your patch and think this patch is clear and doesn't change too much logic.

I test your patch, it not fix this problem.

In my test case,when moving a net device to another net namespace,
Because you patch delete NETDEV_UNREGISTER event from dst_dev_event,
we will just put dst entries into the dst garbage list in event
NETDEV_DOWN,without call dst_ifdown to change these dst entries' device
to the lo device,and now this net device belongs to the new net namespace.

After the net device beeing moved to another net namespace, I rmmod this
net device's driver,this will trigger the new added event NETDEV_UNREGISTER_FINISH,
so in dst_dev_event,we will change these dst entries's device to the new net
namespace's lo device,and this will make the referenct count of the new net namespace's
lo device incorrect. when we exit the new net namespace,this emg message is still exist.

Message from syslogd@Donkey at Aug 22 18:50:13 ...
 kernel:[ 1161.979036] unregister_netdevice: waiting for lo to become free. Usage count = 1

And because net_mutex is locked here,so we can't create new net namespace.

> rcu_barrier() at this place will kill some workloads.
> 

I think this will only add some workloads when unregister a net device.
Do I miss something?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] net: dev: fix the incorrect hold of net namespace's lo device
  2012-08-22 11:00   ` Gao feng
@ 2012-08-22 11:24     ` Eric Dumazet
  2012-08-23  3:09       ` Gao feng
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2012-08-22 11:24 UTC (permalink / raw)
  To: Gao feng; +Cc: ebiederm, davem, netdev

On Wed, 2012-08-22 at 19:00 +0800, Gao feng wrote:

> Hi Eric
> 
> I saw your patch and think this patch is clear and doesn't change too much logic.
> 
> I test your patch, it not fix this problem.
> 
> In my test case,when moving a net device to another net namespace,
> Because you patch delete NETDEV_UNREGISTER event from dst_dev_event,
> we will just put dst entries into the dst garbage list in event
> NETDEV_DOWN,without call dst_ifdown to change these dst entries' device
> to the lo device,and now this net device belongs to the new net namespace.
> 

Then fix the "moving a net device to another net namespace", instead
of slowing down other common operations.

dev_change_net_namespace() is probably a better place to put your patch

> After the net device beeing moved to another net namespace, I rmmod this
> net device's driver,this will trigger the new added event NETDEV_UNREGISTER_FINISH,
> so in dst_dev_event,we will change these dst entries's device to the new net
> namespace's lo device,and this will make the referenct count of the new net namespace's
> lo device incorrect. when we exit the new net namespace,this emg message is still exist.
> 
> Message from syslogd@Donkey at Aug 22 18:50:13 ...
>  kernel:[ 1161.979036] unregister_netdevice: waiting for lo to become free. Usage count = 1
> 
> And because net_mutex is locked here,so we can't create new net namespace.
> 
> > rcu_barrier() at this place will kill some workloads.
> > 
> 
> I think this will only add some workloads when unregister a net device.
> Do I miss something?

Yes, rcu_barrier() at this point is killing performance, because we hold
RTNL.

We worked hard to batch things, your patch is a huge step backward.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] net: dev: fix the incorrect hold of net namespace's lo device
  2012-08-22 11:24     ` Eric Dumazet
@ 2012-08-23  3:09       ` Gao feng
  0 siblings, 0 replies; 6+ messages in thread
From: Gao feng @ 2012-08-23  3:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: ebiederm, davem, netdev

于 2012年08月22日 19:24, Eric Dumazet 写道:
> On Wed, 2012-08-22 at 19:00 +0800, Gao feng wrote:
> 
>> Hi Eric
>>
>> I saw your patch and think this patch is clear and doesn't change too much logic.
>>
>> I test your patch, it not fix this problem.
>>
>> In my test case,when moving a net device to another net namespace,
>> Because you patch delete NETDEV_UNREGISTER event from dst_dev_event,
>> we will just put dst entries into the dst garbage list in event
>> NETDEV_DOWN,without call dst_ifdown to change these dst entries' device
>> to the lo device,and now this net device belongs to the new net namespace.
>>
> 
> Then fix the "moving a net device to another net namespace", instead
> of slowing down other common operations.
> 

okay, I will send a patch to fix this problem after your patch beeing applied.

> dev_change_net_namespace() is probably a better place to put your patch
> 
>> After the net device beeing moved to another net namespace, I rmmod this
>> net device's driver,this will trigger the new added event NETDEV_UNREGISTER_FINISH,
>> so in dst_dev_event,we will change these dst entries's device to the new net
>> namespace's lo device,and this will make the referenct count of the new net namespace's
>> lo device incorrect. when we exit the new net namespace,this emg message is still exist.
>>
>> Message from syslogd@Donkey at Aug 22 18:50:13 ...
>>  kernel:[ 1161.979036] unregister_netdevice: waiting for lo to become free. Usage count = 1
>>
>> And because net_mutex is locked here,so we can't create new net namespace.
>>
>>> rcu_barrier() at this place will kill some workloads.
>>>
>>
>> I think this will only add some workloads when unregister a net device.
>> Do I miss something?
> 
> Yes, rcu_barrier() at this point is killing performance, because we hold
> RTNL.
> 
> We worked hard to batch things, your patch is a huge step backward.
> 

Get it,thanks for your explanation.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] net: dev: fix the incorrect hold of net namespace's lo device
  2012-08-22  8:39 ` Eric Dumazet
  2012-08-22 11:00   ` Gao feng
@ 2012-08-23  3:13   ` Eric Dumazet
  1 sibling, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2012-08-23  3:13 UTC (permalink / raw)
  To: Gao feng; +Cc: ebiederm, davem, netdev

On Wed, 2012-08-22 at 10:39 +0200, Eric Dumazet wrote:

> rcu_barrier() at this place will kill some workloads.

For example, I tested the following one :

modprobe dummy numdummies=1000
time rmmod dummy

-> 16.5 seconds instead of 0.4 second

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-08-23  3:13 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-22  8:31 [PATCH] net: dev: fix the incorrect hold of net namespace's lo device Gao feng
2012-08-22  8:39 ` Eric Dumazet
2012-08-22 11:00   ` Gao feng
2012-08-22 11:24     ` Eric Dumazet
2012-08-23  3:09       ` Gao feng
2012-08-23  3:13   ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox