netdev.vger.kernel.org archive mirror
* [Patch] bonding: fix potential deadlock in bond_uninit()
From: Amerigo Wang @ 2010-03-31 10:52 UTC
  To: linux-kernel
  Cc: Jiri Pirko, Stephen Hemminger, netdev, David S. Miller,
	Eric W. Biederman, Amerigo Wang, bonding-devel, Jay Vosburgh


bond_uninit() is invoked with rtnl_lock held. It calls destroy_workqueue(),
which may flush all work items in this workqueue. If a work function takes
rtnl_lock again, the flush will deadlock.

So unlock rtnl_lock before calling destroy_workqueue().

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>

---
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 5b92fbf..b781728 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4542,8 +4542,11 @@ static void bond_uninit(struct net_device *bond_dev)
 
 	bond_remove_proc_entry(bond);
 
-	if (bond->wq)
+	if (bond->wq) {
+		rtnl_unlock();
 		destroy_workqueue(bond->wq);
+		rtnl_lock();
+	}
 
 	netif_addr_lock_bh(bond_dev);
 	bond_mc_list_destroy(bond);


* Re: [Patch] bonding: fix potential deadlock in bond_uninit()
From: Eric W. Biederman @ 2010-03-31 11:28 UTC
  To: Amerigo Wang
  Cc: linux-kernel, Jiri Pirko, Stephen Hemminger, netdev,
	David S. Miller, bonding-devel, Jay Vosburgh

Amerigo Wang <amwang@redhat.com> writes:

> bond_uninit() is invoked with rtnl_lock held, when it does destroy_workqueue()
> which will potentially flush all works in this workqueue, if we hold rtnl_lock
> again in the work function, it will deadlock.
>
> So unlock rtnl_lock before calling destroy_workqueue().

Ouch.  That seems rather rude to our caller, and likely very
dangerous.

Is this a deadlock you actually hit, or is this something lockdep
warned about?

My gut feel says we need to move the destroy_workqueue into
the network device destructor.

Eric



> Signed-off-by: WANG Cong <amwang@redhat.com>
> Cc: Jay Vosburgh <fubar@us.ibm.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> Cc: Jiri Pirko <jpirko@redhat.com>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
>
> ---
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 5b92fbf..b781728 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -4542,8 +4542,11 @@ static void bond_uninit(struct net_device *bond_dev)
>  
>  	bond_remove_proc_entry(bond);
>  
> -	if (bond->wq)
> +	if (bond->wq) {
> +		rtnl_unlock();
>  		destroy_workqueue(bond->wq);
> +		rtnl_lock();
> +	}
>  
>  	netif_addr_lock_bh(bond_dev);
>  	bond_mc_list_destroy(bond);


* Re: [Patch] bonding: fix potential deadlock in bond_uninit()
From: Stephen Hemminger @ 2010-03-31 23:02 UTC
  To: Eric W. Biederman
  Cc: Amerigo Wang, linux-kernel, Jiri Pirko, netdev, David S. Miller,
	bonding-devel, Jay Vosburgh

On Wed, 31 Mar 2010 04:28:33 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Amerigo Wang <amwang@redhat.com> writes:
> 
> > bond_uninit() is invoked with rtnl_lock held, when it does destroy_workqueue()
> > which will potentially flush all works in this workqueue, if we hold rtnl_lock
> > again in the work function, it will deadlock.
> >
> > So unlock rtnl_lock before calling destroy_workqueue().
> 
> Ouch.  That seems rather rude to our caller, and likely very
> dangerous.
> 
> Is this a deadlock you actually hit, or is this something lockdep
> warned about?
> 
> My gut feel says we need to move the destroy_workqueue into
> the network device destructor.
> 
> Eric

Why is there one workqueue per bond device rather than just one workqueue for
all bonding devices controlled by the module instance? It would be cleaner
on removal and take less space and overhead.  I can't see arp/mii or alb
work being a highly parallel, high-load activity.


* Re: [Patch] bonding: fix potential deadlock in bond_uninit()
From: Cong Wang @ 2010-04-01  2:49 UTC
  To: Eric W. Biederman
  Cc: linux-kernel, Jiri Pirko, Stephen Hemminger, netdev,
	David S. Miller, bonding-devel, Jay Vosburgh

Eric W. Biederman wrote:
> Amerigo Wang <amwang@redhat.com> writes:
> 
>> bond_uninit() is invoked with rtnl_lock held, when it does destroy_workqueue()
>> which will potentially flush all works in this workqueue, if we hold rtnl_lock
>> again in the work function, it will deadlock.
>>
>> So unlock rtnl_lock before calling destroy_workqueue().
> 
> Ouch.  That seems rather rude to our caller, and likely very
> dangerous.


This is reasonable, because the workqueue flush functions may wait for
all pending work functions, any of which could take the same lock that was
taken before the flush call, and thus deadlock.

> 
> Is this a deadlock you actually hit, or is this something lockdep
> warned about?

It's only a lockdep warning.

> 
> My gut feel says we need to move the destroy_workqueue into
> the network device destructor.
> 

Oh, this seems a better idea, as long as the destructor is not called
with any locks held.

Thanks!
