From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jay Vosburgh Subject: Re: [PATCH] bonding: prevent deadlock on slave store with alb mode (v2) Date: Wed, 25 May 2011 10:32:49 -0700 Message-ID: <10367.1306344769@death> References: <1306265765-8257-1-git-send-email-nhorman@tuxdriver.com> <1306343805-3223-1-git-send-email-nhorman@tuxdriver.com> Cc: netdev@vger.kernel.org, Andy Gospodarek , nicolas.2p.debian@gmail.com, "David S. Miller" To: Neil Horman Return-path: Received: from e7.ny.us.ibm.com ([32.97.182.137]:34212 "EHLO e7.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752441Ab1EYRdP (ORCPT ); Wed, 25 May 2011 13:33:15 -0400 Received: from d01relay06.pok.ibm.com (d01relay06.pok.ibm.com [9.56.227.116]) by e7.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id p4PH9eMk030994 for ; Wed, 25 May 2011 13:09:40 -0400 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d01relay06.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p4PHX1t1929868 for ; Wed, 25 May 2011 13:33:02 -0400 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p4PBWw0U025796 for ; Wed, 25 May 2011 05:32:59 -0600 In-reply-to: <1306343805-3223-1-git-send-email-nhorman@tuxdriver.com> Sender: netdev-owner@vger.kernel.org List-ID: Neil Horman wrote: >This soft lockup was recently reported: > >[root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters >[root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves >bonding: bond5: doing slave updates when interface is down. >bonding bond5: master_dev is not up in bond_enslave >[root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves >bonding: bond5: doing slave updates when interface is down. > >BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444] >CPU 12: >Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc >be2d >Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1 >RIP: 0010:[] [] >.text.lock.spinlock+0x26/00 >RSP: 0018:ffff810113167da8 EFLAGS: 00000286 >RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025 >RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8 >RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c >R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000 >R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282 >FS: 00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000 >CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0 > >Call Trace: > [] _spin_lock_bh+0x9/0x14 > [] :bonding:tlb_clear_slave+0x22/0xa1 > [] :bonding:bond_alb_deinit_slave+0xba/0xf0 > [] :bonding:bond_release+0x1b4/0x450 > [] __down_write_nested+0x12/0x92 > [] :bonding:bonding_store_slaves+0x25c/0x2f7 > [] sysfs_write_file+0xb9/0xe8 > [] vfs_write+0xce/0x174 > [] sys_write+0x45/0x6e > [] tracesys+0xd5/0xe0 > >It occurs because we are able to change the slave configuarion of a bond while >the bond interface is down. The bonding driver initializes some data structures >only after its ndo_open routine is called. Among them is the initalization of >the alb tx and rx hash locks. So if we add or remove a slave without first >opening the bond master device, we run the risk of trying to lock/unlock a >spinlock that has garbage for data in it, which results in our above softlock. > >Note that sometimes this works, because in many cases an unlocked spinlock has >the raw_lock parameter initialized to zero (meaning that the kzalloc of the >net_device private data is equivalent to calling spin_lock_init), but thats not >true in all cases, and we aren't guaranteed that condition, so we need to pass >the relevant spinlocks through the spin_lock_init function. > >Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to >the ndo_init path, so they are ready for use by the bond_store_slaves path. > >Change notes: >v2) Based on conversation with Jay and Nicolas it seems that the ability to >enslave devices while the bond master is down should be safe to do. As such >this is an outlier bug, and so instead we'll just initalize the errant spinlocks >in the init path rather than the open path, solving the problem. We'll also >remove the warnings about the bond being down during enslave operations, since >it should be safe One spelling nit, below; fix that and I'm good with this. Signed-off-by: Jay Vosburgh -J >Signed-off-by: Neil Horman >Reported-by: jtluka@redhat.com >CC: Jay Vosburgh >CC: Andy Gospodarek >CC: nicolas.2p.debian@gmail.com >CC: "David S. Miller" >--- > drivers/net/bonding/bond_alb.c | 4 ---- > drivers/net/bonding/bond_main.c | 16 ++++++++++------ > drivers/net/bonding/bond_sysfs.c | 6 ------ > 3 files changed, 10 insertions(+), 16 deletions(-) > >diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c >index 8f2d2e7..2df9276 100644 >--- a/drivers/net/bonding/bond_alb.c >+++ b/drivers/net/bonding/bond_alb.c >@@ -163,8 +163,6 @@ static int tlb_initialize(struct bonding *bond) > struct tlb_client_info *new_hashtbl; > int i; > >- spin_lock_init(&(bond_info->tx_hashtbl_lock)); >- > new_hashtbl = kzalloc(size, GFP_KERNEL); > if (!new_hashtbl) { > pr_err("%s: Error: Failed to allocate TLB hash table\n", >@@ -747,8 +745,6 @@ static int rlb_initialize(struct bonding *bond) > int size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info); > int i; > >- spin_lock_init(&(bond_info->rx_hashtbl_lock)); >- > new_hashtbl = kmalloc(size, GFP_KERNEL); > if (!new_hashtbl) { > pr_err("%s: Error: Failed to allocate RLB hash table\n", >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >index 088fd84..59bf5c5 100644 >--- a/drivers/net/bonding/bond_main.c >+++ b/drivers/net/bonding/bond_main.c >@@ -1542,12 +1542,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) > bond_dev->name, slave_dev->name); > } > >- /* bond must be initialized by bond_open() before enslaving */ >- if (!(bond_dev->flags & IFF_UP)) { >- pr_warning("%s: master_dev is not up in bond_enslave\n", >- bond_dev->name); >- } >- > /* already enslaved */ > if (slave_dev->flags & IFF_SLAVE) { > pr_debug("Error, Device was already enslaved\n"); >@@ -4832,9 +4826,19 @@ static int bond_init(struct net_device *bond_dev) > { > struct bonding *bond = netdev_priv(bond_dev); > struct bond_net *bn = net_generic(dev_net(bond_dev), bond_net_id); >+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); > > pr_debug("Begin bond_init for %s\n", bond_dev->name); > >+ /* >+ * Initialize locks that may be requiered during "Required." >+ * en/deslave operations. All of the bond_open work >+ * (of which this is part) should really be moved to >+ * a phase prior to dev_open >+ */ >+ spin_lock_init(&(bond_info->tx_hashtbl_lock)); >+ spin_lock_init(&(bond_info->rx_hashtbl_lock)); >+ > bond->wq = create_singlethread_workqueue(bond_dev->name); > if (!bond->wq) > return -ENOMEM; >diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c >index 4059bfc..bb1319f 100644 >--- a/drivers/net/bonding/bond_sysfs.c >+++ b/drivers/net/bonding/bond_sysfs.c >@@ -227,12 +227,6 @@ static ssize_t bonding_store_slaves(struct device *d, > struct net_device *dev; > struct bonding *bond = to_bond(d); > >- /* Quick sanity check -- is the bond interface up? */ >- if (!(bond->dev->flags & IFF_UP)) { >- pr_warning("%s: doing slave updates when interface is down.\n", >- bond->dev->name); >- } >- > if (!rtnl_trylock()) > return restart_syscall(); > >-- >1.7.5.2 > --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com