* [PATCH 0/4] bonding: bug fixes for 2.6.26 @ 2008-05-03 0:49 Jay Vosburgh 2008-05-03 0:49 ` [PATCH 1/4] bonding: Do not call free_netdev for already registered device Jay Vosburgh 0 siblings, 1 reply; 12+ messages in thread From: Jay Vosburgh @ 2008-05-03 0:49 UTC (permalink / raw) To: netdev; +Cc: Jeff Garzik Following are four bonding bug fixes for 2.6.26, as follows: 1- Fix error unwind logic in bond_create (call unregister_netdevice instead of free_netdev). 2- Fix error unwind in bonding_store_bonds (release necessary locks). 3- Fix deadlock between bonding_store_bonds and bond_destroy_sysfs (do sysfs action outside of rtnl). 4- Fix error unwind in bond_enslave to correctly clear master setting and IFF_SLAVE. Please apply. Thanks, -J --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/4] bonding: Do not call free_netdev for already registered device. 2008-05-03 0:49 [PATCH 0/4] bonding: bug fixes for 2.6.26 Jay Vosburgh @ 2008-05-03 0:49 ` Jay Vosburgh 2008-05-03 0:49 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds Jay Vosburgh ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Jay Vosburgh @ 2008-05-03 0:49 UTC (permalink / raw) To: netdev; +Cc: Jeff Garzik, Pavel Emelyanov, Jay Vosburgh From: Pavel Emelyanov <xemul@openvz.org> If the call to bond_create_sysfs_entry in bond_create fails, the proper rollback is to call unregister_netdevice, not free_netdev. Otherwise - kernel BUG at net/core/dev.c:4057! Checked with artificial failures injected into bond_create_sysfs_entry. Pavel's original patch modified by Jay Vosburgh to move code around for clarity (remove goto-hopping within the unwind block). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> --- drivers/net/bonding/bond_main.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 6e91b4b..7ffd819 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -4939,7 +4939,9 @@ int bond_create(char *name, struct bond_params *params, struct bonding **newbond if (res < 0) { rtnl_lock(); down_write(&bonding_rwsem); - goto out_bond; + bond_deinit(bond_dev); + unregister_netdevice(bond_dev); + goto out_rtnl; } return 0; -- 1.5.2.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds 2008-05-03 0:49 ` [PATCH 1/4] bonding: Do not call free_netdev for already registered device Jay Vosburgh @ 2008-05-03 0:49 ` Jay Vosburgh 2008-05-03 0:49 ` [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs Jay Vosburgh 2008-05-03 0:51 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds David Miller 2008-05-03 0:51 ` [PATCH 1/4] bonding: Do not call free_netdev for already registered device David Miller 2008-05-06 16:17 ` Jeff Garzik 2 siblings, 2 replies; 12+ messages in thread From: Jay Vosburgh @ 2008-05-03 0:49 UTC (permalink / raw) To: netdev; +Cc: Jeff Garzik, Jay Vosburgh Fixed an error unwind in bonding_store_bonds that didn't release the locks it held, and consolidated unwinds into a common block at the end of the function. Bug reported by Pavel Emelyanov <xemul@openvz.org>, who provided a different fix. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> --- drivers/net/bonding/bond_sysfs.c | 16 ++++++++-------- 1 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c index 979c2d0..68c41a0 100644 --- a/drivers/net/bonding/bond_sysfs.c +++ b/drivers/net/bonding/bond_sysfs.c @@ -146,29 +146,29 @@ static ssize_t bonding_store_bonds(struct class *cls, const char *buffer, size_t ": Unable remove bond %s due to open references.\n", ifname); res = -EPERM; - goto out; + goto out_unlock; } printk(KERN_INFO DRV_NAME ": %s is being deleted...\n", bond->dev->name); bond_destroy(bond); - up_write(&bonding_rwsem); - rtnl_unlock(); - goto out; + goto out_unlock; } printk(KERN_ERR DRV_NAME ": unable to delete non-existent bond %s\n", ifname); res = -ENODEV; - up_write(&bonding_rwsem); - rtnl_unlock(); - goto out; + goto out_unlock; } err_no_cmd: printk(KERN_ERR DRV_NAME ": no command found in bonding_masters. Use +ifname or -ifname.\n"); - res = -EPERM; + return -EPERM; + +out_unlock: + up_write(&bonding_rwsem); + rtnl_unlock(); /* Always return either count or an error. If you return 0, you'll * get called forever, which is bad. -- 1.5.2.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs. 2008-05-03 0:49 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds Jay Vosburgh @ 2008-05-03 0:49 ` Jay Vosburgh 2008-05-03 0:49 ` [PATCH 4/4] bonding: fix enslavement error unwinds Jay Vosburgh 2008-05-03 0:51 ` [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs David Miller 2008-05-03 0:51 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds David Miller 1 sibling, 2 replies; 12+ messages in thread From: Jay Vosburgh @ 2008-05-03 0:49 UTC (permalink / raw) To: netdev; +Cc: Jeff Garzik, Pavel Emelyanov From: Pavel Emelyanov <xemul@openvz.org> The sysfs layer has an internal protection, that ensures, that all the process sitting inside ->sore/->show callback exits before the appropriate entry is unregistered (the calltraces are rather big, but I can provide them if required). On the other hand, bonding takes rtnl_lock in a) the bonding_store_bonds, i.e. in ->store callback, b) module exit before calling the sysfs unregister routines. Thus, the classical AB-BA deadlock may occur. To reproduce run # while :; do modprobe bonding; rmmod bonding; done and # while :; do echo '+bond%d' > /sys/class/net/bonding_masters ; done in parallel. The fix is to move the bond_destroy_sysfs out of the rtnl_lock, but _before_ bond_free_all to make sure no bonding devices exist after module unload. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Jay Vosburgh <fubar@us.ibm.com> --- drivers/net/bonding/bond_main.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 7ffd819..3977760 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -4995,9 +4995,10 @@ err: destroy_workqueue(bond->wq); } + bond_destroy_sysfs(); + rtnl_lock(); bond_free_all(); - bond_destroy_sysfs(); rtnl_unlock(); out: return res; @@ -5009,9 +5010,10 @@ static void __exit bonding_exit(void) unregister_netdevice_notifier(&bond_netdev_notifier); unregister_inetaddr_notifier(&bond_inetaddr_notifier); + bond_destroy_sysfs(); + rtnl_lock(); bond_free_all(); - bond_destroy_sysfs(); rtnl_unlock(); } -- 1.5.2.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/4] bonding: fix enslavement error unwinds 2008-05-03 0:49 ` [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs Jay Vosburgh @ 2008-05-03 0:49 ` Jay Vosburgh 2008-05-03 0:52 ` David Miller 2008-05-03 0:51 ` [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs David Miller 1 sibling, 1 reply; 12+ messages in thread From: Jay Vosburgh @ 2008-05-03 0:49 UTC (permalink / raw) To: netdev; +Cc: Jeff Garzik, Jay Vosburgh For commit c2edacf80e155ef54ae4774379d461b60896bc2e, two steps were rearranged in the enslavement process: netdev_set_master is now before the call to dev_open to open the slave. This patch updates the error cases and unwind process at the end of bond_enslave to match the new order. Without this patch, it is possible for the enslavement to fail, but leave the slave with IFF_SLAVE set in its flags. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> --- drivers/net/bonding/bond_main.c | 14 +++++++------- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 3977760..b99ccd2 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1425,13 +1425,13 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) res = netdev_set_master(slave_dev, bond_dev); if (res) { dprintk("Error %d calling netdev_set_master\n", res); - goto err_close; + goto err_restore_mac; } /* open the slave since the application closed it */ res = dev_open(slave_dev); if (res) { dprintk("Openning slave %s failed\n", slave_dev->name); - goto err_restore_mac; + goto err_unset_master; } new_slave->dev = slave_dev; @@ -1444,7 +1444,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) */ res = bond_alb_init_slave(bond, new_slave); if (res) { - goto err_unset_master; + goto err_close; } } @@ -1619,7 +1619,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) res = bond_create_slave_symlinks(bond_dev, slave_dev); if (res) - goto err_unset_master; + goto err_close; printk(KERN_INFO DRV_NAME ": %s: enslaving %s as a%s interface with a%s link.\n", @@ -1631,12 +1631,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) return 0; /* Undo stages on error */ -err_unset_master: - netdev_set_master(slave_dev, NULL); - err_close: dev_close(slave_dev); +err_unset_master: + netdev_set_master(slave_dev, NULL); + err_restore_mac: if (!bond->params.fail_over_mac) { memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN); -- 1.5.2.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 4/4] bonding: fix enslavement error unwinds 2008-05-03 0:49 ` [PATCH 4/4] bonding: fix enslavement error unwinds Jay Vosburgh @ 2008-05-03 0:52 ` David Miller 2008-05-03 1:06 ` [PATCH REPOST " Jay Vosburgh 0 siblings, 1 reply; 12+ messages in thread From: David Miller @ 2008-05-03 0:52 UTC (permalink / raw) To: fubar; +Cc: netdev, jgarzik From: Jay Vosburgh <fubar@us.ibm.com> Date: Fri, 2 May 2008 17:49:40 -0700 > For commit c2edacf80e155ef54ae4774379d461b60896bc2e, two steps > were rearranged in the enslavement process: netdev_set_master is now > before the call to dev_open to open the slave. Please add the commit header line text to this SHA-ID reference. > This patch updates the error cases and unwind process at the > end of bond_enslave to match the new order. Without this patch, it is > possible for the enslavement to fail, but leave the slave with IFF_SLAVE > set in its flags. > > Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Otherwise: Acked-by: David S. Miller <davem@davemloft.net> ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH REPOST 4/4] bonding: fix enslavement error unwinds 2008-05-03 0:52 ` David Miller @ 2008-05-03 1:06 ` Jay Vosburgh 2008-05-03 1:07 ` David Miller 0 siblings, 1 reply; 12+ messages in thread From: Jay Vosburgh @ 2008-05-03 1:06 UTC (permalink / raw) To: David Miller; +Cc: netdev, jgarzik As part of: commit c2edacf80e155ef54ae4774379d461b60896bc2e Author: Jay Vosburgh <fubar@us.ibm.com> Date: Mon Jul 9 10:42:47 2007 -0700 bonding / ipv6: no addrconf for slaves separately from master two steps were rearranged in the enslavement process: netdev_set_master is now before the call to dev_open to open the slave. This patch updates the error cases and unwind process at the end of bond_enslave to match the new order. Without this patch, it is possible for the enslavement to fail, but leave the slave with IFF_SLAVE set in its flags. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> --- drivers/net/bonding/bond_main.c | 14 +++++++------- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 3977760..b99ccd2 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1425,13 +1425,13 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) res = netdev_set_master(slave_dev, bond_dev); if (res) { dprintk("Error %d calling netdev_set_master\n", res); - goto err_close; + goto err_restore_mac; } /* open the slave since the application closed it */ res = dev_open(slave_dev); if (res) { dprintk("Openning slave %s failed\n", slave_dev->name); - goto err_restore_mac; + goto err_unset_master; } new_slave->dev = slave_dev; @@ -1444,7 +1444,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) */ res = bond_alb_init_slave(bond, new_slave); if (res) { - goto err_unset_master; + goto err_close; } } @@ -1619,7 +1619,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) res = bond_create_slave_symlinks(bond_dev, slave_dev); if (res) - goto err_unset_master; + goto err_close; printk(KERN_INFO DRV_NAME ": %s: enslaving %s as a%s interface with a%s link.\n", @@ -1631,12 +1631,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) return 0; /* Undo stages on error */ -err_unset_master: - netdev_set_master(slave_dev, NULL); - err_close: dev_close(slave_dev); +err_unset_master: + netdev_set_master(slave_dev, NULL); + err_restore_mac: if (!bond->params.fail_over_mac) { memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN); -- 1.5.2.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH REPOST 4/4] bonding: fix enslavement error unwinds 2008-05-03 1:06 ` [PATCH REPOST " Jay Vosburgh @ 2008-05-03 1:07 ` David Miller 0 siblings, 0 replies; 12+ messages in thread From: David Miller @ 2008-05-03 1:07 UTC (permalink / raw) To: fubar; +Cc: netdev, jgarzik From: Jay Vosburgh <fubar@us.ibm.com> Date: Fri, 02 May 2008 18:06:02 -0700 > > As part of: > > commit c2edacf80e155ef54ae4774379d461b60896bc2e > Author: Jay Vosburgh <fubar@us.ibm.com> > Date: Mon Jul 9 10:42:47 2007 -0700 > > bonding / ipv6: no addrconf for slaves separately from master > > two steps were rearranged in the enslavement process: netdev_set_master > is now before the call to dev_open to open the slave. > > This patch updates the error cases and unwind process at the > end of bond_enslave to match the new order. Without this patch, it is > possible for the enslavement to fail, but leave the slave with IFF_SLAVE > set in its flags. > > Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Thanks for fixing up the commit message: Acked-by: David S. Miller <davem@davemloft.net> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs. 2008-05-03 0:49 ` [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs Jay Vosburgh 2008-05-03 0:49 ` [PATCH 4/4] bonding: fix enslavement error unwinds Jay Vosburgh @ 2008-05-03 0:51 ` David Miller 1 sibling, 0 replies; 12+ messages in thread From: David Miller @ 2008-05-03 0:51 UTC (permalink / raw) To: fubar; +Cc: netdev, jgarzik, xemul From: Jay Vosburgh <fubar@us.ibm.com> Date: Fri, 2 May 2008 17:49:39 -0700 > From: Pavel Emelyanov <xemul@openvz.org> > > The sysfs layer has an internal protection, that ensures, that > all the process sitting inside ->sore/->show callback exits > before the appropriate entry is unregistered (the calltraces > are rather big, but I can provide them if required). > > On the other hand, bonding takes rtnl_lock in > a) the bonding_store_bonds, i.e. in ->store callback, > b) module exit before calling the sysfs unregister routines. > > Thus, the classical AB-BA deadlock may occur. To reproduce run > # while :; do modprobe bonding; rmmod bonding; done > and > # while :; do echo '+bond%d' > /sys/class/net/bonding_masters ; done > in parallel. > > The fix is to move the bond_destroy_sysfs out of the rtnl_lock, > but _before_ bond_free_all to make sure no bonding devices exist > after module unload. > > Signed-off-by: Pavel Emelyanov <xemul@openvz.org> > Acked-by: Jay Vosburgh <fubar@us.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds 2008-05-03 0:49 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds Jay Vosburgh 2008-05-03 0:49 ` [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs Jay Vosburgh @ 2008-05-03 0:51 ` David Miller 1 sibling, 0 replies; 12+ messages in thread From: David Miller @ 2008-05-03 0:51 UTC (permalink / raw) To: fubar; +Cc: netdev, jgarzik From: Jay Vosburgh <fubar@us.ibm.com> Date: Fri, 2 May 2008 17:49:38 -0700 > Fixed an error unwind in bonding_store_bonds that didn't release > the locks it held, and consolidated unwinds into a common block at the > end of the function. Bug reported by Pavel Emelyanov <xemul@openvz.org>, > who provided a different fix. > > Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/4] bonding: Do not call free_netdev for already registered device. 2008-05-03 0:49 ` [PATCH 1/4] bonding: Do not call free_netdev for already registered device Jay Vosburgh 2008-05-03 0:49 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds Jay Vosburgh @ 2008-05-03 0:51 ` David Miller 2008-05-06 16:17 ` Jeff Garzik 2 siblings, 0 replies; 12+ messages in thread From: David Miller @ 2008-05-03 0:51 UTC (permalink / raw) To: fubar; +Cc: netdev, jgarzik, xemul From: Jay Vosburgh <fubar@us.ibm.com> Date: Fri, 2 May 2008 17:49:37 -0700 > From: Pavel Emelyanov <xemul@openvz.org> > > If the call to bond_create_sysfs_entry in bond_create fails, the > proper rollback is to call unregister_netdevice, not free_netdev. > Otherwise - kernel BUG at net/core/dev.c:4057! > > Checked with artificial failures injected into bond_create_sysfs_entry. > > Pavel's original patch modified by Jay Vosburgh to move code around > for clarity (remove goto-hopping within the unwind block). > > Signed-off-by: Pavel Emelyanov <xemul@openvz.org> > Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/4] bonding: Do not call free_netdev for already registered device. 2008-05-03 0:49 ` [PATCH 1/4] bonding: Do not call free_netdev for already registered device Jay Vosburgh 2008-05-03 0:49 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds Jay Vosburgh 2008-05-03 0:51 ` [PATCH 1/4] bonding: Do not call free_netdev for already registered device David Miller @ 2008-05-06 16:17 ` Jeff Garzik 2 siblings, 0 replies; 12+ messages in thread From: Jeff Garzik @ 2008-05-06 16:17 UTC (permalink / raw) To: Jay Vosburgh; +Cc: netdev, Pavel Emelyanov Jay Vosburgh wrote: > From: Pavel Emelyanov <xemul@openvz.org> > > If the call to bond_create_sysfs_entry in bond_create fails, the > proper rollback is to call unregister_netdevice, not free_netdev. > Otherwise - kernel BUG at net/core/dev.c:4057! > > Checked with artificial failures injected into bond_create_sysfs_entry. > > Pavel's original patch modified by Jay Vosburgh to move code around > for clarity (remove goto-hopping within the unwind block). > > Signed-off-by: Pavel Emelyanov <xemul@openvz.org> > Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> > --- > drivers/net/bonding/bond_main.c | 4 +++- > 1 files changed, 3 insertions(+), 1 deletions(-) applied 1-4 ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2008-05-06 16:17 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-05-03 0:49 [PATCH 0/4] bonding: bug fixes for 2.6.26 Jay Vosburgh 2008-05-03 0:49 ` [PATCH 1/4] bonding: Do not call free_netdev for already registered device Jay Vosburgh 2008-05-03 0:49 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds Jay Vosburgh 2008-05-03 0:49 ` [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs Jay Vosburgh 2008-05-03 0:49 ` [PATCH 4/4] bonding: fix enslavement error unwinds Jay Vosburgh 2008-05-03 0:52 ` David Miller 2008-05-03 1:06 ` [PATCH REPOST " Jay Vosburgh 2008-05-03 1:07 ` David Miller 2008-05-03 0:51 ` [PATCH 3/4] bonding: Deadlock between bonding_store_bonds and bond_destroy_sysfs David Miller 2008-05-03 0:51 ` [PATCH 2/4] bonding: fix error unwind in bonding_store_bonds David Miller 2008-05-03 0:51 ` [PATCH 1/4] bonding: Do not call free_netdev for already registered device David Miller 2008-05-06 16:17 ` Jeff Garzik
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).