* [PATCH] bonding: various fixes for bonding, netpoll & netconsole @ 2010-10-12 21:55 nhorman 2010-10-12 21:55 ` [PATCH 1/5] Fix bonding drivers improper modification of netpoll structure nhorman ` (4 more replies) 0 siblings, 5 replies; 11+ messages in thread From: nhorman @ 2010-10-12 21:55 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman Grr, really sorry guys, 3rd times the charm. netdev greylisted this and several others pitched it because of a bad mail header. I'm resending through tuxdriver like I should have done before A while ago we tried to enable netpoll on the bonding driver to enable netconsole. That worked well in a steady state, but deadlocked frequently in failover conditions due to some recursive lock-taking (as well as a few other problems). I've gone through the driver, netconsole and netpoll code, fixed up those deadlocks, and confirmed that, with this patch series, we can use netconsole on bonding without deadlock in all bonding modes with all slaves, even accross failovers. I've also fixed up some incidental bugs that I ran across while looking through this code, as described in individual patches Signed-off-by: Neil Horman <nhorman@tuxdriver.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/5] Fix bonding drivers improper modification of netpoll structure 2010-10-12 21:55 [PATCH] bonding: various fixes for bonding, netpoll & netconsole nhorman @ 2010-10-12 21:55 ` nhorman 2010-10-12 21:55 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman ` (3 subsequent siblings) 4 siblings, 0 replies; 11+ messages in thread From: nhorman @ 2010-10-12 21:55 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman From: Neil Horman <nhorman@tuxdriver.com> The bonding driver currently modifies the netpoll structure in its xmit path while sending frames from netpoll. This is racy, as other cpus can access the netpoll structure in parallel. Since the bonding driver points np->dev to a slave device, other cpus can inadvertently attempt to send data directly to slave devices, leading to improper locking with the bonding master, lost frames, and deadlocks. This patch fixes that up. This patch also removes the real_dev pointer from the netpoll structure as that data is really only used by bonding in the poll_controller, and we can emulate its behavior by check each slave for IS_UP. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/bonding/bond_main.c | 15 +++++++++------ include/linux/netpoll.h | 9 +++++++-- net/core/netpoll.c | 6 +++--- 3 files changed, 19 insertions(+), 11 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index a0bf35d..eb7d089 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -449,11 +449,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, if (unlikely(bond->dev->priv_flags & IFF_IN_NETPOLL)) { struct netpoll *np = bond->dev->npinfo->netpoll; slave_dev->npinfo = bond->dev->npinfo; - np->real_dev = np->dev = skb->dev; slave_dev->priv_flags |= IFF_IN_NETPOLL; - netpoll_send_skb(np, skb); + netpoll_send_skb_on_dev(np, skb, slave_dev); slave_dev->priv_flags &= ~IFF_IN_NETPOLL; - np->dev = bond->dev; } else #endif dev_queue_xmit(skb); @@ -1332,9 +1330,14 @@ static bool slaves_support_netpoll(struct net_device *bond_dev) static void bond_poll_controller(struct net_device *bond_dev) { - struct net_device *dev = bond_dev->npinfo->netpoll->real_dev; - if (dev != bond_dev) - netpoll_poll_dev(dev); + struct bonding *bond = netdev_priv(bond_dev); + struct slave *slave; + int i; + + bond_for_each_slave(bond, slave, i) { + if (slave->dev && IS_UP(slave->dev)) + netpoll_poll_dev(slave->dev); + } } static void bond_netpoll_cleanup(struct net_device *bond_dev) diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h index 50d8009..79358bb 100644 --- a/include/linux/netpoll.h +++ b/include/linux/netpoll.h @@ -14,7 +14,6 @@ struct netpoll { struct net_device *dev; - struct net_device *real_dev; char dev_name[IFNAMSIZ]; const char *name; void (*rx_hook)(struct netpoll *, int, char *, int); @@ -53,7 +52,13 @@ void netpoll_set_trap(int trap); void __netpoll_cleanup(struct netpoll *np); void netpoll_cleanup(struct netpoll *np); int __netpoll_rx(struct sk_buff *skb); -void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb); +void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb, + struct net_device *dev); +static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb) +{ + netpoll_send_skb_on_dev(np, skb, np->dev); +} + #ifdef CONFIG_NETPOLL diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 537e01a..4e98ffa 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -288,11 +288,11 @@ static int netpoll_owner_active(struct net_device *dev) return 0; } -void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb) +void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb, + struct net_device *dev) { int status = NETDEV_TX_BUSY; unsigned long tries; - struct net_device *dev = np->dev; const struct net_device_ops *ops = dev->netdev_ops; /* It is up to the caller to keep npinfo alive. */ struct netpoll_info *npinfo = np->dev->npinfo; @@ -346,7 +346,7 @@ void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb) schedule_delayed_work(&npinfo->tx_work,0); } } -EXPORT_SYMBOL(netpoll_send_skb); +EXPORT_SYMBOL(netpoll_send_skb_on_dev); void netpoll_send_udp(struct netpoll *np, const char *msg, int len) { -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll 2010-10-12 21:55 [PATCH] bonding: various fixes for bonding, netpoll & netconsole nhorman 2010-10-12 21:55 ` [PATCH 1/5] Fix bonding drivers improper modification of netpoll structure nhorman @ 2010-10-12 21:55 ` nhorman 2010-10-13 2:44 ` Cong Wang 2010-10-12 21:55 ` [PATCH 3/5] Fix napi poll for bonding driver nhorman ` (2 subsequent siblings) 4 siblings, 1 reply; 11+ messages in thread From: nhorman @ 2010-10-12 21:55 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman From: Neil Horman <nhorman@tuxdriver.com> The monitoring paths in the bonding driver take write locks that are shared by the tx path. If netconsole is in use, these paths can call printk which puts us in the netpoll tx path, which, if netconsole is attached to the bonding driver, result in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the monitor paths in the bonding driver don't claim the xmit_lock, nor should they). The solution is to use a per cpu flag internal to the driver to indicate when a cpu is holding the lock in a path that might recusrse into the tx path for the driver via netconsole. By checking this flag on transmit, we can defer the sending of the netconsole frames until a later time using the retransmit feature of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY. I've tested this and am able to transmit via netconsole while causing failover conditions on the bond slave links. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/bonding/bond_main.c | 67 ++++++++++++++++++++++++++++++++++++-- 1 files changed, 63 insertions(+), 4 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index eb7d089..1d0d8c5 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -76,6 +76,7 @@ #include <linux/if_vlan.h> #include <linux/if_bonding.h> #include <linux/jiffies.h> +#include <linux/preempt.h> #include <net/route.h> #include <net/net_namespace.h> #include <net/netns/generic.h> @@ -169,6 +170,35 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link /*----------------------------- Global variables ----------------------------*/ +#ifdef CONFIG_NET_POLL_CONTROLLER +static cpumask_var_t netpoll_block_tx; + +static inline void block_netpoll_tx(void +{ + preempt_disable(); + BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(), + netpoll_block_tx)); +} + +static inline void unblock_netpoll_tx(void) +{ + BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(), + netpoll_block_tx)); + preempt_enable(); +} + +static inline int is_netpoll_tx_blocked(struct net_device *dev) +{ + if (unlikely(dev->priv_flags & IFF_IN_NETPOLL)) + return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx); + return 0; +} +#else +#define block_netpoll_tx() +#define unblock_netpoll_tx() +#define is_netpoll_tx_blocked(dev) +#endif + static const char * const version = DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n"; @@ -310,6 +340,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id) pr_debug("bond: %s, vlan id %d\n", bond->dev->name, vlan_id); + block_netpoll_tx(); write_lock_bh(&bond->lock); list_for_each_entry(vlan, &bond->vlan_list, vlan_list) { @@ -344,6 +375,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id) out: write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return res; } @@ -1804,10 +1836,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) bond_set_carrier(bond); #ifdef CONFIG_NET_POLL_CONTROLLER - /* - * Netpoll and bonding is broken, make sure it is not initialized - * until it is fixed. - */ if (disable_netpoll) { bond_dev->priv_flags |= IFF_DISABLE_NETPOLL; } else { @@ -1892,6 +1920,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) return -EINVAL; } + block_netpoll_tx(); netdev_bonding_change(bond_dev, NETDEV_BONDING_DESLAVE); write_lock_bh(&bond->lock); @@ -1901,6 +1930,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) pr_info("%s: %s not enslaved\n", bond_dev->name, slave_dev->name); write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return -EINVAL; } @@ -1994,6 +2024,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) } write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); /* must do this from outside any spinlocks */ bond_destroy_slave_symlinks(bond_dev, slave_dev); @@ -2085,6 +2116,7 @@ static int bond_release_all(struct net_device *bond_dev) struct net_device *slave_dev; struct sockaddr addr; + block_netpoll_tx(); write_lock_bh(&bond->lock); netif_carrier_off(bond_dev); @@ -2183,6 +2215,7 @@ static int bond_release_all(struct net_device *bond_dev) out: write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return 0; } @@ -2232,9 +2265,11 @@ static int bond_ioctl_change_active(struct net_device *bond_dev, struct net_devi (old_active) && (new_active->link == BOND_LINK_UP) && IS_UP(new_active->dev)) { + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_change_active_slave(bond, new_active); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } else res = -EINVAL; @@ -2466,9 +2501,11 @@ static void bond_miimon_commit(struct bonding *bond) do_failover: ASSERT_RTNL(); + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } bond_set_carrier(bond); @@ -2911,11 +2948,13 @@ void bond_loadbalance_arp_mon(struct work_struct *work) } if (do_failover) { + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } re_arm: @@ -3074,9 +3113,11 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks) do_failover: ASSERT_RTNL(); + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } bond_set_carrier(bond); @@ -4564,6 +4605,13 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct bonding *bond = netdev_priv(dev); + /* + * If we risk deadlock from transmitting this in the + * netpoll path, tell netpoll to queue the frame for later tx + */ + if (is_netpoll_tx_blocked(dev)) + return NETDEV_TX_BUSY; + if (TX_QUEUE_OVERRIDE(bond->params.mode)) { if (!bond_slave_override(bond, skb)) return NETDEV_TX_OK; @@ -5295,6 +5343,13 @@ static int __init bonding_init(void) if (res) goto err; +#ifdef CONFIG_NET_POLL_CONTROLLER + if (!alloc_cpumask_var(&netpoll_block_tx, GFP_KERNEL)) { + res = -ENOMEM; + bond_destroy_sysfs(); + goto err; + } +#endif register_netdevice_notifier(&bond_netdev_notifier); register_inetaddr_notifier(&bond_inetaddr_notifier); bond_register_ipv6_notifier(); @@ -5316,6 +5371,10 @@ static void __exit bonding_exit(void) bond_destroy_sysfs(); +#ifdef CONFIG_NET_POLL_CONTROLLER + free_cpumask_var(netpoll_block_tx); +#endif + rtnl_link_unregister(&bond_link_ops); unregister_pernet_subsys(&bond_net_ops); } -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll 2010-10-12 21:55 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman @ 2010-10-13 2:44 ` Cong Wang 2010-10-13 10:51 ` Neil Horman 0 siblings, 1 reply; 11+ messages in thread From: Cong Wang @ 2010-10-13 2:44 UTC (permalink / raw) To: nhorman; +Cc: netdev, bonding-devel, fubar, davem, andy On 10/13/10 05:55, nhorman@tuxdriver.com wrote: > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -76,6 +76,7 @@ > #include<linux/if_vlan.h> > #include<linux/if_bonding.h> > #include<linux/jiffies.h> > +#include<linux/preempt.h> > #include<net/route.h> > #include<net/net_namespace.h> > #include<net/netns/generic.h> > @@ -169,6 +170,35 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link > > /*----------------------------- Global variables ----------------------------*/ > > +#ifdef CONFIG_NET_POLL_CONTROLLER > +static cpumask_var_t netpoll_block_tx; > + > +static inline void block_netpoll_tx(void > +{ > + preempt_disable(); > + BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(), > + netpoll_block_tx)); > +} > + > +static inline void unblock_netpoll_tx(void) > +{ > + BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(), > + netpoll_block_tx)); > + preempt_enable(); > +} > + > +static inline int is_netpoll_tx_blocked(struct net_device *dev) > +{ > + if (unlikely(dev->priv_flags& IFF_IN_NETPOLL)) > + return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx); > + return 0; > +} > +#else > +#define block_netpoll_tx() > +#define unblock_netpoll_tx() > +#define is_netpoll_tx_blocked(dev) > +#endif > + These should go to netpoll.h, IMHO. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll 2010-10-13 2:44 ` Cong Wang @ 2010-10-13 10:51 ` Neil Horman 0 siblings, 0 replies; 11+ messages in thread From: Neil Horman @ 2010-10-13 10:51 UTC (permalink / raw) To: Cong Wang; +Cc: netdev, bonding-devel, fubar, davem, andy On Wed, Oct 13, 2010 at 10:44:55AM +0800, Cong Wang wrote: > On 10/13/10 05:55, nhorman@tuxdriver.com wrote: > >--- a/drivers/net/bonding/bond_main.c > >+++ b/drivers/net/bonding/bond_main.c > >@@ -76,6 +76,7 @@ > > #include<linux/if_vlan.h> > > #include<linux/if_bonding.h> > > #include<linux/jiffies.h> > >+#include<linux/preempt.h> > > #include<net/route.h> > > #include<net/net_namespace.h> > > #include<net/netns/generic.h> > >@@ -169,6 +170,35 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link > > > > /*----------------------------- Global variables ----------------------------*/ > > > >+#ifdef CONFIG_NET_POLL_CONTROLLER > >+static cpumask_var_t netpoll_block_tx; > >+ > >+static inline void block_netpoll_tx(void > >+{ > >+ preempt_disable(); > >+ BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(), > >+ netpoll_block_tx)); > >+} > >+ > >+static inline void unblock_netpoll_tx(void) > >+{ > >+ BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(), > >+ netpoll_block_tx)); > >+ preempt_enable(); > >+} > >+ > >+static inline int is_netpoll_tx_blocked(struct net_device *dev) > >+{ > >+ if (unlikely(dev->priv_flags& IFF_IN_NETPOLL)) > >+ return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx); > >+ return 0; > >+} > >+#else > >+#define block_netpoll_tx() > >+#define unblock_netpoll_tx() > >+#define is_netpoll_tx_blocked(dev) > >+#endif > >+ > > These should go to netpoll.h, IMHO. > Doh, you're right, they should. Particularly because I just noticed there are a few paths through sysfs for bonding that can recurse the same way the monitoring code can. I'll respin this shortly, thanks. Neil ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/5] Fix napi poll for bonding driver 2010-10-12 21:55 [PATCH] bonding: various fixes for bonding, netpoll & netconsole nhorman 2010-10-12 21:55 ` [PATCH 1/5] Fix bonding drivers improper modification of netpoll structure nhorman 2010-10-12 21:55 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman @ 2010-10-12 21:55 ` nhorman 2010-10-12 21:55 ` [PATCH 4/5] Fix netconsole to not deadlock on rmmod nhorman 2010-10-12 21:55 ` [PATCH 5/5] Re-enable netpoll over bonding nhorman 4 siblings, 0 replies; 11+ messages in thread From: nhorman @ 2010-10-12 21:55 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman From: Neil Horman <nhorman@tuxdriver.com> Usually the netpoll path, when preforming a napi poll can get away with just polling all the napi instances of the configured device. Thats not the case for the bonding driver however, as the napi instances which may wind up getting flagged as needing polling after the poll_controller call don't belong to the bonded device, but rather to the slave devices. Fix this by checking the device in question for the IFF_MASTER flag, if set, we know we need to check the full poll list for this cpu, rather than just the devices napi instance list. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- net/core/netpoll.c | 9 ++++++++- 1 files changed, 8 insertions(+), 1 deletions(-) diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 4e98ffa..d79d221 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -156,8 +156,15 @@ static void poll_napi(struct net_device *dev) { struct napi_struct *napi; int budget = 16; + struct softnet_data *sd = &__get_cpu_var(softnet_data); + struct list_head *nlist; - list_for_each_entry(napi, &dev->napi_list, dev_list) { + if (dev->flags & IFF_MASTER) + nlist = &sd->poll_list; + else + nlist = &dev->napi_list; + + list_for_each_entry(napi, nlist, dev_list) { if (napi->poll_owner != smp_processor_id() && spin_trylock(&napi->poll_lock)) { budget = poll_one_napi(dev->npinfo, napi, budget); -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/5] Fix netconsole to not deadlock on rmmod 2010-10-12 21:55 [PATCH] bonding: various fixes for bonding, netpoll & netconsole nhorman ` (2 preceding siblings ...) 2010-10-12 21:55 ` [PATCH 3/5] Fix napi poll for bonding driver nhorman @ 2010-10-12 21:55 ` nhorman 2010-10-12 21:55 ` [PATCH 5/5] Re-enable netpoll over bonding nhorman 4 siblings, 0 replies; 11+ messages in thread From: nhorman @ 2010-10-12 21:55 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman From: Neil Horman <nhorman@tuxdriver.com> Netconsole calls netpoll_cleanup on receipt of a NETDEVICE_UNREGISTER event. The notifier subsystem calls these event handlers with rtnl_lock held, which netpoll_cleanup also takes, resulting in deadlock. Fix this by calling the __netpoll_cleanup interior function instead, and fixing up the additional pointers. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/netconsole.c | 9 ++++++++- 1 files changed, 8 insertions(+), 1 deletions(-) diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c index ca142c4..94255f0 100644 --- a/drivers/net/netconsole.c +++ b/drivers/net/netconsole.c @@ -678,7 +678,14 @@ static int netconsole_netdev_event(struct notifier_block *this, strlcpy(nt->np.dev_name, dev->name, IFNAMSIZ); break; case NETDEV_UNREGISTER: - netpoll_cleanup(&nt->np); + /* + * rtnl_lock already held + */ + if (nt->np.dev) { + __netpoll_cleanup(&nt->np); + dev_put(nt->np.dev); + nt->np.dev = NULL; + } /* Fall through */ case NETDEV_GOING_DOWN: case NETDEV_BONDING_DESLAVE: -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 5/5] Re-enable netpoll over bonding 2010-10-12 21:55 [PATCH] bonding: various fixes for bonding, netpoll & netconsole nhorman ` (3 preceding siblings ...) 2010-10-12 21:55 ` [PATCH 4/5] Fix netconsole to not deadlock on rmmod nhorman @ 2010-10-12 21:55 ` nhorman 4 siblings, 0 replies; 11+ messages in thread From: nhorman @ 2010-10-12 21:55 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman From: Neil Horman <nhorman@tuxdriver.com> With the inclusion of previous fixup patches, netpoll over bonding apears to work reliably with failover conditions. This reverts Gospos previous commit c22d7ac844f1cb9c6a5fd20f89ebadc2feef891b, and allows access again to the netpoll functionality in the bonding driver. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/bonding/bond_main.c | 31 +++++++++++-------------------- 1 files changed, 11 insertions(+), 20 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 1d0d8c5..3b53360 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -173,7 +173,7 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link #ifdef CONFIG_NET_POLL_CONTROLLER static cpumask_var_t netpoll_block_tx; -static inline void block_netpoll_tx(void +static inline void block_netpoll_tx(void) { preempt_disable(); BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(), @@ -209,9 +209,6 @@ static int arp_ip_count; static int bond_mode = BOND_MODE_ROUNDROBIN; static int xmit_hashtype = BOND_XMIT_POLICY_LAYER2; static int lacp_fast; -#ifdef CONFIG_NET_POLL_CONTROLLER -static int disable_netpoll = 1; -#endif const struct bond_parm_tbl bond_lacp_tbl[] = { { "slow", AD_LACP_SLOW}, @@ -1836,19 +1833,15 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) bond_set_carrier(bond); #ifdef CONFIG_NET_POLL_CONTROLLER - if (disable_netpoll) { + if (slaves_support_netpoll(bond_dev)) { + bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL; + if (bond_dev->npinfo) + slave_dev->npinfo = bond_dev->npinfo; + } else if (!(bond_dev->priv_flags & IFF_DISABLE_NETPOLL)) { bond_dev->priv_flags |= IFF_DISABLE_NETPOLL; - } else { - if (slaves_support_netpoll(bond_dev)) { - bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL; - if (bond_dev->npinfo) - slave_dev->npinfo = bond_dev->npinfo; - } else if (!(bond_dev->priv_flags & IFF_DISABLE_NETPOLL)) { - bond_dev->priv_flags |= IFF_DISABLE_NETPOLL; - pr_info("New slave device %s does not support netpoll\n", - slave_dev->name); - pr_info("Disabling netpoll support for %s\n", bond_dev->name); - } + pr_info("New slave device %s does not support netpoll\n", + slave_dev->name); + pr_info("Disabling netpoll support for %s\n", bond_dev->name); } #endif read_unlock(&bond->lock); @@ -2055,10 +2048,8 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) #ifdef CONFIG_NET_POLL_CONTROLLER read_lock_bh(&bond->lock); - /* Make sure netpoll over stays disabled until fixed. */ - if (!disable_netpoll) - if (slaves_support_netpoll(bond_dev)) - bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL; + if (slaves_support_netpoll(bond_dev)) + bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL; read_unlock_bh(&bond->lock); if (slave_dev->netdev_ops->ndo_netpoll_cleanup) slave_dev->netdev_ops->ndo_netpoll_cleanup(slave_dev); -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v3) @ 2010-10-14 2:01 nhorman 2010-10-14 2:01 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman 0 siblings, 1 reply; 11+ messages in thread From: nhorman @ 2010-10-14 2:01 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman Version 3, taking the following changes into account: 1) Gospo noted an oops on his test system, which arose from a use after free of the cpumask_var_t that I allocate for use in tx blocking. Easy fix, just alloc the cpumask first, and free it last in the module init/exit routines Summary: A while ago we tried to enable netpoll on the bonding driver to enable netconsole. That worked well in a steady state, but deadlocked frequently in failover conditions due to some recursive lock-taking (as well as a few other problems). I've gone through the driver, netconsole and netpoll code, fixed up those deadlocks, and confirmed that, with this patch series, we can use netconsole on bonding without deadlock in all bonding modes with all slaves, even accross failovers. I've also fixed up some incidental bugs that I ran across while looking through this code, as described in individual patches Signed-off-by: Neil Horman <nhorman@tuxdriver.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll 2010-10-14 2:01 [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v3) nhorman @ 2010-10-14 2:01 ` nhorman 0 siblings, 0 replies; 11+ messages in thread From: nhorman @ 2010-10-14 2:01 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman From: Neil Horman <nhorman@tuxdriver.com> The monitoring paths in the bonding driver take write locks that are shared by the tx path. If netconsole is in use, these paths can call printk which puts us in the netpoll tx path, which, if netconsole is attached to the bonding driver, result in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the monitor paths in the bonding driver don't claim the xmit_lock, nor should they). The solution is to use a per cpu flag internal to the driver to indicate when a cpu is holding the lock in a path that might recusrse into the tx path for the driver via netconsole. By checking this flag on transmit, we can defer the sending of the netconsole frames until a later time using the retransmit feature of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY. I've tested this and am able to transmit via netconsole while causing failover conditions on the bond slave links. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/bonding/bond_main.c | 46 ++++++++++++++++++++++++++++++++++--- drivers/net/bonding/bond_sysfs.c | 8 ++++++ drivers/net/bonding/bonding.h | 30 ++++++++++++++++++++++++ 3 files changed, 80 insertions(+), 4 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index eb7d089..8868a51 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -76,6 +76,7 @@ #include <linux/if_vlan.h> #include <linux/if_bonding.h> #include <linux/jiffies.h> +#include <linux/preempt.h> #include <net/route.h> #include <net/net_namespace.h> #include <net/netns/generic.h> @@ -169,6 +170,10 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link /*----------------------------- Global variables ----------------------------*/ +#ifdef CONFIG_NET_POLL_CONTROLLER +cpumask_var_t netpoll_block_tx; +#endif + static const char * const version = DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n"; @@ -310,6 +315,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id) pr_debug("bond: %s, vlan id %d\n", bond->dev->name, vlan_id); + block_netpoll_tx(); write_lock_bh(&bond->lock); list_for_each_entry(vlan, &bond->vlan_list, vlan_list) { @@ -344,6 +350,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id) out: write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return res; } @@ -1804,10 +1811,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) bond_set_carrier(bond); #ifdef CONFIG_NET_POLL_CONTROLLER - /* - * Netpoll and bonding is broken, make sure it is not initialized - * until it is fixed. - */ if (disable_netpoll) { bond_dev->priv_flags |= IFF_DISABLE_NETPOLL; } else { @@ -1892,6 +1895,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) return -EINVAL; } + block_netpoll_tx(); netdev_bonding_change(bond_dev, NETDEV_BONDING_DESLAVE); write_lock_bh(&bond->lock); @@ -1901,6 +1905,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) pr_info("%s: %s not enslaved\n", bond_dev->name, slave_dev->name); write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return -EINVAL; } @@ -1994,6 +1999,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) } write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); /* must do this from outside any spinlocks */ bond_destroy_slave_symlinks(bond_dev, slave_dev); @@ -2085,6 +2091,7 @@ static int bond_release_all(struct net_device *bond_dev) struct net_device *slave_dev; struct sockaddr addr; + block_netpoll_tx(); write_lock_bh(&bond->lock); netif_carrier_off(bond_dev); @@ -2183,6 +2190,7 @@ static int bond_release_all(struct net_device *bond_dev) out: write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return 0; } @@ -2232,9 +2240,11 @@ static int bond_ioctl_change_active(struct net_device *bond_dev, struct net_devi (old_active) && (new_active->link == BOND_LINK_UP) && IS_UP(new_active->dev)) { + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_change_active_slave(bond, new_active); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } else res = -EINVAL; @@ -2466,9 +2476,11 @@ static void bond_miimon_commit(struct bonding *bond) do_failover: ASSERT_RTNL(); + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } bond_set_carrier(bond); @@ -2911,11 +2923,13 @@ void bond_loadbalance_arp_mon(struct work_struct *work) } if (do_failover) { + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } re_arm: @@ -3074,9 +3088,11 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks) do_failover: ASSERT_RTNL(); + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } bond_set_carrier(bond); @@ -4564,6 +4580,13 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct bonding *bond = netdev_priv(dev); + /* + * If we risk deadlock from transmitting this in the + * netpoll path, tell netpoll to queue the frame for later tx + */ + if (is_netpoll_tx_blocked(dev)) + return NETDEV_TX_BUSY; + if (TX_QUEUE_OVERRIDE(bond->params.mode)) { if (!bond_slave_override(bond, skb)) return NETDEV_TX_OK; @@ -5277,6 +5300,13 @@ static int __init bonding_init(void) if (res) goto out; +#ifdef CONFIG_NET_POLL_CONTROLLER + if (!alloc_cpumask_var(&netpoll_block_tx, GFP_KERNEL)) { + res = -ENOMEM; + goto out; + } +#endif + res = register_pernet_subsys(&bond_net_ops); if (res) goto out; @@ -5295,6 +5325,7 @@ static int __init bonding_init(void) if (res) goto err; + register_netdevice_notifier(&bond_netdev_notifier); register_inetaddr_notifier(&bond_inetaddr_notifier); bond_register_ipv6_notifier(); @@ -5304,6 +5335,9 @@ err: rtnl_link_unregister(&bond_link_ops); err_link: unregister_pernet_subsys(&bond_net_ops); +#ifdef CONFIG_NET_POLL_CONTROLLER + free_cpumask_var(netpoll_block_tx); +#endif goto out; } @@ -5318,6 +5352,10 @@ static void __exit bonding_exit(void) rtnl_link_unregister(&bond_link_ops); unregister_pernet_subsys(&bond_net_ops); + +#ifdef CONFIG_NET_POLL_CONTROLLER + free_cpumask_var(netpoll_block_tx); +#endif } module_init(bonding_init); diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c index 01b4c3f..8fd0174 100644 --- a/drivers/net/bonding/bond_sysfs.c +++ b/drivers/net/bonding/bond_sysfs.c @@ -1066,6 +1066,7 @@ static ssize_t bonding_store_primary(struct device *d, if (!rtnl_trylock()) return restart_syscall(); + block_netpoll_tx(); read_lock(&bond->lock); write_lock_bh(&bond->curr_slave_lock); @@ -1101,6 +1102,7 @@ static ssize_t bonding_store_primary(struct device *d, out: write_unlock_bh(&bond->curr_slave_lock); read_unlock(&bond->lock); + unblock_netpoll_tx(); rtnl_unlock(); return count; @@ -1146,11 +1148,13 @@ static ssize_t bonding_store_primary_reselect(struct device *d, bond->dev->name, pri_reselect_tbl[new_value].modename, new_value); + block_netpoll_tx(); read_lock(&bond->lock); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); read_unlock(&bond->lock); + unblock_netpoll_tx(); out: rtnl_unlock(); return ret; @@ -1232,6 +1236,8 @@ static ssize_t bonding_store_active_slave(struct device *d, if (!rtnl_trylock()) return restart_syscall(); + + block_netpoll_tx(); read_lock(&bond->lock); write_lock_bh(&bond->curr_slave_lock); @@ -1288,6 +1294,8 @@ static ssize_t bonding_store_active_slave(struct device *d, out: write_unlock_bh(&bond->curr_slave_lock); read_unlock(&bond->lock); + unblock_netpoll_tx(); + rtnl_unlock(); return count; diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h index c15f213..deef1aa 100644 --- a/drivers/net/bonding/bonding.h +++ b/drivers/net/bonding/bonding.h @@ -19,6 +19,7 @@ #include <linux/proc_fs.h> #include <linux/if_bonding.h> #include <linux/kobject.h> +#include <linux/cpumask.h> #include <linux/in6.h> #include "bond_3ad.h" #include "bond_alb.h" @@ -117,6 +118,35 @@ bond_for_each_slave_from(bond, pos, cnt, (bond)->first_slave) +#ifdef CONFIG_NET_POLL_CONTROLLER +extern cpumask_var_t netpoll_block_tx; + +static inline void block_netpoll_tx(void) +{ + preempt_disable(); + BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(), + netpoll_block_tx)); +} + +static inline void unblock_netpoll_tx(void) +{ + BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(), + netpoll_block_tx)); + preempt_enable(); +} + +static inline int is_netpoll_tx_blocked(struct net_device *dev) +{ + if (unlikely(dev->priv_flags & IFF_IN_NETPOLL)) + return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx); + return 0; +} +#else +#define block_netpoll_tx() +#define unblock_netpoll_tx() +#define is_netpoll_tx_blocked(dev) +#endif + struct bond_params { int mode; int xmit_policy; -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v2) @ 2010-10-13 12:35 nhorman 2010-10-13 12:35 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman 0 siblings, 1 reply; 11+ messages in thread From: nhorman @ 2010-10-13 12:35 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman Version 2, taking teh following changes into account: 1) Moved tx blocking/checking macros to netpoll.h as suggested by amwang 2) Added tx blocking macro calls to sysfs paths, as they can deadlock in the same way that the link monitoring paths can. Summary: A while ago we tried to enable netpoll on the bonding driver to enable netconsole. That worked well in a steady state, but deadlocked frequently in failover conditions due to some recursive lock-taking (as well as a few other problems). I've gone through the driver, netconsole and netpoll code, fixed up those deadlocks, and confirmed that, with this patch series, we can use netconsole on bonding without deadlock in all bonding modes with all slaves, even accross failovers. I've also fixed up some incidental bugs that I ran across while looking through this code, as described in individual patches Signed-off-by: Neil Horman <nhorman@tuxdriver.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll 2010-10-13 12:35 [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v2) nhorman @ 2010-10-13 12:35 ` nhorman 0 siblings, 0 replies; 11+ messages in thread From: nhorman @ 2010-10-13 12:35 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman From: Neil Horman <nhorman@tuxdriver.com> The monitoring paths in the bonding driver take write locks that are shared by the tx path. If netconsole is in use, these paths can call printk which puts us in the netpoll tx path, which, if netconsole is attached to the bonding driver, result in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the monitor paths in the bonding driver don't claim the xmit_lock, nor should they). The solution is to use a per cpu flag internal to the driver to indicate when a cpu is holding the lock in a path that might recusrse into the tx path for the driver via netconsole. By checking this flag on transmit, we can defer the sending of the netconsole frames until a later time using the retransmit feature of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY. I've tested this and am able to transmit via netconsole while causing failover conditions on the bond slave links. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/bonding/bond_main.c | 42 ++++++++++++++++++++++++++++++++++--- drivers/net/bonding/bond_sysfs.c | 8 +++++++ drivers/net/bonding/bonding.h | 30 +++++++++++++++++++++++++++ 3 files changed, 76 insertions(+), 4 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index eb7d089..bc38d69 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -76,6 +76,7 @@ #include <linux/if_vlan.h> #include <linux/if_bonding.h> #include <linux/jiffies.h> +#include <linux/preempt.h> #include <net/route.h> #include <net/net_namespace.h> #include <net/netns/generic.h> @@ -169,6 +170,10 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link /*----------------------------- Global variables ----------------------------*/ +#ifdef CONFIG_NET_POLL_CONTROLLER +cpumask_var_t netpoll_block_tx; +#endif + static const char * const version = DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n"; @@ -310,6 +315,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id) pr_debug("bond: %s, vlan id %d\n", bond->dev->name, vlan_id); + block_netpoll_tx(); write_lock_bh(&bond->lock); list_for_each_entry(vlan, &bond->vlan_list, vlan_list) { @@ -344,6 +350,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id) out: write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return res; } @@ -1804,10 +1811,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) bond_set_carrier(bond); #ifdef CONFIG_NET_POLL_CONTROLLER - /* - * Netpoll and bonding is broken, make sure it is not initialized - * until it is fixed. - */ if (disable_netpoll) { bond_dev->priv_flags |= IFF_DISABLE_NETPOLL; } else { @@ -1892,6 +1895,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) return -EINVAL; } + block_netpoll_tx(); netdev_bonding_change(bond_dev, NETDEV_BONDING_DESLAVE); write_lock_bh(&bond->lock); @@ -1901,6 +1905,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) pr_info("%s: %s not enslaved\n", bond_dev->name, slave_dev->name); write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return -EINVAL; } @@ -1994,6 +1999,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) } write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); /* must do this from outside any spinlocks */ bond_destroy_slave_symlinks(bond_dev, slave_dev); @@ -2085,6 +2091,7 @@ static int bond_release_all(struct net_device *bond_dev) struct net_device *slave_dev; struct sockaddr addr; + block_netpoll_tx(); write_lock_bh(&bond->lock); netif_carrier_off(bond_dev); @@ -2183,6 +2190,7 @@ static int bond_release_all(struct net_device *bond_dev) out: write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return 0; } @@ -2232,9 +2240,11 @@ static int bond_ioctl_change_active(struct net_device *bond_dev, struct net_devi (old_active) && (new_active->link == BOND_LINK_UP) && IS_UP(new_active->dev)) { + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_change_active_slave(bond, new_active); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } else res = -EINVAL; @@ -2466,9 +2476,11 @@ static void bond_miimon_commit(struct bonding *bond) do_failover: ASSERT_RTNL(); + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } bond_set_carrier(bond); @@ -2911,11 +2923,13 @@ void bond_loadbalance_arp_mon(struct work_struct *work) } if (do_failover) { + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } re_arm: @@ -3074,9 +3088,11 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks) do_failover: ASSERT_RTNL(); + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } bond_set_carrier(bond); @@ -4564,6 +4580,13 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct bonding *bond = netdev_priv(dev); + /* + * If we risk deadlock from transmitting this in the + * netpoll path, tell netpoll to queue the frame for later tx + */ + if (is_netpoll_tx_blocked(dev)) + return NETDEV_TX_BUSY; + if (TX_QUEUE_OVERRIDE(bond->params.mode)) { if (!bond_slave_override(bond, skb)) return NETDEV_TX_OK; @@ -5295,6 +5318,13 @@ static int __init bonding_init(void) if (res) goto err; +#ifdef CONFIG_NET_POLL_CONTROLLER + if (!alloc_cpumask_var(&netpoll_block_tx, GFP_KERNEL)) { + res = -ENOMEM; + bond_destroy_sysfs(); + goto err; + } +#endif register_netdevice_notifier(&bond_netdev_notifier); register_inetaddr_notifier(&bond_inetaddr_notifier); bond_register_ipv6_notifier(); @@ -5316,6 +5346,10 @@ static void __exit bonding_exit(void) bond_destroy_sysfs(); +#ifdef CONFIG_NET_POLL_CONTROLLER + free_cpumask_var(netpoll_block_tx); +#endif + rtnl_link_unregister(&bond_link_ops); unregister_pernet_subsys(&bond_net_ops); } diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c index 01b4c3f..8fd0174 100644 --- a/drivers/net/bonding/bond_sysfs.c +++ b/drivers/net/bonding/bond_sysfs.c @@ -1066,6 +1066,7 @@ static ssize_t bonding_store_primary(struct device *d, if (!rtnl_trylock()) return restart_syscall(); + block_netpoll_tx(); read_lock(&bond->lock); write_lock_bh(&bond->curr_slave_lock); @@ -1101,6 +1102,7 @@ static ssize_t bonding_store_primary(struct device *d, out: write_unlock_bh(&bond->curr_slave_lock); read_unlock(&bond->lock); + unblock_netpoll_tx(); rtnl_unlock(); return count; @@ -1146,11 +1148,13 @@ static ssize_t bonding_store_primary_reselect(struct device *d, bond->dev->name, pri_reselect_tbl[new_value].modename, new_value); + block_netpoll_tx(); read_lock(&bond->lock); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); read_unlock(&bond->lock); + unblock_netpoll_tx(); out: rtnl_unlock(); return ret; @@ -1232,6 +1236,8 @@ static ssize_t bonding_store_active_slave(struct device *d, if (!rtnl_trylock()) return restart_syscall(); + + block_netpoll_tx(); read_lock(&bond->lock); write_lock_bh(&bond->curr_slave_lock); @@ -1288,6 +1294,8 @@ static ssize_t bonding_store_active_slave(struct device *d, out: write_unlock_bh(&bond->curr_slave_lock); read_unlock(&bond->lock); + unblock_netpoll_tx(); + rtnl_unlock(); return count; diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h index c15f213..deef1aa 100644 --- a/drivers/net/bonding/bonding.h +++ b/drivers/net/bonding/bonding.h @@ -19,6 +19,7 @@ #include <linux/proc_fs.h> #include <linux/if_bonding.h> #include <linux/kobject.h> +#include <linux/cpumask.h> #include <linux/in6.h> #include "bond_3ad.h" #include "bond_alb.h" @@ -117,6 +118,35 @@ bond_for_each_slave_from(bond, pos, cnt, (bond)->first_slave) +#ifdef CONFIG_NET_POLL_CONTROLLER +extern cpumask_var_t netpoll_block_tx; + +static inline void block_netpoll_tx(void) +{ + preempt_disable(); + BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(), + netpoll_block_tx)); +} + +static inline void unblock_netpoll_tx(void) +{ + BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(), + netpoll_block_tx)); + preempt_enable(); +} + +static inline int is_netpoll_tx_blocked(struct net_device *dev) +{ + if (unlikely(dev->priv_flags & IFF_IN_NETPOLL)) + return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx); + return 0; +} +#else +#define block_netpoll_tx() +#define unblock_netpoll_tx() +#define is_netpoll_tx_blocked(dev) +#endif + struct bond_params { int mode; int xmit_policy; -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH] bonding: various fixes for bonding, netpoll & netconsole @ 2010-10-12 20:29 nhorman 2010-10-12 20:29 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman 0 siblings, 1 reply; 11+ messages in thread From: nhorman @ 2010-10-12 20:29 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman Sorry if this is a resend for some, a bad git-send-email config caused some rejections, so I'm resending. A few months back, an attempt was made to enable netpoll over bonding, so that netconsole could be used over bonded interfaces. This worked in the steady state, but had several deadlocks in various failover conditions. I've gone through the bonding code, and fixed up those deadlocks, along with several other problems noted along the way, which caused other issues with netpoll+bonding. With this patch series, netpoll works with bonding in all modes accross all slaves during failover conditions. It also allows fixes some deadlock conditions in the netconsole code itself. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll 2010-10-12 20:29 [PATCH] bonding: various fixes for bonding, netpoll & netconsole nhorman @ 2010-10-12 20:29 ` nhorman 0 siblings, 0 replies; 11+ messages in thread From: nhorman @ 2010-10-12 20:29 UTC (permalink / raw) To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman From: Neil Horman <nhorman@tuxdriver.com> The monitoring paths in the bonding driver take write locks that are shared by the tx path. If netconsole is in use, these paths can call printk which puts us in the netpoll tx path, which, if netconsole is attached to the bonding driver, result in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the monitor paths in the bonding driver don't claim the xmit_lock, nor should they). The solution is to use a per cpu flag internal to the driver to indicate when a cpu is holding the lock in a path that might recusrse into the tx path for the driver via netconsole. By checking this flag on transmit, we can defer the sending of the netconsole frames until a later time using the retransmit feature of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY. I've tested this and am able to transmit via netconsole while causing failover conditions on the bond slave links. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/bonding/bond_main.c | 67 ++++++++++++++++++++++++++++++++++++-- 1 files changed, 63 insertions(+), 4 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index eb7d089..1d0d8c5 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -76,6 +76,7 @@ #include <linux/if_vlan.h> #include <linux/if_bonding.h> #include <linux/jiffies.h> +#include <linux/preempt.h> #include <net/route.h> #include <net/net_namespace.h> #include <net/netns/generic.h> @@ -169,6 +170,35 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link /*----------------------------- Global variables ----------------------------*/ +#ifdef CONFIG_NET_POLL_CONTROLLER +static cpumask_var_t netpoll_block_tx; + +static inline void block_netpoll_tx(void +{ + preempt_disable(); + BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(), + netpoll_block_tx)); +} + +static inline void unblock_netpoll_tx(void) +{ + BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(), + netpoll_block_tx)); + preempt_enable(); +} + +static inline int is_netpoll_tx_blocked(struct net_device *dev) +{ + if (unlikely(dev->priv_flags & IFF_IN_NETPOLL)) + return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx); + return 0; +} +#else +#define block_netpoll_tx() +#define unblock_netpoll_tx() +#define is_netpoll_tx_blocked(dev) +#endif + static const char * const version = DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n"; @@ -310,6 +340,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id) pr_debug("bond: %s, vlan id %d\n", bond->dev->name, vlan_id); + block_netpoll_tx(); write_lock_bh(&bond->lock); list_for_each_entry(vlan, &bond->vlan_list, vlan_list) { @@ -344,6 +375,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id) out: write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return res; } @@ -1804,10 +1836,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) bond_set_carrier(bond); #ifdef CONFIG_NET_POLL_CONTROLLER - /* - * Netpoll and bonding is broken, make sure it is not initialized - * until it is fixed. - */ if (disable_netpoll) { bond_dev->priv_flags |= IFF_DISABLE_NETPOLL; } else { @@ -1892,6 +1920,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) return -EINVAL; } + block_netpoll_tx(); netdev_bonding_change(bond_dev, NETDEV_BONDING_DESLAVE); write_lock_bh(&bond->lock); @@ -1901,6 +1930,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) pr_info("%s: %s not enslaved\n", bond_dev->name, slave_dev->name); write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return -EINVAL; } @@ -1994,6 +2024,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) } write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); /* must do this from outside any spinlocks */ bond_destroy_slave_symlinks(bond_dev, slave_dev); @@ -2085,6 +2116,7 @@ static int bond_release_all(struct net_device *bond_dev) struct net_device *slave_dev; struct sockaddr addr; + block_netpoll_tx(); write_lock_bh(&bond->lock); netif_carrier_off(bond_dev); @@ -2183,6 +2215,7 @@ static int bond_release_all(struct net_device *bond_dev) out: write_unlock_bh(&bond->lock); + unblock_netpoll_tx(); return 0; } @@ -2232,9 +2265,11 @@ static int bond_ioctl_change_active(struct net_device *bond_dev, struct net_devi (old_active) && (new_active->link == BOND_LINK_UP) && IS_UP(new_active->dev)) { + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_change_active_slave(bond, new_active); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } else res = -EINVAL; @@ -2466,9 +2501,11 @@ static void bond_miimon_commit(struct bonding *bond) do_failover: ASSERT_RTNL(); + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } bond_set_carrier(bond); @@ -2911,11 +2948,13 @@ void bond_loadbalance_arp_mon(struct work_struct *work) } if (do_failover) { + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } re_arm: @@ -3074,9 +3113,11 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks) do_failover: ASSERT_RTNL(); + block_netpoll_tx(); write_lock_bh(&bond->curr_slave_lock); bond_select_active_slave(bond); write_unlock_bh(&bond->curr_slave_lock); + unblock_netpoll_tx(); } bond_set_carrier(bond); @@ -4564,6 +4605,13 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct bonding *bond = netdev_priv(dev); + /* + * If we risk deadlock from transmitting this in the + * netpoll path, tell netpoll to queue the frame for later tx + */ + if (is_netpoll_tx_blocked(dev)) + return NETDEV_TX_BUSY; + if (TX_QUEUE_OVERRIDE(bond->params.mode)) { if (!bond_slave_override(bond, skb)) return NETDEV_TX_OK; @@ -5295,6 +5343,13 @@ static int __init bonding_init(void) if (res) goto err; +#ifdef CONFIG_NET_POLL_CONTROLLER + if (!alloc_cpumask_var(&netpoll_block_tx, GFP_KERNEL)) { + res = -ENOMEM; + bond_destroy_sysfs(); + goto err; + } +#endif register_netdevice_notifier(&bond_netdev_notifier); register_inetaddr_notifier(&bond_inetaddr_notifier); bond_register_ipv6_notifier(); @@ -5316,6 +5371,10 @@ static void __exit bonding_exit(void) bond_destroy_sysfs(); +#ifdef CONFIG_NET_POLL_CONTROLLER + free_cpumask_var(netpoll_block_tx); +#endif + rtnl_link_unregister(&bond_link_ops); unregister_pernet_subsys(&bond_net_ops); } -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-10-14 2:02 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-10-12 21:55 [PATCH] bonding: various fixes for bonding, netpoll & netconsole nhorman 2010-10-12 21:55 ` [PATCH 1/5] Fix bonding drivers improper modification of netpoll structure nhorman 2010-10-12 21:55 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman 2010-10-13 2:44 ` Cong Wang 2010-10-13 10:51 ` Neil Horman 2010-10-12 21:55 ` [PATCH 3/5] Fix napi poll for bonding driver nhorman 2010-10-12 21:55 ` [PATCH 4/5] Fix netconsole to not deadlock on rmmod nhorman 2010-10-12 21:55 ` [PATCH 5/5] Re-enable netpoll over bonding nhorman -- strict thread matches above, loose matches on Subject: below -- 2010-10-14 2:01 [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v3) nhorman 2010-10-14 2:01 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman 2010-10-13 12:35 [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v2) nhorman 2010-10-13 12:35 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman 2010-10-12 20:29 [PATCH] bonding: various fixes for bonding, netpoll & netconsole nhorman 2010-10-12 20:29 ` [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll nhorman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).