netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
@ 2014-12-09 14:37 Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 01/11] netpoll: introduce netpoll_irq_lock to protect netpoll controller against interrupts Sabrina Dubroca
                   ` (11 more replies)
  0 siblings, 12 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Peter Zijlstra, Thomas Gleixner

In commit e22b886a8a43b ("sched/wait: Add might_sleep() checks"),
Peter Zijlstra added a check that fires on netpoll controllers that
use disable_irq().

There are 60 drivers currently using disable_irq() in their netpoll
controller.  A lot of these simply disable irqs before calling their
interrupt handler.  A few do something a little more complex, often
with multiple irqs.

Thomas Gleixner and Peter Zijlstra suggested the idea of a adding a
spinlock in the interrupt handler and submitted the initial patch for
this: https://lkml.org/lkml/2014/10/29/742

There are three groups of drivers/netpoll controllers that need to be
fixed:
  1) drivers that simply disable_irq()
     a) interrupt handler takes a lock
        examples: 8139cp/8139too
        The disable_irq() call is not necessary, we can just remove it
        and rely on the lock in the interrupt handler.
     b) interrupt handler doesn't take a lock
        examples: e1000, atl1c
        Add locking to the interrupt handler, remove the disable_irq() call.
  2) other drivers
     These are the other drivers whose netpoll controller use disable_irq:
     drivers/net/ethernet/nvidia/forcedeth.c
     drivers/net/ethernet/broadcom/bnx2.c
     drivers/net/ethernet/neterion/s2io.c
     drivers/net/ethernet/neterion/vxge/vxge-main.c
     drivers/net/ethernet/silan/sc92031.c
     drivers/net/ethernet/freescale/fec_main.c
     drivers/net/ethernet/freescale/gianfar.c
     drivers/net/ethernet/pasemi/pasemi_mac.c
     drivers/net/ethernet/xilinx/ll_temac_main.c
     drivers/net/ethernet/xilinx/xilinx_axienet_main.c
     Additionally, igb calls synchronize_irq()
     in igb_netpoll -> igb_irq_disable.


These patches provide new helpers to lock the interrupt handler
against netpoll controller, and convert a few drivers as examples for
each group.  I will take care of the rest if this RFC passes the
review.

e1000 and 8139cp have been tested on QEMU, the rest has only been
compiled.  The changes to drivers in group 2) are more complex, and
need more careful review.


Note: I'm not sure whether this should go into net-next or net.  For
now, since the check was only in linux-next, I based the patches on
net-next.  But Peter and Thomas said:

>> But I suspect most all the network drivers will need this and maybe
>> more, disable_irq() is a popular little thing and we 'just' changed
>> semantics on them.
>
> We changed that almost 4 years ago :) What we 'just' did was to add
> a prominent warning into the code.


Please review and comment.  Thanks!


Sabrina Dubroca (11):
  netpoll: introduce netpoll_irq_lock to protect netpoll controller
    against interrupts
  e1000: remove disable_irq from netpoll controller, use
    netpoll_irq_lock
  8139cp/too: remove disable_irq from netpoll controller
  atl1c: remove disable_irq from netpoll controller, use
    netpoll_irq_lock
  bnx2: remove disable_irq from netpoll controller, use netpoll_irq_lock
  s2io: remove disable_irq from netpoll controller, use netpoll_irq_lock
  pasemi: remove disable_irq from netpoll controller, use
    netpoll_irq_lock
  ll_temac: remove disable_irq from netpoll controller, use
    netpoll_irq_lock
  xilinx/axienet: remove disable_irq from netpoll controller, use
    netpoll_irq_lock
  gianfar: remove disable_irq from netpoll controller, use
    netpoll_irq_lock
  net: fec: remove disable_irq from netpoll controller, use
    netpoll_irq_lock

 drivers/net/ethernet/atheros/atl1c/atl1c.h        |  3 ++
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c   | 12 ++++---
 drivers/net/ethernet/broadcom/bnx2.c              | 24 ++++++++++----
 drivers/net/ethernet/broadcom/bnx2.h              |  4 +++
 drivers/net/ethernet/freescale/fec.h              |  2 ++
 drivers/net/ethernet/freescale/fec_main.c         | 14 +++++++++
 drivers/net/ethernet/freescale/gianfar.c          | 38 ++++++++++-------------
 drivers/net/ethernet/freescale/gianfar.h          |  5 +++
 drivers/net/ethernet/intel/e1000/e1000.h          |  3 ++
 drivers/net/ethernet/intel/e1000/e1000_main.c     | 21 ++++++++++---
 drivers/net/ethernet/neterion/s2io.c              | 27 +++++++++-------
 drivers/net/ethernet/neterion/s2io.h              |  4 +++
 drivers/net/ethernet/pasemi/pasemi_mac.c          | 25 +++++++++------
 drivers/net/ethernet/pasemi/pasemi_mac.h          |  3 ++
 drivers/net/ethernet/realtek/8139cp.c             |  3 +-
 drivers/net/ethernet/realtek/8139too.c            |  3 +-
 drivers/net/ethernet/xilinx/ll_temac.h            |  3 ++
 drivers/net/ethernet/xilinx/ll_temac_main.c       | 13 ++++----
 drivers/net/ethernet/xilinx/xilinx_axienet.h      |  3 ++
 drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 15 +++++----
 include/linux/netpoll.h                           | 31 ++++++++++++++++++
 21 files changed, 182 insertions(+), 74 deletions(-)


Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>

-- 
2.1.3

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 01/11] netpoll: introduce netpoll_irq_lock to protect netpoll controller against interrupts
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 02/11] e1000: remove disable_irq from netpoll controller, use netpoll_irq_lock Sabrina Dubroca
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Peter Zijlstra, Thomas Gleixner

In many drivers' ->ndo_poll_controller implementation, we currently
have the following code:

static void poll_controller(struct net_device *dev)
{
        disable_irq(irq);
        intr_handler(irq, dev);
        enable_irq(irq);
}

Commit e22b886a8a43b ("sched/wait: Add might_sleep() checks") added a
might_sleep() check to synchronize_irq(). Thomas Gleixner and Peter
Zijlstra suggested the original idea and patch to replace disable_irq
in the poll controller with a spin_lock in the interrupt handler.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/netpoll.h | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index b25ee9ffdbe6..a171f1a50e0e 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -117,4 +117,35 @@ static inline bool netpoll_tx_running(struct net_device *dev)
 }
 #endif
 
+struct netpoll_irq_lock {
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	spinlock_t lock;
+#endif
+};
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static inline void netpoll_irq_lock_init(struct netpoll_irq_lock *np_lock)
+{
+	spin_lock_init(&np_lock->lock);
+}
+static inline void netpoll_irq_lock(struct netpoll_irq_lock *np_lock)
+{
+	spin_lock(&np_lock->lock);
+}
+static inline void netpoll_irq_unlock(struct netpoll_irq_lock *np_lock)
+{
+	spin_unlock(&np_lock->lock);
+}
+#else
+static inline void netpoll_irq_lock_init(struct netpoll_irq_lock *np_lock)
+{
+}
+static inline void netpoll_irq_lock(struct netpoll_irq_lock *np_lock)
+{
+}
+static inline void netpoll_irq_unlock(struct netpoll_irq_lock *np_lock)
+{
+}
+#endif
+
 #endif
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 02/11] e1000: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 01/11] netpoll: introduce netpoll_irq_lock to protect netpoll controller against interrupts Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 03/11] 8139cp/too: remove disable_irq from netpoll controller Sabrina Dubroca
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Jeff Kirsher, Linux NICS

disable_irq() may sleep, replace it with a spin_lock in the interrupt handler.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Linux NICS <linux.nics@intel.com>
---
 drivers/net/ethernet/intel/e1000/e1000.h      |  3 +++
 drivers/net/ethernet/intel/e1000/e1000_main.c | 21 ++++++++++++++++-----
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000.h b/drivers/net/ethernet/intel/e1000/e1000.h
index 69707108d23c..79444125b9bd 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -68,6 +68,7 @@
 #include <linux/mii.h>
 #include <linux/ethtool.h>
 #include <linux/if_vlan.h>
+#include <linux/netpoll.h>
 
 #define BAR_0		0
 #define BAR_1		1
@@ -323,6 +324,8 @@ struct e1000_adapter {
 	struct delayed_work watchdog_task;
 	struct delayed_work fifo_stall_task;
 	struct delayed_work phy_info_task;
+
+	struct netpoll_irq_lock netpoll_lock;
 };
 
 enum e1000_state_t {
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 24f3986cfae2..5749a27e5c5e 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -32,6 +32,7 @@
 #include <linux/prefetch.h>
 #include <linux/bitops.h>
 #include <linux/if_vlan.h>
+#include <linux/netpoll.h>
 
 char e1000_driver_name[] = "e1000";
 static char e1000_driver_string[] = "Intel(R) PRO/1000 Network Driver";
@@ -1313,6 +1314,7 @@ static int e1000_sw_init(struct e1000_adapter *adapter)
 	e1000_irq_disable(adapter);
 
 	spin_lock_init(&adapter->stats_lock);
+	netpoll_irq_lock_init(&adapter->netpoll_lock);
 
 	set_bit(__E1000_DOWN, &adapter->flags);
 
@@ -3751,10 +3753,8 @@ void e1000_update_stats(struct e1000_adapter *adapter)
  * @irq: interrupt number
  * @data: pointer to a network interface device structure
  **/
-static irqreturn_t e1000_intr(int irq, void *data)
+static irqreturn_t __e1000_intr(int irq, struct e1000_adapter *adapter)
 {
-	struct net_device *netdev = data;
-	struct e1000_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
 	u32 icr = er32(ICR);
 
@@ -3796,6 +3796,19 @@ static irqreturn_t e1000_intr(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t e1000_intr(int irq, void *data)
+{
+	struct net_device *netdev = data;
+	struct e1000_adapter *adapter = netdev_priv(netdev);
+	irqreturn_t ret;
+
+	netpoll_irq_lock(&adapter->netpoll_lock);
+	ret = __e1000_intr(irq, adapter);
+	netpoll_irq_unlock(&adapter->netpoll_lock);
+
+	return ret;
+}
+
 /**
  * e1000_clean - NAPI Rx polling callback
  * @adapter: board private structure
@@ -5220,9 +5233,7 @@ static void e1000_netpoll(struct net_device *netdev)
 {
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 
-	disable_irq(adapter->pdev->irq);
 	e1000_intr(adapter->pdev->irq, netdev);
-	enable_irq(adapter->pdev->irq);
 }
 #endif
 
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 03/11] 8139cp/too: remove disable_irq from netpoll controller
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 01/11] netpoll: introduce netpoll_irq_lock to protect netpoll controller against interrupts Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 02/11] e1000: remove disable_irq from netpoll controller, use netpoll_irq_lock Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 04/11] atl1c: remove disable_irq from netpoll controller, use netpoll_irq_lock Sabrina Dubroca
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca

disable_irq() may sleep. The interrupt handlers for 8139 take the
device lock, so no need for additional protection.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 drivers/net/ethernet/realtek/8139cp.c  | 3 +--
 drivers/net/ethernet/realtek/8139too.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 75b1693ec8bf..6a296c5c3860 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -581,6 +581,7 @@ static irqreturn_t cp_interrupt (int irq, void *dev_instance)
 		return IRQ_NONE;
 	cp = netdev_priv(dev);
 
+	/* also protects against reentrancy from cp_poll_controller */
 	spin_lock(&cp->lock);
 
 	status = cpr16(IntrStatus);
@@ -639,9 +640,7 @@ static void cp_poll_controller(struct net_device *dev)
 	struct cp_private *cp = netdev_priv(dev);
 	const int irq = cp->pdev->irq;
 
-	disable_irq(irq);
 	cp_interrupt(irq, dev);
-	enable_irq(irq);
 }
 #endif
 
diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
index 63dc0f95d050..f4437112984b 100644
--- a/drivers/net/ethernet/realtek/8139too.c
+++ b/drivers/net/ethernet/realtek/8139too.c
@@ -2162,6 +2162,7 @@ static irqreturn_t rtl8139_interrupt (int irq, void *dev_instance)
 	int link_changed = 0; /* avoid bogus "uninit" warning */
 	int handled = 0;
 
+	/* also protects against reentrancy from rtl8139_poll_controller */
 	spin_lock (&tp->lock);
 	status = RTL_R16 (IntrStatus);
 
@@ -2227,9 +2228,7 @@ static void rtl8139_poll_controller(struct net_device *dev)
 	struct rtl8139_private *tp = netdev_priv(dev);
 	const int irq = tp->pci_dev->irq;
 
-	disable_irq(irq);
 	rtl8139_interrupt(irq, dev);
-	enable_irq(irq);
 }
 #endif
 
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 04/11] atl1c: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (2 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 03/11] 8139cp/too: remove disable_irq from netpoll controller Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
       [not found]   ` <CAMXMK6t5NfPQFBxK1Qny45LCS6rwX4Ys1n4C7fsTPHXu=x_vuQ@mail.gmail.com>
  2014-12-09 14:37 ` [RFC PATCH net-next 05/11] bnx2: " Sabrina Dubroca
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Jay Cliburn, Chris Snook

disable_irq() may sleep, replace it with a spin_lock in the interrupt handler.

No actual testing done, only compiled.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Jay Cliburn <jcliburn@gmail.com>
Cc: Chris Snook <chris.snook@gmail.com>
---
 drivers/net/ethernet/atheros/atl1c/atl1c.h      |  3 +++
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 12 ++++++++----
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c.h b/drivers/net/ethernet/atheros/atl1c/atl1c.h
index b9203d928938..8d97791e1516 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c.h
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c.h
@@ -49,6 +49,7 @@
 #include <linux/workqueue.h>
 #include <net/checksum.h>
 #include <net/ip6_checksum.h>
+#include <linux/netpoll.h>
 
 #include "atl1c_hw.h"
 
@@ -555,6 +556,8 @@ struct atl1c_adapter {
 	struct atl1c_rfd_ring rfd_ring;
 	struct atl1c_rrd_ring rrd_ring;
 	u32 bd_number;     /* board number;*/
+
+	struct netpoll_irq_lock netpoll_lock;
 };
 
 #define AT_WRITE_REG(a, reg, value) ( \
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 72fb86b9aa24..7a1b11eb8e4e 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -826,6 +826,7 @@ static int atl1c_sw_init(struct atl1c_adapter *adapter)
 	atomic_set(&adapter->irq_sem, 1);
 	spin_lock_init(&adapter->mdio_lock);
 	spin_lock_init(&adapter->tx_lock);
+	netpoll_irq_lock_init(&adapter->netpoll_lock);
 	set_bit(__AT_DOWN, &adapter->flags);
 
 	return 0;
@@ -1584,10 +1585,11 @@ static irqreturn_t atl1c_intr(int irq, void *data)
 	struct pci_dev *pdev = adapter->pdev;
 	struct atl1c_hw *hw = &adapter->hw;
 	int max_ints = AT_MAX_INT_WORK;
-	int handled = IRQ_NONE;
+	irqreturn_t handled = IRQ_NONE;
 	u32 status;
 	u32 reg_data;
 
+	netpoll_irq_lock(&adapter->netpoll_lock);
 	do {
 		AT_READ_REG(hw, REG_ISR, &reg_data);
 		status = reg_data & hw->intr_mask;
@@ -1622,7 +1624,8 @@ static irqreturn_t atl1c_intr(int irq, void *data)
 			/* reset MAC */
 			set_bit(ATL1C_WORK_EVENT_RESET, &adapter->work_event);
 			schedule_work(&adapter->common_task);
-			return IRQ_HANDLED;
+			handled = IRQ_HANDLED;
+			goto out;
 		}
 
 		if (status & ISR_OVER)
@@ -1641,6 +1644,9 @@ static irqreturn_t atl1c_intr(int irq, void *data)
 	} while (--max_ints > 0);
 	/* re-enable Interrupt*/
 	AT_WRITE_REG(&adapter->hw, REG_ISR, 0);
+
+out:
+	netpoll_irq_unlock(&adapter->netpoll_lock);
 	return handled;
 }
 
@@ -1900,9 +1906,7 @@ static void atl1c_netpoll(struct net_device *netdev)
 {
 	struct atl1c_adapter *adapter = netdev_priv(netdev);
 
-	disable_irq(adapter->pdev->irq);
 	atl1c_intr(adapter->pdev->irq, netdev);
-	enable_irq(adapter->pdev->irq);
 }
 #endif
 
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 05/11] bnx2: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (3 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 04/11] atl1c: remove disable_irq from netpoll controller, use netpoll_irq_lock Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 06/11] s2io: " Sabrina Dubroca
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Sony Chacko, Dept-HSGLinuxNICDev

disable_irq() may sleep, replace it with a spin_lock in the interrupt
handlers.

No actual testing done, only compiled.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Sony Chacko <sony.chacko@qlogic.com>
Cc: Dept-HSGLinuxNICDev@qlogic.com
---
 drivers/net/ethernet/broadcom/bnx2.c | 24 ++++++++++++++++++------
 drivers/net/ethernet/broadcom/bnx2.h |  4 ++++
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 823d01c5684c..675d90849d10 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -39,6 +39,7 @@
 #include <linux/mii.h>
 #include <linux/if.h>
 #include <linux/if_vlan.h>
+#include <linux/netpoll.h>
 #include <net/ip.h>
 #include <net/tcp.h>
 #include <net/checksum.h>
@@ -864,6 +865,7 @@ bnx2_alloc_mem(struct bnx2 *bp)
 		&bnapi->status_blk.msi->status_tx_quick_consumer_index0;
 	bnapi->hw_rx_cons_ptr =
 		&bnapi->status_blk.msi->status_rx_quick_consumer_index0;
+	netpoll_irq_lock_init(&bnapi->netpoll_lock);
 	if (bp->flags & BNX2_FLAG_MSIX_CAP) {
 		for (i = 1; i < bp->irq_nvecs; i++) {
 			struct status_block_msix *sblk;
@@ -876,6 +878,7 @@ bnx2_alloc_mem(struct bnx2 *bp)
 				&sblk->status_tx_quick_consumer_index;
 			bnapi->hw_rx_cons_ptr =
 				&sblk->status_rx_quick_consumer_index;
+			netpoll_irq_lock_init(&bnapi->netpoll_lock);
 			bnapi->int_num = i << 24;
 		}
 	}
@@ -3302,6 +3305,7 @@ bnx2_msi(int irq, void *dev_instance)
 	struct bnx2_napi *bnapi = dev_instance;
 	struct bnx2 *bp = bnapi->bp;
 
+	netpoll_irq_lock(&bnapi->netpoll_lock);
 	prefetch(bnapi->status_blk.msi);
 	BNX2_WR(bp, BNX2_PCICFG_INT_ACK_CMD,
 		BNX2_PCICFG_INT_ACK_CMD_USE_INT_HC_PARAM |
@@ -3309,10 +3313,12 @@ bnx2_msi(int irq, void *dev_instance)
 
 	/* Return here if interrupt is disabled. */
 	if (unlikely(atomic_read(&bp->intr_sem) != 0))
-		return IRQ_HANDLED;
+		goto out;
 
 	napi_schedule(&bnapi->napi);
 
+out:
+	netpoll_irq_unlock(&bnapi->netpoll_lock);
 	return IRQ_HANDLED;
 }
 
@@ -3322,14 +3328,17 @@ bnx2_msi_1shot(int irq, void *dev_instance)
 	struct bnx2_napi *bnapi = dev_instance;
 	struct bnx2 *bp = bnapi->bp;
 
+	netpoll_irq_lock(&bnapi->netpoll_lock);
 	prefetch(bnapi->status_blk.msi);
 
 	/* Return here if interrupt is disabled. */
 	if (unlikely(atomic_read(&bp->intr_sem) != 0))
-		return IRQ_HANDLED;
+		goto out;
 
 	napi_schedule(&bnapi->napi);
 
+out:
+	netpoll_irq_unlock(&bnapi->netpoll_lock);
 	return IRQ_HANDLED;
 }
 
@@ -3340,6 +3349,7 @@ bnx2_interrupt(int irq, void *dev_instance)
 	struct bnx2 *bp = bnapi->bp;
 	struct status_block *sblk = bnapi->status_blk.msi;
 
+	netpoll_irq_lock(&bnapi->netpoll_lock);
 	/* When using INTx, it is possible for the interrupt to arrive
 	 * at the CPU before the status block posted prior to the
 	 * interrupt. Reading a register will flush the status block.
@@ -3348,8 +3358,10 @@ bnx2_interrupt(int irq, void *dev_instance)
 	 */
 	if ((sblk->status_idx == bnapi->last_status_idx) &&
 	    (BNX2_RD(bp, BNX2_PCICFG_MISC_STATUS) &
-	     BNX2_PCICFG_MISC_STATUS_INTA_VALUE))
+	     BNX2_PCICFG_MISC_STATUS_INTA_VALUE)) {
+		netpoll_irq_unlock(&bnapi->netpoll_lock);
 		return IRQ_NONE;
+	}
 
 	BNX2_WR(bp, BNX2_PCICFG_INT_ACK_CMD,
 		BNX2_PCICFG_INT_ACK_CMD_USE_INT_HC_PARAM |
@@ -3362,13 +3374,15 @@ bnx2_interrupt(int irq, void *dev_instance)
 
 	/* Return here if interrupt is shared and is disabled. */
 	if (unlikely(atomic_read(&bp->intr_sem) != 0))
-		return IRQ_HANDLED;
+		goto out;
 
 	if (napi_schedule_prep(&bnapi->napi)) {
 		bnapi->last_status_idx = sblk->status_idx;
 		__napi_schedule(&bnapi->napi);
 	}
 
+out:
+	netpoll_irq_unlock(&bnapi->netpoll_lock);
 	return IRQ_HANDLED;
 }
 
@@ -7915,9 +7929,7 @@ poll_bnx2(struct net_device *dev)
 	for (i = 0; i < bp->irq_nvecs; i++) {
 		struct bnx2_irq *irq = &bp->irq_tbl[i];
 
-		disable_irq(irq->vector);
 		irq->handler(irq->vector, &bp->bnx2_napi[i]);
-		enable_irq(irq->vector);
 	}
 }
 #endif
diff --git a/drivers/net/ethernet/broadcom/bnx2.h b/drivers/net/ethernet/broadcom/bnx2.h
index 28df35d35893..a9ee8c2e1847 100644
--- a/drivers/net/ethernet/broadcom/bnx2.h
+++ b/drivers/net/ethernet/broadcom/bnx2.h
@@ -14,6 +14,8 @@
 #ifndef BNX2_H
 #define BNX2_H
 
+#include <linux/netpoll.h>
+
 /* Hardware data structures and register definitions automatically
  * generated from RTL code. Do not modify.
  */
@@ -6780,6 +6782,8 @@ struct bnx2_napi {
 
 	struct bnx2_rx_ring_info	rx_ring;
 	struct bnx2_tx_ring_info	tx_ring;
+
+	struct netpoll_irq_lock netpoll_lock;
 };
 
 struct bnx2 {
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 06/11] s2io: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (4 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 05/11] bnx2: " Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 07/11] pasemi: " Sabrina Dubroca
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Jon Mason

disable_irq() may sleep, replace it with a spin_lock in the interrupt
handler and netpoll controller.

No actual testing done, only compiled.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Jon Mason <jdmason@kudzu.us>
---
 drivers/net/ethernet/neterion/s2io.c | 27 ++++++++++++++++-----------
 drivers/net/ethernet/neterion/s2io.h |  4 ++++
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/neterion/s2io.c b/drivers/net/ethernet/neterion/s2io.c
index f5e4b820128b..6730630b7181 100644
--- a/drivers/net/ethernet/neterion/s2io.c
+++ b/drivers/net/ethernet/neterion/s2io.c
@@ -2838,7 +2838,6 @@ static int s2io_poll_inta(struct napi_struct *napi, int budget)
 static void s2io_netpoll(struct net_device *dev)
 {
 	struct s2io_nic *nic = netdev_priv(dev);
-	const int irq = nic->pdev->irq;
 	struct XENA_dev_config __iomem *bar0 = nic->bar0;
 	u64 val64 = 0xFFFFFFFFFFFFFFFFULL;
 	int i;
@@ -2848,7 +2847,8 @@ static void s2io_netpoll(struct net_device *dev)
 	if (pci_channel_offline(nic->pdev))
 		return;
 
-	disable_irq(irq);
+	/* protects against interrupts from the device */
+	netpoll_irq_lock(&nic->netpoll_lock);
 
 	writeq(val64, &bar0->rx_traffic_int);
 	writeq(val64, &bar0->tx_traffic_int);
@@ -2877,7 +2877,7 @@ static void s2io_netpoll(struct net_device *dev)
 			break;
 		}
 	}
-	enable_irq(irq);
+	netpoll_irq_unlock(&nic->netpoll_lock);
 }
 #endif
 
@@ -4717,13 +4717,16 @@ static irqreturn_t s2io_isr(int irq, void *dev_id)
 	u64 reason = 0;
 	struct mac_info *mac_control;
 	struct config_param *config;
+	irqreturn_t handled = IRQ_NONE;
+
+	netpoll_irq_lock(&sp->netpoll_lock);
 
 	/* Pretend we handled any irq's from a disconnected card */
 	if (pci_channel_offline(sp->pdev))
-		return IRQ_NONE;
+		goto out;
 
 	if (!is_s2io_card_up(sp))
-		return IRQ_NONE;
+		goto out;
 
 	config = &sp->config;
 	mac_control = &sp->mac_control;
@@ -4737,8 +4740,9 @@ static irqreturn_t s2io_isr(int irq, void *dev_id)
 	 */
 	reason = readq(&bar0->general_int_status);
 
+	handled = IRQ_HANDLED;
 	if (unlikely(reason == S2IO_MINUS_ONE))
-		return IRQ_HANDLED;	/* Nothing much can be done. Get out */
+		goto out;	/* Nothing much can be done. Get out */
 
 	if (reason &
 	    (GEN_INTR_RXTRAFFIC | GEN_INTR_TXTRAFFIC | GEN_INTR_TXPIC)) {
@@ -4793,15 +4797,14 @@ static irqreturn_t s2io_isr(int irq, void *dev_id)
 		}
 		writeq(sp->general_int_mask, &bar0->general_int_mask);
 		readl(&bar0->general_int_status);
-
-		return IRQ_HANDLED;
-
 	} else if (!reason) {
 		/* The interrupt was not raised by us */
-		return IRQ_NONE;
+		handled = IRQ_NONE;
 	}
 
-	return IRQ_HANDLED;
+out:
+	netpoll_irq_unlock(&sp->netpoll_lock);
+	return handled;
 }
 
 /**
@@ -7040,6 +7043,7 @@ static int s2io_add_isr(struct s2io_nic *sp)
 				  "MSI-X-TX entries enabled through alarm vector\n");
 		}
 	}
+
 	if (sp->config.intr_type == INTA) {
 		err = request_irq(sp->pdev->irq, s2io_isr, IRQF_SHARED,
 				  sp->name, dev);
@@ -8047,6 +8051,7 @@ s2io_init_nic(struct pci_dev *pdev, const struct pci_device_id *pre)
 
 		spin_lock_init(&fifo->tx_lock);
 	}
+	netpoll_irq_lock_init(&sp->netpoll_lock);
 
 	/*
 	 * SXE-002: Configure link and activity LED to init state
diff --git a/drivers/net/ethernet/neterion/s2io.h b/drivers/net/ethernet/neterion/s2io.h
index d89b6ed82c51..bcfc5e91b926 100644
--- a/drivers/net/ethernet/neterion/s2io.h
+++ b/drivers/net/ethernet/neterion/s2io.h
@@ -13,6 +13,8 @@
 #ifndef _S2IO_H
 #define _S2IO_H
 
+#include <linux/netpoll.h>
+
 #define TBD 0
 #define s2BIT(loc)		(0x8000000000000000ULL >> (loc))
 #define vBIT(val, loc, sz)	(((u64)val) << (64-loc-sz))
@@ -965,6 +967,8 @@ struct s2io_nic {
 #define VPD_STRING_LEN 80
 	u8  product_name[VPD_STRING_LEN];
 	u8  serial_num[VPD_STRING_LEN];
+
+	struct netpoll_irq_lock netpoll_lock;
 };
 
 #define RESET_ERROR 1
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 07/11] pasemi: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (5 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 06/11] s2io: " Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 08/11] ll_temac: " Sabrina Dubroca
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Olof Johansson

disable_irq() may sleep, replace it with a spin_lock in the interrupt
handlers and netpoll controller.

No actual testing done, only compiled.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Olof Johansson <olof@lixom.net>
---
 drivers/net/ethernet/pasemi/pasemi_mac.c | 25 +++++++++++++++----------
 drivers/net/ethernet/pasemi/pasemi_mac.h |  3 +++
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/pasemi/pasemi_mac.c b/drivers/net/ethernet/pasemi/pasemi_mac.c
index 30d934d66356..da937a1e3b9b 100644
--- a/drivers/net/ethernet/pasemi/pasemi_mac.c
+++ b/drivers/net/ethernet/pasemi/pasemi_mac.c
@@ -426,6 +426,7 @@ static int pasemi_mac_setup_rx_resources(const struct net_device *dev)
 	chno = ring->chan.chno;
 
 	spin_lock_init(&ring->lock);
+	netpoll_irq_lock_init(&ring->netpoll_lock);
 
 	ring->size = RX_RING_SIZE;
 	ring->ring_info = kzalloc(sizeof(struct pasemi_mac_buffer) *
@@ -509,6 +510,7 @@ pasemi_mac_setup_tx_resources(const struct net_device *dev)
 	chno = ring->chan.chno;
 
 	spin_lock_init(&ring->lock);
+	netpoll_irq_lock_init(&ring->netpoll_lock);
 
 	ring->size = TX_RING_SIZE;
 	ring->ring_info = kzalloc(sizeof(struct pasemi_mac_buffer) *
@@ -954,13 +956,16 @@ restart:
 
 static irqreturn_t pasemi_mac_rx_intr(int irq, void *data)
 {
-	const struct pasemi_mac_rxring *rxring = data;
+	struct pasemi_mac_rxring *rxring = data;
 	struct pasemi_mac *mac = rxring->mac;
 	const struct pasemi_dmachan *chan = &rxring->chan;
 	unsigned int reg;
 
-	if (!(*chan->status & PAS_STATUS_CAUSE_M))
+	netpoll_irq_lock(&rxring->netpoll_lock);
+	if (!(*chan->status & PAS_STATUS_CAUSE_M)) {
+		netpoll_irq_unlock(&rxring->netpoll_lock);
 		return IRQ_NONE;
+	}
 
 	/* Don't reset packet count so it won't fire again but clear
 	 * all others.
@@ -976,6 +981,7 @@ static irqreturn_t pasemi_mac_rx_intr(int irq, void *data)
 
 	write_iob_reg(PAS_IOB_DMA_RXCH_RESET(chan->chno), reg);
 
+	netpoll_irq_unlock(&rxring->netpoll_lock);
 	return IRQ_HANDLED;
 }
 
@@ -1000,8 +1006,11 @@ static irqreturn_t pasemi_mac_tx_intr(int irq, void *data)
 	struct pasemi_mac *mac = txring->mac;
 	unsigned int reg;
 
-	if (!(*chan->status & PAS_STATUS_CAUSE_M))
+	netpoll_irq_lock(&txring->netpoll_lock);
+	if (!(*chan->status & PAS_STATUS_CAUSE_M)) {
+		netpoll_irq_unlock(&txring->netpoll_lock);
 		return IRQ_NONE;
+	}
 
 	reg = 0;
 
@@ -1017,6 +1026,7 @@ static irqreturn_t pasemi_mac_tx_intr(int irq, void *data)
 	if (reg)
 		write_iob_reg(PAS_IOB_DMA_TXCH_RESET(chan->chno), reg);
 
+	netpoll_irq_unlock(&txring->netpoll_lock);
 	return IRQ_HANDLED;
 }
 
@@ -1628,20 +1638,15 @@ static int pasemi_mac_poll(struct napi_struct *napi, int budget)
 #ifdef CONFIG_NET_POLL_CONTROLLER
 /*
  * Polling 'interrupt' - used by things like netconsole to send skbs
- * without having to re-enable interrupts. It's not called while
- * the interrupt routine is executing.
+ * without having to re-enable interrupts. The interrupt routine
+ * protects itself with netpoll_irq_lock.
  */
 static void pasemi_mac_netpoll(struct net_device *dev)
 {
 	const struct pasemi_mac *mac = netdev_priv(dev);
 
-	disable_irq(mac->tx->chan.irq);
 	pasemi_mac_tx_intr(mac->tx->chan.irq, mac->tx);
-	enable_irq(mac->tx->chan.irq);
-
-	disable_irq(mac->rx->chan.irq);
 	pasemi_mac_rx_intr(mac->rx->chan.irq, mac->rx);
-	enable_irq(mac->rx->chan.irq);
 }
 #endif
 
diff --git a/drivers/net/ethernet/pasemi/pasemi_mac.h b/drivers/net/ethernet/pasemi/pasemi_mac.h
index a5807703ab96..78ff0c6947b4 100644
--- a/drivers/net/ethernet/pasemi/pasemi_mac.h
+++ b/drivers/net/ethernet/pasemi/pasemi_mac.h
@@ -22,6 +22,7 @@
 
 #include <linux/ethtool.h>
 #include <linux/netdevice.h>
+#include <linux/netpoll.h>
 #include <linux/spinlock.h>
 #include <linux/phy.h>
 
@@ -43,6 +44,7 @@ struct pasemi_mac_txring {
 	struct pasemi_mac_buffer *ring_info;
 	struct pasemi_mac *mac;	/* Needed in intr handler */
 	struct timer_list clean_timer;
+	struct netpoll_irq_lock netpoll_lock;
 };
 
 struct pasemi_mac_rxring {
@@ -55,6 +57,7 @@ struct pasemi_mac_rxring {
 	unsigned int	 next_to_clean;
 	struct pasemi_mac_buffer *ring_info;
 	struct pasemi_mac *mac;	/* Needed in intr handler */
+	struct netpoll_irq_lock netpoll_lock;
 };
 
 struct pasemi_mac_csring {
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 08/11] ll_temac: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (6 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 07/11] pasemi: " Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 09/11] xilinx/axienet: " Sabrina Dubroca
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Michal Simek, Sören Brinkmann

disable_irq() may sleep, replace it with a spin_lock in the interrupt
handler and netpoll controller.

No actual testing done, only compiled.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: "Sören Brinkmann" <soren.brinkmann@xilinx.com>
---
 drivers/net/ethernet/xilinx/ll_temac.h      |  3 +++
 drivers/net/ethernet/xilinx/ll_temac_main.c | 13 +++++++------
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/ll_temac.h b/drivers/net/ethernet/xilinx/ll_temac.h
index 522abe2ff25a..09906958a689 100644
--- a/drivers/net/ethernet/xilinx/ll_temac.h
+++ b/drivers/net/ethernet/xilinx/ll_temac.h
@@ -3,6 +3,7 @@
 #define XILINX_LL_TEMAC_H
 
 #include <linux/netdevice.h>
+#include <linux/netpoll.h>
 #include <linux/of.h>
 #include <linux/spinlock.h>
 
@@ -368,6 +369,8 @@ struct temac_local {
 	int tx_bd_next;
 	int tx_bd_tail;
 	int rx_bd_ci;
+
+	struct netpoll_irq_lock netpoll_lock;
 };
 
 /* xilinx_temac.c */
diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index 9c2d91ea0af4..37c3f9588208 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -815,6 +815,7 @@ static irqreturn_t ll_temac_tx_irq(int irq, void *_ndev)
 	struct temac_local *lp = netdev_priv(ndev);
 	unsigned int status;
 
+	netpoll_irq_lock(&lp->netpoll_lock);
 	status = lp->dma_in(lp, TX_IRQ_REG);
 	lp->dma_out(lp, TX_IRQ_REG, status);
 
@@ -823,6 +824,8 @@ static irqreturn_t ll_temac_tx_irq(int irq, void *_ndev)
 	if (status & 0x080)
 		dev_err(&ndev->dev, "DMA error 0x%x\n", status);
 
+	netpoll_irq_unlock(&lp->netpoll_lock);
+
 	return IRQ_HANDLED;
 }
 
@@ -832,6 +835,7 @@ static irqreturn_t ll_temac_rx_irq(int irq, void *_ndev)
 	struct temac_local *lp = netdev_priv(ndev);
 	unsigned int status;
 
+	netpoll_irq_lock(&lp->netpoll_lock);
 	/* Read and clear the status registers */
 	status = lp->dma_in(lp, RX_IRQ_REG);
 	lp->dma_out(lp, RX_IRQ_REG, status);
@@ -839,6 +843,8 @@ static irqreturn_t ll_temac_rx_irq(int irq, void *_ndev)
 	if (status & (IRQ_COAL | IRQ_DLY))
 		ll_temac_recv(lp->ndev);
 
+	netpoll_irq_unlock(&lp->netpoll_lock);
+
 	return IRQ_HANDLED;
 }
 
@@ -905,14 +911,8 @@ temac_poll_controller(struct net_device *ndev)
 {
 	struct temac_local *lp = netdev_priv(ndev);
 
-	disable_irq(lp->tx_irq);
-	disable_irq(lp->rx_irq);
-
 	ll_temac_rx_irq(lp->tx_irq, ndev);
 	ll_temac_tx_irq(lp->rx_irq, ndev);
-
-	enable_irq(lp->tx_irq);
-	enable_irq(lp->rx_irq);
 }
 #endif
 
@@ -1037,6 +1037,7 @@ static int temac_of_probe(struct platform_device *op)
 	lp->dev = &op->dev;
 	lp->options = XTE_OPTION_DEFAULTS;
 	spin_lock_init(&lp->rx_lock);
+	netpoll_irq_lock_init(&lp->netpoll_lock);
 	mutex_init(&lp->indirect_mutex);
 
 	/* map device registers */
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 09/11] xilinx/axienet: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (7 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 08/11] ll_temac: " Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 10/11] gianfar: " Sabrina Dubroca
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem
  Cc: netdev, Sabrina Dubroca, Anirudha Sarangi, John Linn,
	Michal Simek, Sören Brinkmann

disable_irq() may sleep, replace it with a spin_lock in the interrupt
handler and netpoll controller.

No actual testing done, only compiled.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Anirudha Sarangi <anirudh@xilinx.com>
Cc: John Linn <John.Linn@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: "Sören Brinkmann" <soren.brinkmann@xilinx.com>
---
 drivers/net/ethernet/xilinx/xilinx_axienet.h      |  3 +++
 drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 15 +++++++++------
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
index 44b8d2bad8c3..b871857cb258 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
@@ -9,6 +9,7 @@
 #define XILINX_AXIENET_H
 
 #include <linux/netdevice.h>
+#include <linux/netpoll.h>
 #include <linux/spinlock.h>
 #include <linux/interrupt.h>
 
@@ -455,6 +456,8 @@ struct axienet_local {
 
 	u32 coalesce_count_rx;
 	u32 coalesce_count_tx;
+
+	struct netpoll_irq_lock netpoll_lock;
 };
 
 /**
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 4ea2d4e6f1d1..ff6be3d82618 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -806,6 +806,7 @@ static irqreturn_t axienet_tx_irq(int irq, void *_ndev)
 	struct net_device *ndev = _ndev;
 	struct axienet_local *lp = netdev_priv(ndev);
 
+	netpoll_irq_lock(&lp->netpoll_lock);
 	status = axienet_dma_in32(lp, XAXIDMA_TX_SR_OFFSET);
 	if (status & (XAXIDMA_IRQ_IOC_MASK | XAXIDMA_IRQ_DELAY_MASK)) {
 		axienet_start_xmit_done(lp->ndev);
@@ -834,6 +835,7 @@ static irqreturn_t axienet_tx_irq(int irq, void *_ndev)
 	}
 out:
 	axienet_dma_out32(lp, XAXIDMA_TX_SR_OFFSET, status);
+	netpoll_irq_unlock(&lp->netpoll_lock);
 	return IRQ_HANDLED;
 }
 
@@ -854,6 +856,7 @@ static irqreturn_t axienet_rx_irq(int irq, void *_ndev)
 	struct net_device *ndev = _ndev;
 	struct axienet_local *lp = netdev_priv(ndev);
 
+	netpoll_irq_lock(&lp->netpoll_lock);
 	status = axienet_dma_in32(lp, XAXIDMA_RX_SR_OFFSET);
 	if (status & (XAXIDMA_IRQ_IOC_MASK | XAXIDMA_IRQ_DELAY_MASK)) {
 		axienet_recv(lp->ndev);
@@ -882,6 +885,7 @@ static irqreturn_t axienet_rx_irq(int irq, void *_ndev)
 	}
 out:
 	axienet_dma_out32(lp, XAXIDMA_RX_SR_OFFSET, status);
+	netpoll_irq_unlock(&lp->netpoll_lock);
 	return IRQ_HANDLED;
 }
 
@@ -1035,18 +1039,15 @@ static int axienet_change_mtu(struct net_device *ndev, int new_mtu)
  * axienet_poll_controller - Axi Ethernet poll mechanism.
  * @ndev:	Pointer to net_device structure
  *
- * This implements Rx/Tx ISR poll mechanisms. The interrupts are disabled prior
- * to polling the ISRs and are enabled back after the polling is done.
+ * This implements Rx/Tx ISR poll mechanisms. The ISRs use a spinlock to
+ * protect against reentrancy when polling.
  */
 static void axienet_poll_controller(struct net_device *ndev)
 {
 	struct axienet_local *lp = netdev_priv(ndev);
-	disable_irq(lp->tx_irq);
-	disable_irq(lp->rx_irq);
+
 	axienet_rx_irq(lp->tx_irq, ndev);
 	axienet_tx_irq(lp->rx_irq, ndev);
-	enable_irq(lp->tx_irq);
-	enable_irq(lp->rx_irq);
 }
 #endif
 
@@ -1602,6 +1603,8 @@ static int axienet_of_probe(struct platform_device *op)
 	if (ret)
 		dev_warn(&op->dev, "error registering MDIO bus\n");
 
+	netpoll_irq_lock_init(&lp->netpoll_lock);
+
 	ret = register_netdev(lp->ndev);
 	if (ret) {
 		dev_err(lp->dev, "register_netdev() error (%i)\n", ret);
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 10/11] gianfar: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (8 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 09/11] xilinx/axienet: " Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-09 14:37 ` [RFC PATCH net-next 11/11] net: fec: " Sabrina Dubroca
  2014-12-10  2:44 ` [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller David Miller
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca, Claudiu Manoil

disable_irq() may sleep, replace it with a spin_lock in the interrupt
handler and netpoll controller.

No actual testing done, only compiled.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
---
 drivers/net/ethernet/freescale/gianfar.c | 38 ++++++++++++++------------------
 drivers/net/ethernet/freescale/gianfar.h |  5 +++++
 2 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 86dccb26fecc..c6e85983fb65 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2709,6 +2709,7 @@ irqreturn_t gfar_receive(int irq, void *grp_id)
 	unsigned long flags;
 	u32 imask;
 
+	netpoll_irq_lock(&grp->netpoll_rx_lock);
 	if (likely(napi_schedule_prep(&grp->napi_rx))) {
 		spin_lock_irqsave(&grp->grplock, flags);
 		imask = gfar_read(&grp->regs->imask);
@@ -2723,6 +2724,7 @@ irqreturn_t gfar_receive(int irq, void *grp_id)
 		gfar_write(&grp->regs->ievent, IEVENT_RX_MASK);
 	}
 
+	netpoll_irq_unlock(&grp->netpoll_rx_lock);
 	return IRQ_HANDLED;
 }
 
@@ -2733,6 +2735,7 @@ static irqreturn_t gfar_transmit(int irq, void *grp_id)
 	unsigned long flags;
 	u32 imask;
 
+	netpoll_irq_lock(&grp->netpoll_tx_lock);
 	if (likely(napi_schedule_prep(&grp->napi_tx))) {
 		spin_lock_irqsave(&grp->grplock, flags);
 		imask = gfar_read(&grp->regs->imask);
@@ -2747,6 +2750,7 @@ static irqreturn_t gfar_transmit(int irq, void *grp_id)
 		gfar_write(&grp->regs->ievent, IEVENT_TX_MASK);
 	}
 
+	netpoll_irq_unlock(&grp->netpoll_tx_lock);
 	return IRQ_HANDLED;
 }
 
@@ -3073,27 +3077,10 @@ static void gfar_netpoll(struct net_device *dev)
 	struct gfar_private *priv = netdev_priv(dev);
 	int i;
 
-	/* If the device has multiple interrupts, run tx/rx */
-	if (priv->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) {
-		for (i = 0; i < priv->num_grps; i++) {
-			struct gfar_priv_grp *grp = &priv->gfargrp[i];
-
-			disable_irq(gfar_irq(grp, TX)->irq);
-			disable_irq(gfar_irq(grp, RX)->irq);
-			disable_irq(gfar_irq(grp, ER)->irq);
-			gfar_interrupt(gfar_irq(grp, TX)->irq, grp);
-			enable_irq(gfar_irq(grp, ER)->irq);
-			enable_irq(gfar_irq(grp, RX)->irq);
-			enable_irq(gfar_irq(grp, TX)->irq);
-		}
-	} else {
-		for (i = 0; i < priv->num_grps; i++) {
-			struct gfar_priv_grp *grp = &priv->gfargrp[i];
+	for (i = 0; i < priv->num_grps; i++) {
+		struct gfar_priv_grp *grp = &priv->gfargrp[i];
 
-			disable_irq(gfar_irq(grp, TX)->irq);
-			gfar_interrupt(gfar_irq(grp, TX)->irq, grp);
-			enable_irq(gfar_irq(grp, TX)->irq);
-		}
+		gfar_interrupt(gfar_irq(grp, TX)->irq, grp);
 	}
 }
 #endif
@@ -3102,9 +3089,11 @@ static void gfar_netpoll(struct net_device *dev)
 static irqreturn_t gfar_interrupt(int irq, void *grp_id)
 {
 	struct gfar_priv_grp *gfargrp = grp_id;
+	u32 events;
 
+	netpoll_irq_lock(&gfargrp->netpoll_intr_lock);
 	/* Save ievent for future reference */
-	u32 events = gfar_read(&gfargrp->regs->ievent);
+	events = gfar_read(&gfargrp->regs->ievent);
 
 	/* Check for reception */
 	if (events & IEVENT_RX_MASK)
@@ -3118,6 +3107,7 @@ static irqreturn_t gfar_interrupt(int irq, void *grp_id)
 	if (events & IEVENT_ERR_MASK)
 		gfar_error(irq, grp_id);
 
+	netpoll_irq_unlock(&gfargrp->netpoll_intr_lock);
 	return IRQ_HANDLED;
 }
 
@@ -3306,9 +3296,11 @@ static irqreturn_t gfar_error(int irq, void *grp_id)
 	struct gfar __iomem *regs = gfargrp->regs;
 	struct gfar_private *priv= gfargrp->priv;
 	struct net_device *dev = priv->ndev;
+	u32 events;
 
+	netpoll_irq_lock(&gfargrp->netpoll_err_lock);
 	/* Save ievent for future reference */
-	u32 events = gfar_read(&regs->ievent);
+	events = gfar_read(&regs->ievent);
 
 	/* Clear IEVENT */
 	gfar_write(&regs->ievent, events & IEVENT_ERR_MASK);
@@ -3377,6 +3369,8 @@ static irqreturn_t gfar_error(int irq, void *grp_id)
 		atomic64_inc(&priv->extra_stats.tx_babt);
 		netif_dbg(priv, tx_err, dev, "babbling TX error\n");
 	}
+
+	netpoll_irq_unlock(&gfargrp->netpoll_err_lock);
 	return IRQ_HANDLED;
 }
 
diff --git a/drivers/net/ethernet/freescale/gianfar.h b/drivers/net/ethernet/freescale/gianfar.h
index b581b8823a2a..4da559dd772d 100644
--- a/drivers/net/ethernet/freescale/gianfar.h
+++ b/drivers/net/ethernet/freescale/gianfar.h
@@ -37,6 +37,7 @@
 #include <linux/mm.h>
 #include <linux/mii.h>
 #include <linux/phy.h>
+#include <linux/netpoll.h>
 
 #include <asm/io.h>
 #include <asm/irq.h>
@@ -1079,6 +1080,10 @@ struct gfar_priv_grp {
 	unsigned long rx_bit_map;
 
 	struct gfar_irqinfo *irqinfo[GFAR_NUM_IRQS];
+	struct netpoll_irq_lock netpoll_intr_lock;
+	struct netpoll_irq_lock netpoll_rx_lock;
+	struct netpoll_irq_lock netpoll_tx_lock;
+	struct netpoll_irq_lock netpoll_err_lock;
 };
 
 #define gfar_irq(grp, ID) \
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next 11/11] net: fec: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (9 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 10/11] gianfar: " Sabrina Dubroca
@ 2014-12-09 14:37 ` Sabrina Dubroca
  2014-12-10  2:44 ` [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller David Miller
  11 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 14:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Sabrina Dubroca

disable_irq() may sleep, replace it with a spin_lock in the interrupt
handler and netpoll controller.

No actual testing done, only compiled.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 drivers/net/ethernet/freescale/fec.h      |  2 ++
 drivers/net/ethernet/freescale/fec_main.c | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 469691ad4a1e..fe8931465a53 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -16,6 +16,7 @@
 #include <linux/clocksource.h>
 #include <linux/net_tstamp.h>
 #include <linux/ptp_clock_kernel.h>
+#include <linux/netpoll.h>
 
 #if defined(CONFIG_M523x) || defined(CONFIG_M527x) || defined(CONFIG_M528x) || \
     defined(CONFIG_M520x) || defined(CONFIG_M532x) || \
@@ -509,6 +510,7 @@ struct fec_enet_private {
 	int	speed;
 	struct	completion mdio_done;
 	int	irq[FEC_IRQ_NUM];
+	struct netpoll_irq_lock netpoll_locks[FEC_IRQ_NUM];
 	bool	bufdesc_ex;
 	int	pause_flag;
 	u32	quirks;
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index d2955ce24d0b..b7579c5acedb 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1558,6 +1558,18 @@ fec_enet_interrupt(int irq, void *dev_id)
 	const unsigned napi_mask = FEC_ENET_RXF | FEC_ENET_TXF;
 	uint int_events;
 	irqreturn_t ret = IRQ_NONE;
+	int i;
+	struct netpoll_irq_lock *netpoll_lock = NULL;
+
+	for (i = 0; i < FEC_IRQ_NUM; i++) {
+		if (fep->irq[i] == irq) {
+			netpoll_lock = &fep->netpoll_locks[i];
+			netpoll_irq_lock(netpoll_lock);
+			break;
+		}
+	}
+	if (!netpoll_lock)
+		return ret;
 
 	int_events = readl(fep->hwp + FEC_IEVENT);
 	writel(int_events & ~napi_mask, fep->hwp + FEC_IEVENT);
@@ -1579,6 +1591,7 @@ fec_enet_interrupt(int irq, void *dev_id)
 	if (fep->ptp_clock)
 		fec_ptp_check_pps_event(fep);
 
+	netpoll_irq_unlock(netpoll_lock);
 	return ret;
 }
 
@@ -3237,6 +3250,7 @@ fec_probe(struct platform_device *pdev)
 
 	for (i = 0; i < FEC_IRQ_NUM; i++) {
 		irq = platform_get_irq(pdev, i);
+		netpoll_irq_lock_init(&fep->netpoll_locks[i]);
 		if (irq < 0) {
 			if (i)
 				break;
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 04/11] atl1c: remove disable_irq from netpoll controller, use netpoll_irq_lock
       [not found]   ` <CAMXMK6t5NfPQFBxK1Qny45LCS6rwX4Ys1n4C7fsTPHXu=x_vuQ@mail.gmail.com>
@ 2014-12-09 17:23     ` Sabrina Dubroca
  2014-12-09 21:17       ` Chris Snook
  0 siblings, 1 reply; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-09 17:23 UTC (permalink / raw)
  To: Chris Snook; +Cc: davem, netdev, Jay Cliburn

2014-12-09, 16:13:33 +0000, Chris Snook wrote:
> Could you explain the bug a little more for us? It's not obvious to me how
> sleeping there is a problem.
> 
> -- Chris

Sorry for the lack of context.

A might_sleep() check in disable_irq() was added in commit
e22b886a8a43b ("sched/wait: Add might_sleep() checks") [1], and it
triggers when using netconsole:

BUG: sleeping function called from invalid context at kernel/irq/manage.c:104
in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: systemd
no locks held by systemd/1.
irq event stamp: 10102965
hardirqs last  enabled at (10102965): [<ffffffff810cbafd>] vprintk_emit+0x2dd/0x6a0
hardirqs last disabled at (10102964): [<ffffffff810cb897>] vprintk_emit+0x77/0x6a0
softirqs last  enabled at (10102342): [<ffffffff810666aa>] __do_softirq+0x27a/0x6f0
softirqs last disabled at (10102337): [<ffffffff81066e86>] irq_exit+0x56/0xe0
Preemption disabled at:[<ffffffff817de50d>] printk_emit+0x31/0x33

CPU: 1 PID: 1 Comm: systemd Not tainted 3.18.0-rc2-next-20141029-dirty #222
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
 ffffffff81a82291 ffff88001e743978 ffffffff817df31d 0000000000000000
 0000000000000000 ffff88001e7439a8 ffffffff8108dfa2 ffff88001e7439a8
 ffffffff81a82291 0000000000000068 0000000000000000 ffff88001e7439d8
Call Trace:
 [<ffffffff817df31d>] dump_stack+0x4f/0x7c
 [<ffffffff8108dfa2>] ___might_sleep+0x182/0x2b0
 [<ffffffff8108e10a>] __might_sleep+0x3a/0xc0
 [<ffffffff810ce358>] synchronize_irq+0x38/0xa0
 [<ffffffff810ce633>] ? __disable_irq_nosync+0x43/0x70
 [<ffffffff810ce690>] disable_irq+0x20/0x30
 [<ffffffff815d7253>] e1000_netpoll+0x23/0x60
 [<ffffffff81678d02>] netpoll_poll_dev+0x72/0x3a0
 [<ffffffff817e9993>] ? _raw_spin_trylock+0x73/0x90
 [<ffffffff8167920f>] ? netpoll_send_skb_on_dev+0x1df/0x2e0
 [<ffffffff816791e7>] netpoll_send_skb_on_dev+0x1b7/0x2e0
 [<ffffffff816795f3>] netpoll_send_udp+0x2e3/0x490


The initial discussion of this problem is here: https://lkml.org/lkml/2014/10/29/523


[1] https://lkml.org/lkml/2014/10/28/427


> On Tue, Dec 9, 2014, 06:39 Sabrina Dubroca <sd@queasysnail.net> wrote:
> 
> > disable_irq() may sleep, replace it with a spin_lock in the interrupt
> > handler.
> >
> > No actual testing done, only compiled.
> >
> > Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> > Cc: Jay Cliburn <jcliburn@gmail.com>
> > Cc: Chris Snook <chris.snook@gmail.com>
> > ---
> >  drivers/net/ethernet/atheros/atl1c/atl1c.h      |  3 +++
> >  drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 12 ++++++++----
> >  2 files changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c.h
> > b/drivers/net/ethernet/atheros/atl1c/atl1c.h
> > index b9203d928938..8d97791e1516 100644
> > --- a/drivers/net/ethernet/atheros/atl1c/atl1c.h
> > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c.h
> > @@ -49,6 +49,7 @@
> >  #include <linux/workqueue.h>
> >  #include <net/checksum.h>
> >  #include <net/ip6_checksum.h>
> > +#include <linux/netpoll.h>
> >
> >  #include "atl1c_hw.h"
> >
> > @@ -555,6 +556,8 @@ struct atl1c_adapter {
> >         struct atl1c_rfd_ring rfd_ring;
> >         struct atl1c_rrd_ring rrd_ring;
> >         u32 bd_number;     /* board number;*/
> > +
> > +       struct netpoll_irq_lock netpoll_lock;
> >  };
> >
> >  #define AT_WRITE_REG(a, reg, value) ( \
> > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > index 72fb86b9aa24..7a1b11eb8e4e 100644
> > --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> > @@ -826,6 +826,7 @@ static int atl1c_sw_init(struct atl1c_adapter *adapter)
> >         atomic_set(&adapter->irq_sem, 1);
> >         spin_lock_init(&adapter->mdio_lock);
> >         spin_lock_init(&adapter->tx_lock);
> > +       netpoll_irq_lock_init(&adapter->netpoll_lock);
> >         set_bit(__AT_DOWN, &adapter->flags);
> >
> >         return 0;
> > @@ -1584,10 +1585,11 @@ static irqreturn_t atl1c_intr(int irq, void *data)
> >         struct pci_dev *pdev = adapter->pdev;
> >         struct atl1c_hw *hw = &adapter->hw;
> >         int max_ints = AT_MAX_INT_WORK;
> > -       int handled = IRQ_NONE;
> > +       irqreturn_t handled = IRQ_NONE;
> >         u32 status;
> >         u32 reg_data;
> >
> > +       netpoll_irq_lock(&adapter->netpoll_lock);
> >         do {
> >                 AT_READ_REG(hw, REG_ISR, &reg_data);
> >                 status = reg_data & hw->intr_mask;
> > @@ -1622,7 +1624,8 @@ static irqreturn_t atl1c_intr(int irq, void *data)
> >                         /* reset MAC */
> >                         set_bit(ATL1C_WORK_EVENT_RESET,
> > &adapter->work_event);
> >                         schedule_work(&adapter->common_task);
> > -                       return IRQ_HANDLED;
> > +                       handled = IRQ_HANDLED;
> > +                       goto out;
> >                 }
> >
> >                 if (status & ISR_OVER)
> > @@ -1641,6 +1644,9 @@ static irqreturn_t atl1c_intr(int irq, void *data)
> >         } while (--max_ints > 0);
> >         /* re-enable Interrupt*/
> >         AT_WRITE_REG(&adapter->hw, REG_ISR, 0);
> > +
> > +out:
> > +       netpoll_irq_unlock(&adapter->netpoll_lock);
> >         return handled;
> >  }
> >
> > @@ -1900,9 +1906,7 @@ static void atl1c_netpoll(struct net_device *netdev)
> >  {
> >         struct atl1c_adapter *adapter = netdev_priv(netdev);
> >
> > -       disable_irq(adapter->pdev->irq);
> >         atl1c_intr(adapter->pdev->irq, netdev);
> > -       enable_irq(adapter->pdev->irq);
> >  }
> >  #endif
> >
> > --
> > 2.1.3
> >
> >

-- 
Sabrina

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 04/11] atl1c: remove disable_irq from netpoll controller, use netpoll_irq_lock
  2014-12-09 17:23     ` Sabrina Dubroca
@ 2014-12-09 21:17       ` Chris Snook
  0 siblings, 0 replies; 22+ messages in thread
From: Chris Snook @ 2014-12-09 21:17 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: David S. Miller, netdev@vger.kernel.org, Jay Cliburn

Thanks for clarifying. I'll see if I can find someone with an atl1c
device to test this.

-- Chris

On Tue, Dec 9, 2014 at 9:23 AM, Sabrina Dubroca <sd@queasysnail.net> wrote:
> 2014-12-09, 16:13:33 +0000, Chris Snook wrote:
>> Could you explain the bug a little more for us? It's not obvious to me how
>> sleeping there is a problem.
>>
>> -- Chris
>
> Sorry for the lack of context.
>
> A might_sleep() check in disable_irq() was added in commit
> e22b886a8a43b ("sched/wait: Add might_sleep() checks") [1], and it
> triggers when using netconsole:
>
> BUG: sleeping function called from invalid context at kernel/irq/manage.c:104
> in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: systemd
> no locks held by systemd/1.
> irq event stamp: 10102965
> hardirqs last  enabled at (10102965): [<ffffffff810cbafd>] vprintk_emit+0x2dd/0x6a0
> hardirqs last disabled at (10102964): [<ffffffff810cb897>] vprintk_emit+0x77/0x6a0
> softirqs last  enabled at (10102342): [<ffffffff810666aa>] __do_softirq+0x27a/0x6f0
> softirqs last disabled at (10102337): [<ffffffff81066e86>] irq_exit+0x56/0xe0
> Preemption disabled at:[<ffffffff817de50d>] printk_emit+0x31/0x33
>
> CPU: 1 PID: 1 Comm: systemd Not tainted 3.18.0-rc2-next-20141029-dirty #222
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
>  ffffffff81a82291 ffff88001e743978 ffffffff817df31d 0000000000000000
>  0000000000000000 ffff88001e7439a8 ffffffff8108dfa2 ffff88001e7439a8
>  ffffffff81a82291 0000000000000068 0000000000000000 ffff88001e7439d8
> Call Trace:
>  [<ffffffff817df31d>] dump_stack+0x4f/0x7c
>  [<ffffffff8108dfa2>] ___might_sleep+0x182/0x2b0
>  [<ffffffff8108e10a>] __might_sleep+0x3a/0xc0
>  [<ffffffff810ce358>] synchronize_irq+0x38/0xa0
>  [<ffffffff810ce633>] ? __disable_irq_nosync+0x43/0x70
>  [<ffffffff810ce690>] disable_irq+0x20/0x30
>  [<ffffffff815d7253>] e1000_netpoll+0x23/0x60
>  [<ffffffff81678d02>] netpoll_poll_dev+0x72/0x3a0
>  [<ffffffff817e9993>] ? _raw_spin_trylock+0x73/0x90
>  [<ffffffff8167920f>] ? netpoll_send_skb_on_dev+0x1df/0x2e0
>  [<ffffffff816791e7>] netpoll_send_skb_on_dev+0x1b7/0x2e0
>  [<ffffffff816795f3>] netpoll_send_udp+0x2e3/0x490
>
>
> The initial discussion of this problem is here: https://lkml.org/lkml/2014/10/29/523
>
>
> [1] https://lkml.org/lkml/2014/10/28/427
>
>
>> On Tue, Dec 9, 2014, 06:39 Sabrina Dubroca <sd@queasysnail.net> wrote:
>>
>> > disable_irq() may sleep, replace it with a spin_lock in the interrupt
>> > handler.
>> >
>> > No actual testing done, only compiled.
>> >
>> > Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
>> > Cc: Jay Cliburn <jcliburn@gmail.com>
>> > Cc: Chris Snook <chris.snook@gmail.com>
>> > ---
>> >  drivers/net/ethernet/atheros/atl1c/atl1c.h      |  3 +++
>> >  drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 12 ++++++++----
>> >  2 files changed, 11 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c.h
>> > b/drivers/net/ethernet/atheros/atl1c/atl1c.h
>> > index b9203d928938..8d97791e1516 100644
>> > --- a/drivers/net/ethernet/atheros/atl1c/atl1c.h
>> > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c.h
>> > @@ -49,6 +49,7 @@
>> >  #include <linux/workqueue.h>
>> >  #include <net/checksum.h>
>> >  #include <net/ip6_checksum.h>
>> > +#include <linux/netpoll.h>
>> >
>> >  #include "atl1c_hw.h"
>> >
>> > @@ -555,6 +556,8 @@ struct atl1c_adapter {
>> >         struct atl1c_rfd_ring rfd_ring;
>> >         struct atl1c_rrd_ring rrd_ring;
>> >         u32 bd_number;     /* board number;*/
>> > +
>> > +       struct netpoll_irq_lock netpoll_lock;
>> >  };
>> >
>> >  #define AT_WRITE_REG(a, reg, value) ( \
>> > diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
>> > b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
>> > index 72fb86b9aa24..7a1b11eb8e4e 100644
>> > --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
>> > +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
>> > @@ -826,6 +826,7 @@ static int atl1c_sw_init(struct atl1c_adapter *adapter)
>> >         atomic_set(&adapter->irq_sem, 1);
>> >         spin_lock_init(&adapter->mdio_lock);
>> >         spin_lock_init(&adapter->tx_lock);
>> > +       netpoll_irq_lock_init(&adapter->netpoll_lock);
>> >         set_bit(__AT_DOWN, &adapter->flags);
>> >
>> >         return 0;
>> > @@ -1584,10 +1585,11 @@ static irqreturn_t atl1c_intr(int irq, void *data)
>> >         struct pci_dev *pdev = adapter->pdev;
>> >         struct atl1c_hw *hw = &adapter->hw;
>> >         int max_ints = AT_MAX_INT_WORK;
>> > -       int handled = IRQ_NONE;
>> > +       irqreturn_t handled = IRQ_NONE;
>> >         u32 status;
>> >         u32 reg_data;
>> >
>> > +       netpoll_irq_lock(&adapter->netpoll_lock);
>> >         do {
>> >                 AT_READ_REG(hw, REG_ISR, &reg_data);
>> >                 status = reg_data & hw->intr_mask;
>> > @@ -1622,7 +1624,8 @@ static irqreturn_t atl1c_intr(int irq, void *data)
>> >                         /* reset MAC */
>> >                         set_bit(ATL1C_WORK_EVENT_RESET,
>> > &adapter->work_event);
>> >                         schedule_work(&adapter->common_task);
>> > -                       return IRQ_HANDLED;
>> > +                       handled = IRQ_HANDLED;
>> > +                       goto out;
>> >                 }
>> >
>> >                 if (status & ISR_OVER)
>> > @@ -1641,6 +1644,9 @@ static irqreturn_t atl1c_intr(int irq, void *data)
>> >         } while (--max_ints > 0);
>> >         /* re-enable Interrupt*/
>> >         AT_WRITE_REG(&adapter->hw, REG_ISR, 0);
>> > +
>> > +out:
>> > +       netpoll_irq_unlock(&adapter->netpoll_lock);
>> >         return handled;
>> >  }
>> >
>> > @@ -1900,9 +1906,7 @@ static void atl1c_netpoll(struct net_device *netdev)
>> >  {
>> >         struct atl1c_adapter *adapter = netdev_priv(netdev);
>> >
>> > -       disable_irq(adapter->pdev->irq);
>> >         atl1c_intr(adapter->pdev->irq, netdev);
>> > -       enable_irq(adapter->pdev->irq);
>> >  }
>> >  #endif
>> >
>> > --
>> > 2.1.3
>> >
>> >
>
> --
> Sabrina

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
  2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
                   ` (10 preceding siblings ...)
  2014-12-09 14:37 ` [RFC PATCH net-next 11/11] net: fec: " Sabrina Dubroca
@ 2014-12-10  2:44 ` David Miller
  2014-12-11 21:45   ` Sabrina Dubroca
  11 siblings, 1 reply; 22+ messages in thread
From: David Miller @ 2014-12-10  2:44 UTC (permalink / raw)
  To: sd; +Cc: netdev, peterz, tglx


Adding a new spinlock to every interrupt service routine is
simply a non-starter.

You will certainly have to find a way to fix this in a way
that doesn't involve adding any new overhead to the normal
operational paths of these drivers.

Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
  2014-12-10  2:44 ` [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller David Miller
@ 2014-12-11 21:45   ` Sabrina Dubroca
  2014-12-12  2:14     ` David Miller
  2014-12-12 22:01     ` Thomas Gleixner
  0 siblings, 2 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2014-12-11 21:45 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, peterz, tglx

2014-12-09, 21:44:33 -0500, David Miller wrote:
> 
> Adding a new spinlock to every interrupt service routine is
> simply a non-starter.
> 
> You will certainly have to find a way to fix this in a way
> that doesn't involve adding any new overhead to the normal
> operational paths of these drivers.
> 
> Thanks.

Okay. Here is another idea.

Since the issue is with the wait_event() part of synchronize_irq(),
and it only takes care of threaded handlers, maybe we could try not
waiting for threaded handlers.

Introduce disable_irq_nosleep() that returns true if it successfully
synchronized against all handlers (there was no threaded handler
running), false if it left some threads running.  And in
->ndo_poll_controller, only call the interrupt handler if
synchronization was successful.

Both users of the poll controllers retry their action (alloc/xmit an
skb) several times, with calls to the device's poll controller between
attempts.  And hopefully, if the first attempt fails, we will still
manage to get through?



diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 83140cbb5f01..d967937aca3c 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -5216,8 +5216,8 @@ static void e1000_netpoll(struct net_device *netdev)
 {
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 
-	disable_irq(adapter->pdev->irq);
-	e1000_intr(adapter->pdev->irq, netdev);
+	if (disable_irq_nosleep(adapter->pdev->irq))
+		e1000_intr(adapter->pdev->irq, netdev);
 	enable_irq(adapter->pdev->irq);
 }
 #endif
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 69517a24bc50..f2e4125ac963 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -184,6 +184,7 @@ extern void devm_free_irq(struct device *dev, unsigned int irq, void *dev_id);
 #endif
 
 extern void disable_irq_nosync(unsigned int irq);
+extern bool disable_irq_nosleep(unsigned int irq);
 extern void disable_irq(unsigned int irq);
 extern void disable_percpu_irq(unsigned int irq);
 extern void enable_irq(unsigned int irq);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 0a9104b4608b..58199b023845 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -106,6 +106,23 @@ void synchronize_irq(unsigned int irq)
 }
 EXPORT_SYMBOL(synchronize_irq);
 
+static bool synchronize_irq_nosleep(unsigned int irq)
+{
+	struct irq_desc *desc = irq_to_desc(irq);
+
+	if (desc) {
+		__synchronize_hardirq(desc);
+		/*
+		 * We made sure that no hardirq handler is
+		 * running. Now check if some threaded handlers are
+		 * active.
+		 */
+		return !atomic_read(&desc->threads_active);
+	}
+
+	return true;
+}
+
 #ifdef CONFIG_SMP
 cpumask_var_t irq_default_affinity;
 
@@ -418,6 +435,31 @@ void disable_irq_nosync(unsigned int irq)
 EXPORT_SYMBOL(disable_irq_nosync);
 
 /**
+ *	disable_irq_nosleep - disable an irq and wait for completion of hard IRQ handlers
+ *	@irq: Interrupt to disable
+ *
+ *	Disable the selected interrupt line.  Enables and Disables are
+ *	nested.
+ *	This function waits for any pending hard IRQ handlers for this
+ *	interrupt to complete before returning. If you use this
+ *	function while holding a resource the IRQ handler may need you
+ *	will deadlock.
+ *	This function does not wait for threaded IRQ handlers.
+ *
+ *      Returns true if synchronized, false if there are threaded
+ *      handlers pending.
+ *
+ *	This function may be called - with care - from IRQ context.
+ */
+bool disable_irq_nosleep(unsigned int irq)
+{
+	if (!__disable_irq_nosync(irq))
+		return synchronize_irq_nosleep(irq);
+	return true;
+}
+EXPORT_SYMBOL(disable_irq_nosleep);
+
+/**
  *	disable_irq - disable an irq and wait for completion
  *	@irq: Interrupt to disable
  *


-- 
Sabrina

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
  2014-12-11 21:45   ` Sabrina Dubroca
@ 2014-12-12  2:14     ` David Miller
  2014-12-12 22:01     ` Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: David Miller @ 2014-12-12  2:14 UTC (permalink / raw)
  To: sd; +Cc: netdev, peterz, tglx

From: Sabrina Dubroca <sd@queasysnail.net>
Date: Thu, 11 Dec 2014 22:45:37 +0100

> Introduce disable_irq_nosleep() that returns true if it successfully
> synchronized against all handlers (there was no threaded handler
> running), false if it left some threads running.  And in
> ->ndo_poll_controller, only call the interrupt handler if
> synchronization was successful.
> 
> Both users of the poll controllers retry their action (alloc/xmit an
> skb) several times, with calls to the device's poll controller between
> attempts.  And hopefully, if the first attempt fails, we will still
> manage to get through?

This looks more palatable to me.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
  2014-12-11 21:45   ` Sabrina Dubroca
  2014-12-12  2:14     ` David Miller
@ 2014-12-12 22:01     ` Thomas Gleixner
  2015-01-05 14:31       ` Sabrina Dubroca
  1 sibling, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2014-12-12 22:01 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: David Miller, netdev, peterz

On Thu, 11 Dec 2014, Sabrina Dubroca wrote:
> 2014-12-09, 21:44:33 -0500, David Miller wrote:
> > 
> > Adding a new spinlock to every interrupt service routine is
> > simply a non-starter.
> > 
> > You will certainly have to find a way to fix this in a way
> > that doesn't involve adding any new overhead to the normal
> > operational paths of these drivers.
> 
> Okay. Here is another idea.
> 
> Since the issue is with the wait_event() part of synchronize_irq(),
> and it only takes care of threaded handlers, maybe we could try not
> waiting for threaded handlers.
> 
> Introduce disable_irq_nosleep() that returns true if it successfully
> synchronized against all handlers (there was no threaded handler
> running), false if it left some threads running.  And in
> ->ndo_poll_controller, only call the interrupt handler if
> synchronization was successful.
> 
> Both users of the poll controllers retry their action (alloc/xmit an
> skb) several times, with calls to the device's poll controller between
> attempts.  And hopefully, if the first attempt fails, we will still
> manage to get through?

Hopefully is not a good starting point. Is the poll controller
definitely retrying? Otherwise you might end up with the following:

Interrupt line is shared between your network device and a
device which requested a threaded interrupt handler.

  CPU0	       		   	    CPU1
  interrupt()
    your_device_handler()
      return NONE;
    shared_device_handler()
      return WAKE_THREAD;
      --> atomic_inc(threads_active);
				    poll()
				      disable_irq_nosleep()
					sync_hardirq()
					return atomic_read(threads_active);

So if you do not have a reliable retry then you might just go into a
stale state. And this can happen if the interrupt type is edge because
we do not disable the interrupt when we wakeup the thread for obvious
reasons.

Aside of that I think that something like this is a reasonable
approach to the problem.

The only other nitpicks I have are:

    - The name of the function sucks, though my tired braain can't
      come up with something reasonable right now

    - The lack of extensive documentation how this interface is
      supposed to be used and the pitfals of abusage, both in the
      function documentation and the changelog.

      Merlily copying the existing documentation of the other
      interface is not sufficient.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
  2014-12-12 22:01     ` Thomas Gleixner
@ 2015-01-05 14:31       ` Sabrina Dubroca
  2015-02-05  0:20         ` Sabrina Dubroca
  0 siblings, 1 reply; 22+ messages in thread
From: Sabrina Dubroca @ 2015-01-05 14:31 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: David Miller, netdev, peterz

2014-12-12, 23:01:28 +0100, Thomas Gleixner wrote:
> On Thu, 11 Dec 2014, Sabrina Dubroca wrote:
> > 2014-12-09, 21:44:33 -0500, David Miller wrote:
> > > 
> > > Adding a new spinlock to every interrupt service routine is
> > > simply a non-starter.
> > > 
> > > You will certainly have to find a way to fix this in a way
> > > that doesn't involve adding any new overhead to the normal
> > > operational paths of these drivers.
> > 
> > Okay. Here is another idea.
> > 
> > Since the issue is with the wait_event() part of synchronize_irq(),
> > and it only takes care of threaded handlers, maybe we could try not
> > waiting for threaded handlers.
> > 
> > Introduce disable_irq_nosleep() that returns true if it successfully
> > synchronized against all handlers (there was no threaded handler
> > running), false if it left some threads running.  And in
> > ->ndo_poll_controller, only call the interrupt handler if
> > synchronization was successful.
> > 
> > Both users of the poll controllers retry their action (alloc/xmit an
> > skb) several times, with calls to the device's poll controller between
> > attempts.  And hopefully, if the first attempt fails, we will still
> > manage to get through?
> 
> Hopefully is not a good starting point. Is the poll controller
> definitely retrying? Otherwise you might end up with the following:
> 
> Interrupt line is shared between your network device and a
> device which requested a threaded interrupt handler.
> 
>   CPU0	       		   	    CPU1
>   interrupt()
>     your_device_handler()
>       return NONE;
>     shared_device_handler()
>       return WAKE_THREAD;
>       --> atomic_inc(threads_active);
> 				    poll()
> 				      disable_irq_nosleep()
> 					sync_hardirq()
> 					return atomic_read(threads_active);
> 
> So if you do not have a reliable retry then you might just go into a
> stale state. And this can happen if the interrupt type is edge because
> we do not disable the interrupt when we wakeup the thread for obvious
> reasons.

We do have loops retrying to run the netpoll controller, and trying to
do the work even if the controller doesn't help.  And by hopefully I
mean: even if we fail, we tried our best and netpoll isn't 100%
reliable.


static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
{
	...
repeat:

	skb = alloc_skb(len, GFP_ATOMIC);
	if (!skb)
		skb = skb_dequeue(&skb_pool);

	if (!skb) {
		if (++count < 10) {
			netpoll_poll_dev(np->dev);
			goto repeat;
		}
		return NULL;
	}

	...
}

void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
			     struct net_device *dev)
{
	...

		/* try until next clock tick */
		for (tries = jiffies_to_usecs(1)/USEC_PER_POLL;
		     tries > 0; --tries) {
			if (HARD_TX_TRYLOCK(dev, txq)) {
				if (!netif_xmit_stopped(txq))
					status = netpoll_start_xmit(skb, dev, txq);

				HARD_TX_UNLOCK(dev, txq);

				if (status == NETDEV_TX_OK)
					break;

			}

			/* tickle device maybe there is some cleanup */
			netpoll_poll_dev(np->dev);

			udelay(USEC_PER_POLL);
		}

	...
}



> Aside of that I think that something like this is a reasonable
> approach to the problem.
> 
> The only other nitpicks I have are:
> 
>     - The name of the function sucks, though my tired braain can't
>       come up with something reasonable right now

I couldn't think of anything better.  Maybe 'disable_irq_trysync' or
'disable_irq_hardsync'?

Or maybe you prefer something that works like spin_trylock, and
reenables the irq before returning if we can't sync?  Maybe the risk
of abuse would be a bit lower this way?

I made synchronize_irq_nosleep static, but maybe it should be
EXPORT_SYMBOL'ed as well.  I didn't need that for e1000, but that
would be more consistent.


>     - The lack of extensive documentation how this interface is
>       supposed to be used and the pitfals of abusage, both in the
>       function documentation and the changelog.
> 
>       Merlily copying the existing documentation of the other
>       interface is not sufficient.


Yes, my email wasn't really a changelog, just a description and RFC.


Modified documentation:

-----
disable_irq_nosleep - disable an irq and wait for completion of hard IRQ handlers
@irq: Interrupt to disable

Disable the selected interrupt line.  Enables and Disables are
nested.
This function does not sleep, and is safe to call in atomic context.

This function waits for any pending hard IRQ handlers for this
interrupt to complete before returning. If you use this
function while holding a resource the IRQ handler may need you
will deadlock.

This function does not wait for threaded IRQ handlers.
Returns true if synchronized, false if there are threaded
handlers pending.

If false is returned, the caller must assume that synchronization
didn't occur, and that it is NOT safe to proceed.
The caller MUST reenable the interrupt by calling enable_irq in all
cases.

This function may be called - with care - from IRQ context.
-----


Thanks.

--
Sabrina

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
  2015-01-05 14:31       ` Sabrina Dubroca
@ 2015-02-05  0:20         ` Sabrina Dubroca
  2015-02-05 13:06           ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Sabrina Dubroca @ 2015-02-05  0:20 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: David Miller, netdev, peterz

Thomas, ping?

thread is over there if you need it:
https://marc.info/?l=linux-netdev&m=141833435000554&w=2

2015-01-05, 15:31:26 +0100, Sabrina Dubroca wrote:
> 2014-12-12, 23:01:28 +0100, Thomas Gleixner wrote:
> > On Thu, 11 Dec 2014, Sabrina Dubroca wrote:
> > > 2014-12-09, 21:44:33 -0500, David Miller wrote:
> > > > 
> > > > Adding a new spinlock to every interrupt service routine is
> > > > simply a non-starter.
> > > > 
> > > > You will certainly have to find a way to fix this in a way
> > > > that doesn't involve adding any new overhead to the normal
> > > > operational paths of these drivers.
> > > 
> > > Okay. Here is another idea.
> > > 
> > > Since the issue is with the wait_event() part of synchronize_irq(),
> > > and it only takes care of threaded handlers, maybe we could try not
> > > waiting for threaded handlers.
> > > 
> > > Introduce disable_irq_nosleep() that returns true if it successfully
> > > synchronized against all handlers (there was no threaded handler
> > > running), false if it left some threads running.  And in
> > > ->ndo_poll_controller, only call the interrupt handler if
> > > synchronization was successful.
> > > 
> > > Both users of the poll controllers retry their action (alloc/xmit an
> > > skb) several times, with calls to the device's poll controller between
> > > attempts.  And hopefully, if the first attempt fails, we will still
> > > manage to get through?
> > 
> > Hopefully is not a good starting point. Is the poll controller
> > definitely retrying? Otherwise you might end up with the following:
> > 
> > Interrupt line is shared between your network device and a
> > device which requested a threaded interrupt handler.
> > 
> >   CPU0	       		   	    CPU1
> >   interrupt()
> >     your_device_handler()
> >       return NONE;
> >     shared_device_handler()
> >       return WAKE_THREAD;
> >       --> atomic_inc(threads_active);
> > 				    poll()
> > 				      disable_irq_nosleep()
> > 					sync_hardirq()
> > 					return atomic_read(threads_active);
> > 
> > So if you do not have a reliable retry then you might just go into a
> > stale state. And this can happen if the interrupt type is edge because
> > we do not disable the interrupt when we wakeup the thread for obvious
> > reasons.
> 
> We do have loops retrying to run the netpoll controller, and trying to
> do the work even if the controller doesn't help.  And by hopefully I
> mean: even if we fail, we tried our best and netpoll isn't 100%
> reliable.
> 
> 
> static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
> {
> 	...
> repeat:
> 
> 	skb = alloc_skb(len, GFP_ATOMIC);
> 	if (!skb)
> 		skb = skb_dequeue(&skb_pool);
> 
> 	if (!skb) {
> 		if (++count < 10) {
> 			netpoll_poll_dev(np->dev);
> 			goto repeat;
> 		}
> 		return NULL;
> 	}
> 
> 	...
> }
> 
> void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
> 			     struct net_device *dev)
> {
> 	...
> 
> 		/* try until next clock tick */
> 		for (tries = jiffies_to_usecs(1)/USEC_PER_POLL;
> 		     tries > 0; --tries) {
> 			if (HARD_TX_TRYLOCK(dev, txq)) {
> 				if (!netif_xmit_stopped(txq))
> 					status = netpoll_start_xmit(skb, dev, txq);
> 
> 				HARD_TX_UNLOCK(dev, txq);
> 
> 				if (status == NETDEV_TX_OK)
> 					break;
> 
> 			}
> 
> 			/* tickle device maybe there is some cleanup */
> 			netpoll_poll_dev(np->dev);
> 
> 			udelay(USEC_PER_POLL);
> 		}
> 
> 	...
> }
> 
> 
> 
> > Aside of that I think that something like this is a reasonable
> > approach to the problem.
> > 
> > The only other nitpicks I have are:
> > 
> >     - The name of the function sucks, though my tired braain can't
> >       come up with something reasonable right now
> 
> I couldn't think of anything better.  Maybe 'disable_irq_trysync' or
> 'disable_irq_hardsync'?
> 
> Or maybe you prefer something that works like spin_trylock, and
> reenables the irq before returning if we can't sync?  Maybe the risk
> of abuse would be a bit lower this way?
> 
> I made synchronize_irq_nosleep static, but maybe it should be
> EXPORT_SYMBOL'ed as well.  I didn't need that for e1000, but that
> would be more consistent.
> 
> 
> >     - The lack of extensive documentation how this interface is
> >       supposed to be used and the pitfals of abusage, both in the
> >       function documentation and the changelog.
> > 
> >       Merlily copying the existing documentation of the other
> >       interface is not sufficient.
> 
> 
> Yes, my email wasn't really a changelog, just a description and RFC.
> 
> 
> Modified documentation:
> 
> -----
> disable_irq_nosleep - disable an irq and wait for completion of hard IRQ handlers
> @irq: Interrupt to disable
> 
> Disable the selected interrupt line.  Enables and Disables are
> nested.
> This function does not sleep, and is safe to call in atomic context.
> 
> This function waits for any pending hard IRQ handlers for this
> interrupt to complete before returning. If you use this
> function while holding a resource the IRQ handler may need you
> will deadlock.
> 
> This function does not wait for threaded IRQ handlers.
> Returns true if synchronized, false if there are threaded
> handlers pending.
> 
> If false is returned, the caller must assume that synchronization
> didn't occur, and that it is NOT safe to proceed.
> The caller MUST reenable the interrupt by calling enable_irq in all
> cases.
> 
> This function may be called - with care - from IRQ context.
> -----
> 
> 
> Thanks.
> 
> --
> Sabrina
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
  2015-02-05  0:20         ` Sabrina Dubroca
@ 2015-02-05 13:06           ` Peter Zijlstra
  2015-02-05 15:33             ` Sabrina Dubroca
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2015-02-05 13:06 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: Thomas Gleixner, David Miller, netdev

On Thu, Feb 05, 2015 at 01:20:59AM +0100, Sabrina Dubroca wrote:
> Thomas, ping?
> 
> thread is over there if you need it:
> https://marc.info/?l=linux-netdev&m=141833435000554&w=2

It appears to me you have addressed most his issues and since it appears
to me the existing synchronze_hardirq() seems to be mostly what we need
I'll propose (and merge) the below and will take flames from tglx when
he returns :-)

---
Subject: genirq: Provide disable_hardirq()

For things like netpoll there is a need to disable an interrupt from
atomic context. Currently netpoll uses disable_irq() which will
sleep-wait on threaded handlers and thus forced_irqthreads breaks
things.

Provide disable_hardirq(), which uses synchronize_hardirq() to only wait
for active hardirq handlers; also change synchronize_hardirq() to
return the status of threaded handlers.

This will allow one to try-disable an interrupt from atomic context, or
in case of request_threaded_irq() to only wait for the hardirq part.

Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/hardirq.h   |  2 +-
 include/linux/interrupt.h |  1 +
 kernel/irq/manage.c       | 36 ++++++++++++++++++++++++++++++++++--
 3 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index cba442ec3c66..f4af03404b97 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -9,7 +9,7 @@
 
 
 extern void synchronize_irq(unsigned int irq);
-extern void synchronize_hardirq(unsigned int irq);
+extern bool synchronize_hardirq(unsigned int irq);
 
 #if defined(CONFIG_TINY_RCU)
 
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index d9b05b5bf8c7..3bb01b9a379c 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -184,6 +184,7 @@ extern void devm_free_irq(struct device *dev, unsigned int irq, void *dev_id);
 #endif
 
 extern void disable_irq_nosync(unsigned int irq);
+extern bool disable_hardirq(unsigned int irq);
 extern void disable_irq(unsigned int irq);
 extern void disable_percpu_irq(unsigned int irq);
 extern void enable_irq(unsigned int irq);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index f038e586a4b9..1309cccd714f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -68,14 +68,20 @@ static void __synchronize_hardirq(struct irq_desc *desc)
  *	Do not use this for shutdown scenarios where you must be sure
  *	that all parts (hardirq and threaded handler) have completed.
  *
+ *	Returns: false if a theaded handler is active.
+ *
  *	This function may be called - with care - from IRQ context.
  */
-void synchronize_hardirq(unsigned int irq)
+bool synchronize_hardirq(unsigned int irq)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
 
-	if (desc)
+	if (desc) {
 		__synchronize_hardirq(desc);
+		return !atomic_read(&desc->threads_active);
+	}
+
+	return true;
 }
 EXPORT_SYMBOL(synchronize_hardirq);
 
@@ -439,6 +445,32 @@ void disable_irq(unsigned int irq)
 }
 EXPORT_SYMBOL(disable_irq);
 
+/**
+ *	disable_hardirq - disables an irq and waits for hardirq completion
+ *	@irq: Interrupt to disable
+ *
+ *	Disable the selected interrupt line.  Enables and Disables are
+ *	nested.
+ *	This function waits for any pending hard IRQ handlers for this
+ *	interrupt to complete before returning. If you use this function while
+ *	holding a resource the hard IRQ handler may need you will deadlock.
+ *
+ *	When used to optimistically disable an interrupt from atomic context
+ *	the return value must be checked.
+ *
+ *	Returns: false if a theaded handler is active.
+ *
+ *	This function may be called - with care - from IRQ context.
+ */
+bool disable_hardirq(unsigned int irq)
+{
+	if (!__disable_irq_nosync(irq))
+		return synchronize_hardirq(irq);
+
+	return false;
+}
+EXPORT_SYMBOL(disable_hardirq);
+
 void __enable_irq(struct irq_desc *desc, unsigned int irq)
 {
 	switch (desc->depth) {

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
  2015-02-05 13:06           ` Peter Zijlstra
@ 2015-02-05 15:33             ` Sabrina Dubroca
  0 siblings, 0 replies; 22+ messages in thread
From: Sabrina Dubroca @ 2015-02-05 15:33 UTC (permalink / raw)
  To: Peter Zijlstra, David Miller; +Cc: Thomas Gleixner, netdev

2015-02-05, 14:06:23 +0100, Peter Zijlstra wrote:
> On Thu, Feb 05, 2015 at 01:20:59AM +0100, Sabrina Dubroca wrote:
> > Thomas, ping?
> > 
> > thread is over there if you need it:
> > https://marc.info/?l=linux-netdev&m=141833435000554&w=2
> 
> It appears to me you have addressed most his issues and since it appears
> to me the existing synchronze_hardirq() seems to be mostly what we need
> I'll propose (and merge) the below and will take flames from tglx when
> he returns :-)

Great, thanks a lot Peter!

I will prepare the drivers patches based on this.

David, do you want one patch for each driver, or do you want me to group the changes?
There are 60+ drivers to patch, and almost all the patches are just:

-	disable_irq(irq);
-	handler(a, b);
+	if (disable_hardirq(irq))
+		handler(a,b);

> ---
> Subject: genirq: Provide disable_hardirq()
> 
> For things like netpoll there is a need to disable an interrupt from
> atomic context. Currently netpoll uses disable_irq() which will
> sleep-wait on threaded handlers and thus forced_irqthreads breaks
> things.
> 
> Provide disable_hardirq(), which uses synchronize_hardirq() to only wait
> for active hardirq handlers; also change synchronize_hardirq() to
> return the status of threaded handlers.
> 
> This will allow one to try-disable an interrupt from atomic context, or
> in case of request_threaded_irq() to only wait for the hardirq part.
> 
> Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  include/linux/hardirq.h   |  2 +-
>  include/linux/interrupt.h |  1 +
>  kernel/irq/manage.c       | 36 ++++++++++++++++++++++++++++++++++--
>  3 files changed, 36 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index cba442ec3c66..f4af03404b97 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -9,7 +9,7 @@
>  
>  
>  extern void synchronize_irq(unsigned int irq);
> -extern void synchronize_hardirq(unsigned int irq);
> +extern bool synchronize_hardirq(unsigned int irq);
>  
>  #if defined(CONFIG_TINY_RCU)
>  
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index d9b05b5bf8c7..3bb01b9a379c 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -184,6 +184,7 @@ extern void devm_free_irq(struct device *dev, unsigned int irq, void *dev_id);
>  #endif
>  
>  extern void disable_irq_nosync(unsigned int irq);
> +extern bool disable_hardirq(unsigned int irq);
>  extern void disable_irq(unsigned int irq);
>  extern void disable_percpu_irq(unsigned int irq);
>  extern void enable_irq(unsigned int irq);
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index f038e586a4b9..1309cccd714f 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -68,14 +68,20 @@ static void __synchronize_hardirq(struct irq_desc *desc)
>   *	Do not use this for shutdown scenarios where you must be sure
>   *	that all parts (hardirq and threaded handler) have completed.
>   *
> + *	Returns: false if a theaded handler is active.
> + *
>   *	This function may be called - with care - from IRQ context.
>   */
> -void synchronize_hardirq(unsigned int irq)
> +bool synchronize_hardirq(unsigned int irq)
>  {
>  	struct irq_desc *desc = irq_to_desc(irq);
>  
> -	if (desc)
> +	if (desc) {
>  		__synchronize_hardirq(desc);
> +		return !atomic_read(&desc->threads_active);
> +	}
> +
> +	return true;
>  }
>  EXPORT_SYMBOL(synchronize_hardirq);
>  
> @@ -439,6 +445,32 @@ void disable_irq(unsigned int irq)
>  }
>  EXPORT_SYMBOL(disable_irq);
>  
> +/**
> + *	disable_hardirq - disables an irq and waits for hardirq completion
> + *	@irq: Interrupt to disable
> + *
> + *	Disable the selected interrupt line.  Enables and Disables are
> + *	nested.
> + *	This function waits for any pending hard IRQ handlers for this
> + *	interrupt to complete before returning. If you use this function while
> + *	holding a resource the hard IRQ handler may need you will deadlock.
> + *
> + *	When used to optimistically disable an interrupt from atomic context
> + *	the return value must be checked.
> + *
> + *	Returns: false if a theaded handler is active.
> + *
> + *	This function may be called - with care - from IRQ context.
> + */
> +bool disable_hardirq(unsigned int irq)
> +{
> +	if (!__disable_irq_nosync(irq))
> +		return synchronize_hardirq(irq);
> +
> +	return false;
> +}
> +EXPORT_SYMBOL(disable_hardirq);
> +
>  void __enable_irq(struct irq_desc *desc, unsigned int irq)
>  {
>  	switch (desc->depth) {
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Sabrina

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2015-02-05 15:33 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-09 14:37 [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 01/11] netpoll: introduce netpoll_irq_lock to protect netpoll controller against interrupts Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 02/11] e1000: remove disable_irq from netpoll controller, use netpoll_irq_lock Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 03/11] 8139cp/too: remove disable_irq from netpoll controller Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 04/11] atl1c: remove disable_irq from netpoll controller, use netpoll_irq_lock Sabrina Dubroca
     [not found]   ` <CAMXMK6t5NfPQFBxK1Qny45LCS6rwX4Ys1n4C7fsTPHXu=x_vuQ@mail.gmail.com>
2014-12-09 17:23     ` Sabrina Dubroca
2014-12-09 21:17       ` Chris Snook
2014-12-09 14:37 ` [RFC PATCH net-next 05/11] bnx2: " Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 06/11] s2io: " Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 07/11] pasemi: " Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 08/11] ll_temac: " Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 09/11] xilinx/axienet: " Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 10/11] gianfar: " Sabrina Dubroca
2014-12-09 14:37 ` [RFC PATCH net-next 11/11] net: fec: " Sabrina Dubroca
2014-12-10  2:44 ` [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller David Miller
2014-12-11 21:45   ` Sabrina Dubroca
2014-12-12  2:14     ` David Miller
2014-12-12 22:01     ` Thomas Gleixner
2015-01-05 14:31       ` Sabrina Dubroca
2015-02-05  0:20         ` Sabrina Dubroca
2015-02-05 13:06           ` Peter Zijlstra
2015-02-05 15:33             ` Sabrina Dubroca

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).