netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] PHYLIB: IRQ event workqueue handling fixes
@ 2007-09-19 14:38 Maciej W. Rozycki
  2007-09-20 23:53 ` Andrew Morton
  2007-10-15 12:53 ` Jarek Poplawski
  0 siblings, 2 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2007-09-19 14:38 UTC (permalink / raw)
  To: Andy Fleming, Andrew Morton, Jeff Garzik; +Cc: netdev, linux-kernel

 Keep track of disable_irq_nosync() invocations and call enable_irq() the 
right number of times if work has been cancelled that would include them.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
 Now that the call to flush_work_keventd() (problematic because of 
rtnl_mutex being held) has been replaced by cancel_work_sync() another 
issue has arisen and been left unresolved.  As the MDIO bus cannot be 
accessed from the interrupt context the PHY interrupt handler uses 
disable_irq_nosync() to prevent from looping and schedules some work to be 
done as a softirq, which, apart from handling the state change of the 
originating PHY, is responsible for reenabling the interrupt.  Now if the 
interrupt line is shared by another device and a call to the softirq 
handler has been cancelled, that call to enable_irq() never happens and 
the other device cannot use its interrupt anymore as its stuck disabled.

 I decided to use a counter rather than a flag because there may be more 
than one call to phy_change() cancelled in the queue -- a real one and a 
fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else.  
Therefore because of its nesting property enable_irq() has to be called 
the right number of times to match the number disable_irq_nosync() was 
called and restore the original state.  This DEBUG_SHIRQ feature is also 
the reason why free_irq() has to be called before cancel_work_sync().

 While at it I updated the comment about phy_stop_interrupts() being 
called from `keventd' -- this is no longer relevant as the use of 
cancel_work_sync() makes such an approach unnecessary.  OTOH a similar 
comment referring to flush_scheduled_work() in phy_stop() still applies as 
using cancel_work_sync() there would be dangerous.

 Checked with checkpatch.pl and at the run time (with and without 
DEBUG_SHIRQ).

 Please apply.

  Maciej
 
patch-mips-2.6.23-rc5-20070904-phy-irq-fix-9
diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c
--- linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c	2007-09-16 17:17:20.000000000 +0000
+++ linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c	2007-09-18 14:28:07.000000000 +0000
@@ -7,7 +7,7 @@
  * Author: Andy Fleming
  *
  * Copyright (c) 2004 Freescale Semiconductor, Inc.
- * Copyright (c) 2006  Maciej W. Rozycki
+ * Copyright (c) 2006, 2007  Maciej W. Rozycki
  *
  * This program is free software; you can redistribute  it and/or modify it
  * under  the terms of  the GNU General  Public License as published by the
@@ -35,6 +35,7 @@
 #include <linux/timer.h>
 #include <linux/workqueue.h>
 
+#include <asm/atomic.h>
 #include <asm/io.h>
 #include <asm/irq.h>
 #include <asm/uaccess.h>
@@ -561,6 +562,7 @@ static irqreturn_t phy_interrupt(int irq
 	 * queue will write the PHY to disable and clear the
 	 * interrupt, and then reenable the irq line. */
 	disable_irq_nosync(irq);
+	atomic_inc(&phydev->irq_disable);
 
 	schedule_work(&phydev->phy_queue);
 
@@ -631,6 +633,7 @@ int phy_start_interrupts(struct phy_devi
 
 	INIT_WORK(&phydev->phy_queue, phy_change);
 
+	atomic_set(&phydev->irq_disable, 0);
 	if (request_irq(phydev->irq, phy_interrupt,
 				IRQF_SHARED,
 				"phy_interrupt",
@@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic
 	if (err)
 		phy_error(phydev);
 
+	free_irq(phydev->irq, phydev);
+
 	/*
-	 * Finish any pending work; we might have been scheduled to be called
-	 * from keventd ourselves, but cancel_work_sync() handles that.
+	 * Cannot call flush_scheduled_work() here as desired because
+	 * of rtnl_lock(), but we do not really care about what would
+	 * be done, except from enable_irq(), so cancel any work
+	 * possibly pending and take care of the matter below.
 	 */
 	cancel_work_sync(&phydev->phy_queue);
-
-	free_irq(phydev->irq, phydev);
+	/*
+	 * If work indeed has been cancelled, disable_irq() will have
+	 * been left unbalanced from phy_interrupt() and enable_irq()
+	 * has to be called so that other devices on the line work.
+	 */
+	while (atomic_dec_return(&phydev->irq_disable) >= 0)
+		enable_irq(phydev->irq);
 
 	return err;
 }
@@ -694,6 +706,7 @@ static void phy_change(struct work_struc
 		phydev->state = PHY_CHANGELINK;
 	spin_unlock(&phydev->lock);
 
+	atomic_dec(&phydev->irq_disable);
 	enable_irq(phydev->irq);
 
 	/* Reenable interrupts */
@@ -707,6 +720,7 @@ static void phy_change(struct work_struc
 
 irq_enable_err:
 	disable_irq(phydev->irq);
+	atomic_inc(&phydev->irq_disable);
 phy_err:
 	phy_error(phydev);
 }
diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/include/linux/phy.h linux-mips-2.6.23-rc5-20070904/include/linux/phy.h
--- linux-mips-2.6.23-rc5-20070904.macro/include/linux/phy.h	2007-07-10 04:56:57.000000000 +0000
+++ linux-mips-2.6.23-rc5-20070904/include/linux/phy.h	2007-09-11 23:05:41.000000000 +0000
@@ -25,6 +25,8 @@
 #include <linux/timer.h>
 #include <linux/workqueue.h>
 
+#include <asm/atomic.h>
+
 #define PHY_BASIC_FEATURES	(SUPPORTED_10baseT_Half | \
 				 SUPPORTED_10baseT_Full | \
 				 SUPPORTED_100baseT_Half | \
@@ -281,6 +283,7 @@ struct phy_device {
 	/* Interrupt and Polling infrastructure */
 	struct work_struct phy_queue;
 	struct timer_list phy_timer;
+	atomic_t irq_disable;
 
 	spinlock_t lock;
 

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2007-10-23  9:18 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-19 14:38 [PATCH] PHYLIB: IRQ event workqueue handling fixes Maciej W. Rozycki
2007-09-20 23:53 ` Andrew Morton
2007-09-21 12:51   ` Maciej W. Rozycki
2007-09-21 18:42     ` Andrew Morton
2007-10-15 12:53 ` Jarek Poplawski
2007-10-15 17:03   ` Maciej W. Rozycki
2007-10-16  6:21     ` Jarek Poplawski
2007-10-16 17:19       ` Maciej W. Rozycki
2007-10-17  8:58         ` Jarek Poplawski
2007-10-17  9:08           ` Benjamin Herrenschmidt
2007-10-17  9:09           ` Jarek Poplawski
2007-10-18  6:31           ` Jarek Poplawski
2007-10-18  7:05             ` [PATCH] flush_work_sync vs. flush_scheduled_work " Jarek Poplawski
2007-10-18 15:48               ` Oleg Nesterov
2007-10-18 15:58                 ` Maciej W. Rozycki
2007-10-19  7:50                 ` Jarek Poplawski
2007-10-19  8:01                   ` Jarek Poplawski
2007-10-22  6:11                   ` Jarek Poplawski
2007-10-22 18:02                     ` Oleg Nesterov
2007-10-23  6:59                       ` Jarek Poplawski
2007-10-23  9:21                       ` Jarek Poplawski
2007-10-19  8:00                 ` Johannes Berg
2007-10-18 11:37             ` Maciej W. Rozycki
2007-10-18 11:30           ` Maciej W. Rozycki
2007-10-18 14:37             ` Jarek Poplawski
2007-10-18 15:31               ` Maciej W. Rozycki
2007-10-19  8:17             ` Jarek Poplawski
2007-10-19 12:57               ` Maciej W. Rozycki
2007-10-19 11:38             ` Maciej W. Rozycki
2007-10-19 14:39               ` Jarek Poplawski
2007-10-19 17:58                 ` Maciej W. Rozycki
2007-10-19 21:46                 ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).