netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* PATCH 2.6.17-rc5 tulip free_irq() called too late
@ 2006-05-31 19:52 Grant Grundler
  2006-06-08 14:43 ` Jeff Garzik
  0 siblings, 1 reply; 30+ messages in thread
From: Grant Grundler @ 2006-05-31 19:52 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: netdev

Jeff,
SLES10 testing exposed an MCA that was confirmed to be a DMA IO TLB miss.
This means tulip device was attempting to DMA to memory that was already
unmapped. The test was crashing in the "ifconfig down" step when a 4-port
tulip card was under this work load:

while :
do
	ifconfig eth24 up
	ifconfig eth25 up
	ifconfig eth26 up
	ifconfig eth27 up
        # Pound both interfaces with ethtool
        for i in `seq 1000`
        do
                ethtool eth24 &>/dev/null
                ethtool eth25 &>/dev/null
                ethtool eth26 &>/dev/null
                ethtool eth27 &>/dev/null
        done

	# Bring interfaces down
        echo ifconfig $nic1 down
        ifconfig eth24 down
        ifconfig eth25 down
        ifconfig eth26 down
        ifconfig eth27 down

        sleep 5
done


[ And yes, I know tulip doesn't support ethtool. Don't ask.
  It's still a sore point at the moment. Just consider it 
  a delay loop or use "sleep 5" instead. ]

The real "network load" comes from another box(en) running 4 instances
of "ping -f -s 1450 192.168.x.y" where "x.y" is the subnet/IP of eth24-27.
The parisc and ia64 machines will crash in minutes.

I believe the problem is a race condition between an interrupt coming
in and the tulip_down() code path. Moving the "free_irq()" to before
tulip_down() call fixes the problem. I've been able to run the above
test for several hours now.
Please apply.

thanks,
grant


Signed-off-by: Grant Grundler <grundler@parisc-linux.org>

--- a/drivers/net/tulip/tulip_core.c
+++ b/drivers/net/tulip/tulip_core.c
@@ -18,11 +18,11 @@
 
 #define DRV_NAME	"tulip"
 #ifdef CONFIG_TULIP_NAPI
-#define DRV_VERSION    "1.1.13-NAPI" /* Keep at least for test */
+#define DRV_VERSION    "1.1.14-NAPI" /* Keep at least for test */
 #else
-#define DRV_VERSION	"1.1.13"
+#define DRV_VERSION	"1.1.14"
 #endif
-#define DRV_RELDATE	"December 15, 2004"
+#define DRV_RELDATE	"May 31, 2006"
 
 
 #include <linux/module.h>
@@ -772,14 +774,13 @@ static int tulip_close (struct net_devic
 	int i;
 
 	netif_stop_queue (dev);
-
+	free_irq (dev->irq, dev);    /* don't let IRQs race w/tulip_down() */
 	tulip_down (dev);
 
 	if (tulip_debug > 1)
 		printk (KERN_DEBUG "%s: Shutting down ethercard, status was %2.2x.\n",
 			dev->name, ioread32 (ioaddr + CSR5));
 
-	free_irq (dev->irq, dev);
 
 	/* Free all the skbuffs in the Rx queue. */
 	for (i = 0; i < RX_RING_SIZE; i++) {

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
  2006-05-31 19:52 PATCH 2.6.17-rc5 tulip free_irq() called too late Grant Grundler
@ 2006-06-08 14:43 ` Jeff Garzik
  2006-06-08 15:22   ` Grant Grundler
  2006-06-08 17:01   ` Grant Grundler
  0 siblings, 2 replies; 30+ messages in thread
From: Jeff Garzik @ 2006-06-08 14:43 UTC (permalink / raw)
  To: Grant Grundler, Andrew Morton; +Cc: netdev, Val Henson


(CC'ing our newly minted tulip maintainer, Val)

Grant Grundler wrote:
> Jeff,
> SLES10 testing exposed an MCA that was confirmed to be a DMA IO TLB miss.
> This means tulip device was attempting to DMA to memory that was already
> unmapped. The test was crashing in the "ifconfig down" step when a 4-port
> tulip card was under this work load:
> 
> while :
> do
> 	ifconfig eth24 up
> 	ifconfig eth25 up
> 	ifconfig eth26 up
> 	ifconfig eth27 up
>         # Pound both interfaces with ethtool
>         for i in `seq 1000`
>         do
>                 ethtool eth24 &>/dev/null
>                 ethtool eth25 &>/dev/null
>                 ethtool eth26 &>/dev/null
>                 ethtool eth27 &>/dev/null
>         done
> 
> 	# Bring interfaces down
>         echo ifconfig $nic1 down
>         ifconfig eth24 down
>         ifconfig eth25 down
>         ifconfig eth26 down
>         ifconfig eth27 down
> 
>         sleep 5
> done
> 
> 
> [ And yes, I know tulip doesn't support ethtool. Don't ask.
>   It's still a sore point at the moment. Just consider it 
>   a delay loop or use "sleep 5" instead. ]
> 
> The real "network load" comes from another box(en) running 4 instances
> of "ping -f -s 1450 192.168.x.y" where "x.y" is the subnet/IP of eth24-27.
> The parisc and ia64 machines will crash in minutes.
> 
> I believe the problem is a race condition between an interrupt coming
> in and the tulip_down() code path. Moving the "free_irq()" to before
> tulip_down() call fixes the problem. I've been able to run the above
> test for several hours now.

NAK.  This is a band-aid, and one that creates new problems even as it 
attempts to solve problems.

Calling free_irq() while the chip is still active is just a bad idea, 
because the chip could raise an interrupt, creating a 
screaming-interrupts situation.  Consider especially the case of shared 
interrupts here, as a concrete example of how this won't work.

Perhaps cp_close() in 8139cp.c could be an example of a good ordering? 
It stops the chip, syncs irqs, frees irq, then frees [thus unmapping] 
the rings.

	Jeff





^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
  2006-06-08 14:43 ` Jeff Garzik
@ 2006-06-08 15:22   ` Grant Grundler
  2006-06-08 15:32     ` Grant Grundler
  2006-06-08 15:32     ` Jeff Garzik
  2006-06-08 17:01   ` Grant Grundler
  1 sibling, 2 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-08 15:22 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Grant Grundler, Andrew Morton, netdev, Val Henson

On Thu, Jun 08, 2006 at 10:43:04AM -0400, Jeff Garzik wrote:
> (CC'ing our newly minted tulip maintainer, Val)

Excellent!
Has MAINTAINERS file been updated? :)

...
> NAK.  This is a band-aid, and one that creates new problems even as it 
> attempts to solve problems.

You failed to demonstrate that below. Any other objections?

> Calling free_irq() while the chip is still active is just a bad idea, 
> because the chip could raise an interrupt, creating a 
> screaming-interrupts situation. Consider especially the case of shared 
> interrupts here, as a concrete example of how this won't work.

The chip IRQ gets turned off in tulip_down().
It won't be screaming for very long.
No one will hear it screaming and no one will care.
The decision to bring down the NIC was already made.

Secondly, if tulip has a problem with shared IRQs, it's already there.
We would have called free_irq() anyway - just a bit later.
In the shared IRQ case, I expect free_irq() to unlink this instance
of the tulip interrupt handler from the CPU vector list. If that
doesn't happen, I could consider that an arch specific bug,
not a tulip driver bug.

> Perhaps cp_close() in 8139cp.c could be an example of a good ordering? 
> It stops the chip, syncs irqs, frees irq, then frees [thus unmapping] 
> the rings.

Sorry, I don't see how it matters if we disable chip IRQ first
or unlink from CPU IRQ list first. Does it?

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
  2006-06-08 15:22   ` Grant Grundler
@ 2006-06-08 15:32     ` Grant Grundler
  2006-06-08 15:38       ` Jeff Garzik
  2006-06-08 15:32     ` Jeff Garzik
  1 sibling, 1 reply; 30+ messages in thread
From: Grant Grundler @ 2006-06-08 15:32 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Jeff Garzik, Andrew Morton, netdev, Val Henson

On Thu, Jun 08, 2006 at 09:22:21AM -0600, Grant Grundler wrote:
> > Perhaps cp_close() in 8139cp.c could be an example of a good ordering? 
> > It stops the chip, syncs irqs, frees irq, then frees [thus unmapping] 
> > the rings.
> 
> Sorry, I don't see how it matters if we disable chip IRQ first
> or unlink from CPU IRQ list first. Does it?

Ok...I think I understand what you are driving at here.
The case is when CPU vector is enabled and shared but
one device _without_ an interrupt handler is registered
is still yanking on the interrupt line. It will cause
linux to disable the line since the IRQ isn't being handled.

This is only possible in the shared IRQ case.

Can we call free_irq() from tulip_down()?
I have the feeling this is going to cause alot more code movement.

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
  2006-06-08 15:22   ` Grant Grundler
  2006-06-08 15:32     ` Grant Grundler
@ 2006-06-08 15:32     ` Jeff Garzik
  2006-06-08 15:36       ` Grant Grundler
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff Garzik @ 2006-06-08 15:32 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Andrew Morton, netdev, Val Henson

Grant Grundler wrote:
> On Thu, Jun 08, 2006 at 10:43:04AM -0400, Jeff Garzik wrote:
>> (CC'ing our newly minted tulip maintainer, Val)
> 
> Excellent!
> Has MAINTAINERS file been updated? :)

It should be updated, yes.


>> Calling free_irq() while the chip is still active is just a bad idea, 
>> because the chip could raise an interrupt, creating a 
>> screaming-interrupts situation. Consider especially the case of shared 
>> interrupts here, as a concrete example of how this won't work.
> 
> The chip IRQ gets turned off in tulip_down().
> It won't be screaming for very long.

Then you admit that you add a race.


> No one will hear it screaming and no one will care.

False.  Easy counter-example already provided:  Shared interrupts.

For some IRQ architectures, even without shared interrupts, this could 
easily trigger the kernel's screaming-irq detection code.


> Secondly, if tulip has a problem with shared IRQs, it's already there.
> We would have called free_irq() anyway - just a bit later.

Yes... AFTER we told the chip to stop DMA'ing, and delivering interrupts.

I'm frankly surprised you don't see the obvious, natural ordering.  You 
stop the hardware from wanting to deliver interrupts, before 
unregistering the irq handler.  IOW you turn out the lights, before 
leaving the room.


> In the shared IRQ case, I expect free_irq() to unlink this instance
> of the tulip interrupt handler from the CPU vector list. If that

Irrelevant -- that doesn't stop the tulip hardware from raising the irq, 
nor stop the system from attempting to deliver it [in all cases].  In 
the shared interrupt case, that means that another driver's irq handler 
will be called for an event tulip hardware raised.

	Jeff



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
  2006-06-08 15:32     ` Jeff Garzik
@ 2006-06-08 15:36       ` Grant Grundler
  0 siblings, 0 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-08 15:36 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Grant Grundler, Andrew Morton, netdev, Val Henson

On Thu, Jun 08, 2006 at 11:32:39AM -0400, Jeff Garzik wrote:
> >The chip IRQ gets turned off in tulip_down().
> >It won't be screaming for very long.
> 
> Then you admit that you add a race.

Yes - I realized that after I hit <send> :(

...
> >In the shared IRQ case, I expect free_irq() to unlink this instance
> >of the tulip interrupt handler from the CPU vector list. If that
> 
> Irrelevant -- that doesn't stop the tulip hardware from raising the irq, 
> nor stop the system from attempting to deliver it [in all cases].  In 
> the shared interrupt case, that means that another driver's irq handler 
> will be called for an event tulip hardware raised.

*nod*

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
  2006-06-08 15:32     ` Grant Grundler
@ 2006-06-08 15:38       ` Jeff Garzik
  2006-06-08 15:47         ` Grant Grundler
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff Garzik @ 2006-06-08 15:38 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Andrew Morton, netdev, Val Henson

Grant Grundler wrote:
> Ok...I think I understand what you are driving at here.
> The case is when CPU vector is enabled and shared but
> one device _without_ an interrupt handler is registered
> is still yanking on the interrupt line. It will cause
> linux to disable the line since the IRQ isn't being handled.

Correct.


> Can we call free_irq() from tulip_down()?

I'm sure you can answer that yourself.  If it doesn't cause problems 
elsewhere, yes.  Otherwise, no.  :)

Did you read the example I cited, cp_close() ?

	Jeff



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
  2006-06-08 15:38       ` Jeff Garzik
@ 2006-06-08 15:47         ` Grant Grundler
  0 siblings, 0 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-08 15:47 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Grant Grundler, Andrew Morton, netdev, Val Henson

On Thu, Jun 08, 2006 at 11:38:52AM -0400, Jeff Garzik wrote:
> >Can we call free_irq() from tulip_down()?
> 
> I'm sure you can answer that yourself.  If it doesn't cause problems 
> elsewhere, yes.  Otherwise, no.  :)

Yeah, well, I was hoping you would "Just Know" (tm). :)
Research takes time.

> Did you read the example I cited, cp_close() ?

Yes - after I sent the mail. It looks good to me too.

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCH 2.6.17-rc5 tulip free_irq() called too late
  2006-06-08 14:43 ` Jeff Garzik
  2006-06-08 15:22   ` Grant Grundler
@ 2006-06-08 17:01   ` Grant Grundler
  2006-06-13 23:55     ` PATCHv3 " Grant Grundler
  1 sibling, 1 reply; 30+ messages in thread
From: Grant Grundler @ 2006-06-08 17:01 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Grant Grundler, Andrew Morton, netdev, Val Henson

On Thu, Jun 08, 2006 at 10:43:04AM -0400, Jeff Garzik wrote:
...
> Perhaps cp_close() in 8139cp.c could be an example of a good ordering? 
> It stops the chip, syncs irqs, frees irq, then frees [thus unmapping] 
> the rings.

Here is a new patch that moves free_irq() into tulip_down().
The resulting code is structured the same as cp_close().

While I believe the code is correct, I'm not real happy
that tulip_up() and tulip_down() don't both deal with IRQs.
(ie not symetrical). Oh well, I can submit another patch
for that if Val doesn't want to wrangle that herself.

This is compile tested only.
I'll drop it on ia64/parisc servers later and retest.

thanks,
grant


Index: drivers/net/tulip/tulip_core.c
===================================================================
RCS file: /var/cvs/linux-2.6/drivers/net/tulip/tulip_core.c,v
retrieving revision 1.35
diff -u -p -r1.35 tulip_core.c
--- drivers/net/tulip/tulip_core.c	23 Apr 2006 15:18:28 -0000	1.35
+++ drivers/net/tulip/tulip_core.c	8 Jun 2006 16:25:43 -0000
@@ -18,11 +18,11 @@
 
 #define DRV_NAME	"tulip"
 #ifdef CONFIG_TULIP_NAPI
-#define DRV_VERSION    "1.1.13-NAPI" /* Keep at least for test */
+#define DRV_VERSION    "1.1.14-NAPI" /* Keep at least for test */
 #else
-#define DRV_VERSION	"1.1.13"
+#define DRV_VERSION	"1.1.14"
 #endif
-#define DRV_RELDATE	"December 15, 2004"
+#define DRV_RELDATE	"May 6, 2006"
 
 
 #include <linux/module.h>
@@ -743,6 +745,11 @@ static void tulip_down (struct net_devic
 	/* Stop the Tx and Rx processes. */
 	tulip_stop_rxtx(tp);
 
+	spin_unlock_irqrestore (&tp->lock, flags);
+
+	synchronize_irq(dev->irq);
+	free_irq (dev->irq, dev);
+
 	/* prepare receive buffers */
 	tulip_refill_rx(dev);
 
@@ -752,7 +759,6 @@ static void tulip_down (struct net_devic
 	if (ioread32 (ioaddr + CSR6) != 0xffffffff)
 		tp->stats.rx_missed_errors += ioread32 (ioaddr + CSR8) & 0xffff;
 
-	spin_unlock_irqrestore (&tp->lock, flags);
 
 	init_timer(&tp->timer);
 	tp->timer.data = (unsigned long)dev;
@@ -774,14 +780,13 @@ static int tulip_close (struct net_devic
 	int i;
 
 	netif_stop_queue (dev);
 
 	tulip_down (dev);
 
 	if (tulip_debug > 1)
 		printk (KERN_DEBUG "%s: Shutting down ethercard, status was %2.2x.\n",
 			dev->name, ioread32 (ioaddr + CSR5));
 
-	free_irq (dev->irq, dev);
 
 	/* Free all the skbuffs in the Rx queue. */
 	for (i = 0; i < RX_RING_SIZE; i++) {
@@ -1750,7 +1757,6 @@ static int tulip_suspend (struct pci_dev
 		tulip_down(dev);
 
 	netif_device_detach(dev);
-	free_irq(dev->irq, dev);
 
 	pci_save_state(pdev);
 	pci_disable_device(pdev);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-08 17:01   ` Grant Grundler
@ 2006-06-13 23:55     ` Grant Grundler
  2006-06-14  0:06       ` Valerie Henson
                         ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-13 23:55 UTC (permalink / raw)
  To: Valerie Henson; +Cc: Jeff Garzik, Andrew Morton, netdev

On Thu, Jun 08, 2006 at 11:01:20AM -0600, Grant Grundler wrote:
> Here is a new patch that moves free_irq() into tulip_down().
> The resulting code is structured the same as cp_close().

Val,
Two details are wrong in version 2 and are fixed in v3 (appended below):

o we don't need synchronize_irq() before calling free_irq().
  (It should be removed from cp_close() too)
  Thanks to willy for pointing me at kernel/irq/manage.c.

o tulip_stop_rxtx() has to be called _after_ free_irq().
  ie. v2 patch didn't fix the original race condition
  and when under test, dies about as fast as the original code.

Tested on rx4640 (HP IA64) for several hours.
Please apply.

thanks,
grant

Change Log:
	IRQs are racing with tulip_down(). DMA can be restarted _after_
	we call tulip_stop_rxtx() and the DMA buffers are unmapped.
	The result is an MCA (hard crash on ia64) because of an
	IO TLB miss.

Signed-off-by: Grant Grundler <grundler@parisc-linux.org>

--- a/drivers/net/tulip/tulip_core.c
+++ b/drivers/net/tulip/tulip_core.c
@@ -18,11 +18,11 @@
 
 #define DRV_NAME	"tulip"
 #ifdef CONFIG_TULIP_NAPI
-#define DRV_VERSION    "1.1.13-NAPI" /* Keep at least for test */
+#define DRV_VERSION    "1.1.14-NAPI" /* Keep at least for test */
 #else
-#define DRV_VERSION	"1.1.13"
+#define DRV_VERSION	"1.1.14"
 #endif
-#define DRV_RELDATE	"December 15, 2004"
+#define DRV_RELDATE	"May 6, 2006"
 
 
 #include <linux/module.h>
@@ -741,21 +741,20 @@ static void tulip_down (struct net_devic
 
 	/* Disable interrupts by clearing the interrupt mask. */
 	iowrite32 (0x00000000, ioaddr + CSR7);
+	ioread32 (ioaddr + CSR7);	/* flush posted write */
 
-	/* Stop the Tx and Rx processes. */
-	tulip_stop_rxtx(tp);
+	spin_unlock_irqrestore (&tp->lock, flags);
 
-	/* prepare receive buffers */
-	tulip_refill_rx(dev);
+	free_irq (dev->irq, dev);	/* no more races after this */
+	tulip_stop_rxtx(tp);		/* Stop DMA */
 
-	/* release any unconsumed transmit buffers */
-	tulip_clean_tx_ring(tp);
+	/* Put driver back into the state we start with */
+	tulip_refill_rx(dev);		/* prepare RX buffers */
+	tulip_clean_tx_ring(tp);	/* clean up unsent TX buffers */
 
 	if (ioread32 (ioaddr + CSR6) != 0xffffffff)
 		tp->stats.rx_missed_errors += ioread32 (ioaddr + CSR8) & 0xffff;
 
-	spin_unlock_irqrestore (&tp->lock, flags);
-
 	init_timer(&tp->timer);
 	tp->timer.data = (unsigned long)dev;
 	tp->timer.function = tulip_tbl[tp->chip_id].media_timer;
@@ -774,7 +773,6 @@ static int tulip_close (struct net_devic
 		printk (KERN_DEBUG "%s: Shutting down ethercard, status was %2.2x.\n",
 			dev->name, ioread32 (ioaddr + CSR5));
 
-	free_irq (dev->irq, dev);
 
 	/* Free all the skbuffs in the Rx queue. */
 	for (i = 0; i < RX_RING_SIZE; i++) {
@@ -1752,7 +1752,6 @@ static int tulip_suspend (struct pci_dev
 		tulip_down(dev);
 
 	netif_device_detach(dev);
-	free_irq(dev->irq, dev);
 
 	pci_save_state(pdev);
 	pci_disable_device(pdev);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-13 23:55     ` PATCHv3 " Grant Grundler
@ 2006-06-14  0:06       ` Valerie Henson
  2006-06-14  0:33       ` Jeff Garzik
  2006-06-22  0:43       ` Valerie Henson
  2 siblings, 0 replies; 30+ messages in thread
From: Valerie Henson @ 2006-06-14  0:06 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Jeff Garzik, Andrew Morton, netdev

On Tue, Jun 13, 2006 at 05:55:31PM -0600, Grant Grundler wrote:
> On Thu, Jun 08, 2006 at 11:01:20AM -0600, Grant Grundler wrote:
> > Here is a new patch that moves free_irq() into tulip_down().
> > The resulting code is structured the same as cp_close().
> 
> Val,
> Two details are wrong in version 2 and are fixed in v3 (appended below):
> 
> o we don't need synchronize_irq() before calling free_irq().
>   (It should be removed from cp_close() too)
>   Thanks to willy for pointing me at kernel/irq/manage.c.
> 
> o tulip_stop_rxtx() has to be called _after_ free_irq().
>   ie. v2 patch didn't fix the original race condition
>   and when under test, dies about as fast as the original code.
> 
> Tested on rx4640 (HP IA64) for several hours.
> Please apply.

Thanks, I'll take a look next week (after the workshop and moving).

-VAL

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-13 23:55     ` PATCHv3 " Grant Grundler
  2006-06-14  0:06       ` Valerie Henson
@ 2006-06-14  0:33       ` Jeff Garzik
  2006-06-14  4:44         ` Grant Grundler
  2006-06-22  0:43       ` Valerie Henson
  2 siblings, 1 reply; 30+ messages in thread
From: Jeff Garzik @ 2006-06-14  0:33 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Valerie Henson, Andrew Morton, netdev

Grant Grundler wrote:
> o tulip_stop_rxtx() has to be called _after_ free_irq().
>   ie. v2 patch didn't fix the original race condition
>   and when under test, dies about as fast as the original code.

You made the race window smaller, but it's still there.  The chip's DMA 
engines should be stopped before you unregister the interrupt handler.

	Jeff




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14  0:33       ` Jeff Garzik
@ 2006-06-14  4:44         ` Grant Grundler
  2006-06-14 13:05           ` Kyle McMartin
  2006-06-14 15:03           ` Jeff Garzik
  0 siblings, 2 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-14  4:44 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Grant Grundler, Valerie Henson, Andrew Morton, netdev

On Tue, Jun 13, 2006 at 08:33:22PM -0400, Jeff Garzik wrote:
> Grant Grundler wrote:
> >o tulip_stop_rxtx() has to be called _after_ free_irq().
> >  ie. v2 patch didn't fix the original race condition
> >  and when under test, dies about as fast as the original code.
> 
> You made the race window smaller, but it's still there.  The chip's DMA 
> engines should be stopped before you unregister the interrupt handler.

Switching the order to be:
        tulip_stop_rxtx(tp);            /* Stop DMA */
        free_irq (dev->irq, dev);       /* no more races after this */

still leaves us open to IRQs being delivered _after_ we've stopped DMA.
That in turn allows the interrupt handler to re-enable DMA again.

Or are you worried about a masked, pending interrupt causing
problems when the driver is re-opened or resumed?
If firmware left the device in a that state at boot time wouldn't
the driver be required to handle it?

thanks,
grant



> 
> 	Jeff
> 
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14  4:44         ` Grant Grundler
@ 2006-06-14 13:05           ` Kyle McMartin
  2006-06-14 14:54             ` Grant Grundler
  2006-06-14 15:03           ` Jeff Garzik
  1 sibling, 1 reply; 30+ messages in thread
From: Kyle McMartin @ 2006-06-14 13:05 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Jeff Garzik, Valerie Henson, Andrew Morton, netdev

On Tue, Jun 13, 2006 at 10:44:12PM -0600, Grant Grundler wrote:
> On Tue, Jun 13, 2006 at 08:33:22PM -0400, Jeff Garzik wrote:
> > Grant Grundler wrote:
> > >o tulip_stop_rxtx() has to be called _after_ free_irq().
> > >  ie. v2 patch didn't fix the original race condition
> > >  and when under test, dies about as fast as the original code.
> > 
> > You made the race window smaller, but it's still there.  The chip's DMA 
> > engines should be stopped before you unregister the interrupt handler.
> 
> Switching the order to be:
>         tulip_stop_rxtx(tp);            /* Stop DMA */
>         free_irq (dev->irq, dev);       /* no more races after this */
> 

I think the correct sequence would be:

	reset tulip interrupt mask
	flush posted write

	synchronize irq			/* make sure we got 'em all */
	tulip_stop_rxtx			/* turn off dma */
	free irq			/* bye bye */

The synchronize irq guarantees we shouldn't see another irq
generated by the card because it was held up somewhere.

Cheers,
	Kyle M.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14 13:05           ` Kyle McMartin
@ 2006-06-14 14:54             ` Grant Grundler
  0 siblings, 0 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-14 14:54 UTC (permalink / raw)
  To: Kyle McMartin
  Cc: Grant Grundler, Jeff Garzik, Valerie Henson, Andrew Morton,
	netdev

On Wed, Jun 14, 2006 at 09:05:06AM -0400, Kyle McMartin wrote:
> I think the correct sequence would be:
> 
> 	reset tulip interrupt mask
> 	flush posted write
> 
> 	synchronize irq			/* make sure we got 'em all */

> 	tulip_stop_rxtx			/* turn off dma */
> 	free irq			/* bye bye */
> 
> The synchronize irq guarantees we shouldn't see another irq
> generated by the card because it was held up somewhere.

Kyle,
syncronize_irq() only guarantees currently executing interrupt handler
completes before handing control back to the caller.
It does not guarantee IRQ signals still inflight are "flushed".
Remember that IRQ lines are a "sideband" signal and not subject
to PCI data ordering rules.

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14  4:44         ` Grant Grundler
  2006-06-14 13:05           ` Kyle McMartin
@ 2006-06-14 15:03           ` Jeff Garzik
  2006-06-14 18:14             ` Grant Grundler
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff Garzik @ 2006-06-14 15:03 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Valerie Henson, Andrew Morton, netdev

Grant Grundler wrote:
> On Tue, Jun 13, 2006 at 08:33:22PM -0400, Jeff Garzik wrote:
>> Grant Grundler wrote:
>>> o tulip_stop_rxtx() has to be called _after_ free_irq().
>>>  ie. v2 patch didn't fix the original race condition
>>>  and when under test, dies about as fast as the original code.
>> You made the race window smaller, but it's still there.  The chip's DMA 
>> engines should be stopped before you unregister the interrupt handler.
> 
> Switching the order to be:
>         tulip_stop_rxtx(tp);            /* Stop DMA */
>         free_irq (dev->irq, dev);       /* no more races after this */
> 
> still leaves us open to IRQs being delivered _after_ we've stopped DMA.

Correct.  And that is the preferred, natural, logical, obvious order:

1) Turn things off.
2) Wait for activity to cease.


> That in turn allows the interrupt handler to re-enable DMA again.

Then that would be a problem to solve...  Some interrupt handlers will 
test netif_running() or a driver-specific shutting-down flag, 
specifically to avoid such behaviors.

	Jeff




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14 15:03           ` Jeff Garzik
@ 2006-06-14 18:14             ` Grant Grundler
  2006-06-14 19:51               ` Jeff Garzik
  2006-06-14 20:47               ` Francois Romieu
  0 siblings, 2 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-14 18:14 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Grant Grundler, Valerie Henson, Andrew Morton, netdev

On Wed, Jun 14, 2006 at 11:03:48AM -0400, Jeff Garzik wrote:
> Grant Grundler wrote:
> >Switching the order to be:
> >        tulip_stop_rxtx(tp);            /* Stop DMA */
> >        free_irq (dev->irq, dev);       /* no more races after this */
> >
> >still leaves us open to IRQs being delivered _after_ we've stopped DMA.
> 
> Correct.  And that is the preferred, natural, logical, obvious order:
> 
> 1) Turn things off.
> 2) Wait for activity to cease.

Patch v3 does this in two stages:
1) turn off tulip interrupts
2) free_irq() calls syncronize_irq() to handle pending IRQs

then calls tulip_stop_rxtx() which:
1) tells tulip to stop DMA
2) poll until DMA completes

After this we can free remaining resources.

> >That in turn allows the interrupt handler to re-enable DMA again.
> 
> Then that would be a problem to solve...  Some interrupt handlers will 
> test netif_running() or a driver-specific shutting-down flag, 
> specifically to avoid such behaviors.

I'm not keen on adding more code to tulip_interrupt() routine
for something that rarely happens (compared to IRQs) and is handled
outside the interrupt routine.  I'm pretty sure stopping interrupts
before stopping DMA is sufficient.
Can you show an example where it doesn't work?

This is important since I'm going to propose a new Documentation/pci.txt
based on this experience.

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14 18:14             ` Grant Grundler
@ 2006-06-14 19:51               ` Jeff Garzik
  2006-06-14 22:25                 ` Grant Grundler
  2006-06-14 20:47               ` Francois Romieu
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff Garzik @ 2006-06-14 19:51 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Valerie Henson, Andrew Morton, netdev

Grant Grundler wrote:
> On Wed, Jun 14, 2006 at 11:03:48AM -0400, Jeff Garzik wrote:
>> Grant Grundler wrote:
>>> Switching the order to be:
>>>        tulip_stop_rxtx(tp);            /* Stop DMA */
>>>        free_irq (dev->irq, dev);       /* no more races after this */
>>>
>>> still leaves us open to IRQs being delivered _after_ we've stopped DMA.
>> Correct.  And that is the preferred, natural, logical, obvious order:
>>
>> 1) Turn things off.
>> 2) Wait for activity to cease.
> 
> Patch v3 does this in two stages:
> 1) turn off tulip interrupts
> 2) free_irq() calls syncronize_irq() to handle pending IRQs
> 
> then calls tulip_stop_rxtx() which:
> 1) tells tulip to stop DMA
> 2) poll until DMA completes
> 
> After this we can free remaining resources.

You need to turn off the thing that generates work (DMA engine), before 
turning off the thing that reaps work (irq handler).


>>> That in turn allows the interrupt handler to re-enable DMA again.
>> Then that would be a problem to solve...  Some interrupt handlers will 
>> test netif_running() or a driver-specific shutting-down flag, 
>> specifically to avoid such behaviors.
> 
> I'm not keen on adding more code to tulip_interrupt() routine
> for something that rarely happens (compared to IRQs) and is handled
> outside the interrupt routine.  I'm pretty sure stopping interrupts
> before stopping DMA is sufficient.
> Can you show an example where it doesn't work?

It should be completely obvious that the chip is still generating 
work...  You don't want to leave the hardware in a position where it has 
unacknowledged events.

	Jeff



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14 18:14             ` Grant Grundler
  2006-06-14 19:51               ` Jeff Garzik
@ 2006-06-14 20:47               ` Francois Romieu
  2006-06-14 22:30                 ` Grant Grundler
  1 sibling, 1 reply; 30+ messages in thread
From: Francois Romieu @ 2006-06-14 20:47 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Jeff Garzik, Valerie Henson, Andrew Morton, netdev

Grant Grundler <grundler@parisc-linux.org> :
[...]
> I'm not keen on adding more code to tulip_interrupt() routine
> for something that rarely happens (compared to IRQs) and is handled
> outside the interrupt routine.  I'm pretty sure stopping interrupts
> before stopping DMA is sufficient.
> Can you show an example where it doesn't work?

Shared irq. 

The device has not quiesced, the kernel stop listening to it and the
neighbor device receives a late interruption from the network device.

-- 
Ueimor

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14 19:51               ` Jeff Garzik
@ 2006-06-14 22:25                 ` Grant Grundler
  0 siblings, 0 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-14 22:25 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Grant Grundler, Valerie Henson, Andrew Morton, netdev

On Wed, Jun 14, 2006 at 03:51:37PM -0400, Jeff Garzik wrote:
> You need to turn off the thing that generates work (DMA engine), before 
> turning off the thing that reaps work (irq handler).
...
> It should be completely obvious that the chip is still generating 
> work...

Yes, I agree it still generates work. ie we can still RX packets.
But those will get discarded anyway.
In other words, If work is generated and I don't know it and
don't care, was it really work? :)

>   You don't want to leave the hardware in a position where it has 
> unacknowledged events.

Ok. Let me repeat two questions I asked a while ago:
| Are you worried about a masked, pending interrupt causing
| problems when the driver is re-opened or resumed?

The answer to "Yes" it seems.
And we will disagree on that since I've proven it's not a problem.
And it can't be a problem anyone else has seen since it's been
this way for 5+ years.

| If firmware left the device in that state at boot time wouldn't
| the driver be required to handle it?

Can you comment on this please?

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14 20:47               ` Francois Romieu
@ 2006-06-14 22:30                 ` Grant Grundler
  2006-06-15 20:30                   ` Francois Romieu
  0 siblings, 1 reply; 30+ messages in thread
From: Grant Grundler @ 2006-06-14 22:30 UTC (permalink / raw)
  To: Francois Romieu
  Cc: Grant Grundler, Jeff Garzik, Valerie Henson, Andrew Morton,
	netdev

On Wed, Jun 14, 2006 at 10:47:20PM +0200, Francois Romieu wrote:
> Grant Grundler <grundler@parisc-linux.org> :
> [...]
> > I'm not keen on adding more code to tulip_interrupt() routine
> > for something that rarely happens (compared to IRQs) and is handled
> > outside the interrupt routine.  I'm pretty sure stopping interrupts
> > before stopping DMA is sufficient.
> > Can you show an example where it doesn't work?
> 
> Shared irq. 
> 
> The device has not quiesced, the kernel stop listening to it and the
> neighbor device receives a late interruption from the network device.

I thought we've worked through that already:
	http://www.spinics.net/lists/netdev/msg05902.html

Patch v3 takes care of that problem.
The first step in the sequence is to mask IRQs on the tulip.
The "neighbor" device sharing the IRQ will not see any interrupts from
the tulip after that.

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-14 22:30                 ` Grant Grundler
@ 2006-06-15 20:30                   ` Francois Romieu
  2006-06-16  5:47                     ` Grant Grundler
  0 siblings, 1 reply; 30+ messages in thread
From: Francois Romieu @ 2006-06-15 20:30 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Jeff Garzik, Valerie Henson, Andrew Morton, netdev

Grant Grundler <grundler@parisc-linux.org> :
[shared irq]
> I thought we've worked through that already:
> 	http://www.spinics.net/lists/netdev/msg05902.html
> 
> Patch v3 takes care of that problem.
> The first step in the sequence is to mask IRQs on the tulip.
> The "neighbor" device sharing the IRQ will not see any interrupts from
> the tulip after that.

Afaik free_irq() on a shared irq does not touch the hardware and
irqs are anything but synchronous. If free_irq() is issued before
the device is idle and its irq are not acked, it's wrong.

-- 
Ueimor

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-15 20:30                   ` Francois Romieu
@ 2006-06-16  5:47                     ` Grant Grundler
  2006-06-16  7:32                       ` Jeff Garzik
  0 siblings, 1 reply; 30+ messages in thread
From: Grant Grundler @ 2006-06-16  5:47 UTC (permalink / raw)
  To: Francois Romieu
  Cc: Grant Grundler, Jeff Garzik, Valerie Henson, Andrew Morton,
	netdev

On Thu, Jun 15, 2006 at 10:30:17PM +0200, Francois Romieu wrote:
> Afaik free_irq() on a shared irq does not touch the hardware and
> irqs are anything but synchronous. If free_irq() is issued before
> the device is idle and its irq are not acked, it's wrong.

Correct. Before calling free_irq(), patch V3 masks interrupts on tulip
and guarantees the tulip will not generate new interrupts before that call.

grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-16  5:47                     ` Grant Grundler
@ 2006-06-16  7:32                       ` Jeff Garzik
  2006-06-16 15:25                         ` Grant Grundler
       [not found]                         ` <20060616152400.GA7868@colo.lackof.org>
  0 siblings, 2 replies; 30+ messages in thread
From: Jeff Garzik @ 2006-06-16  7:32 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Francois Romieu, Valerie Henson, Andrew Morton, netdev

Grant Grundler wrote:
> On Thu, Jun 15, 2006 at 10:30:17PM +0200, Francois Romieu wrote:
>> Afaik free_irq() on a shared irq does not touch the hardware and
>> irqs are anything but synchronous. If free_irq() is issued before
>> the device is idle and its irq are not acked, it's wrong.
> 
> Correct. Before calling free_irq(), patch V3 masks interrupts on tulip
> and guarantees the tulip will not generate new interrupts before that call.

Incorrect -- it does not guarantee that tulip will not generate new 
interrupt events.

	Jeff




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-16  7:32                       ` Jeff Garzik
@ 2006-06-16 15:25                         ` Grant Grundler
       [not found]                         ` <20060616152400.GA7868@colo.lackof.org>
  1 sibling, 0 replies; 30+ messages in thread
From: Grant Grundler @ 2006-06-16 15:25 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Grant Grundler, Francois Romieu, Valerie Henson, Andrew Morton,
	netdev

On Fri, Jun 16, 2006 at 03:32:56AM -0400, Jeff Garzik wrote:
> >Correct. Before calling free_irq(), patch V3 masks interrupts on tulip
> >and guarantees the tulip will not generate new interrupts before that call.
> 
> Incorrect -- it does not guarantee that tulip will not generate new 
> interrupt events.

Are you saying the following sequence doesn't mask tulip interrupts?

        /* Disable interrupts by clearing the interrupt mask. */
        iowrite32 (0x00000000, ioaddr + CSR7);
        ioread32 (ioaddr + CSR7);       /* flush posted write */

Secondly, since you've ignored the two previous times I've asked,
I'll assume you agree tulip driver has to (and does) handle
the case of pending, masked interrupts at initialization because
firmware might leave it in that state.

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
       [not found]                           ` <4492CE98.50900@pobox.com>
@ 2006-06-16 16:06                             ` Grant Grundler
  2006-06-16 16:16                               ` Jeff Garzik
  0 siblings, 1 reply; 30+ messages in thread
From: Grant Grundler @ 2006-06-16 16:06 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Francois Romieu, Valerie Henson, Andrew Morton, netdev


[ Jeff, apologies. I hit "reply" instead of "group reply" on previous email.
  I've added everyone back on the cc list.]

On Fri, Jun 16, 2006 at 11:30:32AM -0400, Jeff Garzik wrote:
...
> >Are you saying this sequence won't mask interrupts on tulip?
> >
> >        /* Disable interrupts by clearing the interrupt mask. */
> >        iowrite32 (0x00000000, ioaddr + CSR7);
> >        ioread32 (ioaddr + CSR7);       /* flush posted write */
> 
> It does not stop the generation of interrupt events.

This use of "interrupt events" is misleading.
The CPU does not sees these "interrupt events" once we mask interrupts.

> The DMA engine is still running, packets are still being received.
> The above code sequence does not change that.

I agree. And I'm asking why does anyone care?
We clean that up after IRQs are stopped from being delivered to the CPU.

...
> >Secondly, since you have ignored the two previous times I've asked,
> >I'll presume you agree that if firmware leaves it in this state
> >(pending, masked interrupts), that the driver has to (and does)
> >handle it.
> 
> There is no firmware involved here, at any level, after boot.

I agree.  What about at boot time?

> The needed task in the driver has been the same since this thread 
> started:  (1) stop generating new work [stop DMA engine], and (2) 
> quiesce the hardware.  And it must happen in that order.

No it doesn't. I've proven it works in the order I've proposed
on pretty damn anal HW.

> Setting the interrupt mask register to zero doesn't stop new work from 
> appearing.

I agree. It stops the "screaming interrupt" problem. The fact that we
are in "close" or "down" routine means the user already decided
they don't care if new packets do or do not arrive.

Unless you can point to a real user who is affected by
my proposed patch, I ask again patch v3 be accepted.

thanks,
grant

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-16 16:06                             ` Grant Grundler
@ 2006-06-16 16:16                               ` Jeff Garzik
  0 siblings, 0 replies; 30+ messages in thread
From: Jeff Garzik @ 2006-06-16 16:16 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Francois Romieu, Valerie Henson, Andrew Morton, netdev

Grant Grundler wrote:
> [ Jeff, apologies. I hit "reply" instead of "group reply" on previous email.
>   I've added everyone back on the cc list.]
> 
> On Fri, Jun 16, 2006 at 11:30:32AM -0400, Jeff Garzik wrote:
> ...
>>> Are you saying this sequence won't mask interrupts on tulip?
>>>
>>>        /* Disable interrupts by clearing the interrupt mask. */
>>>        iowrite32 (0x00000000, ioaddr + CSR7);
>>>        ioread32 (ioaddr + CSR7);       /* flush posted write */
>> It does not stop the generation of interrupt events.
> 
> This use of "interrupt events" is misleading.
> The CPU does not sees these "interrupt events" once we mask interrupts.
> 
>> The DMA engine is still running, packets are still being received.
>> The above code sequence does not change that.
> 
> I agree. And I'm asking why does anyone care?
> We clean that up after IRQs are stopped from being delivered to the CPU.
> 
> ...
>>> Secondly, since you have ignored the two previous times I've asked,
>>> I'll presume you agree that if firmware leaves it in this state
>>> (pending, masked interrupts), that the driver has to (and does)
>>> handle it.
>> There is no firmware involved here, at any level, after boot.
> 
> I agree.  What about at boot time?

We reset the chip each time we do an interface-up.


>> The needed task in the driver has been the same since this thread 
>> started:  (1) stop generating new work [stop DMA engine], and (2) 
>> quiesce the hardware.  And it must happen in that order.
> 
> No it doesn't. I've proven it works in the order I've proposed
> on pretty damn anal HW.
> 
>> Setting the interrupt mask register to zero doesn't stop new work from 
>> appearing.
> 
> I agree. It stops the "screaming interrupt" problem. The fact that we
> are in "close" or "down" routine means the user already decided
> they don't care if new packets do or do not arrive.
> 
> Unless you can point to a real user who is affected by
> my proposed patch, I ask again patch v3 be accepted.

All users are affected.  There is still a race window due to calling 
free_irq() before stopping the DMA engine, you've simply minimized it.

I don't see why it's so difficult to see

(a) you are introducing a non-standard ordering, different from other 
net drivers
(b) you are introducing an ordering that is counter to how the hardware 
is designed
(c) if you follow the natural ordering, you are GUARANTEED not to have 
screaming interrupts, DMA still going across the PCI bus, and dozens of 
other details.

Rather than running through all of these details, and tweaking your 
patch for each of them (as you have done with V1->V2 and V2->V3), simply 
_guarantee_ that you will not have problems.

	Jeff



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-13 23:55     ` PATCHv3 " Grant Grundler
  2006-06-14  0:06       ` Valerie Henson
  2006-06-14  0:33       ` Jeff Garzik
@ 2006-06-22  0:43       ` Valerie Henson
  2006-06-23  5:00         ` Grant Grundler
  2 siblings, 1 reply; 30+ messages in thread
From: Valerie Henson @ 2006-06-22  0:43 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Jeff Garzik, Andrew Morton, netdev

On Tue, Jun 13, 2006 at 05:55:31PM -0600, Grant Grundler wrote:
> On Thu, Jun 08, 2006 at 11:01:20AM -0600, Grant Grundler wrote:
> > Here is a new patch that moves free_irq() into tulip_down().
> > The resulting code is structured the same as cp_close().
> 
> Val,
> Two details are wrong in version 2 and are fixed in v3 (appended below):
> 
> o we don't need synchronize_irq() before calling free_irq().
>   (It should be removed from cp_close() too)
>   Thanks to willy for pointing me at kernel/irq/manage.c.
> 
> o tulip_stop_rxtx() has to be called _after_ free_irq().
>   ie. v2 patch didn't fix the original race condition
>   and when under test, dies about as fast as the original code.
> 
> Tested on rx4640 (HP IA64) for several hours.
> Please apply.

Hi folks,

The quick summary of my thoughts on this patch is that it isn't the
ideal patch, but it works and it's well-tested.  Doing my preferred
fix (adding a shutdown flag) would be invasive and take many weeks to
reach the level of stability of Grant's patch.  So I'm taking this
patch but adding a comment describing my preferred fix.

Here's the long version.  The obvious ordering of bringing up the card
is:

request_irq()
setup DMA resources
enable interrupts
start DMA engine

And the obvious ordering of shutting it down is:

stop DMA engine
disable interrupts
remove DMA resources
free_irq()

The problem with the above shutdown order is that we can receive an
interrupt in between stopping DMA and disabling interrupts, and the
tulip irq handler will reenable DMA =><= Boom!  The solution I prefer
is to make the irq handler aware of whether we are shutting down or
not, and not reenable DMA in that case.  However, it is a non-trivial
patch to get right, and I would rather have the bug fixed short-term
with the ordering Grant uses:

disable interrupts
free_irq()
stop rxtx
remove DMA resources

Make sense to everyone?  I'll redo the patch with the comment and get
Grant's sign-off.

-VAL

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
  2006-06-22  0:43       ` Valerie Henson
@ 2006-06-23  5:00         ` Grant Grundler
  2006-06-26 22:31           ` [PATCH] Fix tulip shutdown DMA/irq race Valerie Henson
  0 siblings, 1 reply; 30+ messages in thread
From: Grant Grundler @ 2006-06-23  5:00 UTC (permalink / raw)
  To: Valerie Henson; +Cc: Grant Grundler, Jeff Garzik, Andrew Morton, netdev

[ sorry, for the record I'm not trimming Val's comments - mine are much below]

On Wed, Jun 21, 2006 at 05:43:40PM -0700, Valerie Henson wrote:
...
> Hi folks,
> 
> The quick summary of my thoughts on this patch is that it isn't the
> ideal patch, but it works and it's well-tested.  Doing my preferred
> fix (adding a shutdown flag) would be invasive and take many weeks to
> reach the level of stability of Grant's patch.  So I'm taking this
> patch but adding a comment describing my preferred fix.
> 
> Here's the long version.  The obvious ordering of bringing up the card
> is:
> 
> request_irq()
> setup DMA resources
> enable interrupts
> start DMA engine

The problem here is request_irq() enables interrupts.
SO the third step happens as part of the first step.

And if an IRQ happens to be pending, it will blow up.
But since we reset the chip at init time, that should
never happen.

> 
> And the obvious ordering of shutting it down is:
> 
> stop DMA engine
> disable interrupts
> remove DMA resources
> free_irq()

This attempts to be symetrical with the init sequence and 
I like that approach where possible. It's not symetrical
in practice because the interrupt code restarts DMA.
(which is what you've noted below)

> The problem with the above shutdown order is that we can receive an
> interrupt in between stopping DMA and disabling interrupts, and the
> tulip irq handler will reenable DMA =><= Boom!  The solution I prefer
> is to make the irq handler aware of whether we are shutting down or
> not, and not reenable DMA in that case.

While I believe this will work too, I don't advocate this approach
because I don't like to add code to interrupt handlers unless
it's the _only_ way to fix a problem. 

>   However, it is a non-trivial
> patch to get right, and I would rather have the bug fixed short-term
> with the ordering Grant uses:
> 
> disable interrupts
> free_irq()
> stop rxtx
> remove DMA resources
> 
> Make sense to everyone?  I'll redo the patch with the comment and get
> Grant's sign-off.

Yes. I'm ok with anything similar to what you posted above:

    Signed-off-by: Grant Grundler <grundler@parisc-linux.org>


And Val, thanks for accepting the patch and taking the time to
explain where you want to go next with the code in this case.
I'll be happy to setup remote access to my test environment when
you are ready to test that next step.

thanks,
grant

> 
> -VAL

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] Fix tulip shutdown DMA/irq race
  2006-06-23  5:00         ` Grant Grundler
@ 2006-06-26 22:31           ` Valerie Henson
  0 siblings, 0 replies; 30+ messages in thread
From: Valerie Henson @ 2006-06-26 22:31 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Jeff Garzik, Andrew Morton, netdev

From: Grant Grundler <grundler@parisc-linux.org>

IRQs are racing with tulip_down(). DMA can be restarted by the
interrupt handler _after_ we call tulip_stop_rxtx() and the DMA
buffers are unmapped.  The result is an MCA (hard crash on ia64)
because of an IO TLB miss.  The long-term fix is to make the interrupt
handler shutdown aware.

Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Acked-by: Valerie Henson <val_henson@linux.intel.com>

---

 tulip_core.c |   37 ++++++++++++++++++++++++-------------
 1 files changed, 24 insertions(+), 13 deletions(-)

diff -Nru a/drivers/net/tulip/tulip_core.c b/drivers/net/tulip/tulip_core.c
--- a/drivers/net/tulip/tulip_core.c	2006-06-22 16:24:11 -07:00
+++ b/drivers/net/tulip/tulip_core.c	2006-06-22 16:24:11 -07:00
@@ -18,11 +18,11 @@
 
 #define DRV_NAME	"tulip"
 #ifdef CONFIG_TULIP_NAPI
-#define DRV_VERSION    "1.1.13-NAPI" /* Keep at least for test */
+#define DRV_VERSION    "1.1.14-NAPI" /* Keep at least for test */
 #else
-#define DRV_VERSION	"1.1.13"
+#define DRV_VERSION	"1.1.14"
 #endif
-#define DRV_RELDATE	"May 11, 2002"
+#define DRV_RELDATE	"May 6, 2006"
 
 
 #include <linux/module.h>
@@ -739,23 +739,36 @@
 #endif
 	spin_lock_irqsave (&tp->lock, flags);
 
+	/*
+	  FIXME: We should really add a shutdown-in-progress flag and
+	  check it in the interrupt handler to see whether we should
+	  reenable DMA or not.  The preferred ordering here would be:
+	 
+	  stop DMA engine
+	  disable interrupts
+	  remove DMA resources
+	  free_irq()
+
+	  The below works but is non-obvious and doesn't match the
+	  ordering of bring-up. -VAL
+	*/
+
 	/* Disable interrupts by clearing the interrupt mask. */
 	iowrite32 (0x00000000, ioaddr + CSR7);
+	ioread32 (ioaddr + CSR7);	/* flush posted write */
 
-	/* Stop the Tx and Rx processes. */
-	tulip_stop_rxtx(tp);
+	spin_unlock_irqrestore (&tp->lock, flags);
 
-	/* prepare receive buffers */
-	tulip_refill_rx(dev);
+	free_irq (dev->irq, dev);	/* no more races after this */
+	tulip_stop_rxtx(tp);		/* Stop DMA */
 
-	/* release any unconsumed transmit buffers */
-	tulip_clean_tx_ring(tp);
+	/* Put driver back into the state we start with */
+	tulip_refill_rx(dev);		/* prepare RX buffers */
+	tulip_clean_tx_ring(tp);	/* clean up unsent TX buffers */
 
 	if (ioread32 (ioaddr + CSR6) != 0xffffffff)
 		tp->stats.rx_missed_errors += ioread32 (ioaddr + CSR8) & 0xffff;
 
-	spin_unlock_irqrestore (&tp->lock, flags);
-
 	init_timer(&tp->timer);
 	tp->timer.data = (unsigned long)dev;
 	tp->timer.function = tulip_tbl[tp->chip_id].media_timer;
@@ -781,7 +794,6 @@
 		printk (KERN_DEBUG "%s: Shutting down ethercard, status was %2.2x.\n",
 			dev->name, ioread32 (ioaddr + CSR5));
 
-	free_irq (dev->irq, dev);
 
 	/* Free all the skbuffs in the Rx queue. */
 	for (i = 0; i < RX_RING_SIZE; i++) {
@@ -1752,7 +1764,6 @@
 		tulip_down(dev);
 
 	netif_device_detach(dev);
-	free_irq(dev->irq, dev);
 
 	pci_save_state(pdev);
 	pci_disable_device(pdev);

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2006-06-26 22:33 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-31 19:52 PATCH 2.6.17-rc5 tulip free_irq() called too late Grant Grundler
2006-06-08 14:43 ` Jeff Garzik
2006-06-08 15:22   ` Grant Grundler
2006-06-08 15:32     ` Grant Grundler
2006-06-08 15:38       ` Jeff Garzik
2006-06-08 15:47         ` Grant Grundler
2006-06-08 15:32     ` Jeff Garzik
2006-06-08 15:36       ` Grant Grundler
2006-06-08 17:01   ` Grant Grundler
2006-06-13 23:55     ` PATCHv3 " Grant Grundler
2006-06-14  0:06       ` Valerie Henson
2006-06-14  0:33       ` Jeff Garzik
2006-06-14  4:44         ` Grant Grundler
2006-06-14 13:05           ` Kyle McMartin
2006-06-14 14:54             ` Grant Grundler
2006-06-14 15:03           ` Jeff Garzik
2006-06-14 18:14             ` Grant Grundler
2006-06-14 19:51               ` Jeff Garzik
2006-06-14 22:25                 ` Grant Grundler
2006-06-14 20:47               ` Francois Romieu
2006-06-14 22:30                 ` Grant Grundler
2006-06-15 20:30                   ` Francois Romieu
2006-06-16  5:47                     ` Grant Grundler
2006-06-16  7:32                       ` Jeff Garzik
2006-06-16 15:25                         ` Grant Grundler
     [not found]                         ` <20060616152400.GA7868@colo.lackof.org>
     [not found]                           ` <4492CE98.50900@pobox.com>
2006-06-16 16:06                             ` Grant Grundler
2006-06-16 16:16                               ` Jeff Garzik
2006-06-22  0:43       ` Valerie Henson
2006-06-23  5:00         ` Grant Grundler
2006-06-26 22:31           ` [PATCH] Fix tulip shutdown DMA/irq race Valerie Henson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).