netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] 3c59x: fix deadlock when using netconsole with 3c59x
@ 2010-08-09 16:32 Neil Horman
  2010-08-10 23:48 ` David Miller
  0 siblings, 1 reply; 3+ messages in thread
From: Neil Horman @ 2010-08-09 16:32 UTC (permalink / raw)
  To: netdev; +Cc: klassert, davem, nhorman

When using netpoll, its possible to deadlock the 3c59x driver.  Since it takes
an internal spinlock (vp->lock) that serializes boomerang_interrupt and parts
of boomerang_start_xmit, if we call pr_debug in the former, we can go through
the tx path on the same cpu, and recurse into the same driver again, deadlocking
in the transmit routine.

This patch fixes that problem by stopping the queues during interrupt
processing, so that subsequent transmits will get queued until a later point.
Its not a great solution, but we need to find some way to serialize access to
the register file on the card without enforcing a deadlock.  I think the queue
stop is the best way to do that.  And since we only print things in
boomerang_interrupt when we have debug enabled, we can mitigate the impact of
this change to only stop the queues when debug is on.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
 drivers/net/3c59x.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
index c754d88..e5c8757 100644
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -2338,7 +2338,17 @@ boomerang_interrupt(int irq, void *dev_id)
 	/*
 	 * It seems dopey to put the spinlock this early, but we could race against vortex_tx_timeout
 	 * and boomerang_start_xmit
+	 * We also need to disable the tx queue in the event that we print
+	 * anything from this path.  if pr_debug is called, we run the risk of 
+	 * recursing through the transmit path, which also takes the vp->lock,
+	 * and deadlocks us.  This only happens if we're using netpoll
+	 * but we need to be ready for it
+	 * We can mitigate the perf impact here if we only
+	 * do this is vortex_debug is != 0
 	 */
+	if (vortex_debug)
+		netif_stop_queue(dev);
+		
 	spin_lock(&vp->lock);
 
 	status = ioread16(ioaddr + EL3_STATUS);
@@ -2447,6 +2457,8 @@ boomerang_interrupt(int irq, void *dev_id)
 		pr_debug("%s: exiting interrupt, status %4.4x.\n",
 			   dev->name, status);
 handler_exit:
+	if (vortex_debug)
+		netif_start_queue(dev);
 	spin_unlock(&vp->lock);
 	return IRQ_HANDLED;
 }
-- 
1.7.2.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] 3c59x: fix deadlock when using netconsole with 3c59x
  2010-08-09 16:32 [PATCH] 3c59x: fix deadlock when using netconsole with 3c59x Neil Horman
@ 2010-08-10 23:48 ` David Miller
  2010-08-11  1:16   ` Neil Horman
  0 siblings, 1 reply; 3+ messages in thread
From: David Miller @ 2010-08-10 23:48 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, klassert

From: Neil Horman <nhorman@tuxdriver.com>
Date: Mon, 9 Aug 2010 12:32:10 -0400

> When using netpoll, its possible to deadlock the 3c59x driver.  Since it takes
> an internal spinlock (vp->lock) that serializes boomerang_interrupt and parts
> of boomerang_start_xmit, if we call pr_debug in the former, we can go through
> the tx path on the same cpu, and recurse into the same driver again, deadlocking
> in the transmit routine.
> 
> This patch fixes that problem by stopping the queues during interrupt
> processing, so that subsequent transmits will get queued until a later point.
> Its not a great solution, but we need to find some way to serialize access to
> the register file on the card without enforcing a deadlock.  I think the queue
> stop is the best way to do that.  And since we only print things in
> boomerang_interrupt when we have debug enabled, we can mitigate the impact of
> this change to only stop the queues when debug is on.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

Nothing serialized changes to "vortex_debug" with the tests you're
making here.  Simply turning it off when a packet arrives can deadlock
the send queue of the device.

Even if proper serializatio did exist, I still see this as an awful
solution.

And the default value of this thing is "1" so it's always going to be
doing this send queue state flipping for effectively everyone.

Please find another way to solve this problem.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] 3c59x: fix deadlock when using netconsole with 3c59x
  2010-08-10 23:48 ` David Miller
@ 2010-08-11  1:16   ` Neil Horman
  0 siblings, 0 replies; 3+ messages in thread
From: Neil Horman @ 2010-08-11  1:16 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, klassert

On Tue, Aug 10, 2010 at 04:48:03PM -0700, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Mon, 9 Aug 2010 12:32:10 -0400
> 
> > When using netpoll, its possible to deadlock the 3c59x driver.  Since it takes
> > an internal spinlock (vp->lock) that serializes boomerang_interrupt and parts
> > of boomerang_start_xmit, if we call pr_debug in the former, we can go through
> > the tx path on the same cpu, and recurse into the same driver again, deadlocking
> > in the transmit routine.
> > 
> > This patch fixes that problem by stopping the queues during interrupt
> > processing, so that subsequent transmits will get queued until a later point.
> > Its not a great solution, but we need to find some way to serialize access to
> > the register file on the card without enforcing a deadlock.  I think the queue
> > stop is the best way to do that.  And since we only print things in
> > boomerang_interrupt when we have debug enabled, we can mitigate the impact of
> > this change to only stop the queues when debug is on.
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> 
> Nothing serialized changes to "vortex_debug" with the tests you're
> making here.  Simply turning it off when a packet arrives can deadlock
> the send queue of the device.
> 
> Even if proper serializatio did exist, I still see this as an awful
> solution.
> 
> And the default value of this thing is "1" so it's always going to be
> doing this send queue state flipping for effectively everyone.
> 
> Please find another way to solve this problem.
> 

Crap-spackle, you're right.  I was thinking the module_param defaulted it to
zero, but it doesn't.  Regardless, sysfs would let us change this value and
deadlock it all to heck.

Recinded.  I'll find a better way to fix this.  Sorry for the noise
Neil


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-08-11  1:16 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-08-09 16:32 [PATCH] 3c59x: fix deadlock when using netconsole with 3c59x Neil Horman
2010-08-10 23:48 ` David Miller
2010-08-11  1:16   ` Neil Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).