public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
@ 2006-07-03 21:31 Miles Lane
  2006-07-03 21:43 ` Arjan van de Ven
  0 siblings, 1 reply; 8+ messages in thread
From: Miles Lane @ 2006-07-03 21:31 UTC (permalink / raw)
  To: Andrew Morton, LKML, Arjan van de Ven

inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
swapper/0 [HC1[1]:SC1[2]:HE0:SE0] takes:
 (&ei_local->page_lock){++..}, at: [<f935079f>] ei_interrupt+0x49/0x294 [8390]
{hardirq-on-W} state was registered at:
  [<c102d152>] lock_acquire+0x60/0x80
  [<c1200376>] _spin_lock+0x23/0x32
  [<f9350d30>] ei_start_xmit+0xa6/0x236 [8390]
  [<c11a2715>] dev_hard_start_xmit+0x1c4/0x221
  [<c11af1b5>] __qdisc_run+0xcc/0x185
  [<c11a411e>] dev_queue_xmit+0x140/0x22e
  [<f93a89f1>] mld_sendpack+0x1a0/0x26a [ipv6]
  [<f93a95cb>] mld_ifc_timer_expire+0x1d6/0x1fd [ipv6]
  [<c101dab2>] run_timer_softirq+0xf2/0x14a
  [<c101a691>] __do_softirq+0x55/0xb0
  [<c1004a8d>] do_softirq+0x58/0xbd
irq event stamp: 952615
hardirqs last  enabled at (952614): [<c1200800>]
_spin_unlock_irqrestore+0x36/0x59
hardirqs last disabled at (952615): [<c1002fcf>] common_interrupt+0x1b/0x2c
softirqs last  enabled at (952574): [<c101a6e7>] __do_softirq+0xab/0xb0
softirqs last disabled at (952579): [<c1004a8d>] do_softirq+0x58/0xbd

other info that might help us debug this:
1 lock held by swapper/0:
 #0:  (&dev->_xmit_lock){-...}, at: [<c11af146>] __qdisc_run+0x5d/0x185

stack backtrace:
 [<c1003502>] show_trace_log_lvl+0x54/0xfd
 [<c1003b6a>] show_trace+0xd/0x10
 [<c1003c0e>] dump_stack+0x19/0x1b
 [<c102b7c7>] print_usage_bug+0x1cc/0x1d9
 [<c102bbbf>] mark_lock+0x8a/0x360
 [<c102c8ac>] __lock_acquire+0x38f/0x970
 [<c102d152>] lock_acquire+0x60/0x80
 [<c1200376>] _spin_lock+0x23/0x32
 [<f935079f>] ei_interrupt+0x49/0x294 [8390]
 [<f94983fa>] ei_irq_wrapper+0xb/0x1d [pcnet_cs]
 [<c103f133>] handle_IRQ_event+0x20/0x50
 [<c104026d>] handle_level_irq+0x76/0xc1
 [<c1004bc2>] do_IRQ+0xd0/0xf6
 [<c1002fd9>] common_interrupt+0x25/0x2c

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
  2006-07-03 21:31 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage Miles Lane
@ 2006-07-03 21:43 ` Arjan van de Ven
  2006-07-03 22:25   ` Alan Cox
  0 siblings, 1 reply; 8+ messages in thread
From: Arjan van de Ven @ 2006-07-03 21:43 UTC (permalink / raw)
  To: Miles Lane; +Cc: mingo, Andrew Morton, LKML

On Mon, 2006-07-03 at 14:31 -0700, Miles Lane wrote:
> inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
> swapper/0 [HC1[1]:SC1[2]:HE0:SE0] takes:
>  (&ei_local->page_lock){++..}, at: [<f935079f>] ei_interrupt+0x49/0x294 [8390]
> {hardirq-on-W} state was registered at:
>   [<c102d152>] lock_acquire+0x60/0x80
>   [<c1200376>] _spin_lock+0x23/0x32

fun.. you have many strange network cards :-)

ok I'm not quite happy about this one yet, but can you try this patch?


The ne2000 drivers use disable_irq as a poor mans locking construct;
make sure lockdep knows about these.
NOTE NOTE: the ne2000 driver calls these *from interrupt context*.
That's a new situation that needs to be analyzed for correctness still;
it feels really wrong to me (but then again so does disable_irq() tricks
in general)


Andrew: once testing completes and I've looked at it more I'll resend
with a proper description + signed-off line

---
 drivers/net/8390.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Index: linux-2.6.17-mm4/drivers/net/8390.c
===================================================================
--- linux-2.6.17-mm4.orig/drivers/net/8390.c
+++ linux-2.6.17-mm4/drivers/net/8390.c
@@ -299,7 +299,7 @@ static int ei_start_xmit(struct sk_buff 
 	 *	Slow phase with lock held.
 	 */
 	 
-	disable_irq_nosync(dev->irq);
+	disable_irq_nosync_lockdep(dev->irq);
 	
 	spin_lock(&ei_local->page_lock);
 	
@@ -338,7 +338,7 @@ static int ei_start_xmit(struct sk_buff 
 		netif_stop_queue(dev);
 		outb_p(ENISR_ALL, e8390_base + EN0_IMR);
 		spin_unlock(&ei_local->page_lock);
-		enable_irq(dev->irq);
+		enable_irq_lockdep(dev->irq);
 		ei_local->stat.tx_errors++;
 		return 1;
 	}
@@ -379,7 +379,7 @@ static int ei_start_xmit(struct sk_buff 
 	outb_p(ENISR_ALL, e8390_base + EN0_IMR);
 	
 	spin_unlock(&ei_local->page_lock);
-	enable_irq(dev->irq);
+	enable_irq_lockdep(dev->irq);
 
 	dev_kfree_skb (skb);
 	ei_local->stat.tx_bytes += send_length;
@@ -505,9 +505,9 @@ irqreturn_t ei_interrupt(int irq, void *
 #ifdef CONFIG_NET_POLL_CONTROLLER
 void ei_poll(struct net_device *dev)
 {
-	disable_irq(dev->irq);
+	disable_irq_lockdep(dev->irq);
 	ei_interrupt(dev->irq, dev, NULL);
-	enable_irq(dev->irq);
+	enable_irq_lockdep(dev->irq);
 }
 #endif
 



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
  2006-07-03 21:43 ` Arjan van de Ven
@ 2006-07-03 22:25   ` Alan Cox
  2006-07-03 23:14     ` Miles Lane
  0 siblings, 1 reply; 8+ messages in thread
From: Alan Cox @ 2006-07-03 22:25 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Miles Lane, mingo, Andrew Morton, LKML

Ar Llu, 2006-07-03 am 23:43 +0200, ysgrifennodd Arjan van de Ven:
> The ne2000 drivers use disable_irq as a poor mans locking construct;
> make sure lockdep knows about these.

Actually they use it as a locking construct because the kernel lacks the
constructs it needs (or did when the work was done). We don't have a 

spin_lock_disable_irq(lock, n)

construct which some other OS's do. There are also good reasons for not
having one given so few drivers realy need it.

The underlying problem is that the NE2K chips are slow, especially some
of the ones nailed to PCI with FPGA glue. So slow that worst case taking
a spinlock and uploading a packet drops characters at 9600 baud serial.

The driver disables the on chip IRQ, which for 99.9% of cases then
ensures we don't get further interrupts, then takes the lock. An IRQ
running in parallel on another CPU also holds the lock so that much is
fine.
 
However: the people at Intel designed the original APIC bus to be
somewhat slow, asynchronous and also without a guaranteed "one message
send, one message receive" sematic of any kind.

That means we have a corner case where we also have to
disable_irq_nosync to ensure that an IRQ that left the 8390 but has not
yet arrived at the processor doesn't race with us and lock up the box.
PCI posting is not the issue here, the IRQ bus is itself even more async
than that.

Doing that work means our tx path doesn't totally trash the machine in
these cases.

Alan


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
  2006-07-03 22:25   ` Alan Cox
@ 2006-07-03 23:14     ` Miles Lane
  2006-07-04  7:36       ` Ingo Molnar
                         ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Miles Lane @ 2006-07-03 23:14 UTC (permalink / raw)
  To: Alan Cox; +Cc: Arjan van de Ven, mingo, Andrew Morton, LKML

On 7/3/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> Ar Llu, 2006-07-03 am 23:43 +0200, ysgrifennodd Arjan van de Ven:
> > The ne2000 drivers use disable_irq as a poor mans locking construct;
> > make sure lockdep knows about these.
>
> Actually they use it as a locking construct because the kernel lacks the
> constructs it needs (or did when the work was done). We don't have a
>
> spin_lock_disable_irq(lock, n)
>
> construct which some other OS's do. There are also good reasons for not
> having one given so few drivers realy need it.
>
> The underlying problem is that the NE2K chips are slow, especially some
> of the ones nailed to PCI with FPGA glue. So slow that worst case taking
> a spinlock and uploading a packet drops characters at 9600 baud serial.
>
> The driver disables the on chip IRQ, which for 99.9% of cases then
> ensures we don't get further interrupts, then takes the lock. An IRQ
> running in parallel on another CPU also holds the lock so that much is
> fine.
>
> However: the people at Intel designed the original APIC bus to be
> somewhat slow, asynchronous and also without a guaranteed "one message
> send, one message receive" sematic of any kind.
>
> That means we have a corner case where we also have to
> disable_irq_nosync to ensure that an IRQ that left the 8390 but has not
> yet arrived at the processor doesn't race with us and lock up the box.
> PCI posting is not the issue here, the IRQ bus is itself even more async
> than that.
>
> Doing that work means our tx path doesn't totally trash the machine in
> these cases.

Well, I get this:
pcmcia: request for exclusive IRQ could not be fulfilled.
pcmcia: the driver needs updating to supported shared IRQ lines.
eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D
Which seems to indicate I need to tweak the PCMCIA settings to get this card
working.  I wonder if anyone is going to follow up on enabling shared IRQ
support.

Arjan, the patch you sent does cause the lockdep message to disappear,
but the card doesn't work.  When I plug an ethernet cable into the card,
neither the 10 or 100 LED lights up.  I tried running
"modprobe <modname> debug=1" on 8390 and pcnet_cs, but neither seem
to support modprobe options.  I am recompiling with debugging enabled by
tweaking the debugging values in the files.  I'll send you any useful debug
info I gather.

      Miles

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
  2006-07-03 23:14     ` Miles Lane
@ 2006-07-04  7:36       ` Ingo Molnar
  2006-07-04  8:18       ` Alan Cox
  2006-07-04  9:26       ` Alan Cox
  2 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2006-07-04  7:36 UTC (permalink / raw)
  To: Miles Lane; +Cc: Alan Cox, Arjan van de Ven, Andrew Morton, LKML


* Miles Lane <miles.lane@gmail.com> wrote:

> Arjan, the patch you sent does cause the lockdep message to disappear, 
> but the card doesn't work. [...]

did the card work without the patch? The lockdep messages themselves are 
harmless to functionality, they shouldnt ever break anything, they are 
just information for us to fix potential deadlocks.

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
  2006-07-03 23:14     ` Miles Lane
  2006-07-04  7:36       ` Ingo Molnar
@ 2006-07-04  8:18       ` Alan Cox
  2006-07-04  9:26       ` Alan Cox
  2 siblings, 0 replies; 8+ messages in thread
From: Alan Cox @ 2006-07-04  8:18 UTC (permalink / raw)
  To: Miles Lane; +Cc: Arjan van de Ven, mingo, Andrew Morton, LKML

Ar Llu, 2006-07-03 am 16:14 -0700, ysgrifennodd Miles Lane:
> pcmcia: request for exclusive IRQ could not be fulfilled.
> pcmcia: the driver needs updating to supported shared IRQ lines.
> eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D
> Which seems to indicate I need to tweak the PCMCIA settings to get this card
> working.  I wonder if anyone is going to follow up on enabling shared IRQ
> support.

Thats a bug in the PCMCIA driver. The 8390 driver core supports shared
IRQ so you should just need to turn it on in pcnet_cs



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
  2006-07-03 23:14     ` Miles Lane
  2006-07-04  7:36       ` Ingo Molnar
  2006-07-04  8:18       ` Alan Cox
@ 2006-07-04  9:26       ` Alan Cox
  2006-07-05  2:20         ` Miles Lane
  2 siblings, 1 reply; 8+ messages in thread
From: Alan Cox @ 2006-07-04  9:26 UTC (permalink / raw)
  To: Miles Lane; +Cc: Arjan van de Ven, mingo, Andrew Morton, LKML

Ar Llu, 2006-07-03 am 16:14 -0700, ysgrifennodd Miles Lane:
> eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D
> Which seems to indicate I need to tweak the PCMCIA settings to get this card
> working.  I wonder if anyone is going to follow up on enabling shared IRQ
> support.


Try this. Note the SMP locking in this driver appears iffy and looks
like it was never SMP sane.

--- linux.vanilla-2.6.17/drivers/net/pcmcia/pcnet_cs.c	2006-06-19 17:29:46.000000000 +0100
+++ linux-2.6.17/drivers/net/pcmcia/pcnet_cs.c	2006-07-04 09:55:07.695012656 +0100
@@ -25,6 +25,8 @@
     Based also on Keith Moore's changes to Don Becker's code, for IBM
     CCAE support.  Drivers merged back together, and shared-memory
     Socket EA support added, by Ken Raeburn, September 1995.
+    
+    FIXME: Not SMP safe
 
 ======================================================================*/
 
@@ -254,7 +256,7 @@
     info->p_dev = link;
     link->priv = dev;
 
-    link->irq.Attributes = IRQ_TYPE_EXCLUSIVE;
+    link->irq.Attributes = IRQ_TYPE_DYNAMIC_SHARING;
     link->irq.IRQInfo1 = IRQ_LEVEL_ID;
     link->conf.Attributes = CONF_ENABLE_IRQ;
     link->conf.IntType = INT_MEMORY_AND_IO;
@@ -998,7 +1000,8 @@
     link->open++;
 
     set_misc_reg(dev);
-    request_irq(dev->irq, ei_irq_wrapper, SA_SHIRQ, dev_info, dev);
+    if (request_irq(dev->irq, ei_irq_wrapper, SA_SHIRQ, dev_info, dev) < 0)
+        return -EBUSY;
 
     info->phy_id = info->eth_phy;
     info->link_status = 0x00;


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
  2006-07-04  9:26       ` Alan Cox
@ 2006-07-05  2:20         ` Miles Lane
  0 siblings, 0 replies; 8+ messages in thread
From: Miles Lane @ 2006-07-05  2:20 UTC (permalink / raw)
  To: Alan Cox; +Cc: Arjan van de Ven, mingo, Andrew Morton, LKML

On 7/4/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> Ar Llu, 2006-07-03 am 16:14 -0700, ysgrifennodd Miles Lane:
> > eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D
> > Which seems to indicate I need to tweak the PCMCIA settings to get this card
> > working.  I wonder if anyone is going to follow up on enabling shared IRQ
> > support.
>
>
> Try this. Note the SMP locking in this driver appears iffy and looks
> like it was never SMP sane.

The patch corrects the messages about shared interrupts:

pccard: PCMCIA card inserted into slot 0
cs: memory probe 0x0c0000-0x0fffff: excluding 0xc0000-0xcffff 0xdc000-0xfffff
cs: memory probe 0x50000000-0x51ffffff: excluding 0x50000000-0x51ffffff
cs: memory probe 0x60000000-0x60ffffff: clean.
cs: memory probe 0xa0000000-0xa0ffffff: clean.
cs: memory probe 0xe0200000-0xe02fffff: excluding 0xe0200000-0xe020ffff
pcmcia: registering new device pcmcia0.0
PM: Adding info for pcmcia:0.0
eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D

I have lost the connector cable that attaches the card to an ethernet
cable, so I have been using a cable labelled 3COM instead.  It has
LEDs for 10 and 100 Kbps connections.
Neither LED is lighting up.  On the other hand, NetworkManager
seems aware when I have an ethernet cable attached.  I now
suspect that I need a new cable.  The card is a D-Link DFE-650
Fast Ethernet PCMCIA adapter.  Maybe I should order one of these:
http://shopping.yahoo.com/p:QVS%20CPN-GN100T%20:1991447348;_ylt=Ap_SzM9pNc5eVJDZKVqsYd5tpcsE;_ylu=X3oDMTBuZDl1N2RxBF9zAzU5MDk4NTIxBGx0AzQEc2VjA3Ny?clink=dmss//ctx=sc:cnetwork_adapter,c:cnetwork_adapter,mid:57,pid:1991447348,pdid:57,pos:6

What do you think?
           Miles

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2006-07-05  2:20 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-03 21:31 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage Miles Lane
2006-07-03 21:43 ` Arjan van de Ven
2006-07-03 22:25   ` Alan Cox
2006-07-03 23:14     ` Miles Lane
2006-07-04  7:36       ` Ingo Molnar
2006-07-04  8:18       ` Alan Cox
2006-07-04  9:26       ` Alan Cox
2006-07-05  2:20         ` Miles Lane

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox