* 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage
@ 2006-07-03 21:31 Miles Lane
2006-07-03 21:43 ` Arjan van de Ven
0 siblings, 1 reply; 8+ messages in thread
From: Miles Lane @ 2006-07-03 21:31 UTC (permalink / raw)
To: Andrew Morton, LKML, Arjan van de Ven
inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
swapper/0 [HC1[1]:SC1[2]:HE0:SE0] takes:
(&ei_local->page_lock){++..}, at: [<f935079f>] ei_interrupt+0x49/0x294 [8390]
{hardirq-on-W} state was registered at:
[<c102d152>] lock_acquire+0x60/0x80
[<c1200376>] _spin_lock+0x23/0x32
[<f9350d30>] ei_start_xmit+0xa6/0x236 [8390]
[<c11a2715>] dev_hard_start_xmit+0x1c4/0x221
[<c11af1b5>] __qdisc_run+0xcc/0x185
[<c11a411e>] dev_queue_xmit+0x140/0x22e
[<f93a89f1>] mld_sendpack+0x1a0/0x26a [ipv6]
[<f93a95cb>] mld_ifc_timer_expire+0x1d6/0x1fd [ipv6]
[<c101dab2>] run_timer_softirq+0xf2/0x14a
[<c101a691>] __do_softirq+0x55/0xb0
[<c1004a8d>] do_softirq+0x58/0xbd
irq event stamp: 952615
hardirqs last enabled at (952614): [<c1200800>]
_spin_unlock_irqrestore+0x36/0x59
hardirqs last disabled at (952615): [<c1002fcf>] common_interrupt+0x1b/0x2c
softirqs last enabled at (952574): [<c101a6e7>] __do_softirq+0xab/0xb0
softirqs last disabled at (952579): [<c1004a8d>] do_softirq+0x58/0xbd
other info that might help us debug this:
1 lock held by swapper/0:
#0: (&dev->_xmit_lock){-...}, at: [<c11af146>] __qdisc_run+0x5d/0x185
stack backtrace:
[<c1003502>] show_trace_log_lvl+0x54/0xfd
[<c1003b6a>] show_trace+0xd/0x10
[<c1003c0e>] dump_stack+0x19/0x1b
[<c102b7c7>] print_usage_bug+0x1cc/0x1d9
[<c102bbbf>] mark_lock+0x8a/0x360
[<c102c8ac>] __lock_acquire+0x38f/0x970
[<c102d152>] lock_acquire+0x60/0x80
[<c1200376>] _spin_lock+0x23/0x32
[<f935079f>] ei_interrupt+0x49/0x294 [8390]
[<f94983fa>] ei_irq_wrapper+0xb/0x1d [pcnet_cs]
[<c103f133>] handle_IRQ_event+0x20/0x50
[<c104026d>] handle_level_irq+0x76/0xc1
[<c1004bc2>] do_IRQ+0xd0/0xf6
[<c1002fd9>] common_interrupt+0x25/0x2c
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage 2006-07-03 21:31 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage Miles Lane @ 2006-07-03 21:43 ` Arjan van de Ven 2006-07-03 22:25 ` Alan Cox 0 siblings, 1 reply; 8+ messages in thread From: Arjan van de Ven @ 2006-07-03 21:43 UTC (permalink / raw) To: Miles Lane; +Cc: mingo, Andrew Morton, LKML On Mon, 2006-07-03 at 14:31 -0700, Miles Lane wrote: > inconsistent {hardirq-on-W} -> {in-hardirq-W} usage. > swapper/0 [HC1[1]:SC1[2]:HE0:SE0] takes: > (&ei_local->page_lock){++..}, at: [<f935079f>] ei_interrupt+0x49/0x294 [8390] > {hardirq-on-W} state was registered at: > [<c102d152>] lock_acquire+0x60/0x80 > [<c1200376>] _spin_lock+0x23/0x32 fun.. you have many strange network cards :-) ok I'm not quite happy about this one yet, but can you try this patch? The ne2000 drivers use disable_irq as a poor mans locking construct; make sure lockdep knows about these. NOTE NOTE: the ne2000 driver calls these *from interrupt context*. That's a new situation that needs to be analyzed for correctness still; it feels really wrong to me (but then again so does disable_irq() tricks in general) Andrew: once testing completes and I've looked at it more I'll resend with a proper description + signed-off line --- drivers/net/8390.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) Index: linux-2.6.17-mm4/drivers/net/8390.c =================================================================== --- linux-2.6.17-mm4.orig/drivers/net/8390.c +++ linux-2.6.17-mm4/drivers/net/8390.c @@ -299,7 +299,7 @@ static int ei_start_xmit(struct sk_buff * Slow phase with lock held. */ - disable_irq_nosync(dev->irq); + disable_irq_nosync_lockdep(dev->irq); spin_lock(&ei_local->page_lock); @@ -338,7 +338,7 @@ static int ei_start_xmit(struct sk_buff netif_stop_queue(dev); outb_p(ENISR_ALL, e8390_base + EN0_IMR); spin_unlock(&ei_local->page_lock); - enable_irq(dev->irq); + enable_irq_lockdep(dev->irq); ei_local->stat.tx_errors++; return 1; } @@ -379,7 +379,7 @@ static int ei_start_xmit(struct sk_buff outb_p(ENISR_ALL, e8390_base + EN0_IMR); spin_unlock(&ei_local->page_lock); - enable_irq(dev->irq); + enable_irq_lockdep(dev->irq); dev_kfree_skb (skb); ei_local->stat.tx_bytes += send_length; @@ -505,9 +505,9 @@ irqreturn_t ei_interrupt(int irq, void * #ifdef CONFIG_NET_POLL_CONTROLLER void ei_poll(struct net_device *dev) { - disable_irq(dev->irq); + disable_irq_lockdep(dev->irq); ei_interrupt(dev->irq, dev, NULL); - enable_irq(dev->irq); + enable_irq_lockdep(dev->irq); } #endif ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage 2006-07-03 21:43 ` Arjan van de Ven @ 2006-07-03 22:25 ` Alan Cox 2006-07-03 23:14 ` Miles Lane 0 siblings, 1 reply; 8+ messages in thread From: Alan Cox @ 2006-07-03 22:25 UTC (permalink / raw) To: Arjan van de Ven; +Cc: Miles Lane, mingo, Andrew Morton, LKML Ar Llu, 2006-07-03 am 23:43 +0200, ysgrifennodd Arjan van de Ven: > The ne2000 drivers use disable_irq as a poor mans locking construct; > make sure lockdep knows about these. Actually they use it as a locking construct because the kernel lacks the constructs it needs (or did when the work was done). We don't have a spin_lock_disable_irq(lock, n) construct which some other OS's do. There are also good reasons for not having one given so few drivers realy need it. The underlying problem is that the NE2K chips are slow, especially some of the ones nailed to PCI with FPGA glue. So slow that worst case taking a spinlock and uploading a packet drops characters at 9600 baud serial. The driver disables the on chip IRQ, which for 99.9% of cases then ensures we don't get further interrupts, then takes the lock. An IRQ running in parallel on another CPU also holds the lock so that much is fine. However: the people at Intel designed the original APIC bus to be somewhat slow, asynchronous and also without a guaranteed "one message send, one message receive" sematic of any kind. That means we have a corner case where we also have to disable_irq_nosync to ensure that an IRQ that left the 8390 but has not yet arrived at the processor doesn't race with us and lock up the box. PCI posting is not the issue here, the IRQ bus is itself even more async than that. Doing that work means our tx path doesn't totally trash the machine in these cases. Alan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage 2006-07-03 22:25 ` Alan Cox @ 2006-07-03 23:14 ` Miles Lane 2006-07-04 7:36 ` Ingo Molnar ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: Miles Lane @ 2006-07-03 23:14 UTC (permalink / raw) To: Alan Cox; +Cc: Arjan van de Ven, mingo, Andrew Morton, LKML On 7/3/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > Ar Llu, 2006-07-03 am 23:43 +0200, ysgrifennodd Arjan van de Ven: > > The ne2000 drivers use disable_irq as a poor mans locking construct; > > make sure lockdep knows about these. > > Actually they use it as a locking construct because the kernel lacks the > constructs it needs (or did when the work was done). We don't have a > > spin_lock_disable_irq(lock, n) > > construct which some other OS's do. There are also good reasons for not > having one given so few drivers realy need it. > > The underlying problem is that the NE2K chips are slow, especially some > of the ones nailed to PCI with FPGA glue. So slow that worst case taking > a spinlock and uploading a packet drops characters at 9600 baud serial. > > The driver disables the on chip IRQ, which for 99.9% of cases then > ensures we don't get further interrupts, then takes the lock. An IRQ > running in parallel on another CPU also holds the lock so that much is > fine. > > However: the people at Intel designed the original APIC bus to be > somewhat slow, asynchronous and also without a guaranteed "one message > send, one message receive" sematic of any kind. > > That means we have a corner case where we also have to > disable_irq_nosync to ensure that an IRQ that left the 8390 but has not > yet arrived at the processor doesn't race with us and lock up the box. > PCI posting is not the issue here, the IRQ bus is itself even more async > than that. > > Doing that work means our tx path doesn't totally trash the machine in > these cases. Well, I get this: pcmcia: request for exclusive IRQ could not be fulfilled. pcmcia: the driver needs updating to supported shared IRQ lines. eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D Which seems to indicate I need to tweak the PCMCIA settings to get this card working. I wonder if anyone is going to follow up on enabling shared IRQ support. Arjan, the patch you sent does cause the lockdep message to disappear, but the card doesn't work. When I plug an ethernet cable into the card, neither the 10 or 100 LED lights up. I tried running "modprobe <modname> debug=1" on 8390 and pcnet_cs, but neither seem to support modprobe options. I am recompiling with debugging enabled by tweaking the debugging values in the files. I'll send you any useful debug info I gather. Miles ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage 2006-07-03 23:14 ` Miles Lane @ 2006-07-04 7:36 ` Ingo Molnar 2006-07-04 8:18 ` Alan Cox 2006-07-04 9:26 ` Alan Cox 2 siblings, 0 replies; 8+ messages in thread From: Ingo Molnar @ 2006-07-04 7:36 UTC (permalink / raw) To: Miles Lane; +Cc: Alan Cox, Arjan van de Ven, Andrew Morton, LKML * Miles Lane <miles.lane@gmail.com> wrote: > Arjan, the patch you sent does cause the lockdep message to disappear, > but the card doesn't work. [...] did the card work without the patch? The lockdep messages themselves are harmless to functionality, they shouldnt ever break anything, they are just information for us to fix potential deadlocks. Ingo ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage 2006-07-03 23:14 ` Miles Lane 2006-07-04 7:36 ` Ingo Molnar @ 2006-07-04 8:18 ` Alan Cox 2006-07-04 9:26 ` Alan Cox 2 siblings, 0 replies; 8+ messages in thread From: Alan Cox @ 2006-07-04 8:18 UTC (permalink / raw) To: Miles Lane; +Cc: Arjan van de Ven, mingo, Andrew Morton, LKML Ar Llu, 2006-07-03 am 16:14 -0700, ysgrifennodd Miles Lane: > pcmcia: request for exclusive IRQ could not be fulfilled. > pcmcia: the driver needs updating to supported shared IRQ lines. > eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D > Which seems to indicate I need to tweak the PCMCIA settings to get this card > working. I wonder if anyone is going to follow up on enabling shared IRQ > support. Thats a bug in the PCMCIA driver. The 8390 driver core supports shared IRQ so you should just need to turn it on in pcnet_cs ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage 2006-07-03 23:14 ` Miles Lane 2006-07-04 7:36 ` Ingo Molnar 2006-07-04 8:18 ` Alan Cox @ 2006-07-04 9:26 ` Alan Cox 2006-07-05 2:20 ` Miles Lane 2 siblings, 1 reply; 8+ messages in thread From: Alan Cox @ 2006-07-04 9:26 UTC (permalink / raw) To: Miles Lane; +Cc: Arjan van de Ven, mingo, Andrew Morton, LKML Ar Llu, 2006-07-03 am 16:14 -0700, ysgrifennodd Miles Lane: > eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D > Which seems to indicate I need to tweak the PCMCIA settings to get this card > working. I wonder if anyone is going to follow up on enabling shared IRQ > support. Try this. Note the SMP locking in this driver appears iffy and looks like it was never SMP sane. --- linux.vanilla-2.6.17/drivers/net/pcmcia/pcnet_cs.c 2006-06-19 17:29:46.000000000 +0100 +++ linux-2.6.17/drivers/net/pcmcia/pcnet_cs.c 2006-07-04 09:55:07.695012656 +0100 @@ -25,6 +25,8 @@ Based also on Keith Moore's changes to Don Becker's code, for IBM CCAE support. Drivers merged back together, and shared-memory Socket EA support added, by Ken Raeburn, September 1995. + + FIXME: Not SMP safe ======================================================================*/ @@ -254,7 +256,7 @@ info->p_dev = link; link->priv = dev; - link->irq.Attributes = IRQ_TYPE_EXCLUSIVE; + link->irq.Attributes = IRQ_TYPE_DYNAMIC_SHARING; link->irq.IRQInfo1 = IRQ_LEVEL_ID; link->conf.Attributes = CONF_ENABLE_IRQ; link->conf.IntType = INT_MEMORY_AND_IO; @@ -998,7 +1000,8 @@ link->open++; set_misc_reg(dev); - request_irq(dev->irq, ei_irq_wrapper, SA_SHIRQ, dev_info, dev); + if (request_irq(dev->irq, ei_irq_wrapper, SA_SHIRQ, dev_info, dev) < 0) + return -EBUSY; info->phy_id = info->eth_phy; info->link_status = 0x00; ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage 2006-07-04 9:26 ` Alan Cox @ 2006-07-05 2:20 ` Miles Lane 0 siblings, 0 replies; 8+ messages in thread From: Miles Lane @ 2006-07-05 2:20 UTC (permalink / raw) To: Alan Cox; +Cc: Arjan van de Ven, mingo, Andrew Morton, LKML On 7/4/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > Ar Llu, 2006-07-03 am 16:14 -0700, ysgrifennodd Miles Lane: > > eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D > > Which seems to indicate I need to tweak the PCMCIA settings to get this card > > working. I wonder if anyone is going to follow up on enabling shared IRQ > > support. > > > Try this. Note the SMP locking in this driver appears iffy and looks > like it was never SMP sane. The patch corrects the messages about shared interrupts: pccard: PCMCIA card inserted into slot 0 cs: memory probe 0x0c0000-0x0fffff: excluding 0xc0000-0xcffff 0xdc000-0xfffff cs: memory probe 0x50000000-0x51ffffff: excluding 0x50000000-0x51ffffff cs: memory probe 0x60000000-0x60ffffff: clean. cs: memory probe 0xa0000000-0xa0ffffff: clean. cs: memory probe 0xe0200000-0xe02fffff: excluding 0xe0200000-0xe020ffff pcmcia: registering new device pcmcia0.0 PM: Adding info for pcmcia:0.0 eth2: NE2000 (DL10022 rev 30): io 0x300, irq 11, hw_addr 00:50:BA:73:92:3D I have lost the connector cable that attaches the card to an ethernet cable, so I have been using a cable labelled 3COM instead. It has LEDs for 10 and 100 Kbps connections. Neither LED is lighting up. On the other hand, NetworkManager seems aware when I have an ethernet cable attached. I now suspect that I need a new cable. The card is a D-Link DFE-650 Fast Ethernet PCMCIA adapter. Maybe I should order one of these: http://shopping.yahoo.com/p:QVS%20CPN-GN100T%20:1991447348;_ylt=Ap_SzM9pNc5eVJDZKVqsYd5tpcsE;_ylu=X3oDMTBuZDl1N2RxBF9zAzU5MDk4NTIxBGx0AzQEc2VjA3Ny?clink=dmss//ctx=sc:cnetwork_adapter,c:cnetwork_adapter,mid:57,pid:1991447348,pdid:57,pos:6 What do you think? Miles ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2006-07-05 2:20 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-03 21:31 2.6.17-mm5 + pcmcia/hostap/8139too patches -- inconsistent {hardirq-on-W} -> {in-hardirq-W} usage Miles Lane
2006-07-03 21:43 ` Arjan van de Ven
2006-07-03 22:25 ` Alan Cox
2006-07-03 23:14 ` Miles Lane
2006-07-04 7:36 ` Ingo Molnar
2006-07-04 8:18 ` Alan Cox
2006-07-04 9:26 ` Alan Cox
2006-07-05 2:20 ` Miles Lane
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox