* e100 lockdep irq lock inversion.
@ 2006-07-07 17:19 Dave Jones
2006-07-07 18:13 ` auro deadlock (was Re: e100 lockdep irq lock inversion.) Arjan van de Ven
0 siblings, 1 reply; 6+ messages in thread
From: Dave Jones @ 2006-07-07 17:19 UTC (permalink / raw)
To: Linux Kernel; +Cc: netdev
Another one triggered by a Fedora-development user..
e100: eth1: e100_watchdog: link up, 100Mbps, half-duplex
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
---------------------------------------------------------
ipcalc/1671 just changed the state of lock:
(&skb_queue_lock_key){-+..}, at: [<c05ebe2f>] udp_ioctl+0x3b/0x6e
but this lock was taken by another, hard-irq-safe lock in the past:
(&ai->aux_lock){+...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
no locks held by ipcalc/1671.
the first lock's dependencies:
-> (&skb_queue_lock_key){-+..} ops: 0 {
initial-use at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e5c7>] _spin_lock_irqsave+0x22/0x32
[<c05b0d2d>] skb_queue_tail+0x14/0x32
[<c05c8a53>] netlink_broadcast+0x1bf/0x28e
[<c04e414b>] kobject_uevent+0x345/0x3b6
[<c055080d>] class_device_add+0x2a7/0x3e6
[<c0594d0c>] input_register_device+0x106/0x238
[<c059947a>] psmouse_connect+0x162/0x20f
[<c0590a09>] serio_connect_driver+0x1e/0x2e
[<c0590a2f>] serio_driver_probe+0x16/0x18
[<c054f9a6>] driver_probe_device+0x45/0x92
[<c054fad3>] __driver_attach+0x68/0x93
[<c054f420>] bus_for_each_dev+0x3a/0x5f
[<c054f901>] driver_attach+0x14/0x17
[<c054f0f7>] bus_add_driver+0x68/0x106
[<c054fda1>] driver_register+0x9d/0xa2
[<c0591522>] serio_thread+0x152/0x27c
[<c04365dc>] kthread+0xc3/0xef
[<c0402005>] kernel_thread_helper+0x5/0xb
in-softirq-W at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e5c7>] _spin_lock_irqsave+0x22/0x32
[<c05b0d2d>] skb_queue_tail+0x14/0x32
[<c05b05c4>] sock_queue_rcv_skb+0xe3/0x11f
[<c060adae>] packet_rcv_spkt+0x120/0x136
[<c05b5b86>] netif_receive_skb+0x1d4/0x26b
[<e09fa8e8>] e100_poll+0x15a/0x2fc [e100]
[<c05b7682>] net_rx_action+0xa6/0x1df
[<c0429809>] __do_softirq+0x78/0xf2
[<c04065ab>] do_softirq+0x5a/0xbe
hardirq-on-W at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e2c8>] _spin_lock_bh+0x1e/0x2d
[<c05ebe2f>] udp_ioctl+0x3b/0x6e
[<c05f08e7>] inet_ioctl+0x8c/0x91
[<c05acabc>] sock_ioctl+0x1b5/0x1d3
[<c0481e56>] do_ioctl+0x22/0x67
[<c04820f3>] vfs_ioctl+0x258/0x26b
[<c048214d>] sys_ioctl+0x47/0x62
[<c0403f2f>] syscall_call+0x7/0xb
}
... key at: [<c09b0748>] skb_queue_lock_key+0x0/0x18
the second lock's dependencies:
-> (&ai->aux_lock){+...} ops: 0 {
initial-use at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e5c7>] _spin_lock_irqsave+0x22/0x32
[<e09b6cd8>] mpi_start_xmit+0x79/0xdd [airo]
[<c05b5fc5>] dev_hard_start_xmit+0x19f/0x1fc
[<c05c48d8>] __qdisc_run+0xdd/0x197
[<c05b7a40>] dev_queue_xmit+0x12a/0x222
[<c060a98a>] packet_sendmsg_spkt+0x172/0x199
[<c05ac5df>] sock_sendmsg+0xe8/0x103
[<c05ad7a7>] sys_sendto+0xbe/0xdc
[<c05adf27>] sys_socketcall+0xfb/0x186
[<c0403f2f>] syscall_call+0x7/0xb
in-hardirq-W at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e5c7>] _spin_lock_irqsave+0x22/0x32
[<e09b7b6d>] airo_interrupt+0xe31/0xffb [airo]
[<c0451020>] handle_IRQ_event+0x20/0x4d
[<c04510e1>] __do_IRQ+0x94/0xef
[<c04066c8>] do_IRQ+0xb9/0xcd
[<c04049d1>] common_interrupt+0x25/0x2c
[<c04030aa>] cpu_idle+0xa7/0xc1
[<c0400591>] rest_init+0x23/0x26
[<c07a7810>] start_kernel+0x3a1/0x3a9
[<c0400210>] 0xc0400210
}
... key at: [<e09c2a80>] __key.24227+0x0/0xffff6704 [airo]
-> (&skb_queue_lock_key){-+..} ops: 0 {
initial-use at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e5c7>] _spin_lock_irqsave+0x22/0x32
[<c05b0d2d>] skb_queue_tail+0x14/0x32
[<c05c8a53>] netlink_broadcast+0x1bf/0x28e
[<c04e414b>] kobject_uevent+0x345/0x3b6
[<c055080d>] class_device_add+0x2a7/0x3e6
[<c0594d0c>] input_register_device+0x106/0x238
[<c059947a>] psmouse_connect+0x162/0x20f
[<c0590a09>] serio_connect_driver+0x1e/0x2e
[<c0590a2f>] serio_driver_probe+0x16/0x18
[<c054f9a6>] driver_probe_device+0x45/0x92
[<c054fad3>] __driver_attach+0x68/0x93
[<c054f420>] bus_for_each_dev+0x3a/0x5f
[<c054f901>] driver_attach+0x14/0x17
[<c054f0f7>] bus_add_driver+0x68/0x106
[<c054fda1>] driver_register+0x9d/0xa2
[<c0591522>] serio_thread+0x152/0x27c
[<c04365dc>] kthread+0xc3/0xef
[<c0402005>] kernel_thread_helper+0x5/0xb
in-softirq-W at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e5c7>] _spin_lock_irqsave+0x22/0x32
[<c05b0d2d>] skb_queue_tail+0x14/0x32
[<c05b05c4>] sock_queue_rcv_skb+0xe3/0x11f
[<c060adae>] packet_rcv_spkt+0x120/0x136
[<c05b5b86>] netif_receive_skb+0x1d4/0x26b
[<e09fa8e8>] e100_poll+0x15a/0x2fc [e100]
[<c05b7682>] net_rx_action+0xa6/0x1df
[<c0429809>] __do_softirq+0x78/0xf2
[<c04065ab>] do_softirq+0x5a/0xbe
hardirq-on-W at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e2c8>] _spin_lock_bh+0x1e/0x2d
[<c05ebe2f>] udp_ioctl+0x3b/0x6e
[<c05f08e7>] inet_ioctl+0x8c/0x91
[<c05acabc>] sock_ioctl+0x1b5/0x1d3
[<c0481e56>] do_ioctl+0x22/0x67
[<c04820f3>] vfs_ioctl+0x258/0x26b
[<c048214d>] sys_ioctl+0x47/0x62
[<c0403f2f>] syscall_call+0x7/0xb
}
... key at: [<c09b0748>] skb_queue_lock_key+0x0/0x18
... acquired at:
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e5c7>] _spin_lock_irqsave+0x22/0x32
[<c05b0d2d>] skb_queue_tail+0x14/0x32
[<e09b6ce8>] mpi_start_xmit+0x89/0xdd [airo]
[<c05b5fc5>] dev_hard_start_xmit+0x19f/0x1fc
[<c05c48d8>] __qdisc_run+0xdd/0x197
[<c05b7a40>] dev_queue_xmit+0x12a/0x222
[<c060a98a>] packet_sendmsg_spkt+0x172/0x199
[<c05ac5df>] sock_sendmsg+0xe8/0x103
[<c05ad7a7>] sys_sendto+0xbe/0xdc
[<c05adf27>] sys_socketcall+0xfb/0x186
[<c0403f2f>] syscall_call+0x7/0xb
stack backtrace:
[<c0405167>] show_trace_log_lvl+0x54/0xfd
[<c040571e>] show_trace+0xd/0x10
[<c040583d>] dump_stack+0x19/0x1b
[<c043aa70>] print_irq_inversion_bug+0xe1/0xee
[<c043ab6e>] check_usage_backwards+0x32/0x3b
[<c043ae49>] mark_lock+0x217/0x36a
[<c043ba86>] __lock_acquire+0x43e/0x98d
[<c043c546>] lock_acquire+0x4b/0x6d
[<c060e2c8>] _spin_lock_bh+0x1e/0x2d
[<c05ebe2f>] udp_ioctl+0x3b/0x6e
[<c05f08e7>] inet_ioctl+0x8c/0x91
[<c05acabc>] sock_ioctl+0x1b5/0x1d3
[<c0481e56>] do_ioctl+0x22/0x67
[<c04820f3>] vfs_ioctl+0x258/0x26b
[<c048214d>] sys_ioctl+0x47/0x62
[<c0403f2f>] syscall_call+0x7/0xb
--
http://www.codemonkey.org.uk
^ permalink raw reply [flat|nested] 6+ messages in thread* auro deadlock (was Re: e100 lockdep irq lock inversion.) 2006-07-07 17:19 e100 lockdep irq lock inversion Dave Jones @ 2006-07-07 18:13 ` Arjan van de Ven 2006-07-07 19:09 ` auro deadlock David Miller 2006-07-08 3:06 ` auro deadlock (was Re: e100 lockdep irq lock inversion.) Herbert Xu 0 siblings, 2 replies; 6+ messages in thread From: Arjan van de Ven @ 2006-07-07 18:13 UTC (permalink / raw) To: Dave Jones; +Cc: akpm, mingo, Linux Kernel, netdev On Fri, 2006-07-07 at 13:19 -0400, Dave Jones wrote: > Another one triggered by a Fedora-development user.. > > e100: eth1: e100_watchdog: link up, 100Mbps, half-duplex > > ========================================================= > [ INFO: possible irq lock inversion dependency detected ] > --------------------------------------------------------- > ipcalc/1671 just changed the state of lock: > (&skb_queue_lock_key){-+..}, at: [<c05ebe2f>] udp_ioctl+0x3b/0x6e > but this lock was taken by another, hard-irq-safe lock in the past: > (&ai->aux_lock){+...} > > and interrupts could create inverse lock ordering between them. ok the situation is this: the airo driver has a per card aux_lock. This lock is used in the hardirq handler, and thus all uses need to be irqsave (and they are) Act 1 Enter the mpi_start_xmit() function, which is airo's xmit function. This function takes the aux_lock first, with irq's off, then calls skb_queue_tail(). skb_queue_tail takes the sk_receive_queue.lock (with irqsave as well). Act 2 Enter the ipcalc program. This program calls an ioctl, which ends up calling udp_ioctl. udp_ioctl does spin_lock_bh(&sk->sk_receive_queue.lock); This is a deadlock For brevity, lock A is the aux_lock, lock B is the sk_receive_queue.lock CPU 0 CPU 1 user context user context udp_ioctl takes lock B with _bh (leaving irqs enabled) mpi_start_function takes lock A with _irqsave mpi_start_function takes lock B with _irqsave and spins interrupt happens the hard irq handler takes lock A .... and spins for cpu 0 eg a classic AB-BA deadlock. Now a question for netdev: what is the interrupt-or-softirq rules for the sk_receive_queue.lock? Anyway, the patch below fixes this deadlock; it may or may not be the correct solution depending on the netdev answer, but the deadlock is gone ;) Signed-off-by: Arjan van de Ven Index: linux-2.6.18-rc1/net/ipv4/udp.c =================================================================== --- linux-2.6.18-rc1.orig/net/ipv4/udp.c +++ linux-2.6.18-rc1/net/ipv4/udp.c @@ -725,6 +725,7 @@ out: int udp_ioctl(struct sock *sk, int cmd, unsigned long arg) { + unsigned long flags; switch(cmd) { case SIOCOUTQ: @@ -739,7 +740,7 @@ int udp_ioctl(struct sock *sk, int cmd, unsigned long amount; amount = 0; - spin_lock_bh(&sk->sk_receive_queue.lock); + spin_lock_irqsave(&sk->sk_receive_queue.lock, flags); skb = skb_peek(&sk->sk_receive_queue); if (skb != NULL) { /* @@ -749,7 +750,7 @@ int udp_ioctl(struct sock *sk, int cmd, */ amount = skb->len - sizeof(struct udphdr); } - spin_unlock_bh(&sk->sk_receive_queue.lock); + spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags); return put_user(amount, (int __user *)arg); } ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: auro deadlock 2006-07-07 18:13 ` auro deadlock (was Re: e100 lockdep irq lock inversion.) Arjan van de Ven @ 2006-07-07 19:09 ` David Miller 2006-07-07 19:39 ` Arjan van de Ven 2006-07-10 6:59 ` Ingo Molnar 2006-07-08 3:06 ` auro deadlock (was Re: e100 lockdep irq lock inversion.) Herbert Xu 1 sibling, 2 replies; 6+ messages in thread From: David Miller @ 2006-07-07 19:09 UTC (permalink / raw) To: arjan; +Cc: davej, akpm, mingo, linux-kernel, netdev From: Arjan van de Ven <arjan@infradead.org> Date: Fri, 07 Jul 2006 20:13:09 +0200 > Now a question for netdev: what is the interrupt-or-softirq rules for > the sk_receive_queue.lock? > > Anyway, the patch below fixes this deadlock; it may or may not be the > correct solution depending on the netdev answer, but the deadlock is > gone ;) The lockdep fixes are starting to cause us to go in and start adding hard IRQ protection to many socket layer objects and I want this thinking to end quickly :) The netlink wireless fix is another example of this, but I accept the temporary fix for that one for the time being. To reiterate, nothing socket or SKB level should be taking anything deeper than software IRQ locking. If drivers manage local SKB queues in hard IRQ context, they need to use a seperate lockdep identifier for that queue's lock. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: auro deadlock 2006-07-07 19:09 ` auro deadlock David Miller @ 2006-07-07 19:39 ` Arjan van de Ven 2006-07-10 6:59 ` Ingo Molnar 1 sibling, 0 replies; 6+ messages in thread From: Arjan van de Ven @ 2006-07-07 19:39 UTC (permalink / raw) To: David Miller; +Cc: davej, akpm, mingo, linux-kernel, netdev On Fri, 2006-07-07 at 12:09 -0700, David Miller wrote: > From: Arjan van de Ven <arjan@infradead.org> > Date: Fri, 07 Jul 2006 20:13:09 +0200 > > > Now a question for netdev: what is the interrupt-or-softirq rules for > > the sk_receive_queue.lock? > > > > Anyway, the patch below fixes this deadlock; it may or may not be the > > correct solution depending on the netdev answer, but the deadlock is > > gone ;) > > The lockdep fixes are starting to cause us to go in and start adding > hard IRQ protection to many socket layer objects and I want this > thinking to end quickly :) that's why I asked the question ;) > To reiterate, nothing socket or SKB level should be taking anything > deeper than software IRQ locking. > > If drivers manage local SKB queues in hard IRQ context, they need to > use a seperate lockdep identifier for that queue's lock. I'm not so sure that;s the case here, but.. if you have time today I hope you can take a look at this one with a wider "network view" than I can.. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: auro deadlock 2006-07-07 19:09 ` auro deadlock David Miller 2006-07-07 19:39 ` Arjan van de Ven @ 2006-07-10 6:59 ` Ingo Molnar 1 sibling, 0 replies; 6+ messages in thread From: Ingo Molnar @ 2006-07-10 6:59 UTC (permalink / raw) To: David Miller; +Cc: arjan, davej, akpm, linux-kernel, netdev * David Miller <davem@davemloft.net> wrote: > The lockdep fixes are starting to cause us to go in and start adding > hard IRQ protection to many socket layer objects and I want this > thinking to end quickly :) In earlier lockdep versions we had many such hacks, but in the current upstream kernel we've only got one such case so far: one netlink function, where the alternative was to rewrite softmac. I fixed all the earlier hacks to be proper annotations - and the plan is to keep things clean in the future too :-) Ingo ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: auro deadlock (was Re: e100 lockdep irq lock inversion.) 2006-07-07 18:13 ` auro deadlock (was Re: e100 lockdep irq lock inversion.) Arjan van de Ven 2006-07-07 19:09 ` auro deadlock David Miller @ 2006-07-08 3:06 ` Herbert Xu 1 sibling, 0 replies; 6+ messages in thread From: Herbert Xu @ 2006-07-08 3:06 UTC (permalink / raw) To: Arjan van de Ven; +Cc: davej, akpm, mingo, linux-kernel, netdev Arjan van de Ven <arjan@infradead.org> wrote: > > Act 1 > > Enter the mpi_start_xmit() function, which is airo's xmit function. > This function takes the aux_lock first, with irq's off, then calls > skb_queue_tail(). skb_queue_tail takes the sk_receive_queue.lock (with > irqsave as well). Nope, make that ai->txq. > Act 2 > > Enter the ipcalc program. This program calls an ioctl, which ends up > calling udp_ioctl. udp_ioctl does > spin_lock_bh(&sk->sk_receive_queue.lock); Different queue. So no deadlock. Better luck next time :) -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2006-07-10 7:04 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-07-07 17:19 e100 lockdep irq lock inversion Dave Jones 2006-07-07 18:13 ` auro deadlock (was Re: e100 lockdep irq lock inversion.) Arjan van de Ven 2006-07-07 19:09 ` auro deadlock David Miller 2006-07-07 19:39 ` Arjan van de Ven 2006-07-10 6:59 ` Ingo Molnar 2006-07-08 3:06 ` auro deadlock (was Re: e100 lockdep irq lock inversion.) Herbert Xu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).