* 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage.
@ 2006-06-29 19:01 Miles Lane
2006-06-29 19:26 ` Andrew Morton
2006-06-30 6:50 ` Ingo Molnar
0 siblings, 2 replies; 22+ messages in thread
From: Miles Lane @ 2006-06-29 19:01 UTC (permalink / raw)
To: Andrew Morton, LKML
[ BUG: illegal lock usage! ]
----------------------------
illegal {softirq-on-W} -> {in-softirq-R} usage.
java_vm/4418 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&sk->sk_dst_lock){---?}, at: [<c119d0a9>] sk_dst_check+0x1b/0xe6
{softirq-on-W} state was registered at:
[<c102d1c8>] lock_acquire+0x60/0x80
[<c12012d7>] _write_lock+0x23/0x32
[<c11ddbe7>] inet_bind+0x16c/0x1cc
[<c119ae58>] sys_bind+0x61/0x80
[<c119b465>] sys_socketcall+0x7d/0x186
[<c1002d6d>] sysenter_past_esp+0x56/0x8d
irq event stamp: 11052
hardirqs last enabled at (11052): [<c105d454>] kmem_cache_alloc+0x89/0xa6
hardirqs last disabled at (11051): [<c105d405>] kmem_cache_alloc+0x3a/0xa6
softirqs last enabled at (11040): [<c11a506d>] dev_queue_xmit+0x224/0x24b
softirqs last disabled at (11041): [<c1004a64>] do_softirq+0x58/0xbd
other info that might help us debug this:
1 lock held by java_vm/4418:
#0: (af_family_keys + (sk)->sk_family#4){-+..}, at: [<f93c9281>]
tcp_v6_rcv+0x308/0x7b7 [ipv6]
stack backtrace:
[<c1003502>] show_trace_log_lvl+0x54/0xfd
[<c1003b6a>] show_trace+0xd/0x10
[<c1003c0e>] dump_stack+0x19/0x1b
[<c102b833>] print_usage_bug+0x1cc/0x1d9
[<c102bd34>] mark_lock+0x193/0x360
[<c102c94a>] __lock_acquire+0x3b7/0x970
[<c102d1c8>] lock_acquire+0x60/0x80
[<c12013eb>] _read_lock+0x23/0x32
[<c119d0a9>] sk_dst_check+0x1b/0xe6
[<f93ae479>] ip6_dst_lookup+0x31/0x172 [ipv6]
[<f93c7065>] tcp_v6_send_synack+0x10f/0x238 [ipv6]
[<f93c7dc5>] tcp_v6_conn_request+0x281/0x2c7 [ipv6]
[<c11cca33>] tcp_rcv_state_process+0x5d/0xbde
[<f93c7414>] tcp_v6_do_rcv+0x26d/0x384 [ipv6]
[<f93c96d6>] tcp_v6_rcv+0x75d/0x7b7 [ipv6]
[<f93afadd>] ip6_input+0x201/0x2d1 [ipv6]
[<f93b002d>] ipv6_rcv+0x190/0x1bf [ipv6]
[<c11a3200>] netif_receive_skb+0x2e6/0x37f
[<c11a4b81>] process_backlog+0x80/0x112
[<c11a4c9e>] net_rx_action+0x8b/0x1e8
[<c101a711>] __do_softirq+0x55/0xb0
[<c1004a64>] do_softirq+0x58/0xbd
[<c101a978>] local_bh_enable+0xd0/0x107
[<c11a506d>] dev_queue_xmit+0x224/0x24b
[<c11a9bb8>] neigh_resolve_output+0x1e2/0x20e
[<f93add64>] ip6_output2+0x1de/0x1fc [ipv6]
[<f93ae41e>] ip6_output+0x69c/0x6c6 [ipv6]
[<f93aeb58>] ip6_xmit+0x22b/0x295 [ipv6]
[<f93cc893>] inet6_csk_xmit+0x200/0x20e [ipv6]
[<c11cea04>] tcp_transmit_skb+0x5de/0x60c
[<c11d0cd7>] tcp_connect+0x2bb/0x31a
[<f93c894a>] tcp_v6_connect+0x520/0x655 [ipv6]
[<c11dd889>] inet_stream_connect+0x83/0x20f
[<c119adda>] sys_connect+0x67/0x84
[<c119b474>] sys_socketcall+0x8c/0x186
[<c1002d6d>] sysenter_past_esp+0x56/0x8d
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage. 2006-06-29 19:01 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage Miles Lane @ 2006-06-29 19:26 ` Andrew Morton 2006-06-29 19:42 ` Arjan van de Ven 2006-06-30 6:50 ` Ingo Molnar 1 sibling, 1 reply; 22+ messages in thread From: Andrew Morton @ 2006-06-29 19:26 UTC (permalink / raw) To: Miles Lane; +Cc: linux-kernel, netdev On Thu, 29 Jun 2006 12:01:06 -0700 "Miles Lane" <miles.lane@gmail.com> wrote: > [ BUG: illegal lock usage! ] > ---------------------------- This is claiming that we're taking sk->sk_dst_lock in a deadlockable manner. > illegal {softirq-on-W} -> {in-softirq-R} usage. It found someone doing write_lock(sk_dst_lock) with softirqs enabled, but someone else takes read_lock(dst_lock) inside softirqs. > java_vm/4418 [HC0[0]:SC1[1]:HE1:SE0] takes: > (&sk->sk_dst_lock){---?}, at: [<c119d0a9>] sk_dst_check+0x1b/0xe6 > {softirq-on-W} state was registered at: > [<c102d1c8>] lock_acquire+0x60/0x80 > [<c12012d7>] _write_lock+0x23/0x32 > [<c11ddbe7>] inet_bind+0x16c/0x1cc > [<c119ae58>] sys_bind+0x61/0x80 > [<c119b465>] sys_socketcall+0x7d/0x186 > [<c1002d6d>] sysenter_past_esp+0x56/0x8d inet_bind() ->sk_dst_get ->read_lock(&sk->sk_dst_lock) > irq event stamp: 11052 > hardirqs last enabled at (11052): [<c105d454>] kmem_cache_alloc+0x89/0xa6 > hardirqs last disabled at (11051): [<c105d405>] kmem_cache_alloc+0x3a/0xa6 > softirqs last enabled at (11040): [<c11a506d>] dev_queue_xmit+0x224/0x24b > softirqs last disabled at (11041): [<c1004a64>] do_softirq+0x58/0xbd > > other info that might help us debug this: > 1 lock held by java_vm/4418: > #0: (af_family_keys + (sk)->sk_family#4){-+..}, at: [<f93c9281>] > tcp_v6_rcv+0x308/0x7b7 [ipv6] softirq ->ip6_dst_lookup ->sk_dst_check ->sk_dst_reset ->write_lock(&sk->sk_dst_lock); > stack backtrace: > [<c1003502>] show_trace_log_lvl+0x54/0xfd > [<c1003b6a>] show_trace+0xd/0x10 > [<c1003c0e>] dump_stack+0x19/0x1b > [<c102b833>] print_usage_bug+0x1cc/0x1d9 > [<c102bd34>] mark_lock+0x193/0x360 > [<c102c94a>] __lock_acquire+0x3b7/0x970 > [<c102d1c8>] lock_acquire+0x60/0x80 > [<c12013eb>] _read_lock+0x23/0x32 > [<c119d0a9>] sk_dst_check+0x1b/0xe6 > [<f93ae479>] ip6_dst_lookup+0x31/0x172 [ipv6] > [<f93c7065>] tcp_v6_send_synack+0x10f/0x238 [ipv6] > [<f93c7dc5>] tcp_v6_conn_request+0x281/0x2c7 [ipv6] > [<c11cca33>] tcp_rcv_state_process+0x5d/0xbde > [<f93c7414>] tcp_v6_do_rcv+0x26d/0x384 [ipv6] > [<f93c96d6>] tcp_v6_rcv+0x75d/0x7b7 [ipv6] > [<f93afadd>] ip6_input+0x201/0x2d1 [ipv6] > [<f93b002d>] ipv6_rcv+0x190/0x1bf [ipv6] > [<c11a3200>] netif_receive_skb+0x2e6/0x37f > [<c11a4b81>] process_backlog+0x80/0x112 > [<c11a4c9e>] net_rx_action+0x8b/0x1e8 > [<c101a711>] __do_softirq+0x55/0xb0 > [<c1004a64>] do_softirq+0x58/0xbd > [<c101a978>] local_bh_enable+0xd0/0x107 > [<c11a506d>] dev_queue_xmit+0x224/0x24b > [<c11a9bb8>] neigh_resolve_output+0x1e2/0x20e > [<f93add64>] ip6_output2+0x1de/0x1fc [ipv6] > [<f93ae41e>] ip6_output+0x69c/0x6c6 [ipv6] > [<f93aeb58>] ip6_xmit+0x22b/0x295 [ipv6] > [<f93cc893>] inet6_csk_xmit+0x200/0x20e [ipv6] > [<c11cea04>] tcp_transmit_skb+0x5de/0x60c > [<c11d0cd7>] tcp_connect+0x2bb/0x31a > [<f93c894a>] tcp_v6_connect+0x520/0x655 [ipv6] > [<c11dd889>] inet_stream_connect+0x83/0x20f > [<c119adda>] sys_connect+0x67/0x84 > [<c119b474>] sys_socketcall+0x8c/0x186 > [<c1002d6d>] sysenter_past_esp+0x56/0x8d So the allegation is that if a softirq runs sk_dst_reset() while process-context code is running sk_dst_set(), we'll do write_lock() while holding read_lock(). ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage. 2006-06-29 19:26 ` Andrew Morton @ 2006-06-29 19:42 ` Arjan van de Ven 2006-06-29 19:56 ` Andrew Morton 0 siblings, 1 reply; 22+ messages in thread From: Arjan van de Ven @ 2006-06-29 19:42 UTC (permalink / raw) To: Andrew Morton; +Cc: Miles Lane, linux-kernel, netdev On Thu, 2006-06-29 at 12:26 -0700, Andrew Morton wrote: > On Thu, 29 Jun 2006 12:01:06 -0700 > "Miles Lane" <miles.lane@gmail.com> wrote: > > > [ BUG: illegal lock usage! ] > > ---------------------------- > > This is claiming that we're taking sk->sk_dst_lock in a deadlockable manner. > > > illegal {softirq-on-W} -> {in-softirq-R} usage. > > It found someone doing write_lock(sk_dst_lock) with softirqs enabled, but > someone else takes read_lock(dst_lock) inside softirqs. > > > java_vm/4418 [HC0[0]:SC1[1]:HE1:SE0] takes: > > (&sk->sk_dst_lock){---?}, at: [<c119d0a9>] sk_dst_check+0x1b/0xe6 > > {softirq-on-W} state was registered at: > > [<c102d1c8>] lock_acquire+0x60/0x80 > > [<c12012d7>] _write_lock+0x23/0x32 > > [<c11ddbe7>] inet_bind+0x16c/0x1cc > > [<c119ae58>] sys_bind+0x61/0x80 > > [<c119b465>] sys_socketcall+0x7d/0x186 > > [<c1002d6d>] sysenter_past_esp+0x56/0x8d > > inet_bind() > ->sk_dst_get > ->read_lock(&sk->sk_dst_lock) actually write_lock() not read_lock() > > > irq event stamp: 11052 > > hardirqs last enabled at (11052): [<c105d454>] kmem_cache_alloc+0x89/0xa6 > > hardirqs last disabled at (11051): [<c105d405>] kmem_cache_alloc+0x3a/0xa6 > > softirqs last enabled at (11040): [<c11a506d>] dev_queue_xmit+0x224/0x24b > > softirqs last disabled at (11041): [<c1004a64>] do_softirq+0x58/0xbd > > > > other info that might help us debug this: > > 1 lock held by java_vm/4418: > > #0: (af_family_keys + (sk)->sk_family#4){-+..}, at: [<f93c9281>] > > tcp_v6_rcv+0x308/0x7b7 [ipv6] > > softirq > ->ip6_dst_lookup > ->sk_dst_check > ->sk_dst_reset > ->write_lock(&sk->sk_dst_lock); write_lock.. or read_lock() ? > > > stack backtrace: > > [<c1003502>] show_trace_log_lvl+0x54/0xfd > > [<c1003b6a>] show_trace+0xd/0x10 > > [<c1003c0e>] dump_stack+0x19/0x1b > > [<c102b833>] print_usage_bug+0x1cc/0x1d9 > > [<c102bd34>] mark_lock+0x193/0x360 > > [<c102c94a>] __lock_acquire+0x3b7/0x970 > > [<c102d1c8>] lock_acquire+0x60/0x80 > > [<c12013eb>] _read_lock+0x23/0x32 backtrace says read lock to me ... > > [<c119d0a9>] sk_dst_check+0x1b/0xe6 > > [<f93ae479>] ip6_dst_lookup+0x31/0x172 [ipv6] > > [<f93c7065>] tcp_v6_send_synack+0x10f/0x238 [ipv6] > > [<f93c7dc5>] tcp_v6_conn_request+0x281/0x2c7 [ipv6] > > [<c11cca33>] tcp_rcv_state_process+0x5d/0xbde > So the allegation is that if a softirq runs sk_dst_reset() while > process-context code is running sk_dst_set(), we'll do write_lock() while > holding read_lock(). hmm or... we're doing a write_lock(), then an interrupt can happen that triggers the softirq that triggers the read_lock(), which will deadlock because we interrupted the writer... ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage. 2006-06-29 19:42 ` Arjan van de Ven @ 2006-06-29 19:56 ` Andrew Morton 2006-06-30 2:57 ` Herbert Xu 0 siblings, 1 reply; 22+ messages in thread From: Andrew Morton @ 2006-06-29 19:56 UTC (permalink / raw) To: Arjan van de Ven; +Cc: miles.lane, linux-kernel, netdev On Thu, 29 Jun 2006 21:42:34 +0200 Arjan van de Ven <arjan@infradead.org> wrote: > On Thu, 2006-06-29 at 12:26 -0700, Andrew Morton wrote: > > On Thu, 29 Jun 2006 12:01:06 -0700 > > "Miles Lane" <miles.lane@gmail.com> wrote: > > > > > [ BUG: illegal lock usage! ] > > > ---------------------------- > > > > This is claiming that we're taking sk->sk_dst_lock in a deadlockable manner. > > > > > illegal {softirq-on-W} -> {in-softirq-R} usage. > > > > It found someone doing write_lock(sk_dst_lock) with softirqs enabled, but > > someone else takes read_lock(dst_lock) inside softirqs. > > > > > java_vm/4418 [HC0[0]:SC1[1]:HE1:SE0] takes: > > > (&sk->sk_dst_lock){---?}, at: [<c119d0a9>] sk_dst_check+0x1b/0xe6 > > > {softirq-on-W} state was registered at: > > > [<c102d1c8>] lock_acquire+0x60/0x80 > > > [<c12012d7>] _write_lock+0x23/0x32 > > > [<c11ddbe7>] inet_bind+0x16c/0x1cc > > > [<c119ae58>] sys_bind+0x61/0x80 > > > [<c119b465>] sys_socketcall+0x7d/0x186 > > > [<c1002d6d>] sysenter_past_esp+0x56/0x8d > > > > inet_bind() > > ->sk_dst_get > > ->read_lock(&sk->sk_dst_lock) > > actually write_lock() not read_lock() > static inline struct dst_entry * sk_dst_get(struct sock *sk) { struct dst_entry *dst; read_lock(&sk->sk_dst_lock); > > > > > irq event stamp: 11052 > > > hardirqs last enabled at (11052): [<c105d454>] kmem_cache_alloc+0x89/0xa6 > > > hardirqs last disabled at (11051): [<c105d405>] kmem_cache_alloc+0x3a/0xa6 > > > softirqs last enabled at (11040): [<c11a506d>] dev_queue_xmit+0x224/0x24b > > > softirqs last disabled at (11041): [<c1004a64>] do_softirq+0x58/0xbd > > > > > > other info that might help us debug this: > > > 1 lock held by java_vm/4418: > > > #0: (af_family_keys + (sk)->sk_family#4){-+..}, at: [<f93c9281>] > > > tcp_v6_rcv+0x308/0x7b7 [ipv6] > > > > softirq > > ->ip6_dst_lookup > > ->sk_dst_check > > ->sk_dst_reset > > ->write_lock(&sk->sk_dst_lock); > > write_lock.. or read_lock() ? static inline void sk_dst_reset(struct sock *sk) { write_lock(&sk->sk_dst_lock); > > > > > stack backtrace: > > > [<c1003502>] show_trace_log_lvl+0x54/0xfd > > > [<c1003b6a>] show_trace+0xd/0x10 > > > [<c1003c0e>] dump_stack+0x19/0x1b > > > [<c102b833>] print_usage_bug+0x1cc/0x1d9 > > > [<c102bd34>] mark_lock+0x193/0x360 > > > [<c102c94a>] __lock_acquire+0x3b7/0x970 > > > [<c102d1c8>] lock_acquire+0x60/0x80 > > > [<c12013eb>] _read_lock+0x23/0x32 > > backtrace says read lock to me ... > > > [<c119d0a9>] sk_dst_check+0x1b/0xe6 > > > [<f93ae479>] ip6_dst_lookup+0x31/0x172 [ipv6] > > > [<f93c7065>] tcp_v6_send_synack+0x10f/0x238 [ipv6] > > > [<f93c7dc5>] tcp_v6_conn_request+0x281/0x2c7 [ipv6] > > > [<c11cca33>] tcp_rcv_state_process+0x5d/0xbde > > > > So the allegation is that if a softirq runs sk_dst_reset() while > > process-context code is running sk_dst_set(), we'll do write_lock() while > > holding read_lock(). > > hmm or... > > we're doing a write_lock(), then an interrupt can happen that triggers > the softirq that triggers the read_lock(), which will deadlock because > we interrupted the writer... > either way ain't good ;) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage. 2006-06-29 19:56 ` Andrew Morton @ 2006-06-30 2:57 ` Herbert Xu 0 siblings, 0 replies; 22+ messages in thread From: Herbert Xu @ 2006-06-30 2:57 UTC (permalink / raw) To: Andrew Morton; +Cc: arjan, miles.lane, linux-kernel, netdev, davem, yoshfuji Andrew Morton <akpm@osdl.org> wrote: > >> > inet_bind() >> > ->sk_dst_get >> > ->read_lock(&sk->sk_dst_lock) We are still holding the sock lock when doing sk_dst_get. >> > > 1 lock held by java_vm/4418: >> > > #0: (af_family_keys + (sk)->sk_family#4){-+..}, at: [<f93c9281>] >> > > tcp_v6_rcv+0x308/0x7b7 [ipv6] >> > >> > softirq >> > ->ip6_dst_lookup >> > ->sk_dst_check >> > ->sk_dst_reset >> > ->write_lock(&sk->sk_dst_lock); The sock lock prevents this path from being entered. Instead the received TCP packet is queued and replayed when the sock lock is released. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage. 2006-06-29 19:01 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage Miles Lane 2006-06-29 19:26 ` Andrew Morton @ 2006-06-30 6:50 ` Ingo Molnar 2006-06-30 6:59 ` Ingo Molnar 2006-06-30 7:22 ` [patch] lockdep, annotate slocks: turn lockdep off for them Ingo Molnar 1 sibling, 2 replies; 22+ messages in thread From: Ingo Molnar @ 2006-06-30 6:50 UTC (permalink / raw) To: Miles Lane; +Cc: Andrew Morton, LKML * Miles Lane <miles.lane@gmail.com> wrote: > [ BUG: illegal lock usage! ] > ---------------------------- > illegal {softirq-on-W} -> {in-softirq-R} usage. > java_vm/4418 [HC0[0]:SC1[1]:HE1:SE0] takes: > (&sk->sk_dst_lock){---?}, at: [<c119d0a9>] sk_dst_check+0x1b/0xe6 > {softirq-on-W} state was registered at: > [<c102d1c8>] lock_acquire+0x60/0x80 > [<c12012d7>] _write_lock+0x23/0x32 > [<c11ddbe7>] inet_bind+0x16c/0x1cc > [<c119ae58>] sys_bind+0x61/0x80 > [<c119b465>] sys_socketcall+0x7d/0x186 > [<c1002d6d>] sysenter_past_esp+0x56/0x8d > irq event stamp: 11052 > hardirqs last enabled at (11052): [<c105d454>] kmem_cache_alloc+0x89/0xa6 > hardirqs last disabled at (11051): [<c105d405>] kmem_cache_alloc+0x3a/0xa6 > softirqs last enabled at (11040): [<c11a506d>] dev_queue_xmit+0x224/0x24b > softirqs last disabled at (11041): [<c1004a64>] do_softirq+0x58/0xbd > > other info that might help us debug this: > 1 lock held by java_vm/4418: > #0: (af_family_keys + (sk)->sk_family#4){-+..}, at: [<f93c9281>] > tcp_v6_rcv+0x308/0x7b7 [ipv6] > > stack backtrace: > [<c1003502>] show_trace_log_lvl+0x54/0xfd > [<c1003b6a>] show_trace+0xd/0x10 > [<c1003c0e>] dump_stack+0x19/0x1b > [<c102b833>] print_usage_bug+0x1cc/0x1d9 > [<c102bd34>] mark_lock+0x193/0x360 > [<c102c94a>] __lock_acquire+0x3b7/0x970 > [<c102d1c8>] lock_acquire+0x60/0x80 > [<c12013eb>] _read_lock+0x23/0x32 > [<c119d0a9>] sk_dst_check+0x1b/0xe6 > [<f93ae479>] ip6_dst_lookup+0x31/0x172 [ipv6] > [<f93c7065>] tcp_v6_send_synack+0x10f/0x238 [ipv6] > [<f93c7dc5>] tcp_v6_conn_request+0x281/0x2c7 [ipv6] > [<c11cca33>] tcp_rcv_state_process+0x5d/0xbde > [<f93c7414>] tcp_v6_do_rcv+0x26d/0x384 [ipv6] > [<f93c96d6>] tcp_v6_rcv+0x75d/0x7b7 [ipv6] > [<f93afadd>] ip6_input+0x201/0x2d1 [ipv6] > [<f93b002d>] ipv6_rcv+0x190/0x1bf [ipv6] > [<c11a3200>] netif_receive_skb+0x2e6/0x37f > [<c11a4b81>] process_backlog+0x80/0x112 > [<c11a4c9e>] net_rx_action+0x8b/0x1e8 > [<c101a711>] __do_softirq+0x55/0xb0 > [<c1004a64>] do_softirq+0x58/0xbd > [<c101a978>] local_bh_enable+0xd0/0x107 > [<c11a506d>] dev_queue_xmit+0x224/0x24b > [<c11a9bb8>] neigh_resolve_output+0x1e2/0x20e > [<f93add64>] ip6_output2+0x1de/0x1fc [ipv6] > [<f93ae41e>] ip6_output+0x69c/0x6c6 [ipv6] > [<f93aeb58>] ip6_xmit+0x22b/0x295 [ipv6] > [<f93cc893>] inet6_csk_xmit+0x200/0x20e [ipv6] > [<c11cea04>] tcp_transmit_skb+0x5de/0x60c > [<c11d0cd7>] tcp_connect+0x2bb/0x31a > [<f93c894a>] tcp_v6_connect+0x520/0x655 [ipv6] > [<c11dd889>] inet_stream_connect+0x83/0x20f > [<c119adda>] sys_connect+0x67/0x84 > [<c119b474>] sys_socketcall+0x8c/0x186 > [<c1002d6d>] sysenter_past_esp+0x56/0x8d hm, is this all the log output you got? In particular the printouts of where various lock-class state changing events happened that are important too. Could you send me the whole dmesg? Ingo ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage. 2006-06-30 6:50 ` Ingo Molnar @ 2006-06-30 6:59 ` Ingo Molnar 2006-06-30 7:22 ` [patch] lockdep, annotate slocks: turn lockdep off for them Ingo Molnar 1 sibling, 0 replies; 22+ messages in thread From: Ingo Molnar @ 2006-06-30 6:59 UTC (permalink / raw) To: Miles Lane; +Cc: Andrew Morton, LKML * Ingo Molnar <mingo@elte.hu> wrote: > hm, is this all the log output you got? In particular the printouts of > where various lock-class state changing events happened that are > important too. Could you send me the whole dmesg? sorry, i take this back - the messages are complete. Ingo ^ permalink raw reply [flat|nested] 22+ messages in thread
* [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 6:50 ` Ingo Molnar 2006-06-30 6:59 ` Ingo Molnar @ 2006-06-30 7:22 ` Ingo Molnar 2006-06-30 9:18 ` Ingo Molnar 1 sibling, 1 reply; 22+ messages in thread From: Ingo Molnar @ 2006-06-30 7:22 UTC (permalink / raw) To: Miles Lane; +Cc: Andrew Morton, LKML, Herbert Xu, Arjan van de Ven Miles, does the patch below make the message go away? Ingo ---------------- Subject: lockdep, annotate slocks: turn lockdep off for them From: Ingo Molnar <mingo@elte.hu> temporary solution to turn off slock related false positives: the slock is pretty much the only lock type in the kernel that is half spinlock, half waitqueue. "Process level" and "softirq level" uses of slock are excluded - albeit the spinlock itself is not permanently held in process context. The right solution will be to annotate slock uses with acquire()/release(). (i.e. to treat sock_owned_by_user() flagged areas as an exclusion area too) (this temporary solution is not as bad as it might sound, because it does not eliminate the various ->sk_backlog_rcv() related dependencies from the validator's dependency graph - what it does is that it doesnt record them relative to slock. [the callbacks will still be executed and covered when the backlog is processed.]) Signed-off-by: Ingo Molnar <mingo@elte.hu> --- net/core/sock.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) Index: linux/net/core/sock.c =================================================================== --- linux.orig/net/core/sock.c +++ linux/net/core/sock.c @@ -250,9 +250,17 @@ int sk_receive_skb(struct sock *sk, stru skb->dev = NULL; bh_lock_sock(sk); - if (!sock_owned_by_user(sk)) + if (!sock_owned_by_user(sk)) { + /* + * FIXME: teach the validator about slocks. + * + * For now we dont record dependencies in + * this codepath. + */ + lockdep_off(); rc = sk->sk_backlog_rcv(sk, skb); - else + lockdep_on(); + } else sk_add_backlog(sk, skb); bh_unlock_sock(sk); out: ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 7:22 ` [patch] lockdep, annotate slocks: turn lockdep off for them Ingo Molnar @ 2006-06-30 9:18 ` Ingo Molnar 2006-06-30 11:17 ` Herbert Xu 0 siblings, 1 reply; 22+ messages in thread From: Ingo Molnar @ 2006-06-30 9:18 UTC (permalink / raw) To: Miles Lane; +Cc: Andrew Morton, LKML, Herbert Xu, Arjan van de Ven [-- Attachment #1: Type: text/plain, Size: 1894 bytes --] * Ingo Molnar <mingo@elte.hu> wrote: > temporary solution to turn off slock related false positives: the > slock is pretty much the only lock type in the kernel that is half > spinlock, half waitqueue. "Process level" and "softirq level" uses of > slock are excluded - albeit the spinlock itself is not permanently > held in process context. > > The right solution will be to annotate slock uses with > acquire()/release(). (i.e. to treat sock_owned_by_user() flagged areas > as an exclusion area too) i've got the right solution coded up too - it was a fairly straightforward process of teaching a new lock type (sk_lock) to the validator. Miles, do the two attached patches (ontop of -mm4) fix your problem too? [you should unapply the previous lockdep_off()/on() patch i sent) Andrew, i'd suggest to not apply the two attached patches yet - while they work fine for me we first should see whether they fix Miles' problem too. Herbert, do the acquire/release semantics as expressed in the lockdep-annotate-slock.patch match sk_lock semantics? the patch also makes the slock names more readable: neptune:/home/mingo> cat /proc/lockdep | grep AF c07f7088 OPS: 4125 FD: 1 BD: 2 -...: slock-AF_UNIX c07f7188 OPS: 1375 FD: 2 BD: 1 --..: sk_lock-AF_UNIX c07f7100 OPS: 252 FD: 1 BD: 2 -...: slock-AF_NETLINK c07f7200 OPS: 84 FD: 2 BD: 1 --..: sk_lock-AF_NETLINK c07f7090 OPS: 2641 FD: 20 BD: 4 -+..: slock-AF_INET c07f7190 OPS: 851 FD: 64 BD: 1 --..: sk_lock-AF_INET c07f7108 OPS: 27 FD: 1 BD: 2 -...: slock-AF_PACKET c07f7208 OPS: 9 FD: 4 BD: 1 --..: sk_lock-AF_PACKET c07f7091 OPS: 285 FD: 44 BD: 2 -+..: slock-AF_INET/1 (that should make it easier to tell which sk_lock class is mentioned in a validator message, instead of the current slock#2/slock#3 enumeration.) Ingo [-- Attachment #2: lockdep-core-add-set-class-and-name.patch --] [-- Type: text/plain, Size: 1265 bytes --] Subject: lockdep: core, add set_class_and_name() From: Ingo Molnar <mingo@elte.hu> add set_class_and_name() API, to allow lock initialization places to beautify the naming of their lock classes. Signed-off-by: Ingo Molnar <mingo@elte.hu> --- include/linux/lockdep.h | 4 ++++ 1 file changed, 4 insertions(+) Index: linux/include/linux/lockdep.h =================================================================== --- linux.orig/include/linux/lockdep.h +++ linux/include/linux/lockdep.h @@ -212,6 +212,8 @@ extern void lockdep_init_map(struct lock */ #define lockdep_set_class(lock, key) \ lockdep_init_map(&(lock)->dep_map, #key, key) +#define lockdep_set_class_and_name(lock, key, name) \ + lockdep_init_map(&(lock)->dep_map, name, key) /* * Acquire a lock. @@ -257,6 +259,8 @@ static inline int lockdep_internal(void) # define lockdep_info() do { } while (0) # define lockdep_init_map(lock, name, key) do { } while (0) # define lockdep_set_class(lock, key) do { (void)(key); } while (0) +# define lockdep_set_class_and_name(lock, key, name) \ + do { (void)(key); (void)(name); } while (0) # define INIT_LOCKDEP # define lockdep_reset() do { debug_locks = 1; } while (0) # define lockdep_free_key_range(start, size) do { } while (0) [-- Attachment #3: lockdep-annotate-slock.patch --] [-- Type: text/plain, Size: 6614 bytes --] Subject: lockdep, annotate sk_locks From: Ingo Molnar <mingo@elte.hu> Teach sk_lock semantics to the lock validator. In the softirq path the slock has mutex_trylock()+mutex_unlock() semantics, in the process context sock_lock() case it has mutex_lock()/mutex_unlock() semantics. Thus we treat sock_owned_by_user() flagged areas as an exclusion area too, not just those areas covered by a held sk_lock.slock. Effect on non-lockdep kernels: minimal, sk_lock_sock_init() has been turned into an inline function. Signed-off-by: Ingo Molnar <mingo@elte.hu> --- include/net/sock.h | 20 +++++------ net/core/sock.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 94 insertions(+), 19 deletions(-) Index: linux/include/net/sock.h =================================================================== --- linux.orig/include/net/sock.h +++ linux/include/net/sock.h @@ -44,6 +44,7 @@ #include <linux/timer.h> #include <linux/cache.h> #include <linux/module.h> +#include <linux/lockdep.h> #include <linux/netdevice.h> #include <linux/skbuff.h> /* struct sk_buff */ #include <linux/security.h> @@ -78,18 +79,17 @@ typedef struct { spinlock_t slock; struct sock_iocb *owner; wait_queue_head_t wq; + /* + * We express the mutex-alike socket_lock semantics + * to the lock validator by explicitly managing + * the slock as a lock variant (in addition to + * the slock itself): + */ +#ifdef CONFIG_DEBUG_LOCK_ALLOC + struct lockdep_map dep_map; +#endif } socket_lock_t; -extern struct lock_class_key af_family_keys[AF_MAX]; - -#define sock_lock_init(__sk) \ -do { spin_lock_init(&((__sk)->sk_lock.slock)); \ - lockdep_set_class(&(__sk)->sk_lock.slock, \ - af_family_keys + (__sk)->sk_family); \ - (__sk)->sk_lock.owner = NULL; \ - init_waitqueue_head(&((__sk)->sk_lock.wq)); \ -} while(0) - struct sock; struct proto; Index: linux/net/core/sock.c =================================================================== --- linux.orig/net/core/sock.c +++ linux/net/core/sock.c @@ -134,7 +134,40 @@ * Each address family might have different locking rules, so we have * one slock key per address family: */ -struct lock_class_key af_family_keys[AF_MAX]; +static struct lock_class_key af_family_keys[AF_MAX]; +static struct lock_class_key af_family_slock_keys[AF_MAX]; + +#ifdef CONFIG_LOCKDEP +/* + * Make lock validator output more readable: + */ +static const char *af_family_key_strings[AF_MAX+1] = { + "sk_lock-AF_UNSPEC", "sk_lock-AF_UNIX" , "sk_lock-AF_INET" , + "sk_lock-AF_AX25" , "sk_lock-AF_IPX" , "sk_lock-AF_APPLETALK", + "sk_lock-AF_NETROM", "sk_lock-AF_BRIDGE" , "sk_lock-AF_ATMPVC" , + "sk_lock-AF_X25" , "sk_lock-AF_INET6" , "sk_lock-AF_ROSE" , + "sk_lock-AF_DECnet", "sk_lock-AF_NETBEUI" , "sk_lock-AF_SECURITY" , + "sk_lock-AF_KEY" , "sk_lock-AF_NETLINK" , "sk_lock-AF_PACKET" , + "sk_lock-AF_ASH" , "sk_lock-AF_ECONET" , "sk_lock-AF_ATMSVC" , + "sk_lock-21" , "sk_lock-AF_SNA" , "sk_lock-AF_IRDA" , + "sk_lock-AF_PPPOX" , "sk_lock-AF_WANPIPE" , "sk_lock-AF_LLC" , + "sk_lock-27" , "sk_lock-28" , "sk_lock-29" , + "sk_lock-AF_TIPC" , "sk_lock-AF_BLUETOOTH", "sk_lock-AF_MAX" +}; +static const char *af_family_slock_key_strings[AF_MAX+1] = { + "slock-AF_UNSPEC", "slock-AF_UNIX" , "slock-AF_INET" , + "slock-AF_AX25" , "slock-AF_IPX" , "slock-AF_APPLETALK", + "slock-AF_NETROM", "slock-AF_BRIDGE" , "slock-AF_ATMPVC" , + "slock-AF_X25" , "slock-AF_INET6" , "slock-AF_ROSE" , + "slock-AF_DECnet", "slock-AF_NETBEUI" , "slock-AF_SECURITY" , + "slock-AF_KEY" , "slock-AF_NETLINK" , "slock-AF_PACKET" , + "slock-AF_ASH" , "slock-AF_ECONET" , "slock-AF_ATMSVC" , + "slock-21" , "slock-AF_SNA" , "slock-AF_IRDA" , + "slock-AF_PPPOX" , "slock-AF_WANPIPE" , "slock-AF_LLC" , + "slock-27" , "slock-28" , "slock-29" , + "slock-AF_TIPC" , "slock-AF_BLUETOOTH", "slock-AF_MAX" +}; +#endif /* * sk_callback_lock locking rules are per-address-family, @@ -250,9 +283,18 @@ int sk_receive_skb(struct sock *sk, stru skb->dev = NULL; bh_lock_sock(sk); - if (!sock_owned_by_user(sk)) + if (!sock_owned_by_user(sk)) { + /* + * trylock + unlock semantics: + */ + spin_release(&sk->sk_lock.slock.dep_map, 1, _RET_IP_); + mutex_acquire(&sk->sk_lock.dep_map, 0, 1, _RET_IP_); + rc = sk->sk_backlog_rcv(sk, skb); - else + + mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_); + spin_acquire(&sk->sk_lock.slock.dep_map, 0, 0, _RET_IP_); + } else sk_add_backlog(sk, skb); bh_unlock_sock(sk); out: @@ -762,6 +804,30 @@ lenout: return 0; } +/* + * Initialize an sk_lock. + * + * (We also register the sk_lock with the lock validator.) + */ +static void inline sock_lock_init(struct sock *sk) +{ + spin_lock_init(&sk->sk_lock.slock); + lockdep_set_class_and_name(&sk->sk_lock.slock, + af_family_slock_keys + sk->sk_family, + af_family_slock_key_strings[sk->sk_family]); + sk->sk_lock.owner = NULL; + init_waitqueue_head(&sk->sk_lock.wq); +#ifdef CONFIG_DEBUG_LOCK_ALLOC + /* + * Make sure we are not reinitializing a held lock: + */ + debug_check_no_locks_freed((void *)&sk->sk_lock, sizeof(sk->sk_lock)); + lockdep_init_map(&sk->sk_lock.dep_map, + af_family_key_strings[sk->sk_family], + af_family_keys + sk->sk_family); +#endif +} + /** * sk_alloc - All socket objects are allocated here * @family: protocol family @@ -1466,24 +1532,33 @@ void sock_init_data(struct socket *sock, void fastcall lock_sock(struct sock *sk) { might_sleep(); - spin_lock_bh(&(sk->sk_lock.slock)); + spin_lock_bh(&sk->sk_lock.slock); if (sk->sk_lock.owner) __lock_sock(sk); sk->sk_lock.owner = (void *)1; - spin_unlock_bh(&(sk->sk_lock.slock)); + spin_unlock(&sk->sk_lock.slock); + /* + * The sk_lock has mutex_lock() semantics here: + */ + mutex_acquire(&sk->sk_lock.dep_map, 0, 0, _RET_IP_); + local_bh_enable(); } EXPORT_SYMBOL(lock_sock); void fastcall release_sock(struct sock *sk) { - spin_lock_bh(&(sk->sk_lock.slock)); + spin_lock_bh(&sk->sk_lock.slock); if (sk->sk_backlog.tail) __release_sock(sk); + /* + * The sk_lock has mutex_unlock() semantics: + */ + mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_); sk->sk_lock.owner = NULL; - if (waitqueue_active(&(sk->sk_lock.wq))) - wake_up(&(sk->sk_lock.wq)); - spin_unlock_bh(&(sk->sk_lock.slock)); + if (waitqueue_active(&sk->sk_lock.wq)) + wake_up(&sk->sk_lock.wq); + spin_unlock_bh(&sk->sk_lock.slock); } EXPORT_SYMBOL(release_sock); ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 9:18 ` Ingo Molnar @ 2006-06-30 11:17 ` Herbert Xu 2006-06-30 11:37 ` Ingo Molnar 0 siblings, 1 reply; 22+ messages in thread From: Herbert Xu @ 2006-06-30 11:17 UTC (permalink / raw) To: Ingo Molnar; +Cc: Miles Lane, Andrew Morton, LKML, Arjan van de Ven On Fri, Jun 30, 2006 at 11:18:50AM +0200, Ingo Molnar wrote: > > Herbert, do the acquire/release semantics as expressed in the > lockdep-annotate-slock.patch match sk_lock semantics? I think it should be fine. > @@ -250,9 +283,18 @@ int sk_receive_skb(struct sock *sk, stru > skb->dev = NULL; > > bh_lock_sock(sk); > - if (!sock_owned_by_user(sk)) > + if (!sock_owned_by_user(sk)) { > + /* > + * trylock + unlock semantics: > + */ > + spin_release(&sk->sk_lock.slock.dep_map, 1, _RET_IP_); > + mutex_acquire(&sk->sk_lock.dep_map, 0, 1, _RET_IP_); Although it would seem that keeping the spin lock would fit the actual semantics better. I suppose there must be a technical reason why this wouldn't work. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 11:17 ` Herbert Xu @ 2006-06-30 11:37 ` Ingo Molnar 2006-06-30 11:46 ` Herbert Xu 2006-06-30 20:21 ` Miles Lane 0 siblings, 2 replies; 22+ messages in thread From: Ingo Molnar @ 2006-06-30 11:37 UTC (permalink / raw) To: Herbert Xu; +Cc: Miles Lane, Andrew Morton, LKML, Arjan van de Ven * Herbert Xu <herbert@gondor.apana.org.au> wrote: > > bh_lock_sock(sk); > > - if (!sock_owned_by_user(sk)) > > + if (!sock_owned_by_user(sk)) { > > + /* > > + * trylock + unlock semantics: > > + */ > > + spin_release(&sk->sk_lock.slock.dep_map, 1, _RET_IP_); > > + mutex_acquire(&sk->sk_lock.dep_map, 0, 1, _RET_IP_); > > Although it would seem that keeping the spin lock would fit the actual > semantics better. I suppose there must be a technical reason why this > wouldn't work. good point. The basic issue is the 'virtual lock inversion' that occurs in the lock vs. release paths. [between taking the slock and taking the new sk_lock type] The situation is like this: we construct 'complex' lock types [mutex, rwsem, sk_lock] out of 'primitive' lock types [spinlock, rwlock]. Both the complex type and the primite types exist separately, and might have lock-validator acquire/release operations. These locks can interact and if we do the complex lock acquire/release while holding the primitive lock, the validator sees inverse ordering between them. For the mutex code i solved the inversion problem by using a raw_spinlock for the primitive type (which has no lockdep operations), hence the complex lock type. But in this particular sk_lock case we can do it even more cleanly i think and can preserve the lockdep awareness of the primitive type too: by releasing the complex lock before taking the primitive lock in the release_sock() unlock path. The updated patch below does this - and thus i was able to remove the dropping of the primitive spinlock type. it is not a problem that the release of the complex lock type does not happen inside the critical section: from the point where we release the complex lock-type _this_ context cannot take any other locks, so there are no dependencies missed. As you can see, the lock validator can easily cover completely new lock types like sk_lock too, as long as the new lock type has some minimalistic "works like a lock" properties. (such as owner-does-unlock) later on i'll try the same cleanup for the mutex code too - it should be possible. (that way the implementation of complex lock types can be lock-validator checked too) Ingo ---------------> Subject: lockdep, annotate sk_locks From: Ingo Molnar <mingo@elte.hu> Teach sk_lock semantics to the lock validator. In the softirq path the slock has mutex_trylock()+mutex_unlock() semantics, in the process context sock_lock() case it has mutex_lock()/mutex_unlock() semantics. Thus we treat sock_owned_by_user() flagged areas as an exclusion area too, not just those areas covered by a held sk_lock.slock. Effect on non-lockdep kernels: minimal, sk_lock_sock_init() has been turned into an inline function. Signed-off-by: Ingo Molnar <mingo@elte.hu> --- include/net/sock.h | 20 +++++------ net/core/sock.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 93 insertions(+), 19 deletions(-) Index: linux/include/net/sock.h =================================================================== --- linux.orig/include/net/sock.h +++ linux/include/net/sock.h @@ -44,6 +44,7 @@ #include <linux/timer.h> #include <linux/cache.h> #include <linux/module.h> +#include <linux/lockdep.h> #include <linux/netdevice.h> #include <linux/skbuff.h> /* struct sk_buff */ #include <linux/security.h> @@ -78,18 +79,17 @@ typedef struct { spinlock_t slock; struct sock_iocb *owner; wait_queue_head_t wq; + /* + * We express the mutex-alike socket_lock semantics + * to the lock validator by explicitly managing + * the slock as a lock variant (in addition to + * the slock itself): + */ +#ifdef CONFIG_DEBUG_LOCK_ALLOC + struct lockdep_map dep_map; +#endif } socket_lock_t; -extern struct lock_class_key af_family_keys[AF_MAX]; - -#define sock_lock_init(__sk) \ -do { spin_lock_init(&((__sk)->sk_lock.slock)); \ - lockdep_set_class(&(__sk)->sk_lock.slock, \ - af_family_keys + (__sk)->sk_family); \ - (__sk)->sk_lock.owner = NULL; \ - init_waitqueue_head(&((__sk)->sk_lock.wq)); \ -} while(0) - struct sock; struct proto; Index: linux/net/core/sock.c =================================================================== --- linux.orig/net/core/sock.c +++ linux/net/core/sock.c @@ -134,7 +134,40 @@ * Each address family might have different locking rules, so we have * one slock key per address family: */ -struct lock_class_key af_family_keys[AF_MAX]; +static struct lock_class_key af_family_keys[AF_MAX]; +static struct lock_class_key af_family_slock_keys[AF_MAX]; + +#ifdef CONFIG_LOCKDEP +/* + * Make lock validator output more readable: + */ +static const char *af_family_key_strings[AF_MAX+1] = { + "sk_lock-AF_UNSPEC", "sk_lock-AF_UNIX" , "sk_lock-AF_INET" , + "sk_lock-AF_AX25" , "sk_lock-AF_IPX" , "sk_lock-AF_APPLETALK", + "sk_lock-AF_NETROM", "sk_lock-AF_BRIDGE" , "sk_lock-AF_ATMPVC" , + "sk_lock-AF_X25" , "sk_lock-AF_INET6" , "sk_lock-AF_ROSE" , + "sk_lock-AF_DECnet", "sk_lock-AF_NETBEUI" , "sk_lock-AF_SECURITY" , + "sk_lock-AF_KEY" , "sk_lock-AF_NETLINK" , "sk_lock-AF_PACKET" , + "sk_lock-AF_ASH" , "sk_lock-AF_ECONET" , "sk_lock-AF_ATMSVC" , + "sk_lock-21" , "sk_lock-AF_SNA" , "sk_lock-AF_IRDA" , + "sk_lock-AF_PPPOX" , "sk_lock-AF_WANPIPE" , "sk_lock-AF_LLC" , + "sk_lock-27" , "sk_lock-28" , "sk_lock-29" , + "sk_lock-AF_TIPC" , "sk_lock-AF_BLUETOOTH", "sk_lock-AF_MAX" +}; +static const char *af_family_slock_key_strings[AF_MAX+1] = { + "slock-AF_UNSPEC", "slock-AF_UNIX" , "slock-AF_INET" , + "slock-AF_AX25" , "slock-AF_IPX" , "slock-AF_APPLETALK", + "slock-AF_NETROM", "slock-AF_BRIDGE" , "slock-AF_ATMPVC" , + "slock-AF_X25" , "slock-AF_INET6" , "slock-AF_ROSE" , + "slock-AF_DECnet", "slock-AF_NETBEUI" , "slock-AF_SECURITY" , + "slock-AF_KEY" , "slock-AF_NETLINK" , "slock-AF_PACKET" , + "slock-AF_ASH" , "slock-AF_ECONET" , "slock-AF_ATMSVC" , + "slock-21" , "slock-AF_SNA" , "slock-AF_IRDA" , + "slock-AF_PPPOX" , "slock-AF_WANPIPE" , "slock-AF_LLC" , + "slock-27" , "slock-28" , "slock-29" , + "slock-AF_TIPC" , "slock-AF_BLUETOOTH", "slock-AF_MAX" +}; +#endif /* * sk_callback_lock locking rules are per-address-family, @@ -250,9 +283,16 @@ int sk_receive_skb(struct sock *sk, stru skb->dev = NULL; bh_lock_sock(sk); - if (!sock_owned_by_user(sk)) + if (!sock_owned_by_user(sk)) { + /* + * trylock + unlock semantics: + */ + mutex_acquire(&sk->sk_lock.dep_map, 0, 1, _RET_IP_); + rc = sk->sk_backlog_rcv(sk, skb); - else + + mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_); + } else sk_add_backlog(sk, skb); bh_unlock_sock(sk); out: @@ -762,6 +802,30 @@ lenout: return 0; } +/* + * Initialize an sk_lock. + * + * (We also register the sk_lock with the lock validator.) + */ +static void inline sock_lock_init(struct sock *sk) +{ + spin_lock_init(&sk->sk_lock.slock); + lockdep_set_class_and_name(&sk->sk_lock.slock, + af_family_slock_keys + sk->sk_family, + af_family_slock_key_strings[sk->sk_family]); + sk->sk_lock.owner = NULL; + init_waitqueue_head(&sk->sk_lock.wq); +#ifdef CONFIG_DEBUG_LOCK_ALLOC + /* + * Make sure we are not reinitializing a held lock: + */ + debug_check_no_locks_freed((void *)&sk->sk_lock, sizeof(sk->sk_lock)); + lockdep_init_map(&sk->sk_lock.dep_map, + af_family_key_strings[sk->sk_family], + af_family_keys + sk->sk_family); +#endif +} + /** * sk_alloc - All socket objects are allocated here * @family: protocol family @@ -1466,24 +1530,34 @@ void sock_init_data(struct socket *sock, void fastcall lock_sock(struct sock *sk) { might_sleep(); - spin_lock_bh(&(sk->sk_lock.slock)); + spin_lock_bh(&sk->sk_lock.slock); if (sk->sk_lock.owner) __lock_sock(sk); sk->sk_lock.owner = (void *)1; - spin_unlock_bh(&(sk->sk_lock.slock)); + spin_unlock(&sk->sk_lock.slock); + /* + * The sk_lock has mutex_lock() semantics here: + */ + mutex_acquire(&sk->sk_lock.dep_map, 0, 0, _RET_IP_); + local_bh_enable(); } EXPORT_SYMBOL(lock_sock); void fastcall release_sock(struct sock *sk) { - spin_lock_bh(&(sk->sk_lock.slock)); + /* + * The sk_lock has mutex_unlock() semantics: + */ + mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_); + + spin_lock_bh(&sk->sk_lock.slock); if (sk->sk_backlog.tail) __release_sock(sk); sk->sk_lock.owner = NULL; - if (waitqueue_active(&(sk->sk_lock.wq))) - wake_up(&(sk->sk_lock.wq)); - spin_unlock_bh(&(sk->sk_lock.slock)); + if (waitqueue_active(&sk->sk_lock.wq)) + wake_up(&sk->sk_lock.wq); + spin_unlock_bh(&sk->sk_lock.slock); } EXPORT_SYMBOL(release_sock); ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 11:37 ` Ingo Molnar @ 2006-06-30 11:46 ` Herbert Xu 2006-06-30 20:21 ` Miles Lane 1 sibling, 0 replies; 22+ messages in thread From: Herbert Xu @ 2006-06-30 11:46 UTC (permalink / raw) To: Ingo Molnar; +Cc: Miles Lane, Andrew Morton, LKML, Arjan van de Ven On Fri, Jun 30, 2006 at 01:37:58PM +0200, Ingo Molnar wrote: > > As you can see, the lock validator can easily cover completely new lock > types like sk_lock too, as long as the new lock type has some > minimalistic "works like a lock" properties. (such as owner-does-unlock) > > later on i'll try the same cleanup for the mutex code too - it should be > possible. (that way the implementation of complex lock types can be > lock-validator checked too) Yes, this looks very nice. Thanks! -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 11:37 ` Ingo Molnar 2006-06-30 11:46 ` Herbert Xu @ 2006-06-30 20:21 ` Miles Lane 2006-06-30 20:38 ` Ingo Molnar 1 sibling, 1 reply; 22+ messages in thread From: Miles Lane @ 2006-06-30 20:21 UTC (permalink / raw) To: Ingo Molnar; +Cc: Herbert Xu, Andrew Morton, LKML, Arjan van de Ven On 6/30/06, Ingo Molnar <mingo@elte.hu> wrote: > > * Herbert Xu <herbert@gondor.apana.org.au> wrote: > > > > bh_lock_sock(sk); > > > - if (!sock_owned_by_user(sk)) > > > + if (!sock_owned_by_user(sk)) { > > > + /* > > > + * trylock + unlock semantics: > > > + */ > > > + spin_release(&sk->sk_lock.slock.dep_map, 1, _RET_IP_); > > > + mutex_acquire(&sk->sk_lock.dep_map, 0, 1, _RET_IP_); > > > > Although it would seem that keeping the spin lock would fit the actual > > semantics better. I suppose there must be a technical reason why this > > wouldn't work. > > good point. The basic issue is the 'virtual lock inversion' that occurs > in the lock vs. release paths. [between taking the slock and taking the > new sk_lock type] > > The situation is like this: we construct 'complex' lock types [mutex, > rwsem, sk_lock] out of 'primitive' lock types [spinlock, rwlock]. Both > the complex type and the primite types exist separately, and might have > lock-validator acquire/release operations. These locks can interact and > if we do the complex lock acquire/release while holding the primitive > lock, the validator sees inverse ordering between them. > > For the mutex code i solved the inversion problem by using a > raw_spinlock for the primitive type (which has no lockdep operations), > hence the complex lock type. > > But in this particular sk_lock case we can do it even more cleanly i > think and can preserve the lockdep awareness of the primitive type too: > by releasing the complex lock before taking the primitive lock in the > release_sock() unlock path. The updated patch below does this - and thus > i was able to remove the dropping of the primitive spinlock type. > > it is not a problem that the release of the complex lock type does not > happen inside the critical section: from the point where we release the > complex lock-type _this_ context cannot take any other locks, so there > are no dependencies missed. > > As you can see, the lock validator can easily cover completely new lock > types like sk_lock too, as long as the new lock type has some > minimalistic "works like a lock" properties. (such as owner-does-unlock) > > later on i'll try the same cleanup for the mutex code too - it should be > possible. (that way the implementation of complex lock types can be > lock-validator checked too) > > Ingo > > ---------------> > Subject: lockdep, annotate sk_locks > From: Ingo Molnar <mingo@elte.hu> > > Teach sk_lock semantics to the lock validator. In the softirq path > the slock has mutex_trylock()+mutex_unlock() semantics, in the > process context sock_lock() case it has mutex_lock()/mutex_unlock() > semantics. > > Thus we treat sock_owned_by_user() flagged areas as an exclusion > area too, not just those areas covered by a held sk_lock.slock. > > Effect on non-lockdep kernels: minimal, sk_lock_sock_init() has > been turned into an inline function. > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > --- > include/net/sock.h | 20 +++++------ > net/core/sock.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++------ > 2 files changed, 93 insertions(+), 19 deletions(-) > > Index: linux/include/net/sock.h > =================================================================== > --- linux.orig/include/net/sock.h > +++ linux/include/net/sock.h > @@ -44,6 +44,7 @@ > #include <linux/timer.h> > #include <linux/cache.h> > #include <linux/module.h> > +#include <linux/lockdep.h> > #include <linux/netdevice.h> > #include <linux/skbuff.h> /* struct sk_buff */ > #include <linux/security.h> > @@ -78,18 +79,17 @@ typedef struct { > spinlock_t slock; > struct sock_iocb *owner; > wait_queue_head_t wq; > + /* > + * We express the mutex-alike socket_lock semantics > + * to the lock validator by explicitly managing > + * the slock as a lock variant (in addition to > + * the slock itself): > + */ > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > + struct lockdep_map dep_map; > +#endif > } socket_lock_t; > > -extern struct lock_class_key af_family_keys[AF_MAX]; > - > -#define sock_lock_init(__sk) \ > -do { spin_lock_init(&((__sk)->sk_lock.slock)); \ > - lockdep_set_class(&(__sk)->sk_lock.slock, \ > - af_family_keys + (__sk)->sk_family); \ > - (__sk)->sk_lock.owner = NULL; \ > - init_waitqueue_head(&((__sk)->sk_lock.wq)); \ > -} while(0) > - > struct sock; > struct proto; > > Index: linux/net/core/sock.c > =================================================================== > --- linux.orig/net/core/sock.c > +++ linux/net/core/sock.c > @@ -134,7 +134,40 @@ > * Each address family might have different locking rules, so we have > * one slock key per address family: > */ > -struct lock_class_key af_family_keys[AF_MAX]; > +static struct lock_class_key af_family_keys[AF_MAX]; > +static struct lock_class_key af_family_slock_keys[AF_MAX]; > + > +#ifdef CONFIG_LOCKDEP > +/* > + * Make lock validator output more readable: > + */ > +static const char *af_family_key_strings[AF_MAX+1] = { > + "sk_lock-AF_UNSPEC", "sk_lock-AF_UNIX" , "sk_lock-AF_INET" , > + "sk_lock-AF_AX25" , "sk_lock-AF_IPX" , "sk_lock-AF_APPLETALK", > + "sk_lock-AF_NETROM", "sk_lock-AF_BRIDGE" , "sk_lock-AF_ATMPVC" , > + "sk_lock-AF_X25" , "sk_lock-AF_INET6" , "sk_lock-AF_ROSE" , > + "sk_lock-AF_DECnet", "sk_lock-AF_NETBEUI" , "sk_lock-AF_SECURITY" , > + "sk_lock-AF_KEY" , "sk_lock-AF_NETLINK" , "sk_lock-AF_PACKET" , > + "sk_lock-AF_ASH" , "sk_lock-AF_ECONET" , "sk_lock-AF_ATMSVC" , > + "sk_lock-21" , "sk_lock-AF_SNA" , "sk_lock-AF_IRDA" , > + "sk_lock-AF_PPPOX" , "sk_lock-AF_WANPIPE" , "sk_lock-AF_LLC" , > + "sk_lock-27" , "sk_lock-28" , "sk_lock-29" , > + "sk_lock-AF_TIPC" , "sk_lock-AF_BLUETOOTH", "sk_lock-AF_MAX" > +}; > +static const char *af_family_slock_key_strings[AF_MAX+1] = { > + "slock-AF_UNSPEC", "slock-AF_UNIX" , "slock-AF_INET" , > + "slock-AF_AX25" , "slock-AF_IPX" , "slock-AF_APPLETALK", > + "slock-AF_NETROM", "slock-AF_BRIDGE" , "slock-AF_ATMPVC" , > + "slock-AF_X25" , "slock-AF_INET6" , "slock-AF_ROSE" , > + "slock-AF_DECnet", "slock-AF_NETBEUI" , "slock-AF_SECURITY" , > + "slock-AF_KEY" , "slock-AF_NETLINK" , "slock-AF_PACKET" , > + "slock-AF_ASH" , "slock-AF_ECONET" , "slock-AF_ATMSVC" , > + "slock-21" , "slock-AF_SNA" , "slock-AF_IRDA" , > + "slock-AF_PPPOX" , "slock-AF_WANPIPE" , "slock-AF_LLC" , > + "slock-27" , "slock-28" , "slock-29" , > + "slock-AF_TIPC" , "slock-AF_BLUETOOTH", "slock-AF_MAX" > +}; > +#endif > > /* > * sk_callback_lock locking rules are per-address-family, > @@ -250,9 +283,16 @@ int sk_receive_skb(struct sock *sk, stru > skb->dev = NULL; > > bh_lock_sock(sk); > - if (!sock_owned_by_user(sk)) > + if (!sock_owned_by_user(sk)) { > + /* > + * trylock + unlock semantics: > + */ > + mutex_acquire(&sk->sk_lock.dep_map, 0, 1, _RET_IP_); > + > rc = sk->sk_backlog_rcv(sk, skb); > - else > + > + mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_); > + } else > sk_add_backlog(sk, skb); > bh_unlock_sock(sk); > out: > @@ -762,6 +802,30 @@ lenout: > return 0; > } > > +/* > + * Initialize an sk_lock. > + * > + * (We also register the sk_lock with the lock validator.) > + */ > +static void inline sock_lock_init(struct sock *sk) > +{ > + spin_lock_init(&sk->sk_lock.slock); > + lockdep_set_class_and_name(&sk->sk_lock.slock, > + af_family_slock_keys + sk->sk_family, > + af_family_slock_key_strings[sk->sk_family]); > + sk->sk_lock.owner = NULL; > + init_waitqueue_head(&sk->sk_lock.wq); > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > + /* > + * Make sure we are not reinitializing a held lock: > + */ > + debug_check_no_locks_freed((void *)&sk->sk_lock, sizeof(sk->sk_lock)); > + lockdep_init_map(&sk->sk_lock.dep_map, > + af_family_key_strings[sk->sk_family], > + af_family_keys + sk->sk_family); > +#endif > +} > + > /** > * sk_alloc - All socket objects are allocated here > * @family: protocol family > @@ -1466,24 +1530,34 @@ void sock_init_data(struct socket *sock, > void fastcall lock_sock(struct sock *sk) > { > might_sleep(); > - spin_lock_bh(&(sk->sk_lock.slock)); > + spin_lock_bh(&sk->sk_lock.slock); > if (sk->sk_lock.owner) > __lock_sock(sk); > sk->sk_lock.owner = (void *)1; > - spin_unlock_bh(&(sk->sk_lock.slock)); > + spin_unlock(&sk->sk_lock.slock); > + /* > + * The sk_lock has mutex_lock() semantics here: > + */ > + mutex_acquire(&sk->sk_lock.dep_map, 0, 0, _RET_IP_); > + local_bh_enable(); > } > > EXPORT_SYMBOL(lock_sock); > > void fastcall release_sock(struct sock *sk) > { > - spin_lock_bh(&(sk->sk_lock.slock)); > + /* > + * The sk_lock has mutex_unlock() semantics: > + */ > + mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_); > + > + spin_lock_bh(&sk->sk_lock.slock); > if (sk->sk_backlog.tail) > __release_sock(sk); > sk->sk_lock.owner = NULL; > - if (waitqueue_active(&(sk->sk_lock.wq))) > - wake_up(&(sk->sk_lock.wq)); > - spin_unlock_bh(&(sk->sk_lock.slock)); > + if (waitqueue_active(&sk->sk_lock.wq)) > + wake_up(&sk->sk_lock.wq); > + spin_unlock_bh(&sk->sk_lock.slock); > } > EXPORT_SYMBOL(release_sock); > I cannot get this patch to apply cleanly to 2.6.17-mm4. Since the patch listed in this message covers the same files as your previous lockdep-annotate-slock.patch, I am assuming this is supposed to replace it. I should also still apply lockdep-core-add-set-class-and-name.patch, correct? patch -p1 -l --dry-run < ../molnar-latest.patch patching file include/net/sock.h patching file net/core/sock.c Hunk #1 FAILED at 134. Hunk #3 succeeded at 802 with fuzz 2. 1 out of 4 hunks FAILED -- saving rejects to file net/core/sock.c.rej ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 20:21 ` Miles Lane @ 2006-06-30 20:38 ` Ingo Molnar 2006-06-30 22:45 ` Miles Lane 0 siblings, 1 reply; 22+ messages in thread From: Ingo Molnar @ 2006-06-30 20:38 UTC (permalink / raw) To: Miles Lane; +Cc: Herbert Xu, Andrew Morton, LKML, Arjan van de Ven * Miles Lane <miles.lane@gmail.com> wrote: > I cannot get this patch to apply cleanly to 2.6.17-mm4. Since the > patch listed in this message covers the same files as your previous > lockdep-annotate-slock.patch, I am assuming this is supposed to > replace it. I should also still apply > lockdep-core-add-set-class-and-name.patch, correct? correct. My latest combo patch has it all included, it's at: http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-mm4.patch could you try that one? Ingo ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 20:38 ` Ingo Molnar @ 2006-06-30 22:45 ` Miles Lane 2006-07-01 0:14 ` Miles Lane ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Miles Lane @ 2006-06-30 22:45 UTC (permalink / raw) To: Ingo Molnar; +Cc: Herbert Xu, Andrew Morton, LKML, Arjan van de Ven Okay, I rebuilt my kernel with your combo patch applied. Then, I inserted my US Robotics USR2210 PCMCIA wifi card, ran "pccardutil eject", popped out the card and then inserted a Compaq iPaq wifi card. This triggered the following. [ INFO: possible circular locking dependency detected ] ------------------------------------------------------- syslogd/1886 is trying to acquire lock: (&dev->queue_lock){-+..}, at: [<c11a50b5>] dev_queue_xmit+0x120/0x24b but task is already holding lock: (&dev->_xmit_lock){-+..}, at: [<c11a5118>] dev_queue_xmit+0x183/0x24b which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&dev->_xmit_lock){-+..}: [<c102d1b6>] lock_acquire+0x60/0x80 [<c12013ed>] _spin_lock_bh+0x28/0x37 [<c11b0045>] dev_activate+0xce/0x100 [<c11a42e7>] dev_open+0x4c/0x64 [<c11a2e44>] dev_change_flags+0x51/0xf1 [<c11ab07d>] do_setlink+0x73/0x312 [<c11aadcc>] rtnetlink_rcv_msg+0x1ae/0x1d1 [<c11b5c91>] netlink_run_queue+0x69/0x100 [<c11aabd4>] rtnetlink_rcv+0x29/0x42 [<c11b6104>] netlink_data_ready+0x12/0x4f [<c11b4f51>] netlink_sendskb+0x1f/0x50 [<c11b5a42>] netlink_unicast+0x1b4/0x1ce [<c11b60e5>] netlink_sendmsg+0x259/0x266 [<c1199c7a>] sock_sendmsg+0xe8/0x103 [<c119a2a5>] sys_sendmsg+0x14f/0x1aa [<c119b5e3>] sys_socketcall+0x16b/0x186 [<c1002d6d>] sysenter_past_esp+0x56/0x8d -> #0 (&dev->queue_lock){-+..}: [<c102d1b6>] lock_acquire+0x60/0x80 [<c12013b6>] _spin_lock+0x23/0x32 [<c11a50b5>] dev_queue_xmit+0x120/0x24b [<f948379e>] hostap_data_start_xmit+0x610/0x61a [hostap] [<c11a371d>] dev_hard_start_xmit+0x206/0x212 [<c11a5134>] dev_queue_xmit+0x19f/0x24b [<f939c7c9>] mld_sendpack+0x1a0/0x26a [ipv6] [<f939d3a3>] mld_ifc_timer_expire+0x1d6/0x1fd [ipv6] [<c101db26>] run_timer_softirq+0xf2/0x14a [<c101a709>] __do_softirq+0x55/0xb0 [<c1004a64>] do_softirq+0x58/0xbd other info that might help us debug this: 4 locks held by syslogd/1886: #0: (&inode->i_mutex){--..}, at: [<c120028e>] mutex_lock+0x1c/0x1f #1: (&journal->j_state_lock){--..}, at: [<c10aee62>] journal_stop+0x109/0x1f8 #2: (&transaction->t_handle_lock){--..}, at: [<c10aee6c>] journal_stop+0x113/0x1f8 #3: (&dev->_xmit_lock){-+..}, at: [<c11a5118>] dev_queue_xmit+0x183/0x24b stack backtrace: [<c1003502>] show_trace_log_lvl+0x54/0xfd [<c1003b6a>] show_trace+0xd/0x10 [<c1003c0e>] dump_stack+0x19/0x1b [<c102c576>] print_circular_bug_tail+0x59/0x64 [<c102cd5c>] __lock_acquire+0x7db/0x970 [<c102d1b6>] lock_acquire+0x60/0x80 [<c12013b6>] _spin_lock+0x23/0x32 [<c11a50b5>] dev_queue_xmit+0x120/0x24b [<f948379e>] hostap_data_start_xmit+0x610/0x61a [hostap] [<c11a371d>] dev_hard_start_xmit+0x206/0x212 [<c11a5134>] dev_queue_xmit+0x19f/0x24b [<f939c7c9>] mld_sendpack+0x1a0/0x26a [ipv6] [<f939d3a3>] mld_ifc_timer_expire+0x1d6/0x1fd [ipv6] [<c101db26>] run_timer_softirq+0xf2/0x14a [<c101a709>] __do_softirq+0x55/0xb0 [<c1004a64>] do_softirq+0x58/0xbd [<c101a6a8>] irq_exit+0x3f/0x4b [<c1004b89>] do_IRQ+0xc0/0xcf [<c1002fd9>] common_interrupt+0x25/0x2c On 6/30/06, Ingo Molnar <mingo@elte.hu> wrote: > > * Miles Lane <miles.lane@gmail.com> wrote: > > > I cannot get this patch to apply cleanly to 2.6.17-mm4. Since the > > patch listed in this message covers the same files as your previous > > lockdep-annotate-slock.patch, I am assuming this is supposed to > > replace it. I should also still apply > > lockdep-core-add-set-class-and-name.patch, correct? > > correct. My latest combo patch has it all included, it's at: > > http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-mm4.patch > > could you try that one? > > Ingo > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 22:45 ` Miles Lane @ 2006-07-01 0:14 ` Miles Lane 2006-07-01 5:32 ` Ingo Molnar 2006-07-01 9:41 ` Arjan van de Ven 2 siblings, 0 replies; 22+ messages in thread From: Miles Lane @ 2006-07-01 0:14 UTC (permalink / raw) To: Ingo Molnar; +Cc: Herbert Xu, Andrew Morton, LKML, Arjan van de Ven i just hit this again. The stacktrace is identical except for this bit: other info that might help us debug this: 2 locks held by usplash_write/2182: #0: (&tty->atomic_write_lock){--..}, at: [<c11fff5e>] mutex_lock_interruptible+0x1c/0x1f #1: (&dev->_xmit_lock){-+..}, at: [<c11a5118>] dev_queue_xmit+0x183/0x24b ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 22:45 ` Miles Lane 2006-07-01 0:14 ` Miles Lane @ 2006-07-01 5:32 ` Ingo Molnar 2006-07-01 9:41 ` Arjan van de Ven 2 siblings, 0 replies; 22+ messages in thread From: Ingo Molnar @ 2006-07-01 5:32 UTC (permalink / raw) To: Miles Lane; +Cc: Herbert Xu, Andrew Morton, LKML, Arjan van de Ven * Miles Lane <miles.lane@gmail.com> wrote: > Okay, I rebuilt my kernel with your combo patch applied. Then, I > inserted my US Robotics USR2210 PCMCIA wifi card, ran "pccardutil > eject", popped out the card and then inserted a Compaq iPaq wifi card. > This triggered the following. ok - but the ipv6 one went away, which is good. Will have a look at this new one. Ingo ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-06-30 22:45 ` Miles Lane 2006-07-01 0:14 ` Miles Lane 2006-07-01 5:32 ` Ingo Molnar @ 2006-07-01 9:41 ` Arjan van de Ven 2006-07-01 14:06 ` Miles Lane 2 siblings, 1 reply; 22+ messages in thread From: Arjan van de Ven @ 2006-07-01 9:41 UTC (permalink / raw) To: Miles Lane; +Cc: Ingo Molnar, Herbert Xu, Andrew Morton, LKML On Fri, 2006-06-30 at 15:45 -0700, Miles Lane wrote: > Okay, I rebuilt my kernel with your combo patch applied. > Then, I inserted my US Robotics USR2210 PCMCIA wifi card, > ran "pccardutil eject", popped out the card and then inserted > a Compaq iPaq wifi card. This triggered the following. > > [ INFO: possible circular locking dependency detected ] > ------------------------------------------------------- > syslogd/1886 is trying to acquire lock: > (&dev->queue_lock){-+..}, at: [<c11a50b5>] dev_queue_xmit+0x120/0x24b > > but task is already holding lock: > (&dev->_xmit_lock){-+..}, at: [<c11a5118>] dev_queue_xmit+0x183/0x24b > > which lock already depends on the new lock. ok this appears to be hostap playing games... it has 2 network devices for one piece of hardware and one calls the other via the networking layer; there is thankfully a natural ordering between the two, so just making the slave one a separate type ought to make this work. Can you test the patch below? --- drivers/net/wireless/hostap/hostap_hw.c | 10 ++++++++++ 1 file changed, 10 insertions(+) Index: linux-2.6.17-mm4/drivers/net/wireless/hostap/hostap_hw.c =================================================================== --- linux-2.6.17-mm4.orig/drivers/net/wireless/hostap/hostap_hw.c +++ linux-2.6.17-mm4/drivers/net/wireless/hostap/hostap_hw.c @@ -3096,6 +3096,14 @@ static void prism2_clear_set_tim_queue(l } +/* + * HostAP uses two layers of net devices, where the inner + * layer gets called all the time from the outer layer. + * This is a natural nesting, which needs a split lock type. + */ +static struct lock_class_key hostap_netdev_xmit_lock_key; + + static struct net_device * prism2_init_local_data(struct prism2_helper_functions *funcs, int card_idx, struct device *sdev) @@ -3260,6 +3268,8 @@ while (0) SET_NETDEV_DEV(dev, sdev); if (ret >= 0) ret = register_netdevice(dev); + + lockdep_set_class(&dev->_xmit_lock, &hostap_netdev_xmit_lock_key); rtnl_unlock(); if (ret < 0) { printk(KERN_WARNING "%s: register netdevice failed!\n", ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-07-01 9:41 ` Arjan van de Ven @ 2006-07-01 14:06 ` Miles Lane 2006-07-01 14:07 ` Herbert Xu 0 siblings, 1 reply; 22+ messages in thread From: Miles Lane @ 2006-07-01 14:06 UTC (permalink / raw) To: Arjan van de Ven; +Cc: Ingo Molnar, Herbert Xu, Andrew Morton, LKML On 7/1/06, Arjan van de Ven <arjan@infradead.org> wrote: > On Fri, 2006-06-30 at 15:45 -0700, Miles Lane wrote: > > Okay, I rebuilt my kernel with your combo patch applied. > > Then, I inserted my US Robotics USR2210 PCMCIA wifi card, > > ran "pccardutil eject", popped out the card and then inserted > > a Compaq iPaq wifi card. This triggered the following. > > > > [ INFO: possible circular locking dependency detected ] > > ------------------------------------------------------- > > syslogd/1886 is trying to acquire lock: > > (&dev->queue_lock){-+..}, at: [<c11a50b5>] dev_queue_xmit+0x120/0x24b > > > > but task is already holding lock: > > (&dev->_xmit_lock){-+..}, at: [<c11a5118>] dev_queue_xmit+0x183/0x24b > > > > which lock already depends on the new lock. > > > ok this appears to be hostap playing games... it has 2 network devices > for one piece of hardware and one calls the other via the networking > layer; there is thankfully a natural ordering between the two, so just > making the slave one a separate type ought to make this work. > > Can you test the patch below? > > --- > drivers/net/wireless/hostap/hostap_hw.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > Index: linux-2.6.17-mm4/drivers/net/wireless/hostap/hostap_hw.c > =================================================================== > --- linux-2.6.17-mm4.orig/drivers/net/wireless/hostap/hostap_hw.c > +++ linux-2.6.17-mm4/drivers/net/wireless/hostap/hostap_hw.c > @@ -3096,6 +3096,14 @@ static void prism2_clear_set_tim_queue(l > } > > > +/* > + * HostAP uses two layers of net devices, where the inner > + * layer gets called all the time from the outer layer. > + * This is a natural nesting, which needs a split lock type. > + */ > +static struct lock_class_key hostap_netdev_xmit_lock_key; > + > + > static struct net_device * > prism2_init_local_data(struct prism2_helper_functions *funcs, int card_idx, > struct device *sdev) > @@ -3260,6 +3268,8 @@ while (0) > SET_NETDEV_DEV(dev, sdev); > if (ret >= 0) > ret = register_netdevice(dev); > + > + lockdep_set_class(&dev->_xmit_lock, &hostap_netdev_xmit_lock_key); > rtnl_unlock(); > if (ret < 0) { > printk(KERN_WARNING "%s: register netdevice failed!\n", (Sorry for the duplicate message Arjan, I forgot to Reply All the first time.) After rebuilding with this patch, I still get the lockdep message. Everything is the same except now the "other info" section reads: other info that might help us debug this: 1 lock held by swapper/0: #0: (&dev->_xmit_lock){-+..}, at: [<c11a5118>] dev_queue_xmit+0x183/0x24b ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-07-01 14:06 ` Miles Lane @ 2006-07-01 14:07 ` Herbert Xu 2006-07-01 14:10 ` Miles Lane 0 siblings, 1 reply; 22+ messages in thread From: Herbert Xu @ 2006-07-01 14:07 UTC (permalink / raw) To: Miles Lane; +Cc: Arjan van de Ven, Ingo Molnar, Andrew Morton, LKML On Sat, Jul 01, 2006 at 07:06:22AM -0700, Miles Lane wrote: > > >@@ -3260,6 +3268,8 @@ while (0) > > SET_NETDEV_DEV(dev, sdev); > > if (ret >= 0) > > ret = register_netdevice(dev); > >+ > >+ lockdep_set_class(&dev->_xmit_lock, &hostap_netdev_xmit_lock_key); > > rtnl_unlock(); > > if (ret < 0) { > > printk(KERN_WARNING "%s: register netdevice failed!\n", > > After rebuilding with this patch, I still get the lockdep message. > Everything is the same except now the "other info" section reads: Perhaps the same trick needs to be applied to the queue lock? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-07-01 14:07 ` Herbert Xu @ 2006-07-01 14:10 ` Miles Lane 2006-07-01 14:19 ` Miles Lane 0 siblings, 1 reply; 22+ messages in thread From: Miles Lane @ 2006-07-01 14:10 UTC (permalink / raw) To: Herbert Xu; +Cc: Arjan van de Ven, Ingo Molnar, Andrew Morton, LKML Doh! User error. I did a dry-run to make sure it'd apply and forgot to run again without it. I'll fix and restest. Sorry, Miles ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch] lockdep, annotate slocks: turn lockdep off for them 2006-07-01 14:10 ` Miles Lane @ 2006-07-01 14:19 ` Miles Lane 0 siblings, 0 replies; 22+ messages in thread From: Miles Lane @ 2006-07-01 14:19 UTC (permalink / raw) To: Herbert Xu; +Cc: Arjan van de Ven, Ingo Molnar, Andrew Morton, LKML Okay, the patch worked! Thanks. BTW, I noticed this unrelated issue when recompiling: WARNING: "__stack_chk_fail" [drivers/net/wireless/hostap/hostap_plx.ko] undefined! WARNING: "__stack_chk_fail" [drivers/net/wireless/hostap/hostap_pci.ko] undefined! WARNING: "__stack_chk_fail" [drivers/net/wireless/hostap/hostap_cs.ko] undefined! Cheers, Miles ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2006-07-01 14:19 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-29 19:01 2.6.17-mm3 -- BUG: illegal lock usage -- illegal {softirq-on-W} -> {in-softirq-R} usage Miles Lane
2006-06-29 19:26 ` Andrew Morton
2006-06-29 19:42 ` Arjan van de Ven
2006-06-29 19:56 ` Andrew Morton
2006-06-30 2:57 ` Herbert Xu
2006-06-30 6:50 ` Ingo Molnar
2006-06-30 6:59 ` Ingo Molnar
2006-06-30 7:22 ` [patch] lockdep, annotate slocks: turn lockdep off for them Ingo Molnar
2006-06-30 9:18 ` Ingo Molnar
2006-06-30 11:17 ` Herbert Xu
2006-06-30 11:37 ` Ingo Molnar
2006-06-30 11:46 ` Herbert Xu
2006-06-30 20:21 ` Miles Lane
2006-06-30 20:38 ` Ingo Molnar
2006-06-30 22:45 ` Miles Lane
2006-07-01 0:14 ` Miles Lane
2006-07-01 5:32 ` Ingo Molnar
2006-07-01 9:41 ` Arjan van de Ven
2006-07-01 14:06 ` Miles Lane
2006-07-01 14:07 ` Herbert Xu
2006-07-01 14:10 ` Miles Lane
2006-07-01 14:19 ` Miles Lane
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox