* 2.6.24-rc6: possible recursive locking detected
@ 2008-01-03 22:58 Christian Kujau
2008-01-03 23:06 ` Rafael J. Wysocki
0 siblings, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-01-03 22:58 UTC (permalink / raw)
To: linux-kernel; +Cc: jfs-discussion
hi,
a few minutes after upgrading from -rc5 to -rc6 I got:
[ 1310.670986] =============================================
[ 1310.671690] [ INFO: possible recursive locking detected ]
[ 1310.672097] 2.6.24-rc6 #1
[ 1310.672421] ---------------------------------------------
[ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
[ 1310.673238] (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
[ 1310.673869]
[ 1310.673870] but task is already holding lock:
[ 1310.674567] (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
[ 1310.675267]
[ 1310.675268] other info that might help us debug this:
[ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
[ 1310.676334] #0: (rcu_read_lock){..--}, at: [<c038b620>] net_rx_action+0x60/0x1b0
[ 1310.677251] #1: (rcu_read_lock){..--}, at: [<c0388d60>] netif_receive_skb+0x100/0x470
[ 1310.677924] #2: (rcu_read_lock){..--}, at: [<c03a7fb2>] ip_local_deliver_finish+0x32/0x210
[ 1310.678460] #3: (clock-AF_INET){-.-?}, at: [<c038164e>] sock_def_readable+0x1e/0x80
[ 1310.679250] #4: (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
[ 1310.680151]
[ 1310.680152] stack backtrace:
[ 1310.680772] Pid: 3692, comm: FahCore_a0.exe Not tainted 2.6.24-rc6 #1
[ 1310.681209] [<c01038aa>] show_trace_log_lvl+0x1a/0x30
[ 1310.681659] [<c0104322>] show_trace+0x12/0x20
[ 1310.682085] [<c0104cba>] dump_stack+0x6a/0x70
[ 1310.682512] [<c0138ec1>] __lock_acquire+0x971/0x10c0
[ 1310.682961] [<c013966e>] lock_acquire+0x5e/0x80
[ 1310.683392] [<c0419b78>] _spin_lock_irqsave+0x38/0x50
[ 1310.683914] [<c011544b>] __wake_up+0x1b/0x50
[ 1310.684337] [<c018e2ba>] ep_poll_safewake+0x9a/0xc0
[ 1310.684822] [<c018f11b>] ep_poll_callback+0x8b/0xe0
[ 1310.685265] [<c0114418>] __wake_up_common+0x48/0x70
[ 1310.685712] [<c0115467>] __wake_up+0x37/0x50
[ 1310.686136] [<c03816aa>] sock_def_readable+0x7a/0x80
[ 1310.686579] [<c0381c2b>] sock_queue_rcv_skb+0xeb/0x150
[ 1310.687028] [<c03c7d99>] udp_queue_rcv_skb+0x139/0x2a0
[ 1310.687554] [<c03c81f1>] __udp4_lib_rcv+0x2f1/0x7e0
[ 1310.687996] [<c03c86f2>] udp_rcv+0x12/0x20
[ 1310.688415] [<c03a80a5>] ip_local_deliver_finish+0x125/0x210
[ 1310.688881] [<c03a84ed>] ip_local_deliver+0x2d/0x90
[ 1310.689323] [<c03a7d6b>] ip_rcv_finish+0xeb/0x300
[ 1310.689760] [<c03a8425>] ip_rcv+0x195/0x230
[ 1310.690182] [<c0388fdc>] netif_receive_skb+0x37c/0x470
[ 1310.690632] [<c038ba39>] process_backlog+0x69/0xc0
[ 1310.691175] [<c038b6f7>] net_rx_action+0x137/0x1b0
[ 1310.691681] [<c011e5c2>] __do_softirq+0x52/0xb0
[ 1310.692006] [<c0104e94>] do_softirq+0x94/0xe0
[ 1310.692301] =======================
This is a single CPU machine, and the box was quite busy due to disk I/O
(load 6-8). The machine continues to run and all is well now. Even the
application mentioned above (FahCore_a0.exe) is running fine
("Folding@Home", cpu bound). The binary is located on a jfs filesystem,
which was also under heavy I/O. Can someone tell me why the backtrace
shows so much net* stuff? There was not much net I/O...
more details and .config: http://nerdbynature.de/bits/2.6.24-rc6
Thanks,
Christian.
--
BOFH excuse #312:
incompatible bit-registration operators
^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-03 22:58 2.6.24-rc6: possible recursive locking detected Christian Kujau
@ 2008-01-03 23:06 ` Rafael J. Wysocki
  2008-01-04  8:30   ` Ingo Molnar
  0 siblings, 1 reply; 17+ messages in thread

From: Rafael J. Wysocki @ 2008-01-03 23:06 UTC (permalink / raw)
To: Christian Kujau; +Cc: linux-kernel, jfs-discussion, Ingo Molnar, Peter Zijlstra

[Added some CCs]

On Thursday, 3 of January 2008, Christian Kujau wrote:
> hi,
>
> a few minutes after upgrading from -rc5 to -rc6 I got:
>
> [...full lockdep report snipped, quoted verbatim in the first message above...]
>
> This is a single CPU machine, and the box was quite busy due to disk I/O
> (load 6-8). The machine continues to run and all is well now. Even the
> application mentioned above (FahCore_a0.exe) is running fine
> ("Folding@Home", cpu bound). The binary is located on a jfs filesystem,
> which was also under heavy I/O. Can someone tell me why the backtrace
> shows so much net* stuff? There was not much net I/O...
>
> more details and .config: http://nerdbynature.de/bits/2.6.24-rc6
>
> Thanks,
> Christian.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-03 23:06 ` Rafael J. Wysocki
@ 2008-01-04  8:30   ` Ingo Molnar
  2008-01-05  7:12     ` Herbert Xu
  0 siblings, 1 reply; 17+ messages in thread

From: Ingo Molnar @ 2008-01-04 8:30 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Christian Kujau, linux-kernel, jfs-discussion, Peter Zijlstra, Davide Libenzi, Herbert Xu

[ Added even more CCs :-) Seems eventpoll & net related. ]

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> [Added some CCs]
>
> On Thursday, 3 of January 2008, Christian Kujau wrote:
> > hi,
> >
> > a few minutes after upgrading from -rc5 to -rc6 I got:
> >
> > [...full lockdep report and original question snipped, quoted verbatim in the first message above...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-04  8:30 ` Ingo Molnar
@ 2008-01-05  7:12   ` Herbert Xu
  2008-01-05 16:53     ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread

From: Herbert Xu @ 2008-01-05 7:12 UTC (permalink / raw)
To: Ingo Molnar
Cc: Rafael J. Wysocki, Christian Kujau, linux-kernel, jfs-discussion, Peter Zijlstra, Davide Libenzi

On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
>
> > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
> > > [ 1310.673238] (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > >
> > > but task is already holding lock:
> > > [ 1310.674567] (&q->lock){++..}, at: [<c011544b>] __wake_up+0x1b/0x50
> > >
> > > [...rest of lockdep report snipped, quoted verbatim in the first message above...]

The net part might just be a red herring, since the problem is that
__wake_up is somehow reentering itself.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05  7:12 ` Herbert Xu
@ 2008-01-05 16:53   ` Peter Zijlstra
  2008-01-05 17:01     ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread

From: Peter Zijlstra @ 2008-01-05 16:53 UTC (permalink / raw)
To: Herbert Xu
Cc: Ingo Molnar, Rafael J. Wysocki, Christian Kujau, linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg, Oleg Nesterov

On Sat, 2008-01-05 at 18:12 +1100, Herbert Xu wrote:
> On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
> >
> > > > [...lockdep report snipped...]
>
> The net part might just be a red herring, since the problem is that
> __wake_up is somehow reentering itself.

/*
 * Perform a safe wake up of the poll wait list. The problem is that
 * with the new callback'd wake up system, it is possible that the
 * poll callback is reentered from inside the call to wake_up() done
 * on the poll wait queue head. The rule is that we cannot reenter the
 * wake up code from the same task more than EP_MAX_POLLWAKE_NESTS times,
 * and we cannot reenter the same wait queue head at all. This will
 * enable to have a hierarchy of epoll file descriptor of no more than
 * EP_MAX_POLLWAKE_NESTS deep. We need the irq version of the spin lock
 * because this one gets called by the poll callback, that in turn is called
 * from inside a wake_up(), that might be called from irq context.
 */

Seems to suggest that the epoll code can indeed recurse into wakeup.

Davide, Johannes, any ideas?

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 16:53 ` Peter Zijlstra
@ 2008-01-05 17:01   ` Peter Zijlstra
  2008-01-05 21:35     ` Davide Libenzi
  2008-01-07 17:22     ` Oleg Nesterov
  0 siblings, 2 replies; 17+ messages in thread

From: Peter Zijlstra @ 2008-01-05 17:01 UTC (permalink / raw)
To: Herbert Xu
Cc: Ingo Molnar, Rafael J. Wysocki, Christian Kujau, linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg, Oleg Nesterov

On Sat, 2008-01-05 at 17:53 +0100, Peter Zijlstra wrote:
> On Sat, 2008-01-05 at 18:12 +1100, Herbert Xu wrote:
> > [...lockdep report snipped...]
> >
> > The net part might just be a red herring, since the problem is that
> > __wake_up is somehow reentering itself.
>
> [...eventpoll.c comment on ep_poll_safewake() snipped...]
>
> Seems to suggest that the epoll code can indeed recurse into wakeup.
>
> Davide, Johannes, any ideas?

Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
something like:

	wake_up_nested(..., wake_nests);

although I'm not quite sure that is correct, my understanding of this
code is still fragile at best.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 17:01 ` Peter Zijlstra
@ 2008-01-05 21:35   ` Davide Libenzi
  2008-01-06  0:20     ` Christian Kujau
  2008-01-06 21:44     ` Cyrill Gorcunov
  1 sibling, 2 replies; 17+ messages in thread

From: Davide Libenzi @ 2008-01-05 21:35 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau, Linux Kernel Mailing List, jfs-discussion, Johannes Berg, Oleg Nesterov

On Sat, 5 Jan 2008, Peter Zijlstra wrote:
> [...snip...]
>
> Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
> something like:
>
> 	wake_up_nested(..., wake_nests);
>
> although I'm not quite sure that is correct, my understanding of this
> code is still fragile at best.

I remember I talked with Arjan about this time ago. Basically, since 1)
you can drop an epoll fd inside another epoll fd 2) callback-based wakeups
are used, you can see a wake_up() from inside another wake_up(), but they
will never refer to the same lock instance.
Think about:

	dfd = socket(...);
	efd1 = epoll_create();
	efd2 = epoll_create();
	epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
	epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);

When a packet arrives to the device underneath "dfd", the net code will
issue a wake_up() on its poll wake list. Epoll (efd1) has installed a
callback wakeup entry on that queue, and the wake_up() performed by the
"dfd" net code will end up in ep_poll_callback(). At this point epoll
(efd1) notices that it may have some event ready, so it needs to wake up
the waiters on its poll wait list (efd2). So it calls ep_poll_safewake()
that ends up in another wake_up(), after having checked about the
recursion constraints. That are, no more than EP_MAX_POLLWAKE_NESTS, to
avoid stack blasting. Never hit the same queue, to avoid loops like:

	epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
	epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
	epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
	epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);

The code "if (tncur->wq == wq || ..." prevents re-entering the same
queue/lock.
I don't know how the lockdep code works, so I can't say about
wake_up_nested(). Although I have a feeling is not enough in this case.
A solution may be to move the call to ep_poll_safewake() (that'd become a
simple wake_up()) inside a tasklet or whatever is today trendy for delayed
work. But this kinda scares me to be honest, since epoll has already a
bunch of places where it could be asynchronously hit (plus performance
regression will need to be verified).


- Davide

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 21:35 ` Davide Libenzi
@ 2008-01-06  0:20   ` Christian Kujau
  2008-01-07 21:35     ` Davide Libenzi
  0 siblings, 1 reply; 17+ messages in thread

From: Christian Kujau @ 2008-01-06 0:20 UTC (permalink / raw)
To: Davide Libenzi
Cc: Peter Zijlstra, Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List, jfs-discussion, Johannes Berg, Oleg Nesterov

On Sat, 5 Jan 2008, Davide Libenzi wrote:
> A solution may be to move the call to ep_poll_safewake() (that'd become a
> simple wake_up()) inside a tasklet or whatever is today trendy for delayed
> work. But this kinda scares me to be honest, since epoll has already a
> bunch of places where it could be asynchronously hit (plus performance
> regression will need to be verified).

Although I'm not able to reproduce this one right now, I'm happy to test
any patches you guys come up with.

Thanks for your time,
Christian.
--
BOFH excuse #78:

Yes, yes, its called a design limitation

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-06  0:20 ` Christian Kujau
@ 2008-01-07 21:35   ` Davide Libenzi
  0 siblings, 0 replies; 17+ messages in thread

From: Davide Libenzi @ 2008-01-07 21:35 UTC (permalink / raw)
To: Christian Kujau
Cc: Peter Zijlstra, Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List, jfs-discussion, Johannes Berg, Oleg Nesterov

On Sun, 6 Jan 2008, Christian Kujau wrote:
> On Sat, 5 Jan 2008, Davide Libenzi wrote:
> > A solution may be to move the call to ep_poll_safewake() (that'd become a
> > simple wake_up()) inside a tasklet or whatever is today trendy for delayed
> > work. But this kinda scares me to be honest, since epoll has already a
> > bunch of places where it could be asynchronously hit (plus performance
> > regression will need to be verified).
>
> Although I'm not able to reproduce this one right now, I'm happy to test
> any patches you guys come up with.

There's no need to reproduce it. It's there, it's among us ;)


- Davide

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 21:35 ` Davide Libenzi
  2008-01-06  0:20   ` Christian Kujau
@ 2008-01-06 21:44   ` Cyrill Gorcunov
  2008-01-06 21:53     ` Cyrill Gorcunov
  1 sibling, 1 reply; 17+ messages in thread

From: Cyrill Gorcunov @ 2008-01-06 21:44 UTC (permalink / raw)
To: Davide Libenzi
Cc: Peter Zijlstra, Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau, Linux Kernel Mailing List, jfs-discussion, Johannes Berg, Oleg Nesterov

[Davide Libenzi - Sat, Jan 05, 2008 at 01:35:25PM -0800]
| On Sat, 5 Jan 2008, Peter Zijlstra wrote:
|
| [...snip...]
|
| [...Davide's explanation of nested epoll wakeups snipped, quoted in full above...]

it's quite possible that i'm wrong but just interested...
why in ep_poll_safewake() the assignment

	struct list_head *lsthead = &psw->wake_task_list;

is not protected by spinlock?

		- Cyrill -

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-06 21:44 ` Cyrill Gorcunov
@ 2008-01-06 21:53   ` Cyrill Gorcunov
  0 siblings, 0 replies; 17+ messages in thread

From: Cyrill Gorcunov @ 2008-01-06 21:53 UTC (permalink / raw)
To: Davide Libenzi, Peter Zijlstra, Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau, Linux Kernel Mailing List, jfs-discussion, Johannes Berg, Oleg Nesterov

[Cyrill Gorcunov - Mon, Jan 07, 2008 at 12:44:42AM +0300]
| [...snip...]
|
| it's quite possible that i'm wrong but just interested...
| why in ep_poll_safewake() the assignment
|
| 	struct list_head *lsthead = &psw->wake_task_list;
|
| is not protected by spinlock?
|
| 		- Cyrill -

it was a completely stupid question... please drop ;)

		- Cyrill -

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-05 17:01 ` Peter Zijlstra
  2008-01-05 21:35   ` Davide Libenzi
@ 2008-01-07 17:22   ` Oleg Nesterov
  2008-01-07 17:49     ` Oleg Nesterov
  1 sibling, 1 reply; 17+ messages in thread

From: Oleg Nesterov @ 2008-01-07 17:22 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau, linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg

On 01/05, Peter Zijlstra wrote:
>
> Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
> something like:
>
> 	wake_up_nested(..., wake_nests);

I think this would be the most correct change. But I wonder if it is
possible to do something more generic (but otoh more stupid/hackish and
less safe).

Consider this "just for illustration" patch,

--- t/kernel/lockdep.c	2007-11-09 12:57:31.000000000 +0300
+++ t/kernel/lockdep.c	2008-01-07 19:43:50.000000000 +0300
@@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
 	struct held_lock *prev;
 	int i;
 
-	for (i = 0; i < curr->lockdep_depth; i++) {
+	for (i = curr->lockdep_depth; --i >= 0; ) {
 		prev = curr->held_locks + i;
 		if (prev->class != next->class)
 			continue;
+
+		if (prev->trylock == -1)
+			return 2;
 		/*
 		 * Allow read-after-read recursion of the same
 		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
-------------------------------------------------------------------------

Now,

	// trylock == -1
	#define spin_mark_nested(l) \
		lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
	#define spin_unmark_nested(l) \
		lock_release(&(l)->dep_map, 1, _THIS_IP_)

and ep_poll_safewake() can do:

	/* Do really wake up now */
	spin_mark_nested(&wq->lock);
	wake_up(wq);
	spin_unmark_nested(&wq->lock);

Possible?

Oleg.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-07 17:22           ` Oleg Nesterov
@ 2008-01-07 17:49             ` Oleg Nesterov
  2008-01-13 16:32               ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2008-01-07 17:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau,
	linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg

On 01/07, Oleg Nesterov wrote:
>
> Consider this "just for illustration" patch,
>
> --- t/kernel/lockdep.c	2007-11-09 12:57:31.000000000 +0300
> +++ t/kernel/lockdep.c	2008-01-07 19:43:50.000000000 +0300
> @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
>  	struct held_lock *prev;
>  	int i;
>
> -	for (i = 0; i < curr->lockdep_depth; i++) {
> +	for (i = curr->lockdep_depth; --i >= 0; ) {
>  		prev = curr->held_locks + i;
>  		if (prev->class != next->class)
>  			continue;
> +
> +		if (prev->trylock == -1)
> +			return 2;
>  		/*
>  		 * Allow read-after-read recursion of the same
>  		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
> -------------------------------------------------------------------------
>
> Now,
>
> 	// trylock == -1
> 	#define spin_mark_nested(l)	\
> 		lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
> 	#define spin_unmark_nested(l)	\
> 		lock_release(&(l)->dep_map, 1, _THIS_IP_)
>
> and ep_poll_safewake() can do:
>
> 	/* Do really wake up now */
> 	spin_mark_nested(&wq->lock);
> 	wake_up(wq);
> 	spin_unmark_nested(&wq->lock);

I tested the patch above with the following code,

	wait_queue_head_t w1, w2, w3;

	init_waitqueue_head(&w1);
	init_waitqueue_head(&w2);
	init_waitqueue_head(&w3);

	local_irq_disable();
	spin_lock(&w1.lock);

	spin_mark_nested(&w2.lock);
	spin_lock(&w2.lock);

	spin_mark_nested(&w3.lock);
	wake_up(&w3);
	spin_unmark_nested(&w3.lock);

	spin_unlock(&w2.lock);
	spin_unmark_nested(&w2.lock);

	spin_unlock(&w1.lock);
	local_irq_enable();

seems to work. What do you think?

Oleg.
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-07 17:49             ` Oleg Nesterov
@ 2008-01-13 16:32               ` Peter Zijlstra
  2008-01-14 21:27                 ` Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2008-01-13 16:32 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau,
	linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg

On Mon, 2008-01-07 at 20:49 +0300, Oleg Nesterov wrote:
> On 01/07, Oleg Nesterov wrote:
> >
> > Consider this "just for illustration" patch,
> >
> > --- t/kernel/lockdep.c	2007-11-09 12:57:31.000000000 +0300
> > +++ t/kernel/lockdep.c	2008-01-07 19:43:50.000000000 +0300
> > @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
> >  	struct held_lock *prev;
> >  	int i;
> >
> > -	for (i = 0; i < curr->lockdep_depth; i++) {
> > +	for (i = curr->lockdep_depth; --i >= 0; ) {
> >  		prev = curr->held_locks + i;
> >  		if (prev->class != next->class)
> >  			continue;
> > +
> > +		if (prev->trylock == -1)
> > +			return 2;
> >  		/*
> >  		 * Allow read-after-read recursion of the same
> >  		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
> > -------------------------------------------------------------------------
> >
> > Now,
> >
> > 	// trylock == -1
> > 	#define spin_mark_nested(l)	\
> > 		lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
> > 	#define spin_unmark_nested(l)	\
> > 		lock_release(&(l)->dep_map, 1, _THIS_IP_)
> >
> > and ep_poll_safewake() can do:
> >
> > 	/* Do really wake up now */
> > 	spin_mark_nested(&wq->lock);
> > 	wake_up(wq);
> > 	spin_unmark_nested(&wq->lock);
>
> I tested the patch above with the following code,
>
> 	wait_queue_head_t w1, w2, w3;
>
> 	init_waitqueue_head(&w1);
> 	init_waitqueue_head(&w2);
> 	init_waitqueue_head(&w3);
>
> 	local_irq_disable();
> 	spin_lock(&w1.lock);
>
> 	spin_mark_nested(&w2.lock);
> 	spin_lock(&w2.lock);
>
> 	spin_mark_nested(&w3.lock);
> 	wake_up(&w3);
> 	spin_unmark_nested(&w3.lock);
>
> 	spin_unlock(&w2.lock);
> 	spin_unmark_nested(&w2.lock);
>
> 	spin_unlock(&w1.lock);
> 	local_irq_enable();
>
> seems to work. What do you think?

I've been pondering this for a while, and some days I really like it,
some days I don't.

The problem I have with it is that it becomes very easy to falsely
annotate problems away - it's a very powerful annotation.

That said, it's almost powerful enough to annotate the device
semaphore/mutex problem.

I think I'll do wake_up_nested() for now and keep this around.

Thanks for this very nice idea though.
* Re: 2.6.24-rc6: possible recursive locking detected
  2008-01-13 16:32               ` Peter Zijlstra
@ 2008-01-14 21:27                 ` Oleg Nesterov
  2008-01-30 10:34                   ` hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected) Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2008-01-14 21:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Herbert Xu, Ingo Molnar, Rafael J. Wysocki, Christian Kujau,
	linux-kernel, jfs-discussion, Davide Libenzi, Johannes Berg

On 01/13, Peter Zijlstra wrote:
>
> On Mon, 2008-01-07 at 20:49 +0300, Oleg Nesterov wrote:
> > On 01/07, Oleg Nesterov wrote:
> > >
> > > Consider this "just for illustration" patch,
> > >
> > > --- t/kernel/lockdep.c	2007-11-09 12:57:31.000000000 +0300
> > > +++ t/kernel/lockdep.c	2008-01-07 19:43:50.000000000 +0300
> > > @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
> > >  	struct held_lock *prev;
> > >  	int i;
> > >
> > > -	for (i = 0; i < curr->lockdep_depth; i++) {
> > > +	for (i = curr->lockdep_depth; --i >= 0; ) {
> > >  		prev = curr->held_locks + i;
> > >  		if (prev->class != next->class)
> > >  			continue;
> > > +
> > > +		if (prev->trylock == -1)
> > > +			return 2;
> > >  		/*
> > >  		 * Allow read-after-read recursion of the same
> > >  		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
> > > -------------------------------------------------------------------------
> > >
> > > Now,
> > >
> > > 	// trylock == -1
> > > 	#define spin_mark_nested(l)	\
> > > 		lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
> > > 	#define spin_unmark_nested(l)	\
> > > 		lock_release(&(l)->dep_map, 1, _THIS_IP_)
> > >
> > > and ep_poll_safewake() can do:
> > >
> > > 	/* Do really wake up now */
> > > 	spin_mark_nested(&wq->lock);
> > > 	wake_up(wq);
> > > 	spin_unmark_nested(&wq->lock);
> >
> > seems to work. What do you think?
>
> I've been pondering this for a while, and some days I really like it,
> some days I don't.
>
> The problem I have with it is that it becomes very easy to falsely
> annotate problems away - it's a very powerful annotation.

Also, I don't like the overloading of ->trylock, this is really hackish.

> I think I'll do wake_up_nested() for now and keep this around.

Agreed. Perhaps it is a bit easier to use spin_lock_nested() +
__wake_up_common() directly (we have a lot of wake_up_xxx helpers),
but this is up to you.

Offtopic question. Why do we have so much lockdep stuff in timer.c and
hrtimer.c? We never lock 2 bases at the same time, except in
migrate_timers(). We can kill double_spin_lock() and base_lock_keys[]
and just use spin_lock_nested() in migrate_timers(), no?

Oleg.
* hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected)
  2008-01-14 21:27                 ` Oleg Nesterov
@ 2008-01-30 10:34                   ` Peter Zijlstra
  2008-01-30 17:36                     ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2008-01-30 10:34 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Ingo Molnar, linux-kernel, Thomas Gleixner

( trimmed CC list )

Sorry for the delay, this message seems to have gotten lost in my
inbox :-/

On Tue, 2008-01-15 at 00:27 +0300, Oleg Nesterov wrote:
> On 01/13, Peter Zijlstra wrote:
>
> Offtopic question. Why do we have so much lockdep stuff in timer.c and
> hrtimer.c? We never lock 2 bases at the same time, except in
> migrate_timers(). We can kill double_spin_lock() and base_lock_keys[]
> and just use spin_lock_nested() in migrate_timers(), no?

Lets ask Thomas.. :-)
* Re: hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected)
  2008-01-30 10:34                   ` hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected) Peter Zijlstra
@ 2008-01-30 17:36                     ` Thomas Gleixner
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2008-01-30 17:36 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Oleg Nesterov, Ingo Molnar, linux-kernel

On Wed, 30 Jan 2008, Peter Zijlstra wrote:
> ( trimmed CC list )
>
> Sorry for the delay, this message seems to have gotten lost in my
> inbox :-/
>
> On Tue, 2008-01-15 at 00:27 +0300, Oleg Nesterov wrote:
> > On 01/13, Peter Zijlstra wrote:
> >
> > Offtopic question. Why do we have so much lockdep stuff in timer.c and
> > hrtimer.c? We never lock 2 bases at the same time, except in
> > migrate_timers(). We can kill double_spin_lock() and base_lock_keys[]
> > and just use spin_lock_nested() in migrate_timers(), no?
>
> Lets ask Thomas.. :-)

No objections as long as it all works :)

     tglx
end of thread, other threads:[~2008-01-30 17:37 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2008-01-03 22:58 2.6.24-rc6: possible recursive locking detected Christian Kujau
2008-01-03 23:06 ` Rafael J. Wysocki
2008-01-04  8:30   ` Ingo Molnar
2008-01-05  7:12     ` Herbert Xu
2008-01-05 16:53       ` Peter Zijlstra
2008-01-05 17:01         ` Peter Zijlstra
2008-01-05 21:35           ` Davide Libenzi
2008-01-06  0:20             ` Christian Kujau
2008-01-07 21:35               ` Davide Libenzi
2008-01-06 21:44             ` Cyrill Gorcunov
2008-01-06 21:53               ` Cyrill Gorcunov
2008-01-07 17:22           ` Oleg Nesterov
2008-01-07 17:49             ` Oleg Nesterov
2008-01-13 16:32               ` Peter Zijlstra
2008-01-14 21:27                 ` Oleg Nesterov
2008-01-30 10:34                   ` hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected) Peter Zijlstra
2008-01-30 17:36                     ` Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox