* Re: ipv4: crash at leaf_walk_rcu
       [not found] <CAOaiJ-k14g6ihjVfgHqgq48KV7-dRJ1_owNyDkza-r=36gscsQ@mail.gmail.com>
From: Paul E. McKenney @ 2013-07-31 12:55 UTC
To: vinayak menon; +Cc: linux-kernel, davem, getarunks, netdev

On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
> Hi,
>
> A crash was seen on 3.4.5 kernel during some random wlan operations.
>
> CPU: Single core ARM Cortex A9.
>
> fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
> which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
> object was freed with crash utility.
>
> Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
>
> As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
> returned an invalid tnode. But as I had enabled slab poisoning and the
> object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
> leaf_walk_rcu and resulted in the crash.
>
> fib_route_seq_start, takes rcu_read_lock(), but free_leaf
> calls call_rcu_bh. Can this be the problem ?
> Should rcu_read_lock() in fib_route_seq_start be changed to rcu_read_lock_bh() ?

One way or the other, the RCU read-side primitives need to match the RCU
update-side primitives.  Adding netdev...

							Thanx, Paul

> ----------------------------------------------------------------------------
> PC is at leaf_walk_rcu+0x10/0xa0
> LR is at fib_route_seq_next+0x58/0x74
> pc : [<c0500e5c>]    lr : [<c050108c>]    psr: a0000013
> sp : c150bee0  ip : 00000000  fp : 00000000
> r10: 00000400  r9 : 53701020  r8 : c32345c0
> r7 : 00000000  r6 : 00000001  r5 : 00000000  r4 : 00000002
> r3 : 6b6b6b6b  r2 : 00000001  r1 : d6e3e360  r0 : 6b6b6b6a
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c53c7d  Table: 835dc059  DAC: 00000015
>
> Backtrace:
> [<c0500e5c>] (leaf_walk_rcu+0x10/0xa0) from [<c050108c>] (fib_route_seq_next+0x58/0x74)
> [<c050108c>] (fib_route_seq_next+0x58/0x74) from [<c011c06c>] (seq_read+0x2cc/0x438)
> [<c011c06c>] (seq_read+0x2cc/0x438) from [<c0145734>] (proc_reg_read+0xb0/0xcc)
> [<c0145734>] (proc_reg_read+0xb0/0xcc) from [<c0100798>] (vfs_read+0xac/0x124)
> [<c0100798>] (vfs_read+0xac/0x124) from [<c0100848>] (sys_read+0x38/0x64)
> [<c0100848>] (sys_read+0x38/0x64) from [<c000e100>] (ret_fast_syscall+0x0/0x48)
>
> Thanks,
> Vinayak
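
The register dump in the report shows r3 = 6b6b6b6b and r0 = 6b6b6b6a. A
user-space sketch of how those two values could arise is given here. It is
not kernel code: struct fake_node, poison_free() and parent_of() are
hypothetical names, and the one-bit NODE_TYPE_MASK is an assumption about how
the trie tags its parent word. The only value taken as given is the slab
poison byte 0x6b, which the report itself mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE    0x6b   /* byte written over freed slab objects */
#define NODE_TYPE_MASK 0x1u   /* assumption: low bit of parent tags the node type */

/* Hypothetical stand-in for a 32-bit trie node; only the parent word matters. */
struct fake_node {
    uint32_t parent;
    uint32_t key;
};

/* Model what slab poisoning does to an object when it is freed. */
static void poison_free(struct fake_node *n)
{
    assert(n != NULL);
    memset(n, POISON_FREE, sizeof(*n));
}

/* Model a parent lookup that strips the type tag from the stored word. */
static uint32_t parent_of(const struct fake_node *n)
{
    assert(n != NULL);
    return n->parent & ~NODE_TYPE_MASK;
}
```

Under these assumptions, a node that has been through poison_free() holds
0x6b6b6b6b in its parent word, and parent_of() returns 0x6b6b6b6a, which
lines up with r3 and r0 in the dump.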
* Re: ipv4: crash at leaf_walk_rcu
From: Hannes Frederic Sowa @ 2013-07-31 13:13 UTC
To: Paul E. McKenney; +Cc: vinayak menon, linux-kernel, davem, getarunks, netdev

On Wed, Jul 31, 2013 at 05:55:13AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
> > Hi,
> >
> > A crash was seen on 3.4.5 kernel during some random wlan operations.
> >
> > CPU: Single core ARM Cortex A9.
> >
> > fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
> > which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
> > object was freed with crash utility.
> >
> > Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
> >
> > As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
> > returned an invalid tnode. But as I had enabled slab poisoning and the
> > object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
> > leaf_walk_rcu and resulted in the crash.
> >
> > fib_route_seq_start, takes rcu_read_lock(), but free_leaf
> > calls call_rcu_bh. Can this be the problem ?
> > Should rcu_read_lock() in fib_route_seq_start be changed to rcu_read_lock_bh() ?
>
> One way or the other, the RCU read-side primitives need to match the RCU
> update-side primitives.  Adding netdev...

Already fixed by:

commit 0c03eca3d995e73d691edea8c787e25929ec156d
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Aug 7 00:47:11 2012 +0000

    net: fib: fix incorrect call_rcu_bh()

    After IP route cache removal, I believe rcu_bh() has very little use and
    we should remove this RCU variant, since it adds some cycles in fast
    path.

    Anyway, the call_rcu_bh() use in fib_true is obviously wrong, since
    some users only assert rcu_read_lock().
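
The mismatch discussed in this thread (a reader under rcu_read_lock(), a free
queued with call_rcu_bh()) can be illustrated with a toy user-space model. It
is only a sketch under loud assumptions: real RCU tracks quiescent states and
does not count readers, and every name below (toy_rcu, toy_read_lock,
toy_try_reclaim, the FLAVOR_* constants) is hypothetical. The one property it
tries to capture is that a grace period of one flavor waits only for readers
of that same flavor.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: two independent "flavors", as with rcu and rcu_bh. */
enum flavor { FLAVOR_NORMAL = 0, FLAVOR_BH = 1, FLAVOR_COUNT };

struct toy_rcu {
    int readers[FLAVOR_COUNT];   /* active read-side sections per flavor */
};

struct toy_obj {
    bool freed;
};

/* Stand-in for rcu_read_lock() or rcu_read_lock_bh(), depending on flavor. */
static void toy_read_lock(struct toy_rcu *r, enum flavor f)
{
    r->readers[f]++;
}

/* Stand-in for the matching unlock. */
static void toy_read_unlock(struct toy_rcu *r, enum flavor f)
{
    assert(r->readers[f] > 0);
    r->readers[f]--;
}

/*
 * Stand-in for call_rcu() or call_rcu_bh() followed by a grace-period check:
 * the object is freed only when no reader of the SAME flavor is active.
 * Readers of the other flavor are never consulted. Returns true if freed.
 */
static bool toy_try_reclaim(const struct toy_rcu *r, enum flavor f,
                            struct toy_obj *o)
{
    if (r->readers[f] == 0)
        o->freed = true;
    return o->freed;
}
```

In this model, a reader that entered with toy_read_lock(&r, FLAVOR_NORMAL)
does not stop toy_try_reclaim(&r, FLAVOR_BH, &obj) from freeing the object,
which is the use-after-free pattern from the report. Reclaiming with
FLAVOR_NORMAL instead makes the free wait until that reader has unlocked,
which corresponds to the direction the referenced commit takes.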
* Re: ipv4: crash at leaf_walk_rcu
From: vinayak menon @ 2013-07-31 13:31 UTC
To: Paul E. McKenney, vinayak menon, linux-kernel, davem, getarunks, netdev

On Wed, Jul 31, 2013 at 6:43 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On Wed, Jul 31, 2013 at 05:55:13AM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
>> > Hi,
>> >
>> > A crash was seen on 3.4.5 kernel during some random wlan operations.
>> >
>> > CPU: Single core ARM Cortex A9.
>> >
>> > fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
>> > which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
>> > object was freed with crash utility.
>> >
>> > Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
>> >
>> > As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
>> > returned an invalid tnode. But as I had enabled slab poisoning and the
>> > object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
>> > leaf_walk_rcu and resulted in the crash.
>> >
>> > fib_route_seq_start, takes rcu_read_lock(), but free_leaf
>> > calls call_rcu_bh. Can this be the problem ?
>> > Should rcu_read_lock() in fib_route_seq_start be changed to rcu_read_lock_bh() ?
>>
>> One way or the other, the RCU read-side primitives need to match the RCU
>> update-side primitives.  Adding netdev...
>
> Already fixed by:
>
> commit 0c03eca3d995e73d691edea8c787e25929ec156d
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Tue Aug 7 00:47:11 2012 +0000
>
>     net: fib: fix incorrect call_rcu_bh()
>
>     After IP route cache removal, I believe rcu_bh() has very little use and
>     we should remove this RCU variant, since it adds some cycles in fast
>     path.
>
>     Anyway, the call_rcu_bh() use in fib_true is obviously wrong, since
>     some users only assert rcu_read_lock().
>

Thanks. I missed this somehow.
* Re: ipv4: crash at leaf_walk_rcu
From: Paul E. McKenney @ 2013-07-31 14:13 UTC
To: vinayak menon, linux-kernel, davem, getarunks, netdev

On Wed, Jul 31, 2013 at 03:13:23PM +0200, Hannes Frederic Sowa wrote:
> On Wed, Jul 31, 2013 at 05:55:13AM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
> > > Hi,
> > >
> > > A crash was seen on 3.4.5 kernel during some random wlan operations.
> > >
> > > CPU: Single core ARM Cortex A9.
> > >
> > > fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
> > > which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
> > > object was freed with crash utility.
> > >
> > > Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
> > >
> > > As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
> > > returned an invalid tnode. But as I had enabled slab poisoning and the
> > > object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
> > > leaf_walk_rcu and resulted in the crash.
> > >
> > > fib_route_seq_start, takes rcu_read_lock(), but free_leaf
> > > calls call_rcu_bh. Can this be the problem ?
> > > Should rcu_read_lock() in fib_route_seq_start be changed to rcu_read_lock_bh() ?
> >
> > One way or the other, the RCU read-side primitives need to match the RCU
> > update-side primitives.  Adding netdev...
>
> Already fixed by:
>
> commit 0c03eca3d995e73d691edea8c787e25929ec156d
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Tue Aug 7 00:47:11 2012 +0000
>
>     net: fib: fix incorrect call_rcu_bh()
>
>     After IP route cache removal, I believe rcu_bh() has very little use and
>     we should remove this RCU variant, since it adds some cycles in fast
>     path.
>
>     Anyway, the call_rcu_bh() use in fib_true is obviously wrong, since
>     some users only assert rcu_read_lock().

Even better!  ;-)

							Thanx, Paul