netdev.vger.kernel.org archive mirror
* Re: ipv4: crash at leaf_walk_rcu
       [not found] <CAOaiJ-k14g6ihjVfgHqgq48KV7-dRJ1_owNyDkza-r=36gscsQ@mail.gmail.com>
@ 2013-07-31 12:55 ` Paul E. McKenney
  2013-07-31 13:13   ` Hannes Frederic Sowa
  0 siblings, 1 reply; 4+ messages in thread
From: Paul E. McKenney @ 2013-07-31 12:55 UTC (permalink / raw)
  To: vinayak menon; +Cc: linux-kernel, davem, getarunks, netdev

On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
> Hi,
> 
> A crash was seen on 3.4.5 kernel during some random wlan operations.
> 
> CPU: Single core ARM Cortex A9.
> 
> fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
> which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
> object was freed with crash utility.
> 
> Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
> 
> As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
> returned an invalid tnode. But as I had enabled slab poisoning and the
> object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
> leaf_walk_rcu and resulted in the crash.
> 
> fib_route_seq_start takes rcu_read_lock(), but free_leaf calls
> call_rcu_bh(). Could this be the problem? Should rcu_read_lock() in
> fib_route_seq_start be changed to rcu_read_lock_bh()?

One way or the other, the RCU read-side primitives need to match the RCU
update-side primitives.  Adding netdev...

							Thanx, Paul

> ----------------------------------------------------------------------------
> PC is at leaf_walk_rcu+0x10/0xa0
> LR is at fib_route_seq_next+0x58/0x74
> pc : [<c0500e5c>]    lr : [<c050108c>]    psr: a0000013
> sp : c150bee0  ip : 00000000  fp : 00000000
> r10: 00000400  r9 : 53701020  r8 : c32345c0
> r7 : 00000000  r6 : 00000001  r5 : 00000000  r4 : 00000002
> r3 : 6b6b6b6b  r2 : 00000001  r1 : d6e3e360  r0 : 6b6b6b6a
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c53c7d  Table: 835dc059  DAC: 00000015
> 
> Backtrace:
> [<c0500e5c>] (leaf_walk_rcu+0x10/0xa0) from [<c050108c>] (fib_route_seq_next+0x58/0x74)
> [<c050108c>] (fib_route_seq_next+0x58/0x74) from [<c011c06c>] (seq_read+0x2cc/0x438)
> [<c011c06c>] (seq_read+0x2cc/0x438) from [<c0145734>] (proc_reg_read+0xb0/0xcc)
> [<c0145734>] (proc_reg_read+0xb0/0xcc) from [<c0100798>] (vfs_read+0xac/0x124)
> [<c0100798>] (vfs_read+0xac/0x124) from [<c0100848>] (sys_read+0x38/0x64)
> [<c0100848>] (sys_read+0x38/0x64) from [<c000e100>] (ret_fast_syscall+0x0/0x48)
> 
> Thanks,
> Vinayak


* Re: ipv4: crash at leaf_walk_rcu
  2013-07-31 12:55 ` ipv4: crash at leaf_walk_rcu Paul E. McKenney
@ 2013-07-31 13:13   ` Hannes Frederic Sowa
  2013-07-31 13:31     ` vinayak menon
  2013-07-31 14:13     ` Paul E. McKenney
  0 siblings, 2 replies; 4+ messages in thread
From: Hannes Frederic Sowa @ 2013-07-31 13:13 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: vinayak menon, linux-kernel, davem, getarunks, netdev

On Wed, Jul 31, 2013 at 05:55:13AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
> > Hi,
> > 
> > A crash was seen on 3.4.5 kernel during some random wlan operations.
> > 
> > CPU: Single core ARM Cortex A9.
> > 
> > fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
> > which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
> > object was freed with crash utility.
> > 
> > Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
> > 
> > As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
> > returned an invalid tnode. But as I had enabled slab poisoning and the
> > object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
> > leaf_walk_rcu and resulted in the crash.
> > 
> > fib_route_seq_start takes rcu_read_lock(), but free_leaf calls
> > call_rcu_bh(). Could this be the problem? Should rcu_read_lock() in
> > fib_route_seq_start be changed to rcu_read_lock_bh()?
> 
> One way or the other, the RCU read-side primitives need to match the RCU
> update-side primitives.  Adding netdev...

Already fixed by:

commit 0c03eca3d995e73d691edea8c787e25929ec156d
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Aug 7 00:47:11 2012 +0000

    net: fib: fix incorrect call_rcu_bh()
    
    After IP route cache removal, I believe rcu_bh() has very little use and
    we should remove this RCU variant, since it adds some cycles in fast
    path.
    
    Anyway, the call_rcu_bh() use in fib_trie is obviously wrong, since
    some users only assert rcu_read_lock().
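
To illustrate why the flavors have to match, here is a toy userspace model,
not kernel code. Every name in it (toy_rcu, toy_leaf, toy_read_lock,
toy_deferred_free, toy_use_after_free) is made up for this sketch, and it
collapses "wait for a grace period, then run the callback" into a single
check. The one assumption it encodes is that a grace period of a given
flavor only waits for readers of that same flavor.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: each RCU flavor is a count of readers currently
 * inside a read-side critical section of that flavor. */
enum flavor { FLAVOR_RCU, FLAVOR_RCU_BH, FLAVOR_COUNT };

struct toy_rcu {
	int readers[FLAVOR_COUNT];
};

struct toy_leaf {
	bool freed;
};

static void toy_read_lock(struct toy_rcu *r, enum flavor f)
{
	r->readers[f]++;
}

static void toy_read_unlock(struct toy_rcu *r, enum flavor f)
{
	assert(r->readers[f] > 0);	/* catch unbalanced unlock */
	r->readers[f]--;
}

/* A grace period of flavor f has elapsed once no reader of that same
 * flavor is active. Readers of the other flavor are not considered. */
static bool toy_grace_period_elapsed(const struct toy_rcu *r, enum flavor f)
{
	return r->readers[f] == 0;
}

/* Stand-in for call_rcu()/call_rcu_bh(): free the leaf only if a grace
 * period of flavor f has elapsed. Returns true if the leaf was freed. */
static bool toy_deferred_free(const struct toy_rcu *r, enum flavor f,
			      struct toy_leaf *leaf)
{
	if (!toy_grace_period_elapsed(r, f))
		return false;
	leaf->freed = true;
	return true;
}

/* Returns true if the leaf got freed while a reader that took
 * reader_flavor was still inside its critical section. */
static bool toy_use_after_free(enum flavor reader_flavor,
			       enum flavor updater_flavor)
{
	struct toy_rcu rcu = { { 0 } };
	struct toy_leaf leaf = { false };
	bool stale;

	toy_read_lock(&rcu, reader_flavor);
	toy_deferred_free(&rcu, updater_flavor, &leaf);
	stale = leaf.freed;	/* reader would dereference freed memory */
	toy_read_unlock(&rcu, reader_flavor);
	return stale;
}
```

In this model, toy_use_after_free(FLAVOR_RCU, FLAVOR_RCU_BH) corresponds to
the reported combination (reader under rcu_read_lock(), leaf freed via
call_rcu_bh()) and reports a stale leaf, while the matched combinations do
not.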


* Re: ipv4: crash at leaf_walk_rcu
  2013-07-31 13:13   ` Hannes Frederic Sowa
@ 2013-07-31 13:31     ` vinayak menon
  2013-07-31 14:13     ` Paul E. McKenney
  1 sibling, 0 replies; 4+ messages in thread
From: vinayak menon @ 2013-07-31 13:31 UTC (permalink / raw)
  To: Paul E. McKenney, vinayak menon, linux-kernel, davem, getarunks,
	netdev

On Wed, Jul 31, 2013 at 6:43 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On Wed, Jul 31, 2013 at 05:55:13AM -0700, Paul E. McKenney wrote:
>> On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
>> > Hi,
>> >
>> > A crash was seen on 3.4.5 kernel during some random wlan operations.
>> >
>> > CPU: Single core ARM Cortex A9.
>> >
>> > fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
>> > which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
>> > object was freed with crash utility.
>> >
>> > Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
>> >
>> > As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
>> > returned an invalid tnode. But as I had enabled slab poisoning and the
>> > object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
>> > leaf_walk_rcu and resulted in the crash.
>> >
>> > fib_route_seq_start takes rcu_read_lock(), but free_leaf calls
>> > call_rcu_bh(). Could this be the problem? Should rcu_read_lock() in
>> > fib_route_seq_start be changed to rcu_read_lock_bh()?
>>
>> One way or the other, the RCU read-side primitives need to match the RCU
>> update-side primitives.  Adding netdev...
>
> Already fixed by:
>
> commit 0c03eca3d995e73d691edea8c787e25929ec156d
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Tue Aug 7 00:47:11 2012 +0000
>
>     net: fib: fix incorrect call_rcu_bh()
>
>     After IP route cache removal, I believe rcu_bh() has very little use and
>     we should remove this RCU variant, since it adds some cycles in fast
>     path.
>
>     Anyway, the call_rcu_bh() use in fib_trie is obviously wrong, since
>     some users only assert rcu_read_lock().
>

Thanks. I missed this somehow.


* Re: ipv4: crash at leaf_walk_rcu
  2013-07-31 13:13   ` Hannes Frederic Sowa
  2013-07-31 13:31     ` vinayak menon
@ 2013-07-31 14:13     ` Paul E. McKenney
  1 sibling, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2013-07-31 14:13 UTC (permalink / raw)
  To: vinayak menon, linux-kernel, davem, getarunks, netdev

On Wed, Jul 31, 2013 at 03:13:23PM +0200, Hannes Frederic Sowa wrote:
> On Wed, Jul 31, 2013 at 05:55:13AM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
> > > Hi,
> > > 
> > > A crash was seen on 3.4.5 kernel during some random wlan operations.
> > > 
> > > CPU: Single core ARM Cortex A9.
> > > 
> > > fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
> > > which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
> > > object was freed with crash utility.
> > > 
> > > Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
> > > 
> > > As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
> > > returned an invalid tnode. But as I had enabled slab poisoning and the
> > > object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
> > > leaf_walk_rcu and resulted in the crash.
> > > 
> > > fib_route_seq_start takes rcu_read_lock(), but free_leaf calls
> > > call_rcu_bh(). Could this be the problem? Should rcu_read_lock() in
> > > fib_route_seq_start be changed to rcu_read_lock_bh()?
> > 
> > One way or the other, the RCU read-side primitives need to match the RCU
> > update-side primitives.  Adding netdev...
> 
> Already fixed by:
> 
> commit 0c03eca3d995e73d691edea8c787e25929ec156d
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Tue Aug 7 00:47:11 2012 +0000
> 
>     net: fib: fix incorrect call_rcu_bh()
>     
>     After IP route cache removal, I believe rcu_bh() has very little use and
>     we should remove this RCU variant, since it adds some cycles in fast
>     path.
>     
>     Anyway, the call_rcu_bh() use in fib_trie is obviously wrong, since
>     some users only assert rcu_read_lock().

Even better!  ;-)

							Thanx, Paul
