netfilter-devel.vger.kernel.org archive mirror
* conntrack, suspicious RCU usage
@ 2012-01-11  9:25 Hans Schillstrom
  2012-01-11 10:01 ` Eric Dumazet
  2012-01-12  2:30 ` Pablo Neira Ayuso
  0 siblings, 2 replies; 9+ messages in thread
From: Hans Schillstrom @ 2012-01-11  9:25 UTC (permalink / raw)
  To: netfilter-devel@vger.kernel.org

Hello,
I got this for the first time when running conntrack -L under heavy traffic.
It hasn't caused anything bad so far.

Is this a known issue,
or should I dig into it?

I'm running the latest and greatest conntrack / netfilter tools and libs.

===============================
[ INFO: suspicious RCU usage. ]
-------------------------------
/home/hans/evip.git/kvm/net-next.git/include/net/netfilter/nf_conntrack_l3proto.h:92 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
3 locks held by conntrack/2249:
 #0:  (nfnl_mutex){+.+.+.}, at: [<ffffffff812cd29f>] nfnl_lock+0x17/0x19
 #1:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff812c7211>] netlink_dump+0x27/0x1ec
 #2:  (nf_conntrack_lock){+.-...}, at: [<ffffffffa00b8922>] 0xffffffffa00b8922

stack backtrace:
Pid: 2249, comm: conntrack Tainted: G        W    3.2.0+ #34
Call Trace:
 [<ffffffff8102ee61>] ? console_unlock+0x164/0x20c
 [<ffffffff81078542>] lockdep_rcu_suspicious+0xd8/0xe1
 [<ffffffffa00b78aa>] 0xffffffffa00b78a9
 [<ffffffffa00b819c>] 0xffffffffa00b819b
 [<ffffffffa00b898f>] 0xffffffffa00b898e
 [<ffffffff812c725e>] netlink_dump+0x74/0x1ec
 [<ffffffffa00b88e4>] ? 0xffffffffa00b88e3
 [<ffffffff812c7a43>] netlink_dump_start+0x103/0x135
 [<ffffffffa00b77fa>] ? 0xffffffffa00b77f9
 [<ffffffffa00b86a8>] 0xffffffffa00b86a7
 [<ffffffff812cd29f>] ? nfnl_lock+0x17/0x19
 [<ffffffff812cd734>] nfnetlink_rcv_msg+0x493/0x4cd
 [<ffffffff812cd3bc>] ? nfnetlink_rcv_msg+0x11b/0x4cd
 [<ffffffff812cd359>] ? nfnetlink_rcv_msg+0xb8/0x4cd
 [<ffffffff812c6c51>] ? netlink_lookup+0xc4/0xcf
 [<ffffffff812cd2a1>] ? nfnl_lock+0x19/0x19
 [<ffffffff812c87d2>] netlink_rcv_skb+0x43/0x94
 [<ffffffff812cd207>] nfnetlink_rcv+0x15/0x17
 [<ffffffff812c853a>] netlink_unicast+0x13d/0x1b4
 [<ffffffff812c8e32>] netlink_sendmsg+0x201/0x269
 [<ffffffff812962ef>] sock_sendmsg+0xea/0x109
 [<ffffffff81077fdf>] ? lock_release_holdtime+0xfd/0x102
 [<ffffffff810e2755>] ? might_fault+0x40/0x90
 [<ffffffff810e2755>] ? might_fault+0x40/0x90
 [<ffffffff810e2755>] ? might_fault+0x40/0x90
 [<ffffffff810e279e>] ? might_fault+0x89/0x90
 [<ffffffff810e2755>] ? might_fault+0x40/0x90
 [<ffffffff812948ec>] ? move_addr_to_kernel+0x3f/0x56
 [<ffffffff81296a65>] sys_sendto+0x102/0x12a
 [<ffffffff810faf10>] ? kmem_cache_free+0xc7/0x1b2
 [<ffffffff81079651>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff813ba612>] system_call_fastpath+0x16/0x1b


-- 
Regards
Hans Schillstrom <hans.schillstrom@ericsson.com>


* Re: conntrack, suspicious RCU usage
  2012-01-11  9:25 conntrack, suspicious RCU usage Hans Schillstrom
@ 2012-01-11 10:01 ` Eric Dumazet
  2012-01-11 13:24   ` Hans Schillstrom
  2012-01-12  2:30 ` Pablo Neira Ayuso
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2012-01-11 10:01 UTC (permalink / raw)
  To: Hans Schillstrom; +Cc: netfilter-devel@vger.kernel.org

On Wednesday 11 January 2012 at 10:25 +0100, Hans Schillstrom wrote:
> Hello
> I got this the first time using conntrack -L when there is a lot of traffic.
> It doesn't result in any thing bad yet.
> 
> Is this a know thing ?
> or should I dig into it..
> 
> I'm running the latest and greatest conntrack / netfilter tools and libs.
> 
> ===============================
> [ INFO: suspicious RCU usage. ]
> -------------------------------
> /home/hans/evip.git/kvm/net-next.git/include/net/netfilter/nf_conntrack_l3proto.h:92 suspicious rcu_dereference_check() usage!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 1, debug_locks = 0
> 3 locks held by conntrack/2249:
>  #0:  (nfnl_mutex){+.+.+.}, at: [<ffffffff812cd29f>] nfnl_lock+0x17/0x19
>  #1:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff812c7211>] netlink_dump+0x27/0x1ec
>  #2:  (nf_conntrack_lock){+.-...}, at: [<ffffffffa00b8922>] 0xffffffffa00b8922
> 
> stack backtrace:
> Pid: 2249, comm: conntrack Tainted: G        W    3.2.0+ #34
> Call Trace:
>  [<ffffffff8102ee61>] ? console_unlock+0x164/0x20c
>  [<ffffffff81078542>] lockdep_rcu_suspicious+0xd8/0xe1
>  [<ffffffffa00b78aa>] 0xffffffffa00b78a9
>  [<ffffffffa00b819c>] 0xffffffffa00b819b
>  [<ffffffffa00b898f>] 0xffffffffa00b898e
>  [<ffffffff812c725e>] netlink_dump+0x74/0x1ec
>  [<ffffffffa00b88e4>] ? 0xffffffffa00b88e3
>  [<ffffffff812c7a43>] netlink_dump_start+0x103/0x135
>  [<ffffffffa00b77fa>] ? 0xffffffffa00b77f9
>  [<ffffffffa00b86a8>] 0xffffffffa00b86a7
>  [<ffffffff812cd29f>] ? nfnl_lock+0x17/0x19
>  [<ffffffff812cd734>] nfnetlink_rcv_msg+0x493/0x4cd
>  [<ffffffff812cd3bc>] ? nfnetlink_rcv_msg+0x11b/0x4cd
>  [<ffffffff812cd359>] ? nfnetlink_rcv_msg+0xb8/0x4cd
>  [<ffffffff812c6c51>] ? netlink_lookup+0xc4/0xcf
>  [<ffffffff812cd2a1>] ? nfnl_lock+0x19/0x19
>  [<ffffffff812c87d2>] netlink_rcv_skb+0x43/0x94
>  [<ffffffff812cd207>] nfnetlink_rcv+0x15/0x17
>  [<ffffffff812c853a>] netlink_unicast+0x13d/0x1b4
>  [<ffffffff812c8e32>] netlink_sendmsg+0x201/0x269
>  [<ffffffff812962ef>] sock_sendmsg+0xea/0x109
>  [<ffffffff81077fdf>] ? lock_release_holdtime+0xfd/0x102
>  [<ffffffff810e2755>] ? might_fault+0x40/0x90
>  [<ffffffff810e2755>] ? might_fault+0x40/0x90
>  [<ffffffff810e2755>] ? might_fault+0x40/0x90
>  [<ffffffff810e279e>] ? might_fault+0x89/0x90
>  [<ffffffff810e2755>] ? might_fault+0x40/0x90
>  [<ffffffff812948ec>] ? move_addr_to_kernel+0x3f/0x56
>  [<ffffffff81296a65>] sys_sendto+0x102/0x12a
>  [<ffffffff810faf10>] ? kmem_cache_free+0xc7/0x1b2
>  [<ffffffff81079651>] ? trace_hardirqs_on+0xd/0xf
>  [<ffffffff813ba612>] system_call_fastpath+0x16/0x1b
> 
> 

Hmm, we either need to take rcu_read_lock() while calling
__nf_ct_l3proto_find(), or define a variant using
rcu_dereference_protected() in the places where we hold nf_conntrack_lock.
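
As a rough userspace illustration of the two options above (a sketch, not
kernel code): the model_* helpers, the counters and l3proto_entry are all
hypothetical stand-ins. The real rcu_dereference_protected(p, c) takes a
lockdep condition such as lockdep_is_held(&nf_conntrack_lock), and it is only
appropriate when that lock really excludes updaters of the pointer, which
this thread does not establish for the protocol tables.

```c
/* Userspace model for illustration; none of this is kernel code. */
#include <assert.h>

static int rcu_depth;   /* >0 while inside a modeled read-side section */
static int lock_held;   /* 1 while the modeled conntrack lock is held  */
static int splats;      /* how often the modeled check complained      */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Models rcu_dereference(): complains unless a read-side section is open. */
static const int *model_rcu_dereference(const int *p)
{
	if (rcu_depth == 0)
		splats++;
	return p;
}

/* Models rcu_dereference_protected(p, c): complains unless c is true. */
static const int *model_rcu_dereference_protected(const int *p, int c)
{
	if (!c)
		splats++;
	return p;
}

static const int l3proto_entry = 2;   /* hypothetical protected object */

/* Shape of the report: lock held, but no read-side section open. */
static int dump_unannotated(void)
{
	int v;

	lock_held = 1;
	v = *model_rcu_dereference(&l3proto_entry);
	lock_held = 0;
	return v;
}

/* Option 1: open a read-side section around the lookup and its use. */
static int dump_with_read_lock(void)
{
	int v;

	lock_held = 1;
	rcu_read_lock();
	v = *model_rcu_dereference(&l3proto_entry);
	rcu_read_unlock();
	lock_held = 0;
	return v;
}

/* Option 2: tell the checker which lock justifies the access. */
static int dump_with_protected_variant(void)
{
	int v;

	lock_held = 1;
	v = *model_rcu_dereference_protected(&l3proto_entry, lock_held);
	lock_held = 0;
	return v;
}
```

In this model only dump_unannotated() increments splats; both alternatives
satisfy the check, each for a different reason.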




--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: conntrack, suspicious RCU usage
  2012-01-11 10:01 ` Eric Dumazet
@ 2012-01-11 13:24   ` Hans Schillstrom
  2012-01-11 13:33     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Hans Schillstrom @ 2012-01-11 13:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netfilter-devel@vger.kernel.org

On Wednesday 11 January 2012 11:01:51 Eric Dumazet wrote:
> On Wednesday 11 January 2012 at 10:25 +0100, Hans Schillstrom wrote:
> > Hello
> > I got this the first time using conntrack -L when there is a lot of traffic.
> > It doesn't result in any thing bad yet.
> > 
> > Is this a know thing ?
> > or should I dig into it..
> > 
> > I'm running the latest and greatest conntrack / netfilter tools and libs.
> > 
> > ===============================
> > [ INFO: suspicious RCU usage. ]
> > -------------------------------
> > /home/hans/evip.git/kvm/net-next.git/include/net/netfilter/nf_conntrack_l3proto.h:92 suspicious rcu_dereference_check() usage!
> > 
> > other info that might help us debug this:
> > 
> > 
> > rcu_scheduler_active = 1, debug_locks = 0
> > 3 locks held by conntrack/2249:
> >  #0:  (nfnl_mutex){+.+.+.}, at: [<ffffffff812cd29f>] nfnl_lock+0x17/0x19
> >  #1:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff812c7211>] netlink_dump+0x27/0x1ec
> >  #2:  (nf_conntrack_lock){+.-...}, at: [<ffffffffa00b8922>] 0xffffffffa00b8922
> > 
> > stack backtrace:
> > Pid: 2249, comm: conntrack Tainted: G        W    3.2.0+ #34
> > Call Trace:
> >  [<ffffffff8102ee61>] ? console_unlock+0x164/0x20c
> >  [<ffffffff81078542>] lockdep_rcu_suspicious+0xd8/0xe1
> >  [<ffffffffa00b78aa>] 0xffffffffa00b78a9
> >  [<ffffffffa00b819c>] 0xffffffffa00b819b
> >  [<ffffffffa00b898f>] 0xffffffffa00b898e
> >  [<ffffffff812c725e>] netlink_dump+0x74/0x1ec
> >  [<ffffffffa00b88e4>] ? 0xffffffffa00b88e3
> >  [<ffffffff812c7a43>] netlink_dump_start+0x103/0x135
> >  [<ffffffffa00b77fa>] ? 0xffffffffa00b77f9
> >  [<ffffffffa00b86a8>] 0xffffffffa00b86a7
> >  [<ffffffff812cd29f>] ? nfnl_lock+0x17/0x19
> >  [<ffffffff812cd734>] nfnetlink_rcv_msg+0x493/0x4cd
> >  [<ffffffff812cd3bc>] ? nfnetlink_rcv_msg+0x11b/0x4cd
> >  [<ffffffff812cd359>] ? nfnetlink_rcv_msg+0xb8/0x4cd
> >  [<ffffffff812c6c51>] ? netlink_lookup+0xc4/0xcf
> >  [<ffffffff812cd2a1>] ? nfnl_lock+0x19/0x19
> >  [<ffffffff812c87d2>] netlink_rcv_skb+0x43/0x94
> >  [<ffffffff812cd207>] nfnetlink_rcv+0x15/0x17
> >  [<ffffffff812c853a>] netlink_unicast+0x13d/0x1b4
> >  [<ffffffff812c8e32>] netlink_sendmsg+0x201/0x269
> >  [<ffffffff812962ef>] sock_sendmsg+0xea/0x109
> >  [<ffffffff81077fdf>] ? lock_release_holdtime+0xfd/0x102
> >  [<ffffffff810e2755>] ? might_fault+0x40/0x90
> >  [<ffffffff810e2755>] ? might_fault+0x40/0x90
> >  [<ffffffff810e2755>] ? might_fault+0x40/0x90
> >  [<ffffffff810e279e>] ? might_fault+0x89/0x90
> >  [<ffffffff810e2755>] ? might_fault+0x40/0x90
> >  [<ffffffff812948ec>] ? move_addr_to_kernel+0x3f/0x56
> >  [<ffffffff81296a65>] sys_sendto+0x102/0x12a
> >  [<ffffffff810faf10>] ? kmem_cache_free+0xc7/0x1b2
> >  [<ffffffff81079651>] ? trace_hardirqs_on+0xd/0xf
> >  [<ffffffff813ba612>] system_call_fastpath+0x16/0x1b
> > 
> > 
> 
> Hmm, we either need to take rcu_read_lock() while calling
> __nf_ct_l3proto_find(), or define a variant using
> rcu_dereference_protected() in places we hold nf_conntrack_lock
> 
I made a quick test with lock/unlock calls in
__nf_ct_l3proto_find() and __nf_ct_l4proto_find():

	rcu_read_lock();
...
	rcu_read_unlock();
	return retp;

It seems to help: I can't see the dump anymore, and everything else that I run still works.


-- 
Regards
Hans Schillstrom <hans.schillstrom@ericsson.com>

* Re: conntrack, suspicious RCU usage
  2012-01-11 13:24   ` Hans Schillstrom
@ 2012-01-11 13:33     ` Eric Dumazet
  2012-01-11 13:44       ` Hans Schillstrom
  2012-01-11 14:56       ` Eric Dumazet
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-01-11 13:33 UTC (permalink / raw)
  To: Hans Schillstrom; +Cc: netfilter-devel@vger.kernel.org

On Wednesday 11 January 2012 at 14:24 +0100, Hans Schillstrom wrote:
> On Wednesday 11 January 2012 11:01:51 Eric Dumazet wrote:

> > Hmm, we either need to take rcu_read_lock() while calling
> > __nf_ct_l3proto_find(), or define a variant using
> > rcu_dereference_protected() in places we hold nf_conntrack_lock
> > 
> I made a qick test with locks /unlocks in
> __nf_ct_l3proto_find() and __nf_ct_l4proto_find()
> 
> 	rcu_read_lock();
> ...
> 	rcu_read_unlock();
> 	return retp;
> 
> It seems to help, I cant see the dump anymore and everything else that I run works ...
> 
> 

You can't do that, it's just a brown paper bag :)

If "retp" is returned, then the caller must handle the rcu_read_unlock()
itself, after all possible "retp" dereferences.
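
To make the lifetime point concrete, here is a minimal userspace model (not
kernel code). struct proto, proto_table, the counters and the dump_* functions
are hypothetical names invented for the sketch; the stub rcu_read_lock() and
rcu_read_unlock() only count nesting so that an access outside a read-side
section can be detected.

```c
/* Userspace model for illustration; none of this is kernel code. */
#include <assert.h>
#include <stddef.h>

static int rcu_depth;          /* models read-side nesting; not real RCU */
static int unprotected_uses;   /* accesses made outside any section      */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

struct proto { int l3num; };   /* hypothetical stand-in for an l3proto entry */

static struct proto ipv4 = { 2 };
static struct proto *proto_table[4] = { NULL, NULL, &ipv4, NULL };

/* Any use of a protected object: note it if no section is open. */
static int proto_l3num(const struct proto *p)
{
	if (rcu_depth == 0)
		unprotected_uses++;
	return p->l3num;
}

/* Lookup that expects the caller to hold the read-side section. */
static struct proto *proto_find(int l3num)
{
	if (rcu_depth == 0)
		unprotected_uses++;
	return proto_table[l3num];
}

/* Quick-test pattern: the section closes before the pointer is used. */
static int dump_lock_inside_lookup(int l3num)
{
	struct proto *p;

	rcu_read_lock();
	p = proto_find(l3num);
	rcu_read_unlock();
	return proto_l3num(p);     /* use after the section has ended */
}

/* Caller-side pattern: lookup and every dereference share one section. */
static int dump_lock_in_caller(int l3num)
{
	int ret;

	rcu_read_lock();
	ret = proto_l3num(proto_find(l3num));
	rcu_read_unlock();
	return ret;
}
```

Both functions return the same value here, which is why the quick test looked
fine; only the second keeps the pointer inside the section for its whole use.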

But really, adding rcu_read_lock() should not be necessary on paths where
we hold the conntrack lock. We should use rcu_dereference_protected()
instead.

I'll send a patch.




* Re: conntrack, suspicious RCU usage
  2012-01-11 13:33     ` Eric Dumazet
@ 2012-01-11 13:44       ` Hans Schillstrom
  2012-01-11 14:56       ` Eric Dumazet
  1 sibling, 0 replies; 9+ messages in thread
From: Hans Schillstrom @ 2012-01-11 13:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netfilter-devel@vger.kernel.org

On Wednesday 11 January 2012 14:33:55 Eric Dumazet wrote:
> On Wednesday 11 January 2012 at 14:24 +0100, Hans Schillstrom wrote:
> > On Wednesday 11 January 2012 11:01:51 Eric Dumazet wrote:
> 
> > > Hmm, we either need to take rcu_read_lock() while calling
> > > __nf_ct_l3proto_find(), or define a variant using
> > > rcu_dereference_protected() in places we hold nf_conntrack_lock
> > > 
> > I made a qick test with locks /unlocks in
> > __nf_ct_l3proto_find() and __nf_ct_l4proto_find()
> > 
> > 	rcu_read_lock();
> > ...
> > 	rcu_read_unlock();
> > 	return retp;
> > 
> > It seems to help, I cant see the dump anymore and everything else that I run works ...
> > 
> > 
> 
> You cant do that, its just a brown paper bag :)
> 
OK it didn't feel right ...

> If "retp" is returned, then the caller must handle the rcu_read_unlock()
> itself, after all possible "retp" dereferences.
> 
> But really adding rcu_read_lock() should not be necessary on paths we
> own the conntrack lock. We should use rcu_dereference_protected()
> instead.
> 
> I'll send a patch.

Thanks
Hans

* Re: conntrack, suspicious RCU usage
  2012-01-11 13:33     ` Eric Dumazet
  2012-01-11 13:44       ` Hans Schillstrom
@ 2012-01-11 14:56       ` Eric Dumazet
  2012-01-12  2:35         ` Pablo Neira Ayuso
  2012-01-12  8:15         ` Hans Schillstrom
  1 sibling, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-01-11 14:56 UTC (permalink / raw)
  To: Hans Schillstrom; +Cc: netfilter-devel@vger.kernel.org, netdev

On Wednesday 11 January 2012 at 14:33 +0100, Eric Dumazet wrote:
> On Wednesday 11 January 2012 at 14:24 +0100, Hans Schillstrom wrote:
> > On Wednesday 11 January 2012 11:01:51 Eric Dumazet wrote:
> 
> > > Hmm, we either need to take rcu_read_lock() while calling
> > > __nf_ct_l3proto_find(), or define a variant using
> > > rcu_dereference_protected() in places we hold nf_conntrack_lock
> > > 
> > I made a qick test with locks /unlocks in
> > __nf_ct_l3proto_find() and __nf_ct_l4proto_find()
> > 
> > 	rcu_read_lock();
> > ...
> > 	rcu_read_unlock();
> > 	return retp;
> > 
> > It seems to help, I cant see the dump anymore and everything else that I run works ...
> > 
> > 
> 
> You cant do that, its just a brown paper bag :)
> 
> If "retp" is returned, then the caller must handle the rcu_read_unlock()
> itself, after all possible "retp" dereferences.
> 
> But really adding rcu_read_lock() should not be necessary on paths we
> own the conntrack lock. We should use rcu_dereference_protected()
> instead.
> 

Well, given that __nf_ct_l4proto_find() is out of line, and given the way
we already use rcu_read_lock() in this code, the following patch seems
the most natural way to cope with these lockdep warnings.

Thanks

[PATCH] netfilter: ctnetlink: fix lockdep splats

net/netfilter/nf_conntrack_proto.c:70 suspicious rcu_dereference_check() usage!
 
other info that might help us debug this:
 
 
rcu_scheduler_active = 1, debug_locks = 0
3 locks held by conntrack/3235:
 #0:  (nfnl_mutex){+.+.+.}, at: [<ffffffff81603537>] nfnl_lock+0x17/0x20
 #1:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff815fbd72>] netlink_dump+0x32/0x240
 #2:  (nf_conntrack_lock){+.-...}, at: [<ffffffffa0115d2e>] ctnetlink_dump_table+0x3e/0x170 [nf_conntrack_netlink]
 
stack backtrace:
Pid: 3235, comm: conntrack Tainted: G        W    3.2.0+ #511
Call Trace:
 [<ffffffff8108ce45>] lockdep_rcu_suspicious+0xe5/0x100
 [<ffffffffa00ec6e1>] __nf_ct_l4proto_find+0x81/0xb0 [nf_conntrack]
 [<ffffffffa0115675>] ctnetlink_fill_info+0x215/0x5f0 [nf_conntrack_netlink]
 [<ffffffffa0115dc1>] ctnetlink_dump_table+0xd1/0x170 [nf_conntrack_netlink]
 [<ffffffff815fbdbf>] netlink_dump+0x7f/0x240
 [<ffffffff81090f9d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff815fd34f>] netlink_dump_start+0xdf/0x190
 [<ffffffffa0111490>] ? ctnetlink_change_nat_seq_adj+0x160/0x160 [nf_conntrack_netlink]
 [<ffffffffa0115cf0>] ? ctnetlink_get_conntrack+0x2a0/0x2a0 [nf_conntrack_netlink]
 [<ffffffffa0115ad9>] ctnetlink_get_conntrack+0x89/0x2a0 [nf_conntrack_netlink]
 [<ffffffff81603a47>] nfnetlink_rcv_msg+0x467/0x5f0
 [<ffffffff81603a7c>] ? nfnetlink_rcv_msg+0x49c/0x5f0
 [<ffffffff81603922>] ? nfnetlink_rcv_msg+0x342/0x5f0
 [<ffffffff81071b21>] ? get_parent_ip+0x11/0x50
 [<ffffffff816035e0>] ? nfnetlink_subsys_register+0x60/0x60
 [<ffffffff815fed49>] netlink_rcv_skb+0xa9/0xd0
 [<ffffffff81603475>] nfnetlink_rcv+0x15/0x20
 [<ffffffff815fe70e>] netlink_unicast+0x1ae/0x1f0
 [<ffffffff815fea16>] netlink_sendmsg+0x2c6/0x320
 [<ffffffff815b2a87>] sock_sendmsg+0x117/0x130
 [<ffffffff81125093>] ? might_fault+0x53/0xb0
 [<ffffffff811250dc>] ? might_fault+0x9c/0xb0
 [<ffffffff81125093>] ? might_fault+0x53/0xb0
 [<ffffffff815b5991>] ? move_addr_to_kernel+0x71/0x80
 [<ffffffff815b644e>] sys_sendto+0xfe/0x130
 [<ffffffff815b5c94>] ? sys_bind+0xb4/0xd0
 [<ffffffff817a8a0e>] ? retint_swapgs+0xe/0x13
 [<ffffffff817afcd2>] system_call_fastpath+0x16/0x1b


Reported-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/netfilter/nf_conntrack_netlink.c |   31 ++++++++++++++-----------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index e07dc3a..14840d9 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -110,15 +110,15 @@ ctnetlink_dump_tuples(struct sk_buff *skb,
 	struct nf_conntrack_l3proto *l3proto;
 	struct nf_conntrack_l4proto *l4proto;
 
+	rcu_read_lock();
 	l3proto = __nf_ct_l3proto_find(tuple->src.l3num);
 	ret = ctnetlink_dump_tuples_ip(skb, tuple, l3proto);
 
-	if (unlikely(ret < 0))
-		return ret;
-
-	l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
-	ret = ctnetlink_dump_tuples_proto(skb, tuple, l4proto);
-
+	if (ret >= 0) {
+		l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
+		ret = ctnetlink_dump_tuples_proto(skb, tuple, l4proto);
+	}
+	rcu_read_unlock();
 	return ret;
 }
 
@@ -703,6 +703,7 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 	struct hlist_nulls_node *n;
 	struct nfgenmsg *nfmsg = nlmsg_data(cb->nlh);
 	u_int8_t l3proto = nfmsg->nfgen_family;
+	int res;
 
 	spin_lock_bh(&nf_conntrack_lock);
 	last = (struct nf_conn *)cb->args[1];
@@ -723,11 +724,14 @@ restart:
 					continue;
 				cb->args[1] = 0;
 			}
-			if (ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).pid,
+			rcu_read_lock();
+			res = ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).pid,
 						cb->nlh->nlmsg_seq,
 						NFNL_MSG_TYPE(
 							cb->nlh->nlmsg_type),
-						ct) < 0) {
+						ct);
+			rcu_read_unlock();
+			if (res < 0) {
 				nf_conntrack_get(&ct->ct_general);
 				cb->args[1] = (unsigned long)ct;
 				goto out;
@@ -1626,17 +1630,18 @@ ctnetlink_exp_dump_mask(struct sk_buff *skb,
 	if (!nest_parms)
 		goto nla_put_failure;
 
+	rcu_read_lock();
 	l3proto = __nf_ct_l3proto_find(tuple->src.l3num);
 	ret = ctnetlink_dump_tuples_ip(skb, &m, l3proto);
+	if (ret >= 0) {
+		l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
+		ret = ctnetlink_dump_tuples_proto(skb, &m, l4proto);
+	}
+	rcu_read_unlock();
 
 	if (unlikely(ret < 0))
 		goto nla_put_failure;
 
-	l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
-	ret = ctnetlink_dump_tuples_proto(skb, &m, l4proto);
-	if (unlikely(ret < 0))
-		goto nla_put_failure;
-
 	nla_nest_end(skb, nest_parms);
 
 	return 0;



* Re: conntrack, suspicious RCU usage
  2012-01-11  9:25 conntrack, suspicious RCU usage Hans Schillstrom
  2012-01-11 10:01 ` Eric Dumazet
@ 2012-01-12  2:30 ` Pablo Neira Ayuso
  1 sibling, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2012-01-12  2:30 UTC (permalink / raw)
  To: Hans Schillstrom; +Cc: netfilter-devel@vger.kernel.org

On Wed, Jan 11, 2012 at 10:25:50AM +0100, Hans Schillstrom wrote:
> Hello
> I got this the first time using conntrack -L when there is a lot of traffic.
> It doesn't result in any thing bad yet.
> 
> Is this a know thing ?

No, you have been the first to spot this.

> or should I dig into it..
> 
> I'm running the latest and greatest conntrack / netfilter tools and libs.
> 
> ===============================
> [ INFO: suspicious RCU usage. ]
> -------------------------------
> /home/hans/evip.git/kvm/net-next.git/include/net/netfilter/nf_conntrack_l3proto.h:92 suspicious rcu_dereference_check() usage!

We used RCU for the table dump some time ago, but we had to replace
RCU with spinlocks.

ctnetlink_dump_tuples is used in both RCU context and spinlock
context; this seems to be the problem.
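
A small userspace model (not kernel code) of a helper reached from both kinds
of callers. It relies on two properties of real RCU: read-side sections may
nest, and rcu_read_lock() may be taken while a spinlock is held. Everything
named here (the counters, lookup_proto, dump_tuples, the two callers) is a
hypothetical stand-in for the sketch.

```c
/* Userspace model for illustration; none of this is kernel code. */
#include <assert.h>

static int rcu_depth;       /* current modeled read-side nesting depth */
static int max_depth;       /* deepest nesting seen so far             */
static int spinlock_held;   /* 1 while the modeled spinlock is held    */
static int splats;          /* lookups made outside any section        */

static void rcu_read_lock(void)
{
	rcu_depth++;
	if (rcu_depth > max_depth)
		max_depth = rcu_depth;
}

static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Models a lookup that must run inside a read-side section. */
static int lookup_proto(int num)
{
	if (rcu_depth == 0)
		splats++;
	return num;
}

/* Shared helper: opens its own section, so it does not care who calls it. */
static int dump_tuples(int num)
{
	int ret;

	rcu_read_lock();
	ret = lookup_proto(num);
	rcu_read_unlock();
	return ret;
}

/* Caller that is already inside a read-side section (nesting). */
static int caller_in_rcu_section(int num)
{
	int ret;

	rcu_read_lock();
	ret = dump_tuples(num);
	rcu_read_unlock();
	return ret;
}

/* Caller that holds only the modeled spinlock. */
static int caller_under_spinlock(int num)
{
	int ret;

	spinlock_held = 1;
	ret = dump_tuples(num);
	spinlock_held = 0;
	return ret;
}
```

In the model neither caller triggers a complaint, and the nesting depth
returns to zero after each call.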

It seems I didn't enable RCU read-lock verification in my kernels.
I'll enable it to catch this sort of problem.

thanks for the report.


* Re: conntrack, suspicious RCU usage
  2012-01-11 14:56       ` Eric Dumazet
@ 2012-01-12  2:35         ` Pablo Neira Ayuso
  2012-01-12  8:15         ` Hans Schillstrom
  1 sibling, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2012-01-12  2:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Hans Schillstrom, netfilter-devel@vger.kernel.org, netdev

On Wed, Jan 11, 2012 at 03:56:52PM +0100, Eric Dumazet wrote:
> On Wednesday 11 January 2012 at 14:33 +0100, Eric Dumazet wrote:
> > On Wednesday 11 January 2012 at 14:24 +0100, Hans Schillstrom wrote:
> > > On Wednesday 11 January 2012 11:01:51 Eric Dumazet wrote:
> > 
> > > > Hmm, we either need to take rcu_read_lock() while calling
> > > > __nf_ct_l3proto_find(), or define a variant using
> > > > rcu_dereference_protected() in places we hold nf_conntrack_lock
> > > > 
> > > I made a qick test with locks /unlocks in
> > > __nf_ct_l3proto_find() and __nf_ct_l4proto_find()
> > > 
> > > 	rcu_read_lock();
> > > ...
> > > 	rcu_read_unlock();
> > > 	return retp;
> > > 
> > > It seems to help, I cant see the dump anymore and everything else that I run works ...
> > > 
> > > 
> > 
> > You cant do that, its just a brown paper bag :)
> > 
> > If "retp" is returned, then the caller must handle the rcu_read_unlock()
> > itself, after all possible "retp" dereferences.
> > 
> > But really adding rcu_read_lock() should not be necessary on paths we
> > own the conntrack lock. We should use rcu_dereference_protected()
> > instead.
> > 
> 
> Well, __nf_ct_l4proto_find() being out of line and the way we already
> use rcu_read_lock() in this code, it seems following patch is
> the most natural way to cope with these lockdep warnings.
> 
> Thanks
> 
> [PATCH] netfilter: ctnetlink: fix lockep splats

Thanks Eric. I'll pass this to davem.


* Re: conntrack, suspicious RCU usage
  2012-01-11 14:56       ` Eric Dumazet
  2012-01-12  2:35         ` Pablo Neira Ayuso
@ 2012-01-12  8:15         ` Hans Schillstrom
  1 sibling, 0 replies; 9+ messages in thread
From: Hans Schillstrom @ 2012-01-12  8:15 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netfilter-devel@vger.kernel.org, netdev

On Wednesday 11 January 2012 15:56:52 Eric Dumazet wrote:
> On Wednesday 11 January 2012 at 14:33 +0100, Eric Dumazet wrote:
> > On Wednesday 11 January 2012 at 14:24 +0100, Hans Schillstrom wrote:
> > > On Wednesday 11 January 2012 11:01:51 Eric Dumazet wrote:
> > 
> > > > Hmm, we either need to take rcu_read_lock() while calling
> > > > __nf_ct_l3proto_find(), or define a variant using
> > > > rcu_dereference_protected() in places we hold nf_conntrack_lock
> > > > 
> > > I made a qick test with locks /unlocks in
> > > __nf_ct_l3proto_find() and __nf_ct_l4proto_find()
> > > 
> > > 	rcu_read_lock();
> > > ...
> > > 	rcu_read_unlock();
> > > 	return retp;
> > > 
> > > It seems to help, I cant see the dump anymore and everything else that I run works ...
> > > 
> > > 
> > 
> > You cant do that, its just a brown paper bag :)
> > 
> > If "retp" is returned, then the caller must handle the rcu_read_unlock()
> > itself, after all possible "retp" dereferences.
> > 
> > But really adding rcu_read_lock() should not be necessary on paths we
> > own the conntrack lock. We should use rcu_dereference_protected()
> > instead.
> > 
> 
> Well, __nf_ct_l4proto_find() being out of line and the way we already
> use rcu_read_lock() in this code, it seems following patch is
> the most natural way to cope with these lockdep warnings.
> 
> Thanks
> 
> [PATCH] netfilter: ctnetlink: fix lockep splats
> 
> net/netfilter/nf_conntrack_proto.c:70 suspicious rcu_dereference_check() usage!
>  
> other info that might help us debug this:
>  
>  
> rcu_scheduler_active = 1, debug_locks = 0
> 3 locks held by conntrack/3235:
>  #0:  (nfnl_mutex){+.+.+.}, at: [<ffffffff81603537>] nfnl_lock+0x17/0x20
>  #1:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff815fbd72>] netlink_dump+0x32/0x240
>  #2:  (nf_conntrack_lock){+.-...}, at: [<ffffffffa0115d2e>] ctnetlink_dump_table+0x3e/0x170 [nf_conntrack_netlink]
>  
> stack backtrace:
> Pid: 3235, comm: conntrack Tainted: G        W    3.2.0+ #511
> Call Trace:
>  [<ffffffff8108ce45>] lockdep_rcu_suspicious+0xe5/0x100
>  [<ffffffffa00ec6e1>] __nf_ct_l4proto_find+0x81/0xb0 [nf_conntrack]
>  [<ffffffffa0115675>] ctnetlink_fill_info+0x215/0x5f0 [nf_conntrack_netlink]
>  [<ffffffffa0115dc1>] ctnetlink_dump_table+0xd1/0x170 [nf_conntrack_netlink]
>  [<ffffffff815fbdbf>] netlink_dump+0x7f/0x240
>  [<ffffffff81090f9d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff815fd34f>] netlink_dump_start+0xdf/0x190
>  [<ffffffffa0111490>] ? ctnetlink_change_nat_seq_adj+0x160/0x160 [nf_conntrack_netlink]
>  [<ffffffffa0115cf0>] ? ctnetlink_get_conntrack+0x2a0/0x2a0 [nf_conntrack_netlink]
>  [<ffffffffa0115ad9>] ctnetlink_get_conntrack+0x89/0x2a0 [nf_conntrack_netlink]
>  [<ffffffff81603a47>] nfnetlink_rcv_msg+0x467/0x5f0
>  [<ffffffff81603a7c>] ? nfnetlink_rcv_msg+0x49c/0x5f0
>  [<ffffffff81603922>] ? nfnetlink_rcv_msg+0x342/0x5f0
>  [<ffffffff81071b21>] ? get_parent_ip+0x11/0x50
>  [<ffffffff816035e0>] ? nfnetlink_subsys_register+0x60/0x60
>  [<ffffffff815fed49>] netlink_rcv_skb+0xa9/0xd0
>  [<ffffffff81603475>] nfnetlink_rcv+0x15/0x20
>  [<ffffffff815fe70e>] netlink_unicast+0x1ae/0x1f0
>  [<ffffffff815fea16>] netlink_sendmsg+0x2c6/0x320
>  [<ffffffff815b2a87>] sock_sendmsg+0x117/0x130
>  [<ffffffff81125093>] ? might_fault+0x53/0xb0
>  [<ffffffff811250dc>] ? might_fault+0x9c/0xb0
>  [<ffffffff81125093>] ? might_fault+0x53/0xb0
>  [<ffffffff815b5991>] ? move_addr_to_kernel+0x71/0x80
>  [<ffffffff815b644e>] sys_sendto+0xfe/0x130
>  [<ffffffff815b5c94>] ? sys_bind+0xb4/0xd0
>  [<ffffffff817a8a0e>] ? retint_swapgs+0xe/0x13
>  [<ffffffff817afcd2>] system_call_fastpath+0x16/0x1b
> 
> 
> Reported-by: Hans Schillstrom <hans.schillstrom@ericsson.com>

Tested-by: Hans Schillstrom <hans.schillstrom@ericsson.com>

> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  net/netfilter/nf_conntrack_netlink.c |   31 ++++++++++++++-----------
>  1 file changed, 18 insertions(+), 13 deletions(-)
> 
> diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
> index e07dc3a..14840d9 100644
> --- a/net/netfilter/nf_conntrack_netlink.c
> +++ b/net/netfilter/nf_conntrack_netlink.c
> @@ -110,15 +110,15 @@ ctnetlink_dump_tuples(struct sk_buff *skb,
>  	struct nf_conntrack_l3proto *l3proto;
>  	struct nf_conntrack_l4proto *l4proto;
>  
> +	rcu_read_lock();
>  	l3proto = __nf_ct_l3proto_find(tuple->src.l3num);
>  	ret = ctnetlink_dump_tuples_ip(skb, tuple, l3proto);
>  
> -	if (unlikely(ret < 0))
> -		return ret;
> -
> -	l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
> -	ret = ctnetlink_dump_tuples_proto(skb, tuple, l4proto);
> -
> +	if (ret >= 0) {
> +		l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
> +		ret = ctnetlink_dump_tuples_proto(skb, tuple, l4proto);
> +	}
> +	rcu_read_unlock();
>  	return ret;
>  }
>  
> @@ -703,6 +703,7 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
>  	struct hlist_nulls_node *n;
>  	struct nfgenmsg *nfmsg = nlmsg_data(cb->nlh);
>  	u_int8_t l3proto = nfmsg->nfgen_family;
> +	int res;
>  
>  	spin_lock_bh(&nf_conntrack_lock);
>  	last = (struct nf_conn *)cb->args[1];
> @@ -723,11 +724,14 @@ restart:
>  					continue;
>  				cb->args[1] = 0;
>  			}
> -			if (ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).pid,
> +			rcu_read_lock();
> +			res = ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).pid,
>  						cb->nlh->nlmsg_seq,
>  						NFNL_MSG_TYPE(
>  							cb->nlh->nlmsg_type),
> -						ct) < 0) {
> +						ct);
> +			rcu_read_unlock();
> +			if (res < 0) {
>  				nf_conntrack_get(&ct->ct_general);
>  				cb->args[1] = (unsigned long)ct;
>  				goto out;
> @@ -1626,17 +1630,18 @@ ctnetlink_exp_dump_mask(struct sk_buff *skb,
>  	if (!nest_parms)
>  		goto nla_put_failure;
>  
> +	rcu_read_lock();
>  	l3proto = __nf_ct_l3proto_find(tuple->src.l3num);
>  	ret = ctnetlink_dump_tuples_ip(skb, &m, l3proto);
> +	if (ret >= 0) {
> +		l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
> +		ret = ctnetlink_dump_tuples_proto(skb, &m, l4proto);
> +	}
> +	rcu_read_unlock();
>  
>  	if (unlikely(ret < 0))
>  		goto nla_put_failure;
>  
> -	l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
> -	ret = ctnetlink_dump_tuples_proto(skb, &m, l4proto);
> -	if (unlikely(ret < 0))
> -		goto nla_put_failure;
> -
>  	nla_nest_end(skb, nest_parms);
>  
>  	return 0;
> 
> 
> --
Thanks, Eric.
It works; I can't reproduce the fault any more.

Tested-by: Hans Schillstrom <hans.schillstrom@ericsson.com>