From: BORBELY Zoltan <bozo@andrews.hu>
To: Patrick McHardy <kaber@trash.net>
Cc: Netfilter Development Mailinglist <netfilter-devel@vger.kernel.org>
Subject: Re: crash in death_by_timeout()
Date: Tue, 25 Nov 2008 09:09:03 +0100
Message-ID: <20081125080903.GA3195@zebra.home>
In-Reply-To: <4922C2D0.9060207@trash.net>

Hi,

On Tue, Nov 18, 2008 at 02:27:44PM +0100, Patrick McHardy wrote:
> Could you try whether this patch fixes the problem?
>
> Pablo, do you recall the reason why the lock isn't held in
> ctnetlink_create_conntrack()?

> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 622d7c6..233fdd2 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -305,9 +305,7 @@ void nf_conntrack_hash_insert(struct nf_conn *ct)
>  	hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
>  	repl_hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple);
>  
> -	spin_lock_bh(&nf_conntrack_lock);
>  	__nf_conntrack_hash_insert(ct, hash, repl_hash);
> -	spin_unlock_bh(&nf_conntrack_lock);
>  }
>  EXPORT_SYMBOL_GPL(nf_conntrack_hash_insert);
>  
> diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
> index a040d46..3b009a3 100644
> --- a/net/netfilter/nf_conntrack_netlink.c
> +++ b/net/netfilter/nf_conntrack_netlink.c
> @@ -1090,7 +1090,7 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
>  	struct nf_conn_help *help;
>  	struct nf_conntrack_helper *helper;
>  
> -	ct = nf_conntrack_alloc(&init_net, otuple, rtuple, GFP_KERNEL);
> +	ct = nf_conntrack_alloc(&init_net, otuple, rtuple, GFP_ATOMIC);
>  	if (ct == NULL || IS_ERR(ct))
>  		return -ENOMEM;
>  
> @@ -1212,13 +1212,14 @@ ctnetlink_new_conntrack(struct sock *ctnl, struct sk_buff *skb,
>  			atomic_inc(&master_ct->ct_general.use);
>  		}
>  
> -		spin_unlock_bh(&nf_conntrack_lock);
>  		err = -ENOENT;
>  		if (nlh->nlmsg_flags & NLM_F_CREATE)
>  			err = ctnetlink_create_conntrack(cda,
>  							 &otuple,
>  							 &rtuple,
>  							 master_ct);
> +		spin_unlock_bh(&nf_conntrack_lock);
> +
>  		if (err < 0 && master_ct)
>  			nf_ct_put(master_ct);
>  

We didn't see any kernel crashes during half a day of heavy load (without the
patch the kernel crashed within 3-4 hours every time), but we found a lot of
BUG messages in the log (possibly one for every new entry):

    Nov 24 14:45:43 test kernel: BUG: sleeping function called from invalid context at mm/slab.c:3043
    Nov 24 14:45:43 test kernel: in_atomic():1, irqs_disabled():0
    Nov 24 14:45:43 test kernel: 3 locks held by test/3586:
    Nov 24 14:45:43 test kernel:  #0:  (nfnl_mutex){--..}, at: [<d081500f>] nfnetlink_rcv+0xf/0x30 [nfnetlink]
    Nov 24 14:45:43 test kernel:  #1:  (nf_conntrack_lock){-+..}, at: [<d08c979f>] ctnetlink_new_conntrack+0x7f/0x770 [nf_conntrack_netlink]
    Nov 24 14:45:43 test kernel:  #2:  (rcu_read_lock){..--}, at: [<d08c98ee>] ctnetlink_new_conntrack+0x1ce/0x770 [nf_conntrack_netlink]
    Nov 24 14:45:43 test kernel: Pid: 3586, comm: test Not tainted 2.6.27.6bozotest #1
    Nov 24 14:45:43 test kernel:  [<c027a566>] __kmalloc_track_caller+0x126/0x160
    Nov 24 14:45:43 test kernel:  [<c052a7a5>] __nf_ct_ext_add+0xb5/0x290
    Nov 24 14:45:43 test kernel:  [<c026411d>] __krealloc+0x5d/0x80
    Nov 24 14:45:44 test kernel:  [<c052a7a5>] __nf_ct_ext_add+0xb5/0x290
    Nov 24 14:45:44 test kernel:  [<c052a71d>] __nf_ct_ext_add+0x2d/0x290
    Nov 24 14:45:44 test kernel:  [<d08c9af8>] ctnetlink_new_conntrack+0x3d8/0x770 [nf_conntrack_netlink]
    Nov 24 14:45:44 test kernel:  [<d08c98ee>] ctnetlink_new_conntrack+0x1ce/0x770 [nf_conntrack_netlink]
    Nov 24 14:45:44 test kernel:  [<c0248910>] validate_chain+0x380/0xed0
    Nov 24 14:45:44 test kernel:  [<d0815220>] nfnetlink_rcv_msg+0xf0/0x180 [nfnetlink]
    Nov 24 14:45:44 test kernel:  [<d0815130>] nfnetlink_rcv_msg+0x0/0x180 [nfnetlink]
    Nov 24 14:45:44 test kernel:  [<c0520ebc>] netlink_rcv_skb+0x7c/0xa0
    Nov 24 14:45:44 test kernel:  [<d081501b>] nfnetlink_rcv+0x1b/0x30 [nfnetlink]
    Nov 24 14:45:44 test kernel:  [<c0520c50>] netlink_unicast+0x250/0x280
    Nov 24 14:45:44 test kernel:  [<c052145e>] netlink_sendmsg+0x1ee/0x2c0
    Nov 24 14:45:44 test kernel:  [<c04fad7f>] sock_sendmsg+0xbf/0xf0
    Nov 24 14:45:44 test kernel:  [<c02496e5>] __lock_acquire+0x285/0x9e0
    Nov 24 14:45:44 test kernel:  [<c0239790>] autoremove_wake_function+0x0/0x50
    Nov 24 14:45:44 test kernel:  [<c0248910>] validate_chain+0x380/0xed0
    Nov 24 14:45:44 test kernel:  [<c027ee33>] fget_light+0xd3/0xf0
    Nov 24 14:45:44 test kernel:  [<c031bea8>] copy_from_user+0x38/0x80
    Nov 24 14:45:44 test kernel:  [<c031bea8>] copy_from_user+0x38/0x80
    Nov 24 14:45:44 test kernel:  [<c0502e2a>] verify_iovec+0x2a/0x90
    Nov 24 14:45:44 test kernel:  [<c04faf14>] sys_sendmsg+0x164/0x280
    Nov 24 14:45:44 test kernel:  [<c027ee33>] fget_light+0xd3/0xf0
    Nov 24 14:45:44 test kernel:  [<c031c16a>] copy_to_user+0x3a/0x70
    Nov 24 14:45:44 test kernel:  [<c04fb98f>] move_addr_to_user+0x5f/0x70
    Nov 24 14:45:44 test kernel:  [<c04fbf0d>] sys_getsockname+0xcd/0xd0
    Nov 24 14:45:44 test kernel:  [<c022ad6c>] local_bh_enable_ip+0x7c/0xc0
    Nov 24 14:45:44 test kernel:  [<c0247e64>] trace_hardirqs_on_caller+0xc4/0x140
    Nov 24 14:45:44 test kernel:  [<c022ad6c>] local_bh_enable_ip+0x7c/0xc0
    Nov 24 14:45:44 test kernel:  [<c04fe578>] sock_setsockopt+0x128/0x590
    Nov 24 14:45:44 test kernel:  [<c027edb3>] fget_light+0x53/0xf0
    Nov 24 14:45:44 test kernel:  [<c04fa552>] sockfd_lookup_light+0x32/0x60
    Nov 24 14:45:44 test kernel:  [<c04fc39b>] sys_socketcall+0x25b/0x2b0
    Nov 24 14:45:44 test kernel:  [<c031ba44>] trace_hardirqs_on_thunk+0xc/0x10
    Nov 24 14:45:44 test kernel:  [<c031ba44>] trace_hardirqs_on_thunk+0xc/0x10
    Nov 24 14:45:44 test kernel:  [<c0203029>] sysenter_do_call+0x12/0x35
    Nov 24 14:45:44 test kernel:  =======================
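
Reading the trace above: in_atomic() is 1 and nf_conntrack_lock is held while
__nf_ct_ext_add reaches __kmalloc_track_caller via __krealloc. That suggests an
extension allocation on the create path may still be allowed to sleep, even
though nf_conntrack_alloc() now uses GFP_ATOMIC. The sketch below is only a
userspace toy model of that situation, not kernel code. All toy_* names are
made up for illustration, and the assumption that the extension allocation
uses sleeping flags is inferred from the trace, not confirmed from the source.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy userspace model; none of these names are kernel APIs. */
enum toy_gfp { TOY_GFP_ATOMIC, TOY_GFP_KERNEL };

static int toy_atomic_depth;     /* > 0 while a modelled spinlock is held */
static int toy_sleep_violations; /* counts would-be "sleeping function" BUGs */

static void toy_spin_lock(void)
{
	toy_atomic_depth++;
}

static void toy_spin_unlock(void)
{
	assert(toy_atomic_depth > 0);
	toy_atomic_depth--;
}

/* An allocation that may sleep is a violation if made in atomic context. */
static void *toy_alloc(size_t size, enum toy_gfp flags)
{
	if (flags == TOY_GFP_KERNEL && toy_atomic_depth > 0)
		toy_sleep_violations++;
	return malloc(size);
}

/*
 * Models a create path that runs entirely under the lock: the entry
 * itself is allocated atomically, while the extension uses ext_flags.
 * Returns the number of violations this call triggered.
 */
static int toy_create_entry(enum toy_gfp ext_flags)
{
	int before = toy_sleep_violations;
	void *ct, *ext;

	toy_spin_lock();
	ct = toy_alloc(64, TOY_GFP_ATOMIC);
	ext = toy_alloc(32, ext_flags);
	toy_spin_unlock();

	free(ext);
	free(ct);
	return toy_sleep_violations - before;
}
```

In this model toy_create_entry(TOY_GFP_KERNEL) reports one violation per new
entry, which would match a BUG message for every entry created, while
toy_create_entry(TOY_GFP_ATOMIC) reports none.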

Bye,
Zoltan