From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: netfilter-devel@vger.kernel.org, Zambo Marcell <marcell.zambo@gmail.com>
Subject: Re: [PATCH 1/1] netfilter: fix soft lockup when netlink adds new entries
Date: Fri, 24 Feb 2012 12:18:00 +0100 [thread overview]
Message-ID: <20120224111800.GA9669@1984> (raw)
In-Reply-To: <alpine.DEB.2.00.1202241143001.14577@blackhole.kfki.hu>
On Fri, Feb 24, 2012 at 11:45:49AM +0100, Jozsef Kadlecsik wrote:
> On Fri, 24 Feb 2012, Jozsef Kadlecsik wrote:
>
> > OK, and I remove nf_conntrack_hash_insert then, and use a single ct
> > argument because as you noted net and zone can be extracted from the ct.
>
> Here comes the second version of the patch, which assumes that the
> previous one is reverted.
>
> Best regards,
> Jozsef
>
> Marcell Zambo and Janos Farago noticed and reported that when
> new conntrack entries are added via netlink and the conntrack table
> gets full, soft lockup happens. This is because the nf_conntrack_lock
> is held while nf_conntrack_alloc is called, which is in turn wants
> to lock nf_conntrack_lock while evicting entries from the full table.
>
> The patch fixes the soft lockup with limiting the holding of the
> nf_conntrack_lock to the minimum, where it's absolutely required.
> It required to extend (and thus change) nf_conntrack_hash_insert
> so that it makes sure conntrack and ctnetlink do not add the same entry
> twice to the conntrack table.
>
> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
> ---
> include/net/netfilter/nf_conntrack.h | 2 +-
> net/netfilter/nf_conntrack_core.c | 38 ++++++++++++++++++++--
> net/netfilter/nf_conntrack_netlink.c | 58 +++++++++++++--------------------
> 3 files changed, 58 insertions(+), 40 deletions(-)
>
> diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
> index 8a2b0ae..ab86036 100644
> --- a/include/net/netfilter/nf_conntrack.h
> +++ b/include/net/netfilter/nf_conntrack.h
> @@ -209,7 +209,7 @@ extern struct nf_conntrack_tuple_hash *
> __nf_conntrack_find(struct net *net, u16 zone,
> const struct nf_conntrack_tuple *tuple);
>
> -extern void nf_conntrack_hash_insert(struct nf_conn *ct);
> +extern int nf_conntrack_hash_check_insert(struct nf_conn *ct);
> extern void nf_ct_delete_from_lists(struct nf_conn *ct);
> extern void nf_ct_insert_dying_list(struct nf_conn *ct);
>
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 76613f5..ed86a3b 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -404,19 +404,49 @@ static void __nf_conntrack_hash_insert(struct nf_conn *ct,
> &net->ct.hash[repl_hash]);
> }
>
> -void nf_conntrack_hash_insert(struct nf_conn *ct)
> +int
> +nf_conntrack_hash_check_insert(struct nf_conn *ct)
> {
> struct net *net = nf_ct_net(ct);
> unsigned int hash, repl_hash;
> + struct nf_conntrack_tuple_hash *h;
> + struct hlist_nulls_node *n;
> u16 zone;
>
> zone = nf_ct_zone(ct);
> - hash = hash_conntrack(net, zone, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
> - repl_hash = hash_conntrack(net, zone, &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
> + hash = hash_conntrack(net, zone,
> + &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
> + repl_hash = hash_conntrack(net, zone,
> + &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
> +
> + spin_lock_bh(&nf_conntrack_lock);
>
> + /* See if there's one in the list already, including reverse */
> + hlist_nulls_for_each_entry(h, n, &net->ct.hash[hash], hnnode)
> + if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
> + &h->tuple) &&
> + zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)))
> + goto out;
> + hlist_nulls_for_each_entry(h, n, &net->ct.hash[repl_hash], hnnode)
> + if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple,
> + &h->tuple) &&
> + zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)))
> + goto out;
> +
> + add_timer(&ct->timeout);
> + nf_conntrack_get(&ct->ct_general);
> __nf_conntrack_hash_insert(ct, hash, repl_hash);
> + NF_CT_STAT_INC(net, insert);
> + spin_unlock_bh(&nf_conntrack_lock);
> +
> + return 0;
> +
> +out:
> + NF_CT_STAT_INC(net, insert_failed);
> + spin_unlock_bh(&nf_conntrack_lock);
> + return -EEXIST;
> }
> -EXPORT_SYMBOL_GPL(nf_conntrack_hash_insert);
> +EXPORT_SYMBOL_GPL(nf_conntrack_hash_check_insert);
>
> /* Confirm a connection given skb; places it in hash table */
> int
> diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
> index 9307b03..6d73501 100644
> --- a/net/netfilter/nf_conntrack_netlink.c
> +++ b/net/netfilter/nf_conntrack_netlink.c
> @@ -1345,7 +1345,7 @@ ctnetlink_create_conntrack(struct net *net, u16 zone,
> struct nf_conntrack_helper *helper;
> struct nf_conn_tstamp *tstamp;
>
> - ct = nf_conntrack_alloc(net, zone, otuple, rtuple, GFP_ATOMIC);
> + ct = nf_conntrack_alloc(net, zone, otuple, rtuple, GFP_KERNEL);
> if (IS_ERR(ct))
> return ERR_PTR(-ENOMEM);
>
Sorry, I was wrong. I didn't notice that this is called holding
rcu_read_lock() inside nfnetlink.c and we cannot use GFP_KERNEL
inside read-lock area.
I'm going to take the patch and will fix this myself.
Thanks Jozsef.
next prev parent reply other threads:[~2012-02-24 11:18 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-21 13:53 [PATCH 0/1] netfilter: fix soft lockup when netlink adds new entries Jozsef Kadlecsik
2012-02-21 13:53 ` [PATCH 1/1] " Jozsef Kadlecsik
2012-02-21 14:52 ` Pablo Neira Ayuso
2012-02-21 15:06 ` Jozsef Kadlecsik
2012-02-21 15:20 ` Pablo Neira Ayuso
2012-02-23 10:12 ` Pablo Neira Ayuso
2012-02-23 12:43 ` Jozsef Kadlecsik
2012-02-23 15:46 ` Pablo Neira Ayuso
2012-02-23 20:44 ` Jozsef Kadlecsik
2012-02-24 1:24 ` Pablo Neira Ayuso
2012-02-24 6:44 ` Eric Dumazet
2012-02-24 10:01 ` Pablo Neira Ayuso
2012-02-24 10:11 ` Jozsef Kadlecsik
2012-02-24 11:36 ` Pablo Neira Ayuso
2012-02-24 8:06 ` Jozsef Kadlecsik
2012-02-24 9:59 ` Pablo Neira Ayuso
2012-02-24 10:09 ` Jozsef Kadlecsik
2012-02-24 10:45 ` Jozsef Kadlecsik
2012-02-24 11:18 ` Pablo Neira Ayuso [this message]
2012-02-24 11:33 ` Pablo Neira Ayuso
2012-02-24 17:05 ` Jozsef Kadlecsik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120224111800.GA9669@1984 \
--to=pablo@netfilter.org \
--cc=kadlec@blackhole.kfki.hu \
--cc=marcell.zambo@gmail.com \
--cc=netfilter-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).