From: Florian Westphal <fw@strlen.de>
To: Ani Sinha <ani@arista.com>
Cc: Florian Westphal <fw@strlen.de>,
Patrick McHardy <kaber@trash.net>,
"David S. Miller" <davem@davemloft.net>,
netfilter-devel@vger.kernel.org, netfilter@vger.kernel.org,
coreteam@netfilter.org,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: linux 3.4.43 : kernel crash at __nf_conntrack_confirm
Date: Sun, 18 Oct 2015 23:40:50 +0200 [thread overview]
Message-ID: <20151018214050.GD4386@breakpoint.cc> (raw)
In-Reply-To: <alpine.OSX.2.20.1510181409060.87917@athabasca.local>
Ani Sinha <ani@arista.com> wrote:
> Indeed. So it seems to me that we have run into one another such case.
> In patch c6825c0976fa7893692, I see we have added an additional check (along with comparing tuple and zone) to verify that if the conntrack is confirmed.
>
> + return nf_ct_tuple_equal(tuple, &h->tuple) &&
> + nf_ct_zone(ct) == zone &&
> + nf_ct_is_confirmed(ct);
>
> This is necessary since it's possible that a conntrack can be recreated with the same zone.
> Unfortunately, we leave a hole open in __nf_conntrack_confirm() because this routine _is_ responsible
> for confirming the conntrack. We cannot use the same logic here.
Hmm, why?
I don't understand why we need to change __nf_conntrack_confirm(), can
you elaborate?
At __nf_conntrack_confirm call time, only one cpu can see this nfct entry.
Other cpus on read-side can see it due to object re-use but any of the
following tests should fail:
1. different tuples
2. differnet zones
3. CONFIRMED not set
So they would skip entry and restart lookup (NULs value mismatch).
> Should I send a patch along the lines of :
>
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 71935fc..6ff4088 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -535,6 +535,12 @@ __nf_conntrack_confirm(struct sk_buff *skb)
> zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)))
> goto out;
>
> + /* we might be racing against a case where the conntrack was deleted
> + and a new conntrack was initialized with the exact same zone. We
> + need to make sure that the conntrack node is in the hashtable */
?
The conntrack is NOT in the hashtable at this point. Its not even on
the unconfirmed list since we already removed it in preparation of
hashtable insertion.
> + if (hlist_nulls_unhashed(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode))
> + goto out;
That would be a bug, how can ->nfct be confirmed twice?
If you're talking about IPS_CONFIRMED getting set -- that should be
harmless. In some theoretical condition we could indeed observe this
nfct on another cpu, just before we actually insert this but this does
not cause a problem on the read-side since the conntrack matches the
tuple exactly and all extensions have been initialized.
And if we create two conntracks with identical tuples on different CPUs
which is possible regardless of RCU this will be detected during
confirm step (we search ht for a colliding tuple).
So, if there is a problem please describe in more detail, I don't see
anything wrong so far.
next prev parent reply other threads:[~2015-10-18 21:40 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-07 19:57 linux 3.4.43 : kernel crash at __nf_conntrack_confirm Ani Sinha
2015-10-18 2:34 ` Ani Sinha
2015-10-18 8:07 ` Florian Westphal
[not found] ` <CAOxq_8NLLFyNCSDJ68+VjxFGpNSex8ShdhGFNBHK29g_+UBW6g@mail.gmail.com>
2015-10-18 21:12 ` Ani Sinha
2015-10-18 21:40 ` Florian Westphal [this message]
2015-10-19 20:22 ` Ani Sinha
2015-10-19 20:33 ` Florian Westphal
2015-10-19 22:13 ` Ani Sinha
2015-10-21 19:35 ` Ani Sinha
2015-10-21 21:19 ` Florian Westphal
2015-10-21 21:26 ` Ani Sinha
2015-10-22 7:42 ` Neal P. Murphy
2015-10-22 19:53 ` Ani Sinha
2015-10-23 2:39 ` Neal P. Murphy
2015-10-24 18:28 ` Ani Sinha
2015-10-26 6:13 ` Neal P. Murphy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151018214050.GD4386@breakpoint.cc \
--to=fw@strlen.de \
--cc=ani@arista.com \
--cc=coreteam@netfilter.org \
--cc=davem@davemloft.net \
--cc=kaber@trash.net \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=netfilter@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).