From: Florian Westphal <fw@strlen.de>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>, netfilter-devel@vger.kernel.org
Subject: Re: [PATCH nf 0/4] netfilter: conntrack: allow insertion of clashing entries
Date: Mon, 17 Feb 2020 21:12:40 +0100 [thread overview]
Message-ID: <20200217201240.GH19559@breakpoint.cc> (raw)
In-Reply-To: <20200217192546.pa26vfni4kmhlpng@salvia>
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Mon, Feb 03, 2020 at 05:37:03PM +0100, Florian Westphal wrote:
> > This series allows conntrack to insert a duplicate conntrack entry
> > if the reply direction doesn't result in a clash with a different
> > original connection.
>
> Applied, thanks for your patience.
>
> I introduced the late clash resolution approach to deal with nfqueue,
> now this is extended to cover more cases, let's give it a try.
Yes, nfqueue is one way this can happen, changes to resolver libraries
to issue parallel requests have exposed this race for non-nfqueue case
too.
> >Alternatives considered were:
> >1. Confirm ct entries at allocation time, not in postrouting.
> > a. will cause uneccesarry work when the skb that creates the
> > conntrack is dropped by ruleset.
> > b. in case nat is applied, ct entry would need to be moved in
> > the table, which requires another spinlock pair to be taken.
> > c. breaks the 'unconfirmed entry is private to cpu' assumption:
> > we would need to guard all nfct->ext allocation requests with
> > ct->lock spinlock.
> >
> >2. Make the unconfirmed list a hash table instead of a pcpu list.
> > Shares drawback c) of the first alternative.
>
> The spinlock would need to be grabbed rarely, right? My mean, most
> extension allocations happen before insertion to the unconfirmed list.
> Only _ext_add() invocations coming after init_conntrack() might
> require this.
Right, we could add __nf_ct_ext_add() which is unlocked and convert
the additions happening before unconfirmed list insertion there.
But there are additional problems that I forgot:
a) need for one additional lookup after negative result from main table
(this time in unconfirmed list).
b) Need to asynchronously re-insert the skb at a later time, once
the racing entry is confirmed.
We can't use the unconfirmed ct as-is, because it may be incomplete.
For instance, the racing skb might not yet have hit the nat table, so
the ct contains wrong NAT info.
I think b) is a non-starter for all of the alternatives, unfortunately.
prev parent reply other threads:[~2020-02-17 20:12 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-03 16:37 [PATCH nf 0/4] netfilter: conntrack: allow insertion of clashing entries Florian Westphal
2020-02-03 16:37 ` [PATCH nf 1/4] netfilter: conntrack: remove two args from resolve_clash Florian Westphal
2020-02-03 16:37 ` [PATCH nf 2/4] netfilter: conntrack: place confirm-bit setting in a helper Florian Westphal
2020-02-03 16:37 ` [PATCH nf 3/4] netfilter: conntrack: split resolve_clash function Florian Westphal
2020-02-03 16:37 ` [PATCH nf 4/4] netfilter: conntrack: allow insertion of clashing entries Florian Westphal
2020-02-17 19:25 ` [PATCH nf 0/4] " Pablo Neira Ayuso
2020-02-17 20:12 ` Florian Westphal [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200217201240.GH19559@breakpoint.cc \
--to=fw@strlen.de \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.