From: Florian Westphal <fw@strlen.de>
To: "Niklas Hambüchen" <mail@nh2.me>
Cc: netdev@vger.kernel.org
Subject: Re: Packets being dropped somewhere in the kernel, between iptables and packet capture layers
Date: Mon, 28 Jan 2019 07:21:21 +0100
Message-ID: <20190128062121.fbksn2vdzpthwfkh@breakpoint.cc>
In-Reply-To: <19e1b7a2-00b2-3656-309c-0586e990007b@nh2.me>
Niklas Hambüchen <mail@nh2.me> wrote:
> I'm sending this to netdev@vger.kernel.org even though http://vger.kernel.org/lkml/ still suggests linux-net@vger.kernel.org, because the latter seems to be inactive since 2011 and full of spam, and I got "unresolvable address" for it. Perhaps somebody should update the page that recommends it.
> Nevertheless, please let me know if here is the wrong place.

This problem is known; I asked for test feedback on this patch but never
got a response:

netfilter: nf_nat: return the same reply tuple for matching CTs

It is possible that two concurrent packets originating from the same
socket of a connection-less protocol (e.g. UDP) can end up having
different IP_CT_DIR_REPLY tuples which results in one of the packets
being dropped.

To illustrate this, consider the following simplified scenario:

1. No DNAT/SNAT/MASQUERADE rules are installed, but the nf_nat module
   is loaded.
2. Packet A and B are sent at the same time from two different threads
   via the same UDP socket which hasn't been used before (=no CT has
   been created before). Both packets have the same IP_CT_DIR_ORIGINAL
   tuple.
3. CT of A has been created and confirmed, afterwards get_unique_tuple
   is called for B. Because the IP_CT_DIR_REPLY tuple (the inverse of
   the IP_CT_DIR_ORIGINAL tuple) is already taken by A's confirmed
   CT (nf_nat_used_tuple finds it), get_unique_tuple calls UDP's
   unique_tuple which returns a different IP_CT_DIR_REPLY tuple (usually
   with src port = 1024).
4. B's CT cannot get confirmed in __nf_conntrack_confirm due to
   the found IP_CT_DIR_ORIGINAL tuple of A and the different
   IP_CT_DIR_REPLY tuples, thus the packet B gets dropped.

This patch modifies nf_conntrack_tuple_taken so it doesn't consider
colliding reply tuples if the IP_CT_DIR_ORIGINAL tuples are equal.

Then, at insert time, either clash resolution is possible (new packet
has the existing/older conntrack assigned to it), or it has to be
dropped.
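
The effect of the changed check can be shown with a toy model. This is
a simplified sketch, not kernel code: struct toy_tuple, struct toy_ct
and the toy_* helpers are hypothetical stand-ins, the table is a plain
array instead of the conntrack hash, and zones and network namespaces
are left out. The skip_same_flow flag switches between the old
behaviour and the one the patch aims for.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct nf_conntrack_tuple: addresses and ports only. */
struct toy_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

/* Toy conntrack entry with one tuple per direction. */
struct toy_ct {
	struct toy_tuple orig;   /* plays the role of IP_CT_DIR_ORIGINAL */
	struct toy_tuple reply;  /* plays the role of IP_CT_DIR_REPLY */
};

static bool toy_tuple_equal(const struct toy_tuple *a,
			    const struct toy_tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* The reply tuple of an un-NATed flow is the original tuple with
 * source and destination swapped.
 */
static struct toy_tuple toy_invert(const struct toy_tuple *t)
{
	struct toy_tuple r;

	r.src_ip = t->dst_ip;
	r.dst_ip = t->src_ip;
	r.src_port = t->dst_port;
	r.dst_port = t->src_port;
	return r;
}

/* Returns true if 'reply' is already used by a confirmed entry.
 * With skip_same_flow set, a confirmed entry whose original tuple
 * equals that of new_ct is ignored, because both describe the same
 * flow; this mirrors the intent of the added 'continue'.
 */
static bool toy_tuple_taken(const struct toy_ct *confirmed, size_t n,
			    const struct toy_tuple *reply,
			    const struct toy_ct *new_ct,
			    bool skip_same_flow)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!toy_tuple_equal(&confirmed[i].reply, reply))
			continue;
		if (skip_same_flow &&
		    toy_tuple_equal(&confirmed[i].orig, &new_ct->orig))
			continue;
		return true;
	}
	return false;
}
```

With A confirmed and B carrying the same original tuple, the old check
reports B's reply tuple as taken, which is what sends B to a different
source port. With skip_same_flow set it reports the tuple as free, so B
keeps its reply tuple and the collision is left to insert time. An
entry from a different flow that collides on the reply tuple is still
reported as taken in both modes.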

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 741b533148ba..07847a612adf 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1007,6 +1007,22 @@ nf_conntrack_tuple_taken(const struct nf_conntrack_tuple *tuple,
 		}
 
 		if (nf_ct_key_equal(h, tuple, zone, net)) {
+			/* If the origin tuples are identical, we can ignore
+			 * this clashing entry: they refer to the same flow.
+			 * Do not apply nat clash resolution in this case and
+			 * let nf_ct_resolve_clash() deal with this.
+			 *
+			 * This can happen with UDP in particular, e.g. when
+			 * more than one packet is sent from same socket in
+			 * different threads.
+			 *
+			 * We would now mangle our entry and would then have to
+			 * discard it at conntrack confirm time.
+			 */
+			if (nf_ct_tuple_equal(&ignored_conntrack->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
+					      &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple))
+				continue;
+
 			NF_CT_STAT_INC_ATOMIC(net, found);
 			rcu_read_unlock();
 			return 1;