netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion
@ 2016-11-16 14:13 Florian Westphal
  2016-11-16 14:13 ` [PATCH nf 1/2] netfilter: nat: fix cmp return value Florian Westphal
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Florian Westphal @ 2016-11-16 14:13 UTC (permalink / raw)
  To: netfilter-devel; +Cc: herbert

These two patches address bugs added when I converted the nat bysource hash
to rhashtable.

First patch fixes object comparision, second patch converts
nat bysource to the rhlist interface, we have to deal with identical
source identities (distinct connections sharing same ip/port source pair).

Unfortunately the rhashlist interface isn't available in 4.8, so the question
is how to address patch #2 for -stable.

One way would be to revert the rhashtable conversion, the other one
would be to just set .insecure_elasticity in the rhashtable params.

The latter should work fine, as we do NOT need to identify the individual
duplicate keys.

 include/net/netfilter/nf_conntrack.h |    2 -
 net/netfilter/nf_nat_core.c          |   49 +++++++++++++++++++++--------------
 2 files changed, 31 insertions(+), 20 deletions(-)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH nf 1/2] netfilter: nat: fix cmp return value
  2016-11-16 14:13 [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion Florian Westphal
@ 2016-11-16 14:13 ` Florian Westphal
  2016-11-16 14:13 ` [PATCH nf 2/2] netfilter: nat: switch to new rhlist interface Florian Westphal
  2016-11-24 12:53 ` [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion Pablo Neira Ayuso
  2 siblings, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2016-11-16 14:13 UTC (permalink / raw)
  To: netfilter-devel; +Cc: herbert, Florian Westphal

The comparator works like memcmp, i.e. 0 means objects are equal.
In other words, when objects are distinct they are treated as identical,
when they are distinct they are allegedly the same.

The first case is rare (distinct objects are unlikely to get hashed to
same bucket).

The second case results in unneeded port conflict resolutions attempts.

Fixes: 870190a9ec907 ("netfilter: nat: convert nat bysrc hash to rhashtable")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_nat_core.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index bbb8f3df79f7..c632429706eb 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -193,9 +193,12 @@ static int nf_nat_bysource_cmp(struct rhashtable_compare_arg *arg,
 	const struct nf_nat_conn_key *key = arg->key;
 	const struct nf_conn *ct = obj;
 
-	return same_src(ct, key->tuple) &&
-	       net_eq(nf_ct_net(ct), key->net) &&
-	       nf_ct_zone_equal(ct, key->zone, IP_CT_DIR_ORIGINAL);
+	if (!same_src(ct, key->tuple) ||
+	    !net_eq(nf_ct_net(ct), key->net) ||
+	    !nf_ct_zone_equal(ct, key->zone, IP_CT_DIR_ORIGINAL))
+		return 1;
+
+	return 0;
 }
 
 static struct rhashtable_params nf_nat_bysource_params = {
-- 
2.7.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH nf 2/2] netfilter: nat: switch to new rhlist interface
  2016-11-16 14:13 [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion Florian Westphal
  2016-11-16 14:13 ` [PATCH nf 1/2] netfilter: nat: fix cmp return value Florian Westphal
@ 2016-11-16 14:13 ` Florian Westphal
  2016-11-24 12:53 ` [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion Pablo Neira Ayuso
  2 siblings, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2016-11-16 14:13 UTC (permalink / raw)
  To: netfilter-devel; +Cc: herbert, Florian Westphal

I got offlist bug report about failing connections and high cpu usage.
This happens because we hit 'elasticity' checks in rhashtable that
refuses bucket list exceeding 16 entries.

The nat bysrc hash unfortunately needs to insert distinct objects that
share same key and are identical (have same source tuple), this cannot
be avoided.

Switch to the rhlist interface which is designed for this.

The nulls_base is removed here, I don't think its needed:

A (unlikely) false positive results in unneeded port clash resolution,
a false negative results in packet drop during conntrack confirmation,
when we try to insert the duplicate into main conntrack hash table.

Tested by adding multiple ip addresses to host, then adding
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

... and then creating multiple connections, from same source port but
different addresses:

for i in $(seq 2000 2032);do nc -p 1234 192.168.7.1 $i > /dev/null  & done

(all of these then get hashed to same bysource slot)

Then, to test that nat conflict resultion is working:

nc -s 10.0.0.1 -p 1234 192.168.7.1 2000
nc -s 10.0.0.2 -p 1234 192.168.7.1 2000

tcp  .. src=10.0.0.1 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1024 [ASSURED]
tcp  .. src=10.0.0.2 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1025 [ASSURED]
tcp  .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1234 [ASSURED]
tcp  .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2001 src=192.168.7.1 dst=192.168.7.10 sport=2001 dport=1234 [ASSURED]
[..]

-> nat altered source ports to 1024 and 1025, respectively.
This can also be confirmed on destination host which shows
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1024
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1025
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1234

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Fixes: 870190a9ec907 ("netfilter: nat: convert nat bysrc hash to rhashtable")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_conntrack.h |  2 +-
 net/netfilter/nf_nat_core.c          | 40 +++++++++++++++++++++---------------
 2 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 50418052a520..dc143ada9762 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -118,7 +118,7 @@ struct nf_conn {
 	struct nf_ct_ext *ext;
 
 #if IS_ENABLED(CONFIG_NF_NAT)
-	struct rhash_head	nat_bysource;
+	struct rhlist_head nat_bysource;
 #endif
 	/* Storage reserved for other modules, must be the last member */
 	union nf_conntrack_proto proto;
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index c632429706eb..5b9c884a452e 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -42,7 +42,7 @@ struct nf_nat_conn_key {
 	const struct nf_conntrack_zone *zone;
 };
 
-static struct rhashtable nf_nat_bysource_table;
+static struct rhltable nf_nat_bysource_table;
 
 inline const struct nf_nat_l3proto *
 __nf_nat_l3proto_find(u8 family)
@@ -207,7 +207,6 @@ static struct rhashtable_params nf_nat_bysource_params = {
 	.obj_cmpfn = nf_nat_bysource_cmp,
 	.nelem_hint = 256,
 	.min_size = 1024,
-	.nulls_base = (1U << RHT_BASE_SHIFT),
 };
 
 /* Only called for SRC manip */
@@ -226,12 +225,15 @@ find_appropriate_src(struct net *net,
 		.tuple = tuple,
 		.zone = zone
 	};
+	struct rhlist_head *hl;
 
-	ct = rhashtable_lookup_fast(&nf_nat_bysource_table, &key,
-				    nf_nat_bysource_params);
-	if (!ct)
+	hl = rhltable_lookup(&nf_nat_bysource_table, &key,
+			     nf_nat_bysource_params);
+	if (!hl)
 		return 0;
 
+	ct = container_of(hl, typeof(*ct), nat_bysource);
+
 	nf_ct_invert_tuplepr(result,
 			     &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
 	result->dst = tuple->dst;
@@ -449,11 +451,17 @@ nf_nat_setup_info(struct nf_conn *ct,
 	}
 
 	if (maniptype == NF_NAT_MANIP_SRC) {
+		struct nf_nat_conn_key key = {
+			.net = nf_ct_net(ct),
+			.tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
+			.zone = nf_ct_zone(ct),
+		};
 		int err;
 
-		err = rhashtable_insert_fast(&nf_nat_bysource_table,
-					     &ct->nat_bysource,
-					     nf_nat_bysource_params);
+		err = rhltable_insert_key(&nf_nat_bysource_table,
+					  &key,
+					  &ct->nat_bysource,
+					  nf_nat_bysource_params);
 		if (err)
 			return NF_DROP;
 	}
@@ -570,8 +578,8 @@ static int nf_nat_proto_clean(struct nf_conn *ct, void *data)
 	 * will delete entry from already-freed table.
 	 */
 	ct->status &= ~IPS_NAT_DONE_MASK;
-	rhashtable_remove_fast(&nf_nat_bysource_table, &ct->nat_bysource,
-			       nf_nat_bysource_params);
+	rhltable_remove(&nf_nat_bysource_table, &ct->nat_bysource,
+			nf_nat_bysource_params);
 
 	/* don't delete conntrack.  Although that would make things a lot
 	 * simpler, we'd end up flushing all conntracks on nat rmmod.
@@ -701,8 +709,8 @@ static void nf_nat_cleanup_conntrack(struct nf_conn *ct)
 	if (!nat)
 		return;
 
-	rhashtable_remove_fast(&nf_nat_bysource_table, &ct->nat_bysource,
-			       nf_nat_bysource_params);
+	rhltable_remove(&nf_nat_bysource_table, &ct->nat_bysource,
+			nf_nat_bysource_params);
 }
 
 static struct nf_ct_ext_type nat_extend __read_mostly = {
@@ -837,13 +845,13 @@ static int __init nf_nat_init(void)
 {
 	int ret;
 
-	ret = rhashtable_init(&nf_nat_bysource_table, &nf_nat_bysource_params);
+	ret = rhltable_init(&nf_nat_bysource_table, &nf_nat_bysource_params);
 	if (ret)
 		return ret;
 
 	ret = nf_ct_extend_register(&nat_extend);
 	if (ret < 0) {
-		rhashtable_destroy(&nf_nat_bysource_table);
+		rhltable_destroy(&nf_nat_bysource_table);
 		printk(KERN_ERR "nf_nat_core: Unable to register extension\n");
 		return ret;
 	}
@@ -867,7 +875,7 @@ static int __init nf_nat_init(void)
 	return 0;
 
  cleanup_extend:
-	rhashtable_destroy(&nf_nat_bysource_table);
+	rhltable_destroy(&nf_nat_bysource_table);
 	nf_ct_extend_unregister(&nat_extend);
 	return ret;
 }
@@ -886,7 +894,7 @@ static void __exit nf_nat_cleanup(void)
 	for (i = 0; i < NFPROTO_NUMPROTO; i++)
 		kfree(nf_nat_l4protos[i]);
 
-	rhashtable_destroy(&nf_nat_bysource_table);
+	rhltable_destroy(&nf_nat_bysource_table);
 }
 
 MODULE_LICENSE("GPL");
-- 
2.7.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion
  2016-11-16 14:13 [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion Florian Westphal
  2016-11-16 14:13 ` [PATCH nf 1/2] netfilter: nat: fix cmp return value Florian Westphal
  2016-11-16 14:13 ` [PATCH nf 2/2] netfilter: nat: switch to new rhlist interface Florian Westphal
@ 2016-11-24 12:53 ` Pablo Neira Ayuso
  2 siblings, 0 replies; 4+ messages in thread
From: Pablo Neira Ayuso @ 2016-11-24 12:53 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, herbert

On Wed, Nov 16, 2016 at 03:13:34PM +0100, Florian Westphal wrote:
> These two patches address bugs added when I converted the nat bysource hash
> to rhashtable.
> 
> First patch fixes object comparision, second patch converts
> nat bysource to the rhlist interface, we have to deal with identical
> source identities (distinct connections sharing same ip/port source pair).
> 
> Unfortunately the rhashlist interface isn't available in 4.8, so the question
> is how to address patch #2 for -stable.
> 
> One way would be to revert the rhashtable conversion, the other one
> would be to just set .insecure_elasticity in the rhashtable params.
> 
> The latter should work fine, as we do NOT need to identify the individual
> duplicate keys.

Series applied, thanks Florian.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-11-24 12:53 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-16 14:13 [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion Florian Westphal
2016-11-16 14:13 ` [PATCH nf 1/2] netfilter: nat: fix cmp return value Florian Westphal
2016-11-16 14:13 ` [PATCH nf 2/2] netfilter: nat: switch to new rhlist interface Florian Westphal
2016-11-24 12:53 ` [PATCH nf 0/2] netfilter: nat: fix regressions introduced by rhashtable conversion Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).