* [PATCH nf-next 0/5] netfilter: conntrack: rework nf_ct_iterate, part 1.
@ 2017-05-21 10:52 Florian Westphal
2017-05-21 10:52 ` [PATCH nf-next 1/5] netfilter: conntrack: rename nf_ct_iterate_cleanup Florian Westphal
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Florian Westphal @ 2017-05-21 10:52 UTC (permalink / raw)
To: netfilter-devel
First batch of changes to rework how we iterate over the conntrack table.
Historically, we had one table.
When net namespaces were added, we got one table per namespace.
Nowadays we again only have a single table (which considers netns
during lookups).
This series prepares for removal of some open-coded table iteration
places.
It also adds nf_ct_iterate_destroy(), to be used in module exit path
when we need to inspect every conntrack entry regardless of namespace,
then uses it from nat module exit path.
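For readers unfamiliar with the single-table layout described above, here is a minimal userspace sketch of a lookup that considers the namespace. All names (toy_net, toy_conn, toy_lookup) are hypothetical stand-ins, not kernel types, and a flat array stands in for the hash table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not kernel types. */
struct toy_net { int id; };

struct toy_conn {
	const struct toy_net *net;	/* owning namespace */
	unsigned int tuple;		/* stand-in for the connection tuple */
};

/* One table shared by all namespaces: a lookup must match both the
 * tuple and the namespace, otherwise identical tuples from different
 * namespaces would be confused with each other.
 */
static const struct toy_conn *
toy_lookup(const struct toy_conn *table, size_t n,
	   const struct toy_net *net, unsigned int tuple)
{
	size_t i;

	assert(table != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (table[i].tuple == tuple && table[i].net == net)
			return &table[i];
	}
	return NULL;
}
```

Because entries from every namespace share the table, a walk that wants a single namespace has to filter each entry, while a module-unload walk can simply visit everything once.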
* [PATCH nf-next 1/5] netfilter: conntrack: rename nf_ct_iterate_cleanup
From: Florian Westphal @ 2017-05-21 10:52 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
There are several places where we needlessly call nf_ct_iterate_cleanup;
we should instead iterate the full table at module unload time.
This is a leftover from back when the conntrack table got duplicated
per net namespace.
So rename nf_ct_iterate_cleanup to nf_ct_iterate_cleanup_net.
A later patch will then add a non-net variant.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/net/netfilter/nf_conntrack.h | 6 +++---
net/ipv4/netfilter/nf_nat_masquerade_ipv4.c | 4 ++--
net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 10 +++++-----
net/netfilter/nf_conntrack_core.c | 10 +++++-----
net/netfilter/nf_conntrack_netlink.c | 4 ++--
net/netfilter/nf_conntrack_proto.c | 4 ++--
net/netfilter/nf_nat_core.c | 6 +++---
7 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 8ece3612d0cd..f21180ea4558 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -225,9 +225,9 @@ extern s32 (*nf_ct_nat_offset)(const struct nf_conn *ct,
u32 seq);
/* Iterate over all conntracks: if iter returns true, it's deleted. */
-void nf_ct_iterate_cleanup(struct net *net,
- int (*iter)(struct nf_conn *i, void *data),
- void *data, u32 portid, int report);
+void nf_ct_iterate_cleanup_net(struct net *net,
+ int (*iter)(struct nf_conn *i, void *data),
+ void *data, u32 portid, int report);
struct nf_conntrack_zone;
diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
index dc1dea15c1b4..f39037fca923 100644
--- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
@@ -98,8 +98,8 @@ static int masq_device_event(struct notifier_block *this,
*/
NF_CT_ASSERT(dev->ifindex != 0);
- nf_ct_iterate_cleanup(net, device_cmp,
- (void *)(long)dev->ifindex, 0, 0);
+ nf_ct_iterate_cleanup_net(net, device_cmp,
+ (void *)(long)dev->ifindex, 0, 0);
}
return NOTIFY_DONE;
diff --git a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
index 2297c9f073ba..d7b679037bae 100644
--- a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
@@ -75,8 +75,8 @@ static int masq_device_event(struct notifier_block *this,
struct net *net = dev_net(dev);
if (event == NETDEV_DOWN)
- nf_ct_iterate_cleanup(net, device_cmp,
- (void *)(long)dev->ifindex, 0, 0);
+ nf_ct_iterate_cleanup_net(net, device_cmp,
+ (void *)(long)dev->ifindex, 0, 0);
return NOTIFY_DONE;
}
@@ -99,7 +99,7 @@ static void iterate_cleanup_work(struct work_struct *work)
w = container_of(work, struct masq_dev_work, work);
index = w->ifindex;
- nf_ct_iterate_cleanup(w->net, device_cmp, (void *)index, 0, 0);
+ nf_ct_iterate_cleanup_net(w->net, device_cmp, (void *)index, 0, 0);
put_net(w->net);
kfree(w);
@@ -110,12 +110,12 @@ static void iterate_cleanup_work(struct work_struct *work)
/* ipv6 inet notifier is an atomic notifier, i.e. we cannot
* schedule.
*
- * Unfortunately, nf_ct_iterate_cleanup can run for a long
+ * Unfortunately, nf_ct_iterate_cleanup_net can run for a long
* time if there are lots of conntracks and the system
* handles high softirq load, so it frequently calls cond_resched
* while iterating the conntrack table.
*
- * So we defer nf_ct_iterate_cleanup walk to the system workqueue.
+ * So we defer nf_ct_iterate_cleanup_net walk to the system workqueue.
*
* As we can have 'a lot' of inet_events (depending on amount
* of ipv6 addresses being deleted), we also need to add an upper
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index e847dbaa0c6b..2730f9df33b7 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1634,9 +1634,9 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
return ct;
}
-void nf_ct_iterate_cleanup(struct net *net,
- int (*iter)(struct nf_conn *i, void *data),
- void *data, u32 portid, int report)
+void nf_ct_iterate_cleanup_net(struct net *net,
+ int (*iter)(struct nf_conn *i, void *data),
+ void *data, u32 portid, int report)
{
struct nf_conn *ct;
unsigned int bucket = 0;
@@ -1654,7 +1654,7 @@ void nf_ct_iterate_cleanup(struct net *net,
cond_resched();
}
}
-EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup);
+EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup_net);
static int kill_all(struct nf_conn *i, void *data)
{
@@ -1723,7 +1723,7 @@ void nf_conntrack_cleanup_net_list(struct list_head *net_exit_list)
i_see_dead_people:
busy = 0;
list_for_each_entry(net, net_exit_list, exit_list) {
- nf_ct_iterate_cleanup(net, kill_all, NULL, 0, 0);
+ nf_ct_iterate_cleanup_net(net, kill_all, NULL, 0, 0);
if (atomic_read(&net->ct.count) != 0)
busy = 1;
}
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index dcf561b5c97a..7f53ec578b7e 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1116,8 +1116,8 @@ static int ctnetlink_flush_conntrack(struct net *net,
return PTR_ERR(filter);
}
- nf_ct_iterate_cleanup(net, ctnetlink_filter_match, filter,
- portid, report);
+ nf_ct_iterate_cleanup_net(net, ctnetlink_filter_match, filter,
+ portid, report);
kfree(filter);
return 0;
diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
index 2de6c1fe3261..b7d01f27d463 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -282,7 +282,7 @@ void nf_ct_l3proto_pernet_unregister(struct net *net,
proto->net_ns_put(net);
/* Remove all contrack entries for this protocol */
- nf_ct_iterate_cleanup(net, kill_l3proto, proto, 0, 0);
+ nf_ct_iterate_cleanup_net(net, kill_l3proto, proto, 0, 0);
}
EXPORT_SYMBOL_GPL(nf_ct_l3proto_pernet_unregister);
@@ -450,7 +450,7 @@ void nf_ct_l4proto_pernet_unregister_one(struct net *net,
nf_ct_l4proto_unregister_sysctl(net, pn, l4proto);
/* Remove all contrack entries for this protocol */
- nf_ct_iterate_cleanup(net, kill_l4proto, l4proto, 0, 0);
+ nf_ct_iterate_cleanup_net(net, kill_l4proto, l4proto, 0, 0);
}
EXPORT_SYMBOL_GPL(nf_ct_l4proto_pernet_unregister_one);
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index b48d6b5aae8a..46eac534f0d0 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -582,7 +582,7 @@ static void nf_nat_l4proto_clean(u8 l3proto, u8 l4proto)
rtnl_lock();
for_each_net(net)
- nf_ct_iterate_cleanup(net, nf_nat_proto_remove, &clean, 0, 0);
+ nf_ct_iterate_cleanup_net(net, nf_nat_proto_remove, &clean, 0, 0);
rtnl_unlock();
}
@@ -596,7 +596,7 @@ static void nf_nat_l3proto_clean(u8 l3proto)
rtnl_lock();
for_each_net(net)
- nf_ct_iterate_cleanup(net, nf_nat_proto_remove, &clean, 0, 0);
+ nf_ct_iterate_cleanup_net(net, nf_nat_proto_remove, &clean, 0, 0);
rtnl_unlock();
}
@@ -822,7 +822,7 @@ static void __net_exit nf_nat_net_exit(struct net *net)
{
struct nf_nat_proto_clean clean = {};
- nf_ct_iterate_cleanup(net, nf_nat_proto_clean, &clean, 0, 0);
+ nf_ct_iterate_cleanup_net(net, nf_nat_proto_clean, &clean, 0, 0);
}
static struct pernet_operations nf_nat_net_ops = {
--
2.13.0
* [PATCH nf-next 2/5] netfilter: conntrack: don't call iter for non-confirmed conntracks
From: Florian Westphal @ 2017-05-21 10:52 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
nf_ct_iterate_cleanup_net currently also calls the iter() callback for
conntracks on the unconfirmed list, but this is unsafe.
Accesses to nf_conn itself are fine, but some users access the extension
area in the iter() callback, and that only works reliably for confirmed
conntracks (ct->ext can be reallocated at any time for an unconfirmed
conntrack).
The second issue is that there is a short window where a conntrack entry
is neither on the list nor in the table: to confirm an entry, it is first
removed from the unconfirmed list, then inserted into the table.
Fix this by iterating the unconfirmed list first and marking all entries
as dying, then waiting for an rcu grace period.
This makes sure all entries that were about to be confirmed either are
in the main table, or will be dropped soon.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_conntrack_core.c | 39 +++++++++++++++++++++++++++++----------
1 file changed, 29 insertions(+), 10 deletions(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 2730f9df33b7..08733685d732 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1592,7 +1592,6 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
struct nf_conntrack_tuple_hash *h;
struct nf_conn *ct;
struct hlist_nulls_node *n;
- int cpu;
spinlock_t *lockp;
for (; *bucket < nf_conntrack_htable_size; (*bucket)++) {
@@ -1614,24 +1613,40 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
cond_resched();
}
+ return NULL;
+found:
+ atomic_inc(&ct->ct_general.use);
+ spin_unlock(lockp);
+ local_bh_enable();
+ return ct;
+}
+
+static void
+__nf_ct_unconfirmed_destroy(struct net *net)
+{
+ int cpu;
+
for_each_possible_cpu(cpu) {
- struct ct_pcpu *pcpu = per_cpu_ptr(net->ct.pcpu_lists, cpu);
+ struct nf_conntrack_tuple_hash *h;
+ struct hlist_nulls_node *n;
+ struct ct_pcpu *pcpu;
+
+ pcpu = per_cpu_ptr(net->ct.pcpu_lists, cpu);
spin_lock_bh(&pcpu->lock);
hlist_nulls_for_each_entry(h, n, &pcpu->unconfirmed, hnnode) {
+ struct nf_conn *ct;
+
ct = nf_ct_tuplehash_to_ctrack(h);
- if (iter(ct, data))
- set_bit(IPS_DYING_BIT, &ct->status);
+
+ /* we cannot call iter() on unconfirmed list, the
+ * owning cpu can reallocate ct->ext at any time.
+ */
+ set_bit(IPS_DYING_BIT, &ct->status);
}
spin_unlock_bh(&pcpu->lock);
cond_resched();
}
- return NULL;
-found:
- atomic_inc(&ct->ct_general.use);
- spin_unlock(lockp);
- local_bh_enable();
- return ct;
}
void nf_ct_iterate_cleanup_net(struct net *net,
@@ -1646,6 +1661,10 @@ void nf_ct_iterate_cleanup_net(struct net *net,
if (atomic_read(&net->ct.count) == 0)
return;
+ __nf_ct_unconfirmed_destroy(net);
+
+ synchronize_net();
+
while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
/* Time to push up daises... */
--
2.13.0
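The mark-as-dying step in this patch can be illustrated with a small single-threaded userspace model. This is only a sketch: toy_conn, toy_unconfirmed_destroy and toy_confirm are hypothetical names, there is no locking or RCU, and the confirm step is assumed to refuse dying entries, as the commit message implies ("will be dropped soon").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a conntrack entry, not the kernel's nf_conn. */
struct toy_conn {
	bool dying;	/* models IPS_DYING_BIT */
	bool confirmed;	/* true once the entry is in the main table */
};

/* Mark every entry on the unconfirmed list as dying. No callback is
 * invoked here, so nothing inspects the (possibly changing) extension
 * area of an unconfirmed entry.
 */
static void toy_unconfirmed_destroy(struct toy_conn *list, size_t n)
{
	size_t i;

	assert(list != NULL || n == 0);

	for (i = 0; i < n; i++)
		list[i].dying = true;
}

/* Assumed confirm step: a dying entry is dropped, not inserted. */
static bool toy_confirm(struct toy_conn *ct)
{
	if (ct->dying)
		return false;

	ct->confirmed = true;
	return true;
}
```

An entry confirmed before the sweep ends up in the main table, where the normal walk can call iter() on it safely; an entry still pending after the sweep can no longer be confirmed.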
* [PATCH nf-next 3/5] netfilter: conntrack: add nf_ct_iterate_destroy
From: Florian Westphal @ 2017-05-21 10:52 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
Add a sledgehammer to be used on module unload (to remove affected
conntracks from all namespaces).
It also flags all unconfirmed conntracks as dying, i.e. they will
not be committed to the main table.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/net/netfilter/nf_conntrack.h | 4 ++
net/netfilter/nf_conntrack_core.c | 87 ++++++++++++++++++++++++++++++------
2 files changed, 78 insertions(+), 13 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index f21180ea4558..48407569585d 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -229,6 +229,10 @@ void nf_ct_iterate_cleanup_net(struct net *net,
int (*iter)(struct nf_conn *i, void *data),
void *data, u32 portid, int report);
+/* also set unconfirmed conntracks as dying. Only use in module exit path. */
+void nf_ct_iterate_destroy(int (*iter)(struct nf_conn *i, void *data),
+ void *data);
+
struct nf_conntrack_zone;
void nf_conntrack_free(struct nf_conn *ct);
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 08733685d732..7ecee79c78b8 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1586,7 +1586,7 @@ static void nf_conntrack_attach(struct sk_buff *nskb, const struct sk_buff *skb)
/* Bring out ya dead! */
static struct nf_conn *
-get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
+get_next_corpse(int (*iter)(struct nf_conn *i, void *data),
void *data, unsigned int *bucket)
{
struct nf_conntrack_tuple_hash *h;
@@ -1603,8 +1603,7 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
continue;
ct = nf_ct_tuplehash_to_ctrack(h);
- if (net_eq(nf_ct_net(ct), net) &&
- iter(ct, data))
+ if (iter(ct, data))
goto found;
}
}
@@ -1621,6 +1620,39 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
return ct;
}
+static void nf_ct_iterate_cleanup(int (*iter)(struct nf_conn *i, void *data),
+ void *data, u32 portid, int report)
+{
+ struct nf_conn *ct;
+ unsigned int bucket = 0;
+
+ might_sleep();
+
+ while ((ct = get_next_corpse(iter, data, &bucket)) != NULL) {
+ /* Time to push up daises... */
+
+ nf_ct_delete(ct, portid, report);
+ nf_ct_put(ct);
+ cond_resched();
+ }
+}
+
+struct iter_data {
+ int (*iter)(struct nf_conn *i, void *data);
+ void *data;
+ struct net *net;
+};
+
+static int iter_net_only(struct nf_conn *i, void *data)
+{
+ struct iter_data *d = data;
+
+ if (!net_eq(d->net, nf_ct_net(i)))
+ return 0;
+
+ return d->iter(i, d->data);
+}
+
static void
__nf_ct_unconfirmed_destroy(struct net *net)
{
@@ -1653,8 +1685,7 @@ void nf_ct_iterate_cleanup_net(struct net *net,
int (*iter)(struct nf_conn *i, void *data),
void *data, u32 portid, int report)
{
- struct nf_conn *ct;
- unsigned int bucket = 0;
+ struct iter_data d;
might_sleep();
@@ -1663,21 +1694,51 @@ void nf_ct_iterate_cleanup_net(struct net *net,
__nf_ct_unconfirmed_destroy(net);
+ d.iter = iter;
+ d.data = data;
+ d.net = net;
+
synchronize_net();
- while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
- /* Time to push up daises... */
+ nf_ct_iterate_cleanup(iter_net_only, &d, portid, report);
+}
+EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup_net);
- nf_ct_delete(ct, portid, report);
- nf_ct_put(ct);
- cond_resched();
+/**
+ * nf_ct_iterate_destroy - destroy unconfirmed conntracks and iterate table
+ * @iter: callback to invoke for each conntrack
+ * @data: data to pass to @iter
+ *
+ * Like nf_ct_iterate_cleanup, but first marks conntracks on the
+ * unconfirmed list as dying (so they will not be inserted into
+ * main table).
+ */
+void
+nf_ct_iterate_destroy(int (*iter)(struct nf_conn *i, void *data), void *data)
+{
+ struct net *net;
+
+ rtnl_lock();
+ for_each_net(net) {
+ if (atomic_read(&net->ct.count) == 0)
+ continue;
+ __nf_ct_unconfirmed_destroy(net);
}
+ rtnl_unlock();
+
+ /* a conntrack could have been unlinked from unconfirmed list
+ * before we grabbed pcpu lock in __nf_ct_unconfirmed_destroy().
+ * This makes sure its inserted into conntrack table.
+ */
+ synchronize_net();
+
+ nf_ct_iterate_cleanup(iter, data, 0, 0);
}
-EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup_net);
+EXPORT_SYMBOL_GPL(nf_ct_iterate_destroy);
static int kill_all(struct nf_conn *i, void *data)
{
- return 1;
+ return net_eq(nf_ct_net(i), data);
}
void nf_ct_free_hashtable(void *hash, unsigned int size)
@@ -1742,7 +1803,7 @@ void nf_conntrack_cleanup_net_list(struct list_head *net_exit_list)
i_see_dead_people:
busy = 0;
list_for_each_entry(net, net_exit_list, exit_list) {
- nf_ct_iterate_cleanup_net(net, kill_all, NULL, 0, 0);
+ nf_ct_iterate_cleanup(kill_all, net, 0, 0);
if (atomic_read(&net->ct.count) != 0)
busy = 1;
}
--
2.13.0
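The iter_net_only() wrapper in this patch is a callback adapter: the table walk becomes namespace-agnostic, and per-namespace filtering moves into a wrapper callback. A userspace sketch of the same pattern follows; the toy_* names are hypothetical, and the walk counts matches instead of deleting entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not kernel types. */
struct toy_net { int id; };
struct toy_conn { const struct toy_net *net; int mark; };

typedef int (*toy_iter_fn)(struct toy_conn *ct, void *data);

/* Bundles the caller's callback and argument with the namespace filter. */
struct toy_iter_data {
	toy_iter_fn iter;
	void *data;
	const struct toy_net *net;
};

/* Adapter: skip entries from other namespaces, else delegate. */
static int toy_iter_net_only(struct toy_conn *ct, void *data)
{
	struct toy_iter_data *d = data;

	if (ct->net != d->net)
		return 0;

	return d->iter(ct, d->data);
}

/* Namespace-agnostic walk; returns how many entries iter selected. */
static size_t toy_iterate(struct toy_conn *table, size_t n,
			  toy_iter_fn iter, void *data)
{
	size_t i, hits = 0;

	assert(iter != NULL);

	for (i = 0; i < n; i++) {
		if (iter(&table[i], data))
			hits++;
	}
	return hits;
}

/* Example user callback: select entries whose mark equals *data. */
static int toy_mark_eq(struct toy_conn *ct, void *data)
{
	return ct->mark == *(int *)data;
}
```

Passing toy_mark_eq directly visits every namespace, which is what a module-exit caller wants; wrapping it in toy_iter_net_only restricts the same walk to one namespace without changing the walker.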
* [PATCH nf-next 4/5] netfilter: conntrack: restart iteration on resize
From: Florian Westphal @ 2017-05-21 10:52 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
We could miss some conntracks when a resize occurs in parallel.
Avoid this by sampling the generation seqcnt and restarting the walk if needed.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_conntrack_core.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 7ecee79c78b8..c3bd9b086dcc 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1623,17 +1623,25 @@ get_next_corpse(int (*iter)(struct nf_conn *i, void *data),
static void nf_ct_iterate_cleanup(int (*iter)(struct nf_conn *i, void *data),
void *data, u32 portid, int report)
{
+ unsigned int bucket = 0, sequence;
struct nf_conn *ct;
- unsigned int bucket = 0;
might_sleep();
- while ((ct = get_next_corpse(iter, data, &bucket)) != NULL) {
- /* Time to push up daises... */
+ for (;;) {
+ sequence = read_seqcount_begin(&nf_conntrack_generation);
- nf_ct_delete(ct, portid, report);
- nf_ct_put(ct);
- cond_resched();
+ while ((ct = get_next_corpse(iter, data, &bucket)) != NULL) {
+ /* Time to push up daises... */
+
+ nf_ct_delete(ct, portid, report);
+ nf_ct_put(ct);
+ cond_resched();
+ }
+
+ if (!read_seqcount_retry(&nf_conntrack_generation, sequence))
+ break;
+ bucket = 0;
}
}
--
2.13.0
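The restart logic can be modelled in userspace with a plain generation counter. This is a simplified, single-threaded sketch under stated assumptions: the toy_* names are hypothetical, the "resize" is simulated from a hook called after each deletion, and a real seqcount additionally handles a writer that is still in progress.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BUCKETS 4

/* Hypothetical table model: one flag per bucket. */
struct toy_table {
	unsigned int generation;	/* bumped on every resize */
	int live[TOY_BUCKETS];		/* 1 if the bucket holds a match */
};

/* Return the next bucket at or after *bucket holding an entry, or -1. */
static int toy_next_corpse(const struct toy_table *t, unsigned int *bucket)
{
	for (; *bucket < TOY_BUCKETS; (*bucket)++) {
		if (t->live[*bucket])
			return (int)*bucket;
	}
	return -1;
}

/* Walk the table, deleting entries; if the generation changed during
 * the walk, start again from bucket 0 so moved entries are not missed.
 */
static unsigned int toy_iterate_cleanup(struct toy_table *t,
					void (*on_delete)(struct toy_table *))
{
	unsigned int bucket = 0, seq, deleted = 0;
	int b;

	assert(t != NULL);

	for (;;) {
		seq = t->generation;

		while ((b = toy_next_corpse(t, &bucket)) >= 0) {
			t->live[b] = 0;
			deleted++;
			if (on_delete)
				on_delete(t);
		}

		if (seq == t->generation)
			break;
		bucket = 0;
	}
	return deleted;
}

/* Simulated resize, run once: moves the entry in bucket 3 to bucket 0,
 * which the walker has already passed.
 */
static void toy_resize_once(struct toy_table *t)
{
	if (t->generation == 0) {
		t->generation++;
		t->live[3] = 0;
		t->live[0] = 1;
	}
}
```

Without the generation check, the entry moved to bucket 0 would survive the walk; with it, the second pass finds and removes it.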
* [PATCH nf-next 5/5] netfilter: nat: destroy nat mappings on module exit path only
From: Florian Westphal @ 2017-05-21 10:52 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
We don't need pernetns cleanup anymore. If the netns is being
destroyed, conntrack netns exit will kill all entries in this namespace,
and neither conntrack hash table nor bysource hash are per namespace.
For the rmmod case, we have to make sure we remove all entries from the
nat bysource table, so call the new nf_ct_iterate_destroy in module exit
path.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_nat_core.c | 37 +++++--------------------------------
1 file changed, 5 insertions(+), 32 deletions(-)
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 46eac534f0d0..32b749ea2014 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -578,12 +578,8 @@ static void nf_nat_l4proto_clean(u8 l3proto, u8 l4proto)
.l3proto = l3proto,
.l4proto = l4proto,
};
- struct net *net;
- rtnl_lock();
- for_each_net(net)
- nf_ct_iterate_cleanup_net(net, nf_nat_proto_remove, &clean, 0, 0);
- rtnl_unlock();
+ nf_ct_iterate_destroy(nf_nat_proto_remove, &clean);
}
static void nf_nat_l3proto_clean(u8 l3proto)
@@ -591,13 +587,8 @@ static void nf_nat_l3proto_clean(u8 l3proto)
struct nf_nat_proto_clean clean = {
.l3proto = l3proto,
};
- struct net *net;
- rtnl_lock();
-
- for_each_net(net)
- nf_ct_iterate_cleanup_net(net, nf_nat_proto_remove, &clean, 0, 0);
- rtnl_unlock();
+ nf_ct_iterate_destroy(nf_nat_proto_remove, &clean);
}
/* Protocol registration. */
@@ -818,17 +809,6 @@ nfnetlink_parse_nat_setup(struct nf_conn *ct,
}
#endif
-static void __net_exit nf_nat_net_exit(struct net *net)
-{
- struct nf_nat_proto_clean clean = {};
-
- nf_ct_iterate_cleanup_net(net, nf_nat_proto_clean, &clean, 0, 0);
-}
-
-static struct pernet_operations nf_nat_net_ops = {
- .exit = nf_nat_net_exit,
-};
-
static struct nf_ct_helper_expectfn follow_master_nat = {
.name = "nat-follow-master",
.expectfn = nf_nat_follow_master,
@@ -849,10 +829,6 @@ static int __init nf_nat_init(void)
return ret;
}
- ret = register_pernet_subsys(&nf_nat_net_ops);
- if (ret < 0)
- goto cleanup_extend;
-
nf_ct_helper_expectfn_register(&follow_master_nat);
BUG_ON(nfnetlink_parse_nat_setup_hook != NULL);
@@ -863,18 +839,15 @@ static int __init nf_nat_init(void)
RCU_INIT_POINTER(nf_nat_decode_session_hook, __nf_nat_decode_session);
#endif
return 0;
-
- cleanup_extend:
- rhltable_destroy(&nf_nat_bysource_table);
- nf_ct_extend_unregister(&nat_extend);
- return ret;
}
static void __exit nf_nat_cleanup(void)
{
+ struct nf_nat_proto_clean clean = {};
unsigned int i;
- unregister_pernet_subsys(&nf_nat_net_ops);
+ nf_ct_iterate_destroy(nf_nat_proto_clean, &clean);
+
nf_ct_extend_unregister(&nat_extend);
nf_ct_helper_expectfn_unregister(&follow_master_nat);
RCU_INIT_POINTER(nfnetlink_parse_nat_setup_hook, NULL);
--
2.13.0
* Re: [PATCH nf-next 0/5] netfilter: conntrack: rework nf_ct_iterate, part 1.
From: Pablo Neira Ayuso @ 2017-05-29 9:35 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
On Sun, May 21, 2017 at 12:52:54PM +0200, Florian Westphal wrote:
> First batch of changes to rework how we iterate over the conntrack table.
>
> Historically, we had one table.
> When net namespaces were added, we got one table per namespace.
> Nowadays we again only have a single table (which considers netns
> during lookups).
>
> This series prepares for removal of some open-coded table iteration
> places.
>
> It also adds nf_ct_iterate_destroy(), to be used in module exit path
> when we need to inspect every conntrack entry regardless of namespace,
> then uses it from nat module exit path.
Series applied, thanks.