From: Pablo Neira Ayuso <pablo@netfilter.org>
To: netfilter-devel@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org
Subject: [PATCH 25/29] netfilter: conntrack: add gc worker to remove timed-out entries
Date: Mon, 5 Sep 2016 12:58:40 +0200
Message-ID: <1473073124-5015-26-git-send-email-pablo@netfilter.org>
In-Reply-To: <1473073124-5015-1-git-send-email-pablo@netfilter.org>
From: Florian Westphal <fw@strlen.de>
Add a conntrack gc worker to evict stale entries.
GC runs once every 5 seconds, but each run scans at most 1/64th of the
table (and no more than 8k buckets) to avoid hogging the cpu.
This means that a complete scan of the table will take several minutes
of wall-clock time.
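
As a rough illustration of the "several minutes" figure, the arithmetic
can be sketched in plain userspace C. This is not part of the patch. The
helper names and GC_INTERVAL_SECS are hypothetical; the two limits are
copied from the patch. The sketch assumes the worker is rescheduled
exactly every 5 seconds and never stops early because of GC_MAX_EVICTS,
so the result is a lower bound.

```c
#include <assert.h>

/* Limits copied from the patch. */
#define GC_MAX_BUCKETS_DIV	64u
#define GC_MAX_BUCKETS		8192u
/* Hypothetical: GC_INTERVAL (5 * HZ) expressed in seconds. */
#define GC_INTERVAL_SECS	5u

/* Buckets visited by one gc run: min(size / 64, 8192), but never less
 * than one, because the do/while loop always scans at least one bucket.
 */
static unsigned int gc_buckets_per_run(unsigned int htable_size)
{
	unsigned int goal = htable_size / GC_MAX_BUCKETS_DIV;

	if (goal > GC_MAX_BUCKETS)
		goal = GC_MAX_BUCKETS;
	return goal ? goal : 1;
}

/* Number of gc runs needed to visit every bucket once (rounded up). */
static unsigned int gc_runs_per_full_scan(unsigned int htable_size)
{
	unsigned int per_run = gc_buckets_per_run(htable_size);

	assert(htable_size > 0);
	return (htable_size + per_run - 1) / per_run;
}

/* Lower bound on wall-clock seconds for one complete table scan. */
static unsigned int gc_full_scan_secs(unsigned int htable_size)
{
	return gc_runs_per_full_scan(htable_size) * GC_INTERVAL_SECS;
}
```

Under these assumptions a 16384-bucket table gives 256 buckets per run
and 64 runs, about 320 seconds. A table with 1048576 buckets hits the
8192 cap and needs 128 runs, about 640 seconds.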
Considering that the gc run will never have to evict any entries
during normal operation, because evictions will happen from the packet
path, this should be fine.
We only need gc to make sure userspace (conntrack event listeners)
eventually learns of the timeout, and for resource reclaim in case the
system becomes idle.
We do not disable BH, and we call cond_resched for every bucket, so this
should not introduce noticeable latencies either.
A followup patch will add a small change to speed up GC for the extreme
case where most entries are timed out on an otherwise idle system.
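
The bounded, resumable scan described above can be modelled in userspace
roughly as follows. This is only a sketch, not the kernel code: struct
toy_table, struct toy_gc_state and toy_gc_run() are hypothetical names,
and each bucket is reduced to a counter of expired entries instead of an
RCU-protected hlist_nulls chain. Locking, rescheduling and the exiting
flag are left out.

```c
/* Limits copied from the patch. */
#define GC_MAX_BUCKETS_DIV	64u
#define GC_MAX_BUCKETS		8192u
#define GC_MAX_EVICTS		256u

/* Hypothetical stand-in for the conntrack hash table. */
struct toy_table {
	unsigned int *expired;	/* expired[i]: expired entries in bucket i */
	unsigned int size;	/* number of buckets */
};

/* Hypothetical stand-in for struct conntrack_gc_work. */
struct toy_gc_state {
	unsigned int last_bucket;	/* where the previous run stopped */
};

/* One gc run. Resumes after last_bucket, wraps at the end of the table
 * and stops once the bucket goal or the eviction budget is reached.
 * A bucket is always processed completely, so the return value may
 * exceed GC_MAX_EVICTS. Returns the number of entries evicted.
 */
static unsigned int toy_gc_run(struct toy_table *t, struct toy_gc_state *st)
{
	unsigned int goal = t->size / GC_MAX_BUCKETS_DIV;
	unsigned int i = st->last_bucket;
	unsigned int buckets = 0, expired_count = 0;

	if (goal > GC_MAX_BUCKETS)
		goal = GC_MAX_BUCKETS;

	do {
		i++;
		if (i >= t->size)
			i = 0;

		/* evict everything that timed out in this bucket */
		expired_count += t->expired[i];
		t->expired[i] = 0;
	} while (++buckets < goal &&
		 expired_count < GC_MAX_EVICTS);

	st->last_bucket = i;
	return expired_count;
}
```

Because the cursor is incremented before a bucket is visited, a run that
starts with last_bucket == 0 begins at bucket 1, and bucket 0 is reached
only after the cursor wraps around.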
v2: Use cond_resched_rcu_qs & add comment wrt. missing restart on
nulls value change in gc worker, suggested by Eric Dumazet.
v3: don't call cancel_delayed_work_sync twice (again, Eric).
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_core.c | 76 +++++++++++++++++++++++++++++++++++++++
1 file changed, 76 insertions(+)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 87ee6da..f95a9e9 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -72,11 +72,24 @@ EXPORT_SYMBOL_GPL(nf_conntrack_expect_lock);
struct hlist_nulls_head *nf_conntrack_hash __read_mostly;
EXPORT_SYMBOL_GPL(nf_conntrack_hash);
+struct conntrack_gc_work {
+	struct delayed_work	dwork;
+	u32			last_bucket;
+	bool			exiting;
+};
+
static __read_mostly struct kmem_cache *nf_conntrack_cachep;
static __read_mostly spinlock_t nf_conntrack_locks_all_lock;
static __read_mostly DEFINE_SPINLOCK(nf_conntrack_locks_all_lock);
static __read_mostly bool nf_conntrack_locks_all;
+#define GC_MAX_BUCKETS_DIV 64u
+#define GC_MAX_BUCKETS 8192u
+#define GC_INTERVAL (5 * HZ)
+#define GC_MAX_EVICTS 256u
+
+static struct conntrack_gc_work conntrack_gc_work;
+
void nf_conntrack_lock(spinlock_t *lock) __acquires(lock)
{
spin_lock(lock);
@@ -928,6 +941,63 @@ static noinline int early_drop(struct net *net, unsigned int _hash)
return false;
}
+static void gc_worker(struct work_struct *work)
+{
+	unsigned int i, goal, buckets = 0, expired_count = 0;
+	unsigned long next_run = GC_INTERVAL;
+	struct conntrack_gc_work *gc_work;
+
+	gc_work = container_of(work, struct conntrack_gc_work, dwork.work);
+
+	goal = min(nf_conntrack_htable_size / GC_MAX_BUCKETS_DIV, GC_MAX_BUCKETS);
+	i = gc_work->last_bucket;
+
+	do {
+		struct nf_conntrack_tuple_hash *h;
+		struct hlist_nulls_head *ct_hash;
+		struct hlist_nulls_node *n;
+		unsigned int hashsz;
+		struct nf_conn *tmp;
+
+		i++;
+		rcu_read_lock();
+
+		nf_conntrack_get_ht(&ct_hash, &hashsz);
+		if (i >= hashsz)
+			i = 0;
+
+		hlist_nulls_for_each_entry_rcu(h, n, &ct_hash[i], hnnode) {
+			tmp = nf_ct_tuplehash_to_ctrack(h);
+
+			if (nf_ct_is_expired(tmp)) {
+				nf_ct_gc_expired(tmp);
+				expired_count++;
+				continue;
+			}
+		}
+
+		/* could check get_nulls_value() here and restart if ct
+		 * was moved to another chain.  But given gc is best-effort
+		 * we will just continue with next hash slot.
+		 */
+		rcu_read_unlock();
+		cond_resched_rcu_qs();
+	} while (++buckets < goal &&
+		 expired_count < GC_MAX_EVICTS);
+
+	if (gc_work->exiting)
+		return;
+
+	gc_work->last_bucket = i;
+	schedule_delayed_work(&gc_work->dwork, next_run);
+}
+
+static void conntrack_gc_work_init(struct conntrack_gc_work *gc_work)
+{
+	INIT_DELAYED_WORK(&gc_work->dwork, gc_worker);
+	gc_work->exiting = false;
+}
+
static struct nf_conn *
__nf_conntrack_alloc(struct net *net,
const struct nf_conntrack_zone *zone,
@@ -1534,6 +1604,7 @@ static int untrack_refs(void)
void nf_conntrack_cleanup_start(void)
{
+ conntrack_gc_work.exiting = true;
RCU_INIT_POINTER(ip_ct_attach, NULL);
}
@@ -1543,6 +1614,7 @@ void nf_conntrack_cleanup_end(void)
while (untrack_refs() > 0)
schedule();
+ cancel_delayed_work_sync(&conntrack_gc_work.dwork);
nf_ct_free_hashtable(nf_conntrack_hash, nf_conntrack_htable_size);
nf_conntrack_proto_fini();
@@ -1817,6 +1889,10 @@ int nf_conntrack_init_start(void)
}
/* - and look it like as a confirmed connection */
nf_ct_untracked_status_or(IPS_CONFIRMED | IPS_UNTRACKED);
+
+ conntrack_gc_work_init(&conntrack_gc_work);
+ schedule_delayed_work(&conntrack_gc_work.dwork, GC_INTERVAL);
+
return 0;
err_proto:
--
2.1.4
Thread overview: 32+ messages
2016-09-05 10:58 [PATCH 00/29] Netfilter updates for net-next Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 01/29] netfilter: conntrack: Only need first 4 bytes to get l4proto ports Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 02/29] netfilter: physdev: add missed blank Pablo Neira Ayuso
2016-09-05 17:43 ` Joe Perches
2016-09-05 10:58 ` [PATCH 03/29] netfilter: nf_dup4: remove redundant checksum recalculation Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 04/29] netfilter: use_nf_conn_expires helper in more places Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 05/29] ipvs: use nf_ct_kill helper Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 06/29] netfilter: nf_tables: rename set implementations Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 07/29] netfilter: nf_tables: add hash expression Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 08/29] netfilter: remove ip_conntrack* sysctl compat code Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 09/29] netfilter: conntrack: simplify the code by using nf_conntrack_get_ht Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 10/29] netfilter: nf_conntrack: restore nf_conntrack_htable_size as exported symbol Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 11/29] netfilter: nf_tables: add quota expression Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 12/29] netfilter: nf_tables: add number generator expression Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 13/29] netfilter: fix spelling mistake: "delimitter" -> "delimiter" Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 14/29] netfilter: nft_hash: fix non static symbol warning Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 15/29] netfilter: nf_tables: typo in trace attribute definition Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 16/29] netfilter: nf_tables: introduce nft_chain_parse_hook() Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 17/29] netfilter: nf_tables: reject hook configuration updates on existing chains Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 18/29] rhashtable: add rhashtable_lookup_get_insert_key() Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 19/29] netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 20/29] netfilter: nf_tables: Use nla_put_be32() to dump immediate parameters Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 21/29] netfilter: restart search if moved to other chain Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 22/29] netfilter: don't rely on DYING bit to detect when destroy event was sent Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 23/29] netfilter: conntrack: get rid of conntrack timer Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 24/29] netfilter: evict stale entries on netlink dumps Pablo Neira Ayuso
2016-09-05 10:58 ` Pablo Neira Ayuso [this message]
2016-09-05 10:58 ` [PATCH 26/29] netfilter: conntrack: resched gc again if eviction rate is high Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 27/29] netfilter: remove __nf_ct_kill_acct helper Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 28/29] netfilter: log_arp: Use ARPHRD_ETHER instead of literal '1' Pablo Neira Ayuso
2016-09-05 10:58 ` [PATCH 29/29] netfilter: log: Check param to avoid overflow in nf_log_set Pablo Neira Ayuso
2016-09-06 19:47 ` [PATCH 00/29] Netfilter updates for net-next David Miller