All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jozsef Kadlecsik <kadlec@netfilter.org>
To: netfilter-devel@vger.kernel.org
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Subject: [PATCH v7 08/10] netfilter: ipset: skip gc when resize is in progress
Date: Thu, 14 May 2026 10:55:17 +0200	[thread overview]
Message-ID: <20260514085519.12729-9-kadlec@netfilter.org> (raw)
In-Reply-To: <20260514085519.12729-1-kadlec@netfilter.org>

Zhengchuan Liang reported that because resize does not copy
the comment extension into the resized set but uses it's pointer,
ongoing gc can free the extension in the original set which then
results stale pointer in the resized one. The proposed patch was
to recreate the extensions for every element in the resized set.
It is both expensive and wastes memory, so better skip gc
when resizing in progress detected: resizing will destroy
the original set anyway, so doing gc on it unnecessary.

Reported by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported by: Eulgyu Kim <eulgyukim@snu.ac.kr>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
---
 net/netfilter/ipset/ip_set_hash_gen.h | 40 ++++++++++++++++-----------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 6a31f2db824a..ba560ebb4719 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -75,7 +75,9 @@ struct hbucket {
 struct htable_gc {
 	struct delayed_work dwork;
 	struct ip_set *set;	/* Set the gc belongs to */
+	spinlock_t lock;	/* Lock to exclude gc and resize */
 	u32 region;		/* Last gc run position */
+	bool resizing;		/* Signal resize in progress */
 };
 
 /* The hash table: the table size stored here in order to make resizing easy */
@@ -569,28 +571,24 @@ mtype_gc(struct work_struct *work)
 	set = gc->set;
 	h = set->data;
 
-	spin_lock_bh(&set->lock);
 	t = ipset_dereference_set(h->table, set);
-	atomic_inc(&t->uref);
 	numof_locks = ahash_numof_locks(t->htable_bits);
-	r = gc->region++;
-	if (r >= numof_locks) {
-		r = gc->region = 0;
-	}
 	next_run = (IPSET_GC_PERIOD(set->timeout) * HZ) / numof_locks;
 	if (next_run < HZ/10)
 		next_run = HZ/10;
-	spin_unlock_bh(&set->lock);
-
-	mtype_gc_do(set, h, t, r);
 
-	if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
-		pr_debug("Table destroy after resize by expire: %p\n", t);
-		mtype_ahash_destroy(set, t, false);
+	spin_lock_bh(&gc->lock);
+	if (gc->resizing)
+		goto skip_gc;
+	r = gc->region++;
+	if (r >= numof_locks) {
+		r = gc->region = 0;
 	}
+	mtype_gc_do(set, h, t, r);
+skip_gc:
+	spin_unlock_bh(&gc->lock);
 
 	queue_delayed_work(system_power_efficient_wq, &gc->dwork, next_run);
-
 }
 
 static void
@@ -646,6 +644,9 @@ mtype_resize(struct ip_set *set, bool retried)
 #endif
 	orig = ipset_dereference_bh_nfnl(h->table);
 	htable_bits = orig->htable_bits;
+	spin_lock_bh(&h->gc.lock);
+	h->gc.resizing = 1;
+	spin_unlock_bh(&h->gc.lock);
 
 retry:
 	ret = 0;
@@ -672,7 +673,11 @@ mtype_resize(struct ip_set *set, bool retried)
 		spin_lock_init(&t->hregion[i].lock);
 
 	/* There can't be another parallel resizing,
-	 * but dumping, gc, kernel side add/del are possible
+	 * but dumping, kernel side add/del are possible.
+	 *
+	 * Parallel gc is explicitly excluded because
+	 * resize destroys the old set and its extensions
+	 * which can interfere with an ongoing gc.
 	 */
 	orig = ipset_dereference_bh_nfnl(h->table);
 	atomic_set(&orig->ref, 1);
@@ -692,8 +697,7 @@ mtype_resize(struct ip_set *set, bool retried)
 				if (!test_bit_acquire(j, n->used))
 					continue;
 				data = ahash_data(n, j, dsize);
-				if (SET_ELEM_EXPIRED(set, data))
-					continue;
+				/* Expired elements copied as well */
 #ifdef IP_SET_HASH_WITH_NETS
 				/* We have readers running parallel with us,
 				 * so the live data cannot be modified.
@@ -785,6 +789,9 @@ mtype_resize(struct ip_set *set, bool retried)
 	}
 
 out:
+	spin_lock_bh(&h->gc.lock);
+	h->gc.resizing = 0;
+	spin_unlock_bh(&h->gc.lock);
 #ifdef IP_SET_HASH_WITH_NETS
 	kfree(tmp);
 #endif
@@ -1594,6 +1601,7 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 		return -ENOMEM;
 	}
 	h->gc.set = set;
+	spin_lock_init(&h->gc.lock);
 	for (i = 0; i < ahash_numof_locks(hbits); i++)
 		spin_lock_init(&t->hregion[i].lock);
 	h->maxelem = maxelem;
-- 
2.39.5


  parent reply	other threads:[~2026-05-14  8:55 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-14  8:55 [PATCH v7 00/10] netfilter: ipset fixes Jozsef Kadlecsik
2026-05-14  8:55 ` [PATCH v7 01/10] netfilter: ipset: fix a potential dump-destroy race Jozsef Kadlecsik
2026-05-14  8:55 ` [PATCH v7 02/10] netfilter: ipset: Fix data race between add and list header in all hash types Jozsef Kadlecsik
2026-05-14  8:55 ` [PATCH v7 03/10] netfilter: ipset: Fix data race between add and dump " Jozsef Kadlecsik
2026-05-14  8:55 ` [PATCH v7 04/10] netfilter: ipset: annotate "pos" for concurrent readers/writers Jozsef Kadlecsik
2026-05-14  8:55 ` [PATCH v7 05/10] netfilter: ipset: Don't use test_bit() in lockless RCU readers in hash types Jozsef Kadlecsik
2026-05-14  8:55 ` [PATCH v7 06/10] netfilter: ipset: Don't use test_bit() in lockless RCU readers in bitmap types Jozsef Kadlecsik
2026-05-14  8:55 ` [PATCH v7 07/10] netfilter: ipset: fix order of kfree_rcu() and rcu_assign_pointer() Jozsef Kadlecsik
2026-05-14  8:55 ` Jozsef Kadlecsik [this message]
2026-05-14  8:55 ` [PATCH v7 09/10] netfilter: ipset: fix potential torn read in reuse/forceadd cases Jozsef Kadlecsik
2026-05-14  8:55 ` [PATCH v7 10/10] netfilter: ipset: add comment how cidr bookkeeping is working Jozsef Kadlecsik
2026-05-14 16:34 ` [syzbot ci] Re: netfilter: ipset fixes syzbot ci

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260514085519.12729-9-kadlec@netfilter.org \
    --to=kadlec@netfilter.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.