* [PATCH v6 0/8] netfilter: ipset fixes
@ 2026-05-08 20:58 Jozsef Kadlecsik
2026-05-08 20:58 ` [PATCH v6 1/8] netfilter: ipset: fix a potential dump-destroy race Jozsef Kadlecsik
` (8 more replies)
0 siblings, 9 replies; 12+ messages in thread
From: Jozsef Kadlecsik @ 2026-05-08 20:58 UTC (permalink / raw)
To: netfilter-devel; +Cc: Pablo Neira Ayuso
Hi Pablo,
Here follows the new revision of the fixes for the current list
of ipset related issues. If sashiko won't find any issues in
the patches themselves, then please consider applying them.
Best regards,
Jozsef
Jozsef Kadlecsik (8):
netfilter: ipset: fix a potential dump-destroy race
netfilter: ipset: Fix data race between add and list header in all
hash types
netfilter: ipset: Fix data race between add and dump in all hash types
netfilter: ipset: annotate "pos" for concurrent readers/writers
netfilter: ipset: Don't use test_bit() in lockless RCU readers
netfilter: ipset: fix potential torn read in reuse/forceadd cases
netfilter: ipset: skip gc when resize is in progress
netfilter: ipset: fix order of usage counters
net/netfilter/ipset/ip_set_core.c | 5 +-
net/netfilter/ipset/ip_set_hash_gen.h | 82 ++++++++++++++++++---------
2 files changed, 58 insertions(+), 29 deletions(-)
--
2.39.5
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH v6 1/8] netfilter: ipset: fix a potential dump-destroy race 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik @ 2026-05-08 20:58 ` Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 2/8] netfilter: ipset: Fix data race between add and list header in all hash types Jozsef Kadlecsik ` (7 subsequent siblings) 8 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-08 20:58 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso When dumping sets in order to create the proper order for restore, the list type of sets dumped last. Therefore internally we run the dumping loop twice: first with all non-list type of sets and skipping the list type ones and then secondly for the list type of sets. Sashiko noticed that there's a potential race between dump and destroy if in the first loop the last set was a list type of set: its pointer remains unreferenced and a concurrent destroy can free it. Fix the issue by resetting the variable holding the pointer. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> --- net/netfilter/ipset/ip_set_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index c5a26236a0bb..0874029cb0f2 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -1613,6 +1613,7 @@ ip_set_dump_do(struct sk_buff *skb, struct netlink_callback *cb) ((dump_type == DUMP_ALL) == !!(set->type->features & IPSET_DUMP_LAST))) { write_unlock_bh(&ip_set_ref_lock); + set = NULL; continue; } pr_debug("List set: %s\n", set->name); -- 2.39.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v6 2/8] netfilter: ipset: Fix data race between add and list header in all hash types 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 1/8] netfilter: ipset: fix a potential dump-destroy race Jozsef Kadlecsik @ 2026-05-08 20:58 ` Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 3/8] netfilter: ipset: Fix data race between add and dump " Jozsef Kadlecsik ` (6 subsequent siblings) 8 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-08 20:58 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso The "ipset list -terse" command is actually a dump operation which may run parallel with "ipset add" commands, which can trigger an internal resizing of the hash type of sets just being dumped. However, dumping just the header part of the set was not protected against underlying resizing. Fix it by protecting the header dumping part as well. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> --- net/netfilter/ipset/ip_set_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 0874029cb0f2..3706b4a85a0f 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -1649,13 +1649,13 @@ ip_set_dump_do(struct sk_buff *skb, struct netlink_callback *cb) if (cb->args[IPSET_CB_PROTO] > IPSET_PROTOCOL_MIN && nla_put_net16(skb, IPSET_ATTR_INDEX, htons(index))) goto nla_put_failure; + if (set->variant->uref) + set->variant->uref(set, cb, true); ret = set->variant->head(set, skb); if (ret < 0) goto release_refcount; if (dump_flags & IPSET_FLAG_LIST_HEADER) goto next_set; - if (set->variant->uref) - set->variant->uref(set, cb, true); fallthrough; default: ret = set->variant->list(set, skb, cb); -- 2.39.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v6 3/8] netfilter: ipset: Fix data race between add and dump in all hash types 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 1/8] netfilter: ipset: fix a potential dump-destroy race Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 2/8] netfilter: ipset: Fix data race between add and list header in all hash types Jozsef Kadlecsik @ 2026-05-08 20:58 ` Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 4/8] netfilter: ipset: annotate "pos" for concurrent readers/writers Jozsef Kadlecsik ` (5 subsequent siblings) 8 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-08 20:58 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso When adding a new entry to the next position in the existing hash bucket, the position index was incremented too early and parallel dump could read it before the entry was populated with the value. Move the setting of the position index after populating the entry. v2: Position counting fixed, noticed by Florian Westphal. Reported-by: syzbot+786c889f046e8b003ca6@syzkaller.appspotmail.com Reported-by: syzbot+1da17e4b41d795df059e@syzkaller.appspotmail.com Reported-by: syzbot+421c5f3ff8e9493084d9@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> --- net/netfilter/ipset/ip_set_hash_gen.h | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index b79e5dd2af03..133ce4611eed 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -844,7 +844,7 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, const struct mtype_elem *d = value; struct mtype_elem *data; struct hbucket *n, *old = ERR_PTR(-ENOENT); - int i, j = -1, ret; + int i, j = -1, npos = 0, ret; bool flag_exist = flags & IPSET_FLAG_EXIST; bool deleted = false, forceadd = false, reuse = false; u32 r, key, multi = 0, elements, maxelem; @@ -889,6 +889,7 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, ext_size(AHASH_INIT_SIZE, set->dsize); goto copy_elem; } + npos = n->pos; for (i = 0; i < n->pos; i++) { if (!test_bit(i, n->used)) { /* Reuse first deleted entry */ @@ -962,7 +963,8 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, } copy_elem: - j = n->pos++; + j = npos; + npos = n->pos + 1; data = ahash_data(n, j, set->dsize); copy_data: t->hregion[r].elements++; @@ -985,6 +987,7 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, if (SET_WITH_TIMEOUT(set)) ip_set_timeout_set(ext_timeout(data, set), ext->timeout); smp_mb__before_atomic(); + n->pos = npos; set_bit(j, n->used); if (old != ERR_PTR(-ENOENT)) { rcu_assign_pointer(hbucket(t, key), n); -- 2.39.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v6 4/8] netfilter: ipset: annotate "pos" for concurrent readers/writers 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik ` (2 preceding siblings ...) 2026-05-08 20:58 ` [PATCH v6 3/8] netfilter: ipset: Fix data race between add and dump " Jozsef Kadlecsik @ 2026-05-08 20:58 ` Jozsef Kadlecsik 2026-05-08 20:59 ` [PATCH v6 5/8] netfilter: ipset: Don't use test_bit() in lockless RCU readers Jozsef Kadlecsik ` (4 subsequent siblings) 8 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-08 20:58 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso The "pos" structure member of struct hbucket stores the first free slot in the hash bucket of a hash type of set and there are concurrent readers/writers. Annotate accesses properly. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> --- net/netfilter/ipset/ip_set_hash_gen.h | 62 ++++++++++++++++----------- 1 file changed, 38 insertions(+), 24 deletions(-) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 133ce4611eed..04e4627ddfc1 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -386,8 +386,9 @@ static void mtype_ext_cleanup(struct ip_set *set, struct hbucket *n) { int i; + u8 pos = smp_load_acquire(&n->pos); - for (i = 0; i < n->pos; i++) + for (i = 0; i < pos; i++) if (test_bit(i, n->used)) ip_set_ext_destroy(set, ahash_data(n, i, set->dsize)); } @@ -490,7 +491,7 @@ mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r) #ifdef IP_SET_HASH_WITH_NETS u8 k; #endif - u8 htable_bits = t->htable_bits; + u8 pos, htable_bits = t->htable_bits; spin_lock_bh(&t->hregion[r].lock); for (i = ahash_bucket_start(r, htable_bits); @@ -498,7 +499,8 @@ mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r) n = __ipset_dereference(hbucket(t, i)); if (!n) continue; - for (j = 0, d = 0; j < n->pos; j++) { + pos = smp_load_acquire(&n->pos); + for (j = 0, d = 0; j < pos; j++) { if (!test_bit(j, n->used)) { d++; continue; @@ -534,7 +536,7 @@ mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r) /* Still try to delete expired elements. */ continue; tmp->size = n->size - AHASH_INIT_SIZE; - for (j = 0, d = 0; j < n->pos; j++) { + for (j = 0, d = 0; j < pos; j++) { if (!test_bit(j, n->used)) continue; data = ahash_data(n, j, dsize); @@ -623,7 +625,7 @@ mtype_resize(struct ip_set *set, bool retried) { struct htype *h = set->data; struct htable *t, *orig; - u8 htable_bits; + u8 pos, htable_bits; size_t hsize, dsize = set->dsize; #ifdef IP_SET_HASH_WITH_NETS u8 flags; @@ -685,7 +687,8 @@ mtype_resize(struct ip_set *set, bool retried) n = __ipset_dereference(hbucket(orig, i)); if (!n) continue; - for (j = 0; j < n->pos; j++) { + pos = smp_load_acquire(&n->pos); + for (j = 0; j < pos; j++) { if (!test_bit(j, n->used)) continue; data = ahash_data(n, j, dsize); @@ -809,9 +812,10 @@ mtype_ext_size(struct ip_set *set, u32 *elements, size_t *ext_size) { struct htype *h = set->data; const struct htable *t; - u32 i, j, r; struct hbucket *n; struct mtype_elem *data; + u32 i, j, r; + u8 pos; t = rcu_dereference_bh(h->table); for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) { @@ -820,7 +824,8 @@ mtype_ext_size(struct ip_set *set, u32 *elements, size_t *ext_size) n = rcu_dereference_bh(hbucket(t, i)); if (!n) continue; - for (j = 0; j < n->pos; j++) { + pos = smp_load_acquire(&n->pos); + for (j = 0; j < pos; j++) { if (!test_bit(j, n->used)) continue; data = ahash_data(n, j, set->dsize); @@ -844,10 +849,11 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, const struct mtype_elem *d = value; struct mtype_elem *data; struct hbucket *n, *old = ERR_PTR(-ENOENT); - int i, j = -1, npos = 0, ret; + int i, j = -1, ret; bool flag_exist = flags & IPSET_FLAG_EXIST; bool deleted = false, forceadd = false, reuse = false; u32 r, key, multi = 0, elements, maxelem; + u8 npos = 0; rcu_read_lock_bh(); t = rcu_dereference_bh(h->table); @@ -889,8 +895,8 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, ext_size(AHASH_INIT_SIZE, set->dsize); goto copy_elem; } - npos = n->pos; - for (i = 0; i < n->pos; i++) { + npos = smp_load_acquire(&n->pos); + for (i = 0; i < npos; i++) { if (!test_bit(i, n->used)) { /* Reuse first deleted entry */ if (j == -1) { @@ -934,7 +940,7 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, if (elements >= maxelem) goto set_full; /* Create a new slot */ - if (n->pos >= n->size) { + if (npos >= n->size) { #ifdef IP_SET_HASH_WITH_MULTI if (h->bucketsize >= AHASH_MAX_TUNED) goto set_full; @@ -963,8 +969,7 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, } copy_elem: - j = npos; - npos = n->pos + 1; + j = npos++; data = ahash_data(n, j, set->dsize); copy_data: t->hregion[r].elements++; @@ -987,7 +992,8 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, if (SET_WITH_TIMEOUT(set)) ip_set_timeout_set(ext_timeout(data, set), ext->timeout); smp_mb__before_atomic(); - n->pos = npos; + /* Ensure all data writes are visible before updating position */ + smp_store_release(&n->pos, npos); set_bit(j, n->used); if (old != ERR_PTR(-ENOENT)) { rcu_assign_pointer(hbucket(t, key), n); @@ -1046,6 +1052,7 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, int i, j, k, r, ret = -IPSET_ERR_EXIST; u32 key, multi = 0; size_t dsize = set->dsize; + u8 pos; /* Userspace add and resize is excluded by the mutex. * Kernespace add does not trigger resize. @@ -1061,7 +1068,8 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, n = rcu_dereference_bh(hbucket(t, key)); if (!n) goto out; - for (i = 0, k = 0; i < n->pos; i++) { + pos = smp_load_acquire(&n->pos); + for (i = 0, k = 0; i < pos; i++) { if (!test_bit(i, n->used)) { k++; continue; @@ -1075,8 +1083,8 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, ret = 0; clear_bit(i, n->used); smp_mb__after_atomic(); - if (i + 1 == n->pos) - n->pos--; + if (i + 1 == pos) + smp_store_release(&n->pos, --pos); t->hregion[r].elements--; #ifdef IP_SET_HASH_WITH_NETS for (j = 0; j < IPSET_NET_COUNT; j++) @@ -1097,11 +1105,11 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, x->flags = flags; } } - for (; i < n->pos; i++) { + for (; i < pos; i++) { if (!test_bit(i, n->used)) k++; } - if (k == n->pos) { + if (k == pos) { t->hregion[r].ext_size -= ext_size(n->size, dsize); rcu_assign_pointer(hbucket(t, key), NULL); kfree_rcu(n, rcu); @@ -1112,7 +1120,7 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, if (!tmp) goto out; tmp->size = n->size - AHASH_INIT_SIZE; - for (j = 0, k = 0; j < n->pos; j++) { + for (j = 0, k = 0; j < pos; j++) { if (!test_bit(j, n->used)) continue; data = ahash_data(n, j, dsize); @@ -1173,6 +1181,7 @@ mtype_test_cidrs(struct ip_set *set, struct mtype_elem *d, int ret, i, j = 0; #endif u32 key, multi = 0; + u8 pos; pr_debug("test by nets\n"); for (; j < NLEN && h->nets[j].cidr[0] && !multi; j++) { @@ -1190,7 +1199,8 @@ mtype_test_cidrs(struct ip_set *set, struct mtype_elem *d, n = rcu_dereference_bh(hbucket(t, key)); if (!n) continue; - for (i = 0; i < n->pos; i++) { + pos = smp_load_acquire(&n->pos); + for (i = 0; i < pos; i++) { if (!test_bit(i, n->used)) continue; data = ahash_data(n, i, set->dsize); @@ -1224,6 +1234,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext, struct mtype_elem *data; int i, ret = 0; u32 key, multi = 0; + u8 pos; rcu_read_lock_bh(); t = rcu_dereference_bh(h->table); @@ -1246,7 +1257,8 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext, ret = 0; goto out; } - for (i = 0; i < n->pos; i++) { + pos = smp_load_acquire(&n->pos); + for (i = 0; i < pos; i++) { if (!test_bit(i, n->used)) continue; data = ahash_data(n, i, set->dsize); @@ -1363,6 +1375,7 @@ mtype_list(const struct ip_set *set, /* We assume that one hash bucket fills into one page */ void *incomplete; int i, ret = 0; + u8 pos; atd = nla_nest_start(skb, IPSET_ATTR_ADT); if (!atd) @@ -1381,7 +1394,8 @@ mtype_list(const struct ip_set *set, cb->args[IPSET_CB_ARG0], t, n); if (!n) continue; - for (i = 0; i < n->pos; i++) { + pos = smp_load_acquire(&n->pos); + for (i = 0; i < pos; i++) { if (!test_bit(i, n->used)) continue; e = ahash_data(n, i, set->dsize); -- 2.39.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v6 5/8] netfilter: ipset: Don't use test_bit() in lockless RCU readers 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik ` (3 preceding siblings ...) 2026-05-08 20:58 ` [PATCH v6 4/8] netfilter: ipset: annotate "pos" for concurrent readers/writers Jozsef Kadlecsik @ 2026-05-08 20:59 ` Jozsef Kadlecsik 2026-05-08 20:59 ` [PATCH v6 6/8] netfilter: ipset: fix potential torn read in reuse/forceadd cases Jozsef Kadlecsik ` (3 subsequent siblings) 8 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-08 20:59 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Sashiko pointed out that there are a few lockless RCU readers using test_bit() which is a relaxed atomic operation and provides no memory barrier guarantees. Use test_bit_acquire() instead where the operation may run parallel with add/del/gc, i.e. is not one from the next cases - protected by region lock - in a set destroy phase - in a new/temporary set creation phase Also, add two missing smp_mb__after_atomic() operations. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> --- net/netfilter/ipset/ip_set_hash_gen.h | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 04e4627ddfc1..6a31f2db824a 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -689,7 +689,7 @@ mtype_resize(struct ip_set *set, bool retried) continue; pos = smp_load_acquire(&n->pos); for (j = 0; j < pos; j++) { - if (!test_bit(j, n->used)) + if (!test_bit_acquire(j, n->used)) continue; data = ahash_data(n, j, dsize); if (SET_ELEM_EXPIRED(set, data)) @@ -826,7 +826,7 @@ mtype_ext_size(struct ip_set *set, u32 *elements, size_t *ext_size) continue; pos = smp_load_acquire(&n->pos); for (j = 0; j < pos; j++) { - if (!test_bit(j, n->used)) + if (!test_bit_acquire(j, n->used)) continue; data = ahash_data(n, j, set->dsize); if (!SET_ELEM_EXPIRED(set, data)) @@ -995,6 +995,7 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, /* Ensure all data writes are visible before updating position */ smp_store_release(&n->pos, npos); set_bit(j, n->used); + smp_mb__after_atomic(); if (old != ERR_PTR(-ENOENT)) { rcu_assign_pointer(hbucket(t, key), n); if (old) @@ -1201,7 +1202,7 @@ mtype_test_cidrs(struct ip_set *set, struct mtype_elem *d, continue; pos = smp_load_acquire(&n->pos); for (i = 0; i < pos; i++) { - if (!test_bit(i, n->used)) + if (!test_bit_acquire(i, n->used)) continue; data = ahash_data(n, i, set->dsize); if (!mtype_data_equal(data, d, &multi)) @@ -1259,7 +1260,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext, } pos = smp_load_acquire(&n->pos); for (i = 0; i < pos; i++) { - if (!test_bit(i, n->used)) + if (!test_bit_acquire(i, n->used)) continue; data = ahash_data(n, i, set->dsize); if (!mtype_data_equal(data, d, &multi)) @@ -1396,7 +1397,7 @@ mtype_list(const struct ip_set *set, continue; pos = smp_load_acquire(&n->pos); for (i = 0; i < pos; i++) { - if (!test_bit(i, n->used)) + if (!test_bit_acquire(i, n->used)) continue; e = ahash_data(n, i, set->dsize); if (SET_ELEM_EXPIRED(set, e)) -- 2.39.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v6 6/8] netfilter: ipset: fix potential torn read in reuse/forceadd cases 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik ` (4 preceding siblings ...) 2026-05-08 20:59 ` [PATCH v6 5/8] netfilter: ipset: Don't use test_bit() in lockless RCU readers Jozsef Kadlecsik @ 2026-05-08 20:59 ` Jozsef Kadlecsik 2026-05-08 20:59 ` [PATCH v6 7/8] netfilter: ipset: skip gc when resize is in progress Jozsef Kadlecsik ` (2 subsequent siblings) 8 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-08 20:59 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Sashiko pointed out that due to using memcpy() to overwrite already existing entry in reuse/forceadd cases, it can lead to torn reads for concurrent lockless RCU readers. Set the element explicitly to unused before reusing it. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> --- net/netfilter/ipset/ip_set_hash_gen.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 6a31f2db824a..377b4be9e4d5 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -926,6 +926,8 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, j = 0; data = ahash_data(n, j, set->dsize); if (!deleted) { + clear_bit(j, n->used); + smp_mb__after_atomic(); #ifdef IP_SET_HASH_WITH_NETS for (i = 0; i < IPSET_NET_COUNT; i++) mtype_del_cidr(set, h, -- 2.39.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v6 7/8] netfilter: ipset: skip gc when resize is in progress 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik ` (5 preceding siblings ...) 2026-05-08 20:59 ` [PATCH v6 6/8] netfilter: ipset: fix potential torn read in reuse/forceadd cases Jozsef Kadlecsik @ 2026-05-08 20:59 ` Jozsef Kadlecsik 2026-05-08 20:59 ` [PATCH v6 8/8] netfilter: ipset: fix order of usage counters Jozsef Kadlecsik 2026-05-09 8:09 ` [PATCH v6 0/8] netfilter: ipset fixes Florian Westphal 8 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-08 20:59 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Zhengchuan Liang reported that because resize does not copy the comment extension into the resized set but uses it's pointer, ongoing gc can free the extension in the original set which then results stale pointer in the resized one. The proposed patch was to recreate the extensions for every element in the resized set. It is both expensive and wastes memory, so better skip gc when resizing in progress detected: resizing will destroy the original set anyway, so doing gc on it unnecessary. Reported by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> --- net/netfilter/ipset/ip_set_hash_gen.h | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 377b4be9e4d5..71b57c731dcb 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -501,6 +501,8 @@ mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r) continue; pos = smp_load_acquire(&n->pos); for (j = 0, d = 0; j < pos; j++) { + if (atomic_read(&t->ref)) + goto resize_in_progress; if (!test_bit(j, n->used)) { d++; continue; @@ -552,6 +554,7 @@ mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r) kfree_rcu(n, rcu); } } +resize_in_progress: spin_unlock_bh(&t->hregion[r].lock); } @@ -672,7 +675,10 @@ mtype_resize(struct ip_set *set, bool retried) spin_lock_init(&t->hregion[i].lock); /* There can't be another parallel resizing, - * but dumping, gc, kernel side add/del are possible + * but dumping, kernel side add/del are possible. + * gc must detect ongoing resize when comments are in use + * in order not to free the comment extension area shared + * between the original and resized sets. */ orig = ipset_dereference_bh_nfnl(h->table); atomic_set(&orig->ref, 1); -- 2.39.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v6 8/8] netfilter: ipset: fix order of usage counters 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik ` (6 preceding siblings ...) 2026-05-08 20:59 ` [PATCH v6 7/8] netfilter: ipset: skip gc when resize is in progress Jozsef Kadlecsik @ 2026-05-08 20:59 ` Jozsef Kadlecsik 2026-05-09 8:09 ` [PATCH v6 0/8] netfilter: ipset fixes Florian Westphal 8 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-08 20:59 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Eulgyu Kim reported a slab-use-after-free issue when resizing a set and gc runs in parallel. Resizing may run parallel with already running gc or gc can start but notice that resizing started. The operation which finishes last must destroy the original set. The logic for the testing is: "I was the last user of the set and it was resized". However setting the counters in resizing was: "the set will be resized and I'm going to use the set". That created a small racing window at the testing phase. Fix the order in the resizing functions. Reported by: Eulgyu Kim <eulgyukim@snu.ac.kr> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> --- net/netfilter/ipset/ip_set_hash_gen.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 71b57c731dcb..023a3d7aeba0 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -681,8 +681,9 @@ mtype_resize(struct ip_set *set, bool retried) * between the original and resized sets. */ orig = ipset_dereference_bh_nfnl(h->table); - atomic_set(&orig->ref, 1); atomic_inc(&orig->uref); + smp_mb__after_atomic(); + atomic_set(&orig->ref, 1); pr_debug("attempt to resize set %s from %u to %u, t %p\n", set->name, orig->htable_bits, htable_bits, orig); for (r = 0; r < ahash_numof_locks(orig->htable_bits); r++) { @@ -799,6 +800,7 @@ mtype_resize(struct ip_set *set, bool retried) cleanup: rcu_read_unlock_bh(); atomic_set(&orig->ref, 0); + smp_mb__before_atomic(); atomic_dec(&orig->uref); mtype_ahash_destroy(set, t, false); if (ret == -EAGAIN) -- 2.39.5 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v6 0/8] netfilter: ipset fixes 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik ` (7 preceding siblings ...) 2026-05-08 20:59 ` [PATCH v6 8/8] netfilter: ipset: fix order of usage counters Jozsef Kadlecsik @ 2026-05-09 8:09 ` Florian Westphal 2026-05-10 21:43 ` Pablo Neira Ayuso 8 siblings, 1 reply; 12+ messages in thread From: Florian Westphal @ 2026-05-09 8:09 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel, Pablo Neira Ayuso Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > Hi Pablo, > > Here follows the new revision of the fixes for the current list > of ipset related issues. If sashiko won't find any issues in > the patches themselves, then please consider applying them. I think it would make sense to start taking the first 4 patches so we make some progress here and Jozsef doesn't have to respin all patches. What do you think? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v6 0/8] netfilter: ipset fixes 2026-05-09 8:09 ` [PATCH v6 0/8] netfilter: ipset fixes Florian Westphal @ 2026-05-10 21:43 ` Pablo Neira Ayuso 2026-05-11 7:45 ` Jozsef Kadlecsik 0 siblings, 1 reply; 12+ messages in thread From: Pablo Neira Ayuso @ 2026-05-10 21:43 UTC (permalink / raw) To: Florian Westphal; +Cc: Jozsef Kadlecsik, netfilter-devel On Sat, May 09, 2026 at 10:09:01AM +0200, Florian Westphal wrote: > Jozsef Kadlecsik <kadlec@netfilter.org> wrote: > > Hi Pablo, > > > > Here follows the new revision of the fixes for the current list > > of ipset related issues. If sashiko won't find any issues in > > the patches themselves, then please consider applying them. > > I think it would make sense to start taking the first 4 patches so we > make some progress here and Jozsef doesn't have to respin all patches. > > What do you think? Agreed. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v6 0/8] netfilter: ipset fixes 2026-05-10 21:43 ` Pablo Neira Ayuso @ 2026-05-11 7:45 ` Jozsef Kadlecsik 0 siblings, 0 replies; 12+ messages in thread From: Jozsef Kadlecsik @ 2026-05-11 7:45 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: Florian Westphal, Jozsef Kadlecsik, netfilter-devel Hi, On Sun, 10 May 2026, Pablo Neira Ayuso wrote: > On Sat, May 09, 2026 at 10:09:01AM +0200, Florian Westphal wrote: >> Jozsef Kadlecsik <kadlec@netfilter.org> wrote: >>> Hi Pablo, >>> >>> Here follows the new revision of the fixes for the current list >>> of ipset related issues. If sashiko won't find any issues in >>> the patches themselves, then please consider applying them. >> >> I think it would make sense to start taking the first 4 patches so we >> make some progress here and Jozsef doesn't have to respin all patches. >> >> What do you think? > > Agreed. Thanks! The patch "fix a potential dump-destroy race" definitely needs improvement and I'll check sashiko comments as well. Best regards, Jozsef ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-05-11 7:46 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-08 20:58 [PATCH v6 0/8] netfilter: ipset fixes Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 1/8] netfilter: ipset: fix a potential dump-destroy race Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 2/8] netfilter: ipset: Fix data race between add and list header in all hash types Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 3/8] netfilter: ipset: Fix data race between add and dump " Jozsef Kadlecsik 2026-05-08 20:58 ` [PATCH v6 4/8] netfilter: ipset: annotate "pos" for concurrent readers/writers Jozsef Kadlecsik 2026-05-08 20:59 ` [PATCH v6 5/8] netfilter: ipset: Don't use test_bit() in lockless RCU readers Jozsef Kadlecsik 2026-05-08 20:59 ` [PATCH v6 6/8] netfilter: ipset: fix potential torn read in reuse/forceadd cases Jozsef Kadlecsik 2026-05-08 20:59 ` [PATCH v6 7/8] netfilter: ipset: skip gc when resize is in progress Jozsef Kadlecsik 2026-05-08 20:59 ` [PATCH v6 8/8] netfilter: ipset: fix order of usage counters Jozsef Kadlecsik 2026-05-09 8:09 ` [PATCH v6 0/8] netfilter: ipset fixes Florian Westphal 2026-05-10 21:43 ` Pablo Neira Ayuso 2026-05-11 7:45 ` Jozsef Kadlecsik
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.