* [PATCH net v2 0/3] netfilter: updates for net
@ 2026-03-05 12:26 Florian Westphal
2026-03-05 12:26 ` [PATCH net v2 1/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion Florian Westphal
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Florian Westphal @ 2026-03-05 12:26 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
[ v2: remove patch 1 from series, no other changes ]
Hi,
The following patchset contains Netfilter fixes for *net*:
1) Inseo An reported a bug with the set element handling in nf_tables:
When set cannot accept more elements, we unlink and immediately free
an element that was inserted into a public data structure, freeing it
without waiting for RCU grace period. Fix this by doing the
increment earlier and by deferring possible unlink-and-free to the
existing abort path, which performs the needed synchronize_rcu before
free. From Pablo Neira Ayuso. This is an ancient bug, dating back to
kernel 4.10.
2) syzbot reported WARN_ON() splat in nf_tables that occurs on memory
allocation failure. Fix this by a new iterator annotation:
The affected walker does not need to clone the data structure and
can just use the live version if no clone exists yet.
Also from Pablo. This bug existed since 6.10 days.
3) Ancient forever bug in nft_pipapo data structure:
The garbage collection logic to remove expired elements is broken.
We must unlink from data structure and can only hand the freeing
to call_rcu after the clone/live pointers of the data structures
have been swapped. Else, readers can observe the free'd element.
Reported by Yiming Qian.
Please, pull these changes from:
The following changes since commit b824c3e16c1904bf80df489e293d1e3cbf98896d:
net: Provide a PREEMPT_RT specific check for netdev_queue::_xmit_lock (2026-03-05 12:14:21 +0100)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git tags/nf-26-03-05
for you to fetch changes up to 9df95785d3d8302f7c066050117b04cd3c2048c2:
netfilter: nft_set_pipapo: split gc into unlink and reclaim phase (2026-03-05 13:22:37 +0100)
----------------------------------------------------------------
netfilter pull request nf-26-03-05
----------------------------------------------------------------
Florian Westphal (1):
netfilter: nft_set_pipapo: split gc into unlink and reclaim phase
Pablo Neira Ayuso (2):
netfilter: nf_tables: unconditionally bump set->nelems before insertion
netfilter: nf_tables: clone set on flush only
include/net/netfilter/nf_tables.h | 7 ++++
net/netfilter/nf_tables_api.c | 45 ++++++++++++----------
net/netfilter/nft_set_hash.c | 1 +
net/netfilter/nft_set_pipapo.c | 62 ++++++++++++++++++++++++++-----
net/netfilter/nft_set_pipapo.h | 2 +
net/netfilter/nft_set_rbtree.c | 8 ++--
6 files changed, 92 insertions(+), 33 deletions(-)
--
2.52.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH net v2 1/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion
2026-03-05 12:26 [PATCH net v2 0/3] netfilter: updates for net Florian Westphal
@ 2026-03-05 12:26 ` Florian Westphal
2026-03-05 16:00 ` patchwork-bot+netdevbpf
2026-03-05 12:26 ` [PATCH net v2 2/3] netfilter: nf_tables: clone set on flush only Florian Westphal
2026-03-05 12:26 ` [PATCH net v2 3/3] netfilter: nft_set_pipapo: split gc into unlink and reclaim phase Florian Westphal
2 siblings, 1 reply; 5+ messages in thread
From: Florian Westphal @ 2026-03-05 12:26 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Pablo Neira Ayuso <pablo@netfilter.org>
In case that the set is full, a new element gets published then removed
without waiting for the RCU grace period, while RCU reader can be
walking over it already.
To address this issue, add the element transaction even if set is full,
but toggle the set_full flag to report -ENFILE so the abort path safely
unwinds the set to its previous state.
As for element updates, decrement set->nelems to restore it.
A simpler fix is to call synchronize_rcu() in the error path.
However, with a large batch adding elements to already maxed-out set,
this could cause noticeable slowdown of such batches.
Fixes: 35d0ac9070ef ("netfilter: nf_tables: fix set->nelems counting with no NLM_F_EXCL")
Reported-by: Inseo An <y0un9sa@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 30 ++++++++++++++++--------------
1 file changed, 16 insertions(+), 14 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index fd7f7e4e2a43..df67932d3e09 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -7170,6 +7170,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
struct nft_data_desc desc;
enum nft_registers dreg;
struct nft_trans *trans;
+ bool set_full = false;
u64 expiration;
u64 timeout;
int err, i;
@@ -7461,10 +7462,18 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
if (err < 0)
goto err_elem_free;
+ if (!(flags & NFT_SET_ELEM_CATCHALL)) {
+ unsigned int max = nft_set_maxsize(set), nelems;
+
+ nelems = atomic_inc_return(&set->nelems);
+ if (nelems > max)
+ set_full = true;
+ }
+
trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set);
if (trans == NULL) {
err = -ENOMEM;
- goto err_elem_free;
+ goto err_set_size;
}
ext->genmask = nft_genmask_cur(ctx->net);
@@ -7516,7 +7525,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
ue->priv = elem_priv;
nft_trans_commit_list_add_elem(ctx->net, trans);
- goto err_elem_free;
+ goto err_set_size;
}
}
}
@@ -7534,23 +7543,16 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
goto err_element_clash;
}
- if (!(flags & NFT_SET_ELEM_CATCHALL)) {
- unsigned int max = nft_set_maxsize(set);
-
- if (!atomic_add_unless(&set->nelems, 1, max)) {
- err = -ENFILE;
- goto err_set_full;
- }
- }
-
nft_trans_container_elem(trans)->elems[0].priv = elem.priv;
nft_trans_commit_list_add_elem(ctx->net, trans);
- return 0;
-err_set_full:
- nft_setelem_remove(ctx->net, set, elem.priv);
+ return set_full ? -ENFILE : 0;
+
err_element_clash:
kfree(trans);
+err_set_size:
+ if (!(flags & NFT_SET_ELEM_CATCHALL))
+ atomic_dec(&set->nelems);
err_elem_free:
nf_tables_set_elem_destroy(ctx, set, elem.priv);
err_parse_data:
--
2.52.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH net v2 2/3] netfilter: nf_tables: clone set on flush only
2026-03-05 12:26 [PATCH net v2 0/3] netfilter: updates for net Florian Westphal
2026-03-05 12:26 ` [PATCH net v2 1/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion Florian Westphal
@ 2026-03-05 12:26 ` Florian Westphal
2026-03-05 12:26 ` [PATCH net v2 3/3] netfilter: nft_set_pipapo: split gc into unlink and reclaim phase Florian Westphal
2 siblings, 0 replies; 5+ messages in thread
From: Florian Westphal @ 2026-03-05 12:26 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Pablo Neira Ayuso <pablo@netfilter.org>
Syzbot with fault injection triggered a failing memory allocation with
GFP_KERNEL which results in a WARN splat:
iter.err
WARNING: net/netfilter/nf_tables_api.c:845 at nft_map_deactivate+0x34e/0x3c0 net/netfilter/nf_tables_api.c:845, CPU#0: syz.0.17/5992
Modules linked in:
CPU: 0 UID: 0 PID: 5992 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
RIP: 0010:nft_map_deactivate+0x34e/0x3c0 net/netfilter/nf_tables_api.c:845
Code: 8b 05 86 5a 4e 09 48 3b 84 24 a0 00 00 00 75 62 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc e8 63 6d fa f7 90 <0f> 0b 90 43
+80 7c 35 00 00 0f 85 23 fe ff ff e9 26 fe ff ff 89 d9
RSP: 0018:ffffc900045af780 EFLAGS: 00010293
RAX: ffffffff89ca45bd RBX: 00000000fffffff4 RCX: ffff888028111e40
RDX: 0000000000000000 RSI: 00000000fffffff4 RDI: 0000000000000000
RBP: ffffc900045af870 R08: 0000000000400dc0 R09: 00000000ffffffff
R10: dffffc0000000000 R11: fffffbfff1d141db R12: ffffc900045af7e0
R13: 1ffff920008b5f24 R14: dffffc0000000000 R15: ffffc900045af920
FS: 000055557a6a5500(0000) GS:ffff888125496000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb5ea271fc0 CR3: 000000003269e000 CR4: 00000000003526f0
Call Trace:
<TASK>
__nft_release_table+0xceb/0x11f0 net/netfilter/nf_tables_api.c:12115
nft_rcv_nl_event+0xc25/0xdb0 net/netfilter/nf_tables_api.c:12187
notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85
blocking_notifier_call_chain+0x6a/0x90 kernel/notifier.c:380
netlink_release+0x123b/0x1ad0 net/netlink/af_netlink.c:761
__sock_release net/socket.c:662 [inline]
sock_close+0xc3/0x240 net/socket.c:1455
Restrict set clone to the flush set command in the preparation phase.
Add NFT_ITER_UPDATE_CLONE and use it for this purpose, update the rbtree
and pipapo backends to only clone the set when this iteration type is
used.
As for the existing NFT_ITER_UPDATE type, update the pipapo backend to
use the existing set clone if available, otherwise use the existing set
representation. After this update, there is no need to clone a set that
is being deleted, this includes bound anonymous set.
An alternative approach to NFT_ITER_UPDATE_CLONE is to add a .clone
interface and call it from the flush set path.
Reported-by: syzbot+4924a0edc148e8b4b342@syzkaller.appspotmail.com
Fixes: 3f1d886cc7c3 ("netfilter: nft_set_pipapo: move cloning of match info to insert/removal path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/net/netfilter/nf_tables.h | 2 ++
net/netfilter/nf_tables_api.c | 10 +++++++++-
net/netfilter/nft_set_hash.c | 1 +
net/netfilter/nft_set_pipapo.c | 11 +++++++++--
net/netfilter/nft_set_rbtree.c | 8 +++++---
5 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 426534a711b0..ea6f29ad7888 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -320,11 +320,13 @@ static inline void *nft_elem_priv_cast(const struct nft_elem_priv *priv)
* @NFT_ITER_UNSPEC: unspecified, to catch errors
* @NFT_ITER_READ: read-only iteration over set elements
* @NFT_ITER_UPDATE: iteration under mutex to update set element state
+ * @NFT_ITER_UPDATE_CLONE: clone set before iteration under mutex to update element
*/
enum nft_iter_type {
NFT_ITER_UNSPEC,
NFT_ITER_READ,
NFT_ITER_UPDATE,
+ NFT_ITER_UPDATE_CLONE,
};
struct nft_set;
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index df67932d3e09..058f7004cb2b 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -833,6 +833,11 @@ static void nft_map_catchall_deactivate(const struct nft_ctx *ctx,
}
}
+/* Use NFT_ITER_UPDATE iterator even if this may be called from the preparation
+ * phase, the set clone might already exist from a previous command, or it might
+ * be a set that is going away and does not require a clone. The netns and
+ * netlink release paths also need to work on the live set.
+ */
static void nft_map_deactivate(const struct nft_ctx *ctx, struct nft_set *set)
{
struct nft_set_iter iter = {
@@ -7903,9 +7908,12 @@ static int nft_set_catchall_flush(const struct nft_ctx *ctx,
static int nft_set_flush(struct nft_ctx *ctx, struct nft_set *set, u8 genmask)
{
+ /* The set backend might need to clone the set, do it now from the
+ * preparation phase, use NFT_ITER_UPDATE_CLONE iterator type.
+ */
struct nft_set_iter iter = {
.genmask = genmask,
- .type = NFT_ITER_UPDATE,
+ .type = NFT_ITER_UPDATE_CLONE,
.fn = nft_setelem_flush,
};
diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
index 739b992bde59..b0e571c8e3f3 100644
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -374,6 +374,7 @@ static void nft_rhash_walk(const struct nft_ctx *ctx, struct nft_set *set,
{
switch (iter->type) {
case NFT_ITER_UPDATE:
+ case NFT_ITER_UPDATE_CLONE:
/* only relevant for netlink dumps which use READ type */
WARN_ON_ONCE(iter->skip != 0);
diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index 7ef4b44471d3..c091898df710 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -2144,13 +2144,20 @@ static void nft_pipapo_walk(const struct nft_ctx *ctx, struct nft_set *set,
const struct nft_pipapo_match *m;
switch (iter->type) {
- case NFT_ITER_UPDATE:
+ case NFT_ITER_UPDATE_CLONE:
m = pipapo_maybe_clone(set);
if (!m) {
iter->err = -ENOMEM;
return;
}
-
+ nft_pipapo_do_walk(ctx, set, m, iter);
+ break;
+ case NFT_ITER_UPDATE:
+ if (priv->clone)
+ m = priv->clone;
+ else
+ m = rcu_dereference_protected(priv->match,
+ nft_pipapo_transaction_mutex_held(set));
nft_pipapo_do_walk(ctx, set, m, iter);
break;
case NFT_ITER_READ:
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 3f02e4478216..ee3d4f5b9ff7 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -861,13 +861,15 @@ static void nft_rbtree_walk(const struct nft_ctx *ctx,
struct nft_rbtree *priv = nft_set_priv(set);
switch (iter->type) {
- case NFT_ITER_UPDATE:
- lockdep_assert_held(&nft_pernet(ctx->net)->commit_mutex);
-
+ case NFT_ITER_UPDATE_CLONE:
if (nft_array_may_resize(set) < 0) {
iter->err = -ENOMEM;
break;
}
+ fallthrough;
+ case NFT_ITER_UPDATE:
+ lockdep_assert_held(&nft_pernet(ctx->net)->commit_mutex);
+
nft_rbtree_do_walk(ctx, set, iter);
break;
case NFT_ITER_READ:
--
2.52.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH net v2 3/3] netfilter: nft_set_pipapo: split gc into unlink and reclaim phase
2026-03-05 12:26 [PATCH net v2 0/3] netfilter: updates for net Florian Westphal
2026-03-05 12:26 ` [PATCH net v2 1/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion Florian Westphal
2026-03-05 12:26 ` [PATCH net v2 2/3] netfilter: nf_tables: clone set on flush only Florian Westphal
@ 2026-03-05 12:26 ` Florian Westphal
2 siblings, 0 replies; 5+ messages in thread
From: Florian Westphal @ 2026-03-05 12:26 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
Yiming Qian reports Use-after-free in the pipapo set type:
Under a large number of expired elements, commit-time GC can run for a very
long time in a non-preemptible context, triggering soft lockup warnings and
RCU stall reports (local denial of service).
We must split GC in an unlink and a reclaim phase.
We cannot queue elements for freeing until pointers have been swapped.
Expired elements are still exposed to both the packet path and userspace
dumpers via the live copy of the data structure.
call_rcu() does not protect us: dump operations or element lookups starting
after call_rcu has fired can still observe the free'd element, unless the
commit phase has made enough progress to swap the clone and live pointers
before any new reader has picked up the old version.
This a similar approach as done recently for the rbtree backend in commit
35f83a75529a ("netfilter: nft_set_rbtree: don't gc elements on insert").
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/net/netfilter/nf_tables.h | 5 +++
net/netfilter/nf_tables_api.c | 5 ---
net/netfilter/nft_set_pipapo.c | 51 ++++++++++++++++++++++++++-----
net/netfilter/nft_set_pipapo.h | 2 ++
4 files changed, 50 insertions(+), 13 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index ea6f29ad7888..e2d2bfc1f989 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1863,6 +1863,11 @@ struct nft_trans_gc {
struct rcu_head rcu;
};
+static inline int nft_trans_gc_space(const struct nft_trans_gc *trans)
+{
+ return NFT_TRANS_GC_BATCHCOUNT - trans->count;
+}
+
static inline void nft_ctx_update(struct nft_ctx *ctx,
const struct nft_trans *trans)
{
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 058f7004cb2b..1862bd7fe804 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -10493,11 +10493,6 @@ static void nft_trans_gc_queue_work(struct nft_trans_gc *trans)
schedule_work(&trans_gc_work);
}
-static int nft_trans_gc_space(struct nft_trans_gc *trans)
-{
- return NFT_TRANS_GC_BATCHCOUNT - trans->count;
-}
-
struct nft_trans_gc *nft_trans_gc_queue_async(struct nft_trans_gc *gc,
unsigned int gc_seq, gfp_t gfp)
{
diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index c091898df710..a34632ae6048 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -1680,11 +1680,11 @@ static void nft_pipapo_gc_deactivate(struct net *net, struct nft_set *set,
}
/**
- * pipapo_gc() - Drop expired entries from set, destroy start and end elements
+ * pipapo_gc_scan() - Drop expired entries from set and link them to gc list
* @set: nftables API set representation
* @m: Matching data
*/
-static void pipapo_gc(struct nft_set *set, struct nft_pipapo_match *m)
+static void pipapo_gc_scan(struct nft_set *set, struct nft_pipapo_match *m)
{
struct nft_pipapo *priv = nft_set_priv(set);
struct net *net = read_pnet(&set->net);
@@ -1697,6 +1697,8 @@ static void pipapo_gc(struct nft_set *set, struct nft_pipapo_match *m)
if (!gc)
return;
+ list_add(&gc->list, &priv->gc_head);
+
while ((rules_f0 = pipapo_rules_same_key(m->f, first_rule))) {
union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS];
const struct nft_pipapo_field *f;
@@ -1724,9 +1726,13 @@ static void pipapo_gc(struct nft_set *set, struct nft_pipapo_match *m)
* NFT_SET_ELEM_DEAD_BIT.
*/
if (__nft_set_elem_expired(&e->ext, tstamp)) {
- gc = nft_trans_gc_queue_sync(gc, GFP_KERNEL);
- if (!gc)
- return;
+ if (!nft_trans_gc_space(gc)) {
+ gc = nft_trans_gc_alloc(set, 0, GFP_KERNEL);
+ if (!gc)
+ return;
+
+ list_add(&gc->list, &priv->gc_head);
+ }
nft_pipapo_gc_deactivate(net, set, e);
pipapo_drop(m, rulemap);
@@ -1740,10 +1746,30 @@ static void pipapo_gc(struct nft_set *set, struct nft_pipapo_match *m)
}
}
- gc = nft_trans_gc_catchall_sync(gc);
+ priv->last_gc = jiffies;
+}
+
+/**
+ * pipapo_gc_queue() - Free expired elements
+ * @set: nftables API set representation
+ */
+static void pipapo_gc_queue(struct nft_set *set)
+{
+ struct nft_pipapo *priv = nft_set_priv(set);
+ struct nft_trans_gc *gc, *next;
+
+ /* always do a catchall cycle: */
+ gc = nft_trans_gc_alloc(set, 0, GFP_KERNEL);
if (gc) {
+ gc = nft_trans_gc_catchall_sync(gc);
+ if (gc)
+ nft_trans_gc_queue_sync_done(gc);
+ }
+
+ /* always purge queued gc elements. */
+ list_for_each_entry_safe(gc, next, &priv->gc_head, list) {
+ list_del(&gc->list);
nft_trans_gc_queue_sync_done(gc);
- priv->last_gc = jiffies;
}
}
@@ -1797,6 +1823,10 @@ static void pipapo_reclaim_match(struct rcu_head *rcu)
*
* We also need to create a new working copy for subsequent insertions and
* deletions.
+ *
+ * After the live copy has been replaced by the clone, we can safely queue
+ * expired elements that have been collected by pipapo_gc_scan() for
+ * memory reclaim.
*/
static void nft_pipapo_commit(struct nft_set *set)
{
@@ -1807,7 +1837,7 @@ static void nft_pipapo_commit(struct nft_set *set)
return;
if (time_after_eq(jiffies, priv->last_gc + nft_set_gc_interval(set)))
- pipapo_gc(set, priv->clone);
+ pipapo_gc_scan(set, priv->clone);
old = rcu_replace_pointer(priv->match, priv->clone,
nft_pipapo_transaction_mutex_held(set));
@@ -1815,6 +1845,8 @@ static void nft_pipapo_commit(struct nft_set *set)
if (old)
call_rcu(&old->rcu, pipapo_reclaim_match);
+
+ pipapo_gc_queue(set);
}
static void nft_pipapo_abort(const struct nft_set *set)
@@ -2279,6 +2311,7 @@ static int nft_pipapo_init(const struct nft_set *set,
f->mt = NULL;
}
+ INIT_LIST_HEAD(&priv->gc_head);
rcu_assign_pointer(priv->match, m);
return 0;
@@ -2328,6 +2361,8 @@ static void nft_pipapo_destroy(const struct nft_ctx *ctx,
struct nft_pipapo *priv = nft_set_priv(set);
struct nft_pipapo_match *m;
+ WARN_ON_ONCE(!list_empty(&priv->gc_head));
+
m = rcu_dereference_protected(priv->match, true);
if (priv->clone) {
diff --git a/net/netfilter/nft_set_pipapo.h b/net/netfilter/nft_set_pipapo.h
index eaab422aa56a..9aee9a9eaeb7 100644
--- a/net/netfilter/nft_set_pipapo.h
+++ b/net/netfilter/nft_set_pipapo.h
@@ -156,12 +156,14 @@ struct nft_pipapo_match {
* @clone: Copy where pending insertions and deletions are kept
* @width: Total bytes to be matched for one packet, including padding
* @last_gc: Timestamp of last garbage collection run, jiffies
+ * @gc_head: list of nft_trans_gc to queue up for mem reclaim
*/
struct nft_pipapo {
struct nft_pipapo_match __rcu *match;
struct nft_pipapo_match *clone;
int width;
unsigned long last_gc;
+ struct list_head gc_head;
};
struct nft_pipapo_elem;
--
2.52.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH net v2 1/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion
2026-03-05 12:26 ` [PATCH net v2 1/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion Florian Westphal
@ 2026-03-05 16:00 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-03-05 16:00 UTC (permalink / raw)
To: Florian Westphal
Cc: netdev, pabeni, davem, edumazet, kuba, netfilter-devel, pablo
Hello:
This series was applied to netdev/net.git (main)
by Florian Westphal <fw@strlen.de>:
On Thu, 5 Mar 2026 13:26:33 +0100 you wrote:
> From: Pablo Neira Ayuso <pablo@netfilter.org>
>
> In case that the set is full, a new element gets published then removed
> without waiting for the RCU grace period, while RCU reader can be
> walking over it already.
>
> To address this issue, add the element transaction even if set is full,
> but toggle the set_full flag to report -ENFILE so the abort path safely
> unwinds the set to its previous state.
>
> [...]
Here is the summary with links:
- [net,v2,1/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion
https://git.kernel.org/netdev/net/c/def602e498a4
- [net,v2,2/3] netfilter: nf_tables: clone set on flush only
https://git.kernel.org/netdev/net/c/fb7fb4016300
- [net,v2,3/3] netfilter: nft_set_pipapo: split gc into unlink and reclaim phase
https://git.kernel.org/netdev/net/c/9df95785d3d8
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-03-05 16:00 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-05 12:26 [PATCH net v2 0/3] netfilter: updates for net Florian Westphal
2026-03-05 12:26 ` [PATCH net v2 1/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion Florian Westphal
2026-03-05 16:00 ` patchwork-bot+netdevbpf
2026-03-05 12:26 ` [PATCH net v2 2/3] netfilter: nf_tables: clone set on flush only Florian Westphal
2026-03-05 12:26 ` [PATCH net v2 3/3] netfilter: nft_set_pipapo: split gc into unlink and reclaim phase Florian Westphal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox