public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 00/10] netfilter: updates for net
@ 2026-03-09 21:08 Florian Westphal
  2026-03-09 21:08 ` [PATCH net 01/10] netfilter: nf_tables: Fix for duplicate device in netdev hooks Florian Westphal
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

Hi,

The following patchset contains Netfilter fixes for *net*:

Due to a large influx of bug fixes on netfilter-devel@ I am sending
an earlier PR to have more time to go through the remaining patches
without getting a 20+ patch PR.

1) syzbot managed to add/remove devices to a flowtable, due to a bug in
   the flowtable netdevice notifier this gets us a double-add and
   eventually UaF when device is removed again (we only expect one
   entry, duplicate remains past net_device end-of-life).
   From Phil Sutter, bug added in 6.16.

2) Yiming Qian reports another nf_tables transaction handling bug:
   in some cases error unwind misses to undo certain set elements,
   resulting in refcount underflow and use-after-free, bug added in 6.4.

3) Jenny Guanni Qu found out-of-bounds read in pipapo set type.
   While the value is never used, it still rightfully triggers KASAN
   splats.  Bug exists since this set type was added in 5.6.

4) a few x_tables modules contain copypastry tcp option parsing code which
    can read 1 byte past the option area.  This bug is ancient, fix from
    David Dull.

5) nfnetlink_queue leaks kernel memory if userspace provides bad
   NFQA_VLAN/NFQA_L2HDR attributes.  From Hyunwoo Kim, bug stems from
   from 4.7 days.

6) nfnetlink_cthelper has incorrect loop restart logic which may result
   in reading one pointer past end of array. From 3.6 days, fix also from
   Hyunwoo Kim.

7-9) fix access bugs in the ctnetlink expectation handling.
     Problem is that while RCU prevents the referenced nf_conn entry
     from going way, nf_conn entries have an extension area that can
     only be safely accessed if the cpu holds a reference to the
     conntrack.  Else the extension area can be free'd at any time.
     Fix is to grab references before the accesses happen.
     These bugs are old, v3.10 resp. even pre-git days.
     All fixes from Hyunwoo Kim.

10) xt_IDLETIMER v0 extension must reject working with timers added
    by revision v1, else we get list corruption. Bug added in v5.7.
    From Yifan Wu, Juefei Pu and Yuan Tan via Xin Lu.

Please, pull these changes from:
The following changes since commit c113d5e32678c8de40694b738000a4a2143e2f81:

  Merge branch 'net-spacemit-a-few-error-handling-fixes' (2026-03-06 18:58:36 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git tags/nf-26-03-09

for you to fetch changes up to a29b6cda03c1a4175468953c87a6c7db8766df7e:

  netfilter: xt_IDLETIMER: reject rev0 reuse of ALARM timer labels (2026-03-09 14:40:00 +0100)

----------------------------------------------------------------
netfilter pull request nf-26-03-09

----------------------------------------------------------------
David Dull (1):
  netfilter: x_tables: guard option walkers against 1-byte tail reads

Florian Westphal (1):
  netfilter: nf_tables: always walk all pending catchall elements

Hyunwoo Kim (5):
  netfilter: nfnetlink_queue: fix entry leak in bridge verdict error path
  netfilter: nfnetlink_cthelper: fix OOB read in nfnl_cthelper_dump_table()
  netfilter: ctnetlink: fix use-after-free in ctnetlink_dump_exp_ct()
  netfilter: ctnetlink: fix use-after-free of exp->master in single expectation GET
  netfilter: ctnetlink: fix use-after-free of exp->master in expectation dump

Jenny Guanni Qu (1):
  netfilter: nft_set_pipapo: fix stack out-of-bounds read in pipapo_drop()

Phil Sutter (1):
  netfilter: nf_tables: Fix for duplicate device in netdev hooks

Yuan Tan (1):
  netfilter: xt_IDLETIMER: reject rev0 reuse of ALARM timer labels

 net/netfilter/nf_conntrack_netlink.c | 52 ++++++++++++++++++++++++++--
 net/netfilter/nf_tables_api.c        |  4 +--
 net/netfilter/nfnetlink_cthelper.c   |  8 ++---
 net/netfilter/nfnetlink_queue.c      |  4 ++-
 net/netfilter/nft_chain_filter.c     |  2 +-
 net/netfilter/nft_set_pipapo.c       |  3 +-
 net/netfilter/xt_IDLETIMER.c         |  6 ++++
 net/netfilter/xt_dccp.c              |  4 +--
 net/netfilter/xt_tcpudp.c            |  6 ++--
 9 files changed, 72 insertions(+), 17 deletions(-)

-- 
2.52.0

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net 01/10] netfilter: nf_tables: Fix for duplicate device in netdev hooks
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 02/10] netfilter: nf_tables: always walk all pending catchall elements Florian Westphal
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Phil Sutter <phil@nwl.cc>

When handling NETDEV_REGISTER notification, duplicate device
registration must be avoided since the device may have been added by
nft_netdev_hook_alloc() already when creating the hook.

Suggested-by: Florian Westphal <fw@strlen.de>
Reported-by: syzbot+bb9127e278fa198e110c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bb9127e278fa198e110c
Fixes: a331b78a5525 ("netfilter: nf_tables: Respect NETDEV_REGISTER events")
Tested-by: Helen Koike <koike@igalia.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_tables_api.c    | 2 +-
 net/netfilter/nft_chain_filter.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 1862bd7fe804..710f0ee21a34 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -9688,7 +9688,7 @@ static int nft_flowtable_event(unsigned long event, struct net_device *dev,
 			break;
 		case NETDEV_REGISTER:
 			/* NOP if not matching or already registered */
-			if (!match || (changename && ops))
+			if (!match || ops)
 				continue;
 
 			ops = kzalloc_obj(struct nf_hook_ops,
diff --git a/net/netfilter/nft_chain_filter.c b/net/netfilter/nft_chain_filter.c
index b16185e9a6dd..041426e3bdbf 100644
--- a/net/netfilter/nft_chain_filter.c
+++ b/net/netfilter/nft_chain_filter.c
@@ -344,7 +344,7 @@ static int nft_netdev_event(unsigned long event, struct net_device *dev,
 			break;
 		case NETDEV_REGISTER:
 			/* NOP if not matching or already registered */
-			if (!match || (changename && ops))
+			if (!match || ops)
 				continue;
 
 			ops = kmemdup(&basechain->ops,
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 02/10] netfilter: nf_tables: always walk all pending catchall elements
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
  2026-03-09 21:08 ` [PATCH net 01/10] netfilter: nf_tables: Fix for duplicate device in netdev hooks Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 03/10] netfilter: nft_set_pipapo: fix stack out-of-bounds read in pipapo_drop() Florian Westphal
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

During transaction processing we might have more than one catchall element:
1 live catchall element and 1 pending element that is coming as part of the
new batch.

If the map holding the catchall elements is also going away, its
required to toggle all catchall elements and not just the first viable
candidate.

Otherwise, we get:
 WARNING: ./include/net/netfilter/nf_tables.h:1281 at nft_data_release+0xb7/0xe0 [nf_tables], CPU#2: nft/1404
 RIP: 0010:nft_data_release+0xb7/0xe0 [nf_tables]
 [..]
 __nft_set_elem_destroy+0x106/0x380 [nf_tables]
 nf_tables_abort_release+0x348/0x8d0 [nf_tables]
 nf_tables_abort+0xcf2/0x3ac0 [nf_tables]
 nfnetlink_rcv_batch+0x9c9/0x20e0 [..]

Fixes: 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_tables_api.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 710f0ee21a34..dacec5f8a11c 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -829,7 +829,6 @@ static void nft_map_catchall_deactivate(const struct nft_ctx *ctx,
 
 		nft_set_elem_change_active(ctx->net, set, ext);
 		nft_setelem_data_deactivate(ctx->net, set, catchall->elem);
-		break;
 	}
 }
 
@@ -5873,7 +5872,6 @@ static void nft_map_catchall_activate(const struct nft_ctx *ctx,
 
 		nft_clear(ctx->net, ext);
 		nft_setelem_data_activate(ctx->net, set, catchall->elem);
-		break;
 	}
 }
 
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 03/10] netfilter: nft_set_pipapo: fix stack out-of-bounds read in pipapo_drop()
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
  2026-03-09 21:08 ` [PATCH net 01/10] netfilter: nf_tables: Fix for duplicate device in netdev hooks Florian Westphal
  2026-03-09 21:08 ` [PATCH net 02/10] netfilter: nf_tables: always walk all pending catchall elements Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 04/10] netfilter: x_tables: guard option walkers against 1-byte tail reads Florian Westphal
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Jenny Guanni Qu <qguanni@gmail.com>

pipapo_drop() passes rulemap[i + 1].n to pipapo_unmap() as the
to_offset argument on every iteration, including the last one where
i == m->field_count - 1. This reads one element past the end of the
stack-allocated rulemap array (declared as rulemap[NFT_PIPAPO_MAX_FIELDS]
with NFT_PIPAPO_MAX_FIELDS == 16).

Although pipapo_unmap() returns early when is_last is true without
using the to_offset value, the argument is evaluated at the call site
before the function body executes, making this a genuine out-of-bounds
stack read confirmed by KASAN:

  BUG: KASAN: stack-out-of-bounds in pipapo_drop+0x50c/0x57c [nf_tables]
  Read of size 4 at addr ffff8000810e71a4

  This frame has 1 object:
   [32, 160) 'rulemap'

  The buggy address is at offset 164 -- exactly 4 bytes past the end
  of the rulemap array.

Pass 0 instead of rulemap[i + 1].n on the last iteration to avoid
the out-of-bounds read.

Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Jenny Guanni Qu <qguanni@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nft_set_pipapo.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index a34632ae6048..7fd24e0cc428 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -1640,6 +1640,7 @@ static void pipapo_drop(struct nft_pipapo_match *m,
 	int i;
 
 	nft_pipapo_for_each_field(f, i, m) {
+		bool last = i == m->field_count - 1;
 		int g;
 
 		for (g = 0; g < f->groups; g++) {
@@ -1659,7 +1660,7 @@ static void pipapo_drop(struct nft_pipapo_match *m,
 		}
 
 		pipapo_unmap(f->mt, f->rules, rulemap[i].to, rulemap[i].n,
-			     rulemap[i + 1].n, i == m->field_count - 1);
+			     last ? 0 : rulemap[i + 1].n, last);
 		if (pipapo_resize(f, f->rules, f->rules - rulemap[i].n)) {
 			/* We can ignore this, a failure to shrink tables down
 			 * doesn't make tables invalid.
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 04/10] netfilter: x_tables: guard option walkers against 1-byte tail reads
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (2 preceding siblings ...)
  2026-03-09 21:08 ` [PATCH net 03/10] netfilter: nft_set_pipapo: fix stack out-of-bounds read in pipapo_drop() Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 05/10] netfilter: nfnetlink_queue: fix entry leak in bridge verdict error path Florian Westphal
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: David Dull <monderasdor@gmail.com>

When the last byte of options is a non-single-byte option kind, walkers
that advance with i += op[i + 1] ? : 1 can read op[i + 1] past the end
of the option area.

Add an explicit i == optlen - 1 check before dereferencing op[i + 1]
in xt_tcpudp and xt_dccp option walkers.

Fixes: 2e4e6a17af35 ("[NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables")
Signed-off-by: David Dull <monderasdor@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/xt_dccp.c   | 4 ++--
 net/netfilter/xt_tcpudp.c | 6 ++++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/xt_dccp.c b/net/netfilter/xt_dccp.c
index e5a13ecbe67a..037ab93e25d0 100644
--- a/net/netfilter/xt_dccp.c
+++ b/net/netfilter/xt_dccp.c
@@ -62,10 +62,10 @@ dccp_find_option(u_int8_t option,
 			return true;
 		}
 
-		if (op[i] < 2)
+		if (op[i] < 2 || i == optlen - 1)
 			i++;
 		else
-			i += op[i+1]?:1;
+			i += op[i + 1] ? : 1;
 	}
 
 	spin_unlock_bh(&dccp_buflock);
diff --git a/net/netfilter/xt_tcpudp.c b/net/netfilter/xt_tcpudp.c
index e8991130a3de..f76cf18f1a24 100644
--- a/net/netfilter/xt_tcpudp.c
+++ b/net/netfilter/xt_tcpudp.c
@@ -59,8 +59,10 @@ tcp_find_option(u_int8_t option,
 
 	for (i = 0; i < optlen; ) {
 		if (op[i] == option) return !invert;
-		if (op[i] < 2) i++;
-		else i += op[i+1]?:1;
+		if (op[i] < 2 || i == optlen - 1)
+			i++;
+		else
+			i += op[i + 1] ? : 1;
 	}
 
 	return invert;
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 05/10] netfilter: nfnetlink_queue: fix entry leak in bridge verdict error path
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (3 preceding siblings ...)
  2026-03-09 21:08 ` [PATCH net 04/10] netfilter: x_tables: guard option walkers against 1-byte tail reads Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 06/10] netfilter: nfnetlink_cthelper: fix OOB read in nfnl_cthelper_dump_table() Florian Westphal
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Hyunwoo Kim <imv4bel@gmail.com>

nfqnl_recv_verdict() calls find_dequeue_entry() to remove the queue
entry from the queue data structures, taking ownership of the entry.
For PF_BRIDGE packets, it then calls nfqa_parse_bridge() to parse VLAN
attributes.  If nfqa_parse_bridge() returns an error (e.g. NFQA_VLAN
present but NFQA_VLAN_TCI missing), the function returns immediately
without freeing the dequeued entry or its sk_buff.

This leaks the nf_queue_entry, its associated sk_buff, and all held
references (net_device refcounts, struct net refcount).  Repeated
triggering exhausts kernel memory.

Fix this by dropping the entry via nfqnl_reinject() with NF_DROP verdict
on the error path, consistent with other error handling in this file.

Fixes: 8d45ff22f1b4 ("netfilter: bridge: nf queue verdict to use NFQA_VLAN and NFQA_L2HDR")
Reviewed-by: David Dull <monderasdor@gmail.com>
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nfnetlink_queue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index 7f5248b5f1ee..47f7f62906e2 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -1546,8 +1546,10 @@ static int nfqnl_recv_verdict(struct sk_buff *skb, const struct nfnl_info *info,
 
 	if (entry->state.pf == PF_BRIDGE) {
 		err = nfqa_parse_bridge(entry, nfqa);
-		if (err < 0)
+		if (err < 0) {
+			nfqnl_reinject(entry, NF_DROP);
 			return err;
+		}
 	}
 
 	if (nfqa[NFQA_PAYLOAD]) {
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 06/10] netfilter: nfnetlink_cthelper: fix OOB read in nfnl_cthelper_dump_table()
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (4 preceding siblings ...)
  2026-03-09 21:08 ` [PATCH net 05/10] netfilter: nfnetlink_queue: fix entry leak in bridge verdict error path Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 07/10] netfilter: ctnetlink: fix use-after-free in ctnetlink_dump_exp_ct() Florian Westphal
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Hyunwoo Kim <imv4bel@gmail.com>

nfnl_cthelper_dump_table() has a 'goto restart' that jumps to a label
inside the for loop body.  When the "last" helper saved in cb->args[1]
is deleted between dump rounds, every entry fails the (cur != last)
check, so cb->args[1] is never cleared.  The for loop finishes with
cb->args[0] == nf_ct_helper_hsize, and the 'goto restart' jumps back
into the loop body bypassing the bounds check, causing an 8-byte
out-of-bounds read on nf_ct_helper_hash[nf_ct_helper_hsize].

The 'goto restart' block was meant to re-traverse the current bucket
when "last" is no longer found, but it was placed after the for loop
instead of inside it.  Move the block into the for loop body so that
the restart only occurs while cb->args[0] is still within bounds.

 BUG: KASAN: slab-out-of-bounds in nfnl_cthelper_dump_table+0x9f/0x1b0
 Read of size 8 at addr ffff888104ca3000 by task poc_cthelper/131
 Call Trace:
  nfnl_cthelper_dump_table+0x9f/0x1b0
  netlink_dump+0x333/0x880
  netlink_recvmsg+0x3e2/0x4b0
  sock_recvmsg+0xde/0xf0
  __sys_recvfrom+0x150/0x200
  __x64_sys_recvfrom+0x76/0x90
  do_syscall_64+0xc3/0x6e0

 Allocated by task 1:
  __kvmalloc_node_noprof+0x21b/0x700
  nf_ct_alloc_hashtable+0x65/0xd0
  nf_conntrack_helper_init+0x21/0x60
  nf_conntrack_init_start+0x18d/0x300
  nf_conntrack_standalone_init+0x12/0xc0

Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nfnetlink_cthelper.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nfnetlink_cthelper.c b/net/netfilter/nfnetlink_cthelper.c
index d658b1478fa0..d545fa459455 100644
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -601,10 +601,10 @@ nfnl_cthelper_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 				goto out;
 			}
 		}
-	}
-	if (cb->args[1]) {
-		cb->args[1] = 0;
-		goto restart;
+		if (cb->args[1]) {
+			cb->args[1] = 0;
+			goto restart;
+		}
 	}
 out:
 	rcu_read_unlock();
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 07/10] netfilter: ctnetlink: fix use-after-free in ctnetlink_dump_exp_ct()
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (5 preceding siblings ...)
  2026-03-09 21:08 ` [PATCH net 06/10] netfilter: nfnetlink_cthelper: fix OOB read in nfnl_cthelper_dump_table() Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 08/10] netfilter: ctnetlink: fix use-after-free of exp->master in single expectation GET Florian Westphal
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Hyunwoo Kim <imv4bel@gmail.com>

ctnetlink_dump_exp_ct() stores a conntrack pointer in cb->data for the
netlink dump callback ctnetlink_exp_ct_dump_table(), but drops the
conntrack reference immediately after netlink_dump_start().  When the
dump spans multiple rounds, the second recvmsg() triggers the dump
callback which dereferences the now-freed conntrack via nfct_help(ct),
leading to a use-after-free on ct->ext.

The bug is that the netlink_dump_control has no .start or .done
callbacks to manage the conntrack reference across dump rounds.  Other
dump functions in the same file (e.g. ctnetlink_get_conntrack) properly
use .start/.done callbacks for this purpose.

Fix this by adding .start and .done callbacks that hold and release the
conntrack reference for the duration of the dump, and move the
nfct_help() call after the cb->args[0] early-return check in the dump
callback to avoid dereferencing ct->ext unnecessarily.

 BUG: KASAN: slab-use-after-free in ctnetlink_exp_ct_dump_table+0x4f/0x2e0
 Read of size 8 at addr ffff88810597ebf0 by task ctnetlink_poc/133

 CPU: 1 UID: 0 PID: 133 Comm: ctnetlink_poc Not tainted 7.0.0-rc2+ #3 PREEMPTLAZY
 Call Trace:
  <TASK>
  ctnetlink_exp_ct_dump_table+0x4f/0x2e0
  netlink_dump+0x333/0x880
  netlink_recvmsg+0x3e2/0x4b0
  ? aa_sk_perm+0x184/0x450
  sock_recvmsg+0xde/0xf0

 Allocated by task 133:
  kmem_cache_alloc_noprof+0x134/0x440
  __nf_conntrack_alloc+0xa8/0x2b0
  ctnetlink_create_conntrack+0xa1/0x900
  ctnetlink_new_conntrack+0x3cf/0x7d0
  nfnetlink_rcv_msg+0x48e/0x510
  netlink_rcv_skb+0xc9/0x1f0
  nfnetlink_rcv+0xdb/0x220
  netlink_unicast+0x3ec/0x590
  netlink_sendmsg+0x397/0x690
  __sys_sendmsg+0xf4/0x180

 Freed by task 0:
  slab_free_after_rcu_debug+0xad/0x1e0
  rcu_core+0x5c3/0x9c0

 Last potentially related work creation:
  kmem_cache_free+0x1f5/0x440
  nf_conntrack_free+0xc1/0x140
  ctnetlink_del_conntrack+0x4c4/0x520
  nfnetlink_rcv_msg+0x48e/0x510
  netlink_rcv_skb+0xc9/0x1f0
  nfnetlink_rcv+0xdb/0x220
  netlink_unicast+0x3ec/0x590
  netlink_sendmsg+0x397/0x690
  __sys_sendmsg+0xf4/0x180

Fixes: e844a928431f ("netfilter: ctnetlink: allow to dump expectation per master conntrack")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_conntrack_netlink.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index c9d725fc2d71..65aa44a12d01 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3212,7 +3212,7 @@ ctnetlink_exp_ct_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct nfgenmsg *nfmsg = nlmsg_data(cb->nlh);
 	struct nf_conn *ct = cb->data;
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_conn_help *help;
 	u_int8_t l3proto = nfmsg->nfgen_family;
 	unsigned long last_id = cb->args[1];
 	struct nf_conntrack_expect *exp;
@@ -3220,6 +3220,10 @@ ctnetlink_exp_ct_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 	if (cb->args[0])
 		return 0;
 
+	help = nfct_help(ct);
+	if (!help)
+		return 0;
+
 	rcu_read_lock();
 
 restart:
@@ -3249,6 +3253,24 @@ ctnetlink_exp_ct_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
+static int ctnetlink_dump_exp_ct_start(struct netlink_callback *cb)
+{
+	struct nf_conn *ct = cb->data;
+
+	if (!refcount_inc_not_zero(&ct->ct_general.use))
+		return -ENOENT;
+	return 0;
+}
+
+static int ctnetlink_dump_exp_ct_done(struct netlink_callback *cb)
+{
+	struct nf_conn *ct = cb->data;
+
+	if (ct)
+		nf_ct_put(ct);
+	return 0;
+}
+
 static int ctnetlink_dump_exp_ct(struct net *net, struct sock *ctnl,
 				 struct sk_buff *skb,
 				 const struct nlmsghdr *nlh,
@@ -3264,6 +3286,8 @@ static int ctnetlink_dump_exp_ct(struct net *net, struct sock *ctnl,
 	struct nf_conntrack_zone zone;
 	struct netlink_dump_control c = {
 		.dump = ctnetlink_exp_ct_dump_table,
+		.start = ctnetlink_dump_exp_ct_start,
+		.done = ctnetlink_dump_exp_ct_done,
 	};
 
 	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_MASTER,
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 08/10] netfilter: ctnetlink: fix use-after-free of exp->master in single expectation GET
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (6 preceding siblings ...)
  2026-03-09 21:08 ` [PATCH net 07/10] netfilter: ctnetlink: fix use-after-free in ctnetlink_dump_exp_ct() Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 09/10] netfilter: ctnetlink: fix use-after-free of exp->master in expectation dump Florian Westphal
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Hyunwoo Kim <imv4bel@gmail.com>

ctnetlink_get_expect() in the non-dump path calls
nf_ct_expect_find_get() which only takes a reference on the expectation
itself, not on exp->master.  It then calls ctnetlink_exp_fill_info()
which dereferences exp->master extensively (tuplehash, ct->ext via
nfct_help()).

A concurrent conntrack deletion through NFNL_SUBSYS_CTNETLINK (a
different nfnetlink subsystem mutex than NFNL_SUBSYS_CTNETLINK_EXP) can
free the master conntrack while the single GET is in progress, leading
to use-after-free.  In particular, kfree(ct->ext) is immediate and not
RCU-deferred.

Fix this by taking a reference on exp->master under rcu_read_lock
(required for SLAB_TYPESAFE_BY_RCU) before calling
ctnetlink_exp_fill_info() and releasing it afterwards.

 BUG: KASAN: slab-use-after-free in ctnetlink_dump_tuples_ip+0xbc/0x1f0
 Read of size 2 at addr ffff8881042a8cb2 by task poc3/134

 CPU: 0 UID: 0 PID: 134 Comm: poc3 Not tainted 7.0.0-rc2+ #6 PREEMPTLAZY
 Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 Call Trace:
  ctnetlink_dump_tuples_ip+0xbc/0x1f0
  ctnetlink_dump_tuples+0x19/0x60
  ctnetlink_exp_dump_tuple+0x6f/0xd0
  ctnetlink_exp_dump_expect+0x315/0x660
  ctnetlink_exp_fill_info.constprop.0+0xf9/0x180
  ctnetlink_get_expect+0x2f3/0x400
  nfnetlink_rcv_msg+0x48e/0x510
  netlink_rcv_skb+0xc9/0x1f0
  nfnetlink_rcv+0xdb/0x220
  netlink_unicast+0x3ec/0x590
  netlink_sendmsg+0x397/0x690
  ___sys_sendmsg+0xfc/0x170
  __sys_sendmsg+0xf4/0x180
  do_syscall_64+0xc3/0x6e0

 Allocated by task 131:
  kmem_cache_alloc_noprof+0x134/0x440
  __nf_conntrack_alloc+0xa8/0x2b0
  ctnetlink_create_conntrack+0xa1/0x900
  ctnetlink_new_conntrack+0x3cf/0x7d0
  nfnetlink_rcv_msg+0x48e/0x510
  netlink_rcv_skb+0xc9/0x1f0
  nfnetlink_rcv+0xdb/0x220
  netlink_unicast+0x3ec/0x590
  netlink_sendmsg+0x397/0x690
  __sys_sendmsg+0xf4/0x180
  do_syscall_64+0xc3/0x6e0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

 Freed by task 0:
  __kasan_slab_free+0x43/0x70
  slab_free_after_rcu_debug+0xad/0x1e0
  rcu_core+0x5c3/0x9c0
  handle_softirqs+0x148/0x460

 Last potentially related work creation:
  kmem_cache_free+0x1f5/0x440
  nf_conntrack_free+0xc1/0x140
  ctnetlink_del_conntrack+0x4c4/0x520
  nfnetlink_rcv_msg+0x48e/0x510
  netlink_rcv_skb+0xc9/0x1f0
  nfnetlink_rcv+0xdb/0x220
  netlink_unicast+0x3ec/0x590
  netlink_sendmsg+0x397/0x690
  __sys_sendmsg+0xf4/0x180

 The buggy address belongs to the object at ffff8881042a8c80
  which belongs to the cache nf_conntrack of size 248
 The buggy address is located 50 bytes inside of
  freed 248-byte region [ffff8881042a8c80, ffff8881042a8d78)

Fixes: c1d10adb4a52 ("[NETFILTER]: Add ctnetlink port for nf_conntrack")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_conntrack_netlink.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 65aa44a12d01..10a9b98368f4 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3324,6 +3324,7 @@ static int ctnetlink_get_expect(struct sk_buff *skb,
 {
 	u_int8_t u3 = info->nfmsg->nfgen_family;
 	struct nf_conntrack_tuple tuple;
+	struct nf_conn *master;
 	struct nf_conntrack_expect *exp;
 	struct nf_conntrack_zone zone;
 	struct sk_buff *skb2;
@@ -3378,10 +3379,19 @@ static int ctnetlink_get_expect(struct sk_buff *skb,
 	}
 
 	rcu_read_lock();
+	master = exp->master;
+	if (!refcount_inc_not_zero(&master->ct_general.use)) {
+		rcu_read_unlock();
+		nf_ct_expect_put(exp);
+		kfree_skb(skb2);
+		return -ENOENT;
+	}
+
 	err = ctnetlink_exp_fill_info(skb2, NETLINK_CB(skb).portid,
 				      info->nlh->nlmsg_seq, IPCTNL_MSG_EXP_NEW,
 				      exp);
 	rcu_read_unlock();
+	nf_ct_put(master);
 	nf_ct_expect_put(exp);
 	if (err <= 0) {
 		kfree_skb(skb2);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 09/10] netfilter: ctnetlink: fix use-after-free of exp->master in expectation dump
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (7 preceding siblings ...)
  2026-03-09 21:08 ` [PATCH net 08/10] netfilter: ctnetlink: fix use-after-free of exp->master in single expectation GET Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-09 21:08 ` [PATCH net 10/10] netfilter: xt_IDLETIMER: reject rev0 reuse of ALARM timer labels Florian Westphal
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Hyunwoo Kim <imv4bel@gmail.com>

ctnetlink_exp_dump_table() iterates the expectation hash table under
rcu_read_lock and dereferences exp->master to access the master
conntrack's fields (ct_net, tuplehash, ct->ext).  However, expectations
do not hold a reference on exp->master.  A concurrent conntrack deletion
via NFNL_SUBSYS_CTNETLINK (a different nfnetlink subsystem mutex) can
free the master conntrack while the dump is in progress, leading to
use-after-free on ct->ext which is freed immediately by kfree().

Fix this by taking a reference on exp->master with
refcount_inc_not_zero() before accessing it.  If the master conntrack is
already being destroyed, skip the expectation.

KASAN report:
 BUG: KASAN: slab-use-after-free in ctnetlink_exp_dump_expect+0x584/0x660
 Read of size 1 at addr ffff888102b4ab00 by task poc2/135

 CPU: 1 UID: 0 PID: 135 Comm: poc2 Not tainted 7.0.0-rc2+ #5 PREEMPTLAZY
 Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 Call Trace:
  ctnetlink_exp_dump_expect+0x584/0x660
  ctnetlink_exp_fill_info.constprop.0+0xf9/0x180
  ctnetlink_exp_dump_table+0x24a/0x2e0
  netlink_dump+0x333/0x880
  __netlink_dump_start+0x391/0x450
  ctnetlink_get_expect+0x393/0x3f0
  nfnetlink_rcv_msg+0x48e/0x510
  netlink_rcv_skb+0xc9/0x1f0
  nfnetlink_rcv+0xdb/0x220
  netlink_unicast+0x3ec/0x590
  netlink_sendmsg+0x397/0x690
  __sys_sendmsg+0xf4/0x180

 Allocated by task 132:
  krealloc_node_align_noprof+0x124/0x3c0
  nf_ct_ext_add+0xd8/0x1a0
  ctnetlink_create_conntrack+0x38d/0x900
  ctnetlink_new_conntrack+0x3cf/0x7d0
  nfnetlink_rcv_msg+0x48e/0x510
  netlink_rcv_skb+0xc9/0x1f0
  nfnetlink_rcv+0xdb/0x220
  netlink_unicast+0x3ec/0x590
  netlink_sendmsg+0x397/0x690
  __sys_sendmsg+0xf4/0x180
  do_syscall_64+0xc3/0x6e0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

 Freed by task 132:
  kfree+0x1ca/0x430
  nf_conntrack_free+0xb2/0x140
  ctnetlink_del_conntrack+0x4c4/0x520
  nfnetlink_rcv_msg+0x48e/0x510
  netlink_rcv_skb+0xc9/0x1f0
  nfnetlink_rcv+0xdb/0x220
  netlink_unicast+0x3ec/0x590
  netlink_sendmsg+0x397/0x690
  __sys_sendmsg+0xf4/0x180
  do_syscall_64+0xc3/0x6e0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

 The buggy address belongs to the object at ffff888102b4ab00
  which belongs to the cache kmalloc-128 of size 128
 The buggy address is located 0 bytes inside of
  freed 128-byte region [ffff888102b4ab00, ffff888102b4ab80)

Fixes: c1d10adb4a52 ("[NETFILTER]: Add ctnetlink port for nf_conntrack")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_conntrack_netlink.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 10a9b98368f4..96e342147de8 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3167,6 +3167,7 @@ static int
 ctnetlink_exp_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct net *net = sock_net(skb->sk);
+	struct nf_conn *master;
 	struct nfgenmsg *nfmsg = nlmsg_data(cb->nlh);
 	u_int8_t l3proto = nfmsg->nfgen_family;
 	unsigned long last_id = cb->args[1];
@@ -3180,12 +3181,20 @@ ctnetlink_exp_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 			if (l3proto && exp->tuple.src.l3num != l3proto)
 				continue;
 
-			if (!net_eq(nf_ct_net(exp->master), net))
+			master = exp->master;
+			if (!refcount_inc_not_zero(&master->ct_general.use))
 				continue;
 
+			if (!net_eq(nf_ct_net(master), net)) {
+				nf_ct_put(master);
+				continue;
+			}
+
 			if (cb->args[1]) {
-				if (ctnetlink_exp_id(exp) != last_id)
+				if (ctnetlink_exp_id(exp) != last_id) {
+					nf_ct_put(master);
 					continue;
+				}
 				cb->args[1] = 0;
 			}
 			if (ctnetlink_exp_fill_info(skb,
@@ -3194,8 +3203,11 @@ ctnetlink_exp_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
 						    IPCTNL_MSG_EXP_NEW,
 						    exp) < 0) {
 				cb->args[1] = ctnetlink_exp_id(exp);
+				nf_ct_put(master);
 				goto out;
 			}
+
+			nf_ct_put(master);
 		}
 		if (cb->args[1]) {
 			cb->args[1] = 0;
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 10/10] netfilter: xt_IDLETIMER: reject rev0 reuse of ALARM timer labels
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (8 preceding siblings ...)
  2026-03-09 21:08 ` [PATCH net 09/10] netfilter: ctnetlink: fix use-after-free of exp->master in expectation dump Florian Westphal
@ 2026-03-09 21:08 ` Florian Westphal
  2026-03-10 10:56 ` [PATCH net 00/10] netfilter: updates for net Pablo Neira Ayuso
  2026-03-10 13:02 ` Florian Westphal
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-09 21:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Yuan Tan <tanyuan98@outlook.com>

IDLETIMER revision 0 rules reuse existing timers by label and always call
mod_timer() on timer->timer.

If the label was created first by revision 1 with XT_IDLETIMER_ALARM,
the object uses alarm timer semantics and timer->timer is never initialized.
Reusing that object from revision 0 causes mod_timer() on an uninitialized
timer_list, triggering debugobjects warnings and possible panic when
panic_on_warn=1.

Fix this by rejecting revision 0 rule insertion when an existing timer with
the same label is of ALARM type.

Fixes: 68983a354a65 ("netfilter: xtables: Add snapshot of hardidletimer target")
Co-developed-by: Yifan Wu <yifanwucs@gmail.com>
Signed-off-by: Yifan Wu <yifanwucs@gmail.com>
Co-developed-by: Juefei Pu <tomapufckgml@gmail.com>
Signed-off-by: Juefei Pu <tomapufckgml@gmail.com>
Signed-off-by: Yuan Tan <tanyuan98@outlook.com>
Signed-off-by: Xin Liu <dstsmallbird@foxmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/xt_IDLETIMER.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/netfilter/xt_IDLETIMER.c b/net/netfilter/xt_IDLETIMER.c
index 5d93e225d0f8..517106165ad2 100644
--- a/net/netfilter/xt_IDLETIMER.c
+++ b/net/netfilter/xt_IDLETIMER.c
@@ -318,6 +318,12 @@ static int idletimer_tg_checkentry(const struct xt_tgchk_param *par)
 
 	info->timer = __idletimer_tg_find_by_label(info->label);
 	if (info->timer) {
+		if (info->timer->timer_type & XT_IDLETIMER_ALARM) {
+			pr_debug("Adding/Replacing rule with same label and different timer type is not allowed\n");
+			mutex_unlock(&list_mutex);
+			return -EINVAL;
+		}
+
 		info->timer->refcnt++;
 		mod_timer(&info->timer->timer,
 			  secs_to_jiffies(info->timeout) + jiffies);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net 00/10] netfilter: updates for net
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (9 preceding siblings ...)
  2026-03-09 21:08 ` [PATCH net 10/10] netfilter: xt_IDLETIMER: reject rev0 reuse of ALARM timer labels Florian Westphal
@ 2026-03-10 10:56 ` Pablo Neira Ayuso
  2026-03-10 12:33   ` Florian Westphal
  2026-03-10 13:02 ` Florian Westphal
  11 siblings, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2026-03-10 10:56 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netdev, Paolo Abeni, David S. Miller, Eric Dumazet,
	Jakub Kicinski, netfilter-devel

Hi,

On Mon, Mar 09, 2026 at 10:08:35PM +0100, Florian Westphal wrote:
> 7-9) fix access bugs in the ctnetlink expectation handling.
>      Problem is that while RCU prevents the referenced nf_conn entry
>      from going way, nf_conn entries have an extension area that can
>      only be safely accessed if the cpu holds a reference to the
>      conntrack.  Else the extension area can be free'd at any time.
>      Fix is to grab references before the accesses happen.
>      These bugs are old, v3.10 resp. even pre-git days.
>      All fixes from Hyunwoo Kim.

I am not sure 7-9 are correct.

nfct_help() is accessed via exp->master in other existing paths,
I think these fixes are papering an underlying problem since the
typesafe rcu infrastructure was introduced in nf_conntrack.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 00/10] netfilter: updates for net
  2026-03-10 10:56 ` [PATCH net 00/10] netfilter: updates for net Pablo Neira Ayuso
@ 2026-03-10 12:33   ` Florian Westphal
  2026-03-10 12:41     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2026-03-10 12:33 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netdev, Paolo Abeni, David S. Miller, Eric Dumazet,
	Jakub Kicinski, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Mon, Mar 09, 2026 at 10:08:35PM +0100, Florian Westphal wrote:
> > 7-9) fix access bugs in the ctnetlink expectation handling.
> >      Problem is that while RCU prevents the referenced nf_conn entry
> >      from going way, nf_conn entries have an extension area that can
> >      only be safely accessed if the cpu holds a reference to the
> >      conntrack.  Else the extension area can be free'd at any time.
> >      Fix is to grab references before the accesses happen.
> >      These bugs are old, v3.10 resp. even pre-git days.
> >      All fixes from Hyunwoo Kim.
> 
> I am not sure 7-9 are correct.
> 
> nfct_help() is accessed via exp->master in other existing paths,
> I think these fixes are papering an underlying problem since the
> typesafe rcu infrastructure was introduced in nf_conntrack.

AFAICS these patchers are correct and other areas need to be fixed too.
I am currently auditing other conntrack helper usage for this bug
type.  I'm working as fast as I can given the volume of bugs coming
in.  I don't think that not taking these patches now is better in any
way.

Its possible these changes do miss a check for confirmed bit, to
avoid handling new, unconfirmed conntracks during object reuse.

Expect further patches in this area.

But I'm not sure this is related to rcu infra usage.

I would not be surprised if this bug has always been there:
20+ years ago, without KASAN/UBSAN it would likely have never
been found.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 00/10] netfilter: updates for net
  2026-03-10 12:33   ` Florian Westphal
@ 2026-03-10 12:41     ` Pablo Neira Ayuso
  2026-03-10 12:48       ` Florian Westphal
  0 siblings, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2026-03-10 12:41 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netdev, Paolo Abeni, David S. Miller, Eric Dumazet,
	Jakub Kicinski, netfilter-devel

On Tue, Mar 10, 2026 at 01:33:25PM +0100, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Mon, Mar 09, 2026 at 10:08:35PM +0100, Florian Westphal wrote:
> > > 7-9) fix access bugs in the ctnetlink expectation handling.
> > >      Problem is that while RCU prevents the referenced nf_conn entry
> > >      from going way, nf_conn entries have an extension area that can
> > >      only be safely accessed if the cpu holds a reference to the
> > >      conntrack.  Else the extension area can be free'd at any time.
> > >      Fix is to grab references before the accesses happen.
> > >      These bugs are old, v3.10 resp. even pre-git days.
> > >      All fixes from Hyunwoo Kim.
> > 
> > I am not sure 7-9 are correct.
> > 
> > nfct_help() is accessed via exp->master in other existing paths,
> > I think these fixes are papering an underlying problem since the
> > typesafe rcu infrastructure was introduced in nf_conntrack.
> 
> AFAICS these patchers are correct and other areas need to be fixed too.
> I am currently auditing other conntrack helper usage for this bug
> type.  I'm working as fast as I can given the volume of bugs coming
> in.  I don't think that not taking these patches now is better in any
> way.

Are sure that adding refcount bump is the way to go?

# git grep "nfct_help(exp->master)" net/netfilter/
net/netfilter/nf_conntrack_expect.c:    struct nf_conn_help *master_help = nfct_help(exp->master);
net/netfilter/nf_conntrack_expect.c:    struct nf_conn_help *master_help = nfct_help(exp->master);
net/netfilter/nf_conntrack_helper.c:    struct nf_conn_help *help = nfct_help(exp->master);
net/netfilter/nf_conntrack_netlink.c:   m_help = nfct_help(exp->master);
net/netfilter/nf_conntrack_sip.c:                   nfct_help(exp->master)->helper != nfct_help(ct)->helper ||

These callsites need auditing.

Yes, I just wonder if this can be fixed without adding checks
everywhere in the code, I would need a bit more time too.

This report came in over the weekend.

> Its possible these changes do miss a check for confirmed bit, to
> avoid handling new, unconfirmed conntracks during object reuse.
> 
> Expect further patches in this area.
>
> But I'm not sure this is related to rcu infra usage.
> 
> I would not be surprised if this bug has always been there:
> 20+ years ago, without KASAN/UBSAN it would likely have never
> been found.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 00/10] netfilter: updates for net
  2026-03-10 12:41     ` Pablo Neira Ayuso
@ 2026-03-10 12:48       ` Florian Westphal
  0 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-10 12:48 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netdev, Paolo Abeni, David S. Miller, Eric Dumazet,
	Jakub Kicinski, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Are sure that adding refcount bump is the way to go?

No, but I don't have a better idea at this time.

The obvious alternative is to grab a reference at expectation
creation time.

But this would pin the creating conntrack in memory.
In some cases, this will pin it forever, even if someone
runs 'conntrack -F/conntrack -D' to delete it if a helper
creates a forever-expectation.

> # git grep "nfct_help(exp->master)" net/netfilter/
> net/netfilter/nf_conntrack_expect.c:    struct nf_conn_help *master_help = nfct_help(exp->master);
> net/netfilter/nf_conntrack_expect.c:    struct nf_conn_help *master_help = nfct_help(exp->master);
> net/netfilter/nf_conntrack_helper.c:    struct nf_conn_help *help = nfct_help(exp->master);
> net/netfilter/nf_conntrack_netlink.c:   m_help = nfct_help(exp->master);
> net/netfilter/nf_conntrack_sip.c:                   nfct_help(exp->master)->helper != nfct_help(ct)->helper ||
> 
> These callsites need auditing.

Yes, but I don't have infinite time.  I am almost working non-stop
since these things got reported.
I cannot accelerate things any further.

> Yes, I just wonder if this can be fixed without adding checks
> everywhere in the code, I would need a bit more time too.

Another option is to stop releasing exp area with kfree()
and move back to kfree_rcu.  But still, I fear thats not enough.

What if we're releasing the master conntrack (refcount already 0)
and we're in object reuse scenario?

Then, master->ext can be krealloc'd in parallel.

TL;DR: I see no alternative to these refcount dances ATM and I also
think we need to add confirmed-bit check.

I hope I can condense this later with some new helper function that
can be used so we avoid open-coding this.

If you think that its better to yank these fixes and do the slow
audit, then fine, but I don't have any evidence that its a better
approach compared to incremental fixups.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 00/10] netfilter: updates for net
  2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
                   ` (10 preceding siblings ...)
  2026-03-10 10:56 ` [PATCH net 00/10] netfilter: updates for net Pablo Neira Ayuso
@ 2026-03-10 13:02 ` Florian Westphal
  11 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2026-03-10 13:02 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

Florian Westphal <fw@strlen.de> wrote:

I am withdrawing this PR, will send a v2 without patches 7-9.
Expectation infra is very broken and has design issues that
need more work to resolve.

This sucks, but I'm out of ideas.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2026-03-10 13:02 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-09 21:08 [PATCH net 00/10] netfilter: updates for net Florian Westphal
2026-03-09 21:08 ` [PATCH net 01/10] netfilter: nf_tables: Fix for duplicate device in netdev hooks Florian Westphal
2026-03-09 21:08 ` [PATCH net 02/10] netfilter: nf_tables: always walk all pending catchall elements Florian Westphal
2026-03-09 21:08 ` [PATCH net 03/10] netfilter: nft_set_pipapo: fix stack out-of-bounds read in pipapo_drop() Florian Westphal
2026-03-09 21:08 ` [PATCH net 04/10] netfilter: x_tables: guard option walkers against 1-byte tail reads Florian Westphal
2026-03-09 21:08 ` [PATCH net 05/10] netfilter: nfnetlink_queue: fix entry leak in bridge verdict error path Florian Westphal
2026-03-09 21:08 ` [PATCH net 06/10] netfilter: nfnetlink_cthelper: fix OOB read in nfnl_cthelper_dump_table() Florian Westphal
2026-03-09 21:08 ` [PATCH net 07/10] netfilter: ctnetlink: fix use-after-free in ctnetlink_dump_exp_ct() Florian Westphal
2026-03-09 21:08 ` [PATCH net 08/10] netfilter: ctnetlink: fix use-after-free of exp->master in single expectation GET Florian Westphal
2026-03-09 21:08 ` [PATCH net 09/10] netfilter: ctnetlink: fix use-after-free of exp->master in expectation dump Florian Westphal
2026-03-09 21:08 ` [PATCH net 10/10] netfilter: xt_IDLETIMER: reject rev0 reuse of ALARM timer labels Florian Westphal
2026-03-10 10:56 ` [PATCH net 00/10] netfilter: updates for net Pablo Neira Ayuso
2026-03-10 12:33   ` Florian Westphal
2026-03-10 12:41     ` Pablo Neira Ayuso
2026-03-10 12:48       ` Florian Westphal
2026-03-10 13:02 ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox