* [PATCH net 1/4] netfilter: nf_conncount: fix leaked ct in error paths
2025-12-10 11:07 [PATCH net 0/4] netfilter: updates for net Florian Westphal
@ 2025-12-10 11:07 ` Florian Westphal
2025-12-11 9:00 ` patchwork-bot+netdevbpf
2025-12-10 11:07 ` [PATCH net 2/4] ipvs: fix ipv4 null-ptr-deref in route error path Florian Westphal
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Florian Westphal @ 2025-12-10 11:07 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Fernando Fernandez Mancera <fmancera@suse.de>
There are some situations where ct might be leaked as error paths are
skipping the refcounted check and return immediately. In order to solve
it make sure that the check is always called.
Fixes: be102eb6a0e7 ("netfilter: nf_conncount: rework API to use sk_buff directly")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_conncount.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index f1be4dd5cf85..3654f1e8976c 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -172,14 +172,14 @@ static int __nf_conncount_add(struct net *net,
struct nf_conn *found_ct;
unsigned int collect = 0;
bool refcounted = false;
+ int err = 0;
if (!get_ct_or_tuple_from_skb(net, skb, l3num, &ct, &tuple, &zone, &refcounted))
return -ENOENT;
if (ct && nf_ct_is_confirmed(ct)) {
- if (refcounted)
- nf_ct_put(ct);
- return -EEXIST;
+ err = -EEXIST;
+ goto out_put;
}
if ((u32)jiffies == list->last_gc)
@@ -231,12 +231,16 @@ static int __nf_conncount_add(struct net *net,
}
add_new_node:
- if (WARN_ON_ONCE(list->count > INT_MAX))
- return -EOVERFLOW;
+ if (WARN_ON_ONCE(list->count > INT_MAX)) {
+ err = -EOVERFLOW;
+ goto out_put;
+ }
conn = kmem_cache_alloc(conncount_conn_cachep, GFP_ATOMIC);
- if (conn == NULL)
- return -ENOMEM;
+ if (conn == NULL) {
+ err = -ENOMEM;
+ goto out_put;
+ }
conn->tuple = tuple;
conn->zone = *zone;
@@ -249,7 +253,7 @@ static int __nf_conncount_add(struct net *net,
out_put:
if (refcounted)
nf_ct_put(ct);
- return 0;
+ return err;
}
int nf_conncount_add_skb(struct net *net,
@@ -456,11 +460,10 @@ insert_tree(struct net *net,
rb_link_node_rcu(&rbconn->node, parent, rbnode);
rb_insert_color(&rbconn->node, root);
-
- if (refcounted)
- nf_ct_put(ct);
}
out_unlock:
+ if (refcounted)
+ nf_ct_put(ct);
spin_unlock_bh(&nf_conncount_locks[hash]);
return count;
}
--
2.51.2
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH net 1/4] netfilter: nf_conncount: fix leaked ct in error paths
2025-12-10 11:07 ` [PATCH net 1/4] netfilter: nf_conncount: fix leaked ct in error paths Florian Westphal
@ 2025-12-11 9:00 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-12-11 9:00 UTC (permalink / raw)
To: Florian Westphal
Cc: netdev, pabeni, davem, edumazet, kuba, netfilter-devel, pablo
Hello:
This series was applied to netdev/net.git (main)
by Florian Westphal <fw@strlen.de>:
On Wed, 10 Dec 2025 12:07:51 +0100 you wrote:
> From: Fernando Fernandez Mancera <fmancera@suse.de>
>
> There are some situations where ct might be leaked as error paths are
> skipping the refcounted check and return immediately. In order to solve
> it make sure that the check is always called.
>
> Fixes: be102eb6a0e7 ("netfilter: nf_conncount: rework API to use sk_buff directly")
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> Signed-off-by: Florian Westphal <fw@strlen.de>
>
> [...]
Here is the summary with links:
- [net,1/4] netfilter: nf_conncount: fix leaked ct in error paths
https://git.kernel.org/netdev/net/c/2e2a72076688
- [net,2/4] ipvs: fix ipv4 null-ptr-deref in route error path
https://git.kernel.org/netdev/net/c/ad891bb3d079
- [net,3/4] netfilter: always set route tuple out ifindex
https://git.kernel.org/netdev/net/c/2bdc536c9da7
- [net,4/4] selftests: netfilter: prefer xfail in case race wasn't triggered
https://git.kernel.org/netdev/net/c/b8a81b0ce539
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH net 2/4] ipvs: fix ipv4 null-ptr-deref in route error path
2025-12-10 11:07 [PATCH net 0/4] netfilter: updates for net Florian Westphal
2025-12-10 11:07 ` [PATCH net 1/4] netfilter: nf_conncount: fix leaked ct in error paths Florian Westphal
@ 2025-12-10 11:07 ` Florian Westphal
2025-12-10 11:07 ` [PATCH net 3/4] netfilter: always set route tuple out ifindex Florian Westphal
2025-12-10 11:07 ` [PATCH net 4/4] selftests: netfilter: prefer xfail in case race wasn't triggered Florian Westphal
3 siblings, 0 replies; 6+ messages in thread
From: Florian Westphal @ 2025-12-10 11:07 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Slavin Liu <slavin452@gmail.com>
The IPv4 code path in __ip_vs_get_out_rt() calls dst_link_failure()
without ensuring skb->dev is set, leading to a NULL pointer dereference
in fib_compute_spec_dst() when ipv4_link_failure() attempts to send
ICMP destination unreachable messages.
The issue emerged after commit ed0de45a1008 ("ipv4: recompile ip options
in ipv4_link_failure") started calling __ip_options_compile() from
ipv4_link_failure(). This code path eventually calls fib_compute_spec_dst()
which dereferences skb->dev. An attempt was made to fix the NULL skb->dev
dereference in commit 0113d9c9d1cc ("ipv4: fix null-deref in
ipv4_link_failure"), but it only addressed the immediate dev_net(skb->dev)
dereference by using a fallback device. The fix was incomplete because
fib_compute_spec_dst() later in the call chain still accesses skb->dev
directly, which remains NULL when IPVS calls dst_link_failure().
The crash occurs when:
1. IPVS processes a packet in NAT mode with a misconfigured destination
2. Route lookup fails in __ip_vs_get_out_rt() before establishing a route
3. The error path calls dst_link_failure(skb) with skb->dev == NULL
4. ipv4_link_failure() → ipv4_send_dest_unreach() →
__ip_options_compile() → fib_compute_spec_dst()
5. fib_compute_spec_dst() dereferences NULL skb->dev
Apply the same fix used for IPv6 in commit 326bf17ea5d4 ("ipvs: fix
ipv6 route unreach panic"): set skb->dev from skb_dst(skb)->dev before
calling dst_link_failure().
KASAN: null-ptr-deref in range [0x0000000000000328-0x000000000000032f]
CPU: 1 PID: 12732 Comm: syz.1.3469 Not tainted 6.6.114 #2
RIP: 0010:__in_dev_get_rcu include/linux/inetdevice.h:233
RIP: 0010:fib_compute_spec_dst+0x17a/0x9f0 net/ipv4/fib_frontend.c:285
Call Trace:
<TASK>
spec_dst_fill net/ipv4/ip_options.c:232
spec_dst_fill net/ipv4/ip_options.c:229
__ip_options_compile+0x13a1/0x17d0 net/ipv4/ip_options.c:330
ipv4_send_dest_unreach net/ipv4/route.c:1252
ipv4_link_failure+0x702/0xb80 net/ipv4/route.c:1265
dst_link_failure include/net/dst.h:437
__ip_vs_get_out_rt+0x15fd/0x19e0 net/netfilter/ipvs/ip_vs_xmit.c:412
ip_vs_nat_xmit+0x1d8/0xc80 net/netfilter/ipvs/ip_vs_xmit.c:764
Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure")
Signed-off-by: Slavin Liu <slavin452@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/ipvs/ip_vs_xmit.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 3162ce3c2640..64c697212578 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -408,6 +408,9 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
return -1;
err_unreach:
+ if (!skb->dev)
+ skb->dev = skb_dst(skb)->dev;
+
dst_link_failure(skb);
return -1;
}
--
2.51.2
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH net 3/4] netfilter: always set route tuple out ifindex
2025-12-10 11:07 [PATCH net 0/4] netfilter: updates for net Florian Westphal
2025-12-10 11:07 ` [PATCH net 1/4] netfilter: nf_conncount: fix leaked ct in error paths Florian Westphal
2025-12-10 11:07 ` [PATCH net 2/4] ipvs: fix ipv4 null-ptr-deref in route error path Florian Westphal
@ 2025-12-10 11:07 ` Florian Westphal
2025-12-10 11:07 ` [PATCH net 4/4] selftests: netfilter: prefer xfail in case race wasn't triggered Florian Westphal
3 siblings, 0 replies; 6+ messages in thread
From: Florian Westphal @ 2025-12-10 11:07 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Lorenzo Bianconi <lorenzo@kernel.org>
Always set nf_flow_route tuple out ifindex even if the indev is not one
of the flowtable configured devices since otherwise the outdev lookup in
nf_flow_offload_ip_hook() or nf_flow_offload_ipv6_hook() for
FLOW_OFFLOAD_XMIT_NEIGH flowtable entries will fail.
The above issue occurs in the following configuration since IP6IP6
tunnel does not support flowtable acceleration yet:
$ip addr show
5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:11:22:33:22:55 brd ff:ff:ff:ff:ff:ff link-netns ns1
inet6 2001:db8:1::2/64 scope global nodad
valid_lft forever preferred_lft forever
inet6 fe80::211:22ff:fe33:2255/64 scope link tentative proto kernel_ll
valid_lft forever preferred_lft forever
6: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:22:22:33:22:55 brd ff:ff:ff:ff:ff:ff link-netns ns3
inet6 2001:db8:2::1/64 scope global nodad
valid_lft forever preferred_lft forever
inet6 fe80::222:22ff:fe33:2255/64 scope link tentative proto kernel_ll
valid_lft forever preferred_lft forever
7: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN group default qlen 1000
link/tunnel6 2001:db8:2::1 peer 2001:db8:2::2 permaddr a85:e732:2c37::
inet6 2002:db8:1::1/64 scope global nodad
valid_lft forever preferred_lft forever
inet6 fe80::885:e7ff:fe32:2c37/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
$ip -6 route show
2001:db8:1::/64 dev eth0 proto kernel metric 256 pref medium
2001:db8:2::/64 dev eth1 proto kernel metric 256 pref medium
2002:db8:1::/64 dev tun0 proto kernel metric 256 pref medium
default via 2002:db8:1::2 dev tun0 metric 1024 pref medium
$nft list ruleset
table inet filter {
flowtable ft {
hook ingress priority filter
devices = { eth0, eth1 }
}
chain forward {
type filter hook forward priority filter; policy accept;
meta l4proto { tcp, udp } flow add @ft
}
}
Fixes: b5964aac51e0 ("netfilter: flowtable: consolidate xmit path")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_flow_table_path.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_flow_table_path.c b/net/netfilter/nf_flow_table_path.c
index f0984cf69a09..eb24fe2715dc 100644
--- a/net/netfilter/nf_flow_table_path.c
+++ b/net/netfilter/nf_flow_table_path.c
@@ -250,6 +250,9 @@ static void nft_dev_forward_path(const struct nft_pktinfo *pkt,
if (nft_dev_fill_forward_path(route, dst, ct, dir, ha, &stack) >= 0)
nft_dev_path_info(&stack, &info, ha, &ft->data);
+ if (info.outdev)
+ route->tuple[dir].out.ifindex = info.outdev->ifindex;
+
if (!info.indev || !nft_flowtable_find_dev(info.indev, ft))
return;
@@ -269,7 +272,6 @@ static void nft_dev_forward_path(const struct nft_pktinfo *pkt,
route->tuple[!dir].in.num_encaps = info.num_encaps;
route->tuple[!dir].in.ingress_vlans = info.ingress_vlans;
- route->tuple[dir].out.ifindex = info.outdev->ifindex;
if (info.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT) {
memcpy(route->tuple[dir].out.h_source, info.h_source, ETH_ALEN);
--
2.51.2
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH net 4/4] selftests: netfilter: prefer xfail in case race wasn't triggered
2025-12-10 11:07 [PATCH net 0/4] netfilter: updates for net Florian Westphal
` (2 preceding siblings ...)
2025-12-10 11:07 ` [PATCH net 3/4] netfilter: always set route tuple out ifindex Florian Westphal
@ 2025-12-10 11:07 ` Florian Westphal
3 siblings, 0 replies; 6+ messages in thread
From: Florian Westphal @ 2025-12-10 11:07 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
Jakub says: "We try to reserve SKIP for tests skipped because tool is
missing in env, something isn't built into the kernel etc."
use xfail, we can't force the race condition to appear at will
so its expected that the test 'fails' occasionally.
Fixes: 78a588363587 ("selftests: netfilter: add conntrack clash resolution test case")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20251206175647.5c32f419@kernel.org/
Signed-off-by: Florian Westphal <fw@strlen.de>
---
tools/testing/selftests/net/netfilter/conntrack_clash.sh | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/net/netfilter/conntrack_clash.sh b/tools/testing/selftests/net/netfilter/conntrack_clash.sh
index 7fc6c5dbd551..84b8eb12143a 100755
--- a/tools/testing/selftests/net/netfilter/conntrack_clash.sh
+++ b/tools/testing/selftests/net/netfilter/conntrack_clash.sh
@@ -116,7 +116,7 @@ run_one_clash_test()
# not a failure: clash resolution logic did not trigger.
# With right timing, xmit completed sequentially and
# no parallel insertion occurs.
- return $ksft_skip
+ return $ksft_xfail
}
run_clash_test()
@@ -133,12 +133,12 @@ run_clash_test()
if [ $rv -eq 0 ];then
echo "PASS: clash resolution test for $daddr:$dport on attempt $i"
return 0
- elif [ $rv -eq $ksft_skip ]; then
+ elif [ $rv -eq $ksft_xfail ]; then
softerr=1
fi
done
- [ $softerr -eq 1 ] && echo "SKIP: clash resolution for $daddr:$dport did not trigger"
+ [ $softerr -eq 1 ] && echo "XFAIL: clash resolution for $daddr:$dport did not trigger"
}
ip link add veth0 netns "$nsclient1" type veth peer name veth0 netns "$nsrouter"
@@ -167,8 +167,7 @@ load_simple_ruleset "$nsclient2"
run_clash_test "$nsclient2" "$nsclient2" 127.0.0.1 9001
if [ $clash_resolution_active -eq 0 ];then
- [ "$ret" -eq 0 ] && ret=$ksft_skip
- echo "SKIP: Clash resolution did not trigger"
+ [ "$ret" -eq 0 ] && ret=$ksft_xfail
fi
exit $ret
--
2.51.2
^ permalink raw reply related [flat|nested] 6+ messages in thread