* [PATCH nf v2 1/3] netfilter: nft_fib_ipv6: walk fib6_siblings under RCU
2026-05-20 2:34 [PATCH nf v2 0/3] netfilter: nft_fib_ipv6: handle routes via external nexthop Jiayuan Chen
@ 2026-05-20 2:34 ` Jiayuan Chen
2026-05-20 2:34 ` [PATCH nf v2 2/3] netfilter: nft_fib_ipv6: handle routes via external nexthop Jiayuan Chen
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Jiayuan Chen @ 2026-05-20 2:34 UTC
To: netfilter-devel; +Cc: pablo, fw, phil, coreteam
nft_fib6_info_nh_uses_dev() runs from nft_fib6_eval() in softirq under
rcu_read_lock(). fib6_siblings is modified by writers that hold
tb6_lock but do not wait for RCU readers, so the sibling walk should
use list_for_each_entry_rcu(): it adds READ_ONCE() on the ->next
pointer and lets CONFIG_PROVE_RCU_LIST validate the locking.
No functional change for non-debug builds.
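For readers less familiar with the RCU list helpers, the property described above can be sketched in userspace. This is only an illustration under stated assumptions, not the kernel's implementation: `READ_ONCE_PTR`, `struct node` and `count_nodes_rcu_style()` are hypothetical names, and a volatile access stands in for the kernel's READ_ONCE().

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular list node, loosely modelled on struct list_head. */
struct node {
	struct node *next;
};

/*
 * Hypothetical stand-in for READ_ONCE(): force a single load of the
 * pointer so the compiler cannot re-read or split it while a writer
 * updates the list concurrently.
 */
#define READ_ONCE_PTR(p) (*(struct node * volatile *)&(p))

/*
 * Count the entries that follow @head, loading each ->next exactly once
 * per step. That single marked load is what the _rcu iterator adds over
 * a plain list walk.
 */
static size_t count_nodes_rcu_style(struct node *head)
{
	size_t n = 0;
	struct node *pos;

	assert(head != NULL);
	for (pos = READ_ONCE_PTR(head->next); pos != head;
	     pos = READ_ONCE_PTR(pos->next))
		n++;
	return n;
}
```

In the kernel the same walk additionally has to run inside an RCU read-side critical section, which the sketch does not model.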
Fixes: 1c32b24c234b ("netfilter: nft_fib_ipv6: switch to fib6_lookup")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
net/ipv6/netfilter/nft_fib_ipv6.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/netfilter/nft_fib_ipv6.c b/net/ipv6/netfilter/nft_fib_ipv6.c
index 8b2dba88ee96..5e192a446ec8 100644
--- a/net/ipv6/netfilter/nft_fib_ipv6.c
+++ b/net/ipv6/netfilter/nft_fib_ipv6.c
@@ -170,7 +170,7 @@ static bool nft_fib6_info_nh_uses_dev(struct fib6_info *rt,
 	if (nft_fib6_info_nh_dev_match(nh_dev, dev))
 		return true;
-	list_for_each_entry(iter, &rt->fib6_siblings, fib6_siblings) {
+	list_for_each_entry_rcu(iter, &rt->fib6_siblings, fib6_siblings) {
 		nh_dev = fib6_info_nh_dev(iter);
 		if (nft_fib6_info_nh_dev_match(nh_dev, dev))
--
2.43.0
* [PATCH nf v2 2/3] netfilter: nft_fib_ipv6: handle routes via external nexthop
From: Jiayuan Chen @ 2026-05-20 2:34 UTC
To: netfilter-devel; +Cc: pablo, fw, phil, coreteam
fib6_info has a union:
	union {
		struct list_head fib6_siblings;
		struct list_head nh_list;
	};
Old-style multipath (ip -6 route add ... nexthop ... nexthop ...) uses
fib6_siblings. External nexthop (ip -6 route add ... nhid N) uses
nh_list, linked into &nh->f6i_list.
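The aliasing can be made concrete with a small userspace model. This is a sketch under stated assumptions: `struct mock_nexthop`, `struct mock_fib6_info` and both helper functions are hypothetical, heavily reduced stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical reduced nexthop: only the list head routes hang off. */
struct mock_nexthop {
	int id;
	struct list_head f6i_list;
};

/* Hypothetical reduced route entry with the same union layout idea. */
struct mock_fib6_info {
	int metric;
	union {
		struct list_head fib6_siblings;	/* old-style multipath */
		struct list_head nh_list;	/* linked into nh->f6i_list */
	};
	struct mock_nexthop *nh;		/* set for "nhid N" routes */
};

/* Both union members occupy the same bytes of the route entry. */
static int siblings_alias_nh_list(void)
{
	return offsetof(struct mock_fib6_info, fib6_siblings) ==
	       offsetof(struct mock_fib6_info, nh_list);
}

/*
 * For a route on an external nexthop, following ->next through the
 * fib6_siblings name can land on the list head embedded in the nexthop
 * object instead of on another route entry.
 */
static int next_is_nexthop_head(const struct mock_fib6_info *rt)
{
	assert(rt->nh != NULL);
	return rt->fib6_siblings.next == &rt->nh->f6i_list;
}
```

A sibling walk that applies container_of() to that pointer would treat memory around the nexthop object as a route entry, which fits the out-of-bounds read in the report that follows.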
nft_fib6_info_nh_uses_dev() blindly walks &rt->fib6_siblings, causing
an OOB read past the struct nexthop slab when rt->nh is set:
==================================================================
BUG: KASAN: slab-out-of-bounds in nft_fib6_eval+0x1362/0x16c0
Read of size 8 at addr ffff888103a099d0 by task ping/386
CPU: 2 UID: 0 PID: 386 Comm: ping Not tainted 7.1.0-rc3+ #251 PREEMPT
Call Trace:
<IRQ>
dump_stack_lvl+0x76/0xa0
print_report+0xd1/0x5f0
kasan_report+0xe7/0x130
__asan_report_load8_noabort+0x14/0x30
nft_fib6_eval+0x1362/0x16c0
nft_do_chain+0x279/0x18c0
nft_do_chain_ipv6+0x1a8/0x230
nf_hook_slow+0xad/0x200
ipv6_rcv+0x152/0x380
__netif_receive_skb_one_core+0x118/0x1c0
==================================================================
Branch by route shape: when rt->nh is set, walk via
nexthop_for_each_fib6_nh() (also covers nh groups, which the original
code missed); otherwise walk fib6_siblings, guarded by READ_ONCE() of
rt->fib6_nsiblings as required by commit 31d7d67ba127 ("ipv6: annotate
data-races around rt->fib6_nsiblings").
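The branching described above can be modelled in userspace as follows. This is a simplified sketch, not the patch itself: devices are reduced to an ifindex, arrays replace the kernel lists, and `struct mock_nh`, `struct mock_route`, `for_each_ext_nh()`, `match_dev_cb()` and `route_uses_dev()` are hypothetical names. The iterator is assumed to stop at the first non-zero callback return and to pass that value back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_nh {
	int ifindex;
};

struct mock_route {
	const struct mock_nh *ext_nh;	/* external nexthop(s), NULL if none */
	size_t ext_nh_count;
	int primary_ifindex;		/* device of the route's own nexthop */
	const int *sibling_ifindex;	/* old-style multipath siblings */
	size_t nsiblings;
};

typedef int (*nh_cb)(const struct mock_nh *nh, void *arg);

/* Assumed contract: stop at the first non-zero return and report it. */
static int for_each_ext_nh(const struct mock_route *rt, nh_cb cb, void *arg)
{
	for (size_t i = 0; i < rt->ext_nh_count; i++) {
		int ret = cb(&rt->ext_nh[i], arg);

		if (ret)
			return ret;
	}
	return 0;
}

static int match_dev_cb(const struct mock_nh *nh, void *arg)
{
	return nh->ifindex == *(const int *)arg;
}

static bool route_uses_dev(const struct mock_route *rt, int ifindex)
{
	/* External nexthop or group: never touch the sibling list. */
	if (rt->ext_nh)
		return for_each_ext_nh(rt, match_dev_cb, &ifindex);

	if (rt->primary_ifindex == ifindex)
		return true;

	/* No siblings: nothing further to walk. */
	if (!rt->nsiblings)
		return false;

	for (size_t i = 0; i < rt->nsiblings; i++)
		if (rt->sibling_ifindex[i] == ifindex)
			return true;
	return false;
}
```

In this model a two-member group whose second member sits on the queried device matches through the callback path, and an old-style multipath route matches through the sibling loop.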
Fixes: 1c32b24c234b ("netfilter: nft_fib_ipv6: switch to fib6_lookup")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
net/ipv6/netfilter/nft_fib_ipv6.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/net/ipv6/netfilter/nft_fib_ipv6.c b/net/ipv6/netfilter/nft_fib_ipv6.c
index 5e192a446ec8..c0a0075e2590 100644
--- a/net/ipv6/netfilter/nft_fib_ipv6.c
+++ b/net/ipv6/netfilter/nft_fib_ipv6.c
@@ -160,16 +160,32 @@ static bool nft_fib6_info_nh_dev_match(const struct net_device *nh_dev,
 	       l3mdev_master_ifindex_rcu(nh_dev) == dev->ifindex;
 }
+static int nft_fib6_nh_match_dev_cb(struct fib6_nh *nh, void *arg)
+{
+	const struct net_device *dev = arg;
+
+	return nft_fib6_info_nh_dev_match(nh->fib_nh_dev, dev);
+}
+
 static bool nft_fib6_info_nh_uses_dev(struct fib6_info *rt,
 				      const struct net_device *dev)
 {
 	const struct net_device *nh_dev;
 	struct fib6_info *iter;
+	/* External nexthop: fib6_siblings slot aliases nh_list, walk via nh. */
+	if (rt->nh)
+		return nexthop_for_each_fib6_nh(rt->nh,
+						nft_fib6_nh_match_dev_cb,
+						(void *)dev);
+
 	nh_dev = fib6_info_nh_dev(rt);
 	if (nft_fib6_info_nh_dev_match(nh_dev, dev))
 		return true;
+	if (!READ_ONCE(rt->fib6_nsiblings))
+		return false;
+
 	list_for_each_entry_rcu(iter, &rt->fib6_siblings, fib6_siblings) {
 		nh_dev = fib6_info_nh_dev(iter);
--
2.43.0
* [PATCH nf v2 3/3] selftests: netfilter: add nft_fib_nexthop test
From: Jiayuan Chen @ 2026-05-20 2:34 UTC
To: netfilter-devel; +Cc: pablo, fw, phil, coreteam
Functional coverage of nft_fib6_eval()'s nexthop enumeration over
three route shapes:
1) single external nexthop (nhid)
2) external nexthop group (nhid -> group)
3) old-style multipath (nexthop ... nexthop ...)
Each scenario places one nexthop on the input device (veth0). For
(2) and (3) the matching nexthop is the second member, so the walk
has to traverse beyond the primary nh. Two nft counters on prerouting
verify the data path: one increments only when fib reports veth0 as
the oif, the other counts "missing" results and must stay at zero.
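The pass criterion above can be written out as a small check. This is a hedged sketch in C rather than the shell used by the selftest: `counter_packets()` and `scenario_passed()` are hypothetical helpers, and the only thing assumed about the counter listing is that it contains text of the form "packets N bytes M".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the packet count from text containing "packets N bytes M".
 * Returns -1 when no such pair is found.
 */
static long counter_packets(const char *text)
{
	const char *p;
	unsigned long pkts, bytes;

	assert(text != NULL);
	p = strstr(text, "packets ");
	if (!p || sscanf(p, "packets %lu bytes %lu", &pkts, &bytes) != 2)
		return -1;
	return (long)pkts;
}

/*
 * A scenario passes when the "ok" counter saw every packet sent and the
 * "bad" counter stayed at zero.
 */
static int scenario_passed(const char *ok_text, const char *bad_text,
			   long sent)
{
	return counter_packets(ok_text) == sent &&
	       counter_packets(bad_text) == 0;
}
```

The selftest itself expresses the same two conditions with grep on the counter listing.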
./nft_fib_nexthop.sh
PASS: single external nexthop (nhid -> veth0)
PASS: nexthop group (dummy0 + veth0)
PASS: old-style multipath (sibling on veth0)
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
.../testing/selftests/net/netfilter/Makefile | 1 +
.../net/netfilter/nft_fib_nexthop.sh | 152 ++++++++++++++++++
2 files changed, 153 insertions(+)
create mode 100755 tools/testing/selftests/net/netfilter/nft_fib_nexthop.sh
diff --git a/tools/testing/selftests/net/netfilter/Makefile b/tools/testing/selftests/net/netfilter/Makefile
index ee2d1a5254f8..d953ee218c0f 100644
--- a/tools/testing/selftests/net/netfilter/Makefile
+++ b/tools/testing/selftests/net/netfilter/Makefile
@@ -26,6 +26,7 @@ TEST_PROGS := \
nft_concat_range.sh \
nft_conntrack_helper.sh \
nft_fib.sh \
+ nft_fib_nexthop.sh \
nft_flowtable.sh \
nft_interface_stress.sh \
nft_meta.sh \
diff --git a/tools/testing/selftests/net/netfilter/nft_fib_nexthop.sh b/tools/testing/selftests/net/netfilter/nft_fib_nexthop.sh
new file mode 100755
index 000000000000..c4f203057382
--- /dev/null
+++ b/tools/testing/selftests/net/netfilter/nft_fib_nexthop.sh
@@ -0,0 +1,152 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# shellcheck disable=SC2154
+#
+# Exercise nft_fib6_eval()'s sibling/nh enumeration on three route shapes:
+# 1) route via a single external nexthop (nhid)
+# 2) route via an external nexthop group (nhid -> group, two members)
+# 3) route via old-style multipath (nexthop ... nexthop ...)
+#
+# In each scenario the route's nexthop set contains veth0 (the iif of the
+# test packet). nft_fib6_info_nh_uses_dev() must walk the set and report
+# veth0 as a valid oif. For (2) and (3) the matching nexthop is the second
+# member, so the walk has to traverse beyond the primary nh.
+#
+# After sending $PKTS ICMPv6 echo requests from ns1, check two counters on
+# nsrouter:
+# nf_ok -- `fib daddr . iif oif eq "veth0"` must equal $PKTS
+# nf_bad -- `fib daddr . iif oif missing` must stay at 0
+# Both rules also match on iif veth0 and ip6 daddr dead:dead::/64 so that
+# kernel-generated ND/MLD/RA traffic cannot pollute the counters.
+#
+# Topology similar to nft_fib.sh, without ns2; two dummy interfaces on
+# nsrouter host extra nh devices:
+#
+#  dead:1::99          dead:1::1
+#      ns1 <----veth----> nsrouter --- dummy0 dead:2::1
+#                                 \-- dummy1 dead:9::1
+
+source lib.sh
+
+ret=0
+PKTS=3
+
+checktool "nft --version" "run test without nft"
+checktool "ip -V" "run test without iproute2"
+
+setup_ns nsrouter ns1
+trap cleanup_all_ns EXIT
+
+if ! ip link add veth0 netns "$nsrouter" type veth peer name eth0 netns "$ns1" \
+	> /dev/null 2>&1; then
+	echo "SKIP: No virtual ethernet pair device support in kernel"
+	exit $ksft_skip
+fi
+
+ip -net "$ns1" link set lo up
+ip -net "$ns1" link set eth0 up
+ip -net "$ns1" -6 addr add dead:1::99/64 dev eth0 nodad
+ip -net "$ns1" -6 route add default via dead:1::1
+
+ip -net "$nsrouter" link set lo up
+ip -net "$nsrouter" link set veth0 up
+ip -net "$nsrouter" -6 addr add dead:1::1/64 dev veth0 nodad
+
+if ! ip -net "$nsrouter" link add dummy0 type dummy 2>/dev/null; then
+	echo "SKIP: dummy netdev not available"
+	exit $ksft_skip
+fi
+ip -net "$nsrouter" link set dummy0 up
+ip -net "$nsrouter" -6 addr add dead:2::1/64 dev dummy0 nodad
+
+ip -net "$nsrouter" link add dummy1 type dummy
+ip -net "$nsrouter" link set dummy1 up
+ip -net "$nsrouter" -6 addr add dead:9::1/64 dev dummy1 nodad
+
+ip netns exec "$nsrouter" sysctl -q net.ipv6.conf.all.forwarding=1
+
+load_fib_rule() {
+	# filter on iif + daddr so the counters only see our test packets
+	ip netns exec "$nsrouter" nft -f /dev/stdin <<EOF
+flush ruleset
+table ip6 t {
+	counter nf_ok { }
+	counter nf_bad { }
+	chain c {
+		type filter hook prerouting priority 0; policy accept;
+		iif "veth0" ip6 daddr dead:dead::/64 fib daddr . iif oif eq "veth0" counter name nf_ok
+		iif "veth0" ip6 daddr dead:dead::/64 fib daddr . iif oif missing counter name nf_bad
+	}
+}
+EOF
+}
+
+bad_counter() {
+	local counter=$1
+	local expect=$2
+	local tag=$3
+
+	echo "FAIL ($tag): counter $counter has unexpected value (expected \"$expect\")" 1>&2
+	ip netns exec "$nsrouter" nft list counter ip6 t "$counter" 1>&2
+}
+
+run_scenario() {
+	local what="$1"; shift
+	# counter output format is "packets PACKET_NUM bytes BYTES_NUM";
+	# we only care about the packet count
+	local expect_ok="packets $PKTS bytes"
+	local expect_bad="packets 0 bytes"
+	local lret=0
+
+	# reset route + nexthop state between scenarios
+	ip -net "$nsrouter" -6 route del dead:dead::/64 > /dev/null 2>&1 || true
+	ip -net "$nsrouter" nexthop flush > /dev/null 2>&1 || true
+
+	# run the scenario function passed by the caller
+	"$@" || echo "WARN ($what): scenario setup returned non-zero"
+
+	load_fib_rule || { echo "FAIL ($what): nft load"; ret=1; return; }
+
+	# ping a daddr inside dead:dead::/64 so fib has to walk the nh set
+	ip netns exec "$ns1" ping -6 -c "$PKTS" -i 0.1 -W 1 dead:dead::1 \
+		> /dev/null 2>&1 || true
+
+	# verify the packets went through the expected fib path
+	if ! ip netns exec "$nsrouter" nft list counter ip6 t nf_ok | grep -q "$expect_ok"; then
+		bad_counter nf_ok "$expect_ok" "$what"
+		lret=1
+	fi
+	if ! ip netns exec "$nsrouter" nft list counter ip6 t nf_bad | grep -q "$expect_bad"; then
+		bad_counter nf_bad "$expect_bad" "$what"
+		lret=1
+	fi
+
+	if [ $lret -eq 0 ]; then
+		echo "PASS: $what"
+	else
+		ret=1
+	fi
+}
+
+scenario_single_nh() {
+	ip -net "$nsrouter" nexthop add id 1 via dead:1::99 dev veth0
+	ip -net "$nsrouter" -6 route add dead:dead::/64 nhid 1
+}
+run_scenario "single external nexthop (nhid -> veth0)" scenario_single_nh
+
+scenario_nh_group() {
+	ip -net "$nsrouter" nexthop add id 1 via dead:2::2 dev dummy0
+	ip -net "$nsrouter" nexthop add id 2 via dead:1::99 dev veth0
+	ip -net "$nsrouter" nexthop add id 100 group 1/2
+	ip -net "$nsrouter" -6 route add dead:dead::/64 nhid 100
+}
+run_scenario "nexthop group (dummy0 + veth0)" scenario_nh_group
+
+scenario_old_multipath() {
+	ip -net "$nsrouter" -6 route add dead:dead::/64 \
+		nexthop via dead:2::2 dev dummy0 \
+		nexthop via dead:1::99 dev veth0
+}
+run_scenario "old-style multipath (sibling on veth0)" scenario_old_multipath
+
+exit $ret
--
2.43.0
* Re: [PATCH nf v2 0/3] netfilter: nft_fib_ipv6: handle routes via external nexthop
From: Phil Sutter @ 2026-05-20 9:26 UTC
To: Jiayuan Chen; +Cc: netfilter-devel, pablo, fw, coreteam
Hi,
On Wed, May 20, 2026 at 10:34:08AM +0800, Jiayuan Chen wrote:
> Patch 1 switches the fib6_siblings walk in nft_fib6_info_nh_uses_dev()
> to list_for_each_entry_rcu().
>
> Patch 2 fixes the slab-out-of-bounds when the matched route uses an
> external nexthop object.
>
> Patch 3 adds a selftest covering single nh, nh group and old-style
> multipath.
>
> v1: https://lore.kernel.org/netfilter-devel/20260519041431.396218-1-jiayuan.chen@linux.dev/
>
> Changes since v1:
> - new patch 1: list_for_each_entry_rcu() conversion split out
> (Suggested-by: Phil Sutter)
> - patch 2:
> * drop redundant ternary in nft_fib6_nh_match_dev_cb (Phil)
> * drop redundant "!= 0" on nexthop_for_each_fib6_nh return (Phil)
> * use READ_ONCE() for rt->fib6_nsiblings (Phil)
Will you send a v3 addressing Florian's concerns regarding the test case
in patch 3?
Patches 1 and 2 look good to me, thanks for the respin!
Cheers, Phil
* Re: [PATCH nf v2 0/3] netfilter: nft_fib_ipv6: handle routes via external nexthop
From: Jiayuan Chen @ 2026-05-20 9:39 UTC
To: Phil Sutter; +Cc: netfilter-devel, pablo, fw, coreteam
On 5/20/26 5:26 PM, Phil Sutter wrote:
> Will you send a v3 addressing Florian's concerns regarding the test case
> in patch 3?
In the current version, the selftest already incorporates Florian's
suggestion, that is, to verify functionality rather than just serve as
a bug reproducer (using the nf_ok/nf_bad counters).
Sorry for not making this clear in the changelog :).
> Patches 1 and 2 look good to me, thanks for the respin!
Thanks for your review.
> Cheers, Phil
* Re: [PATCH nf v2 0/3] netfilter: nft_fib_ipv6: handle routes via external nexthop
From: Phil Sutter @ 2026-05-20 10:48 UTC
To: Jiayuan Chen; +Cc: netfilter-devel, pablo, fw, coreteam
On Wed, May 20, 2026 at 05:39:48PM +0800, Jiayuan Chen wrote:
>
> On 5/20/26 5:26 PM, Phil Sutter wrote:
> > Will you send a v3 addressing Florian's concerns regarding the test case
> > in patch 3?
>
> In the current version, the selftest already incorporates Florian's
> suggestion, that is, to verify functionality rather than just serve as
> a bug reproducer (using the nf_ok/nf_bad counters).
>
> Sorry for not making this clear in the changelog :).
Oh, I missed that. :)
Test looks good, so for the series:
Acked-by: Phil Sutter <phil@nwl.cc>
Thanks, Phil