From: Avinash Duduskar <avinash.duduskar@gmail.com>
To: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>
Cc: "Eduard Zingerman" <eddyz87@gmail.com>,
"Kumar Kartikeya Dwivedi" <memxor@gmail.com>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"Song Liu" <song@kernel.org>,
"Yonghong Song" <yonghong.song@linux.dev>,
"Jiri Olsa" <jolsa@kernel.org>,
"Emil Tsalapatis" <emil@etsalapatis.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"Stanislav Fomichev" <sdf@fomichev.me>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Simon Horman" <horms@kernel.org>,
"David Ahern" <dsahern@kernel.org>,
"Shuah Khan" <shuah@kernel.org>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"Mykyta Yatsenko" <yatsenko@meta.com>,
"Leon Hwang" <leon.hwang@linux.dev>,
"KP Singh" <kpsingh@kernel.org>,
"Anton Protopopov" <a.s.protopopov@gmail.com>,
"Amery Hung" <ameryhung@gmail.com>,
"Eyal Birger" <eyal.birger@gmail.com>,
"Rong Tao" <rongtao@cestc.cn>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next v2 1/4] bpf: Initialize the l3mdev field for the fib lookup flow
Date: Wed, 17 Jun 2026 04:04:23 +0530
Message-ID: <20260616223426.3568080-2-avinash.duduskar@gmail.com>
In-Reply-To: <20260616223426.3568080-1-avinash.duduskar@gmail.com>

bpf_ipv4_fib_lookup() and bpf_ipv6_fib_lookup() build the flow key on
the stack with a bare "struct flowi4 fl4;" / "struct flowi6 fl6;" and
fill it field by field, but never set flowi4_l3mdev / flowi6_l3mdev.

On the non-DIRECT path the lookup goes through the fib rules whenever the
netns has custom rules, which a VRF installs:

  bpf_ipv4_fib_lookup() -> fib_lookup() -> __fib_lookup()
    -> l3mdev_update_flow()            reads !fl->flowi_l3mdev
    -> fib_rules_lookup() -> fib_rule_match()
         -> l3mdev_fib_rule_match()    uses fl->flowi_l3mdev

l3mdev_update_flow() resolves the l3mdev master from the ingress device
only while the field is still zero:

	if (fl->flowi_iif > LOOPBACK_IFINDEX && !fl->flowi_l3mdev) {
		dev = dev_get_by_index_rcu(net, fl->flowi_iif);
		if (dev)
			fl->flowi_l3mdev = l3mdev_master_ifindex_rcu(dev);
	}

If the field is left at a nonzero stack value, that resolution is skipped
and l3mdev_fib_rule_match() then tests the stale value as an ifindex. The
VRF master is never resolved and the l3mdev rule fails to match, so an
ingress device enslaved to a VRF can fail to select its table. The same
value is also read just before that by FIB rules that match on an L3
master device (l3mdev_fib_rule_iif_match()/_oif_match()), so an
"ip rule iif/oif <vrf>" mismatches the same way.
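
For illustration only, the effect of that guard can be modeled in a few
lines of userspace C. This is a simplified sketch, not kernel code:
struct toy_flow, the toy_* helpers, ifindex 4, VRF master ifindex 10 and
table 100 are all made up for the example, and the real
l3mdev_fib_rule_match() does a device lookup rather than a constant
comparison.

```c
#define TOY_LOOPBACK_IFINDEX 1

/* Toy stand-in for the two flow fields that matter here (hypothetical). */
struct toy_flow {
	int flowi_iif;
	int flowi_l3mdev;
};

/* Hypothetical topology: ifindex 4 is enslaved to VRF master ifindex 10. */
static int toy_master_ifindex(int ifindex)
{
	return ifindex == 4 ? 10 : 0;
}

/* Same shape as the guard quoted above: resolve only while still zero. */
static void toy_update_flow(struct toy_flow *fl)
{
	if (fl->flowi_iif > TOY_LOOPBACK_IFINDEX && !fl->flowi_l3mdev)
		fl->flowi_l3mdev = toy_master_ifindex(fl->flowi_iif);
}

/*
 * Simplified l3mdev rule: the flow's l3mdev value is used as an ifindex,
 * and only the (made-up) VRF master 10 maps to a table, here table 100.
 * Returns 0 when the rule does not match.
 */
static int toy_l3mdev_rule_table(const struct toy_flow *fl)
{
	if (!fl->flowi_l3mdev)
		return 0;
	return fl->flowi_l3mdev == 10 ? 100 : 0;
}

/* Run one lookup for ingress ifindex 4 with a given initial l3mdev value. */
static int toy_lookup(int initial_l3mdev)
{
	struct toy_flow fl = {
		.flowi_iif = 4,
		.flowi_l3mdev = initial_l3mdev,
	};

	toy_update_flow(&fl);
	return toy_l3mdev_rule_table(&fl);
}
```

In this model toy_lookup(0) selects table 100, while
toy_lookup(0x5a5a5a5a) selects no table, because the stale nonzero value
suppresses the master resolution and is not itself a VRF master.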

The helper already initializes the other flow fields the rules path
consumes (flowi4_mark, flowi4_tun_key.tun_id, flowi4_uid and the v6
counterparts); flowi*_l3mdev was added to that set afterwards and this
helper was never updated to match. ip_route_input_slow() likewise zeroes
the field before its input lookup. Do the same here.

CONFIG_INIT_STACK_ALL_ZERO masks this by default, but it depends on
compiler support (CC_HAS_AUTO_VAR_INIT_ZERO), so INIT_STACK_NONE builds,
including older toolchains that fall back to it, are exposed. On a kernel
built with INIT_STACK_ALL_PATTERN, a plain bpf_fib_lookup() (no VLAN, no
DIRECT) over a VRF slave, for a destination routed only in the VRF table,
returns BPF_FIB_LKUP_RET_NOT_FWDED without this patch and resolves with
it; reverting the two added lines brings the failure back. The series'
VRF selftests pass on the default config either way, so they do not
exercise this fix.

Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
Signed-off-by: Avinash Duduskar <avinash.duduskar@gmail.com>
---
 net/core/filter.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 9590877b0714..6fa172cb1348 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6162,6 +6162,7 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
 	fl4.flowi4_dscp = inet_dsfield_to_dscp(params->tos);
 	fl4.flowi4_scope = RT_SCOPE_UNIVERSE;
 	fl4.flowi4_flags = 0;
+	fl4.flowi4_l3mdev = 0;
 
 	fl4.flowi4_proto = params->l4_protocol;
 	fl4.daddr = params->ipv4_dst;
@@ -6307,6 +6308,7 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
 	fl6.flowlabel = params->flowinfo;
 	fl6.flowi6_scope = 0;
 	fl6.flowi6_flags = 0;
+	fl6.flowi6_l3mdev = 0;
 	fl6.mp_hash = 0;
 
 	fl6.flowi6_proto = params->l4_protocol;
base-commit: 140fa23df957b51385aa847986d44ad7f59b0563
--
2.54.0