From: Ido Schimmel <idosch@nvidia.com>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: netdev@vger.kernel.org, dsahern@kernel.org,
jiayuan.chen@shopee.com,
syzbot+334190e097a98a1b81bb@syzkaller.appspotmail.com,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>, Shuah Khan <shuah@kernel.org>,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH net v2 1/2] net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop
Date: Mon, 2 Mar 2026 10:25:51 +0200
Message-ID: <20260302082551.GA814377@shredder>
In-Reply-To: <20260302051132.66314-2-jiayuan.chen@linux.dev>
On Mon, Mar 02, 2026 at 01:11:28PM +0800, Jiayuan Chen wrote:
> From: Jiayuan Chen <jiayuan.chen@shopee.com>
>
> When a standalone IPv6 nexthop object is created with a loopback device
> (e.g., "ip -6 nexthop add id 100 dev lo"), fib6_nh_init() misclassifies
> it as a reject route. This is because nexthop objects have no destination
> prefix (fc_dst=::), causing fib6_is_reject() to match any loopback
> nexthop. The reject path skips fib_nh_common_init(), leaving
> nhc_pcpu_rth_output unallocated. If an IPv4 route later references this
> nexthop, __mkroute_output() dereferences NULL nhc_pcpu_rth_output and
> panics.
>
> The reject classification was designed for regular IPv6 routes to prevent
> kernel loopback loops, but nexthop objects should not be subject to this
> check since they carry no destination information - loop prevention is
> handled separately when the route is created.
>
> An alternative approach of unconditionally calling fib_nh_common_init()
> for all reject routes was considered, but on large machines (e.g., 256
> CPUs) with many routes, this wastes significant memory since
> nhc_pcpu_rth_output allocates a per-CPU pointer for each route.
>
> Since fib6_nh_init() is shared by multiple callers (route creation,
> nexthop object creation, IPv4 gateway validation), using fc_dst_len to
> implicitly distinguish nexthop objects would be fragile. Add an explicit
> fc_is_nh flag to fib6_config to clearly identify nexthop object creation
> and skip the reject check for this path.
>
> Fixes: 7dd73168e273 ("ipv6: Always allocate pcpu memory in a fib6_nh")
> Reported-by: syzbot+334190e097a98a1b81bb@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/698f8482.a70a0220.2c38d7.00ca.GAE@google.com/T/
> Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
> ---
> include/net/ip6_fib.h | 1 +
> net/ipv4/nexthop.c | 1 +
> net/ipv6/route.c | 8 +++++++-
> 3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> index 88b0dd4d8e09..7710f247b8d9 100644
> --- a/include/net/ip6_fib.h
> +++ b/include/net/ip6_fib.h
> @@ -62,6 +62,7 @@ struct fib6_config {
> struct nlattr *fc_encap;
> u16 fc_encap_type;
> bool fc_is_fdb;
> + bool fc_is_nh;
> };
>
> struct fib6_node {
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index 7b9d70f9b31c..efad2dd27636 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -2859,6 +2859,7 @@ static int nh_create_ipv6(struct net *net, struct nexthop *nh,
> struct fib6_config fib6_cfg = {
> .fc_table = l3mdev_fib_table(cfg->dev),
> .fc_ifindex = cfg->nh_ifindex,
> + .fc_is_nh = true,
> .fc_gateway = cfg->gw.ipv6,
> .fc_flags = cfg->nh_flags,
> .fc_nlinfo = cfg->nlinfo,
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index c0350d97307e..347f464ce7fe 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -3628,7 +3628,13 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
> * they would result in kernel looping; promote them to reject routes
> */
> addr_type = ipv6_addr_type(&cfg->fc_dst);
> - if (fib6_is_reject(cfg->fc_flags, dev, addr_type)) {
> + /*
> + * Nexthop objects have no destination prefix, so fib6_is_reject()
> + * will misclassify loopback nexthops as reject routes, causing
> + * fib_nh_common_init() to be skipped along with its allocation
> + * of nhc_pcpu_rth_output, which IPv4 routes require.
> + */
> + if (!cfg->fc_is_nh && fib6_is_reject(cfg->fc_flags, dev, addr_type)) {
> /* hold loopback dev/idev if we haven't done so. */
> if (dev != net->loopback_dev) {
> if (dev) {

The code effectively resets the nexthop device to the loopback device for
reject routes:

 # ip link add name dummy1 up type dummy
 # ip route add unreachable 2001:db8:1::/64 dev dummy1
 # ip -6 route show 2001:db8:1::/64
 unreachable 2001:db8:1::/64 dev lo metric 1024 pref medium

Therefore, the check in fib6_is_reject() for whether the nexthop device is
a loopback device seems quite pointless here. It's probably only needed when
promoting routes that use the loopback device to reject routes, which
happens in ip6_route_info_create_nh() (the other caller of
fib6_is_reject()).

I suggest simplifying the check so that it only applies to reject routes
[1]. This fixes the issue because RTF_REJECT is a route attribute, not a
nexthop attribute, so the nexthop code never sets it.

[1]
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 85df25c36409..035e3f668d49 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3582,7 +3582,6 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
netdevice_tracker *dev_tracker = &fib6_nh->fib_nh_dev_tracker;
struct net_device *dev = NULL;
struct inet6_dev *idev = NULL;
- int addr_type;
int err;
fib6_nh->fib_nh_family = AF_INET6;
@@ -3624,11 +3623,10 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
fib6_nh->fib_nh_weight = 1;
- /* We cannot add true routes via loopback here,
- * they would result in kernel looping; promote them to reject routes
+ /* Reset the nexthop device to the loopback device in case of reject
+ * routes.
*/
- addr_type = ipv6_addr_type(&cfg->fc_dst);
- if (fib6_is_reject(cfg->fc_flags, dev, addr_type)) {
+ if (cfg->fc_flags & RTF_REJECT) {
/* hold loopback dev/idev if we haven't done so. */
if (dev != net->loopback_dev) {
if (dev) {
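
To make the difference between the two checks concrete, here is a small
user-space model. It is only a sketch: the model_* names, the M_* constants
and struct model_dev are hypothetical stand-ins, and the shape of the old
predicate is my reading of fib6_is_reject(), not a copy of the kernel code.
It compares the old condition with the simplified one for a nexthop object
on lo (no route flags, fc_dst left as ::).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag values, used by this user-space model only. */
#define M_RTF_REJECT    0x0200u
#define M_RTF_ANYCAST   0x00100000u
#define M_RTF_LOCAL     0x80000000u
#define M_IFF_LOOPBACK  0x8u
#define M_ADDR_LOOPBACK 0x0010u

/* Hypothetical stand-in for struct net_device; only the flags matter here. */
struct model_dev {
	unsigned int flags;
};

/*
 * Assumed shape of the existing predicate: an explicit reject flag, or a
 * non-local, non-anycast route via a loopback device whose destination is
 * not itself a loopback address.
 */
static bool model_is_reject_old(uint32_t rt_flags,
				const struct model_dev *dev, int addr_type)
{
	return (rt_flags & M_RTF_REJECT) ||
	       (dev && (dev->flags & M_IFF_LOOPBACK) &&
		!(addr_type & M_ADDR_LOOPBACK) &&
		!(rt_flags & (M_RTF_ANYCAST | M_RTF_LOCAL)));
}

/* Simplified check suggested in [1]: only the route flag decides. */
static bool model_is_reject_new(uint32_t rt_flags)
{
	return (rt_flags & M_RTF_REJECT) != 0;
}

static void model_selfcheck(void)
{
	struct model_dev lo = { .flags = M_IFF_LOOPBACK };
	struct model_dev dummy = { .flags = 0 };

	/* Nexthop object on lo: no route flags, destination :: (type 0). */
	assert(model_is_reject_old(0, &lo, 0));   /* misclassified */
	assert(!model_is_reject_new(0));          /* left alone */

	/* Explicit reject route on a dummy device: both checks agree. */
	assert(model_is_reject_old(M_RTF_REJECT, &dummy, 0));
	assert(model_is_reject_new(M_RTF_REJECT));
}
```

Calling model_selfcheck() from a test main() exercises both cases. In this
model the nexthop object only takes the reject path under the old condition,
which is the path that skips fib_nh_common_init().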