From: "Jiayuan Chen" <jiayuan.chen@linux.dev>
To: "Ido Schimmel" <idosch@nvidia.com>, "David Ahern" <dsahern@kernel.org>
Cc: netdev@vger.kernel.org, dsahern@kernel.org,
	jiayuan.chen@shopee.com,
	syzbot+334190e097a98a1b81bb@syzkaller.appspotmail.com,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Simon Horman" <horms@kernel.org>,
	"Shuah Khan" <shuah@kernel.org>,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH net v2 1/2] net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop
Date: Mon, 02 Mar 2026 09:07:34 +0000
Message-ID: <80cf6abc40af7f2d072bd9c55758849bb05bfa95@linux.dev>
In-Reply-To: <20260302082551.GA814377@shredder>

March 2, 2026 at 16:25, "Ido Schimmel" <idosch@nvidia.com> wrote:


> 
> On Mon, Mar 02, 2026 at 01:11:28PM +0800, Jiayuan Chen wrote:
> 
> > 
> > From: Jiayuan Chen <jiayuan.chen@shopee.com>
> >  
> >  When a standalone IPv6 nexthop object is created with a loopback device
> >  (e.g., "ip -6 nexthop add id 100 dev lo"), fib6_nh_init() misclassifies
> >  it as a reject route. This is because nexthop objects have no destination
> >  prefix (fc_dst=::), causing fib6_is_reject() to match any loopback
> >  nexthop. The reject path skips fib_nh_common_init(), leaving
> >  nhc_pcpu_rth_output unallocated. If an IPv4 route later references this
> >  nexthop, __mkroute_output() dereferences NULL nhc_pcpu_rth_output and
> >  panics.
> >  
> >  The reject classification was designed for regular IPv6 routes to prevent
> >  kernel loopback loops, but nexthop objects should not be subject to this
> >  check since they carry no destination information - loop prevention is
> >  handled separately when the route is created.
> >  
> >  An alternative approach of unconditionally calling fib_nh_common_init()
> >  for all reject routes was considered, but on large machines (e.g., 256
> >  CPUs) with many routes, this wastes significant memory since
> >  nhc_pcpu_rth_output allocates a per-CPU pointer for each route.
> >  
> >  Since fib6_nh_init() is shared by multiple callers (route creation,
> >  nexthop object creation, IPv4 gateway validation), using fc_dst_len to
> >  implicitly distinguish nexthop objects would be fragile. Add an explicit
> >  fc_is_nh flag to fib6_config to clearly identify nexthop object creation
> >  and skip the reject check for this path.
> >  
> >  Fixes: 7dd73168e273 ("ipv6: Always allocate pcpu memory in a fib6_nh")
> >  Reported-by: syzbot+334190e097a98a1b81bb@syzkaller.appspotmail.com
> >  Closes: https://lore.kernel.org/all/698f8482.a70a0220.2c38d7.00ca.GAE@google.com/T/
> >  Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
> >  ---
> >  include/net/ip6_fib.h | 1 +
> >  net/ipv4/nexthop.c | 1 +
> >  net/ipv6/route.c | 8 +++++++-
> >  3 files changed, 9 insertions(+), 1 deletion(-)
> >  
> >  diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> >  index 88b0dd4d8e09..7710f247b8d9 100644
> >  --- a/include/net/ip6_fib.h
> >  +++ b/include/net/ip6_fib.h
> >  @@ -62,6 +62,7 @@ struct fib6_config {
> >  struct nlattr *fc_encap;
> >  u16 fc_encap_type;
> >  bool fc_is_fdb;
> >  + bool fc_is_nh;
> >  };
> >  
> >  struct fib6_node {
> >  diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> >  index 7b9d70f9b31c..efad2dd27636 100644
> >  --- a/net/ipv4/nexthop.c
> >  +++ b/net/ipv4/nexthop.c
> >  @@ -2859,6 +2859,7 @@ static int nh_create_ipv6(struct net *net, struct nexthop *nh,
> >  struct fib6_config fib6_cfg = {
> >  .fc_table = l3mdev_fib_table(cfg->dev),
> >  .fc_ifindex = cfg->nh_ifindex,
> >  + .fc_is_nh = true,
> >  .fc_gateway = cfg->gw.ipv6,
> >  .fc_flags = cfg->nh_flags,
> >  .fc_nlinfo = cfg->nlinfo,
> >  diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> >  index c0350d97307e..347f464ce7fe 100644
> >  --- a/net/ipv6/route.c
> >  +++ b/net/ipv6/route.c
> >  @@ -3628,7 +3628,13 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
> >  * they would result in kernel looping; promote them to reject routes
> >  */
> >  addr_type = ipv6_addr_type(&cfg->fc_dst);
> >  - if (fib6_is_reject(cfg->fc_flags, dev, addr_type)) {
> >  + /*
> >  + * Nexthop objects have no destination prefix, so fib6_is_reject()
> >  + * will misclassify loopback nexthops as reject routes, causing
> >  + * fib_nh_common_init() to be skipped along with its allocation
> >  + * of nhc_pcpu_rth_output, which IPv4 routes require.
> >  + */
> >  + if (!cfg->fc_is_nh && fib6_is_reject(cfg->fc_flags, dev, addr_type)) {
> >  /* hold loopback dev/idev if we haven't done so. */
> >  if (dev != net->loopback_dev) {
> >  if (dev) {
> > 
> The code basically resets the nexthop device to the loopback device in
> case of reject routes:
> 
> # ip link add name dummy1 up type dummy
> # ip route add unreachable 2001:db8:1::/64 dev dummy1
> # ip -6 route show 2001:db8:1::/64
> unreachable 2001:db8:1::/64 dev lo metric 1024 pref medium
> 
> Therefore, the check in fib6_is_reject() regarding the nexthop device
> being a loopback seems quite pointless. It's probably only needed when
> promoting routes that are using the loopback device to reject routes,
> which happens in ip6_route_info_create_nh() (the other caller of
> fib6_is_reject()).
> 
> I suggest simplifying the check so that it only applies to reject routes
> [1]. It fixes the issue since RTF_REJECT is a route attribute and not a
> nexthop attribute, so it will never be set by the nexthop code.
> 
> [1]
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 85df25c36409..035e3f668d49 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -3582,7 +3582,6 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
>  netdevice_tracker *dev_tracker = &fib6_nh->fib_nh_dev_tracker;
>  struct net_device *dev = NULL;
>  struct inet6_dev *idev = NULL;
> - int addr_type;
>  int err;
>  
>  fib6_nh->fib_nh_family = AF_INET6;
> @@ -3624,11 +3623,10 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
>  
>  fib6_nh->fib_nh_weight = 1;
>  
> - /* We cannot add true routes via loopback here,
> - * they would result in kernel looping; promote them to reject routes
> + /* Reset the nexthop device to the loopback device in case of reject
> + * routes.
>  */
> - addr_type = ipv6_addr_type(&cfg->fc_dst);
> - if (fib6_is_reject(cfg->fc_flags, dev, addr_type)) {
> + if (cfg->fc_flags & RTF_REJECT) {
>  /* hold loopback dev/idev if we haven't done so. */
>  if (dev != net->loopback_dev) {
>  if (dev) {
>

Thanks, this is indeed the simplest fix.

Let me walk through each case to confirm my understanding:

Case 1: Explicit reject route (with RTF_REJECT)
ip -6 route add unreachable 2001:db8:1::/64

cfg->fc_flags has RTF_REJECT set before fib6_nh_init() is entered, so the
reject path is taken. fib_nh_common_init() is skipped and
nhc_pcpu_rth_output is not allocated. This is fine since reject routes
never need it.


Case 2: Loopback implicit reject route (without RTF_REJECT)
ip -6 route add 2001:db8::/32 dev lo

cfg->fc_flags does not have RTF_REJECT, so fib6_nh_init() takes the normal
path and fib_nh_common_init() allocates nhc_pcpu_rth_output. Later,
ip6_route_info_create_nh() calls fib6_is_reject() and marks the route as
RTF_REJECT. The allocated nhc_pcpu_rth_output is unused but harmless.


Case 3: Standalone nexthop object (our bug scenario)
ip -6 nexthop add id 100 dev lo
ip route add 172.20.20.0/24 nhid 100

cfg->fc_flags does not have RTF_REJECT (nexthop objects never carry route
attributes), so fib6_nh_init() takes the normal path and
fib_nh_common_init() allocates nhc_pcpu_rth_output. This fixes the crash
when an IPv4 route later references this nexthop.
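
For illustration only, the three cases can be modelled in plain userspace
C. This is a rough sketch, not kernel code: struct model_cfg, the
model_*() helpers, walk_cases() and the MODEL_RTF_REJECT value are all
hypothetical stand-ins, and model_is_reject() keeps only the two
conditions discussed in this thread (RTF_REJECT set, or a loopback device
with a non-loopback destination). It compares whether the current check
and the suggested check would skip fib_nh_common_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's RTF_REJECT bit; the value is illustrative. */
#define MODEL_RTF_REJECT 0x0200u

/* Hypothetical, reduced view of what fib6_nh_init() inspects. */
struct model_cfg {
	uint32_t fc_flags;         /* route flags, if any */
	bool dev_is_loopback;      /* nexthop device is "lo" */
	bool dst_is_loopback_addr; /* false for fc_dst == :: */
};

/* Simplified model of fib6_is_reject(); other conditions are omitted. */
static bool model_is_reject(const struct model_cfg *cfg)
{
	return (cfg->fc_flags & MODEL_RTF_REJECT) ||
	       (cfg->dev_is_loopback && !cfg->dst_is_loopback_addr);
}

/* Current check: reject path (no per-CPU allocation) if is_reject. */
static bool model_old_skips_common_init(const struct model_cfg *cfg)
{
	return model_is_reject(cfg);
}

/* Suggested check: reject path only when RTF_REJECT is set. */
static bool model_new_skips_common_init(const struct model_cfg *cfg)
{
	return (cfg->fc_flags & MODEL_RTF_REJECT) != 0;
}

/* Asserts the outcome described for each of the three cases. */
void walk_cases(void)
{
	/* Case 1: ip -6 route add unreachable 2001:db8:1::/64 */
	const struct model_cfg explicit_reject = {
		.fc_flags = MODEL_RTF_REJECT,
	};
	/* Case 2: ip -6 route add 2001:db8::/32 dev lo */
	const struct model_cfg route_via_lo = {
		.dev_is_loopback = true,
	};
	/* Case 3: ip -6 nexthop add id 100 dev lo (fc_dst is ::) */
	const struct model_cfg nexthop_on_lo = {
		.dev_is_loopback = true,
	};

	/* Case 1: skipped both before and after the change. */
	assert(model_old_skips_common_init(&explicit_reject));
	assert(model_new_skips_common_init(&explicit_reject));

	/* Case 2: now allocated; the route is promoted to reject later. */
	assert(model_old_skips_common_init(&route_via_lo));
	assert(!model_new_skips_common_init(&route_via_lo));

	/* Case 3: skipped before (the bug), allocated after the change. */
	assert(model_old_skips_common_init(&nexthop_on_lo));
	assert(!model_new_skips_common_init(&nexthop_on_lo));
}
```

In this model cases 2 and 3 look identical to fib6_nh_init(), which is
why the check there cannot tell them apart and why the promotion to a
reject route has to stay on the route creation side.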

