Re: [PATCH net v2] ipv6: anycast: insert aca into global hash under idev->lock

From: Ido Schimmel <idosch@nvidia.com>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: netdev@vger.kernel.org,
	syzbot+819eb928d120d2bdad0e@syzkaller.appspotmail.com,
	Kuniyuki Iwashima <kuniyu@google.com>,
	David Ahern <dsahern@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v2] ipv6: anycast: insert aca into global hash under idev->lock
Date: Sun, 31 May 2026 19:45:15 +0300
Message-ID: <20260531164515.GA232188@shredder>
In-Reply-To: <20260529152219.235475-1-jiayuan.chen@linux.dev>

On Fri, May 29, 2026 at 11:22:18PM +0800, Jiayuan Chen wrote:
> syzbot reported a splat [1]: a slab-use-after-free in
> ipv6_chk_acast_addr(), which walks the global inet6_acaddr_lst[] hash
> under RCU and dereferences a struct ifacaddr6 that has already been
> freed while still linked in the hash, so a later reader walks into a
> dangling node.
> 
> In __ipv6_dev_ac_inc() the aca is allocated with refcount 1, then
> aca_get() bumps it to 2 to keep it alive across the unlocked region.
> It is published to idev->ac_list under idev->lock, but
> ipv6_add_acaddr_hash() runs after write_unlock_bh(). A concurrent
> teardown (ipv6_ac_destroy_dev() from addrconf_ifdown(), under RTNL)
> can slip into that window:
> 
>   CPU0 __ipv6_dev_ac_inc           CPU1 ipv6_ac_destroy_dev (RTNL)
>   ------------------------------   ------------------------------------
>   aca_alloc()              refcnt 1
>   aca_get()               refcnt 2
>   write_lock_bh(idev->lock)
>     add aca to ac_list
>   write_unlock_bh(idev->lock)
>                                    write_lock_bh(idev->lock)
>                                      pull aca off ac_list
>                                    write_unlock_bh(idev->lock)
>                                    ipv6_del_acaddr_hash(aca)
>                                      hlist_del_init_rcu() is a no-op,
>                                      aca is not in the hash yet
>                                    aca_put()           refcnt 2->1
>   ipv6_add_acaddr_hash(aca)
>     aca now inserted into the hash
>   aca_put()                refcnt 1->0
>     call_rcu(aca_free_rcu) -> kfree(aca)
> 
> The hash removal becomes a no-op because the insertion has not
> happened yet, so once CPU0 inserts and drops the last reference, the
> aca is freed while still linked in inet6_acaddr_lst[], and readers
> dereference freed memory after the slab slot is reused.
> 
> This window opened once RTNL stopped serializing the join path against
> device teardown. Move ipv6_add_acaddr_hash() inside the idev->lock
> section so the ac_list and hash insertions are atomic with respect to
> teardown: a racing remover now either misses the aca entirely or finds
> it in both lists.
> 
> acaddr_hash_lock is now nested under idev->lock, which is acquired in
> softirq context, so switch all acaddr_hash_lock sites to spin_lock_bh()
> to avoid the irq lock inversion reported in [2].
> 
> [1] https://syzkaller.appspot.com/bug?extid=a01df04303c131efbf3a
> [2] https://lore.kernel.org/netdev/6a194ef7.ba3b1513.1890b4.0000.GAE@google.com/
> 
> Reported-by: syzbot+819eb928d120d2bdad0e@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/6a191f87.ce022c6e.138e56.0003.GAE@google.com/T/
> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
> Fixes: eb1ac9ff6c4a ("ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST.")
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>
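
The interleaving in the quoted commit message can be replayed in a small
userspace model. This is only a sketch under stated assumptions: struct
toy_aca, toy_put(), toy_teardown() and toy_join_races() are made-up names,
the locks appear only as comments, and a single thread replays the steps in
the order shown in the diagram. It illustrates the ordering problem, not the
kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct ifacaddr6; the fields are illustrative only. */
struct toy_aca {
	int  refcnt;
	bool in_ac_list;
	bool in_hash;
	bool freed;
};

static void toy_put(struct toy_aca *a)
{
	if (--a->refcnt == 0)
		a->freed = true;	/* models call_rcu() -> kfree() */
}

/* Teardown side: unlink from ac_list, unhash, drop one reference. */
static void toy_teardown(struct toy_aca *a)
{
	if (!a->in_ac_list)
		return;			/* remover misses the aca entirely */
	a->in_ac_list = false;
	a->in_hash = false;		/* no-op if not hashed yet */
	toy_put(a);
}

/*
 * Replay the join path with teardown running right after the
 * idev->lock section. Returns true if the object ends up freed
 * while still linked in the hash.
 */
static bool toy_join_races(bool hash_under_lock)
{
	struct toy_aca a = { .refcnt = 1 };	/* aca_alloc() */

	a.refcnt++;				/* aca_get() */

	/* --- idev->lock held --- */
	a.in_ac_list = true;
	if (hash_under_lock)
		a.in_hash = true;		/* patched ordering */
	/* --- idev->lock released --- */

	toy_teardown(&a);			/* CPU1 slips in here */

	if (!hash_under_lock)
		a.in_hash = true;		/* late hash insertion */

	toy_put(&a);				/* final aca_put() on CPU0 */
	return a.freed && a.in_hash;
}
```

In this model toy_join_races(false) ends with the object freed but still
hashed, while toy_join_races(true) does not, because teardown finds the
object in both lists and unlinks it from both.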

There's a comment from Sashiko about a UAF / leak with regard to the
associated route, but I don't think it can happen:

"
This is a pre-existing issue, but could a race condition here cause a
use-after-free of the fib6_info object and leak the net_device?

Since ip6_ins_rt() is called after dropping the idev->lock, what happens if
a concurrent device teardown via ipv6_ac_destroy_dev() intervenes?

If ipv6_ac_destroy_dev() acquires the lock right after it is dropped here,
it would find the newly published aca in idev->ac_list, unlink it, and call
ip6_del_rt().

Since the route isn't inserted yet, ip6_del_rt() fails to remove it but
still calls fib6_info_release(), dropping the refcount of f6i to zero.
When this thread resumes, would ip6_ins_rt() then insert the 0-refcount
route into the FIB tree?
"

I don't believe the reference count drops to 0 since the address is
still alive and aca_alloc() acquires a reference on the route via
fib6_info_hold().
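
The counting argument can be written out as a tiny userspace sketch. The
names struct toy_f6i, toy_hold(), toy_release() and
toy_refcnt_after_early_del() are hypothetical, and the starting count of 1
for a freshly allocated route is an assumption of the sketch, not something
stated in this thread.

```c
#include <assert.h>

/* Toy stand-in for the route object; only the refcount is modeled. */
struct toy_f6i {
	int refcnt;
};

static void toy_hold(struct toy_f6i *rt)
{
	rt->refcnt++;
}

static int toy_release(struct toy_f6i *rt)
{
	return --rt->refcnt;
}

/* Count left after teardown releases the route before it was inserted. */
static int toy_refcnt_after_early_del(void)
{
	struct toy_f6i rt = { .refcnt = 1 };	/* assumed: allocation returns one ref */

	toy_hold(&rt);		/* aca_alloc() -> fib6_info_hold() */

	/* Racing teardown: route not found, but one reference is released. */
	return toy_release(&rt);
}
```

Under these assumptions the early release leaves one reference, so the
later insertion would not operate on a zero-refcount route.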

"
Since device unregistration has already flushed all routes, it appears this
orphaned route is never removed. Would this cause unregister_netdevice()
to hang indefinitely due to the held net_device reference?

Could ip6_ins_rt() be moved inside the idev->lock critical section to
prevent this race?
"

The kernel will keep emitting NETDEV_UNREGISTER until the netdev reference
count drops to 1, and the route will be cleaned up via addrconf_notify() ->
addrconf_ifdown() -> rt6_disable_ip().

Racing addrconf_{join,leave}_solict() also seems fine since
__ipv6_dev_mc_inc() will be a NOP due to the in6_dev_get() check.

Thread overview: 5+ messages
2026-05-29 15:22 [PATCH net v2] ipv6: anycast: insert aca into global hash under idev->lock Jiayuan Chen
2026-05-29 20:51 ` Jakub Kicinski
2026-05-30  5:00   ` Jiayuan Chen
2026-05-31 16:45 ` Ido Schimmel [this message]
2026-06-03  2:50 ` patchwork-bot+netdevbpf
