From: Ido Schimmel <idosch@nvidia.com>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: netdev@vger.kernel.org,
	syzbot+819eb928d120d2bdad0e@syzkaller.appspotmail.com,
	Kuniyuki Iwashima <kuniyu@google.com>,
	David Ahern <dsahern@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v2] ipv6: anycast: insert aca into global hash under idev->lock
Date: Sun, 31 May 2026 19:45:15 +0300
Message-ID: <20260531164515.GA232188@shredder>
In-Reply-To: <20260529152219.235475-1-jiayuan.chen@linux.dev>

On Fri, May 29, 2026 at 11:22:18PM +0800, Jiayuan Chen wrote:
> syzbot reported a splat [1]: a slab-use-after-free in
> ipv6_chk_acast_addr(), which walks the global inet6_acaddr_lst[] hash
> under RCU and dereferences a struct ifacaddr6 that has already been
> freed while still linked in the hash, so a later reader walks into a
> dangling node.
> 
> In __ipv6_dev_ac_inc() the aca is allocated with refcount 1, then
> aca_get() bumps it to 2 to keep it alive across the unlocked region.
> It is published to idev->ac_list under idev->lock, but
> ipv6_add_acaddr_hash() runs after write_unlock_bh(). A concurrent
> teardown (ipv6_ac_destroy_dev() from addrconf_ifdown(), under RTNL)
> can slip into that window:
> 
>   CPU0 __ipv6_dev_ac_inc           CPU1 ipv6_ac_destroy_dev (RTNL)
>   ------------------------------   ------------------------------------
>   aca_alloc()              refcnt 1
>   aca_get()               refcnt 2
>   write_lock_bh(idev->lock)
>     add aca to ac_list
>   write_unlock_bh(idev->lock)
>                                    write_lock_bh(idev->lock)
>                                      pull aca off ac_list
>                                    write_unlock_bh(idev->lock)
>                                    ipv6_del_acaddr_hash(aca)
>                                      hlist_del_init_rcu() is a no-op,
>                                      aca is not in the hash yet
>                                    aca_put()           refcnt 2->1
>   ipv6_add_acaddr_hash(aca)
>     aca now inserted into the hash
>   aca_put()                refcnt 1->0
>     call_rcu(aca_free_rcu) -> kfree(aca)
> 
> The hash removal becomes a no-op because the insertion has not
> happened yet, so once CPU0 inserts and drops the last reference, the
> aca is freed while still linked in inet6_acaddr_lst[], and readers
> dereference freed memory after the slab slot is reused.
> 
> This window opened once RTNL stopped serializing the join path against
> device teardown. Move ipv6_add_acaddr_hash() inside the idev->lock
> section so the ac_list and hash insertions are atomic with respect to
> teardown: a racing remover now either misses the aca entirely or finds
> it in both lists.
> 
> acaddr_hash_lock is now nested under idev->lock, which is acquired in
> softirq context, so switch all acaddr_hash_lock sites to spin_lock_bh()
> to avoid the irq lock inversion reported in [2].
> 
> [1] https://syzkaller.appspot.com/bug?extid=a01df04303c131efbf3a
> [2] https://lore.kernel.org/netdev/6a194ef7.ba3b1513.1890b4.0000.GAE@google.com/
> 
> Reported-by: syzbot+819eb928d120d2bdad0e@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/6a191f87.ce022c6e.138e56.0003.GAE@google.com/T/
> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
> Fixes: eb1ac9ff6c4a ("ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST.")
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>

There's a comment from Sashiko about a UAF / leak with regard to the
associated route, but I don't think it can happen:

"
This is a pre-existing issue, but could a race condition here cause a
use-after-free of the fib6_info object and leak the net_device?

Since ip6_ins_rt() is called after dropping the idev->lock, what happens if
a concurrent device teardown via ipv6_ac_destroy_dev() intervenes?

If ipv6_ac_destroy_dev() acquires the lock right after it is dropped here,
it would find the newly published aca in idev->ac_list, unlink it, and call
ip6_del_rt().

Since the route isn't inserted yet, ip6_del_rt() fails to remove it but
still calls fib6_info_release(), dropping the refcount of f6i to zero.
When this thread resumes, would ip6_ins_rt() then insert the 0-refcount
route into the FIB tree?
"

I don't believe the reference count drops to 0 since the address is
still alive and aca_alloc() acquires a reference on the route via
fib6_info_hold().
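
The reference counting behind this argument can be sketched as a
userspace toy model. It is not kernel code: struct toy_f6i and the
toy_* helpers are hypothetical names, and the initial count of 1 (a
reference owned by whoever allocated the route) is an assumption. The
hold in aca_alloc() and the release in ip6_del_rt() are the two steps
named in this thread.

```c
#include <assert.h>

/* Hypothetical stand-in for struct fib6_info: only a reference count. */
struct toy_f6i {
	int refcnt;
};

static void toy_f6i_hold(struct toy_f6i *f6i)
{
	f6i->refcnt++;
}

static void toy_f6i_release(struct toy_f6i *f6i)
{
	assert(f6i->refcnt > 0);
	f6i->refcnt--;
}

/* Returns the route refcount seen just before the join path inserts it. */
static int toy_f6i_refs_before_insert(void)
{
	/* Assumption: the route starts with one reference for its creator. */
	struct toy_f6i f6i = { .refcnt = 1 };

	/* aca_alloc() takes a reference on the route for the aca. */
	toy_f6i_hold(&f6i);

	/*
	 * Racing teardown: ip6_del_rt() does not find the route in the
	 * tree but still releases one reference. The aca itself survives
	 * because the join path still holds its aca_get() reference, so
	 * the aca's hold on the route is not dropped here.
	 */
	toy_f6i_release(&f6i);

	return f6i.refcnt;
}
```

Under these assumptions toy_f6i_refs_before_insert() returns 1, so the
route would still be referenced when the join path resumes.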

"
Since device unregistration has already flushed all routes, it appears this
orphaned route is never removed. Would this cause unregister_netdevice()
to hang indefinitely due to the held net_device reference?

Could ip6_ins_rt() be moved inside the idev->lock critical section to
prevent this race?
"

The kernel will keep emitting NETDEV_UNREGISTER until the netdev
reference count drops to 1, and the route will be cleaned up via
addrconf_notify() -> addrconf_ifdown() -> rt6_disable_ip().

Racing addrconf_{join,leave}_solict() also seems fine since
__ipv6_dev_mc_inc() will be a NOP due to the in6_dev_get() check.
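
For reference, the interleaving from the quoted commit message can be
replayed deterministically in a small userspace model. This is only a
single-threaded sketch: struct toy_aca and the toy_* functions are
hypothetical names, the locks are represented by comments, and the
teardown is simply called at the point where the window opens. It
compares hashing after the unlock (old behaviour) with hashing inside
the locked section (the patch).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ifacaddr6: only what the race touches. */
struct toy_aca {
	int refcnt;
	bool on_ac_list;
	bool in_hash;
	bool freed;
};

static void toy_put(struct toy_aca *aca)
{
	assert(aca->refcnt > 0);
	if (--aca->refcnt == 0)
		aca->freed = true;	/* models call_rcu(aca_free_rcu) */
}

/* Teardown side, modelled on the CPU1 column of the diagram. */
static void toy_destroy(struct toy_aca *aca)
{
	if (!aca->on_ac_list)
		return;			/* remover misses the aca entirely */
	aca->on_ac_list = false;	/* pull aca off ac_list */
	aca->in_hash = false;		/* no-op if not hashed yet */
	toy_put(aca);
}

/*
 * Replays the join path with teardown injected right after the unlock.
 * Returns true if the aca ends up freed while still linked in the hash.
 */
static bool toy_join_races(bool hash_under_lock)
{
	struct toy_aca aca = { .refcnt = 1 };	/* aca_alloc() */

	aca.refcnt++;				/* aca_get() */

	/* write_lock_bh(idev->lock) */
	aca.on_ac_list = true;
	if (hash_under_lock)
		aca.in_hash = true;		/* patched ordering */
	/* write_unlock_bh(idev->lock) */

	toy_destroy(&aca);			/* teardown slips in here */

	if (!hash_under_lock)
		aca.in_hash = true;		/* old ordering */
	toy_put(&aca);

	return aca.freed && aca.in_hash;
}
```

In this model toy_join_races(false) returns true (freed while still
hashed) and toy_join_races(true) returns false, which matches the
"finds it in both lists" description in the commit message.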

Thread overview: 4+ messages
2026-05-29 15:22 [PATCH net v2] ipv6: anycast: insert aca into global hash under idev->lock Jiayuan Chen
2026-05-29 20:51 ` Jakub Kicinski
2026-05-30  5:00   ` Jiayuan Chen
2026-05-31 16:45 ` Ido Schimmel [this message]
