From: Eric Dumazet <eric.dumazet@gmail.com>
To: Jiri Benc <jbenc@redhat.com>,
netdev@vger.kernel.org, edumazet@google.com
Cc: Stephen Hemminger <stephen@networkplumber.org>,
David Ahern <dsahern@kernel.org>
Subject: Re: [PATCH net] ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
Date: Fri, 5 Apr 2024 13:31:52 +0200 [thread overview]
Message-ID: <3e90336c-6859-474c-aa1e-01ccc665ad49@gmail.com> (raw)
In-Reply-To: <8bbe1218656e66552ff28cbee8c7d1f0ffd8e9fd.1712314149.git.jbenc@redhat.com>
On 4/5/24 12:54, Jiri Benc wrote:
> Although ipv6_get_ifaddr walks inet6_addr_lst under the RCU lock, it
> still means hlist_for_each_entry_rcu can return an item that got removed
> from the list. The memory itself of such item is not freed thanks to RCU
> but nothing guarantees the actual content of the memory is sane.
>
> In particular, the reference count can be zero. This can happen if
> ipv6_del_addr is called in parallel. ipv6_del_addr removes the entry
> from inet6_addr_lst (hlist_del_init_rcu(&ifp->addr_lst)) and drops all
> references (__in6_ifa_put(ifp) + in6_ifa_put(ifp)). With bad enough
> timing, this can happen:
>
> 1. In ipv6_get_ifaddr, hlist_for_each_entry_rcu returns an entry.
>
> 2. Then, the whole ipv6_del_addr is executed for the given entry. The
> reference count drops to zero and kfree_rcu is scheduled.
>
> 3. ipv6_get_ifaddr continues and increments the reference count
> (in6_ifa_hold).
>
> 4. The rcu is unlocked and the entry is freed.
>
> 5. Later, the reference count is dropped to zero (again) and kfree_rcu
> is scheduled (again).
refcount_t semantic should prevent this double transition to 0 ?
Can you include a stack trace in the changelog ?
Otherwise patch looks good to me, thanks !
Reviewed-by: Eric Dumazet <edumazet@google.com>
>
> Prevent increasing of the reference count in such case. The name
> in6_ifa_hold_safe is chosen to mimic the existing fib6_info_hold_safe.
>
> Fixes: 5c578aedcb21d ("IPv6: convert addrconf hash list to RCU")
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
>
> Side note: While this fixes one bug, there may be more locking bugs
> lurking aroung inet6_ifaddr. The semantics of locking of inet6_ifaddr is
> wild and fragile. Some of the fields are freed in ipv6_del_addr and
> guarded by ifa->state == INET6_IFADDR_STATE_DEAD and RTNL. Some of the
> fields are freed in inet6_ifa_finish_destroy and guarded by ifa->refcnt
> and RCU. Needless to say, this semantics is undocumented. Worse,
> ifa->state guard may not be enough. For example, ipv6_get_ifaddr can
> still return an entry that proceeded through ipv6_del_addr, which means
> ifa->state is INET6_IFADDR_STATE_DEAD. However, at least some callers
> (e.g. ndisc_recv_ns) seem to change ifa->state to something else. As
> another example, ipv6_del_addr relies on ifa->flags, which are changed
> throughout the code without RTNL. All of this may be okay but it's far
> from clear.
> ---
> include/net/addrconf.h | 4 ++++
> net/ipv6/addrconf.c | 7 ++++---
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/addrconf.h b/include/net/addrconf.h
> index 9d06eb945509..62a407db1bf5 100644
> --- a/include/net/addrconf.h
> +++ b/include/net/addrconf.h
> @@ -438,6 +438,10 @@ static inline void in6_ifa_hold(struct inet6_ifaddr *ifp)
> refcount_inc(&ifp->refcnt);
> }
>
> +static inline bool in6_ifa_hold_safe(struct inet6_ifaddr *ifp)
> +{
> + return refcount_inc_not_zero(&ifp->refcnt);
> +}
>
> /*
> * compute link-local solicited-node multicast address
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 92db9b474f2b..779aa6ecdd49 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -2091,9 +2091,10 @@ struct inet6_ifaddr *ipv6_get_ifaddr(struct net *net, const struct in6_addr *add
> if (ipv6_addr_equal(&ifp->addr, addr)) {
> if (!dev || ifp->idev->dev == dev ||
> !(ifp->scope&(IFA_LINK|IFA_HOST) || strict)) {
> - result = ifp;
> - in6_ifa_hold(ifp);
> - break;
> + if (in6_ifa_hold_safe(ifp)) {
> + result = ifp;
> + break;
> + }
> }
> }
> }
next prev parent reply other threads:[~2024-04-05 11:31 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-05 10:54 [PATCH net] ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr Jiri Benc
2024-04-05 11:31 ` Eric Dumazet [this message]
2024-04-05 11:47 ` Jiri Benc
2024-04-06 19:51 ` David Ahern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3e90336c-6859-474c-aa1e-01ccc665ad49@gmail.com \
--to=eric.dumazet@gmail.com \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=jbenc@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=stephen@networkplumber.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.