netdev.vger.kernel.org archive mirror
* [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c
@ 2025-06-24 20:24 Kuniyuki Iwashima
  2025-06-24 20:24 ` [PATCH v2 net-next 01/15] ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}() Kuniyuki Iwashima
                   ` (15 more replies)
  0 siblings, 16 replies; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

This is a prep series for RCU conversion of RTM_NEWNEIGH, which needs
RTNL during neigh_table.{pconstructor,pdestructor}() touching IPv6
multicast code.

Currently, IPv6 multicast code is protected by lock_sock() and
inet6_dev->mc_lock, and RTNL is not actually needed.

In addition, anycast code is also in the same situation and does not
need RTNL at all.

This series removes RTNL from net/ipv6/{mcast.c,anycast.c} and finally
removes setsockopt_needs_rtnl() from do_ipv6_setsockopt().


Changes:
  v2:
    * Patch 2: Clarify which function doesn't need assertion
    * Patch 6, 9, 14: Call rt6_lookup() and dev_hold() under RCU

  v1: https://lore.kernel.org/netdev/20250616233417.1153427-1-kuni1840@gmail.com/


Kuniyuki Iwashima (15):
  ipv6: ndisc: Remove __in6_dev_get() in
    pndisc_{constructor,destructor}().
  ipv6: mcast: Replace locking comments with lockdep annotations.
  ipv6: mcast: Check inet6_dev->dead under idev->mc_lock in
    __ipv6_dev_mc_inc().
  ipv6: mcast: Remove mca_get().
  ipv6: mcast: Use in6_dev_get() in ipv6_dev_mc_dec().
  ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and
    MCAST_JOIN_GROUP.
  ipv6: mcast: Don't hold RTNL for IPV6_DROP_MEMBERSHIP and
    MCAST_LEAVE_GROUP.
  ipv6: mcast: Don't hold RTNL in ipv6_sock_mc_close().
  ipv6: mcast: Don't hold RTNL for MCAST_ socket options.
  ipv6: mcast: Remove unnecessary ASSERT_RTNL and comment.
  ipv6: anycast: Don't use rtnl_dereference().
  ipv6: anycast: Don't hold RTNL for IPV6_LEAVE_ANYCAST and
    IPV6_ADDRFORM.
  ipv6: anycast: Unify two error paths in ipv6_sock_ac_join().
  ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST.
  ipv6: Remove setsockopt_needs_rtnl().

 include/linux/netdevice.h |   4 +-
 net/core/dev.c            |  38 +++--
 net/ipv6/addrconf.c       |  12 +-
 net/ipv6/anycast.c        |  95 ++++++-----
 net/ipv6/ipv6_sockglue.c  |  28 +---
 net/ipv6/mcast.c          | 328 +++++++++++++++++++-------------------
 net/ipv6/ndisc.c          |  10 +-
 7 files changed, 254 insertions(+), 261 deletions(-)

-- 
2.49.0



* [PATCH v2 net-next 01/15] ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}().
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:26   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 02/15] ipv6: mcast: Replace locking comments with lockdep annotations Kuniyuki Iwashima
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

ipv6_dev_mc_{inc,dec}() already check that the device has an inet6_dev.

Let's remove __in6_dev_get() from pndisc_constructor() and
pndisc_destructor().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/ndisc.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index ecb5c4b8518f..beb1814a1ac2 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -377,11 +377,12 @@ static int ndisc_constructor(struct neighbour *neigh)
 static int pndisc_constructor(struct pneigh_entry *n)
 {
 	struct in6_addr *addr = (struct in6_addr *)&n->key;
-	struct in6_addr maddr;
 	struct net_device *dev = n->dev;
+	struct in6_addr maddr;
 
-	if (!dev || !__in6_dev_get(dev))
+	if (!dev)
 		return -EINVAL;
+
 	addrconf_addr_solict_mult(addr, &maddr);
 	ipv6_dev_mc_inc(dev, &maddr);
 	return 0;
@@ -390,11 +391,12 @@ static int pndisc_constructor(struct pneigh_entry *n)
 static void pndisc_destructor(struct pneigh_entry *n)
 {
 	struct in6_addr *addr = (struct in6_addr *)&n->key;
-	struct in6_addr maddr;
 	struct net_device *dev = n->dev;
+	struct in6_addr maddr;
 
-	if (!dev || !__in6_dev_get(dev))
+	if (!dev)
 		return;
+
 	addrconf_addr_solict_mult(addr, &maddr);
 	ipv6_dev_mc_dec(dev, &maddr);
 }
-- 
2.49.0



* [PATCH v2 net-next 02/15] ipv6: mcast: Replace locking comments with lockdep annotations.
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
  2025-06-24 20:24 ` [PATCH v2 net-next 01/15] ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}() Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:28   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 03/15] ipv6: mcast: Check inet6_dev->dead under idev->mc_lock in __ipv6_dev_mc_inc() Kuniyuki Iwashima
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

Commit 63ed8de4be81 ("mld: add mc_lock for protecting per-interface
mld data") added the same comments regarding locking to many functions.

Let's replace the comments with lockdep annotations, which are checked
at runtime when lockdep is enabled and so are more helpful than comments.
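
For illustration only, here is a userspace sketch of the idea, not the
kernel code.  Everything named fake_* and held_by_me is hypothetical;
a thread-local pointer stands in for lockdep's per-task bookkeeping and
a pthread mutex stands in for inet6_dev->mc_lock.  It assumes a thread
holds at most one such lock at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inet6_dev; only the lock matters. */
struct fake_idev {
	pthread_mutex_t mc_lock;
	int mc_count;
};

/* Which fake_idev lock the calling thread holds, if any. */
static _Thread_local const struct fake_idev *held_by_me;

static void fake_mc_lock(struct fake_idev *idev)
{
	pthread_mutex_lock(&idev->mc_lock);
	held_by_me = idev;
}

static void fake_mc_unlock(struct fake_idev *idev)
{
	held_by_me = NULL;
	pthread_mutex_unlock(&idev->mc_lock);
}

/* Plays the role of mc_assert_locked(): abort if the lock is not held. */
#define fake_mc_assert_locked(idev)	assert(held_by_me == (idev))

/* The locking rule is enforced in code instead of stated in a comment. */
static void fake_group_added(struct fake_idev *idev)
{
	fake_mc_assert_locked(idev);
	idev->mc_count++;
}

/* Take the lock, call the annotated helper, return the resulting count. */
static int fake_demo(void)
{
	struct fake_idev idev = { .mc_lock = PTHREAD_MUTEX_INITIALIZER };

	fake_mc_lock(&idev);
	fake_group_added(&idev);
	fake_mc_unlock(&idev);

	return idev.mc_count;
}
```

Calling fake_group_added() without fake_mc_lock() trips the assertion
at once, which a "called with mc_lock" comment can never do.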

Note that we just remove the comment for mld_clear_zeros() and
mld_send_cr(), where mc_dereference() is used at the entry of the
function.

While at it, a comment for __ipv6_sock_mc_join() is moved back to the
correct place.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/mcast.c | 125 +++++++++++++++++++++++++++--------------------
 1 file changed, 71 insertions(+), 54 deletions(-)

diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 65831b4fee1f..5cd94effbc92 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -108,9 +108,9 @@ static int __ipv6_dev_mc_inc(struct net_device *dev,
 int sysctl_mld_max_msf __read_mostly = IPV6_MLD_MAX_MSF;
 int sysctl_mld_qrv __read_mostly = MLD_QRV_DEFAULT;
 
-/*
- *	socket join on multicast group
- */
+#define mc_assert_locked(idev)			\
+	lockdep_assert_held(&(idev)->mc_lock)
+
 #define mc_dereference(e, idev) \
 	rcu_dereference_protected(e, lockdep_is_held(&(idev)->mc_lock))
 
@@ -169,6 +169,9 @@ static int unsolicited_report_interval(struct inet6_dev *idev)
 	return iv > 0 ? iv : 1;
 }
 
+/*
+ *	socket join on multicast group
+ */
 static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
 			       const struct in6_addr *addr, unsigned int mode)
 {
@@ -668,12 +671,13 @@ bool inet6_mc_check(const struct sock *sk, const struct in6_addr *mc_addr,
 	return rv;
 }
 
-/* called with mc_lock */
 static void igmp6_group_added(struct ifmcaddr6 *mc)
 {
 	struct net_device *dev = mc->idev->dev;
 	char buf[MAX_ADDR_LEN];
 
+	mc_assert_locked(mc->idev);
+
 	if (IPV6_ADDR_MC_SCOPE(&mc->mca_addr) <
 	    IPV6_ADDR_SCOPE_LINKLOCAL)
 		return;
@@ -703,12 +707,13 @@ static void igmp6_group_added(struct ifmcaddr6 *mc)
 	mld_ifc_event(mc->idev);
 }
 
-/* called with mc_lock */
 static void igmp6_group_dropped(struct ifmcaddr6 *mc)
 {
 	struct net_device *dev = mc->idev->dev;
 	char buf[MAX_ADDR_LEN];
 
+	mc_assert_locked(mc->idev);
+
 	if (IPV6_ADDR_MC_SCOPE(&mc->mca_addr) <
 	    IPV6_ADDR_SCOPE_LINKLOCAL)
 		return;
@@ -729,14 +734,13 @@ static void igmp6_group_dropped(struct ifmcaddr6 *mc)
 		refcount_dec(&mc->mca_refcnt);
 }
 
-/*
- * deleted ifmcaddr6 manipulation
- * called with mc_lock
- */
+/* deleted ifmcaddr6 manipulation */
 static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 {
 	struct ifmcaddr6 *pmc;
 
+	mc_assert_locked(idev);
+
 	/* this is an "ifmcaddr6" for convenience; only the fields below
 	 * are actually used. In particular, the refcnt and users are not
 	 * used for management of the delete list. Using the same structure
@@ -770,13 +774,14 @@ static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 	rcu_assign_pointer(idev->mc_tomb, pmc);
 }
 
-/* called with mc_lock */
 static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 {
 	struct ip6_sf_list *psf, *sources, *tomb;
 	struct in6_addr *pmca = &im->mca_addr;
 	struct ifmcaddr6 *pmc, *pmc_prev;
 
+	mc_assert_locked(idev);
+
 	pmc_prev = NULL;
 	for_each_mc_tomb(idev, pmc) {
 		if (ipv6_addr_equal(&pmc->mca_addr, pmca))
@@ -813,11 +818,12 @@ static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 	}
 }
 
-/* called with mc_lock */
 static void mld_clear_delrec(struct inet6_dev *idev)
 {
 	struct ifmcaddr6 *pmc, *nextpmc;
 
+	mc_assert_locked(idev);
+
 	pmc = mc_dereference(idev->mc_tomb, idev);
 	RCU_INIT_POINTER(idev->mc_tomb, NULL);
 
@@ -874,13 +880,14 @@ static void ma_put(struct ifmcaddr6 *mc)
 	}
 }
 
-/* called with mc_lock */
 static struct ifmcaddr6 *mca_alloc(struct inet6_dev *idev,
 				   const struct in6_addr *addr,
 				   unsigned int mode)
 {
 	struct ifmcaddr6 *mc;
 
+	mc_assert_locked(idev);
+
 	mc = kzalloc(sizeof(*mc), GFP_KERNEL);
 	if (!mc)
 		return NULL;
@@ -1091,46 +1098,51 @@ bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group,
 	return rv;
 }
 
-/* called with mc_lock */
 static void mld_gq_start_work(struct inet6_dev *idev)
 {
 	unsigned long tv = get_random_u32_below(idev->mc_maxdelay);
 
+	mc_assert_locked(idev);
+
 	idev->mc_gq_running = 1;
 	if (!mod_delayed_work(mld_wq, &idev->mc_gq_work, tv + 2))
 		in6_dev_hold(idev);
 }
 
-/* called with mc_lock */
 static void mld_gq_stop_work(struct inet6_dev *idev)
 {
+	mc_assert_locked(idev);
+
 	idev->mc_gq_running = 0;
 	if (cancel_delayed_work(&idev->mc_gq_work))
 		__in6_dev_put(idev);
 }
 
-/* called with mc_lock */
 static void mld_ifc_start_work(struct inet6_dev *idev, unsigned long delay)
 {
 	unsigned long tv = get_random_u32_below(delay);
 
+	mc_assert_locked(idev);
+
 	if (!mod_delayed_work(mld_wq, &idev->mc_ifc_work, tv + 2))
 		in6_dev_hold(idev);
 }
 
-/* called with mc_lock */
 static void mld_ifc_stop_work(struct inet6_dev *idev)
 {
+	mc_assert_locked(idev);
+
 	idev->mc_ifc_count = 0;
 	if (cancel_delayed_work(&idev->mc_ifc_work))
 		__in6_dev_put(idev);
 }
 
-/* called with mc_lock */
 static void mld_dad_start_work(struct inet6_dev *idev, unsigned long delay)
 {
 	unsigned long tv = get_random_u32_below(delay);
 
+	mc_assert_locked(idev);
+
 	if (!mod_delayed_work(mld_wq, &idev->mc_dad_work, tv + 2))
 		in6_dev_hold(idev);
 }
@@ -1155,14 +1167,13 @@ static void mld_report_stop_work(struct inet6_dev *idev)
 		__in6_dev_put(idev);
 }
 
-/*
- * IGMP handling (alias multicast ICMPv6 messages)
- * called with mc_lock
- */
+/* IGMP handling (alias multicast ICMPv6 messages) */
 static void igmp6_group_queried(struct ifmcaddr6 *ma, unsigned long resptime)
 {
 	unsigned long delay = resptime;
 
+	mc_assert_locked(ma->idev);
+
 	/* Do not start work for these addresses */
 	if (ipv6_addr_is_ll_all_nodes(&ma->mca_addr) ||
 	    IPV6_ADDR_MC_SCOPE(&ma->mca_addr) < IPV6_ADDR_SCOPE_LINKLOCAL)
@@ -1181,15 +1192,15 @@ static void igmp6_group_queried(struct ifmcaddr6 *ma, unsigned long resptime)
 	ma->mca_flags |= MAF_TIMER_RUNNING;
 }
 
-/* mark EXCLUDE-mode sources
- * called with mc_lock
- */
+/* mark EXCLUDE-mode sources */
 static bool mld_xmarksources(struct ifmcaddr6 *pmc, int nsrcs,
 			     const struct in6_addr *srcs)
 {
 	struct ip6_sf_list *psf;
 	int i, scount;
 
+	mc_assert_locked(pmc->idev);
+
 	scount = 0;
 	for_each_psf_mclock(pmc, psf) {
 		if (scount == nsrcs)
@@ -1212,13 +1223,14 @@ static bool mld_xmarksources(struct ifmcaddr6 *pmc, int nsrcs,
 	return true;
 }
 
-/* called with mc_lock */
 static bool mld_marksources(struct ifmcaddr6 *pmc, int nsrcs,
 			    const struct in6_addr *srcs)
 {
 	struct ip6_sf_list *psf;
 	int i, scount;
 
+	mc_assert_locked(pmc->idev);
+
 	if (pmc->mca_sfmode == MCAST_EXCLUDE)
 		return mld_xmarksources(pmc, nsrcs, srcs);
 
@@ -1913,7 +1925,6 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 
 #define AVAILABLE(skb)	((skb) ? skb_availroom(skb) : 0)
 
-/* called with mc_lock */
 static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 				int type, int gdeleted, int sdeleted,
 				int crsend)
@@ -1927,6 +1938,8 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 	struct mld2_report *pmr;
 	unsigned int mtu;
 
+	mc_assert_locked(idev);
+
 	if (pmc->mca_flags & MAF_NOREPORT)
 		return skb;
 
@@ -2045,12 +2058,13 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 	return skb;
 }
 
-/* called with mc_lock */
 static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
 {
 	struct sk_buff *skb = NULL;
 	int type;
 
+	mc_assert_locked(idev);
+
 	if (!pmc) {
 		for_each_mc_mclock(idev, pmc) {
 			if (pmc->mca_flags & MAF_NOREPORT)
@@ -2072,10 +2086,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
 		mld_sendpack(skb);
 }
 
-/*
- * remove zero-count source records from a source filter list
- * called with mc_lock
- */
+/* remove zero-count source records from a source filter list */
 static void mld_clear_zeros(struct ip6_sf_list __rcu **ppsf, struct inet6_dev *idev)
 {
 	struct ip6_sf_list *psf_prev, *psf_next, *psf;
@@ -2099,7 +2110,6 @@ static void mld_clear_zeros(struct ip6_sf_list __rcu **ppsf, struct inet6_dev *i
 	}
 }
 
-/* called with mc_lock */
 static void mld_send_cr(struct inet6_dev *idev)
 {
 	struct ifmcaddr6 *pmc, *pmc_prev, *pmc_next;
@@ -2263,13 +2273,14 @@ static void igmp6_send(struct in6_addr *addr, struct net_device *dev, int type)
 	goto out;
 }
 
-/* called with mc_lock */
 static void mld_send_initial_cr(struct inet6_dev *idev)
 {
-	struct sk_buff *skb;
 	struct ifmcaddr6 *pmc;
+	struct sk_buff *skb;
 	int type;
 
+	mc_assert_locked(idev);
+
 	if (mld_in_v1_mode(idev))
 		return;
 
@@ -2316,13 +2327,14 @@ static void mld_dad_work(struct work_struct *work)
 	in6_dev_put(idev);
 }
 
-/* called with mc_lock */
 static int ip6_mc_del1_src(struct ifmcaddr6 *pmc, int sfmode,
-	const struct in6_addr *psfsrc)
+			   const struct in6_addr *psfsrc)
 {
 	struct ip6_sf_list *psf, *psf_prev;
 	int rv = 0;
 
+	mc_assert_locked(pmc->idev);
+
 	psf_prev = NULL;
 	for_each_psf_mclock(pmc, psf) {
 		if (ipv6_addr_equal(&psf->sf_addr, psfsrc))
@@ -2359,7 +2371,6 @@ static int ip6_mc_del1_src(struct ifmcaddr6 *pmc, int sfmode,
 	return rv;
 }
 
-/* called with mc_lock */
 static int ip6_mc_del_src(struct inet6_dev *idev, const struct in6_addr *pmca,
 			  int sfmode, int sfcount, const struct in6_addr *psfsrc,
 			  int delta)
@@ -2371,6 +2382,8 @@ static int ip6_mc_del_src(struct inet6_dev *idev, const struct in6_addr *pmca,
 	if (!idev)
 		return -ENODEV;
 
+	mc_assert_locked(idev);
+
 	for_each_mc_mclock(idev, pmc) {
 		if (ipv6_addr_equal(pmca, &pmc->mca_addr))
 			break;
@@ -2412,15 +2425,14 @@ static int ip6_mc_del_src(struct inet6_dev *idev, const struct in6_addr *pmca,
 	return err;
 }
 
-/*
- * Add multicast single-source filter to the interface list
- * called with mc_lock
- */
+/* Add multicast single-source filter to the interface list */
 static int ip6_mc_add1_src(struct ifmcaddr6 *pmc, int sfmode,
-	const struct in6_addr *psfsrc)
+			   const struct in6_addr *psfsrc)
 {
 	struct ip6_sf_list *psf, *psf_prev;
 
+	mc_assert_locked(pmc->idev);
+
 	psf_prev = NULL;
 	for_each_psf_mclock(pmc, psf) {
 		if (ipv6_addr_equal(&psf->sf_addr, psfsrc))
@@ -2443,11 +2455,12 @@ static int ip6_mc_add1_src(struct ifmcaddr6 *pmc, int sfmode,
 	return 0;
 }
 
-/* called with mc_lock */
 static void sf_markstate(struct ifmcaddr6 *pmc)
 {
-	struct ip6_sf_list *psf;
 	int mca_xcount = pmc->mca_sfcount[MCAST_EXCLUDE];
+	struct ip6_sf_list *psf;
+
+	mc_assert_locked(pmc->idev);
 
 	for_each_psf_mclock(pmc, psf) {
 		if (pmc->mca_sfcount[MCAST_EXCLUDE]) {
@@ -2460,14 +2473,15 @@ static void sf_markstate(struct ifmcaddr6 *pmc)
 	}
 }
 
-/* called with mc_lock */
 static int sf_setstate(struct ifmcaddr6 *pmc)
 {
-	struct ip6_sf_list *psf, *dpsf;
 	int mca_xcount = pmc->mca_sfcount[MCAST_EXCLUDE];
+	struct ip6_sf_list *psf, *dpsf;
 	int qrv = pmc->idev->mc_qrv;
 	int new_in, rv;
 
+	mc_assert_locked(pmc->idev);
+
 	rv = 0;
 	for_each_psf_mclock(pmc, psf) {
 		if (pmc->mca_sfcount[MCAST_EXCLUDE]) {
@@ -2526,10 +2540,7 @@ static int sf_setstate(struct ifmcaddr6 *pmc)
 	return rv;
 }
 
-/*
- * Add multicast source filter list to the interface list
- * called with mc_lock
- */
+/* Add multicast source filter list to the interface list */
 static int ip6_mc_add_src(struct inet6_dev *idev, const struct in6_addr *pmca,
 			  int sfmode, int sfcount, const struct in6_addr *psfsrc,
 			  int delta)
@@ -2541,6 +2552,8 @@ static int ip6_mc_add_src(struct inet6_dev *idev, const struct in6_addr *pmca,
 	if (!idev)
 		return -ENODEV;
 
+	mc_assert_locked(idev);
+
 	for_each_mc_mclock(idev, pmc) {
 		if (ipv6_addr_equal(pmca, &pmc->mca_addr))
 			break;
@@ -2588,11 +2601,12 @@ static int ip6_mc_add_src(struct inet6_dev *idev, const struct in6_addr *pmca,
 	return err;
 }
 
-/* called with mc_lock */
 static void ip6_mc_clear_src(struct ifmcaddr6 *pmc)
 {
 	struct ip6_sf_list *psf, *nextpsf;
 
+	mc_assert_locked(pmc->idev);
+
 	for (psf = mc_dereference(pmc->mca_tomb, pmc->idev);
 	     psf;
 	     psf = nextpsf) {
@@ -2613,11 +2627,12 @@ static void ip6_mc_clear_src(struct ifmcaddr6 *pmc)
 	WRITE_ONCE(pmc->mca_sfcount[MCAST_EXCLUDE], 1);
 }
 
-/* called with mc_lock */
 static void igmp6_join_group(struct ifmcaddr6 *ma)
 {
 	unsigned long delay;
 
+	mc_assert_locked(ma->idev);
+
 	if (ma->mca_flags & MAF_NOREPORT)
 		return;
 
@@ -2664,9 +2679,10 @@ static int ip6_mc_leave_src(struct sock *sk, struct ipv6_mc_socklist *iml,
 	return err;
 }
 
-/* called with mc_lock */
 static void igmp6_leave_group(struct ifmcaddr6 *ma)
 {
+	mc_assert_locked(ma->idev);
+
 	if (mld_in_v1_mode(ma->idev)) {
 		if (ma->mca_flags & MAF_LAST_REPORTER) {
 			igmp6_send(&ma->mca_addr, ma->idev->dev,
@@ -2711,9 +2727,10 @@ static void mld_ifc_work(struct work_struct *work)
 	in6_dev_put(idev);
 }
 
-/* called with mc_lock */
 static void mld_ifc_event(struct inet6_dev *idev)
 {
+	mc_assert_locked(idev);
+
 	if (mld_in_v1_mode(idev))
 		return;
 
-- 
2.49.0



* [PATCH v2 net-next 03/15] ipv6: mcast: Check inet6_dev->dead under idev->mc_lock in __ipv6_dev_mc_inc().
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
  2025-06-24 20:24 ` [PATCH v2 net-next 01/15] ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}() Kuniyuki Iwashima
  2025-06-24 20:24 ` [PATCH v2 net-next 02/15] ipv6: mcast: Replace locking comments with lockdep annotations Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:30   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 04/15] ipv6: mcast: Remove mca_get() Kuniyuki Iwashima
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

Since commit 63ed8de4be81 ("mld: add mc_lock for protecting
per-interface mld data"), every multicast resource is protected
by inet6_dev->mc_lock.

RTNL is unnecessary in terms of protection but still needed for
synchronisation between addrconf_ifdown() and __ipv6_dev_mc_inc().

Once RTNL is removed, the race below becomes possible, where we could
add a multicast address to a dead inet6_dev.

  CPU1                            CPU2
  ====                            ====
  addrconf_ifdown()               __ipv6_dev_mc_inc()
                                    if (idev->dead) <-- false
    dead = true                       return -ENODEV;
    ipv6_mc_destroy_dev() / ipv6_mc_down()
      mutex_lock(&idev->mc_lock)
      ...
      mutex_unlock(&idev->mc_lock)
                                    mutex_lock(&idev->mc_lock)
                                    ...
                                    mutex_unlock(&idev->mc_lock)

The race window can be easily closed by checking inet6_dev->dead
under inet6_dev->mc_lock in __ipv6_dev_mc_inc() as addrconf_ifdown()
will acquire it after marking inet6_dev dead.

Let's check inet6_dev->dead under mc_lock in __ipv6_dev_mc_inc().
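
For illustration only, the ordering argument can be modelled in
userspace as below.  This is a sketch, not the kernel code: all sketch_*
names are hypothetical, a pthread mutex stands in for mc_lock, and a
C11 atomic stands in for READ_ONCE()/WRITE_ONCE() on ->dead.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct inet6_dev. */
struct sketch_idev {
	pthread_mutex_t mc_lock;
	atomic_int dead;
	int mc_count;
};

/* Teardown: mark the device dead first, then take the lock to purge. */
static void sketch_ifdown(struct sketch_idev *idev)
{
	atomic_store_explicit(&idev->dead, 1, memory_order_relaxed);

	pthread_mutex_lock(&idev->mc_lock);
	idev->mc_count = 0;
	pthread_mutex_unlock(&idev->mc_lock);
}

/*
 * Join: test ->dead only after taking the lock.  Either we run before
 * the purge and our entry is purged with the rest, or we run after it
 * and the lock hand-over guarantees that we observe dead == 1.
 */
static int sketch_mc_inc(struct sketch_idev *idev)
{
	pthread_mutex_lock(&idev->mc_lock);

	if (atomic_load_explicit(&idev->dead, memory_order_relaxed)) {
		pthread_mutex_unlock(&idev->mc_lock);
		return -ENODEV;
	}

	idev->mc_count++;
	pthread_mutex_unlock(&idev->mc_lock);
	return 0;
}

/* Returns 1 if a join succeeds before teardown and fails after it. */
static int sketch_demo(void)
{
	struct sketch_idev idev = { .mc_lock = PTHREAD_MUTEX_INITIALIZER };
	int before, after;

	before = sketch_mc_inc(&idev);
	sketch_ifdown(&idev);
	after = sketch_mc_inc(&idev);

	return before == 0 && after == -ENODEV && idev.mc_count == 0;
}
```

With the check placed before the lock instead, a joiner could pass the
check, lose the CPU across the whole teardown, and then add an entry
that nobody cleans up.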

Note that now __ipv6_dev_mc_inc() no longer depends on RTNL and
we can remove ASSERT_RTNL() there and the RTNL comment above
addrconf_join_solict().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/addrconf.c |  7 +++----
 net/ipv6/mcast.c    | 11 +++++------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 9c297974d3a6..dcc07767e51f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2229,13 +2229,12 @@ void addrconf_dad_failure(struct sk_buff *skb, struct inet6_ifaddr *ifp)
 	in6_ifa_put(ifp);
 }
 
-/* Join to solicited addr multicast group.
- * caller must hold RTNL */
+/* Join to solicited addr multicast group. */
 void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr)
 {
 	struct in6_addr maddr;
 
-	if (dev->flags&(IFF_LOOPBACK|IFF_NOARP))
+	if (READ_ONCE(dev->flags) & (IFF_LOOPBACK | IFF_NOARP))
 		return;
 
 	addrconf_addr_solict_mult(addr, &maddr);
@@ -3865,7 +3864,7 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
 	 *	   Do not dev_put!
 	 */
 	if (unregister) {
-		idev->dead = 1;
+		WRITE_ONCE(idev->dead, 1);
 
 		/* protected by rtnl_lock */
 		RCU_INIT_POINTER(dev->ip6_ptr, NULL);
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 5cd94effbc92..15a37352124d 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -952,23 +952,22 @@ static void inet6_ifmcaddr_notify(struct net_device *dev,
 static int __ipv6_dev_mc_inc(struct net_device *dev,
 			     const struct in6_addr *addr, unsigned int mode)
 {
-	struct ifmcaddr6 *mc;
 	struct inet6_dev *idev;
-
-	ASSERT_RTNL();
+	struct ifmcaddr6 *mc;
 
 	/* we need to take a reference on idev */
 	idev = in6_dev_get(dev);
-
 	if (!idev)
 		return -EINVAL;
 
-	if (idev->dead) {
+	mutex_lock(&idev->mc_lock);
+
+	if (READ_ONCE(idev->dead)) {
+		mutex_unlock(&idev->mc_lock);
 		in6_dev_put(idev);
 		return -ENODEV;
 	}
 
-	mutex_lock(&idev->mc_lock);
 	for_each_mc_mclock(idev, mc) {
 		if (ipv6_addr_equal(&mc->mca_addr, addr)) {
 			mc->mca_users++;
-- 
2.49.0



* [PATCH v2 net-next 04/15] ipv6: mcast: Remove mca_get().
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (2 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 03/15] ipv6: mcast: Check inet6_dev->dead under idev->mc_lock in __ipv6_dev_mc_inc() Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:31   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 05/15] ipv6: mcast: Use in6_dev_get() in ipv6_dev_mc_dec() Kuniyuki Iwashima
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

Since commit 63ed8de4be81 ("mld: add mc_lock for protecting per-interface
mld data"), the newly allocated struct ifmcaddr6 cannot be removed until
inet6_dev->mc_lock is released, so mca_get() and ma_put() are unnecessary.

Let's remove the extra refcounting.

Note that mca_get() was only used in __ipv6_dev_mc_inc().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/mcast.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 15a37352124d..aa1280df4c1f 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -867,11 +867,6 @@ static void mld_clear_report(struct inet6_dev *idev)
 	spin_unlock_bh(&idev->mc_report_lock);
 }
 
-static void mca_get(struct ifmcaddr6 *mc)
-{
-	refcount_inc(&mc->mca_refcnt);
-}
-
 static void ma_put(struct ifmcaddr6 *mc)
 {
 	if (refcount_dec_and_test(&mc->mca_refcnt)) {
@@ -988,13 +983,11 @@ static int __ipv6_dev_mc_inc(struct net_device *dev,
 	rcu_assign_pointer(mc->next, idev->mc_list);
 	rcu_assign_pointer(idev->mc_list, mc);
 
-	mca_get(mc);
-
 	mld_del_delrec(idev, mc);
 	igmp6_group_added(mc);
 	inet6_ifmcaddr_notify(dev, mc, RTM_NEWMULTICAST);
 	mutex_unlock(&idev->mc_lock);
-	ma_put(mc);
+
 	return 0;
 }
 
-- 
2.49.0



* [PATCH v2 net-next 05/15] ipv6: mcast: Use in6_dev_get() in ipv6_dev_mc_dec().
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (3 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 04/15] ipv6: mcast: Remove mca_get() Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:33   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP Kuniyuki Iwashima
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

As with __ipv6_dev_mc_inc(), all code in __ipv6_dev_mc_dec() is
protected by inet6_dev->mc_lock, and RTNL is not needed.

Let's use in6_dev_get() in ipv6_dev_mc_dec() and remove ASSERT_RTNL()
in __ipv6_dev_mc_dec().

Now, we can remove the RTNL comment above addrconf_leave_solict() too.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/addrconf.c |  3 +--
 net/ipv6/mcast.c    | 14 ++++++--------
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index dcc07767e51f..8451014457dd 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2241,12 +2241,11 @@ void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr)
 	ipv6_dev_mc_inc(dev, &maddr);
 }
 
-/* caller must hold RTNL */
 void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr *addr)
 {
 	struct in6_addr maddr;
 
-	if (idev->dev->flags&(IFF_LOOPBACK|IFF_NOARP))
+	if (READ_ONCE(idev->dev->flags) & (IFF_LOOPBACK | IFF_NOARP))
 		return;
 
 	addrconf_addr_solict_mult(addr, &maddr);
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index aa1280df4c1f..b3f063b5ffd7 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1004,9 +1004,8 @@ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr)
 {
 	struct ifmcaddr6 *ma, __rcu **map;
 
-	ASSERT_RTNL();
-
 	mutex_lock(&idev->mc_lock);
+
 	for (map = &idev->mc_list;
 	     (ma = mc_dereference(*map, idev));
 	     map = &ma->next) {
@@ -1037,13 +1036,12 @@ int ipv6_dev_mc_dec(struct net_device *dev, const struct in6_addr *addr)
 	struct inet6_dev *idev;
 	int err;
 
-	ASSERT_RTNL();
-
-	idev = __in6_dev_get(dev);
+	idev = in6_dev_get(dev);
 	if (!idev)
-		err = -ENODEV;
-	else
-		err = __ipv6_dev_mc_dec(idev, addr);
+		return -ENODEV;
+
+	err = __ipv6_dev_mc_dec(idev, addr);
+	in6_dev_put(idev);
 
 	return err;
 }
-- 
2.49.0



* [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (4 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 05/15] ipv6: mcast: Use in6_dev_get() in ipv6_dev_mc_dec() Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:36   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 07/15] ipv6: mcast: Don't hold RTNL for IPV6_DROP_MEMBERSHIP and MCAST_LEAVE_GROUP Kuniyuki Iwashima
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

In __ipv6_sock_mc_join(), per-socket mld data is protected by lock_sock(),
and only __dev_get_by_index() requires RTNL.

Let's use dev_get_by_index() and drop RTNL for IPV6_ADD_MEMBERSHIP and
MCAST_JOIN_GROUP.

Note that we must call rt6_lookup() and dev_hold() under RCU.

If rt6_lookup() returns an entry from the exception table, dst_dev_put()
could change rt->dst.dev to loopback concurrently, and the original device
could lose the refcount before dev_hold() and unblock device unregistration.

dst_dev_put() is called from NETDEV_UNREGISTER and synchronize_net() follows
it, so as long as rt6_lookup() and dev_hold() are called within the same
RCU critical section, the dev is alive.

Even if the race happens, they are synchronised by idev->dead and mcast
addresses are cleaned up.
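
For illustration only, the "lookup and hold in one read-side section"
rule can be modelled in userspace as below.  This is a sketch under
loud assumptions, not the kernel code: all toy_* names are hypothetical,
and a pthread rwlock stands in for RCU (shared lock for
rcu_read_lock(), a write lock/unlock pair for synchronize_net()).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net_device and struct rt6_info. */
struct toy_dev {
	atomic_int refcnt;
};

struct toy_route {
	struct toy_dev *_Atomic dev;	/* plays the role of rt->dst.dev */
};

/* Stand-in for RCU: readers share it, the "grace period" excludes them. */
static pthread_rwlock_t toy_rcu = PTHREAD_RWLOCK_INITIALIZER;

/* Look up the device and pin it before leaving the read-side section. */
static struct toy_dev *toy_lookup_and_hold(struct toy_route *rt)
{
	struct toy_dev *dev;

	pthread_rwlock_rdlock(&toy_rcu);		/* rcu_read_lock() */
	dev = atomic_load(&rt->dev);
	if (dev)
		atomic_fetch_add(&dev->refcnt, 1);	/* dev_hold() */
	pthread_rwlock_unlock(&toy_rcu);		/* rcu_read_unlock() */

	return dev;
}

/* Unregister side: repoint the route, drop its ref, wait for readers. */
static void toy_unregister(struct toy_route *rt, struct toy_dev *loopback)
{
	struct toy_dev *old = atomic_exchange(&rt->dev, loopback);

	atomic_fetch_add(&loopback->refcnt, 1);
	atomic_fetch_sub(&old->refcnt, 1);

	pthread_rwlock_wrlock(&toy_rcu);		/* synchronize_net() */
	pthread_rwlock_unlock(&toy_rcu);
	/* Only now may the caller test old->refcnt and free the device. */
}

/* Returns 1 if the reader's reference keeps the device pinned. */
static int toy_demo(void)
{
	struct toy_dev eth, lo, *held;
	struct toy_route rt;
	int pinned;

	atomic_init(&eth.refcnt, 1);	/* the route's own reference */
	atomic_init(&lo.refcnt, 0);
	atomic_init(&rt.dev, &eth);

	held = toy_lookup_and_hold(&rt);
	toy_unregister(&rt, &lo);
	pinned = atomic_load(&eth.refcnt);	/* reader's ref remains */

	atomic_fetch_sub(&held->refcnt, 1);	/* dev_put() */

	return held == &eth && pinned == 1 && atomic_load(&eth.refcnt) == 0;
}
```

If the reader took its reference after leaving the read-side section,
the unregister side could finish its wait in between and see a zero
refcount while the reader still held a bare pointer.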

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v2: Hold rcu_read_lock() around rt6_lookup & dev_hold()
---
 net/ipv6/ipv6_sockglue.c |  2 --
 net/ipv6/mcast.c         | 22 ++++++++++++----------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 1e225e6489ea..cb0dc885cbe4 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -121,11 +121,9 @@ static bool setsockopt_needs_rtnl(int optname)
 {
 	switch (optname) {
 	case IPV6_ADDRFORM:
-	case IPV6_ADD_MEMBERSHIP:
 	case IPV6_DROP_MEMBERSHIP:
 	case IPV6_JOIN_ANYCAST:
 	case IPV6_LEAVE_ANYCAST:
-	case MCAST_JOIN_GROUP:
 	case MCAST_LEAVE_GROUP:
 	case MCAST_JOIN_SOURCE_GROUP:
 	case MCAST_LEAVE_SOURCE_GROUP:
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index b3f063b5ffd7..9fc7672926bf 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -175,14 +175,12 @@ static int unsolicited_report_interval(struct inet6_dev *idev)
 static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
 			       const struct in6_addr *addr, unsigned int mode)
 {
-	struct net_device *dev = NULL;
-	struct ipv6_mc_socklist *mc_lst;
 	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct ipv6_mc_socklist *mc_lst;
 	struct net *net = sock_net(sk);
+	struct net_device *dev = NULL;
 	int err;
 
-	ASSERT_RTNL();
-
 	if (!ipv6_addr_is_multicast(addr))
 		return -EINVAL;
 
@@ -202,13 +200,18 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
 
 	if (ifindex == 0) {
 		struct rt6_info *rt;
+
+		rcu_read_lock();
 		rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
 		if (rt) {
 			dev = rt->dst.dev;
+			dev_hold(dev);
 			ip6_rt_put(rt);
 		}
-	} else
-		dev = __dev_get_by_index(net, ifindex);
+		rcu_read_unlock();
+	} else {
+		dev = dev_get_by_index(net, ifindex);
+	}
 
 	if (!dev) {
 		sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
@@ -219,12 +222,11 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
 	mc_lst->sfmode = mode;
 	RCU_INIT_POINTER(mc_lst->sflist, NULL);
 
-	/*
-	 *	now add/increase the group membership on the device
-	 */
-
+	/* now add/increase the group membership on the device */
 	err = __ipv6_dev_mc_inc(dev, addr, mode);
 
+	dev_put(dev);
+
 	if (err) {
 		sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
 		return err;
-- 
2.49.0



* [PATCH v2 net-next 07/15] ipv6: mcast: Don't hold RTNL for IPV6_DROP_MEMBERSHIP and MCAST_LEAVE_GROUP.
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (5 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:38   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 08/15] ipv6: mcast: Don't hold RTNL in ipv6_sock_mc_close() Kuniyuki Iwashima
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

In __ipv6_sock_mc_drop(), per-socket mld data is protected by lock_sock(),
and only __dev_get_by_index() and __in6_dev_get() require RTNL.

Let's use dev_get_by_index() and in6_dev_get() and drop RTNL for
IPV6_DROP_MEMBERSHIP and MCAST_LEAVE_GROUP.

Note that __ipv6_sock_mc_drop() is factored out so that the next patch
can reuse it.
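
For readers less familiar with the refcounted lookup helpers, the
get/use/put shape that replaces the RTNL-protected lookup can be sketched
in userspace C. This is only an illustration: every toy_* name is
hypothetical, and the plain int counter stands in for the kernel's
per-device reference tracking.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; not a kernel type. */
struct toy_dev {
	int ifindex;
	int refcnt;
};

static struct toy_dev toy_devs[] = {
	{ .ifindex = 1, .refcnt = 1 },
	{ .ifindex = 2, .refcnt = 1 },
};

/* Lookup that takes a reference, in the spirit of dev_get_by_index(). */
static struct toy_dev *toy_dev_get_by_index(int ifindex)
{
	size_t i;

	for (i = 0; i < sizeof(toy_devs) / sizeof(toy_devs[0]); i++) {
		if (toy_devs[i].ifindex == ifindex) {
			toy_devs[i].refcnt++;
			return &toy_devs[i];
		}
	}
	return NULL;
}

/* Drop a reference; NULL is accepted so callers need no extra branch. */
static void toy_dev_put(struct toy_dev *dev)
{
	if (dev)
		dev->refcnt--;
}

/* Same shape as the leave path: look up, use while pinned, then put. */
static int toy_leave(int ifindex)
{
	struct toy_dev *dev = toy_dev_get_by_index(ifindex);

	if (!dev)
		return -1;	/* device is gone; only per-socket cleanup remains */

	/* ... per-device work would happen here, with dev pinned ... */

	toy_dev_put(dev);
	return 0;
}
```

The point of the sketch is that the reference, not an outer lock, keeps
the object valid between the lookup and the put.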

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/ipv6_sockglue.c |  2 --
 net/ipv6/mcast.c         | 47 +++++++++++++++++++++++-----------------
 2 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index cb0dc885cbe4..c8892d54821f 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -121,10 +121,8 @@ static bool setsockopt_needs_rtnl(int optname)
 {
 	switch (optname) {
 	case IPV6_ADDRFORM:
-	case IPV6_DROP_MEMBERSHIP:
 	case IPV6_JOIN_ANYCAST:
 	case IPV6_LEAVE_ANYCAST:
-	case MCAST_LEAVE_GROUP:
 	case MCAST_JOIN_SOURCE_GROUP:
 	case MCAST_LEAVE_SOURCE_GROUP:
 	case MCAST_BLOCK_SOURCE:
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 9fc7672926bf..afa3ff092702 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -253,14 +253,36 @@ int ipv6_sock_mc_join_ssm(struct sock *sk, int ifindex,
 /*
  *	socket leave on multicast group
  */
+static void __ipv6_sock_mc_drop(struct sock *sk, struct ipv6_mc_socklist *mc_lst)
+{
+	struct net *net = sock_net(sk);
+	struct net_device *dev;
+
+	dev = dev_get_by_index(net, mc_lst->ifindex);
+	if (dev) {
+		struct inet6_dev *idev = in6_dev_get(dev);
+
+		ip6_mc_leave_src(sk, mc_lst, idev);
+
+		if (idev) {
+			__ipv6_dev_mc_dec(idev, &mc_lst->addr);
+			in6_dev_put(idev);
+		}
+
+		dev_put(dev);
+	} else {
+		ip6_mc_leave_src(sk, mc_lst, NULL);
+	}
+
+	atomic_sub(sizeof(*mc_lst), &sk->sk_omem_alloc);
+	kfree_rcu(mc_lst, rcu);
+}
+
 int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
-	struct ipv6_mc_socklist *mc_lst;
 	struct ipv6_mc_socklist __rcu **lnk;
-	struct net *net = sock_net(sk);
-
-	ASSERT_RTNL();
+	struct ipv6_mc_socklist *mc_lst;
 
 	if (!ipv6_addr_is_multicast(addr))
 		return -EINVAL;
@@ -270,23 +292,8 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr)
 	      lnk = &mc_lst->next) {
 		if ((ifindex == 0 || mc_lst->ifindex == ifindex) &&
 		    ipv6_addr_equal(&mc_lst->addr, addr)) {
-			struct net_device *dev;
-
 			*lnk = mc_lst->next;
-
-			dev = __dev_get_by_index(net, mc_lst->ifindex);
-			if (dev) {
-				struct inet6_dev *idev = __in6_dev_get(dev);
-
-				ip6_mc_leave_src(sk, mc_lst, idev);
-				if (idev)
-					__ipv6_dev_mc_dec(idev, &mc_lst->addr);
-			} else {
-				ip6_mc_leave_src(sk, mc_lst, NULL);
-			}
-
-			atomic_sub(sizeof(*mc_lst), &sk->sk_omem_alloc);
-			kfree_rcu(mc_lst, rcu);
+			__ipv6_sock_mc_drop(sk, mc_lst);
 			return 0;
 		}
 	}
-- 
2.49.0



* [PATCH v2 net-next 08/15] ipv6: mcast: Don't hold RTNL in ipv6_sock_mc_close().
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (6 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 07/15] ipv6: mcast: Don't hold RTNL for IPV6_DROP_MEMBERSHIP and MCAST_LEAVE_GROUP Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:39   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 09/15] ipv6: mcast: Don't hold RTNL for MCAST_ socket options Kuniyuki Iwashima
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

In __ipv6_sock_mc_close(), per-socket mld data is protected by lock_sock(),
and only __dev_get_by_index() and __in6_dev_get() require RTNL.

Let's call __ipv6_sock_mc_drop() and drop RTNL in ipv6_sock_mc_close().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/mcast.c | 22 +---------------------
 1 file changed, 1 insertion(+), 21 deletions(-)

diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index afa3ff092702..e47d3fd7f789 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -334,28 +334,10 @@ void __ipv6_sock_mc_close(struct sock *sk)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct ipv6_mc_socklist *mc_lst;
-	struct net *net = sock_net(sk);
-
-	ASSERT_RTNL();
 
 	while ((mc_lst = sock_dereference(np->ipv6_mc_list, sk)) != NULL) {
-		struct net_device *dev;
-
 		np->ipv6_mc_list = mc_lst->next;
-
-		dev = __dev_get_by_index(net, mc_lst->ifindex);
-		if (dev) {
-			struct inet6_dev *idev = __in6_dev_get(dev);
-
-			ip6_mc_leave_src(sk, mc_lst, idev);
-			if (idev)
-				__ipv6_dev_mc_dec(idev, &mc_lst->addr);
-		} else {
-			ip6_mc_leave_src(sk, mc_lst, NULL);
-		}
-
-		atomic_sub(sizeof(*mc_lst), &sk->sk_omem_alloc);
-		kfree_rcu(mc_lst, rcu);
+		__ipv6_sock_mc_drop(sk, mc_lst);
 	}
 }
 
@@ -366,11 +348,9 @@ void ipv6_sock_mc_close(struct sock *sk)
 	if (!rcu_access_pointer(np->ipv6_mc_list))
 		return;
 
-	rtnl_lock();
 	lock_sock(sk);
 	__ipv6_sock_mc_close(sk);
 	release_sock(sk);
-	rtnl_unlock();
 }
 
 int ip6_mc_source(int add, int omode, struct sock *sk,
-- 
2.49.0



* [PATCH v2 net-next 09/15] ipv6: mcast: Don't hold RTNL for MCAST_ socket options.
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (7 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 08/15] ipv6: mcast: Don't hold RTNL in ipv6_sock_mc_close() Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:44   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 10/15] ipv6: mcast: Remove unnecessary ASSERT_RTNL and comment Kuniyuki Iwashima
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

In ip6_mc_source() and ip6_mc_msfilter(), per-socket mld data is
protected by lock_sock() and inet6_dev->mc_lock is also held for
some per-interface functions.

ip6_mc_find_dev_rtnl() only depends on RTNL.  If we want to remove
it, we need to check inet6_dev->dead under mc_lock to close the race
with addrconf_ifdown(), as mentioned earlier.

Let's do that and drop RTNL for the rest of MCAST_ socket options.

Note that ip6_mc_msfilter() takes and releases mc_lock several times
unnecessarily; these are merged into a single critical section to avoid
a last-minute error and to simplify the error handling.
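
The "check ->dead under the lock, leave through one label" shape can be
sketched in userspace C with a pthread mutex. This is a simplified model,
not the kernel code: the toy_* names are hypothetical, and the toy
teardown sets the flag while holding the same mutex, which is an
assumption made for brevity rather than a statement about
addrconf_ifdown().

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inet6_dev; not a kernel type. */
struct toy_idev {
	pthread_mutex_t mc_lock;
	bool dead;
	int nr_sources;
};

/* Teardown side: mark the device dead and flush state under the lock. */
static void toy_ifdown(struct toy_idev *idev)
{
	pthread_mutex_lock(&idev->mc_lock);
	idev->dead = true;
	idev->nr_sources = 0;
	pthread_mutex_unlock(&idev->mc_lock);
}

/* Socket-option side: re-check ->dead after taking the lock and leave
 * through a single unlock label.
 */
static int toy_add_source(struct toy_idev *idev)
{
	int err;

	pthread_mutex_lock(&idev->mc_lock);

	if (idev->dead) {
		err = -ENODEV;
		goto done;
	}

	idev->nr_sources++;
	err = 0;
done:
	pthread_mutex_unlock(&idev->mc_lock);
	return err;
}
```

Either the add runs first and teardown flushes what it added, or teardown
runs first and the add sees the flag and backs out.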

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v2: Hold rcu_read_lock() around rt6_lookup & dev_hold()
---
 net/ipv6/ipv6_sockglue.c |  5 ---
 net/ipv6/mcast.c         | 72 ++++++++++++++++++++++++----------------
 2 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index c8892d54821f..0c870713b08c 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -123,11 +123,6 @@ static bool setsockopt_needs_rtnl(int optname)
 	case IPV6_ADDRFORM:
 	case IPV6_JOIN_ANYCAST:
 	case IPV6_LEAVE_ANYCAST:
-	case MCAST_JOIN_SOURCE_GROUP:
-	case MCAST_LEAVE_SOURCE_GROUP:
-	case MCAST_BLOCK_SOURCE:
-	case MCAST_UNBLOCK_SOURCE:
-	case MCAST_MSFILTER:
 		return true;
 	}
 	return false;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index e47d3fd7f789..af322a455346 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -302,31 +302,36 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr)
 }
 EXPORT_SYMBOL(ipv6_sock_mc_drop);
 
-static struct inet6_dev *ip6_mc_find_dev_rtnl(struct net *net,
-					      const struct in6_addr *group,
-					      int ifindex)
+static struct inet6_dev *ip6_mc_find_dev(struct net *net,
+					 const struct in6_addr *group,
+					 int ifindex)
 {
 	struct net_device *dev = NULL;
-	struct inet6_dev *idev = NULL;
+	struct inet6_dev *idev;
 
 	if (ifindex == 0) {
-		struct rt6_info *rt = rt6_lookup(net, group, NULL, 0, NULL, 0);
+		struct rt6_info *rt;
 
+		rcu_read_lock();
+		rt = rt6_lookup(net, group, NULL, 0, NULL, 0);
 		if (rt) {
 			dev = rt->dst.dev;
+			dev_hold(dev);
 			ip6_rt_put(rt);
 		}
+		rcu_read_unlock();
 	} else {
-		dev = __dev_get_by_index(net, ifindex);
+		dev = dev_get_by_index(net, ifindex);
 	}
-
 	if (!dev)
 		return NULL;
-	idev = __in6_dev_get(dev);
+
+	idev = in6_dev_get(dev);
+	dev_put(dev);
+
 	if (!idev)
 		return NULL;
-	if (idev->dead)
-		return NULL;
+
 	return idev;
 }
 
@@ -354,16 +359,16 @@ void ipv6_sock_mc_close(struct sock *sk)
 }
 
 int ip6_mc_source(int add, int omode, struct sock *sk,
-	struct group_source_req *pgsr)
+		  struct group_source_req *pgsr)
 {
+	struct ipv6_pinfo *inet6 = inet6_sk(sk);
 	struct in6_addr *source, *group;
+	struct net *net = sock_net(sk);
 	struct ipv6_mc_socklist *pmc;
-	struct inet6_dev *idev;
-	struct ipv6_pinfo *inet6 = inet6_sk(sk);
 	struct ip6_sf_socklist *psl;
-	struct net *net = sock_net(sk);
-	int i, j, rv;
+	struct inet6_dev *idev;
 	int leavegroup = 0;
+	int i, j, rv;
 	int err;
 
 	source = &((struct sockaddr_in6 *)&pgsr->gsr_source)->sin6_addr;
@@ -372,13 +377,19 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 	if (!ipv6_addr_is_multicast(group))
 		return -EINVAL;
 
-	idev = ip6_mc_find_dev_rtnl(net, group, pgsr->gsr_interface);
+	idev = ip6_mc_find_dev(net, group, pgsr->gsr_interface);
 	if (!idev)
 		return -ENODEV;
 
+	mutex_lock(&idev->mc_lock);
+
+	if (idev->dead) {
+		err = -ENODEV;
+		goto done;
+	}
+
 	err = -EADDRNOTAVAIL;
 
-	mutex_lock(&idev->mc_lock);
 	for_each_pmc_socklock(inet6, sk, pmc) {
 		if (pgsr->gsr_interface && pmc->ifindex != pgsr->gsr_interface)
 			continue;
@@ -475,6 +486,7 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 	ip6_mc_add_src(idev, group, omode, 1, source, 1);
 done:
 	mutex_unlock(&idev->mc_lock);
+	in6_dev_put(idev);
 	if (leavegroup)
 		err = ipv6_sock_mc_drop(sk, pgsr->gsr_interface, group);
 	return err;
@@ -483,12 +495,12 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf,
 		    struct sockaddr_storage *list)
 {
-	const struct in6_addr *group;
-	struct ipv6_mc_socklist *pmc;
-	struct inet6_dev *idev;
 	struct ipv6_pinfo *inet6 = inet6_sk(sk);
 	struct ip6_sf_socklist *newpsl, *psl;
 	struct net *net = sock_net(sk);
+	const struct in6_addr *group;
+	struct ipv6_mc_socklist *pmc;
+	struct inet6_dev *idev;
 	int leavegroup = 0;
 	int i, err;
 
@@ -500,10 +512,17 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf,
 	    gsf->gf_fmode != MCAST_EXCLUDE)
 		return -EINVAL;
 
-	idev = ip6_mc_find_dev_rtnl(net, group, gsf->gf_interface);
+	idev = ip6_mc_find_dev(net, group, gsf->gf_interface);
 	if (!idev)
 		return -ENODEV;
 
+	mutex_lock(&idev->mc_lock);
+
+	if (idev->dead) {
+		err = -ENODEV;
+		goto done;
+	}
+
 	err = 0;
 
 	if (gsf->gf_fmode == MCAST_INCLUDE && gsf->gf_numsrc == 0) {
@@ -536,24 +555,19 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf,
 			psin6 = (struct sockaddr_in6 *)list;
 			newpsl->sl_addr[i] = psin6->sin6_addr;
 		}
-		mutex_lock(&idev->mc_lock);
+
 		err = ip6_mc_add_src(idev, group, gsf->gf_fmode,
 				     newpsl->sl_count, newpsl->sl_addr, 0);
 		if (err) {
-			mutex_unlock(&idev->mc_lock);
 			sock_kfree_s(sk, newpsl, struct_size(newpsl, sl_addr,
 							     newpsl->sl_max));
 			goto done;
 		}
-		mutex_unlock(&idev->mc_lock);
 	} else {
 		newpsl = NULL;
-		mutex_lock(&idev->mc_lock);
 		ip6_mc_add_src(idev, group, gsf->gf_fmode, 0, NULL, 0);
-		mutex_unlock(&idev->mc_lock);
 	}
 
-	mutex_lock(&idev->mc_lock);
 	psl = sock_dereference(pmc->sflist, sk);
 	if (psl) {
 		ip6_mc_del_src(idev, group, pmc->sfmode,
@@ -563,12 +577,14 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf,
 	} else {
 		ip6_mc_del_src(idev, group, pmc->sfmode, 0, NULL, 0);
 	}
+
 	rcu_assign_pointer(pmc->sflist, newpsl);
-	mutex_unlock(&idev->mc_lock);
 	kfree_rcu(psl, rcu);
 	pmc->sfmode = gsf->gf_fmode;
 	err = 0;
 done:
+	mutex_unlock(&idev->mc_lock);
+	in6_dev_put(idev);
 	if (leavegroup)
 		err = ipv6_sock_mc_drop(sk, gsf->gf_interface, group);
 	return err;
-- 
2.49.0



* [PATCH v2 net-next 10/15] ipv6: mcast: Remove unnecessary ASSERT_RTNL and comment.
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (8 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 09/15] ipv6: mcast: Don't hold RTNL for MCAST_ socket options Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:44   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 11/15] ipv6: anycast: Don't use rtnl_dereference() Kuniyuki Iwashima
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

RTNL is no longer needed for the mcast code, and what the comment in
ip6_mc_msfget() says is apparent from for_each_pmc_socklock(), which
has a lockdep annotation for lock_sock().

Let's remove the comment and ASSERT_RTNL() in ipv6_mc_rejoin_groups().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/mcast.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index af322a455346..dc363a6c0b7a 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -605,10 +605,6 @@ int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf,
 	if (!ipv6_addr_is_multicast(group))
 		return -EINVAL;
 
-	/* changes to the ipv6_mc_list require the socket lock and
-	 * rtnl lock. We have the socket lock, so reading the list is safe.
-	 */
-
 	for_each_pmc_socklock(inet6, sk, pmc) {
 		if (pmc->ifindex != gsf->gf_interface)
 			continue;
@@ -2880,8 +2876,6 @@ static void ipv6_mc_rejoin_groups(struct inet6_dev *idev)
 {
 	struct ifmcaddr6 *pmc;
 
-	ASSERT_RTNL();
-
 	mutex_lock(&idev->mc_lock);
 	if (mld_in_v1_mode(idev)) {
 		for_each_mc_mclock(idev, pmc)
-- 
2.49.0



* [PATCH v2 net-next 11/15] ipv6: anycast: Don't use rtnl_dereference().
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (9 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 10/15] ipv6: mcast: Remove unnecessary ASSERT_RTNL and comment Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:46   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 12/15] ipv6: anycast: Don't hold RTNL for IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM Kuniyuki Iwashima
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

inet6_dev->ac_list is protected by inet6_dev->lock, so rtnl_dereference()
is a rather imprecise annotation.

As done in mcast.c, we can use ac_dereference() that checks if
inet6_dev->lock is held.

Let's replace rtnl_dereference() with a new helper ac_dereference().

Note that addrconf_join_solict() / addrconf_leave_solict(), called from
__ipv6_dev_ac_inc() / __ipv6_dev_ac_dec(), no longer need RTNL, so we
can remove ASSERT_RTNL() there.
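
The idea behind a lock-specific dereference helper can be sketched in
userspace C. This is a rough model only: the toy_* names are
hypothetical, and a boolean "held" flag checked with assert() stands in
for lockdep_is_held().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock that records whether it is held, so accessors can
 * assert on it; a crude stand-in for lockdep tracking.
 */
struct toy_lock {
	bool held;
};

struct toy_aca {
	int addr;
	struct toy_aca *next;
};

struct toy_acdev {
	struct toy_lock lock;
	struct toy_aca *ac_list;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Name the lock that protects the pointer and check it at the point of
 * use, instead of asserting on an unrelated global lock.
 */
#define toy_ac_dereference(p, idev) \
	(assert((idev)->lock.held), (p))

static int toy_ac_count(struct toy_acdev *idev)
{
	struct toy_aca *aca;
	int n = 0;

	toy_lock_acquire(&idev->lock);
	for (aca = toy_ac_dereference(idev->ac_list, idev); aca;
	     aca = toy_ac_dereference(aca->next, idev))
		n++;
	toy_lock_release(&idev->lock);

	return n;
}
```

Tying the check to the lock that actually protects the list is what makes
the annotation useful once RTNL is no longer held by the callers.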

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/addrconf.c |  2 --
 net/ipv6/anycast.c  | 17 ++++++++---------
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 8451014457dd..3c200157634e 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2252,7 +2252,6 @@ void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr *addr)
 	__ipv6_dev_mc_dec(idev, &maddr);
 }
 
-/* caller must hold RTNL */
 static void addrconf_join_anycast(struct inet6_ifaddr *ifp)
 {
 	struct in6_addr addr;
@@ -2265,7 +2264,6 @@ static void addrconf_join_anycast(struct inet6_ifaddr *ifp)
 	__ipv6_dev_ac_inc(ifp->idev, &addr);
 }
 
-/* caller must hold RTNL */
 static void addrconf_leave_anycast(struct inet6_ifaddr *ifp)
 {
 	struct in6_addr addr;
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 21e01695b48c..f510df93b1e9 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -47,6 +47,9 @@
 static struct hlist_head inet6_acaddr_lst[IN6_ADDR_HSIZE];
 static DEFINE_SPINLOCK(acaddr_hash_lock);
 
+#define ac_dereference(a, idev)						\
+	rcu_dereference_protected(a, lockdep_is_held(&(idev)->lock))
+
 static int ipv6_dev_ac_dec(struct net_device *dev, const struct in6_addr *addr);
 
 static u32 inet6_acaddr_hash(const struct net *net,
@@ -319,16 +322,14 @@ int __ipv6_dev_ac_inc(struct inet6_dev *idev, const struct in6_addr *addr)
 	struct net *net;
 	int err;
 
-	ASSERT_RTNL();
-
 	write_lock_bh(&idev->lock);
 	if (idev->dead) {
 		err = -ENODEV;
 		goto out;
 	}
 
-	for (aca = rtnl_dereference(idev->ac_list); aca;
-	     aca = rtnl_dereference(aca->aca_next)) {
+	for (aca = ac_dereference(idev->ac_list, idev); aca;
+	     aca = ac_dereference(aca->aca_next, idev)) {
 		if (ipv6_addr_equal(&aca->aca_addr, addr)) {
 			aca->aca_users++;
 			err = 0;
@@ -380,12 +381,10 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr)
 {
 	struct ifacaddr6 *aca, *prev_aca;
 
-	ASSERT_RTNL();
-
 	write_lock_bh(&idev->lock);
 	prev_aca = NULL;
-	for (aca = rtnl_dereference(idev->ac_list); aca;
-	     aca = rtnl_dereference(aca->aca_next)) {
+	for (aca = ac_dereference(idev->ac_list, idev); aca;
+	     aca = ac_dereference(aca->aca_next, idev)) {
 		if (ipv6_addr_equal(&aca->aca_addr, addr))
 			break;
 		prev_aca = aca;
@@ -429,7 +428,7 @@ void ipv6_ac_destroy_dev(struct inet6_dev *idev)
 	struct ifacaddr6 *aca;
 
 	write_lock_bh(&idev->lock);
-	while ((aca = rtnl_dereference(idev->ac_list)) != NULL) {
+	while ((aca = ac_dereference(idev->ac_list, idev)) != NULL) {
 		rcu_assign_pointer(idev->ac_list, aca->aca_next);
 		write_unlock_bh(&idev->lock);
 
-- 
2.49.0



* [PATCH v2 net-next 12/15] ipv6: anycast: Don't hold RTNL for IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM.
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (10 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 11/15] ipv6: anycast: Don't use rtnl_dereference() Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:48   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 13/15] ipv6: anycast: Unify two error paths in ipv6_sock_ac_join() Kuniyuki Iwashima
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

inet6_sk(sk)->ipv6_ac_list is protected by lock_sock().

In ipv6_sock_ac_drop() and ipv6_sock_ac_close(),
only __dev_get_by_index() and __in6_dev_get() require RTNL.

Let's replace them with dev_get_by_index() and in6_dev_get()
and drop RTNL from IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM.
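
One detail of the refcounted version of the close loop is that the device
is kept pinned while consecutive list entries share an ifindex, and is
released before the next lookup and once more after the loop. A userspace
sketch of that pattern, with hypothetical toy_* names and a plain int in
place of the kernel's reference tracking:

```c
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; not a kernel type. */
struct toy_dev {
	int ifindex;
	int refcnt;
};

static struct toy_dev toy_devs[] = { { 1, 1 }, { 2, 1 } };
static int toy_lookups;	/* counts lookups, to show the caching effect */

static struct toy_dev *toy_dev_get_by_index(int ifindex)
{
	size_t i;

	toy_lookups++;
	for (i = 0; i < sizeof(toy_devs) / sizeof(toy_devs[0]); i++) {
		if (toy_devs[i].ifindex == ifindex) {
			toy_devs[i].refcnt++;
			return &toy_devs[i];
		}
	}
	return NULL;
}

static void toy_dev_put(struct toy_dev *dev)
{
	if (dev)
		dev->refcnt--;
}

/* Walk a list of ifindex values and return how many entries had a
 * live device, reusing the pinned device across equal neighbours.
 */
static int toy_close(const int *ifindexes, size_t n)
{
	struct toy_dev *dev = NULL;
	int prev_index = 0;
	int handled = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ifindexes[i] != prev_index) {
			toy_dev_put(dev);	/* release the previous device, if any */
			dev = toy_dev_get_by_index(ifindexes[i]);
			prev_index = ifindexes[i];
		}
		if (dev)
			handled++;
	}

	toy_dev_put(dev);	/* release the device still pinned at the end */
	return handled;
}
```

Because the put helper tolerates NULL, neither the first iteration nor a
failed lookup needs a special case.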

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/anycast.c       | 36 ++++++++++++++++++++----------------
 net/ipv6/ipv6_sockglue.c |  2 --
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index f510df93b1e9..8440e7b27f6d 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -158,12 +158,10 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
  */
 int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const struct in6_addr *addr)
 {
-	struct ipv6_pinfo *np = inet6_sk(sk);
-	struct net_device *dev;
 	struct ipv6_ac_socklist *pac, *prev_pac;
+	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct net *net = sock_net(sk);
-
-	ASSERT_RTNL();
+	struct net_device *dev;
 
 	prev_pac = NULL;
 	for (pac = np->ipv6_ac_list; pac; pac = pac->acl_next) {
@@ -179,9 +177,11 @@ int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const struct in6_addr *addr)
 	else
 		np->ipv6_ac_list = pac->acl_next;
 
-	dev = __dev_get_by_index(net, pac->acl_ifindex);
-	if (dev)
+	dev = dev_get_by_index(net, pac->acl_ifindex);
+	if (dev) {
 		ipv6_dev_ac_dec(dev, &pac->acl_addr);
+		dev_put(dev);
+	}
 
 	sock_kfree_s(sk, pac, sizeof(*pac));
 	return 0;
@@ -190,21 +190,20 @@ int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const struct in6_addr *addr)
 void __ipv6_sock_ac_close(struct sock *sk)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct net *net = sock_net(sk);
 	struct net_device *dev = NULL;
 	struct ipv6_ac_socklist *pac;
-	struct net *net = sock_net(sk);
-	int	prev_index;
+	int prev_index = 0;
 
-	ASSERT_RTNL();
 	pac = np->ipv6_ac_list;
 	np->ipv6_ac_list = NULL;
 
-	prev_index = 0;
 	while (pac) {
 		struct ipv6_ac_socklist *next = pac->acl_next;
 
 		if (pac->acl_ifindex != prev_index) {
-			dev = __dev_get_by_index(net, pac->acl_ifindex);
+			dev_put(dev);
+			dev = dev_get_by_index(net, pac->acl_ifindex);
 			prev_index = pac->acl_ifindex;
 		}
 		if (dev)
@@ -212,6 +211,8 @@ void __ipv6_sock_ac_close(struct sock *sk)
 		sock_kfree_s(sk, pac, sizeof(*pac));
 		pac = next;
 	}
+
+	dev_put(dev);
 }
 
 void ipv6_sock_ac_close(struct sock *sk)
@@ -220,9 +221,8 @@ void ipv6_sock_ac_close(struct sock *sk)
 
 	if (!np->ipv6_ac_list)
 		return;
-	rtnl_lock();
+
 	__ipv6_sock_ac_close(sk);
-	rtnl_unlock();
 }
 
 static void ipv6_add_acaddr_hash(struct net *net, struct ifacaddr6 *aca)
@@ -413,14 +413,18 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr)
 	return 0;
 }
 
-/* called with rtnl_lock() */
 static int ipv6_dev_ac_dec(struct net_device *dev, const struct in6_addr *addr)
 {
-	struct inet6_dev *idev = __in6_dev_get(dev);
+	struct inet6_dev *idev = in6_dev_get(dev);
+	int err;
 
 	if (!idev)
 		return -ENODEV;
-	return __ipv6_dev_ac_dec(idev, addr);
+
+	err = __ipv6_dev_ac_dec(idev, addr);
+	in6_dev_put(idev);
+
+	return err;
 }
 
 void ipv6_ac_destroy_dev(struct inet6_dev *idev)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 0c870713b08c..3d891aa6e7f5 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -120,9 +120,7 @@ struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
 static bool setsockopt_needs_rtnl(int optname)
 {
 	switch (optname) {
-	case IPV6_ADDRFORM:
 	case IPV6_JOIN_ANYCAST:
-	case IPV6_LEAVE_ANYCAST:
 		return true;
 	}
 	return false;
-- 
2.49.0



* [PATCH v2 net-next 13/15] ipv6: anycast: Unify two error paths in ipv6_sock_ac_join().
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (11 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 12/15] ipv6: anycast: Don't hold RTNL for IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:52   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 14/15] ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST Kuniyuki Iwashima
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

The next patch will replace __dev_get_by_index() and __dev_get_by_flags()
with their RCU + refcount versions.

Then, we will need to call dev_put() in some error paths.

Let's unify two error paths to make the next patch cleaner.
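
Why a single exit label helps once references are involved can be shown
with a small userspace sketch. The toy_* names and the alloc_fails knob
are hypothetical, and unlike the real function the toy frees its
allocation on success as well, since it has no socket to attach it to.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device; not a kernel type. */
struct toy_dev {
	int refcnt;
};

static void toy_dev_put(struct toy_dev *dev)
{
	if (dev)
		dev->refcnt--;
}

/* Every failure after the lookup jumps to one label, so the reference
 * is released in exactly one place.
 */
static int toy_join(struct toy_dev *candidate, int addr_in_use,
		    int alloc_fails)
{
	struct toy_dev *dev = NULL;
	void *pac = NULL;
	int err = 0;

	if (candidate) {
		candidate->refcnt++;	/* the lookup takes a reference */
		dev = candidate;
	}

	if (addr_in_use) {
		err = -EINVAL;
		goto error;
	}

	pac = alloc_fails ? NULL : malloc(sizeof(int));
	if (!pac) {
		err = -ENOMEM;
		goto error;
	}

	/* ... the real code would link pac into the socket here ... */
error:
	toy_dev_put(dev);	/* safe even when no device was looked up */
	free(pac);		/* free(NULL) is a no-op */
	return err;
}
```

With early returns instead of the label, each of the two failure branches
would need its own put call.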

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/anycast.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 8440e7b27f6d..e0a1f9d7622c 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -67,12 +67,11 @@ static u32 inet6_acaddr_hash(const struct net *net,
 int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct ipv6_ac_socklist *pac = NULL;
+	struct net *net = sock_net(sk);
 	struct net_device *dev = NULL;
 	struct inet6_dev *idev;
-	struct ipv6_ac_socklist *pac;
-	struct net *net = sock_net(sk);
-	int	ishost = !net->ipv6.devconf_all->forwarding;
-	int	err = 0;
+	int err = 0, ishost;
 
 	ASSERT_RTNL();
 
@@ -84,15 +83,22 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
 	if (ifindex)
 		dev = __dev_get_by_index(net, ifindex);
 
-	if (ipv6_chk_addr_and_flags(net, addr, dev, true, 0, IFA_F_TENTATIVE))
-		return -EINVAL;
+	if (ipv6_chk_addr_and_flags(net, addr, dev, true, 0, IFA_F_TENTATIVE)) {
+		err = -EINVAL;
+		goto error;
+	}
 
 	pac = sock_kmalloc(sk, sizeof(struct ipv6_ac_socklist), GFP_KERNEL);
-	if (!pac)
-		return -ENOMEM;
+	if (!pac) {
+		err = -ENOMEM;
+		goto error;
+	}
+
 	pac->acl_next = NULL;
 	pac->acl_addr = *addr;
 
+	ishost = !net->ipv6.devconf_all->forwarding;
+
 	if (ifindex == 0) {
 		struct rt6_info *rt;
 
-- 
2.49.0



* [PATCH v2 net-next 14/15] ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST.
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (12 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 13/15] ipv6: anycast: Unify two error paths in ipv6_sock_ac_join() Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:54   ` Eric Dumazet
  2025-06-24 20:24 ` [PATCH v2 net-next 15/15] ipv6: Remove setsockopt_needs_rtnl() Kuniyuki Iwashima
  2025-06-26 13:27 ` [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Paolo Abeni
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

inet6_sk(sk)->ipv6_ac_list is protected by lock_sock().

In ipv6_sock_ac_join(), only __dev_get_by_index(), __dev_get_by_flags(),
and __in6_dev_get() require RTNL.

__dev_get_by_flags() is only used by ipv6_sock_ac_join() and can be
converted to an RCU version.

Let's use the RCU helper instead and drop RTNL from IPV6_JOIN_ANYCAST.

setsockopt_needs_rtnl() will be removed in the next patch.
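
The flags/mask test used when picking a device is easy to misread, so
here is the same expression as a standalone userspace sketch. The toy_
helper and TOY_ constants are hypothetical names; the two constant values
are copied from the Linux UAPI definitions of IFF_UP and IFF_LOOPBACK.

```c
#include <stdbool.h>

/* Same values as IFF_UP and IFF_LOOPBACK in <linux/if.h>, renamed so
 * the sketch does not depend on or clash with system headers.
 */
#define TOY_IFF_UP		0x1
#define TOY_IFF_LOOPBACK	0x8

/* True when dev_flags agrees with if_flags on every bit set in mask;
 * bits outside mask are ignored.
 */
static bool toy_flags_match(unsigned short dev_flags,
			    unsigned short if_flags,
			    unsigned short mask)
{
	return ((dev_flags ^ if_flags) & mask) == 0;
}
```

Called with if_flags = TOY_IFF_UP and mask = TOY_IFF_UP |
TOY_IFF_LOOPBACK, as in the anycast join path, this accepts a device
that is up and is not a loopback device, regardless of its other flags.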

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
v2: Hold rcu_read_lock() around rt6_lookup & dev_hold()
---
 include/linux/netdevice.h |  4 ++--
 net/core/dev.c            | 38 ++++++++++++++++++--------------------
 net/ipv6/anycast.c        | 20 +++++++++++++-------
 net/ipv6/ipv6_sockglue.c  |  4 ----
 4 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 03c26bb0fbbe..68f874a58c92 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3339,8 +3339,8 @@ int dev_get_iflink(const struct net_device *dev);
 int dev_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb);
 int dev_fill_forward_path(const struct net_device *dev, const u8 *daddr,
 			  struct net_device_path_stack *stack);
-struct net_device *__dev_get_by_flags(struct net *net, unsigned short flags,
-				      unsigned short mask);
+struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags,
+					unsigned short mask);
 struct net_device *dev_get_by_name(struct net *net, const char *name);
 struct net_device *dev_get_by_name_rcu(struct net *net, const char *name);
 struct net_device *__dev_get_by_name(struct net *net, const char *name);
diff --git a/net/core/dev.c b/net/core/dev.c
index 7ee808eb068e..553c654e6f77 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1267,33 +1267,31 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, unsigned short type)
 EXPORT_SYMBOL(dev_getfirstbyhwtype);
 
 /**
- *	__dev_get_by_flags - find any device with given flags
- *	@net: the applicable net namespace
- *	@if_flags: IFF_* values
- *	@mask: bitmask of bits in if_flags to check
+ * dev_get_by_flags_rcu - find any device with given flags
+ * @net: the applicable net namespace
+ * @if_flags: IFF_* values
+ * @mask: bitmask of bits in if_flags to check
  *
- *	Search for any interface with the given flags. Returns NULL if a device
- *	is not found or a pointer to the device. Must be called inside
- *	rtnl_lock(), and result refcount is unchanged.
+ * Search for any interface with the given flags.
+ *
+ * Context: rcu_read_lock() must be held.
+ * Returns: NULL if a device is not found or a pointer to the device.
  */
-
-struct net_device *__dev_get_by_flags(struct net *net, unsigned short if_flags,
-				      unsigned short mask)
+struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short if_flags,
+					unsigned short mask)
 {
-	struct net_device *dev, *ret;
-
-	ASSERT_RTNL();
+	struct net_device *dev;
 
-	ret = NULL;
-	for_each_netdev(net, dev) {
-		if (((dev->flags ^ if_flags) & mask) == 0) {
-			ret = dev;
-			break;
+	for_each_netdev_rcu(net, dev) {
+		if (((READ_ONCE(dev->flags) ^ if_flags) & mask) == 0) {
+			dev_hold(dev);
+			return dev;
 		}
 	}
-	return ret;
+
+	return NULL;
 }
-EXPORT_SYMBOL(__dev_get_by_flags);
+EXPORT_IPV6_MOD(dev_get_by_flags_rcu);
 
 /**
  *	dev_valid_name - check if name is okay for network device
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index e0a1f9d7622c..427fa95018b7 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -73,15 +73,13 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
 	struct inet6_dev *idev;
 	int err = 0, ishost;
 
-	ASSERT_RTNL();
-
 	if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 		return -EPERM;
 	if (ipv6_addr_is_multicast(addr))
 		return -EINVAL;
 
 	if (ifindex)
-		dev = __dev_get_by_index(net, ifindex);
+		dev = dev_get_by_index(net, ifindex);
 
 	if (ipv6_chk_addr_and_flags(net, addr, dev, true, 0, IFA_F_TENTATIVE)) {
 		err = -EINVAL;
@@ -102,18 +100,22 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
 	if (ifindex == 0) {
 		struct rt6_info *rt;
 
+		rcu_read_lock();
 		rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
 		if (rt) {
 			dev = rt->dst.dev;
+			dev_hold(dev);
 			ip6_rt_put(rt);
 		} else if (ishost) {
+			rcu_read_unlock();
 			err = -EADDRNOTAVAIL;
 			goto error;
 		} else {
 			/* router, no matching interface: just pick one */
-			dev = __dev_get_by_flags(net, IFF_UP,
-						 IFF_UP | IFF_LOOPBACK);
+			dev = dev_get_by_flags_rcu(net, IFF_UP,
+						   IFF_UP | IFF_LOOPBACK);
 		}
+		rcu_read_unlock();
 	}
 
 	if (!dev) {
@@ -121,7 +123,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
 		goto error;
 	}
 
-	idev = __in6_dev_get(dev);
+	idev = in6_dev_get(dev);
 	if (!idev) {
 		if (ifindex)
 			err = -ENODEV;
@@ -143,7 +145,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
 		if (ishost)
 			err = -EADDRNOTAVAIL;
 		if (err)
-			goto error;
+			goto error_idev;
 	}
 
 	err = __ipv6_dev_ac_inc(idev, addr);
@@ -153,7 +155,11 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
 		pac = NULL;
 	}
 
+error_idev:
+	in6_dev_put(idev);
 error:
+	dev_put(dev);
+
 	if (pac)
 		sock_kfree_s(sk, pac, sizeof(*pac));
 	return err;
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 3d891aa6e7f5..702dc33e50ad 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -119,10 +119,6 @@ struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
 
 static bool setsockopt_needs_rtnl(int optname)
 {
-	switch (optname) {
-	case IPV6_JOIN_ANYCAST:
-		return true;
-	}
 	return false;
 }
 
-- 
2.49.0
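
A note on the flag test that dev_get_by_flags_rcu() keeps from the old helper: ((flags ^ if_flags) & mask) == 0 holds when flags and if_flags agree on every bit set in mask. Below is a minimal userspace sketch of that test. The demo_* names are hypothetical, and the DEMO_IFF_* constants are assumed copies of the IFF_UP and IFF_LOOPBACK values, redefined so the snippet stands alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for IFF_UP / IFF_LOOPBACK, redefined so the sketch is standalone. */
#define DEMO_IFF_UP       0x1
#define DEMO_IFF_LOOPBACK 0x8

/* Hypothetical helper: true when flags and want agree on every bit in mask. */
bool demo_flags_match(unsigned short flags, unsigned short want,
		      unsigned short mask)
{
	return ((flags ^ want) & mask) == 0;
}

/* Same arguments as the anycast call site: up, and not loopback. */
bool demo_up_non_loopback(unsigned short flags)
{
	return demo_flags_match(flags, DEMO_IFF_UP,
				DEMO_IFF_UP | DEMO_IFF_LOOPBACK);
}

/* Usage example; bits outside the mask are ignored by the test. */
void demo_flags_examples(void)
{
	assert(demo_up_non_loopback(DEMO_IFF_UP));
	assert(!demo_up_non_loopback(DEMO_IFF_UP | DEMO_IFF_LOOPBACK));
	assert(!demo_up_non_loopback(0));
}
```

Passing IFF_UP as the wanted value with IFF_UP | IFF_LOOPBACK as the mask therefore selects a device that is up and is not a loopback device.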


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 net-next 15/15] ipv6: Remove setsockopt_needs_rtnl().
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (13 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 14/15] ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST Kuniyuki Iwashima
@ 2025-06-24 20:24 ` Kuniyuki Iwashima
  2025-06-26 14:54   ` Eric Dumazet
  2025-06-26 13:27 ` [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Paolo Abeni
  15 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-24 20:24 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

From: Kuniyuki Iwashima <kuniyu@google.com>

We no longer need to hold RTNL for IPv6 socket options.

Let's remove setsockopt_needs_rtnl().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv6/ipv6_sockglue.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 702dc33e50ad..e66ec623972e 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -117,11 +117,6 @@ struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
 	return opt;
 }
 
-static bool setsockopt_needs_rtnl(int optname)
-{
-	return false;
-}
-
 static int copy_group_source_from_sockptr(struct group_source_req *greqs,
 		sockptr_t optval, int optlen)
 {
@@ -380,9 +375,8 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct net *net = sock_net(sk);
-	int val, valbool;
 	int retv = -ENOPROTOOPT;
-	bool needs_rtnl = setsockopt_needs_rtnl(optname);
+	int val, valbool;
 
 	if (sockptr_is_null(optval))
 		val = 0;
@@ -547,8 +541,7 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		return 0;
 	}
 	}
-	if (needs_rtnl)
-		rtnl_lock();
+
 	sockopt_lock_sock(sk);
 
 	/* Another thread has converted the socket into IPv4 with
@@ -954,8 +947,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 
 unlock:
 	sockopt_release_sock(sk);
-	if (needs_rtnl)
-		rtnl_unlock();
 
 	return retv;
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c
  2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
                   ` (14 preceding siblings ...)
  2025-06-24 20:24 ` [PATCH v2 net-next 15/15] ipv6: Remove setsockopt_needs_rtnl() Kuniyuki Iwashima
@ 2025-06-26 13:27 ` Paolo Abeni
  2025-06-27  0:49   ` Kuniyuki Iwashima
  15 siblings, 1 reply; 40+ messages in thread
From: Paolo Abeni @ 2025-06-26 13:27 UTC (permalink / raw)
  To: Kuniyuki Iwashima, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski
  Cc: Simon Horman, Kuniyuki Iwashima, netdev

On 6/24/25 10:24 PM, Kuniyuki Iwashima wrote:
> From: Kuniyuki Iwashima <kuniyu@google.com>
> 
> This is a prep series for RCU conversion of RTM_NEWNEIGH, which needs
> RTNL during neigh_table.{pconstructor,pdestructor}() touching IPv6
> multicast code.
> 
> Currently, IPv6 multicast code is protected by lock_sock() and
> inet6_dev->mc_lock, and RTNL is not actually needed.
> 
> In addition, anycast code is also in the same situation and does not
> need RTNL at all.
> 
> This series removes RTNL from net/ipv6/{mcast.c,anycast.c} and finally
> removes setsockopt_needs_rtnl() from do_ipv6_setsockopt().

I went through the whole series and could not find any obvious bug.

Still, this is not a trivial matter and I recently missed bugs in similar
changes, so let me keep the series in PW for a little longer, just in
case another pair of eyes goes over it ;)

BTW @Kuniyuki: do you have a somewhat public todo list that others could
peek at to join this effort?

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 01/15] ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}().
  2025-06-24 20:24 ` [PATCH v2 net-next 01/15] ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}() Kuniyuki Iwashima
@ 2025-06-26 14:26   ` Eric Dumazet
  2025-06-27  0:56     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:26 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> ipv6_dev_mc_{inc,dec}() has the same check.
>
> Let's remove __in6_dev_get() from pndisc_constructor() and
> pndisc_destructor().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
>  net/ipv6/ndisc.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
> index ecb5c4b8518f..beb1814a1ac2 100644
> --- a/net/ipv6/ndisc.c
> +++ b/net/ipv6/ndisc.c
> @@ -377,11 +377,12 @@ static int ndisc_constructor(struct neighbour *neigh)
>  static int pndisc_constructor(struct pneigh_entry *n)
>  {
>         struct in6_addr *addr = (struct in6_addr *)&n->key;
> -       struct in6_addr maddr;
>         struct net_device *dev = n->dev;
> +       struct in6_addr maddr;
>
> -       if (!dev || !__in6_dev_get(dev))
> +       if (!dev)
>                 return -EINVAL;
> +
>         addrconf_addr_solict_mult(addr, &maddr);
>         ipv6_dev_mc_inc(dev, &maddr);

return ipv6_dev_mc_inc(dev, &maddr); ?

>         return 0;
> @@ -390,11 +391,12 @@ static int pndisc_constructor(struct pneigh_entry *n)
>  static void pndisc_destructor(struct pneigh_entry *n)
>  {
>         struct in6_addr *addr = (struct in6_addr *)&n->key;
> -       struct in6_addr maddr;
>         struct net_device *dev = n->dev;
> +       struct in6_addr maddr;
>
> -       if (!dev || !__in6_dev_get(dev))
> +       if (!dev)
>                 return;
> +
>         addrconf_addr_solict_mult(addr, &maddr);
>         ipv6_dev_mc_dec(dev, &maddr);

return ipv6_dev_mc_dec(dev, &maddr);

>  }
> --

If not needed (because of a future patch ?), this should be mentioned
in the changelog.

> 2.49.0
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 02/15] ipv6: mcast: Replace locking comments with lockdep annotations.
  2025-06-24 20:24 ` [PATCH v2 net-next 02/15] ipv6: mcast: Replace locking comments with lockdep annotations Kuniyuki Iwashima
@ 2025-06-26 14:28   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:28 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> Commit 63ed8de4be81 ("mld: add mc_lock for protecting per-interface
> mld data") added the same comments regarding locking to many functions.
>
> Let's replace the comments with lockdep annotation, which is more helpful.
>
> Note that we just remove the comment for mld_clear_zeros() and
> mld_send_cr(), where mc_dereference() is used in the entry of the
> function.
>
> While at it, a comment for __ipv6_sock_mc_join() is moved back to the
> correct place.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 03/15] ipv6: mcast: Check inet6_dev->dead under idev->mc_lock in __ipv6_dev_mc_inc().
  2025-06-24 20:24 ` [PATCH v2 net-next 03/15] ipv6: mcast: Check inet6_dev->dead under idev->mc_lock in __ipv6_dev_mc_inc() Kuniyuki Iwashima
@ 2025-06-26 14:30   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:30 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> Since commit 63ed8de4be81 ("mld: add mc_lock for protecting
> per-interface mld data"), every multicast resource is protected
> by inet6_dev->mc_lock.
>
> RTNL is unnecessary in terms of protection but still needed for
> synchronisation between addrconf_ifdown() and __ipv6_dev_mc_inc().
>
> Once we remove RTNL, the race below becomes possible, where we could
> add a multicast address to a dead inet6_dev.
>
>   CPU1                            CPU2
>   ====                            ====
>   addrconf_ifdown()               __ipv6_dev_mc_inc()
>                                     if (idev->dead) <-- false
>     dead = true                       return -ENODEV;
>     ipv6_mc_destroy_dev() / ipv6_mc_down()
>       mutex_lock(&idev->mc_lock)
>       ...
>       mutex_unlock(&idev->mc_lock)
>                                     mutex_lock(&idev->mc_lock)
>                                     ...
>                                     mutex_unlock(&idev->mc_lock)
>
> The race window can be easily closed by checking inet6_dev->dead
> under inet6_dev->mc_lock in __ipv6_dev_mc_inc() as addrconf_ifdown()
> will acquire it after marking inet6_dev dead.
>
> Let's check inet6_dev->dead under mc_lock in __ipv6_dev_mc_inc().
>
> Note that now __ipv6_dev_mc_inc() no longer depends on RTNL and
> we can remove ASSERT_RTNL() there and the RTNL comment above
> addrconf_join_solict().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>
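
The ordering argument in the changelog above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: struct demo_idev, demo_ifdown() and demo_mc_inc() are hypothetical stand-ins for inet6_dev, addrconf_ifdown() and __ipv6_dev_mc_inc(), and a pthread mutex plays the role of mc_lock. It is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model of an inet6_dev-like object. */
struct demo_idev {
	pthread_mutex_t mc_lock;
	atomic_bool dead;	/* set by the teardown path before it locks */
	int mc_count;		/* protected by mc_lock */
};

/* Teardown: mark the object dead first, then clean up under mc_lock. */
void demo_ifdown(struct demo_idev *idev)
{
	assert(idev != NULL);
	atomic_store(&idev->dead, 1);
	pthread_mutex_lock(&idev->mc_lock);
	idev->mc_count = 0;
	pthread_mutex_unlock(&idev->mc_lock);
}

/* Join: test dead only after taking mc_lock, so the check and the
 * update cannot be separated by the teardown's cleanup. */
int demo_mc_inc(struct demo_idev *idev)
{
	int err = 0;

	assert(idev != NULL);
	pthread_mutex_lock(&idev->mc_lock);
	if (atomic_load(&idev->dead))
		err = -ENODEV;
	else
		idev->mc_count++;
	pthread_mutex_unlock(&idev->mc_lock);
	return err;
}
```

Either the join takes the lock before the teardown does, in which case the teardown's cleanup removes the new entry, or it takes the lock afterwards and sees dead set.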

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 04/15] ipv6: mcast: Remove mca_get().
  2025-06-24 20:24 ` [PATCH v2 net-next 04/15] ipv6: mcast: Remove mca_get() Kuniyuki Iwashima
@ 2025-06-26 14:31   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:31 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> Since commit 63ed8de4be81 ("mld: add mc_lock for protecting per-interface
> mld data"), the newly allocated struct ifmcaddr6 cannot be removed until
> inet6_dev->mc_lock is released, so mca_get() and mc_put() are unnecessary.
>
> Let's remove the extra refcounting.
>
> Note that mca_get() was only used in __ipv6_dev_mc_inc().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 05/15] ipv6: mcast: Use in6_dev_get() in ipv6_dev_mc_dec().
  2025-06-24 20:24 ` [PATCH v2 net-next 05/15] ipv6: mcast: Use in6_dev_get() in ipv6_dev_mc_dec() Kuniyuki Iwashima
@ 2025-06-26 14:33   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:33 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> As with __ipv6_dev_mc_inc(), all code in __ipv6_dev_mc_dec() is
> protected by inet6_dev->mc_lock, and RTNL is not needed.
>
> Let's use in6_dev_get() in ipv6_dev_mc_dec() and remove ASSERT_RTNL()
> in __ipv6_dev_mc_dec().
>
> Now, we can remove the RTNL comment above addrconf_leave_solict() too.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.
  2025-06-24 20:24 ` [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP Kuniyuki Iwashima
@ 2025-06-26 14:36   ` Eric Dumazet
  2025-06-27  1:01     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:36 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> In __ipv6_sock_mc_join(), per-socket mld data is protected by lock_sock(),
> and only __dev_get_by_index() requires RTNL.
>
> Let's use dev_get_by_index() and drop RTNL for IPV6_ADD_MEMBERSHIP and
> MCAST_JOIN_GROUP.
>
> Note that we must call rt6_lookup() and dev_hold() under RCU.
>
> If rt6_lookup() returns an entry from the exception table, dst_dev_put()
> could change rt->dst.dev to loopback concurrently, and the original device
> could lose the refcount before dev_hold() and unblock device registration.
>
> dst_dev_put() is called from NETDEV_UNREGISTER and synchronize_net() follows
> it, so as long as rt6_lookup() and dev_hold() are called within the same
> RCU critical section, the dev is alive.
>
> Even if the race happens, they are synchronised by idev->dead and mcast
> addresses are cleaned up.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
> v2: Hold rcu_read_lock() around rt6_lookup & dev_hold()
> ---
>  net/ipv6/ipv6_sockglue.c |  2 --
>  net/ipv6/mcast.c         | 22 ++++++++++++----------
>  2 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index 1e225e6489ea..cb0dc885cbe4 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -121,11 +121,9 @@ static bool setsockopt_needs_rtnl(int optname)
>  {
>         switch (optname) {
>         case IPV6_ADDRFORM:
> -       case IPV6_ADD_MEMBERSHIP:
>         case IPV6_DROP_MEMBERSHIP:
>         case IPV6_JOIN_ANYCAST:
>         case IPV6_LEAVE_ANYCAST:
> -       case MCAST_JOIN_GROUP:
>         case MCAST_LEAVE_GROUP:
>         case MCAST_JOIN_SOURCE_GROUP:
>         case MCAST_LEAVE_SOURCE_GROUP:
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index b3f063b5ffd7..9fc7672926bf 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -175,14 +175,12 @@ static int unsolicited_report_interval(struct inet6_dev *idev)
>  static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
>                                const struct in6_addr *addr, unsigned int mode)
>  {
> -       struct net_device *dev = NULL;
> -       struct ipv6_mc_socklist *mc_lst;
>         struct ipv6_pinfo *np = inet6_sk(sk);
> +       struct ipv6_mc_socklist *mc_lst;
>         struct net *net = sock_net(sk);
> +       struct net_device *dev = NULL;
>         int err;
>
> -       ASSERT_RTNL();
> -
>         if (!ipv6_addr_is_multicast(addr))
>                 return -EINVAL;
>
> @@ -202,13 +200,18 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
>
>         if (ifindex == 0) {
>                 struct rt6_info *rt;
> +
> +               rcu_read_lock();
>                 rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
>                 if (rt) {
>                         dev = rt->dst.dev;

We probably need safety here, READ_ONCE() at minimum.

This can probably be done in a separate series.

Reviewed-by: Eric Dumazet <edumazet@google.com>
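
For readers unfamiliar with the READ_ONCE() request above, here is a rough userspace approximation of the idea: load a field that may change concurrently exactly once, then use only the local copy. DEMO_READ_ONCE and the demo_* types are hypothetical. The kernel's real READ_ONCE() is more elaborate, and this macro assumes the GNU __typeof__ extension (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Userspace approximation of READ_ONCE() for scalar and pointer types.
 * Assumes the GNU __typeof__ extension is available. */
#define DEMO_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct demo_dev { int ifindex; };
struct demo_dst { struct demo_dev *dev; };

/* Load the pointer once; every later use goes through the local copy,
 * so a concurrent writer cannot make two uses see different devices. */
int demo_dst_ifindex(struct demo_dst *dst)
{
	struct demo_dev *dev = DEMO_READ_ONCE(dst->dev);

	assert(dev != NULL);
	return dev->ifindex;
}
```

Note that the volatile load only prevents the compiler from re-reading or tearing the access. Keeping the device alive still depends on the RCU section and dev_hold() described in the changelog.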

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 07/15] ipv6: mcast: Don't hold RTNL for IPV6_DROP_MEMBERSHIP and MCAST_LEAVE_GROUP.
  2025-06-24 20:24 ` [PATCH v2 net-next 07/15] ipv6: mcast: Don't hold RTNL for IPV6_DROP_MEMBERSHIP and MCAST_LEAVE_GROUP Kuniyuki Iwashima
@ 2025-06-26 14:38   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:38 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> In __ipv6_sock_mc_drop(), per-socket mld data is protected by lock_sock(),
> and only __dev_get_by_index() and __in6_dev_get() require RTNL.
>
> Let's use dev_get_by_index() and in6_dev_get() and drop RTNL for
> IPV6_DROP_MEMBERSHIP and MCAST_LEAVE_GROUP.
>
> Note that __ipv6_sock_mc_drop() is factorised to reuse in the next patch.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 08/15] ipv6: mcast: Don't hold RTNL in ipv6_sock_mc_close().
  2025-06-24 20:24 ` [PATCH v2 net-next 08/15] ipv6: mcast: Don't hold RTNL in ipv6_sock_mc_close() Kuniyuki Iwashima
@ 2025-06-26 14:39   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:39 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> In __ipv6_sock_mc_close(), per-socket mld data is protected by lock_sock(),
> and only __dev_get_by_index() and __in6_dev_get() require RTNL.
>
> Let's call __ipv6_sock_mc_drop() and drop RTNL in ipv6_sock_mc_close().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 09/15] ipv6: mcast: Don't hold RTNL for MCAST_ socket options.
  2025-06-24 20:24 ` [PATCH v2 net-next 09/15] ipv6: mcast: Don't hold RTNL for MCAST_ socket options Kuniyuki Iwashima
@ 2025-06-26 14:44   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:44 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> In ip6_mc_source() and ip6_mc_msfilter(), per-socket mld data is
> protected by lock_sock() and inet6_dev->mc_lock is also held for
> some per-interface functions.
>
> Only ip6_mc_find_dev_rtnl() depends on RTNL.  If we want to remove
> it, we need to check inet6_dev->dead under mc_lock to close the race
> with addrconf_ifdown(), as mentioned earlier.
>
> Let's do that and drop RTNL for the rest of MCAST_ socket options.
>
> Note that ip6_mc_msfilter() has unnecessary lock dances and they
> are integrated into one to avoid the last-minute error and simplify
> the error handling.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---

Same remark about the missing READ_ONCE(rt->dst.dev).

Other than that :

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 10/15] ipv6: mcast: Remove unnecessary ASSERT_RTNL and comment.
  2025-06-24 20:24 ` [PATCH v2 net-next 10/15] ipv6: mcast: Remove unnecessary ASSERT_RTNL and comment Kuniyuki Iwashima
@ 2025-06-26 14:44   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:44 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> Now, RTNL is not needed for mcast code, and what's commented in
> ip6_mc_msfget() is apparent from for_each_pmc_socklock(), which has
> lockdep annotation for lock_sock().
>
> Let's remove the comment and ASSERT_RTNL() in ipv6_mc_rejoin_groups().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 11/15] ipv6: anycast: Don't use rtnl_dereference().
  2025-06-24 20:24 ` [PATCH v2 net-next 11/15] ipv6: anycast: Don't use rtnl_dereference() Kuniyuki Iwashima
@ 2025-06-26 14:46   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:46 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> inet6_dev->ac_list is protected by inet6_dev->lock, so rtnl_dereference()
> is a rather imprecise annotation.
>
> As done in mcast.c, we can use ac_dereference() that checks if
> inet6_dev->lock is held.
>
> Let's replace rtnl_dereference() with a new helper ac_dereference().
>
> Note that now addrconf_join_solict() / addrconf_leave_solict() in
> __ipv6_dev_ac_inc() / __ipv6_dev_ac_dec() does not need RTNL, so we
> can remove ASSERT_RTNL() there.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 12/15] ipv6: anycast: Don't hold RTNL for IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM.
  2025-06-24 20:24 ` [PATCH v2 net-next 12/15] ipv6: anycast: Don't hold RTNL for IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM Kuniyuki Iwashima
@ 2025-06-26 14:48   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:48 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> inet6_sk(sk)->ipv6_ac_list is protected by lock_sock().
>
> In ipv6_sock_ac_drop() and ipv6_sock_ac_close(),
> only __dev_get_by_index() and __in6_dev_get() require RTNL.
>
> Let's replace them with dev_get_by_index() and in6_dev_get()
> and drop RTNL from IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 13/15] ipv6: anycast: Unify two error paths in ipv6_sock_ac_join().
  2025-06-24 20:24 ` [PATCH v2 net-next 13/15] ipv6: anycast: Unify two error paths in ipv6_sock_ac_join() Kuniyuki Iwashima
@ 2025-06-26 14:52   ` Eric Dumazet
  2025-06-27  1:02     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:52 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> The next patch will replace __dev_get_by_index() and __dev_get_by_flags()
> with their RCU + refcount versions.
>
> Then, we will need to call dev_put() in some error paths.
>
> Let's unify two error paths to make the next patch cleaner.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
>  net/ipv6/anycast.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
> index 8440e7b27f6d..e0a1f9d7622c 100644
> --- a/net/ipv6/anycast.c
> +++ b/net/ipv6/anycast.c
> @@ -67,12 +67,11 @@ static u32 inet6_acaddr_hash(const struct net *net,
>  int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
>  {
>         struct ipv6_pinfo *np = inet6_sk(sk);
> +       struct ipv6_ac_socklist *pac = NULL;
> +       struct net *net = sock_net(sk);
>         struct net_device *dev = NULL;
>         struct inet6_dev *idev;
> -       struct ipv6_ac_socklist *pac;
> -       struct net *net = sock_net(sk);
> -       int     ishost = !net->ipv6.devconf_all->forwarding;
> -       int     err = 0;
> +       int err = 0, ishost;
>
>         ASSERT_RTNL();
>
> @@ -84,15 +83,22 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
>         if (ifindex)
>                 dev = __dev_get_by_index(net, ifindex);
>
> -       if (ipv6_chk_addr_and_flags(net, addr, dev, true, 0, IFA_F_TENTATIVE))
> -               return -EINVAL;
> +       if (ipv6_chk_addr_and_flags(net, addr, dev, true, 0, IFA_F_TENTATIVE)) {
> +               err = -EINVAL;
> +               goto error;
> +       }
>
>         pac = sock_kmalloc(sk, sizeof(struct ipv6_ac_socklist), GFP_KERNEL);
> -       if (!pac)
> -               return -ENOMEM;
> +       if (!pac) {
> +               err = -ENOMEM;
> +               goto error;
> +       }
> +
>         pac->acl_next = NULL;
>         pac->acl_addr = *addr;
>
> +       ishost = !net->ipv6.devconf_all->forwarding;

RTNL will no longer protect this read, you should add a READ_ONCE()

Other than that :

Reviewed-by: Eric Dumazet <edumazet@google.com>
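
The reason for funnelling every failure through one label becomes clearer once references are involved. A minimal userspace sketch of the pattern follows, assuming a toy refcount. struct demo_ref, demo_hold(), demo_put() and demo_join() are hypothetical names and do not correspond to kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy reference-counted object standing in for a device or idev. */
struct demo_ref { int refcnt; };

static void demo_hold(struct demo_ref *r)
{
	r->refcnt++;
}

/* Tolerates NULL, so a single label can serve every exit path. */
static void demo_put(struct demo_ref *r)
{
	if (!r)
		return;
	assert(r->refcnt > 0);
	r->refcnt--;
}

/* Every exit goes through the labels, releasing in reverse order. */
int demo_join(struct demo_ref *dev_obj, struct demo_ref *idev_obj, int fail_late)
{
	struct demo_ref *dev = NULL, *idev;
	int err = 0;

	if (!dev_obj) {
		err = -ENODEV;
		goto error;
	}
	dev = dev_obj;
	demo_hold(dev);

	if (!idev_obj) {
		err = -EADDRNOTAVAIL;
		goto error;
	}
	idev = idev_obj;
	demo_hold(idev);

	if (fail_late)
		err = -EINVAL;	/* falls through to the common cleanup */

	demo_put(idev);
error:
	demo_put(dev);
	return err;
}
```

With early returns instead of the labels, each failure point would need its own copy of the release calls, which is what unifying the paths avoids.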

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 net-next 14/15] ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST.
  2025-06-24 20:24 ` [PATCH v2 net-next 14/15] ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST Kuniyuki Iwashima
@ 2025-06-26 14:54   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:54 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> inet6_sk(sk)->ipv6_ac_list is protected by lock_sock().
>
> In ipv6_sock_ac_join(), only __dev_get_by_index(), __dev_get_by_flags(),
> and __in6_dev_get() require RTNL.
>
> __dev_get_by_flags() is only used by ipv6_sock_ac_join() and can be
> converted to RCU version.
>
> Let's use the RCU version helpers and drop RTNL from IPV6_JOIN_ANYCAST.
>
> setsockopt_needs_rtnl() will be removed in the next patch.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
> v2: Hold rcu_read_lock() around rt6_lookup & dev_hold()
> ---
>  include/linux/netdevice.h |  4 ++--
>  net/core/dev.c            | 38 ++++++++++++++++++--------------------
>  net/ipv6/anycast.c        | 20 +++++++++++++-------
>  net/ipv6/ipv6_sockglue.c  |  4 ----
>  4 files changed, 33 insertions(+), 33 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 03c26bb0fbbe..68f874a58c92 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3339,8 +3339,8 @@ int dev_get_iflink(const struct net_device *dev);
>  int dev_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb);
>  int dev_fill_forward_path(const struct net_device *dev, const u8 *daddr,
>                           struct net_device_path_stack *stack);
> -struct net_device *__dev_get_by_flags(struct net *net, unsigned short flags,
> -                                     unsigned short mask);
> +struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags,
> +                                       unsigned short mask);
>  struct net_device *dev_get_by_name(struct net *net, const char *name);
>  struct net_device *dev_get_by_name_rcu(struct net *net, const char *name);
>  struct net_device *__dev_get_by_name(struct net *net, const char *name);
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 7ee808eb068e..553c654e6f77 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1267,33 +1267,31 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, unsigned short type)
>  EXPORT_SYMBOL(dev_getfirstbyhwtype);
>
>  /**
> - *     __dev_get_by_flags - find any device with given flags
> - *     @net: the applicable net namespace
> - *     @if_flags: IFF_* values
> - *     @mask: bitmask of bits in if_flags to check
> + * dev_get_by_flags_rcu - find any device with given flags
> + * @net: the applicable net namespace
> + * @if_flags: IFF_* values
> + * @mask: bitmask of bits in if_flags to check
>   *
> - *     Search for any interface with the given flags. Returns NULL if a device
> - *     is not found or a pointer to the device. Must be called inside
> - *     rtnl_lock(), and result refcount is unchanged.
> + * Search for any interface with the given flags.
> + *
> + * Context: rcu_read_lock() must be held.
> + * Returns: NULL if a device is not found or a pointer to the device.
>   */
> -
> -struct net_device *__dev_get_by_flags(struct net *net, unsigned short if_flags,
> -                                     unsigned short mask)
> +struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short if_flags,
> +                                       unsigned short mask)
>  {
> -       struct net_device *dev, *ret;
> -
> -       ASSERT_RTNL();
> +       struct net_device *dev;
>
> -       ret = NULL;
> -       for_each_netdev(net, dev) {
> -               if (((dev->flags ^ if_flags) & mask) == 0) {
> -                       ret = dev;
> -                       break;
> +       for_each_netdev_rcu(net, dev) {
> +               if (((READ_ONCE(dev->flags) ^ if_flags) & mask) == 0) {
> +                       dev_hold(dev);
> +                       return dev;
>                 }
>         }
> -       return ret;
> +
> +       return NULL;
>  }
> -EXPORT_SYMBOL(__dev_get_by_flags);
> +EXPORT_IPV6_MOD(dev_get_by_flags_rcu);
>
>  /**
>   *     dev_valid_name - check if name is okay for network device
> diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
> index e0a1f9d7622c..427fa95018b7 100644
> --- a/net/ipv6/anycast.c
> +++ b/net/ipv6/anycast.c
> @@ -73,15 +73,13 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
>         struct inet6_dev *idev;
>         int err = 0, ishost;
>
> -       ASSERT_RTNL();
> -
>         if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
>                 return -EPERM;
>         if (ipv6_addr_is_multicast(addr))
>                 return -EINVAL;
>
>         if (ifindex)
> -               dev = __dev_get_by_index(net, ifindex);
> +               dev = dev_get_by_index(net, ifindex);
>
>         if (ipv6_chk_addr_and_flags(net, addr, dev, true, 0, IFA_F_TENTATIVE)) {
>                 err = -EINVAL;
> @@ -102,18 +100,22 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
>         if (ifindex == 0) {
>                 struct rt6_info *rt;
>
> +               rcu_read_lock();
>                 rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
>                 if (rt) {
>                         dev = rt->dst.dev;

READ_ONCE(rt->dst.dev)

Reviewed-by: Eric Dumazet <edumazet@google.com>

> +                       dev_hold(dev);
>                         ip6_rt_put(rt);
>                 } else if (ishost) {
> +                       rcu_read_unlock();
>                         err = -EADDRNOTAVAIL;
>                         goto error;
>                 } else {
>                         /* router, no matching interface: just pick one */
> -                       dev = __dev_get_by_flags(net, IFF_UP,
> -                                                IFF_UP | IFF_LOOPBACK);
> +                       dev = dev_get_by_flags_rcu(net, IFF_UP,
> +                                                  IFF_UP | IFF_LOOPBACK);
>                 }
> +               rcu_read_unlock();
>         }
>
>         if (!dev) {
> @@ -121,7 +123,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
>                 goto error;
>         }
>
> -       idev = __in6_dev_get(dev);
> +       idev = in6_dev_get(dev);
>         if (!idev) {
>                 if (ifindex)
>                         err = -ENODEV;
> @@ -143,7 +145,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
>                 if (ishost)
>                         err = -EADDRNOTAVAIL;
>                 if (err)
> -                       goto error;
> +                       goto error_idev;
>         }
>
>         err = __ipv6_dev_ac_inc(idev, addr);
> @@ -153,7 +155,11 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
>                 pac = NULL;
>         }
>
> +error_idev:
> +       in6_dev_put(idev);
>  error:
> +       dev_put(dev);
> +
>         if (pac)
>                 sock_kfree_s(sk, pac, sizeof(*pac));
>         return err;
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index 3d891aa6e7f5..702dc33e50ad 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -119,10 +119,6 @@ struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
>
>  static bool setsockopt_needs_rtnl(int optname)
>  {
> -       switch (optname) {
> -       case IPV6_JOIN_ANYCAST:
> -               return true;
> -       }
>         return false;
>  }
>
> --
> 2.49.0
>
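
The flag test in the dev_get_by_flags_rcu() hunk quoted above can be checked in isolation. The following is only a userspace sketch: flags_match() and the DEMO_IFF_* constants are hypothetical names introduced for illustration, not kernel symbols. The expression matches a device when every bit selected by mask has the same value in the device flags and in if_flags.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for IFF_UP / IFF_LOOPBACK, for illustration only. */
#define DEMO_IFF_UP       0x1u
#define DEMO_IFF_LOOPBACK 0x8u

/*
 * Same expression as in the quoted hunk: XOR leaves a 1 wherever the two
 * flag words differ, and the mask keeps only the bits the caller cares about.
 */
static bool flags_match(unsigned int dev_flags, unsigned int if_flags,
			unsigned int mask)
{
	return ((dev_flags ^ if_flags) & mask) == 0;
}
```

With if_flags = DEMO_IFF_UP and mask = DEMO_IFF_UP | DEMO_IFF_LOOPBACK, as in the anycast caller, this selects a device that is up and is not a loopback device; bits outside the mask are ignored.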


* Re: [PATCH v2 net-next 15/15] ipv6: Remove setsockopt_needs_rtnl().
  2025-06-24 20:24 ` [PATCH v2 net-next 15/15] ipv6: Remove setsockopt_needs_rtnl() Kuniyuki Iwashima
@ 2025-06-26 14:54   ` Eric Dumazet
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Dumazet @ 2025-06-26 14:54 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
>
> From: Kuniyuki Iwashima <kuniyu@google.com>
>
> We no longer need to hold RTNL for IPv6 socket options.
>
> Let's remove setsockopt_needs_rtnl().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c
  2025-06-26 13:27 ` [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Paolo Abeni
@ 2025-06-27  0:49   ` Kuniyuki Iwashima
  2025-06-27  6:32     ` Paolo Abeni
  0 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-27  0:49 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Kuniyuki Iwashima, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Simon Horman, netdev

On Thu, Jun 26, 2025 at 6:27 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 6/24/25 10:24 PM, Kuniyuki Iwashima wrote:
> > From: Kuniyuki Iwashima <kuniyu@google.com>
> >
> > This is a prep series for RCU conversion of RTM_NEWNEIGH, which needs
> > RTNL during neigh_table.{pconstructor,pdestructor}() touching IPv6
> > multicast code.
> >
> > Currently, IPv6 multicast code is protected by lock_sock() and
> > inet6_dev->mc_lock, and RTNL is not actually needed.
> >
> > In addition, anycast code is also in the same situation and does not
> > need RTNL at all.
> >
> > This series removes RTNL from net/ipv6/{mcast.c,anycast.c} and finally
> > removes setsockopt_needs_rtnl() from do_ipv6_setsockopt().
>
> I went through the whole series and could not find any obvious bug.
>
> Still, this is not a trivial matter and I recently missed bugs in similar
> changes, so let me keep the series in PW for a little longer, just in
> case some other pair of eyes would go over it ;)

Thank you Paolo!

>
> BTW @Kuniyuki: do you have a somewhat public todo list that others could
> peek at to join this effort?

I don't have a public one now, but I can create a public repo on GitHub
and fill the Issues tab as the todo list.  Do you have any ideas?


>
> Thanks!
>
> Paolo
>


* Re: [PATCH v2 net-next 01/15] ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}().
  2025-06-26 14:26   ` Eric Dumazet
@ 2025-06-27  0:56     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-27  0:56 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev

On Thu, Jun 26, 2025 at 7:26 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
> >
> > From: Kuniyuki Iwashima <kuniyu@google.com>
> >
> > ipv6_dev_mc_{inc,dec}() has the same check.
> >
> > Let's remove __in6_dev_get() from pndisc_constructor() and
> > pndisc_destructor().
> >
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> > ---
> >  net/ipv6/ndisc.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
> > index ecb5c4b8518f..beb1814a1ac2 100644
> > --- a/net/ipv6/ndisc.c
> > +++ b/net/ipv6/ndisc.c
> > @@ -377,11 +377,12 @@ static int ndisc_constructor(struct neighbour *neigh)
> >  static int pndisc_constructor(struct pneigh_entry *n)
> >  {
> >         struct in6_addr *addr = (struct in6_addr *)&n->key;
> > -       struct in6_addr maddr;
> >         struct net_device *dev = n->dev;
> > +       struct in6_addr maddr;
> >
> > -       if (!dev || !__in6_dev_get(dev))
> > +       if (!dev)
> >                 return -EINVAL;
> > +
> >         addrconf_addr_solict_mult(addr, &maddr);
> >         ipv6_dev_mc_inc(dev, &maddr);
>
> return ipv6_dev_mc_inc(dev, &maddr); ?
>
> >         return 0;
> > @@ -390,11 +391,12 @@ static int pndisc_constructor(struct pneigh_entry *n)
> >  static void pndisc_destructor(struct pneigh_entry *n)
> >  {
> >         struct in6_addr *addr = (struct in6_addr *)&n->key;
> > -       struct in6_addr maddr;
> >         struct net_device *dev = n->dev;
> > +       struct in6_addr maddr;
> >
> > -       if (!dev || !__in6_dev_get(dev))
> > +       if (!dev)
> >                 return;
> > +
> >         addrconf_addr_solict_mult(addr, &maddr);
> >         ipv6_dev_mc_dec(dev, &maddr);
>
> return ipv6_dev_mc_dec(dev, &maddr);

Somehow I separated this change for the following neigh conversion
series, but I will squash it into this patch.

Thanks!

>
> >  }
> > --
>
> If not needed (because of a future patch ?), this should be mentioned
> in the changelog.
>
> > 2.49.0
> >


* Re: [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.
  2025-06-26 14:36   ` Eric Dumazet
@ 2025-06-27  1:01     ` Kuniyuki Iwashima
  2025-06-27  7:21       ` Eric Dumazet
  0 siblings, 1 reply; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-27  1:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev

On Thu, Jun 26, 2025 at 7:37 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
> >
> > From: Kuniyuki Iwashima <kuniyu@google.com>
> >
> > In __ipv6_sock_mc_join(), per-socket mld data is protected by lock_sock(),
> > and only __dev_get_by_index() requires RTNL.
> >
> > Let's use dev_get_by_index() and drop RTNL for IPV6_ADD_MEMBERSHIP and
> > MCAST_JOIN_GROUP.
> >
> > Note that we must call rt6_lookup() and dev_hold() under RCU.
> >
> > If rt6_lookup() returns an entry from the exception table, dst_dev_put()
> > > could change rt->dst.dev to loopback concurrently, and the original device
> > could lose the refcount before dev_hold() and unblock device registration.
> >
> > dst_dev_put() is called from NETDEV_UNREGISTER and synchronize_net() follows
> > it, so as long as rt6_lookup() and dev_hold() are called within the same
> > RCU critical section, the dev is alive.
> >
> > Even if the race happens, they are synchronised by idev->dead and mcast
> > addresses are cleaned up.
> >
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> > ---
> > v2: Hold rcu_read_lock() around rt6_lookup & dev_hold()
> > ---
> >  net/ipv6/ipv6_sockglue.c |  2 --
> >  net/ipv6/mcast.c         | 22 ++++++++++++----------
> >  2 files changed, 12 insertions(+), 12 deletions(-)
> >
> > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > index 1e225e6489ea..cb0dc885cbe4 100644
> > --- a/net/ipv6/ipv6_sockglue.c
> > +++ b/net/ipv6/ipv6_sockglue.c
> > @@ -121,11 +121,9 @@ static bool setsockopt_needs_rtnl(int optname)
> >  {
> >         switch (optname) {
> >         case IPV6_ADDRFORM:
> > -       case IPV6_ADD_MEMBERSHIP:
> >         case IPV6_DROP_MEMBERSHIP:
> >         case IPV6_JOIN_ANYCAST:
> >         case IPV6_LEAVE_ANYCAST:
> > -       case MCAST_JOIN_GROUP:
> >         case MCAST_LEAVE_GROUP:
> >         case MCAST_JOIN_SOURCE_GROUP:
> >         case MCAST_LEAVE_SOURCE_GROUP:
> > diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> > index b3f063b5ffd7..9fc7672926bf 100644
> > --- a/net/ipv6/mcast.c
> > +++ b/net/ipv6/mcast.c
> > @@ -175,14 +175,12 @@ static int unsolicited_report_interval(struct inet6_dev *idev)
> >  static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
> >                                const struct in6_addr *addr, unsigned int mode)
> >  {
> > -       struct net_device *dev = NULL;
> > -       struct ipv6_mc_socklist *mc_lst;
> >         struct ipv6_pinfo *np = inet6_sk(sk);
> > +       struct ipv6_mc_socklist *mc_lst;
> >         struct net *net = sock_net(sk);
> > +       struct net_device *dev = NULL;
> >         int err;
> >
> > -       ASSERT_RTNL();
> > -
> >         if (!ipv6_addr_is_multicast(addr))
> >                 return -EINVAL;
> >
> > @@ -202,13 +200,18 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
> >
> >         if (ifindex == 0) {
> >                 struct rt6_info *rt;
> > +
> > +               rcu_read_lock();
> >                 rt = rt6_lookup(net, addr, NULL, 0, NULL, 0);
> >                 if (rt) {
> >                         dev = rt->dst.dev;
>
> We probably need safety here, READ_ONCE() at minimum.

Will add it and the corresponding WRITE_ONCE() in dst_dev_put().

>
> This can probably be done in a separate series.

I'll post a follow-up for other rt6_lookup() users and dst.dev
users under RCU.

Thanks!
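
The READ_ONCE()/WRITE_ONCE() pairing discussed above can be sketched outside the kernel. This is a minimal single-threaded illustration under stated assumptions: the DEMO_ macros, struct demo_dev, struct demo_dst and both helpers are hypothetical stand-ins, the reference count is a plain int rather than the kernel's per-device refcount, and the rcu_read_lock() section that the real code needs around the lookup and dev_hold() is left out because it has no plain userspace equivalent.

```c
#include <stddef.h>

/*
 * Hypothetical userspace stand-ins for READ_ONCE()/WRITE_ONCE().
 * They rely on the GCC/Clang __typeof__ extension.
 */
#define DEMO_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

struct demo_dev { int refcnt; };
struct demo_dst { struct demo_dev *dev; };

/*
 * Reader side: load the pointer exactly once into a local, then take the
 * reference on that same pointer. Re-reading dst->dev could observe a
 * different device after a concurrent update.
 */
static struct demo_dev *demo_dst_dev_hold(struct demo_dst *dst)
{
	struct demo_dev *dev = DEMO_READ_ONCE(dst->dev);

	if (dev)
		dev->refcnt++;
	return dev;
}

/* Writer side: swap in a fallback device with a single store. */
static void demo_dst_dev_put(struct demo_dst *dst, struct demo_dev *fallback)
{
	DEMO_WRITE_ONCE(dst->dev, fallback);
}
```

The sketch only shows the single-load discipline; keeping the device alive between the load and the hold is what the RCU read-side section provides in the patch.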


* Re: [PATCH v2 net-next 13/15] ipv6: anycast: Unify two error paths in ipv6_sock_ac_join().
  2025-06-26 14:52   ` Eric Dumazet
@ 2025-06-27  1:02     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-27  1:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev

On Thu, Jun 26, 2025 at 7:52 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Jun 24, 2025 at 1:26 PM Kuniyuki Iwashima <kuni1840@gmail.com> wrote:
> >
> > From: Kuniyuki Iwashima <kuniyu@google.com>
> >
> > The next patch will replace __dev_get_by_index() and __dev_get_by_flags()
> > with their RCU + refcount versions.
> >
> > Then, we will need to call dev_put() in some error paths.
> >
> > Let's unify two error paths to make the next patch cleaner.
> >
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> > ---
> >  net/ipv6/anycast.c | 22 ++++++++++++++--------
> >  1 file changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
> > index 8440e7b27f6d..e0a1f9d7622c 100644
> > --- a/net/ipv6/anycast.c
> > +++ b/net/ipv6/anycast.c
> > @@ -67,12 +67,11 @@ static u32 inet6_acaddr_hash(const struct net *net,
> >  int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
> >  {
> >         struct ipv6_pinfo *np = inet6_sk(sk);
> > +       struct ipv6_ac_socklist *pac = NULL;
> > +       struct net *net = sock_net(sk);
> >         struct net_device *dev = NULL;
> >         struct inet6_dev *idev;
> > -       struct ipv6_ac_socklist *pac;
> > -       struct net *net = sock_net(sk);
> > -       int     ishost = !net->ipv6.devconf_all->forwarding;
> > -       int     err = 0;
> > +       int err = 0, ishost;
> >
> >         ASSERT_RTNL();
> >
> > @@ -84,15 +83,22 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr)
> >         if (ifindex)
> >                 dev = __dev_get_by_index(net, ifindex);
> >
> > -       if (ipv6_chk_addr_and_flags(net, addr, dev, true, 0, IFA_F_TENTATIVE))
> > -               return -EINVAL;
> > +       if (ipv6_chk_addr_and_flags(net, addr, dev, true, 0, IFA_F_TENTATIVE)) {
> > +               err = -EINVAL;
> > +               goto error;
> > +       }
> >
> >         pac = sock_kmalloc(sk, sizeof(struct ipv6_ac_socklist), GFP_KERNEL);
> > -       if (!pac)
> > -               return -ENOMEM;
> > +       if (!pac) {
> > +               err = -ENOMEM;
> > +               goto error;
> > +       }
> > +
> >         pac->acl_next = NULL;
> >         pac->acl_addr = *addr;
> >
> > +       ishost = !net->ipv6.devconf_all->forwarding;
>
> RTNL will no longer protect this read, you should add a READ_ONCE()

Ah exactly.  Will use it.

Thank you!


>
> Other than that :
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
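
The unified error path that this patch prepares, and that patch 14 extends with an error_idev label, follows the usual goto-unwind shape. The following is a rough userspace sketch only: struct demo_ref, demo_get(), demo_put() and demo_join() are hypothetical names, and the error codes are chosen for illustration rather than copied from ipv6_sock_ac_join().

```c
#include <errno.h>
#include <stddef.h>

struct demo_ref { int count; };

/* Take a reference; NULL stays NULL, like a failed lookup. */
static struct demo_ref *demo_get(struct demo_ref *r)
{
	if (r)
		r->count++;
	return r;
}

/* Drop a reference; accepts NULL so early exits can share one label. */
static void demo_put(struct demo_ref *r)
{
	if (r)
		r->count--;
}

static int demo_join(struct demo_ref *dev_src, struct demo_ref *idev_src,
		     int fail_late)
{
	struct demo_ref *dev, *idev;
	int err = 0;

	dev = demo_get(dev_src);
	if (!dev) {
		err = -ENODEV;
		goto error;		/* nothing held yet; demo_put(NULL) is a no-op */
	}

	idev = demo_get(idev_src);
	if (!idev) {
		err = -EADDRNOTAVAIL;
		goto error;		/* only dev is held */
	}

	if (fail_late)
		err = -EINVAL;		/* both held; fall through to release both */

error_idev:
	demo_put(idev);
error:
	demo_put(dev);
	return err;
}
```

Success and failure share the same exit, so every reference taken on the way in is dropped exactly once on the way out.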


* Re: [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c
  2025-06-27  0:49   ` Kuniyuki Iwashima
@ 2025-06-27  6:32     ` Paolo Abeni
  2025-06-27 16:36       ` Kuniyuki Iwashima
  0 siblings, 1 reply; 40+ messages in thread
From: Paolo Abeni @ 2025-06-27  6:32 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: Kuniyuki Iwashima, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Simon Horman, netdev

On 6/27/25 2:49 AM, Kuniyuki Iwashima wrote:
> On Thu, Jun 26, 2025 at 6:27 AM Paolo Abeni <pabeni@redhat.com> wrote:
>> On 6/24/25 10:24 PM, Kuniyuki Iwashima wrote:
>>> From: Kuniyuki Iwashima <kuniyu@google.com>
>>>
>>> This is a prep series for RCU conversion of RTM_NEWNEIGH, which needs
>>> RTNL during neigh_table.{pconstructor,pdestructor}() touching IPv6
>>> multicast code.
>>>
>>> Currently, IPv6 multicast code is protected by lock_sock() and
>>> inet6_dev->mc_lock, and RTNL is not actually needed.
>>>
>>> In addition, anycast code is also in the same situation and does not
>>> need RTNL at all.
>>>
>>> This series removes RTNL from net/ipv6/{mcast.c,anycast.c} and finally
>>> removes setsockopt_needs_rtnl() from do_ipv6_setsockopt().
>>
>> I went through the whole series and could not find any obvious bug.
>>
>> Still, this is not a trivial matter and I recently missed bugs in similar
>> changes, so let me keep the series in PW for a little longer, just in
>> case some other pair of eyes would go over it ;)
> 
> Thank you Paolo!
> 
>>
>> BTW @Kuniyuki: do you have a somewhat public todo list that others could
>> peek at to join this effort?
> 
> I don't have a public one now, but I can create a public repo on GitHub
> and fill the Issues tab as the todo list.  Do you have any ideas?

Not really, that is why I asked ;) Hopefully someone ~here could help.

Quickly skimming over the codebase I suspect/hope mroute{4,6} should be
doable (to be converted to own lock instead of rtnl).

/P



* Re: [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.
  2025-06-27  1:01     ` Kuniyuki Iwashima
@ 2025-06-27  7:21       ` Eric Dumazet
  2025-06-27 16:38         ` Kuniyuki Iwashima
  0 siblings, 1 reply; 40+ messages in thread
From: Eric Dumazet @ 2025-06-27  7:21 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev

On Thu, Jun 26, 2025 at 6:01 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>

> I'll post a follow-up for other rt6_lookup() users and dst.dev
> users under RCU.

I have prepared a series adding annotations and helpers.


* Re: [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c
  2025-06-27  6:32     ` Paolo Abeni
@ 2025-06-27 16:36       ` Kuniyuki Iwashima
  0 siblings, 0 replies; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-27 16:36 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Kuniyuki Iwashima, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Simon Horman, netdev

On Thu, Jun 26, 2025 at 11:32 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 6/27/25 2:49 AM, Kuniyuki Iwashima wrote:
> > On Thu, Jun 26, 2025 at 6:27 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >> On 6/24/25 10:24 PM, Kuniyuki Iwashima wrote:
> >>> From: Kuniyuki Iwashima <kuniyu@google.com>
> >>>
> >>> This is a prep series for RCU conversion of RTM_NEWNEIGH, which needs
> >>> RTNL during neigh_table.{pconstructor,pdestructor}() touching IPv6
> >>> multicast code.
> >>>
> >>> Currently, IPv6 multicast code is protected by lock_sock() and
> >>> inet6_dev->mc_lock, and RTNL is not actually needed.
> >>>
> >>> In addition, anycast code is also in the same situation and does not
> >>> need RTNL at all.
> >>>
> >>> This series removes RTNL from net/ipv6/{mcast.c,anycast.c} and finally
> >>> removes setsockopt_needs_rtnl() from do_ipv6_setsockopt().
> >>
> >> I went through the whole series and could not find any obvious bug.
> >>
> >> Still, this is not a trivial matter and I recently missed bugs in similar
> >> changes, so let me keep the series in PW for a little longer, just in
> >> case some other pair of eyes would go over it ;)
> >
> > Thank you Paolo!
> >
> >>
> >> BTW @Kuniyuki: do you have a somewhat public todo list that others could
> >> peek at to join this effort?
> >
> > I don't have a public one now, but I can create a public repo on GitHub
> > and fill the Issues tab as the todo list.  Do you have any ideas?
>
> Not really, that is why I asked ;) Hopefully someone ~here could help.

I'll create Issues as todo in this repo (now importing net-next)
https://github.com/q2ven/small_rtnl/


>
> Quickly skimming over the codebase I suspect/hope mroute{4,6} should be
> doable (to be converted to own lock instead of rtnl).

Looks doable to me too :)


>
> /P
>


* Re: [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.
  2025-06-27  7:21       ` Eric Dumazet
@ 2025-06-27 16:38         ` Kuniyuki Iwashima
  0 siblings, 0 replies; 40+ messages in thread
From: Kuniyuki Iwashima @ 2025-06-27 16:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev

On Fri, Jun 27, 2025 at 12:21 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Jun 26, 2025 at 6:01 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
> >
>
> > I'll post a follow-up for other rt6_lookup() users and dst.dev
> > users under RCU.
>
> I have prepared a series adding annotations and helpers.

Thank you!!  Will use the helper in v3.


end of thread, other threads:[~2025-06-27 16:38 UTC | newest]

Thread overview: 40+ messages
2025-06-24 20:24 [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Kuniyuki Iwashima
2025-06-24 20:24 ` [PATCH v2 net-next 01/15] ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}() Kuniyuki Iwashima
2025-06-26 14:26   ` Eric Dumazet
2025-06-27  0:56     ` Kuniyuki Iwashima
2025-06-24 20:24 ` [PATCH v2 net-next 02/15] ipv6: mcast: Replace locking comments with lockdep annotations Kuniyuki Iwashima
2025-06-26 14:28   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 03/15] ipv6: mcast: Check inet6_dev->dead under idev->mc_lock in __ipv6_dev_mc_inc() Kuniyuki Iwashima
2025-06-26 14:30   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 04/15] ipv6: mcast: Remove mca_get() Kuniyuki Iwashima
2025-06-26 14:31   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 05/15] ipv6: mcast: Use in6_dev_get() in ipv6_dev_mc_dec() Kuniyuki Iwashima
2025-06-26 14:33   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 06/15] ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP Kuniyuki Iwashima
2025-06-26 14:36   ` Eric Dumazet
2025-06-27  1:01     ` Kuniyuki Iwashima
2025-06-27  7:21       ` Eric Dumazet
2025-06-27 16:38         ` Kuniyuki Iwashima
2025-06-24 20:24 ` [PATCH v2 net-next 07/15] ipv6: mcast: Don't hold RTNL for IPV6_DROP_MEMBERSHIP and MCAST_LEAVE_GROUP Kuniyuki Iwashima
2025-06-26 14:38   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 08/15] ipv6: mcast: Don't hold RTNL in ipv6_sock_mc_close() Kuniyuki Iwashima
2025-06-26 14:39   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 09/15] ipv6: mcast: Don't hold RTNL for MCAST_ socket options Kuniyuki Iwashima
2025-06-26 14:44   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 10/15] ipv6: mcast: Remove unnecessary ASSERT_RTNL and comment Kuniyuki Iwashima
2025-06-26 14:44   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 11/15] ipv6: anycast: Don't use rtnl_dereference() Kuniyuki Iwashima
2025-06-26 14:46   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 12/15] ipv6: anycast: Don't hold RTNL for IPV6_LEAVE_ANYCAST and IPV6_ADDRFORM Kuniyuki Iwashima
2025-06-26 14:48   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 13/15] ipv6: anycast: Unify two error paths in ipv6_sock_ac_join() Kuniyuki Iwashima
2025-06-26 14:52   ` Eric Dumazet
2025-06-27  1:02     ` Kuniyuki Iwashima
2025-06-24 20:24 ` [PATCH v2 net-next 14/15] ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST Kuniyuki Iwashima
2025-06-26 14:54   ` Eric Dumazet
2025-06-24 20:24 ` [PATCH v2 net-next 15/15] ipv6: Remove setsockopt_needs_rtnl() Kuniyuki Iwashima
2025-06-26 14:54   ` Eric Dumazet
2025-06-26 13:27 ` [PATCH v2 net-next 00/15] ipv6: Drop RTNL from mcast.c and anycast.c Paolo Abeni
2025-06-27  0:49   ` Kuniyuki Iwashima
2025-06-27  6:32     ` Paolo Abeni
2025-06-27 16:36       ` Kuniyuki Iwashima
