netdev.vger.kernel.org archive mirror
* [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series.
@ 2025-05-14 20:18 Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 1/7] ipv6: Remove rcu_read_lock() in fib6_get_table() Kuniyuki Iwashima
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-14 20:18 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

Patch 1 removes rcu_read_lock() in fib6_get_table().
Patch 2 removes the rtnl_is_held arg of lwtunnel_valid_encap_type(), which
 was a short-term fix and is no longer used.
Patch 3 fixes the RCU vs GFP_KERNEL issue reported by syzkaller.
Patches 4-7 revert GFP_ATOMIC uses to GFP_KERNEL.


Kuniyuki Iwashima (7):
  ipv6: Remove rcu_read_lock() in fib6_get_table().
  inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?().
  ipv6: Narrow down RCU critical section in inet6_rtm_newroute().
  Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during
    seg6local LWT setup"
  Revert "ipv6: Factorise ip6_route_multipath_add()."
  ipv6: Pass gfp_flags down to ip6_route_info_create_nh().
  ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE.

 include/net/lwtunnel.h   |  13 +-
 net/core/lwtunnel.c      |  15 +--
 net/ipv4/fib_frontend.c  |   4 +-
 net/ipv4/fib_semantics.c |  10 +-
 net/ipv4/nexthop.c       |   3 +-
 net/ipv6/ip6_fib.c       |  27 ++--
 net/ipv6/route.c         | 269 ++++++++++++++-------------------------
 net/ipv6/seg6_local.c    |   6 +-
 8 files changed, 127 insertions(+), 220 deletions(-)

-- 
2.49.0



* [PATCH v1 net-next 1/7] ipv6: Remove rcu_read_lock() in fib6_get_table().
  2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
@ 2025-05-14 20:18 ` Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 2/7] inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?() Kuniyuki Iwashima
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-14 20:18 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

Once allocated, the IPv6 routing table is not freed until
netns is dismantled.

fib6_get_table() uses rcu_read_lock() while iterating
net->ipv6.fib_table_hash[], but it's not needed and is
rather confusing, because some callers have this pattern:

  table = fib6_get_table();

  rcu_read_lock();
  /* ... use table here ... */
  rcu_read_unlock();

  [ See: addrconf_get_prefix_route(), ip6_route_del(),
         rt6_get_route_info(), rt6_get_dflt_router() ]

This looks illegal but is actually safe.

Let's remove rcu_read_lock() in fib6_get_table() and pass true
to the last argument of hlist_for_each_entry_rcu() to bypass
the RCU check.

Note that RCU protection is not needed, but the RCU helper is
used to avoid a data race.
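
For readers unfamiliar with the lookup, here is a minimal userspace
sketch of the bucket selection and walk that fib6_get_table() performs
after this patch.  It is a model, not kernel code: the model_ names, the
singly linked bucket, and the constant values (256 buckets, main table
ID 254) are assumptions made for illustration, and the RCU list helper
is replaced by plain pointer reads.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the kernel constants depend on configuration. */
#define MODEL_TABLE_HASHSZ 256u   /* must be a power of two */
#define MODEL_TABLE_MAIN   254u

struct model_table {
	uint32_t id;
	struct model_table *next;   /* stands in for tb6_hlist */
};

struct model_net {
	struct model_table *hash[MODEL_TABLE_HASHSZ];
};

/* Link a table at the head of its bucket.  Tables are never unlinked in
 * this model, mirroring "not freed until netns is dismantled".
 */
static void model_link_table(struct model_net *net, struct model_table *tb)
{
	uint32_t h = tb->id & (MODEL_TABLE_HASHSZ - 1);

	tb->next = net->hash[h];
	net->hash[h] = tb;
}

/* Lookup without any locking; id 0 is treated as the main table. */
static struct model_table *model_get_table(struct model_net *net, uint32_t id)
{
	struct model_table *tb;

	if (!id)
		id = MODEL_TABLE_MAIN;

	for (tb = net->hash[id & (MODEL_TABLE_HASHSZ - 1)]; tb; tb = tb->next)
		if (tb->id == id)
			return tb;

	return NULL;
}
```

Because a table is only ever added to a bucket and never removed while
the namespace lives, a pointer returned by the lookup stays valid after
the walk ends, which is the property the commit message relies on.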

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/ip6_fib.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 1f860340690c..88770ecd2da1 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -281,22 +281,20 @@ EXPORT_SYMBOL_GPL(fib6_new_table);
 
 struct fib6_table *fib6_get_table(struct net *net, u32 id)
 {
-	struct fib6_table *tb;
 	struct hlist_head *head;
-	unsigned int h;
+	struct fib6_table *tb;
 
-	if (id == 0)
+	if (!id)
 		id = RT6_TABLE_MAIN;
-	h = id & (FIB6_TABLE_HASHSZ - 1);
-	rcu_read_lock();
-	head = &net->ipv6.fib_table_hash[h];
-	hlist_for_each_entry_rcu(tb, head, tb6_hlist) {
-		if (tb->tb6_id == id) {
-			rcu_read_unlock();
+
+	head = &net->ipv6.fib_table_hash[id & (FIB6_TABLE_HASHSZ - 1)];
+
+	/* See comment in fib6_link_table().  RCU is not required,
+	 * but rcu_dereference_raw() is used to avoid data-race.
+	 */
+	hlist_for_each_entry_rcu(tb, head, tb6_hlist, true)
+		if (tb->tb6_id == id)
 			return tb;
-		}
-	}
-	rcu_read_unlock();
 
 	return NULL;
 }
-- 
2.49.0



* [PATCH v1 net-next 2/7] inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?().
  2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 1/7] ipv6: Remove rcu_read_lock() in fib6_get_table() Kuniyuki Iwashima
@ 2025-05-14 20:18 ` Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 3/7] ipv6: Narrow down RCU critical section in inet6_rtm_newroute() Kuniyuki Iwashima
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-14 20:18 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

Commit f130a0cc1b4f ("inet: fix lwtunnel_valid_encap_type() lock
imbalance") added the rtnl_is_held argument as a temporary fix while
I'm converting nexthop and IPv6 routing table to per-netns RTNL or RCU.

Now no caller of lwtunnel_valid_encap_type() holds RTNL.

Let's remove the argument.
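
The simplified control flow can be pictured with a small userspace
model.  Everything here is hypothetical scaffolding (model_ops,
model_request_module(), the fixed-size table); it only shows that, once
no caller holds RTNL, the lookup can call the module loader directly and
re-read the ops pointer, with no unlock/relock pair around it.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_ENCAP_MAX 4

struct model_ops {
	const char *name;
};

/* Registered encap handlers, indexed by type; NULL means "not loaded". */
static const struct model_ops *model_encaps[MODEL_ENCAP_MAX];

static const struct model_ops model_demo_ops = { "demo" };

/* Stand-in for request_module(): pretend type 1 lives in a loadable module. */
static void model_request_module(unsigned int type)
{
	if (type == 1)
		model_encaps[type] = &model_demo_ops;
}

/* Validate an encap type, loading its handler on demand. */
static int model_valid_encap_type(unsigned int type)
{
	if (type == 0 || type >= MODEL_ENCAP_MAX)
		return -EINVAL;

	if (!model_encaps[type]) {
		/* May sleep; fine because the caller holds no lock. */
		model_request_module(type);
	}

	return model_encaps[type] ? 0 : -EOPNOTSUPP;
}
```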

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/lwtunnel.h  | 13 +++++--------
 net/core/lwtunnel.c     | 15 +++------------
 net/ipv4/fib_frontend.c |  4 ++--
 net/ipv4/nexthop.c      |  3 +--
 net/ipv6/route.c        |  6 ++----
 5 files changed, 13 insertions(+), 28 deletions(-)

diff --git a/include/net/lwtunnel.h b/include/net/lwtunnel.h
index 39cd50300a18..c306ebe379a0 100644
--- a/include/net/lwtunnel.h
+++ b/include/net/lwtunnel.h
@@ -116,11 +116,9 @@ int lwtunnel_encap_add_ops(const struct lwtunnel_encap_ops *op,
 int lwtunnel_encap_del_ops(const struct lwtunnel_encap_ops *op,
 			   unsigned int num);
 int lwtunnel_valid_encap_type(u16 encap_type,
-			      struct netlink_ext_ack *extack,
-			      bool rtnl_is_held);
+			      struct netlink_ext_ack *extack);
 int lwtunnel_valid_encap_type_attr(struct nlattr *attr, int len,
-				   struct netlink_ext_ack *extack,
-				   bool rtnl_is_held);
+				   struct netlink_ext_ack *extack);
 int lwtunnel_build_state(struct net *net, u16 encap_type,
 			 struct nlattr *encap,
 			 unsigned int family, const void *cfg,
@@ -203,15 +201,14 @@ static inline int lwtunnel_encap_del_ops(const struct lwtunnel_encap_ops *op,
 }
 
 static inline int lwtunnel_valid_encap_type(u16 encap_type,
-					    struct netlink_ext_ack *extack,
-					    bool rtnl_is_held)
+					    struct netlink_ext_ack *extack)
 {
 	NL_SET_ERR_MSG(extack, "CONFIG_LWTUNNEL is not enabled in this kernel");
 	return -EOPNOTSUPP;
 }
+
 static inline int lwtunnel_valid_encap_type_attr(struct nlattr *attr, int len,
-						 struct netlink_ext_ack *extack,
-						 bool rtnl_is_held)
+						 struct netlink_ext_ack *extack)
 {
 	/* return 0 since we are not walking attr looking for
 	 * RTA_ENCAP_TYPE attribute on nexthops.
diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
index 60f27cb4e54f..f9d76d85d04f 100644
--- a/net/core/lwtunnel.c
+++ b/net/core/lwtunnel.c
@@ -149,8 +149,7 @@ int lwtunnel_build_state(struct net *net, u16 encap_type,
 }
 EXPORT_SYMBOL_GPL(lwtunnel_build_state);
 
-int lwtunnel_valid_encap_type(u16 encap_type, struct netlink_ext_ack *extack,
-			      bool rtnl_is_held)
+int lwtunnel_valid_encap_type(u16 encap_type, struct netlink_ext_ack *extack)
 {
 	const struct lwtunnel_encap_ops *ops;
 	int ret = -EINVAL;
@@ -167,12 +166,7 @@ int lwtunnel_valid_encap_type(u16 encap_type, struct netlink_ext_ack *extack,
 		const char *encap_type_str = lwtunnel_encap_str(encap_type);
 
 		if (encap_type_str) {
-			if (rtnl_is_held)
-				__rtnl_unlock();
 			request_module("rtnl-lwt-%s", encap_type_str);
-			if (rtnl_is_held)
-				rtnl_lock();
-
 			ops = rcu_access_pointer(lwtun_encaps[encap_type]);
 		}
 	}
@@ -186,8 +180,7 @@ int lwtunnel_valid_encap_type(u16 encap_type, struct netlink_ext_ack *extack,
 EXPORT_SYMBOL_GPL(lwtunnel_valid_encap_type);
 
 int lwtunnel_valid_encap_type_attr(struct nlattr *attr, int remaining,
-				   struct netlink_ext_ack *extack,
-				   bool rtnl_is_held)
+				   struct netlink_ext_ack *extack)
 {
 	struct rtnexthop *rtnh = (struct rtnexthop *)attr;
 	struct nlattr *nla_entype;
@@ -208,9 +201,7 @@ int lwtunnel_valid_encap_type_attr(struct nlattr *attr, int remaining,
 				}
 				encap_type = nla_get_u16(nla_entype);
 
-				if (lwtunnel_valid_encap_type(encap_type,
-							      extack,
-							      rtnl_is_held) != 0)
+				if (lwtunnel_valid_encap_type(encap_type, extack))
 					return -EOPNOTSUPP;
 			}
 		}
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 57f088e5540e..fd1e1507a224 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -807,7 +807,7 @@ static int rtm_to_fib_config(struct net *net, struct sk_buff *skb,
 		case RTA_MULTIPATH:
 			err = lwtunnel_valid_encap_type_attr(nla_data(attr),
 							     nla_len(attr),
-							     extack, false);
+							     extack);
 			if (err < 0)
 				goto errout;
 			cfg->fc_mp = nla_data(attr);
@@ -825,7 +825,7 @@ static int rtm_to_fib_config(struct net *net, struct sk_buff *skb,
 		case RTA_ENCAP_TYPE:
 			cfg->fc_encap_type = nla_get_u16(attr);
 			err = lwtunnel_valid_encap_type(cfg->fc_encap_type,
-							extack, false);
+							extack);
 			if (err < 0)
 				goto errout;
 			break;
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 823e4a783d2b..4397e89d3123 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -3180,8 +3180,7 @@ static int rtm_to_nh_config(struct net *net, struct sk_buff *skb,
 		}
 
 		cfg->nh_encap_type = nla_get_u16(tb[NHA_ENCAP_TYPE]);
-		err = lwtunnel_valid_encap_type(cfg->nh_encap_type,
-						extack, false);
+		err = lwtunnel_valid_encap_type(cfg->nh_encap_type, extack);
 		if (err < 0)
 			goto out;
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 44300962230b..6baf177c529b 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -5172,8 +5172,7 @@ static int rtm_to_fib6_multipath_config(struct fib6_config *cfg,
 		rtnh = rtnh_next(rtnh, &remaining);
 	} while (rtnh_ok(rtnh, remaining));
 
-	return lwtunnel_valid_encap_type_attr(cfg->fc_mp, cfg->fc_mp_len,
-					      extack, false);
+	return lwtunnel_valid_encap_type_attr(cfg->fc_mp, cfg->fc_mp_len, extack);
 }
 
 static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -5310,8 +5309,7 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (tb[RTA_ENCAP_TYPE]) {
 		cfg->fc_encap_type = nla_get_u16(tb[RTA_ENCAP_TYPE]);
 
-		err = lwtunnel_valid_encap_type(cfg->fc_encap_type,
-						extack, false);
+		err = lwtunnel_valid_encap_type(cfg->fc_encap_type, extack);
 		if (err < 0)
 			goto errout;
 	}
-- 
2.49.0



* [PATCH v1 net-next 3/7] ipv6: Narrow down RCU critical section in inet6_rtm_newroute().
  2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 1/7] ipv6: Remove rcu_read_lock() in fib6_get_table() Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 2/7] inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?() Kuniyuki Iwashima
@ 2025-05-14 20:18 ` Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 4/7] Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup" Kuniyuki Iwashima
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-14 20:18 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev,
	syzbot+bcc12d6799364500fbec

Commit 169fd62799e8 ("ipv6: Get rid of RTNL for SIOCADDRT and
RTM_NEWROUTE.") added rcu_read_lock() covering
ip6_route_info_create_nh() and __ip6_ins_rt() to guarantee that
nexthop and netdev will not go away.

However, as reported by syzkaller [0], ip_tun_build_state() calls
dst_cache_init() with GFP_KERNEL during the RCU critical section.

ip6_route_info_create_nh() fetches nexthop or netdev depending on
whether RTA_NH_ID is set, and struct fib6_info holds a refcount
of either of them by nexthop_get() or netdev_get_by_index().

netdev_get_by_index() looks up a dev and calls dev_hold() under RCU.

So, we need RCU only around nexthop_find_by_id() and nexthop_get()
(and a little more nexthop code).

Let's add rcu_read_lock() there and remove rcu_read_lock() in
ip6_route_add() and ip6_route_multipath_add().

Now these functions called from fib6_add() need RCU:

  - inet6_rt_notify()
  - fib6_drop_pcpu_from() (via fib6_purge_rt())
  - rt6_flush_exceptions() (via fib6_purge_rt())

All callers of inet6_rt_notify() need RCU, so rcu_read_lock() is
added there.
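
A toy userspace model of why the critical section has to shrink,
assuming a counter can stand in for the RCU read-side nesting depth and
that a "may sleep" allocation is simply rejected while that depth is
non-zero.  None of the model_ names exist in the kernel; the sketch only
contrasts the old wide section with the narrowed one.

```c
#include <errno.h>

static int model_rcu_depth;

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	model_rcu_depth--;
}

/* Stand-in for a GFP_KERNEL allocation: invalid inside a read-side section. */
static int model_alloc_may_sleep(void)
{
	return model_rcu_depth ? -EDEADLK : 0;
}

/* Stand-in for the nexthop lookup, which does need the read-side section. */
static int model_nexthop_lookup(void)
{
	return model_rcu_depth ? 0 : -EINVAL;
}

/* Old shape: one wide section around everything. */
static int model_route_add_wide(int use_nh_id)
{
	int err;

	model_rcu_read_lock();
	err = use_nh_id ? model_nexthop_lookup() : model_alloc_may_sleep();
	model_rcu_read_unlock();

	return err;
}

/* New shape: the section covers only the nexthop lookup. */
static int model_route_add_narrow(int use_nh_id)
{
	int err;

	if (use_nh_id) {
		model_rcu_read_lock();
		err = model_nexthop_lookup();
		model_rcu_read_unlock();
	} else {
		err = model_alloc_may_sleep();
	}

	return err;
}
```

In the model, the wide variant fails on the path that allocates, while
the narrow variant succeeds on both paths and still holds the section
where the nexthop lookup needs it.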

[0]:
[ BUG: Invalid wait context ]
6.15.0-rc4-syzkaller-00746-g836b313a14a3 #0 Tainted: G W
syz-executor234/5832 is trying to lock:
ffffffff8e021688 (pcpu_alloc_mutex){+.+.}-{4:4}, at:
pcpu_alloc_noprof+0x284/0x16b0 mm/percpu.c:1782
other info that might help us debug this:
context-{5:5}
1 lock held by syz-executor234/5832:
 0: ffffffff8df3b860 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire
include/linux/rcupdate.h:331 [inline]
 0: ffffffff8df3b860 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock
include/linux/rcupdate.h:841 [inline]
 0: ffffffff8df3b860 (rcu_read_lock){....}-{1:3}, at:
ip6_route_add+0x4d/0x2f0 net/ipv6/route.c:3913
stack backtrace:
CPU: 0 UID: 0 PID: 5832 Comm: syz-executor234 Tainted: G W
6.15.0-rc4-syzkaller-00746-g836b313a14a3 #0 PREEMPT(full)
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 04/29/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 print_lock_invalid_wait_context kernel/locking/lockdep.c:4831 [inline]
 check_wait_context kernel/locking/lockdep.c:4903 [inline]
 __lock_acquire+0xbcf/0xd20 kernel/locking/lockdep.c:5185
 lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5866
 __mutex_lock_common kernel/locking/mutex.c:601 [inline]
 __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:746
 pcpu_alloc_noprof+0x284/0x16b0 mm/percpu.c:1782
 dst_cache_init+0x37/0xc0 net/core/dst_cache.c:145
 ip_tun_build_state+0x193/0x6b0 net/ipv4/ip_tunnel_core.c:687
 lwtunnel_build_state+0x381/0x4c0 net/core/lwtunnel.c:137
 fib_nh_common_init+0x129/0x460 net/ipv4/fib_semantics.c:635
 fib6_nh_init+0x15e4/0x2030 net/ipv6/route.c:3669
 ip6_route_info_create_nh+0x139/0x870 net/ipv6/route.c:3866
 ip6_route_add+0xf6/0x2f0 net/ipv6/route.c:3915
 inet6_rtm_newroute+0x284/0x1c50 net/ipv6/route.c:5732
 rtnetlink_rcv_msg+0x7cc/0xb70 net/core/rtnetlink.c:6955
 netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534
 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
 netlink_unicast+0x758/0x8d0 net/netlink/af_netlink.c:1339
 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883
 sock_sendmsg_nosec net/socket.c:712 [inline]
 __sock_sendmsg+0x219/0x270 net/socket.c:727
 ____sys_sendmsg+0x505/0x830 net/socket.c:2566
 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2620
 __sys_sendmsg net/socket.c:2652 [inline]
 __do_sys_sendmsg net/socket.c:2657 [inline]
 __se_sys_sendmsg net/socket.c:2655 [inline]
 __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2655
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xf6/0x210 arch/x86/entry/syscall_64.c:94

Fixes: 169fd62799e8 ("ipv6: Get rid of RTNL for SIOCADDRT and RTM_NEWROUTE.")
Reported-by: syzbot+bcc12d6799364500fbec@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bcc12d6799364500fbec
Reported-by: Eric Dumazet <edumazet@google.com>
Closes: https://lore.kernel.org/netdev/CANn89i+r1cGacVC_6n3-A-WSkAa_Nr+pmxJ7Gt+oP-P9by2aGw@mail.gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/ip6_fib.c |  5 +++--
 net/ipv6/route.c   | 31 ++++++++++++++++++-------------
 2 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 88770ecd2da1..e17b173625da 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1027,8 +1027,9 @@ static void fib6_drop_pcpu_from(struct fib6_info *f6i,
 			.table = table
 		};
 
-		nexthop_for_each_fib6_nh(f6i->nh, fib6_nh_drop_pcpu_from,
-					 &arg);
+		rcu_read_lock();
+		nexthop_for_each_fib6_nh(f6i->nh, fib6_nh_drop_pcpu_from, &arg);
+		rcu_read_unlock();
 	} else {
 		struct fib6_nh *fib6_nh;
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 6baf177c529b..a87091dd06b1 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1820,11 +1820,13 @@ static int rt6_nh_flush_exceptions(struct fib6_nh *nh, void *arg)
 
 void rt6_flush_exceptions(struct fib6_info *f6i)
 {
-	if (f6i->nh)
-		nexthop_for_each_fib6_nh(f6i->nh, rt6_nh_flush_exceptions,
-					 f6i);
-	else
+	if (f6i->nh) {
+		rcu_read_lock();
+		nexthop_for_each_fib6_nh(f6i->nh, rt6_nh_flush_exceptions, f6i);
+		rcu_read_unlock();
+	} else {
 		fib6_nh_flush_exceptions(f6i->fib6_nh, f6i);
+	}
 }
 
 /* Find cached rt in the hash table inside passed in rt
@@ -3841,6 +3843,8 @@ static int ip6_route_info_create_nh(struct fib6_info *rt,
 	if (cfg->fc_nh_id) {
 		struct nexthop *nh;
 
+		rcu_read_lock();
+
 		nh = nexthop_find_by_id(net, cfg->fc_nh_id);
 		if (!nh) {
 			err = -EINVAL;
@@ -3860,6 +3864,8 @@ static int ip6_route_info_create_nh(struct fib6_info *rt,
 
 		rt->nh = nh;
 		fib6_nh = nexthop_fib6_nh(rt->nh);
+
+		rcu_read_unlock();
 	} else {
 		int addr_type;
 
@@ -3895,6 +3901,7 @@ static int ip6_route_info_create_nh(struct fib6_info *rt,
 	fib6_info_release(rt);
 	return err;
 out_free:
+	rcu_read_unlock();
 	ip_fib_metrics_put(rt->fib6_metrics);
 	kfree(rt);
 	return err;
@@ -3910,16 +3917,12 @@ int ip6_route_add(struct fib6_config *cfg, gfp_t gfp_flags,
 	if (IS_ERR(rt))
 		return PTR_ERR(rt);
 
-	rcu_read_lock();
-
 	err = ip6_route_info_create_nh(rt, cfg, extack);
 	if (err)
-		goto unlock;
+		return err;
 
 	err = __ip6_ins_rt(rt, &cfg->fc_nlinfo, extack);
 	fib6_info_release(rt);
-unlock:
-	rcu_read_unlock();
 
 	return err;
 }
@@ -5534,8 +5537,6 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 	if (err)
 		return err;
 
-	rcu_read_lock();
-
 	err = ip6_route_mpath_info_create_nh(&rt6_nh_list, extack);
 	if (err)
 		goto cleanup;
@@ -5627,8 +5628,6 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 	}
 
 cleanup:
-	rcu_read_unlock();
-
 	list_for_each_entry_safe(nh, nh_safe, &rt6_nh_list, list) {
 		fib6_info_release(nh->fib6_info);
 		list_del(&nh->list);
@@ -6410,6 +6409,8 @@ void inet6_rt_notify(int event, struct fib6_info *rt, struct nl_info *info,
 	err = -ENOBUFS;
 	seq = info->nlh ? info->nlh->nlmsg_seq : 0;
 
+	rcu_read_lock();
+
 	skb = nlmsg_new(rt6_nlmsg_size(rt), GFP_ATOMIC);
 	if (!skb)
 		goto errout;
@@ -6422,10 +6423,14 @@ void inet6_rt_notify(int event, struct fib6_info *rt, struct nl_info *info,
 		kfree_skb(skb);
 		goto errout;
 	}
+
+	rcu_read_unlock();
+
 	rtnl_notify(skb, net, info->portid, RTNLGRP_IPV6_ROUTE,
 		    info->nlh, GFP_ATOMIC);
 	return;
 errout:
+	rcu_read_unlock();
 	rtnl_set_sk_err(net, RTNLGRP_IPV6_ROUTE, err);
 }
 
-- 
2.49.0



* [PATCH v1 net-next 4/7] Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup"
  2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
                   ` (2 preceding siblings ...)
  2025-05-14 20:18 ` [PATCH v1 net-next 3/7] ipv6: Narrow down RCU critical section in inet6_rtm_newroute() Kuniyuki Iwashima
@ 2025-05-14 20:18 ` Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 5/7] Revert "ipv6: Factorise ip6_route_multipath_add()." Kuniyuki Iwashima
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-14 20:18 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

The previous patch fixed the same issue mentioned in
commit 14a0087e7236 ("ipv6: sr: switch to GFP_ATOMIC
flag to allocate memory during seg6local LWT setup").

Let's revert it.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/seg6_local.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index ee5e448cc7a8..ac1dbd492c22 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -1671,7 +1671,7 @@ static int parse_nla_srh(struct nlattr **attrs, struct seg6_local_lwt *slwt,
 	if (!seg6_validate_srh(srh, len, false))
 		return -EINVAL;
 
-	slwt->srh = kmemdup(srh, len, GFP_ATOMIC);
+	slwt->srh = kmemdup(srh, len, GFP_KERNEL);
 	if (!slwt->srh)
 		return -ENOMEM;
 
@@ -1911,7 +1911,7 @@ static int parse_nla_bpf(struct nlattr **attrs, struct seg6_local_lwt *slwt,
 	if (!tb[SEG6_LOCAL_BPF_PROG] || !tb[SEG6_LOCAL_BPF_PROG_NAME])
 		return -EINVAL;
 
-	slwt->bpf.name = nla_memdup(tb[SEG6_LOCAL_BPF_PROG_NAME], GFP_ATOMIC);
+	slwt->bpf.name = nla_memdup(tb[SEG6_LOCAL_BPF_PROG_NAME], GFP_KERNEL);
 	if (!slwt->bpf.name)
 		return -ENOMEM;
 
@@ -1994,7 +1994,7 @@ static int parse_nla_counters(struct nlattr **attrs,
 		return -EINVAL;
 
 	/* counters are always zero initialized */
-	pcounters = seg6_local_alloc_pcpu_counters(GFP_ATOMIC);
+	pcounters = seg6_local_alloc_pcpu_counters(GFP_KERNEL);
 	if (!pcounters)
 		return -ENOMEM;
 
-- 
2.49.0



* [PATCH v1 net-next 5/7] Revert "ipv6: Factorise ip6_route_multipath_add()."
  2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
                   ` (3 preceding siblings ...)
  2025-05-14 20:18 ` [PATCH v1 net-next 4/7] Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup" Kuniyuki Iwashima
@ 2025-05-14 20:18 ` Kuniyuki Iwashima
  2025-05-14 20:18 ` [PATCH v1 net-next 6/7] ipv6: Pass gfp_flags down to ip6_route_info_create_nh() Kuniyuki Iwashima
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-14 20:18 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

Commit 71c0efb6d12f ("ipv6: Factorise ip6_route_multipath_add().") split
a loop in ip6_route_multipath_add() so that we can put rcu_read_lock()
between ip6_route_info_create() and ip6_route_info_create_nh().

We no longer need to do so as ip6_route_info_create_nh() does not require
RCU now.

Let's revert the commit to simplify the code.
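
The restored single-loop shape builds the nexthop list one entry at a
time and rejects duplicates as it goes.  A rough userspace sketch of that
"check, then append" step follows; it assumes a plain integer can stand
in for whatever identifies a nexthop, and the model_ names and list
layout are invented for illustration rather than taken from route.c.

```c
#include <errno.h>
#include <stdlib.h>

struct model_nh {
	int gateway;              /* stands in for the nexthop identity */
	struct model_nh *next;
};

struct model_nh_list {
	struct model_nh *head;
	struct model_nh *tail;
};

/* Append a nexthop unless an equal one is already on the list. */
static int model_info_append(struct model_nh_list *list, int gateway)
{
	struct model_nh *nh;

	for (nh = list->head; nh; nh = nh->next)
		if (nh->gateway == gateway)
			return -EEXIST;

	nh = calloc(1, sizeof(*nh));
	if (!nh)
		return -ENOMEM;

	nh->gateway = gateway;
	if (list->tail)
		list->tail->next = nh;
	else
		list->head = nh;
	list->tail = nh;

	return 0;
}

/* Release every entry; the caller does this on both success and error. */
static void model_list_free(struct model_nh_list *list)
{
	struct model_nh *nh = list->head, *next;

	while (nh) {
		next = nh->next;
		free(nh);
		nh = next;
	}
	list->head = list->tail = NULL;
}
```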

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/route.c | 193 +++++++++++++++++------------------------------
 1 file changed, 70 insertions(+), 123 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index a87091dd06b1..96ae21da9961 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -5335,131 +5335,29 @@ struct rt6_nh {
 	struct fib6_info *fib6_info;
 	struct fib6_config r_cfg;
 	struct list_head list;
-	int weight;
 };
 
-static void ip6_route_mpath_info_cleanup(struct list_head *rt6_nh_list)
+static int ip6_route_info_append(struct list_head *rt6_nh_list,
+				 struct fib6_info *rt,
+				 struct fib6_config *r_cfg)
 {
-	struct rt6_nh *nh, *nh_next;
+	struct rt6_nh *nh;
 
-	list_for_each_entry_safe(nh, nh_next, rt6_nh_list, list) {
-		struct fib6_info *rt = nh->fib6_info;
-
-		if (rt) {
-			free_percpu(rt->fib6_nh->nh_common.nhc_pcpu_rth_output);
-			free_percpu(rt->fib6_nh->rt6i_pcpu);
-			ip_fib_metrics_put(rt->fib6_metrics);
-			kfree(rt);
-		}
-
-		list_del(&nh->list);
-		kfree(nh);
+	list_for_each_entry(nh, rt6_nh_list, list) {
+		/* check if fib6_info already exists */
+		if (rt6_duplicate_nexthop(nh->fib6_info, rt))
+			return -EEXIST;
 	}
-}
-
-static int ip6_route_mpath_info_create(struct list_head *rt6_nh_list,
-				       struct fib6_config *cfg,
-				       struct netlink_ext_ack *extack)
-{
-	struct rtnexthop *rtnh;
-	int remaining;
-	int err;
-
-	remaining = cfg->fc_mp_len;
-	rtnh = (struct rtnexthop *)cfg->fc_mp;
-
-	/* Parse a Multipath Entry and build a list (rt6_nh_list) of
-	 * fib6_info structs per nexthop
-	 */
-	while (rtnh_ok(rtnh, remaining)) {
-		struct fib6_config r_cfg;
-		struct fib6_info *rt;
-		struct rt6_nh *nh;
-		int attrlen;
-
-		nh = kzalloc(sizeof(*nh), GFP_KERNEL);
-		if (!nh) {
-			err = -ENOMEM;
-			goto err;
-		}
 
-		list_add_tail(&nh->list, rt6_nh_list);
-
-		memcpy(&r_cfg, cfg, sizeof(*cfg));
-		if (rtnh->rtnh_ifindex)
-			r_cfg.fc_ifindex = rtnh->rtnh_ifindex;
-
-		attrlen = rtnh_attrlen(rtnh);
-		if (attrlen > 0) {
-			struct nlattr *nla, *attrs = rtnh_attrs(rtnh);
-
-			nla = nla_find(attrs, attrlen, RTA_GATEWAY);
-			if (nla) {
-				r_cfg.fc_gateway = nla_get_in6_addr(nla);
-				r_cfg.fc_flags |= RTF_GATEWAY;
-			}
-
-			r_cfg.fc_encap = nla_find(attrs, attrlen, RTA_ENCAP);
-			nla = nla_find(attrs, attrlen, RTA_ENCAP_TYPE);
-			if (nla)
-				r_cfg.fc_encap_type = nla_get_u16(nla);
-		}
-
-		r_cfg.fc_flags |= (rtnh->rtnh_flags & RTNH_F_ONLINK);
-
-		rt = ip6_route_info_create(&r_cfg, GFP_KERNEL, extack);
-		if (IS_ERR(rt)) {
-			err = PTR_ERR(rt);
-			goto err;
-		}
-
-		nh->fib6_info = rt;
-		nh->weight = rtnh->rtnh_hops + 1;
-		memcpy(&nh->r_cfg, &r_cfg, sizeof(r_cfg));
+	nh = kzalloc(sizeof(*nh), GFP_KERNEL);
+	if (!nh)
+		return -ENOMEM;
 
-		rtnh = rtnh_next(rtnh, &remaining);
-	}
+	nh->fib6_info = rt;
+	memcpy(&nh->r_cfg, r_cfg, sizeof(*r_cfg));
+	list_add_tail(&nh->list, rt6_nh_list);
 
 	return 0;
-err:
-	ip6_route_mpath_info_cleanup(rt6_nh_list);
-	return err;
-}
-
-static int ip6_route_mpath_info_create_nh(struct list_head *rt6_nh_list,
-					  struct netlink_ext_ack *extack)
-{
-	struct rt6_nh *nh, *nh_next, *nh_tmp;
-	LIST_HEAD(tmp);
-	int err;
-
-	list_for_each_entry_safe(nh, nh_next, rt6_nh_list, list) {
-		struct fib6_info *rt = nh->fib6_info;
-
-		err = ip6_route_info_create_nh(rt, &nh->r_cfg, extack);
-		if (err) {
-			nh->fib6_info = NULL;
-			goto err;
-		}
-
-		rt->fib6_nh->fib_nh_weight = nh->weight;
-
-		list_move_tail(&nh->list, &tmp);
-
-		list_for_each_entry(nh_tmp, rt6_nh_list, list) {
-			/* check if fib6_info already exists */
-			if (rt6_duplicate_nexthop(nh_tmp->fib6_info, rt)) {
-				err = -EEXIST;
-				goto err;
-			}
-		}
-	}
-out:
-	list_splice(&tmp, rt6_nh_list);
-	return err;
-err:
-	ip6_route_mpath_info_cleanup(rt6_nh_list);
-	goto out;
 }
 
 static void ip6_route_mpath_notify(struct fib6_info *rt,
@@ -5519,11 +5417,16 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 	struct fib6_info *rt_notif = NULL, *rt_last = NULL;
 	struct nl_info *info = &cfg->fc_nlinfo;
 	struct rt6_nh *nh, *nh_safe;
+	struct fib6_config r_cfg;
+	struct rtnexthop *rtnh;
 	LIST_HEAD(rt6_nh_list);
 	struct rt6_nh *err_nh;
+	struct fib6_info *rt;
 	__u16 nlflags;
-	int nhn = 0;
+	int remaining;
+	int attrlen;
 	int replace;
+	int nhn = 0;
 	int err;
 
 	replace = (cfg->fc_nlinfo.nlh &&
@@ -5533,13 +5436,57 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 	if (info->nlh && info->nlh->nlmsg_flags & NLM_F_APPEND)
 		nlflags |= NLM_F_APPEND;
 
-	err = ip6_route_mpath_info_create(&rt6_nh_list, cfg, extack);
-	if (err)
-		return err;
+	remaining = cfg->fc_mp_len;
+	rtnh = (struct rtnexthop *)cfg->fc_mp;
 
-	err = ip6_route_mpath_info_create_nh(&rt6_nh_list, extack);
-	if (err)
-		goto cleanup;
+	/* Parse a Multipath Entry and build a list (rt6_nh_list) of
+	 * fib6_info structs per nexthop
+	 */
+	while (rtnh_ok(rtnh, remaining)) {
+		memcpy(&r_cfg, cfg, sizeof(*cfg));
+		if (rtnh->rtnh_ifindex)
+			r_cfg.fc_ifindex = rtnh->rtnh_ifindex;
+
+		attrlen = rtnh_attrlen(rtnh);
+		if (attrlen > 0) {
+			struct nlattr *nla, *attrs = rtnh_attrs(rtnh);
+
+			nla = nla_find(attrs, attrlen, RTA_GATEWAY);
+			if (nla) {
+				r_cfg.fc_gateway = nla_get_in6_addr(nla);
+				r_cfg.fc_flags |= RTF_GATEWAY;
+			}
+
+			r_cfg.fc_encap = nla_find(attrs, attrlen, RTA_ENCAP);
+			nla = nla_find(attrs, attrlen, RTA_ENCAP_TYPE);
+			if (nla)
+				r_cfg.fc_encap_type = nla_get_u16(nla);
+		}
+
+		r_cfg.fc_flags |= (rtnh->rtnh_flags & RTNH_F_ONLINK);
+		rt = ip6_route_info_create(&r_cfg, GFP_KERNEL, extack);
+		if (IS_ERR(rt)) {
+			err = PTR_ERR(rt);
+			rt = NULL;
+			goto cleanup;
+		}
+
+		err = ip6_route_info_create_nh(rt, &r_cfg, extack);
+		if (err) {
+			rt = NULL;
+			goto cleanup;
+		}
+
+		rt->fib6_nh->fib_nh_weight = rtnh->rtnh_hops + 1;
+
+		err = ip6_route_info_append(&rt6_nh_list, rt, &r_cfg);
+		if (err) {
+			fib6_info_release(rt);
+			goto cleanup;
+		}
+
+		rtnh = rtnh_next(rtnh, &remaining);
+	}
 
 	/* for add and replace send one notification with all nexthops.
 	 * Skip the notification in fib6_add_rt2node and send one with
-- 
2.49.0



* [PATCH v1 net-next 6/7] ipv6: Pass gfp_flags down to ip6_route_info_create_nh().
  2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
                   ` (4 preceding siblings ...)
  2025-05-14 20:18 ` [PATCH v1 net-next 5/7] Revert "ipv6: Factorise ip6_route_multipath_add()." Kuniyuki Iwashima
@ 2025-05-14 20:18 ` Kuniyuki Iwashima
  2025-05-14 20:19 ` [PATCH v1 net-next 7/7] ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE Kuniyuki Iwashima
  2025-05-15  1:45 ` [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Jakub Kicinski
  7 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-14 20:18 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

Since commit c4837b9853e5 ("ipv6: Split ip6_route_info_create()."),
ip6_route_info_create_nh() uses GFP_ATOMIC as it was expected to be
called under RCU.

Now, we can call it without RCU and use GFP_KERNEL.

Let's pass gfp_flags to ip6_route_info_create_nh().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/route.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 96ae21da9961..dda913ebd2d3 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3834,6 +3834,7 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
 
 static int ip6_route_info_create_nh(struct fib6_info *rt,
 				    struct fib6_config *cfg,
+				    gfp_t gfp_flags,
 				    struct netlink_ext_ack *extack)
 {
 	struct net *net = cfg->fc_nlinfo.nl_net;
@@ -3869,7 +3870,7 @@ static int ip6_route_info_create_nh(struct fib6_info *rt,
 	} else {
 		int addr_type;
 
-		err = fib6_nh_init(net, rt->fib6_nh, cfg, GFP_ATOMIC, extack);
+		err = fib6_nh_init(net, rt->fib6_nh, cfg, gfp_flags, extack);
 		if (err)
 			goto out_release;
 
@@ -3917,7 +3918,7 @@ int ip6_route_add(struct fib6_config *cfg, gfp_t gfp_flags,
 	if (IS_ERR(rt))
 		return PTR_ERR(rt);
 
-	err = ip6_route_info_create_nh(rt, cfg, extack);
+	err = ip6_route_info_create_nh(rt, cfg, gfp_flags, extack);
 	if (err)
 		return err;
 
@@ -4707,7 +4708,7 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
 	if (IS_ERR(f6i))
 		return f6i;
 
-	err = ip6_route_info_create_nh(f6i, &cfg, extack);
+	err = ip6_route_info_create_nh(f6i, &cfg, gfp_flags, extack);
 	if (err)
 		return ERR_PTR(err);
 
@@ -5471,7 +5472,7 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
 			goto cleanup;
 		}
 
-		err = ip6_route_info_create_nh(rt, &r_cfg, extack);
+		err = ip6_route_info_create_nh(rt, &r_cfg, GFP_KERNEL, extack);
 		if (err) {
 			rt = NULL;
 			goto cleanup;
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v1 net-next 7/7] ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE.
  2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
                   ` (5 preceding siblings ...)
  2025-05-14 20:18 ` [PATCH v1 net-next 6/7] ipv6: Pass gfp_flags down to ip6_route_info_create_nh() Kuniyuki Iwashima
@ 2025-05-14 20:19 ` Kuniyuki Iwashima
  2025-05-15  1:45 ` [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Jakub Kicinski
  7 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-14 20:19 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

The following two commits preallocated two per-cpu variables in
ip6_route_info_create() because fib_nh_common_init() and fib6_nh_init()
were expected to be called under RCU.

  * commit d27b9c40dbd6 ("ipv6: Preallocate nhc_pcpu_rth_output in
    ip6_route_info_create().")
  * commit 5720a328c3e9 ("ipv6: Preallocate rt->fib6_nh->rt6i_pcpu in
    ip6_route_info_create().")

Now these functions can be called without RCU and can use GFP_KERNEL.

Let's revert the commits.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
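For illustration only, a minimal userspace sketch of the two allocation
patterns involved, assuming calloc() as a stand-in for
alloc_percpu_gfp(). struct mock_nh and both mock_nh_init_*() helpers are
hypothetical names, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a nexthop that owns one per-cpu area. */
struct mock_nh {
	int *pcpu;
};

/* Pattern being reverted: init allocates only when the caller has not
 * preallocated the area beforehand.
 */
int mock_nh_init_guarded(struct mock_nh *nh)
{
	if (!nh->pcpu) {
		nh->pcpu = calloc(1, sizeof(*nh->pcpu));
		if (!nh->pcpu)
			return -ENOMEM;
	}
	return 0;
}

/* Pattern after the revert: init always performs the allocation, so no
 * separate preallocation step is needed.
 */
int mock_nh_init_direct(struct mock_nh *nh)
{
	nh->pcpu = calloc(1, sizeof(*nh->pcpu));
	if (!nh->pcpu)
		return -ENOMEM;
	return 0;
}
```

The guarded variant leaves an already allocated area untouched, which is
what made the separate preallocation step work; the direct variant
assumes the pointer is unset when it is called.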
 net/ipv4/fib_semantics.c | 10 ++++------
 net/ipv6/route.c         | 34 +++-------------------------------
 2 files changed, 7 insertions(+), 37 deletions(-)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index dabe2b7044ab..d643bd1a0d9d 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -617,12 +617,10 @@ int fib_nh_common_init(struct net *net, struct fib_nh_common *nhc,
 {
 	int err;
 
-	if (!nhc->nhc_pcpu_rth_output) {
-		nhc->nhc_pcpu_rth_output = alloc_percpu_gfp(struct rtable __rcu *,
-							    gfp_flags);
-		if (!nhc->nhc_pcpu_rth_output)
-			return -ENOMEM;
-	}
+	nhc->nhc_pcpu_rth_output = alloc_percpu_gfp(struct rtable __rcu *,
+						    gfp_flags);
+	if (!nhc->nhc_pcpu_rth_output)
+		return -ENOMEM;
 
 	if (encap) {
 		struct lwtunnel_state *lwtstate;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index dda913ebd2d3..0143262094b0 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3674,12 +3674,10 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
 		goto out;
 
 pcpu_alloc:
+	fib6_nh->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, gfp_flags);
 	if (!fib6_nh->rt6i_pcpu) {
-		fib6_nh->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, gfp_flags);
-		if (!fib6_nh->rt6i_pcpu) {
-			err = -ENOMEM;
-			goto out;
-		}
+		err = -ENOMEM;
+		goto out;
 	}
 
 	fib6_nh->fib_nh_dev = dev;
@@ -3739,24 +3737,6 @@ void fib6_nh_release_dsts(struct fib6_nh *fib6_nh)
 	}
 }
 
-static int fib6_nh_prealloc_percpu(struct fib6_nh *fib6_nh, gfp_t gfp_flags)
-{
-	struct fib_nh_common *nhc = &fib6_nh->nh_common;
-
-	fib6_nh->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, gfp_flags);
-	if (!fib6_nh->rt6i_pcpu)
-		return -ENOMEM;
-
-	nhc->nhc_pcpu_rth_output = alloc_percpu_gfp(struct rtable __rcu *,
-						    gfp_flags);
-	if (!nhc->nhc_pcpu_rth_output) {
-		free_percpu(fib6_nh->rt6i_pcpu);
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
 static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
 					       gfp_t gfp_flags,
 					       struct netlink_ext_ack *extack)
@@ -3794,12 +3774,6 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
 		goto free;
 	}
 
-	if (!cfg->fc_nh_id) {
-		err = fib6_nh_prealloc_percpu(&rt->fib6_nh[0], gfp_flags);
-		if (err)
-			goto free_metrics;
-	}
-
 	if (cfg->fc_flags & RTF_ADDRCONF)
 		rt->dst_nocount = true;
 
@@ -3824,8 +3798,6 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
 	rt->fib6_src.plen = cfg->fc_src_len;
 #endif
 	return rt;
-free_metrics:
-	ip_fib_metrics_put(rt->fib6_metrics);
 free:
 	kfree(rt);
 err:
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series.
  2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
                   ` (6 preceding siblings ...)
  2025-05-14 20:19 ` [PATCH v1 net-next 7/7] ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE Kuniyuki Iwashima
@ 2025-05-15  1:45 ` Jakub Kicinski
  2025-05-15  2:05   ` Kuniyuki Iwashima
  7 siblings, 1 reply; 13+ messages in thread
From: Jakub Kicinski @ 2025-05-15  1:45 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, David Ahern, Eric Dumazet, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev

On Wed, 14 May 2025 13:18:53 -0700 Kuniyuki Iwashima wrote:
> Patch 1 removes rcu_read_lock() in fib6_get_table().
> Patch 2 removes rtnl_is_held arg for lwtunnel_valid_encap_type(), which
>  was short-term fix and is no longer used.
> Patch 3 fixes RCU vs GFP_KERNEL report by syzkaller.
> Patch 4~7 reverts GFP_ATOMIC uses to GFP_KERNEL.

Hi! Something in the following set of patches is making our CI time out.
The problem seems to be:

[    0.751266] virtme-init: waiting for udev to settle
Timed out for waiting the udev queue being empty.
[  120.826428] virtme-init: udev is done

+team: grab team lock during team_change_rx_flags
+net: mana: Add handler for hardware servicing events
+ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE.
+ipv6: Pass gfp_flags down to ip6_route_info_create_nh().
+Revert "ipv6: Factorise ip6_route_multipath_add()."
+Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup"
+ipv6: Narrow down RCU critical section in inet6_rtm_newroute().
+inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?().
+ipv6: Remove rcu_read_lock() in fib6_get_table().
+net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead
 amd-xgbe: read link status twice to avoid inconsistencies
+net: phy: fixed_phy: remove fixed_phy_register_with_gpiod
 drivers: net: mvpp2: attempt to refill rx before allocating skb
+selftest: af_unix: Test SO_PASSRIGHTS.
+af_unix: Introduce SO_PASSRIGHTS.
+af_unix: Inherit sk_flags at connect().
+af_unix: Move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
+net: Restrict SO_PASS{CRED,PIDFD,SEC} to AF_{UNIX,NETLINK,BLUETOOTH}.
+tcp: Restrict SO_TXREHASH to TCP socket.
+scm: Move scm_recv() from scm.h to scm.c.
+af_unix: Don't pass struct socket to maybe_add_creds().
+af_unix: Factorise test_bit() for SOCK_PASSCRED and SOCK_PASSPIDFD.

I haven't dug into it, gotta review / apply other patches :(
Maybe you can try to repro? 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series.
  2025-05-15  1:45 ` [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Jakub Kicinski
@ 2025-05-15  2:05   ` Kuniyuki Iwashima
  2025-05-15  2:22     ` Jakub Kicinski
  2025-05-15  9:02     ` Paolo Abeni
  0 siblings, 2 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-15  2:05 UTC (permalink / raw)
  To: kuba; +Cc: davem, dsahern, edumazet, horms, kuni1840, kuniyu, netdev, pabeni

From: Jakub Kicinski <kuba@kernel.org>
Date: Wed, 14 May 2025 18:45:02 -0700
> On Wed, 14 May 2025 13:18:53 -0700 Kuniyuki Iwashima wrote:
> > Patch 1 removes rcu_read_lock() in fib6_get_table().
> > Patch 2 removes rtnl_is_held arg for lwtunnel_valid_encap_type(), which
> >  was short-term fix and is no longer used.
> > Patch 3 fixes RCU vs GFP_KERNEL report by syzkaller.
> > Patch 4~7 reverts GFP_ATOMIC uses to GFP_KERNEL.
> 
> Hi! Something in the following set of patches is making our CI time out.
> The problem seems to be:
> 
> [    0.751266] virtme-init: waiting for udev to settle
> Timed out for waiting the udev queue being empty.
> [  120.826428] virtme-init: udev is done
> 
> +team: grab team lock during team_change_rx_flags
> +net: mana: Add handler for hardware servicing events
> +ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE.
> +ipv6: Pass gfp_flags down to ip6_route_info_create_nh().
> +Revert "ipv6: Factorise ip6_route_multipath_add()."
> +Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup"
> +ipv6: Narrow down RCU critical section in inet6_rtm_newroute().
> +inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?().
> +ipv6: Remove rcu_read_lock() in fib6_get_table().
> +net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead
>  amd-xgbe: read link status twice to avoid inconsistencies
> +net: phy: fixed_phy: remove fixed_phy_register_with_gpiod
>  drivers: net: mvpp2: attempt to refill rx before allocating skb
> +selftest: af_unix: Test SO_PASSRIGHTS.
> +af_unix: Introduce SO_PASSRIGHTS.
> +af_unix: Inherit sk_flags at connect().
> +af_unix: Move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
> +net: Restrict SO_PASS{CRED,PIDFD,SEC} to AF_{UNIX,NETLINK,BLUETOOTH}.
> +tcp: Restrict SO_TXREHASH to TCP socket.
> +scm: Move scm_recv() from scm.h to scm.c.
> +af_unix: Don't pass struct socket to maybe_add_creds().
> +af_unix: Factorise test_bit() for SOCK_PASSCRED and SOCK_PASSPIDFD.
> 
> I haven't dug into it, gotta review / apply other patches :(
> Maybe you can try to repro? 

I think I was able to reproduce it with SO_PASSRIGHTS series
with virtme-ng (but not with normal qemu with AL2023 rootfs).

After 2min, virtme-ng showed the console.

[    1.461450] virtme-ng-init: triggering udev coldplug
[    1.533147] virtme-ng-init: waiting for udev to settle
[  121.588624] virtme-ng-init: Timed out for waiting the udev queue being empty.
[  121.588710] virtme-ng-init: udev is done
[  121.593214] virtme-ng-init: initialization done
          _      _
   __   _(_)_ __| |_ _ __ ___   ___       _ __   __ _
   \ \ / / |  __| __|  _   _ \ / _ \_____|  _ \ / _  |
    \ V /| | |  | |_| | | | | |  __/_____| | | | (_| |
     \_/ |_|_|   \__|_| |_| |_|\___|     |_| |_|\__  |
                                                |___/
   kernel version: 6.15.0-rc4-virtme-00071-gceba111cf5e7 x86_64
   (CTRL+d to exit)


Will investigate the cause.

Sorry, but please drop the series and kick the CI again.

Thanks!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series.
  2025-05-15  2:05   ` Kuniyuki Iwashima
@ 2025-05-15  2:22     ` Jakub Kicinski
  2025-05-15  9:02     ` Paolo Abeni
  1 sibling, 0 replies; 13+ messages in thread
From: Jakub Kicinski @ 2025-05-15  2:22 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: davem, dsahern, edumazet, horms, kuni1840, netdev, pabeni

On Wed, 14 May 2025 19:05:16 -0700 Kuniyuki Iwashima wrote:
> I think I was able to reproduce it with SO_PASSRIGHTS series
> with virtme-ng (but not with normal qemu with AL2023 rootfs).
> 
> After 2min, virtme-ng showed the console.

Yup! That's what I saw. Thanks for investigating!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series.
  2025-05-15  2:05   ` Kuniyuki Iwashima
  2025-05-15  2:22     ` Jakub Kicinski
@ 2025-05-15  9:02     ` Paolo Abeni
  2025-05-15 16:46       ` Kuniyuki Iwashima
  1 sibling, 1 reply; 13+ messages in thread
From: Paolo Abeni @ 2025-05-15  9:02 UTC (permalink / raw)
  To: Kuniyuki Iwashima, kuba; +Cc: davem, dsahern, edumazet, horms, kuni1840, netdev

On 5/15/25 4:05 AM, Kuniyuki Iwashima wrote:
> From: Jakub Kicinski <kuba@kernel.org>
> Date: Wed, 14 May 2025 18:45:02 -0700
>> On Wed, 14 May 2025 13:18:53 -0700 Kuniyuki Iwashima wrote:
>>> Patch 1 removes rcu_read_lock() in fib6_get_table().
>>> Patch 2 removes rtnl_is_held arg for lwtunnel_valid_encap_type(), which
>>>  was short-term fix and is no longer used.
>>> Patch 3 fixes RCU vs GFP_KERNEL report by syzkaller.
>>> Patch 4~7 reverts GFP_ATOMIC uses to GFP_KERNEL.
>>
>> Hi! Something in the following set of patches is making our CI time out.
>> The problem seems to be:
>>
>> [    0.751266] virtme-init: waiting for udev to settle
>> Timed out for waiting the udev queue being empty.
>> [  120.826428] virtme-init: udev is done
>>
>> +team: grab team lock during team_change_rx_flags
>> +net: mana: Add handler for hardware servicing events
>> +ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE.
>> +ipv6: Pass gfp_flags down to ip6_route_info_create_nh().
>> +Revert "ipv6: Factorise ip6_route_multipath_add()."
>> +Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup"
>> +ipv6: Narrow down RCU critical section in inet6_rtm_newroute().
>> +inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?().
>> +ipv6: Remove rcu_read_lock() in fib6_get_table().
>> +net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead
>>  amd-xgbe: read link status twice to avoid inconsistencies
>> +net: phy: fixed_phy: remove fixed_phy_register_with_gpiod
>>  drivers: net: mvpp2: attempt to refill rx before allocating skb
>> +selftest: af_unix: Test SO_PASSRIGHTS.
>> +af_unix: Introduce SO_PASSRIGHTS.
>> +af_unix: Inherit sk_flags at connect().
>> +af_unix: Move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
>> +net: Restrict SO_PASS{CRED,PIDFD,SEC} to AF_{UNIX,NETLINK,BLUETOOTH}.
>> +tcp: Restrict SO_TXREHASH to TCP socket.
>> +scm: Move scm_recv() from scm.h to scm.c.
>> +af_unix: Don't pass struct socket to maybe_add_creds().
>> +af_unix: Factorise test_bit() for SOCK_PASSCRED and SOCK_PASSPIDFD.
>>
>> I haven't dug into it, gotta review / apply other patches :(
>> Maybe you can try to repro? 
> 
> I think I was able to reproduce it with SO_PASSRIGHTS series
> with virtme-ng (but not with normal qemu with AL2023 rootfs).
> 
> After 2min, virtme-ng showed the console.
> 
> [    1.461450] virtme-ng-init: triggering udev coldplug
> [    1.533147] virtme-ng-init: waiting for udev to settle
> [  121.588624] virtme-ng-init: Timed out for waiting the udev queue being empty.
> [  121.588710] virtme-ng-init: udev is done
> [  121.593214] virtme-ng-init: initialization done
>           _      _
>    __   _(_)_ __| |_ _ __ ___   ___       _ __   __ _
>    \ \ / / |  __| __|  _   _ \ / _ \_____|  _ \ / _  |
>     \ V /| | |  | |_| | | | | |  __/_____| | | | (_| |
>      \_/ |_|_|   \__|_| |_| |_|\___|     |_| |_|\__  |
>                                                 |___/
>    kernel version: 6.15.0-rc4-virtme-00071-gceba111cf5e7 x86_64
>    (CTRL+d to exit)
> 
> 
> Will investigate the cause.
> 
> Sorry, but please drop the series and kick the CI again.

FTR, I think some CI iterations survived the boot and hit the following
in several forwarding tests (e.g. router-multipath-sh):

[  922.307796][ T6194] =============================
[  922.308069][ T6194] WARNING: suspicious RCU usage
[  922.308339][ T6194] 6.15.0-rc5-virtme #1 Not tainted
[  922.308596][ T6194] -----------------------------
[  922.308860][ T6194] ./include/net/addrconf.h:347 suspicious rcu_dereference_check() usage!
[  922.309352][ T6194]
[  922.309352][ T6194] other info that might help us debug this:
[  922.309352][ T6194]
[  922.310105][ T6194]
[  922.310105][ T6194] rcu_scheduler_active = 2, debug_locks = 1
[  922.310501][ T6194] 1 lock held by ip/6194:
[  922.310704][ T6194]  #0: ffff888012942630 (&tb->tb6_lock){+...}-{3:3}, at: ip6_route_multipath_add+0x743/0x1450
[  922.311255][ T6194]
[  922.311255][ T6194] stack backtrace:
[  922.311577][ T6194] CPU: 1 UID: 0 PID: 6194 Comm: ip Not tainted 6.15.0-rc5-virtme #1 PREEMPT(full)
[  922.311583][ T6194] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  922.311585][ T6194] Call Trace:
[  922.311589][ T6194]  <TASK>
[  922.311591][ T6194]  dump_stack_lvl+0xb0/0xd0
[  922.311605][ T6194]  lockdep_rcu_suspicious+0x166/0x270
[  922.311619][ T6194]  rt6_multipath_rebalance.part.0+0x70c/0x8a0
[  922.311628][ T6194]  fib6_add_rt2node+0xa36/0x2c00
[  922.311668][ T6194]  fib6_add+0x38d/0xec0
[  922.311699][ T6194]  ip6_route_multipath_add+0x75b/0x1450
[  922.311753][ T6194]  inet6_rtm_newroute+0xb2/0x120
[  922.311795][ T6194]  rtnetlink_rcv_msg+0x710/0xc00
[  922.311819][ T6194]  netlink_rcv_skb+0x12f/0x360
[  922.311869][ T6194]  netlink_unicast+0x449/0x710
[  922.311891][ T6194]  netlink_sendmsg+0x721/0xbe0
[  922.311922][ T6194]  ____sys_sendmsg+0x7aa/0xa10
[  922.311954][ T6194]  ___sys_sendmsg+0xed/0x170
[  922.312031][ T6194]  __sys_sendmsg+0x108/0x1a0
[  922.312061][ T6194]  do_syscall_64+0xc1/0x1d0
[  922.312069][ T6194]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  922.312074][ T6194] RIP: 0033:0x7f8e77c649a7
[  922.312078][ T6194] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[  922.312081][ T6194] RSP: 002b:00007ffd73480708 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  922.312086][ T6194] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f8e77c649a7
[  922.312088][ T6194] RDX: 0000000000000000 RSI: 00007ffd73480770 RDI: 0000000000000005
[  922.312090][ T6194] RBP: 00007ffd73480abc R08: 0000000000000038 R09: 0000000000000000
[  922.312092][ T6194] R10: 000000000b9c6910 R11: 0000000000000246 R12: 00007ffd73481a80
[  922.312094][ T6194] R13: 00000000682562aa R14: 0000000000498600 R15: 00007ffd7348499b
[  922.312108][ T6194]  </TASK>

see:

https://netdev.bots.linux.dev/contest.html?branch=net-next-2025-05-15--03-00&executor=vmksft-forwarding-dbg&pw-n=0&pass=0

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series.
  2025-05-15  9:02     ` Paolo Abeni
@ 2025-05-15 16:46       ` Kuniyuki Iwashima
  0 siblings, 0 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-15 16:46 UTC (permalink / raw)
  To: pabeni; +Cc: davem, dsahern, edumazet, horms, kuba, kuni1840, kuniyu, netdev

From: Paolo Abeni <pabeni@redhat.com>
Date: Thu, 15 May 2025 11:02:34 +0200
> On 5/15/25 4:05 AM, Kuniyuki Iwashima wrote:
> > From: Jakub Kicinski <kuba@kernel.org>
> > Date: Wed, 14 May 2025 18:45:02 -0700
> >> On Wed, 14 May 2025 13:18:53 -0700 Kuniyuki Iwashima wrote:
> >>> Patch 1 removes rcu_read_lock() in fib6_get_table().
> >>> Patch 2 removes rtnl_is_held arg for lwtunnel_valid_encap_type(), which
> >>>  was short-term fix and is no longer used.
> >>> Patch 3 fixes RCU vs GFP_KERNEL report by syzkaller.
> >>> Patch 4~7 reverts GFP_ATOMIC uses to GFP_KERNEL.
> >>
> >> Hi! Something in the following set of patches is making our CI time out.
> >> The problem seems to be:
> >>
> >> [    0.751266] virtme-init: waiting for udev to settle
> >> Timed out for waiting the udev queue being empty.
> >> [  120.826428] virtme-init: udev is done
> >>
> >> +team: grab team lock during team_change_rx_flags
> >> +net: mana: Add handler for hardware servicing events
> >> +ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE.
> >> +ipv6: Pass gfp_flags down to ip6_route_info_create_nh().
> >> +Revert "ipv6: Factorise ip6_route_multipath_add()."
> >> +Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup"
> >> +ipv6: Narrow down RCU critical section in inet6_rtm_newroute().
> >> +inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?().
> >> +ipv6: Remove rcu_read_lock() in fib6_get_table().
> >> +net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead
> >>  amd-xgbe: read link status twice to avoid inconsistencies
> >> +net: phy: fixed_phy: remove fixed_phy_register_with_gpiod
> >>  drivers: net: mvpp2: attempt to refill rx before allocating skb
> >> +selftest: af_unix: Test SO_PASSRIGHTS.
> >> +af_unix: Introduce SO_PASSRIGHTS.
> >> +af_unix: Inherit sk_flags at connect().
> >> +af_unix: Move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
> >> +net: Restrict SO_PASS{CRED,PIDFD,SEC} to AF_{UNIX,NETLINK,BLUETOOTH}.
> >> +tcp: Restrict SO_TXREHASH to TCP socket.
> >> +scm: Move scm_recv() from scm.h to scm.c.
> >> +af_unix: Don't pass struct socket to maybe_add_creds().
> >> +af_unix: Factorise test_bit() for SOCK_PASSCRED and SOCK_PASSPIDFD.
> >>
> >> I haven't dug into it, gotta review / apply other patches :(
> >> Maybe you can try to repro? 
> > 
> > I think I was able to reproduce it with SO_PASSRIGHTS series
> > with virtme-ng (but not with normal qemu with AL2023 rootfs).
> > 
> > After 2min, virtme-ng showed the console.
> > 
> > [    1.461450] virtme-ng-init: triggering udev coldplug
> > [    1.533147] virtme-ng-init: waiting for udev to settle
> > [  121.588624] virtme-ng-init: Timed out for waiting the udev queue being empty.
> > [  121.588710] virtme-ng-init: udev is done
> > [  121.593214] virtme-ng-init: initialization done
> >           _      _
> >    __   _(_)_ __| |_ _ __ ___   ___       _ __   __ _
> >    \ \ / / |  __| __|  _   _ \ / _ \_____|  _ \ / _  |
> >     \ V /| | |  | |_| | | | | |  __/_____| | | | (_| |
> >      \_/ |_|_|   \__|_| |_| |_|\___|     |_| |_|\__  |
> >                                                 |___/
> >    kernel version: 6.15.0-rc4-virtme-00071-gceba111cf5e7 x86_64
> >    (CTRL+d to exit)
> > 
> > 
> > Will investigate the cause.
> > 
> > Sorry, but please drop the series and kick the CI again.
> 
> FTR, I think some CI iterations survived the boot and hit the following
> in several forwarding tests (e.g. router-multipath-sh):

Oh thanks!

I learnt "make TARGETS=net run_tests" doesn't run forwarding tests.

Will fix in v2.


> 
> [  922.307796][ T6194] =============================
> [  922.308069][ T6194] WARNING: suspicious RCU usage
> [  922.308339][ T6194] 6.15.0-rc5-virtme #1 Not tainted
> [  922.308596][ T6194] -----------------------------
> [  922.308860][ T6194] ./include/net/addrconf.h:347 suspicious rcu_dereference_check() usage!
> [  922.309352][ T6194]
> [  922.309352][ T6194] other info that might help us debug this:
> [  922.309352][ T6194]
> [  922.310105][ T6194]
> [  922.310105][ T6194] rcu_scheduler_active = 2, debug_locks = 1
> [  922.310501][ T6194] 1 lock held by ip/6194:
> [  922.310704][ T6194]  #0: ffff888012942630 (&tb->tb6_lock){+...}-{3:3}, at: ip6_route_multipath_add+0x743/0x1450
> [  922.311255][ T6194]
> [  922.311255][ T6194] stack backtrace:
> [  922.311577][ T6194] CPU: 1 UID: 0 PID: 6194 Comm: ip Not tainted 6.15.0-rc5-virtme #1 PREEMPT(full)
> [  922.311583][ T6194] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [  922.311585][ T6194] Call Trace:
> [  922.311589][ T6194]  <TASK>
> [  922.311591][ T6194]  dump_stack_lvl+0xb0/0xd0
> [  922.311605][ T6194]  lockdep_rcu_suspicious+0x166/0x270
> [  922.311619][ T6194]  rt6_multipath_rebalance.part.0+0x70c/0x8a0
> [  922.311628][ T6194]  fib6_add_rt2node+0xa36/0x2c00
> [  922.311668][ T6194]  fib6_add+0x38d/0xec0
> [  922.311699][ T6194]  ip6_route_multipath_add+0x75b/0x1450
> [  922.311753][ T6194]  inet6_rtm_newroute+0xb2/0x120
> [  922.311795][ T6194]  rtnetlink_rcv_msg+0x710/0xc00
> [  922.311819][ T6194]  netlink_rcv_skb+0x12f/0x360
> [  922.311869][ T6194]  netlink_unicast+0x449/0x710
> [  922.311891][ T6194]  netlink_sendmsg+0x721/0xbe0
> [  922.311922][ T6194]  ____sys_sendmsg+0x7aa/0xa10
> [  922.311954][ T6194]  ___sys_sendmsg+0xed/0x170
> [  922.312031][ T6194]  __sys_sendmsg+0x108/0x1a0
> [  922.312061][ T6194]  do_syscall_64+0xc1/0x1d0
> [  922.312069][ T6194]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [  922.312074][ T6194] RIP: 0033:0x7f8e77c649a7
> [  922.312078][ T6194] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
> [  922.312081][ T6194] RSP: 002b:00007ffd73480708 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [  922.312086][ T6194] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f8e77c649a7
> [  922.312088][ T6194] RDX: 0000000000000000 RSI: 00007ffd73480770 RDI: 0000000000000005
> [  922.312090][ T6194] RBP: 00007ffd73480abc R08: 0000000000000038 R09: 0000000000000000
> [  922.312092][ T6194] R10: 000000000b9c6910 R11: 0000000000000246 R12: 00007ffd73481a80
> [  922.312094][ T6194] R13: 00000000682562aa R14: 0000000000498600 R15: 00007ffd7348499b
> [  922.312108][ T6194]  </TASK>
> 
> see:
> 
> https://netdev.bots.linux.dev/contest.html?branch=net-next-2025-05-15--03-00&executor=vmksft-forwarding-dbg&pw-n=0&pass=0
> 
> Thanks,
> 
> Paolo

^ permalink raw reply	[flat|nested] 13+ messages in thread

Thread overview: 13+ messages
2025-05-14 20:18 [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Kuniyuki Iwashima
2025-05-14 20:18 ` [PATCH v1 net-next 1/7] ipv6: Remove rcu_read_lock() in fib6_get_table() Kuniyuki Iwashima
2025-05-14 20:18 ` [PATCH v1 net-next 2/7] inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?() Kuniyuki Iwashima
2025-05-14 20:18 ` [PATCH v1 net-next 3/7] ipv6: Narrow down RCU critical section in inet6_rtm_newroute() Kuniyuki Iwashima
2025-05-14 20:18 ` [PATCH v1 net-next 4/7] Revert "ipv6: sr: switch to GFP_ATOMIC flag to allocate memory during seg6local LWT setup" Kuniyuki Iwashima
2025-05-14 20:18 ` [PATCH v1 net-next 5/7] Revert "ipv6: Factorise ip6_route_multipath_add()." Kuniyuki Iwashima
2025-05-14 20:18 ` [PATCH v1 net-next 6/7] ipv6: Pass gfp_flags down to ip6_route_info_create_nh() Kuniyuki Iwashima
2025-05-14 20:19 ` [PATCH v1 net-next 7/7] ipv6: Revert two per-cpu var allocation for RTM_NEWROUTE Kuniyuki Iwashima
2025-05-15  1:45 ` [PATCH v1 net-next 0/7] ipv6: Follow up for RTNL-free RTM_NEWROUTE series Jakub Kicinski
2025-05-15  2:05   ` Kuniyuki Iwashima
2025-05-15  2:22     ` Jakub Kicinski
2025-05-15  9:02     ` Paolo Abeni
2025-05-15 16:46       ` Kuniyuki Iwashima
