netdev.vger.kernel.org archive mirror
* [PATCH v2 net 00/16] net: first round to use dev_net_rcu()
@ 2025-02-03 14:30 Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 01/16] net: add dev_net_rcu() helper Eric Dumazet
                   ` (15 more replies)
  0 siblings, 16 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

dev_net(dev) should be protected by either RTNL or RCU.

There is no LOCKDEP support yet for this helper, because
adding it would trigger too many splats.

Instead, add dev_net_rcu() and start using it, either to
fix bugs or to document call sites that were already
using dev_net() safely.
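
The difference between an unchecked and a checked accessor can be
sketched in userspace C. This is only an illustration: every name
below (sketch_net, sketch_dev, mock_rcu_read_lock(), splat_count, ...)
is a hypothetical stand-in, and the single-threaded depth counter
merely imitates the kind of check a debugging build could perform.

```c
#include <assert.h>

/* Hypothetical stand-ins for struct net and struct net_device. */
struct sketch_net { int id; };
struct sketch_dev { struct sketch_net *nd_net; };

static int rcu_depth;    /* mock read-side nesting depth */
static int splat_count;  /* unprotected reads seen by the checked helper */

static inline void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

static inline void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Unchecked accessor: a caller without protection goes unnoticed. */
static inline struct sketch_net *sketch_dev_net(const struct sketch_dev *dev)
{
	return dev->nd_net;
}

/* Checked accessor: counts a "splat" when no read-side section is active. */
static inline struct sketch_net *sketch_dev_net_rcu(const struct sketch_dev *dev)
{
	if (rcu_depth == 0)
		splat_count++;
	return dev->nd_net;
}
```

Both accessors return the same pointer; only the checked one makes a
missing read-side section visible, which is why call sites can be
converted one at a time.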

v2: Resend (one patch missed v1 train), plus minor fixes.

Eric Dumazet (16):
  net: add dev_net_rcu() helper
  ipv4: add RCU protection to ip4_dst_hoplimit()
  ipv4: use RCU protection in ip_dst_mtu_maybe_forward()
  ipv4: use RCU protection in ipv4_default_advmss()
  ipv4: use RCU protection in rt_is_expired()
  tcp: convert to dev_net_rcu()
  net: gro: convert four dev_net() calls
  udp: convert to dev_net_rcu()
  ipv4: icmp: convert to dev_net_rcu()
  ipv6: icmp: convert to dev_net_rcu()
  ipv6: input: convert to dev_net_rcu()
  ipv6: output: convert to dev_net_rcu()
  ipv6: use RCU protection in ip6_default_advmss()
  net: filter: convert to dev_net_rcu()
  flow_dissector: use rcu protection to fetch dev_net()
  ipv4: use RCU protection in inet_select_addr()

 include/linux/netdevice.h      |  6 +++++
 include/net/inet6_hashtables.h |  2 +-
 include/net/inet_hashtables.h  |  2 +-
 include/net/ip.h               | 13 ++++++++---
 include/net/net_namespace.h    |  2 +-
 include/net/route.h            |  9 ++++++--
 net/core/filter.c              | 40 +++++++++++++++++-----------------
 net/core/flow_dissector.c      | 21 +++++++++---------
 net/ipv4/devinet.c             |  3 ++-
 net/ipv4/icmp.c                | 22 +++++++++----------
 net/ipv4/route.c               | 19 ++++++++++++----
 net/ipv4/tcp_ipv4.c            |  8 +++----
 net/ipv4/tcp_metrics.c         |  6 ++---
 net/ipv4/tcp_offload.c         |  2 +-
 net/ipv4/udp.c                 | 19 ++++++++--------
 net/ipv4/udp_offload.c         |  2 +-
 net/ipv6/icmp.c                | 22 +++++++++----------
 net/ipv6/ip6_input.c           | 12 +++++-----
 net/ipv6/ip6_output.c          |  4 ++--
 net/ipv6/output_core.c         |  2 +-
 net/ipv6/route.c               |  7 +++++-
 net/ipv6/tcp_ipv6.c            | 10 ++++-----
 net/ipv6/tcpv6_offload.c       |  2 +-
 net/ipv6/udp.c                 | 18 +++++++--------
 net/ipv6/udp_offload.c         |  2 +-
 25 files changed, 146 insertions(+), 109 deletions(-)

-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 01/16] net: add dev_net_rcu() helper
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 02/16] ipv4: add RCU protection to ip4_dst_hoplimit() Eric Dumazet
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

dev->nd_net can change, so readers should hold either
rcu_read_lock() or RTNL.

We currently use a generic helper, dev_net(), with
no debugging support. We probably have many hidden bugs.

Add a dev_net_rcu() helper for callers using rcu_read_lock()
protection.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/linux/netdevice.h   | 6 ++++++
 include/net/net_namespace.h | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2a59034a5fa2fb53300657968c2053ab354bb746..046015adf2856f859b9a671e2be4ef674125ef96 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2663,6 +2663,12 @@ struct net *dev_net(const struct net_device *dev)
 	return read_pnet(&dev->nd_net);
 }
 
+static inline
+struct net *dev_net_rcu(const struct net_device *dev)
+{
+	return read_pnet_rcu(&dev->nd_net);
+}
+
 static inline
 void dev_net_set(struct net_device *dev, struct net *net)
 {
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 0f5eb9db0c6264efc1ac83ab577511fd6823f4fe..7ba1402ca7796663bed3373b1a0c6a0249cd1599 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -398,7 +398,7 @@ static inline struct net *read_pnet(const possible_net_t *pnet)
 #endif
 }
 
-static inline struct net *read_pnet_rcu(possible_net_t *pnet)
+static inline struct net *read_pnet_rcu(const possible_net_t *pnet)
 {
 #ifdef CONFIG_NET_NS
 	return rcu_dereference(pnet->net);
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 02/16] ipv4: add RCU protection to ip4_dst_hoplimit()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 01/16] net: add dev_net_rcu() helper Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 03/16] ipv4: use RCU protection in ip_dst_mtu_maybe_forward() Eric Dumazet
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

ip4_dst_hoplimit() must use RCU protection to make
sure the net structure it reads does not disappear.

Fixes: fa50d974d104 ("ipv4: Namespaceify ip_default_ttl sysctl knob")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/route.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
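
A minimal userspace sketch of the pattern this patch applies: take the
read-side lock only on the slow path, copy the value out, then unlock
before returning. All names here (sketch_dst, default_ttl,
mock_rcu_read_lock(), ...) are hypothetical stand-ins, and the depth
counter is a single-threaded imitation of a read-side section.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel structures involved. */
struct sketch_net { int default_ttl; };
struct sketch_dev { struct sketch_net *nd_net; };
struct sketch_dst { struct sketch_dev *dev; int hoplimit_metric; };

static int rcu_depth;  /* mock read-side nesting depth */

static inline void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

static inline void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Checked accessor: the caller must be inside a read-side section. */
static inline const struct sketch_net *
sketch_dev_net_rcu(const struct sketch_dev *dev)
{
	assert(rcu_depth > 0);
	return dev->nd_net;
}

static inline int sketch_dst_hoplimit(const struct sketch_dst *dst)
{
	int hoplimit = dst->hoplimit_metric;

	if (hoplimit == 0) {
		const struct sketch_net *net;

		mock_rcu_read_lock();
		net = sketch_dev_net_rcu(dst->dev);
		/* Copy the value out; net must not be used after unlock. */
		hoplimit = net->default_ttl;
		mock_rcu_read_unlock();
	}
	return hoplimit;
}
```

Only the plain integer leaves the critical section, so nothing
dereferences the namespace pointer once protection is dropped.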

diff --git a/include/net/route.h b/include/net/route.h
index f86775be3e2934697533a61f566aca1ef196d74e..c605fd5ec0c08cc7658c3cf6aa6223790d463ede 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -382,10 +382,15 @@ static inline int inet_iif(const struct sk_buff *skb)
 static inline int ip4_dst_hoplimit(const struct dst_entry *dst)
 {
 	int hoplimit = dst_metric_raw(dst, RTAX_HOPLIMIT);
-	struct net *net = dev_net(dst->dev);
 
-	if (hoplimit == 0)
+	if (hoplimit == 0) {
+		const struct net *net;
+
+		rcu_read_lock();
+		net = dev_net_rcu(dst->dev);
 		hoplimit = READ_ONCE(net->ipv4.sysctl_ip_default_ttl);
+		rcu_read_unlock();
+	}
 	return hoplimit;
 }
 
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 03/16] ipv4: use RCU protection in ip_dst_mtu_maybe_forward()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 01/16] net: add dev_net_rcu() helper Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 02/16] ipv4: add RCU protection to ip4_dst_hoplimit() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 04/16] ipv4: use RCU protection in ipv4_default_advmss() Eric Dumazet
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

ip_dst_mtu_maybe_forward() must use RCU protection to make
sure the net structure it reads does not disappear.

Fixes: f87c10a8aa1e8 ("ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/ip.h | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)
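
A simplified userspace sketch of the control flow this patch aims
for: one lock at entry, one unlock at the single exit, with the result
stored in a local before unlocking. It is loosely modeled on the hunks
below and omits most of the real function; the fields (pmtu, dev_mtu,
headroom), SKETCH_MAX_MTU and the mock lock helpers are assumptions
made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_MTU 65535u  /* hypothetical upper bound */

struct sketch_net { bool fwd_use_pmtu; };
struct sketch_dev { struct sketch_net *nd_net; };
struct sketch_dst {
	struct sketch_dev *dev;
	unsigned int pmtu;      /* 0 means "no path MTU recorded" */
	unsigned int dev_mtu;
	unsigned int headroom;
};

static int rcu_depth;  /* mock read-side nesting depth */

static inline void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

static inline void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

static inline const struct sketch_net *
sketch_dev_net_rcu(const struct sketch_dev *dev)
{
	assert(rcu_depth > 0);
	return dev->nd_net;
}

static inline unsigned int
sketch_mtu_maybe_forward(const struct sketch_dst *dst, bool forwarding)
{
	const struct sketch_net *net;
	unsigned int mtu, res;

	mock_rcu_read_lock();

	net = sketch_dev_net_rcu(dst->dev);
	if (net->fwd_use_pmtu || !forwarding) {
		mtu = dst->pmtu;
		if (mtu)
			goto out;
	}
	mtu = dst->dev_mtu;
out:
	if (mtu > SKETCH_MAX_MTU)
		mtu = SKETCH_MAX_MTU;
	/* Compute the result first so every path shares one unlock. */
	res = mtu - dst->headroom;

	mock_rcu_read_unlock();

	return res;
}
```

Routing every path through the same label keeps lock and unlock
balanced even when an early exit is taken.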

diff --git a/include/net/ip.h b/include/net/ip.h
index 9f5e33e371fcdd8ea88c54584b8d4b6c50e7d0c9..ba7b43447775e51b3b9a8cbf5c3345d6308bb525 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -471,9 +471,12 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
 						    bool forwarding)
 {
 	const struct rtable *rt = dst_rtable(dst);
-	struct net *net = dev_net(dst->dev);
-	unsigned int mtu;
+	unsigned int mtu, res;
+	struct net *net;
+
+	rcu_read_lock();
 
+	net = dev_net_rcu(dst->dev);
 	if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) ||
 	    ip_mtu_locked(dst) ||
 	    !forwarding) {
@@ -497,7 +500,11 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
 out:
 	mtu = min_t(unsigned int, mtu, IP_MAX_MTU);
 
-	return mtu - lwtunnel_headroom(dst->lwtstate, mtu);
+	res = mtu - lwtunnel_headroom(dst->lwtstate, mtu);
+
+	rcu_read_unlock();
+
+	return res;
 }
 
 static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 04/16] ipv4: use RCU protection in ipv4_default_advmss()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (2 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 03/16] ipv4: use RCU protection in ip_dst_mtu_maybe_forward() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 05/16] ipv4: use RCU protection in rt_is_expired() Eric Dumazet
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

ipv4_default_advmss() must use RCU protection to make
sure the net structure it reads does not disappear.

Fixes: 2e9589ff809e ("ipv4: Namespaceify min_adv_mss sysctl knob")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv4/route.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 577b88a43293aa801c3ee736d7e5cc4d97917717..74c074f45758be5ae78a87edb31837481cc40278 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1307,10 +1307,15 @@ static void set_class_tag(struct rtable *rt, u32 tag)
 
 static unsigned int ipv4_default_advmss(const struct dst_entry *dst)
 {
-	struct net *net = dev_net(dst->dev);
 	unsigned int header_size = sizeof(struct tcphdr) + sizeof(struct iphdr);
-	unsigned int advmss = max_t(unsigned int, ipv4_mtu(dst) - header_size,
-				    net->ipv4.ip_rt_min_advmss);
+	unsigned int advmss;
+	struct net *net;
+
+	rcu_read_lock();
+	net = dev_net_rcu(dst->dev);
+	advmss = max_t(unsigned int, ipv4_mtu(dst) - header_size,
+				   net->ipv4.ip_rt_min_advmss);
+	rcu_read_unlock();
 
 	return min(advmss, IPV4_MAX_PMTU - header_size);
 }
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 05/16] ipv4: use RCU protection in rt_is_expired()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (3 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 04/16] ipv4: use RCU protection in ipv4_default_advmss() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 06/16] tcp: convert to dev_net_rcu() Eric Dumazet
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

rt_is_expired() must use RCU protection to make
sure the net structure it reads does not disappear.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv4/route.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 74c074f45758be5ae78a87edb31837481cc40278..e959327c0ba8979ce5c7ca8c46ae41068824edc6 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -390,7 +390,13 @@ static inline int ip_rt_proc_init(void)
 
 static inline bool rt_is_expired(const struct rtable *rth)
 {
-	return rth->rt_genid != rt_genid_ipv4(dev_net(rth->dst.dev));
+	bool res;
+
+	rcu_read_lock();
+	res = rth->rt_genid != rt_genid_ipv4(dev_net_rcu(rth->dst.dev));
+	rcu_read_unlock();
+
+	return res;
 }
 
 void rt_cache_flush(struct net *net)
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 06/16] tcp: convert to dev_net_rcu()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (4 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 05/16] ipv4: use RCU protection in rt_is_expired() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 07/16] net: gro: convert four dev_net() calls Eric Dumazet
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

TCP uses of dev_net() are safe; change them to dev_net_rcu()
to get LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/inet6_hashtables.h |  2 +-
 include/net/inet_hashtables.h  |  2 +-
 net/ipv4/tcp_ipv4.c            |  8 ++++----
 net/ipv4/tcp_metrics.c         |  6 +++---
 net/ipv6/tcp_ipv6.c            | 10 +++++-----
 5 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
index 74dd90ff5f129fe4c8adad67a642ae5070410518..c32878c69179dac5a7fcfa098a297420d9adfab2 100644
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -150,7 +150,7 @@ static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo,
 					      int iif, int sdif,
 					      bool *refcounted)
 {
-	struct net *net = dev_net(skb_dst(skb)->dev);
+	struct net *net = dev_net_rcu(skb_dst(skb)->dev);
 	const struct ipv6hdr *ip6h = ipv6_hdr(skb);
 	struct sock *sk;
 
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 5eea47f135a421ce8275d4cd83c5771b3f448e5c..da818fb0205fed6b4120946bc032e67e046b716f 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -492,7 +492,7 @@ static inline struct sock *__inet_lookup_skb(struct inet_hashinfo *hashinfo,
 					     const int sdif,
 					     bool *refcounted)
 {
-	struct net *net = dev_net(skb_dst(skb)->dev);
+	struct net *net = dev_net_rcu(skb_dst(skb)->dev);
 	const struct iphdr *iph = ip_hdr(skb);
 	struct sock *sk;
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index cc2b5194a18d2e64595f474f62c6f2fd3eff319f..3bd835220d43d6d6491fd5c8d5e9954c37303f83 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -503,7 +503,7 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
 	struct request_sock *fastopen;
 	u32 seq, snd_una;
 	int err;
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 
 	sk = __inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo,
 				       iph->daddr, th->dest, iph->saddr,
@@ -788,7 +788,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb,
 	arg.iov[0].iov_base = (unsigned char *)&rep;
 	arg.iov[0].iov_len  = sizeof(rep.th);
 
-	net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev);
+	net = sk ? sock_net(sk) : dev_net_rcu(skb_dst(skb)->dev);
 
 	/* Invalid TCP option size or twice included auth */
 	if (tcp_parse_auth_options(tcp_hdr(skb), &md5_hash_location, &aoh))
@@ -1967,7 +1967,7 @@ EXPORT_SYMBOL(tcp_v4_do_rcv);
 
 int tcp_v4_early_demux(struct sk_buff *skb)
 {
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	const struct iphdr *iph;
 	const struct tcphdr *th;
 	struct sock *sk;
@@ -2178,7 +2178,7 @@ static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
 
 int tcp_v4_rcv(struct sk_buff *skb)
 {
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	enum skb_drop_reason drop_reason;
 	int sdif = inet_sdif(skb);
 	int dif = inet_iif(skb);
diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
index 95669935494ef8003a1877e2b86c76bd27307afd..4251670e328c83b55eff7bbda3cc3d97d78563a8 100644
--- a/net/ipv4/tcp_metrics.c
+++ b/net/ipv4/tcp_metrics.c
@@ -170,7 +170,7 @@ static struct tcp_metrics_block *tcpm_new(struct dst_entry *dst,
 	bool reclaim = false;
 
 	spin_lock_bh(&tcp_metrics_lock);
-	net = dev_net(dst->dev);
+	net = dev_net_rcu(dst->dev);
 
 	/* While waiting for the spin-lock the cache might have been populated
 	 * with this entry and so we have to check again.
@@ -273,7 +273,7 @@ static struct tcp_metrics_block *__tcp_get_metrics_req(struct request_sock *req,
 		return NULL;
 	}
 
-	net = dev_net(dst->dev);
+	net = dev_net_rcu(dst->dev);
 	hash ^= net_hash_mix(net);
 	hash = hash_32(hash, tcp_metrics_hash_log);
 
@@ -318,7 +318,7 @@ static struct tcp_metrics_block *tcp_get_metrics(struct sock *sk,
 	else
 		return NULL;
 
-	net = dev_net(dst->dev);
+	net = dev_net_rcu(dst->dev);
 	hash ^= net_hash_mix(net);
 	hash = hash_32(hash, tcp_metrics_hash_log);
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 2debdf085a3b4d2452b2b316cb5368507b17efc8..429f8a5ab511b671aa405ae20f7c1b3163839779 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -376,7 +376,7 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 {
 	const struct ipv6hdr *hdr = (const struct ipv6hdr *)skb->data;
 	const struct tcphdr *th = (struct tcphdr *)(skb->data+offset);
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	struct request_sock *fastopen;
 	struct ipv6_pinfo *np;
 	struct tcp_sock *tp;
@@ -868,7 +868,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
 	struct tcphdr *t1;
 	struct sk_buff *buff;
 	struct flowi6 fl6;
-	struct net *net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev);
+	struct net *net = sk ? sock_net(sk) : dev_net_rcu(skb_dst(skb)->dev);
 	struct sock *ctl_sk = net->ipv6.tcp_sk;
 	unsigned int tot_len = sizeof(struct tcphdr);
 	__be32 mrst = 0, *topt;
@@ -1039,7 +1039,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb,
 	if (!sk && !ipv6_unicast_destination(skb))
 		return;
 
-	net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev);
+	net = sk ? sock_net(sk) : dev_net_rcu(skb_dst(skb)->dev);
 	/* Invalid TCP option size or twice included auth */
 	if (tcp_parse_auth_options(th, &md5_hash_location, &aoh))
 		return;
@@ -1744,6 +1744,7 @@ static void tcp_v6_fill_cb(struct sk_buff *skb, const struct ipv6hdr *hdr,
 
 INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 {
+	struct net *net = dev_net_rcu(skb->dev);
 	enum skb_drop_reason drop_reason;
 	int sdif = inet6_sdif(skb);
 	int dif = inet6_iif(skb);
@@ -1753,7 +1754,6 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 	bool refcounted;
 	int ret;
 	u32 isn;
-	struct net *net = dev_net(skb->dev);
 
 	drop_reason = SKB_DROP_REASON_NOT_SPECIFIED;
 	if (skb->pkt_type != PACKET_HOST)
@@ -2004,7 +2004,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 
 void tcp_v6_early_demux(struct sk_buff *skb)
 {
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	const struct ipv6hdr *hdr;
 	const struct tcphdr *th;
 	struct sock *sk;
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 07/16] net: gro: convert four dev_net() calls
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (5 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 06/16] tcp: convert to dev_net_rcu() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 08/16] udp: convert to dev_net_rcu() Eric Dumazet
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

tcp4_check_fraglist_gro(), tcp6_check_fraglist_gro(),
udp4_gro_lookup_skb() and udp6_gro_lookup_skb()
assume RCU is held so that the net structure does not disappear.

Use dev_net_rcu() instead of dev_net() to get LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv4/tcp_offload.c   | 2 +-
 net/ipv4/udp_offload.c   | 2 +-
 net/ipv6/tcpv6_offload.c | 2 +-
 net/ipv6/udp_offload.c   | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
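
A userspace sketch of the caller/callee contract described above: the
lookup helper takes no lock of its own and relies on its caller to
hold the read-side section. The names (sketch_skb, sketch_gro_lookup(),
sketch_receive(), unprotected_reads) are hypothetical, and the counter
only imitates what a debugging check might report.

```c
#include <assert.h>

/* Hypothetical stand-ins for struct net, net_device and sk_buff. */
struct sketch_net { int id; };
struct sketch_dev { struct sketch_net *nd_net; };
struct sketch_skb { struct sketch_dev *dev; };

static int rcu_depth;          /* mock read-side nesting depth */
static int unprotected_reads;  /* reads made outside a read-side section */

static inline void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

static inline void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

static inline struct sketch_net *
sketch_dev_net_rcu(const struct sketch_dev *dev)
{
	if (rcu_depth == 0)
		unprotected_reads++;
	return dev->nd_net;
}

/* Callee: assumes the caller already holds the read-side lock. */
static inline int sketch_gro_lookup(const struct sketch_skb *skb)
{
	return sketch_dev_net_rcu(skb->dev)->id;
}

/* Caller: holds the lock across the whole receive step. */
static inline int sketch_receive(const struct sketch_skb *skb)
{
	int id;

	mock_rcu_read_lock();
	id = sketch_gro_lookup(skb);
	mock_rcu_read_unlock();
	return id;
}
```

With the checked accessor in the callee, a future caller that forgets
the lock is reported instead of silently racing.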

diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 2308665b51c5388814e5b61a262a1636d897c4a9..ecef16c58c07146cbeebade0620a5ec7251ddbc5 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -425,7 +425,7 @@ static void tcp4_check_fraglist_gro(struct list_head *head, struct sk_buff *skb,
 
 	inet_get_iif_sdif(skb, &iif, &sdif);
 	iph = skb_gro_network_header(skb);
-	net = dev_net(skb->dev);
+	net = dev_net_rcu(skb->dev);
 	sk = __inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo,
 				       iph->saddr, th->source,
 				       iph->daddr, ntohs(th->dest),
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index a5be6e4ed326fbdc6a9b3889db4da903f7f25d37..c1a85b300ee87758ee683a834248a600a3e7f18d 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -630,7 +630,7 @@ static struct sock *udp4_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
 					__be16 dport)
 {
 	const struct iphdr *iph = skb_gro_network_header(skb);
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	int iif, sdif;
 
 	inet_get_iif_sdif(skb, &iif, &sdif);
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index a45bf17cb2a172d4612cb42f51481b97bbf364cd..91b88daa5b555cb1af591db7680b7d829ce7b1b7 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -35,7 +35,7 @@ static void tcp6_check_fraglist_gro(struct list_head *head, struct sk_buff *skb,
 
 	inet6_get_iif_sdif(skb, &iif, &sdif);
 	hdr = skb_gro_network_header(skb);
-	net = dev_net(skb->dev);
+	net = dev_net_rcu(skb->dev);
 	sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo,
 					&hdr->saddr, th->source,
 					&hdr->daddr, ntohs(th->dest),
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index b41152dd424697a9fc3cef13fbb430de49dcb913..404212dfc99abba4d48fc27a574b48ab53731d39 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -117,7 +117,7 @@ static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
 					__be16 dport)
 {
 	const struct ipv6hdr *iph = skb_gro_network_header(skb);
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	int iif, sdif;
 
 	inet6_get_iif_sdif(skb, &iif, &sdif);
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 08/16] udp: convert to dev_net_rcu()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (6 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 07/16] net: gro: convert four dev_net() calls Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 09/16] ipv4: icmp: " Eric Dumazet
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

UDP uses of dev_net() are safe; change them to dev_net_rcu()
to get LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv4/udp.c | 19 ++++++++++---------
 net/ipv6/udp.c | 18 +++++++++---------
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index a9bb9ce5438eaa9f9ceede1e4ac080dc6ab74588..fc1e37eb49190cb7e2671ebd54ac4fca54b77ac2 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -750,7 +750,7 @@ static inline struct sock *__udp4_lib_lookup_skb(struct sk_buff *skb,
 {
 	const struct iphdr *iph = ip_hdr(skb);
 
-	return __udp4_lib_lookup(dev_net(skb->dev), iph->saddr, sport,
+	return __udp4_lib_lookup(dev_net_rcu(skb->dev), iph->saddr, sport,
 				 iph->daddr, dport, inet_iif(skb),
 				 inet_sdif(skb), udptable, skb);
 }
@@ -760,7 +760,7 @@ struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
 {
 	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
 	const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	int iif, sdif;
 
 	inet_get_iif_sdif(skb, &iif, &sdif);
@@ -934,13 +934,13 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
 	struct inet_sock *inet;
 	const struct iphdr *iph = (const struct iphdr *)skb->data;
 	struct udphdr *uh = (struct udphdr *)(skb->data+(iph->ihl<<2));
+	struct net *net = dev_net_rcu(skb->dev);
 	const int type = icmp_hdr(skb)->type;
 	const int code = icmp_hdr(skb)->code;
 	bool tunnel = false;
 	struct sock *sk;
 	int harderr;
 	int err;
-	struct net *net = dev_net(skb->dev);
 
 	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
 			       iph->saddr, uh->source, skb->dev->ifindex,
@@ -1025,7 +1025,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
 
 int udp_err(struct sk_buff *skb, u32 info)
 {
-	return __udp4_lib_err(skb, info, dev_net(skb->dev)->ipv4.udp_table);
+	return __udp4_lib_err(skb, info, dev_net_rcu(skb->dev)->ipv4.udp_table);
 }
 
 /*
@@ -2466,7 +2466,7 @@ static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 		udp_post_segment_fix_csum(skb);
 		ret = udp_queue_rcv_one_skb(sk, skb);
 		if (ret > 0)
-			ip_protocol_deliver_rcu(dev_net(skb->dev), skb, ret);
+			ip_protocol_deliver_rcu(dev_net_rcu(skb->dev), skb, ret);
 	}
 	return 0;
 }
@@ -2632,12 +2632,12 @@ static int udp_unicast_rcv_skb(struct sock *sk, struct sk_buff *skb,
 int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 		   int proto)
 {
+	struct net *net = dev_net_rcu(skb->dev);
+	struct rtable *rt = skb_rtable(skb);
 	struct sock *sk = NULL;
 	struct udphdr *uh;
 	unsigned short ulen;
-	struct rtable *rt = skb_rtable(skb);
 	__be32 saddr, daddr;
-	struct net *net = dev_net(skb->dev);
 	bool refcounted;
 	int drop_reason;
 
@@ -2804,7 +2804,7 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
 
 int udp_v4_early_demux(struct sk_buff *skb)
 {
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	struct in_device *in_dev = NULL;
 	const struct iphdr *iph;
 	const struct udphdr *uh;
@@ -2873,7 +2873,8 @@ int udp_v4_early_demux(struct sk_buff *skb)
 
 int udp_rcv(struct sk_buff *skb)
 {
-	return __udp4_lib_rcv(skb, dev_net(skb->dev)->ipv4.udp_table, IPPROTO_UDP);
+	return __udp4_lib_rcv(skb, dev_net_rcu(skb->dev)->ipv4.udp_table,
+			      IPPROTO_UDP);
 }
 
 void udp_destroy_sock(struct sock *sk)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index c6ea438b5c7588edd2971997f21382c26446a45c..d0b8f724e4362ec35352dae547e916c912716cab 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -410,7 +410,7 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
 {
 	const struct ipv6hdr *iph = ipv6_hdr(skb);
 
-	return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport,
+	return __udp6_lib_lookup(dev_net_rcu(skb->dev), &iph->saddr, sport,
 				 &iph->daddr, dport, inet6_iif(skb),
 				 inet6_sdif(skb), udptable, skb);
 }
@@ -420,7 +420,7 @@ struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 {
 	const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
 	const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	int iif, sdif;
 
 	inet6_get_iif_sdif(skb, &iif, &sdif);
@@ -702,16 +702,16 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		   u8 type, u8 code, int offset, __be32 info,
 		   struct udp_table *udptable)
 {
-	struct ipv6_pinfo *np;
 	const struct ipv6hdr *hdr = (const struct ipv6hdr *)skb->data;
 	const struct in6_addr *saddr = &hdr->saddr;
 	const struct in6_addr *daddr = seg6_get_daddr(skb, opt) ? : &hdr->daddr;
 	struct udphdr *uh = (struct udphdr *)(skb->data+offset);
+	struct net *net = dev_net_rcu(skb->dev);
+	struct ipv6_pinfo *np;
 	bool tunnel = false;
 	struct sock *sk;
 	int harderr;
 	int err;
-	struct net *net = dev_net(skb->dev);
 
 	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
 			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
@@ -818,7 +818,7 @@ static __inline__ int udpv6_err(struct sk_buff *skb,
 				u8 code, int offset, __be32 info)
 {
 	return __udp6_lib_err(skb, opt, type, code, offset, info,
-			      dev_net(skb->dev)->ipv4.udp_table);
+			      dev_net_rcu(skb->dev)->ipv4.udp_table);
 }
 
 static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
@@ -929,7 +929,7 @@ static int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 		udp_post_segment_fix_csum(skb);
 		ret = udpv6_queue_rcv_one_skb(sk, skb);
 		if (ret > 0)
-			ip6_protocol_deliver_rcu(dev_net(skb->dev), skb, ret,
+			ip6_protocol_deliver_rcu(dev_net_rcu(skb->dev), skb, ret,
 						 true);
 	}
 	return 0;
@@ -1071,8 +1071,8 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 		   int proto)
 {
 	enum skb_drop_reason reason = SKB_DROP_REASON_NOT_SPECIFIED;
+	struct net *net = dev_net_rcu(skb->dev);
 	const struct in6_addr *saddr, *daddr;
-	struct net *net = dev_net(skb->dev);
 	struct sock *sk = NULL;
 	struct udphdr *uh;
 	bool refcounted;
@@ -1220,7 +1220,7 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,
 
 void udp_v6_early_demux(struct sk_buff *skb)
 {
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	const struct udphdr *uh;
 	struct sock *sk;
 	struct dst_entry *dst;
@@ -1262,7 +1262,7 @@ void udp_v6_early_demux(struct sk_buff *skb)
 
 INDIRECT_CALLABLE_SCOPE int udpv6_rcv(struct sk_buff *skb)
 {
-	return __udp6_lib_rcv(skb, dev_net(skb->dev)->ipv4.udp_table, IPPROTO_UDP);
+	return __udp6_lib_rcv(skb, dev_net_rcu(skb->dev)->ipv4.udp_table, IPPROTO_UDP);
 }
 
 /*
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 09/16] ipv4: icmp: convert to dev_net_rcu()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (7 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 08/16] udp: convert to dev_net_rcu() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 23:36   ` Jakub Kicinski
  2025-02-03 14:30 ` [PATCH v2 net 10/16] ipv6: " Eric Dumazet
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

ICMP uses of dev_net() are safe; change them to dev_net_rcu()
to get LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv4/icmp.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)
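
A minimal userspace sketch of what "LOCKDEP support" buys here, assuming
dev_net_rcu() differs from dev_net() only in checking that the caller is
inside an RCU read-side section. Every model_* name is a hypothetical
stand-in, not a kernel API; a plain counter replaces LOCKDEP, and a "splat"
is just an incremented integer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
struct net { int id; };
struct net_device { struct net *nd_net; };

static int model_rcu_depth;	/* nesting depth of read-side sections */
static int model_splats;	/* checked accesses made outside any section */

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

/* Models dev_net(): returns the namespace, verifies nothing. */
static struct net *model_dev_net(const struct net_device *dev)
{
	return dev->nd_net;
}

/*
 * Models dev_net_rcu(): same result, but records a "splat" when the
 * caller is not inside a read-side section.
 */
static struct net *model_dev_net_rcu(const struct net_device *dev)
{
	if (model_rcu_depth == 0)
		model_splats++;
	return dev->nd_net;
}
```

In this model, calling model_dev_net_rcu() outside a
model_rcu_read_lock()/model_rcu_read_unlock() pair bumps model_splats,
while model_dev_net() never does. That is the sense in which the checked
accessor documents, and verifies, its call sites.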

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 094084b61bff8a17c4e85c99019b84e9cba21599..19bf8edd6759872fe667af82790b77b01212271b 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -401,7 +401,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb)
 {
 	struct ipcm_cookie ipc;
 	struct rtable *rt = skb_rtable(skb);
-	struct net *net = dev_net(rt->dst.dev);
+	struct net *net = dev_net_rcu(rt->dst.dev);
 	bool apply_ratelimit = false;
 	struct flowi4 fl4;
 	struct sock *sk;
@@ -611,9 +611,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
 		goto out;
 
 	if (rt->dst.dev)
-		net = dev_net(rt->dst.dev);
+		net = dev_net_rcu(rt->dst.dev);
 	else if (skb_in->dev)
-		net = dev_net(skb_in->dev);
+		net = dev_net_rcu(skb_in->dev);
 	else
 		goto out;
 
@@ -834,7 +834,7 @@ static void icmp_socket_deliver(struct sk_buff *skb, u32 info)
 	 * avoid additional coding at protocol handlers.
 	 */
 	if (!pskb_may_pull(skb, iph->ihl * 4 + 8)) {
-		__ICMP_INC_STATS(dev_net(skb->dev), ICMP_MIB_INERRORS);
+		__ICMP_INC_STATS(dev_net_rcu(skb->dev), ICMP_MIB_INERRORS);
 		return;
 	}
 
@@ -868,7 +868,7 @@ static enum skb_drop_reason icmp_unreach(struct sk_buff *skb)
 	struct net *net;
 	u32 info = 0;
 
-	net = dev_net(skb_dst(skb)->dev);
+	net = dev_net_rcu(skb_dst(skb)->dev);
 
 	/*
 	 *	Incomplete header ?
@@ -979,7 +979,7 @@ static enum skb_drop_reason icmp_unreach(struct sk_buff *skb)
 static enum skb_drop_reason icmp_redirect(struct sk_buff *skb)
 {
 	if (skb->len < sizeof(struct iphdr)) {
-		__ICMP_INC_STATS(dev_net(skb->dev), ICMP_MIB_INERRORS);
+		__ICMP_INC_STATS(dev_net_rcu(skb->dev), ICMP_MIB_INERRORS);
 		return SKB_DROP_REASON_PKT_TOO_SMALL;
 	}
 
@@ -1011,7 +1011,7 @@ static enum skb_drop_reason icmp_echo(struct sk_buff *skb)
 	struct icmp_bxm icmp_param;
 	struct net *net;
 
-	net = dev_net(skb_dst(skb)->dev);
+	net = dev_net_rcu(skb_dst(skb)->dev);
 	/* should there be an ICMP stat for ignored echos? */
 	if (READ_ONCE(net->ipv4.sysctl_icmp_echo_ignore_all))
 		return SKB_NOT_DROPPED_YET;
@@ -1040,9 +1040,9 @@ static enum skb_drop_reason icmp_echo(struct sk_buff *skb)
 
 bool icmp_build_probe(struct sk_buff *skb, struct icmphdr *icmphdr)
 {
+	struct net *net = dev_net_rcu(skb->dev);
 	struct icmp_ext_hdr *ext_hdr, _ext_hdr;
 	struct icmp_ext_echo_iio *iio, _iio;
-	struct net *net = dev_net(skb->dev);
 	struct inet6_dev *in6_dev;
 	struct in_device *in_dev;
 	struct net_device *dev;
@@ -1181,7 +1181,7 @@ static enum skb_drop_reason icmp_timestamp(struct sk_buff *skb)
 	return SKB_NOT_DROPPED_YET;
 
 out_err:
-	__ICMP_INC_STATS(dev_net(skb_dst(skb)->dev), ICMP_MIB_INERRORS);
+	__ICMP_INC_STATS(dev_net_rcu(skb_dst(skb)->dev), ICMP_MIB_INERRORS);
 	return SKB_DROP_REASON_PKT_TOO_SMALL;
 }
 
@@ -1198,7 +1198,7 @@ int icmp_rcv(struct sk_buff *skb)
 {
 	enum skb_drop_reason reason = SKB_DROP_REASON_NOT_SPECIFIED;
 	struct rtable *rt = skb_rtable(skb);
-	struct net *net = dev_net(rt->dst.dev);
+	struct net *net = dev_net_rcu(rt->dst.dev);
 	struct icmphdr *icmph;
 
 	if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb)) {
@@ -1371,9 +1371,9 @@ int icmp_err(struct sk_buff *skb, u32 info)
 	struct iphdr *iph = (struct iphdr *)skb->data;
 	int offset = iph->ihl<<2;
 	struct icmphdr *icmph = (struct icmphdr *)(skb->data + offset);
+	struct net *net = dev_net_rcu(skb->dev);
 	int type = icmp_hdr(skb)->type;
 	int code = icmp_hdr(skb)->code;
-	struct net *net = dev_net(skb->dev);
 
 	/*
 	 * Use ping_err to handle all icmp errors except those
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 10/16] ipv6: icmp: convert to dev_net_rcu()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (8 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 09/16] ipv4: icmp: " Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 11/16] ipv6: input: " Eric Dumazet
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

ICMP uses of dev_net() are safe; change them to dev_net_rcu()
to get LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/icmp.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index a6984a29fdb9dd972a11ca9f8d5e794c443bac6f..cb9ba5d8b6bab340fd4900f2fa99baa1ebeacb0f 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -76,7 +76,7 @@ static int icmpv6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 {
 	/* icmpv6_notify checks 8 bytes can be pulled, icmp6hdr is 8 bytes */
 	struct icmp6hdr *icmp6 = (struct icmp6hdr *) (skb->data + offset);
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 
 	if (type == ICMPV6_PKT_TOOBIG)
 		ip6_update_pmtu(skb, net, info, skb->dev->ifindex, 0, sock_net_uid(net, NULL));
@@ -473,7 +473,7 @@ void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info,
 
 	if (!skb->dev)
 		return;
-	net = dev_net(skb->dev);
+	net = dev_net_rcu(skb->dev);
 	mark = IP6_REPLY_MARK(net, skb->mark);
 	/*
 	 *	Make sure we respect the rules
@@ -679,8 +679,8 @@ int ip6_err_gen_icmpv6_unreach(struct sk_buff *skb, int nhs, int type,
 	skb_pull(skb2, nhs);
 	skb_reset_network_header(skb2);
 
-	rt = rt6_lookup(dev_net(skb->dev), &ipv6_hdr(skb2)->saddr, NULL, 0,
-			skb, 0);
+	rt = rt6_lookup(dev_net_rcu(skb->dev), &ipv6_hdr(skb2)->saddr,
+			NULL, 0, skb, 0);
 
 	if (rt && rt->dst.dev)
 		skb2->dev = rt->dst.dev;
@@ -717,7 +717,7 @@ EXPORT_SYMBOL(ip6_err_gen_icmpv6_unreach);
 
 static enum skb_drop_reason icmpv6_echo_reply(struct sk_buff *skb)
 {
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	struct sock *sk;
 	struct inet6_dev *idev;
 	struct ipv6_pinfo *np;
@@ -832,7 +832,7 @@ enum skb_drop_reason icmpv6_notify(struct sk_buff *skb, u8 type,
 				   u8 code, __be32 info)
 {
 	struct inet6_skb_parm *opt = IP6CB(skb);
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	const struct inet6_protocol *ipprot;
 	enum skb_drop_reason reason;
 	int inner_offset;
@@ -889,7 +889,7 @@ enum skb_drop_reason icmpv6_notify(struct sk_buff *skb, u8 type,
 static int icmpv6_rcv(struct sk_buff *skb)
 {
 	enum skb_drop_reason reason = SKB_DROP_REASON_NOT_SPECIFIED;
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	struct net_device *dev = icmp6_dev(skb);
 	struct inet6_dev *idev = __in6_dev_get(dev);
 	const struct in6_addr *saddr, *daddr;
@@ -921,7 +921,7 @@ static int icmpv6_rcv(struct sk_buff *skb)
 		skb_set_network_header(skb, nh);
 	}
 
-	__ICMP6_INC_STATS(dev_net(dev), idev, ICMP6_MIB_INMSGS);
+	__ICMP6_INC_STATS(dev_net_rcu(dev), idev, ICMP6_MIB_INMSGS);
 
 	saddr = &ipv6_hdr(skb)->saddr;
 	daddr = &ipv6_hdr(skb)->daddr;
@@ -939,7 +939,7 @@ static int icmpv6_rcv(struct sk_buff *skb)
 
 	type = hdr->icmp6_type;
 
-	ICMP6MSGIN_INC_STATS(dev_net(dev), idev, type);
+	ICMP6MSGIN_INC_STATS(dev_net_rcu(dev), idev, type);
 
 	switch (type) {
 	case ICMPV6_ECHO_REQUEST:
@@ -1034,9 +1034,9 @@ static int icmpv6_rcv(struct sk_buff *skb)
 
 csum_error:
 	reason = SKB_DROP_REASON_ICMP_CSUM;
-	__ICMP6_INC_STATS(dev_net(dev), idev, ICMP6_MIB_CSUMERRORS);
+	__ICMP6_INC_STATS(dev_net_rcu(dev), idev, ICMP6_MIB_CSUMERRORS);
 discard_it:
-	__ICMP6_INC_STATS(dev_net(dev), idev, ICMP6_MIB_INERRORS);
+	__ICMP6_INC_STATS(dev_net_rcu(dev), idev, ICMP6_MIB_INERRORS);
 drop_no_count:
 	kfree_skb_reason(skb, reason);
 	return 0;
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 11/16] ipv6: input: convert to dev_net_rcu()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (9 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 10/16] ipv6: " Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 12/16] ipv6: output: " Eric Dumazet
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

dev_net() calls from net/ipv6/ip6_input.c seem to
happen under RCU protection.

Convert them to dev_net_rcu() to ensure LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/ip6_input.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 70c0e16c0ae6837d1c64d0036829c8b61799578b..4030527ebe098e86764f37c9068d2f2f9af2d183 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -301,7 +301,7 @@ static struct sk_buff *ip6_rcv_core(struct sk_buff *skb, struct net_device *dev,
 
 int ipv6_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev)
 {
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 
 	skb = ip6_rcv_core(skb, dev, net);
 	if (skb == NULL)
@@ -330,7 +330,7 @@ void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
 
 	list_for_each_entry_safe(skb, next, head, list) {
 		struct net_device *dev = skb->dev;
-		struct net *net = dev_net(dev);
+		struct net *net = dev_net_rcu(dev);
 
 		skb_list_del_init(skb);
 		skb = ip6_rcv_core(skb, dev, net);
@@ -488,7 +488,7 @@ static int ip6_input_finish(struct net *net, struct sock *sk, struct sk_buff *sk
 int ip6_input(struct sk_buff *skb)
 {
 	return NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_IN,
-		       dev_net(skb->dev), NULL, skb, skb->dev, NULL,
+		       dev_net_rcu(skb->dev), NULL, skb, skb->dev, NULL,
 		       ip6_input_finish);
 }
 EXPORT_SYMBOL_GPL(ip6_input);
@@ -500,14 +500,14 @@ int ip6_mc_input(struct sk_buff *skb)
 	struct net_device *dev;
 	bool deliver;
 
-	__IP6_UPD_PO_STATS(dev_net(skb_dst(skb)->dev),
+	__IP6_UPD_PO_STATS(dev_net_rcu(skb_dst(skb)->dev),
 			 __in6_dev_get_safely(skb->dev), IPSTATS_MIB_INMCAST,
 			 skb->len);
 
 	/* skb->dev passed may be master dev for vrfs. */
 	if (sdif) {
 		rcu_read_lock();
-		dev = dev_get_by_index_rcu(dev_net(skb->dev), sdif);
+		dev = dev_get_by_index_rcu(dev_net_rcu(skb->dev), sdif);
 		if (!dev) {
 			rcu_read_unlock();
 			kfree_skb(skb);
@@ -526,7 +526,7 @@ int ip6_mc_input(struct sk_buff *skb)
 	/*
 	 *      IPv6 multicast router mode is now supported ;)
 	 */
-	if (atomic_read(&dev_net(skb->dev)->ipv6.devconf_all->mc_forwarding) &&
+	if (atomic_read(&dev_net_rcu(skb->dev)->ipv6.devconf_all->mc_forwarding) &&
 	    !(ipv6_addr_type(&hdr->daddr) &
 	      (IPV6_ADDR_LOOPBACK|IPV6_ADDR_LINKLOCAL)) &&
 	    likely(!(IP6CB(skb)->flags & IP6SKB_FORWARDED))) {
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 12/16] ipv6: output: convert to dev_net_rcu()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (10 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 11/16] ipv6: input: " Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 13/16] ipv6: use RCU protection in ip6_default_advmss() Eric Dumazet
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

dev_net() calls from net/ipv6/ip6_output.c
and net/ipv6/output_core.c are happening under RCU
protection.

Convert them to dev_net_rcu() to ensure LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/ip6_output.c  | 4 ++--
 net/ipv6/output_core.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index d577bf2f3053873d27b241029592cdbb0a124ad7..4c73a4cdcb23f76d81e572d5b1bd0f6902447c0e 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -393,7 +393,7 @@ static int ip6_call_ra_chain(struct sk_buff *skb, int sel)
 		     sk->sk_bound_dev_if == skb->dev->ifindex)) {
 
 			if (inet6_test_bit(RTALERT_ISOLATE, sk) &&
-			    !net_eq(sock_net(sk), dev_net(skb->dev))) {
+			    !net_eq(sock_net(sk), dev_net_rcu(skb->dev))) {
 				continue;
 			}
 			if (last) {
@@ -503,7 +503,7 @@ int ip6_forward(struct sk_buff *skb)
 	struct dst_entry *dst = skb_dst(skb);
 	struct ipv6hdr *hdr = ipv6_hdr(skb);
 	struct inet6_skb_parm *opt = IP6CB(skb);
-	struct net *net = dev_net(dst->dev);
+	struct net *net = dev_net_rcu(dst->dev);
 	struct inet6_dev *idev;
 	SKB_DR(reason);
 	u32 mtu;
diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
index 806d4b5dd1e60b27726facbb59bbef97d6fee7f5..94438fd4f0e833bb8f5ea4822c7312376ea79304 100644
--- a/net/ipv6/output_core.c
+++ b/net/ipv6/output_core.c
@@ -113,7 +113,7 @@ int ip6_dst_hoplimit(struct dst_entry *dst)
 		if (idev)
 			hoplimit = READ_ONCE(idev->cnf.hop_limit);
 		else
-			hoplimit = READ_ONCE(dev_net(dev)->ipv6.devconf_all->hop_limit);
+			hoplimit = READ_ONCE(dev_net_rcu(dev)->ipv6.devconf_all->hop_limit);
 		rcu_read_unlock();
 	}
 	return hoplimit;
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 13/16] ipv6: use RCU protection in ip6_default_advmss()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (11 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 12/16] ipv6: output: " Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 14/16] net: filter: convert to dev_net_rcu() Eric Dumazet
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

ip6_default_advmss() needs RCU protection to make
sure the net structure it reads does not disappear.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/route.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
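
A minimal userspace sketch of the pattern this patch applies: resolve the
namespace and read its sysctl value inside one read-side section, and use
only the copied value afterwards. The model_* names and the hdr_len
parameter are hypothetical; the counter merely stands in for
rcu_read_lock()/rcu_read_unlock().

```c
#include <assert.h>

/* Hypothetical stand-ins for struct net and struct net_device. */
struct model_net { unsigned int min_advmss; };
struct model_dev { struct model_net *net; };

static int model_rcu_depth;	/* nesting depth of read-side sections */

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

/* Checked accessor: only legal inside a read-side section. */
static struct model_net *model_dev_net_rcu(const struct model_dev *dev)
{
	assert(model_rcu_depth > 0);
	return dev->net;
}

/* Clamp (mtu - hdr_len) to the per-namespace minimum advertised MSS. */
static unsigned int model_default_advmss(const struct model_dev *dev,
					 unsigned int mtu,
					 unsigned int hdr_len)
{
	struct model_net *net;
	unsigned int mss = mtu - hdr_len;

	model_rcu_read_lock();
	net = model_dev_net_rcu(dev);
	if (mss < net->min_advmss)
		mss = net->min_advmss;
	model_rcu_read_unlock();

	/* 'net' must not be dereferenced past this point. */
	return mss;
}
```

The point of the shape is that the pointer returned by the accessor never
escapes the section; only the integer read through it does.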

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 78362822b9070df138a0724dc76003b63026f9e2..ef2d23a1e3d532f5db37ca94ca482c5522dddffc 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3196,13 +3196,18 @@ static unsigned int ip6_default_advmss(const struct dst_entry *dst)
 {
 	struct net_device *dev = dst->dev;
 	unsigned int mtu = dst_mtu(dst);
-	struct net *net = dev_net(dev);
+	struct net *net;
 
 	mtu -= sizeof(struct ipv6hdr) + sizeof(struct tcphdr);
 
+	rcu_read_lock();
+
+	net = dev_net_rcu(dev);
 	if (mtu < net->ipv6.sysctl.ip6_rt_min_advmss)
 		mtu = net->ipv6.sysctl.ip6_rt_min_advmss;
 
+	rcu_read_unlock();
+
 	/*
 	 * Maximal non-jumbo IPv6 payload is IPV6_MAXPLEN and
 	 * corresponding MSS is IPV6_MAXPLEN - tcp_header_size.
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 14/16] net: filter: convert to dev_net_rcu()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (12 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 13/16] ipv6: use RCU protection in ip6_default_advmss() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 15/16] flow_dissector: use rcu protection to fetch dev_net() Eric Dumazet
  2025-02-03 14:30 ` [PATCH v2 net 16/16] ipv4: use RCU protection in inet_select_addr() Eric Dumazet
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

All calls to dev_net() from net/core/filter.c are currently
done under rcu_read_lock().

Convert them to dev_net_rcu() to ensure LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/filter.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 2ec162dd83c463640dcf3c151327206f519b217a..4db537a982d55fa9b42aaa70820cb337d5283299 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2244,7 +2244,7 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev,
 				   struct bpf_nh_params *nh)
 {
 	const struct ipv6hdr *ip6h = ipv6_hdr(skb);
-	struct net *net = dev_net(dev);
+	struct net *net = dev_net_rcu(dev);
 	int err, ret = NET_XMIT_DROP;
 
 	if (!nh) {
@@ -2348,7 +2348,7 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev,
 				   struct bpf_nh_params *nh)
 {
 	const struct iphdr *ip4h = ip_hdr(skb);
-	struct net *net = dev_net(dev);
+	struct net *net = dev_net_rcu(dev);
 	int err, ret = NET_XMIT_DROP;
 
 	if (!nh) {
@@ -2438,7 +2438,7 @@ BPF_CALL_3(bpf_clone_redirect, struct sk_buff *, skb, u32, ifindex, u64, flags)
 	if (unlikely(flags & (~(BPF_F_INGRESS) | BPF_F_REDIRECT_INTERNAL)))
 		return -EINVAL;
 
-	dev = dev_get_by_index_rcu(dev_net(skb->dev), ifindex);
+	dev = dev_get_by_index_rcu(dev_net_rcu(skb->dev), ifindex);
 	if (unlikely(!dev))
 		return -EINVAL;
 
@@ -2482,7 +2482,7 @@ static struct net_device *skb_get_peer_dev(struct net_device *dev)
 int skb_do_redirect(struct sk_buff *skb)
 {
 	struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	struct net_device *dev;
 	u32 flags = ri->flags;
 
@@ -2497,7 +2497,7 @@ int skb_do_redirect(struct sk_buff *skb)
 		dev = skb_get_peer_dev(dev);
 		if (unlikely(!dev ||
 			     !(dev->flags & IFF_UP) ||
-			     net_eq(net, dev_net(dev))))
+			     net_eq(net, dev_net_rcu(dev))))
 			goto out_drop;
 		skb->dev = dev;
 		dev_sw_netstats_rx_add(dev, skb->len);
@@ -4425,7 +4425,7 @@ __xdp_do_redirect_frame(struct bpf_redirect_info *ri, struct net_device *dev,
 		break;
 	case BPF_MAP_TYPE_UNSPEC:
 		if (map_id == INT_MAX) {
-			fwd = dev_get_by_index_rcu(dev_net(dev), ri->tgt_index);
+			fwd = dev_get_by_index_rcu(dev_net_rcu(dev), ri->tgt_index);
 			if (unlikely(!fwd)) {
 				err = -EINVAL;
 				break;
@@ -4550,7 +4550,7 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
 	ri->map_type = BPF_MAP_TYPE_UNSPEC;
 
 	if (map_type == BPF_MAP_TYPE_UNSPEC && map_id == INT_MAX) {
-		fwd = dev_get_by_index_rcu(dev_net(dev), ri->tgt_index);
+		fwd = dev_get_by_index_rcu(dev_net_rcu(dev), ri->tgt_index);
 		if (unlikely(!fwd)) {
 			err = -EINVAL;
 			goto err;
@@ -6203,12 +6203,12 @@ BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx,
 	switch (params->family) {
 #if IS_ENABLED(CONFIG_INET)
 	case AF_INET:
-		return bpf_ipv4_fib_lookup(dev_net(ctx->rxq->dev), params,
+		return bpf_ipv4_fib_lookup(dev_net_rcu(ctx->rxq->dev), params,
 					   flags, true);
 #endif
 #if IS_ENABLED(CONFIG_IPV6)
 	case AF_INET6:
-		return bpf_ipv6_fib_lookup(dev_net(ctx->rxq->dev), params,
+		return bpf_ipv6_fib_lookup(dev_net_rcu(ctx->rxq->dev), params,
 					   flags, true);
 #endif
 	}
@@ -6228,7 +6228,7 @@ static const struct bpf_func_proto bpf_xdp_fib_lookup_proto = {
 BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb,
 	   struct bpf_fib_lookup *, params, int, plen, u32, flags)
 {
-	struct net *net = dev_net(skb->dev);
+	struct net *net = dev_net_rcu(skb->dev);
 	int rc = -EAFNOSUPPORT;
 	bool check_mtu = false;
 
@@ -6283,7 +6283,7 @@ static const struct bpf_func_proto bpf_skb_fib_lookup_proto = {
 static struct net_device *__dev_via_ifindex(struct net_device *dev_curr,
 					    u32 ifindex)
 {
-	struct net *netns = dev_net(dev_curr);
+	struct net *netns = dev_net_rcu(dev_curr);
 
 	/* Non-redirect use-cases can use ifindex=0 and save ifindex lookup */
 	if (ifindex == 0)
@@ -6806,7 +6806,7 @@ bpf_skc_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
 	int ifindex;
 
 	if (skb->dev) {
-		caller_net = dev_net(skb->dev);
+		caller_net = dev_net_rcu(skb->dev);
 		ifindex = skb->dev->ifindex;
 	} else {
 		caller_net = sock_net(skb->sk);
@@ -6906,7 +6906,7 @@ BPF_CALL_5(bpf_tc_skc_lookup_tcp, struct sk_buff *, skb,
 {
 	struct net_device *dev = skb->dev;
 	int ifindex = dev->ifindex, sdif = dev_sdif(dev);
-	struct net *caller_net = dev_net(dev);
+	struct net *caller_net = dev_net_rcu(dev);
 
 	return (unsigned long)__bpf_skc_lookup(skb, tuple, len, caller_net,
 					       ifindex, IPPROTO_TCP, netns_id,
@@ -6930,7 +6930,7 @@ BPF_CALL_5(bpf_tc_sk_lookup_tcp, struct sk_buff *, skb,
 {
 	struct net_device *dev = skb->dev;
 	int ifindex = dev->ifindex, sdif = dev_sdif(dev);
-	struct net *caller_net = dev_net(dev);
+	struct net *caller_net = dev_net_rcu(dev);
 
 	return (unsigned long)__bpf_sk_lookup(skb, tuple, len, caller_net,
 					      ifindex, IPPROTO_TCP, netns_id,
@@ -6954,7 +6954,7 @@ BPF_CALL_5(bpf_tc_sk_lookup_udp, struct sk_buff *, skb,
 {
 	struct net_device *dev = skb->dev;
 	int ifindex = dev->ifindex, sdif = dev_sdif(dev);
-	struct net *caller_net = dev_net(dev);
+	struct net *caller_net = dev_net_rcu(dev);
 
 	return (unsigned long)__bpf_sk_lookup(skb, tuple, len, caller_net,
 					      ifindex, IPPROTO_UDP, netns_id,
@@ -6992,7 +6992,7 @@ BPF_CALL_5(bpf_xdp_sk_lookup_udp, struct xdp_buff *, ctx,
 {
 	struct net_device *dev = ctx->rxq->dev;
 	int ifindex = dev->ifindex, sdif = dev_sdif(dev);
-	struct net *caller_net = dev_net(dev);
+	struct net *caller_net = dev_net_rcu(dev);
 
 	return (unsigned long)__bpf_sk_lookup(NULL, tuple, len, caller_net,
 					      ifindex, IPPROTO_UDP, netns_id,
@@ -7016,7 +7016,7 @@ BPF_CALL_5(bpf_xdp_skc_lookup_tcp, struct xdp_buff *, ctx,
 {
 	struct net_device *dev = ctx->rxq->dev;
 	int ifindex = dev->ifindex, sdif = dev_sdif(dev);
-	struct net *caller_net = dev_net(dev);
+	struct net *caller_net = dev_net_rcu(dev);
 
 	return (unsigned long)__bpf_skc_lookup(NULL, tuple, len, caller_net,
 					       ifindex, IPPROTO_TCP, netns_id,
@@ -7040,7 +7040,7 @@ BPF_CALL_5(bpf_xdp_sk_lookup_tcp, struct xdp_buff *, ctx,
 {
 	struct net_device *dev = ctx->rxq->dev;
 	int ifindex = dev->ifindex, sdif = dev_sdif(dev);
-	struct net *caller_net = dev_net(dev);
+	struct net *caller_net = dev_net_rcu(dev);
 
 	return (unsigned long)__bpf_sk_lookup(NULL, tuple, len, caller_net,
 					      ifindex, IPPROTO_TCP, netns_id,
@@ -7510,7 +7510,7 @@ BPF_CALL_3(bpf_sk_assign, struct sk_buff *, skb, struct sock *, sk, u64, flags)
 		return -EINVAL;
 	if (!skb_at_tc_ingress(skb))
 		return -EOPNOTSUPP;
-	if (unlikely(dev_net(skb->dev) != sock_net(sk)))
+	if (unlikely(dev_net_rcu(skb->dev) != sock_net(sk)))
 		return -ENETUNREACH;
 	if (sk_unhashed(sk))
 		return -EOPNOTSUPP;
@@ -11985,7 +11985,7 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk,
 	if (!skb_at_tc_ingress(skb))
 		return -EINVAL;
 
-	net = dev_net(skb->dev);
+	net = dev_net_rcu(skb->dev);
 	if (net != sock_net(sk))
 		return -ENETUNREACH;
 
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 15/16] flow_dissector: use rcu protection to fetch dev_net()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (13 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 14/16] net: filter: convert to dev_net_rcu() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  2025-02-03 23:38   ` Jakub Kicinski
  2025-02-03 14:30 ` [PATCH v2 net 16/16] ipv4: use RCU protection in inet_select_addr() Eric Dumazet
  15 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

__skb_flow_dissect() can be called from arbitrary contexts.

It must extend its RCU protection section to include
the call to dev_net(), which can then become dev_net_rcu().

This makes sure the net structure cannot disappear under us.

Fixes: 9b52e3f267a6 ("flow_dissector: handle no-skb use case")flow_dissect")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/flow_dissector.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)
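
A minimal userspace sketch of the restructured control flow, assuming the
intent is a single read-side section that covers both the namespace lookup
and the BPF program run, with exactly one unlock on each exit path. The
model_* names, the has_prog flag, and the fell_through out-parameter are
hypothetical simplifications, not the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_result { MODEL_OK, MODEL_DROP, MODEL_CONTINUE };

/* Hypothetical namespace: at most one attached program with a fixed verdict. */
struct model_net { bool has_prog; enum model_result prog_result; };
struct model_dev { const struct model_net *net; };

static int model_rcu_depth;	/* nesting depth of read-side sections */

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

/* Checked accessor: only legal inside a read-side section. */
static const struct model_net *model_dev_net_rcu(const struct model_dev *dev)
{
	assert(model_rcu_depth > 0);
	return dev->net;
}

/*
 * Returns the program verdict when one is attached and does not ask to
 * continue; otherwise falls through to the generic path (modelled by
 * setting *fell_through and returning true).
 */
static bool model_flow_dissect(const struct model_net *net,
			       const struct model_dev *dev,
			       bool *fell_through)
{
	*fell_through = false;

	model_rcu_read_lock();

	if (!net && dev)
		net = model_dev_net_rcu(dev);

	if (net && net->has_prog && net->prog_result != MODEL_CONTINUE) {
		bool ok = net->prog_result == MODEL_OK;

		model_rcu_read_unlock();
		return ok;
	}

	model_rcu_read_unlock();

	*fell_through = true;
	return true;
}
```

Replacing the goto label with an inverted condition leaves two exits from
the section, each with its own unlock, which is easier to audit than an
unlock reached through a label.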

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 0e638a37aa0961de6281deeed227b3e7ef70e546..5db41bf2ed93e0df721c216ca4557dad16aa5f83 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1108,10 +1108,12 @@ bool __skb_flow_dissect(const struct net *net,
 					      FLOW_DISSECTOR_KEY_BASIC,
 					      target_container);
 
+	rcu_read_lock();
+
 	if (skb) {
 		if (!net) {
 			if (skb->dev)
-				net = dev_net(skb->dev);
+				net = dev_net_rcu(skb->dev);
 			else if (skb->sk)
 				net = sock_net(skb->sk);
 		}
@@ -1122,7 +1124,6 @@ bool __skb_flow_dissect(const struct net *net,
 		enum netns_bpf_attach_type type = NETNS_BPF_FLOW_DISSECTOR;
 		struct bpf_prog_array *run_array;
 
-		rcu_read_lock();
 		run_array = rcu_dereference(init_net.bpf.run_array[type]);
 		if (!run_array)
 			run_array = rcu_dereference(net->bpf.run_array[type]);
@@ -1150,17 +1151,17 @@ bool __skb_flow_dissect(const struct net *net,
 			prog = READ_ONCE(run_array->items[0].prog);
 			result = bpf_flow_dissect(prog, &ctx, n_proto, nhoff,
 						  hlen, flags);
-			if (result == BPF_FLOW_DISSECTOR_CONTINUE)
-				goto dissect_continue;
-			__skb_flow_bpf_to_target(&flow_keys, flow_dissector,
-						 target_container);
-			rcu_read_unlock();
-			return result == BPF_OK;
+			if (result != BPF_FLOW_DISSECTOR_CONTINUE) {
+				__skb_flow_bpf_to_target(&flow_keys, flow_dissector,
+							 target_container);
+				rcu_read_unlock();
+				return result == BPF_OK;
+			}
 		}
-dissect_continue:
-		rcu_read_unlock();
 	}
 
+	rcu_read_unlock();
+
 	if (dissector_uses_key(flow_dissector,
 			       FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
 		struct ethhdr *eth = eth_hdr(skb);
-- 
2.48.1.362.g079036d154-goog



* [PATCH v2 net 16/16] ipv4: use RCU protection in inet_select_addr()
  2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
                   ` (14 preceding siblings ...)
  2025-02-03 14:30 ` [PATCH v2 net 15/16] flow_dissector: use rcu protection to fetch dev_net() Eric Dumazet
@ 2025-02-03 14:30 ` Eric Dumazet
  15 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-03 14:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Kuniyuki Iwashima, Simon Horman, eric.dumazet,
	Eric Dumazet

inet_select_addr() must use RCU protection to make
sure the net structure it reads does not disappear.

Fixes: c4544c724322 ("[NETNS]: Process inet_select_addr inside a namespace.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/devinet.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index c8b3cf5fba4c02941b919687a6a657cf68f5f99a..55b8151759bc9f76ebdbfae27544d6ee666a4809 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1371,10 +1371,11 @@ __be32 inet_select_addr(const struct net_device *dev, __be32 dst, int scope)
 	__be32 addr = 0;
 	unsigned char localnet_scope = RT_SCOPE_HOST;
 	struct in_device *in_dev;
-	struct net *net = dev_net(dev);
+	struct net *net;
 	int master_idx;
 
 	rcu_read_lock();
+	net = dev_net_rcu(dev);
 	in_dev = __in_dev_get_rcu(dev);
 	if (!in_dev)
 		goto no_in_dev;
-- 
2.48.1.362.g079036d154-goog



* Re: [PATCH v2 net 09/16] ipv4: icmp: convert to dev_net_rcu()
  2025-02-03 14:30 ` [PATCH v2 net 09/16] ipv4: icmp: " Eric Dumazet
@ 2025-02-03 23:36   ` Jakub Kicinski
  2025-02-04  4:14     ` Eric Dumazet
  0 siblings, 1 reply; 26+ messages in thread
From: Jakub Kicinski @ 2025-02-03 23:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Paolo Abeni, netdev, Kuniyuki Iwashima,
	Simon Horman, eric.dumazet

On Mon,  3 Feb 2025 14:30:39 +0000 Eric Dumazet wrote:
> @@ -611,9 +611,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
>  		goto out;
>  
>  	if (rt->dst.dev)
> -		net = dev_net(rt->dst.dev);
> +		net = dev_net_rcu(rt->dst.dev);
>  	else if (skb_in->dev)
> -		net = dev_net(skb_in->dev);
> +		net = dev_net_rcu(skb_in->dev);
>  	else
>  		goto out;

Hm. Weird. NIPA says this one is not under RCU.

[  275.730657][    C1] ./include/net/net_namespace.h:404 suspicious rcu_dereference_check() usage!
[  275.731033][    C1] 
[  275.731033][    C1] other info that might help us debug this:
[  275.731033][    C1] 
[  275.731471][    C1] 
[  275.731471][    C1] rcu_scheduler_active = 2, debug_locks = 1
[  275.731799][    C1] 1 lock held by swapper/1/0:
[  275.732000][    C1]  #0: ffffc900001e0ae8 ((&n->timer)){+.-.}-{0:0}, at: call_timer_fn+0xe8/0x230
[  275.732354][    C1] 
[  275.732354][    C1] stack backtrace:
[  275.732638][    C1] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.13.0-virtme #1
[  275.732643][    C1] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  275.732646][    C1] Call Trace:
[  275.732647][    C1]  <IRQ>
[  275.732651][    C1]  dump_stack_lvl+0xb0/0xd0
[  275.732663][    C1]  lockdep_rcu_suspicious+0x1ea/0x280
[  275.732678][    C1]  __icmp_send+0xb0d/0x1580
[  275.732695][    C1]  ? tcp_data_queue+0x8/0x22d0
[  275.732701][    C1]  ? lockdep_hardirqs_on_prepare+0x12b/0x410
[  275.732712][    C1]  ? __pfx___icmp_send+0x10/0x10
[  275.732719][    C1]  ? tcp_check_space+0x3ce/0x5f0
[  275.732742][    C1]  ? rcu_read_lock_any_held+0x43/0xb0
[  275.732750][    C1]  ? validate_chain+0x1fe/0xae0
[  275.732771][    C1]  ? __pfx_validate_chain+0x10/0x10
[  275.732778][    C1]  ? hlock_class+0x4e/0x130
[  275.732784][    C1]  ? mark_lock+0x38/0x3e0
[  275.732788][    C1]  ? sock_put+0x1a/0x60
[  275.732806][    C1]  ? __lock_acquire+0xb9a/0x1680
[  275.732822][    C1]  ipv4_send_dest_unreach+0x3b4/0x800
[  275.732829][    C1]  ? neigh_invalidate+0x1c7/0x540
[  275.732837][    C1]  ? __pfx_ipv4_send_dest_unreach+0x10/0x10
[  275.732850][    C1]  ipv4_link_failure+0x1b/0x190
[  275.732856][    C1]  arp_error_report+0x96/0x170
[  275.732862][    C1]  neigh_invalidate+0x209/0x540
[  275.732873][    C1]  neigh_timer_handler+0x87a/0xdf0
[  275.732883][    C1]  ? __pfx_neigh_timer_handler+0x10/0x10
[  275.732886][    C1]  call_timer_fn+0x13b/0x230
[  275.732891][    C1]  ? call_timer_fn+0xe8/0x230
[  275.732894][    C1]  ? call_timer_fn+0xe8/0x230
[  275.732899][    C1]  ? __pfx_call_timer_fn+0x10/0x10
[  275.732902][    C1]  ? mark_lock+0x38/0x3e0
[  275.732920][    C1]  __run_timers+0x545/0x810
[  275.732925][    C1]  ? __pfx_neigh_timer_handler+0x10/0x10
[  275.732936][    C1]  ? __pfx___run_timers+0x10/0x10
[  275.732939][    C1]  ? __lock_release+0x103/0x460
[  275.732947][    C1]  ? do_raw_spin_lock+0x131/0x270
[  275.732952][    C1]  ? __pfx_do_raw_spin_lock+0x10/0x10
[  275.732956][    C1]  ? lock_acquire+0x32/0xc0
[  275.732958][    C1]  ? timer_expire_remote+0x96/0xf0
[  275.732967][    C1]  timer_expire_remote+0x9e/0xf0
[  275.732970][    C1]  tmigr_handle_remote_cpu+0x278/0x440
[  275.732977][    C1]  ? __pfx_tmigr_handle_remote_cpu+0x10/0x10
[  275.732981][    C1]  ? __pfx___lock_release+0x10/0x10
[  275.732985][    C1]  ? __pfx_lock_acquire.part.0+0x10/0x10
[  275.733015][    C1]  tmigr_handle_remote_up+0x1a6/0x270
[  275.733027][    C1]  ? __pfx_tmigr_handle_remote_up+0x10/0x10
[  275.733036][    C1]  __walk_groups.isra.0+0x44/0x160
[  275.733051][    C1]  tmigr_handle_remote+0x20b/0x300

Decoded:
https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/976941/vm-crash-thr0-1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 net 15/16] flow_dissector: use rcu protection to fetch dev_net()
  2025-02-03 14:30 ` [PATCH v2 net 15/16] flow_dissector: use rcu protection to fetch dev_net() Eric Dumazet
@ 2025-02-03 23:38   ` Jakub Kicinski
  2025-02-04  4:16     ` Eric Dumazet
  0 siblings, 1 reply; 26+ messages in thread
From: Jakub Kicinski @ 2025-02-03 23:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Paolo Abeni, netdev, Kuniyuki Iwashima,
	Simon Horman, eric.dumazet

On Mon,  3 Feb 2025 14:30:45 +0000 Eric Dumazet wrote:
> Fixes: 9b52e3f267a6 ("flow_dissector: handle no-skb use case")flow_dissect")

This Fixes tag looks corrupted, in case you need to post v3

* Re: [PATCH v2 net 09/16] ipv4: icmp: convert to dev_net_rcu()
  2025-02-03 23:36   ` Jakub Kicinski
@ 2025-02-04  4:14     ` Eric Dumazet
  2025-02-04  4:57       ` Eric Dumazet
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2025-02-04  4:14 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Paolo Abeni, netdev, Kuniyuki Iwashima,
	Simon Horman, eric.dumazet

On Tue, Feb 4, 2025 at 12:36 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon,  3 Feb 2025 14:30:39 +0000 Eric Dumazet wrote:
> > @@ -611,9 +611,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
> >               goto out;
> >
> >       if (rt->dst.dev)
> > -             net = dev_net(rt->dst.dev);
> > +             net = dev_net_rcu(rt->dst.dev);
> >       else if (skb_in->dev)
> > -             net = dev_net(skb_in->dev);
> > +             net = dev_net_rcu(skb_in->dev);
> >       else
> >               goto out;
>
> Hm. Weird. NIPA says this one is not under RCU.
>
> [  275.730657][    C1] ./include/net/net_namespace.h:404 suspicious rcu_dereference_check() usage!
> [  275.731033][    C1]
> [  275.731033][    C1] other info that might help us debug this:
> [  275.731033][    C1]
> [  275.731471][    C1]
> [  275.731471][    C1] rcu_scheduler_active = 2, debug_locks = 1
> [  275.731799][    C1] 1 lock held by swapper/1/0:
> [  275.732000][    C1]  #0: ffffc900001e0ae8 ((&n->timer)){+.-.}-{0:0}, at: call_timer_fn+0xe8/0x230
> [  275.732354][    C1]
> [  275.732354][    C1] stack backtrace:
> [  275.732638][    C1] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.13.0-virtme #1
> [  275.732643][    C1] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [  275.732646][    C1] Call Trace:
> [  275.732647][    C1]  <IRQ>
> [  275.732651][    C1]  dump_stack_lvl+0xb0/0xd0
> [  275.732663][    C1]  lockdep_rcu_suspicious+0x1ea/0x280
> [  275.732678][    C1]  __icmp_send+0xb0d/0x1580
> [  275.732695][    C1]  ? tcp_data_queue+0x8/0x22d0
> [  275.732701][    C1]  ? lockdep_hardirqs_on_prepare+0x12b/0x410
> [  275.732712][    C1]  ? __pfx___icmp_send+0x10/0x10
> [  275.732719][    C1]  ? tcp_check_space+0x3ce/0x5f0
> [  275.732742][    C1]  ? rcu_read_lock_any_held+0x43/0xb0
> [  275.732750][    C1]  ? validate_chain+0x1fe/0xae0
> [  275.732771][    C1]  ? __pfx_validate_chain+0x10/0x10
> [  275.732778][    C1]  ? hlock_class+0x4e/0x130
> [  275.732784][    C1]  ? mark_lock+0x38/0x3e0
> [  275.732788][    C1]  ? sock_put+0x1a/0x60
> [  275.732806][    C1]  ? __lock_acquire+0xb9a/0x1680
> [  275.732822][    C1]  ipv4_send_dest_unreach+0x3b4/0x800
> [  275.732829][    C1]  ? neigh_invalidate+0x1c7/0x540
> [  275.732837][    C1]  ? __pfx_ipv4_send_dest_unreach+0x10/0x10
> [  275.732850][    C1]  ipv4_link_failure+0x1b/0x190
> [  275.732856][    C1]  arp_error_report+0x96/0x170
> [  275.732862][    C1]  neigh_invalidate+0x209/0x540
> [  275.732873][    C1]  neigh_timer_handler+0x87a/0xdf0
> [  275.732883][    C1]  ? __pfx_neigh_timer_handler+0x10/0x10
> [  275.732886][    C1]  call_timer_fn+0x13b/0x230
> [  275.732891][    C1]  ? call_timer_fn+0xe8/0x230
> [  275.732894][    C1]  ? call_timer_fn+0xe8/0x230
> [  275.732899][    C1]  ? __pfx_call_timer_fn+0x10/0x10
> [  275.732902][    C1]  ? mark_lock+0x38/0x3e0
> [  275.732920][    C1]  __run_timers+0x545/0x810
> [  275.732925][    C1]  ? __pfx_neigh_timer_handler+0x10/0x10
> [  275.732936][    C1]  ? __pfx___run_timers+0x10/0x10
> [  275.732939][    C1]  ? __lock_release+0x103/0x460
> [  275.732947][    C1]  ? do_raw_spin_lock+0x131/0x270
> [  275.732952][    C1]  ? __pfx_do_raw_spin_lock+0x10/0x10
> [  275.732956][    C1]  ? lock_acquire+0x32/0xc0
> [  275.732958][    C1]  ? timer_expire_remote+0x96/0xf0
> [  275.732967][    C1]  timer_expire_remote+0x9e/0xf0
> [  275.732970][    C1]  tmigr_handle_remote_cpu+0x278/0x440
> [  275.732977][    C1]  ? __pfx_tmigr_handle_remote_cpu+0x10/0x10
> [  275.732981][    C1]  ? __pfx___lock_release+0x10/0x10
> [  275.732985][    C1]  ? __pfx_lock_acquire.part.0+0x10/0x10
> [  275.733015][    C1]  tmigr_handle_remote_up+0x1a6/0x270
> [  275.733027][    C1]  ? __pfx_tmigr_handle_remote_up+0x10/0x10
> [  275.733036][    C1]  __walk_groups.isra.0+0x44/0x160
> [  275.733051][    C1]  tmigr_handle_remote+0x20b/0x300
>
> Decoded:
> https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/976941/vm-crash-thr0-1

Oops, I thought I ran the tests on the whole series. I missed this one.

* Re: [PATCH v2 net 15/16] flow_dissector: use rcu protection to fetch dev_net()
  2025-02-03 23:38   ` Jakub Kicinski
@ 2025-02-04  4:16     ` Eric Dumazet
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-04  4:16 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Paolo Abeni, netdev, Kuniyuki Iwashima,
	Simon Horman, eric.dumazet

On Tue, Feb 4, 2025 at 12:38 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon,  3 Feb 2025 14:30:45 +0000 Eric Dumazet wrote:
> > Fixes: 9b52e3f267a6 ("flow_dissector: handle no-skb use case")flow_dissect")
>
> This Fixes tag looks corrupted, in case you need to post v3

ACK, thanks.

* Re: [PATCH v2 net 09/16] ipv4: icmp: convert to dev_net_rcu()
  2025-02-04  4:14     ` Eric Dumazet
@ 2025-02-04  4:57       ` Eric Dumazet
  2025-02-04 10:35         ` Eric Dumazet
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2025-02-04  4:57 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Paolo Abeni, netdev, Kuniyuki Iwashima,
	Simon Horman, eric.dumazet

On Tue, Feb 4, 2025 at 5:14 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Feb 4, 2025 at 12:36 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Mon,  3 Feb 2025 14:30:39 +0000 Eric Dumazet wrote:
> > > @@ -611,9 +611,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
> > >               goto out;
> > >
> > >       if (rt->dst.dev)
> > > -             net = dev_net(rt->dst.dev);
> > > +             net = dev_net_rcu(rt->dst.dev);
> > >       else if (skb_in->dev)
> > > -             net = dev_net(skb_in->dev);
> > > +             net = dev_net_rcu(skb_in->dev);
> > >       else
> > >               goto out;
> >
> > Hm. Weird. NIPA says this one is not under RCU.
> >
> > Decoded:
> > https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/976941/vm-crash-thr0-1
>
> Oops, I thought I ran the tests on the whole series. I missed this one.

BTW, ICMPv6 has the same potential problem, I will amend both cases.

* Re: [PATCH v2 net 09/16] ipv4: icmp: convert to dev_net_rcu()
  2025-02-04  4:57       ` Eric Dumazet
@ 2025-02-04 10:35         ` Eric Dumazet
  2025-02-04 16:21           ` Matthieu Baerts
  2025-02-04 16:21           ` Jakub Kicinski
  0 siblings, 2 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-04 10:35 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Paolo Abeni, netdev, Kuniyuki Iwashima,
	Simon Horman, eric.dumazet

On Tue, Feb 4, 2025 at 5:57 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Feb 4, 2025 at 5:14 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Feb 4, 2025 at 12:36 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Mon,  3 Feb 2025 14:30:39 +0000 Eric Dumazet wrote:
> > > > @@ -611,9 +611,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
> > > >               goto out;
> > > >
> > > >       if (rt->dst.dev)
> > > > -             net = dev_net(rt->dst.dev);
> > > > +             net = dev_net_rcu(rt->dst.dev);
> > > >       else if (skb_in->dev)
> > > > -             net = dev_net(skb_in->dev);
> > > > +             net = dev_net_rcu(skb_in->dev);
> > > >       else
> > > >               goto out;
> > >
> > > Hm. Weird. NIPA says this one is not under RCU.
> > >
> > > Decoded:
> > > https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/976941/vm-crash-thr0-1
> >
> > Oops, I thought I ran the tests on the whole series. I missed this one.
>
> BTW, ICMPv6 has the same potential problem, I will amend both cases.

I ran again the tests for v3, got an unrelated crash, FYI.

[14237.095216] #PF: supervisor instruction fetch in kernel mode
[14237.095570] #PF: error_code(0x0010) - not-present page
[14237.095915] PGD 1e58067 P4D 1e58067 PUD ce1c067 PMD 0
[14237.096991] Oops: Oops: 0010 [#1] SMP DEBUG_PAGEALLOC NOPTI
[14237.097507] CPU: 0 UID: 0 PID: 6371 Comm: python3 Not tainted 6.13.0-virtme #1559
[14237.098045] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[14237.098578] RIP: 0010:0x0
[14237.099324] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[14237.099752] RSP: 0018:ffffacfd4486bed0 EFLAGS: 00000286
[14237.100079] RAX: 0000000000000000 RBX: ffff9af502607200 RCX: 0000000000000002
[14237.100452] RDX: 00007fffc684a690 RSI: 0000000000005401 RDI: ffff9af502607200
[14237.100821] RBP: 0000000000005401 R08: 0000000000000001 R09: 0000000000000000
[14237.101182] R10: 0000000000000001 R11: 0000000000000000 R12: 00007fffc684a690
[14237.101542] R13: ffff9af50888ed68 R14: ffff9af502607200 R15: 0000000000000000
[14237.101956] FS:  00007f76b73f95c0(0000) GS:ffff9af57cc00000(0000) knlGS:0000000000000000
[14237.102372] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14237.102679] CR2: ffffffffffffffd6 CR3: 00000000039ca000 CR4: 00000000000006f0
[14237.103160] Call Trace:
[14237.103435]  <TASK>
[14237.103720]  ? __die_body.cold+0x19/0x26
[14237.104340]  ? page_fault_oops+0x134/0x2a0
[14237.104553]  ? cp_new_stat+0x157/0x190
[14237.104799]  ? exc_page_fault+0x68/0x230
[14237.105013]  ? asm_exc_page_fault+0x26/0x30
[14237.105259]  full_proxy_unlocked_ioctl+0x63/0x90
[14237.105546]  __x64_sys_ioctl+0x97/0xc0
[14237.105754]  do_syscall_64+0x72/0x180
[14237.105949]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

* Re: [PATCH v2 net 09/16] ipv4: icmp: convert to dev_net_rcu()
  2025-02-04 10:35         ` Eric Dumazet
@ 2025-02-04 16:21           ` Matthieu Baerts
  2025-02-04 16:56             ` Eric Dumazet
  2025-02-04 16:21           ` Jakub Kicinski
  1 sibling, 1 reply; 26+ messages in thread
From: Matthieu Baerts @ 2025-02-04 16:21 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski
  Cc: David S . Miller, Paolo Abeni, netdev, Kuniyuki Iwashima,
	Simon Horman, eric.dumazet

Hi Eric,

On 04/02/2025 11:35, Eric Dumazet wrote:
> On Tue, Feb 4, 2025 at 5:57 AM Eric Dumazet <edumazet@google.com> wrote:
>>
>> On Tue, Feb 4, 2025 at 5:14 AM Eric Dumazet <edumazet@google.com> wrote:
>>>
>>> On Tue, Feb 4, 2025 at 12:36 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>>>
>>>> On Mon,  3 Feb 2025 14:30:39 +0000 Eric Dumazet wrote:
>>>>> @@ -611,9 +611,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
>>>>>               goto out;
>>>>>
>>>>>       if (rt->dst.dev)
>>>>> -             net = dev_net(rt->dst.dev);
>>>>> +             net = dev_net_rcu(rt->dst.dev);
>>>>>       else if (skb_in->dev)
>>>>> -             net = dev_net(skb_in->dev);
>>>>> +             net = dev_net_rcu(skb_in->dev);
>>>>>       else
>>>>>               goto out;
>>>>
>>>> Hm. Weird. NIPA says this one is not under RCU.
>>>>
>>>> Decoded:
>>>> https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/976941/vm-crash-thr0-1
>>>
>>> Oops, I thought I ran the tests on the whole series. I missed this one.
>>
>> BTW, ICMPv6 has the same potential problem, I will amend both cases.
> 
> I ran again the tests for v3, got an unrelated crash, FYI.
> 
> [14237.095216] #PF: supervisor instruction fetch in kernel mode
> [14237.095570] #PF: error_code(0x0010) - not-present page
> [14237.095915] PGD 1e58067 P4D 1e58067 PUD ce1c067 PMD 0
> [14237.096991] Oops: Oops: 0010 [#1] SMP DEBUG_PAGEALLOC NOPTI
> [14237.097507] CPU: 0 UID: 0 PID: 6371 Comm: python3 Not tainted 6.13.0-virtme #1559
> [14237.098045] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [14237.098578] RIP: 0010:0x0
> [14237.099324] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> [14237.099752] RSP: 0018:ffffacfd4486bed0 EFLAGS: 00000286
> [14237.100079] RAX: 0000000000000000 RBX: ffff9af502607200 RCX: 0000000000000002
> [14237.100452] RDX: 00007fffc684a690 RSI: 0000000000005401 RDI: ffff9af502607200
> [14237.100821] RBP: 0000000000005401 R08: 0000000000000001 R09: 0000000000000000
> [14237.101182] R10: 0000000000000001 R11: 0000000000000000 R12: 00007fffc684a690
> [14237.101542] R13: ffff9af50888ed68 R14: ffff9af502607200 R15: 0000000000000000
> [14237.101956] FS:  00007f76b73f95c0(0000) GS:ffff9af57cc00000(0000) knlGS:0000000000000000
> [14237.102372] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [14237.102679] CR2: ffffffffffffffd6 CR3: 00000000039ca000 CR4: 00000000000006f0
> [14237.103160] Call Trace:
> [14237.103435]  <TASK>
> [14237.103720]  ? __die_body.cold+0x19/0x26
> [14237.104340]  ? page_fault_oops+0x134/0x2a0
> [14237.104553]  ? cp_new_stat+0x157/0x190
> [14237.104799]  ? exc_page_fault+0x68/0x230
> [14237.105013]  ? asm_exc_page_fault+0x26/0x30
> [14237.105259]  full_proxy_unlocked_ioctl+0x63/0x90
> [14237.105546]  __x64_sys_ioctl+0x97/0xc0
> [14237.105754]  do_syscall_64+0x72/0x180
> [14237.105949]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

I think I got this issue as well on MPTCP side, when using GCOV, but
something else using the debugfs could trigger that as well I guess. It
is apparently fixed in the Linus tree, see 57b314752ec0 ("debugfs: Fix
the missing initializations in __debugfs_file_get()")

https://lore.kernel.org/all/20250129191937.GR1977892@ZenIV/

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


* Re: [PATCH v2 net 09/16] ipv4: icmp: convert to dev_net_rcu()
  2025-02-04 10:35         ` Eric Dumazet
  2025-02-04 16:21           ` Matthieu Baerts
@ 2025-02-04 16:21           ` Jakub Kicinski
  1 sibling, 0 replies; 26+ messages in thread
From: Jakub Kicinski @ 2025-02-04 16:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Paolo Abeni, netdev, Kuniyuki Iwashima,
	Simon Horman, eric.dumazet

On Tue, 4 Feb 2025 11:35:46 +0100 Eric Dumazet wrote:
> > > Oops, I thought I ran the tests on the whole series. I missed this one.  
> >
> > BTW, ICMPv6 has the same potential problem, I will amend both cases.  
> 
> I ran again the tests for v3, got an unrelated crash, FYI.

Yes, FWIW that's
https://lore.kernel.org/all/20250129191937.GR1977892@ZenIV/
we have the fix queued up locally in NIPA. Merge windows are fun!

* Re: [PATCH v2 net 09/16] ipv4: icmp: convert to dev_net_rcu()
  2025-02-04 16:21           ` Matthieu Baerts
@ 2025-02-04 16:56             ` Eric Dumazet
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2025-02-04 16:56 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: Jakub Kicinski, David S . Miller, Paolo Abeni, netdev,
	Kuniyuki Iwashima, Simon Horman, eric.dumazet

On Tue, Feb 4, 2025 at 5:21 PM Matthieu Baerts <matttbe@kernel.org> wrote:
>
> Hi Eric,
>
> On 04/02/2025 11:35, Eric Dumazet wrote:
> > On Tue, Feb 4, 2025 at 5:57 AM Eric Dumazet <edumazet@google.com> wrote:
> >>
> >> On Tue, Feb 4, 2025 at 5:14 AM Eric Dumazet <edumazet@google.com> wrote:
> >>>
> >>> On Tue, Feb 4, 2025 at 12:36 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >>>>
> >>>> On Mon,  3 Feb 2025 14:30:39 +0000 Eric Dumazet wrote:
> >>>>> @@ -611,9 +611,9 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
> >>>>>               goto out;
> >>>>>
> >>>>>       if (rt->dst.dev)
> >>>>> -             net = dev_net(rt->dst.dev);
> >>>>> +             net = dev_net_rcu(rt->dst.dev);
> >>>>>       else if (skb_in->dev)
> >>>>> -             net = dev_net(skb_in->dev);
> >>>>> +             net = dev_net_rcu(skb_in->dev);
> >>>>>       else
> >>>>>               goto out;
> >>>>
> >>>> Hm. Weird. NIPA says this one is not under RCU.
> >>>>
> >>>> Decoded:
> >>>> https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/976941/vm-crash-thr0-1
> >>>
> >>> Oops, I thought I ran the tests on the whole series. I missed this one.
> >>
> >> BTW, ICMPv6 has the same potential problem, I will amend both cases.
> >
> > > I ran the tests again for v3 and got an unrelated crash, FYI.
> >
> > 14237.095216] #PF: supervisor instruction fetch in kernel mode
> > [14237.095570] #PF: error_code(0x0010) - not-present page
> > [14237.095915] PGD 1e58067 P4D 1e58067 PUD ce1c067 PMD 0
> > [14237.096991] Oops: Oops: 0010 [#1] SMP DEBUG_PAGEALLOC NOPTI
> > > [14237.097507] CPU: 0 UID: 0 PID: 6371 Comm: python3 Not tainted 6.13.0-virtme #1559
> > > [14237.098045] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > [14237.098578] RIP: 0010:0x0
> > [14237.099324] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> > [14237.099752] RSP: 0018:ffffacfd4486bed0 EFLAGS: 00000286
> > [14237.100079] RAX: 0000000000000000 RBX: ffff9af502607200 RCX: 0000000000000002
> > [14237.100452] RDX: 00007fffc684a690 RSI: 0000000000005401 RDI: ffff9af502607200
> > [14237.100821] RBP: 0000000000005401 R08: 0000000000000001 R09: 0000000000000000
> > [14237.101182] R10: 0000000000000001 R11: 0000000000000000 R12: 00007fffc684a690
> > [14237.101542] R13: ffff9af50888ed68 R14: ffff9af502607200 R15: 0000000000000000
> > > [14237.101956] FS:  00007f76b73f95c0(0000) GS:ffff9af57cc00000(0000) knlGS:0000000000000000
> > [14237.102372] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [14237.102679] CR2: ffffffffffffffd6 CR3: 00000000039ca000 CR4: 00000000000006f0
> > [14237.103160] Call Trace:
> > [14237.103435]  <TASK>
> > [14237.103720]  ? __die_body.cold+0x19/0x26
> > [14237.104340]  ? page_fault_oops+0x134/0x2a0
> > [14237.104553]  ? cp_new_stat+0x157/0x190
> > [14237.104799]  ? exc_page_fault+0x68/0x230
> > [14237.105013]  ? asm_exc_page_fault+0x26/0x30
> > [14237.105259]  full_proxy_unlocked_ioctl+0x63/0x90
> > [14237.105546]  __x64_sys_ioctl+0x97/0xc0
> > [14237.105754]  do_syscall_64+0x72/0x180
> > [14237.105949]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> I think I hit this issue as well on the MPTCP side when using GCOV, but I
> guess anything else using debugfs could trigger it too. It is apparently
> fixed in Linus' tree, see 57b314752ec0 ("debugfs: Fix the missing
> initializations in __debugfs_file_get()")
>
> https://lore.kernel.org/all/20250129191937.GR1977892@ZenIV/

Great, thanks for the info :)

Thread overview: 26+ messages
2025-02-03 14:30 [PATCH v2 net 00/16] net: first round to use dev_net_rcu() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 01/16] net: add dev_net_rcu() helper Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 02/16] ipv4: add RCU protection to ip4_dst_hoplimit() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 03/16] ipv4: use RCU protection in ip_dst_mtu_maybe_forward() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 04/16] ipv4: use RCU protection in ipv4_default_advmss() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 05/16] ipv4: use RCU protection in rt_is_expired() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 06/16] tcp: convert to dev_net_rcu() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 07/16] net: gro: convert four dev_net() calls Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 08/16] udp: convert to dev_net_rcu() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 09/16] ipv4: icmp: " Eric Dumazet
2025-02-03 23:36   ` Jakub Kicinski
2025-02-04  4:14     ` Eric Dumazet
2025-02-04  4:57       ` Eric Dumazet
2025-02-04 10:35         ` Eric Dumazet
2025-02-04 16:21           ` Matthieu Baerts
2025-02-04 16:56             ` Eric Dumazet
2025-02-04 16:21           ` Jakub Kicinski
2025-02-03 14:30 ` [PATCH v2 net 10/16] ipv6: " Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 11/16] ipv6: input: " Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 12/16] ipv6: output: " Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 13/16] ipv6: use RCU protection in ip6_default_advmss() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 14/16] net: filter: convert to dev_net_rcu() Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 15/16] flow_dissector: use rcu protection to fetch dev_net() Eric Dumazet
2025-02-03 23:38   ` Jakub Kicinski
2025-02-04  4:16     ` Eric Dumazet
2025-02-03 14:30 ` [PATCH v2 net 16/16] ipv4: use RCU protection in inet_select_addr() Eric Dumazet