netdev.vger.kernel.org archive mirror
* [PATCH net-next 00/14] ipv6: round of data-races fixes
@ 2023-09-12 16:01 Eric Dumazet
  2023-09-12 16:01 ` [PATCH net-next 01/14] ipv6: lockless IPV6_UNICAST_HOPS implementation Eric Dumazet
                   ` (15 more replies)
  0 siblings, 16 replies; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:01 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

This series is inspired by one related syzbot report.

Many reads and writes of inet6_sk(sk) fields are racy.

Move 1-bit fields to inet->inet_flags to provide
atomic safety. inet6_{test|set|clear|assign}_bit() helpers
could be changed later if we need to make room in inet_flags.

Also add missing READ_ONCE()/WRITE_ONCE() when
lockless readers need access to specific fields.

np->srcprefs will be handled separately to avoid merge conflicts,
because a prior patch was posted for the net tree.

Eric Dumazet (14):
  ipv6: lockless IPV6_UNICAST_HOPS implementation
  ipv6: lockless IPV6_MULTICAST_LOOP implementation
  ipv6: lockless IPV6_MULTICAST_HOPS implementation
  ipv6: lockless IPV6_MTU implementation
  ipv6: lockless IPV6_MINHOPCOUNT implementation
  ipv6: lockless IPV6_RECVERR_RFC4884 implementation
  ipv6: lockless IPV6_MULTICAST_ALL implementation
  ipv6: lockless IPV6_AUTOFLOWLABEL implementation
  ipv6: lockless IPV6_DONTFRAG implementation
  ipv6: lockless IPV6_RECVERR implementation
  ipv6: move np->repflow to atomic flags
  ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation
  ipv6: lockless IPV6_MTU_DISCOVER implementation
  ipv6: lockless IPV6_FLOWINFO_SEND implementation

 include/linux/ipv6.h            |  49 +++----
 include/net/inet_sock.h         |  10 ++
 include/net/ip6_route.h         |  14 +-
 include/net/ipv6.h              |  16 +--
 include/net/sock.h              |   2 +-
 include/net/xfrm.h              |   2 +-
 net/core/sock.c                 |   4 +-
 net/dccp/ipv6.c                 |   8 +-
 net/ipv4/ping.c                 |   5 +-
 net/ipv6/af_inet6.c             |   9 +-
 net/ipv6/datagram.c             |  15 +--
 net/ipv6/icmp.c                 |   4 +-
 net/ipv6/ip6_flowlabel.c        |   8 +-
 net/ipv6/ip6_output.c           |  42 +++---
 net/ipv6/ipv6_sockglue.c        | 223 +++++++++++++++-----------------
 net/ipv6/mcast.c                |   4 +-
 net/ipv6/ndisc.c                |   4 +-
 net/ipv6/ping.c                 |   4 +-
 net/ipv6/raw.c                  |  16 +--
 net/ipv6/tcp_ipv6.c             |  21 ++-
 net/ipv6/udp.c                  |  12 +-
 net/l2tp/l2tp_ip6.c             |   6 +-
 net/netfilter/ipvs/ip_vs_sync.c |  12 +-
 net/sctp/ipv6.c                 |   7 +-
 24 files changed, 238 insertions(+), 259 deletions(-)

-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 01/14] ipv6: lockless IPV6_UNICAST_HOPS implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
@ 2023-09-12 16:01 ` Eric Dumazet
  2023-09-14 14:51   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 02/14] ipv6: lockless IPV6_MULTICAST_LOOP implementation Eric Dumazet
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:01 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Some np->hop_limit accesses are racy when the socket lock is not held.

Add missing annotations and switch to a fully lockless implementation.
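
As a rough userspace illustration of the pattern (not the kernel code itself), the sketch below pairs a range-checked lockless writer with a reader that snapshots the field once and then works on the local copy. The READ_ONCE()/WRITE_ONCE() macros here are simplified stand-ins for scalar types, and struct demo_pinfo and the demo_* functions are hypothetical names used only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros (assumption:
 * scalar types only, GCC/Clang __typeof__ extension available).
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical miniature of the per-socket IPv6 state. */
struct demo_pinfo {
	int16_t hop_limit;	/* -1 means "use the route default" */
};

/* Writer: validates the value and stores it without taking any lock. */
static int demo_set_unicast_hops(struct demo_pinfo *np, int val)
{
	if (val > 255 || val < -1)
		return -EINVAL;
	WRITE_ONCE(np->hop_limit, val);
	return 0;
}

/* Reader: loads the field exactly once, then uses only the local copy. */
static int demo_hoplimit(const struct demo_pinfo *np, int route_default)
{
	int hlimit = READ_ONCE(np->hop_limit);

	if (hlimit < 0)
		hlimit = route_default;
	return hlimit;
}
```

A plain full-width s16 matters here: one cannot take the address of a bit-field, so a 9-bit hop_limit could not be wrapped in these macros.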

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/ipv6.h     | 12 +-----------
 include/net/ipv6.h       |  2 +-
 net/ipv6/ip6_output.c    |  2 +-
 net/ipv6/ipv6_sockglue.c | 20 +++++++++++---------
 net/ipv6/mcast.c         |  2 +-
 net/ipv6/ndisc.c         |  2 +-
 6 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index af8a771a053c51eed297516f927a5fd003315ef4..c2e0870713849fbbf1a8ec2d60cca80caab0cb98 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -213,17 +213,7 @@ struct ipv6_pinfo {
 	__be32			flow_label;
 	__u32			frag_size;
 
-	/*
-	 * Packed in 16bits.
-	 * Omit one shift by putting the signed field at MSB.
-	 */
-#if defined(__BIG_ENDIAN_BITFIELD)
-	__s16			hop_limit:9;
-	__u16			__unused_1:7;
-#else
-	__u16			__unused_1:7;
-	__s16			hop_limit:9;
-#endif
+	s16			hop_limit;
 
 #if defined(__BIG_ENDIAN_BITFIELD)
 	/* Packed in 16bits. */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 0675be0f3fa0efc55575bb5b2569dc8a1dbb9f24..61007db0036482e27121747add0eec77f912b54a 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -911,7 +911,7 @@ static inline int ip6_sk_dst_hoplimit(struct ipv6_pinfo *np, struct flowi6 *fl6,
 	if (ipv6_addr_is_multicast(&fl6->daddr))
 		hlimit = np->mcast_hops;
 	else
-		hlimit = np->hop_limit;
+		hlimit = READ_ONCE(np->hop_limit);
 	if (hlimit < 0)
 		hlimit = ip6_dst_hoplimit(dst);
 	return hlimit;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 54fc4c711f2c545f2ca625d6b0e09f2bb8e6d513..1e16d56d8c38ac51bd999038ae4e8478bf2f5f8c 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -309,7 +309,7 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
 	 *	Fill in the IPv6 header
 	 */
 	if (np)
-		hlimit = np->hop_limit;
+		hlimit = READ_ONCE(np->hop_limit);
 	if (hlimit < 0)
 		hlimit = ip6_dst_hoplimit(dst);
 
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 0e2a0847b387f0f6f50211b89f92ac1e00a0b07a..f27993a1470dddd876f34f65c1f171c576eca272 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -415,6 +415,16 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 	if (ip6_mroute_opt(optname))
 		return ip6_mroute_setsockopt(sk, optname, optval, optlen);
 
+	/* Handle options that can be set without locking the socket. */
+	switch (optname) {
+	case IPV6_UNICAST_HOPS:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		if (val > 255 || val < -1)
+			return -EINVAL;
+		WRITE_ONCE(np->hop_limit, val);
+		return 0;
+	}
 	if (needs_rtnl)
 		rtnl_lock();
 	sockopt_lock_sock(sk);
@@ -733,14 +743,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		}
 		break;
 	}
-	case IPV6_UNICAST_HOPS:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		if (val > 255 || val < -1)
-			goto e_inval;
-		np->hop_limit = val;
-		retv = 0;
-		break;
 
 	case IPV6_MULTICAST_HOPS:
 		if (sk->sk_type == SOCK_STREAM)
@@ -1347,7 +1349,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		struct dst_entry *dst;
 
 		if (optname == IPV6_UNICAST_HOPS)
-			val = np->hop_limit;
+			val = READ_ONCE(np->hop_limit);
 		else
 			val = np->mcast_hops;
 
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 5ce25bcb9974de97f26635d0d3d54695af3070a7..6a33a50687bcf7201e75574f03e619fe89636068 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1716,7 +1716,7 @@ static void ip6_mc_hdr(const struct sock *sk, struct sk_buff *skb,
 
 	hdr->payload_len = htons(len);
 	hdr->nexthdr = proto;
-	hdr->hop_limit = inet6_sk(sk)->hop_limit;
+	hdr->hop_limit = READ_ONCE(inet6_sk(sk)->hop_limit);
 
 	hdr->saddr = *saddr;
 	hdr->daddr = *daddr;
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 553c8664e0a7a37d7858393ab6a30616ab13a3bf..b554fd40bdc3787eb3bafa1d9923076d6078217e 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -500,7 +500,7 @@ void ndisc_send_skb(struct sk_buff *skb, const struct in6_addr *daddr,
 					      csum_partial(icmp6h,
 							   skb->len, 0));
 
-	ip6_nd_hdr(skb, saddr, daddr, inet6_sk(sk)->hop_limit, skb->len);
+	ip6_nd_hdr(skb, saddr, daddr, READ_ONCE(inet6_sk(sk)->hop_limit), skb->len);
 
 	rcu_read_lock();
 	idev = __in6_dev_get(dst->dev);
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 02/14] ipv6: lockless IPV6_MULTICAST_LOOP implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
  2023-09-12 16:01 ` [PATCH net-next 01/14] ipv6: lockless IPV6_UNICAST_HOPS implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 14:54   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 03/14] ipv6: lockless IPV6_MULTICAST_HOPS implementation Eric Dumazet
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Add inet6_{test|set|clear|assign}_bit() helpers.

Note that I am using bits from inet->inet_flags;
this might change in the future if we need more flags.

While solving data-races accessing np->mc_loop,
this patch also makes it possible to implement lockless
accesses to np->mcast_hops in the following patch.

Also constify sk_mc_loop() argument.
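
For readers less familiar with atomic bit helpers, here is a minimal userspace sketch of the same idea using C11 atomics. It is only an approximation of test_bit()/set_bit()/clear_bit()/assign_bit(): struct demo_sock and the demo_* helpers are hypothetical, and only the bit number 20 for MC6_LOOP is taken from this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit number taken from the patch; the DEMO_ name is hypothetical. */
enum { DEMO_FLAGS_MC6_LOOP = 20 };

/* Stand-in for the socket; flags plays the role of inet_flags. */
struct demo_sock {
	atomic_ulong flags;
};

static bool demo_test_bit(unsigned int nr, struct demo_sock *sk)
{
	return (atomic_load_explicit(&sk->flags, memory_order_relaxed) >> nr) & 1UL;
}

/* Atomic read-modify-write: concurrent updates to other bits are not lost. */
static void demo_set_bit(unsigned int nr, struct demo_sock *sk)
{
	atomic_fetch_or_explicit(&sk->flags, 1UL << nr, memory_order_relaxed);
}

static void demo_clear_bit(unsigned int nr, struct demo_sock *sk)
{
	atomic_fetch_and_explicit(&sk->flags, ~(1UL << nr), memory_order_relaxed);
}

static void demo_assign_bit(unsigned int nr, struct demo_sock *sk, bool val)
{
	if (val)
		demo_set_bit(nr, sk);
	else
		demo_clear_bit(nr, sk);
}
```

By contrast, a plain store to a 1-bit bit-field is a non-atomic read-modify-write of the whole storage unit, so two unlocked writers touching neighbouring bit-fields can overwrite each other's update.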

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/ipv6.h            | 18 ++++++++++++++----
 include/net/inet_sock.h         |  1 +
 include/net/sock.h              |  2 +-
 net/core/sock.c                 |  4 ++--
 net/ipv6/af_inet6.c             |  2 +-
 net/ipv6/ipv6_sockglue.c        | 18 ++++++++----------
 net/ipv6/ndisc.c                |  2 +-
 net/netfilter/ipvs/ip_vs_sync.c |  8 ++------
 8 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index c2e0870713849fbbf1a8ec2d60cca80caab0cb98..68cf1ca949141e419abf2031db2b42105b821ab0 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -218,11 +218,9 @@ struct ipv6_pinfo {
 #if defined(__BIG_ENDIAN_BITFIELD)
 	/* Packed in 16bits. */
 	__s16			mcast_hops:9;
-	__u16			__unused_2:6,
-				mc_loop:1;
+	__u16			__unused_2:7,
 #else
-	__u16			mc_loop:1,
-				__unused_2:6;
+	__u16			__unused_2:7;
 	__s16			mcast_hops:9;
 #endif
 	int			ucast_oif;
@@ -283,6 +281,18 @@ struct ipv6_pinfo {
 	struct inet6_cork	cork;
 };
 
+/* We currently use available bits from inet_sk(sk)->inet_flags,
+ * this could change in the future.
+ */
+#define inet6_test_bit(nr, sk)			\
+	test_bit(INET_FLAGS_##nr, &inet_sk(sk)->inet_flags)
+#define inet6_set_bit(nr, sk)			\
+	set_bit(INET_FLAGS_##nr, &inet_sk(sk)->inet_flags)
+#define inet6_clear_bit(nr, sk)			\
+	clear_bit(INET_FLAGS_##nr, &inet_sk(sk)->inet_flags)
+#define inet6_assign_bit(nr, sk, val)		\
+	assign_bit(INET_FLAGS_##nr, &inet_sk(sk)->inet_flags, val)
+
 /* WARNING: don't change the layout of the members in {raw,udp,tcp}6_sock! */
 struct raw6_sock {
 	/* inet_sock has to be the first member of raw6_sock */
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 2de0e4d4a027889706323b7ee4b96e406101bff4..b5a9dca92fb45425c032bdf08bfa88cad77926b8 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -268,6 +268,7 @@ enum {
 	INET_FLAGS_NODEFRAG	= 17,
 	INET_FLAGS_BIND_ADDRESS_NO_PORT = 18,
 	INET_FLAGS_DEFER_CONNECT = 19,
+	INET_FLAGS_MC6_LOOP	= 20,
 };
 
 /* cmsg flags for inet */
diff --git a/include/net/sock.h b/include/net/sock.h
index b770261fbdaf59d4d1c0b30adb2592c56442e9e3..9e1c17e56971f8714d421d58e408bf3face421b0 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2239,7 +2239,7 @@ static inline void sock_confirm_neigh(struct sk_buff *skb, struct neighbour *n)
 	}
 }
 
-bool sk_mc_loop(struct sock *sk);
+bool sk_mc_loop(const struct sock *sk);
 
 static inline bool sk_can_gso(const struct sock *sk)
 {
diff --git a/net/core/sock.c b/net/core/sock.c
index 16584e2dd6481a3fc28d796db785439f0446703b..b2a9b5630bb513d5e5b99a6b7d3cef54af3a4b6f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -759,7 +759,7 @@ static int sock_getbindtodevice(struct sock *sk, sockptr_t optval,
 	return ret;
 }
 
-bool sk_mc_loop(struct sock *sk)
+bool sk_mc_loop(const struct sock *sk)
 {
 	if (dev_recursion_level())
 		return false;
@@ -771,7 +771,7 @@ bool sk_mc_loop(struct sock *sk)
 		return inet_test_bit(MC_LOOP, sk);
 #if IS_ENABLED(CONFIG_IPV6)
 	case AF_INET6:
-		return inet6_sk(sk)->mc_loop;
+		return inet6_test_bit(MC6_LOOP, sk);
 #endif
 	}
 	WARN_ON_ONCE(1);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 368824fe9719f92b46512f3f78446fe5bc802ef7..bbd4aa1b96d09d346c521dab2194045123e7a5a6 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -217,7 +217,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
 	inet_sk(sk)->pinet6 = np = inet6_sk_generic(sk);
 	np->hop_limit	= -1;
 	np->mcast_hops	= IPV6_DEFAULT_MCASTHOPS;
-	np->mc_loop	= 1;
+	inet6_set_bit(MC6_LOOP, sk);
 	np->mc_all	= 1;
 	np->pmtudisc	= IPV6_PMTUDISC_WANT;
 	np->repflow	= net->ipv6.sysctl.flowlabel_reflect & FLOWLABEL_REFLECT_ESTABLISHED;
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index f27993a1470dddd876f34f65c1f171c576eca272..755fac85a120de44272f685529b579e7118d306b 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -424,6 +424,13 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			return -EINVAL;
 		WRITE_ONCE(np->hop_limit, val);
 		return 0;
+	case IPV6_MULTICAST_LOOP:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		if (val != valbool)
+			return -EINVAL;
+		inet6_assign_bit(MC6_LOOP, sk, valbool);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -755,15 +762,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		retv = 0;
 		break;
 
-	case IPV6_MULTICAST_LOOP:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		if (val != valbool)
-			goto e_inval;
-		np->mc_loop = valbool;
-		retv = 0;
-		break;
-
 	case IPV6_UNICAST_IF:
 	{
 		struct net_device *dev = NULL;
@@ -1367,7 +1365,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 	}
 
 	case IPV6_MULTICAST_LOOP:
-		val = np->mc_loop;
+		val = inet6_test_bit(MC6_LOOP, sk);
 		break;
 
 	case IPV6_MULTICAST_IF:
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index b554fd40bdc3787eb3bafa1d9923076d6078217e..679443d7ecb586af17fa22f9ecf573318a6ac49d 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1996,7 +1996,7 @@ static int __net_init ndisc_net_init(struct net *net)
 	np = inet6_sk(sk);
 	np->hop_limit = 255;
 	/* Do not loopback ndisc messages */
-	np->mc_loop = 0;
+	inet6_clear_bit(MC6_LOOP, sk);
 
 	return 0;
 }
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index da5af28ff57b5254c0ec8976c4180113037c96a0..3c2251cabd0439834ca0fc2b8bbf0ecc6cfe9266 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1298,17 +1298,13 @@ static void set_sock_size(struct sock *sk, int mode, int val)
 static void set_mcast_loop(struct sock *sk, u_char loop)
 {
 	/* setsockopt(sock, SOL_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop)); */
-	lock_sock(sk);
 	inet_assign_bit(MC_LOOP, sk, loop);
 #ifdef CONFIG_IP_VS_IPV6
-	if (sk->sk_family == AF_INET6) {
-		struct ipv6_pinfo *np = inet6_sk(sk);
-
+	if (READ_ONCE(sk->sk_family) == AF_INET6) {
 		/* IPV6_MULTICAST_LOOP */
-		np->mc_loop = loop ? 1 : 0;
+		inet6_assign_bit(MC6_LOOP, sk, loop);
 	}
 #endif
-	release_sock(sk);
 }
 
 /*
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 03/14] ipv6: lockless IPV6_MULTICAST_HOPS implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
  2023-09-12 16:01 ` [PATCH net-next 01/14] ipv6: lockless IPV6_UNICAST_HOPS implementation Eric Dumazet
  2023-09-12 16:02 ` [PATCH net-next 02/14] ipv6: lockless IPV6_MULTICAST_LOOP implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 14:55   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 04/14] ipv6: lockless IPV6_MTU implementation Eric Dumazet
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

This fixes data-races around np->mcast_hops,
and makes IPV6_MULTICAST_HOPS lockless.

Note that np->mcast_hops is never negative,
so it fits in a u8 field instead of an s16.
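
To see why a u8 is enough, consider this small userspace sketch of the setsockopt() validation. The accepted range is -1..255, and -1 is replaced by the default before it is stored, so the stored value is always in 0..255. DEMO_DEFAULT_MCASTHOPS and demo_set_mcast_hops() are hypothetical names; the value 1 is assumed to match IPV6_DEFAULT_MCASTHOPS.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed to mirror IPV6_DEFAULT_MCASTHOPS. */
#define DEMO_DEFAULT_MCASTHOPS 1

/* Mirrors the range check in the patch; the kernel stores the result
 * with WRITE_ONCE(), a plain store is used here for brevity.
 */
static int demo_set_mcast_hops(uint8_t *mcast_hops, int val)
{
	if (val > 255 || val < -1)
		return -EINVAL;
	/* -1 never reaches the field: it is mapped to the default first. */
	*mcast_hops = (uint8_t)(val == -1 ? DEMO_DEFAULT_MCASTHOPS : val);
	return 0;
}
```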

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/ipv6.h            |  9 +--------
 include/net/ipv6.h              |  2 +-
 net/dccp/ipv6.c                 |  2 +-
 net/ipv6/ipv6_sockglue.c        | 28 +++++++++++++++-------------
 net/ipv6/tcp_ipv6.c             |  3 ++-
 net/netfilter/ipvs/ip_vs_sync.c |  2 +-
 6 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 68cf1ca949141e419abf2031db2b42105b821ab0..9cc278b5e4f42ce097e57ecd95a50479a947fd82 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -214,15 +214,8 @@ struct ipv6_pinfo {
 	__u32			frag_size;
 
 	s16			hop_limit;
+	u8			mcast_hops;
 
-#if defined(__BIG_ENDIAN_BITFIELD)
-	/* Packed in 16bits. */
-	__s16			mcast_hops:9;
-	__u16			__unused_2:7,
-#else
-	__u16			__unused_2:7;
-	__s16			mcast_hops:9;
-#endif
 	int			ucast_oif;
 	int			mcast_oif;
 
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 61007db0036482e27121747add0eec77f912b54a..0af1a7565a3602e4deb68762267cba454750341e 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -909,7 +909,7 @@ static inline int ip6_sk_dst_hoplimit(struct ipv6_pinfo *np, struct flowi6 *fl6,
 	int hlimit;
 
 	if (ipv6_addr_is_multicast(&fl6->daddr))
-		hlimit = np->mcast_hops;
+		hlimit = READ_ONCE(np->mcast_hops);
 	else
 		hlimit = READ_ONCE(np->hop_limit);
 	if (hlimit < 0)
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 33f6ccf6ba77b9bcc24054b09857aaee4bb71acf..83617a16b98e70aa577c08a394df63e006e53e9e 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -676,7 +676,7 @@ static int dccp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 		if (np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo)
 			np->mcast_oif = inet6_iif(opt_skb);
 		if (np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim)
-			np->mcast_hops = ipv6_hdr(opt_skb)->hop_limit;
+			WRITE_ONCE(np->mcast_hops, ipv6_hdr(opt_skb)->hop_limit);
 		if (np->rxopt.bits.rxflow || np->rxopt.bits.rxtclass)
 			np->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(opt_skb));
 		if (np->repflow)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 755fac85a120de44272f685529b579e7118d306b..5fff19a87c75518358bae067dfeb227d6738bb03 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -431,6 +431,16 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			return -EINVAL;
 		inet6_assign_bit(MC6_LOOP, sk, valbool);
 		return 0;
+	case IPV6_MULTICAST_HOPS:
+		if (sk->sk_type == SOCK_STREAM)
+			return retv;
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		if (val > 255 || val < -1)
+			return -EINVAL;
+		WRITE_ONCE(np->mcast_hops,
+			   val == -1 ? IPV6_DEFAULT_MCASTHOPS : val);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -751,16 +761,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		break;
 	}
 
-	case IPV6_MULTICAST_HOPS:
-		if (sk->sk_type == SOCK_STREAM)
-			break;
-		if (optlen < sizeof(int))
-			goto e_inval;
-		if (val > 255 || val < -1)
-			goto e_inval;
-		np->mcast_hops = (val == -1 ? IPV6_DEFAULT_MCASTHOPS : val);
-		retv = 0;
-		break;
 
 	case IPV6_UNICAST_IF:
 	{
@@ -1180,7 +1180,8 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 				put_cmsg(&msg, SOL_IPV6, IPV6_PKTINFO, sizeof(src_info), &src_info);
 			}
 			if (np->rxopt.bits.rxhlim) {
-				int hlim = np->mcast_hops;
+				int hlim = READ_ONCE(np->mcast_hops);
+
 				put_cmsg(&msg, SOL_IPV6, IPV6_HOPLIMIT, sizeof(hlim), &hlim);
 			}
 			if (np->rxopt.bits.rxtclass) {
@@ -1197,7 +1198,8 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 				put_cmsg(&msg, SOL_IPV6, IPV6_2292PKTINFO, sizeof(src_info), &src_info);
 			}
 			if (np->rxopt.bits.rxohlim) {
-				int hlim = np->mcast_hops;
+				int hlim = READ_ONCE(np->mcast_hops);
+
 				put_cmsg(&msg, SOL_IPV6, IPV6_2292HOPLIMIT, sizeof(hlim), &hlim);
 			}
 			if (np->rxopt.bits.rxflow) {
@@ -1349,7 +1351,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		if (optname == IPV6_UNICAST_HOPS)
 			val = READ_ONCE(np->hop_limit);
 		else
-			val = np->mcast_hops;
+			val = READ_ONCE(np->mcast_hops);
 
 		if (val < 0) {
 			rcu_read_lock();
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 3a88545a265d6bd064ecc41d33c9541a75fe0f4d..54db5fab318bc68cf9efbe6f26dacba614fa8562 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1542,7 +1542,8 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 		if (np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo)
 			np->mcast_oif = tcp_v6_iif(opt_skb);
 		if (np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim)
-			np->mcast_hops = ipv6_hdr(opt_skb)->hop_limit;
+			WRITE_ONCE(np->mcast_hops,
+				   ipv6_hdr(opt_skb)->hop_limit);
 		if (np->rxopt.bits.rxflow || np->rxopt.bits.rxtclass)
 			np->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(opt_skb));
 		if (np->repflow)
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 3c2251cabd0439834ca0fc2b8bbf0ecc6cfe9266..df1b33b61059eef1e86baefc63e138108a50a081 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1322,7 +1322,7 @@ static void set_mcast_ttl(struct sock *sk, u_char ttl)
 		struct ipv6_pinfo *np = inet6_sk(sk);
 
 		/* IPV6_MULTICAST_HOPS */
-		np->mcast_hops = ttl;
+		WRITE_ONCE(np->mcast_hops, ttl);
 	}
 #endif
 	release_sock(sk);
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 04/14] ipv6: lockless IPV6_MTU implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (2 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 03/14] ipv6: lockless IPV6_MULTICAST_HOPS implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 14:58   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 05/14] ipv6: lockless IPV6_MINHOPCOUNT implementation Eric Dumazet
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

np->frag_size can be read/written without holding the socket lock.

Add missing annotations and make IPV6_MTU setsockopt() lockless.
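
The reader-side change can be sketched in userspace as follows. The old code read np->frag_size up to three times, so a concurrent writer could change it between the comparison and the assignment; taking one snapshot avoids that. READ_ONCE() below is a simplified stand-in, and struct demo_pinfo and demo_clamp_mtu() are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace stand-in for the kernel macro (scalar types only). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical miniature of the per-socket IPv6 state. */
struct demo_pinfo {
	uint32_t frag_size;	/* 0 means "no per-socket limit" */
};

/* Snapshot frag_size once; the test and the assignment then see the
 * same value even if another thread updates the field meanwhile.
 */
static unsigned int demo_clamp_mtu(const struct demo_pinfo *np, unsigned int mtu)
{
	uint32_t frag_size = READ_ONCE(np->frag_size);

	if (frag_size && frag_size < mtu)
		mtu = frag_size;
	return mtu;
}
```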

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv6/ip6_output.c    | 19 +++++++++++--------
 net/ipv6/ipv6_sockglue.c | 15 +++++++--------
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 1e16d56d8c38ac51bd999038ae4e8478bf2f5f8c..ab7ede4a731a96fe6dce3205df29b298c923acc7 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -881,9 +881,11 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 			mtu = IPV6_MIN_MTU;
 	}
 
-	if (np && np->frag_size < mtu) {
-		if (np->frag_size)
-			mtu = np->frag_size;
+	if (np) {
+		u32 frag_size = READ_ONCE(np->frag_size);
+
+		if (frag_size && frag_size < mtu)
+			mtu = frag_size;
 	}
 	if (mtu < hlen + sizeof(struct frag_hdr) + 8)
 		goto fail_toobig;
@@ -1392,7 +1394,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 			  struct rt6_info *rt)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
-	unsigned int mtu;
+	unsigned int mtu, frag_size;
 	struct ipv6_txoptions *nopt, *opt = ipc6->opt;
 
 	/* callers pass dst together with a reference, set it first so
@@ -1441,10 +1443,11 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 	else
 		mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
 			READ_ONCE(rt->dst.dev->mtu) : dst_mtu(xfrm_dst_path(&rt->dst));
-	if (np->frag_size < mtu) {
-		if (np->frag_size)
-			mtu = np->frag_size;
-	}
+
+	frag_size = READ_ONCE(np->frag_size);
+	if (frag_size && frag_size < mtu)
+		mtu = frag_size;
+
 	cork->base.fragsize = mtu;
 	cork->base.gso_size = ipc6->gso_size;
 	cork->base.tx_flags = 0;
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 5fff19a87c75518358bae067dfeb227d6738bb03..3b2a34828daab5c666d7b429afa961279739c70b 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -441,6 +441,13 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		WRITE_ONCE(np->mcast_hops,
 			   val == -1 ? IPV6_DEFAULT_MCASTHOPS : val);
 		return 0;
+	case IPV6_MTU:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		if (val && val < IPV6_MIN_MTU)
+			return -EINVAL;
+		WRITE_ONCE(np->frag_size, val);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -910,14 +917,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		np->pmtudisc = val;
 		retv = 0;
 		break;
-	case IPV6_MTU:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		if (val && val < IPV6_MIN_MTU)
-			goto e_inval;
-		np->frag_size = val;
-		retv = 0;
-		break;
 	case IPV6_RECVERR:
 		if (optlen < sizeof(int))
 			goto e_inval;
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 05/14] ipv6: lockless IPV6_MINHOPCOUNT implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (3 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 04/14] ipv6: lockless IPV6_MTU implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:01   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 06/14] ipv6: lockless IPV6_RECVERR_RFC4884 implementation Eric Dumazet
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Add one missing READ_ONCE() annotation in do_ipv6_getsockopt()
and make IPV6_MINHOPCOUNT setsockopt() lockless.
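
A hedged userspace sketch of how such a lockless writer can coexist with receive-path readers is shown below. It assumes, as the comment in the diff suggests, that the receive path compares each packet's hop limit against min_hopcount; the exact receive-side code is not part of this thread. The static key is approximated by an atomic flag that is only ever turned on, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Rough stand-in for the ip6_min_hopcount static key: once enabled,
 * it stays enabled.
 */
static atomic_bool demo_min_hopcount_enabled;

/* Hypothetical miniature of the per-socket IPv6 state. */
struct demo_pinfo {
	_Atomic uint8_t min_hopcount;
};

/* Writer: no socket lock, only a range check and marked stores. */
static int demo_set_min_hopcount(struct demo_pinfo *np, int val)
{
	if (val < 0 || val > 255)
		return -EINVAL;
	if (val)
		atomic_store_explicit(&demo_min_hopcount_enabled, true,
				      memory_order_relaxed);
	atomic_store_explicit(&np->min_hopcount, (uint8_t)val,
			      memory_order_relaxed);
	return 0;
}

/* Reader (assumed receive-path check): accept the packet unless its
 * hop limit is below the configured minimum.
 */
static bool demo_accept_packet(struct demo_pinfo *np, uint8_t hop_limit)
{
	if (!atomic_load_explicit(&demo_min_hopcount_enabled,
				  memory_order_relaxed))
		return true;
	return hop_limit >= atomic_load_explicit(&np->min_hopcount,
						 memory_order_relaxed);
}
```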

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv6/ipv6_sockglue.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 3b2a34828daab5c666d7b429afa961279739c70b..bbc8a009e05d3de49868e1ccf469a12bc31b8e22 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -448,6 +448,20 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			return -EINVAL;
 		WRITE_ONCE(np->frag_size, val);
 		return 0;
+	case IPV6_MINHOPCOUNT:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		if (val < 0 || val > 255)
+			return -EINVAL;
+
+		if (val)
+			static_branch_enable(&ip6_min_hopcount);
+
+		/* tcp_v6_err() and tcp_v6_rcv() might read min_hopcount
+		 * while we are changing it.
+		 */
+		WRITE_ONCE(np->min_hopcount, val);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -947,21 +961,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			goto e_inval;
 		retv = __ip6_sock_set_addr_preferences(sk, val);
 		break;
-	case IPV6_MINHOPCOUNT:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		if (val < 0 || val > 255)
-			goto e_inval;
-
-		if (val)
-			static_branch_enable(&ip6_min_hopcount);
-
-		/* tcp_v6_err() and tcp_v6_rcv() might read min_hopcount
-		 * while we are changing it.
-		 */
-		WRITE_ONCE(np->min_hopcount, val);
-		retv = 0;
-		break;
 	case IPV6_DONTFRAG:
 		np->dontfrag = valbool;
 		retv = 0;
@@ -1443,7 +1442,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_MINHOPCOUNT:
-		val = np->min_hopcount;
+		val = READ_ONCE(np->min_hopcount);
 		break;
 
 	case IPV6_DONTFRAG:
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 06/14] ipv6: lockless IPV6_RECVERR_RFC4884 implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (4 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 05/14] ipv6: lockless IPV6_MINHOPCOUNT implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:02   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 07/14] ipv6: lockless IPV6_MULTICAST_ALL implementation Eric Dumazet
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Move np->recverr_rfc4884 to an atomic flag to fix data-races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/ipv6.h     |  1 -
 include/net/inet_sock.h  |  1 +
 net/ipv6/datagram.c      |  2 +-
 net/ipv6/ipv6_sockglue.c | 17 ++++++++---------
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 9cc278b5e4f42ce097e57ecd95a50479a947fd82..0d2b0a1b2daeaee51a03624adab5a385cc852cc7 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -256,7 +256,6 @@ struct ipv6_pinfo {
 				autoflowlabel:1,
 				autoflowlabel_set:1,
 				mc_all:1,
-				recverr_rfc4884:1,
 				rtalert_isolate:1;
 	__u8			min_hopcount;
 	__u8			tclass;
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index b5a9dca92fb45425c032bdf08bfa88cad77926b8..8cf1f7b442348bef83cc3d9648521a01667efae7 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -269,6 +269,7 @@ enum {
 	INET_FLAGS_BIND_ADDRESS_NO_PORT = 18,
 	INET_FLAGS_DEFER_CONNECT = 19,
 	INET_FLAGS_MC6_LOOP	= 20,
+	INET_FLAGS_RECVERR6_RFC4884 = 21,
 };
 
 /* cmsg flags for inet */
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 41ebc4e574734456357169e883c3d13e42fa66b2..e81892814935fb3934fbf0e6f9defc702ec29152 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -332,7 +332,7 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err,
 
 	__skb_pull(skb, payload - skb->data);
 
-	if (inet6_sk(sk)->recverr_rfc4884)
+	if (inet6_test_bit(RECVERR6_RFC4884, sk))
 		ipv6_icmp_error_rfc4884(skb, &serr->ee.ee_rfc4884);
 
 	skb_reset_transport_header(skb);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index bbc8a009e05d3de49868e1ccf469a12bc31b8e22..b65e73ac2ccdee79aa293948d3ba9853966e1e2d 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -462,6 +462,13 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		 */
 		WRITE_ONCE(np->min_hopcount, val);
 		return 0;
+	case IPV6_RECVERR_RFC4884:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		if (val < 0 || val > 1)
+			return -EINVAL;
+		inet6_assign_bit(RECVERR6_RFC4884, sk, valbool);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -974,14 +981,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		np->rxopt.bits.recvfragsize = valbool;
 		retv = 0;
 		break;
-	case IPV6_RECVERR_RFC4884:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		if (val < 0 || val > 1)
-			goto e_inval;
-		np->recverr_rfc4884 = valbool;
-		retv = 0;
-		break;
 	}
 
 unlock:
@@ -1462,7 +1461,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_RECVERR_RFC4884:
-		val = np->recverr_rfc4884;
+		val = inet6_test_bit(RECVERR6_RFC4884, sk);
 		break;
 
 	default:
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 07/14] ipv6: lockless IPV6_MULTICAST_ALL implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (5 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 06/14] ipv6: lockless IPV6_RECVERR_RFC4884 implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:03   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 08/14] ipv6: lockless IPV6_AUTOFLOWLABEL implementation Eric Dumazet
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Move np->mc_all to an atomic flag to fix data-races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
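Not part of the patch: a minimal userspace sketch of the idea behind
moving a 1-bit field into an atomic flags word. Adjacent C bitfields
share one storage unit, so a plain store to one of them is a
read-modify-write of the whole unit and can lose a concurrent update to
a neighbour. The model below uses C11 atomics instead. All model_*
names, struct model_sock and the FLAG_* values are hypothetical
stand-ins and are not the kernel's inet6_*_bit() helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few INET_FLAGS_* bit numbers. */
enum { FLAG_MC6_LOOP = 20, FLAG_RECVERR6_RFC4884 = 21, FLAG_MC6_ALL = 22 };

struct model_sock {
	atomic_ulong flags;	/* plays the role of inet->inet_flags */
};

/* Atomically set one bit; other bits in the word are never rewritten. */
static inline void model_set_bit(struct model_sock *sk, unsigned int nr)
{
	assert(nr < 32);
	atomic_fetch_or(&sk->flags, 1UL << nr);
}

/* Atomically clear one bit. */
static inline void model_clear_bit(struct model_sock *sk, unsigned int nr)
{
	assert(nr < 32);
	atomic_fetch_and(&sk->flags, ~(1UL << nr));
}

/* Set or clear depending on val, as a setsockopt() boolean would. */
static inline void model_assign_bit(struct model_sock *sk, unsigned int nr,
				    bool val)
{
	if (val)
		model_set_bit(sk, nr);
	else
		model_clear_bit(sk, nr);
}

/* Lockless read of a single bit. */
static inline bool model_test_bit(struct model_sock *sk, unsigned int nr)
{
	assert(nr < 32);
	return (atomic_load(&sk->flags) >> nr) & 1UL;
}
```

In this model, clearing FLAG_MC6_ALL cannot disturb FLAG_MC6_LOOP even
if another thread updates it at the same time, which is the property
the bitfield version lacked.
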
 include/linux/ipv6.h     |  1 -
 include/net/inet_sock.h  |  1 +
 net/ipv6/af_inet6.c      |  2 +-
 net/ipv6/ipv6_sockglue.c | 14 ++++++--------
 net/ipv6/mcast.c         |  2 +-
 5 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 0d2b0a1b2daeaee51a03624adab5a385cc852cc7..d88e91b7f0a319a816488025ef213c4fb90ed359 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -255,7 +255,6 @@ struct ipv6_pinfo {
 				dontfrag:1,
 				autoflowlabel:1,
 				autoflowlabel_set:1,
-				mc_all:1,
 				rtalert_isolate:1;
 	__u8			min_hopcount;
 	__u8			tclass;
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 8cf1f7b442348bef83cc3d9648521a01667efae7..97e70a97dae888e6ab93c6446f4f3ba58cd8583e 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -270,6 +270,7 @@ enum {
 	INET_FLAGS_DEFER_CONNECT = 19,
 	INET_FLAGS_MC6_LOOP	= 20,
 	INET_FLAGS_RECVERR6_RFC4884 = 21,
+	INET_FLAGS_MC6_ALL	= 22,
 };
 
 /* cmsg flags for inet */
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index bbd4aa1b96d09d346c521dab2194045123e7a5a6..372fb7b9112c8dfed09b6ddfdb37016a1a668494 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -218,7 +218,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
 	np->hop_limit	= -1;
 	np->mcast_hops	= IPV6_DEFAULT_MCASTHOPS;
 	inet6_set_bit(MC6_LOOP, sk);
-	np->mc_all	= 1;
+	inet6_set_bit(MC6_ALL, sk);
 	np->pmtudisc	= IPV6_PMTUDISC_WANT;
 	np->repflow	= net->ipv6.sysctl.flowlabel_reflect & FLOWLABEL_REFLECT_ESTABLISHED;
 	sk->sk_ipv6only	= net->ipv6.sysctl.bindv6only;
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index b65e73ac2ccdee79aa293948d3ba9853966e1e2d..7a181831f226c67813446145f8f58fa58908e3ae 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -469,6 +469,11 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			return -EINVAL;
 		inet6_assign_bit(RECVERR6_RFC4884, sk, valbool);
 		return 0;
+	case IPV6_MULTICAST_ALL:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		inet6_assign_bit(MC6_ALL, sk, valbool);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -890,13 +895,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			retv = ipv6_sock_ac_drop(sk, mreq.ipv6mr_ifindex, &mreq.ipv6mr_acaddr);
 		break;
 	}
-	case IPV6_MULTICAST_ALL:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		np->mc_all = valbool;
-		retv = 0;
-		break;
-
 	case MCAST_JOIN_GROUP:
 	case MCAST_LEAVE_GROUP:
 		if (in_compat_syscall())
@@ -1372,7 +1370,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_MULTICAST_ALL:
-		val = np->mc_all;
+		val = inet6_test_bit(MC6_ALL, sk);
 		break;
 
 	case IPV6_UNICAST_IF:
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 6a33a50687bcf7201e75574f03e619fe89636068..483f797ae44d538009184b5e53ad7755d73bab4a 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -642,7 +642,7 @@ bool inet6_mc_check(const struct sock *sk, const struct in6_addr *mc_addr,
 	}
 	if (!mc) {
 		rcu_read_unlock();
-		return np->mc_all;
+		return inet6_test_bit(MC6_ALL, sk);
 	}
 	psl = rcu_dereference(mc->sflist);
 	if (!psl) {
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 08/14] ipv6: lockless IPV6_AUTOFLOWLABEL implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (6 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 07/14] ipv6: lockless IPV6_MULTICAST_ALL implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:04   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 09/14] ipv6: lockless IPV6_DONTFRAG implementation Eric Dumazet
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Move np->autoflowlabel and np->autoflowlabel_set to inet->inet_flags
to fix data-races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
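Not part of the patch: a small userspace model of the two-flag scheme,
assuming C11 atomics in place of the kernel bit helpers. The setter
stores the value bit first and the "explicitly set" bit second, in the
same order as the new setsockopt() case. The reader falls back to a
per-netns default while the "set" bit is clear. The model_* names and
the net_default parameter are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two new INET_FLAGS_* bit numbers. */
enum { FLAG_AUTOFLOWLABEL_SET = 23, FLAG_AUTOFLOWLABEL = 24 };

static_assert(FLAG_AUTOFLOWLABEL < 32, "flag must fit in unsigned long");

/* Store the requested value, then mark the option as explicitly set. */
static void model_set_autoflowlabel(atomic_ulong *flags, bool val)
{
	if (val)
		atomic_fetch_or(flags, 1UL << FLAG_AUTOFLOWLABEL);
	else
		atomic_fetch_and(flags, ~(1UL << FLAG_AUTOFLOWLABEL));
	atomic_fetch_or(flags, 1UL << FLAG_AUTOFLOWLABEL_SET);
}

/*
 * Two independent reads, one per flag, mirroring the shape of
 * ip6_autoflowlabel() after this patch. net_default stands in for the
 * value ip6_default_np_autolabel(net) would return.
 */
static bool model_autoflowlabel(atomic_ulong *flags, bool net_default)
{
	if (!((atomic_load(flags) >> FLAG_AUTOFLOWLABEL_SET) & 1UL))
		return net_default;
	return (atomic_load(flags) >> FLAG_AUTOFLOWLABEL) & 1UL;
}
```

With that ordering, a reader that runs between the two stores still sees
the "set" bit clear and returns the default, not a half-updated state.
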
 include/linux/ipv6.h     |  2 --
 include/net/inet_sock.h  |  2 ++
 include/net/ipv6.h       |  2 +-
 net/ipv6/ip6_output.c    | 12 +++++-------
 net/ipv6/ipv6_sockglue.c | 11 +++++------
 5 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index d88e91b7f0a319a816488025ef213c4fb90ed359..e3be5dc21b7d27080b398f1425bf11145896a4f3 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -253,8 +253,6 @@ struct ipv6_pinfo {
 						 * 100: prefer care-of address
 						 */
 				dontfrag:1,
-				autoflowlabel:1,
-				autoflowlabel_set:1,
 				rtalert_isolate:1;
 	__u8			min_hopcount;
 	__u8			tclass;
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 97e70a97dae888e6ab93c6446f4f3ba58cd8583e..f1af64a4067310258a3bc45b84ad3fd093bddbab 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -271,6 +271,8 @@ enum {
 	INET_FLAGS_MC6_LOOP	= 20,
 	INET_FLAGS_RECVERR6_RFC4884 = 21,
 	INET_FLAGS_MC6_ALL	= 22,
+	INET_FLAGS_AUTOFLOWLABEL_SET = 23,
+	INET_FLAGS_AUTOFLOWLABEL = 24,
 };
 
 /* cmsg flags for inet */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 0af1a7565a3602e4deb68762267cba454750341e..fe1978a288630a20ba03dc3a36e22938495082e4 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -428,7 +428,7 @@ int ipv6_flowlabel_opt_get(struct sock *sk, struct in6_flowlabel_req *freq,
 			   int flags);
 int ip6_flowlabel_init(void);
 void ip6_flowlabel_cleanup(void);
-bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np);
+bool ip6_autoflowlabel(struct net *net, const struct sock *sk);
 
 static inline void fl6_sock_release(struct ip6_flowlabel *fl)
 {
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ab7ede4a731a96fe6dce3205df29b298c923acc7..47aa42f93ccda8b49ed6ecd7a7a07703ae147928 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -232,12 +232,11 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(ip6_output);
 
-bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
+bool ip6_autoflowlabel(struct net *net, const struct sock *sk)
 {
-	if (!np->autoflowlabel_set)
+	if (!inet6_test_bit(AUTOFLOWLABEL_SET, sk))
 		return ip6_default_np_autolabel(net);
-	else
-		return np->autoflowlabel;
+	return inet6_test_bit(AUTOFLOWLABEL, sk);
 }
 
 /*
@@ -314,7 +313,7 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
 		hlimit = ip6_dst_hoplimit(dst);
 
 	ip6_flow_hdr(hdr, tclass, ip6_make_flowlabel(net, skb, fl6->flowlabel,
-				ip6_autoflowlabel(net, np), fl6));
+				ip6_autoflowlabel(net, sk), fl6));
 
 	hdr->payload_len = htons(seg_len);
 	hdr->nexthdr = proto;
@@ -1938,7 +1937,6 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 	struct sk_buff *skb, *tmp_skb;
 	struct sk_buff **tail_skb;
 	struct in6_addr *final_dst;
-	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct net *net = sock_net(sk);
 	struct ipv6hdr *hdr;
 	struct ipv6_txoptions *opt = v6_cork->opt;
@@ -1981,7 +1979,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 
 	ip6_flow_hdr(hdr, v6_cork->tclass,
 		     ip6_make_flowlabel(net, skb, fl6->flowlabel,
-					ip6_autoflowlabel(net, np), fl6));
+					ip6_autoflowlabel(net, sk), fl6));
 	hdr->hop_limit = v6_cork->hop_limit;
 	hdr->nexthdr = proto;
 	hdr->saddr = fl6->saddr;
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 7a181831f226c67813446145f8f58fa58908e3ae..d5d428a695f728d96a7d075d86f806cc3f926e0a 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -474,6 +474,10 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			return -EINVAL;
 		inet6_assign_bit(MC6_ALL, sk, valbool);
 		return 0;
+	case IPV6_AUTOFLOWLABEL:
+		inet6_assign_bit(AUTOFLOWLABEL, sk, valbool);
+		inet6_set_bit(AUTOFLOWLABEL_SET, sk);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -970,11 +974,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		np->dontfrag = valbool;
 		retv = 0;
 		break;
-	case IPV6_AUTOFLOWLABEL:
-		np->autoflowlabel = valbool;
-		np->autoflowlabel_set = 1;
-		retv = 0;
-		break;
 	case IPV6_RECVFRAGSIZE:
 		np->rxopt.bits.recvfragsize = valbool;
 		retv = 0;
@@ -1447,7 +1446,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_AUTOFLOWLABEL:
-		val = ip6_autoflowlabel(sock_net(sk), np);
+		val = ip6_autoflowlabel(sock_net(sk), sk);
 		break;
 
 	case IPV6_RECVFRAGSIZE:
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 09/14] ipv6: lockless IPV6_DONTFRAG implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (7 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 08/14] ipv6: lockless IPV6_AUTOFLOWLABEL implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:05   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 10/14] ipv6: lockless IPV6_RECVERR implementation Eric Dumazet
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Move np->dontfrag flag to inet->inet_flags to fix data-races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
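Not part of the patch: a userspace sketch of how the senders touched
below resolve dontfrag. A negative value in the per-message cookie is
treated as "not specified", and only then is the socket-wide flag
consulted. struct model_cookie, the model_* functions and FLAG_DONTFRAG
are hypothetical stand-ins for struct ipcm6_cookie and the kernel
helpers, and the -1 initial value is an assumption of this model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { FLAG_DONTFRAG = 25 };	/* hypothetical stand-in bit number */

static_assert(FLAG_DONTFRAG < 32, "flag must fit in unsigned long");

/* Reduced cookie: only the field this sketch cares about. */
struct model_cookie {
	int dontfrag;		/* <0: unset, 0: allow frag, 1: don't frag */
};

/* Start every message with "no per-message choice made". */
static void model_cookie_init(struct model_cookie *c)
{
	c->dontfrag = -1;
}

/* Fall back to the socket flag only if the message did not choose. */
static void model_resolve_dontfrag(struct model_cookie *c,
				   atomic_ulong *flags)
{
	if (c->dontfrag < 0)
		c->dontfrag = (atomic_load(flags) >> FLAG_DONTFRAG) & 1UL;
}
```

A message that set dontfrag to 0 itself keeps that choice even when the
socket flag is 1; the flag read happens once, at resolution time.
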
 include/linux/ipv6.h     | 1 -
 include/net/inet_sock.h  | 1 +
 include/net/ipv6.h       | 6 +++---
 include/net/xfrm.h       | 2 +-
 net/ipv6/icmp.c          | 4 ++--
 net/ipv6/ip6_output.c    | 2 +-
 net/ipv6/ipv6_sockglue.c | 9 ++++-----
 net/ipv6/ping.c          | 2 +-
 net/ipv6/raw.c           | 2 +-
 net/ipv6/udp.c           | 2 +-
 net/l2tp/l2tp_ip6.c      | 2 +-
 11 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index e3be5dc21b7d27080b398f1425bf11145896a4f3..57d563f1d4b1707264f0d79406c4c139cc0fa525 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -252,7 +252,6 @@ struct ipv6_pinfo {
 						 * 010: prefer public address
 						 * 100: prefer care-of address
 						 */
-				dontfrag:1,
 				rtalert_isolate:1;
 	__u8			min_hopcount;
 	__u8			tclass;
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index f1af64a4067310258a3bc45b84ad3fd093bddbab..ac75324e9e1eafe68cee7b0581e472cbb4f49aa3 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -273,6 +273,7 @@ enum {
 	INET_FLAGS_MC6_ALL	= 22,
 	INET_FLAGS_AUTOFLOWLABEL_SET = 23,
 	INET_FLAGS_AUTOFLOWLABEL = 24,
+	INET_FLAGS_DONTFRAG	= 25,
 };
 
 /* cmsg flags for inet */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index fe1978a288630a20ba03dc3a36e22938495082e4..d2cf7e176f2b97dac957e65b75d5e69a39c546b5 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -373,12 +373,12 @@ static inline void ipcm6_init(struct ipcm6_cookie *ipc6)
 }
 
 static inline void ipcm6_init_sk(struct ipcm6_cookie *ipc6,
-				 const struct ipv6_pinfo *np)
+				 const struct sock *sk)
 {
 	*ipc6 = (struct ipcm6_cookie) {
 		.hlimit = -1,
-		.tclass = np->tclass,
-		.dontfrag = np->dontfrag,
+		.tclass = inet6_sk(sk)->tclass,
+		.dontfrag = inet6_test_bit(DONTFRAG, sk),
 	};
 }
 
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 363c7d5105542ec7f43f91e5071b877314584bc5..98d7aa78addaab129f7ce060b10b7652fd0acba1 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -2166,7 +2166,7 @@ static inline bool xfrm6_local_dontfrag(const struct sock *sk)
 
 	proto = sk->sk_protocol;
 	if (proto == IPPROTO_UDP || proto == IPPROTO_RAW)
-		return inet6_sk(sk)->dontfrag;
+		return inet6_test_bit(DONTFRAG, sk);
 
 	return false;
 }
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 93a594a901d12befb754e7035f56726273eead92..8fb4a791881a48d5efcebc990c8829d8f77fe94f 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -588,7 +588,7 @@ void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info,
 	else if (!fl6.flowi6_oif)
 		fl6.flowi6_oif = np->ucast_oif;
 
-	ipcm6_init_sk(&ipc6, np);
+	ipcm6_init_sk(&ipc6, sk);
 	ipc6.sockc.mark = mark;
 	fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
 
@@ -791,7 +791,7 @@ static enum skb_drop_reason icmpv6_echo_reply(struct sk_buff *skb)
 	msg.offset = 0;
 	msg.type = type;
 
-	ipcm6_init_sk(&ipc6, np);
+	ipcm6_init_sk(&ipc6, sk);
 	ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
 	ipc6.tclass = ipv6_get_dsfield(ipv6_hdr(skb));
 	ipc6.sockc.mark = mark;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 47aa42f93ccda8b49ed6ecd7a7a07703ae147928..8851fe5d45a0781c8b78c995c2c4c6c81e10cd52 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -2092,7 +2092,7 @@ struct sk_buff *ip6_make_skb(struct sock *sk,
 		return ERR_PTR(err);
 	}
 	if (ipc6->dontfrag < 0)
-		ipc6->dontfrag = inet6_sk(sk)->dontfrag;
+		ipc6->dontfrag = inet6_test_bit(DONTFRAG, sk);
 
 	err = __ip6_append_data(sk, &queue, cork, &v6_cork,
 				&current->task_frag, getfrag, from,
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index d5d428a695f728d96a7d075d86f806cc3f926e0a..33dd4dd872e6bca2ee18a634283640007adcc692 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -478,6 +478,9 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		inet6_assign_bit(AUTOFLOWLABEL, sk, valbool);
 		inet6_set_bit(AUTOFLOWLABEL_SET, sk);
 		return 0;
+	case IPV6_DONTFRAG:
+		inet6_assign_bit(DONTFRAG, sk, valbool);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -970,10 +973,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			goto e_inval;
 		retv = __ip6_sock_set_addr_preferences(sk, val);
 		break;
-	case IPV6_DONTFRAG:
-		np->dontfrag = valbool;
-		retv = 0;
-		break;
 	case IPV6_RECVFRAGSIZE:
 		np->rxopt.bits.recvfragsize = valbool;
 		retv = 0;
@@ -1442,7 +1441,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_DONTFRAG:
-		val = np->dontfrag;
+		val = inet6_test_bit(DONTFRAG, sk);
 		break;
 
 	case IPV6_AUTOFLOWLABEL:
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index 5831aaa53d75eae7b764d54ab52da65db4030d73..4444b61eb23bbf483068d2b119a7559e49ba3880 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -118,7 +118,7 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	     l3mdev_master_ifindex_by_index(sock_net(sk), oif) != sk->sk_bound_dev_if))
 		return -EINVAL;
 
-	ipcm6_init_sk(&ipc6, np);
+	ipcm6_init_sk(&ipc6, sk);
 	ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags);
 	ipc6.sockc.mark = READ_ONCE(sk->sk_mark);
 
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 42fcec3ecf5e171a5ebe724b8c971d90885abe41..cc9673c1809fb238f6d9ab6915116cf0dd6eb593 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -898,7 +898,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
 
 	if (ipc6.dontfrag < 0)
-		ipc6.dontfrag = np->dontfrag;
+		ipc6.dontfrag = inet6_test_bit(DONTFRAG, sk);
 
 	if (msg->msg_flags&MSG_CONFIRM)
 		goto do_confirm;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 86b5d509a4688cacb2f40667c9ddc10f81ade2fe..d904c5450a07bf1df10d94ee6bb9b2a8fb9381b5 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1593,7 +1593,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 do_append_data:
 	if (ipc6.dontfrag < 0)
-		ipc6.dontfrag = np->dontfrag;
+		ipc6.dontfrag = inet6_test_bit(DONTFRAG, sk);
 	up->len += ulen;
 	err = ip6_append_data(sk, getfrag, msg, ulen, sizeof(struct udphdr),
 			      &ipc6, fl6, (struct rt6_info *)dst,
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index ed8ebb6f59097ac18bb284d1c48f9e801e9a92c2..40af2431e73aad74ab64e97db8a5ee79dda0879d 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -621,7 +621,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
 
 	if (ipc6.dontfrag < 0)
-		ipc6.dontfrag = np->dontfrag;
+		ipc6.dontfrag = inet6_test_bit(DONTFRAG, sk);
 
 	if (msg->msg_flags & MSG_CONFIRM)
 		goto do_confirm;
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 10/14] ipv6: lockless IPV6_RECVERR implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (8 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 09/14] ipv6: lockless IPV6_DONTFRAG implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:06   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 11/14] ipv6: move np->repflow to atomic flags Eric Dumazet
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

np->recverr is moved to inet->inet_flags to fix data-races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
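Not part of the patch: a userspace sketch of the snapshot pattern used
in the rawv6_err() hunk below, where the flag is read once into a local
so that every later decision in the function sees the same value, even
if setsockopt() flips the bit concurrently. The model_* names, the ACT_*
values and the reduced control flow are hypothetical; the packet-too-big
and redirect branches of the real function are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { FLAG_RECVERR6 = 26 };	/* hypothetical stand-in bit number */

static_assert(FLAG_RECVERR6 < 32, "flag must fit in unsigned long");

/* Actions the error handler may take, as a bitmask. */
enum {
	ACT_NONE	= 0,
	ACT_QUEUE_ERR	= 1,	/* queue an extended error for the user */
	ACT_REPORT	= 2,	/* set sk_err and wake the socket */
};

static unsigned int model_raw_err(atomic_ulong *flags, bool connected,
				  bool harderr)
{
	/* Single lockless read; all tests below use this snapshot. */
	bool recverr = (atomic_load(flags) >> FLAG_RECVERR6) & 1UL;
	unsigned int actions = ACT_NONE;

	/* Unconnected socket without RECVERR: nothing to report. */
	if (!recverr && !connected)
		return ACT_NONE;

	if (recverr)
		actions |= ACT_QUEUE_ERR;
	if (recverr || harderr)
		actions |= ACT_REPORT;
	return actions;
}
```

Re-reading the bit at each test would allow a run where the early
return is skipped because the bit was set, yet the later tests see it
cleared; the local snapshot rules that out.
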
 include/linux/ipv6.h     |  3 +--
 include/net/inet_sock.h  |  1 +
 include/net/ipv6.h       |  4 +---
 net/dccp/ipv6.c          |  2 +-
 net/ipv4/ping.c          |  2 +-
 net/ipv6/datagram.c      |  6 ++----
 net/ipv6/ipv6_sockglue.c | 17 ++++++++---------
 net/ipv6/raw.c           | 10 +++++-----
 net/ipv6/tcp_ipv6.c      |  2 +-
 net/ipv6/udp.c           |  6 +++---
 net/sctp/ipv6.c          |  4 +---
 11 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 57d563f1d4b1707264f0d79406c4c139cc0fa525..53f4f1b97a787ac01fc274a8057494a28fa270fd 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -243,8 +243,7 @@ struct ipv6_pinfo {
 	} rxopt;
 
 	/* sockopt flags */
-	__u16			recverr:1,
-	                        sndflow:1,
+	__u16			sndflow:1,
 				repflow:1,
 				pmtudisc:3,
 				padding:1,	/* 1 bit hole */
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index ac75324e9e1eafe68cee7b0581e472cbb4f49aa3..3b79bc759ff478f96d729f2669c6963bbe768ba1 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -274,6 +274,7 @@ enum {
 	INET_FLAGS_AUTOFLOWLABEL_SET = 23,
 	INET_FLAGS_AUTOFLOWLABEL = 24,
 	INET_FLAGS_DONTFRAG	= 25,
+	INET_FLAGS_RECVERR6	= 26,
 };
 
 /* cmsg flags for inet */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index d2cf7e176f2b97dac957e65b75d5e69a39c546b5..51c94fddd8039f980eb5a14441936623fd9b7a5d 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1298,9 +1298,7 @@ static inline int ip6_sock_set_v6only(struct sock *sk)
 
 static inline void ip6_sock_set_recverr(struct sock *sk)
 {
-	lock_sock(sk);
-	inet6_sk(sk)->recverr = true;
-	release_sock(sk);
+	inet6_set_bit(RECVERR6, sk);
 }
 
 static inline int __ip6_sock_set_addr_preferences(struct sock *sk, int val)
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 83617a16b98e70aa577c08a394df63e006e53e9e..e6c3d84c2b9ec2df9b89ab0879991b3b312d0b6f 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -185,7 +185,7 @@ static int dccp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		goto out;
 	}
 
-	if (!sock_owned_by_user(sk) && np->recverr) {
+	if (!sock_owned_by_user(sk) && inet6_test_bit(RECVERR6, sk)) {
 		sk->sk_err = err;
 		sk_error_report(sk);
 	} else {
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 75e0aee35eb787a6c9f70394294b30490c980a64..bc01ad5fc01ab97f71f7704a671eaf644ec040be 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -581,7 +581,7 @@ void ping_err(struct sk_buff *skb, int offset, u32 info)
 	 *	4.1.3.3.
 	 */
 	if ((family == AF_INET && !inet_test_bit(RECVERR, sk)) ||
-	    (family == AF_INET6 && !inet6_sk(sk)->recverr)) {
+	    (family == AF_INET6 && !inet6_test_bit(RECVERR6, sk))) {
 		if (!harderr || sk->sk_state != TCP_ESTABLISHED)
 			goto out;
 	} else {
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index e81892814935fb3934fbf0e6f9defc702ec29152..74673a5eff319f23871e64584a33f5299fa7b521 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -305,11 +305,10 @@ static void ipv6_icmp_error_rfc4884(const struct sk_buff *skb,
 void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err,
 		     __be16 port, u32 info, u8 *payload)
 {
-	struct ipv6_pinfo *np  = inet6_sk(sk);
 	struct icmp6hdr *icmph = icmp6_hdr(skb);
 	struct sock_exterr_skb *serr;
 
-	if (!np->recverr)
+	if (!inet6_test_bit(RECVERR6, sk))
 		return;
 
 	skb = skb_clone(skb, GFP_ATOMIC);
@@ -344,12 +343,11 @@ EXPORT_SYMBOL_GPL(ipv6_icmp_error);
 
 void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info)
 {
-	const struct ipv6_pinfo *np = inet6_sk(sk);
 	struct sock_exterr_skb *serr;
 	struct ipv6hdr *iph;
 	struct sk_buff *skb;
 
-	if (!np->recverr)
+	if (!inet6_test_bit(RECVERR6, sk))
 		return;
 
 	skb = alloc_skb(sizeof(struct ipv6hdr), GFP_ATOMIC);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 33dd4dd872e6bca2ee18a634283640007adcc692..ec10b45c49c15f9655466a529046f741f8b9fc69 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -481,6 +481,13 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 	case IPV6_DONTFRAG:
 		inet6_assign_bit(DONTFRAG, sk, valbool);
 		return 0;
+	case IPV6_RECVERR:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		inet6_assign_bit(RECVERR6, sk, valbool);
+		if (!val)
+			skb_errqueue_purge(&sk->sk_error_queue);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -943,14 +950,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		np->pmtudisc = val;
 		retv = 0;
 		break;
-	case IPV6_RECVERR:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		np->recverr = valbool;
-		if (!val)
-			skb_errqueue_purge(&sk->sk_error_queue);
-		retv = 0;
-		break;
 	case IPV6_FLOWINFO_SEND:
 		if (optlen < sizeof(int))
 			goto e_inval;
@@ -1380,7 +1379,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_RECVERR:
-		val = np->recverr;
+		val = inet6_test_bit(RECVERR6, sk);
 		break;
 
 	case IPV6_FLOWINFO_SEND:
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index cc9673c1809fb238f6d9ab6915116cf0dd6eb593..71f6bdccfa1f39290e1b573ff8c647d91fd007a4 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -291,6 +291,7 @@ static void rawv6_err(struct sock *sk, struct sk_buff *skb,
 	       struct inet6_skb_parm *opt,
 	       u8 type, u8 code, int offset, __be32 info)
 {
+	bool recverr = inet6_test_bit(RECVERR6, sk);
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	int err;
 	int harderr;
@@ -300,7 +301,7 @@ static void rawv6_err(struct sock *sk, struct sk_buff *skb,
 	   2. Socket is connected (otherwise the error indication
 	      is useless without recverr and error is hard.
 	 */
-	if (!np->recverr && sk->sk_state != TCP_ESTABLISHED)
+	if (!recverr && sk->sk_state != TCP_ESTABLISHED)
 		return;
 
 	harderr = icmpv6_err_convert(type, code, &err);
@@ -312,14 +313,14 @@ static void rawv6_err(struct sock *sk, struct sk_buff *skb,
 		ip6_sk_redirect(skb, sk);
 		return;
 	}
-	if (np->recverr) {
+	if (recverr) {
 		u8 *payload = skb->data;
 		if (!inet_test_bit(HDRINCL, sk))
 			payload += offset;
 		ipv6_icmp_error(sk, skb, err, 0, ntohl(info), payload);
 	}
 
-	if (np->recverr || harderr) {
+	if (recverr || harderr) {
 		sk->sk_err = err;
 		sk_error_report(sk);
 	}
@@ -587,7 +588,6 @@ static int rawv6_send_hdrinc(struct sock *sk, struct msghdr *msg, int length,
 			struct flowi6 *fl6, struct dst_entry **dstp,
 			unsigned int flags, const struct sockcm_cookie *sockc)
 {
-	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct net *net = sock_net(sk);
 	struct ipv6hdr *iph;
 	struct sk_buff *skb;
@@ -668,7 +668,7 @@ static int rawv6_send_hdrinc(struct sock *sk, struct msghdr *msg, int length,
 error:
 	IP6_INC_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUTDISCARDS);
 error_check:
-	if (err == -ENOBUFS && !np->recverr)
+	if (err == -ENOBUFS && !inet6_test_bit(RECVERR6, sk))
 		err = 0;
 	return err;
 }
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 54db5fab318bc68cf9efbe6f26dacba614fa8562..b5954b136b57306429690594238f7a01b0cf15de 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -508,7 +508,7 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 			tcp_ld_RTO_revert(sk, seq);
 	}
 
-	if (!sock_owned_by_user(sk) && np->recverr) {
+	if (!sock_owned_by_user(sk) && inet6_test_bit(RECVERR6, sk)) {
 		WRITE_ONCE(sk->sk_err, err);
 		sk_error_report(sk);
 	} else {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d904c5450a07bf1df10d94ee6bb9b2a8fb9381b5..65f6217d36cb7c862f1511a058a7a5973c40cef8 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -619,7 +619,7 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		goto out;
 	}
 
-	if (!np->recverr) {
+	if (!inet6_test_bit(RECVERR6, sk)) {
 		if (!harderr || sk->sk_state != TCP_ESTABLISHED)
 			goto out;
 	} else {
@@ -1281,7 +1281,7 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
 send:
 	err = ip6_send_skb(skb);
 	if (err) {
-		if (err == -ENOBUFS && !inet6_sk(sk)->recverr) {
+		if (err == -ENOBUFS && !inet6_test_bit(RECVERR6, sk)) {
 			UDP6_INC_STATS(sock_net(sk),
 				       UDP_MIB_SNDBUFERRORS, is_udplite);
 			err = 0;
@@ -1606,7 +1606,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		up->pending = 0;
 
 	if (err > 0)
-		err = np->recverr ? net_xmit_errno(err) : 0;
+		err = inet6_test_bit(RECVERR6, sk) ? net_xmit_errno(err) : 0;
 	release_sock(sk);
 
 out:
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index 43f2731bf590e5757b7ad2d3a92a12e4098e0d47..42b5b853ea01c767e1fe878772eeabe5c05adb6d 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -128,7 +128,6 @@ static void sctp_v6_err_handle(struct sctp_transport *t, struct sk_buff *skb,
 {
 	struct sctp_association *asoc = t->asoc;
 	struct sock *sk = asoc->base.sk;
-	struct ipv6_pinfo *np;
 	int err = 0;
 
 	switch (type) {
@@ -149,9 +148,8 @@ static void sctp_v6_err_handle(struct sctp_transport *t, struct sk_buff *skb,
 		break;
 	}
 
-	np = inet6_sk(sk);
 	icmpv6_err_convert(type, code, &err);
-	if (!sock_owned_by_user(sk) && np->recverr) {
+	if (!sock_owned_by_user(sk) && inet6_test_bit(RECVERR6, sk)) {
 		sk->sk_err = err;
 		sk_error_report(sk);
 	} else {
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH net-next 11/14] ipv6: move np->repflow to atomic flags
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (9 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 10/14] ipv6: lockless IPV6_RECVERR implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:07   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 12/14] ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation Eric Dumazet
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Move np->repflow to inet->inet_flags to fix data-races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
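Not part of the patch: a userspace model of the flow label reflection
toggle as it appears in the ip6_flowlabel.c hunks below, assuming C11
atomics in place of inet6_{set|clear|test}_bit(). struct model_sock,
the model_* functions and the is_tcp parameter are hypothetical, and
any checks that the real get/put paths perform outside the shown hunks
are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { FLAG_REPFLOW = 27 };	/* hypothetical stand-in bit number */

static_assert(FLAG_REPFLOW < 32, "flag must fit in unsigned long");

struct model_sock {
	atomic_ulong flags;		/* plays the role of inet_flags */
	unsigned int flow_label;	/* plays the role of np->flow_label */
};

/* Enable reflection: only meaningful for TCP sockets. */
static int model_reflect_get(struct model_sock *sk, bool is_tcp)
{
	if (!is_tcp)
		return -ENOPROTOOPT;
	atomic_fetch_or(&sk->flags, 1UL << FLAG_REPFLOW);
	return 0;
}

/* Disable reflection: fails with -ESRCH if it was not enabled. */
static int model_reflect_put(struct model_sock *sk, bool is_tcp)
{
	if (!is_tcp)
		return -ENOPROTOOPT;
	if (!((atomic_load(&sk->flags) >> FLAG_REPFLOW) & 1UL))
		return -ESRCH;
	sk->flow_label = 0;
	atomic_fetch_and(&sk->flags, ~(1UL << FLAG_REPFLOW));
	return 0;
}
```

As in the hunk, the test and the clear in the put path are two separate
atomic operations; each one is safe against lockless readers, but the
pair is not a single atomic step.
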
 include/linux/ipv6.h     |  1 -
 include/net/inet_sock.h  |  1 +
 net/dccp/ipv6.c          |  2 +-
 net/ipv6/af_inet6.c      |  3 ++-
 net/ipv6/ip6_flowlabel.c |  8 ++++----
 net/ipv6/tcp_ipv6.c      | 14 ++++++--------
 6 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 53f4f1b97a787ac01fc274a8057494a28fa270fd..e62413371ea40cbd9f13aa6ac6b6be41a6831237 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -244,7 +244,6 @@ struct ipv6_pinfo {
 
 	/* sockopt flags */
 	__u16			sndflow:1,
-				repflow:1,
 				pmtudisc:3,
 				padding:1,	/* 1 bit hole */
 				srcprefs:3,	/* 001: prefer temporary address
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 3b79bc759ff478f96d729f2669c6963bbe768ba1..5d61c7dc6577827740254f0e9aa288065f1bda7f 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -275,6 +275,7 @@ enum {
 	INET_FLAGS_AUTOFLOWLABEL = 24,
 	INET_FLAGS_DONTFRAG	= 25,
 	INET_FLAGS_RECVERR6	= 26,
+	INET_FLAGS_REPFLOW	= 27,
 };
 
 /* cmsg flags for inet */
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index e6c3d84c2b9ec2df9b89ab0879991b3b312d0b6f..d7e63eea705dfe5c40d374301f93987e1c34748b 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -679,7 +679,7 @@ static int dccp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 			WRITE_ONCE(np->mcast_hops, ipv6_hdr(opt_skb)->hop_limit);
 		if (np->rxopt.bits.rxflow || np->rxopt.bits.rxtclass)
 			np->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(opt_skb));
-		if (np->repflow)
+		if (inet6_test_bit(REPFLOW, sk))
 			np->flow_label = ip6_flowlabel(ipv6_hdr(opt_skb));
 		if (ipv6_opt_accepted(sk, opt_skb,
 				      &DCCP_SKB_CB(opt_skb)->header.h6)) {
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 372fb7b9112c8dfed09b6ddfdb37016a1a668494..48737363377fef32f471075fd3f000bc742fd4e4 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -220,7 +220,8 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
 	inet6_set_bit(MC6_LOOP, sk);
 	inet6_set_bit(MC6_ALL, sk);
 	np->pmtudisc	= IPV6_PMTUDISC_WANT;
-	np->repflow	= net->ipv6.sysctl.flowlabel_reflect & FLOWLABEL_REFLECT_ESTABLISHED;
+	inet6_assign_bit(REPFLOW, sk, net->ipv6.sysctl.flowlabel_reflect &
+				     FLOWLABEL_REFLECT_ESTABLISHED);
 	sk->sk_ipv6only	= net->ipv6.sysctl.bindv6only;
 	sk->sk_txrehash = READ_ONCE(net->core.sysctl_txrehash);
 
diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c
index b3ca4beb4405aa9dc4ce610abda9a46ac3ceb5fb..eca07e10e21fcf11b3a8ebe6353f38789b87bdaf 100644
--- a/net/ipv6/ip6_flowlabel.c
+++ b/net/ipv6/ip6_flowlabel.c
@@ -513,7 +513,7 @@ int ipv6_flowlabel_opt_get(struct sock *sk, struct in6_flowlabel_req *freq,
 		return 0;
 	}
 
-	if (np->repflow) {
+	if (inet6_test_bit(REPFLOW, sk)) {
 		freq->flr_label = np->flow_label;
 		return 0;
 	}
@@ -551,10 +551,10 @@ static int ipv6_flowlabel_put(struct sock *sk, struct in6_flowlabel_req *freq)
 	if (freq->flr_flags & IPV6_FL_F_REFLECT) {
 		if (sk->sk_protocol != IPPROTO_TCP)
 			return -ENOPROTOOPT;
-		if (!np->repflow)
+		if (!inet6_test_bit(REPFLOW, sk))
 			return -ESRCH;
 		np->flow_label = 0;
-		np->repflow = 0;
+		inet6_clear_bit(REPFLOW, sk);
 		return 0;
 	}
 
@@ -626,7 +626,7 @@ static int ipv6_flowlabel_get(struct sock *sk, struct in6_flowlabel_req *freq,
 
 		if (sk->sk_protocol != IPPROTO_TCP)
 			return -ENOPROTOOPT;
-		np->repflow = 1;
+		inet6_set_bit(REPFLOW, sk);
 		return 0;
 	}
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index b5954b136b57306429690594238f7a01b0cf15de..201caf88bb99e4ff87048fab3d89b6ea22269df3 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -548,7 +548,7 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst,
 				    &ireq->ir_v6_rmt_addr);
 
 		fl6->daddr = ireq->ir_v6_rmt_addr;
-		if (np->repflow && ireq->pktopts)
+		if (inet6_test_bit(REPFLOW, sk) && ireq->pktopts)
 			fl6->flowlabel = ip6_flowlabel(ipv6_hdr(ireq->pktopts));
 
 		tclass = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reflect_tos) ?
@@ -797,7 +797,7 @@ static void tcp_v6_init_req(struct request_sock *req,
 	    (ipv6_opt_accepted(sk_listener, skb, &TCP_SKB_CB(skb)->header.h6) ||
 	     np->rxopt.bits.rxinfo ||
 	     np->rxopt.bits.rxoinfo || np->rxopt.bits.rxhlim ||
-	     np->rxopt.bits.rxohlim || np->repflow)) {
+	     np->rxopt.bits.rxohlim || inet6_test_bit(REPFLOW, sk_listener))) {
 		refcount_inc(&skb->users);
 		ireq->pktopts = skb;
 	}
@@ -1055,10 +1055,8 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb)
 	if (sk) {
 		oif = sk->sk_bound_dev_if;
 		if (sk_fullsock(sk)) {
-			const struct ipv6_pinfo *np = tcp_inet6_sk(sk);
-
 			trace_tcp_send_reset(sk, skb);
-			if (np->repflow)
+			if (inet6_test_bit(REPFLOW, sk))
 				label = ip6_flowlabel(ipv6h);
 			priority = sk->sk_priority;
 			txhash = sk->sk_txhash;
@@ -1247,7 +1245,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 		newnp->mcast_oif   = inet_iif(skb);
 		newnp->mcast_hops  = ip_hdr(skb)->ttl;
 		newnp->rcv_flowinfo = 0;
-		if (np->repflow)
+		if (inet6_test_bit(REPFLOW, sk))
 			newnp->flow_label = 0;
 
 		/*
@@ -1320,7 +1318,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 	newnp->mcast_oif  = tcp_v6_iif(skb);
 	newnp->mcast_hops = ipv6_hdr(skb)->hop_limit;
 	newnp->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(skb));
-	if (np->repflow)
+	if (inet6_test_bit(REPFLOW, sk))
 		newnp->flow_label = ip6_flowlabel(ipv6_hdr(skb));
 
 	/* Set ToS of the new socket based upon the value of incoming SYN.
@@ -1546,7 +1544,7 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 				   ipv6_hdr(opt_skb)->hop_limit);
 		if (np->rxopt.bits.rxflow || np->rxopt.bits.rxtclass)
 			np->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(opt_skb));
-		if (np->repflow)
+		if (inet6_test_bit(REPFLOW, sk))
 			np->flow_label = ip6_flowlabel(ipv6_hdr(opt_skb));
 		if (ipv6_opt_accepted(sk, opt_skb, &TCP_SKB_CB(opt_skb)->header.h6)) {
 			tcp_v6_restore_cb(opt_skb);
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 12/14] ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (10 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 11/14] ipv6: move np->repflow to atomic flags Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:08   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 13/14] ipv6: lockless IPV6_MTU_DISCOVER implementation Eric Dumazet
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Reads from np->rtalert_isolate are racy.

Move this flag to inet->inet_flags to fix data-races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/ipv6.h     |  3 +--
 include/net/inet_sock.h  |  1 +
 net/ipv6/ip6_output.c    |  3 +--
 net/ipv6/ipv6_sockglue.c | 13 ++++++-------
 4 files changed, 9 insertions(+), 11 deletions(-)
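
Note (illustration only, not part of the patch): a minimal userspace
sketch of the idea behind moving a 1-bit field into an atomic flags word.
It assumes C11 <stdatomic.h>; the demo_* names are hypothetical stand-ins
and are not the kernel's inet6_assign_bit()/inet6_test_bit() helpers.
Only the bit numbers are taken from the enum shown in the diff below.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the bit numbers mirror the patch's enum. */
enum {
	DEMO_FLAGS_REPFLOW         = 27,
	DEMO_FLAGS_RTALERT_ISOLATE = 28,
};

struct demo_sock {
	atomic_ulong flags;	/* plays the role of inet->inet_flags */
};

void demo_sock_init(struct demo_sock *sk)
{
	atomic_init(&sk->flags, 0);
}

/* Set or clear one bit with a single atomic read-modify-write, so a
 * concurrent update of a neighbouring bit cannot be lost.
 */
void demo_assign_bit(struct demo_sock *sk, int nr, bool val)
{
	unsigned long mask = 1UL << nr;

	if (val)
		atomic_fetch_or(&sk->flags, mask);
	else
		atomic_fetch_and(&sk->flags, ~mask);
}

/* Lockless reader: one atomic load, then extract the bit. */
bool demo_test_bit(struct demo_sock *sk, int nr)
{
	return (atomic_load(&sk->flags) >> nr) & 1UL;
}
```

With a plain C bit-field, a writer has to rewrite the whole storage unit,
which is why neighbouring bit-fields normally need a common lock. The
atomic OR/AND above touches only the requested bit.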

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index e62413371ea40cbd9f13aa6ac6b6be41a6831237..f288a35f157f73ded445639c30f3365047fd9ddc 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -246,11 +246,10 @@ struct ipv6_pinfo {
 	__u16			sndflow:1,
 				pmtudisc:3,
 				padding:1,	/* 1 bit hole */
-				srcprefs:3,	/* 001: prefer temporary address
+				srcprefs:3;	/* 001: prefer temporary address
 						 * 010: prefer public address
 						 * 100: prefer care-of address
 						 */
-				rtalert_isolate:1;
 	__u8			min_hopcount;
 	__u8			tclass;
 	__be32			rcv_flowinfo;
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 5d61c7dc6577827740254f0e9aa288065f1bda7f..befee0f66c0555f3ac4524fd8f7780ff21c04aaa 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -276,6 +276,7 @@ enum {
 	INET_FLAGS_DONTFRAG	= 25,
 	INET_FLAGS_RECVERR6	= 26,
 	INET_FLAGS_REPFLOW	= 27,
+	INET_FLAGS_RTALERT_ISOLATE = 28,
 };
 
 /* cmsg flags for inet */
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 8851fe5d45a0781c8b78c995c2c4c6c81e10cd52..f87d8491d7e273f167b7b144a7e134783e1b80f6 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -368,9 +368,8 @@ static int ip6_call_ra_chain(struct sk_buff *skb, int sel)
 		if (sk && ra->sel == sel &&
 		    (!sk->sk_bound_dev_if ||
 		     sk->sk_bound_dev_if == skb->dev->ifindex)) {
-			struct ipv6_pinfo *np = inet6_sk(sk);
 
-			if (np && np->rtalert_isolate &&
+			if (inet6_test_bit(RTALERT_ISOLATE, sk) &&
 			    !net_eq(sock_net(sk), dev_net(skb->dev))) {
 				continue;
 			}
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index ec10b45c49c15f9655466a529046f741f8b9fc69..c22a492e05360b68ef6868707e363f2ce84a4c35 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -488,6 +488,11 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		if (!val)
 			skb_errqueue_purge(&sk->sk_error_queue);
 		return 0;
+	case IPV6_ROUTER_ALERT_ISOLATE:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		inet6_assign_bit(RTALERT_ISOLATE, sk, valbool);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -936,12 +941,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			goto e_inval;
 		retv = ip6_ra_control(sk, val);
 		break;
-	case IPV6_ROUTER_ALERT_ISOLATE:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		np->rtalert_isolate = valbool;
-		retv = 0;
-		break;
 	case IPV6_MTU_DISCOVER:
 		if (optlen < sizeof(int))
 			goto e_inval;
@@ -1452,7 +1451,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_ROUTER_ALERT_ISOLATE:
-		val = np->rtalert_isolate;
+		val = inet6_test_bit(RTALERT_ISOLATE, sk);
 		break;
 
 	case IPV6_RECVERR_RFC4884:
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 13/14] ipv6: lockless IPV6_MTU_DISCOVER implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (11 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 12/14] ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:10   ` David Ahern
  2023-09-12 16:02 ` [PATCH net-next 14/14] ipv6: lockless IPV6_FLOWINFO_SEND implementation Eric Dumazet
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

Most np->pmtudisc reads are racy.

Move this 3-bit field to a full byte, add annotations,
and make IPV6_MTU_DISCOVER setsockopt() lockless.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/ipv6.h            |  5 ++---
 include/net/ip6_route.h         | 14 +++++++++-----
 net/ipv6/ip6_output.c           |  4 ++--
 net/ipv6/ipv6_sockglue.c        | 17 ++++++++---------
 net/ipv6/raw.c                  |  2 +-
 net/ipv6/udp.c                  |  2 +-
 net/netfilter/ipvs/ip_vs_sync.c |  2 +-
 7 files changed, 24 insertions(+), 22 deletions(-)
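
Note (illustration only, not part of the patch): a minimal userspace
sketch of the pattern used here. The DEMO_READ_ONCE()/DEMO_WRITE_ONCE()
macros are simplified volatile-cast approximations of the kernel macros
and rely on the GCC/Clang __typeof__ extension. The demo_* names are
hypothetical; the DEMO_PMTUDISC_* values are assumed to match the uapi
IPV6_PMTUDISC_* constants. A C bit-field has no address, so a volatile
access like this only works once the field occupies a full byte.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified approximations of the kernel macros (assumption). */
#define DEMO_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Assumed to match the uapi IPV6_PMTUDISC_* values. */
enum {
	DEMO_PMTUDISC_DONT = 0,
	DEMO_PMTUDISC_WANT,
	DEMO_PMTUDISC_DO,
	DEMO_PMTUDISC_PROBE,
	DEMO_PMTUDISC_INTERFACE,
	DEMO_PMTUDISC_OMIT,
};

struct demo_pinfo {
	uint8_t pmtudisc;	/* full byte, so it can be accessed by address */
};

/* Lockless writer: validate the range, then store the byte once. */
int demo_set_pmtudisc(struct demo_pinfo *np, int val)
{
	if (val < DEMO_PMTUDISC_DONT || val > DEMO_PMTUDISC_OMIT)
		return -EINVAL;
	DEMO_WRITE_ONCE(np->pmtudisc, (uint8_t)val);
	return 0;
}

/* Lockless reader: take one snapshot and test it twice, so both
 * comparisons see the same value even if a writer runs concurrently.
 */
bool demo_accept_pmtu(const struct demo_pinfo *np)
{
	uint8_t pmtudisc = DEMO_READ_ONCE(np->pmtudisc);

	return pmtudisc != DEMO_PMTUDISC_INTERFACE &&
	       pmtudisc != DEMO_PMTUDISC_OMIT;
}
```

Reading the field once into a local variable is the same shape as the
ip6_sk_accept_pmtu() and ip6_sk_ignore_df() changes in the diff below.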

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index f288a35f157f73ded445639c30f3365047fd9ddc..10f521a6a9c8a881b4677d53597929622ae95b67 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -243,13 +243,12 @@ struct ipv6_pinfo {
 	} rxopt;
 
 	/* sockopt flags */
-	__u16			sndflow:1,
-				pmtudisc:3,
-				padding:1,	/* 1 bit hole */
+	__u8			sndflow:1,
 				srcprefs:3;	/* 001: prefer temporary address
 						 * 010: prefer public address
 						 * 100: prefer care-of address
 						 */
+	__u8			pmtudisc;
 	__u8			min_hopcount;
 	__u8			tclass;
 	__be32			rcv_flowinfo;
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index b32539bb0fb05c67b5849bb219be59fabe5bb51c..b1ea49900b4ae17cb3436f884e26f5ae3a7a761c 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -266,7 +266,7 @@ static inline unsigned int ip6_skb_dst_mtu(const struct sk_buff *skb)
 	const struct dst_entry *dst = skb_dst(skb);
 	unsigned int mtu;
 
-	if (np && np->pmtudisc >= IPV6_PMTUDISC_PROBE) {
+	if (np && READ_ONCE(np->pmtudisc) >= IPV6_PMTUDISC_PROBE) {
 		mtu = READ_ONCE(dst->dev->mtu);
 		mtu -= lwtunnel_headroom(dst->lwtstate, mtu);
 	} else {
@@ -277,14 +277,18 @@ static inline unsigned int ip6_skb_dst_mtu(const struct sk_buff *skb)
 
 static inline bool ip6_sk_accept_pmtu(const struct sock *sk)
 {
-	return inet6_sk(sk)->pmtudisc != IPV6_PMTUDISC_INTERFACE &&
-	       inet6_sk(sk)->pmtudisc != IPV6_PMTUDISC_OMIT;
+	u8 pmtudisc = READ_ONCE(inet6_sk(sk)->pmtudisc);
+
+	return pmtudisc != IPV6_PMTUDISC_INTERFACE &&
+	       pmtudisc != IPV6_PMTUDISC_OMIT;
 }
 
 static inline bool ip6_sk_ignore_df(const struct sock *sk)
 {
-	return inet6_sk(sk)->pmtudisc < IPV6_PMTUDISC_DO ||
-	       inet6_sk(sk)->pmtudisc == IPV6_PMTUDISC_OMIT;
+	u8 pmtudisc = READ_ONCE(inet6_sk(sk)->pmtudisc);
+
+	return pmtudisc < IPV6_PMTUDISC_DO ||
+	       pmtudisc == IPV6_PMTUDISC_OMIT;
 }
 
 static inline const struct in6_addr *rt6_nexthop(const struct rt6_info *rt,
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f87d8491d7e273f167b7b144a7e134783e1b80f6..7e5d9eeb990fd4549be753fdaaf1e6c6c21d3f8d 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1436,10 +1436,10 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 	v6_cork->hop_limit = ipc6->hlimit;
 	v6_cork->tclass = ipc6->tclass;
 	if (rt->dst.flags & DST_XFRM_TUNNEL)
-		mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
+		mtu = READ_ONCE(np->pmtudisc) >= IPV6_PMTUDISC_PROBE ?
 		      READ_ONCE(rt->dst.dev->mtu) : dst_mtu(&rt->dst);
 	else
-		mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
+		mtu = READ_ONCE(np->pmtudisc) >= IPV6_PMTUDISC_PROBE ?
 			READ_ONCE(rt->dst.dev->mtu) : dst_mtu(xfrm_dst_path(&rt->dst));
 
 	frag_size = READ_ONCE(np->frag_size);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index c22a492e05360b68ef6868707e363f2ce84a4c35..85ea42644dcbbe3ed8f625e51ffc6d55ada40156 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -493,6 +493,13 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			return -EINVAL;
 		inet6_assign_bit(RTALERT_ISOLATE, sk, valbool);
 		return 0;
+	case IPV6_MTU_DISCOVER:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		if (val < IPV6_PMTUDISC_DONT || val > IPV6_PMTUDISC_OMIT)
+			return -EINVAL;
+		WRITE_ONCE(np->pmtudisc, val);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -941,14 +948,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			goto e_inval;
 		retv = ip6_ra_control(sk, val);
 		break;
-	case IPV6_MTU_DISCOVER:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		if (val < IPV6_PMTUDISC_DONT || val > IPV6_PMTUDISC_OMIT)
-			goto e_inval;
-		np->pmtudisc = val;
-		retv = 0;
-		break;
 	case IPV6_FLOWINFO_SEND:
 		if (optlen < sizeof(int))
 			goto e_inval;
@@ -1374,7 +1373,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_MTU_DISCOVER:
-		val = np->pmtudisc;
+		val = READ_ONCE(np->pmtudisc);
 		break;
 
 	case IPV6_RECVERR:
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 71f6bdccfa1f39290e1b573ff8c647d91fd007a4..47372cceb98f6e606346b74230b03e76e303822c 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -307,7 +307,7 @@ static void rawv6_err(struct sock *sk, struct sk_buff *skb,
 	harderr = icmpv6_err_convert(type, code, &err);
 	if (type == ICMPV6_PKT_TOOBIG) {
 		ip6_sk_update_pmtu(skb, sk, info);
-		harderr = (np->pmtudisc == IPV6_PMTUDISC_DO);
+		harderr = (READ_ONCE(np->pmtudisc) == IPV6_PMTUDISC_DO);
 	}
 	if (type == NDISC_REDIRECT) {
 		ip6_sk_redirect(skb, sk);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 65f6217d36cb7c862f1511a058a7a5973c40cef8..97fabbd7e7aa8bf66bfe21a98f97d4408af13d2b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -598,7 +598,7 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		if (!ip6_sk_accept_pmtu(sk))
 			goto out;
 		ip6_sk_update_pmtu(skb, sk, info);
-		if (np->pmtudisc != IPV6_PMTUDISC_DONT)
+		if (READ_ONCE(np->pmtudisc) != IPV6_PMTUDISC_DONT)
 			harderr = 1;
 	}
 	if (type == NDISC_REDIRECT) {
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index df1b33b61059eef1e86baefc63e138108a50a081..5820a8156c4701bb163f569d735c389d7a8e3820 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1341,7 +1341,7 @@ static void set_mcast_pmtudisc(struct sock *sk, int val)
 		struct ipv6_pinfo *np = inet6_sk(sk);
 
 		/* IPV6_MTU_DISCOVER */
-		np->pmtudisc = val;
+		WRITE_ONCE(np->pmtudisc, val);
 	}
 #endif
 	release_sock(sk);
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next 14/14] ipv6: lockless IPV6_FLOWINFO_SEND implementation
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (12 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 13/14] ipv6: lockless IPV6_MTU_DISCOVER implementation Eric Dumazet
@ 2023-09-12 16:02 ` Eric Dumazet
  2023-09-14 15:11   ` David Ahern
  2023-09-14 10:25 ` [PATCH net-next 00/14] ipv6: round of data-races fixes Simon Horman
  2023-09-15  9:40 ` patchwork-bot+netdevbpf
  15 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2023-09-12 16:02 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: David Ahern, netdev, eric.dumazet, Eric Dumazet

np->sndflow reads are racy.

Use one bit from atomic inet->inet_flags instead, so that
IPV6_FLOWINFO_SEND setsockopt() can be lockless.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/ipv6.h     |  3 +--
 include/net/inet_sock.h  |  1 +
 net/dccp/ipv6.c          |  2 +-
 net/ipv4/ping.c          |  3 +--
 net/ipv6/af_inet6.c      |  2 +-
 net/ipv6/datagram.c      |  7 ++++---
 net/ipv6/ipv6_sockglue.c | 13 ++++++-------
 net/ipv6/ping.c          |  2 +-
 net/ipv6/raw.c           |  2 +-
 net/ipv6/tcp_ipv6.c      |  2 +-
 net/ipv6/udp.c           |  2 +-
 net/l2tp/l2tp_ip6.c      |  4 ++--
 net/sctp/ipv6.c          |  3 ++-
 13 files changed, 23 insertions(+), 23 deletions(-)
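
Note (illustration only, not part of the patch): a minimal userspace
sketch of what the SNDFLOW flag gates in the send paths touched below.
The demo_* names are hypothetical. The two masks are assumed to match
the kernel's IPV6_FLOWINFO_MASK and IPV6_FLOWLABEL_MASK, written here in
host byte order and converted with htonl().

```c
#include <arpa/inet.h>	/* htonl() */
#include <stdbool.h>
#include <stdint.h>

/* Host-order masks, assumed to match the kernel definitions. */
#define DEMO_FLOWINFO_MASK	0x0FFFFFFFu	/* traffic class + flow label */
#define DEMO_FLOWLABEL_MASK	0x000FFFFFu	/* 20-bit flow label only */

/* Flow info (network byte order) the sender would place in the flow key:
 * the caller's sin6_flowinfo is honoured only when the flag is set.
 */
uint32_t demo_pick_flowinfo(bool sndflow, uint32_t sin6_flowinfo_be)
{
	if (!sndflow)
		return 0;
	return sin6_flowinfo_be & htonl(DEMO_FLOWINFO_MASK);
}

/* True when the 20-bit label part is non-zero, which is the case where
 * the kernel code goes on to call fl6_sock_lookup().
 */
bool demo_has_flowlabel(uint32_t flowinfo_be)
{
	return (flowinfo_be & htonl(DEMO_FLOWLABEL_MASK)) != 0;
}
```

In the patch, the boolean passed as sndflow here comes from
inet6_test_bit(SNDFLOW, sk) rather than from np->sndflow.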

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 10f521a6a9c8a881b4677d53597929622ae95b67..09253825c99c7a94c4c8a3f176f0ceecd0b166bc 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -243,8 +243,7 @@ struct ipv6_pinfo {
 	} rxopt;
 
 	/* sockopt flags */
-	__u8			sndflow:1,
-				srcprefs:3;	/* 001: prefer temporary address
+	__u8			srcprefs:3;	/* 001: prefer temporary address
 						 * 010: prefer public address
 						 * 100: prefer care-of address
 						 */
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index befee0f66c0555f3ac4524fd8f7780ff21c04aaa..98e11958cdff688249fddf1893ce06b45ecb68d9 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -277,6 +277,7 @@ enum {
 	INET_FLAGS_RECVERR6	= 26,
 	INET_FLAGS_REPFLOW	= 27,
 	INET_FLAGS_RTALERT_ISOLATE = 28,
+	INET_FLAGS_SNDFLOW	= 29,
 };
 
 /* cmsg flags for inet */
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index d7e63eea705dfe5c40d374301f93987e1c34748b..4803f06148488b07ba027138c93014d2b5fa28db 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -844,7 +844,7 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 	memset(&fl6, 0, sizeof(fl6));
 
-	if (np->sndflow) {
+	if (inet6_test_bit(SNDFLOW, sk)) {
 		fl6.flowlabel = usin->sin6_flowinfo & IPV6_FLOWINFO_MASK;
 		IP6_ECN_flow_init(fl6.flowlabel);
 		if (fl6.flowlabel & IPV6_FLOWLABEL_MASK) {
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index bc01ad5fc01ab97f71f7704a671eaf644ec040be..4dd809b7b18867154df42bc28809b886913e253c 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -899,7 +899,6 @@ int ping_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 
 #if IS_ENABLED(CONFIG_IPV6)
 	} else if (family == AF_INET6) {
-		struct ipv6_pinfo *np = inet6_sk(sk);
 		struct ipv6hdr *ip6 = ipv6_hdr(skb);
 		DECLARE_SOCKADDR(struct sockaddr_in6 *, sin6, msg->msg_name);
 
@@ -908,7 +907,7 @@ int ping_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 			sin6->sin6_port = 0;
 			sin6->sin6_addr = ip6->saddr;
 			sin6->sin6_flowinfo = 0;
-			if (np->sndflow)
+			if (inet6_test_bit(SNDFLOW, sk))
 				sin6->sin6_flowinfo = ip6_flowinfo(ip6);
 			sin6->sin6_scope_id =
 				ipv6_iface_scope_id(&sin6->sin6_addr,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 48737363377fef32f471075fd3f000bc742fd4e4..c6ad0d6e99b5e2259648e260e2cad54f34c90cfd 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -537,7 +537,7 @@ int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
 		}
 		sin->sin6_port = inet->inet_dport;
 		sin->sin6_addr = sk->sk_v6_daddr;
-		if (np->sndflow)
+		if (inet6_test_bit(SNDFLOW, sk))
 			sin->sin6_flowinfo = np->flow_label;
 		BPF_CGROUP_RUN_SA_PROG(sk, (struct sockaddr *)sin,
 				       CGROUP_INET6_GETPEERNAME);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 74673a5eff319f23871e64584a33f5299fa7b521..cc6a502db39d2e446c39656ccc398e6ac20abf6b 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -80,7 +80,8 @@ int ip6_datagram_dst_update(struct sock *sk, bool fix_sk_saddr)
 	struct flowi6 fl6;
 	int err = 0;
 
-	if (np->sndflow && (np->flow_label & IPV6_FLOWLABEL_MASK)) {
+	if (inet6_test_bit(SNDFLOW, sk) &&
+	    (np->flow_label & IPV6_FLOWLABEL_MASK)) {
 		flowlabel = fl6_sock_lookup(sk, np->flow_label);
 		if (IS_ERR(flowlabel))
 			return -EINVAL;
@@ -163,7 +164,7 @@ int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
 	if (usin->sin6_family != AF_INET6)
 		return -EAFNOSUPPORT;
 
-	if (np->sndflow)
+	if (inet6_test_bit(SNDFLOW, sk))
 		fl6_flowlabel = usin->sin6_flowinfo & IPV6_FLOWINFO_MASK;
 
 	if (ipv6_addr_any(&usin->sin6_addr)) {
@@ -491,7 +492,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 			const struct ipv6hdr *ip6h = container_of((struct in6_addr *)(nh + serr->addr_offset),
 								  struct ipv6hdr, daddr);
 			sin->sin6_addr = ip6h->daddr;
-			if (np->sndflow)
+			if (inet6_test_bit(SNDFLOW, sk))
 				sin->sin6_flowinfo = ip6_flowinfo(ip6h);
 			sin->sin6_scope_id =
 				ipv6_iface_scope_id(&sin->sin6_addr,
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 85ea42644dcbbe3ed8f625e51ffc6d55ada40156..e9dc6f881bb92db267903a71f3f3e4de4c557819 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -500,6 +500,11 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			return -EINVAL;
 		WRITE_ONCE(np->pmtudisc, val);
 		return 0;
+	case IPV6_FLOWINFO_SEND:
+		if (optlen < sizeof(int))
+			return -EINVAL;
+		inet6_assign_bit(SNDFLOW, sk, valbool);
+		return 0;
 	}
 	if (needs_rtnl)
 		rtnl_lock();
@@ -948,12 +953,6 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			goto e_inval;
 		retv = ip6_ra_control(sk, val);
 		break;
-	case IPV6_FLOWINFO_SEND:
-		if (optlen < sizeof(int))
-			goto e_inval;
-		np->sndflow = valbool;
-		retv = 0;
-		break;
 	case IPV6_FLOWLABEL_MGR:
 		retv = ipv6_flowlabel_opt(sk, optval, optlen);
 		break;
@@ -1381,7 +1380,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_FLOWINFO_SEND:
-		val = np->sndflow;
+		val = inet6_test_bit(SNDFLOW, sk);
 		break;
 
 	case IPV6_FLOWLABEL_MGR:
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index 4444b61eb23bbf483068d2b119a7559e49ba3880..e8fb0d275cc2d9adf997f944a42a8fc456f8b950 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -89,7 +89,7 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			return -EAFNOSUPPORT;
 		}
 		daddr = &(u->sin6_addr);
-		if (np->sndflow)
+		if (inet6_test_bit(SNDFLOW, sk))
 			fl6.flowlabel = u->sin6_flowinfo & IPV6_FLOWINFO_MASK;
 		if (__ipv6_addr_needs_scope_id(ipv6_addr_type(daddr)))
 			oif = u->sin6_scope_id;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 47372cceb98f6e606346b74230b03e76e303822c..a2aa54a2baaec0169fecd490588a2cd4e8a2f2d7 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -795,7 +795,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			return -EINVAL;
 
 		daddr = &sin6->sin6_addr;
-		if (np->sndflow) {
+		if (inet6_test_bit(SNDFLOW, sk)) {
 			fl6.flowlabel = sin6->sin6_flowinfo&IPV6_FLOWINFO_MASK;
 			if (fl6.flowlabel&IPV6_FLOWLABEL_MASK) {
 				flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 201caf88bb99e4ff87048fab3d89b6ea22269df3..94afb8d0f2d0e4974c3dbe4e3301f0152b5cb9e1 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -163,7 +163,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 	memset(&fl6, 0, sizeof(fl6));
 
-	if (np->sndflow) {
+	if (inet6_test_bit(SNDFLOW, sk)) {
 		fl6.flowlabel = usin->sin6_flowinfo&IPV6_FLOWINFO_MASK;
 		IP6_ECN_flow_init(fl6.flowlabel);
 		if (fl6.flowlabel&IPV6_FLOWLABEL_MASK) {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 97fabbd7e7aa8bf66bfe21a98f97d4408af13d2b..b55e23ba1da53eba2ee4c468e30f9428a6fee3a7 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1427,7 +1427,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		fl6->fl6_dport = sin6->sin6_port;
 		daddr = &sin6->sin6_addr;
 
-		if (np->sndflow) {
+		if (inet6_test_bit(SNDFLOW, sk)) {
 			fl6->flowlabel = sin6->sin6_flowinfo&IPV6_FLOWINFO_MASK;
 			if (fl6->flowlabel & IPV6_FLOWLABEL_MASK) {
 				flowlabel = fl6_sock_lookup(sk, fl6->flowlabel);
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 40af2431e73aad74ab64e97db8a5ee79dda0879d..44cfb72bbd18a34e83e50bebca09729c55df524f 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -431,7 +431,7 @@ static int l2tp_ip6_getname(struct socket *sock, struct sockaddr *uaddr,
 			return -ENOTCONN;
 		lsa->l2tp_conn_id = lsk->peer_conn_id;
 		lsa->l2tp_addr = sk->sk_v6_daddr;
-		if (np->sndflow)
+		if (inet6_test_bit(SNDFLOW, sk))
 			lsa->l2tp_flowinfo = np->flow_label;
 	} else {
 		if (ipv6_addr_any(&sk->sk_v6_rcv_saddr))
@@ -529,7 +529,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			return -EAFNOSUPPORT;
 
 		daddr = &lsa->l2tp_addr;
-		if (np->sndflow) {
+		if (inet6_test_bit(SNDFLOW, sk)) {
 			fl6.flowlabel = lsa->l2tp_flowinfo & IPV6_FLOWINFO_MASK;
 			if (fl6.flowlabel & IPV6_FLOWLABEL_MASK) {
 				flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index 42b5b853ea01c767e1fe878772eeabe5c05adb6d..5c0ed5909d85a1fc137e8652e32df75d8bef28ac 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -296,7 +296,8 @@ static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
 	if (t->flowlabel & SCTP_FLOWLABEL_SET_MASK)
 		fl6->flowlabel = htonl(t->flowlabel & SCTP_FLOWLABEL_VAL_MASK);
 
-	if (np->sndflow && (fl6->flowlabel & IPV6_FLOWLABEL_MASK)) {
+	if (inet6_test_bit(SNDFLOW, sk) &&
+	    (fl6->flowlabel & IPV6_FLOWLABEL_MASK)) {
 		struct ip6_flowlabel *flowlabel;
 
 		flowlabel = fl6_sock_lookup(sk, fl6->flowlabel);
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 00/14] ipv6: round of data-races fixes
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (13 preceding siblings ...)
  2023-09-12 16:02 ` [PATCH net-next 14/14] ipv6: lockless IPV6_FLOWINFO_SEND implementation Eric Dumazet
@ 2023-09-14 10:25 ` Simon Horman
  2023-09-15  9:40 ` patchwork-bot+netdevbpf
  15 siblings, 0 replies; 31+ messages in thread
From: Simon Horman @ 2023-09-14 10:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern, netdev,
	eric.dumazet

On Tue, Sep 12, 2023 at 04:01:58PM +0000, Eric Dumazet wrote:
> This series is inspired by one related syzbot report.
> 
> Many inet6_sk(sk) fields reads or writes are racy.
> 
> Move 1-bit fields to inet->inet_flags to provide
> atomic safety. inet6_{test|set|clear|assign}_bit() helpers
> could be changed later if we need to make room in inet_flags.
> 
> Also add missing READ_ONCE()/WRITE_ONCE() when
> lockless readers need access to specific fields.
> 
> np->srcprefs will be handled separately to avoid merge conflicts
> because a prior patch was posted for net tree.
> 
> Eric Dumazet (14):
>   ipv6: lockless IPV6_UNICAST_HOPS implementation
>   ipv6: lockless IPV6_MULTICAST_LOOP implementation
>   ipv6: lockless IPV6_MULTICAST_HOPS implementation
>   ipv6: lockless IPV6_MTU implementation
>   ipv6: lockless IPV6_MINHOPCOUNT implementation
>   ipv6: lockless IPV6_RECVERR_RFC4884 implementation
>   ipv6: lockless IPV6_MULTICAST_ALL implementation
>   ipv6: lockless IPV6_AUTOFLOWLABEL implementation
>   ipv6: lockless IPV6_DONTFRAG implementation
>   ipv6: lockless IPV6_RECVERR implemetation
>   ipv6: move np->repflow to atomic flags
>   ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation
>   ipv6: lockless IPV6_MTU_DISCOVER implementation
>   ipv6: lockless IPV6_FLOWINFO_SEND implementation

For series,

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 01/14] ipv6: lockless IPV6_UNICAST_HOPS implementation
  2023-09-12 16:01 ` [PATCH net-next 01/14] ipv6: lockless IPV6_UNICAST_HOPS implementation Eric Dumazet
@ 2023-09-14 14:51   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 14:51 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:01 AM, Eric Dumazet wrote:
> Some np->hop_limit accesses are racy, when socket lock is not held.
> 
> Add missing annotations and switch to full lockless implementation.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     | 12 +-----------
>  include/net/ipv6.h       |  2 +-
>  net/ipv6/ip6_output.c    |  2 +-
>  net/ipv6/ipv6_sockglue.c | 20 +++++++++++---------
>  net/ipv6/mcast.c         |  2 +-
>  net/ipv6/ndisc.c         |  2 +-
>  6 files changed, 16 insertions(+), 24 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 02/14] ipv6: lockless IPV6_MULTICAST_LOOP implementation
  2023-09-12 16:02 ` [PATCH net-next 02/14] ipv6: lockless IPV6_MULTICAST_LOOP implementation Eric Dumazet
@ 2023-09-14 14:54   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 14:54 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Add inet6_{test|set|clear|assign}_bit() helpers.
> 
> Note that I am using bits from inet->inet_flags,
> this might change in the future if we need more flags.
> 
> While solving data-races accessing np->mc_loop,
> this patch also allows to implement lockless accesses
> to np->mcast_hops in the following patch.
> 
> Also constify sk_mc_loop() argument.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h            | 18 ++++++++++++++----
>  include/net/inet_sock.h         |  1 +
>  include/net/sock.h              |  2 +-
>  net/core/sock.c                 |  4 ++--
>  net/ipv6/af_inet6.c             |  2 +-
>  net/ipv6/ipv6_sockglue.c        | 18 ++++++++----------
>  net/ipv6/ndisc.c                |  2 +-
>  net/netfilter/ipvs/ip_vs_sync.c |  8 ++------
>  8 files changed, 30 insertions(+), 25 deletions(-)
> 


Reviewed-by: David Ahern <dsahern@kernel.org>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 03/14] ipv6: lockless IPV6_MULTICAST_HOPS implementation
  2023-09-12 16:02 ` [PATCH net-next 03/14] ipv6: lockless IPV6_MULTICAST_HOPS implementation Eric Dumazet
@ 2023-09-14 14:55   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 14:55 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> This fixes data-races around np->mcast_hops,
> and makes IPV6_MULTICAST_HOPS lockless.
> 
> Note that np->mcast_hops is never negative,
> thus can fit an u8 field instead of s16.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h            |  9 +--------
>  include/net/ipv6.h              |  2 +-
>  net/dccp/ipv6.c                 |  2 +-
>  net/ipv6/ipv6_sockglue.c        | 28 +++++++++++++++-------------
>  net/ipv6/tcp_ipv6.c             |  3 ++-
>  net/netfilter/ipvs/ip_vs_sync.c |  2 +-
>  6 files changed, 21 insertions(+), 25 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 04/14] ipv6: lockless IPV6_MTU implementation
  2023-09-12 16:02 ` [PATCH net-next 04/14] ipv6: lockless IPV6_MTU implementation Eric Dumazet
@ 2023-09-14 14:58   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 14:58 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> np->frag_size can be read/written without holding socket lock.
> 
> Add missing annotations and make IPV6_MTU setsockopt() lockless.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/ipv6/ip6_output.c    | 19 +++++++++++--------
>  net/ipv6/ipv6_sockglue.c | 15 +++++++--------
>  2 files changed, 18 insertions(+), 16 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 05/14] ipv6: lockless IPV6_MINHOPCOUNT implementation
  2023-09-12 16:02 ` [PATCH net-next 05/14] ipv6: lockless IPV6_MINHOPCOUNT implementation Eric Dumazet
@ 2023-09-14 15:01   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:01 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Add one missing READ_ONCE() annotation in do_ipv6_getsockopt()
> and make IPV6_MINHOPCOUNT setsockopt() lockless.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/ipv6/ipv6_sockglue.c | 31 +++++++++++++++----------------
>  1 file changed, 15 insertions(+), 16 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next 06/14] ipv6: lockless IPV6_RECVERR_RFC4884 implementation
  2023-09-12 16:02 ` [PATCH net-next 06/14] ipv6: lockless IPV6_RECVERR_RFC4884 implementation Eric Dumazet
@ 2023-09-14 15:02   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:02 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Move np->recverr_rfc4884 to an atomic flag to fix data-races.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     |  1 -
>  include/net/inet_sock.h  |  1 +
>  net/ipv6/datagram.c      |  2 +-
>  net/ipv6/ipv6_sockglue.c | 17 ++++++++---------
>  4 files changed, 10 insertions(+), 11 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>
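
For readers following along, here is a minimal userspace sketch of why
moving a 1-bit field into an atomic flags word removes the race. It uses
C11 atomics rather than the kernel's bit helpers, and every name in it
(demo_sock, demo_assign_bit, demo_test_bit, the DEMO_* indices) is
hypothetical and is not taken from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag indices; these are NOT the kernel's flag values. */
enum { DEMO_RECVERR_RFC4884 = 0, DEMO_MC_ALL = 1 };

struct demo_sock {
	atomic_ulong flags;	/* stands in for inet->inet_flags */
};

/* Set or clear one bit with a single atomic read-modify-write, so a
 * concurrent update of a neighbouring bit cannot be lost. */
static void demo_assign_bit(struct demo_sock *sk, int nr, bool val)
{
	unsigned long mask;

	assert(nr >= 0 && nr < (int)(sizeof(unsigned long) * CHAR_BIT));
	mask = 1UL << nr;

	if (val)
		atomic_fetch_or(&sk->flags, mask);
	else
		atomic_fetch_and(&sk->flags, ~mask);
}

/* Lockless reader: one atomic load, then test the bit. */
static bool demo_test_bit(struct demo_sock *sk, int nr)
{
	return (atomic_load(&sk->flags) >> nr) & 1UL;
}
```

With a plain C bitfield, the compiler updates the containing word with a
non-atomic load, modify, store sequence, so two writers touching
different bits can overwrite each other. The atomic OR/AND above avoids
that without taking a lock.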




* Re: [PATCH net-next 07/14] ipv6: lockless IPV6_MULTICAST_ALL implementation
  2023-09-12 16:02 ` [PATCH net-next 07/14] ipv6: lockless IPV6_MULTICAST_ALL implementation Eric Dumazet
@ 2023-09-14 15:03   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:03 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Move np->mc_all to an atomic flag to fix data-races.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     |  1 -
>  include/net/inet_sock.h  |  1 +
>  net/ipv6/af_inet6.c      |  2 +-
>  net/ipv6/ipv6_sockglue.c | 14 ++++++--------
>  net/ipv6/mcast.c         |  2 +-
>  5 files changed, 9 insertions(+), 11 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>




* Re: [PATCH net-next 08/14] ipv6: lockless IPV6_AUTOFLOWLABEL implementation
  2023-09-12 16:02 ` [PATCH net-next 08/14] ipv6: lockless IPV6_AUTOFLOWLABEL implementation Eric Dumazet
@ 2023-09-14 15:04   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:04 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Move np->autoflowlabel and np->autoflowlabel_set in inet->inet_flags,
> to fix data-races.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     |  2 --
>  include/net/inet_sock.h  |  2 ++
>  include/net/ipv6.h       |  2 +-
>  net/ipv6/ip6_output.c    | 12 +++++-------
>  net/ipv6/ipv6_sockglue.c | 11 +++++------
>  5 files changed, 13 insertions(+), 16 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>




* Re: [PATCH net-next 09/14] ipv6: lockless IPV6_DONTFRAG implementation
  2023-09-12 16:02 ` [PATCH net-next 09/14] ipv6: lockless IPV6_DONTFRAG implementation Eric Dumazet
@ 2023-09-14 15:05   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:05 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Move np->dontfrag flag to inet->inet_flags to fix data-races.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     | 1 -
>  include/net/inet_sock.h  | 1 +
>  include/net/ipv6.h       | 6 +++---
>  include/net/xfrm.h       | 2 +-
>  net/ipv6/icmp.c          | 4 ++--
>  net/ipv6/ip6_output.c    | 2 +-
>  net/ipv6/ipv6_sockglue.c | 9 ++++-----
>  net/ipv6/ping.c          | 2 +-
>  net/ipv6/raw.c           | 2 +-
>  net/ipv6/udp.c           | 2 +-
>  net/l2tp/l2tp_ip6.c      | 2 +-
>  11 files changed, 16 insertions(+), 17 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>




* Re: [PATCH net-next 10/14] ipv6: lockless IPV6_RECVERR implemetation
  2023-09-12 16:02 ` [PATCH net-next 10/14] ipv6: lockless IPV6_RECVERR implemetation Eric Dumazet
@ 2023-09-14 15:06   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:06 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> np->recverr is moved to inet->inet_flags to fix data-races.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     |  3 +--
>  include/net/inet_sock.h  |  1 +
>  include/net/ipv6.h       |  4 +---
>  net/dccp/ipv6.c          |  2 +-
>  net/ipv4/ping.c          |  2 +-
>  net/ipv6/datagram.c      |  6 ++----
>  net/ipv6/ipv6_sockglue.c | 17 ++++++++---------
>  net/ipv6/raw.c           | 10 +++++-----
>  net/ipv6/tcp_ipv6.c      |  2 +-
>  net/ipv6/udp.c           |  6 +++---
>  net/sctp/ipv6.c          |  4 +---
>  11 files changed, 25 insertions(+), 32 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>




* Re: [PATCH net-next 11/14] ipv6: move np->repflow to atomic flags
  2023-09-12 16:02 ` [PATCH net-next 11/14] ipv6: move np->repflow to atomic flags Eric Dumazet
@ 2023-09-14 15:07   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:07 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Move np->repflow to inet->inet_flags to fix data-races.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     |  1 -
>  include/net/inet_sock.h  |  1 +
>  net/dccp/ipv6.c          |  2 +-
>  net/ipv6/af_inet6.c      |  3 ++-
>  net/ipv6/ip6_flowlabel.c |  8 ++++----
>  net/ipv6/tcp_ipv6.c      | 14 ++++++--------
>  6 files changed, 14 insertions(+), 15 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>




* Re: [PATCH net-next 12/14] ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation
  2023-09-12 16:02 ` [PATCH net-next 12/14] ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation Eric Dumazet
@ 2023-09-14 15:08   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:08 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Reads from np->rtalert_isolate are racy.
> 
> Move this flag to inet->inet_flags to fix data-races.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     |  3 +--
>  include/net/inet_sock.h  |  1 +
>  net/ipv6/ip6_output.c    |  3 +--
>  net/ipv6/ipv6_sockglue.c | 13 ++++++-------
>  4 files changed, 9 insertions(+), 11 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>




* Re: [PATCH net-next 13/14] ipv6: lockless IPV6_MTU_DISCOVER implementation
  2023-09-12 16:02 ` [PATCH net-next 13/14] ipv6: lockless IPV6_MTU_DISCOVER implementation Eric Dumazet
@ 2023-09-14 15:10   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:10 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> Most np->pmtudisc reads are racy.
> 
> Move this 3-bit field to a full byte, add annotations,
> and make IPV6_MTU_DISCOVER setsockopt() lockless.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h            |  5 ++---
>  include/net/ip6_route.h         | 14 +++++++++-----
>  net/ipv6/ip6_output.c           |  4 ++--
>  net/ipv6/ipv6_sockglue.c        | 17 ++++++++---------
>  net/ipv6/raw.c                  |  2 +-
>  net/ipv6/udp.c                  |  2 +-
>  net/netfilter/ipvs/ip_vs_sync.c |  2 +-
>  7 files changed, 24 insertions(+), 22 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>
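
As an illustration of the approach described in the commit message, the
sketch below gives a small multi-bit setting its own byte and accesses
it through volatile "once" accessors. It is userspace code under stated
assumptions: DEMO_READ_ONCE/DEMO_WRITE_ONCE are simplified stand-ins for
the kernel macros (they rely on the GCC/Clang __typeof__ extension), and
demo_np, the helper names and DEMO_PMTUDISC_MAX are hypothetical.

```c
#include <stdint.h>

/* Simplified stand-ins: force exactly one access of the object's own
 * width, so the compiler cannot tear, fuse or repeat the access. */
#define DEMO_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

#define DEMO_PMTUDISC_MAX 5	/* hypothetical upper bound for the demo */

struct demo_np {
	uint8_t pmtudisc;	/* full byte: a store touches no other field */
	uint8_t other;		/* unrelated neighbour */
};

/* Lockless setter: validate, then publish with a single byte store. */
static int demo_set_pmtudisc(struct demo_np *np, int val)
{
	if (val < 0 || val > DEMO_PMTUDISC_MAX)
		return -1;
	DEMO_WRITE_ONCE(np->pmtudisc, (uint8_t)val);
	return 0;
}

/* Lockless getter: a single byte load. */
static int demo_get_pmtudisc(const struct demo_np *np)
{
	return DEMO_READ_ONCE(np->pmtudisc);
}
```

Had the setting remained a 3-bit bitfield, each store would be a
read-modify-write of the shared storage unit and could clobber a
concurrent update to a neighbouring bitfield. A byte-sized field can be
written with one plain store, which is what makes dropping the socket
lock plausible here.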




* Re: [PATCH net-next 14/14] ipv6: lockless IPV6_FLOWINFO_SEND implementation
  2023-09-12 16:02 ` [PATCH net-next 14/14] ipv6: lockless IPV6_FLOWINFO_SEND implementation Eric Dumazet
@ 2023-09-14 15:11   ` David Ahern
  0 siblings, 0 replies; 31+ messages in thread
From: David Ahern @ 2023-09-14 15:11 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet

On 9/12/23 10:02 AM, Eric Dumazet wrote:
> np->sndflow reads are racy.
> 
> Use one bit from atomic inet->inet_flags instead, so that
> IPV6_FLOWINFO_SEND setsockopt() can be lockless.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/ipv6.h     |  3 +--
>  include/net/inet_sock.h  |  1 +
>  net/dccp/ipv6.c          |  2 +-
>  net/ipv4/ping.c          |  3 +--
>  net/ipv6/af_inet6.c      |  2 +-
>  net/ipv6/datagram.c      |  7 ++++---
>  net/ipv6/ipv6_sockglue.c | 13 ++++++-------
>  net/ipv6/ping.c          |  2 +-
>  net/ipv6/raw.c           |  2 +-
>  net/ipv6/tcp_ipv6.c      |  2 +-
>  net/ipv6/udp.c           |  2 +-
>  net/l2tp/l2tp_ip6.c      |  4 ++--
>  net/sctp/ipv6.c          |  3 ++-
>  13 files changed, 23 insertions(+), 23 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>




* Re: [PATCH net-next 00/14] ipv6: round of data-races fixes
  2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
                   ` (14 preceding siblings ...)
  2023-09-14 10:25 ` [PATCH net-next 00/14] ipv6: round of data-races fixes Simon Horman
@ 2023-09-15  9:40 ` patchwork-bot+netdevbpf
  15 siblings, 0 replies; 31+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-09-15  9:40 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, kuba, pabeni, dsahern, netdev, eric.dumazet

Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Tue, 12 Sep 2023 16:01:58 +0000 you wrote:
> This series is inspired by one related syzbot report.
> 
> Many reads or writes of inet6_sk(sk) fields are racy.
> 
> Move 1-bit fields to inet->inet_flags to provide
> atomic safety. inet6_{test|set|clear|assign}_bit() helpers
> could be changed later if we need to make room in inet_flags.
> 
> [...]

Here is the summary with links:
  - [net-next,01/14] ipv6: lockless IPV6_UNICAST_HOPS implementation
    https://git.kernel.org/netdev/net-next/c/b0adfba7ee77
  - [net-next,02/14] ipv6: lockless IPV6_MULTICAST_LOOP implementation
    https://git.kernel.org/netdev/net-next/c/d986f52124e0
  - [net-next,03/14] ipv6: lockless IPV6_MULTICAST_HOPS implementation
    https://git.kernel.org/netdev/net-next/c/2da23eb07c91
  - [net-next,04/14] ipv6: lockless IPV6_MTU implementation
    https://git.kernel.org/netdev/net-next/c/15f926c4457a
  - [net-next,05/14] ipv6: lockless IPV6_MINHOPCOUNT implementation
    https://git.kernel.org/netdev/net-next/c/273784d3c574
  - [net-next,06/14] ipv6: lockless IPV6_RECVERR_RFC4884 implementation
    https://git.kernel.org/netdev/net-next/c/dcae74622c05
  - [net-next,07/14] ipv6: lockless IPV6_MULTICAST_ALL implementation
    https://git.kernel.org/netdev/net-next/c/6559c0ff3bc2
  - [net-next,08/14] ipv6: lockless IPV6_AUTOFLOWLABEL implementation
    https://git.kernel.org/netdev/net-next/c/5121516b0c47
  - [net-next,09/14] ipv6: lockless IPV6_DONTFRAG implementation
    https://git.kernel.org/netdev/net-next/c/1086ca7cce29
  - [net-next,10/14] ipv6: lockless IPV6_RECVERR implemetation
    https://git.kernel.org/netdev/net-next/c/3fa29971c695
  - [net-next,11/14] ipv6: move np->repflow to atomic flags
    https://git.kernel.org/netdev/net-next/c/3cccda8db2cf
  - [net-next,12/14] ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation
    https://git.kernel.org/netdev/net-next/c/83cd5eb654b3
  - [net-next,13/14] ipv6: lockless IPV6_MTU_DISCOVER implementation
    https://git.kernel.org/netdev/net-next/c/6b724bc4300b
  - [net-next,14/14] ipv6: lockless IPV6_FLOWINFO_SEND implementation
    https://git.kernel.org/netdev/net-next/c/859f8b265fc2

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




end of thread; newest message 2023-09-15  9:40 UTC

Thread overview: 31+ messages
2023-09-12 16:01 [PATCH net-next 00/14] ipv6: round of data-races fixes Eric Dumazet
2023-09-12 16:01 ` [PATCH net-next 01/14] ipv6: lockless IPV6_UNICAST_HOPS implementation Eric Dumazet
2023-09-14 14:51   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 02/14] ipv6: lockless IPV6_MULTICAST_LOOP implementation Eric Dumazet
2023-09-14 14:54   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 03/14] ipv6: lockless IPV6_MULTICAST_HOPS implementation Eric Dumazet
2023-09-14 14:55   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 04/14] ipv6: lockless IPV6_MTU implementation Eric Dumazet
2023-09-14 14:58   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 05/14] ipv6: lockless IPV6_MINHOPCOUNT implementation Eric Dumazet
2023-09-14 15:01   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 06/14] ipv6: lockless IPV6_RECVERR_RFC4884 implementation Eric Dumazet
2023-09-14 15:02   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 07/14] ipv6: lockless IPV6_MULTICAST_ALL implementation Eric Dumazet
2023-09-14 15:03   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 08/14] ipv6: lockless IPV6_AUTOFLOWLABEL implementation Eric Dumazet
2023-09-14 15:04   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 09/14] ipv6: lockless IPV6_DONTFRAG implementation Eric Dumazet
2023-09-14 15:05   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 10/14] ipv6: lockless IPV6_RECVERR implemetation Eric Dumazet
2023-09-14 15:06   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 11/14] ipv6: move np->repflow to atomic flags Eric Dumazet
2023-09-14 15:07   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 12/14] ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementation Eric Dumazet
2023-09-14 15:08   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 13/14] ipv6: lockless IPV6_MTU_DISCOVER implementation Eric Dumazet
2023-09-14 15:10   ` David Ahern
2023-09-12 16:02 ` [PATCH net-next 14/14] ipv6: lockless IPV6_FLOWINFO_SEND implementation Eric Dumazet
2023-09-14 15:11   ` David Ahern
2023-09-14 10:25 ` [PATCH net-next 00/14] ipv6: round of data-races fixes Simon Horman
2023-09-15  9:40 ` patchwork-bot+netdevbpf
