public inbox for netdev@vger.kernel.org
* [PATCH net-next v1] net: annotate data races around sk->sk_prot
@ 2026-03-04  3:16 Jiayuan Chen
  2026-03-04  3:42 ` Kuniyuki Iwashima
  0 siblings, 1 reply; 3+ messages in thread
From: Jiayuan Chen @ 2026-03-04  3:16 UTC (permalink / raw)
  To: netdev
  Cc: Jiayuan Chen, Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni,
	Willem de Bruijn, David S. Miller, Jakub Kicinski, Simon Horman,
	David Ahern, linux-kernel

inet_sendmsg(), inet_recvmsg() and sock_common_recvmsg() access
sk->sk_prot without lock_sock() or any other synchronization.

sock_replace_proto() (used by sockmap), TLS and MPTCP can change
sk->sk_prot under us, so these functions need READ_ONCE() to avoid
load tearing.

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 net/core/sock.c    | 2 +-
 net/ipv4/af_inet.c | 8 ++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index f4e2ff23d60e..79b659cebbb1 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3968,7 +3968,7 @@ int sock_common_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->recvmsg(sk, msg, size, flags);
+	return READ_ONCE(sk->sk_prot)->recvmsg(sk, msg, size, flags);
 }
 EXPORT_SYMBOL(sock_common_recvmsg);
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index babcd75a08e2..e95ffa070568 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -852,11 +852,13 @@ EXPORT_SYMBOL_GPL(inet_send_prepare);
 int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 {
 	struct sock *sk = sock->sk;
+	const struct proto *prot;
 
 	if (unlikely(inet_send_prepare(sk)))
 		return -EAGAIN;
 
-	return INDIRECT_CALL_2(sk->sk_prot->sendmsg, tcp_sendmsg, udp_sendmsg,
+	prot = READ_ONCE(sk->sk_prot);
+	return INDIRECT_CALL_2(prot->sendmsg, tcp_sendmsg, udp_sendmsg,
 			       sk, msg, size);
 }
 EXPORT_SYMBOL(inet_sendmsg);
@@ -882,11 +884,13 @@ int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		 int flags)
 {
 	struct sock *sk = sock->sk;
+	const struct proto *prot;
 
 	if (likely(!(flags & MSG_ERRQUEUE)))
 		sock_rps_record_flow(sk);
 
-	return INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg,
+	prot = READ_ONCE(sk->sk_prot);
+	return INDIRECT_CALL_2(prot->recvmsg, tcp_recvmsg, udp_recvmsg,
 			       sk, msg, size, flags);
 }
 EXPORT_SYMBOL(inet_recvmsg);
-- 
2.43.0



* Re: [PATCH net-next v1] net: annotate data races around sk->sk_prot
  2026-03-04  3:16 [PATCH net-next v1] net: annotate data races around sk->sk_prot Jiayuan Chen
@ 2026-03-04  3:42 ` Kuniyuki Iwashima
  2026-03-04  3:53   ` Jiayuan Chen
  0 siblings, 1 reply; 3+ messages in thread
From: Kuniyuki Iwashima @ 2026-03-04  3:42 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: netdev, Eric Dumazet, Paolo Abeni, Willem de Bruijn,
	David S. Miller, Jakub Kicinski, Simon Horman, David Ahern,
	linux-kernel

On Tue, Mar 3, 2026 at 7:16 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> inet_sendmsg(), inet_recvmsg() and sock_common_recvmsg() access
> sk->sk_prot without lock_sock() or any other synchronization.
>
> sock_replace_proto() (used by sockmap), TLS and MPTCP can change
> sk->sk_prot under us, so these functions need READ_ONCE() to avoid
> load tearing.
>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>  net/core/sock.c    | 2 +-
>  net/ipv4/af_inet.c | 8 ++++++--
>  2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index f4e2ff23d60e..79b659cebbb1 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -3968,7 +3968,7 @@ int sock_common_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  {
>         struct sock *sk = sock->sk;
>
> -       return sk->sk_prot->recvmsg(sk, msg, size, flags);
> +       return READ_ONCE(sk->sk_prot)->recvmsg(sk, msg, size, flags);
>  }
>  EXPORT_SYMBOL(sock_common_recvmsg);

None of the users seem to be supported by SOCKMAP,
or am I missing something?

include/net/sock.h:1963:int sock_common_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
net/core/sock.c:3966:int sock_common_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
net/core/sock.c:3973:EXPORT_SYMBOL(sock_common_recvmsg);
net/l2tp/l2tp_ip6.c:774: .recvmsg   = sock_common_recvmsg,
net/l2tp/l2tp_ip.c:645: .recvmsg   = sock_common_recvmsg,
net/ipv6/raw.c:1292: .recvmsg   = sock_common_recvmsg, /* ok */
net/ieee802154/socket.c:427: .recvmsg   = sock_common_recvmsg,
net/ieee802154/socket.c:989: .recvmsg   = sock_common_recvmsg,
net/phonet/socket.c:441: .recvmsg = sock_common_recvmsg,
net/phonet/socket.c:461: .recvmsg = sock_common_recvmsg,



>
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index babcd75a08e2..e95ffa070568 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -852,11 +852,13 @@ EXPORT_SYMBOL_GPL(inet_send_prepare);
>  int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
>  {
>         struct sock *sk = sock->sk;
> +       const struct proto *prot;
>
>         if (unlikely(inet_send_prepare(sk)))
>                 return -EAGAIN;
>
> -       return INDIRECT_CALL_2(sk->sk_prot->sendmsg, tcp_sendmsg, udp_sendmsg,
> +       prot = READ_ONCE(sk->sk_prot);
> +       return INDIRECT_CALL_2(prot->sendmsg, tcp_sendmsg, udp_sendmsg,
>                                sk, msg, size);
>  }
>  EXPORT_SYMBOL(inet_sendmsg);
> @@ -882,11 +884,13 @@ int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
>                  int flags)
>  {
>         struct sock *sk = sock->sk;
> +       const struct proto *prot;
>
>         if (likely(!(flags & MSG_ERRQUEUE)))
>                 sock_rps_record_flow(sk);
>
> -       return INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg,
> +       prot = READ_ONCE(sk->sk_prot);
> +       return INDIRECT_CALL_2(prot->recvmsg, tcp_recvmsg, udp_recvmsg,
>                                sk, msg, size, flags);
>  }
>  EXPORT_SYMBOL(inet_recvmsg);
> --
> 2.43.0
>


* Re: [PATCH net-next v1] net: annotate data races around sk->sk_prot
  2026-03-04  3:42 ` Kuniyuki Iwashima
@ 2026-03-04  3:53   ` Jiayuan Chen
  0 siblings, 0 replies; 3+ messages in thread
From: Jiayuan Chen @ 2026-03-04  3:53 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: netdev, Eric Dumazet, Paolo Abeni, Willem de Bruijn,
	David S. Miller, Jakub Kicinski, Simon Horman, David Ahern,
	linux-kernel

March 4, 2026 at 11:42, "Kuniyuki Iwashima" <kuniyu@google.com> wrote:


> 
> On Tue, Mar 3, 2026 at 7:16 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> 
> > 
> > inet_sendmsg(), inet_recvmsg() and sock_common_recvmsg() access
> >  sk->sk_prot without lock_sock() or any other synchronization.
> > 
> >  sock_replace_proto() (used by sockmap), TLS and MPTCP can change
> >  sk->sk_prot under us, so these functions need READ_ONCE() to avoid
> >  load tearing.
> > 
> >  Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> >  ---
> >  net/core/sock.c | 2 +-
> >  net/ipv4/af_inet.c | 8 ++++++--
> >  2 files changed, 7 insertions(+), 3 deletions(-)
> > 
> >  diff --git a/net/core/sock.c b/net/core/sock.c
> >  index f4e2ff23d60e..79b659cebbb1 100644
> >  --- a/net/core/sock.c
> >  +++ b/net/core/sock.c
> >  @@ -3968,7 +3968,7 @@ int sock_common_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
> >  {
> >  struct sock *sk = sock->sk;
> > 
> >  - return sk->sk_prot->recvmsg(sk, msg, size, flags);
> >  + return READ_ONCE(sk->sk_prot)->recvmsg(sk, msg, size, flags);
> >  }
> >  EXPORT_SYMBOL(sock_common_recvmsg);
> > 
> None of the users seem to be supported by SOCKMAP,
> or am I missing something?
> 
> include/net/sock.h:1963:int sock_common_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
> net/core/sock.c:3966:int sock_common_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
> net/core/sock.c:3973:EXPORT_SYMBOL(sock_common_recvmsg);
> net/l2tp/l2tp_ip6.c:774: .recvmsg = sock_common_recvmsg,
> net/l2tp/l2tp_ip.c:645: .recvmsg = sock_common_recvmsg,
> net/ipv6/raw.c:1292: .recvmsg = sock_common_recvmsg, /* ok */
> net/ieee802154/socket.c:427: .recvmsg = sock_common_recvmsg,
> net/ieee802154/socket.c:989: .recvmsg = sock_common_recvmsg,
> net/phonet/socket.c:441: .recvmsg = sock_common_recvmsg,
> net/phonet/socket.c:461: .recvmsg = sock_common_recvmsg,


You're right. None of the sock_common_recvmsg() users (raw, l2tp,
ieee802154, phonet) support SOCKMAP/TLS/MPTCP, so there is no
concurrent writer to sk->sk_prot for these socket types. I'll drop
that change in v2.
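
For readers following the thread, here is a minimal userspace sketch of
the pattern the remaining af_inet.c hunks rely on: load the ops pointer
exactly once into a local, then dispatch through that local. Everything
named demo_* is hypothetical and only stands in for struct sock and
struct proto; the READ_ONCE()/WRITE_ONCE() macros are simplified
volatile-cast approximations written with GNU C __typeof__, not the
kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros (assumption:
 * GNU C __typeof__ is available; the real kernel macros do more). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct demo_sock;

/* Hypothetical, much-reduced analogue of struct proto. */
struct demo_proto {
	const char *name;
	int (*recvmsg)(struct demo_sock *sk, size_t size);
};

/* Hypothetical analogue of struct sock: only the ops pointer. */
struct demo_sock {
	const struct demo_proto *sk_prot;
};

static int base_recvmsg(struct demo_sock *sk, size_t size)
{
	(void)sk;
	return (int)size;
}

static int wrapped_recvmsg(struct demo_sock *sk, size_t size)
{
	(void)sk;
	return (int)size + 1000;
}

static const struct demo_proto base_prot = { "base", base_recvmsg };
static const struct demo_proto wrapped_prot = { "wrapped", wrapped_recvmsg };

/* Writer side: swaps the ops table with a single marked store, in the
 * spirit of a protocol replacement done by another context. */
static void demo_replace_proto(struct demo_sock *sk,
			       const struct demo_proto *prot)
{
	WRITE_ONCE(sk->sk_prot, prot);
}

/* Reader side: one marked load into a local, then every later use goes
 * through the local, so the compiler cannot reload sk->sk_prot between
 * the check and the call. */
static int demo_recvmsg(struct demo_sock *sk, size_t size)
{
	const struct demo_proto *prot = READ_ONCE(sk->sk_prot);

	assert(prot && prot->recvmsg);
	return prot->recvmsg(sk, size);
}
```

The sketch is single-threaded and only shows the shape of the marked
accesses. Whether a given call site needs the annotation still depends
on whether a concurrent writer can exist for that socket type, which is
the point raised in the review above.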

> > 
> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> >  index babcd75a08e2..e95ffa070568 100644
> >  --- a/net/ipv4/af_inet.c
> >  +++ b/net/ipv4/af_inet.c
> >  @@ -852,11 +852,13 @@ EXPORT_SYMBOL_GPL(inet_send_prepare);
> >  int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> >  {
> >  struct sock *sk = sock->sk;
> >  + const struct proto *prot;
> > 
> >  if (unlikely(inet_send_prepare(sk)))
> >  return -EAGAIN;
> > 
> >  - return INDIRECT_CALL_2(sk->sk_prot->sendmsg, tcp_sendmsg, udp_sendmsg,
> >  + prot = READ_ONCE(sk->sk_prot);
> >  + return INDIRECT_CALL_2(prot->sendmsg, tcp_sendmsg, udp_sendmsg,
> >  sk, msg, size);
> >  }
> >  EXPORT_SYMBOL(inet_sendmsg);
> >  @@ -882,11 +884,13 @@ int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
> >  int flags)
> >  {
> >  struct sock *sk = sock->sk;
> >  + const struct proto *prot;
> > 
> >  if (likely(!(flags & MSG_ERRQUEUE)))
> >  sock_rps_record_flow(sk);
> > 
> >  - return INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg,
> >  + prot = READ_ONCE(sk->sk_prot);
> >  + return INDIRECT_CALL_2(prot->recvmsg, tcp_recvmsg, udp_recvmsg,
> >  sk, msg, size, flags);
> >  }
> >  EXPORT_SYMBOL(inet_recvmsg);
> >  --
> >  2.43.0
> >
>

