From: Sabrina Dubroca <sd@queasysnail.net>
To: Ren Wei <n05ec@lzu.edu.cn>
Cc: netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, horms@kernel.org,
steffen.klassert@secunet.com, herbert@gondor.apana.org.au,
yuantan098@gmail.com, yifanwucs@gmail.com,
tomapufckgml@gmail.com, bird@lzu.edu.cn, ronbogo@outlook.com,
zylzyl2333@gmail.com
Subject: Re: [PATCH net 1/1] xfrm: espintcp: publish ULP context before entry points
Date: Tue, 12 May 2026 12:07:02 +0200
Message-ID: <agL7xoDgf7z6nRXX@krikkit>
In-Reply-To: <c30b645074a1b379e0f7fe297f917c66137d9964.1778464688.git.zylzyl2333@gmail.com>
Thanks for the fix. A small note: IPsec fixes go through the "ipsec"
tree, not "net", so the prefix should be [PATCH ipsec].

Some comments inline:

2026-05-11, 21:40:58 +0800, Ren Wei wrote:
> diff --git a/include/net/espintcp.h b/include/net/espintcp.h
> index c70efd704b6d..034be559786b 100644
> --- a/include/net/espintcp.h
> +++ b/include/net/espintcp.h
> @@ -34,7 +34,16 @@ static inline struct espintcp_ctx *espintcp_getctx(const struct sock *sk)
> {
> const struct inet_connection_sock *icsk = inet_csk(sk);
>
> - /* RCU is only needed for diag */
> - return (__force void *)icsk->icsk_ulp_data;
> + /*
> + * The caller reached an ESP entry point by observing sk_prot,
> + * sk_socket->ops, or one of the socket callbacks. Keep the ctx
> + * load after that observation so the caller cannot see the new
> + * entry point while still seeing stale icsk_ulp_data.
I don't think this comment is really helpful.
> + *
> + * Pairs with smp_wmb() in espintcp_init_sk().
> + */
> + smp_rmb();
> +
> + return (__force void *)READ_ONCE(icsk->icsk_ulp_data);
I think smp_store_release/smp_load_acquire is the "standard spelling"
for this now.
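
To illustrate the pairing, here is a minimal userspace sketch using C11
atomics rather than the kernel primitives. It only approximates
smp_store_release()/smp_load_acquire(), and demo_sock, demo_ctx,
demo_publish() and demo_getctx() are hypothetical names, not code from
the patch:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for espintcp_ctx and the socket; not kernel types. */
struct demo_ctx {
	int ready;
};

struct demo_sock {
	_Atomic(struct demo_ctx *) ulp_data;
};

/*
 * Writer side: finish initializing ctx, then publish the pointer with
 * release semantics (userspace analogue of smp_store_release()).
 */
void demo_publish(struct demo_sock *sk, struct demo_ctx *ctx)
{
	ctx->ready = 1;
	atomic_store_explicit(&sk->ulp_data, ctx, memory_order_release);
}

/*
 * Reader side: acquire load (analogue of smp_load_acquire()). A reader
 * that sees a non-NULL pointer also sees the writes made to *ctx before
 * the release store.
 */
struct demo_ctx *demo_getctx(struct demo_sock *sk)
{
	return atomic_load_explicit(&sk->ulp_data, memory_order_acquire);
}
```

The ordering comes from the store/load pair on the same pointer, so no
separate barrier calls are needed on either side.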
[...]
> @@ -472,34 +476,46 @@ static int espintcp_init_sk(struct sock *sk)
>
> __sk_dst_reset(sk);
>
> - strp_check_rcv(&ctx->strp);
> skb_queue_head_init(&ctx->ike_queue);
> skb_queue_head_init(&ctx->out_queue);
> + ctx->saved_data_ready = READ_ONCE(sk->sk_data_ready);
> + ctx->saved_write_space = READ_ONCE(sk->sk_write_space);
> + ctx->saved_destruct = READ_ONCE(sk->sk_destruct);
If something is changing those while espintcp_init_sk is running,
READ_ONCE won't help us. We'll end up with the wrong saved_*
values. Can this actually happen here?
> + INIT_WORK(&ctx->work, espintcp_tx_work);
> +
> + /* avoid using task_frag */
> + sk->sk_allocation = GFP_ATOMIC;
> + sk->sk_use_task_frag = false;
>
> if (sk->sk_family == AF_INET) {
> - sk->sk_prot = &espintcp_prot;
> - sk->sk_socket->ops = &espintcp_ops;
> + prot = &espintcp_prot;
> + ops = &espintcp_ops;
> } else {
> mutex_lock(&tcpv6_prot_mutex);
> if (!espintcp6_prot.recvmsg)
> - build_protos(&espintcp6_prot, &espintcp6_ops, sk->sk_prot, sk->sk_socket->ops);
> + build_protos(&espintcp6_prot, &espintcp6_ops,
> + READ_ONCE(sk->sk_prot),
> + READ_ONCE(sk->sk_socket->ops));
And similar here. Those should always be tcpv6_prot/inet6_stream_ops,
but I wrote it this way to avoid having to use stubs, back when IPv6
could be built as a module. This could now be moved into espintcp_init
like the ipv4 variant of this.
> mutex_unlock(&tcpv6_prot_mutex);
>
> - sk->sk_prot = &espintcp6_prot;
> - sk->sk_socket->ops = &espintcp6_ops;
> + prot = &espintcp6_prot;
> + ops = &espintcp6_ops;
> }
Or just move the whole block to the end, instead of introducing those
temporary variables?
> - ctx->saved_data_ready = sk->sk_data_ready;
> - ctx->saved_write_space = sk->sk_write_space;
> - ctx->saved_destruct = sk->sk_destruct;
> - sk->sk_data_ready = espintcp_data_ready;
> - sk->sk_write_space = espintcp_write_space;
> - sk->sk_destruct = espintcp_destruct;
> rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
> - INIT_WORK(&ctx->work, espintcp_tx_work);
>
> - /* avoid using task_frag */
> - sk->sk_allocation = GFP_ATOMIC;
> - sk->sk_use_task_frag = false;
> + /*
> + * Publish the fully initialized ctx before publishing any entry point
> + * that can call espintcp_getctx(). The read barrier there runs after
> + * the caller has observed one of these pointers.
> + */
> + smp_wmb();
> + WRITE_ONCE(sk->sk_prot, prot);
> + WRITE_ONCE(sk->sk_socket->ops, ops);
> + WRITE_ONCE(sk->sk_data_ready, espintcp_data_ready);
> + WRITE_ONCE(sk->sk_write_space, espintcp_write_space);
> + WRITE_ONCE(sk->sk_destruct, espintcp_destruct);
> +
> + strp_check_rcv(&ctx->strp);
>
> return 0;
>
> @@ -530,7 +546,7 @@ static void espintcp_close(struct sock *sk, long timeout)
>
> strp_stop(&ctx->strp);
>
> - sk->sk_prot = &tcp_prot;
> + WRITE_ONCE(sk->sk_prot, &tcp_prot);
Actually this should be the original sk_prot, which could be
tcpv6_prot.
I'm not sure how much the WRITE_ONCE matters here. What is it
protecting against/synchronizing with?
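
One possible way to restore the original sk_prot, sketched in userspace
with simplified types. The saved_prot field and all demo_* names are
assumptions made for illustration; nothing like this is in the patch:

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct proto or struct sock. */
struct demo_proto {
	const char *name;
};

const struct demo_proto demo_tcp_prot = { "TCP" };
const struct demo_proto demo_tcpv6_prot = { "TCPv6" };
const struct demo_proto demo_espintcp_prot = { "ESPINTCP" };

struct demo_ctx {
	const struct demo_proto *saved_prot;	/* assumed field */
};

struct demo_sock {
	const struct demo_proto *prot;
	struct demo_ctx *ctx;
};

/* Remember whichever proto the socket had before installing the ULP one. */
void demo_init_sk(struct demo_sock *sk, struct demo_ctx *ctx)
{
	ctx->saved_prot = sk->prot;
	sk->ctx = ctx;
	sk->prot = &demo_espintcp_prot;
}

/* Restore the saved proto instead of hard-coding the IPv4 one. */
void demo_close(struct demo_sock *sk)
{
	sk->prot = sk->ctx->saved_prot;
}
```

With this shape, an IPv6 socket gets its IPv6 proto back on close and an
IPv4 socket gets the IPv4 one, without a family check in the close path.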
--
Sabrina