All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dust Li <dust.li@linux.alibaba.com>
To: "D. Wythe" <alibuda@linux.alibaba.com>,
	kgraul@linux.ibm.com, wenjia@linux.ibm.com, jaka@linux.ibm.com,
	wintera@linux.ibm.com, guwen@linux.alibaba.com
Cc: kuba@kernel.org, davem@davemloft.net, netdev@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-rdma@vger.kernel.org,
	tonylu@linux.alibaba.com, pabeni@redhat.com, edumazet@google.com
Subject: Re: [RFC PATCH net-next] net/smc: avoid conflict with sockmap after fallback
Date: Mon, 17 Mar 2025 11:57:18 +0800	[thread overview]
Message-ID: <20250317035718.GD56800@linux.alibaba.com> (raw)
In-Reply-To: <20250312133014.35775-1-alibuda@linux.alibaba.com>

On 2025-03-12 21:30:14, D. Wythe wrote:
>Currently, after fallback, SMC will occupy the sk_user_data of the TCP sock(clcsk) to
>forward events. As a result, we cannot use sockmap after that, since sockmap also
>requires the use of the sk_user_data. Even more, in some cases, this may result in
>abnormal panic.

Looks like this is a bugfix more then a feature.

>
>To enable sockmap after SMC fallback , we need to avoid using sk_user_data and
>instead introduce an additional smc_ctx in tcp_sock to index from TCP sock to SMC sock.
>
>Additionally, we bind the lifecycle of the SMC sock to that of the TCP sock, ensuring
>that the indexing to the SMC sock remains valid throughout the lifetime of the TCP sock.
>
>One key reason is that SMC overrides inet_connection_sock_af_ops, which introduces
>potential dependencies. We must ensure that the af_ops remain visible throughout the
>lifecycle of the TCP socket. In addition, this also resolves potential issues in some
>scenarios where the SMC sock might be invalid.
>
>Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>---
> include/linux/tcp.h |  1 +
> net/smc/af_smc.c    | 53 ++++++++++++++++++++++-----------------------
> net/smc/smc.h       |  8 +++----
> net/smc/smc_close.c |  1 -
> 4 files changed, 30 insertions(+), 33 deletions(-)
>
>diff --git a/include/linux/tcp.h b/include/linux/tcp.h
>index f88daaa76d83..f2223b1cc0d0 100644
>--- a/include/linux/tcp.h
>+++ b/include/linux/tcp.h
>@@ -478,6 +478,7 @@ struct tcp_sock {
> #if IS_ENABLED(CONFIG_SMC)
> 	bool	syn_smc;	/* SYN includes SMC */
> 	bool	(*smc_hs_congested)(const struct sock *sk);
>+	void	*smc_ctx;
> #endif
> 
> #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
>diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>index bc356f77ff1d..d434105639c1 100644
>--- a/net/smc/af_smc.c
>+++ b/net/smc/af_smc.c
>@@ -127,7 +127,7 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> 	struct smc_sock *smc;
> 	struct sock *child;
> 
>-	smc = smc_clcsock_user_data(sk);
>+	smc = smc_sk_from_clcsk(sk);
> 
> 	if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
> 				sk->sk_max_ack_backlog)
>@@ -143,8 +143,6 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> 					       own_req);
> 	/* child must not inherit smc or its ops */
> 	if (child) {
>-		rcu_assign_sk_user_data(child, NULL);
>-
> 		/* v4-mapped sockets don't inherit parent ops. Don't restore. */
> 		if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> 			inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
>@@ -161,10 +159,7 @@ static bool smc_hs_congested(const struct sock *sk)
> {
> 	const struct smc_sock *smc;
> 
>-	smc = smc_clcsock_user_data(sk);
>-
>-	if (!smc)
>-		return true;
>+	smc = smc_sk_from_clcsk(sk);

I don't see any users of smc in this function. Seems it is redundant here.

> 
> 	if (workqueue_congested(WORK_CPU_UNBOUND, smc_hs_wq))
> 		return true;
>@@ -250,7 +245,6 @@ static void smc_fback_restore_callbacks(struct smc_sock *smc)
> 	struct sock *clcsk = smc->clcsock->sk;
> 
> 	write_lock_bh(&clcsk->sk_callback_lock);
>-	clcsk->sk_user_data = NULL;
> 
> 	smc_clcsock_restore_cb(&clcsk->sk_state_change, &smc->clcsk_state_change);
> 	smc_clcsock_restore_cb(&clcsk->sk_data_ready, &smc->clcsk_data_ready);
>@@ -832,11 +826,10 @@ static void smc_fback_forward_wakeup(struct smc_sock *smc, struct sock *clcsk,
> 
> static void smc_fback_state_change(struct sock *clcsk)
> {
>-	struct smc_sock *smc;
>+	struct smc_sock *smc = smc_sk_from_clcsk(clcsk);
> 
> 	read_lock_bh(&clcsk->sk_callback_lock);
>-	smc = smc_clcsock_user_data(clcsk);
>-	if (smc)
>+	if (smc->clcsk_state_change)
> 		smc_fback_forward_wakeup(smc, clcsk,
> 					 smc->clcsk_state_change);

Since we checked the clcsock_callback everywhere, why not put it into
smc_fback_forward_wakeup() ?


Best regards,
Dust


      reply	other threads:[~2025-03-17  3:57 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-12 13:30 [RFC PATCH net-next] net/smc: avoid conflict with sockmap after fallback D. Wythe
2025-03-17  3:57 ` Dust Li [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250317035718.GD56800@linux.alibaba.com \
    --to=dust.li@linux.alibaba.com \
    --cc=alibuda@linux.alibaba.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=guwen@linux.alibaba.com \
    --cc=jaka@linux.ibm.com \
    --cc=kgraul@linux.ibm.com \
    --cc=kuba@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=tonylu@linux.alibaba.com \
    --cc=wenjia@linux.ibm.com \
    --cc=wintera@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.