public inbox for linux-s390@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/1] net/smc: An issue of smc sending messages
@ 2024-12-26 12:22 liqiang
  2024-12-26 12:22 ` [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space liqiang
  0 siblings, 1 reply; 4+ messages in thread
From: liqiang @ 2024-12-26 12:22 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen
  Cc: linux-s390, netdev, linux-kernel, luanjianhai, zhangxuzhou4,
	dengguangxing, gaochao24, liqiang64

This problem can be reproduced using netperf and netserver.

On server:
smc_run netserver

On client:
smc_run netperf -H 127.0.0.1 -l 1
[]# smc_run netperf -H 127.0.0.1 -l 1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) ...
Recv   Send    Send
Socket ...     Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

131072 131072 131072    0.00     1723.79

This is because netperf obtains the smc sndbuf size as 131072 
through getsockopt. It calls sendmsg to send 131072 bytes of 
data at a time, but smc actually returns after only sending 
65504 bytes (65536 - sizeof(cdc head)), and then netperf 
terminates the test.

In smc_tx_sendmsg, the processing after the sending ring buf 
is full is to return directly, so the above phenomenon is caused.

I would like to ask if here after the sending ring buf is full, 
enter smc_tx_wait and wait for the peer to read before continuing 
to send.

When I delete the judgment of send_done in front of smc_tx_wait, 
it seems to work normally:

[]# smc_run netperf -H 127.0.0.1 -l 1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) ...
Recv   Send    Send
Socket ...     Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

131072 131072 131072    1.00     11543.80

liqiang (1):
  Enter smc_tx_wait when the tx length exceeds the available space

 net/smc/smc_tx.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space
  2024-12-26 12:22 [PATCH net-next 0/1] net/smc: An issue of smc sending messages liqiang
@ 2024-12-26 12:22 ` liqiang
  2024-12-26 13:20   ` Wen Gu
  0 siblings, 1 reply; 4+ messages in thread
From: liqiang @ 2024-12-26 12:22 UTC (permalink / raw)
  To: wenjia, jaka, alibuda, tonylu, guwen
  Cc: linux-s390, netdev, linux-kernel, luanjianhai, zhangxuzhou4,
	dengguangxing, gaochao24, liqiang64

The variable send_done records the number of bytes that have been 
successfully sent in the context of the code. It is more reasonable 
to rename it to sent_bytes here.

Another modification point is that if the ring buf is full after 
sendmsg has sent part of the data, the current code will return 
directly without entering smc_tx_wait, so the judgment of send_done 
in front of smc_tx_wait is removed.

Signed-off-by: liqiang <liqiang64@huawei.com>
---
 net/smc/smc_tx.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 214ac3cbcf9a..6ecabc10793c 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -180,7 +180,7 @@ static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg)
  */
 int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 {
-	size_t copylen, send_done = 0, send_remaining = len;
+	size_t copylen, sent_bytes = 0, send_remaining = len;
 	size_t chunk_len, chunk_off, chunk_len_sum;
 	struct smc_connection *conn = &smc->conn;
 	union smc_host_cursor prep;
@@ -216,14 +216,12 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 		    conn->killed)
 			return -EPIPE;
 		if (smc_cdc_rxed_any_close(conn))
-			return send_done ?: -ECONNRESET;
+			return sent_bytes ?: -ECONNRESET;
 
 		if (msg->msg_flags & MSG_OOB)
 			conn->local_tx_ctrl.prod_flags.urg_data_pending = 1;
 
 		if (!atomic_read(&conn->sndbuf_space) || conn->urg_tx_pend) {
-			if (send_done)
-				return send_done;
 			rc = smc_tx_wait(smc, msg->msg_flags);
 			if (rc)
 				goto out_err;
@@ -250,11 +248,11 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 					     msg, chunk_len);
 			if (rc) {
 				smc_sndbuf_sync_sg_for_device(conn);
-				if (send_done)
-					return send_done;
+				if (sent_bytes)
+					return sent_bytes;
 				goto out_err;
 			}
-			send_done += chunk_len;
+			sent_bytes += chunk_len;
 			send_remaining -= chunk_len;
 
 			if (chunk_len_sum == copylen)
@@ -287,7 +285,7 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
 		trace_smc_tx_sendmsg(smc, copylen);
 	} /* while (msg_data_left(msg)) */
 
-	return send_done;
+	return sent_bytes;
 
 out_err:
 	rc = sk_stream_error(sk, msg->msg_flags, rc);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space
  2024-12-26 12:22 ` [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space liqiang
@ 2024-12-26 13:20   ` Wen Gu
  2024-12-27  7:53     ` Li Qiang
  0 siblings, 1 reply; 4+ messages in thread
From: Wen Gu @ 2024-12-26 13:20 UTC (permalink / raw)
  To: liqiang, wenjia, jaka, alibuda, tonylu
  Cc: linux-s390, netdev, linux-kernel, luanjianhai, zhangxuzhou4,
	dengguangxing, gaochao24



On 2024/12/26 20:22, liqiang wrote:
> The variable send_done records the number of bytes that have been
> successfully sent in the context of the code. It is more reasonable
> to rename it to sent_bytes here.
> 
> Another modification point is that if the ring buf is full after
> sendmsg has sent part of the data, the current code will return
> directly without entering smc_tx_wait, so the judgment of send_done
> in front of smc_tx_wait is removed.
> 
> Signed-off-by: liqiang <liqiang64@huawei.com>
> ---

Hi liqiang,

I think this discussion thread[1] can help you understand why this is the case.
The current design is to avoid the stalled connection problem.

Some other small points: issues should be posted to 'net' tree instead of 'net-next'
tree[2], and currently net-next is closed[3].

[1] https://lore.kernel.org/netdev/20211027085208.16048-2-tonylu@linux.alibaba.com/
[2] https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
[3] https://patchwork.hopto.org/net-next.html

Regards.

>   net/smc/smc_tx.c | 14 ++++++--------
>   1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
> index 214ac3cbcf9a..6ecabc10793c 100644
> --- a/net/smc/smc_tx.c
> +++ b/net/smc/smc_tx.c
> @@ -180,7 +180,7 @@ static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg)
>    */
>   int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
>   {
> -	size_t copylen, send_done = 0, send_remaining = len;
> +	size_t copylen, sent_bytes = 0, send_remaining = len;
>   	size_t chunk_len, chunk_off, chunk_len_sum;
>   	struct smc_connection *conn = &smc->conn;
>   	union smc_host_cursor prep;
> @@ -216,14 +216,12 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
>   		    conn->killed)
>   			return -EPIPE;
>   		if (smc_cdc_rxed_any_close(conn))
> -			return send_done ?: -ECONNRESET;
> +			return sent_bytes ?: -ECONNRESET;
>   
>   		if (msg->msg_flags & MSG_OOB)
>   			conn->local_tx_ctrl.prod_flags.urg_data_pending = 1;
>   
>   		if (!atomic_read(&conn->sndbuf_space) || conn->urg_tx_pend) {
> -			if (send_done)
> -				return send_done;
>   			rc = smc_tx_wait(smc, msg->msg_flags);
>   			if (rc)
>   				goto out_err;
> @@ -250,11 +248,11 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
>   					     msg, chunk_len);
>   			if (rc) {
>   				smc_sndbuf_sync_sg_for_device(conn);
> -				if (send_done)
> -					return send_done;
> +				if (sent_bytes)
> +					return sent_bytes;
>   				goto out_err;
>   			}
> -			send_done += chunk_len;
> +			sent_bytes += chunk_len;
>   			send_remaining -= chunk_len;
>   
>   			if (chunk_len_sum == copylen)
> @@ -287,7 +285,7 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
>   		trace_smc_tx_sendmsg(smc, copylen);
>   	} /* while (msg_data_left(msg)) */
>   
> -	return send_done;
> +	return sent_bytes;
>   
>   out_err:
>   	rc = sk_stream_error(sk, msg->msg_flags, rc);

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space
  2024-12-26 13:20   ` Wen Gu
@ 2024-12-27  7:53     ` Li Qiang
  0 siblings, 0 replies; 4+ messages in thread
From: Li Qiang @ 2024-12-27  7:53 UTC (permalink / raw)
  To: Wen Gu, wenjia, jaka, alibuda, tonylu
  Cc: linux-s390, netdev, linux-kernel, luanjianhai, zhangxuzhou4,
	dengguangxing, gaochao24



在 2024/12/26 21:20, Wen Gu 写道:
> 
> 
> On 2024/12/26 20:22, liqiang wrote:
>> The variable send_done records the number of bytes that have been
>> successfully sent in the context of the code. It is more reasonable
>> to rename it to sent_bytes here.
>>
>> Another modification point is that if the ring buf is full after
>> sendmsg has sent part of the data, the current code will return
>> directly without entering smc_tx_wait, so the judgment of send_done
>> in front of smc_tx_wait is removed.
>>
>> Signed-off-by: liqiang <liqiang64@huawei.com>
>> ---
> 
> Hi liqiang,
> 
> I think this discussion thread[1] can help you understand why this is the case.
> The current design is to avoid the stalled connection problem.

Yes, I read that discussion and the problem does exist. So we should correctly handle
fewer bytes sent than expected in user space (like netperf).

However, according to my verification, in the TCP network or loopback (without smc),
after increasing the memory sent by the client at one time to a large enough size,
a connection deadlock seems to occur. SMC processing will not be stuck due to the
expansion of the sending memory.But when the socket is blocking and sends messages,
its behavior is different from TCP socket.

> 
> Some other small points: issues should be posted to 'net' tree instead of 'net-next'
> tree[2], and currently net-next is closed[3].

Thank you for pointing out the problem, I learned from it.

> 
> [1] https://lore.kernel.org/netdev/20211027085208.16048-2-tonylu@linux.alibaba.com/
> [2] https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> [3] https://patchwork.hopto.org/net-next.html
> 
> Regards.
> 


-- 
Cheers,
Li Qiang


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-12-27  7:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-26 12:22 [PATCH net-next 0/1] net/smc: An issue of smc sending messages liqiang
2024-12-26 12:22 ` [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space liqiang
2024-12-26 13:20   ` Wen Gu
2024-12-27  7:53     ` Li Qiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox