* [PATCH net-next 0/1] net/smc: An issue of smc sending messages
@ 2024-12-26 12:22 liqiang
2024-12-26 12:22 ` [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space liqiang
0 siblings, 1 reply; 4+ messages in thread
From: liqiang @ 2024-12-26 12:22 UTC (permalink / raw)
To: wenjia, jaka, alibuda, tonylu, guwen
Cc: linux-s390, netdev, linux-kernel, luanjianhai, zhangxuzhou4,
dengguangxing, gaochao24, liqiang64
This problem can be reproduced using netperf and netserver.
On server:
smc_run netserver
On client:
smc_run netperf -H 127.0.0.1 -l 1
[]# smc_run netperf -H 127.0.0.1 -l 1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) ...
Recv Send Send
Socket ... Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 131072 131072 0.00 1723.79
This is because netperf obtains the smc sndbuf size as 131072
through getsockopt. It calls sendmsg to send 131072 bytes of
data at a time, but smc actually returns after only sending
65504 bytes (65536 - sizeof(cdc head)), and then netperf
terminates the test.
In smc_tx_sendmsg, the processing after the sending ring buf
is full is to return directly, so the above phenomenon is caused.
I would like to ask if here after the sending ring buf is full,
enter smc_tx_wait and wait for the peer to read before continuing
to send.
When I delete the judgment of send_done in front of smc_tx_wait,
it seems to work normally:
[]# smc_run netperf -H 127.0.0.1 -l 1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) ...
Recv Send Send
Socket ... Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 131072 131072 1.00 11543.80
liqiang (1):
Enter smc_tx_wait when the tx length exceeds the available space
net/smc/smc_tx.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space
2024-12-26 12:22 [PATCH net-next 0/1] net/smc: An issue of smc sending messages liqiang
@ 2024-12-26 12:22 ` liqiang
2024-12-26 13:20 ` Wen Gu
0 siblings, 1 reply; 4+ messages in thread
From: liqiang @ 2024-12-26 12:22 UTC (permalink / raw)
To: wenjia, jaka, alibuda, tonylu, guwen
Cc: linux-s390, netdev, linux-kernel, luanjianhai, zhangxuzhou4,
dengguangxing, gaochao24, liqiang64
The variable send_done records the number of bytes that have been
successfully sent in the context of the code. It is more reasonable
to rename it to sent_bytes here.
Another modification point is that if the ring buf is full after
sendmsg has sent part of the data, the current code will return
directly without entering smc_tx_wait, so the judgment of send_done
in front of smc_tx_wait is removed.
Signed-off-by: liqiang <liqiang64@huawei.com>
---
net/smc/smc_tx.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 214ac3cbcf9a..6ecabc10793c 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -180,7 +180,7 @@ static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg)
*/
int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
{
- size_t copylen, send_done = 0, send_remaining = len;
+ size_t copylen, sent_bytes = 0, send_remaining = len;
size_t chunk_len, chunk_off, chunk_len_sum;
struct smc_connection *conn = &smc->conn;
union smc_host_cursor prep;
@@ -216,14 +216,12 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
conn->killed)
return -EPIPE;
if (smc_cdc_rxed_any_close(conn))
- return send_done ?: -ECONNRESET;
+ return sent_bytes ?: -ECONNRESET;
if (msg->msg_flags & MSG_OOB)
conn->local_tx_ctrl.prod_flags.urg_data_pending = 1;
if (!atomic_read(&conn->sndbuf_space) || conn->urg_tx_pend) {
- if (send_done)
- return send_done;
rc = smc_tx_wait(smc, msg->msg_flags);
if (rc)
goto out_err;
@@ -250,11 +248,11 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
msg, chunk_len);
if (rc) {
smc_sndbuf_sync_sg_for_device(conn);
- if (send_done)
- return send_done;
+ if (sent_bytes)
+ return sent_bytes;
goto out_err;
}
- send_done += chunk_len;
+ sent_bytes += chunk_len;
send_remaining -= chunk_len;
if (chunk_len_sum == copylen)
@@ -287,7 +285,7 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
trace_smc_tx_sendmsg(smc, copylen);
} /* while (msg_data_left(msg)) */
- return send_done;
+ return sent_bytes;
out_err:
rc = sk_stream_error(sk, msg->msg_flags, rc);
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space
2024-12-26 12:22 ` [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space liqiang
@ 2024-12-26 13:20 ` Wen Gu
2024-12-27 7:53 ` Li Qiang
0 siblings, 1 reply; 4+ messages in thread
From: Wen Gu @ 2024-12-26 13:20 UTC (permalink / raw)
To: liqiang, wenjia, jaka, alibuda, tonylu
Cc: linux-s390, netdev, linux-kernel, luanjianhai, zhangxuzhou4,
dengguangxing, gaochao24
On 2024/12/26 20:22, liqiang wrote:
> The variable send_done records the number of bytes that have been
> successfully sent in the context of the code. It is more reasonable
> to rename it to sent_bytes here.
>
> Another modification point is that if the ring buf is full after
> sendmsg has sent part of the data, the current code will return
> directly without entering smc_tx_wait, so the judgment of send_done
> in front of smc_tx_wait is removed.
>
> Signed-off-by: liqiang <liqiang64@huawei.com>
> ---
Hi liqiang,
I think this discussion thread[1] can help you understand why this is the case.
The current design is to avoid the stalled connection problem.
Some other small points: issues should be posted to 'net' tree instead of 'net-next'
tree[2], and currently net-next is closed[3].
[1] https://lore.kernel.org/netdev/20211027085208.16048-2-tonylu@linux.alibaba.com/
[2] https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
[3] https://patchwork.hopto.org/net-next.html
Regards.
> net/smc/smc_tx.c | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
> index 214ac3cbcf9a..6ecabc10793c 100644
> --- a/net/smc/smc_tx.c
> +++ b/net/smc/smc_tx.c
> @@ -180,7 +180,7 @@ static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg)
> */
> int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
> {
> - size_t copylen, send_done = 0, send_remaining = len;
> + size_t copylen, sent_bytes = 0, send_remaining = len;
> size_t chunk_len, chunk_off, chunk_len_sum;
> struct smc_connection *conn = &smc->conn;
> union smc_host_cursor prep;
> @@ -216,14 +216,12 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
> conn->killed)
> return -EPIPE;
> if (smc_cdc_rxed_any_close(conn))
> - return send_done ?: -ECONNRESET;
> + return sent_bytes ?: -ECONNRESET;
>
> if (msg->msg_flags & MSG_OOB)
> conn->local_tx_ctrl.prod_flags.urg_data_pending = 1;
>
> if (!atomic_read(&conn->sndbuf_space) || conn->urg_tx_pend) {
> - if (send_done)
> - return send_done;
> rc = smc_tx_wait(smc, msg->msg_flags);
> if (rc)
> goto out_err;
> @@ -250,11 +248,11 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
> msg, chunk_len);
> if (rc) {
> smc_sndbuf_sync_sg_for_device(conn);
> - if (send_done)
> - return send_done;
> + if (sent_bytes)
> + return sent_bytes;
> goto out_err;
> }
> - send_done += chunk_len;
> + sent_bytes += chunk_len;
> send_remaining -= chunk_len;
>
> if (chunk_len_sum == copylen)
> @@ -287,7 +285,7 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
> trace_smc_tx_sendmsg(smc, copylen);
> } /* while (msg_data_left(msg)) */
>
> - return send_done;
> + return sent_bytes;
>
> out_err:
> rc = sk_stream_error(sk, msg->msg_flags, rc);
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space
2024-12-26 13:20 ` Wen Gu
@ 2024-12-27 7:53 ` Li Qiang
0 siblings, 0 replies; 4+ messages in thread
From: Li Qiang @ 2024-12-27 7:53 UTC (permalink / raw)
To: Wen Gu, wenjia, jaka, alibuda, tonylu
Cc: linux-s390, netdev, linux-kernel, luanjianhai, zhangxuzhou4,
dengguangxing, gaochao24
在 2024/12/26 21:20, Wen Gu 写道:
>
>
> On 2024/12/26 20:22, liqiang wrote:
>> The variable send_done records the number of bytes that have been
>> successfully sent in the context of the code. It is more reasonable
>> to rename it to sent_bytes here.
>>
>> Another modification point is that if the ring buf is full after
>> sendmsg has sent part of the data, the current code will return
>> directly without entering smc_tx_wait, so the judgment of send_done
>> in front of smc_tx_wait is removed.
>>
>> Signed-off-by: liqiang <liqiang64@huawei.com>
>> ---
>
> Hi liqiang,
>
> I think this discussion thread[1] can help you understand why this is the case.
> The current design is to avoid the stalled connection problem.
Yes, I read that discussion and the problem does exist. So we should correctly handle
fewer bytes sent than expected in user space (like netperf).
However, according to my verification, in the TCP network or loopback (without smc),
after increasing the memory sent by the client at one time to a large enough size,
a connection deadlock seems to occur. SMC processing will not be stuck due to the
expansion of the sending memory.But when the socket is blocking and sends messages,
its behavior is different from TCP socket.
>
> Some other small points: issues should be posted to 'net' tree instead of 'net-next'
> tree[2], and currently net-next is closed[3].
Thank you for pointing out the problem, I learned from it.
>
> [1] https://lore.kernel.org/netdev/20211027085208.16048-2-tonylu@linux.alibaba.com/
> [2] https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> [3] https://patchwork.hopto.org/net-next.html
>
> Regards.
>
--
Cheers,
Li Qiang
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-12-27 7:54 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-26 12:22 [PATCH net-next 0/1] net/smc: An issue of smc sending messages liqiang
2024-12-26 12:22 ` [PATCH net-next 1/1] Enter smc_tx_wait when the tx length exceeds the available space liqiang
2024-12-26 13:20 ` Wen Gu
2024-12-27 7:53 ` Li Qiang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox