* [PATCH mptcp-next 0/2] implement mptcp read_sock
@ 2025-05-22 9:21 Geliang Tang
2025-05-22 9:21 ` [PATCH mptcp-next 1/2] mptcp: use sk_eat_skb in recvmsg_mskq Geliang Tang
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Geliang Tang @ 2025-05-22 9:21 UTC (permalink / raw)
To: mptcp; +Cc: hare, Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
I have good news! I recently added MPTCP support to "NVME over TCP".
And my RFC patches are under review by NVME maintainer Hannes.
Replacing "NVME over TCP" with MPTCP is very simple. I used IPPROTO_MPTCP
instead of IPPROTO_TCP to create MPTCP sockets on both target and host
sides, these sockets are created in Kernel space.
nvmet_tcp_add_port:
ret = sock_create(port->addr.ss_family, SOCK_STREAM,
IPPROTO_MPTCP, &port->sock);
nvme_tcp_alloc_queue:
ret = sock_create_kern(current->nsproxy->net_ns,
ctrl->addr.ss_family, SOCK_STREAM,
IPPROTO_MPTCP, &queue->sock);
nvme_tcp_try_recv() needs to call .read_sock interface of struct
proto_ops, but it is not implemented in MPTCP. So I implemented it
with reference to __mptcp_recvmsg_mskq().
Since the NVME part patches are still under reviewing, I only send the
MPTCP part patches in this set to MPTCP ML for your opinions.
Thanks,
-Geliang
Geliang Tang (2):
mptcp: use sk_eat_skb in recvmsg_mskq
mptcp: implement .read_sock
net/mptcp/protocol.c | 57 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 55 insertions(+), 2 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH mptcp-next 1/2] mptcp: use sk_eat_skb in recvmsg_mskq
2025-05-22 9:21 [PATCH mptcp-next 0/2] implement mptcp read_sock Geliang Tang
@ 2025-05-22 9:21 ` Geliang Tang
2025-05-22 9:21 ` [PATCH mptcp-next 2/2] mptcp: implement .read_sock Geliang Tang
2025-05-22 10:42 ` [PATCH mptcp-next 0/2] implement mptcp read_sock MPTCP CI
2 siblings, 0 replies; 5+ messages in thread
From: Geliang Tang @ 2025-05-22 9:21 UTC (permalink / raw)
To: mptcp; +Cc: hare, Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
This patch uses sk_eat_skb() helper in __mptcp_recvmsg_mskq() instead of
open-coding it.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
net/mptcp/protocol.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2c2618a56b64..a9094cd8bcd0 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1890,8 +1890,7 @@ static int __mptcp_recvmsg_mskq(struct sock *sk,
skb->destructor = NULL;
atomic_sub(skb->truesize, &sk->sk_rmem_alloc);
sk_mem_uncharge(sk, skb->truesize);
- __skb_unlink(skb, &sk->sk_receive_queue);
- __kfree_skb(skb);
+ sk_eat_skb(sk, skb);
msk->bytes_consumed += count;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH mptcp-next 2/2] mptcp: implement .read_sock
2025-05-22 9:21 [PATCH mptcp-next 0/2] implement mptcp read_sock Geliang Tang
2025-05-22 9:21 ` [PATCH mptcp-next 1/2] mptcp: use sk_eat_skb in recvmsg_mskq Geliang Tang
@ 2025-05-22 9:21 ` Geliang Tang
2025-05-23 15:04 ` Matthieu Baerts
2025-05-22 10:42 ` [PATCH mptcp-next 0/2] implement mptcp read_sock MPTCP CI
2 siblings, 1 reply; 5+ messages in thread
From: Geliang Tang @ 2025-05-22 9:21 UTC (permalink / raw)
To: mptcp; +Cc: hare, Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
nvme_tcp_try_recv() needs to call .read_sock interface of struct
proto_ops, but it is not implemented in MPTCP.
This patch implements it with reference to __mptcp_recvmsg_mskq().
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
net/mptcp/protocol.c | 54 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index a9094cd8bcd0..e5c69b13dbf7 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3956,6 +3956,58 @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
return mask;
}
+static int mptcp_read_sock(struct sock *sk, read_descriptor_t *desc,
+ sk_read_actor_t recv_actor)
+{
+ struct mptcp_sock *msk = mptcp_sk(sk);
+ struct scm_timestamping_internal tss;
+ struct sk_buff *skb, *tmp;
+ size_t len = INT_MAX;
+ int copied = 0;
+
+ skb_queue_walk_safe(&sk->sk_receive_queue, skb, tmp) {
+ u32 offset = MPTCP_SKB_CB(skb)->offset;
+ u32 data_len = skb->len - offset;
+ u32 count = min_t(size_t, len - copied, data_len);
+ int used;
+
+ used = recv_actor(desc, skb, offset, count);
+ if (used <= 0) {
+ if (!copied)
+ copied = used;
+ break;
+ }
+
+ if (MPTCP_SKB_CB(skb)->has_rxtstamp)
+ tcp_update_recv_tstamps(skb, &tss);
+
+ copied += used;
+
+ if (used < data_len) {
+ MPTCP_SKB_CB(skb)->offset += used;
+ MPTCP_SKB_CB(skb)->map_seq += used;
+ msk->bytes_consumed += used;
+ break;
+ }
+
+ skb->destructor = NULL;
+ atomic_sub(skb->truesize, &sk->sk_rmem_alloc);
+ sk_mem_uncharge(sk, skb->truesize);
+ sk_eat_skb(sk, skb);
+ msk->bytes_consumed += used;
+
+ if (copied >= len)
+ break;
+ }
+
+ mptcp_rcv_space_adjust(msk, copied);
+
+ if (copied > 0)
+ mptcp_cleanup_rbuf(msk, copied);
+
+ return copied;
+}
+
static const struct proto_ops mptcp_stream_ops = {
.family = PF_INET,
.owner = THIS_MODULE,
@@ -3976,6 +4028,7 @@ static const struct proto_ops mptcp_stream_ops = {
.recvmsg = inet_recvmsg,
.mmap = sock_no_mmap,
.set_rcvlowat = mptcp_set_rcvlowat,
+ .read_sock = mptcp_read_sock,
};
static struct inet_protosw mptcp_protosw = {
@@ -4080,6 +4133,7 @@ static const struct proto_ops mptcp_v6_stream_ops = {
.compat_ioctl = inet6_compat_ioctl,
#endif
.set_rcvlowat = mptcp_set_rcvlowat,
+ .read_sock = mptcp_read_sock,
};
static struct proto mptcp_v6_prot;
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH mptcp-next 0/2] implement mptcp read_sock
2025-05-22 9:21 [PATCH mptcp-next 0/2] implement mptcp read_sock Geliang Tang
2025-05-22 9:21 ` [PATCH mptcp-next 1/2] mptcp: use sk_eat_skb in recvmsg_mskq Geliang Tang
2025-05-22 9:21 ` [PATCH mptcp-next 2/2] mptcp: implement .read_sock Geliang Tang
@ 2025-05-22 10:42 ` MPTCP CI
2 siblings, 0 replies; 5+ messages in thread
From: MPTCP CI @ 2025-05-22 10:42 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp
Hi Geliang,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/15183160676
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/524da7e8bf1a
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=965336
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH mptcp-next 2/2] mptcp: implement .read_sock
2025-05-22 9:21 ` [PATCH mptcp-next 2/2] mptcp: implement .read_sock Geliang Tang
@ 2025-05-23 15:04 ` Matthieu Baerts
0 siblings, 0 replies; 5+ messages in thread
From: Matthieu Baerts @ 2025-05-23 15:04 UTC (permalink / raw)
To: Geliang Tang, mptcp; +Cc: hare, Geliang Tang
Hi Geliang,
On 22/05/2025 11:21, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> nvme_tcp_try_recv() needs to call .read_sock interface of struct
> proto_ops, but it is not implemented in MPTCP.
Thank you for looking at that!
> This patch implements it with reference to __mptcp_recvmsg_mskq().
Can you share more details about the implementation please?
I didn't check in details what this .read_sock interface is supposed to
do, but it might be good to add some explanations about that, mentioning
the differences with __mptcp_recvmsg_mskq() and why they are there, why
you cannot share code (which is certainly fine, but better with a few
words about that) etc. Plus also the differences with __tcp_read_sock()
I suppose.
Without that, I'm sorry, but it is difficult to review the code, at
least for me who doesn't know much about this interface, what it is
supposed to do nor how it is used :)
Also, would it be possible to add a new MPTCP specific selftest?
>
> Reviewed-by: Hannes Reinecke <hare@kernel.org>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> net/mptcp/protocol.c | 54 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 54 insertions(+)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index a9094cd8bcd0..e5c69b13dbf7 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -3956,6 +3956,58 @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
> return mask;
> }
>
If it is inspired by something else, please add a comment about that
here so if something change there, we might think about changing things
here as well.
> +static int mptcp_read_sock(struct sock *sk, read_descriptor_t *desc,
> + sk_read_actor_t recv_actor)
> +{
> + struct mptcp_sock *msk = mptcp_sk(sk);
> + struct scm_timestamping_internal tss;
> + struct sk_buff *skb, *tmp;
> + size_t len = INT_MAX;
> + int copied = 0;
> +
> + skb_queue_walk_safe(&sk->sk_receive_queue, skb, tmp) {
> + u32 offset = MPTCP_SKB_CB(skb)->offset;
> + u32 data_len = skb->len - offset;
> + u32 count = min_t(size_t, len - copied, data_len);
> + int used;
> +
> + used = recv_actor(desc, skb, offset, count);
(to avoid confusions with __mptcp_recvmsg_mskq(), maybe clearer to use
'count' instead of 'used' here? I mean: below, 'used' is used here,
while 'count' is used in __mptcp_recvmsg_mskq(), that's a bit confusing)
> + if (used <= 0) {
> + if (!copied)
> + copied = used;
> + break;
> + }
> +
> + if (MPTCP_SKB_CB(skb)->has_rxtstamp)
> + tcp_update_recv_tstamps(skb, &tss);
> +
> + copied += used;
> +
> + if (used < data_len) {
> + MPTCP_SKB_CB(skb)->offset += used;
> + MPTCP_SKB_CB(skb)->map_seq += used;
> + msk->bytes_consumed += used;
> + break;
> + }
> +
> + skb->destructor = NULL;
That looks strange, please keep the comment that was there before
> + atomic_sub(skb->truesize, &sk->sk_rmem_alloc);
> + sk_mem_uncharge(sk, skb->truesize);
> + sk_eat_skb(sk, skb);
> + msk->bytes_consumed += used;
> +
> + if (copied >= len)
> + break;
> + }
> +
> + mptcp_rcv_space_adjust(msk, copied);
> +
> + if (copied > 0)
> + mptcp_cleanup_rbuf(msk, copied);
> +
> + return copied;
On TCP side, they first check the sk_state, and they also look for the
end of the end of a connection. Is it not needed here?
> +}
> +
> static const struct proto_ops mptcp_stream_ops = {
> .family = PF_INET,
> .owner = THIS_MODULE,
> @@ -3976,6 +4028,7 @@ static const struct proto_ops mptcp_stream_ops = {
> .recvmsg = inet_recvmsg,
> .mmap = sock_no_mmap,
> .set_rcvlowat = mptcp_set_rcvlowat,
> + .read_sock = mptcp_read_sock,
> };
>
> static struct inet_protosw mptcp_protosw = {
> @@ -4080,6 +4133,7 @@ static const struct proto_ops mptcp_v6_stream_ops = {
> .compat_ioctl = inet6_compat_ioctl,
> #endif
> .set_rcvlowat = mptcp_set_rcvlowat,
> + .read_sock = mptcp_read_sock,
> };
>
> static struct proto mptcp_v6_prot;
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-05-23 15:04 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-22 9:21 [PATCH mptcp-next 0/2] implement mptcp read_sock Geliang Tang
2025-05-22 9:21 ` [PATCH mptcp-next 1/2] mptcp: use sk_eat_skb in recvmsg_mskq Geliang Tang
2025-05-22 9:21 ` [PATCH mptcp-next 2/2] mptcp: implement .read_sock Geliang Tang
2025-05-23 15:04 ` Matthieu Baerts
2025-05-22 10:42 ` [PATCH mptcp-next 0/2] implement mptcp read_sock MPTCP CI
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.