public inbox for mptcp@lists.linux.dev
* [PATCH mptcp-next v2 0/2] mptcp: preserve MSG_EOR semantics in sendmsg path
@ 2026-03-31  9:08 Gang Yan
  2026-03-31  9:08 ` [PATCH mptcp-next v2 1/2] mptcp: reduce 'overhead' from u16 to u8 Gang Yan
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Gang Yan @ 2026-03-31  9:08 UTC (permalink / raw)
  To: mptcp; +Cc: Gang Yan

From: Gang Yan <yangang@kylinos.cn>

Hi, Matt:

With this v2, the packetdrill scripts have also been updated at
https://github.com/multipath-tcp/packetdrill/pull/189.

I removed 'mptcp_eor_subflows.pkt' and made some improvements to
'mptcp_eor_no_collapse.pkt'.

Thanks,
Gang

---
Changelog:
v2:
  - Split the change into two independent patches:
    Patch 1 changes overhead to u8 (with BUILD_BUG_ON).
    Patch 2 implements the actual MSG_EOR handling.
    This split makes the series easier to review.
  - Added a !df->eor check in mptcp_frag_can_collapse_to() to prevent
    appending new data after a dfrag with the EOR flag set.
  - Improved the comment on the new BUILD_BUG_ON check.

v1:
  - Link: https://patchwork.kernel.org/project/mptcp/patch/20260309025431.125943-1-gang.yan@linux.dev/

Gang Yan (2):
  mptcp: reduce 'overhead' from u16 to u8
  mptcp: preserve MSG_EOR semantics in sendmsg path

 net/mptcp/protocol.c | 28 +++++++++++++++++++++++++---
 net/mptcp/protocol.h |  4 +++-
 2 files changed, 28 insertions(+), 4 deletions(-)

-- 
2.43.0



* [PATCH mptcp-next v2 1/2] mptcp: reduce 'overhead' from u16 to u8
  2026-03-31  9:08 [PATCH mptcp-next v2 0/2] mptcp: preserve MSG_EOR semantics in sendmsg path Gang Yan
@ 2026-03-31  9:08 ` Gang Yan
  2026-04-01 17:50   ` Matthieu Baerts
  2026-03-31  9:08 ` [PATCH mptcp-next v2 2/2] mptcp: preserve MSG_EOR semantics in sendmsg path Gang Yan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Gang Yan @ 2026-03-31  9:08 UTC (permalink / raw)
  To: mptcp; +Cc: Gang Yan

From: Gang Yan <yangang@kylinos.cn>

The 'overhead' field in struct mptcp_data_frag can safely be a u8, as it
represents 'alignment + sizeof(struct mptcp_data_frag)'. With a maximum
alignment of 7 ('ALIGN(1, sizeof(long)) - 1'), the overhead is at most
47, well below U8_MAX. This bound is validated with BUILD_BUG_ON().

This patch also adds a field named '__unused' for future extensions.
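
The bound above can be checked outside the kernel with a small userspace
sketch. Everything here is an illustrative stand-in, not kernel code:
ALIGN_UP, list_head_model, data_frag_model and max_overhead are
hypothetical names, and the struct layout assumes that the fields not
shown in the diff amount to a leading two-pointer list head.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((size_t)(x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Assumed stand-in for struct list_head (two pointers). */
struct list_head_model {
	void *next, *prev;
};

/* Assumed mirror of struct mptcp_data_frag after this patch. */
struct data_frag_model {
	struct list_head_model list;
	uint64_t data_seq;
	uint16_t data_len;
	uint16_t offset;
	uint8_t overhead;
	uint8_t unused;
	uint16_t already_sent;
	void *page;
};

/* Compile-time check mirroring the BUILD_BUG_ON() added by the patch. */
static_assert(ALIGN_UP(1, sizeof(long)) - 1 + sizeof(struct data_frag_model)
	      <= UINT8_MAX, "overhead must fit in a u8");

/* Worst-case overhead: maximum alignment padding plus the header itself. */
size_t max_overhead(void)
{
	return ALIGN_UP(1, sizeof(long)) - 1 + sizeof(struct data_frag_model);
}
```

On a typical 64-bit build the model struct is 40 bytes, which gives the
40 + 7 = 47 figure quoted above.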

Signed-off-by: Gang Yan <yangang@kylinos.cn>
---
 net/mptcp/protocol.c | 4 ++++
 net/mptcp/protocol.h | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 46fedfa05a54..01690a84ea6d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -4624,6 +4624,10 @@ void __init mptcp_proto_init(void)
 	inet_register_protosw(&mptcp_protosw);
 
 	BUILD_BUG_ON(sizeof(struct mptcp_skb_cb) > sizeof_field(struct sk_buff, cb));
+	/* ensure 'overhead' (alignment + sizeof(struct mptcp_data_frag)) fits in u8.
+	 * 'ALIGN(1, sizeof(long)) - 1' represents the maximum of alignment.
+	 */
+	BUILD_BUG_ON(ALIGN(1, sizeof(long)) - 1 + sizeof(struct mptcp_data_frag) > U8_MAX);
 }
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index f5d4d7d030f2..afdead91a4d7 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -264,7 +264,8 @@ struct mptcp_data_frag {
 	u64 data_seq;
 	u16 data_len;
 	u16 offset;
-	u16 overhead;
+	u8 overhead;
+	u8 __unused;
 	u16 already_sent;
 	struct page *page;
 };
-- 
2.43.0



* [PATCH mptcp-next v2 2/2] mptcp: preserve MSG_EOR semantics in sendmsg path
  2026-03-31  9:08 [PATCH mptcp-next v2 0/2] mptcp: preserve MSG_EOR semantics in sendmsg path Gang Yan
  2026-03-31  9:08 ` [PATCH mptcp-next v2 1/2] mptcp: reduce 'overhead' from u16 to u8 Gang Yan
@ 2026-03-31  9:08 ` Gang Yan
  2026-04-01 17:50   ` Matthieu Baerts
  2026-03-31 10:37 ` [PATCH mptcp-next v2 0/2] " MPTCP CI
  2026-04-02 12:19 ` Matthieu Baerts
  3 siblings, 1 reply; 7+ messages in thread
From: Gang Yan @ 2026-03-31  9:08 UTC (permalink / raw)
  To: mptcp; +Cc: Gang Yan

From: Gang Yan <yangang@kylinos.cn>

Extend MPTCP's sendmsg handling to recognize and honor the MSG_EOR flag,
which marks the end of a record for application-level message boundaries.

Data fragments tagged with MSG_EOR are explicitly marked in the
mptcp_data_frag structure and skb context to prevent unintended
coalescing with subsequent data chunks. This ensures the intent of
applications using MSG_EOR is preserved across MPTCP subflows,
maintaining consistent message segmentation behavior.
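
As a rough illustration of the coalescing rule, the following userspace
model reproduces the condition used by mptcp_frag_can_collapse_to() in
this patch. The types frag_model and pfrag_model and the helper
demo_collapse() are hypothetical simplifications; only the boolean
expression follows the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the mptcp_data_frag fields used by the check. */
struct frag_model {
	uint64_t data_seq;
	uint16_t data_len;
	uint16_t offset;
	uint8_t eor;
	const void *page;
};

/* Simplified model of struct page_frag. */
struct pfrag_model {
	const void *page;
	uint32_t offset;
	uint32_t size;
};

/* New data may extend 'df' only if 'df' does not end a record. */
bool frag_can_collapse_to(uint64_t write_seq,
			  const struct pfrag_model *pfrag,
			  const struct frag_model *df)
{
	return df && !df->eor &&
	       pfrag->page == df->page &&
	       pfrag->size - pfrag->offset > 0 &&
	       pfrag->offset == (uint32_t)(df->offset + df->data_len) &&
	       df->data_seq + df->data_len == write_seq;
}

/* Build a frag that is otherwise collapsible and vary only its EOR mark. */
bool demo_collapse(uint8_t eor)
{
	static const char page[4096];
	struct frag_model df = {
		.data_seq = 1000, .data_len = 100, .offset = 40,
		.eor = eor, .page = page,
	};
	struct pfrag_model pfrag = { .page = page, .offset = 140, .size = 4096 };

	return frag_can_collapse_to(1100, &pfrag, &df);
}
```

In this model demo_collapse(0) reports that the next write can be
appended to the existing fragment, while demo_collapse(1) forces a new
fragment, which is the behavior the '!df->eor' check is meant to give.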

Signed-off-by: Gang Yan <yangang@kylinos.cn>
---
 net/mptcp/protocol.c | 24 +++++++++++++++++++++---
 net/mptcp/protocol.h |  3 ++-
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 01690a84ea6d..dafa178f43c5 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1032,7 +1032,8 @@ static bool mptcp_frag_can_collapse_to(const struct mptcp_sock *msk,
 				       const struct page_frag *pfrag,
 				       const struct mptcp_data_frag *df)
 {
-	return df && pfrag->page == df->page &&
+	return df && !df->eor &&
+		pfrag->page == df->page &&
 		pfrag->size - pfrag->offset > 0 &&
 		pfrag->offset == (df->offset + df->data_len) &&
 		df->data_seq + df->data_len == msk->write_seq;
@@ -1174,6 +1175,7 @@ mptcp_carve_data_frag(const struct mptcp_sock *msk, struct page_frag *pfrag,
 	dfrag->offset = offset + sizeof(struct mptcp_data_frag);
 	dfrag->already_sent = 0;
 	dfrag->page = pfrag->page;
+	dfrag->eor = 0;
 
 	return dfrag;
 }
@@ -1435,6 +1437,13 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		mptcp_update_infinite_map(msk, ssk, mpext);
 	trace_mptcp_sendmsg_frag(mpext);
 	mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
+
+	/* if this is the last chunk of a dfrag with MSG_EOR set,
+	 * mark the skb to prevent coalescing with subsequent data.
+	 */
+	if (dfrag->eor && info->sent + copy >= dfrag->data_len)
+		TCP_SKB_CB(skb)->eor = 1;
+
 	return copy;
 }
 
@@ -1895,7 +1904,8 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	long timeo;
 
 	/* silently ignore everything else */
-	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_FASTOPEN;
+	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+			  MSG_FASTOPEN | MSG_EOR;
 
 	lock_sock(sk);
 
@@ -2002,8 +2012,16 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			goto do_error;
 	}
 
-	if (copied)
+	if (copied) {
+		/* mark the last dfrag with EOR if MSG_EOR was set */
+		if (msg->msg_flags & MSG_EOR) {
+			struct mptcp_data_frag *dfrag = mptcp_pending_tail(sk);
+
+			if (dfrag)
+				dfrag->eor = 1;
+		}
 		__mptcp_push_pending(sk, msg->msg_flags);
+	}
 
 out:
 	release_sock(sk);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index afdead91a4d7..db96f2945cbd 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -265,7 +265,8 @@ struct mptcp_data_frag {
 	u16 data_len;
 	u16 offset;
 	u8 overhead;
-	u8 __unused;
+	u8 eor:1,
+	   __unused:7;
 	u16 already_sent;
 	struct page *page;
 };
-- 
2.43.0
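
The check added to mptcp_sendmsg_frag() marks the skb only when the
chunk being copied reaches the end of an EOR-tagged dfrag. A minimal
userspace sketch of that predicate follows; marks_skb_eor() and its
parameter types are hypothetical, chosen only to mirror the names
'dfrag->eor', 'info->sent', 'copy' and 'dfrag->data_len' in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the skb carrying this chunk should get its EOR mark:
 * the dfrag is tagged with EOR and the bytes already sent plus the bytes
 * being copied now cover the whole dfrag.
 */
bool marks_skb_eor(bool dfrag_eor, unsigned int sent, unsigned int copy,
		   uint16_t data_len)
{
	return dfrag_eor && sent + copy >= data_len;
}
```

For a 1000-byte dfrag sent as two 500-byte chunks, only the second call
(sent = 500, copy = 500) returns true, so earlier chunks of the same
record remain free to coalesce.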



* Re: [PATCH mptcp-next v2 0/2] mptcp: preserve MSG_EOR semantics in sendmsg path
  2026-03-31  9:08 [PATCH mptcp-next v2 0/2] mptcp: preserve MSG_EOR semantics in sendmsg path Gang Yan
  2026-03-31  9:08 ` [PATCH mptcp-next v2 1/2] mptcp: reduce 'overhead' from u16 to u8 Gang Yan
  2026-03-31  9:08 ` [PATCH mptcp-next v2 2/2] mptcp: preserve MSG_EOR semantics in sendmsg path Gang Yan
@ 2026-03-31 10:37 ` MPTCP CI
  2026-04-02 12:19 ` Matthieu Baerts
  3 siblings, 0 replies; 7+ messages in thread
From: MPTCP CI @ 2026-03-31 10:37 UTC (permalink / raw)
  To: Gang Yan; +Cc: mptcp

Hi Gang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_mp_capable 🔴
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/23790407262

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/742ca3f3c16d
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1075065


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the effort already made to have a stable
test suite when executed on a public CI like this one, it is possible that
some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)


* Re: [PATCH mptcp-next v2 1/2] mptcp: reduce 'overhead' from u16 to u8
  2026-03-31  9:08 ` [PATCH mptcp-next v2 1/2] mptcp: reduce 'overhead' from u16 to u8 Gang Yan
@ 2026-04-01 17:50   ` Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2026-04-01 17:50 UTC (permalink / raw)
  To: Gang Yan; +Cc: mptcp, Gang Yan

On Tue, 31 Mar 2026 17:08:08 +0800, Gang Yan <gang.yan@linux.dev> wrote:
> The 'overhead' in struct mptcp_data_frag can safely use u8, as it
> represents 'alignment + sizeof(mptcp_data_frag)'. With a maximum
> alignment of 7('ALIGN(1, sizeof(long)) - 1'), the overhead is at most
> 47, well below U8_MAX and validated with BUILD_BUG_ON().
> 
> This patch also adds a field named 'unused' for further extensions.
> 

Thank you, this is clearer as a dedicated patch.

We can probably add Paolo's suggested-by while at it.

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Suggested-by: Paolo Abeni <pabeni@redhat.com>

-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>


* Re: [PATCH mptcp-next v2 2/2] mptcp: preserve MSG_EOR semantics in sendmsg path
  2026-03-31  9:08 ` [PATCH mptcp-next v2 2/2] mptcp: preserve MSG_EOR semantics in sendmsg path Gang Yan
@ 2026-04-01 17:50   ` Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2026-04-01 17:50 UTC (permalink / raw)
  To: Gang Yan; +Cc: mptcp, Gang Yan

On Tue, 31 Mar 2026 17:08:09 +0800, Gang Yan <gang.yan@linux.dev> wrote:
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index afdead91a4d7..db96f2945cbd 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -265,7 +265,8 @@ struct mptcp_data_frag {
>  	u16 data_len;
>  	u16 offset;
>  	u8 overhead;
> -	u8 __unused;
> +	u8 eor:1,
> +	   __unused:7;

Here, we could also use the whole u8 rather than only one bit, which is
a bit more costly to read/write. That would also avoid any KMSAN
warnings (if any).

I can also do this modification when applying the patch if that's OK:

  u8 eor;	/* currently using 1 bit */

WDYT?
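
For reference, a small userspace sketch of the two layouts being
compared. The struct and function names are hypothetical, and the size
equality is what GCC and Clang produce on common targets rather than
something the C standard guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout from the patch as posted: one flag bit plus seven spare bits. */
struct flags_bitfield {
	uint8_t overhead;
	uint8_t eor:1,
		unused:7;
	uint16_t already_sent;
};

/* Suggested layout: the flag owns the whole byte. */
struct flags_byte {
	uint8_t overhead;
	uint8_t eor;	/* currently using 1 bit */
	uint16_t already_sent;
};

/* Both variants are expected to occupy the same space. */
static_assert(sizeof(struct flags_bitfield) == sizeof(struct flags_byte),
	      "switching to a full u8 should not grow the struct");

size_t bitfield_layout_size(void)
{
	return sizeof(struct flags_bitfield);
}

size_t byte_layout_size(void)
{
	return sizeof(struct flags_byte);
}
```

Since the byte is reserved either way, the full u8 costs no extra space,
and a store to it is a plain byte write instead of the read-modify-write
a bit-field store typically compiles to.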

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>


* Re: [PATCH mptcp-next v2 0/2] mptcp: preserve MSG_EOR semantics in sendmsg path
  2026-03-31  9:08 [PATCH mptcp-next v2 0/2] mptcp: preserve MSG_EOR semantics in sendmsg path Gang Yan
                   ` (2 preceding siblings ...)
  2026-03-31 10:37 ` [PATCH mptcp-next v2 0/2] " MPTCP CI
@ 2026-04-02 12:19 ` Matthieu Baerts
  3 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2026-04-02 12:19 UTC (permalink / raw)
  To: Gang Yan, mptcp; +Cc: Gang Yan

Hi Gang,

On 31/03/2026 11:08, Gang Yan wrote:
> From: Gang Yan <yangang@kylinos.cn>
> 
> Hi, Matt:
> 
> With this v2 version, the packetdrill scripts also updated at
> https://github.com/multipath-tcp/packetdrill/pull/189.

Thank you for this v2. Now in our tree:

New patches for t/upstream:
- 15ac3c9af026: mptcp: reduce 'overhead' from u16 to u8
- 6e9b7b0c8325: mptcp: preserve MSG_EOR semantics in sendmsg path
- Results: 987a22862c63..53f431e3d0f4 (export)

Tests are now in progress:

- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/bf3cf8c6aab683f6c5d82ed923487bc50ef7a1f5/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


