From: Gang Yan <gang.yan@linux.dev>
To: mptcp@lists.linux.dev
Cc: pabeni@redhat.com, geliang@kernel.org, Gang Yan <yangang@kylinos.cn>
Subject: [PATCH mptcp-next] mptcp: preserve MSG_EOR semantics in sendmsg path
Date: Mon, 9 Mar 2026 10:54:31 +0800
Message-ID: <20260309025431.125943-1-gang.yan@linux.dev>
From: Gang Yan <yangang@kylinos.cn>
Extend MPTCP's sendmsg handling to recognize and honor the MSG_EOR flag,
which marks the end of an application-level record.

mptcp_sendmsg() currently masks MSG_EOR out of msg->msg_flags, so the
flag is silently ignored. Keep it in the accepted set, mark the last
pending mptcp_data_frag with a new 'eor' bit, and set
TCP_SKB_CB(skb)->eor on the skb carrying the final chunk of that dfrag.
This prevents the record's last skb from being coalesced with
subsequent data, so the intent of applications using MSG_EOR is
preserved across MPTCP subflows.
Signed-off-by: Gang Yan <yangang@kylinos.cn>
---
Notes:
- This patch incorporates feedback and suggestions from Paolo Abeni
  and Geliang Tang: the mptcp_data_frag layout is kept compact by
  narrowing 'overhead' to u8 and storing 'eor' in a bitfield, so the
  struct does not grow, and a BUILD_BUG_ON verifies at compile time
  that 'overhead' still fits in u8.
- Packetdrill test cases validating this feature are available at:
https://github.com/multipath-tcp/packetdrill/pull/189/changes/d6ce92a4786704fe749bbd848ced0c047632282e
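
For readers unfamiliar with the flag, here is a minimal userspace sketch of
how an application might send one record in two chunks and tag only the last
chunk with MSG_EOR. It is not part of this patch or of the packetdrill tests.
The helper names open_stream_socket() and send_record() are hypothetical, and
the fallback to plain TCP is an assumption made so the sketch also runs on
kernels without MPTCP.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* value from the Linux UAPI headers */
#endif

/* Hypothetical helper: prefer an MPTCP socket, fall back to plain TCP
 * when the running kernel does not provide MPTCP.
 */
static int open_stream_socket(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0)
		fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	return fd;
}

/* Hypothetical helper: send one record as header + body.
 * MSG_MORE hints that more data follows the header; MSG_EOR on the
 * final chunk marks the end of the record.
 * Returns the total number of bytes queued, or -1 on error.
 */
static ssize_t send_record(int fd, const void *hdr, size_t hdr_len,
			   const void *body, size_t body_len)
{
	ssize_t sent_hdr, sent_body;

	sent_hdr = send(fd, hdr, hdr_len, MSG_MORE | MSG_NOSIGNAL);
	if (sent_hdr < 0)
		return -1;

	sent_body = send(fd, body, body_len, MSG_EOR | MSG_NOSIGNAL);
	if (sent_body < 0)
		return -1;

	return sent_hdr + sent_body;
}
```

On a stream socket the receiver still sees a plain byte stream; the flag only
influences how the sender groups data into skbs.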
net/mptcp/protocol.c | 24 ++++++++++++++++++++++--
net/mptcp/protocol.h | 4 +++-
2 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 17e43aff4459..3e574c87301b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1174,6 +1174,7 @@ mptcp_carve_data_frag(const struct mptcp_sock *msk, struct page_frag *pfrag,
 	dfrag->offset = offset + sizeof(struct mptcp_data_frag);
 	dfrag->already_sent = 0;
 	dfrag->page = pfrag->page;
+	dfrag->eor = 0;
 	return dfrag;
 }
@@ -1435,6 +1436,13 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		mptcp_update_infinite_map(msk, ssk, mpext);
 	trace_mptcp_sendmsg_frag(mpext);
 	mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
+
+	/* If this is the last chunk of a dfrag with MSG_EOR set,
+	 * mark the skb to prevent coalescing with subsequent data.
+	 */
+	if (dfrag->eor && info->sent + copy >= dfrag->data_len)
+		TCP_SKB_CB(skb)->eor = 1;
+
 	return copy;
 }
@@ -1895,7 +1903,8 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	long timeo;
 	/* silently ignore everything else */
-	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_FASTOPEN;
+	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+			  MSG_FASTOPEN | MSG_EOR;
 	lock_sock(sk);
@@ -2002,8 +2011,16 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			goto do_error;
 	}
-	if (copied)
+	if (copied) {
+		/* Mark the last dfrag with EOR if MSG_EOR was set */
+		if (msg->msg_flags & MSG_EOR) {
+			struct mptcp_data_frag *dfrag = mptcp_pending_tail(sk);
+
+			if (dfrag)
+				dfrag->eor = 1;
+		}
 		__mptcp_push_pending(sk, msg->msg_flags);
+	}
 out:
 	release_sock(sk);
@@ -4621,6 +4638,9 @@ void __init mptcp_proto_init(void)
 	inet_register_protosw(&mptcp_protosw);
 	BUILD_BUG_ON(sizeof(struct mptcp_skb_cb) > sizeof_field(struct sk_buff, cb));
+	/* Compile-time check: ensure 'overhead' (alignment + struct size) fits in u8 */
+	BUILD_BUG_ON(ALIGN(1, sizeof(long)) + sizeof(struct mptcp_data_frag) > U8_MAX);
+
 }
#if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index f5d4d7d030f2..db96f2945cbd 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -264,7 +264,9 @@ struct mptcp_data_frag {
 	u64 data_seq;
 	u16 data_len;
 	u16 offset;
-	u16 overhead;
+	u8 overhead;
+	u8 eor:1,
+	   __unused:7;
 	u16 already_sent;
 	struct page *page;
 };
--
2.43.0
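
The layout argument behind the protocol.h change can be checked outside the
kernel with a small userspace sketch. The two structs below are stand-ins
that mirror only the fields visible in the hunk above (the real
mptcp_data_frag has additional leading members), and ALIGN_UP() and
overhead_fits_u8() are hypothetical names. The sketch assumes a compiler that
accepts uint8_t bitfields, as GCC and Clang do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct page;	/* opaque stand-in for the kernel's struct page */

/* Stand-in for the fields shown in the hunk, before the patch. */
struct frag_before {
	uint64_t data_seq;
	uint16_t data_len;
	uint16_t offset;
	uint16_t overhead;
	uint16_t already_sent;
	struct page *page;
};

/* Same fields after the patch: the u16 is split into u8 + u8 bitfield. */
struct frag_after {
	uint64_t data_seq;
	uint16_t data_len;
	uint16_t offset;
	uint8_t overhead;
	uint8_t eor:1,
		unused:7;
	uint16_t already_sent;
	struct page *page;
};

/* The split must neither grow the struct nor move the following field. */
static_assert(sizeof(struct frag_after) == sizeof(struct frag_before),
	      "adding eor must not grow the struct");
static_assert(offsetof(struct frag_after, already_sent) ==
	      offsetof(struct frag_before, already_sent),
	      "already_sent must stay at the same offset");

/* Userspace equivalent of the kernel's ALIGN() for power-of-two alignments. */
#define ALIGN_UP(x, a)	(((x) + (a) - 1) & ~((size_t)(a) - 1))

/* Mirrors the BUILD_BUG_ON condition for the stand-in struct:
 * alignment padding bound plus struct size must fit in a u8.
 */
static int overhead_fits_u8(void)
{
	return ALIGN_UP((size_t)1, sizeof(long)) +
	       sizeof(struct frag_after) <= UINT8_MAX;
}
```

Because the two u8 members occupy exactly the two bytes of the old u16, the
size and the offsets of the following fields are unchanged in this stand-in.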