* [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1
@ 2025-12-05 18:55 Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 1/4] mptcp: pm: ignore unknown endpoint flags Matthieu Baerts (NGI0)
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-12-05 18:55 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
Kuniyuki Iwashima, Dmytro Shytyi
Cc: netdev, mptcp, linux-kernel, linux-kselftest,
Matthieu Baerts (NGI0), stable
Here are various unrelated fixes:
- Patches 1-2: ignore unknown in-kernel PM endpoint flags instead of
pretending they are supported. A fix for v5.7.
- Patch 3: avoid potential unnecessary rtx timer expiration. A fix for
v5.15.
- Patch 4: avoid a deadlock on fallback in case of SKB creation failure
while re-injecting.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Matthieu Baerts (NGI0) (2):
mptcp: pm: ignore unknown endpoint flags
selftests: mptcp: pm: ensure unknown flags are ignored
Paolo Abeni (2):
mptcp: schedule rtx timer only after pushing data
mptcp: avoid deadlock on fallback while reinjecting
include/uapi/linux/mptcp.h | 1 +
net/mptcp/pm_netlink.c | 3 ++-
net/mptcp/protocol.c | 22 ++++++++++++++--------
tools/testing/selftests/net/mptcp/pm_netlink.sh | 4 ++++
tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 11 +++++++++++
5 files changed, 32 insertions(+), 9 deletions(-)
---
base-commit: 0373d5c387f24de749cc22e694a14b3a7c7eb515
change-id: 20251205-net-mptcp-misc-fixes-6-19-rc1-5174bd3df981
Best regards,
--
Matthieu Baerts (NGI0) <matttbe@kernel.org>
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH net 1/4] mptcp: pm: ignore unknown endpoint flags
2025-12-05 18:55 [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 Matthieu Baerts (NGI0)
@ 2025-12-05 18:55 ` Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 2/4] selftests: mptcp: pm: ensure unknown flags are ignored Matthieu Baerts (NGI0)
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-12-05 18:55 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
Kuniyuki Iwashima, Dmytro Shytyi
Cc: netdev, mptcp, linux-kernel, linux-kselftest,
Matthieu Baerts (NGI0), stable
Before this patch, the kernel was saving any flags set by the userspace,
even unknown ones. This doesn't cause critical issues because the kernel
is only looking at specific ones. But on the other hand, endpoints dumps
could tell the userspace some recent flags seem to be supported on older
kernel versions.
Instead, ignore all unknown flags when parsing them. By doing that, the
userspace can continue to set unsupported flags, but it has a way to
verify what is supported by the kernel.
Note that it sounds better to continue accepting unsupported flags not
to change the behaviour, but also that eases things on the userspace
side by adding "optional" endpoint types only supported by newer kernel
versions without having to deal with the different kernel versions.
A note for the backports: there will be conflicts in mptcp.h on older
versions not having the mentioned flags, the new line should still be
added last, and the '5' needs to be adapted to have the same value as
the last entry.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
include/uapi/linux/mptcp.h | 1 +
net/mptcp/pm_netlink.c | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index 04eea6d1d0a9..72a5d030154e 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -40,6 +40,7 @@
#define MPTCP_PM_ADDR_FLAG_FULLMESH _BITUL(3)
#define MPTCP_PM_ADDR_FLAG_IMPLICIT _BITUL(4)
#define MPTCP_PM_ADDR_FLAG_LAMINAR _BITUL(5)
+#define MPTCP_PM_ADDR_FLAGS_MASK GENMASK(5, 0)
struct mptcp_info {
__u8 mptcpi_subflows;
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index d5b383870f79..7aa42de9c47b 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -119,7 +119,8 @@ int mptcp_pm_parse_entry(struct nlattr *attr, struct genl_info *info,
}
if (tb[MPTCP_PM_ADDR_ATTR_FLAGS])
- entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]);
+ entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]) &
+ MPTCP_PM_ADDR_FLAGS_MASK;
if (tb[MPTCP_PM_ADDR_ATTR_PORT])
entry->addr.port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]));
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH net 2/4] selftests: mptcp: pm: ensure unknown flags are ignored
2025-12-05 18:55 [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 1/4] mptcp: pm: ignore unknown endpoint flags Matthieu Baerts (NGI0)
@ 2025-12-05 18:55 ` Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 3/4] mptcp: schedule rtx timer only after pushing data Matthieu Baerts (NGI0)
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-12-05 18:55 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
Kuniyuki Iwashima, Dmytro Shytyi
Cc: netdev, mptcp, linux-kernel, linux-kselftest,
Matthieu Baerts (NGI0), stable
This validates the previous commit: the userspace can set unknown flags
-- the 7th bit is currently unused -- without errors, but only the
supported ones are printed in the endpoints dumps.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
tools/testing/selftests/net/mptcp/pm_netlink.sh | 4 ++++
tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 11 +++++++++++
2 files changed, 15 insertions(+)
diff --git a/tools/testing/selftests/net/mptcp/pm_netlink.sh b/tools/testing/selftests/net/mptcp/pm_netlink.sh
index ec6a87588191..123d9d7a0278 100755
--- a/tools/testing/selftests/net/mptcp/pm_netlink.sh
+++ b/tools/testing/selftests/net/mptcp/pm_netlink.sh
@@ -192,6 +192,10 @@ check "show_endpoints" \
flush_endpoint
check "show_endpoints" "" "flush addrs"
+add_endpoint 10.0.1.1 flags unknown
+check "show_endpoints" "$(format_endpoints "1,10.0.1.1")" "ignore unknown flags"
+flush_endpoint
+
set_limits 9 1 2>/dev/null
check "get_limits" "${default_limits}" "rcv addrs above hard limit"
diff --git a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
index 65b374232ff5..99eecccbf0c8 100644
--- a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
+++ b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
@@ -24,6 +24,8 @@
#define IPPROTO_MPTCP 262
#endif
+#define MPTCP_PM_ADDR_FLAG_UNKNOWN _BITUL(7)
+
static void syntax(char *argv[])
{
fprintf(stderr, "%s add|ann|rem|csf|dsf|get|set|del|flush|dump|events|listen|accept [<args>]\n", argv[0]);
@@ -836,6 +838,8 @@ int add_addr(int fd, int pm_family, int argc, char *argv[])
flags |= MPTCP_PM_ADDR_FLAG_BACKUP;
else if (!strcmp(tok, "fullmesh"))
flags |= MPTCP_PM_ADDR_FLAG_FULLMESH;
+ else if (!strcmp(tok, "unknown"))
+ flags |= MPTCP_PM_ADDR_FLAG_UNKNOWN;
else
error(1, errno,
"unknown flag %s", argv[arg]);
@@ -1048,6 +1052,13 @@ static void print_addr(struct rtattr *attrs, int len)
printf(",");
}
+ if (flags & MPTCP_PM_ADDR_FLAG_UNKNOWN) {
+ printf("unknown");
+ flags &= ~MPTCP_PM_ADDR_FLAG_UNKNOWN;
+ if (flags)
+ printf(",");
+ }
+
/* bump unknown flags, if any */
if (flags)
printf("0x%x", flags);
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH net 3/4] mptcp: schedule rtx timer only after pushing data
2025-12-05 18:55 [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 1/4] mptcp: pm: ignore unknown endpoint flags Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 2/4] selftests: mptcp: pm: ensure unknown flags are ignored Matthieu Baerts (NGI0)
@ 2025-12-05 18:55 ` Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 4/4] mptcp: avoid deadlock on fallback while reinjecting Matthieu Baerts (NGI0)
2025-12-09 11:50 ` [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 patchwork-bot+netdevbpf
4 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-12-05 18:55 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
Kuniyuki Iwashima, Dmytro Shytyi
Cc: netdev, mptcp, linux-kernel, linux-kselftest,
Matthieu Baerts (NGI0), stable
From: Paolo Abeni <pabeni@redhat.com>
The MPTCP protocol usually schedule the retransmission timer only
when there is some chances for such retransmissions to happen.
With a notable exception: __mptcp_push_pending() currently schedule
such timer unconditionally, potentially leading to unnecessary rtx
timer expiration.
The issue is present since the blamed commit below but become easily
reproducible after commit 27b0e701d387 ("mptcp: drop bogus optimization
in __mptcp_check_push()")
Fixes: 33d41c9cd74c ("mptcp: more accurate timeout")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
net/mptcp/protocol.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e212c1374bd0..d8a7f7029164 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1623,7 +1623,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
struct mptcp_sendmsg_info info = {
.flags = flags,
};
- bool do_check_data_fin = false;
+ bool copied = false;
int push_count = 1;
while (mptcp_send_head(sk) && (push_count > 0)) {
@@ -1665,7 +1665,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
push_count--;
continue;
}
- do_check_data_fin = true;
+ copied = true;
}
}
}
@@ -1674,11 +1674,14 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
if (ssk)
mptcp_push_release(ssk, &info);
- /* ensure the rtx timer is running */
- if (!mptcp_rtx_timer_pending(sk))
- mptcp_reset_rtx_timer(sk);
- if (do_check_data_fin)
+ /* Avoid scheduling the rtx timer if no data has been pushed; the timer
+ * will be updated on positive acks by __mptcp_cleanup_una().
+ */
+ if (copied) {
+ if (!mptcp_rtx_timer_pending(sk))
+ mptcp_reset_rtx_timer(sk);
mptcp_check_send_data_fin(sk);
+ }
}
static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool first)
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH net 4/4] mptcp: avoid deadlock on fallback while reinjecting
2025-12-05 18:55 [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 Matthieu Baerts (NGI0)
` (2 preceding siblings ...)
2025-12-05 18:55 ` [PATCH net 3/4] mptcp: schedule rtx timer only after pushing data Matthieu Baerts (NGI0)
@ 2025-12-05 18:55 ` Matthieu Baerts (NGI0)
2025-12-09 11:50 ` [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 patchwork-bot+netdevbpf
4 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-12-05 18:55 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
Kuniyuki Iwashima, Dmytro Shytyi
Cc: netdev, mptcp, linux-kernel, linux-kselftest,
Matthieu Baerts (NGI0), stable
From: Paolo Abeni <pabeni@redhat.com>
Jakub reported an MPTCP deadlock at fallback time:
WARNING: possible recursive locking detected
6.18.0-rc7-virtme #1 Not tainted
--------------------------------------------
mptcp_connect/20858 is trying to acquire lock:
ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_try_fallback+0xd8/0x280
but task is already holding lock:
ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_retrans+0x352/0xaa0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&msk->fallback_lock);
lock(&msk->fallback_lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by mptcp_connect/20858:
#0: ff1100001da18290 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x114/0x1bc0
#1: ff1100001db40fd0 (k-sk_lock-AF_INET#2){+.+.}-{0:0}, at: __mptcp_retrans+0x2cb/0xaa0
#2: ff1100001da18b60 (&msk->fallback_lock){+.-.}-{3:3}, at: __mptcp_retrans+0x352/0xaa0
stack backtrace:
CPU: 0 UID: 0 PID: 20858 Comm: mptcp_connect Not tainted 6.18.0-rc7-virtme #1 PREEMPT(full)
Hardware name: Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x6f/0xa0
print_deadlock_bug.cold+0xc0/0xcd
validate_chain+0x2ff/0x5f0
__lock_acquire+0x34c/0x740
lock_acquire.part.0+0xbc/0x260
_raw_spin_lock_bh+0x38/0x50
__mptcp_try_fallback+0xd8/0x280
mptcp_sendmsg_frag+0x16c2/0x3050
__mptcp_retrans+0x421/0xaa0
mptcp_release_cb+0x5aa/0xa70
release_sock+0xab/0x1d0
mptcp_sendmsg+0xd5b/0x1bc0
sock_write_iter+0x281/0x4d0
new_sync_write+0x3c5/0x6f0
vfs_write+0x65e/0xbb0
ksys_write+0x17e/0x200
do_syscall_64+0xbb/0xfd0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fa5627cbc5e
Code: 4d 89 d8 e8 14 bd 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa
RSP: 002b:00007fff1fe14700 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa5627cbc5e
RDX: 0000000000001f9c RSI: 00007fff1fe16984 RDI: 0000000000000005
RBP: 00007fff1fe14710 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff1fe16920
R13: 0000000000002000 R14: 0000000000001f9c R15: 0000000000001f9c
The packet scheduler could attempt a reinjection after receiving an
MP_FAIL and before the infinite map has been transmitted, causing a
deadlock since MPTCP needs to do the reinjection atomically from WRT
fallback.
Address the issue explicitly avoiding the reinjection in the critical
scenario. Note that this is the only fallback critical section that
could potentially send packets and hit the double-lock.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://netdev-ctrl.bots.linux.dev/logs/vmksft/mptcp-dbg/results/412720/1-mptcp-join-sh/stderr
Fixes: f8a1d9b18c5e ("mptcp: make fallback action and fallback decision atomic")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
net/mptcp/protocol.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index d8a7f7029164..9b1fafd87cb9 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2769,10 +2769,13 @@ static void __mptcp_retrans(struct sock *sk)
/*
* make the whole retrans decision, xmit, disallow
- * fallback atomic
+ * fallback atomic, note that we can't retrans even
+ * when an infinite fallback is in progress, i.e. new
+ * subflows are disallowed.
*/
spin_lock_bh(&msk->fallback_lock);
- if (__mptcp_check_fallback(msk)) {
+ if (__mptcp_check_fallback(msk) ||
+ !msk->allow_subflows) {
spin_unlock_bh(&msk->fallback_lock);
release_sock(ssk);
goto clear_scheduled;
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1
2025-12-05 18:55 [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 Matthieu Baerts (NGI0)
` (3 preceding siblings ...)
2025-12-05 18:55 ` [PATCH net 4/4] mptcp: avoid deadlock on fallback while reinjecting Matthieu Baerts (NGI0)
@ 2025-12-09 11:50 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-12-09 11:50 UTC (permalink / raw)
To: Matthieu Baerts
Cc: martineau, geliang, davem, edumazet, kuba, pabeni, horms, shuah,
kuniyu, dmytro, netdev, mptcp, linux-kernel, linux-kselftest,
stable
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 05 Dec 2025 19:55:13 +0100 you wrote:
> Here are various unrelated fixes:
>
> - Patches 1-2: ignore unknown in-kernel PM endpoint flags instead of
> pretending they are supported. A fix for v5.7.
>
> - Patch 3: avoid potential unnecessary rtx timer expiration. A fix for
> v5.15.
>
> [...]
Here is the summary with links:
- [net,1/4] mptcp: pm: ignore unknown endpoint flags
https://git.kernel.org/netdev/net/c/0ace3297a730
- [net,2/4] selftests: mptcp: pm: ensure unknown flags are ignored
https://git.kernel.org/netdev/net/c/29f4801e9c8d
- [net,3/4] mptcp: schedule rtx timer only after pushing data
https://git.kernel.org/netdev/net/c/2ea6190f42d0
- [net,4/4] mptcp: avoid deadlock on fallback while reinjecting
https://git.kernel.org/netdev/net/c/ffb8c27b0539
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-12-09 11:53 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-05 18:55 [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 1/4] mptcp: pm: ignore unknown endpoint flags Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 2/4] selftests: mptcp: pm: ensure unknown flags are ignored Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 3/4] mptcp: schedule rtx timer only after pushing data Matthieu Baerts (NGI0)
2025-12-05 18:55 ` [PATCH net 4/4] mptcp: avoid deadlock on fallback while reinjecting Matthieu Baerts (NGI0)
2025-12-09 11:50 ` [PATCH net 0/4] mptcp: misc fixes for v6.19-rc1 patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox