* [PATCH net v4] mptcp: fix stale skb->sk reference on subflow close
@ 2026-06-01 8:30 Kalpan Jani
2026-06-01 9:41 ` MPTCP CI
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Kalpan Jani @ 2026-06-01 8:30 UTC (permalink / raw)
To: mptcp
Cc: matttbe, martineau, shardul.b, janak, kalpanjani009, shardulsb08,
Kalpan Jani, Paolo Abeni, syzkaller
The backlog list is updated by mptcp_data_ready() under
mptcp_data_lock(). The cleanup of backlog references to a closing
subflow, however, was performed in mptcp_close_ssk(), before
__mptcp_close_ssk() acquires the ssk lock, and while holding neither
the ssk lock nor mptcp_data_lock().
Because that traversal ran without mptcp_data_lock(), concurrent softirq
RX processing on another CPU (subflow_data_ready() -> mptcp_data_ready()
-> __mptcp_add_backlog(), under mptcp_data_lock()) could add a backlog
entry referencing the ssk while the cleanup loop was in progress. Such
an entry could be missed by the cleanup, or the concurrent list update
could corrupt the traversal, leaving skb->sk pointing at the ssk after
it is freed.
A later mptcp_backlog_purge() then dereferences the stale pointer,
triggering a warning in inet_sock_destruct() (ssk->sk_rmem_alloc != 0)
followed by a use-after-free in mptcp_backlog_purge().
Fix this by moving the backlog cleanup into __mptcp_close_ssk(), after
subflow->closing is set to 1 and while the ssk lock is still held,
serialized under mptcp_data_lock(). The cleanup runs only on the push
path (MPTCP_CF_PUSH), where backlog references accumulate; on other
teardown paths the caller already handles cleanup.
With subflow->closing set and mptcp_data_lock() held across the purge,
any concurrent mptcp_data_ready() either completes its enqueue before
the purge runs and is caught, or observes closing=1 and bails out. Once
mptcp_data_unlock() is reached, no new skb referencing the ssk can be
enqueued, so the cleanup is exhaustive.
Remove the unprotected traversal from mptcp_close_ssk() entirely.
Tested with the MPTCP kernel selftests on the patched kernel:
- tools/testing/selftests/net/mptcp/mptcp_join.sh: all tests pass
- tools/testing/selftests/net/mptcp/mptcp_connect.sh: 68/68 pass
Fixes: ee458a3f314e ("mptcp: introduce mptcp-level backlog")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/621
Signed-off-by: Kalpan Jani <kalpan.jani@mpiricsoftware.com>
---
net/mptcp/protocol.c | 34 ++++++++++++++++++++--------------
1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 4546a8b09884..37efcea99dc1 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2527,6 +2527,22 @@ static void __mptcp_subflow_disconnect(struct sock *ssk,
}
}
+static void mptcp_cleanup_ssk_backlog(struct sock *sk, struct sock *ssk)
+{
+ struct mptcp_sock *msk = mptcp_sk(sk);
+ struct sk_buff *skb;
+
+ mptcp_data_lock(sk);
+ list_for_each_entry(skb, &msk->backlog_list, list) {
+ if (skb->sk != ssk)
+ continue;
+
+ atomic_sub(skb->truesize, &skb->sk->sk_rmem_alloc);
+ skb->sk = NULL;
+ }
+ mptcp_data_unlock(sk);
+}
+
/* subflow sockets can be either outgoing (connect) or incoming
* (accept).
*
@@ -2550,6 +2567,9 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
subflow->closing = 1;
+ if (flags & MPTCP_CF_PUSH)
+ mptcp_cleanup_ssk_backlog(sk, ssk);
+
/* Borrow the fwd allocated page left-over; fwd memory for the subflow
* could be negative at this point, but will be reach zero soon - when
* the data allocated using such fragment will be freed.
@@ -2641,9 +2661,6 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
struct mptcp_subflow_context *subflow)
{
- struct mptcp_sock *msk = mptcp_sk(sk);
- struct sk_buff *skb;
-
/* The first subflow can already be closed or disconnected */
if (subflow->close_event_done || READ_ONCE(subflow->local_id) < 0)
return;
@@ -2653,17 +2670,6 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
if (sk->sk_state == TCP_ESTABLISHED)
mptcp_event(MPTCP_EVENT_SUB_CLOSED, mptcp_sk(sk), ssk, GFP_KERNEL);
- /* Remove any reference from the backlog to this ssk; backlog skbs consume
- * space in the msk receive queue, no need to touch sk->sk_rmem_alloc
- */
- list_for_each_entry(skb, &msk->backlog_list, list) {
- if (skb->sk != ssk)
- continue;
-
- atomic_sub(skb->truesize, &skb->sk->sk_rmem_alloc);
- skb->sk = NULL;
- }
-
/* subflow aborted before reaching the fully_established status
* attempt the creation of the next subflow
*/
--
2.43.0
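The locking argument in the commit message above can be modelled in user space. The following is a minimal sketch, not kernel code: every `model_*` name is hypothetical, a C11 `atomic_flag` spinlock stands in for mptcp_data_lock(), and the RX side is assumed to check the closing flag under that lock, as the commit message describes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for illustration; not kernel types. */
struct model_ssk {
	atomic_bool closing;	/* models subflow->closing */
	int rmem_alloc;		/* models ssk->sk_rmem_alloc; changed under data_lock only */
};

struct model_skb {
	struct model_ssk *sk;	/* models skb->sk */
	int truesize;
	struct model_skb *next;
};

struct model_msk {
	atomic_flag data_lock;	/* spinlock standing in for mptcp_data_lock() */
	struct model_skb *backlog;
};

static void model_ssk_init(struct model_ssk *ssk)
{
	atomic_init(&ssk->closing, false);
	ssk->rmem_alloc = 0;
}

static void model_msk_init(struct model_msk *msk)
{
	atomic_flag_clear(&msk->data_lock);
	msk->backlog = NULL;
}

static void model_data_lock(struct model_msk *msk)
{
	while (atomic_flag_test_and_set(&msk->data_lock))
		;	/* spin until the current holder clears the flag */
}

static void model_data_unlock(struct model_msk *msk)
{
	atomic_flag_clear(&msk->data_lock);
}

/* RX side: enqueue under the data lock, unless the subflow is closing. */
static bool model_add_backlog(struct model_msk *msk, struct model_ssk *ssk,
			      struct model_skb *skb)
{
	bool queued = false;

	model_data_lock(msk);
	if (!atomic_load(&ssk->closing)) {
		skb->sk = ssk;
		ssk->rmem_alloc += skb->truesize;
		skb->next = msk->backlog;
		msk->backlog = skb;
		queued = true;
	}
	model_data_unlock(msk);
	return queued;
}

/* Close side: mark closing first, then drop every reference under the lock. */
static int model_close_ssk(struct model_msk *msk, struct model_ssk *ssk)
{
	struct model_skb *skb;
	int dropped = 0;

	atomic_store(&ssk->closing, true);
	model_data_lock(msk);
	for (skb = msk->backlog; skb; skb = skb->next) {
		if (skb->sk != ssk)
			continue;
		ssk->rmem_alloc -= skb->truesize;
		skb->sk = NULL;
		dropped++;
	}
	model_data_unlock(msk);
	return dropped;
}
```

In this model, an enqueue that takes the lock before the purge is seen by the traversal, and one that takes it afterwards observes the closing flag and is refused, so no entry can keep pointing at the closed subflow. The pre-patch behaviour corresponds to running the loop in model_close_ssk() without model_data_lock().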
* Re: [PATCH net v4] mptcp: fix stale skb->sk reference on subflow close
From: MPTCP CI @ 2026-06-01 9:41 UTC (permalink / raw)
To: Kalpan Jani; +Cc: mptcp
Hi Kalpan,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_connect_checksum ⚠️
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/26744812858
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/60889317c233
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1103840
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts already made to keep the test suite
stable when executed on a public CI like this one, it is possible that some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
* Re: [PATCH net v4] mptcp: fix stale skb->sk reference on subflow close
From: Kalpan Jani @ 2026-06-25 6:40 UTC (permalink / raw)
To: mptcp
Cc: matttbe, martineau, shardul.b, janak, kalpanjani009, shardulsb08,
Kalpan Jani, Paolo Abeni, syzkaller, Akshit Patadiya
Hi Paolo, Matt,
Gentle ping on this patch.
https://lore.kernel.org/all/20260601083010.924938-1-kalpan.jani@mpiricsoftware.com/
I haven't seen any further feedback on v4, so I just wanted to bring it back to your attention in case it was missed.
Changes since v3:
- Rewrote the commit message: the old backlog traversal in mptcp_close_ssk() ran before __mptcp_close_ssk() took the ssk lock, holding neither the ssk lock nor mptcp_data_lock(). The race is a concurrent softirq RX path (subflow_data_ready() -> mptcp_data_ready() -> __mptcp_add_backlog(), under mptcp_data_lock()) adding to the backlog mid-traversal, not anything tied to a release_sock(ssk).
- Added the missing Fixes tag (ee458a3f314e).
The diff is otherwise unchanged from v3. I kept your Suggested-by, Paolo; happy to add an Acked-by too if you'd rather it be explicit.
Thanks for your time.
Cheers,
Kalpan Jani
* Re: [PATCH net v4] mptcp: fix stale skb->sk reference on subflow close
From: Matthieu Baerts @ 2026-06-25 8:06 UTC (permalink / raw)
To: Kalpan Jani, mptcp
Cc: martineau, shardul.b, janak, kalpanjani009, shardulsb08,
Paolo Abeni, syzkaller, Akshit Patadiya
Hi Kalpan,
On 25/06/2026 08:40, Kalpan Jani wrote:
> Hi Paolo, Matt,
>
> Gentle ping on this patch.
>
> https://lore.kernel.org/all/20260601083010.924938-1-kalpan.jani@mpiricsoftware.com/
>
> I haven't seen any further feedback on v4, so I just wanted to bring it back to your attention in case it was missed.
Sorry for the delay. Your patch has not been missed: it is tracked on
patchwork. The delay is due to unfortunate timing for most of us at the
same time. Hopefully this will get better at some point, and the load
will decrease!
I will apply the patch; v4 looks good to me.
> Changes since v3:
> - Rewrote the commit message: the old backlog traversal in mptcp_close_ssk() ran before __mptcp_close_ssk() took the ssk lock, holding neither the ssk lock nor mptcp_data_lock(). The race is a concurrent softirq RX path (subflow_data_ready() -> mptcp_data_ready() -> __mptcp_add_backlog(), under mptcp_data_lock()) adding to the backlog mid-traversal, not anything tied to a release_sock(ssk).
> - Added the missing Fixes tag (ee458a3f314e).
Thank you for the changelog. It is always useful to keep it in the note
section (under '---') to help reviewers.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH net v4] mptcp: fix stale skb->sk reference on subflow close
From: Matthieu Baerts @ 2026-06-25 8:39 UTC (permalink / raw)
To: Kalpan Jani, mptcp
Cc: martineau, shardul.b, janak, kalpanjani009, shardulsb08,
Paolo Abeni, syzkaller
Hi Kalpan,
On 01/06/2026 10:30, Kalpan Jani wrote:
> The backlog list is updated by mptcp_data_ready() under
> mptcp_data_lock(). The cleanup of backlog references to a closing
> subflow, however, was performed in mptcp_close_ssk(), before
> __mptcp_close_ssk() acquires the ssk lock, and while holding neither
> the ssk lock nor mptcp_data_lock().
>
> Because that traversal ran without mptcp_data_lock(), concurrent softirq
> RX processing on another CPU (subflow_data_ready() -> mptcp_data_ready()
> -> __mptcp_add_backlog(), under mptcp_data_lock()) could add a backlog
> entry referencing the ssk while the cleanup loop was in progress. Such
> an entry could be missed by the cleanup, or the concurrent list update
> could corrupt the traversal, leaving skb->sk pointing at the ssk after
> it is freed.
>
> A later mptcp_backlog_purge() then dereferences the stale pointer,
> triggering a warning in inet_sock_destruct() (ssk->sk_rmem_alloc != 0)
> followed by a use-after-free in mptcp_backlog_purge().
>
> Fix this by moving the backlog cleanup into __mptcp_close_ssk(), after
> subflow->closing is set to 1 and while the ssk lock is still held,
> serialized under mptcp_data_lock(). The cleanup runs only on the push
> path (MPTCP_CF_PUSH), where backlog references accumulate; on other
> teardown paths the caller already handles cleanup.
>
> With subflow->closing set and mptcp_data_lock() held across the purge,
> any concurrent mptcp_data_ready() either completes its enqueue before
> the purge runs and is caught, or observes closing=1 and bails out. Once
> mptcp_data_unlock() is reached, no new skb referencing the ssk can be
> enqueued, so the cleanup is exhaustive.
>
> Remove the unprotected traversal from mptcp_close_ssk() entirely.
Thank you for the patch. Now in our tree (fixes for -net)
New patches for t/upstream-net and t/upstream:
- 3462c6e4c363: mptcp: fix stale skb->sk reference on subflow close
- Results: b4e31a1d879f..3ca09a35eeb2 (export-net)
- Results: beb02b81dcc9..de7eb4dc952e (export)
Tests are now in progress:
- export-net:
https://github.com/multipath-tcp/mptcp_net-next/commit/3c5df8d7e52fe1bca30ed7f1358cd0d51c438800/checks
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/ed605c82f0536e49e42d77d6e77366d59888b4f1/checks
> Tested with the MPTCP kernel selftests on the patched kernel:
> - tools/testing/selftests/net/mptcp/mptcp_join.sh: all tests pass
> - tools/testing/selftests/net/mptcp/mptcp_connect.sh: 68/68 pass
Note that you don't need to specify this: for MPTCP patches, we assume
all kernel selftests and packetdrill tests are passing. If you ran other
"unusual" tests -- e.g. running a reproducer mentioned in the issue many
times -- then it is worth adding that to the commit message.
> Fixes: ee458a3f314e ("mptcp: introduce mptcp-level backlog")
> Suggested-by: Paolo Abeni <pabeni@redhat.com>
> Reported-by: syzkaller <syzkaller@googlegroups.com>
Detail: it might be confusing to add this: the issue has not been
reported by the public syzbot instance [1], but by a private one
dedicated to MPTCP. People on this mailing list will likely not find the
related issue.
[1] https://syzkaller.appspot.com/upstream/s/mptcp
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.