* [PATCH net v2 1/1] mptcp: fix request ownership when cloning reqsk
@ 2026-05-10 14:14 Ren Wei
2026-05-13 9:37 ` Paolo Abeni
0 siblings, 1 reply; 2+ messages in thread
From: Ren Wei @ 2026-05-10 14:14 UTC (permalink / raw)
To: netdev, mptcp
Cc: matttbe, martineau, geliang, davem, edumazet, kuba, pabeni, horms,
ncardwell, kuniyu, daniel, kafai, yuantan098, yifanwucs,
tomapufckgml, bird, caoruide123, enjou1224z, n05ec
From: Ruide Cao <caoruide123@gmail.com>
TCP request migration clones pending request sockets with
inet_reqsk_clone(). For MPTCP MP_JOIN requests this raw-copies
subflow_req->msk, but the cloned request does not take a new reference.
Both the original and the cloned request can later drop the same msk in
subflow_req_destructor(), and a migrated request may keep a dangling msk
pointer after the original owner has already been released.
MP_CAPABLE requests have a similar ownership issue for token_node. The
original request is hashed in the token table, and inet_reqsk_clone()
raw-copies the hlist pointers into the clone. The clone must not inherit
that hashed state on migration failure, and the token ownership must move
only after the new req has actually replaced the old one in the ehash.
Fix both cases while keeping the change local to MPTCP and the TCP
migration path: grab a reference for cloned subflow requests that carry an
msk, clear any raw-copied token hash state from the clone, and move the
hashed token request ownership only after migration succeeds.
Fixes: c905dee62232 ("tcp: Migrate TCP_NEW_SYN_RECV requests at retransmitting SYN+ACKs.")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Ruide Cao <caoruide123@gmail.com>
Tested-by: Ren Wei <enjou1224z@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
---
changes in v2:
- drop the generic request_sock clone callback
- call MPTCP directly from inet_reqsk_clone() under the TCP protocol check
- keep cloned MP_JOIN requests holding an msk reference
- clear raw-copied MP_CAPABLE token hash state in the clone
- move MP_CAPABLE token ownership only after successful req migration
- avoid exposing token internals to inet_connection_sock.c
- update the commit message accordingly
- v1 link: https://lore.kernel.org/all/86e2514b533bf4d55d4aa2fdbf1404022e8c9430.1776149210.git.caoruide123@gmail.com/
include/net/mptcp.h | 14 ++++++++++++++
net/ipv4/inet_connection_sock.c | 12 ++++++++++--
net/mptcp/protocol.h | 2 ++
net/mptcp/subflow.c | 20 ++++++++++++++++++++
net/mptcp/token.c | 25 +++++++++++++++++++++++++
5 files changed, 71 insertions(+), 2 deletions(-)
diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index f7263fe2a2e4..4bb24963afef 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -219,6 +219,10 @@ int mptcp_subflow_init_cookie_req(struct request_sock *req,
struct request_sock *mptcp_subflow_reqsk_alloc(const struct request_sock_ops *ops,
struct sock *sk_listener,
bool attach_listener);
+void mptcp_subflow_reqsk_clone(struct request_sock *req,
+ struct request_sock *new_req);
+void mptcp_subflow_reqsk_migrated(struct request_sock *req,
+ struct request_sock *new_req);
__be32 mptcp_get_reset_option(const struct sk_buff *skb);
@@ -314,6 +318,16 @@ static inline struct request_sock *mptcp_subflow_reqsk_alloc(const struct reques
return NULL;
}
+static inline void mptcp_subflow_reqsk_clone(struct request_sock *req,
+ struct request_sock *new_req)
+{
+}
+
+static inline void mptcp_subflow_reqsk_migrated(struct request_sock *req,
+ struct request_sock *new_req)
+{
+}
+
static inline __be32 mptcp_reset_option(const struct sk_buff *skb) { return htonl(0u); }
static inline void mptcp_active_detect_blackhole(struct sock *sk, bool expired) { }
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index e961936b6be7..578471bb8a2b 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -17,6 +17,7 @@
#include <net/inet_timewait_sock.h>
#include <net/ip.h>
#include <net/route.h>
+#include <net/mptcp.h>
#include <net/tcp_states.h>
#include <net/xfrm.h>
#include <net/tcp.h>
@@ -951,8 +952,13 @@ static struct request_sock *inet_reqsk_clone(struct request_sock *req,
/* We need not acquire fastopenq->lock
* because the child socket is locked in inet_csk_listen_stop().
*/
- if (sk->sk_protocol == IPPROTO_TCP && tcp_rsk(nreq)->tfo_listener)
- rcu_assign_pointer(tcp_sk(nreq->sk)->fastopen_rsk, nreq);
+ if (sk->sk_protocol == IPPROTO_TCP) {
+ if (tcp_rsk(nreq)->tfo_listener)
+ rcu_assign_pointer(tcp_sk(nreq->sk)->fastopen_rsk, nreq);
+
+ if (rsk_is_mptcp(req))
+ mptcp_subflow_reqsk_clone(req, nreq);
+ }
return nreq;
}
@@ -1115,6 +1121,8 @@ static void reqsk_timer_handler(struct timer_list *t)
goto no_ownership;
}
+ if (rsk_is_mptcp(oreq))
+ mptcp_subflow_reqsk_migrated(oreq, nreq);
__NET_INC_STATS(net, LINUX_MIB_TCPMIGRATEREQSUCCESS);
reqsk_migrate_reset(oreq);
reqsk_queue_removed(&inet_csk(oreq->rsk_listener)->icsk_accept_queue, oreq);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index ec15e503da8b..ca1ca2334e1a 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -1038,6 +1038,8 @@ static inline void mptcp_token_init_request(struct request_sock *req)
int mptcp_token_new_request(struct request_sock *req);
void mptcp_token_destroy_request(struct request_sock *req);
+void mptcp_token_move_request(struct request_sock *req,
+ struct request_sock *new_req);
int mptcp_token_new_connect(struct sock *ssk);
void mptcp_token_accept(struct mptcp_subflow_request_sock *r,
struct mptcp_sock *msk);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 4ff5863aa9fd..e4c1d2bb4783 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -47,6 +47,26 @@ static void subflow_req_destructor(struct request_sock *req)
mptcp_token_destroy_request(req);
}
+void mptcp_subflow_reqsk_clone(struct request_sock *req,
+ struct request_sock *new_req)
+{
+ struct mptcp_subflow_request_sock *subflow_req;
+
+ subflow_req = mptcp_subflow_rsk(new_req);
+
+ if (subflow_req->msk)
+ sock_hold((struct sock *)subflow_req->msk);
+
+ /* Raw-cloned token hash state must not survive failed migrations. */
+ mptcp_token_init_request(new_req);
+}
+
+void mptcp_subflow_reqsk_migrated(struct request_sock *req,
+ struct request_sock *new_req)
+{
+ mptcp_token_move_request(req, new_req);
+}
+
static void subflow_generate_hmac(u64 key1, u64 key2, u32 nonce1, u32 nonce2,
void *hmac)
{
diff --git a/net/mptcp/token.c b/net/mptcp/token.c
index f1a50f367add..8f92a57c26e5 100644
--- a/net/mptcp/token.c
+++ b/net/mptcp/token.c
@@ -207,6 +207,31 @@ void mptcp_token_accept(struct mptcp_subflow_request_sock *req,
spin_unlock_bh(&bucket->lock);
}
+void mptcp_token_move_request(struct request_sock *req,
+ struct request_sock *new_req)
+{
+ struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req);
+ struct mptcp_subflow_request_sock *new_subflow_req;
+ struct mptcp_subflow_request_sock *pos;
+ struct token_bucket *bucket;
+
+ new_subflow_req = mptcp_subflow_rsk(new_req);
+
+ if (hlist_nulls_unhashed(&subflow_req->token_node))
+ return;
+
+ bucket = token_bucket(subflow_req->token);
+ spin_lock_bh(&bucket->lock);
+ /* Migrate the pending MP_CAPABLE token ownership to the cloned req. */
+ pos = __token_lookup_req(bucket, subflow_req->token);
+ if (!WARN_ON_ONCE(pos != subflow_req))
+ hlist_nulls_replace_init_rcu(&subflow_req->token_node,
+ &new_subflow_req->token_node);
+ else
+ mptcp_token_init_request(new_req);
+ spin_unlock_bh(&bucket->lock);
+}
+
bool mptcp_token_exists(u32 token)
{
struct hlist_nulls_node *pos;
--
2.34.1
^ permalink raw reply related [flat|nested] 2+ messages in thread* Re: [PATCH net v2 1/1] mptcp: fix request ownership when cloning reqsk
2026-05-10 14:14 [PATCH net v2 1/1] mptcp: fix request ownership when cloning reqsk Ren Wei
@ 2026-05-13 9:37 ` Paolo Abeni
0 siblings, 0 replies; 2+ messages in thread
From: Paolo Abeni @ 2026-05-13 9:37 UTC (permalink / raw)
To: Ren Wei, netdev, mptcp
Cc: matttbe, martineau, geliang, davem, edumazet, kuba, horms,
ncardwell, kuniyu, daniel, kafai, yuantan098, yifanwucs,
tomapufckgml, bird, caoruide123, enjou1224z
On 5/10/26 4:14 PM, Ren Wei wrote:
> From: Ruide Cao <caoruide123@gmail.com>
>
> TCP request migration clones pending request sockets with
> inet_reqsk_clone(). For MPTCP MP_JOIN requests this raw-copies
> subflow_req->msk, but the cloned request does not take a new reference.
>
> Both the original and the cloned request can later drop the same msk in
> subflow_req_destructor(), and a migrated request may keep a dangling msk
> pointer after the original owner has already been released.
>
> MP_CAPABLE requests have a similar ownership issue for token_node. The
> original request is hashed in the token table, and inet_reqsk_clone()
> raw-copies the hlist pointers into the clone. The clone must not inherit
> that hashed state on migration failure, and the token ownership must move
> only after the new req has actually replaced the old one in the ehash.
>
> Fix both cases while keeping the change local to MPTCP and the TCP
> migration path: grab a reference for cloned subflow requests that carry an
> msk, clear any raw-copied token hash state from the clone, and move the
> hashed token request ownership only after migration succeeds.
>
> Fixes: c905dee62232 ("tcp: Migrate TCP_NEW_SYN_RECV requests at retransmitting SYN+ACKs.")
> Cc: stable@kernel.org
> Reported-by: Yuan Tan <yuantan098@gmail.com>
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Reported-by: Xin Liu <bird@lzu.edu.cn>
> Signed-off-by: Ruide Cao <caoruide123@gmail.com>
> Tested-by: Ren Wei <enjou1224z@gmail.com>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
> ---
> changes in v2:
> - drop the generic request_sock clone callback
> - call MPTCP directly from inet_reqsk_clone() under the TCP protocol check
> - keep cloned MP_JOIN requests holding an msk reference
> - clear raw-copied MP_CAPABLE token hash state in the clone
> - move MP_CAPABLE token ownership only after successful req migration
> - avoid exposing token internals to inet_connection_sock.c
> - update the commit message accordingly
> - v1 link: https://lore.kernel.org/all/86e2514b533bf4d55d4aa2fdbf1404022e8c9430.1776149210.git.caoruide123@gmail.com/
>
> include/net/mptcp.h | 14 ++++++++++++++
> net/ipv4/inet_connection_sock.c | 12 ++++++++++--
> net/mptcp/protocol.h | 2 ++
> net/mptcp/subflow.c | 20 ++++++++++++++++++++
> net/mptcp/token.c | 25 +++++++++++++++++++++++++
> 5 files changed, 71 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/mptcp.h b/include/net/mptcp.h
> index f7263fe2a2e4..4bb24963afef 100644
> --- a/include/net/mptcp.h
> +++ b/include/net/mptcp.h
> @@ -219,6 +219,10 @@ int mptcp_subflow_init_cookie_req(struct request_sock *req,
> struct request_sock *mptcp_subflow_reqsk_alloc(const struct request_sock_ops *ops,
> struct sock *sk_listener,
> bool attach_listener);
> +void mptcp_subflow_reqsk_clone(struct request_sock *req,
> + struct request_sock *new_req);
> +void mptcp_subflow_reqsk_migrated(struct request_sock *req,
> + struct request_sock *new_req);
>
> __be32 mptcp_get_reset_option(const struct sk_buff *skb);
>
> @@ -314,6 +318,16 @@ static inline struct request_sock *mptcp_subflow_reqsk_alloc(const struct reques
> return NULL;
> }
>
> +static inline void mptcp_subflow_reqsk_clone(struct request_sock *req,
> + struct request_sock *new_req)
> +{
> +}
> +
> +static inline void mptcp_subflow_reqsk_migrated(struct request_sock *req,
> + struct request_sock *new_req)
> +{
> +}
> +
> static inline __be32 mptcp_reset_option(const struct sk_buff *skb) { return htonl(0u); }
>
> static inline void mptcp_active_detect_blackhole(struct sock *sk, bool expired) { }
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index e961936b6be7..578471bb8a2b 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -17,6 +17,7 @@
> #include <net/inet_timewait_sock.h>
> #include <net/ip.h>
> #include <net/route.h>
> +#include <net/mptcp.h>
> #include <net/tcp_states.h>
> #include <net/xfrm.h>
> #include <net/tcp.h>
> @@ -951,8 +952,13 @@ static struct request_sock *inet_reqsk_clone(struct request_sock *req,
> /* We need not acquire fastopenq->lock
> * because the child socket is locked in inet_csk_listen_stop().
> */
> - if (sk->sk_protocol == IPPROTO_TCP && tcp_rsk(nreq)->tfo_listener)
> - rcu_assign_pointer(tcp_sk(nreq->sk)->fastopen_rsk, nreq);
> + if (sk->sk_protocol == IPPROTO_TCP) {
> + if (tcp_rsk(nreq)->tfo_listener)
> + rcu_assign_pointer(tcp_sk(nreq->sk)->fastopen_rsk, nreq);
> +
> + if (rsk_is_mptcp(req))
> + mptcp_subflow_reqsk_clone(req, nreq);
> + }
>
> return nreq;
> }
> @@ -1115,6 +1121,8 @@ static void reqsk_timer_handler(struct timer_list *t)
> goto no_ownership;
> }
>
> + if (rsk_is_mptcp(oreq))
> + mptcp_subflow_reqsk_migrated(oreq, nreq);
Sashiko says:
---
Can this sequence trigger a race condition with network RX processing?
In reqsk_timer_handler(), nreq is inserted into the global ehash table
via inet_ehash_insert() right before this block. The ehash lock is
immediately dropped, making nreq globally visible to concurrent network
RX softirq processing.
However, the MPTCP token state is not migrated to nreq until the call to
mptcp_subflow_reqsk_migrated(oreq, nreq).
If a concurrent RX thread processes an incoming ACK for the connection
in this narrow window, it could find nreq in the ehash table,
successfully process it, and invoke mptcp_token_accept(nreq).
Inside mptcp_token_accept(), __token_lookup_req() searches the token
bucket and finds oreq, because the timer thread hasn't moved the token yet.
The code then asserts ownership:
mptcp_token_accept()
pos = __token_lookup_req(bucket, req->token);
if (!WARN_ON_ONCE(pos != req))
hlist_nulls_del_init_rcu(&req->token_node);
Since pos is oreq and req is nreq, would this fail the assertion and
trigger a kernel warning?
If the calls were reordered so that mptcp_subflow_reqsk_migrated()
happens before inet_ehash_insert(), could that merely shift the race? If
a concurrent ACK is already processing oreq, it might call
mptcp_token_accept(oreq) after the token was moved to nreq, triggering
the same warning, since the ehash table and the token bucket are
protected by different locks and the updates are non-atomic.
Could a remote attacker time ACKs to hit this window during a listener
migration, potentially leading to a denial of service on systems
configured to panic on warnings?
---
I _think_ the above race could/should be solved squashing
mptcp_subflow_reqsk_migrated() into mptcp_subflow_reqsk_clone() and
adjusting mptcp_token_destroy_request() and mptcp_token_accept() to
deal with already removed token, checking (again) for unhased token_node
under the bucket lock.
The latter will require a little more work, returning a success status
to the caller - mptcp_sk_clone_init() - and make the latter fail on
mptcp_token_accept() failures.
The above would also make the fix more encapsulated into the mptcp code.
/P
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-05-13 9:37 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-10 14:14 [PATCH net v2 1/1] mptcp: fix request ownership when cloning reqsk Ren Wei
2026-05-13 9:37 ` Paolo Abeni
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox