* [PATCH 1/1] RDMA/rxe: Fix Use-After-Free problem in rxe_net_del
From: Zhu Yanjun @ 2026-05-17 4:47 UTC
To: zyjzyj2000, jgg, leon, linux-rdma; +Cc: Zhu Yanjun, syzbot+d8f76778263ab65c2b21

syzbot reported a general protection fault (KASAN: null-ptr-deref) in
kernel_sock_shutdown() called during the software RoCE (rxe) link
deletion path (rxe_dellink -> rxe_net_del).

The root cause is a TOCTOU (Time-of-Check to Time-of-Use) race condition
in rxe_net_del(). Previously, the function fetched the socket pointer
via rxe_ns_pernet_sk4/6() outside the critical section, and then
acquired the lock to release it via rxe_sock_put().

In a highly concurrent teardown environment, another thread could close
and clear the pernet socket after it was fetched but before the lock
was acquired. This causes rxe_sock_put() to operate on a dangling or
already cleared socket pointer, leading to a NULL pointer dereference
when kernel_sock_shutdown() attempts to access sock->sk.

Fix this by introducing a dedicated, per-netns mutex 'release_lock'
and extending its scope. The socket pointers are now fetched, checked,
and released entirely within the same locked critical section. This
ensures the atomicity of the socket lookup and teardown sequence.

Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
Fixes: f1327abd6abe ("RDMA/rxe: Support RDMA link creation and destruction per net namespace")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
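Illustration only, not part of the patch: the commit message above
describes fetching, checking, and releasing the socket pointer inside a
single critical section. A minimal userspace sketch of that pattern
follows, assuming pthreads in place of the kernel mutex API. Every
demo_* name is hypothetical and does not exist in the rxe driver, and a
static int stands in for the pernet socket.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-netns state (not the real rxe_ns_sock). */
struct demo_ns {
	int *sk;                      /* stand-in for the pernet socket pointer */
	pthread_mutex_t release_lock; /* serializes lookup and teardown */
	int releases;                 /* number of times the "socket" was released */
};

static int demo_socket; /* dummy object the pointer refers to */

static void demo_ns_init(struct demo_ns *ns)
{
	ns->sk = &demo_socket;
	ns->releases = 0;
	pthread_mutex_init(&ns->release_lock, NULL);
}

/*
 * Fetch, check, and clear the pointer under one lock, so a racing
 * caller either sees the live pointer or NULL, never a stale copy.
 */
static void demo_ns_release(struct demo_ns *ns)
{
	int *sk;

	assert(ns != NULL);

	pthread_mutex_lock(&ns->release_lock);
	sk = ns->sk;
	if (sk) {
		ns->sk = NULL;  /* clear before anyone else can look it up */
		ns->releases++; /* the real code would shut the socket down here */
	}
	pthread_mutex_unlock(&ns->release_lock);
}

static void *demo_teardown_thread(void *arg)
{
	demo_ns_release(arg);
	return NULL;
}

/* Race two teardown paths and report how many releases happened. */
static int demo_run_race(void)
{
	struct demo_ns ns;
	pthread_t a, b;

	demo_ns_init(&ns);
	pthread_create(&a, NULL, demo_teardown_thread, &ns);
	pthread_create(&b, NULL, demo_teardown_thread, &ns);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_mutex_destroy(&ns.release_lock);

	return ns.releases;
}
```

Whichever thread takes the lock first performs the release; the other
finds the pointer already cleared and does nothing. If the fetch were
moved outside the lock, both threads could hold the same stale pointer,
which is the window the commit message describes.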
drivers/infiniband/sw/rxe/rxe_net.c | 4 ++++
drivers/infiniband/sw/rxe/rxe_ns.c | 22 ++++++++++++++++++++++
drivers/infiniband/sw/rxe/rxe_ns.h | 3 +++
3 files changed, 29 insertions(+)
diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index 50a2cb5405e2..b689ba085da4 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -655,6 +655,8 @@ void rxe_net_del(struct ib_device *dev)
net = dev_net(ndev);
+ rxe_ns_lock(net);
+
sk = rxe_ns_pernet_sk4(net);
if (sk)
rxe_sock_put(sk, rxe_ns_pernet_set_sk4, net);
@@ -663,6 +665,8 @@ void rxe_net_del(struct ib_device *dev)
if (sk)
rxe_sock_put(sk, rxe_ns_pernet_set_sk6, net);
+ rxe_ns_unlock(net);
+
dev_put(ndev);
}
diff --git a/drivers/infiniband/sw/rxe/rxe_ns.c b/drivers/infiniband/sw/rxe/rxe_ns.c
index 8b9d734229b2..799a727bc1fe 100644
--- a/drivers/infiniband/sw/rxe/rxe_ns.c
+++ b/drivers/infiniband/sw/rxe/rxe_ns.c
@@ -16,6 +16,7 @@
struct rxe_ns_sock {
struct sock __rcu *rxe_sk4;
struct sock __rcu *rxe_sk6;
+ struct mutex release_lock;
};
/*
@@ -31,10 +32,26 @@ static int rxe_ns_init(struct net *net)
/* defer socket create in the namespace to the first
* device create.
*/
+ struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+ mutex_init(&ns_sk->release_lock);
return 0;
}
+void rxe_ns_lock(struct net *net)
+{
+ struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+
+ mutex_lock(&ns_sk->release_lock);
+}
+
+void rxe_ns_unlock(struct net *net)
+{
+ struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+
+ mutex_unlock(&ns_sk->release_lock);
+}
+
static void rxe_ns_exit(struct net *net)
{
/* called when the network namespace is removed
@@ -42,6 +59,7 @@ static void rxe_ns_exit(struct net *net)
struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
struct sock *sk;
+ rxe_ns_lock(net);
rcu_read_lock();
sk = rcu_dereference(ns_sk->rxe_sk4);
rcu_read_unlock();
@@ -59,6 +77,10 @@ static void rxe_ns_exit(struct net *net)
udp_tunnel_sock_release(sk->sk_socket);
}
#endif
+
+ rxe_ns_unlock(net);
+
+ mutex_destroy(&ns_sk->release_lock);
}
/*
diff --git a/drivers/infiniband/sw/rxe/rxe_ns.h b/drivers/infiniband/sw/rxe/rxe_ns.h
index 4da2709e6b71..e6cc6b5a4806 100644
--- a/drivers/infiniband/sw/rxe/rxe_ns.h
+++ b/drivers/infiniband/sw/rxe/rxe_ns.h
@@ -20,6 +20,9 @@ static inline void rxe_ns_pernet_set_sk6(struct net *net, struct sock *sk)
}
#endif /* IPv6 */
+void rxe_ns_lock(struct net *net);
+void rxe_ns_unlock(struct net *net);
+
int rxe_namespace_init(void);
void rxe_namespace_exit(void);
--
2.43.0
* Re: [PATCH 1/1] RDMA/rxe: Fix Use-After-Free problem in rxe_net_del
From: Leon Romanovsky @ 2026-05-18 11:39 UTC
To: Zhu Yanjun; +Cc: zyjzyj2000, jgg, linux-rdma, syzbot+d8f76778263ab65c2b21
On Sun, May 17, 2026 at 06:47:47AM +0200, Zhu Yanjun wrote:
> syzbot reported a general protection fault (KASAN: null-ptr-deref) in
> kernel_sock_shutdown() called during the software RoCE (rxe) link
> deletion path (rxe_dellink -> rxe_net_del).
>
> The root cause is a TOCTOU (Time-of-Check to Time-of-Use) race condition
> in rxe_net_del(). Previously, the function fetched the socket pointer
> via rxe_ns_pernet_sk4/6() outside the critical section, and then
> acquired the lock to release it via rxe_sock_put().
>
> In a highly concurrent teardown environment, another thread could close
> and clear the pernet socket after it was fetched but before the lock
> was acquired. This causes rxe_sock_put() to operate on a dangling or
> already cleared socket pointer, leading to a NULL pointer dereference
> when kernel_sock_shutdown() attempts to access sock->sk.
>
> Fix this by introducing a dedicated, per-netns mutex 'release_lock'
> and extending its scope. The socket pointers are now fetched, checked,
> and released entirely within the same locked critical section. This
> ensures the atomicity of the socket lookup and teardown sequence.
>
> Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
> Fixes: f1327abd6abe ("RDMA/rxe: Support RDMA link creation and destruction per net namespace")
> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> ---
> drivers/infiniband/sw/rxe/rxe_net.c | 4 ++++
> drivers/infiniband/sw/rxe/rxe_ns.c | 22 ++++++++++++++++++++++
> drivers/infiniband/sw/rxe/rxe_ns.h | 3 +++
> 3 files changed, 29 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
> index 50a2cb5405e2..b689ba085da4 100644
> --- a/drivers/infiniband/sw/rxe/rxe_net.c
> +++ b/drivers/infiniband/sw/rxe/rxe_net.c
> @@ -655,6 +655,8 @@ void rxe_net_del(struct ib_device *dev)
>
> net = dev_net(ndev);
>
> + rxe_ns_lock(net);
> +
> sk = rxe_ns_pernet_sk4(net);
> if (sk)
> rxe_sock_put(sk, rxe_ns_pernet_set_sk4, net);
> @@ -663,6 +665,8 @@ void rxe_net_del(struct ib_device *dev)
> if (sk)
> rxe_sock_put(sk, rxe_ns_pernet_set_sk6, net);
>
> + rxe_ns_unlock(net);
> +
> dev_put(ndev);
> }
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_ns.c b/drivers/infiniband/sw/rxe/rxe_ns.c
> index 8b9d734229b2..799a727bc1fe 100644
> --- a/drivers/infiniband/sw/rxe/rxe_ns.c
> +++ b/drivers/infiniband/sw/rxe/rxe_ns.c
> @@ -16,6 +16,7 @@
> struct rxe_ns_sock {
> struct sock __rcu *rxe_sk4;
> struct sock __rcu *rxe_sk6;
> + struct mutex release_lock;
This change renders the existing rcu_read_lock() and rcu_read_unlock()
calls unnecessary.
Thanks
* Re: [PATCH 1/1] RDMA/rxe: Fix Use-After-Free problem in rxe_net_del
From: yanjun.zhu @ 2026-05-18 18:59 UTC
To: Leon Romanovsky, Zhu Yanjun
Cc: zyjzyj2000, jgg, linux-rdma, syzbot+d8f76778263ab65c2b21
On 5/18/26 4:39 AM, Leon Romanovsky wrote:
> On Sun, May 17, 2026 at 06:47:47AM +0200, Zhu Yanjun wrote:
>> + rxe_ns_lock(net);
>> +
>> sk = rxe_ns_pernet_sk4(net);
>> if (sk)
>> rxe_sock_put(sk, rxe_ns_pernet_set_sk4, net);
>> @@ -663,6 +665,8 @@ void rxe_net_del(struct ib_device *dev)
>> if (sk)
>> rxe_sock_put(sk, rxe_ns_pernet_set_sk6, net);
>>
>> + rxe_ns_unlock(net);
>> +
>> dev_put(ndev);
>> }
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_ns.c b/drivers/infiniband/sw/rxe/rxe_ns.c
>> index 8b9d734229b2..799a727bc1fe 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_ns.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_ns.c
>> @@ -16,6 +16,7 @@
>> struct rxe_ns_sock {
>> struct sock __rcu *rxe_sk4;
>> struct sock __rcu *rxe_sk6;
>> + struct mutex release_lock;
>
> This change renders the existing rcu_read_lock() and rcu_read_unlock()
> calls unnecessary.
Thanks, Leon. I fully agree with you.
In the next version, I will remove the existing rcu locks.
Zhu Yanjun
>
> Thanks