Linux RDMA and InfiniBand development
* [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup
@ 2026-04-24  4:35 Zhu Yanjun
  2026-04-27 12:35 ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: Zhu Yanjun @ 2026-04-24  4:35 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma, yanjun.zhu

All the sockets are created by the rdma link create command and
destroyed by the rdma link delete command, so keeping
udp_tunnel_sock_release() in rxe_ns_exit() risks a double free if
the namespace and the device are cleaned up simultaneously.

Fixes: 13f2a53c2a71 ("RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe_ns.c | 20 --------------------
 1 file changed, 20 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_ns.c b/drivers/infiniband/sw/rxe/rxe_ns.c
index 8b9d734229b2..53add78b8e3a 100644
--- a/drivers/infiniband/sw/rxe/rxe_ns.c
+++ b/drivers/infiniband/sw/rxe/rxe_ns.c
@@ -39,26 +39,6 @@ static void rxe_ns_exit(struct net *net)
 {
 	/* called when the network namespace is removed
 	 */
-	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
-	struct sock *sk;
-
-	rcu_read_lock();
-	sk = rcu_dereference(ns_sk->rxe_sk4);
-	rcu_read_unlock();
-	if (sk) {
-		rcu_assign_pointer(ns_sk->rxe_sk4, NULL);
-		udp_tunnel_sock_release(sk->sk_socket);
-	}
-
-#if IS_ENABLED(CONFIG_IPV6)
-	rcu_read_lock();
-	sk = rcu_dereference(ns_sk->rxe_sk6);
-	rcu_read_unlock();
-	if (sk) {
-		rcu_assign_pointer(ns_sk->rxe_sk6, NULL);
-		udp_tunnel_sock_release(sk->sk_socket);
-	}
-#endif
 }
 
 /*
-- 
2.43.0



* Re: [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup
  2026-04-24  4:35 [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup Zhu Yanjun
@ 2026-04-27 12:35 ` Leon Romanovsky
  2026-04-27 20:52   ` yanjun.zhu
  0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2026-04-27 12:35 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: zyjzyj2000, jgg, linux-rdma

On Fri, Apr 24, 2026 at 06:35:22AM +0200, Zhu Yanjun wrote:
> Since all the sockets are created in rdma link create command
> and destroyed in rdma link delete command, keeping
> udp_tunnel_sock_release in rxe_ns_exit risks a "double-free" if
> the namespace and the device are being cleaned up simultaneously.

Please add a ladder diagram to clarify how it can be possible.

Thanks


* Re: [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup
  2026-04-27 12:35 ` Leon Romanovsky
@ 2026-04-27 20:52   ` yanjun.zhu
  2026-04-28 14:26     ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: yanjun.zhu @ 2026-04-27 20:52 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: zyjzyj2000, jgg, linux-rdma

On 4/27/26 5:35 AM, Leon Romanovsky wrote:
> On Fri, Apr 24, 2026 at 06:35:22AM +0200, Zhu Yanjun wrote:
>> Since all the sockets are created in rdma link create command
>> and destroyed in rdma link delete command, keeping
>> udp_tunnel_sock_release in rxe_ns_exit risks a "double-free" if
>> the namespace and the device are being cleaned up simultaneously.
> 
> Please add a ladder diagram to clarify how it can be possible.

Hi, Leon

The double-free occurs as follows:

CPU 0 (Net NameSpace cleanup)        CPU 1 (RDMA device removal)
---------------------                ---------------------------
rxe_ns_exit()                        rxe_link_delete() (rdma link del)
   -> sk = ns_sk->rxe_sk4               -> sk = ns_sk->rxe_sk4
   -> udp_tunnel_sock_release(sk)
      [Success: First Free]             -> udp_tunnel_sock_release(sk)
                                           [Crash: Double Free]

After removing the socket release logic from rxe_ns_exit(), we ensure
that only the device destruction path (rxe_link_delete) is responsible
for freeing the tunnel sockets, effectively eliminating the double-free 
problem.
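
For illustration only, the race in the diagram comes from both paths reading the same pointer before either one clears it. The patch avoids this by leaving a single path responsible for the release. A different way to get the same guarantee, shown here as a userspace C11 model and not as what the patch does, is to claim the pointer atomically so that only one caller ever sees it. All names below (fake_sock, fake_ns_sock, claim_and_release, run_both_paths) are hypothetical stand-ins; in kernel code the comparable primitive would be xchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the tunnel socket; not a kernel type. */
struct fake_sock {
	int release_count;	/* incremented on every release call */
};

/* Hypothetical stand-in for struct rxe_ns_sock, IPv4 slot only. */
struct fake_ns_sock {
	_Atomic(struct fake_sock *) sk4;
};

/* Models udp_tunnel_sock_release(); a second call means a double free. */
static void fake_sock_release(struct fake_sock *sk)
{
	sk->release_count++;
}

/*
 * Swap NULL into the slot and release only if a socket came back.
 * Whichever path runs second sees NULL and does nothing.
 * Returns 1 if this caller released the socket, 0 otherwise.
 */
static int claim_and_release(struct fake_ns_sock *ns)
{
	struct fake_sock *sk = atomic_exchange(&ns->sk4,
					       (struct fake_sock *)NULL);

	if (!sk)
		return 0;
	fake_sock_release(sk);
	return 1;
}

/* Run both teardown paths against one socket; returns the release count. */
static int run_both_paths(void)
{
	struct fake_sock sk = { .release_count = 0 };
	struct fake_ns_sock ns;
	int released;

	atomic_init(&ns.sk4, &sk);
	released = claim_and_release(&ns);	/* e.g. the namespace exit path */
	released += claim_and_release(&ns);	/* e.g. the link delete path */
	assert(released == 1);			/* exactly one path owned the socket */
	return sk.release_count;
}
```

In this model the order of the two calls does not matter: the first caller releases the socket and the second finds the slot empty.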

I am not sure if I should put the above into the commit log.

Thanks a lot.

Zhu Yanjun


* Re: [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup
  2026-04-27 20:52   ` yanjun.zhu
@ 2026-04-28 14:26     ` Leon Romanovsky
  2026-04-29 13:49       ` Zhu Yanjun
  0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2026-04-28 14:26 UTC (permalink / raw)
  To: yanjun.zhu; +Cc: zyjzyj2000, jgg, linux-rdma

On Mon, Apr 27, 2026 at 01:52:17PM -0700, yanjun.zhu wrote:
> On 4/27/26 5:35 AM, Leon Romanovsky wrote:
> > On Fri, Apr 24, 2026 at 06:35:22AM +0200, Zhu Yanjun wrote:
> > > Since all the sockets are created in rdma link create command
> > > and destroyed in rdma link delete command, keeping
> > > udp_tunnel_sock_release in rxe_ns_exit risks a "double-free" if
> > > the namespace and the device are being cleaned up simultaneously.
> > 
> > Please add a ladder diagram to clarify how it can be possible.
> 
> Hi, Leon
> 
> The double-free occurs as follows:
> 
> CPU 0 (Net NameSpace cleanup)        CPU 1 (RDMA device removal)
> ---------------------                ---------------------------
> rxe_ns_exit()                        rxe_link_delete() (rdma link del )
>   -> sk = ns_sk->rxe_sk4               -> sk = ns_sk->rxe_sk4
>   -> udp_tunnel_sock_release(sk)
>      [Success: First Free]             -> udp_tunnel_sock_release(sk)
>                                           [Crash: Double Free]
> 
> After removing the socket release logic from rxe_ns_exit(), we ensure
> that only the device destruction path (rxe_link_delete) is responsible
> for freeing the tunnel sockets, effectively eliminating the double-free
> problem.

I think it is possible to call rxe_ns_exit() without invoking
rxe_link_delete(), and in that case the UDP socket will not be
destroyed.
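
A minimal userspace model of this concern, assuming one socket per namespace and using hypothetical names throughout (ns_model, model_link_create, model_link_delete, model_ns_exit): it counts the sockets left open after teardown, with and without the release logic in the namespace exit hook. It is a sketch of the scenario, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one namespace; not kernel code. */
struct ns_model {
	int open_sockets;	/* tunnel sockets currently allocated */
	bool ns_exit_releases;	/* true: the exit hook still releases sockets */
};

/* Models the rdma link create command: the tunnel socket is created. */
static void model_link_create(struct ns_model *m)
{
	m->open_sockets++;
}

/* Models the rdma link delete command: the device path releases it. */
static void model_link_delete(struct ns_model *m)
{
	if (m->open_sockets > 0)
		m->open_sockets--;
}

/* Models rxe_ns_exit(), with or without the release logic. */
static void model_ns_exit(struct ns_model *m)
{
	if (m->ns_exit_releases && m->open_sockets > 0)
		m->open_sockets--;
	assert(m->open_sockets >= 0);
}

/* Namespace exit with no preceding link delete; returns leaked sockets. */
static int leaked_without_link_delete(bool ns_exit_releases)
{
	struct ns_model m = {
		.open_sockets = 0,
		.ns_exit_releases = ns_exit_releases,
	};

	model_link_create(&m);
	model_ns_exit(&m);
	return m.open_sockets;
}

/* Usual order: link delete first, then namespace exit. */
static int leaked_with_link_delete(bool ns_exit_releases)
{
	struct ns_model m = {
		.open_sockets = 0,
		.ns_exit_releases = ns_exit_releases,
	};

	model_link_create(&m);
	model_link_delete(&m);
	model_ns_exit(&m);
	return m.open_sockets;
}
```

With the release logic removed, leaked_without_link_delete(false) leaves one socket open in this model, while the usual order leaves none either way.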

Thanks


* Re: [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup
  2026-04-28 14:26     ` Leon Romanovsky
@ 2026-04-29 13:49       ` Zhu Yanjun
  2026-04-29 23:31         ` yanjun.zhu
  0 siblings, 1 reply; 8+ messages in thread
From: Zhu Yanjun @ 2026-04-29 13:49 UTC (permalink / raw)
  To: Leon Romanovsky, yanjun.zhu@linux.dev; +Cc: zyjzyj2000, jgg, linux-rdma

On 2026/4/28 7:26, Leon Romanovsky wrote:
> On Mon, Apr 27, 2026 at 01:52:17PM -0700, yanjun.zhu wrote:
>> On 4/27/26 5:35 AM, Leon Romanovsky wrote:
>>> On Fri, Apr 24, 2026 at 06:35:22AM +0200, Zhu Yanjun wrote:
>>>> Since all the sockets are created in rdma link create command
>>>> and destroyed in rdma link delete command, keeping
>>>> udp_tunnel_sock_release in rxe_ns_exit risks a "double-free" if
>>>> the namespace and the device are being cleaned up simultaneously.
>>>
>>> Please add a ladder diagram to clarify how it can be possible.
>>
>> Hi, Leon
>>
>> The double-free occurs as follows:
>>
>> CPU 0 (Net NameSpace cleanup)        CPU 1 (RDMA device removal)
>> ---------------------                ---------------------------
>> rxe_ns_exit()                        rxe_link_delete() (rdma link del )
>>    -> sk = ns_sk->rxe_sk4               -> sk = ns_sk->rxe_sk4
>>    -> udp_tunnel_sock_release(sk)
>>       [Success: First Free]             -> udp_tunnel_sock_release(sk)
>>                                            [Crash: Double Free]
>>
>> After removing the socket release logic from rxe_ns_exit(), we ensure
>> that only the device destruction path (rxe_link_delete) is responsible
>> for freeing the tunnel sockets, effectively eliminating the double-free
>> problem.
> 
> I think it is possible to call rxe_ns_exit() without invoking
> rxe_link_delete(), and in that case the UDP socket will not be
> destroyed.

Thanks, my bad. I missed this scenario.

Zhu Yanjun


* Re: [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup
  2026-04-29 13:49       ` Zhu Yanjun
@ 2026-04-29 23:31         ` yanjun.zhu
  2026-05-11 12:37           ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: yanjun.zhu @ 2026-04-29 23:31 UTC (permalink / raw)
  To: Leon Romanovsky, Zhu Yanjun; +Cc: zyjzyj2000, jgg, linux-rdma

On 4/29/26 6:49 AM, Zhu Yanjun wrote:
> On 2026/4/28 7:26, Leon Romanovsky wrote:
>> On Mon, Apr 27, 2026 at 01:52:17PM -0700, yanjun.zhu wrote:
>>> On 4/27/26 5:35 AM, Leon Romanovsky wrote:
>>>> On Fri, Apr 24, 2026 at 06:35:22AM +0200, Zhu Yanjun wrote:
>>>>> Since all the sockets are created in rdma link create command
>>>>> and destroyed in rdma link delete command, keeping
>>>>> udp_tunnel_sock_release in rxe_ns_exit risks a "double-free" if
>>>>> the namespace and the device are being cleaned up simultaneously.
>>>>
>>>> Please add a ladder diagram to clarify how it can be possible.
>>>
>>> Hi, Leon
>>>
>>> The double-free occurs as follows:
>>>
>>> CPU 0 (Net NameSpace cleanup)        CPU 1 (RDMA device removal)
>>> ---------------------                ---------------------------
>>> rxe_ns_exit()                        rxe_link_delete() (rdma link del )
>>>    -> sk = ns_sk->rxe_sk4               -> sk = ns_sk->rxe_sk4
>>>    -> udp_tunnel_sock_release(sk)
>>>       [Success: First Free]             -> udp_tunnel_sock_release(sk)
>>>                                            [Crash: Double Free]
>>>
>>> After removing the socket release logic from rxe_ns_exit(), we ensure
>>> that only the device destruction path (rxe_link_delete) is responsible
>>> for freeing the tunnel sockets, effectively eliminating the double-free
>>> problem.
>>
>> I think it is possible to call rxe_ns_exit() without invoking
>> rxe_link_delete(), and in that case the UDP socket will not be
>> destroyed.
> 
> Thanks, my bad. I missed this scenario.
> 
> Zhu Yanjun
> 
>>
>> Thanks
>>
>>>
>>> I am not sure if I should put the above into the commit log.
>>>
>>> Thanks a lot.

Hi, Leon

I have performed further tests to verify the execution order and the 
necessity of the cleanup code in rxe_ns_exit().

My findings show that a double-free race condition is unlikely because 
of how the kernel manages namespace references:

Reference Dependency: The RXE RDMA link holds a reference to the network 
namespace.

Order of Execution: When a namespace is deleted while an RDMA link 
exists, rxe_ns_exit() is not invoked immediately. It is deferred until 
the RDMA link itself is deleted (e.g., via rdma link del), which drops 
the final reference count of the namespace.

Redundancy: Consequently, rxe_ns_exit() always follows the device 
cleanup path (rxe_link_delete). Since all tunnel sockets are already 
released during the device cleanup, the code in rxe_ns_exit() is 
redundant and does nothing.
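
The ordering described above can be sketched as a small userspace model. It assumes exactly the behavior claimed here: the link takes a reference on the namespace, and the exit hook runs only when the last reference is dropped. The names (net_model, ref_get, ref_put, ref_link_create, ref_link_delete) are hypothetical and the code does not reproduce the kernel's actual reference handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a network namespace; not a kernel type. */
struct net_model {
	int refcount;		/* references keeping the namespace alive */
	int open_sockets;	/* tunnel sockets owned by the link */
	bool exit_ran;		/* set once the exit hook has run */
	int sockets_at_exit;	/* sockets still open when the hook ran */
};

/* Models rxe_ns_exit(): record what is left to clean up. */
static void ref_ns_exit(struct net_model *net)
{
	net->exit_ran = true;
	net->sockets_at_exit = net->open_sockets;
}

static void ref_get(struct net_model *net)
{
	net->refcount++;
}

/* The exit hook runs only when the last reference is dropped. */
static void ref_put(struct net_model *net)
{
	assert(net->refcount > 0);
	if (--net->refcount == 0)
		ref_ns_exit(net);
}

/* Link creation pins the namespace and creates the socket. */
static void ref_link_create(struct net_model *net)
{
	ref_get(net);
	net->open_sockets++;
}

/* Link deletion releases the socket, then drops its reference. */
static void ref_link_delete(struct net_model *net)
{
	net->open_sockets--;
	ref_put(net);
}

/* Delete the namespace first, then the link; return the final state. */
static struct net_model run_delete_ns_then_link(void)
{
	struct net_model net = { .refcount = 1 };	/* the user's reference */

	ref_link_create(&net);
	ref_put(&net);			/* namespace deleted; link still pins it */
	assert(!net.exit_ran);		/* exit hook is deferred */
	ref_link_delete(&net);		/* last reference dropped here */
	return net;
}
```

Under these assumptions the exit hook always finds zero sockets. The model says nothing about teardown paths that do not go through link deletion.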

Removing this code simplifies the driver by centralizing socket 
destruction in the device management path, where the sockets are 
originally created.

This ensures that we don't attempt to release the same resources twice, 
even if the destruction is technically serialized by the kernel's 
reference counting.

What are your thoughts on this observation?

Thanks,

Zhu Yanjun


* Re: [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup
  2026-04-29 23:31         ` yanjun.zhu
@ 2026-05-11 12:37           ` Leon Romanovsky
  2026-05-12  3:35             ` yanjun.zhu
  0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2026-05-11 12:37 UTC (permalink / raw)
  To: yanjun.zhu; +Cc: zyjzyj2000, jgg, linux-rdma

On Wed, Apr 29, 2026 at 04:31:48PM -0700, yanjun.zhu wrote:
> On 4/29/26 6:49 AM, Zhu Yanjun wrote:
> > On 2026/4/28 7:26, Leon Romanovsky wrote:
> > > On Mon, Apr 27, 2026 at 01:52:17PM -0700, yanjun.zhu wrote:
> > > > On 4/27/26 5:35 AM, Leon Romanovsky wrote:
> > > > > On Fri, Apr 24, 2026 at 06:35:22AM +0200, Zhu Yanjun wrote:
> > > > > > Since all the sockets are created in rdma link create command
> > > > > > and destroyed in rdma link delete command, keeping
> > > > > > udp_tunnel_sock_release in rxe_ns_exit risks a "double-free" if
> > > > > > the namespace and the device are being cleaned up simultaneously.
> > > > > 
> > > > > Please add a ladder diagram to clarify how it can be possible.
> > > > 
> > > > Hi, Leon
> > > > 
> > > > The double-free occurs as follows:
> > > > 
> > > > CPU 0 (Net NameSpace cleanup)        CPU 1 (RDMA device removal)
> > > > ---------------------                ---------------------------
> > > > rxe_ns_exit()                        rxe_link_delete() (rdma link del )
> > > >    -> sk = ns_sk->rxe_sk4               -> sk = ns_sk->rxe_sk4
> > > >    -> udp_tunnel_sock_release(sk)
> > > >       [Success: First Free]             -> udp_tunnel_sock_release(sk)
> > > >                                            [Crash: Double Free]
> > > > 
> > > > After removing the socket release logic from rxe_ns_exit(), we ensure
> > > > that only the device destruction path (rxe_link_delete) is responsible
> > > > for freeing the tunnel sockets, effectively eliminating the double-free
> > > > problem.
> > > 
> > > I think it is possible to call rxe_ns_exit() without invoking
> > > rxe_link_delete(), and in that case the UDP socket will not be
> > > destroyed.
> > 
> > Thanks, my bad. I missed this scenario.
> > 
> > Zhu Yanjun
> > 
> > > 
> > > Thanks
> > > 
> > > > 
> > > > I am not sure if I should put the above into the commit log.
> > > > 
> > > > Thanks a lot.
> 
> Hi, Leon
> 
> I have performed further tests to verify the execution order and the
> necessity of the cleanup code in rxe_ns_exit().
> 
> My findings show that a double-free race condition is unlikely because of
> how the kernel manages namespace references:
> 
> Reference Dependency: The RXE RDMA link holds a reference to the network
> namespace.
> 
> Order of Execution: When a namespace is deleted while an RDMA link exists,
> rxe_ns_exit() is not invoked immediately. It is deferred until the RDMA link
> itself is deleted (e.g., via rdma link del), which drops the final reference
> count of the namespace.

AFAIK, we've seen syzkaller reports where "rdma link del" was never invoked,
yet RXE was removed regardless. Is it possible?

Thanks


* Re: [PATCH 1/1] RDMA/rxe: Fix unsafe socket release during namespace cleanup
  2026-05-11 12:37           ` Leon Romanovsky
@ 2026-05-12  3:35             ` yanjun.zhu
  0 siblings, 0 replies; 8+ messages in thread
From: yanjun.zhu @ 2026-05-12  3:35 UTC (permalink / raw)
  To: Leon Romanovsky, Zhu Yanjun; +Cc: zyjzyj2000, jgg, linux-rdma

On 5/11/26 5:37 AM, Leon Romanovsky wrote:
> On Wed, Apr 29, 2026 at 04:31:48PM -0700, yanjun.zhu wrote:
>> Hi, Leon
>>
>> I have performed further tests to verify the execution order and the
>> necessity of the cleanup code in rxe_ns_exit().
>>
>> My findings show that a double-free race condition is unlikely because of
>> how the kernel manages namespace references:
>>
>> Reference Dependency: The RXE RDMA link holds a reference to the network
>> namespace.
>>
>> Order of Execution: When a namespace is deleted while an RDMA link exists,
>> rxe_ns_exit() is not invoked immediately. It is deferred until the RDMA link
>> itself is deleted (e.g., via rdma link del), which drops the final reference
>> count of the namespace.
> 
> AFAIC, we've seen syzkaller reports where "rdma link del" was never invoked,
> yet RXE was removed regardless. Is it possible?

Hi Leon,

Thanks for your feedback.

Regarding the case where "rdma link del" is not invoked, I’d like to 
share my observations on the namespace cleanup flow for RXE:

In my tests, when a network namespace is deleted (e.g., ip netns del), 
the RXE device (and its underlying net_device) is typically moved back 
to the init_net rather than being destroyed immediately. This is why the 
RDMA links still show up in init_net and the namespace reference count 
remains held.

As long as the device exists (even if moved), the resource cleanup is 
managed by the device's lifecycle. The rxe_ns_exit() only gets called 
when the last reference to that netns is dropped, which usually happens 
after the RXE device itself is finally deleted.

I haven't personally encountered the syzkaller reports where this logic 
fails or leads to a leak/crash. If possible, could you please share the 
specific syzkaller logs or the link to the report? It would be very 
helpful for me to understand if there is a specific corner case (e.g., 
driver unloading or abnormal netns teardown) where the per-net cleanup 
acts as a necessary "safety net."

If such a race exists, I will reconsider whether to keep the cleanup 
code or move it to a more robust location.

Best regards,
Yanjun Zhu

> 
> Thanks

