public inbox for linux-rdma@vger.kernel.org
From: "yanjun.zhu" <yanjun.zhu@linux.dev>
To: Kuniyuki Iwashima <kuniyu@google.com>, Zhu Yanjun <yanjun.zhu@linux.dev>
Cc: David Ahern <dsahern@kernel.org>,
	Zhu Yanjun <zyjzyj2000@gmail.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Leon Romanovsky <leon@kernel.org>,
	Kuniyuki Iwashima <kuni1840@gmail.com>,
	linux-rdma@vger.kernel.org,
	syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
Subject: Re: [PATCH v2 1/2] RDMA/rxe: Fix null-ptr-deref in kernel_sock_shutdown().
Date: Tue, 28 Apr 2026 09:56:26 -0700
Message-ID: <0e05de34-79a0-415f-afb2-cc6c194ad87d@linux.dev>
In-Reply-To: <CAAVpQUBS0aeCEUK2Nvkq_9NqePiTaLoVQ5T4V8gPiJpbvDYj8Q@mail.gmail.com>

On 4/27/26 11:39 PM, Kuniyuki Iwashima wrote:
> On Mon, Apr 27, 2026 at 11:30 PM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>
>> On 2026/4/27 22:22, Kuniyuki Iwashima wrote:
>>> On Mon, Apr 27, 2026 at 10:12 PM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>>>
>>>>
>>>>
>>>> On 2026/4/27 19:15, Zhu Yanjun wrote:
>>>>>
>>>>> On 2026/4/27 17:58, David Ahern wrote:
>>>>>> On 4/27/26 6:52 PM, Kuniyuki Iwashima wrote:
>>>>>>> To be clear, you meant implementing David's idea, right?
>>>>>>> I'm asking because dellink won't need locking then.
>>>>>> dellink is not needed with my suggestion. It was added to manage
>>>>>> basically a refcount on the socket to close on last rxe delete in the
>>>>>
>>>>> This is my original implementation.
>>>>>
>>>>> @Kuniyuki Iwashima, can you reproduce this problem on your local host
>>>>> or in another test environment?
>>>
>>> The syzbot does not have a repro, but I think it can be
>>> reproduced by calling newlink and dellink with multiple
>>> threads.
>>>
>>> newlink would trigger a kmemleak splat, while dellink would trigger
>>> a KASAN splat.
>>>
>>>
>>>>>
>>>>> If yes, can you test after applying the commit at this link:
>>>>> https://patchwork.kernel.org/project/linux-rdma/patch/20260424043522.22901-1-yanjun.zhu@linux.dev/
>>>>>
>>>>> Thanks a lot.
>>>>
>>>> Hi, David && Kuniyuki
>>>>
>>>> I read the call trace again.
>>>>
>>>> If the net namespace has already released the socket in thread A, and
>>>> the rdma link del command is then called in thread B to release the
>>>> socket, then thread A releases the socket first and thread B releases
>>>> it again.
>>>>
>>>> A similar call trace would appear.
>>>>
>>>> The following is the explanation from the commit
>>>> https://patchwork.kernel.org/project/linux-rdma/patch/20260424043522.22901-1-yanjun.zhu@linux.dev/
>>>>
>>>> The double-free occurs as follows:
>>>>
>>>> CPU 0 (Net NameSpace cleanup)        CPU 1 (RDMA device removal)
>>>> ---------------------                ---------------------------
>>>> rxe_ns_exit()                        rxe_link_delete() (rdma link del )
>>>
>>> If rxe_link_delete() is in progress, it means the user thread is
>>> alive, holding the netns refcount, and rxe_ns_exit() cannot be
>>> called.
>>>
>>> So, dellink() never races with rxe_ns_exit(), and it races only
>>> with the concurrent dellink().
>>>
>>> And when that occurs, the number of threads is not limited to
>>> two, so triple-free, quad-free, ... are theoretically possible.
>>
>> Thread 1: rdma link del          Thread 2: rdma link del
>>        (User A calls dellink)           (User B calls dellink)
>>                 |                                 |
>>         (1) Get Socket Pointer            (2) Get Socket Pointer
>>             sk = ns_sk->rxe_sk4               sk = ns_sk->rxe_sk4
>>                 |                                 |
>>         (3) Release Socket                (4) Release Socket
>>             udp_tunnel_sock_release(sk)       udp_tunnel_sock_release(sk)
>>                 |                                 |
>>           [ FIRST FREE ]                          |
>>                 |                          [ DOUBLE FREE! ]
>>                 v                                 v
>>           (Memory freed)                  (Kernel Panic / Crash)
>>
>> I think the above illustrates your point. If so, your solution of adding
>> a per-netns mutex for synchronisation makes sense.
>>
>> Let us use the first solution
>> https://lore.kernel.org/all/20260424013759.728288-1-kuniyu@google.com/
>>
>> BTW: 1) add mutex_destroy(); 2) take rdma link add into account.
>>
>> I am not sure if it is OK or not. @David Ahern
> 
> No, newlink is still racy and the same kind of race leaks
> the udp tunnel.
> 
> If we defer allocation, there are two options:
> 
> 1. David's idea, allocate on first use, and no free
>    until netns destruction (newlink can add a fast path
>    like check the pointer and only take mutex when it's
>    NULL, and check again under mutex and allocate a
>    tunnel if not yet allocated)
> 
> 2. Manage refcount properly.  (If we allocate a dedicated
>    refcount for each tunnel socket in rxe_ns_sock, we
>    can implement a similar fast path for newlink, and dellink
>    will be lockless thanks to atomic)
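
For reference, the fast path described in Option 1 could look roughly
like the user-space sketch below. It is only an illustration under
stated assumptions: pthreads and C11 atomics stand in for the kernel
primitives, and struct tunnel, struct ns_state, ns_get_tunnel() and
ns_exit() are hypothetical names, not the rxe code.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-netns UDP tunnel socket. */
struct tunnel {
	int fd;
};

/* Hypothetical per-netns state. */
struct ns_state {
	pthread_mutex_t lock;		/* serialises first-time allocation */
	_Atomic(struct tunnel *) sk;	/* NULL until first use */
	int alloc_count;		/* instrumentation, written under lock */
};

static void ns_init(struct ns_state *ns)
{
	pthread_mutex_init(&ns->lock, NULL);
	atomic_init(&ns->sk, NULL);
	ns->alloc_count = 0;
}

/* newlink path: allocate on first use, never free until ns_exit(). */
static struct tunnel *ns_get_tunnel(struct ns_state *ns)
{
	/* Fast path: the pointer is already published, no lock needed. */
	struct tunnel *t = atomic_load_explicit(&ns->sk, memory_order_acquire);

	if (t)
		return t;

	/* Slow path: check again under the mutex before allocating. */
	pthread_mutex_lock(&ns->lock);
	t = atomic_load_explicit(&ns->sk, memory_order_relaxed);
	if (!t) {
		t = calloc(1, sizeof(*t));
		if (t) {
			ns->alloc_count++;
			atomic_store_explicit(&ns->sk, t, memory_order_release);
		}
	}
	pthread_mutex_unlock(&ns->lock);
	return t;
}

/* netns destruction: the only place the tunnel is released. */
static void ns_exit(struct ns_state *ns)
{
	free(atomic_exchange(&ns->sk, NULL));
	pthread_mutex_destroy(&ns->lock);
}
```

In this sketch dellink has nothing to release, so concurrent dellink
calls cannot double-free the tunnel.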

I suggest we stick with Option 2 (proper refcounting) but move away from 
a purely lockless dellink. By protecting the tunnel destruction with a 
mutex, we can effectively close the race window and ensure the UDP 
tunnel is cleaned up reliably without compromising the efficiency of the 
fast path in newlink.
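
A rough user-space sketch of what I mean, assuming pthreads and C11
atomics in place of the kernel primitives. All names here (struct
ns_sock, ns_sock_get(), ns_sock_put(), and the two ref_* helpers) are
hypothetical and only for illustration; in the kernel, something like
refcount_dec_and_mutex_lock() has a similar shape.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the UDP tunnel socket. */
struct tunnel {
	int fd;
};

/* Hypothetical per-netns state with a dedicated refcount. */
struct ns_sock {
	pthread_mutex_t lock;	/* protects allocation and destruction of sk */
	atomic_int refcnt;	/* number of links currently using sk */
	struct tunnel *sk;	/* written only under lock */
	int alloc_count;	/* instrumentation, written under lock */
	int free_count;		/* instrumentation, written under lock */
};

static void ns_sock_init(struct ns_sock *ns)
{
	pthread_mutex_init(&ns->lock, NULL);
	atomic_init(&ns->refcnt, 0);
	ns->sk = NULL;
	ns->alloc_count = 0;
	ns->free_count = 0;
}

/* Take a reference unless the count is zero. */
static bool ref_inc_not_zero(atomic_int *r)
{
	int old = atomic_load(r);

	while (old != 0)
		if (atomic_compare_exchange_weak(r, &old, old + 1))
			return true;
	return false;
}

/* Drop a reference unless it is the last one. */
static bool ref_dec_not_one(atomic_int *r)
{
	int old = atomic_load(r);

	while (old > 1)
		if (atomic_compare_exchange_weak(r, &old, old - 1))
			return true;
	return false;
}

/* newlink: lockless when the tunnel already exists. */
static int ns_sock_get(struct ns_sock *ns)
{
	int ret = 0;

	if (ref_inc_not_zero(&ns->refcnt))
		return 0;

	pthread_mutex_lock(&ns->lock);
	/* Check again: another thread may have allocated meanwhile. */
	if (!ref_inc_not_zero(&ns->refcnt)) {
		ns->sk = calloc(1, sizeof(*ns->sk));
		if (ns->sk) {
			ns->alloc_count++;
			atomic_store(&ns->refcnt, 1);
		} else {
			ret = -1;
		}
	}
	pthread_mutex_unlock(&ns->lock);
	return ret;
}

/* dellink: only the final put takes the mutex and destroys the tunnel.
 * Callers must balance every successful get with exactly one put.
 */
static void ns_sock_put(struct ns_sock *ns)
{
	if (ref_dec_not_one(&ns->refcnt))
		return;

	pthread_mutex_lock(&ns->lock);
	if (atomic_fetch_sub(&ns->refcnt, 1) == 1) {
		free(ns->sk);
		ns->sk = NULL;
		ns->free_count++;
	}
	pthread_mutex_unlock(&ns->lock);
}
```

Because the last decrement and the free happen under the same mutex that
newlink takes on its slow path, a concurrent newlink either bumps the
count before it reaches zero or waits and allocates a fresh tunnel.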

Zhu Yanjun

Thread overview:
2026-04-25  6:04 [PATCH v2 0/2] RDMA/rxe: Fix per-netns UDP tunnel issues Kuniyuki Iwashima
2026-04-25  6:04 ` [PATCH v2 1/2] RDMA/rxe: Fix null-ptr-deref in kernel_sock_shutdown() Kuniyuki Iwashima
2026-04-25 15:47   ` David Ahern
2026-04-25 20:55     ` Kuniyuki Iwashima
2026-04-26 16:40       ` David Ahern
2026-04-25 21:25   ` Zhu Yanjun
2026-04-26 16:42     ` David Ahern
2026-04-27  2:57       ` Zhu Yanjun
2026-04-27  3:10         ` Kuniyuki Iwashima
2026-04-27  3:53           ` Zhu Yanjun
2026-04-27 14:38             ` David Ahern
2026-04-27 20:20               ` yanjun.zhu
2026-04-28  0:52                 ` Kuniyuki Iwashima
2026-04-28  0:58                   ` David Ahern
2026-04-28  2:15                     ` Zhu Yanjun
2026-04-28  5:12                       ` Zhu Yanjun
2026-04-28  5:22                         ` Kuniyuki Iwashima
2026-04-28  6:30                           ` Zhu Yanjun
2026-04-28  6:39                             ` Kuniyuki Iwashima
2026-04-28 16:56                               ` yanjun.zhu [this message]
2026-04-25  6:04 ` [PATCH v2 2/2] RDMA/rxe: Fix up RCU usage for rxe_ns_pernet_sk6() Kuniyuki Iwashima
2026-04-25 21:26   ` Zhu Yanjun