From: "yanjun.zhu" <yanjun.zhu@linux.dev>
To: Kuniyuki Iwashima <kuniyu@google.com>, Zhu Yanjun <yanjun.zhu@linux.dev>
Cc: David Ahern <dsahern@kernel.org>,
Zhu Yanjun <zyjzyj2000@gmail.com>, Jason Gunthorpe <jgg@ziepe.ca>,
Leon Romanovsky <leon@kernel.org>,
Kuniyuki Iwashima <kuni1840@gmail.com>,
linux-rdma@vger.kernel.org,
syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
Subject: Re: [PATCH v2 1/2] RDMA/rxe: Fix null-ptr-deref in kernel_sock_shutdown().
Date: Tue, 28 Apr 2026 09:56:26 -0700
Message-ID: <0e05de34-79a0-415f-afb2-cc6c194ad87d@linux.dev>
In-Reply-To: <CAAVpQUBS0aeCEUK2Nvkq_9NqePiTaLoVQ5T4V8gPiJpbvDYj8Q@mail.gmail.com>
On 4/27/26 11:39 PM, Kuniyuki Iwashima wrote:
> On Mon, Apr 27, 2026 at 11:30 PM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>
>> On 2026/4/27 22:22, Kuniyuki Iwashima wrote:
>>> On Mon, Apr 27, 2026 at 10:12 PM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>>>
>>>>
>>>>
>>>> On 2026/4/27 19:15, Zhu Yanjun wrote:
>>>>>
>>>>> On 2026/4/27 17:58, David Ahern wrote:
>>>>>> On 4/27/26 6:52 PM, Kuniyuki Iwashima wrote:
>>>>>>> To be clear, you meant implementing David's idea, right?
>>>>>>> I'm asking because dellink won't need locking then.
>>>>>> dellink is not needed with my suggestion. It was added to manage
>>>>>> basically a refcount on the socket to close on last rxe delete in the
>>>>>
>>>>> This is my original implementation.
>>>>>
>>>>> @Kuniyuki Iwashima, can you reproduce this problem in your local host or
>>>>> other test environments?
>>>
>>> The syzbot does not have a repro, but I think it can be
>>> reproduced by calling newlink and dellink with multiple
>>> threads.
>>>
>>> newlink would trigger a kmemleak splat, while dellink would
>>> trigger a KASAN splat.
>>>
>>>
>>>>>
>>>>> If yes, can you test again after applying the commit in this link:
>>>>> https://patchwork.kernel.org/project/linux-rdma/patch/20260424043522.22901-1-yanjun.zhu@linux.dev/
>>>>>
>>>>> Thanks a lot.
>>>>
>>>> Hi, David && Kuniyuki
>>>>
>>>> I read the call trace again.
>>>>
>>>> If the net namespace has already released the socket in thread A, and
>>>> the rdma link del command is then called in thread B, thread B tries to
>>>> release the same socket.
>>>>
>>>> So thread A releases the socket first, and then thread B releases it
>>>> again.
>>>>
>>>> A similar call trace would appear.
>>>>
>>>> The following is the explanation for the commit
>>>> https://patchwork.kernel.org/project/linux-rdma/patch/20260424043522.22901-1-yanjun.zhu@linux.dev/
>>>>
>>>> The double-free occurs as follows:
>>>>
>>>> CPU 0 (Net NameSpace cleanup)    CPU 1 (RDMA device removal)
>>>> -----------------------------    ---------------------------
>>>> rxe_ns_exit()                    rxe_link_delete() (rdma link del)
>>>
>>> If rxe_link_delete() is in progress, it means the user thread is
>>> alive, holding the netns refcount, and rxe_ns_exit() cannot be
>>> called.
>>>
>>> So, dellink() never races with rxe_ns_exit(), and it races only
>>> with the concurrent dellink().
>>>
>>> And when that occurs, the number of threads is not limited to
>>> two, theoretically triple-free, quad-free, ... are possible.
>>
>> Thread 1: rdma link del              Thread 2: rdma link del
>> (User A calls dellink)               (User B calls dellink)
>>          |                                    |
>> (1) Get Socket Pointer               (2) Get Socket Pointer
>>     sk = ns_sk->rxe_sk4                  sk = ns_sk->rxe_sk4
>>          |                                    |
>> (3) Release Socket                   (4) Release Socket
>>     udp_tunnel_sock_release(sk)          udp_tunnel_sock_release(sk)
>>          |                                    |
>>   [ FIRST FREE ]                              |
>>          |                             [ DOUBLE FREE! ]
>>          v                                    v
>>   (Memory freed)                       (Kernel Panic / Crash)
>>
>> I think the above illustrates your point. If so, your solution of adding
>> a per-netns mutex to synchronise makes sense.
>>
>> Let us use the first solution
>> https://lore.kernel.org/all/20260424013759.728288-1-kuniyu@google.com/
>>
>> BTW, 1) add mutex_destroy, 2) take rdma link add into account.
>>
>> I am not sure whether that is OK or not. @David Ahern
>
> No, newlink is still racy and the same kind of race leaks
> the udp tunnel.
>
> If we defer allocation, there are two options:
>
> 1. David's idea: allocate on first use, and do not free
>    until netns destruction (newlink can add a fast path:
>    check the pointer and only take the mutex when it's
>    NULL, then check again under the mutex and allocate a
>    tunnel if not yet allocated)
>
> 2. Manage the refcount properly. (If we allocate a dedicated
>    refcount for each tunnel socket in rxe_ns_sock, we
>    can implement a similar fast path for newlink, and dellink
>    will be lockless thanks to atomics)
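
To make the check / lock / re-check fast path of option 1 concrete, here is a minimal userspace sketch using pthreads and C11 atomics. It is only a model of the pattern, not rxe code: struct tunnel, struct ns_sock, ns_get_tunnel() and ns_exit() are hypothetical names, and malloc() stands in for creating the UDP tunnel socket.

```c
/* Userspace model of option 1 (allocate on first use, free only at
 * namespace teardown). All names are hypothetical, not the rxe API. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct tunnel {
	int port;			/* stands in for the UDP tunnel socket */
};

struct ns_sock {
	pthread_mutex_t lock;		/* serialises first-time allocation */
	_Atomic(struct tunnel *) sk4;	/* NULL until the first newlink */
	int allocs;			/* protected by lock, for checking only */
};

static struct ns_sock demo_ns = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Called from the (modelled) newlink path. */
static struct tunnel *ns_get_tunnel(struct ns_sock *ns)
{
	/* Fast path: the tunnel is already published, no lock needed. */
	struct tunnel *t = atomic_load_explicit(&ns->sk4, memory_order_acquire);

	if (t)
		return t;

	pthread_mutex_lock(&ns->lock);
	/* Re-check under the lock: another thread may have won the race. */
	t = atomic_load_explicit(&ns->sk4, memory_order_relaxed);
	if (!t) {
		t = malloc(sizeof(*t));
		if (t) {
			t->port = 4791;	/* RoCEv2 UDP port */
			ns->allocs++;
			/* Publish only after the object is initialised. */
			atomic_store_explicit(&ns->sk4, t, memory_order_release);
		}
	}
	pthread_mutex_unlock(&ns->lock);
	return t;
}

/* Called once at (modelled) netns destruction, when no newlink can run. */
static void ns_exit(struct ns_sock *ns)
{
	struct tunnel *t = atomic_exchange(&ns->sk4, (struct tunnel *)NULL);

	assert(ns->allocs <= 1);	/* at most one tunnel was ever created */
	free(t);
}
```

In this model, concurrent callers of ns_get_tunnel() share a single allocation, and nothing is freed before ns_exit(), so a dellink path has no socket to release and nothing to race on.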
I suggest we stick with Option 2 (proper refcounting) but move away from
a purely lockless dellink. By protecting the tunnel destruction with a
mutex, we can effectively close the race window and ensure the UDP
tunnel is cleaned up reliably without compromising the efficiency of the
fast path in newlink.
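
As a rough illustration of that shape (again a userspace model with pthreads and C11 atomics, not the actual patch), the sketch below keeps a lockless fast path for taking and dropping non-final references, and takes the mutex only for the first get and the last put. struct tunnel_ref, tunnel_get() and tunnel_put() are hypothetical names. In the kernel, refcount_inc_not_zero() and refcount_dec_and_mutex_lock() are helpers with a similar shape, but whether they fit rxe_ns_sock is an assumption on my side.

```c
/* Userspace model of option 2 with mutex-protected teardown.
 * All names are hypothetical, not the rxe API. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct tunnel {
	int port;			/* stands in for the UDP tunnel socket */
};

struct tunnel_ref {
	pthread_mutex_t lock;		/* protects sk creation and destruction */
	struct tunnel *sk;		/* valid while users > 0 */
	atomic_int users;		/* number of links using sk */
	int frees;			/* protected by lock, for checking only */
};

static struct tunnel_ref demo_ref = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Modelled newlink: take a reference, creating the tunnel if needed. */
static struct tunnel *tunnel_get(struct tunnel_ref *r)
{
	struct tunnel *t;
	int old = atomic_load(&r->users);

	/* Fast path: only increment a count that is already non-zero. */
	while (old > 0) {
		if (atomic_compare_exchange_weak(&r->users, &old, old + 1))
			return r->sk;
	}

	/* Slow path: the tunnel may not exist; decide under the mutex. */
	pthread_mutex_lock(&r->lock);
	if (atomic_load(&r->users) == 0) {
		r->sk = malloc(sizeof(*r->sk));
		if (r->sk) {
			r->sk->port = 4791;	/* RoCEv2 UDP port */
			atomic_store(&r->users, 1);
		}
	} else {
		atomic_fetch_add(&r->users, 1);
	}
	t = r->sk;
	pthread_mutex_unlock(&r->lock);
	return t;
}

/* Modelled dellink: drop a reference owned by the caller. */
static void tunnel_put(struct tunnel_ref *r)
{
	int old = atomic_load(&r->users);

	/* Fast path: drop a reference that cannot be the last one. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(&r->users, &old, old - 1))
			return;
	}

	/* Slow path: possibly the last user; destroy under the mutex. */
	pthread_mutex_lock(&r->lock);
	assert(atomic_load(&r->users) > 0);
	if (atomic_fetch_sub(&r->users, 1) == 1) {
		free(r->sk);		/* exactly one thread reaches this */
		r->sk = NULL;
		r->frees++;
	}
	pthread_mutex_unlock(&r->lock);
}
```

Because the 1 -> 0 transition and the free happen under the same mutex that newlink's slow path takes, a concurrent newlink either sees a live tunnel or allocates a fresh one, and two concurrent dellinks cannot both release the same socket.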
Zhu Yanjun