Message-ID: <97668d96-db91-40f3-831b-93cb1b1aabf4@linux.dev>
Date: Fri, 10 May 2024 17:52:57 +0200
X-Mailing-List: linux-rdma@vger.kernel.org
Subject: Re: [bug report][bisected] kmemleak in rdma_core observed during blktests nvme/rdma use siw
To: Yi Zhang
Cc: Jason Gunthorpe, leonro@nvidia.com, chuck.lever@oracle.com, RDMA mailing list
References: <7de9793f-6805-1412-3fae-a5508910124b@linux.dev> <54eea59a-efcd-c281-e998-033c6df81a28@linux.dev> <9eb4ed5e-0872-40fd-ab96-e210463d82ee@linux.dev>
From: Zhu Yanjun

On 2024/5/10 17:49, Yi Zhang wrote:
> On Fri, May 10, 2024 at 7:28 PM Zhu Yanjun wrote:
>> I can reproduce the new problem that you mentioned in the mail. And in
>> the latest diff, I fixed the new problem.
>>
>>
>> About the kmemleak problem, when sgid_attr is set to ERR_PTR(-ENODEV),
>> rdma_put_gid_attr should be called.
>>
>> But rdma_put_gid_attr is not called in commit f8ef1be816bf9a ("RDMA/cma:
>> Avoid GID lookups on iWARP devices").
>>
>> I think this is the root cause that results in kmemleak error. I made a
>> lot of tests in my local host. This will not introduce new problem that
>> you mentioned.
>>
>> Because I can not reproduce the kmemleak error locally, I send the
>> followings to you.
>>
>> If any problem, please feel free to let me know.
> The issue fixed now, feel free to add
>
> Tested-by: Yi Zhang

Thanks a lot. I will send out the official patch very soon.
Your help and efforts are much appreciated.

Zhu Yanjun

>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index 1e2cd7c8716e..b4683dcdfaca 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -715,8 +715,11 @@ cma_validate_port(struct ib_device *device, u32 port,
>>  		rcu_read_lock();
>>  		ndev = rcu_dereference(sgid_attr->ndev);
>>  		if (!net_eq(dev_net(ndev), dev_addr->net) ||
>> -		    ndev->ifindex != bound_if_index)
>> +		    ndev->ifindex != bound_if_index) {
>> +			rdma_put_gid_attr(sgid_attr);
>>  			sgid_attr = ERR_PTR(-ENODEV);
>> +		}
>> +
>>  		rcu_read_unlock();
>>  		goto out;
>>  	}
>>
>> Zhu Yanjun
>>
>> On 10.05.24 03:41, Yi Zhang wrote:
>>> On Wed, May 8, 2024 at 11:31 PM Zhu Yanjun wrote:
>>>> On 2024/5/8 15:08, Yi Zhang wrote:
>>>>> So bisect shows it was introduced with below commit, please help check
>>>>> and fix it, thanks.
>>>>>
>>>>> commit f8ef1be816bf9a0c406c696368c2264a9597a994
>>>>> Author: Chuck Lever
>>>>> Date: Mon Jul 17 11:12:32 2023 -0400
>>>>>
>>>>> RDMA/cma: Avoid GID lookups on iWARP devices
>>>> Not sure if the following can fix this problem or not.
>>>> Please let me know the test result.
>>>> Thanks a lot.
>>> Hi Yanjun
>>>
>>> Seems the change introduced new issue, here is the log:
>>>
>>> [ 3017.171697] run blktests nvme/003 at 2024-05-09 21:35:41
>>> [ 3032.344539] ==================================================================
>>> [ 3032.351780] BUG: KASAN: slab-use-after-free in
>>> rdma_put_gid_attr+0x23/0xa0 [ib_core]
>>> [ 3032.359659] Write of size 4 at addr ffff88813c1c3f00 by task kworker/27:1/370
>>> [ 3032.366791]
>>> [ 3032.368293] CPU: 27 PID: 370 Comm: kworker/27:1 Not tainted
>>> 6.9.0-rc2.rdma.fix+ #1
>>> [ 3032.375859] Hardware name: Dell Inc.
PowerEdge R640/08HT8T, BIOS >>> 2.20.1 09/13/2023 >>> [ 3032.383425] Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma] >>> [ 3032.390304] Call Trace: >>> [ 3032.392757] >>> [ 3032.394864] dump_stack_lvl+0x84/0xd0 >>> [ 3032.398537] ? rdma_put_gid_attr+0x23/0xa0 [ib_core] >>> [ 3032.403561] print_report+0x19d/0x52e >>> [ 3032.407228] ? rdma_put_gid_attr+0x23/0xa0 [ib_core] >>> [ 3032.412256] ? __virt_addr_valid+0x228/0x420 >>> [ 3032.416537] ? rdma_put_gid_attr+0x23/0xa0 [ib_core] >>> [ 3032.421563] kasan_report+0xab/0x180 >>> [ 3032.425142] ? rdma_put_gid_attr+0x23/0xa0 [ib_core] >>> [ 3032.430173] kasan_check_range+0x104/0x1b0 >>> [ 3032.434275] rdma_put_gid_attr+0x23/0xa0 [ib_core] >>> [ 3032.439128] ? cma_dev_put+0x1f/0x60 [rdma_cm] >>> [ 3032.443591] cma_release_dev+0x1b2/0x270 [rdma_cm] >>> [ 3032.448401] _destroy_id+0x35f/0xc80 [rdma_cm] >>> [ 3032.452867] nvmet_rdma_free_queue+0x7a/0x380 [nvmet_rdma] >>> [ 3032.458360] nvmet_rdma_release_queue_work+0x42/0x90 [nvmet_rdma] >>> [ 3032.464460] process_one_work+0x85d/0x13e0 >>> [ 3032.468573] ? worker_thread+0xcc/0x1130 >>> [ 3032.472501] ? __pfx_process_one_work+0x10/0x10 >>> [ 3032.477038] ? assign_work+0x16c/0x240 >>> [ 3032.480797] worker_thread+0x6da/0x1130 >>> [ 3032.484648] ? __pfx_worker_thread+0x10/0x10 >>> [ 3032.488923] kthread+0x2ed/0x3c0 >>> [ 3032.492155] ? _raw_spin_unlock_irq+0x28/0x60 >>> [ 3032.496514] ? __pfx_kthread+0x10/0x10 >>> [ 3032.500267] ret_from_fork+0x31/0x70 >>> [ 3032.503854] ? 
__pfx_kthread+0x10/0x10 >>> [ 3032.507607] ret_from_fork_asm+0x1a/0x30 >>> [ 3032.511539] >>> [ 3032.513734] >>> [ 3032.515234] Allocated by task 1997: >>> [ 3032.518725] kasan_save_stack+0x30/0x50 >>> [ 3032.522562] kasan_save_track+0x14/0x30 >>> [ 3032.526402] __kasan_kmalloc+0x8f/0xa0 >>> [ 3032.530155] add_modify_gid+0x18e/0xb80 [ib_core] >>> [ 3032.534922] ib_cache_update.part.0+0x6fc/0x8e0 [ib_core] >>> [ 3032.540380] ib_cache_setup_one+0x3ff/0x5f0 [ib_core] >>> [ 3032.545495] ib_register_device+0x5ba/0xa20 [ib_core] >>> [ 3032.550607] siw_newlink+0xb0d/0xe50 [siw] >>> [ 3032.554724] nldev_newlink+0x301/0x520 [ib_core] >>> [ 3032.559404] rdma_nl_rcv_msg+0x2e7/0x600 [ib_core] >>> [ 3032.564256] rdma_nl_rcv_skb.constprop.0.isra.0+0x23c/0x3a0 [ib_core] >>> [ 3032.570756] netlink_unicast+0x437/0x6e0 >>> [ 3032.574679] netlink_sendmsg+0x775/0xc10 >>> [ 3032.578607] __sys_sendto+0x3e5/0x490 >>> [ 3032.582273] __x64_sys_sendto+0xe0/0x1c0 >>> [ 3032.586198] do_syscall_64+0x9a/0x1a0 >>> [ 3032.589862] entry_SYSCALL_64_after_hwframe+0x71/0x79 >>> [ 3032.594917] >>> [ 3032.596416] Freed by task 339: >>> [ 3032.599475] kasan_save_stack+0x30/0x50 >>> [ 3032.603312] kasan_save_track+0x14/0x30 >>> [ 3032.607153] kasan_save_free_info+0x3b/0x60 >>> [ 3032.611338] poison_slab_object+0x103/0x190 >>> [ 3032.615523] __kasan_slab_free+0x14/0x30 >>> [ 3032.619450] kfree+0x120/0x3a0 >>> [ 3032.622508] free_gid_work+0xd4/0x120 [ib_core] >>> [ 3032.627100] process_one_work+0x85d/0x13e0 >>> [ 3032.631200] worker_thread+0x6da/0x1130 >>> [ 3032.635038] kthread+0x2ed/0x3c0 >>> [ 3032.638271] ret_from_fork+0x31/0x70 >>> [ 3032.641849] ret_from_fork_asm+0x1a/0x30 >>> [ 3032.645777] >>> [ 3032.647277] Last potentially related work creation: >>> [ 3032.652153] kasan_save_stack+0x30/0x50 >>> [ 3032.655994] __kasan_record_aux_stack+0x8e/0xa0 >>> [ 3032.660525] insert_work+0x36/0x310 >>> [ 3032.664019] __queue_work+0x6a4/0xcb0 >>> [ 3032.667685] queue_work_on+0x99/0xb0 >>> [ 3032.671263] 
cma_release_dev+0x1b2/0x270 [rdma_cm] >>> [ 3032.676072] _destroy_id+0x35f/0xc80 [rdma_cm] >>> [ 3032.680537] nvme_rdma_free_queue+0x4a/0x70 [nvme_rdma] >>> [ 3032.685768] nvme_do_delete_ctrl+0x146/0x160 [nvme_core] >>> [ 3032.691108] nvme_delete_ctrl_sync.cold+0x8/0xd [nvme_core] >>> [ 3032.696707] nvme_sysfs_delete+0x96/0xc0 [nvme_core] >>> [ 3032.701696] kernfs_fop_write_iter+0x3a5/0x5b0 >>> [ 3032.706142] vfs_write+0x62e/0x1010 >>> [ 3032.709636] ksys_write+0xfb/0x1d0 >>> [ 3032.713041] do_syscall_64+0x9a/0x1a0 >>> [ 3032.716707] entry_SYSCALL_64_after_hwframe+0x71/0x79 >>> [ 3032.721760] >>> [ 3032.723259] The buggy address belongs to the object at ffff88813c1c3f00 >>> [ 3032.723259] which belongs to the cache kmalloc-rnd-07-192 of size 192 >>> [ 3032.736370] The buggy address is located 0 bytes inside of >>> [ 3032.736370] freed 192-byte region [ffff88813c1c3f00, ffff88813c1c3fc0) >>> [ 3032.748441] >>> [ 3032.749942] The buggy address belongs to the physical page: >>> [ 3032.755513] page: refcount:1 mapcount:0 mapping:0000000000000000 >>> index:0x0 pfn:0x13c1c2 >>> [ 3032.763511] head: order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0 >>> [ 3032.770212] flags: >>> 0x17ffffe0000840(slab|head|node=0|zone=2|lastcpupid=0x3fffff) >>> [ 3032.777604] page_type: 0xffffffff() >>> [ 3032.781097] raw: 0017ffffe0000840 ffff888100053c00 dead000000000122 >>> 0000000000000000 >>> [ 3032.788837] raw: 0000000000000000 0000000080200020 00000001ffffffff >>> 0000000000000000 >>> [ 3032.796574] head: 0017ffffe0000840 ffff888100053c00 >>> dead000000000122 0000000000000000 >>> [ 3032.804400] head: 0000000000000000 0000000080200020 >>> 00000001ffffffff 0000000000000000 >>> [ 3032.812225] head: 0017ffffe0000001 ffffea0004f07081 >>> ffffea0004f070c8 00000000ffffffff >>> [ 3032.820050] head: 0000000200000000 0000000000000000 >>> 00000000ffffffff 0000000000000000 >>> [ 3032.827874] page dumped because: kasan: bad access detected >>> [ 3032.833447] >>> [ 3032.834944] Memory state 
around the buggy address: >>> [ 3032.839737] ffff88813c1c3e00: fc fc fc fc fc fc fc fc fc fc fc fc >>> fc fc fc fc >>> [ 3032.846958] ffff88813c1c3e80: fc fc fc fc fc fc fc fc fc fc fc fc >>> fc fc fc fc >>> [ 3032.854177] >ffff88813c1c3f00: fa fb fb fb fb fb fb fb fb fb fb fb >>> fb fb fb fb >>> [ 3032.861393] ^ >>> [ 3032.864628] ffff88813c1c3f80: fb fb fb fb fb fb fb fb fc fc fc fc >>> fc fc fc fc >>> [ 3032.871845] ffff88813c1c4000: 00 00 00 00 00 00 00 00 00 00 00 00 >>> 00 00 00 00 >>> [ 3032.879065] ================================================================== >>> [ 3032.886311] Disabling lock debugging due to kernel taint >>> [ 3032.891630] ------------[ cut here ]------------ >>> [ 3032.896255] refcount_t: underflow; use-after-free. >>> [ 3032.901104] WARNING: CPU: 27 PID: 370 at lib/refcount.c:28 >>> refcount_warn_saturate+0xf2/0x150 >>> [ 3032.909552] Modules linked in: siw ib_uverbs nvmet_rdma nvmet >>> nvme_keyring nvme_rdma nvme_fabrics nvme_core nvme_auth rdma_cm iw_cm >>> ib_cm ib_core intel_rapl_msr intel_rapl_coma >>> [ 3032.992273] CPU: 27 PID: 370 Comm: kworker/27:1 Tainted: G B >>> 6.9.0-rc2.rdma.fix+ #1 >>> [ 3033.001324] Hardware name: Dell Inc. 
PowerEdge R640/08HT8T, BIOS >>> 2.20.1 09/13/2023 >>> [ 3033.008899] Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma] >>> [ 3033.015787] RIP: 0010:refcount_warn_saturate+0xf2/0x150 >>> [ 3033.021022] Code: 2f 1b 66 04 01 e8 6e f7 9c fe 0f 0b eb 91 80 3d >>> 1e 1b 66 04 00 75 88 48 c7 c7 80 16 c5 91 c6 05 0e 1b 66 04 01 e8 4e >>> f7 9c fe <0f> 0b e9 6e ff ff ff 80 3d fe7 >>> [ 3033.039775] RSP: 0018:ffffc9000e627c10 EFLAGS: 00010286 >>> [ 3033.045019] RAX: 0000000000000000 RBX: ffff88813c1c3f00 RCX: 0000000000000000 >>> [ 3033.052158] RDX: 0000000000000000 RSI: ffffffff91c5c760 RDI: 0000000000000001 >>> [ 3033.059303] RBP: 0000000000000003 R08: 0000000000000001 R09: fffff52001cc4f36 >>> [ 3033.066449] R10: ffffc9000e6279b7 R11: 0000000000000001 R12: ffff88a88f394248 >>> [ 3033.073588] R13: ffff888135750240 R14: ffff88825385fb80 R15: dead000000000100 >>> [ 3033.080730] FS: 0000000000000000(0000) GS:ffff88c7ad200000(0000) >>> knlGS:0000000000000000 >>> [ 3033.088824] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 3033.094579] CR2: 00007fdfcb573c58 CR3: 0000000ea6e98004 CR4: 00000000007706f0 >>> [ 3033.101720] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> [ 3033.108859] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>> [ 3033.116001] PKRU: 55555554 >>> [ 3033.118723] Call Trace: >>> [ 3033.121182] >>> [ 3033.123300] ? __warn+0xcc/0x170 >>> [ 3033.126548] ? refcount_warn_saturate+0xf2/0x150 >>> [ 3033.131176] ? report_bug+0x1fc/0x3d0 >>> [ 3033.134860] ? handle_bug+0x3c/0x80 >>> [ 3033.138368] ? exc_invalid_op+0x17/0x40 >>> [ 3033.142217] ? asm_exc_invalid_op+0x1a/0x20 >>> [ 3033.146428] ? 
refcount_warn_saturate+0xf2/0x150 >>> [ 3033.151056] cma_release_dev+0x1b2/0x270 [rdma_cm] >>> [ 3033.155874] _destroy_id+0x35f/0xc80 [rdma_cm] >>> [ 3033.160348] nvmet_rdma_free_queue+0x7a/0x380 [nvmet_rdma] >>> [ 3033.165851] nvmet_rdma_release_queue_work+0x42/0x90 [nvmet_rdma] >>> [ 3033.171958] process_one_work+0x85d/0x13e0 >>> [ 3033.176077] ? worker_thread+0xcc/0x1130 >>> [ 3033.180017] ? __pfx_process_one_work+0x10/0x10 >>> [ 3033.184561] ? assign_work+0x16c/0x240 >>> [ 3033.188332] worker_thread+0x6da/0x1130 >>> [ 3033.192187] ? __pfx_worker_thread+0x10/0x10 >>> [ 3033.196475] kthread+0x2ed/0x3c0 >>> [ 3033.199724] ? _raw_spin_unlock_irq+0x28/0x60 >>> [ 3033.204099] ? __pfx_kthread+0x10/0x10 >>> [ 3033.207860] ret_from_fork+0x31/0x70 >>> [ 3033.211447] ? __pfx_kthread+0x10/0x10 >>> [ 3033.215209] ret_from_fork_asm+0x1a/0x30 >>> [ 3033.219152] >>> [ 3033.221352] irq event stamp: 255979 >>> [ 3033.224855] hardirqs last enabled at (255979): >>> [] asm_sysvec_apic_timer_interrupt+0x1a/0x20 >>> [ 3033.234863] hardirqs last disabled at (255978): >>> [] __do_softirq+0x75a/0x967 >>> [ 3033.243391] softirqs last enabled at (255766): >>> [] __irq_exit_rcu+0xc6/0x1d0 >>> [ 3033.252015] softirqs last disabled at (255757): >>> [] __irq_exit_rcu+0xc6/0x1d0 >>> [ 3033.260637] ---[ end trace 0000000000000000 ]--- >>> >>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c >>>> index 1e2cd7c8716e..901e6c40d560 100644 >>>> --- a/drivers/infiniband/core/cma.c >>>> +++ b/drivers/infiniband/core/cma.c >>>> @@ -715,9 +715,13 @@ cma_validate_port(struct ib_device *device, u32 port, >>>> rcu_read_lock(); >>>> ndev = rcu_dereference(sgid_attr->ndev); >>>> if (!net_eq(dev_net(ndev), dev_addr->net) || >>>> - ndev->ifindex != bound_if_index) >>>> + ndev->ifindex != bound_if_index) { >>>> + rdma_put_gid_attr(sgid_attr); >>>> sgid_attr = ERR_PTR(-ENODEV); >>>> + } >>>> rcu_read_unlock(); >>>> + if (!IS_ERR(sgid_attr)) >>>> + rdma_put_gid_attr(sgid_attr); >>>> 
goto out; >>>> } >>>> Zhu Yanjun >>>> >>>>> On Tue, Apr 30, 2024 at 7:51 PM Yi Zhang wrote: >>>>>> On Mon, Apr 29, 2024 at 8:54 AM Guoqing Jiang wrote: >>>>>>> >>>>>>> On 4/28/24 20:42, Yi Zhang wrote: >>>>>>>> On Sun, Apr 28, 2024 at 10:54 AM Guoqing Jiang wrote: >>>>>>>>> On 4/26/24 16:44, Yi Zhang wrote: >>>>>>>>>> On Fri, Apr 26, 2024 at 1:56 PM Yi Zhang wrote: >>>>>>>>>>> On Wed, Apr 24, 2024 at 9:28 PM Guoqing Jiang wrote: >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> On 4/8/24 14:03, Yi Zhang wrote: >>>>>>>>>>>>> Hi >>>>>>>>>>>>> I found the below kmemleak issue during blktests nvme/rdma on the >>>>>>>>>>>>> latest linux-rdma/for-next, please help check it and let me know if >>>>>>>>>>>>> you need any info/testing for it, thanks. >>>>>>>>>>>> Could you share which test case caused the issue? I can't reproduce >>>>>>>>>>>> it with 6.9-rc3+ kernel (commit 586b5dfb51b) with the below. >>>>>>>>>>> It can be reproduced by [1], you can find more info from the symbol >>>>>>>>>>> info[2], I also attached the config file, maybe you can this config >>>>>>>>>>> file >>>>>>>>>> Just attached the config file >>>>>>>>>> >>>>>>>>> I tried with the config, but still unlucky. >>>>>>>>> >>>>>>>>> # nvme_trtype=rdma ./check nvme/012 >>>>>>>>> nvme/012 (run mkfs and data verification fio job on NVMeOF block >>>>>>>>> device-backed ns) >>>>>>>>> nvme/012 (run mkfs and data verification fio job on NVMeOF block >>>>>>>>> device-backed ns) [passed] >>>>>>>>> runtime 52.763s ... 392.027s device: nvme0 >>>>>>>>> >>>>>>>>>>> [1] nvme_trtype=rdma ./check nvme/012 >>>>>>>>>>> [2] >>>>>>>>>>> unreferenced object 0xffff8883a87e8800 (size 192): >>>>>>>>>>> comm "rdma", pid 2355, jiffies 4294836069 >>>>>>>>>>> hex dump (first 32 bytes): >>>>>>>>>>> 32 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00 2............... >>>>>>>>>>> 10 88 7e a8 83 88 ff ff 10 88 7e a8 83 88 ff ff ..~.......~..... 
>>>>>>>>>>> backtrace (crc 4db191c4): >>>>>>>>>>> [] kmalloc_trace+0x30d/0x3b0 >>>>>>>>>>> [] alloc_gid_entry+0x47/0x380 [ib_core] >>>>>>>>>>> [] add_modify_gid+0x166/0x930 [ib_core] >>>>>>>>>>> [] ib_cache_update.part.0+0x6d8/0x910 [ib_core] >>>>>>>>>>> [] ib_cache_setup_one+0x24a/0x350 [ib_core] >>>>>>>>>>> [] ib_register_device+0x9e/0x3a0 [ib_core] >>>>>>>>>>> [] 0xffffffffc24ac389 >>>>>>>>>>> [] nldev_newlink+0x2b8/0x520 [ib_core] >>>>>>>>>>> [] rdma_nl_rcv_msg+0x2c3/0x520 [ib_core] >>>>>>>>>>> [] >>>>>>>>>>> rdma_nl_rcv_skb.constprop.0.isra.0+0x23c/0x3a0 [ib_core] >>>>>>>>>>> [] netlink_unicast+0x445/0x710 >>>>>>>>>>> [] netlink_sendmsg+0x761/0xc40 >>>>>>>>>>> [] __sys_sendto+0x3a9/0x420 >>>>>>>>>>> [] __x64_sys_sendto+0xdc/0x1b0 >>>>>>>>>>> [] do_syscall_64+0x93/0x180 >>>>>>>>>>> [] entry_SYSCALL_64_after_hwframe+0x71/0x79 >>>>>>>>>>> >>>>>>>>>>> (gdb) l *(alloc_gid_entry+0x47) >>>>>>>>>>> 0x2eff7 is in alloc_gid_entry (./include/linux/slab.h:628). >>>>>>>>>>> 623 >>>>>>>>>>> 624 if (size > KMALLOC_MAX_CACHE_SIZE) >>>>>>>>>>> 625 return kmalloc_large(size, flags); >>>>>>>>>>> 626 >>>>>>>>>>> 627 index = kmalloc_index(size); >>>>>>>>>>> 628 return kmalloc_trace( >>>>>>>>>>> 629 kmalloc_caches[kmalloc_type(flags, _RET_IP_)][index], >>>>>>>>>>> 630 flags, size); >>>>>>>>>>> 631 } >>>>>>>>>>> 632 return __kmalloc(size, flags); >>>>>>>>>>> >>>>>>>>>>> (gdb) l *(add_modify_gid+0x166) >>>>>>>>>>> 0x30206 is in add_modify_gid (drivers/infiniband/core/cache.c:447). >>>>>>>>>>> 442 * empty table entries instead of storing them. 
>>>>>>>>>>> 443 */ >>>>>>>>>>> 444 if (rdma_is_zero_gid(&attr->gid)) >>>>>>>>>>> 445 return 0; >>>>>>>>>>> 446 >>>>>>>>>>> 447 entry = alloc_gid_entry(attr); >>>>>>>>>>> 448 if (!entry) >>>>>>>>>>> 449 return -ENOMEM; >>>>>>>>>>> 450 >>>>>>>>>>> 451 if (rdma_protocol_roce(attr->device, attr->port_num)) { >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> use_siw=1 nvme_trtype=rdma ./check nvme/ >>>>>>>>>>>> >>>>>>>>>>>>> # dmesg | grep kmemleak >>>>>>>>>>>>> [ 67.130652] kmemleak: Kernel memory leak detector initialized (mem >>>>>>>>>>>>> pool available: 36041) >>>>>>>>>>>>> [ 67.130728] kmemleak: Automatic memory scanning thread started >>>>>>>>>>>>> [ 1051.771867] kmemleak: 2 new suspected memory leaks (see >>>>>>>>>>>>> /sys/kernel/debug/kmemleak) >>>>>>>>>>>>> [ 1832.796189] kmemleak: 8 new suspected memory leaks (see >>>>>>>>>>>>> /sys/kernel/debug/kmemleak) >>>>>>>>>>>>> [ 2578.189075] kmemleak: 17 new suspected memory leaks (see >>>>>>>>>>>>> /sys/kernel/debug/kmemleak) >>>>>>>>>>>>> [ 3330.710984] kmemleak: 4 new suspected memory leaks (see >>>>>>>>>>>>> /sys/kernel/debug/kmemleak) >>>>>>>>>>>>> >>>>>>>>>>>>> unreferenced object 0xffff88855da53400 (size 192): >>>>>>>>>>>>> comm "rdma", pid 10630, jiffies 4296575922 >>>>>>>>>>>>> hex dump (first 32 bytes): >>>>>>>>>>>>> 37 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00 7............... >>>>>>>>>>>>> 10 34 a5 5d 85 88 ff ff 10 34 a5 5d 85 88 ff ff .4.].....4.].... >>>>>>>>>>>>> backtrace (crc 47f66721): >>>>>>>>>>>>> [] kmalloc_trace+0x30d/0x3b0 >>>>>>>>>>>>> [] alloc_gid_entry+0x47/0x380 [ib_core] >>>>>>>>>>>>> [] add_modify_gid+0x166/0x930 [ib_core] >>>>>>>>>>>> I guess add_modify_gid is called from config_non_roce_gid_cache, not sure >>>>>>>>>>>> why we don't check the return value of it here. 
>>>>>>>>>>>>
>>>>>>>>>>>> Looks put_gid_entry is called in case add_modify_gid returns failure, it
>>>>>>>>>>>> would
>>>>>>>>>>>> trigger schedule_free_gid -> queue_work(ib_wq, &entry->del_work), then
>>>>>>>>>>>> free_gid_work -> free_gid_entry_locked would free storage asynchronously by
>>>>>>>>>>>> put_gid_ndev and also entry.
>>>>>>>>>>>>
>>>>>>>>>>>>> [] ib_cache_update.part.0+0x6d8/0x910 [ib_core]
>>>>>>>>>>>>> [] ib_cache_setup_one+0x24a/0x350 [ib_core]
>>>>>>>>>>>>> [] ib_register_device+0x9e/0x3a0 [ib_core]
>>>>>>>>>>>>> [] 0xffffffffc2a3d389
>>>>>>>>>>>>> [] nldev_newlink+0x2b8/0x520 [ib_core]
>>>>>>>>>>>>> [] rdma_nl_rcv_msg+0x2c3/0x520 [ib_core]
>>>>>>>>>>>>> []
>>>>>>>>>>>>> rdma_nl_rcv_skb.constprop.0.isra.0+0x23c/0x3a0 [ib_core]
>>>>>>>>>>>>> [] netlink_unicast+0x445/0x710
>>>>>>>>>>>>> [] netlink_sendmsg+0x761/0xc40
>>>>>>>>>>>>> [] __sys_sendto+0x3a9/0x420
>>>>>>>>>>>>> [] __x64_sys_sendto+0xdc/0x1b0
>>>>>>>>>>>>> [] do_syscall_64+0x93/0x180
>>>>>>>>>>>>> [] entry_SYSCALL_64_after_hwframe+0x71/0x79
>>>>>>>>>>>> After ib_cache_setup_one failed, maybe ib_cache_cleanup_one is needed
>>>>>>>>>>>> which flush ib_wq to ensure storage is freed. Could you try with the change?
>>>>>>>>>>> Will try it later.
>>>>>>>>>>>
>>>>>>>>>> The kmemleak still can be reproduced with this change:
>>>>>>>>>>
>>>>>>>>>> unreferenced object 0xffff8881f89fde00 (size 192):
>>>>>>>>>> comm "rdma", pid 8708, jiffies 4295703453
>>>>>>>>>> hex dump (first 32 bytes):
>>>>>>>>>> 02 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00 ................
>>>>>>>>>> 10 de 9f f8 81 88 ff ff 10 de 9f f8 81 88 ff ff ................
>>>>>>>>>> backtrace (crc 888c494b):
>>>>>>>>>> [] kmalloc_trace+0x30d/0x3b0
>>>>>>>>>> [] alloc_gid_entry+0x47/0x380 [ib_core]
>>>>>>>>>> [] add_modify_gid+0x166/0x930 [ib_core]
>>>>>>>>>> [] ib_cache_update.part.0+0x6d8/0x910 [ib_core]
>>>>>>>>>> [] ib_cache_setup_one+0x24a/0x350 [ib_core]
>>>>>>>>>> [] ib_register_device+0x9e/0x3a0 [ib_core]
>>>>>>>>>> []
>>>>>>>>>> siw_qp_state_to_ib_qp_state+0x28a9/0xfffffffffffd1520 [siw]
>>>>>>>>> Is it possible to run the test with rxe instead of siw? In case it is
>>>>>>>>> only happened
>>>>>>>>> with siw, I'd suggest to revert 0b988c1bee28 to check if it causes the
>>>>>>>>> issue.
>>>>>>>>> But I don't understand why siw_qp_state_to_ib_qp_state was appeared in the
>>>>>>>>> middle of above trace.
>>>>>>>> Hi Guoqing
>>>>>>>> This issue only can be reproduced with siw, I did more testing today
>>>>>>>> and it cannot be reproduced with 6.5, seems it was introduced from
>>>>>>>> 6.6-rc1, and I saw there are some siw updates from 6.6-rc1.
>>>>>>> Yes, pls bisect them.
>>>>>> Sure, will do that after I back from holiday next week.
>>>>>>
>>>>>>> > git log --oneline v6.5..v6.6-rc1 drivers/infiniband/sw/siw/|cat
>>>>>>> 9dfccb6d0d3d RDMA/siw: Call llist_reverse_order in siw_run_sq
>>>>>>> bee024d20451 RDMA/siw: Correct wrong debug message
>>>>>>> b056327bee09 RDMA/siw: Balance the reference of cep->kref in the error path
>>>>>>> 91f36237b4b9 RDMA/siw: Fix tx thread initialization.
>>>>>>> bad5b6e34ffb RDMA/siw: Fabricate a GID on tun and loopback devices
>>>>>>> 9191df002926 RDMA/siw: use vmalloc_array and vcalloc
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Guoqing
>>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> Yi Zhang
>>>>>
>> --
>> Best Regards,
>> Yanjun.Zhu
>>
> -- I only represent myself. Zhu Yanjun or Yanjun.Zhu
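
For readers following the thread, the reference imbalance described at the top of this mail can be illustrated with a small userspace sketch in C. This is not the kernel code. struct gid_attr, gid_attr_get(), gid_attr_put(), validate_leaky() and validate_fixed() are hypothetical names invented for illustration; they loosely stand in for rdma_get_gid_attr(), rdma_put_gid_attr() and the iWARP branch of cma_validate_port(), and NULL stands in for ERR_PTR(-ENODEV). The plain int counter is also a simplification of the kernel's reference counting.

```c
#include <stddef.h>

/* Hypothetical stand-in for a GID table entry; not the kernel's struct. */
struct gid_attr {
	int refcount;	/* outstanding references to this entry */
	int ifindex;	/* interface index the entry belongs to */
};

/* Take a reference, as a lookup helper does before returning an entry. */
static struct gid_attr *gid_attr_get(struct gid_attr *attr)
{
	attr->refcount++;
	return attr;
}

/* Drop a reference taken by gid_attr_get(). */
static void gid_attr_put(struct gid_attr *attr)
{
	attr->refcount--;
}

/*
 * Leaky variant: on an interface mismatch the pointer is overwritten
 * with the error value, so the reference taken above can never be
 * dropped by the caller.
 */
static struct gid_attr *validate_leaky(struct gid_attr *entry, int bound_ifindex)
{
	struct gid_attr *attr = gid_attr_get(entry);

	if (attr->ifindex != bound_ifindex)
		attr = NULL;	/* reference is lost here */
	return attr;
}

/*
 * Fixed variant: the reference is dropped before the pointer is
 * replaced. On success the reference is kept, because the caller
 * is expected to drop it later.
 */
static struct gid_attr *validate_fixed(struct gid_attr *entry, int bound_ifindex)
{
	struct gid_attr *attr = gid_attr_get(entry);

	if (attr->ifindex != bound_ifindex) {
		gid_attr_put(attr);
		attr = NULL;
	}
	return attr;
}
```

Calling validate_leaky() with a mismatching ifindex leaves the counter one higher than before the call, which is the kind of imbalance kmemleak would eventually report as an unreferenced object. validate_fixed() restores the counter on the error path and keeps the reference on success. Dropping it on the success path as well, as in the first attempted diff quoted above, would let the later put in the release path underflow the counter, which matches the KASAN and refcount_t warnings in the quoted log.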