From: Leon Romanovsky <leon@kernel.org>
To: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>,
saravanan.vajravel@broadcom.com
Cc: OFED mailing list <linux-rdma@vger.kernel.org>,
Ehab Ababneh <ehab.ababneh@cornelisnetworks.com>,
Selvin Xavier <selvin.xavier@broadcom.com>,
Sagi Grimberg <sagi@grimberg.me>
Subject: Re: isert patch leaving resources behind
Date: Sun, 13 Aug 2023 11:29:31 +0300
Message-ID: <20230813082931.GD7707@unreal>
In-Reply-To: <921cd1d9-2879-f455-1f50-0053fe6a6655@cornelisnetworks.com>
On Thu, Aug 10, 2023 at 10:44:00AM -0400, Dennis Dalessandro wrote:
> Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert connection") is
> causing problems on OPA when we try to unload the driver after doing iSCSI
> testing. Reverting this commit causes the problem to go away. Any ideas?
Saravanan, can you please post the kernel logs for the use-after-free you
described? The message of the commit in question says: "When a bunch of iSER
target is cleared, this issue can lead to use-after-free memory issue as isert
conn is twice released".
Thanks
> Was testing done on this patch with removing/hotplugging drivers?
>
> [29151.413816] ------------[ cut here ]------------
> [29151.419086] WARNING: CPU: 52 PID: 2117247 at drivers/infiniband/core/cq.c:359
> ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
> [29151.431096] Modules linked in: nfsd nfs_acl target_core_user uio tcm_fc libfc
> scsi_transport_fc tcm_loop target_core_pscsi target_core_iblock target_core_file
> rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs
> rfkill rpcrdma rdma_ucm ib_srpt sunrpc ib_isert iscsi_target_mod target_core_mod
> opa_vnic ib_iser libiscsi ib_umad scsi_transport_iscsi rdma_cm ib_ipoib iw_cm
> ib_cm hfi1(-) rdmavt ib_uverbs intel_rapl_msr intel_rapl_common sb_edac ib_core
> x86_pkg_temp_thermal intel_powerclamp coretemp i2c_i801 mxm_wmi rapl iTCO_wdt
> ipmi_si iTCO_vendor_support mei_me ipmi_devintf mei intel_cstate ioatdma
> intel_uncore i2c_smbus joydev pcspkr lpc_ich ipmi_msghandler acpi_power_meter
> acpi_pad xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg crct10dif_pclmul
> crc32_pclmul crc32c_intel drm_kms_helper drm_shmem_helper ahci libahci
> ghash_clmulni_intel igb drm libata dca i2c_algo_bit wmi fuse
> [29151.520056] CPU: 52 PID: 2117247 Comm: modprobe Not tainted 6.5.0-rc1+ #1
> [29151.527759] Hardware name: Intel Corporation S2600CWR/S2600CW, BIOS
> SE5C610.86B.01.01.0014.121820151719 12/18/2015
> [29151.539462] RIP: 0010:ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
> [29151.545908] Code: ff 48 8b 43 40 48 8d 7b 40 48 83 e8 40 4c 39 e7 75 b3 49 83
> c4 10 4d 39 fc 75 94 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc <0f> 0b eb a1
> 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f
> [29151.567086] RSP: 0018:ffffc10bea13fc80 EFLAGS: 00010206
> [29151.573040] RAX: 000000000000010c RBX: ffff9bf5c7e66c00 RCX: 000000008020001d
> [29151.581120] RDX: 000000008020001e RSI: fffff175221f9900 RDI: ffff9bf5c7e67640
> [29151.589202] RBP: ffff9bf5c7e67600 R08: ffff9bf5c7e64400 R09: 000000008020001d
> [29151.597280] R10: 0000000040000000 R11: 0000000000000000 R12: ffff9bee4b1e8a18
> [29151.605360] R13: dead000000000122 R14: dead000000000100 R15: ffff9bee4b1e8a38
> [29151.613437] FS: 00007ff1e6d38740(0000) GS:ffff9bfd9fb00000(0000)
> knlGS:0000000000000000
> [29151.622610] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [29151.629133] CR2: 00005652044ecc68 CR3: 0000000889b5c005 CR4: 00000000001706e0
> [29151.637212] Call Trace:
> [29151.640063] <TASK>
> [29151.642500] ? __warn+0x80/0x130
> [29151.646209] ? ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
> [29151.651997] ? report_bug+0x195/0x1a0
> [29151.656191] ? handle_bug+0x3c/0x70
> [29151.660190] ? exc_invalid_op+0x14/0x70
> [29151.664574] ? asm_exc_invalid_op+0x16/0x20
> [29151.669352] ? ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
> [29151.675107] disable_device+0x9d/0x160 [ib_core]
> [29151.680399] __ib_unregister_device+0x42/0xb0 [ib_core]
> [29151.686361] ib_unregister_device+0x22/0x30 [ib_core]
> [29151.692128] rvt_unregister_device+0x20/0x90 [rdmavt]
> [29151.697889] hfi1_unregister_ib_device+0x16/0xf0 [hfi1]
> [29151.703936] remove_one+0x55/0x1a0 [hfi1]
> [29151.708588] pci_device_remove+0x36/0xa0
> [29151.713076] device_release_driver_internal+0x193/0x200
> [29151.719035] driver_detach+0x44/0x90
> [29151.723137] bus_remove_driver+0x69/0xf0
> [29151.727619] pci_unregister_driver+0x2a/0xb0
> [29151.732490] hfi1_mod_cleanup+0xc/0x3c [hfi1]
> [29151.737516] __do_sys_delete_module.constprop.0+0x17a/0x2f0
> [29151.743844] ? exit_to_user_mode_prepare+0xc4/0xd0
> [29151.749298] ? syscall_trace_enter.constprop.0+0x126/0x1a0
> [29151.755527] do_syscall_64+0x5c/0x90
> [29151.759631] ? syscall_exit_to_user_mode+0x12/0x30
> [29151.765089] ? do_syscall_64+0x69/0x90
> [29151.769374] ? syscall_exit_work+0x103/0x130
> [29151.774243] ? syscall_exit_to_user_mode+0x12/0x30
> [29151.779716] ? do_syscall_64+0x69/0x90
> [29151.784020] ? exc_page_fault+0x65/0x150
> [29151.788499] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [29151.794245] RIP: 0033:0x7ff1e643f5ab
> [29151.798336] Code: 73 01 c3 48 8b 0d 75 a8 1b 00 f7 d8 64 89 01 48 83 c8 ff c3
> 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0
> ff ff 73 01 c3 48 8b 0d 45 a8 1b 00 f7 d8 64 89 01 48
> [29151.819504] RSP: 002b:00007ffec9103cc8 EFLAGS: 00000206 ORIG_RAX:
> 00000000000000b0
> [29151.828109] RAX: ffffffffffffffda RBX: 00005615267fdc50 RCX: 00007ff1e643f5ab
> [29151.837575] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00005615267fdcb8
> [29151.845654] RBP: 00005615267fdc50 R08: 0000000000000000 R09: 0000000000000000
> [29151.853739] R10: 00007ff1e659eac0 R11: 0000000000000206 R12: 00005615267fdcb8
> [29151.861817] R13: 0000000000000000 R14: 00005615267fdcb8 R15: 00007ffec9105ff8
> [29151.869896] </TASK>
> [29151.872423] ---[ end trace 0000000000000000 ]---
>
> And...
>
> [29158.533739] restrack: ------------[ cut here ]------------
> [29158.540002] infiniband hfi1_0: BUG: RESTRACK detected leak of resources
> [29158.547499] restrack: Kernel PD object allocated by ib_isert is not freed
> [29158.555193] restrack: Kernel CQ object allocated by ib_core is not freed
> [29158.562801] restrack: Kernel QP object allocated by rdma_cm is not freed
> [29158.570395] restrack: ------------[ cut here ]------------
>
>
> -Denny