From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
To: Shuai Xue <xueshuai@linux.alibaba.com>, Yi Sun <yi.sun@intel.com>
Cc: Fenghua Yu <fenghuay@nvidia.com>,
dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org,
dave.jiang@intel.com, gordon.jin@intel.com
Subject: Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
Date: Wed, 30 Jul 2025 17:17:28 -0700 [thread overview]
Message-ID: <87zfclkycn.fsf@intel.com> (raw)
In-Reply-To: <2ee448d2-76b2-446c-9368-8b90ec087419@linux.alibaba.com>
Shuai Xue <xueshuai@linux.alibaba.com> writes:
> 在 2025/7/28 19:43, Yi Sun 写道:
>> On 28.07.2025 16:40, Shuai Xue wrote:
>>> Hi, Yi Sun, Fenghua Yu,
>>>
>>> 在 2025/7/27 17:16, Yi Sun 写道:
>>>> On 17.06.2025 14:58, Fenghua Yu wrote:
>>>>> Hi, Yi,
>>>>>
>>>>> On 6/17/25 03:27, Yi Sun wrote:
>>>>>> A recent refactor introduced a misplaced put_device() call, leading to a
>>>>>> reference count underflow during module unload.
>>>>>>
>>>>>> There is no need to add additional put_device() calls for idxd groups,
>>>>>> engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>>>>> also fixes the missing put_device() for idxd groups, engines, and wqs."
>>>>>> It appears no such omission existed. The required cleanup is already
>>>>>> handled by the call chain:
>>>>>>
>>>>>>
>>>>>> Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>>>>> idxd_cleanup_internals() which was not originally part of the driver
>>>>>> unload path and introduced unintended reference count underflow.
>>>>>>
>>>>>> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>>>>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>>>>
>>>>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>>>>> index 40cc9c070081..40f4bf446763 100644
>>>>>> --- a/drivers/dma/idxd/init.c
>>>>>> +++ b/drivers/dma/idxd/init.c
>>>>>> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>>>>> device_unregister(idxd_confdev(idxd));
>>>>>> idxd_shutdown(pdev);
>>>>>> idxd_device_remove_debugfs(idxd);
>>>>>> - idxd_cleanup(idxd);
>>>>>> + perfmon_pmu_remove(idxd);
>>>>>> + idxd_cleanup_interrupts(idxd);
>>>>>> + if (device_pasid_enabled(idxd))
>>>>>> + idxd_disable_system_pasid(idxd);
>>>>>>
>>>>> This will hit memory leak issue.
>>>>>
>>>>> idxd_remove_internals() does not only put_device() but also free allocated memory for wqs, engines, groups. Without calling idxd_remove_internals(), the allocated memory is leaked.
>>>>>
>>>>> I think a right fix is to remove the put_device() in idxd_cleanup_wqs/engines/groups() because:
>>>>>
>>>>> 1. idxd_setup_wqs/engines/groups() does not call get_device(). Their counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>>>>>
>>>>> 2. Fix the issue mentioned in this patch while there is no memory leak issue.
>>>>>
>>>>>> pci_iounmap(pdev, idxd->reg_base);
>>>>>> put_device(idxd_confdev(idxd));
>>>>>> pci_disable_device(pdev);
>>>>>
>>>>> Thanks.
>>>>>
>>>>> -Fenghua
>>>>>
>>>>
>>>> Hi Fenghua,
>>>>
>>>> As with the comments on the previous patch, the function
>>>> idxd_conf_device_release already covers part of what is done in
>>>> idxd_remove_internals, including:
>>>> kfree(idxd->groups);
>>>> kfree(idxd->wqs);
>>>> kfree(idxd->engines);
>>>> kfree(idxd);
>>>>
>>>> We need to redesign idxd_remove_internals to clearly identify what
>>>> truely result in memory leaks and should be handled there.
>>>> Then, we'll need another patch to fix the idxd_remove_internals and call
>>>> it here.
>>>>
>>>> Let's prioritize addressing the two critical issues we've encountered here,
>>>> and then revisit the memory leak discussion afterward.
>>>>
>>>> Thanks
>>>> --Sun, Yi
>>>
>>> The root cause is simliar to Patch 1, the idxd_conf_device_release()
>>> function already handles part of the cleanup that
>>> idxd_cleanup_internals() attempts to do, e.g.
>>>
>>> idxd->wqs
>>> idxd->einges
>>> idxd->groups
>>>
>>> As a result, it causes user-after-free issue.
>>>
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 190 PID: 13854 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
>>> refcount_t: underflow; use-after-free.
>>> drm_client_lib(E) i2c_algo_bit(E) drm_shmem_helper(E) drm_kms_helper(E) nvme(E) mlx5_core(E) mlxfw(E) nvme_core(E) pci_hyperv_intf(E) psample(E) drm(E) wmi(E) pinctrl_emmitsburg(E) sd_mod(E) sg(E) ahci(E) libahci(E) libata(E) fuse(E)
>>> Modules linked in: binfmt_misc(E) bonding(E) tls(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) i10nm_edac(E) skx_edac_common(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) snd_hda_intel(E) kvm(E) snd_intel_dspcfg(E) snd_hda_codec(E) mlx5_ib(E) irqbypass(E) qat_4xxx(E) snd_hda_core(E) dax_hmem(E) ghash_clmulni_intel(E) intel_qat(E) cxl_acpi(E) snd_hwdep(E) rapl(E) snd_pcm(E) cxl_port(E) iTCO_wdt(E) intel_cstate(E) iTCO_vendor_support(E) snd_timer(E) ib_uverbs(E) pmt_telemetry(E) cxl_core(E) rfkill(E) tun(E) dh_generic(E) pmt_class(E) intel_uncore(E) einj(E) pcspkr(E) isst_if_mbox_pci(E) joydev(E) isst_if_mmio(E) idxd(E-) crc8(E) ib_core(E) isst_if_common(E) authenc(E) intel_vsec(E) idxd_bus(E) snd(E) i2c_i801(E) mei_me(E) soundcore(E) i2c_smbus(E) i2c_ismt(E) mei(E) nf_tables(E) nfnetlink(E) ipmi_ssif(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandle
>>> CPU: 190 UID: 0 PID: 13854 Comm: rmmod Kdump: loaded Tainted: G S E 6.16.0-rc6+ #116 PREEMPT(none)
>>> Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>>> Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
>>> RIP: 0010:refcount_warn_saturate+0xbe/0x110
>>> RSP: 0018:ff7078d2df957db8 EFLAGS: 00010282
>>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
>>> RBP: ff2b10355b055000 R08: 0000000000000000 R09: ff7078d2df957c68
>>> RDX: ff2b10b33fdaa3c0 RSI: 0000000000000001 RDI: ff2b10b33fd9c440
>>> R10: ff7078d2df957c60 R11: ffffffffbe761d68 R12: ff2b1035a00db400
>>> R13: ff2b10355670b148 R14: ff2b103555097c80 R15: ffffffffc0a88938
>>> FS: 00007fb5f8f3b740(0000) GS:ff2b10b380bb9000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00005560a898c2d8 CR3: 00000080f9def005 CR4: 0000000000f73ef0
>>> PKRU: 55555554
>>> Code: 01 01 e8 35 9b 9a ff 0f 0b c3 cc cc cc cc 80 3d 05 4c f5 01 00 75 85 48 c7 c7 30 21 75 bd c6 05 f5 4b f5 01 01 e8 12 9b 9a ff <0f> 0b c3 cc cc cc cc 80 3d e0 4b f5 01 00 0f 85 5e ff ff ff 48 c7
>>> Call Trace:
>>> idxd_cleanup+0x6b/0x100 [idxd]
>>> <TASK>
>>> idxd_remove+0x46/0x70 [idxd]
>>> pci_device_remove+0x3f/0xb0
>>> driver_detach+0x48/0x90
>>> device_release_driver_internal+0x197/0x200
>>> bus_remove_driver+0x6d/0xf0
>>> idxd_exit_module+0x34/0x6c0 [idxd]
>>> __do_sys_delete_module.constprop.0+0x174/0x310
>>> do_syscall_64+0x5f/0x2d0
>>> pci_unregister_driver+0x2e/0xb0
>>> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>> RIP: 0033:0x7fb5f8a620cb
>>> Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
>>> RSP: 002b:00007ffeed6101c8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>>> RAX: ffffffffffffffda RBX: 00005560a8981790 RCX: 00007fb5f8a620cb
>>> RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005560a89817f8
>>> R13: 00007ffeed612352 R14: 00005560a89812a0 R15: 00005560a8981790
>>> R10: 00007fb5f8baaac0 R11: 0000000000000206 R12: 00007ffeed6103f0
>>> </TASK>
>>> ---[ end trace 0000000000000000 ]---
>>>
>>> With this patch applied, the user-after-free issue is fixed.
>>>
>>> But, there is still memory leaks in
>>>
>>> idxd->wqs[i]
>>> idxd->einges[i]
>>> idxd->groups[i]
>>>
>>> I agree with Vinicius that we should cleanup in idxd_conf_device_release().
>>>
>>> Thanks.
>>> Shuai
>>
>> Appreciate Shuai's clarification. Yes, it would be better if fixing the memory
>> leak issue in idxd_conf_device_release() in a separate patch.
>>
>> @Shuai, please feel free to proceed if you'd like to cook a patch for it.
>>
>> Thanks
>> --Sun, Yi
>
> Hi, Sun, Yi,
>
> I need to correct my previous analysis. After further investigation, I
> believe there is no memory leak in the current code. The device
> reference counting and memory management are properly handled through
> the Linux device model. Each component is freed at the conf_dev
> destruction time:
>
>
> idxd->wqs[i] is freed by idxd_conf_wq_release()
> idxd->einges[i] is freed by idxd_conf_engine_release()
> idxd->groups[i] is freed by idxd_conf_group_release()
>
>
> Adding additional cleanup in idxd_conf_device_release() would actually
> cause double-free issues. I can reproduce this with KASAN:
>
Just tested with Yi Sun latest series applied (the series look good by
the way), and I am seeing this:
[ 966.361049] kmemleak: 2075 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
And from kmemleak:
unreferenced object 0xff110001e3e4b340 (size 192):
comm "(udev-worker)", pid 2753, jiffies 4295015473
hex dump (first 32 bytes):
40 e5 f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00 @...............
40 2a e5 08 00 00 a0 ff 40 fa ff ff 00 00 00 00 @*......@.......
backtrace (crc f81b8a71):
__kmalloc_cache_node_noprof+0x39b/0x450
idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
idxd_drv_enable_wq+0x391/0x1100 [idxd]
0xffffffffc22aa490
really_probe+0x1e0/0x930
__driver_probe_device+0x18c/0x3d0
driver_probe_device+0x4a/0xd0
__driver_attach+0x1e1/0x4a0
bus_for_each_dev+0xef/0x170
bus_add_driver+0x29d/0x5c0
driver_register+0x130/0x450
uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
do_one_initcall+0xb9/0x450
do_init_module+0x281/0x8c0
load_module+0x1693/0x2090
init_module_from_file+0xe8/0x150
unreferenced object 0xff11000150941e40 (size 192):
comm "(udev-worker)", pid 2753, jiffies 4295015473
hex dump (first 32 bytes):
40 de f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00 @...............
80 2a e5 08 00 00 a0 ff 80 fa ff ff 00 00 00 00 .*..............
backtrace (crc e1abedf1):
__kmalloc_cache_node_noprof+0x39b/0x450
idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
idxd_drv_enable_wq+0x391/0x1100 [idxd]
0xffffffffc22aa490
really_probe+0x1e0/0x930
__driver_probe_device+0x18c/0x3d0
driver_probe_device+0x4a/0xd0
__driver_attach+0x1e1/0x4a0
bus_for_each_dev+0xef/0x170
bus_add_driver+0x29d/0x5c0
driver_register+0x130/0x450
uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
do_one_initcall+0xb9/0x450
do_init_module+0x281/0x8c0
load_module+0x1693/0x2090
init_module_from_file+0xe8/0x150
unreferenced object 0xff11000150941840 (size 192):
comm "(udev-worker)", pid 2753, jiffies 4295015473
hex dump (first 32 bytes):
40 fd f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00 @...............
c0 2b e5 08 00 00 a0 ff c0 fb ff ff 00 00 00 00 .+..............
backtrace (crc efd34d21):
__kmalloc_cache_node_noprof+0x39b/0x450
idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
idxd_drv_enable_wq+0x391/0x1100 [idxd]
0xffffffffc22aa490
really_probe+0x1e0/0x930
__driver_probe_device+0x18c/0x3d0
driver_probe_device+0x4a/0xd0
__driver_attach+0x1e1/0x4a0
bus_for_each_dev+0xef/0x170
bus_add_driver+0x29d/0x5c0
driver_register+0x130/0x450
uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
do_one_initcall+0xb9/0x450
do_init_module+0x281/0x8c0
load_module+0x1693/0x2090
init_module_from_file+0xe8/0x150
unreferenced object 0xff110001509425c0 (size 192):
comm "(udev-worker)", pid 2753, jiffies 4295015473
hex dump (first 32 bytes):
40 d1 f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00 @...............
00 2c e5 08 00 00 a0 ff 00 fc ff ff 00 00 00 00 .,..............
backtrace (crc 3e67f08):
__kmalloc_cache_node_noprof+0x39b/0x450
idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
idxd_drv_enable_wq+0x391/0x1100 [idxd]
0xffffffffc22aa490
really_probe+0x1e0/0x930
__driver_probe_device+0x18c/0x3d0
driver_probe_device+0x4a/0xd0
__driver_attach+0x1e1/0x4a0
bus_for_each_dev+0xef/0x170
bus_add_driver+0x29d/0x5c0
driver_register+0x130/0x450
uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
do_one_initcall+0xb9/0x450
do_init_module+0x281/0x8c0
load_module+0x1693/0x2090
init_module_from_file+0xe8/0x150
unreferenced object 0xff11000150942740 (size 192):
comm "(udev-worker)", pid 2753, jiffies 4295015473
hex dump (first 32 bytes):
40 dd f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00 @...............
40 2c e5 08 00 00 a0 ff 40 fc ff ff 00 00 00 00 @,......@.......
backtrace (crc 4e740d35):
__kmalloc_cache_node_noprof+0x39b/0x450
idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
idxd_drv_enable_wq+0x391/0x1100 [idxd]
0xffffffffc22aa490
really_probe+0x1e0/0x930
__driver_probe_device+0x18c/0x3d0
driver_probe_device+0x4a/0xd0
__driver_attach+0x1e1/0x4a0
bus_for_each_dev+0xef/0x170
bus_add_driver+0x29d/0x5c0
driver_register+0x130/0x450
uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
do_one_initcall+0xb9/0x450
do_init_module+0x281/0x8c0
load_module+0x1693/0x2090
init_module_from_file+0xe8/0x150
unreferenced object 0xff11000150943340 (size 192):
comm "(udev-worker)", pid 2753, jiffies 4295015473
hex dump (first 32 bytes):
40 c2 f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00 @...............
80 2c e5 08 00 00 a0 ff 80 fc ff ff 00 00 00 00 .,..............
backtrace (crc 68538b12):
__kmalloc_cache_node_noprof+0x39b/0x450
idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
idxd_drv_enable_wq+0x391/0x1100 [idxd]
0xffffffffc22aa490
really_probe+0x1e0/0x930
__driver_probe_device+0x18c/0x3d0
driver_probe_device+0x4a/0xd0
__driver_attach+0x1e1/0x4a0
bus_for_each_dev+0xef/0x170
bus_add_driver+0x29d/0x5c0
driver_register+0x130/0x450
uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
do_one_initcall+0xb9/0x450
do_init_module+0x281/0x8c0
load_module+0x1693/0x2090
init_module_from_file+0xe8/0x150
I have a series that tries to fix those (and a few other things), I am
finishing testing it, should be able to propose it soon.
Cheers,
--
Vinicius
prev parent reply other threads:[~2025-07-31 0:17 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-17 10:27 [PATCH v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
2025-06-17 10:27 ` [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free Yi Sun
2025-06-17 22:13 ` Fenghua Yu
2025-07-27 9:02 ` Yi Sun
2025-07-28 8:21 ` Shuai Xue
2025-06-17 10:27 ` [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload Yi Sun
2025-06-17 21:58 ` Fenghua Yu
2025-06-18 0:38 ` Vinicius Costa Gomes
2025-07-27 9:16 ` Yi Sun
2025-07-28 8:40 ` Shuai Xue
2025-07-28 11:43 ` Yi Sun
2025-07-29 2:46 ` Shuai Xue
2025-07-29 3:15 ` Yi Sun
2025-07-29 6:00 ` Shuai Xue
2025-07-31 0:17 ` Vinicius Costa Gomes [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zfclkycn.fsf@intel.com \
--to=vinicius.gomes@intel.com \
--cc=dave.jiang@intel.com \
--cc=dmaengine@vger.kernel.org \
--cc=fenghuay@nvidia.com \
--cc=gordon.jin@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=xueshuai@linux.alibaba.com \
--cc=yi.sun@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.