linux-kernel.vger.kernel.org archive mirror
* [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload
@ 2025-07-29 15:03 Yi Sun
  2025-07-29 15:03 ` [PATCH v3 RESEND 1/2] dmaengine: idxd: Remove improper idxd_free Yi Sun
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Yi Sun @ 2025-07-29 15:03 UTC (permalink / raw)
  To: vinicius.gomes, vkoul, dave.jiang, fenghuay, xueshuai
  Cc: dmaengine, linux-kernel, yi.sun, gordon.jin

This patch series addresses two issues related to the device reference
counting and cleanup path in the idxd driver.

Recent changes introduced improper put_device() calls and duplicated
cleanup logic, leading to refcount underflow and potential use-after-free
during module unload.

Patch 1 removes an unnecessary call to idxd_free(), which could result in a
use-after-free, because idxd_conf_device_release() already covers everything
done in idxd_free(). The idxd_free() call added in commit 90022b3 does not
resolve any memory leak; it only introduces duplicated cleanup.

Patch 2 refactors the cleanup to avoid redundant put_device() calls
introduced in commit a409e919ca3. The existing idxd_unregister_devices()
already handles proper device reference release.

Both patches have been verified on a hardware platform.

Both patches have been run through `checkpatch.pl`. Patch 2 gets 1 error
and 1 warning, but these appear to be limitations of the checkpatch script
itself and do not reflect issues with the patches.

---
Changes in v3:
- Dropped the idxd_disable_sva() call, since that function was removed
  recently (Vinicius)
Changes in v2:
- Reworded the commit messages and supplemented them with call traces
  (Vinicius)
- Explained why the put_device() calls are unnecessary (Vinicius)

Yi Sun (2):
  dmaengine: idxd: Remove improper idxd_free
  dmaengine: idxd: Fix refcount underflow on module unload

 drivers/dma/idxd/init.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

-- 
2.43.0


* [PATCH v3 RESEND 1/2] dmaengine: idxd: Remove improper idxd_free
  2025-07-29 15:03 [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
@ 2025-07-29 15:03 ` Yi Sun
  2025-07-29 15:03 ` [PATCH RESEND v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload Yi Sun
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Yi Sun @ 2025-07-29 15:03 UTC (permalink / raw)
  To: vinicius.gomes, vkoul, dave.jiang, fenghuay, xueshuai
  Cc: dmaengine, linux-kernel, yi.sun, gordon.jin

The call to idxd_free() introduces a duplicate put_device() leading to a
reference count underflow:
refcount_t: underflow; use-after-free.
WARNING: CPU: 15 PID: 4428 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
...
Call Trace:
 <TASK>
  idxd_remove+0xe4/0x120 [idxd]
  pci_device_remove+0x3f/0xb0
  device_release_driver_internal+0x197/0x200
  driver_detach+0x48/0x90
  bus_remove_driver+0x74/0xf0
  pci_unregister_driver+0x2e/0xb0
  idxd_exit_module+0x34/0x7a0 [idxd]
  __do_sys_delete_module.constprop.0+0x183/0x280
  do_syscall_64+0x54/0xd70
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

idxd_unregister_devices(), which is invoked at the very beginning of
idxd_remove(), already takes care of the necessary put_device() through the
following call path:
idxd_unregister_devices() -> device_unregister() -> put_device()

In addition, when CONFIG_DEBUG_KOBJECT_RELEASE is enabled, put_device() may
trigger asynchronous cleanup via schedule_delayed_work(). If idxd_free() is
called immediately after, it can result in a use-after-free.

Remove the improper idxd_free() to avoid both the refcount underflow and
potential memory corruption during module unload.
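
For readers less familiar with the driver-core reference rules, the
underflow described above can be reproduced in a small user-space model.
This is only a sketch under stated assumptions: the toy_* names are
hypothetical, the counter is a plain int rather than a refcount_t, and the
temporary reference taken before device_unregister() is an assumption about
idxd_remove() that the hunk in this patch does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct device: a counter plus a release callback. */
struct toy_device {
	int refcount;
	bool underflow;     /* set where the kernel would warn about underflow */
	int release_calls;  /* number of times the release callback ran */
	void (*release)(struct toy_device *dev);
};

/* A real release callback would free the object; here it only counts. */
static void toy_release(struct toy_device *dev)
{
	dev->release_calls++;
}

static void toy_device_init(struct toy_device *dev)
{
	dev->refcount = 1;  /* reference owned by the registration */
	dev->underflow = false;
	dev->release_calls = 0;
	dev->release = toy_release;
}

static void toy_get_device(struct toy_device *dev)
{
	dev->refcount++;
}

static void toy_put_device(struct toy_device *dev)
{
	if (dev->refcount <= 0) {
		/* Nothing left to drop: record the underflow and bail out. */
		dev->underflow = true;
		return;
	}
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Models only the final put_device() done by device_unregister(). */
static void toy_device_unregister(struct toy_device *dev)
{
	toy_put_device(dev);
}

/*
 * Simplified remove path. extra_put models the put_device() hidden inside
 * idxd_free(), which runs after the object was already released.
 */
static void toy_remove(struct toy_device *dev, bool extra_put)
{
	toy_get_device(dev);         /* assumed temporary teardown reference */
	toy_device_unregister(dev);  /* drops the registration reference */
	toy_put_device(dev);         /* drops the teardown reference, releases */
	if (extra_put)
		toy_put_device(dev);     /* one put too many */
}
```

With extra_put set to false the model ends with exactly one release and no
warning. With extra_put set to true the last put finds a zero count and
sets the underflow flag, which corresponds to the refcount_t warning in the
trace above.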

Fixes: d5449ff1b04d ("dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call")
Signed-off-by: Yi Sun <yi.sun@intel.com>
Tested-by: Shuai Xue <xueshuai@linux.alibaba.com>

diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 80355d03004d..40cc9c070081 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -1295,7 +1295,6 @@ static void idxd_remove(struct pci_dev *pdev)
 	idxd_cleanup(idxd);
 	pci_iounmap(pdev, idxd->reg_base);
 	put_device(idxd_confdev(idxd));
-	idxd_free(idxd);
 	pci_disable_device(pdev);
 }
 
-- 
2.43.0



* [PATCH RESEND v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-07-29 15:03 [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
  2025-07-29 15:03 ` [PATCH v3 RESEND 1/2] dmaengine: idxd: Remove improper idxd_free Yi Sun
@ 2025-07-29 15:03 ` Yi Sun
  2025-07-29 15:06 ` [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues " Dave Jiang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Yi Sun @ 2025-07-29 15:03 UTC (permalink / raw)
  To: vinicius.gomes, vkoul, dave.jiang, fenghuay, xueshuai
  Cc: dmaengine, linux-kernel, yi.sun, gordon.jin

A recent refactor introduced a misplaced put_device() call, resulting in a
reference count underflow during module unload.

There is no need to add additional put_device() calls for idxd groups,
engines, or workqueues. The refactoring commit claims: "Note, this also
fixes the missing put_device() for idxd groups, engines, and wqs."

However, no such omission appears to have existed. The required cleanup is
already handled by the call chain:
idxd_unregister_devices() -> device_unregister() -> put_device()

Replace the idxd_cleanup() call in idxd_remove() with the cleanup steps
that are still necessary, leaving out idxd_cleanup_internals(), which
duplicates deallocation logic for idxd, engines, groups, and workqueues.
Memory management for these objects is already handled through the Linux
device model.
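
As an illustration of why a separate deallocation pass is redundant, here
is a minimal user-space sketch of the ownership pattern the device model
relies on: the release callback of an embedded device frees the enclosing
object once the last reference is dropped. All toy_* names are hypothetical
and the structures are heavily simplified; this is not the idxd code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct device. */
struct toy_device {
	int refcount;
	void (*release)(struct toy_device *dev);
};

/* Hypothetical sub-device, standing in for a wq, engine, or group. */
struct toy_wq {
	int id;
	struct toy_device conf_dev;  /* embedded, like a conf_dev member */
};

/* User-space equivalent of the kernel's container_of(). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int toy_wq_freed;  /* counts objects freed by the release callback */

/* The release callback owns the memory of the enclosing toy_wq. */
static void toy_wq_release(struct toy_device *dev)
{
	struct toy_wq *wq = toy_container_of(dev, struct toy_wq, conf_dev);

	free(wq);
	toy_wq_freed++;
}

static struct toy_wq *toy_wq_alloc(int id)
{
	struct toy_wq *wq = calloc(1, sizeof(*wq));

	if (!wq)
		return NULL;
	wq->id = id;
	wq->conf_dev.refcount = 1;  /* reference owned by the registration */
	wq->conf_dev.release = toy_wq_release;
	return wq;
}

static void toy_put_device(struct toy_device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

/*
 * Dropping the registration reference of each sub-device is the whole
 * teardown: the final put frees the object, so a later free() or put
 * on the same pointer would be a double free or an underflow.
 */
static void toy_unregister_wqs(struct toy_wq **wqs, int n)
{
	for (int i = 0; i < n; i++) {
		toy_put_device(&wqs[i]->conf_dev);
		wqs[i] = NULL;  /* the old pointer is dangling from here on */
	}
}
```

In this model toy_unregister_wqs() alone frees every sub-device, so any
additional cleanup loop over the same objects has nothing left to free.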

Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
Signed-off-by: Yi Sun <yi.sun@intel.com>
Tested-by: Shuai Xue <xueshuai@linux.alibaba.com>

diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 40cc9c070081..40f4bf446763 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
 	device_unregister(idxd_confdev(idxd));
 	idxd_shutdown(pdev);
 	idxd_device_remove_debugfs(idxd);
-	idxd_cleanup(idxd);
+	perfmon_pmu_remove(idxd);
+	idxd_cleanup_interrupts(idxd);
+	if (device_pasid_enabled(idxd))
+		idxd_disable_system_pasid(idxd);
 	pci_iounmap(pdev, idxd->reg_base);
 	put_device(idxd_confdev(idxd));
 	pci_disable_device(pdev);
-- 
2.43.0



* Re: [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload
  2025-07-29 15:03 [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
  2025-07-29 15:03 ` [PATCH v3 RESEND 1/2] dmaengine: idxd: Remove improper idxd_free Yi Sun
  2025-07-29 15:03 ` [PATCH RESEND v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload Yi Sun
@ 2025-07-29 15:06 ` Dave Jiang
  2025-07-31  1:09 ` Vinicius Costa Gomes
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Dave Jiang @ 2025-07-29 15:06 UTC (permalink / raw)
  To: Yi Sun, vinicius.gomes, vkoul, fenghuay, xueshuai
  Cc: dmaengine, linux-kernel, gordon.jin



On 7/29/25 8:03 AM, Yi Sun wrote:
> This patch series addresses two issues related to the device reference
> counting and cleanup path in the idxd driver.
> 
> Recent changes introduced improper put_device() calls and duplicated
> cleanup logic, leading to refcount underflow and potential use-after-free
> during module unload.
> 
> Patch 1 removes an unnecessary call to idxd_free(), which could result in a
> use-after-free, because the function idxd_conf_device_release already
> covers everything done in idxd_free. The newly added idxd_free in commit
> 90022b3 doesn't resolve any memory leaks, but introduces several duplicated
> cleanup.
> 
> Patch 2 refactors the cleanup to avoid redundant put_device() calls
> introduced in commit a409e919ca3. The existing idxd_unregister_devices()
> already handles proper device reference release.
> 
> Both patches have been verified on hardware platform.
> 
> Both patches have been run through `checkpatch.pl`. Patch 2 gets 1 error
> and 1 warning. But these appear to be limitations in the checkpatch script
> itself, not reflect issues with the patches.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

for the series

> 
> ---
> Changes in V3:
> - Removed function idxd_disable_sva which got removed recently (Vinicius)
> Changes in v2:
> - Reworded commit messages supplementing the call traces (Vinicius)
> - Explain why the put_device are unnecessary. (Vinicius)
> 
> Yi Sun (2):
>   dmaengine: idxd: Remove improper idxd_free
>   dmaengine: idxd: Fix refcount underflow on module unload
> 
>  drivers/dma/idxd/init.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 



* Re: [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload
  2025-07-29 15:03 [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
                   ` (2 preceding siblings ...)
  2025-07-29 15:06 ` [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues " Dave Jiang
@ 2025-07-31  1:09 ` Vinicius Costa Gomes
  2025-08-12  2:17   ` Yi Sun
  2025-08-12  8:05 ` Lai, Yi
  2025-08-20 17:39 ` Vinod Koul
  5 siblings, 1 reply; 8+ messages in thread
From: Vinicius Costa Gomes @ 2025-07-31  1:09 UTC (permalink / raw)
  To: Yi Sun, vkoul, dave.jiang, fenghuay, xueshuai
  Cc: dmaengine, linux-kernel, yi.sun, gordon.jin

Yi Sun <yi.sun@intel.com> writes:

> This patch series addresses two issues related to the device reference
> counting and cleanup path in the idxd driver.
>
> Recent changes introduced improper put_device() calls and duplicated
> cleanup logic, leading to refcount underflow and potential use-after-free
> during module unload.
>
> Patch 1 removes an unnecessary call to idxd_free(), which could result in a
> use-after-free, because the function idxd_conf_device_release already
> covers everything done in idxd_free. The newly added idxd_free in commit
> 90022b3 doesn't resolve any memory leaks, but introduces several duplicated
> cleanup.
>
> Patch 2 refactors the cleanup to avoid redundant put_device() calls
> introduced in commit a409e919ca3. The existing idxd_unregister_devices()
> already handles proper device reference release.
>
> Both patches have been verified on hardware platform.
>
> Both patches have been run through `checkpatch.pl`. Patch 2 gets 1 error
> and 1 warning. But these appear to be limitations in the checkpatch script
> itself, not reflect issues with the patches.
>
> ---

For the series:

Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>


Cheers,
-- 
Vinicius


* Re: [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload
  2025-07-31  1:09 ` Vinicius Costa Gomes
@ 2025-08-12  2:17   ` Yi Sun
  0 siblings, 0 replies; 8+ messages in thread
From: Yi Sun @ 2025-08-12  2:17 UTC (permalink / raw)
  To: vkoul
  Cc: Vinicius Costa Gomes, vkoul, dave.jiang, fenghuay, xueshuai,
	dmaengine, linux-kernel, gordon.jin

On 30.07.2025 18:09, Vinicius Costa Gomes wrote:
>Yi Sun <yi.sun@intel.com> writes:
>
>> This patch series addresses two issues related to the device reference
>> counting and cleanup path in the idxd driver.
>>
>> Recent changes introduced improper put_device() calls and duplicated
>> cleanup logic, leading to refcount underflow and potential use-after-free
>> during module unload.
>>
>> Patch 1 removes an unnecessary call to idxd_free(), which could result in a
>> use-after-free, because the function idxd_conf_device_release already
>> covers everything done in idxd_free. The newly added idxd_free in commit
>> 90022b3 doesn't resolve any memory leaks, but introduces several duplicated
>> cleanup.
>>
>> Patch 2 refactors the cleanup to avoid redundant put_device() calls
>> introduced in commit a409e919ca3. The existing idxd_unregister_devices()
>> already handles proper device reference release.
>>
>> Both patches have been verified on hardware platform.
>>
>> Both patches have been run through `checkpatch.pl`. Patch 2 gets 1 error
>> and 1 warning. But these appear to be limitations in the checkpatch script
>> itself, not reflect issues with the patches.
>>
>> ---
>
>For the series:
>
>Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>
>
>Cheers,
>-- 
>Vinicius

Hi Vinod,

Gentle ping.
This is another series of bug fixes.

Thanks
    --Sun, Yi


* Re: [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload
  2025-07-29 15:03 [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
                   ` (3 preceding siblings ...)
  2025-07-31  1:09 ` Vinicius Costa Gomes
@ 2025-08-12  8:05 ` Lai, Yi
  2025-08-20 17:39 ` Vinod Koul
  5 siblings, 0 replies; 8+ messages in thread
From: Lai, Yi @ 2025-08-12  8:05 UTC (permalink / raw)
  To: Yi Sun
  Cc: vinicius.gomes, vkoul, dave.jiang, fenghuay, xueshuai, dmaengine,
	linux-kernel, gordon.jin

On Tue, Jul 29, 2025 at 11:03:11PM +0800, Yi Sun wrote:
> This patch series addresses two issues related to the device reference
> counting and cleanup path in the idxd driver.
> 
> Recent changes introduced improper put_device() calls and duplicated
> cleanup logic, leading to refcount underflow and potential use-after-free
> during module unload.
> 
> Patch 1 removes an unnecessary call to idxd_free(), which could result in a
> use-after-free, because the function idxd_conf_device_release already
> covers everything done in idxd_free. The newly added idxd_free in commit
> 90022b3 doesn't resolve any memory leaks, but introduces several duplicated
> cleanup.
> 
> Patch 2 refactors the cleanup to avoid redundant put_device() calls
> introduced in commit a409e919ca3. The existing idxd_unregister_devices()
> already handles proper device reference release.
> 
> Both patches have been verified on hardware platform.
> 
> Both patches have been run through `checkpatch.pl`. Patch 2 gets 1 error
> and 1 warning. But these appear to be limitations in the checkpatch script
> itself, not reflect issues with the patches.
> 
> ---
> Changes in V3:
> - Removed function idxd_disable_sva which got removed recently (Vinicius)
> Changes in v2:
> - Reworded commit messages supplementing the call traces (Vinicius)
> - Explain why the put_device are unnecessary. (Vinicius)
> 
> Yi Sun (2):
>   dmaengine: idxd: Remove improper idxd_free
>   dmaengine: idxd: Fix refcount underflow on module unload
> 
>  drivers/dma/idxd/init.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>

Applied patch series on top of v6.17-rc1 kernel. Issue is fixed.

Please help add Tested-by: Yi Lai <yi1.lai@intel.com>

Regards,
Yi Lai
> -- 
> 2.43.0


* Re: [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload
  2025-07-29 15:03 [PATCH RESEND v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
                   ` (4 preceding siblings ...)
  2025-08-12  8:05 ` Lai, Yi
@ 2025-08-20 17:39 ` Vinod Koul
  5 siblings, 0 replies; 8+ messages in thread
From: Vinod Koul @ 2025-08-20 17:39 UTC (permalink / raw)
  To: vinicius.gomes, dave.jiang, fenghuay, xueshuai, Yi Sun
  Cc: dmaengine, linux-kernel, gordon.jin


On Tue, 29 Jul 2025 23:03:11 +0800, Yi Sun wrote:
> This patch series addresses two issues related to the device reference
> counting and cleanup path in the idxd driver.
> 
> Recent changes introduced improper put_device() calls and duplicated
> cleanup logic, leading to refcount underflow and potential use-after-free
> during module unload.
> 
> [...]

Applied, thanks!

[1/2] dmaengine: idxd: Remove improper idxd_free
      commit: f41c538881eec4dcf5961a242097d447f848cda6
[2/2] dmaengine: idxd: Fix refcount underflow on module unload
      commit: b7cb9a034305d52222433fad10c3de10204f29e7

Best regards,
-- 
~Vinod


