linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload
@ 2025-06-17 10:27 Yi Sun
  2025-06-17 10:27 ` [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free Yi Sun
  2025-06-17 10:27 ` [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload Yi Sun
  0 siblings, 2 replies; 15+ messages in thread
From: Yi Sun @ 2025-06-17 10:27 UTC (permalink / raw)
  To: vinicius.gomes, dmaengine, linux-kernel
  Cc: dave.jiang, yi.sun, gordon.jin, fenghuay

This patch series addresses two issues related to the device reference
counting and cleanup path in the idxd driver.

Recent changes introduced improper put_device() calls and duplicated
cleanup logic, leading to refcount underflow and potential use-after-free
during module unload.

Patch 1 removes an unnecessary call to idxd_free(), which could result in a
use-after-free when paired with asynchronous put_device().

Patch 2 refactors the cleanup path to avoid redundant put_device() calls
introduced in commit a409e919ca3. The existing idxd_unregister_devices()
already handles proper device reference release.

Both patches have been verified on hardware platform.

Both patches have been run through `checkpatch.pl`. Patch 2 gets 1 error
and 1 warning. But these appear to be limitations in the checkpatch script
itself, not reflect issues with the patches.

---
Changes in V3:
- Removed function idxd_disable_sva which got removed recently (Vinicius)
Changes in v2:
- Reworded commit messages supplementing the call traces (Vinicius)
- Explain why the put_device are unnecessary. (Vinicius)

Yi Sun (2):
  dmaengine: idxd: Remove improper idxd_free
  dmaengine: idxd: Fix refcount underflow on module unload

 drivers/dma/idxd/init.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

-- 
2.43.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free
  2025-06-17 10:27 [PATCH v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
@ 2025-06-17 10:27 ` Yi Sun
  2025-06-17 22:13   ` Fenghua Yu
  2025-06-17 10:27 ` [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload Yi Sun
  1 sibling, 1 reply; 15+ messages in thread
From: Yi Sun @ 2025-06-17 10:27 UTC (permalink / raw)
  To: vinicius.gomes, dmaengine, linux-kernel
  Cc: dave.jiang, yi.sun, gordon.jin, fenghuay

The call to idxd_free() introduces a duplicate put_device() leading to a
reference count underflow:
refcount_t: underflow; use-after-free.
WARNING: CPU: 15 PID: 4428 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
...
Call Trace:
 <TASK>
  idxd_remove+0xe4/0x120 [idxd]
  pci_device_remove+0x3f/0xb0
  device_release_driver_internal+0x197/0x200
  driver_detach+0x48/0x90
  bus_remove_driver+0x74/0xf0
  pci_unregister_driver+0x2e/0xb0
  idxd_exit_module+0x34/0x7a0 [idxd]
  __do_sys_delete_module.constprop.0+0x183/0x280
  do_syscall_64+0x54/0xd70
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

The idxd_unregister_devices() which is invoked at the very beginning of
idxd_remove(), already takes care of the necessary put_device() through the
following call path:
idxd_unregister_devices() -> device_unregister() -> put_device()

In addition, when CONFIG_DEBUG_KOBJECT_RELEASE is enabled, put_device() may
trigger asynchronous cleanup via schedule_delayed_work(). If idxd_free() is
called immediately after, it can result in a use-after-free.

Remove the improper idxd_free() to avoid both the refcount underflow and
potential memory corruption during module unload.

Fixes: d5449ff1b04d ("dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call")
Signed-off-by: Yi Sun <yi.sun@intel.com>

diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 80355d03004d..40cc9c070081 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -1295,7 +1295,6 @@ static void idxd_remove(struct pci_dev *pdev)
 	idxd_cleanup(idxd);
 	pci_iounmap(pdev, idxd->reg_base);
 	put_device(idxd_confdev(idxd));
-	idxd_free(idxd);
 	pci_disable_device(pdev);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-06-17 10:27 [PATCH v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
  2025-06-17 10:27 ` [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free Yi Sun
@ 2025-06-17 10:27 ` Yi Sun
  2025-06-17 21:58   ` Fenghua Yu
  1 sibling, 1 reply; 15+ messages in thread
From: Yi Sun @ 2025-06-17 10:27 UTC (permalink / raw)
  To: vinicius.gomes, dmaengine, linux-kernel
  Cc: dave.jiang, yi.sun, gordon.jin, fenghuay

A recent refactor introduced a misplaced put_device() call, leading to a
reference count underflow during module unload.

There is no need to add additional put_device() calls for idxd groups,
engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
also fixes the missing put_device() for idxd groups, engines, and wqs."
It appears no such omission existed. The required cleanup is already
handled by the call chain:
idxd_unregister_devices() -> device_unregister() -> put_device()

Extend idxd_cleanup() to perform the necessary cleanup, and remove
idxd_cleanup_internals() which was not originally part of the driver
unload path and introduced unintended reference count underflow.

Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
Signed-off-by: Yi Sun <yi.sun@intel.com>

diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 40cc9c070081..40f4bf446763 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
 	device_unregister(idxd_confdev(idxd));
 	idxd_shutdown(pdev);
 	idxd_device_remove_debugfs(idxd);
-	idxd_cleanup(idxd);
+	perfmon_pmu_remove(idxd);
+	idxd_cleanup_interrupts(idxd);
+	if (device_pasid_enabled(idxd))
+		idxd_disable_system_pasid(idxd);
 	pci_iounmap(pdev, idxd->reg_base);
 	put_device(idxd_confdev(idxd));
 	pci_disable_device(pdev);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-06-17 10:27 ` [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload Yi Sun
@ 2025-06-17 21:58   ` Fenghua Yu
  2025-06-18  0:38     ` Vinicius Costa Gomes
  2025-07-27  9:16     ` Yi Sun
  0 siblings, 2 replies; 15+ messages in thread
From: Fenghua Yu @ 2025-06-17 21:58 UTC (permalink / raw)
  To: Yi Sun, vinicius.gomes, dmaengine, linux-kernel; +Cc: dave.jiang, gordon.jin

Hi, Yi,

On 6/17/25 03:27, Yi Sun wrote:
> A recent refactor introduced a misplaced put_device() call, leading to a
> reference count underflow during module unload.
>
> There is no need to add additional put_device() calls for idxd groups,
> engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
> also fixes the missing put_device() for idxd groups, engines, and wqs."
> It appears no such omission existed. The required cleanup is already
> handled by the call chain:
>
>
> Extend idxd_cleanup() to perform the necessary cleanup, and remove
> idxd_cleanup_internals() which was not originally part of the driver
> unload path and introduced unintended reference count underflow.
>
> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
> Signed-off-by: Yi Sun <yi.sun@intel.com>
>
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 40cc9c070081..40f4bf446763 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>   	device_unregister(idxd_confdev(idxd));
>   	idxd_shutdown(pdev);
>   	idxd_device_remove_debugfs(idxd);
> -	idxd_cleanup(idxd);
> +	perfmon_pmu_remove(idxd);
> +	idxd_cleanup_interrupts(idxd);
> +	if (device_pasid_enabled(idxd))
> +		idxd_disable_system_pasid(idxd);
>
This will hit memory leak issue.

idxd_remove_internals() does not only put_device() but also free 
allocated memory for wqs, engines, groups. Without calling 
idxd_remove_internals(), the allocated memory is leaked.

I think a right fix is to remove the put_device() in 
idxd_cleanup_wqs/engines/groups() because:

1. idxd_setup_wqs/engines/groups() does not call get_device(). Their 
counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().

2. Fix the issue mentioned in this patch while there is no memory leak 
issue.

>   	pci_iounmap(pdev, idxd->reg_base);
>   	put_device(idxd_confdev(idxd));
>   	pci_disable_device(pdev);

Thanks.

-Fenghua


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free
  2025-06-17 10:27 ` [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free Yi Sun
@ 2025-06-17 22:13   ` Fenghua Yu
  2025-07-27  9:02     ` Yi Sun
  0 siblings, 1 reply; 15+ messages in thread
From: Fenghua Yu @ 2025-06-17 22:13 UTC (permalink / raw)
  To: Yi Sun, vinicius.gomes, dmaengine, linux-kernel; +Cc: dave.jiang, gordon.jin

Hi, Yi,

On 6/17/25 03:27, Yi Sun wrote:
> The call to idxd_free() introduces a duplicate put_device() leading to a
> reference count underflow:
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 15 PID: 4428 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
> ...
> Call Trace:
>   <TASK>
>    idxd_remove+0xe4/0x120 [idxd]
>    pci_device_remove+0x3f/0xb0
>    device_release_driver_internal+0x197/0x200
>    driver_detach+0x48/0x90
>    bus_remove_driver+0x74/0xf0
>    pci_unregister_driver+0x2e/0xb0
>    idxd_exit_module+0x34/0x7a0 [idxd]
>    __do_sys_delete_module.constprop.0+0x183/0x280
>    do_syscall_64+0x54/0xd70
>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> The idxd_unregister_devices() which is invoked at the very beginning of
> idxd_remove(), already takes care of the necessary put_device() through the
> following call path:
> idxd_unregister_devices() -> device_unregister() -> put_device()
>
> In addition, when CONFIG_DEBUG_KOBJECT_RELEASE is enabled, put_device() may
> trigger asynchronous cleanup via schedule_delayed_work(). If idxd_free() is
> called immediately after, it can result in a use-after-free.
>
> Remove the improper idxd_free() to avoid both the refcount underflow and
> potential memory corruption during module unload.
>
> Fixes: d5449ff1b04d ("dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call")
> Signed-off-by: Yi Sun <yi.sun@intel.com>
>
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 80355d03004d..40cc9c070081 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -1295,7 +1295,6 @@ static void idxd_remove(struct pci_dev *pdev)
>   	idxd_cleanup(idxd);
>   	pci_iounmap(pdev, idxd->reg_base);
>   	put_device(idxd_confdev(idxd));
> -	idxd_free(idxd);

Simply removing idxd_free() causes two issues:

1. This hits memory leak issues because allocated idxd, ida, map are not 
freed.

2. There is still an underflow issue for dev refcnt in 
idxd_pci_probe_alloc() when idxd_register_devices() fails. Here 
get_device() is not called but put_device() is called.

A right fix is to remove the put_device() in idxd_free(). This will fix 
all the above issues.

Thanks.

-Fenghua


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-06-17 21:58   ` Fenghua Yu
@ 2025-06-18  0:38     ` Vinicius Costa Gomes
  2025-07-27  9:16     ` Yi Sun
  1 sibling, 0 replies; 15+ messages in thread
From: Vinicius Costa Gomes @ 2025-06-18  0:38 UTC (permalink / raw)
  To: Fenghua Yu, Yi Sun, dmaengine, linux-kernel; +Cc: dave.jiang, gordon.jin

Fenghua Yu <fenghuay@nvidia.com> writes:

> Hi, Yi,
>
> On 6/17/25 03:27, Yi Sun wrote:
>> A recent refactor introduced a misplaced put_device() call, leading to a
>> reference count underflow during module unload.
>>
>> There is no need to add additional put_device() calls for idxd groups,
>> engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>> also fixes the missing put_device() for idxd groups, engines, and wqs."
>> It appears no such omission existed. The required cleanup is already
>> handled by the call chain:
>>
>>
>> Extend idxd_cleanup() to perform the necessary cleanup, and remove
>> idxd_cleanup_internals() which was not originally part of the driver
>> unload path and introduced unintended reference count underflow.
>>
>> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>>
>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>> index 40cc9c070081..40f4bf446763 100644
>> --- a/drivers/dma/idxd/init.c
>> +++ b/drivers/dma/idxd/init.c
>> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>   	device_unregister(idxd_confdev(idxd));
>>   	idxd_shutdown(pdev);
>>   	idxd_device_remove_debugfs(idxd);
>> -	idxd_cleanup(idxd);
>> +	perfmon_pmu_remove(idxd);
>> +	idxd_cleanup_interrupts(idxd);
>> +	if (device_pasid_enabled(idxd))
>> +		idxd_disable_system_pasid(idxd);
>>
> This will hit memory leak issue.
>
> idxd_remove_internals() does not only put_device() but also free 
> allocated memory for wqs, engines, groups. Without calling 
> idxd_remove_internals(), the allocated memory is leaked.
>
> I think a right fix is to remove the put_device() in 
> idxd_cleanup_wqs/engines/groups() because:
>
> 1. idxd_setup_wqs/engines/groups() does not call get_device(). Their 
> counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>
> 2. Fix the issue mentioned in this patch while there is no memory leak 
> issue.
>

In my opinion, I think the problem is a bit different, it is that the
driver is doing a lot of custom deallocation itself and not
trusting/depending on the device lifetime tracking to do the
deallocation of resources. That is, we should free the memory associated
with a device when its .release() is called.

>>   	pci_iounmap(pdev, idxd->reg_base);
>>   	put_device(idxd_confdev(idxd));
>>   	pci_disable_device(pdev);
>
> Thanks.
>
> -Fenghua
>

Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free
  2025-06-17 22:13   ` Fenghua Yu
@ 2025-07-27  9:02     ` Yi Sun
  2025-07-28  8:21       ` Shuai Xue
  0 siblings, 1 reply; 15+ messages in thread
From: Yi Sun @ 2025-07-27  9:02 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: vinicius.gomes, dmaengine, linux-kernel, dave.jiang, gordon.jin

On 17.06.2025 15:13, Fenghua Yu wrote:
>Hi, Yi,
>
>On 6/17/25 03:27, Yi Sun wrote:
>>The call to idxd_free() introduces a duplicate put_device() leading to a
>>reference count underflow:
>>refcount_t: underflow; use-after-free.
>>WARNING: CPU: 15 PID: 4428 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
>>...
>>Call Trace:
>>  <TASK>
>>   idxd_remove+0xe4/0x120 [idxd]
>>   pci_device_remove+0x3f/0xb0
>>   device_release_driver_internal+0x197/0x200
>>   driver_detach+0x48/0x90
>>   bus_remove_driver+0x74/0xf0
>>   pci_unregister_driver+0x2e/0xb0
>>   idxd_exit_module+0x34/0x7a0 [idxd]
>>   __do_sys_delete_module.constprop.0+0x183/0x280
>>   do_syscall_64+0x54/0xd70
>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>>The idxd_unregister_devices() which is invoked at the very beginning of
>>idxd_remove(), already takes care of the necessary put_device() through the
>>following call path:
>>idxd_unregister_devices() -> device_unregister() -> put_device()
>>
>>In addition, when CONFIG_DEBUG_KOBJECT_RELEASE is enabled, put_device() may
>>trigger asynchronous cleanup via schedule_delayed_work(). If idxd_free() is
>>called immediately after, it can result in a use-after-free.
>>
>>Remove the improper idxd_free() to avoid both the refcount underflow and
>>potential memory corruption during module unload.
>>
>>Fixes: d5449ff1b04d ("dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call")
>>Signed-off-by: Yi Sun <yi.sun@intel.com>
>>
>>diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>index 80355d03004d..40cc9c070081 100644
>>--- a/drivers/dma/idxd/init.c
>>+++ b/drivers/dma/idxd/init.c
>>@@ -1295,7 +1295,6 @@ static void idxd_remove(struct pci_dev *pdev)
>>  	idxd_cleanup(idxd);
>>  	pci_iounmap(pdev, idxd->reg_base);
>>  	put_device(idxd_confdev(idxd));
>>-	idxd_free(idxd);
>
>Simply removing idxd_free() causes two issues:
>
>1. This hits memory leak issues because allocated idxd, ida, map are 
>not freed.
>
>2. There is still an underflow issue for dev refcnt in 
>idxd_pci_probe_alloc() when idxd_register_devices() fails. Here 
>get_device() is not called but put_device() is called.
>
>A right fix is to remove the put_device() in idxd_free(). This will 
>fix all the above issues.
>
>Thanks.
>
>-Fenghua
>

Hi Fenghua,
 From my understanding, the function idxd_conf_device_release already
covers everything done in idxd_free, including:
     bitmap_free(idxd->opcap_bmap);
     ida_free(&idxd_ida, idxd->id);
     kfree(idxd);

At least the newly added idxd_free in commit 90022b3 doesn't resolve
any memory leaks, but introduces several duplicated cleanup.

reference:
```
static void idxd_free(struct idxd_device *idxd)
{
        if (!idxd)
                return;

        put_device(idxd_confdev(idxd));
        bitmap_free(idxd->opcap_bmap);
        ida_free(&idxd_ida, idxd->id);
        kfree(idxd);
}
```

V.S.
```
static void idxd_conf_device_release(struct device *dev)
{
	struct idxd_device *idxd = confdev_to_idxd(dev);

	kfree(idxd->groups);
	bitmap_free(idxd->wq_enable_map);
	kfree(idxd->wqs);
	kfree(idxd->engines);
	kfree(idxd->evl);
	kmem_cache_destroy(idxd->evl_cache);
	ida_free(&idxd_ida, idxd->id);
	bitmap_free(idxd->opcap_bmap);
	kfree(idxd);
}
```

Thanks
    --Sun, Yi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-06-17 21:58   ` Fenghua Yu
  2025-06-18  0:38     ` Vinicius Costa Gomes
@ 2025-07-27  9:16     ` Yi Sun
  2025-07-28  8:40       ` Shuai Xue
  1 sibling, 1 reply; 15+ messages in thread
From: Yi Sun @ 2025-07-27  9:16 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: vinicius.gomes, dmaengine, linux-kernel, dave.jiang, gordon.jin

On 17.06.2025 14:58, Fenghua Yu wrote:
>Hi, Yi,
>
>On 6/17/25 03:27, Yi Sun wrote:
>>A recent refactor introduced a misplaced put_device() call, leading to a
>>reference count underflow during module unload.
>>
>>There is no need to add additional put_device() calls for idxd groups,
>>engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>also fixes the missing put_device() for idxd groups, engines, and wqs."
>>It appears no such omission existed. The required cleanup is already
>>handled by the call chain:
>>
>>
>>Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>idxd_cleanup_internals() which was not originally part of the driver
>>unload path and introduced unintended reference count underflow.
>>
>>Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>Signed-off-by: Yi Sun <yi.sun@intel.com>
>>
>>diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>index 40cc9c070081..40f4bf446763 100644
>>--- a/drivers/dma/idxd/init.c
>>+++ b/drivers/dma/idxd/init.c
>>@@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>  	device_unregister(idxd_confdev(idxd));
>>  	idxd_shutdown(pdev);
>>  	idxd_device_remove_debugfs(idxd);
>>-	idxd_cleanup(idxd);
>>+	perfmon_pmu_remove(idxd);
>>+	idxd_cleanup_interrupts(idxd);
>>+	if (device_pasid_enabled(idxd))
>>+		idxd_disable_system_pasid(idxd);
>>
>This will hit memory leak issue.
>
>idxd_remove_internals() does not only put_device() but also free 
>allocated memory for wqs, engines, groups. Without calling 
>idxd_remove_internals(), the allocated memory is leaked.
>
>I think a right fix is to remove the put_device() in 
>idxd_cleanup_wqs/engines/groups() because:
>
>1. idxd_setup_wqs/engines/groups() does not call get_device(). Their 
>counterpart idxd_cleanup_wqs/engines/groups() shouldn't call 
>put_device().
>
>2. Fix the issue mentioned in this patch while there is no memory leak 
>issue.
>
>>  	pci_iounmap(pdev, idxd->reg_base);
>>  	put_device(idxd_confdev(idxd));
>>  	pci_disable_device(pdev);
>
>Thanks.
>
>-Fenghua
>

Hi Fenghua,

As with the comments on the previous patch, the function
idxd_conf_device_release already covers part of what is done in
idxd_remove_internals, including:
	kfree(idxd->groups);
	kfree(idxd->wqs);
	kfree(idxd->engines);
	kfree(idxd);

We need to redesign idxd_remove_internals to clearly identify what
truely result in memory leaks and should be handled there.
Then, we'll need another patch to fix the idxd_remove_internals and call
it here.

Let's prioritize addressing the two critical issues we've encountered here,
and then revisit the memory leak discussion afterward.

Thanks
   --Sun, Yi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free
  2025-07-27  9:02     ` Yi Sun
@ 2025-07-28  8:21       ` Shuai Xue
  0 siblings, 0 replies; 15+ messages in thread
From: Shuai Xue @ 2025-07-28  8:21 UTC (permalink / raw)
  To: Yi Sun, Fenghua Yu
  Cc: vinicius.gomes, dmaengine, linux-kernel, dave.jiang, gordon.jin

Hi, Sun, Yi and Fenghua,

在 2025/7/27 17:02, Yi Sun 写道:
> On 17.06.2025 15:13, Fenghua Yu wrote:
>> Hi, Yi,
>>
>> On 6/17/25 03:27, Yi Sun wrote:
>>> The call to idxd_free() introduces a duplicate put_device() leading to a
>>> reference count underflow:
>>> refcount_t: underflow; use-after-free.
>>> WARNING: CPU: 15 PID: 4428 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
>>> ...
>>> Call Trace:
>>>  <TASK>
>>>   idxd_remove+0xe4/0x120 [idxd]
>>>   pci_device_remove+0x3f/0xb0
>>>   device_release_driver_internal+0x197/0x200
>>>   driver_detach+0x48/0x90
>>>   bus_remove_driver+0x74/0xf0
>>>   pci_unregister_driver+0x2e/0xb0
>>>   idxd_exit_module+0x34/0x7a0 [idxd]
>>>   __do_sys_delete_module.constprop.0+0x183/0x280
>>>   do_syscall_64+0x54/0xd70
>>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>
>>> The idxd_unregister_devices() which is invoked at the very beginning of
>>> idxd_remove(), already takes care of the necessary put_device() through the
>>> following call path:
>>> idxd_unregister_devices() -> device_unregister() -> put_device()
>>>
>>> In addition, when CONFIG_DEBUG_KOBJECT_RELEASE is enabled, put_device() may
>>> trigger asynchronous cleanup via schedule_delayed_work(). If idxd_free() is
>>> called immediately after, it can result in a use-after-free.
>>>
>>> Remove the improper idxd_free() to avoid both the refcount underflow and
>>> potential memory corruption during module unload.
>>>
>>> Fixes: d5449ff1b04d ("dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call")
>>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>
>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>> index 80355d03004d..40cc9c070081 100644
>>> --- a/drivers/dma/idxd/init.c
>>> +++ b/drivers/dma/idxd/init.c
>>> @@ -1295,7 +1295,6 @@ static void idxd_remove(struct pci_dev *pdev)
>>>      idxd_cleanup(idxd);
>>>      pci_iounmap(pdev, idxd->reg_base);
>>>      put_device(idxd_confdev(idxd));
>>> -    idxd_free(idxd);
>>
>> Simply removing idxd_free() causes two issues:
>>
>> 1. This hits memory leak issues because allocated idxd, ida, map are not freed.
>>
>> 2. There is still an underflow issue for dev refcnt in idxd_pci_probe_alloc() when idxd_register_devices() fails. Here get_device() is not called but put_device() is called.
>>
>> A right fix is to remove the put_device() in idxd_free(). This will fix all the above issues.
>>
>> Thanks.
>>
>> -Fenghua
>>
> 
> Hi Fenghua,
>  From my understanding, the function idxd_conf_device_release already
> covers everything done in idxd_free, including:
>      bitmap_free(idxd->opcap_bmap);
>      ida_free(&idxd_ida, idxd->id);
>      kfree(idxd);
> 
> At least the newly added idxd_free in commit 90022b3 doesn't resolve
> any memory leaks, but introduces several duplicated cleanup.
> 
> reference:
> ```
> static void idxd_free(struct idxd_device *idxd)
> {
>         if (!idxd)
>                 return;
> 
>         put_device(idxd_confdev(idxd));
>         bitmap_free(idxd->opcap_bmap);
>         ida_free(&idxd_ida, idxd->id);
>         kfree(idxd);
> }
> ```
> 
> V.S.
> ```
> static void idxd_conf_device_release(struct device *dev)
> {
>      struct idxd_device *idxd = confdev_to_idxd(dev);
> 
>      kfree(idxd->groups);
>      bitmap_free(idxd->wq_enable_map);
>      kfree(idxd->wqs);
>      kfree(idxd->engines);
>      kfree(idxd->evl);
>      kmem_cache_destroy(idxd->evl_cache);
>      ida_free(&idxd_ida, idxd->id);
>      bitmap_free(idxd->opcap_bmap);
>      kfree(idxd);
> }
> ```
> 
> Thanks
>     --Sun, Yi

Thanks for the detailed analysis. After reviewing the code paths, I
agree with Yi Sun's assessment.

Looking at the call flow:

     idxd_remove()
     	-> idxd_unregister_devices
     		-> idxd_conf_device_release
     			-> cleaup // first freed here
     	-> idxd_free
     		-> clean up  // double freed here

The idxd_conf_device_release() function already handles all the cleanup
that idxd_free() attempts to do:

Calling idxd_free() after idxd_unregister_devices() results in
double-free issues. I can reproduce this with the following warning:

     ------------[ cut here ]------------
     ida_free called for id=4 which is not allocated.
     WARNING: CPU: 169 PID: 13495 at lib/idr.c:592 ida_free+0xe2/0x140
     Modules linked in: binfmt_misc(E) bonding(E) tls(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) i10nm_edac(E) skx_edac_common(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) rfkill(E) kvm_intel(E) kvm(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) irqbypass(E) mlx5_ib(E) nf_tables(E) snd_hda_core(E) dax_hmem(E) ghash_clmulni_intel(E) snd_hwdep(E) cxl_acpi(E) rapl(E) snd_pcm(E) cxl_port(E) pmt_telemetry(E) iTCO_wdt(E) qat_4xxx(E) ib_uverbs(E) cxl_core(E) iTCO_vendor_support(E) pmt_class(E) isst_if_mbox_pci(E) intel_cstate(E) isst_if_mmio(E) snd_timer(E) tun(E) intel_qat(E) nfnetlink(E) intel_uncore(E) einj(E) pcspkr(E) joydev(E) ib_core(E) dh_generic(E) idxd(E-) crc8(E) isst_if_common(E) authenc(E) intel_vsec(E) idxd_bus(E) snd(E) mei_me(E) i2c_i801(E) soundcore(E) i2c_smbus(E) mei(E) i2c_ismt(E) ipmi_ssif(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandle
     drm_client_lib(E) i2c_algo_bit(E) drm_shmem_helper(E) drm_kms_helper(E) nvme(E) mlx5_core(E) mlxfw(E) nvme_core(E) pci_hyperv_intf(E) psample(E) drm(E) wmi(E) pinctrl_emmitsburg(E) sd_mod(E) sg(E) ahci(E) libahci(E) libata(E) fuse(E)
     CPU: 169 UID: 0 PID: 13495 Comm: rmmod Kdump: loaded Tainted: G S      W   E       6.16.0-rc6 #115 PREEMPT(none)
     Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN, [E]=UNSIGNED_MODULE
     Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
     RIP: 0010:ida_free+0xe2/0x140
     Code: f6 48 89 e7 e8 bf 2b 02 00 eb 5e 83 fb 3e 76 3a 48 8b 3c 24 4c 89 ee e8 bc a1 03 00 89 ee 48 c7 c7 b8 be 7a 9a e8 4e 70 2f ff <0f> 0b 48 8b 44 24 38 65 48 2b 05 17 11 20 02 75 3c 48 83 c4 40 5b
     RSP: 0018:ff51ea5f331e3d78 EFLAGS: 00010282
     RAX: 0000000000000000 RBX: 0000000000000004 RCX: ff3a3c153f85c448
     RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ff3a3c153f85c440
     RBP: 0000000000000004 R08: 0000000000000000 R09: ff51ea5f331e3c28
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     R13: 0000000000000202 R14: ff3a3b97569ecc80 R15: ffffffffc0d41938
     R10: ff51ea5f331e3c20 R11: ffffffff9b761d68 R12: 000000000000000f
     FS:  00007f4c7e1c3740(0000) GS:ff3a3c15a3679000(0000) knlGS:0000000000000000
     CR2: 0000562018b242c8 CR3: 00000080a849a004 CR4: 0000000000f73ef0
     Call Trace:
     <TASK>
     PKRU: 55555554
     idxd_remove+0x8d/0xa0 [idxd]
     device_release_driver_internal+0x197/0x200
     pci_device_remove+0x3f/0xb0
     driver_detach+0x48/0x90
     do_syscall_64+0x5f/0x2d0
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
     __do_sys_delete_module.constprop.0+0x174/0x310
     idxd_exit_module+0x34/0x680 [idxd]
     bus_remove_driver+0x6d/0xf0
     pci_unregister_driver+0x2e/0xb0
     RIP: 0033:0x7f4c7dc620cb
     RSP: 002b:00007ffcb58d0b78 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
     Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
     RAX: ffffffffffffffda RBX: 0000562018b19780 RCX: 00007f4c7dc620cb
     RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000562018b197e8
     RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
     R10: 00007f4c7ddaaac0 R11: 0000000000000206 R12: 00007ffcb58d0da0
     </TASK>
     R13: 00007ffcb58d2352 R14: 0000562018b192a0 R15: 0000562018b19780

Yi Sun's fix to remove the idxd_free() call is the correct approach. The
device release mechanism properly handles all the cleanup through the
established kernel device model.

Sorry for the confusion in the original implementation.

Tested-by: Shuai Xue <xueshuai@linux.alibaba.com>

Best regards,
Shuai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-07-27  9:16     ` Yi Sun
@ 2025-07-28  8:40       ` Shuai Xue
  2025-07-28 11:43         ` Yi Sun
  0 siblings, 1 reply; 15+ messages in thread
From: Shuai Xue @ 2025-07-28  8:40 UTC (permalink / raw)
  To: Yi Sun, Fenghua Yu
  Cc: vinicius.gomes, dmaengine, linux-kernel, dave.jiang, gordon.jin,
	xueshuai

Hi, Yi Sun, Fenghua Yu,

在 2025/7/27 17:16, Yi Sun 写道:
> On 17.06.2025 14:58, Fenghua Yu wrote:
>> Hi, Yi,
>>
>> On 6/17/25 03:27, Yi Sun wrote:
>>> A recent refactor introduced a misplaced put_device() call, leading to a
>>> reference count underflow during module unload.
>>>
>>> There is no need to add additional put_device() calls for idxd groups,
>>> engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>> also fixes the missing put_device() for idxd groups, engines, and wqs."
>>> It appears no such omission existed. The required cleanup is already
>>> handled by the call chain:
>>>
>>>
>>> Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>> idxd_cleanup_internals() which was not originally part of the driver
>>> unload path and introduced unintended reference count underflow.
>>>
>>> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>
>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>> index 40cc9c070081..40f4bf446763 100644
>>> --- a/drivers/dma/idxd/init.c
>>> +++ b/drivers/dma/idxd/init.c
>>> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>>      device_unregister(idxd_confdev(idxd));
>>>      idxd_shutdown(pdev);
>>>      idxd_device_remove_debugfs(idxd);
>>> -    idxd_cleanup(idxd);
>>> +    perfmon_pmu_remove(idxd);
>>> +    idxd_cleanup_interrupts(idxd);
>>> +    if (device_pasid_enabled(idxd))
>>> +        idxd_disable_system_pasid(idxd);
>>>
>> This will hit memory leak issue.
>>
>> idxd_remove_internals() does not only put_device() but also free allocated memory for wqs, engines, groups. Without calling idxd_remove_internals(), the allocated memory is leaked.
>>
>> I think a right fix is to remove the put_device() in idxd_cleanup_wqs/engines/groups() because:
>>
>> 1. idxd_setup_wqs/engines/groups() does not call get_device(). Their counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>>
>> 2. Fix the issue mentioned in this patch while there is no memory leak issue.
>>
>>>      pci_iounmap(pdev, idxd->reg_base);
>>>      put_device(idxd_confdev(idxd));
>>>      pci_disable_device(pdev);
>>
>> Thanks.
>>
>> -Fenghua
>>
> 
> Hi Fenghua,
> 
> As with the comments on the previous patch, the function
> idxd_conf_device_release already covers part of what is done in
> idxd_remove_internals, including:
>      kfree(idxd->groups);
>      kfree(idxd->wqs);
>      kfree(idxd->engines);
>      kfree(idxd);
> 
> We need to redesign idxd_remove_internals to clearly identify what
> truely result in memory leaks and should be handled there.
> Then, we'll need another patch to fix the idxd_remove_internals and call
> it here.
> 
> Let's prioritize addressing the two critical issues we've encountered here,
> and then revisit the memory leak discussion afterward.
> 
> Thanks
>    --Sun, Yi

The root cause is simliar to Patch 1, the idxd_conf_device_release()
function already handles part of the cleanup that
idxd_cleanup_internals() attempts to do, e.g.

     idxd->wqs
     idxd->einges
     idxd->groups

As a result, it causes user-after-free issue.

     ------------[ cut here ]------------
     WARNING: CPU: 190 PID: 13854 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
     refcount_t: underflow; use-after-free.
     drm_client_lib(E) i2c_algo_bit(E) drm_shmem_helper(E) drm_kms_helper(E) nvme(E) mlx5_core(E) mlxfw(E) nvme_core(E) pci_hyperv_intf(E) psample(E) drm(E) wmi(E) pinctrl_emmitsburg(E) sd_mod(E) sg(E) ahci(E) libahci(E) libata(E) fuse(E)
     Modules linked in: binfmt_misc(E) bonding(E) tls(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) i10nm_edac(E) skx_edac_common(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) snd_hda_intel(E) kvm(E) snd_intel_dspcfg(E) snd_hda_codec(E) mlx5_ib(E) irqbypass(E) qat_4xxx(E) snd_hda_core(E) dax_hmem(E) ghash_clmulni_intel(E) intel_qat(E) cxl_acpi(E) snd_hwdep(E) rapl(E) snd_pcm(E) cxl_port(E) iTCO_wdt(E) intel_cstate(E) iTCO_vendor_support(E) snd_timer(E) ib_uverbs(E) pmt_telemetry(E) cxl_core(E) rfkill(E) tun(E) dh_generic(E) pmt_class(E) intel_uncore(E) einj(E) pcspkr(E) isst_if_mbox_pci(E) joydev(E) isst_if_mmio(E) idxd(E-) crc8(E) ib_core(E) isst_if_common(E) authenc(E) intel_vsec(E) idxd_bus(E) snd(E) i2c_i801(E) mei_me(E) soundcore(E) i2c_smbus(E) i2c_ismt(E) mei(E) nf_tables(E) nfnetlink(E) ipmi_ssif(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandle
     CPU: 190 UID: 0 PID: 13854 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #116 PREEMPT(none)
     Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
     Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
     RIP: 0010:refcount_warn_saturate+0xbe/0x110
     RSP: 0018:ff7078d2df957db8 EFLAGS: 00010282
     RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
     RBP: ff2b10355b055000 R08: 0000000000000000 R09: ff7078d2df957c68
     RDX: ff2b10b33fdaa3c0 RSI: 0000000000000001 RDI: ff2b10b33fd9c440
     R10: ff7078d2df957c60 R11: ffffffffbe761d68 R12: ff2b1035a00db400
     R13: ff2b10355670b148 R14: ff2b103555097c80 R15: ffffffffc0a88938
     FS:  00007fb5f8f3b740(0000) GS:ff2b10b380bb9000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00005560a898c2d8 CR3: 00000080f9def005 CR4: 0000000000f73ef0
     PKRU: 55555554
     Code: 01 01 e8 35 9b 9a ff 0f 0b c3 cc cc cc cc 80 3d 05 4c f5 01 00 75 85 48 c7 c7 30 21 75 bd c6 05 f5 4b f5 01 01 e8 12 9b 9a ff <0f> 0b c3 cc cc cc cc 80 3d e0 4b f5 01 00 0f 85 5e ff ff ff 48 c7
     Call Trace:
     idxd_cleanup+0x6b/0x100 [idxd]
     <TASK>
     idxd_remove+0x46/0x70 [idxd]
     pci_device_remove+0x3f/0xb0
     driver_detach+0x48/0x90
     device_release_driver_internal+0x197/0x200
     bus_remove_driver+0x6d/0xf0
     idxd_exit_module+0x34/0x6c0 [idxd]
     __do_sys_delete_module.constprop.0+0x174/0x310
     do_syscall_64+0x5f/0x2d0
     pci_unregister_driver+0x2e/0xb0
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
     RIP: 0033:0x7fb5f8a620cb
     Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
     RSP: 002b:00007ffeed6101c8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
     RAX: ffffffffffffffda RBX: 00005560a8981790 RCX: 00007fb5f8a620cb
     RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005560a89817f8
     R13: 00007ffeed612352 R14: 00005560a89812a0 R15: 00005560a8981790
     R10: 00007fb5f8baaac0 R11: 0000000000000206 R12: 00007ffeed6103f0
     </TASK>
     ---[ end trace 0000000000000000 ]---

With this patch applied, the user-after-free issue is fixed.

But, there is still memory leaks in

     idxd->wqs[i]
     idxd->einges[i]
     idxd->groups[i]

I agree with Vinicius that we should cleanup in idxd_conf_device_release().

Thanks.
Shuai


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-07-28  8:40       ` Shuai Xue
@ 2025-07-28 11:43         ` Yi Sun
  2025-07-29  2:46           ` Shuai Xue
  0 siblings, 1 reply; 15+ messages in thread
From: Yi Sun @ 2025-07-28 11:43 UTC (permalink / raw)
  To: Shuai Xue
  Cc: Fenghua Yu, vinicius.gomes, dmaengine, linux-kernel, dave.jiang,
	gordon.jin

On 28.07.2025 16:40, Shuai Xue wrote:
>Hi, Yi Sun, Fenghua Yu,
>
>在 2025/7/27 17:16, Yi Sun 写道:
>>On 17.06.2025 14:58, Fenghua Yu wrote:
>>>Hi, Yi,
>>>
>>>On 6/17/25 03:27, Yi Sun wrote:
>>>>A recent refactor introduced a misplaced put_device() call, leading to a
>>>>reference count underflow during module unload.
>>>>
>>>>There is no need to add additional put_device() calls for idxd groups,
>>>>engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>>>also fixes the missing put_device() for idxd groups, engines, and wqs."
>>>>It appears no such omission existed. The required cleanup is already
>>>>handled by the call chain:
>>>>
>>>>
>>>>Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>>>idxd_cleanup_internals() which was not originally part of the driver
>>>>unload path and introduced unintended reference count underflow.
>>>>
>>>>Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>>>Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>>
>>>>diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>>>index 40cc9c070081..40f4bf446763 100644
>>>>--- a/drivers/dma/idxd/init.c
>>>>+++ b/drivers/dma/idxd/init.c
>>>>@@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>>>     device_unregister(idxd_confdev(idxd));
>>>>     idxd_shutdown(pdev);
>>>>     idxd_device_remove_debugfs(idxd);
>>>>-    idxd_cleanup(idxd);
>>>>+    perfmon_pmu_remove(idxd);
>>>>+    idxd_cleanup_interrupts(idxd);
>>>>+    if (device_pasid_enabled(idxd))
>>>>+        idxd_disable_system_pasid(idxd);
>>>>
>>>This will hit memory leak issue.
>>>
>>>idxd_remove_internals() does not only put_device() but also free allocated memory for wqs, engines, groups. Without calling idxd_remove_internals(), the allocated memory is leaked.
>>>
>>>I think a right fix is to remove the put_device() in idxd_cleanup_wqs/engines/groups() because:
>>>
>>>1. idxd_setup_wqs/engines/groups() does not call get_device(). Their counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>>>
>>>2. Fix the issue mentioned in this patch while there is no memory leak issue.
>>>
>>>>     pci_iounmap(pdev, idxd->reg_base);
>>>>     put_device(idxd_confdev(idxd));
>>>>     pci_disable_device(pdev);
>>>
>>>Thanks.
>>>
>>>-Fenghua
>>>
>>
>>Hi Fenghua,
>>
>>As with the comments on the previous patch, the function
>>idxd_conf_device_release already covers part of what is done in
>>idxd_remove_internals, including:
>>     kfree(idxd->groups);
>>     kfree(idxd->wqs);
>>     kfree(idxd->engines);
>>     kfree(idxd);
>>
>>We need to redesign idxd_remove_internals to clearly identify what
>>truely result in memory leaks and should be handled there.
>>Then, we'll need another patch to fix the idxd_remove_internals and call
>>it here.
>>
>>Let's prioritize addressing the two critical issues we've encountered here,
>>and then revisit the memory leak discussion afterward.
>>
>>Thanks
>>   --Sun, Yi
>
>The root cause is simliar to Patch 1, the idxd_conf_device_release()
>function already handles part of the cleanup that
>idxd_cleanup_internals() attempts to do, e.g.
>
>    idxd->wqs
>    idxd->einges
>    idxd->groups
>
>As a result, it causes user-after-free issue.
>
>    ------------[ cut here ]------------
>    WARNING: CPU: 190 PID: 13854 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
>    refcount_t: underflow; use-after-free.
>    drm_client_lib(E) i2c_algo_bit(E) drm_shmem_helper(E) drm_kms_helper(E) nvme(E) mlx5_core(E) mlxfw(E) nvme_core(E) pci_hyperv_intf(E) psample(E) drm(E) wmi(E) pinctrl_emmitsburg(E) sd_mod(E) sg(E) ahci(E) libahci(E) libata(E) fuse(E)
>    Modules linked in: binfmt_misc(E) bonding(E) tls(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) i10nm_edac(E) skx_edac_common(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) snd_hda_intel(E) kvm(E) snd_intel_dspcfg(E) snd_hda_codec(E) mlx5_ib(E) irqbypass(E) qat_4xxx(E) snd_hda_core(E) dax_hmem(E) ghash_clmulni_intel(E) intel_qat(E) cxl_acpi(E) snd_hwdep(E) rapl(E) snd_pcm(E) cxl_port(E) iTCO_wdt(E) intel_cstate(E) iTCO_vendor_support(E) snd_timer(E) ib_uverbs(E) pmt_telemetry(E) cxl_core(E) rfkill(E) tun(E) dh_generic(E) pmt_class(E) intel_uncore(E) einj(E) pcspkr(E) isst_if_mbox_pci(E) joydev(E) isst_if_mmio(E) idxd(E-) crc8(E) ib_core(E) isst_if_common(E) authenc(E) intel_vsec(E) idxd_bus(E) snd(E) i2c_i801(E) mei_me(E) soundcore(E) i2c_smbus(E) i2c_ismt(E) mei(E) nf_tables(E) nfnetlink(E) ipmi_ssif(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandle
>    CPU: 190 UID: 0 PID: 13854 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #116 PREEMPT(none)
>    Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>    Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
>    RIP: 0010:refcount_warn_saturate+0xbe/0x110
>    RSP: 0018:ff7078d2df957db8 EFLAGS: 00010282
>    RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
>    RBP: ff2b10355b055000 R08: 0000000000000000 R09: ff7078d2df957c68
>    RDX: ff2b10b33fdaa3c0 RSI: 0000000000000001 RDI: ff2b10b33fd9c440
>    R10: ff7078d2df957c60 R11: ffffffffbe761d68 R12: ff2b1035a00db400
>    R13: ff2b10355670b148 R14: ff2b103555097c80 R15: ffffffffc0a88938
>    FS:  00007fb5f8f3b740(0000) GS:ff2b10b380bb9000(0000) knlGS:0000000000000000
>    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    CR2: 00005560a898c2d8 CR3: 00000080f9def005 CR4: 0000000000f73ef0
>    PKRU: 55555554
>    Code: 01 01 e8 35 9b 9a ff 0f 0b c3 cc cc cc cc 80 3d 05 4c f5 01 00 75 85 48 c7 c7 30 21 75 bd c6 05 f5 4b f5 01 01 e8 12 9b 9a ff <0f> 0b c3 cc cc cc cc 80 3d e0 4b f5 01 00 0f 85 5e ff ff ff 48 c7
>    Call Trace:
>    idxd_cleanup+0x6b/0x100 [idxd]
>    <TASK>
>    idxd_remove+0x46/0x70 [idxd]
>    pci_device_remove+0x3f/0xb0
>    driver_detach+0x48/0x90
>    device_release_driver_internal+0x197/0x200
>    bus_remove_driver+0x6d/0xf0
>    idxd_exit_module+0x34/0x6c0 [idxd]
>    __do_sys_delete_module.constprop.0+0x174/0x310
>    do_syscall_64+0x5f/0x2d0
>    pci_unregister_driver+0x2e/0xb0
>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>    RIP: 0033:0x7fb5f8a620cb
>    Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
>    RSP: 002b:00007ffeed6101c8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>    RAX: ffffffffffffffda RBX: 00005560a8981790 RCX: 00007fb5f8a620cb
>    RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005560a89817f8
>    R13: 00007ffeed612352 R14: 00005560a89812a0 R15: 00005560a8981790
>    R10: 00007fb5f8baaac0 R11: 0000000000000206 R12: 00007ffeed6103f0
>    </TASK>
>    ---[ end trace 0000000000000000 ]---
>
>With this patch applied, the user-after-free issue is fixed.
>
>But, there is still memory leaks in
>
>    idxd->wqs[i]
>    idxd->einges[i]
>    idxd->groups[i]
>
>I agree with Vinicius that we should cleanup in idxd_conf_device_release().
>
>Thanks.
>Shuai

Appreciate Shuai's clarification. Yes, it would be better if fixing the memory
leak issue in idxd_conf_device_release() in a separate patch.

@Shuai, please feel free to proceed if you'd like to cook a patch for it.

Thanks
    --Sun, Yi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-07-28 11:43         ` Yi Sun
@ 2025-07-29  2:46           ` Shuai Xue
  2025-07-29  3:15             ` Yi Sun
  2025-07-31  0:17             ` Vinicius Costa Gomes
  0 siblings, 2 replies; 15+ messages in thread
From: Shuai Xue @ 2025-07-29  2:46 UTC (permalink / raw)
  To: Yi Sun
  Cc: Fenghua Yu, vinicius.gomes, dmaengine, linux-kernel, dave.jiang,
	gordon.jin



在 2025/7/28 19:43, Yi Sun 写道:
> On 28.07.2025 16:40, Shuai Xue wrote:
>> Hi, Yi Sun, Fenghua Yu,
>>
>> 在 2025/7/27 17:16, Yi Sun 写道:
>>> On 17.06.2025 14:58, Fenghua Yu wrote:
>>>> Hi, Yi,
>>>>
>>>> On 6/17/25 03:27, Yi Sun wrote:
>>>>> A recent refactor introduced a misplaced put_device() call, leading to a
>>>>> reference count underflow during module unload.
>>>>>
>>>>> There is no need to add additional put_device() calls for idxd groups,
>>>>> engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>>>> also fixes the missing put_device() for idxd groups, engines, and wqs."
>>>>> It appears no such omission existed. The required cleanup is already
>>>>> handled by the call chain:
>>>>>
>>>>>
>>>>> Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>>>> idxd_cleanup_internals() which was not originally part of the driver
>>>>> unload path and introduced unintended reference count underflow.
>>>>>
>>>>> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>>>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>>>
>>>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>>>> index 40cc9c070081..40f4bf446763 100644
>>>>> --- a/drivers/dma/idxd/init.c
>>>>> +++ b/drivers/dma/idxd/init.c
>>>>> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>>>>     device_unregister(idxd_confdev(idxd));
>>>>>     idxd_shutdown(pdev);
>>>>>     idxd_device_remove_debugfs(idxd);
>>>>> -    idxd_cleanup(idxd);
>>>>> +    perfmon_pmu_remove(idxd);
>>>>> +    idxd_cleanup_interrupts(idxd);
>>>>> +    if (device_pasid_enabled(idxd))
>>>>> +        idxd_disable_system_pasid(idxd);
>>>>>
>>>> This will hit memory leak issue.
>>>>
>>>> idxd_remove_internals() does not only put_device() but also free allocated memory for wqs, engines, groups. Without calling idxd_remove_internals(), the allocated memory is leaked.
>>>>
>>>> I think a right fix is to remove the put_device() in idxd_cleanup_wqs/engines/groups() because:
>>>>
>>>> 1. idxd_setup_wqs/engines/groups() does not call get_device(). Their counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>>>>
>>>> 2. Fix the issue mentioned in this patch while there is no memory leak issue.
>>>>
>>>>>     pci_iounmap(pdev, idxd->reg_base);
>>>>>     put_device(idxd_confdev(idxd));
>>>>>     pci_disable_device(pdev);
>>>>
>>>> Thanks.
>>>>
>>>> -Fenghua
>>>>
>>>
>>> Hi Fenghua,
>>>
>>> As with the comments on the previous patch, the function
>>> idxd_conf_device_release already covers part of what is done in
>>> idxd_remove_internals, including:
>>>     kfree(idxd->groups);
>>>     kfree(idxd->wqs);
>>>     kfree(idxd->engines);
>>>     kfree(idxd);
>>>
>>> We need to redesign idxd_remove_internals to clearly identify what
>>> truely result in memory leaks and should be handled there.
>>> Then, we'll need another patch to fix the idxd_remove_internals and call
>>> it here.
>>>
>>> Let's prioritize addressing the two critical issues we've encountered here,
>>> and then revisit the memory leak discussion afterward.
>>>
>>> Thanks
>>>   --Sun, Yi
>>
>> The root cause is simliar to Patch 1, the idxd_conf_device_release()
>> function already handles part of the cleanup that
>> idxd_cleanup_internals() attempts to do, e.g.
>>
>>    idxd->wqs
>>    idxd->einges
>>    idxd->groups
>>
>> As a result, it causes user-after-free issue.
>>
>>    ------------[ cut here ]------------
>>    WARNING: CPU: 190 PID: 13854 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
>>    refcount_t: underflow; use-after-free.
>>    drm_client_lib(E) i2c_algo_bit(E) drm_shmem_helper(E) drm_kms_helper(E) nvme(E) mlx5_core(E) mlxfw(E) nvme_core(E) pci_hyperv_intf(E) psample(E) drm(E) wmi(E) pinctrl_emmitsburg(E) sd_mod(E) sg(E) ahci(E) libahci(E) libata(E) fuse(E)
>>    Modules linked in: binfmt_misc(E) bonding(E) tls(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) i10nm_edac(E) skx_edac_common(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) snd_hda_intel(E) kvm(E) snd_intel_dspcfg(E) snd_hda_codec(E) mlx5_ib(E) irqbypass(E) qat_4xxx(E) snd_hda_core(E) dax_hmem(E) ghash_clmulni_intel(E) intel_qat(E) cxl_acpi(E) snd_hwdep(E) rapl(E) snd_pcm(E) cxl_port(E) iTCO_wdt(E) intel_cstate(E) iTCO_vendor_support(E) snd_timer(E) ib_uverbs(E) pmt_telemetry(E) cxl_core(E) rfkill(E) tun(E) dh_generic(E) pmt_class(E) intel_uncore(E) einj(E) pcspkr(E) isst_if_mbox_pci(E) joydev(E) isst_if_mmio(E) idxd(E-) crc8(E) ib_core(E) isst_if_common(E) authenc(E) intel_vsec(E) idxd_bus(E) snd(E) i2c_i801(E) mei_me(E) soundcore(E) i2c_smbus(E) i2c_ismt(E) mei(E) nf_tables(E) nfnetlink(E) ipmi_ssif(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandle
>>    CPU: 190 UID: 0 PID: 13854 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #116 PREEMPT(none)
>>    Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>>    Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
>>    RIP: 0010:refcount_warn_saturate+0xbe/0x110
>>    RSP: 0018:ff7078d2df957db8 EFLAGS: 00010282
>>    RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
>>    RBP: ff2b10355b055000 R08: 0000000000000000 R09: ff7078d2df957c68
>>    RDX: ff2b10b33fdaa3c0 RSI: 0000000000000001 RDI: ff2b10b33fd9c440
>>    R10: ff7078d2df957c60 R11: ffffffffbe761d68 R12: ff2b1035a00db400
>>    R13: ff2b10355670b148 R14: ff2b103555097c80 R15: ffffffffc0a88938
>>    FS:  00007fb5f8f3b740(0000) GS:ff2b10b380bb9000(0000) knlGS:0000000000000000
>>    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>    CR2: 00005560a898c2d8 CR3: 00000080f9def005 CR4: 0000000000f73ef0
>>    PKRU: 55555554
>>    Code: 01 01 e8 35 9b 9a ff 0f 0b c3 cc cc cc cc 80 3d 05 4c f5 01 00 75 85 48 c7 c7 30 21 75 bd c6 05 f5 4b f5 01 01 e8 12 9b 9a ff <0f> 0b c3 cc cc cc cc 80 3d e0 4b f5 01 00 0f 85 5e ff ff ff 48 c7
>>    Call Trace:
>>    idxd_cleanup+0x6b/0x100 [idxd]
>>    <TASK>
>>    idxd_remove+0x46/0x70 [idxd]
>>    pci_device_remove+0x3f/0xb0
>>    driver_detach+0x48/0x90
>>    device_release_driver_internal+0x197/0x200
>>    bus_remove_driver+0x6d/0xf0
>>    idxd_exit_module+0x34/0x6c0 [idxd]
>>    __do_sys_delete_module.constprop.0+0x174/0x310
>>    do_syscall_64+0x5f/0x2d0
>>    pci_unregister_driver+0x2e/0xb0
>>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>    RIP: 0033:0x7fb5f8a620cb
>>    Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
>>    RSP: 002b:00007ffeed6101c8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>>    RAX: ffffffffffffffda RBX: 00005560a8981790 RCX: 00007fb5f8a620cb
>>    RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005560a89817f8
>>    R13: 00007ffeed612352 R14: 00005560a89812a0 R15: 00005560a8981790
>>    R10: 00007fb5f8baaac0 R11: 0000000000000206 R12: 00007ffeed6103f0
>>    </TASK>
>>    ---[ end trace 0000000000000000 ]---
>>
>> With this patch applied, the user-after-free issue is fixed.
>>
>> But, there is still memory leaks in
>>
>>    idxd->wqs[i]
>>    idxd->einges[i]
>>    idxd->groups[i]
>>
>> I agree with Vinicius that we should cleanup in idxd_conf_device_release().
>>
>> Thanks.
>> Shuai
> 
> Appreciate Shuai's clarification. Yes, it would be better if fixing the memory
> leak issue in idxd_conf_device_release() in a separate patch.
> 
> @Shuai, please feel free to proceed if you'd like to cook a patch for it.
> 
> Thanks
>     --Sun, Yi

Hi, Sun, Yi,

I need to correct my previous analysis. After further investigation, I
believe there is no memory leak in the current code. The device
reference counting and memory management are properly handled through
the Linux device model. Each component is freed at the conf_dev
destruction time:


     idxd->wqs[i] is freed by idxd_conf_wq_release()
     idxd->einges[i] is freed by idxd_conf_engine_release()
     idxd->groups[i] is freed by idxd_conf_group_release()


Adding additional cleanup in idxd_conf_device_release() would actually
cause double-free issues. I can reproduce this with KASAN:

[kernel][err]BUG: KASAN: slab-use-after-free in idxd_clean_groups+0x137/0x250 [idxd]
[kernel][err]==================================================================
[kernel][err]Read of size 4 at addr ff1100822ff89038 by task rmmod/13956
[kernel][err]
[kernel][err]CPU: 185 UID: 0 PID: 13956 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #117 PREEMPT(none)
[kernel][err]Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
[kernel][err]Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
[kernel][err]Call Trace:
[kernel][err]<TASK>
[kernel][err]dump_stack_lvl+0x55/0x70
[kernel][err]print_address_description.constprop.0+0x27/0x2e0
[kernel][err]? idxd_clean_groups+0x137/0x250 [idxd]
[kernel][err]print_report+0x3e/0x70
[kernel][err]kasan_report+0xb8/0xf0
[kernel][err]? idxd_clean_groups+0x137/0x250 [idxd]
[kernel][err]kasan_check_range+0x39/0x1b0
[kernel][err]idxd_clean_groups+0x137/0x250 [idxd]
[kernel][err]idxd_cleanup_internals+0x1f/0x1b0 [idxd]
[kernel][err]idxd_conf_device_release+0xe2/0x2b0 [idxd]
[kernel][err]device_release+0x9c/0x210
[kernel][err]kobject_cleanup+0x101/0x360
[kernel][err]pci_device_remove+0xab/0x1e0
[kernel][err]idxd_remove+0x93/0xb0 [idxd]
[kernel][err]device_release_driver_internal+0x369/0x530
[kernel][err]driver_detach+0xbf/0x180
[kernel][err]bus_remove_driver+0x11b/0x2a0
[kernel][err]pci_unregister_driver+0x2a/0x250
[kernel][err]? kobject_put+0x55/0xe0
[kernel][err]idxd_exit_module+0x34/0x40 [idxd]
[kernel][err]__do_sys_delete_module.constprop.0+0x2ee/0x580
[kernel][err]? fput_close_sync+0xdd/0x190
[kernel][err]? __pfx___do_sys_delete_module.constprop.0+0x10/0x10
[kernel][err]do_syscall_64+0x5d/0x2d0
[kernel][err]? __pfx_fput_close_sync+0x10/0x10
[kernel][err]entry_SYSCALL_64_after_hwframe+0x76/0x7e
[kernel][err]RIP: 0033:0x7f2ac20620cb
[kernel][err]Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
[kernel][err]RSP: 002b:00007ffdfec0a598 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[kernel][err]RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000560bd6bd77f8
[kernel][err]RAX: ffffffffffffffda RBX: 0000560bd6bd7790 RCX: 00007f2ac20620cb
[kernel][err]R10: 00007f2ac21aaac0 R11: 0000000000000206 R12: 00007ffdfec0a7c0
[kernel][err]RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[kernel][err]</TASK>
[kernel][err]R13: 00007ffdfec0b352 R14: 0000560bd6bd72a0 R15: 0000560bd6bd7790
[kernel][err]Allocated by task 1445:
[kernel][err]
[kernel][warning]kasan_save_stack+0x24/0x50
[kernel][warning]kasan_save_track+0x14/0x30
[kernel][warning]__kasan_kmalloc+0xaa/0xb0
[kernel][warning]idxd_setup_groups+0x2b5/0x7a0 [idxd]
[kernel][warning]idxd_setup_internals+0xd1/0x6e0 [idxd]
[kernel][warning]idxd_probe+0x93/0x310 [idxd]
[kernel][warning]idxd_pci_probe_alloc+0x3f3/0x6e0 [idxd]
[kernel][warning]local_pci_probe+0xe5/0x1a0
[kernel][warning]work_for_cpu_fn+0x52/0xa0
[kernel][warning]process_one_work+0x652/0x1080
[kernel][warning]worker_thread+0x652/0xc70
[kernel][warning]ret_from_fork+0x253/0x2e0
[kernel][warning]kthread+0x34a/0x450
[kernel][warning]ret_from_fork_asm+0x1a/0x30
[kernel][err]
[kernel][err]Freed by task 13956:
[kernel][warning]kasan_save_stack+0x24/0x50
[kernel][warning]kasan_save_track+0x14/0x30
[kernel][warning]kasan_save_free_info+0x3a/0x60
[kernel][warning]__kasan_slab_free+0x54/0x70
[kernel][warning]kfree+0x12e/0x430
[kernel][warning]device_release+0x9c/0x210
[kernel][warning]kobject_cleanup+0x101/0x360
[kernel][warning]idxd_unregister_devices+0x22d/0x320 [idxd]
[kernel][warning]pci_device_remove+0xab/0x1e0
[kernel][warning]idxd_remove+0x3c/0xb0 [idxd]
[kernel][warning]driver_detach+0xbf/0x180
[kernel][warning]device_release_driver_internal+0x369/0x530
[kernel][warning]bus_remove_driver+0x11b/0x2a0
[kernel][warning]pci_unregister_driver+0x2a/0x250
[kernel][warning]idxd_exit_module+0x34/0x40 [idxd]
[kernel][warning]do_syscall_64+0x5d/0x2d0
[kernel][warning]__do_sys_delete_module.constprop.0+0x2ee/0x580
[kernel][err]
[kernel][err]The buggy address belongs to the object at ff1100822ff89000

The issue shows memory being freed during device_release(), then
accessed again in idxd_conf_device_release().

Your original approach is correct - the kernel's device model handles
the memory management properly, and we should not interfere with it.

Feel free to add:

Tested-by: Shuai Xue <xueshuai@linux.alibaba.com>

Sorry for the confusion in my earlier analysis.

Best Regards,
Shuai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-07-29  2:46           ` Shuai Xue
@ 2025-07-29  3:15             ` Yi Sun
  2025-07-29  6:00               ` Shuai Xue
  2025-07-31  0:17             ` Vinicius Costa Gomes
  1 sibling, 1 reply; 15+ messages in thread
From: Yi Sun @ 2025-07-29  3:15 UTC (permalink / raw)
  To: Shuai Xue
  Cc: Fenghua Yu, vinicius.gomes, dmaengine, linux-kernel, dave.jiang,
	gordon.jin

On 29.07.2025 10:46, Shuai Xue wrote:
>
>
>在 2025/7/28 19:43, Yi Sun 写道:
>>On 28.07.2025 16:40, Shuai Xue wrote:
>>>Hi, Yi Sun, Fenghua Yu,
>>>
>>>在 2025/7/27 17:16, Yi Sun 写道:
>>>>On 17.06.2025 14:58, Fenghua Yu wrote:
>>>>>Hi, Yi,
>>>>>
>>>>>On 6/17/25 03:27, Yi Sun wrote:
>>>>>>A recent refactor introduced a misplaced put_device() call, leading to a
>>>>>>reference count underflow during module unload.
>>>>>>
>>>>>>There is no need to add additional put_device() calls for idxd groups,
>>>>>>engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>>>>>also fixes the missing put_device() for idxd groups, engines, and wqs."
>>>>>>It appears no such omission existed. The required cleanup is already
>>>>>>handled by the call chain:
>>>>>>
>>>>>>
>>>>>>Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>>>>>idxd_cleanup_internals() which was not originally part of the driver
>>>>>>unload path and introduced unintended reference count underflow.
>>>>>>
>>>>>>Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>>>>>Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>>>>
>>>>>>diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>>>>>index 40cc9c070081..40f4bf446763 100644
>>>>>>--- a/drivers/dma/idxd/init.c
>>>>>>+++ b/drivers/dma/idxd/init.c
>>>>>>@@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>>>>>    device_unregister(idxd_confdev(idxd));
>>>>>>    idxd_shutdown(pdev);
>>>>>>    idxd_device_remove_debugfs(idxd);
>>>>>>-    idxd_cleanup(idxd);
>>>>>>+    perfmon_pmu_remove(idxd);
>>>>>>+    idxd_cleanup_interrupts(idxd);
>>>>>>+    if (device_pasid_enabled(idxd))
>>>>>>+        idxd_disable_system_pasid(idxd);
>>>>>>
>>>>>This will hit memory leak issue.
>>>>>
>>>>>idxd_remove_internals() does not only put_device() but also free allocated memory for wqs, engines, groups. Without calling idxd_remove_internals(), the allocated memory is leaked.
>>>>>
>>>>>I think a right fix is to remove the put_device() in idxd_cleanup_wqs/engines/groups() because:
>>>>>
>>>>>1. idxd_setup_wqs/engines/groups() does not call get_device(). Their counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>>>>>
>>>>>2. Fix the issue mentioned in this patch while there is no memory leak issue.
>>>>>
>>>>>>    pci_iounmap(pdev, idxd->reg_base);
>>>>>>    put_device(idxd_confdev(idxd));
>>>>>>    pci_disable_device(pdev);
>>>>>
>>>>>Thanks.
>>>>>
>>>>>-Fenghua
>>>>>
>>>>
>>>>Hi Fenghua,
>>>>
>>>>As with the comments on the previous patch, the function
>>>>idxd_conf_device_release already covers part of what is done in
>>>>idxd_remove_internals, including:
>>>>    kfree(idxd->groups);
>>>>    kfree(idxd->wqs);
>>>>    kfree(idxd->engines);
>>>>    kfree(idxd);
>>>>
>>>>We need to redesign idxd_remove_internals to clearly identify what
>>>>truely result in memory leaks and should be handled there.
>>>>Then, we'll need another patch to fix the idxd_remove_internals and call
>>>>it here.
>>>>
>>>>Let's prioritize addressing the two critical issues we've encountered here,
>>>>and then revisit the memory leak discussion afterward.
>>>>
>>>>Thanks
>>>>  --Sun, Yi
>>>
>>>The root cause is simliar to Patch 1, the idxd_conf_device_release()
>>>function already handles part of the cleanup that
>>>idxd_cleanup_internals() attempts to do, e.g.
>>>
>>>   idxd->wqs
>>>   idxd->einges
>>>   idxd->groups
>>>
>>>As a result, it causes user-after-free issue.
>>>
>>>   ------------[ cut here ]------------
>>>   WARNING: CPU: 190 PID: 13854 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
>>>   refcount_t: underflow; use-after-free.
>>>   drm_client_lib(E) i2c_algo_bit(E) drm_shmem_helper(E) drm_kms_helper(E) nvme(E) mlx5_core(E) mlxfw(E) nvme_core(E) pci_hyperv_intf(E) psample(E) drm(E) wmi(E) pinctrl_emmitsburg(E) sd_mod(E) sg(E) ahci(E) libahci(E) libata(E) fuse(E)
>>>   Modules linked in: binfmt_misc(E) bonding(E) tls(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) i10nm_edac(E) skx_edac_common(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) snd_hda_intel(E) kvm(E) snd_intel_dspcfg(E) snd_hda_codec(E) mlx5_ib(E) irqbypass(E) qat_4xxx(E) snd_hda_core(E) dax_hmem(E) ghash_clmulni_intel(E) intel_qat(E) cxl_acpi(E) snd_hwdep(E) rapl(E) snd_pcm(E) cxl_port(E) iTCO_wdt(E) intel_cstate(E) iTCO_vendor_support(E) snd_timer(E) ib_uverbs(E) pmt_telemetry(E) cxl_core(E) rfkill(E) tun(E) dh_generic(E) pmt_class(E) intel_uncore(E) einj(E) pcspkr(E) isst_if_mbox_pci(E) joydev(E) isst_if_mmio(E) idxd(E-) crc8(E) ib_core(E) isst_if_common(E) authenc(E) intel_vsec(E) idxd_bus(E) snd(E) i2c_i801(E) mei_me(E) soundcore(E) i2c_smbus(E) i2c_ismt(E) mei(E) nf_tables(E) nfnetlink(E) ipmi_ssif(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandle
>>>   CPU: 190 UID: 0 PID: 13854 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #116 PREEMPT(none)
>>>   Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>>>   Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
>>>   RIP: 0010:refcount_warn_saturate+0xbe/0x110
>>>   RSP: 0018:ff7078d2df957db8 EFLAGS: 00010282
>>>   RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
>>>   RBP: ff2b10355b055000 R08: 0000000000000000 R09: ff7078d2df957c68
>>>   RDX: ff2b10b33fdaa3c0 RSI: 0000000000000001 RDI: ff2b10b33fd9c440
>>>   R10: ff7078d2df957c60 R11: ffffffffbe761d68 R12: ff2b1035a00db400
>>>   R13: ff2b10355670b148 R14: ff2b103555097c80 R15: ffffffffc0a88938
>>>   FS:  00007fb5f8f3b740(0000) GS:ff2b10b380bb9000(0000) knlGS:0000000000000000
>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>   CR2: 00005560a898c2d8 CR3: 00000080f9def005 CR4: 0000000000f73ef0
>>>   PKRU: 55555554
>>>   Code: 01 01 e8 35 9b 9a ff 0f 0b c3 cc cc cc cc 80 3d 05 4c f5 01 00 75 85 48 c7 c7 30 21 75 bd c6 05 f5 4b f5 01 01 e8 12 9b 9a ff <0f> 0b c3 cc cc cc cc 80 3d e0 4b f5 01 00 0f 85 5e ff ff ff 48 c7
>>>   Call Trace:
>>>   idxd_cleanup+0x6b/0x100 [idxd]
>>>   <TASK>
>>>   idxd_remove+0x46/0x70 [idxd]
>>>   pci_device_remove+0x3f/0xb0
>>>   driver_detach+0x48/0x90
>>>   device_release_driver_internal+0x197/0x200
>>>   bus_remove_driver+0x6d/0xf0
>>>   idxd_exit_module+0x34/0x6c0 [idxd]
>>>   __do_sys_delete_module.constprop.0+0x174/0x310
>>>   do_syscall_64+0x5f/0x2d0
>>>   pci_unregister_driver+0x2e/0xb0
>>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>   RIP: 0033:0x7fb5f8a620cb
>>>   Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
>>>   RSP: 002b:00007ffeed6101c8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>>>   RAX: ffffffffffffffda RBX: 00005560a8981790 RCX: 00007fb5f8a620cb
>>>   RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005560a89817f8
>>>   R13: 00007ffeed612352 R14: 00005560a89812a0 R15: 00005560a8981790
>>>   R10: 00007fb5f8baaac0 R11: 0000000000000206 R12: 00007ffeed6103f0
>>>   </TASK>
>>>   ---[ end trace 0000000000000000 ]---
>>>
>>>With this patch applied, the user-after-free issue is fixed.
>>>
>>>But, there is still memory leaks in
>>>
>>>   idxd->wqs[i]
>>>   idxd->einges[i]
>>>   idxd->groups[i]
>>>
>>>I agree with Vinicius that we should cleanup in idxd_conf_device_release().
>>>
>>>Thanks.
>>>Shuai
>>
>>Appreciate Shuai's clarification. Yes, it would be better if fixing the memory
>>leak issue in idxd_conf_device_release() in a separate patch.
>>
>>@Shuai, please feel free to proceed if you'd like to cook a patch for it.
>>
>>Thanks
>>    --Sun, Yi
>
>Hi, Sun, Yi,
>
>I need to correct my previous analysis. After further investigation, I
>believe there is no memory leak in the current code. The device
>reference counting and memory management are properly handled through
>the Linux device model. Each component is freed at the conf_dev
>destruction time:
>
>
>    idxd->wqs[i] is freed by idxd_conf_wq_release()
>    idxd->einges[i] is freed by idxd_conf_engine_release()
>    idxd->groups[i] is freed by idxd_conf_group_release()
>
>
>Adding additional cleanup in idxd_conf_device_release() would actually
>cause double-free issues. I can reproduce this with KASAN:
>
>[kernel][err]BUG: KASAN: slab-use-after-free in idxd_clean_groups+0x137/0x250 [idxd]
>[kernel][err]==================================================================
>[kernel][err]Read of size 4 at addr ff1100822ff89038 by task rmmod/13956
>[kernel][err]
>[kernel][err]CPU: 185 UID: 0 PID: 13956 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #117 PREEMPT(none)
>[kernel][err]Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>[kernel][err]Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
>[kernel][err]Call Trace:
>[kernel][err]<TASK>
>[kernel][err]dump_stack_lvl+0x55/0x70
>[kernel][err]print_address_description.constprop.0+0x27/0x2e0
>[kernel][err]? idxd_clean_groups+0x137/0x250 [idxd]
>[kernel][err]print_report+0x3e/0x70
>[kernel][err]kasan_report+0xb8/0xf0
>[kernel][err]? idxd_clean_groups+0x137/0x250 [idxd]
>[kernel][err]kasan_check_range+0x39/0x1b0
>[kernel][err]idxd_clean_groups+0x137/0x250 [idxd]
>[kernel][err]idxd_cleanup_internals+0x1f/0x1b0 [idxd]
>[kernel][err]idxd_conf_device_release+0xe2/0x2b0 [idxd]
>[kernel][err]device_release+0x9c/0x210
>[kernel][err]kobject_cleanup+0x101/0x360
>[kernel][err]pci_device_remove+0xab/0x1e0
>[kernel][err]idxd_remove+0x93/0xb0 [idxd]
>[kernel][err]device_release_driver_internal+0x369/0x530
>[kernel][err]driver_detach+0xbf/0x180
>[kernel][err]bus_remove_driver+0x11b/0x2a0
>[kernel][err]pci_unregister_driver+0x2a/0x250
>[kernel][err]? kobject_put+0x55/0xe0
>[kernel][err]idxd_exit_module+0x34/0x40 [idxd]
>[kernel][err]__do_sys_delete_module.constprop.0+0x2ee/0x580
>[kernel][err]? fput_close_sync+0xdd/0x190
>[kernel][err]? __pfx___do_sys_delete_module.constprop.0+0x10/0x10
>[kernel][err]do_syscall_64+0x5d/0x2d0
>[kernel][err]? __pfx_fput_close_sync+0x10/0x10
>[kernel][err]entry_SYSCALL_64_after_hwframe+0x76/0x7e
>[kernel][err]RIP: 0033:0x7f2ac20620cb
>[kernel][err]Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
>[kernel][err]RSP: 002b:00007ffdfec0a598 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>[kernel][err]RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000560bd6bd77f8
>[kernel][err]RAX: ffffffffffffffda RBX: 0000560bd6bd7790 RCX: 00007f2ac20620cb
>[kernel][err]R10: 00007f2ac21aaac0 R11: 0000000000000206 R12: 00007ffdfec0a7c0
>[kernel][err]RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>[kernel][err]</TASK>
>[kernel][err]R13: 00007ffdfec0b352 R14: 0000560bd6bd72a0 R15: 0000560bd6bd7790
>[kernel][err]Allocated by task 1445:
>[kernel][err]
>[kernel][warning]kasan_save_stack+0x24/0x50
>[kernel][warning]kasan_save_track+0x14/0x30
>[kernel][warning]__kasan_kmalloc+0xaa/0xb0
>[kernel][warning]idxd_setup_groups+0x2b5/0x7a0 [idxd]
>[kernel][warning]idxd_setup_internals+0xd1/0x6e0 [idxd]
>[kernel][warning]idxd_probe+0x93/0x310 [idxd]
>[kernel][warning]idxd_pci_probe_alloc+0x3f3/0x6e0 [idxd]
>[kernel][warning]local_pci_probe+0xe5/0x1a0
>[kernel][warning]work_for_cpu_fn+0x52/0xa0
>[kernel][warning]process_one_work+0x652/0x1080
>[kernel][warning]worker_thread+0x652/0xc70
>[kernel][warning]ret_from_fork+0x253/0x2e0
>[kernel][warning]kthread+0x34a/0x450
>[kernel][warning]ret_from_fork_asm+0x1a/0x30
>[kernel][err]
>[kernel][err]Freed by task 13956:
>[kernel][warning]kasan_save_stack+0x24/0x50
>[kernel][warning]kasan_save_track+0x14/0x30
>[kernel][warning]kasan_save_free_info+0x3a/0x60
>[kernel][warning]__kasan_slab_free+0x54/0x70
>[kernel][warning]kfree+0x12e/0x430
>[kernel][warning]device_release+0x9c/0x210
>[kernel][warning]kobject_cleanup+0x101/0x360
>[kernel][warning]idxd_unregister_devices+0x22d/0x320 [idxd]
>[kernel][warning]pci_device_remove+0xab/0x1e0
>[kernel][warning]idxd_remove+0x3c/0xb0 [idxd]
>[kernel][warning]driver_detach+0xbf/0x180
>[kernel][warning]device_release_driver_internal+0x369/0x530
>[kernel][warning]bus_remove_driver+0x11b/0x2a0
>[kernel][warning]pci_unregister_driver+0x2a/0x250
>[kernel][warning]idxd_exit_module+0x34/0x40 [idxd]
>[kernel][warning]do_syscall_64+0x5d/0x2d0
>[kernel][warning]__do_sys_delete_module.constprop.0+0x2ee/0x580
>[kernel][err]
>[kernel][err]The buggy address belongs to the object at ff1100822ff89000
>
>The issue shows memory being freed during device_release(), then
>accessed again in idxd_conf_device_release().
>
>Your original approach is correct - the kernel's device model handles
>the memory management properly, and we should not interfere with it.
>
>Feel free to add:
>
>Tested-by: Shuai Xue <xueshuai@linux.alibaba.com>
>
>Sorry for the confusion in my earlier analysis.
>
>Best Regards,
>Shuai

I see. Thanks Shuai for the comments and testing efforts.
I'd like to resend this series with the Tested-by tag to make it easier
for further review.


Thanks
    --Sun, Yi


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-07-29  3:15             ` Yi Sun
@ 2025-07-29  6:00               ` Shuai Xue
  0 siblings, 0 replies; 15+ messages in thread
From: Shuai Xue @ 2025-07-29  6:00 UTC (permalink / raw)
  To: Yi Sun
  Cc: Fenghua Yu, vinicius.gomes, dmaengine, linux-kernel, dave.jiang,
	gordon.jin



在 2025/7/29 11:15, Yi Sun 写道:
> On 29.07.2025 10:46, Shuai Xue wrote:
>>
>>
>> 在 2025/7/28 19:43, Yi Sun 写道:
>>> On 28.07.2025 16:40, Shuai Xue wrote:
>>>> Hi, Yi Sun, Fenghua Yu,
>>>>
>>>> 在 2025/7/27 17:16, Yi Sun 写道:
>>>>> On 17.06.2025 14:58, Fenghua Yu wrote:
>>>>>> Hi, Yi,
>>>>>>
>>>>>> On 6/17/25 03:27, Yi Sun wrote:
>>>>>>> A recent refactor introduced a misplaced put_device() call, leading to a
>>>>>>> reference count underflow during module unload.
>>>>>>>
>>>>>>> There is no need to add additional put_device() calls for idxd groups,
>>>>>>> engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>>>>>> also fixes the missing put_device() for idxd groups, engines, and wqs."
>>>>>>> It appears no such omission existed. The required cleanup is already
>>>>>>> handled by the call chain:
>>>>>>>
>>>>>>>
>>>>>>> Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>>>>>> idxd_cleanup_internals() which was not originally part of the driver
>>>>>>> unload path and introduced unintended reference count underflow.
>>>>>>>
>>>>>>> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>>>>>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>>>>>
>>>>>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>>>>>> index 40cc9c070081..40f4bf446763 100644
>>>>>>> --- a/drivers/dma/idxd/init.c
>>>>>>> +++ b/drivers/dma/idxd/init.c
>>>>>>> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>>>>>>    device_unregister(idxd_confdev(idxd));
>>>>>>>    idxd_shutdown(pdev);
>>>>>>>    idxd_device_remove_debugfs(idxd);
>>>>>>> -    idxd_cleanup(idxd);
>>>>>>> +    perfmon_pmu_remove(idxd);
>>>>>>> +    idxd_cleanup_interrupts(idxd);
>>>>>>> +    if (device_pasid_enabled(idxd))
>>>>>>> +        idxd_disable_system_pasid(idxd);
>>>>>>>
>>>>>> This will hit memory leak issue.
>>>>>>
>>>>>> idxd_remove_internals() does not only put_device() but also free allocated memory for wqs, engines, groups. Without calling idxd_remove_internals(), the allocated memory is leaked.
>>>>>>
>>>>>> I think a right fix is to remove the put_device() in idxd_cleanup_wqs/engines/groups() because:
>>>>>>
>>>>>> 1. idxd_setup_wqs/engines/groups() does not call get_device(). Their counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>>>>>>
>>>>>> 2. Fix the issue mentioned in this patch while there is no memory leak issue.
>>>>>>
>>>>>>>    pci_iounmap(pdev, idxd->reg_base);
>>>>>>>    put_device(idxd_confdev(idxd));
>>>>>>>    pci_disable_device(pdev);
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> -Fenghua
>>>>>>
>>>>>
>>>>> Hi Fenghua,
>>>>>
>>>>> As with the comments on the previous patch, the function
>>>>> idxd_conf_device_release already covers part of what is done in
>>>>> idxd_remove_internals, including:
>>>>>    kfree(idxd->groups);
>>>>>    kfree(idxd->wqs);
>>>>>    kfree(idxd->engines);
>>>>>    kfree(idxd);
>>>>>
>>>>> We need to redesign idxd_remove_internals to clearly identify what
>>>>> truely result in memory leaks and should be handled there.
>>>>> Then, we'll need another patch to fix the idxd_remove_internals and call
>>>>> it here.
>>>>>
>>>>> Let's prioritize addressing the two critical issues we've encountered here,
>>>>> and then revisit the memory leak discussion afterward.
>>>>>
>>>>> Thanks
>>>>>  --Sun, Yi
>>>>
>>>> The root cause is simliar to Patch 1, the idxd_conf_device_release()
>>>> function already handles part of the cleanup that
>>>> idxd_cleanup_internals() attempts to do, e.g.
>>>>
>>>>   idxd->wqs
>>>>   idxd->einges
>>>>   idxd->groups
>>>>
>>>> As a result, it causes user-after-free issue.
>>>>
>>>>   ------------[ cut here ]------------
>>>>   WARNING: CPU: 190 PID: 13854 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
>>>>   refcount_t: underflow; use-after-free.
>>>>   drm_client_lib(E) i2c_algo_bit(E) drm_shmem_helper(E) drm_kms_helper(E) nvme(E) mlx5_core(E) mlxfw(E) nvme_core(E) pci_hyperv_intf(E) psample(E) drm(E) wmi(E) pinctrl_emmitsburg(E) sd_mod(E) sg(E) ahci(E) libahci(E) libata(E) fuse(E)
>>>>   Modules linked in: binfmt_misc(E) bonding(E) tls(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) i10nm_edac(E) skx_edac_common(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) snd_hda_intel(E) kvm(E) snd_intel_dspcfg(E) snd_hda_codec(E) mlx5_ib(E) irqbypass(E) qat_4xxx(E) snd_hda_core(E) dax_hmem(E) ghash_clmulni_intel(E) intel_qat(E) cxl_acpi(E) snd_hwdep(E) rapl(E) snd_pcm(E) cxl_port(E) iTCO_wdt(E) intel_cstate(E) iTCO_vendor_support(E) snd_timer(E) ib_uverbs(E) pmt_telemetry(E) cxl_core(E) rfkill(E) tun(E) dh_generic(E) pmt_class(E) intel_uncore(E) einj(E) pcspkr(E) isst_if_mbox_pci(E) joydev(E) isst_if_mmio(E) idxd(E-) crc8(E) ib_core(E) isst_if_common(E) authenc(E) intel_vsec(E) idxd_bus(E) snd(E) i2c_i801(E) mei_me(E) soundcore(E) i2c_smbus(E) i2c_ismt(E) mei(E) nf_tables(E) nfnetlink(E) ipmi_ssif(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandle
>>>>   CPU: 190 UID: 0 PID: 13854 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #116 PREEMPT(none)
>>>>   Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>>>>   Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
>>>>   RIP: 0010:refcount_warn_saturate+0xbe/0x110
>>>>   RSP: 0018:ff7078d2df957db8 EFLAGS: 00010282
>>>>   RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
>>>>   RBP: ff2b10355b055000 R08: 0000000000000000 R09: ff7078d2df957c68
>>>>   RDX: ff2b10b33fdaa3c0 RSI: 0000000000000001 RDI: ff2b10b33fd9c440
>>>>   R10: ff7078d2df957c60 R11: ffffffffbe761d68 R12: ff2b1035a00db400
>>>>   R13: ff2b10355670b148 R14: ff2b103555097c80 R15: ffffffffc0a88938
>>>>   FS:  00007fb5f8f3b740(0000) GS:ff2b10b380bb9000(0000) knlGS:0000000000000000
>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>   CR2: 00005560a898c2d8 CR3: 00000080f9def005 CR4: 0000000000f73ef0
>>>>   PKRU: 55555554
>>>>   Code: 01 01 e8 35 9b 9a ff 0f 0b c3 cc cc cc cc 80 3d 05 4c f5 01 00 75 85 48 c7 c7 30 21 75 bd c6 05 f5 4b f5 01 01 e8 12 9b 9a ff <0f> 0b c3 cc cc cc cc 80 3d e0 4b f5 01 00 0f 85 5e ff ff ff 48 c7
>>>>   Call Trace:
>>>>   idxd_cleanup+0x6b/0x100 [idxd]
>>>>   <TASK>
>>>>   idxd_remove+0x46/0x70 [idxd]
>>>>   pci_device_remove+0x3f/0xb0
>>>>   driver_detach+0x48/0x90
>>>>   device_release_driver_internal+0x197/0x200
>>>>   bus_remove_driver+0x6d/0xf0
>>>>   idxd_exit_module+0x34/0x6c0 [idxd]
>>>>   __do_sys_delete_module.constprop.0+0x174/0x310
>>>>   do_syscall_64+0x5f/0x2d0
>>>>   pci_unregister_driver+0x2e/0xb0
>>>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>>   RIP: 0033:0x7fb5f8a620cb
>>>>   Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
>>>>   RSP: 002b:00007ffeed6101c8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>>>>   RAX: ffffffffffffffda RBX: 00005560a8981790 RCX: 00007fb5f8a620cb
>>>>   RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005560a89817f8
>>>>   R13: 00007ffeed612352 R14: 00005560a89812a0 R15: 00005560a8981790
>>>>   R10: 00007fb5f8baaac0 R11: 0000000000000206 R12: 00007ffeed6103f0
>>>>   </TASK>
>>>>   ---[ end trace 0000000000000000 ]---
>>>>
>>>> With this patch applied, the user-after-free issue is fixed.
>>>>
>>>> But, there is still memory leaks in
>>>>
>>>>   idxd->wqs[i]
>>>>   idxd->einges[i]
>>>>   idxd->groups[i]
>>>>
>>>> I agree with Vinicius that we should cleanup in idxd_conf_device_release().
>>>>
>>>> Thanks.
>>>> Shuai
>>>
>>> Appreciate Shuai's clarification. Yes, it would be better if fixing the memory
>>> leak issue in idxd_conf_device_release() in a separate patch.
>>>
>>> @Shuai, please feel free to proceed if you'd like to cook a patch for it.
>>>
>>> Thanks
>>>    --Sun, Yi
>>
>> Hi, Sun, Yi,
>>
>> I need to correct my previous analysis. After further investigation, I
>> believe there is no memory leak in the current code. The device
>> reference counting and memory management are properly handled through
>> the Linux device model. Each component is freed at the conf_dev
>> destruction time:
>>
>>
>>    idxd->wqs[i] is freed by idxd_conf_wq_release()
>>    idxd->einges[i] is freed by idxd_conf_engine_release()
>>    idxd->groups[i] is freed by idxd_conf_group_release()
>>
>>
>> Adding additional cleanup in idxd_conf_device_release() would actually
>> cause double-free issues. I can reproduce this with KASAN:
>>
>> [kernel][err]BUG: KASAN: slab-use-after-free in idxd_clean_groups+0x137/0x250 [idxd]
>> [kernel][err]==================================================================
>> [kernel][err]Read of size 4 at addr ff1100822ff89038 by task rmmod/13956
>> [kernel][err]
>> [kernel][err]CPU: 185 UID: 0 PID: 13956 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #117 PREEMPT(none)
>> [kernel][err]Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>> [kernel][err]Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
>> [kernel][err]Call Trace:
>> [kernel][err]<TASK>
>> [kernel][err]dump_stack_lvl+0x55/0x70
>> [kernel][err]print_address_description.constprop.0+0x27/0x2e0
>> [kernel][err]? idxd_clean_groups+0x137/0x250 [idxd]
>> [kernel][err]print_report+0x3e/0x70
>> [kernel][err]kasan_report+0xb8/0xf0
>> [kernel][err]? idxd_clean_groups+0x137/0x250 [idxd]
>> [kernel][err]kasan_check_range+0x39/0x1b0
>> [kernel][err]idxd_clean_groups+0x137/0x250 [idxd]
>> [kernel][err]idxd_cleanup_internals+0x1f/0x1b0 [idxd]
>> [kernel][err]idxd_conf_device_release+0xe2/0x2b0 [idxd]
>> [kernel][err]device_release+0x9c/0x210
>> [kernel][err]kobject_cleanup+0x101/0x360
>> [kernel][err]pci_device_remove+0xab/0x1e0
>> [kernel][err]idxd_remove+0x93/0xb0 [idxd]
>> [kernel][err]device_release_driver_internal+0x369/0x530
>> [kernel][err]driver_detach+0xbf/0x180
>> [kernel][err]bus_remove_driver+0x11b/0x2a0
>> [kernel][err]pci_unregister_driver+0x2a/0x250
>> [kernel][err]? kobject_put+0x55/0xe0
>> [kernel][err]idxd_exit_module+0x34/0x40 [idxd]
>> [kernel][err]__do_sys_delete_module.constprop.0+0x2ee/0x580
>> [kernel][err]? fput_close_sync+0xdd/0x190
>> [kernel][err]? __pfx___do_sys_delete_module.constprop.0+0x10/0x10
>> [kernel][err]do_syscall_64+0x5d/0x2d0
>> [kernel][err]? __pfx_fput_close_sync+0x10/0x10
>> [kernel][err]entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> [kernel][err]RIP: 0033:0x7f2ac20620cb
>> [kernel][err]Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
>> [kernel][err]RSP: 002b:00007ffdfec0a598 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>> [kernel][err]RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000560bd6bd77f8
>> [kernel][err]RAX: ffffffffffffffda RBX: 0000560bd6bd7790 RCX: 00007f2ac20620cb
>> [kernel][err]R10: 00007f2ac21aaac0 R11: 0000000000000206 R12: 00007ffdfec0a7c0
>> [kernel][err]RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>> [kernel][err]</TASK>
>> [kernel][err]R13: 00007ffdfec0b352 R14: 0000560bd6bd72a0 R15: 0000560bd6bd7790
>> [kernel][err]Allocated by task 1445:
>> [kernel][err]
>> [kernel][warning]kasan_save_stack+0x24/0x50
>> [kernel][warning]kasan_save_track+0x14/0x30
>> [kernel][warning]__kasan_kmalloc+0xaa/0xb0
>> [kernel][warning]idxd_setup_groups+0x2b5/0x7a0 [idxd]
>> [kernel][warning]idxd_setup_internals+0xd1/0x6e0 [idxd]
>> [kernel][warning]idxd_probe+0x93/0x310 [idxd]
>> [kernel][warning]idxd_pci_probe_alloc+0x3f3/0x6e0 [idxd]
>> [kernel][warning]local_pci_probe+0xe5/0x1a0
>> [kernel][warning]work_for_cpu_fn+0x52/0xa0
>> [kernel][warning]process_one_work+0x652/0x1080
>> [kernel][warning]worker_thread+0x652/0xc70
>> [kernel][warning]ret_from_fork+0x253/0x2e0
>> [kernel][warning]kthread+0x34a/0x450
>> [kernel][warning]ret_from_fork_asm+0x1a/0x30
>> [kernel][err]
>> [kernel][err]Freed by task 13956:
>> [kernel][warning]kasan_save_stack+0x24/0x50
>> [kernel][warning]kasan_save_track+0x14/0x30
>> [kernel][warning]kasan_save_free_info+0x3a/0x60
>> [kernel][warning]__kasan_slab_free+0x54/0x70
>> [kernel][warning]kfree+0x12e/0x430
>> [kernel][warning]device_release+0x9c/0x210
>> [kernel][warning]kobject_cleanup+0x101/0x360
>> [kernel][warning]idxd_unregister_devices+0x22d/0x320 [idxd]
>> [kernel][warning]pci_device_remove+0xab/0x1e0
>> [kernel][warning]idxd_remove+0x3c/0xb0 [idxd]
>> [kernel][warning]driver_detach+0xbf/0x180
>> [kernel][warning]device_release_driver_internal+0x369/0x530
>> [kernel][warning]bus_remove_driver+0x11b/0x2a0
>> [kernel][warning]pci_unregister_driver+0x2a/0x250
>> [kernel][warning]idxd_exit_module+0x34/0x40 [idxd]
>> [kernel][warning]do_syscall_64+0x5d/0x2d0
>> [kernel][warning]__do_sys_delete_module.constprop.0+0x2ee/0x580
>> [kernel][err]
>> [kernel][err]The buggy address belongs to the object at ff1100822ff89000
>>
>> The issue shows memory being freed during device_release(), then
>> accessed again in idxd_conf_device_release().
>>
>> Your original approach is correct - the kernel's device model handles
>> the memory management properly, and we should not interfere with it.
>>
>> Feel free to add:
>>
>> Tested-by: Shuai Xue <xueshuai@linux.alibaba.com>
>>
>> Sorry for the confusion in my earlier analysis.
>>
>> Best Regards,
>> Shuai
> 
> I see. Thanks Shuai for the comments and testing efforts.
> I'd like to resend this series with the Tested-by tag to make it easier
> for further review.
> 
> 
> Thanks
>     --Sun, Yi

Hi, Sun, Yi,

Good idea to resend with the Tested-by tag.

I think you should also CC stable, since this issue exists in stable 6.6
and 6.12kernels as well. The refcount underflow and use-after-free
issues are serious enough to warrant backporting the fix.

Thanks,
Shuai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
  2025-07-29  2:46           ` Shuai Xue
  2025-07-29  3:15             ` Yi Sun
@ 2025-07-31  0:17             ` Vinicius Costa Gomes
  1 sibling, 0 replies; 15+ messages in thread
From: Vinicius Costa Gomes @ 2025-07-31  0:17 UTC (permalink / raw)
  To: Shuai Xue, Yi Sun
  Cc: Fenghua Yu, dmaengine, linux-kernel, dave.jiang, gordon.jin

Shuai Xue <xueshuai@linux.alibaba.com> writes:

> 在 2025/7/28 19:43, Yi Sun 写道:
>> On 28.07.2025 16:40, Shuai Xue wrote:
>>> Hi, Yi Sun, Fenghua Yu,
>>>
>>> 在 2025/7/27 17:16, Yi Sun 写道:
>>>> On 17.06.2025 14:58, Fenghua Yu wrote:
>>>>> Hi, Yi,
>>>>>
>>>>> On 6/17/25 03:27, Yi Sun wrote:
>>>>>> A recent refactor introduced a misplaced put_device() call, leading to a
>>>>>> reference count underflow during module unload.
>>>>>>
>>>>>> There is no need to add additional put_device() calls for idxd groups,
>>>>>> engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>>>>> also fixes the missing put_device() for idxd groups, engines, and wqs."
>>>>>> It appears no such omission existed. The required cleanup is already
>>>>>> handled by the call chain:
>>>>>>
>>>>>>
>>>>>> Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>>>>> idxd_cleanup_internals() which was not originally part of the driver
>>>>>> unload path and introduced unintended reference count underflow.
>>>>>>
>>>>>> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>>>>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>>>>
>>>>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>>>>> index 40cc9c070081..40f4bf446763 100644
>>>>>> --- a/drivers/dma/idxd/init.c
>>>>>> +++ b/drivers/dma/idxd/init.c
>>>>>> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>>>>>     device_unregister(idxd_confdev(idxd));
>>>>>>     idxd_shutdown(pdev);
>>>>>>     idxd_device_remove_debugfs(idxd);
>>>>>> -    idxd_cleanup(idxd);
>>>>>> +    perfmon_pmu_remove(idxd);
>>>>>> +    idxd_cleanup_interrupts(idxd);
>>>>>> +    if (device_pasid_enabled(idxd))
>>>>>> +        idxd_disable_system_pasid(idxd);
>>>>>>
>>>>> This will hit memory leak issue.
>>>>>
>>>>> idxd_remove_internals() does not only put_device() but also free allocated memory for wqs, engines, groups. Without calling idxd_remove_internals(), the allocated memory is leaked.
>>>>>
>>>>> I think a right fix is to remove the put_device() in idxd_cleanup_wqs/engines/groups() because:
>>>>>
>>>>> 1. idxd_setup_wqs/engines/groups() does not call get_device(). Their counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>>>>>
>>>>> 2. Fix the issue mentioned in this patch while there is no memory leak issue.
>>>>>
>>>>>>     pci_iounmap(pdev, idxd->reg_base);
>>>>>>     put_device(idxd_confdev(idxd));
>>>>>>     pci_disable_device(pdev);
>>>>>
>>>>> Thanks.
>>>>>
>>>>> -Fenghua
>>>>>
>>>>
>>>> Hi Fenghua,
>>>>
>>>> As with the comments on the previous patch, the function
>>>> idxd_conf_device_release already covers part of what is done in
>>>> idxd_remove_internals, including:
>>>>     kfree(idxd->groups);
>>>>     kfree(idxd->wqs);
>>>>     kfree(idxd->engines);
>>>>     kfree(idxd);
>>>>
>>>> We need to redesign idxd_remove_internals to clearly identify what
>>>> truely result in memory leaks and should be handled there.
>>>> Then, we'll need another patch to fix the idxd_remove_internals and call
>>>> it here.
>>>>
>>>> Let's prioritize addressing the two critical issues we've encountered here,
>>>> and then revisit the memory leak discussion afterward.
>>>>
>>>> Thanks
>>>>   --Sun, Yi
>>>
>>> The root cause is simliar to Patch 1, the idxd_conf_device_release()
>>> function already handles part of the cleanup that
>>> idxd_cleanup_internals() attempts to do, e.g.
>>>
>>>    idxd->wqs
>>>    idxd->einges
>>>    idxd->groups
>>>
>>> As a result, it causes user-after-free issue.
>>>
>>>    ------------[ cut here ]------------
>>>    WARNING: CPU: 190 PID: 13854 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
>>>    refcount_t: underflow; use-after-free.
>>>    drm_client_lib(E) i2c_algo_bit(E) drm_shmem_helper(E) drm_kms_helper(E) nvme(E) mlx5_core(E) mlxfw(E) nvme_core(E) pci_hyperv_intf(E) psample(E) drm(E) wmi(E) pinctrl_emmitsburg(E) sd_mod(E) sg(E) ahci(E) libahci(E) libata(E) fuse(E)
>>>    Modules linked in: binfmt_misc(E) bonding(E) tls(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) i10nm_edac(E) skx_edac_common(E) nfit(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) snd_hda_intel(E) kvm(E) snd_intel_dspcfg(E) snd_hda_codec(E) mlx5_ib(E) irqbypass(E) qat_4xxx(E) snd_hda_core(E) dax_hmem(E) ghash_clmulni_intel(E) intel_qat(E) cxl_acpi(E) snd_hwdep(E) rapl(E) snd_pcm(E) cxl_port(E) iTCO_wdt(E) intel_cstate(E) iTCO_vendor_support(E) snd_timer(E) ib_uverbs(E) pmt_telemetry(E) cxl_core(E) rfkill(E) tun(E) dh_generic(E) pmt_class(E) intel_uncore(E) einj(E) pcspkr(E) isst_if_mbox_pci(E) joydev(E) isst_if_mmio(E) idxd(E-) crc8(E) ib_core(E) isst_if_common(E) authenc(E) intel_vsec(E) idxd_bus(E) snd(E) i2c_i801(E) mei_me(E) soundcore(E) i2c_smbus(E) i2c_ismt(E) mei(E) nf_tables(E) nfnetlink(E) ipmi_ssif(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandle
>>>    CPU: 190 UID: 0 PID: 13854 Comm: rmmod Kdump: loaded Tainted: G S          E       6.16.0-rc6+ #116 PREEMPT(none)
>>>    Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
>>>    Hardware name: Inspur AliServer-Xuanwu2.0-02-1UCG42PF/AS2111TG5, BIOS 3.0.ES.AL.P.090.01 02/22/2024
>>>    RIP: 0010:refcount_warn_saturate+0xbe/0x110
>>>    RSP: 0018:ff7078d2df957db8 EFLAGS: 00010282
>>>    RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
>>>    RBP: ff2b10355b055000 R08: 0000000000000000 R09: ff7078d2df957c68
>>>    RDX: ff2b10b33fdaa3c0 RSI: 0000000000000001 RDI: ff2b10b33fd9c440
>>>    R10: ff7078d2df957c60 R11: ffffffffbe761d68 R12: ff2b1035a00db400
>>>    R13: ff2b10355670b148 R14: ff2b103555097c80 R15: ffffffffc0a88938
>>>    FS:  00007fb5f8f3b740(0000) GS:ff2b10b380bb9000(0000) knlGS:0000000000000000
>>>    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>    CR2: 00005560a898c2d8 CR3: 00000080f9def005 CR4: 0000000000f73ef0
>>>    PKRU: 55555554
>>>    Code: 01 01 e8 35 9b 9a ff 0f 0b c3 cc cc cc cc 80 3d 05 4c f5 01 00 75 85 48 c7 c7 30 21 75 bd c6 05 f5 4b f5 01 01 e8 12 9b 9a ff <0f> 0b c3 cc cc cc cc 80 3d e0 4b f5 01 00 0f 85 5e ff ff ff 48 c7
>>>    Call Trace:
>>>    idxd_cleanup+0x6b/0x100 [idxd]
>>>    <TASK>
>>>    idxd_remove+0x46/0x70 [idxd]
>>>    pci_device_remove+0x3f/0xb0
>>>    driver_detach+0x48/0x90
>>>    device_release_driver_internal+0x197/0x200
>>>    bus_remove_driver+0x6d/0xf0
>>>    idxd_exit_module+0x34/0x6c0 [idxd]
>>>    __do_sys_delete_module.constprop.0+0x174/0x310
>>>    do_syscall_64+0x5f/0x2d0
>>>    pci_unregister_driver+0x2e/0xb0
>>>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>    RIP: 0033:0x7fb5f8a620cb
>>>    Code: 73 01 c3 48 8b 0d a5 6d 19 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 6d 19 00 f7 d8 64 89 01 48
>>>    RSP: 002b:00007ffeed6101c8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>>>    RAX: ffffffffffffffda RBX: 00005560a8981790 RCX: 00007fb5f8a620cb
>>>    RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005560a89817f8
>>>    R13: 00007ffeed612352 R14: 00005560a89812a0 R15: 00005560a8981790
>>>    R10: 00007fb5f8baaac0 R11: 0000000000000206 R12: 00007ffeed6103f0
>>>    </TASK>
>>>    ---[ end trace 0000000000000000 ]---
>>>
>>> With this patch applied, the user-after-free issue is fixed.
>>>
>>> But, there is still memory leaks in
>>>
>>>    idxd->wqs[i]
>>>    idxd->einges[i]
>>>    idxd->groups[i]
>>>
>>> I agree with Vinicius that we should cleanup in idxd_conf_device_release().
>>>
>>> Thanks.
>>> Shuai
>> 
>> Appreciate Shuai's clarification. Yes, it would be better if fixing the memory
>> leak issue in idxd_conf_device_release() in a separate patch.
>> 
>> @Shuai, please feel free to proceed if you'd like to cook a patch for it.
>> 
>> Thanks
>>     --Sun, Yi
>
> Hi, Sun, Yi,
>
> I need to correct my previous analysis. After further investigation, I
> believe there is no memory leak in the current code. The device
> reference counting and memory management are properly handled through
> the Linux device model. Each component is freed at the conf_dev
> destruction time:
>
>
>      idxd->wqs[i] is freed by idxd_conf_wq_release()
>      idxd->einges[i] is freed by idxd_conf_engine_release()
>      idxd->groups[i] is freed by idxd_conf_group_release()
>
>
> Adding additional cleanup in idxd_conf_device_release() would actually
> cause double-free issues. I can reproduce this with KASAN:
>

Just tested with Yi Sun latest series applied (the series look good by
the way), and I am seeing this:

[  966.361049] kmemleak: 2075 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

And from kmemleak:

unreferenced object 0xff110001e3e4b340 (size 192):
  comm "(udev-worker)", pid 2753, jiffies 4295015473
  hex dump (first 32 bytes):
    40 e5 f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00  @...............
    40 2a e5 08 00 00 a0 ff 40 fa ff ff 00 00 00 00  @*......@.......
  backtrace (crc f81b8a71):
    __kmalloc_cache_node_noprof+0x39b/0x450
    idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
    idxd_drv_enable_wq+0x391/0x1100 [idxd]
    0xffffffffc22aa490
    really_probe+0x1e0/0x930
    __driver_probe_device+0x18c/0x3d0
    driver_probe_device+0x4a/0xd0
    __driver_attach+0x1e1/0x4a0
    bus_for_each_dev+0xef/0x170
    bus_add_driver+0x29d/0x5c0
    driver_register+0x130/0x450
    uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
    do_one_initcall+0xb9/0x450
    do_init_module+0x281/0x8c0
    load_module+0x1693/0x2090
    init_module_from_file+0xe8/0x150
unreferenced object 0xff11000150941e40 (size 192):
  comm "(udev-worker)", pid 2753, jiffies 4295015473
  hex dump (first 32 bytes):
    40 de f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00  @...............
    80 2a e5 08 00 00 a0 ff 80 fa ff ff 00 00 00 00  .*..............
  backtrace (crc e1abedf1):
    __kmalloc_cache_node_noprof+0x39b/0x450
    idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
    idxd_drv_enable_wq+0x391/0x1100 [idxd]
    0xffffffffc22aa490
    really_probe+0x1e0/0x930
    __driver_probe_device+0x18c/0x3d0
    driver_probe_device+0x4a/0xd0
    __driver_attach+0x1e1/0x4a0
    bus_for_each_dev+0xef/0x170
    bus_add_driver+0x29d/0x5c0
    driver_register+0x130/0x450
    uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
    do_one_initcall+0xb9/0x450
    do_init_module+0x281/0x8c0
    load_module+0x1693/0x2090
    init_module_from_file+0xe8/0x150
unreferenced object 0xff11000150941840 (size 192):
  comm "(udev-worker)", pid 2753, jiffies 4295015473
  hex dump (first 32 bytes):
    40 fd f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00  @...............
    c0 2b e5 08 00 00 a0 ff c0 fb ff ff 00 00 00 00  .+..............
  backtrace (crc efd34d21):
    __kmalloc_cache_node_noprof+0x39b/0x450
    idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
    idxd_drv_enable_wq+0x391/0x1100 [idxd]
    0xffffffffc22aa490
    really_probe+0x1e0/0x930
    __driver_probe_device+0x18c/0x3d0
    driver_probe_device+0x4a/0xd0
    __driver_attach+0x1e1/0x4a0
    bus_for_each_dev+0xef/0x170
    bus_add_driver+0x29d/0x5c0
    driver_register+0x130/0x450
    uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
    do_one_initcall+0xb9/0x450
    do_init_module+0x281/0x8c0
    load_module+0x1693/0x2090
    init_module_from_file+0xe8/0x150
unreferenced object 0xff110001509425c0 (size 192):
  comm "(udev-worker)", pid 2753, jiffies 4295015473
  hex dump (first 32 bytes):
    40 d1 f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00  @...............
    00 2c e5 08 00 00 a0 ff 00 fc ff ff 00 00 00 00  .,..............
  backtrace (crc 3e67f08):
    __kmalloc_cache_node_noprof+0x39b/0x450
    idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
    idxd_drv_enable_wq+0x391/0x1100 [idxd]
    0xffffffffc22aa490
    really_probe+0x1e0/0x930
    __driver_probe_device+0x18c/0x3d0
    driver_probe_device+0x4a/0xd0
    __driver_attach+0x1e1/0x4a0
    bus_for_each_dev+0xef/0x170
    bus_add_driver+0x29d/0x5c0
    driver_register+0x130/0x450
    uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
    do_one_initcall+0xb9/0x450
    do_init_module+0x281/0x8c0
    load_module+0x1693/0x2090
    init_module_from_file+0xe8/0x150
unreferenced object 0xff11000150942740 (size 192):
  comm "(udev-worker)", pid 2753, jiffies 4295015473
  hex dump (first 32 bytes):
    40 dd f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00  @...............
    40 2c e5 08 00 00 a0 ff 40 fc ff ff 00 00 00 00  @,......@.......
  backtrace (crc 4e740d35):
    __kmalloc_cache_node_noprof+0x39b/0x450
    idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
    idxd_drv_enable_wq+0x391/0x1100 [idxd]
    0xffffffffc22aa490
    really_probe+0x1e0/0x930
    __driver_probe_device+0x18c/0x3d0
    driver_probe_device+0x4a/0xd0
    __driver_attach+0x1e1/0x4a0
    bus_for_each_dev+0xef/0x170
    bus_add_driver+0x29d/0x5c0
    driver_register+0x130/0x450
    uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
    do_one_initcall+0xb9/0x450
    do_init_module+0x281/0x8c0
    load_module+0x1693/0x2090
    init_module_from_file+0xe8/0x150
unreferenced object 0xff11000150943340 (size 192):
  comm "(udev-worker)", pid 2753, jiffies 4295015473
  hex dump (first 32 bytes):
    40 c2 f9 d4 01 00 11 ff 00 00 00 00 00 00 00 00  @...............
    80 2c e5 08 00 00 a0 ff 80 fc ff ff 00 00 00 00  .,..............
  backtrace (crc 68538b12):
    __kmalloc_cache_node_noprof+0x39b/0x450
    idxd_wq_alloc_resources+0x6c9/0x1640 [idxd]
    idxd_drv_enable_wq+0x391/0x1100 [idxd]
    0xffffffffc22aa490
    really_probe+0x1e0/0x930
    __driver_probe_device+0x18c/0x3d0
    driver_probe_device+0x4a/0xd0
    __driver_attach+0x1e1/0x4a0
    bus_for_each_dev+0xef/0x170
    bus_add_driver+0x29d/0x5c0
    driver_register+0x130/0x450
    uncore_pm_notify+0xd1/0x130 [intel_uncore_frequency]
    do_one_initcall+0xb9/0x450
    do_init_module+0x281/0x8c0
    load_module+0x1693/0x2090
    init_module_from_file+0xe8/0x150

I have a series that tries to fix those (and a few other things), I am
finishing testing it, should be able to propose it soon.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-07-31  0:17 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-17 10:27 [PATCH v3 0/2] dmaengine: idxd: Fix refcount and cleanup issues on module unload Yi Sun
2025-06-17 10:27 ` [PATCH v3 1/2] dmaengine: idxd: Remove improper idxd_free Yi Sun
2025-06-17 22:13   ` Fenghua Yu
2025-07-27  9:02     ` Yi Sun
2025-07-28  8:21       ` Shuai Xue
2025-06-17 10:27 ` [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload Yi Sun
2025-06-17 21:58   ` Fenghua Yu
2025-06-18  0:38     ` Vinicius Costa Gomes
2025-07-27  9:16     ` Yi Sun
2025-07-28  8:40       ` Shuai Xue
2025-07-28 11:43         ` Yi Sun
2025-07-29  2:46           ` Shuai Xue
2025-07-29  3:15             ` Yi Sun
2025-07-29  6:00               ` Shuai Xue
2025-07-31  0:17             ` Vinicius Costa Gomes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).