* [PATCH] dmaengine: idxd: fix problems on initialization error path.
@ 2026-05-20 14:37 Steve Wahl
2026-05-20 15:20 ` sashiko-bot
2026-05-20 20:10 ` Vinicius Costa Gomes
0 siblings, 2 replies; 5+ messages in thread
From: Steve Wahl @ 2026-05-20 14:37 UTC (permalink / raw)
To: Steve Wahl, Vinicius Costa Gomes, Dave Jiang, Vinod Koul,
Frank Li, dmaengine, linux-kernel
Cc: Russ Anderson, Dimitri Sivanich
Some error paths within idxd_pci_probe_alloc and functions it calls
did not keep proper track of what has already been allocated or freed,
resulting in calling destroy_workqueue with a null pointer, and once
that was fixed, attempting to free structures more than once. These
conditions were hit running in a kexec'd kdump kernel with reduced
resources, causing the "Device is HALTED!" branch in
idxd_device_init_reset to be taken.
In idxd_conf_device_release, check that the workqueue has been
allocated before trying to destroy it. And in idxd_free and
idxd_alloc, do not attempt to free allocations that
idxd_conf_device_release, called through put_device, will already have
freed.
Fixes: 3d33de353b1f ("dmaengine: idxd: Fix not releasing workqueue on .release()")
Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
---
drivers/dma/idxd/init.c | 10 ++++++----
drivers/dma/idxd/sysfs.c | 3 ++-
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index f1cfc7790d95..227e323cc5a0 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -607,9 +607,6 @@ static void idxd_free(struct idxd_device *idxd)
return;
put_device(idxd_confdev(idxd));
- bitmap_free(idxd->opcap_bmap);
- ida_free(&idxd_ida, idxd->id);
- kfree(idxd);
}
static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_data *data)
@@ -649,8 +646,13 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
return idxd;
err_name:
+ /*
+ * once device_initialize(conf_dev) is called,
+ * put_device(conf_dev) will end up calling
+ * idxd_conf_device_release() which will free the rest.
+ */
put_device(conf_dev);
- bitmap_free(idxd->opcap_bmap);
+ return NULL;
err_opcap:
ida_free(&idxd_ida, idxd->id);
err_ida:
diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
index 6d251095c350..d5ffc641c856 100644
--- a/drivers/dma/idxd/sysfs.c
+++ b/drivers/dma/idxd/sysfs.c
@@ -1836,7 +1836,8 @@ static void idxd_conf_device_release(struct device *dev)
{
struct idxd_device *idxd = confdev_to_idxd(dev);
- destroy_workqueue(idxd->wq);
+ if (idxd->wq)
+ destroy_workqueue(idxd->wq);
kfree(idxd->groups);
bitmap_free(idxd->wq_enable_map);
kfree(idxd->wqs);
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] dmaengine: idxd: fix problems on initialization error path.
2026-05-20 14:37 [PATCH] dmaengine: idxd: fix problems on initialization error path Steve Wahl
@ 2026-05-20 15:20 ` sashiko-bot
2026-05-20 19:59 ` Vinicius Costa Gomes
2026-05-20 20:10 ` Vinicius Costa Gomes
1 sibling, 1 reply; 5+ messages in thread
From: sashiko-bot @ 2026-05-20 15:20 UTC (permalink / raw)
To: Steve Wahl; +Cc: vkoul, Frank.Li, dmaengine
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
Pre-existing issues:
- [High] Incomplete Fix / Double Free & Double Destroy
- [High] Immediate Double Free in child device cleanup functions
- [High] Use-After-Free and Double Free due to dangling pointers in registration rollback
--
commit 36d73f2b6ed131031e13332b322a76ce56830ad7
Author: Steve Wahl <steve.wahl@hpe.com>
Subject: dmaengine: idxd: fix problems on initialization error path.
This patch fixes double frees and NULL pointer dereferences in the IDXD
initialization error paths by ensuring allocations are freed only once
during the device release callback, and checking if the workqueue exists
before destroying it.
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index f1cfc7790d950..227e323cc5a0b 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
[ ... ]
> @@ -649,8 +646,13 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
> return idxd;
>
> err_name:
> + /*
> + * once device_initialize(conf_dev) is called,
> + * put_device(conf_dev) will end up calling
> + * idxd_conf_device_release() which will free the rest.
> + */
> put_device(conf_dev);
[Severity: High]
This is a pre-existing issue, but does a similar pattern cause immediate double
frees in the child device cleanup functions?
In functions like idxd_clean_wqs(), idxd_clean_groups(), and
idxd_clean_engines() (as well as their setup error paths), cleanup is performed
by calling put_device() on the child kobjects.
This drops the reference count to 0 and synchronously invokes the sysfs release
callback (e.g., idxd_conf_wq_release(), idxd_conf_group_release()), which safely
frees the structure (kfree(wq)) and its internal allocations.
However, immediately after put_device() returns, these cleanup functions
explicitly call kfree() on the exact same device structure (and inner pointers
like wq->wqcfg). Will this result in immediate double frees when unloading the
driver or failing initialization?
[Severity: High]
This isn't a bug introduced by this patch, but are there dangling pointers in
the registration rollback that cause a use-after-free?
If idxd_register_wq_devices() (or equivalent engine/group registration
functions) fails, its error rollback drops references to all child devices via
put_device() or device_unregister(), which invokes their release callbacks and
frees them.
It does not remove the dangling pointers from the idxd->wqs (or groups/engines)
arrays.
The error then bubbles up to idxd_pci_probe_alloc(), which jumps to
err_dev_register and calls idxd_cleanup(). This eventually calls
idxd_clean_wqs(), which iterates over the full idxd->wqs array and
unconditionally calls put_device() and kfree() on the already-freed pointers.
> + return NULL;
> err_opcap:
> ida_free(&idxd_ida, idxd->id);
> err_ida:
[ ... ]
> diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
> index 6d251095c350f..d5ffc641c8567 100644
> --- a/drivers/dma/idxd/sysfs.c
> +++ b/drivers/dma/idxd/sysfs.c
> @@ -1836,7 +1836,8 @@ static void idxd_conf_device_release(struct device *dev)
> {
> struct idxd_device *idxd = confdev_to_idxd(dev);
>
> - destroy_workqueue(idxd->wq);
> + if (idxd->wq)
> + destroy_workqueue(idxd->wq);
[Severity: High]
This isn't a bug introduced by this patch, but do the error paths from
idxd_probe() still cause a double free and double destroy?
If an error occurs in idxd_probe() (e.g., idxd_device_load_config() fails),
idxd_cleanup_internals() or idxd_setup_internals() error paths are invoked.
These functions manually free arrays (idxd->groups, idxd->wqs, idxd->engines)
and destroy the workqueue (destroy_workqueue(idxd->wq)) but leave the pointers
non-NULL.
When the error propagates up to idxd_pci_probe_alloc(), it calls idxd_free()
which calls put_device(). This triggers idxd_conf_device_release().
Because the pointers are still non-NULL, idxd_conf_device_release()
unconditionally calls kfree() on the arrays a second time and executes
destroy_workqueue(idxd->wq) again on the already-destroyed workqueue.
> kfree(idxd->groups);
> bitmap_free(idxd->wq_enable_map);
> kfree(idxd->wqs);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260520143732.119407-1-steve.wahl@hpe.com?part=1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] dmaengine: idxd: fix problems on initialization error path.
2026-05-20 15:20 ` sashiko-bot
@ 2026-05-20 19:59 ` Vinicius Costa Gomes
0 siblings, 0 replies; 5+ messages in thread
From: Vinicius Costa Gomes @ 2026-05-20 19:59 UTC (permalink / raw)
To: sashiko-reviews, Steve Wahl; +Cc: vkoul, Frank.Li, dmaengine
sashiko-bot@kernel.org writes:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] Incomplete Fix / Double Free & Double Destroy
> - [High] Immediate Double Free in child device cleanup functions
> - [High] Use-After-Free and Double Free due to dangling pointers in registration rollback
> --
>
> commit 36d73f2b6ed131031e13332b322a76ce56830ad7
> Author: Steve Wahl <steve.wahl@hpe.com>
> Subject: dmaengine: idxd: fix problems on initialization error path.
>
> This patch fixes double frees and NULL pointer dereferences in the IDXD
> initialization error paths by ensuring allocations are freed only once
> during the device release callback, and checking if the workqueue exists
> before destroying it.
>
>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>> index f1cfc7790d950..227e323cc5a0b 100644
>> --- a/drivers/dma/idxd/init.c
>> +++ b/drivers/dma/idxd/init.c
>
> [ ... ]
>
>> @@ -649,8 +646,13 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
>> return idxd;
>>
>> err_name:
>> + /*
>> + * once device_initialize(conf_dev) is called,
>> + * put_device(conf_dev) will end up calling
>> + * idxd_conf_device_release() which will free the rest.
>> + */
>> put_device(conf_dev);
>
> [Severity: High]
> This is a pre-existing issue, but does a similar pattern cause immediate double
> frees in the child device cleanup functions?
>
> In functions like idxd_clean_wqs(), idxd_clean_groups(), and
> idxd_clean_engines() (as well as their setup error paths), cleanup is performed
> by calling put_device() on the child kobjects.
>
> This drops the reference count to 0 and synchronously invokes the sysfs release
> callback (e.g., idxd_conf_wq_release(), idxd_conf_group_release()), which safely
> frees the structure (kfree(wq)) and its internal allocations.
>
> However, immediately after put_device() returns, these cleanup functions
> explicitly call kfree() on the exact same device structure (and inner pointers
> like wq->wqcfg). Will this result in immediate double frees when unloading the
> driver or failing initialization?
>
I believe this patch already solves this issue:
https://lore.kernel.org/dmaengine/20260415205452.67155-1-dbgh9129@gmail.com/
>
> [Severity: High]
> This isn't a bug introduced by this patch, but are there dangling pointers in
> the registration rollback that cause a use-after-free?
>
> If idxd_register_wq_devices() (or equivalent engine/group registration
> functions) fails, its error rollback drops references to all child devices via
> put_device() or device_unregister(), which invokes their release callbacks and
> frees them.
>
> It does not remove the dangling pointers from the idxd->wqs (or groups/engines)
> arrays.
>
> The error then bubbles up to idxd_pci_probe_alloc(), which jumps to
> err_dev_register and calls idxd_cleanup(). This eventually calls
> idxd_clean_wqs(), which iterates over the full idxd->wqs array and
> unconditionally calls put_device() and kfree() on the already-freed pointers.
>
That patch from Yuho Choi may also fix this.
>> + return NULL;
>> err_opcap:
>> ida_free(&idxd_ida, idxd->id);
>> err_ida:
>
> [ ... ]
>
>> diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
>> index 6d251095c350f..d5ffc641c8567 100644
>> --- a/drivers/dma/idxd/sysfs.c
>> +++ b/drivers/dma/idxd/sysfs.c
>> @@ -1836,7 +1836,8 @@ static void idxd_conf_device_release(struct device *dev)
>> {
>> struct idxd_device *idxd = confdev_to_idxd(dev);
>>
>> - destroy_workqueue(idxd->wq);
>> + if (idxd->wq)
>> + destroy_workqueue(idxd->wq);
>
> [Severity: High]
> This isn't a bug introduced by this patch, but do the error paths from
> idxd_probe() still cause a double free and double destroy?
>
> If an error occurs in idxd_probe() (e.g., idxd_device_load_config() fails),
> idxd_cleanup_internals() or idxd_setup_internals() error paths are invoked.
> These functions manually free arrays (idxd->groups, idxd->wqs, idxd->engines)
> and destroy the workqueue (destroy_workqueue(idxd->wq)) but leave the pointers
> non-NULL.
>
> When the error propagates up to idxd_pci_probe_alloc(), it calls idxd_free()
> which calls put_device(). This triggers idxd_conf_device_release().
>
> Because the pointers are still non-NULL, idxd_conf_device_release()
> unconditionally calls kfree() on the arrays a second time and executes
> destroy_workqueue(idxd->wq) again on the already-destroyed workqueue.
>
>> kfree(idxd->groups);
>> bitmap_free(idxd->wq_enable_map);
>> kfree(idxd->wqs);
>
> --
> Sashiko AI review · https://sashiko.dev/#/patchset/20260520143732.119407-1-steve.wahl@hpe.com?part=1
--
Vinicius
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] dmaengine: idxd: fix problems on initialization error path.
2026-05-20 14:37 [PATCH] dmaengine: idxd: fix problems on initialization error path Steve Wahl
2026-05-20 15:20 ` sashiko-bot
@ 2026-05-20 20:10 ` Vinicius Costa Gomes
2026-05-21 16:23 ` Steve Wahl
1 sibling, 1 reply; 5+ messages in thread
From: Vinicius Costa Gomes @ 2026-05-20 20:10 UTC (permalink / raw)
To: Steve Wahl, Steve Wahl, Dave Jiang, Vinod Koul, Frank Li,
dmaengine, linux-kernel
Cc: Russ Anderson, Dimitri Sivanich
Hi Steve,
Steve Wahl <steve.wahl@hpe.com> writes:
> Some error paths within idxd_pci_probe_alloc and functions it calls
> did not keep proper track of what has already been allocated or freed,
> resulting in calling destroy_workqueue with a null pointer, and once
> that was fixed, attempting to free structures more than once. These
> conditions were hit running in a kexec'd kdump kernel with reduced
> resources, causing the "Device is HALTED!" branch in
> idxd_device_init_reset to be taken.
>
> In idxd_conf_device_release, check that the workqueue has been
> allocated before trying to destroy it. And in idxd_free and
> idxd_alloc, do not attempt to free allocations that
> idxd_conf_device_release, called through put_device, will already have
> freed.
>
> Fixes: 3d33de353b1f ("dmaengine: idxd: Fix not releasing workqueue on .release()")
>
> Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
> ---
> drivers/dma/idxd/init.c | 10 ++++++----
> drivers/dma/idxd/sysfs.c | 3 ++-
> 2 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index f1cfc7790d95..227e323cc5a0 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -607,9 +607,6 @@ static void idxd_free(struct idxd_device *idxd)
> return;
>
> put_device(idxd_confdev(idxd));
> - bitmap_free(idxd->opcap_bmap);
> - ida_free(&idxd_ida, idxd->id);
> - kfree(idxd);
> }
>
> static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_data *data)
> @@ -649,8 +646,13 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
> return idxd;
>
> err_name:
> + /*
> + * once device_initialize(conf_dev) is called,
> + * put_device(conf_dev) will end up calling
> + * idxd_conf_device_release() which will free the rest.
> + */
> put_device(conf_dev);
> - bitmap_free(idxd->opcap_bmap);
> + return NULL;
> err_opcap:
> ida_free(&idxd_ida, idxd->id);
> err_ida:
I think that this first part should be a separate patch.
> diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
> index 6d251095c350..d5ffc641c856 100644
> --- a/drivers/dma/idxd/sysfs.c
> +++ b/drivers/dma/idxd/sysfs.c
> @@ -1836,7 +1836,8 @@ static void idxd_conf_device_release(struct device *dev)
> {
> struct idxd_device *idxd = confdev_to_idxd(dev);
>
> - destroy_workqueue(idxd->wq);
> + if (idxd->wq)
> + destroy_workqueue(idxd->wq);
> kfree(idxd->groups);
> bitmap_free(idxd->wq_enable_map);
> kfree(idxd->wqs);
And this another.
Cheers,
--
Vinicius
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] dmaengine: idxd: fix problems on initialization error path.
2026-05-20 20:10 ` Vinicius Costa Gomes
@ 2026-05-21 16:23 ` Steve Wahl
0 siblings, 0 replies; 5+ messages in thread
From: Steve Wahl @ 2026-05-21 16:23 UTC (permalink / raw)
To: Vinicius Costa Gomes
Cc: Steve Wahl, Dave Jiang, Vinod Koul, Frank Li, dmaengine,
linux-kernel, Russ Anderson, Dimitri Sivanich
On Wed, May 20, 2026 at 01:10:03PM -0700, Vinicius Costa Gomes wrote:
> Hi Steve,
>
> Steve Wahl <steve.wahl@hpe.com> writes:
>
> > Some error paths within idxd_pci_probe_alloc and functions it calls
> > did not keep proper track of what has already been allocated or freed,
> > resulting in calling destroy_workqueue with a null pointer, and once
> > that was fixed, attempting to free structures more than once. These
> > conditions were hit running in a kexec'd kdump kernel with reduced
> > resources, causing the "Device is HALTED!" branch in
> > idxd_device_init_reset to be taken.
> >
> > In idxd_conf_device_release, check that the workqueue has been
> > allocated before trying to destroy it. And in idxd_free and
> > idxd_alloc, do not attempt to free allocations that
> > idxd_conf_device_release, called through put_device, will already have
> > freed.
> >
> > Fixes: 3d33de353b1f ("dmaengine: idxd: Fix not releasing workqueue on .release()")
> >
> > Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
> > ---
> > drivers/dma/idxd/init.c | 10 ++++++----
> > drivers/dma/idxd/sysfs.c | 3 ++-
> > 2 files changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> > index f1cfc7790d95..227e323cc5a0 100644
> > --- a/drivers/dma/idxd/init.c
> > +++ b/drivers/dma/idxd/init.c
> > @@ -607,9 +607,6 @@ static void idxd_free(struct idxd_device *idxd)
> > return;
> >
> > put_device(idxd_confdev(idxd));
> > - bitmap_free(idxd->opcap_bmap);
> > - ida_free(&idxd_ida, idxd->id);
> > - kfree(idxd);
> > }
> >
> > static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_data *data)
> > @@ -649,8 +646,13 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
> > return idxd;
> >
> > err_name:
> > + /*
> > + * once device_initialize(conf_dev) is called,
> > + * put_device(conf_dev) will end up calling
> > + * idxd_conf_device_release() which will free the rest.
> > + */
> > put_device(conf_dev);
> > - bitmap_free(idxd->opcap_bmap);
> > + return NULL;
> > err_opcap:
> > ida_free(&idxd_ida, idxd->id);
> > err_ida:
>
> I think that this first part should be a separate patch.
>
> > diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
> > index 6d251095c350..d5ffc641c856 100644
> > --- a/drivers/dma/idxd/sysfs.c
> > +++ b/drivers/dma/idxd/sysfs.c
> > @@ -1836,7 +1836,8 @@ static void idxd_conf_device_release(struct device *dev)
> > {
> > struct idxd_device *idxd = confdev_to_idxd(dev);
> >
> > - destroy_workqueue(idxd->wq);
> > + if (idxd->wq)
> > + destroy_workqueue(idxd->wq);
> > kfree(idxd->groups);
> > bitmap_free(idxd->wq_enable_map);
> > kfree(idxd->wqs);
>
> And this another.
I can split it as you desire.
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-05-21 16:23 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-20 14:37 [PATCH] dmaengine: idxd: fix problems on initialization error path Steve Wahl
2026-05-20 15:20 ` sashiko-bot
2026-05-20 19:59 ` Vinicius Costa Gomes
2026-05-20 20:10 ` Vinicius Costa Gomes
2026-05-21 16:23 ` Steve Wahl
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox