From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
To: Guixin Liu <kanie@linux.alibaba.com>
Cc: Dave Jiang <dave.jiang@intel.com>, Vinod Koul <vkoul@kernel.org>,
Frank Li <Frank.Li@kernel.org>,
dmaengine@vger.kernel.org, Xunlei Pang <xlpang@linux.alibaba.com>,
oliver.yang@linux.alibaba.com
Subject: Re: [PATCH v2] dmaengine: idxd: Fix use-after-free of idxd_wq
Date: Thu, 23 Apr 2026 11:54:11 -0700 [thread overview]
Message-ID: <87cxzp1p9o.fsf@intel.com> (raw)
In-Reply-To: <968e2a4f-7613-4ef2-8cf4-68710ec55163@linux.alibaba.com>
Guixin Liu <kanie@linux.alibaba.com> writes:
> 在 2026/4/23 05:34, Vinicius Costa Gomes 写道:
>> On Wed, 15 Apr 2026 17:50:30 +0800, Guixin Liu <kanie@linux.alibaba.com> wrote:
>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>> index fb80803d5b57..c3cfd96074c9 100644
>>> --- a/drivers/dma/idxd/init.c
>>> +++ b/drivers/dma/idxd/init.c
>>> @@ -1293,13 +1293,30 @@ static void idxd_remove(struct pci_dev *pdev)
>>> [ ... skip 14 lines ... ]
>>> + *
>>> + * Use device_release_driver() to only unbind the driver (triggering
>>> + * idxd_device_drv_remove()) without touching sysfs. Then safely
>>> + * unregister children before the parent.
>>> + */
>>> + device_release_driver(idxd_confdev(idxd));
> Thanks for the reply.
>> Below are sashiko comments verbatim, some notes:
>>
>> - I do believe the race condition that it points out is real (but
>> very narrow), but not sure that we don't have a similar one in
>> current code (i.e. it wasn't introduced by your patch);
>> - The possible wq leak was definitely not introduced by this patch;
>> - The question about what happens with the file references after a
>> device is removed is a separate issue;
>>
>> The patch looks good, not opposed to it at all, the only one I am a
>> bit concerned, about this patch, to be sure, is the race condition,
>> and the two "paths" to reach it. (the others we can fix separately)
>>
>> sashiko.dev <sashiko@sashiko.dev>:
>>
>> Does this introduce a race condition with sysfs?
>>
>> Since the driver's sysfs nodes remain active after device_release_driver()
>> is called, a user with sysfs privileges could write to the bind attribute
>> to re-probe the driver before the device unregistration completes.
>>
>> If idxd_device_drv_probe()->idxd_wqs_setup() runs concurrently with
>> idxd_unregister_devices(), does it cause a use-after-free when it
>> iterates over the idxd->wqs[] objects being freed, since
>> suppress_bind_attrs is not set?
> I think re-probe will call idxd_pci_probe() to allocate an new idxd device,
> this avoids accessing stale data.
Yeah, I missed the (now obvious) that a new 'probe' also means a new
device. So this "path" is not valid. Thank you.
>> Also, does this code leak the workqueue?
>>
>> The workqueue idxd->wq is dynamically allocated using create_workqueue()
>> during device initialization, but destroy_workqueue(idxd->wq) does not
>> appear to be called in idxd_remove() or the sysfs device release callback.
> Looks like it's b7cb9a034305 ("dmaengine: idxd: Fix refcount underflow
> on module unload")
> overlooked this, destory_workqueue(idxd->wq) is called in idxd_cleanup()
> -> idxd_cleanup_internals().
>
> If so, I can send another patch to fix this.
Yes, please do.
>>> idxd_unregister_devices(idxd);
>> sashiko.dev <sashiko@sashiko.dev>:
>>
>> If idxd_unregister_devices() drops the last reference to the child wq
>> objects and frees them, can idxd_shutdown() trigger a use-after-free
>> when called shortly after?
>>
>> For example, if a hardware error interrupt fired during teardown,
>> idxd_shutdown()->flush_workqueue() could execute an error handler that
>> iterates over the freed idxd->wqs[] memory.
>>
>> Additionally, if the hardware is wedged,
>> idxd_shutdown()->idxd_device_disable()->idxd_device_clear_state()
>> iterates over idxd->wqs[] and attempts to acquire mutex_lock(&wq->wq_lock)
>> on the freed memory.
> I took a look — idxd_shutdown() is only invoked by idxd_remove() and
> during system shutdown/reboot.
> Is idxd_shutdown() ever reached from the idxd hardware error interrupt
> path?
The bot's suggestion that this code path could be hit by a hardware
error is false, but the shutdown path (like the user pressing a button)
could be valid.
A crash during that could cause the machine to not poweroff, which is
not good. Worth thinking about this, if there's something we could do
while we are here.
>>
>> Is there also a missing reference count for the cdev open path?
>>
>> Since idxd_unregister_devices() frees the wq structure, if user space
>> holds an open file descriptor, cdev_device_del() won't revoke it.
>> Can file operations like mmap or poll subsequently access the freed
>> ctx->wq pointer? Should idxd_cdev_open() take a kobject reference on
>> the parent wq device?
> To be honest, I'm not very familiar with the idxd driver. At this point,
> it looks like the idxd driver needs a state machine to ensure mutual
> exclusion
> across the various concurrent paths.
Don't worry about this one, it's a separate issue.
>
> Best Regards,
> Guixin Liu
>>
>> via: https://sashiko.dev/#/message/20260415095030.42183-1-kanie@linux.alibaba.com
>> via: https://sashiko.dev/#/message/20260415095030.42183-1-kanie@linux.alibaba.com
>>
>> via: https://sashiko.dev/#/message/20260415095030.42183-1-kanie@linux.alibaba.com
>> via: https://sashiko.dev/#/message/20260415095030.42183-1-kanie@linux.alibaba.com
>>
>
Cheers,
--
Vinicius
next prev parent reply other threads:[~2026-04-23 18:54 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-15 9:50 [PATCH v2] dmaengine: idxd: Fix use-after-free of idxd_wq Guixin Liu
2026-04-22 21:34 ` Vinicius Costa Gomes
2026-04-23 9:01 ` Guixin Liu
2026-04-23 18:54 ` Vinicius Costa Gomes [this message]
2026-04-24 5:43 ` Guixin Liu
2026-04-24 5:56 ` Guixin Liu
2026-04-29 2:55 ` Guixin Liu
2026-04-29 17:54 ` Vinicius Costa Gomes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87cxzp1p9o.fsf@intel.com \
--to=vinicius.gomes@intel.com \
--cc=Frank.Li@kernel.org \
--cc=dave.jiang@intel.com \
--cc=dmaengine@vger.kernel.org \
--cc=kanie@linux.alibaba.com \
--cc=oliver.yang@linux.alibaba.com \
--cc=vkoul@kernel.org \
--cc=xlpang@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.