DMA Engine development
 help / color / mirror / Atom feed
From: Guixin Liu <kanie@linux.alibaba.com>
To: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>, Vinod Koul <vkoul@kernel.org>,
	Frank Li <Frank.Li@kernel.org>,
	dmaengine@vger.kernel.org, Xunlei Pang <xlpang@linux.alibaba.com>,
	oliver.yang@linux.alibaba.com
Subject: Re: [PATCH v2] dmaengine: idxd: Fix use-after-free of idxd_wq
Date: Wed, 29 Apr 2026 10:55:32 +0800	[thread overview]
Message-ID: <e9500d11-11c0-461c-a215-705b4f9b15f6@linux.alibaba.com> (raw)
In-Reply-To: <3847a039-bdab-478d-be57-c46c7665f92c@linux.alibaba.com>

Hi Vinicius,

It looks like all the issues from the AI Review have been clarified,
Can this patch be accepted?

Best Regards,
Guixin Liu

在 2026/4/24 13:56, Guixin Liu 写道:
>
>
> 在 2026/4/24 13:43, Guixin Liu 写道:
>>
>>
>> 在 2026/4/24 02:54, Vinicius Costa Gomes 写道:
>>> Guixin Liu <kanie@linux.alibaba.com> writes:
>>>
>>>> 在 2026/4/23 05:34, Vinicius Costa Gomes 写道:
>>>>> On Wed, 15 Apr 2026 17:50:30 +0800, Guixin Liu 
>>>>> <kanie@linux.alibaba.com> wrote:
>>>>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>>>>> index fb80803d5b57..c3cfd96074c9 100644
>>>>>> --- a/drivers/dma/idxd/init.c
>>>>>> +++ b/drivers/dma/idxd/init.c
>>>>>> @@ -1293,13 +1293,30 @@ static void idxd_remove(struct pci_dev 
>>>>>> *pdev)
>>>>>> [ ... skip 14 lines ... ]
>>>>>> +     *
>>>>>> +     * Use device_release_driver() to only unbind the driver 
>>>>>> (triggering
>>>>>> +     * idxd_device_drv_remove()) without touching sysfs. Then 
>>>>>> safely
>>>>>> +     * unregister children before the parent.
>>>>>> +     */
>>>>>> +    device_release_driver(idxd_confdev(idxd));
>>>> Thanks for the reply.
>>>>> Below are sashiko comments verbatim, some notes:
>>>>>
>>>>>    - I do believe the race condition that it points out is real (but
>>>>>      very narrow), but not sure that we don't have a similar one in
>>>>>      current code (i.e. it wasn't introduced by your patch);
>>>>>    - The possible wq leak was definitely not introduced by this 
>>>>> patch;
>>>>>    - The question about what happens with the file references after a
>>>>>      device is removed is a separate issue;
>>>>>
>>>>> The patch looks good, not opposed to it at all, the only one I am a
>>>>> bit concerned, about this patch, to be sure, is the race condition,
>>>>> and the two "paths" to reach it. (the others we can fix separately)
>>>>>
>>>>> sashiko.dev <sashiko@sashiko.dev>:
>>>>>
>>>>> Does this introduce a race condition with sysfs?
>>>>>
>>>>> Since the driver's sysfs nodes remain active after 
>>>>> device_release_driver()
>>>>> is called, a user with sysfs privileges could write to the bind 
>>>>> attribute
>>>>> to re-probe the driver before the device unregistration completes.
>>>>>
>>>>> If idxd_device_drv_probe()->idxd_wqs_setup() runs concurrently with
>>>>> idxd_unregister_devices(), does it cause a use-after-free when it
>>>>> iterates over the idxd->wqs[] objects being freed, since
>>>>> suppress_bind_attrs is not set?
>>>> I think re-probe will call idxd_pci_probe() to allocate an new idxd 
>>>> device,
>>>> this avoids accessing stale data.
>>> Yeah, I missed the (now obvious) that a new 'probe' also means a new
>>> device. So this "path" is not valid. Thank you.
>>>
>>>>> Also, does this code leak the workqueue?
>>>>>
>>>>> The workqueue idxd->wq is dynamically allocated using 
>>>>> create_workqueue()
>>>>> during device initialization, but destroy_workqueue(idxd->wq) does 
>>>>> not
>>>>> appear to be called in idxd_remove() or the sysfs device release 
>>>>> callback.
>>>> Looks like it's b7cb9a034305 ("dmaengine: idxd: Fix refcount underflow
>>>> on module unload")
>>>> overlooked this, destory_workqueue(idxd->wq) is called in 
>>>> idxd_cleanup()
>>>> -> idxd_cleanup_internals().
>>>>
>>>> If so, I can send another patch to fix this.
>>> Yes, please do.
>> Will be added in v3.
> Well, the idxd->wq is freed in idxd_conf_device_release(), so there is no
> problem, no need to fix.
>>>>>> idxd_unregister_devices(idxd);
>>>>> sashiko.dev <sashiko@sashiko.dev>:
>>>>>
>>>>> If idxd_unregister_devices() drops the last reference to the child wq
>>>>> objects and frees them, can idxd_shutdown() trigger a use-after-free
>>>>> when called shortly after?
>>>>>
>>>>> For example, if a hardware error interrupt fired during teardown,
>>>>> idxd_shutdown()->flush_workqueue() could execute an error handler 
>>>>> that
>>>>> iterates over the freed idxd->wqs[] memory.
>>>>>
>>>>> Additionally, if the hardware is wedged,
>>>>> idxd_shutdown()->idxd_device_disable()->idxd_device_clear_state()
>>>>> iterates over idxd->wqs[] and attempts to acquire 
>>>>> mutex_lock(&wq->wq_lock)
>>>>> on the freed memory.
>>>> I took a look — idxd_shutdown() is only invoked by idxd_remove() and
>>>> during system shutdown/reboot.
>>>> Is idxd_shutdown() ever reached from the idxd hardware error interrupt
>>>> path?
>>> The bot's suggestion that this code path could be hit by a hardware
>>> error is false, but the shutdown path (like the user pressing a button)
>>> could be valid.
>>>
>>> A crash during that could cause the machine to not poweroff, which is
>>> not good. Worth thinking about this, if there's something we could do
>>> while we are here.
>>
>> Both idxd_remove() and idxd_shutdown() are called under 
>> device_lock(dev) —
>> the lock is acquired by the driver core before invoking any bus 
>> callback:
>>
>> 1.
>>
>>    Remove path: device_release_driver_internal() →
>>    __device_driver_lock(dev, parent) → device_lock(dev) → ... →
>>    pci_device_remove() → idxd_remove()
>>
>> 2.
>>
>>    Shutdown path: device_shutdown() → device_lock(dev) →
>>    dev->bus->shutdown(dev) → pci_device_shutdown() → idxd_shutdown()
>>
>> Since both paths acquire the same dev->mutex on the same PCI device,
>>
>> idxd_remove() and idxd_shutdown() are mutually exclusive.
>>
>> Best Regards,
>>
>> Guixin Liu
>>
>>>>> Is there also a missing reference count for the cdev open path?
>>>>>
>>>>> Since idxd_unregister_devices() frees the wq structure, if user space
>>>>> holds an open file descriptor, cdev_device_del() won't revoke it.
>>>>> Can file operations like mmap or poll subsequently access the freed
>>>>> ctx->wq pointer? Should idxd_cdev_open() take a kobject reference on
>>>>> the parent wq device?
>>>> To be honest, I'm not very familiar with the idxd driver. At this 
>>>> point,
>>>> it looks like the idxd driver needs a state machine to ensure mutual
>>>> exclusion
>>>> across the various concurrent paths.
>>> Don't worry about this one, it's a separate issue.
>>>
>>>> Best Regards,
>>>> Guixin Liu
>>>>> via: 
>>>>> https://sashiko.dev/#/message/20260415095030.42183-1-kanie@linux.alibaba.com
>>>>> via: 
>>>>> https://sashiko.dev/#/message/20260415095030.42183-1-kanie@linux.alibaba.com
>>>>>
>>>>> via: 
>>>>> https://sashiko.dev/#/message/20260415095030.42183-1-kanie@linux.alibaba.com
>>>>> via: 
>>>>> https://sashiko.dev/#/message/20260415095030.42183-1-kanie@linux.alibaba.com
>>>>>
>>>
>>> Cheers,
>>


  reply	other threads:[~2026-04-29  2:55 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-15  9:50 [PATCH v2] dmaengine: idxd: Fix use-after-free of idxd_wq Guixin Liu
2026-04-22 21:34 ` Vinicius Costa Gomes
2026-04-23  9:01   ` Guixin Liu
2026-04-23 18:54     ` Vinicius Costa Gomes
2026-04-24  5:43       ` Guixin Liu
2026-04-24  5:56         ` Guixin Liu
2026-04-29  2:55           ` Guixin Liu [this message]
2026-04-29 17:54 ` Vinicius Costa Gomes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e9500d11-11c0-461c-a215-705b4f9b15f6@linux.alibaba.com \
    --to=kanie@linux.alibaba.com \
    --cc=Frank.Li@kernel.org \
    --cc=dave.jiang@intel.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=oliver.yang@linux.alibaba.com \
    --cc=vinicius.gomes@intel.com \
    --cc=vkoul@kernel.org \
    --cc=xlpang@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox