From: Zhangfei Gao <zhangfei.gao@linaro.org>
To: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Yang Shen <shenyang39@huawei.com>,
Herbert Xu <herbert@gondor.apana.org.au>,
Arnd Bergmann <arnd@arndb.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
linux-crypto@vger.kernel.org,
linux-accelerators@lists.ozlabs.org
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove
Date: Fri, 17 Jun 2022 16:20:30 +0800
Message-ID: <53b9acef-ad32-d0aa-fa1b-a7cb77a0d088@linaro.org>
In-Reply-To: <d90e8ea5-2f18-2eda-b4b2-711083aa7ecd@linaro.org>
On 2022/6/17 2:05 PM, Zhangfei Gao wrote:
>
>
> On 2022/6/16 4:14 PM, Jean-Philippe Brucker wrote:
>> On Thu, Jun 16, 2022 at 12:10:18PM +0800, Zhangfei Gao wrote:
>>>>> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
>>>>> index 281c54003edc..b6219c6bfb48 100644
>>>>> --- a/drivers/misc/uacce/uacce.c
>>>>> +++ b/drivers/misc/uacce/uacce.c
>>>>> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
>>>>>  	if (!q)
>>>>>  		return -ENOMEM;
>>>>> +	mutex_lock(&uacce->queues_lock);
>>>>> +
>>>>> +	if (!uacce->parent->driver) {
>>>> I don't think this is useful, because the core clears
>>>> parent->driver after
>>>> having run uacce_remove():
>>>>
>>>>   rmmod hisi_zip                    open()
>>>>    ...                               uacce_fops_open()
>>>>    __device_release_driver()          ...
>>>>     pci_device_remove()
>>>>      hisi_zip_remove()
>>>>       hisi_qm_uninit()
>>>>        uacce_remove()
>>>>         ...                           ...
>>>>                                      mutex_lock(uacce->queues_lock)
>>>>    ...                                if (!uacce->parent->driver)
>>>>    device_unbind_cleanup()            /* driver still valid, proceed */
>>>>     dev->driver = NULL
>>> The check if (!uacce->parent->driver) is required, otherwise a NULL
>>> pointer dereference may happen.
>> I agree we need something, what I mean is that this check is not
>> sufficient.
>>
>>> iommu_sva_bind_device
>>> const struct iommu_ops *ops = dev_iommu_ops(dev); ->
>>> dev->iommu->iommu_dev->ops
>>>
>>> rmmod has no issue, but removing the parent PCI device does.
>> Ah right, relying on the return value of bind() wouldn't be enough
>> even if
>> we mandated SVA.
>>
>> [...]
>>>> I think we need the global uacce_mutex to serialize uacce_remove() and
>>>> uacce_fops_open(). uacce_remove() would do everything, including
>>>> xa_erase(), while holding that mutex. And uacce_fops_open() would
>>>> try to
>>>> obtain the uacce object from the xarray while holding the mutex, which
>>>> fails if the uacce object is being removed.
>>> Since fops_open gets the char device refcount, uacce_release will not
>>> happen until open returns.
>> The refcount only ensures that the uacce_device object is not freed as
>> long as there are open fds. But uacce_remove() can run while there are
>> open fds, or fds in the process of being opened. And after
>> uacce_remove()
>> runs, the uacce_device object still exists but is mostly unusable. For
>> example once the module is freed, uacce->ops is not valid anymore. But
>> currently uacce_fops_open() may dereference the ops in this case:
>>
>>   uacce_fops_open()
>>    if (!uacce->parent->driver)
>>    /* Still valid, keep going */
>>    ...                                 rmmod
>>                                         uacce_remove()
>>    ...                                  free_module()
>>    uacce->ops->get_queue() /* BUG */
>
> uacce_remove should wait for uacce->queues_lock until fops_open
> releases the lock.
> If open happens just after uacce_remove unlocks, uacce_bind_queue
> in open should fail.
>
>> Accessing uacce->ops after free_module() is a use-after-free. We need all
> You mean the parent releases the resources?
>> the fops to synchronize with uacce_remove() to ensure they don't use any
>> resource of the parent after it's been freed.
> After fops_open, we currently count on the parent driver stopping all
> DMA first and then calling uacce_remove, which is an assumption.
> For example, in drivers/crypto/hisilicon/zip/zip_main.c:
> hisi_qm_wait_task_finish, which waits for uacce_release.
> If this is commented out, there may be other issues:
> Unable to handle kernel paging request at virtual address ffff80000b700204
> pc : hisi_qm_cache_wb.part.0+0x2c/0xa0
>
>> I see uacce_fops_poll() may have the same problem, and should be inside
>> uacce_mutex.
> Do we need to consider the case where uacce_remove can happen at any
> time, without waiting for DMA to stop?
>
> I am not sure uacce_mutex can do this.
> Currently the locking sequence is:
>     mutex_lock(&uacce->queues_lock);
>     mutex_lock(&uacce_mutex);
>
> Or should we set all the callbacks of uacce_ops to NULL?
How about doing this in uacce_remove():

	mutex_lock(&uacce_mutex);
	uacce->ops = NULL;
	mutex_unlock(&uacce_mutex);

and then checking uacce->ops before every use?
Alternatively, set every callback in uacce->ops to NULL.
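
A minimal userspace sketch of that idea, with a pthread mutex standing in
for uacce_mutex. The demo_* names and the -ENODEV return value are
assumptions for illustration only, not the actual uacce code:

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real uacce API. */
struct demo_ops {
	int (*get_queue)(void);
};

struct demo_uacce {
	const struct demo_ops *ops;	/* NULL once the parent has been removed */
};

/* Plays the role of the global uacce_mutex. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

static int demo_get_queue(void)
{
	return 0;
}

const struct demo_ops demo_parent_ops = {
	.get_queue = demo_get_queue,
};

/* open() path: check ops under the mutex before calling into the parent. */
int demo_open(struct demo_uacce *uacce)
{
	int ret;

	pthread_mutex_lock(&demo_mutex);
	if (!uacce->ops)
		ret = -ENODEV;		/* remove() already ran */
	else
		ret = uacce->ops->get_queue();
	pthread_mutex_unlock(&demo_mutex);

	return ret;
}

/* remove() path: clear ops under the same mutex. */
void demo_remove(struct demo_uacce *uacce)
{
	pthread_mutex_lock(&demo_mutex);
	uacce->ops = NULL;
	pthread_mutex_unlock(&demo_mutex);
}
```

The callback is invoked while the mutex is still held, so remove() cannot
clear ops between the check and the call. This only helps if every user of
uacce->ops takes the same mutex.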
> module_get/put only works for the module, not for removing the device.
>
> Thanks
>
>>
>> Thanks,
>> Jean
>
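
For comparison, the serialization Jean suggests above (uacce_remove() erases
the device from the xarray while holding a global mutex, and
uacce_fops_open() looks it up under the same mutex) could look roughly like
this in userspace. A fixed-size array stands in for the xarray, and all
reg_* names are hypothetical:

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define REG_MAX_DEVS 4

/* Hypothetical device object, identified by a small integer id. */
struct reg_dev {
	int id;
};

/* One global lock serializes add, delete and open. */
static pthread_mutex_t reg_lock = PTHREAD_MUTEX_INITIALIZER;
static struct reg_dev *reg_table[REG_MAX_DEVS];

static int reg_id_valid(int id)
{
	return id >= 0 && id < REG_MAX_DEVS;
}

/* Registration: publish the device in the table. */
int reg_add(struct reg_dev *dev)
{
	int ret = 0;

	if (!reg_id_valid(dev->id))
		return -EINVAL;

	pthread_mutex_lock(&reg_lock);
	if (reg_table[dev->id])
		ret = -EBUSY;
	else
		reg_table[dev->id] = dev;
	pthread_mutex_unlock(&reg_lock);

	return ret;
}

/* Removal: tear down and erase the entry while holding the lock. */
void reg_del(struct reg_dev *dev)
{
	if (!reg_id_valid(dev->id))
		return;

	pthread_mutex_lock(&reg_lock);
	/* Parent teardown would go here, before the entry disappears. */
	reg_table[dev->id] = NULL;
	pthread_mutex_unlock(&reg_lock);
}

/* Open: the lookup fails once the device has been erased. */
int reg_open(int id)
{
	int ret;

	if (!reg_id_valid(id))
		return -ENODEV;

	pthread_mutex_lock(&reg_lock);
	/* Any call into the parent would also happen under the lock. */
	ret = reg_table[id] ? 0 : -ENODEV;
	pthread_mutex_unlock(&reg_lock);

	return ret;
}
```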