Linux IOMMU Development
 help / color / mirror / Atom feed
From: Zhangfei Gao <zhangfei.gao@linaro.org>
To: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Yang Shen <shenyang39@huawei.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	linux-crypto@vger.kernel.org,
	linux-accelerators@lists.ozlabs.org
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove
Date: Thu, 16 Jun 2022 12:10:18 +0800	[thread overview]
Message-ID: <fdc8d8b0-4e04-78f5-1e8a-4cf44c89a37f@linaro.org> (raw)
In-Reply-To: <Yqn3spLZHpAkQ9Us@myrica>

Hi, Jean

On 2022/6/15 下午11:16, Jean-Philippe Brucker wrote:
> Hi,
>
> On Fri, Jun 10, 2022 at 08:34:23PM +0800, Zhangfei Gao wrote:
>> The uacce parent's module can be removed when uacce is working,
>> which may cause troubles.
>>
>> If rmmod/uacce_remove happens just after fops_open: bind_queue,
>> the uacce_remove can not remove the bound queue since it is not
>> added to the queue list yet, which blocks the uacce_disable_sva.
>>
>> Change queues_lock area to make sure the bound queue is added to
>> the list thereby can be searched in uacce_remove.
>>
>> And uacce->parent->driver is checked immediately in case rmmod is
>> just happening.
>>
>> Also the parent driver must always stop DMA before calling
>> uacce_remove.
>>
>> Signed-off-by: Yang Shen <shenyang39@huawei.com>
>> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
>> ---
>>   drivers/misc/uacce/uacce.c | 19 +++++++++++++------
>>   1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
>> index 281c54003edc..b6219c6bfb48 100644
>> --- a/drivers/misc/uacce/uacce.c
>> +++ b/drivers/misc/uacce/uacce.c
>> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
>>   	if (!q)
>>   		return -ENOMEM;
>>   
>> +	mutex_lock(&uacce->queues_lock);
>> +
>> +	if (!uacce->parent->driver) {
> I don't think this is useful, because the core clears parent->driver after
> having run uacce_remove():
>
>    rmmod hisi_zip		open()
>     ...				 uacce_fops_open()
>     __device_release_driver()	  ...
>      pci_device_remove()
>       hisi_zip_remove()
>        hisi_qm_uninit()
>         uacce_remove()
>          ...			  ...
>     				  mutex_lock(uacce->queues_lock)
>      ...				  if (!uacce->parent->driver)
>      device_unbind_cleanup()	  /* driver still valid, proceed */
>       dev->driver = NULL

The check  if (!uacce->parent->driver) is required, otherwise NULL 
pointer may happen.
iommu_sva_bind_device
const struct iommu_ops *ops = dev_iommu_ops(dev);  -> 
dev->iommu->iommu_dev->ops

rmmod has no issue, but remove parent pci device has the issue.

Test:
sleep in fops_open before mutex.

estuary:/mnt$ ./work/a.out &
//sleep in fops_open

echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove &
estuary:/mnt$ [   22.594348] uacce_remove!
[   22.594663] pci 0000:00:02.0: Removing from iommu group 2
[   22.595073] iommu_release_device dev->iommu=0
[   22.595076] CPU: 2 PID: 229 Comm: ash Not tainted 
5.19.0-rc1-15071-gcbcf098c5257-dirty #633
[   22.595079] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 
02/06/2015
[   22.595080] Call trace:
[   22.595080]  dump_backtrace+0xe4/0xf0
[   22.595085]  show_stack+0x20/0x70
[   22.595086]  dump_stack_lvl+0x8c/0xb8
[   22.595089]  dump_stack+0x18/0x34
[   22.595091]  iommu_release_device+0x90/0x98
[   22.595095]  iommu_bus_notifier+0x58/0x68
[   22.595097]  blocking_notifier_call_chain+0x74/0xa8
[   22.595100]  device_del+0x268/0x3b0
[   22.595102]  pci_remove_bus_device+0x84/0x110
[   22.595106]  pci_stop_and_remove_bus_device_locked+0x30/0x60
...

estuary:/mnt$ [   31.466360] uacce: sleep end!
[   31.466362] uacce->parent->driver=0
[   31.466364] uacce->parent->iommu=0
[   31.466365] uacce_bind_queue!
[   31.466366] uacce_bind_queue call iommu_sva_bind_device!
[   31.466367] uacce->parent=d120d0
[   31.466371] Unable to handle kernel NULL pointer dereference at 
virtual address 0000000000000038
[   31.472870] Mem abort info:
[   31.473450]   ESR = 0x0000000096000004
[   31.474223]   EC = 0x25: DABT (current EL), IL = 32 bits
[   31.475390]   SET = 0, FnV = 0
[   31.476031]   EA = 0, S1PTW = 0
[   31.476680]   FSC = 0x04: level 0 translation fault
[   31.477687] Data abort info:
[   31.478294]   ISV = 0, ISS = 0x00000004
[   31.479152]   CM = 0, WnR = 0
[   31.479785] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000714d8000
[   31.481144] [0000000000000038] pgd=0000000000000000, p4d=0000000000000000
[   31.482622] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   31.483784] Modules linked in: hisi_zip
[   31.484590] CPU: 2 PID: 228 Comm: a.out Not tainted 
5.19.0-rc1-15071-gcbcf098c5257-dirty #633
[   31.486374] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 
02/06/2015
[   31.487862] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS 
BTYPE=--)
[   31.489390] pc : iommu_sva_bind_device+0x44/0xf4
[   31.490404] lr : uacce_fops_open+0x128/0x234

>
> Since uacce_remove() disabled SVA, the following uacce_bind_queue() will
> fail anyway. However, if uacce->flags does not have UACCE_DEV_SVA set,
> we'll proceed further and call uacce->ops->get_queue(), which does not
> exist anymore since the parent module is gone.
>
> I think we need the global uacce_mutex to serialize uacce_remove() and
> uacce_fops_open(). uacce_remove() would do everything, including
> xa_erase(), while holding that mutex. And uacce_fops_open() would try to
> obtain the uacce object from the xarray while holding the mutex, which
> fails if the uacce object is being removed.

Since fops_open get char device refcount, uacce_release will not happen 
until open returns.
So either uacce = xa_load(&uacce_xa, iminor(inode)) is got, 
uacce_release release uacce after fops_release.
Or uacce is not got and return -ENODEV.

open:
         uacce = xa_load(&uacce_xa, iminor(inode));
         if (!uacce)
                 return -ENODEV;

uacce->dev.release = uacce_release;
uacce_release: kfree(uacce);

Thanks
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  reply	other threads:[~2022-06-16  4:10 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-10 12:34 [PATCH] uacce: fix concurrency of fops_open and uacce_remove Zhangfei Gao
2022-06-15 15:16 ` Jean-Philippe Brucker
2022-06-16  4:10   ` Zhangfei Gao [this message]
2022-06-16  8:14     ` Jean-Philippe Brucker
2022-06-17  6:05       ` Zhangfei Gao
2022-06-17  8:20         ` Zhangfei Gao
2022-06-17 14:23           ` Zhangfei Gao
2022-06-20 13:25             ` Jean-Philippe Brucker
2022-06-20 13:24         ` Jean-Philippe Brucker
2022-06-20 13:36           ` Greg Kroah-Hartman
2022-06-21  7:37             ` Zhangfei Gao
2022-06-21  7:44               ` Greg Kroah-Hartman
2022-06-22  8:14                 ` Zhangfei Gao
2022-06-22  8:24                   ` Greg Kroah-Hartman
2022-06-20 13:38           ` Greg Kroah-Hartman
2022-06-20 20:18           ` [PATCH] uacce: Tidy up locking kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fdc8d8b0-4e04-78f5-1e8a-4cf44c89a37f@linaro.org \
    --to=zhangfei.gao@linaro.org \
    --cc=arnd@arndb.de \
    --cc=gregkh@linuxfoundation.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jean-philippe@linaro.org \
    --cc=linux-accelerators@lists.ozlabs.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=shenyang39@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox