public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 for-next 0/2] RDMA: Provide an API for drivers to disassociate mmap pages
@ 2024-09-05 13:11 Junxian Huang
  2024-09-05 13:11 ` [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() " Junxian Huang
  2024-09-05 13:11 ` [PATCH v4 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset Junxian Huang
  0 siblings, 2 replies; 9+ messages in thread
From: Junxian Huang @ 2024-09-05 13:11 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6

Provide an API rdma_user_mmap_disassociate() for drivers to disassociate
mmap pages. Use this API in hns to prevent userspace from ringing doorbell
when HW is reset.

v3 -> v4:
* Add the newly introduced disassociation_lock to ib_uverbs_mmap()
  and rdma_umap_open() to prevent concurrency with
  rdma_user_mmap_disassociate().
* Change the disassociated flag from atomic_t to bool.

v2 -> v3:
* Walk all ufiles of a device in rdma_user_mmap_disassociate() as Jason
  commented, so drivers don't need to maintain their own list of ucontexts.
* Add a disassociation_lock in uverbs_user_mmap_disassociate() as Jason
  commented to avoid racing between different threads.
* Add a disassociated flag indicating whether mmaps have been disabled
  to prevent new mmap after the disassociation.

v1 -> v2:
* Keep uverbs_user_mmap_disassociate() in uverbs_main.c. The new api
  rdma_user_mmap_disassociate() is also moved to this file.
* Add "#if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)" to hns's
  rdma_user_mmap_disassociate() call.

Chengchang Tang (2):
  RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap
    pages
  RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset

 drivers/infiniband/core/uverbs.h           |  3 ++
 drivers/infiniband/core/uverbs_main.c      | 55 ++++++++++++++++++++--
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c |  9 ++++
 include/rdma/ib_verbs.h                    |  8 ++++
 4 files changed, 71 insertions(+), 4 deletions(-)

--
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pages
  2024-09-05 13:11 [PATCH v4 for-next 0/2] RDMA: Provide an API for drivers to disassociate mmap pages Junxian Huang
@ 2024-09-05 13:11 ` Junxian Huang
  2024-09-07 12:12   ` Jason Gunthorpe
  2024-09-05 13:11 ` [PATCH v4 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset Junxian Huang
  1 sibling, 1 reply; 9+ messages in thread
From: Junxian Huang @ 2024-09-05 13:11 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6

From: Chengchang Tang <tangchengchang@huawei.com>

Provide a new api rdma_user_mmap_disassociate() for drivers to
disassociate mmap pages for a device.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
 drivers/infiniband/core/uverbs.h      |  3 ++
 drivers/infiniband/core/uverbs_main.c | 55 +++++++++++++++++++++++++--
 include/rdma/ib_verbs.h               |  8 ++++
 3 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 821d93c8f712..05b589aad5ef 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -160,6 +160,9 @@ struct ib_uverbs_file {
 	struct page *disassociate_page;
 
 	struct xarray		idr;
+
+	struct mutex disassociation_lock;
+	bool disassociated;
 };
 
 struct ib_uverbs_event {
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index bc099287de9a..7c97df5d1a69 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -76,6 +76,7 @@ static dev_t dynamic_uverbs_dev;
 static DEFINE_IDA(uverbs_ida);
 static int ib_uverbs_add_one(struct ib_device *device);
 static void ib_uverbs_remove_one(struct ib_device *device, void *client_data);
+static struct ib_client uverbs_client;
 
 static char *uverbs_devnode(const struct device *dev, umode_t *mode)
 {
@@ -217,6 +218,7 @@ void ib_uverbs_release_file(struct kref *ref)
 
 	if (file->disassociate_page)
 		__free_pages(file->disassociate_page, 0);
+	mutex_destroy(&file->disassociation_lock);
 	mutex_destroy(&file->umap_lock);
 	mutex_destroy(&file->ucontext_lock);
 	kfree(file);
@@ -698,11 +700,20 @@ static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
 	ucontext = ib_uverbs_get_ucontext_file(file);
 	if (IS_ERR(ucontext)) {
 		ret = PTR_ERR(ucontext);
-		goto out;
+		goto out_srcu;
 	}
+
+	mutex_lock(&file->disassociation_lock);
+	if (file->disassociated) {
+		ret = -EPERM;
+		goto out_mutex;
+	}
+
 	vma->vm_ops = &rdma_umap_ops;
 	ret = ucontext->device->ops.mmap(ucontext, vma);
-out:
+out_mutex:
+	mutex_unlock(&file->disassociation_lock);
+out_srcu:
 	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
 	return ret;
 }
@@ -723,10 +734,12 @@ static void rdma_umap_open(struct vm_area_struct *vma)
 	/* We are racing with disassociation */
 	if (!down_read_trylock(&ufile->hw_destroy_rwsem))
 		goto out_zap;
+	mutex_lock(&ufile->disassociation_lock);
+
 	/*
 	 * Disassociation already completed, the VMA should already be zapped.
 	 */
-	if (!ufile->ucontext)
+	if (!ufile->ucontext || ufile->disassociated)
 		goto out_unlock;
 
 	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
@@ -734,10 +747,12 @@ static void rdma_umap_open(struct vm_area_struct *vma)
 		goto out_unlock;
 	rdma_umap_priv_init(priv, vma, opriv->entry);
 
+	mutex_unlock(&ufile->disassociation_lock);
 	up_read(&ufile->hw_destroy_rwsem);
 	return;
 
 out_unlock:
+	mutex_unlock(&ufile->disassociation_lock);
 	up_read(&ufile->hw_destroy_rwsem);
 out_zap:
 	/*
@@ -822,6 +837,8 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
 	struct rdma_umap_priv *priv, *next_priv;
 
 	lockdep_assert_held(&ufile->hw_destroy_rwsem);
+	mutex_lock(&ufile->disassociation_lock);
+	ufile->disassociated = true;
 
 	while (1) {
 		struct mm_struct *mm = NULL;
@@ -847,8 +864,10 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
 			break;
 		}
 		mutex_unlock(&ufile->umap_lock);
-		if (!mm)
+		if (!mm) {
+			mutex_unlock(&ufile->disassociation_lock);
 			return;
+		}
 
 		/*
 		 * The umap_lock is nested under mmap_lock since it used within
@@ -878,8 +897,34 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
 		mmap_read_unlock(mm);
 		mmput(mm);
 	}
+
+	mutex_unlock(&ufile->disassociation_lock);
 }
 
+/**
+ * rdma_user_mmap_disassociate() - Revoke mmaps for a device
+ * @device: device to revoke
+ *
+ * This function should be called by drivers that need to disable mmaps for the
+ * device, for instance because it is going to be reset.
+ */
+void rdma_user_mmap_disassociate(struct ib_device *device)
+{
+	struct ib_uverbs_device *uverbs_dev =
+		ib_get_client_data(device, &uverbs_client);
+	struct ib_uverbs_file *ufile;
+
+	mutex_lock(&uverbs_dev->lists_mutex);
+	list_for_each_entry(ufile, &uverbs_dev->uverbs_file_list, list) {
+		down_read(&ufile->hw_destroy_rwsem);
+		if (ufile->ucontext && !ufile->disassociated)
+			uverbs_user_mmap_disassociate(ufile);
+		up_read(&ufile->hw_destroy_rwsem);
+	}
+	mutex_unlock(&uverbs_dev->lists_mutex);
+}
+EXPORT_SYMBOL(rdma_user_mmap_disassociate);
+
 /*
  * ib_uverbs_open() does not need the BKL:
  *
@@ -949,6 +994,8 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
 	mutex_init(&file->umap_lock);
 	INIT_LIST_HEAD(&file->umaps);
 
+	mutex_init(&file->disassociation_lock);
+
 	filp->private_data = file;
 	list_add_tail(&file->list, &dev->uverbs_file_list);
 	mutex_unlock(&dev->lists_mutex);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index a1dcf812d787..09b80c8253e2 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2948,6 +2948,14 @@ int rdma_user_mmap_entry_insert_range(struct ib_ucontext *ucontext,
 				      size_t length, u32 min_pgoff,
 				      u32 max_pgoff);
 
+#if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)
+void rdma_user_mmap_disassociate(struct ib_device *device);
+#else
+static inline void rdma_user_mmap_disassociate(struct ib_device *device)
+{
+}
+#endif
+
 static inline int
 rdma_user_mmap_entry_insert_exact(struct ib_ucontext *ucontext,
 				  struct rdma_user_mmap_entry *entry,
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset
  2024-09-05 13:11 [PATCH v4 for-next 0/2] RDMA: Provide an API for drivers to disassociate mmap pages Junxian Huang
  2024-09-05 13:11 ` [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() " Junxian Huang
@ 2024-09-05 13:11 ` Junxian Huang
  1 sibling, 0 replies; 9+ messages in thread
From: Junxian Huang @ 2024-09-05 13:11 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6

From: Chengchang Tang <tangchengchang@huawei.com>

When HW is being reset, userspace should not ring doorbell otherwise
it may lead to abnormal consequence such as RAS.

Disassociate mmap pages for all uctx to prevent userspace from ringing
doorbell to HW.

Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 621b057fb9da..ecf4f1c9f51d 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -7012,6 +7012,12 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
 
 	handle->rinfo.instance_state = HNS_ROCE_STATE_NON_INIT;
 }
+
+static void hns_roce_v2_reset_notify_user(struct hns_roce_dev *hr_dev)
+{
+	rdma_user_mmap_disassociate(&hr_dev->ib_dev);
+}
+
 static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
 {
 	struct hns_roce_dev *hr_dev;
@@ -7030,6 +7036,9 @@ static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
 
 	hr_dev->active = false;
 	hr_dev->dis_db = true;
+
+	hns_roce_v2_reset_notify_user(hr_dev);
+
 	hr_dev->state = HNS_ROCE_DEVICE_STATE_RST_DOWN;
 
 	return 0;
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pages
  2024-09-05 13:11 ` [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() " Junxian Huang
@ 2024-09-07 12:12   ` Jason Gunthorpe
  2024-09-09  8:41     ` Junxian Huang
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2024-09-07 12:12 UTC (permalink / raw)
  To: Junxian Huang; +Cc: leon, linux-rdma, linuxarm, linux-kernel

On Thu, Sep 05, 2024 at 09:11:54PM +0800, Junxian Huang wrote:

> @@ -698,11 +700,20 @@ static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
>  	ucontext = ib_uverbs_get_ucontext_file(file);
>  	if (IS_ERR(ucontext)) {
>  		ret = PTR_ERR(ucontext);
> -		goto out;
> +		goto out_srcu;
>  	}
> +
> +	mutex_lock(&file->disassociation_lock);
> +	if (file->disassociated) {
> +		ret = -EPERM;
> +		goto out_mutex;
> +	}

What sets disassociated back to false once the driver reset is
completed?

I think you should probably drop this and instead add a lock and test
inside the driver within its mmap op. While reset is ongoing fail all
new mmaps.

>  	/*
>  	 * Disassociation already completed, the VMA should already be zapped.
>  	 */
> -	if (!ufile->ucontext)
> +	if (!ufile->ucontext || ufile->disassociated)
>  		goto out_unlock;

Is this needed? It protects agains fork, but since the driver is still
present I wonder if it is OK

> @@ -822,6 +837,8 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
>  	struct rdma_umap_priv *priv, *next_priv;
>  
>  	lockdep_assert_held(&ufile->hw_destroy_rwsem);
> +	mutex_lock(&ufile->disassociation_lock);
> +	ufile->disassociated = true;

I think this doesn't need the hw_destroy_rwsem anymore since you are
using this new disassociation_lock instead. It doesn't make alot of
sense to hold the hw_destroy_rwsem for read here, it was ment to be
held for write.

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pages
  2024-09-07 12:12   ` Jason Gunthorpe
@ 2024-09-09  8:41     ` Junxian Huang
  2024-09-11 10:20       ` Leon Romanovsky
  2024-09-15 13:34       ` Jason Gunthorpe
  0 siblings, 2 replies; 9+ messages in thread
From: Junxian Huang @ 2024-09-09  8:41 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: leon, linux-rdma, linuxarm, linux-kernel



On 2024/9/7 20:12, Jason Gunthorpe wrote:
> On Thu, Sep 05, 2024 at 09:11:54PM +0800, Junxian Huang wrote:
> 
>> @@ -698,11 +700,20 @@ static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
>>  	ucontext = ib_uverbs_get_ucontext_file(file);
>>  	if (IS_ERR(ucontext)) {
>>  		ret = PTR_ERR(ucontext);
>> -		goto out;
>> +		goto out_srcu;
>>  	}
>> +
>> +	mutex_lock(&file->disassociation_lock);
>> +	if (file->disassociated) {
>> +		ret = -EPERM;
>> +		goto out_mutex;
>> +	}
> 
> What sets disassociated back to false once the driver reset is
> completed?
> 
> I think you should probably drop this and instead add a lock and test
> inside the driver within its mmap op. While reset is ongoing fail all
> new mmaps.
> 

disassociated won't be set back to false. This is to stop new mmaps on
this ucontext even after reset is completed, because during hns reset,
all resources will be destroyed, and the ucontexts will become unavailable.

But of course, other drivers may handle this case differently from hns, so
I will remove disassociated here and put it in hns driver.

>>  	/*
>>  	 * Disassociation already completed, the VMA should already be zapped.
>>  	 */
>> -	if (!ufile->ucontext)
>> +	if (!ufile->ucontext || ufile->disassociated)
>>  		goto out_unlock;
> 
> Is this needed? It protects agains fork, but since the driver is still
> present I wonder if it is OK
> 

Will remove it too.

>> @@ -822,6 +837,8 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
>>  	struct rdma_umap_priv *priv, *next_priv;
>>  
>>  	lockdep_assert_held(&ufile->hw_destroy_rwsem);
>> +	mutex_lock(&ufile->disassociation_lock);
>> +	ufile->disassociated = true;
> 
> I think this doesn't need the hw_destroy_rwsem anymore since you are
> using this new disassociation_lock instead. It doesn't make alot of
> sense to hold the hw_destroy_rwsem for read here, it was ment to be
> held for write.
> 

Then it seems we should remove the lockdep_assert_held() here?

Junxian

> Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pages
  2024-09-09  8:41     ` Junxian Huang
@ 2024-09-11 10:20       ` Leon Romanovsky
  2024-09-11 12:38         ` Junxian Huang
  2024-09-15 13:34       ` Jason Gunthorpe
  1 sibling, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2024-09-11 10:20 UTC (permalink / raw)
  To: Junxian Huang; +Cc: Jason Gunthorpe, linux-rdma, linuxarm, linux-kernel

On Mon, Sep 09, 2024 at 04:41:00PM +0800, Junxian Huang wrote:
> 
> 
> On 2024/9/7 20:12, Jason Gunthorpe wrote:
> > On Thu, Sep 05, 2024 at 09:11:54PM +0800, Junxian Huang wrote:
> > 
> >> @@ -698,11 +700,20 @@ static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
> >>  	ucontext = ib_uverbs_get_ucontext_file(file);
> >>  	if (IS_ERR(ucontext)) {
> >>  		ret = PTR_ERR(ucontext);
> >> -		goto out;
> >> +		goto out_srcu;
> >>  	}
> >> +
> >> +	mutex_lock(&file->disassociation_lock);
> >> +	if (file->disassociated) {
> >> +		ret = -EPERM;
> >> +		goto out_mutex;
> >> +	}
> > 
> > What sets disassociated back to false once the driver reset is
> > completed?
> > 
> > I think you should probably drop this and instead add a lock and test
> > inside the driver within its mmap op. While reset is ongoing fail all
> > new mmaps.
> > 
> 
> disassociated won't be set back to false. This is to stop new mmaps on
> this ucontext even after reset is completed, because during hns reset,
> all resources will be destroyed, and the ucontexts will become unavailable.

ucontext is SW object and not HW object, why will it become unavailable?

Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pages
  2024-09-11 10:20       ` Leon Romanovsky
@ 2024-09-11 12:38         ` Junxian Huang
  2024-09-15 13:35           ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Junxian Huang @ 2024-09-11 12:38 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Jason Gunthorpe, linux-rdma, linuxarm, linux-kernel



On 2024/9/11 18:20, Leon Romanovsky wrote:
> On Mon, Sep 09, 2024 at 04:41:00PM +0800, Junxian Huang wrote:
>>
>>
>> On 2024/9/7 20:12, Jason Gunthorpe wrote:
>>> On Thu, Sep 05, 2024 at 09:11:54PM +0800, Junxian Huang wrote:
>>>
>>>> @@ -698,11 +700,20 @@ static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
>>>>  	ucontext = ib_uverbs_get_ucontext_file(file);
>>>>  	if (IS_ERR(ucontext)) {
>>>>  		ret = PTR_ERR(ucontext);
>>>> -		goto out;
>>>> +		goto out_srcu;
>>>>  	}
>>>> +
>>>> +	mutex_lock(&file->disassociation_lock);
>>>> +	if (file->disassociated) {
>>>> +		ret = -EPERM;
>>>> +		goto out_mutex;
>>>> +	}
>>>
>>> What sets disassociated back to false once the driver reset is
>>> completed?
>>>
>>> I think you should probably drop this and instead add a lock and test
>>> inside the driver within its mmap op. While reset is ongoing fail all
>>> new mmaps.
>>>
>>
>> disassociated won't be set back to false. This is to stop new mmaps on
>> this ucontext even after reset is completed, because during hns reset,
>> all resources will be destroyed, and the ucontexts will become unavailable.
> 
> ucontext is SW object and not HW object, why will it become unavailable?
> 

Once hns device is reset, we don't allow any doorbell until driver's
re-initialization is completed. Not only all existing mmaps on ucontexts
will be zapped, no more new mmaps are allowed either.

This actually makes ucontexts unavailable since userspace cannot access
HW with them any more. Users will have to destroy the old ucontext and
allocate a new one after driver's re-initialization is completed.

Junxian

> Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pages
  2024-09-09  8:41     ` Junxian Huang
  2024-09-11 10:20       ` Leon Romanovsky
@ 2024-09-15 13:34       ` Jason Gunthorpe
  1 sibling, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2024-09-15 13:34 UTC (permalink / raw)
  To: Junxian Huang; +Cc: leon, linux-rdma, linuxarm, linux-kernel

On Mon, Sep 09, 2024 at 04:41:00PM +0800, Junxian Huang wrote:
> On 2024/9/7 20:12, Jason Gunthorpe wrote:
> > On Thu, Sep 05, 2024 at 09:11:54PM +0800, Junxian Huang wrote:
> > 
> >> @@ -698,11 +700,20 @@ static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
> >>  	ucontext = ib_uverbs_get_ucontext_file(file);
> >>  	if (IS_ERR(ucontext)) {
> >>  		ret = PTR_ERR(ucontext);
> >> -		goto out;
> >> +		goto out_srcu;
> >>  	}
> >> +
> >> +	mutex_lock(&file->disassociation_lock);
> >> +	if (file->disassociated) {
> >> +		ret = -EPERM;
> >> +		goto out_mutex;
> >> +	}
> > 
> > What sets disassociated back to false once the driver reset is
> > completed?
> > 
> > I think you should probably drop this and instead add a lock and test
> > inside the driver within its mmap op. While reset is ongoing fail all
> > new mmaps.
> > 
> 
> disassociated won't be set back to false. This is to stop new mmaps on
> this ucontext even after reset is completed, because during hns reset,
> all resources will be destroyed, and the ucontexts will become unavailable.

That isn't really the model we are going for, the ucontext should be
recoverable even if the objects are not.

If you want to really fully destroy and fence things then you have to
unregister the whole ib_device

> > I think this doesn't need the hw_destroy_rwsem anymore since you are
> > using this new disassociation_lock instead. It doesn't make alot of
> > sense to hold the hw_destroy_rwsem for read here, it was ment to be
> > held for write.
> 
> Then it seems we should remove the lockdep_assert_held() here?

Yes

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pages
  2024-09-11 12:38         ` Junxian Huang
@ 2024-09-15 13:35           ` Jason Gunthorpe
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2024-09-15 13:35 UTC (permalink / raw)
  To: Junxian Huang; +Cc: Leon Romanovsky, linux-rdma, linuxarm, linux-kernel

On Wed, Sep 11, 2024 at 08:38:35PM +0800, Junxian Huang wrote:

> Once hns device is reset, we don't allow any doorbell until driver's
> re-initialization is completed. Not only all existing mmaps on ucontexts
> will be zapped, no more new mmaps are allowed either.
> 
> This actually makes ucontexts unavailable since userspace cannot access
> HW with them any more. Users will have to destroy the old ucontext and
> allocate a new one after driver's re-initialization is completed.

It is supposed to come back once the reset is completed, yes userspace
may need to restart some objects but the ucontext shouldn't be left
crippled.

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-09-15 13:35 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-05 13:11 [PATCH v4 for-next 0/2] RDMA: Provide an API for drivers to disassociate mmap pages Junxian Huang
2024-09-05 13:11 ` [PATCH v4 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() " Junxian Huang
2024-09-07 12:12   ` Jason Gunthorpe
2024-09-09  8:41     ` Junxian Huang
2024-09-11 10:20       ` Leon Romanovsky
2024-09-11 12:38         ` Junxian Huang
2024-09-15 13:35           ` Jason Gunthorpe
2024-09-15 13:34       ` Jason Gunthorpe
2024-09-05 13:11 ` [PATCH v4 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset Junxian Huang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox