public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 for-next 0/2] RDMA: Provide an API for drivers to disassociate mmap pages
@ 2024-09-13 12:29 Junxian Huang
  2024-09-13 12:29 ` [PATCH v5 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() " Junxian Huang
  2024-09-13 12:29 ` [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset Junxian Huang
  0 siblings, 2 replies; 9+ messages in thread
From: Junxian Huang @ 2024-09-13 12:29 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6

Provide an API rdma_user_mmap_disassociate() for drivers to disassociate
mmap pages. Use this API in hns to prevent userspace from ringing doorbell
when HW is reset.

v4 -> v5:
* Remove 'disassociated' from core.
* Remove lockdep in uverbs_user_mmap_disassociate() since we have
  a new disassociation_lock and don't need to hold hw_destroy_rwsem
  in the newly introduced api.
* Use hr_dev->dis_db to prevent new mmaps once the device is reset.

v3 -> v4:
* Add the newly introduced disassociation_lock to ib_uverbs_mmap()
  and rdma_umap_open() to prevent concurrency with
  rdma_user_mmap_disassociate().
* Change the disassociated flag from atomic_t to bool.

v2 -> v3:
* Walk all ufiles of a device in rdma_user_mmap_disassociate() as Jason
  commented, so drivers don't need to maintain their own list of ucontexts.
* Add a disassociation_lock in uverbs_user_mmap_disassociate() as Jason
  commented to avoid racing between different threads.
* Add a disassociated flag indicating whether mmaps have been disabled
  to prevent new mmap after the disassociation.

v1 -> v2:
* Keep uverbs_user_mmap_disassociate() in uverbs_main.c. The new api
  rdma_user_mmap_disassociate() is also moved to this file.
* Add "#if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)" to hns's
  rdma_user_mmap_disassociate() call.

Chengchang Tang (2):
  RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap
    pages
  RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset

 drivers/infiniband/core/uverbs.h           |  2 +
 drivers/infiniband/core/uverbs_main.c      | 47 ++++++++++++++++++++--
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c |  9 +++++
 drivers/infiniband/hw/hns/hns_roce_main.c  |  5 +++
 include/rdma/ib_verbs.h                    |  8 ++++
 5 files changed, 67 insertions(+), 4 deletions(-)

--
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v5 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pages
  2024-09-13 12:29 [PATCH v5 for-next 0/2] RDMA: Provide an API for drivers to disassociate mmap pages Junxian Huang
@ 2024-09-13 12:29 ` Junxian Huang
  2024-09-13 12:29 ` [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset Junxian Huang
  1 sibling, 0 replies; 9+ messages in thread
From: Junxian Huang @ 2024-09-13 12:29 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6

From: Chengchang Tang <tangchengchang@huawei.com>

Provide a new api rdma_user_mmap_disassociate() for drivers to
disassociate mmap pages for a device.

Since drivers can now disassociate mmaps by calling this api,
introduce a new disassociation_lock to specifically prevent
races between this disassociation process and new mmaps. And
thus the old hw_destroy_rwsem is not needed in this api.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
 drivers/infiniband/core/uverbs.h      |  2 ++
 drivers/infiniband/core/uverbs_main.c | 47 ++++++++++++++++++++++++---
 include/rdma/ib_verbs.h               |  8 +++++
 3 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 821d93c8f712..dfd2e5a86e6f 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -160,6 +160,8 @@ struct ib_uverbs_file {
 	struct page *disassociate_page;
 
 	struct xarray		idr;
+
+	struct mutex disassociation_lock;
 };
 
 struct ib_uverbs_event {
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index bc099287de9a..9b2073c8cc6e 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -76,6 +76,7 @@ static dev_t dynamic_uverbs_dev;
 static DEFINE_IDA(uverbs_ida);
 static int ib_uverbs_add_one(struct ib_device *device);
 static void ib_uverbs_remove_one(struct ib_device *device, void *client_data);
+static struct ib_client uverbs_client;
 
 static char *uverbs_devnode(const struct device *dev, umode_t *mode)
 {
@@ -217,6 +218,7 @@ void ib_uverbs_release_file(struct kref *ref)
 
 	if (file->disassociate_page)
 		__free_pages(file->disassociate_page, 0);
+	mutex_destroy(&file->disassociation_lock);
 	mutex_destroy(&file->umap_lock);
 	mutex_destroy(&file->ucontext_lock);
 	kfree(file);
@@ -698,11 +700,16 @@ static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
 	ucontext = ib_uverbs_get_ucontext_file(file);
 	if (IS_ERR(ucontext)) {
 		ret = PTR_ERR(ucontext);
-		goto out;
+		goto out_srcu;
 	}
+
+	mutex_lock(&file->disassociation_lock);
+
 	vma->vm_ops = &rdma_umap_ops;
 	ret = ucontext->device->ops.mmap(ucontext, vma);
-out:
+out_mutex:
+	mutex_unlock(&file->disassociation_lock);
+out_srcu:
 	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
 	return ret;
 }
@@ -723,6 +730,8 @@ static void rdma_umap_open(struct vm_area_struct *vma)
 	/* We are racing with disassociation */
 	if (!down_read_trylock(&ufile->hw_destroy_rwsem))
 		goto out_zap;
+	mutex_lock(&ufile->disassociation_lock);
+
 	/*
 	 * Disassociation already completed, the VMA should already be zapped.
 	 */
@@ -734,10 +743,12 @@ static void rdma_umap_open(struct vm_area_struct *vma)
 		goto out_unlock;
 	rdma_umap_priv_init(priv, vma, opriv->entry);
 
+	mutex_unlock(&ufile->disassociation_lock);
 	up_read(&ufile->hw_destroy_rwsem);
 	return;
 
 out_unlock:
+	mutex_unlock(&ufile->disassociation_lock);
 	up_read(&ufile->hw_destroy_rwsem);
 out_zap:
 	/*
@@ -821,7 +832,7 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
 {
 	struct rdma_umap_priv *priv, *next_priv;
 
-	lockdep_assert_held(&ufile->hw_destroy_rwsem);
+	mutex_lock(&ufile->disassociation_lock);
 
 	while (1) {
 		struct mm_struct *mm = NULL;
@@ -847,8 +858,10 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
 			break;
 		}
 		mutex_unlock(&ufile->umap_lock);
-		if (!mm)
+		if (!mm) {
+			mutex_unlock(&ufile->disassociation_lock);
 			return;
+		}
 
 		/*
 		 * The umap_lock is nested under mmap_lock since it used within
@@ -878,8 +891,32 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile)
 		mmap_read_unlock(mm);
 		mmput(mm);
 	}
+
+	mutex_unlock(&ufile->disassociation_lock);
 }
 
+/**
+ * rdma_user_mmap_disassociate() - Revoke mmaps for a device
+ * @device: device to revoke
+ *
+ * This function should be called by drivers that need to disable mmaps for the
+ * device, for instance because it is going to be reset.
+ */
+void rdma_user_mmap_disassociate(struct ib_device *device)
+{
+	struct ib_uverbs_device *uverbs_dev =
+		ib_get_client_data(device, &uverbs_client);
+	struct ib_uverbs_file *ufile;
+
+	mutex_lock(&uverbs_dev->lists_mutex);
+	list_for_each_entry(ufile, &uverbs_dev->uverbs_file_list, list) {
+		if (ufile->ucontext)
+			uverbs_user_mmap_disassociate(ufile);
+	}
+	mutex_unlock(&uverbs_dev->lists_mutex);
+}
+EXPORT_SYMBOL(rdma_user_mmap_disassociate);
+
 /*
  * ib_uverbs_open() does not need the BKL:
  *
@@ -949,6 +986,8 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
 	mutex_init(&file->umap_lock);
 	INIT_LIST_HEAD(&file->umaps);
 
+	mutex_init(&file->disassociation_lock);
+
 	filp->private_data = file;
 	list_add_tail(&file->list, &dev->uverbs_file_list);
 	mutex_unlock(&dev->lists_mutex);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index aa8ede439905..9cb8b5fe7eee 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2948,6 +2948,14 @@ int rdma_user_mmap_entry_insert_range(struct ib_ucontext *ucontext,
 				      size_t length, u32 min_pgoff,
 				      u32 max_pgoff);
 
+#if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)
+void rdma_user_mmap_disassociate(struct ib_device *device);
+#else
+static inline void rdma_user_mmap_disassociate(struct ib_device *device)
+{
+}
+#endif
+
 static inline int
 rdma_user_mmap_entry_insert_exact(struct ib_ucontext *ucontext,
 				  struct rdma_user_mmap_entry *entry,
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset
  2024-09-13 12:29 [PATCH v5 for-next 0/2] RDMA: Provide an API for drivers to disassociate mmap pages Junxian Huang
  2024-09-13 12:29 ` [PATCH v5 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() " Junxian Huang
@ 2024-09-13 12:29 ` Junxian Huang
  2024-09-16  9:13   ` Leon Romanovsky
  1 sibling, 1 reply; 9+ messages in thread
From: Junxian Huang @ 2024-09-13 12:29 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6

From: Chengchang Tang <tangchengchang@huawei.com>

When HW is being reset, userspace should not ring doorbell otherwise
it may lead to abnormal consequence such as RAS.

Disassociate mmap pages for all uctx to prevent userspace from ringing
doorbell to HW. Since all resources will be destroyed during HW reset,
no new mmap is allowed after HW reset is completed.

Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 9 +++++++++
 drivers/infiniband/hw/hns/hns_roce_main.c  | 5 +++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 24e906b9d3ae..4e374b2da101 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -7017,6 +7017,12 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
 
 	handle->rinfo.instance_state = HNS_ROCE_STATE_NON_INIT;
 }
+
+static void hns_roce_v2_reset_notify_user(struct hns_roce_dev *hr_dev)
+{
+	rdma_user_mmap_disassociate(&hr_dev->ib_dev);
+}
+
 static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
 {
 	struct hns_roce_dev *hr_dev;
@@ -7035,6 +7041,9 @@ static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
 
 	hr_dev->active = false;
 	hr_dev->dis_db = true;
+
+	hns_roce_v2_reset_notify_user(hr_dev);
+
 	hr_dev->state = HNS_ROCE_DEVICE_STATE_RST_DOWN;
 
 	return 0;
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 4cb0af733587..49315f39361d 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -466,6 +466,11 @@ static int hns_roce_mmap(struct ib_ucontext *uctx, struct vm_area_struct *vma)
 	pgprot_t prot;
 	int ret;
 
+	if (hr_dev->dis_db) {
+		atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_MMAP_ERR_CNT]);
+		return -EPERM;
+	}
+
 	rdma_entry = rdma_user_mmap_entry_get_pgoff(uctx, vma->vm_pgoff);
 	if (!rdma_entry) {
 		atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_MMAP_ERR_CNT]);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset
  2024-09-13 12:29 ` [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset Junxian Huang
@ 2024-09-16  9:13   ` Leon Romanovsky
  2024-09-20  9:18     ` Junxian Huang
  0 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2024-09-16  9:13 UTC (permalink / raw)
  To: Junxian Huang; +Cc: jgg, linux-rdma, linuxarm, linux-kernel

On Fri, Sep 13, 2024 at 08:29:55PM +0800, Junxian Huang wrote:
> From: Chengchang Tang <tangchengchang@huawei.com>
> 
> When HW is being reset, userspace should not ring doorbell otherwise
> it may lead to abnormal consequence such as RAS.
> 
> Disassociate mmap pages for all uctx to prevent userspace from ringing
> doorbell to HW. Since all resources will be destroyed during HW reset,
> no new mmap is allowed after HW reset is completed.
> 
> Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
> Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 9 +++++++++
>  drivers/infiniband/hw/hns/hns_roce_main.c  | 5 +++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> index 24e906b9d3ae..4e374b2da101 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> @@ -7017,6 +7017,12 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>  
>  	handle->rinfo.instance_state = HNS_ROCE_STATE_NON_INIT;
>  }
> +
> +static void hns_roce_v2_reset_notify_user(struct hns_roce_dev *hr_dev)
> +{
> +	rdma_user_mmap_disassociate(&hr_dev->ib_dev);
> +}

There is no need in one line function, please inline it.

> +
>  static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
>  {
>  	struct hns_roce_dev *hr_dev;
> @@ -7035,6 +7041,9 @@ static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
>  
>  	hr_dev->active = false;
>  	hr_dev->dis_db = true;
> +
> +	hns_roce_v2_reset_notify_user(hr_dev);
> +
>  	hr_dev->state = HNS_ROCE_DEVICE_STATE_RST_DOWN;
>  
>  	return 0;
> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> index 4cb0af733587..49315f39361d 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> @@ -466,6 +466,11 @@ static int hns_roce_mmap(struct ib_ucontext *uctx, struct vm_area_struct *vma)
>  	pgprot_t prot;
>  	int ret;
>  
> +	if (hr_dev->dis_db) {

How do you clear dis_db after calling to hns_roce_hw_v2_reset_notify_down()? Does it have any locking protection?

> +		atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_MMAP_ERR_CNT]);
> +		return -EPERM;
> +	}
> +
>  	rdma_entry = rdma_user_mmap_entry_get_pgoff(uctx, vma->vm_pgoff);
>  	if (!rdma_entry) {
>  		atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_MMAP_ERR_CNT]);
> -- 
> 2.33.0
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset
  2024-09-16  9:13   ` Leon Romanovsky
@ 2024-09-20  9:18     ` Junxian Huang
  2024-09-20 12:47       ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Junxian Huang @ 2024-09-20  9:18 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: jgg, linux-rdma, linuxarm, linux-kernel



On 2024/9/16 17:13, Leon Romanovsky wrote:
> On Fri, Sep 13, 2024 at 08:29:55PM +0800, Junxian Huang wrote:
>> From: Chengchang Tang <tangchengchang@huawei.com>
>>
>> When HW is being reset, userspace should not ring doorbell otherwise
>> it may lead to abnormal consequence such as RAS.
>>
>> Disassociate mmap pages for all uctx to prevent userspace from ringing
>> doorbell to HW. Since all resources will be destroyed during HW reset,
>> no new mmap is allowed after HW reset is completed.
>>
>> Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
>> Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
>> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
>> ---
>>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 9 +++++++++
>>  drivers/infiniband/hw/hns/hns_roce_main.c  | 5 +++++
>>  2 files changed, 14 insertions(+)
>>
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>> index 24e906b9d3ae..4e374b2da101 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>> @@ -7017,6 +7017,12 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>>  
>>  	handle->rinfo.instance_state = HNS_ROCE_STATE_NON_INIT;
>>  }
>> +
>> +static void hns_roce_v2_reset_notify_user(struct hns_roce_dev *hr_dev)
>> +{
>> +	rdma_user_mmap_disassociate(&hr_dev->ib_dev);
>> +}
> 
> There is no need in one line function, please inline it.
> 

Sure.

>> +
>>  static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
>>  {
>>  	struct hns_roce_dev *hr_dev;
>> @@ -7035,6 +7041,9 @@ static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
>>  
>>  	hr_dev->active = false;
>>  	hr_dev->dis_db = true;
>> +
>> +	hns_roce_v2_reset_notify_user(hr_dev);
>> +
>>  	hr_dev->state = HNS_ROCE_DEVICE_STATE_RST_DOWN;
>>  
>>  	return 0;
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
>> index 4cb0af733587..49315f39361d 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
>> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
>> @@ -466,6 +466,11 @@ static int hns_roce_mmap(struct ib_ucontext *uctx, struct vm_area_struct *vma)
>>  	pgprot_t prot;
>>  	int ret;
>>  
>> +	if (hr_dev->dis_db) {
> 
> How do you clear dis_db after calling to hns_roce_hw_v2_reset_notify_down()? Does it have any locking protection?
> 

Sorry for the late response, I just came back from vacation.

After calling hns_roce_hw_v2_reset_notify_down(), we will call ib_unregister_device()
and destory all HW resources eventually, so there is no need to clear dis_db.

Junxian

>> +		atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_MMAP_ERR_CNT]);
>> +		return -EPERM;
>> +	}
>> +
>>  	rdma_entry = rdma_user_mmap_entry_get_pgoff(uctx, vma->vm_pgoff);
>>  	if (!rdma_entry) {
>>  		atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_MMAP_ERR_CNT]);
>> -- 
>> 2.33.0
>>
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset
  2024-09-20  9:18     ` Junxian Huang
@ 2024-09-20 12:47       ` Jason Gunthorpe
  2024-09-23  6:17         ` Junxian Huang
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2024-09-20 12:47 UTC (permalink / raw)
  To: Junxian Huang; +Cc: Leon Romanovsky, linux-rdma, linuxarm, linux-kernel

On Fri, Sep 20, 2024 at 05:18:14PM +0800, Junxian Huang wrote:

> >> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> >> index 4cb0af733587..49315f39361d 100644
> >> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
> >> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> >> @@ -466,6 +466,11 @@ static int hns_roce_mmap(struct ib_ucontext *uctx, struct vm_area_struct *vma)
> >>  	pgprot_t prot;
> >>  	int ret;
> >>  
> >> +	if (hr_dev->dis_db) {
> > 
> > How do you clear dis_db after calling to hns_roce_hw_v2_reset_notify_down()? Does it have any locking protection?
> > 
> 
> Sorry for the late response, I just came back from vacation.
> 
> After calling hns_roce_hw_v2_reset_notify_down(), we will call ib_unregister_device()
> and destory all HW resources eventually, so there is no need to clear dis_db.

Why can't you do the unregister device sooner then and avoid all this
special stuff?

I assumed you'd bring the same device back after completing the reset??

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset
  2024-09-20 12:47       ` Jason Gunthorpe
@ 2024-09-23  6:17         ` Junxian Huang
  2024-09-23  9:02           ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Junxian Huang @ 2024-09-23  6:17 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, linuxarm, linux-kernel



On 2024/9/20 20:47, Jason Gunthorpe wrote:
> On Fri, Sep 20, 2024 at 05:18:14PM +0800, Junxian Huang wrote:
> 
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
>>>> index 4cb0af733587..49315f39361d 100644
>>>> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
>>>> @@ -466,6 +466,11 @@ static int hns_roce_mmap(struct ib_ucontext *uctx, struct vm_area_struct *vma)
>>>>  	pgprot_t prot;
>>>>  	int ret;
>>>>  
>>>> +	if (hr_dev->dis_db) {
>>>
>>> How do you clear dis_db after calling to hns_roce_hw_v2_reset_notify_down()? Does it have any locking protection?
>>>
>>
>> Sorry for the late response, I just came back from vacation.
>>
>> After calling hns_roce_hw_v2_reset_notify_down(), we will call ib_unregister_device()
>> and destory all HW resources eventually, so there is no need to clear dis_db.
> 
> Why can't you do the unregister device sooner then and avoid all this
> special stuff?
> 

It's a limitation of HW. Resources such as QP/CQ/MR will be destoryed
during unregistering device. This is not allowed by HW until
hns_roce_hw_v2_reset_notify_uninit(), or it may lead to some HW errors.

> I assumed you'd bring the same device back after completing the reset??
> 

Yes

> Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset
  2024-09-23  6:17         ` Junxian Huang
@ 2024-09-23  9:02           ` Leon Romanovsky
  2024-09-23 14:15             ` Junxian Huang
  0 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2024-09-23  9:02 UTC (permalink / raw)
  To: Junxian Huang; +Cc: Jason Gunthorpe, linux-rdma, linuxarm, linux-kernel

On Mon, Sep 23, 2024 at 02:17:40PM +0800, Junxian Huang wrote:
> 
> 
> On 2024/9/20 20:47, Jason Gunthorpe wrote:
> > On Fri, Sep 20, 2024 at 05:18:14PM +0800, Junxian Huang wrote:
> > 
> >>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> >>>> index 4cb0af733587..49315f39361d 100644
> >>>> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
> >>>> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> >>>> @@ -466,6 +466,11 @@ static int hns_roce_mmap(struct ib_ucontext *uctx, struct vm_area_struct *vma)
> >>>>  	pgprot_t prot;
> >>>>  	int ret;
> >>>>  
> >>>> +	if (hr_dev->dis_db) {
> >>>
> >>> How do you clear dis_db after calling to hns_roce_hw_v2_reset_notify_down()? Does it have any locking protection?
> >>>
> >>
> >> Sorry for the late response, I just came back from vacation.
> >>
> >> After calling hns_roce_hw_v2_reset_notify_down(), we will call ib_unregister_device()
> >> and destory all HW resources eventually, so there is no need to clear dis_db.
> > 
> > Why can't you do the unregister device sooner then and avoid all this
> > special stuff?
> > 
> 
> It's a limitation of HW. Resources such as QP/CQ/MR will be destoryed
> during unregistering device. This is not allowed by HW until
> hns_roce_hw_v2_reset_notify_uninit(), or it may lead to some HW errors.

It is interested claim given the fact that you are changing original
code from 2016.

Thanks

> 
> > I assumed you'd bring the same device back after completing the reset??
> > 
> 
> Yes
> 
> > Jason
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset
  2024-09-23  9:02           ` Leon Romanovsky
@ 2024-09-23 14:15             ` Junxian Huang
  0 siblings, 0 replies; 9+ messages in thread
From: Junxian Huang @ 2024-09-23 14:15 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Jason Gunthorpe, linux-rdma, linuxarm, linux-kernel



On 2024/9/23 17:02, Leon Romanovsky wrote:
> On Mon, Sep 23, 2024 at 02:17:40PM +0800, Junxian Huang wrote:
>>
>>
>> On 2024/9/20 20:47, Jason Gunthorpe wrote:
>>> On Fri, Sep 20, 2024 at 05:18:14PM +0800, Junxian Huang wrote:
>>>
>>>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
>>>>>> index 4cb0af733587..49315f39361d 100644
>>>>>> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
>>>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
>>>>>> @@ -466,6 +466,11 @@ static int hns_roce_mmap(struct ib_ucontext *uctx, struct vm_area_struct *vma)
>>>>>>  	pgprot_t prot;
>>>>>>  	int ret;
>>>>>>  
>>>>>> +	if (hr_dev->dis_db) {
>>>>>
>>>>> How do you clear dis_db after calling to hns_roce_hw_v2_reset_notify_down()? Does it have any locking protection?
>>>>>
>>>>
>>>> Sorry for the late response, I just came back from vacation.
>>>>
>>>> After calling hns_roce_hw_v2_reset_notify_down(), we will call ib_unregister_device()
>>>> and destory all HW resources eventually, so there is no need to clear dis_db.
>>>
>>> Why can't you do the unregister device sooner then and avoid all this
>>> special stuff?
>>>
>>
>> It's a limitation of HW. Resources such as QP/CQ/MR will be destoryed
>> during unregistering device. This is not allowed by HW until
>> hns_roce_hw_v2_reset_notify_uninit(), or it may lead to some HW errors.
> 
> It is interested claim given the fact that you are changing original
> code from 2016.
> 

Well, this isn't a new issue. We once sent a patch to try to address it
in 2019 [1], but that solution wasn't the right way since it relied on
userspace. We haven't come up with any new solutions since then, until
this series recently.

[1] https://lore.kernel.org/linux-rdma/20190812055220.GA8440@mtr-leonro.mtl.com/

Junxian

> Thanks
> 
>>
>>> I assumed you'd bring the same device back after completing the reset??
>>>
>>
>> Yes
>>
>>> Jason
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-09-23 14:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-13 12:29 [PATCH v5 for-next 0/2] RDMA: Provide an API for drivers to disassociate mmap pages Junxian Huang
2024-09-13 12:29 ` [PATCH v5 for-next 1/2] RDMA/core: Provide rdma_user_mmap_disassociate() " Junxian Huang
2024-09-13 12:29 ` [PATCH v5 for-next 2/2] RDMA/hns: Disassociate mmap pages for all uctx when HW is being reset Junxian Huang
2024-09-16  9:13   ` Leon Romanovsky
2024-09-20  9:18     ` Junxian Huang
2024-09-20 12:47       ` Jason Gunthorpe
2024-09-23  6:17         ` Junxian Huang
2024-09-23  9:02           ` Leon Romanovsky
2024-09-23 14:15             ` Junxian Huang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox