All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
@ 2022-03-28 13:06 Philip Yang
  2022-03-28 13:06 ` [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update Philip Yang
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Philip Yang @ 2022-03-28 13:06 UTC (permalink / raw)
  To: amd-gfx; +Cc: Philip Yang, felix.kuehling, christian.koenig

To fix two issues with unlocked update fence:

1. vm->last_unlocked store the latest fence without taking refcount.
2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index dbb551762805..69fba68ff88e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -109,7 +109,7 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
 	if (p->unlocked) {
 		struct dma_fence *tmp = dma_fence_get(f);
 
-		swap(p->vm->last_unlocked, f);
+		swap(p->vm->last_unlocked, tmp);
 		dma_fence_put(tmp);
 	} else {
 		amdgpu_bo_fence(p->vm->root.bo, f, true);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update
  2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
@ 2022-03-28 13:06 ` Philip Yang
  2022-03-28 13:14   ` Christian König
  2022-03-28 13:07 ` [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq Philip Yang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Philip Yang @ 2022-03-28 13:06 UTC (permalink / raw)
  To: amd-gfx; +Cc: Philip Yang, felix.kuehling, christian.koenig

Flush TLB needs wait for GPU update fence done. MMU notify callback to
unmap range from GPUs uses unlocked GPU page table update, so add tlb_cb
to unlocked update fence to increase vm->tlb_seq.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 48f326609976..683b08f756ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -927,7 +927,7 @@ int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 
 	r = vm->update_funcs->commit(&params, fence);
 
-	if (!unlocked && (!(flags & AMDGPU_PTE_VALID) || params.table_freed)) {
+	if (!(flags & AMDGPU_PTE_VALID) || params.table_freed) {
 		tlb_cb->vm = vm;
 		if (!fence || !*fence ||
 		    dma_fence_add_callback(*fence, &tlb_cb->cb,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq
  2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
  2022-03-28 13:06 ` [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update Philip Yang
@ 2022-03-28 13:07 ` Philip Yang
  2022-03-28 13:16   ` Christian König
  2022-03-28 13:14 ` [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Christian König
  2022-03-28 13:52 ` Christian König
  3 siblings, 1 reply; 9+ messages in thread
From: Philip Yang @ 2022-03-28 13:07 UTC (permalink / raw)
  To: amd-gfx; +Cc: Philip Yang, felix.kuehling, christian.koenig

To support multi-thread update page table.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 945982a5d688..e1b7e6afa920 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -705,7 +705,7 @@ struct kfd_process_device {
 	/* VM context for GPUVM allocations */
 	struct file *drm_file;
 	void *drm_priv;
-	uint64_t tlb_seq;
+	atomic64_t tlb_seq;
 
 	/* GPUVM allocations storage */
 	struct idr alloc_idr;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ac8123c1ee8f..43ed8ec1f975 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1560,7 +1560,7 @@ int kfd_process_device_init_vm(struct kfd_process_device *pdd,
 		return ret;
 	}
 	pdd->drm_priv = drm_file->private_data;
-	pdd->tlb_seq = 0;
+	atomic64_set(&pdd->tlb_seq, 0);
 
 	ret = kfd_process_device_reserve_ib_mem(pdd);
 	if (ret)
@@ -1954,10 +1954,10 @@ void kfd_flush_tlb(struct kfd_process_device *pdd, enum TLB_FLUSH_TYPE type)
 	uint64_t tlb_seq = amdgpu_vm_tlb_seq(vm);
 	struct kfd_dev *dev = pdd->dev;
 
-	if (pdd->tlb_seq == tlb_seq)
+	if (atomic64_read(&pdd->tlb_seq) == tlb_seq)
 		return;
 
-	pdd->tlb_seq = tlb_seq;
+	atomic64_set(&pdd->tlb_seq, tlb_seq);
 	if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
 		/* Nothing to flush until a VMID is assigned, which
 		 * only happens when the first queue is created.
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
  2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
  2022-03-28 13:06 ` [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update Philip Yang
  2022-03-28 13:07 ` [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq Philip Yang
@ 2022-03-28 13:14 ` Christian König
  2022-03-28 13:35   ` philip yang
  2022-03-28 13:52 ` Christian König
  3 siblings, 1 reply; 9+ messages in thread
From: Christian König @ 2022-03-28 13:14 UTC (permalink / raw)
  To: Philip Yang, amd-gfx; +Cc: felix.kuehling

Am 28.03.22 um 15:06 schrieb Philip Yang:
> To fix two issues with unlocked update fence:
>
> 1. vm->last_unlocked store the latest fence without taking refcount.
> 2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence.

NAK, that code here is perfectly correct.

vm->last_unlocked gets a reference to the last unlocked operation.

Otherwise the last locked operation is added directly to the root resv 
object.

Regards,
Christian.

>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index dbb551762805..69fba68ff88e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -109,7 +109,7 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
>   	if (p->unlocked) {
>   		struct dma_fence *tmp = dma_fence_get(f);
>   
> -		swap(p->vm->last_unlocked, f);
> +		swap(p->vm->last_unlocked, tmp);
>   		dma_fence_put(tmp);
>   	} else {
>   		amdgpu_bo_fence(p->vm->root.bo, f, true);


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update
  2022-03-28 13:06 ` [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update Philip Yang
@ 2022-03-28 13:14   ` Christian König
  0 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2022-03-28 13:14 UTC (permalink / raw)
  To: Philip Yang, amd-gfx; +Cc: felix.kuehling

Am 28.03.22 um 15:06 schrieb Philip Yang:
> Flush TLB needs wait for GPU update fence done. MMU notify callback to
> unmap range from GPUs uses unlocked GPU page table update, so add tlb_cb
> to unlocked update fence to increase vm->tlb_seq.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 48f326609976..683b08f756ed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -927,7 +927,7 @@ int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   
>   	r = vm->update_funcs->commit(&params, fence);
>   
> -	if (!unlocked && (!(flags & AMDGPU_PTE_VALID) || params.table_freed)) {
> +	if (!(flags & AMDGPU_PTE_VALID) || params.table_freed) {
>   		tlb_cb->vm = vm;
>   		if (!fence || !*fence ||
>   		    dma_fence_add_callback(*fence, &tlb_cb->cb,


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq
  2022-03-28 13:07 ` [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq Philip Yang
@ 2022-03-28 13:16   ` Christian König
  0 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2022-03-28 13:16 UTC (permalink / raw)
  To: Philip Yang, amd-gfx; +Cc: felix.kuehling



Am 28.03.22 um 15:07 schrieb Philip Yang:
> To support multi-thread update page table.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    | 2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6 +++---
>   2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 945982a5d688..e1b7e6afa920 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -705,7 +705,7 @@ struct kfd_process_device {
>   	/* VM context for GPUVM allocations */
>   	struct file *drm_file;
>   	void *drm_priv;
> -	uint64_t tlb_seq;
> +	atomic64_t tlb_seq;
>   
>   	/* GPUVM allocations storage */
>   	struct idr alloc_idr;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index ac8123c1ee8f..43ed8ec1f975 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -1560,7 +1560,7 @@ int kfd_process_device_init_vm(struct kfd_process_device *pdd,
>   		return ret;
>   	}
>   	pdd->drm_priv = drm_file->private_data;
> -	pdd->tlb_seq = 0;
> +	atomic64_set(&pdd->tlb_seq, 0);
>   
>   	ret = kfd_process_device_reserve_ib_mem(pdd);
>   	if (ret)
> @@ -1954,10 +1954,10 @@ void kfd_flush_tlb(struct kfd_process_device *pdd, enum TLB_FLUSH_TYPE type)
>   	uint64_t tlb_seq = amdgpu_vm_tlb_seq(vm);
>   	struct kfd_dev *dev = pdd->dev;
>   
> -	if (pdd->tlb_seq == tlb_seq)
> +	if (atomic64_read(&pdd->tlb_seq) == tlb_seq)
>   		return;
>   
> -	pdd->tlb_seq = tlb_seq;
> +	atomic64_set(&pdd->tlb_seq, tlb_seq);

That should probably use atomic64_xchg() instead of read+set or 
otherwise using an atomic doesn't make much sense.

Christian.

>   	if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
>   		/* Nothing to flush until a VMID is assigned, which
>   		 * only happens when the first queue is created.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
  2022-03-28 13:14 ` [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Christian König
@ 2022-03-28 13:35   ` philip yang
  2022-03-28 13:37     ` Christian König
  0 siblings, 1 reply; 9+ messages in thread
From: philip yang @ 2022-03-28 13:35 UTC (permalink / raw)
  To: Christian König, Philip Yang, amd-gfx; +Cc: felix.kuehling

[-- Attachment #1: Type: text/html, Size: 3156 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
  2022-03-28 13:35   ` philip yang
@ 2022-03-28 13:37     ` Christian König
  0 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2022-03-28 13:37 UTC (permalink / raw)
  To: philip yang, Philip Yang, amd-gfx; +Cc: felix.kuehling

Am 28.03.22 um 15:35 schrieb philip yang:
>
>
> On 2022-03-28 9:14 a.m., Christian König wrote:
>> Am 28.03.22 um 15:06 schrieb Philip Yang:
>>> To fix two issues with unlocked update fence:
>>>
>>> 1. vm->last_unlocked store the latest fence without taking refcount.
>>> 2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence.
>>
>> NAK, that code here is perfectly correct.
>>
>> vm->last_unlocked gets a reference to the last unlocked operation.
>>
>> Otherwise the last locked operation is added directly to the root 
>> resv object.
>>
>> Regards,
>> Christian.
>>
>>>
>>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>>> index dbb551762805..69fba68ff88e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>>> @@ -109,7 +109,7 @@ static int amdgpu_vm_sdma_commit(struct 
>>> amdgpu_vm_update_params *p,
>>>       if (p->unlocked) {
>>>           struct dma_fence *tmp = dma_fence_get(f);
>>>   -        swap(p->vm->last_unlocked, f);
>>> +        swap(p->vm->last_unlocked, tmp);
>
>           dma_fence_put(tmp);
>
> tmp is the new unlock fence, so this drop the new fence refcount, fix 
> is to move the old fence to tmp, drop the old fence refcount, take the 
> new fence reference.
>
> f is return fence, because f swap to old fence, so old fence is 
> returned, not new fence, fix will not change f.
>

No, swap() will exchange the parameters a and b.

So tmp is pointing to the old fence when dma_fence_put() is called on it.

That reference counting is correct as far as I can see.

Regards,
Christian.

> Regards,
>
> Philip
>
>>>       } else {
>>>           amdgpu_bo_fence(p->vm->root.bo, f, true);
>>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
  2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
                   ` (2 preceding siblings ...)
  2022-03-28 13:14 ` [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Christian König
@ 2022-03-28 13:52 ` Christian König
  3 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2022-03-28 13:52 UTC (permalink / raw)
  To: Philip Yang, amd-gfx; +Cc: felix.kuehling

Am 28.03.22 um 15:06 schrieb Philip Yang:
> To fix two issues with unlocked update fence:
>
> 1. vm->last_unlocked store the latest fence without taking refcount.
> 2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>

Now I see what you mean. Patch is Reviewed-by: Christian König 
<christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index dbb551762805..69fba68ff88e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -109,7 +109,7 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
>   	if (p->unlocked) {
>   		struct dma_fence *tmp = dma_fence_get(f);
>   
> -		swap(p->vm->last_unlocked, f);
> +		swap(p->vm->last_unlocked, tmp);
>   		dma_fence_put(tmp);
>   	} else {
>   		amdgpu_bo_fence(p->vm->root.bo, f, true);


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-03-28 13:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
2022-03-28 13:06 ` [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update Philip Yang
2022-03-28 13:14   ` Christian König
2022-03-28 13:07 ` [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq Philip Yang
2022-03-28 13:16   ` Christian König
2022-03-28 13:14 ` [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Christian König
2022-03-28 13:35   ` philip yang
2022-03-28 13:37     ` Christian König
2022-03-28 13:52 ` Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.