* [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update
2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
@ 2022-03-28 13:06 ` Philip Yang
2022-03-28 13:14 ` Christian König
2022-03-28 13:07 ` [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq Philip Yang
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Philip Yang @ 2022-03-28 13:06 UTC (permalink / raw)
To: amd-gfx; +Cc: Philip Yang, felix.kuehling, christian.koenig
Flush TLB needs wait for GPU update fence done. MMU notify callback to
unmap range from GPUs uses unlocked GPU page table update, so add tlb_cb
to unlocked update fence to increase vm->tlb_seq.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 48f326609976..683b08f756ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -927,7 +927,7 @@ int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
r = vm->update_funcs->commit(¶ms, fence);
- if (!unlocked && (!(flags & AMDGPU_PTE_VALID) || params.table_freed)) {
+ if (!(flags & AMDGPU_PTE_VALID) || params.table_freed) {
tlb_cb->vm = vm;
if (!fence || !*fence ||
dma_fence_add_callback(*fence, &tlb_cb->cb,
--
2.35.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update
2022-03-28 13:06 ` [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update Philip Yang
@ 2022-03-28 13:14 ` Christian König
0 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2022-03-28 13:14 UTC (permalink / raw)
To: Philip Yang, amd-gfx; +Cc: felix.kuehling
Am 28.03.22 um 15:06 schrieb Philip Yang:
> Flush TLB needs wait for GPU update fence done. MMU notify callback to
> unmap range from GPUs uses unlocked GPU page table update, so add tlb_cb
> to unlocked update fence to increase vm->tlb_seq.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 48f326609976..683b08f756ed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -927,7 +927,7 @@ int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>
> r = vm->update_funcs->commit(¶ms, fence);
>
> - if (!unlocked && (!(flags & AMDGPU_PTE_VALID) || params.table_freed)) {
> + if (!(flags & AMDGPU_PTE_VALID) || params.table_freed) {
> tlb_cb->vm = vm;
> if (!fence || !*fence ||
> dma_fence_add_callback(*fence, &tlb_cb->cb,
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq
2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
2022-03-28 13:06 ` [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update Philip Yang
@ 2022-03-28 13:07 ` Philip Yang
2022-03-28 13:16 ` Christian König
2022-03-28 13:14 ` [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Christian König
2022-03-28 13:52 ` Christian König
3 siblings, 1 reply; 9+ messages in thread
From: Philip Yang @ 2022-03-28 13:07 UTC (permalink / raw)
To: amd-gfx; +Cc: Philip Yang, felix.kuehling, christian.koenig
To support multi-thread update page table.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 945982a5d688..e1b7e6afa920 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -705,7 +705,7 @@ struct kfd_process_device {
/* VM context for GPUVM allocations */
struct file *drm_file;
void *drm_priv;
- uint64_t tlb_seq;
+ atomic64_t tlb_seq;
/* GPUVM allocations storage */
struct idr alloc_idr;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ac8123c1ee8f..43ed8ec1f975 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1560,7 +1560,7 @@ int kfd_process_device_init_vm(struct kfd_process_device *pdd,
return ret;
}
pdd->drm_priv = drm_file->private_data;
- pdd->tlb_seq = 0;
+ atomic64_set(&pdd->tlb_seq, 0);
ret = kfd_process_device_reserve_ib_mem(pdd);
if (ret)
@@ -1954,10 +1954,10 @@ void kfd_flush_tlb(struct kfd_process_device *pdd, enum TLB_FLUSH_TYPE type)
uint64_t tlb_seq = amdgpu_vm_tlb_seq(vm);
struct kfd_dev *dev = pdd->dev;
- if (pdd->tlb_seq == tlb_seq)
+ if (atomic64_read(&pdd->tlb_seq) == tlb_seq)
return;
- pdd->tlb_seq = tlb_seq;
+ atomic64_set(&pdd->tlb_seq, tlb_seq);
if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
/* Nothing to flush until a VMID is assigned, which
* only happens when the first queue is created.
--
2.35.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq
2022-03-28 13:07 ` [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq Philip Yang
@ 2022-03-28 13:16 ` Christian König
0 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2022-03-28 13:16 UTC (permalink / raw)
To: Philip Yang, amd-gfx; +Cc: felix.kuehling
Am 28.03.22 um 15:07 schrieb Philip Yang:
> To support multi-thread update page table.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6 +++---
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 945982a5d688..e1b7e6afa920 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -705,7 +705,7 @@ struct kfd_process_device {
> /* VM context for GPUVM allocations */
> struct file *drm_file;
> void *drm_priv;
> - uint64_t tlb_seq;
> + atomic64_t tlb_seq;
>
> /* GPUVM allocations storage */
> struct idr alloc_idr;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index ac8123c1ee8f..43ed8ec1f975 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -1560,7 +1560,7 @@ int kfd_process_device_init_vm(struct kfd_process_device *pdd,
> return ret;
> }
> pdd->drm_priv = drm_file->private_data;
> - pdd->tlb_seq = 0;
> + atomic64_set(&pdd->tlb_seq, 0);
>
> ret = kfd_process_device_reserve_ib_mem(pdd);
> if (ret)
> @@ -1954,10 +1954,10 @@ void kfd_flush_tlb(struct kfd_process_device *pdd, enum TLB_FLUSH_TYPE type)
> uint64_t tlb_seq = amdgpu_vm_tlb_seq(vm);
> struct kfd_dev *dev = pdd->dev;
>
> - if (pdd->tlb_seq == tlb_seq)
> + if (atomic64_read(&pdd->tlb_seq) == tlb_seq)
> return;
>
> - pdd->tlb_seq = tlb_seq;
> + atomic64_set(&pdd->tlb_seq, tlb_seq);
That should probably use atomic64_xchg() instead of read+set or
otherwise using an atomic doesn't make much sense.
Christian.
> if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
> /* Nothing to flush until a VMID is assigned, which
> * only happens when the first queue is created.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
2022-03-28 13:06 ` [PATCH v2 2/3] drm/amdgpu: Add tlb_cb for unlocked update Philip Yang
2022-03-28 13:07 ` [PATCH v2 3/3] drm/amdkfd: Use atomic64_t type for pdd->tlb_seq Philip Yang
@ 2022-03-28 13:14 ` Christian König
2022-03-28 13:35 ` philip yang
2022-03-28 13:52 ` Christian König
3 siblings, 1 reply; 9+ messages in thread
From: Christian König @ 2022-03-28 13:14 UTC (permalink / raw)
To: Philip Yang, amd-gfx; +Cc: felix.kuehling
Am 28.03.22 um 15:06 schrieb Philip Yang:
> To fix two issues with unlocked update fence:
>
> 1. vm->last_unlocked store the latest fence without taking refcount.
> 2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence.
NAK, that code here is perfectly correct.
vm->last_unlocked gets a reference to the last unlocked operation.
Otherwise the last locked operation is added directly to the root resv
object.
Regards,
Christian.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index dbb551762805..69fba68ff88e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -109,7 +109,7 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
> if (p->unlocked) {
> struct dma_fence *tmp = dma_fence_get(f);
>
> - swap(p->vm->last_unlocked, f);
> + swap(p->vm->last_unlocked, tmp);
> dma_fence_put(tmp);
> } else {
> amdgpu_bo_fence(p->vm->root.bo, f, true);
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
2022-03-28 13:14 ` [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Christian König
@ 2022-03-28 13:35 ` philip yang
2022-03-28 13:37 ` Christian König
0 siblings, 1 reply; 9+ messages in thread
From: philip yang @ 2022-03-28 13:35 UTC (permalink / raw)
To: Christian König, Philip Yang, amd-gfx; +Cc: felix.kuehling
[-- Attachment #1: Type: text/html, Size: 3156 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
2022-03-28 13:35 ` philip yang
@ 2022-03-28 13:37 ` Christian König
0 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2022-03-28 13:37 UTC (permalink / raw)
To: philip yang, Philip Yang, amd-gfx; +Cc: felix.kuehling
Am 28.03.22 um 15:35 schrieb philip yang:
>
>
> On 2022-03-28 9:14 a.m., Christian König wrote:
>> Am 28.03.22 um 15:06 schrieb Philip Yang:
>>> To fix two issues with unlocked update fence:
>>>
>>> 1. vm->last_unlocked store the latest fence without taking refcount.
>>> 2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence.
>>
>> NAK, that code here is perfectly correct.
>>
>> vm->last_unlocked gets a reference to the last unlocked operation.
>>
>> Otherwise the last locked operation is added directly to the root
>> resv object.
>>
>> Regards,
>> Christian.
>>
>>>
>>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>>> index dbb551762805..69fba68ff88e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>>> @@ -109,7 +109,7 @@ static int amdgpu_vm_sdma_commit(struct
>>> amdgpu_vm_update_params *p,
>>> if (p->unlocked) {
>>> struct dma_fence *tmp = dma_fence_get(f);
>>> - swap(p->vm->last_unlocked, f);
>>> + swap(p->vm->last_unlocked, tmp);
>
> dma_fence_put(tmp);
>
> tmp is the new unlock fence, so this drop the new fence refcount, fix
> is to move the old fence to tmp, drop the old fence refcount, take the
> new fence reference.
>
> f is return fence, because f swap to old fence, so old fence is
> returned, not new fence, fix will not change f.
>
No, swap() will exchange the parameters a and b.
So tmp is pointing to the old fence when dma_fence_put() is called on it.
That reference counting is correct as far as I can see.
Regards,
Christian.
> Regards,
>
> Philip
>
>>> } else {
>>> amdgpu_bo_fence(p->vm->root.bo, f, true);
>>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling
2022-03-28 13:06 [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Philip Yang
` (2 preceding siblings ...)
2022-03-28 13:14 ` [PATCH v2 1/3] drm/amdgpu: Correct unlocked update fence handling Christian König
@ 2022-03-28 13:52 ` Christian König
3 siblings, 0 replies; 9+ messages in thread
From: Christian König @ 2022-03-28 13:52 UTC (permalink / raw)
To: Philip Yang, amd-gfx; +Cc: felix.kuehling
Am 28.03.22 um 15:06 schrieb Philip Yang:
> To fix two issues with unlocked update fence:
>
> 1. vm->last_unlocked store the latest fence without taking refcount.
> 2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Now I see what you mean. Patch is Reviewed-by: Christian König
<christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index dbb551762805..69fba68ff88e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -109,7 +109,7 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
> if (p->unlocked) {
> struct dma_fence *tmp = dma_fence_get(f);
>
> - swap(p->vm->last_unlocked, f);
> + swap(p->vm->last_unlocked, tmp);
> dma_fence_put(tmp);
> } else {
> amdgpu_bo_fence(p->vm->root.bo, f, true);
^ permalink raw reply [flat|nested] 9+ messages in thread