AMD-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: "Khatri, Sunil" <sukhatri@amd.com>,
	tursulin@ursulin.net, Alexander.Deucher@amd.com,
	Prike.Liang@amd.com, Yogesh.Mohanmarimuthu@amd.com,
	SRINIVASAN.SHANMUGAM@amd.com, Sunil.Khatri@amd.com,
	amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 10/11] drm/amdgpu: fix some more bug in amdgpu_gem_va_ioctl
Date: Tue, 17 Mar 2026 12:08:39 +0100	[thread overview]
Message-ID: <c6be4648-d53c-42d1-a625-8e09cba0993a@amd.com> (raw)
In-Reply-To: <b26458f2-d0ad-42e8-a0a1-77d7c5fd15dc@amd.com>

On 3/17/26 09:44, Khatri, Sunil wrote:
> 
> On 11-03-2026 12:43 am, Christian König wrote:
>> Some illegal combination of input flags were not checked and we need to
>> take the PDEs into account when returning the fence as well.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 76 +++++++++++--------------
>>  1 file changed, 34 insertions(+), 42 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> index 88a21400ae09..98276b55ad3c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> @@ -30,6 +30,7 @@
>>  #include <linux/pagemap.h>
>>  #include <linux/pci.h>
>>  #include <linux/dma-buf.h>
>> +#include <linux/dma-fence-unwrap.h>
>>  
>>  #include <drm/amdgpu_drm.h>
>>  #include <drm/drm_drv.h>
>> @@ -741,11 +742,10 @@ amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
>>  	struct dma_fence *fence;
>>  	int r = 0;
>>  
>> -	/* Always start from the VM's existing last update fence. */
>> -	fence = dma_fence_get(vm->last_update);
>> -
>> +	/* If the VM is not ready return only a stub. */
>>  	if (!amdgpu_vm_ready(vm))
>> -		return fence;
>> +		return dma_fence_get_stub();
>> +
>>  
>>  	/*
>>  	 * First clean up any freed mappings in the VM.
>> @@ -754,7 +754,7 @@ amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
>>  	 * schedules GPU work. If nothing needs clearing, @fence can remain as
>>  	 * the original vm->last_update.
>>  	 */
>> -	r = amdgpu_vm_clear_freed(adev, vm, &fence);
>> +	r = amdgpu_vm_clear_freed(adev, vm, &vm->last_update);
>>  	if (r)
>>  		goto error;
>>  
>> @@ -771,47 +771,34 @@ amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
>>  	if (r)
>>  		goto error;
>>  
>> -	/*
>> -	 * Decide which fence best represents the last update:
>> -	 *
>> -	 * MAP/REPLACE:
>> -	 *   - For always-valid mappings, use vm->last_update.
>> -	 *   - Otherwise, export bo_va->last_pt_update.
>> -	 *
>> -	 * UNMAP/CLEAR:
>> -	 *   Keep the fence returned by amdgpu_vm_clear_freed(). If no work was
>> -	 *   needed, it can remain as vm->last_pt_update.
>> -	 *
>> -	 * The VM and BO update fences are always initialized to a valid value.
>> -	 * vm->last_update and bo_va->last_pt_update always start as valid fences.
>> -	 * and are never expected to be NULL.
>> -	 */
>> -	switch (operation) {
>> -	case AMDGPU_VA_OP_MAP:
>> -	case AMDGPU_VA_OP_REPLACE:
>> +	if ((operation == AMDGPU_VA_OP_MAP ||
>> +	     operation == AMDGPU_VA_OP_REPLACE) &&
>> +	    !amdgpu_vm_is_bo_always_valid(vm, bo_va->base.bo)) {
>> +
>>  		/*
>> -		 * For MAP/REPLACE, return the page table update fence for the
>> -		 * mapping we just modified. bo_va is expected to be valid here.
>> +		 * For MAP/REPLACE of non per-VM BOs we need to sync to both the
>> +		 * bo_va->last_pt_update and vm->last_update or otherwise we
>> +		 * potentially miss the PDE updates.
>>  		 */
>> -		dma_fence_put(fence);
>> -
>> -		if (amdgpu_vm_is_bo_always_valid(vm, bo_va->base.bo))
>> -			fence = dma_fence_get(vm->last_update);
>> -		else
>> -			fence = dma_fence_get(bo_va->last_pt_update);
>> -		break;
>> -	case AMDGPU_VA_OP_UNMAP:
>> -	case AMDGPU_VA_OP_CLEAR:
>> -	default:
>> -		/* keep @fence as returned by amdgpu_vm_clear_freed() */
>> -		break;
>> +		fence = dma_fence_unwrap_merge(vm->last_update,
>> +					       bo_va->last_pt_update);
>> +		if (!fence) {
>> +			/* As fallback in OOM situations */
>> +			dma_fence_wait(vm->last_update, false);
>> +			dma_fence_wait(bo_va->last_pt_update, false);
>> +			fence = dma_fence_get_stub();
>> +		}
>> +	} else {
>> +		fence = dma_fence_get(vm->last_update);
>>  	}
>>  
>> +	return fence;
>> +
>>  error:
>>  	if (r && r != -ERESTARTSYS)
>>  		DRM_ERROR("Couldn't update BO_VA (%d)\n", r);
>>  
>> -	return fence;
>> +	return dma_fence_get(vm->last_update);
>>  }
>>  
>>  int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>> @@ -832,7 +819,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>>  	struct amdgpu_bo_va *bo_va;
>>  	struct drm_syncobj *timeline_syncobj = NULL;
>>  	struct dma_fence_chain *timeline_chain = NULL;
>> -	struct dma_fence *fence;
>>  	struct drm_exec exec;
>>  	uint64_t vm_size;
>>  	int r = 0;
>> @@ -884,6 +870,10 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>>  		return -EINVAL;
>>  	}
>>  
>> +	if (args->flags & AMDGPU_VM_DELAY_UPDATE &&
>> +	    args->vm_timeline_syncobj_out)
>> +		return -EINVAL;
>> +
>>  	if ((args->operation != AMDGPU_VA_OP_CLEAR) &&
>>  	    !(args->flags & AMDGPU_VM_PAGE_PRT)) {
>>  		gobj = drm_gem_object_lookup(filp, args->handle);
>> @@ -973,11 +963,13 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>>  	 * that represents the last relevant update for this mapping. This
>>  	 * fence can then be exported to the user-visible VM timeline.
>>  	 */
>> -	if (!r && !(args->flags & AMDGPU_VM_DELAY_UPDATE) && !adev->debug_vm) {
>> +	if (!r && !(args->flags & AMDGPU_VM_DELAY_UPDATE) &&
>> +	    (!adev->debug_vm || timeline_syncobj)) {
>> +		struct dma_fence *fence;
> why to declare it here within the block. If i am not wrong we should have declaration first thing in the function as it was before? 

Because the problem keeping the variables local reduces the risk that we accidentally mess things up again.

Regards,
Christian.

> 
> Other than that small nitpick looks good to me. 
> 
> Acked-by: Sunil Khatri <sunil.khatri@amd.com>
> 
> Regards
> Sunil khatri
> 
> 
>> +
>>  		fence = amdgpu_gem_va_update_vm(adev, &fpriv->vm, bo_va,
>>  						args->operation);
>> -
>> -		if (timeline_syncobj && fence) {
>> +		if (timeline_syncobj) {
>>  			if (!args->vm_timeline_point) {
>>  				/* Replace the existing fence when no point is given. */
>>  				drm_syncobj_replace_fence(timeline_syncobj,


  reply	other threads:[~2026-03-17 11:08 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-10 19:13 [PATCH 01/11] drm/amdgpu: revert to old status lock handling v4 Christian König
2026-03-10 19:13 ` [PATCH 02/11] drm/amdgpu: restructure VM state machine Christian König
2026-03-11  8:47   ` Khatri, Sunil
2026-03-12 12:49     ` Christian König
2026-03-12 10:07   ` Liang, Prike
2026-03-12 14:32   ` Tvrtko Ursulin
2026-03-16 13:44     ` Christian König
2026-03-16 14:26       ` Tvrtko Ursulin
2026-03-10 19:13 ` [PATCH 03/11] drm/amdgpu: fix amdgpu_userq_evict Christian König
2026-03-11  8:51   ` Khatri, Sunil
2026-03-13  7:25   ` Liang, Prike
2026-03-10 19:13 ` [PATCH 04/11] drm/amdgpu: completely rework eviction fence handling Christian König
2026-03-11 12:27   ` Khatri, Sunil
2026-03-13  8:00     ` Khatri, Sunil
2026-03-17  9:41     ` Christian König
2026-03-13  8:28   ` Liang, Prike
2026-03-17  9:57     ` Christian König
2026-03-17 11:21       ` Liang, Prike
2026-03-17 11:23         ` Christian König
2026-03-17 11:54           ` Liang, Prike
2026-03-10 19:13 ` [PATCH 05/11] drm/amdgpu: fix eviction fence and userq manager shutdown Christian König
2026-03-11 12:26   ` Khatri, Sunil
2026-03-13  9:35     ` Khatri, Sunil
2026-03-10 19:13 ` [PATCH 06/11] drm/amdgpu: fix adding eviction fence Christian König
2026-03-11 12:26   ` Khatri, Sunil
2026-03-10 19:13 ` [PATCH 07/11] drm/amdgpu: rework amdgpu_userq_wait_ioctl v3 Christian König
2026-03-12 16:34   ` Tvrtko Ursulin
2026-03-16 14:19     ` Christian König
2026-03-16 14:44       ` Tvrtko Ursulin
2026-03-17  7:05   ` Khatri, Sunil
2026-03-10 19:13 ` [PATCH 08/11] drm/amdgpu: make amdgpu_user_wait_ioctl more resilent v2 Christian König
2026-03-17  7:15   ` Khatri, Sunil
2026-03-10 19:13 ` [PATCH 09/11] drm/amdgpu: annotate eviction fence signaling path Christian König
2026-03-17  7:35   ` Khatri, Sunil
2026-03-10 19:13 ` [PATCH 10/11] drm/amdgpu: fix some more bug in amdgpu_gem_va_ioctl Christian König
2026-03-17  8:44   ` Khatri, Sunil
2026-03-17 11:08     ` Christian König [this message]
2026-03-10 19:13 ` [PATCH 11/11] drm/amdgpu: WIP sync amdgpu_ttm_fill_mem only to kernel fences Christian König
2026-03-17  8:59   ` Khatri, Sunil
2026-03-17 10:52     ` Christian König
2026-03-11  7:43 ` [PATCH 01/11] drm/amdgpu: revert to old status lock handling v4 Khatri, Sunil
2026-03-12  7:13 ` Liang, Prike

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c6be4648-d53c-42d1-a625-8e09cba0993a@amd.com \
    --to=christian.koenig@amd.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Prike.Liang@amd.com \
    --cc=SRINIVASAN.SHANMUGAM@amd.com \
    --cc=Sunil.Khatri@amd.com \
    --cc=Yogesh.Mohanmarimuthu@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=sukhatri@amd.com \
    --cc=tursulin@ursulin.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox