From: "Christian König" <christian.koenig@amd.com>
To: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>,
amd-gfx@lists.freedesktop.org
Cc: kernel-dev@igalia.com
Subject: Re: [RFC] drm/amdgpu: Simplify amdgpu_vmid_grab?
Date: Mon, 29 Jun 2026 10:27:17 +0200 [thread overview]
Message-ID: <67a37c1f-e09e-4eac-b100-0702557c2244@amd.com> (raw)
In-Reply-To: <20260626161002.13908-1-tvrtko.ursulin@igalia.com>
On 6/26/26 18:10, Tvrtko Ursulin wrote:
> This is not a proper commit message but a conversation starter:
>
> I found the flow of amdgpu_vmid_grab a bit confusing and decided to
> butcher it, heavily, to see what will happen. To my surprise, I was able
> to start a dozen of parallel 3d apps and nothing broke.
>
> In no particular order:
>
> - Amdgpu_vmid_grab_idle cannot return an error despite the code claims it
> can.
Yeah, that's on my cleanup TODO list after I removed the memory allocation from that path.
> - Also, why it is called before attempting to re-use the previously
> assigned vmid?
Because of fairness. That's necessary otherwise submission hungry application can starve processes which only submit something here and there.
> - And why it is called before the reserved vmid path and then not used?
> - What is the point of ring->vmid_wait? It is never cleared and isn't
> the same effect achieved by simply waiting on the last LRU entry?
Different rings see different VMIDs busy because of pipelining.
In other words what can happen is that the last LRU entry changes while somebody waits for it to become idle.
So to guarantee fairness we need to make sure that everybody who wants to use this submission ring waits for the same fence.
Regards,
Christian.
>
> What am I missing and what have I broke? :) Or if nothing, then:
>
> 3 files changed, 64 insertions(+), 99 deletions(-)
>
> And:
>
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-465 (-465)
> Function old new delta
> amdgpu_vmid_grab 2917 2452 -465
>
> Which is not bad? Not least the clearer flow of amdgpu_vmid_grab(),
> one of the hottest functions in the submit worker after all.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 159 +++++++++--------------
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 -
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 -
> 3 files changed, 64 insertions(+), 99 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> index 124fb38eb465..37405f9ff7e5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> @@ -199,48 +199,35 @@ static bool amdgpu_vmid_compatible(struct amdgpu_vmid *id,
> * amdgpu_vmid_grab_idle - grab idle VMID
> *
> * @ring: ring we want to submit job to
> - * @idle: resulting idle VMID
> * @fence: fence to wait for if no id could be grabbed
> *
> * Try to find an idle VMID, if none is idle add a fence to wait to the sync
> - * object. Returns -ENOMEM when we are out of memory.
> + * object.
> */
> -static int amdgpu_vmid_grab_idle(struct amdgpu_ring *ring,
> - struct amdgpu_vmid **idle,
> - struct dma_fence **fence)
> +static struct amdgpu_vmid *amdgpu_vmid_grab_idle(struct amdgpu_ring *ring,
> + struct dma_fence **fence)
> {
> struct amdgpu_device *adev = ring->adev;
> - unsigned vmhub = ring->vm_hub;
> - struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[vmhub];
> -
> - /* If anybody is waiting for a VMID let everybody wait for fairness */
> - if (!dma_fence_is_signaled(ring->vmid_wait)) {
> - *fence = dma_fence_get(ring->vmid_wait);
> - return 0;
> - }
> + struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[ring->vm_hub];
> + struct amdgpu_vmid *idle;
>
> /* Check if we have an idle VMID */
> - list_for_each_entry_reverse((*idle), &id_mgr->ids_lru, list) {
> + list_for_each_entry_reverse(idle, &id_mgr->ids_lru, list) {
> /* Don't use per engine and per process VMID at the same time */
> struct amdgpu_ring *r = adev->vm_manager.concurrent_flush ?
> NULL : ring;
>
> - *fence = amdgpu_sync_peek_fence(&(*idle)->active, r);
> + *fence = amdgpu_sync_peek_fence(&idle->active, r);
> if (!(*fence))
> - return 0;
> + return idle;
> }
>
> /*
> * If we can't find a idle VMID to use, wait on a fence from the least
> * recently used in the hope that it will be available soon.
> */
> - *idle = NULL;
> - dma_fence_put(ring->vmid_wait);
> - ring->vmid_wait = dma_fence_get(*fence);
> -
> - /* This is the reference we return */
> dma_fence_get(*fence);
> - return 0;
> + return NULL;
> }
>
> /**
> @@ -249,34 +236,33 @@ static int amdgpu_vmid_grab_idle(struct amdgpu_ring *ring,
> * @vm: vm to allocate id for
> * @ring: ring we want to submit job to
> * @job: job who wants to use the VMID
> - * @id: resulting VMID
> * @fence: fence to wait for if no id could be grabbed
> *
> * Try to assign a reserved VMID.
> */
> -static int amdgpu_vmid_grab_reserved(struct amdgpu_vm *vm,
> - struct amdgpu_ring *ring,
> - struct amdgpu_job *job,
> - struct amdgpu_vmid **id,
> - struct dma_fence **fence)
> +static struct amdgpu_vmid *
> +amdgpu_vmid_grab_reserved(struct amdgpu_vm *vm,
> + struct amdgpu_ring *ring,
> + struct amdgpu_job *job,
> + struct dma_fence **fence)
> {
> struct amdgpu_device *adev = ring->adev;
> - unsigned vmhub = ring->vm_hub;
> uint64_t fence_context = adev->fence_context + ring->idx;
> bool needs_flush = vm->use_cpu_for_update;
> uint64_t updates = amdgpu_vm_tlb_seq(vm);
> + struct amdgpu_vmid *id;
> int r;
>
> - *id = vm->reserved_vmid[vmhub];
> - if ((*id)->owner != vm->immediate.fence_context ||
> - !amdgpu_vmid_compatible(*id, job) ||
> - (*id)->flushed_updates < updates ||
> - !(*id)->last_flush ||
> - ((*id)->last_flush->context != fence_context &&
> - !dma_fence_is_signaled((*id)->last_flush)))
> + id = vm->reserved_vmid[ring->vm_hub];
> + if (id->owner != vm->immediate.fence_context ||
> + !amdgpu_vmid_compatible(id, job) ||
> + id->flushed_updates < updates ||
> + !id->last_flush ||
> + (id->last_flush->context != fence_context &&
> + !dma_fence_is_signaled(id->last_flush)))
> needs_flush = true;
>
> - if ((*id)->owner != vm->immediate.fence_context ||
> + if (id->owner != vm->immediate.fence_context ||
> (!adev->vm_manager.concurrent_flush && needs_flush)) {
> struct dma_fence *tmp;
>
> @@ -287,26 +273,25 @@ static int amdgpu_vmid_grab_reserved(struct amdgpu_vm *vm,
> ring = NULL;
>
> /* to prevent one context starved by another context */
> - (*id)->pd_gpu_addr = 0;
> - tmp = amdgpu_sync_peek_fence(&(*id)->active, ring);
> + id->pd_gpu_addr = 0;
> + tmp = amdgpu_sync_peek_fence(&id->active, ring);
> if (tmp) {
> - *id = NULL;
> *fence = dma_fence_get(tmp);
> - return 0;
> + return NULL;
> }
> }
>
> /* Good we can use this VMID. Remember this submission as
> * user of the VMID.
> */
> - r = amdgpu_sync_fence(&(*id)->active, &job->base.s_fence->finished,
> + r = amdgpu_sync_fence(&id->active, &job->base.s_fence->finished,
> GFP_ATOMIC);
> if (r)
> - return r;
> + return ERR_PTR(r);
>
> job->vm_needs_flush = needs_flush;
> job->spm_update_needed = true;
> - return 0;
> + return id;
> }
>
> /**
> @@ -319,57 +304,46 @@ static int amdgpu_vmid_grab_reserved(struct amdgpu_vm *vm,
> *
> * Try to reuse a VMID for this submission.
> */
> -static int amdgpu_vmid_grab_used(struct amdgpu_vm *vm,
> - struct amdgpu_ring *ring,
> - struct amdgpu_job *job,
> - struct amdgpu_vmid **id)
> +static struct amdgpu_vmid *
> +amdgpu_vmid_grab_used(struct amdgpu_vm *vm,
> + struct amdgpu_ring *ring,
> + struct amdgpu_job *job)
> {
> struct amdgpu_device *adev = ring->adev;
> - unsigned vmhub = ring->vm_hub;
> - struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[vmhub];
> + struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[ring->vm_hub];
> uint64_t fence_context = adev->fence_context + ring->idx;
> uint64_t updates = amdgpu_vm_tlb_seq(vm);
> - int r;
> + struct amdgpu_vmid *id;
>
> job->vm_needs_flush = vm->use_cpu_for_update;
>
> /* Check if we can use a VMID already assigned to this VM */
> - list_for_each_entry_reverse((*id), &id_mgr->ids_lru, list) {
> + list_for_each_entry_reverse(id, &id_mgr->ids_lru, list) {
> bool needs_flush = vm->use_cpu_for_update;
>
> /* Check all the prerequisites to using this VMID */
> - if ((*id)->owner != vm->immediate.fence_context)
> + if (id->owner != vm->immediate.fence_context)
> continue;
>
> - if (!amdgpu_vmid_compatible(*id, job))
> + if (!amdgpu_vmid_compatible(id, job))
> continue;
>
> - if (!(*id)->last_flush ||
> - ((*id)->last_flush->context != fence_context &&
> - !dma_fence_is_signaled((*id)->last_flush)))
> + if (!id->last_flush ||
> + (id->last_flush->context != fence_context &&
> + !dma_fence_is_signaled(id->last_flush)))
> needs_flush = true;
>
> - if ((*id)->flushed_updates < updates)
> + if (id->flushed_updates < updates)
> needs_flush = true;
>
> if (needs_flush && !adev->vm_manager.concurrent_flush)
> continue;
>
> - /* Good, we can use this VMID. Remember this submission as
> - * user of the VMID.
> - */
> - r = amdgpu_sync_fence(&(*id)->active,
> - &job->base.s_fence->finished,
> - GFP_ATOMIC);
> - if (r)
> - return r;
> -
> job->vm_needs_flush |= needs_flush;
> - return 0;
> + return id;
> }
>
> - *id = NULL;
> - return 0;
> + return NULL;
> }
>
> /**
> @@ -386,39 +360,34 @@ int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
> struct amdgpu_job *job, struct dma_fence **fence)
> {
> struct amdgpu_device *adev = ring->adev;
> - unsigned vmhub = ring->vm_hub;
> + const unsigned vmhub = ring->vm_hub;
> struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[vmhub];
> - struct amdgpu_vmid *idle = NULL;
> - struct amdgpu_vmid *id = NULL;
> + struct amdgpu_vmid *id;
> int r = 0;
>
> mutex_lock(&id_mgr->lock);
> - r = amdgpu_vmid_grab_idle(ring, &idle, fence);
> - if (r || !idle)
> - goto error;
>
> if (amdgpu_vmid_uses_reserved(vm, vmhub)) {
> - r = amdgpu_vmid_grab_reserved(vm, ring, job, &id, fence);
> - if (r || !id)
> - goto error;
> + id = amdgpu_vmid_grab_reserved(vm, ring, job, fence);
> + if (IS_ERR_OR_NULL(id)) {
> + r = PTR_ERR(id);
> + goto out;
> + }
> } else {
> - r = amdgpu_vmid_grab_used(vm, ring, job, &id);
> + id = amdgpu_vmid_grab_used(vm, ring, job);
> + if (!id)
> + id = amdgpu_vmid_grab_idle(ring, fence);
> + if (!id)
> + goto out;
> +
> + /* Remember this submission as user of the VMID */
> + r = amdgpu_sync_fence(&id->active,
> + &job->base.s_fence->finished,
> + GFP_ATOMIC);
> if (r)
> - goto error;
> + goto out;
>
> - if (!id) {
> - /* Still no ID to use? Then use the idle one found earlier */
> - id = idle;
> -
> - /* Remember this submission as user of the VMID */
> - r = amdgpu_sync_fence(&id->active,
> - &job->base.s_fence->finished,
> - GFP_ATOMIC);
> - if (r)
> - goto error;
> -
> - job->vm_needs_flush = true;
> - }
> + job->vm_needs_flush = true;
>
> list_move_tail(&id->list, &id_mgr->ids_lru);
> }
> @@ -443,7 +412,7 @@ int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
>
> trace_amdgpu_vm_grab_id(vm, ring, job);
>
> -error:
> +out:
> mutex_unlock(&id_mgr->lock);
> return r;
> }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 09593fcdb2f7..6614682eaa89 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -254,7 +254,6 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
> ring->adev = adev;
> ring->num_hw_submission = sched_hw_submission;
> ring->sched_score = sched_score;
> - ring->vmid_wait = dma_fence_get_stub();
>
> ring->idx = adev->num_rings++;
> adev->rings[ring->idx] = ring;
> @@ -413,8 +412,6 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring)
> kvfree(ring->ring_backup);
> ring->ring_backup = NULL;
>
> - dma_fence_put(ring->vmid_wait);
> - ring->vmid_wait = NULL;
> ring->me = 0;
> }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 8f28b3bd7010..33dd39dd21cc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -414,7 +414,6 @@ struct amdgpu_ring {
> u64 set_q_mode_token;
> unsigned vm_hub;
> unsigned vm_inv_eng;
> - struct dma_fence *vmid_wait;
> bool has_compute_vm_bug;
> bool no_scheduler;
> bool no_user_submission;
prev parent reply other threads:[~2026-06-29 8:27 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-26 16:10 [RFC] drm/amdgpu: Simplify amdgpu_vmid_grab? Tvrtko Ursulin
2026-06-29 8:27 ` Christian König [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=67a37c1f-e09e-4eac-b100-0702557c2244@amd.com \
--to=christian.koenig@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=kernel-dev@igalia.com \
--cc=tvrtko.ursulin@igalia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.