From: "Timur Kristóf" <timur.kristof@gmail.com>
To: amd-gfx@lists.freedesktop.org, Alexander.Deucher@amd.com,
"Christian König" <christian.koenig@amd.com>,
"Natalie Vock" <natalie.vock@gmx.de>,
"Amir Shetaia" <Amir.Shetaia@amd.com>,
"Marek Olšák" <maraeo@gmail.com>,
"Mario Limonciello" <mario.limonciello@amd.com>,
"Tvrtko Ursulin" <tursulin@ursulin.net>
Subject: Re: [PATCH 1/7] drm/amdgpu/vm: Add fence argument to amdgpu_vm_handle_fault()
Date: Wed, 24 Jun 2026 16:09:54 +0200
Message-ID: <5790868.rdbgypaU67@timur-max>
In-Reply-To: <5dc2e31f-7e28-4076-9842-2c4245c01e67@ursulin.net>
On Wednesday, 24 June 2026 15:54:30 Central European Summer Time, Tvrtko Ursulin
wrote:
> On 29/05/2026 11:30, Timur Kristóf wrote:
> > Allow the caller to respond to when the VM update is finished.
> >
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> >
> > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 4 ++--
> > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 ++++-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
> > drivers/gpu/drm/amd/amdgpu/gmc_v12_1.c | 4 ++--
> > 4 files changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > index d790b7619ccd4..26aea960e2759 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -564,7 +564,7 @@ int amdgpu_gmc_handle_retry_fault(struct amdgpu_device *adev,
> > }
> >
> > ret = amdgpu_vm_handle_fault(adev, entry->pasid, entry->vmid, node_id,
> >
> > - addr, entry->timestamp, write_fault);
> > + addr, entry->timestamp, write_fault, NULL);
> >
> > adev->irq.ih_funcs->retry_cam_ack(adev, cam_index);
> > if (ret)
> >
> > return 1;
> >
> > @@ -587,7 +587,7 @@ int amdgpu_gmc_handle_retry_fault(struct amdgpu_device *adev,
> > * tables
> > */
> >
> > if (amdgpu_vm_handle_fault(adev, entry->pasid, entry->vmid, node_id,
> >
> > - addr, entry->timestamp, write_fault))
> > + addr, entry->timestamp, write_fault, NULL))
> >
> > return 1;
> >
> > }
> > return 0;
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > index b523a7b97d6f1..8c3ba7213eb22 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > @@ -2962,13 +2962,14 @@ struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
> > * GFX 9.4.3.
> > * @addr: Address of the fault
> > * @write_fault: true is write fault, false is read fault
> >
> > + * @fence: optional resulting fence, signaled after update is done
> >
> > *
> > * Try to gracefully handle a VM fault. Return true if the fault was handled and
> > * shouldn't be reported any more.
> > */
> >
> > bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
> >
> > u32 vmid, u32 node_id, uint64_t addr,
> >
> > - uint64_t ts, bool write_fault)
> > + uint64_t ts, bool write_fault, struct dma_fence **fence)
> >
> > {
> >
> > bool is_compute_context = false;
> > struct amdgpu_bo *root;
> >
> > @@ -3034,6 +3035,8 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
> > r = amdgpu_vm_update_pdes(adev, vm, true);
> >
> > + *fence = vm->last_update;
>
> Unless the heat wave is severely interfering with my ability to read
> code, fence here is mostly NULL and who owns the reference is suspect.
> Did you mean like this:
>
> if (fence)
> *fence = dma_fence_get(vm->last_update);
>
> Kernel doc should perhaps clarify along the lines of:
>
> "@fence: If non-null, returns a fence with an extra reference for the
> caller, which is signaled after update is done.".
>
> Regards,
>
> Tvrtko
Thank you!
Yes, that's a valid point. I will fix this.
Note that I will likely drop this patch from the next version of the series
and submit it separately, because it conflicts with Christian's series.
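
For reference, here is a minimal userspace sketch of the ownership pattern
suggested above: only write through the out pointer when the caller passed
one, and hand back an extra reference that the caller must drop. This is not
the actual patch. All mock_* names are hypothetical stand-ins for
struct dma_fence, dma_fence_get(), dma_fence_put() and the VM, reduced to a
plain integer reference count so the example can be built outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_fence: only a reference count. */
struct mock_fence {
	int refcount;
};

/* Stand-in for dma_fence_get(): take one extra reference, return the fence. */
static struct mock_fence *mock_fence_get(struct mock_fence *f)
{
	if (f)
		f->refcount++;
	return f;
}

/* Stand-in for dma_fence_put(): drop one reference. */
static void mock_fence_put(struct mock_fence *f)
{
	if (f)
		f->refcount--;
}

/* Hypothetical stand-in for the VM, which keeps its own reference. */
struct mock_vm {
	struct mock_fence *last_update;
};

/*
 * Sketch of the fault handler's tail: the fence argument is optional, and
 * when it is given the caller receives its own reference.
 */
static int mock_handle_fault(struct mock_vm *vm, struct mock_fence **fence)
{
	if (fence)
		*fence = mock_fence_get(vm->last_update);
	return 1;
}

/* Walk through both call styles and return the final reference count. */
static int mock_demo(void)
{
	struct mock_fence f = { .refcount = 1 };
	struct mock_vm vm = { .last_update = &f };
	struct mock_fence *out = NULL;

	/* Existing callers pass NULL: nothing is dereferenced or leaked. */
	mock_handle_fault(&vm, NULL);
	assert(f.refcount == 1);

	/* A caller that wants the fence gets an extra reference... */
	mock_handle_fault(&vm, &out);
	assert(out == &f && f.refcount == 2);

	/* ...and is responsible for dropping it when done. */
	mock_fence_put(out);
	return f.refcount;
}
```

With this shape the NULL passed by the existing callers stays valid, and a
caller that does ask for the fence holds its own reference regardless of what
later happens to the VM's copy.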
>
> > +
> >
> > error_unlock:
> > amdgpu_bo_unreserve(root);
> > if (r < 0)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > index cc096c005e348..72da6b3d98c70 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > @@ -589,7 +589,7 @@ void amdgpu_vm_put_task_info(struct amdgpu_task_info *task_info);
> > bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
> >
> > u32 vmid, u32 node_id, uint64_t addr, uint64_t ts,
> >
> > - bool write_fault);
> > + bool write_fault, struct dma_fence **fence);
> >
> > struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
> >
> > struct amdgpu_bo **root, u32 pasid);
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v12_1.c b/drivers/gpu/drm/amd/amdgpu/gmc_v12_1.c
> > index 855cd29cbffaa..da18c02013966 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v12_1.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v12_1.c
> > @@ -155,7 +155,7 @@ static int gmc_v12_1_process_interrupt(struct amdgpu_device *adev,
> > cam_index = entry->src_data[3] & 0x3ff;
> >
> > ret = amdgpu_vm_handle_fault(adev, entry->pasid, entry->vmid, node_id,
> >
> > - addr, entry->timestamp, write_fault);
> > + addr, entry->timestamp, write_fault, NULL);
> >
> > WDOORBELL32(adev->irq.retry_cam_doorbell_index, cam_index);
> > if (ret)
> >
> > return 1;
> >
> > @@ -178,7 +178,7 @@ static int gmc_v12_1_process_interrupt(struct amdgpu_device *adev,
> > * tables
> > */
> >
> > if (amdgpu_vm_handle_fault(adev, entry->pasid, entry->vmid, node_id,
> >
> > - addr, entry->timestamp, write_fault))
> > + addr, entry->timestamp, write_fault, NULL))
> >
> > return 1;
> >
> > }
> >
> > }