AMD-GFX Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: Alex Deucher <alexdeucher@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>,
	amd-gfx@lists.freedesktop.org, Jesse Zhang <Jesse.Zhang@amd.com>
Subject: Re: [PATCH] drm/amdgpu: fix possible fence leaks from job structure
Date: Mon, 3 Nov 2025 11:51:47 +0100
Message-ID: <eaefde24-1b5d-4cbd-b23e-6a5a608493fa@amd.com>
In-Reply-To: <CADnq5_NT-P-izFo-hWi7dpfDtU8WZitEw4xaKOjczRmgzwH5SQ@mail.gmail.com>

On 10/31/25 16:28, Alex Deucher wrote:
> On Fri, Oct 31, 2025 at 10:01 AM Christian König
> <christian.koenig@amd.com> wrote:
>>
>> On 10/31/25 14:53, Alex Deucher wrote:
>>> On Fri, Oct 31, 2025 at 4:40 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>>
>>>> On 10/27/25 23:02, Alex Deucher wrote:
>>>>> If we don't end up initializing the fences, free them when
>>>>> we free the job.
>>>>>
>>>>> v2: take a reference to the fences if we emit them
>>>>>
>>>>> Fixes: db36632ea51e ("drm/amdgpu: clean up and unify hw fence handling")
>>>>> Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> (v1)
>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>>> ---
>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  |  2 ++
>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 18 ++++++++++++++++++
>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  2 ++
>>>>>  3 files changed, 22 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> index 39229ece83f83..0596114377600 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>>>>> @@ -302,6 +302,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned int num_ibs,
>>>>>               return r;
>>>>>       }
>>>>>       *f = &af->base;
>>>>> +     /* get a ref for the job */
>>>>> +     dma_fence_get(*f);
>>>>
>>>> I think it would be better to set the fence inside the job to NULL as soon as it is consumed/initialized.
>>>
>>> We need the pointer for the job timed out handling.
>>
>> I don't think that is true. During a timeout we should have job->s_fence->parent for the HW fence.
> 
> We also need to keep it around for job_submit_direct() so we can free
> the IBs used for that.

Good point, but the handling here is really not straightforward.

Anyway, feel free to add my rb for now, but we need to revisit that at some point.

Regards,
Christian.

> 
> Alex
> 
>>
>> But even when we go down that route here, you only grab a reference to the hw_fence but not the hw_vm_fence.
>>
>> That looks broken to me.
>>
>> Christian.
>>
>>>
>>> Alex
>>>
>>>>
>>>>>
>>>>>       if (ring->funcs->insert_end)
>>>>>               ring->funcs->insert_end(ring);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> index 55c7e104d5ca0..dc970f5fe601b 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> @@ -295,6 +295,15 @@ static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
>>>>>
>>>>>       amdgpu_sync_free(&job->explicit_sync);
>>>>>
>>>>> +     if (job->hw_fence->base.ops)
>>>>> +             dma_fence_put(&job->hw_fence->base);
>>>>> +     else
>>>>> +             kfree(job->hw_fence);
>>>>> +     if (job->hw_vm_fence->base.ops)
>>>>> +             dma_fence_put(&job->hw_vm_fence->base);
>>>>> +     else
>>>>> +             kfree(job->hw_vm_fence);
>>>>> +
>>>>
>>>> This way that here can just be a kfree(..).
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>       kfree(job);
>>>>>  }
>>>>>
>>>>> @@ -324,6 +333,15 @@ void amdgpu_job_free(struct amdgpu_job *job)
>>>>>       if (job->gang_submit != &job->base.s_fence->scheduled)
>>>>>               dma_fence_put(job->gang_submit);
>>>>>
>>>>> +     if (job->hw_fence->base.ops)
>>>>> +             dma_fence_put(&job->hw_fence->base);
>>>>> +     else
>>>>> +             kfree(job->hw_fence);
>>>>> +     if (job->hw_vm_fence->base.ops)
>>>>> +             dma_fence_put(&job->hw_vm_fence->base);
>>>>> +     else
>>>>> +             kfree(job->hw_vm_fence);
>>>>> +
>>>>>       kfree(job);
>>>>>  }
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> index db66b4232de02..f8c67840f446f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> @@ -845,6 +845,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
>>>>>               if (r)
>>>>>                       return r;
>>>>>               fence = &job->hw_vm_fence->base;
>>>>> +             /* get a ref for the job */
>>>>> +             dma_fence_get(fence);
>>>>>       }
>>>>>
>>>>>       if (vm_flush_needed) {
>>>>
>>
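
For readers following the thread, the ownership rule the patch relies on can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the driver code: every mock_* name below is hypothetical, the
refcount is a plain int rather than a kref, and free() stands in for kfree().
The idea shown is the one from the quoted hunks: a fence whose ops pointer is
still NULL was never initialized and is freed directly, while an initialized
fence is released by dropping the extra reference the job took when it was
emitted.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for struct dma_fence and its ops; not kernel code. */
struct mock_fence_ops {
	const char *name;
};

struct mock_fence {
	const struct mock_fence_ops *ops; /* NULL until the fence is initialized */
	int refcount;                     /* plain int here; the kernel uses a kref */
};

struct mock_job {
	struct mock_fence *hw_fence;
	struct mock_fence *hw_vm_fence;
};

static const struct mock_fence_ops mock_ops = { "mock" };
static int live_fences; /* fences allocated and not yet freed */

static struct mock_fence *mock_fence_alloc(void)
{
	struct mock_fence *f = calloc(1, sizeof(*f));

	if (f)
		live_fences++;
	return f;
}

/* Initializing hands one reference to the caller and sets ops. */
static void mock_fence_init(struct mock_fence *f)
{
	f->ops = &mock_ops;
	f->refcount = 1;
}

static void mock_fence_get(struct mock_fence *f)
{
	f->refcount++;
}

static void mock_fence_put(struct mock_fence *f)
{
	if (--f->refcount == 0) {
		free(f);
		live_fences--;
	}
}

/* Mirrors the "if (ops) put; else free" pattern from the quoted patch. */
static void mock_job_release_fence(struct mock_fence *f)
{
	if (!f)
		return;
	if (f->ops) {
		mock_fence_put(f); /* initialized: drop the job's reference */
	} else {
		free(f);           /* never initialized: plain free */
		live_fences--;
	}
}

static void mock_job_free(struct mock_job *job)
{
	mock_job_release_fence(job->hw_fence);
	mock_job_release_fence(job->hw_vm_fence);
}

/* Returns the number of fences still allocated after the job is freed. */
int mock_demo_leaked_fences(void)
{
	struct mock_job job = { mock_fence_alloc(), mock_fence_alloc() };

	if (!job.hw_fence || !job.hw_vm_fence) {
		mock_job_free(&job);
		return -1;
	}

	mock_fence_init(job.hw_fence); /* hw_fence is emitted */
	mock_fence_get(job.hw_fence);  /* extra reference held by the job */
	mock_fence_put(job.hw_fence);  /* the submitter drops its reference */
	/* hw_vm_fence is deliberately never initialized (no VM flush needed). */

	mock_job_free(&job);
	return live_fences;
}
```

Calling mock_demo_leaked_fences() from a small main() exercises both branches:
the emitted fence goes away through the final put, and the untouched one
through the direct free. Without the extra get, the submitter's put would
already free the fence and the job's later put would touch freed memory, which
is why the patch pairs each put in the free paths with a get at emit time.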


