AMD-GFX Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: "Liang, Prike" <Prike.Liang@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>
Subject: Re: [PATCH v3 4/5] drm/amdgpu: validate the eviction fence before attaching/detaching
Date: Thu, 8 May 2025 11:40:01 +0200
Message-ID: <cbf48eb1-cd47-4851-b307-ee3cc65d4ed7@amd.com>
In-Reply-To: <DS7PR12MB600529CF3069303E7F088C35FB8BA@DS7PR12MB6005.namprd12.prod.outlook.com>

On 5/8/25 09:08, Liang, Prike wrote:
> [Public]
> 
>> From: Koenig, Christian <Christian.Koenig@amd.com>
>> Sent: Tuesday, May 6, 2025 4:39 PM
>> To: Liang, Prike <Prike.Liang@amd.com>; amd-gfx@lists.freedesktop.org
>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>
>> Subject: Re: [PATCH v3 4/5] drm/amdgpu: validate the eviction fence before
>> attaching/detaching
>>
>> On 5/6/25 10:22, Liang, Prike wrote:
>>>>> -   /* attach gfx eviction fence */
>>>>> +   /* attach gfx the validated eviction fence */
>>>>>     r = amdgpu_eviction_fence_attach(&fpriv->evf_mgr, abo);
>>>>>     if (r) {
>>>>>             DRM_DEBUG_DRIVER("Failed to attach eviction fence to BO\n");
>>>>> +           amdgpu_bo_unreserve(abo);
>>>> Adding this here looks like the only valid fix in the patch.
>>> As the eviction fence will be invalidated until the user queue is created from the
>>> user space, here it requires validating the eviction fence before trying to attach
>>> and detach it to the reservation.
>>> I will try to draft a patch for validating the eviction fence at attach/detach
>>> separately with this attach error handler change.
>>
>>
>> No, that is clearly incorrect.
>>
>> See the eviction fence works like this:
>>
>> Validating thread
>> * Create new eviction fence
>> * Publish eviction fence
>> * Lock all BOs
>> * Replace eviction fence
>>
>> Attaching:
>> * Lock BO
>> * Attach current eviction fence
>> * Unlock BO
>>
>> Detaching:
>> * Lock BO
>> * Unconditionally detach all possible eviction fences, no matter if new or old.
>> * Unlock BO
>>
>> This order is necessary or otherwise you break the logic here.
>>
>> Any additional check will completely mess that up because it makes the operation
>> racy.
> As the user queue eviction fence isn't created until user queue submission, the eviction fence will be NULL without a userq submission. So do we still try to attach/detach the NULL eviction fence for the kernel queue case?

Yes, the problem is that we can't check the eviction fence before we have taken the reservation lock.

Otherwise an eviction fence can always be created between the check and the attach.
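
To illustrate the window described here, this is a minimal userspace sketch, not amdgpu code. The types `struct fence`, `struct bo` and `struct evf_mgr` and both attach helpers are hypothetical stand-ins, and a pthread mutex models the reservation lock. The first helper decides before taking the lock, so a fence published right after the check is missed; the second takes the lock first and attaches whatever is current.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the amdgpu structures. */
struct fence { int id; };

struct bo {
	pthread_mutex_t resv_lock;  /* models the reservation lock */
	struct fence *attached;     /* fence attached to this BO */
};

struct evf_mgr {
	struct fence *cur;          /* currently published eviction fence */
};

/*
 * Racy variant: the decision is made before the lock is taken, so a
 * fence published after the check is never attached by this call.
 * Returns 1 if a fence was attached, 0 if the attach was skipped.
 */
static int attach_checked_unlocked(struct evf_mgr *mgr, struct bo *bo)
{
	assert(mgr && bo);
	if (mgr->cur == NULL)       /* check outside the lock */
		return 0;
	pthread_mutex_lock(&bo->resv_lock);
	bo->attached = mgr->cur;
	pthread_mutex_unlock(&bo->resv_lock);
	return 1;
}

/* Ordering from the mail: lock the BO, attach the current fence, unlock. */
static void attach_locked(struct evf_mgr *mgr, struct bo *bo)
{
	assert(mgr && bo);
	pthread_mutex_lock(&bo->resv_lock);
	bo->attached = mgr->cur;
	pthread_mutex_unlock(&bo->resv_lock);
}
```

If the fence is published between the `NULL` check and the point where the lock would have been taken, `attach_checked_unlocked()` leaves the BO without it, which is the interleaving that makes the extra check racy.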

I also suggested before that the eviction fence should never be NULL; we just start with a dummy stub fence (see function dma_fence_get_stub()). This way we can avoid all the NULL checks.
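
A rough userspace model of the stub idea, under the assumption that the stub behaves as an already-signaled fence. `fence_get_stub()`, `evf_mgr_init()` and `evf_mgr_needs_wait()` are hypothetical names for this sketch only; in the kernel the real helper is dma_fence_get_stub().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a fence; not struct dma_fence. */
struct fence {
	bool signaled;
	int refcount;
};

/* One global stub that is signaled from the start. */
static struct fence stub_fence = { .signaled = true, .refcount = 1 };

/* Models the idea behind dma_fence_get_stub(): take a reference on the stub. */
static struct fence *fence_get_stub(void)
{
	stub_fence.refcount++;
	return &stub_fence;
}

struct evf_mgr {
	struct fence *ev_fence;     /* never NULL after init */
};

/* Start with the stub instead of NULL. */
static void evf_mgr_init(struct evf_mgr *mgr)
{
	mgr->ev_fence = fence_get_stub();
}

/* Callers can dereference unconditionally; no NULL special case. */
static bool evf_mgr_needs_wait(const struct evf_mgr *mgr)
{
	assert(mgr->ev_fence != NULL);
	return !mgr->ev_fence->signaled;
}
```

With this shape, attaching or detaching before any user queue exists just handles an already-signaled fence, so the code path is the same in both cases.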

> It's OK to skip validating the eviction fence or user queue work before attaching/detaching the eviction fence, but it will cost cycles walking the reservation fence array in dma_resv_reserve_fences() and dma_resv_replace_fences().

That's completely irrelevant. What matters is that we have the right sequence so that we don't create a race condition.
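
The attach and detach steps from the sequence quoted above can be modeled in userspace roughly as follows. This is only a sketch: `struct bo`, its fixed-size fence array and the `bo_*` helpers are hypothetical, and an integer flag stands in for the reservation lock. The point it shows is that detach removes every eviction fence it finds, old or new, without checking which one is current.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FENCES 4

/* Hypothetical model types; not the amdgpu or dma_resv structures. */
struct fence {
	int is_eviction;
	int seqno;
};

struct bo {
	int locked;                        /* models the reservation lock */
	struct fence *fences[MAX_FENCES];  /* models the reservation's fence slots */
};

static void bo_lock(struct bo *bo)   { assert(!bo->locked); bo->locked = 1; }
static void bo_unlock(struct bo *bo) { assert(bo->locked);  bo->locked = 0; }

/* Attaching: lock the BO, add the given fence to a free slot, unlock. */
static void bo_attach(struct bo *bo, struct fence *cur)
{
	bo_lock(bo);
	for (size_t i = 0; i < MAX_FENCES; i++) {
		if (bo->fences[i] == NULL) {
			bo->fences[i] = cur;
			break;
		}
	}
	bo_unlock(bo);
}

/* Detaching: lock the BO, drop every eviction fence, old or new, unlock. */
static void bo_detach_all(struct bo *bo)
{
	bo_lock(bo);
	for (size_t i = 0; i < MAX_FENCES; i++) {
		if (bo->fences[i] && bo->fences[i]->is_eviction)
			bo->fences[i] = NULL;
	}
	bo_unlock(bo);
}

/* Helper for inspection: number of eviction fences still on the BO. */
static size_t bo_count_eviction(const struct bo *bo)
{
	size_t n = 0;

	for (size_t i = 0; i < MAX_FENCES; i++)
		if (bo->fences[i] && bo->fences[i]->is_eviction)
			n++;
	return n;
}
```

Because `bo_detach_all()` makes no decision based on which fence is current, there is nothing for a concurrent replace to invalidate between a check and the removal.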

Regards,
Christian.


> 
>> Regards,
>> Christian.
>>
>>>
>>> Thanks,
>>> Prike
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>             return r;
>>>>>     }
>>>>>
> 

