From: Danilo Krummrich <dakr@redhat.com>
To: "Luben Tuikov" <luben.tuikov@amd.com>,
"Lucas Stach" <l.stach@pengutronix.de>,
daniel@ffwll.ch, "Dave Airlie" <airlied@gmail.com>,
"Bagas Sanjaya" <bagasdotme@gmail.com>,
andrey.grodzovsky@amd.com,
"Christian König" <christian.koenig@amd.com>,
tvrtko.ursulin@linux.intel.com,
"Matthew Brost" <matthew.brost@intel.com>,
yuq825@gmail.com,
"Boris Brezillon" <boris.brezillon@collabora.com>,
lina@asahilina.net
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [Regression] drm/scheduler: track GPU active time per entity
Date: Wed, 5 Apr 2023 16:05:53 +0200 [thread overview]
Message-ID: <3004a2bf-e725-643e-82af-8a217784e796@redhat.com> (raw)
In-Reply-To: <8b28151c-f2db-af3f-8dff-87dd5d57417b@amd.com>
On 4/4/23 06:31, Luben Tuikov wrote:
> On 2023-03-28 04:54, Lucas Stach wrote:
>> Hi Danilo,
>>
>> Am Dienstag, dem 28.03.2023 um 02:57 +0200 schrieb Danilo Krummrich:
>>> Hi all,
>>>
>>> Commit df622729ddbf ("drm/scheduler: track GPU active time per entity")
>>> tries to track the accumulated time that a job was active on the GPU
>>> writing it to the entity through which the job was deployed to the
>>> scheduler originally. This is done within drm_sched_get_cleanup_job()
>>> which fetches a job from the schedulers pending_list.
>>>
>>> Doing this can result in a race condition where the entity is already
>>> freed, but the entity's newly added elapsed_ns field is still accessed
>>> once the job is fetched from the pending_list.
>>>
>>> After drm_sched_entity_destroy() being called it should be safe to free
>>> the structure that embeds the entity. However, a job originally handed
>>> over to the scheduler by this entity might still reside in the
>>> schedulers pending_list for cleanup after drm_sched_entity_destroy()
>>> already being called and the entity being freed. Hence, we can run into
>>> a UAF.
>>>
>> Sorry about that, I clearly didn't properly consider this case.
>>
>>> In my case it happened that a job, as explained above, was just picked
>>> from the schedulers pending_list after the entity was freed due to the
>>> client application exiting. Meanwhile this freed up memory was already
>>> allocated for a subsequent client applications job structure again.
>>> Hence, the new jobs memory got corrupted. Luckily, I was able to
>>> reproduce the same corruption over and over again by just using
>>> deqp-runner to run a specific set of VK test cases in parallel.
>>>
>>> Fixing this issue doesn't seem to be very straightforward though (unless
>>> I miss something), which is why I'm writing this mail instead of sending
>>> a fix directly.
>>>
>>> Spontaneously, I see three options to fix it:
>>>
>>> 1. Rather than embedding the entity into driver specific structures
>>> (e.g. tied to file_priv) we could allocate the entity separately and
>>> reference count it, such that it's only freed up once all jobs that were
>>> deployed through this entity are fetched from the schedulers pending list.
>>>
>> My vote is on this or something in similar vain for the long term. I
>> have some hope to be able to add a GPU scheduling algorithm with a bit
>> more fairness than the current one sometime in the future, which
>> requires execution time tracking on the entities.
>
> Danilo,
>
> Using kref is preferable, i.e. option 1 above.
I think the only real motivation for doing that would be for generically
tracking job statistics within the entity a job was deployed through. If
we all agree on tracking job statistics this way I am happy to prepare a
patch for this option and drop this one:
https://lore.kernel.org/all/20230331000622.4156-1-dakr@redhat.com/T/#u
Christian mentioned amdgpu tried something similar to what Lucas tried
running into similar trouble, backed off and implemented it in another
way - a driver specific way I guess?
>
> Lucas, can you shed some light on,
>
> 1. In what way the current FIFO scheduling is unfair, and
> 2. shed some details on this "scheduling algorithm with a bit
> more fairness than the current one"?
>
> Regards,
> Luben
>
>>
>>> 2. Somehow make sure drm_sched_entity_destroy() does block until all
>>> jobs deployed through this entity were fetched from the schedulers
>>> pending list. Though, I'm pretty sure that this is not really desirable.
>>>
>>> 3. Just revert the change and let drivers implement tracking of GPU
>>> active times themselves.
>>>
>> Given that we are already pretty late in the release cycle and etnaviv
>> being the only driver so far making use of the scheduler elapsed time
>> tracking I think the right short term solution is to either move the
>> tracking into etnaviv or just revert the change for now. I'll have a
>> look at this.
>>
>> Regards,
>> Lucas
>>
>>> In the case of just reverting the change I'd propose to also set a jobs
>>> entity pointer to NULL once the job was taken from the entity, such
>>> that in case of a future issue we fail where the actual issue resides
>>> and to make it more obvious that the field shouldn't be used anymore
>>> after the job was taken from the entity.
>>>
>>> I'm happy to implement the solution we agree on. However, it might also
>>> make sense to revert the change until we have a solution in place. I'm
>>> also happy to send a revert with a proper description of the problem.
>>> Please let me know what you think.
>>>
>>> - Danilo
>>>
>>
>
next prev parent reply other threads:[~2023-04-05 15:38 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-28 0:57 [Regression] drm/scheduler: track GPU active time per entity Danilo Krummrich
2023-03-28 8:54 ` Lucas Stach
2023-04-04 4:31 ` Luben Tuikov
2023-04-05 14:05 ` Danilo Krummrich [this message]
2023-04-05 16:09 ` Luben Tuikov
2023-04-05 16:23 ` Danilo Krummrich
2023-04-06 8:22 ` Christian König
2023-04-06 8:27 ` Daniel Vetter
2023-04-06 9:05 ` Asahi Lina
2023-04-06 10:09 ` Daniel Vetter
2023-04-06 12:21 ` Asahi Lina
2023-04-06 13:37 ` Daniel Vetter
2023-04-06 14:43 ` Asahi Lina
2023-04-06 15:30 ` Daniel Vetter
2023-04-06 17:11 ` Asahi Lina
2023-04-06 17:42 ` Daniel Vetter
2023-04-06 17:46 ` Daniel Vetter
2023-04-06 10:45 ` Lucas Stach
2023-04-06 12:09 ` Daniel Vetter
2023-04-06 12:19 ` Lucas Stach
2023-04-06 13:56 ` Daniel Vetter
2023-04-06 14:21 ` Christian König
2023-04-06 15:24 ` Lucas Stach
2023-04-06 15:33 ` Christian König
2023-04-06 15:59 ` Daniel Vetter
2023-04-06 16:23 ` Daniel Vetter
2023-04-06 22:36 ` Luben Tuikov
2023-04-06 22:30 ` Luben Tuikov
2023-04-05 17:44 ` Lucas Stach
2023-04-05 20:44 ` Luben Tuikov
2023-04-06 15:58 ` Lucas Stach
2023-04-06 22:06 ` Luben Tuikov
[not found] <e3e07282-136f-5239-96ae-4aeb2d3c95d7.ref@yahoo.de>
[not found] ` <e3e07282-136f-5239-96ae-4aeb2d3c95d7@yahoo.de>
2023-03-28 10:29 ` Christian König
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3004a2bf-e725-643e-82af-8a217784e796@redhat.com \
--to=dakr@redhat.com \
--cc=airlied@gmail.com \
--cc=andrey.grodzovsky@amd.com \
--cc=bagasdotme@gmail.com \
--cc=boris.brezillon@collabora.com \
--cc=christian.koenig@amd.com \
--cc=daniel@ffwll.ch \
--cc=dri-devel@lists.freedesktop.org \
--cc=l.stach@pengutronix.de \
--cc=lina@asahilina.net \
--cc=linux-kernel@vger.kernel.org \
--cc=luben.tuikov@amd.com \
--cc=matthew.brost@intel.com \
--cc=tvrtko.ursulin@linux.intel.com \
--cc=yuq825@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox