* [PATCH] drm/sched: Call drm_sched_fence_set_parent() from drm_sched_fence_scheduled()
@ 2023-06-13 9:44 Boris Brezillon
2023-06-13 9:46 ` Boris Brezillon
0 siblings, 1 reply; 4+ messages in thread
From: Boris Brezillon @ 2023-06-13 9:44 UTC (permalink / raw)
To: dri-devel
Cc: Luben Tuikov, Sarah Walker, Christian König, Boris Brezillon,
Donald Robson, Sumit Semwal

Drivers that can delegate waits to the firmware/GPU pass the scheduled
fence to drm_sched_job_add_dependency(), and issue wait commands to
the firmware/GPU at job submission time. For this to be possible, they
need all their 'native' dependencies to have a valid parent since this
is where the actual HW fence information are encoded.

In drm_sched_main(), we currently call drm_sched_fence_set_parent()
after drm_sched_fence_set_parent(), leaving a short period of time
during which the job depending on this fence can be submitted.

Since setting parent and signaling the fence are two things that are
kinda related (you can't have a parent if the job hasn't been scheduled),
it probably makes sense to pass the parent fence to
drm_sched_fence_scheduled() and let it call drm_sched_fence_set_parent()
before it signals the scheduled fence.

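
To make the ordering argument concrete, here is a minimal userspace sketch,
assuming C11 atomics stand in for the kernel primitives. Every name in it
(model_sched_fence, hw_fence, model_fence_scheduled(), ...) is hypothetical
and is not part of the DRM scheduler API. It only models the two orderings
discussed above: signal-then-set-parent (current) and set-parent-then-signal
(proposed).

```c
/* Userspace model only: none of these names exist in the kernel. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stands in for the driver's HW fence (the "parent"). */
struct hw_fence {
	unsigned int seqno;
};

/* Stands in for struct drm_sched_fence. */
struct model_sched_fence {
	_Atomic(struct hw_fence *) parent;
	atomic_bool scheduled;	/* models the signaled 'scheduled' fence */
};

static void model_fence_init(struct model_sched_fence *f)
{
	atomic_init(&f->parent, NULL);
	atomic_init(&f->scheduled, false);
}

/* Current order, step 1: signal 'scheduled' without a parent. */
static void model_signal_only(struct model_sched_fence *f)
{
	atomic_store_explicit(&f->scheduled, true, memory_order_release);
}

/* Current order, step 2: the parent shows up later. */
static void model_set_parent(struct model_sched_fence *f,
			     struct hw_fence *parent)
{
	atomic_store_explicit(&f->parent, parent, memory_order_release);
}

/* Proposed order: publish the parent first, then signal 'scheduled'. */
static void model_fence_scheduled(struct model_sched_fence *f,
				  struct hw_fence *parent)
{
	if (parent)
		model_set_parent(f, parent);
	model_signal_only(f);
}

/*
 * Waiter side: returns true once 'scheduled' is signaled and stores
 * whatever parent is visible at that point in *out (possibly NULL).
 */
static bool model_dep_scheduled(struct model_sched_fence *f,
				struct hw_fence **out)
{
	*out = NULL;
	if (!atomic_load_explicit(&f->scheduled, memory_order_acquire))
		return false;
	*out = atomic_load_explicit(&f->parent, memory_order_acquire);
	return true;
}
```

With the current order, a waiter calling model_dep_scheduled() between
model_signal_only() and model_set_parent() sees a scheduled fence with a
NULL parent, which is the window described above. With
model_fence_scheduled(), a waiter that observes 'scheduled' also observes
the parent, provided the job actually returned one.
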
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Frank Binns <frank.binns@imgtec.com>
Cc: Sarah Walker <sarah.walker@imgtec.com>
Cc: Donald Robson <donald.robson@imgtec.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
---
Christian, that's the last bit remaining from [1] after your suggestion
to pass scheduled fences for those native-deps we have. It does feel
like setting the parent after signaling the fence is racy, but you might
have a good reason to do it in that order. If that's the case, could you
help us find a solution for the race exposed here?
[1] https://lore.kernel.org/dri-devel/20230612182530.6214caf3@collabora.com/T/#t
---
drivers/gpu/drm/scheduler/sched_fence.c | 40 +++++++++++++++----------
drivers/gpu/drm/scheduler/sched_main.c | 3 +-
include/drm/gpu_scheduler.h | 5 ++--
3 files changed, 28 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index ef120475e7c6..06cedfe4b486 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -48,8 +48,32 @@ static void __exit drm_sched_fence_slab_fini(void)
 	kmem_cache_destroy(sched_fence_slab);
 }
-void drm_sched_fence_scheduled(struct drm_sched_fence *fence)
+static void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
+				       struct dma_fence *fence)
 {
+	/*
+	 * smp_store_release() to ensure another thread racing us
+	 * in drm_sched_fence_set_deadline_finished() sees the
+	 * fence's parent set before test_bit()
+	 */
+	smp_store_release(&s_fence->parent, dma_fence_get(fence));
+	if (test_bit(DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT,
+		     &s_fence->finished.flags))
+		dma_fence_set_deadline(fence, s_fence->deadline);
+}
+
+void drm_sched_fence_scheduled(struct drm_sched_fence *fence,
+			       struct dma_fence *parent)
+{
+	/* Set the parent before signaling the scheduled fence, such that,
+	 * any waiter expecting the parent to be filled after the job has
+	 * been scheduled (which is the case for drivers delegating waits
+	 * to some firmware) doesn't have to busy wait for parent to show
+	 * up.
+	 */
+	if (!IS_ERR_OR_NULL(parent))
+		drm_sched_fence_set_parent(fence, parent);
+
 	dma_fence_signal(&fence->scheduled);
 }
@@ -181,20 +205,6 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
 }
 EXPORT_SYMBOL(to_drm_sched_fence);
-void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
-				struct dma_fence *fence)
-{
-	/*
-	 * smp_store_release() to ensure another thread racing us
-	 * in drm_sched_fence_set_deadline_finished() sees the
-	 * fence's parent set before test_bit()
-	 */
-	smp_store_release(&s_fence->parent, dma_fence_get(fence));
-	if (test_bit(DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT,
-		     &s_fence->finished.flags))
-		dma_fence_set_deadline(fence, s_fence->deadline);
-}
-
 struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
 					      void *owner)
 {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 394010a60821..27097772ad6e 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1043,10 +1043,9 @@ static int drm_sched_main(void *param)
 		trace_drm_run_job(sched_job, entity);
 		fence = sched->ops->run_job(sched_job);
 		complete_all(&entity->entity_idle);
-		drm_sched_fence_scheduled(s_fence);
+		drm_sched_fence_scheduled(s_fence, fence);
 		if (!IS_ERR_OR_NULL(fence)) {
-			drm_sched_fence_set_parent(s_fence, fence);
 			/* Drop for original kref_init of the fence */
 			dma_fence_put(fence);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index e95b4837e5a3..f9544d9b670d 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -583,15 +583,14 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
 int drm_sched_entity_error(struct drm_sched_entity *entity);
-void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
-				struct dma_fence *fence);
 struct drm_sched_fence *drm_sched_fence_alloc(
 	struct drm_sched_entity *s_entity, void *owner);
 void drm_sched_fence_init(struct drm_sched_fence *fence,
 			  struct drm_sched_entity *entity);
 void drm_sched_fence_free(struct drm_sched_fence *fence);
-void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
+void drm_sched_fence_scheduled(struct drm_sched_fence *fence,
+			       struct dma_fence *parent);
 void drm_sched_fence_finished(struct drm_sched_fence *fence, int result);
 unsigned long drm_sched_suspend_timeout(struct drm_gpu_scheduler *sched);
--
2.40.1
* Re: [PATCH] drm/sched: Call drm_sched_fence_set_parent() from drm_sched_fence_scheduled()
2023-06-13 9:44 [PATCH] drm/sched: Call drm_sched_fence_set_parent() from drm_sched_fence_scheduled() Boris Brezillon
@ 2023-06-13 9:46 ` Boris Brezillon
2023-06-13 11:06 ` Christian König
0 siblings, 1 reply; 4+ messages in thread
From: Boris Brezillon @ 2023-06-13 9:46 UTC (permalink / raw)
To: dri-devel
Cc: Sarah Walker, Christian König, Luben Tuikov, Donald Robson,
Sumit Semwal
On Tue, 13 Jun 2023 11:44:24 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:
> Drivers that can delegate waits to the firmware/GPU pass the scheduled
> fence to drm_sched_job_add_dependency(), and issue wait commands to
> the firmware/GPU at job submission time. For this to be possible, they
> need all their 'native' dependencies to have a valid parent since this
> is where the actual HW fence information are encoded.
>
> In drm_sched_main(), we currently call drm_sched_fence_set_parent()
> after drm_sched_fence_set_parent(), leaving a short period of time
after drm_sched_fence_scheduled(), ...
> during which the job depending on this fence can be submitted.
>
> Since setting parent and signaling the fence are two things that are
> kinda related (you can't have a parent if the job hasn't been scheduled),
> it probably makes sense to pass the parent fence to
> drm_sched_fence_scheduled() and let it call drm_sched_fence_set_parent()
> before it signals the scheduled fence.
* Re: [PATCH] drm/sched: Call drm_sched_fence_set_parent() from drm_sched_fence_scheduled()
2023-06-13 9:46 ` Boris Brezillon
@ 2023-06-13 11:06 ` Christian König
2023-06-21 14:21 ` Boris Brezillon
0 siblings, 1 reply; 4+ messages in thread
From: Christian König @ 2023-06-13 11:06 UTC (permalink / raw)
To: Boris Brezillon, dri-devel
Cc: Sarah Walker, Luben Tuikov, Donald Robson, Sumit Semwal
Am 13.06.23 um 11:46 schrieb Boris Brezillon:
> On Tue, 13 Jun 2023 11:44:24 +0200
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
>
>> Drivers that can delegate waits to the firmware/GPU pass the scheduled
>> fence to drm_sched_job_add_dependency(), and issue wait commands to
>> the firmware/GPU at job submission time. For this to be possible, they
>> need all their 'native' dependencies to have a valid parent since this
>> is where the actual HW fence information are encoded.
>>
>> In drm_sched_main(), we currently call drm_sched_fence_set_parent()
>> after drm_sched_fence_set_parent(), leaving a short period of time
> after drm_sched_fence_scheduled(), ...
I was just about to complain, but yeah sounds like the right idea to me.
Just let me review the patch in more detail.
Christian.
>
>> during which the job depending on this fence can be submitted.
>>
>> Since setting parent and signaling the fence are two things that are
>> kinda related (you can't have a parent if the job hasn't been scheduled),
>> it probably makes sense to pass the parent fence to
>> drm_sched_fence_scheduled() and let it call drm_sched_fence_set_parent()
>> before it signals the scheduled fence.
* Re: [PATCH] drm/sched: Call drm_sched_fence_set_parent() from drm_sched_fence_scheduled()
2023-06-13 11:06 ` Christian König
@ 2023-06-21 14:21 ` Boris Brezillon
0 siblings, 0 replies; 4+ messages in thread
From: Boris Brezillon @ 2023-06-21 14:21 UTC (permalink / raw)
To: Christian König
Cc: Sarah Walker, dri-devel, Luben Tuikov, Donald Robson,
Sumit Semwal
Hi Christian,
On Tue, 13 Jun 2023 13:06:06 +0200
Christian König <christian.koenig@amd.com> wrote:
> Am 13.06.23 um 11:46 schrieb Boris Brezillon:
> > On Tue, 13 Jun 2023 11:44:24 +0200
> > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> >
> >> Drivers that can delegate waits to the firmware/GPU pass the scheduled
> >> fence to drm_sched_job_add_dependency(), and issue wait commands to
> >> the firmware/GPU at job submission time. For this to be possible, they
> >> need all their 'native' dependencies to have a valid parent since this
> >> is where the actual HW fence information are encoded.
> >>
> >> In drm_sched_main(), we currently call drm_sched_fence_set_parent()
> >> after drm_sched_fence_set_parent(), leaving a short period of time
> > after drm_sched_fence_scheduled(), ...
>
> I was just about to complain, but yeah sounds like the right idea to me.
>
> Just let me review the patch in more detail.
Did you have time to look at this patch in more detail? Should I send a
v2 fixing the mistake in the commit message?
Regards,
Boris