* [PATCH] drm/sched/tests: Use one lock for fence context
From: Philipp Stanner @ 2025-05-21 10:04 UTC
To: Matthew Brost, Danilo Krummrich, Philipp Stanner, Christian König, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal, Tvrtko Ursulin
Cc: dri-devel, linux-kernel, linux-media

When the unit tests were implemented, each scheduler job got its own,
distinct lock. This is not how dma_fence context locking rules are to be
implemented. All jobs belonging to the same fence context (in this case:
scheduler) should share a lock for their dma_fences. This is to comply
with various dma_fence rules, e.g., ensuring that only one fence gets
signaled at a time.

Use the fence context (scheduler) lock for the jobs.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
 drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 5 ++---
 drivers/gpu/drm/scheduler/tests/sched_tests.h    | 1 -
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
index f999c8859cf7..17023276f4b0 100644
--- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
@@ -64,7 +64,7 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
 
 	job->flags |= DRM_MOCK_SCHED_JOB_DONE;
 	list_move_tail(&job->link, &sched->done_list);
-	dma_fence_signal(&job->hw_fence);
+	dma_fence_signal_locked(&job->hw_fence);
 	complete(&job->done);
 }
 
@@ -123,7 +123,6 @@ drm_mock_sched_job_new(struct kunit *test,
 	job->test = test;
 
 	init_completion(&job->done);
-	spin_lock_init(&job->lock);
 	INIT_LIST_HEAD(&job->link);
 	hrtimer_setup(&job->timer, drm_mock_sched_job_signal_timer,
 		      CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
@@ -169,7 +168,7 @@ static struct dma_fence *mock_sched_run_job(struct drm_sched_job *sched_job)
 
 	dma_fence_init(&job->hw_fence,
 		       &drm_mock_sched_hw_fence_ops,
-		       &job->lock,
+		       &sched->lock,
 		       sched->hw_timeline.context,
 		       atomic_inc_return(&sched->hw_timeline.next_seqno));
 
diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
index 27caf8285fb7..fbba38137f0c 100644
--- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
+++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
@@ -106,7 +106,6 @@ struct drm_mock_sched_job {
 	unsigned int		duration_us;
 	ktime_t			finish_at;
 
-	spinlock_t		lock;
 	struct dma_fence	hw_fence;
 
 	struct kunit		*test;
-- 
2.49.0
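For readers following the locking change, here is a minimal userspace sketch of what the patch does: every fence created by one scheduler points at that scheduler's lock, and the completion path signals with the "_locked" variant while holding it. This is not the kernel code. All toy_* names are hypothetical, the lock is a single-threaded stand-in that only tracks whether it is held (in the spirit of lockdep_assert_held()), and the assumption that the completion path already holds the scheduler lock is inferred from the switch to dma_fence_signal_locked().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for spinlock_t: it only tracks whether it is held. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* a real spinlock would deadlock here */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Toy stand-in for struct dma_fence. */
struct toy_fence {
	struct toy_lock *lock;	/* may be shared by many fences */
	uint64_t context;
	uint64_t seqno;
	bool signaled;
};

/* Toy stand-in for the mock scheduler: one lock, one timeline. */
struct toy_sched {
	struct toy_lock lock;
	uint64_t context;
	uint64_t next_seqno;
	unsigned int done_count;
};

/* Caller must already hold f->lock. Returns false if already signaled. */
static bool toy_fence_signal_locked(struct toy_fence *f)
{
	assert(f->lock->held);
	if (f->signaled)
		return false;
	f->signaled = true;
	return true;
}

/* Takes f->lock itself, so it must not be called with the lock held. */
static bool toy_fence_signal(struct toy_fence *f)
{
	bool ret;

	toy_lock_acquire(f->lock);
	ret = toy_fence_signal_locked(f);
	toy_lock_release(f->lock);
	return ret;
}

/* Every fence of this scheduler gets the scheduler's lock, not its own. */
static void toy_sched_run_job(struct toy_sched *s, struct toy_fence *f)
{
	f->lock = &s->lock;
	f->context = s->context;
	f->seqno = ++s->next_seqno;
	f->signaled = false;
}

/* Completion path: bookkeeping and signaling happen under one lock. */
static void toy_sched_complete_job(struct toy_sched *s, struct toy_fence *f)
{
	toy_lock_acquire(&s->lock);
	s->done_count++;
	toy_fence_signal_locked(f);	/* toy_fence_signal() would trip the assert */
	toy_lock_release(&s->lock);
}
```

In this model, calling toy_fence_signal() from inside toy_sched_complete_job() would try to take a lock that is already held, which is why sharing the lock and switching to the locked signaling variant go together.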
* Re: [PATCH] drm/sched/tests: Use one lock for fence context
From: Tvrtko Ursulin @ 2025-05-21 10:24 UTC
To: Philipp Stanner, Matthew Brost, Danilo Krummrich, Christian König, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal
Cc: dri-devel, linux-kernel, linux-media

On 21/05/2025 11:04, Philipp Stanner wrote:
> When the unit tests were implemented, each scheduler job got its own,
> distinct lock. This is not how dma_fence context locking rules are to be
> implemented. All jobs belonging to the same fence context (in this case:
> scheduler) should share a lock for their dma_fences. This is to comply
> with various dma_fence rules, e.g., ensuring that only one fence gets
> signaled at a time.
>
> Use the fence context (scheduler) lock for the jobs.

I think for the mock scheduler it works to share the lock, but I don't
think the commit message is correct. Where do you see the
requirement to share the lock? AFAIK fence->lock is a fence lock,
nothing more semantically.

And what does "ensuring that only one fence gets signalled at a time"
mean? You mean signal in seqno order? Even that is not guaranteed in the
contract due to opportunistic signalling.

Regards,

Tvrtko

> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
>  drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 5 ++---
>  drivers/gpu/drm/scheduler/tests/sched_tests.h    | 1 -
>  2 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> index f999c8859cf7..17023276f4b0 100644
> --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> @@ -64,7 +64,7 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
>
>  	job->flags |= DRM_MOCK_SCHED_JOB_DONE;
>  	list_move_tail(&job->link, &sched->done_list);
> -	dma_fence_signal(&job->hw_fence);
> +	dma_fence_signal_locked(&job->hw_fence);
>  	complete(&job->done);
>  }
>
> @@ -123,7 +123,6 @@ drm_mock_sched_job_new(struct kunit *test,
>  	job->test = test;
>
>  	init_completion(&job->done);
> -	spin_lock_init(&job->lock);
>  	INIT_LIST_HEAD(&job->link);
>  	hrtimer_setup(&job->timer, drm_mock_sched_job_signal_timer,
>  		      CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> @@ -169,7 +168,7 @@ static struct dma_fence *mock_sched_run_job(struct drm_sched_job *sched_job)
>
>  	dma_fence_init(&job->hw_fence,
>  		       &drm_mock_sched_hw_fence_ops,
> -		       &job->lock,
> +		       &sched->lock,
>  		       sched->hw_timeline.context,
>  		       atomic_inc_return(&sched->hw_timeline.next_seqno));
>
> diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
> index 27caf8285fb7..fbba38137f0c 100644
> --- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
> +++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
> @@ -106,7 +106,6 @@ struct drm_mock_sched_job {
>  	unsigned int		duration_us;
>  	ktime_t			finish_at;
>
> -	spinlock_t		lock;
>  	struct dma_fence	hw_fence;
>
>  	struct kunit		*test;
* Re: [PATCH] drm/sched/tests: Use one lock for fence context
From: Philipp Stanner @ 2025-05-22 14:06 UTC
To: Tvrtko Ursulin, Philipp Stanner, Matthew Brost, Danilo Krummrich, Christian König, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal
Cc: dri-devel, linux-kernel, linux-media

On Wed, 2025-05-21 at 11:24 +0100, Tvrtko Ursulin wrote:
>
> On 21/05/2025 11:04, Philipp Stanner wrote:
> > When the unit tests were implemented, each scheduler job got its own,
> > distinct lock. This is not how dma_fence context locking rules are to be
> > implemented. All jobs belonging to the same fence context (in this case:
> > scheduler) should share a lock for their dma_fences. This is to comply
> > with various dma_fence rules, e.g., ensuring that only one fence gets
> > signaled at a time.
> >
> > Use the fence context (scheduler) lock for the jobs.
>
> I think for the mock scheduler it works to share the lock, but I don't
> think the commit message is correct. Where do you see the
> requirement to share the lock? AFAIK fence->lock is a fence lock,
> nothing more semantically.

This patch is in part to probe a bit with Christian and Danilo to see
whether we can get a bit more clarity about it.

In many places, notably Nouveau, it's definitely well established
practice to use one lock for the fctx and all the jobs associated with
it.

>
> And what does "ensuring that only one fence gets signalled at a time"
> mean? You mean signal in seqno order?

Yes. But that's related. If jobs' fences can get signaled independently
from each other, that might race and screw up ordering. A common lock
can prevent that.

> Even that is not guaranteed in the
> contract due to opportunistic signalling.

Jobs must be submitted to the hardware in the order they were
submitted, and, therefore, their fences must be signaled in order. No?

What do you mean by opportunistic signaling?

P.

>
> Regards,
>
> Tvrtko
>
> > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > ---
> >  drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 5 ++---
> >  drivers/gpu/drm/scheduler/tests/sched_tests.h    | 1 -
> >  2 files changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > index f999c8859cf7..17023276f4b0 100644
> > --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > @@ -64,7 +64,7 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
> >
> >  	job->flags |= DRM_MOCK_SCHED_JOB_DONE;
> >  	list_move_tail(&job->link, &sched->done_list);
> > -	dma_fence_signal(&job->hw_fence);
> > +	dma_fence_signal_locked(&job->hw_fence);
> >  	complete(&job->done);
> >  }
> >
> > @@ -123,7 +123,6 @@ drm_mock_sched_job_new(struct kunit *test,
> >  	job->test = test;
> >
> >  	init_completion(&job->done);
> > -	spin_lock_init(&job->lock);
> >  	INIT_LIST_HEAD(&job->link);
> >  	hrtimer_setup(&job->timer, drm_mock_sched_job_signal_timer,
> >  		      CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> > @@ -169,7 +168,7 @@ static struct dma_fence *mock_sched_run_job(struct drm_sched_job *sched_job)
> >
> >  	dma_fence_init(&job->hw_fence,
> >  		       &drm_mock_sched_hw_fence_ops,
> > -		       &job->lock,
> > +		       &sched->lock,
> >  		       sched->hw_timeline.context,
> >  		       atomic_inc_return(&sched->hw_timeline.next_seqno));
> >
> > diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
> > index 27caf8285fb7..fbba38137f0c 100644
> > --- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
> > +++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
> > @@ -106,7 +106,6 @@ struct drm_mock_sched_job {
> >  	unsigned int		duration_us;
> >  	ktime_t			finish_at;
> >
> > -	spinlock_t		lock;
> >  	struct dma_fence	hw_fence;
> >
> >  	struct kunit		*test;
>
* Re: [PATCH] drm/sched/tests: Use one lock for fence context
From: Tvrtko Ursulin @ 2025-05-23 14:58 UTC
To: phasta, Matthew Brost, Danilo Krummrich, Christian König, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal
Cc: dri-devel, linux-kernel, linux-media

On 22/05/2025 15:06, Philipp Stanner wrote:
> On Wed, 2025-05-21 at 11:24 +0100, Tvrtko Ursulin wrote:
>>
>> On 21/05/2025 11:04, Philipp Stanner wrote:
>>> When the unit tests were implemented, each scheduler job got its own,
>>> distinct lock. This is not how dma_fence context locking rules are to be
>>> implemented. All jobs belonging to the same fence context (in this case:
>>> scheduler) should share a lock for their dma_fences. This is to comply
>>> with various dma_fence rules, e.g., ensuring that only one fence gets
>>> signaled at a time.
>>>
>>> Use the fence context (scheduler) lock for the jobs.
>>
>> I think for the mock scheduler it works to share the lock, but I don't
>> think the commit message is correct. Where do you see the
>> requirement to share the lock? AFAIK fence->lock is a fence lock,
>> nothing more semantically.
>
> This patch is in part to probe a bit with Christian and Danilo to see
> whether we can get a bit more clarity about it.
>
> In many places, notably Nouveau, it's definitely well established
> practice to use one lock for the fctx and all the jobs associated with
> it.
>
>>
>> And what does "ensuring that only one fence gets signalled at a time"
>> mean? You mean signal in seqno order?
>
> Yes. But that's related. If jobs' fences can get signaled independently
> from each other, that might race and screw up ordering. A common lock
> can prevent that.
>
>> Even that is not guaranteed in the
>> contract due to opportunistic signalling.
>
> Jobs must be submitted to the hardware in the order they were
> submitted, and, therefore, their fences must be signaled in order. No?
>
> What do you mean by opportunistic signaling?

Our beloved dma_fence_is_signaled(). An external caller can signal a fence
before the driver which owns it does.

If you change the commit message to correctly describe that it is just a
simplification, since there is no need for separate locks, I am good with
that. It is a good simplification in that case.

Regards,

Tvrtko

>
> P.
>
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Signed-off-by: Philipp Stanner <phasta@kernel.org>
>>> ---
>>>  drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 5 ++---
>>>  drivers/gpu/drm/scheduler/tests/sched_tests.h    | 1 -
>>>  2 files changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> index f999c8859cf7..17023276f4b0 100644
>>> --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> @@ -64,7 +64,7 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
>>>
>>>  	job->flags |= DRM_MOCK_SCHED_JOB_DONE;
>>>  	list_move_tail(&job->link, &sched->done_list);
>>> -	dma_fence_signal(&job->hw_fence);
>>> +	dma_fence_signal_locked(&job->hw_fence);
>>>  	complete(&job->done);
>>>  }
>>>
>>> @@ -123,7 +123,6 @@ drm_mock_sched_job_new(struct kunit *test,
>>>  	job->test = test;
>>>
>>>  	init_completion(&job->done);
>>> -	spin_lock_init(&job->lock);
>>>  	INIT_LIST_HEAD(&job->link);
>>>  	hrtimer_setup(&job->timer, drm_mock_sched_job_signal_timer,
>>>  		      CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
>>> @@ -169,7 +168,7 @@ static struct dma_fence *mock_sched_run_job(struct drm_sched_job *sched_job)
>>>
>>>  	dma_fence_init(&job->hw_fence,
>>>  		       &drm_mock_sched_hw_fence_ops,
>>> -		       &job->lock,
>>> +		       &sched->lock,
>>>  		       sched->hw_timeline.context,
>>>  		       atomic_inc_return(&sched->hw_timeline.next_seqno));
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
>>> index 27caf8285fb7..fbba38137f0c 100644
>>> --- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
>>> +++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
>>> @@ -106,7 +106,6 @@ struct drm_mock_sched_job {
>>>  	unsigned int		duration_us;
>>>  	ktime_t			finish_at;
>>>
>>> -	spinlock_t		lock;
>>>  	struct dma_fence	hw_fence;
>>>
>>>  	struct kunit		*test;
>>
>
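The opportunistic signalling mentioned in this last reply can be illustrated with a small userspace model. It is only a sketch under stated assumptions: the toy_* names are hypothetical, locking is left out entirely, and toy_hw_done() stands in for a driver's optional ops->signaled callback. The one behaviour taken from the kernel is that dma_fence_is_signaled() may itself signal the fence when that callback reports completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy timeline: hw_seqno is the last seqno the "hardware" has completed. */
struct toy_timeline {
	uint64_t hw_seqno;
	uint64_t signal_count;	/* how many fences were signaled so far */
};

struct toy_fence {
	struct toy_timeline *tl;
	uint64_t seqno;
	bool signaled;
	uint64_t signal_order;	/* 1 = signaled first, 2 = second, ... */
};

/* Mark the fence as signaled and record when that happened. */
static void toy_fence_signal(struct toy_fence *f)
{
	if (f->signaled)
		return;
	f->signaled = true;
	f->signal_order = ++f->tl->signal_count;
}

/* Stand-in for ops->signaled: ask the hardware state directly. */
static bool toy_hw_done(const struct toy_fence *f)
{
	return f->tl->hw_seqno >= f->seqno;
}

/*
 * Stand-in for dma_fence_is_signaled(): any caller polling the fence may
 * end up signaling it, without the owning driver being involved.
 */
static bool toy_fence_is_signaled(struct toy_fence *f)
{
	if (f->signaled)
		return true;
	if (toy_hw_done(f)) {
		toy_fence_signal(f);
		return true;
	}
	return false;
}
```

If the hardware has completed seqno 2 and an external caller polls only the second fence, that fence gets signaled first and the first fence is signaled later by the driver. The software-visible signaling order is then 2 before 1 even though the hardware finished the jobs in order, and a shared lock does not change that.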