[PATCH v7 0/2] drm/panthor: Scheduler fixes for termination failure and timeouts

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v7 0/2] drm/panthor: Scheduler fixes for termination failure and timeouts
@ 2025-11-12 12:17 Boris Brezillon
  2025-11-12 12:17 ` [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job Boris Brezillon
  2025-11-12 12:17 ` [PATCH v7 2/2] drm/panthor: Reset queue slots if termination fails Boris Brezillon
  0 siblings, 2 replies; 10+ messages in thread
From: Boris Brezillon @ 2025-11-12 12:17 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, dri-devel,
	Detlev Casanova, Ashley Smith, kernel

This patch series includes previously attempted patches to fix panthor
scheduler issues with spurious timeouts and issues when a termination
failed which would lead to a race condition.

Timeout recovery has been tested with some IGT tests issuing jobs with
infinite loops [1]. It's certainly not enough to claim that everything
works as it should, but that's still more testing than we had so far
;-).

[1]https://gitlab.freedesktop.org/bbrezillon/igt-gpu-tools/-/commit/15c3ee220808a437a76638bd21fedfb4498a434f

Changes in v7:
 - Add Steve's R-b
 - Use the local group variable when we can

Changes in v6:
 - Re-order changes
 - Dropped the Fixes tag on one patch
 - Cover UAF situation when the timeout work is pending/running at group
   destruction time
Changes in v5:
 - Swiched to a patch series to make sure the patch which addresses the
   bug is added as a requirement on the scheduler patch.
Changes in v4:
 - Moved code related to a timeout bug to a separate patch as this
   was not relevant to this change.
Changes in v3:
 - Moved to a patch series to make sure this bug fix happens before the
   changes to the scheduler
Changes in v2:
 - Fixed syntax error
*** BLURB HERE ***

Ashley Smith (2):
  drm/panthor: Make the timeout per-queue instead of per-job
  drm/panthor: Reset queue slots if termination fails

 drivers/gpu/drm/panthor/panthor_sched.c | 299 +++++++++++++++++-------
 1 file changed, 218 insertions(+), 81 deletions(-)

-- 
2.51.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job
  2025-11-12 12:17 [PATCH v7 0/2] drm/panthor: Scheduler fixes for termination failure and timeouts Boris Brezillon
@ 2025-11-12 12:17 ` Boris Brezillon
  2025-11-12 12:38   ` Philipp Stanner
  2025-11-12 12:17 ` [PATCH v7 2/2] drm/panthor: Reset queue slots if termination fails Boris Brezillon
  1 sibling, 1 reply; 10+ messages in thread
From: Boris Brezillon @ 2025-11-12 12:17 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, dri-devel,
	Detlev Casanova, Ashley Smith, kernel

From: Ashley Smith <ashley.smith@collabora.com>

The timeout logic provided by drm_sched leads to races when we try
to suspend it while the drm_sched workqueue queues more jobs. Let's
overhaul the timeout handling in panthor to have our own delayed work
that's resumed/suspended when a group is resumed/suspended. When an
actual timeout occurs, we call drm_sched_fault() to report it
through drm_sched, still. But otherwise, the drm_sched timeout is
disabled (set to MAX_SCHEDULE_TIMEOUT), which leaves us in control of
how we protect modifications on the timer.

One issue seems to be when we call drm_sched_suspend_timeout() from
both queue_run_job() and tick_work() which could lead to races due to
drm_sched_suspend_timeout() not having a lock. Another issue seems to
be in queue_run_job() if the group is not scheduled, we suspend the
timeout again which undoes what drm_sched_job_begin() did when calling
drm_sched_start_timeout(). So the timeout does not reset when a job
is finished.

v2:
- Fix syntax error

v3:
- Split the changes in two commits

v4:
- No changes

v5:
- No changes

v6:
- Fix a NULL deref in group_can_run(), and narrow the group variable
  scope to avoid such mistakes in the future
- Add an queue_timeout_is_suspended() helper to clarify things

Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Co-developed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_sched.c | 285 +++++++++++++++++-------
 1 file changed, 206 insertions(+), 79 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index e74ca071159d..b440187798dd 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -364,17 +364,20 @@ struct panthor_queue {
 	/** @name: DRM scheduler name for this queue. */
 	char *name;
 
-	/**
-	 * @remaining_time: Time remaining before the job timeout expires.
-	 *
-	 * The job timeout is suspended when the queue is not scheduled by the
-	 * FW. Every time we suspend the timer, we need to save the remaining
-	 * time so we can restore it later on.
-	 */
-	unsigned long remaining_time;
+	/** @timeout: Queue timeout related fields. */
+	struct {
+		/** @timeout.work: Work executed when a queue timeout occurs. */
+		struct delayed_work work;
 
-	/** @timeout_suspended: True if the job timeout was suspended. */
-	bool timeout_suspended;
+		/**
+		 * @timeout.remaining: Time remaining before a queue timeout.
+		 *
+		 * When the timer is running, this value is set to MAX_SCHEDULE_TIMEOUT.
+		 * When the timer is suspended, it's set to the time remaining when the
+		 * timer was suspended.
+		 */
+		unsigned long remaining;
+	} timeout;
 
 	/**
 	 * @doorbell_id: Doorbell assigned to this queue.
@@ -899,6 +902,10 @@ static void group_free_queue(struct panthor_group *group, struct panthor_queue *
 	if (IS_ERR_OR_NULL(queue))
 		return;
 
+	/* This should have been disabled before that point. */
+	drm_WARN_ON(&group->ptdev->base,
+		    disable_delayed_work_sync(&queue->timeout.work));
+
 	if (queue->entity.fence_context)
 		drm_sched_entity_destroy(&queue->entity);
 
@@ -1046,6 +1053,115 @@ group_unbind_locked(struct panthor_group *group)
 	return 0;
 }
 
+static bool
+group_is_idle(struct panthor_group *group)
+{
+	struct panthor_device *ptdev = group->ptdev;
+	u32 inactive_queues;
+
+	if (group->csg_id >= 0)
+		return ptdev->scheduler->csg_slots[group->csg_id].idle;
+
+	inactive_queues = group->idle_queues | group->blocked_queues;
+	return hweight32(inactive_queues) == group->queue_count;
+}
+
+static void
+queue_reset_timeout_locked(struct panthor_queue *queue)
+{
+	lockdep_assert_held(&queue->fence_ctx.lock);
+
+	if (queue->timeout.remaining != MAX_SCHEDULE_TIMEOUT) {
+		mod_delayed_work(queue->scheduler.timeout_wq,
+				 &queue->timeout.work,
+				 msecs_to_jiffies(JOB_TIMEOUT_MS));
+	}
+}
+
+static bool
+group_can_run(struct panthor_group *group)
+{
+	return group->state != PANTHOR_CS_GROUP_TERMINATED &&
+	       group->state != PANTHOR_CS_GROUP_UNKNOWN_STATE &&
+	       !group->destroyed && group->fatal_queues == 0 &&
+	       !group->timedout;
+}
+
+static bool
+queue_timeout_is_suspended(struct panthor_queue *queue)
+{
+	/* When running, the remaining time is set to MAX_SCHEDULE_TIMEOUT. */
+	return queue->timeout.remaining != MAX_SCHEDULE_TIMEOUT;
+}
+
+static void
+queue_suspend_timeout_locked(struct panthor_queue *queue)
+{
+	unsigned long qtimeout, now;
+	struct panthor_group *group;
+	struct panthor_job *job;
+	bool timer_was_active;
+
+	lockdep_assert_held(&queue->fence_ctx.lock);
+
+	/* Already suspended, nothing to do. */
+	if (queue_timeout_is_suspended(queue))
+		return;
+
+	job = list_first_entry_or_null(&queue->fence_ctx.in_flight_jobs,
+				       struct panthor_job, node);
+	group = job ? job->group : NULL;
+
+	/* If the queue is blocked and the group is idle, we want the timer to
+	 * keep running because the group can't be unblocked by other queues,
+	 * so it has to come from an external source, and we want to timebox
+	 * this external signalling.
+	 */
+	if (group && group_can_run(group) &&
+	    (group->blocked_queues & BIT(job->queue_idx)) &&
+	    group_is_idle(group))
+		return;
+
+	now = jiffies;
+	qtimeout = queue->timeout.work.timer.expires;
+
+	/* Cancel the timer. */
+	timer_was_active = cancel_delayed_work(&queue->timeout.work);
+	if (!timer_was_active || !job)
+		queue->timeout.remaining = msecs_to_jiffies(JOB_TIMEOUT_MS);
+	else if (time_after(qtimeout, now))
+		queue->timeout.remaining = qtimeout - now;
+	else
+		queue->timeout.remaining = 0;
+
+	if (WARN_ON_ONCE(queue->timeout.remaining > msecs_to_jiffies(JOB_TIMEOUT_MS)))
+		queue->timeout.remaining = msecs_to_jiffies(JOB_TIMEOUT_MS);
+}
+
+static void
+queue_suspend_timeout(struct panthor_queue *queue)
+{
+	spin_lock(&queue->fence_ctx.lock);
+	queue_suspend_timeout_locked(queue);
+	spin_unlock(&queue->fence_ctx.lock);
+}
+
+static void
+queue_resume_timeout(struct panthor_queue *queue)
+{
+	spin_lock(&queue->fence_ctx.lock);
+
+	if (queue_timeout_is_suspended(queue)) {
+		mod_delayed_work(queue->scheduler.timeout_wq,
+				 &queue->timeout.work,
+				 queue->timeout.remaining);
+
+		queue->timeout.remaining = MAX_SCHEDULE_TIMEOUT;
+	}
+
+	spin_unlock(&queue->fence_ctx.lock);
+}
+
 /**
  * cs_slot_prog_locked() - Program a queue slot
  * @ptdev: Device.
@@ -1084,10 +1200,8 @@ cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id)
 			       CS_IDLE_EMPTY |
 			       CS_STATE_MASK |
 			       CS_EXTRACT_EVENT);
-	if (queue->iface.input->insert != queue->iface.input->extract && queue->timeout_suspended) {
-		drm_sched_resume_timeout(&queue->scheduler, queue->remaining_time);
-		queue->timeout_suspended = false;
-	}
+	if (queue->iface.input->insert != queue->iface.input->extract)
+		queue_resume_timeout(queue);
 }
 
 /**
@@ -1114,14 +1228,7 @@ cs_slot_reset_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id)
 			       CS_STATE_STOP,
 			       CS_STATE_MASK);
 
-	/* If the queue is blocked, we want to keep the timeout running, so
-	 * we can detect unbounded waits and kill the group when that happens.
-	 */
-	if (!(group->blocked_queues & BIT(cs_id)) && !queue->timeout_suspended) {
-		queue->remaining_time = drm_sched_suspend_timeout(&queue->scheduler);
-		queue->timeout_suspended = true;
-		WARN_ON(queue->remaining_time > msecs_to_jiffies(JOB_TIMEOUT_MS));
-	}
+	queue_suspend_timeout(queue);
 
 	return 0;
 }
@@ -1916,28 +2023,6 @@ tick_ctx_is_full(const struct panthor_scheduler *sched,
 	return ctx->group_count == sched->csg_slot_count;
 }
 
-static bool
-group_is_idle(struct panthor_group *group)
-{
-	struct panthor_device *ptdev = group->ptdev;
-	u32 inactive_queues;
-
-	if (group->csg_id >= 0)
-		return ptdev->scheduler->csg_slots[group->csg_id].idle;
-
-	inactive_queues = group->idle_queues | group->blocked_queues;
-	return hweight32(inactive_queues) == group->queue_count;
-}
-
-static bool
-group_can_run(struct panthor_group *group)
-{
-	return group->state != PANTHOR_CS_GROUP_TERMINATED &&
-	       group->state != PANTHOR_CS_GROUP_UNKNOWN_STATE &&
-	       !group->destroyed && group->fatal_queues == 0 &&
-	       !group->timedout;
-}
-
 static void
 tick_ctx_pick_groups_from_list(const struct panthor_scheduler *sched,
 			       struct panthor_sched_tick_ctx *ctx,
@@ -2619,6 +2704,7 @@ static void group_schedule_locked(struct panthor_group *group, u32 queue_mask)
 static void queue_stop(struct panthor_queue *queue,
 		       struct panthor_job *bad_job)
 {
+	disable_delayed_work_sync(&queue->timeout.work);
 	drm_sched_stop(&queue->scheduler, bad_job ? &bad_job->base : NULL);
 }
 
@@ -2630,6 +2716,7 @@ static void queue_start(struct panthor_queue *queue)
 	list_for_each_entry(job, &queue->scheduler.pending_list, base.list)
 		job->base.s_fence->parent = dma_fence_get(job->done_fence);
 
+	enable_delayed_work(&queue->timeout.work);
 	drm_sched_start(&queue->scheduler, 0);
 }
 
@@ -2696,7 +2783,6 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
 {
 	struct panthor_scheduler *sched = ptdev->scheduler;
 	struct panthor_csg_slots_upd_ctx upd_ctx;
-	struct panthor_group *group;
 	u32 suspended_slots;
 	u32 i;
 
@@ -2786,8 +2872,8 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
 
 	for (i = 0; i < sched->csg_slot_count; i++) {
 		struct panthor_csg_slot *csg_slot = &sched->csg_slots[i];
+		struct panthor_group *group = csg_slot->group;
 
-		group = csg_slot->group;
 		if (!group)
 			continue;
 
@@ -2916,35 +3002,47 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
 	xa_unlock(&gpool->xa);
 }
 
-static void group_sync_upd_work(struct work_struct *work)
+static bool queue_check_job_completion(struct panthor_queue *queue)
 {
-	struct panthor_group *group =
-		container_of(work, struct panthor_group, sync_upd_work);
+	struct panthor_syncobj_64b *syncobj = NULL;
 	struct panthor_job *job, *job_tmp;
+	bool cookie, progress = false;
 	LIST_HEAD(done_jobs);
-	u32 queue_idx;
-	bool cookie;
 
 	cookie = dma_fence_begin_signalling();
-	for (queue_idx = 0; queue_idx < group->queue_count; queue_idx++) {
-		struct panthor_queue *queue = group->queues[queue_idx];
-		struct panthor_syncobj_64b *syncobj;
+	spin_lock(&queue->fence_ctx.lock);
+	list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, node) {
+		if (!syncobj) {
+			struct panthor_group *group = job->group;
 
-		if (!queue)
-			continue;
-
-		syncobj = group->syncobjs->kmap + (queue_idx * sizeof(*syncobj));
-
-		spin_lock(&queue->fence_ctx.lock);
-		list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, node) {
-			if (syncobj->seqno < job->done_fence->seqno)
-				break;
-
-			list_move_tail(&job->node, &done_jobs);
-			dma_fence_signal_locked(job->done_fence);
+			syncobj = group->syncobjs->kmap +
+				  (job->queue_idx * sizeof(*syncobj));
 		}
-		spin_unlock(&queue->fence_ctx.lock);
+
+		if (syncobj->seqno < job->done_fence->seqno)
+			break;
+
+		list_move_tail(&job->node, &done_jobs);
+		dma_fence_signal_locked(job->done_fence);
 	}
+
+	if (list_empty(&queue->fence_ctx.in_flight_jobs)) {
+		/* If we have no job left, we cancel the timer, and reset remaining
+		 * time to its default so it can be restarted next time
+		 * queue_resume_timeout() is called.
+		 */
+		queue_suspend_timeout_locked(queue);
+
+		/* If there's no job pending, we consider it progress to avoid a
+		 * spurious timeout if the timeout handler and the sync update
+		 * handler raced.
+		 */
+		progress = true;
+	} else if (!list_empty(&done_jobs)) {
+		queue_reset_timeout_locked(queue);
+		progress = true;
+	}
+	spin_unlock(&queue->fence_ctx.lock);
 	dma_fence_end_signalling(cookie);
 
 	list_for_each_entry_safe(job, job_tmp, &done_jobs, node) {
@@ -2954,6 +3052,27 @@ static void group_sync_upd_work(struct work_struct *work)
 		panthor_job_put(&job->base);
 	}
 
+	return progress;
+}
+
+static void group_sync_upd_work(struct work_struct *work)
+{
+	struct panthor_group *group =
+		container_of(work, struct panthor_group, sync_upd_work);
+	u32 queue_idx;
+	bool cookie;
+
+	cookie = dma_fence_begin_signalling();
+	for (queue_idx = 0; queue_idx < group->queue_count; queue_idx++) {
+		struct panthor_queue *queue = group->queues[queue_idx];
+
+		if (!queue)
+			continue;
+
+		queue_check_job_completion(queue);
+	}
+	dma_fence_end_signalling(cookie);
+
 	group_put(group);
 }
 
@@ -3201,17 +3320,6 @@ queue_run_job(struct drm_sched_job *sched_job)
 	queue->iface.input->insert = job->ringbuf.end;
 
 	if (group->csg_id < 0) {
-		/* If the queue is blocked, we want to keep the timeout running, so we
-		 * can detect unbounded waits and kill the group when that happens.
-		 * Otherwise, we suspend the timeout so the time we spend waiting for
-		 * a CSG slot is not counted.
-		 */
-		if (!(group->blocked_queues & BIT(job->queue_idx)) &&
-		    !queue->timeout_suspended) {
-			queue->remaining_time = drm_sched_suspend_timeout(&queue->scheduler);
-			queue->timeout_suspended = true;
-		}
-
 		group_schedule_locked(group, BIT(job->queue_idx));
 	} else {
 		gpu_write(ptdev, CSF_DOORBELL(queue->doorbell_id), 1);
@@ -3220,6 +3328,7 @@ queue_run_job(struct drm_sched_job *sched_job)
 			pm_runtime_get(ptdev->base.dev);
 			sched->pm.has_ref = true;
 		}
+		queue_resume_timeout(queue);
 		panthor_devfreq_record_busy(sched->ptdev);
 	}
 
@@ -3270,6 +3379,11 @@ queue_timedout_job(struct drm_sched_job *sched_job)
 
 	queue_start(queue);
 
+	/* We already flagged the queue as faulty, make sure we don't get
+	 * called again.
+	 */
+	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;
+
 	return DRM_GPU_SCHED_STAT_RESET;
 }
 
@@ -3312,6 +3426,17 @@ static u32 calc_profiling_ringbuf_num_slots(struct panthor_device *ptdev,
 	return DIV_ROUND_UP(cs_ringbuf_size, min_profiled_job_instrs * sizeof(u64));
 }
 
+static void queue_timeout_work(struct work_struct *work)
+{
+	struct panthor_queue *queue = container_of(work, struct panthor_queue,
+						   timeout.work.work);
+	bool progress;
+
+	progress = queue_check_job_completion(queue);
+	if (!progress)
+		drm_sched_fault(&queue->scheduler);
+}
+
 static struct panthor_queue *
 group_create_queue(struct panthor_group *group,
 		   const struct drm_panthor_queue_create *args,
@@ -3328,7 +3453,7 @@ group_create_queue(struct panthor_group *group,
 		 * their profiling status.
 		 */
 		.credit_limit = args->ringbuf_size / sizeof(u64),
-		.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
+		.timeout = MAX_SCHEDULE_TIMEOUT,
 		.timeout_wq = group->ptdev->reset.wq,
 		.dev = group->ptdev->base.dev,
 	};
@@ -3350,6 +3475,8 @@ group_create_queue(struct panthor_group *group,
 	if (!queue)
 		return ERR_PTR(-ENOMEM);
 
+	queue->timeout.remaining = msecs_to_jiffies(JOB_TIMEOUT_MS);
+	INIT_DELAYED_WORK(&queue->timeout.work, queue_timeout_work);
 	queue->fence_ctx.id = dma_fence_context_alloc(1);
 	spin_lock_init(&queue->fence_ctx.lock);
 	INIT_LIST_HEAD(&queue->fence_ctx.in_flight_jobs);
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v7 2/2] drm/panthor: Reset queue slots if termination fails
  2025-11-12 12:17 [PATCH v7 0/2] drm/panthor: Scheduler fixes for termination failure and timeouts Boris Brezillon
  2025-11-12 12:17 ` [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job Boris Brezillon
@ 2025-11-12 12:17 ` Boris Brezillon
  1 sibling, 0 replies; 10+ messages in thread
From: Boris Brezillon @ 2025-11-12 12:17 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, dri-devel,
	Detlev Casanova, Ashley Smith, kernel

From: Ashley Smith <ashley.smith@collabora.com>

Make sure the queue slot is reset even if we failed termination so
we don't have garbage in the CS input interface after a reset. In
practice that's not a problem because we zero out all RW sections when
a hangs occurs, but it's safer to reset things manually, in case we
decide to not conditionally reload RW sections based on the type of
hang.

v4:
- Split the changes in two separate patches

v5:
- No changes

v6:
- Adjust the explanation in the commit message
- Drop the Fixes tag
- Put after the timeout changes and make the two patches independent
  so one can be backported, and the other not

v7:
- Use the local group variable instead of dereferencing csg_slot->group
- Add Steve's R-b

Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_sched.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index b440187798dd..36c75c39e3e4 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -2836,13 +2836,23 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
 		while (slot_mask) {
 			u32 csg_id = ffs(slot_mask) - 1;
 			struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
+			struct panthor_group *group = csg_slot->group;
 
 			/* Terminate command timedout, but the soft-reset will
 			 * automatically terminate all active groups, so let's
 			 * force the state to halted here.
 			 */
-			if (csg_slot->group->state != PANTHOR_CS_GROUP_TERMINATED)
-				csg_slot->group->state = PANTHOR_CS_GROUP_TERMINATED;
+			if (group->state != PANTHOR_CS_GROUP_TERMINATED) {
+				group->state = PANTHOR_CS_GROUP_TERMINATED;
+
+				/* Reset the queue slots manually if the termination
+				 * request failed.
+				 */
+				for (i = 0; i < group->queue_count; i++) {
+					if (group->queues[i])
+						cs_slot_reset_locked(ptdev, csg_id, i);
+				}
+			}
 			slot_mask &= ~BIT(csg_id);
 		}
 	}
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job
  2025-11-12 12:17 ` [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job Boris Brezillon
@ 2025-11-12 12:38   ` Philipp Stanner
  2025-11-12 13:31     ` Boris Brezillon
  0 siblings, 1 reply; 10+ messages in thread
From: Philipp Stanner @ 2025-11-12 12:38 UTC (permalink / raw)
  To: Boris Brezillon, Steven Price, Liviu Dudau, Adrián Larumbe
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, dri-devel,
	Detlev Casanova, Ashley Smith, kernel

On Wed, 2025-11-12 at 13:17 +0100, Boris Brezillon wrote:
> From: Ashley Smith <ashley.smith@collabora.com>
> 
> The timeout logic provided by drm_sched leads to races when we try
> to suspend it while the drm_sched workqueue queues more jobs. Let's
> overhaul the timeout handling in panthor to have our own delayed work
> that's resumed/suspended when a group is resumed/suspended. When an
> actual timeout occurs, we call drm_sched_fault() to report it
> through drm_sched, still. But otherwise, the drm_sched timeout is
> disabled (set to MAX_SCHEDULE_TIMEOUT), which leaves us in control of
> how we protect modifications on the timer.
> 
> One issue seems to be when we call drm_sched_suspend_timeout() from
> both queue_run_job() and tick_work() which could lead to races due to
> drm_sched_suspend_timeout() not having a lock. Another issue seems to
> be in queue_run_job() if the group is not scheduled, we suspend the
> timeout again which undoes what drm_sched_job_begin() did when calling
> drm_sched_start_timeout(). So the timeout does not reset when a job
> is finished.
> 
> 

[…]

> +
> +static void
> +queue_reset_timeout_locked(struct panthor_queue *queue)
> +{
> +	lockdep_assert_held(&queue->fence_ctx.lock);
> +
> +	if (queue->timeout.remaining != MAX_SCHEDULE_TIMEOUT) {
> +		mod_delayed_work(queue->scheduler.timeout_wq,

Here you are interfering with the scheduler's internals again, don't
you. I think we agreed that we don't want to do such things anymore,
didn't we?

You could write a proper drm_sched API function which serves your
usecase.

Or could maybe DRM_GPU_SCHED_STAT_NO_HANG be returned from your driver
in case an interrupt actually fires?

> +				 &queue->timeout.work,
> +				 msecs_to_jiffies(JOB_TIMEOUT_MS));
> +	}
> +}
> +
> +static bool
> +group_can_run(struct panthor_group *group)
> +{
> +	return group->state != PANTHOR_CS_GROUP_TERMINATED &&
> +	       group->state != PANTHOR_CS_GROUP_UNKNOWN_STATE &&
> +	       !group->destroyed && group->fatal_queues == 0 &&
> +	       !group->timedout;
> +}
> +
> 

[…]

> +queue_suspend_timeout(struct panthor_queue *queue)
> +{
> +	spin_lock(&queue->fence_ctx.lock);
> +	queue_suspend_timeout_locked(queue);
> +	spin_unlock(&queue->fence_ctx.lock);
> +}
> +
> +static void
> +queue_resume_timeout(struct panthor_queue *queue)
> +{
> +	spin_lock(&queue->fence_ctx.lock);
> +
> +	if (queue_timeout_is_suspended(queue)) {
> +		mod_delayed_work(queue->scheduler.timeout_wq,

There is drm_sched_resume_timeout(). Why can it not be used?

> +				 &queue->timeout.work,
> +				 queue->timeout.remaining);
> +
> +		queue->timeout.remaining = MAX_SCHEDULE_TIMEOUT;
> +	}
> +
> +	spin_unlock(&queue->fence_ctx.lock);
> +}
> +
> 

[…]

>  
> @@ -3270,6 +3379,11 @@ queue_timedout_job(struct drm_sched_job *sched_job)
>  
>  	queue_start(queue);
>  
> +	/* We already flagged the queue as faulty, make sure we don't get
> +	 * called again.
> +	 */
> +	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;
> +
>  	return DRM_GPU_SCHED_STAT_RESET;

DRM_GPU_SCHED_STAT_NO_HANG instead of just modifying the scheduler's
internal data??


P.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job
  2025-11-12 12:38   ` Philipp Stanner
@ 2025-11-12 13:31     ` Boris Brezillon
  2025-11-12 13:48       ` Philipp Stanner
  0 siblings, 1 reply; 10+ messages in thread
From: Boris Brezillon @ 2025-11-12 13:31 UTC (permalink / raw)
  To: Philipp Stanner
  Cc: Steven Price, Liviu Dudau, Adrián Larumbe, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, dri-devel, Detlev Casanova,
	Ashley Smith, kernel

Hi Philipp,

On Wed, 12 Nov 2025 13:38:00 +0100
Philipp Stanner <pstanner@redhat.com> wrote:

> On Wed, 2025-11-12 at 13:17 +0100, Boris Brezillon wrote:
> > From: Ashley Smith <ashley.smith@collabora.com>
> > 
> > The timeout logic provided by drm_sched leads to races when we try
> > to suspend it while the drm_sched workqueue queues more jobs. Let's
> > overhaul the timeout handling in panthor to have our own delayed work
> > that's resumed/suspended when a group is resumed/suspended. When an
> > actual timeout occurs, we call drm_sched_fault() to report it
> > through drm_sched, still. But otherwise, the drm_sched timeout is
> > disabled (set to MAX_SCHEDULE_TIMEOUT), which leaves us in control of
> > how we protect modifications on the timer.
> > 
> > One issue seems to be when we call drm_sched_suspend_timeout() from
> > both queue_run_job() and tick_work() which could lead to races due to
> > drm_sched_suspend_timeout() not having a lock. Another issue seems to
> > be in queue_run_job() if the group is not scheduled, we suspend the
> > timeout again which undoes what drm_sched_job_begin() did when calling
> > drm_sched_start_timeout(). So the timeout does not reset when a job
> > is finished.
> > 
> >   
> 
> […]
> 
> > +
> > +static void
> > +queue_reset_timeout_locked(struct panthor_queue *queue)
> > +{
> > +	lockdep_assert_held(&queue->fence_ctx.lock);
> > +
> > +	if (queue->timeout.remaining != MAX_SCHEDULE_TIMEOUT) {
> > +		mod_delayed_work(queue->scheduler.timeout_wq,  
> 
> Here you are interfering with the scheduler's internals again, don't
> you. I think we agreed that we don't want to do such things anymore,
> didn't we?

We're not really touching drm_gpu_scheduler's internals, we're just
retrieving the timeout workqueue we passed at init time:
panthor_queue::timeout is panthor internals not drm_sched internals.

This being said, I agree we should use ptdev->reset.wq instead of
retrieving the timeout workqueue through the drm_gpu_scheduler object.

Just to be clear, the goal of this patch is to bypass the
drm_gpu_scheduler timeout logic entirely, so we can have our own thing
that's not racy (see below).

> 
> You could write a proper drm_sched API function which serves your
> usecase.

It's not really lack of support for our usecase that drives this
change, but more the fact the current helpers are racy for drivers that
have a 1:1 entity:sched relationship with queues that can be scheduled
out behind drm_gpu_scheduler's back.

> 
> Or could maybe DRM_GPU_SCHED_STAT_NO_HANG be returned from your driver
> in case an interrupt actually fires?

I don't think it helps, see below.

> 
> > +				 &queue->timeout.work,
> > +				 msecs_to_jiffies(JOB_TIMEOUT_MS));
> > +	}
> > +}
> > +
> > +static bool
> > +group_can_run(struct panthor_group *group)
> > +{
> > +	return group->state != PANTHOR_CS_GROUP_TERMINATED &&
> > +	       group->state != PANTHOR_CS_GROUP_UNKNOWN_STATE &&
> > +	       !group->destroyed && group->fatal_queues == 0 &&
> > +	       !group->timedout;
> > +}
> > +
> >   
> 
> […]
> 
> > +queue_suspend_timeout(struct panthor_queue *queue)
> > +{
> > +	spin_lock(&queue->fence_ctx.lock);
> > +	queue_suspend_timeout_locked(queue);
> > +	spin_unlock(&queue->fence_ctx.lock);
> > +}
> > +
> > +static void
> > +queue_resume_timeout(struct panthor_queue *queue)
> > +{
> > +	spin_lock(&queue->fence_ctx.lock);
> > +
> > +	if (queue_timeout_is_suspended(queue)) {
> > +		mod_delayed_work(queue->scheduler.timeout_wq,  
> 
> There is drm_sched_resume_timeout(). Why can it not be used?

Because it's racy. I don't remember all the details, but IIRC, it had
to do with job insertion calling drm_sched_start_timeout() while we're
calling drm_sched_resume_timeout() from cs_slot_reset_locked(). We
tried to make it work, but we gave up at some point and went for a
driver-specific timer, because ultimately what we want is a per-queue
timeout that we can pause/resume without drm_sched interfering when new
jobs are queued to our ring buffers: we resume when the execution
context (AKA group) is scheduled in, and we pause when this execution
context is scheduled out.

That's the very reason we set drm_gpu_scheduler::timeout to
MAX_SCHEDULE_TIMEOUT at init time (AKA, timeout disabled) and never
touch that again. When our driver-internal timer expires, we forward
the information to the drm_sched layer by calling drm_sched_fault().

> 
> > +				 &queue->timeout.work,
> > +				 queue->timeout.remaining);
> > +
> > +		queue->timeout.remaining = MAX_SCHEDULE_TIMEOUT;
> > +	}
> > +
> > +	spin_unlock(&queue->fence_ctx.lock);
> > +}
> > +
> >   
> 
> […]
> 
> >  
> > @@ -3270,6 +3379,11 @@ queue_timedout_job(struct drm_sched_job *sched_job)
> >  
> >  	queue_start(queue);
> >  
> > +	/* We already flagged the queue as faulty, make sure we don't get
> > +	 * called again.
> > +	 */
> > +	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;
> > +
> >  	return DRM_GPU_SCHED_STAT_RESET;  
> 
> DRM_GPU_SCHED_STAT_NO_HANG instead of just modifying the scheduler's
> internal data??

	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;

is a leftover from a previous version. We shouldn't have to modify that
here because the timeout is initialized to MAX_SCHEDULE_TIMEOUT and
never touched again.

Regards,

Boris

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job
  2025-11-12 13:31     ` Boris Brezillon
@ 2025-11-12 13:48       ` Philipp Stanner
  2025-11-12 14:12         ` Boris Brezillon
  0 siblings, 1 reply; 10+ messages in thread
From: Philipp Stanner @ 2025-11-12 13:48 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Steven Price, Liviu Dudau, Adrián Larumbe, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, dri-devel, Detlev Casanova,
	Ashley Smith, kernel

On Wed, 2025-11-12 at 14:31 +0100, Boris Brezillon wrote:
> Hi Philipp,
> 
> On Wed, 12 Nov 2025 13:38:00 +0100
> Philipp Stanner <pstanner@redhat.com> wrote:
> 
> > On Wed, 2025-11-12 at 13:17 +0100, Boris Brezillon wrote:
> > > From: Ashley Smith <ashley.smith@collabora.com>
> > > 
> > > The timeout logic provided by drm_sched leads to races when we try
> > > to suspend it while the drm_sched workqueue queues more jobs. Let's
> > > overhaul the timeout handling in panthor to have our own delayed work
> > > that's resumed/suspended when a group is resumed/suspended. When an
> > > actual timeout occurs, we call drm_sched_fault() to report it
> > > through drm_sched, still. But otherwise, the drm_sched timeout is
> > > disabled (set to MAX_SCHEDULE_TIMEOUT), which leaves us in control of
> > > how we protect modifications on the timer.
> > > 
> > > One issue seems to be when we call drm_sched_suspend_timeout() from
> > > both queue_run_job() and tick_work() which could lead to races due to
> > > drm_sched_suspend_timeout() not having a lock. Another issue seems to
> > > be in queue_run_job() if the group is not scheduled, we suspend the
> > > timeout again which undoes what drm_sched_job_begin() did when calling
> > > drm_sched_start_timeout(). So the timeout does not reset when a job
> > > is finished.
> > > 
> > >   
> > 
> > […]
> > 
> > > +
> > > +static void
> > > +queue_reset_timeout_locked(struct panthor_queue *queue)
> > > +{
> > > +	lockdep_assert_held(&queue->fence_ctx.lock);
> > > +
> > > +	if (queue->timeout.remaining != MAX_SCHEDULE_TIMEOUT) {
> > > +		mod_delayed_work(queue->scheduler.timeout_wq,  
> > 
> > Here you are interfering with the scheduler's internals again, don't
> > you. I think we agreed that we don't want to do such things anymore,
> > didn't we?
> 
> We're not really touching drm_gpu_scheduler's internals, we're just
> retrieving the timeout workqueue we passed at init time:
> panthor_queue::timeout is panthor internals not drm_sched internals.
> 
> This being said, I agree we should use ptdev->reset.wq instead of
> retrieving the timeout workqueue through the drm_gpu_scheduler object.
> 
> Just to be clear, the goal of this patch is to bypass the
> drm_gpu_scheduler timeout logic entirely, so we can have our own thing
> that's not racy (see below).

OK. timeout_wq sharing is intended and allowed, so if that's what
you're doing, good. But I agree that accessing the wq through the
driver's struct is then cleaner and more obviously correct.

> 
> > 
> > You could write a proper drm_sched API function which serves your
> > usecase.
> 
> It's not really lack of support for our usecase that drives this
> change, but more the fact the current helpers are racy for drivers that
> have a 1:1 entity:sched relationship with queues that can be scheduled
> out behind drm_gpu_scheduler's back.

And you also can't stop drm_sched to prevent races?

> 
> > 
> > Or could maybe DRM_GPU_SCHED_STAT_NO_HANG be returned from your driver
> > in case an interrupt actually fires?
> 
> I don't think it helps, see below.
> 
> > 
> > > +				 &queue->timeout.work,
> > > +				 msecs_to_jiffies(JOB_TIMEOUT_MS));
> > > +	}
> > > +}
> > > +
> > > +static bool
> > > +group_can_run(struct panthor_group *group)
> > > +{
> > > +	return group->state != PANTHOR_CS_GROUP_TERMINATED &&
> > > +	       group->state != PANTHOR_CS_GROUP_UNKNOWN_STATE &&
> > > +	       !group->destroyed && group->fatal_queues == 0 &&
> > > +	       !group->timedout;
> > > +}
> > > +
> > >   
> > 
> > […]
> > 
> > > +queue_suspend_timeout(struct panthor_queue *queue)
> > > +{
> > > +	spin_lock(&queue->fence_ctx.lock);
> > > +	queue_suspend_timeout_locked(queue);
> > > +	spin_unlock(&queue->fence_ctx.lock);
> > > +}
> > > +
> > > +static void
> > > +queue_resume_timeout(struct panthor_queue *queue)
> > > +{
> > > +	spin_lock(&queue->fence_ctx.lock);
> > > +
> > > +	if (queue_timeout_is_suspended(queue)) {
> > > +		mod_delayed_work(queue->scheduler.timeout_wq,  
> > 
> > There is drm_sched_resume_timeout(). Why can it not be used?
> 
> Because it's racy. I don't remember all the details, but IIRC, it had
> to do with job insertion calling drm_sched_start_timeout() while we're
> calling drm_sched_resume_timeout() from cs_slot_reset_locked(). We
> tried to make it work, but we gave up at some point and went for a
> driver-specific timer, because ultimately what we want is a per-queue
> timeout that we can pause/resume without drm_sched interfering when new
> jobs are queued to our ring buffers: we resume when the execution
> context (AKA group) is scheduled in, and we pause when this execution
> context is scheduled out.
> 
> That's the very reason we set drm_gpu_scheduler::timeout to
> MAX_SCHEDULE_TIMEOUT at init time (AKA, timeout disabled) and never
> touch that again. When our driver-internal timer expires, we forward
> the information to the drm_sched layer by calling drm_sched_fault().

That sounds all.. stressful ;)

As you know I only learned a few weeks ago about your group scheduler
on top of drm_sched. I wish I had heard about it when it was
implemented; we might have come up with the idea for drm_jobqueue
sooner.

> 
> > 
> > > +				 &queue->timeout.work,
> > > +				 queue->timeout.remaining);
> > > +
> > > +		queue->timeout.remaining = MAX_SCHEDULE_TIMEOUT;
> > > +	}
> > > +
> > > +	spin_unlock(&queue->fence_ctx.lock);
> > > +}
> > > +
> > >   
> > 
> > […]
> > 
> > >  
> > > @@ -3270,6 +3379,11 @@ queue_timedout_job(struct drm_sched_job *sched_job)
> > >  
> > >  	queue_start(queue);
> > >  
> > > +	/* We already flagged the queue as faulty, make sure we don't get
> > > +	 * called again.
> > > +	 */
> > > +	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;
> > > +
> > >  	return DRM_GPU_SCHED_STAT_RESET;  
> > 
> > DRM_GPU_SCHED_STAT_NO_HANG instead of just modifying the scheduler's
> > internal data??
> 
> 
> 	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;
> 
> is a leftover from a previous version. We shouldn't have to modify that
> here because the timeout is initialized to MAX_SCHEDULE_TIMEOUT and
> never touched again.

So you agree that it can be removed in v8?


P.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job
  2025-11-12 13:48       ` Philipp Stanner
@ 2025-11-12 14:12         ` Boris Brezillon
  2025-11-12 14:23           ` Boris Brezillon
  2025-11-13 10:56           ` Philipp Stanner
  0 siblings, 2 replies; 10+ messages in thread
From: Boris Brezillon @ 2025-11-12 14:12 UTC (permalink / raw)
  To: Philipp Stanner
  Cc: Steven Price, Liviu Dudau, Adrián Larumbe, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, dri-devel, Detlev Casanova,
	Ashley Smith, kernel

On Wed, 12 Nov 2025 14:48:51 +0100
Philipp Stanner <pstanner@redhat.com> wrote:

> On Wed, 2025-11-12 at 14:31 +0100, Boris Brezillon wrote:
> > Hi Philipp,
> > 
> > On Wed, 12 Nov 2025 13:38:00 +0100
> > Philipp Stanner <pstanner@redhat.com> wrote:
> >   
> > > On Wed, 2025-11-12 at 13:17 +0100, Boris Brezillon wrote:  
> > > > From: Ashley Smith <ashley.smith@collabora.com>
> > > > 
> > > > The timeout logic provided by drm_sched leads to races when we try
> > > > to suspend it while the drm_sched workqueue queues more jobs. Let's
> > > > overhaul the timeout handling in panthor to have our own delayed work
> > > > that's resumed/suspended when a group is resumed/suspended. When an
> > > > actual timeout occurs, we call drm_sched_fault() to report it
> > > > through drm_sched, still. But otherwise, the drm_sched timeout is
> > > > disabled (set to MAX_SCHEDULE_TIMEOUT), which leaves us in control of
> > > > how we protect modifications on the timer.
> > > > 
> > > > One issue seems to be when we call drm_sched_suspend_timeout() from
> > > > both queue_run_job() and tick_work() which could lead to races due to
> > > > drm_sched_suspend_timeout() not having a lock. Another issue seems to
> > > > be in queue_run_job() if the group is not scheduled, we suspend the
> > > > timeout again which undoes what drm_sched_job_begin() did when calling
> > > > drm_sched_start_timeout(). So the timeout does not reset when a job
> > > > is finished.
> > > > 
> > > >     
> > > 
> > > […]
> > >   
> > > > +
> > > > +static void
> > > > +queue_reset_timeout_locked(struct panthor_queue *queue)
> > > > +{
> > > > +	lockdep_assert_held(&queue->fence_ctx.lock);
> > > > +
> > > > +	if (queue->timeout.remaining != MAX_SCHEDULE_TIMEOUT) {
> > > > +		mod_delayed_work(queue->scheduler.timeout_wq,    
> > > 
> > > Here you are interfering with the scheduler's internals again, don't
> > > you. I think we agreed that we don't want to do such things anymore,
> > > didn't we?  
> > 
> > We're not really touching drm_gpu_scheduler's internals, we're just
> > retrieving the timeout workqueue we passed at init time:
> > panthor_queue::timeout is panthor internals not drm_sched internals.
> > 
> > This being said, I agree we should use ptdev->reset.wq instead of
> > retrieving the timeout workqueue through the drm_gpu_scheduler object.
> > 
> > Just to be clear, the goal of this patch is to bypass the
> > drm_gpu_scheduler timeout logic entirely, so we can have our own thing
> > that's not racy (see below).  
> 
> OK. timeout_wq sharing is intended and allowed, so if that's what
> you're doing, good. But I agree that accessing the wq through the
> driver's struct is then cleaner and more obviously correct.

Will do.

> 
> >   
> > > 
> > > You could write a proper drm_sched API function which serves your
> > > usecase.  
> > 
> > It's not really lack of support for our usecase that drives this
> > change, but more the fact the current helpers are racy for drivers that
> > have a 1:1 entity:sched relationship with queues that can be scheduled
> > out behind drm_gpu_scheduler's back.  
> 
> And you also can't stop drm_sched to prevent races?

That's the thing, I don't want to stop the drm_gpu_scheduler attached
to a panthor_queue, I want new jobs to be queued to the ring buffer
until this ring buffer is full (which is controller with the
::credit_limit property), even if the group this queue belongs to is
not currently active on the FW side. Those jobs will get executed at
some later point when the group gets picked by the panthor scheduler.

> 
> >   
> > > 
> > > Or could maybe DRM_GPU_SCHED_STAT_NO_HANG be returned from your driver
> > > in case an interrupt actually fires?  
> > 
> > I don't think it helps, see below.
> >   
> > >   
> > > > +				 &queue->timeout.work,
> > > > +				 msecs_to_jiffies(JOB_TIMEOUT_MS));
> > > > +	}
> > > > +}
> > > > +
> > > > +static bool
> > > > +group_can_run(struct panthor_group *group)
> > > > +{
> > > > +	return group->state != PANTHOR_CS_GROUP_TERMINATED &&
> > > > +	       group->state != PANTHOR_CS_GROUP_UNKNOWN_STATE &&
> > > > +	       !group->destroyed && group->fatal_queues == 0 &&
> > > > +	       !group->timedout;
> > > > +}
> > > > +
> > > >     
> > > 
> > > […]
> > >   
> > > > +queue_suspend_timeout(struct panthor_queue *queue)
> > > > +{
> > > > +	spin_lock(&queue->fence_ctx.lock);
> > > > +	queue_suspend_timeout_locked(queue);
> > > > +	spin_unlock(&queue->fence_ctx.lock);
> > > > +}
> > > > +
> > > > +static void
> > > > +queue_resume_timeout(struct panthor_queue *queue)
> > > > +{
> > > > +	spin_lock(&queue->fence_ctx.lock);
> > > > +
> > > > +	if (queue_timeout_is_suspended(queue)) {
> > > > +		mod_delayed_work(queue->scheduler.timeout_wq,    
> > > 
> > > There is drm_sched_resume_timeout(). Why can it not be used?  
> > 
> > Because it's racy. I don't remember all the details, but IIRC, it had
> > to do with job insertion calling drm_sched_start_timeout() while we're
> > calling drm_sched_resume_timeout() from cs_slot_reset_locked(). We
> > tried to make it work, but we gave up at some point and went for a
> > driver-specific timer, because ultimately what we want is a per-queue
> > timeout that we can pause/resume without drm_sched interfering when new
> > jobs are queued to our ring buffers: we resume when the execution
> > context (AKA group) is scheduled in, and we pause when this execution
> > context is scheduled out.
> > 
> > That's the very reason we set drm_gpu_scheduler::timeout to
> > MAX_SCHEDULE_TIMEOUT at init time (AKA, timeout disabled) and never
> > touch that again. When our driver-internal timer expires, we forward
> > the information to the drm_sched layer by calling drm_sched_fault().  
> 
> That sounds all.. stressful ;)

Yeah, it's not ideal :-/.

> 
> As you know I only learned a few weeks ago about your group scheduler
> on top of drm_sched. I wish I had heard about it when it was
> implemented; we might have come up with the idea for drm_jobqueue
> sooner.

Might have simplified things, I guess, but that's life, and I'm happy
to transition to drm_jobqueue when it's deemed ready.

> 
> >   
> > >   
> > > > +				 &queue->timeout.work,
> > > > +				 queue->timeout.remaining);
> > > > +
> > > > +		queue->timeout.remaining = MAX_SCHEDULE_TIMEOUT;
> > > > +	}
> > > > +
> > > > +	spin_unlock(&queue->fence_ctx.lock);
> > > > +}
> > > > +
> > > >     
> > > 
> > > […]
> > >   
> > > >  
> > > > @@ -3270,6 +3379,11 @@ queue_timedout_job(struct drm_sched_job *sched_job)
> > > >  
> > > >  	queue_start(queue);
> > > >  
> > > > +	/* We already flagged the queue as faulty, make sure we don't get
> > > > +	 * called again.
> > > > +	 */
> > > > +	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;
> > > > +
> > > >  	return DRM_GPU_SCHED_STAT_RESET;    
> > > 
> > > DRM_GPU_SCHED_STAT_NO_HANG instead of just modifying the scheduler's
> > > internal data??  
> > 
> > 
> > 	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;
> > 
> > is a leftover from a previous version. We shouldn't have to modify that
> > here because the timeout is initialized to MAX_SCHEDULE_TIMEOUT and
> > never touched again.  
> 
> So you agree that it can be removed in v8?

Yep, I think it can go away. I'll make sure it still works without it,
and also get the wq from some driver fields instead of going back to
drm_gpu_scheduler::timeout_wq.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job
  2025-11-12 14:12         ` Boris Brezillon
@ 2025-11-12 14:23           ` Boris Brezillon
  2025-11-13 10:56           ` Philipp Stanner
  1 sibling, 0 replies; 10+ messages in thread
From: Boris Brezillon @ 2025-11-12 14:23 UTC (permalink / raw)
  To: Philipp Stanner
  Cc: Steven Price, Liviu Dudau, Adrián Larumbe, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, dri-devel, Detlev Casanova,
	Ashley Smith, kernel

On Wed, 12 Nov 2025 15:12:53 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> > OK. timeout_wq sharing is intended and allowed, so if that's what
> > you're doing, good. But I agree that accessing the wq through the
> > driver's struct is then cleaner and more obviously correct.  
> 
> Will do.

Actually, after looking into that, it seems I'd have to go back to
dev_get_drvdata(drm_gpu_scheduler::dev)::reset_wq to get the timeout
workqueue, so I think I'd prefer to keep using
drm_gpu_scheduler::timeout_wq unless that's really seen as a layering
violation (but I'd argue that accessing drm_gpu_scheduler::dev looks
like the same kind of layering violation to me).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job
  2025-11-12 14:12         ` Boris Brezillon
  2025-11-12 14:23           ` Boris Brezillon
@ 2025-11-13 10:56           ` Philipp Stanner
  2025-11-13 11:05             ` Boris Brezillon
  1 sibling, 1 reply; 10+ messages in thread
From: Philipp Stanner @ 2025-11-13 10:56 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Steven Price, Liviu Dudau, Adrián Larumbe, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, dri-devel, Detlev Casanova,
	Ashley Smith, kernel

On Wed, 2025-11-12 at 15:12 +0100, Boris Brezillon wrote:
> On Wed, 12 Nov 2025 14:48:51 +0100
> Philipp Stanner <pstanner@redhat.com> wrote:
> 
> > 

[…]

> > >   
> > > > 
> > > > You could write a proper drm_sched API function which serves your
> > > > usecase.  
> > > 
> > > It's not really lack of support for our usecase that drives this
> > > change, but more the fact the current helpers are racy for drivers that
> > > have a 1:1 entity:sched relationship with queues that can be scheduled
> > > out behind drm_gpu_scheduler's back.  
> > 
> > And you also can't stop drm_sched to prevent races?
> 
> That's the thing, I don't want to stop the drm_gpu_scheduler attached
> to a panthor_queue, I want new jobs to be queued to the ring buffer
> until this ring buffer is full (which is controller with the
> ::credit_limit property), even if the group this queue belongs to is
> not currently active on the FW side. Those jobs will get executed at
> some later point when the group gets picked by the panthor scheduler.

Ah, OK! Understood.

> > 
> > As you know I only learned a few weeks ago about your group scheduler
> > on top of drm_sched. I wish I had heard about it when it was
> > implemented; we might have come up with the idea for drm_jobqueue
> > sooner.
> 
> Might have simplified things, I guess, but that's life, and I'm happy
> to transition to drm_jobqueue when it's deemed ready.

JQ is in Rust, potentially one day with a C ABI. So that could only
happen if your driver's users are OK with relying on LLVM for building
the kernel.

BTW would be interesting for me to know to what degree that's a problem
for common distributions and users.

> 
> > 
> > >   
> > > > 

[…]

> > > 
> > > 
> > > 	queue->scheduler.timeout = MAX_SCHEDULE_TIMEOUT;
> > > 
> > > is a leftover from a previous version. We shouldn't have to modify that
> > > here because the timeout is initialized to MAX_SCHEDULE_TIMEOUT and
> > > never touched again.  
> > 
> > So you agree that it can be removed in v8?
> 
> Yep, I think it can go away. I'll make sure it still works without it,
> and also get the wq from some driver fields instead of going back to
> drm_gpu_scheduler::timeout_wq.
> 

I saw your other reply.

I mean, it's all not ideal, but as already stated timeout_wq is
intended for sharing and, furthermore, we have so many users of interna
by now that it often can't even be determined anymore what's legal and
what's not. Let's live with it for now.


P.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job
  2025-11-13 10:56           ` Philipp Stanner
@ 2025-11-13 11:05             ` Boris Brezillon
  0 siblings, 0 replies; 10+ messages in thread
From: Boris Brezillon @ 2025-11-13 11:05 UTC (permalink / raw)
  To: Philipp Stanner
  Cc: Steven Price, Liviu Dudau, Adrián Larumbe, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, dri-devel, Detlev Casanova,
	Ashley Smith, kernel

On Thu, 13 Nov 2025 11:56:03 +0100
Philipp Stanner <pstanner@redhat.com> wrote:

> On Wed, 2025-11-12 at 15:12 +0100, Boris Brezillon wrote:
> > On Wed, 12 Nov 2025 14:48:51 +0100
> > Philipp Stanner <pstanner@redhat.com> wrote:
> >   
> > >   
> 
> […]
> 
> > > >     
> > > > > 
> > > > > You could write a proper drm_sched API function which serves your
> > > > > usecase.    
> > > > 
> > > > It's not really lack of support for our usecase that drives this
> > > > change, but more the fact the current helpers are racy for drivers that
> > > > have a 1:1 entity:sched relationship with queues that can be scheduled
> > > > out behind drm_gpu_scheduler's back.    
> > > 
> > > And you also can't stop drm_sched to prevent races?  
> > 
> > That's the thing, I don't want to stop the drm_gpu_scheduler attached
> > to a panthor_queue, I want new jobs to be queued to the ring buffer
> > until this ring buffer is full (which is controller with the
> > ::credit_limit property), even if the group this queue belongs to is
> > not currently active on the FW side. Those jobs will get executed at
> > some later point when the group gets picked by the panthor scheduler.  
> 
> Ah, OK! Understood.
> 
> > > 
> > > As you know I only learned a few weeks ago about your group scheduler
> > > on top of drm_sched. I wish I had heard about it when it was
> > > implemented; we might have come up with the idea for drm_jobqueue
> > > sooner.  
> > 
> > Might have simplified things, I guess, but that's life, and I'm happy
> > to transition to drm_jobqueue when it's deemed ready.  
> 
> JQ is in Rust, potentially one day with a C ABI. So that could only
> happen if your driver's users are OK with relying on LLVM for building
> the kernel.
> 
> BTW would be interesting for me to know to what degree that's a problem
> for common distributions and users.

That's a good question. I must admit I never thought of this because I
usually build my kernels on a full-blown distro with both llvm and gcc
installed, so I can easily pick the one I need for the task. For distro
builds that should be okay, but for {yocto,buildroot,custom}-based
build systems the transition might be painful/annoying, but I'd expect
people to move to llvm if they have to. I mean, the ultimate goal is
for Tyr to replace Panthor so we don't have to maintain two drivers, so
if that's going to be a problem, I'd rather know it now :-).

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-11-13 11:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-12 12:17 [PATCH v7 0/2] drm/panthor: Scheduler fixes for termination failure and timeouts Boris Brezillon
2025-11-12 12:17 ` [PATCH v7 1/2] drm/panthor: Make the timeout per-queue instead of per-job Boris Brezillon
2025-11-12 12:38   ` Philipp Stanner
2025-11-12 13:31     ` Boris Brezillon
2025-11-12 13:48       ` Philipp Stanner
2025-11-12 14:12         ` Boris Brezillon
2025-11-12 14:23           ` Boris Brezillon
2025-11-13 10:56           ` Philipp Stanner
2025-11-13 11:05             ` Boris Brezillon
2025-11-12 12:17 ` [PATCH v7 2/2] drm/panthor: Reset queue slots if termination fails Boris Brezillon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.