[PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler
@ 2026-06-05 16:06 Maíra Canal
  2026-06-05 16:06 ` [PATCH 1/3] drm/rocket: Remove unused reset worker Maíra Canal
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Maíra Canal @ 2026-06-05 16:06 UTC (permalink / raw)
  To: Tomeu Vizoso, Oded Gabbay, Christian König,
	Christian König, Rob Herring, Matthew Brost,
	Danilo Krummrich, Philipp Stanner, Sumit Semwal
  Cc: kernel-dev, dri-devel, Maíra Canal

After Rob mentioned to me that Rocket could have a similar redundant
job_lock just like Ethos [1], I decided to take a look at the driver to
see if we could remove this lock. However, as I was reading the code, I
identified that, more than the job_lock, the issue is a bit different. The
job submission procedure in Rocket breaks the DRM scheduler's design in a
fundamental way.

Currently, a job spawns further hardware work from outside the scheduler.
The function rocket_job_run() submits only the first task of an inference;
every subsequent task is submitted by the threaded IRQ handler, which calls
rocket_job_hw_submit() directly.

The scheduler expects all of a job's hardware submission to happen in
run_job(). Submitting jobs from the IRQ handler instead is completely
invisible to the scheduler, which can cause some issues, like:
drm_sched_stop() only synchronizes the scheduler's workqueue, not the IRQ,
so the reset path races these IRQ-driven submissions. This creates the need
of a job_lock mutex and the reset.pending flag, which exist only as a
workaround to that self-inflicted race.

Considering the current status of the driver, solving this issue is quite
simple: don't consider the whole submission as a DRM sched job, instead
consider a task a DRM sched job. With that, the driver can comply to the
DRM scheduler expectations and get rid of some locks, flags and indexes.

Having said that, this is only "compile-tested", I don't have this
hardware. I was just driven by Rob's comment to take a look at Rocket's
code and the design looked unusual to what I would expect from a DRM
scheduler-based driver. I'm also CCing some scheduler maintainers to check
if they agree that the IRQ handler shouldn't spawn further HW work.

Apart from that, this series also has some clean-up patches.

[1] https://lore.kernel.org/dri-devel/20260516144623.2582427-2-mcanal@igalia.com/T/

Best regards,
- Maíra

---
Maíra Canal (3):
      drm/rocket: Remove unused reset worker
      drm/rocket: Submit one drm_sched_job per task
      drm/rocket: Drop the dedicated reset workqueue

 drivers/accel/rocket/rocket_core.h |  10 +-
 drivers/accel/rocket/rocket_job.c  | 282 ++++++++++++++++++-------------------
 drivers/accel/rocket/rocket_job.h  |  26 +++-
 3 files changed, 159 insertions(+), 159 deletions(-)
---
base-commit: 640c57d6ca1346a1c2363a3f473b405af979e046
change-id: 20260605-rocket-per-task-jobs-b797f7e2b1e9

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] drm/rocket: Remove unused reset worker
  2026-06-05 16:06 [PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler Maíra Canal
@ 2026-06-05 16:06 ` Maíra Canal
  2026-06-05 16:25   ` sashiko-bot
  2026-06-05 16:06 ` [PATCH 2/3] drm/rocket: Submit one drm_sched_job per task Maíra Canal
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Maíra Canal @ 2026-06-05 16:06 UTC (permalink / raw)
  To: Tomeu Vizoso, Oded Gabbay, Christian König,
	Christian König, Rob Herring, Matthew Brost,
	Danilo Krummrich, Philipp Stanner, Sumit Semwal
  Cc: kernel-dev, dri-devel, Maíra Canal

The reset.work work_struct and its rocket_reset_work() handler were
initialized and cancelled, but never actually queued: no code path calls
queue_work()/schedule_work() on core->reset.work. The only reset entry
point is the scheduler's timedout_job callback, rocket_job_timedout(),
which invokes rocket_reset() directly on the timeout workqueue.

Drop the dead worker, its initialization and cancellation, and the
unused struct member.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
 drivers/accel/rocket/rocket_core.h |  1 -
 drivers/accel/rocket/rocket_job.c  | 10 ----------
 2 files changed, 11 deletions(-)

diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index f6d7382854ca..4d163a6ac653 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -48,7 +48,6 @@ struct rocket_core {
 
 	struct {
 		struct workqueue_struct *wq;
-		struct work_struct work;
 		atomic_t pending;
 	} reset;
 
diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
index 2f1861f960cc..8a26139a44f4 100644
--- a/drivers/accel/rocket/rocket_job.c
+++ b/drivers/accel/rocket/rocket_job.c
@@ -393,14 +393,6 @@ static enum drm_gpu_sched_stat rocket_job_timedout(struct drm_sched_job *sched_j
 	return DRM_GPU_SCHED_STAT_RESET;
 }
 
-static void rocket_reset_work(struct work_struct *work)
-{
-	struct rocket_core *core;
-
-	core = container_of(work, struct rocket_core, reset.work);
-	rocket_reset(core, NULL);
-}
-
 static const struct drm_sched_backend_ops rocket_sched_ops = {
 	.run_job = rocket_job_run,
 	.timedout_job = rocket_job_timedout,
@@ -444,7 +436,6 @@ int rocket_job_init(struct rocket_core *core)
 	};
 	int ret;
 
-	INIT_WORK(&core->reset.work, rocket_reset_work);
 	spin_lock_init(&core->fence_lock);
 	mutex_init(&core->job_lock);
 
@@ -488,7 +479,6 @@ void rocket_job_fini(struct rocket_core *core)
 {
 	drm_sched_fini(&core->sched);
 
-	cancel_work_sync(&core->reset.work);
 	destroy_workqueue(core->reset.wq);
 }
 

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] drm/rocket: Submit one drm_sched_job per task
  2026-06-05 16:06 [PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler Maíra Canal
  2026-06-05 16:06 ` [PATCH 1/3] drm/rocket: Remove unused reset worker Maíra Canal
@ 2026-06-05 16:06 ` Maíra Canal
  2026-06-05 16:20   ` sashiko-bot
  2026-06-05 16:07 ` [PATCH 3/3] drm/rocket: Drop the dedicated reset workqueue Maíra Canal
  2026-06-05 16:40 ` [RFC PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler Maíra Canal
  3 siblings, 1 reply; 8+ messages in thread
From: Maíra Canal @ 2026-06-05 16:06 UTC (permalink / raw)
  To: Tomeu Vizoso, Oded Gabbay, Christian König,
	Christian König, Rob Herring, Matthew Brost,
	Danilo Krummrich, Philipp Stanner, Sumit Semwal
  Cc: kernel-dev, dri-devel, Maíra Canal

The current submission model breaks the DRM scheduler's design in two
ways.

First, a job spawns further hardware work from outside the scheduler.
rocket_job_run() submits only the first task of an inference; every
subsequent task is submitted by the threaded IRQ handler, which calls
rocket_job_hw_submit() directly. The scheduler expects all of a job's
hardware submission to happen in run_job(). Driving it from the IRQ
handler instead is invisible to the scheduler, and drm_sched_stop() only
synchronizes the scheduler's workqueue, not the threaded IRQ, so the
reset path races these IRQ-driven submissions. The job_lock mutex and the
reset.pending flag exist only as a workaround that self-inflicted race.

Second, the submission path returns after arming a job. rocket_job_push()
calls drm_sched_job_arm() and only then acquires the BO fences, bailing
out and cleaning the job up if that fails. But arming is a point of no
return: per its documentation, "Once this function was called, you *must*
submit @job with drm_sched_entity_push_job()", because it publishes the
job's fences.

Redesign the submission so each task is its own drm_sched_job, which is
what the scheduler's model actually expects:

  - Every submission to the NPU flows through run_job(), and the IRQ
    handler only signals the task's fence. Nothing is started without the
    knowledge of the scheduler, so it can serialize submission against
    reset.

  - The BO reservations and implicit dependencies are acquired before any
    task is armed, so the only fallible step precedes the point of no
    return.

struct rocket_job becomes the refcounted inference shared by its tasks;
struct rocket_task becomes the per-task scheduled unit, and each task
holds a reference to the inference, so the last task to be freed marks it
done. Ordering between tasks comes from the per-file entity's FIFO and
arming all tasks up front pins the inference to one core. The implicit
BO dependencies and the completion fence are anchored on the last task.

The IOMMU domain is attached once by the first task to run and detached
when the last task drops its reference. On timeout the inference is marked
cancelled before the scheduler restarts; the remaining tasks observe the
flag in run_job() and fail with -ECANCELED without touching hardware, so
only the affected inference is abandoned.

With this, the Rocket driver can comply to the DRM scheduler's
expectations.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
 drivers/accel/rocket/rocket_core.h |   5 +-
 drivers/accel/rocket/rocket_job.c  | 257 ++++++++++++++++++++-----------------
 drivers/accel/rocket/rocket_job.h  |  26 +++-
 3 files changed, 158 insertions(+), 130 deletions(-)

diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index 4d163a6ac653..f74dc99d07bb 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -41,14 +41,13 @@ struct rocket_core {
 
 	struct iommu_group *iommu_group;
 
-	struct mutex job_lock;
-	struct rocket_job *in_flight_job;
+	/* Task currently running on the hardware. */
+	struct rocket_task *in_flight_task;
 
 	spinlock_t fence_lock;
 
 	struct {
 		struct workqueue_struct *wq;
-		atomic_t pending;
 	} reset;
 
 	struct drm_gpu_scheduler sched;
diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
index 8a26139a44f4..871041eb7a1d 100644
--- a/drivers/accel/rocket/rocket_job.c
+++ b/drivers/accel/rocket/rocket_job.c
@@ -20,10 +20,10 @@
 
 #define JOB_TIMEOUT_MS 500
 
-static struct rocket_job *
-to_rocket_job(struct drm_sched_job *sched_job)
+static struct rocket_task *
+to_rocket_task(struct drm_sched_job *sched_job)
 {
-	return container_of(sched_job, struct rocket_job, base);
+	return container_of(sched_job, struct rocket_task, base);
 }
 
 static const char *rocket_fence_get_driver_name(struct dma_fence *fence)
@@ -61,8 +61,6 @@ rocket_copy_tasks(struct drm_device *dev,
 		  struct drm_rocket_job *job,
 		  struct rocket_job *rjob)
 {
-	int ret = 0;
-
 	if (job->task_struct_size < sizeof(struct drm_rocket_task))
 		return -EINVAL;
 
@@ -71,7 +69,7 @@ rocket_copy_tasks(struct drm_device *dev,
 	if (!rjob->task_count)
 		return 0;
 
-	rjob->tasks = kvmalloc_objs(*rjob->tasks, job->task_count);
+	rjob->tasks = kvzalloc_objs(*rjob->tasks, job->task_count);
 	if (!rjob->tasks) {
 		drm_dbg(dev, "Failed to allocate task array\n");
 		return -ENOMEM;
@@ -84,14 +82,12 @@ rocket_copy_tasks(struct drm_device *dev,
 				   u64_to_user_ptr(job->tasks) + i * job->task_struct_size,
 				   sizeof(task))) {
 			drm_dbg(dev, "Failed to copy incoming tasks\n");
-			ret = -EFAULT;
-			goto fail;
+			return -EFAULT;
 		}
 
 		if (task.regcmd_count == 0) {
 			drm_dbg(dev, "regcmd_count field in drm_rocket_task should be > 0.\n");
-			ret = -EINVAL;
-			goto fail;
+			return -EINVAL;
 		}
 
 		rjob->tasks[i].regcmd = task.regcmd;
@@ -99,26 +95,14 @@ rocket_copy_tasks(struct drm_device *dev,
 	}
 
 	return 0;
-
-fail:
-	kvfree(rjob->tasks);
-	return ret;
 }
 
-static void rocket_job_hw_submit(struct rocket_core *core, struct rocket_job *job)
+static void rocket_job_hw_submit(struct rocket_core *core, struct rocket_task *task)
 {
-	struct rocket_task *task;
 	unsigned int extra_bit;
 
-	/* Don't queue the job if a reset is in progress */
-	if (atomic_read(&core->reset.pending))
-		return;
-
 	/* GO ! */
 
-	task = &job->tasks[job->next_task_idx];
-	job->next_task_idx++;
-
 	rocket_pc_writel(core, BASE_ADDRESS, 0x1);
 
 	 /* From rknpu, in the TRM this bit is marked as reserved */
@@ -186,43 +170,68 @@ static void rocket_attach_object_fences(struct drm_gem_object **bos,
 static int rocket_job_push(struct rocket_job *job)
 {
 	struct rocket_device *rdev = job->rdev;
+	unsigned int bo_count = job->in_bo_count + job->out_bo_count;
+	struct rocket_task *last = &job->tasks[job->task_count - 1];
 	struct drm_gem_object **bos;
 	struct ww_acquire_ctx acquire_ctx;
-	int ret = 0;
+	int ret, i;
 
-	bos = kvmalloc_array(job->in_bo_count + job->out_bo_count, sizeof(void *),
-			     GFP_KERNEL);
+	bos = kvmalloc_array(bo_count, sizeof(void *), GFP_KERNEL);
+	if (!bos) {
+		ret = -ENOMEM;
+		goto err_cleanup_tasks;
+	}
 	memcpy(bos, job->in_bos, job->in_bo_count * sizeof(void *));
 	memcpy(&bos[job->in_bo_count], job->out_bos, job->out_bo_count * sizeof(void *));
 
-	ret = drm_gem_lock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
+	ret = drm_gem_lock_reservations(bos, bo_count, &acquire_ctx);
 	if (ret)
-		goto err;
+		goto err_free_bos;
 
-	scoped_guard(mutex, &rdev->sched_lock) {
-		drm_sched_job_arm(&job->base);
+	/* Anchor the BO synchronization on the last task: its finished fence is
+	 * the inference's completion fence.
+	 */
+	ret = rocket_acquire_object_fences(job->in_bos, job->in_bo_count,
+					   &last->base, false);
+	if (ret)
+		goto err_unlock;
 
-		job->inference_done_fence = dma_fence_get(&job->base.s_fence->finished);
+	ret = rocket_acquire_object_fences(job->out_bos, job->out_bo_count,
+					   &last->base, true);
+	if (ret)
+		goto err_unlock;
 
-		ret = rocket_acquire_object_fences(job->in_bos, job->in_bo_count, &job->base, false);
-		if (ret)
-			goto err_unlock;
+	mutex_lock(&rdev->sched_lock);
 
-		ret = rocket_acquire_object_fences(job->out_bos, job->out_bo_count, &job->base, true);
-		if (ret)
-			goto err_unlock;
+	for (i = 0; i < job->task_count; i++) {
+		drm_sched_job_arm(&job->tasks[i].base);
 
-		kref_get(&job->refcount); /* put by scheduler job completion */
+		if (i == job->task_count - 1)
+			job->inference_done_fence = dma_fence_get(&last->base.s_fence->finished);
 
-		drm_sched_entity_push_job(&job->base);
+		/* put by scheduler job completion */
+		kref_get(&job->refcount);
+
+		drm_sched_entity_push_job(&job->tasks[i].base);
 	}
 
-	rocket_attach_object_fences(job->out_bos, job->out_bo_count, job->inference_done_fence);
+	mutex_unlock(&rdev->sched_lock);
+
+	rocket_attach_object_fences(job->out_bos, job->out_bo_count,
+				    job->inference_done_fence);
+
+	drm_gem_unlock_reservations(bos, bo_count, &acquire_ctx);
+	kvfree(bos);
+
+	return 0;
 
 err_unlock:
-	drm_gem_unlock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
-err:
+	drm_gem_unlock_reservations(bos, bo_count, &acquire_ctx);
+err_free_bos:
 	kvfree(bos);
+err_cleanup_tasks:
+	for (i = 0; i < job->task_count; i++)
+		drm_sched_job_cleanup(&job->tasks[i].base);
 
 	return ret;
 }
@@ -233,10 +242,22 @@ static void rocket_job_cleanup(struct kref *ref)
 						refcount);
 	unsigned int i;
 
+	/*
+	 * The last task holding a reference is gone, so the inference is over.
+	 * The ordered scheduler workqueue runs this (from free_job()) before
+	 * the next inference's run_job(), so the core's IOMMU group is free
+	 * in time for it.
+	 */
+	if (job->core)
+		iommu_detach_group(NULL, job->core->iommu_group);
+
 	rocket_iommu_domain_put(job->domain);
 
-	dma_fence_put(job->done_fence);
-	dma_fence_put(job->inference_done_fence);
+	if (job->tasks) {
+		for (i = 0; i < job->task_count; i++)
+			dma_fence_put(job->tasks[i].done_fence);
+		dma_fence_put(job->inference_done_fence);
+	}
 
 	if (job->in_bos) {
 		for (i = 0; i < job->in_bo_count; i++)
@@ -264,11 +285,11 @@ static void rocket_job_put(struct rocket_job *job)
 
 static void rocket_job_free(struct drm_sched_job *sched_job)
 {
-	struct rocket_job *job = to_rocket_job(sched_job);
+	struct rocket_task *task = to_rocket_task(sched_job);
 
 	drm_sched_job_cleanup(sched_job);
 
-	rocket_job_put(job);
+	rocket_job_put(task->job);
 }
 
 static struct rocket_core *sched_to_core(struct rocket_device *rdev,
@@ -286,108 +307,102 @@ static struct rocket_core *sched_to_core(struct rocket_device *rdev,
 
 static struct dma_fence *rocket_job_run(struct drm_sched_job *sched_job)
 {
-	struct rocket_job *job = to_rocket_job(sched_job);
-	struct rocket_device *rdev = job->rdev;
-	struct rocket_core *core = sched_to_core(rdev, sched_job->sched);
+	struct rocket_task *task = to_rocket_task(sched_job);
+	struct rocket_job *job = task->job;
+	struct rocket_core *core = sched_to_core(job->rdev, sched_job->sched);
 	struct dma_fence *fence = NULL;
 	int ret;
 
-	if (unlikely(job->base.s_fence->finished.error))
+	if (unlikely(sched_job->s_fence->finished.error))
 		return NULL;
 
-	/*
-	 * Nothing to execute: can happen if the job has finished while
-	 * we were resetting the NPU.
-	 */
-	if (job->next_task_idx == job->task_count)
+	/* An earlier task of this inference timed out, abandon the inference. */
+	if (atomic_read(&job->cancelled)) {
+		dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
 		return NULL;
+	}
 
 	fence = rocket_fence_create(core);
 	if (IS_ERR(fence))
 		return fence;
 
-	if (job->done_fence)
-		dma_fence_put(job->done_fence);
-	job->done_fence = dma_fence_get(fence);
+	if (task->done_fence)
+		dma_fence_put(task->done_fence);
+	task->done_fence = dma_fence_get(fence);
 
-	ret = pm_runtime_get_sync(core->dev);
-	if (ret < 0)
-		return fence;
-
-	ret = iommu_attach_group(job->domain->domain, core->iommu_group);
-	if (ret < 0)
-		return fence;
-
-	scoped_guard(mutex, &core->job_lock) {
-		core->in_flight_job = job;
-		rocket_job_hw_submit(core, job);
+	ret = pm_runtime_resume_and_get(core->dev);
+	if (ret < 0) {
+		dma_fence_put(fence);
+		return ERR_PTR(ret);
 	}
 
+	/* Attach the domain once for the whole inference. */
+	if (!job->core) {
+		ret = iommu_attach_group(job->domain->domain, core->iommu_group);
+		if (ret < 0) {
+			pm_runtime_put_autosuspend(core->dev);
+			dma_fence_put(fence);
+			return ERR_PTR(ret);
+		}
+		job->core = core;
+	}
+
+	WRITE_ONCE(core->in_flight_task, task);
+	rocket_job_hw_submit(core, task);
+
 	return fence;
 }
 
 static void rocket_job_handle_irq(struct rocket_core *core)
 {
+	struct rocket_task *task;
+
 	pm_runtime_mark_last_busy(core->dev);
 
 	rocket_pc_writel(core, OPERATION_ENABLE, 0x0);
 	rocket_pc_writel(core, INTERRUPT_CLEAR, 0x1ffff);
 
-	scoped_guard(mutex, &core->job_lock)
-		if (core->in_flight_job) {
-			if (core->in_flight_job->next_task_idx < core->in_flight_job->task_count) {
-				rocket_job_hw_submit(core, core->in_flight_job);
-				return;
-			}
-
-			iommu_detach_group(NULL, iommu_group_get(core->dev));
-			dma_fence_signal(core->in_flight_job->done_fence);
-			pm_runtime_put_autosuspend(core->dev);
-			core->in_flight_job = NULL;
-		}
+	/*
+	 * Claim the in-flight task: the reset path may run concurrently, so
+	 * whichever of us wins owns the PM put.
+	 */
+	task = xchg(&core->in_flight_task, NULL);
+	if (task) {
+		pm_runtime_put_autosuspend(core->dev);
+		dma_fence_signal(task->done_fence);
+	}
 }
 
 static void
 rocket_reset(struct rocket_core *core, struct drm_sched_job *bad)
 {
-	if (!atomic_read(&core->reset.pending))
-		return;
+	struct rocket_task *task;
 
 	drm_sched_stop(&core->sched, bad);
 
-	/*
-	 * Remaining interrupts have been handled, but we might still have
-	 * stuck jobs. Let's make sure the PM counters stay balanced by
-	 * manually calling pm_runtime_put_noidle().
-	 */
-	scoped_guard(mutex, &core->job_lock) {
-		if (core->in_flight_job)
-			pm_runtime_put_noidle(core->dev);
-
-		iommu_detach_group(NULL, core->iommu_group);
-
-		core->in_flight_job = NULL;
-	}
+	/* Claim the in-flight task (see rocket_job_handle_irq()). */
+	task = xchg(&core->in_flight_task, NULL);
+	if (task)
+		pm_runtime_put_noidle(core->dev);
 
 	/* Proceed with reset now. */
 	rocket_core_reset(core);
 
-	/* NPU has been reset, we can clear the reset pending bit. */
-	atomic_set(&core->reset.pending, 0);
-
 	/* Restart the scheduler */
 	drm_sched_start(&core->sched, 0);
 }
 
 static enum drm_gpu_sched_stat rocket_job_timedout(struct drm_sched_job *sched_job)
 {
-	struct rocket_job *job = to_rocket_job(sched_job);
-	struct rocket_device *rdev = job->rdev;
-	struct rocket_core *core = sched_to_core(rdev, sched_job->sched);
+	struct rocket_task *task = to_rocket_task(sched_job);
+	struct rocket_job *job = task->job;
+	struct rocket_core *core = sched_to_core(job->rdev, sched_job->sched);
 
 	dev_err(core->dev, "NPU job timed out");
 
-	atomic_set(&core->reset.pending, 1);
+	/* Abandon the rest of the inference before the scheduler is restarted. */
+	atomic_set(&job->cancelled, 1);
+
 	rocket_reset(core, sched_job);
 
 	return DRM_GPU_SCHED_STAT_RESET;
@@ -437,7 +452,6 @@ int rocket_job_init(struct rocket_core *core)
 	int ret;
 
 	spin_lock_init(&core->fence_lock);
-	mutex_init(&core->job_lock);
 
 	core->irq = platform_get_irq(to_platform_device(core->dev), 0);
 	if (core->irq < 0)
@@ -526,7 +540,7 @@ static int rocket_ioctl_submit_job(struct drm_device *dev, struct drm_file *file
 	struct rocket_device *rdev = to_rocket_device(dev);
 	struct rocket_file_priv *file_priv = file->driver_priv;
 	struct rocket_job *rjob = NULL;
-	int ret = 0;
+	int i, ret = 0;
 
 	if (job->task_count == 0)
 		return -EINVAL;
@@ -536,42 +550,45 @@ static int rocket_ioctl_submit_job(struct drm_device *dev, struct drm_file *file
 		return -ENOMEM;
 
 	kref_init(&rjob->refcount);
-
+	atomic_set(&rjob->cancelled, 0);
 	rjob->rdev = rdev;
 
-	ret = drm_sched_job_init(&rjob->base,
-				 &file_priv->sched_entity,
-				 1, NULL, file->client_id);
-	if (ret)
-		goto out_put_job;
-
 	ret = rocket_copy_tasks(dev, file, job, rjob);
 	if (ret)
-		goto out_cleanup_job;
+		goto out_put_job;
+
+	for (i = 0; i < rjob->task_count; i++) {
+		ret = drm_sched_job_init(&rjob->tasks[i].base, &file_priv->sched_entity,
+					 1, NULL, file->client_id);
+		if (ret)
+			goto out_cleanup_tasks;
+
+		rjob->tasks[i].job = rjob;
+	}
 
 	ret = drm_gem_objects_lookup(file, u64_to_user_ptr(job->in_bo_handles),
 				     job->in_bo_handle_count, &rjob->in_bos);
 	if (ret)
-		goto out_cleanup_job;
+		goto out_cleanup_tasks;
 
 	rjob->in_bo_count = job->in_bo_handle_count;
 
 	ret = drm_gem_objects_lookup(file, u64_to_user_ptr(job->out_bo_handles),
 				     job->out_bo_handle_count, &rjob->out_bos);
 	if (ret)
-		goto out_cleanup_job;
+		goto out_cleanup_tasks;
 
 	rjob->out_bo_count = job->out_bo_handle_count;
 
 	rjob->domain = rocket_iommu_domain_get(file_priv);
 
 	ret = rocket_job_push(rjob);
-	if (ret)
-		goto out_cleanup_job;
 
-out_cleanup_job:
-	if (ret)
-		drm_sched_job_cleanup(&rjob->base);
+	goto out_put_job;
+
+out_cleanup_tasks:
+	for (i--; i >= 0; i--)
+		drm_sched_job_cleanup(&rjob->tasks[i].base);
 out_put_job:
 	rocket_job_put(rjob);
 
diff --git a/drivers/accel/rocket/rocket_job.h b/drivers/accel/rocket/rocket_job.h
index 4ae00feec3b9..50e2e9664911 100644
--- a/drivers/accel/rocket/rocket_job.h
+++ b/drivers/accel/rocket/rocket_job.h
@@ -10,13 +10,29 @@
 #include "rocket_core.h"
 #include "rocket_drv.h"
 
+/*
+ * A task is one hardware submission. Each task is its own drm_sched_job,
+ * so every submission to the NPU flows through run_job().
+ */
 struct rocket_task {
+	struct drm_sched_job base;
+
+	/* Inference this task belongs to. */
+	struct rocket_job *job;
+
 	u64 regcmd;
 	u32 regcmd_count;
+
+	/* Fence signaled by the IRQ handler once the hardware completes it. */
+	struct dma_fence *done_fence;
 };
 
+/* An inference: the whole userspace submission, made of one or more tasks. */
 struct rocket_job {
-	struct drm_sched_job base;
+	struct kref refcount;
+
+	/* Set when the inference must be abandoned (a task timed out). */
+	atomic_t cancelled;
 
 	struct rocket_device *rdev;
 
@@ -28,17 +44,13 @@ struct rocket_job {
 
 	struct rocket_task *tasks;
 	u32 task_count;
-	u32 next_task_idx;
 
-	/* Fence to be signaled by drm-sched once its done with the job */
 	struct dma_fence *inference_done_fence;
 
-	/* Fence to be signaled by IRQ handler when the job is complete. */
-	struct dma_fence *done_fence;
-
 	struct rocket_iommu_domain *domain;
 
-	struct kref refcount;
+	/* Core the domain is attached to. */
+	struct rocket_core *core;
 };
 
 int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file);

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] drm/rocket: Drop the dedicated reset workqueue
  2026-06-05 16:06 [PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler Maíra Canal
  2026-06-05 16:06 ` [PATCH 1/3] drm/rocket: Remove unused reset worker Maíra Canal
  2026-06-05 16:06 ` [PATCH 2/3] drm/rocket: Submit one drm_sched_job per task Maíra Canal
@ 2026-06-05 16:07 ` Maíra Canal
  2026-06-05 18:20   ` sashiko-bot
  2026-06-05 16:40 ` [RFC PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler Maíra Canal
  3 siblings, 1 reply; 8+ messages in thread
From: Maíra Canal @ 2026-06-05 16:07 UTC (permalink / raw)
  To: Tomeu Vizoso, Oded Gabbay, Christian König,
	Christian König, Rob Herring, Matthew Brost,
	Danilo Krummrich, Philipp Stanner, Sumit Semwal
  Cc: kernel-dev, dri-devel, Maíra Canal

Each core allocated an ordered workqueue ("rocket-reset-%d") and passed
it as the scheduler's timeout_wq. The only documented reason to provide a
custom timeout_wq is to serialize the timeout handlers of several
schedulers that share globally-reset hardware. That does not apply here:
each core has its own scheduler and is reset independently
(rocket_reset() only touches its own core). With the reset worker gone,
the workqueue served solely as the timeout_wq.

Drop it and let drm_sched fall back to the default timeout workqueue.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
 drivers/accel/rocket/rocket_core.h |  4 ----
 drivers/accel/rocket/rocket_job.c  | 15 +--------------
 2 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index f74dc99d07bb..607e4270f59c 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -46,10 +46,6 @@ struct rocket_core {
 
 	spinlock_t fence_lock;
 
-	struct {
-		struct workqueue_struct *wq;
-	} reset;
-
 	struct drm_gpu_scheduler sched;
 	u64 fence_context;
 	u64 emit_seqno;
diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
index 871041eb7a1d..1a454559ecf4 100644
--- a/drivers/accel/rocket/rocket_job.c
+++ b/drivers/accel/rocket/rocket_job.c
@@ -467,33 +467,20 @@ int rocket_job_init(struct rocket_core *core)
 		return ret;
 	}
 
-	core->reset.wq = alloc_ordered_workqueue("rocket-reset-%d", 0, core->index);
-	if (!core->reset.wq)
-		return -ENOMEM;
-
 	core->fence_context = dma_fence_context_alloc(1);
 
-	args.timeout_wq = core->reset.wq;
 	ret = drm_sched_init(&core->sched, &args);
 	if (ret) {
 		dev_err(core->dev, "Failed to create scheduler: %d.", ret);
-		goto err_sched;
+		return ret;
 	}
 
 	return 0;
-
-err_sched:
-	drm_sched_fini(&core->sched);
-
-	destroy_workqueue(core->reset.wq);
-	return ret;
 }
 
 void rocket_job_fini(struct rocket_core *core)
 {
 	drm_sched_fini(&core->sched);
-
-	destroy_workqueue(core->reset.wq);
 }
 
 int rocket_job_open(struct rocket_file_priv *rocket_priv)

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] drm/rocket: Submit one drm_sched_job per task
  2026-06-05 16:06 ` [PATCH 2/3] drm/rocket: Submit one drm_sched_job per task Maíra Canal
@ 2026-06-05 16:20   ` sashiko-bot
  0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-06-05 16:20 UTC (permalink / raw)
  To: Maíra Canal; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:

New issues:
- [Critical] Splitting an inference into independent `drm_sched_job` tasks breaks IOMMU domain isolation due to DRM scheduler interleaving.
- [High] Implicit BO dependencies are incorrectly anchored to the last task of an inference instead of the first, causing premature hardware execution.
- [High] IOMMU detachment in `rocket_job_cleanup()` races with the execution of the next inference because the ioctl thread holds the final kref.
- [High] If a task fails to start in `rocket_job_run()`, the rest of the inference is not cancelled, leading to partial out-of-sequence execution.

Pre-existing issues:
- [Critical] Unconditional call to `rocket_iommu_domain_put()` in `rocket_job_cleanup()` causes a NULL pointer dereference if job initialization fails early.
- [High] The IRQ handler accesses NPU registers before claiming the in-flight task, risking a hardware fault if the device has been reset concurrently.
- [High] The driver silently ignores failures of individual job submissions in `rocket_ioctl_submit`.
--

commit ea82a8e8306fe83c325efae485834200ae29aec9
Author: Maíra Canal <mcanal@igalia.com>
Subject: drm/rocket: Submit one drm_sched_job per task

This commit redesigns the job submission path so each task is its own
drm_sched_job. It ensures all submissions flow through the DRM scheduler
so the reset path can serialize against hardware execution, while holding
an inference refcount across tasks.

> diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
> --- a/drivers/accel/rocket/rocket_job.c
> +++ b/drivers/accel/rocket/rocket_job.c

[ ... ]

> @@ -186,43 +170,68 @@ static int rocket_job_push(struct rocket_job *job)
>  	struct drm_gem_object **bos;
>  	struct ww_acquire_ctx acquire_ctx;
> -	int ret = 0;
> +	int ret, i;
>  
> -	bos = kvmalloc_array(job->in_bo_count + job->out_bo_count, sizeof(void *),
> -			     GFP_KERNEL);
> +	bos = kvmalloc_array(bo_count, sizeof(void *), GFP_KERNEL);
> +	if (!bos) {
> +		ret = -ENOMEM;
> +		goto err_cleanup_tasks;
> +	}
>  	memcpy(bos, job->in_bos, job->in_bo_count * sizeof(void *));
>  	memcpy(&bos[job->in_bo_count], job->out_bos, job->out_bo_count * sizeof(void *));
>  
> -	ret = drm_gem_lock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
> +	ret = drm_gem_lock_reservations(bos, bo_count, &acquire_ctx);
> +	if (ret)
> +		goto err_free_bos;
> +
> +	/* Anchor the BO synchronization on the last task: its finished fence is
> +	 * the inference's completion fence.
> +	 */
> +	ret = rocket_acquire_object_fences(job->in_bos, job->in_bo_count,
> +					   &last->base, false);

[Severity: High]
Does anchoring the input dependencies to the last task allow the first
task to run before the input buffers are ready?

Since the DRM scheduler uses these dependencies to know when a job is
ready to run, adding them to last->base leaves job->tasks[0] without
dependencies.

The scheduler might submit the first task to the hardware immediately,
causing a read-before-write data race. Should the input dependencies be
anchored to the first task instead?

[ ... ]

> @@ -233,10 +242,22 @@ static void rocket_job_cleanup(struct kref *ref)
>  						refcount);
>  	unsigned int i;
>  
> +	/*
> +	 * The last task holding a reference is gone, so the inference is over.
> +	 * The ordered scheduler workqueue runs this (from free_job()) before
> +	 * the next inference's run_job(), so the core's IOMMU group is free
> +	 * in time for it.
> +	 */
> +	if (job->core)
> +		iommu_detach_group(NULL, job->core->iommu_group);

[Severity: High]
Can this detachment race with the execution of a subsequent inference?

The ioctl thread initializes the inference refcount to 1 and drops it at
the end of rocket_ioctl_submit_job(). If the scheduler finishes all tasks
quickly, it drops its references, leaving the refcount at 1.

The scheduler might then start the next inference, which attaches its own
IOMMU domain. If the ioctl thread then wakes up and drops its final
reference, rocket_job_cleanup() will call iommu_detach_group().

Could this unmap the IOMMU while the next inference is actively executing
on the hardware?

> +
>  	rocket_iommu_domain_put(job->domain);

[Severity: Critical]
This is a pre-existing issue, but does this unconditionally dereference
the domain pointer?

If an error occurs early in rocket_ioctl_submit_job() (for example, if
rocket_copy_tasks() fails), the error path calls rocket_job_put(rjob)
and triggers cleanup before job->domain is initialized.

Since rocket_iommu_domain_put() internally executes
kref_put(&domain->kref, ...), passing a NULL pointer here will cause a
kernel panic. Should this check if job->domain is valid before putting it?

[ ... ]

> @@ -286,108 +307,102 @@ static struct dma_fence *rocket_job_run(struct drm_sched_job *sched_job)
>  	struct rocket_core *core = sched_to_core(rdev, sched_job->sched);
>  	struct dma_fence *fence = NULL;
>  	int ret;
[ ... ]
> -	ret = pm_runtime_get_sync(core->dev);
> -	if (ret < 0)
> -		return fence;
> +	ret = pm_runtime_resume_and_get(core->dev);
> +	if (ret < 0) {
> +		dma_fence_put(fence);
> +		return ERR_PTR(ret);
> +	}

[Severity: High]
If pm_runtime_resume_and_get() or iommu_attach_group() fails and returns
an ERR_PTR, does the rest of the inference still execute?

The scheduler will complete this failed task with an error, but it appears
the job->cancelled flag is not set on these early failure paths.

Since subsequent tasks in the same inference will not observe the cancelled
flag, will they be submitted out of order, potentially leaving the hardware
in an invalid state?

>  
> -	ret = iommu_attach_group(job->domain->domain, core->iommu_group);
> -	if (ret < 0)
> -		return fence;
> +	/* Attach the domain once for the whole inference. */
> +	if (!job->core) {
> +		ret = iommu_attach_group(job->domain->domain, core->iommu_group);

[Severity: Critical]
Does attaching the IOMMU domain only for the first task break isolation
when the scheduler interleaves tasks?

With the redesign mapping each task to a separate drm_sched_job, the DRM
scheduler can interleave tasks from different inferences on the same core.

If Process A's first task attaches its domain and then Process B's first
task attaches its domain, Process A's second task will see job->core is
already set.

Will Process A's second task then execute using Process B's IOMMU domain,
allowing cross-process memory access?

[ ... ]

> @@ -395,20 +409,20 @@ static void rocket_job_handle_irq(struct rocket_core *core)
>  {
> +	struct rocket_task *task;
> +
>  	pm_runtime_mark_last_busy(core->dev);
>  
>  	rocket_pc_writel(core, OPERATION_ENABLE, 0x0);
>  	rocket_pc_writel(core, INTERRUPT_CLEAR, 0x1ffff);
>  
> -	scoped_guard(mutex, &core->job_lock)
> -		if (core->in_flight_job) {
> -			if (core->in_flight_job->next_task_idx < core->in_flight_job->task_count) {
> -				rocket_job_hw_submit(core, core->in_flight_job);
> -				return;
> -			}
> -
> -			iommu_detach_group(NULL, iommu_group_get(core->dev));
> -			dma_fence_signal(core->in_flight_job->done_fence);
> -			pm_runtime_put_autosuspend(core->dev);
> -			core->in_flight_job = NULL;
> -		}
> +	/*
> +	 * Claim the in-flight task: the reset path may run concurrently, so
> +	 * whichever of us wins owns the PM put.
> +	 */
> +	task = xchg(&core->in_flight_task, NULL);

[Severity: High]
This is a pre-existing issue, but does writing to these registers before
claiming the in-flight task risk a hardware fault?

If a timeout occurs concurrently, rocket_reset() can claim the task, drop
the PM reference, and reset the hardware.

If this threaded IRQ handler then writes to the OPERATION_ENABLE and
INTERRUPT_CLEAR registers while the device is powered down or mid-reset,
could it trigger a bus error?

Should the handler claim the task first to guarantee the device is awake?

[ ... ]

>  int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file)
>  {
[ ... ]
>  	for (i = 0; i < args->job_count; i++)
>  		rocket_ioctl_submit_job(dev, file, &jobs[i]);
>  
>  exit:
>  	kvfree(jobs);
>  
>  	return ret;
>  }

[Severity: High]
This is a pre-existing issue, but is the return value of
rocket_ioctl_submit_job() intentionally ignored here?

If an individual job fails to submit due to an out-of-memory error or
invalid BO handles, the failure is neither checked nor accumulated.

Since user space will receive a success return code (ret = 0), does this
hide submission failures?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605-rocket-per-task-jobs-v1-0-e3319adf3c49@igalia.com?part=2

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] drm/rocket: Remove unused reset worker
  2026-06-05 16:06 ` [PATCH 1/3] drm/rocket: Remove unused reset worker Maíra Canal
@ 2026-06-05 16:25   ` sashiko-bot
  0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-06-05 16:25 UTC (permalink / raw)
  To: Maíra Canal; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] `rocket_iommu_domain_put()` does not check for NULL before dereferencing, leading to a crash on the error path.
- [Medium] The ioctl handler silently ignores errors when submitting multiple jobs, reporting success to userspace even if submissions fail.
--

commit cc3020dc8d8efb7ac65346d6d8612ab4fbbadb48
Author: Maíra Canal <mcanal@igalia.com>

drm/rocket: Remove unused reset worker

The reset.work work_struct and its rocket_reset_work() handler were
initialized and cancelled, but never actually queued. This patch drops
the dead worker, its initialization, cancellation, and the unused struct
member.

> diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
> index 2f1861f960ccb..8a26139a44f46 100644
> --- a/drivers/accel/rocket/rocket_job.c
> +++ b/drivers/accel/rocket/rocket_job.c

[ ... ]

[Severity: High]
This is a pre-existing issue, but can this lead to a NULL pointer dereference
on the error path?

In rocket_ioctl_submit_job(), if rocket_copy_tasks() fails, it jumps to
out_cleanup_job, which then falls through to out_put_job:

drivers/accel/rocket/rocket_job.c:rocket_ioctl_submit_job() {
	...
	ret = rocket_copy_tasks(dev, file, job, rjob);
	if (ret)
		goto out_cleanup_job;
	...
out_cleanup_job:
	if (ret)
		drm_sched_job_cleanup(&rjob->base);
out_put_job:
	rocket_job_put(rjob);

	return ret;
}

This calls rocket_job_cleanup(), which unconditionally passes job->domain to
rocket_iommu_domain_put():

drivers/accel/rocket/rocket_job.c:rocket_job_cleanup() {
	...
	rocket_iommu_domain_put(job->domain);
	...
}

But rocket_iommu_domain_put() does not check if domain is NULL before passing
it to kref_put():

drivers/accel/rocket/rocket_drv.c:rocket_iommu_domain_put() {
	kref_put(&domain->kref, rocket_iommu_domain_destroy);
}

Since job->domain is assigned after rocket_copy_tasks() succeeds, could this
dereference a NULL pointer if an early failure occurs?

[Severity: Medium]
This is a pre-existing issue, but does this code silently ignore job
submission errors?

In rocket_ioctl_submit(), the loop calls rocket_ioctl_submit_job(), but the
return value is not captured or checked:

drivers/accel/rocket/rocket_job.c:rocket_ioctl_submit() {
	...
	for (i = 0; i < args->job_count; i++)
		rocket_ioctl_submit_job(dev, file, &jobs[i]);

exit:
	kvfree(jobs);

	return ret;
}

Since ret remains 0, the ioctl will return success to userspace even if
some or all jobs fail to submit. Will userspace assume the jobs were
successfully queued in this case?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605-rocket-per-task-jobs-v1-0-e3319adf3c49@igalia.com?part=1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler
  2026-06-05 16:06 [PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler Maíra Canal
                   ` (2 preceding siblings ...)
  2026-06-05 16:07 ` [PATCH 3/3] drm/rocket: Drop the dedicated reset workqueue Maíra Canal
@ 2026-06-05 16:40 ` Maíra Canal
  3 siblings, 0 replies; 8+ messages in thread
From: Maíra Canal @ 2026-06-05 16:40 UTC (permalink / raw)
  To: Tomeu Vizoso, Oded Gabbay, Christian König,
	Christian König, Rob Herring, Matthew Brost,
	Danilo Krummrich, Philipp Stanner, Sumit Semwal
  Cc: kernel-dev, dri-devel

On 05/06/26 13:06, Maíra Canal wrote:
> After Rob mentioned to me that Rocket could have a similar redundant
> job_lock just like Ethos [1], I decided to take a look at the driver to
> see if we could remove this lock. However, as I was reading the code, I
> identified that, more than the job_lock, the issue is a bit different. The
> job submission procedure in Rocket breaks the DRM scheduler's design in a
> fundamental way.
> 
> Currently, a job spawns further hardware work from outside the scheduler.
> The function rocket_job_run() submits only the first task of an inference;
> every subsequent task is submitted by the threaded IRQ handler, which calls
> rocket_job_hw_submit() directly.
> 
> The scheduler expects all of a job's hardware submission to happen in
> run_job(). Submitting jobs from the IRQ handler instead is completely
> invisible to the scheduler, which can cause some issues, like:
> drm_sched_stop() only synchronizes the scheduler's workqueue, not the IRQ,
> so the reset path races these IRQ-driven submissions. This creates the need
> of a job_lock mutex and the reset.pending flag, which exist only as a
> workaround to that self-inflicted race.
> 
> Considering the current status of the driver, solving this issue is quite
> simple: don't consider the whole submission as a DRM sched job, instead
> consider a task a DRM sched job. With that, the driver can comply to the
> DRM scheduler expectations and get rid of some locks, flags and indexes.
> 
> Having said that, this is only "compile-tested", I don't have this
> hardware. I was just driven by Rob's comment to take a look at Rocket's
> code and the design looked unusual to what I would expect from a DRM
> scheduler-based driver. I'm also CCing some scheduler maintainers to check
> if they agree that the IRQ handler shouldn't spawn further HW work.
> 
> Apart from that, this series also has some clean-up patches.
> 
> [1] https://lore.kernel.org/dri-devel/20260516144623.2582427-2-mcanal@igalia.com/T/

Sorry, I forgot the RFC tag.

Best regards,
- Maíra

> 
> Best regards,
> - Maíra
> 
> ---
> Maíra Canal (3):
>        drm/rocket: Remove unused reset worker
>        drm/rocket: Submit one drm_sched_job per task
>        drm/rocket: Drop the dedicated reset workqueue
> 
>   drivers/accel/rocket/rocket_core.h |  10 +-
>   drivers/accel/rocket/rocket_job.c  | 282 ++++++++++++++++++-------------------
>   drivers/accel/rocket/rocket_job.h  |  26 +++-
>   3 files changed, 159 insertions(+), 159 deletions(-)
> ---
> base-commit: 640c57d6ca1346a1c2363a3f473b405af979e046
> change-id: 20260605-rocket-per-task-jobs-b797f7e2b1e9
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] drm/rocket: Drop the dedicated reset workqueue
  2026-06-05 16:07 ` [PATCH 3/3] drm/rocket: Drop the dedicated reset workqueue Maíra Canal
@ 2026-06-05 18:20   ` sashiko-bot
  0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-06-05 18:20 UTC (permalink / raw)
  To: Maíra Canal; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] Memory leak of the dynamically allocated `scheds` array in `rocket_job_open()` on error paths and when `num_cores == 1`.
--

commit 49abb67c60d7da03b194fc958eb96f281bc20ffc
Author: Maíra Canal <mcanal@igalia.com>

drm/rocket: Drop the dedicated reset workqueue

Each core allocated an ordered workqueue to serialize timeout handlers, which
is unnecessary since each core is reset independently. With the reset worker
gone, the workqueue served solely as the timeout_wq. This patch drops the
dedicated workqueue and lets drm_sched fall back to the default timeout
workqueue.

> diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
> index 871041eb7a1dc..1a454559ecf49 100644
> --- a/drivers/accel/rocket/rocket_job.c
> +++ b/drivers/accel/rocket/rocket_job.c

[ ... ]

>  void rocket_job_fini(struct rocket_core *core)
>  {
>  	drm_sched_fini(&core->sched);
> -
> -	destroy_workqueue(core->reset.wq);
>  }
>  
>  int rocket_job_open(struct rocket_file_priv *rocket_priv)

[Severity: High]
This isn't a bug introduced by this patch, but is there a memory leak of the
dynamically allocated scheds array on error paths and when rdev->num_cores
equals 1?

Looking at drivers/accel/rocket/rocket_job.c:rocket_job_open():

	struct drm_gpu_scheduler **scheds = kmalloc_objs(*scheds,
							 rdev->num_cores);
	...
	ret = drm_sched_entity_init(&rocket_priv->sched_entity,
				    DRM_SCHED_PRIORITY_NORMAL,
				    scheds,
				    rdev->num_cores, NULL);
	if (WARN_ON(ret))
		return ret;

If drm_sched_entity_init() fails, the function returns directly and the
scheds array is leaked.

Additionally, if drm_sched_entity_init() succeeds but rdev->num_cores == 1,
drm_sched_entity_init() explicitly drops the reference by setting
entity->sched_list to NULL:

drivers/gpu/drm/scheduler/sched_entity.c:drm_sched_entity_init() {
	...
	entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
	...
}

Because rocket_job_close() relies on entity->sched_list to free the memory:

drivers/accel/rocket/rocket_job.c:rocket_job_close() {
	...
	kfree(entity->sched_list);
	...
}

Calling kfree() with NULL resolves cleanly but permanently loses the original
scheds array allocation. Can this be used to trigger kernel memory exhaustion
by repeatedly calling open() and close() on the DRM device file?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605-rocket-per-task-jobs-v1-0-e3319adf3c49@igalia.com?part=3

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2026-06-05 18:20 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-05 16:06 [PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler Maíra Canal
2026-06-05 16:06 ` [PATCH 1/3] drm/rocket: Remove unused reset worker Maíra Canal
2026-06-05 16:25   ` sashiko-bot
2026-06-05 16:06 ` [PATCH 2/3] drm/rocket: Submit one drm_sched_job per task Maíra Canal
2026-06-05 16:20   ` sashiko-bot
2026-06-05 16:07 ` [PATCH 3/3] drm/rocket: Drop the dedicated reset workqueue Maíra Canal
2026-06-05 18:20   ` sashiko-bot
2026-06-05 16:40 ` [RFC PATCH 0/3] drm/rocket: Stop submitting hardware work from the IRQ handler Maíra Canal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.